I was recently asked to help with a perplexing issue at work. Someone accessed a web page and noticed that they were logged in as somebody else. This is a fairly dangerous issue to have. A customer could potentially be logged in as another customer and have full access to their information. Obviously, this is unwanted behavior. The only solace was that this issue was extremely rare. It was only seen a few times out of thousands of web page hits.
A team of people was assembled, including me, and we all sat and thought about the issue. We tried to wrap our heads around what was happening. My immediate thought was that this was a caching issue. But after speaking with our web developer, I learned that he controls which pages are cached and that the login page is definitely not cached. All of us were stumped. We met a few times, and each time people became less interested in reproducing the issue. One person said, “This issue cannot be reproduced.” Most people agreed, including me.
One day, I was alerted that someone in our office was experiencing the issue at that very moment. I immediately started inspecting his network traffic to see if anything was amiss. I was looking for cookies – information that the server stores on the client’s computer. When a computer talks to a server and has a cookie for it, it sends the cookie back to the server to be processed. The only cookies that I saw were being sent FROM his computer, not from the server. It quickly became apparent that whatever had caused this issue had already happened, and it was too late to get any useful information. The team met and we discussed the new information. The conclusions mostly focused on disabling caching, but again everyone agreed that this issue would be nearly impossible to reproduce.
A few weeks passed, and our web developer wrote something to monitor when cookies were set by the web server itself. Meanwhile, I wrote something to monitor cookies being set as seen on the network. At first, I saw no data. I thought my script was not functioning. But then I realized that the login page is secure, and I was only monitoring non-secure communications. I expected at this point that my network monitoring would be useless. But then I saw some data fill up my screen. It was strange. How would someone get a cookie on a non-secure page if the only page that sets them is secure? I thought about this for some time, and then something else happened. My screen lit up with traffic. It was coming from the same network that I was on. Another interesting bit of information was that every request had a different cookie. I wanted to know who these requests were coming from. So I took one of the cookies and applied it to my browser. At this point, I was logged in as the other person. I then browsed to their profile information to find out which coworker was affected. I did this for a couple of the cookies that I saw, and both pointed to the same coworker. So I paid him a visit.
Within an hour, I had met with the team again and presented this new information to them. One member suggested again that we stop trying to find out what happened because the issue had not occurred in weeks. After sharing this opinion, he had an epiphany that he shared with us. When a web server is configured to extend the expiration of cookies (known as sliding expiration, or sliding cookies), it sends out a new cookie to replace the old one once the old one is past the midpoint between its issue date/time and its expiration date/time. This information brought a smile to my face. At this point, everything clicked in my head. It all made sense now.
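The midpoint rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as my colleague explained it, not the server's actual code. The function name needs_renewal is made up for this example, and the 30-minute lifetime is taken from the test walkthrough in the next section.

```python
from datetime import datetime, timedelta


def needs_renewal(issued: datetime, expires: datetime, now: datetime) -> bool:
    """Return True once more than half of the cookie's lifetime has elapsed.

    Hypothetical helper: an expired cookie is simply invalid, so it is
    not treated as a candidate for renewal here.
    """
    midpoint = issued + (expires - issued) / 2
    return midpoint < now < expires


issued = datetime(2020, 1, 1, 12, 0)
expires = issued + timedelta(minutes=30)  # lifetime assumed from the walkthrough

print(needs_renewal(issued, expires, issued + timedelta(minutes=10)))  # False
print(needs_renewal(issued, expires, issued + timedelta(minutes=16)))  # True
print(needs_renewal(issued, expires, issued + timedelta(minutes=31)))  # False
```

The important detail is that the check runs on whatever request happens to arrive after the midpoint, so the replacement cookie can ride along on any response, not just the login page.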
How It Happened
Someone loaded up the web site and logged in. They received a cookie from the server identifying them. Their cookie was good for only 30 minutes. After 15 minutes had passed, the next page that they clicked on triggered the server to send them a new cookie. Because this was not a login page, the server had not been told that it should not cache it. So the server went ahead and cached the page – including that cookie. The cache is very short-lived, lasting only a few seconds. But with high traffic to the web site, the next person who hit the same page was blindly given the first person’s cookie.
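To make the sequence above concrete, here is a toy simulation in Python. It is a sketch under assumptions, not a model of how IIS is implemented: the classes NaiveCache and CookieAwareCache, the render_for helper, and the "/news" URL are all invented for illustration, and the few-second cache lifetime is left out for brevity. The second cache shows one possible guard (refusing to store any response that carries a Set-Cookie header), which is not necessarily the fix we ended up with.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    body: str
    headers: dict = field(default_factory=dict)


class NaiveCache:
    """Caches whole responses by URL, headers included (the flawed behavior)."""

    def __init__(self):
        self.store = {}

    def fetch(self, url, render):
        if url not in self.store:
            self.store[url] = render()
        return self.store[url]


class CookieAwareCache(NaiveCache):
    """Same cache, but never stores a response that carries Set-Cookie."""

    def fetch(self, url, render):
        if url in self.store:
            return self.store[url]
        response = render()
        if "Set-Cookie" not in response.headers:
            self.store[url] = response
        return response


def render_for(user, renew):
    """Hypothetical page renderer: attaches a renewed cookie when asked to."""
    def render():
        headers = {"Set-Cookie": f"auth={user}-token"} if renew else {}
        return Response(body="<html>page</html>", headers=headers)
    return render


def simulate(cache):
    # User A is past the cookie midpoint, so the server renews the cookie.
    cache.fetch("/news", render_for("userA", renew=True))
    # User B arrives moments later, while the entry is still cached.
    response = cache.fetch("/news", render_for("userB", renew=False))
    return response.headers.get("Set-Cookie")


leaked = simulate(NaiveCache())
guarded = simulate(CookieAwareCache())
print(leaked)   # auth=userA-token
print(guarded)  # None
```

With the naive cache, user B's browser is handed user A's cookie and stores it like any other. With the guard in place, user A's response is simply never cached, so user B gets a freshly rendered page with no cookie attached.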
With a good idea of how the problem happened, now came the hardest task of all: reproducing it. I realized that I could leverage the repeated cookie issue to help trigger the stolen cookie issue. I started by logging into the web site as normal. I then took my cookie contents and created a new cookie for the subdomain. I pasted the legitimate cookie contents into the new cookie and then deleted the original one. To test, I loaded the web site again and it showed me as being logged in. Perfect. Now I had to get the server to send me a graphic or non-login page along with a cookie. All I had to do at this point was wait 15 minutes. After 15 minutes passed, the server would know that my cookie needed to be replaced. It would send me a new cookie, but my machine would still send the old one back to the server. This would repeat with every page and every object that I loaded. I waited the 15 minutes and hit refresh. BAM, just as I expected, the cookies started pouring in. I was close now. I just had to get the server to cache a page with a cookie on it. So I kept hitting refresh on a single page. I must have reloaded that same page a couple dozen times. Then I went to another browser, cleared my cookies and my cache, and hit the same page that the first browser had hit. Voilà – I had stolen a cookie! I had reproduced the issue! The issue that was “too difficult to reproduce” was difficult no longer. Just to be certain that this was not a fluke, I attempted the steps again, but this time I had our web developer load the page from his computer. BAM – he was logged in as me. He had stolen my cookie too.
Now most of the rest of the effort is on our web developer’s shoulders. He has to find out how to make certain that this never happens again.
Note: All of my tests were completed in a test environment, not in a production environment. The actual production environment had caching disabled, which rendered this issue inert. With the goal of turning caching back on, it was in our best interests to try to get to the bottom of this issue first – and that is what I succeeded in doing.
Root Cause: IIS caching static objects with header information.
Primary Aggravating Factor: IIS can resend a cookie with a static object if sliding is enabled AND the half-life of the cookie has passed.
Secondary Aggravating Factor: Some users may receive 2 cookies (.www.domain.com and .domain.com).
.www.domain.com remains static, with a diminishing expiration date. After the half-life of that cookie passes, IIS sends a new cookie with a new 14-day expiration date. The browser accepts the new cookie for .domain.com and therefore does not replace .www.domain.com, which continues to diminish. Since .www.domain.com was created first, the browser sends that cookie to IIS first. When multiple cookies with the same name are provided, IIS only cares about the first one it receives. For up to 7 days, it is therefore possible that IIS will send new cookies on ALL requests – including static objects. When this behavior is combined with the root cause issue of IIS caching static objects with header information, user B can receive user A’s cookie. Before the cookie swap happens, though, IIS must cache the static object that user A receives. If user A is exhibiting the double-cookie issue, there is a much higher likelihood that one of the many cookie-laden objects will be cached by IIS. As long as caching is enabled, the risk is always present. However, without double cookies, the number of objects being served up with cookies is reduced from many to just one – thereby dramatically reducing the risk that a cookie might be cached.
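The "many versus one" difference can be illustrated with a small Python simulation. Everything in it is an assumption made for the sake of the example: the cookie name auth, the helper names, and the idea of encoding the issue time directly in the cookie value. It only models the behavior described above (the server trusts the first cookie with a given name, and renewals always land on .domain.com). It is not IIS code.

```python
from datetime import datetime, timedelta

LIFETIME = timedelta(days=14)  # expiration window mentioned above


def first_cookie(cookie_header, name):
    """Return the first value for `name` in a Cookie header, ignoring duplicates."""
    for pair in cookie_header.split(";"):
        key, _, value = pair.strip().partition("=")
        if key == name:
            return value
    return None


def past_midpoint(issued, now):
    """True when the cookie is past its half-life but not yet expired."""
    return issued + LIFETIME / 2 < now < issued + LIFETIME


def count_renewals(jar, requests, start, step):
    """Count responses that carry a new cookie.

    jar: list of (domain, issued) entries sharing one cookie name, oldest first.
    """
    renewals = 0
    now = start
    for _ in range(requests):
        # The browser sends the oldest cookie first; the value is its issue time.
        header = "; ".join(f"auth={issued.isoformat()}" for _, issued in jar)
        issued = datetime.fromisoformat(first_cookie(header, "auth"))
        if past_midpoint(issued, now):
            renewals += 1
            # The renewal replaces only the .domain.com cookie.
            jar = [(d, i) for d, i in jar if d != ".domain.com"]
            jar.append((".domain.com", now))
        now += step
    return renewals


start = datetime(2020, 1, 15)
stale = start - timedelta(days=8)  # already past the 7-day half-life
second = timedelta(seconds=1)

double = [(".www.domain.com", stale), (".domain.com", stale)]
single = [(".domain.com", stale)]

print(count_renewals(double, 10, start, second))  # 10
print(count_renewals(single, 10, start, second))  # 1
```

With two cookies, the stale .www.domain.com entry is never replaced, so every one of the ten requests comes back with a fresh cookie attached. With a single cookie, the first renewal fixes the problem and the remaining nine responses are clean.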