Possible Browser Cache Bug Since Firefox 15
Please refer to http://forums.mozillazine.org/viewtopic.php?f=38&t=2675237 for details.
All Replies (19)
Hello!
Thank you very much for reaching out to Mozilla Support about your issue. Please rest assured that we are still looking into your inquiry. We would appreciate your patience while we find an answer to your question. At this moment I am escalating this issue to our advanced troubleshooting team, who will get back to you within 24-48 hours.
Thank You!
Best Regards,
feer56
Thanks for the in-depth, super detailed report. That has to point somebody in the right direction. I'll see if I can file a bug on your behalf and then subscribe you to the bug. Hopefully a developer will know what's wrong there.
I've done much of what you've done on Firefox nightly builds (which are builds that get new changes every night!) when trying to track down the cause of a new problem. I call it "regression hunting". So I feel your pain there b/c between any 2 days Firefox nightlies can have something completely random broken. The only problem I am worried about is the developers not being able to reproduce the problem b/c they don't have the same modem model as you (if that truly matters).
Did you try checking the HTTP request and response headers via the net log in "Web Developer > Web Console" (Ctrl+Shift+K), with the Live Http Headers extension, or otherwise, to see which header it fails on?
- https://developer.mozilla.org/en/Tools/Web_Console
- Live Http Headers: https://addons.mozilla.org/firefox/addon/live-http-headers/
cor-el,
Using the Web Console, I do not see any difference between request headers - fail vs. succeed. However, that tool does not present the raw headers, so it may not be conclusive.
I'll research further based on your suggestions.
- First trace is connect_left_refresh.html, which is an XHR request made every 3 seconds. Tracing in Web Console, 400 error will occur very sporadically on this request.
- Second trace is clicking the advancedsetup_schedulingaccess.html link on the advancedsetup_dhcpsettings.html page. This usually succeeds the first time.
- Third trace is clicking the advancedsetup_schedulingaccess.html link on the advancedsetup_schedulingaccess.html page. This always fails. However, hit F5 to refresh and it will succeed. Then, I can click the advancedsetup_schedulingaccess.html link once, maybe twice, and it will succeed. After that, it will fail. The only obvious difference is the Referer.
Output from Live HTTP headers follows:
http://192.168.0.1/connect_left_refresh.html

GET /connect_left_refresh.html HTTP/1.1
Host: 192.168.0.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://192.168.0.1/advancedsetup_dhcpsettings.html
Connection: keep-alive

HTTP/1.1 200 Ok
Server: micro_httpd
Cache-Control: no-cache
Date: Fri, 15 Mar 2013 09:46:20 GMT
Content-Type: text/html
Connection: close

http://192.168.0.1/advancedsetup_schedulingaccess.html

GET /advancedsetup_schedulingaccess.html HTTP/1.1
Host: 192.168.0.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://192.168.0.1/advancedsetup_dhcpsettings.html
Connection: keep-alive

HTTP/1.1 200 Ok
Server: micro_httpd
Cache-Control: no-cache
Date: Fri, 15 Mar 2013 09:46:20 GMT
Content-Type: text/html
Connection: close

http://192.168.0.1/advancedsetup_schedulingaccess.html

GET /advancedsetup_schedulingaccess.html HTTP/1.1
Host: 192.168.0.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://192.168.0.1/advancedsetup_schedulingaccess.html
Connection: keep-alive

HTTP/1.1 400 Bad Request
Server: micro_httpd
Cache-Control: no-cache
Date: Fri, 15 Mar 2013 09:46:22 GMT
Content-Type: text/html
Connection: close
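To double-check the claim that the Referer is the only difference between the succeeding and failing requests, the two captured header blocks can be compared field by field. The following is a minimal sketch; the helper names `parse_headers` and `diff_headers` are made up for illustration, and the header text is copied from the second and third traces above.

```python
def parse_headers(raw):
    """Turn a raw request (request line plus 'Name: value' lines) into a dict."""
    lines = raw.strip().splitlines()
    headers = {}
    for line in lines[1:]:  # lines[0] is the request line, e.g. "GET /... HTTP/1.1"
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


def diff_headers(a, b):
    """Return {header name: (value in a, value in b)} for every field that differs."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


COMMON = """Host: 192.168.0.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
"""

# Second trace (200 Ok) and third trace (400 Bad Request) from the capture above.
OK_REQUEST = (
    "GET /advancedsetup_schedulingaccess.html HTTP/1.1\n" + COMMON
    + "Referer: http://192.168.0.1/advancedsetup_dhcpsettings.html\n"
    + "Connection: keep-alive\n"
)
FAILED_REQUEST = (
    "GET /advancedsetup_schedulingaccess.html HTTP/1.1\n" + COMMON
    + "Referer: http://192.168.0.1/advancedsetup_schedulingaccess.html\n"
    + "Connection: keep-alive\n"
)

differences = diff_headers(parse_headers(OK_REQUEST), parse_headers(FAILED_REQUEST))
for name, (ok_value, failed_value) in differences.items():
    print(f"{name}: {ok_value!r} vs {failed_value!r}")
```

This only compares what the extension reported, so anything it does not show (raw byte order, timing, which connection carried the request) would still go unnoticed.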
Very odd. When using the basic HTTP logging technique from the link provided, I cannot get the 400 error to occur. If I open Fx normally, it's easy to generate the error. What's up with that?
OK, more info...
With the cache enabled:
From a successfully loaded page, click any link (other than the current page) -> OK, click another link (other than the current page) -> 400 error. Pretty consistent pattern (although, occasionally it fails on the first, rather than the second, click). If you click on the link for the current page, it almost always fails the first time.
With the cache disabled:
Very random 400 errors - no discernible pattern yet.
OK, odd yet again...
After dozens of attempts, I was able to generate the 400 error during HTTP logging.
It appears I can only upload images here. How do I get the log file to you?
I REALLY had to prune this down to meet the 500K limit. Hopefully, I didn't trim something you need...
Another odd observation...
When I run HTTP logging under my standard profile, log.txt has hundreds of thousands of null bytes (0x00) at the beginning of the file. Run under a new profile, there are no nulls. What's up with that?
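If the leading nulls get in the way of pruning or uploading the log, they can be stripped before sharing it. This is a minimal sketch that does not explain where the nulls come from; the function name `strip_leading_nulls` and the output file name are hypothetical.

```python
from pathlib import Path


def strip_leading_nulls(src, dst):
    """Copy src to dst without the run of 0x00 bytes at the start.

    Returns the number of null bytes removed. The whole file is read into
    memory, which is assumed to be acceptable for a log of this size.
    """
    data = Path(src).read_bytes()
    trimmed = data.lstrip(b"\x00")  # only leading nulls; later bytes are untouched
    Path(dst).write_bytes(trimmed)
    return len(data) - len(trimmed)
```

Usage would look like `strip_leading_nulls("log.txt", "log_trimmed.txt")`, run from the folder that holds the log.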
Can anyone explain why this is happening?
- If I run Fx normally, I can reliably get a 400 error in just a few clicks.
- If I run Fx with HTTP logging, it is difficult, if not impossible, to get a 400 error. When it happens, it usually takes over 100 clicks.
- If I run Fx and Fiddler, I have never been able to get a 400 error.
Just don't understand how this can be...
Please refer to this discussion: Fiddler Question. Fx 14 does not open multiple connections, whereas Fx 15 and later do.
Nice work getting that answer!
Something else that may help is changing how many connections Firefox can make by changing a certain setting.
To do that, you need to go to the about:config page:
- In the url bar, type about:config and press Enter. If you see the "This might void your warranty!" warning, just click the I'll be careful, I promise! button to continue.
- Copy & paste network.http.max-persistent-connections-per-server into the Search box. When it shows up in the results below, double-click it to change it.
Maybe you can set it to 1 but I don't know if that will slow down your browsing or cause other side effects. But since your modem is single-threaded to begin with, those potential problems should not apply to you.
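If you end up testing the same setting across several fresh profiles, a user.js file in the profile folder applies the preference at every startup, which saves repeating the about:config steps. Here is a minimal sketch that writes that line; the helper names and the profile path in the usage note are hypothetical, while the preference name is the one mentioned above.

```python
import json
from pathlib import Path


def format_user_pref(name, value):
    """Build one user.js line. json.dumps gives JS-compatible literals
    (quoted strings, true/false, bare numbers)."""
    return f"user_pref({json.dumps(name)}, {json.dumps(value)});"


def append_user_pref(profile_dir, name, value):
    """Append the preference to user.js in the given profile folder."""
    line = format_user_pref(name, value)
    user_js = Path(profile_dir) / "user.js"
    with user_js.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line


print(format_user_pref("network.http.max-persistent-connections-per-server", 1))
```

Calling `append_user_pref(r"C:\path\to\your\profile", "network.http.max-persistent-connections-per-server", 1)` with Firefox closed would add the line; deleting it from user.js and resetting the preference in about:config undoes the change.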
Thanks for the suggestion. Here's what I found so far:
- Fx 14, 15, and 19 default to network.http.max-persistent-connections-per-server = 6
- Even with connections set to 1, Fx 19 typically opens 2 connections with every click (usually opens the first, then begins closing the first as it opens a second, then closes the second). It often also spastically opens and closes connections for a few seconds after the page has loaded (with no further clicks). It usually takes 2-4 seconds after the page loads for all the connections to close. As a result, 400 errors still occur, just less frequently (at 6 connections, usually takes 3-5 seconds to close all connections).
- Even at default connections of 6, Fx 14 opens only 1 connection per click. That connection closes immediately after the page loads. This mimics the behavior observed in Chrome, Safari, Opera, and IE.
- Fx 19 with HTTP logging seems to open fewer connections and closes them much faster than w/o logging. This may explain why it is difficult to get 400 errors with logging on.
I'll keep testing, but it seems safe to say that connection management took a wrong turn after Fx 14.
OK, here's another interesting twist...
Fx 19 - connections at 6 and CACHE OFF:
- Opens 1-6 connections simultaneously, page loads, and quickly closes all connections simultaneously (no pause).
Fx 19 - connections at 6 and CACHE ON:
- Opens 2-6 connections simultaneously, page loads, and slowly closes 1 connection at a time, followed by a pause (often with connections sporadically going to CLOSE_WAIT), until all connections are closed.
These observations represent the dominant behavior trend for each case. There is some variation in behavior. Can't discern a pattern controlling the number of connections that are opened on each click.
Overall, cache on results in connections closing one-at-a-time and very slowly. Also, sometimes another connection will open/close quickly (sometimes multiple times) after the initial connections have closed (with no further clicking). Clicking again while any of the initial connections are still open results in 400 errors virtually every time.
With cache off, even clicking again while initial connections are still open does not result in 400 errors about 95% of the time. This strongly suggests that cache on/off is somehow changing the very nature of the connections (not sure exactly how).
In the other browsers, I can even stress test by repeatedly clicking on the same/different links faster than the server can respond. I see the number of connections grow with each click, but never get a 400 error. The connections simply close in rapid succession until the last click is loaded. In Fx 19, with cache off, I see similar behavior, although Fx seems to do a better job of reusing connections, rather than creating new ones.
At this point, I'm not firmly convinced that this is a question of how many connections are open. It may have more to do with the state of the connections. Need greater understanding of TCP/IP.
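Since the state of each connection seems to matter more than the count, it may help to tally connections to the modem by TCP state after each click. The sketch below assumes output in the four-column format printed by Windows `netstat -an`; the function name `count_states` is hypothetical and the sample rows are made up for illustration, not taken from a real capture.

```python
from collections import Counter


def count_states(netstat_output, remote):
    """Count TCP connections to `remote` (e.g. "192.168.0.1:80") by state."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected TCP row: Proto, Local Address, Foreign Address, State
        if len(fields) == 4 and fields[0] == "TCP" and fields[2] == remote:
            states[fields[3]] += 1
    return states


# Illustrative sample only (addresses and ports are invented).
SAMPLE = """
  Proto  Local Address          Foreign Address        State
  TCP    192.168.0.2:49170      192.168.0.1:80         ESTABLISHED
  TCP    192.168.0.2:49171      192.168.0.1:80         CLOSE_WAIT
  TCP    192.168.0.2:49172      192.168.0.1:80         CLOSE_WAIT
  TCP    192.168.0.2:49180      10.0.0.5:443           ESTABLISHED
"""

for state, count in sorted(count_states(SAMPLE, "192.168.0.1:80").items()):
    print(state, count)
```

Feeding it fresh `netstat -an` output every second or so, with cache on and then off, would give a rough timeline of how long connections linger in each state.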
As previously reported, Fx 14, regardless of connections or cache settings, opens one connection, loads, and closes immediately and, so far, has never seen 400 errors.
Since you don't have this hardware, please outline additional tests I can conduct to help you troubleshoot this issue...
Have been experimenting with MS Network Monitor and MS Message Analyzer...
Have some interesting results, however the 500K limit on pastebin.com prevents me from sending you the most revealing captures. Got any alternatives we can use?
Quick summary:
- Eric was correct. The server generates a 400 error if it does not receive a request 1-2 seconds after a connection is established. It then initiates the handshake to terminate the connection. Normally, this should not be a problem (as the server will simply close the connection) and is not displayed in the browser.
- Fx 14, Chrome, Safari, Opera, and IE NEVER open a connection unless there is a request ready, and NEVER receive a 400 error.
- Fx 15 and up, however, do open multiple connections with no request and ALMOST ALWAYS receive a 400 error.
Here's the rub! The 400 error that I'm seeing in the browser follows this scenario:
- Fx establishes a connection w/o a request
- Server -> Status 400
- Connection termination handshake begins
- Server -> FIN
- Fx -> ACK
- User clicks a link (not exactly sure what the relative timing of this is)
- Fx sends the GET request from the user's click on this HALF-OPEN connection (which, of course, the server never receives)
- Fx -> FIN/ACK
- Server -> RESET (sometimes more than once)
- Fx displays the 400 response
Looks to me like Fx is being too aggressive with its connection reuse and is sending a request on a HALF-OPEN connection.
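The first steps of that scenario (connection opened with no request, server answers 400 and closes) can be checked without a browser. The sketch below is self-contained: `toy_server` is a stand-in that only mimics the behavior described in this thread, with a shortened 0.2 second idle timeout, and `probe_idle_connection` is a hypothetical helper name. It does not reproduce Fx sending a request on the closing connection.

```python
import socket
import threading


def toy_server(listener, idle_timeout):
    """Stand-in for the modem (assumed behavior): reply 400 if no request
    arrives within idle_timeout seconds, then close the connection."""
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(idle_timeout)
        try:
            data = conn.recv(4096)
        except socket.timeout:
            data = b""
        if data:
            conn.sendall(b"HTTP/1.1 200 Ok\r\nConnection: close\r\n\r\n")
        else:
            conn.sendall(b"HTTP/1.1 400 Bad Request\r\nConnection: close\r\n\r\n")


def probe_idle_connection(host, port, wait):
    """Connect, send nothing, and return whatever the server sends on its own."""
    with socket.create_connection((host, port), timeout=wait) as sock:
        chunks = []
        try:
            while True:
                chunk = sock.recv(4096)
                if not chunk:  # server closed its side
                    break
                chunks.append(chunk)
        except socket.timeout:
            pass  # server stayed silent for the whole wait
        return b"".join(chunks)


listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
thread = threading.Thread(target=toy_server, args=(listener, 0.2), daemon=True)
thread.start()

reply = probe_idle_connection("127.0.0.1", port, wait=2.0)
thread.join()
listener.close()

status_line = reply.split(b"\r\n")[0].decode()
print(status_line)
```

Pointing `probe_idle_connection("192.168.0.1", 80, wait=5.0)` at the real modem should show whether it sends the 400 unprompted, and roughly how long it waits before doing so.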
Please see http://pastebin.com/Bzp0tdb2 (Fx 19, max connections 6, cache on).
This is a typical conversation where a 400 error occurs in the browser.
Fx is sending a request when the connection state is FinWait1. This appears to be the root of the problem.
chromeUsers += 1; firefoxUsers -= 1;