Firefox reports a secure connection without existing certificate
When I click on the lock icon in the address bar, nothing happens (I would expect to see the encryption and certificate details, but no widget opens). When I right-click in the page and select "View Page Info", "Security", "View Certificate", no certificate is shown. Only generic info is displayed (Servers+Authorities). An error is also reported in the system log, confirming that no certificate exists: JavaScript error: chrome://browser/content/browser-siteIdentity.js, line 642: TypeError: issuerCert is undefined
Questions:
- Why does Firefox show the lock icon while no certificate exists?
- How does Firefox establish a secure connection without having a certificate? (From my understanding a certificate is needed for TLS to work.)
Version used: Firefox 78.12.0esr ('Mozilla/5.0 (X11; FreeBSD amd64; rv:78.0) Gecko/20100101 Firefox/78.0'), no add-ons or plugins.
Chosen solution
Looks like some progress on the bug report. However, Firefox 94 release comes in November, so it's some time away.
All replies (20)
Usually if you click the lock icon, a site information panel drops down. Can you give an example of a page where that does not happen? To avoid the moderation queue, you can type a space in the URL before the TLD (example. com) so the forum doesn't recognize the link.
Distros often make tweaks to Firefox in their custom builds, but it doesn't make sense to have a View Certificate button when there's no certificate to view.
jscher2000 said
Usually if you click the lock icon, a site information panel drops down.
Yes, this is what I would expect. It does not happen; the icon is functionless when clicked.
Can you give an example of a page where that does not happen?
See below.
Distros often make tweaks to Firefox in their custom builds, but it doesn't make sense to have a View Certificate button when there's no certificate to view.
I think this can be almost ruled out. The integration tweaks on Berkeley/FreeBSD seem to mostly concern the audio backend, which should be quite independent from this issue.
I have now done a great deal of testing and comparing, and found the following:
- The problem is highly erratic. In hundreds of tries I failed to find a scenario where it could be predicted if it would appear or not appear.
- The problem is related to the tab used. It may appear in one tab and not appear in another tab from the same server/application. Also it may go away when pressing the back button (or it may not).
- The problem does not go away by deleting cookies and/or cache.
- The problem may or may not go away by restarting Firefox. I could not pinpoint exact circumstances.
- The problem does appear with a fresh virgin profile; the probability varies from try to try. With two freshly created profiles, it may appear almost always in one of them and almost never in the other.
- The problem also appears with version 90.0.1. (But 90.0.1 is unusable on Berkeley/FreeBSD, like most non-ESR versions; it crashes immediately when accessing any video, e.g. YT.)
In version 90.0.1 the error message is the same, but the line number changes to 763. Also, that version has no "View Page Info" option when right-clicking in the page, so only the nonfunctional lock icon can be observed, and there is no way to look at the (missing) certificate.
I have now spent days trying to isolate and pinpoint the issue. I had it appear reproducibly when logging in to an application, and only when the "remember me" flag is checked (and exactly at the time when the answer from the server comes in). As the only difference here is that a persistent "remember" cookie is sent along, I thought that this cookie would trigger the issue, and focused on that. Then I found that it occasionally also happens without "remember me" checked. So this was a red herring.
Then I found that the problem disappeared when I added the "Secure" flag to the cookies on the server side. So I thought it might be triggered by the missing Secure flag (considering the SameSite issue and everything around it). But then I discovered that developer.mozilla .org also sends a cookie without the Secure flag, so if that were a problem, it would have been detected long ago. A red herring, again. And so it boils down to the problem in fact happening only with my own webservers - and that is really bad, because it is now imperative for me to figure out what might be wrong with these (if anything - they are installed according to what I understand to be best practices, and have letsencrypt certificates).
I have looked into practically everything I can think of, with no success yet. So it comes back to the visible error message. I had hoped somebody here would tell me what that means and where it comes from.
The original error seems to be related to the signing certificate, which usually is an intermediate certificate that Firefox prefers to be served by the server, but which may also be cached after a visit to another site.
Do you serve a "bundle" file that contains the intermediate certificate(s) needed to chain your site's certificate up to the trusted root certificate?
jscher2000 said
The original error seems to be related to the signing certificate, which usually is an intermediate certificate that Firefox prefers to be served by the server, but which may also be cached after a visit to another site. Do you serve a "bundle" file that contains the intermediate certificate(s) needed to chain your site's certificate up to the trusted root certificate?
Yes I do. I have 3 certificates: my domain certificate, an intermediate "R3" CA-cert, and the root CA cert. All are configured in the webserver, and all are nicely shown from "Page Info -> View Certificates" in the case when things do work correctly.
I found out some more:
- I figured why neither clearing the history nor restarting firefox nor creating a new tab resolves the current flaw:
When clearing the history, the tab persists. When restarting firefox, the history persists. Only when clearing the history AND creating a new tab (and removing the old) AT THE SAME TIME gives back a pristine condition.
- I figured some of why the behaviour is so erratic: my application runs with turbolinks, which produces a cached page and meanwhile loads the new page asynchronously with AJAX. So it is not obvious when the new data actually appears. Switching off javascript triggers the flaw nevertheless, but triggers it at more intelligible points.
- I discovered the Browser Console Toolbox. There I get a break on exception and can see exactly why the error happens: there is a variable called "gIdentityHandler._secInfo.succeededCertChain" which under normal conditions holds an array with the 3 certificates. In the case of the flaw this is an empty array - there are no certificates present!
- I tried to trace that variable. It gets copied from another variable called "gBrowser.securityUI.secInfo.succeededCertChain", and only at certain times when Firefox sees fit to copy it (which adds to the erratic behaviour). In the case of the flaw that other variable also holds an empty array (but that one becomes empty in a rather intelligible fashion after page load).
Sadly, that other variable cannot be traced; it apparently gets written from somewhere deep in the cpp code.
- I analyzed the server configuration and did full data traces on the SSL/TLS communication. It looks sane to me. But there is something interesting: the flaw appears when the new page gets loaded as a subsequent request, without TLS negotiation and therefore without certificates! (TLS opens a TCP connection, does the crypto negotiations, and then keeps that TCP session open for further requests and transfers subsequent page requests through the same TCP session, without another TLS negotiation.) When I restart the webserver immediately before the problematic page request, that way forcing a new TLS negotiation, the flaw does not appear!
So in some way the observable behaviour is correct: there were indeed no certificates received with that specific page load (because the TLS connection was already open). But why does it happen only here??
- Finally I downloaded peppermint linux. It brings some Firefox 71.x.x in its installer, and the faulty behaviour is exactly the same there. It is NOT an issue of FreeBSD, it is NOT an issue of my desktop configuration or my profiles.
pmc1 said
* I figured why neither clearing the history nor restarting firefox nor creating a new tab resolves the current flaw: When clearing the history, the tab persists. When restarting firefox, the history persists. Only when clearing the history AND creating a new tab (and removing the old) AT THE SAME TIME gives back a pristine condition.
When you say you cleared history, which categories did you clear? For example: cache, browsing history (URLs / session history), cookies / offline web storage, or all of those. It's interesting that tab state data could persist through that.
* I figured some of why the behaviour is so erratic: my application runs with turbolinks, which produces a cached page and meanwhile loads the new page asynchronously with AJAX. So it is not obvious when the new data actually appears. Switching off javascript triggers the flaw nevertheless, but triggers it at more intelligible points.
Hmm, I think you're saying even without background requests, and presumably without a service worker used by a script, the problem still occurs from time to time.
* I analyzed the server configuration and did full data traces on the SSL/TLS communication. It looks sane to me. But there is something interesting: the flaw appears when the new page gets loaded as a subsequent request, without TLS negotiation and therefore without certificates! (TLS opens a TCP connection, does the crypto negotiations, and then keeps that TCP session open for further requests and transfers subsequent page requests through the same TCP session, without another TLS negotiation.) When I restart the webserver immediately before the problematic page request, that way forcing a new TLS negotiation, the flaw does not appear! So in some way the observable behaviour is correct: there were indeed no certificates received with that specific page load (because the TLS connection was already open). But why does it happen only here??
What's puzzling is that a lot of sites probably use similar design patterns but as far as I can tell, this is not a commonly reported issue. Is there any reason to imagine that Firefox or the server have kept the connection open beyond some critical time-out, in other words, they should have closed it earlier?
* Finally I downloaded peppermint linux. It brings some Firefox 71.x.x in its installer, and the faulty behaviour is exactly the same there. It is NOT an issue of FreeBSD, it is NOT an issue of my desktop configuration or my profiles.
Firefox 78 is near its end-of-life. Can you test in Firefox 91esr or in Firefox 93beta (for example, https://www.mozilla.org/firefox/developer/ )?
It might be I have found a little bit of a lead now: with some high probability the flaw does only appear when TLSv1.3 is used. (And it is used whenever the webserver allows it.)
jscher2000 said
When you say you cleared history, which categories did you clear? For example: cache, browsing history (URLs / session history), cookies / offline web storage, or all of those. It's interesting that tab state data could persist through that.
The more the better, on the safe side: all. I didn't investigate what exactly is needed when. I agree that this could give some clue about what is happening, but I postponed such tests in favour of more promising ones, given that I do not know exactly which objects are part of which category, and so my gain of insight would be limited. Currently I would rather like to find some good MOZ_LOG options to see a bit more of the cert handling.
* I figured some of why the behaviour is so erratic: my application runs with turbolinks, which produces a cached page and meanwhile loads the new page asynchronously with AJAX. So it is not obvious when the new data actually appears. Switching off javascript triggers the flaw nevertheless, but triggers it at more intelligible points.
Hmm, I think you're saying even without background requests, and presumably without a service worker used by a script, the problem still occurs from time to time.
The problem appears just the same with or without JS, but without JS it appears precisely when a link is clicked, and not at some arbitrary time.
What's puzzling is that a lot of sites probably use similar design patterns but as far as I can tell, this is not a commonly reported issue.
Yes, this is strange. The string "issuerCert" in the error message is quite unique, and it would appear in stderr any time this error happens and somebody clicks on the lock icon. But Google doesn't find anything useful. You can be certain I am sitting over my webserver's SSL configuration, pulling my hair out and wondering what could be different there from most installations...
Is there any reason to imagine that Firefox or the server have kept the connection open beyond some critical time-out, in other words, they should have closed it earlier?
Unlikely. It can happen 20 seconds after server restart.
Firefox 78 is near its end-of-life. Can you test in Firefox 91esr or in Firefox 93beta (for example, https://www.mozilla.org/firefox/developer/ )?
I did already try with 90.0.1, that is the version that has been integrated and is in my deploy system, so that I can push-button compile it. Arbitrary other versions I would need to integrate and make compile on my own (not so funny). But if I get that peppermint linux (or some other linux) to install on a stick, I could probably fetch a precompiled linux version and install that.
See also hasCustomRoot (746) and refreshIdentityPopup (921).
cor-el said
See also hasCustomRoot (746) and refreshIdentityPopup (921).
It does not get that far. The interesting line is this:
Put a break on this line. Then put a watch on these two and you see either this:
gBrowser.securityUI.secInfo.succeededCertChain: (3) […]
gIdentityHandler._secInfo.succeededCertChain: (3) […]
or this:
gBrowser.securityUI.secInfo.succeededCertChain: (0) []
gIdentityHandler._secInfo.succeededCertChain: (3) […]
Then click single-step and see this:
gBrowser.securityUI.secInfo.succeededCertChain: (0) []
gIdentityHandler._secInfo.succeededCertChain: (0) []
And, by the way, if the page does not have any TLS, then you will see this:
gBrowser.securityUI.secInfo.succeededCertChain: undefined
gIdentityHandler._secInfo.succeededCertChain: undefined
We should NEVER see zero-length arrays there. They are bogus. Now please do me a favor and figure out where in the code these arrays are filled with data. Then we will see under which (probably erroneous) conditions they can become zero-length.
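The watch results described in this post boil down to three possible states of succeededCertChain. The following Python sketch is only a model of those observations, not Firefox code: the function name classify_chain and the labels it returns are invented for illustration, and None stands in for JavaScript's undefined.

```python
from typing import Optional, Sequence


def classify_chain(chain: Optional[Sequence[bytes]]) -> str:
    """Classify a succeededCertChain value as seen in the watch panel.

    None      -> the page was loaded without TLS (JavaScript `undefined`)
    []        -> the bogus state: a secure page with no chain recorded
    [c1, ...] -> the normal state: the verified chain is present
    """
    if chain is None:
        return "no-tls"
    if len(chain) == 0:
        return "bogus-empty-chain"
    return "ok"


if __name__ == "__main__":
    # Placeholder byte strings stand in for the three DER-encoded certificates.
    print(classify_chain([b"leaf", b"R3", b"root"]))
    print(classify_chain([]))
    print(classify_chain(None))
```

The point of the model is that an empty array is distinct from both legitimate states, which is why it is a useful thing to break on.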
cor-el said
See:
Oh yes, it's beautiful. And so well coloured. But otherwise it is basically the same as what my grep does. And what I said above is that I need to know "where these variables are filled with data". So, sorry, but this is not yet fully helpful.
Now for the other workproducts:
jscher2000 said
Firefox 78 is near its end-of-life. Can you test in Firefox 91esr or in Firefox 93beta (for example, https://www.mozilla.org/firefox/developer/ )?
Putting a linux on a stick did not work: it would take 5 minutes just to open "Help/About Firefox" to see the current version. (These sticks may support usb-3, but they are unusable with journaled filesystems. I wonder what they are good for at all - but then they cost nothing.) So now I have learned how linux install and configuration works, learned how graphical virtualization works, learned how vnc works, and finally have a small kali linux integrated in my desktop (which seems surprisingly well maintained and managed for what I know of linux). And the outcome of all of this is:
Firefox 93.0b8 developer shows the exact same behaviour.
Next workproduct:
jscher2000 said
Usually if you click the lock icon, a site information panel drops down. Can you give an example of a page where that does not happen? To avoid the moderation queue, you can type a space in the URL before the TLD (example. com) so the forum doesn't recognize the link.
This is now flag.daemon .contact/flowm/ There click Log In (upper right), enter something ( or enter Guest0 Guest0 ), check 'remember me' and send it - and in reply you should get the second context, which more or less likely triggers the flaw. And then you probably see the same as I see.
pmc1 said
jscher2000 said
Usually if you click the lock icon, a site information panel drops down. Can you give an example of a page where that does not happen? To avoid the moderation queue, you can type a space in the URL before the TLD (example. com) so the forum doesn't recognize the link.
This is now flag.daemon .contact/flowm/ There click Log In (upper right), enter something ( or enter Guest0 Guest0 ), check 'remember me' and send it - and in reply you should get the second context, which more or less likely triggers the flaw. And then you probably see the same as I see.
Sorry, is this on the web now or do I need to install something?
This is on the web now. It should answer to https, and the letsencrypt people have said it does so correctly. (It's a rule builder for the berkeley ipfw firewall, but I thought I could just make it multiuser capable - and so I can put it online as well - should have thought that far earlier)
Edited by pmc1
Sorry, I didn't realize that was the URL.
I was not able to replicate the problem in my regular profile, but I can replicate it in a fresh profile on 93 beta. Firefox validated the site certificate on the first access and is able to show that information (example screenshot attached), but when trying to pull up the about:certificate screen after log in, the page is blank.
In the case where it's working, the encoded version of the certificate is passed to the about:certificate page in the URL, and when it fails, it is not. Very puzzling.
Weirdly, if I use Back to return to the Log In page, it's fine. If I log out after logging in, I need to reload the page to view the certificate again.
Also, after testing in a second tab, I can't replicate the error. ??
jscher2000 said
I was not able to replicate the problem in my regular profile, but I can replicate it in a fresh profile on 93 beta.
Same here. It seems to happen a lot quicker with a pristine profile. It will probably happen in the other one also, but maybe only after half an hour of working with the app or after accessing some different apps on that server. (Tracking protection?)
Firefox validated the site certificate on the first access and is able to show that information (example screenshot attached), but when trying to pull up the about:certificate screen after log in, the page is blank.
Same here. The log-in creates a new security context. I do not know why it does that, but MOZ_LOG=nsSecureBrowserUI:5 reports that it does it:
[Parent 67891: Main Thread]: D/nsSecureBrowserUI we have a security info 0x8218b3700
[Parent 67891: Main Thread]: D/nsSecureBrowserUI set mTopLevelSecurityInfo
The commentary in the source near that message says this:
// Our BrowsingContext either has a new WindowGlobalParent, or the
// existing one has mutated its security state.
// Recompute our security state and fire notifications to listeners
I don't see why the security state would change at this point. But it does, and at (or before?) that point the chain is lost. The server certificate is still there and is intact, but the certificate chain is gone. According to the debug log it has not even been re-evaluated, it has just vanished into thin air. And consequently the data in the securityInfo structure is now inconsistent:
isBuiltCertChainRootBuiltInRoot: false
resumed: true
succeededCertChain: Array []
But then, a lot of other sites do the same, and do NOT lose the certChain alongside.
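For anyone who wants to capture the same nsSecureBrowserUI messages, here is a minimal Python sketch that builds the environment and starts the browser. MOZ_LOG takes a comma-separated list of module:level pairs, and MOZ_LOG_FILE redirects the output to a file. The helper names, the binary name "firefox" and the log path are assumptions for illustration and need adjusting to the local installation.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional


def moz_log_env(modules: Mapping[str, int],
                log_file: Optional[str] = None,
                base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with MOZ_LOG (and MOZ_LOG_FILE) set."""
    env = dict(os.environ if base_env is None else base_env)
    # MOZ_LOG expects "module:level" pairs separated by commas.
    env["MOZ_LOG"] = ",".join(f"{name}:{level}" for name, level in modules.items())
    if log_file is not None:
        env["MOZ_LOG_FILE"] = log_file
    return env


def launch_firefox(binary: str = "firefox",
                   log_file: str = "/tmp/firefox-secureui.log") -> subprocess.Popen:
    """Start Firefox with nsSecureBrowserUI logging at level 5.

    The binary name and log path are placeholders, not values from this thread.
    """
    env = moz_log_env({"nsSecureBrowserUI": 5}, log_file=log_file)
    return subprocess.Popen([binary], env=env)
```

Further log modules can be added to the mapping passed to moz_log_env if more detail about the certificate handling is needed.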
In the case where it's working, the encoded version of the certificate is passed to the about:certificate page in the URL, and when it fails, it is not. Very puzzling. Weirdly, if I use Back to return to the Log In page, it's fine. If I log out after logging in, I need to reload the page to view the certificate again. Also, after testing in a second tab, I can't replicate the error. ??
Same here. And it goes on and on in that fashion - this thing has been a source of infinite bewilderment to me. But, while it still seems impossible for me to pinpoint which component is to blame for the root cause of these effects, can we agree that, no matter what actually comes down from the server, Firefox is behaving in a way that is not fully correct?
Edited by pmc1
pmc1 said
But, while for me it still seems impossible to pinpoint which component is to blame for the root cause of these effects, can we agree that, no matter what actually comes down from the server, Firefox is behaving in a way that is not fully correct?
My assumption is that logging in to a website may change a cookie or other stored data, but it definitely should not change anything else on the browser side.
I can't replicate it with this simplified example:
https://www.jeffersonscher.com/res/post302form.php
Source:
I most likely found it.
I'm far from understanding it; I don't even understand what that code is supposed to do and what is wrong with it. It's totally cryptic. But it is what is different. It seems to come from the certificate.
jscher2000 said
My assumption is that logging in to a website may change a cookie or other stored data,
Would this trigger a new security context? The cookies are the only thing I can think of. And my app definitely sends a (different) cookie with every request. See question 42044076 on stackoverflow for an explanation. But this is apparently not the problem. It just triggers it.
I go to sleep now. We unravel this in due time.
Alright. This may get a bit lengthy now.
TL;DR: the problem appears to be with OCSP stapling.
(A) The observables
We see two effects:
- the lock icon in the address bar becomes nonfunctional (while still being present)
- the Page Info "Show Certificates" does not show the certificates
These two effects rely on different data, so they do not always align. In due time Firefox copies one set of data to the other, and then they align again.
All of this (including the data) belongs to the presentation layer. It is only what we see, it is NOT where the problem happens.
(B) the security context
We must assume that the security context changes when a new cookie is received.
In this case, the application uses authenticated encrypted cookies with a lot of cryptography included. (These cookies should be safe to send even over unprotected HTTP, which is why the new SameSite regime, prohibiting this in a third-party context, is a nuisance.) The bottom line is: the cookie is not only sent with every request, it actually changes with every request.
The log-in process therefore already creates three security contexts:
- first connect to the page
- The form sent after <Log In> is clicked.
- Login process completed.
But the 'turbolinks' page preloader running in javascript does somehow mangle and merge these requests. They are cleanly observable only with javascript disabled.
It is usually the third security context that exposes the problem, but may be the second (if it appears at all).
(C) The internals
At the point where the problem actually appears, nothing significant is reported in debug messages - only that a new security context appeared ("we have a security info").
But in the preceding security context a message appears which does not appear with other sites (visible with MOZ_LOG=pipnss:4):
D/pipnss HandshakeCallback: couldn't rebuild verified certificate info
This happens in RebuildVerifiedCertificateInformation(), and it means that a call to certVerifier->VerifySSLServerCert() was not successful.
At the end of that first function we find this:
  if (rv == Success) {
    uint16_t status =
        TransportSecurityInfo::ConvertCertificateTransparencyInfoToStatus(
            certificateTransparencyInfo);
    infoObject->SetCertificateTransparencyStatus(status);
    nsTArray<nsTArray<uint8_t>> certBytesArray =
        TransportSecurityInfo::CreateCertBytesArray(builtChain);
    infoObject->SetSucceededCertChain(std::move(certBytesArray));
    infoObject->SetIsBuiltCertChainRootBuiltInRoot(
        isBuiltCertChainRootBuiltInRoot);
  }
(something is broken with the text formatting; I'll continue in another message)
Edited by pmc1
Some infoObject gets filled with certificate data only if that call would have been successful.
(We do not know what this infoObject does already contain in the case when it is not filled from here. We also do not know what this infoObject actually is - but there might be some probability that it has something to do with the perceived failure.)
Now, finding out why certVerifier->VerifySSLServerCert() does not return success is a long and winding road, and finally leads to this piece of code (in security/nss/lib/mozpkix/lib/pkixcheck.cpp):
Result
TLSFeaturesSatisfiedInternal(const Input* requiredTLSFeatures,
                             const Input* stapledOCSPResponse)
{
  if (!requiredTLSFeatures) {
    return Success;
  }

  // RFC 6066 10.2: ExtensionType status_request
  const static uint8_t status_request = 5;
  const static uint8_t status_request_bytes[] = { status_request };

  Reader input(*requiredTLSFeatures);
  return der::NestedOf(input, der::SEQUENCE, der::INTEGER,
                       der::EmptyAllowed::No, [&](Reader& r) {
    if (!r.MatchRest(status_request_bytes)) {
      return Result::ERROR_REQUIRED_TLS_FEATURE_MISSING;
    }

    if (!stapledOCSPResponse) {
      return Result::ERROR_REQUIRED_TLS_FEATURE_MISSING;
    }

    return Result::Success;
  });
}
Honestly, I do not understand this construct. But this if (!stapledOCSPResponse) seems to trigger the former message.
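As a reading aid, the logic of that function can be modelled in a few lines of Python. This is a simplified sketch, not the mozilla::pkix implementation: the certificate's TLS Feature extension is represented as an already-decoded list of integers, and the result strings are plain labels. The label used for an empty feature list is an assumption, since the excerpt does not show which error the DER parser returns in that case.

```python
from typing import Optional, Sequence

STATUS_REQUEST = 5  # RFC 6066 ExtensionType status_request (OCSP stapling)

SUCCESS = "Success"
FEATURE_MISSING = "ERROR_REQUIRED_TLS_FEATURE_MISSING"
MALFORMED = "malformed-extension"  # assumed label, not taken from the excerpt


def tls_features_satisfied(required_features: Optional[Sequence[int]],
                           stapled_ocsp_response: Optional[bytes]) -> str:
    """Model of the check: every required feature must be satisfied."""
    # No TLS Feature extension in the certificate: nothing to enforce.
    if required_features is None:
        return SUCCESS
    # The C++ code parses with EmptyAllowed::No, so an empty list is rejected.
    if len(required_features) == 0:
        return MALFORMED
    for feature in required_features:
        # Only status_request is understood; anything else cannot be satisfied.
        if feature != STATUS_REQUEST:
            return FEATURE_MISSING
        # status_request is required but the handshake carried no stapled OCSP.
        if stapled_ocsp_response is None:
            return FEATURE_MISSING
    return SUCCESS
```

In this model, a certificate that demands status_request combined with a handshake that delivers no stapled response gives tls_features_satisfied([5], None), which returns the ERROR_REQUIRED_TLS_FEATURE_MISSING label.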
Edited by pmc1
(D) OCSP stapling
Recent discussions with the letsencrypt people explained:
The perceived error ERROR_REQUIRED_TLS_FEATURE_MISSING and the message couldn't rebuild verified certificate info are normal in the case when the configuration requires OCSP stapling and the server does not provide the OCSP data.
In our case here, the certificate is configured for must-staple, so the server is required to provide the OCSP data. (According to the logs it does that, but a closer look into that is still required.)
The letsencrypt people explained that the simplest solution is to just remove the must-staple feature, thereby making OCSP stapling optional instead of mandatory. This is probably what most people do.
Edited by pmc1
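To check whether a given certificate still carries the must-staple marker, one rough approach is to look for the TLS Feature extension OID (1.3.6.1.5.5.7.1.24, defined in RFC 7633) in the DER bytes. The Python sketch below uses only the standard library and is a heuristic byte search, not an X.509 parser; the function names are invented for illustration, and a proper tool should be used for anything beyond a quick check.

```python
import ssl

# DER encoding of OID 1.3.6.1.5.5.7.1.24 (id-pe-tlsfeature, RFC 7633):
# tag 0x06, length 0x08, then the encoded arcs.
TLS_FEATURE_OID_DER = bytes.fromhex("06082b06010505070118")


def looks_like_must_staple(cert_der: bytes) -> bool:
    """Heuristic: True if the TLS Feature extension OID occurs in the DER bytes."""
    return TLS_FEATURE_OID_DER in cert_der


def pem_looks_like_must_staple(pem_text: str) -> bool:
    """Same check for a single PEM-encoded certificate."""
    return looks_like_must_staple(ssl.PEM_cert_to_DER_cert(pem_text))


def server_cert_looks_like_must_staple(host: str, port: int = 443) -> bool:
    """Fetch the server's leaf certificate over the network and apply the check."""
    pem_text = ssl.get_server_certificate((host, port))
    return pem_looks_like_must_staple(pem_text)
```

After reissuing the certificate without the feature, the check on the new certificate should come back False.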