utf-8 displayed incorrectly

My site uses a template engine to wrap standard page HTML around templates that contain only the body text. However, UTF-8 multibyte characters are displayed correctly if there is no HTML header, but incorrectly if the standard page header, which explicitly specifies UTF-8 encoding, is presented to Firefox.

https://www.jamescobban.net/templates/Articles/FrenchOccupations.html is the template containing the body text, and https://www.jamescobban.net/displayPage.php?template=Articles/FrenchOccupations is the same template with the HTML header and footer wrapped around it. You can look at it yourself, but to save time, the header added by displayPage is:

    <!DOCTYPE HTML >
    <html lang="en">
      <head>
        <title>
            ...text extracted by DisplayPage from the h1 tag...
        </title>
        <meta charset='utf-8'>
        <meta http-equiv='default-style' content='text/css'>
        <meta name='author' content='James A. Cobban'>
        <meta name='copyright' content='&copy; 2018 James A. Cobban'>
        <meta name='keywords' content='genealogy, family, tree, ontario, canada '>
        <link rel='stylesheet' type='text/css' href='/styles.css'>
      </head>

Chosen solution

Let's be very clear about the situation. Firefox renders this page --

https://www.jamescobban.net/displayPage.php?template=Articles/FrenchOccupations

-- as directed by the server. There is no bug there.

However, this page --

https://www.jamescobban.net/templates/Articles/FrenchOccupations.html

-- does not use a DOCTYPE Declaration on the first line, and therefore renders in Quirks mode. Please understand that Quirks mode is not going to be updated because its entire purpose is backwards compatibility with poor web design practices of decades past. See: https://developer.mozilla.org/docs/Web/HTML/Quirks_Mode_and_Standards_Mode

All that actually is not relevant to you because you would never serve that raw HTML fragment to a visitor under normal circumstances, would you? Presumably you can redirect a request for anything in the templates directory to the proper page.

All replies (7)

When Firefox loads this page, it uses windows-1252 encoding:

https://www.jamescobban.net/templates/Articles/FrenchOccupations.html

If I use the classic menu bar (tap Alt to display)

View > Text Encoding > Unicode

then I see the same problem as the other page.

I'm not sure how you author your templates, but can you check the encoding used by your editor and see whether you can re-save the file as UTF-8?

(Note: I see the same issue in Chrome for Windows.)
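The mismatch described in this reply can be reproduced outside the browser. The following is a minimal Python sketch, assuming a hypothetical sample word (it is not taken from the article) and assuming the template was saved in a legacy 8-bit encoding such as windows-1252. It shows both directions of the mismatch: UTF-8 bytes read as windows-1252, and windows-1252 bytes read as UTF-8.

```python
# Hypothetical sample word; not taken from the actual article.
text = "pêcheur"

utf8_bytes = text.encode("utf-8")      # "ê" is stored as two bytes: C3 AA
legacy_bytes = text.encode("cp1252")   # "ê" is stored as one byte: EA

# UTF-8 bytes decoded as windows-1252: each byte becomes its own character.
as_1252 = utf8_bytes.decode("cp1252")

# windows-1252 bytes decoded as UTF-8: the lone EA byte is not a valid
# UTF-8 sequence, so it is replaced with U+FFFD (the replacement character).
as_utf8 = legacy_bytes.decode("utf-8", errors="replace")

print(as_1252)
print(as_utf8)
```

The second case is the one that matches the report: a file saved in a legacy encoding looks right when the browser falls back to windows-1252, and breaks once a header declares UTF-8.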

Thank you. I was using VIM. I explicitly set the encoding and fileencoding to utf-8 and it corrected the problem.
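For templates that cannot easily be re-saved from an editor, the same conversion can be scripted. This is only a sketch: the function name `reencode_to_utf8` is made up for illustration, and the source encoding is assumed to be windows-1252, which may not be what a given file actually uses.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite the file at `path` as UTF-8.

    Returns False (and leaves the file alone) if the bytes are already
    valid UTF-8, True if the file was converted.
    """
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        pass
    # Assumption: the file is in `source_encoding`. Decoding raises
    # UnicodeDecodeError if it contains bytes undefined in that encoding.
    text = raw.decode(source_encoding)
    Path(path).write_bytes(text.encode("utf-8"))
    return True
```

Checking for valid UTF-8 first keeps the function safe to run twice on the same file, since converting an already-converted file would double-encode the accented characters.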

By the way, why would the default text encoding for an HTML file, in the absence of an explicit specification in the head, not be UTF-8? And since it is not, I cannot find anywhere in about:config where I can fix that.

There is a fallback setting in Options/Preferences, but it only covers legacy 8-bit encodings and does not offer Unicode.

  • Options/Preferences -> Content -> Fonts & Colors -> Advanced -> Character Encoding for Legacy Content

You can set up the server to send files with Unicode (UTF-8) encoding declared, as that will always prevail.
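To illustrate why a server-sent charset prevails over the page's own markup, here is a heavily simplified Python sketch of the order in which a browser might pick an encoding. The function name `pick_encoding` is hypothetical, the `windows-1252` fallback is an assumption that holds for many Western locales only, and real browsers follow the longer encoding sniffing algorithm in the HTML Standard rather than this shortcut.

```python
import codecs
import re

# Loosely matches charset=... inside a <meta> tag; a real parser is stricter.
META_CHARSET = re.compile(
    rb"""<meta[^>]*\bcharset\s*=\s*["']?([A-Za-z0-9_\-]+)""",
    re.IGNORECASE,
)


def pick_encoding(body, http_charset=None, fallback="windows-1252"):
    """Return the encoding label a browser would plausibly use for `body`."""
    # 1. A UTF-8 byte order mark wins over everything else.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # 2. A charset in the HTTP Content-Type header beats any <meta> tag.
    if http_charset:
        return http_charset.lower()
    # 3. Otherwise look for a <meta charset> near the start of the document.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Nothing declared: use the legacy fallback (an assumption here).
    return fallback


# A bare fragment with no header and no <meta> gets the fallback.
print(pick_encoding(b"<p>caf\xe9</p>"))
# The same fragment served with a charset in the HTTP header.
print(pick_encoding(b"<p>caf\xe9</p>", http_charset="UTF-8"))
```

Under these assumptions, the raw template falls through to step 4, while the wrapped page is decided at step 2 or 3, which is consistent with the two pages behaving differently.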

The template (first of your two links) doesn't have a DOCTYPE declaration, so it renders in Quirks Mode. Quirks Mode is not standardized, and may use behaviors from the 1990s such as using the OS default for character encoding.

All of the standards state that the preferred and default character encoding for HTML documents is UTF-8. So why does Firefox implement a proprietary encoding belonging to a single manufacturer, especially when displaying documents on systems for which Windows is a swear word? Even if you think that this default is in the best interests of your customers, why would you not permit your customers to make their own decision about what the default encoding is?

I have created a .htaccess on my development site to explicitly set what should have been the server default in the first place, but FileZilla doesn't show the .htaccess on the local side and the only documentation I can find is how to get it to display hidden files on the server side.

"using the OS default for character encoding." But I am running Ubuntu and there is no way that the Ubuntu default character encoding is a Microsoft proprietary code page!

Please fix the broken default.

Modified by jamescobban
