Preventing mojibake with HTML emails
I noticed that many of my recipients are seeing junk characters in my emails. I have taken a look at the raw bytes and I think I know what is going on.
MIME headers and HTML head elements being sent:
Content-Type: text/html; charset=windows-1252 Content-Transfer-Encoding: 8bit
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252"> The body is encoded correctly in windows-1252 encoding. Everything looks correct, so what's the problem? Here's my theory: The recipient sees the MIME header. Because it is encoded in windows-1252, it converts the HTML content to UTF-8 but of course the meta tag will still *say* it is windows-1252. Basically, having two headers at different layers specifying window-1252 encoding is causing the recipient to doubly decode the HTML body. I think what we need is some way to make the HTML 7-bit friendly. Convert all non-ASCII characters to numerical entities like . That way charset conversions can be done at the MIME layer without corrupting the HTML. Is there an option for this in Thunderbird? I poked around in Settings but couldn't find anything.
Ñemoĩporã poravopyre
Under Tools/Options/Display/Formatting, click the Advanced button where you can set the default encoding for outgoing mail, and the choice of encoding for replies (see picture).
If you're sending through a Yahoo (ATT, Rogers, Verizon etc.) type server and your recipients see unwanted characters, set mail.strictly_mime to true in Tools/Options/Advanced/General/Config. editor to enforce Quoted Printable encoding (Outlook default).
Emoñe’ẽ ko mbohavái ejeregua reheve 👍 0Opaite Mbohovái (4)
Folks using yahoo have issues because for about a year they have been enforcing 7bit encoding. You would think they were living in a US centric version of the world.
Basically your mail should be encoded as UTF-8 and those windows encodings forgotten as they are just obsolete and their decoding varies from product to product. They are a product of the days in the early 1990s when Microsoft was of the view the internet was a fad that they could safely ignore.
How are you managing to get windows encoding in there? pasting from word as anything but plain text is a good way, or always has been.
I was replying to an email that used windows encoding. I agree about pushing towards "Unicode everywhere", but I can't control what encoding others use, and it appears that Thunderbird keeps the same encoding when replying (but not when forwarding).
Is there an option to force use of UTF-8 when replying?
Ñemoĩporã poravopyre
Under Tools/Options/Display/Formatting, click the Advanced button where you can set the default encoding for outgoing mail, and the choice of encoding for replies (see picture).
If you're sending through a Yahoo (ATT, Rogers, Verizon etc.) type server and your recipients see unwanted characters, set mail.strictly_mime to true in Tools/Options/Advanced/General/Config. editor to enforce Quoted Printable encoding (Outlook default).
Thanks a lot for your help. I tested each of those options separately, and each of them solved the problem.
I am using yahoo SMTP servers. @Matt, you also mentioned Yahoo, but I thought you meant Yahoo Mail (a web client). Thanks @sfhowes for being more specific. It did not occur to me that my ISP could suck that much!
I tested their servers myself, and they are allowing 8-bit but only if it's valid UTF-8. Anything non-UTF-8 gets changed to UTF8(U+FFFD). Therefore either option works: forcing replies to UTF-8, or forcing quoted printable (7-bit).
Thanks for pointing me to the options. I would never have found the Reply encoding option under Display (Fonts and Colors). What an illogical place for that option!