Print rendering searchable PDFs
Hello,
What is the story with Firefox print rendering searchable PDFs? Apparently this has to do with how the web pages are print rendered.
Chrome (and, I think, IE) now print render all web pages as graphics, which kills the ability to create text-searchable PDF files using various PDF printers like Bullzip.
So has Firefox joined, or will it be joining, this same path anytime soon?
Thanks, Steve
All Replies (13)
STRANGE... I got an email "reply" to this post but it is NOT showing up here! It doesn't appear to be a PM either! What is going on????
So what site is this happening at?
WestEnd said
So what site is this happening at?
This one HERE. It was your first response about the developer forum... Which doesn't appear to exist.
Hi Steve_Sr., Firefox will render text as a bitmap if it uses a downloadable font. If a site uses a downloadable font for its main body text, not just a few decorative headers, then that could affect the entire printout.
Other than overriding downloadable fonts, I am not aware of a workaround for this. If you want to try a tool for changing the fonts before you print, you could try my "experimental" extension here:
https://addons.mozilla.org/firefox/addon/printable-the-print-doctor/
Only click the second button on the panel, unless there is a special reason to click the first one (the first one is more likely to wreck the layout of the page).
jscher2000 said
Hi Steve_Sr., Firefox will render text as a bitmap if it uses a downloadable font. If a site uses a downloadable font for its main body text, not just a few decorative headers, then that could affect the entire printout.
Unfortunately, as an end user "downloadable font" means nothing to me. However I have some captured web pages of one site in question if you would be interested in evaluating it.
The current application is PDF print archiving of paid subscription time-limited web content in the form of Toyota car repair manuals which Toyota allows.
As you might imagine searching becomes an issue because of the large amount of information. Being able to combine PDF files and search them would be a great help.
Currently FF version 59 will produce text searchable content when paired with Bullzip PDF printer. The current version of Chrome will not.
So I am wondering if the newer FF versions have gone (or will be going) the way of Chrome in bitmap rendering all print output? It would be a shame if this happens.
So I guess what you are saying here is that this could also be web page dependent?
I suspect that Toyota is using some form of Adobe tools to generate their web pages. At one time it was SVG but that has apparently been discontinued.
Thanks, Steve
A downloadable font is a font a web page directs a browser to retrieve from a URL on the web. They are in common use now, but apparently not on the site you are most concerned about.
Here is an example of a page that directs Firefox to use a web font. If Firefox can't load the web font (for example, if it is unavailable or blocked), then Firefox will use a built-in font instead.
https://www.consumerreports.org/cro/about-us/what-we-do/index.htm
By default, Firefox will print the text in the "Averta W01" font as an image.
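If you want to check whether a page's stylesheet declares a downloadable font, here is a minimal Python sketch. It is only an illustration of the idea, not how Firefox decides anything: it assumes you have already saved the page's CSS as text, the function name `downloadable_font_families` is made up for this example, and the font URL in the sample is a placeholder.

```python
import re
from typing import List

# Matches each @font-face block; assumes no nested braces inside the block.
FONT_FACE_RE = re.compile(r"@font-face\s*\{([^}]*)\}", re.IGNORECASE)
# Pulls the font-family value out of a block.
FAMILY_RE = re.compile(r"font-family\s*:\s*([^;]+)", re.IGNORECASE)


def downloadable_font_families(css_text: str) -> List[str]:
    """Return font families declared via @font-face with a url() source."""
    families: List[str] = []
    for block in FONT_FACE_RE.findall(css_text):
        if "url(" not in block.lower():
            continue  # local()-only rules do not download anything
        match = FAMILY_RE.search(block)
        if match:
            name = match.group(1).strip().strip("'\"")
            if name not in families:
                families.append(name)
    return families


# Sample stylesheet; the URL is a placeholder, not the real font location.
sample_css = """
@font-face {
  font-family: "Averta W01";
  src: url("https://example.com/fonts/averta.woff2") format("woff2");
}
body { font-family: "Averta W01", Arial, sans-serif; }
"""

print(downloadable_font_families(sample_css))
```

An empty result suggests the stylesheet relies on built-in fonts only, although fonts can also be declared in other stylesheets or loaded by scripts, which this sketch does not cover.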
jscher2000 said
Here is an example of a page that directs Firefox to use a web font. If Firefox can't load the web font (for example, if it is unavailable or blocked), then Firefox will use a built-in font instead. https://www.consumerreports.org/cro/about-us/what-we-do/index.htm By default, Firefox will print the text in the "Averta W01" font as an image.
Correct. A PDF created from the above link was NOT text searchable, using the same tools and versions that I had used to create searchable PDFs of the Toyota site.
So it looks like it is more than just the browser brand that determines if a page is print rendered such that a PDF is text searchable or not.
jscher2000 said
Here is an example of a page that directs Firefox to use a web font. If Firefox can't load the web font (for example, if it is unavailable or blocked), then Firefox will use a built-in font instead. https://www.consumerreports.org/cro/about-us/what-we-do/index.htm By default, Firefox will print the text in the "Averta W01" font as an image.
Follow-up question... So what happened in the previous case? Was the web font unavailable or blocked?
If the web font had been available and was downloaded would the generated PDF then be text searchable?
No, if the page uses a webfont, Firefox generates an image PDF. Pages that use built-in fonts (like Times New Roman and Arial) print best for PDF searchability purposes.
jscher2000 said
No, if the page uses a webfont, Firefox generates an image PDF. Pages that use built-in fonts (like Times New Roman and Arial) print best for PDF searchability purposes.
I think I got it. Built-in fonts are required for PDF text searchability.
So what does your extension do? Modify the print style sheet to use built-in fonts?
The extension goes through the document looking for nonstandard fonts and makes them standard, but it ignores short paragraphs, so it doesn't fix part of the Consumer Reports page. I will probably tweak that some time in the next month.
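To make that description concrete, here is a rough Python sketch of the general idea. It is not the extension's actual code (which runs in the browser against the live page): the `Paragraph` class, the `STANDARD_FONTS` set, the `MIN_LENGTH` threshold of 40 characters, and the sample text are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List

# Assumed set of "standard" (built-in) families; the extension's real list is not known.
STANDARD_FONTS = {"arial", "times new roman", "courier new",
                  "serif", "sans-serif", "monospace"}
MIN_LENGTH = 40  # hypothetical cutoff below which a paragraph is left alone


@dataclass
class Paragraph:
    text: str
    font_family: str


def standardize_fonts(paragraphs: List[Paragraph], fallback: str = "Arial") -> int:
    """Switch long paragraphs with nonstandard fonts to the fallback font.

    Returns the number of paragraphs that were changed.
    """
    changed = 0
    for p in paragraphs:
        if len(p.text) < MIN_LENGTH:
            continue  # short paragraphs (for example headings) are skipped
        if p.font_family.lower() not in STANDARD_FONTS:
            p.font_family = fallback
            changed += 1
    return changed


# Made-up sample page content.
page = [
    Paragraph("About Us", "Averta W01"),
    Paragraph("A long body paragraph set in a downloadable font, long enough to be processed.", "Averta W01"),
    Paragraph("This paragraph already uses a built-in font and is left untouched.", "Arial"),
]

print(standardize_fonts(page))
print([p.font_family for p in page])
```

In this sketch the short "About Us" line keeps its web font, which mirrors why part of a page can still print as an image after the fonts in the longer paragraphs have been replaced.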
Too bad that the web page creators are making their pages non-print friendly. Really, how many different fonts do we need on this planet?