Does Firefox automatically perform OCR on PDF documents?


My bank delivers monthly statements as rasterized copies of their paper statements. They are clearly pixelated and not text. However, when I open one of these PDFs in Firefox I am able to select the rasterized text, as you can see from the attached screenshot clip.

How is this possible?


All Replies (9)


I assume that your bank actually sends real PDF files. If you use Print then in some cases Firefox converts the page to an image.


I would have assumed the same thing, except that Sumatra won't allow me to highlight and copy text, and Acrobat will select it but won't copy it. Firefox allows both.


Also I've never seen a pixelated PDF that still contains text. Will wonders never cease?!



If the PDF consists purely of a series of full-page images, unfortunately, Firefox's PDF viewer doesn't have the ability to OCR it.

I suspect your bank applied "security" to the PDF to prevent certain actions, such as copying, editing, and/or printing. (https://helpx.adobe.com/acrobat/how-to/password-protect-pdf.html)

Firefox's PDF viewer is based on the pdf.js JavaScript library, which ignores these "security" restrictions by default. It is a bit of an annoyance to people who create the PDFs, but Mozilla doesn't seem inclined to enforce the restrictions in Firefox.
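To illustrate why a viewer can simply ignore these restrictions: in the PDF standard security handler, the permissions are stored as bit flags in an integer entry named /P, and it is up to the viewing software to honor them. The Python sketch below decodes such a value. The bit positions follow the PDF specification, but the function names and the `honor_restrictions` switch are hypothetical and only meant to show the idea; this is not how pdf.js is actually written.

```python
from typing import Dict

# Permission bits of the /P entry (bit positions are 1-indexed in the PDF spec).
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy": 5,
    "annotate": 6,
    "fill_forms": 9,
    "extract_for_accessibility": 10,
    "assemble": 11,
    "print_high_quality": 12,
}


def decode_permissions(p_value: int) -> Dict[str, bool]:
    """Return which actions the /P flags mark as allowed."""
    # /P is a signed 32-bit integer; mask it so negative values work as bit flags.
    flags = p_value & 0xFFFFFFFF
    return {
        name: bool(flags & (1 << (bit - 1)))
        for name, bit in PERMISSION_BITS.items()
    }


def may_copy_text(p_value: int, honor_restrictions: bool) -> bool:
    """Hypothetical viewer policy: a viewer that ignores the flags always allows copying."""
    if not honor_restrictions:
        return True
    return decode_permissions(p_value)["copy"]


if __name__ == "__main__":
    p = -17  # example value: every bit set except bit 5 (copy)
    print(decode_permissions(p))
    print(may_copy_text(p, honor_restrictions=True))
    print(may_copy_text(p, honor_restrictions=False))
```

A viewer that calls something like `may_copy_text` with `honor_restrictions=True` would block copying for this example value, while one that passes `False` behaves the way Firefox is described above.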


jscher2000 - Support Volunteer said

I suspect your bank applied "security" to the PDF to prevent certain actions, such as copying, editing, and/or printing. (https://helpx.adobe.com/acrobat/how-to/password-protect-pdf.html)

Yes, I did a little more digging and that's apparently what it is. The document is protected from editing and apparently this can sometimes present text as pixelated images.



jscher2000 - Support Volunteer said

I suspect your bank applied "security" to the PDF to prevent certain actions, such as copying, editing, and/or printing. (https://helpx.adobe.com/acrobat/how-to/password-protect-pdf.html)

Yes, the document is password-protected so that's probably it.


By the way, when you select text in Firefox's PDF viewer, you are selecting a transparent layer of text positioned in front of the page image.


It's funny that "security" can be partially bypassed by simply ignoring it in code.


Helmanfrow said

It's funny that "security" can be partially bypassed by simply ignoring it in code.

Once upon a time, basing "security" on the honor system actually worked, I guess.