About Text Recognition in Keep It

Keep It 1.6 and later will perform text recognition on standalone PDF documents and images, using computer vision and machine learning technologies to minimize work and produce the most accurate results. 

Keep It doesn’t modify PDFs or convert images to make them searchable, but rather indexes the text so that it can be found again, and stores that text in iCloud to save repeating the work on other devices.

Keep It has always been able to index the text in PDFs that have selectable text or that have already had OCR performed on them. It does not perform unnecessary text recognition on those, or on images that do not appear to contain any text.

Keep It only performs OCR on standalone PDFs and images, not attachments in notes or other files.

For existing pre-1.6 Keep It users on Mac, the app will perform text recognition on all suitable files when version 1.6 or later is first installed, and thereafter whenever suitable files are added or modified.

On iOS, Keep It will temporarily download items to perform text recognition on them while the app is in the foreground and the device is connected to Wi-Fi.

Text recognition may take some time; see below for more details.

PDF Documents

For PDFs, Keep It will not perform text recognition if there is indexable text in the document already. Instead, the text stored in the document will be indexed so that it can be searched (as in earlier versions of Keep It).
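
The decision described above can be sketched roughly as follows. This is only an illustration in Python, not Keep It's actual code: the PdfDocument type, its page_texts field, and the Action names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    INDEX_EXISTING_TEXT = "index existing text"
    RECOGNIZE_TEXT = "perform text recognition"


@dataclass
class PdfDocument:
    # Hypothetical stand-in: the text already stored on each page, if any.
    page_texts: List[str]


def choose_action(document: PdfDocument) -> Action:
    """Skip text recognition when the PDF already contains indexable text."""
    has_indexable_text = any(text.strip() for text in document.page_texts)
    if has_indexable_text:
        return Action.INDEX_EXISTING_TEXT
    return Action.RECOGNIZE_TEXT
```

Under these assumptions, a PDF with selectable text on any page is indexed as-is, and only a PDF with no stored text at all is sent for text recognition.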

After performing text recognition, Keep It does not modify PDF documents to add an invisible text layer, but instead indexes that text so it can be searched. The text is also stored in iCloud, if in use, to avoid repeating that work on other devices. This data will take up minimal space, typically between 1 and 2 kilobytes per page.
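
As a rough guide to how much space this takes, the 1 to 2 kilobytes per page figure can be turned into a simple estimate. The function name below is made up for illustration, and the result is only as accurate as the "typical" figure itself.

```python
from typing import Tuple

# Typical range quoted above, in kilobytes per page.
MIN_KB_PER_PAGE = 1
MAX_KB_PER_PAGE = 2


def estimate_recognized_text_kb(page_count: int) -> Tuple[int, int]:
    """Return a (low, high) estimate in kilobytes for the stored text."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    return (page_count * MIN_KB_PER_PAGE, page_count * MAX_KB_PER_PAGE)


low, high = estimate_recognized_text_kb(50)
print(f"A 50-page document: about {low} to {high} KB")
```

By this estimate, a 50-page scanned document would add somewhere between 50 and 100 KB of recognized text.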

Larger PDF documents and those with more complex layouts may take some time. While it might take a few seconds to recognize the text on a single page, larger and more complex documents could take a few minutes. 

Keep It always performs text recognition in the background (and on Mac, in a completely separate process). To see progress on Mac, choose Window > Activity from the menu and check whether Keep It is performing any “Fetching metadata” operations. On iOS, tap the status bar below either of the lists.

Images

For images, Keep It uses computer vision to detect areas of text in the image, and only performs text recognition on any areas found. Keep It will take steps to refine the quality of the text, but only text where there is a high contrast between foreground and background colors is likely to yield good results.
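
The two-stage approach described above (find areas of text first, then recognize only those areas) might look something like this in outline. It is a sketch only: the detector and recognizer are passed in as placeholder functions, and the Region type is hypothetical rather than anything Keep It or Apple's frameworks define.

```python
from typing import Callable, List, Tuple

# Hypothetical bounding box: (x, y, width, height) in pixels.
Region = Tuple[int, int, int, int]


def recognize_image_text(
    image: object,
    detect_text_regions: Callable[[object], List[Region]],
    recognize_region: Callable[[object, Region], str],
) -> List[str]:
    """Run recognition only on the regions where text was detected."""
    regions = detect_text_regions(image)
    if not regions:
        # No text areas found, so no recognition work is done at all.
        return []

    texts = []
    for region in regions:
        text = recognize_region(image, region).strip()
        if text:
            texts.append(text)
    return texts
```

The point of this arrangement is that an image with no detected text costs only the detection step, and recognition never has to process the whole image.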

As with PDFs, image files are not modified, but rather the text is indexed so that it can be searched, and stored in iCloud (if in use) to avoid duplication of work across devices.

Text recognition for images may not be as accurate on macOS High Sierra and iOS 11 as on Mojave and iOS 12 or later, due to advancements in Apple’s computer vision technology.

Screenshots

OCR works best with higher resolution images such as photos, and for text where there is a high contrast between foreground and background colors. Screenshots may produce good results where larger fonts are used, or the screenshot was taken from a higher resolution screen, such as a Retina display.

Attachments

Keep It does not perform text recognition on PDFs or images that are embedded in other files, such as attachments in notes.

Handwriting

The OCR engine has not been trained to recognize handwriting, so handwritten text is unlikely to produce usable results.

Seeing the Recognized Text

There is no way to see the recognized text in the app, but it will be indexed so that it can be searched.

Languages and Scripts

The OCR engine Keep It uses relies on knowing which language it needs to recognize. By default, Keep It uses the same language as your Mac or iOS device. This can be changed to another language, or a script that encompasses many related languages (e.g. Latin). 

It is not possible to override the language on a per-document basis, or to specify multiple languages or scripts. However, most of the scripts, with the exception of Cyrillic, also include support for English.

In cases where languages can be written both horizontally and vertically (e.g. Japanese), the vertical version may be used as a secondary language.
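
The language rules above can be summarized in a short sketch. The language identifiers (such as "ja-vertical") and the function name are invented for illustration and are not the identifiers Keep It uses; the sketch also assumes the vertical version is always added, whereas in practice it only may be.

```python
from typing import List, Optional

# Hypothetical mapping from a language to its vertical variant.
VERTICAL_VARIANTS = {"ja": "ja-vertical"}


def recognition_languages(
    device_language: str, override: Optional[str] = None
) -> List[str]:
    """Return the primary language, plus a vertical variant if one exists."""
    # The setting in the app, if changed, takes the place of the device language.
    primary = override or device_language
    languages = [primary]

    vertical = VERTICAL_VARIANTS.get(primary)
    if vertical is not None:
        # Assumption: the vertical version is added as the secondary language.
        languages.append(vertical)
    return languages
```

With these assumptions, a device set to English uses English alone, while choosing Japanese gives Japanese first with its vertical form as the secondary language.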

To change the language, or choose a script:

  • On Mac, choose Keep It > Preferences from the menu, then click Search. Choose the language or script using the pop-up.
  • On iOS, tap the gear icon above the Lists view, then tap on Text Recognition, and choose a language from the list.

Keep It will offer to reindex documents when the language is changed.

Disabling Text Recognition

Text recognition can be enabled or disabled on a per-device basis. To disable text recognition:

  • On Mac, choose Keep It > Preferences from the menu, then click Search. Disable the “Recognize text in PDFs and images” option.
  • On iOS, tap the gear icon above the Lists view, then tap on Text Recognition, and switch Text Recognition off.