QNAP, Inc. - Network Attached Storage (NAS)

How to Use OCR Converter to Recognize and Extract Text from Images?

About OCR Converter

OCR Converter uses OCR (Optical Character Recognition) technology to recognize text in images and convert it into editable documents. You can specify the output file format and the languages of the text in the source images. You can also create schedules to run conversion tasks at specified times.

System Requirements and Compatibility

To use OCR Converter, the NAS must run QTS 4.3.4 (or later versions) and have at least 2 GB of memory. OCR Converter supports both x86-based and ARM-based models, except the TAS series. Text Editor is required for running OCR Converter.

OCR Converter currently recognizes text written in English, Traditional Chinese, Simplified Chinese, and German. We will add support for more languages in future releases.

Installation

To install and enable OCR Converter, log on to QTS and then go to the App Center. Note that QTS automatically downloads and installs Text Editor when installing OCR Converter.

Creating an OCR Task

To create an OCR task, click “Create OCR Task” in the top-right corner and then select a task type.

One-time Task

You can create OCR tasks that are performed only one time.

  1. Select “One-time”.
  2. Specify the task name.
  3. Click folders on the tree structure to view folders and select files. You can double-click the folders to view their subfolders.
  4. Configure conversion settings.
    You can manually configure the settings or click “Apply default settings” to apply the default settings to all the files on the conversion list.
    1. OCR Languages: Select up to three languages in the source images and rank them based on their proportions in the images.
      Note: This order will affect the conversion result. You can drag languages to adjust the order.
    2. Output format: You can choose TXT or PDF as the output format. You can further edit the converted text file using “Text Editor”.
    3. Text direction: Specify the direction of the text in the source images to improve the efficiency of text recognition.
    4. Download folders: The converted files will be saved in the same path as the source files. This helps you avoid repeatedly converting the same images.
  5. Verify the task settings and then click “Apply”.
    You can view the status of tasks on the homepage.
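
OCR Converter is based on the open-source Tesseract engine, and Tesseract's own command line takes several languages joined by "+" in a single "-l" option. The sketch below shows how a ranked language list and an output format could map onto such a command. It is only an illustration under stated assumptions: the function name and the language-name mapping are hypothetical, and this is not necessarily how OCR Converter invokes the engine internally.

```python
# Hypothetical mapping from the language names shown in OCR Converter
# to Tesseract's language data codes.
TESSERACT_CODES = {
    "English": "eng",
    "Traditional Chinese": "chi_tra",
    "Simplified Chinese": "chi_sim",
    "German": "deu",
}


def build_tesseract_command(image_path, output_base, languages, output_format="txt"):
    """Build a tesseract argument list from a ranked list of languages.

    `languages` is ordered from most to least common in the image,
    mirroring the ranking step in the task settings.
    """
    if not 1 <= len(languages) <= 3:
        raise ValueError("select between one and three languages")
    if output_format not in ("txt", "pdf"):
        raise ValueError("output format must be 'txt' or 'pdf'")

    # Join the codes with '+' while preserving the ranked order.
    lang_arg = "+".join(TESSERACT_CODES[name] for name in languages)
    command = ["tesseract", image_path, output_base, "-l", lang_arg]

    # Plain text is Tesseract's default output; 'pdf' is a config name.
    if output_format == "pdf":
        command.append("pdf")
    return command


if __name__ == "__main__":
    print(build_tesseract_command("scan.png", "scan", ["English", "German"], "pdf"))
```

The returned list can be passed to subprocess.run on a machine where Tesseract and the matching language data are installed.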

Scheduled Task

You can convert image files at specified times using the same settings (such as languages and text direction). We recommend placing images with the same languages in the same folders.

  1. Select “Schedule”.
  2. Specify the details of the schedule.
  3. Specify the path of the source folders.
    Note: The converted files will be saved in the same path as the source files.
  4. Select the languages of the text.
  5. Select the output format and text direction.
  6. Verify the settings and then click “Apply”.

OCR tasks will be automatically created at the specified times.

Other Settings and Operations

OCR Converter also allows you to configure other settings and perform various actions:

  • You can select multiple files, download files, and remove files from the homepage. You can also sort tasks by their creation time, end time, name, or status.
  • You can choose to download only converted files or both the source files and converted files.
  • You can remove completed tasks from the homepage. Removing completed tasks does not delete the actual files. After tasks are removed, you can still view and access the files in File Station.
  • To manage scheduled tasks, click on the top-right corner and then select “Schedule”.
  • To view the files in a conversion task, click the title of a task.
  • You can view the status of OCR tasks. You can also click a file to preview its source file and converted file.
  • You can preview and compare converted documents with source images. To edit a text file, click “Open with Text Editor”.

Improving Conversion Results

OCR Converter is based on the open-source engine Tesseract. Recognition accuracy varies with the quality of the images and the conversion settings. We recommend choosing images that have a resolution of at least 300 dpi and a clear background. The source images should contain few or no handwritten words. To convert images efficiently, ensure that you select all the languages that appear in the source images and rank them based on their proportions.
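
To check whether a scan meets the 300 dpi recommendation, divide its pixel dimensions by the physical size of the original page in inches. The following is a minimal sketch of that arithmetic; the function names are hypothetical and it assumes you know the size of the scanned page.

```python
def effective_dpi(width_px, height_px, width_in, height_in):
    """Return the lower of the horizontal and vertical resolutions in dpi."""
    return min(width_px / width_in, height_px / height_in)


def meets_recommendation(width_px, height_px, width_in, height_in, minimum_dpi=300):
    """Check a scan against the recommended minimum resolution."""
    return effective_dpi(width_px, height_px, width_in, height_in) >= minimum_dpi


if __name__ == "__main__":
    # A US Letter page (8.5 x 11 inches) scanned at 2550 x 3300 pixels.
    print(effective_dpi(2550, 3300, 8.5, 11))         # 300.0
    # The same page scanned at half the pixel dimensions.
    print(meets_recommendation(1275, 1650, 8.5, 11))  # False
```

A page that falls below the threshold is best rescanned at a higher resolution, since enlarging the existing image adds pixels but no detail.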

Release date: 2017-10-26