plugin-ocr

Local OCR plugin using Tesseract for on-screen text search and extraction.

Overview

The @nut-tree/plugin-ocr plugin implements the TextFinderInterface for on-screen text detection. Recognition runs locally through Tesseract, so no data has to be sent to an external service.

Text Search

Find text on screen by word or line

screen.find(singleWord("Login"))

Text Extraction

Read text from screen regions

screen.read({ searchRegion })

100+ Languages

Preload and use multiple OCR languages

preloadLanguages([Language.English])

Installation

bash
npm i @nut-tree/plugin-ocr

Subscription Required

This package is included in OCR, Solo, and Team subscription plans.

Configuration

Configure the OCR plugin with configure() to set the data path and language model type:

typescript
import { useOcrPlugin, configure, LanguageModelType } from "@nut-tree/plugin-ocr";

useOcrPlugin();

configure({
    // Directory for storing downloaded language models
    dataPath: "./ocr-data",

    // Model quality: DEFAULT, BEST (higher accuracy), or FAST
    languageModelType: LanguageModelType.BEST,
});
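A relative dataPath such as "./ocr-data" is presumably resolved against the working directory of the running process, as relative paths generally are in Node.js. The sketch below shows one way to pin it to an absolute location before handing it to configure(). The helper resolveOcrDataPath and the base directory "/opt/my-app" are hypothetical and not part of the plugin.

```typescript
import * as path from "node:path";

// Hypothetical helper (not part of @nut-tree/plugin-ocr): anchors a possibly
// relative directory at a chosen base directory and returns an absolute path.
// path.resolve leaves an already absolute dataPath unchanged.
function resolveOcrDataPath(dataPath: string, baseDir: string = process.cwd()): string {
    return path.resolve(baseDir, dataPath);
}

// "/opt/my-app" is an assumed application root, used only for illustration.
const dataPath = resolveOcrDataPath("./ocr-data", "/opt/my-app");
console.log(dataPath);
```

The resulting value could then be passed as configure({ dataPath }), so that downloaded models end up in the same directory regardless of where the script is started.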

Preloading Languages

Use preloadLanguages() to download language models before they are first needed, passing the languages and the model types to fetch. The plugin supports more than 100 languages:

typescript
import { useOcrPlugin, preloadLanguages, Language, LanguageModelType } from "@nut-tree/plugin-ocr";

useOcrPlugin();

await preloadLanguages(
    [Language.English, Language.German],
    [LanguageModelType.BEST],
);

Text Search

Find text on screen using singleWord() or textLine():

typescript
import { screen, singleWord, mouse, centerOf, straightTo } from "@nut-tree/nut-js";
import { useOcrPlugin, configure, Language } from "@nut-tree/plugin-ocr";

useOcrPlugin();
configure({ dataPath: "./ocr-data" });

screen.config.ocrConfidence = 0.8;

const location = await screen.find(singleWord("WebStorm"), {
    providerData: {
        lang: [Language.English, Language.German],
        partialMatch: false,
        caseSensitive: false,
    },
});

await mouse.move(straightTo(centerOf(location)));

Search Configuration

lang

lang?: Language[]
default: [Language.English]

Languages for OCR recognition

partialMatch

partialMatch?: boolean
default: false

Allow partial text matches

caseSensitive

caseSensitive?: boolean
default: false

Case-sensitive text matching

preprocessConfig

preprocessConfig?: object
optional

Image preprocessing settings applied before OCR
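To make the effect of partialMatch and caseSensitive concrete, here is a minimal sketch of how the two flags could be interpreted when comparing a recognized word against a query. It is an illustrative model only: the TextMatchOptions type and the matchesQuery function are hypothetical, and treating a partial match as substring containment is an assumption. The plugin's actual matching logic may differ.

```typescript
// Hypothetical option shape mirroring the two documented search flags.
interface TextMatchOptions {
    partialMatch?: boolean;  // documented default: false
    caseSensitive?: boolean; // documented default: false
}

// Illustrative model, not the plugin's implementation: decides whether a
// recognized piece of text satisfies a query under the given flags.
function matchesQuery(recognized: string, query: string, options: TextMatchOptions = {}): boolean {
    const { partialMatch = false, caseSensitive = false } = options;
    const haystack = caseSensitive ? recognized : recognized.toLowerCase();
    const needle = caseSensitive ? query : query.toLowerCase();
    // Assumption: "partial" means the query appears anywhere inside the text.
    return partialMatch ? haystack.includes(needle) : haystack === needle;
}

console.log(matchesQuery("WebStorm", "webstorm"));                          // true
console.log(matchesQuery("WebStorm", "webstorm", { caseSensitive: true })); // false
console.log(matchesQuery("WebStorm", "Web"));                               // false
console.log(matchesQuery("WebStorm", "Web", { partialMatch: true }));       // true
```

With both flags left at their defaults, a query has to match the whole recognized word, ignoring case.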

Text Extraction

Extract text from screen regions using screen.read():

typescript
import { screen, getActiveWindow } from "@nut-tree/nut-js";
import { useOcrPlugin, configure, TextSplit } from "@nut-tree/plugin-ocr";

useOcrPlugin();
configure({ dataPath: "./ocr-data" });

const activeWindow = await getActiveWindow();
const text = await screen.read({
    searchRegion: activeWindow.getRegion(),
    split: TextSplit.LINE,
});

TextSplit Options

Control the granularity of extracted text:

NONE

TextSplit.NONE
default

Return as a single text block

SYMBOL

TextSplit.SYMBOL

Split by individual characters

WORD

TextSplit.WORD

Split by words

LINE

TextSplit.LINE

Split by lines

PARAGRAPH

TextSplit.PARAGRAPH

Split by paragraphs

BLOCK

TextSplit.BLOCK

Split by text blocks
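The difference between the granularity levels can be illustrated on a plain string. The sketch below is not how the plugin works internally and says nothing about the shape of the value returned by screen.read(). The Granularity type and the splitText function are hypothetical stand-ins, and PARAGRAPH and BLOCK are left out because they depend on page layout rather than on characters alone.

```typescript
// Hypothetical stand-in for some of the TextSplit values, used only in this sketch.
type Granularity = "NONE" | "SYMBOL" | "WORD" | "LINE";

// Splits an in-memory string at the requested granularity.
function splitText(text: string, granularity: Granularity): string[] {
    switch (granularity) {
        case "NONE":
            // Keep everything as one block of text.
            return [text];
        case "LINE":
            // One entry per non-empty line.
            return text.split("\n").filter((line) => line.trim().length > 0);
        case "WORD":
            // One entry per whitespace-separated word.
            return text.split(/\s+/).filter((word) => word.length > 0);
        case "SYMBOL":
            // One entry per character, with whitespace dropped.
            return Array.from(text.replace(/\s+/g, ""));
    }
}

const sample = "File Edit\nView";
console.log(splitText(sample, "NONE").length);   // 1
console.log(splitText(sample, "LINE"));          // [ 'File Edit', 'View' ]
console.log(splitText(sample, "WORD"));          // [ 'File', 'Edit', 'View' ]
console.log(splitText(sample, "SYMBOL").length); // 12
```

Finer granularity yields more, smaller entries for the same text, which is the trade-off to keep in mind when choosing a TextSplit value.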


plugin-ocr vs plugin-azure

Aspect        | plugin-ocr               | plugin-azure
Processing    | Local (Tesseract)        | Cloud (Azure AI Vision)
Data privacy  | No external transmission | Data sent to Azure
Accuracy      | Standard                 | Higher on complex/low-quality images
Requirements  | None (standalone)        | Azure account + API key
