OCR
plugin-ocr
Local OCR plugin using Tesseract for on-screen text search and extraction.
Overview
The @nut-tree/plugin-ocr plugin implements the TextFinderInterface for on-screen text detection. It processes text locally with Tesseract, so no screen content is sent to an external service.
Text Search
Find text on screen by word or line
screen.find(singleWord("Login"))
Text Extraction
Read text from screen regions
screen.read({ searchRegion })
100+ Languages
Preload and use multiple OCR languages
preloadLanguages([Language.English])
Installation
npm i @nut-tree/plugin-ocr
Subscription Required
Configuration
Configure the OCR plugin with configure() to set the data path and language model type:
import { useOcrPlugin, configure, LanguageModelType } from "@nut-tree/plugin-ocr";
useOcrPlugin();
configure({
    // Directory for storing downloaded language models
    dataPath: "./ocr-data",
    // Model quality: DEFAULT, BEST (higher accuracy), or FAST
    languageModelType: LanguageModelType.BEST,
});
Preloading Languages
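Use preloadLanguages() to download language models before they are first needed. The plugin supports more than 100 languages; pass the ones you need together with the model types to fetch: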
import { useOcrPlugin, preloadLanguages, Language, LanguageModelType } from "@nut-tree/plugin-ocr";
useOcrPlugin();
await preloadLanguages(
    [Language.English, Language.German],
    [LanguageModelType.BEST],
);
Text Search
Find text on screen using singleWord() or textLine():
import { screen, singleWord, mouse, centerOf, straightTo } from "@nut-tree/nut-js";
import { useOcrPlugin, configure, Language } from "@nut-tree/plugin-ocr";
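useOcrPlugin();
configure({ dataPath: "./ocr-data" });
// Minimum confidence an OCR result needs to count as a match
screen.config.ocrConfidence = 0.8;
// Search for the word "WebStorm" using the English and German models
const location = await screen.find(singleWord("WebStorm"), {
    providerData: {
        lang: [Language.English, Language.German],
        partialMatch: false,
        caseSensitive: false,
    },
});
// Move the mouse to the center of the matched region
await mouse.move(straightTo(centerOf(location)));
Search Configuration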
lang
lang?: Language[]
Languages for OCR recognition
partialMatch
partialMatch?: boolean
Allow partial text matches
caseSensitive
caseSensitive?: boolean
Case-sensitive text matching
preprocessConfig
preprocessConfig?: object
Image preprocessing settings applied before OCR
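To make the partialMatch and caseSensitive options more concrete, here is a minimal, self-contained sketch of how two such flags could combine when recognized text is compared against a query. It is not the plugin's implementation: the matchesQuery function, the MatchOptions type, and the default values used are all assumptions made for illustration.

```typescript
// Hypothetical illustration only; NOT the plugin's actual matching code.
interface MatchOptions {
    // Assumed meaning: accept the query as a substring of the recognized text
    partialMatch?: boolean;
    // Assumed meaning: compare upper and lower case exactly as written
    caseSensitive?: boolean;
}

function matchesQuery(
    recognized: string,
    query: string,
    options: MatchOptions = {},
): boolean {
    // The defaults here are assumptions, not documented plugin defaults
    const { partialMatch = false, caseSensitive = false } = options;

    // Normalize case on both sides unless case-sensitive matching is requested
    const text = caseSensitive ? recognized : recognized.toLowerCase();
    const needle = caseSensitive ? query : query.toLowerCase();

    // Partial matching accepts a substring; otherwise the whole text must match
    return partialMatch ? text.indexOf(needle) !== -1 : text === needle;
}
```

Under these assumptions, `matchesQuery("WebStorm 2023", "WebStorm")` is rejected while the same call with `{ partialMatch: true }` is accepted, which mirrors the kind of trade-off the options above control.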
Text Extraction
Extract text from screen regions using screen.read():
import { screen, getActiveWindow } from "@nut-tree/nut-js";
import { useOcrPlugin, configure, TextSplit } from "@nut-tree/plugin-ocr";
useOcrPlugin();
configure({ dataPath: "./ocr-data" });
const activeWindow = await getActiveWindow();
const text = await screen.read({
    searchRegion: activeWindow.getRegion(),
    split: TextSplit.LINE,
});
TextSplit Options
Control the granularity of extracted text:
NONE
TextSplit.NONE
Return as a single text block
SYMBOL
TextSplit.SYMBOL
Split by individual characters
WORD
TextSplit.WORD
Split by words
LINE
TextSplit.LINE
Split by lines
PARAGRAPH
TextSplit.PARAGRAPH
Split by paragraphs
BLOCK
TextSplit.BLOCK
Split by text blocks
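The effect of granularity can be sketched on a plain string. The example below is a rough analogy only: the Granularity type and splitText function are hypothetical stand-ins that borrow four of the option names, and the plugin itself segments text from Tesseract's layout analysis rather than from whitespace. The shape of the value that screen.read() returns for each option is not stated here, so the sketch makes no claim about it.

```typescript
// Hypothetical stand-in borrowing four TextSplit names; not the plugin's enum.
type Granularity = "NONE" | "LINE" | "WORD" | "SYMBOL";

// Splits already-extracted text on whitespace to illustrate granularity.
// Assumption: real OCR segmentation is layout-based, not whitespace-based.
function splitText(text: string, granularity: Granularity): string[] {
    switch (granularity) {
        case "NONE":
            // One entry containing the whole text
            return [text];
        case "LINE":
            // One entry per non-empty line
            return text.split("\n").filter((line) => line.trim().length > 0);
        case "WORD":
            // One entry per whitespace-separated word
            return text.split(/\s+/).filter((word) => word.length > 0);
        case "SYMBOL":
            // One entry per non-whitespace UTF-16 code unit
            return text.replace(/\s+/g, "").split("");
    }
}
```

A finer granularity yields more, smaller entries from the same input: for "Save file\nOpen project", this sketch produces one entry for NONE, two for LINE, and four for WORD.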
plugin-ocr vs plugin-azure
| Aspect | plugin-ocr | plugin-azure |
|---|---|---|
| Processing | Local (Tesseract) | Cloud (Azure AI Vision) |
| Data privacy | No external transmission | Data sent to Azure |
| Accuracy | Standard | Higher on complex/low-quality images |
| Requirements | None (standalone) | Azure account + API key |