JSON Protocol
Every command returns structured JSON. Parseable by any LLM, any language, any agent framework.
Structured JSON commands for mouse, keyboard, screen, windows, and accessibility trees. One npm install, zero boilerplate.
# Discover all interactive elements in a window
nib snapshot --window "Login" -i
# → @txt:Username @txt:Password @chk:Remember @btn:Login
# Fill in the form using stable element refs
nib type-element @txt:Username admin --window "Login"
nib type-element @txt:Password secret123 --window "Login"
nib click-element @btn:Login --window "Login"
# Verify the result
nib diff --window "Login"
# → @btn:Login changed: isEnabled true → false

Every design decision optimized for AI agents — structured output, semantic identifiers, and minimal token overhead.
Human-readable @btn:Save refs instead of fragile coordinates or XPath. Stable across layout changes.
Snapshot, query, and diff the full a11y tree. Agents see the UI semantically, not just pixels.
Chain commands in .nib files. Execute multi-step workflows in a single call.
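A batch file could look like the sketch below. The one-command-per-line format is an assumption, not documented syntax; the commands themselves are the ones shown elsewhere on this page:

```
# login.nib — hypothetical batch script (file format assumed, not official)
snapshot --window "Login" -i
type-element @txt:Username admin --window "Login"
type-element @txt:Password secret123 --window "Login"
click-element @btn:Login --window "Login"
diff --window "Login"
```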
The agent loop is four steps: observe the UI, let the LLM decide, execute the command, verify the result. NIB handles steps 1, 3, and 4.
No SDK required
Shell exec + JSON parse. Works from TypeScript, Python, Go, Rust — anything.
Observe-Act-Verify built in
Snapshot captures state, diff confirms the effect. The agent always knows what happened.
Structured error handling
Errors include codes, messages, and suggestions the LLM can act on directly.
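As a sketch of what acting on those errors might look like: the success envelope (`ok`, `command`, `data`) appears in the examples on this page, but the nesting of `code`, `message`, and `suggestion` under an `error` key is an assumption.

```typescript
// Assumed error envelope: failures set ok: false and carry an `error`
// object with the code/message/suggestion fields described above.
interface NibError {
  code: string;
  message: string;
  suggestion?: string;
}

interface NibResult {
  ok: boolean;
  command: string;
  data?: unknown;
  error?: NibError;
}

// Turn a failed result into a single line the LLM can act on directly,
// instead of throwing and losing the suggestion.
function describeFailure(res: NibResult): string | null {
  if (res.ok || !res.error) return null;
  const { code, message, suggestion } = res.error;
  return suggestion
    ? `${code}: ${message} (try: ${suggestion})`
    : `${code}: ${message}`;
}
```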
import { execSync } from "node:child_process";
import Anthropic from "@anthropic-ai/sdk";

function nib(cmd: string) {
  return JSON.parse(execSync(`nib ${cmd}`).toString());
}

const client = new Anthropic();

async function agentStep(task: string, window: string) {
  // 1. Observe — snapshot the UI
  const { data } = nib(`snapshot --window "${window}" -i`);
  // 2. Decide — let the LLM pick an action
  const response = await client.messages.create({
    model: "claude-sonnet-4-5-20250929",
    max_tokens: 1024,
    system: `You control a desktop via nib commands.
Available elements:\n${data.refs.join("\n")}`,
    messages: [{ role: "user", content: task }],
    tools: nibTools,
  });
  // 3. Act — execute the command
  const tool = response.content.find((b) => b.type === "tool_use");
  if (!tool) throw new Error("model did not call a tool");
  nib(`${tool.name} ${tool.input.ref} --window "${window}"`);
  // 4. Verify — diff to confirm the effect
  return nib(`diff --window "${window}"`);
}

$ nib snapshot --window "Settings" -i
{
  "ok": true,
  "command": "snapshot",
  "data": {
    "window": { "title": "Settings" },
    "refCount": 8,
    "refs": [
      "@btn:Apply",
      "@btn:Cancel",
      "@chk:DarkMode",
      "@chk:Notifications [checked]",
      "@sld:Volume = \"75\"",
      "@tab:General [focused]",
      "@tab:Privacy",
      "@txt:DisplayName = \"Alice\""
    ]
  }
}

Snapshots return every interactive element as a human-readable ref. Diffs show exactly what changed after an action. Your agent understands the UI without parsing pixels.
After every action, nib diff compares the current accessibility tree against the last snapshot. The agent gets a precise report of what was added, removed, or changed — no guesswork, no screenshots to parse.
Confirm actions succeeded
Did the checkbox actually toggle? Did the dialog close? Diff tells you.
Catch unexpected side effects
See new elements that appeared, buttons that became disabled, or values that changed.
Faster than re-snapshotting
Diff returns only what changed — fewer tokens, faster agent decisions.
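A minimal verify step over a diff payload might look like this. The `added`/`removed`/`changed` shape is taken from the example output on this page; any field names beyond those are illustrative.

```typescript
// Diff payload shape, mirroring the example diff output on this page.
interface RefChange {
  ref: string;
  [prop: string]: unknown; // e.g. isChecked: { from: false, to: true }
}

interface DiffData {
  added: string[];
  removed: string[];
  changed: RefChange[];
}

// Did a specific property of a specific element change?
function propChanged(diff: DiffData, ref: string, prop: string): boolean {
  return diff.changed.some((c) => c.ref === ref && prop in c);
}

// Did the action have no observable effect at all?
function isNoOp(diff: DiffData): boolean {
  return (
    diff.added.length === 0 &&
    diff.removed.length === 0 &&
    diff.changed.length === 0
  );
}
```

An agent can use these checks to retry or re-plan when an action silently failed, rather than blindly proceeding.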
# Take a baseline snapshot
$ nib snapshot --window "Settings" -i
# Perform an action
$ nib click-element @chk:DarkMode --window "Settings"
# Compare current state against the last snapshot
$ nib diff --window "Settings"
{
  "ok": true,
  "command": "diff",
  "data": {
    "added": [
      "@btn:Restart"
    ],
    "removed": [],
    "changed": [
      {
        "ref": "@chk:DarkMode",
        "isChecked": { "from": false, "to": true }
      }
    ]
  }
}

NIB is a CLI. It works with whatever agent framework you already use.
Expose nib commands as MCP tools. Claude Code, Cursor, and other MCP clients drive desktops directly.
Map nib commands to OpenAI/Anthropic tool schemas. The LLM picks the tool, your harness runs the command.
LLM generates a .nib batch script, you execute it in one shot. Ideal for planned, multi-step workflows.
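A `nibTools` array like the one the TypeScript example on this page passes to the Messages API could be sketched as follows. The command names come from nib's command list; the input fields are an illustrative guess, not nib's published schema.

```typescript
// Hypothetical Anthropic tool-use schemas wrapping two nib commands.
// The `input_schema` fields are assumptions for illustration only.
const nibTools = [
  {
    name: "click-element",
    description: "Click a UI element by its accessibility ref, e.g. @btn:Save.",
    input_schema: {
      type: "object",
      properties: {
        ref: { type: "string", description: "Element ref from the last snapshot" },
      },
      required: ["ref"],
    },
  },
  {
    name: "type-element",
    description: "Type text into a UI element identified by ref.",
    input_schema: {
      type: "object",
      properties: {
        ref: { type: "string" },
        text: { type: "string" },
      },
      required: ["ref", "text"],
    },
  },
];
```

Because each tool name is itself a nib command, the harness can translate a tool call back into a shell invocation with no lookup table.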
Mouse, keyboard, screen, windows, clipboard, accessibility, OCR, and more — all through a single CLI with consistent JSON responses.
click, type, press, drag, scroll, mouse-move-smooth, screenshot-base64, read, find-text, screen-size, color-at, list-windows, focus-window, active-window, resize-window, move-window, click-element, type-element, focus-element, hover-element, scroll-element, snapshot, diff, find-element, get-element, check-element, wait, wait-for-window, batch, clipboard-get, clipboard-set

Let Claude or GPT operate any desktop app on the user's behalf — no per-app integration needed.
Agents that explore applications, fill forms, and verify state changes without hand-written test scripts.
Snapshot a window, read text via OCR, and parse structured data from any application.
Chain desktop actions with API calls. Automate cross-app workflows that have no API.
NIB ships with a SKILL.md that teaches AI coding agents how to use it. Drop it into your project and Claude Code, Cursor, or any agent that reads markdown instructions can drive your desktop immediately.
Complete command reference
Every command, option, and pattern documented for the agent
Common patterns included
Form filling, menu navigation, dialog handling, state verification
Reliability rules baked in
Best practices the agent follows automatically: snapshot before refs, re-snapshot after transitions, prefer refs over coordinates
SKILL.md
Agent instruction file
# Workflow
Always prefer accessibility-based element interaction over coordinate-based mouse clicks.
# Rules for reliability
1. Always snapshot before using element refs
2. Re-snapshot after UI transitions
3. Use diff for incremental updates
# Common patterns
Form filling, menu navigation, dialog handling, keyboard shortcuts, state verification...
NIB is included in Solo and Team plans. Let your AI operate any application — no per-app integration, no browser required.