Hanzi Documentation

Hanzi gives AI agents a real browser. Use it locally with your own model, or embed it in your product via the API.

Use Hanzi now

One command sets up everything: detects your browsers, installs the Chrome extension, finds AI agents on your machine, and configures MCP.

npx hanzi-browse setup

Supports Claude Code, Cursor, Windsurf, Claude Desktop, and Codex.

What setup does

  1. Checks for the Chrome extension — opens the install page if missing
  2. Scans for supported AI agents on your machine
  3. Adds Hanzi as an MCP server to each agent's config
  4. Imports credentials (Claude Code OAuth, Codex, or API key)

Supported credentials

| Source | How |
| --- | --- |
| Claude Code | Auto-detected from claude login |
| Codex | Auto-detected from codex login |
| API key | Set ANTHROPIC_API_KEY env var or enter during setup |
| Custom endpoint | Any OpenAI-compatible API (Ollama, LM Studio, etc.) |

Manual setup

If you prefer to configure manually:

# Claude Code
claude mcp add browser -- npx -y hanzi-browse

# Cursor / Windsurf (mcp.json)
{
  "mcpServers": {
    "browser": {
      "command": "npx",
      "args": ["-y", "hanzi-browse"]
    }
  }
}
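
If an mcp.json already exists, the entry above has to be merged into it rather than pasted over it. The following is a minimal Python sketch of that merge, assuming the file follows the mcpServers layout shown above. The helper names add_hanzi_server and update_config_file are hypothetical, and the config file location depends on your agent.

```python
import json
from pathlib import Path


def add_hanzi_server(config: dict, name: str = "browser") -> dict:
    """Return a copy of an MCP config with the Hanzi server registered under `name`.

    Other servers and top-level keys are preserved; an existing entry with
    the same name is replaced.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers") or {})
    # Same command and args as the manual snippet above.
    servers[name] = {"command": "npx", "args": ["-y", "hanzi-browse"]}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Merge the Hanzi entry into the JSON file at `path`, creating it if missing."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(json.dumps(add_hanzi_server(config), indent=2) + "\n", encoding="utf-8")
```

Restart the agent after editing the file so it reloads the config.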

Test it

After setup, ask your agent something that needs a browser:

"Go to Hacker News and tell me the top 3 stories right now"

Build with Hanzi

Embed browser automation in your product. Your app calls the Hanzi API, a real browser executes the task, and you get the result back.

How it works

[Diagram: Your App (POST /v1/tasks, GET /v1/tasks/:id; shows results to user) makes an API call to Hanzi (runs the AI agent, executes browser tools, returns answer; labeled Vertex AI (Gemini)). Hanzi connects over WebSocket to the User's Browser (Chrome + Hanzi extension, real signed-in session; clicks, reads, navigates). Tool results flow back to Hanzi, and the answer flows back to Your App.]

Quick start: let your AI agent build it

Copy this prompt into Claude Code, Cursor, or any AI coding agent. It has everything the agent needs to integrate Hanzi into your project.

Add browser automation to this project using the Hanzi API. Read the codebase first, then ask me:

1. What browser task should Hanzi automate? (e.g. "read patient chart", "fill out a form", "extract data from a web portal")
2. Where in the UI should the browser pairing flow go? (e.g. settings page, onboarding, a dedicated page)
3. Where should task results appear? (e.g. inline in the app, a chat interface, a dashboard)

Then build the integration using this API reference:

## Hanzi API (base URL: https://api.hanzilla.co)

Auth: `Authorization: Bearer hic_live_...` header on all requests.

### Core flow
1. Create pairing token → show user a link → they connect their browser
2. Run tasks against their connected browser → poll for results
3. Show the answer in your app

### Endpoints

POST /v1/browser-sessions/pair
  Body: {"label": "User Name", "external_user_id": "your_user_id"}
  Returns: {"pairing_token": "hic_pair_...", "expires_in_seconds": 300}
  → Build a link: https://api.hanzilla.co/pair/{pairing_token}
  → User clicks it, their Chrome auto-pairs. Token expires in 5 min.

GET /v1/browser-sessions
  Returns: {"sessions": [{"id": "...", "status": "connected", "label": "..."}]}

POST /v1/tasks
  Body: {"task": "description", "browser_session_id": "...", "url": "optional", "context": "optional"}
  Returns: {"id": "task_id", "status": "running"}
  → task: what to do (max 10K chars). Be specific.
  → url: starting page (optional). If set, agent navigates there first.
  → context: extra info like form data, preferences (max 50K chars).

GET /v1/tasks/:id
  Returns: {"status": "running|complete|error", "answer": "...", "steps": 4}
  → Poll every 2s until status != "running". Typical task takes 10-60s.

POST /v1/tasks/:id/cancel
  → Stops a running task.

GET /v1/tasks/:id/steps
  Returns: {"steps": [{"step": 1, "status": "tool_use", "toolName": "navigate", ...}]}
  → Full execution log for debugging.

GET /v1/billing/credits
  Returns: {"free_remaining": 20, "credit_balance": 0, "free_tasks_per_month": 20}

### Key details
- 20 free tasks/month, then $0.05 per completed task. Errors are free.
- Tasks time out after 30 min. Use cancel to stop early.
- Browser sessions last 30 days and auto-reconnect.
- The user needs the Hanzi Chrome extension installed.
  Install link: https://chromewebstore.google.com/detail/iklpkemlmbhemkiojndpbhoakgikpmcd

### Example: Express + HTML (minimal)
  See: https://github.com/hanzili/hanzi-browse/tree/main/examples/partner-quickstart

Read the codebase to understand the stack and project structure, then ask me the 3 questions above. After I answer, build the full integration.

Or follow the steps manually

  1. Sign in: open your developer console (Google or email)
  2. Create an API key: from the console, or via POST /v1/api-keys
  3. Pair a browser: generate a pairing token, send your user a link (/pair/{token})
  4. Run a task: POST /v1/tasks with a task and browser session ID

Sample app: See examples/partner-quickstart for a complete working integration (Express + HTML, ~150 lines).

Authentication

All API endpoints (except /v1/health) require authentication. Two methods are supported:

| Method | Use case | How |
| --- | --- | --- |
| API key | Server-to-server, SDK | Authorization: Bearer hic_live_... |
| Session cookie | Developer console, browser | Set automatically after sign-in via Better Auth |

curl https://api.hanzilla.co/v1/api-keys \
  -H "Authorization: Bearer hic_live_your_key_here"

API keys are scoped to a workspace. Each key can access all sessions, tasks, and usage within its workspace. Keys are hashed at rest — the plaintext is shown once on creation.
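
For server-to-server calls, the Bearer header can be built once and reused. This is a small Python sketch, not an official SDK: the function name auth_headers and the HANZI_API_KEY environment variable are hypothetical names chosen for illustration. Only the hic_live_ prefix and the header format come from this page.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for API-key authentication."""
    # HANZI_API_KEY is a hypothetical variable name used only in this sketch.
    key = api_key or os.environ.get("HANZI_API_KEY", "")
    if not key.startswith("hic_live_"):
        # Catch obviously wrong keys before sending a request.
        raise ValueError("expected an API key starting with 'hic_live_'")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Checking the prefix locally only catches copy-paste mistakes; the server remains the authority on whether a key is valid.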

API Keys

POST /v1/api-keys
Create a new API key for your workspace.
# Request
curl -X POST https://api.hanzilla.co/v1/api-keys \
  -H "Authorization: Bearer hic_live_..." \
  -H "Content-Type: application/json" \
  -d '{"name": "production"}'

# Response (201)
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "key": "hic_live_a1b2c3d4e5f6...",
  "name": "production",
  "workspace_id": "...",
  "_warning": "Save this key now. It will not be shown again."
}

GET /v1/api-keys
List all API keys. Returns prefixes only, not full keys.

DELETE /v1/api-keys/:id
Delete an API key. Integrations using this key will immediately stop working.

Browser Sessions

POST /v1/browser-sessions/pair
Create a pairing token (5-minute expiry). The user opens the pairing link built from this token, and the Chrome extension connects their browser (see Browser Pairing).
# Request
curl -X POST https://api.hanzilla.co/v1/browser-sessions/pair \
  -H "Authorization: Bearer hic_live_..." \
  -H "Content-Type: application/json" \
  -d '{"label": "Dr. Smith", "external_user_id": "user_123"}'

# Response (201)
{
  "pairing_token": "hic_pair_a1b2c3...",
  "expires_at": 1710000000000,
  "expires_in_seconds": 300
}

POST /v1/browser-sessions/register
Exchange a pairing token for a session credential. Called by the extension, not your app.

GET /v1/browser-sessions
List all browser sessions with status, label, and external_user_id.
# Response (200)
{
  "sessions": [
    {
      "id": "550e8400-...",
      "status": "connected",
      "label": "Dr. Smith",
      "external_user_id": "user_123",
      "connected_at": 1710000000000,
      "last_heartbeat": 1710000060000
    }
  ]
}

Tasks

POST /v1/tasks
Start a browser automation task. Requires a connected browser session.
# Request
curl -X POST https://api.hanzilla.co/v1/tasks \
  -H "Authorization: Bearer hic_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Read the patient chart on the current page",
    "browser_session_id": "550e8400-...",
    "url": "https://example.com/chart",
    "context": "Extract: name, medications, allergies"
  }'

# Response (201)
{
  "id": "task_abc123",
  "status": "running",
  "task": "Read the patient chart on the current page",
  "browser_session_id": "550e8400-..."
}

| Field | Required | Description |
| --- | --- | --- |
| task | Yes | What to do (max 10,000 chars) |
| browser_session_id | Yes | Connected session to run against |
| url | No | Starting URL (max 2,048 chars) |
| context | No | Extra context for the agent (max 50,000 chars) |

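
The field limits above can be checked client-side before sending a request, which avoids a round trip that would end in a 400. This is a Python sketch under one assumption: that the documented limits map to Python's len() on the string, and the server may count characters differently. The helper name build_task_payload is hypothetical.

```python
from typing import Optional

# Limits taken from the field table above.
MAX_TASK_CHARS = 10_000
MAX_URL_CHARS = 2_048
MAX_CONTEXT_CHARS = 50_000


def build_task_payload(
    task: str,
    browser_session_id: str,
    url: Optional[str] = None,
    context: Optional[str] = None,
) -> dict:
    """Validate the inputs and return a JSON-ready body for POST /v1/tasks."""
    errors = []
    if not task:
        errors.append("task is required")
    elif len(task) > MAX_TASK_CHARS:
        errors.append(f"task exceeds {MAX_TASK_CHARS} chars")
    if not browser_session_id:
        errors.append("browser_session_id is required")
    if url is not None and len(url) > MAX_URL_CHARS:
        errors.append(f"url exceeds {MAX_URL_CHARS} chars")
    if context is not None and len(context) > MAX_CONTEXT_CHARS:
        errors.append(f"context exceeds {MAX_CONTEXT_CHARS} chars")
    if errors:
        raise ValueError("; ".join(errors))

    payload = {"task": task, "browser_session_id": browser_session_id}
    # Optional fields are omitted entirely when not provided.
    if url is not None:
        payload["url"] = url
    if context is not None:
        payload["context"] = context
    return payload
```
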
GET /v1/tasks/:id
Get task status, answer, steps, and usage.
# Response (200) — completed task
{
  "id": "task_abc123",
  "status": "complete",
  "task": "Read the patient chart on the current page",
  "answer": "Patient: Jane Doe. Medications: Lisinopril 10mg...",
  "steps": 4,
  "usage": { "inputTokens": 12000, "outputTokens": 800, "apiCalls": 5 },
  "browser_session_id": "550e8400-...",
  "created_at": 1710000000000,
  "completed_at": 1710000120000
}
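
Because tasks run asynchronously, a client has to poll this endpoint until the status leaves "running". The Python sketch below uses the 2-second interval suggested in the quick-start prompt and the 30-minute task timeout as defaults. The function names fetch_task and wait_for_task are hypothetical, and the HTTP call is kept separate so the loop can be exercised without a network.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_task(task_id: str, api_key: str, base_url: str = "https://api.hanzilla.co") -> dict:
    """GET /v1/tasks/:id and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{base_url}/v1/tasks/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_task(
    get_task: Callable[[str], dict],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the task status is no longer "running", then return the task."""
    deadline = clock() + timeout_s
    while True:
        task = get_task(task_id)
        if task.get("status") != "running":
            return task  # "complete" or "error"
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still running after {timeout_s}s")
        sleep(interval_s)
```

In an application this might be called as wait_for_task(lambda tid: fetch_task(tid, api_key), task_id), then branching on the returned status.
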
POST /v1/tasks/:id/cancel
Cancel a running task.

GET /v1/tasks
List recent tasks for your workspace.

Usage

GET /v1/usage
Usage summary for your workspace.
# Response (200)
{
  "totalInputTokens": 150000,
  "totalOutputTokens": 12000,
  "totalApiCalls": 45,
  "totalCostUsd": 0.082,
  "taskCount": 8
}

Browser Pairing

Pairing connects a user's Chrome browser to your workspace. Users pair once — the session lasts 30 days and auto-reconnects on browser restart.

How it works

  1. Your backend calls POST /v1/browser-sessions/pair to get a pairing token
  2. Show your user a link: https://api.hanzilla.co/pair/{token}
  3. User clicks the link → their browser auto-pairs → done
# Your backend generates the link:
curl -X POST https://api.hanzilla.co/v1/browser-sessions/pair \
  -H "Authorization: Bearer hic_live_..." \
  -H "Content-Type: application/json" \
  -d '{"label": "Dr. Smith", "external_user_id": "user_123"}'

# Response:
# { "pairing_token": "hic_pair_abc123...", "expires_in_seconds": 300 }

# Give your user this link:
# https://api.hanzilla.co/pair/hic_pair_abc123...
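
On the backend, turning the pair response into a user-facing link is a one-line string operation, but tracking the expiry helps you avoid showing a dead link. This is a Python sketch assuming expires_at is a Unix timestamp in milliseconds, which the sample values suggest but this page does not state. The helper name pairing_link is hypothetical.

```python
from typing import Tuple
from urllib.parse import quote

PAIR_BASE_URL = "https://api.hanzilla.co/pair"


def pairing_link(pair_response: dict, now_ms: int) -> Tuple[str, int]:
    """Return (link, expires_at_ms) for a POST /v1/browser-sessions/pair response."""
    token = pair_response["pairing_token"]
    expires_at = pair_response.get("expires_at")
    if expires_at is None:
        # Fall back to the relative expiry when no absolute timestamp is present.
        expires_at = now_ms + pair_response["expires_in_seconds"] * 1000
    return f"{PAIR_BASE_URL}/{quote(token, safe='')}", expires_at
```

If the current time passes expires_at before the user clicks, request a fresh token instead of reusing the old link.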

The pairing page detects the Hanzi extension and pairs automatically. If the extension isn't installed, the user sees an "Install" button.

Use label and external_user_id to track which session belongs to which user.

Session Metadata

When creating a pairing token, attach a label and external_user_id to map Hanzi sessions to your users:

POST /v1/browser-sessions/pair
{
  "label": "Dr. Smith's browser",
  "external_user_id": "user_abc123"
}

Both fields are inherited by the browser session and returned in GET /v1/browser-sessions. Use them to identify whose browser is whose in your system.
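
With external_user_id set, finding the right session for a user is a filter over the GET /v1/browser-sessions response. The Python sketch below picks the connected session with the most recent heartbeat. That tie-break is a design choice of this example, not documented behavior, and the function name connected_session_for is hypothetical. Status values other than "connected" are not listed on this page, so anything else is treated as unusable.

```python
from typing import Optional


def connected_session_for(sessions_response: dict, external_user_id: str) -> Optional[str]:
    """Return the id of the user's connected session, or None if there is none."""
    candidates = [
        s
        for s in sessions_response.get("sessions", [])
        if s.get("external_user_id") == external_user_id
        and s.get("status") == "connected"
    ]
    if not candidates:
        return None
    # Prefer the session that reported a heartbeat most recently.
    newest = max(candidates, key=lambda s: s.get("last_heartbeat", 0))
    return newest["id"]
```

A None result is a signal to show the pairing link again before creating a task.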

Troubleshooting

Extension not detected

Make sure the Chrome extension is installed and enabled. Reload at chrome://extensions if needed.

Agent can't find Hanzi

Restart your AI agent after running setup. Setup writes the MCP config to disk, but a running agent does not load it until it restarts.

Session disconnected

The browser was closed or lost network. Sessions auto-reconnect when the browser reopens. Check GET /v1/browser-sessions for status before creating tasks.

Task fails or times out

Check that the session is connected. Verify credentials are valid. Tasks have a 30-minute timeout. If the page requires login, make sure the user is signed in.

Pairing token expired

Tokens are valid for 5 minutes. Generate a new one via the developer console or POST /v1/browser-sessions/pair.

API key not working

Keys start with hic_live_. Check that you're using the full key (shown once on creation). Verify the key belongs to the correct workspace.

Error Codes

| Status | Meaning | Common cause |
| --- | --- | --- |
| 400 | Bad Request | Missing required field, input too long, invalid URL |
| 401 | Unauthorized | Missing or invalid API key / session cookie |
| 402 | Payment Required | Plan upgrade needed (when billing is active) |
| 403 | Forbidden | Session belongs to a different workspace |
| 404 | Not Found | Resource doesn't exist, or belongs to another workspace |
| 409 | Conflict | Browser session not connected or expired |
| 429 | Too Many Requests | Rate limit exceeded (10 tasks/min, 5 concurrent) |
| 500 | Server Error | Internal error. Check request_id in response for support |
| 503 | Service Unavailable | Billing not configured, or server degraded |

Every error response carries a request_id for tracing, both in the JSON body and in the X-Request-Id response header.

# Error response format
{
  "error": "Browser session is not connected. The extension must be running and registered.",
  "request_id": "a1b2c3d4"
}
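
Client code usually wants one error type that keeps the status, message, and request_id together. The Python sketch below parses the format above and falls back to the X-Request-Id header when the body is not JSON. The names HanziApiError and parse_error are hypothetical, and treating 429 and 503 as retryable is an assumption of this example, not a documented guarantee.

```python
import json
from typing import Mapping, Optional

# Assumption: rate limits and temporary unavailability are worth retrying.
RETRYABLE_STATUSES = frozenset({429, 503})


class HanziApiError(Exception):
    """An API error with the fields needed for logging and support requests."""

    def __init__(self, status: int, message: str, request_id: Optional[str]):
        super().__init__(f"{status}: {message} (request_id={request_id})")
        self.status = status
        self.message = message
        self.request_id = request_id
        self.retryable = status in RETRYABLE_STATUSES


def parse_error(status: int, body: str, headers: Mapping[str, str]) -> HanziApiError:
    """Build a HanziApiError from a non-2xx response."""
    try:
        data = json.loads(body)
    except ValueError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    request_id = data.get("request_id") or lowered.get("x-request-id")
    message = data.get("error") or "unknown error"
    return HanziApiError(status, message, request_id)
```

A 409 is deliberately not retried here: per the table above it means the browser session is not connected, so re-checking GET /v1/browser-sessions is more useful than repeating the request.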

Security

| Mechanism | Details |
| --- | --- |
| API keys | SHA-256 hashed at rest. Plaintext shown once on creation. Prefix stored for display. |
| Pairing tokens | SHA-256 hashed. 5-minute expiry. Single use; cannot be replayed. |
| Session tokens | 30-day expiry. Auto-rotated by the relay. Revocable. |
| Workspace isolation | All resources scoped to workspace. Cross-workspace access returns 404. |
| BYOM privacy | No data leaves your machine. Screenshots sent only to your chosen provider. |
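
To make "hashed at rest, plaintext shown once, prefix stored for display" concrete, here is a generic Python illustration of that pattern. It is not Hanzi's implementation: the key length, the 12-character display prefix, and the function names new_api_key and verify_api_key are all hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple

KEY_PREFIX = "hic_live_"
DISPLAY_CHARS = 12  # hypothetical: how much of the key is kept for display


def new_api_key() -> Tuple[str, Dict[str, str]]:
    """Return (plaintext_key, stored_record). Only the record is persisted."""
    key = KEY_PREFIX + secrets.token_hex(16)
    record = {
        "prefix": key[:DISPLAY_CHARS],
        "sha256": hashlib.sha256(key.encode("utf-8")).hexdigest(),
    }
    return key, record


def verify_api_key(presented: str, record: Dict[str, str]) -> bool:
    """Hash the presented key and compare it to the stored digest."""
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(digest, record["sha256"])
```

Because only the digest and a short prefix are stored, a lost key cannot be recovered from the record, which is why the plaintext is shown exactly once.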

Full privacy policy: PRIVACY.md