Hanzi Documentation
Hanzi gives AI agents a real browser. Use it locally with your own model, or embed it in your product via the API.
Use Hanzi now
Install the extension + MCP server. Bring your own model. One command to get started.
Build with Hanzi
Sign in to your developer console. Create an API key. Pair browsers. Run tasks from your backend.
Use Hanzi now
One command sets up everything: detects your browsers, installs the Chrome extension, finds AI agents on your machine, and configures MCP.
npx hanzi-browse setup
Supports Claude Code, Cursor, Windsurf, Claude Desktop, and Codex.
What setup does
- Checks for the Chrome extension — opens the install page if missing
- Scans for supported AI agents on your machine
- Adds Hanzi as an MCP server to each agent's config
- Imports credentials (Claude Code OAuth, Codex, or API key)
Supported credentials
| Source | How |
|---|---|
| Claude Code | Auto-detected from claude login |
| Codex | Auto-detected from codex login |
| API key | Set ANTHROPIC_API_KEY env var or enter during setup |
| Custom endpoint | Any OpenAI-compatible API (Ollama, LM Studio, etc.) |
Manual setup
If you prefer to configure manually:
# Claude Code
claude mcp add browser -- npx -y hanzi-browse
# Cursor / Windsurf (mcp.json)
{
"mcpServers": {
"browser": {
"command": "npx",
"args": ["-y", "hanzi-browse"]
}
}
}
Test it
After setup, ask your agent something that needs a browser:
"Go to Hacker News and tell me the top 3 stories right now"
Build with Hanzi
Embed browser automation in your product. Your app calls the Hanzi API, a real browser executes the task, you get the result back.
How it works
Quick start: let your AI agent build it
Copy this prompt into Claude Code, Cursor, or any AI coding agent. It has everything the agent needs to integrate Hanzi into your project.
Add browser automation to this project using the Hanzi API. Read the codebase first, then ask me:
1. What browser task should Hanzi automate? (e.g. "read patient chart", "fill out a form", "extract data from a web portal")
2. Where in the UI should the browser pairing flow go? (e.g. settings page, onboarding, a dedicated page)
3. Where should task results appear? (e.g. inline in the app, a chat interface, a dashboard)
Then build the integration using this API reference:
## Hanzi API (base URL: https://api.hanzilla.co)
Auth: `Authorization: Bearer hic_live_...` header on all requests.
### Core flow
1. Create pairing token → show user a link → they connect their browser
2. Run tasks against their connected browser → poll for results
3. Show the answer in your app
### Endpoints
POST /v1/browser-sessions/pair
Body: {"label": "User Name", "external_user_id": "your_user_id"}
Returns: {"pairing_token": "hic_pair_...", "expires_in_seconds": 300}
→ Build a link: https://api.hanzilla.co/pair/{pairing_token}
→ User clicks it, their Chrome auto-pairs. Token expires in 5 min.
GET /v1/browser-sessions
Returns: {"sessions": [{"id": "...", "status": "connected", "label": "..."}]}
POST /v1/tasks
Body: {"task": "description", "browser_session_id": "...", "url": "optional", "context": "optional"}
Returns: {"id": "task_id", "status": "running"}
→ task: what to do (max 10K chars). Be specific.
→ url: starting page (optional). If set, agent navigates there first.
→ context: extra info like form data, preferences (max 50K chars).
GET /v1/tasks/:id
Returns: {"status": "running|complete|error", "answer": "...", "steps": 4}
→ Poll every 2s until status != "running". Typical task takes 10-60s.
POST /v1/tasks/:id/cancel
→ Stops a running task.
GET /v1/tasks/:id/steps
Returns: {"steps": [{"step": 1, "status": "tool_use", "toolName": "navigate", ...}]}
→ Full execution log for debugging.
GET /v1/billing/credits
Returns: {"free_remaining": 20, "credit_balance": 0, "free_tasks_per_month": 20}
### Key details
- 20 free tasks/month, then $0.05 per completed task. Errors are free.
- Tasks time out after 30 min. Use cancel to stop early.
- Browser sessions last 30 days and auto-reconnect.
- The user needs the Hanzi Chrome extension installed.
Install link: https://chromewebstore.google.com/detail/iklpkemlmbhemkiojndpbhoakgikpmcd
### Example: Express + HTML (minimal)
See: https://github.com/hanzili/hanzi-browse/tree/main/examples/partner-quickstart
Read the codebase to understand the stack and project structure, then ask me the 3 questions above. After I answer, build the full integration.
Or follow the steps manually
- Sign in: open your developer console (Google or email)
- Create an API key: from the console, or via POST /v1/api-keys
- Pair a browser: generate a pairing token, then send your user a link (/pair/{token})
- Run a task: POST /v1/tasks with a task and browser session ID
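The last manual step, running a task, amounts to creating the task and then polling for its result. The loop below is a minimal sketch of that: it posts to /v1/tasks, then polls GET /v1/tasks/:id every 2 seconds until the status is no longer "running", as the API reference in the prompt above suggests. The `request` callable is a hypothetical transport supplied by the caller (method, path, and JSON body in; parsed JSON out) and is not part of any Hanzi SDK. Cancelling the task when the local deadline passes is an assumption about what a client might want.

```python
import time
from typing import Any, Callable, Dict, Optional

# Hypothetical transport signature: (method, path, json_body) -> parsed JSON dict.
Transport = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def run_task(
    request: Transport,
    task: str,
    browser_session_id: str,
    poll_interval_s: float = 2.0,
    max_wait_s: float = 30 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Create a task, then poll until its status is no longer "running"."""
    created = request(
        "POST",
        "/v1/tasks",
        {"task": task, "browser_session_id": browser_session_id},
    )
    task_id = created["id"]

    waited = 0.0
    while waited <= max_wait_s:
        current = request("GET", f"/v1/tasks/{task_id}", None)
        if current["status"] != "running":
            return current  # "complete" or "error"
        sleep(poll_interval_s)
        waited += poll_interval_s

    # Local deadline passed: ask the API to stop the task, then report it.
    request("POST", f"/v1/tasks/{task_id}/cancel", None)
    raise TimeoutError(f"task {task_id} still running after {max_wait_s}s")
```

Passing `sleep` and `request` as parameters keeps the loop easy to exercise without a network connection; in a real integration `request` would wrap whatever HTTP client your backend already uses.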
Authentication
All API endpoints (except /v1/health) require authentication. Two methods are supported:
| Method | Use case | How |
|---|---|---|
| API key | Server-to-server, SDK | Authorization: Bearer hic_live_... |
| Session cookie | Developer console, browser | Set automatically after sign-in via Better Auth |
curl https://api.hanzilla.co/v1/api-keys \
-H "Authorization: Bearer hic_live_your_key_here"
API keys are scoped to a workspace. Each key can access all sessions, tasks, and usage within its workspace. Keys are hashed at rest — the plaintext is shown once on creation.
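For server-to-server use, the API key method comes down to attaching one header to every request. The sketch below builds (but does not send) such a request with Python's standard library. The helper name `build_request` is hypothetical, and the prefix check simply mirrors the `hic_live_` key format described in this documentation.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "https://api.hanzilla.co"  # base URL used throughout this documentation


def build_request(
    api_key: str,
    method: str,
    path: str,
    body: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build an authenticated request object without sending it."""
    if not api_key.startswith("hic_live_"):
        raise ValueError("API keys are expected to start with 'hic_live_'")

    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # JSON bodies need an explicit content type, as in the curl examples.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"

    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )
```

To send it, pass the result to `urllib.request.urlopen` and decode the response body as JSON. Reading the key from an environment variable rather than hard-coding it is usually the safer choice, since the plaintext is shown only once.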
API Keys
# Request
curl -X POST https://api.hanzilla.co/v1/api-keys \
-H "Authorization: Bearer hic_live_..." \
-H "Content-Type: application/json" \
-d '{"name": "production"}'
# Response (201)
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"key": "hic_live_a1b2c3d4e5f6...",
"name": "production",
"workspace_id": "...",
"_warning": "Save this key now. It will not be shown again."
}
Browser Sessions
# Request
curl -X POST https://api.hanzilla.co/v1/browser-sessions/pair \
-H "Authorization: Bearer hic_live_..." \
-H "Content-Type: application/json" \
-d '{"label": "Dr. Smith", "external_user_id": "user_123"}'
# Response (201)
{
"pairing_token": "hic_pair_a1b2c3...",
"expires_at": 1710000000000,
"expires_in_seconds": 300
}
# Response (200) for GET /v1/browser-sessions
{
"sessions": [
{
"id": "550e8400-...",
"status": "connected",
"label": "Dr. Smith",
"external_user_id": "user_123",
"connected_at": 1710000000000,
"last_heartbeat": 1710000060000
}
]
}
Tasks
# Request
curl -X POST https://api.hanzilla.co/v1/tasks \
-H "Authorization: Bearer hic_live_..." \
-H "Content-Type: application/json" \
-d '{
"task": "Read the patient chart on the current page",
"browser_session_id": "550e8400-...",
"url": "https://example.com/chart",
"context": "Extract: name, medications, allergies"
}'
# Response (201)
{
"id": "task_abc123",
"status": "running",
"task": "Read the patient chart on the current page",
"browser_session_id": "550e8400-..."
}
| Field | Required | Description |
|---|---|---|
| task | Yes | What to do (max 10,000 chars) |
| browser_session_id | Yes | Connected session to run against |
| url | No | Starting URL (max 2,048 chars) |
| context | No | Extra context for the agent (max 50,000 chars) |
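Checking these limits before calling the API avoids a round trip that would end in a 400. The following is a sketch of a client-side check using the limits from the table above. The function name `build_task_body` is hypothetical, and it assumes that "chars" can be approximated with Python's `len()` on a string, which may not match exactly how the server counts.

```python
from typing import Dict, List, Optional

# Limits taken from the field table above.
MAX_TASK_CHARS = 10_000
MAX_URL_CHARS = 2_048
MAX_CONTEXT_CHARS = 50_000


def build_task_body(
    task: str,
    browser_session_id: str,
    url: Optional[str] = None,
    context: Optional[str] = None,
) -> Dict[str, str]:
    """Validate the fields locally and return a body for POST /v1/tasks."""
    errors: List[str] = []
    if not task:
        errors.append("task is required")
    elif len(task) > MAX_TASK_CHARS:
        errors.append(f"task exceeds {MAX_TASK_CHARS} chars")
    if not browser_session_id:
        errors.append("browser_session_id is required")
    if url is not None and len(url) > MAX_URL_CHARS:
        errors.append(f"url exceeds {MAX_URL_CHARS} chars")
    if context is not None and len(context) > MAX_CONTEXT_CHARS:
        errors.append(f"context exceeds {MAX_CONTEXT_CHARS} chars")
    if errors:
        raise ValueError("; ".join(errors))

    body = {"task": task, "browser_session_id": browser_session_id}
    # Optional fields are omitted entirely rather than sent as null.
    if url is not None:
        body["url"] = url
    if context is not None:
        body["context"] = context
    return body
```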
# Response (200) — completed task
{
"id": "task_abc123",
"status": "complete",
"task": "Read the patient chart on the current page",
"answer": "Patient: Jane Doe. Medications: Lisinopril 10mg...",
"steps": 4,
"usage": { "inputTokens": 12000, "outputTokens": 800, "apiCalls": 5 },
"browser_session_id": "550e8400-...",
"created_at": 1710000000000,
"completed_at": 1710000120000
}
Usage
# Response (200)
{
"totalInputTokens": 150000,
"totalOutputTokens": 12000,
"totalApiCalls": 45,
"totalCostUsd": 0.082,
"taskCount": 8
}
Browser Pairing
Pairing connects a user's Chrome browser to your workspace. Users pair once — the session lasts 30 days and auto-reconnects on browser restart.
How it works
- Your backend calls POST /v1/browser-sessions/pair to get a pairing token
- Show your user a link: https://api.hanzilla.co/pair/{token}
- User clicks the link → their browser auto-pairs → done
# Your backend generates the link:
curl -X POST https://api.hanzilla.co/v1/browser-sessions/pair \
-H "Authorization: Bearer hic_live_..." \
-H "Content-Type: application/json" \
-d '{"label": "Dr. Smith", "external_user_id": "user_123"}'
# Response:
# { "pairing_token": "hic_pair_abc123...", "expires_in_seconds": 300 }
# Give your user this link:
# https://api.hanzilla.co/pair/hic_pair_abc123...
The pairing page detects the Hanzi extension and pairs automatically. If the extension isn't installed, the user sees an "Install" button.
Sessions auto-reconnect on browser restart — no re-pairing needed. Use label and external_user_id to track which session belongs to which user.
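Turning the pairing response into something your UI can show is mostly string building plus a deadline. The sketch below assumes the response shape shown above (`pairing_token` and `expires_in_seconds`). The helper name `pairing_link` is hypothetical, and the deadline is computed from your server's own clock rather than from `expires_at`.

```python
import time
from typing import Any, Dict, Optional, Tuple

PAIR_BASE_URL = "https://api.hanzilla.co/pair/"  # link format from this documentation


def pairing_link(
    pair_response: Dict[str, Any],
    now_s: Optional[float] = None,
) -> Tuple[str, float]:
    """Return (link_for_user, local_deadline_in_epoch_seconds)."""
    if now_s is None:
        now_s = time.time()
    token = pair_response["pairing_token"]
    # After this moment the link should be treated as expired and regenerated.
    deadline_s = now_s + pair_response["expires_in_seconds"]
    return PAIR_BASE_URL + token, deadline_s
```

Because tokens last only 5 minutes, it tends to work better to generate the link when the user opens the pairing screen than to create it ahead of time.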
Session Metadata
When creating a pairing token, attach a label and external_user_id to map Hanzi sessions to your users:
POST /v1/browser-sessions/pair
{
"label": "Dr. Smith's browser",
"external_user_id": "user_abc123"
}
Both fields are inherited by the browser session and returned in GET /v1/browser-sessions. Use them to identify whose browser is whose in your system.
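As an illustration of that lookup, the sketch below picks a connected session for one of your users out of a GET /v1/browser-sessions response. The function name is hypothetical, and preferring the most recent `last_heartbeat` when a user has several connected browsers is an assumption, not documented behavior.

```python
from typing import Any, Dict, Optional


def find_connected_session(
    sessions_response: Dict[str, Any],
    external_user_id: str,
) -> Optional[Dict[str, Any]]:
    """Return a connected session for the given user, or None if there is none."""
    candidates = [
        session
        for session in sessions_response.get("sessions", [])
        if session.get("external_user_id") == external_user_id
        and session.get("status") == "connected"
    ]
    if not candidates:
        return None
    # Assumption: the freshest heartbeat is the browser the user is on now.
    return max(candidates, key=lambda session: session.get("last_heartbeat", 0))
```

A `None` result is a reasonable cue to show the pairing link again instead of creating a task that would fail with a 409.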
Troubleshooting
Extension not detected
Make sure the Chrome extension is installed and enabled. Reload at chrome://extensions if needed.
Agent can't find Hanzi
Restart your AI agent after running setup. MCP config is written to disk but agents need a restart.
Session disconnected
The browser was closed or lost network. Sessions auto-reconnect when the browser reopens. Check GET /v1/browser-sessions for status before creating tasks.
Task fails or times out
Check that the session is connected. Verify credentials are valid. Tasks have a 30-minute timeout. If the page requires login, make sure the user is signed in.
Pairing token expired
Tokens are valid for 5 minutes. Generate a new one via the developer console or POST /v1/browser-sessions/pair.
API key not working
Keys start with hic_live_. Check that you're using the full key (shown once on creation). Verify the key belongs to the correct workspace.
Error Codes
| Status | Meaning | Common cause |
|---|---|---|
| 400 | Bad Request | Missing required field, input too long, invalid URL |
| 401 | Unauthorized | Missing or invalid API key / session cookie |
| 402 | Payment Required | Plan upgrade needed (when billing is active) |
| 403 | Forbidden | Session belongs to a different workspace |
| 404 | Not Found | Resource doesn't exist, or belongs to another workspace |
| 409 | Conflict | Browser session not connected or expired |
| 429 | Too Many Requests | Rate limit exceeded (10 tasks/min, 5 concurrent) |
| 500 | Server Error | Internal error; check request_id in response for support |
| 503 | Service Unavailable | Billing not configured, or server degraded |
All error responses include a request_id, both in the JSON body and in the X-Request-Id response header, for tracing.
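A client might condense an error response into the few fields it needs for logging and retry decisions. The sketch below is one possible shape for that. The function name and the set of statuses treated as retryable are assumptions made for illustration; the table above does not say which errors are safe to retry.

```python
from typing import Any, Dict, Mapping, Optional

# Assumption: these statuses may succeed on a later attempt. A 503 caused by
# unconfigured billing will not resolve on its own, so treat this as a hint.
RETRYABLE_STATUSES = frozenset({429, 500, 503})


def summarize_error(
    status: int,
    body: Mapping[str, Any],
    headers: Mapping[str, str],
) -> Dict[str, Any]:
    """Collect the message, request ID, and a retry hint from an error response."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    request_id: Optional[str] = body.get("request_id") or lowered.get("x-request-id")
    return {
        "status": status,
        "message": body.get("error", ""),
        "request_id": request_id,
        "retryable": status in RETRYABLE_STATUSES,
    }
```

Logging the `request_id` alongside your own task or user identifier makes it easier to quote when contacting support.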
# Error response format
{
"error": "Browser session is not connected. The extension must be running and registered.",
"request_id": "a1b2c3d4"
}
Security
| Mechanism | Details |
|---|---|
| API keys | SHA-256 hashed at rest. Plaintext shown once on creation. Prefix stored for display. |
| Pairing tokens | SHA-256 hashed. 5-minute expiry. Single use — cannot be replayed. |
| Session tokens | 30-day expiry. Auto-rotated by the relay. Revocable. |
| Workspace isolation | All resources scoped to workspace. Cross-workspace access returns 404. |
| BYOM privacy | No data leaves your machine. Screenshots sent only to your chosen provider. |
Full privacy policy: PRIVACY.md