A local HTTP proxy that intercepts LLM API calls from tools like Claude Code, Codex CLI, and any OpenAI/Anthropic-compatible client. Routes traffic through a local model server (Ollama, LM Studio, vLLM) or the real cloud API — while exposing a real-time web UI to inspect every request, response, token count, and timing.
- Real-time web UI — live request list with status, model, duration, and token counts
- Response viewer — streaming display of LLM outputs as they arrive
- Token tracker — per-request input / output / total token breakdown
- Analysis view — deep request/response breakdown with timing, token stats, and payload comparison
- Raw JSON inspector — full request and response payloads with copy-to-clipboard
- Search and filter — narrow by model, status code, or keyword
- Export — save request history as JSON or CSV
- MITM mode — intercept HTTPS traffic from tools that enforce certificate pinning
- Persistence — optional SQLite history across restarts
- Multi-provider — Claude (Anthropic), GPT (OpenAI), and any local OpenAI-compatible server
Global install (CLI):
npm install -g llm-proxymanRun without installing:
npx -g llm-proxymanOr use npm exec:
npm exec -g llm-proxymanLocal development:
git clone https://github.com/israelio/llm-proxyman.git
cd llm-proxyman
npm installOpen http://localhost:8080 after starting the proxy.
- Left panel — request list with status, model, duration, token count
-
Right panel tabs:
- Request — full message payload sent to the LLM
- Response — live streaming display, then full response on completion
- Tokens — input / output / total token counts
- Raw — full JSON, copy to clipboard
- Toolbar — search, filter by model/status, export JSON/CSV, clear history
One-liners that start the proxy and open your tool in one command.
Claude Code with local LLM:
UPSTREAM_URL=http://127.0.0.1:8001 npm start &
ANTHROPIC_BASE_URL="http://127.0.0.1:8080" claude --model your-model-nameClaude Code with real Anthropic API:
UPSTREAM_URL=https://api.anthropic.com npm start &
ANTHROPIC_BASE_URL="http://127.0.0.1:8080" claudeCodex CLI with OpenAI:
npm start &
HTTPS_PROXY=http://127.0.0.1:8080 HTTP_PROXY=http://127.0.0.1:8080 codexOr with base URL override:
npm start &
OPENAI_BASE_URL="http://127.0.0.1:8080" codexYou have a local LLM running at http://127.0.0.1:8001 (Ollama, LM Studio, llama.cpp, etc.).
Claude Code → proxy (:8080) → local LLM (:8001)
Start the proxy:
# Default: upstream is http://127.0.0.1:8001
npm start
# Or explicitly:
UPSTREAM_URL=http://127.0.0.1:8001 npm startConfigure Claude Code to use the proxy:
export ANTHROPIC_BASE_URL="http://127.0.0.1:8080"Add to your shell profile (~/.zshrc or ~/.bashrc) to make it permanent:
echo 'export ANTHROPIC_BASE_URL="http://127.0.0.1:8080"' >> ~/.zshrcRun Claude Code with your local model:
ANTHROPIC_BASE_URL="http://127.0.0.1:8080" claude --model Qwen3.6-35B-A3B-UD-Q4_K_XReplace Qwen3.6-35B-A3B-UD-Q4_K_X with whatever model name your local LLM server exposes.
You want to inspect what Claude Code sends and receives when talking to the real Claude API — useful for debugging prompts, understanding token usage, or auditing requests.
Claude Code → proxy (:8080) → api.anthropic.com
Start the proxy pointing at the Anthropic API:
UPSTREAM_URL=https://api.anthropic.com npm startTell Claude Code to route through the proxy:
export ANTHROPIC_BASE_URL="http://127.0.0.1:8080"Or as a one-liner:
ANTHROPIC_BASE_URL="http://127.0.0.1:8080" claudeYour existing ANTHROPIC_API_KEY is passed through transparently — the proxy forwards all headers including authentication.
Note: The proxy runs on HTTP locally but forwards to HTTPS upstream. Your API key is only in memory on your own machine and is never logged to disk unless you enable
PERSIST=true.
Intercept and inspect all calls from the Codex CLI in real time.
Codex CLI → proxy (:8080) → api.openai.com
The proxy auto-detects gpt-* models (in Auto mode) and routes them to api.openai.com. Your OPENAI_API_KEY is forwarded transparently — the proxy never stores it.
HTTPS_PROXY=http://127.0.0.1:8080 HTTP_PROXY=http://127.0.0.1:8080 codexexport OPENAI_BASE_URL="http://127.0.0.1:8080"Add to ~/.zshrc or ~/.bashrc to make it permanent:
echo 'export OPENAI_BASE_URL="http://127.0.0.1:8080"' >> ~/.zshrcThen start Codex as usual — it will route through the proxy automatically.
Edit ~/.codex/config.json:
{
"model": "gpt-4o",
"baseUrl": "http://127.0.0.1:8080"
}npm startNo extra flags needed. The proxy is already in Auto mode, which routes gpt-* models to OpenAI and claude-* models to Anthropic.
By default the proxy forwards gpt-* calls to https://api.openai.com. To override (e.g. for Azure OpenAI or a local OpenAI-compatible server):
OPENAI_UPSTREAM_URL=https://my-azure-openai.openai.azure.com npm startOr set it live in the web UI — there's an OpenAI upstream URL field in the toolbar.
Build and run with Docker:
docker build -t llm-proxyman .
docker run -p 8080:8080 \
-e UPSTREAM_URL=https://api.anthropic.com \
-e ANTHROPIC_API_KEY=your-key \
llm-proxymanRun with a local LLM:
docker run -p 8080:8080 \
--add-host=host.docker.internal:host-gateway \
-e UPSTREAM_URL=http://host.docker.internal:8001 \
llm-proxyman--add-host=host.docker.internal:host-gateway resolves to the host machine's IP so the container can reach services on your local machine (works on macOS, Linux, and Windows).
On Linux, you can also use --network host to share the host network namespace directly:
docker run --network host \
-e UPSTREAM_URL=http://127.0.0.1:8001 \
llm-proxymanWith --network host, 127.0.0.1 points to the host, so no extra hostname needed.
Run with persistence:
docker run -p 8080:8080 \
-v proxy-data:/root/.llm-proxyman \
-e PERSIST=true \
-e DB_PATH=/root/.llm-proxyman/proxy-history.db \
llm-proxymanAll endpoints are exposed on port 8080:
| Endpoint | Purpose |
|---|---|
http://localhost:8080 |
Web UI |
http://localhost:8080/events |
SSE real-time stream |
http://localhost:8080/api/* |
REST API |
http://localhost:8080/v1/* |
Proxy target |
All settings via environment variables (or a .env file — copy .env.example):
| Variable | Default | Description |
|---|---|---|
PROXY_PORT |
8080 |
Port for the proxy and web UI |
UPSTREAM_URL |
http://127.0.0.1:8001 |
Default upstream for local LLM or non-matched models |
OPENAI_UPSTREAM_URL |
https://api.openai.com |
Upstream for gpt-* models (auto mode) |
PERSIST |
false |
Persist request history to SQLite across restarts |
DB_PATH |
./proxy-history.db |
SQLite file path (when PERSIST=true) |
MAX_HISTORY |
1000 |
Max requests kept in memory |
.env file example:
PROXY_PORT=8080
UPSTREAM_URL=https://api.anthropic.com
PERSIST=true
DB_PATH=./history.dbnpm start # start proxy
npm run dev # start with --watch (auto-restart on file changes)
npm test # run test suite# Local LLM
UPSTREAM_URL=http://127.0.0.1:8001 npm start
# Real Anthropic API
UPSTREAM_URL=https://api.anthropic.com npm startIn both cases, Claude Code is configured the same way:
export ANTHROPIC_BASE_URL="http://127.0.0.1:8080"



