CLI
Compress prompts from your terminal, scripts, or CI pipeline — or slot a local proxy in front of the OpenAI and Anthropic SDKs so every prompt is compressed transparently with zero code changes.
Quick start
Install
npm install -g ziptoken-cli
Two binaries ship: ziptoken (full) and zt (short alias). Every example on this page works with either.
Prefer one-off usage? npx ziptoken-cli "…"
Set your API key
ziptoken config set-key sk-zip-your-key-here
ziptoken doctor # verify connectivity
The key is saved to ~/.ziptoken/config.json, readable and writable only by your user (chmod 600). You can also set ZIPTOKEN_API_KEY in the environment to override it per shell.
Get your key from the API Keys page.
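The precedence described above (environment variable first, saved config second) can be sketched in Python. This is an illustration only, not ziptoken's own code: the function names and the `apiKey` field name are assumptions, since this page does not document the layout of config.json.

```python
import json
import os
import stat
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str], config_path: Path) -> Optional[str]:
    """Return the key from ZIPTOKEN_API_KEY if set, else from the config file."""
    env_key = env.get("ZIPTOKEN_API_KEY")
    if env_key:
        return env_key
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    # "apiKey" is an assumed field name, used only for this illustration.
    return config.get("apiKey")


def save_api_key(config_path: Path, key: str) -> None:
    """Write the key, then restrict the file to owner read/write (chmod 600)."""
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps({"apiKey": key}), encoding="utf-8")
    os.chmod(config_path, stat.S_IRUSR | stat.S_IWUSR)
```

Passing the environment in as a mapping, rather than reading os.environ inside the function, keeps the precedence rule easy to test.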
Hook into Claude Code
One command — every prompt compressed automatically.
ziptoken init claude
This writes a PreToolUse hook into ~/.claude/settings.json. From now on every Task prompt is silently compressed before it reaches Claude, with no changes to your workflow.
Three integration modes
Automatic — zero workflow change
Hooks into Claude Code via PreToolUse. Every Task prompt is compressed before it's sent.
ziptoken init claude
Compress anything you send to an LLM
Chain into any CLI that reads stdin — Claude Code, llm, your own scripts.
cat prompt.md | ziptoken run claude
Zero-config for any SDK
Point OpenAI/Anthropic SDKs at a local port. Every request is compressed before it leaves your machine.
ziptoken proxy
Command reference
| Command | Description |
|---|---|
| Compress | |
| ziptoken "text" | Compress a string directly (pretty on TTY, raw on pipe) |
| echo "…" \| ziptoken | Auto-detects stdin; just pipe into it |
| ziptoken --file prompt.md | Compress a file |
| ziptoken --raw "text" | Compressed text only, suited to piping |
| ziptoken --json "text" | Full JSON response with stats |
| ziptoken --mode aggressive "…" | conservative, balanced, or aggressive (default: balanced) |
| ziptoken --max-words 120 "…" | Cap compressed output at N words (1–2000) |
| ziptoken --no-cache "…" | Bypass the local response cache for this call |
| Integrations | |
| ziptoken run <cmd> [args…] | Compress stdin, then exec <cmd> with the result on its stdin |
| ziptoken proxy | Local HTTP proxy: transparent compression for OpenAI & Anthropic |
| ziptoken init <shell> | Shell-integration snippet (bash, zsh, fish, powershell) or claude hook writer |
| Token management | |
| ziptoken stats | Cumulative tokens + dollars saved (local log) |
| ziptoken stats --since 7d | Windowed summary (e.g. 24h, 7d, 30d) |
| ziptoken cache status | Inspect the local response cache |
| ziptoken cache clear | Drop all cached compressions |
| ziptoken doctor | Health check: verify key, connectivity, round-trip |
| Configuration | |
| ziptoken config set-key KEY | Save API key to ~/.ziptoken/config.json |
| ziptoken config get | Show full configuration (API key is masked) |
| ziptoken config set <key> <value> | Set defaultMode, defaultModel, cacheEnabled, or budgetTokens |
| ziptoken config unset <key> | Remove a config field |
| ziptoken --version | Print installed version |
Every subcommand supports --help — e.g. ziptoken proxy --help.
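To call the CLI from a Python script, one option is a thin subprocess wrapper around the flags listed in the table above. The sketch below is hypothetical: the names build_argv and compress are not part of ziptoken, and it assumes --raw, --mode, --max-words and --no-cache can be combined while the prompt arrives on stdin.

```python
import subprocess
from typing import List, Optional

MODES = ("conservative", "balanced", "aggressive")


def build_argv(mode: str = "balanced", max_words: Optional[int] = None,
               no_cache: bool = False, binary: str = "ziptoken") -> List[str]:
    """Build a ziptoken command line that reads the prompt from stdin."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    argv = [binary, "--raw", "--mode", mode]
    if max_words is not None:
        # The command reference gives 1-2000 as the accepted range.
        if not 1 <= max_words <= 2000:
            raise ValueError("max_words must be between 1 and 2000")
        argv += ["--max-words", str(max_words)]
    if no_cache:
        argv.append("--no-cache")
    return argv


def compress(text: str, **options) -> str:
    """Pipe text through the CLI and return whatever it prints on stdout."""
    result = subprocess.run(build_argv(**options), input=text,
                            capture_output=True, text=True, check=True)
    return result.stdout
```

With check=True, a non-zero exit from the CLI raises CalledProcessError instead of silently returning an empty prompt.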
Compression modes
| Mode | Best for |
|---|---|
| --mode conservative | Production prompts, precise instructions |
| --mode balanced | Most use cases; a good default |
| --mode aggressive | High-volume workloads, exploratory/dev prompts |
# Default is balanced; override per call
ziptoken --mode conservative "Your precise production prompt"
ziptoken --mode aggressive "High-volume batch prompt"
# Or set a global default
ziptoken config set defaultMode aggressive
Transparent proxy
Start the proxy and point any OpenAI- or Anthropic-compatible SDK at it. Every prompt is compressed before it leaves your machine — no library changes, no SDK wrappers, streaming responses pass through untouched.
# Start the proxy (defaults to :8787)
ziptoken proxy
# In another shell — any SDK call now goes through compression
export OPENAI_BASE_URL=http://127.0.0.1:8787/v1
export ANTHROPIC_BASE_URL=http://127.0.0.1:8787
python my_script.py
node agent.js
curl $OPENAI_BASE_URL/chat/completions -H "Authorization: Bearer $OPENAI_API_KEY" …
Provider is auto-detected by request path: /v1/chat/completions, /v1/completions → OpenAI; /v1/messages → Anthropic. Requests are compressed in parallel across every message, so round-trip latency overhead is a single extra hop regardless of conversation length.
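The path-based routing rule can be written as a small lookup. This is a minimal sketch of the rule as stated here, not the proxy's real code: detect_provider and ROUTES are hypothetical names, and returning None for unknown paths is an assumption, since the page does not say how the proxy treats them.

```python
from typing import Optional
from urllib.parse import urlsplit

# Paths taken from the routing rule described above.
ROUTES = {
    "/v1/chat/completions": "openai",
    "/v1/completions": "openai",
    "/v1/messages": "anthropic",
}


def detect_provider(request_target: str) -> Optional[str]:
    """Map a request path (or full URL) to a provider name, or None if unknown."""
    # urlsplit drops any query string; rstrip tolerates a trailing slash.
    path = urlsplit(request_target).path.rstrip("/")
    return ROUTES.get(path)
```

This also shows why the two base URLs differ: OPENAI_BASE_URL includes /v1 because the OpenAI SDK appends /chat/completions to it, while ANTHROPIC_BASE_URL omits it because the Anthropic SDK appends /v1/messages itself.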
Common flags: --port 8080, --mode aggressive, --skip-system, --min-len 200, --roles user,system. See ziptoken proxy --help for the full list.
Shell integration
One command appends three helpers to your shell config. They are thin wrappers — no magic — so you can read the source before opting in.
# zsh / bash / fish / powershell — pick your shell
ziptoken init zsh >> ~/.zshrc
ziptoken init bash >> ~/.bashrc
ziptoken init fish >> ~/.config/fish/config.fish
ziptoken init powershell >> $PROFILE
| Helper | What it does |
|---|---|
| zc | Compress the clipboard in-place (macOS pbcopy, Linux xclip/wl-clipboard, Windows Get-Clipboard). |
| zf <file> | Compress a file and print to stdout. |
| zp <cmd> | Spin up a proxy, run <cmd> with OPENAI_BASE_URL / ANTHROPIC_BASE_URL pre-set, tear down when it exits. |
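The zp lifecycle (start a proxy, run the command with both base URLs set, tear down on exit) could be reproduced in Python roughly as follows. This is a sketch under stated assumptions rather than the helper's source: the function names are hypothetical, and polling the port to check readiness is a choice made here, not documented behaviour.

```python
import os
import socket
import subprocess
import time
from typing import Dict, List, Mapping


def proxy_env(port: int = 8787, base: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Copy the environment and point both SDKs at the local proxy."""
    env = dict(base)
    env["OPENAI_BASE_URL"] = f"http://127.0.0.1:{port}/v1"
    env["ANTHROPIC_BASE_URL"] = f"http://127.0.0.1:{port}"
    return env


def wait_for_port(port: int, timeout: float = 5.0) -> bool:
    """Poll until something accepts TCP connections on the port, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection(("127.0.0.1", port), timeout=0.2):
                return True
        except OSError:
            time.sleep(0.05)
    return False


def run_through_proxy(cmd: List[str], port: int = 8787) -> int:
    """Start the proxy, run cmd against it, and always stop the proxy afterwards."""
    proxy = subprocess.Popen(["ziptoken", "proxy", "--port", str(port)])
    try:
        if not wait_for_port(port):
            raise RuntimeError(f"proxy did not start listening on port {port}")
        return subprocess.run(cmd, env=proxy_env(port)).returncode
    finally:
        proxy.terminate()
        proxy.wait()
```

For example, run_through_proxy(["python", "my_script.py"]) would run the script with both SDKs pointed at the proxy. The try/finally is what guarantees the proxy is stopped even if the child command fails.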
Scripting examples
Pipe into an LLM API call
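COMPRESSED=$(printf '%s' "$PROMPT" | ziptoken --raw)
# Build the body with jq so quotes and newlines in the prompt are escaped correctly.
# ANTHROPIC_MODEL is a placeholder variable; set it to the model you use.
BODY=$(jq -n --arg model "$ANTHROPIC_MODEL" --arg content "$COMPRESSED" \
  '{model: $model, max_tokens: 1024, messages: [{role: "user", content: $content}]}')
curl -X POST https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d "$BODY"
Batch compress a directory of prompts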
mkdir -p compressed
for f in prompts/*.txt; do
  ziptoken --file "$f" --raw > "compressed/$(basename "$f")"
done
Stream compressed prompt into any LLM CLI
# ziptoken run compresses stdin, then execs the child with the result piped in
cat prompt.md | ziptoken run claude
cat prompt.md | ziptoken run ollama run llama3
Extract stats with jq
ziptoken --json "Your long prompt here" | jq '{
saved_pct: (.stats.ratio * 100),
tokens_before: .stats.original_tokens,
tokens_after: .stats.compressed_tokens,
cost_saved: .stats.estimated_cost_saved_usd
}'
Token management
Every call is logged to a local JSONL file (~/.ziptoken/stats.jsonl); the log itself never leaves your machine. A content-hash cache short-circuits repeat prompts, so you're not re-billed for identical text.
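The idea behind a content-hash cache can be illustrated with a short in-memory sketch. It is not ziptoken's implementation: the use of SHA-256, the inclusion of the mode in the key, and the names cache_key and CompressionCache are all assumptions made for the example.

```python
import hashlib
from typing import Callable, Dict


def cache_key(text: str, mode: str) -> str:
    """Hash the mode together with the prompt so each mode caches separately."""
    digest = hashlib.sha256()
    digest.update(mode.encode("utf-8"))
    digest.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    digest.update(text.encode("utf-8"))
    return digest.hexdigest()


class CompressionCache:
    """Wraps a compress(text, mode) callable and remembers its results."""

    def __init__(self, compress: Callable[[str, str], str]) -> None:
        self._compress = compress
        self._entries: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, text: str, mode: str = "balanced") -> str:
        key = cache_key(text, mode)
        if key in self._entries:
            self.hits += 1  # identical text and mode: no upstream call
            return self._entries[key]
        self.misses += 1
        result = self._compress(text, mode)
        self._entries[key] = result
        return result
```

Keying on a hash of the content rather than on a filename means that any byte-identical prompt is served from the cache, whatever its source.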
ziptoken stats # cumulative savings + cache-hit rate
ziptoken stats --since 7d # last week
ziptoken stats --json # machine-readable output
ziptoken stats --reset # clear the local log
ziptoken cache status # entries in the local response cache
ziptoken cache clear # drop them all
# Monthly budget — ziptoken stats will warn when compressed tokens cross 80%
ziptoken config set budgetTokens 1000000
# Opt out of caching globally (defaults to on)
ziptoken config set cacheEnabled false
ziptoken doctor # round-trip probe against the API
Environment variables
| Variable | Purpose |
|---|---|
| ZIPTOKEN_API_KEY | Overrides the saved key for the current shell |
| ZIPTOKEN_API_BASE | Override upstream ziptoken endpoint (enterprise / staging) |
| ZIPTOKEN_PROXY_PORT | Default port for ziptoken proxy and the zp helper |
| ZIPTOKEN_PROXY_OPENAI | Upstream OpenAI host the proxy forwards to |
| ZIPTOKEN_PROXY_ANTHROPIC | Upstream Anthropic host the proxy forwards to |
| NO_COLOR | Disable ANSI colors in CLI output |