CLI

Compress prompts from your terminal, scripts, or CI pipeline — or slot a local proxy in front of the OpenAI and Anthropic SDKs so every prompt is compressed transparently with zero code changes.

Quick start

Install

bash

npm install -g ziptoken-cli

Two binaries ship: ziptoken (full) and zt (short alias). Every example on this page works with either.

Prefer one-off usage? npx ziptoken-cli "…"

Set your API key

bash

ziptoken config set-key sk-zip-your-key-here
ziptoken doctor   # verify connectivity

The key is stored in ~/.ziptoken/config.json, readable and writable only by your user (chmod 600). You can also set ZIPTOKEN_API_KEY in the environment to override it per shell.
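
The precedence described above can be sketched as a small shell check. This is an illustrative helper, not part of the CLI: the name `zt_key_source` is hypothetical, and it only looks at the environment variable and at whether the config file exists.

```shell
# Hypothetical helper: report which key source wins under the rule stated above.
# Assumes only what this page says: ZIPTOKEN_API_KEY overrides the saved config.
zt_key_source() {
  config="${1:-$HOME/.ziptoken/config.json}"
  if [ -n "${ZIPTOKEN_API_KEY:-}" ]; then
    echo "env"     # environment variable is set, so it takes priority
  elif [ -f "$config" ]; then
    echo "file"    # fall back to the saved config file
  else
    echo "none"    # no key configured anywhere
  fi
}
```

Calling `zt_key_source` with no argument inspects the default path. Pass a different path as the first argument to check another file.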

Get your key from the API Keys page →

Hook into Claude Code

One command, and every Task prompt is compressed automatically.

bash

ziptoken init claude

This writes a PreToolUse hook into ~/.claude/settings.json. From now on every Task prompt is silently compressed before it reaches Claude — no changes to your workflow.
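
To confirm the hook landed, a quick text search of the settings file is usually enough. The sketch below is an assumption-laden check, not a CLI feature: `zt_hook_installed` is a hypothetical name, and it assumes the hook entry written by `ziptoken init claude` contains the string `ziptoken`.

```shell
# Hypothetical check: does the settings file mention a PreToolUse hook and ziptoken?
# This is a plain-text grep, not a JSON parse, so a match is a hint rather than proof.
zt_hook_installed() {
  settings="${1:-$HOME/.claude/settings.json}"
  [ -f "$settings" ] || return 1
  grep -q '"PreToolUse"' "$settings" && grep -q 'ziptoken' "$settings"
}
```

For example, `zt_hook_installed && echo "hook present"` prints a confirmation only when both strings are found.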


Three integration modes

Claude Code hook

Automatic — zero workflow change

Hooks into Claude Code via PreToolUse. Every Task prompt is compressed before it's sent.

ziptoken init claude

As a pipe

Compress anything you send to an LLM

Chain into any CLI that reads stdin — Claude Code, llm, your own scripts.

cat prompt.md | ziptoken run claude

As a proxy

Zero-config for any SDK

Point OpenAI/Anthropic SDKs at a local port. Every request is compressed before it leaves your machine.

ziptoken proxy

Command reference

Command	Description
Compress
`ziptoken "text"`	Compress a string directly (pretty on TTY, raw on pipe)
`echo "…" | ziptoken`	Auto-detects stdin — just pipe into it
`ziptoken --file prompt.md`	Compress a file
`ziptoken --raw "text"`	Compressed text only — perfect for piping
`ziptoken --json "text"`	Full JSON response with stats
`ziptoken --mode aggressive "…"`	conservative | balanced | aggressive (default balanced)
`ziptoken --max-words 120 "…"`	Cap compressed output at N words (1–2000)
`ziptoken --no-cache "…"`	Bypass the local response cache for this call
Integrations
`ziptoken run <cmd> [args…]`	Compress stdin, then exec <cmd> with the result on its stdin
`ziptoken proxy`	Local HTTP proxy — transparent compression for OpenAI & Anthropic
`ziptoken init <shell>`	Print a shell-integration snippet (bash | zsh | fish | powershell), or write the Claude Code hook (claude)
Token management
`ziptoken stats`	Cumulative tokens + dollars saved (local log)
`ziptoken stats --since 7d`	Windowed summary (e.g. 24h, 7d, 30d)
`ziptoken cache status`	Inspect the local response cache
`ziptoken cache clear`	Drop all cached compressions
`ziptoken doctor`	Health check — verify key, connectivity, round-trip
Configuration
`ziptoken config set-key KEY`	Save API key to ~/.ziptoken/config.json
`ziptoken config get`	Show full configuration (API key is masked)
`ziptoken config set <key> <value>`	defaultMode, defaultModel, cacheEnabled, budgetTokens
`ziptoken config unset <key>`	Remove a config field
`ziptoken --version`	Print installed version

Every subcommand supports --help — e.g. ziptoken proxy --help.
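
The "pretty on TTY, raw on pipe" behavior in the table is a common CLI convention. The sketch below shows the usual way a shell script makes that decision. It is not ziptoken's implementation, and `zt_output_style` is a hypothetical name.

```shell
# Illustrative only: choose an output style the way many CLIs do,
# by testing whether stdout (file descriptor 1) is attached to a terminal.
zt_output_style() {
  if [ -t 1 ]; then
    echo "pretty"   # interactive terminal: colors and stats are welcome
  else
    echo "raw"      # pipe or redirect: emit plain text only
  fi
}
```

Run directly in a terminal, the function reports `pretty`. Inside a pipeline or command substitution it reports `raw`, which is why piping ziptoken into another command needs no extra flag.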

Compression modes

conservative (~15–25%)

Production prompts, precise instructions

--mode conservative

balanced (~30–45%)

Most use cases — good default

--mode balanced

aggressive (~45–60%)

High-volume workloads, exploratory/dev prompts

--mode aggressive

bash

# Default is balanced — override per call
ziptoken --mode conservative "Your precise production prompt"
ziptoken --mode aggressive "High-volume batch prompt"

# Or set a global default
ziptoken config set defaultMode aggressive
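
If different environments call for different modes, a small wrapper can pick one per call instead of changing the global default. This is a sketch under assumptions: `zt_mode_for` and the environment labels (`prod`, `dev`, `batch`) are hypothetical, chosen to mirror the guidance above.

```shell
# Hypothetical mapping from an environment label to a compression mode,
# following the recommendations listed for each mode above.
zt_mode_for() {
  case "$1" in
    prod|production) echo "conservative" ;;  # precise production prompts
    dev|batch)       echo "aggressive" ;;    # exploratory or high-volume work
    *)               echo "balanced" ;;      # the documented default
  esac
}
```

A call such as `ziptoken --mode "$(zt_mode_for "$APP_ENV")" --file prompt.md` would then select the mode from a variable, where `APP_ENV` stands in for whatever your deployment already exports.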

Transparent proxy

Start the proxy and point any OpenAI- or Anthropic-compatible SDK at it. Every prompt is compressed before it leaves your machine — no library changes, no SDK wrappers, streaming responses pass through untouched.

bash

# Start the proxy (defaults to :8787)
ziptoken proxy

# In another shell — any SDK call now goes through compression
export OPENAI_BASE_URL=http://127.0.0.1:8787/v1
export ANTHROPIC_BASE_URL=http://127.0.0.1:8787

python my_script.py
node agent.js
curl "$OPENAI_BASE_URL/chat/completions" -H "Authorization: Bearer $OPENAI_API_KEY" …

Provider is auto-detected by request path: /v1/chat/completions and /v1/completions map to OpenAI; /v1/messages maps to Anthropic. The messages within a request are compressed in parallel, so the added round-trip latency is a single extra hop regardless of conversation length.
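
The routing rule above amounts to a lookup on the request path. The sketch below restates it in shell for illustration. `zt_provider_for_path` is a hypothetical name, and the `unknown` fallback is an assumption, since this page does not say what the proxy does with other paths.

```shell
# Restates the documented path-to-provider rule; not the proxy's actual code.
zt_provider_for_path() {
  case "$1" in
    /v1/chat/completions|/v1/completions) echo "openai" ;;
    /v1/messages)                         echo "anthropic" ;;
    *)                                    echo "unknown" ;;  # assumed fallback
  esac
}
```

This also explains the two base URLs used earlier: the OpenAI SDK appends `/chat/completions` to a base ending in `/v1`, while the Anthropic SDK adds `/v1/messages` itself.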

Tune it per run with flags such as --port 8080, --mode aggressive, --skip-system, --min-len 200, and --roles user,system. See ziptoken proxy --help for the full list.

Shell integration

One command appends three helpers to your shell config. They are thin wrappers — no magic — so you can read the source before opting in.

bash

# zsh / bash / fish / powershell — pick your shell
ziptoken init zsh        >> ~/.zshrc
ziptoken init bash       >> ~/.bashrc
ziptoken init fish       >> ~/.config/fish/config.fish
ziptoken init powershell >> $PROFILE   # run from PowerShell, where $PROFILE is defined

Helper	Does
zc	Compress the clipboard in-place (macOS `pbcopy`, Linux `xclip`/`wl-clipboard`, Windows `Get-Clipboard`).
zf <file>	Compress a file and print to stdout.
zp <cmd>	Spin up a proxy, run `<cmd>` with `OPENAI_BASE_URL` / `ANTHROPIC_BASE_URL` pre-set, tear down when it exits.
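
To make the `zp` row concrete, here is a rough sketch of what such a wrapper could look like in POSIX shell. It is not the snippet that `ziptoken init` emits (print that to your terminal to read the real one). `zp_sketch` is a hypothetical name, and the sketch skips the readiness check a real helper would need before the proxy accepts connections.

```shell
# Hypothetical re-creation of the zp helper described in the table above.
zp_sketch() {
  port="${ZIPTOKEN_PROXY_PORT:-8787}"   # 8787 is the documented default port
  ziptoken proxy --port "$port" &       # start the proxy in the background
  proxy_pid=$!
  status=0
  # Run the wrapped command with both SDK base URLs pointing at the proxy.
  OPENAI_BASE_URL="http://127.0.0.1:$port/v1" \
  ANTHROPIC_BASE_URL="http://127.0.0.1:$port" \
    "$@" || status=$?
  # Tear the proxy down and pass the wrapped command's exit status through.
  kill "$proxy_pid" 2>/dev/null || true
  wait "$proxy_pid" 2>/dev/null || true
  return "$status"
}
```

Usage would mirror the real helper, for example `zp_sketch python my_script.py`. Returning the child's exit status keeps the wrapper usable in CI, where a failing script must still fail the job.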

Scripting examples

Pipe into an LLM API call

bash

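COMPRESSED=$(echo "$PROMPT" | ziptoken --raw)
# Build the JSON with jq so quotes and newlines are escaped; ANTHROPIC_MODEL is a placeholder for your model ID
BODY=$(jq -n --arg content "$COMPRESSED" --arg model "$ANTHROPIC_MODEL" \
  '{model: $model, max_tokens: 1024, messages: [{role: "user", content: $content}]}')
curl -X POST https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d "$BODY"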

Batch compress a directory of prompts

bash

mkdir -p compressed   # make sure the output directory exists
for f in prompts/*.txt; do
  ziptoken --file "$f" --raw > compressed/"$(basename "$f")"
done

Stream compressed prompt into any LLM CLI

bash

# ziptoken run compresses stdin, then execs the child with the result piped in
cat prompt.md | ziptoken run claude
cat prompt.md | ziptoken run ollama run llama3
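
For intuition, `ziptoken run` behaves much like piping `--raw` output into the child command. The sketch below is an approximation using only flags documented on this page. `zt_run_sketch` is a hypothetical name, and unlike the real subcommand it does not exec the child.

```shell
# Approximate stand-in for `ziptoken run <cmd> [args…]`:
# compress whatever arrives on stdin, then feed the result to the given command.
zt_run_sketch() {
  ziptoken --raw | "$@"
}
```

With this in place, `cat prompt.md | zt_run_sketch claude` would follow the same data path as the first example above.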

Extract stats with jq

bash

ziptoken --json "Your long prompt here" | jq '{
  saved_pct:     (.stats.ratio * 100),
  tokens_before: .stats.original_tokens,
  tokens_after:  .stats.compressed_tokens,
  cost_saved:    .stats.estimated_cost_saved_usd
}'
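
The percentage can also be recomputed from the two token counts, which is a handy sanity check on the reported figures. The sketch below assumes `.stats.ratio` is the fraction of tokens removed, that is, 1 minus compressed over original. That reading is inferred from the `saved_pct` label, not confirmed by this page, and `zt_saved_pct` is a hypothetical name.

```shell
# Percentage of tokens saved, rounded to the nearest whole number.
# Uses integer arithmetic only, so it runs in any POSIX shell.
zt_saved_pct() {
  before=$1
  after=$2
  [ "$before" -gt 0 ] || { echo "original token count must be > 0" >&2; return 1; }
  # Adding before/2 ahead of the division rounds to nearest instead of truncating.
  echo $(( ((before - after) * 100 + before / 2) / before ))
}
```

For instance, `zt_saved_pct 48 16` prints `67`: 32 of 48 tokens removed is 66.7%, which rounds up.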

Token management

Every call is logged to a local JSONL file (~/.ziptoken/stats.jsonl); the log itself never leaves your machine. A content-hash cache short-circuits repeat prompts so you're not re-billed for identical text.
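
The idea behind a content-hash cache can be shown in a few lines of shell. This sketch is purely illustrative: the CLI's real cache format and location are not described on this page, `zt_cached_compress` and `ZT_SKETCH_CACHE` are hypothetical names, and `cksum` stands in for the stronger hash a real cache would use.

```shell
# Illustrative content-hash cache: identical input text maps to the same key,
# so the compressor runs only the first time that text is seen.
zt_cached_compress() {
  cache_dir="${ZT_SKETCH_CACHE:-${TMPDIR:-/tmp}/zt-sketch-cache}"
  mkdir -p "$cache_dir"
  # cksum is POSIX and fine for a demo; it is not collision-resistant.
  key=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  entry="$cache_dir/$key"
  if [ ! -f "$entry" ]; then
    # Cache miss: compress once and store the result under the hash.
    printf '%s' "$1" | ziptoken --raw > "$entry" || { rm -f "$entry"; return 1; }
  fi
  cat "$entry"   # cache hit (or freshly stored entry)
}
```

A fuller version would fold the compression mode into the key, since the same text compressed under two modes yields different output.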

bash

ziptoken stats                       # cumulative savings + cache-hit rate
ziptoken stats --since 7d             # last week
ziptoken stats --json                 # machine-readable output
ziptoken stats --reset                # clear the local log

ziptoken cache status                 # entries in the local response cache
ziptoken cache clear                  # drop them all

# Monthly budget: ziptoken stats warns once compressed tokens cross 80% of it
ziptoken config set budgetTokens 1000000

# Opt out of caching globally (defaults to on)
ziptoken config set cacheEnabled false

ziptoken doctor                       # round-trip probe against the API
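
The budget warning is simple threshold arithmetic. The sketch below reproduces it so the same check can run in a script or CI step. `zt_budget_status` is a hypothetical name, and treating exactly 80% as a warning is an assumption, since the page only says "cross 80%".

```shell
# Compare tokens used against 80% of a budget, using integer arithmetic.
zt_budget_status() {
  used=$1
  budget=$2
  # used/budget >= 0.80 is rewritten as used*100 >= budget*80 to avoid fractions.
  if [ $(( used * 100 )) -ge $(( budget * 80 )) ]; then
    echo "warn"
  else
    echo "ok"
  fi
}
```

With the budget of 1000000 set above, `zt_budget_status 800000 1000000` prints `warn`, and anything below 800000 prints `ok`.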

Environment variables

Variable	Purpose
ZIPTOKEN_API_KEY	Overrides the saved key for the current shell
ZIPTOKEN_API_BASE	Override upstream ziptoken endpoint (enterprise / staging)
ZIPTOKEN_PROXY_PORT	Default port for ziptoken proxy and the zp helper
ZIPTOKEN_PROXY_OPENAI	Upstream OpenAI host the proxy forwards to
ZIPTOKEN_PROXY_ANTHROPIC	Upstream Anthropic host the proxy forwards to
NO_COLOR	Disable ANSI colors in CLI output
