ziptoken.ai

Blog

Engineering deep-dives, product updates, and integration tutorials.

Engineering · 6 min read · Latest

How Token Compression Can Cut Your LLM Bill by 40–70%

A deep dive into the techniques behind prompt compression and a real-world benchmark against GPT-4o, Claude 3.5, and Gemini.

ziptoken · April 1, 2025
Product · 4 min read

Rule-Based vs. Neural Compression: When to Use Each

LLMLingua and similar neural approaches achieve higher compression ratios, but at a cost in latency and complexity. We explain the trade-offs and when each mode is the right choice.

ziptoken

Engineering

Mar 15, 2025
Tutorial · 3 min read

Integrating ziptoken with LangChain in 10 Lines

A step-by-step guide to wrapping ziptoken compression into a LangChain Runnable, so every chain automatically compresses prompts before calling the LLM.

ziptoken

Developer Relations

Feb 28, 2025
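The pattern the tutorial covers can be sketched without any dependencies: a wrapper that compresses the prompt before it reaches the model, mirroring the `invoke` interface of a LangChain Runnable. The `compress` function below is a whitespace-collapsing stand-in, not the ziptoken API; the full post swaps in a real API call and LangChain's `RunnableLambda`.

```python
def compress(prompt: str) -> str:
    # Stand-in for a ziptoken API call: collapse runs of whitespace.
    return " ".join(prompt.split())


class CompressingRunnable:
    """Wraps any invoke-style step so its input is compressed first,
    mirroring the Runnable interface LangChain chains expect."""

    def __init__(self, inner):
        self.inner = inner

    def invoke(self, prompt: str) -> str:
        return self.inner.invoke(compress(prompt))


class EchoLLM:
    # Toy LLM stand-in that simply returns its prompt.
    def invoke(self, prompt: str) -> str:
        return prompt


chain = CompressingRunnable(EchoLLM())
result = chain.invoke("Summarize   this \n  document")
# result == "Summarize this document"
```

Because the wrapper only assumes an `invoke` method, the same shape drops into any chain where a Runnable is expected.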
Announcement · 2 min read

Announcing the Batch API: Compress 100 Prompts in One Call

The new /api/v1/batch endpoint lets you compress up to 100 texts in a single round-trip: perfect for RAG pipelines and document processing.

ziptoken

Product

Jan 20, 2025
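With a hard cap of 100 texts per call, a client needs to chunk larger corpora into multiple requests. A minimal sketch of that chunking, assuming a hypothetical `{"texts": [...]}` request body (the exact schema is in the API docs, not guaranteed here):

```python
import json

BATCH_LIMIT = 100  # documented per-call maximum for /api/v1/batch


def build_batch_payloads(texts, limit=BATCH_LIMIT):
    """Split `texts` into chunks of at most `limit` items and build one
    JSON payload per chunk, ready to POST to the batch endpoint."""
    payloads = []
    for i in range(0, len(texts), limit):
        chunk = texts[i:i + limit]
        payloads.append(json.dumps({"texts": chunk}))
    return payloads


payloads = build_batch_payloads([f"doc {n}" for n in range(250)])
# 250 texts -> 3 requests (100 + 100 + 50)
```

Each payload is then one round-trip, so a 250-document corpus costs three HTTP calls instead of 250.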

More articles coming soon

Compress your AI prompts by up to 50%. Same output quality, half the cost. One API call.


© 2025 ziptoken.ai · Built for developers