How We Translate Entire Books with LLMs Without Losing Context

Our chunking strategy keeps chapters coherent, respects context windows, and handles multilingual books.

The problem: books don’t fit in a prompt

At LectuLibre, we translate entire books (novels, technical manuals, poetry) using large language models. It sounds simple: feed each paragraph to an LLM, concatenate the results, done. But the moment we tried a 300‑page EPUB, chaos ensued: chapters bled into each other, sentences were chopped mid‑word, and the translation of chapter 5 had no idea what had happened in chapter 4.

LLMs have limited context windows. Even the 200K‑token window of Claude 3 can’t comfortably hold a whole 150K‑word book: at a rough 1.3 tokens per English word, the source text alone comes to about 195K tokens, which leaves no room for instructions or the translated output. And even if it did fit, the cost and latency would be absurd. We needed a way to split the book into manageable chunks while preserving enough context that the translation stays coherent across hundreds of pages.
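To make that arithmetic concrete, here is a back-of-the-envelope sketch. The 1.3 tokens-per-word ratio is a rough heuristic for English rather than a real tokenizer, and `estimate_tokens`, `fits_in_window`, `output_ratio`, and `prompt_overhead` are hypothetical names and parameters chosen for illustration.

```python
# Rough heuristic (assumption): about 1.3 tokens per English word.
# A real tokenizer will give different numbers, especially for other languages.
TOKENS_PER_WORD = 1.3


def estimate_tokens(word_count: int, tokens_per_word: float = TOKENS_PER_WORD) -> int:
    """Estimate the token count of a text from its word count."""
    return round(word_count * tokens_per_word)


def fits_in_window(
    word_count: int,
    context_window: int,
    output_ratio: float = 1.0,     # assumption: translation is about as long as the source
    prompt_overhead: int = 1_000,  # assumption: tokens reserved for instructions
) -> bool:
    """True if source text, instructions, and expected translation all fit."""
    source = estimate_tokens(word_count)
    expected_output = round(source * output_ratio)
    return source + expected_output + prompt_overhead <= context_window


book_words = 150_000
print(estimate_tokens(book_words))          # 195000
print(fits_in_window(book_words, 200_000))  # False
```

Under these assumptions, the source text alone nearly fills the window, and once the translated output is counted the request is far over budget.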

Here’s how we designed a chunking pipeline that respects your wallet, the context window, and the book’s narrative flow.
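As a first illustration of the general idea, the sketch below splits one chapter at paragraph boundaries under a token budget and attaches the last few paragraphs of the previous chunk as read-only context. It is a minimal sketch under stated assumptions, not the LectuLibre implementation: `Chunk`, `chunk_chapter`, `budget`, and `overlap` are hypothetical names, and the token estimate is a crude word-count heuristic.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    context: list[str]     # preceding paragraphs, shown to the model but not translated
    paragraphs: list[str]  # paragraphs to translate in this request


def estimate_tokens(text: str) -> int:
    # Crude word-count heuristic (assumption); use a real tokenizer in practice.
    return round(len(text.split()) * 1.3)


def chunk_chapter(paragraphs: list[str], budget: int = 2_000, overlap: int = 2) -> list[Chunk]:
    """Group whole paragraphs into chunks of at most `budget` estimated tokens.

    Paragraphs are never split, so a single oversized paragraph becomes its
    own chunk. Each chunk carries up to `overlap` preceding paragraphs as context.
    """
    chunks: list[Chunk] = []
    current: list[str] = []
    used = 0
    start = 0  # index of the first paragraph in `current`

    for i, para in enumerate(paragraphs):
        cost = estimate_tokens(para)
        if current and used + cost > budget:
            # Close the current chunk before it exceeds the budget.
            context = paragraphs[max(0, start - overlap):start]
            chunks.append(Chunk(context, current))
            current, used, start = [], 0, i
        current.append(para)
        used += cost

    if current:
        context = paragraphs[max(0, start - overlap):start]
        chunks.append(Chunk(context, current))
    return chunks
```

Calling `chunk_chapter` once per chapter keeps chapters from bleeding into each other, and cutting only at paragraph boundaries avoids chopping sentences mid‑word. The overlap gives each request a little of what came just before, which is only one of several ways to carry context forward.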


