Our chunking strategy that keeps chapters coherent, respects context windows, and handles multi-lingual books.
The problem: books don’t fit in a prompt
At LectuLibre, we translate entire books — novels, technical manuals, poetry — using large language models. It sounds simple: feed each paragraph to an LLM, concatenate results, done. But the moment we tried a 300‑page EPUB, chaos ensued. Chapters bled into each other, sentences were chopped mid‑word, and the translation of chapter 5 had no idea what happened in chapter 4.
LLMs have limited context windows. Even the massive 200K token window of Claude 3 can’t hold a whole 150K‑word book. And even if it could, the cost and latency would be absurd. We needed a way to split the book into manageable chunks while preserving enough context so that the translation remains coherent across thousands of pages.
Here’s how we designed a chunking pipeline that respects your wallet, the context window, and the book’s narrative flow.






