"Extracting text with coordinates from pdf.js, rebuilding paragraphs and tables, fixing Arabic right-to-left order, and generating a real .docx — all client-side, no upload."

tags: javascript, webdev, pdf, opensource

canonical_url: https://doctor-pdf.com/pdf-to-word.html

cover_image:

TL;DR — I built a PDF → Word (.docx) converter that does everything in the browser: no file ever leaves the user's device. The hard parts weren't the file formats — they were reconstructing layout (paragraphs, tables, headings) from a flat stream of positioned glyphs, and getting Arabic / right-to-left text to come out in logical order. Here's how it works, with the gotchas I hit along the way. Live tool: doctor-pdf.com/pdf-to-word.html.
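
To make "a flat stream of positioned glyphs" concrete, here is a minimal sketch of the first reconstruction step: bucketing text items into visual lines by their baseline. It is an illustration under assumptions, not the tool's actual code. `groupIntoLines` and `yTolerance` are hypothetical names, and the item shape (`str` plus a six-number `transform` whose last two entries are x and y) is assumed to match what pdf.js returns from `getTextContent()`.

```javascript
// Hypothetical helper: group positioned text items into visual lines.
// Assumed item shape (modelled on pdf.js text items):
//   { str: string, transform: [a, b, c, d, x, y] }
function groupIntoLines(items, yTolerance = 2) {
  // PDF user space has y growing upward, so descending y means top to bottom.
  const sorted = [...items].sort((p, q) => q.transform[5] - p.transform[5]);

  const lines = [];
  for (const item of sorted) {
    const y = item.transform[5];
    const last = lines[lines.length - 1];
    // Items whose baseline is within the tolerance of the current line's
    // first item are treated as belonging to that line.
    if (last && Math.abs(last.y - y) <= yTolerance) {
      last.items.push(item);
    } else {
      lines.push({ y, items: [item] });
    }
  }

  // Inside each line, order items by x. This yields visual left-to-right
  // order only; right-to-left runs need separate handling.
  for (const line of lines) {
    line.items.sort((p, q) => p.transform[4] - q.transform[4]);
  }

  // Joining with a single space is a simplification; real spacing would
  // have to be inferred from the gaps between items.
  return lines.map((line) => ({
    y: line.y,
    text: line.items.map((it) => it.str).join(" "),
  }));
}
```

Sorting by x gives the visual order, which is exactly why Arabic comes out wrong if nothing else is done: a right-to-left line read left to right is reversed relative to its logical order.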