Step 1 in depth: pdfjs-dist
pdfjs-dist is the npm distribution of PDF.js, Mozilla's PDF rendering library and the same engine that powers Firefox's built-in PDF viewer. In jaklens.ai, it runs in Electron's main process, which is a Node.js environment, and extracts the text content from each page of the invoice.
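As a rough illustration of that per-page extraction, here is a minimal sketch. It assumes the PDF has already been opened with pdfjs-dist and relies only on three members of the resulting document and page objects: `numPages`, `getPage`, and `getTextContent`. The interface names and the `extractPageTexts` function are hypothetical and are not taken from the jaklens.ai codebase.

```typescript
// Structural subset of the pdfjs-dist document and page proxies.
// The interface names are hypothetical; only the members used below are declared.
interface TextItemLike {
  str?: string; // some content entries may carry no text
}
interface PdfPageLike {
  getTextContent(): Promise<{ items: TextItemLike[] }>;
}
interface PdfDocumentLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PdfPageLike>;
}

// Returns one string per page, in page order.
async function extractPageTexts(doc: PdfDocumentLike): Promise<string[]> {
  const pages: string[] = [];
  // pdfjs page numbers are 1-based.
  for (let n = 1; n <= doc.numPages; n++) {
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    // Naive join: every text run on the page is separated by a single space.
    const text = content.items.map((item) => item.str ?? "").join(" ");
    pages.push(text);
  }
  return pages;
}
```

In the app, `doc` would presumably come from something like `await getDocument({ data }).promise`, where `data` holds the PDF bytes. The exact import path for pdfjs-dist under Node varies between versions, so it is left out here.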
For a typical digital invoice PDF (generated by Stripe, PayPal, a CRM, or invoicing software), pdfjs-dist produces clean Unicode text that preserves the line structure of the page. The output looks something like:
Invoice #: INV-2024-0891
Date: 15 March 2025
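pdfjs-dist returns positioned text runs rather than finished lines, so line structure like the sample above has to be rebuilt from those runs. The sketch below shows one possible way to do that, assuming each item exposes `str`, a `transform` array whose sixth entry is the baseline y-coordinate, and an optional `hasEOL` flag. The `itemsToLines` name and the `tolerance` threshold are illustrative assumptions, not the actual jaklens.ai implementation.

```typescript
// Subset of the text item fields used here (interface name is hypothetical).
interface PositionedItem {
  str: string;
  transform: number[]; // [a, b, c, d, x, y]; index 5 is the baseline y
  hasEOL?: boolean; // set on the last run of a line
}

// Groups text runs into lines. A new line starts when the baseline y moves by
// more than `tolerance` (an assumed threshold in PDF units) or when the
// previous run was flagged with hasEOL.
function itemsToLines(items: PositionedItem[], tolerance = 2): string[] {
  const lines: string[] = [];
  let current = "";
  let lastY: number | null = null;
  let breakPending = false;

  // Pushes the accumulated line, skipping whitespace-only lines.
  const flush = (): void => {
    const trimmed = current.trim();
    if (trimmed !== "") lines.push(trimmed);
    current = "";
  };

  for (const item of items) {
    const y = item.transform[5] ?? 0;
    const movedLine = lastY !== null && Math.abs(y - lastY) > tolerance;
    if (movedLine || breakPending) flush();
    current += item.str;
    lastY = y;
    breakPending = item.hasEOL === true;
  }
  flush();
  return lines;
}
```

A fixed tolerance is the simplest choice. Invoices with mixed font sizes or multi-column layouts would likely need a threshold derived from item height, plus sorting by x within each line.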










