the insight that started this project hit me while i was finishing a bytecode-compiled language i'd written in C
i'd spent months building a hand-written lexer, a single-pass Pratt compiler, a stack VM with 35 opcodes, and a mark-and-sweep garbage collector. right near the end i had this realization: an LLM inference engine is the same problem. it's a graph-compile plus memory-plan plus kernel-schedule problem, and i'd just finished solving one of those
so i decided to find out if that was actually true
the project
the result is ignis, a from-scratch LLM inference engine in Rust. i used it specifically to see how far the compiler analogy held up. the dependency count ended up at 2: memmap2 (to mmap the weight blob off disk) and fancy-regex (for one look-ahead in the BPE tokenizer). everything else is hand-written, because the whole point was to understand what's actually happening
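to make the analogy a bit more concrete, here's a minimal sketch of just the memory-plan piece, using only std. this is an illustration, not code from ignis: `Op`, `last_use`, `plan`, and `toy_block` are hypothetical names, the graph is a made-up feed-forward block, and i'm assuming every value is written exactly once and that all buffers are the same size. the idea is the same liveness trick a compiler uses for registers: once a value has been read for the last time, its buffer can be handed to a later op

```rust
/// one node in a toy compute graph: reads some values, produces one.
/// (hypothetical type for illustration; each value id is written exactly once)
struct Op {
    inputs: Vec<usize>, // ids of the values this op reads
    output: usize,      // id of the value this op writes
}

/// for each value id, the index of the last op that reads it (None if never read)
fn last_use(ops: &[Op], n_values: usize) -> Vec<Option<usize>> {
    let mut last = vec![None; n_values];
    for (i, op) in ops.iter().enumerate() {
        for &v in &op.inputs {
            last[v] = Some(i);
        }
    }
    last
}

/// walk the ops in order and give each output a buffer slot, reusing slots
/// whose value is dead. returns (slot per value, number of slots needed).
/// values with no producing op (graph inputs) get None: the caller owns them.
fn plan(ops: &[Op], n_values: usize) -> (Vec<Option<usize>>, usize) {
    let last = last_use(ops, n_values);
    let mut slot_of: Vec<Option<usize>> = vec![None; n_values];
    let mut freed = vec![false; n_values];
    let mut free: Vec<usize> = Vec::new();
    let mut n_slots = 0;

    for (i, op) in ops.iter().enumerate() {
        // pick the output slot BEFORE releasing inputs, so an op never
        // writes into a buffer it is still reading from
        let slot = match free.pop() {
            Some(s) => s,
            None => {
                n_slots += 1;
                n_slots - 1
            }
        };
        slot_of[op.output] = Some(slot);

        // release every input whose last reader is this op
        for &v in &op.inputs {
            if last[v] == Some(i) && !freed[v] {
                if let Some(s) = slot_of[v] {
                    free.push(s);
                    freed[v] = true; // guards against an op listing v twice
                }
            }
        }
    }
    (slot_of, n_slots)
}

/// a made-up feed-forward block; v0 is the external input, weights are omitted
fn toy_block() -> Vec<Op> {
    vec![
        Op { inputs: vec![0], output: 1 },    // v1 = rmsnorm(v0)
        Op { inputs: vec![1], output: 2 },    // v2 = matmul(v1, w_up)
        Op { inputs: vec![2], output: 3 },    // v3 = silu(v2)
        Op { inputs: vec![3], output: 4 },    // v4 = matmul(v3, w_down)
        Op { inputs: vec![0, 4], output: 5 }, // v5 = v0 + v4 (residual)
    ]
}
```

calling `plan(&toy_block(), 6)` fits the five intermediate values into two slots that ping-pong down the chain. a real planner would also have to deal with differently sized buffers and in-place ops, which this sketch ignores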






