Building a polyglot code parser with tree-sitter that handles Python, Rust, Dart, Go, and 10 more — without writing 14 parsers.

I needed to extract every function and class from any codebase a user throws at me — Python, Dart, Rust, Java, Go, C++, all of them. I started with Python’s built-in “ast” module. It worked — for Python. Then I needed JavaScript. And Rust. And 11 more. I wasn’t about to write 14 parsers.
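For context, here is a minimal sketch of what that Python-only starting point can look like with the standard-library “ast” module. The function name “extract_definitions” and the tuple layout are my own illustration, not Codewalk's actual code.

```python
import ast


def extract_definitions(source: str) -> list:
    """Return (kind, name, line) for every function and class in Python source.

    Hypothetical helper for illustration only.
    """
    tree = ast.parse(source)
    found = []
    # ast.walk visits every node, so nested methods and classes are included.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append(("function", node.name, node.lineno))
        elif isinstance(node, ast.ClassDef):
            found.append(("class", node.name, node.lineno))
    # ast.walk does not go in source order, so sort by line number.
    return sorted(found, key=lambda item: item[2])
```

This only ever understands Python source, which is the limitation that pushed me toward tree-sitter.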

I ended up with a single “parse_file()” function that handles all 14 languages. Here’s exactly how: using tree-sitter, a registry pattern, and three edge-case fallbacks.
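To make the registry idea concrete before going into detail, here is a rough sketch under some assumptions: “LanguageSpec”, “REGISTRY”, “register”, and “spec_for” are hypothetical names, and the node type strings are what I believe the tree-sitter Python and Rust grammars use, so check them against the grammars before relying on them. The sketch covers only the lookup step and does not call tree-sitter.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class LanguageSpec:
    """Per-language settings (hypothetical shape, for illustration)."""
    name: str
    function_nodes: Tuple[str, ...]  # grammar node types treated as functions
    class_nodes: Tuple[str, ...]     # grammar node types treated as classes


# Maps a lowercase file extension to its language spec.
REGISTRY: Dict[str, LanguageSpec] = {}


def register(extensions: Iterable[str], spec: LanguageSpec) -> None:
    """Register one spec under every extension it should handle."""
    for ext in extensions:
        REGISTRY[ext.lower()] = spec


# Node type names below are assumptions to verify against each grammar.
register([".py"], LanguageSpec("python", ("function_definition",), ("class_definition",)))
register([".rs"], LanguageSpec("rust", ("function_item",), ("struct_item", "impl_item")))


def spec_for(path: str) -> Optional[LanguageSpec]:
    """Look up the spec for a file path, or None if the extension is unknown."""
    return REGISTRY.get(Path(path).suffix.lower())
```

With this shape, supporting another language means adding one “register(...)” call instead of another parser, and a single entry point can look up the spec and hand the file to the matching grammar.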

WHAT IS CODEWALK?

Codewalk is an open-source AI code analysis tool I’m building. You point it at any codebase and it gives you module detection, blast radius analysis, dependency graphs, reading order, and AI-powered code review, all from one “pip install”. This is the first post in a series where I break down the algorithms and engineering decisions inside Codewalk.