What's new in Data Preprocessor 1.5.x — R codegen, Robust Scaler, and a deadlock post-mortem

It's been a few months since I last wrote about Data Preprocessor, the IntelliJ plugin I built to stop re-writing the same pandas preprocessing scripts every project. The 1.5.x series has landed a real R codegen path, a more honest outlier-resistant normalizer, and one genuinely embarrassing deadlock that I want to talk about openly because the lesson is useful.

tl;dr on what the plugin does

You load a CSV, Excel, or JSON file inside your JetBrains IDE. The plugin profiles every column (type, null count, mean/median/std, mode, unique count). You build a pipeline visually — drop nulls, fill with mean, deduplicate, remove IQR outliers, normalize (min-max / z-score / robust), label-encode, one-hot, train/test split, sort, filter, type-cast — and then one click emits a complete, ready-to-run Python (pandas) or R (base + a few small libs) script.

All processing is local. The plugin collects no telemetry. The generated code is normal pandas or normal R — no runtime library, no plugin import, nothing magic. Read it, edit it, commit it alongside your dataset, run it long after you've uninstalled the plugin.

Here's roughly what a 5-step pipeline turns into:

tl;dr on what the plugin does

Here's roughly what a 5-step pipeline turns into:

What's new in Data Preprocessor 1.5.x — R codegen, Robust Scaler, and a deadlock post-mortem

What's new in Data Preprocessor 1.5.x — R codegen, Robust Scaler, and a deadlock post-mortem

Related reading

🧠 NeuroDoc: From Broken Prototype to Production-Ready Async AI Documentation…

How to Use Multiple Prettier Plugins for Consistent Code Formatting ?

DeepCoder-14B, Biome 97%, Stripe agent databases — Dev Signal #31

Closing the execution gap: a series

A Weekend of Testing

Feature freeze for Python 3.15 as first beta released

Related reading

🧠 NeuroDoc: From Broken Prototype to Production-Ready Async AI Documentation…

How to Use Multiple Prettier Plugins for Consistent Code Formatting ?

DeepCoder-14B, Biome 97%, Stripe agent databases — Dev Signal #31

Closing the execution gap: a series

A Weekend of Testing

Feature freeze for Python 3.15 as first beta released