The country has 247 million schoolchildren. Across five subjects and three assessments a year, that comes to roughly 3.7 billion evaluations annually (247 million × 5 × 3). A workload of this size cannot be marked entirely by hand, so some degree of automation is unavoidable. But how good is AI at assessing subjective answers?
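As a quick check on the scale quoted above, the annual figure is simply the product of the three numbers in the text. The sketch below assumes every student sits every assessment in every subject; the variable names are illustrative.

```python
# Figures taken from the text; variable names are illustrative only.
students = 247_000_000        # schoolchildren
subjects = 5                  # subjects per student
assessments_per_year = 3      # assessments per subject per year

# Assumes every student is assessed in every subject at every sitting.
evaluations_per_year = students * subjects * assessments_per_year

print(f"{evaluations_per_year:,}")  # 3,705,000,000
```

That is about 3.7 billion evaluations a year, matching the figure in the text.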