According to Evgeny Chereshnev's calculations, training a typical modern model with 1.7 to 2 trillion parameters costs developers about $100 million per training cycle.