Bugcrowd launches reinforcement learning environments to train AI on real software vulnerabilities
Crowdsourced cybersecurity company Bugcrowd Inc. today launched Reinforcement Learning Environments, a new offering that lets frontier artificial intelligence labs train models on real vulnerable software rather than synthetic test data.
The product is built on technology Bugcrowd picked up through its acquisition of Mayhem Security in November and is already in use by large language model providers. Bugcrowd describes the offering as a way to compress what would typically be years of in-house engineering work into a few weeks.
Reinforcement learning, the technique behind much of the recent progress in agentic AI, requires environments where a model can take actions, observe outcomes and receive a reward signal. Bugcrowd argues that security has been underserved on that front because most existing training data is synthetic and does not mirror how vulnerabilities behave in production code. Models that score well on curated benchmarks often stumble when they hit real flaws.
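To make the action, outcome and reward loop concrete, here is a minimal Python sketch of such an environment. It is purely illustrative and is not Bugcrowd's implementation: `toy_target`, `ToyVulnEnv`, `StepResult` and `run_episode` are hypothetical names, the "vulnerability" is a stand-in length check, and a random policy takes the place of a model.

```python
import random
from dataclasses import dataclass

BUFFER_SIZE = 8  # hypothetical fixed buffer size in the toy target


def toy_target(data: bytes) -> None:
    """Hypothetical 'vulnerable' routine: fails on oversized input."""
    if len(data) > BUFFER_SIZE:
        raise OverflowError("write past end of buffer")


@dataclass
class StepResult:
    observation: str  # what the agent sees after acting
    reward: float     # objective score for this step
    done: bool        # whether the episode has ended


class ToyVulnEnv:
    """Minimal environment: the action is an input length to try."""

    def __init__(self, max_steps: int = 20) -> None:
        self.max_steps = max_steps
        self.steps = 0

    def reset(self) -> str:
        self.steps = 0
        return "start"

    def step(self, length: int) -> StepResult:
        self.steps += 1
        try:
            toy_target(b"A" * length)
        except OverflowError:
            # Verifiable outcome: the fault was actually triggered.
            return StepResult("crash", 1.0, True)
        # No fault: zero reward, and stop once the step budget is spent.
        return StepResult("ok", 0.0, self.steps >= self.max_steps)


def run_episode(env: ToyVulnEnv, rng: random.Random) -> float:
    """Random policy standing in for a model; returns total reward."""
    env.reset()
    total = 0.0
    while True:
        result = env.step(rng.randint(0, 16))
        total += result.reward
        if result.done:
            return total
```

The reward here comes from running the target rather than from a label, which is the property the article attributes to environments with verifiable outcomes. A real environment would wrap actual software and score more than one kind of task.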
The new platform supplies what the company says are hundreds of thousands of training environments, each built from open-source software with real source code and verifiable outcomes. AI agents are tasked with locating bugs, triggering them, assessing exploitability and producing fixes, with objective scoring at every step. Bugcrowd says no customer data or work from its researcher community is used in the environments.