The rise of generative artificial intelligence (AI) poses challenges for the free and open-source software (FOSS) community, a global network that is committed to creating and maintaining publicly available software that anyone can use, modify, and share. Many AI models have been built on open-source software but do not reciprocate the transparency that the FOSS community’s principles require, leaving open-source developers uncertain about how these AI tools are using their code.

A new study by researchers at Yale’s Digital Ethics Center (DEC) explores a potential solution to this problem based on a concept used in free and open-source software known as “copyleft” licenses, a twist on typical copyright rules that obliges works derived from open-source materials to remain as free and transparent as the original work rather than re-licensing it under more restrictive terms.

The authors propose what they call a Contextual Copyleft AI License (CCAI), a novel extension of copyleft licensing that would treat generative AI models as derivative works and require AI developers training models on open-source code to make their architecture and training data freely available.

“Our analysis showed that extending the copyleft concept to generative artificial intelligence has the potential to give open-source software developers meaningful control over how AI developers use their code,” said lead author Grant Shanklin, a de Vries-Sherif Junior Fellow at the DEC and rising senior at Yale College. “Importantly, it would incentivize the formation of a community committed to building AI tools aligned with the values of the free and open-source movement, which could help ensure that AI models are developed openly and responsibly.”

Free and open-source software, which includes operating systems, web browsers, databases, scientific and creative tools, internet infrastructure, and programming and development tools, is a critically important component of modern technology. Cloud computing, smartphones, and AI and other scientific research depend on it.

The new study, published in the International Journal of Law and Information Technology, was co-authored by DEC researchers Claudio Novelli and Emmie Hine; Luciano Floridi, the John K. Castle Professor in the Practice of Cognitive Science and the DEC’s founding director; and Tyler Schroder, a former undergraduate fellow at the DEC.

In a comprehensive legal and policy analysis, they evaluated the benefits and risks of their proposed CCAI licensing, concluding that it is legally feasible under current copyright law as long as training of AI models does not constitute “fair use,” a legal doctrine that promotes free expression by permitting unlicensed use of copyright-protected works under specific circumstances.

They also show that free and truly open-source generative AI models would present several potential benefits, including enhanced transparency, accountability, and innovation. They note that generative AI has a higher risk profile than traditional software because it can be used directly to generate harmful or deceptive content and amplify malicious activities, such as generating highly effective phishing emails.