According to the group, an open-source AI system can be used for any purpose without securing permission, and researchers should be able to inspect its components and study how the system works.
It should also be possible to modify the system for any purpose, including changing its output, and to share it with others to use, with or without modifications, for any purpose. In addition, the standard attempts to define the level of transparency required for a given model's training data, source code, and weights.
The previous lack of an open-source standard presented a problem. Although we know that the decisions of OpenAI and Anthropic to keep their models, data sets, and algorithms secret make their AI closed source, some experts argue that Meta and Google's freely accessible models, which are open to anyone to inspect and adapt, aren't truly open source either, because of licenses that restrict what users can do with the models and because the training data sets aren't made public. Meta, Google, and OpenAI were contacted for their response to the new definition but did not reply before publication.
“Companies have been known to misuse the term when marketing their models,” says Avijit Ghosh, an applied policy researcher at Hugging Face, a platform for building and sharing AI models. Describing models as open source may cause them to be perceived as more trustworthy, even if researchers aren’t able to independently investigate whether they really are open source.