Publication | Open Access
Generative AI and the Digital Commons
Citations: 21
References: 0
Year: 2023
Artificial Intelligence, Engineering, Machine Learning, Available Data, Data Infrastructure, Generative System, Data Science, Management, Data Integration, Big Data, Data Governance, Data Sharing, Digital Commons, Data Management, Intellectual Property, Generative Artificial Intelligence, Public Policy, Data Modeling, Data Privacy, Generative Models, Computer Science, Information Management, Data Security, Responsible Data Management, Generative AI, GFM Companies, Data Protection
Many generative foundation models (GFMs) are trained on publicly available data and use public infrastructure, but 1) may degrade the "digital commons" that they depend on, and 2) do not have processes in place to return the value they capture to data producers and stakeholders. Existing conceptions of data rights and protection (focusing largely on individually owned data and associated privacy concerns) and copyright or licensing-based models offer some instructive priors, but are ill-suited for the issues that may arise from models trained on commons-based data. We outline the risks posed by GFMs and why they are relevant to the digital commons, and propose numerous governance-based solutions. These include investments in standardized dataset/model disclosure and other kinds of transparency about generative models' training and capabilities, consortia-based funding for monitoring/standards/auditing organizations, requirements or norms for GFM companies to contribute high-quality data to the commons, and structures for shared ownership based on individual or community provision of fine-tuning data.