Silent Collapse
Gradual degradation in the performance of AI models that are trained on synthetic data produced by themselves or by other AI systems, leading to a decline in output quality over successive training iterations.
This phenomenon, also called "model collapse," occurs when generative AI systems, such as large language models, are recursively trained on their own outputs or on data generated by other AI models. As errors in the synthetic data accumulate over time, the model's predictions become increasingly distorted, producing nonsensical or biased content.
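A toy simulation makes the feedback loop concrete. In the sketch below (a simplified, hypothetical illustration, not the setup of any published experiment), the "model" is just a Gaussian distribution fitted to its training data, and each generation is trained only on samples drawn from the previous generation's model. Small estimation errors compound until the learned distribution bears little resemblance to the original.

```python
import numpy as np

# Minimal sketch of the recursive-training feedback loop, under toy
# assumptions: the "model" is a Gaussian fitted by maximum likelihood,
# and each generation trains only on samples from the previous model.
rng = np.random.default_rng(0)

data = rng.normal(loc=0.0, scale=1.0, size=100)   # generation 0: real data
for generation in range(1, 2001):
    mu, sigma = data.mean(), data.std()            # "train": fit the model to its data
    data = rng.normal(mu, sigma, size=100)         # next training set is purely synthetic
    if generation % 400 == 0:
        print(f"gen {generation:4d}: fitted mean={mu:+.3f}  fitted std={sigma:.4f}")

# On a typical run the fitted standard deviation drifts toward zero, so the
# model progressively "forgets" the spread (and hence the tails) of the
# original distribution, even though no single step looks dramatic.
```

A real language model has billions of parameters rather than a mean and a standard deviation, but the underlying statistical effect is the same: each round of training on self-generated data re-fits the model to an already slightly distorted distribution.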
This collapse happens quietly and can be difficult to detect in its early stages, because the degradation tends to affect infrequent or complex data points first, while common, average-case behaviour still looks normal. Over time it erodes the model's reliability and ability to generalize, limiting its effectiveness in real-world applications.
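The tail-first nature of the degradation can be shown with another toy sketch (again a hypothetical setup, not a reproduction of any specific study). Here the "model" is simply the empirical frequency of symbols in its training data; once a rare symbol fails to appear in a synthetic sample, the next model assigns it zero probability and it never returns, while common symbols look unaffected for many generations.

```python
import numpy as np

# Toy illustration of tail loss: a categorical "model" retrained on its own
# samples. Rare symbols silently drop out first; common ones persist.
rng = np.random.default_rng(0)

vocab_size = 50
true_probs = 1.0 / np.arange(1, vocab_size + 1)    # Zipf-like: a few common, many rare symbols
true_probs /= true_probs.sum()

probs = true_probs                                  # generation 0 is trained on real data
for generation in range(1, 11):
    sample = rng.choice(vocab_size, size=300, p=probs)      # synthetic training set
    counts = np.bincount(sample, minlength=vocab_size)
    probs = counts / counts.sum()                            # "retrain" on the synthetic data
    surviving = int((probs > 0).sum())
    print(f"gen {generation}: symbols still represented = {surviving}/{vocab_size}")
```

Tracking this kind of tail coverage against a held-out sample of real data is one possible early-warning signal, since aggregate quality metrics may barely move while the tails disappear.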
The concept was first identified around 2023 by researchers including Dr. Ilia Shumailov of the University of Oxford, whose team's experiments showed that repeated training on AI-generated data leads to rapid performance decay, with noticeable degradation after only a few training generations.