AI systems that convert visual information from images into descriptive textual representations, enabling machines to understand and communicate the content of images.
Generality: 755