Multimodality

German: Multimodalität

Multimodality refers to an AI model's ability to process, and in some cases generate, not only text but also other data types (modalities) such as images, audio, or video. A multimodal model can, for example, take an image as input and describe it in text, or answer a text question about a chart.
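As a rough illustration, a multimodal prompt can be pictured as a list of parts, each tagged with its modality. The sketch below is not tied to any real model API; the names `TextPart`, `MediaPart`, `modalities`, and `is_multimodal` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextPart:
    """A plain-text piece of a prompt (hypothetical type)."""
    text: str


@dataclass
class MediaPart:
    """A non-text piece of a prompt, e.g. an image or audio clip (hypothetical type)."""
    data: bytes
    mime_type: str  # e.g. "image/png", "audio/wav", "video/mp4"


Part = Union[TextPart, MediaPart]


def modalities(parts: list[Part]) -> set[str]:
    """Return the set of modalities present in a prompt."""
    found = set()
    for part in parts:
        if isinstance(part, TextPart):
            found.add("text")
        else:
            # The top-level MIME type ("image", "audio", "video") names the modality.
            found.add(part.mime_type.split("/")[0])
    return found


def is_multimodal(parts: list[Part]) -> bool:
    """A prompt counts as multimodal here if it mixes more than one modality."""
    return len(modalities(parts)) > 1


# A chart image plus a text question about it; the bytes are a placeholder.
prompt = [
    MediaPart(data=b"placeholder-bytes", mime_type="image/png"),
    TextPart(text="What trend does this chart show?"),
]
print(sorted(modalities(prompt)), is_multimodal(prompt))
```

A text-only prompt would report a single modality, so `is_multimodal` would return `False` for it.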

Source: Google DeepMind — Multimodal AI
