In artificial intelligence (AI), a foundation model (FM), also known as a large X model (LxM), is a machine learning or deep learning model trained on vast datasets so that it can be applied across a wide range of use cases. Generative AI applications such as large language models (LLMs) are common examples of foundation models.
Building foundation models is often highly resource-intensive: the most advanced models cost hundreds of millions of dollars, covering the acquisition, curation, and processing of massive datasets as well as the compute required for training. These costs stem from sophisticated infrastructure, long training runs, and specialized hardware such as GPUs. In contrast, adapting an existing foundation model to a specific task, or using it directly, is far less costly, since it leverages pre-trained capabilities and typically requires only fine-tuning on smaller, task-specific datasets.
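As an illustration, the sketch below shows what such a fine-tuning step can look like in practice, assuming the Hugging Face transformers and datasets libraries; the base model (bert-base-uncased), dataset (imdb), and hyperparameters are illustrative choices, not specifics drawn from any particular system.

```python
# Minimal fine-tuning sketch: adapt a pre-trained foundation model to a
# downstream task (binary sentiment classification) on a small dataset.
# Model, dataset, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Start from pre-trained weights rather than training from scratch.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2
)

# A small, task-specific labeled dataset (movie-review sentiment).
dataset = load_dataset("imdb")

def tokenize(batch):
    # Convert raw text into the token IDs the model expects.
    return tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=256
    )

tokenized = dataset.map(tokenize, batched=True)

# A short training run on a few thousand examples: a tiny fraction of the
# compute spent on the original pre-training.
args = TrainingArguments(
    output_dir="out",
    num_train_epochs=2,
    per_device_train_batch_size=16,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()
```

Only the comparatively cheap fine-tuning run is paid for here; the expensive pre-training that produced the base model's weights is reused as-is.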
Early examples of foundation models include language models (LMs) such as OpenAI's GPT series and Google's BERT. Beyond text, foundation models have been developed across a range of modalities, including DALL-E and Flamingo for images, MusicGen for music, and RT-2 for robotic control. Foundation models are also being developed for fields like astronomy, radiology, genomics, coding, time-series forecasting, mathematics, and chemistry.