Meta has released its newest collection of AI models, Llama 4, which now powers the company's AI assistant across its platforms including WhatsApp, Messenger, and Instagram. The release includes two immediately available models, Scout and Maverick, with a third model, Behemoth, still in training.
Llama 4 Scout, designed to run on a single Nvidia H100 GPU, features a massive 10-million-token context window allowing it to process extremely lengthy documents. Meta claims Scout outperforms Google's Gemma 3 and Gemini 2.0 Flash-Lite models, as well as the open-source Mistral 3.1 across various benchmarks.
The larger Maverick model, which requires more substantial computing resources like an Nvidia H100 DGX system, reportedly competes with OpenAI's GPT-4o and Google's Gemini 2.0 Flash on coding, reasoning, multilingual capabilities, and image processing tasks.
Both models utilize a "mixture of experts" (MoE) architecture, which improves efficiency by activating only the necessary portions of the model for specific tasks. Scout has 109 billion total parameters with 17 billion active parameters, while Maverick contains 400 billion total parameters with 17 billion active across 128 "experts."
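To make the "active versus total" distinction concrete, here is a minimal Python sketch. It is not Meta's implementation: the article does not describe how Llama 4 routes tokens, so the softmax router, the `route_top_k` helper, the example scores, and the choice of `k` are all illustrative assumptions. Only the parameter counts come from the figures quoted above.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k=1):
    """Hypothetical router: pick the k highest-scoring experts for one token.

    Returns (expert_index, weight) pairs, with weights renormalized over
    the selected experts. Real MoE routers differ in many details.
    """
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def active_fraction(active_billion, total_billion):
    """Share of the model's parameters used for any single token."""
    return active_billion / total_billion


# Parameter counts as quoted in the article (in billions).
for name, active, total in [("Scout", 17, 109), ("Maverick", 17, 400)]:
    share = active_fraction(active, total)
    print(f"{name}: {active}B of {total}B parameters active ({share:.2%})")

# Toy example: four experts, made-up scores, two experts selected.
print(route_top_k([0.1, 2.0, -1.0, 0.5], k=2))
```

On the quoted figures, Scout touches roughly one sixth of its parameters per token and Maverick under one twentieth, which is why a model with far more total parameters can cost about the same to run per token.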
The still-in-development Behemoth model will feature 288 billion active parameters and nearly two trillion total parameters. According to Meta's internal testing, Behemoth outperforms models like GPT-4.5 and Claude 3.7 Sonnet on several STEM evaluations.
Meta has adjusted these models to respond to more "contentious" questions than previous versions, claiming they provide "more balanced" responses to political and social topics. This comes amid accusations from some that AI chatbots demonstrate political bias.
Despite being labeled "open-source," Llama 4's license restricts usage by companies with over 700 million monthly active users without special permission and prohibits use by entities based in the EU.