Zuckerberg states that Meta will require ten times the computing power to train Llama 4 compared to Llama 3.

Meta, developer of Llama, one of the largest open-source foundation large language models, anticipates needing significantly more computing power to train its future models.

During Meta’s second-quarter earnings call on Tuesday, Mark Zuckerberg revealed that training Llama 4 will require ten times the computing power used for Llama 3. Despite this, he emphasized the importance of building the necessary capacity to stay ahead of competitors.

“The amount of computing needed to train Llama 4 will likely be almost 10 times more than what we used to train Llama 3, and future models will continue to grow beyond that,” Zuckerberg stated.

He added, “It’s hard to predict how this will trend multiple generations out into the future. But at this point, I’d rather risk building capacity before it is needed rather than too late, given the long lead times for spinning up new inference projects.”

Meta released Llama 3, in 8-billion and 70-billion parameter versions, in April. Last week, the company launched an upgraded version, Llama 3.1 405B, featuring 405 billion parameters, making it Meta’s largest open-source model to date.

Meta’s CFO, Susan Li, mentioned that the company is evaluating various data center projects and building out capacity to train future AI models. She noted that this investment is expected to increase capital expenditures in 2025.

Training large language models is a costly endeavor. Meta’s capital expenditures rose nearly 33% to $8.5 billion in Q2 2024, up from $6.4 billion a year earlier, driven by investments in servers, data centers, and network infrastructure.

According to a report by The Information, OpenAI incurs $3 billion in costs for training models and an additional $4 billion for renting servers from Microsoft at a discounted rate.

Li stated during the call, “As we expand generative AI training capacity to enhance our foundational models, we’ll continue to build our infrastructure to maintain flexibility in its usage over time. This approach allows us to allocate training capacity to generative AI inference or to our core ranking and recommendation work when it’s expected to be more beneficial.”

Meta also discussed the usage of its consumer-facing Meta AI chatbot, highlighting that India is its largest market. However, Li noted that the company does not anticipate significant revenue contributions from generative AI products.
