Meta released Llama 3.1 405B, an open "frontier-level model" that aims to perform as well as proprietary models. For Meta CEO Mark Zuckerberg, the Llama cadence is designed to play the long game and bet that open-source models ultimately win.
Llama 3.1 405B will support synthetic data generation and model distillation, two capabilities that haven't been available in open-source models. Meta also said it released upgraded 8B and 70B Llama models with a context length of 128K and better reasoning. The models are also multilingual and support multiple use cases.
For Zuckerberg, Llama is a mission. In a blog post, he said large language models (LLMs) will develop much like Linux did. First there was Unix and over time open-source Linux won. He said:
"I believe that AI will develop in a similar way. Today, several tech companies are developing leading closed models. But open source is quickly closing the gap. Last year, Llama 2 was only comparable to an older generation of models behind the frontier. This year, Llama 3 is competitive with the most advanced models and leading in some areas. Starting next year, we expect future Llama models to become the most advanced in the industry. But even before that, Llama is already leading on openness, modifiability, and cost efficiency."
Meta's latest Llama release and the letter from Zuckerberg are designed to court developers who want to fine-tune models, evaluate them for specific applications, pre-train and make models their own. The company said developers can leverage workflows and directions from partners such as Nvidia, AWS, Google Cloud, Microsoft Azure, Dell Technologies, Databricks and others.
Zuckerberg added that developers want affordable model options that can be fine-tuned on sensitive data while avoiding vendor lock-in. Llama is a high-profile effort, but not out of character for Meta, which also led the Open Compute Project.
The win for Meta's approach with Llama is that it can leverage the open-source community and build an ecosystem. Meta wants Llama to be a standard and can act as a neutral party since its business model isn't about selling access to LLMs.
In the end, though, Meta's Llama efforts may be personal for Zuckerberg. He said:
"One of my formative experiences has been building our services constrained by what Apple will let us build on their platforms. Between the way they tax developers, the arbitrary rules they apply, and all the product innovations they block from shipping, it’s clear that Meta and many other companies would be freed up to build much better services for people if we could build the best versions of our products and competitors were not able to constrain what we could build. On a philosophical level, this is a major reason why I believe so strongly in building open ecosystems in AI and AR/VR for the next generation of computing."
More:
- OpenAI, Mistral AI aim for models that can show their work, tackle mathematical problems
- Foundation model debate: Choices, small vs. large, commoditization
- SpreadsheetLLM may yet decipher, democratize spreadsheets for rest of us
- Anthropic adds more collaboration features to Claude for Pro, Team customers
- GenAI may be the new UI for enterprise software
- 14 takeaways from genAI initiatives midway through 2024
- OpenAI and Microsoft: Symbiotic or future frenemies?
- Copilot, genAI agent implementations are about to get complicated
- Generative AI spending will move beyond the IT budget