Nvidia at Siggraph outlined an AI vision in which developers create, test and optimize generative AI models and large language models (LLMs) on a PC or workstation and then scale them via data centers or the cloud.
Not surprisingly, this vision includes a heavy dose of Nvidia GPUs. PC makers have already highlighted systems on deck for generative AI training and workloads.
The two headliners of CEO Jensen Huang's keynote were Nvidia RTX workstations and Nvidia AI Workbench. AI Workbench is a toolkit that lets developers create, test and customize models on a PC or workstation and then deploy them to data centers, public clouds or Nvidia DGX Cloud.
AI Workbench provides a simplified interface to models hosted on Hugging Face, GitHub and Nvidia NGC, which can be combined with custom data and shared. AI Workbench will be included in systems from Dell Technologies, Hewlett Packard Enterprise, HP Inc., Lambda, Lenovo and Supermicro.
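To make that workflow concrete, here is a minimal, hypothetical sketch of the loop AI Workbench is designed to streamline: pull a pretrained model from Hugging Face, verify it runs on the local GPU, then scale the job out. The model name and prompt are placeholders, not part of Nvidia's announcement, and AI Workbench itself is a toolkit rather than a Python API.

```python
# Illustrative sketch only: shows the kind of local Hugging Face workflow
# AI Workbench wraps. The checkpoint ("gpt2") is a stand-in, not from
# Nvidia's announcement.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any Hugging Face checkpoint would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Run on the local RTX GPU if present, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Quick smoke test before customizing the model with private data and
# promoting the job to a data center or DGX Cloud.
inputs = tokenizer("Generative AI on a workstation", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```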
To go along with AI Workbench, Nvidia launched Nvidia AI Enterprise 4.0, its enterprise software platform for production deployments. AI Enterprise 4.0 includes Nvidia NeMo, Triton Management Service, Base Command Manager Essentials as well as integration with public cloud marketplaces from Google Cloud, Microsoft Azure and Oracle Cloud.
As for the Nvidia RTX workstations, the systems will combine Nvidia RTX 6000 Ada Generation GPUs with AI Enterprise and Omniverse Enterprise software. Configurations will scale up to four RTX 6000 Ada Generation GPUs, each with 48GB of memory, for up to 5,828 TFLOPS of AI performance and 192GB of total GPU memory. OEMs will announce the systems in the fall.
Among other Nvidia items from Siggraph:
- The company announced Nvidia OVX servers with the new Nvidia L40S GPU, which is designed for AI training and inference, 3D design, visualization and video processing. The Nvidia L40S will be available starting in the fall. ASUS, Dell Technologies, GIGABYTE, HPE, Lenovo, QCT and Supermicro will offer OVX systems with L40S GPUs.
- Nvidia launched a new release of Nvidia Omniverse for developers and enterprises using 3D tools and applications. Omniverse uses the OpenUSD framework and adds generative AI features. Additions include modular app building, new templates, better efficiency and native RTX spatial integration. Nvidia also launched new Omniverse Cloud APIs.
- The company also rolled out frameworks, resources and services to speed up adoption of OpenUSD (Universal Scene Description), a 3D framework that connects software tools, data types and APIs for building virtual worlds (see the sketch after this list). The new APIs include ChatUSD, an LLM copilot that lets developers ask questions and generate code; RunUSD, which translates OpenUSD files into rendered images; DeepSearch, an LLM for semantic search through untagged assets; and USD-GDN Publisher, which publishes OpenUSD experiences to Omniverse Cloud in one click.
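For readers unfamiliar with OpenUSD, the sketch below shows the scene-description layer these services operate on, using the open-source pxr Python bindings (installable as the usd-core package). The file name and prim paths are illustrative, not from Nvidia's announcement.

```python
# Minimal OpenUSD sketch: build a tiny scene and write it to disk as a
# human-readable .usda file. Requires: pip install usd-core
from pxr import Usd, UsdGeom

stage = Usd.Stage.CreateNew("hello_world.usda")          # new OpenUSD stage
xform = UsdGeom.Xform.Define(stage, "/World")            # root transform prim
sphere = UsdGeom.Sphere.Define(stage, "/World/Sphere")   # a simple sphere prim
sphere.GetRadiusAttr().Set(2.0)                          # set its radius attribute
stage.SetDefaultPrim(xform.GetPrim())                    # mark /World as the entry point
stage.GetRootLayer().Save()                              # write the file
```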
Constellation Research’s take
Constellation Research analyst Andy Thurai said:
“Nvidia NeMo is an end-to-end framework for building foundational models, which can be a pain to build. Nvidia AI Workbench will allow for the cloning of AI projects and give developers a workspace to build LLMs.
The new service and interface will allow users to train or retrain models from Hugging Face in Nvidia DGX Cloud, whether on a public cloud platform such as GCP or Azure or an Nvidia private cloud. Notably missing was AWS.
Nvidia claims today's compute power is built for older technologies and workloads. According to Nvidia, modern workloads must run on newer chips such as Nvidia GPUs and Grace Hopper superchips. To that end, the dual GH200 (which combines a Grace CPU and a Hopper GPU into a Grace Hopper superchip) is one of the most powerful processors ever. With that combination, Nvidia aims to lower the capital expense for the required processing power (capex) and to cut operational costs through lower energy consumption and much faster inference (opex). If this dream were to come true, Nvidia could kill the mighty Intel's x86 business by demonstrating that Grace Hopper can process AI technologies, particularly AI training workloads, with 20x less power and 12x less cost than comparable CPU-based processing technologies.
In short, Nvidia has claimed that AI workloads, both training and inferencing, must run on Nvidia-based chips to be more efficient than rivals such as Intel and AI chip companies like SambaNova Systems.”