December 08, 2023

Article
4 min

Choosing the Right Infrastructure for Generative AI – 3 Keys to Success

To capitalize on artificial intelligence, organizations must avoid common pitfalls associated with choosing infrastructure to support development.


Generative artificial intelligence has arrived at what NVIDIA CEO Jensen Huang recently called the “iPhone moment” for AI. 

There are useful — sometimes revolutionary — applications in nearly every industry, and leaders in every enterprise are thinking about how to implement generative AI within their companies. And yet, many businesses are struggling to effectively build out their AI initiatives. One key hurdle is infrastructure: Too often, IT organizations attempt to build AI on the same infrastructure that supports mainstream enterprise workloads.

What many don’t realize is that generative AI places unique demands on IT resources, and an infrastructure that isn’t optimized for training and customizing AI models can stifle data science innovation and delay time to market. To avoid these pitfalls, IT and business leaders should consider three key factors.

1. Developer Inefficiency Is Costing You

To make the most of their time, AI developers and data scientists need a user experience that’s push-button simple. These users don’t need to know (or care) about infrastructure; they simply want to build prototypes, experiment and get to production-ready models sooner. They need a simplified user interface, along with tools that make it easy to start from pretrained, ready-to-customize models, which provide a solid foundation for a faster start. An IT platform should streamline model development, letting developers access resources without having to worry about infrastructure.

Why is their productivity critical? Why can’t they make do with the same compute resources they use today? Data science talent doesn’t come cheap, and retaining it can be hard. When these people are waiting on resources, the business is essentially burning cash: workloads that should take only a couple of hours to run might take days. Many teams spend up to a month “DIY”-ing their software stack to run on the infrastructure provided to them. A dollar spent on traditional, non-optimized infrastructure might actually be costing you three if that infrastructure leaves your developers idling or expending effort that adds no value, such as re-engineering their software stack just to make it usable.
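The "one dollar might be costing you three" claim is ultimately an arithmetic one, and it can be sketched with a simple back-of-the-envelope model. All figures below are illustrative assumptions for the sake of the example, not numbers from this article:

```python
# Back-of-the-envelope cost of developer idle time on non-optimized
# infrastructure. Every number here is an illustrative assumption.

def effective_cost_multiplier(loaded_cost_per_hour: float,
                              productive_hours: float,
                              overhead_hours: float) -> float:
    """Ratio of total spend to the spend that actually produced value."""
    total_spend = loaded_cost_per_hour * (productive_hours + overhead_hours)
    productive_spend = loaded_cost_per_hour * productive_hours
    return total_spend / productive_spend

# Assumed: a data scientist costs $150/hour fully loaded, and in a
# 160-hour month loses 40 hours to waiting on resources and rebuilding
# the software stack rather than doing data science.
multiplier = effective_cost_multiplier(150.0, 120.0, 40.0)
print(f"Each productive dollar effectively costs ${multiplier:.2f}")
```

Under these assumed numbers each productive dollar costs about $1.33; the heavier the idle time and DIY overhead, the closer the multiplier climbs toward the three-to-one figure cited above.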

2. IaaS vs. PaaS

When provisioning AI resources, many IT leaders instinctively turn to Infrastructure as a Service (IaaS) offerings, accessing bare-metal server instances in the cloud for the lowest possible price per GPU-hour. This is understandable, given the way organizations have become accustomed to provisioning resources for more traditional enterprise workloads. However, when it comes to AI, it often makes more sense to move up the stack and adopt a full-stack AI platform.

Platforms optimized for AI include the right infrastructure, such as multinode clusters of GPU resources interconnected with ultrahigh-bandwidth, low-latency networking. They also include a developer workflow hub that insulates teams from the complexity of the infrastructure while letting them collaborate, share their work and dynamically allocate resources across multiple projects at once. And to jump-start projects, they include accelerated data science libraries, optimized AI frameworks and even pretrained models that unleash productivity.

3. Filling the AI Expertise Gap

AI talent can be extremely hard to find and expensive to retain; for some organizations, it is essentially unavailable at any price. Because enterprise AI is still a nascent space, riddled with unsupported and unproven technology, today’s businesses need enterprise-grade 24/7 support and access to AI-fluent practitioners who know how to solve problems.

This was an important consideration as NVIDIA developed the DGX™ platform, and it’s why NVIDIA makes its AI expertise available, on demand, to every DGX customer, helping them achieve better results faster. This expertise ranges from optimizing models for faster training runs to finding the root cause of software incompatibilities that cause training jobs to crash. A full-stack platform that comes with integrated access to AI expertise can help ensure applications reach the market quickly and cost-effectively.

Story by Tony Paikeday, Senior Director of AI Systems at NVIDIA, responsible for go-to-market for NVIDIA’s DGX platform. In his role, Tony helps enterprise organizations infuse their businesses with the power of AI via infrastructure solutions that enable faster insights from data. Tony was previously with VMware, where he was responsible for bringing desktop and application virtualization solutions to market, as well as key enabling technologies, including GPU virtualization and the software-defined data center.
