How NVIDIA® NIMs Bring Speed, Scale and Simplicity to Your AI Project

NIMs are small, self-contained microservices packaged in Docker containers. NVIDIA makes them easy to deploy, streamlines access to GPU resources and lets consumers deploy NIMs using Helm charts – the package manager for Kubernetes.

What's Inside
  • What is the NVIDIA NIM platform?

    NIM stands for NVIDIA Inference Microservice. These microservices range from simple to complex configurations, but all are consumable in a number of different ways: as a service, in the public cloud, on-premises and even on your laptop for smaller deployments.

  • How NIMs simplify the AI path to production

    There are three main aspects to NIMs that can help organizations remove the complexities involved in the prototyping, deployment and production stages of an AI project.

  • Leverage the power of NIMs with CDW

    CDW’s AI practice offers solutions using NVIDIA’s AI architecture, Pro Visualization and Virtualization offerings. From designing the right fit for your business to using NVIDIA’s leading technology, we can support you in your entire AI journey.


Recently I attended a session at the NVIDIA HQ in Santa Clara and spoke to a number of key NVIDIA technical professionals regarding the state of AI and what NVIDIA is doing to make AI more easily consumable for organizations.

One of the key technologies that resonated with me as I walked away from those discussions was the NVIDIA NIM platform. In its current state, it is a catalogue of premade AI services, consumable in a blueprint format.

What are NVIDIA NIMs?

NIMs are small, self-contained microservices packaged in Docker containers. NVIDIA makes them easy to deploy, streamlines access to GPU resources and lets consumers deploy NIMs using Helm charts – the package manager for Kubernetes.
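Once a NIM container is up, it exposes an OpenAI-compatible HTTP API. As a rough illustration of what calling one looks like, here is a minimal Python sketch – the hostname, default port and model name are assumptions for the example, and actually sending the request requires a running NIM:

```python
import json
import urllib.request

# Assumption: a NIM container is already running locally and serving
# its OpenAI-compatible API on port 8000; the model name is illustrative.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, user_prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat request for a NIM."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        NIM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("meta/llama3-8b-instruct", "Summarize what a NIM is.")
# With a NIM running, you would send it like this:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

The point of the sketch is that a NIM looks like any other HTTP microservice to the application – the GPU plumbing stays inside the container.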

NVIDIA NIM Agent Blueprints

NVIDIA NIM Agent Blueprints are preconstructed, more complex services made up of one or more NIMs and other templated services, and they sometimes connect to external services through APIs. Blueprints built with NIMs are generally more readily consumable and can be targeted to a specific industry or vertical. They help speed up delivery of AI microservices with greater consistency.

This highly agile way of creating AI tools has a few key aspects that differentiate it from other ways of deploying services, but before I get into those, let me first define what a NIM is and some of the ways you might get started with one.

What is the NVIDIA NIM platform?

NIM stands for NVIDIA Inference Microservice. These microservices are available in configurations from very simple to complex, but all are consumable in a number of different ways: as a service, in the public cloud, on-premises and even on your laptop for smaller deployments (depending on your laptop's specs).

[Diagram: where NVIDIA NIMs sit in the software stack for deploying AI applications and services]

The above diagram shows where NVIDIA NIMs belong in the software stack for deploying AI applications and services. NIMs can be deployed as part of a larger blueprint to create a single service consisting of NIM microservices that collaborate as a more complex application.

An example is the “PDF to Podcast” blueprint, which combines multiple NIMs into a single service that takes text data and presents that content as a podcast for audio consumption.
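The composition idea behind a blueprint can be sketched in plain Python: each stage below stands in for a call to one NIM endpoint, and the blueprint is simply the pipeline that chains them. These stage functions are local stubs for illustration, not NVIDIA's actual blueprint code:

```python
from typing import Callable, List

# Each function stands in for one NIM microservice in a blueprint.
# All three are illustrative stubs, not real NVIDIA services.
def extract_text(pdf_name: str) -> str:
    """Stub for a document-ingestion NIM: PDF -> raw text."""
    return f"raw text extracted from {pdf_name}"

def write_script(raw_text: str) -> str:
    """Stub for an LLM NIM: raw text -> conversational podcast script."""
    return f"podcast script based on: {raw_text}"

def synthesize_audio(script: str):
    """Stub for a text-to-speech NIM: script -> audio bytes."""
    return script.encode("utf-8")

def run_blueprint(source: str, stages: List[Callable]):
    """A blueprint is just the composition of its microservice stages."""
    result = source
    for stage in stages:
        result = stage(result)
    return result

audio = run_blueprint("quarterly-report.pdf",
                      [extract_text, write_script, synthesize_audio])
```

Because each stage is an independent service, one can be swapped or scaled without touching the others – which is what makes blueprints more consumable than a monolithic application.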

How NIMs simplify the AI path to production

There are three main aspects to NIMs that can help organizations remove the complexities involved in the prototyping, deployment and production stages of an AI project.

1. Deploy across platforms with ease of scalability

NIMs can scale from smaller use cases to very large enterprise solutions if needed, which brings me to the first reason I believe that the NIM story is a valuable one for organizations to learn about: portability.

Because a NIM can be deployed as a service, in the public cloud, on-premises or on a local device, it is easy to deploy services for a PoC rapidly. The service can then be refactored and migrated to the public cloud or the data centre when additional power is required.

Because NIMs are extremely easy to move through the prototyping, quality assurance and deployment processes – and because they simplify scale-up – they allow us to start small, innovate and then invest when the tool shows promise.


A good way to think of a NIM is as a virtual machine: just as a VM abstracts a physical server, a NIM abstracts much of the complexity of deploying a Docker container while leveraging the underlying GPU hardware.

As part of larger blueprints, NIMs can be administered through Jupyter notebooks, providing consistent, streamlined deployment methods.

2. Adapt and fine-tune AI models with a much simpler approach

The second reason I think we should spend some time digging into the power of NIMs is the simplicity they provide. NVIDIA has made several blueprints for services available to customers out of the box. Building out a service using a foundation model is not just about getting the model up and running; it is about how you want your users to consume and interact with that model.

Bringing up a chatbot can be simple, but once you want it to interact with a private data set, or have its output validated by a second model, things can get complicated very quickly. NVIDIA has solved a number of these issues by creating a catalogue of services to use as-is, or as a reference for building similar but more customized versions for your users.
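The pattern behind those catalogue services – ground a model's answer in private data, then check it with a second pass – can be sketched with plain Python stubs. Nothing below is NVIDIA API code; the retrieval, generation and validation functions just show the shape of the workflow, with each one standing in for a call to a separate service:

```python
# Illustrative stubs only: each step stands in for a call to a
# separate microservice (retriever, chat model, validator model).
PRIVATE_DOCS = [
    "Our refund window is 30 days from the date of purchase.",
    "Support hours are 9am to 5pm Eastern, Monday to Friday.",
]

def retrieve(question: str, docs: list) -> str:
    """Toy retriever: pick the doc sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def generate(question: str, context: str) -> str:
    """Stub for the chat model: an answer grounded in the retrieved context."""
    return f"According to our records: {context}"

def validate(answer: str, context: str) -> bool:
    """Stub for a second-model check: does the answer reflect the context?"""
    return context in answer

question = "What are your support hours?"
context = retrieve(question, PRIVATE_DOCS)
answer = generate(question, context)
ok = validate(answer, context)  # only ship answers that pass the check
```

In a real deployment, each of these stubs would be an HTTP call to its own service; the value of the prebuilt catalogue is that this wiring comes ready-made.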

3. Access powerful AI infrastructure where you need it

There are several other qualities that make it easier to build and consume AI with NIMs, but the last one I'll highlight here is how they accelerate AI development by commoditizing the infrastructure needed for building – whether locally, in the data centre, at the edge or in a public cloud.

NVIDIA can recommend the model and model-size options that will work best with your allocated hardware. NIMs create a platform for consumption, so you avoid dealing with complex dependency installations and prerequisites.

I increasingly engage with organizations to emphasize the importance of adopting platforms that streamline service deployment and consumption. NIMs are specifically designed to support and accelerate your AI initiatives effectively.

So, by delivering an agile, portable and simplified means of creating, testing and deploying AI assets, NVIDIA has given organizations a platform to move their AI projects forward, focused on identified business outcomes rather than the plumbing of IT systems.

Leverage the power of NIMs with CDW

I recommend visiting the build.nvidia.com website to explore the NVIDIA NIM catalogue and services for yourself. If you need help creating data governance frameworks, architecting larger AI infrastructure in the cloud or on-premises, or running AI workshops, please reach out to CDW and our team of solution architects.

CDW’s AI practice offers innovative solutions using NVIDIA’s AI architecture, Pro Visualization and Virtualization offerings. From designing the right fit for your business to using NVIDIA’s leading technology, we support organizations in their entire AI journey.

KJ Burke

Principal Technology Strategist
KJ Burke is an innovative and driven IT infrastructure architect with solid interpersonal and communication skills. He is currently the Principal Technology Strategist at CDW Canada, with over 20 years in the IT industry and plenty of experience in planning and deploying technology to improve business processes and drive measurable value.