NVIDIA AI Podcast · 2025-04-02

NVIDIA's Jacob Liberman on Agentic AI in the Enterprise

Hosts: Noah Kravitz

Guests: Jacob Liberman

Agentic AI, Enterprise AI adoption, LLM reasoning, AI standardization, GTC 2025, AI agents, Generative AI



Why it matters

Agents represent the third era of GenAI use, after chat/copilots and RAG.

Key claims

Agents represent the third era of GenAI use, after chat/copilots and RAG.
Lack of standardization in agent communication and memory storage is a key adoption blocker.
Liberman predicts most LLM inference tokens will eventually be agent-to-agent, not human-to-agent.
Enterprises need deterministic outcomes; agent reasoning can make token costs unpredictable.

Episode summary


Recorded live at GTC 2025, NVIDIA Director of Product Management Jacob Liberman frames AI agents as the third era of GenAI, following chat/copilot usage and RAG. He explains that an agent is essentially an instruction-tuned LLM trained to use tools, likening it to giving a generalist model a vocational degree. Liberman argues that within a few years the majority of LLM tokens will be generated for agent-to-agent communication rather than human interaction, mirroring the rise of machine-to-machine trading in computational finance.

On adoption challenges, Liberman highlights the lack of standardization across agent implementations, which creates friction when agents need to communicate and share memory. He stresses that enterprises require deterministic outcomes, and that agentic workloads can produce unpredictable token costs due to unbounded reasoning. He also notes a shift in enterprise meetings, where legal and HR now join IT and security to discuss governance.

Liberman details NVIDIA's response: reference blueprints for AI agents (digital humans, biomedical research, cybersecurity vulnerability analysis, and simulation-to-real-world robotics) available at build.nvidia.com/blueprints and on GitHub. He describes reasoning models and test-time compute as the dominant enterprise paradigm, with use cases spanning software co-development, document summarization, and rewriting previously solved problems like predictive failure analysis to leverage natural-language interaction. Personal tools he uses include NVIDIA's PDF-to-Podcast blueprint, Perplexity Pro, and Cursor.

Autonomy should scale with risk, drawing lessons from physical-world robotics and autonomous vehicles.
NVIDIA ships open-source agent blueprints at build.nvidia.com/blueprints for digital humans, research, and security use cases.
Reasoning models and test-time compute are becoming the dominant enterprise paradigm.
Use cases include software co-development, document summarization, cybersecurity triage, and rewriting solved problems like predictive failure analysis.

Source material

Transcript

[Music] Hello and welcome to the NVIDIA AI Podcast.

I'm your host, Noah Kravitz.

Developers are excited about agentic AI, and they're not alone.

More and more enterprises are deploying applications with agentic capabilities.

But with the excitement comes new questions and challenges.

How widespread will adoption of agentic AI become?

What should AI teams be thinking about when designing and developing AI agent applications for the enterprise?

And what should they be thinking about during adoption?

With us live from GTC 2025 to explore the growing role of agentic AI in the enterprise is Jacob Liberman.

Jacob is a Director of Product Management at NVIDIA, where he leads a team building cloud-native GenAI software solutions.

Currently, he's focused on building accelerated storage platforms that connect AI agents to enterprise data.

And prior to joining NVIDIA, Jacob held product management and engineering positions at Red Hat, AMD, and Dell.

Jacob, welcome to the AI Podcast, and thanks so much for taking the time to join us.

Thank you for having me.

I'm very happy to be here.

So should we start with the basics?

And I'm just going to ask you, what is an AI agent?

So, Noah, I'd say an AI agent is the latest evolution in the way people are using GenAI.

We're kind of in the third era of GenAI use already, which is crazy when you think about it, because it's really been only 18 months or two years since the technology became widespread.

But I would say that it started out, people would chat with LLMs and they would use GenAI as co-pilots and assistants to do their work.

Next, they used retrieval-augmented generation to attach their LLMs to data and chat about their data.

And now people are using large language models to reason and act and do things in the world.

So you could think about it this way: with an LLM, you could ask it to plan your trip to Europe.

With an AI agent, you could ask it to plan your trip to Europe and to book it for you and to give it little cues like, "Hey, I like castles."

And it will do the research and create an itinerary for you, compare prices, and then actually book the trip.

Right.

So before we get a little deeper, is an agent a large language model with different capabilities?

Is it a completely different piece of technology?

Is there kind of a concise way?

And if there's not, that's fine.

To sort of just set the level for the listeners of how agents and LLMs and other models work together.

Sure.

So I would say that a large language model, when you train it, it's a bit of a generalist.

And you can give it additional training data to make it more of a specialist.

You can instruction tune it to actually follow directions and call tools and learn how to use the tools.

So you can kind of think of an AI agent as, you know, you have a machine shop with a bunch of tools on the wall, you've taken your LLM to school and it has a bachelor's degree, and then you give it a vocational degree, and it knows how to use those tools on the wall now like an expert.
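
The machine-shop analogy can be sketched in a few lines of Python. This is only an illustration, not NVIDIA's implementation: the tool names (`search_castles`, `book_tour`), the toy catalog, and the hard-coded plan are all hypothetical. In a real agent the plan would be emitted by an instruction-tuned model rather than written by hand.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical "tools on the wall"; names and data are illustrative only.
def search_castles(country: str) -> List[str]:
    catalog = {
        "Germany": ["Neuschwanstein", "Hohenzollern"],
        "Scotland": ["Edinburgh Castle"],
    }
    return catalog.get(country, [])

def book_tour(castle: str) -> str:
    return f"booked:{castle}"

# Registry the agent is allowed to call, keyed by tool name.
TOOLS: Dict[str, Callable] = {
    "search_castles": search_castles,
    "book_tour": book_tour,
}

def run_plan(plan: List[Tuple[str, dict]]) -> List[object]:
    """Execute (tool_name, arguments) steps in order and collect the results."""
    results = []
    for name, args in plan:
        if name not in TOOLS:
            raise KeyError(f"unknown tool: {name}")
        results.append(TOOLS[name](**args))
    return results

# Assumption: a tuned model would produce this plan; here it is hard-coded.
plan = [
    ("search_castles", {"country": "Germany"}),
    ("book_tour", {"castle": "Neuschwanstein"}),
]
results = run_plan(plan)
```

The registry is the point of the sketch: the model supplies tool names and arguments, and ordinary code decides which tools exist and how they run.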

Right.

Fantastic.

And then there are agents kind of popping up all over the place now, and there are agent stores and things like that.

Is it a matter of kind of going out and finding the agent that you want?

And then you have to ensure compatibility with the models that you're using?

Or is it designed so that, you know, you can grab an agent and put it into your workflow and these things generally work with each other?

That's a great question.

So I would say that one of the biggest challenges to AI agent adoption right now in the industry is a lack of standardization.

Okay.

So things have been standardized in terms of the approach, tuning agents to be able to use tools and to follow instructions.

But where we lack standardization is in the way that the agents communicate with one another and the way that they store their actions and represent it in memory.

That becomes important whenever you need agents to interact with each other.

It can add a lot of friction if they're communicating with different protocols or storing their conversations and memory in different ways.
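
To make the friction concrete, here is a minimal sketch of what a shared message format between agents could look like. It assumes a hypothetical `AgentMessage` envelope with invented field names (`sender`, `recipient`, `intent`, `payload`); it is not an existing standard or any vendor's protocol.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class AgentMessage:
    """Hypothetical envelope that every agent agrees to send and store."""
    sender: str
    recipient: str
    intent: str
    payload: dict

def encode(msg: AgentMessage) -> str:
    # Sorted keys give a stable wire and storage format.
    return json.dumps(asdict(msg), sort_keys=True)

def decode(raw: str) -> AgentMessage:
    return AgentMessage(**json.loads(raw))

request = AgentMessage(
    sender="travel_planner",
    recipient="booking_agent",
    intent="book_trip",
    payload={"destination": "Europe", "likes": ["castles"]},
)
wire = encode(request)
restored = decode(wire)
```

If two agents disagree on even one of these field names, every exchange needs a translation layer, which is the kind of friction described above.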

Right.

So typically, you can grab an agent from, let's say, any vendor off the shelf and apply it to some task.

But then when it goes out into the real world and starts interacting with other agents, that's where things become tricky.

Gotcha.

And so how widespread do you foresee agentic AI becoming?

Is this the future or at least the sort of near term present future?

I think, yes.

I think that the vast majority of tokens generated by large language models will be to enact and to serve agent communication.

So agents do two things.

They reason and they call tools and communicate.

I guess that's three things.

So that act of reasoning, it's almost like talking to yourself.

It's generating a lot of intermediary tokens.

So if you look at it in terms of inference, a single-agent agentic workload will usually generate a lot more tokens than a corresponding, let's say, inference chat workload.

And the way I think things will evolve will be a lot like computational finance.

Not too long ago, the vast majority of stock trades were conducted by humans.

But in the world we live in now, probably 75%, 80% of them are conducted by machines with other machines.

And so if you look at that ratio, I would say we're going to see the same thing where the vast majority of LLM inference will be between agents and not involve humans.

And what will they be doing?

Well, the agents will be doing many of the things that human workers do right now, or maybe that human workers would like to do, but they don't have time to do, or things that human workers don't like to do, which I call toil.

Toil is very, let's say, repetitive, error prone tasks that, you know, they're not creative tasks, they're not necessarily productive tasks, but they take up a lot of our time.

That is probably where we will see the first stages of agent adoption.

Which I'm all for.

Yeah, freeing people from doing the kind of busy work that fills up a lot of their time and leaving them available to do the higher value work.

Right.

And then we've had, you know, dating back several years now, when I think about it, guests come on to the pod and talk about the metaphor of an orchestra conductor, or a train, you know, train station conductor is used a lot, where the human will be assigning those tasks out to the agents, the AI systems, and then almost being like the manager, right?

And you've got a fleet of agents going and doing the work for you and they come back and you review, or you kind of prompt it to take a different tack or that kind of thing.

Is that still a viable metaphor?

You know, I'm not sure.

I think it's very supportive of our human egos.

Right.

To believe that we will be in the best position to kind of conduct the orchestra.

Yes.

That's not clear.

So probably what will happen is that there will be teams composed of carbon people and silicon agents and they're collaborating on tasks.

And at various times, the humans will be conducting the orchestra and at other times the orchestra will be conducting itself.

And that might be the most efficient way to get the work done.

Right.

But I do think it's a comforting metaphor.

Right.

No, that's well said.

And it was interesting when you said that it was one of those, I wasn't expecting that response, but it made perfect sense, right?

Because if this keeps going the way it's been going, agents are going to get pretty smart, pretty fast.

They will.

And I think that there are unique characteristics of humans.

I mean, now I'm getting way outside of my role as a product manager at NVIDIA and I'm just kind of philosophizing.

That's fine.

I think you called me a carbon human a few minutes ago, so I'm good with it.

You're carbon, right?

You're carbon.

You're not a digital human.

No, but I do think that, again, if we go back to the finance metaphor and computational finance, there are often algorithms that rebalance portfolios.

Right.

And maybe a human spot checks them, or maybe the human uses their intuition and experience to recognize some unique situation where they need to stray from that path.

Right.

And maybe the human knows some information or has some intuition about their individual clients that, well, maybe this risk profile is not right for this client.

So human judgment is critical, human strategizing is critical, and there's always room for that.

So it's a way to complement the things that we're very good at with some of the things where we could use some help.

Yeah.

Toil.

All right.

So let's kind of flip perspective for a minute and talk about some of the challenges in the enterprise and in an organization when it comes to adopting agentic AI.

Yes.

So how much time do we have?

Well, no, it's a great question.

So my primary role is to bring generative AI to every industry.

Right.

And there are challenges, there are technological challenges, and there are social challenges because work always occurs in a social context.

So on the technology side, we touched on this a bit earlier, but the lack of standardization across agent implementations can make agents working together very inefficient.

Right.

And this becomes a big problem because this work can be costly and enterprises want to extract the most benefit from their investment in the infrastructure.

So we have to make those communications more efficient.

Standardization is one way to drive that.

Another problem is that just like any language model, the work of an agent is not always deterministic.

What that means is there's this notion of hallucination where if a model doesn't know the answer, it might make something up and that can happen with an AI agent.

And enterprises need deterministic business outcomes because you're betting your business on it.

So that's another area where there's things we can do to increase that determinism and to add checkpoints and whatnot to ensure that it's there.

The other source of, I would say, indeterminacy is cost: with a typical LLM interaction, you can more or less judge how much it's going to cost in terms of the amount of tokens it generates.

Well, when you combine reasoning, which is kind of an unbounded source of tokens, potentially, with complex problems, now all of a sudden you can't really predict at times how much that seemingly innocuous question will cost you.

It's like when you travel to Europe and you're using your roaming cell phone and you come back and you have a $5,000 bill.

And so that's what we want to avoid.
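
One way to avoid the surprise bill is a hard token budget around the reasoning loop. The sketch below is an assumption-laden illustration, not a feature of any particular product: `run_with_budget` and `TokenBudgetExceeded` are hypothetical names, and whitespace splitting stands in for a real tokenizer.

```python
from typing import Iterable, List, Tuple

class TokenBudgetExceeded(RuntimeError):
    """Raised when the next reasoning step would exceed the allowed budget."""

def run_with_budget(steps: Iterable[str], max_tokens: int) -> Tuple[List[str], int]:
    """Accept reasoning steps until the next one would push usage past max_tokens."""
    used = 0
    outputs: List[str] = []
    for text in steps:
        cost = len(text.split())  # crude stand-in for a tokenizer (assumption)
        if used + cost > max_tokens:
            raise TokenBudgetExceeded(
                f"step needs {cost} tokens, only {max_tokens - used} left"
            )
        used += cost
        outputs.append(text)
    return outputs, used

# Hypothetical reasoning trace; a real one would stream from the model.
trace = ["compare flight prices", "check castle opening hours", "draft itinerary"]
outputs, used = run_with_budget(trace, max_tokens=20)
```

Failing loudly at the limit turns an unbounded cost into a bounded one; the caller can then retry with a smaller problem or escalate to a person.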

And so I think enterprises, they need deterministic outcomes.

That's the technological side.

On the social side, the autonomy of agents raises a lot of ethical and legal questions.

When I first started working with enterprises, it was IT folks and security folks who would show up in the meetings, but lately, a lot of lawyers have been showing up and HR people have been showing up.

So that's just kind of an interesting shift and we'll have to see where that goes.

Our guest is Jacob Liberman.

Jacob is a director of product management here at NVIDIA, where he leads a team building cloud-native GenAI software solutions with a focus on accelerated storage platforms that can connect the enterprise and the enterprise data to all of the AI capabilities we've been talking about.

Jacob, you touched on this a little bit, but I think it's something worth digging into more here.

This notion of agentic AI and autonomy and mentioning some of the systems where the human in the loop maybe checks in less often than some of the others because the system can go faster and the human can see everything's cool.

And we're going, how far do we take that?

How much autonomy should agents have?

I mean, technically speaking theoretically, how much could they have, but how much should they have?

And what are some of the conversations that you've been a part of thinking about this?

What's the current thinking about agentic autonomy?

So that's another great question.

This is a frequent question that we get and a cause of a lot of concern for people.

This notion that we're going to kind of turn these agents loose on the world and they'll be able to do whatever they want.

So what I usually tell people is that just like human workers, the roles and responsibilities of agents require a range of autonomy.

And in some places, in some roles, the agent needs wide latitude to kind of make decisions.

For example, if you're a customer service AI agent and someone calls you up, they might have something missing from their order.

They might have a customer service complaint.

They might have this.

They might want to know what the other options are.

And you really need to be creative in how you address that.

And probably the risk of giving that AI agent so much autonomy is relatively low.

And I guess then it's kind of a, let's say like a plot where you have autonomy on the x-axis and risk on the y.

There are other scenarios, say you have an AI agent that's responsible for rebalancing your retirement portfolio.

There you don't want it to get very creative.

Oh, we're not all crypto.

You want it to kind of follow the tried and true formulas.

And you can embed that level of autonomy and determinism into the actions of the agent.
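
The idea of matching autonomy to risk can be sketched as a small policy table. Everything here is hypothetical and only illustrates the two examples from the conversation: the action names, the two risk levels, and the returned policy strings are assumptions, not a published standard.

```python
from enum import Enum

class Risk(Enum):
    LOW = 1
    HIGH = 2

# Hypothetical mapping from agent actions to risk levels.
ACTION_RISK = {
    "answer_order_question": Risk.LOW,
    "offer_replacement": Risk.LOW,
    "rebalance_portfolio": Risk.HIGH,
}

def autonomy_for(action: str) -> str:
    """Return how much latitude the agent gets for a given action."""
    # Unknown actions are treated as high risk by default (design assumption).
    risk = ACTION_RISK.get(action, Risk.HIGH)
    if risk is Risk.LOW:
        return "act_autonomously"
    return "follow_fixed_procedure"

customer_service = autonomy_for("offer_replacement")
retirement = autonomy_for("rebalance_portfolio")
```

Defaulting unknown actions to the restrictive path means a newly added capability starts with the least latitude until someone classifies it.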

So I think that the fear of these autonomous agents kind of running around, having fun and being wild,

I think that is a bit unfounded, but there are situations where you want autonomy, frankly.

I don't know, is there a best practice?

Is there a formula?

Not formula, but for kind of figuring out when you're meeting with customers, potential customers, how do you kind of find that balance?

Well, this is an area that's still pretty nascent.

So I would say we have a map.

We have a roadmap.

We know what we need to do because if you look at autonomy in the physical world, we've already seen these questions a bit with robotics.

And at NVIDIA, of course, we work with robots.

We work in simulation and we work in the real world.

So if it's an autonomous vehicle you're putting on the road,

if it's an AMR that is in a factory, if it's a flight control system, an aviation co-pilot, we have AI agents that assist tractors in the field.

All of those uses are governed by standards.

And the standards basically assess the level of risk inherent in the task and the level of risk mitigation that you need.

So what I expect will happen is that we're going to learn from AI autonomy in the physical world and apply those lessons to how we deploy AI in the enterprise.

So to dig into that a little deeper with what NVIDIA is doing now with agents and working in the enterprise and elsewhere, can you talk a little bit about some of the things NVIDIA is doing in the space?

Yeah.

So this will be a bit of a shameless plug for the work my team is doing.

Fantastic.

Yes.

I mean, after all, I'm a human agent and I need to feed my family.

So the first thing that we're doing, the first thing NVIDIA is doing and that my team is doing specifically is we're building blueprints for AI agents.

And blueprints are reference architectures implemented in code where we show how you can take NVIDIA software and apply it to some productive task in an enterprise to solve some real business problem.

And these blueprints are generally taken by global system integrators and service deployment partners, service delivery partners who take them in, adapt them to their own portfolio, differentiate, and then take them out to our customers at scale.

So for example, we have a blueprint for a digital human.

The digital human can be made into a bedside digital nurse, a sportscaster, a bank teller with just some verticalization.

I'm grateful you said sports and not pod.

Podcaster.

No, no, we still...

That's too far away from our capability.

Yeah.

And then also where these AI agents start to intersect the physical world, the thing I was just talking about, we also have blueprints for teaching agents how to work in simulation and then deploying them to the physical world.

Yeah.

So that's very cool.

And then the other thing we're working on, of course, NVIDIA, we're an acceleration company.

We make things run faster.

The other thing we're doing is we're working with this diverse ecosystem of agent platform builders, and we're trying to make sure that they all run great on NVIDIA software, both the inference piece at scale and the distributed communications.

So it's really those two things.

What are you hearing from developers?

What are they excited about when it comes to using agentic AI and the work they're doing?

It's very interesting and not surprising that software developers are among the earliest adopters of agentic AI.

Now, a lot of the focus of agentic AI up until this point has been on consumer products.

But developers have kind of adopted these technologies at a very rapid rate.

Sure.

And it's really interesting when you watch people program now, they kind of co-develop with the AI agent and they'll say things like, "Document this for me.

How could I do this differently?

Okay, take this code and encapsulate it and make it multi-user."

So they're actually using human language to ask the AI agent to perform some task; whether they can do it themselves or not is kind of immaterial.

It's just more efficient to at least use the agent as a starting point.

Right.

Reasoning has kind of become a little bit of a buzzword over the past few months when it comes to all of this stuff.

What does it actually mean?

And we can stick with the developer context maybe to talk about it.

What does it actually add if I'm a developer and I'm coding and I'm using a coding co-pilot to do the things you were just talking about?

If the co-pilot has reasoning capabilities versus not having them, what material difference might that make?

So reasoning and the use of reasoning with large language models, it's kind of become the dominant paradigm and use case for large language models and agents in enterprise.

Is it the default at this point, basically?

It's not the default because the more time you spend thinking about something, the longer it takes, the more latency there is in terms of your reaction.

And in some use cases, it's appropriate and some it isn't.

But where it's appropriate are doing things like, let's say, biomedical research.

We have a blueprint that can be used to simulate developing molecules.

And so one of the things our customers can do is take a reasoning model, attach it to all of their private research data, attach it to all the public research data on PubMed, on the internet, come up with a unique molecular design, and then pass it into our simulation software to see if they can build it and make sure that it's stable.

So there you don't need a real-time interactive response.

You're okay if the LLM goes off and thinks for a while before it comes back with an answer.

So the actual capability that's emerging is something that we're calling test-time compute.

Test-time compute is System 2 thinking.

It's thinking about thinking.

The model will look at the way it's solving the problem and decide if it's doing it in the most efficient way or in the best way, in the optimal way.

And it's fairly interesting.

You can actually watch the reasoning models think through a problem.

And you asked about developers specifically. I saw this one reasoning model interacting with one of my coworkers, and he said something to it like, "I'm a product manager at NVIDIA.

Look at this GitHub repo and explain it to me as though I were five."

And then the reasoning agent actually thought, "Well, this will be difficult to explain to a five-year-old, but this guy is a product manager at NVIDIA.

So clearly he's not five years old.

So I will up my language a little bit to be more appropriate for his level."

Oh, that's amazing.

Yeah.

It was actually interesting to see the reasoning model work this out.

Maybe we can dig into some other use cases.

You were talking about development, but where else are you seeing agentic AI systems being used in the enterprise?

Sure.

So we talked about software development.

We talked about research.

Research can span things like summarization.

Let's say you have a bunch of complicated documents, a long email thread, a long Slack thread.

You could go through and read each one of those things, or you could ask an AI agent to summarize it for you.

You could ask it to prepare structured reports for you.

So these are two variations that both require reasoning that are becoming dominant use cases in enterprise.

Now, another really interesting thing we're seeing in enterprise is that people are rewriting solved problems, applications that already solve problems through automation in this new format, because they want to take advantage of the natural human interaction.

So for example, let's say predictive failure analysis.

You have an offshore oil drilling rig.

It's very costly if a component breaks.

You have all sorts of telemetry that will let you know if something's about to happen.

You could implement that predictive failure analysis with the traditional data science or machine learning approach.

But if you use a large language model, if you use an agent, now you have the capability to interact with it in a natural way.

So that you can use that to plan the responses, be alerted, trigger all of the logistic actions that might come into place to circumvent that failure.

So I would say those are probably the three.

There are very few unique use cases for LLMs, but the technology is so powerful that we're basically re-solving all of the solved problems in order to do it better.

To do it better, yeah.

You mentioned earlier that the lack of standardization was a challenge in adoption and development when it comes to the agents in general, but also how they communicate with one another.

Stepping back from that a little bit, are agents being used more broadly in cybersecurity and risk management applications?

They are.

So within NVIDIA, we do a lot of work to make sure our software is secure.

And we use AI.

We use AI throughout NVIDIA.

We use AI to design our GPUs.

We use AI to write our software.

We use AI to make sure our software is secure.

And then we will often package those approaches and those learnings as blueprints that we can re-deliver to our customers.

And one example of that is that we have a computer vulnerability assessment and analysis pipeline so that if we have a bit of software and we're alerted that there's a vulnerability in the software, the AI agent will actually look and see if our code paths that trigger that exploit are executed.

And it will make an assessment of how much risk we're exposed to.

And it will also recommend how to remediate the problem.
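
The core question in that pipeline, whether our code actually executes the vulnerable path, can be illustrated with a reachability check over a call graph. This is a simplified sketch under stated assumptions, not the NVIDIA blueprint: the call graph, the function names, and `is_reachable` are all hypothetical, and a real pipeline would derive the graph from the codebase.

```python
from collections import deque
from typing import Dict, Iterable, List

def is_reachable(
    call_graph: Dict[str, List[str]],
    entry_points: Iterable[str],
    target: str,
) -> bool:
    """Breadth-first search: can any entry point reach the target function?"""
    seen = set(entry_points)
    queue = deque(seen)
    while queue:
        fn = queue.popleft()
        if fn == target:
            return True
        for callee in call_graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False

# Hypothetical call graph: caller -> functions it calls.
call_graph = {
    "main": ["load_config", "serve"],
    "serve": ["parse_request"],
    "parse_request": ["lib.decode_header"],
    "load_config": [],
    "admin_tool": ["lib.unsafe_eval"],
}

exposed = is_reachable(call_graph, ["main"], "lib.decode_header")
not_exposed = is_reachable(call_graph, ["main"], "lib.unsafe_eval")
```

A reported vulnerability in `lib.unsafe_eval` would rank lower here because nothing reachable from `main` calls it, which is the kind of triage signal that saves an analyst time.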

And so that assists our human worker, Christina, who does all of this work.

And it should make her work more efficient.

So it's a great example of how, you know, a very applied thing that's potentially error prone, but important, where we have an expert human who is basically using the agent to give her more leverage.

Right.

So kind of as we start to wrap up here, advice for listeners. I'm sure there are listeners who are in enterprise situations and thinking about, you know, how do we start designing and developing, with deploying down the line, but thinking about agentic AI for whatever the use case is, you know, for where they're at.

What advice would you give them?

Broad advice right now, best practices for designing, developing, deploying agentic AI in these situations?

So this is the advice I would give to anyone.

And in fact, I try to follow it myself and I give it to my team: soon, this will be the way everyone does everything.

Right.

We have to start getting familiar with these tools and these approaches just broadly.

So are there any that you like to have, you know, consumer facing in a broad sense, but AI tools that you're using regularly at work, not at work that, you know, you're just, you're a fan of?

Yeah, so at NVIDIA, of course, we have many partners and I use a lot of their technologies in my personal life.

In fact, I recently presented to a group of energy investors and energy professionals about the sustainability of AI.

And I used a lot of the tools, the blueprints that my team built to help me prepare for that conversation.

Right.

So for example, we have something on our blueprints page called PDF-to-Podcast, and you can give it a bunch of PDF documents and it will generate an engaging monologue or a dialogue.

It could be a conversation or a debate, right?

So that you can listen to it in your car and familiarize yourself with the content.

Right.

So the entertainment value is not what you get, you know, here clearly, but it's useful.

Here all week, folks.

No, it is though, right?

Because you can then, you can, you know, we all have earbuds in our ears all the time anyway.

So why not be learning what you need to learn for your next meeting?

Right.

And so another tool I really like is Perplexity Pro.

Perplexity functions a bit as a search engine, but it's also agentic in that it will generate net new content.

It's not finding content, it's generating new content.

So you can ask it questions.

For example, when I was researching sustainability, I asked it to develop a report for me on trends between the TOP500 list of the world's fastest supercomputers and the Green500 list of the world's most energy-efficient supercomputers.

And are there any crossovers and what are the trends you're seeing?

Yeah.

And it prepared that; it didn't exist anywhere.

It made that report for me.

Right.

Net new for you.

Yeah.

Yep.

And for the developers, you know, I think Cursor is very powerful.

Cursor will kind of give you a development environment where you have an AI assistant that you can interact with in natural language, and it's kind of a Visual Studio Code plugin, and it will assist you as you work.

So I'd say those are, those are three that I use pretty much every day.

Fantastic.

Jacob, for listeners who want to learn more about agentic AI, agentic AI in the enterprise, the work your team is specifically doing, where would you send them?

Are there places online, a specific part of the NVIDIA site, social media, where should they go?

I think a great place to start is to go to build.nvidia.com/blueprints.

Most of the Gen AI workflows that we create have interactive demos.

I don't even know if you can call them demos.

We deploy them in our own cloud so you can experience them firsthand.

Right.

And we do not have a consumer focus.

We have a very enterprise focus.

So we have use cases that you won't see anywhere else, like training fleets of robots, building wind tunnels entirely in simulation.

So it's a very cool place to hang out and get started.

We also have a capability where if you're experimenting with the interactive demo and you actually want to spin up a deployment of that thing in your own VPC, you can enter your NVIDIA developer API key.

You can enter your .pem credentials for your VPC and you can spin up a virtual machine with the Blueprint pre-deployed in your network.

Very cool.

You can bring your data to it.

You can customize it.

And again, all of these things are open source.

Fantastic.

So we also have them available on GitHub.

Jacob Liberman, thank you so much for taking the time out of GTC week to join the podcast.

Agentic AI, obviously a hot topic right now, but that's underselling it.

Like you said, this is where it's all heading.

It's how we're going to be doing things.

So no better time than the present to get started.

Thank you.

[Music]