OpenAI Podcast · 2025-08-15

Defining AGI and the Road Ahead

Hosts: Andrew Mayne

Guests: Jakub Pachocki, Szymon Sidor

AGI definition · Benchmark saturation · Reasoning models · Scaling · AI competitions (IMO, IOI, AtCoder) · Progress measurement · AI in education · OpenAI research roadmap

Why it matters

Recent competition wins: gold at IMO, top performance at IOI, and 2nd place at AtCoder behind a human competitor

Key claims

  • OpenAI defines AGI less as a fixed benchmark and more as automating research and the production of new technology — Pachocki says this is 'not that far away'
  • Benchmarks are saturating; the team argues standardized tests are no longer adequate proxies for general capability or real-world utility
  • Scaling is not dead — pre-training continues to compound with reasoning models, and Pachocki highlights compute-per-answer (e.g., GPT-5 Pro using 10–20x more compute than GPT-4) as a key frontier
  • Recent competition wins: gold at IMO, top performance at IOI, and 2nd place at AtCoder behind a human competitor

Episode summary

OpenAI's Chief Scientist Jakub Pachocki and Szymon Sidor sit down with Andrew Mayne to discuss how OpenAI thinks about AGI, where the next breakthroughs will come from, and why traditional benchmarks are losing their usefulness. Pachocki frames AGI less as a fixed threshold and more as the automation of research and discovery, envisioning an entity that can produce new technology, codebases, and scientific insights at scale. He argues that capabilities previously lumped together under "AGI" (natural conversation, math, research) are actually diverging, and that point-wise human-competition benchmarks like the IMO gold medal are useful but increasingly inadequate measures of real-world impact.

Both guests push back on the "AI is slowing down" narrative, with Szymon noting that headline economic-impact estimates (e.g., 3%) ignore the near-zero capability of a decade ago. They discuss the saturation of standard benchmarks, the rise of reasoning models like o1 (whose success surprised the team enough to prompt an 11pm call with Sam and Mira), and how scaling pre-training is compounding with extended chain-of-thought reasoning. Pachocki points to model persistence — using vastly more compute per problem — as the next clear frontier, citing GPT-5 Pro using 10–20x more compute than GPT-4 for meaningfully better answers.

The conversation also covers recent competition results: OpenAI's model took gold at IMO, won IOI, and placed second at AtCoder (losing only to Szymon's friend). They discuss practical implications for users — including trust trade-offs around models accessing personal data like email and calendars — and close with advice for high school students: learn to code, as structured problem decomposition remains a premium skill regardless of how AI evolves.

  • Pachocki describes the o1 reasoning breakthrough as a 'freaked out' moment — an 11pm call with Sam and Mira signaled the pace had changed
  • Szymon pushes back on low economic-impact estimates, framing them against near-zero capabilities a decade ago
  • Practical AGI impact framed as a largely automated company of researchers/engineers that radically accelerates technical progress
  • Both advise students to learn to code — structured problem decomposition remains a high-value skill even as AI improves

Source material

Transcript

Hello, I'm Andrew Mayne and this is the OpenAI Podcast.

Today our guests are OpenAI's Chief Scientist Jakub Pachocki and Szymon Sidor.

We're going to talk about measuring AI progress, how you determine AGI, and where the next breakthrough might come from.

I'm at OpenAI.

We seek to create intelligence that is very general.

► What does Chief Scientist mean?

I want to first start off by understanding the roles.

Jakub, you're the Chief Researcher, Chief Scientist at OpenAI?

Chief Scientist, yes.

Okay, what does Chief Scientist mean?

So the primary thing I'm responsible for is setting the research roadmap for the company.

So deciding what is the technical path we are going to bet on and what is the underlying long-term research that we're going to pursue.

So how about you, Szymon?

What do you do?

Random things.

Random things, okay.

Yeah, I mostly do IC work.

I try to maybe sprinkle leadership somewhere in there.

I try to do whatever is the very most useful.

Now you two knew each other before working at OpenAI, right?

Yeah, we went to the same high school.

Same high school?

Yeah.

Were you guys friends?

I think we became best friends after we left.

I think coming to the US is the kind of emotional experience that forms bonds.

I think in high school we were more like colleagues.

What kind of high school produces guys like you?

Well, yeah, we went to this high school in Gdynia in Poland.

I think we were both drawn there by this computer science teacher, Mr. Richard Szubartowski, who had a great track record before we went there of bringing up computer scientists, programmers, with this big focus on programming competitions and pursuing excellence in this one field.

Yeah, I think that was a very formative experience and a great mentor for us.

Oh, wow.

Yeah, definitely.

I think he was really going deep on programming.

I think it went way beyond typical high school curriculum.

There was graph theory, matrices, and all sorts of stuff like that.

I actually hope that maybe with ChatGPT it's a little bit easier for people now to do these kinds of deep dives, because without the right mentor and without a lot of work, it's kind of hard to replicate that experience.

I've been using it to explain things like the Monty Hall problem, where you have to choose which door.

You go into ChatGPT and you say, make a graphic interactive version of this.

All of a sudden you can see it.

It can show you the different solutions if you do one thing or the other.

It's one of these things where I'm excited about the ability not just to explain in text but to build multimedia to do things.

It does get into the area of there's not really a measure for that.

It's a use case.

This didn't exist before.

We talk about AGI, but we kind of have very loose definitions and whatever.

I'd love to hear how you would describe it both from a technical and also like a layperson's understanding of it.

Yeah, maybe to address the point about teaching: this sort of better explanation of some concept, or teaching through the Socratic method, is definitely a powerful use of ChatGPT, and I think it works well with a teacher like our Mr. Szubartowski.

At the same time, I think the thing that he was able to provide was more like emotional support and space, which I think will be hard for AI to do alone.

That's a great point.

I think that gets lost a lot because sometimes you hear people talk about, "Oh, AI will replace education."

For me, I had teachers where maybe their facts weren't always right, but their heart was there and they cared and they answered questions and stuff.

I think that's a good point that these are companions to that.

I think that a teacher using these tools can be an even more capable teacher.

On the subject of AGI though, I want to hear first like, give me your technical definition of it or actually not a technical definition.

Give me like how you would describe it to like if you were talking to a younger sibling.

A few years ago, when we would talk about AGI, it felt like the technology, deep learning, had incredible promise, but at the same time, the concept still felt a little bit abstract and far away.

I think whether you talk about human level intelligence, ability to converse naturally, ability to solve math problems or pursue research, they all kind of felt like in the same space.

I think as technology has progressed, now we see these are actually quite distinct capabilities and I think we pretty clearly are at the point where the AI is able to converse naturally on a wide range of topics.

It is able to solve math problems.

I think getting a gold medal at IMO is something we've long discussed as like a milestone on the path to AGI and that happened.

I think solving all the problems on the International Math Olympiad is actually a little bit harder and I think it's like another milestone on the path there.

But I think increasingly, we see that these kinds of point-wise measures are less adequate and so we turn to thinking about what is its actual impact in the world.

For me personally, the thing that I think about when I think about how AI progress really impacts the world meaningfully, I first think about its potential for automating the discovery and production of new technology.

I think we tend to associate new ideas and fundamental technological progress with just human ingenuity, and we measure our progress by these major milestone inventions and technological revolutions.

I think it is just hard to internalize that it is possible to automate most of this process, that it is possible to have a big computer that is coming up with ideas that fundamentally change our understanding of the world, and I actually think that is not that far away.

Thinking about what separates us from that and what are the consequences of such technology is my first thought.

I just ordered a little Mac Studio because I want to take the open source model gpt-oss and just let it run nonstop, because the idea of letting it generate and do stuff 24/7 is fascinating to me. But you are talking about basically automating science at a huge scale.

What kind of discoveries, what kind of things do we think might be the first things we could see from that?

When we think about how we shape our research program at OpenAI, we seek to create intelligence that is very general.

We drive towards this automated researcher as a priority, but we don't really think of it as, let's take these specific domains and let's deploy this technology there.

I think that is a way to make faster point-wise progress but I think the potential for the really big discoveries and the most meaningful technology advancement comes from this generality.

Still I think we see the technology is easier to apply in some domains than others.

I think especially in places that combine a large amount of reasoning with a lot of domain knowledge and intuition seem very amenable to these systems.

I think in particular we see pretty incredible results on medicine which is very encouraging.

I have high hopes about that.

I think naturally being a company of AI researchers, we think a lot about automating our own work.

I think it is also kind of a...

If AI can indeed reach a point where you can automate AI research then that is probably a very important thing to automate and similarly thinking about how it can help with automating research on AI alignment and safety.

I'm obviously impressed by the IMO AI results.

I was actually about to add that in the past, when we were talking about the IMO goal a few years ago and were still trying to even figure out what our definition of AGI might be, one kind of concept we were considering was something like solving all the problems on the Math Olympiad.

Why did that feel appropriate?

It's just like, okay, if you have a model of such a superior mathematical reasoning, then it should be able to disrupt a bunch of different domains that can be mathematically modeled.

I mean, I'm in general just...

I think maybe this podcast is just a good opportunity to kind of like share a little bit more of an inside perspective.

I was astounded by the progress.

I think...

So sometimes I see those headlines where people say that like, oh, the economic kind of impact of AI is only like 3% or 5%, right?

And those headlines are often accompanied by comments like, well, so AI is slowing down or people are overhyping AI so much and it's only like 3%, so what's up with that, right?

And when I see headlines like this, I think back to maybe 10 years ago, when I was working on natural language processing with deep learning, and back then it just didn't really work.

I remember Jakub once came to test one of the technologies we were working on. I was trying to detect the sentiment of sentences, and he was trying things: "This movie is bad," correctly classified as negative; "This movie is good," correctly classified as positive; and then you would say "This movie is not bad," and the model is like, "Oh, negative."

So that was 10 years ago, right?

And since then, like, okay, like we slowly got like...

We slowly started solving tasks like this, solving tasks like decide is this word a noun, a verb, that was like sentiment neuron, then we had GPT-1, GPT-2 started producing like a paragraph of text that made sense, right?

That was such a breakthrough.

Right now it feels so simple, but back then it was such a breakthrough.

Then we had like GPT-3, GPT-4, GPT-4 was like to me like kind of like let's say my personal AGI moment because it would sometimes say things that surprised me and I was like, like, can this model actually surprise me, right?

Still, back then ChatGPT for my personal use kind of felt a little bit more like a nuance and kind of like maybe a slightly better Google, but like, what's the big deal?

And then suddenly we get to deep research, and this can actually answer questions truthfully and rarely makes things up, and that felt useful.

And then finally, now we have models that can compete in programming competitions, which was, you know, very hard for me personally and even more so for Jakub, obviously.

Yeah, the pace of progress just like from the perspective of somebody working on this technology is like absolutely amazing.

So when you see that 3%, I raise you this: 10 years ago, if we had to quantify it, it would probably be like 0.00001% or something, right?

So like really, I think those numbers need to be put in perspective, right?

And there is no reason not to believe that like in a year, it will be 10% in two years, it will be 20 and so on and so forth.

Yeah, I've heard it said that if you looked at like a graph of the economy from let's say like, you know, World Wide Web, you know, early 90s forward and you said point to the internet happening to the economy, you can't find the point.

There's no point you go, oh, okay, Tim Berners-Lee announced this whatever.

And I think AI is a lot like that where we were, oh, we've only measured this one.

Our measures are hard.

It's hard to know that, you know, one who's using it, how they're using it.

And you brought up a very good point too about if you've been following it for a while.

I remember training like a very simple next character predictor on my computer and it was terrible, right?

One, I'm using a small computer, but even then, and then you got the sentiment analysis, you're playing with BERT and it's kind of getting a little bit better.

But then GPT-2 comes out and I read every single output on GitHub.

Every single output that came out of GPT-2, because I'm like, there is something going on with this.

And that's how I ended up working at OpenAI was because I was this obsessive person about that and then with access to GPT-3 kept saying, oh, this is really this path that's moving forward.

But it's kind of crazy now because like if six weeks go by and a benchmark hasn't been broken, people are like, oh, we hit the wall.

We hit the wall.

And I would say part of the problem though is that benchmarks in some ways feel like you'll see modest improvement on them.

I've heard some of the benchmarks have problems that some of them actually have wrong answers and it's impossible to get 100% if you answer them correctly.

But also we talk about the term internally.

I've heard people talk about this as, you know, saturation.

Do you want to talk about that?

Yeah, I think there's a few issues that we're hitting with benchmarks right now.

Yeah, I mean, a pretty clear one is saturation and that is just the models genuinely reaching a point where for the kind of standardized forms of measuring intelligence or ability, like they are at human level for a lot of them.

If you're able to perform amongst the top in these high school competitions, where we have the best competitors from around the world, it just becomes quite hard to have this very constrained measurement.

You know, previously, when we were looking at this GPT-1, GPT-2, GPT-3, GPT-4 scaling paradigm, the benchmarks were really just measuring the rising of the tide.

I think now, you know, the field has developed a lot more data-efficient ways to train for specific abilities, right?

It doesn't mean, you know, train on these benchmarks, but you can train models that are like disproportionately good at math compared to their ability, you know, to write, for example, right?

And so they will do better on math benchmarks, but it's no longer as representative of their overall intelligence in other topics.

I think, you know, with these two issues combined, we really have to think about the real-world utility of these models and especially their ability to discover new insights.

Yeah, I guess that's a thing that sort of kind of gets overlooked is that you can build a model that's a really good test taker, but that model may not really be that useful for work.

Ideally, your model should score well in tests, but just because a model got these scores doesn't mean you're going to find it personally useful.

And I certainly think that's a challenge right now where when people say is a model good or bad, it's kind of like saying, you know, you're trying to create a blanket assessment when there is a hundred different use cases for it.

You know, is a model good or bad?

Maybe it's great at creative writing.

Maybe it's bad at math.

Maybe it's great at math and bad at creative writing.

And that becomes a really big challenge.

And we've talked about this with, for math, the International Math Olympiad and these kinds of metrics.

Why are they important?

Why is it important to put it into these sort of human level competitions?

I think the reason we've been excited about these competitions like the International Math Olympiad and the International Olympiad in Informatics is that they are a pretty interesting example of a test that is constrained.

Doesn't require that much knowledge, but really tests your ability to think about a problem hard for an hour or two or three.

And you know, and we have like a very kind of good, we have very good evidence that like these problems are hard.

You know, there's a lot of people that try to solve them and compete at solving them and it matters to them.

So yes, I think this is then like for models that like, you know, excelled at like kind of knowing a lot of things, but not necessarily, you know, thinking very hard in the past that really seemed like the kind of the right milestone to be working towards.

And as I understand it, the model that scored gold medal level on that wasn't using a calculator, it wasn't using other tools, it wasn't using some of the frameworks, it was doing it purely through reasoning.

Yeah, that's right.

For the International Math Olympiad, yeah, the model was not using other tools like, yeah.

And again, two years ago, if you asked it to multiply two four-digit numbers, it would fail.

Yeah, but definitely, like, you know, for this kind of context, it's really like, it is of course, like in a limited domain of math, but it really is about like fairly creative thinking not about applying a formula.

I guess that's part of the challenge though, is that once you start moving outside of math, it gets to be harder.

You can start to come up with things like Humanity's Last Exam, which I think is a pretty neat test, but you find that certain models, after they learn a certain kind of tool use, kind of figure out how to solve these problems better.

And I would wonder what kind of benchmarks are we going to need?

You know, what are you looking at to say, okay, this is how I can kind of get an objective measure of a capability.

One thing that surprised me in the past: I was talking to one of our coworkers here, Anna Makanju.

And I was telling her about IMO, I was excited about like some progress.

And she's like, what's IMO?

And that was a very important moment for me, because I realized that with some of those benchmarks, we kind of live in a bubble a little bit.

For me, that competition feels important, especially like the computer science counterpart IOI because it was a big part of my life.

And so it's true for many coworkers here, but actually like for an average person, like working in other fields or maybe not as interested in mathematics or computer science, maybe they're interested in history or something.

Anna speaks like five languages too.

So I could see for her a different metric based on that would be interesting.

Yeah.

So I think one thing that is not a perfect metric, but at least helps keep us honest and helps us escape the bubble, is just ChatGPT users, right?

Because everybody uses ChatGPT and they use it for all sorts of use cases.

And obviously there's like a lot of pitfalls to using that as a metric, but at least it avoids that particular problem where like there are just some things that I'm more familiar with and other people might appreciate other things.

And this gives you like a very wide coverage.

Yeah.

And in there too, you have subsets of users, people who are building GPTs and doing more complicated stuff.

You alluded before to the fact that the model will reason longer.

And that seems like a very interesting way to evaluate capabilities.

Yeah.

Yeah.

So maybe one challenge with focusing only on usage of ChatGPT and broad adoption of AI as the metric of progress:

I think this hasn't really happened to a very meaningful extent yet, but I think it will start happening pretty soon.

We should be able to use vastly more compute than a user would normally be willing to buy for themselves, to produce technology artifacts that are useful to a lot of people.

And I think that for me will be a very important measure of progress.

Which of these wins were the most surprising to you?

I think we definitely kind of anticipated getting to this point when we saw the reasoning models starting to work.

At the same time, I think this recent set of wins is very impressive.

I think maybe out of those, I think IMO came a little bit sooner than I expected.

IMO got again, I think IMO problem six will still...

At the IMO, all the problems require creative thought and some new insights.

I think there's this problem six that requires very out of the box thinking.

And it's really kind of like usually outside the typical domains of the other problems.

So in the past we were actually kind of drawing a boundary between getting a gold by solving these other problems and actually going on to solve all the problems.

And in particular problem six.

So it was pretty hilarious in some way to see ourselves and also Google DeepMind at the same time say, "Oh yeah, we solved problems one through five perfectly and we didn't make any progress on problem six."

I think that kind of makes that challenge pretty clear.

Yeah, that was...

I think that was what's interesting is that...

Yeah, I think that the OpenAI model said like, "Yeah, I don't think I can solve this." It didn't even try.

Or said that it had a problem with that.

Is that correct?

Yeah, the model was able to correctly identify that it didn't make progress on the problem.

That's pretty fascinating to think about that.

The model is able to sort of determine that.

There's a lot of conversations about people talking about hallucination, which I think is a kind of a poorly understood thing.

And there's a difference between fluid and crystallized thinking.

And one is how much knowledge a model has and the other is its problem solving capability.

And when you get to the point where it's able to do that, it's able to say, "I think I won't be able to answer this."

That's pretty interesting sort of point to get to.

I've been told to ask this question about a livestream in Japan.

So I think in the past few weeks, actually, our models have performed incredibly well in three competitions.

So we talked about two of them, which is IOI and IMO.

There is also this competition that is open to everyone, not just high schoolers, called AtCoder.

It's a very prestigious, very high quality competition organized in Japan, but open to competitors worldwide.

And this particular contest was about longer-horizon, heuristic problems, where you're given only a single problem and you have 10 hours to solve it.

And so you have competitors racing to figure out the best approach to this difficult optimization problem.

So it's a bit different because there isn't like a single correct solution.

There isn't like a single pattern to follow.

These tasks are extremely diverse, and you can focus on the single task for 10 hours.

And so we entered our model into this contest.

And to me, this had a little bit of personal significance.

I used to be a very engaged competitor in the past in these more short-form, closed-form contests like IOI.

And my friend, Psyho, who also worked at the company at one time, excelled at these long-duration contests.

And when we worked together, he would mock me a little bit that my sort of contest would be automated long before his.

Because they are kind of longer duration and require kind of more focused work.

And it turns out, in this contest in Japan, Psyho was actually one of the top contenders.

And so I was watching this livestream, watching our model race with Psyho throughout the competition.

In the end, our model actually got second place, and Psyho won.

So he alone stood in the way of his prediction not coming true.

Still two wins for OpenAI.

I think one thing that also stood out to me is that Psyho, at the end of the competition, was really, really tired.

And they interviewed him a little bit to talk about his experience, like in the middle of the competition.

And I don't think I can quote him directly on this podcast, but he's like, "Your models are very, very bad.

I want to go to sleep.

I am tired."

Yeah.

We've heard talks about the wall, we mentioned that before, and I think it was interesting because reasoning kind of came out of nowhere.

There were hints of stuff, some papers and stuff, but people really hadn't drawn the line.

Then all of a sudden, the o1 model comes out.

And the whole idea that you can not just have a model give answers, you can let the model kind of have an inner monologue, talk to itself, and solve things through.

Do we think that's enough to take us to AGI?

Are there other breakthroughs needed?

Or are there other breakthroughs you think are going to happen?

I just need to point out that the team here worked extremely hard on this particular thing.

It feels like something simple, like you just need longer chain of thought, but actually to make it work was really hard earned.

And I do think back to your previous question of what was the surprising result when we first noticed that it's working, or we first noticed that we can train those models and give them more data and they get better.

That was, I think, one of the most shocking moments, the moment where we started asking very, very seriously the question: are we ready as an organization for incredibly fast-paced progress?

I remember there was one particular evening, I think at 11 p.m., we were on the line with Sam and Mira and just kind of trying.

I think we got a little bit freaked out by those results.

Sometimes that happens.

The pace is fast.

I mean, it is a fast thing.

And like I said, the joke is people, nothing happens for six weeks, they think it's slowed down.

But then if you look year over year, it is.

And I mean, it's a fair point because yeah, you have things that you're aware of internally when you work on something for a couple of years and they're like, hey, there's a research paper, but it's not like it came out last night.

It was like, there's a lot of work on it.

But I'd say the world was sort of surprised by the fact that there is this really fundamental new way to make these models do even more, to take the existing infrastructure, so to speak, and get a lot more capability out of it.

Where do you think the next breakthroughs are going to happen?

I think one thing we always try to not underestimate is the importance of scaling.

I think even as we look at these reasoning models, it's not like the previous scaling paradigm of pre-training has vanished.

I think we will see these things compound.

And I think there are also new directions that we can move in.

In particular, we were talking about extending the horizon that these models can plan for and reason over.

And I think if you look at it from the perspective of just compute spent, we say, okay, yeah, we went from GPT-4 using some amount of compute for every answer to GPT-5 Pro, which maybe uses 10x, 20x, I don't know.

Some non-trivial, but in some ways not that impressive amount of compute more, right?

And it can produce much better answers.

I think on the scale of what amount of compute would you be willing to spend on a problem that actually matters to a lot of people, right?

On progress on a medical research question, progress on developing the next generation of models, right?

These are incomparably larger amounts.

I think that question of model persistence, the ability to work for a very long time on focused problems, is a pretty clear next step.

How would you put the practical implications of AGI if you were talking to a typical ChatGPT user? What would their experience be like a few years from now, or five years from now, which sounds far away, but it's really not, because five years ago GPT-3 came out and that feels like a blur.

What would an AGI like model be capable of?

So I was talking about automating research.

My picture of how that would actually look like is imagine a company of very capable researchers and engineers that is largely automated, right?

And now again, I think that is something that will interface with the world in all sorts of ways.

It won't be just like kind of a black box.

It will talk to people.

It will kind of like take in inputs.

It will run experiments.

But I think having this sort of potential for developing new technology and other kinds of artifacts, code bases, designs, will kind of radically accelerate the pace of technical progress.

So I think that is something that we will feel and we need to do a lot of work to get it right from a technical and societal perspective.

But I think that is kind of where our time is.

I think we should also expect a lot of progress on the actual kind of interfaces that we interact with.

We see like GPT can feel quite human-like.

We can form attachments with it.

I think as it becomes more persistent, as it becomes kind of capable of expressing itself in different forms than text, right?

Like I think that those effects will become stronger.

And again, like that will be something I think will become a very big and important conversation.

I just got access in ChatGPT to have it actually read my calendar and Gmail.

And I realized like how far we've come because I'm excited about that now.

I'm not really terrified that it's going to start writing, like, Ewok fan fiction to somebody.

And I think that's sort of this neat threshold that we've sort of crossed, this sort of level of trust.

I think we are definitely in a place where there's a very tough trade-off, where there is such clear economic and personal value you can extract out of having the model have access to a lot of your data.

And at the same time, I think like we are not at like the threshold of like robustness where like we can fully trust these models to not be exploited by someone trying to exploit them.

Yeah, it's definitely a big problem that I think we as a field will have to iterate on.

What would you tell two versions of you guys today in high school?

What would you do if you're visiting your old classroom?

What would you say right now?

Tell them about the future?

What advice would you give?

Invest in Bitcoin.

No, I mean today, even today in 2025, what would you tell a high school student?

High school students today?

Oh yeah, that one is also I think a great question, right?

Because I hear a lot of kind of what I consider misinformation on that online.

So you should absolutely learn to code.

Like, one skill that is at a premium and will continue being at a premium is to have a really structured intellect that can break complicated problems into pieces.

And you know, that might not be programming in the future, but programming is a fine way to acquire that skill.

So are other kinds of domains where you need to think a lot.

So don't let people tell you that you should not learn to code.

Yeah, I learned to code late in life.

And that's actually how I ended up working at OpenAI as an engineer.

And I try to explain to people: just because a system can do the thing doesn't mean you don't want to know how it works anymore.

And as you said, when you understand how to break down a task, when I worked at OpenAI in prompt engineering, my coding understanding helped me understand how to take language and break it down and make it do better things.

I think that people who bridge those gaps are really at an advantage.

And so whenever I hear people say like, don't learn to code, it's like, do I want an airplane pilot who doesn't understand aerodynamics?

Like this doesn't make much sense to me.

Well, you know, thinking about how I thought about things in high school, I think it's pretty incredible how many perceived constraints are not actually there when you really think about it.

You know, maybe the first revelation to me was like, hey, if I'm really passionate about this computer science stuff, I can actually spend a bit more time on it at the cost of, you know, maybe spending a bit less time on the other 12 subjects in school.

But then, again, it was a big revelation to me that actually, you know, I can go and study in the USA at some point.

Like, you know, that's not really something that seems obviously in the action space. And, you know, spending some time here in Silicon Valley, right, and seeing how people are willing to really attack these big problems with ambition and the belief that you can actually make a meaningful positive change in the world.

Yeah, I think that has been incredibly inspiring, and it's something I cherish about this community.

Is there a book or something that like inspired you?

I think there's a couple books.

I remember, and it's actually hilarious thinking about it now.

I didn't really connect the dots but my dad gave me this book once when I was like in a pretty.

I think I was like 15.

I was like pretty unsure what I want to do.

It was a Polish version of a book by some author I didn't know, called Hackers & Painters.

Yeah, so yeah, it was actually Paul Graham.

So I guess, yeah, again, this community.

I found that pretty inspiring.

Yeah, there's something helpful.

I think that hearing the message of like, no, it's okay to dream big and go do stuff that you can just make things happen in the world.

And I think that the more people realize that kind of the better the world gets to be.

Was there any book that influenced you, or a movie or TV show?

I have a stupid answer to that question.

I love stupid answers.

But after the profound one, but like, okay, so I watched Iron Man.

Yeah.

And it inspired me to start a PhD in robotics.

That's a great answer though.

Like, you know, The Martian by Andy Weir, I met a scientist at NASA who was a botanist who read that book.

And I'm like, well, they got the atmospheric physics wrong and all this.

He's like, well, it's why I'm here.

I'm like, oh.

Well, yeah, I guess I didn't get to the stupid part.

The stupid part was, when I started working on robotics, I was very disappointed by how bad the robots are.

It somehow didn't occur to me that maybe the movie is a movie.

Yeah.

So that whole experience would have been kind of bad for me, if not for the fact that this is where I met a friend who was into deep learning.

And at the time I thought all of machine learning was kind of hype, but it was an interesting systems problem.

And then out of nowhere, and I'm sure I would frustrate some DeepMind folks by saying that, AlphaGo came out.

No, I'm sure it wasn't out of nowhere.

I'm sure it was years in the making.

And that like was very inspiring.

I actually think to both of us.

And since then it was just hard not to work.

Yeah.

Took me a while to become convinced that deep learning is more than a fad.

Because you know, like we don't really understand the kind of underlying optimization.

I think this kind of has been the story of our research here trying to make progress on these questions, like about how it really works.

But it really is like studying a physical phenomenon in some way.

And, you know, to a classically trained computer scientist, that was a weird thing to accept.

I do remember when Jakub was telling me about scaling up principles, convex optimization.

That was before AlphaGo.

AlphaGo was interesting because first like, oh, cool, it solved Go.

And then we're like, yeah, but it just learned by watching all these.

Then they did AlphaGo Zero, where it was self-taught, and you're like, okay, game over, folks.

There's a trajectory here.

And I think that's continued on.

But I think that, yeah, if you hadn't watched Iron Man, maybe Thor instead, you know, maybe things would have turned out better.

Who knows?

No.

I kind of wish I studied maths instead.

It would have been more useful.

Study what?

Maths.

Or theoretical computer science.

Either of those like this.

Yeah.

Physics probably.

Physics.

Physics.

I started off as a magician.

None of you know that.

So I actually had my own reality TV show.

So you can find a very strange path to end up here.

So Jakub, Szymon, it's been an absolute pleasure to talk to you both.

And I hope we can meet again and talk about the next big breakthrough that you guys have been secretly working on that's going to come out of nowhere.

And we'll be like, that was an overnight thing.

Thanks, Andrew.

Thank you.
