The a16z Show · 2025-09-30

Building an AI Physicist: ChatGPT Co-Creator's Next Venture

Hosts: Anjney Midha

Guests: Liam Fedus, Ekin Dogus Cubuk

Periodic Labs, experiment-in-the-loop RL, AI for science, high-temperature superconductivity, mid-training, scaling laws limitations, academic-industry collaboration, advanced materials R&D, OpenAI alumni, DeepMind alumni

Why it matters

Team composition is roughly half ML researchers and half physicists/chemists.

Key claims

Periodic Labs replaces math/code reward functions with nature-as-RL-environment: experiments become the ground truth that agents optimize against
Founders argue domain-shift makes internet-scale pre-training insufficient for physics—out-of-domain power-law slopes can be too shallow to reach target capabilities
First targets are superconductivity (current ambient-pressure record ~135K) and magnetism, chosen for technical robustness to defects, philosophical appeal, and team alignment
Team composition is roughly half ML researchers and half physicists/chemists; culture emphasizes cross-disciplinary teaching and 'bridge' researchers who live between domains

Episode summary

Summary

Liam Fedus (formerly OpenAI, co-creator of ChatGPT) and Ekin Dogus Cubuk (formerly Google DeepMind physics lead) join a16z's Anjney Midha to discuss Periodic Labs, their new venture building LLMs with real-world experiment in the loop to advance physics and chemistry. They argue that scaling laws alone won't crack scientific discovery because pre-training on internet text optimizes against the wrong distribution; nature itself should be the reward function, with experiments serving as the RL environment.

Mid-training is used to inject simulation data, crystal structures, and experimental knowledge that doesn't exist in web pre-training corpora
Commercial strategy targets advanced industries (semiconductors, space, defense, manufacturing) with co-pilot tools; current customer pain points include simulation automation, format-matching for design pipelines, and going beyond retrieval to actual model training
Launching an advisory board (including ZX Shen from Stanford, Mercouri Kanatzidis from Northwestern, Konstantin Novoselov from Manchester) plus an academic grant program for materials discovery and physics modeling
Three core theses for why a physical lab is needed: noisy literature data, lack of published negative results, and the necessity of iterative action to do science at all

Source material

Transcript

Ultimately, science is driven against experiment in the real world.

That's what we're doing with Periodic Labs: we're taking these precursor technologies, and we're saying, okay, if you care about advancing science, we need to have experiment in the loop.

The applications of building an AI physicist, for lack of a better word, that can design the real world are so broad.

You can apply them to advanced manufacturing, to materials science, to chemistry.

Any process where R&D with the physical world is required, it seems, will benefit from the breakthroughs that Periodic is working on.

For example, if you could find a 200 Kelvin superconductor, even before we make any product with it, to be able to see such quantum effects at such high temperatures, I think, would be such an update to people's view of how they see the universe.

What if AI could move from talking about science to doing science?

Today's conversation features Anjney Midha, general partner at a16z, with Liam Fedus and Dogus Cubuk, co-founders of Periodic Labs, a frontier research lab building experiment-in-the-loop AI for physics and chemistry.

They unpack why real-world reward functions matter, how mid-training and high-compute RL fit together, and why superconductivity and magnetism are the first beelines towards an AI physicist.

They also get into noisy datasets and negative results, what happens when ML researchers sit shoulder to shoulder with bench scientists, and the near-term payoff: co-pilot tools for advanced industries, from semiconductors to space and manufacturing.

Let's get into it.

So Liam, you were the co-creator of ChatGPT.

Dogus, you were running some of the physics teams at DeepMind.

Let's talk about how you guys met and what was the moment where you realized that you had to leave both of those labs to start Periodic.

I believe we met eight years ago at Google Brain flipping over a large tire.

Yep.

At the Google...

You gotta give us more on that story.

So Google Rails was one of the gyms at the Google facilities.

And I think that's where Dogus and I met, and there was just this massive tire that a single person basically can't flip by themselves.

And so Dogus was trying to flip it and he pulled me over.

He's like, "I think the two of us could do it."

And why were you trying to flip this tire?

Why not?

But yeah, I tried doing it, I couldn't do it.

And then I was like, "Who's the strongest person I can find?"

And I was like, "Barret or Liam."

And I'm like, "Or Liam."

And it worked, we just flipped it.

And was that the moment where you guys both realized you had physics backgrounds?

How did that happen?

How did you go from flipping tires to flipping experiments?

Yeah, I mean, so I don't know if Liam remembers this, but we would catch up over the years.

And we would often end up talking about quantum mechanics or superconductivity.

So it was very common.

But I never thought that we'd end up working on physics together.

So Liam was working on LLMs and they were going really well.

And I was not using LLMs, but I was noticing that LLMs are becoming more and more impactful in my work.

So one way it was becoming impactful is when I was trying to remember some things about chemistry, physics, I could just talk to the chatbot and actually learn a lot of stuff I forgot.

Another way was of course coding, like we were writing simulations and the LLM was so helpful in writing these simulations for us.

So then the question was, "Can we use LLMs kind of more as a first-class citizen in the physics research?"

Yeah, I think leading up to this decision to leave, Dogus and I were just connecting and talking about these different tech trees.

We're looking at the improvements on language models, on reasoning.

We're seeing what high-compute reinforcement learning can do.

And on the material science side, we're seeing scaling laws within physics, within chemistry, both with respect to simulations, with respect to experiment.

And it's the same kind of principles at play as in ML.

And I think to both of us, and to a lot of people in the field, the goal of this technology is to accelerate science, to accelerate physical R&D.

Chatbots were a great milestone along the way, but we really want to see technology out in the world.

And we felt like this was just the right place to begin.

Physics is very verifiable.

It's a great reward function, fairly fast iteration loop.

You have simulators for large classes of physical systems.

And we felt like in order to create this AI scientist, this is like the beginning of this path.

So we built that conviction and decided to found Periodic.

Well, let's take a second to talk about what Periodic is and what it does.

So Periodic Labs is a frontier AI research lab that's trying to use LLMs to advance physics and chemistry.

We feel like having experiment in the loop tightly coupled with simulations and LLMs is extremely important.

So we're building up a lab that will generate high throughput, high quality data.

And we will use LLMs and simulations in conjunction with the experiments to try to iterate.

Science by its nature is an iterative endeavor.

And we feel like LLMs using all these tools that are available to humans can do a great job in accelerating physical R&D.

I'd say the objective is let's replace the reward function from math graders and code graders that we're using today.

So like math graders, to give an example, you have a prompt, what is two plus two?

You know, the ground truth is four.

You can put a lot of optimization pressure against problems like that that are programmatically checkable.

And what we're doing, by having the lab, is creating a physically grounded reward function that becomes the basis of what we're optimizing against.

And so if a simulator has some deficiencies or some issues, we can always error-correct, because for us the ground truth is the experiment, the RL environment.

Nature is our RL environment in our setting.
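As a rough illustration of the contrast drawn here, the sketch below pairs a programmatically checkable math grader with a reward that defers to a lab measurement whenever one exists. The function names and the fall-back-to-simulation rule are assumptions made for illustration; the conversation does not describe Periodic's actual implementation.

```python
def math_grader(model_answer: str, ground_truth: str) -> float:
    """Programmatic check: reward 1.0 only when the answer matches the known truth."""
    return 1.0 if model_answer.strip() == ground_truth.strip() else 0.0


def grounded_reward(simulated_value, measured_value=None):
    """Hypothetical physically grounded reward.

    Assumption: when an experimental measurement exists it overrides the
    simulator's estimate, so simulator errors do not leak into the reward.
    """
    return measured_value if measured_value is not None else simulated_value


# "What is two plus two?" with ground truth "4".
print(math_grader("4", "4"))
# A simulator that is too optimistic gets corrected by the measurement.
print(grounded_reward(simulated_value=150.0, measured_value=92.0))
# With no measurement yet, only the simulated estimate is available.
print(grounded_reward(simulated_value=150.0))
```

The numbers are placeholders; the point is only that the measured value, not the simulated one, is what gets optimized against once it is available.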

Let's just take a second for folks who might not be familiar to explain what you guys mean by a lab that will verify RL in the real world.

Can you talk a little bit about how experiments work?

How are AI models trained today?

And how are those different from how they're going to be trained and deployed at Periodic?

And it might be helpful to talk about how you created ChatGPT.

So with ChatGPT, the technology evolved very rapidly over the last few years.

When we were first creating it, it was a very standard RLHF pipeline.

So you have a pre-trained model, and it's sort of like this raw substrate.

And what you're trying to do is take this auto-completion model and turn it into something useful.

The way we did it at that point was we would have supervised data.

So given some input, we would say this is a desired output.

So if we're trying to get it to act as an assistant, we create some tuples like that.

Then you run reinforcement learning, but now you're learning against a reward function that's trained against human preferences.

So humans will say, well, given this input, I would prefer completion A to completion B.

And you do that over and over again, and you can create a reward function that can then be optimized against.

That is sort of the basis of how we created ChatGPT.
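The preference step described above can be sketched as a pairwise (Bradley-Terry style) loss, a common way to fit a reward function to "A is preferred to B" comparisons. This is a minimal sketch under stated assumptions: a linear reward over hand-made two-dimensional feature vectors, with made-up comparison data. It is not a description of the pipeline actually used for ChatGPT.

```python
import math


def reward(weights, features):
    """Linear reward model: a weighted sum of completion features."""
    return sum(w * f for w, f in zip(weights, features))


def preference_loss(weights, preferred, rejected):
    """Pairwise loss: -log sigmoid(reward(preferred) - reward(rejected))."""
    margin = reward(weights, preferred) - reward(weights, rejected)
    return math.log1p(math.exp(-margin))


def train_reward_model(pairs, dim, lr=0.5, steps=200):
    """Plain gradient descent on the average pairwise loss."""
    weights = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Gradient of -log sigmoid(margin) w.r.t. margin is -1 / (1 + exp(margin)).
            coeff = -1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                grad[i] += coeff * (preferred[i] - rejected[i])
        weights = [w - lr * g / len(pairs) for w, g in zip(weights, grad)]
    return weights


# Hypothetical comparisons: each pair is (features of preferred, features of rejected).
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.9, 0.2], [0.1, 0.8]),
    ([0.8, 0.1], [0.3, 0.9]),
]
weights = train_reward_model(pairs, dim=2)
for preferred, rejected in pairs:
    print(reward(weights, preferred) > reward(weights, rejected))
```

The learned reward can then be used as the objective for reinforcement learning; note that nothing in it checks mathematical or factual correctness, which is the limitation raised next.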

But then there's a huge gap between the original model and what we have today.

And I think part of that is reasoning, but also part of that is just much better, more precise reward functions.

So the reward functions that we were using originally couldn't determine whether you were mathematically correct or not.

So early versions of ChatGPT were mathematically not particularly strong, and that sort of results from the reward function.

What did you optimize against?

The reward function basically encoded, be a friendly assistant, try to help people get to their thing, but it had no sense of is this mathematically correct or not?

Is this code valid or not?

And we made huge advances in the correctness of reward functions.

But this is all digital.

We're creating tasks based on the internet, textbooks, papers.

And this is great.

This lays a foundation.

But ultimately, science is driven against experiment in the real world.

And so that's what we're doing with Periodic Labs.

We're taking these precursor technologies and we're saying, OK, if you care about advancing science, we need to have experiment in the loop, and that becomes our reward function for agents.

So as Dogus was saying, our agents are doing the same type of things you would use for coding or to help answer a query.

But now instead of just giving tools like here's Python, here's a browser, now we have tools like quantum mechanics.

So simulate different systems.

But ultimately, we're going to a lab and then that becomes like the basis of what is the system optimizing against.

So that's sort of just like the natural end state of these systems.

People in AI often say lab.

Often what they're referring to is quite different from what you guys mean by lab.

What's the difference?

That's right.

So as you mentioned, so far the LLMs have gotten really good at logic and math.

There's like verifiable rewards.

What is like the next frontier in terms of inquiry after logic and math?

I'd say it's physics.

And then when you say physics, there are different energy scales.

So there's astrophysics, studying galaxies, there's fusion, nuclear physics.

But then there's the energy scale of physics that's more relevant to our life.

And that's the quantum mechanics, like Schrodinger's equation.

This is where biology happens, chemistry around us happens, materials happen.

So we felt like our first lab should be basically probing that quantum mechanical energy scale.

And for us, that would be physics at the level of solid state physics, material science, and chemistry.

One of the more fundamental ways of making things around us is powder synthesis.

So you take powders of existing materials, you mix them, and you heat them up to a certain temperature and it becomes a new material.

So this is one of our labs.

We're going to have a powder synthesis lab.

And it turns out this is one of those methods where robots can do it, like very cheap, simple methods.

I don't know if you saw this coffee-making robot in the SF airport.

A robot that's basically at that level can mix powders and put it in the furnace.

And there's a very rich field.

So you can actually, using that method, discover new superconductors, magnets, all kinds of materials that are very important for technologies around us.

But at the core of it is just quantum mechanics.

And we feel like teaching these LLMs to be foundation models, but for quantum mechanics, will be the next frontier for LLMs.

Why haven't the models that are currently out in the world and deployed been able to do this?

Great question.

I think, as you mentioned earlier, science is by its nature iterative.

Even the smartest humans tried many times before they discovered the things they discovered.

And I think maybe this is one of the confusing points about LLMs.

LLMs can be very smart.

But if they're not iterating on science, they won't discover science.

Because humans won't either, like you put a human in a room without any chance to iterate on something.

They won't discover anything important.

So we feel like the important thing to teach these LLMs is the method of scientific inquiry.

So you do simulations, you do theoretical calculations, you do experiments, you get results.

And the results are probably not what you wanted at first, but you iterate on them.

And we feel like that hasn't been done yet.

So this is what we want to do.

But we feel like you have to do it with the real physics, not just the simulation.

So this is why we have our own lab, where the LLM will have the opportunity to iterate on its understanding of quantum mechanics.
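A toy version of such an iterate-on-results loop is sketched below: propose a condition, measure it, keep every result (including the failures), and update. Everything in it is hypothetical, including the `run_experiment` stand-in, the 900 °C optimum, and the simple step-halving search; a real lab loop would call instruments and simulators rather than a formula.

```python
def run_experiment(temperature_c: float) -> float:
    """Stand-in for a real measurement: a made-up response that peaks at 900 C."""
    return -((temperature_c - 900.0) ** 2)


def iterate(start, step, rounds, experiment):
    """Propose, measure, and update; every trial is logged, including failures."""
    best_x, best_y = start, experiment(start)
    log = [(best_x, best_y, True)]  # (condition, result, improved?)
    for _ in range(rounds):
        improved = False
        for candidate in (best_x + step, best_x - step):
            y = experiment(candidate)
            better = y > best_y
            log.append((candidate, y, better))
            if better:
                best_x, best_y = candidate, y
                improved = True
                break
        if not improved:
            step /= 2.0  # neither direction helped, so search more finely
    return best_x, log


best_condition, history = iterate(start=700.0, step=50.0, rounds=10, experiment=run_experiment)
negative_results = [entry for entry in history if not entry[2]]
print(best_condition, len(history), len(negative_results))
```

The log keeps the trials that did not improve anything, which is the kind of negative result that rarely reaches the literature.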

Fundamentally, machine learning models are good at what you train them to do.

And that's sort of like the nature of it.

And so if a model is acting badly, you're like, well, did you train it to do that task?

Building on Dogus's point, there's an epistemic uncertainty, a reducible uncertainty, that you aren't really collapsing unless you're actually running an experiment.

So for instance, one of the engineers on our team was looking at reported values of some physical property in the literature.

And they spanned many orders of magnitude.

So if I train a system on that, these systems aren't magic, the best they can do is replicate that distribution.

But it's really no closer to a deeper understanding of the universe, physics, chemistry.
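A small numeric sketch of why conflicting labels cap what a model can learn: when the same quantity is reported with values spanning orders of magnitude, the squared-error-optimal single prediction is just the mean of the labels, and its error equals the spread of the labels. The four reported values below are invented for illustration, and working in log10 units is an assumption.

```python
import math
import statistics

# Hypothetical literature values for one property of one material (arbitrary units),
# spanning several orders of magnitude.
reported = [1e-3, 1e-1, 1e1, 1e2]
log_values = [math.log10(v) for v in reported]

# The squared-error-optimal single prediction is the mean of the labels.
best_prediction = statistics.mean(log_values)

# Even that optimal prediction is off by the spread of the labels themselves.
residual_rms = statistics.pstdev(log_values)

print(f"best log10 prediction: {best_prediction:.2f}")
print(f"remaining RMS error: {residual_rms:.2f} orders of magnitude")
```

More training on the same labels cannot shrink that residual; only a new, cleaner measurement can.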

And another point is it's very uncommon to publish negative results.

All of the results are basically positive and a valid negative result is very valuable.

A negative result could be discarded because it was sloppy science.

But there are valid negative results and that's a learning signal.

And this is something that our lab will produce as well.

So I think these three things are just like noisy data, no negative results, and you need the ability to act in order to actually do science, which is an iterative endeavor.

Those are like the core theses of why we need a lab.

And what might be the core way to measure whether there's progress against that goal, in your minds?

One simple one is let's say high temperature superconductivity.

What is the highest temperature superconductor we synthesized?

Today the best number for ambient pressure is 135 Kelvin or so.

So we'll know very easily if you're doing well if we can go beyond that number.

So that's pretty fundamental.

On the more applied side, there's processing of materials and its effect on the materials properties.

So we can just measure these properties directly.

Let's say it's the ductility, it's the toughness, strength of the material.

And as we measure it, the LLM will get very clear signal.

It's hard to hack unlike these other LLM training techniques.

It's like really what you see in real life is the signal that's going to the LLM.

Yeah, effectively.

Can you design the world around you?

So you're like, I need something with this property.

Can the system discover and produce that?

Both from a fundamental scientific discovery perspective, but also in industry.

So someone's working in space or defense or semiconductors.

And yeah, we're having these issues.

We're trying to achieve this property of this material or this layer.

Can the system accelerate the development of those technologies?

So it's very grounded.

That's how we'll know it's working.

It feels like the applications of building an AI physicist, for lack of a better word, that can design the real world are so broad.

You can apply them to advanced manufacturing.

You can apply them to materials science, to chemistry, to any process where R&D with the physical world is required; it seems like all of it will benefit from the breakthroughs that Periodic is working on.

Why hasn't it been done before?

And what is it about this moment in history that makes it the right time to attack this problem?

Maybe one comment is that it's difficult.

What makes it so difficult?

I mean, I think part of it is the team.

So in our view, this has been enabled by frontier technology in the last couple of years.

And so Dogus and I have been so focused on basically putting together this N-of-one team.

This group of physicists, chemists, simulation experts, and some of the best machine learning researchers in the world has never been part of one concerted effort.

And we feel that in order to actually achieve this, you need all of this expertise.

You need these pillars to do this.

So when you guys went about designing the team after you left OpenAI and DeepMind, what was the primary heuristic that you used to guide yourselves in figuring out who you wanted on the team?

So in terms of expertise, we wanted to have LLM expertise, experimental expertise, and simulation expertise covered.

And for each of these, we wanted to have world-class talent.

And of course, for each team, there's actually a lot of sub teams, like it's like a fractal, right?

The expertise is very fractal like.

So for the experimental side, we want to cover solid state chemistry, solid state physics, automation, and kind of the more facilities, like the more operational aspects of experiments.

On the simulation side, there's the more kind of theoretical physics parts.

There's the more kind of coding aspects of simulations.

And on the LLM side, of course, there's mid training, RL, infra.

And yeah, for each of these, we try to get basically the best people who have innovated in these like sub pillars.

Yeah, so I think it's like, there's not a team to do it.

The technology that we think is necessary to do it has really just emerged in the last couple of years.

And this data isn't like on a Reddit forum or something like you need to actually go produce experimental data, simulation data, it's siloed across all of these advanced industries.

And many of them, while there's a desire, they may not have knowledge of some of the most recent techniques that's been driving this recent wave in AI.

There was a moment in time when papers like the GPT-3 paper, for example, said language models are few-shot learners and proposed the idea of scaling laws.

And then there was a follow-up paper, if you guys remember, from OpenAI, that was called, I think, scaling laws for generative modeling.

That just showed that as long as you scaled up the amount of compute and data in the right combination, you could very predictably improve the performance of these models.

And the theory was that if you just kept doing that ad infinitum, there would be a bunch of emergent capabilities.

These models would be able to reason about all kinds of problems out of domain, out of distribution.

Wouldn't that argue...

How would you square that with the school of thought that says the current pre-training and post-training pipelines at most of the frontier labs will just eventually crack physics as well?

Why is this idea of physical verification so necessary?

And is that school of reasoning wrong?

Yeah.

Excellent question.

Scaling laws empirically seem to continue to hold.

So that's not in question.

But I think there's a question of what is this y-axis?

And that test distribution is very different from what we're talking about.

That test distribution, let's say you're pre-training on the internet, might be a representative set from the internet.

And you'll have these sort of predictable scaling properties.

But that's not going to capture that you have a very different set of scaling properties with respect to different distributions.

So to try to make this a little bit more concrete.

Let's say, hypothetically, we're training a coding model.

And we have unit tests to provide some reward signal.

So the model writes some PR, we check that the unit tests go from failing to passing, and we say, "This was successful.

We're going to reinforce these things."
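The failing-to-passing check described here can be sketched as a binary reward. This is a minimal sketch, assuming tests are plain Python callables applied to a candidate implementation; the buggy and patched `add` functions are invented examples, not anything from a real training setup.

```python
def unit_test_reward(tests, before, after) -> float:
    """Reward 1.0 only if the tests failed on the old code and pass on the new code."""
    failed_before = not all(test(before) for test in tests)
    passes_after = all(test(after) for test in tests)
    return 1.0 if failed_before and passes_after else 0.0


def buggy_add(a, b):
    return a - b  # the bug the patch is supposed to fix


def patched_add(a, b):
    return a + b


tests = [
    lambda f: f(2, 2) == 4,
    lambda f: f(0, 5) == 5,
]

print(unit_test_reward(tests, before=buggy_add, after=patched_add))
print(unit_test_reward(tests, before=buggy_add, after=buggy_add))
```

A reward like this is cheap to compute at scale, which is part of why code is such a convenient domain for this kind of optimization.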

You might say, "You start optimizing this, and now the system is becoming ever more capable of writing code for its own development."

And you have this acceleration, you have this kind of takeoff scenario.

Code is one of the most promising areas for this because there's an abundance of data online.

You have this feedback loop where the system itself can begin to improve itself.

And it's a very promising technique.

And we're all seeing the benefits of advanced coding models, and it's accelerating quickly.

However, that model is not going to then cure cancer.

The knowledge simply doesn't exist.

You need to optimize against the distribution you care about.

So that model is going to be a very valuable tool as a software engineer, and it may help a cancer researcher do their analysis, but it simply doesn't have the data, the knowledge, or the expertise from iterating against that environment.

And I think that's just sort of the fundamental belief we have.

Yeah, so actually, Liam and I worked on this a bit, when we looked at the scaling laws for vision models, and this also came up a lot in the CLIP paper from OpenAI.

The in-domain generalization and the out-of-domain generalization are monotonically correlated, but the relationship is not necessarily linear.

And so what that means is you can keep improving your model, and it will improve as a power law in-domain, and out-of-domain tasks, by which I mean, as Liam said, the things you're trying to do that are a bit different from what's in your training set, will also improve as a power law, but the slope of that power law may not be good enough.

So you might need to spend centuries before you get to where you want to be.

We saw this in the GNoME paper, for example.

We published a paper where we saw that as you increase the size of your training set, the IID performance, the in-domain performance, improves as a power law.

Out-of-domain performance also improves as a power law, but depending on what the out-of-domain task is, like how far you are from the training distribution, the power law might have such a small slope that it is basically useless.
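To make the slope argument concrete, suppose error follows error = a * N^(-slope) in training-set size N. Halving the error then requires multiplying N by 2^(1/slope). The sketch below evaluates that factor for two exponents; both values are hypothetical and chosen only to contrast a steep in-domain curve with a shallow out-of-domain one.

```python
def data_multiplier_to_halve_error(slope: float) -> float:
    """For error = a * N**(-slope), the factor by which N must grow to halve the error."""
    return 2.0 ** (1.0 / slope)


in_domain_slope = 0.5       # hypothetical exponent
out_of_domain_slope = 0.05  # hypothetical, much shallower exponent

in_domain_factor = data_multiplier_to_halve_error(in_domain_slope)
out_of_domain_factor = data_multiplier_to_halve_error(out_of_domain_slope)

print(f"in-domain: {in_domain_factor:,.0f}x more data to halve the error")
print(f"out-of-domain: {out_of_domain_factor:,.0f}x more data to halve the error")
```

Under these assumed exponents the in-domain task needs about 4 times more data to halve its error, while the out-of-domain task needs roughly a million times more, which is the sense in which a power law can hold and still be of no practical use.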

So this is one of the reasons we feel like the best way to make progress is to make your target as close to your in-domain training set as possible.

And the best way of doing this is to basically iterate on changing your training set to be more like what you want to do.

So this is one answer.

The other one is actually maybe even simpler.

The experimental data we want actually doesn't exist.

So for example, say you want to learn from the experimental data in the literature for synthesis. It turns out the noise on the formation enthalpy labels, formation enthalpy being the energy it takes to basically assemble the atoms in the shape you want, is so high that if you train a machine learning model on them, it's not accurate enough to predict the next one.

And one of the reasons for it is, as Liam mentioned, people don't usually publish negative results.

And negative results are usually very context dependent.

So what's a negative result for someone might be positive if they do things differently.

So yes, there's this domain shift problem, where what you're trying to do might be different from your training set.

So the power law won't have as large a slope as you want.

But the other problem is that for some of the things we want to do, there's no data for them.

For example, for superconductivity, there are a lot of datasets you can look at.

But the noise floor on them is so high that training on them usually doesn't help.

Dogus, me, the entire team are deep believers in scaling up and scaling laws.

But it's: just make a beeline for the thing you care about.

And in our case, we care about advancing science, advancing physical R&D.

That's sort of like the thesis.

Is there a tension between being super bitter-lesson-pilled and just throwing more compute at the problem, and the, I guess, domain-specific pipelines that the lab you guys just described will have to focus on?

In the case of Periodic, I think you mentioned the first beelines you guys are making are towards superconductivity and magnetism.

What is it about those domains that makes them good candidates for the first few pipelines that Periodic is working on?

And are they just stops along the way to an AI physicist that generalizes across all kinds of domains?

Or is there a danger of them being essentially off-ramps that don't result in the kind of scientific superintelligence that is the North Star for what you guys are doing?

Yeah, I feel like, for example, the high temperature superconductivity goal is actually a goal that has so many sub-goals in it.

It's a bit like when DeepMind and OpenAI started and said, we're going to AGI.

But what that meant was they had to do so many things before they got to these cool results.

Like for us, if you want to get a high temperature superconductor, we probably need to get good at autonomous synthesis, autonomous characterization.

We need to get good at characterizing different aspects of the material, using the LLM to run the simulations correctly.

So it's a North Star, and there's so many goals on the way that would be very, I think, impactful for the community.

That's one reason.

Another reason is I feel like high temperature superconductivity is such a fundamentally interesting question.

For example, if you could find a 200 Kelvin superconductor, even before we make any product with it, that in itself says so much about the universe that we didn't know yet.

To be able to see such quantum effects at such high temperatures, I think would be such an update to people's view of how they see the universe.

So we feel like it will be really impactful for humanity even before we make a product out of it.

I think that's one of the reasons.

A technical reason also is superconductivity is a phase transition.

So it's pretty robust to some of these details that we cannot simulate yet.

So for example, when you make the material, the superconducting temperature is usually dominated more by the fundamental properties of the crystal than by defects or microstructure.

Whereas there are certain other materials properties where even if the crystal has the property you want, there are so many other factors that you cannot simulate that would prevent you from seeing that property.

So superconductivity has this nice philosophical upside to it, has this technical upside to it, and it really rallies both the physicists and the non-physicists.

There are people who have studied physics for 40 years and are really excited about superconductivity.

And there are people who've never studied physics but are very excited about superconductivity.

It's quite rare to find a topic that unites the whole team.

Yeah.

I mean, like Dogus said, in order to do this, there are so many foundational pieces to solve, and our tactic is that in order to actually get to this goal of an AI scientist, you need to make contact, do the full loop somewhere.

If you say you're doing this in just very vague terms, you sort of just end up back on arXiv papers and textbooks.

And so it's really important for us to do the loop, but then create this repeatable process like how do you go from sub-domain to sub-domain?

And there's really interesting questions about how well do the ML systems generalize between these things?

What is the generalization of a system between like superconductivity data to magnetism data, for instance?

And maybe that looks very different than its ability to generalize to fluid mechanics.

And I think there's like fundamental arguments to make there.

But the goal is create this repeatable system, prove it, and then just go through the different domains that way.

So I can see the argument for why cracking room-temperature superconductivity on an experimental basis is extraordinarily valuable for humanity, but you guys are building a startup.

And you need to have a clear short- to medium-term path along the way to a North Star, one that is both commercially viable and net positive to society.

To use an analogy, what we've seen with other frontier labs that are working on automating white-collar work or software knowledge work is that there's this North Star of an AI researcher.

And along the way, there were a bunch of sub-goals and so on.

But a concrete kind of application that opened up a ton of commercial value and benefits for users on the way to that AI researcher was the idea of AI programming.

Software engineering has become probably the first major domain that's caused people to really update their priors about how useful AI models are beyond consumer applications and in terms of productivity, their impact has been extraordinary just in a few short months.

So if the traditional frontier labs' North Star was an AI researcher and the path along the way to get there was AI programming, what is that for Periodic?

Basically co-pilots for engineers and researchers in advanced industries.

So perhaps just from being in Silicon Valley, we really think about computer-oriented work: everything is digital, everything is bits.

But there are so many industries, like the few we were talking about, space, defense, semiconductors, where they're dealing with iteration on materials and physics, and that's part of their workflow.

How are they designing these new technologies, these new devices?

And in the absence of data, in the absence of good systems, they don't really have particularly good tools.

That is our opportunity and these are massive R&D budgets.

So while high-temp superconductivity is a great North Star, we very much understand that technology and capital are intertwined.

We're going to be able to maximally accelerate science if this is a wildly successful commercial entity.

And to do so, we want to accelerate advanced manufacturing in all these different industries, become like an intelligence layer for all these teams to accelerate their workflow and start reducing their iteration time, get them to better solutions more quickly, accelerate their researchers and their engineers.

Let's click a little bit deeper on that in practice, sort of a day in the life of a Periodic team member. Let's say half the team, is this roughly right?

About half the team are ML scientists with machine learning backgrounds and the remaining half are physical scientists with physics or chemistry backgrounds.

How do you start by uniting the cultures?

How do you take somebody whose primary career so far in work has been experiments in a lab in wet labs doing physics and chemistry and give them an intuition for ML and vice versa?

Because you guys are both physicists who then had the career trajectory where you also had the chance to be at frontier AI labs and were part of training systems that are now considered sort of landmark, hallmark machine learning systems like ChatGPT, like GNoME.

But for others who might be coming from one domain, how do you get the team to build an intuition for the other?

Yeah, so this is a great question.

I mean, I feel like it's actually crucial for us to make sure these teams work very closely with each other.

So one of the things we're seeing is the physicists and the chemists need to figure out how to teach the LLM how to reason about these things.

Because I think the frontier AI labs have figured out how to train them on math and logic but not yet on physics, chemistry.

So one thing we're seeing that's been really, I think, productive is the physicists and chemists are thinking about what are the steps we should include in the mid training, in the RL training that will teach the LLM how to reason correctly about quantum mechanics, how to reason correctly about these physical systems.

Another one, of course, is the LLM researchers are learning quite a bit about the physics, the simulation tools, the goals.

So they've been working together really well.

We have weekly teaching sessions where the LLM researchers teach how the RL loops work, how the data cleaning works.

And then the physicists and chemists are teaching about different aspects of the science, the history of science.

That's also very important.

So we feel like that's been going really well.

And one way of looking at this is the things we have to teach the LLM to be able to discover, say, a superconductor includes being able to read the literature really well, like read all the papers, the textbooks, find the relevant parts, and then being able to run simulations, theoretical calculations, and then take action, run experiments.

We feel like this is quite similar to the physical R&D researchers in these companies.

They have to read the literature, read maybe internal documents or external documents, and then run simulations, run theoretical calculations, and then actually attempt the thing experimentally, learn from that.

So we feel like all the progress we're making towards our internal superconductivity or physics goals actually is making our LLMs much better at serving our customers who are doing very similar workflows.

Yeah, I think just culture, no stupid questions.

You can ask just like the dumbest physics question, the dumbest ML question.

And I mean, there's a few faculty as part of our company, and they're actually excellent teachers.

So, I mean, these like learning sessions have been really fantastic.

And another thing I noticed is computer scientists often think in terms of like APIs.

So scientists will say something and they're always trying to map it.

You're like, "Okay, well, what's the input?

What's the output?

What's the target?

How do I map that back?"

And it's always just like this translation.

And I think we've also built up, as part of the team, people who sit on these different edges.

So if you have a simplex of pure ML and LLM people, pure experimentalists, and pure simulation people, there are people who live in the interior as well.

And so they've been excellent bridges for translating between these different groups of people.

So it's active learning to learn the other spaces, creating APIs, and then these kind of bridge, connector people.

I think Dogus is an excellent example of that.

Is it a requirement for somebody who wants to join Periodic to have an advanced degree in physics or chemistry?

Absolutely not.

One of the jokes we're making is who was the NBA player who was saying that I'm much closer to LeBron James than you are to me?

We were saying the opposite of that to candidates because the amount that even our best physicist doesn't know about physics is much bigger than the amount that they know about physics.

So for this new candidate, even if they have no background in physics, how much they have to learn about what we're trying to do is actually not that different from how much the best physicist has to learn, because there's so much chemistry to learn, so much materials science to learn.

And I think this is one of the interesting aspects of science today.

In the past, in the 1800s, there were these physicists that could do so many different things at the frontier.

Today we reached a point where our intellectual knowledge is so large that a leading thinker can usually only advance in one very specific field.

And maybe this is actually holding us back because, say, to discover an amazing superconductor, as we keep going back to this example, you have to know so much about chemistry, physics, synthesis, characterization.

And unfortunately, I don't think any human knows enough about all of these.

So we have to collaborate.

So I think our team is kind of like a small example of this where we have, as Liam said, a lot of different points in that simplex.

And for any person, they have so much to learn.

But that's true for basically every other scientist.

So for example, I supposedly come from the physics side of it, but I've been learning so much more physics because we now have people from different areas of chemistry in the team, different areas of physics.

And I think it's true for LLM researchers as well.

I mean, there are aspects of LLMs that they probably didn't know until they started working with other researchers in our team.

So I think it's great, and it's like a small example of what we're trying to do with the LLM, because we're trying to teach this LLM all these different things that we're learning as researchers.

It's like a really fun experience, I think.

And what are you finding makes a great researcher at Periodic that's different from what might make a great researcher at OpenAI or Anthropic or DeepMind?

I would say there's very high overlap.

But probably one of the biggest determinants is, do you care about this mission?

Is accelerating science...

To you, is that like the big goal?

And I think looking at the team right now, it's just incredibly mission driven set of folks who are like, "Yeah, this is the North Star.

Let's do that."

If someone really wants to improve some Megacorp's products, yeah, you'd probably be better off at that Megacorp in iterating and improving their products.

But if you care about scientific discovery, Periodic Labs is the best place to do that.

How big is the team today?

We're roughly 30, I believe.

And as you think about taking a lot of the research that's going on at the company and deploying that out in the real world, the kinds of customers that we've talked about - space, defense, advanced manufacturing - these are mission critical industries that are known for being essential to whatever part of the economy they're part of.

But often they're not the fastest to adopt new technology.

How do you think about deploying the kinds of frontier agents that we've talked about that are great at science, great at physics in companies or organizations that might not be anywhere close to as sophisticated as you are in AI or ML?

Is there...

Do you have a working thesis for how to make sure that the arc of progress is not bottlenecked on deployment?

It sounds like you have a fairly good thesis on how to unblock the arc of scientific progress on the research side.

But when it comes to deployment, what might be a working theory that you guys are optimistic about that would help get the systems that Periodic is building out into the real world?

Well, maybe one thing that we've noticed in our conversations with all these companies is they all are looking for their AI strategy.

They understand that the technology is shifting really quickly and they're looking at how they're doing their work and it's not changing as quickly as they think it should be.

Some industries also are losing key expertise in different fields and they're losing these senior engineers, senior researchers, and they're like, "Okay, how do we preserve that?"

But one thesis is understand...

It's thinking about these APIs and thinking about what are the evaluations, what are the biggest bottlenecks for these companies looking at some of the problems they face and we can map that to our systems and we say, "Well, we think we can dramatically accelerate this."

And so it's not coming in and saying, "Hey, we're going to transform your fab line on day one.

We're going to transform how you're doing everything.

Forget everything."

It's like, "No, we're going to solve a really critical problem, well-scoped, very clear evaluations."

You can co-draft that with them and just show them how powerful this technology can be when you optimize against the thing you care about.

So nothing particularly surprising here, but a land and expand type method as you might expect, but really looking for who are the biggest promoters within that company and what are the biggest problems, make sure you're solving a very real thing for them and intersect that with where is our technical capability, the highest.

You were on a call this morning with one of the customers in your pipeline.

We don't need to name who, but what were some of the things you heard as the most urgent problems that they'd like for Periodic to solve?

So one of them was simulations.

They spend a lot of time training people on some of these simulations they need to use, which are critical for their development.

And being able to automate those simulations, I think would be quite enabling.

The design process and then some of the small things like matching the formats, being able to feed the simulation results into the design pipeline, all of these seem quite important.

And then being able to treat the data together in the same place.

What else?

Well, I think there's a really fundamental question.

So a lot of these companies will rely on retrieval.

So that's sort of like a super lightweight thing.

Someone shows up with a neural net and they're like, great, we'll just retrieve over all of your data.

And then that's your solution.

However, as we've seen with things like ChatGPT and other systems, when you pre-train on the data, when you actually encode the knowledge into the weights, it's not just a retrieval system; you have a richer, deeper understanding of the material.

And I think this is a big fundamental challenge.

So for instance, this customer can give privileges to their employees and have retrieval act on their behalf, so the system acts as the user.

And so you can match those same kind of like privileges for access.

But if you start doing pre-training or mid-training on different parts, it's like, well, if you pre-train on every piece of data, that might only be accessible to say like the CEO of that company.

So then you have to figure out how do you sort of bucket that knowledge and create different types of systems.
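As a rough illustration of that access problem, here is a minimal sketch, not anything described in the conversation as an actual implementation. The `Doc` type, the role names, and both helper functions are hypothetical. It contrasts retrieval, which can filter by the caller's role at query time, with training, where documents would first have to be grouped by who is allowed to see them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    text: str
    allowed_roles: frozenset  # hypothetical: roles permitted to read this doc


def retrieve(query, docs, user_role):
    """Retrieval acts as the user: only docs that role may read are searched."""
    q = query.lower()
    return [d.text for d in docs
            if user_role in d.allowed_roles and q in d.text.lower()]


def bucket_for_training(docs):
    """Group docs by their exact access set, giving one candidate corpus per bucket."""
    buckets = {}
    for d in docs:
        buckets.setdefault(d.allowed_roles, []).append(d.text)
    return buckets


# Made-up example documents, for illustration only.
DOCS = [
    Doc("Alloy A anneal recipe", frozenset({"engineer", "ceo"})),
    Doc("Alloy B acquisition plan", frozenset({"ceo"})),
]

engineer_hits = retrieve("alloy", DOCS, "engineer")
buckets = bucket_for_training(DOCS)
```

With retrieval, the engineer never sees the CEO-only document. With training, the same guarantee only holds if each bucket feeds a separate model or adapter, which is the bucketing question raised above.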

But I think right now, after talking with the user, they don't seem to have a great solution for distilling all of the knowledge into a single model or into a set of models.

So like going beyond retrieval to proper training.

And then I think also the supervised training they're doing is really akin to like the early days of ChatGPT, where it's like input output, you have a few examples and kind of transforming this new way of thinking.

It's like, no, high compute reinforcement learning is really effective.

This is how you should think about the strategies it's using.

This is how you create effective tool use toward those problems.

And this is how you optimize it effectively.

Could you describe for folks who may not be familiar with it, what do you mean by mid-training?

Because people are familiar with pre-training, they're familiar with post-training, but in the Periodic context, what does mid-training mean?

Yeah, sorry for the lingo.

So I think this term came up years ago where it's like, well, we had pre-training, we had post-training, but sometimes you need to put in a little bit more knowledge.

So before search worked really well, there was an issue of freshness.

So we had pre-trained models and they have a knowledge cutoff.

So there was like a scrape of the internet at that point, but users want more real-time knowledge.

So it's like, how do you get that in there? Enter mid-training.

Mid-training is basically taking new data, new knowledge that's not in the model, and continuing pre-training on it.

And this differs from standard post-training, where post-training typically is more reinforcement learning, supervised learning.

And the goal of it is just to put a lot of knowledge into the model that wasn't there before.

So that's mid-training in a nutshell.

And in the Periodic context, does that mean essentially going and injecting a ton of custom data from an experimental implementation at a particular customer or in a particular industry?

What is the atomic unit of mid-training that you guys think will improve the capabilities of the models on problems that they're just terrible at today?

I mean, it's all the knowledge.

So it's like you can have very low level descriptions of physical objects, like crystal structures for instance.

You can also have higher level semantic descriptions of like, "Well, this is how I made material XYZ."

And trying to get all this data into the model is really valuable.

So it's like simulation data, experimental data, none of this exists.

And basically putting that knowledge into the model and making sure that these distributions are connected in some way.

And what I mean by that is if you just sort of mix together distribution A, B, and C, there's no guarantee of generalization.

What you want to hope to see from these systems is the inclusion of this other dataset is improving performance on the other datasets.

And so these are sort of just like machine learning techniques or machine learning problems to solve.

But basically just make it an expert in physics and chemistry and where it was deficient before.

You guys both know that I spent some time running evals on a bunch of these models at the Stanford Physics Lab earlier this year.

And the results were that the models are terrible at scientific analysis.

Because they weren't trained to do so.

But on the other hand, many of the existing research teams working on the general models are investing in trying to make these better.

Is there something about the way you're building Periodic that gets to draft off of all of that progress in the base models?

Or do you have to start everything from scratch and therefore not be able to be composable with advancements happening in the mainline models today?

Yeah, I mean, we benefit from all the different advances.

So one of them is the LLMs are getting better.

And we definitely benefit from that because we take a pre-trained model and then mid-train it and, you know, do high-compute RL.

Another one is the physical simulation tools are getting better.

So DeepMind, Meta, Microsoft, academic groups, they're open sourcing new ways of simulating, new ways of using machine learning to predict properties.

So we get to basically utilize all of those.

And it seems like machine learning has made such an impact in the physics and chemistry fields that we expect these improvements to continue.

I think another thing is when we think about tools for agents, we think of like, here's a browser, here's a Python interpreter.

But increasingly, people think about tools as other neural nets, as other agents.

And so if you look at a lot of like physics code, it's not particularly deep.

This isn't competition programming.

This is like kind of like hacky scripts.

But you can rely on some of the best systems for, you know, whatever they spike on.

So neural net as a tool to these agents is something that immediately accelerates our work.

So you don't have to replicate everything.
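To make the "neural net as a tool" idea concrete, here is a minimal sketch under stated assumptions. The registry, the tool names, and `stub_property_model` are all hypothetical. The stub returns a made-up number and stands in for whatever learned predictor an agent might actually call.

```python
from typing import Callable, Dict


def stub_property_model(formula: str) -> float:
    """Hypothetical stand-in for a learned property predictor.

    A real tool would wrap a trained network; this stub just returns
    the length of the formula string so the example stays runnable.
    """
    return float(len(formula))


def word_count(text: str) -> float:
    """A conventional, non-learned tool, for contrast."""
    return float(len(text.split()))


# The agent sees both kinds of tool through the same interface.
TOOLS: Dict[str, Callable[[str], float]] = {
    "property_model": stub_property_model,
    "word_count": word_count,
}


def call_tool(name: str, arg: str) -> float:
    """Dispatch a tool call by name, failing loudly on unknown tools."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](arg)


score = call_tool("property_model", "NbTi")
```

The design point is only that a model-backed tool and a plain script share one calling convention, so a better external predictor can be swapped in without changing the agent.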

There's a historical pattern that a lot of the fundamental research in the physical sciences that we're talking about here, physics, chemistry, biology, has historically been done at university labs.

Is there a role at all that you think the university ecosystem will play in Periodic's future?

Or do you think these are just completely divergent paths?

Absolutely.

I mean, so much of the simulation tooling we use has been developed in academia.

A lot of it is in Europe, for example. A lot of the novel synthesis methods, too.

So we definitely benefit from a lot of this very deep technical progress.

Like for example, all the physical simulation tools are these, you know, complicated Fortran codes that in our team, for example, we don't really know how to develop very efficiently.

But we feel like there's definitely a very deep connection between academia and industry labs.

So recently, a lot of the large scale simulations have been done in industry labs like Microsoft, DeepMind, and Meta.

But a lot of those tools have been actually developed in academia and then passed on.

So there's actually really nice synergy there.

I think I'd add a few other things too.

So like you found when you were evaluating models on their ability to do scientific analysis, they were deficient.

This was probably, I mean, not a direct goal for those teams training those models.

So I think academia and these collaborations say, well, help us inform what are the important tasks?

Like how do you do this analysis?

What skills do we want to put in the model?

A skill could be a full analysis or a skill could be like a smaller primitive as part of a larger analysis.

But also secondarily, it's how do you think?

So one of the physicists was looking at the reasoning strategies of one of our models.

He's like, "It's all wrong.

It's all wrong."

And we're like, "What do you mean?"

He's like, "No, this should be thinking higher level.

It should be thinking in terms of symmetries."

This is the book that encodes the thinking strategies that will be more effective.

And of course, your reinforcement learning environment needs to reward those types of strategies.

But given some of the most premier scientists are using these strategies, they're likely effective.

And these are types of things where it's like an industry academic partnership can just be so powerful because industry just simply is blind to these types of analyses, these tools, as well as just this way of thinking.

Yeah.

And there's a way of connecting that to the tooling question as well because language is very important.

But then in the human brain, we also see other visual processing like geometric.

So it's plausible that while these LLMs will keep getting better and better, they'll actually benefit from having a geometric reasoning that's separate.

So today we can do that with equivariant graph neural networks.

We can do it with diffusion models that are geometric tools by construction.

And the LLM can call them so then it can have both the language aspect, which is very good for say synthesis recipe, but also the geometric aspects, which is very good for representing atoms, just design geometries in general.

So how are you thinking about deepening Periodic's ties with academic labs?

Yeah, this is very important for us.

So we have two major initiatives in this direction.

One of them is we're starting an advisory board.

This will be expertise spanning from superconductivity to solid state chemistry to physics.

And we want to make sure we're in touch with these long-term research directions.

A lot of important government funding goes to these groups and we want to have a tight coupling between what's important for them and us.

So this includes superconductivity expertise such as Z.X. Shen from Stanford on the experimental side and Steve Kivelson from the theory side.

We also have synthesis expertise on the advisory board from Mercouri Kanatzidis from Northwestern University and Chris Wolverton on the high-throughput DFT side.

And then we have Kostya Novoselov from Manchester University who is really well known for discovering graphene.

So he'll be able to advise us on these novel, exotic electronic states and materials.

And our second initiative is going to be through a grant program.

We really want to enable some of this amazing work going on in academia and some of their work isn't a good fit for industry.

It's best done in academia.

So we want to accept grant proposals and we want to enable and support the kind of work that's going to help the community, especially in relation to LLMs, agents in synthesis, materials discovery, physics modeling.

So maybe after this show you can include the link.

Yeah, we'll include them in the show notes if grants are open starting today.

Absolutely.

Great.

So for people who might be interested in joining Periodic, what are you guys looking for?

First off, someone deeply curious.

Someone who really wants to understand the machine learning, the science at a deeper level, who wants to make contact with reality, who wants to advance science.

This has to be a driving thing.

But also pragmatic.

What we're trying to do is incredibly challenging, so we want someone who has a very careful process, who is solution oriented, who gets to goals quickly.

And really someone world class along some dimension.

We're looking across all these different pillars, so machine learning, experimentalists, simulation, and people who can bring some sort of innovation on how do you create a creative ML system?

How do you bring new types of tools or new types of thinking to some of these state of the art models?

Someone who can advance simulations and make it more robust and more reliable with experiment.

Yeah, and maybe one more thing I'd add is, Liam and I have been really looking for a sense of urgency in candidates because we want these technologies not in 10 years.

We don't want these LLMs to start improving science in 10 years, but we want them ASAP.

So if the candidate feels like a sense of urgency for improving these physical systems, discovering these amazing materials, innovating on superconductivity, they would be a good fit.

Yeah, if you match all these, please reach out.

All right.

It sounds like we've got to amp up the speed and the scale of stuff happening at Periodic, and we'll put the career links in the show notes.

Thanks for coming, guys.

Thanks for listening to the A16Z podcast.

If you enjoyed the episode, let us know by leaving a review at ratethispodcast.com/a16z.

We've got more great conversations coming your way.

See you next time.

As a reminder, the content here is for informational purposes only.

It should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any A16Z fund.

Please note that A16Z and its affiliates may also maintain investments in the companies discussed in this podcast.

For more details, including a link to our investments, please see a16z.com/disclosures.