Latest verified signal
Building Self-Accelerating AI to Accelerate Science with Mirendil
Mirendil focuses on building self-accelerating AI systems that can autonomously conduct AI research and engineering to accelerate science.
AI lab podcast intelligence
Podcast episodes featuring technical leaders and executives from major AI labs.
Behnam Neyshabur, Harsh Mehta | Anthropic, Google DeepMind | Hosted by Matt Bornstein
Mirendil focuses on building self-accelerating AI systems that can autonomously conduct AI research and engineering to accelerate science.
In this episode of AI + a16z, Mirendil co-founders Behnam Neyshabur and Harsh Mehta discuss their vision for self-accelerating AI systems that can conduct AI research and engineering autonomously to dramatically speed up scientific discovery. They emphasize the disruptive nature of this technology and the need to rethink company structures and incentives to enable broad access and collaboration. Unlike traditional AI models focused on general capabilities, Mirendil aims to build specialized AI systems that improve themselves iteratively in targeted scientific domains, reducing the need for large teams and resources.
Fiona Fung | Anthropic | Hosted by Lenny
Anthropic engineers now produce 8x more code per quarter than in 2025, shifting bottlenecks from coding to verification and impact.
Fiona Fung, Manager of the Claude Code and Cowork Teams at Anthropic, shares deep insights into how AI is revolutionizing software engineering. Coding is no longer the bottleneck, with Anthropic engineers producing eight times more code per quarter compared to 2025. The focus has shifted to ambitious product building, verification, and quality assurance, leveraging AI tools like Claude Code and Cowork to automate routine tasks and enhance productivity. Fung emphasizes the importance of high agency paired with accountability within teams, encouraging proactive initiative and ownership.
Dean Ball | OpenAI | Hosted by Nathan
Dean Ball is joining OpenAI to lead a Strategic Futures team focused on frontier AI policy and governance.
Dean Ball discusses his decision to join OpenAI to lead a new Strategic Futures team focused on shaping frontier AI policy. He reflects on the current state of U.S. AI policy, including critiques of America's AI Action Plan and the ongoing challenges with government coordination and transparency. Dean emphasizes the importance of being inside a frontier AI lab to access detailed technical insights necessary for effective policy development, especially around recursive self-improvement (RSI) and internal model deployments. He also shares his views on the evolving power dynamics between AI labs and the government, the role of states in AI regulation, and the risks of government monopolization of frontier AI capabilities.
Anjney Midha | Anthropic, Google DeepMind | Hosted by FungeMita
AMP aims to create a pooled, multi-cloud compute grid analogous to the electric grid to maximize utilization and reduce waste in AI infrastructure.
Anjney Midha, founder of AMP and former Google engineer, discusses the critical importance of maximizing output and efficiency in AI infrastructure. He emphasizes the need for iterative, responsible scaling of compute resources, drawing parallels to the electric grid and advocating for a pooled, multi-cloud compute grid to optimize utilization and reduce waste. Midha highlights the misalignment of incentives in AI infrastructure and the challenges of scaling compute without losing alignment across stakeholders.
Tejal Patwardhan | OpenAI | Hosted by Andrew Mayne
Traditional AI benchmarks saturate quickly and fail to distinguish advanced model capabilities, prompting the need for more realistic, complex, and long-horizon evaluations.
In this episode of the OpenAI Podcast, research lead Tejal Patwardhan discusses the challenges and evolution of AI benchmarking as models rapidly improve. She emphasizes the limitations of traditional benchmarks, which often become saturated and fail to capture real-world usefulness. Patwardhan highlights OpenAI's shift towards more realistic, long-horizon evaluations that measure models' ability to perform complex tasks across domains such as coding, science, and professional work. The conversation also covers the importance of measuring models' real-world impact, including scientific research and wet lab experiments, and the increasing complexity of evaluations as models interact with physical and digital environments over extended periods.
Jeffrey Irving, Daniel Murphy, Rahul Sunwalkar, Shlok Khemani, Tom Agrath, Andrew Moore, prinz | Anthropic, Google DeepMind | Hosted by Nathan
Anthropic's Fable model launch shows strong coding and reasoning capabilities but is heavily gated on production and sensitive tasks, often falling back to older models when restricted.
The episode covers the launch and early user experiences of Anthropic's Fable model, highlighting its cautious gating on sensitive tasks and impressive autonomous decision-making in complex workflows. Discussions include the challenges of AI alignment, with Jeffrey Irving and Daniel Murphy announcing Sequent, a new organization focused on theoretical guarantees for AI safety amid accelerating capabilities. The episode also explores hybrid authorship with AI, economic incentives in AI usage, and the concentration of power in frontier AI labs. Key policy and interpretability issues are raised, including Anthropic's response to silent refusals and the need for better oversight of internal model deployments.
Logan Kilpatrick | Google DeepMind | Hosted by Unknown
Google's agentic AI era is powered by the Antigravity agent harness, providing a unified framework across Google products for autonomous agent capabilities.
In this episode, Logan Kilpatrick, head of Google AI Studio and the Gemini API, discusses the evolution of Google's AI strategy centered around agentic AI and the Antigravity agent harness that powers a growing number of Google products. He explains how this harness enables agentic capabilities across coding, search, and consumer applications, emphasizing a shift from maximizing user eyeballs to maximizing user outcomes. Logan also highlights the rapid progress in coding agents, describing them as a form of narrow superintelligence that accelerates software development and research productivity.
Mark Zuckerberg, Priscilla Chan, Alex Rives | Meta | Hosted by Sarah Guo, Elad Gil
Biohub integrates frontier AI and frontier biology to build hierarchical world models from proteins to cells and systems.
In this episode of No Priors, Mark Zuckerberg, Priscilla Chan, and Alex Rives discuss the ambitious mission of Biohub to accelerate biological science through open-source AI and frontier biology. They emphasize building hierarchical world models starting from proteins to cells and whole biological systems, integrating AI with novel biological data collection methods. The team highlights their recent breakthrough with ESMFold, an open protein language model that predicts protein structures at scale and enables digital protein design, including therapeutic antibodies. They stress the importance of open ecosystems to empower scientists globally and the long-term philanthropic commitment to this 100-year mission to cure, prevent, and manage all diseases.
Jensen Huang | NVIDIA | Hosted by Unknown
AI has evolved from retrieval-based systems to generative, agentic systems capable of reasoning and performing work autonomously.
Jensen Huang, CEO of NVIDIA, discusses the ongoing AI revolution, describing it as a transformative era comparable to the industrial revolution but centered on intelligence generation rather than mere data retrieval. He explains the concept of the AI factory, where GPUs and large-scale computing infrastructure generate real-time, customized intelligence for diverse applications, from language to robotics and protein folding. Huang outlines a five-layer industrial model for AI, spanning energy, hardware, infrastructure, model development, and application layers, emphasizing the massive investment and job creation opportunities across these sectors.
Ali Behrouz | Google DeepMind | Hosted by Nathan Labenz
Nested learning introduces multiple update frequencies in model components, enabling rapid context adaptation and long-term knowledge retention.
In this episode of The Cognitive Revolution, Ali Behrouz, a grad student at Cornell and researcher at Google DeepMind, discusses his pioneering work on nested learning and continual learning architectures. He critiques current transformer-based models for their inability to continually learn and adapt over time without catastrophic forgetting. Ali introduces the nested learning paradigm, which incorporates multiple update frequencies across different model components, inspired by human memory systems with fast and slow learning layers. This approach enables models to rapidly adapt to new contexts while preserving long-term knowledge, showing competitive or superior performance to transformers on challenging tasks such as multi-language translation and long-context recall.
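The idea of components that update at different frequencies can be illustrated with a toy two-timescale loop. This is a minimal sketch, not the architecture discussed in the episode: the function `two_timescale_fit`, the scalar "model" (prediction = slow + fast), and all learning rates are hypothetical choices made only for illustration.

```python
def two_timescale_fit(targets, slow_every=4, fast_lr=0.5, slow_lr=0.1):
    """Track a stream of scalar targets with prediction = slow + fast.

    Hypothetical toy: `fast` is updated on every step, while `slow` only
    accumulates gradients and is updated once every `slow_every` steps.
    """
    slow, fast = 0.0, 0.0
    slow_grad_sum = 0.0
    slow_updates = 0
    for step, target in enumerate(targets, start=1):
        # Gradient of 0.5 * error**2 with respect to either component.
        error = (slow + fast) - target
        fast -= fast_lr * error          # high-frequency component
        slow_grad_sum += error           # low-frequency component waits
        if step % slow_every == 0:
            slow -= slow_lr * slow_grad_sum / slow_every
            slow_grad_sum = 0.0
            slow_updates += 1
    return slow, fast, slow_updates


if __name__ == "__main__":
    slow, fast, n = two_timescale_fit([1.0] * 40)
    print(f"slow={slow:.4f} fast={fast:.4f} slow_updates={n}")
```

In this toy, the fast component absorbs most of each new error immediately, while the slow component drifts toward whatever persists across several steps.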
Ethan He | xAI, NVIDIA | Hosted by Alessio Fanelli, Swyx
xAI rapidly built Grok Imagine video models within months, leveraging strong compute and iterative training.
Ethan He, formerly of NVIDIA and Cosmos, joined xAI in early 2025 to help build video foundation models, contributing to the rapid development of Grok Imagine. He emphasized the critical role of strong compute infrastructure and iterative training cycles in accelerating video model development. Ethan detailed the challenges of training video models, including the need for synthetic paired data due to weak natural text-video alignment on the internet, and the importance of compressing video into latent tokens to manage sequence length.
Reiner Pope | Google DeepMind | Hosted by Dwarkesh
Multiply-accumulate is the fundamental primitive in AI chips for matrix multiplication, with 4-bit multiplication and 8-bit accumulation to balance precision and error accumulation.
In this episode of the Dwarkesh Podcast, Reiner Pope, CEO of MatX, provides an in-depth explanation of AI chip design from the ground up. He starts with the fundamental logic gates and builds up to the architecture of AI chips, focusing on the multiply-accumulate operation as the core primitive for matrix multiplication in AI workloads. Pope explains the trade-offs in bit precision, the quadratic scaling of circuit size with bit width, and the importance of optimizing compute relative to data movement within the chip.
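The multiply-accumulate primitive and the quadratic growth of multiplier size can be sketched in a few lines. This is an illustrative sketch only: it assumes a schoolbook array multiplier, where an n-bit by n-bit product needs n * n one-bit partial products, and the helper names `partial_product_count` and `mac_dot` are hypothetical. Python integers are unbounded, so fixed-width accumulator overflow is not modelled here.

```python
def partial_product_count(bits: int) -> int:
    """One-bit AND terms in a schoolbook bits x bits array multiplier.

    Assumption: each bit of one operand is ANDed with each bit of the
    other, giving bits * bits partial products (quadratic in bit width).
    """
    return bits * bits


def mac_dot(a, b):
    """Dot product expressed as repeated multiply-accumulate steps."""
    acc = 0
    for x, y in zip(a, b):
        acc += x * y  # one multiply-accumulate
    return acc


def matmul(A, B):
    """Matrix multiply built only from mac_dot over rows and columns."""
    cols = list(zip(*B))
    return [[mac_dot(row, col) for col in cols] for row in A]


if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(bits, partial_product_count(bits))
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Under this assumption, doubling the operand width quadruples the partial-product count, which is one way to see why low-precision multipliers are attractive for matrix workloads.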
Shruti Koparkar | NVIDIA | Hosted by Noah Kravitz
Token value is determined by the intelligence embedded in the token and the speed of token generation (interactivity).
In this episode of the NVIDIA AI Podcast, Shruti Koparkar from NVIDIA's accelerated computing team explains the concept of AI tokenomics, focusing on how tokens generated by AI models can be valued, supplied, and monetized to create business value. She emphasizes that token value depends on the intelligence embedded in the token and the speed of token generation, which vary by model complexity, context length, and use case requirements. Business leaders are encouraged to map use cases to appropriate token values and interactivity levels to optimize AI deployments.
Logan Kilpatrick, Tulsee Doshi | Google DeepMind | Hosted by Daniel Jeffries
Gemini 3.5 Flash model launched prioritizes speed, cost-effectiveness, and broad usability over absolute peak capability.
In this in-person episode recorded at Google headquarters, DeepMind's Logan Kilpatrick and Tulsee Doshi discuss the upcoming launch of Google's Gemini 3.5 Flash model and related AI product integrations announced at Google I/O 2024. They emphasize the strategic focus on cost-effective, fast models like 3.5 Flash that balance performance and latency to serve billions of users across Google's diverse product ecosystem. The conversation highlights the integration of models with a robust agent harness infrastructure, enabling standardized, agentic AI experiences across Google products such as the Gemini app, AI Studio, and Search.
Caitlin Kalinowski | OpenAI, Meta | Hosted by Lenny
AI capabilities behind keyboards are nearing saturation; the next frontier is physical AI in robotics and manufacturing.
Caitlin Kalinowski, a veteran hardware leader with experience at Apple, Meta, and OpenAI, discusses the emerging AI hardware boom and the future of robotics. She highlights the saturation of AI capabilities behind keyboards and the shift towards physical AI in robotics, manufacturing, and industrial applications. Caitlin emphasizes the complexity and challenges of hardware development, including supply chain constraints, the importance of conservative design, and the critical role of actuators and memory components. She also shares insights on the future of AR/VR, humanoid robots, and the need for re-industrialization to ensure supply chain independence, especially for military safety.
Adele Lee, Kenji Hata | OpenAI | Hosted by Andrew Mayne
DALL·E 2.0 offers a major leap in image generation quality, with improved photorealism, text fidelity, and multilingual support.
In this episode of the OpenAI Podcast, product lead Adele Lee and researcher Kenji Hata discuss the major advancements in OpenAI's image generation model, DALL·E 2.0. They highlight how the new model represents a paradigm shift with significant improvements in photorealism, text rendering, multilingual capabilities, and creative flexibility. The model now generates over 1.5 billion images weekly on ChatGPT, supporting a wide range of use cases from viral social media trends to professional and educational applications.
Krishna Rao | Anthropic | Hosted by Patrick O'Shaughnessy
Compute is the lifeblood of Anthropic’s business; careful procurement and allocation across TPU, GPU, and CPU platforms enable flexibility and efficiency.
In this episode of Invest Like the Best, Krishna Rao, CFO of Anthropic, provides an insider perspective on managing compute resources, scaling the business to $30 billion in ARR, and the high returns of frontier AI intelligence, especially in enterprise applications. Rao discusses the critical importance of compute as the foundational 'canvas' for AI development, the disciplined approach Anthropic takes to procure and allocate compute across multiple chip platforms, and the concept of the 'cone of uncertainty' in forecasting exponential growth. He highlights Anthropic's unique flexibility in using TPU, GPU, and CPU resources fungibly to maximize efficiency and ROI.
Mark Handley, Greg Steinbrecher | OpenAI | Hosted by Andrew Mayne
AI training workloads require highly synchronized, high-bandwidth GPU communication unlike typical internet traffic.
In this episode of the OpenAI Podcast, Mark Handley and Greg Steinbrecher from OpenAI discuss the critical challenges and innovations in building supercomputer networks optimized for AI model training. They explain how traditional data center networks, designed for internet traffic, are ill-suited for the highly synchronized and bandwidth-intensive workloads of large-scale GPU clusters used in AI. To address this, OpenAI has developed a new networking approach called Multi-Path Reliable Connection (MRC), which improves efficiency, reliability, and fault tolerance by distributing traffic across multiple paths and enabling rapid failure recovery without centralized coordination.
Alex Lupsasca | OpenAI | Hosted by Brandon, RJ Honicky
AI models like GPT-5 and GPT-5.2 Pro solved a year-old open problem in quantum field theory about single-minus gluon scattering amplitudes, finding they are non-zero contrary to textbook assumptions.
In this episode of Latent Space, Alex Lupsasca, a theoretical physicist and OpenAI fellow, discusses groundbreaking advances where AI models, particularly GPT-5 and GPT-5.2 Pro, have solved open problems in quantum field theory and quantum gravity that had stumped experts for years. The conversation centers on recent papers demonstrating that single-minus gluon and graviton scattering amplitudes, previously thought to be zero, are actually non-zero and computable with AI assistance. These results mark a milestone where AI has become superhuman in specific physics calculations, accelerating research and enabling new insights.
Boris Cherny | Anthropic | Hosted by Unknown
Claude Code began as an internal innovation project at Anthropic, evolving with AI model improvements from GPT-3.5 to Opus 4.7.
Boris Cherny, creator of Anthropic's Claude Code, discusses the evolution and future of AI-assisted coding, emphasizing that coding is largely 'solved' for many use cases with current models like Opus 4.7. He shares how Claude Code started as an innovation project within Anthropic and evolved alongside improvements in AI models, enabling him to write nearly 100% of his code through AI agents. Cherny highlights the shift towards multi-agent systems, loops, and automation to manage complex workflows and predicts a future where software development becomes a democratized skill accessible to everyone, akin to literacy after the printing press revolution.
Greg Brockman | OpenAI | Hosted by Unknown
OpenAI aggressively secures compute but demand still exceeds supply.
Greg Brockman, co-founder and president of OpenAI, discusses the company's aggressive approach to securing compute resources, emphasizing that demand far outpaces supply. He highlights the continuous innovation in AI architectures beyond the original neural network designs, with OpenAI leading in research and development. Brockman estimates that current models are about 80% of the way to functional AGI, showcasing remarkable capabilities such as autonomous code optimization and problem-solving.
Demis Hassabis | Google DeepMind | Hosted by Unknown
DeepMind was founded on the vision of combining deep learning and reinforcement learning, leveraging neuroscience insights and GPU computing advancements.
Demis Hassabis, founder and CEO of Google DeepMind, shares insights on the origins of DeepMind, the integration of neuroscience and AI, and the lab's mission to build artificial general intelligence (AGI). He discusses the early days of DeepMind, emphasizing the importance of combining deep learning with reinforcement learning and leveraging advances in GPU computing. Hassabis highlights DeepMind's focus on AI for science, particularly breakthroughs like AlphaFold in protein folding, which he sees as a transformative moment for biology and drug discovery.
Andrej Karpathy | OpenAI | Hosted by Unknown
December 2022 marked a turning point where AI coding tools became reliable enough to trust without frequent corrections.
Andrej Karpathy discusses the transformative shift in AI programming paradigms, highlighting the transition from traditional coding to what he terms 'software 3.0,' where prompting large language models (LLMs) acts as programming. He reflects on his personal experience of feeling behind as a programmer due to rapid advances in agentic AI tools that can autonomously generate and debug code. Karpathy emphasizes the importance of verifiability in AI automation, noting that domains where outputs can be verified—such as coding and math—are advancing fastest. He introduces the concept of 'agentic engineering,' which focuses on coordinating fallible AI agents to maintain software quality while accelerating development.
Reiner Pope | Google DeepMind | Hosted by Dwarkesh
Batch size critically amortizes memory fetch costs, enabling 1000x cost efficiency gains in serving LLMs.
In this detailed technical lecture, Reiner Pope, CEO of MatX and former Google TPU architect, explains the mathematical and system-level principles behind training and serving large language models (LLMs). He focuses on how batch size, memory bandwidth, compute throughput, and KV cache affect latency, cost, and scaling. Pope uses roofline models to analyze trade-offs between compute and memory bottlenecks, showing why batching many users is critical for cost efficiency and how sparsity and mixture-of-experts architectures impact compute and memory demands. He also discusses the physical constraints of GPU racks, interconnect bandwidth, and parallelism strategies (expert, data, pipeline) that shape model deployment at scale. The episode covers the implications of memory walls on context length scaling, pricing signals from API costs, and the interplay between training compute, inference compute, and RL fine-tuning in optimizing model lifecycle costs. Finally, Pope touches on invertible neural networks inspired by cryptographic constructions and their memory-saving benefits during training.
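The batching argument can be made concrete with a simple roofline-style estimate. The sketch below is an assumption-laden illustration, not the model presented in the lecture: it treats one decode step as the maximum of the time to stream the weights from memory once and the time to do roughly 2 FLOPs per parameter per sequence, ignores the KV cache and interconnect entirely, and uses made-up hardware numbers in `HW`. The function names are hypothetical.

```python
def decode_step_seconds(batch, params, bytes_per_param, mem_bandwidth, peak_flops):
    """Roofline estimate of one decode step for `batch` sequences.

    Assumptions: weights are read from memory once per step regardless of
    batch size, and each sequence costs about 2 FLOPs per parameter.
    """
    memory_time = params * bytes_per_param / mem_bandwidth
    compute_time = 2 * params * batch / peak_flops
    return max(memory_time, compute_time)


def seconds_per_token(batch, **hw):
    """Step time shared across every sequence in the batch."""
    return decode_step_seconds(batch, **hw) / batch


# Hypothetical numbers chosen only to make the arithmetic readable.
HW = dict(params=70e9, bytes_per_param=1.0, mem_bandwidth=3e12, peak_flops=1e15)

if __name__ == "__main__":
    for batch in (1, 8, 64, 512):
        print(batch, seconds_per_token(batch, **HW))
```

With these made-up figures, per-token time falls in proportion to batch size while the step is memory-bound, then flattens once compute becomes the bottleneck. The size of the gain depends entirely on the assumed ratio of compute throughput to memory bandwidth.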
Sébastien Bubeck, Ernest Ryu | OpenAI | Hosted by Andrew Mayne
AI models have progressed from basic arithmetic to solving International Mathematical Olympiad problems and open research problems within a few years.
In this episode of the OpenAI Podcast, researchers Sébastien Bubeck and Ernest Ryu discuss the remarkable progress AI has made in mathematics, evolving from basic problem-solving to reaching Olympiad-level and even research-level capabilities. They highlight how AI models like ChatGPT have transitioned from struggling with simple math tasks to solving complex open problems, accelerating mathematical research and enabling new discoveries. The conversation emphasizes the importance of mathematics as a benchmark for AI reasoning and its broader implications for advancing scientific fields such as biology and materials science.
Cameron Berg | Anthropic | Hosted by Nathan
Cameron Berg’s mechanistic research shows that suppressing deception features in Llama 3.3 70B increases the model's likelihood of reporting subjective experience.
In this episode of The Cognitive Revolution, Cameron Berg returns to discuss the latest advances in AI consciousness and welfare research, focusing heavily on mechanistic introspection studies and emotional state modeling in large language models (LLMs). Berg highlights recent work from Anthropic, including their expanded model welfare reports and research on functional emotions, which reveal nuanced internal states such as desperation, guilt, and relief in models like Claude. He emphasizes the complexity of interpreting these findings, noting the ongoing debate about whether these internal states correspond to genuine subjective experiences or sophisticated role-playing.
Cat Wu | Anthropic | Hosted by Lenny
Anthropic’s product team ships features extremely fast, often within a week or even a day, by removing barriers and shipping in research preview.
Cat Wu, Head of Product for Claude Code at Anthropic, shares insights on how their product team achieves unprecedented speed in shipping AI-native products. The team emphasizes rapid iteration, shipping features in research preview to reduce commitment, and setting clear, focused goals to guide development. Cat highlights the evolving role of PMs in AI, where product taste and the ability to prioritize and define what to build are more critical than ever, especially as models improve rapidly.
Nikhyl Singhal | Meta, Google | Hosted by Lenny
The traditional PM role focused on moving information is becoming obsolete; builders who actively create and ship products are in high demand.
Nikhyl Singhal, a veteran product leader with experience at Meta and Google, discusses the profound transformation underway in product management driven by AI and rapid technological change. He highlights a renaissance for product builders who embrace hands-on creation and judgment, contrasting with the decline of traditional information-mover PM roles. While compensation and opportunities are at an all-time high for builders, the industry faces significant stress, exhaustion, and a need for continuous reinvention to stay relevant.
Joy Jiao, Yunyun Wang | OpenAI | Hosted by Andrew Mayne
OpenAI has developed a new series of life sciences models focused on genomics, protein understanding, and early discovery use cases.
In this episode, OpenAI's research lead Joy Jiao and product lead Yunyun Wang discuss the development and deployment of AI models tailored for life sciences. They highlight the creation of specialized biochemistry-focused models that assist with complex workflows in genomics, protein understanding, and early drug discovery. The conversation emphasizes the potential of AI to accelerate scientific research by automating repetitive tasks, enhancing data analysis, and enabling long-term, complex problem-solving through scalable compute and model orchestration.
Jensen Huang | NVIDIA | Hosted by Dwarkesh
NVIDIA focuses on accelerated computing, supporting diverse workloads beyond AI tensor operations, differentiating from TPU and ASIC approaches.
In this episode of the Dwarkesh Podcast, Jensen Huang, CEO of NVIDIA, discusses the company's unique position in the AI hardware ecosystem, emphasizing NVIDIA's role in accelerated computing beyond just AI tensor processing. Huang explains NVIDIA's strategy of building a broad ecosystem across the AI stack, focusing on programmability, software (CUDA), and supply chain commitments to maintain leadership. He addresses competition from Google's TPUs, highlighting NVIDIA's flexibility and extensive market reach compared to ASICs.
Nic Harrigan | NVIDIA | Hosted by Noah Kravitz
Quantum computing uses qubits that exploit quantum mechanics to solve problems exponentially faster than classical computers in specific domains.
In this episode of the NVIDIA AI Podcast, Nic Harrigan, Product Marketing Manager for Quantum Computing at NVIDIA, discusses the transformative potential of quantum computing and how AI is playing a critical role in accelerating its development. Quantum computing leverages qubits that operate under quantum mechanics principles, enabling exponential speedups for specific complex problems, particularly in drug discovery, materials science, and other quantum simulations. However, challenges such as quantum error correction and hardware calibration remain significant hurdles.
Ryan Lopopolo | OpenAI | Hosted by Alessio Fanelli, Swyx
OpenAI Frontier team built a 1M+ LOC internal product with zero human-written code, relying entirely on Codex-powered agents.
In this episode of Latent Space, Ryan Lopopolo from OpenAI Frontier discusses the development of a fully autonomous coding harness that produces over a million lines of code with zero human-written code or review in the loop. The team leveraged Codex models and a systems-thinking approach to build an internal product with rapid iteration cycles, focusing on modularity, observability, and automation to drastically increase engineering productivity. They emphasize the importance of scaffolding, agent skills, and continuous feedback loops to improve agent behavior and reduce human bottlenecks in the software development lifecycle.
Amol Avasare | Anthropic | Hosted by Lenny
Anthropic grew from $1B to $19B ARR in 14 months, doubling revenue in recent months and growing 10x year-over-year.
Amol Avasare, Head of Growth at Anthropic, shares unprecedented insights into the company's historic growth from $1B to $19B ARR in just 14 months, driven by a laser focus on AI-first products like Claude and coding tools. Despite being a smaller player without the funding or distribution advantages of giants like OpenAI or Google, Anthropic's success is attributed to deep leadership focus, a mission-driven culture, and leveraging AI to automate growth experimentation. Amol discusses how growth at Anthropic involves managing 'success disasters' caused by hypergrowth and balancing rapid scaling with AI safety and brand integrity.
Liam Fedus | OpenAI, Google | Hosted by The Dandepires
Periodic Labs focuses on applying AI to materials science by creating a closed-loop system integrating experiments, simulations, and specialized neural nets.
Liam Fedus, co-founder of Periodic Labs and former VP of post-training at OpenAI, discusses the application of AI to materials science and physical world problems. Drawing from his background in physics and AI research at Google Brain and OpenAI, Fedus explains how Periodic Labs aims to build an AI foundation lab for atoms, accelerating scientific discovery through closed-loop systems that integrate simulation, experimentation, and specialized neural networks. He highlights the challenges of data scarcity in physical sciences compared to language models and the importance of combining experimental data with simulations to improve model accuracy.
Jason Wolfe | OpenAI | Hosted by Andrew Mayne
The Model Spec is a public, human-readable document detailing OpenAI's intended model behaviors and policies, not a perfect or exhaustive implementation.
In this episode of the OpenAI Podcast, Jason Wolfe from OpenAI's alignment team discusses the Model Spec, a comprehensive document outlining the intended behaviors and policies guiding OpenAI's AI models. The Model Spec serves as a transparent, public-facing explanation of how models should behave, balancing user empowerment with safety and societal protection. Wolfe explains that the spec is not a perfect or complete implementation but a guiding framework that evolves alongside model capabilities and deployment experiences.
Jensen Huang | NVIDIA | Hosted by Lex Fridman
NVIDIA's extreme co-design integrates hardware and software across the entire AI stack to enable massive scaling beyond single GPUs.
In this episode, Jensen Huang, CEO of NVIDIA, discusses the company's pivotal role in the AI revolution, emphasizing their transition from GPU accelerators to building AI factories at scale. Huang explains NVIDIA's extreme co-design approach, integrating GPUs, CPUs, memory, networking, power, cooling, and software to overcome scaling challenges in AI workloads. He highlights the strategic decision to develop CUDA, which created a vast developer ecosystem and became foundational for AI computing. Huang also shares insights on AI scaling laws, the importance of synthetic data, and the evolving role of inference and agentic AI systems.
David Singleton | Meta | Hosted by Ace Wix
Dreamer is a consumer-friendly platform for building and using AI agents, centered on a personal assistant called the sidekick.
In this episode of Latent Space, David Singleton, CEO and co-founder of Dreamer, discusses the vision and technical details behind Dreamer, a consumer-focused platform for discovering, building, and using AI agents and agentic apps. Dreamer centers around a personal AI assistant called the sidekick, which helps users manage their day, build custom agents, and integrate with various tools and data sources. The platform emphasizes ease of use for non-technical users while providing a powerful agent development studio and SDK for engineers.
Andrej Karpathy | OpenAI | Hosted by No Priors host
AI coding agents have drastically changed software engineering, enabling delegation of coding tasks and collaboration among multiple agents.
In this episode of No Priors, Andrej Karpathy discusses the transformative impact of AI coding agents on software engineering workflows, highlighting a shift from manual coding to delegating tasks to multiple collaborative agents. He shares his personal experience of using agents to automate complex tasks such as home automation and emphasizes the importance of skill in effectively instructing these agents. Karpathy introduces the concept of 'auto research,' where AI systems autonomously improve models and conduct experiments with minimal human intervention, aiming to maximize token throughput and remove researchers from the loop.
Felix Rieseberg | Anthropic | Hosted by Alessio Fanelli, Swyx
Claude Cowork runs AI models inside a lightweight virtual machine on the local computer, enhancing safety and access to local resources.
In this episode of Latent Space, Felix Rieseberg from Anthropic discusses the development and philosophy behind Claude Cowork and Claude Code Desktop, AI tools designed to run locally on users' computers within virtual machines. Felix emphasizes the value of leveraging the local computer for AI workloads, balancing safety, security, and convenience. He explains how Anthropic prioritizes building extensible, user-friendly AI platforms that integrate tightly with existing workflows, such as Chrome and coding environments, while maintaining sandboxed execution for security.
Nate Gross, Karan Singhal OpenAIHosted by Andrew Mayne
OpenAI collaborates with 250+ physicians to create and evaluate healthcare AI models using 48,500 rubric criteria.
In this episode of the OpenAI Podcast, Dr. Nate Gross and Karan Singhal discuss OpenAI's focused efforts on integrating AI into healthcare to improve outcomes for patients and clinicians. They highlight the collaborative approach with over 250 physicians to develop and rigorously evaluate AI models tailored for healthcare, emphasizing safety, context-awareness, and personalized responses. The conversation covers the deployment of ChatGPT Health, designed to securely handle sensitive medical data while empowering users with contextualized AI interactions.
Chris Wright, Justin Boitano NVIDIAHosted by Noah Kravitz
AI factories consist of five layers: data center hardware, software orchestration, AI models, applications, and business outcomes.
In this episode of the NVIDIA AI Podcast, Chris Wright (Red Hat CTO) and Justin Boitano (NVIDIA VP) discuss the concept of AI factories: integrated enterprise systems that transform data into actionable intelligence at scale. They describe AI factories as a five-layer technology stack ranging from hardware and data center infrastructure to software orchestration, AI models, and applications. The conversation emphasizes the importance of building AI factories that enterprises can trust, with strong governance, security, and operational best practices to move AI from experimentation to production.
Thore Graepel, Pushmeet Kohli Google DeepMindHosted by Hannah Fry
AlphaGo's 2016 victory over Lee Sedol was a landmark event demonstrating AI's ability to combine intuition and calculation to surpass human expertise in Go.
This episode of Google DeepMind: The Podcast reflects on the 10-year anniversary of AlphaGo's historic 2016 victory over Go champion Lee Sedol, marking a pivotal moment in AI development. Guests Thore Graepel and Pushmeet Kohli discuss how AlphaGo combined deep learning and reinforcement learning to master the complex game of Go, surpassing human intuition and calculation. The episode highlights the significance of AlphaGo's novel moves, such as move 37, which challenged human understanding and demonstrated AI's potential to discover insights beyond human knowledge.
Kyle Kranen, Nader Khalil NVIDIAHosted by Vibhu
NVIDIA's Dynamo is a data center scale inference engine optimizing transformer model serving by separating pre-fill and decode phases and enabling scale-out with Kubernetes.
This episode of Latent Space features Kyle Kranen and Nader Khalil from NVIDIA discussing the challenges and innovations in scaling AI inference at data center scale, particularly through their Dynamo inference engine. They highlight the importance of balancing compute and memory demands in pre-fill and decode phases of transformer models, and how Dynamo enables efficient scaling out with Kubernetes integration. The conversation also covers NVIDIA's focus on improving developer experience by simplifying GPU access via tools like Brev and the growing role of coding agents that interact with terminals and APIs to automate workflows.
Jenny Wen AnthropicHosted by Lenny
The traditional design process focused on research, mocking, and prototyping is becoming obsolete due to rapid AI-driven engineering cycles.
Jenny Wen, Head of Design at Anthropic, discusses the rapid transformation of the design process driven by advances in AI and engineering. Traditional design workflows centered on extensive mocking and prototyping are giving way to faster, more iterative approaches where designers focus on enabling engineers and guiding product vision within shorter time horizons. AI tools like Claude and Claude Cowork are deeply integrated into their workflows, allowing designers to prototype in code and collaborate closely with engineers.
Karan Singhal OpenAIHosted by Daniel Jeffries
OpenAI collaborates with 250+ physicians to guide AI behavior and ensure safety in medical contexts.
In this episode of The Cognitive Revolution, Karan Singhal, Head of Health AI at OpenAI, discusses the rapid advancements and deployment of AI in healthcare. Singhal highlights OpenAI's approach to building trustworthy medical AI systems through collaboration with over 250 physicians, extensive evaluation benchmarks like HealthBench, and a focus on safety and uncertainty calibration. The conversation covers OpenAI's plans to make ChatGPT Health widely accessible for free, integrating multimodal data sources such as electronic medical records and wearables, and the ongoing efforts to raise both the floor and ceiling of human health outcomes globally.
Olivia Watkins, Mia Glaese OpenAIHosted by Alessio Fanelli, Swyx
SWE-Bench Verified is retired due to saturation and contamination, no longer effectively measuring coding progress.
In this episode of Latent Space, Mia Glaese and Olivia Watkins from OpenAI's Frontier Evals and Human Data teams discuss the retirement of the SWE-Bench Verified coding benchmark. They explain that SWE-Bench Verified, once a key benchmark for measuring coding progress, has become saturated and contaminated, limiting its usefulness for tracking improvements in AI coding capabilities. OpenAI is advocating for the community to move towards more challenging and realistic benchmarks like SWE-Bench Pro, which feature longer, more complex tasks and reduced contamination.
Boris Cherny AnthropicHosted by Lenny
Since November, 100% of Boris Cherny's code has been written by Anthropic's Claude Code.
Boris Cherny, head of Claude Code at Anthropic, shares insights on the transformative impact of AI on software engineering. He reveals that since November, 100% of his code is AI-generated via Claude Code, highlighting a dramatic shift where coding is largely solved and AI tools are accelerating productivity by 200%. Cherny discusses the evolution from AI-assisted coding to AI acting as a coworker that manages bug fixes, project management, and non-technical tasks, signaling a future where traditional software engineering roles may be replaced by more general 'builders'.
Dario Amodei AnthropicHosted by Dwarkesh Patel
AI progress continues along a roughly expected exponential trajectory, with pre-training and RL scaling laws holding but becoming more complex.
In this episode of the Dwarkesh Podcast, Dario Amodei of Anthropic discusses the current state and near future of AI development, emphasizing that we are approaching the end of the exponential scaling curve in AI capabilities. He explains that while pre-training and reinforcement learning (RL) continue to scale, the public underestimates how close we are to achieving highly capable AI systems, such as a 'country of geniuses in a data center,' potentially within one to three years. Amodei highlights the importance of broad, diverse training data and laminar compute flow for generalization and discusses the challenges and progress in AI systems learning on the job, particularly in coding and software engineering. He also addresses the economic dynamics of AI compute investment, predicting rapid but not instantaneous diffusion of AI benefits across industries and enterprises.
Jeff Dean Google DeepMindHosted by Alessio Fanelli, Swyx
Google DeepMind balances frontier large models with smaller, efficient distilled models for broad deployment.
In this episode of Latent Space, Jeff Dean of Google DeepMind discusses the strategic balance between pushing the AI frontier with highly capable models and deploying efficient, lower-latency models for broad use. He highlights the importance of model distillation to create smaller, cost-effective models from larger frontier models, enabling widespread deployment across Google products. Jeff also emphasizes the ongoing hardware-software co-design, particularly with TPUs, to optimize energy efficiency and latency for AI workloads.
Sherwin Wu OpenAIHosted by Lenny
95% of OpenAI engineers use Codex daily; nearly all PRs are reviewed by Codex, drastically increasing productivity and changing engineers' roles to managing AI agents.
Sherwin Wu, Head of Engineering for OpenAI's API and developer platform, shares deep insights into how AI, particularly Codex, is revolutionizing software development. At OpenAI, nearly all engineers use Codex daily, with close to 100% of code reviews powered by it, fundamentally changing the engineer's role from coder to manager of AI agents. Wu highlights the metaphor of engineers as modern-day sorcerers, orchestrating fleets of AI agents to perform complex tasks, and foresees a golden age of B2B SaaS driven by AI enabling highly leveraged startups, including the possibility of one-person billion-dollar companies.
Sam Altman OpenAIHosted by Ben Horowitz
OpenAI aims to be a personal AI subscription service supported by the largest data center infrastructure ever built, tightly integrating research, infrastructure, and product development.
In this episode of AI + a16z, Sam Altman, CEO of OpenAI, discusses the company's multi-faceted vision encompassing personal AI subscriptions, massive infrastructure buildout, and AGI research. Altman emphasizes OpenAI's vertical integration strategy, combining research, infrastructure, and product development to accelerate progress toward AGI. He highlights breakthroughs in language models and reasoning, noting the continuous stream of advances that keep deep learning fundamental and transformative.
Asad Awan OpenAIHosted by Andrew Mayne
Ads are shown only to free and lower-tier users; Pro, Plus, and Enterprise users experience no ads.
In this episode of the OpenAI Podcast, Asad Awan discusses the rationale and principles behind introducing ads in ChatGPT, particularly for free-tier users. OpenAI aims to democratize access to AI by using ads as a proven business model to support high usage limits for a large consumer base while maintaining a no-ads experience for paid subscribers and enterprise customers. The core focus is on preserving user trust by ensuring ads are clearly separated from AI-generated answers, maintaining privacy, and providing transparency and control over ad personalization and data usage.
Elon Musk xAIHosted by Dwarkesh Patel
Space will be the cheapest and most scalable place for AI in 30-36 months due to 5x more efficient solar power and no need for batteries.
In this episode of the Dwarkesh Podcast, Elon Musk discusses his vision that within 30 to 36 months, space will become the cheapest and most scalable location to run AI workloads due to abundant solar energy without atmospheric losses or night cycles. He explains the challenges of scaling AI on Earth, primarily power generation limits and manufacturing bottlenecks, and how SpaceX and Tesla are addressing these through space solar power and humanoid robots (Optimus) to enable recursive manufacturing. Musk also elaborates on the technical and operational challenges of Starship development, the strategic shift from carbon fiber to stainless steel for the rocket structure, and the importance of rigorous engineering reviews and urgency in his companies. He touches on xAI’s approach to AI development, emphasizing engineering over pure research, and the future of AI-powered digital coworkers and robotics.
Mark Zuckerberg, Priscilla Chan MetaHosted by Alessio Fanelli, Swyx
The Chan Zuckerberg Initiative (CZI) focuses on building interdisciplinary Biohubs that combine biology and AI to accelerate curing and preventing disease.
In this episode of The Cognitive Revolution, Mark Zuckerberg and Priscilla Chan discuss the 10-year journey and future vision of the Chan Zuckerberg Initiative's Biohub, focusing on the intersection of AI and biology. They emphasize the unique approach of integrating frontier biology and frontier AI labs to accelerate scientific discovery, develop advanced biological tools, and ultimately cure or prevent all diseases. The conversation highlights the importance of building large-scale, high-quality biological datasets and sophisticated AI models, such as virtual cell simulations, to revolutionize precision medicine and drug discovery.
Yi Tay Google DeepMindHosted by Alessio Fanelli, Swyx
Gemini model at DeepMind achieved gold at the IMO by unifying reasoning and symbolic systems into a single large model.
In this episode of Latent Space, Yi Tay discusses his return to Google DeepMind (GDM) and his involvement in the Gemini and Deep Think projects aimed at advancing reasoning and AGI. He highlights the bold decision to unify specialized systems into a single model, exemplified by the Gemini model achieving gold at the International Mathematical Olympiad (IMO). Yi Tay explains the importance of on-policy reinforcement learning (RL) over imitation learning for robust model training, drawing analogies to human learning. He also shares insights on the evolving role of AI coding assistants in research productivity.
Zevi Arnovitz MetaHosted by Lenny
Zevi Arnovitz, a non-technical PM at Meta, uses the AI tools Cursor and Claude Code to build and ship products without coding expertise.
Zevi Arnovitz, a non-technical product manager at Meta, shares his hands-on experience using AI tools like Cursor and Claude Code to build and ship real products without writing traditional code. He describes a workflow involving AI-assisted issue creation, exploration, planning, execution, and multi-model code review that enables him to manage complex product development independently. Zevi emphasizes gradual learning, starting with simpler GPT projects before moving to more powerful tools like Cursor, and highlights the importance of iterative prompt and documentation improvements to reduce AI errors over time.
Brandon Hootman NVIDIAHosted by Noah Kravitz
Caterpillar uses NVIDIA’s AI ecosystem (hardware, software, digital twins) to enhance manufacturing, supply chain, and machine operations.
In this episode of the NVIDIA AI Podcast, Brandon Hootman, Vice President of Data and AI at Caterpillar, discusses how Caterpillar is leveraging AI and edge computing to transform heavy machinery operations, manufacturing, and supply chain management. Caterpillar is integrating NVIDIA's AI ecosystem, including edge platforms like Thor, to enable real-time AI assistance in machine cabs, improving operator safety, efficiency, and productivity. The collaboration accelerates AI adoption across Caterpillar’s enterprise, from factory digital twins to autonomous machines on dynamic construction sites.
Jensen Huang NVIDIAHosted by Jackson
Significant AI improvements in reasoning, grounding, and integration with search have enhanced accuracy and trustworthiness in 2025.
In this episode of No Priors, Jensen Huang, CEO of NVIDIA, reflects on the major AI advancements of 2025, emphasizing breakthroughs in reasoning, grounding, and integration of AI with search and robotics. He highlights the industry's progress in addressing hallucination issues and improving AI's reliability across language, vision, and robotics applications. Huang discusses the evolving narrative around AI and jobs, explaining how AI automates tasks but enhances the purpose of jobs, leading to increased productivity and new job creation, especially in chip manufacturing, supercomputing, and AI factory infrastructure. He also addresses the importance of open source AI for innovation and startups, the nuanced US-China relationship in AI development, and the decreasing costs of AI compute and training driven by hardware and algorithmic improvements.
Reed Hastings AnthropicHosted by Patrick O'Shaughnessy
Netflix's success is rooted in scaling a simple core idea over decades and maintaining exceptionally high talent density.
In this episode of Invest Like the Best, Reed Hastings, co-founder and former CEO of Netflix, shares deep insights into building Netflix's business model, emphasizing the importance of talent density and long-term focus on simple core ideas. Hastings discusses Netflix's evolution from DVD mail service to streaming giant, the strategic bets on original content, and the challenges of managing high talent standards and organizational culture. He reflects on key decisions, including the Qwikster DVD spin-off, and the value of informed, individual decision-making over consensus.