Spatial Intelligence: AI's Next Frontier

Fei-Fei Li explains why spatial intelligence is AI's next breakthrough, enabling machines to understand and interact with the physical world.

by HowAIWorks Team
Introduction

In a landmark essay published in November 2025, Dr. Fei-Fei Li, one of the most influential figures in modern AI research, argues that spatial intelligence represents AI's next frontier—the capability that will transform how machines understand, reason about, and interact with the physical world. While large language models (LLMs) have revolutionized how we work with abstract knowledge, they remain "wordsmiths in the dark"—eloquent but ungrounded in physical reality.

Spatial intelligence, Li explains, is the scaffolding upon which human cognition is built. It enables us to park a car by visualizing the gap between bumper and curb, catch keys tossed across a room, navigate crowded spaces, and pour coffee without looking. More fundamentally, it drives our imagination, creativity, and ability to reason about physical spaces—capabilities that today's AI systems largely lack.

This vision has guided Li's career, from building ImageNet (one of three key elements that enabled modern AI, alongside neural network algorithms and GPU computing) to her current work at World Labs, where she and her cofounders are building world models: a new type of generative AI that can understand, reason about, and generate semantically, physically, geometrically, and dynamically complex worlds. The implications range from revolutionizing storytelling and creativity to enabling truly autonomous robots and accelerating scientific discovery.

What is Spatial Intelligence?

The foundation of human cognition

Spatial intelligence is the ability to perceive, understand, reason about, and interact with physical spaces and objects in three dimensions. It's fundamental to how humans navigate the world, from the most ordinary acts—like catching a ball or navigating a room—to complex professional tasks like firefighters assessing building stability through smoke or architects visualizing structures before construction.

Li traces spatial intelligence's evolutionary origins: long before animals could nest, communicate with language, or build civilizations, the simple act of sensing created a bridge between perception and survival. This bridge grew stronger over generations, forming nervous systems that interpret the world and coordinate interactions. Perception and action became the core loop driving the evolution of intelligence, forming the foundation for human cognition.

Beyond language and text

While today's LLMs excel at reading, writing, and pattern recognition, they have fundamental limitations when representing or interacting with the physical world. State-of-the-art multimodal LLMs (MLLMs) rarely perform better than chance at estimating distance, orientation, and size, or at "mentally" rotating objects. They can't navigate mazes, recognize shortcuts, or predict basic physics. AI-generated videos, while impressive, often lose coherence after a few seconds.
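
To make the "mentally rotating objects" task concrete, the sketch below poses a toy version of it: decide whether two blocky 3D shapes are the same object seen from different orientations. It is an illustration only, not the benchmark behind the claim above. The shape representation (unit cubes on an integer grid), the function names, and the restriction to 90-degree turns are all assumptions made for the example.

```python
from itertools import permutations, product


def rotations_24():
    """Yield the 24 axis-aligned rotations as (axis permutation, signs) pairs."""
    for perm in permutations(range(3)):
        inversions = sum(
            1 for i in range(3) for j in range(i + 1, 3) if perm[i] > perm[j]
        )
        for signs in product((1, -1), repeat=3):
            # Keep only determinant +1, which excludes mirror reflections.
            det = (-1) ** inversions * signs[0] * signs[1] * signs[2]
            if det == 1:
                yield perm, signs


def apply_rotation(rotation, cells):
    """Rotate every cell: new axis a takes the old axis perm[a], times signs[a]."""
    perm, signs = rotation
    return [tuple(signs[a] * cell[perm[a]] for a in range(3)) for cell in cells]


def normalize(cells):
    """Shift the shape so its minimum coordinate on each axis is zero."""
    mins = [min(cell[a] for cell in cells) for a in range(3)]
    return frozenset(tuple(cell[a] - mins[a] for a in range(3)) for cell in cells)


def same_shape(a, b):
    """True if shape a can be turned (never mirrored) to coincide with shape b."""
    target = normalize(b)
    return any(normalize(apply_rotation(r, a)) == target for r in rotations_24())


# A four-cube "screw" shape, a copy turned 90 degrees about the z axis,
# and a mirror image (x negated). Hypothetical test shapes for this sketch.
original = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
rotated = [(-y, x, z) for (x, y, z) in original]
mirrored = [(-x, y, z) for (x, y, z) in original]

print("rotated copy matches:", same_shape(original, rotated))
print("mirror image matches:", same_shape(original, mirrored))
```

Exhaustively trying every orientation is trivial for a program that already holds an explicit 3D representation. The difficulty the article describes is that a model working from pixels or text has to build that representation first.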

Spatial intelligence represents the frontier beyond language—the capability that links imagination, perception, and action. It enables understanding not just what we're looking at, but how everything relates spatially, what it means, and why it matters. Without it, AI remains disconnected from physical reality, unable to effectively drive cars, guide robots, enable immersive experiences, or accelerate discovery in materials science and medicine.

Historical examples of spatial intelligence

Throughout history, spatial intelligence has driven civilization-defining breakthroughs:

  • Eratosthenes (Ancient Greece): Transformed shadows into geometry, measuring a 7-degree angle in Alexandria at the exact moment the sun cast no shadow in Syene, to calculate Earth's circumference
  • James Hargreaves's "Spinning Jenny": Revolutionized textile manufacturing through a spatial insight: arranging multiple spindles side by side, so one worker could spin several threads at once, increased productivity eightfold
  • Watson and Crick's DNA model: Worked out DNA's structure by physically building 3D molecular models, manipulating metal plates and wire until the spatial arrangement clicked into place

In each case, spatial intelligence enabled breakthroughs that couldn't be captured in text alone—requiring manipulation of objects, visualization of structures, and reasoning about physical spaces.
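
The Eratosthenes example reduces to a single proportion: the shadow angle is to a full circle as the Alexandria-Syene distance is to Earth's circumference. The sketch below works through that arithmetic. The function name is illustrative, and the inputs are assumptions not given in this article: 5,000 stadia is the distance traditionally attributed to Eratosthenes, 7.2 degrees (one fiftieth of a circle) is the traditionally reported angle that this article rounds to 7, and 800 km is a rough modern estimate of the same distance.

```python
def circumference_from_shadow(angle_deg: float, distance: float) -> float:
    """Scale the arc between two cities up to a full 360-degree circle.

    angle_deg: shadow angle at the northern city while the sun is directly
               overhead at the southern one (equal to the arc between them).
    distance:  ground distance between the cities, in any unit.
    """
    if not 0 < angle_deg <= 360:
        raise ValueError("angle_deg must be in (0, 360]")
    return distance * 360.0 / angle_deg


# Assumed inputs (not from the article): the traditional 7.2 degrees and
# 5,000 stadia, plus a rough modern distance of 800 km.
in_stadia = circumference_from_shadow(7.2, 5000)
in_km = circumference_from_shadow(7.2, 800)

print(f"about {in_stadia:,.0f} stadia")
print(f"about {in_km:,.0f} km")

# The article's rounded 7-degree angle gives a slightly larger estimate.
rounded_km = circumference_from_shadow(7.0, 800)
print(f"about {rounded_km:,.0f} km with a 7-degree angle")
```

Under these assumptions the estimate comes out near 40,000 km, close to the modern value. The geometry does all the work: one angle and one distance are enough, provided the sun's rays are treated as parallel.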

Building spatially intelligent AI: World models

The challenge beyond LLMs

Building spatially intelligent AI requires something more ambitious than LLMs: world models, a new type of generative model whose capabilities far exceed today's language models. World models can understand, reason about, generate, and interact with semantically, physically, geometrically, and dynamically complex worlds—both virtual and real.

The field is nascent, with current approaches including:

  • 3D generation models that create objects and scenes
  • Physics simulators that model dynamics
  • Embodied AI systems that learn through interaction

However, truly general world models that combine all these capabilities remain an active area of research and development.

World Labs' approach

Li co-founded World Labs with Justin Johnson, Christoph Lassner, and Ben Mildenhall to realize spatial intelligence in full. The company is building world models that can:

  • Understand complex 3D scenes semantically and geometrically
  • Reason about physical properties, dynamics, and interactions
  • Generate coherent, explorable 3D worlds from text or other inputs
  • Interact with environments in ways that respect physics and geometry

Their Marble platform puts unprecedented spatial capabilities and editorial controllability in the hands of creators, allowing rapid creation and iteration of fully explorable 3D worlds without conventional 3D design software overhead.

Applications of spatial intelligence

Creativity and storytelling

Spatial intelligence will transform how we create and experience narratives:

  • Narrative experiences in new dimensions: Filmmakers and game designers can conjure entire worlds without budget or geography constraints, exploring scenes and perspectives intractable in traditional production pipelines
  • Spatial narratives through design: Architects can quickly visualize structures before investing months into designs, walking through spaces that don't yet exist. Industrial and fashion designers can translate imagination into form instantly
  • New immersive experiences: Combined with VR and XR headsets, spatial intelligence makes world-building accessible not just to studios but to individual creators, educators, and anyone with a vision to share

Robotics: Embodied intelligence in action

Spatial intelligence is essential for autonomous robots:

  • Scaling robotic learning: World models can rapidly close the gap between simulation and reality, helping train robots across countless states, interactions, and environments—addressing the critical shortage of training data in robotics
  • Companions and collaborators: Robots aiding scientists or assisting seniors need spatial intelligence that perceives, reasons, plans, and acts while staying empathetically aligned with human goals
  • Expanding forms of embodiment: From nanobots delivering medicine to soft robots navigating tight spaces, future spatial intelligence models must integrate both environments and embodied perception and movement

Science, healthcare, and education

Spatial intelligence's impact extends to fields where AI can enhance human capability:

  • Scientific research: Spatially intelligent systems can simulate experiments, test hypotheses in parallel, and explore environments inaccessible to humans—from deep oceans to distant planets. This can transform computational modeling in climate science and materials research
  • Healthcare: AI can accelerate drug discovery by modeling molecular interactions in multiple dimensions, enhance diagnostics by helping radiologists spot patterns in medical imaging, and enable ambient monitoring systems that support patients and caregivers
  • Education: Spatial intelligence enables immersive learning that makes abstract concepts tangible. Students can explore cellular machinery or walk through historical events in multiple dimensions. Teachers gain tools to personalize instruction through interactive environments

Why spatial intelligence matters now

The limits of current AI

Today's AI excels at:

  • Reading and writing
  • Research and pattern recognition
  • Generating text, images, and short videos

But it struggles with:

  • Understanding physical spaces and relationships
  • Reasoning about physics and dynamics
  • Interacting with the real world
  • Maintaining coherence in longer sequences

These limitations prevent AI from effectively driving cars, guiding robots in homes and hospitals, enabling truly immersive experiences, or accelerating discovery in materials science and medicine.

The path forward

Spatial intelligence represents the next major leap in AI capabilities. As Li writes, "Almost a half billion years after nature unleashed the first glimmers of spatial intelligence in the ancestral animals, we're lucky enough to find ourselves among the generation of technologists who may soon endow machines with the same capability."

The goal isn't to replace human judgment, creativity, and empathy, but to augment human expertise, accelerate discovery, and amplify human care. Spatially intelligent AI can help us understand diseases, revolutionize storytelling, and support us in vulnerable moments—elevating the aspects of life we care about most.

Conclusion

Dr. Fei-Fei Li's vision of spatial intelligence as AI's next frontier represents a fundamental shift from language-centric models to systems that truly understand and interact with the physical world. While LLMs have transformed how we work with abstract knowledge, spatial intelligence will enable machines to reason about, create, and navigate both real and virtual worlds.

The implications are profound: from revolutionizing creativity and storytelling through platforms like World Labs' Marble, to enabling truly autonomous robots, to accelerating scientific discovery and enhancing healthcare and education. As world models mature, they'll unlock capabilities that today's AI systems cannot achieve—bridging the gap between abstract knowledge and physical reality.

The pursuit of spatial intelligence has been Li's North Star throughout her career, from ImageNet to her current work at World Labs. As this technology develops, it promises to transform not just what AI can do, but how we create, learn, discover, and interact with the world around us.

Frequently Asked Questions

What is spatial intelligence?
Spatial intelligence is the ability to perceive, understand, reason about, and interact with physical spaces and objects in three dimensions. It enables AI systems to understand not just what they're looking at, but how everything relates spatially, what it means, and why it matters.

Why does spatial intelligence matter for AI?
While large language models excel at abstract knowledge, they struggle with understanding physical spaces, reasoning about physics, and interacting with the real world. Spatial intelligence bridges this gap, enabling AI to drive cars, guide robots, enable immersive experiences, and accelerate scientific discovery.

What are world models?
World models are a new type of generative AI that can understand, reason about, generate, and interact with semantically, physically, geometrically, and dynamically complex worlds, both virtual and real. They represent a more ambitious approach than LLMs for building spatially intelligent AI.

What are the applications of spatial intelligence?
Spatial intelligence has applications in creativity and storytelling (3D world creation), robotics (autonomous navigation and manipulation), scientific research (simulation and modeling), healthcare (drug discovery and diagnostics), and education (immersive learning experiences).

What is World Labs?
World Labs is a company co-founded by Fei-Fei Li to realize spatial intelligence. Their Marble platform puts unprecedented spatial capabilities in the hands of creators, allowing rapid creation and iteration of fully explorable 3D worlds without conventional 3D design software overhead.

How does spatial intelligence differ from what current AI can do?
Current AI excels at reading, writing, and pattern recognition but struggles with understanding physical spaces, reasoning about physics, and interacting with the real world. Spatial intelligence enables machines to reason about, create, and navigate both real and virtual worlds.