Introduction
In a landmark essay published in November 2025, Dr. Fei-Fei Li, one of the most influential figures in modern AI research, argues that spatial intelligence represents AI's next frontier—the capability that will transform how machines understand, reason about, and interact with the physical world. While large language models (LLMs) have revolutionized how we work with abstract knowledge, they remain "wordsmiths in the dark"—eloquent but ungrounded in physical reality.
Spatial intelligence, Li explains, is the scaffolding upon which human cognition is built. It enables us to park a car by visualizing the gap between bumper and curb, catch keys tossed across a room, navigate crowded spaces, and pour coffee without looking. More fundamentally, it drives our imagination, creativity, and ability to reason about physical spaces—capabilities that today's AI systems largely lack.
This vision has guided Li's career, from building ImageNet (one of three key elements that enabled modern AI, alongside neural network algorithms and GPU computing) to her current work at World Labs, where she and her cofounders are building world models: a new type of generative AI meant to understand, reason about, and generate semantically, physically, geometrically, and dynamically complex worlds. The implications range from revolutionizing storytelling and creativity to enabling truly autonomous robots and accelerating scientific discovery.
What is Spatial Intelligence?
The foundation of human cognition
Spatial intelligence is the ability to perceive, understand, reason about, and interact with physical spaces and objects in three dimensions. It's fundamental to how humans navigate the world, from the most ordinary acts—like catching a ball or navigating a room—to complex professional tasks like firefighters assessing building stability through smoke or architects visualizing structures before construction.
Li traces spatial intelligence's evolutionary origins: long before animals could nest, communicate with language, or build civilizations, the simple act of sensing created a bridge between perception and survival. This bridge grew stronger over generations, forming nervous systems that interpret the world and coordinate interactions. Perception and action became the core loop driving the evolution of intelligence, forming the foundation for human cognition.
Beyond language and text
While today's LLMs excel at reading, writing, and pattern recognition, they have fundamental limitations in representing or interacting with the physical world. State-of-the-art multimodal LLMs (MLLMs) rarely perform better than chance at estimating distance, orientation, and size, or at "mentally" rotating objects. They can't navigate mazes, recognize shortcuts, or predict basic physics. AI-generated videos, while impressive, often lose coherence after a few seconds.
Spatial intelligence represents the frontier beyond language—the capability that links imagination, perception, and action. It enables understanding not just what we're looking at, but how everything relates spatially, what it means, and why it matters. Without it, AI remains disconnected from physical reality, unable to effectively drive cars, guide robots, enable immersive experiences, or accelerate discovery in materials science and medicine.
Historical examples of spatial intelligence
Throughout history, spatial intelligence has driven civilization-defining breakthroughs:
- Eratosthenes (Ancient Greece): Turned shadows into geometry. He measured a 7-degree shadow angle in Alexandria at the exact moment the sun cast no shadow in Syene, then used that angle to calculate Earth's circumference
- Hargreaves's "Spinning Jenny": Revolutionized textile manufacturing through a spatial insight: arranging multiple spindles side by side in a single frame let one worker spin several threads at once, increasing productivity eightfold
- Watson and Crick's DNA model: Uncovered DNA's structure by physically building 3D molecular models, manipulating metal plates and wire until the spatial arrangement clicked into place
In each case, spatial intelligence enabled breakthroughs that couldn't be captured in text alone—requiring manipulation of objects, visualization of structures, and reasoning about physical spaces.
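Eratosthenes's reasoning can be made concrete with a little arithmetic. If the shadow angle in Alexandria equals the arc between the two cities as seen from Earth's center, then the full circumference is the distance between the cities scaled up by 360 divided by that angle. The sketch below is only an illustration: the 7-degree angle comes from the text above, while the 5,000-stadia distance and the stadion-to-meter conversion are assumptions taken from the traditional account, not from Li's essay, and the function name is hypothetical.

```python
def circumference_from_shadow(angle_deg: float, distance: float) -> float:
    """Scale the arc between two cities up to a full 360-degree circle.

    angle_deg: shadow angle measured at the northern city, in degrees.
    distance:  north-south distance between the cities (any unit).
    Returns the circumference in the same unit as `distance`.
    """
    if not 0 < angle_deg < 360:
        raise ValueError("angle must be between 0 and 360 degrees")
    return distance * 360.0 / angle_deg


# Assumed inputs, not stated in the essay: the traditional 5,000 stadia
# between Alexandria and Syene, and about 157.5 m per stadion (one of
# several conversions historians have proposed).
ASSUMED_DISTANCE_STADIA = 5_000
ASSUMED_METERS_PER_STADION = 157.5

stadia = circumference_from_shadow(7.0, ASSUMED_DISTANCE_STADIA)
km = stadia * ASSUMED_METERS_PER_STADION / 1000
print(f"{stadia:,.0f} stadia, roughly {km:,.0f} km")
```

Under these assumptions the estimate lands close to the modern figure of about 40,000 km, although the result depends heavily on which stadion length is chosen.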
Building spatially intelligent AI: World models
The challenge beyond LLMs
Building spatially intelligent AI requires something more ambitious than LLMs: world models, a new type of generative model whose intended capabilities go far beyond those of today's language models. A world model should be able to understand, reason about, generate, and interact with semantically, physically, geometrically, and dynamically complex worlds, both virtual and real.
The field is nascent, with current approaches including:
- 3D generation models that create objects and scenes
- Physics simulators that model dynamics
- Embodied AI systems that learn through interaction
However, truly general world models that combine all these capabilities remain an active area of research and development.
World Labs' approach
Li co-founded World Labs with Justin Johnson, Christoph Lassner, and Ben Mildenhall to realize spatial intelligence in full. The company is building world models that can:
- Understand complex 3D scenes semantically and geometrically
- Reason about physical properties, dynamics, and interactions
- Generate coherent, explorable 3D worlds from text or other inputs
- Interact with environments in ways that respect physics and geometry
Their Marble platform puts unprecedented spatial capabilities and editorial controllability in the hands of creators, allowing rapid creation and iteration of fully explorable 3D worlds without the overhead of conventional 3D design software.
Applications of spatial intelligence
Creativity and storytelling
Spatial intelligence will transform how we create and experience narratives:
- Narrative experiences in new dimensions: Filmmakers and game designers can conjure entire worlds without budget or geography constraints, exploring scenes and perspectives intractable in traditional production pipelines
- Spatial narratives through design: Architects can quickly visualize structures before investing months into designs, walking through spaces that don't yet exist. Industrial and fashion designers can translate imagination into form instantly
- New immersive experiences: Combined with VR and XR headsets, spatial intelligence makes world-building accessible not just to studios but to individual creators, educators, and anyone with a vision to share
Robotics: Embodied intelligence in action
Spatial intelligence is essential for autonomous robots:
- Scaling robotic learning: World models can rapidly close the gap between simulation and reality, helping train robots across countless states, interactions, and environments—addressing the critical shortage of training data in robotics
- Companions and collaborators: Robots aiding scientists or assisting seniors need spatial intelligence that perceives, reasons, plans, and acts while staying empathetically aligned with human goals
- Expanding forms of embodiment: From nanobots delivering medicine to soft robots navigating tight spaces, future spatial intelligence models must account for both the environments these robots operate in and their own embodied perception and movement
Science, healthcare, and education
Spatial intelligence's impact extends to fields where AI can enhance human capability:
- Scientific research: Spatially intelligent systems can simulate experiments, test hypotheses in parallel, and explore environments inaccessible to humans—from deep oceans to distant planets. This can transform computational modeling in climate science and materials research
- Healthcare: AI can accelerate drug discovery by modeling molecular interactions in multiple dimensions, enhance diagnostics by helping radiologists spot patterns in medical imaging, and enable ambient monitoring systems that support patients and caregivers
- Education: Spatial intelligence enables immersive learning that makes abstract concepts tangible. Students can explore cellular machinery or walk through historical events in multiple dimensions. Teachers gain tools to personalize instruction through interactive environments
Why spatial intelligence matters now
The limits of current AI
Today's AI excels at:
- Reading and writing
- Research and pattern recognition
- Generating text, images, and short videos
But it struggles with:
- Understanding physical spaces and relationships
- Reasoning about physics and dynamics
- Interacting with the real world
- Maintaining coherence in longer sequences
These limitations prevent AI from effectively driving cars, guiding robots in homes and hospitals, enabling truly immersive experiences, or accelerating discovery in materials science and medicine.
The path forward
Spatial intelligence represents the next major leap in AI capabilities. As Li writes, "Almost a half billion years after nature unleashed the first glimmers of spatial intelligence in the ancestral animals, we're lucky enough to find ourselves among the generation of technologists who may soon endow machines with the same capability."
The goal isn't to replace human judgment, creativity, and empathy, but to augment human expertise, accelerate discovery, and amplify human care. Spatially intelligent AI can help us understand diseases, revolutionize storytelling, and support us in vulnerable moments—elevating the aspects of life we care about most.
Conclusion
Dr. Fei-Fei Li's vision of spatial intelligence as AI's next frontier represents a fundamental shift from language-centric models to systems that truly understand and interact with the physical world. While LLMs have transformed how we work with abstract knowledge, spatial intelligence will enable machines to reason about, create, and navigate both real and virtual worlds.
The implications are profound: from revolutionizing creativity and storytelling through platforms like World Labs' Marble, to enabling truly autonomous robots, to accelerating scientific discovery and enhancing healthcare and education. As world models mature, they'll unlock capabilities that today's AI systems cannot achieve—bridging the gap between abstract knowledge and physical reality.
The pursuit of spatial intelligence has been Li's North Star throughout her career, from ImageNet to her current work at World Labs. As this technology develops, it promises to transform not just what AI can do, but how we create, learn, discover, and interact with the world around us.