3D generation

30 articles about 3D generation in AI news

PartRAG Revolutionizes 3D Generation with Retrieval-Augmented Part-Level Control

Researchers introduce PartRAG, a breakthrough framework that combines retrieval-augmented generation with diffusion transformers for precise part-level 3D creation and editing from single images. The system achieves superior geometric accuracy while enabling localized modifications without regenerating entire objects.

70% relevant

Meshcraft Democratizes 3D Creation: Multi-Engine AI Platform Bridges Text-to-3D Gap

Meshcraft emerges as a web-based platform offering text-to-3D and image-to-3D generation with selectable AI engines. The tool provides both free and premium options, addressing quality bottlenecks in 3D generation through engine optimization rather than image model refinement.

80% relevant

Browser-Based Text-to-CAD Tool Emerges, Enabling Local 3D Model Generation from Prompts

A developer has built a text-to-CAD application that operates entirely within a web browser, enabling local generation and manipulation of 3D models from natural language descriptions. This approach eliminates cloud dependency and could lower barriers for rapid prototyping.

87% relevant

Developer Open-Sources 'Prompt-to-3D' Tool for Instant, Navigable World Generation

A developer has released an open-source tool that creates interactive 3D worlds from text or image inputs. This moves 3D asset generation from static models to instant, explorable environments.

91% relevant

NVIDIA Releases Brain MRI Generation Model on Hugging Face: 3D Latent Diffusion for T1, FLAIR, T2, and SWI Scans

NVIDIA has open-sourced a 3D latent diffusion model for generating high-resolution brain MRI scans across four modalities (T1, FLAIR, T2, and SWI). The model reportedly achieves state-of-the-art FID scores and 33× faster inference than prior methods.

95% relevant

Text-to-Game AI Emerges: How a Single Prompt Can Now Generate Complete 3D Worlds

A new AI system can transform simple text descriptions into fully playable 3D games complete with NPCs, physics, multiplayer capabilities, and persistent worlds. The development marks a major advance in procedural content generation and could open game development to a wider audience.

85% relevant

BetterScene Bridges the Gap: How Aligning AI Representations Unlocks Photorealistic 3D Synthesis

Researchers introduce BetterScene, a method that improves 3D scene generation from just a handful of photos. By aligning the internal representations of a video diffusion model, it produces consistent, artifact-free novel views, with applications in computational photography and virtual world creation.

78% relevant

How to Build a 3D Engine with Claude Code: The Demoscene Case Study

A developer used Claude Code to build a complete 3D engine from scratch. Here are the actionable prompting techniques and CLAUDE.md strategies that made it work.

90% relevant

QuatRoPE: New Positional Embedding Enables Linear-Scale 3D Spatial Reasoning in LLMs, Outperforming Quadratic Methods

Researchers propose QuatRoPE, a novel positional embedding method that encodes 3D object relations with linear input scaling. Paired with IGRE, it improves spatial reasoning in LLMs while preserving their original language capabilities.

79% relevant

Momentum-Consistency Fine-Tuning (MCFT) Achieves 3.30% Gain in 5-Shot 3D Vision Tasks Without Adapters

Researchers propose MCFT, an adapter-free fine-tuning method for 3D point cloud models that selectively updates encoder parameters with momentum constraints. It outperforms prior methods by 3.30% in 5-shot settings and maintains original inference latency.

75% relevant

NVIDIA Releases NVPanoptix-3D on Hugging Face: Single-Image 3D Indoor Scene Reconstruction

NVIDIA has open-sourced NVPanoptix-3D, a model that reconstructs complete 3D indoor scenes—including panoptic segmentation, depth, and geometry—from a single RGB image in one forward pass.

90% relevant

Open-Source 'AI Office' Platform Lets Users Walk Through 3D Space to Monitor Autonomous Agents

An open-source project called AI Office creates a 3D virtual workspace where AI agents are visualized as avatars performing tasks. Users can navigate the space instead of reading logs, offering a novel interface for multi-agent systems.

85% relevant

NVIDIA DLSS 5 Demo Shows 3D-Guided Neural Rendering for Next-Gen Upscaling

A leaked demo of NVIDIA's upcoming DLSS 5 technology showcases 3D-guided neural rendering, promising a significant leap in image reconstruction quality for real-time graphics.

85% relevant

New Research Improves Text-to-3D Motion Retrieval with Interpretable Fine-Grained Alignment

Researchers propose a novel method for retrieving 3D human motion sequences from text descriptions using joint-angle motion images and token-patch interaction. It outperforms state-of-the-art methods on standard benchmarks while offering interpretable correspondences.

75% relevant

From Flat Images to 3D Worlds: How Persistent 3D State Models Will Revolutionize Virtual Try-On and Digital Showrooms

PERSIST introduces world models with persistent 3D scene memory, enabling coherent, evolving 3D environments from single images. For luxury retail, this could mean photorealistic virtual try-on with realistic garment physics and immersive digital showrooms that customers can explore and customize.

60% relevant

Freepik's Imagen Nano 2: Democratizing AI Image Generation with Google's Compact Model

Freepik has launched Imagen Nano 2, a significantly upgraded version of Google's lightweight image generation model. The new iteration promises faster performance, reduced computational requirements, and greater affordability, potentially making AI image creation accessible to more users.

85% relevant

VGGT-Det: How AI Is Learning to See in 3D Without Camera Calibration

Researchers have developed VGGT-Det, a breakthrough framework for multi-view 3D object detection that works without calibrated camera poses. The system mines internal geometric priors through attention mechanisms, outperforming traditional methods in indoor environments.

85% relevant

AI Game Engine Breakthrough: Complete 3D Worlds Generated in Seconds

A new AI system can generate fully functional 3D games in seconds, complete with interactive worlds, moving characters, and working gameplay systems. The browser-based technology marks a major advance in procedural content creation.

95% relevant

From Prompt to Playable: New AI Platform Generates Complete 3D Games Instantly

A groundbreaking AI system can now transform simple text prompts into fully functional 3D games complete with NPCs, physics, multiplayer capabilities, and persistent worlds. Backed by NVIDIA and YouTube's co-founder with $28M in funding, this represents a seismic shift in game development.

95% relevant

The Next Platform Shift: How Persistent 3D World Models Are Becoming the New Programmable Interface

A new collaboration between Baseten and World Labs signals a paradigm shift where persistent 3D world models become programmable platforms, potentially rivaling the transformative impact of large language models through accessible developer APIs.

85% relevant

Sparse Sensors, Rich Views: How Minimal Radar Data Supercharges AI Scene Generation

Researchers have developed a novel approach that combines single images with extremely sparse radar or LiDAR data to dramatically improve AI's ability to generate realistic 3D views from 2D photos. This multimodal technique overcomes fundamental limitations of vision-only systems in challenging conditions like bad weather and low texture.

70% relevant

Zatom-1: The First Unified AI Model for 3D Molecular and Materials Science

Researchers have developed Zatom-1, the first foundation model that simultaneously handles generative and predictive tasks for both molecules and materials. This multimodal flow matching approach enables faster sampling and improved accuracy across chemical domains.

75% relevant

DeepMind's Diffusion Breakthrough: Training Better Latents for Superior AI Generation

Google DeepMind researchers have developed new techniques for training latent representations in diffusion models, potentially leading to more efficient, higher-quality AI-generated content across images, audio, and video domains.

85% relevant

OpenCAD Browser Tool Enables Local, Private Text-to-CAD Conversion Without Cloud API

A developer has released an open-source text-to-CAD tool that runs entirely in a user's browser, enabling private, local 3D model generation from natural language descriptions. This approach bypasses cloud API costs and data privacy issues inherent in most current AI CAD solutions.

89% relevant

OpenSCAD Web: Open-Source Text-to-CAD Tool Runs Fully In-Browser via WebAssembly

A developer has released an open-source text-to-CAD tool that runs entirely in a web browser using WebAssembly. Users describe a 3D object in plain English, optionally upload a reference image, and receive a parametric model with adjustable dimensions that exports directly to 3D printer formats.

85% relevant

Higgsfield AI Pays Bartender $1M+ for Face Scan to Train AI Video Model Diffuse

AI startup Higgsfield paid a New Jersey bartender over $1 million for a full-face 3D scan to train its text-to-video model Diffuse. The deal highlights the emerging market for high-fidelity biometric data to create photorealistic digital humans.

85% relevant

Vision AI Trends 2026: Manufacturing, Warehouse Automation, and Luxury Authentication Enter Visual Data Era

A 2026 trends report highlights Vision AI's expansion into manufacturing quality inspection, warehouse automation, and luxury brand authentication, marking a shift toward 3D visual data systems. This reflects the maturation of computer vision beyond basic recognition into operational and trust applications.

95% relevant

BrepCoder: The AI That Speaks CAD's Native Language

Researchers have developed BrepCoder, a multimodal AI that understands CAD designs in their native B-rep format. By treating 3D models as structured code, it performs multiple engineering tasks without task-specific retraining, potentially revolutionizing design automation.

75% relevant

The One-Stop AI Platform Revolution: GlobalGPT Consolidates 100+ Models Without Barriers

GlobalGPT has launched a unified platform offering access to over 100 AI models for image and video generation without waitlists, restrictions, or invite codes. This consolidation represents a significant shift toward democratizing advanced AI tools for creators and businesses alike.

85% relevant

From Prompt to Play: How AI is Building Entire Games in Minutes

A developer has created 'Riftwater,' a sci-fi fishing game in which every element, from 3D assets to NPC behavior, is generated through prompt-based AI. The project demonstrates how AI is evolving from a content assistant into a full game development engine.

85% relevant