data science

30 articles about data science in AI news

Claude Code Now Integrates with Google Colab via Official MCP Server

Google released an official, open-source MCP server for Google Colab, enabling Claude Code to automate data science workflows directly from your terminal.

100% relevant

How Netflix's Recommendation System Works: A Technical Breakdown

An explainer on the data science behind Netflix's recommendation engine, covering collaborative filtering, content-based filtering, and hybrid approaches. This provides a foundational understanding of personalization systems relevant to retail.

75% relevant

OpenCSF: A 1.5TB Free Computer Science Library Emerges from Unstructured Web Data

A new open-source dataset, OpenCSF, compiles 1.5TB of computer science materials scraped from public web sources. It provides a massive, free corpus for AI training and research in software engineering and CS education.

85% relevant

Anthropic Launches Dedicated Science Blog to Chronicle AI Research and Applications

Anthropic has launched a new Science Blog to publish its research and case studies on using AI to accelerate scientific discovery, aligning with its mission to increase the pace of scientific progress.

85% relevant

Nvidia's Jensen Huang Dismisses Custom AI Chip Threat: 'Science Projects' Versus 'AI Factories'

Nvidia CEO Jensen Huang confidently dismissed concerns about custom AI chips challenging Nvidia's dominance, framing competitors' efforts as 'science projects' while Nvidia builds revenue-generating 'AI factories' with a complete platform approach.

85% relevant

Google's TITANS Architecture: A Neuroscience-Inspired Revolution in AI Memory

Google's TITANS architecture departs from the standard transformer design by applying cognitive neuroscience principles to adaptive memory. It enables test-time learning and addresses the quadratic scaling of attention cost with sequence length, a problem that has constrained AI development.

80% relevant

How a 50-Year-Old Computer Science Concept Just Outperformed Anthropic's Claude Code

A small startup has outperformed Anthropic's flagship Claude Code using a novel architecture based on persistent memory systems. This breakthrough demonstrates how classic computer science principles can solve modern AI limitations in context retention and reasoning.

70% relevant

Neuroscience Visualization: Time-Lapse Video Shows Lab-Cultured Neurons Forming Connections

A researcher shared a time-lapse video of actual neurons in a lab dish forming new connections. This raw visualization provides a direct, non-AI view of biological computation.

85% relevant

Microsoft's Phi-4-Vision: A Compact AI Model That Excels at Math, Science, and Understanding Interfaces

Microsoft has released Phi-4-reasoning-vision-15B, a 15-billion parameter open-weight multimodal model designed for tasks requiring both visual perception and selective reasoning. The compact model excels at scientific, mathematical, and GUI understanding while balancing compute efficiency.

85% relevant

Zatom-1: The First Unified AI Model for 3D Molecular and Materials Science

Researchers have developed Zatom-1, the first foundation model that simultaneously handles generative and predictive tasks for both molecules and materials. This multimodal flow matching approach enables faster sampling and improved accuracy across chemical domains.

75% relevant

BioBridge AI Merges Protein Science with Language Models for Breakthrough Biological Reasoning

Researchers introduce BioBridge, a novel AI framework that combines protein language models with general-purpose LLMs to enable enhanced biological reasoning. The system achieves state-of-the-art performance on protein benchmarks while maintaining general language understanding capabilities.

75% relevant

Boston Consulting Group on 'Speaking Your AI Agent’s Language'

BCG highlights the critical need for effective human-AI agent communication as a cornerstone of digital transformation, particularly in complex, regulated industries like life sciences. This principle is broadly applicable to retail.

80% relevant

AI Accelerates Genomic Discovery, Unlocking '7 Years of Potential in 30 Minutes'

An AI science-research technology is reportedly accelerating discovery in genomics at an unprecedented rate, described as unlocking seven years of potential work in just thirty minutes.

85% relevant

Mirendil: Ex-Anthropic Scientists Launch $1B Venture to Build AI That Thinks Like a Scientist

Former Anthropic researchers are raising $175M at a $1B valuation for Mirendil, a startup aiming to build AI systems for long-term scientific reasoning. The goal is to accelerate breakthroughs in biology and materials science, aligning with a broader industry push toward autonomous AI researchers.

100% relevant

The AI Trap: How Professors Are Fighting Back Against Student Over-Reliance on Language Models

University professors are deploying 'trap words' in digital assignments to catch students who blindly use AI for complex cognitive tasks. While science departments embrace these tools, literature professors report a collapse in students' ability to synthesize information independently.

85% relevant

From Bota to Enhe: The Dawn of Physical AI in Biomanufacturing

Bota Bio has rebranded as Enhe Technology and launched SAION AI, a pioneering Physical AI platform for biomanufacturing. The platform claims state-of-the-art performance across four key life science AI benchmarks, signaling a major shift in how biology is engineered.

87% relevant

Microsoft's EMPO²: A Memory-Augmented RL Framework That Supercharges LLM Agent Exploration

Microsoft has unveiled EMPO², a hybrid reinforcement learning framework that enhances LLM agents with augmented memory for true exploration. The system combines on- and off-policy optimization to discover novel states, achieving 128.6% performance gains over existing methods on ScienceWorld benchmarks.

85% relevant

RealChart2Code Benchmark Exposes Major Weakness in Vision-Language Models for Complex Data Visualization

A new benchmark reveals state-of-the-art Vision-Language Models struggle to generate code for complex, multi-panel charts from real-world data. Proprietary models outperform open-weight ones, but all show significant degradation versus simpler tasks.

72% relevant

Meta's TRIBE v2 Predicts Brain Activity from fMRI Data, Surpassing Real Scan Accuracy

Meta released TRIBE v2, a foundation model trained on 500+ hours of fMRI data from 700+ people. It predicts a new person's brain responses to sensory input without retraining, reportedly exceeding the accuracy of a real brain scan.

95% relevant

Georgia Tech Launches Free, Interactive Data Structure & Algorithm Visualization Tool

Researchers at Georgia Tech have released a free, web-based educational tool that generates real-time, interactive animations for data structures and algorithms. The platform aims to improve comprehension by visually demonstrating code execution step-by-step.

85% relevant

Palantir Maven + Anthropic Claude AI System Processes Classified Data to Generate 1,000 Military Targets in 24 Hours

The US military used Palantir's Maven platform integrated with Anthropic's Claude AI to analyze classified data streams and generate approximately 1,000 target packages within 24 hours, accelerating a workflow that previously took days or weeks.

97% relevant

Massive Open-Source Dataset of Computer Screen Recordings Released to Train AI Agents

Researchers have released the world's largest open-source dataset of computer-use recordings on Hugging Face. The collection contains 48,478 screen recording videos totaling approximately 12,300 hours of professional software usage, licensed under CC-BY-4.0 for AI training and evaluation.

97% relevant

Google's Groundsource: Using AI to Mine Historical Disaster Data from Global News

Google AI Research has unveiled Groundsource, a novel methodology using the Gemini model to transform unstructured global news reports into structured historical datasets. The system addresses critical data gaps in disaster management, starting with 2.6 million urban flash flood events.

75% relevant

Temporal Freedom: How Unrestricted Data Access Could Revolutionize LLM Performance

Researchers at Tsinghua University report that letting Large Language Models search freely through temporal data significantly outperforms both traditional rigid pipeline approaches and costly retrieval methods. The result suggests a paradigm shift in how AI information access is structured.

85% relevant

AI Bridges the Gap Between Data and Discovery: New Framework Aligns Scientific Observations with Decades of Literature

Researchers have developed a novel AI framework that aligns X-ray spectra with scientific literature using contrastive learning. This multimodal approach improves physical variable estimation by 16-18% and identifies high-priority astronomical targets, demonstrating how AI can accelerate scientific discovery by connecting data with domain knowledge.

75% relevant

Federated Fine-Tuning: How Luxury Brands Can Train AI on Private Client Data Without Centralizing It

ZorBA enables collaborative fine-tuning of large language models across distributed data silos (stores, regions, partners) without moving sensitive client data. This unlocks personalized AI for CRM and clienteling while maintaining strict data privacy and reducing computational costs by up to 62%.

65% relevant

Multimodal Knowledge Graphs Unlock Next-Generation AI Training Data

Researchers have developed MMKG-RDS, a novel framework that synthesizes high-quality reasoning training data by mining multimodal knowledge graphs. The system addresses critical limitations in existing data synthesis methods and improves model reasoning accuracy by 9.2% with minimal training samples.

80% relevant

The Trillion-Dollar AI Infrastructure Boom: How Data Center Spending Is Reshaping Technology

AI infrastructure spending is accelerating at unprecedented rates, with data center capital expenditures projected to reach $800 billion by 2026 and surpass $1 trillion annually by 2027, signaling a fundamental transformation in global technology investment.

85% relevant

Bridging Data Worlds: How MultiModalPFN Unifies Tabular, Image, and Text Analysis

Researchers have developed MultiModalPFN, an AI framework that extends TabPFN to handle tabular data alongside images and text. This breakthrough addresses a critical limitation in foundation models for structured data, enabling more comprehensive analysis in healthcare, marketing, and other domains where multiple data types coexist.

72% relevant

Generative World Renderer: 4M+ RGB/G-Buffer Frames from Cyberpunk 2077 & Black Myth: Wukong Released for Inverse Graphics

A new framework and dataset extract over 4 million synchronized RGB and G-buffer frames from Cyberpunk 2077 and Black Myth: Wukong, enabling AI models to learn inverse material decomposition and controllable game environment editing.

85% relevant