





















































This week saw many breakthroughs and announcements, and we bring them all together in one place. Our goal is to curate the most relevant news and updates for you. Fill out this form and tell us what you’d like to read next on AI Distilled.
LLM Expert Insights Team,
Packt
The AI co-scientist by Google is a multi-agent AI system that is intended to function as a collaborative tool for scientists. It is built on Gemini 2.0 and is designed to mirror the reasoning process underpinning the scientific method. The AI co-scientist can be used to generate novel research hypotheses, a detailed research overview, and experimental protocols.
Microsoft has developed a topoconductor that allows it to create topological qubits and engineer a new state of matter. Topological qubits are more stable than traditional qubits, making them better suited to building a large-scale quantum computer. Microsoft is now gearing up for the next step: building a fault-tolerant quantum computer using topological qubits.
Former OpenAI CTO Mira Murati has launched a startup, Thinking Machines Lab, with Barret Zoph as CTO and John Schulman as Chief Scientist. Other AI stalwarts with experience building products such as ChatGPT, Segment Anything, Mistral, PyTorch, Character.ai, OpenAI Gym, and FairSeq have also joined Thinking Machines Lab. The startup’s core mission is to build intelligent, adaptable, and personalized AI systems, emphasizing human-AI collaboration and safety. It aims to make AI more capable, customizable, understood, and user-friendly.
Perplexity recently launched its Deep Research model, which is free to use and designed to generate comprehensive reports using capabilities like iterative search, reasoning, coding, and refinement of research plans. On the Humanity’s Last Exam benchmark, Perplexity ranked second, behind OpenAI’s deep research model but ahead of other leading competitors, while completing most research tasks in under three minutes.
Speaking at the 2025 World Government Summit in Dubai, Google and Alphabet CEO Sundar Pichai talked about expanding Waymo to 10 new cities. He also highlighted Google’s recent achievement in quantum computing and indicated that quantum computers could become mainstream in the next 5 to 10 years.
Isomorphic Labs, an AI-first drug discovery company and Google DeepMind partner, and Novartis have extended their collaboration with three additional research programs aimed at accelerating drug discovery. Isomorphic Labs is building on the AlphaFold breakthrough to connect research with biotech, drug discovery, and medical design.
HP is acquiring Humane in a $116 million deal to accelerate the development of an intelligent ecosystem across its products and services. Humane has also announced the end of AI Pin production and consumer availability; the AI Pin’s services, features, and data access will remain available until February 28, 2025, at 12 pm PST.
Grok 3, a chatbot built in less than a year, was launched this week in a live demo by the xAI team. The demonstration showcased Grok 3 handling tasks such as creating a launch plan from Earth to Mars and back, and building an “insanely great game”, a hybrid of Tetris and Bejeweled. The team claimed that Grok 3’s SOTA model beats DeepSeek, Claude, and Gemini and is comparable to OpenAI’s models. Check out the recorded demo here (starting at the 19:11 mark).
Meta has announced a multi-billion-dollar, multi-year project to open three oceanic corridors connecting five major continents. This will be the longest subsea cable project to date, spanning 50,000 kilometers and linking the U.S., Brazil, South Africa, India, and other key regions. Beyond economic collaboration and digital inclusion, the project aims to drive AI innovation worldwide through high-speed connectivity.
Following the release of Kimi, Moonshot AI introduced Mixture of Block Attention (MoBA), designed to handle long conversations and large documents. MoBA divides the context into blocks and uses a gating mechanism that switches between full and sparse attention, letting each query focus on the most informative blocks and thereby reducing computation. MoBA maintains competitive performance at context lengths of up to 1 million tokens.
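To make the idea concrete, here is a minimal sketch of block-gated sparse attention, assuming a single head and unbatched tensors. It is illustrative only, not Moonshot AI’s implementation:

```python
import torch
import torch.nn.functional as F

def moba_like_attention(q, k, v, block_size=64, top_k=2):
    """Toy block-sparse attention in the spirit of MoBA: each query attends
    only to its top-k most relevant key/value blocks, chosen by a gate."""
    seq_len, dim = k.shape
    n_blocks = seq_len // block_size
    k_blocks = k[: n_blocks * block_size].reshape(n_blocks, block_size, dim)
    v_blocks = v[: n_blocks * block_size].reshape(n_blocks, block_size, dim)

    # Gate: score each block by the similarity of its mean key to each query.
    block_keys = k_blocks.mean(dim=1)                      # (n_blocks, dim)
    gate_scores = q @ block_keys.T                         # (n_queries, n_blocks)
    top_blocks = gate_scores.topk(top_k, dim=-1).indices   # (n_queries, top_k)

    out = torch.zeros_like(q)
    for i, blocks in enumerate(top_blocks):
        keys = k_blocks[blocks].reshape(-1, dim)           # keys of selected blocks only
        vals = v_blocks[blocks].reshape(-1, dim)
        attn = F.softmax(q[i] @ keys.T / dim ** 0.5, dim=-1)
        out[i] = attn @ vals
    return out
```

Because each query touches only top_k blocks instead of the full sequence, the attention cost scales with the number of selected blocks rather than the full context length.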
To address DeepSeek R1’s avoidance of topics censored in China, Perplexity compiled a dataset of 40,000 multilingual prompts covering roughly 300 censored topics. R1 was then post-trained on this censorship dataset using an adapted Nvidia NeMo 2.0 framework. The model weights can be downloaded from Hugging Face.
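If you want to pull the weights locally, a minimal sketch using huggingface_hub follows. The repo id is our assumption based on the announcement, and the full model is very large, so serious use would go through a sharded or quantized serving stack rather than a single-process load:

```python
# Illustrative download of the post-trained weights from Hugging Face.
# The repo id "perplexity-ai/r1-1776" is assumed here, not confirmed by the newsletter.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="perplexity-ai/r1-1776")
print("Weights downloaded to:", local_dir)
```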
Mistral has introduced Saba, a model trained on datasets curated from the Middle East and South Asia to capture cultural and linguistic nuances while providing accurate, relevant responses for customers in these regions.
The SOTA vision segmentation model SAM 2.1 is now publicly available through Amazon SageMaker JumpStart. SAM 2.1 enables zero-shot object segmentation, prompt-based object detection, long-context processing, and other contextual segmentation scenarios.
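Deploying a JumpStart model follows the usual SageMaker Python SDK pattern. The sketch below is illustrative; the model_id is a placeholder you would replace with the SAM 2.1 identifier listed in the JumpStart catalog for your region, and the payload schema depends on the model container:

```python
# Minimal JumpStart deployment sketch (requires AWS credentials and a SageMaker role).
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-vs-sam-2-1")  # placeholder id, check the JumpStart catalog
predictor = model.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")

# predictor.predict(...) then takes an image plus optional point/box prompts
# for zero-shot segmentation; see the model card for the exact request format.
```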
To evaluate the performance of AI agents in real-world business scenarios, Hugging Face has introduced the AI Agent Leaderboard. The leaderboard currently ranks 17 LLMs, evaluated with the Tool Selection Quality (TSQ) metric across 14 multi-domain datasets. The benchmark assesses LLMs on their ability to select appropriate tools for a given query, covering parameter handling, multi-step decision making, error handling, context management, and reasoning. At present, gemini-2.0-flash-001 tops the chart with the highest TSQ of 0.938.
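As a rough intuition for what tool-selection scoring measures (this toy check is not the official TSQ metric, which also covers multi-step decisions, error handling, and context management):

```python
def tool_selection_score(predicted, gold):
    """Toy score in [0, 1]: right tool required, then fraction of correct arguments."""
    if predicted["tool"] != gold["tool"]:
        return 0.0
    matched = sum(predicted["args"].get(k) == v for k, v in gold["args"].items())
    return matched / max(len(gold["args"]), 1)

gold = {"tool": "get_weather", "args": {"city": "Dubai", "unit": "celsius"}}
pred = {"tool": "get_weather", "args": {"city": "Dubai", "unit": "fahrenheit"}}
print(tool_selection_score(pred, gold))  # 0.5: right tool, one of two arguments correct
```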
LlamaIndex has introduced a vision for the AI boardroom of the future by creating an LLM consortium, where multiple LLMs answer the same question and their responses are synthesized by an arbiter to produce a final result. If the arbiter finds the responses subpar, it asks the LLMs to try again. You can check out the notebook here.
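The pattern itself is simple. Here is a minimal sketch assuming a generic ask(model, prompt) helper, which is a placeholder rather than LlamaIndex’s actual API; see their notebook for the real implementation:

```python
def ask(model: str, prompt: str) -> str:
    """Placeholder: call your LLM of choice here (any chat-completions client)."""
    raise NotImplementedError

def consortium(question, member_models, arbiter_model, max_rounds=3):
    """Ask every member model, then have an arbiter synthesize or request a retry."""
    verdict = ""
    for _ in range(max_rounds):
        answers = [ask(model, question) for model in member_models]
        verdict = ask(
            arbiter_model,
            "Synthesize a single final answer from the responses below, "
            "or reply with the single word RETRY if they are all inadequate:\n"
            + "\n---\n".join(answers),
        )
        if verdict.strip().upper() != "RETRY":
            return verdict
    return verdict  # best effort after max_rounds
```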
Meta AI can now decode up to 80% of the characters in a sentence from non-invasive brain recordings. Brain2Qwerty, a deep-learning architecture trained on EEG and MEG data, decodes briefly memorized sentences that participants typed on a QWERTY keyboard. In a related experiment, MEG and EEG data were analyzed to capture the neural dynamics of language production in the human brain.
Seventy of Engine AI’s open-source robots are now serving as community workers and patrolling the streets of Shenzhen in South China. Powered by DeepSeek, the PM01 robot has achieved human-like mobility and is making grassroots governance more efficient.
YouTube Shorts is now integrating Google DeepMind’s popular video generation model. Users in the US, Canada, Australia, and New Zealand can now use text prompts in Shorts to generate standalone video footage.
ByteDance has recently released Goku, a generative flow-based image and video generation model trained on millions of image-text and video-text pairs. Built on a transformer-based architecture with 1-, 2-, and 8-billion-parameter variants, Goku uses Rectified Flow (a diffusion-style generative technique) and a variational autoencoder to create high-quality visuals, enabling businesses and content creators to amplify their creative applications.
DeepSeek researchers recently shared an approach that uses the structured nature of code to teach symbolic, logical, mathematical, and commonsense reasoning patterns. Python code collected from sources like CodeMix and PyEdu-R is converted into a unified format using DeepSeek-V2.5. The resulting dataset includes 3.5 million input-output pairs generated from the transformed code functions, along with natural-language Chain-of-Thought (CoT) explanations.
In the first training stage, the model is prompted to predict an output (response), and incorrect responses are fed back into the LLM together with feedback for revision; instruction tuning is then applied in a second stage. This multi-turn revision enhances accuracy and shows improvements over baseline models.
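A sketch of what one such input-output prediction example might look like, based on the description above (illustrative only, not the released dataset format):

```python
example = {
    "function": (
        "def bucket(xs, size):\n"
        "    return [xs[i:i + size] for i in range(0, len(xs), size)]"
    ),
    "input": {"xs": [1, 2, 3, 4, 5], "size": 2},
    "expected_output": [[1, 2], [3, 4], [5]],
    # The model is asked to predict the output (or a missing input) and explain
    # its reasoning as a chain of thought; wrong answers plus feedback are fed
    # back for another revision turn before instruction tuning.
}
```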
The LIMO approach challenges the notion that LLMs require extensive data for reasoning, achieving competitive results with just 817 samples and carefully designed cognitive templates. LIMO employs a rigorous selection process, covering structural organization, effective cognitive explanations, and verification, to curate high-quality math problems from the NuminaMath-CoT, AIME, and MATH datasets. Using the Qwen2.5-32B-Instruct model with a 16,384-token sequence length, LIMO applies supervised fine-tuning (SFT) and step-by-step prompting to achieve strong generalization.
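For flavor, supervised fine-tuning on such a small curated set could look like the sketch below using Hugging Face TRL. The data file name is a placeholder and the long-sequence and optimizer settings are omitted; this is not the paper’s training script:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# "limo_817_samples.jsonl" is a placeholder for the curated 817-example set.
dataset = load_dataset("json", data_files="limo_817_samples.jsonl")["train"]

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-32B-Instruct",
    train_dataset=dataset,
    args=SFTConfig(output_dir="limo-sft"),  # long-sequence settings omitted for brevity
)
trainer.train()
```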
LM2 incorporates a structured memory system that interacts with input embeddings through cross-attention. Built on a decoder-only transformer architecture, the model utilizes memory updates regulated by gating mechanisms, allowing it to selectively retain relevant information. LM2 was tested on the BABILong and MMLU datasets, demonstrating significant improvements in long-context reasoning and general reasoning capabilities.
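A minimal sketch of the idea, assuming a single memory bank read via cross-attention with a sigmoid gate; this is illustrative and not the official LM2 code:

```python
import torch
import torch.nn as nn

class GatedMemory(nn.Module):
    """Toy memory module: learned memory slots are read via cross-attention,
    and a gate decides how much of the retrieved memory to mix back in."""
    def __init__(self, dim, n_slots=32, n_heads=4):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(n_slots, dim))
        self.cross_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, x):                                  # x: (batch, seq, dim)
        mem = self.memory.unsqueeze(0).expand(x.size(0), -1, -1)
        read, _ = self.cross_attn(x, mem, mem)             # tokens query the memory bank
        g = torch.sigmoid(self.gate(torch.cat([x, read], dim=-1)))
        return x + g * read                                # selectively retain memory

block = GatedMemory(dim=64)
tokens = torch.randn(2, 128, 64)
print(block(tokens).shape)  # torch.Size([2, 128, 64])
```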
📢 If your company is interested in reaching an audience of developers, technical professionals, and decision makers, you may want to advertise with us.
If you have any comments or feedback, just reply to this email.
Thanks for reading and have a great day!