





















































Join Generative AI In Action now with a Full Event Pass for just $239.99 (40% off the regular price) with code FLASH40.
Three Reasons Why You Cannot Miss This Event:
Network with 25+ Leading AI Experts
Gain Insights from 30+ Dynamic Talks and Hands-On Sessions
Engage with Experts and Peers through 1:1 Networking, Roundtables, and AMAs
Act fast—this FLASH SALE is only for a limited number of seats!
Welcome to AI_Distilled. Today, we’ll talk about:
Techwave:
Mistral AI Launches Ministral 3B and 8B Models for On-Device AI Computing
Make charts on Perplexity code interpreter
Introducing Swarm: OpenAI’s New Open-Source Multi-Agent Orchestration Framework
OpenAI MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
Anthropic’s Responsible Scaling Policy, October 15, 2024
Awesome AI:
Adobe Launches Firefly Video Model and Enhances Image, Vector and Design Models
You can now search with Google Lens in the Chromebook Gallery app
Strella - AI-Powered Customer Research
Masterclass:
Aria: First Open Multimodal Native MOE Model
Understanding the Limitations of Mathematical Reasoning in Large Language Models
No Priors Ep. 80 | With Andrej Karpathy from OpenAI and Tesla
Multi document agentic RAG: A walkthrough
LLMs From Scratch Ch05/08: Memory-efficient weight loading
HackHub
Llama-3.1-Nemotron-70B - a nvidia Collection
mlc-ai/mlc-llm: Universal LLM Deployment Engine with ML Compilation
Surya: OCR, layout analysis, reading order, table recognition in 90+ languages
TEN-Agent: world’s first real-time multimodal agent integrated with the OpenAI Realtime API
Cinnamon/kotaemon: An open-source RAG-based tool for chatting with your documents
Cheers!
Shreyans Singh
Editor-in-Chief, Packt
Looking to build, train, deploy, or implement Generative AI?
Meet Innodata — offering high-quality solutions for developing and implementing industry-leading generative AI.
With 5,000+ in-house SMEs and expansion and localization supported across 85+ languages, Innodata drives AI initiatives for enterprises globally.
Mistral AI Launches Ministral 3B and 8B Models for On-Device AI Computing
Mistral AI has introduced two new advanced models, Ministral 3B and Ministral 8B, designed for efficient on-device and edge computing. These models, which are more powerful and faster than their predecessors, excel in areas like knowledge, reasoning, and task execution, making them ideal for privacy-focused, offline applications such as local translation and robotics. With a large context length and specialized attention patterns, they offer low-latency and cost-effective solutions for a variety of uses, from personal projects to industrial tasks. Both models are now available for commercial and research use.
Make charts on Perplexity code interpreter
Introducing Swarm: OpenAI’s New Open-Source Multi-Agent Orchestration Framework
Swarm is an experimental, educational framework developed by OpenAI to explore lightweight orchestration of multiple agents in a flexible and ergonomic way. It allows developers to create and manage multi-agent systems where agents can pass tasks or conversations between each other, handling complex workflows efficiently. Designed for educational purposes, Swarm uses OpenAI’s Chat Completions API, with agents executing Python functions and handling different tasks.
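To make the handoff idea concrete, here is a minimal sketch in plain Python. It does not use Swarm's actual API or call any model: the `Agent` class, the `run` loop, and the keyword-based `triage_handle` function are all hypothetical stand-ins, assuming only that an agent either answers or returns another agent to take over the conversation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class Agent:
    name: str
    # Hypothetical stand-in for a model call: returns a reply string,
    # or another Agent to signal a handoff.
    handle: Callable[[str], Union[str, "Agent"]]


def run(agent: Agent, message: str, max_handoffs: int = 5) -> Tuple[str, List[str]]:
    """Pass the message along until some agent returns a reply."""
    trail = [agent.name]
    for _ in range(max_handoffs + 1):
        result = agent.handle(message)
        if isinstance(result, Agent):
            # Handoff: the next agent takes over the same conversation.
            agent = result
            trail.append(agent.name)
            continue
        return result, trail
    raise RuntimeError("too many handoffs")


def refunds_handle(message: str) -> str:
    return f"Refund started for: {message}"


refunds = Agent("refunds", refunds_handle)


def triage_handle(message: str) -> Union[str, Agent]:
    # A real system would let the model decide; this is a keyword check.
    if "refund" in message.lower():
        return refunds
    return "How can I help?"


triage = Agent("triage", triage_handle)

reply, trail = run(triage, "I want a refund for order 17")
print(trail, reply)
```

The `max_handoffs` cap is a design choice of this sketch, there to keep two agents from passing a conversation back and forth indefinitely.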
OpenAI MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
MLE-bench is a benchmark created by OpenAI to evaluate how well AI agents can perform tasks related to machine learning engineering. It uses 75 competitions from Kaggle to test real-world skills such as training models, preparing datasets, and running experiments. Human baselines are established using Kaggle's leaderboards, and the best-performing AI setup, OpenAI's o1-preview with AIDE scaffolding, achieves results comparable to a Kaggle bronze medal in about 17% of competitions.
Anthropic’s Responsible Scaling Policy, October 15, 2024
Anthropic's updated Responsible Scaling Policy (RSP) outlines its commitment to ensuring that AI models do not cause catastrophic harm by implementing safety and security measures. The policy introduces AI Safety Level (ASL) Standards, which become stricter as AI capabilities increase. These standards help determine when models need stronger safeguards. The update includes guidelines for assessing models based on Capability Thresholds, focusing on areas like chemical, biological, radiological, and nuclear (CBRN) risks. If a model crosses a Capability Threshold, stronger safeguards (ASL-3 or higher) are required to mitigate risks.
Perplexity revealed a preview of its upcoming financial analysis platform, "Perplexity for Finance," designed to provide users with real-time stock quotes, historical earnings reports, industry comparisons, and detailed financial data, all through an intuitive and user-friendly interface. A video shared by the company demonstrated how users can easily access and visualize financial data, such as Nvidia’s earnings history and stock price trends.
Adobe Launches Firefly Video Model and Enhances Image, Vector and Design Models
Adobe has launched its new Firefly Video Model (beta), expanding its AI-powered creative tools to video content. Adobe describes it as the first such model designed for safe commercial use. Adobe has also enhanced its Firefly Image, Vector, and Design models, offering faster image generation and new capabilities integrated into apps like Photoshop, Illustrator, and Premiere Pro. These tools allow users to generate videos and images from text prompts, extend video clips, and more.
You can now search with Google Lens in the Chromebook Gallery app
Chromebooks now have Google Lens integrated into their Gallery app, allowing users to quickly search for information related to any image or document they view. By opening a file in the Gallery app, users can select a section of the image or document and use Google Lens to perform a search. This new feature acts as a shortcut to Chrome’s existing Google Lens tool, saving users time by streamlining the process of capturing and searching with images.
Gradio 5.0 is a user-friendly tool that makes it easy to create web-based interfaces for machine learning models. With just a few lines of Python code, developers can build interactive apps that allow anyone to test and interact with their models. Gradio can be embedded in notebooks or shared via public links, and it supports integration with various Python libraries. It also offers permanent hosting on Hugging Face Spaces. Gradio is widely used by companies like Google and Amazon, as well as researchers and developers for quick and efficient model demos.
Strella - AI-Powered Customer Research
Strella is an AI-powered tool designed to streamline customer research by automating interviews, recruitment, and analysis. It helps researchers quickly create custom interview guides, conduct AI-moderated interviews, and analyze insights in real-time, making decisions faster and more informed. Strella handles logistics like scheduling and incentives, allowing researchers to focus on higher-impact tasks. It supports global participants, runs interviews 24/7, and offers features like dynamic follow-up questions, screen recording, and multilingual capabilities. The platform boosts efficiency, speeds up research timelines, and enhances research output.
Aria: First Open Multimodal Native MOE Model
Rhymes AI introduced Aria, an open-source multimodal native Mixture-of-Experts (MoE) model designed to process several input types (text, images, video, and code) together. It excels in tasks involving complex multimodal data and offers a long context window of up to 64,000 tokens, which suits tasks like video captioning or document understanding. Aria outperforms other open models and some proprietary ones, such as GPT-4o and Gemini-1.5, while activating fewer parameters.
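For readers new to Mixture-of-Experts, the sketch below shows why such a model can activate only part of its parameters per input. It is a generic top-k routing toy, not Aria's architecture: the function names, the toy experts, and the choice of k=2 are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(xs: Sequence[float]) -> List[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_logits: Sequence[float], k: int = 2) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and normalise their weights."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x: float,
                experts: Sequence[Callable[[float], float]],
                gate_logits: Sequence[float],
                k: int = 2) -> float:
    """Run only the selected experts and mix their outputs."""
    return sum(w * experts[i](x) for i, w in route(gate_logits, k))


# Four toy experts; only two of them run for any given input.
experts = [lambda x: x, lambda x: 2 * x, lambda x: -x, lambda x: x + 1]
gate_logits = [0.1, 2.0, -1.0, 1.0]  # hypothetical gate scores for one input

print(route(gate_logits))
print(moe_forward(3.0, experts, gate_logits))
```

In a real model the gate scores come from a learned layer and the experts are neural network blocks, but the cost saving has the same source: the experts that are not selected are never evaluated.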
Understanding the Limitations of Mathematical Reasoning in Large Language Models
Recent advancements in Large Language Models (LLMs) have led to interest in their ability to handle formal reasoning, especially in math. The widely used GSM8K benchmark tests models on grade-school-level math questions, but it's unclear if improvements in scores reflect true advances in reasoning. To address this, researchers created GSM-Symbolic, a new benchmark with symbolic templates that generate more varied and controlled questions. They found that LLMs struggle when numerical values or clauses are slightly changed in questions, suggesting that current models rely on patterns from training data rather than genuine logical reasoning.
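The idea of a symbolic template can be sketched in a few lines of Python. This is an illustration only: the template text, the name list, and the value ranges below are invented for this sketch and are not taken from GSM-Symbolic itself.

```python
import random
from typing import List, Tuple

# Hypothetical template: names and numbers are placeholders to be filled in.
TEMPLATE = ("{name} has {a} apples and buys {b} more. "
            "How many apples does {name} have now?")
NAMES = ["Ava", "Ben", "Chen"]


def make_variant(rng: random.Random) -> Tuple[str, int]:
    """Fill the template with fresh values and compute the matching answer."""
    name = rng.choice(NAMES)
    a = rng.randint(2, 50)
    b = rng.randint(2, 50)
    return TEMPLATE.format(name=name, a=a, b=b), a + b


def make_variants(n: int, seed: int = 0) -> List[Tuple[str, int]]:
    rng = random.Random(seed)  # seeded so the question set is reproducible
    return [make_variant(rng) for _ in range(n)]


for question, answer in make_variants(3):
    print(question, "->", answer)
```

Because every variant needs the same reasoning and differs only in surface details, a drop in accuracy across variants points to pattern matching rather than reasoning.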
No Priors Ep. 80 | With Andrej Karpathy from OpenAI and Tesla
In this episode of the *No Priors* podcast, Andrej Karpathy, a key figure in AI and former leader of Tesla Autopilot, discusses the evolution of self-driving cars, comparing Tesla's approach with Waymo's. He also touches on Tesla's Optimus humanoid robot and the challenges in robotics and AI today. Karpathy explores the potential for integrating AI with human cognition and shares insights on AI-driven education and its impact on future learning. He also talks about his new venture, Eureka Labs, and offers advice on what young people should study to prepare for a future shaped by AI advancements.
Multi document agentic RAG: A walkthrough
This blog post by Vipul Maheshwari explains the concept of Agentic Retrieval-Augmented Generation (RAG), an advanced version of traditional RAG systems. Unlike basic RAG models that retrieve relevant data for language models to generate responses, Agentic RAG introduces decision-making autonomy. It can analyze a task, break it into smaller steps, and take actions without constant supervision. The post walks through how to build an Agentic RAG system for car diagnostics using LanceDB, LlamaIndex, and vector databases.
Llama-3.1-Nemotron-70B - a nvidia Collection
NVIDIA has released several advanced AI models on Hugging Face, including the Llama-3.1-Nemotron series, which offers state-of-the-art (SOTA) performance on benchmarks like Arena Hard and RewardBench. These models, like Llama-3.1-Nemotron-70B, focus on text generation and include variations tailored for instruction-following (Instruct) and reward-based tasks. NVIDIA's collection also includes models for specialized tasks such as speech recognition (Parakeet) and reinforcement learning from human feedback (RLHF).
mlc-ai/mlc-llm: Universal LLM Deployment Engine with ML Compilation
MLC LLM is an open-source project that provides a universal deployment engine for large language models (LLMs) with machine learning compilation (MLC). Its goal is to enable developers to optimize and deploy AI models across various platforms, including AMD, NVIDIA, and Apple GPUs, as well as mobile platforms like iOS and Android.
Surya: OCR, layout analysis, reading order, table recognition in 90+ languages
Surya is an open-source document OCR (Optical Character Recognition) toolkit that supports over 90 languages. It offers advanced features like text detection, layout analysis (including tables, images, and headers), reading order detection, and table recognition, working efficiently across a wide range of documents, from scientific papers to forms.
TEN-Agent: world’s first real-time multimodal agent integrated with the OpenAI Realtime API
TEN Agent is a real-time multimodal AI agent that integrates the OpenAI Realtime API and RTC for ultra-low latency performance. The agent can be extended with edge-cloud integrations, real-time state management, and drag-and-drop tools for complex applications.
Cinnamon/kotaemon: An open-source RAG-based tool for chatting with your documents
Kotaemon is an open-source tool designed for interacting with documents through a Question Answering (QA) system built on Retrieval-Augmented Generation (RAG) technology. It supports various large language models (LLMs), both local and via APIs (like OpenAI), and allows users to ask questions about their documents.
📢 If your company is interested in reaching an audience of developers, technical professionals, and decision makers, you may want to advertise with us.
If you have any comments or feedback, just reply to this email.
Thanks for reading and have a great day!