AI_Distilled #72: Mistral AI Launches Ministral 3B and 8B Models for On-Device AI Computing

Join Generative AI In Action now with a Full Event Pass for just $239.99, 40% off the regular price, with code FLASH40.

BOOK TODAY AT $239.99 (regular price $399.99)

Three Reasons Why You Cannot Miss This Event:

Network with 25+ Leading AI Experts

Gain Insights from 30+ Dynamic Talks and Hands-On Sessions

Engage with Experts and Peers through 1:1 Networking, Roundtables, and AMAs

Act fast—this FLASH SALE is only for a limited number of seats!

Welcome to AI_Distilled. Today, we’ll talk about:

TechWave:

Mistral AI Launches Ministral 3B and 8B Models for On-Device AI Computing

Make charts on Perplexity code interpreter

Introducing Swarm: OpenAI’s New Open-Source Multi-Agent Orchestration Framework

OpenAI MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

Anthropic’s Responsible Scaling Policy, October 15, 2024

Awesome AI:

MU - Perplexity Finance

Adobe Launches Firefly Video Model and Enhances Image, Vector and Design Models

You can now search with Google Lens in the Chromebook Gallery app

Gradio

Strella - AI-Powered Customer Research

Masterclass:

Aria: First Open Multimodal Native MOE Model

Understanding the Limitations of Mathematical Reasoning in Large Language Models

No Priors Ep. 80 | With Andrej Karpathy from OpenAI and Tesla

Multi document agentic RAG: A walkthrough

LLMs From Scratch Ch05/08: Memory-efficient weight loading

HackHub:

Llama-3.1-Nemotron-70B - a nvidia Collection

mlc-ai/mlc-llm: Universal LLM Deployment Engine with ML Compilation

Surya: OCR, layout analysis, reading order, table recognition in 90+ languages

TEN-Agent: world’s first real-time multimodal agent integrated with the OpenAI Realtime API

Cinnamon/kotaemon: An open-source RAG-based tool for chatting with your documents

Cheers!

Shreyans Singh

Editor-in-Chief, Packt

Looking to build, train, deploy, or implement Generative AI?

Meet Innodata — offering high-quality solutions for developing and implementing industry-leading generative AI.

With 5,000+ in-house SMEs and expansion and localization supported across 85+ languages, Innodata drives AI initiatives for enterprises globally.

⚡ TechWave: AI/GPT News & Analysis

Mistral AI Launches Ministral 3B and 8B Models for On-Device AI Computing

Mistral AI has introduced two new advanced models, Ministral 3B and Ministral 8B, designed for efficient on-device and edge computing. These models, which are more powerful and faster than their predecessors, excel in areas like knowledge, reasoning, and task execution, making them ideal for privacy-focused, offline applications such as local translation and robotics. With a large context length and specialized attention patterns, they offer low-latency and cost-effective solutions for a variety of uses, from personal projects to industrial tasks. Both models are now available for commercial and research use.

Make charts on Perplexity code interpreter

Introducing Swarm: OpenAI’s New Open-Source Multi-Agent Orchestration Framework

Swarm is an experimental, educational framework developed by OpenAI to explore lightweight orchestration of multiple agents in a flexible and ergonomic way. It allows developers to create and manage multi-agent systems where agents can pass tasks or conversations between each other, handling complex workflows efficiently. Designed for educational purposes, Swarm uses OpenAI’s Chat Completions API, with agents executing Python functions and handling different tasks.
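The handoff idea can be sketched in plain Python. This is a minimal illustration and not Swarm's actual API: the `Agent` dataclass, the `run` loop, and the keyword-based routing are hypothetical stand-ins, and no model is called.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass
class Agent:
    """Hypothetical agent: a name plus a handler that stands in for an LLM call."""
    name: str
    handle: Callable[[str], Union[str, "Agent"]]


def refunds_handle(message: str) -> str:
    # A specialist agent that always answers with text.
    return f"Refund started for: {message}"


refunds_agent = Agent("refunds", refunds_handle)


def triage_handle(message: str) -> Union[str, Agent]:
    # Returning an Agent instead of text signals a handoff.
    if "refund" in message.lower():
        return refunds_agent
    return "How can I help?"


triage_agent = Agent("triage", triage_handle)


def run(agent: Agent, message: str, max_handoffs: int = 5) -> Tuple[str, str]:
    """Pass the message along until some agent returns a text reply."""
    for _ in range(max_handoffs + 1):
        result = agent.handle(message)
        if isinstance(result, Agent):
            agent = result  # hand the conversation to the next agent
            continue
        return agent.name, result
    raise RuntimeError("too many handoffs")
```

Calling `run(triage_agent, "I want a refund")` routes the message from the triage agent to the refunds agent, which produces the reply. In a real system the routing decision would come from the model rather than from a keyword check.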

OpenAI MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

MLE-bench is a benchmark created by OpenAI to evaluate how well AI agents can perform tasks related to machine learning engineering. It uses 75 competitions from Kaggle to test real-world skills such as training models, preparing datasets, and running experiments. Human baselines are established using Kaggle's leaderboards, and the best-performing AI setup, OpenAI's o1-preview with AIDE scaffolding, achieves results comparable to a Kaggle bronze medal in about 17% of competitions.

Anthropic’s Responsible Scaling Policy, October 15, 2024

Anthropic's updated Responsible Scaling Policy (RSP) outlines its commitment to ensuring that AI models do not cause catastrophic harm by implementing safety and security measures. The policy introduces AI Safety Level (ASL) Standards, which become stricter as AI capabilities increase. These standards help determine when models need stronger safeguards. The update includes guidelines for assessing models against Capability Thresholds, focusing on areas like chemical, biological, radiological, and nuclear (CBRN) risks. If a model crosses a Capability Threshold, stronger safeguards (ASL-3 or higher) are required to mitigate the risks.

💻 Awesome AI: Tools for Work

MU - Perplexity Finance

Perplexity revealed a preview of its upcoming financial analysis platform, "Perplexity for Finance," designed to provide users with real-time stock quotes, historical earnings reports, industry comparisons, and detailed financial data, all through an intuitive and user-friendly interface. A video shared by the company demonstrated how users can easily access and visualize financial data, such as Nvidia’s earnings history and stock price trends.

Adobe Launches Firefly Video Model and Enhances Image, Vector and Design Models

Adobe has launched its new Firefly Video Model (beta), extending its AI-powered creative tools to video. Adobe describes it as the first such model designed for safe commercial use. Adobe also enhanced its Firefly Image, Vector, and Design models, offering faster image generation and new capabilities integrated into apps like Photoshop, Illustrator, and Premiere Pro. These tools allow users to generate videos and images from text prompts, extend video clips, and more.

You can now search with Google Lens in the Chromebook Gallery app

Chromebooks now have Google Lens integrated into their Gallery app, allowing users to quickly search for information related to any image or document they view. By opening a file in the Gallery app, users can select a section of the image or document and use Google Lens to perform a search. This new feature acts as a shortcut to Chrome’s existing Google Lens tool, saving users time by streamlining the process of capturing and searching with images.

Gradio

Gradio 5.0 is a user-friendly tool that makes it easy to create web-based interfaces for machine learning models. With just a few lines of Python code, developers can build interactive apps that allow anyone to test and interact with their models. Gradio can be embedded in notebooks or shared via public links, and it supports integration with various Python libraries. It also offers permanent hosting on Hugging Face Spaces. Gradio is widely used by companies like Google and Amazon, as well as researchers and developers for quick and efficient model demos.
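As a rough illustration of the "few lines of Python" claim, here is a minimal sketch using Gradio's `Interface` class. The `greet` function and the `build_demo` helper are hypothetical placeholders for a real model's predict function.

```python
def greet(name: str) -> str:
    """Function to expose in the UI; stands in for a model's predict call."""
    return f"Hello, {name}!"


def build_demo():
    # Imported lazily so greet() can be used without Gradio installed.
    import gradio as gr

    # One text box in, one text box out.
    return gr.Interface(fn=greet, inputs="text", outputs="text")


# To serve the app locally: build_demo().launch()
# To also get a temporary public link: build_demo().launch(share=True)
```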

Strella - AI-Powered Customer Research

Strella is an AI-powered tool designed to streamline customer research by automating interviews, recruitment, and analysis. It helps researchers quickly create custom interview guides, conduct AI-moderated interviews, and analyze insights in real time, so teams can make faster, better-informed decisions. Strella handles logistics like scheduling and incentives, allowing researchers to focus on higher-impact tasks. It supports global participants, runs interviews 24/7, and offers features like dynamic follow-up questions, screen recording, and multilingual capabilities. The platform boosts efficiency, speeds up research timelines, and enhances research output.

🔛 Masterclass: AI/LLM Tutorials

Aria: First Open Multimodal Native MOE Model

Rhymes AI introduced Aria, an open-source multimodal native Mixture-of-Experts (MoE) model designed to process several input types (text, images, video, and code) together. It targets tasks involving complex multimodal data and offers a long context window of up to 64,000 tokens, which suits tasks like video captioning or document understanding. Rhymes AI reports that Aria outperforms comparable open models and is competitive with proprietary models such as GPT-4o and Gemini-1.5 on some tasks, while using fewer activated parameters.
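To see why a Mixture-of-Experts model has "fewer activated parameters", here is a minimal top-k routing sketch in plain Python. It is a toy illustration and not Aria's architecture: the experts are simple functions, and the router scores are hard-coded assumptions, whereas a real model computes them with a learned router for each token.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[float], float]


def softmax(scores: Sequence[float]) -> List[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalize their weights."""
    order = sorted(range(len(router_scores)),
                   key=lambda i: router_scores[i], reverse=True)
    top = order[:k]
    weights = softmax([router_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x: float, experts: Sequence[Expert],
                router_scores: Sequence[float], k: int = 2) -> float:
    """Weighted sum over only the selected experts; the others never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


# Four toy experts; only k of them are evaluated per input.
experts: List[Expert] = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 3,
    lambda x: x * x,
]
```

With `k=2`, each input touches two of the four experts, so only part of the model's total parameters is active for any given token.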

Understanding the Limitations of Mathematical Reasoning in Large Language Models

Recent advancements in Large Language Models (LLMs) have led to interest in their ability to handle formal reasoning, especially in math. The widely used GSM8K benchmark tests models on grade-school-level math questions, but it's unclear if improvements in scores reflect true advances in reasoning. To address this, researchers created GSM-Symbolic, a new benchmark with symbolic templates that generate more varied and controlled questions. They found that LLMs struggle when numerical values or clauses are slightly changed in questions, suggesting that current models rely on patterns from training data rather than genuine logical reasoning.
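The symbolic-template idea can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the template text, the name list, and the value ranges are all made up for the example.

```python
import random
from typing import List, Tuple

# Hypothetical template: names and numbers are placeholders to be sampled.
TEMPLATE = (
    "{name} has {a} apples and buys {b} bags with {c} apples in each. "
    "How many apples does {name} have now?"
)
NAMES = ["Sophie", "Liam", "Mei", "Omar"]


def sample_question(rng: random.Random) -> Tuple[str, int]:
    """Fill the template with random values and compute the matching answer."""
    values = {
        "name": rng.choice(NAMES),
        "a": rng.randint(2, 20),
        "b": rng.randint(2, 9),
        "c": rng.randint(2, 12),
    }
    # The ground truth is derived from the same sampled values.
    answer = values["a"] + values["b"] * values["c"]
    return TEMPLATE.format(**values), answer


def make_variants(n: int, seed: int = 0) -> List[Tuple[str, int]]:
    """Generate n question/answer pairs reproducibly from one template."""
    rng = random.Random(seed)
    return [sample_question(rng) for _ in range(n)]
```

Because every variant shares the same reasoning steps, a drop in accuracy across variants points to sensitivity to surface details such as names and numbers rather than to the difficulty of the problem.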

No Priors Ep. 80 | With Andrej Karpathy from OpenAI and Tesla

In this episode of the *No Priors* podcast, Andrej Karpathy, a key figure in AI and former leader of Tesla Autopilot, discusses the evolution of self-driving cars, comparing Tesla's approach with Waymo's. He also touches on Tesla's Optimus humanoid robot and the challenges in robotics and AI today. Karpathy explores the potential for integrating AI with human cognition and shares insights on AI-driven education and its impact on future learning. He also talks about his new venture, Eureka Labs, and offers advice on what young people should study to prepare for a future shaped by AI advancements.

Multi document agentic RAG: A walkthrough

This blog post by Vipul Maheshwari explains the concept of Agentic Retrieval-Augmented Generation (RAG), an advanced version of traditional RAG systems. Unlike basic RAG models that retrieve relevant data for language models to generate responses, Agentic RAG introduces decision-making autonomy. It can analyze a task, break it into smaller steps, and take actions without constant supervision. The post walks through how to build an Agentic RAG system for car diagnostics using LanceDB, LlamaIndex, and vector databases.

LLMs From Scratch Ch05/08: Memory-efficient weight loading

🚀 HackHub: AI Tools

Llama-3.1-Nemotron-70B - a nvidia Collection

NVIDIA has released several advanced AI models on Hugging Face, including the Llama-3.1-Nemotron series, which NVIDIA reports achieves state-of-the-art (SOTA) results on benchmarks like Arena Hard and RewardBench. These models, like Llama-3.1-Nemotron-70B, focus on text generation and include variants tailored for instruction following (Instruct) and reward modeling (Reward). NVIDIA's collections also include models for specialized tasks such as speech recognition (Parakeet) and reinforcement learning from human feedback (RLHF).

mlc-ai/mlc-llm: Universal LLM Deployment Engine with ML Compilation

MLC LLM is an open-source project that provides a universal deployment engine for large language models (LLMs) with machine learning compilation (MLC). Its goal is to enable developers to optimize and deploy AI models across various platforms, such as AMD, NVIDIA, and Apple GPUs, and even on mobile devices like iOS and Android.

Surya: OCR, layout analysis, reading order, table recognition in 90+ languages

Surya is an open-source document OCR (Optical Character Recognition) toolkit that supports over 90 languages. It offers advanced features like text detection, layout analysis (including tables, images, and headers), reading order detection, and table recognition, working efficiently across a wide range of documents, from scientific papers to forms.

TEN-Agent: world’s first real-time multimodal agent integrated with the OpenAI Realtime API

TEN Agent is a real-time multimodal AI agent that integrates the OpenAI Realtime API and RTC for ultra-low latency performance. The agent can be extended with edge-cloud integrations, real-time state management, and drag-and-drop tools for complex applications.

Cinnamon/kotaemon: An open-source RAG-based tool for chatting with your documents

Kotaemon is an open-source tool designed for interacting with documents through a Question Answering (QA) system built on Retrieval-Augmented Generation (RAG) technology. It supports various large language models (LLMs), both local and via APIs (like OpenAI), and allows users to ask questions about their documents.

📢 If your company is interested in reaching an audience of developers, technical professionals, and decision makers, you may want to advertise with us.

If you have any comments or feedback, just reply to this email.

Thanks for reading and have a great day!