Welcome to DataPro 145, your go-to guide for all things data and AI.
You might be wondering why DataPro landed in your inbox on a Monday. We’re experimenting, testing which send days work best for readers like you, while also exploring new topic areas to ensure this newsletter continues to meet your needs. In a world where new models, frameworks, and breakthroughs arrive almost daily, staying up to date isn’t just nice to have, it’s essential. Whether you’re building, researching, or scaling AI systems, the difference between leading and lagging often comes down to who’s plugged into the right knowledge at the right time. That’s why we bring you DataPro: your weekly pulse on the launches, research, and tutorials shaping the field, curated with clarity, context, and links you can act on.
This week’s lineup brings major releases and practical guides shaping AI and data:
🔗 Hugging Face introduces AI Sheets - a no-code, local-first spreadsheet tool that makes dataset creation and enrichment as simple as typing a prompt.
🔗 Salesforce AI releases Moirai 2.0 - a decoder-only transformer setting new benchmarks in time-series forecasting with smaller, faster, and more accurate models.
🔗 Amazon unveils DeepFleet - a foundation model suite trained on billions of robot-hours to predict and optimize fleet traffic patterns in warehouses.
🔗 Meet dots.ocr - a 1.7B parameter open-source vision-language model achieving state-of-the-art multilingual OCR and document parsing across 100+ languages.
We also dive into tutorials that caught fire this week: Google’s Gemma 3 270M, built for hyper-efficient fine-tuning, and a hands-on guide to Model Predictive Control (MPC) using Python and CasADi.
👉 A full-stack edition for builders, researchers, and thinkers who thrive on fresh ideas in data and AI. Let’s unpack it.
Cheers,
Merlyn Shelley
Growth Lead, Packt
🔵 Q2 2025 AI Hypercomputer updates: Google Cloud’s AI Hypercomputer powers Gemini and Veo 3 and serves 980T+ tokens monthly. Highlights this quarter include Dynamic Workload Scheduler, Cluster Director upgrades, llm-d v0.2, and MaxText/MaxDiffusion improvements. Explore open frameworks, TPU/GPU scaling, and claim $300 free credit to simplify AI deployment and boost performance.
🔵 How to Test an OpenAI Model Against Single-Turn Adversarial Attacks Using deepteam: Learn how to red team OpenAI models with deepteam, an open-source toolkit offering 10+ single-turn adversarial attacks including prompt injection, jailbreaking, leetspeak, Base64, and more. This hands-on guide shows how to install dependencies, set up your API key, define vulnerabilities, and test GPT-4o-mini against real-world adversarial prompts.
🔵 Salesforce AI Releases Moirai 2.0: Salesforce’s Latest Time Series Foundation Model Built on a Decoder‑only Transformer Architecture. Salesforce AI Research introduces Moirai 2.0, a decoder-only transformer that tops GIFT-Eval benchmarks for time series forecasting. It’s 44% faster, 96% smaller, yet more accurate than Moirai_large. With multi-token prediction, advanced filtering, and diverse training data, it enables scalable forecasting across IT ops, sales, demand, and supply chain planning.
🔵 Transform your data to Amazon S3 Tables with Amazon Athena: Amazon Athena now supports CTAS with S3 Tables, enabling serverless SQL-based data transformation with built-in Iceberg optimization, ACID transactions, and automatic maintenance. Easily migrate datasets (CSV, Parquet, JSON, etc.) into analytics-ready tables. The tutorial demonstrates transforming customer review data into S3 Tables, unlocking faster queries, simplified ETL, and robust enterprise-scale analytics.
🔵 Estimating from No Data: Deriving a Continuous Score from Categories. This post shows how to derive a continuous, fine-grained score from categorical outcomes when only labeled categories are available for training. It explains why standard classifiers fail to produce meaningful scores, and demonstrates how low-capacity networks with a linear bottleneck and a category approximator head can generate interpretable, ordered risk scores.
🔵 Meet DeepFleet: Amazon’s New AI Models Suite that can Predict Future Traffic Patterns for Fleets of Mobile Robots. Amazon unveils DeepFleet, a suite of foundation models trained on billions of robot-hours to optimize warehouse fleets. Already enhancing operations across 300+ facilities, DeepFleet improves robot coordination, cuts congestion, and boosts efficiency by up to 10%. With four architectures, robot-centric (RC), robot-floor (RF), image-floor (IF), and graph-floor (GF), it marks a leap in multi-robot forecasting.
🔵 From Deployment to Scale: 11 Foundational Enterprise AI Concepts for Modern Businesses. A guide to 11 foundational AI concepts shaping enterprise adoption, from the integration gap and RAG reality to the agentic shift and feedback flywheel. The post highlights challenges like vendor lock-in, trust, and risk, while outlining how businesses can scale AI by embedding it natively and continuously reinventing processes.
🔵 Meet dots.ocr: A New 1.7B Vision-Language Model that Achieves SOTA Performance on Multilingual Document Parsing. A new open-source vision-language model, dots.ocr (1.7B parameters), delivers state-of-the-art multilingual OCR and document parsing. Covering 100+ languages, it unifies layout detection and content recognition, preserves structure, and outputs JSON/Markdown/HTML. Benchmarks show it surpasses Gemini 2.5 Pro in table accuracy and text precision, offering scalable, production-ready document analysis under the MIT license.
🔵 Enhanced throttling observability in Amazon DynamoDB: Amazon DynamoDB introduces enhanced throttling observability, including structured exception messages with ThrottlingReasons, eight new CloudWatch metrics for detailed breakdowns, and a cost-efficient Contributor Insights mode that tracks only throttled keys. These features simplify diagnosing hot partitions, improving monitoring, and enabling faster mitigation of performance issues across tables and global secondary indexes.
🔵 Smarter Authoring, Better Code: How AI is Reshaping Google Cloud's Developer Experience. Google Cloud is using Gemini-powered AI to accelerate documentation and code sample workflows. New systems auto-generate, validate, and test quickstarts and API code samples, ensuring accuracy and freshness at scale. By combining agentic AI systems with human oversight, Google delivers faster, more reliable developer guidance across evolving cloud services.
🔵 A Coding Guide to Build and Validate End-to-End Partitioned Data Pipelines in Dagster with Machine Learning Integration: A step-by-step guide to building an end-to-end partitioned data pipeline in Dagster, integrating raw data ingestion, cleaning, feature engineering, validation checks, and model training. Using a custom CSV-based IOManager, daily partitions, and lightweight regression, the tutorial shows how to create reproducible, modular pipelines with structured outputs and integrated machine learning.
🔵 Google AI Introduces Gemma 3 270M: A Compact Model for Hyper-Efficient, Task-Specific Fine-Tuning. Google AI introduces Gemma 3 270M, a 270M-parameter model built for hyper-efficient fine-tuning and on-device AI. With a 256k vocabulary, INT4 quantization, and strong instruction-following out of the box, it enables privacy-preserving, domain-specific applications. Compact yet powerful, it delivers energy-efficient inference, rapid customization, and production-ready deployment across mobile, edge, and enterprise environments.
🔵 “My biggest lesson was realizing that domain expertise matters more than algorithmic complexity.” This post follows a data scientist’s journey from corporate ML to independent AI consulting, reflecting on real-world lessons from competitions, the importance of domain expertise over algorithmic complexity, and a problem-first approach to AI adoption. It also covers mentoring advice, career path choices in data/AI, and emerging trends like text-to-speech for language preservation.
🔵 Build a deep research agent with Google ADK: This guide shows how to build an agentic lead generation system using Google’s Agent Development Kit (ADK). By orchestrating cooperative agents for pattern discovery and lead generation, it demonstrates state management, parallel research, and dynamic validation, transforming brittle scripts into intelligent, scalable workflows that mimic a market research team.
🔵 Hugging Face Unveils AI Sheets: A Free, Open-Source No-Code Toolkit for LLM-Powered Datasets. Hugging Face launches AI Sheets, a free, open-source, no-code tool that merges spreadsheets with LLM-powered data enrichment. Users can clean, transform, and generate datasets via prompts, using models like Qwen, Kimi, Llama 3, or custom local deployments. With built-in privacy, collaboration, and flexibility, it lowers barriers to AI-driven dataset creation.
🔵 Building an MCP-Powered AI Agent with Gemini and mcp-agent Framework: A Step-by-Step Implementation Guide. A hands-on guide to building an MCP-powered AI agent with Gemini and the mcp-agent framework. The tutorial shows how to set up an MCP tool server, wire structured services like search, analysis, code execution, and weather, and integrate them with Gemini for asynchronous, extensible, and production-ready agent workflows.
🔵 Model Predictive Control Basics: A step-by-step tutorial on Model Predictive Control (MPC) using Python and CasADi. It covers the fundamentals of MPC, formulates and solves an optimal control problem (OCP), and demonstrates implementation on a double integrator system. Includes full code, closed-loop simulations, and discussion of constraints, stability, and feasibility.
See you next time!