Together with Growth School & Infinite Uptime
The AI race is getting faster & dirtier day by day. Things we could never have imagined are happening.
--Thousands of people are getting laid off every day
--People are building 1-person million dollar companies
--Tech giants are fighting for AI talent
Meta just poached four of OpenAI’s top researchers…
So if you’re not learning AI today, you probably won't have a job in the next 6 months.
That’s why you need to join the 3-Day Free AI Mastermind by Outskill, which comes with 16 hours of intensive training on AI frameworks, hands-on building sessions, image and video creation, and more, to make you an AI expert. Originally priced at $895, but the first 100 of you get in completely FREE! Extended 4th of July SALE! 🎁
📅 FRI-SAT-SUN: Kick-Off Call & Live Sessions
🕜 10 AM EST to 7 PM EST
✅ Trusted by 4M+ learners
In the 5 sessions, you will learn directly from global experts from companies like Amazon, Microsoft, SamurAI, and more. And it’s ALL. FOR. FREE. 🤯 🚀
$5100+ worth of AI tools across 3 days — Day 1: 3000+ Prompt Bible, Day 2: Roadmap to make $10K/month with AI, Day 3: Your Personal AI Toolkit Builder.
Sponsored
Welcome to DataPro #141 ~ Engineering Intelligence, Not Just Models
In this landmark edition, we go beyond algorithms and hyperparameters to explore how data science is evolving into a discipline of system design, orchestration, and reasoning. As GenAI shifts the boundaries of what’s possible, the conversation is no longer about what model to use, but how we structure intelligence itself.
Our feature deep dive, “Beyond Prompts: The Rise of Context Engineering” by Rahul Singh, Data Science Manager at Adobe, challenges the prompt-centric mindset and introduces Context Engineering as a foundational pillar for building scalable, intelligent agents. If you’re architecting the future of enterprise AI, this is essential reading.
Also inside: a roundup of the latest tutorials, model releases, and tools, summarized later in this issue.
Whether you're scaling models, building infrastructure, or shaping AI policy, this issue delivers insights for every data scientist at the frontier.
✉️ Have tips or tools to share? Reply and contribute to our next edition.
Cheers,
Merlyn Shelley
Growth Lead, Packt
PlantOS Manufacturing Intelligence is powering the next era of industrial performance — delivering 99.97% equipment availability and up to 2% energy savings per unit produced. From steel to cement, manufacturers worldwide are turning fragmented data into confident decisions across every layer of production — from parameter to plant to global scale.
Sponsored
Why context engineering is the next frontier in building smarter, more reliable AI systems.
Written by Rahul Singh, Data Science Manager @Adobe.
Over my seven-plus-year career in data science, working on projects ranging from customer-value measurement to product analytics and personalization, one question has remained constant through it all: Do we have the right data, and can we trust it?
With the rapid rise of Generative AI, that question hasn’t disappeared; it’s become even more urgent. As AI systems evolve from proof-of-concept assistive chatbots to autonomous agents capable of reasoning and acting, their success increasingly depends not on how complex or powerful they are, but on how well they understand the context in which they operate.
In recent weeks, leaders like Tobi Lütke (CEO of Shopify), Andrej Karpathy (former Director of AI at Tesla), and others have spotlighted this shift. Lütke’s tweet was widely reshared, including by Karpathy, who elaborated on it further. He emphasized that context engineering is not about simple prompting, but about carefully curating, compressing, and sequencing the right mix of task instructions, examples, data, tools, and system states to guide intelligent behavior. This emerging discipline, still poorly understood in most organizations, is quickly becoming foundational to any serious application of generative AI.
This growing attention to context engineering signals a broader shift underway in the AI landscape. For much of the past year, prompt engineering dominated the conversation, shaping new job titles and driving a surge in hiring interest. But that momentum is tapering. A Microsoft survey across 31 countries recently ranked “Prompt Engineer” near the bottom of roles companies plan to hire (Source). Job search trends reflect the change as well: according to Indeed, prompt-related job searches have dropped from 144 per million to just 20–30 (Source).
But this decline doesn’t signal the death of prompt engineering by any means. Instead, it reflects a field in transition. As use cases evolve from assistive to agentic AI, systems that can plan, reason, and act autonomously, the core challenge is no longer just about phrasing a good prompt. It’s about whether the model has the right information, at the right time, to reason and take meaningful action.
This is where Context Engineering comes in!
If prompt engineering is about writing the recipe, carefully phrased, logically structured, and goal-directed, then context engineering is about stocking the pantry, prepping the key ingredients, and ensuring the model remembers what’s already been cooked. It’s the discipline of designing systems that feed the model relevant data, documentation, code, policies, and prior knowledge, not just once, but continuously and reliably.
In enterprises, where critical knowledge is often proprietary and fragmented across various platforms, including SharePoint folders, Jira tickets, Wiki pages, Slack threads, Git Repositories, emails, and dozens of internal tools, the bottleneck for driving impact with AI is rarely the prompt. It’s the missing ingredients from the pantry, the right data, delivered at the right moment, in the right format. Even the most carefully crafted prompt will fall flat if the model lacks access to the organizational context that makes the request meaningful, relevant, and actionable.
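To make the pantry idea concrete, here is a minimal sketch of context assembly: gather candidate slices from enterprise sources, keep the most relevant ones that fit a token budget, and hand the model only that. The connectors, sources, and scores below are hypothetical placeholders for illustration, not a specific vendor API.

```python
# Illustrative only: the "connectors" and scores are hypothetical stand-ins for
# enterprise sources such as wikis, ticket systems, schemas, and git repositories.

def fetch_candidate_context(question: str) -> list[dict]:
    """Pretend connectors that return context slices with a relevance score."""
    return [
        {"source": "wiki",   "text": "Churn is defined as no login in 90 days ...",        "score": 0.91},
        {"source": "schema", "text": "table customers(id, signup_date, last_login, ...)",  "score": 0.88},
        {"source": "jira",   "text": "DATA-142: churn table migrated to the new warehouse", "score": 0.74},
    ]

def assemble_context(question: str, token_budget: int = 800) -> str:
    """Select the highest-value slices that fit the budget, then format them."""
    slices = sorted(fetch_candidate_context(question), key=lambda s: -s["score"])
    chosen, used = [], 0
    for s in slices:
        cost = len(s["text"].split())  # crude token estimate, enough for the sketch
        if used + cost <= token_budget:
            chosen.append(f"[{s['source']}] {s['text']}")
            used += cost
    return "\n".join(chosen)

question = "How do we define churn?"
prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{assemble_context(question)}\n\n"
    f"Question: {question}"
)
```

The point is not the retrieval trick itself; it is that this selection and formatting step runs on every request, so the model always starts from a curated view of the organization rather than whatever happens to be pasted into a prompt.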
And as today’s LLMs evolve into Large Reasoning Models (LRMs), and agentic systems begin performing real, business-critical tasks, context becomes the core differentiator. Models like OpenAI’s o3 and Anthropic’s Claude Opus 4 can handle hundreds of thousands of tokens in one go. But sheer capacity is not enough to guarantee success. What matters is selectively injecting the right slices of enterprise knowledge: source code, data schemas, metrics, KPIs, compliance rules, naming conventions, internal policies, and more.
This orchestration of context is not just document retrieval; it’s evolving into a new systems layer. Instead of simply fetching files, these systems now organize and deliver the right information at the right step, sequencing knowledge, tracking intermediate decisions, and managing memory across interactions. In more advanced setups, supporting models handle planning, summarization, or memory compression behind the scenes, helping the primary model stay focused and efficient. These architectural shifts are making it possible for AI systems to reason more effectively over time and across tasks.
Without this context layer, even the best models stall on incomplete or siloed inputs. With it, they can reason fluidly across tasks, maintain continuity, and deliver compounding value with every interaction.
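One way to picture that systems layer is a loop that sequences steps, records intermediate decisions, and periodically compresses them into a running memory. In the sketch below, call_model and compress are placeholders for whatever primary and supporting models an organization actually uses; this is not a specific framework’s API.

```python
# Sketch of a context-orchestration loop with memory compression (all hypothetical).

def call_model(prompt: str) -> str:
    """Placeholder for the primary reasoning model."""
    return f"<result for: {prompt[-40:]}>"

def compress(notes: list[str]) -> str:
    """Placeholder for a smaller supporting model that summarizes prior steps."""
    return " | ".join(n[:30] for n in notes)

def run_plan(steps: list[str]) -> list[str]:
    memory, results = "", []
    for i, step in enumerate(steps, start=1):
        prompt = f"Memory so far: {memory}\nStep {i}: {step}"
        results.append(call_model(prompt))   # record the intermediate decision
        memory = compress(results)           # keep the carried context small
    return results

run_plan(["Pull last quarter's KPIs", "Compare them against targets", "Draft the summary"])
```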
Case in point: This isn’t just theory. One standout example comes from McKinsey. Their internal GenAI tool, Lilli, is context engineering in action. The tool unifies over 40 knowledge repositories and 100,000+ documents into a single searchable graph. When a consultant poses a question, it retrieves the five to seven most relevant artifacts, generates an executive summary, and even points to in-house experts for follow-up. This retrieval-plus-synthesis loop has driven ~72% firm-wide adoption and saves teams ~30% of the time they once spent hunting through SharePoint, wikis, and email threads, proof that the decisive edge isn’t just a bigger model, but a meticulously engineered stream of proprietary context (Source).
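The shape of that retrieval-plus-synthesis loop can be sketched in a few lines; the toy scoring, summary, and expert lookup below are deliberately naive placeholders for illustration, not a description of how Lilli is actually built.

```python
# Hypothetical retrieve -> synthesize -> route loop over made-up documents.

DOCS = [
    {"title": "Pricing playbook",  "owner": "jane.doe",   "text": "pricing strategy levers for retail clients"},
    {"title": "Retail benchmarks", "owner": "a.khan",     "text": "retail margin and pricing benchmarks by region"},
    {"title": "Onboarding guide",  "owner": "it.support", "text": "laptop setup and access requests"},
]

def score(query: str, text: str) -> int:
    """Toy relevance score: count of shared words between query and document."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def answer(query: str, k: int = 2) -> dict:
    top = sorted(DOCS, key=lambda d: score(query, d["text"]), reverse=True)[:k]
    summary = "; ".join(d["title"] for d in top)   # stand-in for an LLM-written summary
    experts = sorted({d["owner"] for d in top})    # people to follow up with
    return {"summary": summary, "experts": experts}

print(answer("retail pricing strategy"))
```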
What Does Context Actually Mean in the Enterprise?
By now, it’s clear that providing the right context is key to unlocking the full potential of AI and agentic systems inside organizations. But “context” isn’t just a document or a code snippet; it’s a multi-layered, fragmented, and evolving ecosystem. In real-world settings, it spans everything from database schemas to team ownership metadata, each layer representing a different slice of what an intelligent system needs to reason, act, and adapt effectively.
Based on my experience working across hundreds of data sources and collaborating with cross-functional product, engineering, and data teams, I’ve found that most enterprise context and information fall into nine broad categories. These aren’t just a checklist; they form a mental model: each category captures a dimension of the environment that AI agents must understand, depending on the use case, to operate safely, accurately, and effectively within your organization.
Read the full article on Packt’s Medium. If you’re new, make sure to follow our Medium handle and subscribe to our newsletter for more insights like this!
⭕ Implementing a Tool-Enabled Multi-Agent Workflow with Python, OpenAI API, and PrimisAI Nexus: Learn how to implement a multi-agent AI system using Python, OpenAI API, and PrimisAI Nexus. The tutorial covers setting up hierarchical supervision, defining structured JSON schemas, and integrating tools for code validation, statistical analysis, and documentation search. Agents collaborate to automate complex workflows across planning, development, QA, and data analysis with scalable, role-based coordination.
⭕ Hugging Face Releases SmolLM3: A 3B Long-Context, Multilingual Reasoning Model: Hugging Face's SmolLM3 is a compact 3B-parameter multilingual model offering SoTA reasoning, tool use, and 128k-token context handling. Released in base and instruct variants, it rivals 7B+ models across benchmarks like XQuAD and MGSM. SmolLM3 is ideal for multilingual RAG, agent workflows, and edge deployments, delivering powerful performance with efficiency and accessibility.
⭕ Microsoft Open-Sources GitHub Copilot Chat Extension for VS Code—Now Free for All Developers: Microsoft has open-sourced the GitHub Copilot Chat extension for VS Code under the MIT license, unlocking premium AI coding tools for free. With Agent Mode, Edit Mode, predictive Code Suggestions, and in-editor Chat, developers gain powerful automation, multi-file editing, and contextual assistance, paving the way for customizable, AI-enhanced workflows across open-source and enterprise environments.
⭕ Google AI Just Open-Sourced a MCP Toolbox to Let AI Agents Query Databases Safely and Efficiently: Google’s new MCP Toolbox for Databases simplifies secure, schema-aware SQL integration for AI agents with just a few lines of Python. Part of the open-source GenAI Toolbox, it supports PostgreSQL/MySQL, MCP-compliant interfaces, connection pooling, and safe query generation, enabling reliable database access for LLM workflows in analytics, customer support, DevOps, and enterprise automation.
⭕ The Five-Second Fingerprint: Inside Shazam’s Instant SongID: Part of the Behind the Tap series, this deep dive unpacks how Shazam identifies songs in seconds using audio fingerprinting, FFT-based spectrograms, and hash matching. It explains the journey from a tap to real-time song recognition, reveals Shazam’s scalable architecture, and explores its industry impact, from music discovery to market insights used by Apple and record labels.
⭕ POSET Representations in Python Can Have a Huge Impact on Business: POSETs (Partially Ordered Sets) offer a powerful alternative to traditional ranking systems by preserving multidimensional relationships without forcing a linear order. This post shows how POSETs can improve decision-making by avoiding arbitrary weighting and oversimplification, using Python and the Wine Quality dataset to build dominance matrices, Hasse diagrams, and interpret incomparability across samples (a minimal dominance-matrix sketch follows this list).
⭕ Build Algorithm-Agnostic ML Pipelines in a Breeze: MLarena is a newly open-sourced, algorithm-agnostic machine learning toolkit built on MLflow for training, evaluating, tuning, and deploying models. It balances automation with expert control, offering built-in diagnostics, explainability tools, robust hyperparameter optimization via Bayesian search, and seamless MLflow integration. MLarena simplifies end-to-end ML workflows while enhancing model transparency, stability, and reproducibility.
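Picking up the POSET item above, here is a minimal sketch of the dominance-matrix idea on made-up numbers (assuming NumPy is available and that higher is better on every criterion), showing how some samples simply remain incomparable rather than being forced into a single ranking.

```python
import numpy as np

# Hypothetical samples scored on three criteria (not the actual Wine Quality dataset).
samples = {"A": [7.4, 0.70, 3.0], "B": [7.8, 0.88, 3.2], "C": [7.0, 0.60, 3.4]}
names = list(samples)
X = np.array([samples[n] for n in names])

def dominates(a: np.ndarray, b: np.ndarray) -> bool:
    """a dominates b if it is at least as good everywhere and strictly better somewhere."""
    return bool(np.all(a >= b) and np.any(a > b))

# Dominance matrix: D[i, j] is True when sample i dominates sample j.
D = np.array([[dominates(X[i], X[j]) for j in range(len(names))] for i in range(len(names))])

for i, ni in enumerate(names):
    for j, nj in enumerate(names):
        if i < j and not D[i, j] and not D[j, i]:
            print(f"{ni} and {nj} are incomparable")  # no forced linear order
```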
See you next time!