Learn about the latest in GenAI for vulnerability management, exposure management and cyber-asset security when you attend the CyberRisk Summit.
This free, virtual event on Wednesday, Nov. 20 includes expert speakers from Yahoo, Wells Fargo, IBM, Vulcan Cyber and more. This is the ninth semi-annual CyberRisk Summit. Attendees can request CPE credits, and all registrants get access to the session recordings. Join us!
Sponsored
🗞️ Welcome to BIPro #84 – Your Weekly Dose of BI Brilliance! 🚀
Fuel your data-driven decisions with the freshest trends, strategies, and hacks from the world of business intelligence.
📊 Data Viz & Tools: Future-Proof Your Insights
◘ Pandas + SQL = Powerhouse Duo: Unleash their combined potential for seamless data analysis.
◘ DuckDB Demystified: A Python-based guide to effortless analytics.
◘ Google Cloud’s Secure Data Playbook: Step-by-step to building a fortress-like platform.
◘ Custom T-SQL in Azure Studio: Speed up workflows with tailored code snippets.
◘ Master Pandas for Data Wrangling: Learn the essentials to transform tabular data.
◘ Small Deployments Made Easy: Cloud Migration App simplifies the process.
◘ Alteryx Fall 2024 Updates: Faster workflows, better reports—dive in!
🔄 BI in Action: Real-World Innovations
◘ REST APIs & Fabric: Master the art of data ingestion.
◘ GraphQL Meets Fabric: Discover powerful relationships through Microsoft’s API.
◘ Dataproc Serverless Gets a Boost: Performance upgrades you can’t miss.
◘ Index Management 101: Clean databases = fast queries.
◘ Saving Big on Open-Source DBs: Proven cost-cutting strategies.
◘ Sentiment Analysis with WebAssembly: SingleStore’s clever approach.
◘ Topgolf’s BI Makeover: Learn how QuickSight transformed their game.
⚡ Quick Wins: BI Hacks You’ll Love
◘ Power BI Magic: Running totals, averages, and more with aggregate functions.
◘ SQL Simplified: Clear examples of IS NULL and IS NOT NULL usage.
◘ SCD vs Overwrite: Navigate data warehouse dimensions with ease.
◘ Moving Averages Made Simple: T-SQL windowing functions explained.
◘ Streaming Architecture 101: Build with Apache Kafka and Zookeeper.
◘ Patient Jarvis Solution: Fractal’s innovative approach to patient insights.
🎤 Voices of BI: Wisdom from the Experts
◘ Tableau Viz Extensions: Everything you need to level up visualizations.
◘ Graph It Right: NetworkX tips for mastering graphs in Python.
◘ Data Validation Done Right: Introducing Pandera for Python users.
◘ Fixing Cross-Validation Visuals: Why common diagrams mislead and how to redraw them.
◘ 6 Pillars of Data Analysis: A framework for actionable insights.
◘ AlloyDB Omni 15.7.0: What’s new and why it matters.
Enjoy this week’s curated lineup of BI brilliance!
Calling All Data & BI Enthusiasts!
Do you dream of sharing your insights and building your reputation in the Data & BI community? Contribute to our new column in the Packt BIPro newsletter! Share your experiences, discuss new BI tools, or ask questions. Gain recognition among 37,000 BI professionals. Reply with your Google Docs article or use our weekly feedback form. Enjoy a free PDF of "Interactive Data Visualization with Python - Second Edition" for participating. Click reply or share your content today!
Share your thoughts and opinions here!
Cheers,
Merlyn Shelley
Editor-in-Chief, Packt
➽Learn Microsoft Fabric: Explore Microsoft Fabric's features through real-world examples to build robust data analytics solutions, including lakehouses and data warehouses. Learn to monitor and manage your analytics system for flexibility, performance, and security, while leveraging AI-driven insights with Copilot integration. Start your free trial for access, renewing at $19.99/month.
➽Microsoft Power BI Cookbook - Third Edition: Dive into Microsoft Data Fabric to enhance data strategies and gain deeper insights. Effortlessly create Hybrid tables and comprehensive scorecards while utilizing new visualization tools that transform complex data into clear, actionable charts and reports for effective decision-making in Power BI. Start your free trial for access, renewing at $19.99/month.
➽Fundamentals of Analytics Engineering: Explore how analytics engineering aligns with your organization's data strategy while gaining insights from seven industry experts. Address common challenges faced by businesses and learn to implement scalable analytics solutions, from data ingestion to visualization, using industry-leading tools. Start your free trial for access, renewing at $19.99/month.
➽Getting Started with DuckDB: Utilize DuckDB to efficiently load, transform, and query diverse data sources and formats. Gain hands-on experience with SQL, Python, and R for data analysis, while exploring how open-source tools and cloud services enhance DuckDB’s versatile capabilities in the data ecosystem. Start your free trial for access, renewing at $19.99/month.
⫸ Using Pandas and SQL Together for Data Analysis: This blog helps you understand when to use SQL and Python together for data manipulation, showcasing how PandaSQL bridges SQL's readability with Python's flexibility for seamless integration and analysis in data workflows.
⫸ A Guide to Data Analysis in Python with DuckDB: This blog introduces DuckDB, a powerful in-process OLAP database that lets you seamlessly query pandas DataFrames, CSVs, and Parquet files using SQL in Python. Learn how to set it up, generate sample data, and perform data analysis effortlessly.
⫸ Learn how to build a secure data platform with Google Cloud ebook: Discover how Google Cloud secures data-driven innovation in the Building a Secure Data Platform with Google Cloud ebook. Learn about advanced tools like encryption, access controls, and compliance monitoring to protect your data while enabling intelligent applications and fostering business growth.
⫸ How to Develop Custom T-SQL Code Snippets in Azure Data Studio: This blog guides you on efficiently using and creating custom T-SQL code snippets in Azure Data Studio, helping streamline your workflows by automating repetitive tasks and enhancing productivity in your SQL development process.
⫸ Explore Pandas in Python to Analyze and Manipulate Tabular Data: This blog introduces you to the Pandas library, showcasing its power in data analysis and manipulation in Python. Learn key features, installation steps, and practical use cases like creating Series, performing arithmetic operations, and applying aggregations.
⫸ How to Use the Cloud Migration App for Small Deployments? This blog introduces the Cloud Migration App for Small Deployments, a tool designed for Tableau administrators to easily transition content, users, and workbooks from Tableau Server to Tableau Cloud. Learn its key features, setup process, and limitations for efficient small-scale migrations.
⫸ Alteryx Fall 2024 Release Improves Workflow Efficiency and Reporting: This blog highlights the Fall 2024 Alteryx Release, offering simplified workflows, AI-powered reporting, and enhanced data connectivity. Discover new tools for cloud integration, hybrid architectures, and streamlined productivity to revolutionize data-driven decision-making for businesses and IT leaders.
⫸ Ingesting Data From REST API endpoints: Data Engineering with Fabric. This blog guides you through leveraging REST APIs in Python using a Spotify use case. Learn how to authenticate, retrieve data, handle errors, and interact with endpoints using dynamic functions—all within a Fabric notebook environment.
⫸ Relationships with Microsoft Fabric GraphQL API: This blog explores using the Microsoft Fabric GraphQL API to query data across related tables in a star schema. Learn how to create relationships, handle directional queries, and implement advanced many-to-many relationships to maximize data accessibility for end-users.
⫸ Dataproc Serverless performance and usability updates: This post introduces new features in Dataproc Serverless to enhance your Spark experience, including faster native query execution, real-time monitoring with a built-in Spark UI, and Gemini-powered autotuning for smarter troubleshooting and performance optimization.
⫸ A Tidy Database is a Fast Database: Why Index Management Matters: This post covers how to identify, optimize, and manage database indexes to improve SQL Server performance. Learn how to address unused, fragmented, and overlapping indexes, resolve missing index issues, and implement effective maintenance strategies for efficient resource use and faster queries.
⫸ Cost Optimization Strategies for Large-Scale Open-Source DBs: This post guides you on managing large-scale open-source databases cost-effectively. It covers choosing the right database, optimizing infrastructure, tuning performance, leveraging automation, and implementing strategies like caching, sharding, and containerization for efficiency and scalability.
⫸ Using SingleStore and WebAssembly for Sentiment Analysis: This article guides you in performing sentiment analysis on Stack Overflow comments using SingleStore and WebAssembly, demonstrating data ingestion, function creation, and analysis through SQL and Python in the SingleStore Cloud environment.
⫸ Transforming data into insights: How Topgolf revolutionized business intelligence using Amazon QuickSight. This post highlights how Topgolf transformed its operations with Amazon QuickSight, enabling organization-wide data access, real-time insights, and tailored dashboards to optimize performance, improve customer experiences, and foster a culture of data-driven decision-making.
⫸ Aggregate Functions in Power BI - Running Total, Average, Max and Min: This post demonstrates how to create custom aggregations in Power BI using DAX (Data Analysis Expressions). Learn how to set up your data, build tailored measures, and gain precise insights to enhance your reports and data understanding.
⫸ SQL IS NULL and SQL IS NOT NULL Examples: This post provides a clear guide on handling NULL values in SQL Server. Learn how to use IS NULL and IS NOT NULL operators effectively, understand the nuances of NULL, and avoid common pitfalls in SQL queries.
⫸ Data Warehouse Considerations - SCD Type 2 vs Overwrite Dimensions: This post explores two key strategies for managing dimension table updates in data warehousing: Overwriting Tables and Slowly Changing Dimensions (SCD) Type 2. Learn their use cases, benefits, and why SCD Type 2 is often ideal for tracking historical data changes.
⫸ Calculate a Moving Average with T-SQL Windowing Functions: This post explores two methods for calculating moving averages in SQL Server: an older self-join approach and a modern windowing function approach. Learn how to optimize queries and improve performance with indexes and efficient SQL techniques.
⫸ Build a Streaming Data Architecture with Apache Kafka and Zookeeper: This article demonstrates how to use Apache Kafka and Zookeeper for real-time data streaming, showcasing a project to capture, process, and load data into Elasticsearch and Azure Data Lake Gen 2 for analysis.
⫸ Revolutionizing Patient Insights with Fractal’s Patient Jarvis solution: This article introduces Fractal’s Patient Jarvis, an AI-powered solution designed to streamline pharmaceutical data analytics. It unifies claims data, leverages AWS-powered AI, and provides actionable insights to improve decision-making, operational efficiency, and patient outcomes in the pharmaceutical industry.
⫸ Your Guide to Tableau Viz Extensions: This article highlights the revolutionary Viz Extensions in Tableau 2024.2, enabling the creation of complex visualizations—like Sankey diagrams, radar charts, and network diagrams—as easily as traditional charts, simplifying advanced analytics and expanding Tableau's capabilities.
⫸ Navigating Networks with NetworkX: A Short Guide to Graphs in Python. This article introduces NetworkX, a Python library for building, analyzing, and visualizing networks, showcasing its applications in understanding complex relationships such as social connections or transportation systems through nodes and edges, enriched with attributes and algorithms.
⫸ Data Validation with Pandera in Python: This article explores how Pandera, a Python library, streamlines data validation for dataframe-like objects in machine learning and analytics pipelines. It highlights Pandera's efficiency, scalability, and support for libraries like pandas and Dask, emphasizing its custom validations and schema-based approach to ensure data integrity.
⫸ Why Most Cross-Validation Visualizations Are Wrong (And How to Fix Them): This article critiques traditional cross-validation diagrams in data science, highlighting how they confuse the brain by making chunks of data appear as one moving piece. It proposes rethinking visuals to align with natural cognition and inclusivity.
⫸ A Practical Framework for Data Analysis: 6 Essential Principles: This article outlines six essential data analysis principles for data scientists, focusing on techniques like establishing baselines, normalizing metrics, MECE grouping, aggregating data, removing irrelevant information, and applying the Pareto principle to extract actionable insights.
⫸ What’s new in AlloyDB Omni version 15.7.0: This article highlights the new features in AlloyDB Omni version 15.7.0, including faster performance, an ultra-fast disk cache, an enhanced columnar engine, ScaNN vector indexing, and an updated Kubernetes operator, advancing PostgreSQL workflows across diverse environments.
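Try it yourself: the moving-average item above contrasts an older self-join approach with a windowing function. Here is a minimal sketch of both, not the linked article's code. It assumes SQLite (via Python's built-in sqlite3 module, with SQLite 3.25 or later for window functions) in place of SQL Server, and the sales table, its columns, and its values are invented for illustration.

```python
import sqlite3

# Hypothetical daily sales figures, invented for illustration.
rows = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0), (5, 50.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Windowing approach: average the current row and the two rows before it.
window_sql = """
SELECT day,
       AVG(amount) OVER (
           ORDER BY day
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS moving_avg
FROM sales
ORDER BY day
"""

# Older self-join approach: pair each day with itself and the two prior days.
# This matches the window version only because the days have no gaps.
self_join_sql = """
SELECT a.day, AVG(b.amount) AS moving_avg
FROM sales AS a
JOIN sales AS b ON b.day BETWEEN a.day - 2 AND a.day
GROUP BY a.day
ORDER BY a.day
"""

window_result = conn.execute(window_sql).fetchall()
self_join_result = conn.execute(self_join_sql).fetchall()
conn.close()

for day, avg in window_result:
    print(day, avg)
```

Both queries return the same three-day averages on this sample. The window version reads the table once and states the frame explicitly, which is usually why it is preferred over the self-join.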