
Tech News - Databases

233 Articles

ASF 034: Alex Yates interview (part 1) from Blog Posts - SQLServerCentral

Anonymous
30 Oct 2020
1 min read
Alex is a Data Platform MVP who loves DevOps. He has been helping data professionals apply DevOps principles to relational database development and deployment since 2010. He's most proud of helping Skyscanner develop the ability to deploy 95 times a day. Alex has worked with clients on every continent except Antarctica – so he's keen to meet anyone who researches penguins. Source The post ASF 034: Alex Yates interview (part 1) appeared first on SQLServerCentral.


shell from Blog Posts - SQLServerCentral

Anonymous
30 Oct 2020
1 min read
Installing go-task

This tool is great for cross-platform shell scripting, as it runs all the commands in the Taskfile.yml using a built-in Go shell library that supports bash syntax (and others). Quickly get up and running using the directions here: Install Task

# For default installation to ./bin with debug logging
sh -c "$(curl -ssL https://taskfile.dev/install.sh)" -- -d

# For installation to /usr/local/bin for user-wide access with debug logging
# May require sudo sh
sh -c "$(curl -ssL https://taskfile.dev/install.sh)" -- -d -b /usr/local/bin

#development #shell The post shell appeared first on SQLServerCentral.
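To give a feel for what the tool runs, here is a minimal, hypothetical Taskfile.yml sketch; the task names and commands below are illustrative only and are not taken from the original post:

```yaml
# Taskfile.yml - minimal, hypothetical example (task names and commands are made up)
version: '3'

tasks:
  hello:
    desc: Print a greeting using the same bash-style syntax on Windows, macOS and Linux
    cmds:
      - echo "hello from go-task"
```

With a file like this in the current directory, `task hello` runs the command through go-task's built-in shell interpreter.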


go from Blog Posts - SQLServerCentral

Anonymous
30 Oct 2020
2 min read
Starter Template

I saved this as a snippet for VS Code to get up and running quickly with something better than the defaults for handling func main isolation. I've been working on modifying this a bit as I don't really like using args, but am trying not to overcomplicate things as a new gopher. I tend to like better flag parsing than using args, but it's still a better pattern to get functions isolated from main to easily test. The gist I've taken from this and discussions in the community is to ensure that program termination is handled in main instead of in your functions. This isolation of logic from main also means you can more easily set up your tests, since func main() itself isn't testable.

package main

// package template from:

import (
    "errors"
    "fmt"
    "io"
    "os"
)

const (
    // exitFail is the exit code if the program fails.
    exitFail = 1
)

func main() {
    if err := run(os.Args, os.Stdout); err != nil {
        fmt.Fprintf(os.Stderr, "%s\n", err)
        os.Exit(exitFail)
    }
}

func run(args []string, stdout io.Writer) error {
    if len(args) == 0 {
        return errors.New("no arguments")
    }
    for _, value := range args[1:] {
        fmt.Fprintf(stdout, "Running %s", value)
    }
    return nil
}

Puzzles - FizzBuzz

I honestly had never done any algorithm or interview puzzles beyond sql-server, so I was really happy to knock this out relatively easily. At least I pass the basic Joel test. #development #golang The post go appeared first on SQLServerCentral.
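As a sketch of why isolating run() from main() helps testing, a minimal test might look like the following; the file name and test cases are made up for illustration and are not from the original gist:

```go
// main_test.go - hypothetical test for the run() function shown above.
package main

import (
	"bytes"
	"strings"
	"testing"
)

func TestRun(t *testing.T) {
	var out bytes.Buffer
	// Simulate os.Args: args[0] is the program name, the rest are real arguments.
	if err := run([]string{"prog", "alpha"}, &out); err != nil {
		t.Fatalf("run returned an error: %v", err)
	}
	if !strings.Contains(out.String(), "alpha") {
		t.Errorf("expected output to mention %q, got %q", "alpha", out.String())
	}
}
```

Because run() takes its arguments and an io.Writer explicitly, the test never has to touch os.Args or capture stdout.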


Blog .NET for Apache Spark UDFs Missing Shared State from Blog Posts - SQLServerCentral

Anonymous
29 Oct 2020
1 min read
The Problem

When you use a UDF in .NET for Apache Spark, something like this code:

using System;
using System.Text;
using Microsoft.Spark.Sql;

class Program
{
    static void Main(string[] args)
    {
        var spark = SparkSession.Builder().GetOrCreate();
        _logging.AppendLine("Starting Select");
        var udf = Functions.Udf<int, string>(theUdf);
        spark.Range(100).Select(udf(Functions.Col("id"))).Show();
        _logging.AppendLine("Ending Select");
        Console.WriteLine(_logging.ToString());
    }

    private static readonly StringBuilder _logging = new StringBuilder();

    private static string theUdf(int val)
    {
        _logging.AppendLine($"udf passed: {val}");
        return $"udf passed {val}";
    }
}

Generally, knowing .NET, we would expect the following output: The post Blog .NET for Apache Spark UDFs Missing Shared State appeared first on SQLServerCentral.


Migrating SQL Server container images to the Github Container Registry from Blog Posts - SQLServerCentral

Anonymous
29 Oct 2020
2 min read
A couple of months ago Docker announced that they would be implementing a 6 month retention policy for unused images in the Docker Hub. This was due to kick in on the 1st of November but has now been pushed back until mid 2021. I've had multiple Windows SQL Server container images up on the Docker Hub for years now. It's been a great platform and I'm very thankful to them for hosting my images. That being said, I want to make sure that the images that I've built are always going to be available for the community, so I have pushed my SQL Server images to the Github Container Registry. In the Docker Hub I have the following public SQL Server images:

dbafromthecold/sqlserver2012express:rtm
dbafromthecold/sqlserver2012dev:sp4
dbafromthecold/sqlserver2014express:rtm
dbafromthecold/sqlserver2014dev:sp2
dbafromthecold/sqlserver2016dev:sp2

These images won't be removed (by myself) and I will update this post in the event of them being removed. But they are now available on the Github Container Registry:

ghcr.io/dbafromthecold/sqlserver2012:express
ghcr.io/dbafromthecold/sqlserver2012:dev
ghcr.io/dbafromthecold/sqlserver2014:express
ghcr.io/dbafromthecold/sqlserver2014:dev
ghcr.io/dbafromthecold/sqlserver2016:dev

Bit of a disclaimer with these images…they're LARGE! The 2012 dev image is ~20GB, which is gigantic for a container image. So if you're going to use them (for dev and test only, please) you'll need to pre-pull them before running any containers. Thanks for reading! The post Migrating SQL Server container images to the Github Container Registry appeared first on SQLServerCentral.
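As a quick illustration of the pre-pull advice, pulling a couple of the GHCR images listed above ahead of time looks like this (the specific tags chosen here are just examples from that list):

```sh
# Pre-pull the (large) images before starting any containers from them
docker pull ghcr.io/dbafromthecold/sqlserver2016:dev
docker pull ghcr.io/dbafromthecold/sqlserver2014:express
```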


Daily Coping 29 Oct 2020 from Blog Posts - SQLServerCentral

Anonymous
29 Oct 2020
1 min read
I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I'm adding my responses for each day here. Today's tip is to be kind to yourself today. Remember progress takes time. This is something that having and working with kids has taught me. I've watched them take advice or coaching from me, or not, but I can see there is a difference across time. Lately I've had a lot of deliverables, for various different presentations. Rather than get upset or concerned when I've been delayed in getting something done, I've stopped and reminded myself that a task may just take time. It's helpful to remind myself that despite getting things done today, I may have more to do tomorrow. As long as I'm making progress, that's OK. The post Daily Coping 29 Oct 2020 appeared first on SQLServerCentral.

What do they want to hear? from Blog Posts - SQLServerCentral

Anonymous
29 Oct 2020
2 min read
I was helping my 16yo write his first college essay the other day. He's always struggled a bit with writing essays, so this isn't an easy task. As I was going over what he'd written so far, I gave him a piece of advice I'd never thought of before. When writing an essay, the first thing you need to ask yourself is: What do they want to hear? I told him to read the essay prompt, "Write about an event in your life," and then expand on that. It's a college essay, so writing about how the event he picked made him want to live life to its fullest and go join a circus probably isn't going to help. As I thought about this I realized just how far-reaching this concept is. When I write a blog post, I come up with a concept and then have to think about what the people reading it want to get out of it. In any form of communication, you need to read the prompt and then figure out what they really want. A blog post (as I mentioned above) has a target audience. And make no mistake, you are always going to be part of your target audience. What does that audience want? Do they want a simple how-to? A complex, in-depth discussion of a problem? The answer can, and probably will, change post by post. Documentation. Same as a blog post really. Any technical request. It could be anything from a new database, an SSIS package, a stored procedure, a view, etc. You need to figure out not just what they've asked for but what they want, and what they need. Know your audience. Make sure your output fits the audience. The post What do they want to hear? appeared first on SQLServerCentral.


DAX Logic and Blanks from Blog Posts - SQLServerCentral

Anonymous
29 Oct 2020
3 min read
A while back I was chatting with Shannon Lindsay on Twitter. She shares lots of useful Power BI tips there. She shared her syntax tip of the & operator being used for concatenation and the && operator being used for boolean AND, which reminded me about implicit conversions and blanks in DAX. Before you read the tweet below, see how many of these you can guess correctly:

Blank + 5 = ?
Blank * 5 = ?
5 / Blank = ?
0 / Blank = ?

In DAX, Blank is converted to 0 in addition and subtraction. And you can also have fun with blanks in DAX:

Blank + 5 = 5
Blank * 5 = Blank
5 / Blank = ∞
Blank / Blank = Blank
0 / Blank = NaN

— Meagan Longoria (@MMarie) September 4, 2020

What about boolean logic? Do you know the result of the following expressions?

AND(True(), Blank()) = ?
OR(True(), Blank()) = ?
AND(False(), Blank()) = ?
AND(Blank(), Blank()) = ?

You can see the results as well as a few more permutations in the screenshot below. Read the left table as Number1 [operator] Number2, so 5 + Blank = 5, 5 * Blank = Blank, and 5 / Blank = Infinity. Read the right table as Bool1 [operator] Bool2, so True AND Blank = False and True OR Blank = True.

Why does this matter? You need to understand the impact of blanks in your data. Do you really want to divide by zero when you are missing data? If you are performing a boolean AND and your data is blank, are you OK with showing a result of False? Remember that your expression may produce undesired results rather than an error. First, you need to be aware of where it is possible in your data to get a blank input. When you are writing your DAX measures, you may need to handle blanks. DAX offers the IFERROR() function to check if the result of an expression throws an error. There is also an ISBLANK() function that you can use to check for a blank value and a COALESCE() function to provide an alternate value when a blank value is detected. But adding extra logic in your measures may have a performance impact. For example, the DIVIDE() function can handle divide-by-zero errors for you, but DIVIDE() may be slower than the / operator. The performance difference depends on your data and the expression you are writing. Alternatively, you can use an IF statement to check if an input value is greater than zero using the > operator. This can be quicker than checking for blanks or errors using other functions. At the end of the day, producing the correct result is more important than fast performance, but we strive to achieve both. If you have any tips for handling blanks in DAX, please share them in the comments. The post DAX Logic and Blanks appeared first on SQLServerCentral.
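To make the options above concrete, here is a hedged sketch of two measure patterns; the table and column names (Sales[Amount], Sales[Quantity]) are placeholders invented for the example, not from the original post:

```dax
-- Hypothetical measures; Sales[Amount] and Sales[Quantity] are placeholder columns.

-- DIVIDE() handles a blank or zero denominator and returns the alternate result (here, 0).
Avg Price (safe) := DIVIDE ( SUM ( Sales[Amount] ), SUM ( Sales[Quantity] ), 0 )

-- COALESCE() substitutes a value when the expression evaluates to blank.
Amount or Zero := COALESCE ( SUM ( Sales[Amount] ), 0 )
```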


Using Write-Debug from Blog Posts - SQLServerCentral

Anonymous
28 Oct 2020
2 min read
I wrote a post about PoSh output recently, noting that in general we ought to use Write-Output or Write-Verbose for messaging. In there, I mentioned Write-Debug as well, as a way of allowing the user to control debug information. Since I often find myself fumbling a bit with debugging scripts, I decided to give this a try and see how it works.

First, let's build a simple script. In this case, I'll write a script that takes two parameters and determines which one is larger.

$i = $args[0]
$j = $args[1]
Write-Debug("First Param:$i")
Write-Debug("Second Param:$j")
if ($i -eq $null) {
    $i = 1
    Write-Debug("Setting first as a default to 1")
}
if ($j -eq $null) {
    $j = 1
    Write-Debug("Setting second as a default to 1")
}
if ($i -gt $j) {
    Write-Output("The first parameter is larger")
} elseif ($i -eq $j) {
    Write-Output("The parameters are equal.")
} else {
    Write-Output("The second parameter is larger")
}

If I run this, I get what I expect. Here are a few executions. Now, what if I'm unsure of what's happening? For example, I forget the second parameter. How does my program know the first parameter is larger? I have some debug information in there, but it doesn't appear. However, if I change the value of $DebugPreference, I see something.

The variable $DebugPreference controls how Write-Debug messages are processed. By default, this is set to SilentlyContinue. However, if I change this to Continue, all the messages appear. If I want, I can also set it to Stop or Inquire, allowing me to control the program differently. You can read more about preference variables here.

This is a handy thing to use. I've often had a variable I set in programs, sometimes as a parameter, that allows me to show debug messages, but I then need a series of IF statements inside the code to check it and display debug information. Now, I can just include Write-Debug calls in my code, and if the preference isn't set, I don't see them. I've seen this used in custom cmdlets from vendors, including Redgate, and it is nice to be able to access more information when something isn't working and have it suppressed by default. The post Using Write-Debug appeared first on SQLServerCentral.
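A minimal sketch of how the preference variable changes what you see, assuming the script above is saved as compare.ps1 (the file name is made up for the example):

```powershell
# By default, the debug stream is suppressed.
$DebugPreference            # -> SilentlyContinue
.\compare.ps1 5             # only the Write-Output result appears

# Turn the debug stream on for this session.
$DebugPreference = 'Continue'
.\compare.ps1 5             # DEBUG: lines now appear alongside the result

# Other values worth knowing: 'Stop' and 'Inquire'
```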


T-SQL Tuesday Retrospective #002: A Puzzling Situation from Blog Posts - SQLServerCentral

Anonymous
28 Oct 2020
1 min read
For the second T-SQL Tuesday ever — again, hosted by Adam Machanic — we were given three options, and I elected to go with the first one: describe a confusing situation you encountered, and explain how you debugged the problem and what the resolution was. This invitation was originally posted on 4 January… Continue reading T-SQL Tuesday Retrospective #002: A Puzzling Situation The post T-SQL Tuesday Retrospective #002: A Puzzling Situation appeared first on Born SQL. The post T-SQL Tuesday Retrospective #002: A Puzzling Situation appeared first on SQLServerCentral.

Daily Coping 28 Oct 2020 from Blog Posts - SQLServerCentral

Anonymous
28 Oct 2020
1 min read
I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I’m adding my responses for each day here. Today’s tip is to plan a fun or exciting activity to look forward to. I’ve got a couple, but in the short term, my wife and I decided to take advantage of a gift from Redgate. For the 21st birthday of the company, everyone got a voucher for an “experience” in their area. In looking over the items, we decided to do a Wine and Painting night with our kids. They are artistic, and we like wine. Actually, my son can drink wine as well, so 3 of us will enjoy wine, with 2 painting. I’m looking forward to the night we can do this. The post Daily Coping 28 Oct 2020 appeared first on SQLServerCentral.


Azure Databricks and Azure Key Vault from Blog Posts - SQLServerCentral

Anonymous
28 Oct 2020
1 min read
The key vault should always be a core component of your Azure design, because we can store keys, secrets, and certificates in it and thus abstract/hide the true connection string within files. When working with Databricks to mount storage to ingest your … Continue reading The post Azure Databricks and Azure Key Vault appeared first on SQLServerCentral.
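For context, pulling a secret from a Key Vault-backed secret scope inside a Databricks notebook looks roughly like this; the scope and key names below are placeholders, and the sketch assumes the scope has already been created against your Key Vault:

```python
# Hypothetical scope/key names - substitute your own Key Vault-backed secret scope.
storage_account_key = dbutils.secrets.get(scope="kv-backed-scope", key="storage-account-key")

# The retrieved value can then be used in a mount or connection configuration
# instead of hard-coding the credential in the notebook.
```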


Hans-Juergen Schoenig: PostgreSQL: Sophisticating temporary tables from Planet PostgreSQL

Matthew Emerick
16 Oct 2020
4 min read
Temporary tables have been around forever and are widely used by application developers. However, there is more to temporary tables than meets the eye. PostgreSQL allows you to configure the lifespan of a temporary table in a nice way and helps to avoid some common pitfalls.

CREATE TEMPORARY TABLE …

By default, a temporary table will live as long as your database connection. It will be dropped as soon as you disconnect. In many cases this is the behavior people want:

tmp=# CREATE TEMPORARY TABLE x (id int);
CREATE TABLE
tmp=# \d
        List of relations
  Schema   | Name | Type  | Owner
-----------+------+-------+-------
 pg_temp_3 | x    | table | hs
(1 row)

tmp=# \q
iMac:~ hs$ psql tmp
psql (12.3)
Type "help" for help.

tmp=# \d
Did not find any relations.

Once we have reconnected, the table is gone for good. Also, keep in mind that the temporary table is only visible within your session. Other connections are not going to see the table (which is, of course, the desired behavior). This also implies that many sessions can create a temporary table having the same name.

However, a temporary table can do more. The most important thing is the ability to control what happens on commit:

[ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]

As you can see, there are three options. "PRESERVE ROWS" is the behavior you have just witnessed. Sometimes you don't want that. It is therefore also possible to empty a temporary table on commit:

tmp=# BEGIN;
BEGIN
tmp=# CREATE TEMP TABLE x ON COMMIT DELETE ROWS AS SELECT * FROM generate_series(1, 5) AS y;
SELECT 5
tmp=# SELECT * FROM x;
 y
---
 1
 2
 3
 4
 5
(5 rows)

tmp=# COMMIT;
COMMIT
tmp=# SELECT * FROM x;
 y
---
(0 rows)

In this case, PostgreSQL simply leaves us with an empty table as soon as the transaction ends. The table itself is still around and can be used. Let us drop the table for now:

tmp=# DROP TABLE x;
DROP TABLE

Sometimes you want the entire table to be gone at the end of the transaction: "ON COMMIT DROP" can be used to achieve exactly that:

tmp=# BEGIN;
BEGIN
tmp=# CREATE TEMP TABLE x ON COMMIT DROP AS SELECT * FROM generate_series(1, 5) AS y;
SELECT 5
tmp=# COMMIT;
COMMIT
tmp=# SELECT * FROM x;
ERROR: relation "x" does not exist
LINE 1: SELECT * FROM x;

PostgreSQL will throw an error because the table is already gone. What is noteworthy here is that you can still use WITH HOLD cursors, as shown in the next example:

tmp=# BEGIN;
BEGIN
tmp=# CREATE TEMP TABLE x ON COMMIT DROP AS SELECT * FROM generate_series(1, 5) AS y;
SELECT 5
tmp=# DECLARE mycur CURSOR WITH HOLD FOR SELECT * FROM x;
DECLARE CURSOR
tmp=# COMMIT;
COMMIT
tmp=# FETCH ALL FROM mycur;
 y
---
 1
 2
 3
 4
 5
(5 rows)

The table itself is still gone, but the WITH HOLD cursor will ensure that the "content" of the cursor survives the end of the transaction. Many people don't expect this kind of behavior, but it makes sense and can come in pretty handy.

Controlling memory usage …

If you are using temporary tables, it makes sense to keep them relatively small. In some cases, however, a temporary table might be quite large for whatever reason. To ensure that performance stays good, you can tell PostgreSQL to keep more of a temporary table in RAM. temp_buffers is the parameter in postgresql.conf you should be looking at in this case:

tmp=# SHOW temp_buffers;
 temp_buffers
--------------
 8MB
(1 row)

The default value is 8 MB. If your temporary tables are large, increasing this value certainly makes sense.
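As a small aside not covered in the original post, temp_buffers can also be raised for a single session, as long as it is set before that session touches any temporary table; a sketch:

```sql
-- Must be set before the session uses any temporary table;
-- afterwards the value can no longer be changed for this session.
SET temp_buffers = '64MB';

CREATE TEMPORARY TABLE big_tmp AS
SELECT * FROM generate_series(1, 1000000) AS g(id);
```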
Finally … If you want to find out more about PostgreSQL database performance in general, consider checking out my post about three ways to detect and fix slow queries. The post PostgreSQL: Sophisticating temporary tables appeared first on Cybertec.

Scott Mead: Tracking Postgres Stats from Planet PostgreSQL

Matthew Emerick
15 Oct 2020
9 min read
Database applications are living, [(sometimes) fire-]breathing systems that behave in unexpected ways. As a purveyor of the pgCraft, it's important to understand how to interrogate a Postgres instance and learn about the workload. This is critical for lots of reasons:

Understanding how the app is using the database
Understanding what risks there are in the data model
Designing a data lifecycle management plan (i.e. partitions, archiving)
Learning how the ORM is behaving towards the database
Building a VACUUM strategy

There are lots of other reasons this data is useful, but let's take a look at some examples and get down to a few scripts you can use to pull this together into something useful. First, take a visit to the pgCraftsman's toolbox to find an easy-to-use snapshot script. This script is designed to be completely self-contained. It will run at whatever frequency you'd like and will save snapshots of the critical monitoring tables right inside your database. There are even a few reporting functions included to help you look at stats over time.

What to Watch

There are a number of critical tables and views to keep an eye on in the Postgres catalog. This isn't an exhaustive list, but a quick set that the toolbox script already watches:

pg_stat_activity
pg_locks
pg_stat_all_tables
pg_statio_all_tables
pg_stat_all_indexes
pg_stat_database

These tables and views provide runtime stats on how your application is behaving in regards to the data model. The problem with many of these is that they're either point-in-time (like pg_stat_activity) or cumulative (pg_stat_all_tables.n_tup_ins contains the cumulative number of inserts since pg_stat_database.stats_reset). In order to glean anything useful from these runtime performance views, you should be snapshot-ing them periodically and saving the results. I've seen (and built) lots of interesting ways to do this over the years, but the simplest way to generate some quick stats over time is with the PgCraftsman Toolbox script: pgcraftsman-snapshots.sql. This approach is great, but as you can guess, a small SQL script doesn't solve all the world's database problems. True, this script does solve 80% of them, but that's why it only took me 20% of the time.

Let's say I have a workload that I know nothing about; let's use pgcraftsman-snapshots.sql to learn about the workload and determine the best way to deal with it.

Snapshots

In order to build actionable monitoring out of the cumulative or point-in-time monitoring views, we need to snapshot the data periodically and compare between those snapshots. This is exactly what the pgcraftsman-snapshots.sql script does. All of the snapshots are saved in appropriate tables in a new 'snapshots' schema. The 'snapshot' function simply runs an INSERT as SELECT from each of the monitoring views. Each row is associated with the id of the snapshot being taken (snap_id). When it's all put together, we can easily see the number of inserts that took place in a given table between two snapshots, the growth (in bytes) of a table over snapshots, or the number of index scans against a particular index; essentially, any data in any of the monitoring views we are snapshot-ing.
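To make the INSERT-as-SELECT idea concrete, here is a deliberately simplified sketch of the pattern; the table name and layout are hypothetical and the real pgcraftsman-snapshots.sql schema is richer than this:

```sql
-- Hypothetical, simplified illustration of the snapshot pattern; not the toolbox code itself.
CREATE TABLE IF NOT EXISTS demo_snap_all_tables AS
SELECT 0 AS snap_id, now() AS dttm, s.*
FROM pg_stat_all_tables AS s
WHERE false;

-- Each call copies the current cumulative counters, tagged with a snapshot id.
INSERT INTO demo_snap_all_tables
SELECT 1 AS snap_id, now() AS dttm, s.*
FROM pg_stat_all_tables AS s;
```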
1. Install pgcraftsman-snapshots.sql

❯ psql -h your.db.host.name -U postgres -d postgres -f pgcraftsman-snapshots.sql
SET
CREATE SCHEMA
SELECT 92
CREATE INDEX
SELECT 93
CREATE INDEX
SELECT 6
CREATE INDEX
SELECT 7
CREATE INDEX
CREATE INDEX
CREATE INDEX
SELECT 145
CREATE INDEX
SELECT 3
CREATE INDEX
SELECT 269
CREATE INDEX
CREATE INDEX
CREATE INDEX
SELECT 1
CREATE INDEX
CREATE INDEX
CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE TABLE
CREATE INDEX
CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE TABLE
CREATE INDEX
CREATE INDEX
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
CREATE SEQUENCE
CREATE FUNCTION
 save_snap
-----------
         2
(1 row)

CREATE FUNCTION
CREATE TYPE
CREATE FUNCTION
CREATE TYPE
CREATE FUNCTION
CREATE TYPE
CREATE FUNCTION

In addition to installing the snapshot schema, this script takes two initial snapshots for you. You can monitor the snapshots by running:

postgres=# select * from snapshots.snap;
 snap_id |             dttm
---------+-------------------------------
       1 | 2020-10-15 10:32:54.31244-04
       2 | 2020-10-15 10:32:54.395929-04
(2 rows)

You can also get a good look at the schema:

postgres=# set search_path=snapshots;
SET
postgres=# \dt+
                              List of relations
  Schema   |          Name          | Type  |  Owner   |    Size    | Description
-----------+------------------------+-------+----------+------------+-------------
 snapshots | snap                   | table | postgres | 8192 bytes |
 snapshots | snap_all_tables        | table | postgres | 96 kB      |
 snapshots | snap_cpu               | table | postgres | 8192 bytes |
 snapshots | snap_databases         | table | postgres | 8192 bytes |
 snapshots | snap_indexes           | table | postgres | 120 kB     |
 snapshots | snap_iostat            | table | postgres | 8192 bytes |
 snapshots | snap_load_avg          | table | postgres | 8192 bytes |
 snapshots | snap_mem               | table | postgres | 8192 bytes |
 snapshots | snap_pg_locks          | table | postgres | 16 kB      |
 snapshots | snap_settings          | table | postgres | 32 kB      |
 snapshots | snap_stat_activity     | table | postgres | 16 kB      |
 snapshots | snap_statio_all_tables | table | postgres | 72 kB      |
(12 rows)

postgres=# reset search_path;
RESET
postgres=#

There are a few tables here (snap_cpu, snap_load_avg, snap_mem) that seem interesting, eh? I'll cover these in a future post; we can't get that data from within a Postgres instance without a special extension installed or some external driver collecting it. For now, those tables will remain unused.

2. Take a snapshot

The snapshots.save_snap() function included with pgcraftsman-snapshots.sql does a quick save of all the metadata and assigns it all a new snap_id:

postgres=# select snapshots.save_snap();
 save_snap
-----------
         3
(1 row)

postgres=#

The output row is the snap_id that was just generated and saved. Every time you want to create a snapshot, just call:

select snapshots.save_snap();

The easiest way to do this is via cron or another similar job scheduler (pg_cron). I find it best to schedule these before large workload windows and after. If you have a 24-hour workload, find inflection points that you're looking to differentiate between.

Snapshot Performance

Questions here about the performance of a snapshot make lots of sense. If you look at save_snap() in the code, you'll see that the runtime of the process is going to depend on the number of rows in each of the catalog tables.
This will depend on:

pg_stat_activity <– number of connections to the instance
pg_locks <– number of locks
pg_stat_all_tables <– number of tables in the database
pg_statio_all_tables <– number of tables in the database
pg_stat_all_indexes <– number of indexes in the database
pg_stat_database <– number of databases in the instance

For databases with thousands of objects, snapshots should be pruned frequently so that the snapshot mechanism itself does not cause performance problems.

Pruning old snapshots

Pruning old snapshots with this script is really easy. There is a relationship between the snapshots.snap table and all the others, so a simple 'DELETE FROM snapshots.snap WHERE snap_id = x;' will delete all the rows for the given snap_id.

3. Let the workload run

Let's learn a little bit about the workload that is running in the database. Now that we have taken a snapshot (snap_id = 3) before the workload, we're going to let the workload run for a bit, then take another snapshot and compare the difference. (Note: snapshots just read the few catalog tables I noted above and save the data. They don't start a process or run anything. The only thing that will make your snapshots run long is if you have a large number of objects (schema, table, index) in the database.)

4. Take a 'post-workload' snapshot

After we've let the workload run for a while (5 minutes, 2 hours, 2 days… whatever you think will give the best approximation for your workload), take a new snapshot. This will save the new state of data and let us compare the before and after stats:

postgres=# select snapshots.save_snap();
 save_snap
-----------
         4
(1 row)

postgres=#

5. Analyze the report

There are two included functions for reporting across the workload:

select * from snapshots.report_tables(start_snap_id, end_snap_id);
select * from snapshots.report_indexes(start_snap_id, end_snap_id);

Both of these reports need a starting and ending snap_id. You can get this by examining the snapshots.snap table:

postgres=# select * from snapshots.snap;
 snap_id |             dttm
---------+-------------------------------
       1 | 2020-10-15 10:32:54.31244-04
       2 | 2020-10-15 10:32:54.395929-04
       3 | 2020-10-15 10:56:56.894127-04
       4 | 2020-10-15 13:30:47.951223-04
(4 rows)

postgres=#

Our pre-workload snapshot was snap_id = 3 and our post-workload snapshot was snap_id = 4. Since we are reporting between two snapshots, we can see exactly what occurred between them: the number of inserts / updates / deletes / sequential scans / index scans, and even table growth (bytes and human-readable). The key is that this is just what took place between the snapshots. You can take a snapshot at any time and report across any number of them. (Note: you may need to side-scroll to see the full output; I highly recommend it.)
postgres=# select * from snapshots.report_tables(3,4);
 time_window | relname | ins | upd | del | index_scan | seqscan | relsize_growth_bytes | relsize_growth | total_relsize_growth_bytes | total_relsize_growth | total_relsize | total_relsize_bytes
-----------------+-------------------------+--------+--------+-----+------------+---------+----------------------+----------------+----------------------------+----------------------+---------------+---------------------
 02:33:51.057096 | pgbench_accounts | 0 | 588564 | 0 | 1177128 | 0 | 22085632 | 21 MB | 22085632 | 21 MB | 1590083584 | 1516 MB
 02:33:51.057096 | pgbench_tellers | 0 | 588564 | 0 | 588564 | 0 | 1269760 | 1240 kB | 1597440 | 1560 kB | 1720320 | 1680 kB
 02:33:51.057096 | pgbench_history | 588564 | 0 | 0 | | 0 | 31244288 | 30 MB | 31268864 | 30 MB | 31268864 | 30 MB
 02:33:51.057096 | pgbench_branches | 0 | 588564 | 0 | 587910 | 655 | 1081344 | 1056 kB | 1146880 | 1120 kB | 1204224 | 1176 kB
 02:33:51.057096 | snap_indexes | 167 | 0 | 0 | 0 | 0 | 49152 | 48 kB | 65536 | 64 kB | 204800 | 200 kB
 02:33:51.057096 | snap_all_tables | 111 | 0 | 0 | 0 | 0 | 40960 | 40 kB | 40960 | 40 kB | 172032 | 168 kB
 02:33:51.057096 | snap_statio_all_tables | 111 | 0 | 0 | 0 | 0 | 24576 | 24 kB | 24576 | 24 kB | 114688 | 112 kB
 02:33:51.057096 | pg_statistic | 23 | 85 | 0 | 495 | 0 | 16384 | 16 kB | 16384 | 16 kB | 360448 | 352 kB
 02:33:51.057096 | snap_pg_locks | 39 | 0 | 0 | 0 | 0 | 8192 | 8192 bytes | 32768 | 32 kB | 98304 | 96 kB
 02:33:51.057096 | snap_stat_activity | 6 | 0 | 0 | 0 | 0 | 0 | 0 bytes | 0 | 0 bytes | 32768 | 32 kB
 02:33:51.057096 | snap | 1 | 0 | 0 | 0 | 324 | 0 | 0 bytes | 0 | 0 bytes | 57344 | 56 kB
 02:33:51.057096 | snap_settings | 1 | 0 | 0 | 1 | 1 | 0 | 0 bytes | 0 | 0 bytes | 114688 | 112 kB
 02:33:51.057096 | snap_databases | 1 | 0 | 0 | 0 | 0 | 0 | 0 bytes | 0 | 0 bytes | 24576 | 24 kB
 02:33:51.057096 | pg_class | 0 | 1 | 0 | 1448 | 200 | 0 | 0 bytes | 0 | 0 bytes | 245760 | 240 kB
 02:33:51.057096 | pg_trigger | 0 | 0 | 0 | 3 | 0 | 0 | 0 bytes | 0 | 0 bytes | 65536 | 64 kB
 02:33:51.057096 | sql_parts | 0 | 0 | 0 | | 0 | 0 | 0 bytes | 0 | 0 bytes | 49152 | 48 kB
 02:33:51.057096 | pg_event_trigger | 0 | 0 | 0 | 0 | 0 | 0 | 0 bytes | 0 | 0 bytes | 16384 | 16 kB
 02:33:51.057096 | pg_language | 0 | 0 | 0 | 1 | 0 | 0 | 0 bytes | 0 | 0 bytes | 73728 | 72 kB
 02:33:51.057096 | pg_toast_3381 | 0 | 0 | 0 | 0 | 0 | 0 | 0 bytes | 0 | 0 bytes | 8192 | 8192 bytes
 02:33:51.057096 | pg_partitioned_table | 0 | 0 | 0 | 0 | 0 | 0 | 0 bytes | 0 | 0 bytes | 8192 | 8192 bytes
 02:33:51.057096 | pg_largeobject_metadata | 0 | 0 | 0 | 0 | 0 | 0 | 0 bytes | 0 | 0 bytes | 8192 | 8192 bytes
 02:33:51.057096 | pg_toast_16612 | 0 | 0 | 0 | 0 | 0 | 0 | 0 bytes | 0 | 0 bytes | 8192 | 8192 bytes

This script is a building block. If you have a single database that you want stats on, it's great. If you have dozens of databases in a single instance or dozens of instances, you're going to quickly wish you had this data in a dashboard of some kind. Hopefully this gets you started with metric building against your Postgres databases. Practice the pgCraft, submit me a pull request! Next time, we'll look more into some of the insights we can glean from the information we assemble here.
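Since the post suggests driving save_snap() from cron or pg_cron, here is a hedged sketch of the pg_cron route; it assumes the pg_cron extension is installed and preloaded on the instance, and the 30-minute interval is just an example:

```sql
-- Assumes pg_cron is available (shared_preload_libraries configured); interval is arbitrary.
CREATE EXTENSION IF NOT EXISTS pg_cron;
SELECT cron.schedule('*/30 * * * *', 'SELECT snapshots.save_snap()');
```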

Alexey Lesovsky: Postgres 13 Observability Updates from Planet PostgreSQL

Matthew Emerick
15 Oct 2020
2 min read
New shiny Postgres 13 has been released, and now it's time to make some updates to the "Postgres Observability" diagram. The new release includes many improvements related to monitoring, such as new stats views and new fields added to existing views. Let's take a closer look at these.

The list of progress views has been extended with two new views. The first one is "pg_stat_progress_basebackup", which helps to observe running base backups and estimate their progress, ETA and other properties. The second view is "pg_stat_progress_analyze"; as the name suggests, it watches over running ANALYZE operations. The third new view is called "pg_shmem_allocations", which is supposed to be used for deeper inspection of how shared buffers are used. The fourth, and last, new view is "pg_stat_slru", related to the inspection of SLRU caches. Both of these recently added views help to answer the question "How does Postgres spend its allocated memory?"

Other improvements are general-purpose and related to the existing views. "pg_stat_statements" has a few modifications: new fields related to planning time have been added, and because of this the existing "time" fields have been renamed to execution time, so all monitoring tools that rely on pg_stat_statements should be adjusted accordingly. New fields related to WAL have also been added; now it's possible to understand how much WAL has been generated by each statement. WAL usage statistics have also been added to EXPLAIN (a new WAL keyword), auto_explain and autovacuum, and WAL usage stats are appended to the logs (that is, if log_autovacuum_min_duration is enabled). pg_stat_activity has a new column "leader_pid", which shows the PID of the parallel group leader and helps to explicitly identify background workers with their leader.

A huge thank you goes to the many who contributed to this new release, among which are my colleagues Victor Yegorov and Sergei Kornilov, and also those who help to spread the word about Postgres to other communities and across geographies. The post Postgres 13 Observability Updates appeared first on Data Egret.
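For example, the new leader_pid column makes it straightforward to group parallel workers under their leader; a quick sketch against a Postgres 13 instance:

```sql
-- Show parallel workers grouped under their leader process (Postgres 13+).
SELECT leader_pid, count(*) AS workers
FROM pg_stat_activity
WHERE leader_pid IS NOT NULL
GROUP BY leader_pid;
```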