Tech News - Databases

233 Articles

External tables vs T-SQL views on files in a data lake from Blog Posts - SQLServerCentral

Anonymous
03 Nov 2020
4 min read
A question that I have been hearing recently from customers using Azure Synapse Analytics (the public preview version) is: what is the difference between using an external table and a T-SQL view on a file in a data lake? Note that both a T-SQL view and an external table pointing to a file in a data lake can be created in a SQL Provisioned pool as well as in a SQL On-demand pool. Here are the differences that I have found:

- Overall summary: views are generally faster and have more features, such as OPENROWSET.
- Virtual functions (filepath and filename) are not supported with external tables, which means users cannot do partition elimination based on FILEPATH or use complex wildcard expressions via OPENROWSET (which can be done with views).
- External tables can be shared with other compute engines, since their metadata can be mapped to and from Spark and other compute experiences, while views are SQL queries and thus can only be used by a SQL On-demand or SQL Provisioned pool.
- External tables can use indexes to improve performance, while views would require indexed views for that.
- SQL On-demand automatically creates statistics both for external tables and for views using OPENROWSET. You can also explicitly create/update statistics on files with OPENROWSET. Note that automatic creation of statistics is turned on for Parquet files; for CSV files, you need to create statistics manually until automatic creation of CSV file statistics is supported.
- Views give you more flexibility in the data layout (external tables expect the OSS Hive partitioning layout, for example) and allow more query expressions to be added.
- External tables require an explicitly defined schema, while views can use OPENROWSET to provide automatic schema inference, allowing for more flexibility (but note that an explicitly defined schema can provide faster performance).
- If you reference the same external table in your query twice, the query optimizer will know that you are referencing the same object twice, while two identical OPENROWSETs will not be recognized as the same object. For this reason, better execution plans can be generated in such cases when using external tables instead of views using OPENROWSET.
- Row-level security (Polybase external tables for Azure Synapse only) and Dynamic Data Masking will work on external tables. Row-level security is not supported with views using OPENROWSET.
- You can use both external tables and views to write data to the data lake via CETAS (this is the only way either option can write data to the data lake).
- If using SQL On-demand, make sure to read Best practices for SQL on-demand (preview) in Azure Synapse Analytics.

I often get asked what the difference in performance is when querying a file in ADLS Gen2 through an external table or view versus querying a highly compressed table in a SQL Provisioned pool (i.e. a managed table). It is hard to quantify without understanding more about each customer's scenario, but you will roughly see a 5X performance difference between queries over external tables and views versus managed tables (depending on the query that will vary, and it could be more than 5X in some scenarios). A few things contribute to that: in-memory caching, SSD-based caches, result-set caching, and the ability to design and align data and tables when they are stored as managed tables. You can also create materialized views for managed tables, which typically bring large performance improvements as well.
If you are querying Parquet data, that is a columnstore file format with compression, so it would give you similar data/column elimination to what a managed SQL clustered columnstore index (CCI) would give; but if you are querying non-Parquet files, you do not get this functionality. Note that for managed tables, on top of performance, you also get a granular security model, workload management capabilities, and so on (see Data Lakehouse & Synapse). The post External tables vs T-SQL views on files in a data lake first appeared on James Serra's Blog. The post External tables vs T-SQL views on files in a data lake appeared first on SQLServerCentral.
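As a rough illustration of the two options discussed above, here is a minimal T-SQL sketch of an external table and an equivalent OPENROWSET view over the same Parquet folder in a SQL On-demand (serverless) pool. The storage URL, external data source, and file format names are hypothetical and assume those objects already exist in your workspace.

    -- External table: explicit schema, metadata shareable with Spark pools
    -- (MyDataLake and ParquetFormat are assumed, pre-created objects).
    CREATE EXTERNAL TABLE dbo.SalesExternal
    (
        SaleId   INT,
        SaleDate DATE,
        Amount   DECIMAL(18, 2)
    )
    WITH
    (
        LOCATION    = '/sales/2020/',
        DATA_SOURCE = MyDataLake,
        FILE_FORMAT = ParquetFormat
    );

    -- View over OPENROWSET: schema inference, filepath()/filename() available,
    -- but usable only from the SQL pools.
    CREATE VIEW dbo.SalesView
    AS
    SELECT *
    FROM OPENROWSET(
             BULK 'https://mystorage.dfs.core.windows.net/datalake/sales/2020/*.parquet',
             FORMAT = 'PARQUET'
         ) AS rows;

Either object can then be queried with a plain SELECT; the choice mainly affects the features and optimizations listed above.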


Power BI – Hungry Median from Blog Posts - SQLServerCentral

Anonymous
24 Nov 2020
8 min read
Introduction

Median is a useful statistical function which first appeared in SSAS 2016, and in Power BI around that year as well. There are several articles on how to implement a median function in DAX from the time before the native DAX function was introduced. With one client we recently faced an issue when using the implicit median function in Power BI. The size of the dataset was roughly 30 million records – nothing challenging for Power BI or DAX itself, I would say. However, the behavior of the median function was not convincing at all.

Let's look at the setup: I created a median dataset based on free data from weather sensors in one city (a link to download is at the end of the blog) which has similar data characteristics to our report with the original issue. We have the following attributes: date, hour, location (just a numeric ID of the location, which is fine for our test), and we are monitoring the temperature. We have 35 million records -> 944 unique values for temperature, 422 unique locations, and of course 24 hours.

Now we make a simple report – we would like to see the median temperature per hour, regardless of date or location. The measure:

    MEASURE Senzors[Orig_Med] = MEDIAN ( Senzors[temperature] )

The following result took 71 seconds to complete on the dataset in Power BI Desktop, and took almost 8 GB of memory (see the memory profile during the DAX query). If you try to publish this report to the Power BI service, you will get an error message. I was just WOW! But what can I tune on such a simple query and such a simple measure?

Tuning 1 – Rewrite Median?

I was a bit disappointed in the median function. When we used the date for filtering, the performance of the query was OK, but on a larger dataset it did not perform at all. I know nothing about the inner implementation of the median function in DAX, but based on the memory consumption it looks as if the column is materialized and sorted in the background when searching for the median.

Here is a bit of theory about the median and a bit of fact about columnar storage, so we can discover how we can take advantage of the data combination/granularity we have in the model. Below are two median samples for a couple of numbers – when the count of the numbers is even and when it is odd. More median theory is on Wikipedia. The rules for calculating the median are the same even when numbers in the set are repeating (non-unique). Here are the steps of the potential algorithm:

- Sort existing values.
- Find the median position(s).
- Take one value, or average two, to get the median.

Let's look at this from the perspective of a column store, where we have just a couple of values with hundreds of repeats. As we know, counting is very fast in a column store, and that could be our advantage, as we have a small number of unique values repeated many times. Following is an example of data where we can visualize how we can take advantage of the fact described above:

    Temperature   Count   Cumulative Count   Cumulative Count Start
    12            500     500                1
    13            500     1000               501
    18            500     1500               1001
    20            501     2001               1501

    Total Count                2001
    Position of median Odd     1001
    Position of median Even    1001

In this case, we just need to go through 4 values and find the interval to which our median position belongs.
In the worst-case scenario, we will hit between two values, as in the following table (we changed the last count from 501 to 500):

    Temperature   Count   Cumulative Count   Cumulative Count Start
    12            500     500                1
    13            500     1000               501
    18            500     1500               1001
    20            500     2000               1501

    Total Count                2000
    Position of median Odd     1000
    Position of median Even    1001

How to implement this in DAX? The first helper measures are the count and the cumulative count for temperature:

    MEASURE Senzors[RowCount] = COUNTROWS ( Senzors )

    MEASURE Senzors[TemperatureRowCountCumul] =
    VAR _currentTemp = MAX ( Senzors[temperature] )
    RETURN
        CALCULATE (
            COUNTROWS ( Senzors ),
            Senzors[temperature] <= _currentTemp
        )

The second and third measures give us the position of the median for the given context:

    MEASURE Senzors[MedianPositionEven] =
        ROUNDUP ( ( COUNTROWS ( Senzors ) / 2 ), 0 )

    MEASURE Senzors[MedianPositionOdd] =
    VAR _cnt = COUNTROWS ( Senzors )
    RETURN
        -- this is a trick where a boolean is auto-cast to int (0 or 1)
        ROUNDUP ( ( _cnt / 2 ), 0 ) + ISEVEN ( _cnt )

The fourth measure – the calculated median – does what we described in the tables above: iterate through the temperature values, find the row(s) that contain the median positions, and average that row (or rows).

    MEASURE Senzors[Calc_Med] =
    -- get the two possible positions of the median
    VAR _mpe = [MedianPositionEven]
    VAR _mpeOdd = [MedianPositionOdd]
    -- make a temperature table in the current context with the positions where each value starts and finishes
    VAR _TempMedianTable =
        ADDCOLUMNS (
            VALUES ( Senzors[temperature] ),
            "MMIN", [TemperatureRowCountCumul] - [RowCount] + 1,
            "MMAX", [TemperatureRowCountCumul]
        )
    -- filter the table to keep only the values which contain a median position
    VAR _T_MedianVals =
        FILTER (
            _TempMedianTable,
            ( _mpe >= [MMIN] && _mpe <= [MMAX] )
                || ( _mpeOdd >= [MMIN] && _mpeOdd <= [MMAX] )
        )
    -- return the average of the filtered dataset (one or two rows)
    RETURN
        AVERAGEX ( _T_MedianVals, [temperature] )

The maximum number of rows that goes into the final average is 2. Let us see the performance of such a measure:

    Performance for Hour (24 values)   Duration (s)   Memory Consumed (GB)
    Native median function             71             8
    Custom implementation              6.3            0.2

Sounds reasonable and promising! But not so fast – when the number of values by which we group the data grows, the duration grows as well. Here are some statistics when removing hour (24 values) and bringing location (400+ values) into the table:

    Performance for Location (422 values)   Duration (s)   Memory Consumed (GB)
    Native median function                  81             8
    Custom implementation                   107            2.5

Look at the memory consumption profile of the calculated median for location: that is not so good anymore! Our custom implementation is a bit slower for location and, despite consuming a lot less memory, it will not work in the Power BI service either. This means that we have solved just a part of the puzzle – our implementation works fine only when we group by a small number of values. So, what are the remaining questions to make this report work in the Power BI service?

- How to improve the overall duration of the query?
- How to decrease memory consumption?

Tuning 2 – Reduce Memory Consumption

We start with the memory consumption part. First, we need to identify which part of the formula is eating so much memory. Actually, it is the same one that has the biggest performance impact on the query.
It is this formula for the cumulative count, which is evaluated for each row of location multiplied by each value of temperature:

    MEASURE Senzors[TemperatureRowCountCumul] =
    VAR _currentTemp = MAX ( Senzors[temperature] )
    RETURN
        CALCULATE (
            COUNTROWS ( Senzors ),
            Senzors[temperature] <= _currentTemp
        )

Is there a different way to get a cumulative count without using CALCULATE – maybe a way that is more transparent to the Power BI engine? Yes, there is! We can remodel the temperature column and define the cumulative sorted approach as a many-to-many relationship towards the sensors. Sample content of the temperature tables would look like this (I believe the picture is self-describing). As a result of this model, when you use the temperature attribute from the TemperatureMapping table, you get:

- cumulative behavior of RowCount, and
- the relation calculated in advance.

For this new model version, we define the measures as below. The RowCount measure we already have, but with temperature from the mapping table it will in fact give us the cumulative count:

    MEASURE Senzors[RowCount] = COUNTROWS ( Senzors )

We must create a new measure which gives us a normal count for the mapping table, so we can calculate the starting position of each temperature value:

    MEASURE Senzors[TemperatureMappingRowCount] =
        CALCULATE (
            [RowCount],
            FILTER (
                TemperatureMapping,
                TemperatureMapping[LowerTemperature] = TemperatureMapping[temperature]
            )
        )

The new median definition:

    MEASURE Senzors[Calc_MedTempMap] =
    VAR _mpe = [MedianPositionEven]
    VAR _mpeOdd = [MedianPositionOdd]
    VAR _TempMedianTable =
        ADDCOLUMNS (
            VALUES ( TemperatureMapping[temperature] ),
            "MMIN", [RowCount] - [TemperatureMappingRowCount] + 1,
            "MMAX", [RowCount]
        )
    VAR _T_MedianVals =
        FILTER (
            _TempMedianTable,
            ( _mpe >= [MMIN] && _mpe <= [MMAX] )
                || ( _mpeOdd >= [MMIN] && _mpeOdd <= [MMAX] )
        )
    RETURN
        AVERAGEX ( _T_MedianVals, [temperature] )

Alright, let's check the performance – the memory consumption is now just in megabytes!

    Many2Many Median       Duration (s)   Memory Consumed (GB)
    Used with hours        2.2            0.02
    Used with location     41.1           0.08

I think we can be happy about it, and the memory puzzle seems to be solved. You can download a sample PBI file (I decreased the data to only one month, but you can download the whole dataset). Below is the statistics summary so far:

    Performance for Hour (24 values)        Duration (s)   Memory Consumed (GB)
    Native median function                  71.00          8.00
    Custom implementation                   6.30           0.20
    Many2Many median                        2.20           0.02

    Performance for Location (422 values)   Duration (s)   Memory Consumed (GB)
    Native median function                  81.00          8.00
    Custom implementation                   107.00         2.50
    Many2Many median                        41.10          0.08

I'll stop this blog here, as it is too long already. Next week I'll bring the second part, on how to improve performance so the user has a better experience while using this report. The post Power BI – Hungry Median appeared first on SQLServerCentral.
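For readers more at home in T-SQL than DAX, the same grouped cumulative-count idea can be sketched against a hypothetical dbo.Senzors table. This is only an illustration of the algorithm described above, not part of the original post.

    -- Median of temperature via grouped counts and cumulative counts.
    WITH Grouped AS
    (
        SELECT temperature,
               COUNT(*) AS cnt,
               SUM(COUNT(*)) OVER (ORDER BY temperature) AS cum_cnt   -- running total
        FROM dbo.Senzors
        GROUP BY temperature
    ),
    Positions AS
    (
        SELECT CEILING(SUM(cnt) / 2.0) AS pos_lower,   -- lower middle position
               (SUM(cnt) / 2) + 1      AS pos_upper    -- upper middle position (equals pos_lower when the total is odd)
        FROM Grouped
    )
    SELECT AVG(CAST(g.temperature AS DECIMAL(10, 2))) AS median_temperature
    FROM Grouped AS g
    CROSS JOIN Positions AS p
    WHERE p.pos_lower BETWEEN g.cum_cnt - g.cnt + 1 AND g.cum_cnt
       OR p.pos_upper BETWEEN g.cum_cnt - g.cnt + 1 AND g.cum_cnt;

Only the one or two groups whose interval contains a median position survive the WHERE clause, which mirrors the FILTER over MMIN/MMAX in the DAX measures.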


Basic Cursors in T-SQL–#SQLNewBlogger from Blog Posts - SQLServerCentral

Anonymous
23 Dec 2020
4 min read
Another post for me that is simple and hopefully serves as an example for people trying to get into blogging as #SQLNewBloggers.

Cursors are not efficient and are not recommended for use in SQL Server/T-SQL. This is different from other platforms, so be sure you know how things work. There are places where cursors are useful, especially in one-off type situations. I recently had such a situation and typed "CREATE CURSOR", which resulted in an error. That isn't valid syntax, so I decided to write a quick post to remind myself what is.

The Basic Syntax

Instead of CREATE, a cursor uses DECLARE. The structure is unlike other DDL statements, which are action-type-name, as in CREATE TABLE dbo.MyTable. Instead we have this:

    DECLARE cursorname CURSOR

as in

    DECLARE myCursor CURSOR

There is more that is needed here; this is just the opening. The rest of the structure is:

    DECLARE cursorname CURSOR [options] FOR select_statement

You can see this in the docs, but essentially what we are doing is loading the result of a SELECT statement into an object that we can then process row by row. We give the object a name and structure this with DECLARE CURSOR ... FOR. I was recently working on the Advent of Code, and Day 4 asks for some processing across rows. As a result, I decided to try a cursor like this:

    DECLARE pcurs CURSOR FOR SELECT lineval FROM day4 ORDER BY linekey;

The next step is to process the data in the cursor. We do this by fetching data from the cursor as required. I'll build up the structure here, starting with some housekeeping. In order to use the cursor, we need to open it. It's good practice to then deallocate the object at the end, so let's set up this code:

    DECLARE pcurs CURSOR FOR SELECT lineval FROM day4 ORDER BY linekey;
    OPEN pcurs
    ...
    DEALLOCATE pcurs

This gets us a clean structure if the code is re-run multiple times. Now, after the cursor is open, we fetch data from it. Each column in the SELECT statement can be fetched from the cursor into a variable, so we also need to declare a variable:

    DECLARE pcurs CURSOR FOR SELECT lineval FROM day4 ORDER BY linekey;
    OPEN pcurs
    DECLARE @val varchar(1000);
    FETCH NEXT FROM pcurs INTO @val
    ...
    DEALLOCATE pcurs

Usually we want to process all rows, so we loop through them. I'll add a WHILE loop and use the @@FETCH_STATUS variable. If this is 0, there are still rows in the cursor. If I hit the end of the cursor, a -1 is returned.

    DECLARE pcurs CURSOR FOR SELECT lineval FROM day4 ORDER BY linekey;
    OPEN pcurs
    DECLARE @val varchar(1000);
    FETCH NEXT FROM pcurs INTO @val
    WHILE @@FETCH_STATUS = 0
    BEGIN
        ...
        FETCH NEXT FROM pcurs INTO @val
    END
    DEALLOCATE pcurs

Where the ellipsis is, I can do other work: process the value, change it, anything I want to do in T-SQL. I do need to remember to get the next row inside the loop. As I mentioned, cursors aren't efficient and you should avoid them, but there are times when row-by-row processing is needed, and a cursor is a good solution to understand.

SQLNewBlogger

As soon as I realized my mistake in setting up the cursor, I knew some of my knowledge had deteriorated. I decided to take a few minutes to describe cursors and document the syntax, mostly for myself. However, this is also a way to show that you know about something even if it might not be used often. You could write a post on replacing a cursor with a set-based solution, or even show where performance is poor with a cursor. The post Basic Cursors in T-SQL–#SQLNewBlogger appeared first on SQLServerCentral.
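Putting the post's snippets together, a complete runnable version of the pattern might look like the sketch below. The table and column names follow the post's Advent of Code example; note that a CLOSE before DEALLOCATE has been added to release the result set explicitly, a step the shortened snippets above omit.

    -- Full cursor pattern: declare, open, fetch in a loop, then close and deallocate.
    DECLARE @val varchar(1000);

    DECLARE pcurs CURSOR FOR
        SELECT lineval FROM day4 ORDER BY linekey;

    OPEN pcurs;
    FETCH NEXT FROM pcurs INTO @val;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- do the row-by-row work here; printing the value is just a placeholder
        PRINT @val;
        FETCH NEXT FROM pcurs INTO @val;
    END;

    CLOSE pcurs;
    DEALLOCATE pcurs;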


Merry Christmas from Blog Posts - SQLServerCentral

Anonymous
25 Dec 2020
1 min read
Christmas is this week, so not a technical post for this week. Just a simple post wishing you and your family as many blessings as possible (especially in the year 2020) and good tidings during this holiday time. I hope that 2020 wasn't too harsh on you or anybody close to you. May the holidays bring you peace and joy! Take care and wear a mask! © 2020, John Morehouse. All rights reserved. The post Merry Christmas first appeared on John Morehouse. The post Merry Christmas appeared first on SQLServerCentral.


Daily Coping 22 Dec 2020 from Blog Posts - SQLServerCentral

Pushkar Sharma
22 Dec 2020
1 min read
I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I'm adding my responses for each day here. All my coping tips are under this tag. Today's tip is to share a happy memory or inspiring thought with a loved one. I'm not sure I need to explain, but I did show my kids this one from a celebration. The post Daily Coping 22 Dec 2020 appeared first on SQLServerCentral.


End of an Era – SQL PASS and Lessons learned from Blog Posts - SQLServerCentral

Anonymous
22 Dec 2020
7 min read
Most of my blog is filled with posts related to PASS in some way – events, various volunteering opportunities, keynote blogging, this or that. With the demise of the organization, I wanted to write one final post but wondered what it could be. I could write about what I think caused it to go down, but that horse has been flogged to death and continues to be. I could write about my opinion on how the last stages were handled, but that again is similar. I finally decided I would write about the lessons I've learned in my 22-year association with them. This is necessary for me to move on, and may be worth reading for those who think similarly.

There is the common line that PASS is not the #sqlfamily, and that line is currently true. But back in those days, it was. At least it was our introduction to the community commonly known as #sqlfamily. So many lessons here are in fact lessons in dealing with and living with community issues.

Lesson #1: Networking is important. It seems odd and obvious to say, but it needs to be said. When I was new to PASS I stuck to tech sessions and headed right back to my room when I was done. I was, and I am, every bit the introverted geek who liked her own company better than anyone else's, and kept to it. That didn't get me very far. I used to frequent the Barnes and Noble behind the Washington convention center in the evenings, to get the 'people buzz' out of me – it was here that I met Andy Warren, one of my earliest mentors in the community. Andy explained to me the gains of networking and also introduced a new term to me, 'functional extrovert': that is, grow an aspect of my personality that may not be natural but is needed for functional reasons. I worked harder on networking after that, learned to introduce myself to new people, and hung out at as many parties and gatherings as I could. It paid off a lot more than tech learning did.

Lesson #2: Stay out of crowds and away from people you don't belong with. This comes closely on the lines of #1 and may even be a bit of a paradox, but it is true, especially for minorities and sensitive people. There are people we belong with and people we don't. Networking and attempting to be an extrovert does not mean you sell your self-respect and try to fit in everywhere. If people pointedly exclude you in conversations, or are disrespectful or standoffish – you don't belong there. Generally, immigrants have to try harder than others to explain themselves and fit in – so this is something that needs to be said for us. Give it a shot, and if your gut tells you you don't belong, leave.

Lesson #3: You will be judged and labelled, no matter what. I was one of those people who wanted to stay out of any kind of labelling – just to be thought of as a good person who was fair and helpful. But it wasn't as easy as I thought. Over time, factions and groups started to develop in the community. Part of it was fed by politics created by decisions PASS made – quite a lot of it was personal rivalry and jealousy between highly successful people. I formed some opinions based on the information I had (which I would learn later was incomplete and inaccurate), but my opinions cost me some relationships and gave me some labelling. Although this happened about a decade ago, the labels and sourness in some of those relationships persist. Minorities get judged and labelled a lot quicker than others in general, and I was no exception to that. Looking back, I realize that it is not possible to be a friend to everyone, no matter how hard we try.
Whatever has happened has happened; we have to learn to move on.

Lesson #4: Few people have the full story – so try to hold opinions when there is a controversy. There are backdoor conversations everywhere – but this community has a very high volume of them going on. Very few people have the complete story in the face of a controversy. But we are all human: when everyone is sharing opinions, we feel pushed to share ours too. A lot of times these can be costly in terms of relationships. I have been shocked, many times, at how poorly informed I was when I formed my opinion and later learned the truth of the whole story. I think some of this was fuelled by the highly NDA-ridden PASS culture, but I don't think PASS going away is going to change it. Cliques and backdoor conversations are going to continue to exist. It is best for us to avoid sharing any opinions unless we are completely sure we know the entire story behind anything.

Lesson #5: Volunteering comes with power struggles. I was among the naive who always thought of every fellow volunteer as just a volunteer. It is not that simple. There are hierarchies and people wanting to control each other everywhere. There are many people willing to do the grunt work and expect nothing more, but many others who want to constantly be right, push others around, and have it their way. Recognizing that such people exist and, if possible, staying out of their way is a good idea. Some people also function better if given high-level roles rather than grunt work – so recognizing a person's skills while assigning volunteer tasks is also a good idea.

Lesson #6: Pay attention to burnout. There is a line of thought that volunteers have no right to expect anything, including thanks or gratitude. As someone who did this a long time and burned out seriously, I disagree. I am not advocating selfishness or manipulative ways of volunteering, but it is important to pay attention to what we are getting out of what we are doing. Feeling thankless, and going on for a long time with an empty, meaningless feeling in our hearts, can add up to health issues, physical and mental. I believe PASS did not do enough to thank volunteers, and I have spoken up many times in this regard. I personally am not a victim of that, especially after the PASSion award. But I have felt that way before it, and I know a lot of people felt that way too. Avoid getting too deep into a potential burnout; it is hard to get out of. And express gratitude and thanks wherever and whenever possible to fellow volunteers. They deserve it and need it.

Lesson #7: There is more to it than speaking and organizing events. These are the two most known avenues for volunteering, but there are many more. Blogging on other people's events, doing podcasts, promoting diversity, contributing to open source efforts like DataSaturdays.com – all of these are volunteering efforts. Make a list and contribute wherever and whenever possible. PASS gave people like me who are not big-name speakers many of those opportunities. With it gone it may be harder, but we have to work at it.

Lesson #8: Give it time. I think some of the misunderstandings and controversies around PASS come from younger people who didn't get the gains out of it that folks like me who are older did. Part of it has to do with how dysfunctional and political the organization as well as the community got over time – but some of it has to do with the fact that building a network and a respectable name really takes time.
It takes time for people to get to know you as a person of integrity and good values, and as someone worth depending on. Give it time; don't push the river.

Last, but not least – be a person of integrity. Be someone people can depend on when they need you. Even if we are labelled or end up having wrong opinions in a controversy, our integrity can go a long way in saving our skin. Mine certainly did. Be a person of integrity, and help people. It is, quite literally, all there is. Thank you for reading, and Happy Holidays. The post End of an Era – SQL PASS and Lessons learned appeared first on SQLServerCentral.

Should There Be A Successor to PASS? from Blog Posts - SQLServerCentral

Anonymous
26 Dec 2020
4 min read
PASS was a big influence on a lot of us and did a lot of good, if never quite as much good as many of us wished. I wish PASS had survived, but it didn't, and now we're at a crossroads for what comes next. We've got short-term challenges as far as supporting events that are coming up in the next few months and getting groups moved to an alternative of some sort, but beyond that, we have to decide if we want a successor to PASS or not. I think to answer that, it depends on what we want the new org to do. What would that new mission statement be, and can all (most) of us agree on it? Even before we get into funding and a governance model, what can/could a new org do that we care about?

My short answer is that a new org should do all the stuff that doesn't make much money, if any. I think it would exist to facilitate career growth in areas not served (or served well) by for-profit businesses. I think it could be an org we see as the glue without being in control. I think it probably doesn't include a Summit-class event, because it just overshadows everything else. I think it could help facilitate regional events via grants and experienced volunteers. I think it can't be all things to all people, but it could be some thing to many people.

Back in 1998, defining the focus as SQL Server was an obvious move. Today there are still plenty of people that use SQL, but there is lots of other stuff going on, and figuring out where to draw the line is important, because that mission statement helps you evaluate everything you do or don't do. Microsoft Data Platform excludes a lot of cool stuff. Focusing on Azure tends to ignore the world of on-premises, AWS, and Google. But…it depends on what you want to accomplish, doesn't it? Is it to be a professional association? To make money? To do good at a larger scale than a single product or profession? Or to narrow the focus, perhaps on day-long events or SQL DBAs or growing speakers or whatever?

I made a small wish list (and surely I could add another 100 lines to this!):

- A real non-profit, with a sturdy and clear charter that includes a commitment to transparency, and one that owns all the intellectual property we choose to put into it (for example, SQLSaturday.com if we can get it)
- A plan for raising the minimal amount of funds needed for things like owning a domain, hosting event sites, etc., and building a multi-year reserve
- No full-time staff and limited outsourcing on a project basis, with all the day-to-day stuff automated or handled by volunteers
- Vendor agnostic, vendor independent, but one that recognizes the importance of vendors in our work and our community
- A solid way of deciding who can be a voting member (one person = one vote) and who can join the Board
- An org that we'll be proud of and hold up as a best-in-class example of how to build a technical professional association
- As few rules as possible

To answer the question I posed in the title, I haven't decided yet (though I started out two weeks ago thinking "yes"). I don't know if it's possible or practical to have a single successor org to PASS. I'm still thinking about it, and waiting to see what ideas bubble up over the next couple of months. The post Should There Be A Successor to PASS? appeared first on SQLServerCentral.


Tom Swartz: Tuning Your Postgres Database for High Write Loads from Planet PostgreSQL

Matthew Emerick
14 Oct 2020
1 min read
As a database grows and scales up from a proof of concept to a full-fledged production instance, there are always a variety of growing pains that database administrators and systems administrators will run into. Very often, the engineers on the Crunchy Data support team help support enterprise projects which start out as small proof-of-concept systems and are then promoted to large-scale production use. As these systems receive increased traffic load beyond their original proof-of-concept sizes, one issue may be observed in the Postgres logs as the following:

    LOG:  checkpoints are occurring too frequently (9 seconds apart)
    HINT: Consider increasing the configuration parameter "max_wal_size".
    LOG:  checkpoints are occurring too frequently (2 seconds apart)
    HINT: Consider increasing the configuration parameter "max_wal_size".
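The hint points at max_wal_size, which controls how much WAL can accumulate between checkpoints. As a rough sketch only (the right value depends entirely on write volume and available disk; the 8GB figure is purely illustrative), raising it might look like this:

    -- Check the current setting (the default is 1GB on recent Postgres versions).
    SHOW max_wal_size;

    -- Raise it and reload the configuration; no restart is required for this parameter.
    ALTER SYSTEM SET max_wal_size = '8GB';
    SELECT pg_reload_conf();

After the change, watch the logs to confirm the "checkpoints are occurring too frequently" messages stop, and keep checkpoint_timeout and disk space in mind when sizing.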


5 Things You Should Know About Azure SQL from Blog Posts - SQLServerCentral

Anonymous
04 Dec 2020
5 min read
Azure SQL offers up a world of benefits that can be captured by consumers if implemented correctly. It will not solve all your problems, but it can solve quite a few of them. When speaking to clients, I often run into misconceptions as to what Azure SQL can really do. Let us look at a few of these to help eliminate any confusion.

You can scale easier and faster

Let us face it, I am old. I have been around the block in the IT realm for many years. I distinctly remember the days when scaling server hardware was a multi-month process that usually meant the resulting scaled hardware was already out of date by the time the process was finished. With the introduction of cloud providers, scaling vertically or horizontally can usually be accomplished within a few clicks of the mouse. Often, once initiated, the scaling process is completed within minutes instead of months. This is multiple orders of magnitude better than having to procure hardware for such needs. The added benefit of this scaling ability is that you can then scale down when needed to help save on costs. Just like scaling up or out, this is accomplished with a few mouse clicks and a few minutes of your time.

It is not going to fix your performance issues

If you currently have performance issues with your existing infrastructure, Azure SQL is not necessarily going to solve your problem. Yes, you can hide the issue with faster and better hardware, but really the issue is still going to exist, and you need to deal with it. Furthermore, moving to Azure SQL could introduce additional issues if the underlying performance issue is not addressed beforehand. Make sure to look at your current workloads and address any performance issues you might find before migrating to the cloud. Also ensure that you understand the available service tiers offered for the Azure SQL products. By doing so, you'll help guarantee that your workloads have enough compute resources to run as optimally as possible.

You still must have a DR plan

If you have ever seen me present on Azure SQL, I'm quite certain you've heard me mention that one of the biggest mistakes you can make when moving to any cloud provider is not having a DR plan in place. There are a multitude of ways to ensure you have a proper disaster recovery strategy in place regardless of which Azure SQL product you are using. Platform as a Service (Azure SQL Database or SQL Managed Instance) offers automatic database backups, which solves one DR issue for you out of the gate. PaaS also offers geo-replication and automatic failover groups for additional disaster recovery solutions, which are easily implemented with a few clicks of the mouse. When working with SQL Server on an Azure virtual machine (which is Infrastructure as a Service), you can perform database backups through native SQL Server backups or tools like Azure Backup. Keep in mind that high availability is baked into the Azure service at every turn. However, high availability does not equal disaster recovery, and even cloud providers such as Azure do incur outages that can affect your production workloads. Make sure to implement a disaster recovery strategy and, furthermore, practice it.

It could save you money

When implemented correctly, Azure SQL could indeed save you money in the long run. However, it all depends on what your workloads and data volume look like.
For example, due to the ease of scalability Azure SQL offers (even when scaling virtual machines), secondary replicas of your data could be kept at a lower service tier to minimize costs. In the event a failover needs to occur, you could then scale the resource to a higher-performing service tier to ensure workload compute requirements are met. Azure SQL Database also offers a serverless tier that provides the ability for the database to be paused; while the database is paused, you will not be charged for any compute consumption. This is a great resource for unpredictable workloads. Saving costs in any cloud provider means knowing what options are available, as well as continually evaluating which options best suit your needs.

It is just SQL

Azure SQL is not magical, quite honestly. It really is just the same SQL engine you are used to with on-premises deployments. The real difference is how you engage with the product, and sometimes that can be scary if you are not used to it. As a self-proclaimed die-hard database administrator, it was daunting for me when I started to learn how Azure SQL would fit into modern-day workloads and potentially help save organizations money. In the end, though, it's the same product that many of us have been using for years.

Summary

In this blog post I've covered five things to know about Azure SQL. It is a powerful product that can help transform your own data ecosystem into a more capable platform to serve your customers for years to come. Cloud is definitely not a fad and is here to stay. Make sure that you expand your horizons and look upward, because that's where the market is going. If you aren't looking at Azure SQL currently, what are you waiting for? Just do it. © 2020, John Morehouse. All rights reserved. The post 5 Things You Should Know About Azure SQL first appeared on John Morehouse. The post 5 Things You Should Know About Azure SQL appeared first on SQLServerCentral.
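To make the scaling point concrete, changing the service tier of an Azure SQL Database really is just a short T-SQL statement (or a few clicks in the portal). A minimal sketch, assuming a hypothetical database named SalesDb in the DTU purchasing model:

    -- Check the current service objective.
    SELECT DATABASEPROPERTYEX('SalesDb', 'ServiceObjective') AS current_objective;

    -- Scale up to a larger service objective; the operation runs online
    -- and typically completes in minutes.
    ALTER DATABASE SalesDb MODIFY (SERVICE_OBJECTIVE = 'S3');

Scaling back down when the extra compute is no longer needed is the same statement with a smaller objective, which is how the cost savings described above are usually realized.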


Daily Coping 15 Dec 2020 from Blog Posts - SQLServerCentral

Anonymous
15 Dec 2020
2 min read
I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I'm adding my responses for each day here. All my coping tips are under this tag. Today's tip is to notice when you're hard on yourself or others and be kind instead. This is one of those skills I've worked on for years, maybe a decade. I have tried to learn how to balance acceptance with drive. I want to accept, or maybe just experience, a situation where I am not achieving or accomplishing enough. I need to do this without suppressing my drive, but rather by viewing situations more realistically. I see many driven, type-A people never willing to give up, and often chastising themselves to do more, to do better. Maybe the example that springs to mind for me is Michael Jordan. He's amazing, likely the best ever, but a jerk. Not someone I'd want to emulate. I'd take a more balanced, more polite approach instead. I'd rather be Tim Duncan, if I were a high achiever. Or maybe I'd just be happy being Luol Deng: a semi-successful player, not a huge star, but a nice guy. What I want to do is drive forward in a way that balances all parts of my life with success. With my wife's success. With the support and love I give my kids or friends. If I don't accomplish something, I try to stop and realistically examine why. It might be that I had other commitments, or no energy (which happens a lot in 2020). It might be that I chose to do something else and didn't have time. It might be because I was just being lazy or not putting in effort. The former items are places where I give myself a big break. For the latter, I try to think about how to do better, how I would do something differently in the same situation in the future. I accept what happened, I experience it, and maybe feel disappointed, but I don't chastise myself. I move forward. The post Daily Coping 15 Dec 2020 appeared first on SQLServerCentral.

Daily Coping 25 Dec 2020 from Blog Posts - SQLServerCentral

Anonymous
25 Dec 2020
1 min read
I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I’m adding my responses for each day here. All my coping tips are under this tag.  Today’s tip is to stop for a minute today and smile while you remember a happy moment in 2020. I’ve had more than my share this year, but my happy moment from 2020 that comes to mind is from February. The only airplane trip of the year for me, to celebrate a birthday. Merry Christmas. The post Daily Coping 25 Dec 2020 appeared first on SQLServerCentral.


Virtual Log Files from Blog Posts - SQLServerCentral

Anonymous
24 Nov 2020
4 min read
Today's post is a guest article from a friend of Dallas DBAs, writer, and fantastic DBA Jules Behrens (B|L). One common performance issue that is not well known, but should still be on your radar as a DBA, is a high number of VLFs. Virtual Log Files are the files SQL Server uses to do the actual work in a SQL log file (MyDatabase_log.LDF). It allocates new VLFs every time the log file grows. Perhaps you've already spotted the problem – if the log file is set to grow by a tiny increment and the file ever grows very large, you may end up with thousands of tiny little VLFs, and this can slow down your performance at the database level. Think of it like a room (the log file) filled with boxes (the VLFs). If you just have a few boxes, it is more efficient to figure out where something (a piece of data in the log file) is than if you have thousands of tiny boxes. (Analogy courtesy of @SQLDork)

It is especially evident there is an issue with VLFs when SQL Server takes a long time to recover from a restart. Other symptoms may be slowness with autogrowth, log shipping, replication, and general transactional slowness – anything that touches the log file, in other words. The best solution is prevention: set your log file to be big enough to handle its transaction load to begin with, set it to have a sensible growth rate in proportion to its size, and you'll never see this come up. But sometimes we inherit issues where best practices were not followed, and a high number of VLFs is certainly something to check when doing a health assessment on an unfamiliar environment.

The built-in DMV sys.dm_db_log_info is specifically for finding information about the log file, and the command DBCC LOGINFO (deprecated) will return a lot of useful information about VLFs as well. There is an excellent script for pulling the count of VLFs that uses DBCC LOGINFO from Kev Riley, on Microsoft TechNet: https://gallery.technet.microsoft.com/scriptcenter/SQL-Script-to-list-VLF-e6315249 There is also a great script by Steve Rezhener on SQLSolutionsGroup.com that utilizes the view: https://sqlsolutionsgroup.com/capture-sql-server-vlf-information-using-a-dmv/ Either one of these will tell you what you ultimately need to know – whether your VLFs are an issue.

How many VLFs are too many? There isn't an industry standard, but for the sake of a starting point, let's say a tiny log file has 500 VLFs. That is high. A 5GB log file with 200 VLFs, on the other hand, is perfectly acceptable. You'll likely know a VLF problem when you find it; you'll run a count on the VLFs and it will return something atrocious like 20,000. (ed – someone at Microsoft support told me about one with 1,000,000 VLFs)

If the database is in the Simple recovery model and doesn't see much traffic, this is easy enough to fix: manually shrink the log file as small as it will go, verify the autogrowth is appropriate, and grow it back to its normal size. If the database is in the Full recovery model and is in high use, it's a little more complex. Follow these steps (you may have to do it more than once):

- Take a transaction log backup.
- Issue a CHECKPOINT manually.
- Check the empty space in the transaction log to make sure you have room to shrink it.
- Shrink the log file as small as it will go.
- Grow the file back to its normal size.
- Lather, rinse, repeat as needed.

Now check your VLF counts again, and make sure you are down to a nice low number. Done! Thanks for reading! The post Virtual Log Files appeared first on DallasDBAs.com.
The post Virtual Log Files appeared first on SQLServerCentral.
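If you just want a quick number before reaching for the linked scripts, a minimal check and the shrink/regrow cycle might look like the sketch below. The database and logical log file names are placeholders, and the target sizes are purely illustrative.

    -- Count VLFs for the current database (sys.dm_db_log_info is available from SQL Server 2016 SP2 onward).
    SELECT COUNT(*) AS vlf_count
    FROM sys.dm_db_log_info(DB_ID());

    -- After backing up the log and issuing CHECKPOINT, shrink the log file and grow it back in one step.
    USE MyDatabase;
    DBCC SHRINKFILE (MyDatabase_log, 1);           -- shrink as small as it will go (target in MB)
    ALTER DATABASE MyDatabase
        MODIFY FILE (NAME = MyDatabase_log, SIZE = 8GB, FILEGROWTH = 1GB);

Growing the file back in one large step, rather than letting tiny autogrowths accumulate, is what keeps the new VLF count low.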


Goodbye PASS from Blog Posts - SQLServerCentral

Anonymous
23 Dec 2020
1 min read
“It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of light, it was the season of darkness, it was the spring of hope, it-> Continue reading Goodbye PASS The post Goodbye PASS appeared first on Born SQL. The post Goodbye PASS appeared first on SQLServerCentral.

Daily Coping 24 Dec 2020 from Blog Posts - SQLServerCentral

Anonymous
24 Dec 2020
2 min read
I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I'm adding my responses for each day here. All my coping tips are under this tag. Today's tip is to give away something you have been holding on to. I have made more donations this year than in the past. Partially, I think this is because life slowed down and I had time to clean out some spaces. However, I have more to do, and when I saw this item, I decided to do something new. I'm a big supporter of Habitat for Humanity. During my first sabbatical, I volunteered there quite a bit, and I've continued to do that periodically since. I believe shelter is an important resource most people need. I've had some tools at the house that I've held onto, thinking they would be good spares. I have a few cordless items, but I have an older miter saw and a table saw that work fine. Habitat doesn't take these, but I donated them to another local charity that can make use of them. I'm hoping someone will use them to improve their lives, either building something or maybe using them in their work. The post Daily Coping 24 Dec 2020 appeared first on SQLServerCentral.


Daily Coping 17 Dec 2020 from Blog Posts - SQLServerCentral

Anonymous
17 Dec 2020
1 min read
I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I’m adding my responses for each day here. All my coping tips are under this tag.  Today’s tip is to be generous and feed someone with food, love, or kindness today. My love language is Acts of Service, and I do this often, preparing things for the family when I can. Recently I asked my son what he’d want for dinner. He comes down once or twice a month from college for a few days, and I try to ensure he enjoys the time. His request: ramen. I put this together for him, and the family, last Friday night. The sushi I bought, because that’s something he enjoys, and I’m not nearly as good as some local chefs. The post Daily Coping 17 Dec 2020 appeared first on SQLServerCentral.