Randomly sampling data
DBAs may be asked to set up a test server and populate it with test data. Often, that server will be old hardware, possibly with smaller disk sizes. So, the subject of data sampling raises its head.
The purpose of sampling is to reduce the size of the dataset and improve the speed of later analysis. Some statisticians are so used to the idea of sampling that they may not even question whether its use is valid, or whether it might cause further complications.
The SQL standard way to perform sampling is by adding the TABLESAMPLE clause to the SELECT statement. This is only available in PostgreSQL 9.5 and later, so we also describe an alternate version of this recipe that doesn't use TABLESAMPLE.
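As a quick sketch of the syntax, the following queries each return an approximately 1% sample of a table (the table name mytable is illustrative, not part of this recipe):

```sql
-- Approximate 1% row-level sample: BERNOULLI inspects every row
-- individually, giving a more uniform sample at the cost of a full scan.
SELECT *
FROM mytable TABLESAMPLE BERNOULLI (1);

-- SYSTEM samples whole disk pages instead, which is faster but clumpier.
-- REPEATABLE fixes the random seed, so the same sample comes back on
-- every run -- useful when building a reproducible test dataset.
SELECT *
FROM mytable TABLESAMPLE SYSTEM (1) REPEATABLE (20);
```

On releases before 9.5, one common workaround (not necessarily the alternate method this recipe goes on to describe) is to filter with the random() function, for example WHERE random() < 0.01, though that always scans the entire table.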
How to do it...
In this section, we will take a random sample of a given collection of data (for example, a given table). First, you should realize that there isn't a simple tool to slice off a sample of your database. It would be neat if there were, but there isn't. You'll need to read all...