In this article by Dominik Mikiewicz, the author of the book Mastering PostGIS, we will look at exporting data from PostgreSQL/PostGIS to files and other data sources. Sharing data via the Web is no less important, but it has its own specific workflow.
There may be different reasons for having to export data from a database, but certainly sharing it with others is among the most popular ones. Backing the data up or transferring it to other software packages for further processing are other common reasons for learning the export techniques.
In this article we'll have a closer look at the following:
Exporting data with the psql \copy command (and COPY TO)
Exporting data with pgsql2shp
Exporting data with the PostGIS shapefile export GUI
Exporting data with ogr2ogr
Exporting is essentially importing done the other way round, so this article may give you a bit of a déjà vu feeling.
When we were importing data, we used the psql COPY FROM command to copy data from a file to a table. This time, we'll do it the other way round—from a table to a file—using the COPY TO command.
COPY TO can copy not only a full table but also the result of a SELECT query, which means we can output filtered subsets of the source tables.
Similarly to the method we used to import, we can execute \copy or COPY in different scenarios: we'll use psql in interactive and non-interactive mode, and we'll also do the very same thing in PgAdmin.
It is worth remembering that COPY can only read/write files that can be accessed by an instance of the server, so usually files that reside on the same machine as the database server.
For detailed information on COPY syntax and parameters, type:
\h copy
In order to export the data in interactive mode we first need to connect to the database using psql:
psql -h localhost -p 5434 -U postgres
Then type the following:
\c mastering_postgis
Once connected, we can execute a simple command:
\copy data_import.earthquakes_csv TO earthquakes.csv WITH DELIMITER ';' CSV HEADER
The preceding command exported the data_import.earthquakes_csv table to a file named earthquakes.csv, with ';' as the column separator. A header in the form of column names has also been added at the beginning of the file. The output should be similar to the following:
COPY 50
Basically, the database told us how many records have been exported. The content of the exported file should exactly resemble the content of the table we exported from:
time;latitude;longitude;depth;mag;magtype;nst;gap;dmin;rms;net;id;updated;place;type;horizontalerror;deptherror;magerror;magnst;status;locationsource;magsource
2016-10-08 14:08:08.71+02;36.3902;-96.9601;5;2.9;mb_lg;;60;0.029;0.52;us;us20007csd;2016-10-08 14:27:58.372+02;15km WNW of Pawnee, Oklahoma;earthquake;1.3;1.9;0.1;26;reviewed;us;us
As mentioned, COPY can also output the results of a SELECT query. This means we can tailor the output to very specific needs, as required. In the next example, we'll export data from a 'spatialized' earthquakes table, but the geometry will be converted to a WKT (well-known text) representation. We'll also export only a subset of columns:
\copy (select id, ST_AsText(geom) FROM data_import.earthquakes_subset_with_geom) TO earthquakes_subset.csv WITH CSV DELIMITER '|' FORCE QUOTE * HEADER
Once again, the output just specifies the amount of records exported:
COPY 50
The executed command exported only the id column and a WKT-encoded geometry column. FORCE QUOTE * wrapped the data of every column in quote symbols, with a pipe (|) symbol used as the delimiter. The file also has a header:
id|st_astext
"us20007csd"|"POINT(-96.9601 36.3902)"
"us20007csa"|"POINT(-98.7058 36.4314)"
If you're still in psql, you can execute a script by simply typing the following:
\i path/to/the/script.sql
For example:
\i code/psql_export.sql
The output will not surprise us, as it will simply state the number of records that were outputted:
COPY 50
If you happen to have already quit psql, the command-line equivalent of \i is the -f parameter, so the command should look like this:
psql -h localhost -p 5434 -U postgres -d mastering_postgis -f code/psql_export.sql
Not surprisingly, the output is once again the following:
COPY 50
In PgAdmin, the command is COPY rather than \copy. The rest of the code remains the same. Another difference is that we need to use an absolute path, whereas in psql we can use paths relative to the directory we started psql in.
So the first psql query 'translated' to the PgAdmin SQL version looks like this:
COPY data_import.earthquakes_csv TO 'F:\mastering_postgis\chapter06\earthquakes.csv' WITH DELIMITER ';' CSV HEADER
The second query looks like this:
COPY (select id, ST_AsText(geom) FROM data_import.earthquakes_subset_with_geom) TO 'F:\mastering_postgis\chapter06\earthquakes_subset.csv' WITH CSV DELIMITER '|' FORCE QUOTE * HEADER
Both produce a similar output, but this time it is logged in the 'Messages' tab of PgAdmin's query output pane:
Query returned successfully: 50 rows affected, 55 msec execution time.
It is worth remembering that COPY is executed as part of an SQL command, so it is effectively the DB server that tries to write to a file. Therefore, it may be the case that the server is not able to access a specified directory. If your DB server is on the same machine as the directory that you are trying to write to, relaxing directory access permissions should help.
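If granting the server such access is not feasible, COPY ... TO STDOUT offers a handy workaround: the server streams the data to the psql client, which then redirects it to any file the client machine can write to. A minimal sketch, reusing the connection parameters from earlier in this article:
psql -h localhost -p 5434 -U postgres -d mastering_postgis -c "COPY data_import.earthquakes_csv TO STDOUT WITH DELIMITER ';' CSV HEADER" > earthquakes.csv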
pgsql2shp is a command-line tool that outputs PostGIS data to shapefiles. Similarly to COPY TO, it can export either a full table or the result of a query, which gives us flexibility when we need only a subset of the data and do not want to modify the source tables or create temporary, intermediate ones.
In order to get some help with the tool just type the following in the console:
pgsql2shp
The general syntax for the tool is as follows:
pgsql2shp [<options>] <database> [<schema>.]<table>
pgsql2shp [<options>] <database> <query>
A shapefile is a format made up of several files; the minimum set is SHP, SHX, and DBF. If PostGIS is able to determine the projection of the data, it will also export a PRJ file containing the SRS information, which should be understood by any software able to consume shapefiles.
If a table does not have a geometry column, then only a DBF file that is the equivalent of the table data will be exported.
Let's export a full table first:
pgsql2shp -h localhost -p 5434 -u postgres -f full_earthquakes_dataset mastering_postgis data_import.earthquakes_subset_with_geom
The following output should be expected:
Initializing...
Done (postgis major version: 2).
Output shape: Point
Dumping: X [50 rows]
Now let's do the same, but this time with the result of a query:
pgsql2shp -h localhost -p 5434 -u postgres -f full_earthquakes_dataset mastering_postgis "select * from data_import.earthquakes_subset_with_geom limit 1"
To avoid being prompted for a password, try providing it within the command via the -P switch.
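For example, assuming the password is simply postgres (substitute your own), the full-table export becomes:
pgsql2shp -h localhost -p 5434 -u postgres -P postgres -f full_earthquakes_dataset mastering_postgis data_import.earthquakes_subset_with_geom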
The output will be very similar to what we have already seen:
Initializing...
Done (postgis major version: 2).
Output shape: Point
Dumping: X [1 rows]
In the data we previously imported, we do not have examples that would manifest shapefile limitations. It is worth knowing about them, though. You will find a decent description at https://en.wikipedia.org/wiki/Shapefile#Limitations. The most important ones are as follows:
Attribute (column) names are limited to 10 characters
Character attribute values are limited to 255 characters
A single shapefile can store only one geometry type
The maximum size of a single file is 2 GB
In order to demonstrate the preceding limitations, let's quickly create a test PostGIS dataset:
First, create the export schema if it does not exist:
CREATE SCHEMA IF NOT EXISTS data_export;
CREATE TABLE IF NOT EXISTS data_export.bad_bad_shp (
id character varying,
"time" timestamp with time zone,
depth numeric,
mag numeric,
very_very_very_long_column_that_holds_magtype character varying,
very_very_very_long_column_that_holds_place character varying,
geom geometry);
INSERT INTO data_export.bad_bad_shp select * from data_import.earthquakes_subset_with_geom limit 1;
UPDATE data_export.bad_bad_shp
SET very_very_very_long_column_that_holds_magtype = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce id mauris eget arcu imperdiet tristique eu sed est. Quisque suscipit risus eu ante vestibulum hendrerit ut sed nulla. Nulla sit amet turpis ipsum. Curabitur nisi ante, luctus nec dignissim ut, imperdiet id tortor. In egestas, tortor ac condimentum sollicitudin, nisi lacus porttitor nibh, a tempus ex tellus in ligula. Donec pharetra laoreet finibus. Donec semper aliquet fringilla. Etiam faucibus felis ac neque facilisis vestibulum. Vivamus scelerisque at neque vel tincidunt. Phasellus gravida, ipsum vulputate dignissim laoreet, augue lacus congue diam, at tempus augue dolor vitae elit.';
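Before exporting, we can verify that the value we have just planted really exceeds the DBF limit; a quick length check (the exact number does not matter, as long as it is well over 255) could look like this:
SELECT length(very_very_very_long_column_that_holds_magtype) AS magtype_len FROM data_export.bad_bad_shp;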
Having prepared a misbehaving dataset, let's now export it to SHP to see if our SHP warnings were right:
pgsql2shp -h localhost -p 5434 -u postgres -f bad_bad_shp mastering_postgis data_export.bad_bad_shp
When you now open the exported shapefile in a GIS client of your choice, you will see our very, very long column names renamed to VERY_VERY_ and VERY_VE_01. The content of the very_very_very_long_column_that_holds_magtype field has also been truncated to 255 characters, and is now 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce id mauris eget arcu imperdiet tristique eu sed est. Quisque suscipit risus eu ante vestibulum hendrerit ut sed nulla. Nulla sit amet turpis ipsum. Curabitur nisi ante, luctus nec dignissim ut'.
For the sake of completeness, we'll also export a table without geometry so we are certain that pgsql2shp exports only a DBF file:
pgsql2shp -h localhost -p 5434 -u postgres -f a_lonely_dbf mastering_postgis "select id, place from data_import.earthquakes_subset_with_geom limit 1"
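If you have GDAL installed, a quick way to confirm that a standalone DBF was produced is to inspect it with ogrinfo, which opens lone DBF files through the ESRI Shapefile driver:
ogrinfo -al a_lonely_dbf.dbf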
We have already seen PgAdmin's GUI for importing shapefiles. As you surely remember, the shp2pgsql GUI also has an Export tab.
If you happen to encounter difficulties locating the shapefile GUI in pgAdmin 4, try calling it from the shell/command line by executing shp2pgsql-gui. If for some reason it is not recognized, try to locate the utility in your DB directory under bin/postgisgui/shp2pgsql-gui.exe.
In order to export a shapefile from PostGIS, go to Plugins | PostGIS Shapefile and DBF Loader 2.2 (the version may vary), and then switch to the Export tab.
It is worth mentioning that you have some options to choose from when exporting; they are rather self-explanatory.
When you press the Export button, you can choose the output destination. The log is displayed in the Log Window area of the exporter GUI.
We have already seen a little preview of ogr2ogr exporting data when we made sure that our KML import had actually brought in the proper data. This time we'll expand on the subject a bit and also export a few more formats to give you an idea of what a capable tool ogr2ogr is.
In order to get some information on the tool, simply type the following in the console:
ogr2ogr
Alternatively, if you would like to get some more descriptive info, visit http://www.gdal.org/ogr2ogr.html.
You could also type the following:
ogr2ogr --long-usage
The nice thing about ogr2ogr is that the tool is very flexible and offers options that let us export exactly what we are after. You can specify which columns you would like to export with the -select parameter. The -where parameter lets you filter the dataset when you want to output only a subset of the data. Should you require more sophisticated output preparation logic, you can use the -sql parameter. A sketch combining the first two follows below.
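A minimal sketch combining -select and -where, reusing this article's connection parameters (the magnitude filter is purely illustrative):
ogr2ogr -f "ESRI Shapefile" strong_earthquakes PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom -select id,mag -where "mag > 3"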
This is obviously not all there is. The usual gdal/ogr2ogr parameters are available too. You can reproject the data on the fly using the -t_srs parameter, and if, for some reason, the SRS of your data has not been clearly defined, you can use -s_srs to instruct ogr2ogr what the source coordinate system is for the dataset being processed.
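For example, reprojecting our earthquakes to Web Mercator while exporting them to GeoJSON (the output file name is arbitrary) could look like this:
ogr2ogr -f GeoJSON earthquakes_3857.geojson -t_srs EPSG:3857 PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom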
There are obviously advanced options too. Should you wish to clip your dataset to a specified bounding box, polygon, or clip layer, have a look at the -clipsrc and -clipdst parameters and their variations.
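As a sketch, clipping the export to a bounding box given as xmin ymin xmax ymax (the coordinates here are purely illustrative):
ogr2ogr -f "ESRI Shapefile" clipped_earthquakes -clipsrc -100 30 -90 40 PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom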
The last important parameter to know is -dsco (dataset creation options). It accepts values in the form NAME=VALUE; when you want to pass more than one option this way, simply repeat the parameter. The available dataset creation options depend on the format used, so it is advisable to consult the appropriate format information pages available via the GDAL/ogr2ogr website.
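As an example, the KML driver accepts a NameField dataset creation option that picks the attribute used for the KML <name> element; assuming our earthquake ids should become the placemark names:
ogr2ogr -f KML earthquakes.kml -dsco NameField=id PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom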
There are many ways of getting the data out of a database. Some are PostgreSQL specific, some are PostGIS specific. The point is that you can use and mix any tools you prefer. There will be scenarios where simple data extraction procedures will do just fine; some other cases will require a more specialized setup, SQL or psql, or even writing custom code in external languages. I do hope this article gives you a toolset you can use with confidence in your daily activities.