In this article by Dominik Mikiewicz, the author of the book Mastering PostGIS, we will look at exporting data from PostgreSQL/PostGIS to files and other data sources. Sharing data via the Web is no less important, but it has its own specific workflow.
There may be different reasons for having to export data from a database, but certainly sharing it with others is among the most popular ones. Backing the data up or transferring it to other software packages for further processing are other common reasons for learning the export techniques.
In this article we'll have a closer look at the following:
Exporting data with the psql \copy command (and COPY TO)
Exporting data with pgsql2shp
Exporting data with the PostGIS shapefile export GUI
Exporting data with ogr2ogr
Exporting is essentially importing done the other way round, so this article may give you a bit of a déjà vu feeling.
When we were importing data, we used the psql COPY FROM command to copy data from a file to a table. This time, we'll do it the other way round—from a table to a file—using the COPY TO command.
COPY TO can copy not only a full table but also the result of a SELECT query, which means we can output filtered subsets of the source tables.
Similarly to the method we used to import, we can execute \copy or COPY in different scenarios: we'll use psql in interactive and non-interactive mode, and we'll also do the very same thing in PgAdmin.
It is worth remembering that COPY can only read/write files that can be accessed by an instance of the server, so usually files that reside on the same machine as the database server.
For detailed information on COPY syntax and parameters, type:
\h copy
In order to export the data in interactive mode we first need to connect to the database using psql:
psql -h localhost -p 5434 -U postgres
Then type the following:
\c mastering_postgis
Once connected, we can execute a simple command:
\copy data_import.earthquakes_csv TO earthquakes.csv WITH DELIMITER ';' CSV HEADER
The preceding command exported the data_import.earthquakes_csv table to a file named earthquakes.csv, with ';' as the column separator. A header in the form of column names has also been added at the beginning of the file. The output should be similar to the following:
COPY 50
Basically, the database told us how many records have been exported. The content of the exported file should exactly resemble the content of the table we exported from:
time;latitude;longitude;depth;mag;magtype;nst;gap;dmin;rms;net;id;updated;place;type;horizontalerror;deptherror;magerror;magnst;status;locationsource;magsource
2016-10-08 14:08:08.71+02;36.3902;-96.9601;5;2.9;mb_lg;;60;0.029;0.52;us;us20007csd;2016-10-08 14:27:58.372+02;15km WNW of Pawnee, Oklahoma;earthquake;1.3;1.9;0.1;26;reviewed;us;us
As mentioned, COPY can also output the results of a SELECT query. This means we can tailor the output to very specific needs, as required. In the next example, we'll export data from a 'spatialized' earthquakes table, but the geometry will be converted to a WKT (well-known text) representation. We'll also export only a subset of columns:
\copy (select id, ST_AsText(geom) FROM data_import.earthquakes_subset_with_geom) TO earthquakes_subset.csv WITH CSV DELIMITER '|' FORCE QUOTE * HEADER
Once again, the output just specifies the amount of records exported:
COPY 50
The executed command exported only the id column and a WKT-encoded geometry column. FORCE QUOTE * wrapped the data of every column in quote symbols, with a pipe (|) symbol used as the delimiter. The file also has a header:
id|st_astext
"us20007csd"|"POINT(-96.9601 36.3902)"
"us20007csa"|"POINT(-98.7058 36.4314)"
If you're still in psql, you can execute a script by simply typing the following:
\i path/to/the/script.sql
For example:
\i code/psql_export.sql
The output will not surprise us, as it will simply state the number of records that were outputted:
COPY 50
If you happen to have already quit psql, the command-line equivalent of \i is the -f parameter, so the command should look like this:
psql -h localhost -p 5434 -U postgres -d mastering_postgis -f code/psql_export.sql
Not surprisingly, the output is once again the following:
COPY 50
In PgAdmin, the command is COPY rather than \copy. The rest of the code remains the same. Another difference is that we need to use an absolute path, whereas in psql we can use paths relative to the directory we started psql in.
So the first psql query 'translated' to the PgAdmin SQL version looks like this:
COPY data_import.earthquakes_csv TO 'F:\mastering_postgis\chapter06\earthquakes.csv' WITH DELIMITER ';' CSV HEADER
The second query looks like this:
COPY (select id, ST_AsText(geom) FROM data_import.earthquakes_subset_with_geom) TO 'F:\mastering_postgis\chapter06\earthquakes_subset.csv' WITH CSV DELIMITER '|' FORCE QUOTE * HEADER
Both produce a similar output, but this time it is logged in the 'Messages' tab of PgAdmin's query output pane:
Query returned successfully: 50 rows affected, 55 msec execution time.
It is worth remembering that COPY is executed as part of an SQL command, so it is effectively the DB server that tries to write to a file. Therefore, it may be the case that the server is not able to access a specified directory. If your DB server is on the same machine as the directory that you are trying to write to, relaxing directory access permissions should help.
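If granting the server such access is not feasible, COPY ... TO STDOUT offers a handy workaround: the server streams the data to the psql client, which then redirects it to any file the client machine can write to. A minimal sketch, reusing the connection parameters from earlier in this article:
psql -h localhost -p 5434 -U postgres -d mastering_postgis -c "COPY data_import.earthquakes_csv TO STDOUT WITH DELIMITER ';' CSV HEADER" > earthquakes.csv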
pgsql2shp is a command-line tool that outputs PostGIS data to shapefiles. Similarly to COPY TO, it can export either a full table or the result of a query, which gives us flexibility when we need only a subset of the data and do not want to modify the source tables or create temporary, intermediate ones.
In order to get some help with the tool just type the following in the console:
pgsql2shp
The general syntax for the tool is as follows:
pgsql2shp [<options>] <database> [<schema>.]<table>
pgsql2shp [<options>] <database> <query>
A shapefile is a format made up of several files; the minimum set is SHP, SHX, and DBF. If PostGIS is able to determine the projection of the data, it will also export a PRJ file containing the SRS information, which should be understood by any software able to consume shapefiles.
If a table does not have a geometry column, then only a DBF file that is the equivalent of the table data will be exported.
Let's export a full table first:
pgsql2shp -h localhost -p 5434 -u postgres -f full_earthquakes_dataset mastering_postgis data_import.earthquakes_subset_with_geom
The following output should be expected:
Initializing...
Done (postgis major version: 2).
Output shape: Point
Dumping: X [50 rows]
Now let's do the same, but this time with the result of a query:
pgsql2shp -h localhost -p 5434 -u postgres -f full_earthquakes_dataset mastering_postgis "select * from data_import.earthquakes_subset_with_geom limit 1"
To avoid being prompted for a password, try providing it within the command via the -P switch.
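For example, assuming the password is simply postgres (substitute your own), the full-table export becomes:
pgsql2shp -h localhost -p 5434 -u postgres -P postgres -f full_earthquakes_dataset mastering_postgis data_import.earthquakes_subset_with_geom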
The output will be very similar to what we have already seen:
Initializing...
Done (postgis major version: 2).
Output shape: Point
Dumping: X [1 rows]
In the data we previously imported, we do not have examples that would manifest shapefile limitations. It is worth knowing about them, though. You will find a decent description at https://en.wikipedia.org/wiki/Shapefile#Limitations. The most important ones are as follows:
Attribute (column) names are limited to 10 characters
Character attribute values are limited to 255 characters
A single shapefile can store only one geometry type
The maximum size of a single file is 2 GB
In order to demonstrate the preceding limitations, let's quickly create a test PostGIS dataset:
First, create the export schema if it does not exist:
CREATE SCHEMA IF NOT EXISTS data_export;
CREATE TABLE IF NOT EXISTS data_export.bad_bad_shp (
id character varying,
"time" timestamp with time zone,
depth numeric,
mag numeric,
very_very_very_long_column_that_holds_magtype character varying,
very_very_very_long_column_that_holds_place character varying,
geom geometry);
INSERT INTO data_export.bad_bad_shp select * from data_import.earthquakes_subset_with_geom limit 1;
UPDATE data_export.bad_bad_shp
SET very_very_very_long_column_that_holds_magtype = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce id mauris eget arcu imperdiet tristique eu sed est. Quisque suscipit risus eu ante vestibulum hendrerit ut sed nulla. Nulla sit amet turpis ipsum. Curabitur nisi ante, luctus nec dignissim ut, imperdiet id tortor. In egestas, tortor ac condimentum sollicitudin, nisi lacus porttitor nibh, a tempus ex tellus in ligula. Donec pharetra laoreet finibus. Donec semper aliquet fringilla. Etiam faucibus felis ac neque facilisis vestibulum. Vivamus scelerisque at neque vel tincidunt. Phasellus gravida, ipsum vulputate dignissim laoreet, augue lacus congue diam, at tempus augue dolor vitae elit.';
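Before exporting, we can verify that the value we have just planted really exceeds the DBF limit; a quick length check (the exact number does not matter, as long as it is well over 255) could look like this:
SELECT length(very_very_very_long_column_that_holds_magtype) AS magtype_len FROM data_export.bad_bad_shp;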
Having prepared a misbehaving dataset, let's now export it to SHP to see if our SHP warnings were right:
pgsql2shp -h localhost -p 5434 -u postgres -f bad_bad_shp mastering_postgis data_export.bad_bad_shp
When you now open the exported shapefile in a GIS client of your choice, you will see our very, very long column names renamed to VERY_VERY_ and VERY_VE_01. The content of the very_very_very_long_column_that_holds_magtype field has also been truncated to 255 characters, and is now 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce id mauris eget arcu imperdiet tristique eu sed est. Quisque suscipit risus eu ante vestibulum hendrerit ut sed nulla. Nulla sit amet turpis ipsum. Curabitur nisi ante, luctus nec dignissim ut'.
For the sake of completeness, we'll also export a table without geometry so we are certain that pgsql2shp exports only a DBF file:
pgsql2shp -h localhost -p 5434 -u postgres -f a_lonely_dbf mastering_postgis "select id, place from data_import.earthquakes_subset_with_geom limit 1"
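If you have GDAL installed, a quick way to confirm that a standalone DBF was produced is to inspect it with ogrinfo, which opens lone DBF files through the ESRI Shapefile driver:
ogrinfo -al a_lonely_dbf.dbf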
We have already seen PgAdmin's GUI for importing shapefiles. As you surely remember, the shp2pgsql GUI also has an Export tab.
If you happen to encounter difficulties locating the shapefile GUI in pgAdmin 4, try calling it from the shell/command line by executing shp2pgsql-gui. If for some reason it is not recognized, try to locate the utility in your DB directory under bin/postgisgui/shp2pgsql-gui.exe.
In order to export a shapefile from PostGIS, go to Plugins | PostGIS Shapefile and DBF Loader 2.2 (the version may vary), and then switch to the Export tab.
It is worth mentioning that you have some options to choose from when exporting; they are rather self-explanatory.
When you press the Export button, you can choose the output destination. The log is displayed in the Log Window area of the exporter GUI.
We have already seen a little preview of ogr2ogr exporting data when we made sure that our KML import had actually brought in the proper data. This time we'll expand on the subject a bit and also export a few more formats to give you an idea of what a capable tool ogr2ogr is.
In order to get some information on the tool, simply type the following in the console:
ogr2ogr
Alternatively, if you would like to get some more descriptive info, visit http://www.gdal.org/ogr2ogr.html.
You could also type the following:
ogr2ogr --long-usage
The nice thing about ogr2ogr is that the tool is very flexible and offers options that let us export exactly what we are after. You can specify which columns you would like to export with the -select parameter. The -where parameter lets you filter the dataset when you want to output only a subset of the data. Should you require more sophisticated output preparation logic, you can use the -sql parameter. A sketch combining the first two follows below.
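A minimal sketch combining -select and -where, reusing this article's connection parameters (the magnitude filter is purely illustrative):
ogr2ogr -f "ESRI Shapefile" strong_earthquakes PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom -select id,mag -where "mag > 3"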
This is obviously not all there is. The usual gdal/ogr2ogr parameters are available too. You can reproject the data on the fly using the -t_srs parameter, and if, for some reason, the SRS of your data has not been clearly defined, you can use -s_srs to instruct ogr2ogr what the source coordinate system is for the dataset being processed.
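For example, reprojecting our earthquakes to Web Mercator while exporting them to GeoJSON (the output file name is arbitrary) could look like this:
ogr2ogr -f GeoJSON earthquakes_3857.geojson -t_srs EPSG:3857 PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom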
There are obviously advanced options too. Should you wish to clip your dataset to a specified bounding box, polygon, or clip layer, have a look at the -clipsrc and -clipdst parameters and their variations.
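As a sketch, clipping the export to a bounding box given as xmin ymin xmax ymax (the coordinates here are purely illustrative):
ogr2ogr -f "ESRI Shapefile" clipped_earthquakes -clipsrc -100 30 -90 40 PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom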
The last important parameter to know is -dsco (dataset creation options). It accepts values in the form NAME=VALUE; when you want to pass more than one option this way, simply repeat the parameter. The available dataset creation options depend on the format used, so it is advisable to consult the appropriate format information pages available via the GDAL/ogr2ogr website.
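As an example, the KML driver accepts a NameField dataset creation option that picks the attribute used for the KML <name> element; assuming our earthquake ids should become the placemark names:
ogr2ogr -f KML earthquakes.kml -dsco NameField=id PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" data_import.earthquakes_subset_with_geom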
There are many ways of getting the data out of a database. Some are PostgreSQL specific, some are PostGIS specific. The point is that you can use and mix any tools you prefer. There will be scenarios where simple data extraction procedures will do just fine; some other cases will require a more specialized setup, SQL or psql, or even writing custom code in external languages. I do hope this article gives you a toolset you can use with confidence in your daily activities.