





















































In this article by Angel Marquez author of the book PostGIS Essentials see how to insert GIS objects. Now is the time to fill our tables with data. It's very important to understand some of the theoretical concepts about spatial data before we can properly work with it. We will cover this concept through the real estate company example, used previously.
Basically, we will insert two kinds of data: firstly, all the data that belongs to our own scope of interest. By this, I mean the spatial data that was generated by us (the positions of properties in the case of the example of the real estate company) for our specific problem, so as to save this data in a way that can be easily exploited. Secondly, we will import data of a more general use, which was provided by a third party.
Another important feature that we will cover in this article are the spatial data files that we could use to share, import, and export spatial data within a standardized and popular format called shp or Shape files. In this article, we will cover the following topics:
(For more resources related to this topic, see here.)
Developing an insertion query is a very common task for someone who works with databases. Basically, we follow the SQL language syntax of the insertion, by first listing all the fields involved and then listing all the data that will be saved in each one:
INSERT INTO tbl_properties( id, town, postal_code, street, "number) VALUES (1, 'London', 'N7 6PA', 'Holloway Road', 32);
If the field is of a numerical value, we simply write the number; if it's a string-like data type, we have to enclose the text in two single quotes.
Now, if we wish to include a spatial value in the insertion query, we must first find a way to represent this value. This is where the Well-Known Text (WKT) notation enters. WKT is a notation that represents a geometry object that can be easily read by humans; following is an example of this:
POINT(-0.116190 51.556173)
Here, we defined a geographic point by using a list of two real values, the latitude (y-axis) and the longitude (x-axis). Additionally, if we need to specify the elevation of some point, we will have to specify a third value for the z-axis; this value will be defined in meters by default, as shown in the following code snippet:
POINT(-0.116190 51.556173 100)
Some of the other basic geometry types defined by the WKT notation are:
So, as an example, an SQL insertion query to add the first row to the table, tbl_properties, of our real estate database using the WKT notation, should be as follows:
INSERT INTO tbl_properties (id, town, postal_code, street, "number", the_geom)
VALUES (1, 'London', 'N7 6PA', 'Holloway Road', 32, ST_
GeomFromText('POINT(-0.116190 51.556173)'));
The special function provided by PostGIS, ST_GeomFromText, parses the text given as a parameter and converts it into a GIS object that can be inserted in the_geom field.
Now, we could think this is everything and, therefore, start to develop all the insertion queries that we need. It could be true if we just want to work with the data generated by us and there isn't a need to share this information with other entities. However, if we want to have a better understanding of GIS (believe me, it could help you a lot and prevent a lot of unnecessary headache when working with data from several sources), it would be better to specify another piece of information as part of our GIS object representation to establish its Spatial Reference System (SRS). In the next section, we will explain this concept.
We could think about Earth as a perfect sphere that will float forever in space and never change its shape, but it is not. Earth is alive and in a state of constant change, and it's certainly not a perfect circle; it is more like an ellipse (though not a perfect ellipse) with a lot of small variations, which have taken place over the years.
If we want to represent a specific position inside this irregular shape called Earth, we must first make some abstractions:
There are a lot of ways to make a projection; some of them are more precise than others. This depends on the usefulness that we want to give to the data, and the kind of projection that we choose.
The SRS defines which projection will be used and the transformation that will be used to translate a position from a given projection to another. This leads us to another important point. Maybe it has occurred to you that a geographic position was unique, but it is not. By this, I mean that there could be two different positions with the same latitude and longitude values but be in different physical places on Earth. For a position to be unique, it is necessary to specify the SRS that was used to obtain this position.
To explain this concept, let's consider Earth as a perfect sphere; how can you represent it as a two-dimensional square? Well, to do this, you will have to make a projection, as shown in the following figure:
A projection implies that you will have to make a spherical 3D image fit into a 2D figure, as shown in the preceding image; there are several ways to achieve this. We applied an azimuthal projection, which is a result of projecting a spherical surface onto a plane. However, as I told you earlier, there are several other ways to do this, as we can see in the following image:
These are examples of cylindrical and conical projections. Each one produces a different kind of 2D image of the terrain. Each has its own peculiarities and is used for several distinct purposes. If we put all the resultant images of these projections one above the other, we must get an image similar to the following figure:
As you can see, the terrain positions, which are not necessary, are the same between two projections, so you must clearly specify which projection you are using in your project in order to avoid possible mistakes and errors when you establish a position.
There are a lot of SRS defined around the world. They could be grouped by their reach, that is, they could be local (state or province), national (an entire country), regional (several countries from the same area), or global (worldwide). The International Association of Oil and Gas Producers has defined a collection of Coordinate Reference System (CRS) known as the European Petroleum Survey Group (EPSG) dataset and has assigned a unique ID to each of these SRSs; this ID is called SRID.
For uniquely defining a position, you must establish the SRS that it belongs to, using its particular ID; this is the SRID. There are literally hundreds of SRSs defined; to avoid any possible error, we must standardize which SRS we will use. A very common SRS, widely used around the globe is the WGS84 SRS with the SRID 4326. It is very important that you store the spatial data on your database, using EPSG: 4326 as much as possible, or almost use one equal projection on your database; this way you will avoid problems when you analyze your data.
The WKT notation doesn't support the SRID specification as part of the text, since this was developed at the EWKT notation that allows us to include this information as part of our input string, as we will see in the following example:
'SRID=4326;POINT(51.556173 -0.116190)'
When you create a spatial field, you must specify the SRID that will be used.
The matter that was discussed in the previous section is very important to develop our spatial tables. Taking into account the SRS that they will use from the beginning, we will follow a procedure to recreate our tables by adding this feature. This procedure must be applied to all the tables that we have created on both databases. Perform the following steps:
SELECT DropGeometryColumn('tbl_properties', 'the_geom') Add the spatial field using this command: SELECT AddGeometryColumn('tbl_properties', 'the_geom', 4326, 'POINT', 2);
Repeat these steps for the rest of the spatial tables.
INSERT INTO tbl_properties ( id, town, postal_code, street, "number", the_geom)
VALUES (1, 'London', 'N7 6PA', 'Holloway Road', 32, ST_
GeomFromEWKT('SRID=4326;POINT(51.556173 -0.116190)'));
The ST_GeomFromEWKT function works exactly as ST_GeomFromText, but it implements the extended functionality of the WKT notation. Now that you know how to represent a GIS object as text, it is up to you to choose the most convenient way to generate a SQL script that inserts existing data into the spatial data tables. As an example, you could develop a macro in Excel, a desktop application in C#, a PHP script on your server, and so on.
In this section, we will learn how to obtain data from third-party sources. Most often, this data interchange is achieved through a spatial data file. There are many data formats for this file (such as KML, geoJSON, and so on). We will choose to work with the *.shp files, because they are widely used and supported in practically all the GIS tools available in the market.
There are dozens of sites where you could get useful spatial data from practically any city, state, or country in the world. Much of this data is public and freely available. In this case, we will use data from a fabulous website called http://www.openstreetmap.org/.
The following is a series of steps that you could follow if you want to obtain spatial data from this particular provider. I'm pretty sure you can easily adapt this procedure to obtain data from another provider on the Internet. Using the example of the real estate company, we will get data from the English county of Buckinghamshire. The idea is that you, as a member of the IT department, import data from the cities where the company has activities:
At this point, we have just obtained the data (several shp files). The next procedure will show us how to convert this file into SQL insertion scripts.
In the unzipped folder are shp files; each of them stores a particular feature of the geography of this county. We will focus on the shp named buildings.shp.
Now, we will extract this data stored in the shp file. We will convert this data to a sql script so that we can insert its data into the tbl_buildings table. For this, we will use a Postgis tool called shp2pgSQL. Perform the following steps for extracting spatial data from an shp file:
shp2pgsql -g the_geom buildings.shp tbl_buildings > buildings.sql
CREATE TABLE "tbl_buildings"(gid serial, "osm_id" varchar(20), "name" varchar(50), "type"
varchar(20), "timestamp" varchar (30) ); ALTER TABLE "tbl_buildings" ADD PRIMARY KEY (gid); SELECT AddGeometryColumn('','tbl_buildings','geom','0','MULTIPOLYGON',2);
There is another way to import an shp file into our table; we could use a graphical tool called postgisgui for this. To use this tool, perform the following steps:
This is because the structure is not the same as shown in the shp and in our table. There is no way to indicate to the tool which field we don't want to import. So, the only way for us to solve this problem is let the tool create a new table and after this, change the structure. This can be done by following these steps:
In this article, you learned some basic concepts of GIS (such as WKT, EWKT, and SRS), which are fundamental for working with the GIS data. Now, you are able to craft your own spatial insertion queries or import this data into your own data tables.
Further resources on this subject: