Tag Archives: API

Using shapelib to Read Maps

I’ve just had my first fun in drawing maps based on data that is available from various public-domain government sources.  Most of the geography of the US (and the UK) is available in a shapefile format.  For example, you can download the boundaries of all the states of the US.

There are lots of tools to read shapefiles, depending on which language you are writing in – some are very simple, such as the libraries available for JavaScript.  My code is written in C++ so I chose the shapelib library, as it is the lightest-weight way to open them.

Opening Shape Files

The call to open a shape file is:

SHPOpen (filePath, permissions)

The first thing to note is that when you open a shapefile you do not specify the file you want to open, but instead you specify the directory that contains the .shp file, .prj file, and other supporting files.  In other words, if you want to open a shapefile called ne_50m_land.shp, you specify the enclosing directory as the path, such as:

SHPOpen("/path/to/directory/ne_50m_land", permissions);

The second thing to note is that the shape library is picky about the format of the permission strings.  If you want to open with read-write access, you must pass in “r+b”.  It does not accept “rb+” which can be an acceptable parameter to fopen().

 API Notes and Comments

After you open the shape file, the next thing you do is call SHPGetInfo() to fill in the information about your shape.

void SHPGetInfo( SHPHandle hSHP, int * pnEntities, int * pnShapeType,
                 double * padfMinBound, double * padfMaxBound );

The parameters are documented here.  One key point that it is easy to miss if you are not reading the documentation carefully is that padfMinBound and padfMaxBound are both pointers to four-entry arrays that are filled in with the minimum and maximum bounding values in each of four dimensions — x, y, z, and m.

Your next step is to check what type of shape this is (SHPT_POINT, SHPT_ARC, SHPT_POLYGON, or SHPT_MULTIPOINT), and then iterate over the number of entities that was returned in the pnEntities parameter.  For each entity, you call

SHPObject *SHPReadObject( SHPHandle hSHP, int iShape );

This returns a SHPObject that contains the geometry for some shape within the map.

This geometry may represent more than one part, such as multiple polygons representing individual states or counties.  The nParts field indicates the total number of parts in the shape.  The vertices for all the parts are stored in single arrays for X values, Y values, Z values, and M values called padfX, padfY, padfZ, and padfM.  So, you have to determine which of the vertices belong to each shape by using the panPartStart array that lists the starting point for each shape.

Here is some example code that shows how to use nParts and panPartStart[] to access vertices for each individual part of the shape.

// Iterate over all of the shape objects in the shape file
for (unsigned int i = 0; i < shapeObjects.size(); i++) {

    // Iterate over the parts of the shape
    for (int j = 0; j < shapeObjects[i]->nParts; j++) {

        // Maintain multiple lists of points.  Each list contains
        // all of the points required for a single polygon.
        list<Point *> *points = new list<Point *>();
        int nVerticesPerPart = 0;

        // Determine which part (unique polygon) of the shape
        // these points belong to.
        if (j == shapeObjects[i]->nParts - 1) {
            int startOfPart = shapeObjects[i]->panPartStart[j];
            nVerticesPerPart = shapeObjects[i]->nVertices - 
                               shapeObjects[i]->panPartStart[j];
        } else {
            nVerticesPerPart = shapeObjects[i]->panPartStart[j + 1] - 
                               shapeObjects[i]->panPartStart[j];
        }

        // Add the points from the current shape part to a list.
        // We will create multiple lists of points - one for each shape part
        for (int k = shapeObjects[i]->panPartStart[j]; k < 
                     shapeObjects[i]->panPartStart[j] + nVerticesPerPart; k++) {

            Point *newPoint = new Point(shapeObjects[i]->padfX[k], 
                               shapeObjects[i]->padfY[k]);

            // Add this point to the end of a list of points
            points->push_back(newPoint);
        }

    // Add this list of points to a list of point lists
    pointsLists.push_back(points);
    }
} 

Projections:  Mercator, Lat/Long, etc.

Aside from examining the maximum and minimum values within the shapefile (and the supporting documentation downloaded with an individual shapefile), I did not find any explicit information about what coordinate/projection system the shapefile is using.

Additional Information

Details on the shapefile specification can be found here: http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf