Spatial Data


Spatial Data Concepts and Issues


Spatial Information

Spatial information is information in 2, 3 or 4 dimensions. It is information in which location has some importance or benefit, and it is not necessarily about locations on the surface of the Earth (e.g., the location can be a body organ or system).

"Spatial" has to do with any multi-dimensional frame of reference, for example:

  • Medical images are referenced to the human body;

  • Engineering drawings are referenced to a mechanical object;

  • Architectural drawings are referenced to a building.


A location can be a body organ or system. Screenshot of BodyViewer, an ArcView extension developed by GeoHealth, Inc. BodyViewer allows users to map any database that contains a geographic reference (e.g., postal codes, street addresses, census tract numbers) and an ICD-9 or ICD-10 code (diagnoses).

Geographic information is a subset of spatial information, though the terms are often used interchangeably. "Geographic" is concerned with planet Earth: its two-dimensional surface, its three-dimensional atmosphere, oceans, and sub-surface. (Bryan, 2000)

It is commonly estimated that about 80% of all data has a spatial component. Such data can be queried and analysed to answer questions such as: "How many (e.g., healthcare facilities, patients with a specific profile, etc.)?", "What kind/type (e.g., type of healthcare facility: GP surgery, district hospital, teaching hospital, specialised centre, etc.)?" and "Where are they located (e.g., relationships between populations, their locations and available/planned healthcare facilities)?" (See also: 'Spatial Analysis'.)

Geographically referenced data refers to data referenced by location on Earth (e.g., latitude/longitude, northing/easting) in some standard format.

Source: http://www.hammondmap.com

Geographic References and Geocoding

Geographic information contains either an explicit geographic reference, such as a latitude and longitude or national grid co-ordinate, or an implicit reference such as an address, postal code, census tract name, forest stand identifier, or road name. An automated process called geocoding is used to create explicit geographic references (multiple locations) from implicit references (descriptions such as addresses). These geographic references allow us to locate features, such as a business or forest stand, and events, such as an earthquake or disease outbreak, on the earth's surface for analysis.
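The idea of geocoding can be sketched as a simple table lookup. This is a minimal illustration only: the `GAZETTEER` table, the postal codes and the co-ordinates are all made-up values, and real geocoders match against much larger reference datasets with fuzzy matching.

```python
# Hypothetical gazetteer: implicit reference (postal code) -> explicit
# reference (latitude, longitude). All codes and co-ordinates are invented.
GAZETTEER = {
    "AB1 2CD": (51.50, -0.10),
    "EF3 4GH": (53.48, -2.24),
}

def geocode(postal_code):
    """Return (lat, lon) for a postal code, or None if it is not in the table."""
    key = " ".join(postal_code.upper().split())  # normalise case and spacing
    return GAZETTEER.get(key)

# Geocode a batch of records; unmatched codes map to None.
records = ["ab1 2cd", "EF3  4GH", "ZZ9 9ZZ"]
located = {code: geocode(code) for code in records}
```

Records that cannot be matched (here "ZZ9 9ZZ") are kept with a value of None so that they can be reviewed rather than silently dropped.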

Geographic information systems rely on two interrelated types of databases:

The Spatial Database

Describes the location and shape of geographic features, and their spatial relationship to other features. The information contained in the spatial database is held in the form of digital co-ordinates, which describe the spatial features. These can be points (for example, hospitals), lines (for example, roads), or polygons (for example, administrative districts). Normally, the different sets of data will be held as separate layers, which can be combined in a number of different ways for analysis or map production.

The Attribute Database

The attribute database is of a more conventional type; it contains data describing characteristics or qualities of the spatial features (i.e., descriptive information): land use, type of soil, distance from the regional centre, or, using the same examples as in the preceding paragraph, number of beds in the hospital, type of road, population of the administrative districts. Thus, we could have health districts (polygons) and health care centres (points) in the spatial database, and characteristics of these features in the attribute database, for instance persons having access to clean water, number of births, number of 1 year old children fully immunised, number of health personnel, and so on. (Loslier, 1995 – in GIS for Health and the Environment)

A GIS links spatial data with descriptive information about a particular feature on a map. The information is stored as ‘attributes’ of the graphically represented feature.


Example: A line that denotes a road tells you nothing but its location. An attribute table stores all relevant (descriptive) information about this feature, which can be queried and displayed in many formats based on the user’s needs.
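The link between the two databases can be sketched with two tables that share a feature identifier. The identifiers, field names and values below are assumptions made for illustration; they do not reflect any particular GIS product's storage format.

```python
# Spatial database: feature id -> geometry (illustrative co-ordinates).
spatial = {
    "H1": {"type": "point", "coords": (2.0, 3.0)},                # a hospital
    "R1": {"type": "line", "coords": [(0, 0), (2, 3), (5, 4)]},   # a road
}

# Attribute database: the same feature ids -> descriptive information.
attributes = {
    "H1": {"name": "District Hospital", "beds": 250},
    "R1": {"name": "Main Road", "road_type": "primary"},
}

def describe(feature_id):
    """Join a feature's geometry with its attributes via the shared id."""
    return {**spatial[feature_id], **attributes.get(feature_id, {})}

def query(predicate):
    """Return the ids of features whose attributes satisfy the predicate."""
    return sorted(fid for fid, attrs in attributes.items() if predicate(attrs))

large_hospitals = query(lambda a: a.get("beds", 0) >= 200)
```

Here `query` selects features by attribute, and `describe` then recovers where each selected feature is located.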

Points, Lines and Polygons

As mentioned above, GIS attempts to describe all features in geometric terms.

  • Point: discrete location represented as a co-ordinate pair (e.g., sampling locations, disease cases, hospitals, and town centroids).

  • Line (Arc): ordered string of co-ordinates (e.g., streams, power lines and pipelines, and transportation routes).

  • Polygon (Area): closed string of co-ordinates whose boundary encloses a homogeneous area (e.g., land use, lakes, census tracts, hospital catchment areas, and town boundaries).

Many features can be described by either a point or a polygon. Similarly, lines can be of a specific width. Map scale and resolution define the conditions for appropriate application of these feature types. The uses of co-ordinate based analysis are only limited by the imagination of the user.
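The three feature types can be sketched as plain co-ordinate structures. The example below assumes planar x,y co-ordinates in arbitrary units and uses the standard shoelace formula for polygon area; the function names are illustrative.

```python
import math

def line_length(coords):
    """Length of a line given as an ordered string of (x, y) co-ordinates."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def polygon_area(ring):
    """Area of a polygon given as a closed ring (first point == last point),
    computed with the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

point = (3.0, 4.0)                                  # a single co-ordinate pair
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]        # ordered co-ordinates
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]  # closed
```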

Vector and Raster GIS

Source: SAGE Introductory Guidebook, by Robert M. Itami and Robert J. Raulings, published by DLSR, Melbourne, Australia, 1993

There are two major methods to input, store and visualise mapped data in GIS. Vector Geographic Information Systems, which store map features in vector format, store points, lines and polygons with high accuracy. They are preferred in urban applications where legal boundaries and the analysis of networks are important. Applications of urban GIS include location and allocation of critical resources such as hospitals, study of disease outbreak patterns and crime analysis.

Raster Geographic Information Systems, which store map features in raster or grid format, generalise the location of features to a regular matrix of cells. Raster GIS data structures are preferred for digital elevation modelling (a DEM records terrain elevations for ground positions at regularly spaced horizontal intervals; see the DEM of the USA below), statistical analysis, remotely sensed data, simulation modelling, and natural resource applications like sedimentation and water quality studies.

In raster-based analysis, the areas of analysis are divided into squares of uniform size (cells). Each cell characterises the feature of interest within this area with a single value. Digital image data, including aerial photos and satellite imagery, are stored in raster format (as pixels). GRID cell-based modelling uses the raster format to determine routing patterns and terrain.


An aerial photo

Source: USGS NSDI Clearinghouse (URI: http://edcwww.cr.usgs.gov/nsdi/gendem.htm)
A digital elevation model (DEM) of the USA

Vector data, on the other hand, are co-ordinate-based data structures commonly used to represent linear features (polygons can be formed by closed strings of co-ordinates). Each feature in this format is represented as a list of ordered x,y co-ordinates.

Computer algorithms exist that can convert data of one type to the other.
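One simple case of vector-to-raster conversion is counting point features per grid cell. The sketch below assumes square cells, a grid origin at the lower-left corner and row 0 as the bottom row; these conventions and all names are assumptions, and production conversions also handle lines and polygons.

```python
def rasterise_points(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Count vector points falling in each raster cell.

    Row 0 is the bottom row of the grid; points outside the grid are ignored.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grid[row][col] += 1
    return grid

# Four illustrative disease cases; the last lies outside the 3 x 2 grid.
cases = [(0.5, 0.5), (0.7, 0.2), (2.5, 1.5), (9.0, 9.0)]
grid = rasterise_points(cases, x_min=0.0, y_min=0.0, cell_size=1.0,
                        n_cols=3, n_rows=2)
```

Each cell ends up with a single value, which shows how the exact position of each point within its cell is generalised away.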

Thematic Mapping

Maps in Geographic Information Systems are represented thematically. A standard topographic map will show roads, rivers, contour elevations, vegetation, human settlement patterns and other features on a single map sheet. In a GIS these features are categorised separately and stored in different map themes or overlays. For example, roads will be stored in a separate overlay. Likewise, rivers and streams will each be stored as a separate theme. This way of organising data in the GIS makes maps much more flexible to use since these themes can be combined in any manner that is useful (individual themes can be 'turned on' and 'off' as needed).

Choropleth Maps

These are thematic maps portraying properties of a surface using area symbols such as shading. Area symbols on a choropleth map usually represent categorised classes of the mapped phenomenon, e.g., population density.
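Grouping a mapped variable into categorised classes can be sketched as follows. Equal-interval classification is assumed here as one of several possible schemes (quantiles and natural breaks are common alternatives); the district names and densities are invented.

```python
def equal_interval_class(value, v_min, v_max, n_classes):
    """Return the 0-based class index of value using equal-width classes."""
    if v_max == v_min:
        return 0
    width = (v_max - v_min) / n_classes
    # The maximum value would fall just past the last class, so clamp it.
    return min(int((value - v_min) / width), n_classes - 1)

# Hypothetical population densities per district (persons per square km).
density = {"A": 10.0, "B": 55.0, "C": 100.0, "D": 32.5}
lo, hi = min(density.values()), max(density.values())
classes = {name: equal_interval_class(v, lo, hi, 3) for name, v in density.items()}
```

Each class index would then be assigned one area symbol, such as a shade, when the map is drawn.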

Examples of choropleth maps

Maps Defined

A map is a graphic representation of some part of the earth’s surface. A map usually contains a series of themes or coverages that are often combined to form the final product. A map also contains descriptive information (e.g., legend) to help readers interpret the details on the map.

Map Scale

The map scale tells the user how the map relates to the real world features it represents.

  • Scale: describes the relation between a single map unit and the number of the same units in the real world, e.g., 1:1000 (1 inch on the map = 1000 inches in the real world - see also 'Resolution' below).

  • Scale Bar: compares the map units to an established real-world unit of measure, e.g., 1 inch = 2.5 miles. It helps users measure real-world distances on the map.
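The arithmetic behind both forms of scale can be sketched briefly. The function names are illustrative; the only constant used is the number of inches in a statute mile (63,360).

```python
INCHES_PER_MILE = 63360  # 5280 feet x 12 inches

def ground_distance(map_distance, scale_denominator):
    """Real-world distance, in the same units as the map distance."""
    return map_distance * scale_denominator

def scale_from_bar(map_inches, ground_miles):
    """Scale denominator implied by a scale bar of map inches to ground miles."""
    return ground_miles * INCHES_PER_MILE / map_inches

# 3 inches on a 1:1000 map correspond to 3000 inches on the ground.
on_ground = ground_distance(3.0, 1000)

# A scale bar of 1 inch = 2.5 miles corresponds to a scale of 1:158,400.
denominator = scale_from_bar(1.0, 2.5)
```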

The term ‘map’ is also used to describe a GIS Project or View. A map can provide an interpretation of features on the earth’s surface. Scale, map units and data layers (themes or coverages) are an inherent part of a GIS and allow users to conduct spatial queries and measure distances in their projects when needed.

Co-ordinate Systems

In a GIS, locations on the earth’s surface described by points, lines, and polygons are defined by a series of x,y co-ordinates. Co-ordinate systems can be self-described or in units that relate to the real world. Decimal degrees; degrees, minutes, seconds; metres; and feet are all examples of units of measure in a co-ordinate system.

A degree (°) is a unit of angular measurement equal to 1/360 of a circle. A degree of latitude on the earth's surface is about 69 miles. A degree of longitude is also about 69 miles at the equator but shrinks to zero at the poles, where the meridians converge; any point on the surface nevertheless rotates through a degree of longitude in about 4 minutes of time. A minute (') is the sixtieth part of a degree, as in 12° 30', read 12 degrees, 30 minutes. An arcsecond (") is the sixtieth part of a minute, as in 30", read 30 seconds.
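How the ground length of a degree of longitude varies with latitude can be sketched under a spherical-earth approximation, in which it scales with the cosine of the latitude. The 69-mile figure is the rounded value used in the text; the function names are illustrative.

```python
import math

MILES_PER_DEGREE_AT_EQUATOR = 69.0  # rounded figure, spherical approximation

def miles_per_degree_longitude(latitude_deg):
    """Approximate ground length of one degree of longitude at a latitude."""
    return MILES_PER_DEGREE_AT_EQUATOR * math.cos(math.radians(latitude_deg))

def minutes_of_time_per_degree():
    """Time for the earth to rotate through one degree: 24 h / 360 degrees."""
    return 24 * 60 / 360

at_equator = miles_per_degree_longitude(0)
at_60_north = miles_per_degree_longitude(60)   # about half the equatorial value
at_pole = miles_per_degree_longitude(90)       # effectively zero
```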

Decimal degrees are the decimal representation of fractions of degrees. Many paper maps express co-ordinates in degrees, minutes, seconds (e.g., 40° 30' 0"), where minutes and seconds are fractions of degrees: 30 minutes equal half a degree, and 30 seconds equal half a minute. GIS software, however, typically expresses co-ordinates in decimal degrees (e.g., 40.50 degrees), where fractions of degrees are expressed as decimals. Thus, a longitude of 40 degrees, 30 minutes would be expressed in ArcView as 40.5 degrees.
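The conversion from degrees, minutes, seconds to decimal degrees can be sketched as below. The function name and the convention of negative values for the southern and western hemispheres are assumptions made for the example, although that sign convention is widely used.

```python
def dms_to_decimal(degrees, minutes=0, seconds=0.0, hemisphere="N"):
    """Convert degrees, minutes, seconds to decimal degrees.

    Southern and western hemispheres are returned as negative values.
    """
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value

forty_thirty = dms_to_decimal(40, 30)            # 40 degrees, 30 minutes
with_seconds = dms_to_decimal(12, 30, 30)        # 12 degrees, 30 min, 30 sec
west = dms_to_decimal(0, 30, 0, hemisphere="W")  # half a degree west
```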

Try this: Online Conversions of Degrees, Minutes, Seconds and Decimal Degrees for Co-ordinates by the Audio Services Division of the Federal Communications Commission, US

x,y co-ordinates define the location of map features. Co-ordinate systems must be consistent between map layers. For any database to be useful for spatial analysis, the database must be registered to a recognised global co-ordinate system.

A co-ordinate system consists of:

  • A Spheroid: a mathematical description of the earth’s shape.
  • A Map Projection: a mathematical conversion from spherical to planar co-ordinates.

Map Projections

The accuracy of measurements made on a map (e.g., of distance) depends on the type of map projection used to draw it.
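The effect can be sketched by comparing a great-circle distance with the distance measured on a simple map that plots longitude as x and latitude as y at a constant equatorial scale (a plate carrée). A spherical earth with a mean radius of 6371 km is assumed, and the function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def plate_carree_km(lat1, lon1, lat2, lon2):
    """Distance as measured on the flat map at equatorial scale."""
    x = math.radians(lon2 - lon1) * EARTH_RADIUS_KM
    y = math.radians(lat2 - lat1) * EARTH_RADIUS_KM
    return math.hypot(x, y)

# One degree of longitude measured at the equator and at 60 degrees north.
ratio_equator = plate_carree_km(0, 0, 0, 1) / great_circle_km(0, 0, 0, 1)
ratio_60 = plate_carree_km(60, 0, 60, 1) / great_circle_km(60, 0, 60, 1)
```

Under these assumptions the map distance matches the true distance at the equator, but east-west distances at 60 degrees north are overstated by a factor of about two.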

Resolution

Resolution depends on map scale

The accuracy with which a given map scale can depict the location and shape of map features is known as resolution. The larger the map scale, the higher the possible resolution. As map scale decreases, resolution diminishes and feature boundaries must be smoothed, simplified, or not shown at all. Resolution plays a large role in GIS, especially in raster-based modelling.

The ability of a map to describe the earth’s features accurately depends heavily on resolution. It is essential that the user be mindful of the scale of the data layers. Serious errors can result if the theme lacks sufficient resolution to effectively describe an area of interest. A GIS does not warn its user when she has chosen the wrong data layer for her project.

Topology

Topology is a mathematical procedure for explicitly defining spatial relationships.

  • Arcs connect to each other at nodes (connectivity),

  • Arcs that connect to surround an area define a polygon (area definition), and

  • Arcs have direction and right and left sides (contiguity).

Connectivity: Arc-Node Topology

  • Points along the arc that define its shape are called vertices.

  • Endpoints of arcs are called nodes.

  • Arcs join only at nodes.

Area Definition: Polygon-Arc Topology

Polygons are represented as a series of x,y co-ordinates that connect to define an area. The GIS also stores the list of arcs that make up the polygon.

Contiguity

Every arc has a direction. The GIS maintains a list of polygons on the left and right side of each arc. The computer uses this information to determine which features are next to one another.
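The arc-based relationships can be sketched with a small table of arcs. The arc, node and polygon identifiers below are hypothetical and the table is not meant to reproduce any specific GIS file format; None stands for the area outside all polygons.

```python
# Each arc records its from-node, to-node, and the polygons on its left
# and right sides when travelling along the arc's direction.
arcs = {
    "a1": {"from": "n1", "to": "n2", "left": "A", "right": "B"},
    "a2": {"from": "n2", "to": "n3", "left": "B", "right": "C"},
    "a3": {"from": "n3", "to": "n1", "left": "A", "right": None},
}

def adjacent_polygons(arcs):
    """Contiguity: pairs of polygons that share at least one arc."""
    pairs = set()
    for arc in arcs.values():
        left, right = arc["left"], arc["right"]
        if left and right and left != right:
            pairs.add(frozenset((left, right)))
    return pairs

def arcs_at_node(arcs, node):
    """Connectivity: arcs that start or end at the given node."""
    return sorted(aid for aid, arc in arcs.items()
                  if node in (arc["from"], arc["to"]))

neighbours = adjacent_polygons(arcs)
meeting_at_n2 = arcs_at_node(arcs, "n2")
```

Adjacency is found without comparing any co-ordinates, which is the practical benefit of storing the left and right polygons explicitly.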

Getting Data into a GIS - Sources of Electronic Data Files

This can be done by:
  • Digitising hard copy maps (see 'Digitisers' below);

  • Keyboard entry of co-ordinate data (co-ordinates are added as a series of numbers defining the location of a point, the shape of a line, or the co-ordinates that define a closed area (polygon); very accurate; requires minimal conversion, but can be time intensive);

  • Electronic entry using a data file;

  • Scanning a map manuscript; and

  • Converting or reformatting existing data.

The Global Positioning System (GPS) can also be used in creating maps.

Data Sources

Electronic data files are the easiest way to get data into a GIS, and ready-to-use data of several kinds is available (see 'Four Main Types of Maps/Data Exist' below).

Digitisers

A digitiser converts spatial features on a hard copy map into digital format. Point, line and area features are converted into x,y co-ordinates. The process involves manually tracing all features of interest using an electronic stylus. Good base maps must be used. After digitising, a procedure known as transformation converts digitiser units to a real-world co-ordinate system. Tics are used to provide the relationship between the two co-ordinate systems.
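The transformation step can be sketched with the simplest possible case: a similarity transform (uniform scale, rotation and shift) fitted from exactly two tics. This is a simplification for illustration; GIS packages generally use four or more tics and fit the transformation by least squares. The tic values and function names are invented.

```python
def fit_similarity(tic_a, tic_b):
    """Fit a similarity transform from two tics.

    Each tic is ((digitiser_x, digitiser_y), (world_x, world_y)).
    Points are treated as complex numbers so that one complex multiplier
    encodes both scale and rotation.
    """
    (da, wa), (db, wb) = tic_a, tic_b
    da, wa, db, wb = complex(*da), complex(*wa), complex(*db), complex(*wb)
    scale_rot = (wb - wa) / (db - da)
    offset = wa - scale_rot * da

    def transform(x, y):
        w = scale_rot * complex(x, y) + offset
        return w.real, w.imag

    return transform

# Two hypothetical tics: 10 digitiser units span 1000 real-world units.
to_world = fit_similarity(((1.0, 1.0), (500000.0, 180000.0)),
                          ((11.0, 1.0), (501000.0, 180000.0)))
world_x, world_y = to_world(6.0, 3.0)
```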

Product: Coverage

This term is used in a GIS to describe a spatial dataset that has a particular ‘theme’. A coverage consists of topologically linked geographic features. For maximal analytical power, each theme should exist as a separate coverage; different feature types can coexist in a coverage if they describe the same data.

Four Main Types of Maps/Data Exist:

  • Base Maps: include streets and highways; boundaries for census, postal, and political areas; rivers and lakes; parks and landmarks; place names; and USGS raster maps.

  • Business Maps and Data: include data related to census/demography, consumer products, financial services, healthcare, real estate, telecommunications, emergency preparedness, crime, advertising, business establishments, and transportation.

  • Environmental Maps and Data: include data related to the environment, weather, environmental risk, satellite imagery, topography, and natural resources.

  • General Reference Maps: world and country maps and data that can be a foundation for a GIS database.

Data Availability and Quality Issues

One potential problem with GIS is the availability and quality of the data that such systems need in order to produce meaningful results. The data provided may be inaccurate, incomplete or inappropriate for a particular use, e.g., not at the geographic scale needed in a given situation (Albert et al., 2000). Sometimes, because of patient privacy and confidentiality issues, researchers do not have access to point-level patient data, and census tract data (polygon; aggregated data) is all that is available.

Data is usually accompanied by descriptions (metadata). Spatial data quality standards are now in place to help users understand what data is available and the intended purpose of each dataset.

 

Health Geomatics
© 2000-2002 MIM Centre, School of Informatics
City University, London, UK
All Rights Reserved.


This page was last modified October 14, 2002
Module Editor and Web Designer: M.Nabih-Kamel-Boulos@soi.city.ac.uk
