Classification methods

Several methods are available for classifying data, and each produces a different visualization of the same data.

Thematic Mapper offers the following methods for data classification:

  • equal intervals,
  • quantiles,
  • natural breaks


Equal intervals

In the equal intervals method, each class occupies an equal interval along the number line. To determine the class interval, divide the range of the data (highest data value minus lowest data value) by the number of classes you have decided to generate (Slocum et al., 2005).
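
As a minimal sketch of this calculation, the Python function below derives the class limits from the data range and the number of classes. The function name and the sample values are illustrative assumptions and are not part of Thematic Mapper.

```python
def equal_interval_breaks(values, n_classes):
    """Return the n_classes + 1 class limits for equal-interval classification."""
    lowest, highest = min(values), max(values)
    # Class interval = data range divided by the number of classes.
    width = (highest - lowest) / n_classes
    return [lowest + i * width for i in range(n_classes + 1)]


# Hypothetical sample data: range 52 - 2 = 50, so 5 classes are 10 units wide.
sample = [2, 4, 7, 11, 15, 22, 30, 41, 52]
print(equal_interval_breaks(sample, 5))  # [2.0, 12.0, 22.0, 32.0, 42.0, 52.0]
```

With these sample values, the class from 32 to 42 contains only one observation (41), which hints at how unevenly populated the classes can become when the data are skewed.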


Fig.1 The distribution of data values on the number line, the determination of classes and the choropleth map created using the equal interval classification method (Slocum et al., 2005).

Advantages-Disadvantages:
Equal intervals can be used if the classification steps are nearly equal in size. The method is also suitable when the data distribution has a rectangular shape in the histogram, that is, when the values are spread roughly uniformly over the range, but this does not happen very often.
The major disadvantage of this method is that the class limits sometimes fail to reflect how the data are distributed along the number line, so some classes may remain empty and will not appear on the map (e-cartouche, IKG ETHZ).


Quantiles

In the quantiles method of classification, the data are rank-ordered and an equal number of observations is placed in each class. To compute the number of observations in a class, the total number of observations is divided by the number of classes (Slocum et al., 2005).
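
The following Python sketch illustrates the idea under simplifying assumptions: the values are sorted and cut into groups of nearly equal size, and tied values are not given any special treatment. The function name and the sample data are hypothetical, and Thematic Mapper may handle remainders and ties differently.

```python
def quantile_classes(values, n_classes):
    """Split rank-ordered values into n_classes groups of (nearly) equal size."""
    ranked = sorted(values)  # rank-order the observations
    n = len(ranked)
    classes = []
    for k in range(n_classes):
        # Integer arithmetic spreads any remainder over the classes.
        start = k * n // n_classes
        end = (k + 1) * n // n_classes
        classes.append(ranked[start:end])
    return classes


# Hypothetical sample data: 10 observations in 5 classes gives 2 per class.
sample = [9, 1, 8, 2, 7, 3, 6, 4, 5, 10]
print(quantile_classes(sample, 5))  # [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
```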


Fig.2 The distribution of data values on the number line, the determination of classes and the choropleth map created using the quantile classification method (Slocum et al., 2005).

Advantages-Disadvantages:
Quantiles can be used for ordinal data, since the class assignment is based on ranked data. The classes are easy to compute and each class is approximately equally represented on the final map.
The main disadvantage of this classification method is that gaps may exist between the observations, which sometimes leads to an over-weighting of single detached observations at the edge of the number line. An example is the large value gap in class 5 on the number line of the figure above (e-cartouche, IKG ETHZ).


Natural breaks

In natural breaks, graphs of the data are examined visually to determine logical breaks. The purpose is to minimize the differences between data values in the same class and to maximize the differences between classes.


Fig.4 The distribution of data values on the number line, the determination of classes and the choropleth map created using the natural breaks classification method (Slocum et al., 2005).

Advantages-Disadvantages:
This method groups data values according to visual, logical and subjective considerations. The main purpose is to minimize value differences between data within the same class and to emphasize the differences between the created classes. The method used in this case is Jenks optimization (Dent, 1996).
A disadvantage of this method is that the class limits may vary from one mapmaker to another, because the class definition is subjective (e-cartouche, IKG ETHZ).
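
To make the optimization goal concrete, the sketch below implements one common formulation of Jenks-style optimization in Python: it searches for the contiguous grouping of the sorted values that minimizes the total within-class sum of squared deviations from the class mean. This is an illustrative assumption about the criterion; the source does not describe how Thematic Mapper implements it, and the function name and sample data are hypothetical.

```python
def jenks_classes(values, n_classes):
    """Group sorted values into n_classes contiguous classes that minimise
    the total within-class sum of squared deviations from the class mean."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the squared deviation of any slice in constant time.
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations of data[i:j] from its mean."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: lowest cost of splitting data[:j] into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Walk the split table backwards to recover the classes.
    classes = []
    j = n
    for k in range(n_classes, 0, -1):
        i = split[k][j]
        classes.append(data[i:j])
        j = i
    return classes[::-1]


# Hypothetical sample data with three obvious clusters.
sample = [1, 2, 3, 10, 11, 12, 50, 51, 52]
print(jenks_classes(sample, 3))  # [[1, 2, 3], [10, 11, 12], [50, 51, 52]]
```

Unlike a visual inspection, a fixed numerical criterion of this kind gives the same class limits for the same data every time.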


Manual classification

In manual data classification, users define the classes themselves by entering the class breaks in the list. The minimum and maximum data values should be omitted from the list.
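
As a hedged illustration, the Python sketch below assigns a value to a class from a list of user-defined breaks that omits the minimum and maximum, so three breaks produce four classes. The function name is hypothetical, and placing a value that equals a break in the upper class is an assumption; the source does not state how Thematic Mapper treats such values.

```python
from bisect import bisect_right


def classify(value, inner_breaks):
    """Return the 0-based class index of value for user-defined breaks.

    inner_breaks holds only the breaks between classes; the data minimum
    and maximum are not part of the list.
    """
    # Assumption: a value equal to a break falls into the upper class.
    return bisect_right(sorted(inner_breaks), value)


# Hypothetical breaks: classes are below 10, 10 to below 20, 20 to below 30, 30 and above.
breaks = [10, 20, 30]
print([classify(v, breaks) for v in (5, 10, 25, 99)])  # [0, 1, 2, 3]
```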


References:
Dent B. D. (1996) Cartography: Thematic Map Design. 4th edition, Times Mirror Higher Education Group, Inc., ISBN 0-697-22970-X, USA
Slocum T. A., McMaster R. B., Kessler F. C., Howard H. H. (2005) Thematic Cartography and Geographic Visualization. 2nd edition, Pearson Education Inc., ISBN 0-13-035123-7, USA
e-cartouche, IKG ETHZ (2007) Data presentation: Cartographic data visualisation - Standardisation and classification of data, Module 1, Swiss Virtual Campus "Dealing with Natural Hazards". Available at e-cartouche, IKG ETHZ

