The dynamic parallel coordinate plot
Introduction
The dynamic parallel coordinate plot is one
of the primary interface and display tools in the exploratory visualization
environment being constructed within the Apoala project. The parallel
coordinate technique was originally proposed and implemented by Inselberg
(1985) and task-specific variations of the device have been put forth by
other statisticians (Wegman, 1990; Miller and Wegman, 1991; Jang and Yang,
1996). Its primary advantage over other types of statistical graphics
is its ability to display multi-dimensional data in one representation,
breaking the traditional bounds of two- or three-dimensional multivariate
representations such as scatter plots.
Each observation in a data set is represented as an unbroken series
of line segments which intersect vertical axes, each scaled to a different
variable. The value of the variable for each observation is plotted
along each axis relative to the minimum and maximum values of the variable
for all observations; the points are then connected using line segments.
The result is a "signature" across n dimensions for each observation.
Observations with similar data values across all variables will share
similar signatures. Clusters of like observations can thus be discerned.
Associations among variables can also be visualized: two inversely related
variables will be connected by line segments (observations) which all cross
in the region between the axes, while two directly related variables will be
connected by roughly parallel line segments. In fact, the number of crossings
of line segments is directly related to the correlation coefficient r
(Wegman, 1990).
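The scaling and crossing behavior described above can be sketched in a few lines of Python. This is only an illustration, not the Apoala implementation: the function names `scale_to_axes` and `count_crossings` and the sample data are hypothetical, and a constant variable is placed mid-axis as an arbitrary choice.

```python
from itertools import combinations


def scale_to_axes(rows):
    """Min-max scale each variable (column) to [0, 1]; return one polyline per row."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant variable has no range; place it mid-axis (an assumption).
        scaled_columns.append([0.5 if span == 0 else (v - lo) / span for v in col])
    return [list(points) for points in zip(*scaled_columns)]


def count_crossings(polylines, left, right):
    """Count pairs of segments that cross between the axes `left` and `right`."""
    crossings = 0
    for a, b in combinations(polylines, 2):
        # Two segments cross when their vertical order flips between the axes.
        if (a[left] - b[left]) * (a[right] - b[right]) < 0:
            crossings += 1
    return crossings


# Hypothetical data: the second variable rises with the first, the third falls.
rows = [(1, 10, 40), (2, 20, 30), (3, 30, 20), (4, 40, 10)]
lines = scale_to_axes(rows)
positive_pair = count_crossings(lines, 0, 1)  # directly related variables
negative_pair = count_crossings(lines, 1, 2)  # inversely related variables
print(positive_pair, negative_pair)
```

With four observations there are six pairs of segments. None of them cross between the two directly related axes, and all six cross between the inversely related axes.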
The Apoala PCP: a user's guide
The dynamic parallel coordinate plot developed here facilitates the
exploration of these relationships.
- The user is able to explore relationships among any set of variables by
  manipulating the variable displayed on each axis.
- The user can strum the lines of the plot, highlighting the trace of an
  individual observation across all variables.
- Clusters of lines may be brushed to discover whether the correlations
  exhibited among observations between two variables are consistent among
  all (or some) other variables.
- The observations are classified (with different colors) according to one
  of the variables.
- Observations which share a range of one variable may be focused upon,
  allowing visual exploration of a subset of queried observations.
- The plot may be used as an interface to other exploratory analysis tools;
  though not implemented in this web version of the tool, the buttons along
  the top of the plot would be used to dictate the variables plotted on a
  two- or three-dimensional scatter plot which would complement the parallel
  coordinate display.
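Two of the operations listed above, focusing on a range of one variable and classifying observations by color, amount to a simple query and a binning step. The Python sketch below is an assumption-laden illustration rather than the tool's own code: the names `focus` and `classify`, the equal-interval classing scheme, the palette, and the station records are all hypothetical.

```python
def focus(observations, variable, low, high):
    """Return the observations whose value for `variable` lies in [low, high]."""
    return [obs for obs in observations if low <= obs[variable] <= high]


def classify(observations, variable, palette):
    """Give each observation a color by equal-interval class of `variable`.

    Equal-interval classing is an assumption; the source does not say which
    classification scheme the plot uses.
    """
    values = [obs[variable] for obs in observations]
    lo, hi = min(values), max(values)
    n_classes = len(palette)
    width = (hi - lo) / n_classes
    colors = []
    for v in values:
        # Guard against a zero range, and clamp the maximum into the top class.
        idx = 0 if width == 0 else min(int((v - lo) / width), n_classes - 1)
        colors.append(palette[idx])
    return colors


# Hypothetical climate records (names and values are made up for illustration).
stations = [
    {"name": "A", "temp": 12.0, "precip": 900.0},
    {"name": "B", "temp": 18.0, "precip": 600.0},
    {"name": "C", "temp": 24.0, "precip": 300.0},
    {"name": "D", "temp": 27.0, "precip": 250.0},
]

warm = focus(stations, "temp", 20.0, 30.0)
colors = classify(stations, "precip", ["blue", "green", "red"])
print([s["name"] for s in warm], colors)
```

Here the focus query keeps only the stations whose temperature falls in the chosen range, and the color list could then be used to draw each observation's line in the plot.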
An Application
Though the dynamic PCP is a valuable tool for many statistical and
information visualization applications, it was designed for use in a geographic
visualization context. It will serve as an interface tool in the Apoala
space-time GIS environment being developed in the Geography department at
Penn State. The first application was the visualization of the results of
data mining a set of climate data in and around Texas and northern Mexico.
This application is described in full in a paper in the International Journal
of Geographical Information Science (IJGIS). There is a link to a condensed
version of the printed paper at the web site for the GeoVISTA Center.
What you need to view the plot
The parallel coordinate plot is a "tclet" (pronounced "TICK-let"), a small
application scripted in Tcl/Tk, a language and toolkit for interface
development. You'll need the Tcl plug-in, available from Scriptics.