Research Context:
Few cancer types are evenly distributed across a population. Geographic
variations in cancer mortality have been associated with risk factor
prevalence, screening behaviors, health care access and utilization,
genetic predisposition and occupational hazards. Consequently, mapping
cancer records (and other health statistics) has been one important
catalyst for developing hypotheses about cancer etiology. This research
has helped to identify flaws in cancer surveillance networks, and
to develop effective cancer control policy.
Geographic Information Systems have extended the promise of health
mapping since they can integrate heterogeneous data from diverse
sources. However, the capabilities of current GIS (even when linked
with other spatial analysis methods and tools) may limit, and bias,
the geographic observations that cause investigations to be initiated.
The risk remains that geographic variations in cancers may be falsely
observed or go unrecognized, or that relationships between cancers
and risk or preventive factors may be misunderstood.
Primary Goals:
- To design, implement, and integrate a suite of geovisualization, exploratory
spatial data analysis (ESDA), and computational software components
targeted to applications in cancer research, surveillance, and
control. We will leverage an existing software platform, GeoVISTA
Studio (described below), which supports independent development
of cancer data analysis software components (by our research team
as well as by others). To this we will add exploratory statistical
and visualization components that can be combined in a highly
coordinated manner using a flexible visual interface. When integrated
through Studio, the created software applications will support
the entire scientific process from data exploration and hypothesis
generation, through rigorous analysis of hypotheses, to presentation
of findings in accessible ways (months 1-36).
- To improve methodologies for exploratory research in cancer epidemiology
and etiology. Exploratory analysis carries a high risk of
finding spurious associations and missing real ones, thereby
introducing errors and misinterpretations. We propose research
targeted at improvements in two areas: (a) development of sound
data exploration methodologies and practices and (b) development
of tools to assess and understand the validity and reliability
of the results. The latter goal will draw on advances from the
machine learning and data mining communities in searching
through vast 'hypothesis spaces', and on research on visualizing
data reliability (months 25-54).
- To ensure that the developed methods and components (a) are usable by and
meaningful to professionals engaged in cancer control and (b)
can be used accurately to address important questions in cancer
research, surveillance and control. These goals will be accomplished
through two linked activities. First, formal usability assessment
methods will be applied throughout the process of tool design,
implementation, and deployment (months 1-60). Second, we will
carry out proof-of-concept applications to case studies focused
on cancer research and policy. These case study applications will
address important questions related to accurate assessment of
geospatial patterns in cancer and will provide an opportunity
to assess both usability and usefulness of tools in realistic
applications (months 25-60).
Software Development:
The project leverages GeoVISTA Studio, a Java-based software environment for
building user-specific computer applications that is being developed by
the GeoVISTA Center. Visual components of Studio fuse geographic
visualization, statistical graphics, and information visualization.
Statistical components focus on ‘local’ spatial statistics. Computational
components will provide pattern recognition and searching across
large data sets. Studio is open-source software with a growing developer/user
base. See our web site for some early examples of methods developed
for health and demographic data analysis.