Paper presented at "New Methodologies for the Social Sciences: The Development and Application of Spatial Analysis for Political Methodology." University of Colorado at Boulder, 10th-12th of March, 2000.
Keywords: Bayesian networks, spatial analysis, Nazi party, electoral geography, place, space.
Introduction
Defining Place and how it matters
Spatiality in aggregate data
A spatial analysis of the Nazi party vote
An overview of Bayesian approaches
A Bayesian network approach to place and politics
Conclusion
The two purposes of this paper are to call for a concentration upon place rather than space in the contextual analysis of politics and to introduce Bayesian Networks (BNs) as a technique to explore a structural understanding of place and politics. In order to make our argument we first discuss differences between compositional and structural conceptions of place, and show the theoretical superiority of the latter. Second, we describe how spatial regressions have incorporated the spatiality in aggregate data to show the place-specific nature of political behavior. Third, spatial regression techniques are criticized for focusing upon space rather than place. Fourth, we describe BNs and show how they may be used to identify the complexity of the mediating role of different aspects of place upon political behavior. Analyses of the growth of the Nazi party vote between 1928 and 1930 are used to illustrate both the spatial statistical techniques and BNs.
Bayesian Networks (BNs) provide a way to uncover probabilistic relationships between variables while also denoting key subsets of variables and the interactions between them (Heckerman et al., 1995). Hence, BNs allow for the evaluation of causal relationships between the dependent and independent elements of a regression equation, and also the complexity of relationships within the explanatory variables. In the quantitative analysis of political behavior, BNs can be used to explore the complexity of the relationships identified by a spatial-regression analysis. Such exploration also provides insights into the way that different aspects of place interact, or, in other words, how the mutually constituted elements of place combine to produce place-specific behavior. The end result is an examination of the relationships within places rather than across spaces.
Despite the promise of BNs to the contextual analysis of politics,
we stress that this is our first attempt at applying this technique to
political behavior. Thus, the spirit of the paper is an exploratory foray
into both the technique itself and the analysis of a mediated and structurally
complex notion of place. The paper is organized in the following way. First,
we outline definitions of place and how place is theorized to mediate politics.
In this section we emphasize the difference between a structural and a
compositional notion of place. Next, we note how such notions of place
have been modeled using spatial-structural regression techniques to highlight
the spatiality in the data, spatial dependence and spatial heterogeneity.
The fourth section of the paper uses the example of a spatial regression
analysis of the growth of the Nazi party vote between 1928 and 1930 to
illustrate the focus upon space rather than place. Section five introduces
an alternative focus and an alternative technique by describing the logic
and construction of BNs. Our purpose in adopting BNs is to unpack the
results of a spatial-structural regression to uncover the complex interactions
between the different components of a place and how they combine to influence
political behavior. In section six, we use the growth in the Nazi party
vote between 1928 and 1930 as an example of how BNs can further our understanding
of the contextual influences upon political behavior by evaluating the
causal role of different aspects of place and how they are related to each
other. In the concluding section, we discuss future steps in the use of
BNs as a tool for contextual analysis to assess its potential as a technique
that can provide a systematic analysis of place rather than space.
When place is considered a structure rather than an entity composed of different attributes, it becomes the unit of analysis. Political behavior is place-specific because of “the intricacies of interaction, the specificity of particular times and spaces, the sense of living as meeting, the context” (Thrift, 1983, p. 39). An analysis of place-specific political behavior, at the very least, needs to capture the institutions that are interacting, the senses of identity, and the actions of different socioeconomic groups. Political outcomes are place-specific because knowledge is interpreted and acted upon within the varying contexts of institutionalized memory, interpretation of contemporary events, and endorsement of political responses (Thrift, 1983, p. 45). A contextual analysis of political behavior considers the role of institutions and identity in mobilizing particular groups. In other words, a structural view of place considers the setting of political actors rather than just the attributes of those actors.
Thus, the analysis of place and politics is, at the outset, an ontological issue. The scale of analysis for a contextual approach to politics is the geographic setting that mediates the political outcomes of interest. The key questions are what are the components of place that determine political behavior, and how do these components interact? A number of geographers have established theoretical frameworks to guide us in an interrogation of these questions.
John Agnew’s Place and Politics (1987) established a theoretical basis for the analysis of place-specific behavior. Agnew identified three aspects of place: location, the role a place plays in the world-economy; locale, the institutional setting of a place; and sense of place, identities forged and given meaning within places. Of course, these three aspects are separated purely for heuristic reasons. Within places, the existence and vitality of institutions such as unions and chambers of commerce are partly a function of the type and health of local economic activity. In addition, a sense of what it means to be working class or a member of an ethnic minority will be developed within institutions, e.g. within a union, and in relation to the activities of other institutions, the police or religious congregations, for example. Thus, Agnew (1987) provides a view of places as being constituted by economic, institutional and sociocultural processes.
Doreen Massey’s (1994) definition of place makes explicit some of the implications of Agnew’s (1987, 1996) work. For Massey (1994, p.120), places are “networks of social relations” that are dynamic over time. The current expression of social relations is, to some degree, a function of the legacy of previous social relations that have been altered. Places are continually changing as current political actors use the background of existing social relations to foster change. Also, for Massey the nature of a place is a product of its linkages with other places and not just a matter of its internal features. Trade, migration flows, and cultural exchanges are examples of how a place reaches out in ways that alter its economic, institutional and cultural makeup. Thus, Massey makes explicit the temporal dynamic of a place and the way that it is part of a broader network of places.
The importance of place in the study of political behavior demands an ontology that recognizes places as objects of study. Such a structural view of place promotes a holistic and relational view of place instead of a compositional perspective that counts the socioeconomic makeup of places. This is where electoral geographers using aggregate data run into trouble when in dialogue with political science colleagues. “Holistic view” can translate into an argument that places are “complex” and “unique”. In other words, the idea that “place matters” is often asserted by reference to individual case studies, at best, or even anecdotal examples. What have been elusive are systematic studies showing how places matter.
To conclude this section, we summarize the components of place that need to be identified in order to incorporate a structural definition of place into a systematic analysis. An operationalization of Agnew’s and Massey’s definitions produces the following measurable components of place: economic role, institutional setting, political-cultural identity, linkages with other places, and changes over time. The historical dynamism of political behavior within places has been illustrated by a number of studies (Agnew, 1987; Archer and Shelley, 1986; Flint, 1998a; Johnston and Pattie, 1998). Also, studies have shown the importance of linkages between places at either the local (Flint, 1998a) or extra-local scale (Cox, 1998). The empirical section of this paper will illustrate how BNs may be used to identify the relative importance of the other three aspects of place (economic role, institutional setting, and political-cultural identity) in mediating political behavior and how these aspects are related to each other. In this way it is hoped that place can be shown to matter while retaining the integrity of a structural notion of place.
Studies that attempt to uncover the role of place often end up looking at the spaces created by political behavior. For example, Flint’s (1998b) analysis of the Nazi party showed that its electoral support varied across regional spaces. Other studies of the regional nature of political behavior include O’Loughlin and Bell (1999), Johnston and Pattie (1988), and Archer and Taylor (1981). Also, the role of linkages between places in determining political behavior has been incorporated into contextual analyses by defining localized pockets, or spaces, of support (Flint, 1998a). In summary, space rather than place has dominated geographers’ contextual analyses of politics, hence the difficulty in substantiating the claim that place matters. The reason for the prioritization of space over place in contextual analyses lies in the search for spatiality in aggregate data that has driven many studies (O’Loughlin et al, 1994; Flint, forthcoming). The following section describes the two forms of spatiality in aggregate data and illustrates their role in focusing attention upon space rather than place. An alternative approach is to use spatial-regression analysis as an aid in constructing BNs that can explore the complexities of place.
Spatial dependence exists when the value of the dependent variable in one spatial unit of analysis is partially a function of the value of the same variable in neighboring units. The existence of spatial dependence may be a manifestation of diffusion and can result in "Galton's problem", whereby "certain traits in an area are often caused not by the same factors operating independently in each area but by diffusion processes" (O'Loughlin and Anselin, 1992, p.17). In other words, an increase in electoral support for a political party in one place may have been a function of increased support in neighboring places. The identification of spatial dependence and its incorporation into the analysis of political behavior operationalizes a key component of place, the importance of the linkages between places in defining context (Massey, 1994).
Spatial heterogeneity refers to a regional pattern in the data that results in instability of parameters across the whole study (Anselin, 1988, p.9). In other words, the slope of any regression equation would not be constant when comparing regions with the complete data set. The identification of spatial heterogeneity within the data indicates the presence of geographical variation in political behavior and its incorporation into statistical models illuminates the place-specific behavior of voter and party. For example, white-collar employees may have supported a party in one region while blue-collar employees supported the same party in a different region.
The application of spatial dependence and spatial heterogeneity to electoral geography has been detailed elsewhere (O’Loughlin et al, 1994; Flint, 1998a; Flint, 1998b; Flint, 1999). In this paper, we identify the logic of spatial dependence and spatial heterogeneity in order to show how, by definition, they allow us to explore the role of space in electoral behavior rather than the role of place.
Spatial heterogeneity indicates the existence of regionally specific relationships across the data. It may exist as a mere statistical nuisance, expressed as lack of constancy of the regression error variance, or it may be indicative of contextual variation in political behavior (O'Loughlin and Anselin, 1992, p.27). Structurally significant heterogeneity is identified in OLS models by diagnostic tests for heteroskedasticity (Anselin, 1992). Heteroskedasticity is the presence of nonconstant variance of the random regression error over all of the observations. If heteroskedasticity is present, the OLS estimates are unbiased but inefficient and inference based upon the t and F statistics will be misleading and the measures of fit will be wrong (Anselin, 1988, p. 120).
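The logic of such a diagnostic can be sketched in a few lines of Python. The sketch is an illustration rather than the procedure used in the analyses cited here: it assumes a single explanatory variable, uses the studentized (Koenker) form of the Breusch-Pagan statistic, and runs on made-up numbers. The function names are our own.

```python
def ols_residuals(x, y):
    """Residuals from an OLS regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return [yi - a - b * xi for xi, yi in zip(x, y)]


def r_squared(x, y):
    """Share of the variance of y explained by a regression on x."""
    resid = ols_residuals(x, y)
    my = sum(y) / len(y)
    tss = sum((yi - my) ** 2 for yi in y)
    return 1.0 - sum(r * r for r in resid) / tss


def koenker_bp(x, y):
    """LM statistic: n times the R^2 from regressing squared residuals on x."""
    squared = [r * r for r in ols_residuals(x, y)]
    return len(x) * r_squared(x, squared)


# Made-up data: ten units whose errors alternate in sign.
x = [float(i) for i in range(1, 11)]
signs = [1.0 if i % 2 == 0 else -1.0 for i in range(10)]
y_constant = [2.0 + xi + s for xi, s in zip(x, signs)]            # same error size everywhere
y_growing = [2.0 + xi + 0.2 * xi * s for xi, s in zip(x, signs)]  # error size grows with x

lm_constant = koenker_bp(x, y_constant)
lm_growing = koenker_bp(x, y_growing)
print(lm_constant, lm_growing)
```

With one explanatory variable the statistic is asymptotically chi-squared with one degree of freedom under the null hypothesis of constant error variance, so values well above 3.84 point to heteroskedasticity at the five percent level.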
If diagnostic tests for heteroskedasticity are significant, regional patterns of political behavior are indicated. In other words, subsets of geographical units, or regional groupings of counties or census blocks, are identified in which different explanations for political behavior are found. After diagnostic tests have identified heterogeneity, previous studies, theoretical frameworks, and exploratory analysis can be used to identify regions that display voting behavior different from the remainder of the data. The identification of these regions, called spatial regimes, is then used to estimate structural change models (Anselin, 1992, p. 321). To estimate a structural change model, cases within a particular region are identified by the use of a dummy variable, and the structural change estimation reports separate regression coefficients for the two sets of cases, those in the region and those that are not. The structural change model is represented by the equations
y_{i} = a_{i} + X_{i}b_{i} + e_{i}    for d = 0
y_{j} = a_{j} + X_{j}b_{j} + e_{j}    for d = 1
where both the constant terms (a_{i(j)}) and the slope terms (b_{i(j)}) take on different values in the two regimes (O'Loughlin & Anselin, 1992, p. 31). To confirm that the structural change estimation captures the regional heterogeneity, tests for heteroskedasticity on the re-estimated model should be insignificant.
In addition, a Chow test is reported for the model as a whole and for
each of the explanatory variables. The Chow statistic (Chow, 1960) is a
test of the stability of regression coefficients across regimes. It
is distributed as an F variate with K and N - MK degrees of freedom (with
K as the number of coefficients per regime, N as the number of observations,
and M as the number of regimes). The test is a test of the null hypothesis
H_{0}: g'b = 0
where b is a vector of all the regression coefficients (including the constant terms) and g' is a K by 2K matrix [I_{K}  -I_{K}], with I_{K} as a K by K identity matrix (Anselin, 1992, p. 322). The corresponding Wald test may be expressed as the equation
W = (g'b)'{g'[var(b)]g}^{-1}(g'b)
where b are the estimates of the regression coefficients and var(b) is the corresponding (asymptotic) variance matrix (Anselin, 1992, p. 322). A significant value for the Chow test measuring the stability of the regression coefficients between the two spatial regimes indicates that heterogeneity existed at the regional scale. In other words, different political behavior existed within the regions, or contextual settings, defined by the spatial regimes.
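As a rough illustration of the test, the following Python sketch fits a two-regime model with one explanatory variable and computes both the Chow F statistic and the Wald statistic. It is a sketch under stated assumptions, not the estimation routine behind the results reported in this paper: the data are invented, the function names are our own, and a common error variance is assumed across the two regimes.

```python
def ols_simple(x, y):
    """OLS of y on a constant and one regressor x.

    Returns the coefficients [a, b], the 2x2 matrix (X'X)^-1, and the
    residual sum of squares.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    xtx_inv = [[1 / n + mx ** 2 / sxx, -mx / sxx], [-mx / sxx, 1 / sxx]]
    return [a, b], xtx_inv, rss


def chow_and_wald(x, y, d):
    """Chow F and Wald W for coefficient stability across regimes d = 0, 1."""
    fits = []
    for regime in (0, 1):
        xs = [xi for xi, di in zip(x, d) if di == regime]
        ys = [yi for yi, di in zip(y, d) if di == regime]
        fits.append(ols_simple(xs, ys))
    (b0, inv0, rss0), (b1, inv1, rss1) = fits
    _, _, rss_pooled = ols_simple(x, y)

    n, k = len(x), 2                      # k coefficients per regime
    rss_regimes = rss0 + rss1
    s2 = rss_regimes / (n - 2 * k)        # common error variance estimate
    chow_f = ((rss_pooled - rss_regimes) / k) / s2

    # g'b is the difference between the regimes' coefficient vectors and
    # g' var(b) g is the sum of their covariance blocks.
    diff = [b0[0] - b1[0], b0[1] - b1[1]]
    v = [[s2 * (inv0[i][j] + inv1[i][j]) for j in range(2)] for i in range(2)]
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    v_inv = [[v[1][1] / det, -v[0][1] / det], [-v[1][0] / det, v[0][0] / det]]
    wald = sum(diff[i] * v_inv[i][j] * diff[j] for i in range(2) for j in range(2))
    return chow_f, wald


# Invented data: the slope is positive in regime 0 and negative in regime 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]
d = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y = [3.1, 4.9, 7.2, 8.8, 11.0, 5.2, 3.9, 3.1, 1.8, 1.0]

chow_f, wald = chow_and_wald(x, y, d)
print(chow_f, wald)
```

Under the common-variance assumption the two statistics carry the same information: the Wald value equals K times the Chow F value, with K = 2 in this example.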
The identification and incorporation of spatial heterogeneity into the analysis of political behavior identifies spaces of regionally specific behavior. These spaces are the product of the combination of relatively similar places and the political behavior that they mediate. The spatial regimes of a structural change model are the manifestation of how place matters in politics but they are not able to capture the processes that determine the role of place as a geographic structure. In other words, spatial heterogeneity and structural change models are a concept and technique that describes the product of place-specific behavior rather than its operation.
Spatial dependence is also incorporated into the spatial analysis of political behavior in order to identify a key component of place, linkages to other places (Massey, 1994). Spatial dependence may be incorporated into models of voting behavior in one of two ways depending upon the diagnostic tests reported in the initial models. First, the average value of the vote in neighboring geographic units in the first of a sequence of two elections, referred to hereafter as the temporal-spatial lag, is incorporated into the initial OLS model. If spatial dependence remains after the inclusion of this variable, then it is replaced by the spatially lagged dependent variable, the average value of the dependent variable in neighboring spatial units. In both of these cases, neighbors may be defined either by distance or by contiguity. If the temporal-spatial lag is positive in sign and statistically significant, it indicates that the size of the vote in one unit is partially a function of the size of support in the neighboring units in the first of the two elections in that particular period of change. If the spatially lagged dependent variable is positive in sign and statistically significant, it indicates that the vote in a unit is partially a function of the vote in the same election in neighboring units. The inclusion of either of these variables models the role of the interlinkages between places in defining the contextual setting of the voter. Methodologically, if substantive spatial dependence is present but ignored, the OLS regression coefficients are biased and inconsistent (Anselin, 1988, p.59).
Spatial dependence can exist in two forms (Anselin, 1988, pp. 11-13). In its substantive form, spatial dependence is interpreted as spatial contagion, whereby the behavior in one spatial unit is partly explained by similar behavior in neighboring units. Methodologically, substantive spatial dependence is incorporated into the regression equation by adding the spatially lagged dependent variable. Formally this may be expressed by the equation
y = pWy + Xb + e
where y is a vector of observations on the dependent variable, X is a matrix of explanatory variables, including the temporal-spatial lag, b are the regression coefficients, e is an error term, p is a spatial autoregressive coefficient, and Wy is the spatial lag, the average of the value of the dependent variable in neighboring units (Anselin, 1992, p. 271).
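The spatial lag itself is simple to compute once neighbors have been defined. The Python sketch below assumes a contiguity definition of neighbors and a row-standardized weights matrix, so that each unit's lag is the plain average of its neighbors' values. The four-unit layout and the values of y are invented for illustration, and the function names are our own.

```python
def row_standardized_weights(neighbors, n):
    """Build W so that each row sums to one over that unit's neighbors."""
    W = [[0.0] * n for _ in range(n)]
    for i, js in neighbors.items():
        for j in js:
            W[i][j] = 1.0 / len(js)
    return W


def spatial_lag(W, y):
    """Return the spatial lag: the average of y over each unit's neighbors."""
    return [sum(w_ij * y_j for w_ij, y_j in zip(row, y)) for row in W]


# Invented example: four units arranged in a line, 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
y = [10.0, 20.0, 30.0, 40.0]

W = row_standardized_weights(neighbors, len(y))
Wy = spatial_lag(W, y)
print(Wy)  # [20.0, 20.0, 30.0, 30.0]
```

A distance-based definition of neighbors would change only the construction of W; the lag is computed in the same way.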
In addition to the substantive interpretation, spatial dependence may also have to be controlled for as a nuisance. This form of spatial dependence is known as spatial error dependence as it is associated with model specification errors that are not restricted to one unit but spill across the spatial units of observation. The usual assumptions of homoskedastic and uncorrelated errors no longer hold and so the spatial error model incorporates a spatial autoregressive process in the error term. To estimate regression coefficients in the presence of spatial error dependence, a spatial autoregressive model is estimated which may be stated in the following equations
y = Xb + e
e = λWe + x
where the notation is the same as above, with λ being the spatial autoregressive coefficient for the errors, We being a spatial lag of the errors, and x a "well-behaved" error term with mean of zero and variance matrix s^{2}I (Anselin, 1992, p.291). The presence of both the spatial lag and the "well-behaved" error term creates a problem of simultaneity. Therefore, a maximum likelihood procedure that includes the estimation of a nonlinear likelihood function must be executed (Anselin, 1988, p. 59). If the spatial error dependence is ignored, the OLS estimates remain unbiased, but inference may be misleading because the OLS variance expressions do not account for the dependence among the errors.
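To see how such errors spill across units, consider the following Python sketch. It assumes the error process takes the usual autoregressive form, in which each unit's error is a coefficient (written lam in the code) times the average error of its neighbors plus its own well-behaved shock. The three-unit layout, the shock, and the value of lam are invented, and solving by repeated substitution is simply a convenient device for the illustration, not the maximum likelihood estimator.

```python
def spatial_error(W, xi, lam, iterations=200):
    """Solve e = lam * W e + xi by repeated substitution.

    The iteration converges when W is row-standardized and |lam| < 1.
    """
    e = list(xi)
    for _ in range(iterations):
        e = [lam * sum(w * ej for w, ej in zip(row, e)) + shock
             for row, shock in zip(W, xi)]
    return e


# Invented example: three units in a line; only the first receives a shock.
W = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
xi = [1.0, 0.0, 0.0]

e = spatial_error(W, xi, lam=0.5)
print(e)
```

Although only the first unit is shocked, all three units end up with non-zero errors, which is the sense in which a specification error in one unit is not restricted to that unit.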
Substantive spatial dependence is a means of operationalizing the role linkages between places play in mediating political behavior. However, though the goal is to uncover the specificity of place the result is the inclusion of connections across space between places. Defining neighbors in terms of contiguity or distance is a spatial relationship that aims to uncover the nature of places. As with the consideration of spatial heterogeneity, incorporating spatial dependence into an analysis of contextual political behavior prioritizes the construction and mediating role of spaces rather than places. In other words, including the spatiality of aggregate data in the spatial analysis of political behavior shows the manifestations of place-specific behavior but not the mediation of politics and social processes within places, a mediation that produces the specificity of place.
There is a second implication of using spatial regression models to
investigate the recursive interaction between place and politics. Regression
analysis partitions the roles played by particular aspects of place. Interrelationships
between the independent variables in a multiple regression analysis are
controlled for rather than sought and incorporated as part of the model.
Hence, a compositional view of place is promoted, whereby place-specificity
is a product of the combination of different socioeconomic attributes.
These two implications of using spatial regression to explore politics
and place are exemplified through an analysis of the Nazi party vote.
It should be noted that these variables are also instruments to test theoretical frameworks that have been used to explain Nazism. Institutional setting is the place-specific manifestation of Burnham's (1972) theory of political confessionalism. Burnham argued that Catholics and the industrial proletariat would not have been attracted to the Nazi party because of their respective allegiances to the Center party and the Social Democrats and Communists. In other words, religious and labor institutions would have been an element of the particularity of places. Sense of place is the manifestation of local identities generated by alienation (Arendt, 1958; Kornhauser, 1959), and is measured through the surrogates of unemployment and electoral turnout. Finally, the class theory (Lipset, 1960) argued that the economic policies of the Nazi party attracted the self-employed middle class, while Flint (1998b) and Ault and Brustein (1998) found that artisans and skilled workers were also susceptible to the Nazis’ economic policies. The economic role of a place is investigated by the relative size of these economic groups, measured by the variables TOTSELF and BCTRADE.
Exploratory analysis is required before the estimation of spatial regression models. Multiple regression models were specified using the variables discussed above plus adding others by stepwise regression techniques. Though a theoretically informed model is preferred, additional variables were considered in order to counter the critique of electoral geography that contextual influences identified by geographers are a function of poorly specified models that do not include all the relevant explanatory variables (McAllister, 1987). The only other significant variable to be found was TRADJOBS, the percentage of the workforce employed in the trade and transport sector.
The spatial analysis of the Nazi party was conducted at the national scale in order to maximize the number of cases and, therefore, improve the robustness of the Bayesian network created later. The results are displayed in Table One. The model provides evidence for a significant role for all three aspects of place, but the direction of the relationships is surprising. The positive and significant value of the variable PROT illustrates, as expected, that places without strong Catholic institutions supported the Nazi party. The negative and significant value of the variable TURNOUT indicates that political alienation was not a factor in the Nazi party vote. Instead, the institutional setting was one of weakening political parties, allowing the Nazis to capture their support. The two variables measuring socioeconomic status are harder to interpret. Both BCTRADE and TRADJOBS measure employment in the trade and transport sector, the former focusing upon a particular class of employee. BCTRADE displays a negative sign while TRADJOBS is positive. Other analyses have shown BCTRADE to be positively related to the Nazi party vote (O’Loughlin et al, 1994; Flint, 1998b). The positive sign of TRADJOBS may indicate support for the Nazi party in urban transport nodes, but this is just speculation.
Fig. 1.
In addition to significant socioeconomic variables, regional dummy variables were included to capture heterogeneity across the national surface. Germany was divided into eight historical-cultural regions in order to capture the heterogeneity of German society and its possible influence upon voting behavior. The regions were Prussia, the Northwest, Rhineland-Westphalia, Silesia, Central Germany, Baden, Bavaria, and Württemberg (Figure 1, left). The regions were designed to capture cultural similarities, historical interactions, and political organization. Also, the borders of these regions were related to the regions created by the Nazi party to organize their political campaigns. In sum, the regions are an attempt to capture a similarity in the message being disseminated by the Nazis and the similarity in the cultural setting within which it was received. The significant value of three of the regional dummy variables indicates the presence of regional heterogeneity. However, the significant value of the Breusch-Pagan statistic indicates heterogeneity within the regions.
Finally, the regression model is a spatial error estimation to control for spatial autocorrelation across the error terms. LAMBDA is the spatial autoregressive coefficient that controls for the spatial error autocorrelation and allows for the estimation of unbiased and efficient estimates (Anselin, 1992, p.292).
Spatial regression models are effective in illustrating spatial variation in political behavior as well as how linkages across space are a factor in defining place-specific politics. However, the emphasis upon space is to the detriment of the understanding of how different elements of place interact to mediate politics. Spatial regression does offer insights into the attributes of places that mediate political behavior. However, these attributes, as independent variables in a regression analysis, are treated separately. The alternative approach offered by BNs has the benefit of exploring the interaction between different aspects of place and how they combine to mediate political behavior. Spatial regressions promote a compositional view of place by separating out additive socioeconomic attributes of place. On the other hand, BNs promote a structural view of place by showing the mutually constituted complexity of place and its mediation of politics.
Bayesian networks (BNs) have recently gained popularity in the modeling
of uncertain relationships among variables. For introductory and accessible
texts see Pearl (1988), Charniak (1991), Heckerman et al (1995), and Jensen
(1996). The term "Bayesian network" encompasses a variety of graphical
models for representing knowledge and associations within a data set (including
Bayesian belief networks, Bayesian inference networks, and graphical probability
networks). However, the term Bayesian network (BN) is preferred as it is
more neutral than including the term belief, causal, or inference (Charniak,
1991). Though the theory behind BNs has been in existence for over a century,
only now have the difficult problems of computing probabilities given evidence
and conditional probabilities become tractable using modern computing and
breakthroughs in algorithms for "propagating" the uncertainty through the
related variables. Indeed, less than ten years ago Charniak (1991) was
lamenting the computation time needed to construct BNs.
A Bayesian network is a graph of relationships among variables in a data set. A network consists of a series of nodes, each representing a variable, and arcs (or edges), connections (with direction) between nodes representing a causal (but uncertain) relationship between the variables in the nodes (Charniak, 1991, p.50). The assignment of these relationships can be driven either by the data or by the analyst, but most often and most effectively, the relationships are derived from some combination of the two (Spiegelhalter et al, 1993). A connection between two nodes may be interpreted as either a causal path from one to the other or evidence that the nodes are correlated (Charniak, 1991, p. 54). Causal relationships are, of course, useful in making predictions given certain information, but learning causal associations is also important in exploratory analysis, in gaining insight into a data set (and its corresponding problem domain) (Heckerman, 1996a).
The expert input may be interpreted as the prior probability of a relationship that is then judged by the data to create a posterior probability (Mitchell, 1997, p. 157). A maximum likelihood approach is used to create the probabilities taking into account the observations and the assumed probabilistic distribution of the data (Mitchell, 1997, p. 157). Thus, Bayes' theorem evaluates the probability of a hypothesis by considering its prior probability, the probability of particular observations given the hypothesis, as well as the actual observations (Mitchell, 1997, p.157). Bayes' theorem may be stated
P(h|D) = P(D|h)P(h) / P(D)
where P(h|D) is the posterior probability of our hypothesis (h), P(h) is the prior probability, P(D) is the probability of observing the data given assumptions of its distribution without reference to h, and P(D|h) is the probability of observing the data given that h is correct (Mitchell, 1997, p. 156).
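A small numerical illustration may help. The Python sketch below applies Bayes' theorem to two competing hypotheses about whether two variables are related. The prior and likelihood values are invented for the example and do not come from our data, and the function name is our own.

```python
def posterior(prior, likelihood):
    """Apply Bayes' theorem to a set of mutually exclusive hypotheses.

    prior      : dict mapping hypothesis -> P(h)
    likelihood : dict mapping hypothesis -> P(D | h)
    Returns a dict mapping hypothesis -> P(h | D).
    """
    evidence = sum(prior[h] * likelihood[h] for h in prior)  # P(D)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}


# Invented numbers: the expert initially doubts that the variables are
# related, but the observed data are more probable if they are.
prior = {"related": 0.3, "unrelated": 0.7}
likelihood = {"related": 0.8, "unrelated": 0.2}

post = posterior(prior, likelihood)
print(post)
```

In this example the data raise the probability of the "related" hypothesis from 0.3 to roughly 0.63, which shows how an expert's prior belief is revised rather than simply replaced by the observations.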
The type of probabilistic approach of Bayesian methods is in contrast to the classical approach of statistics, which deals with confidence intervals and levels. Where classic deductive reasoning assumes that observation alone can be used to predict unobserved events, Bayesian methods incorporate beliefs about the probability of an outcome held prior to (and perhaps updated by) the observations. In this way, Bayesian networks allow inference without repeated trials by allowing direct construction by a domain expert of the probability tables associated with each node.
The arcs defined by a BN allow for the evaluation of the conditional independence of the variables, or nodes (Mitchell, 1997, p. 185).
Fig. 2.
The idea of conditional independence produces three types of path (see Fig. 3, right, adapted from Charniak, 1991).
Fig. 3.
A BN may include a variety of these paths. It is from the graphic visualization and probabilistic calculation of these relations that the structural nature of place can be explored. Variables representing different aspects of place can be included in a network. The existence of causal paths between these nodes and one representing political behavior graphically displays the mediation of politics by elements of place. In addition, a converging path will show that different aspects of place play a role in mediating political behavior. Linear and diverging paths illustrate the complexity of place by showing how different aspects of place are related to each other and may have a less immediate impact upon political outcomes. Another avenue for inquiry, and one not pursued in this paper, is to reverse the direction of the acyclic arcs to explore the recursive interaction between politics and place.
Of course, the assumption of prior knowledge about the relationships between variables is a weakness as well as a strength. A drawback to the construction of Bayesian networks is the determination of their structure with respect to causality and dependence among the nodes. To determine the structure of a Bayesian network, two things are required: some order of the variables that indicates which variables are causes and which are effects, and an assessment of the subset of variables that are conditionally independent of one another. On the one hand, Charniak (1991, p. 61) dismisses the problems of defining prior probabilities. Heckerman (1996b, pp. 13-14) likewise has found that "the causal semantics of Bayesian networks are in large part responsible for the success of Bayesian networks as a representation of an expert system," and asserts that, rather than searching through n! different combinations of orders, people "can often readily assert causal relationships among variables, and causal relationships typically correspond to assertions of conditional dependence." On the other hand, defining prior probabilities is much more problematic for those considering social behavior than for those considering, say, medical diagnoses, as the processes being investigated are much more contingent.
A computer, in an unsupervised network generation algorithm, explores all possible pairs of nodes for conditional dependency (though, in our experience, we have not encountered a network generation algorithm that explores all possible orderings of the variables). After the dependencies and causal relationships are in place in a Bayesian network, the local probability distributions for each node (given specific outcomes or values of its parents) are assessed and incorporated into the graph.
It is possible to criticize BNs on the grounds that these prior probabilities often seem arbitrary; it is rarely easy, even with domain experts, to develop probabilities for every possible combination of elements in a set of variables. For this reason, the problem (and its associated computing complexity) is reduced somewhat by discretizing continuous variables into classes or rankings (Charniak, 1991, p. 51). Even so, the number of probabilities in a single node with three possible values (low, medium, and high) and p parent nodes, each having three possible values, is 3^(p+1). Thus, entering all of the values in the node probability tables becomes daunting for even the simplest of models (Agena Ltd., 1999). However, the complexity of the network is reduced when some nodes are found to be conditionally independent, which reduces the number of connections and probabilistic outcomes (Charniak, 1991, p. 53).
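The growth in the size of a node probability table can be made concrete with a short calculation. The following is a minimal sketch in Python; the function name cpt_entries is our own illustrative choice and is not taken from any BN software.

```python
def cpt_entries(node_states, parent_states):
    """Count the probabilities in one node probability table.

    node_states   -- number of values the node itself can take
    parent_states -- list with the number of values of each parent
    """
    entries = node_states
    for k in parent_states:
        entries *= k  # one block of entries per parent value
    return entries


# A three-state node with p three-state parents needs 3**(p + 1) entries.
sizes = {p: cpt_entries(3, [3] * p) for p in range(5)}

# A binary node with three binary parents, for comparison.
binary_size = cpt_entries(2, [2, 2, 2])
```

With four three-state parents the table already holds 243 entries, whereas a binary node with three binary parents holds 16, which is one practical argument for coarse discretization.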
In our application of Bayesian networks, the network structure (that is, the conditional dependencies among the nodes of the networks) was not provided in advance. The construction of linkages establishing conditional dependencies among variables is a difficult and time-consuming task because it (most effectively) must be performed manually by an expert. The machine learning of the network structure is achieved by calculating the probabilities of possible network linkages and selecting a model that maximizes the probability of the result. A network structure is created that has a high probability given the data: the structure itself amounts to a hypothesis about the conditional dependencies of the variables in the data set. The search, then, is for the maximum likelihood hypothesis given the data set.
The relative likelihood of each of the large number of possible network structures is evaluated (Cooper and Herskovits, 1992). The number of possible structures increases exponentially with the number of nodes (variables) present, but a series of assumptions cuts down on the number of tested solutions (and resulting complexity) without losing significant explanatory power (Cooper and Herskovits, 1992). A metric for the likelihood of the structure is determined and compared to a measure of the most likely structure found thus far in the search. If the probability, given the data, of the network presently evaluated is greater than the previous "best" result, the present network is established as the most likely structure.
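The search just described can be illustrated with a minimal sketch of a greedy, score-based parent search in the style of Cooper and Herskovits (1992). It is not the software used for the analyses in this paper. The variable names (prot, noise, vote), the toy data, and the limit on the number of parents are hypothetical, and the sketch assumes that the candidate parents all precede the child in the node ordering.

```python
from collections import Counter
from math import lgamma


def log_k2_score(data, child, parents, n_states=2):
    """Log of the Cooper-Herskovits metric for one node and one parent set."""
    joint = Counter()   # N_ijk: counts of (parent configuration, child state)
    config = Counter()  # N_ij: counts of each parent configuration
    for row in data:
        j = tuple(row[p] for p in parents)
        joint[(j, row[child])] += 1
        config[j] += 1
    score = 0.0
    for j, n_ij in config.items():
        # log[(r - 1)! / (N_ij + r - 1)!], written with log-gamma
        score += lgamma(n_states) - lgamma(n_ij + n_states)
        for k in range(n_states):
            score += lgamma(joint[(j, k)] + 1)  # log(N_ijk!)
    return score


def k2_parents(data, child, candidates, max_parents=3, n_states=2):
    """Add, one at a time, the parent that most improves the score."""
    parents = []
    best = log_k2_score(data, child, parents, n_states)
    while len(parents) < max_parents:
        scored = [(log_k2_score(data, child, parents + [c], n_states), c)
                  for c in candidates if c not in parents]
        if not scored:
            break
        new_best, choice = max(scored)
        if new_best <= best:
            break  # no candidate beats the best structure found so far
        parents.append(choice)
        best = new_best
    return parents, best


# Toy data (hypothetical): vote copies prot, while noise is unrelated to both.
rows = [{"prot": i % 2, "noise": (i // 2) % 2, "vote": i % 2}
        for i in range(40)]
parents, best = k2_parents(rows, "vote", ["prot", "noise"])
```

On these toy data the search keeps prot as the only parent of vote, because adding noise lowers the score. The comparison against the best score found so far is the step described in the text.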
The maximum likelihood measure can be assessed both locally for each link (thus showing the most dominant dependencies in the data set) and globally (the measure for the comparison of one network to another in the search). The log likelihood is given, since the probabilities (which can range from 0 to 1) are typically very small numbers, and it is easier to examine the exponents (usually large negative numbers) of the likelihood measures rather than the measures themselves. Thus, the most likely hypotheses (networks) are those with the least negative global log likelihoods (those closest to zero), and the strongest dependencies are those with the least negative local log likelihoods.
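A small numerical illustration shows why log likelihoods are reported rather than the likelihoods themselves. The sketch below assumes, purely for illustration, 743 observations (the number of units in Table One) that each contribute a probability of 0.04; the value 0.04 is arbitrary.

```python
import math

# 743 observations, each assigned an (arbitrary) probability of 0.04.
probs = [0.04] * 743

# The raw likelihood is far too small for floating point and becomes 0.0.
likelihood = math.prod(probs)

# The sum of the logs stays finite and can be compared across networks.
log_likelihood = sum(math.log(p) for p in probs)

# A second, hypothetical model that assigns 0.05 to every observation.
better_log_likelihood = sum(math.log(0.05) for _ in probs)
```

The second model has the log likelihood closer to zero and is therefore the more likely of the two, although both raw likelihoods would be indistinguishable from zero.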
Our purpose in using a BN is to identify how different aspects of place interact to produce place-specific political behavior. BNs are a tool for the systematic analysis of probabilistic relationships that geographers cite as being the key mediating factors of places. The need for expert input into BNs gives them a role within the electoral geographer’s toolkit. Theory, case studies and complementary quantitative techniques can offer evidence for the expert in their construction of the network. In turn, the relationships found within a BN can be used to inform theory, case study and further quantitative analysis.
Following Spiegelhalter et al. (1993), the BN of Nazi voting behavior may be thought of at three levels of representation. The first is the qualitative level, which investigates the general relationships by creating arcs between the nodes (Spiegelhalter et al., 1993, p. 220). The second level, or the probabilistic domain, calculates the joint distribution of the nodes in terms of probabilities (Spiegelhalter et al., 1993, p. 222). The third level, or the quantitative domain, provides a numerical evaluation of the conditional distributions (Spiegelhalter et al., 1993, p. 223). A BN uses the conditional distributions of individual arcs to create a joint probability for the network as a whole. Hence, we end up with a graphic visualization of the structural relationships creating place and mediating political behavior as well as a quantitative evaluation of the combined determination of that behavior.
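How conditional distributions combine into a joint probability can be sketched with a three-node converging path. All probabilities below are invented for illustration and are not estimates from the Weimar data; the node names only echo the variables discussed in this paper.

```python
from itertools import product

# Hypothetical marginal distributions for two binary parent nodes
# (1 = above the mean, 0 = below the mean).
p_prot = {1: 0.6, 0: 0.4}
p_manual = {1: 0.5, 0: 0.5}

# Hypothetical conditional distribution P(nazi = 1 | prot, manual).
p_nazi_given = {(1, 1): 0.55, (1, 0): 0.80, (0, 1): 0.15, (0, 0): 0.30}


def joint(prot, manual, nazi):
    """Joint probability as the product of the local distributions."""
    p_n = p_nazi_given[(prot, manual)]
    return p_prot[prot] * p_manual[manual] * (p_n if nazi == 1 else 1 - p_n)


# The joint distribution over all eight states sums to one.
total = sum(joint(*state) for state in product((0, 1), repeat=3))

# Marginal probability of a high Nazi vote, summing out both parents.
p_nazi = sum(joint(p, m, 1) for p, m in product((0, 1), repeat=2))

# Probability that a place is Protestant given a high Nazi vote.
p_prot_given_nazi = sum(joint(1, m, 1) for m in (0, 1)) / p_nazi
```

The same three quantities correspond to the three levels above: the arcs fix which tables exist, the product defines the joint distribution, and the tables supply the numbers.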
The variables used in the BNs are the same as the ones in the previous regression analysis. The construction of BNs requires discrete data. This is a drawback in the use of BNs as it requires expert intervention in deciding what breaks are used to recode continuous variables. For this analysis a binary classification was adopted, using the mean of the variable as the break. The binary classification produced more robust networks than those adopting three or four categories did.
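The recoding step can be sketched as follows. The treatment of values exactly equal to the mean (coded 0 here) is our assumption, since the text does not specify it, and the sample values are invented.

```python
from statistics import mean


def binarize_at_mean(values):
    """Recode a continuous variable as 1 (above the mean) or 0 (otherwise)."""
    cut = mean(values)
    return [1 if v > cut else 0 for v in values]


# Invented percentages for four territorial units; their mean is 30.0.
turnout = [10.0, 20.0, 30.0, 60.0]
turnout_binary = binarize_at_mean(turnout)
```

Because the break is the mean rather than the median, a skewed variable such as this one yields unequal classes: only the last unit is coded 1.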

The complex network (Fig. 4, left) illustrates the complexity of place. The different components of place are related to each other, and the nodes representing Protestant, manual industrial workers, and blue-collar trade and transport workers are all parents of the Nazi node. The manual industrial workers node has a direct link to the Nazi node and an indirect link in terms of a linear path through the blue-collar trade and transport workers node. The same is true for the Protestant node, as it displays a direct link to the Nazi node and a linear path through the blue-collar trade and transport workers node. Nodes measuring alienation, electoral turnout and unemployment, play different roles. The electoral turnout node is the apex of a diverging path connecting to manual industrial workers and self-employed. On the other hand, unemployment is an ending node conditionally dependent upon manual industrial workers and self-employed. The class node measuring the self-employed is at the end of arcs leading from manual industrial workers, electoral turnout, and Protestant. In turn, the self-employed node is a parent to unemployment, measuring economic alienation. Finally, the node measuring jobs in the trade sector is the end node of two arcs, one from manual industrial workers and the other, not surprisingly, from the blue-collar trade and transport workers node. The global log likelihood score for the network was –3488, allowing for comparison of its explanatory value with subsequent networks.
The network is, perhaps, better thought of as a web. The web contains nodes measuring the institutional, identity, and class composition aspects of place. Sense of place, or identity within place, may be engaged by noting the level of alienation of the inhabitants. Economic alienation is measured via the unemployment node and political alienation via the electoral turnout node. The institutional setting of a place was measured by two nodes: religious institutions are measured via the Protestant node and organized labor via the manual industrial workers node. The economic role of a place was captured by nodes measuring class composition: self-employed, and blue-collar workers in trade and transport. The arcs between the various nodes show how the different components of place are mutually constituted. For example, political alienation is related to the institutions of organized labor as well as the size of the self-employed group. As another example, religious institutions are related to the size of the self-employed group and also to the presence of blue-collar workers in trade and transport. A final example of the relationship between different aspects of place can be seen in the relationships of organized labor and self-employment to unemployment, or economic alienation.
When it comes to showing how these aspects of place structure political behavior, two institutional nodes, Protestant and manual industrial workers, and one class node, blue-collar trade and transport workers, explain Nazi party electoral support. In addition, the network also shows that the institutional aspects of place are translated through class composition to explain political behavior. Hence the two linear paths: from manual industrial workers through blue-collar trade and transport workers to the Nazi party vote, and from Protestant through blue-collar trade and transport workers to the Nazi party vote. In other words, different aspects of place explain political behavior and these aspects are related to one another.
To make the explanation of Nazi party support clearer, two refined networks were created. The construction of these networks was based upon expert knowledge and the previous complex Bayesian network. Agnew’s (1987) theory of place suggests that measures of economic role, institutional setting, and sense of place should be included in the network. Complementing Agnew’s approach are theories of Nazi party support suggesting the role of institutionalized political competition (Burnham, 1972), political and economic alienation (Arendt, 1958; Kornhauser, 1959), and class (Lipset, 1960; Ault and Brustein, 1998). Hence, expert knowledge suggests that variables measuring identity/alienation, class or socioeconomic status, and the role of institutions should be included in the refined network. The complex network identified Protestant, manual industrial workers, and blue-collar trade and transport workers as parent nodes of the Nazi vote. In addition, earlier regression analysis had identified electoral turnout as an explanatory variable. In combination, previous analyses and expert knowledge called for a network that included alienation (measured by electoral turnout), institutional setting (measured by manual industrial workers and Protestant), and class (measured by blue-collar trade and transport workers) as potential parents for the change in the Nazi party vote.
The choice of variables to be entered into the network was determined by Agnew’s (1987) theory of place. However, different interpretations of Agnew’s theory produced different ideas of how the aspects of place interacted. Because the order in which the variables are entered into the network may have an effect upon the relationships that are found (Heckerman, 1996b, p. 13), two
Fig. 5.
Figure 6 (right) shows the strength of the relationships in the first refined network. The figures are log likelihood scores, and the lower the score the greater the probability that the status of a parent node determines the status of its child. The log likelihood table shows that the strongest relationship (419.39) is a product of considering the religious and labor institutional settings with class status (or Protestant, with manual industrial workers and blue-collar trade and transport workers). Further support for the role of institutional setting in mediating political behavior is offered by the next strongest relationship (420.33), one that considers just the Protestant and manual industrial workers nodes. The third strongest relationship (430.34) confirms the interaction between class standing and institutional setting in mediating political behavior, by quantifying the interaction of Protestant and blue-collar trade and transport workers and their relationship to the Nazi party vote. The log likelihood table also indicates the relative unimportance of electoral turnout in mediating the Nazi party vote. The global log likelihood score for the refined network was –2427, illustrating its greater explanatory value compared to the complex network.
The relatively weak role played by electoral turnout in the first refined network suggested an alternative refined network. The second refined network (Fig. 7, left) places class position as the initial node, a position from which alienation is experienced and then interpreted through religious institutions.
Fig. 7.
As in the first refined network, religious institutional setting (measured by Protestant) displays a relationship to the Nazi party vote separate from the other variables. Both the manual industrial workers and blue-collar trade and transport workers nodes display direct relationships with political behavior. Entering electoral turnout into the network at a later stage confirms the finding of the first network that it has no relationship with the Nazi party vote. However, there is a relationship from manual industrial workers to electoral turnout that suggests the role of class standing and/or labor organization in voter mobilization. Finally, the linkage between manual industrial workers and blue-collar trade and transport workers probably reflects an obvious correlation between two measures of class.
The log likelihood tables for the second refined network are, not surprisingly, the same as the previous ones (Fig. 6, above). The second refined network shows the dominant roles played by institutions and class standing in mediating the Nazi party vote. Religious institutions in particular, and the interaction between class standing and class organization, combined to form settings that either nurtured or frustrated Nazi electoral success. The global log likelihood score for the second refined network was –2424, meaning that its explanatory value was almost identical to that of the first refined network.
Fig. 8.
A final, complex network was created in order to test the robustness, or sensitivity, of the refined networks or, in other words, to check that the relationships found were not merely an artifact of the order in which the nodes were entered into the network (Fig. 8, right). A sensitivity network was created in which all the variables were entered in a random order. The acyclic graph illustrates that the same three nodes (Protestant, manual industrial workers, and blue-collar trade and transport workers) have direct pathways to the change in the Nazi vote. In addition, electoral turnout is located in a linear path, in this case between manual industrial workers and Protestant. The role of these four variables in the complex and randomly generated network provides support for the claim that the relationships in the refined expert networks are not an artifact of the data.
With respect to the Nazi party vote, institutions and class position were found to be the most important aspects of place in determining political behavior. The dominant role of Protestantism in explaining the Nazi party vote has been a staple of previous analyses (for example, see Falter, 1991). However, the BN approach conceptualizes Protestantism as an institutional feature of place rather than an individual characteristic. The importance of institutional setting in explaining the Nazi party vote is confirmed by the role of the manual industrial workers node in the networks. Finally, the interrelationship between the manual industrial workers node and the blue-collar trade and transport workers node illustrates how institutional setting constitutes class standing. In combination, different aspects of place interacted to form spatial settings conducive to Nazi party electoral success.
BNs reorient the quantitative analysis of contextual politics from a focus upon space to a focus upon place. Instead of looking at spatial linkages between places and variation in political behavior across regional spaces, BNs provide a systematic way of analyzing the mechanisms by which place mediates political behavior. The next step is to incorporate space into the BNs. Spatially lagged variables of political behavior may be included in the networks to capture local political activity. In addition, and once spatial analysis has identified regions of political behavior, different networks can be created for different regions to engage such spatial heterogeneity. Hence, the mutual construction of place and space can be included in the same analysis.
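As an indication of how a spatially lagged variable might be prepared for inclusion in a network, the sketch below computes a row-standardized spatial lag, that is, the mean value among each unit's neighbors. The unit labels, the contiguity lists, and the vote changes are hypothetical, and assigning 0.0 to a unit without neighbors is our assumption.

```python
def spatial_lag(values, neighbors):
    """Mean of the neighbors' values for each unit (row-standardized lag)."""
    lag = {}
    for unit, nbrs in neighbors.items():
        if nbrs:
            lag[unit] = sum(values[n] for n in nbrs) / len(nbrs)
        else:
            lag[unit] = 0.0  # assumption: isolated units receive a lag of zero
    return lag


# Hypothetical change in vote share for three contiguous units in a row.
vote_change = {"A": 10.0, "B": 20.0, "C": 30.0}
contiguity = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

lagged_vote = spatial_lag(vote_change, contiguity)
```

The lagged values could then be discretized in the same way as the other variables and entered as an additional node.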
The purpose of this paper was to investigate how places mediate political behavior. Once the elements of place were identified through theory, the relationships amongst them were identified graphically and probabilistically. Hence the way that aspects of place are mutually constituted was explored. In addition, BNs show how the complexity of place aligns to mediate political outcomes. Thus, BNs offer a fruitful mechanism for unpacking the components of place to see how they interact to structure place-specific behavior. Using this technique, place is shown to be more than the sum of its parts, as it is the way that the elements of place combine that produces place-specific behavior.
The election file includes returns for the Reichstag elections between 1920 and 1933, and the census file contains socioeconomic data collected from a variety of sources and at a variety of times. Jürgen Falter and Wolf Gruner (1981) have described how this data set was revised during the 1970s to correct data errors, which were mainly a result of punching errors, and also to compensate for internal political boundary changes within Weimar Germany in order to make the territorial units within the data set as consistent and coherent as possible. The changes in internal political borders in Weimar Germany were a product of major changes in administrative units between 1919 and 1933 that were partially a result of the incorporation of suburbs into urban areas and the reform of local government. Falter and Gruner believe that the sources of the data set are the respective volumes of the Statistik des Deutschen Reiches (Statistics of the German Reich) issued by the Statistisches Reichsamt (State Statistical Office) in Berlin.
2. With the innovations in the computational algorithms for the propagation of evidence given uncertain relationships among variables, BNs demonstrate their utility as self-contained decision support systems. Once a network has been defined, and the probability distribution among the variables entered (either by the expert, by the data, or some combination of both), BNs can be used to propagate known evidence, such as the states of certain variables, and predict the likelihood of states of other variables given that evidence. For example, a BN is an effective representation for the diagnosis of car problems: given the fact (evidence) that the car doesn't start, but that the radio works and the lights work, there is a 0.58 (of 1.00) chance that the likely cause for the failure is a bad distributor, and a 0.35 chance that the spark plugs need to be replaced. The assumption is that certain variables are independent (like the status of the fuel pump and that of the power windows), but that others are dependent and interrelated (like the status of the ignition and that of the alternator), and that these conditional dependencies can be modeled through some combination of prior (expert) knowledge and evidence. This capacity to propagate uncertainty through a network of relationships and reveal the conditional probability of a certain outcome makes a BN a very effective decision support tool. The user presents the network with a "what-if" scenario and the network delivers its prediction (with uncertainty and alternatives) for states of unknown variables.
Agena Ltd. 1999. Bayesian Belief Networks. http://www.agena.co.uk/bbn_article/bbns.html.
Agnew, J. A. 1987. Place and politics: The geographical mediation of state and society. Boston: Allen & Unwin.
Agnew, J. A. 1996. "Mapping politics: how context counts in electoral geography." Political Geography 15: 129-146.
Anselin, L. 1988. Spatial econometrics: Methods and models. Dordrecht, Holland: Kluwer Academic Publishers.
Anselin, L. 1992. Spacestat: A program for statistical analysis of spatial data. Santa Barbara, CA: NCGIA.
Archer, J.C. and F. M. Shelley. 1986. American Electoral Mosaics. Washington, D.C.: Association of American Geographers.
Archer, J. C. and P. J. Taylor. 1981. Section and party. Chichester: John Wiley.
Arendt, H. 1958. The origins of totalitarianism. Cleveland: World Publishing Co.
Ault, B. and W. Brustein. 1998. "Joining the Nazi party: Explaining the political geography of NSDAP membership, 1925-1933." American Behavioral Scientist 41: 1304-1323.
Burnham, W. D. 1972. "Political immunization and political confessionalism: The United States and Weimar Germany." Journal of Interdisciplinary History 3: 1-30.
Charniak, E. 1991. “Bayesian networks without tears.” AI Magazine 12: 50-63.
Chow, G. C. 1960. "Tests of equality between sets of coefficients in two linear regressions." Econometrica 28: 591-605.
Cooper, G.F., and E. Herskovits. 1992. "A Bayesian Method for the Induction of Probabilistic Networks from Data." Machine Learning 9: 309-347.
Cox, K. 1998. “Spaces of dependence, spaces of engagement and the politics of scale, or: looking for local politics.” Political Geography 17: 1-23.
Falter, J.W. 1991. Hitlers Wähler. Munich: C.H. Beck.
Falter, J. W., and W. D. Gruner. 1981. "Minor and major flaws of a widely used data set: The ICPSR 'German Weimar Republik Data, 1919-1933' under scrutiny." Historical Social Research 20: 4-26.
Flint, C. 1998a. “Forming electorates, forging spaces: The Nazi party vote and the social construction of space.” American Behavioral Scientist 41: 1282-1303.
Flint, C. 1998b. “The political geography of the Nazi party’s electoral support: The NSDAP as regional Milieuparteien and national Sammlungsbewegung.” The Arab World Geographer 1: 79-100.
Flint, C. 1999. “Electoral geography and the Social Construction of Space.” Unpublished Manuscript.
Flint, C. (Forthcoming, 2001). “The theoretical and methodological utility of space and spatial statistics for historical studies: The Nazi party in geographic context.” Historical Methods.
Heckerman, D. 1996a. “Bayesian networks for knowledge discovery.” In Advances in knowledge discovery and data mining, edited by U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, 275-305. Cambridge, MA: MIT Press.
Heckerman, D. 1996b. A tutorial on learning with Bayesian networks. Redmond, WA: Microsoft Corporation.
Heckerman, D., D. Geiger, and D.M. Chickering. 1995. “Learning Bayesian networks: The combination of knowledge and statistical data.” Machine Learning 20: 197-243.
Henrion, M., J.S. Breese, and E.J. Horvitz. 1991. “Decision analysis and expert systems.” AI Magazine 12: 64-91.
Jensen, F.V. 1996. An introduction to Bayesian networks. New York: Springer.
Jensen, F.V. 1999. “A brief overview of the three main paradigms of expert systems.” www.hugin.dk/huginintro/paradigms_pane.html. Aalborg University, Denmark
Johnston, R. J., and C. Pattie. 1988. "Changing voter allegiances in Great Britain." Regional Studies 22: 241-275.
Johnston, R. J. and C. Pattie. 1998. “Campaigning and advertising: An evaluation of the components of constituency activism at recent British General Elections.” British Journal of Political Science 28: 677-685.
Kornhauser, W. 1959. The politics of mass society. New York: The Free Press.
Lipset, S. M. 1960. Political man: The social bases of politics. Garden City, New York: Doubleday.
Massey, D. 1994. Space, place, and gender. Minneapolis: University of Minnesota Press.
McAllister, I. 1987. "Social context, turnout, and the vote: Australian and British comparisons." Political Geography Quarterly 6: 17-30.
Mitchell, T. M. 1997. Machine Learning. New York: McGraw-Hill Companies, Inc.
O'Loughlin, J. and L. Anselin. 1992. "Geography of international conflict and cooperation: Theory and methods." In The new geopolitics, edited by Michael D. Ward, 11-38. Philadelphia: Gordon and Breach.
O’Loughlin, J. and J. Bell. 1999. “The political geography of civic engagement in Ukraine.” Post-Soviet Geography and Economics 40: 233-266.
O'Loughlin, J., C. Flint, and L. Anselin. 1994. "The political geography of the Nazi vote: Context, confession and class in the 1930 Reichstag election." Annals of the Association of American Geographers 84: 351-380.
O’Loughlin, J., C. Flint, and M. Shin. 1995. “Regions and milieux in Weimar Germany: The Nazi party vote of 1930 in geographic perspective.” Erdkunde 49: 305-314.
Pearl, J. 1988. Probabilistic reasoning in intelligent systems. San Mateo, CA: Morgan Kaufmann.
Pred, A. 1990. “Context and bodies in flux: Some comments on space and time in the writings of Anthony Giddens.” In Anthony Giddens: Consensus and Controversy, edited by J. Clark, C. Modgil, and S. Modgil, 117-129. London: The Falmer Press.
Spiegelhalter, D.J., A.P. Dawid, S.L. Lauritzen, and R.G. Cowell. 1993. “Bayesian analysis in expert systems.” Statistical Science 8: 219-283.
Thrift, N. 1983. “On the determination of social action in time and space.” Environment and Planning D: Society and Space 1: 23-57.
Table One. Spatial error estimation of the change in the Nazi party vote, May 1928-September 1930.

OBSERVATIONS = 743
VARIABLES = 12
DEGREES OF FREEDOM = 731
R2 = 0.47
LIK = 2425.92

Variable    Coefficient   Standard Deviation
CONSTANT    26.83*        4.35
PROT        0.14*         0.01
BCTRAD      0.31**        0.16
N309TURN    0.23*         0.05
TRADJOBS    1.02*         0.25
NORWEST     0.44          1.30
RHINE       0.60          1.30
CENTRAL     3.80*         1.44
SILESIA     2.13          1.64
BADEN       2.09          1.72
WURTBERG    10.37*        1.58
BAVARIA     3.02**        1.50
LAMBDA      0.43*         0.04

*  Statistically significant at the 0.01 level
** Statistically significant at the 0.05 level

Regression Diagnostics:

Diagnostics for Heteroskedasticity (Random Coefficients)
Test                     DF   VALUE    PROB
Breusch-Pagan test       11   120.42   0.000
Spatial BP test          11   120.42   0.000

Diagnostics for Spatial Dependence
Test                     DF   VALUE    PROB
Likelihood Ratio Test    1    91.50    0.000

Test on Common Factor Hypothesis
Test                     DF   VALUE    PROB
Likelihood Ratio Test    11   12.96    0.296
Wald Test                11   12.24    0.346