SensePlace3 Interface Mini-Guide

 

Alexander Savelyev, Jonathan Nelson, Alan M. MacEachren and Scott Pezanowski

GeoVISTA Center, Pennsylvania State University

 

For more info, contact: maceachren@psu.edu

 

SensePlace3 is an active research project at the GeoVISTA Center focused on leveraging microblog data to support situational awareness. SensePlace3 emphasizes foraging for information about places. Unlike most other efforts to depict Twitter data for places, SensePlace3 extracts place references from the tweet text and depicts where the tweets are “about”, not just where they are from (which is also supported). For an overview of the conceptual basis and initial implementation of SensePlace2 (precursor to this application), see MacEachren et al., 2011. Since 2011, we have expanded SensePlace3 functionality considerably and are in the process of performing user studies to obtain feedback on the user interface (UI) features. This mini-guide outlines the key capabilities of the current version of SensePlace3 and provides a short tutorial on their proper use. SensePlace3 also has a built-in legend, accessible through a link at the bottom right corner of the project’s web interface, which provides an abridged summary of some of the information found in this guide.

1. Access and Performance

The current version of SensePlace3 can be accessed using the following web address:

http://www.geovista.psu.edu/SensePlace3/lite/

SensePlace3 allows users to execute queries on a subset of Twitter data using a set of event-focused query terms that emphasize crisis events for which situational awareness support is relevant. The result is visualized in multiple linked views (including a map, tag clouds, sortable list of relevant tweets, and timeline; all are described below) and the interface supports a range of strategies to focus the query using temporal, spatial, and other constraints (also described below). Additional resources are accessible through links at the bottom right corner of the interface. These include: (a) reset events to initialize the default application settings; (b) show map legend to view an abridged summary of some of the information found in this guide; (c) CoMatrix to explore co-occurring relations among Twitter data dimensions. Data accessible through the lite version are limited to tweets collected over the most recent 2 weeks (about 3 million tweets with some form of location information: a geocoded location that the tweet is from, identified and geolocated places mentioned in the tweet text, and/or a location specified in the tweet poster’s profile).

In order to keep the user posted about the progress of the latest query, a status message is displayed at the top of the screen. Some of the status messages are directed at SensePlace3 users, while others are meant for the development team and can be somewhat cryptic. We are currently working on building a set of status messages that are understandable to all users. A typical status message would look roughly like this:

Once the query is complete, the status message disappears, the tweet list is populated with the 1000 most relevant matching tweets (or fewer if fewer than 1000 are relevant), and point symbols appear on the map to depict places mentioned (in purple) or tweeted from (in green). Relevance is determined by a set of weighted parameters that give preference to tweets that are more recent, mention places close to a user-specified location (when one is specified), and include mentions of organizations or money (relevant to crisis relief efforts). The appearance of the tweet list indicates that it is now possible to interact with the display or initiate a new query.

In the unlikely event of a catastrophic UI failure, try the  button. This will preserve the changes made to the UI, and will likely fix all of the outstanding problems. If all else fails, use the browser’s “Reload” button.

2. Search Controls

The figure below shows a screenshot of the entire SensePlace3 web interface, depicting the default view with the 1000 most recent tweets on any topic:

Search controls are located in the purple zone at the top left corner of the web interface (isolated in the figure below). Search controls provide three capabilities: free-text query (plus command-based constraints on such queries); requiring “from”, “about”, and “user” locations; and clustering by similarity (each is outlined below).

 

2.1 Free-text query

As noted above, the current version of SensePlace3 is driven by user queries. The “Search for:” input field allows users to insert one or more query terms of interest. Users may currently search for single- or multi-word phrases (e.g. “football riots”). Multiple words without quotes are treated as an OR query (tweets containing any or all of the terms are retrieved). Queries in quotes can add a separation parameter, e.g., "freezing ice"~3 will retrieve any tweet containing both “freezing” and “ice” with 3 or fewer intervening words; using a “0” spacer is the same as leaving off the “~” parameter. In addition, specific character combinations may be used to perform constrained searches. The term #fire will search for tweets with that hashtag. T:140801-150103 (“T:” is for “time”) will limit the time frame to August 1st, 2014 through January 3rd, 2015. U:redcross (“U:” is for “user”) will search for the Twitter screen name redcross. UA:redcross (“UA:” is for “user-approximate”) will search for Twitter screen names containing redcross but not necessarily an exact match. RTC:2 will return tweets that have been retweeted at least twice.
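The query syntax described above can be illustrated with a small parser. This is only a sketch of how such a query string might be broken into its parts; the function name, the dictionary keys, and the choice to read the two-digit year with "%y" are assumptions made for illustration and do not describe the actual SensePlace3 implementation.

```python
import re
from datetime import datetime

# Hypothetical tokenizer: a quoted phrase with an optional ~N separation
# parameter, or any run of non-space characters.
TOKEN = re.compile(r'"([^"]+)"(?:~(\d+))?|(\S+)')

def parse_query(text):
    """Split a search string into the constraint types described in this guide."""
    parsed = {"terms": [], "phrases": [], "hashtags": [], "time_range": None,
              "user": None, "user_approx": None, "min_retweets": None}
    for phrase, gap, word in TOKEN.findall(text):
        if phrase:
            # Leaving off "~N" is treated the same as "~0".
            parsed["phrases"].append((phrase, int(gap or 0)))
        elif word.startswith("#"):
            parsed["hashtags"].append(word[1:])
        elif word.startswith("T:"):
            # T:yymmdd-yymmdd
            start, end = word[2:].split("-")
            parsed["time_range"] = (datetime.strptime(start, "%y%m%d").date(),
                                    datetime.strptime(end, "%y%m%d").date())
        elif word.startswith("UA:"):
            parsed["user_approx"] = word[3:]
        elif word.startswith("U:"):
            parsed["user"] = word[2:]
        elif word.startswith("RTC:"):
            parsed["min_retweets"] = int(word[4:])
        else:
            # Plain unquoted words are OR-ed together by the search.
            parsed["terms"].append(word)
    return parsed

query = parse_query('"freezing ice"~3 #fire T:140801-150103 UA:redcross RTC:2 flood')
```

In this sketch, the example string yields one phrase with a separation of 3, the hashtag "fire", a time range, an approximate user name, a minimum retweet count of 2, and the free term "flood".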

 

2.2 Working with “from”, “about”, and “user” places

As noted in the introduction, one of the main functionalities provided by SensePlace3 is that we extract geographic information from two independent sources. The first source is the body of the tweet itself (i.e. the names of locations that are mentioned in the tweet). We refer to information coming from this source as “about” locations, as people talk “about” them. For example, if someone were to tweet

First thunderstorm in Paris consisted of one thunder and one lighting strike

followed by the usual downpour of rain #wheremysunat

we would then extract “Paris” as an “about” location. “About” locations are represented as purple graduated symbols on the map with larger circles conveying more tweets mentioning a given “about” location.

The second source of place information is available only for tweets that specify the location at which they were issued (usually as explicit geographic coordinates in the form of latitude and longitude, but sometimes as place names only). We use the term “from” locations to refer to this kind of information, as people send tweets “from” them. “From” tweets are distinguished visually on the map by being symbolized as green rather than purple circles.

The number of tweets that have “from” locations is quite small (only 1-2% of all tweets include such geolocation information, somewhat higher in crisis situations), and they tend to be drowned in the stream of relevant tweets with locations of the “about” kind. The checkbox labeled “Limit to tweets with "from" places” enables users to only retrieve the tweets with “from” locations associated with them to focus on where posts are being generated.

Finally, SensePlace3 can show “user” location. This is an optional free-text field in Twitter where users enter their location. Although users sometimes enter false, useless, or fictitious locations, they often do enter valuable information about their location.

2.3 Clustering similar tweets

The “Cluster with” option will be described as part of the Tweet List component in Section 4.

3. Overview and Detail

SensePlace3 provides both overview and detail depictions in the timeline, map, and place-tree views, as described below.

3.1 Timeline

The timeline displays the changes in the frequency of tweets that match the parameters of the user query. Color-shaded bands represent the number of matches that a given query has in the entire database, with dark red indicating the time span with the highest number of tweets. The stacked black bars represent the number of query matches in the top 1000 relevant tweets, which are the ones returned to the SensePlace3 interface.

 

Both the color bands and the stacked bars use a quantile-based classification scheme (quintile and tertile, respectively). By default, the width of the individual color bands is set to one day. Tweet frequency can also be binned by week. Users can specify a time range of interest.
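As a rough illustration of quantile-based classification, the sketch below assigns daily tweet counts to quintile or tertile classes. The guide does not state how SensePlace3 computes class breaks or handles ties, so the use of Python's `statistics.quantiles` with the "inclusive" method, the function name, and the sample counts are all assumptions.

```python
from bisect import bisect_left
from statistics import quantiles

def classify(counts, n_classes):
    """Assign each count a class index from 0 (lowest) to n_classes - 1 (highest)."""
    # quantiles() returns n_classes - 1 cut points that split the data
    # into groups of roughly equal size.
    breaks = quantiles(counts, n=n_classes, method="inclusive")
    # A value equal to a cut point falls into the lower class.
    return [bisect_left(breaks, c) for c in counts]

daily_counts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # hypothetical tweets per day
band_classes = classify(daily_counts, 5)          # quintiles, as for the color bands
bar_classes = classify(daily_counts, 3)           # tertiles, as for the stacked bars
```

With a quantile scheme, each class holds about the same number of days, so the darkest band marks the busiest fifth of the days rather than a fixed tweet count.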

The timeline can be manipulated by manually adjusting the timeline sliders on either end, as illustrated below.

Once the timeline sliders have been adjusted, users can drag the entire selection range to any part of the timeline to initiate a new query.

3.2 Map: Depicting overall frequency and locations of most relevant tweets

The map (as shown in the first figure) overlays a combination of a heatmap layer and two graduated point symbol layers, on a Google Maps base layer. The heatmap displays the spatial density of places tweeted about that match the term, time and place parameters of the user query using a quantile-based sequential color scheme. The spatial density of tweets is calculated using the entire database.

The top 1000 relevant tweets are plotted on top of the heatmap using graduated point symbols. Tweets “about” and “from” a particular location are shown as purple and green, respectively. The size of the point symbols represents the number of relevant tweets referring to that location, while their color density represents the aggregate relevance ranking of those tweets.

Users can switch on or off each display component, using the map legend (as shown in the figure below). This allows users to more easily see and inspect data on each layer.

3.3 Place-Tree

The Place-Tree highlights the locations that have been mentioned in retrieved tweets in a more structured fashion. The tree uses the GeoNames place hierarchy, with country names grouped by continent. Each of the nodes in the hierarchy is colored according to the number of matches the given query has in the entire database, whereas the stacked black dots represent the number of matches in the top 1000 tweets. The Place-Tree is currently populated down to the country level, but the plan for the next update is to extend this to the city level.

4. Tweet List

The Tweet List (the top portion of a representative example is shown in the figure below) uses visual aspects of the display to signify four attributes of the data (discussed under ‘visual significations’ below) and provides four kinds of manipulation to the user. 

4.1 Visual significations

4.1.1 Tweet relevance

The narrow bar at the left of each tweet is color coded to indicate relevance to the query, as estimated by the search engine (in quartiles, with darker shades depicting more relevant tweets).

4.1.2 Background Color

Background color of each tweet signifies selection; e.g., a selection of tweets mentioning a place. For example, if a user filters by place using the tag cloud to pick tweets mentioning Egypt (this capability will be discussed in more detail below), relevant tweets will be promoted to the top of the list and given a unique background color. Different background colors signify tweets identified by different filters. Tweets remain selected until the user actively deselects them, thus a sequence of selections by the user without deselecting will result in multiple colors as tweet backgrounds in the list. These colors are assigned randomly, simply to visually identify tweets that correspond to selections/filters.

4.1.3 Tweet locations (about)

Locations that the system has identified in each tweet are highlighted in light blue; this is used to help users visually recognize tweets relevant to places they are interested in.

4.1.4 Tweet locations (from)

Tweets that specify the location at which they were issued (usually as explicit geographic coordinates in the form of latitude and longitude, but sometimes as place names only) are signified by a small globe symbol found to the right of the tweet date.

4.2 Manipulations

4.2.1 Sorting tweets

Tweets in the tweet list can be sorted by relevance rank, by timestamp, by their location, and by the number of prominent locations they mention. The location sort (labeled as “space”) is done based on the distance of individual tweets from the current map center. So, to sort tweets based on their proximity to the place of interest, it is (currently) necessary to center the map on that place first.
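The “space” sort can be pictured as ordering tweets by their distance from the map center. The sketch below uses the haversine (great-circle) formula; the guide does not say which distance measure SensePlace3 uses, and the field names "lat" and "lon" and the sample coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def sort_by_space(tweets, map_center):
    """Return the tweets ordered from nearest to farthest from the map center."""
    lat0, lon0 = map_center
    return sorted(tweets, key=lambda t: haversine_km(lat0, lon0, t["lat"], t["lon"]))

tweets = [
    {"place": "Cairo", "lat": 30.0444, "lon": 31.2357},
    {"place": "London", "lat": 51.5074, "lon": -0.1278},
    {"place": "Paris", "lat": 48.8566, "lon": 2.3522},
]
nearest_first = sort_by_space(tweets, map_center=(48.0, 2.0))
```

Because the sort key depends only on the map center, re-centering the map and sorting again is enough to rank tweets around a different place of interest.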

4.2.2 Clustering tweets

The “Cluster with” option in the search controls (in the purple zone at the top left corner of the web interface) allows users to apply one of two text clustering options (univariate or bivariate k-means) to group tweets into a small number of clusters. The resulting clusters are shown at the bottom of the tweet list using a few frequent terms that occur in tweets within the cluster. Clicking on a cluster in this display will bring the tweets from that cluster to the top of the tweet list. Clicking again will return to the default list.

4.2.3 Subset, promotion & demotion of individual tweets

Individual tweets can be promoted to the top of the list or demoted to the bottom of the list. This is accomplished by first hovering the mouse over the specific tweet in the Tweet List, at which point a line will pop up saying “Promote, demote”, as shown in the figure below.

The results of promotion and demotion will only be visible when the tweet list is sorted by relevance rank, and will be hidden when sorting by time or space.

5. Map: Details of functionality

5.1 Connections between places tweeted from and about  

The map includes several actions that enable users to focus attention on places and regions. For example, hovering the mouse over a point on the map will highlight the tweets associated with that point in the Tweet List (if they are visible; hovering will not promote the corresponding tweets to the top of the list, only selection does). This functionality works in reverse as well: hovering the mouse over a tweet in the tweet list will highlight the locations on the map that are associated with it. Tweets frequently refer to multiple locations at the same time (e.g., a tweet about multiple cities in which airports have delays), as in the following example:

We just left Paris and are on our way to London #vacation.

In this case, references to Paris and London co-occur within the same tweet. The SensePlace3 UI scans the Tweet List and builds a list of co-occurring locations (i.e. the places Paris co-occurred with, the places London co-occurred with, etc.) every time a new query is run. When a particular location is selected on the map (by clicking on its point symbol), this list is retrieved and symbolized in the form of connecting lines between the original and the co-occurring locations. In the example shown below, Tunisia is the location that is highlighted on the map, and connecting lines are drawn from Tunisia to every other location that is mentioned along with Tunisia in the top 1000 tweet list. The width of the line depicts quintiles of connection strength (thicker lines represent more frequent association).
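The co-occurrence list described above can be sketched as a count of place pairs that appear together in the same tweet. The data layout (one list of place names per tweet) and both function names are assumptions for illustration; the guide does not describe the data structures SensePlace3 actually uses.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence(tweet_places):
    """Count how often each pair of places is mentioned in the same tweet."""
    pairs = Counter()
    for places in tweet_places:
        # Sort so that (London, Paris) and (Paris, London) share one key;
        # set() ignores repeated mentions of a place within one tweet.
        for a, b in combinations(sorted(set(places)), 2):
            pairs[(a, b)] += 1
    return pairs

def connections_for(place, pairs):
    """Return {other_place: count} for every place that co-occurs with `place`."""
    links = {}
    for (a, b), count in pairs.items():
        if a == place:
            links[b] = count
        elif b == place:
            links[a] = count
    return links

tweet_places = [["Paris", "London"], ["Paris", "London", "Rome"], ["Tunisia"]]
pairs = build_cooccurrence(tweet_places)
paris_links = connections_for("Paris", pairs)
```

In this sketch, the counts returned for a selected place would play the role of connection strength, which the map then classes into line widths.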

Clicking on any point symbol on the map will bring the tweet(s) associated with that point to the top of the list (multiple tweets can be selected in succession). Clicking the same point symbol again will deselect that particular tweet, while clicking in a blank space within the map will turn all of the promotions off and put the list back to its default. When a place is clicked that has connections, the connections remain visible as long as the place is highlighted. Thus, it is possible to click on a few places in succession (without clicking in blank space to clear the selections) to build up a network of connections from those selected places.

6. Place-Tree

As a reminder, nodes in the Place-Tree hierarchy are colored according to the number of matches a given query has in the entire database, whereas the stacked black dots represent the number of matches in the top 1000 tweets. The Place-Tree (as shown in the figure below) has a number of user-controlled features.

Users can select one or more places of interest using check boxes positioned next to them, which will highlight the tweets related to those particular locations in the tweet list. The “Uncheck selected” button clears all of the check boxes.

The “Search selected” button will launch a new query based on GeoNames IDs of the features selected in the place-tree, i.e. refining the user-defined query with tweets that only include those place name mentions, or are from those places.

Lastly, the “Full/ Compact Tree” button allows users to toggle between expanded and compact views of the place-tree. The compact view only shows nodes relevant to the top 1000 tweet matches, whereas the expanded view shows the entire GeoNames hierarchy.

7. Word/ Place Clouds

Two word clouds (as shown in the figure below) display the list of locations that are most frequently mentioned in the top 1000 tweets (“Most Relevant” tab) and in the full set of query results on the server (“Overview” tab). The size of each word is proportional to the number of mentions of that particular place name, and the words can be clicked in order to filter the contents of the tweet list.

                          

 

8. CoMatrix

Along the bottom of the SensePlace3 interface are links to additional resources, one of which is the CoMatrix, a linked view that opens in a pop-up window when its link is clicked. The CoMatrix component is built to explore potential relationships among multiple dimensions of Twitter data, including explicit Twitter-generated metadata (e.g., tweet text, time stamp, authorship information, GPS coordinates, etc.) and implicit metadata (e.g., mentions of places).

The CoMatrix, shown in the figure below, works as follows. Users first select data dimensions to plot as rows and columns, respectively. Users can adjust the maximum number of column entities that would be shown in the matrix, as well as re-order the matrix by name, marginal frequency counts, or continent (where applicable).


The number next to each entity shows the number of matches that particular entity has in the top 1000 tweets. In the figure above, for example, sydney and australia each have five co-occurring “about” location matches in the top 1000 tweets. Users can hover the mouse over each cell to see the number of co-occurrences of the intersecting row and column entities in the top 1000 tweets. When hovering the mouse over each cell, corresponding tweets (containing the co-occurring entities) will be highlighted on the map and in the tweet list (if those tweets are visible). Users can also click cells to filter other views, or click other views to filter the co-occurrence matrix.

To explore co-occurrences among more than two data dimensions (through a specified semantic path), users can generate network paths using the “Network path” dropdown menus. For example, a user may want to explore all possible connections between place mentions. Three possible network paths exist between place mentions:

place mention – tweet – place mention

place mention – tweet – user – tweet – place mention

place mention – tweet – user – user – tweet – place mention

The three paths have different semantics. The first path can be interpreted as “co-occurrence of place mentions within a single tweet”, the second path can be interpreted as “co-occurrence of place mentions across all tweets by the same user”, and the third path can be interpreted as “co-occurrence of place mentions across all tweets by users within an established user community”. The figure shown below illustrates the SensePlace3 network path definition for the first option described above:


Network paths can easily be reset using the “reset path” hyperlink found to the right of the path definition.
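The difference between the first two network paths described above can be sketched as follows. Each tweet is represented here as a hypothetical (user, places) pair, and counting each place pair once per tweet (first path) or once per user (second path) is an assumption; the guide does not specify how the CoMatrix tallies its counts.

```python
from collections import Counter, defaultdict
from itertools import combinations

def pairs_within_tweets(tweets):
    """Path 1: place mention - tweet - place mention."""
    counts = Counter()
    for _user, places in tweets:
        counts.update(combinations(sorted(set(places)), 2))
    return counts

def pairs_within_users(tweets):
    """Path 2: place mention - tweet - user - tweet - place mention."""
    places_by_user = defaultdict(set)
    for user, places in tweets:
        # Pool every place a user mentions, across all of that user's tweets.
        places_by_user[user].update(places)
    counts = Counter()
    for places in places_by_user.values():
        counts.update(combinations(sorted(places), 2))
    return counts

tweets = [("ann", ["Paris"]), ("ann", ["London"]), ("bob", ["Paris", "Rome"])]
by_tweet = pairs_within_tweets(tweets)
by_user = pairs_within_users(tweets)
```

Here London and Paris are never mentioned in the same tweet, so they are linked only under the second path, through the user who mentioned both.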

 

References:

MacEachren, A.M., Jaiswal, A., Robinson, A.C., Pezanowski, S., Savelyev, A., Mitra, P., Zhang, X. and Blanford, J. 2011: SensePlace2: GeoTwitter Analytics Support for Situational Awareness. In Miksch, S. and Ward, M., editors, IEEE Conference on Visual Analytics Science and Technology, Providence, RI: IEEE, 181 - 190.

MacEachren, A.M., Robinson, A.C., Jaiswal, A., Pezanowski, S., Savelyev, A., Blanford, J. and Mitra, P. 2011: Geo-Twitter Analytics: Applications in Crisis Management. 25th International Cartographic Conference, Paris, France.

Savelyev, A. 2013: Multiview User Interface Coordination in Browser-Based Geovisualization Environments (Demo Paper). The 1st ACM SIGSPATIAL Workshop on Map Interaction, in conjunction with ACM SIGSPATIAL, Orlando, FL.

Savelyev, A. and MacEachren, A.M. 2014: Interactive, Browser-based Information Foraging in Heterogeneous Space-Centric Networks. In Andrienko, G., Andrienko, N., Dykes, J., Kraak, M.-J., Robinson, A. and Schumann, H., editors, Workshop on GeoVisual Analytics: Interactivity, Dynamics, and Scale, in conjunction with GIScience 2014, Vienna, Austria.