TimeTrails: Spatiotemporal datawarehouses for trajectory exploitation
Internet technologies have created a fabric for information exchange that reaches far beyond the original intent of sharing information between humans. The Internet has become a landscape where entities (people, sensors, devices, observatories) generate events bound in time and by location, carving out event trajectories within a domain of activity. We are especially interested in serving space/time understanding in social networks. Properly characterizing the where-and-when of social network member activity in geoprofiles allows communication between the members to be optimized. Collecting, archiving and annotating these trajectories forms the basis for knowledge extraction and management, and for organizing social communities by mutual interest.
TimeTrails is subproject P19 of COMMIT, a 100M Euro Dutch national program involving 10 universities and 70 companies. In TimeTrails, the academic institutes University of Twente (UT), University of Utrecht (UU), and the Centrum Wiskunde & Informatica (CWI) work together to help solve the problems of our industrial partners Arcadis, EuroCottage, Hyves, KNMI, MonetDB, and TomTom, and to develop new technology for them. The database group is involved in two work packages:
In engineering projects, such as building a highway, several interest groups must be involved, and the aim is to reach a good consensus between the various interest groups and arrive at a feasible solution. This consensus process is based on discussing design alternatives, often sketched in a spatial representation (e.g. a map). During this process, alternatives are discussed, modified, rejected and adjusted over time. Keeping the spatial representation readable and separating temporal tendencies requires aggregating the contributions of the interest groups along the topic, space and time dimensions.
In this PhD project a framework will be developed to access tagged information in a spatiotemporal environment at varying aggregation levels, to enable zooming in and out of a map. The challenge is that little is known about which content will be available, how the discussion will evolve, and at which scale the discussion will take place. Further, the amount of information will make it necessary to pre-process the information to enable fast, web-based access. Thus, the main challenge is to provide information in a fast and flexible way without making many assumptions about the information provided by the users or about the way users are going to query it. The idea in this project is to observe the query behaviour of users and self-organize the pre-aggregation levels. That is, based on the observed user behaviour and a storage bound, the optimal pre-aggregation levels are determined in a streaming fashion so as to minimize the query response time.
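The trade-off between observed query behaviour and a storage bound can be illustrated with a small Python sketch. It is not the method developed in the project; it is a simple greedy heuristic under assumptions made up for illustration: zoom level k holds 4**k cells, a query at level q can be answered from any stored level m >= q at a cost of 4**(m - q) per output cell, and answering from raw data costs a fixed RAW_COST. The names cell_cost, expected_cost, choose_levels and RAW_COST are all hypothetical.

```python
from collections import Counter

RAW_COST = 10_000  # assumed cost of computing one output cell from raw data


def cell_cost(query_level, materialized):
    """Cost of one output cell at query_level, given the stored levels."""
    usable = [m for m in materialized if m >= query_level]
    if not usable:
        return RAW_COST  # no stored level is fine enough: fall back to raw data
    # Use the coarsest stored level that is still at least as fine as the query.
    return 4 ** (min(usable) - query_level)


def expected_cost(query_counts, materialized):
    """Total cost of the observed query mix (level -> number of queries)."""
    return sum(n * cell_cost(q, materialized) for q, n in query_counts.items())


def choose_levels(query_counts, max_level, budget_cells):
    """Greedily add the level with the best cost reduction per stored cell."""
    chosen, used = set(), 0
    while True:
        base = expected_cost(query_counts, chosen)
        best, best_gain = None, 0.0
        for level in range(max_level + 1):
            size = 4 ** level
            if level in chosen or used + size > budget_cells:
                continue
            gain = (base - expected_cost(query_counts, chosen | {level})) / size
            if gain > best_gain:
                best, best_gain = level, gain
        if best is None:  # nothing fits in the budget or nothing helps any more
            return sorted(chosen)
        chosen.add(best)
        used += 4 ** best


# Observed behaviour: many queries at zoom level 2, a few at level 5.
observed = Counter({2: 100, 5: 10})
print(choose_levels(observed, max_level=6, budget_cells=1100))
```

In a streaming setting, the counter would be updated with every incoming query and the selection re-run periodically, so that the stored levels follow the users' behaviour. A greedy choice is not guaranteed to be optimal; it only shows the shape of the optimization problem.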
We obtained a COMMIT/ valorization grant to turn certain research results into proof-of-concept production software. Geographical data is typically visualized using layers with (derived) information displayed over a map. Interactive exploration requires real-time recalculation upon zooming and panning actions. For layers with aggregated information derived from voluminous data, such real-time performance is not achievable using only standard database technology. We developed database index technology that provides accurate aggregation data with the required performance, as a plug-in to standard open source database and GIS software. Using this software, we developed a demo that allows exploration of tweeting hot-spots based on 20-30M geo-tagged tweets from the Netherlands and the UK.
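Why pre-aggregation helps with zooming can be seen in a minimal Python sketch of a count pyramid over geo-tagged points. This is not the index technology mentioned above, whose design is not described here; it assumes a plain longitude/latitude grid with 2**level cells per axis, and the function names (cell_of, build_level, coarsen, build_pyramid) are hypothetical.

```python
from collections import Counter


def cell_of(lon, lat, level):
    """Map a (lon, lat) point to its grid cell at the given zoom level."""
    n = 2 ** level
    x = min(int((lon + 180.0) / 360.0 * n), n - 1)  # clamp lon == 180
    y = min(int((lat + 90.0) / 180.0 * n), n - 1)   # clamp lat == 90
    return x, y


def build_level(points, level):
    """Count the points per cell directly from the raw data."""
    return Counter(cell_of(lon, lat, level) for lon, lat in points)


def coarsen(counts):
    """Derive the next coarser level by summing each 2x2 block of cells."""
    parent = Counter()
    for (x, y), n in counts.items():
        parent[(x // 2, y // 2)] += n
    return parent


def build_pyramid(points, finest_level):
    """Scan the raw points once, then derive all coarser levels from counts."""
    pyramid = {finest_level: build_level(points, finest_level)}
    for level in range(finest_level - 1, -1, -1):
        pyramid[level] = coarsen(pyramid[level + 1])
    return pyramid


# Three made-up points: two near Amsterdam, one near London.
points = [(4.90, 52.37), (4.89, 52.38), (-0.12, 51.50)]
pyramid = build_pyramid(points, finest_level=8)
print(pyramid[0])  # a single cell holding all three points
```

Once the pyramid exists, a zoom or pan action only reads the cells of one level inside the viewport instead of rescanning the raw points. Counts are exact because each coarser level is a sum over disjoint child cells.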
The PhD project aims to provide a toolset for building support systems for a networked community whose members display similar activities in similar locations and want to share content, valuations and experiences about those. Scientific challenges encountered include geo-referenced entity resolution, automatic data integration for content enrichment, information extraction, uncertainty management and data quality improvement, domain-of-activity specification, understanding user activities and routes (geoprofiling), and spatial data processing. An XML-based ETL architecture will be developed that is based on a spatially enhanced XML DBMS and that includes a toolset to support developers in addressing the above challenges when focusing on user-volunteered content.
An important focus of this research project is to adequately address accuracy, completeness, ambiguity, conflicts, and trust in volunteered information, to geo-reference information that is not explicitly geo-referenced, and to match it against known spatial information. Volunteered spatiotemporal data will come to us in large quantities as semi-structured reports describing local context. The volunteers may not have positioning devices, or the objects described may not have been collocated with such a device, and thus such data may lack a precise geo-reference. The reports may be similarly imprecise in their temporal and thematic details.
The objective of this PhD research project is to establish and validate a framework for geoprofile-driven content enrichment and data quality improvement on the basis of user-volunteered content and open spatial data services. The research is part of a larger national research project on Spatio-Temporal Data Warehousing. Validation of the framework is to be organized and executed in co-operation with the company EuroCottage (see http://www.eurocottage.com) in the domain of community building and holiday home location profiling. Co-operation with a concurrently running project on neogeography offers a second domain for validation: international development collaboration in agriculture.