Database group EEMCS faculty

Projects we participate in

Pathfinder is our XQuery-to-relational compiler, developed at the Database Group in cooperation with CWI Amsterdam and the University of Tübingen. The Pathfinder project explores how far we can push the idea of using mature relational database management technology to design and build a full-fledged XML database. Pathfinder requires only local extensions to the underlying DBMS’s kernel, such as the staircase join operator, together with join recognition logic in our compiler and careful consideration of the order properties of relational operators. The compiler is part of MonetDB/XQuery, which is claimed to be the world's most efficient XQuery database system. Website:

PF/Tijah (Pathfinder/Tijah, pronounced "Pee Ef Teeja") is a flexible open source text search system developed at the Database Group in cooperation with CWI Amsterdam and the University of Tübingen. The system is integrated in Pathfinder and can be downloaded as part of the MonetDB/XQuery database system. PF/Tijah is used to aid research in information retrieval at the University of Twente, including the application of language models to search, entity retrieval, expert search and the efficient implementation of the W3C candidate recommendation XQuery Full-Text. Website:

StreetTivo is a project run in cooperation with the Human Media Interaction (HMI) Group and CWI Amsterdam that will bring video annotation techniques - such as a large-vocabulary speech recognition system, shot detection, low-level feature detectors for query-by-example, and high-level feature detectors such as a face detector - to everybody's living room. StreetTivo connects hard disk recorders in a peer-to-peer network, so that the video annotation workload can be distributed over many small machines. Website:

InstantDB offers a new alternative for protecting personal data over time. It is based on the assumption that long-lasting purposes of data can often be satisfied with a less accurate, and therefore less sensitive, version of the data. In our data degradation model, called the Life Cycle Policy model, data is stored accurately for a short period so that services can make full use of it, and is then progressively degraded over time, decreasing its sensitivity, until it is completely removed from the system. Website:
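As an illustration of the degradation idea, the following is a minimal Python sketch, not InstantDB's actual implementation. The policy table, the accuracy levels (address, city, country), the time intervals and the function name `degrade` are all assumptions made up for this example. The sketch only computes which version of a value a policy would expose at a given time; a real system would also have to physically overwrite the more accurate versions.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical life cycle policy for a location attribute.
# Each entry is (age from which the state applies, accuracy level to expose).
# A level of None means the value has been removed from the system.
LOCATION_POLICY = [
    (timedelta(0), "address"),        # accurate for a short period
    (timedelta(hours=1), "city"),     # first degradation step
    (timedelta(days=30), "country"),  # second degradation step
    (timedelta(days=365), None),      # complete removal
]


def degrade(record: dict, stored_at: datetime, now: datetime,
            policy=LOCATION_POLICY) -> Optional[str]:
    """Return the version of the value that the policy allows at time `now`."""
    age = now - stored_at
    level = policy[0][1]
    # Policy entries are ordered by age, so the last matching entry wins.
    for start, candidate in policy:
        if age >= start:
            level = candidate
    return None if level is None else record[level]


# Example record with one value per accuracy level (illustrative data).
record = {"address": "Main Street 1", "city": "Enschede", "country": "Netherlands"}
stored = datetime(2020, 1, 1)

for delay in (timedelta(minutes=5), timedelta(days=2),
              timedelta(days=60), timedelta(days=400)):
    print(delay, "->", degrade(record, stored, stored + delay))
```

Each step trades accuracy for lower sensitivity, so a service that only needs coarse location data can still be served after the precise address is gone.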

SensorDataLab is our sensor lab environment. It is used to identify issues, develop ideas and validate solutions for sensor data management. The SensorDataLab addresses a localization scenario with four different sensor networks and about 60 deployed devices. In the context of this lab, metadata, streaming data, manually sampled data and provenance data are collected and managed. The goal is to come up with storage solutions that support querying, annotating and processing the different kinds of data as well as combinations of them. Website: