MultimediaN: Semantic Access

The scientific goal of this project is to define a database formalism and theory that unifies, in one framework, the various sources of knowledge relevant to search tasks: traditional structured data; semi-structured data in the form of marked-up texts; knowledge about the context of a search; a priori knowledge about physical properties of objects; specific background knowledge about the collection at hand; and generic background knowledge, e.g. as formalized in an ontology. The formalism to be developed must be simple enough to allow (reasonably) efficient systems to apply the captured knowledge to reduce the uncertainty inherent in multimedia search. The expected result is a paradigm that spans various collections and search tasks and simplifies the development of future systems involving multimedia search.


Multiple representations: The key to the development of systems that support relevance-oriented querying of complex multimedia data, possibly coming from multiple sources of information, is the integration of data (retrieval) models and information retrieval models. With the emergence of statistical language models for text retrieval and Gaussian mixture models for image retrieval, we see a promising opportunity for grounding such a formalism firmly in probability theory. Recent work at CTIT, CWI and TNO-TPD has shown that probabilistic models can be used to integrate many sources of information, for instance to capture the combined information from speech transcripts and key-frame images in video. The proposed software architecture is a blend of a database system and a probabilistic expert system. In database terms, the language to express search strategies would be called a query language. Yet, as the 'queries' are instantiations of probabilistic models that reason about data collections and user feedback, the query processor might have much in common with the inference engine of a probabilistic expert system.
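
As an illustration of how evidence from speech transcripts and key-frame images might be combined in one probabilistic score, the following is a minimal sketch and not the project's actual formalism. It assumes a unigram language model with linear interpolation smoothing for the transcript, a diagonal-covariance Gaussian mixture model for the key-frame features, and conditional independence between the two modalities, so that their log-likelihoods simply add. All names (`rank_shots`, `shots`, the feature vectors and the smoothing weight `lam`) are hypothetical and chosen for the example only.

```python
import math
from collections import Counter


def text_log_likelihood(query_terms, doc_terms, collection_counts, collection_size, lam=0.5):
    """log P(query terms | transcript) under a smoothed unigram language model."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    log_p = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_col = collection_counts.get(term, 0) / collection_size
        # Linear interpolation of document and collection estimates (assumed smoothing).
        p = lam * p_doc + (1.0 - lam) * p_col
        if p == 0.0:
            return float("-inf")  # term unseen in the whole collection
        log_p += math.log(p)
    return log_p


def gaussian_log_pdf(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(samples, components):
    """log P(query image samples | key-frame GMM); components are (weight, mean, var)."""
    total = 0.0
    for x in samples:
        logs = [math.log(w) + gaussian_log_pdf(x, m, v) for w, m, v in components]
        top = max(logs)
        # Log-sum-exp over mixture components for numerical stability.
        total += top + math.log(sum(math.exp(l - top) for l in logs))
    return total


def rank_shots(shots, query_terms, query_features, lam=0.5):
    """Rank video shots by the sum of text and image log-likelihoods."""
    collection_counts = Counter(t for s in shots.values() for t in s["transcript"])
    collection_size = sum(collection_counts.values())
    scored = []
    for shot_id, shot in shots.items():
        score = text_log_likelihood(
            query_terms, shot["transcript"], collection_counts, collection_size, lam
        ) + gmm_log_likelihood(query_features, shot["gmm"])
        scored.append((shot_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy collection: two shots, each with a transcript and a one-component GMM.
shots = {
    "shot_a": {"transcript": ["goal", "football", "stadium"],
               "gmm": [(1.0, [0.2, 0.8], [0.05, 0.05])]},
    "shot_b": {"transcript": ["weather", "rain", "forecast"],
               "gmm": [(1.0, [0.9, 0.1], [0.05, 0.05])]},
}
ranked = rank_shots(shots, ["football"], [[0.25, 0.75]])
print(ranked[0][0])
```

In this toy example both the transcript and the image evidence favour the same shot; the point of the additive form is that either source can compensate when the other is weak or missing.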


Context-awareness: Multimedia retrieval models and systems of the future should explicitly address the user's context and the user's interaction with the system. While the multimedia ambient databases project makes the context available to systems, the objective here is to handle this additional source of information in the retrieval model. Knowing where a user is focusing his/her attention during image retrieval can improve the relevance feedback given to the system. Similarly, knowing the user's device (e.g. laptop, mobile telephone, PDA), the user's geographic position, the time of day, etc. might give the system hints about what multimedia content to retrieve. Note that context-aware systems should not necessarily return the items that the system has scored as most relevant given an isolated query, because the final ranking might also involve non-content information (e.g. the 'page rank' of an item, based on the number of hyperlinks pointing to it); novelty detection (is this information, apart from being relevant, also new to the user?); and high-accuracy retrieval (retrieving the right amount of information through very targeted interaction with the searcher: not too little and not too much).
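
The way such context signals could change a ranking can be sketched as follows. This is only an illustrative assumption about how the factors might be combined: the content relevance score is multiplied by a prior derived from inlink counts (a crude stand-in for 'page rank', not the PageRank algorithm itself), a novelty factor that penalises items the user has already seen, and a device-fit factor. The `Item` class, the `DEVICE_FIT` table, the penalty value and the `top_k` cut-off are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    relevance: float  # content-based P(relevant | query), assumed to be in (0, 1]
    inlinks: int      # number of hyperlinks pointing to the item
    media_type: str


# Hypothetical suitability of media types per device; unlisted pairs default to 1.0.
DEVICE_FIT = {"mobile": {"video": 0.3}}


def rerank(items, seen_ids, device, top_k=10, seen_penalty=0.2):
    """Re-rank items using content relevance plus context factors (a sketch)."""
    total_links = sum(item.inlinks + 1 for item in items)

    def log_score(item):
        prior = (item.inlinks + 1) / total_links  # add-one smoothed inlink prior
        novelty = seen_penalty if item.item_id in seen_ids else 1.0
        fit = DEVICE_FIT.get(device, {}).get(item.media_type, 1.0)
        return (math.log(max(item.relevance, 1e-12)) + math.log(prior)
                + math.log(novelty) + math.log(fit))

    # top_k limits the amount of returned information.
    return sorted(items, key=log_score, reverse=True)[:top_k]


items = [
    Item("a", relevance=0.9, inlinks=1, media_type="video"),
    Item("b", relevance=0.7, inlinks=10, media_type="image"),
]
by_query_only = sorted(items, key=lambda item: item.relevance, reverse=True)
ranked = rerank(items, seen_ids={"a"}, device="mobile")
print([item.item_id for item in by_query_only], [item.item_id for item in ranked])
```

Here the item that scores highest on the isolated query is ranked below a slightly less relevant one, because it has already been seen, has fewer inlinks and suits the user's device less well.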