ARIA-VALUSPA: Artificial Retrieval of Information Assistants;
Virtual Agents with Linguistic Understanding, Social skills and Personalised Aspects
The ARIA-VALUSPA (Artificial Retrieval of Information Assistants – Virtual Agents with Linguistic Understanding, Social skills, and Personalised Aspects) project develops a new framework for easily creating Artificial Retrieval of Information Assistants (ARIAs) that can hold multi-modal social interactions in challenging and unexpected situations.
The system can generate search queries and return the requested information by interacting with humans through virtual characters. These virtual humans can sustain an interaction with a user for some time and react appropriately to the user’s verbal and non-verbal behavior when presenting the requested information and refining search results. Audio and video signals serve as input, capturing both the verbal and non-verbal components of human communication. Drawing on a rich and realistic emotive personality model, a sophisticated dialogue management system decides how to respond to a user’s input, be it a spoken sentence, a head nod, or a smile. The ARIA uses special speech synthesizers to create emotionally colored speech and a fully expressive 3D face to render the chosen response. Backchannelling to indicate that the ARIA understood what the user meant and returning a smile are but two of the many ways in which it can employ emotionally colored social signals to improve communication.
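
As a rough illustration of the decision step described above, the following minimal sketch maps a detected user signal (a spoken sentence, a head nod, or a smile) to a response made up of speech and a facial expression. It is not the ARIA-VALUSPA framework's code or API: the names `Signal`, `UserInput`, `AgentResponse`, and `decide_response`, the expression labels, and the fixed rules are all hypothetical, and the actual dialogue manager also takes the emotive personality model into account.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Signal(Enum):
    """Kinds of user input named in the text (hypothetical labels)."""
    SPEECH = auto()
    HEAD_NOD = auto()
    SMILE = auto()


@dataclass
class UserInput:
    signal: Signal
    text: str = ""  # recognized words; only meaningful for SPEECH


@dataclass
class AgentResponse:
    speech: str      # text that would be passed to a speech synthesizer
    expression: str  # label that would be passed to the 3D face


def decide_response(user_input: UserInput) -> AgentResponse:
    """Toy rule-based stand-in for the dialogue manager (an assumption)."""
    if user_input.signal is Signal.SMILE:
        # Return the smile without interrupting the conversation.
        return AgentResponse(speech="", expression="smile")
    if user_input.signal is Signal.HEAD_NOD:
        # Treat a nod as agreement and acknowledge it silently.
        return AgentResponse(speech="", expression="nod")
    query = user_input.text.strip()
    if query:
        # Backchannel, then announce the search for the spoken request.
        return AgentResponse(speech=f"I see. Let me look up: {query}",
                             expression="attentive")
    # Speech was detected but nothing was recognized.
    return AgentResponse(speech="Could you repeat that?",
                         expression="puzzled")
```

For example, `decide_response(UserInput(Signal.SMILE))` yields a response with an empty `speech` field and the expression `"smile"`, while a spoken request yields a spoken acknowledgement paired with an attentive expression.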

A summary of the research outputs:
- Multilingual speech recognition in English, French and German.
- Improved multi-modal affective signal processing, with paralinguistic features from both audio and video.
- A framework for developing your own virtual humans.
- Examples for the framework to help you start designing your own conversational virtual human.
- Improved speech synthesis with affective features, such as happy and cross emotional tones.
- Handling of interruptions during a conversation, from both the agent's and the human's perspective.
- A database of conversations in many languages on a myriad of topics, available for use by other academics as well.
- A new annotation tool that lets annotators annotate multi-modal dialogue more efficiently.

Contact information
Dr. M. Theune (Mariët Theune)
m.theune@utwente.nl
+31 (0)53 489 3817
Dr. M. Bruijnes (Merijn Bruijnes)
m.bruijnes@utwente.nl
+31 (0)53 489 4329
J.B. van Waterschoot MSc (Jelte van Waterschoot)
j.b.vanwaterschoot@utwente.nl
+31 (0)53 489 3100
Link to official project website: http://aria-agent.eu/