PhD Defence Jelte van Waterschoot | Personalized and Personal Conversations - Designing Agents Who Want to Connect With You

Due to the COVID-19 crisis the PhD defence of Jelte van Waterschoot will take place (partly) online.

The PhD defence can be followed by a live stream.

Jelte van Waterschoot is a PhD student in the research group Human Media Interaction (HMI). His supervisor is prof.dr. D.K.J. Heylen from the Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS).

Social conversational agents are useful tools for handling customer service requests or for social engagement such as chit-chat or playing a game. The development of conversational agents has accelerated in the last decade. For example, companies include chatbots on their websites to support visitors, and virtual assistants are part of smart speakers in many homes. A major limitation of current conversational agents is their inability to develop long-term rapport and engagement with end-users. This thesis focused on adaptation and long-term real world engagement as steps towards creating more personalized social conversational agents. The work is oriented towards dialogue designers and everyone who is involved in the design of conversational agents: programmers, researchers, linguists, user experience experts and so on.

We provided an overview of different ways of adaptation through multimodal interaction as well as an overview of design frameworks for prototyping and developing multimodal conversational agents. We compared different state-of-the-art topic-based models for personalization, with a focus on topic management in conversational agents.

After considering multiple design frameworks and the needs of dialogue designers for a design framework, we found a lack of design patterns and guidelines for dialogue designers, specifically for multimodal design. We developed our dialogue engine, Flipper, which we integrated into a virtual human platform for creating multimodal social conversational agents. We included design patterns for dialogue designers and some examples of how Flipper integrates with other components such as multimodal sensors, existing natural language processing pipelines and virtual humans.

We developed three prototypes with our framework: i) the multimodal virtual agent Alice, ii) the BLISS conversational agent and iii) the CoffeeBot. The Alice agent is a software toolkit that other dialogue designers can use for building a social conversational agent. The BLISS conversational agent, named after its research project, is a speech-based prototype with scripted dialogue; it was used to collect users' answers to the agent's questions about mental well-being and happiness. The CoffeeBot is a prototype social robot designed for long-term real world interactions, with a focus on asking personalized questions in spontaneous interactions near coffee machines.

The data collection with the BLISS agent was our first step towards collecting real world data about personal user topics. An interesting finding was that a complex dialogue system is not immediately needed: despite the relatively high word error rate of the speech recognition, the rigid dialogue structure and the disfluent speech synthesis of the agent, at least one topic related to well-being and happiness could be extracted for each user. To increase language variability and allow a looser dialogue structure, we developed the CoffeeBot. Its purpose was to have spontaneous speech-based interactions, in the form of casual conversation, at the workplace. We based the CoffeeBot's dialogue structure on a model of casual conversation. We combined this with asking questions, specifically starter or opening questions, follow-up questions and questions based on past conversations. We took a template-based approach with syntactic and semantic parsers to recognize user topics and generate the questions to be asked by the CoffeeBot. These questions became more tailored to the user over time. The CoffeeBot learned a personalized user model in order to have more engaging conversations with people.
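
As a rough illustration of the template-based question asking described above, the following Python sketch shows how starter questions, follow-up questions and questions based on past conversations could be driven by a small user model. It is not the CoffeeBot's or Flipper's actual implementation: the templates, the topic lexicon (a simple keyword match standing in for the syntactic and semantic parsers) and all names such as UserModel, extract_topic, opening_question and respond are assumptions made for this example.

```python
from dataclasses import dataclass, field

# Hypothetical templates; the thesis abstract does not list the real ones.
STARTER = "What have you been up to lately?"
FOLLOW_UP = "What do you like about {topic}?"
PAST = "Last time you mentioned {topic}. How is that going?"

# Toy topic lexicon standing in for the syntactic and semantic parsers.
KNOWN_TOPICS = {"cycling", "gardening", "cooking", "football"}


@dataclass
class UserModel:
    """Topics a user has mentioned, in order of first mention."""
    topics: list = field(default_factory=list)

    def remember(self, topic):
        # Store each topic once so repeated mentions do not duplicate it.
        if topic not in self.topics:
            self.topics.append(topic)


def extract_topic(utterance):
    """Return the first known topic word in the utterance, or None."""
    for word in utterance.lower().split():
        word = word.strip(".,!?")
        if word in KNOWN_TOPICS:
            return word
    return None


def opening_question(model):
    """Open with a question about a past topic if one is stored."""
    if model.topics:
        return PAST.format(topic=model.topics[-1])
    return STARTER


def respond(model, utterance):
    """Store any recognized topic and return a follow-up question, or None."""
    topic = extract_topic(utterance)
    if topic is None:
        return None
    model.remember(topic)
    return FOLLOW_UP.format(topic=topic)
```

In this sketch, a first conversation opens with the generic starter question, and once a topic such as "cycling" has been recognized, a later conversation opens with a question about it. This mirrors, in a much simplified form, how the questions could become more tailored to the user over time.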

We prepared an evaluation for a long-term real world study with the CoffeeBot, which we piloted for five weeks. Our evaluation focused on two things: i) measuring the impact of personalization on engagement and ii) the general user experience. We compared different methods and combined questionnaires, interviews and interaction metadata to measure the effect of the personalization model. The CoffeeBot's model has yet to be evaluated to determine whether this type of personalized question asking increases engagement with social conversational agents. This is due to the limited data from the pilot and insufficient time for a full long-term real world study. Despite the study's limitations, we did see usable user models in the CoffeeBot, similar to the data collection with the BLISS agent. Also, from both the BLISS agent and the CoffeeBot studies we learned that users occasionally needed more time to think about their answers. Moreover, distinguishing between an answer to a question and other responses, such as a request for more time to think or a user repeating the agent's question, is still a challenge for a conversational agent. Recognizing and responding to these types of user responses remains an open research problem in speech-based systems. Finally, most of the interactions were engaging for users despite the mistakes the conversational agents made. For long-term use, we expect a drop in engagement if mistakes become a nuisance to the user; however, we would argue that an agent making a few mistakes here and there can still provide useful and enjoyable conversations for end-users.