
[D] What dialogue structures can say about emerging trouble in online chat counseling

Bachelor Assignment


Type: Bachelor ATLAS 

Duration: March 2018 until July 2018

Student: Verbeek, R. (Rob, Student B-ATLAS)



I will be working on a dataset from the Trimbos Institute, which consists of chat conversations between agents and patients. The patients are people with an alcohol or drug addiction who request such conversations themselves.

The focus will be on the occurrence of pauses in the conversations, as pauses have been shown to be a relevant feature in spoken dialogue (e.g. a longer pause suggests a negative message or response). This project will explore what value these pauses have in chat conversations. Hence, the main research question is:

“What can dialogue structures tell about emerging trouble in online chat counselling?”
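As a minimal sketch of what "a pause" could mean in chat data, the example below computes the time between consecutive messages and flags the long ones. The message format (timestamp, speaker, text), the example conversation, and the 60-second threshold are all assumptions made for illustration; the structure of the Trimbos dataset is not described here.

```python
from datetime import datetime

# Hypothetical message format: (ISO timestamp, speaker, text).
# This conversation is invented; it is not taken from the dataset.
conversation = [
    ("2018-03-01T14:00:00", "agent", "Hello, how can I help you?"),
    ("2018-03-01T14:00:20", "patient", "I want to talk about my drinking."),
    ("2018-03-01T14:00:35", "agent", "How much did you drink this week?"),
    ("2018-03-01T14:02:35", "patient", "I would rather not say."),
]


def pauses(messages):
    """Return (seconds since previous message, speaker, text) for every message after the first."""
    result = []
    for (prev_ts, _, _), (ts, speaker, text) in zip(messages, messages[1:]):
        gap = (datetime.fromisoformat(ts) - datetime.fromisoformat(prev_ts)).total_seconds()
        result.append((gap, speaker, text))
    return result


def long_pauses(messages, threshold_seconds=60.0):
    """Keep only the messages preceded by a pause of at least threshold_seconds (an assumed cut-off)."""
    return [p for p in pauses(messages) if p[0] >= threshold_seconds]


for gap, speaker, text in long_pauses(conversation):
    print(f"{gap:.0f}s pause before {speaker}: {text}")
```

A fixed threshold is only a starting point; in practice the cut-off would probably have to depend on message length and on the typing speed of the individual participant.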

It is expected that no labels will be available (e.g. whether a conversation is considered good or bad), so the analysis will rely largely on unsupervised machine learning. These methods will focus on the pauses and the context around them, looking for connections between the two. Some possible sub-questions are:

- Can a relation be found between pauses and word use?

- Can we classify and make sense of individual chat messages using unsupervised machine learning?

o Is topic modelling (LDA) able to classify chat messages (around pauses)?

o Can the pauses between messages be included as features in a Naïve Bayes model?

- Can we distinguish between different types of pauses, considering the nature of different patients?

- Is sentiment analysis able to recognize different types of messages?
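The first sub-question, on pauses and word use, could be approached with something as simple as the sketch below: split messages by the length of the preceding pause and compare the vocabularies. The input pairs, the tokenizer, and the 60-second threshold are hypothetical and chosen only for illustration; this is not the method the project commits to.

```python
import re
from collections import Counter

# Hypothetical input: (pause in seconds before the message, message text).
# These samples are invented for illustration.
samples = [
    (5.0, "yes that is fine"),
    (8.0, "yes thank you"),
    (90.0, "no i do not know"),
    (150.0, "i do not want that"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts_by_pause(samples, threshold_seconds=60.0):
    """Count words separately for messages after short and after long pauses."""
    short, long_ = Counter(), Counter()
    for pause, text in samples:
        target = long_ if pause >= threshold_seconds else short
        target.update(tokenize(text))
    return short, long_


def words_only_after_long_pauses(samples, threshold_seconds=60.0):
    """Return the words that occur after long pauses but never after short ones."""
    short, long_ = word_counts_by_pause(samples, threshold_seconds)
    return sorted(word for word in long_ if word not in short)


print(words_only_after_long_pauses(samples))
```

On real data, raw counts like these would need to be normalised for message length and tested for significance before any relation between pauses and word use could be claimed.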

This research will be a first step in the direction of automated analysis of dialogue structures in online chat counselling. We hope that from here, more questions and possibilities will become apparent, and that the results will eventually allow improvement of the quality of online chat counselling.


An (agile) data science approach, combined with basic machine learning methods (LDA, SVM, single-layer neural networks), will form the basis of this research. Basic machine learning methods are expected to be sufficiently powerful while still making it possible to gain insights from the results. If, however, more detailed language analysis is expected to have predictive value, more advanced machine learning or deep learning methods will be applied.

The focus will be on continuous evaluation of intermediate results, from which new goals will then be determined. Right now, it is unclear what exactly the dialogue structures can tell us. After a first round of analysis, the insights gained will allow for deeper and more relevant research. This way of working is called agile data science.