Software Testing

For background information on the topic as a whole, see the Background section at the end of this page.

Available Project Proposals

If you are interested in the general topic of Testing, or if you have your own project idea related to the topic, please contact us directly. Alternatively, you can work on one of the following concrete project proposals:

  • Conversational AI systems, such as chatbots and virtual assistants, are becoming increasingly common across applications ranging from customer support and education to healthcare. Ensuring their reliability is essential, as failures can lead to user frustration or misinformation.

    Model-Based Testing (MBT) is a well-established technique that offers a structured approach to testing (black-box) software systems by using formal models to define expected behaviours and generate test cases.

    The aim of this project is to explore how MBT can be applied to conversational AI to ensure consistency and accuracy in interactions, focusing on the unique challenges posed by natural language processing.
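
    To give a flavour of what a model-based approach to dialogue testing could look like, here is a minimal, hypothetical sketch in Python: the dialogue model is a finite state machine whose transitions pair simulated user utterances with expected responses, and test cases are bounded paths through it. All state names and utterances are invented for illustration.

      # Hypothetical dialogue model: states are phases of the conversation;
      # each transition gives a simulated user utterance, the response the
      # agent is expected to produce, and the next state.
      DIALOGUE_MODEL = {
          "greeting": [("hello", "How can I help you?", "intent")],
          "intent": [
              ("I want to book a ticket", "Where would you like to go?", "destination"),
              ("asdfgh", "Sorry, I did not understand that.", "intent"),  # edge case
          ],
          "destination": [("Amsterdam", "When do you want to travel?", "date")],
          "date": [("tomorrow", "Your ticket is booked.", "done")],
      }

      def generate_test_cases(model, state="greeting", depth=6):
          """Enumerate dialogue paths up to a bounded depth; each test case
          is a replayable script of (user input, expected response) pairs."""
          if state not in model or depth == 0:
              yield []
              return
          for user_input, expected_response, next_state in model[state]:
              for tail in generate_test_cases(model, next_state, depth - 1):
                  yield [(user_input, expected_response)] + tail

      for case in generate_test_cases(DIALOGUE_MODEL):
          print(case)

    In a realistic setting the expected response would not be a literal string but a predicate over the agent's answer (for instance an intent classifier), which is exactly where the response-evaluation question below comes in.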

    Some example research questions are the following:

    Dialogue Modeling: How can we create formal models that capture the flow of conversation, handling context changes, user intent, and response generation? In particular, how can the model represent complex dialogue structures found in real-world conversational agents?

    Test Case Generation: What strategies can be used to generate meaningful test cases that simulate realistic conversations? How can we ensure that generated test cases cover key dialogue scenarios, including edge cases where the agent may fail to respond appropriately? (Natural language generation)

    Response Evaluation: How can the behavior of a conversational agent be validated against the formal model? In particular, what metrics or techniques can be used to assess if the agent’s responses align with the expected outcomes defined in the model? (Natural language processing)

    The research will consist of the following steps:

    1. Conducting a literature review to explore existing model-based (testing) techniques that can be applied to conversational AI.
    2. Developing your own approach or methods based on the chosen research question.
    3. Formulating an answer to the research question by applying the developed methods to a conversational AI system.
  • Supervisors: Arend Rensink and Marcus Gerhold

    Have you ever wondered why trains (almost) never collide? Would you believe it if you were told that more than 70% of the Dutch railway tracks are kept safe by relays alone? And that this is now slowly being transformed into modern CPU-based controllers? Do you know how such a complex system is tested? If this has piqued your interest, please read on and start your (train) journey into the world of model-based testing!

    The aim of this project is to apply and extend a method for Model-Based Testing of railway components. It is carried out in collaboration with Pilz, an international company in industrial automation and one of the suppliers of such components.

    Railway infrastructure, consisting of tracks, signals, points, stations, tunnels and much more, is a prime example of a complex, safety-critical system. Its correct functioning is essential for the safe and speedy journey of trains. Moreover, the functionality of the system is evolving, for instance in order to adopt the more modern European Rail Traffic Management System (ERTMS).

    "Correctness" in the sense above means "compliance to a standard". For railway infrastructure, that standard is specified in EULYNX, a flavour of SysML (which itself can be regarded as a sibling of UML). The standard is an important interface between the national railway infrastructure managers (ProRail in the case of the Netherlands) and suppliers (such as Pilz): the latter are required to show compliance as a precondition for their products to be acceptable to the former. How to show compliance in practice is, however, not prescribed.

    In a recent research project, the Formal Methods and Tools group at the UT has developed a model-based testing approach for this purpose. Broadly, this consists of deriving a test suite from a given EULYNX specification and applying the tests to the railway components whose compliance needs to be shown. Since testing is a bug-hunting technique, the coverage of the test suite is all-important for the trust that can be put in the results of this approach. Initial findings are quite promising, but need to be extended.
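
    As a very rough illustration of the underlying idea (not of the actual tooling or the EULYNX models), consider a toy state machine for a point controller, from which a transition-covering test suite is derived; all states, inputs and outputs are invented:

      from collections import deque

      # Toy stand-in for a EULYNX-style state machine of a point controller.
      # Transitions: (source_state, input) -> (expected_output, target_state).
      SPEC = {
          ("left", "move_right"): ("moving", "moving_right"),
          ("moving_right", "end_position_reached"): ("locked_right", "right"),
          ("right", "move_left"): ("moving", "moving_left"),
          ("moving_left", "end_position_reached"): ("locked_left", "left"),
      }

      def shortest_path(spec, start, goal):
          """Breadth-first search for an input/output sequence from start to goal."""
          queue = deque([(start, [])])
          seen = {start}
          while queue:
              state, path = queue.popleft()
              if state == goal:
                  return path
              for (src, inp), (out, tgt) in spec.items():
                  if src == state and tgt not in seen:
                      seen.add(tgt)
                      queue.append((tgt, path + [(inp, out)]))
          raise ValueError(f"{goal} unreachable from {start}")

      def transition_cover(spec, initial="left"):
          """One test per transition: reach its source state, then fire it.
          This realizes transition coverage, the simplest of the coverage
          notions that step 3 below asks about."""
          return [shortest_path(spec, initial, src) + [(inp, out)]
                  for (src, inp), (out, _tgt) in spec.items()]

      for test in transition_cover(SPEC):
          print(" -> ".join(f"{inp}/{out}" for inp, out in test))

    The real project works with full EULYNX (SysML) models and stronger coverage criteria, but the overall shape is similar: walk the specification, emit input/expected-output sequences, and replay them against the component.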

    This MSc project is about extending those findings. Concretely, the following steps are or could be involved, depending on how the project develops:

    1. Creating an adapter for automatically applying an existing test suite (for a point controller) to a concrete implementation developed by Pilz (a sketch of such an adapter follows below);
    2. Generating and applying test suites for other subsystems from their respective (EULYNX) specifications, using the available tooling;
    3. Studying (on a theoretical and practical level) the coverage obtained by the generated test suites;
    4. Modifying and extending the current test generation tooling based on the outcome of the steps above;
    5. As part of the previous step, comparing the results with other tools and approaches.

    These steps should be carried out in an agile fashion and in consultation with Pilz: to what degree they can all be addressed, or will be superseded by other questions that may emerge in the course of the work, will depend on circumstances.
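
    To give a feel for step 1, an adapter essentially translates abstract test labels into concrete stimuli, and concrete observations back into abstract labels. The following sketch is entirely hypothetical (all byte codes and interface details are invented); a real adapter would talk to the Pilz controller over its actual interface:

      class PointControllerAdapter:
          """Hypothetical mapping between abstract test labels and concrete I/O."""

          INPUT_MAP = {"move_right": b"\x01", "move_left": b"\x02",
                       "end_position_reached": b"\x03"}
          OUTPUT_MAP = {b"\x10": "moving", b"\x11": "locked_right",
                        b"\x12": "locked_left"}

          def __init__(self, connection):
              self.connection = connection  # e.g. a serial or fieldbus handle

          def apply(self, abstract_input):
              self.connection.send(self.INPUT_MAP[abstract_input])

          def observe(self):
              return self.OUTPUT_MAP.get(self.connection.receive(), "unexpected")

      def run_test(adapter, test):
          """Replay one derived test case; report the first deviation."""
          for abstract_input, expected_output in test:
              adapter.apply(abstract_input)
              actual = adapter.observe()
              if actual != expected_output:
                  return f"FAIL: expected {expected_output}, got {actual}"
          return "PASS"

      class FakeConnection:
          """Stand-in for the real link, replaying a fixed response script."""
          def __init__(self, script):
              self.script = iter(script)
          def send(self, data):
              pass
          def receive(self):
              return next(self.script)

      adapter = PointControllerAdapter(FakeConnection([b"\x10", b"\x11"]))
      print(run_test(adapter, [("move_right", "moving"),
                               ("end_position_reached", "locked_right")]))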

    Background material:

    • A case in point: verification and testing of a EULYNX interface (see https://doi.org/10.1145/3528207)
    • Conformance in the Railway Industry: Single-Input-Change Testing a EULYNX Controller (see https://doi.org/10.1007/978-3-031-43681-9_15)
  • Supervisors: Marcus Gerhold, Mariëlle Stoelinga

    Quiescence, or the absence of outputs, is an important notion in model-based testing. A system-under-test (SUT) must fail a test case if an output is required but the SUT does not provide one. The absence of outputs is therefore often labelled with a special quiescence label, delta.
    In practice, it is difficult (and in general impossible) to determine whether a required output will never be given. Quiescence is therefore implemented by a timeout: if no output occurs within M seconds, it is assumed that no output will occur anymore.
    It thus makes sense to investigate the notion of quiescence by means of timed automata: if we model everything (i.e. the SUT and the requirements) with timed automata, do we get the same results as with the delta-labelling approach?
    Tasks:

    • study the concept of quiescence
    • study timed automata, and the tool Uppaal (see www.uppaal.org)
    • transform delta-labelled automata into timed automata (a sketch follows below)
    • compare the results
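
    To make the transformation step concrete, here is a minimal, hypothetical sketch in Python. It uses one possible encoding (a single clock x that is reset on every output, with quiescence observable only once x >= M); whether this encoding yields the same verdicts as the delta-labelling approach is precisely what the project should investigate:

      M = 5  # timeout in seconds after which quiescence is declared

      def to_timed_automaton(lts):
          """Turn a delta-labelled LTS into a timed automaton (one encoding).

          lts maps each state to a list of (label, target) pairs; by
          convention outputs end in "!", inputs in "?", and "delta" marks
          quiescence. The result maps each state to a list of
          (label, guard, clock_resets, target) tuples over one clock x."""
          ta = {}
          for state, transitions in lts.items():
              ta[state] = []
              for label, target in transitions:
                  if label == "delta":
                      # Quiescence is observable only after M time units of
                      # output silence, hence the guard x >= M.
                      ta[state].append(("quiescent", f"x >= {M}", [], target))
                  elif label.endswith("!"):
                      # Outputs reset the clock that measures output silence.
                      ta[state].append((label, "true", ["x"], target))
                  else:
                      ta[state].append((label, "true", [], target))
          return ta

      SPEC = {"s0": [("coin?", "s1")],
              "s1": [("coffee!", "s0"), ("delta", "s1")]}
      print(to_timed_automaton(SPEC))

    Checking whether the resulting automaton and the original delta-labelled one yield the same test verdicts could then be done with a tool such as Uppaal.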
  • Context

    Technolution, an electronics/digital logic/software technology integrator, regularly creates embedded systems for customers. Technolution tests these systems to verify conformance to the specification. Currently, these tests are automated, but hand-crafted. Technolution wants to investigate the feasibility of further automating the testing of embedded systems using model-based testing, with the aim of reducing cost and mistakes. The novelty lies in the application of model-based testing to embedded systems. As embedded systems play a critical role in society, further improving their testing is highly relevant.

    As a first step, we want to apply model-based testing to a sensor for measuring voltages and currents in the electrical grid. Part of the behaviour on the sensor interfaces is expected to be representable with typical, existing MBT concepts; part of it is atypical for MBT, for example the (digitized) electrical input signals and arithmetic operations on the digitized signals, such as offsets, multiplications, and filtering. Important challenges of this assignment are finding a way to model these atypical aspects and inventing a method to generate tests for them, given that the number of input combinations is potentially infinite.
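
    One conceivable way to approach the arithmetic part is to pair an executable reference model with randomized and boundary-value input generation. The sketch below is purely illustrative; the offset, gain and moving-average filter stand in for whatever processing chain the sensor's specification actually prescribes:

      import random

      def reference_model(samples, offset=0.5, gain=2.0):
          """Executable model of a toy processing chain: offset, gain, then
          a 3-sample moving-average filter."""
          scaled = [(s + offset) * gain for s in samples]
          return [sum(scaled[max(0, i - 2):i + 1]) / len(scaled[max(0, i - 2):i + 1])
                  for i in range(len(scaled))]

      def generate_inputs(n_random=100, length=16, lo=-10.0, hi=10.0):
          """Boundary values plus random samples: one cheap answer to the
          'potentially infinite combinations' problem, trading exhaustiveness
          for coverage of the extremes."""
          yield [lo] * length                                  # minimum rail
          yield [hi] * length                                  # maximum rail
          yield [lo if i % 2 else hi for i in range(length)]   # maximum slew
          for _ in range(n_random):
              yield [random.uniform(lo, hi) for _ in range(length)]

      def check(sut, tolerance=1e-6):
          """Compare the system under test against the model on every input."""
          for samples in generate_inputs():
              expected = reference_model(samples)
              actual = sut(samples)
              for e, a in zip(expected, actual):
                  assert abs(e - a) <= tolerance, f"mismatch: {e} vs {a}"

      # Sanity check: the model itself, used as the system under test, passes.
      check(reference_model)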

    Goal

    The research aims to answer the following questions:

    1. Is it feasible to model the embedded system's behaviour so that it can be automatically tested using the model as input? What does such a model look like, and what concepts are needed to describe the behaviour?
    2. What algorithm can be devised, or which existing method can be used, to automatically test the system using the model as input? What coverage can be achieved with this algorithm or method?
    3. Can the algorithm or method be optimized for a specific goal, for example for duration (a typical use case being development testing) or for coverage (a typical use case being release testing)?
    4. How does the model-based testing method compare to the hand-crafted tests?

    Activities

    • Study the interfaces of the sensor and identify the independent parts;
    • Postulate initial ideas for modelling the interface parts;
    • Study, understand, summarize (document) and present the basic concepts of model-based testing;
    • Determine what interface parts can be modelled using an existing approach, and what parts need a new approach;
    • For each interface part, define the modelling and testing strategy and develop a prototype (consisting of a model, an automated testing tool, adapters to the system and a script that runs the model-based test) that demonstrates the test strategy on the actual sensor;
    • Evaluate the model-based testing strategies on effectiveness compared to the handcrafted tests;
    • Provide recommendations for further developing the model-based testing approach and generalizing it to other embedded systems.

    For prototyping and demonstrating, an actual sensor with software adapters to its interfaces will
    be made available.

    Company info

    Technolution B.V.
    Burg. Jamessingel 1
    P.O. Box 2013
    2800 BD Gouda
    The Netherlands

    T +31 (0)182 59 40 00
    E info@technolution.com
    I www.technolution.com


Background

Testing is a vital part of the software development lifecycle, ranging from small-scale unit tests for individual methods, to system-wide end-to-end tests, and up to bigger-picture acceptance tests of requirements. Unlike most verification techniques, testing is often applied directly on-site, on the actual software system. It is the most frequently applied verification technique in industry. At the same time, testing is costly and often consumes a large chunk of a project's budget. To reduce this cost, testing can be automated to varying degrees.

In FMT, we are interested in this automation. This starts with easing the creation and maintenance of test cases that can be executed automatically on the system. Model-based testing is a more advanced technique in which test cases are generated by an algorithm from a model, e.g. a finite state machine or labelled transition system.

Possible research directions are as follows. You could directly improve test generation algorithms for one of the different modelling formalisms, or compare existing algorithms. You could also investigate how to improve testing as part of the software development process, e.g. by linking requirements with (failed) test cases via Behaviour-Driven Development. Furthermore, you could develop or improve methods that measure and express how much of a system has been tested by a given set of executed tests.
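
As a small illustration of the last direction, coverage of a model by a set of executed tests can be expressed as the fraction of model transitions that the tests exercised; the model and test log below are invented for the example:

    def transition_coverage(model, executed_runs):
        """Fraction of model transitions exercised by the executed tests.

        model: set of (state, action, next_state) triples;
        executed_runs: iterable of runs, each a list of such triples."""
        covered = {step for run in executed_runs for step in run}
        return len(covered & model) / len(model)

    MODEL = {("s0", "login", "s1"), ("s1", "logout", "s0"), ("s1", "query", "s1")}
    RUNS = [[("s0", "login", "s1"), ("s1", "logout", "s0")]]
    print(f"transition coverage: {transition_coverage(MODEL, RUNS):.0%}")  # 67%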

In research on testing, theory and practice are relatively close, such that developing new theory and doing case studies are both possible.
