Software Engineering & Evolution

Software engineering is the most technical part of technical computer science, focused on developing systematic principles common to other kinds of engineering (like mechanical or electrical engineering) and applying them to the development of software systems. In particular, it covers:

Software evolution is a branch of software engineering that focuses on studying existing software rather than necessarily creating new software. It covers, among other topics:

A typical research project in software engineering involves implementing a software system at least up to a fully functioning prototype and performing a feasibility study and/or a user study. A typical software evolution project covers the development of a software system that analyses or transforms another software system. Both often use methodologies from empirical software engineering.


Related Modules

Available Project Proposals

If you are interested in the general topic of Software Engineering and Evolution, or if you have your own project idea related to the topic, please contact us directly. Alternatively, you can work on one of the following concrete project proposals:

  • CFG-to-VPG (Zaytsev)

    Visibly Pushdown Grammars (VPG) are a secret level of the Chomsky hierarchy of languages, hiding between the class of regular languages, corresponding to the regular expressions we all know and love (and sometimes abuse), and the omnipresent context-free grammars (CFG), used in tools like ANTLR or bison. Essentially, they correspond to a class of automata which look like the pushdown automata of the CFG world, except that the stack may only be pushed and popped on "brackets": the input symbol itself determines whether the automaton pushes, pops, or leaves the stack alone. Visibly Pushdown Grammars are known to be able to recognise the Dyck language of balanced brackets, which is pretty much enough to parse real languages like XML.

    Can we build a tool that takes any ANTLR grammar and converts it automatically to an equivalent VPG?
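    To make the "visibly" restriction concrete, here is a minimal sketch of a recogniser for balanced brackets in which the input symbol alone decides the stack action. The partition of the alphabet into call and return symbols (`CALLS`, `RETURNS`) and the function name `accepts_balanced` are assumptions chosen for illustration, not part of any existing tool.

```python
# Hypothetical partition of the alphabet: each "call" symbol maps to its "return" symbol.
CALLS = {"(": ")", "[": "]", "<": ">"}
RETURNS = set(CALLS.values())


def accepts_balanced(word):
    """Return True if every bracket in `word` is properly matched and nested."""
    stack = []
    for symbol in word:
        if symbol in CALLS:
            # Call symbol: the automaton must push (here, the expected closer).
            stack.append(CALLS[symbol])
        elif symbol in RETURNS:
            # Return symbol: the automaton must pop, and the closer must match.
            if not stack or stack.pop() != symbol:
                return False
        # Any other symbol is "internal": the stack is left untouched.
    return not stack


print(accepts_balanced("<a>[x(y)]</a>"))
print(accepts_balanced("(]"))
```

    A general pushdown automaton could decide to push or pop on any symbol; here that choice is fixed by the alphabet, which is the property a CFG-to-VPG conversion would have to establish for a given grammar.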

  • Codebase Modernity Meter (Gerhold, Zaytsev)

    The concept of “modernity” has been a topic of discussion in many fields, including philosophy, sociology and technology. In the software engineering context, modernity can be defined as the extent to which the source code of a software system utilises new features and capabilities of the programming language it is written in. In software evolution, we define modernity as a scale for measuring the age of a codebase, expressed as language levels/versions. For instance, your Python code has a modernity of 3.11 if it runs fine with Python 3.11 but fails with an error under Python 3.10.
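    As a rough illustration of the idea, the sketch below estimates a lower bound on the Python version a snippet needs by looking for version-specific syntax nodes with the standard `ast` module. The feature table `FEATURE_VERSIONS`, the baseline of 3.0, and the name `estimate_modernity` are assumptions for this example; a real meter would need a far more complete catalogue, and `ast.parse` can only parse syntax that the running interpreter itself understands.

```python
import ast

# Hypothetical, deliberately incomplete table:
# AST node class name -> first Python version whose grammar produces it.
FEATURE_VERSIONS = {
    "NamedExpr": (3, 8),   # assignment expressions (walrus operator)
    "Match": (3, 10),      # structural pattern matching
    "TryStar": (3, 11),    # exception groups with except*
}


def estimate_modernity(source):
    """Return the newest (major, minor) version implied by the syntax found."""
    newest = (3, 0)  # assumed baseline for code using none of the listed features
    for node in ast.walk(ast.parse(source)):
        version = FEATURE_VERSIONS.get(type(node).__name__)
        if version is not None and version > newest:
            newest = version
    return newest


print(estimate_modernity("if (n := 10) > 5:\n    pass\n"))
```

    This only inspects syntax; library calls and semantic changes between versions would need separate analysis.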

    There are two ways to take this project…

  • Exploring Error Handling in Rust Programs (Castor)

    Supervisor: Fernando Castor

    The Rust programming language aims to make systems programming efficient and safe at the same time by helping developers build programs that are safe by construction. The language is statically typed and supports safe access to memory, without the need for a garbage collector or runtime system, with the help of its compiler. It also provides scoped concurrency while avoiding state sharing, with exit synchronization for groups of threads. According to the 2023 Stack Overflow Developer Survey, it is the most admired technology among survey respondents and has been so for many years.

    One thing that Rust does not have, though, is a specific mechanism for signaling and handling errors, unlike a number of popular programming languages such as Java, C++, Swift, and Python. In Rust, unrecoverable errors are signaled by the panic! macro. Computations that may produce errors are represented by values of type Result, an enumerated type that encapsulates either a correct or an erroneous result. These values are just regular Rust values and are not propagated automatically, unlike exceptions in other languages. On the one hand, this means that Rust avoids additional runtime infrastructure to perform stack unwinding during exception propagation. On the other hand, developers must explicitly check whether the output of a function is an error or not.

    Previous work has shown that, in a number of languages, developers give less attention to code that handles errors than to other parts of the code. They test error handling code less [1], catch errors without doing anything with them [2,3], catch the wrong errors [4], fail to account for potential errors [5], and sometimes simply do not use the language's error handling mechanism [6]. Problems with error handling are commonplace even in languages that do not include specific mechanisms for handling errors [7].

    In this project we would like to address a high-level research question:

    RQ How do Rust programmers handle errors? How much code is dedicated to that?

    This question can be decomposed into a number of more specific research questions:

    RQ1 How are errors typically handled in Rust programs? Are they often ignored, as in other languages?

    RQ1.1 Is it common to have long chains of method calls with explicit error handling at each step, that is, code that would not be needed with exception propagation?

    RQ2 What do developers think of Rust error handling? Is it better than C? Better than exceptions?

    RQ3 Do automated tests for Rust programs test exceptional paths?

    RQ4 What are error handling bugs in Rust like?

    RQ5 How are errors handled in the presence of scoped concurrency?
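    A first, very crude approximation of RQ1 could count common error-handling idioms in Rust source text. The sketch below does this with regular expressions; the pattern list `PATTERNS` and the function `count_idioms` are assumptions for illustration, and a purely textual match will also count occurrences inside comments and string literals, so a real study would work on the syntax tree instead.

```python
import re
from collections import Counter

# Assumed, non-exhaustive textual patterns for Rust error-handling idioms.
PATTERNS = {
    "unwrap": re.compile(r"\.unwrap\(\)"),               # panic on Err/None
    "expect": re.compile(r"\.expect\("),                 # panic with a message
    "question_mark": re.compile(r"\?(?=\s*[;.,)\]}])"),  # propagate with `?`
    "panic": re.compile(r"\bpanic!\s*\("),               # explicit panic
    "ignored_result": re.compile(r"\blet\s+_\s*="),      # value discarded
}


def count_idioms(rust_source):
    """Count textual occurrences of each idiom in a Rust source string."""
    return Counter(
        {name: len(pattern.findall(rust_source)) for name, pattern in PATTERNS.items()}
    )


sample = """
fn read(path: &str) -> Result<String, std::io::Error> {
    let text = std::fs::read_to_string(path)?;
    let n: i32 = text.trim().parse().unwrap();
    let _ = std::fs::remove_file(path);
    if n < 0 { panic!("negative"); }
    Ok(text)
}
"""
print(count_idioms(sample))
```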


    [1] Felipe Ebert, Fernando Castor, Alexander Serebrenik. An exploratory study on exception handling bugs in Java programs. J. Syst. Softw. 106: 82-101 (2015)

    [2] Nathan Cassee, Gustavo Pinto, Fernando Castor, Alexander Serebrenik. How Swift developers handle errors. MSR 2018: 292-302

    [3] Bruno Cabral, Paulo Marques. Exception Handling: A Field Study in Java and .NET. ECOOP 2007: 151-175

    [4] Nélio Cacho, Thiago César, Thomas Filipe, Eliezio Soares, Arthur Cassio, Rafael Souza, Israel García, Eiji Adachi Barbosa, Alessandro Garcia. Trading robustness for maintainability: an empirical study of evolving C# programs. ICSE 2014: 584-595

    [5] Juliana Oliveira, Deise Borges, Thaisa Silva, Nélio Cacho, Fernando Castor. Do Android developers neglect error handling? A maintenance-centric study on the relationship between Android abstractions and uncaught exceptions. J. Syst. Softw. 136: 1-18 (2018)

    [6] Rodrigo Bonifácio, Fausto Carvalho, Guilherme Novaes Ramos, Uirá Kulesza, Roberta Coelho. The use of C++ exception handling constructs: A comprehensive study. SCAM 2015: 21-30

    [7] Magiel Bruntink, Arie van Deursen, Tom Tourwé. Discovering faults in idiom-based exception handling. ICSE 2006: 242-251

  • How complex is your model? (van der Wal, Stoelinga)

    Supervisors: Djurre van der Wal, Mariëlle Stoelinga

    In software engineering, code metrics tell something about the complexity of a program. Various metrics exist, ranging from simple ones like the number of lines of code to advanced ones like cyclomatic complexity.

    Code metrics can tell something about the code quality. This is extremely important: good code is easy to maintain, refactor, and port. Bad code costs time, money, and frustration.
    However, for software models, few such metrics exist, even though they could be as useful for models as for code. Thus, the idea of this project is to come up with model metrics and evaluate whether they make sense.

    Note that various models could be studied: finite state automata, UML models, fault trees, etc.
    Even though no company is involved, this is relevant for industry; we can involve companies later.


    • Study the state of the art
    • Define model metrics
    • Evaluate on relevant case studies or examples: do these metrics really tell us what we want?
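    To give a flavour of what a model metric might look like, the sketch below computes a few size and structure measures for a finite state automaton given as a list of transitions. The chosen metrics, the borrowed cyclomatic formula E - N + 2 (which assumes the automaton forms a single connected graph), and the name `automaton_metrics` are assumptions for illustration; whether such numbers say anything useful about model quality is exactly what the project would investigate.

```python
def automaton_metrics(transitions):
    """Compute simple metrics for an automaton given as (source, label, target) triples."""
    transitions = list(transitions)
    states = {src for src, _, _ in transitions} | {dst for _, _, dst in transitions}
    out_degree = {}
    for src, _, _ in transitions:
        out_degree[src] = out_degree.get(src, 0) + 1
    n, e = len(states), len(transitions)
    return {
        "states": n,
        "transitions": e,
        # Cyclomatic number borrowed from code metrics; assumes one connected component.
        "cyclomatic": e - n + 2,
        "max_out_degree": max(out_degree.values(), default=0),
    }


# Hypothetical example model: a door that can be opened, closed, locked, and unlocked.
door = [
    ("closed", "open", "opened"),
    ("opened", "close", "closed"),
    ("closed", "lock", "locked"),
    ("locked", "unlock", "closed"),
]
print(automaton_metrics(door))
```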
  • Mining Questions about Software Energy Consumption (Castor)

    Supervisor: Fernando Castor

    Nowadays, thanks to the rapid proliferation of mobile phones, tablets, and wireless devices in general, energy efficiency is becoming a key software design consideration, since energy consumption is closely related to battery lifetime. It is also of increasing interest in the non-mobile arena, such as data centers and desktop environments. Energy-efficient solutions are highly sought after across the compute stack, with more established results through innovations in hardware/architecture [1,2], operating systems [3], and runtime systems [4]. In recent years, there has been growing interest in studying energy consumption from higher layers of the compute stack, and most of these studies focus on application software [5,6,7,8]. These approaches complement prior hardware/OS-centric solutions, so that improvements at the hardware/OS level are not cancelled out at the application level, e.g., due to misuses of language/library/application features.

    We believe a critical dimension to further improve energy efficiency of software systems is to understand how software developers think. The needs of developers and the challenges they face may help energy-efficiency researchers stay focused on real-world problems. The collective wisdom shared by developers may serve as a practical guide for future energy-aware and energy-efficient software development. The conceptually incorrect views they hold may inspire educators to develop more state-of-the-art curricula.

    The goal of this work is to obtain a deeper understanding of (i) whether application programmers are interested in software energy consumption, and, if so, (ii) how they are dealing with energy consumption issues. Specifically, the questions we are trying to answer are:

    RQ1 What are the distinctive characteristics of energy-related questions?

    RQ2 What are the most common energy-related problems faced by software developers?

    RQ3 According to developers, what are the main causes for software energy consumption?

    RQ4 What solutions do developers employ or recommend to save energy?

    We leverage data from Stack Overflow, the most popular software development Q&A website, and from issues reported in issue trackers of real open source software projects to answer these questions.
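    A first step in such a mining study is typically to select candidate questions. The sketch below filters an in-memory list of questions by keyword; the term list `ENERGY_TERMS`, the `title`/`body` field names, and the function `select_energy_questions` are assumptions for illustration, and a real study would validate the keywords manually and obtain the data from an actual data dump or API.

```python
import re

# Hypothetical keyword list; it would need manual validation in a real study.
ENERGY_TERMS = re.compile(
    r"\b(energy|power consumption|battery|wakelock)\b", re.IGNORECASE
)


def select_energy_questions(questions):
    """Keep questions whose title or body mentions an energy-related term."""
    return [
        q for q in questions
        if ENERGY_TERMS.search(q["title"] + " " + q["body"])
    ]


# Made-up sample records, only to show the expected input shape.
questions = [
    {"title": "How to reduce battery drain of my Android app?", "body": "..."},
    {"title": "Why is my for loop slow?", "body": "It takes seconds."},
    {"title": "PowerShell script fails", "body": "Exit code 1."},
]
for q in select_energy_questions(questions):
    print(q["title"])
```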


    [1] L. Bircher and L. John. Analysis of dynamic power management on multi-core processors. In ICS, 2008.

    [2] A. Iyer and D. Marculescu. Power efficiency of voltage scaling in multiple clock, multiple voltage cores. In ICCAD, 2002.

    [3] R. Ge, X. Feng, W.-c. Feng, and K. Cameron. CPU MISER: A performance-directed, run-time system for power-aware clusters. In ICPP, 2007.

    [4] H. Ribic and Y. D. Liu. Energy-efficient work-stealing language runtimes. In ASPLOS, 2014.

    [5] Wellington Oliveira, Bernardo Moraes, Fernando Castor, João Paulo Fernandes. Analyzing the Resource Usage Overhead of Mobile App Development Frameworks. EASE 2023: 152-161

    [6] Wellington Oliveira, Renato Oliveira, Fernando Castor, Gustavo Pinto, João Paulo Fernandes. Improving energy-efficiency by recommending Java collections. Empir. Softw. Eng. 26(3): 55 (2021)

    [7] Ding Li, Shuai Hao, William G. J. Halfond, Ramesh Govindan. Calculating source line level energy information for Android applications. ISSTA 2013: 78-89

    [8] Stefanos Georgiou, Maria Kechagia, Tushar Sharma, Federica Sarro, Ying Zou. Green AI: Do Deep Learning Frameworks Have Different Costs? ICSE 2022: 1082-1094

  • Modelling and Analysis of Agent-based Models (Hahn, Stoelinga)

    Supervisors: Ernst Moritz Hahn, Mariëlle Stoelinga

    Agent-based models are a dominant modelling approach in areas such as the social sciences. These models have led to a better understanding of numerous societal phenomena, with applications including climate [1], land-use policy [2], stock market indices [3], and people’s behaviour in self-driving cars [4].

    This project aims to establish better links between the social sciences and computer science. In this context, we want to investigate the potential of using model checking for the faithful analysis of such models. Therefore, in this project you will

    • collect appropriate models from publications in the social sciences,
    • formalise those models in the language of a probabilistic model checker,
    • use the model checker to compute properties of these models,
    • interpret the values obtained and compare them to the ones found in the literature.
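    To illustrate the kind of property a probabilistic model checker computes, the sketch below approximates the probability of eventually reaching a target state in a small discrete-time Markov chain by repeated fixed-point iteration. The three-state opinion model, its probabilities, and the function `reach_probability` are made up for this example; an actual model checker uses its own modelling language and far more efficient and precise algorithms.

```python
def reach_probability(chain, target, iterations=1000):
    """Approximate, for each state, the probability of eventually reaching `target`.

    `chain` maps each state to a dict {successor: probability}.
    """
    prob = {s: (1.0 if s == target else 0.0) for s in chain}
    for _ in range(iterations):
        # One step of value iteration: the target stays at 1, every other state
        # takes the probability-weighted average over its successors.
        prob = {
            s: 1.0 if s == target
            else sum(p * prob[t] for t, p in chain[s].items())
            for s in chain
        }
    return prob


# Hypothetical single-agent model: an undecided agent adopts or rejects an idea.
opinion = {
    "undecided": {"undecided": 0.5, "adopter": 0.3, "rejecter": 0.2},
    "adopter": {"adopter": 1.0},
    "rejecter": {"rejecter": 1.0},
}
result = reach_probability(opinion, "adopter")
print(round(result["undecided"], 6))
```

    For this chain the exact answer can be checked by hand: from "undecided" the agent eventually adopts with probability 0.3 / (0.3 + 0.2) = 0.6.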

    [1] Regime shifts in coupled socio-environmental systems: Review of modelling challenges and approaches. Tatiana Filatova, J. Gary Polhill, Stijn van Ewijk. 2016, Environ. Model. Softw., Vol. 75, pp. 333-347.

    [2] Agent-based land-use models: a review of applications. Robin B. Matthews, Nigel G. Gilbert, Alan Roach, J. Gary Polhill, Nick M. Gotts. Springer, 2007, Landscape Ecology, Vol. 22, pp. 1447–1459.

    [3] An Agent-Based Approach to Artificial Stock Market Modeling. Samuel Vanfossan, Cihan H. Dagli, Benjamin Kwasa. Elsevier, 2020, Procedia Computer Science, Vol. 168, pp. 161-169.

    [4] Human behaviour with automated driving systems: a quantitative framework for meaningful human control. Daniël D. Heikoop, Marjan Hagenzieker, Giulio Mecacci, Simeon Calvert, Filippo Santoni De Sio, Bart van Arem. 2019, Theoretical Issues in Ergonomics Science, Vol. 20(6), pp. 711-730.

  • Support New Language in RefDetect (Hemati Moghadam, Zaytsev)

    Recently, we have developed a tool capable of detecting refactorings in Java and C++ applications [1]. The tool is designed to be language-independent and can be easily extended to support new object-oriented programming languages. We are seeking a candidate with strong programming skills and familiarity with Java to join our team and extend the tool to support a new object-oriented programming language.

    As part of the project, the student will use an existing parser to extract code information in the target object-oriented language (e.g., Python) and store it in the tool's existing intermediate data structure. Note that the remaining components responsible for detecting refactorings will remain unchanged and use the information stored in the intermediate data structures. The implemented tool will be thoroughly examined to evaluate its effectiveness in detecting refactorings in the newly supported object-oriented programming language.
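    The extraction step could look roughly like the sketch below, which uses Python's standard `ast` module to collect class names, base classes, and method names. The `ClassInfo` record is a hypothetical stand-in, not RefDetect's actual intermediate data structure, and `ast.unparse` requires Python 3.9 or later; the real extension would populate the tool's own structures.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class ClassInfo:
    """Hypothetical stand-in for an intermediate record describing one class."""
    name: str
    bases: list = field(default_factory=list)
    methods: list = field(default_factory=list)


def extract_classes(source):
    """Collect a ClassInfo record for every class definition in the source."""
    classes = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            classes.append(ClassInfo(
                name=node.name,
                bases=[ast.unparse(base) for base in node.bases],
                methods=[
                    item.name for item in node.body
                    if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
                ],
            ))
    return classes


code = """
class Shape:
    def area(self):
        return 0

class Circle(Shape):
    def area(self):
        return 3.14
"""
for info in extract_classes(code):
    print(info)
```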

    Read the full project description…

  • Extracting Modelling Information using Natural Language Processing (Zameni, van den Bos)

    Supervisors: Tannaz Zameni, Petra van den Bos

    Behavior-Driven Development is an approach to agile software development that focuses on the collaboration of different stakeholders to specify system behavior through scenarios. BDD scenarios provide a structured, textual representation of system behavior, making them valuable resources for software development and testing. In recent work [4][5], we show how to use the information from BDD scenarios for models that are suitable for automatic test case generation. However, manually identifying and organizing information for constructing models from these scenarios can be labor-intensive and prone to errors. By automating the extraction of necessary data through NLP, this project aims to streamline the preliminary phase of model generation, enhancing software development and testing efficiency.

    The project involves exploring existing NLP tools for parsing BDD scenarios and extracting relevant details for BDD Transition Systems [5]. If off-the-shelf solutions fall short, custom implementations will be considered. The project aims to demonstrate the effectiveness of NLP in extracting modelling data from BDD scenarios, potentially improving the integration of model-based testing with behavior-driven development by simplifying the initial modelling stages.

    To start, you can follow the steps below:

    1. Perform a literature search on NLP techniques and study the NLP techniques used to integrate BDD and MBT [1][2][3].
    2. Study the papers that integrate BDD and MBT with formal BDD Transition Systems [4][5].
    3. Select techniques and tools that can be used to extract modelling data.
    4. Apply the NLP techniques to some BDD scenarios and evaluate the results w.r.t. appropriateness for use in testing models.
    5. If the results are not satisfying, consider developing a tool that meets the expectations
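    As a baseline before any real NLP is applied, the structure of a scenario can already be recovered from its keywords. The sketch below groups the steps of a Given/When/Then scenario into preconditions, actions, and outcomes; the slot names, the sample scenario, and the function `parse_scenario` are assumptions for illustration, and extracting entities or state changes from the step text itself is where NLP techniques would come in.

```python
import re

# Matches one step line: a keyword followed by free text.
STEP = re.compile(r"^\s*(Given|When|Then|And|But)\s+(.*)$")


def parse_scenario(text):
    """Group scenario steps by role; And/But steps inherit the preceding keyword."""
    slots = {"Given": "preconditions", "When": "actions", "Then": "outcomes"}
    result = {"preconditions": [], "actions": [], "outcomes": []}
    current = None
    for line in text.splitlines():
        match = STEP.match(line)
        if not match:
            continue  # skip titles, blank lines, and anything that is not a step
        keyword, rest = match.groups()
        if keyword in slots:
            current = slots[keyword]
        if current is not None:
            result[current].append(rest.strip())
    return result


# Made-up scenario, only to show the expected input shape.
scenario = """
Scenario: Withdraw cash
  Given the account balance is 100
  And the card is valid
  When the user withdraws 30
  Then the balance is 70
"""
print(parse_scenario(scenario))
```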


    [1] A. Gupta, G. Poels, and P. Bera, “Generating multiple conceptual models from behavior-driven development scenarios,” Data & Knowledge Engineering, vol. 145, p. 102141, 2023.

    [2] M. Soeken, R. Wille, and R. Drechsler, “Assisted behavior driven development using natural language processing,” in Objects, Models, Components, Patterns, C. A. Furia and S. Nanz, Eds. Berlin, Heidelberg: Springer, 2012, pp. 269–287.

    [3] J. Fischbach, A. Vogelsang, D. Spies, A. Wehrle, M. Junker, and D. Freudenstein, “Specmate: Automated creation of Test Cases from Acceptance Criteria,” in ICST. IEEE, 2020, pp. 321–331.

    [4] T. Zameni, P. van den Bos, J. Tretmans, J. Foederer, and A. Rensink, "From BDD Scenarios to Test Case Generation," ICSTW, Dublin, Ireland, 2023, pp. 36-44

    [5] T. Zameni, P. van den Bos, A. Rensink, J. Tretmans. An Intermediate Language to Integrate Behavior-Driven Development Scenarios and Model-Based Testing. Accepted at VST 2024.