Software Engineering & Evolution

Software engineering is the most technical part of technical computer science, focused on developing systematic principles, common to other kinds of engineering (like mechanical or electrical engineering), and applying them to the development of software systems. In particular, it covers:

Software evolution, in particular, is a branch of software engineering focused on studying existing software rather than necessarily creating new software. It covers, among other topics:

A typical research project in software engineering involves implementing a software system at least up to a fully functioning prototype and performing a feasibility study and/or a user study. A typical software evolution project involves developing a software system that analyses or transforms another software system. Both often use methodologies from empirical software engineering.

Prerequisites

Related Modules

Available Project Proposals

If you are interested in the general topic of Software Engineering and Evolution, or if you have your own project idea related to the topic, please contact us directly. Alternatively, you can also work on one of the following concrete project proposals:

  • CFG-to-VPG (Zaytsev)

    Visibly Pushdown Grammars (VPG) are a secret level of the Chomsky hierarchy of languages, hiding between the class of regular languages, corresponding to the regular expressions we all know and love (and sometimes abuse), and the omnipresent context-free grammars (CFG, used in tools like ANTLR or bison). Essentially, they correspond to a class of automata which look like the pushdown automata of the CFG world, except that you can only push and pop "brackets" and not just any value on the stack. Visibly Pushdown Grammars are known to be able to parse the Dyck language of balanced brackets, which is pretty much enough to parse real languages like XML.

    Can we build a tool that takes any ANTLR grammar and converts it automatically to an equivalent VPG?
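
    To get a feel for the formalism, here is a minimal sketch (our illustration, not part of any existing tool) of the automaton idea behind VPGs: the alphabet is split into call, return, and internal symbols, so the stack behaviour is fully visible in the input. The bracket sets below are hypothetical examples.

    ```python
    # Sketch of a visibly pushdown automaton recognising a Dyck-style language.
    CALLS = {"(", "[", "<"}                    # hypothetical call symbols: must push
    RETURNS = {")": "(", "]": "[", ">": "<"}   # hypothetical return symbols: must pop a matching call

    def accepts_dyck(word: str) -> bool:
        """Accept iff every call symbol is matched by the right return symbol."""
        stack = []
        for symbol in word:
            if symbol in CALLS:                # call symbol: the automaton must push
                stack.append(symbol)
            elif symbol in RETURNS:            # return symbol: the automaton must pop
                if not stack or stack.pop() != RETURNS[symbol]:
                    return False
            # any other symbol is internal: the stack is left untouched
        return not stack                       # accept iff all calls have been matched

    print(accepts_dyck("([<ab>])"))  # True
    print(accepts_dyck("([)]"))      # False
    ```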

  • Codebase Modernity Meter (Gerhold, Zaytsev)

    The concept of “modernity” has been a topic of discussion in many fields, including philosophy, sociology and technology. In the software engineering context, modernity can be defined as the extent to which the source code of a software system utilises new features and capabilities of the programming language it is written in. In software evolution, we define modernity as a scale for measuring the age of a codebase, expressed in language levels/versions. For instance, your Python code can have a modernity of 3.11 if it runs fine with Python 3.11 but fails with an error under Python 3.10.

    There are two ways to take this project…
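
    As a purely illustrative sketch of what a crude, syntax-only modernity check could look like (the feature-to-version table below is a hypothetical example, not the project's intended design):

    ```python
    # Estimate a lower bound on the Python version needed to parse a piece of source code.
    # Note: this checker must itself run on a recent interpreter, and it only looks at
    # syntax, not at library calls or runtime behaviour.
    import ast

    # hypothetical mapping from an AST node type (if this interpreter knows it)
    # to the first Python version whose grammar introduces it
    FEATURE_VERSIONS = {
        name: version
        for name, version in [("Match", (3, 10)), ("TryStar", (3, 11))]
        if hasattr(ast, name)
    }

    def syntactic_modernity(source: str) -> tuple:
        """Return the lowest Python version whose syntax features appear in `source`."""
        modernity = (3, 0)
        for node in ast.walk(ast.parse(source)):
            for name, version in FEATURE_VERSIONS.items():
                if isinstance(node, getattr(ast, name)) and version > modernity:
                    modernity = version
        return modernity

    print(syntactic_modernity("match command:\n    case 'quit':\n        pass\n"))  # (3, 10)
    ```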

  • How complex is your model? (van der Wal, Stoelinga)

    Supervisors: Djurre van der Wal, Mariëlle Stoelinga

    In software engineering, code metrics tell us something about the complexity of a program. Various metrics exist, ranging from simple ones like the number of lines of code to advanced ones like cyclomatic complexity.

    Code metrics can tell us something about code quality. This is extremely important: good code is easy to maintain, refactor, and port, while bad code costs time, money and frustration.
    However, for software models, few such metrics exist, even though they could be just as useful for models as they are for code. The idea of this project is therefore to come up with model metrics and to evaluate whether they make sense.

    Note that various models could be studied: finite state automata, UML models, fault trees, etc.
    Even though no company is involved, this is relevant for industry; we can involve companies later.

    Tasks:

    • Study the state of the art
    • Define model metrics
    • Evaluate on relevant case studies or examples: do these metrics really tell us what we want?
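
    As an illustration of the kind of metric the project could define, here is a minimal sketch that treats a finite state automaton as a graph and computes a few candidate numbers; the Automaton structure and the metric set are hypothetical examples, not a proposed answer:

    ```python
    # Candidate "model metrics" for a finite state automaton, seen as a labelled graph.
    from dataclasses import dataclass

    @dataclass
    class Automaton:
        states: set
        transitions: set  # set of (source, label, target) triples

    def model_metrics(a: Automaton) -> dict:
        n_states = len(a.states)
        n_transitions = len(a.transitions)
        labels = {label for _, label, _ in a.transitions}
        return {
            "states": n_states,
            "transitions": n_transitions,
            "alphabet_size": len(labels),
            # analogue of McCabe's cyclomatic complexity for one connected model
            "cyclomatic": n_transitions - n_states + 2,
            # average branching factor: outgoing transitions per state
            "avg_out_degree": n_transitions / n_states if n_states else 0.0,
        }

    coffee = Automaton(
        states={"idle", "paid", "brewing"},
        transitions={("idle", "coin", "paid"),
                     ("paid", "button", "brewing"),
                     ("brewing", "done", "idle")},
    )
    print(model_metrics(coffee))
    ```
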
  • Hyperparameters' Impact on the Energy Consumption of LLMs Supporting Software Development (Castor)

    Supervisor: Fernando Castor

    Generative AI and coding assistants are revolutionizing software development, but their energy demands are rapidly escalating. At the same time, concerns about data privacy have driven many developers and organizations to consider deploying their own local AI assistants. The widespread adoption of large language models (LLMs) presents a critical trade-off: balancing energy consumption with task accuracy.

    This study aims to investigate the impact of hyperparameter adjustments such as temperature, top-p and max_output_tokens on energy consumption and accuracy in two common software development tasks: code generation and bug fixing. By evaluating a diverse set of language models on an AI-specific GPU to replicate real-world scenarios, we aim to provide actionable insights that guide developers in optimizing the deployment of LLMs for efficiency and performance.
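
    A minimal sketch of the kind of measurement loop this could involve, assuming a locally hosted open model served via Hugging Face transformers and codecarbon as a rough stand-in for proper GPU power measurement (the model name, prompt and hyperparameter grid below are placeholders, and max_new_tokens plays the role of max_output_tokens):

    ```python
    # Sweep sampling hyperparameters and record an energy/emissions estimate per run.
    from itertools import product
    from codecarbon import EmissionsTracker   # energy/CO2 estimation
    from transformers import pipeline         # local text-generation model

    generator = pipeline("text-generation", model="bigcode/starcoder2-3b")  # placeholder model
    prompt = "def fibonacci(n):"                                            # placeholder coding task

    for temperature, top_p in product([0.2, 0.8], [0.5, 0.95]):
        tracker = EmissionsTracker()
        tracker.start()
        completion = generator(prompt, do_sample=True, temperature=temperature,
                               top_p=top_p, max_new_tokens=128)
        emissions = tracker.stop()  # estimated kg CO2-eq; energy details go to codecarbon's log
        print(temperature, top_p, emissions)
        # accuracy would additionally be scored against a benchmark, e.g. functional tests
    ```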

  • Support New Language in RefDetect (Hemati Moghadam, Zaytsev)

    Recently, we have developed a tool capable of detecting refactorings in Java and C++ applications [1]. The tool is designed to be language-independent and can be easily extended to support new object-oriented programming languages. To further expand its capabilities, we are seeking a candidate with strong programming skills and familiarity with Java to join our team and extend the tool to support a new object-oriented programming language.

    As part of the project, the candidate student will use the current parser to extract code information in the target object-oriented language (e.g., Python) and store it in an existing intermediate data structure. Note that the remaining components responsible for detecting refactorings will remain unchanged and use the information stored in the intermediate data structures. The implemented tool will be thoroughly examined to evaluate its effectiveness in detecting refactorings in the newly supported object-oriented programming language.
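
    As a rough sketch of what the extraction step could look like for Python (the ClassInfo/MethodInfo structures below are hypothetical simplifications; RefDetect's actual intermediate data structure is richer):

    ```python
    # Extract class and method information from Python source with the standard `ast` parser
    # and store it in a simple, language-independent intermediate structure.
    import ast
    from dataclasses import dataclass, field

    @dataclass
    class MethodInfo:
        name: str
        parameters: list

    @dataclass
    class ClassInfo:
        name: str
        superclasses: list
        methods: list = field(default_factory=list)

    def extract_classes(source: str) -> list:
        classes = []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.ClassDef):
                info = ClassInfo(name=node.name,
                                 superclasses=[ast.unparse(base) for base in node.bases])
                for item in node.body:
                    if isinstance(item, ast.FunctionDef):
                        info.methods.append(
                            MethodInfo(item.name, [arg.arg for arg in item.args.args]))
                classes.append(info)
        return classes

    print(extract_classes("class Stack(list):\n    def push(self, x):\n        self.append(x)\n"))
    ```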

    Read the full project description…

  • Extracting Modelling Information using Natural Language Processing (Zameni, van den Bos)

    Supervisors: Tannaz Zameni, Petra van den Bos

    Behavior-Driven Development (BDD) is an approach to agile software development that focuses on the collaboration of different stakeholders to specify system behavior through scenarios. BDD scenarios provide a structured, textual representation of system behavior, making them valuable resources for software development and testing. In recent work [4][5], we show how to use the information from BDD scenarios to construct models that are suitable for automatic test case generation. However, manually identifying and organizing information for constructing models from these scenarios can be labor-intensive and prone to errors. By automating the extraction of the necessary data through NLP, this project aims to streamline the preliminary phase of model generation, enhancing software development and testing efficiency.

    The project involves exploring existing NLP tools for parsing BDD scenarios and extracting relevant details for BDD Transition Systems [5]. If off-the-shelf solutions fall short, custom implementations will be considered. The project aims to demonstrate the effectiveness of NLP in extracting modelling data from BDD scenarios, potentially improving the integration of model-based testing with behavior-driven development by simplifying the initial modelling stages.

    To start, you can follow the steps below:

    1. Perform a literature search on NLP techniques and study the NLP techniques used to integrate BDD and MBT [1][2][3]
    2. Study the papers that integrate BDD and MBT with formal BDD Transition Systems [4][5]
    3. Select techniques and tools that can be used to extract modelling data
    4. Apply the NLP techniques to some BDD scenarios and evaluate the results w.r.t. their appropriateness for use in testing models
    5. If the results are not satisfying, consider developing a tool that meets the expectations
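
    As a minimal, purely illustrative sketch of step 4 (the scenario text and the Given/When/Then split are examples; how the extracted steps map onto BDD Transition Systems [5] is exactly the research question):

    ```python
    # Split a Gherkin-style BDD scenario into (keyword, text) steps that could later feed
    # a model: Given-steps suggest preconditions, When-steps stimuli, Then-steps observations.
    import re

    SCENARIO = """\
    Scenario: Successful withdrawal
      Given the account balance is 100 euros
      When the user withdraws 40 euros
      Then the new balance is 60 euros
    """

    STEP = re.compile(r"^\s*(Given|When|Then|And|But)\s+(.*)$")

    def extract_steps(scenario: str) -> list:
        """Return (keyword, text) pairs; 'And'/'But' inherit the previous keyword."""
        steps, current = [], None
        for line in scenario.splitlines():
            match = STEP.match(line)
            if match:
                keyword, text = match.groups()
                current = current if keyword in ("And", "But") else keyword
                steps.append((current, text))
        return steps

    print(extract_steps(SCENARIO))
    ```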

    References:

    [1] A. Gupta, G. Poels, and P. Bera, “Generating multiple conceptual models from behavior-driven development scenarios,” Data & Knowledge Engineering, vol. 145, p. 102141, 2023.

    [2] M. Soeken, R. Wille, and R. Drechsler, “Assisted behavior driven development using natural language processing,” in Objects, Models, Components, Patterns, C. A. Furia and S. Nanz, Eds. Berlin, Heidelberg: Springer, 2012, pp. 269–287.

    [3] J. Fischbach, A. Vogelsang, D. Spies, A. Wehrle, M. Junker, and D. Freudenstein, “Specmate: Automated creation of Test Cases from Acceptance Criteria,” in ICST. IEEE, 2020, pp. 321–331.

    [4] T. Zameni, P. van den Bos, J. Tretmans, J. Foederer, and A. Rensink, “From BDD Scenarios to Test Case Generation,” in ICSTW, Dublin, Ireland, 2023, pp. 36–44.

    [5] T. Zameni, P. van den Bos, A. Rensink, and J. Tretmans, “An Intermediate Language to Integrate Behavior-Driven Development Scenarios and Model-Based Testing,” accepted at VST 2024.

Contact