Algorithms and Data Structures for Efficient Analysis of Systems

Background

One branch of theoretical computer science is the analysis of algorithms, protocols, hardware circuits and software. We want to analyse a system such as a piece of software or a hardware circuit for desirable properties, or to automatically generate ("synthesize") systems with desirable properties. Examples of such properties are functional correctness, the absence of forbidden states or deadlocks, and reactivity properties, such as that a system is responsive or that a system can always restart.

To advance this goal, we study algorithms that analyse systems. For example, many properties of reactive systems are specified in the language LTL, and LTL model checking and synthesis are ultimately translated (via algorithms) to a parity game. We then solve the parity game and the outcome answers questions such as:

  1. Does a system fulfill the specification? If not, what is a counterexample?
  2. Can a system, with an additional synthesized controller, fulfill the specification?
  3. How many possible (reachable) system states are there?

Concrete questions that can be solved with these methods are:

  1. Does a coffee machine always respond correctly when you ask for coffee?
  2. Do traffic lights ever enter a forbidden state, causing a collision?
  3. If a car stops in front of a traffic light, will the light turn green eventually?

The core challenge in this area is called state space explosion: in order to analyse a system, we usually need to consider exponentially or even doubly exponentially many different situations. This motivates a lot of research into new algorithms that reduce the number of different situations, often by inventing tricks to analyse similar states simultaneously, or by limiting a system to only "relevant" paths and abstracting away from paths that do not seem relevant.

In the Formal Methods and Tools group at the UT, we mostly work with binary decision diagrams and with parity games. However, we are also interested in studying abstraction techniques such as IC3, guided refinement, partial order reduction, and so forth. Another goal is to ensure high performance of algorithms through smart engineering and coding. There are competitions between scientists, and it would be nice if we could win them by using smart algorithms that are fast!

  • Model checking a transition system

    Systems are often modeled as a labelled transition system: a transition system has system states and transitions between the states. Furthermore, states and transitions can have labels, indicating for example communication, actions, or information about the states. We use the LTSmin toolset that we develop to analyse systems via the PINS interface, which is a way to connect any input specification language to the analysis backends. We can then perform analysis tasks such as computing the number of reachable system states, checking whether a CTL or LTL formula holds, detecting deadlocks, et cetera.
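
    As a minimal illustration of the explicit-state view (the names below are hypothetical and far simpler than the actual PINS interface), consider a labelled transition system and a deadlock check:

      #include <map>
      #include <set>
      #include <utility>
      #include <vector>

      // Minimal explicit labelled transition system (hypothetical sketch):
      // states are integers, transitions carry a label (also an integer).
      struct LTS {
          int initial;
          // transitions[s] = list of (label, successor state) pairs
          std::map<int, std::vector<std::pair<int, int>>> transitions;
      };

      // Explore all reachable states; a reachable state without outgoing
      // transitions is a deadlock.
      std::set<int> find_deadlocks(const LTS& lts) {
          std::set<int> visited, deadlocks;
          std::vector<int> todo = {lts.initial};
          while (!todo.empty()) {
              int s = todo.back(); todo.pop_back();
              if (!visited.insert(s).second) continue;   // already explored
              auto it = lts.transitions.find(s);
              if (it == lts.transitions.end() || it->second.empty())
                  deadlocks.insert(s);                   // no successors
              else
                  for (const auto& t : it->second) todo.push_back(t.second);
          }
          return deadlocks;
      }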

  • Parity games

    A parity game is a two-player game played on a directed graph. Vertices of the graph have numbers called priorities; a play is a path in the graph that ends in a cycle, and the goal for player 0 is that the highest priority on the cycle is even, while the goal for player 1 is that it is odd. We study algorithms that solve parity games: compute who wins the parity game (or parts of it) and compute a winning strategy (choices that lead to a winning cycle).

    Parity games are a very simple concept, but there is an open problem lurking in the shadows: parity games are suspected to have a polynomial-time solution, but none has been found yet! We study a variety of parity game algorithms to learn fundamental properties of what makes parity games difficult.
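
    Most parity game algorithms are built from attractor computations: the set of vertices from which a player can force the play into a given target set. A minimal sketch on an explicit (hypothetical) game representation:

      #include <set>
      #include <vector>

      // Hypothetical explicit parity game: vertex v has an owner (0 or 1),
      // a priority, and lists of successors and predecessors.
      struct Game {
          std::vector<int> owner, priority;
          std::vector<std::vector<int>> succ, pred;
      };

      // Attractor of 'target' for 'player': all vertices from which
      // 'player' can force the play to reach the target set.
      std::set<int> attractor(const Game& g, int player, std::set<int> target) {
          // count[v] = number of successors of v not yet in the set
          std::vector<std::size_t> count(g.succ.size());
          for (std::size_t v = 0; v < g.succ.size(); v++) count[v] = g.succ[v].size();
          std::vector<int> todo(target.begin(), target.end());
          while (!todo.empty()) {
              int v = todo.back(); todo.pop_back();
              for (int u : g.pred[v]) {
                  if (target.count(u)) continue;          // already attracted
                  // u is attracted if its owner wants to go in (one edge
                  // suffices) or has no escape (all edges lead into the set)
                  if (g.owner[u] == player || --count[u] == 0) {
                      target.insert(u);
                      todo.push_back(u);
                  }
              }
          }
          return target;
      }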

  • SAT/SMT solving

    Boolean satisfiability, also called SAT, is a very straightforward problem. Given a set of so-called clauses with the syntax "C := C or C | a | not a", where a is an element of a set of propositions, for example "a or b or not c", find a true-false assignment to the propositions such that all clauses evaluate to true. You might recognize this as the conjunctive normal form of a Boolean formula.
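
    As a toy illustration of the problem statement (and emphatically not how real solvers work), the sketch below tries all 2^n assignments; following the common DIMACS convention, a positive integer v denotes proposition v and -v denotes its negation:

      #include <cstdlib>
      #include <vector>

      // A clause is a disjunction of literals: literal v > 0 means
      // proposition v, literal -v means "not v" (DIMACS convention).
      using Clause = std::vector<int>;

      // Try all 2^n assignments over propositions 1..n; bit i-1 of
      // 'mask' is the truth value of proposition i.
      bool brute_force_sat(const std::vector<Clause>& clauses, int n) {
          for (unsigned long mask = 0; mask < (1UL << n); mask++) {
              bool all_true = true;
              for (const Clause& c : clauses) {
                  bool clause_true = false;
                  for (int lit : c) {
                      bool value = (mask >> (std::abs(lit) - 1)) & 1;
                      if ((lit > 0) == value) { clause_true = true; break; }
                  }
                  if (!clause_true) { all_true = false; break; }
              }
              if (all_true) return true;   // satisfying assignment found
          }
          return false;                    // unsatisfiable
      }

    The example clause "a or b or not c" from above becomes {1, 2, -3}, so brute_force_sat({{1, 2, -3}}, 3) returns true. Modern solvers prune this search space almost entirely.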

    The advantage of SAT is that every problem in NP can be translated (in polynomial time) to a SAT problem. Moreover, solvers developed in the past decade can often solve a SAT problem very quickly, which makes it attractive to simply translate your problem in NP to SAT and have a SAT solver find the solution.

    A close cousin of SAT solving is SMT solving. This is essentially SAT, but in addition to propositions you can mix in extensions called "theories", such as inequalities, arithmetic, floating-point operations, and so on. Through the "magic" of the backend SAT solver enriched with these theories, many problems can simply be translated to a SAT/SMT problem and "magically" solved by the SMT solver.
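
    For a concrete impression, here is a minimal sketch using the C++ API of the SMT solver Z3, mixing Booleans with linear integer arithmetic:

      #include <iostream>
      #include <z3++.h>

      int main() {
          z3::context c;
          z3::expr a = c.bool_const("a");
          z3::expr x = c.int_const("x");       // integer theory variable
          z3::solver s(c);
          s.add(a || x > 10);                  // a OR x > 10
          s.add(!a);                           // not a, so x > 10 must hold
          s.add(x < 13);
          if (s.check() == z3::sat) {
              z3::model m = s.get_model();
              std::cout << "x = " << m.eval(x) << "\n";   // prints 11 or 12
          }
          return 0;
      }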

  • Symbolic model checking

    A core problem in the analysis and construction of systems is the state space explosion problem. In order to analyse a system, we need to consider all possible states, evaluate all different interleavings of communicating processes, all combinations of subsystem states, and so forth. This results in what is called an exponential blowup, where adding one component doubles the number of states of the system. The state space explosion problem motivates a lot of research in model checking and synthesis.

    One approach that deals with state space explosion is to not consider/evaluate every single state of the system (called "explicit-state" evaluation) but to reason about (abstract) sets of states. If these sets are concisely represented, we potentially save exponential space and time, and thus avoid the state space explosion problem. Essentially this is a kind of abstraction technique (and one can consider many techniques as merely abstraction techniques). In particular, at the UT we use binary decision diagrams to represent a set of states as a Boolean formula that is encoded as a decision graph. Clever use of BDDs lets us analyse systems that sometimes have many billions of system states within seconds, when they can be encoded by relatively small BDDs.
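
    The heart of such a symbolic analysis is a fixpoint computation over sets of states. The following sketch uses a hypothetical BDD interface (Sylvan offers comparable operations, but the names and signatures here are illustrative only, and the primitives are left abstract):

      // Hypothetical BDD interface; reduced ordered BDDs are canonical,
      // so equal sets are represented by the same node (pointer equality).
      struct BDD;
      BDD* bdd_or(BDD* a, BDD* b);              // union of two state sets
      BDD* bdd_relnext(BDD* states, BDD* rel);  // successors via relation

      // Symbolic reachability: add successors of the whole current set
      // at once, until the set no longer changes (a fixpoint).
      BDD* reachable(BDD* initial, BDD* relation) {
          BDD* reached = initial;
          BDD* previous = nullptr;
          while (reached != previous) {
              previous = reached;
              reached = bdd_or(reached, bdd_relnext(reached, relation));
          }
          return reached;
      }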

Prerequisites

Related Modules in the Bachelor TCS

AVAILABLE PROJECT PROPOSALS

If you are interested in the general topic, or if you have your own project idea related to the topic, please contact us directly. Open ideas for high-performance algorithms and data structures are also welcome. We have a focus on efficient implementations and empirical evaluation of algorithms.

Alternatively, you can also work on one of the following concrete project proposals:

  • Delaying strategy iteration algorithms

    Supervisor: Tom van Dijk

    We want a better understanding of what makes parity games hard to solve. We do this by finding parity games that delay a parity game solving algorithm as long as possible. To construct such games, we need general principles for building hard game graphs.

    This topic is specifically about strategy iteration algorithms. These algorithms follow a general principle: fix a strategy for one player, then let the other player find the best counterplay according to some evaluation heuristic. Then fix an improved strategy for the first player and repeat the process. In order to delay strategy iteration, a lot of earlier work by Oliver Friedmann and others made use of certain structures in the parity game. Nowadays we have more insights into parity games, such as more creative use of attractors (which can circumvent the features of Friedmann's countergames) and an understanding of so-called distractions and tangles. The goal of this study is to apply these new ideas and to find general principles for creating parity games that are hard to solve for modern strategy iteration algorithms.
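
    Schematically, and with all algorithm-specific parts left abstract, strategy iteration looks as follows; the choice of evaluation and improvement rule is exactly where concrete algorithms (and the known hard inputs) differ:

      #include <vector>

      // Schematic strategy iteration; all names are illustrative.
      // strategy[v] = successor chosen at each vertex owned by player 0.
      using Strategy  = std::vector<int>;
      using Valuation = std::vector<long>;    // value of the play per vertex

      Valuation evaluate(const Strategy& s);  // best counterplay of player 1
      Strategy  improve(const Strategy& s, const Valuation& val);

      Strategy strategy_iteration(Strategy current) {
          while (true) {
              Valuation val  = evaluate(current);      // opponent plays optimally
              Strategy  next = improve(current, val);  // apply improving switches
              if (next == current) return current;     // no switch left: optimal
              current = next;
          }
      }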

  • Symbolic LTL synthesis

    Supervisor: Tom van Dijk

    In recent work, one of our MSc students (Remco Abraham) realized a first prototype of an almost fully symbolic pipeline from LTL specification to AIGER controller for LTL synthesis. However, there is plenty of room for improvement, as indicated by the amount of suggested future work. The goal of the project is to look at this future work and decide on the next steps.

  • Symbolic automata determinization

    Supervisor: Tom van Dijk

    An open problem for LTL synthesis using BDDs is the determinization of symbolic automata. We typically start with non-deterministic automata, which then need to be determinized. This involves an exponential blowup. There is not a lot of research into determinization using symbolic (BDD-based) methods, but this could be very useful for a fully symbolic LTL synthesis pipeline.
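
    For automata on finite words, the classic explicit subset construction below shows where the blowup comes from: every subset of the original state set can become a state of the deterministic automaton. (The omega-automata used for LTL synthesis require considerably more involved constructions, e.g. Safra-style trees, and doing any of this with BDDs is the open question.)

      #include <map>
      #include <set>
      #include <vector>

      // Explicit subset construction for an NFA (accepting states omitted).
      // delta[q][a] = set of successors of state q on input letter a.
      using StateSet = std::set<int>;

      std::map<StateSet, std::map<int, StateSet>>
      determinize(const std::vector<std::vector<StateSet>>& delta,
                  const StateSet& initial, int alphabet_size) {
          std::map<StateSet, std::map<int, StateSet>> dfa;
          std::vector<StateSet> todo = {initial};
          while (!todo.empty()) {
              StateSet S = todo.back(); todo.pop_back();
              if (dfa.count(S)) continue;        // macro-state already built
              for (int a = 0; a < alphabet_size; a++) {
                  StateSet T;                    // successor macro-state
                  for (int q : S)
                      T.insert(delta[q][a].begin(), delta[q][a].end());
                  dfa[S][a] = T;
                  todo.push_back(T);             // up to 2^n macro-states
              }
          }
          return dfa;
      }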

  • Incremental parity game solving

    Supervisor: Tom van Dijk

    Since generating the parity game is more costly than solving it, we want to explore a setting where partially generated parity games are solved. Ideally, intermediate outcomes can be reused in later attempts. This study is about figuring out how to do this with a variety of algorithms and determining which approach is most useful in practice.

  • Symbolic parity game algorithms

    Supervisor: Tom van Dijk

    We currently have a few fast parity game algorithms using BDDs. However, these algorithms are quite vulnerable to particular parity games. The goal of the project is to study different symbolic versions of parity game algorithms, such as tangle learning, tangle attractors, fixpoints with justification, and possibly quasi-polynomial algorithms.

  • Reducing AIGER circuit size

    Supervisor: Tom van Dijk

    Hardware circuits are modeled using and-inverter graphs (AIGs), a graph structure with AND gates, latches, and negations. It is very important to reduce the size of the AIG. We want to explore various analysis techniques, including bisimulation minimisation, substitutions, logic rewriting, etc.
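
    Schematically, an AIG needs very little structure, which is what makes it amenable to rewriting; the in-memory representation below is a hypothetical sketch (not the AIGER file format), using the usual convention that a literal is a node index with an inversion bit:

      #include <cstdint>
      #include <vector>

      // Hypothetical in-memory AIG. Every combinational node is a 2-input
      // AND; a "literal" is a node index shifted left with an inversion
      // bit, so negation costs nothing. State is kept in latches.
      struct AIG {
          struct And { uint32_t left, right; };  // two input literals
          uint32_t num_inputs;
          std::vector<And> ands;                 // topologically ordered gates
          std::vector<uint32_t> latches;         // next-state literal per latch
          std::vector<uint32_t> outputs;         // output literals
      };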

  • Theorem proving tangle learning properties

    Supervisors: Tom van Dijk, Peter Lammich

    Tangle learning is a promising family of algorithms to crack the open parity game problem (finding a polynomial-time algorithm). Eventually, we want to be able to prove properties of tangle learning algorithms using a theorem prover like Isabelle. This topic is about formalizing properties of tangle learning in Isabelle.

    (This topic is relatively low-risk because once you learn how to use Isabelle, you can just start proving lemmata.)

  • Theorem proving parity game algorithms

    Supervisors: Tom van Dijk, Peter Lammich

    There are various parity game algorithms, and while we believe that the proofs of correctness are valid, it would be even better to prove this systematically using an interactive theorem prover such as Isabelle. The goal of the study is to look at algorithms such as priority promotion, quasi-polynomial Zielonka, and fixpoint algorithms in order to prove their correctness in Isabelle. Another challenge is to prove that the "two binary counters" parity game family is indeed exponentially hard for various solvers.

  • Quasi-polynomial tangle learning

    Supervisor: Tom van Dijk

    One of the most robust algorithms (most resistant to hard parity games) is tangle learning. But this algorithm, too, has families of games on which it requires exponential time. Recently, Pawel Parys presented an algorithm whose complexity is strictly below exponential ("quasi-polynomial"). In theory, it is straightforward to apply tangle learning to this algorithm. The topic is about implementing this algorithm (the easy part), and then finding lower bound examples for QTL (the challenging part).

  • Zero-suppressed decision diagrams

    Supervisor: Tom van Dijk

    There are different ways to minimize the size of a decision diagram by removing redundant decisions. Traditionally, a decision is redundant when its "true" branch and "false" branch are equal. But for some applications such redundant decisions are rare, and more often the "true" branch leads to the False terminal; different types of decision diagrams that deal with such situations include ZDDs and TBDDs. Currently Sylvan has a prototype <zdd> branch and a prototype <tbdd> branch. The goal is to integrate these branches with the master branch and with LTSmin, and to come up with a comprehensive evaluation of the three types.
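
    The difference between these types is concentrated in a single spot: the reduction rule applied when a node is created. A minimal sketch with a hypothetical unique-table interface:

      // Hypothetical unique-table interface for canonical diagrams.
      struct Node;
      extern Node* FALSE_TERMINAL;
      Node* lookup_or_insert(int var, Node* lo, Node* hi);

      // BDD rule: a node whose two branches are equal is redundant,
      // because the decision does not matter.
      Node* bdd_node(int var, Node* lo, Node* hi) {
          if (lo == hi) return lo;
          return lookup_or_insert(var, lo, hi);
      }

      // ZDD rule: a node whose "true" branch is the False terminal is
      // redundant, so variables that must be false are simply skipped.
      Node* zdd_node(int var, Node* lo, Node* hi) {
          if (hi == FALSE_TERMINAL) return lo;
          return lookup_or_insert(var, lo, hi);
      }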

  • LTL model checking in LTSmin using modern approaches

    Supervisor: Tom van Dijk

    Currently, LTSmin depends on somewhat older tools for converting LTL specifications to an LTS that is then used as a monitor to check specifications. This project is about studying newer approaches, such as conversion to the Delta2 normal form, using different tools like Owl to preprocess the specification, and newer algorithms such as symbolic parity game solvers to do the model checking.

  • Parallel variable reordering

    Supervisor: Tom van Dijk

    Essential for effective use of binary decision diagrams is a smart order of the BDD variables: a bad ordering can make the BDD exponentially larger. Sometimes it is difficult to come up with a smart ordering by hand, so there exist (very expensive) algorithms that automatically reorder the BDD variables to find a better ordering. The goal of the research is to implement the basic sifting algorithm and then fine-tune it within the parallel BDD package Sylvan.
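
    The basic idea of sifting fits in a few lines: move one variable through all levels by adjacent swaps (a swap is a local operation on two levels that preserves the represented functions), remember the position where the total BDD size was smallest, and move the variable back there. A schematic sketch with hypothetical primitives:

      #include <cstddef>

      // Hypothetical primitives of the reordering machinery.
      void swap_with_next_level(int level);   // swap levels 'level', 'level'+1
      std::size_t total_bdd_size();           // current number of BDD nodes

      // Sift the variable currently at 'level' down to 'bottom', remember
      // where the total size was smallest, and move it back there. (Full
      // sifting also explores the levels above; omitted for brevity.)
      void sift_down(int level, int bottom) {
          std::size_t best_size = total_bdd_size();
          int best_pos = level;
          for (int pos = level; pos < bottom; pos++) {
              swap_with_next_level(pos);       // variable moves one level down
              if (total_bdd_size() < best_size) {
                  best_size = total_bdd_size();
                  best_pos = pos + 1;
              }
          }
          for (int pos = bottom; pos > best_pos; pos--)
              swap_with_next_level(pos - 1);   // move back up to best position
      }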

  • PINS 2.0

    Supervisor: Tom van Dijk

    A core part of the success of LTSmin is the PINS interface, which separates the analysis backends from the specification languages. Thanks to PINS, many different formalisms can be hooked up to LTSmin without fundamentally changing the backends. However, the interface has accumulated feature bloat and technical debt, which need to be cleaned up. That is the goal of this research.

  • Does your model make sense?

    Supervisor: Mariëlle Stoelinga

    Model checking is only as good as the models you make: if your models do not reflect reality, then the results of the analysis (e.g. via model checking) are not useful. Thus, model validation is an important task. It is also time consuming, and, since similar properties are verified many times, there is room for automation here.
    The purpose of this assignment is to make model validation easier by checking a number of properties fully automatically: are all locations reachable? Is the model input-enabled? Can time always progress, even in a worst-case environment?
    These questions can be checked automatically, but their formulation in terms of model queries (posed e.g. in temporal logic) depends on the model. Thus, this project aims at formulating a list of validation properties and implementing these in a model checker, e.g. in Uppaal.

  • Which car should you buy? Analysis of product variations with SAT solving (at BetterBe.com)

    Supervisor: Mariëlle Stoelinga

    BetterBe is a company, based in Enschede, that provides web services to automotive lease companies, in particular to calculate the price of a particular lease car.
    Calculating this price is more complex than one may think, because of the large number of intricate dependencies between the various options. For instance: if you opt for a sports package, you cannot choose certain radios (reason: the leather steering wheel coat does not have the right holes for the radio buttons). There are hundreds of such dependency rules. Therefore, figuring out whether a certain set of options is feasible, and determining its price, is non-trivial.
    SAT solving seems a natural technology to tackle this challenge: SAT is an advanced method to determine whether or not a logical formula is satisfiable. Advanced tool support, such as MathSAT and Z3, is available. Existing SAT solvers, however, are good at solving large queries. What is needed here is to solve a large number of small queries.
    Thus, the goal of this project is to (1) model the price calculations as a SAT solving problem, and (2) see if SAT solving techniques can be tailored to the specific case of running many small queries.
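
    To sketch part (1): each option becomes a Boolean variable and each dependency rule becomes a constraint, so that a configuration is feasible exactly when the constraints are satisfiable. Using Z3's C++ API (the option names are, of course, made up):

      #include <iostream>
      #include <z3++.h>

      int main() {
          z3::context c;
          // One Boolean per option (names are illustrative).
          z3::expr sports_package = c.bool_const("sports_package");
          z3::expr radio_x        = c.bool_const("radio_x");
          z3::expr leather_wheel  = c.bool_const("leather_wheel");

          z3::solver s(c);
          // Dependency rules become constraints:
          s.add(z3::implies(sports_package, leather_wheel)); // package includes wheel
          s.add(z3::implies(leather_wheel, !radio_x));       // wheel excludes radio X

          // Query: is this particular configuration feasible?
          s.add(sports_package && radio_x);
          std::cout << (s.check() == z3::sat ? "feasible" : "infeasible") << "\n";
          return 0;
      }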

  • Improving Fault Tree analysis via Binary Decision Diagrams

    Supervisors: Matthias Volk, Tom van Dijk

    Fault trees are a common formalism to model how failures occur in system components and how they eventually lead to overall system failure. Analysing fault trees reveals how likely system failures are and whether the reliability of components needs to be improved.

    As fault trees can contain thousands of elements, their analysis must be efficient in both time and memory consumption. One common approach is to use binary decision diagrams (BDDs) as a data structure. A BDD is a graph which encodes a Boolean function in a compact representation. Translating a fault tree into a BDD yields an efficient representation which allows the analysis of large fault trees. However, the memory consumption of the BDD depends on certain factors, such as the variable ordering or the order in which smaller BDDs are composed during the translation.
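
    The translation itself is a bottom-up composition: each basic event becomes a BDD variable and each gate combines the BDDs of its children. A sketch with a hypothetical BDD interface (Sylvan provides corresponding operations, but the names here are illustrative); note that the order in which children are combined influences the intermediate BDD sizes, which is one of the questions below:

      #include <vector>

      // Hypothetical BDD interface (illustrative, not Sylvan's actual API).
      struct BDD;
      BDD* bdd_var(int i);                 // BDD for basic event i
      BDD* bdd_and(BDD* a, BDD* b);
      BDD* bdd_or(BDD* a, BDD* b);

      // A static fault tree node: a basic event, or an AND/OR gate over
      // its children (gates are assumed to have at least one child).
      struct FTNode {
          enum { BASIC, AND, OR } kind;
          int event;                        // for BASIC
          std::vector<FTNode*> children;    // for AND / OR
      };

      // Bottom-up translation: the resulting BDD encodes exactly the sets
      // of failed basic events that make the top-level event fail.
      BDD* to_bdd(const FTNode* n) {
          if (n->kind == FTNode::BASIC) return bdd_var(n->event);
          BDD* result = to_bdd(n->children[0]);
          for (std::size_t i = 1; i < n->children.size(); i++) {
              BDD* child = to_bdd(n->children[i]);
              result = (n->kind == FTNode::AND) ? bdd_and(result, child)
                                                : bdd_or(result, child);
          }
          return result;
      }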

    The goal of this research project is to improve the existing fault tree analysis. The fault tree analysis we look at is implemented in the Storm model checker and uses the Sylvan library to create and handle BDDs. Building upon Sylvan and Storm, we aim to extend and optimize the existing implementation of fault tree analysis via BDDs.

    Tasks

    Possible research tasks are:

    • Find heuristics for good variable orderings which yield small BDDs.
    • Investigate whether different orders for the composition of sub-BDDs affect the peak memory requirement.
    • Investigate whether multiple cores can be further exploited during the analysis.
    • Integrate modular approaches which only require parts of the BDD to be stored in memory at a time.
    • Bring in your own ideas.

    Requirements

    • Knowledge about fault trees and binary decision diagrams is beneficial but not required.
    • Experience with C/C++ is strongly recommended as the implementation uses these languages.
  • Safety aspects of decentralised traffic management systems

    Supervisor: Moritz Hahn, Nico Plat

    Traffic Management refers to the combination of measures that serve to preserve traffic capacity and improve the security, safety and reliability of the overall road transport system. These measures make use of intelligent transport systems (ITS), services and projects in day-to-day operations that impact road network performance. As an example, consider a part of a motorway on which cars are supposed to slow down during times of high traffic density or accidents [1, Chapter 7.3].

    A system to implement this requirement will need sensors to count the cars passing by and determine their velocity. It will also need a number of electronic speed limit signs, also known as matrix signs, and more generally denoted as actuators. Matrix signs are placed roughly 500 meters apart from each other. Each of them will show the same or a lower speed limit than the previous one, such that cars slow down in a controlled manner. The speed limits are calculated from the data obtained from the sensors, so as to adapt to the current traffic situation.

    In the Netherlands traffic management is performed in a centralised manner, i.e. from one of the five traffic management centres in the country. Thus, in the above scenario, there would be a central instance of the system, evaluating the input from all the sensors and controlling all the speed limit displays.

    Migrating towards a more decentralised setup has a number of advantages:

    • Costs. Requirements for fallback on a central system will decrease, making the setup cheaper, since the central system is no longer a single point of failure and becomes more of a backup option (if needed at all).
    • Feasibility. As road utilisation increases, increasingly complex traffic management becomes necessary, and a single instance that centrally performs all decisions may become too complex.
    • Robustness. The system will be less vulnerable to communication interruptions between the central system and decentralised components, and to unavailability of the central system itself, as the decentralised sensors and actuators act as individual, independent agents.

    In the above scenario, the individual units of speed signs and traffic sensors would each have their own controller. They would also have means of interacting with their immediate neighbours (both upstream/downstream as well as on adjacent lanes). Ideally, the interplay between these units and the cars driving on the road will then lead to traffic management which is at least as safe and efficient as the centralised one that is currently in use.

    The purpose of this project is to model a scenario such as the one above in a formal modelling language, such as Promela [2] or SMV [3]. The level of detail of the model is to be decided while performing the project. We assume that it will be necessary to model, for example, the behaviour of each traffic control unit, physical properties of the cars in an abstract manner, assumptions about the behaviour of drivers, the number of cars entering the scenario, etc. The formal model obtained this way shall then be used to show the safety of the scenario, or to show which constraints have to be imposed to obtain safety, using model checking.

    References:

    1. John Fitzgerald, Peter Gorm Larsen, Paul Mukherjee, Nico Plat, Marcel Verhoef: Validated Designs for Object-oriented Systems. Springer London. ISBN 978-1-85233-881-7
    2. Gerard J. Holzmann: The SPIN Model Checker - Primer and Reference Manual. Addison-Wesley, 2004. ISBN 978-0-321-22862-8
    3. Alessandro Cimatti, Edmund M. Clarke, Enrico Giunchiglia, Fausto Giunchiglia, Marco Pistore, Marco Roveri, Roberto Sebastiani, Armando Tacchella: NuSMV 2: An OpenSource Tool for Symbolic Model Checking. CAV 2002: 359-364

CONTACT