Student project proposal
ID: MP-HW-BLOCK-01
Level: MSc
Title: Reusable Transformer Blocks Synthesis for FPGA Platforms
Contact: dr. ir. Uraz Odyurt, dr. Amirreza Yousefzadeh
==========================================================================================
Description
The Transformer Machine Learning (ML) architecture is being adopted in different fields as an improvement upon the state-of-the-art. One such application is the task of subatomic particle track reconstruction (tracking), which is crucial in High-Energy Physics experiments. While the applications are diverse, the Transformer architecture itself has limited variation in the design of its internals, so the composition of a synthesis catalogue is achievable. We aim to develop tooling for automated, or semi-automated, synthesis of Transformer blocks for FPGAs. Ideally, the tooling should support block definitions at different levels of granularity, paving the way for multi-device deployment.
==========================================================================================
Task
As current tooling does not provide ready HLS equivalents for Transformer models, these models have to be converted manually. The student shall develop such conversions, with as much systematic flexibility as possible. Since large Transformer models exceed the resource limits of FPGAs, it is important for us to be able to support decomposed deployment of models. The student shall not only work on block-wise conversions, but shall also define various virtual model blocks. Which set of original model blocks goes into these virtual blocks is yet to be studied. Initially, the student shall study the state-of-the-art [1, 2] and the available tooling, e.g., hls4ml [3]. A clear workflow with as much automation as possible is expected, alongside performance benchmarking. One or more Transformer models, alongside relevant data sets, shall be provided from our previous work [4].
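As a rough illustration of the decomposition problem described above, the sketch below greedily packs consecutive model blocks into "virtual blocks", each fitting a per-device resource budget. All block names and cost figures are hypothetical stand-ins; in practice the costs would come from HLS resource estimates (e.g., LUT or DSP utilisation), and smarter partitioning strategies are part of what the project should study.

```python
# Hypothetical sketch: group consecutive Transformer blocks into
# "virtual blocks" so that each group fits a per-FPGA resource budget.
# Block names and costs are illustrative, not real HLS estimates.

def partition_blocks(blocks, budget):
    """Group consecutive (name, cost) blocks so each group's total cost
    stays within the budget. Raises if one block alone exceeds it."""
    groups, current, used = [], [], 0.0
    for name, cost in blocks:
        if cost > budget:
            raise ValueError(f"block {name!r} alone exceeds the budget")
        if used + cost > budget:
            # Close the current virtual block and start a new one.
            groups.append(current)
            current, used = [], 0.0
        current.append(name)
        used += cost
    if current:
        groups.append(current)
    return groups

# Toy model: costs could stand in for fractions of one device's resources.
model = [("embed", 0.2), ("attn_0", 0.5), ("ffn_0", 0.4),
         ("attn_1", 0.5), ("ffn_1", 0.4), ("head", 0.1)]
print(partition_blocks(model, budget=1.0))
# [['embed', 'attn_0'], ['ffn_0', 'attn_1'], ['ffn_1', 'head']]
```

Each resulting group corresponds to one candidate virtual block for single-device deployment; which groupings are actually sensible (e.g., keeping an attention block and its feed-forward block together) is exactly the open question noted above.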
An FPGA platform of type Zynq UltraScale+ MPSoC will also be available for experimentation.
==========================================================================================
Application
The project is part of an ongoing effort to train, test and deploy ML models for particle track reconstruction for the HL-LHC at CERN, which will drastically increase the scale and frequency of data generation.
==========================================================================================
References
[1] Alonso, 2021, Elastic-DF: Scaling Performance of DNN Inference in FPGA Clouds through Automatic Partitioning. URL: https://doi.org/10.1145/3470567
[2] Nechi, 2023, FPGA-based Deep Learning Inference Accelerators: Where Are We Standing? URL: https://doi.org/10.1145/3613963
[3] Duarte, 2018, Fast inference of deep neural networks in FPGAs for particle physics. URL: https://doi.org/10.1088/1748-0221/13/07/P07027
[4] Caron, 2024, TrackFormers: In Search of Transformer-Based Particle Tracking for the High-Luminosity LHC Era. URL: https://doi.org/10.48550/arXiv.2407.07179
==========================================================================================