Student project proposal
ID: MP-HW-BLOCK-01
Level: MSc
Title: Reusable Transformer Blocks Synthesis for FPGA Platforms
Contact: dr. ir. Uraz Odyurt, dr. Amirreza Yousefzadeh
==========================================================================================
Description
The Transformer Machine Learning (ML) architecture is being adopted in different fields as an improvement upon the state-of-the-art. One such application is the task of subatomic particle track reconstruction (tracking), which is crucial in High-Energy Physics experiments. While the applications are diverse, the Transformer architecture itself has limited variation in the design of its internals, so the composition of a synthesis catalogue is achievable. We aim to develop tooling for automated, or semi-automated, synthesis of Transformer blocks for FPGAs. Ideally, the tooling should support block definitions at different levels of granularity, paving the way for multi-device deployment.
==========================================================================================
Task
As current tooling does not provide ready HLS equivalents for Transformer models, these models have to be converted manually. The student shall develop such conversions, with as much systematic flexibility as possible. Since large Transformer models exceed the resource limits of FPGAs, it is important for us to be able to support decomposed deployment of models. The student shall not only work on block-wise conversions, but shall also define various virtual model blocks. Which set of original model blocks goes into these virtual blocks is yet to be studied. Initially, the student shall study the state-of-the-art [1, 2] and the available tooling, e.g., hls4ml [3]. A clear workflow with as much automation as possible is expected, alongside performance benchmarking. One or more Transformer models, alongside relevant data sets, shall be provided from our previous work [4].
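As a rough illustration of the decomposition problem described above, the sketch below greedily packs consecutive model blocks into "virtual blocks", each fitting a per-device resource budget. All block names and cost figures are hypothetical stand-ins; in practice the costs would come from HLS resource estimates (e.g., LUT or DSP utilisation), and smarter partitioning strategies are part of what the project should study.

```python
# Hypothetical sketch: group consecutive Transformer blocks into
# "virtual blocks" so that each group fits a per-FPGA resource budget.
# Block names and costs are illustrative, not real HLS estimates.

def partition_blocks(blocks, budget):
    """Group consecutive (name, cost) blocks so each group's total cost
    stays within the budget. Raises if one block alone exceeds it."""
    groups, current, used = [], [], 0.0
    for name, cost in blocks:
        if cost > budget:
            raise ValueError(f"block {name!r} alone exceeds the budget")
        if used + cost > budget:
            # Close the current virtual block and start a new one.
            groups.append(current)
            current, used = [], 0.0
        current.append(name)
        used += cost
    if current:
        groups.append(current)
    return groups

# Toy model: costs could stand in for fractions of one device's resources.
model = [("embed", 0.2), ("attn_0", 0.5), ("ffn_0", 0.4),
         ("attn_1", 0.5), ("ffn_1", 0.4), ("head", 0.1)]
print(partition_blocks(model, budget=1.0))
# [['embed', 'attn_0'], ['ffn_0', 'attn_1'], ['ffn_1', 'head']]
```

Each resulting group corresponds to one candidate virtual block for single-device deployment; which groupings are actually sensible (e.g., keeping an attention block and its feed-forward block together) is exactly the open question noted above.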
An FPGA platform of type Zynq UltraScale+ MPSoC will also be available for experimentation.
==========================================================================================
Application
The project is part of an ongoing effort to train, test and deploy ML models for particle track reconstruction for the HL-LHC at CERN, which will drastically increase the scale and frequency of data generation.
==========================================================================================
References
[1] Alonso, 2021, Elastic-DF: Scaling Performance of DNN Inference in FPGA Clouds through Automatic Partitioning. URL: https://doi.org/10.1145/3470567
[2] Nechi, 2023, FPGA-based Deep Learning Inference Accelerators: Where Are We Standing? URL: https://doi.org/10.1145/3613963
[3] Duarte, 2018, Fast inference of deep neural networks in FPGAs for particle physics. URL: https://doi.org/10.1088/1748-0221/13/07/P07027
[4] Caron, 2024, TrackFormers: In Search of Transformer-Based Particle Tracking for the High-Luminosity LHC Era. URL: https://doi.org/10.48550/arXiv.2407.07179
==========================================================================================