Diving video classification with VLM foundation models

Master Assignment

Type: Master CS

Student: Unassigned

Duration: TBD

If you are interested, please contact:

Background:

In recent years, fine-grained video classification has gained popularity as a video understanding task. Telling apart actions that differ only in small details remains challenging, as illustrated by the diving video dataset Diving48.

In this project, you will investigate how action classification models and large-scale self-supervised models encode videos, and how the differences between these encodings affect classification performance.

You can look at:

State-of-the-art (SoA) methods, such as the ones above, should be compared based on the quality of their embeddings. Defining the metrics for embedding quality is up to you.
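As one possible starting point for such a metric, and not something the assignment prescribes, the sketch below computes leave-one-out k-nearest-neighbour accuracy over clip embeddings: a clip counts as correct if the majority label among its k most similar other clips matches its own. The function names, the cosine similarity choice, and the toy 2-D embeddings are all assumptions made for illustration; real embeddings would come from the video encoders under comparison.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_accuracy(embeddings, labels, k=1):
    """Leave-one-out k-NN accuracy.

    Returns the fraction of clips whose label equals the majority label
    of their k most similar other clips (hypothetical helper).
    """
    if len(embeddings) != len(labels):
        raise ValueError("embeddings and labels must have the same length")
    if not 1 <= k < len(embeddings):
        raise ValueError("k must satisfy 1 <= k < number of clips")
    correct = 0
    for i, query in enumerate(embeddings):
        # Similarity to every other clip; the query itself is excluded.
        scored = [
            (cosine_similarity(query, other), labels[j])
            for j, other in enumerate(embeddings)
            if j != i
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        votes = Counter(label for _, label in scored[:k])
        predicted = votes.most_common(1)[0][0]
        correct += predicted == labels[i]
    return correct / len(embeddings)


# Toy, made-up 2-D embeddings standing in for two hypothetical encoders.
labels = ["dive_a", "dive_a", "dive_b", "dive_b"]
encoder_1 = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]  # classes separated
encoder_2 = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9]]  # classes mixed

print(knn_accuracy(encoder_1, labels, k=1))
print(knn_accuracy(encoder_2, labels, k=1))
```

A score like this needs no training, which makes it easy to apply uniformly across encoders. It is only one view of embedding quality, and alternatives such as a linear probe or a cluster-separation measure may rank the same models differently.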

References: