
Analyzing Visual and Identity-Level Differences Between Synthetic and Real Face Datasets

Master Assignment

Type: Master EE/CS 

Period: TBD

Student: (Unassigned)



Assignment Title:

Analyzing Visual and Identity-Level Differences Between Synthetic and Real Face Datasets

Assignment Description:

This assignment focuses on identifying and analyzing both superficial and deep-level differences between real identity images from the Chicago Face Database (CFD) and a synthetic FLUXSynID dataset generated to mimic its style. The goal is to determine to what extent synthetic images replicate the visual and identity characteristics of real faces, and to uncover any artifacts, biases, or patterns unintentionally introduced during the generation process.

The student will compare the two datasets on two levels: the visual (surface) level and the identity (representation) level.

Key Questions to Investigate:

Expected Outcome:

The student should deliver a comparative analysis with both visual examples and quantitative results (e.g., embedding distance distributions, t-SNE/UMAP plots). Insights should include artifact detection, identity-level discrepancies, and an evaluation of how convincingly the synthetic dataset mimics CFD-style data, both on the surface and at the representation level.
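As an illustration of what an "embedding distance distribution" comparison could look like, the following minimal sketch computes cosine distances within each dataset and across the two datasets, then summarizes each distribution. It is only a sketch under stated assumptions: the random vectors are placeholders for face embeddings (the assignment does not prescribe an embedding model), and the names `real_embeddings`, `synthetic_embeddings`, and all helper functions are hypothetical. It uses only the Python standard library; in practice a numerical library would likely be used instead.

```python
import math
import random
import statistics
from itertools import combinations, product


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def within_set_distances(embeddings):
    """Distances between all unordered pairs inside one dataset."""
    return [cosine_distance(u, v) for u, v in combinations(embeddings, 2)]


def cross_set_distances(set_a, set_b):
    """Distances between every pair drawn from two different datasets."""
    return [cosine_distance(u, v) for u, v in product(set_a, set_b)]


def summarize(distances):
    """Mean and population standard deviation of a list of distances."""
    return statistics.fmean(distances), statistics.pstdev(distances)


# ASSUMPTION: random vectors stand in for real face embeddings.
# The synthetic set is given a shifted mean purely to make the
# three distributions differ in this toy example.
rng = random.Random(0)
dim = 16
real_embeddings = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(20)]
synthetic_embeddings = [[rng.gauss(0.5, 1.0) for _ in range(dim)] for _ in range(20)]

real_stats = summarize(within_set_distances(real_embeddings))
synthetic_stats = summarize(within_set_distances(synthetic_embeddings))
cross_stats = summarize(cross_set_distances(real_embeddings, synthetic_embeddings))

print("real vs real:           mean=%.3f std=%.3f" % real_stats)
print("synthetic vs synthetic: mean=%.3f std=%.3f" % synthetic_stats)
print("real vs synthetic:      mean=%.3f std=%.3f" % cross_stats)
```

If the synthetic images matched the real ones at the representation level, the three distributions would be expected to look similar; a clearly shifted or narrower synthetic distribution would be one possible sign of an identity-level discrepancy worth investigating further.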

Short Description:

This assignment involves a comparative analysis between real images from the Chicago Face Database (CFD) and synthetic facial images from the FLUXSynID dataset, which are styled to resemble CFD. The focus is on identifying both surface-level artifacts, such as repeated reflections or clothing patterns, and deeper differences in identity representations. Facial embeddings will be analyzed using techniques such as dimensionality reduction and clustering. The goal is to assess how closely the synthetic dataset matches the real one and to identify any systematic visual or identity-related discrepancies.