Synthetic Data Generation
The study shows AI-generated datasets can outperform traditional ones—faster, cheaper, more diverse, and easier to create
Synthetic Data Generation
AI-generated datasets are changing the game.
Recent research reveals that synthetic semantic segmentation datasets—created through controlled AI generation—can outperform traditional ones. These datasets are faster to produce, more cost-effective, diverse, and easier to scale, unlocking new potential in computer vision pipelines.
A new approach with real impact
In collaboration with CITIC – Centro de Investigación TIC, Universidade da Coruña, our team published a peer-reviewed study that fuels the foundations of our platform for conscious design.
"Should I Render or Should AI Generate? Crafting Synthetic Semantic Segmentation Datasets with Controlled Generation"
Read the full paper
Using a combination of Diffusion Models and ControlNet, the approach enables the creation of annotated synthetic datasets that rival or exceed those built using traditional 3D rendering methods.
Why this matters
Traditional dataset generation can be slow, expensive, and limited in diversity.
This method offers a scalable, accessible way to produce training data — especially in domains where real images are hard to obtain or annotate.
When crafted intentionally, AI-generated datasets can outperform those rendered via classical pipelines.
Authors and collaborators
This work wouldn't be possible without the brilliant researchers and partners behind it:
- Luis Omar Álvarez Mures
- Manuel Silva Díaz
- Manuel Lijó Sánchez
- Emilio José Padrón González
- José A. Iglesias
Special thanks to the institutions and teams supporting this effort.