SynDAiTE: Synthetic Data for AI Trustworthiness and Evolution

Workshop at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2025), September 15, 2025 - Porto, Portugal


Organisers

Dr. Marco Piangerelli
University of Camerino
Ylenia Rotalinti
Brunel University London
Prof. Heitor Murilo Gomes
Victoria University of Wellington
Prof. Maroua Bahri
Sorbonne Université
Prof. Yi He
William & Mary
Dr. Bardh Prenkaj
Technical University of Munich
André Carreiro
Fraunhofer AICOS
Prof. Ana Carolina Lorena
Instituto Tecnológico de Aeronáutica
Prof. Kate Smith-Miles
University of Melbourne
Zafeiris Kokkinogenis
University of Porto
Prof. Albert Bifet
University of Waikato
Prof. Carlos Soares
University of Porto | Fraunhofer AICOS

Aims and Scope

The rapid advancement of artificial intelligence (AI) relies heavily on access to large, diverse, and high-quality datasets for training and evaluation. However, the increasing scarcity of data, strict privacy regulations, and the high costs associated with collection and annotation are creating significant barriers to progress. Projections suggest that by 2050, we may face a shortage of fresh text data, and by 2060, image data may become similarly limited. These challenges make it imperative to explore alternatives that can sustain AI’s growth and effectiveness. Synthetic data is a compelling solution to these issues, offering scalability, customisation, and inherent anonymisation. It allows large volumes of tailored datasets to be generated without the privacy and cost concerns that come with real data.

Important Dates

All deadlines are 11:59 pm, AoE.

Topics

SynDAiTE welcomes contributions on all topics related to synthetic data, in any application domain (e.g., finance, business, basic sciences, construction, computational advertising, IoT) and independent of data type (e.g., networks, graphs, logs, spatiotemporal, multimedia, time series, genomic sequences, and streaming data):

Submission and Publication

Papers must be written in English and formatted in LaTeX, following the outline of our author kit https://ecmlpkdd-storage.s3.eu-central-1.amazonaws.com/2025/ECML_PKDD_2025_Author_Kit.zip. The kit includes a readme document, a LaTeX file template containing author instructions, and style files. The maximum length of papers is 16 pages (including references), except for short papers, where the limit is 10 pages in this format. The program chairs reserve the right to reject any over-length papers without review. Papers that “cheat” the page limit by, for example, using smaller than specified margins or font sizes will also be treated as over-length. Note that negative vspaces are likewise not allowed by the formatting guidelines; further details can be found in the author kit. Up to 10 MB of additional materials (e.g., proofs, audio, images, video, data, or source code) can be uploaded with your submission. If there is an appendix, ensure it is submitted separately from your paper; combined with the main matter, it must adhere to the page limit. The reviewers and the program committee reserve the right to judge the paper solely on the basis of the 16 pages of the paper; looking at any additional material is at the discretion of the reviewers and is not required.

The submission must also be anonymized; authors must omit their names and affiliations from submissions and avoid obvious identifying statements (e.g., citations to the author’s own prior work should be made in the third person). Finally, the submission must not be currently under review at another publication venue. Failure to adhere to these policies will result in desk rejection.

We strongly recommend using the template above and providing paper code and data in an Anonymous GitHub repository https://anonymous.4open.science/.

We encourage three types of submissions (reviewers will comment on whether the size is appropriate for each contribution):

Generative AI Usage Policy. Generative AI models, including ChatGPT, Bard, LLaMA, or similar LLMs, do not satisfy the criteria for authorship of papers accepted in the workshop. If authors use an LLM in any part of the paper-writing process, they assume full responsibility for all content, including checking for plagiarism and correctness of all text.

Originality and Concurrent Submissions. Papers submitted should report original work. Papers that are identical or substantially similar to papers that have been published or submitted elsewhere may not be submitted to ECML PKDD, and the organizers will reject such papers without review. Authors are also NOT allowed to submit or have submitted their papers elsewhere during the review period. Submitting unpublished technical reports available online (such as on arXiv), or papers presented in workshops without formal proceedings, is allowed, but such reports or presentations should not be cited to preserve anonymity.

Submissions that do not follow these guidelines, or that do not view or print properly, will be desk-rejected.

Post-Proceedings. The accepted papers and the material generated during the meeting will be available on the workshop website. As per ECML-PKDD’s guidelines, the Workshops and Tutorials will be included in joint Post-Workshop proceedings published by Springer Communications in Computer and Information Science (indexed on Google Scholar, DBLP, and Scopus), in 1-2 volumes, organized by focused scope. Paper authors will have the option to opt in or opt out.

Registration and Presentation Policy

Each accepted paper must have at least one author registered for the full conference by the early registration deadline and must be presented at the workshop, even if the authors opt out of the post-proceedings. We expect the authors, the program committee, and the organizing committee to adhere to the ECML-PKDD Code of Conduct.

Contacts

For general inquiries about the workshop, please email marco.piangerelli@unicam.it, ylenia.rotalinti@mhra.gov.uk, heitor.gomes@vuw.ac.nz, maroua.bahri@lip6.fr, yihe@wm.edu, bardh.prenkaj@tum.de, and albert@albertbifet.com.