Datasets and Benchmarks Track at ICIP 2024
We are thrilled to announce the ICIP Datasets and Benchmarks Track. High-quality, publicly available image and video datasets are critical for advancing the field of image processing, and we seek to provide researchers with a diverse collection of datasets that can be routinely used to test, benchmark, and improve the performance of image processing methods and algorithms. We encourage researchers from all fields to submit their datasets and be part of this exciting track. This track serves as a venue for high-quality publications on highly valuable image and video datasets and benchmarks, as well as a forum for discussions on how to improve dataset development.
Submissions to the track will be part of the main ICIP conference, presented alongside the main conference papers. Accepted papers will be officially published in the ICIP proceedings and follow the same deadlines as regular papers. Make sure you choose the “Submit to Datasets and Benchmarks Track” button on the paper submission site.
Criteria:
We are aiming for a review process as stringent as that of the main conference, yet better suited to datasets and benchmarks. Submissions to this track will be reviewed according to a set of criteria and best practices specifically designed for datasets and benchmarks. A key criterion is accessibility: datasets should be available and accessible, i.e., the data can be found and obtained without a personal request to the PI, and any required code should be open source. In addition to the scientific paper, authors should submit supplementary materials such as details on how the data was collected and organized, what kind of information it contains, how it should be used ethically and responsibly, and how it will be made available and maintained.
The factors that will be considered when evaluating papers include:
- All submissions:
- Utility and quality of the submission: Impact, originality, novelty, and relevance to the ICIP community will all be considered.
- Reproducibility: All submissions should be accompanied by sufficient information to reproduce the results described, i.e., all necessary datasets, code, and evaluation procedures must be accessible and documented. We encourage the use of a reproducibility framework such as the IEEE Research Reproducibility standards to guarantee that all results can be easily reproduced. Benchmark submissions in particular should provide enough detail to ensure reproducibility. If submissions include code, please refer to the IEEE guidelines.
- Ethics: Any ethical implications of the work should be addressed. Authors should rely on the IEEE Digital Ethics and Privacy Technology Guidelines.
- Dataset submissions:
- Completeness of the relevant documentation: datasets must be accompanied by documentation communicating the details of the dataset as part of their submissions. Sufficient detail must be provided on how the data was collected and organized, what kind of information it contains, how it should be used ethically and responsibly, and how it will be made available and maintained.
- Licensing and access: authors should provide licenses for any datasets released. Authors should consider the intended use and limitations of the dataset, and develop licenses and terms of use that prevent misuse or inappropriate use.
- Consent and privacy: datasets should minimize the exposure of any personally identifiable information, unless the individuals concerned have given informed consent. Authors who create a dataset containing data about real people should obtain the explicit consent of participants, or explain why they were unable to do so.
- Ethics and responsible use: Any ethical implications of new datasets should be addressed and guidelines for responsible use should be provided where appropriate. Note that, if your submission includes publicly available datasets (e.g. as part of a larger benchmark), you should also check these datasets for ethical issues. You remain responsible for the ethical implications of including existing datasets or other data sources in your work.
- Legal compliance: For datasets, authors should be aware of and comply with regional legal requirements.
Scope:
This track welcomes all work on data-centric image processing research, covering image, video, and 3D datasets and benchmarks as well as algorithms, tools, methods, and analyses for working with visual data. This includes but is not limited to:
- New datasets, or carefully and thoughtfully designed (collections of) datasets based on previously available data.
- Data generators.
- Data-centric image processing methods and tools.
- Advanced practices in data collection and curation that are of general interest even if the data itself cannot be shared.
- Frameworks for responsible dataset development, audits of existing datasets, and identification of significant problems with existing datasets and their use.
- Benchmarks on new or existing datasets, as well as benchmarking tools.
- In-depth analyses of image processing challenges and competitions (by organizers and/or participants) that yield important new insight.
- Systematic analyses of existing systems on novel datasets yielding important new insight.
Submission:
There will be one deadline for all papers, including those submitted to this track. Submitted papers follow the same format and page limit as regular papers. Supplementary materials are strongly encouraged. Submissions introducing new datasets must include the following in the supplementary materials:
- Dataset documentation and intended uses. Recommended documentation frameworks
- URL to website/platform where the dataset/benchmark can be viewed and downloaded by the reviewers.
- Author statement that they bear all responsibility in case of violation of rights, etc., and confirmation of the data license.
- Hosting, licensing, and maintenance plan. The choice of hosting platform is yours, as long as you ensure access to the data (possibly through a curated interface) and will provide the necessary maintenance.
- To ensure accessibility, we largely follow the IEEE Guidelines for data submission, but allow more freedom for non-static datasets. The supplementary materials for datasets must include the following:
- Links to access the dataset and its metadata. This can be hidden upon submission if the dataset is not yet publicly available but must be added in the camera-ready version. Reviewers must have access to the data. Simulation environments should link to open source code repositories.
- The dataset itself should ideally use an open and widely used data format. Provide a detailed explanation on how the dataset can be read. For simulation environments, use existing frameworks or explain how they can be used.
- Long-term preservation: It must be clear that the dataset will remain available for a long time, by uploading it to a data repository.
- Explicit license: Authors must choose a license, ideally a CC license for datasets, or an open source license for code. An overview of licenses can be found here: https://paperswithcode.com/datasets/license
- Structured metadata: Add structured metadata to the dataset’s metadata page using Web standards. This allows it to be discovered and organized by anyone. A guide can be found here: https://developers.google.com/search/docs/data-types/dataset. If you use an existing data repository, this is often done automatically.
- Highly recommended: a persistent dereferenceable identifier (e.g. a DOI minted by a data repository or a prefix on identifiers.org) for datasets, or a code repository (e.g. GitHub, GitLab,…) for code. If this is not possible or useful, please explain why.
- For benchmarks, the supplementary materials must ensure that all results are easily reproducible. Where possible, use a reproducibility framework such as the IEEE Research Reproducibility standards, or otherwise guarantee that all results can be easily reproduced, i.e. all necessary datasets, code, and evaluation procedures must be accessible and documented.
- For papers introducing new evaluations of, or new perspectives on, existing datasets, the above supplementary materials are required.
- For papers introducing best practices in creating or curating datasets and benchmarks, the above supplementary materials are not required.
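As an illustration of the structured metadata requirement listed above, the following is a minimal sketch, not an official template, of how a schema.org Dataset record could be generated as JSON-LD for embedding in a dataset's landing page. The helper names (build_dataset_jsonld, to_script_tag) and every value in the example (dataset name, example.org URLs, the DOI, the author names) are hypothetical placeholders; only the schema.org property names are taken from the public vocabulary. Authors hosting on an established data repository may not need this at all, since such metadata is often produced automatically.

```python
import json


def build_dataset_jsonld(name, description, url, license_url,
                         identifier, creators, download_url, encoding_format):
    """Return a schema.org Dataset description as a plain dict (JSON-LD)."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,        # explicit license, e.g. a CC license
        "identifier": identifier,      # persistent identifier such as a DOI
        "creator": [{"@type": "Person", "name": c} for c in creators],
        "distribution": [{
            "@type": "DataDownload",
            "encodingFormat": encoding_format,
            "contentUrl": download_url,
        }],
    }


def to_script_tag(metadata):
    """Wrap the JSON-LD dict in the script tag used on HTML landing pages."""
    body = json.dumps(metadata, indent=2)
    return '<script type="application/ld+json">\n' + body + '\n</script>'


# Hypothetical example values; replace all of them with the real details.
EXAMPLE = build_dataset_jsonld(
    name="Example Image Benchmark",
    description="Placeholder description of an image dataset.",
    url="https://example.org/dataset",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    identifier="https://doi.org/10.0000/placeholder",
    creators=["First Author", "Second Author"],
    download_url="https://example.org/dataset/images.zip",
    encoding_format="application/zip",
)

if __name__ == "__main__":
    print(to_script_tag(EXAMPLE))
```

The printed script element can be pasted into the head of the dataset's landing page. Keeping the license and identifier fields in the same record means the licensing and persistent identifier items in the list above are also visible to crawlers.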