Workshop 1: International Workshop on Hyperspectral and Multispectral Imaging on Agriculture and Geo Applications
Organizers: Thirimachos Bourlai and Lorena N. Lacerda, University of Georgia, US

Paper submission:

The integration of multispectral and hyperspectral sensing technologies has become pivotal across diverse applications such as defense, security, agriculture, and geospatial endeavors. This workshop, which focuses on agriculture and geospatial applications, recognizes the expanding landscape of research and commercial applications enabled by advancements in radar, audio, and image-sensing technologies spanning the electromagnetic spectrum. Encompassing the ultraviolet through longwave-infrared (0.3–14 μm) spectral regions, these technologies enable data capture and retrieval across field, airborne, and satellite platforms. To remain current with the dynamic developments in the field, the workshop will place a strong emphasis on sensing, data collection, data processing, and the advancement of computer vision, machine learning, and deep learning algorithms. It will serve as a platform to unveil cutting-edge surveillance technologies and foster discussions on the efficacy of novel algorithms in processing data from diverse sensing technologies and networking platforms. The workshop aims to offer insights into hyperspectral and multispectral sensing, empowering organizations and the research community to harness these technologies for enhanced decision-making and resource management.
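As a concrete illustration of the kind of multispectral processing the workshop covers, the sketch below computes the widely used Normalized Difference Vegetation Index (NDVI) from near-infrared and red reflectance bands. The band values and array shapes are toy assumptions for illustration, not data from any particular sensor.

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)  # eps avoids division by zero

# Toy 2x2 reflectance patches: healthy vegetation reflects strongly in NIR.
nir = np.array([[0.8, 0.7], [0.4, 0.1]])
red = np.array([[0.1, 0.2], [0.3, 0.1]])
index = ndvi(nir, red)   # values near +1 indicate dense, healthy vegetation
```

In practice the NIR and red arrays would come from the corresponding bands of a multispectral cube, and the same per-pixel computation scales directly to field, airborne, or satellite imagery.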

Workshop 2: Big Visual Data Analytics (BVDA)
Organizers: Ioannis Pitas (Aristotle University of Thessaloniki, Greece), Massimo Villari (University of Messina, Italy), Ioannis Mademlis (Harokopio University of Athens, Greece)

Paper submission:

The ever-increasing availability of visual data leads to repositories or streams characterized by big data volumes, velocity (acquisition and processing speed), variety (e.g., RGB, RGB-D, or hyperspectral images) and complexity (e.g., video data and point clouds). Such big visual data necessitate novel and advanced analysis methods in order to unlock their potential across diverse domains. The “Big Visual Data Analytics” (BVDA) workshop aims to explore this rapidly evolving field, which encompasses cutting-edge methods, emerging applications, and significant challenges in extracting meaning and value from large-scale visual datasets. From high-throughput biomedical imaging and autonomous driving sensors to satellite imagery and social media platforms, visual data has permeated nearly every aspect of our lives. Analyzing this data effectively requires efficient tools that go beyond traditional methods, leveraging advancements in machine learning, computer vision and data science. Exciting new developments in these fields are already paving the way for fully and semi-automated visual data analysis workflows at an unprecedented scale. This workshop will provide a platform for researchers and practitioners to discuss recent breakthroughs and challenges in big visual data analytics, explore novel applications across diverse domains (e.g., environmental monitoring, natural disaster management, robotics, urban planning, healthcare), and foster interdisciplinary collaboration among computer vision, data science, and machine learning researchers and domain experts. Its ultimate goal is to help identify promising research directions and pave the way for future innovations.

Workshop 3: SPVis: Security and Privacy of Machine Learning-based Vision Processing in Autonomous Systems
Organizers: Muhammad Shafique, Bassem Ouni, Michail Maniatakos, Ozgur Sinanoglu, Christina Pöpper (New York University, Abu Dhabi, UAE), Nasir Memon (New York University, Shanghai, China)

Paper submission:

In an era of growing cybersecurity threats and nano-scale devices, the intelligent camera-based features of smart cyber-physical systems (CPS, such as autonomous vehicles) and the Internet of Things (IoT) face new types of attacks and security/privacy threats on image/video data, requiring novel design principles for robust ML. Besides IP-stealing and data-privacy attacks, the foremost threats to the robustness of modern ML systems operating on image/video data are adversarial and backdoor attacks. These attacks use deliberate, carefully crafted manipulations of the images to exploit inherent vulnerabilities in machine/deep learning models and learning mechanisms, potentially leading to compromised performance and decision-making. Safeguarding against these security and privacy threats has become crucial, requiring continuous advancements in defense and obfuscation strategies to strengthen the resilience of intelligent systems in diverse image/video processing and computer vision applications. This workshop aims to bring together experts, researchers, and practitioners in image/vision processing and machine learning security/privacy to discuss the latest advancements, challenges, and solutions in the critical domains of adversarial machine learning, backdoors, DNN obfuscation, attacks on visual forensics, deepfake detectors for images/videos, and more.
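To make the adversarial-attack threat concrete, the following minimal sketch implements the classic Fast Gradient Sign Method (FGSM) against a toy logistic-regression classifier in NumPy. The weights and input are random stand-ins, not a real vision model; a real attack would backpropagate through a deep network instead.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y, eps):
    """FGSM for a logistic-regression classifier on a flattened image x.

    The gradient of the binary cross-entropy loss w.r.t. the input is
    (sigmoid(w.x + b) - y) * w; FGSM steps eps in its sign direction,
    so each pixel moves by exactly +/- eps (an L-infinity-bounded attack).
    """
    grad_x = (sigmoid(np.dot(w, x) + b) - y) * w
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w = rng.normal(size=16)        # stand-in for a trained model's weights
b = 0.0
x = rng.normal(size=16)        # a clean "image" flattened to a vector
y = 1.0 if sigmoid(np.dot(w, x) + b) >= 0.5 else 0.0  # model's own label

x_adv = fgsm_perturb(x, w, b, y, eps=0.5)
# Check whether the (imperceptibly bounded) perturbation changed the decision.
flipped = (sigmoid(np.dot(w, x_adv) + b) >= 0.5) != (y == 1.0)
```

The same sign-of-gradient step, applied per pixel with a small eps, is what makes adversarial examples nearly invisible to humans while degrading model decisions.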

Workshop 4: Embodied AI: Trends, Challenges, and Opportunities – EMAI
Organizers: Yi Fang, Hao Huang, Yu-Shen Liu, Tuka Waddah Alhanai, Shuaihang Yuan, Yu Hao (New York University Abu Dhabi, Tsinghua University)

Web site:
Paper submission:

The “Embodied AI: Exploring Trends, Challenges, and Opportunities” workshop at ICIP 2024 in Abu Dhabi, UAE, is an expansive forum dedicated to the intersection of Embodied AI and fields such as computer vision, language processing, graphics, and robotics. This workshop is designed to deepen the understanding of AI agents’ capabilities in perceiving, interacting, and reasoning within their environments, thereby fostering an interdisciplinary dialogue among leading researchers and practitioners. Attendees can expect a comprehensive agenda including insightful invited talks from eminent figures in the field, a poster session showcasing cutting-edge research, and engaging panel discussions aimed at debating the future directions of intelligent, interactive systems. This event promises to be a pivotal gathering for those keen to contribute to and shape the ongoing advancements in Embodied AI.

The dedicated workshop on Embodied AI is essential due to its unique focus on integrating physical embodiment with AI capabilities, addressing challenges and opportunities not fully explored in the main ICIP conference. It merges computer vision and robotics, pushing beyond traditional boundaries to create agents that perceive, interact, and reason within their environments. This specialized forum encourages cross-disciplinary collaboration, fostering advancements that are vital for the development of intelligent, interactive systems, and addressing the gap between current image processing techniques and the future needs of AI research, including foundation models, robotics, and embodied intelligence.

The EMAI 2024 workshop stands at the confluence of Embodied AI and pivotal areas such as computer vision, natural language processing, graphics, and robotics. This synthesis is poised to catalyze significant momentum in the field, by bringing the frontier of foundation models, robotics, and embodied AI to the research community.

Workshop 5: 2nd Workshop on 3D Computer Vision and Photogrammetry (3DCVP)
Organizers: Lazaros Grammatikopoulos, Elli Petsa, Giorgos Sfikas, Andreas el Saer (University of West Attica, Greece), George Retsinas (National Technical University of Athens, Greece), Christophoros Nikou, Panagiotis Dimitrakopoulos (University of Ioannina, Greece)

Web site:
Paper submission:

Photogrammetry and Computer Vision are two major fields with significant overlap with Image Processing. We plan to invite presentations of high-quality work on topics that include novel developments in classic Photogrammetric Computer Vision problems such as Structure from Motion (SfM) and SLAM, as well as papers focused on novel learning techniques for 3D geometric data. Topics of interest include feature extraction, description and matching; multispectral and hyperspectral image processing and fusion; multi-view and surface reconstruction; 3D point cloud analysis and processing; scene understanding; robot vision and perception; and path and motion planning.
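As a small worked example of a classic photogrammetric computer vision primitive, the sketch below triangulates one 3D point from two views via linear (DLT) triangulation. The camera matrices and point are synthetic toy assumptions, chosen so the recovered point can be checked against a known ground truth.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2 are 3x4 projection matrices; x1, x2 the matching pixel
    coordinates. Each observation contributes two linear constraints
    of the form x * (P[2] . X) - (P[i] . X) = 0.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]               # null-space vector = homogeneous 3D point
    return X[:3] / X[3]      # dehomogenize

# Synthetic check: project a known point through two simple cameras.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])               # camera at origin
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])  # baseline of 1
X_true = np.array([0.5, 0.2, 4.0, 1.0])
x1 = P1 @ X_true; x1 = x1[:2] / x1[2]
x2 = P2 @ X_true; x2 = x2[:2] / x2[2]
X_est = triangulate(P1, P2, x1, x2)
```

In an SfM pipeline, the same routine runs after feature matching and pose estimation, once the projection matrices for each view are known.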

Workshop 6: Learning-by-Myself (LbM): Self-supervised learning in biosignals and biomedical image processing
Organizers: Leontios Hadjileontiadis, Thanos Stouraitis, Panos Liatsis, Naoufel Werghi, Marius George Linguraru

Web site:
Paper Submission:

Self-Supervised Learning (SSL) is revolutionizing deep learning technologies by enabling the training of models without human-labeled data. In signal and image processing, cutting-edge systems rely on large Transformer neural networks pretrained on extensive signal/image datasets using methods such as contrastive or multitask learning. SSL is also being explored in biosignal and biomedical image processing for diagnosis and disease prediction, where it leverages unlabeled data to train models and reduces the need for manual annotation. This approach is transforming these fields by improving model performance and enabling the use of large, diverse datasets, leading to more accurate and efficient medical diagnostic and analysis tools. Despite the surge of accepted articles employing SSL techniques at recent top-tier conferences, several challenges hinder broader adoption in real-world applications: SSL models face complexity issues, lack a standardized evaluation protocol, exhibit bias and robustness concerns, and lack integration with related modalities such as text or video. The LbM workshop addresses these challenges by fostering interactions among experts. Through two keynotes, a panel, and paper presentations, LbM aims to unite the SSL community, including experts from various modalities, to frame SSL as a groundbreaking solution for biosignals and biomedical image processing and beyond. This workshop provides a dedicated platform for discussing and advancing SSL technologies, paving the way for their broader application in diverse fields.
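As a concrete illustration of the contrastive learning mentioned above, the sketch below computes an InfoNCE-style loss over paired embeddings in NumPy. The embeddings here are random toy data; in a real SSL system they would come from two augmented views of each unlabeled sample passed through an encoder.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss for paired views: row i of z1 and z2 come from the
    same (unlabeled) sample; all other rows serve as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature              # (N, N) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    # cross-entropy with the diagonal (the matching pair) as the target class
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
anchor = rng.normal(size=(8, 32))                         # 8 samples, 32-dim
aligned = info_nce(anchor, anchor + 0.01 * rng.normal(size=(8, 32)))
random_pairs = info_nce(anchor, rng.normal(size=(8, 32)))
```

Well-aligned view pairs produce a much lower loss than random pairings, which is exactly the signal the encoder is trained to maximize without any human labels.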

Workshop 7: Biomedical Imaging & Diagnostics (BID) Workshop: Innovations in Biomarkers, Digital Pathology, & Radiology
Organizers: Arash Mohammadi (Concordia University, Canada) and Ervin Sejdic (University of Toronto, Canada)

Paper submission:

Recent advances in various Artificial Intelligence (AI) approaches have spurred the growth of biomedical image processing applications. Algorithms are already deployed in hospital settings to aid clinicians with various tasks in the analysis of medical images. The main motivation behind the IEEE BID workshop is to introduce recent image processing advances relevant to healthcare needs. The workshop’s overarching objective is to bring engineers, computer scientists, and clinicians together to discuss the main issues in this field. More specifically, while the main conference will focus on advances in image processing, the workshop will solely consider advances relevant to clinical applications such as biomarker discovery, pathology, and radiology. The workshop is designed to initiate a broader conversation between theoreticians in the field and researchers focused on practical applications of image processing in medicine and physiology. We aim to address the main issues associated with biomedical image processing, such as the lack of large, annotated data sets, the performance degradation of AI algorithms in real-world settings, and the need for reproducible research in the field. Lastly, the workshop aims to become a future meeting ground for researchers interested in applications of image processing in various clinical settings. Topics of interest include, but are not limited to:

  • Biomedical Imaging Diagnostics and Prognostics
  • Image-Based Biomarker Discovery
  • Robust Image Processing Techniques with Limited Datasets
  • (Semi)Autonomous Labeling for Medical Imaging Data
  • Navigating the Variability in Medical Imaging
  • The Human-AI Collaboration
  • Personalized Medicine and Image Processing
  • AI-Driven Multi-Modal Fusion Frameworks
  • Precision Labeling in Medical Imaging

Workshop 8: Integrating Image Processing with Large-Scale Language/Vision Models for Advanced Visual Understanding
Organizers: Yong Man Ro (KAIST, South Korea), Hak Gu Kim (Chung-Ang University, South Korea), Nikolaos Boulgouris (Brunel University London, UK)

Paper submission:

This workshop aims to bridge the gap between conventional image processing techniques and the latest advancements in large-scale models (LLMs and LVLMs). In recent years, the integration of large-scale models into image processing tasks has shown significant promise in improving visual object understanding and image classification. This workshop will provide a platform for researchers and practitioners to explore the synergies between conventional image processing methods and cutting-edge large language models and large vision-language models, fostering innovation and collaboration in the field. Its objectives are to:


  1. Explore the foundations of image processing techniques with large-scale models.
  2. Investigate the current landscape of large-scale language/vision models and their capabilities.
  3. Discuss challenges and opportunities in integrating large-scale models with image processing to enhance visual understanding.
  4. Showcase practical examples and case studies where the combined approach has yielded superior results.

This workshop is designed for researchers, academics, and industry professionals working in the fields of image processing, computer vision, multimedia processing and natural language processing. Participants should have a basic understanding of image processing concepts and an interest in exploring innovative approaches for visual understanding. The workshop will consist of paper presentations by leading experts in image processing and large-scale language/vision models. Participants will have the opportunity to engage in discussions, exchange insights, and collaborate on potential research projects.

Workshop 9: 1st Workshop on Intelligent Crowd Engineering (ICE)
Organizers: Baek-Young Choi (University of Missouri – Kansas City, USA), Khalid Almalki (Saudi Electronic University, Saudi Arabia), Muhammad Mohzary (Jazan University, Jazan, Saudi Arabia), Sejun Song (Augusta University, GA, USA)

Paper Submission:

Crowd events such as festivals, concerts, shopping, sports, political gatherings (e.g., protests), and religious events (e.g., Hajj or Kumbh Mela) are a significant part of modern human society and can occur anywhere and at any time. Unfortunately, human casualties caused by chaotic stampedes at crowd events, and the transmission of infectious diseases exemplified by the COVID-19 pandemic, reveal pervasive deficiencies in current practice and call for effective crowd safety control and management mechanisms.

Machine Learning (ML) methodologies have been applied to crowd counting and density estimation, drawing inspiration from advancements in computer vision and video surveillance paradigms. These technological interventions are strategically designed to mitigate the risk of personal injuries and fatalities amid densely populated gatherings, including political rallies, entertainment events, and religious congregations. Despite these advancements, contemporary crowd safety management frameworks remain limited in precision, scalability, and the capability to perform nuanced crowd characterization in real time. These deficiencies encompass challenges such as detailed group-dynamics analysis, the assessment of occlusions’ impact, and the execution of adequate mobility, contact tracing, and social distancing strategies.
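As a minimal illustration of the density-estimation formulation used in ML-based crowd counting, the sketch below builds a ground-truth density map by placing a unit-mass Gaussian at each annotated head position, so that the map integrates to the crowd count. The annotations and kernel width are toy assumptions; counting networks are trained to regress such maps from images.

```python
import numpy as np

def density_map(head_points, shape, sigma=3.0):
    """Crowd density map: one normalized Gaussian per annotated head.

    Because each Gaussian is normalized to unit mass, the sum of the
    map equals the number of people, which is what counting networks
    are trained to reproduce.
    """
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    dmap = np.zeros(shape, dtype=np.float64)
    for (py, px) in head_points:
        g = np.exp(-((yy - py) ** 2 + (xx - px) ** 2) / (2 * sigma ** 2))
        dmap += g / g.sum()      # each person contributes exactly mass 1
    return dmap

heads = [(20, 30), (25, 32), (70, 80)]    # toy annotations (row, col)
dmap = density_map(heads, shape=(100, 100))
```

Summing the predicted map over any region also gives a local count, which is what enables the region-level crowd characterization discussed above.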

This inaugural Intelligent Crowd Engineering (ICE) workshop aims to bring together eminent scientists, researchers, and engineers to present and discuss novel crowd safety challenges, broach cutting-edge topics, and unveil emerging technologies that transcend conventional crowd-counting methodologies. ICE calls for contributions across diverse technological spectrums, including ML, Artificial Intelligence (AI), and the Internet of Things (IoT), alongside social modeling and integration frameworks, to substantially improve the precision, scalability, and real-time operational efficacy of crowd safety management systems.

This workshop plans to integrate comprehensive approaches encompassing pivotal aspects of crowd engineering including but not limited to:

  • Trustworthy visual data processing and knowledge processing
  • IoT-enabled mobility characterization
  • ML-augmented video surveillance
  • Semantic information-driven application support

Workshop 10: Visual and Sensing AI for Smart Agriculture
Organizers: Yuxing Han, Fengqing Maggie Zhu, Yubin Lan


In recent years, sensor networks, drones, and IoT technologies have been introduced to improve many aspects of agricultural practice, including but not limited to energy efficiency, environmental friendliness, and food healthiness. These applications necessitate advanced signal processing techniques and systems that meet the many challenges of agricultural settings under strict cost, power-consumption, weather-proofing, and other constraints. Furthermore, with revolutionary advancements in deep learning and AI, smart agriculture is on the eve of explosive growth, with its impact felt on a global scale. At this juncture, we propose a dedicated ICIP 2024 workshop bringing together leading experts in smart agriculture and in signal processing for smart agriculture, as well as a broad audience from the signal processing community in general, for in-depth, face-to-face presentation and discussion of the signal processing challenges in smart agriculture applications, as well as the state of the art, thereby raising awareness within the image processing and signal processing community of critical and promising research areas and challenges pertaining to one of the most fundamental and ancient practices of human society. The workshop will include presentations by experts from the drone, smart agriculture, and signal processing communities, presentations of papers, and a panel for curated open discussions.

Workshop 11: AI4IPoT: AI for Image Processing Applications on Traffic: Advancements, Challenges, and Opportunities
Organizers: Xian Zhong, Wenxin Huang, Zheng Wang, Yang Ruan, Chia-Wen Lin, Alex Kot

Paper submission:

This workshop is designed as a leading forum to explore complex challenges and opportunities in intelligent traffic systems. It covers diverse topics, including safety under adverse weather conditions, advanced scene reconstruction, visual perception for autonomous driving, multimodal sensor fusion, and behavioral analysis of traffic violations. The session aims to bring together pioneers from image processing and artificial intelligence to explore large models within intelligent transportation and autonomous driving systems. This forum encourages rich discussions on AI’s role in surveillance and driver assistance, focusing on both large models and green computing strategies. The workshop facilitates dialogue between academics and industry experts to promote new ideas and explore new directions in autonomous driving and traffic management. Topics include, but are not limited to:

  1. Scene Reconstruction and Visual Perception

Advanced techniques for scene reconstruction and visual perception to improve accuracy in intelligent transportation and autonomous driving systems.

  2. Multimodal Sensor Fusion

Strategies for integrating data from multimodal sensors to enhance environmental analysis and image quality in traffic systems.

  3. Transportation Safety under Adverse Weather Conditions

Investigation of technologies that ensure vehicle safety under adverse weather conditions.

  4. AI-Driven Traffic Management Systems

Use of large models to optimize traffic flow and enhance functionalities in autonomous driving systems.

  5. Sustainable and Lightweight Traffic Technologies

Development of sustainable traffic systems using green computing and lightweight processing to reduce energy consumption.

  6. Behavioral Analysis for Driving and Traffic Violations

Analytical technologies for studying driving behaviors and detecting traffic violations.

Workshop 12: Analysis of OCT Signals and Images: From Signal Formation to Practical Applications
Organizers: Taimur Hassan (Abu Dhabi University, United Arab Emirates), Azhar Zam (New York University, Abu Dhabi), Naoufel Werghi (Khalifa University, UAE), Lev Matveev (OpticElastograph LLC), Alex Vitkin (University of Toronto, Canada)

Paper submission:

Optical Coherence Tomography (OCT) is a well-established technique for retinal diagnostics, now expanding into non-ophthalmological fields such as dermatology, oncology, and mucosal tissue diagnostics. OCT offers mesoscopic resolution, meaning that the resolution volume encompasses only a few cells. It also offers a deeper penetration depth (approximately a few millimeters) than microscopy, effectively bridging the gap between ultrasound and microscopy. OCT signal features are highly sensitive to the properties of sub-resolved optical scatterers. For example, speckle pattern parameters correlate with scatterer concentration and clustering, while speckle variance is sensitive to scatterer motion, such as blood flow. Moreover, OCT can be made sensitive to optical phase and polarization. These capabilities are further enhanced by AI. By combining all of these technology components into a seamless pipeline, one can achieve high-performance diagnostics and perform optical biopsy and virtual histology. Leveraging these advancements, numerous OCT applications have emerged over the past decade, the majority of which rely on specific OCT image and signal processing. This workshop comprises four successive panels and covers the whole pipeline, from OCT technology development and signal formation, through signal features and preprocessing, to AI enablement and practical implementations and applications.
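As a minimal illustration of the speckle-variance idea described above, the sketch below computes per-pixel intensity variance across repeated B-scans acquired at the same position. The frames are synthetic toy data in which one region is given extra temporal fluctuation to mimic flow-induced speckle decorrelation; real speckle-variance OCT angiography operates on complex or intensity OCT frames.

```python
import numpy as np

def speckle_variance(frames):
    """Per-pixel variance across N repeated B-scans at the same position.

    Moving scatterers (e.g., blood flow) decorrelate the speckle between
    repeats and therefore show up as high temporal variance, while static
    tissue stays low-variance.
    """
    return np.var(np.asarray(frames, dtype=np.float64), axis=0)

rng = np.random.default_rng(0)
# 8 repeated 64x64 B-scans: static tissue with small acquisition noise...
frames = np.full((8, 64, 64), 100.0) + rng.normal(0, 1.0, (8, 64, 64))
# ...plus strong frame-to-frame fluctuation in a "vessel" region.
frames[:, 20:30, 20:30] += rng.normal(0, 20.0, (8, 10, 10))
sv = speckle_variance(frames)   # the vessel region lights up
```

Thresholding or normalizing such a variance map is one simple way to produce the flow-contrast images that feed downstream AI-based analysis.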