3rd Workshop on Continual and Multimodal Learning for Internet of Things

August 21, 2021 • Online

Co-located with IJCAI 2021

About CML-IOT (previous editions: CML-IOT'20, CML-IOT'19)

The Internet of Things (IoT) produces streaming, large-scale, and multimodal data (e.g., natural language, speech, image, video, audio, virtual reality, WiFi, GPS, RFID, vibration) over time. The statistical properties of these data often differ significantly across sensing modalities and temporal traits, and are hard to capture with conventional learning methods. Continual and multimodal learning allows knowledge learned from previously collected, heterogeneous experiential data to be integrated, adapted, and generalized to new situations. Therefore, continual and multimodal learning is an important step toward improving the estimation, utilization, and security of real-world data from IoT devices.



Call for Papers

This workshop aims to explore the intersection and combination of continual machine learning and multimodal modeling, with applications in the Internet of Things. The workshop welcomes work addressing these issues in different applications and domains, such as natural language processing, computer vision, human-centric sensing, smart cities, health, etc. We aim to bring together researchers from different areas to establish a multidisciplinary community and share the latest research.

We focus on novel learning methods that can be applied to streaming multimodal data:

  • continual learning
  • transfer learning
  • federated learning
  • few-shot learning
  • multi-task learning
  • reinforcement learning
  • learning without forgetting
  • learning with individual and/or institutional privacy
  • managing high-volume data flows

We also welcome continual learning methods that target:

  • data distribution shifts caused by fast-changing, dynamic physical environments
  • missing, imbalanced, or noisy data in multimodal scenarios

Novel applications or interfaces built on streaming multimodal data are also relevant topics.


    Data modalities include, but are not limited to: natural language, speech, image, video, audio, virtual reality, biochemistry, WiFi, GPS, RFID, vibration, accelerometer, pressure, temperature, humidity, etc.



    Important Dates

  • Submission deadline: May 12, 2021
  • Notification of acceptance: May 29, 2021
  • Deadline for camera ready version: June 12, 2021
  • Workshop: August 21, 2021



    Submission Guidelines

    Please submit papers using the IJCAI author kit. We invite papers of 2 to 6 pages, plus additional pages for references; i.e., the reference page(s) do not count toward the 6-page limit. The reviewing process is double-blind. Qualified accepted papers will be invited to submit extended versions to Frontiers in Big Data.



    Invited Keynote Speakers

    Keynote 1: Knowledge-Guided Graph Representation Learning, Speaker: Sheng Li, University of Georgia

    Abstract: Graph-structured data are ubiquitous and have been used extensively in many real-world applications. In this talk, I will present our recent work on graph representation learning with applications in multiple domains. First, we leverage line graph theory and propose novel graph neural networks that jointly learn embeddings for both nodes and edges. Second, we investigate how to incorporate commonsense and domain knowledge into graph representation learning, and present several applications in computer vision, natural language processing, and recommender systems. Finally, I will discuss future work on knowledge-guided graph representation learning.

    Bio: Dr. Sheng Li is an Assistant Professor of Computer Science at the University of Georgia (UGA). Before joining UGA in 2018, he was a Data Scientist at Adobe Research. He obtained his Ph.D. degree in computer engineering from Northeastern University in 2017. Dr. Li's research interests include graph-based machine learning, visual intelligence, user modeling, causal inference, and trustworthy artificial intelligence. He has published over 100 papers at peer-reviewed conferences and journals, and has received over 10 research awards, such as the INNS Young Investigator Award, M. G. Michael Award, Adobe Data Science Research Award, Cisco Faculty Award, and SIAM SDM Best Paper Award. He has served as Associate Editor of seven international journals such as IEEE Transactions on Circuits and Systems for Video Technology and IEEE Computational Intelligence Magazine, as an Area Chair of ICLR and ICPR, and as a Senior Program Committee member of AAAI and IJCAI. He is a senior member of IEEE.



    Keynote 2: Large-scale Vision-and-Language Pre-training for Multimodal Learning, Speaker: Zhe Gan, Microsoft

    Abstract: With the advent of models such as OpenAI's CLIP and DALL-E, transformer-based vision-and-language pre-training has become an increasingly hot research topic. In this talk, I will share some of our recent work in this direction and try to answer the following questions. First, how do we perform vision-and-language pre-training? Second, how can the performance of pre-trained models be enhanced via adversarial training? Third, how robust are these pre-trained models? And finally, how can we extend image-text pre-training to video-text pre-training? Accordingly, I will present UNITER, VILLA, Adversarial VQA, HERO, and ClipBERT to answer these questions. I will conclude by briefly discussing the challenges and future directions for vision-and-language pre-training.

    Bio: Dr. Zhe Gan is a Principal Researcher at Microsoft. He received his Ph.D. degree from Duke University in 2018. Before that, he received his Master's and Bachelor's degrees from Peking University in 2013 and 2010, respectively. His current research interests include vision-and-language representation learning, self-supervised pre-training, and adversarial machine learning. He received Best Student Paper Honorable Mention Awards at CVPR 2021 and WACV 2021, and the Outstanding Senior Program Committee Member Award at AAAI 2020. He regularly serves as an Area Chair for NeurIPS, ICML, ICLR, ACL, and AAAI.

    Organizers

    Workshop Chairs (feel free to contact us at cmliot2021@gmail.com if you have any questions)
  • Tong Yu (Adobe Research)
  • Susu Xu (Stony Brook University)
  • Handong Zhao (Adobe Research)
  • Ruiyi Zhang (Adobe Research)
  • Shijia Pan (UC Merced)


    Advising Committee
  • Nicholas Lane (University of Cambridge and Samsung AI)
  • Jennifer Healey (Adobe Research)
  • Branislav Kveton (Google Research)
  • Zheng Wen (DeepMind)
  • Changyou Chen (University at Buffalo)


    Technical Program Committee
  • Bang An (University of Maryland)
  • Guan-Lin Chao (Carnegie Mellon University)
  • Jonathon Fagert (Baldwin Wallace University)
  • Gao Tang (University of Illinois at Urbana-Champaign)
  • Ajinkya Kale (Adobe)
  • Chuanyi Li (Nanjing University)
  • Kunpeng Li (Northeastern University)
  • Wei Ma (The Hong Kong Polytechnic University)
  • Mostafa Mirshekari (Searchable.ai)
  • Xidong Pi (Aurora)
  • Can Qin (Northeastern University)
  • Shijing Si (Ping An Technology AI Center)
  • Rui Wang (Duke University)
  • Yikun Xian (Rutgers University)
  • Yifan Zhou (University at Buffalo)
  • Ming Zeng (Facebook)


    Agenda (Montreal time)

    Welcome! (10:00 - 10:15)
    Keynote 1 (10:15 - 11:00), Speaker: Sheng Li, University of Georgia
    Session 1: Privacy-Preserving Continual and Multimodal Learning (11:00 - 12:00)
  • A distillation-based approach integrating continual learning and federated learning for pervasive services, Anastasiia Usmanova, Philippe Lalanda, Francois Portet, German Vega
  • Capturing occupant activities of daily living through sensor fusion: framework, modelling, privacy aspects and applications, Anooshmita Das, Masab Khalid Annaqeeb
  • Continual Distributed Learning for Crisis Management, Aman Priyanshu, Mudit Sinha, Shreyans Mehta

    Lunch Break (12:00 - 13:00)
    Keynote 2 (13:00 - 13:45), Speaker: Zhe Gan, Microsoft
    Session 2: Data Augmentation for Continual and Multimodal Learning (13:45 - 14:45)
  • SDA: Improving Text Generation with Self Data Augmentation, Ping Yu, Ruiyi Zhang, Yang Zhao, Changyou Chen
  • Adopting Active Learning for User Requests Classification, Yuan Zhang, Chuanyi Li, Bin Luo
  • Fully Unsupervised Domain Adaptation, Zhimeng Yang, Yazhou Ren, Zirui Wu, Ming Zeng, Jie Xu

    Session 3: Novel Applications (14:45 - 15:00)
  • PopCTR: A Multi-Modal Architecture for Click-Through Rate Prediction of Bank Pop-up Advertisements, Liqiang Song, Jiahao Yang, Chengtian Ren, Mengqiu Yao, Yan Yi, Ye Bi, Jianming Wang, Jing Xiao, Ming Yan, Baijun Shen

    Summary (15:00 - 15:10)

    Note: each paper presentation is allotted 15 minutes for the talk and 5 minutes for Q&A.
