Call for papers: Deep Learning for Intelligent Multimedia Systems

Guest Editors

Meng Liu, Shandong Jianzhu University, China

Yan Yan, Texas State University, USA

Tian Gan, Shandong University, China

Hua Huang, Beijing Institute of Technology, China

Mohan Kankanhalli, National University of Singapore, Singapore


We are living in the era of multimedia: a tremendous volume of videos, images, and texts is generated, published, and shared every day, and multimedia data has become an indispensable part of today’s big data. This large-scale data raises both challenges and opportunities for developing intelligent multimedia systems, such as retrieval, recommendation, recognition, categorization, and generation systems. Although shallow learning has achieved some progress, its capacity for processing large-scale data remains limited. Meanwhile, deep learning algorithms have enabled the development of highly accurate systems and have become a standard choice for analyzing different types of data. For instance, convolutional neural networks have demonstrated high capability in image classification, and recurrent neural networks are widely used to model temporal sequences in natural language processing. Inspired by this, we are keen on applying deep learning techniques to boost the performance of multimedia analysis tasks, including object/action detection, image/video captioning, and image/video classification.

The goal of this special issue is to assemble recent advances in deep learning-based multimedia analytics and in relatively new related areas. The multimedia data of interest covers a wide spectrum, ranging from text, audio, images, click-through logs, Web videos, and EEG signals to surveillance videos. In particular, we expect novel contributions to focus on the following research lines:

  1. State-of-the-art models and algorithms for multimedia analysis tasks, ranging from object detection, semantic classification, and entity annotation to multimedia captioning, question answering, and storytelling, which play an important role in public security, entertainment, healthcare, social media, and other domains
  2. Novel directions based on emerging multimedia data
  3. Surveys of recent progress in this research area
  4. Construction of benchmark datasets.


The list of possible topics includes, but is not limited to:

  • Deep learning towards image/video classification, object detection, and segmentation
  • Deep learning towards image object localization and video moment localization
  • Deep learning towards multimedia captioning, question answering and storytelling
  • Deep learning towards content-based retrieval of data (e.g. text, images, video, and music)
  • Deep learning towards content understanding of data from various multimedia systems
  • Deep learning towards data generation, fusion, and enhancement
  • Deep learning towards multimodal systems
  • Applications of deep learning in multimedia research, and intelligent systems more generally
  • New datasets and benchmarks for intelligent multimedia systems.

Important dates

  • Submission deadline: November 15, 2020
  • First-round review: January 1, 2021
  • Revision submission: February 15, 2021
  • Final decision: March 15, 2021

Instructions for authors

Authors should prepare their manuscripts according to the journal's Submission Guidelines. When submitting, authors should select “SI: Deep Learning for Intelligent Multimedia Systems” at the “Article Type” step of the submission process. Submitted papers should present original, unpublished work relevant to one of the topics of the Special Issue. All submitted papers will be evaluated by at least three independent reviewers. If a submission is an extended version of a previously published workshop or conference paper, this must be stated explicitly in the cover letter, and the published paper must be cited in the submitted journal version. Papers must be written in English and must not exceed 30 pages (single column, double-spaced, 12 pt font), including figures, tables, and references.