Title: Multimodal Learning toward Micro-Video Understanding
Print Length: 188 pages
Edition: 1
Language: English
Released: 2019-09-17
ISBN-10: 1681736284
ISBN-13: 9781681736280
Book Description
Micro-videos, a new form of user-generated content, have been spreading widely across various social platforms, such as Vine, Kuaishou, and TikTok.
Unlike traditional long videos, micro-videos are usually recorded on smart mobile devices, at any place, and last only a few seconds. Due to their brevity and low bandwidth cost, micro-videos are gaining increasing user enthusiasm. The blossoming of micro-videos opens the door to many promising applications, ranging from network content caching to online advertising. Thus, it is highly desirable to develop an effective scheme for high-order micro-video understanding.
Micro-video understanding is, however, non-trivial due to the following challenges: (1) how to represent micro-videos that convey only one or a few high-level themes or concepts; (2) how to utilize the hierarchical structure of venue categories to guide micro-video analysis; (3) how to alleviate the influence of low quality caused by complex surrounding environments and camera shake; (4) how to model multimodal sequential data, i.e., textual, acoustic, visual, and social modalities, to enhance micro-video understanding; and (5) how to construct large-scale benchmark datasets for analysis. These challenges have been largely unexplored to date.
In this book, we focus on addressing the challenges presented above by proposing some state-of-the-art multimodal learning theories. To demonstrate the effectiveness of these models, we apply them to three practical tasks of micro-video understanding: popularity prediction, venue category estimation, and micro-video routing. In particular, we first build three large-scale real-world micro-video datasets for these practical tasks. We then present a multimodal transductive learning framework for micro-video popularity prediction. Furthermore, we introduce several multimodal cooperative learning approaches and a multimodal transfer learning scheme for micro-video venue category estimation. Meanwhile, we develop a multimodal sequential learning approach for micro-video recommendation. Finally, we conclude the book and outline future research directions in multimodal learning toward micro-video understanding.