Full Papers

Tuesday 23 October – Conference Day 2


15:30 – 16:30 Crystal Ballroom1

Oral Session: Deep – 1

  • Structure Guided Photorealistic Style Transfer Yuheng Zhi (Shanghai Jiao Tong University)*; Huawei Wei (Shanghai Jiao Tong University); Bingbing Ni (Shanghai Jiao Tong University)
  • Crossing-Domain Generative Adversarial Networks for Unsupervised Multi-Domain Image-to-Image Translation Xuewen Yang (Stony Brook University)*; Dongliang Xie (Beijing University of Posts and Telecommunications); Xin Wang (Department of Electrical and Computer Engineering, Stony Brook University)
  • Multi-View Image Generation from a Single-View Bo Zhao (Southwest Jiaotong University); Xiao Wu (Southwest Jiaotong University)*; Zhi-Qi Cheng (Carnegie Mellon University); Hao Liu (NUS); Zequn Jie (Tencent AI Lab); Jiashi Feng (NUS)
  • Sparsely Grouped Multi-task Generative Adversarial Networks for Facial Attribute Manipulation JiChao Zhang (Shandong University)*; Yezhi Shu (Shandong University); Songhua Xu (Xi’an Jiaotong University); Gongze Cao (Zhejiang University); Fan Zhong (Shandong University); Meng Liu (Shandong University); Xueying Qin (Shandong University)


15:30 – 16:30 Crystal Ballroom2

Oral Session: Vision – 1

  • Visual Domain Adaptation with Manifold Embedded Distribution Alignment Jindong Wang (Chinese Academy of Sciences)*; Wenjie Feng (Institute of Computing Technology, CAS, China); Yiqiang Chen (Institute of Computing Technology, Chinese Academy of Sciences); Han Yu (NTU); Meiyu Huang (Qian Xuesen Laboratory of Space Technology, China Academy of Space Technology, Beijing, China); Philip S Yu (UIC)
  • Causally Regularized Learning with Agnostic Data Selection Bias Zheyan Shen (Tsinghua University)*; Peng Cui (Tsinghua University); Kun Kuang (Tsinghua University); Bo Li (Tsinghua University); Peixuan Chen (Tencent)
  • Robust Correlation Filter Tracking with Shepherded Instance-Aware Proposals Yanjie Liang (Xiamen University); Qiangqiang Wu (Xiamen University); Yi Liu (Xiamen University); Yan Yan (Xiamen University); Hanzi Wang (Xiamen University)*
  • A Unified Framework for Multimodal Domain Adaptation Fan Qi (HeFei University of Technology, CASIA)*; Xiaoshan Yang (CASIA); Changsheng Xu (CASIA)


15:30 – 16:30 Crystal Ballroom3

Oral Session: Multimedia – 1

  • What dress fits me best? Fashion Recommendation on the Clothing Style for Personal Body Shape Shintami C. Hidayati (Academia Sinica); Cheng-Chun Hsu (NTUST); Yu-Ting Chang (Academia Sinica); Kai-Lung Hua (NTUST); Jianlong Fu (Microsoft Research); Wen-Huang Cheng (EE, NCTU)*
  • CSAN: Contextual Self-Attention Network for User Sequential Recommendation Xiaowen Huang (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences)*; Shengsheng Qian (Institute of Automation, Chinese Academy of Sciences); Quan Fang (Institute of Automation, Chinese Academy of Sciences); Jitao Sang (Beijing Jiaotong University, China); Changsheng Xu (CASIA)
  • Attentive Interactive Convolutional Matching for Community Question Answering in Social Multimedia Jun Hu (HeFei University of Technology); Shengsheng Qian (Institute of Automation, Chinese Academy of Sciences); Quan Fang (Institute of Automation, Chinese Academy of Sciences); Changsheng Xu (CASIA)*
  • Beyond the Product: Discovering Image Posts for Brands in Social Media Francesco Gelli (NUS)*; Tiberio Uricchio (University of Florence); Xiangnan He (National University of Singapore; University of Science and Technology of China); Alberto Del Bimbo (University of Florence); Tat-Seng Chua (National Univ. of Singapore)


17:00 – 18:00 Crystal Ballroom1

Oral Session: Vision – 2

  • Collaborative Annotation of Semantic Objects in Images with Multi-granularity Supervisions Lishi Zhang (Beihang University); Chenghan Fu (Beihang University); Jia Li (Beihang University)*
  • GraphNet: Learning Image Pseudo Annotations for Weakly-Supervised Semantic Segmentation Mengyang Pu (Beijing Jiaotong University)*; Yaping Huang (Beijing Jiaotong University); Qingji Guan (Beijing Jiaotong University); Qi Zou (Beijing Jiaotong University)
  • Boosting Scene Parsing Performance via Reliable Scale Prediction Hengcan Shi (University of Electronic Science and Technology of China)*; Hongliang Li (University of Electronic Science and Technology of China); Qingbo Wu (University of Electronic Science and Technology of China); Fanman Meng (University of Electronic Science and Technology of China); King N. Ngan (CUHK, UESTC)
  • Learning to Synthesize 3D Indoor Scenes from Monocular Images Fan Zhu (Inception Institute of Artificial Intelligence)*; Li Liu (Inception Institute of Artificial Intelligence); Jin Xie (Nanjing University of Science and Technology); Fumin Shen (UESTC); Ling Shao (Inception Institute of Artificial Intelligence); Yi Fang (New York University)


17:00 – 18:00 Crystal Ballroom2

Oral Session: Multimodal – 1

  • Visual Spatial Attention Network for Relationship Detection Chaojun Han (UESTC)*; Fumin Shen (UESTC); Li Liu (Inception Institute of Artificial Intelligence); Yang Yang (University of Electronic Science and Technology of China); Heng Tao Shen (University of Electronic Science and Technology of China (UESTC))
  • Object-Difference Attention: A simple relational attention for Visual Question Answering Chenfei Wu (Beijing University of Posts and Telecommunications)*; Jinlai Liu (Beijing University of Posts and Telecommunications); Xiaojie Wang (Beijing University of Posts and Telecommunications); Xuan Dong (Beijing University of Posts and Telecommunications)
  • Life-long Cross-media Correlation Learning Jinwei Qi (Peking University); Yuxin Peng (Peking University)*; Yunkan Zhuo (Peking University)
  • Human Conversation Analysis Using Attentive Multimodal Networks with Hierarchical Encoder-Decoder Yue Gu (Rutgers University)*; Xinyu Li (Rutgers University); Kaixiang Huang (MTlab, Meitu Inc.); Shiyu Fu (Rutgers University); Kangning Yang (Rutgers University); Shuhong Chen (Rutgers University); Moliang Zhou (Amazon Inc.); Ivan Marsic (Rutgers University)


17:00 – 18:00 Crystal Ballroom3

Oral Session: System – 1

  • End-to-End Blind Quality Assessment of Compressed Video Using Deep Neural Networks Wentao Liu (University of Waterloo)*; Zhengfang Duanmu (University of Waterloo); Zhou Wang (University of Waterloo)
  • FlexStream: Towards Flexible Adaptive Video Streaming on End Devices using Extreme SDN Ibrahim Ben Mustafa (University of North Georgia)*; Tamer Nadeem (Virginia Commonwealth University); Emir Halepovic (AT&T Labs – Research)
  • CLS: A Cross-user Learning based System for Improving QoE in 360-degree Video Adaptive Streaming Lan Xie (Peking University)*; Xinggong Zhang (PKU); Zongming Guo (Peking University)
  • A Distributed Approach for Bitrate Selection in HTTP Adaptive Streaming Abdelhak Bentaleb (National University of Singapore)*; Ali Begen (Ozyegin University); Roger Zimmermann (NUS); Saad Harous (United Arab Emirates University)

 

Wednesday 24 October – Conference Day 3

10:30 – 11:30 Crystal Ballroom1-3

Best Paper Session

  • GestureGAN for Hand Gesture-to-Gesture Translation in the Wild Hao Tang (University of Trento)*; Wei Wang (University of Trento, EPFL); Dan Xu (University of Trento, University of Oxford); Yan Yan (Texas State University); Nicu Sebe (University of Trento)
  • Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training Bei Liu (Kyoto University)*; Jianlong Fu (Microsoft Research); Makoto P Kato (Kyoto University); Masatoshi Yoshikawa (Kyoto University)
  • Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing Jian Zhao (National University of Singapore)*; Jianshu Li (NUS); Yu Cheng (National University of Singapore); Li Zhou (National University of Singapore); Terence Sim (NUS); Yan Shuicheng (National University of Singapore); Jiashi Feng (NUS)
  • Knowledge-aware Multimodal Dialogue Systems Lizi Liao (National University of Singapore)*; Yunshan Ma (National University of Singapore); Xiangnan He (National University of Singapore; University of Science and Technology of China); Richang Hong (Hefei University of Technology); Tat-Seng Chua (National University of Singapore)

15:30 – 16:30 Crystal Ballroom1

Oral Session: Deep – 2

  • Mining Semantics-Preserving Attention for Group Activity Recognition Yansong Tang (Tsinghua University); Zian Wang (Tsinghua University); Peiyang Li (Tsinghua University); Jiwen Lu (Tsinghua University)*; Ming Yang (Horizon Robotics); Jie Zhou (Tsinghua University)
  • Participation-Contributed Temporal Dynamic Model for Group Activity Recognition Rui Yan (Nanjing University of Science and Technology)*; Jinhui Tang (Nanjing University of Science and Technology); Xiangbo Shu (Nanjing University of Science and Technology); Zechao Li (Nanjing University of Science and Technology); Qi Tian (The University of Texas at San Antonio)
  • WildFish: A Large Benchmark for Fish Recognition in the Wild Peiqin Zhuang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yu Qiao (Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*
  • PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition Haoxuan You (Tsinghua University)*; Yifan Feng (Xiamen University); Rongrong Ji (Xiamen University, China); Yue Gao (Tsinghua University)


15:30 – 16:30 Crystal Ballroom2

Oral Session: Multimedia – 2

  • EmotionGAN: Unsupervised Domain Adaptation for Learning Discrete Probability Distributions of Image Emotions Sicheng Zhao (University of California Berkeley)*; Xin Zhao (Tsinghua University); Guiguang Ding (Tsinghua University, China); Kurt Keutzer (EECS, UC Berkeley)
  • USAR: an interactive user-specific aesthetic ranking framework for images Pei Lv (Zhengzhou University)*; Meng Wang (Zhengzhou University); Yongbo Xu (Zhengzhou University); Ze Peng (Zhengzhou University); Junyi Sun (Zhengzhou University); Shimei Su (Zhengzhou University); Bing Zhou (Zhengzhou University); Mingliang Xu (Zhengzhou University)
  • Deep Multimodal Image-Repurposing Detection Ekraam Sabir (USC ISI)*; Wael Abd-Almageed (Information Sciences Institute); Rex Yue Wu (USC ISI); Prem Natarajan (USC ISI)
  • Facial Expression Recognition Enhanced by Thermal Images through Adversarial Learning Bowen Pan (University of Science and Technology of China); Shangfei Wang (University of Science and Technology of China)*


17:00 – 18:00 Crystal Ballroom1

Oral Session: Vision – 3

  • Only Learn One Sample: Fine-Grained Visual Categorization with One Sample Training Xiangteng He (Peking University); Yuxin Peng (Peking University)*
  • LA-Net: Layout-Aware Dense Network for Monocular Depth Estimation Kecheng Zheng (University of Science and Technology of China); Zheng-Jun Zha (University of Science and Technology of China)*; Yang Cao (University of Science and Technology of China); Xuejin Chen (University of Science and Technology of China); Feng Wu (University of Science and Technology of China)
  • Robustness and Discrimination Oriented Hashing Combining Texture and Invariant Vector Distance Ziqing Huang (Tianjin University)*; Shiguang Liu (Tianjin University)
  • Joint Global and Co-Attentive Representation Learning for Image-Sentence Retrieval Shuhui Wang (VIPL, ICT, Chinese Academy of Sciences)*; Yangyu Chen (University of Chinese Academy of Sciences); Junbao Zhuo (ICT CAS); Qingming Huang (University of Chinese Academy of Sciences); Qi Tian (The University of Texas at San Antonio)


17:00 – 18:00 Crystal Ballroom2

Oral Session: Multimodal – 2

  • Text-to-image Synthesis via Symmetrical Distillation Networks Mingkuan Yuan (Peking University); Yuxin Peng (Peking University)*
  • Context-Aware Visual Policy Network for Sequence-Level Image Captioning Daqing Liu (University of Science and Technology of China); Zheng-Jun Zha (University of Science and Technology of China)*; Hanwang Zhang (Nanyang Technological University); Yongdong Zhang (University of Science and Technology of China); Feng Wu (University of Science and Technology of China)
  • SibNet: Sibling Convolutional Encoder for Video Captioning Sheng Liu (Nanyang Technological University)*; Zhou Ren (Snap Research, USA); Junsong Yuan (State University of New York at Buffalo, USA)
  • Paragraph generation network with visual relationship detection Wenbin Che (Harbin Institute of Technology); Xiaopeng Fan (Harbin Institute of Technology)*; Ruiqin Xiong (Peking University); Debin Zhao (Harbin Institute of Technology)

 

Thursday 25 October – Conference Day 4


10:30 – 11:30 Crystal Ballroom1

Oral Session: Multimedia – 3

  • Supervised Online Hashing via Hadamard Codebook Learning Mingbao Lin (Xiamen University, China); Rongrong Ji (Xiamen University, China)*; Hong Liu (Xiamen University); Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd)
  • Cascaded Feature Augmentation with Diffusion for Image Retrieval Yuanqiang Fang (University of Science and Technology of China)*; Wengang Zhou (University of Science and Technology of China); Yijuan Lu (Texas State University); Jinhui Tang (Nanjing University of Science and Technology); Qi Tian (The University of Texas at San Antonio); Houqiang Li (University of Science and Technology of China)
  • Deep Triplet Quantization Bin Liu (Tsinghua University); Yue Cao (Tsinghua University); Mingsheng Long (Tsinghua University)*; Jianmin Wang (Tsinghua University, China); Jingdong Wang (Microsoft Research)
  • Fast Discrete Cross-modal Hashing With Regressing From Semantic Labels Xingbo Liu (School of Computer Science and Technology, Shandong University); Xiushan Nie (Shandong University of Finance and Economics)*; Wenjun Zeng (Microsoft Research); Chaoran Cui (Shandong University of Finance and Economics); Lei Zhu (Shandong Normal University); Yilong Yin (School of Software Engineering, Shandong University)


10:30 – 11:30 Crystal Ballroom2

Oral Session: Experience – 1

  • ModaNet: A Large-Scale Street Fashion Dataset with Polygon Annotations Shuai Zheng (eBay)*; Fan Yang (eBay Inc.); Hadi Kiapour (eBay); Robinson Piramuthu (eBay Inc.)
  • SLIONS: A Karaoke Application to Enhance Foreign Language Learning Dania Murad (National University of Singapore)*; Riwu Wang (National University of Singapore); Douglas Turnbull (Ithaca College); Ye Wang (National University of Singapore)
  • Context-Aware Unsupervised Text Stylization Shuai Yang (Peking University)*; Jiaying Liu (Peking University); Wenhan Yang (Peking University); Zongming Guo (Peking University)
  • Songle Sync: A Large-Scale Web-based Platform for Controlling Various Devices in Synchronization with Music Jun Kato (National Institute of Advanced Industrial Science and Technology (AIST))*; Masa Ogata (National Institute of Advanced Industrial Science and Technology (AIST)); Takahiro Inoue (National Institute of Advanced Industrial Science and Technology (AIST)); Masataka Goto (National Institute of Advanced Industrial Science and Technology (AIST))


10:30 – 11:30 Crystal Ballroom3

Oral Session: System – 2

  • Fine-grained Grocery Product Recognition by One-shot Learning Weidong Geng (Zhejiang University); Feilin Han (Zhejiang University)*; Jiangke Lin (Zhejiang University); Liuyi Zhu (Zhejiang University); Jieming Bai (Zhejiang University); Suzhen Wang (Zhejiang University); Lin He (Zhejiang University); Qiang Xiao (Zhejiang University); Zhangjiong Lai (Zhejiang University)
  • Reconfigurable Inverted Index Yusuke Matsui (National Institute of Informatics)*; Ryota Hinami (The University of Tokyo); Shin’ichi Satoh (National Institute of Informatics)
  • Robust Billboard-based, Free-viewpoint Video Synthesis Algorithm to Overcome Occlusions under Challenging Outdoor Sport Scenes Hiroshi Sankoh (KDDI Research Inc.)*; Sei Naito (KDDI Research Inc.); Keisuke Nonaka (KDDI Research Inc.); Houari Sabirin (KDDI Research Inc.); Jun Chen (KDDI Research Inc.)
  • iHuman3D: Intelligent Human Body 3D Reconstruction using a Single Flying Camera Wei Cheng (HKUST); Lan Xu (HKUST); Lei Han (HKUST); Yuanfang Guo (Institute of Information Engineering, Chinese Academy of Sciences); Lu Fang (Tsinghua University)*


15:30 – 16:30 Crystal Ballroom1

Oral Session: Deep – 3

  • Learning Collaborative Generation Correction Modules for Blind Image Deblurring and Beyond Risheng Liu (Dalian University of Technology)*; Yi He (Dalian University of Technology); Shichao Cheng (Dalian University of Technology); Xin Fan (Dalian University of Technology); Zhongxuan Luo (DALIAN UNIVERSITY OF TECHNOLOGY)
  • When Deep Fool Meets Deep Prior: Adversarial Attack on Super-Resolution Network Yin Minghao (Tsinghua University)*; Yongbing Zhang (Tsinghua University); Xiu Li (Tsinghua University); Shiqi Wang (CityU)
  • Semantic Image Inpainting with Progressive Generative Networks Haoran Zhang (HeFei University of Technology); Zhenzhen Hu (Hefei University of Technology)*; Changzhi Luo (HFUT); Wangmeng Zuo (Harbin Institute of Technology); Meng Wang (Hefei University of Technology)
  • Structural inpainting Huy V. Vo (Ecole Polytechnique); Ngoc Duong (Technicolor)*; Patrick Pérez (Valeo.ai)


17:00 – 18:00 Crystal Ballroom1

Oral Session: Vision – 4

  • Fine-grained Representation Learning and Recognition by Exploiting Hierarchical Semantic Embedding Tianshui Chen (Sun Yat-Sen University)*; Wenxi Wu (Sun Yat-Sen University); Yuefang Gao (South China Agricultural University); Le Dong (University of Electronic Science and Technology of China (UESTC)); Xiaonan Luo (Sun Yat-sen University); Liang Lin (Sun Yat-sen University)
  • Dissimilarity Representation Learning for Generalized Zero-Shot Recognition Gang Yang (Renmin University of China)*; Jinlu Liu (Renmin University of China); Xirong Li (Renmin University of China); Jieping Xu (Renmin University of China)
  • Attribute-Aware Attention Model for Fine-grained Representation Learning Kai Han (Peking University); Jianyuan Guo (Peking University); Chao Zhang (Peking University)*; Mingjian Zhu (Peking University)
  • GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning Siyu Huang (Zhejiang University); Xi Li (Zhejiang University)*; Zhi-Qi Cheng (Carnegie Mellon University); Zhongfei Zhang (Zhejiang University); Alexander G. Hauptmann (Carnegie Mellon University, USA)