Full Papers

Tuesday 27 October – Conference Day 1

11am – 12.30pm Auditorium

Best Paper Session

Chair: Heng Tao Shen (The University of Queensland, Australia)

  • Analyzing Free-standing Conversational Groups: A Multimodal Approach Xavier Alameda-Pineda, Yan Yan (University of Trento, Italy), Elisa Ricci, Oswald Lanz (Fondazione Bruno Kessler, Italy), Nicu Sebe (University of Trento, Italy)
  • An Affordable Solution for Binocular Eye Tracking and Calibration in Head-mounted Displays Michael Stengel, Steve Grogorick (TU Braunschweig, Germany), Elmar Eisemann (TU Delft, The Netherlands), Martin Eisemann (TH Koeln, Germany), Marcus A. Magnor (TU Braunschweig, Germany)
  • SINGA: Putting Deep Learning in the Hands of Multimedia Users Wei Wang (National University of Singapore), Gang Chen (Zhejiang University, China), Anh Tien Tuan Dinh, Jinyang Gao, Beng Chin Ooi, Kian-Lee Tan, Sheng Wang (National University of Singapore)
  • Weakly-Shared Deep Transfer Networks for Heterogeneous-Domain Knowledge Propagation Xiangbo Shu (Nanjing University of Science and Technology, China), Guo-Jun Qi (University of Central Florida, USA), Jinhui Tang (Nanjing University of Science and Technology, China), Jingdong Wang (Microsoft Research, China)


4 – 5.30pm Room P6

Oral Session 1: Multimedia Indexing and Search

Chair: Heng Tao Shen (The University of Queensland, Australia)

  • Fast and Accurate Content-based Semantic Search in 100M Internet Videos Lu Jiang, Shoou-I Yu (Carnegie Mellon University, USA), Deyu Meng (Xi’an Jiaotong University, China), Yi Yang (University of Technology Sydney, Australia), Teruko Mitamura, Alexander G Hauptmann (Carnegie Mellon University, USA)
  • Visual Coding in a Semantic Hierarchy Yang Yang (University of Electronic Science and Technology of China), Hanwang Zhang (National University of Singapore), Mingxing Zhang, Fumin Shen (University of Electronic Science and Technology of China), Xuelong Li (Chinese Academy of Sciences)
  • Deep Compositional Cross-modal Learning to Rank via Local-Global Alignment Xinyang Jiang, Fei Wu, Xi Li, Zhou Zhao, Weiming Lu, Siliang Tang, Yueting Zhuang (Zhejiang University, China)
  • Effective Multi-Query Expansions: Robust Landmark Retrieval Yang Wang, Xuemin Lin (The University of New South Wales, Australia), Lin Wu (The University of Adelaide, Australia), Wenjie Zhang (The University of New South Wales, Australia)


4 – 5.30pm Room P7

Oral Session 2: Social Multimedia

Chair: Tao Mei (Microsoft Research Asia, China)

  • What are Popular: Exploring Twitter Features for Event Detection, Tracking and Visualization Hongyun Cai (The University of Queensland, Australia), Yang Yang (University of Electronic Science and Technology of China), Xuefei Li, Zi Huang (The University of Queensland, Australia)
  • Cross-Domain Collaborative Learning in Social Multimedia Shengsheng Qian, Tianzhu Zhang (National Lab of Pattern Recognition, Institute of Automation, CAS, China), Richang Hong (School of Computer and Information, Hefei University of Technology, China), Changsheng Xu (National Lab of Pattern Recognition, Institute of Automation, CAS, China)
  • Learning Socially Embedded Visual Representation from Scratch Shaowei Liu, Peng Cui, Wenwu Zhu, Shiqiang Yang (Tsinghua University, China)
  • Spatial-aware Multimodal Location Estimation for Social Images Jiewei Cao, Zi Huang (The University of Queensland, Australia), Yang Yang (University of Electronic Science and Technology of China)


4 – 5.30pm Room P8

Oral Session 3: Emotional and Social Signals in Multimedia

Chair: Hayley Hung (Technical University of Delft, The Netherlands)

  • Collaborative Fashion Recommendation: A Functional Tensor Factorization Approach Yang Hu, Xi Yi, Larry S. Davis (University of Maryland, USA)
  • Predicting and Understanding Urban Perception with Convolutional Neural Networks Lorenzo Porzi, Samuel Rota Bulò, Bruno Lepri, Elisa Ricci (Fondazione Bruno Kessler, University of Perugia, Italy)
  • A Multimodal Predictive Model of Successful Debaters or How I Learned to Sway Votes Maarten Brilman (University of Twente, The Netherlands), Stefan Scherer (USC Institute for Creative Technologies, USA)
  • Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology Brendan Jou, Tao Chen (Columbia University, USA), Nikolaos Pappas (Idiap Research Institute, Switzerland), Miriam Redi (Yahoo Labs, United Kingdom), Mercan Topkara (JW Player, USA), Shih-Fu Chang (Columbia University, USA)


Wednesday 28 October – Conference Day 2

4 – 5.30pm Auditorium

Oral Session 4: Multimedia and Vision

Session Chair: Mohan S Kankanhalli (National University of Singapore, Singapore)

  • Dancing with Turks I-Kao Chiang (University of Pennsylvania, USA), Ian Spiro (New York University, USA), Seungkyu Lee (Kyung Hee University, South Korea), Alyssa Lees (New York University, USA), Jingchen Liu (The Pennsylvania State University, USA), Chris Bregler (New York University, USA), Yanxi Liu (The Pennsylvania State University, USA)
  • Single Image Spectral Reconstruction for Multimedia Applications Antonio Robles-Kelly (NICTA, Australia)
  • SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System with Stitching Xiangyun Meng, Wei Wang, Ben Leong (National University of Singapore)
  • Eye of the Dragon: Exploring Discriminatively Minimalist Sketch-based Abstractions for Object Categories Ravi Kiran Sarvadevabhatla, Venkatesh Babu R (Indian Institute of Science, India)


4 – 5.30pm Room P6

Oral Session 5: Multimedia Art, Entertainment and Culture

Session Chair: Phoebe Chen (La Trobe University, Australia)

  • A Distributed Theatre Experiment with Shakespeare Douglas L Williams, Ian C Kegel (BT, UK), Marian Ursu (University of York, UK), Pablo Cesar, Jack Jansen (Centrum Wiskunde & Informatica, The Netherlands), Erik Geelhoed (Falmouth University, UK), Andras Horti (Joanneum Research, Austria), Michael Frantzis (Goldsmiths, University of London, UK), Bill Scott (Miracle Theatre Company, UK)
  • Image Profiling for History Events on the Fly Jia Chen (Shanghai Jiao Tong University, China), Qin Jin (Renmin University of China), Yong Yu (Shanghai Jiao Tong University, China), Alexander G. Hauptmann (Carnegie Mellon University, USA)
  • Modeling Perspective Effects in Photographic Composition Zihan Zhou, Siqiong He, Jia Li, James Z Wang (The Pennsylvania State University, USA)
  • Who’s Afraid of Itten: Using the Art Theory of Color Combination to Analyze Emotions in Abstract Paintings Andreza Sartori (University of Trento & Telecom Italia, Italy), Dubravko Culibrk (University of Trento, Italy & University of Novi Sad, Serbia), Yan Yan, Nicu Sebe (University of Trento, Italy)


4 – 5.30pm Room P7

Oral Session 6: Telepresence, Virtual, and Augmented Reality

Chair: Klara Nahrstedt (UIUC, USA)

  • Image2Scene: Transforming Style of 3D Room Xiaowu Chen, Jianwei Li (Beihang University, China), Qing Li (Beijing Union University, China), Bo Gao, Dongqing Zou, Qinping Zhao (Beihang University, China)
  • Gradient-based 2D-to-3D Conversion for Soccer Videos Kiana Calagari (Simon Fraser University, Canada), Mohamed Elgharib (Qatar Computing Research Institute, HBKU), Piotr Didyk (Saarland University, Germany), Alexandre Kaspar, Wojciech Matusik (Massachusetts Institute of Technology, USA), Mohamed Hefeeda (Qatar Computing Research Institute, HBKU)
  • Ubii: Towards Seamless Interaction between Digital and Physical Worlds Zhanpeng Huang, Weikai Li, Pan Hui (Hong Kong University of Science and Technology, China)
  • Smart Beholder: An Open-Source Smart Lens for Mobile Photography Chun-Ying Huang (National Taiwan Ocean University), Chih-Fan Hsu, Tsung-Han Tsai (Academia Sinica, Taiwan), Ching-Ling Fan, Cheng-Hsin Hsu (National Tsing Hua University, Taiwan), Kuan-Ta Chen (Academia Sinica, Taiwan)


4 – 5.30pm Room P8

Oral Session 7: Actions and Events

Session Chair: Nicu Sebe (University of Trento, Italy)

  • Coherent Motion Detection with Collective Density Clustering Yunpeng Wu, Yangdong Ye, Chenyang Zhao (Zhengzhou University, China)
  • Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images Chen Sun (University of Southern California, USA), Sanketh Shetty, Rahul Sukthankar (Google, Inc., USA), Ram Nevatia (University of Southern California, USA)
  • Temporal Matching Kernel with Explicit Feature Maps Sébastien Poullot (JFLI (CNRS), National Institute of Informatics, Japan), Shunsuke Tsukatani (University of Tokyo, National Institute of Informatics, Japan), Anh Phuong Nguyen (MMLab – University of Information Technology, Vietnam), Hervé Jégou (Inria, France), Shin’Ichi Satoh (National Institute of Informatics, Japan)
  • Efficient Activity Retrieval through Semantic Graph Queries Gregory Castanon, Yuting Chen, Ziming Zhang, Venkatesh Saligrama (Boston University, USA)


Thursday 29 October – Conference Day 3

11am – 12.30pm Room P6

Oral Session 8: Video Systems

Session Chair: Wu-Chi Feng (Portland State University, USA)

  • Video Killed The Data Store: Extending the n-Dimensional Display Interface for Full Screen Video Charles D Estes, Ketan Mayer-Patel (University of North Carolina at Chapel Hill, USA)
  • Dependency-Aware Unequal Error Protection for Layered Video Coding Mohammad Reza Zakerinasab, Mea Wang (University of Calgary, Canada)
  • HiFi: A Hierarchical Filtering Algorithm for Caching of Online Video Shahid Akhtar, Andre Beck, Ivica Rimac (Alcatel-Lucent, Germany)
  • Exploring QoE for Power Efficiency: A Field Study on Mobile Videos with LCD Displays Zhisheng Yan, Qian Liu (State University of New York at Buffalo, USA), Tong Zhang (Rensselaer Polytechnic Institute, USA), Chang Wen Chen (State University of New York at Buffalo, USA)


11am – 12.30pm Room P7

Oral Session 9: Deep Learning and Multimedia

Session Chair: Cees G M Snoek (University of Amsterdam & Qualcomm Research Netherlands, The Netherlands)

  • Automatic Image Dataset Construction from Click-through Logs Using Deep Neural Network Yalong Bai (Harbin Institute of Technology, China), Kuiyuan Yang (Microsoft Research, China), Wei Yu (Harbin Institute of Technology, China), Chang Xu (Nankai University, China), Wei-Ying Ma (Microsoft Research, China), Tiejun Zhao (Harbin Institute of Technology, China)
  • DeepFont: Identify Your Font from An Image Zhangyang Wang (UIUC, USA), Jianchao Yang (Snapchat Inc, USA), Hailin Jin, Eli Shechtman (Adobe Research, USA), Aseem Agarwala (Google Inc, USA), Jonathan Brandt (Adobe Research, USA), Thomas S. Huang (UIUC, USA)
  • Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification Zuxuan Wu, Xi Wang, Yu-Gang Jiang, Hao Ye, Xiangyang Xue (Fudan University, China)
  • EventNet: A Large Scale Structured Concept Library for Complex Event Detection in Video Guangnan Ye, Yitong Li, Hongliang Xu, Dong Liu, Shih-Fu Chang (Columbia University, USA)


11am – 12.30pm Room P8

Oral Session 10: Multimedia Quality Perception

Session Chair: Bart Thomee (Yahoo Research, USA)

  • Modelling Human Factors in Perceptual Multimedia Quality: On The Role of Personality and Culture Michael James Scott (Brunel University London, UK), Sharath Chandra Guntuku, Yang Huan, Weisi Lin (Nanyang Technological University, Singapore), Gheorghita Ghinea (Brunel University London, UK)
  • Biologically Inspired Media Quality Modeling Luming Zhang, Meng Wang (Hefei University of Technology, China), Liqiang Nie (National University of Singapore), Richang Hong (Hefei University of Technology, China), Roger Zimmermann (National University of Singapore), Yingjie Xia (Zhejiang University, China)
  • QoE Modelling for VP9 and H.265 Videos on Mobile Devices Wei Song, Yao Xiao, Dian Tjondronegoro (Queensland University of Technology, Australia), Antonio Liotta (Eindhoven University of Technology, The Netherlands)
  • Towards Solving the Bottleneck of Pitch-based Singing Voice Separation Bilei Zhu, Wei Li, Linwei Li (Fudan University, China)


2 – 3.30pm Room P6

Oral Session 11: Multimedia Networking

Session Chair: Ketan Mayer-Patel (University of North Carolina, USA)

  • Enhancing the Quality of Interactive Multimedia Services by Proactive Monitoring and Failure Prediction Mohammed Shatnawi (Simon Fraser University, Canada), Mohamed Hefeeda (Qatar Computing Research Institute, Hamad Bin Khalifa University)
  • Distributed Optimal Datacenter Bandwidth Allocation for Dynamic Adaptive Video Streaming Fanxin Kong (McGill University, Canada), Xingjian Lu (McGill University, Canada & East China University of Science and Technology), Mingyuan, Xue Liu (McGill University, Canada), Haibing Guan (Shanghai Jiao Tong University, China)
  • HTTP/2-Based Methods to Improve the Live Experience of Adaptive Streaming Rafael Huysegems (Bell Labs, Belgium), Jeroen van der Hooft (Ghent University – iMinds, Belgium), Tom Bostoen, Patrice Rondao Alface (Bell Labs, Belgium), Stefano Petrangeli, Tim Wauters, Filip De Turck (Ghent University – iMinds, Belgium)
  • Bandwidth-aware Prefetching for Proactive Multi-video Preloading and Improved HAS Performance Vengatanathan Krishnamoorthi, Niklas Carlsson (Linköping University, Sweden), Derek Eager (University of Saskatchewan, Canada), Anirban Mahanti (NICTA, Australia), Nahid Shahmehri (Linköping University, Sweden)


2 – 3.30pm Room P7

Oral Session 12: Data Imperfectness for Multimedia

Session Chair: Meng Wang (Hefei University of Technology, China)

  • Multi-View Visual Recognition of Imperfect Testing Data Qilin Zhang, Gang Hua (Stevens Institute of Technology, USA)
  • If You Can’t Beat Them, Join Them: Learning with Noisy Data Pravin Kakar (Institute for Infocomm Research, Singapore), Alex Yong-Sang Chia (Rakuten Institute of Technology, Singapore)
  • Searching Persuasively: Joint Event Detection and Evidence Recounting with Limited Supervision Xiaojun Chang (University of Technology Sydney, Australia), Yao-Liang Yu (Carnegie Mellon University, USA), Yi Yang (University of Technology Sydney, Australia), Alexander G. Hauptmann (Carnegie Mellon University, USA)
  • Beyond Doctors: Future Health Prediction from Multimedia and Multimodal Observations Liqiang Nie (National University of Singapore), Luming Zhang (Hefei University of Technology, China), Yi Yang (University of Technology Sydney, Australia), Meng Wang, Richang Hong (Hefei University of Technology, China), Tat-Seng Chua (National University of Singapore)


2 – 3.30pm Room P8

Oral Session 13: Multimedia Experiences and Expectations

Session Chair: Dick C A Bulterman (FXPAL, USA)

  • Multi-sensor Self-Quantification of Presentations Tian Gan (National University of Singapore), Yongkang Wong (Interactive & Digital Media Institute, Singapore), Bappaditya Mandal, Vijay Chandrasekhar (Institute for Infocomm Research, Singapore), Mohan S. Kankanhalli (National University of Singapore)
  • HyperMeeting: Supporting Asynchronous Meetings with Hypervideo Andreas Girgensohn, Jennifer Marlow (FX Palo Alto Laboratory, USA), Frank Shipman (Texas A&M University, USA), Lynn Wilcox (FX Palo Alto Laboratory, USA)
  • MMToC: A Multimodal Method for Table of Content Creation in Educational Videos Arijit Biswas, Ankit Gandhi, Om Deshmukh (Xerox Research Centre India)
  • Interactive Scene Flow Editing for Improved Image-based Rendering and Virtual Spacetime Navigation Kai Ruhl (TU Braunschweig, Germany), Martin Eisemann (TH Koeln, Germany), Anna Hilsmann, Peter Eisert (HHI Fraunhofer, Germany), Marcus Magnor (TU Braunschweig, Germany)