Title: Social image parsing by cross-modal data refinement
Authors: Lu, Zhiwu
Gao, Xin
Huang, Songfang
Wang, Liwei
Wen, Ji-Rong
Affiliation: School of Information, Renmin University of China, Beijing, China
CEMSE Division, KAUST, Thuwal, Jeddah, Saudi Arabia
IBM China Research Lab., Beijing, China
School of EECS, Peking University, Beijing, China
Issue Date: 2015
Publisher: 24th International Joint Conference on Artificial Intelligence, IJCAI 2015
Citation: 24th International Joint Conference on Artificial Intelligence, IJCAI 2015. Buenos Aires, Argentina, 2015/1/1, 2015-January (2169-2175).
Abstract: This paper presents a cross-modal data refinement algorithm for social image parsing, i.e., segmenting all the objects within a social image and then identifying their categories. Unlike traditional fully supervised image parsing, which takes pixel-level labels as strong supervisory information, our social image parsing is initially provided only with the noisy tags of images (i.e., image-level labels), which are shared by social users. By oversegmenting each image into multiple regions, we formulate social image parsing as a cross-modal data refinement problem over a large set of regions, where the initial labels of each region are inferred from image-level labels. Furthermore, we develop an efficient algorithm to solve this cross-modal data refinement problem. The experimental results on several benchmark datasets show the effectiveness of our algorithm. More notably, our algorithm can be considered an alternative and natural way to address the challenging problem of image parsing, since image-level labels are much easier to access than pixel-level labels.
Appears in Collections: School of Electronics Engineering and Computer Science (信息科学技术学院)


License: See PKU IR operational policies.