1 code implementation • 17 Apr 2024 • Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Fangyuan Kong, Haotian Fan, Yifang Xu, Haoran Xu, Mengduo Yang, Jie Zhou, Jiaze Li, Shijie Wen, Mai Xu, Da Li, Shunyu Yao, Jiazhi Du, WangMeng Zuo, Zhibo Li, Shuai He, Anlong Ming, Huiyuan Fu, Huadong Ma, Yong Wu, Fie Xue, Guozhi Zhao, Lina Du, Jie Guo, Yu Zhang, Huimin Zheng, JunHao Chen, Yue Liu, Dulan Zhou, Kele Xu, Qisheng Xu, Tao Sun, Zhixiang Ding, Yuhang Hu
This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), in which the submitted solutions were evaluated on KVQ, a dataset collected from the popular short-form video platform, i.e., the Kuaishou/Kwai platform.
no code implementations • 15 Apr 2024 • Chengfeng Liu, Mai Xu, Qunliang Xing, Xin Zou
Lossy image compression is essential for Mars exploration missions, due to the limited bandwidth between Earth and Mars.
no code implementations • 27 Feb 2024 • Qunliang Xing, Mai Xu, Shengxi Li, Xin Deng, Meisong Zheng, Huaida Liu, Ying Chen
However, these methods exhibit a pervasive enhancement bias towards the compression domain, inadvertently regarding it as more realistic than the raw domain.
no code implementations • ICCV 2023 • Junpeng Jing, Jiankun Li, Pengfei Xiong, Jiangyu Liu, Shuaicheng Liu, Yichen Guo, Xin Deng, Mai Xu, Lai Jiang, Leonid Sigal
A novel Uncertainty Guided Adaptive Correlation (UGAC) module is introduced to robustly adapt the same model for different scenarios.
no code implementations • 18 Mar 2023 • Miaohui Wang, Zhuowei Xu, Mai Xu, Weisi Lin
Qualitative and quantitative results on Dark-4K show that BMQA achieves superior performance to existing BIQA approaches, as long as a pre-trained model is provided to generate text descriptions.
1 code implementation • ICCV 2023 • Shengxi Li, Jialu Zhang, Yifei Li, Mai Xu, Xin Deng, Li Li
The emergence of conditional generative adversarial networks (cGANs) has revolutionised the way we approach and control the generation, by means of adversarially learning joint distributions of data and auxiliary information.
no code implementations • ICCV 2023 • Xin Deng, Chao GAO, Mai Xu
In this paper, we propose a novel method, namely PIRNet, which performs privacy-preserving image restoration in the steganographic domain.
1 code implementation • CVPR 2023 • Yichen Guo, Mai Xu, Lai Jiang, Leonid Sigal, Yunjin Chen
To alleviate this issue, we propose the first attempt at 360° image rescaling, which refers to downscaling a 360° image to a visually valid low-resolution (LR) counterpart and then upscaling it to a high-resolution (HR) 360° image given the LR variant.
1 code implementation • 20 Nov 2022 • Qunliang Xing, Mai Xu, Xin Deng, Yichen Guo
Image defocus is inherent in the physics of image formation caused by the optical aberration of lenses, providing plentiful information on image quality.
2 code implementations • 21 Apr 2022 • Meisong Zheng, Qunliang Xing, Minglang Qiao, Mai Xu, Lai Jiang, Huaida Liu, Ying Chen
As a widely studied task, video restoration aims to enhance the quality of videos with multiple potential degradations, such as noise, blur and compression artifacts.
2 code implementations • 20 Apr 2022 • Ren Yang, Radu Timofte, Meisong Zheng, Qunliang Xing, Minglang Qiao, Mai Xu, Lai Jiang, Huaida Liu, Ying Chen, Youcheng Ben, Xiao Zhou, Chen Fu, Pei Cheng, Gang Yu, Junyi Li, Renlong Wu, Zhilu Zhang, Wei Shang, Zhengyao Lv, Yunjin Chen, Mingcai Zhou, Dongwei Ren, Kai Zhang, WangMeng Zuo, Pavel Ostyakov, Vyal Dmitry, Shakarim Soltanayev, Chervontsev Sergey, Zhussip Magauiya, Xueyi Zou, Youliang Yan, Pablo Navarrete Michelini, Yunhua Lu, Diankai Zhang, Shaoli Liu, Si Gao, Biao Wu, Chengjian Zheng, Xiaofeng Zhang, Kaidi Lu, Ning Wang, Thuong Nguyen Canh, Thong Bach, Qing Wang, Xiaopeng Sun, Haoyu Ma, Shijie Zhao, Junlin Li, Liangbin Xie, Shuwei Shi, Yujiu Yang, Xintao Wang, Jinjin Gu, Chao Dong, Xiaodi Shi, Chunmei Nian, Dong Jiang, Jucai Lin, Zhihuai Xie, Mao Ye, Dengyan Luo, Liuhan Peng, Shengjie Chen, Qian Wang, Xin Liu, Boyang Liang, Hang Dong, Yuhao Huang, Kai Chen, Xingbei Guo, Yujing Sun, Huilei Wu, Pengxu Wei, Yulin Huang, Junying Chen, Ik Hyun Lee, Sunder Ali Khowaja, Jiseok Yoon
This challenge includes three tracks.
1 code implementation • CVPR 2022 • Lai Jiang, Yifei Li, Shengxi Li, Mai Xu, Se Lei, Yichen Guo, Bo Huang
E-commerce images play a central role in attracting people's attention during online retailing and shopping, and accurate attention prediction is of significant importance for both customers and retailers; yet research on this topic has barely begun.
Ranked #2 on Saliency Prediction on SALECI
1 code implementation • 18 Nov 2021 • Li Yang, Mai Xu, Shengxi Li, Yichen Guo, Zulin Wang
When assessing the quality of 360° video, humans tend to perceive its quality degradation progressively, from the viewport-based spatial distortion of each spherical frame, to motion artifacts across adjacent frames, ending with the video-level quality score, i.e., a progressive quality assessment paradigm.
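The progressive paradigm described above (viewport-level spatial scores, pooled over frames with motion awareness, then aggregated into a video score) can be sketched as a toy pooling scheme. This is a hypothetical simplification for illustration, not the paper's actual network; all function names and the motion-weighting rule are assumptions.

```python
# Toy sketch of progressive quality pooling for 360° video
# (hypothetical simplification, not the paper's learned model).

def viewport_score(viewport_distortions):
    """Frame-level score: average spatial quality over sampled viewports."""
    return sum(viewport_distortions) / len(viewport_distortions)

def temporal_pool(frame_scores, motion_weights):
    """Video-level score: weight each frame by inter-frame motion magnitude."""
    total_w = sum(motion_weights)
    return sum(s * w for s, w in zip(frame_scores, motion_weights)) / total_w

# Example: three frames, each with three sampled viewport scores.
frames = [[4.0, 3.5, 4.5], [2.0, 2.5, 1.5], [4.0, 4.0, 4.0]]
motion = [1.0, 3.0, 1.0]   # heavy motion coincides with the degraded frame

frame_scores = [viewport_score(v) for v in frames]
video_score = temporal_pool(frame_scores, motion)
```

Because the degraded middle frame carries the largest motion weight, the pooled video score (2.8) falls well below the plain frame average, mimicking how motion artifacts dominate perceived quality.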
1 code implementation • 5 Nov 2021 • Minglang Qiao, Yufan Liu, Mai Xu, Xin Deng, Bing Li, Weiming Hu, Ali Borji
In this paper, we propose a multitask learning method for visual-audio saliency prediction and sound source localization on multi-face video by leveraging visual, audio and face information.
1 code implementation • 22 Aug 2021 • Zhengyong Wang, Liquan Shen, Mei Yu, Kun Wang, Yufei Lin, Mai Xu
However, these methods ignore the significant domain gap between synthetic and real data (i.e., the inter-domain gap), and thus models trained on synthetic data often fail to generalize well to real underwater scenarios.
1 code implementation • journal 2021 • Qing Ding, Liquan Shen, Liangwei Yu, Hao Yang, Mai Xu
To overcome these limitations, we propose a patch-wise spatial-temporal quality enhancement network, which first extracts spatial and temporal features, then recalibrates and fuses them.
1 code implementation • CVPR 2021 • Xin Deng, Wenzhe Yang, Ren Yang, Mai Xu, Enpeng Liu, Qianhan Feng, Radu Timofte
To fully explore the mutual information across two stereo images, we use a deep regression model to estimate the homography matrix, i.e., the H matrix.
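For context on what such an H matrix encodes, the classical direct linear transform (DLT) estimates a 3x3 homography from four point correspondences. The sketch below is the textbook linear-algebra method, not the paper's deep regression model; the function names are illustrative.

```python
# Minimal pure-Python DLT estimate of a 3x3 homography H from four point
# correspondences (classical baseline; the paper uses deep regression instead).

def solve(A, b):
    """Gaussian elimination with partial pivoting for a square linear system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [M[r][c] - f * M[col][c] for c in range(n + 1)]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def estimate_homography(src, dst):
    """DLT with h33 fixed to 1: two linear equations per correspondence."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_homography(H, p):
    """Map a 2-D point through H in homogeneous coordinates."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

Fixing h33 = 1 keeps the system square (8 equations, 8 unknowns); it is valid whenever the true homography has a nonzero bottom-right entry, which holds for typical stereo-rectification cases.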
no code implementations • CVPR 2021 • Lai Jiang, Mai Xu, Xiaofei Wang, Leonid Sigal
In this paper, we propose a novel task for saliency-guided image translation, with the goal of image-to-image translation conditioned on the user specified saliency map.
no code implementations • CVPR 2021 • Xin Deng, Hao Wang, Mai Xu, Yichen Guo, Yuhang Song, Li Yang
In addition, we propose a deep reinforcement learning scheme with a latitude adaptive reward, in order to automatically select optimal upscaling factors for different latitude bands.
1 code implementation • ECCV 2020 • Yufan Liu, Minglang Qiao, Mai Xu, Bing Li, Weiming Hu, Ali Borji
Inspired by the findings of our investigation, we propose a novel multi-modal video saliency model consisting of three branches: visual, audio and face.
1 code implementation • 10 Mar 2021 • Li Yang, Mai Xu, Xin Deng, Bo Feng
To alleviate this issue, we propose a spatial attention-based perceptual quality prediction network for non-reference quality assessment on ODIs (SAP-net).
1 code implementation • ICCV 2021 • Junpeng Jing, Xin Deng, Mai Xu, Jianyi Wang, Zhenyu Guan
Capacity, invisibility and security are three primary challenges in image hiding task.
1 code implementation • ECCV 2020 • Jianyi Wang, Xin Deng, Mai Xu, Congyong Chen, Yuhang Song
In this paper, we focus on enhancing the perceptual quality of compressed video.
1 code implementation • ECCV 2020 • Qunliang Xing, Mai Xu, Tianyi Li, Zhenyu Guan
Recently, extensive approaches have been proposed to reduce image compression artifacts at the decoder side; however, they require a series of architecture-identical models to process images of different qualities, which is inefficient and resource-consuming.
1 code implementation • 23 Jun 2020 • Tianyi Li, Mai Xu, Runzhi Tang, Ying Chen, Qunliang Xing
In VVC, the quad-tree plus multi-type tree (QTMT) structure of coding unit (CU) partition accounts for over 97% of the encoding time, due to the brute-force search for recursive rate-distortion (RD) optimization.
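The idea of replacing brute-force recursive RD search with a learned split decision can be sketched as a quadtree recursion that prunes subtrees whenever a predictor says "do not split". Everything here is a toy stand-in: the real work trains a model on encoder statistics, whereas this sketch uses a simple variance threshold as the hypothetical predictor.

```python
# Toy sketch of pruning a recursive CU-partition search with a learned
# split predictor (the variance threshold is a hypothetical stand-in).

def variance(block):
    """Sample variance of all pixels in a 2-D block."""
    flat = [p for row in block for p in row]
    mean = sum(flat) / len(flat)
    return sum((p - mean) ** 2 for p in flat) / len(flat)

def predict_split(block):
    """Stand-in for a trained classifier: split only textured blocks."""
    return variance(block) > 100.0

def partition(block, min_size=2):
    """Quadtree partition guided by the predictor instead of full RD search."""
    n = len(block)
    if n <= min_size or not predict_split(block):
        return {"size": n, "split": False}
    half = n // 2
    quads = [[row[c:c + half] for row in block[r:r + half]]
             for r in (0, half) for c in (0, half)]
    return {"size": n, "split": True,
            "children": [partition(q, min_size) for q in quads]}
```

Each pruned node skips an entire subtree of RD evaluations, which is where the encoding-time saving comes from.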
1 code implementation • ICCV 2019 • Xin Deng, Ren Yang, Mai Xu, Pier Luigi Dragotti
In this paper, we propose a novel method based on wavelet domain style transfer (WDST), which achieves a better PD tradeoff than the GAN based methods.
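To illustrate the wavelet-domain setting, the sketch below performs a one-level 2-D Haar decomposition, the kind of subband split on which a wavelet-domain style transfer would operate (e.g., keeping one image's low-frequency LL band while borrowing another's high-frequency detail). This is a toy simplification, not the WDST algorithm itself.

```python
# One-level 2-D Haar decomposition of an even-sized image into four
# subbands: LL (approximation) plus LH/HL/HH (detail). Toy sketch only.

def haar2d(img):
    """Return (LL, LH, HL, HH) subbands of a 2-D list of numbers."""
    # Horizontal pass: per-row pairwise averages (low) and differences (high).
    rows = [([(r[i] + r[i + 1]) / 2 for i in range(0, len(r), 2)],
             [(r[i] - r[i + 1]) / 2 for i in range(0, len(r), 2)]) for r in img]
    lo = [r[0] for r in rows]
    hi = [r[1] for r in rows]

    # Vertical pass: pairwise averages/differences down each column.
    def cols(mat):
        avg = [[(mat[i][c] + mat[i + 1][c]) / 2 for c in range(len(mat[0]))]
               for i in range(0, len(mat), 2)]
        dif = [[(mat[i][c] - mat[i + 1][c]) / 2 for c in range(len(mat[0]))]
               for i in range(0, len(mat), 2)]
        return avg, dif

    LL, LH = cols(lo)
    HL, HH = cols(hi)
    return LL, LH, HL, HH
```

A style-transfer-flavored use would then recombine bands from two decompositions, e.g., `(LL_content, LH_style, HL_style, HH_style)`, before inverting the transform.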
no code implementations • 6 Jun 2019 • Tie Liu, Mai Xu, Zulin Wang
In this paper, we establish a large-scale video database for rain removal (LasVR), which consists of 316 rain videos.
2 code implementations • 17 May 2019 • Yuhang Song, Andrzej Wojcicki, Thomas Lukasiewicz, Jianyi Wang, Abi Aryan, Zhenghua Xu, Mai Xu, Zihan Ding, Lianlong Wu
That is, there is not yet a general evaluation platform for research on multi-agent intelligence.
1 code implementation • 12 May 2019 • Yuhang Song, Jianyi Wang, Thomas Lukasiewicz, Zhenghua Xu, Shangtong Zhang, Andrzej Wojcicki, Mai Xu
Intrinsic rewards were introduced to simulate how human intelligence works; they are usually evaluated by intrinsically-motivated play, i.e., playing games without extrinsic rewards but evaluated with extrinsic rewards.
no code implementations • 15 Apr 2019 • Mai Xu, Li Yang, Xiaoming Tao, Yiping Duan, Zulin Wang
According to these findings, our SalGAIL approach applies deep reinforcement learning (DRL) to predict the head fixations of one subject, in which GAIL learns the reward of DRL, rather than the traditional human-designed reward.
1 code implementation • CVPR 2019 • Liu Li, Mai Xu, Xiaofei Wang, Lai Jiang, Hanruo Liu
The attention maps of the ophthalmologists are also collected in the LAG database through a simulated eye-tracking experiment.
1 code implementation • 11 Mar 2019 • Ren Yang, Xiaoyan Sun, Mai Xu, Wen-Jun Zeng
The past decade has witnessed great success in applying deep learning to enhance the quality of compressed video.
no code implementations • 5 Mar 2019 • Tianyi Li, Mai Xu, Ren Yang, Xiaoming Tao
High efficiency video coding (HEVC) has achieved outstanding efficiency for video compression.
1 code implementation • 26 Feb 2019 • Qunliang Xing, Zhenyu Guan, Mai Xu, Ren Yang, Tie Liu, Zulin Wang
Finally, experiments validate the effectiveness and generalization ability of our MFQE approach in advancing the state-of-the-art quality enhancement of compressed video.
Ranked #5 on Video Enhancement on MFQE v2
1 code implementation • 10 Nov 2018 • Yuhang Song, Jianyi Wang, Thomas Lukasiewicz, Zhenghua Xu, Mai Xu
However, HRL with multiple levels is usually needed in many real-world scenarios, whose ultimate goals are highly abstract, while their actions are very primitive.
2 code implementations • 9 Oct 2018 • Jiaxin Lu, Mai Xu, Ren Yang, Zulin Wang
In particular, we find that the high-level feature of scene category is highly correlated with outdoor natural scene memorability, and that deep features learnt by a deep neural network (DNN) are also effective in predicting memorability scores.
1 code implementation • ECCV 2018 • Lai Jiang, Mai Xu, Tie Liu, Minglang Qiao, Zulin Wang
Hence, an object-to-motion convolutional neural network (OM-CNN) is developed to predict the intra-frame saliency for DeepVS, which is composed of the objectness and motion subnets.
no code implementations • 27 Aug 2018 • Jiaxin Lu, Mai Xu, Ren Yang, Zulin Wang
Recent studies on image memorability have shed light on the visual features that make generic images, object images or face photographs memorable.
1 code implementation • 29 Jul 2018 • Chen Li, Mai Xu, Xinzhe Du, Zulin Wang
To fill the gap between subjective quality and human behavior, this paper proposes a large-scale visual quality assessment (VQA) dataset of omnidirectional video, called VQA-OV, which collects 60 reference sequences and 540 impaired sequences.
1 code implementation • CVPR 2018 • Ren Yang, Mai Xu, Zulin Wang, Tianyi Li
In this paper, we observe that heavy quality fluctuation exists across compressed video frames, and thus low-quality frames can be enhanced using neighboring high-quality frames, an approach we call Multi-Frame Quality Enhancement (MFQE).
Ranked #6 on Video Enhancement on MFQE v2
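The MFQE idea of exploiting quality fluctuation can be sketched as locating peak-quality frames on a per-frame quality curve (e.g., PSNR) and pairing each low-quality frame with its nearest peak as a reference. This is a toy illustration of the pairing step, not the paper's actual peak detector or enhancement network.

```python
# Toy sketch: find peak-quality frames on a quality curve and assign each
# low-quality frame its nearest peak as an enhancement reference.

def peak_frames(quality):
    """Indices whose quality is a local maximum (endpoints included)."""
    peaks = []
    for i, q in enumerate(quality):
        left = quality[i - 1] if i > 0 else float("-inf")
        right = quality[i + 1] if i + 1 < len(quality) else float("-inf")
        if q >= left and q >= right:
            peaks.append(i)
    return peaks

def nearest_references(quality):
    """Map each non-peak frame to the closest peak-quality frame."""
    peaks = peak_frames(quality)
    return {i: min(peaks, key=lambda p: abs(p - i))
            for i in range(len(quality)) if i not in peaks}

# Example: PSNR-like curve with peaks at frames 0, 3 and 5.
refs = nearest_references([30, 28, 27, 32, 29, 31])
```

Frames 1, 2 and 4 would then be enhanced by warping detail from their assigned high-quality neighbors.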
1 code implementation • 30 Oct 2017 • Yuhang Song, Mai Xu, Jianyi Wang, Minglang Qiao, Liangyu Huo, Zulin Wang
Finally, the experiments validate that our approach is effective in both offline and online prediction of HM positions for panoramic video, and that the learned offline-DHP model can improve the performance of online-DHP.
no code implementations • 20 Sep 2017 • Ren Yang, Mai Xu, Tie Liu, Zulin Wang, Zhenyu Guan
Our experimental results validate that our QE-CNN method is effective in enhancing quality for both I and P frames of HEVC videos.
1 code implementation • 19 Sep 2017 • Lai Jiang, Mai Xu, Zulin Wang
We further find from our database that there exists a temporal correlation of human attention with a smooth saliency transition across video frames.
1 code implementation • 19 Sep 2017 • Mai Xu, Tianyi Li, Zulin Wang, Xin Deng, Ren Yang, Zhenyu Guan
Therefore, this paper proposes a deep learning approach to predict the CU partition for reducing HEVC complexity at both intra- and inter-modes, which is based on a convolutional neural network (CNN) and a long short-term memory (LSTM) network.
1 code implementation • CVPR 2017 • Yufan Liu, Songyang Zhang, Mai Xu, Xuming He
On the other hand, we find that the attention of different subjects consistently focuses on a single face in each frame of videos involving multiple faces.
no code implementations • ICCV 2015 • Mai Xu, Yun Ren, Zulin Wang
For modeling attention on faces and facial features, the proposed method learns a Gaussian mixture model (GMM) distribution from the fixations of eye-tracking data as the top-down feature for saliency detection of face images.
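Learning a GMM from fixation data amounts to running expectation-maximization on the fixation coordinates. The sketch below fits a two-component 1-D mixture with plain EM as a toy stand-in for the paper's face-saliency GMM; the initialization scheme and all names are assumptions.

```python
# Minimal 1-D two-component Gaussian mixture fitted with EM, sketching how
# a GMM prior could be learned from eye-tracking fixation coordinates.
import math

def gauss(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm2(data, iters=50):
    """Fit a two-component 1-D GMM, initialised at the data extremes."""
    mus = [min(data), max(data)]
    vars_ = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, mus, vars_)]
            s = sum(p)
            resp.append([pi / s for pi in p])
        # M-step: re-estimate mixture weights, means and variances.
        for j in range(2):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            vars_[j] = max(sum(r[j] * (x - mus[j]) ** 2
                               for r, x in zip(resp, data)) / nj, 1e-8)
    return weights, mus, vars_
```

With fixations clustered around two facial features, the fitted means land on the cluster centers and the weights reflect how often each feature was fixated; a 2-D version over (x, y) fixations follows the same E/M structure.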