no code implementations • 3 Mar 2024 • Tianhua Qi, Wenming Zheng, Cheng Lu, Yuan Zong, Hailun Lian
In this paper, we propose Prosody-aware VITS (PAVITS) for emotional voice conversion (EVC), aiming to achieve two major objectives of EVC: high content naturalness and high emotional naturalness, which are crucial for meeting the demands of human perception.
no code implementations • 19 Jan 2024 • Yong Wang, Cheng Lu, Hailun Lian, Yan Zhao, Björn Schuller, Yuan Zong, Wenming Zheng
These segment-level patches are then encoded using a stack of Swin blocks, in which a local window Transformer is utilized to explore local inter-frame emotional information across frame patches of each segment patch.
no code implementations • 18 Jan 2024 • Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Björn Schuller, Wenming Zheng
In speaker-independent speech emotion recognition, the training and testing samples are collected from diverse speakers, leading to a multi-domain shift challenge across the feature distributions of data from different speakers.
no code implementations • 16 Oct 2023 • Ling Zhou, Mingpei Wang, Xiaohua Huang, Wenming Zheng, Qirong Mao, Guoying Zhao
Micro-expression recognition (MER) in low-resolution (LR) scenarios presents an important and complex challenge, particularly for practical applications such as group MER in crowded environments.
no code implementations • 7 Oct 2023 • Jie Zhu, Yuan Zong, Jingang Shi, Cheng Lu, Hongli Chang, Wenming Zheng
This paper focuses on micro-expression recognition (MER) and proposes a flexible and reliable deep learning method called learning to rank onset-occurring-offset representations (LTR3O).
no code implementations • 6 Oct 2023 • Qing Zhu, Qirong Mao, Jialin Zhang, Xiaohua Huang, Wenming Zheng
Group-level emotion recognition (GER) is an inseparable part of human behavior analysis, aiming to recognize an overall emotion in a multi-person scene.
no code implementations • 9 Aug 2023 • Yijin Zhou, Fu Li, Yang Li, Youshuo Ji, Lijian Zhang, Yuanfang Chen, Wenming Zheng, Guangming Shi
The transfer module encodes the domain-specific information of source and target domains and then re-constructs the source domain's emotional pattern and the target domain's statistical characteristics into the new stylized EEG representations.
no code implementations • 17 Feb 2023 • Yan Zhao, Jincen Wang, Yuan Zong, Wenming Zheng, Hailun Lian, Li Zhao
In this paper, we propose a novel deep transfer learning method called deep implicit distribution alignment networks (DIDAN) to deal with the cross-corpus speech emotion recognition (SER) problem, in which the labeled training (source) and unlabeled testing (target) speech signals come from different corpora.
no code implementations • 22 Oct 2022 • Cheng Lu, Wenming Zheng, Hailun Lian, Yuan Zong, Chuangao Tang, Sunan Li, Yan Zhao
The F-Encoder and T-Encoder model the correlations within frequency bands and time frames, respectively, and they are embedded into a time-frequency joint learning strategy to obtain the time-frequency patterns for speech emotions.
no code implementations • 18 Sep 2022 • Xiaolin Xu, Yuan Zong, Wenming Zheng, Yang Li, Chuangao Tang, Xingxun Jiang, Haolin Jiang
In this paper, we present a large-scale, multi-source, and unconstrained database called SDFE-LV for spotting the onset and offset frames of a complete dynamic facial expression in long videos, a task known as dynamic facial expression spotting (DFES) and a vital preliminary step for many facial expression analysis tasks.
1 code implementation • 12 Apr 2022 • Yang Li, Ji Chen, Fu Li, Boxun Fu, Hao Wu, Youshuo Ji, Yijin Zhou, Yi Niu, Guangming Shi, Wenming Zheng
GMSS has the ability to learn more general representations by integrating multiple self-supervised tasks, including spatial and frequency jigsaw puzzle tasks, and contrastive learning tasks.
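A jigsaw puzzle pretext task can be sketched in a few lines. The following is a generic illustration, not the GMSS implementation: the function name `make_jigsaw_sample`, the block labels, and the use of all permutations of three blocks are assumptions made for the example. The idea shown is that blocks (for instance spatial regions or frequency bands) are reordered by a permutation drawn from a fixed set, and the index of that permutation serves as the self-supervised label.

```python
import itertools
import random


def make_jigsaw_sample(blocks, permutations, rng):
    """Shuffle blocks with a randomly chosen permutation.

    Returns the shuffled blocks and the index of the permutation used,
    which acts as the classification target for the pretext task.
    """
    label = rng.randrange(len(permutations))
    order = permutations[label]
    shuffled = [blocks[i] for i in order]
    return shuffled, label


# Hypothetical blocks; in practice each would hold the features of one
# spatial region or one frequency band.
blocks = ["b0", "b1", "b2"]

# Fixed permutation set. Using all 3! = 6 orderings is an assumption
# made to keep the example small.
permutations = list(itertools.permutations(range(len(blocks))))

rng = random.Random(0)  # seeded so the example is reproducible
shuffled, label = make_jigsaw_sample(blocks, permutations, rng)
print(shuffled, label)
```

A model trained on such samples receives `shuffled` as input and is asked to predict `label`, which requires no human annotation.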
no code implementations • 14 Dec 2021 • Yijin Zhou, Fu Li, Yang Li, Youshuo Ji, Guangming Shi, Wenming Zheng, Lijian Zhang, Yuanfang Chen, Rui Cheng
Moreover, motivated by the observation of the relationship between coarse- and fine-grained emotions, we adopt a dual-head module that enables the PGCN to progressively learn more discriminative EEG features, from coarse-grained (easy) to fine-grained categories (difficult), referring to the hierarchical characteristic of emotion.
1 code implementation • 30 Nov 2021 • Xingxun Jiang, Yuan Zong, Wenming Zheng, Jiateng Liu, Mengting Wei
To solve these problems, this paper proposes a novel Transfer Group Sparse Regression method, namely TGSR, which aims to 1) optimize the measurement and better alleviate the difference between the source and target databases, and 2) highlight the valid facial regions to enhance the extracted features by selecting group features from the raw face features, where each region is associated with a group of raw face features, i.e., salient facial region selection.
no code implementations • 13 Jul 2021 • Qirong Mao, Ling Zhou, Wenming Zheng, Xiuyan Shao, Xiaohua Huang
More specifically, the backbone network extracts feature representations from different facial regions, the RI module uses an attention mechanism to compute an adaptive weight for each region from the region itself, reflecting its unobstructedness and importance, in order to suppress the influence of occlusion, and the RR module exploits the progressive interactions among these regions by performing graph convolutions.
no code implementations • CVPR 2021 • Tengfei Song, Zijun Cui, Wenming Zheng, Qiang Ji
In this paper, we propose a novel hybrid message passing neural network with performance-driven structures (HMP-PS), which combines complementary message passing methods and captures more possible structures in a Bayesian manner.
no code implementations • CVPR 2021 • Tengfei Song, Zijun Cui, Yuru Wang, Wenming Zheng, Qiang Ji
Second, we introduce probabilistic graph convolution, which allows graph convolution to be performed on the distribution of Bayesian Network structure to extract AU structural features.
no code implementations • 19 Oct 2020 • Jiateng Liu, Wenming Zheng, Yuan Zong
Correctly perceiving micro-expression is difficult since micro-expression is an involuntary, repressed, and subtle facial expression, and efficiently revealing the subtle movement changes and capturing the significant segments in a micro-expression sequence is the key to micro-expression recognition (MER).
no code implementations • 21 Sep 2020 • Yang Li, Boxun Fu, Fu Li, Guangming Shi, Wenming Zheng
So it is necessary to give more attention to the EEG samples with strong transferability rather than forcefully training a classification model by all the samples.
no code implementations • 13 Aug 2020 • Xingxun Jiang, Yuan Zong, Wenming Zheng, Chuangao Tang, Wanchuang Xia, Cheng Lu, Jiateng Liu
Experimental results show that DFEW is a well-designed and challenging database, and the proposed EC-STFL can promisingly improve the performance of existing spatiotemporal deep neural networks in coping with the problem of dynamic FER in the wild.
Ranked #17 on Dynamic Facial Expression Recognition on DFEW
no code implementations • 19 Dec 2018 • Yuan Zong, Tong Zhang, Wenming Zheng, Xiaopeng Hong, Chuangao Tang, Zhen Cui, Guoying Zhao
Cross-database micro-expression recognition (CDMER) is one of the recently emerging and interesting problems in micro-expression analysis.
no code implementations • 30 Nov 2018 • Keyu Yan, Wenming Zheng, Tong Zhang, Yuan Zong, Zhen Cui
Cross-database non-frontal expression recognition is a very meaningful but rather difficult subject in the fields of computer vision and affective computing.
no code implementations • 11 Sep 2018 • Zhen Cui, Chunyan Xu, Wenming Zheng, Jian Yang
Visual relationship detection can bridge the gap between computer vision and natural language for scene understanding of images.
no code implementations • 16 Apr 2018 • Jiatao Jiang, Chunyan Xu, Zhen Cui, Tong Zhang, Wenming Zheng, Jian Yang
As an analogy to a standard convolution kernel on images, Gaussian models implicitly coordinate those unordered vertices/nodes and edges in a local receptive field after projecting to the gradient space of Gaussian parameters.
no code implementations • 27 Mar 2018 • Tong Zhang, Wenming Zheng, Zhen Cui, Yang Li
For cross graph convolution, a parameterized Kronecker sum operation is proposed to generate a conjunctive adjacency matrix characterizing the relationship between every pair of nodes across two subgraphs.
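For reference, the standard (unparameterized) Kronecker sum of two square matrices A (n x n) and B (m x m) is A ⊗ I_m + I_n ⊗ B. The sketch below computes it in plain Python for two small adjacency matrices. It illustrates only the textbook operation: the parameterization used in the paper is not described in this snippet and is not reproduced here, and the helper names `kron`, `identity`, and `kronecker_sum` are chosen for the example.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [
            a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
            for j in range(len(a[0]) * cols_b)
        ]
        for i in range(len(a) * rows_b)
    ]


def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kronecker_sum(a, b):
    """Kronecker sum of square matrices: kron(a, I_m) + kron(I_n, b)."""
    n, m = len(a), len(b)
    left = kron(a, identity(m))
    right = kron(identity(n), b)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(left, right)]


# Adjacency matrices of two 2-node graphs, each with a single edge.
A = [[0, 1], [1, 0]]
B = [[0, 1], [1, 0]]

K = kronecker_sum(A, B)
for row in K:
    print(row)
```

Each row and column of `K` corresponds to a pair of nodes, one from each graph, so the result has n * m rows. For the two single-edge graphs above, `K` is the adjacency matrix of a 4-cycle.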
no code implementations • IEEE Transactions on Affective Computing 2018 • Zhenyang Zhang, Wenming Zheng, Peng Song, Zhen Cui
In this paper, a multichannel EEG emotion recognition method based on a novel dynamical graph convolutional neural network (DGCNN) is proposed.
Ranked #2 on Electroencephalogram (EEG) on SEED-IV
no code implementations • 27 Feb 2018 • Chaolong Li, Zhen Cui, Wenming Zheng, Chunyan Xu, Jian Yang
To encode dynamic graphs, the constructed multi-scale local graph convolution filters, consisting of matrices of local receptive fields and signal mappings, are recursively performed on structured graph data in the temporal and spatial domains.
Ranked #1 on Skeleton Based Action Recognition on Florence 3D
no code implementations • 17 Nov 2017 • Chaolong Li, Zhen Cui, Wenming Zheng, Chunyan Xu, Rongrong Ji, Jian Yang
The motion analysis of human skeletons is crucial for human action recognition, which is one of the most active topics in computer vision.
no code implementations • 26 Jul 2017 • Yuan Zong, Xiaohua Huang, Wenming Zheng, Zhen Cui, Guoying Zhao
In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples are from two different micro-expression databases.
no code implementations • 30 May 2017 • Tong Zhang, Wenming Zheng, Zhen Cui, Chaolong Li
Symmetric positive definite (SPD) matrices (e.g., covariances, graph Laplacians, etc.)
no code implementations • 12 May 2017 • Tong Zhang, Wenming Zheng, Zhen Cui, Yuan Zong, Yang Li
Then a bi-directional temporal RNN layer is further used to learn discriminative temporal dependencies from the sequences concatenating spatial features of each time slice produced from the spatial RNN layer.
no code implementations • 24 Jul 2016 • Yang Li, Wenming Zheng, Zhen Cui
To address the sequential changes of images including poses, in this paper we propose a recurrent regression neural network (RRNN) framework to unify two classic tasks of cross-pose face recognition on still images and video-based face recognition.
no code implementations • NeurIPS 2009 • Wenming Zheng, Zhouchen Lin
The method of common spatio-spectral patterns (CSSPs) extends common spatial patterns (CSPs) by using delay embedding to alleviate the adverse effects of noise and artifacts on electroencephalogram (EEG) classification.
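The delay embedding step mentioned above can be sketched as follows. This is a minimal illustration of delay embedding on its own, assuming a single delay `tau` measured in samples and a signal stored as a list of channels. It does not include the spatial filtering stage of CSP or CSSP, and it is not the method proposed in the paper. The function name `delay_embed` is chosen for the example.

```python
def delay_embed(signal, tau):
    """Augment a multichannel signal with a delayed copy of itself.

    signal: list of channels, each a list of samples of equal length T.
    tau:    delay in samples, with 0 < tau < T.

    Returns 2 * C channels of length T - tau: the first C hold the
    samples at times tau..T-1, the last C hold the samples at times
    0..T-tau-1, so column t pairs x(t + tau) with x(t).
    """
    length = len(signal[0])
    if not 0 < tau < length:
        raise ValueError("tau must satisfy 0 < tau < number of samples")
    current = [channel[tau:] for channel in signal]
    delayed = [channel[: length - tau] for channel in signal]
    return current + delayed


# Two hypothetical channels with five samples each.
eeg = [
    [1, 2, 3, 4, 5],
    [10, 20, 30, 40, 50],
]

embedded = delay_embed(eeg, tau=2)
for channel in embedded:
    print(channel)
```

A spatial filter applied to the augmented channels acts on both the current and the delayed samples, which is what lets a spatial method also shape the spectrum of each channel.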