Search Results for author: Wenhang Ge

Found 8 papers, 5 papers with code

LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding

no code implementations • 27 May 2024 • Haoyu Zhao, Wenhang Ge, Ying-Cong Chen

LLM-Optic first employs an LLM as a Text Grounder to interpret complex text queries and accurately identify objects the user intends to locate.

Visual Grounding

SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance

no code implementations • 24 May 2024 • Guibao Shen, Luozhou Wang, Jiantao Lin, Wenhang Ge, Chaozhe Zhang, Xin Tao, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Guangyong Chen, Yijun Li, Ying-Cong Chen

In this paper, we introduce the Scene Graph Adapter (SG-Adapter), leveraging the structured representation of scene graphs to rectify inaccuracies in the original text embeddings.

Text-to-Image Generation

X-Ray: A Sequential 3D Representation For Generation

1 code implementation • 22 Apr 2024 • Tao Hu, Wenhang Ge, Yuyang Zhao, Gim Hee Lee

We introduce X-Ray, a novel 3D sequential representation inspired by the penetrability of x-ray scans.

3D Generation, Object

Decompose and Realign: Tackling Condition Misalignment in Text-to-Image Diffusion Models

1 code implementation • 26 Jun 2023 • Luozhou Wang, Guibao Shen, Wenhang Ge, Guangyong Chen, Yijun Li, Ying-Cong Chen

The "Decompose" phase separates conditions based on pair relationships, computing the result individually for each pair.

Image Generation

Ref-NeuS: Ambiguity-Reduced Neural Implicit Surface Learning for Multi-View Reconstruction with Reflection

1 code implementation • ICCV 2023 • Wenhang Ge, Tao Hu, Haoyu Zhao, Shu Liu, Ying-Cong Chen

We show that, together with a reflection direction-dependent radiance, our model achieves high-quality surface reconstruction on reflective surfaces and outperforms the state of the art by a large margin.

3D Reconstruction, Multi-View 3D Reconstruction +1

Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval

no code implementations • 27 Sep 2022 • Chengzhi Lin, AnCong Wu, Junwei Liang, Jun Zhang, Wenhang Ge, Wei-Shi Zheng, Chunhua Shen

To address this problem, we propose a Text-Adaptive Multiple Visual Prototype Matching model, which automatically captures multiple prototypes to describe a video by adaptive aggregation of video token features.

Cross-Modal Retrieval, Retrieval +2

Camera-Conditioned Stable Feature Generation for Isolated Camera Supervised Person Re-IDentification

1 code implementation • CVPR 2022 • Chao Wu, Wenhang Ge, AnCong Wu, Xiaobin Chang

To learn camera-view invariant features for person Re-IDentification (Re-ID), the cross-camera image pairs of each person play an important role.

Person Re-Identification

Cross-Camera Feature Prediction for Intra-Camera Supervised Person Re-identification across Distant Scenes

1 code implementation • 29 Jul 2021 • Wenhang Ge, Chunyan Pan, AnCong Wu, Hongwei Zheng, Wei-Shi Zheng

To learn camera-invariant representations from cross-camera unpaired training data, we propose a cross-camera feature prediction method that mines cross-camera self-supervision information from camera-specific feature distributions by transforming fake cross-camera positive feature pairs and minimizing the distances between the fake pairs.

Person Re-Identification

