no code implementations • 25 Mar 2024 • Hao Ai, Lin Wang
With a flexible ERP image encoder, it includes an ICOSAP point encoder and a Bi-projection Bi-attention Fusion (B2F) module (~1M parameters in total).
no code implementations • 19 Jan 2024 • Hao Ai, Zidong Cao, Haonan Lu, Chen Chen, Jian Ma, Pengyuan Zhou, Tae-Kyun Kim, Pan Hui, Lin Wang
To this end, we propose a transformer-based 360 image outpainting framework called Dream360, which can generate diverse, high-fidelity, and high-resolution panoramas from user-selected viewports, considering the spherical properties of 360 images.
1 code implementation • 4 Nov 2023 • Hao Ai, Lu Sheng
Therefore, we present a new method in this paper, Stable Diffusion Reference Only, an images-to-image self-supervised model that uses only two types of conditional images for precisely controlled generation to accelerate secondary painting.
no code implementations • ICCV 2023 • Zidong Cao, Hao Ai, Yan-Pei Cao, Ying Shan, XiaoHu Qie, Lin Wang
The Möbius transformation is typically employed to further provide the opportunity for movement and zoom on ODIs, but applying it at the image level often results in blurring and aliasing.
no code implementations • 17 Apr 2023 • Zidong Cao, Hao Ai, Athanasios V. Vasilakos, Lin Wang
Our key idea is to transfer the scene structural knowledge from the HR image modality and the corresponding LR depth maps to achieve the goal of HR depth estimation without any extra inference cost.
no code implementations • 21 Mar 2023 • Hao Ai, Zidong Cao, Yan-Pei Cao, Ying Shan, Lin Wang
Depth estimation from a monocular 360° image is a burgeoning problem owing to its holistic sensing of a scene.
no code implementations • CVPR 2023 • Hao Ai, Zidong Cao, Yan-Pei Cao, Ying Shan, Lin Wang
Depth estimation from a monocular 360 image is a burgeoning problem owing to its holistic sensing of a scene.
1 code implementation • 21 May 2022 • Hao Ai, Zidong Cao, Jinjing Zhu, Haotian Bai, Yucheng Chen, Lin Wang
Omnidirectional image (ODI) data is captured with a 360°×180° field of view, which is much wider than that of pinhole cameras, and contains richer spatial information than conventional planar images.