1 code implementation • ICML 2020 • Jianshu Zhang, Jun Du, Yongxin Yang, Yi-Zhe Song, Si Wei, Li-Rong Dai
Recent encoder-decoder approaches typically employ string decoders to convert images into serialized strings for image-to-markup.
1 code implementation • 29 May 2024 • Chaitat Utintu, Pinaki Nath Chowdhury, Aneeshan Sain, Subhadeep Koley, Ayan Kumar Bhunia, Yi-Zhe Song
This paper introduces a novel approach to sketch colourisation, inspired by the universal childhood activity of colouring and its professional applications in design and story-boarding.
1 code implementation • 14 Mar 2024 • Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Tao Xiang, Timothy Hospedales, Yi-Zhe Song
(ii) SketchINR's auto-decoder provides a much higher-fidelity representation than other learned vector sketch representations, and is uniquely able to scale to complex vector sketches such as FS-COCO.
no code implementations • 14 Mar 2024 • Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Tao Xiang, Yi-Zhe Song
In this paper, we explore the unique modality of sketch for explainability, emphasising the profound impact of human strokes compared to conventional pixel-oriented studies.
no code implementations • 12 Mar 2024 • Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
Two primary input modalities prevail in image retrieval: sketch and text.
1 code implementation • 12 Mar 2024 • Subhadeep Koley, Ayan Kumar Bhunia, Deeptanshu Sekhri, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
This paper unravels the potential of sketches for diffusion models, addressing the deceptive promise of direct sketch control in generative AI.
no code implementations • 12 Mar 2024 • Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
This paper, for the first time, explores text-to-image diffusion models for Zero-Shot Sketch-based Image Retrieval (ZS-SBIR).
no code implementations • 11 Mar 2024 • Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
@q loss to inject that understanding into the system.
no code implementations • 7 Dec 2023 • Hmrishav Bandyopadhyay, Subhadeep Koley, Ayan Das, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
In this paper, we democratise 3D content creation, enabling precise generation of 3D shapes from abstract sketches while overcoming limitations tied to drawing skills.
no code implementations • 7 Dec 2023 • Dar-Yen Chen, Ayan Kumar Bhunia, Subhadeep Koley, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song
In this paper, we democratise caricature generation, empowering individuals to effortlessly craft personalised caricatures with just a photo and a conceptual sketch.
1 code implementation • 27 Nov 2023 • Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
To bridge this gap, we introduce a novel task, Virtual Creatures Generation: Given a set of unlabeled images of the target concepts (e.g., 200 bird species), we aim to train a T2I model capable of creating new, hybrid concepts within diverse backgrounds and contexts.
no code implementations • 26 Nov 2023 • Zhiyu Qu, Lan Yang, Honggang Zhang, Tao Xiang, Kaiyue Pang, Yi-Zhe Song
Creating multi-view wire art (MVWA), a static 3D sculpture with diverse interpretations from different viewpoints, is a complex task even for skilled artists.
1 code implementation • 24 Nov 2023 • Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma
High-resolution image generation with Generative Artificial Intelligence (GenAI) has immense potential but, due to the enormous capital investment required for training, it is increasingly centralised to a few large corporations, and hidden behind paywalls.
no code implementations • 13 Nov 2023 • Ruolin Yang, Da Li, Conghui Hu, Timothy Hospedales, Honggang Zhang, Yi-Zhe Song
Reference-based video object segmentation is an emerging topic which aims to segment the corresponding target object in each video frame referred by a given reference, such as a language expression or a photo mask.
1 code implementation • 27 Aug 2023 • Zhiyu Qu, Tao Xiang, Yi-Zhe Song
Through this work, we hope to reshape the way we create visual content, democratise the creative process, and inspire further research in enhancing human creativity in AIGC.
1 code implementation • ICCV 2023 • Ling Luo, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song, Yulia Gryaditskaya
3D shape modeling is labor-intensive, time-consuming, and requires years of expertise.
no code implementations • CVPR 2023 • Zhiyu Qu, Yulia Gryaditskaya, Ke Li, Kaiyue Pang, Tao Xiang, Yi-Zhe Song
Following this, we design a simple explainability-friendly sketch encoder that accommodates the intrinsic properties of strokes: shape, location, and order.
no code implementations • 7 Apr 2023 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song
Such strictly-ordered discrete factorization however falls short of capturing key properties of chirographic data -- it fails to build holistic understanding of the temporal concept due to one-way visibility (causality).
1 code implementation • ICCV 2023 • Sauradip Nag, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang
Concretely, we establish the denoising process in the Transformer decoder (e.g., DETR) by introducing a temporal location query design with faster convergence in training.
no code implementations • CVPR 2023 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song
In particular, we first perform independent prompting on both sketch and photo branches of an SBIR model to build highly generalisable sketch and photo encoders on the back of the generalisation ability of CLIP.
1 code implementation • CVPR 2023 • Fengyin Lin, Mingkang Li, Da Li, Timothy Hospedales, Yi-Zhe Song, Yonggang Qi
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR), however with two significant differentiators to prior art: (i) we tackle all variants (inter-category, intra-category, and cross datasets) of ZS-SBIR with just one network ("everything"), and (ii) we would really like to understand how this sketch-photo matching operates ("explainable").
no code implementations • CVPR 2023 • Aneeshan Sain, Ayan Kumar Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Soumitri Chattopadhyay, Tao Xiang, Yi-Zhe Song
This paper advances the fine-grained sketch-based image retrieval (FG-SBIR) literature by putting forward a strong baseline that outperforms the prior state of the art by ~11%.
no code implementations • CVPR 2023 • Aneeshan Sain, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Subhadeep Koley, Tao Xiang, Yi-Zhe Song
At the very core of our solution is a prompt learning setup.
no code implementations • CVPR 2023 • Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
We further introduce specific designs to tackle the abstract nature of human sketches, including a fine-grained discriminative loss on the back of a trained sketch-photo retrieval model, and a partial-aware sketch augmentation strategy.
no code implementations • CVPR 2023 • Ayan Kumar Bhunia, Subhadeep Koley, Amandeep Kumar, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
Human sketch has already proved its worth in various visual understanding tasks (e.g., retrieval, segmentation, image-captioning, etc.).
1 code implementation • CVPR 2023 • Abhra Chaudhuri, Ayan Kumar Bhunia, Yi-Zhe Song, Anjan Dutta
For the first time, we identify that for data-scarce tasks like Sketch-Based Image Retrieval (SBIR), where the difficulty in acquiring paired photos and hand-drawn sketches limits data-dependent cross-modal learning algorithms, DFL can prove to be a much more practical paradigm.
no code implementations • 10 Mar 2023 • Zhongying Deng, Da Li, Junjun He, Yi-Zhe Song, Tao Xiang
D-CFA minimizes the domain gap by augmenting the source data with distribution-sampled target features, and trains a noise-robust discriminative classifier by using target domain knowledge from the generative models.
1 code implementation • CVPR 2023 • Xiao Han, Xiatian Zhu, Licheng Yu, Li Zhang, Yi-Zhe Song, Tao Xiang
In the fashion domain, there exists a variety of vision-and-language (V+L) tasks, including cross-modal retrieval, text-guided image retrieval, multi-modal classification, and image captioning.
1 code implementation • 15 Feb 2023 • Kam Woh Ng, Xiatian Zhu, Jiun Tian Hoe, Chee Seng Chan, Tianyu Zhang, Yi-Zhe Song, Tao Xiang
However, these methods often overlook the fact that the similarity between data points in the continuous feature space may not be preserved in the discrete hash code space, due to the limited similarity range of hash codes.
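The mismatch described above can be illustrated with a minimal sketch. It assumes a plain sign-based binarisation (an illustrative stand-in, not the hashing method of the paper) and hypothetical helper names. With K-bit codes, the Hamming similarity can take only K+1 distinct values, so two features that are close in cosine similarity may map to codes that look noticeably less similar.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two real-valued feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sign_hash(vec):
    """Binarise a feature vector into a {-1, +1} code (assumed scheme)."""
    return [1 if x >= 0 else -1 for x in vec]


def hamming_similarity(code_a, code_b):
    """Fraction of matching bits; only K+1 values are possible for K bits."""
    matches = sum(1 for x, y in zip(code_a, code_b) if x == y)
    return matches / len(code_a)


# Two nearby features whose second coordinate straddles zero.
a = [0.9, 0.1, 0.2]
b = [0.8, -0.1, 0.3]
print(cosine_similarity(a, b))                          # continuous similarity
print(hamming_similarity(sign_hash(a), sign_hash(b)))   # coarse code similarity
```

Here a single sign flip costs one third of the code similarity even though the continuous features remain nearly parallel.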
1 code implementation • CVPR 2023 • Ruoyi Du, Dongliang Chang, Kongming Liang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma
Our code is available at https://github.com/PRIS-CV/On-the-fly-Category-Discovery.
no code implementations • ICCV 2023 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song
We perform pivoting on two existing datasets, each from a distant research domain to the other: 2D sketch and photo pairs from the sketch-based image retrieval field (SBIR), and 3D shapes from ShapeNet.
1 code implementation • CVPR 2023 • Ke Li, Kaiyue Pang, Yi-Zhe Song
This lack of sketch data has imposed on the community a few "peculiar" design choices -- the most representative of them all is perhaps the coerced utilisation of photo-based pre-training (i.e., no sketch), for many core tasks that otherwise dictate specific sketch understanding.
no code implementations • ICCV 2023 • Yurong Guo, Ruoyi Du, Yuan Dong, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma
In this paper, we first observe the dependence of task-specific parameter configuration on the target task.
no code implementations • CVPR 2023 • Dongliang Chang, Yujun Tong, Ruoyi Du, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma
Therefore, we first propose a feature disentanglement module and a feature re-fusion module to reduce negative transfer and boost positive transfer between different datasets.
no code implementations • ICCV 2023 • Xiao Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang
Controllable person image synthesis aims at rendering a source image based on user-specified changes in body pose or appearance.
1 code implementation • 30 Nov 2022 • Jijie Wu, Dongliang Chang, Aneeshan Sain, Xiaoxu Li, Zhanyu Ma, Jie Cao, Jun Guo, Yi-Zhe Song
Conventional few-shot learning methods however cannot be naively adopted for this fine-grained setting -- a quick pilot study reveals that they in fact push for the opposite (i.e., lower inter-class variations and higher intra-class variations).
1 code implementation • 27 Nov 2022 • Sauradip Nag, Mengmeng Xu, Xiatian Zhu, Juan-Manuel Perez-Rua, Bernard Ghanem, Yi-Zhe Song, Tao Xiang
In this work, we introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered as a marriage of FS-TAD and ZS-TAD by leveraging few-shot support videos and new class names jointly.
1 code implementation • CVPR 2023 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
To address this problem, in this work we introduce a novel model-agnostic post-processing method without model redesign and retraining.
no code implementations • 19 Nov 2022 • Sen He, Yi-Zhe Song, Tao Xiang
Key to our model is a parallel flow estimation module that predicts the flow fields for both person and garment images conditioned on the target pose.
no code implementations • 25 Oct 2022 • Tingwei Wang, Da Li, Kaiyang Zhou, Tao Xiang, Yi-Zhe Song
Machine learning models are intrinsically vulnerable to domain shift between training and testing data, resulting in poor performance in novel domains.
no code implementations • 15 Oct 2022 • Zhihe Lu, Sen He, Da Li, Yi-Zhe Song, Tao Xiang
To ensure that the fused scores are not biased to either the base or novel classes, a new Transformer-based calibration module is introduced.
1 code implementation • 4 Oct 2022 • Zhongying Deng, Da Li, Yi-Zhe Song, Tao Xiang
Given any existing fully-trained one-step MSDA model, BORT$^2$ turns it into a labeling function to generate pseudo-labels for the target data and trains a target model using pseudo-labeled target data only.
1 code implementation • 20 Sep 2022 • Ling Luo, Yulia Gryaditskaya, Yongxin Yang, Tao Xiang, Yi-Zhe Song
We then, for the first time, study the scenario of fine-grained 3D VR sketch to 3D shape retrieval, as a novel VR sketching application and a proving ground to drive out generic insights to inform future research.
1 code implementation • 20 Sep 2022 • Ling Luo, Yulia Gryaditskaya, Yongxin Yang, Tao Xiang, Yi-Zhe Song
In this paper, we offer a different perspective towards answering these questions -- we study the use of 3D sketches as an input modality and advocate a VR-scenario where retrieval is conducted.
1 code implementation • 19 Sep 2022 • Chufeng Xiao, Wanchao Su, Jing Liao, Zhouhui Lian, Yi-Zhe Song, Hongbo Fu
We invited 70 novice users and 38 expert users to sketch 136 3D objects, which were presented as 362 images rendered from multiple views.
1 code implementation • 19 Sep 2022 • Ling Luo, Yulia Gryaditskaya, Tao Xiang, Yi-Zhe Song
In particular, we propose to use a triplet loss with an adaptive margin value driven by a "fitting gap", which is the similarity of two shapes under structure-preserving deformations.
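A triplet loss with an adaptive margin can be sketched as follows. This is a minimal illustration, not the authors' implementation: the way the margin is derived from the fitting gap (`scale * fitting_gap`), the Euclidean distance, and all names are assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def adaptive_margin_triplet_loss(anchor, positive, negative,
                                 fitting_gap, scale=1.0):
    """Triplet loss whose margin depends on a per-triplet fitting gap.

    `fitting_gap` is assumed to be a precomputed non-negative score for
    the positive/negative shape pair; the linear mapping to a margin is
    a hypothetical choice for illustration.
    """
    margin = scale * fitting_gap
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [0.0, 3.0]
# Small gap: the negative is already far enough away, so the loss is zero.
print(adaptive_margin_triplet_loss(anchor, positive, negative, fitting_gap=0.5))
# Large gap: a wider margin is demanded, so the same triplet is penalised.
print(adaptive_margin_triplet_loss(anchor, positive, negative, fitting_gap=3.0))
```

Compared with a fixed margin, the same embedding geometry is penalised more or less depending on how different the two shapes are judged to be.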
1 code implementation • 14 Aug 2022 • Chenjian Gao, Qian Yu, Lu Sheng, Yi-Zhe Song, Dong Xu
Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.
1 code implementation • 17 Jul 2022 • Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang
We thus propose a Multi-View Contrastive Learning task for pulling closer the visual representation of one image to the compositional multimodal representation of another image+text.
1 code implementation • 17 Jul 2022 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
Such a novel design effectively eliminates the dependence between localization and classification by breaking the route for error propagation in-between.
Ranked #1 on Zero-Shot Action Detection on THUMOS'14
1 code implementation • 14 Jul 2022 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
Such a novel design effectively eliminates the dependence between localization and classification by cutting off the route for error propagation in-between.
Ranked #1 on Semi-Supervised Action Detection on ActivityNet-1.3
2 code implementations • 14 Jul 2022 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
Existing temporal action detection (TAD) methods rely on generating an overwhelmingly large number of proposals per video.
Ranked #15 on Temporal Action Localization on ActivityNet-1.3
1 code implementation • 4 Jul 2022 • Ayan Kumar Bhunia, Aneeshan Sain, Parth Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
To solve this new problem, we introduce a novel model-agnostic meta-learning (MAML) based framework with several key modifications: (1) As a retrieval task with a margin-based contrastive loss, we simplify the MAML training in the inner loop to make it more stable and tractable.
no code implementations • CVPR 2023 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song
In this paper, we extend scene understanding to include that of human sketch.
1 code implementation • 6 Apr 2022 • Xiao Han, Sen He, Li Zhang, Yi-Zhe Song, Tao Xiang
In this paper, we propose a Unified Interactive Garment Retrieval (UIGR) framework to unify TGR and VCR.
3 code implementations • CVPR 2022 • Sen He, Yi-Zhe Song, Tao Xiang
To achieve this, a key step is garment warping which spatially aligns the target garment with the corresponding body parts in the person image.
Ranked #1 on Virtual Try-on on VITON
no code implementations • CVPR 2022 • Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Subhadeep Koley, Rohit Kundu, Aneeshan Sain, Tao Xiang, Yi-Zhe Song
In this paper, we push the boundary further for FSCIL by addressing two key questions that bottleneck its ubiquitous application: (i) can the model learn from diverse modalities other than just photo (as humans do), and (ii) what if photos are not readily accessible (due to ethical and privacy constraints).
no code implementations • CVPR 2022 • Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
In this paper, we argue that this setup is by definition not compatible with the inherent abstract and subjective nature of sketches, i.e., the model might transfer well to new categories, but will not understand sketches existing in a different test-time distribution as a result.
no code implementations • CVPR 2022 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Aneeshan Sain, Tao Xiang, Yi-Zhe Song
We scrutinise an important observation plaguing scene-level sketch research -- that a significant portion of scene sketches are "partial".
1 code implementation • CVPR 2022 • Ayan Kumar Bhunia, Subhadeep Koley, Abdullah Faiz Ur Rahman Khilji, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
We first conducted a pilot study that revealed the secret lies in the existence of noisy strokes, but not so much of the "I can't sketch".
1 code implementation • 9 Mar 2022 • Zhongying Deng, Kaiyang Zhou, Da Li, Junjun He, Yi-Zhe Song, Tao Xiang
In this paper, we address both single-source and multi-source UDA from a completely different perspective, which is to view each instance as a fine domain.
1 code implementation • 4 Mar 2022 • Pinaki Nath Chowdhury, Aneeshan Sain, Ayan Kumar Bhunia, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song
We advance sketch research to scenes with the first dataset of freehand scene sketches, FS-COCO.
1 code implementation • CVPR 2022 • Lan Yang, Kaiyue Pang, Honggang Zhang, Yi-Zhe Song
Our key discovery lies in exploiting the magnitude (L2 norm) of a sketch feature as a quantitative quality metric.
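Using feature magnitude as a quality score can be sketched in a few lines. The example assumes that features come from some already-trained encoder and that a larger L2 norm indicates higher quality; both the direction of the relationship and the function names are assumptions for illustration.

```python
import math


def feature_magnitude(feature):
    """L2 norm of a feature vector, used here as a scalar quality score."""
    return math.sqrt(sum(x * x for x in feature))


def rank_by_quality(features):
    """Indices of `features` sorted from largest to smallest magnitude."""
    return sorted(range(len(features)),
                  key=lambda i: feature_magnitude(features[i]),
                  reverse=True)


# Hypothetical encoder outputs for three sketches.
features = [[3.0, 4.0], [1.0, 0.0], [6.0, 8.0]]
print([feature_magnitude(f) for f in features])
print(rank_by_quality(features))
```

Because the score is a by-product of the feature itself, no extra quality annotation or prediction head is needed in this sketch.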
no code implementations • 20 Dec 2021 • Anran Qi, Yulia Gryaditskaya, Tao Xiang, Yi-Zhe Song
We aim to segment all sketches belonging to the same category provisioned with a single sketch with a given part annotation while (i) preserving the parts semantics embedded in the exemplar, and (ii) being robust to input style and abstraction.
no code implementations • 13 Dec 2021 • Tianyuan Yu, Sen He, Yi-Zhe Song, Tao Xiang
This is because they use an instance GNN as a label propagation/classification module, which is jointly meta-learned with a feature embedding network.
1 code implementation • 6 Dec 2021 • Ruoyi Du, Dongliang Chang, Zhanyu Ma, Yi-Zhe Song, Jun Guo
Despite great strides made on fine-grained visual classification (FGVC), current methods are still heavily reliant on fully-supervised paradigms where ample expert labels are called for.
1 code implementation • 6 Dec 2021 • Dongliang Chang, Kaiyue Pang, Ruoyi Du, Zhanyu Ma, Yi-Zhe Song, Jun Guo
1 lays out our approach in answering this question.
no code implementations • 11 Nov 2021 • Xiu-Shen Wei, Yi-Zhe Song, Oisin Mac Aodha, Jianxin Wu, Yuxin Peng, Jinhui Tang, Jian Yang, Serge Belongie
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition, and underpins a diverse set of real-world applications.
no code implementations • 29 Sep 2021 • Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
In this paper, to address the above two challenges, a novel Global Segmentation Mask Transformer (GSMT) is proposed.
2 code implementations • NeurIPS 2021 • Jiun Tian Hoe, Kam Woh Ng, Tianyu Zhang, Chee Seng Chan, Yi-Zhe Song, Tao Xiang
In this work, we propose a novel deep hashing model with only a single learning objective.
no code implementations • ICLR 2022 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song
Learning meaningful representations for chirographic drawing data such as sketches, handwriting, and flowcharts is a gateway for understanding and emulating human creative expression.
no code implementations • ICCV 2021 • Yonggang Qi, Guoyao Su, Pinaki Nath Chowdhury, Mingkang Li, Yi-Zhe Song
The key challenge in designing a sketch representation lies with handling the abstract and iconic nature of sketches.
1 code implementation • ICCV 2021 • Zhihe Lu, Sen He, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang
A few-shot semantic segmentation model is typically composed of a CNN encoder, a CNN decoder and a simple classifier (separating foreground and background pixels).
no code implementations • ICCV 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang
The generated face image given a target age code is expected to be age-sensitive reflected by bio-plausible transformations of shape and texture, while being identity preserving.
no code implementations • ICCV 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song
Our framework is iterative in nature, in that it utilises predicted knowledge of character sequences from a previous iteration, to augment the main network in improving the next prediction.
no code implementations • ICCV 2021 • Ayan Kumar Bhunia, Aneeshan Sain, Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Yi-Zhe Song
In this paper, we argue that semantic information plays a role complementary to visual information alone.
no code implementations • ICCV 2021 • Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song
In this paper, for the first time, we argue for their unification -- we aim for a single model that can compete favourably with two separate state-of-the-art STR and HTR models.
no code implementations • 18 May 2021 • Conghui Hu, Yongxin Yang, Yunpeng Li, Timothy M. Hospedales, Yi-Zhe Song
The practical value of existing supervised sketch-based image retrieval (SBIR) algorithms is largely limited by the requirement for intensive data collection and labeling.
1 code implementation • CVPR 2021 • Yonggang Qi, Kai Zhang, Aneeshan Sain, Yi-Zhe Song
Perceptual organization remains one of the very few established theories on the human visual system.
1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song
In this paper, we take a completely different perspective -- we work on the assumption that there is always a new style that is drastically different, and that we will only have very limited data during testing to perform adaptation.
1 code implementation • CVPR 2021 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song
Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations.
no code implementations • CVPR 2021 • Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song
With this meta-learning framework, our model can not only disentangle the cross-modal shared semantic content for SBIR, but can adapt the disentanglement to any unseen user style as well, making the SBIR model truly style-agnostic.
1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song
A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is the data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs.
1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song
This data is uniquely characterised by its existence in dual modalities of rasterized images and vector coordinate sequences.
1 code implementation • CVPR 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yongxin Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang
We argue that these are caused by the lack of context-aware object and stuff feature encoding in their generators, and location-sensitive appearance representation in their discriminators.
Ranked #1 on Layout-to-Image Generation on COCO-Stuff 128x128
no code implementations • ICCV 2021 • Lan Yang, Kaiyue Pang, Honggang Zhang, Yi-Zhe Song
The superiority of explicitly abstracting sketch representation is empirically validated on a number of sketch analysis tasks, including sketch recognition, fine-grained sketch-based image retrieval, and generative sketch healing.
1 code implementation • CVPR 2021 • Dongliang Chang, Kaiyue Pang, Yixiao Zheng, Zhanyu Ma, Yi-Zhe Song, Jun Guo
For that, we re-envisage the traditional setting of FGVC, from single-label classification, to that of top-down traversal of a pre-defined coarse-to-fine label hierarchy -- so that our answer becomes "bird"-->"Phoenicopteriformes"-->"Phoenicopteridae"-->"flamingo".
Ranked #16 on Fine-Grained Image Classification on FGVC Aircraft
1 code implementation • 12 Nov 2020 • Yue Zhong, Yulia Gryaditskaya, Honggang Zhang, Yi-Zhe Song
Deep image-based modeling has received a lot of attention in recent years, yet the parallel problem of sketch-based modeling has only been briefly studied, often as a potential application.
1 code implementation • 29 Jul 2020 • Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song
In this paper, we study a further trait of sketches that has been overlooked to date, that is, they are hierarchical in terms of the levels of detail -- a person typically sketches up to various extents of detail to depict an object.
1 code implementation • 7 Jul 2020 • Peng Xu, Yongye Huang, Tongtong Yuan, Tao Xiang, Timothy M. Hospedales, Yi-Zhe Song, Liang Wang
Specifically, we use our dual-branch architecture as a universal representation framework to design two sketch-specific deep models: (i) We propose a deep hashing model for sketch retrieval, where a novel hashing loss is specifically designed to accommodate both the abstract and messy traits of sketches.
1 code implementation • ECCV 2020 • Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song
The study of neural generative models of human sketches is a fascinating contemporary modeling problem due to the links between sketch image generation and the human drawing process.
no code implementations • 3 Apr 2020 • Da Li, Yongxin Yang, Yi-Zhe Song, Timothy Hospedales
In DG this means encountering a sequence of domains and at each step training to maximise performance on the next domain.
Ranked #80 on Domain Generalization on PACS
5 code implementations • ECCV 2020 • Ruoyi Du, Dongliang Chang, Ayan Kumar Bhunia, Jiyang Xie, Zhanyu Ma, Yi-Zhe Song, Jun Guo
In this work, we propose a novel framework for fine-grained visual classification to tackle these problems.
Ranked #17 on Fine-Grained Image Classification on Stanford Cars
2 code implementations • 8 Mar 2020 • Dongliang Chang, Aneeshan Sain, Zhanyu Ma, Yi-Zhe Song, Jun Guo
The key insight lies in how we exploit the mutually beneficial information between two networks: (a) to separate samples of known and unknown classes, and (b) to maximize the domain confusion between source and target domain without the influence of unknown samples.
1 code implementation • 24 Feb 2020 • Ayan Kumar Bhunia, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo instance given a user's query sketch.
no code implementations • 21 Feb 2020 • Peng Xu, Kun Liu, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo, Yi-Zhe Song
Existing sketch-analysis work studies sketches depicting static objects or scenes.
3 code implementations • 11 Feb 2020 • Dongliang Chang, Yifeng Ding, Jiyang Xie, Ayan Kumar Bhunia, Xiaoxu Li, Zhanyu Ma, Ming Wu, Jun Guo, Yi-Zhe Song
The proposed loss function, termed as mutual-channel loss (MC-Loss), consists of two channel-specific components: a discriminality component and a diversity component.
Ranked #29 on Fine-Grained Image Classification on FGVC Aircraft
1 code implementation • 3 Feb 2020 • Peng Xu, Zeyu Song, Qiyue Yin, Yi-Zhe Song, Liang Wang
In this paper, we tackle for the first time, the problem of self-supervised representation learning for free-hand sketches.
no code implementations • 16 Jan 2020 • Deng Yu, Lei LI, Youyi Zheng, Manfred Lau, Yi-Zhe Song, Chiew-Lan Tai, Hongbo Fu
In this paper, we study the problem of multi-view sketch correspondence, where we take as input multiple freehand sketches with different views of the same object and predict as output the semantic correspondence among the sketches.
2 code implementations • 8 Jan 2020 • Peng Xu, Timothy M. Hospedales, Qiyue Yin, Yi-Zhe Song, Tao Xiang, Liang Wang
Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present.
no code implementations • 10 Nov 2019 • Jianjun Lei, Yuxin Song, Bo Peng, Zhanyu Ma, Ling Shao, Yi-Zhe Song
How to align abstract sketches and natural images into a common high-level semantic space remains a key problem in SBIR.
no code implementations • ICCV 2019 • Umar Riaz Muhammad, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song
In the former, one asks whether a machine can 'understand' enough about the meaning of input data to produce a meaningful but more compact abstraction.
1 code implementation • CVPR 2019 • Sounak Dey, Pau Riba, Anjan Dutta, Josep Llados, Yi-Zhe Song
Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, instead of ones included in existing datasets that can often be semi-photorealistic.
2 code implementations • ICCV 2019 • Da Li, Jianshu Zhang, Yongxin Yang, Cong Liu, Yi-Zhe Song, Timothy M. Hospedales
In this paper, we build on this strong baseline by designing an episodic training procedure that trains a single deep network in a way that exposes it to the domain shift that characterises a novel domain at runtime.
Ranked #80 on Domain Generalization on PACS
no code implementations • ECCV 2018 • Ke Li, Kaiyue Pang, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Honggang Zhang
In this work we aim to develop a universal sketch grouper.
1 code implementation • 7 Aug 2018 • Ke Li, Kaiyue Pang, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Honggang Zhang
In this work we aim to develop a universal sketch grouper.
2 code implementations • ECCV 2018 • Changqing Zou, Qian Yu, Ruofei Du, Haoran Mo, Yi-Zhe Song, Tao Xiang, Chengying Gao, Baoquan Chen, Hao Zhang
We contribute the first large-scale dataset of scene sketches, SketchyScene, with the goal of advancing research on sketch understanding at both the object and scene level.
no code implementations • ECCV 2018 • Kaiyue Pang, Da Li, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales
Instead there is a fundamental process of abstraction and iconic rendering, where overall geometry is warped and salient details are selectively included.
no code implementations • CVPR 2018 • Jifei Song, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy Hospedales
In this paper, we present a novel approach for translating an object photo to a sketch, mimicking the human sketching process.
no code implementations • CVPR 2018 • Conghui Hu, Da Li, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales
Contemporary deep learning techniques have made image recognition a reasonably reliable technology.
no code implementations • CVPR 2018 • Umar Riaz Muhammad, Yongxin Yang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales
Human free-hand sketches have been studied in various contexts including sketch recognition, synthesis and fine-grained sketch-based image retrieval (FG-SBIR).
1 code implementation • CVPR 2018 • Peng Xu, Yongye Huang, Tongtong Yuan, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo
Key to our network design is the embedding of unique characteristics of human sketch, where (i) a two-branch CNN-RNN architecture is adapted to explore the temporal ordering of strokes, and (ii) a novel hashing loss is specifically designed to accommodate both the temporal and abstract traits of sketches.
no code implementations • 22 Nov 2017 • Qian Yu, Xiaobin Chang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales
Many vision problems require matching images of object instances across different domains.
5 code implementations • 10 Oct 2017 • Da Li, Yongxin Yang, Yi-Zhe Song, Timothy M. Hospedales
We propose a novel meta-learning method for domain generalization.
Ranked #118 on Domain Generalization on PACS
6 code implementations • ICCV 2017 • Da Li, Yongxin Yang, Yi-Zhe Song, Timothy M. Hospedales
In this paper, we make two main contributions: Firstly, we build upon the favorable domain shift-robust properties of deep learning methods, and develop a low-rank parameterized CNN model for end-to-end DG learning.
Ranked #121 on Domain Generalization on PACS
no code implementations • ICCV 2017 • Jifei Song, Qian Yu, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales
Human sketches are unique in being able to capture both the spatial topology of a visual object, as well as its subtle appearance details.
Ranked #2 on Sketch-Based Image Retrieval on Handbags
no code implementations • 28 May 2017 • Peng Xu, Qiyue Yin, Yongye Huang, Yi-Zhe Song, Zhanyu Ma, Liang Wang, Tao Xiang, W. Bastiaan Kleijn, Jun Guo
Sketch-based image retrieval (SBIR) is challenging due to the inherent domain-gap between sketch and photo.
Ranked #5 on Sketch-Based Image Retrieval on Chairs
no code implementations • CVPR 2016 • Qian Yu, Feng Liu, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Chen-Change Loy
We investigate the problem of fine-grained sketch-based image retrieval (SBIR), where free-hand human sketches are used as queries to perform instance-level retrieval of images.
Ranked #3 on Sketch-Based Image Retrieval on Chairs
no code implementations • CVPR 2016 • Shuxin Ouyang, Timothy M. Hospedales, Yi-Zhe Song, Xueming Li
Based on this database we build a model to reverse the forgetting process.
no code implementations • 9 Oct 2015 • Yi Li, Yi-Zhe Song, Timothy Hospedales, Shaogang Gong
We present a generative model which can automatically summarize the stroke composition of free-hand sketches of a given category.
no code implementations • CVPR 2015 • Yonggang Qi, Yi-Zhe Song, Tao Xiang, Honggang Zhang, Timothy Hospedales, Yi Li, Jun Guo
We propose a perceptual grouping framework that organizes image edges into meaningful structures and demonstrate its usefulness on various computer vision tasks.
2 code implementations • 30 Jan 2015 • Qian Yu, Yongxin Yang, Yi-Zhe Song, Tao Xiang, Timothy Hospedales
We propose a multi-scale multi-channel deep neural network framework that, for the first time, yields sketch recognition performance surpassing that of humans.
no code implementations • 17 Sep 2014 • Shuxin Ouyang, Timothy Hospedales, Yi-Zhe Song, Xueming Li
Heterogeneous face recognition (HFR) refers to matching face imagery across different domains.