The dataset is suitable for training image annotation models based on image-caption pairs, or for multi-label image classification using Unified Medical Language System (UMLS) concepts provided with each image.
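As a concrete illustration of the multi-label setup, the sketch below turns per-image UMLS concept identifiers (CUIs) into multi-hot classification targets; the sample records, field names, and CUIs are hypothetical placeholders, not the dataset's actual schema.

```python
import numpy as np

# Hypothetical per-image annotations: a caption plus UMLS concept IDs (CUIs).
samples = [
    {"image": "img_0001.png", "caption": "Chest X-ray, frontal view.",
     "cuis": ["C0817096", "C1306645"]},
    {"image": "img_0002.png", "caption": "Axial CT of the abdomen.",
     "cuis": ["C0817096", "C0040405"]},
]

# Build a concept vocabulary and one multi-hot label vector per image.
vocab = sorted({cui for s in samples for cui in s["cuis"]})
cui_to_idx = {cui: i for i, cui in enumerate(vocab)}

labels = np.zeros((len(samples), len(vocab)), dtype=np.float32)
for row, s in enumerate(samples):
    for cui in s["cuis"]:
        labels[row, cui_to_idx[cui]] = 1.0

print(vocab)
print(labels)  # multi-hot rows, suitable for a sigmoid/BCE multi-label classifier
```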
A cascaded diffusion model is trained to model the hierarchical language, with each level conditioned on the levels above it.
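A minimal sketch of this cascaded conditioning pattern, assuming each level is produced by a sampler that receives all previously generated upper levels as context; the string-returning stages below stand in for the actual diffusion models.

```python
from typing import Callable, List

# Placeholder stage: in practice each stage would be a diffusion model that
# denoises its level conditioned on the outputs of the upper levels.
def make_stage(name: str) -> Callable[[List[str]], str]:
    def sample(upper_levels: List[str]) -> str:
        context = " | ".join(upper_levels) if upper_levels else "<none>"
        return f"{name}(conditioned on: {context})"
    return sample

stages = [make_stage("document"), make_stage("paragraph"), make_stage("sentence")]

outputs: List[str] = []
for stage in stages:
    outputs.append(stage(outputs))  # each level sees everything generated above it

for level in outputs:
    print(level)
```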
The Learning-to-match (LTM) framework proves to be an effective inverse optimal transport approach for learning the underlying ground metric between two sources of data, facilitating subsequent matching.
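The sketch below illustrates the general inverse-optimal-transport idea under simplifying assumptions: a parameterized ground metric ||Ax - Ay||^2 is fitted by gradient descent so that the entropic OT plan reproduces a toy observed correspondence. The Sinkhorn solver, the identity matching, and the linear map A are illustrative choices, not the LTM formulation itself.

```python
import torch

torch.manual_seed(0)

def sinkhorn(cost, eps=0.5, iters=100):
    """Entropic OT plan between uniform marginals for a given cost matrix."""
    n, m = cost.shape
    a = torch.full((n,), 1.0 / n)
    b = torch.full((m,), 1.0 / m)
    K = torch.exp(-cost / eps)
    u, v = torch.ones(n), torch.ones(m)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

# Toy data from two sources plus an "observed" matching (here: identity pairs).
x = torch.randn(6, 4)
y = x + 0.1 * torch.randn(6, 4)
observed = torch.eye(6) / 6

# Learn a ground metric c(x, y) = ||Ax - Ay||^2 so that the induced OT plan
# matches the observed correspondence (gradient descent through Sinkhorn).
A = torch.nn.Parameter(torch.eye(4))
opt = torch.optim.Adam([A], lr=0.05)

for step in range(200):
    cost = torch.cdist(x @ A.T, y @ A.T) ** 2
    cost = cost / (cost.mean() + 1e-8)  # keep the cost scale well-conditioned
    plan = sinkhorn(cost)
    kl = torch.sum(observed * (torch.log(observed + 1e-9) - torch.log(plan + 1e-9)))
    opt.zero_grad()
    kl.backward()
    opt.step()

print(float(kl))
```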
CatCMA updates the parameters of the joint probability distribution in the natural gradient direction.
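The following is a heavily simplified sketch of a natural-gradient update for a joint Gaussian-categorical search distribution: ranked samples yield weights, and both the Gaussian mean and the categorical probabilities move along the estimated natural gradient. The toy objective, equal weights, fixed step sizes, and the omission of covariance and step-size adaptation are simplifications rather than CatCMA's actual update rules.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, c):
    # Toy mixed-variable objective: continuous sphere plus a categorical penalty.
    return np.sum(x ** 2) + (0.0 if c == 2 else 1.0)

# Joint distribution: independent Gaussian (mean, fixed sigma) and categorical q.
mean, sigma = np.zeros(3), 1.0
q = np.full(4, 0.25)           # probabilities over 4 categories
lam, eta_m, eta_q = 16, 1.0, 0.1

for gen in range(50):
    xs = mean + sigma * rng.standard_normal((lam, 3))
    cs = rng.choice(4, size=lam, p=q)
    order = np.argsort([f(x, c) for x, c in zip(xs, cs)])
    mu = lam // 2
    w = np.zeros(lam)
    w[order[:mu]] = 1.0 / mu   # equal weights on the better half of the samples

    # Natural-gradient steps for the Gaussian mean and the categorical probabilities.
    mean += eta_m * np.sum(w[:, None] * (xs - mean), axis=0)
    one_hot = np.eye(4)[cs]
    q += eta_q * np.sum(w[:, None] * (one_hot - q), axis=0)
    q = np.clip(q, 1e-6, None)
    q /= q.sum()

print(mean, q)  # the mean shrinks toward 0 and q concentrates on category 2
```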
In generative identity unlearning, we target two objectives: (i) preventing the generation of images of a specific identity, and (ii) preserving the overall quality of the generative model.
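One way to realize these two objectives is a combined loss, sketched below with assumed placeholder modules: an unlearning term that pushes identity features of the edited generator away from the target identity embedding, and a preservation term that keeps outputs for unrelated latents close to a frozen copy of the original generator. The linear generator, identity_encoder, and the loss weighting are hypothetical stand-ins, not the paper's method.

```python
import torch
import torch.nn.functional as F

# Stand-ins for the generator being edited, its frozen copy, and an identity
# encoder; all three are hypothetical placeholders for illustration only.
generator = torch.nn.Linear(64, 128)
frozen_generator = torch.nn.Linear(64, 128)
frozen_generator.load_state_dict(generator.state_dict())
for p in frozen_generator.parameters():
    p.requires_grad_(False)
identity_encoder = torch.nn.Linear(128, 32)
target_identity = torch.randn(32)

opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
z_target = torch.randn(8, 64)   # latents that previously produced the identity
z_other = torch.randn(8, 64)    # latents unrelated to the identity

for step in range(100):
    # (i) Unlearning: push identity features of edited outputs away from the
    #     target identity embedding (minimize cosine similarity).
    feats = identity_encoder(generator(z_target))
    unlearn = F.cosine_similarity(feats, target_identity.expand_as(feats)).mean()

    # (ii) Preservation: keep outputs for other latents close to the frozen model.
    preserve = F.mse_loss(generator(z_other), frozen_generator(z_other))

    loss = unlearn + 10.0 * preserve  # the weighting here is an arbitrary choice
    opt.zero_grad()
    loss.backward()
    opt.step()
```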
In this paper, we propose a bilateral event mining and complementary network (BMCNet) that fully leverages the potential of each type of event while simultaneously capturing the shared information so that the two complement each other.
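A minimal two-branch sketch of this idea: each branch encodes one event stream, a fused representation carries the shared information, and each branch is complemented with it. The layer shapes and single-block structure are placeholders, not BMCNet's architecture.

```python
import torch
import torch.nn as nn

class TwoBranchComplement(nn.Module):
    """Toy two-branch block: per-branch encoders plus a shared-feature exchange."""
    def __init__(self, ch: int = 16):
        super().__init__()
        self.enc_pos = nn.Conv2d(1, ch, 3, padding=1)   # positive-polarity events
        self.enc_neg = nn.Conv2d(1, ch, 3, padding=1)   # negative-polarity events
        self.fuse = nn.Conv2d(2 * ch, ch, 1)            # shared information
        self.out_pos = nn.Conv2d(2 * ch, ch, 3, padding=1)
        self.out_neg = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, pos, neg):
        fp, fn_ = self.enc_pos(pos), self.enc_neg(neg)
        shared = self.fuse(torch.cat([fp, fn_], dim=1))
        # Each branch is complemented with the shared representation.
        return (self.out_pos(torch.cat([fp, shared], dim=1)),
                self.out_neg(torch.cat([fn_, shared], dim=1)))

pos = torch.randn(1, 1, 32, 32)
neg = torch.randn(1, 1, 32, 32)
out_p, out_n = TwoBranchComplement()(pos, neg)
print(out_p.shape, out_n.shape)
```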
In this paper, we propose methodologies for aggregating prediction intervals to obtain one with minimal width and adequate coverage on the target domain under unsupervised domain shift, where labeled samples are available from a related source domain and only unlabeled covariates from the target domain.
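The sketch below shows one simple instance of this setting under stated assumptions: candidate intervals are screened by an importance-weighted coverage estimate on the labeled source (the weights approximate the target/source density ratio), and among the adequately covering candidates the one with minimal average width on the target covariates is selected. The candidate intervals, the known density ratio, and the selection rule are illustrative, not the paper's methodology.

```python
import numpy as np

rng = np.random.default_rng(0)

# Labeled source data and unlabeled target covariates (toy 1-D example).
x_src = rng.normal(0.0, 1.0, 500)
y_src = x_src + rng.normal(0.0, 0.5, 500)
x_tgt = rng.normal(0.5, 1.0, 500)

# Hypothetical importance weights: target/source density ratio (shifted Gaussian).
def density_ratio(x):
    return np.exp(-(x - 0.5) ** 2 / 2) / np.exp(-x ** 2 / 2)

w = density_ratio(x_src)
w /= w.mean()

# Candidate prediction intervals (e.g., produced by different base procedures).
candidates = {
    "narrow": (lambda x: x - 0.8, lambda x: x + 0.8),
    "medium": (lambda x: x - 1.0, lambda x: x + 1.0),
    "wide":   (lambda x: x - 1.5, lambda x: x + 1.5),
}

alpha = 0.1
best_name, best_width = None, np.inf
for name, (lo, hi) in candidates.items():
    # Weighted source coverage approximates coverage on the target domain.
    covered = (y_src >= lo(x_src)) & (y_src <= hi(x_src))
    coverage = np.mean(w * covered)
    width = np.mean(hi(x_tgt) - lo(x_tgt))  # width evaluated on target covariates
    if coverage >= 1 - alpha and width < best_width:
        best_name, best_width = name, width

print(best_name, best_width)
```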
In the new paradigm of semantic communication (SC), the focus is on delivering the meaning behind the bits by extracting semantic information from raw data.
Continuing with the above, we propose PIR-CLIP, a domain-specific CLIP-based framework for remote sensing image-text retrieval, to address semantic noise in remote sensing vision-language representations and further improve open-domain retrieval performance.
The rapid evolution of large language models (LLMs) has ushered in the need for comprehensive assessments of their performance across various dimensions.