ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset

saviola/rocov2-code 16 May 2024

The dataset is suitable for training image annotation models based on image-caption pairs, or for multi-label image classification using Unified Medical Language System (UMLS) concepts provided with each image.

Multi-Label Image Classification, Multi-Task Learning

0

Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models

zzwaang/melody-reduction-algo 16 May 2024

A cascaded diffusion model is trained to model the hierarchical language, where each level is conditioned on its upper levels.

Music Generation

2

Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation

v-manhlt3/m-ltm-audio-text-retrieval 16 May 2024

The Learning-to-match (LTM) framework proves to be an effective inverse optimal transport approach for learning the underlying ground metric between two sources of data, facilitating subsequent matching.

AudioCaps, Event Detection +4

3

CatCMA: Stochastic Optimization for Mixed-Category Problems

CyberAgentAILab/cmaes 16 May 2024

CatCMA updates the parameters of the joint probability distribution in the natural gradient direction.

Bayesian Optimization, Stochastic Optimization

318

Generative Unlearning for Any Identity

khu-agi/guide 16 May 2024

In generative identity unlearning, we target the following objectives: (i) preventing the generation of images with a certain identity, and (ii) preserving the overall quality of the generative model.

Machine Unlearning

10

Bilateral Event Mining and Complementary for Event Stream Super-Resolution

lqm26/bmcnet-esr 16 May 2024

In this paper, we propose a bilateral event mining and complementary network (BMCNet) to fully leverage the potential of each event and capture the shared information to complement each other simultaneously.

Object Recognition, Super-Resolution +1

4

Optimal Aggregation of Prediction Intervals under Unsupervised Domain Shift

jiaweige0416/pi-with-shift 16 May 2024

In this paper, we propose methodologies for aggregating prediction intervals to obtain one with minimal width and adequate coverage on the target domain under unsupervised domain shift, a setting in which we have labeled samples from a related source domain and unlabeled covariates from the target domain.

Prediction Intervals

0

Language-Oriented Semantic Latent Representation for Image Transmission

ispamm/img2img-sc 16 May 2024

In the new paradigm of semantic communication (SC), the focus is on delivering meanings behind bits by extracting semantic information from raw data.

1

PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning

jaychempan/pir-clip 16 May 2024

We propose PIR-CLIP, a domain-specific CLIP-based framework for remote sensing image-text retrieval, to address semantic noise in remote sensing vision-language representations and further improve open-domain retrieval performance.

Representation Learning, Retrieval +2

2

LFED: A Literary Fiction Evaluation Dataset for Large Language Models

tjunlp-lab/lfed 16 May 2024

The rapid evolution of large language models (LLMs) has ushered in the need for comprehensive assessments of their performance across various dimensions.

0