2 dataset results for Information Retrieval AND Japanese

BoostCLIR is a bilingual (Japanese-English) corpus of patent abstracts extracted from the MAREC patent data and the NTCIR PatentMT workshop collections, accompanied by relevance judgements for the task of patent prior-art search.

2 PAPERS • NO BENCHMARKS YET

MetaCLIR

This dataset adds textual meta-information to two existing corpora for cross-language information retrieval: BoostCLIR and the Large Scale CLIR Dataset (wiki-clir).

1 PAPER • NO BENCHMARKS YET
