The Abt-Buy dataset for entity resolution is derived from the online retailers Abt.com and Buy.com. It contains 1081 entities from Abt.com and 1092 entities from Buy.com, along with a gold standard (perfect mapping) of 1097 matching record pairs between the two data sources. The two sources share three attributes: product name, product description, and product price.
The dataset was initially published in the repository of the Database Group of the University of Leipzig: https://dbs.uni-leipzig.de/research/projects/object_matching/benchmark_datasets_for_entity_resolution
To make results reproducible and the performance of different matchers on the Abt-Buy matching task comparable, the dataset was split into fixed train, validation, and test sets. The fixed splits are provided in the CompERBench repository:
http://data.dws.informatik.uni-mannheim.de/benchmarkmatchingtasks/index.html
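Matchers on this task are typically scored by comparing their predicted record pairs against the gold standard. The following is a minimal sketch of such a comparison, not code from either repository: the function name `evaluate_matches` and the toy record IDs are hypothetical, and the files' actual column names and ID formats are not assumed here. Since the gold standard holds more pairs (1097) than either source has records, the mapping cannot be strictly one-to-one, so the sketch treats matches as a set of (Abt ID, Buy ID) pairs.

```python
from typing import Dict, Iterable, Tuple

# A match is an (abt_id, buy_id) pair. The IDs used below are made up.
Pair = Tuple[str, str]


def evaluate_matches(predicted: Iterable[Pair], gold: Iterable[Pair]) -> Dict[str, float]:
    """Compute pair-level precision, recall and F1 against a gold standard."""
    predicted_set = set(predicted)
    gold_set = set(gold)

    # True positives are predicted pairs that also appear in the gold standard.
    true_positives = len(predicted_set & gold_set)

    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


# Toy data (hypothetical IDs): record "a3" matches two Buy records.
toy_gold = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a3", "b4")]
toy_predicted = [("a1", "b1"), ("a2", "b2"), ("a4", "b9")]

scores = evaluate_matches(toy_predicted, toy_gold)
print(scores)
```

To keep results comparable across matchers, the gold pairs passed to such a function would be restricted to those in the fixed test split.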