Blocking

104 papers with code • 5 benchmarks • 3 datasets

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)

Blocking is a crucial step in any entity resolution pipeline because comparing every pair of records across two data sources is infeasible at scale: the number of pairs grows with the product of the two source sizes. Blocking applies a computationally cheap method to generate a much smaller set of candidate record pairs, reducing the workload of the matcher. During matching, a more expensive pair-wise matcher then generates the final set of matching record pairs from these candidates.
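To make the candidate-generation step concrete, here is a minimal sketch of token blocking, one simple and cheap blocking method: two records become a candidate pair if they share at least one token. The function names, the whitespace tokenizer, and the toy records are assumptions for illustration and are not taken from any of the listed papers or benchmarks.

```python
from collections import defaultdict


def tokens(text):
    """Lowercase whitespace tokenizer (an illustrative assumption)."""
    return set(text.lower().split())


def token_blocking(source_a, source_b):
    """Return candidate (id_a, id_b) pairs that share at least one token."""
    # Inverted index: token -> ids of records in source_b containing it.
    index = defaultdict(set)
    for id_b, text in source_b.items():
        for tok in tokens(text):
            index[tok].add(id_b)

    # Probe the index with every token of every record in source_a.
    candidates = set()
    for id_a, text in source_a.items():
        for tok in tokens(text):
            for id_b in index.get(tok, ()):
                candidates.add((id_a, id_b))
    return candidates


# Hypothetical toy records, invented for this example.
source_a = {"a1": "Sony Bravia 40 inch TV", "a2": "Apple iPod nano 8GB"}
source_b = {
    "b1": "Sony 40in Bravia television",
    "b2": "Canon PowerShot camera",
    "b3": "iPod nano 8GB silver",
}

candidates = token_blocking(source_a, source_b)
all_pairs = len(source_a) * len(source_b)
print(f"{len(candidates)} candidate pairs instead of {all_pairs}")
```

Only the candidate pairs would then be passed to the more expensive matcher. Real systems typically also prune very frequent tokens, since a token that appears in most records generates nearly all pairs and removes the benefit of blocking.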

Survey on blocking:

Papadakis et al.: Blocking and Filtering Techniques for Entity Resolution: A Survey, 2020.

Benchmarks

These leaderboards are used to track progress in Blocking.

Dataset	Best Model
Abt-Buy	Sudowoodo
Amazon-Google	SC-Block
WDC Block - small	BM25
WDC Block - medium	SC-Block
WDC Block - large	SC-Block

Libraries

Use these libraries to find Blocking models and implementations.

faceonlive/ai-research

2 papers

294

ftramer/ad-versarial

2 papers

Datasets

Most implemented papers

SINet: Extreme Lightweight Portrait Segmentation Networks with Spatial Squeeze Modules and Information Blocking Decoder

clovaai/c3_sinet • 20 Nov 2019

To solve the first problem, we introduce the new extremely lightweight portrait segmentation model SINet, containing an information blocking decoder and spatial squeeze modules.

Neural Text Generation with Unlikelihood Training

facebookresearch/unlikelihood_training • ICLR 2020

Neural text generation is a key tool in natural language applications, but it is well known there are major problems at its core.

Compression Artifacts Reduction by a Deep Convolutional Network

ryanxingql/powerqe • ICCV 2015

Lossy compression introduces complex compression artifacts, particularly the blocking artifacts, ringing effects and blurring.

d-blink: Distributed End-to-End Bayesian Entity Resolution

ngmarchant/dblink • 13 Sep 2019

Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers.

Deep Convolution Networks for Compression Artifacts Reduction

vinayak19th/ARCNN-keras • 9 Aug 2016

Lossy compression introduces complex compression artifacts, particularly blocking artifacts, ringing effects and blurring.

Emergent Complexity via Multi-Agent Competition

openai/multiagent-competition • ICLR 2018

In this paper, we point out that a competitive multi-agent environment trained with self-play can produce behaviors that are far more complex than the environment itself.

Percival: Making In-Browser Perceptual Ad Blocking Practical With Deep Learning

dxaen/percival • 17 May 2019

In this paper we present Percival, a browser-embedded, lightweight, deep learning-powered ad blocker.

Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction

james-qiuhaoran/llm-serving-with-proxy-models • 12 Apr 2024

Large language models (LLMs) have been driving a new wave of interactive AI applications across numerous domains.

Using Explainable AI and Transfer Learning to understand and predict the maintenance of Atlantic blocking with limited observational data

hzhang-math/Blocking_SHAP_TL • 12 Apr 2024

This work demonstrates the potential for machine learning methods to extract meaningful precursors of extreme weather events and achieve better prediction using limited observational data.

Towards Universal Dense Blocking for Entity Resolution

tshu-w/uniblocker • 23 Apr 2024

Blocking is a critical step in entity resolution, and the emergence of neural network-based representation models has led to the development of dense blocking as a promising approach for exploring deep semantics in blocking.

