Search Results for author: Denis Kuznedelev

Found 12 papers, 6 papers with code

Does Diffusion Beat GAN in Image Super Resolution?

no code implementations • 27 May 2024 • Denis Kuznedelev, Valerii Startsev, Daniil Shlenskii, Sergey Kastryulin

There is a prevalent opinion in the recent literature that Diffusion-based models outperform GAN-based counterparts on the Image Super Resolution (ISR) problem.

PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression

1 code implementation23 May 2024 Vladimir Malinovskii, Denis Mazur, Ivan Ilin, Denis Kuznedelev, Konstantin Burlachenko, Kai Yi, Dan Alistarh, Peter Richtarik

In this work, we question the use of STE for extreme LLM compression, showing that it can be sub-optimal, and perform a systematic study of quantization-aware fine-tuning strategies for LLMs.

Quantization
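
The STE referenced above is the standard trick of quantizing weights in the forward pass while treating the rounding step as the identity during backpropagation. Below is a minimal, generic PyTorch sketch of STE-based weight quantization, an illustration of the concept only and not the PV-Tuning method; the 4-bit uniform scheme and per-tensor scale are illustrative assumptions.

```python
# Generic straight-through estimator (STE) for weight quantization.
# Illustrative sketch only -- not the PV-Tuning algorithm.
import torch

def quantize_ste(w: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Uniformly quantize w; gradients pass through the rounding unchanged."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax                      # per-tensor scale (assumed)
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    # Straight-through trick: forward uses w_q, backward sees the identity.
    return w + (w_q - w).detach()

# Usage: quantize on the fly inside a training step.
weight = torch.randn(16, 16, requires_grad=True)
x = torch.randn(8, 16)
loss = (x @ quantize_ste(weight)).pow(2).mean()
loss.backward()                                       # weight.grad is defined
```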

Extreme Compression of Large Language Models via Additive Quantization

1 code implementation • 11 Jan 2024 • Vage Egiazarian, Andrei Panferov, Denis Kuznedelev, Elias Frantar, Artem Babenko, Dan Alistarh

The emergence of accurate open large language models (LLMs) has led to a race towards quantization techniques that enable their execution on end-user devices.

Quantization
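
For intuition, additive quantization (the general family this work builds on) represents each group of weights as a sum of codewords, one from each of several learned codebooks, so only small integer codes need to be stored. The NumPy sketch below decodes such a representation; the codebook sizes are hypothetical and do not reflect the paper's actual configuration.

```python
# Generic additive-quantization decoding: each weight group is reconstructed
# as the sum of one codeword per codebook. Sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
M, K, d = 2, 256, 8                  # codebooks, codewords per book, group size
codebooks = rng.standard_normal((M, K, d)).astype(np.float32)

num_groups = 4
codes = rng.integers(0, K, size=(num_groups, M))   # M one-byte indices per group

# Decode: sum the selected codeword from every codebook.
decoded = np.stack([
    sum(codebooks[m, codes[g, m]] for m in range(M))
    for g in range(num_groups)
])
print(decoded.shape)                 # (4, 8) reconstructed weight groups
```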

Sparse Fine-tuning for Inference Acceleration of Large Language Models

2 code implementations • 10 Oct 2023 • Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh

While the standard approach is to leverage sparsity for computational reduction, we observe that in the case of memory-bound LLMs, sparsity can also be leveraged for reducing memory bandwidth.

Quantization • Text Generation • +1
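
To make the memory-bandwidth point above concrete, a highly sparse weight matrix stored in a compressed format simply has fewer bytes to move from memory, which is what matters when inference is memory-bound. The sketch below compares dense and CSR storage for a hypothetical 90%-sparse layer; the sizes are illustrative and unrelated to the paper's models.

```python
# Compare bytes needed for a dense layer vs. its ~90%-sparse CSR version.
# Illustrative sizes only; fewer bytes to read => less memory bandwidth used.
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)
dense = rng.standard_normal((4096, 4096)).astype(np.float32)
dense[rng.random(dense.shape) < 0.9] = 0.0          # zero out ~90% of weights

sparse = csr_matrix(dense)
sparse_bytes = sparse.data.nbytes + sparse.indices.nbytes + sparse.indptr.nbytes
print(f"dense: {dense.nbytes / 1e6:.1f} MB, CSR: {sparse_bytes / 1e6:.1f} MB")
```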

Accurate Neural Network Pruning Requires Rethinking Sparse Optimization

no code implementations • 3 Aug 2023 • Denis Kuznedelev, Eldar Kurtic, Eugenia Iofinova, Elias Frantar, Alexandra Peste, Dan Alistarh

Obtaining versions of deep neural networks that are both highly accurate and highly sparse is one of the main challenges in the area of model compression, and several high-performance pruning techniques have been investigated by the community.

Model Compression • Network Pruning • +1

Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression

no code implementations • 25 Mar 2023 • Denis Kuznedelev, Soroush Tabesh, Kimia Noorbakhsh, Elias Frantar, Sara Beery, Eldar Kurtic, Dan Alistarh

To address this, we ask: can we quickly compress large generalist models into accurate and efficient specialists?

A critical look at the evaluation of GNNs under heterophily: Are we really making progress?

2 code implementations • 22 Feb 2023 • Oleg Platonov, Denis Kuznedelev, Michael Diskin, Artem Babenko, Liudmila Prokhorenkova

Graphs without this property (homophily, i.e., edges predominantly connecting nodes of the same class) are called heterophilous, and it is typically assumed that specialized methods are required to achieve strong performance on such graphs.

Graph Representation Learning • Node Classification

CAP: Correlation-Aware Pruning for Highly-Accurate Sparse Vision Models

no code implementations • NeurIPS 2023 • Denis Kuznedelev, Eldar Kurtic, Elias Frantar, Dan Alistarh

To further showcase CAP's accuracy and scalability, we use it to show for the first time that extremely accurate large vision models, trained via self-supervised techniques, can also be pruned to moderate sparsities with negligible accuracy loss.

Image Classification • Quantization
