Search Results for author: Fan Yang

Found 302 papers, 111 papers with code

XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation

no code implementations • EMNLP 2020 • Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou

In this paper, we introduce XGLUE, a new benchmark dataset to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora, and evaluate their performance across a diverse set of cross-lingual tasks.

Natural Language Understanding XLM-R

Improving Relevance Quality in Product Search using High-Precision Query-Product Semantic Similarity

no code implementations • ECNLP (ACL) 2022 • Alireza Bagheri Garakani, Fan Yang, Wen-Yu Hua, Yetian Chen, Michinari Momma, Jingyuan Deng, Yan Gao, Yi Sun

Ensuring relevance quality in product search is a critical task as it impacts the customer’s ability to find intended products in the short-term as well as the general perception and trust of the e-commerce system in the long term.

Re-Ranking Semantic Similarity +1

Spelling Correction using Phonetics in E-commerce Search

no code implementations • ECNLP (ACL) 2022 • Fan Yang, Alireza Bagheri Garakani, Yifei Teng, Yan Gao, Jia Liu, Jingyuan Deng, Yi Sun

In E-commerce search, spelling correction plays an important role to find desired products for customers in processing user-typed search queries.

Spelling Correction

MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis

no code implementations • SemEval (NAACL) 2022 • Cong Chen, Jiansong Chen, Cao Liu, Fan Yang, Guanglu Wan, Jinxiong Xia

Furthermore, we leverage two data augmentation strategies and auxiliary tasks to improve the performance on few-label data and zero-shot cross-lingual settings.

Data Augmentation Language Modelling +1

DESED: Dialogue-based Explanation for Sentence-level Event Detection

1 code implementation • COLING 2022 • Yinyi Wei, Shuaipeng Liu, Jianwei Lv, Xiangyu Xi, Hailei Yan, Wei Ye, Tong Mo, Fan Yang, Guanglu Wan

Many recent sentence-level event detection efforts focus on enriching sentence semantics, e.g., via multi-task or prompt-based learning.

Dialogue Generation Event Detection +1

Beyond 3DMM Space: Towards Fine-grained 3D Face Reconstruction

1 code implementation • ECCV 2020 • Xiangyu Zhu, Fan Yang, Di Huang, Chang Yu, Hao Wang, Jianzhu Guo, Zhen Lei, Stan Z. Li

However, most of their training data is constructed by 3D Morphable Model, whose space spanned is only a small part of the shape space.

3D Face Reconstruction

104 stars

Domain-Lifelong Learning for Dialogue State Tracking via Knowledge Preservation Networks

1 code implementation • EMNLP 2021 • Qingbin Liu, Pengfei Cao, Cao Liu, Jiansong Chen, Xunliang Cai, Fan Yang, Shizhu He, Kang Liu, Jun Zhao

This paradigm is often impractical in real-world applications since online dialogue systems usually involve continually emerging new data and domains.

Dialogue State Tracking Knowledge Distillation +1

Improving Evidence Retrieval with Claim-Evidence Entailment

no code implementations • RANLP 2021 • Fan Yang, Eduard Dragut, Arjun Mukherjee

Claim verification is challenging because it requires first to find textual evidence and then apply claim-evidence entailment to verify a claim.

Claim Verification Retrieval +1

Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

no code implementations • 30 May 2024 • Chaofan Lin, Zhenhua Han, Chengruidong Zhang, Yuqing Yang, Fan Yang, Chen Chen, Lili Qiu

Public LLM services have to blindly optimize individual LLM requests, leading to sub-optimal end-to-end performance of LLM applications.

MindSemantix: Deciphering Brain Visual Experiences with a Brain-Language Model

no code implementations • 29 May 2024 • Ziqi Ren, Jie Li, Xuetong Xue, Xin Li, Fan Yang, Zhicheng Jiao, Xinbo Gao

MindSemantix generates high-quality captions that are deeply rooted in the visual and semantic information derived from brain activity.

Brain Decoding Language Modelling +1

Revisiting the Robust Generalization of Adversarial Prompt Tuning

no code implementations • 18 May 2024 • Fan Yang, Mingxuan Xia, Sangzhou Xia, Chicheng Ma, Hui Hui

Understanding the vulnerability of large-scale pre-trained vision-language models like CLIP against adversarial attacks is key to ensuring zero-shot generalization capacity on various downstream tasks.

Adversarial Robustness Zero-shot Generalization

MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels

1 code implementation • 13 May 2024 • Qi Chen, Xiubo Geng, Corby Rosset, Carolyn Buractaon, Jingwen Lu, Tao Shen, Kun Zhou, Chenyan Xiong, Yeyun Gong, Paul Bennett, Nick Craswell, Xing Xie, Fan Yang, Bryan Tower, Nikhil Rao, Anlei Dong, Wenqi Jiang, Zheng Liu, Mingqin Li, Chuanjie Liu, Zengzhong Li, Rangan Majumder, Jennifer Neville, Andy Oakley, Knut Magne Risvik, Harsha Vardhan Simhadri, Manik Varma, Yujing Wang, Linjun Yang, Mao Yang, Ce Zhang

Recent breakthroughs in large models have highlighted the critical significance of data scale, labels and modals.

Information Retrieval Retrieval

282 stars

Train Faster, Perform Better: Modular Adaptive Training in Over-Parameterized Models

no code implementations • NeurIPS 2023 • Yubin Shi, Yixuan Chen, Mingzhi Dong, Xiaochen Yang, Dongsheng Li, Yujiang Wang, Robert P. Dick, Qin Lv, Yingying Zhao, Fan Yang, Tun Lu, Ning Gu, Li Shang

To describe such modular-level learning capabilities, we introduce a novel concept dubbed modular neural tangent kernel (mNTK), and we demonstrate that the quality of a module's learning is tightly associated with its mNTK's principal eigenvalue $\lambda_{\max}$.

Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning

no code implementations • 5 May 2024 • Tianchen Zhou, FNU Hairi, Haibo Yang, Jia Liu, Tian Tong, Fan Yang, Michinari Momma, Yan Gao

Reinforcement learning with multiple, potentially conflicting objectives is pervasive in real-world applications, while this problem remains theoretically under-explored.

Multi-Objective Reinforcement Learning reinforcement-learning

Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge

no code implementations • 1 May 2024 • Bin Xiao, Chunan Shi, Xiaonan Nie, Fan Yang, Xiangwei Deng, Lei Su, WeiPeng Chen, Bin Cui

Consequently, the GPU spends most of its time on memory transfer instead of computation.

Deep Multi-View Channel-Wise Spatio-Temporal Network for Traffic Flow Prediction

no code implementations • 23 Apr 2024 • Hao Miao, Senzhang Wang, Meiyue Zhang, Diansheng Guo, Funing Sun, Fan Yang

In this paper, we study the novel problem of multi-channel traffic flow prediction, and propose a deep Multi-View Channel-wise Spatio-Temporal Network (MVC-STNet) model to effectively address it.

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations • 22 Apr 2024 • Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra, Xiyang Dai, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Victor Fragoso, Dan Iter, Mei Gao, Min Gao, Jianfeng Gao, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Yunsheng Li, Chen Liang, Lars Liden, Ce Liu, Mengchen Liu, Weishung Liu, Eric Lin, Zeqi Lin, Chong Luo, Piyush Madan, Matt Mazzola, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Swadheen Shukla, Xia Song, Masahiro Tanaka, Andrea Tupini, Xin Wang, Lijuan Wang, Chunyu Wang, Yu Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Haiping Wu, Michael Wyatt, Bin Xiao, Can Xu, Jiahang Xu, Weijian Xu, Sonali Yadav, Fan Yang, Jianwei Yang, ZiYi Yang, Yifan Yang, Donghan Yu, Lu Yuan, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Language Modelling

Quantifying Multilingual Performance of Large Language Models Across Languages

no code implementations • 17 Apr 2024 • Zihao Li, Yucheng Shi, Zirui Liu, Fan Yang, Ninghao Liu, Mengnan Du

However, currently there is no work to quantitatively measure the performance of LLMs in low-resource languages.

MobileNetV4 - Universal Models for the Mobile Ecosystem

3 code implementations • 16 Apr 2024 • Danfeng Qin, Chas Leichner, Manolis Delakis, Marco Fornoni, Shixin Luo, Fan Yang, Weijun Wang, Colby Banbury, Chengxi Ye, Berkin Akin, Vaibhav Aggarwal, Tenghui Zhu, Daniele Moro, Andrew Howard

We present the latest generation of MobileNets, known as MobileNetV4 (MNv4), featuring universally efficient architecture designs for mobile devices.

Neural Architecture Search

76,719 stars

Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?

1 code implementation • 10 Apr 2024 • Mingyu Jin, Qinkai Yu, Jingyuan Huang, Qingcheng Zeng, Zhenting Wang, Wenyue Hua, Haiyan Zhao, Kai Mei, Yanda Meng, Kaize Ding, Fan Yang, Mengnan Du, Yongfeng Zhang

In this paper, we explore the hypothesis that LLMs process concepts of varying complexities in different layers, introducing the idea of "Concept Depth" to suggest that more complex concepts are typically acquired in deeper layers.

Magic-Boost: Boost 3D Generation with Multi-View Conditioned Diffusion

no code implementations • 9 Apr 2024 • Fan Yang, Jianfeng Zhang, Yichun Shi, Bowen Chen, Chenxu Zhang, Huichao Zhang, Xiaofeng Yang, Jiashi Feng, Guosheng Lin

Benefiting from the rapid development of 2D diffusion models, 3D content creation has made significant progress recently.

3D Generation

Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought

no code implementations • 4 Apr 2024 • Jooyoung Lee, Fan Yang, Thanh Tran, Qian Hu, Emre Barut, Kai-Wei Chang, Chengwei Su

The frozen large LM is then prompted to predict a task output based on the rationale generated by the lightweight LM.

Extractive Question-Answering Knowledge Distillation +3

Position-Aware Parameter Efficient Fine-Tuning Approach for Reducing Positional Bias in LLMs

no code implementations • 1 Apr 2024 • Zheng Zhang, Fan Yang, Ziyan Jiang, Zheng Chen, Zhengyang Zhao, Chengyuan Ma, Liang Zhao, Yang Liu

Recent advances in large language models (LLMs) have enhanced their ability to process long input contexts.

Data Augmentation Position

DSFNet: Learning Disentangled Scenario Factorization for Multi-Scenario Route Ranking

no code implementations • 30 Mar 2024 • Jiahao Yu, Yihai Duan, Longfei Xu, Chao Chen, Shuliang Liu, Li Chen, Kaikui Liu, Fan Yang, Ning Guo

Multi-scenario route ranking (MSRR) is crucial in many industrial mapping systems.

Disentanglement

Self-learning Canonical Space for Multi-view 3D Human Pose Estimation

no code implementations • 19 Mar 2024 • Xiaoben Li, Mancheng Meng, Ziyan Wu, Terrence Chen, Fan Yang, Dinggang Shen

To facilitate the aggregation of the intra- and inter-view, we define a canonical parameter space, depicted by per-view camera pose and human pose and shape parameters ($\theta$ and $\beta$) of SMPL model, and propose a two-stage learning procedure.

3D Human Pose Estimation Self-Learning

Human Mesh Recovery from Arbitrary Multi-view Images

no code implementations • 19 Mar 2024 • Xiaoben Li, Mancheng Meng, Ziyan Wu, Terrence Chen, Fan Yang, Dinggang Shen

Human mesh recovery from arbitrary multi-view images involves two characteristics: the arbitrary camera poses and arbitrary number of camera views.

Decoder Human Mesh Recovery +1

Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior

no code implementations • 14 Mar 2024 • Cheng Chen, Xiaofeng Yang, Fan Yang, Chengzeng Feng, Zhoujie Fu, Chuan-Sheng Foo, Guosheng Lin, Fayao Liu

In this paper, we present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.

3D Generation Text to 3D

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

1 code implementation • 14 Mar 2024 • Yufei Zhan, Yousong Zhu, Hongyin Zhao, Fan Yang, Ming Tang, Jinqiao Wang

Large Vision Language Models have achieved fine-grained object perception, but the limitation of image resolution remains a significant obstacle to surpass the performance of task-specific experts in complex and dense scenarios.

Object Object Counting +3

Usable XAI: 10 Strategies Towards Exploiting Explainability in the LLM Era

1 code implementation • 13 Mar 2024 • Xuansheng Wu, Haiyan Zhao, Yaochen Zhu, Yucheng Shi, Fan Yang, Tianming Liu, Xiaoming Zhai, Wenlin Yao, Jundong Li, Mengnan Du, Ninghao Liu

Therefore, in this paper, we introduce Usable XAI in the context of LLMs by analyzing (1) how XAI can benefit LLMs and AI systems, and (2) how LLMs can contribute to the advancement of XAI.

Automating Catheterization Labs with Real-Time Perception

no code implementations • 9 Mar 2024 • Fan Yang, Benjamin Planche, Meng Zheng, Cheng Chen, Terrence Chen, Ziyan Wu

For decades, three-dimensional C-arm Cone-Beam Computed Tomography (CBCT) imaging system has been a critical component for complex vascular and nonvascular interventional procedures.

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

no code implementations • 8 Mar 2024 • Gemini Team, Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-Baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, Ioannis Antonoglou, Rohan Anil, Sebastian Borgeaud, Andrew Dai, Katie Millican, Ethan Dyer, Mia Glaese, Thibault Sottiaux, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, James Molloy, Jilin Chen, Michael Isard, Paul Barham, Tom Hennigan, Ross Mcilroy, Melvin Johnson, Johan Schalkwyk, Eli Collins, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Clemens Meyer, Gregory Thornton, Zhen Yang, Henryk Michalewski, Zaheer Abbas, Nathan Schucher, Ankesh Anand, Richard Ives, James Keeling, Karel Lenc, Salem Haykal, Siamak Shakeri, Pranav Shyam, Aakanksha Chowdhery, Roman Ring, Stephen Spencer, Eren Sezener, Luke Vilnis, Oscar Chang, Nobuyuki Morioka, George Tucker, Ce Zheng, Oliver Woodman, Nithya Attaluri, Tomas Kocisky, Evgenii Eltyshev, Xi Chen, Timothy Chung, Vittorio Selo, Siddhartha Brahma, Petko Georgiev, Ambrose Slone, Zhenkai Zhu, James Lottes, Siyuan Qiao, Ben Caine, Sebastian Riedel, Alex Tomala, Martin Chadwick, Juliette Love, Peter Choy, Sid Mittal, Neil Houlsby, Yunhao Tang, Matthew Lamm, Libin Bai, Qiao Zhang, Luheng He, Yong Cheng, Peter Humphreys, Yujia Li, Sergey Brin, Albin Cassirer, Yingjie Miao, Lukas Zilka, Taylor Tobin, Kelvin Xu, Lev Proleev, Daniel Sohn, Alberto Magni, Lisa Anne Hendricks, Isabel Gao, Santiago Ontanon, Oskar Bunyan, Nathan Byrd, Abhanshu Sharma, Biao Zhang, Mario Pinto, Rishika Sinha, Harsh Mehta, Dawei Jia, Sergi Caelles, Albert Webson, Alex Morris, Becca Roelofs, Yifan Ding, Robin Strudel, Xuehan Xiong, Marvin Ritter, Mostafa Dehghani, Rahma Chaabouni, Abhijit Karmarkar, Guangda Lai, Fabian Mentzer, Bibo Xu, Yaguang Li, Yujing Zhang, Tom Le Paine, Alex Goldin, Behnam Neyshabur, Kate Baumli, Anselm Levskaya, Michael Laskin, Wenhao Jia, Jack W. Rae, Kefan Xiao, Antoine He, Skye Giordano, Lakshman Yagati, Jean-Baptiste Lespiau, Paul Natsev, Sanjay Ganapathy, Fangyu Liu, Danilo Martins, Nanxin Chen, Yunhan Xu, Megan Barnes, Rhys May, Arpi Vezer, Junhyuk Oh, Ken Franko, Sophie Bridgers, Ruizhe Zhao, Boxi Wu, Basil Mustafa, Sean Sechrist, Emilio Parisotto, Thanumalayan Sankaranarayana Pillai, Chris Larkin, Chenjie Gu, Christina Sorokin, Maxim Krikun, Alexey Guseynov, Jessica Landon, Romina Datta, Alexander Pritzel, Phoebe Thacker, Fan Yang, Kevin Hui, Anja Hauth, Chih-Kuan Yeh, David Barker, Justin Mao-Jones, Sophia Austin, Hannah Sheahan, Parker Schuh, James Svensson, Rohan Jain, Vinay Ramasesh, Anton Briukhov, Da-Woon Chung, Tamara von Glehn, Christina Butterfield, Priya Jhakra, Matthew Wiethoff, Justin Frye, Jordan Grimstad, Beer Changpinyo, Charline Le Lan, Anna Bortsova, Yonghui Wu, Paul Voigtlaender, Tara Sainath, Shane Gu, Charlotte Smith, Will Hawkins, Kris Cao, James Besley, Srivatsan Srinivasan, Mark Omernick, Colin Gaffney, Gabriela Surita, Ryan Burnell, Bogdan Damoc, Junwhan Ahn, Andrew Brock, Mantas Pajarskas, Anastasia Petrushkina, Seb Noury, Lorenzo Blanco, Kevin Swersky, Arun Ahuja, Thi Avrahami, Vedant Misra, Raoul de Liedekerke, Mariko Iinuma, Alex Polozov, Sarah York, George van den Driessche, Paul Michel, Justin Chiu, Rory Blevins, Zach Gleicher, Adrià Recasens, Alban Rrustemi, Elena Gribovskaya, Aurko Roy, Wiktor Gworek, Sébastien M. R. Arnold, Lisa Lee, James Lee-Thorp, Marcello Maggioni, Enrique Piqueras, Kartikeya Badola, Sharad Vikram, Lucas Gonzalez, Anirudh Baddepudi, Evan Senter, Jacob Devlin, James Qin, Michael Azzam, Maja Trebacz, Martin Polacek, Kashyap Krishnakumar, Shuo-Yiin Chang, Matthew Tung, Ivo Penchev, Rishabh Joshi, Kate Olszewska, Carrie Muir, Mateo Wirth, Ale Jakse Hartman, Josh Newlan, Sheleem Kashem, Vijay Bolina, Elahe Dabir, Joost van Amersfoort, Zafarali Ahmed, James Cobon-Kerr, Aishwarya Kamath, Arnar Mar Hrafnkelsson, Le Hou, Ian Mackinnon, Alexandre Frechette, Eric Noland, Xiance Si, Emanuel Taropa, Dong Li, Phil Crone, Anmol Gulati, Sébastien Cevey, Jonas Adler, Ada Ma, David Silver, Simon Tokumine, Richard Powell, Stephan Lee, Kiran Vodrahalli, Samer Hassan, Diana Mincu, Antoine Yang, Nir Levine, Jenny Brennan, Mingqiu Wang, Sarah Hodkinson, Jeffrey Zhao, Josh Lipschultz, Aedan Pope, Michael B. Chang, Cheng Li, Laurent El Shafey, Michela Paganini, Sholto Douglas, Bernd Bohnet, Fabio Pardo, Seth Odoom, Mihaela Rosca, Cicero Nogueira dos Santos, Kedar Soparkar, Arthur Guez, Tom Hudson, Steven Hansen, Chulayuth Asawaroengchai, Ravi Addanki, Tianhe Yu, Wojciech Stokowiec, Mina Khan, Justin Gilmer, Jaehoon Lee, Carrie Grimes Bostock, Keran Rong, Jonathan Caton, Pedram Pejman, Filip Pavetic, Geoff Brown, Vivek Sharma, Mario Lučić, Rajkumar Samuel, Josip Djolonga, Amol Mandhane, Lars Lowe Sjösund, Elena Buchatskaya, Elspeth White, Natalie Clay, Jiepu Jiang, Hyeontaek Lim, Ross Hemsley, Zeyncep Cankara, Jane Labanowski, Nicola De Cao, David Steiner, Sayed Hadi Hashemi, Jacob Austin, Anita Gergely, Tim Blyth, Joe Stanton, Kaushik Shivakumar, Aditya Siddhant, Anders Andreassen, Carlos Araya, Nikhil Sethi, Rakesh Shivanna, Steven Hand, Ankur Bapna, Ali Khodaei, Antoine Miech, Garrett Tanzer, Andy Swing, Shantanu Thakoor, Lora Aroyo, Zhufeng Pan, Zachary Nado, Jakub Sygnowski, Stephanie Winkler, Dian Yu, Mohammad Saleh, Loren Maggiore, Yamini Bansal, Xavier Garcia, Mehran Kazemi, Piyush Patil, Ishita Dasgupta, Iain Barr, Minh Giang, Thais Kagohara, Ivo Danihelka, Amit Marathe, Vladimir Feinberg, Mohamed Elhawaty, Nimesh Ghelani, Dan Horgan, Helen Miller, Lexi Walker, Richard Tanburn, Mukarram Tariq, Disha Shrivastava, Fei Xia, Qingze Wang, Chung-Cheng Chiu, Zoe Ashwood, Khuslen Baatarsukh, Sina Samangooei, Raphaël Lopez Kaufman, Fred Alcober, Axel Stjerngren, Paul Komarek, Katerina Tsihlas, Anudhyan Boral, Ramona Comanescu, Jeremy Chen, Ruibo Liu, Chris Welty, Dawn Bloxwich, Charlie Chen, Yanhua Sun, Fangxiaoyu Feng, Matthew Mauger, Xerxes Dotiwalla, Vincent Hellendoorn, Michael Sharman, Ivy Zheng, Krishna Haridasan, Gabe Barth-Maron, Craig Swanson, Dominika Rogozińska, Alek Andreev, Paul Kishan Rubenstein, Ruoxin Sang, Dan Hurt, Gamaleldin Elsayed, Renshen Wang, Dave Lacey, Anastasija Ilić, Yao Zhao, Adam Iwanicki, Alejandro Lince, Alexander Chen, Christina Lyu, Carl Lebsack, Jordan Griffith, Meenu Gaba, Paramjit Sandhu, Phil Chen, Anna Koop, Ravi Rajwar, Soheil Hassas Yeganeh, Solomon Chang, Rui Zhu, Soroush Radpour, Elnaz Davoodi, Ving Ian Lei, Yang Xu, Daniel Toyama, Constant Segal, Martin Wicke, Hanzhao Lin, Anna Bulanova, Adrià Puigdomènech Badia, Nemanja Rakićević, Pablo Sprechmann, Angelos Filos, Shaobo Hou, Víctor Campos, Nora Kassner, Devendra Sachan, Meire Fortunato, Chimezie Iwuanyanwu, Vitaly Nikolaev, Balaji Lakshminarayanan, Sadegh Jazayeri, Mani Varadarajan, Chetan Tekur, Doug Fritz, Misha Khalman, David Reitter, Kingshuk Dasgupta, Shourya Sarcar, Tina Ornduff, Javier Snaider, Fantine Huot, Johnson Jia, Rupert Kemp, Nejc Trdin, Anitha Vijayakumar, Lucy Kim, Christof Angermueller, Li Lao, Tianqi Liu, Haibin Zhang, David Engel, Somer Greene, Anaïs White, Jessica Austin, Lilly Taylor, Shereen Ashraf, Dangyi Liu, Maria Georgaki, Irene Cai, Yana Kulizhskaya, Sonam Goenka, Brennan Saeta, Ying Xu, Christian Frank, Dario de Cesare, Brona Robenek, Harry Richardson, Mahmoud Alnahlawi, Christopher Yew, Priya Ponnapalli, Marco Tagliasacchi, Alex Korchemniy, Yelin Kim, Dinghua Li, Bill Rosgen, Kyle Levin, Jeremy Wiesner, Praseem Banzal, Praveen Srinivasan, Hongkun Yu, Çağlar Ünlü, David Reid, Zora Tung, Daniel Finchelstein, Ravin Kumar, Andre Elisseeff, Jin Huang, Ming Zhang, Ricardo Aguilar, Mai Giménez, Jiawei Xia, Olivier Dousse, Willi Gierke, Damion Yates, Komal Jalan, Lu Li, Eri Latorre-Chimoto, Duc Dung Nguyen, Ken Durden, Praveen Kallakuri, Yaxin Liu, Matthew Johnson, Tomy Tsai, Alice Talbert, Jasmine Liu, Alexander Neitz, Chen Elkind, Marco Selvi, Mimi Jasarevic, Livio Baldini Soares, Albert Cui, Pidong Wang, Alek Wenjiao Wang, Xinyu Ye, Krystal Kallarackal, Lucia Loher, Hoi Lam, Josef Broder, Dan Holtmann-Rice, Nina Martin, Bramandia Ramadhana, Mrinal Shukla, Sujoy Basu, Abhi Mohan, Nick Fernando, Noah Fiedel, Kim Paterson, Hui Li, Ankush Garg, Jane Park, DongHyun Choi, Diane Wu, Sankalp Singh, Zhishuai Zhang, Amir Globerson, Lily Yu, John Carpenter, Félix de Chaumont Quitry, Carey Radebaugh, Chu-Cheng Lin, Alex Tudor, Prakash Shroff, Drew Garmon, Dayou Du, Neera Vats, Han Lu, Shariq Iqbal, Alex Yakubovich, Nilesh Tripuraneni, James Manyika, Haroon Qureshi, Nan Hua, Christel Ngani, Maria Abi Raad, Hannah Forbes, Jeff Stanway, Mukund Sundararajan, Victor Ungureanu, Colton Bishop, Yunjie Li, Balaji Venkatraman, Bo Li, Chloe Thornton, Salvatore Scellato, Nishesh Gupta, Yicheng Wang, Ian Tenney, Xihui Wu, Ashish Shenoy, Gabriel Carvajal, Diana Gage Wright, Ben Bariach, Zhuyun Xiao, Peter Hawkins, Sid Dalmia, Clement Farabet, Pedro Valenzuela, Quan Yuan, Ananth Agarwal, Mia Chen, Wooyeol Kim, Brice Hulse, Nandita Dukkipati, Adam Paszke, Andrew Bolt, Kiam Choo, Jennifer Beattie, Jennifer Prendki, Harsha Vashisht, Rebeca Santamaria-Fernandez, Luis C. Cobo, Jarek Wilkiewicz, David Madras, Ali Elqursh, Grant Uy, Kevin Ramirez, Matt Harvey, Tyler Liechty, Heiga Zen, Jeff Seibert, Clara Huiyi Hu, Andrey Khorlin, Maigo Le, Asaf Aharoni, Megan Li, Lily Wang, Sandeep Kumar, Norman Casagrande, Jay Hoover, Dalia El Badawy, David Soergel, Denis Vnukov, Matt Miecnikowski, Jiri Simsa, Praveen Kumar, Thibault Sellam, Daniel Vlasic, Samira Daruki, Nir Shabat, John Zhang, Guolong Su, Jiageng Zhang, Jeremiah Liu, Yi Sun, Evan Palmer, Alireza Ghaffarkhah, Xi Xiong, Victor Cotruta, Michael Fink, Lucas Dixon, Ashwin Sreevatsa, Adrian Goedeckemeyer, Alek Dimitriev, Mohsen Jafari, Remi Crocker, Nicholas FitzGerald, Aviral Kumar, Sanjay Ghemawat, Ivan Philips, Frederick Liu, Yannie Liang, Rachel Sterneck, Alena Repina, Marcus Wu, Laura Knight, Marin Georgiev, Hyo Lee, Harry Askham, Abhishek Chakladar, Annie Louis, Carl Crous, Hardie Cate, Dessie Petrova, Michael Quinn, Denese Owusu-Afriyie, Achintya Singhal, Nan Wei, Solomon Kim, Damien Vincent, Milad Nasr, Christopher A. Choquette-Choo, Reiko Tojo, Shawn Lu, Diego de Las Casas, Yuchung Cheng, Tolga Bolukbasi, Katherine Lee, Saaber Fatehi, Rajagopal Ananthanarayanan, Miteyan Patel, Charbel Kaed, Jing Li, Shreyas Rammohan Belle, Zhe Chen, Jaclyn Konzelmann, Siim Põder, Roopal Garg, Vinod Koverkathu, Adam Brown, Chris Dyer, Rosanne Liu, Azade Nova, Jun Xu, Alanna Walton, Alicia Parrish, Mark Epstein, Sara McCarthy, Slav Petrov, Demis Hassabis, Koray Kavukcuoglu, Jeffrey Dean, Oriol Vinyals

In this report, we present the latest model of the Gemini family, Gemini 1.5 Pro, a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

Ranked #7 on Visual Question Answering on MM-Vet

Code Generation Math Word Problem Solving +2

Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion

no code implementations • 5 Mar 2024 • Meng Zheng, Benjamin Planche, Xuan Gong, Fan Yang, Terrence Chen, Ziyan Wu

3D patient body modeling is critical to the success of automated patient positioning for smart medical scanning and operating rooms.

Keypoint Detection

Orthogonal Gradient Boosting for Simpler Additive Rule Ensembles

1 code implementation • 24 Feb 2024 • Fan Yang, Pierre Le Bodic, Michael Kamp, Mario Boley

Gradient boosting of prediction rules is an efficient approach to learn potentially interpretable yet accurate probabilistic models.

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

no code implementations • 21 Feb 2024 • Yiran Ding, Li Lyna Zhang, Chengruidong Zhang, Yuanyuan Xu, Ning Shang, Jiahang Xu, Fan Yang, Mao Yang

This is achieved by three key innovations: (i) we identify and exploit two forms of non-uniformities in positional interpolation through an efficient search, providing a better initialization for fine-tuning and enabling an 8x extension in non-fine-tuning scenarios; (ii) we introduce a progressive extension strategy that first fine-tunes a 256k length LLM and then conducts a second positional interpolation on the fine-tuned extended LLM to achieve a 2048k context window; (iii) we readjust LongRoPE on 8k length to recover the short context window performance.

LEMMA: Towards LVLM-Enhanced Multimodal Misinformation Detection with External Knowledge Augmentation

1 code implementation • 19 Feb 2024 • Keyang Xuan, Li Yi, Fan Yang, Ruochen Wu, Yi R. Fung, Heng Ji

In this paper, we first investigate the potential of LVLM on multimodal misinformation detection.

Language Modelling LEMMA +1

Towards Uncovering How Large Language Model Works: An Explainability Perspective

no code implementations • 16 Feb 2024 • Haiyan Zhao, Fan Yang, Bo Shen, Himabindu Lakkaraju, Mengnan Du

Large language models (LLMs) have led to breakthroughs in language tasks, yet the internal mechanisms that enable their remarkable generalization and reasoning abilities remain opaque.

Hallucination Language Modelling +3

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

no code implementations • 12 Feb 2024 • Mateusz Łajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud Joly, Álvaro Martín-Cortinas, Ammar Abbas, Adam Michalski, Alexis Moinet, Sri Karlapati, Ewa Muszyńska, Haohan Guo, Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman

Echoing the widely-reported "emergent abilities" of large language models when trained on increasing volume of data, we show that BASE TTS variants built with 10K+ hours and 500M+ parameters begin to demonstrate natural prosody on textually complex sentences.

Decoder Disentanglement

Understanding the Weakness of Large Language Model Agents within a Complex Android Environment

1 code implementation • 9 Feb 2024 • Mingzhe Xing, Rongkai Zhang, Hui Xue, Qi Chen, Fan Yang, Zhen Xiao

These challenges motivate AndroidArena, an environment and benchmark designed to evaluate LLM agents on a modern operating system.

Date Understanding Language Modelling +1

Large Language Models As Faithful Explainers

no code implementations • 7 Feb 2024 • Yu-Neng Chuang, Guanchu Wang, Chia-Yuan Chang, Ruixiang Tang, Fan Yang, Mengnan Du, Xuanting Cai, Xia Hu

In this work, we introduce a generative explanation framework, xLLM, to improve the faithfulness of the explanations provided in natural language formats for LLMs.

Decision Making

Seeing is not always believing: The Space of Harmless Perturbations

no code implementations • 3 Feb 2024 • Lu Chen, Shaofeng Li, Benhao Huang, Fan Yang, Zheng Li, Jie Li, Yuan Luo

However, in this work, we reveal the existence of a harmless perturbation space, in which perturbations drawn from this space, regardless of their magnitudes, leave the network output unchanged when applied to inputs.

Privacy Preserving

DARCS: Memory-Efficient Deep Compressed Sensing Reconstruction for Acceleration of 3D Whole-Heart Coronary MR Angiography

no code implementations • 1 Feb 2024 • Zhihao Xue, Fan Yang, Juan Gao, Zhuo Chen, Hao Peng, Chao Zou, Hang Jin, Chenxi Hu

While unrolling-based deep reconstruction methods have achieved state-of-the-art performance on 2D image reconstruction, their application to 3D reconstruction is hindered by the large amount of memory needed to train an unrolled network.

3D Reconstruction De-aliasing +2

TripleSurv: Triplet Time-adaptive Coordinate Loss for Survival Analysis

no code implementations • 5 Jan 2024 • Liwen Zhang, Lianzhen Zhong, Fan Yang, Di Dong, Hui Hui, Jie Tian

However, ranking loss only focuses on the ranking of survival time and does not consider the potential effect of samples for exact survival time values.

Survival Analysis

Disentangle Estimation of Causal Effects from Cross-Silo Data

no code implementations • 4 Jan 2024 • Yuxuan Liu, Haozhao Wang, Shuang Wang, Zhiming He, Wenchao Xu, Jialiang Zhu, Fan Yang

Estimating causal effects among different events is of great importance to critical fields such as drug development.

LETA: Learning Transferable Attribution for Generic Vision Explainer

no code implementations • 23 Dec 2023 • Guanchu Wang, Yu-Neng Chuang, Fan Yang, Mengnan Du, Chia-Yuan Chang, Shaochen Zhong, Zirui Liu, Zhaozhuo Xu, Kaixiong Zhou, Xuanting Cai, Xia Hu

To address this problem, we develop a pre-trained, DNN-based, generic explainer on large-scale image datasets, and leverage its transferability to explain various vision models for downstream tasks.

Paper
Add Code

Gemini: A Family of Highly Capable Multimodal Models

no code implementations • The Keyword 2023 • Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, Ryan Doherty, Eli Collins, Clemens Meyer, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Jack Krawczyk, Ed Chi, Heng-Tze Cheng, Eric Ni, Purvi Shah, Patrick Kane, Betty Chan, Manaal Faruqui, Aliaksei Severyn, Hanzhao Lin, Yaguang Li, Yong Cheng, Mahdis Mahdieh, Mia Chen, Pei Sun, Dustin Tran, Sumit Bagri, Balaji Lakshminarayanan, Jeremiah Liu, Andras Orban, Fabian Güra, Hao Zhou, Xinying Song, Aurelien Boffy, Harish Ganapathy, Steven Zheng, HyunJeong Choe, Ágoston Weisz, Tao Zhu, Yifeng Lu, Siddharth Gopal, Jarrod Kahn, Maciej Kula, Jeff Pitman, Rushin Shah, Emanuel Taropa, Majd Al Merey, Martin Baeuml, Zhifeng Chen, Laurent El Shafey, Yujing Zhang, Olcan Sercinoglu, George Tucker, Enrique Piqueras, Maxim Krikun, Iain Barr, Nikolay Savinov, Ivo Danihelka, Becca Roelofs, Anaïs White, Anders Andreassen, Tamara von Glehn, Lakshman Yagati, Mehran Kazemi, Lucas Gonzalez, Misha Khalman, Jakub Sygnowski, Alexandre Frechette, Charlotte Smith, Laura Culp, Lev Proleev, Yi Luan, Xi Chen, James Lottes, Nathan Schucher, Federico Lebron, Alban Rrustemi, Natalie Clay, Phil Crone, Tomas Kocisky, Jeffrey Zhao, Bartek Perz, Dian Yu, Heidi Howard, Adam Bloniarz, Jack W. 
Rae, Han Lu, Laurent SIfre, Marcello Maggioni, Fred Alcober, Dan Garrette, Megan Barnes, Shantanu Thakoor, Jacob Austin, Gabriel Barth-Maron, William Wong, Rishabh Joshi, Rahma Chaabouni, Deeni Fatiha, Arun Ahuja, Gaurav Singh Tomar, Evan Senter, Martin Chadwick, Ilya Kornakov, Nithya Attaluri, Iñaki Iturrate, Ruibo Liu, Yunxuan Li, Sarah Cogan, Jeremy Chen, Chao Jia, Chenjie Gu, Qiao Zhang, Jordan Grimstad, Ale Jakse Hartman, Xavier Garcia, Thanumalayan Sankaranarayana Pillai, Jacob Devlin, Michael Laskin, Diego de Las Casas, Dasha Valter, Connie Tao, Lorenzo Blanco, Adrià Puigdomènech Badia, David Reitter, Mianna Chen, Jenny Brennan, Clara Rivera, Sergey Brin, Shariq Iqbal, Gabriela Surita, Jane Labanowski, Abhi Rao, Stephanie Winkler, Emilio Parisotto, Yiming Gu, Kate Olszewska, Ravi Addanki, Antoine Miech, Annie Louis, Denis Teplyashin, Geoff Brown, Elliot Catt, Jan Balaguer, Jackie Xiang, Pidong Wang, Zoe Ashwood, Anton Briukhov, Albert Webson, Sanjay Ganapathy, Smit Sanghavi, Ajay Kannan, Ming-Wei Chang, Axel Stjerngren, Josip Djolonga, Yuting Sun, Ankur Bapna, Matthew Aitchison, Pedram Pejman, Henryk Michalewski, Tianhe Yu, Cindy Wang, Juliette Love, Junwhan Ahn, Dawn Bloxwich, Kehang Han, Peter Humphreys, Thibault Sellam, James Bradbury, Varun Godbole, Sina Samangooei, Bogdan Damoc, Alex Kaskasoli, Sébastien M. R. 
Arnold, Vijay Vasudevan, Shubham Agrawal, Jason Riesa, Dmitry Lepikhin, Richard Tanburn, Srivatsan Srinivasan, Hyeontaek Lim, Sarah Hodkinson, Pranav Shyam, Johan Ferret, Steven Hand, Ankush Garg, Tom Le Paine, Jian Li, Yujia Li, Minh Giang, Alexander Neitz, Zaheer Abbas, Sarah York, Machel Reid, Elizabeth Cole, Aakanksha Chowdhery, Dipanjan Das, Dominika Rogozińska, Vitaliy Nikolaev, Pablo Sprechmann, Zachary Nado, Lukas Zilka, Flavien Prost, Luheng He, Marianne Monteiro, Gaurav Mishra, Chris Welty, Josh Newlan, Dawei Jia, Miltiadis Allamanis, Clara Huiyi Hu, Raoul de Liedekerke, Justin Gilmer, Carl Saroufim, Shruti Rijhwani, Shaobo Hou, Disha Shrivastava, Anirudh Baddepudi, Alex Goldin, Adnan Ozturel, Albin Cassirer, Yunhan Xu, Daniel Sohn, Devendra Sachan, Reinald Kim Amplayo, Craig Swanson, Dessie Petrova, Shashi Narayan, Arthur Guez, Siddhartha Brahma, Jessica Landon, Miteyan Patel, Ruizhe Zhao, Kevin Villela, Luyu Wang, Wenhao Jia, Matthew Rahtz, Mai Giménez, Legg Yeung, James Keeling, Petko Georgiev, Diana Mincu, Boxi Wu, Salem Haykal, Rachel Saputro, Kiran Vodrahalli, James Qin, Zeynep Cankara, Abhanshu Sharma, Nick Fernando, Will Hawkins, Behnam Neyshabur, Solomon Kim, Adrian Hutter, Priyanka Agrawal, Alex Castro-Ros, George van den Driessche, Tao Wang, Shuo-Yiin Chang, Paul Komarek, Ross Mcilroy, Mario Lučić, Guodong Zhang, Wael Farhan, Michael Sharman, Paul Natsev, Paul Michel, Yamini Bansal, Siyuan Qiao, Kris Cao, Siamak Shakeri, Christina Butterfield, Justin Chung, Paul Kishan Rubenstein, Shivani Agrawal, Arthur Mensch, Kedar Soparkar, Karel Lenc, Timothy Chung, Aedan Pope, Loren Maggiore, Jackie Kay, Priya Jhakra, Shibo Wang, Joshua Maynez, Mary Phuong, Taylor Tobin, Andrea Tacchetti, Maja Trebacz, Kevin Robinson, Yash Katariya, Sebastian Riedel, Paige Bailey, Kefan Xiao, Nimesh Ghelani, Lora Aroyo, Ambrose Slone, Neil Houlsby, Xuehan Xiong, Zhen Yang, Elena Gribovskaya, Jonas Adler, Mateo Wirth, Lisa Lee, Music Li, Thais Kagohara, Jay Pavagadhi, 
Sophie Bridgers, Anna Bortsova, Sanjay Ghemawat, Zafarali Ahmed, Tianqi Liu, Richard Powell, Vijay Bolina, Mariko Iinuma, Polina Zablotskaia, James Besley, Da-Woon Chung, Timothy Dozat, Ramona Comanescu, Xiance Si, Jeremy Greer, Guolong Su, Martin Polacek, Raphaël Lopez Kaufman, Simon Tokumine, Hexiang Hu, Elena Buchatskaya, Yingjie Miao, Mohamed Elhawaty, Aditya Siddhant, Nenad Tomasev, Jinwei Xing, Christina Greer, Helen Miller, Shereen Ashraf, Aurko Roy, Zizhao Zhang, Ada Ma, Angelos Filos, Milos Besta, Rory Blevins, Ted Klimenko, Chih-Kuan Yeh, Soravit Changpinyo, Jiaqi Mu, Oscar Chang, Mantas Pajarskas, Carrie Muir, Vered Cohen, Charline Le Lan, Krishna Haridasan, Amit Marathe, Steven Hansen, Sholto Douglas, Rajkumar Samuel, Mingqiu Wang, Sophia Austin, Chang Lan, Jiepu Jiang, Justin Chiu, Jaime Alonso Lorenzo, Lars Lowe Sjösund, Sébastien Cevey, Zach Gleicher, Thi Avrahami, Anudhyan Boral, Hansa Srinivasan, Vittorio Selo, Rhys May, Konstantinos Aisopos, Léonard Hussenot, Livio Baldini Soares, Kate Baumli, Michael B. 
Chang, Adrià Recasens, Ben Caine, Alexander Pritzel, Filip Pavetic, Fabio Pardo, Anita Gergely, Justin Frye, Vinay Ramasesh, Dan Horgan, Kartikeya Badola, Nora Kassner, Subhrajit Roy, Ethan Dyer, Víctor Campos Campos, Alex Tomala, Yunhao Tang, Dalia El Badawy, Elspeth White, Basil Mustafa, Oran Lang, Abhishek Jindal, Sharad Vikram, Zhitao Gong, Sergi Caelles, Ross Hemsley, Gregory Thornton, Fangxiaoyu Feng, Wojciech Stokowiec, Ce Zheng, Phoebe Thacker, Çağlar Ünlü, Zhishuai Zhang, Mohammad Saleh, James Svensson, Max Bileschi, Piyush Patil, Ankesh Anand, Roman Ring, Katerina Tsihlas, Arpi Vezer, Marco Selvi, Toby Shevlane, Mikel Rodriguez, Tom Kwiatkowski, Samira Daruki, Keran Rong, Allan Dafoe, Nicholas FitzGerald, Keren Gu-Lemberg, Mina Khan, Lisa Anne Hendricks, Marie Pellat, Vladimir Feinberg, James Cobon-Kerr, Tara Sainath, Maribeth Rauh, Sayed Hadi Hashemi, Richard Ives, Yana Hasson, Eric Noland, Yuan Cao, Nathan Byrd, Le Hou, Qingze Wang, Thibault Sottiaux, Michela Paganini, Jean-Baptiste Lespiau, Alexandre Moufarek, Samer Hassan, Kaushik Shivakumar, Joost van Amersfoort, Amol Mandhane, Pratik Joshi, Anirudh Goyal, Matthew Tung, Andrew Brock, Hannah Sheahan, Vedant Misra, Cheng Li, Nemanja Rakićević, Mostafa Dehghani, Fangyu Liu, Sid Mittal, Junhyuk Oh, Seb Noury, Eren Sezener, Fantine Huot, Matthew Lamm, Nicola De Cao, Charlie Chen, Sidharth Mudgal, Romina Stella, Kevin Brooks, Gautam Vasudevan, Chenxi Liu, Mainak Chain, Nivedita Melinkeri, Aaron Cohen, Venus Wang, Kristie Seymore, Sergey Zubkov, Rahul Goel, Summer Yue, Sai Krishnakumaran, Brian Albert, Nate Hurley, Motoki Sano, Anhad Mohananey, Jonah Joughin, Egor Filonov, Tomasz Kępa, Yomna Eldawy, Jiawern Lim, Rahul Rishi, Shirin Badiezadegan, Taylor Bos, Jerry Chang, Sanil Jain, Sri Gayatri Sundara Padmanabhan, Subha Puttagunta, Kalpesh Krishna, Leslie Baker, Norbert Kalb, Vamsi Bedapudi, Shuntong Lei, Anthony Yu, Oren Litvin, Xiang Zhou, Zhichun Wu, Sam Sobell, Andrea Siciliano, Alan Papir, Robby Neale, 
Jonas Bragagnolo, Tej Toor, Tina Chen, Valentin Anklin, Feiran Wang, Richie Feng, Milad Gholami, Kevin Ling, Lijuan Liu, Jules Walter, Hamid Moghaddam, Arun Kishore, Jakub Adamek, Tyler Mercado, Jonathan Mallinson, Siddhinita Wandekar, Stephen Cagle, Eran Ofek, Guillermo Garrido, Clemens Lombriser, Maksim Mukha, Botu Sun, Hafeezul Rahman Mohammad, Josip Matak, Yadi Qian, Vikas Peswani, Pawel Janus, Quan Yuan, Leif Schelin, Oana David, Ankur Garg, Yifan He, Oleksii Duzhyi, Anton Älgmyr, Timothée Lottaz, Qi Li, Vikas Yadav, Luyao Xu, Alex Chinien, Rakesh Shivanna, Aleksandr Chuklin, Josie Li, Carrie Spadine, Travis Wolfe, Kareem Mohamed, Subhabrata Das, Zihang Dai, Kyle He, Daniel von Dincklage, Shyam Upadhyay, Akanksha Maurya, Luyan Chi, Sebastian Krause, Khalid Salama, Pam G Rabinovitch, Pavan Kumar Reddy M, Aarush Selvan, Mikhail Dektiarev, Golnaz Ghiasi, Erdem Guven, Himanshu Gupta, Boyi Liu, Deepak Sharma, Idan Heimlich Shtacher, Shachi Paul, Oscar Akerlund, François-Xavier Aubet, Terry Huang, Chen Zhu, Eric Zhu, Elico Teixeira, Matthew Fritze, Francesco Bertolini, Liana-Eleonora Marinescu, Martin Bölle, Dominik Paulus, Khyatti Gupta, Tejasi Latkar, Max Chang, Jason Sanders, Roopa Wilson, Xuewei Wu, Yi-Xuan Tan, Lam Nguyen Thiet, Tulsee Doshi, Sid Lall, Swaroop Mishra, Wanming Chen, Thang Luong, Seth Benjamin, Jasmine Lee, Ewa Andrejczuk, Dominik Rabiej, Vipul Ranjan, Krzysztof Styrc, Pengcheng Yin, Jon Simon, Malcolm Rose Harriott, Mudit Bansal, Alexei Robsky, Geoff Bacon, David Greene, Daniil Mirylenka, Chen Zhou, Obaid Sarvana, Abhimanyu Goyal, Samuel Andermatt, Patrick Siegler, Ben Horn, Assaf Israel, Francesco Pongetti, Chih-Wei "Louis" Chen, Marco Selvatici, Pedro Silva, Kathie Wang, Jackson Tolins, Kelvin Guu, Roey Yogev, Xiaochen Cai, Alessandro Agostini, Maulik Shah, Hung Nguyen, Noah Ó Donnaile, Sébastien Pereira, Linda Friso, Adam Stambler, Adam Kurzrok, Chenkai Kuang, Yan Romanikhin, Mark Geller, ZJ Yan, Kane Jang, Cheng-Chun Lee, Wojciech Fica, Eric 
Malmi, Qijun Tan, Dan Banica, Daniel Balle, Ryan Pham, Yanping Huang, Diana Avram, Hongzhi Shi, Jasjot Singh, Chris Hidey, Niharika Ahuja, Pranab Saxena, Dan Dooley, Srividya Pranavi Potharaju, Eileen O'Neill, Anand Gokulchandran, Ryan Foley, Kai Zhao, Mike Dusenberry, YuAn Liu, Pulkit Mehta, Ragha Kotikalapudi, Chalence Safranek-Shrader, Andrew Goodman, Joshua Kessinger, Eran Globen, Prateek Kolhar, Chris Gorgolewski, Ali Ibrahim, Yang song, Ali Eichenbaum, Thomas Brovelli, Sahitya Potluri, Preethi Lahoti, Cip Baetu, Ali Ghorbani, Charles Chen, Andy Crawford, Shalini Pal, Mukund Sridhar, Petru Gurita, Asier Mujika, Igor Petrovski, Pierre-Louis Cedoz, Chenmei Li, Shiyuan Chen, Niccolò Dal Santo, Siddharth Goyal, Jitesh Punjabi, Karthik Kappaganthu, Chester Kwak, Pallavi LV, Sarmishta Velury, Himadri Choudhury, Jamie Hall, Premal Shah, Ricardo Figueira, Matt Thomas, Minjie Lu, Ting Zhou, Chintu Kumar, Thomas Jurdi, Sharat Chikkerur, Yenai Ma, Adams Yu, Soo Kwak, Victor Ähdel, Sujeevan Rajayogam, Travis Choma, Fei Liu, Aditya Barua, Colin Ji, Ji Ho Park, Vincent Hellendoorn, Alex Bailey, Taylan Bilal, Huanjie Zhou, Mehrdad Khatir, Charles Sutton, Wojciech Rzadkowski, Fiona Macintosh, Konstantin Shagin, Paul Medina, Jinjing Zhou, Pararth Shah, Yingying Bi, Attila Dankovics, Shipra Banga, Sabine Lehmann, Marissa Bredesen, Zifan Lin, John Eric Hoffmann, Jonathan Lai, Raynald Chung, Kai Yang, Nihal Balani, Arthur Bražinskas, Andrei Sozanschi, Matthew Hayes, Héctor Fernández Alcalde, Peter Makarov, Will Chen, Antonio Stella, Liselotte Snijders, Michael Mandl, Ante Kärrman, Paweł Nowak, Xinyi Wu, Alex Dyck, Krishnan Vaidyanathan, Raghavender R, Jessica Mallet, Mitch Rudominer, Eric Johnston, Sushil Mittal, Akhil Udathu, Janara Christensen, Vishal Verma, Zach Irving, Andreas Santucci, Gamaleldin Elsayed, Elnaz Davoodi, Marin Georgiev, Ian Tenney, Geoffrey Cideron, Edouard Leurent, Mahmoud Alnahlawi, Ionut Georgescu, Nan Wei, Ivy Zheng, Dylan Scandinaro, Heinrich Jiang, 
Jasper Snoek, Mukund Sundararajan, Xuezhi Wang, Zack Ontiveros, Itay Karo, Jeremy Cole, Vinu Rajashekhar, Lara Tumeh, Eyal Ben-David, Rishub Jain, Jonathan Uesato, Romina Datta, Oskar Bunyan, Shimu Wu, John Zhang, Piotr Stanczyk, Ye Zhang, David Steiner, Subhajit Naskar, Michael Azzam, Matthew Johnson, Adam Paszke, Chung-Cheng Chiu, Jaume Sanchez Elias, Afroz Mohiuddin, Faizan Muhammad, Jin Miao, Andrew Lee, Nino Vieillard, Jane Park, Jiageng Zhang, Jeff Stanway, Drew Garmon, Abhijit Karmarkar, Zhe Dong, Jong Lee, Aviral Kumar, Luowei Zhou, Jonathan Evens, William Isaac, Geoffrey Irving, Edward Loper, Michael Fink, Isha Arkatkar, Nanxin Chen, Izhak Shafran, Ivan Petrychenko, Zhe Chen, Johnson Jia, Anselm Levskaya, Zhenkai Zhu, Peter Grabowski, Yu Mao, Alberto Magni, Kaisheng Yao, Javier Snaider, Norman Casagrande, Evan Palmer, Paul Suganthan, Alfonso Castaño, Irene Giannoumis, Wooyeol Kim, Mikołaj Rybiński, Ashwin Sreevatsa, Jennifer Prendki, David Soergel, Adrian Goedeckemeyer, Willi Gierke, Mohsen Jafari, Meenu Gaba, Jeremy Wiesner, Diana Gage Wright, Yawen Wei, Harsha Vashisht, Yana Kulizhskaya, Jay Hoover, Maigo Le, Lu Li, Chimezie Iwuanyanwu, Lu Liu, Kevin Ramirez, Andrey Khorlin, Albert Cui, Tian Lin, Marcus Wu, Ricardo Aguilar, Keith Pallo, Abhishek Chakladar, Ginger Perng, Elena Allica Abellan, Mingyang Zhang, Ishita Dasgupta, Nate Kushman, Ivo Penchev, Alena Repina, Xihui Wu, Tom van der Weide, Priya Ponnapalli, Caroline Kaplan, Jiri Simsa, Shuangfeng Li, Olivier Dousse, Jeff Piper, Nathan Ie, Rama Pasumarthi, Nathan Lintz, Anitha Vijayakumar, Daniel Andor, Pedro Valenzuela, Minnie Lui, Cosmin Paduraru, Daiyi Peng, Katherine Lee, Shuyuan Zhang, Somer Greene, Duc Dung Nguyen, Paula Kurylowicz, Cassidy Hardin, Lucas Dixon, Lili Janzer, Kiam Choo, Ziqiang Feng, Biao Zhang, Achintya Singhal, Dayou Du, Dan McKinnon, Natasha Antropova, Tolga Bolukbasi, Orgad Keller, David Reid, Daniel Finchelstein, Maria Abi Raad, Remi Crocker, Peter Hawkins, Robert Dadashi, 
Colin Gaffney, Ken Franko, Anna Bulanova, Rémi Leblond, Shirley Chung, Harry Askham, Luis C. Cobo, Kelvin Xu, Felix Fischer, Jun Xu, Christina Sorokin, Chris Alberti, Chu-Cheng Lin, Colin Evans, Alek Dimitriev, Hannah Forbes, Dylan Banarse, Zora Tung, Mark Omernick, Colton Bishop, Rachel Sterneck, Rohan Jain, Jiawei Xia, Ehsan Amid, Francesco Piccinno, Xingyu Wang, Praseem Banzal, Daniel J. Mankowitz, Alex Polozov, Victoria Krakovna, Sasha Brown, Mohammadhossein Bateni, Dennis Duan, Vlad Firoiu, Meghana Thotakuri, Tom Natan, Matthieu Geist, Ser tan Girgin, Hui Li, Jiayu Ye, Ofir Roval, Reiko Tojo, Michael Kwong, James Lee-Thorp, Christopher Yew, Danila Sinopalnikov, Sabela Ramos, John Mellor, Abhishek Sharma, Kathy Wu, David Miller, Nicolas Sonnerat, Denis Vnukov, Rory Greig, Jennifer Beattie, Emily Caveness, Libin Bai, Julian Eisenschlos, Alex Korchemniy, Tomy Tsai, Mimi Jasarevic, Weize Kong, Phuong Dao, Zeyu Zheng, Frederick Liu, Fan Yang, Rui Zhu, Tian Huey Teh, Jason Sanmiya, Evgeny Gladchenko, Nejc Trdin, Daniel Toyama, Evan Rosen, Sasan Tavakkol, Linting Xue, Chen Elkind, Oliver Woodman, John Carpenter, George Papamakarios, Rupert Kemp, Sushant Kafle, Tanya Grunina, Rishika Sinha, Alice Talbert, Diane Wu, Denese Owusu-Afriyie, Cosmo Du, Chloe Thornton, Jordi Pont-Tuset, Pradyumna Narayana, Jing Li, Saaber Fatehi, John Wieting, Omar Ajmeri, Benigno Uria, Yeongil Ko, Laura Knight, Amélie Héliou, Ning Niu, Shane Gu, Chenxi Pang, Yeqing Li, Nir Levine, Ariel Stolovich, Rebeca Santamaria-Fernandez, Sonam Goenka, Wenny Yustalim, Robin Strudel, Ali Elqursh, Charlie Deck, Hyo Lee, Zonglin Li, Kyle Levin, Raphael Hoffmann, Dan Holtmann-Rice, Olivier Bachem, Sho Arora, Christy Koh, Soheil Hassas Yeganeh, Siim Põder, Mukarram Tariq, Yanhua Sun, Lucian Ionita, Mojtaba Seyedhosseini, Pouya Tafti, Zhiyu Liu, Anmol Gulati, Jasmine Liu, Xinyu Ye, Bart Chrzaszcz, Lily Wang, Nikhil Sethi, Tianrun Li, Ben Brown, Shreya Singh, Wei Fan, Aaron Parisi, Joe Stanton, Vinod 
Koverkathu, Christopher A. Choquette-Choo, Yunjie Li, TJ Lu, Abe Ittycheriah, Prakash Shroff, Mani Varadarajan, Sanaz Bahargam, Rob Willoughby, David Gaddy, Guillaume Desjardins, Marco Cornero, Brona Robenek, Bhavishya Mittal, Ben Albrecht, Ashish Shenoy, Fedor Moiseev, Henrik Jacobsson, Alireza Ghaffarkhah, Morgane Rivière, Alanna Walton, Clément Crepy, Alicia Parrish, Zongwei Zhou, Clement Farabet, Carey Radebaugh, Praveen Srinivasan, Claudia van der Salm, Andreas Fidjeland, Salvatore Scellato, Eri Latorre-Chimoto, Hanna Klimczak-Plucińska, David Bridson, Dario de Cesare, Tom Hudson, Piermaria Mendolicchio, Lexi Walker, Alex Morris, Matthew Mauger, Alexey Guseynov, Alison Reid, Seth Odoom, Lucia Loher, Victor Cotruta, Madhavi Yenugula, Dominik Grewe, Anastasia Petrushkina, Tom Duerig, Antonio Sanchez, Steve Yadlowsky, Amy Shen, Amir Globerson, Lynette Webb, Sahil Dua, Dong Li, Surya Bhupatiraju, Dan Hurt, Haroon Qureshi, Ananth Agarwal, Tomer Shani, Matan Eyal, Anuj Khare, Shreyas Rammohan Belle, Lei Wang, Chetan Tekur, Mihir Sanjay Kale, Jinliang Wei, Ruoxin Sang, Brennan Saeta, Tyler Liechty, Yao Zhao, Stephan Lee, Pandu Nayak, Doug Fritz, Manish Reddy Vuyyuru, John Aslanides, Nidhi Vyas, Martin Wicke, Xiao Ma, Evgenii Eltyshev, Nina Martin, Hardie Cate, James Manyika, Keyvan Amiri, Yelin Kim, Xi Xiong, Kai Kang, Florian Luisier, Nilesh Tripuraneni, David Madras, Mandy Guo, Austin Waters, Oliver Wang, Joshua Ainslie, Jason Baldridge, Han Zhang, Garima Pruthi, Jakob Bauer, Feng Yang, Riham Mansour, Jason Gelman, Yang Xu, George Polovets, Ji Liu, Honglong Cai, Warren Chen, XiangHai Sheng, Emily Xue, Sherjil Ozair, Christof Angermueller, Xiaowei Li, Anoop Sinha, Weiren Wang, Julia Wiesinger, Emmanouil Koukoumidis, Yuan Tian, Anand Iyer, Madhu Gurumurthy, Mark Goldenson, Parashar Shah, MK Blake, Hongkun Yu, Anthony Urbanowicz, Jennimaria Palomaki, Chrisantha Fernando, Ken Durden, Harsh Mehta, Nikola Momchev, Elahe Rahimtoroghi, Maria Georgaki, Amit Raul, Sebastian 
Ruder, Morgan Redshaw, Jinhyuk Lee, Denny Zhou, Komal Jalan, Dinghua Li, Blake Hechtman, Parker Schuh, Milad Nasr, Kieran Milan, Vladimir Mikulik, Juliana Franco, Tim Green, Nam Nguyen, Joe Kelley, Aroma Mahendru, Andrea Hu, Joshua Howland, Ben Vargas, Jeffrey Hui, Kshitij Bansal, Vikram Rao, Rakesh Ghiya, Emma Wang, Ke Ye, Jean Michel Sarr, Melanie Moranski Preston, Madeleine Elish, Steve Li, Aakash Kaku, Jigar Gupta, Ice Pasupat, Da-Cheng Juan, Milan Someswar, Tejvi M., Xinyun Chen, Aida Amini, Alex Fabrikant, Eric Chu, Xuanyi Dong, Amruta Muthal, Senaka Buthpitiya, Sarthak Jauhari, Nan Hua, Urvashi Khandelwal, Ayal Hitron, Jie Ren, Larissa Rinaldi, Shahar Drath, Avigail Dabush, Nan-Jiang Jiang, Harshal Godhia, Uli Sachs, Anthony Chen, Yicheng Fan, Hagai Taitelbaum, Hila Noga, Zhuyun Dai, James Wang, Chen Liang, Jenny Hamer, Chun-Sung Ferng, Chenel Elkind, Aviel Atias, Paulina Lee, Vít Listík, Mathias Carlen, Jan van de Kerkhof, Marcin Pikus, Krunoslav Zaher, Paul Müller, Sasha Zykova, Richard Stefanec, Vitaly Gatsko, Christoph Hirnschall, Ashwin Sethi, Xingyu Federico Xu, Chetan Ahuja, Beth Tsai, Anca Stefanoiu, Bo Feng, Keshav Dhandhania, Manish Katyal, Akshay Gupta, Atharva Parulekar, Divya Pitta, Jing Zhao, Vivaan Bhatia, Yashodha Bhavnani, Omar Alhadlaq, Xiaolin Li, Peter Danenberg, Dennis Tu, Alex Pine, Vera Filippova, Abhipso Ghosh, Ben Limonchik, Bhargava Urala, Chaitanya Krishna Lanka, Derik Clive, Yi Sun, Edward Li, Hao Wu, Kevin Hongtongsak, Ianna Li, Kalind Thakkar, Kuanysh Omarov, Kushal Majmundar, Michael Alverson, Michael Kucharski, Mohak Patel, Mudit Jain, Maksim Zabelin, Paolo Pelagatti, Rohan Kohli, Saurabh Kumar, Joseph Kim, Swetha Sankar, Vineet Shah, Lakshmi Ramachandruni, Xiangkai Zeng, Ben Bariach, Laura Weidinger, Tu Vu, Amar Subramanya, Sissie Hsiao, Demis Hassabis, Koray Kavukcuoglu, Adam Sadovsky, Quoc Le, Trevor Strohman, Yonghui Wu, Slav Petrov, Jeffrey Dean, Oriol Vinyals

This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding.

Ranked #1 on Multi-task Language Understanding on MMLU

Arithmetic Reasoning Code Generation +3

Paper
Add Code

Implicit Modeling of Non-rigid Objects with Cross-Category Signals

no code implementations • 15 Dec 2023 • Yuchun Liu, Benjamin Planche, Meng Zheng, Zhongpai Gao, Pierre Sibut-Bourde, Fan Yang, Terrence Chen, Ziyan Wu

To effectively capture the interrelation between these entities and ensure precise, collision-free representations, our approach facilitates signaling between category-specific fields to adequately rectify shapes.

Object

Paper
Add Code

Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning

no code implementations • 14 Dec 2023 • Xijie Huang, Li Lyna Zhang, Kwang-Ting Cheng, Fan Yang, Mao Yang

In this work, we propose CoT-Influx, a novel approach that pushes the boundary of few-shot Chain-of-Thoughts (CoT) learning to improve LLM mathematical reasoning.

Ranked #106 on Arithmetic Reasoning on GSM8K

Arithmetic Reasoning Few-Shot Learning +3

Paper
Add Code

MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation

1 code implementation • 11 Dec 2023 • Abdullah Rashwan, Jiageng Zhang, Ali Taalimi, Fan Yang, Xingyi Zhou, Chaochao Yan, Liang-Chieh Chen, Yeqing Li

With ResNet50 backbone, our MaskConver achieves 53.6% PQ on the COCO panoptic val set, outperforming the modern convolution-based model, Panoptic FCN, by 9.3% as well as transformer-based models such as Mask2Former (+1.7% PQ) and kMaX-DeepLab (+0.6% PQ).

Ranked #8 on Panoptic Segmentation on COCO test-dev

Decoder Panoptic Segmentation

76,719

Paper
Code

AttriHuman-3D: Editable 3D Human Avatar Generation with Attribute Decomposition and Indexing

no code implementations • 3 Dec 2023 • Fan Yang, Tianyi Chen, Xiaosheng He, Zhongang Cai, Lei Yang, Si Wu, Guosheng Lin

We propose AttriHuman-3D, an editable 3D human generation model, which addresses the aforementioned problems with attribute decomposition and indexing.

Attribute Disentanglement

Paper
Add Code

Few-shot Image Generation via Style Adaptation and Content Preservation

no code implementations • 30 Nov 2023 • Xiaosheng He, Fan Yang, Fayao Liu, Guosheng Lin

Many works propose to fine-tune a pre-trained GAN model.

Image Generation Image Reconstruction +1

Paper
Add Code

Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search

no code implementations • 26 Nov 2023 • Zhiqi Lin, Youshan Miao, Guanbin Xu, Cheng Li, Olli Saarikivi, Saeed Maleki, Fan Yang

This paper presents Tessel, an automated system that searches for efficient schedules for distributed DNN training and inference for diverse operator placement strategies.

Paper
Add Code

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

1 code implementation • 24 Nov 2023 • Yufei Zhan, Yousong Zhu, Zhiyang Chen, Fan Yang, Ming Tang, Jinqiao Wang

More importantly, we present Griffon, a purely LVLM-based baseline, which does not require the introduction of any special tokens, expert models, or additional detection modules.

Referring Expression Referring Expression Comprehension

Paper
Code

A Safer Vision-based Autonomous Planning System for Quadrotor UAVs with Dynamic Obstacle Trajectory Prediction and Its Application with LLMs

no code implementations • 21 Nov 2023 • Jiageng Zhong, Ming Li, Yinliang Chen, Zihang Wei, Fan Yang, Haoran Shen

For intelligent quadcopter UAVs, a robust and reliable autonomous planning system is crucial.

object-detection Object Detection +3

Paper
Add Code

Robust and Communication-Efficient Federated Domain Adaptation via Random Features

1 code implementation • 8 Nov 2023 • Zhanbo Feng, Yuanjie Wang, Jie Li, Fan Yang, Jiong Lou, Tiebin Mi, Robert C. Qiu, Zhenyu Liao

As a result, there is a growing trend to leverage federated learning (FL) techniques to train large ML models in a distributed and collaborative manner.

Domain Adaptation Federated Learning

Paper
Code

Detecting Generated Images by Real Images Only

no code implementations • 2 Nov 2023 • Xiuli Bi, Bo Liu, Fan Yang, Bin Xiao, Weisheng Li, Gao Huang, Pamela C. Cosman

This paper approaches the generated image detection problem from a new perspective: Start from real images.

Paper
Add Code

A Multi-Modal Foundation Model to Assist People with Blindness and Low Vision in Environmental Interaction

no code implementations • 31 Oct 2023 • Yu Hao, Fan Yang, Hao Huang, Shuaihang Yuan, Sundeep Rangan, John-Ross Rizzo, Yao Wang, Yi Fang

By combining the prompt and input image, a large vision-language model (i.e., InstructBLIP) generates detailed and comprehensive descriptions of the environment and identifies potential risks in the environment by analyzing the environmental objects and scenes, relevant to the prompt.

Language Modelling Prompt Engineering +1

Paper
Add Code

Popularity, face and voice: Predicting and interpreting livestreamers' retail performance using machine learning techniques

no code implementations • 29 Oct 2023 • Xiong Xiong, Fan Yang, Li Su

Livestreaming commerce, a hybrid of e-commerce and self-media, has expanded the broad spectrum of traditional sales performance determinants.

Explainable artificial intelligence Feature Importance

Paper
Add Code

Student Classroom Behavior Detection based on Spatio-Temporal Network and Multi-Model Fusion

1 code implementation • 25 Oct 2023 • Fan Yang, Xiaofei Wang

To address this issue, we proposed a method for extending the spatio-temporal behavior dataset in Student Classroom Scenarios (SCB-ST-Dataset4) through an image dataset.

109

Paper
Code

BitNet: Scaling 1-bit Transformers for Large Language Models

2 code implementations • 17 Oct 2023 • Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Fan Yang, Ruiping Wang, Yi Wu, Furu Wei

The increasing size of large language models has posed challenges for deployment and raised concerns about environmental impact due to high energy consumption.

Language Modelling Quantization

243

Paper
Code

Reinforcement Learning in a Safety-Embedded MDP with Trajectory Optimization

no code implementations • 10 Oct 2023 • Fan Yang, Wenxuan Zhou, Zuxin Liu, Ding Zhao, David Held

This work introduces a novel approach that combines RL with trajectory optimization to manage this trade-off effectively.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

A Spatio-Temporal Attention-Based Method for Detecting Student Classroom Behaviors

no code implementations • 4 Oct 2023 • Fan Yang

However, low accuracy in student classroom behavior detection is a prevalent issue.

Paper
Add Code

SCB-Dataset3: A Benchmark for Detecting Student Classroom Behavior

1 code implementation • 4 Oct 2023 • Fan Yang, Tao Wang

The use of deep learning methods to automatically detect students' classroom behavior is a promising approach for analyzing their class performance and improving teaching effectiveness.

109

Paper
Code

GAFlow: Incorporating Gaussian Attention into Optical Flow

1 code implementation • ICCV 2023 • Ao Luo, Fan Yang, Xin Li, Lang Nie, Chunyu Lin, Haoqiang Fan, Shuaicheng Liu

Moreover, for reliable motion analysis, we provide a new Gaussian-Guided Attention Module (GGAM) which not only inherits properties from Gaussian distribution to instinctively revolve around the neighbor fields of each point but also is empowered to put the emphasis on contextually related regions during matching.

Optical Flow Estimation Representation Learning

Paper
Code

Model-enhanced Vector Index

1 code implementation • NeurIPS 2023 • Hailin Zhang, Yujing Wang, Qi Chen, Ruiheng Chang, Ting Zhang, Ziming Miao, Yingyan Hou, Yang Ding, Xupeng Miao, Haonan Wang, Bochen Pang, Yuefeng Zhan, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Xing Xie, Mao Yang, Bin Cui

We empirically show that our model achieves better performance on the commonly used academic benchmarks MSMARCO Passage and Natural Questions, with comparable serving latency to dense retrieval solutions.

Natural Questions Quantization +1

Paper
Code

Baichuan 2: Open Large-scale Language Models

1 code implementation • 19 Sep 2023 • Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan, Fan Yang, Fei Deng, Feng Wang, Feng Liu, Guangwei Ai, Guosheng Dong, Haizhou Zhao, Hang Xu, Haoze Sun, Hongda Zhang, Hui Liu, Jiaming Ji, Jian Xie, Juntao Dai, Kun Fang, Lei Su, Liang Song, Lifeng Liu, Liyun Ru, Luyao Ma, Mang Wang, Mickel Liu, MingAn Lin, Nuolan Nie, Peidong Guo, Ruiyang Sun, Tao Zhang, Tianpeng Li, Tianyu Li, Wei Cheng, WeiPeng Chen, Xiangrong Zeng, Xiaochuan Wang, Xiaoxi Chen, Xin Men, Xin Yu, Xuehai Pan, Yanjun Shen, Yiding Wang, Yiyu Li, Youxin Jiang, Yuchen Gao, Yupeng Zhang, Zenan Zhou, Zhiying Wu

Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering.

Feature Engineering GSM8K

4,001

Paper
Code

AOSR-Net: All-in-One Sandstorm Removal Network

no code implementations • 16 Sep 2023 • Yazhong Si, Xulong Zhang, Fan Yang, Jianzong Wang, Ning Cheng, Jing Xiao

Most existing sandstorm image enhancement methods are based on traditional theory and prior knowledge, which often restrict their applicability in real-world scenarios.

Image Enhancement Image Restoration

Paper
Add Code

DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research

1 code implementation • 4 Sep 2023 • Yu-Neng Chuang, Guanchu Wang, Chia-Yuan Chang, Kwei-Herng Lai, Daochen Zha, Ruixiang Tang, Fan Yang, Alfredo Costilla Reyes, Kaixiong Zhou, Xiaoqian Jiang, Xia Hu

The exponential growth in scholarly publications necessitates advanced tools for efficient article retrieval, especially in interdisciplinary fields where diverse terminologies are used to describe similar research.

named-entity-recognition Named Entity Recognition +5

Paper
Code

Explainability for Large Language Models: A Survey

no code implementations • 2 Sep 2023 • Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du

For each paradigm, we summarize the goals and dominant approaches for generating local explanations of individual predictions and global explanations of overall model knowledge.

Paper
Add Code

RecMind: Large Language Model Powered Agent For Recommendation

no code implementations • 28 Aug 2023 • Yancheng Wang, Ziyan Jiang, Zheng Chen, Fan Yang, Yingxue Zhou, Eunah Cho, Xing Fan, Xiaojiang Huang, Yanbin Lu, Yingzhen Yang

While the recommendation system (RS) has advanced significantly through deep learning, current RS approaches usually train and fine-tune models on task-specific datasets, limiting their generalizability to new recommendation tasks and their ability to leverage external knowledge due to model scale and data size constraints.

Explanation Generation Language Modelling +2

Paper
Add Code

Multilingual context-based pronunciation learning for Text-to-Speech

no code implementations • 31 Jul 2023 • Giulia Comini, Manuel Sam Ribeiro, Fan Yang, Heereen Shim, Jaime Lorenzo-Trueba

Phonetic information and linguistic knowledge are essential components of a text-to-speech (TTS) front-end.

Paper
Add Code

Words That Stick: Predicting Decision Making and Synonym Engagement Using Cognitive Biases and Computational Linguistics

no code implementations • 26 Jul 2023 • Nimrod Dvir, Elaine Friedman, Suraj Commuri, Fan Yang, Jennifer Romano

This research draws upon cognitive psychology and information systems studies to anticipate user engagement and decision-making on digital platforms.

Decision Making Marketing

Paper
Add Code

A Predictive Model of Digital Information Engagement: Forecasting User Engagement With English Words by Incorporating Cognitive Biases, Computational Linguistics and Natural Language Processing

no code implementations • 26 Jul 2023 • Nimrod Dvir, Elaine Friedman, Suraj Commuri, Fan Yang, Jennifer Romano

This study introduces and empirically tests a novel predictive model for digital information engagement (IE) - the READ model, an acronym for the four pivotal attributes of engaging information: Representativeness, Ease-of-use, Affect, and Distribution.

Language Modelling

Paper
Add Code

One for Multiple: Physics-informed Synthetic Data Boosts Generalizable Deep Learning for Fast MRI Reconstruction

1 code implementation • 25 Jul 2023 • Zi Wang, Xiaotong Yu, Chengyan Wang, Weibo Chen, Jiazheng Wang, Ying-Hua Chu, Hongwei Sun, Rushuai Li, Peiyong Li, Fan Yang, Haiwei Han, Taishan Kang, Jianzhong Lin, Chen Yang, Shufu Chang, Zhang Shi, Sha Hua, Yan Li, Juan Hu, Liuhong Zhu, Jianjun Zhou, Meijing Lin, Jiefeng Guo, Congbo Cai, Zhong Chen, Di Guo, Guang Yang, Xiaobo Qu

We demonstrate that training DL models on synthetic data, coupled with enhanced learning techniques, yields in vivo MRI reconstructions comparable to or surpassing those of models trained on matched realistic datasets, reducing the reliance on real-world MRI data by up to 96%.

Medical Diagnosis MRI Reconstruction

Paper
Code

Spatio-Temporal Classification of Lung Ventilation Patterns using 3D EIT Images: A General Approach for Individualized Lung Function Evaluation

no code implementations • 1 Jul 2023 • Shuzhe Chen, Li Li, Zhichao Lin, Ke Zhang, Ying Gong, Lu Wang, Xu Wu, Maokun Li, Yuanlin Song, Fan Yang, Shenheng Xu

A simple convolutional neural network is used for classification.

Paper
Add Code

ContentCTR: Frame-level Live Streaming Click-Through Rate Prediction with Multimodal Transformer

no code implementations • 26 Jun 2023 • Jiaxin Deng, Dong Shen, Shiyao Wang, Xiangyu Wu, Fan Yang, Guorui Zhou, Gaofeng Meng

However, most previous works treat the live stream as a whole item and explore the Click-through-Rate (CTR) prediction framework at the item level, neglecting the dynamic changes that occur even within the same live room.

Click-Through Rate Prediction Dynamic Time Warping +1

Paper
Add Code

A Pairing Enhancement Approach for Aspect Sentiment Triplet Extraction

no code implementations • 11 Jun 2023 • Fan Yang, Mian Zhang, Gongzhen Hu, Xiabing Zhou

Aspect Sentiment Triplet Extraction (ASTE) aims to extract the triplet of an aspect term, an opinion term, and their corresponding sentiment polarity from the review texts.

Aspect Sentiment Triplet Extraction Contrastive Learning +1

Paper
Add Code

Multi-Task Knowledge Enhancement for Zero-Shot and Multi-Domain Recommendation in an AI Assistant Application

no code implementations • 9 Jun 2023 • Elan Markowitz, Ziyan Jiang, Fan Yang, Xing Fan, Tony Chen, Greg Ver Steeg, Aram Galstyan

We propose in this work to unify these approaches: Using information from interactions in other domains as well as external knowledge graphs to make predictions in a new domain that would be impossible with either information source alone.

Knowledge Graphs Recommendation Systems

Paper
Add Code

Student Classroom Behavior Detection based on Improved YOLOv7

1 code implementation • 6 Jun 2023 • Fan Yang

Accurately detecting student behavior in classroom videos can aid in analyzing their classroom performance and improving teaching effectiveness.

Paper
Code

Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training

no code implementations • 31 May 2023 • Yijia Zhang, Yibo Han, Shijie Cao, Guohao Dai, Youshan Miao, Ting Cao, Fan Yang, Ningyi Xu

We find that previous gradient accumulation reduces activation memory but fails to be compatible with gradient memory reduction due to a contradiction between preserving gradients and releasing gradients.

Paper
Add Code

Reversible Quantization Index Modulation for Static Deep Neural Network Watermarking

no code implementations • 29 May 2023 • Junren Qin, Shanxiang Lyu, Fan Yang, Jiarui Deng, Zhihua Xia, Xiaochun Cao

In this paper, we propose a novel RDH-based static DNN watermarking scheme using quantization index modulation (QIM).

Quantization

Paper
Add Code

Self-aware and Cross-sample Prototypical Learning for Semi-supervised Medical Image Segmentation

no code implementations • 25 May 2023 • Zhenxi Zhang, Ran Ran, Chunna Tian, Heng Zhou, Xin Li, Fan Yang, Zhicheng Jiao

To address these issues, we propose a self-aware and cross-sample prototypical learning method (SCP-Net) to enhance the diversity of prediction in consistency learning by utilizing a broader range of semantic information derived from multiple inputs.

Image Segmentation Semantic Segmentation +1

Paper
Add Code

Cross-supervised Dual Classifiers for Semi-supervised Medical Image Segmentation

no code implementations • 25 May 2023 • Zhenxi Zhang, Ran Ran, Chunna Tian, Heng Zhou, Fan Yang, Xin Li, Zhicheng Jiao

This paper proposes a cross-supervised learning framework based on dual classifiers (DC-Net), including an evidential classifier and a vanilla classifier.

Image Segmentation Segmentation +2

Paper
Add Code

Graph Meets LLM: A Novel Approach to Collaborative Filtering for Robust Conversational Understanding

no code implementations • 23 May 2023 • Zheng Chen, Ziyan Jiang, Fan Yang, Eunah Cho, Xing Fan, Xiaojiang Huang, Yanbin Lu, Aram Galstyan

This paper presents our "Collaborative Query Rewriting" approach, which specifically addresses the task of rewriting new user interactions that have not been previously observed in the user's history.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +9

Paper
Add Code

Interpretation of Time-Series Deep Models: A Survey

no code implementations • 23 May 2023 • Ziqi Zhao, Yucheng Shi, Shushan Wu, Fan Yang, WenZhan Song, Ninghao Liu

Deep learning models developed for time-series associated tasks have become more widely researched nowadays.

Time Series

Paper
Add Code

DEGREE: Decomposition Based Explanation For Graph Neural Networks

1 code implementation • ICLR 2022 • Qizhang Feng, Ninghao Liu, Fan Yang, Ruixiang Tang, Mengnan Du, Xia Hu

Graph Neural Networks (GNNs) are gaining extensive attention for their application in graph data.

Graph Classification Node Classification

Paper
Code

Integer or Floating Point? New Outlooks for Low-Bit Quantization on Large Language Models

no code implementations • 21 May 2023 • Yijia Zhang, Lingran Zhao, Shijie Cao, WenQiang Wang, Ting Cao, Fan Yang, Mao Yang, Shanghang Zhang, Ningyi Xu

In this study, we conduct a comparative analysis of INT and FP quantization with the same bit-width, revealing that the optimal quantization format varies across different layers due to the complexity and diversity of tensor distribution.

Quantization

Paper
Add Code

OR-NeRF: Object Removing from 3D Scenes Guided by Multiview Segmentation with Neural Radiance Fields

1 code implementation • 17 May 2023 • Youtan Yin, Zhoujie Fu, Fan Yang, Guosheng Lin

This paper proposes a novel object-removing pipeline, named OR-NeRF, that can remove objects from 3D scenes with user-given points or text prompts on a single view, achieving better performance in less time than previous works.

3D scene Editing Novel View Synthesis +1

Paper
Code

The Ways of Words: The Impact of Word Choice on Information Engagement and Decision Making

no code implementations • 16 May 2023 • Nimrod Dvir, Elaine Friedman, Suraj Commuri, Fan Yang, Jennifer Romano

The framework was empirically validated in a large-scale user study measuring how word choice impacts the dimensions of IE.

Decision Making

Paper
Add Code

Student Classroom Behavior Detection based on YOLOv7-BRA and Multi-Model Fusion

1 code implementation • 13 May 2023 • Fan Yang, Tao Wang, Xiaofei Wang

We constructed a dataset, which contained 11,248 labels and 4,001 images, with an emphasis on the common behavior of raising hands in a classroom setting (Student Classroom Behavior dataset, SCB-Dataset).

Paper
Code

PALR: Personalization Aware LLMs for Recommendation

no code implementations • 12 May 2023 • Fan Yang, Zheng Chen, Ziyan Jiang, Eunah Cho, Xiaojiang Huang, Yanbin Lu

Then we adopt an LLM-based ranking model to generate recommended items.

Retrieval Sequential Recommendation

Paper
Add Code

SSD-MonoDETR: Supervised Scale-aware Deformable Transformer for Monocular 3D Object Detection

1 code implementation • 12 May 2023 • Xuan He, Fan Yang, Kailun Yang, Jiacheng Lin, Haolong Fu, Meng Wang, Jin Yuan, Zhiyong Li

To tackle this problem, this paper proposes a novel "Supervised Scale-aware Deformable Attention" (SSDA) for monocular 3D object detection.

Monocular 3D Object Detection Object +1

Paper
Code

HACMan: Learning Hybrid Actor-Critic Maps for 6D Non-Prehensile Manipulation

no code implementations • 6 May 2023 • Wenxuan Zhou, Bowen Jiang, Fan Yang, Chris Paxton, David Held

In this work, we introduce Hybrid Actor-Critic Maps for Manipulation (HACMan), a reinforcement learning approach for 6D non-prehensile manipulation of objects using point cloud observations.

Object

Paper
Add Code

Event-Free Moving Object Segmentation from Moving Ego Vehicle

2 code implementations • 28 Apr 2023 • Zhuyun Zhou, Zongwei Wu, Danda Pani Paudel, Rémi Boutteau, Fan Yang, Luc van Gool, Radu Timofte, Dominique Ginhac

Subsequently, we devise EmoFormer, a novel network able to exploit the event data.

Autonomous Driving Object +6

Paper
Code

MDDL: A Framework for Reinforcement Learning-based Position Allocation in Multi-Channel Feed

no code implementations • 17 Apr 2023 • Xiaowen Shi, Ze Wang, Yuanying Cai, Xiaoxu Wu, Fan Yang, Guogang Liao, Yongkang Wang, Xingxing Wang, Dong Wang

There are two types of data employed to train a reinforcement learning (RL) model for position allocation, named strategy data and random data.

Imitation Learning Position +2

Paper
Add Code

Reweighted Mixup for Subpopulation Shift

no code implementations • 9 Apr 2023 • Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian, Peilin Zhao, QinGhua Hu, Bingzhe Wu, Changqing Zhang, Jianhua Yao

Subpopulation shift, in which the training and test distributions contain the same subpopulation groups but in different proportions, exists widely in many real-world applications.

Fairness Generalization Bounds

Paper
Add Code

A large-scale dataset for end-to-end table recognition in the wild

1 code implementation • 27 Mar 2023 • Fan Yang, Lei Hu, Xinwu Liu, Shuangping Huang, Zhenghui Gu

To this end, we propose a new large-scale dataset named Table Recognition Set (TabRecSet) with diverse table forms sourced from multiple scenarios in the wild, providing complete annotation dedicated to end-to-end TR research.

Table annotation Table Detection +1

Paper
Code

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation

no code implementations • 22 Mar 2023 • Shengming Yin, Chenfei Wu, Huan Yang, JianFeng Wang, Xiaodong Wang, Minheng Ni, Zhengyuan Yang, Linjie Li, Shuguang Liu, Fan Yang, Jianlong Fu, Gong Ming, Lijuan Wang, Zicheng Liu, Houqiang Li, Nan Duan

In this paper, we propose NUWA-XL, a novel Diffusion over Diffusion architecture for eXtremely Long video generation.

Video Generation

Paper
Add Code

Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking

1 code implementation • 20 Mar 2023 • Ruixiang Tang, Qizhang Feng, Ninghao Liu, Fan Yang, Xia Hu

To overcome this challenge, we introduce a clean-label backdoor watermarking framework that uses imperceptible perturbations to replace mislabeled samples.

Anomaly Detection

Paper
Code

IRGen: Generative Modeling for Image Retrieval

1 code implementation • 17 Mar 2023 • Yidan Zhang, Ting Zhang, Dong Chen, Yujing Wang, Qi Chen, Xing Xie, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Mao Yang, Qingmin Liao, Baining Guo

While generative modeling has been ubiquitous in natural language processing and computer vision, its application to image retrieval remains unexplored.

Image Retrieval Retrieval

Paper
Code

Data-centric Artificial Intelligence: A Survey

10 code implementations • 17 Mar 2023 • Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Zhimeng Jiang, Shaochen Zhong, Xia Hu

Artificial Intelligence (AI) is making a profound impact in almost every domain.

Paper
Code

SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers

no code implementations • 15 Mar 2023 • Guoqiang Jin, Fan Yang, Mingshan Sun, Ruyi Zhao, Yakun Liu, Wei Li, Tianpeng Bao, Liwei Wu, Xingyu Zeng, Rui Zhao

To this end, we propose SeqCo-DETR, a novel Sequence Consistency-based self-supervised method for object DEtection with TRansformers.

Object object-detection +2

Paper
Add Code

Generation-Guided Multi-Level Unified Network for Video Grounding

no code implementations • 14 Mar 2023 • Xing Cheng, Xiangyu Wu, Dong Shen, Hezheng Lin, Fan Yang

Video grounding aims to locate the timestamps best matching the query description within an untrimmed video.

Video Grounding

Paper
Add Code

AGTGAN: Unpaired Image Translation for Photographic Ancient Character Generation

1 code implementation • 13 Mar 2023 • Hongxiang Huang, Daihui Yang, Gang Dai, Zhen Han, Yuyi Wang, Kin-Man Lam, Fan Yang, Shuangping Huang, Yongge Liu, Mengchao He

We evaluate our approach on the photographic ancient character datasets, e.g., OBC306 and CSDD.

Generative Adversarial Network Translation

Paper
Code

Spatial Attention and Syntax Rule Enhanced Tree Decoder for Offine Handwritten Mathematical Expression Recognition

no code implementations • 13 Mar 2023 • Zihao Lin, Jinrong Li, Fan Yang, Shuangping Huang, Xu Yang, Jianmin Lin, Ming Yang

In this paper, we propose a novel model called Spatial Attention and Syntax Rule Enhanced Tree Decoder (SS-TD), which is equipped with a spatial attention mechanism to alleviate prediction errors in the tree structure and uses syntax masks (obtained from the transformation of syntax rules) to constrain the occurrence of ungrammatical mathematical expressions.

Decoder

Paper
Add Code

NoiseCAM: Explainable AI for the Boundary Between Noise and Adversarial Attacks

no code implementations • 9 Mar 2023 • Wenkai Tan, Justus Renkhoff, Alvaro Velasquez, Ziyu Wang, Lusi Li, Jian Wang, Shuteng Niu, Fan Yang, Yongxin Liu, Houbing Song

Our work could provide a useful tool to defend against certain adversarial attacks on deep neural networks.

Paper
Add Code

CoRTX: Contrastive Framework for Real-time Explanation

1 code implementation • 5 Mar 2023 • Yu-Neng Chuang, Guanchu Wang, Fan Yang, Quan Zhou, Pushkar Tripathi, Xuanting Cai, Xia Hu

In this work, we propose a COntrastive Real-Time eXplanation (CoRTX) framework to learn the explanation-oriented representation and relieve the intensive dependence of explainer training on explanation labels.

Contrastive Learning

Paper
Code

Learning 3D Photography Videos via Self-supervised Diffusion on Single Images

no code implementations • 21 Feb 2023 • Xiaodong Wang, Chenfei Wu, Shengming Yin, Minheng Ni, JianFeng Wang, Linjie Li, Zhengyuan Yang, Fan Yang, Lijuan Wang, Zicheng Liu, Yuejian Fang, Nan Duan

3D photography renders a static image into a video with appealing 3D visual effects.

Ranked #1 on Image Outpainting on MSCOCO

Image Outpainting Monocular Depth Estimation

Paper
Add Code

Deep Seam Prediction for Image Stitching Based on Selection Consistency Loss

no code implementations • 10 Feb 2023 • Senmao Cheng, Fan Yang, Zhi Chen, Nanjun Yuan, Wenbing Tao

To our knowledge, the proposed DSeam is the first deep learning based seam prediction method for image stitching.

Image Stitching

Paper
Add Code

A Unified Multi-view Multi-person Tracking Framework

no code implementations • 8 Feb 2023 • Fan Yang, Shigeyuki Odashima, Sosuke Yamao, Hiroaki Fujimoto, Shoichi Masui, Shan Jiang

Although there has been significant development in 3D Multi-view Multi-person Tracking (3D MM-Tracking), current 3D MM-Tracking frameworks are designed separately for footprint and pose tracking.

Ranked #1 on Object Tracking on MMPTRACK

3D Multi-Person Pose Estimation Multiple People Tracking +2

Paper
Add Code

Efficient XAI Techniques: A Taxonomic Survey

no code implementations • 7 Feb 2023 • Yu-Neng Chuang, Guanchu Wang, Fan Yang, Zirui Liu, Xuanting Cai, Mengnan Du, Xia Hu

Finally, we summarize the challenges of deploying XAI acceleration methods to real-world scenarios, overcoming the trade-off between faithfulness and efficiency, and the selection of different acceleration methods.

Explainable artificial intelligence Explainable Artificial Intelligence (XAI)

Paper
Add Code

PIER: Permutation-Level Interest-Based End-to-End Re-ranking Framework in E-commerce

1 code implementation • 6 Feb 2023 • Xiaowen Shi, Fan Yang, Ze Wang, Xiaoxu Wu, Muzhi Guan, Guogang Liao, Yongkang Wang, Xingxing Wang, Dong Wang

Then we design a novel omnidirectional attention mechanism in OCPM to capture the context information in the permutation.

Re-Ranking

Paper
Code

Conditional generalized quantiles based on expected utility model and equivalent characterization of properties

no code implementations • 29 Jan 2023 • Qinyu Wu, Fan Yang, Ping Zhang

As a counterpart to the (static) risk measures of generalized quantiles and motivated by Bellini et al. (2018), we propose a new kind of conditional risk measure called conditional generalized quantiles.

Paper
Add Code

PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation

no code implementations • 26 Jan 2023 • Ningxin Zheng, Huiqiang Jiang, Quanlu Zhang, Zhenhua Han, Yuqing Yang, Lingxiao Ma, Fan Yang, Chengruidong Zhang, Lili Qiu, Mao Yang, Lidong Zhou

Dynamic sparsity, where the sparsity patterns are unknown until runtime, poses a significant challenge to deep learning.

Paper
Add Code

SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction

no code implementations • 21 Jan 2023 • Zhiqi Lin, Youshan Miao, Guodong Liu, Xiaoxiang Shi, Quanlu Zhang, Fan Yang, Saeed Maleki, Yi Zhu, Xu Cao, Cheng Li, Mao Yang, Lintao Zhang, Lidong Zhou

SuperScaler is a system that facilitates the design and generation of highly flexible parallelization plans.

Scheduling

Paper
Add Code

Data-centric AI: Perspectives and Challenges

1 code implementation • 12 Jan 2023 • Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Xia Hu

The role of data in building AI systems has recently been significantly magnified by the emerging concept of data-centric AI (DCAI), which advocates a fundamental shift from model advancements to ensuring data quality and reliability.

Paper
Code

AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts

1 code implementation • ICCV 2023 • Tianlong Chen, Xuxi Chen, Xianzhi Du, Abdullah Rashwan, Fan Yang, Huizhong Chen, Zhangyang Wang, Yeqing Li

Instead of compressing multiple tasks' knowledge into a single model, MoE separates the parameter space and only utilizes the relevant model pieces given task type and its input, which provides stabilized MTL training and ultra-efficient inference.

Instance Segmentation Multi-Task Learning +3

Paper
Code

Towards Blind Watermarking: Combining Invertible and Non-invertible Mechanisms

1 code implementation • 24 Dec 2022 • Rui Ma, Mengxi Guo, Yi Hou, Fan Yang, Yuan Li, Huizhu Jia, Xiaodong Xie

The CIN is composed of the invertible part to achieve high imperceptibility and the non-invertible part to strengthen the robustness against strong noise attacks.

Paper
Code

Exploring Stochastic Autoregressive Image Modeling for Visual Representation

1 code implementation • 3 Dec 2022 • Yu Qi, Fan Yang, Yousong Zhu, Yufei Liu, Liwei Wu, Rui Zhao, Wei Li

By introducing stochastic prediction and the parallel encoder-decoder, SAIM significantly improves the performance of autoregressive image modeling.

Decoder Self-Supervised Learning

Paper
Code

Instance-level Heterogeneous Domain Adaptation for Limited-labeled Sketch-to-Photo Retrieval

1 code implementation • IEEE Transactions on Multimedia 2020 • Fan Yang, Yang Wu, Zheng Wang, Xiang Li, Sakriani Sakti, Satoshi Nakamura

Therefore, previous works pre-train their models on rich-labeled photo retrieval data (i.e., source domain) and then fine-tune them on the limited-labeled sketch-to-photo retrieval data (i.e., target domain).

Ranked #1 on Image Retrieval on PKU-Reid

Domain Adaptation Image Retrieval +1

Paper
Code

MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts

1 code implementation • 25 Nov 2022 • Xiangyu Xi, Jianwei Lv, Shuaipeng Liu, Wei Ye, Fan Yang, Guanglu Wan

As a pioneering exploration that expands event detection to the scenarios involving informal and heterogeneous texts, we propose a new large-scale Chinese event detection dataset based on user reviews, text conversations, and phone conversations in a leading e-commerce platform for food service.

Event Detection

Paper
Code

The Second-place Solution for CVPR 2022 SoccerNet Tracking Challenge

no code implementations • 24 Nov 2022 • Fan Yang, Shigeyuki Odashima, Shoichi Masui, Shan Jiang

This is our second-place solution for CVPR 2022 SoccerNet Tracking Challenge.

Clustering

Paper
Add Code

The Second-place Solution for ECCV 2022 Multiple People Tracking in Group Dance Challenge

no code implementations • 24 Nov 2022 • Fan Yang, Shigeyuki Odashima, Shoichi Masui, Shan Jiang

This is our 2nd-place solution for the ECCV 2022 Multiple People Tracking in Group Dance Challenge.

Motion Estimation Multiple People Tracking

Paper
Add Code

Hard to Track Objects with Irregular Motions and Similar Appearances? Make It Easier by Buffering the Matching Space

no code implementations • 24 Nov 2022 • Fan Yang, Shigeyuki Odashima, Shoichi Masui, Shan Jiang

To address this issue, our C-BIoU tracker adds buffers to expand the matching space of detections and tracks, which mitigates the effect of irregular motions in two aspects: one is to directly match identical but non-overlapping detections and tracks in adjacent frames, and the other is to compensate for the motion estimation bias in the matching space.

Ranked #14 on Multi-Object Tracking on DanceTrack

Motion Estimation Multi-Object Tracking +1

Paper
Add Code
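
The buffering idea in the abstract above can be illustrated with a minimal sketch. It assumes boxes are given as (x, y, w, h) and that each box is expanded on every side by a margin proportional to its width and height before IoU is computed. The function names and the buffer scale of 0.3 are illustrative assumptions, not the authors' code, and the tracker's motion compensation is not shown.

```python
def buffer_box(box, scale):
    """Expand an (x, y, w, h) box by scale*w horizontally and scale*h vertically on each side."""
    x, y, w, h = box
    return (x - scale * w, y - scale * h, w + 2 * scale * w, h + 2 * scale * h)


def iou(a, b):
    """Plain intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Overlap extents are clamped at zero when the boxes do not intersect.
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def buffered_iou(a, b, scale):
    """IoU computed on the buffered versions of both boxes."""
    return iou(buffer_box(a, scale), buffer_box(b, scale))


# A track and a detection of the same object separated by a 2-pixel gap.
track = (0.0, 0.0, 10.0, 10.0)
detection = (12.0, 0.0, 10.0, 10.0)

plain = iou(track, detection)                   # boxes do not overlap
buffered = buffered_iou(track, detection, 0.3)  # buffered boxes do overlap
```

With the plain IoU the two boxes cannot be matched because they do not overlap, whereas the buffered boxes share a region and yield a positive score that an association step could use.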

A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset

no code implementations • 19 Nov 2022 • Jiaxin Deng, Dong Shen, Haojie Pan, Xiangyu Wu, Ximan Liu, Gaofeng Meng, Fan Yang, Size Li, Ruiji Fu, Zhongyuan Wang

Furthermore, based on this dataset, we propose an end-to-end model that jointly optimizes the video understanding objective with knowledge graph embedding, which can not only better inject factual knowledge into video understanding but also generate effective multi-modal entity embedding for KG.

Common Sense Reasoning Knowledge Graph Embedding +4

Paper
Add Code

Self-distillation with Online Diffusion on Batch Manifolds Improves Deep Metric Learning

1 code implementation • 14 Nov 2022 • Zelong Zeng, Fan Yang, Hong Liu, Shin'ichi Satoh

However, this type of method normally ignores the crucial knowledge hidden in the data (e.g., intra-class information variation), which is harmful to the generalization of the trained model.

Metric Learning

Paper
Code

Deep-Learning-Empowered Inverse Design for Freeform Reconfigurable Metasurfaces

no code implementations • 11 Nov 2022 • Changhao Liu, Fan Yang, Maokun Li, Shenheng Xu

Recently, artificial neural network empowered inverse design for metasurfaces has been developed that can design on-demand meta-atoms with diverse shapes and high performance, where the design process based on artificial intelligence is fast and automatic.

Paper
Add Code

Multimodal Learning for Non-small Cell Lung Cancer Prognosis

no code implementations • 7 Nov 2022 • Yujiao Wu, Yaxiong Wang, Xiaoshui Huang, Fan Yang, Sai Ho Ling, Steven Weidong Su

This paper focuses on the task of survival time analysis for lung cancer.

Decision Making Survival Analysis

Paper
Add Code

ISA-Net: Improved spatial attention network for PET-CT tumor segmentation

no code implementations • 4 Nov 2022 • Zhengyong Huang, Sijuan Zou, Guoshuai Wang, Zixiang Chen, Hao Shen, HaiYan Wang, Na Zhang, Lu Zhang, Fan Yang, Haining Wang, Dong Liang, Tianye Niu, Xiaohua Zhu, Zhanli Hu

In this paper, we propose a deep learning segmentation method based on multimodal positron emission tomography-computed tomography (PET-CT), which combines the high sensitivity of PET and the precise anatomical information of CT. We design an improved spatial attention network (ISA-Net) to increase the accuracy of PET or CT in detecting tumors, which uses multi-scale convolution operations to extract feature information and can highlight the tumor region location information and suppress the non-tumor region location information.

Segmentation STS +1

Paper
Add Code

Ground Plane Matters: Picking Up Ground Plane Prior in Monocular 3D Object Detection

no code implementations • 3 Nov 2022 • Fan Yang, Xinhao Xu, Hui Chen, Yuchen Guo, Jungong Han, Kai Ni, Guiguang Ding

To pick up the ground plane prior for M3OD, we propose a Ground Plane Enhanced Network (GPENet) which resolves both issues at one go.

Monocular 3D Object Detection object-detection

Paper
Add Code

SIMPLE-RC: Group Network Inference with Non-Sharp Nulls and Weak Signals

no code implementations • 31 Oct 2022 • Jianqing Fan, Yingying Fan, Jinchi Lv, Fan Yang

To address these practical challenges, in this paper we propose a SIMPLE method with random coupling (SIMPLE-RC) for testing the non-sharp null hypothesis that a group of given nodes share similar (not necessarily identical) membership profiles under weaker signals.

Uncertainty Quantification

Paper
Add Code

Revisiting Attention Weights as Explanations from an Information Theoretic Perspective

no code implementations • 31 Oct 2022 • Bingyang Wen, K. P. Subbalakshmi, Fan Yang

Attention mechanisms have recently demonstrated impressive performance on a range of NLP tasks, and attention scores are often used as a proxy for model explainability.

Deep Attention

Paper
Add Code

Forecasting Human Trajectory from Scene History

1 code implementation • 17 Oct 2022 • Mancheng Meng, Ziyan Wu, Terrence Chen, Xiran Cai, Xiang Sean Zhou, Fan Yang, Dinggang Shen

We categorize scene history information into two types: historical group trajectory and individual-surroundings interaction.

Trajectory Prediction

Paper
Code

SoccerNet 2022 Challenges Results

7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.

Action Spotting Camera Calibration +3

Paper
Code

Improving alignment of dialogue agents via targeted human judgements

no code implementations • 28 Sep 2022 • Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, Timo Ewalds, Maribeth Rauh, Laura Weidinger, Martin Chadwick, Phoebe Thacker, Lucy Campbell-Gillingham, Jonathan Uesato, Po-Sen Huang, Ramona Comanescu, Fan Yang, Abigail See, Sumanth Dathathri, Rory Greig, Charlie Chen, Doug Fritz, Jaume Sanchez Elias, Richard Green, Soňa Mokrá, Nicholas Fernando, Boxi Wu, Rachel Foley, Susannah Young, Iason Gabriel, William Isaac, John Mellor, Demis Hassabis, Koray Kavukcuoglu, Lisa Anne Hendricks, Geoffrey Irving

We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless compared to prompted language model baselines.

Language Modelling

Paper
Add Code

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

2 code implementations • 28 Sep 2022 • Zhiyang Chen, Yousong Zhu, Zhaowen Li, Fan Yang, Wei Li, Haixin Wang, Chaoyang Zhao, Liwei Wu, Rui Zhao, Jinqiao Wang, Ming Tang

Obj2Seq is able to flexibly determine input categories to satisfy customized requirements, and be easily extended to different visual tasks.

Multi-Label Classification Object +2

Paper
Code

Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training

no code implementations • 22 Sep 2022 • Cong Guo, Yuxian Qiu, Jingwen Leng, Chen Zhang, Ying Cao, Quanlu Zhang, Yunxin Liu, Fan Yang, Minyi Guo

An activation function is an element-wise mathematical function and plays a crucial role in deep neural networks (DNN).

Paper
Add Code

UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup

1 code implementation • 19 Sep 2022 • Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian, Peilin Zhao, Bingzhe Wu, Changqing Zhang, Jianhua Yao

Importance reweighting is a common way to handle the subpopulation shift issue by imposing constant or adaptive sampling weights on each sample in the training dataset.

Generalization Bounds

Paper
Code

RGB-Event Fusion for Moving Object Detection in Autonomous Driving

1 code implementation • 17 Sep 2022 • Zhuyun Zhou, Zongwei Wu, Rémi Boutteau, Fan Yang, Cédric Demonceaux, Dominique Ginhac

Moving Object Detection (MOD) is a critical vision task for successfully achieving safe autonomous driving.

Autonomous Driving Moving Object Detection +1

Paper
Code

ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization

1 code implementation • 30 Aug 2022 • Cong Guo, Chen Zhang, Jingwen Leng, Zihan Liu, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu

In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads.

Quantization

Paper
Code

Actor-identified Spatiotemporal Action Detection --- Detecting Who Is Doing What in Videos

1 code implementation • 27 Aug 2022 • Fan Yang, Norimichi Ukita, Sakriani Sakti, Satoshi Nakamura

By using MOT, the spatiotemporal boundary of each actor is obtained and assigned to a unique actor identity.

Action Classification Action Detection +3

Paper
Code

Surrogate-assisted Multi-objective Neural Architecture Search for Real-time Semantic Segmentation

no code implementations • 14 Aug 2022 • Zhichao Lu, Ran Cheng, Shihua Huang, Haoming Zhang, Changxiao Qiu, Fan Yang

The main challenges of applying NAS to semantic segmentation arise from two aspects: (i) high-resolution images to be processed; (ii) additional requirement of real-time inference speed (i.e., real-time semantic segmentation) for applications such as autonomous driving.

Autonomous Driving Image Classification +3

Paper
Add Code

Differentially Private Counterfactuals via Functional Mechanism

no code implementations • 4 Aug 2022 • Fan Yang, Qizhang Feng, Kaixiong Zhou, Jiahao Chen, Xia Hu

Counterfactuals, an emerging type of model explanation, have recently attracted considerable attention from both industry and academia.

counterfactual valid

Paper
Add Code

ReMix: A General and Efficient Framework for Multiple Instance Learning based Whole Slide Image Classification

1 code implementation • 5 Jul 2022 • Jiawei Yang, Hanbo Chen, Yu Zhao, Fan Yang, Yao Zhang, Lei He, Jianhua Yao

We evaluate ReMix on two public datasets with two state-of-the-art MIL methods.

Data Augmentation Image Classification +1

Paper
Code

Accelerating Shapley Explanation via Contributive Cooperator Selection

1 code implementation • 17 Jun 2022 • Guanchu Wang, Yu-Neng Chuang, Mengnan Du, Fan Yang, Quan Zhou, Pushkar Tripathi, Xuanting Cai, Xia Hu

Even though Shapley value provides an effective explanation for a DNN model prediction, the computation relies on the enumeration of all possible input feature coalitions, which leads to the exponentially growing complexity.

Paper
Code

Improving Generalization of Metric Learning via Listwise Self-distillation

1 code implementation • 17 Jun 2022 • Zelong Zeng, Fan Yang, Zheng Wang, Shin'ichi Satoh

Most deep metric learning (DML) methods employ a strategy that forces all positive samples to be close in the embedding space while keeping them away from negative ones.

Metric Learning

Paper
Code

Learning Interpretable Decision Rule Sets: A Submodular Optimization Approach

no code implementations • NeurIPS 2021 • Fan Yang, Kai He, Linxiao Yang, Hongxia Du, Jingbang Yang, Bo Yang, Liang Sun

The learning problem is framed as a subset selection task in which a subset of all possible rules needs to be selected to form an accurate and interpretable rule set.

Paper
Add Code

Tutel: Adaptive Mixture-of-Experts at Scale

2 code implementations • 7 Jun 2022 • Changho Hwang, Wei Cui, Yifan Xiong, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Salas, Jithin Jose, Prabhat Ram, Joe Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong

On efficiency, Flex accelerates SwinV2-MoE, achieving up to 1.55x and 2.11x speedup in training and inference over Fairseq, respectively.

Object Detection

Paper
Code

Geo-Localization via Ground-to-Satellite Cross-View Image Retrieval

1 code implementation • 22 May 2022 • Zelong Zeng, Zheng Wang, Fan Yang, Shin'ichi Satoh

The large variation of viewpoint and irrelevant content around the target always hinder accurate image retrieval and its subsequent tasks.

Image Retrieval Representation Learning +1

Paper
Code

Demo: low-power communications based on RIS and AI for 6G

no code implementations • 21 May 2022 • Mingyao Cui, Zidong Wu, Yuhao Chen, Shenheng Xu, Fan Yang, Linglong Dai

By jointly designing the hardware and software, this prototype can realize real-time 4K video transmission with much reduced power consumption.

Paper
Add Code

NMA: Neural Multi-slot Auctions with Externalities for Online Advertising

no code implementations • 20 May 2022 • Guogang Liao, Xuejian Li, Ze Wang, Fan Yang, Muzhi Guan, Bingqi Zhu, Yongkang Wang, Xingxing Wang, Dong Wang

Although VCG-based multi-slot auctions (e.g., VCG, WVCG) make it theoretically possible to model global externalities (e.g., the order and positions of ads and so on), they lack an efficient balance of both revenue and social welfare.

Paper
Add Code

A Low-Cost, Controllable and Interpretable Task-Oriented Chatbot: With Real-World After-Sale Services as Example

no code implementations • 13 May 2022 • Xiangyu Xi, Chenxu Lv, Yuncheng Hua, Wei Ye, Chaobo Sun, Shuaipeng Liu, Fan Yang, Guanglu Wan

Though widely used in industry, traditional task-oriented dialogue systems suffer from three bottlenecks: (i) difficult ontology construction (e.g., intents and slots); (ii) poor controllability and interpretability; (iii) annotation-hungry.

Chatbot Task-Oriented Dialogue Systems

Paper
Add Code

Limited-memory BFGS Optimisation of Phase-Only Computer-Generated Hologram for Fraunhofer Diffraction

no code implementations • 10 May 2022 • Jinze Sha, Andrew Kadis, Fan Yang, Timothy D. Wilkinson

We implement a novel limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) optimisation algorithm with cross entropy (CE) loss function, to produce phase-only computer-generated hologram (CGH) for holographic displays, with validation on a binary-phase modulation holographic projector.

Paper
Add Code

Learning Individual Interactions from Population Dynamics with Discrete-Event Simulation Model

no code implementations • 4 May 2022 • Yan Shen, Fan Yang, Mingchen Gao, Wen Dong

Traditional machine learning approaches capture complex system dynamics either with dynamic Bayesian networks and state space models, which are hard to scale because it is non-trivial to prescribe the dynamics with a sparse graph or a system of differential equations, or with deep neural networks, where the distributed representation of the learned dynamics is hard to interpret.

Paper
Add Code

A Multi-Person Video Dataset Annotation Method of Spatio-Temporally Actions

1 code implementation • 21 Apr 2022 • Fan Yang

Spatio-temporal action detection is an important and challenging problem in video understanding.

Action Detection Video Understanding

Paper
Code

Distill-VQ: Learning Retrieval Oriented Vector Quantization By Distilling Knowledge from Dense Embeddings

2 code implementations • 1 Apr 2022 • Shitao Xiao, Zheng Liu, Weihao Han, Jianjin Zhang, Defu Lian, Yeyun Gong, Qi Chen, Fan Yang, Hao Sun, Yingxia Shao, Denvy Deng, Qi Zhang, Xing Xie

We perform comprehensive explorations for the optimal conduct of knowledge distillation, which may provide useful insights for the learning of VQ based ANN index.

Contrastive Learning Knowledge Distillation +2

Paper
Code

SC^2-PCR: A Second Order Spatial Compatibility for Efficient and Robust Point Cloud Registration

1 code implementation • 28 Mar 2022 • Zhi Chen, Kun Sun, Fan Yang, Wenbing Tao

In this paper, we present a second order spatial compatibility (SC^2) measure based method for efficient and robust point cloud registration (PCR), called SC^2-PCR.

Point Cloud Registration

Paper
Code

Confidence Calibration for Intent Detection via Hyperspherical Space and Rebalanced Accuracy-Uncertainty Loss

no code implementations • 17 Mar 2022 • Yantao Gong, Cao Liu, Fan Yang, Xunliang Cai, Guanglu Wan, Jiansong Chen, Weipeng Zhang, Houfeng Wang

Experiments on the open datasets verify that our model outperforms the existing calibration methods and achieves a significant improvement on the calibration metric.

Intent Detection

Paper
Add Code

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

no code implementations • CVPR 2022 • Zhaowen Li, Yousong Zhu, Fan Yang, Wei Li, Chaoyang Zhao, Yingying Chen, Zhiyang Chen, Jiahao Xie, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang

Furthermore, our method can also exploit single-centric-object dataset such as ImageNet and outperforms BYOL by 2.5% with the same pre-training epochs in linear probing, and surpass current self-supervised object detection methods on COCO dataset, demonstrating its universality and potential.

Image Classification Object +4

Paper
Add Code

Learning from Attacks: Attacking Variational Autoencoder for Improving Image Classification

no code implementations • 11 Mar 2022 • Jianzhang Zheng, Fan Yang, Hao Shen, Xuan Tang, Mingsong Chen, Liang Song, Xian Wei

We propose an algorithmic framework that leverages the advantages of the DNNs for data self-expression and task-specific predictions, to improve image classification.

Classification Image Classification

Paper
Add Code

Class-Aware Contrastive Semi-Supervised Learning

1 code implementation • CVPR 2022 • Fan Yang, Kai Wu, Shuyi Zhang, Guannan Jiang, Yong Liu, Feng Zheng, Wei Zhang, Chengjie Wang, Long Zeng

Pseudo-label-based semi-supervised learning (SSL) has achieved great success on raw data utilization.

Ranked #1 on Semi-Supervised Image Classification on CIFAR-100 (250 Labels, ImageNet-100 Unlabeled)

Pseudo Label Semi-Supervised Image Classification

Paper
Code

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

1 code implementation • ICLR 2022 • Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo

This paper proposes an on-the-fly DFQ framework with sub-second quantization time, called SQuant, which can quantize networks on inference-only devices with low computation and memory requirements.

Data Free Quantization

Paper
Code

Learning Optical Flow with Adaptive Graph Reasoning

1 code implementation • 8 Feb 2022 • Ao Luo, Fan Yang, Kunming Luo, Xin Li, Haoqiang Fan, Shuaicheng Liu

Our key idea is to decouple the context reasoning from the matching procedure, and exploit scene information to effectively assist motion estimation by learning to reason over the adaptive graph.

Motion Estimation Optical Flow Estimation +1

Paper
Code

A comprehensive benchmark analysis for sand dust image reconstruction

no code implementations • 7 Feb 2022 • Yazhong Si, Fan Yang, Ya Guo, Wei Zhang, Yipu Yang

In this paper, we presented a comprehensive perceptual study and analysis of real-world sand dust images, then constructed a Sand-dust Image Reconstruction Benchmark (SIRB) for training Convolutional Neural Networks (CNNs) and evaluating algorithm performance.

Image Enhancement Image Reconstruction

Paper
Add Code

Do Smart Glasses Dream of Sentimental Visions? Deep Emotionship Analysis for Eyewear Devices

1 code implementation • 24 Jan 2022 • Yingying Zhao, Yuhu Chang, Yutian Lu, Yujiang Wang, Mingzhi Dong, Qin Lv, Robert P. Dick, Fan Yang, Tun Lu, Ning Gu, Li Shang

Experimental studies with 20 participants demonstrate that, thanks to the emotionship awareness, EMOShip not only achieves superior emotion recognition accuracy over existing methods (80.2% vs. 69.4%), but also provides a valuable understanding of the cause of emotions.

Emotion Recognition

Paper
Code

BBA-net: A bi-branch attention network for crowd counting

no code implementations • 22 Jan 2022 • Yi Hou, Chengyang Li, Fan Yang, Cong Ma, Liping Zhu, Yuan Li, Huizhu Jia, Xiaodong Xie

Our method can integrate the pedestrian's head and body information to enhance the feature expression ability of the density map.

Crowd Counting

Paper
Add Code

Learning Optical Flow With Kernel Patch Attention

1 code implementation • CVPR 2022 • Ao Luo, Fan Yang, Xin Li, Shuaicheng Liu

Optical flow is a fundamental method used for quantitative motion estimation on the image plane.

Motion Estimation Optical Flow Estimation

Paper
Code

SC2-PCR: A Second Order Spatial Compatibility for Efficient and Robust Point Cloud Registration

no code implementations • CVPR 2022 • Zhi Chen, Kun Sun, Fan Yang, Wenbing Tao

In this paper, we present a second order spatial compatibility (SC^2) measure based method for efficient and robust point cloud registration (PCR), called SC^2-PCR.

Ranked #1 on Point Cloud Registration on FP-O-H

Image to Point Cloud Registration

Paper
Add Code

Multimodal Dynamics: Dynamical Fusion for Trustworthy Multimodal Classification

1 code implementation • CVPR 2022 • Zongbo Han, Fan Yang, Junzhou Huang, Changqing Zhang, Jianhua Yao

To the best of our knowledge, this is the first work to jointly model both feature and modality variation for different samples to provide trustworthy fusion in multi-modal classification.

Informativeness Medical Diagnosis +1

Paper
Code

DetarNet: Decoupling Translation and Rotation by Siamese Network for Point Cloud Registration

1 code implementation • 28 Dec 2021 • Zhi Chen, Fan Yang, Wenbing Tao

In this paper, we propose a neural network named DetarNet to decouple the translation $t$ and rotation $R$, so as to overcome the performance degradation due to their mutual interference in point cloud registration.

Point Cloud Registration Translation

Paper
Code

Neural Born Iteration Method For Solving Inverse Scattering Problems: 2D Cases

no code implementations • 18 Dec 2021 • Tao Shan, Zhichao Lin, Xiaoqian Song, Maokun Li, Fan Yang, Zhensheng Xu

In this paper, we propose the neural Born iterative method (NeuralBIM) for solving 2D inverse scattering problems (ISPs) by drawing on the scheme of physics-informed supervised residual learning (PhiSRL) to emulate the computing process of the traditional Born iterative method (TBIM).

Paper
Add Code

A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation

3 code implementations • 17 Dec 2021 • Wuyang Chen, Xianzhi Du, Fan Yang, Lucas Beyer, Xiaohua Zhai, Tsung-Yi Lin, Huizhong Chen, Jing Li, Xiaodan Song, Zhangyang Wang, Denny Zhou

In this paper, we comprehensively study three architecture design choices on ViT -- spatial reduction, doubled channels, and multiscale features -- and demonstrate that a vanilla ViT architecture can fulfill this goal without handcrafting multiscale features, maintaining the original ViT design philosophy.

Image Classification Instance Segmentation +6

Paper
Code

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion

1 code implementation • 24 Nov 2021 • Chenfei Wu, Jian Liang, Lei Ji, Fan Yang, Yuejian Fang, Daxin Jiang, Nan Duan

To cover language, image, and video at the same time for different scenarios, a 3D transformer encoder-decoder framework is designed, which can not only deal with videos as 3D data but also adapt to texts and images as 1D and 2D data, respectively.

Ranked #1 on Text-to-Video Generation on Kinetics

Decoder Text-to-Image Generation +3

Paper
Code

Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence

1 code implementation • 18 Nov 2021 • Xiang Bai, Hanchen Wang, Liya Ma, Yongchao Xu, Jiefeng Gan, Ziwei Fan, Fan Yang, Ke Ma, Jiehua Yang, Song Bai, Chang Shu, Xinyu Zou, Renhao Huang, Changzheng Zhang, Xiaowu Liu, Dandan Tu, Chuou Xu, Wenqing Zhang, Xi Wang, Anguo Chen, Yu Zeng, Dehua Yang, Ming-Wei Wang, Nagaraj Holalkere, Neil J. Halin, Ihab R. Kamel, Jia Wu, Xuehua Peng, Xiang Wang, Jianbo Shao, Pattanasak Mongkolwat, Jianjun Zhang, Weiyang Liu, Michael Roberts, Zhongzhao Teng, Lucian Beer, Lorena Escudero Sanchez, Evis Sala, Daniel Rubin, Adrian Weller, Joan Lasenby, Chuangsheng Zheng, Jianming Wang, Zhen Li, Carola-Bibiane Schönlieb, Tian Xia

Artificial intelligence (AI) provides a promising substitution for streamlining COVID-19 diagnoses.

COVID-19 Diagnosis Federated Learning +2

Paper
Code

Towards Privacy-Preserving Affect Recognition: A Two-Level Deep Learning Architecture

no code implementations • 14 Nov 2021 • Jimiama M. Mase, Natalie Leesakul, Fan Yang, Grazziela P. Figueredo, Mercedes Torres Torres

Possible solutions to protect the privacy of users and avoid misuse of their identities are to: (1) extract anonymised facial features, namely action units (AU) from a database of images, discard the images and use AUs for processing and training, and (2) federated learning (FL), i.e., process raw images in users' local machines (local processing) and send the locally trained models to the main processing machine for aggregation (central processing).

Federated Learning Privacy Preserving +1

Paper
Add Code

Defense Against Explanation Manipulation

no code implementations • 8 Nov 2021 • Ruixiang Tang, Ninghao Liu, Fan Yang, Na Zou, Xia Hu

Explainable machine learning attracts increasing attention as it improves transparency of models, which is helpful for machine learning to be trusted in real applications.

Adversarial Attack BIG-bench Machine Learning

Paper
Add Code

Causal-TGAN: Causally-Aware Synthetic Tabular Data Generative Adversarial Network

no code implementations • 29 Sep 2021 • Bingyang Wen, Yupeng Cao, Fan Yang, Koduvayur Subbalakshmi, Rajarathnam Chandramouli

The flexibility of this architecture is its capability to support different types of expert knowledge (e.g., complete or partial) about the causal nature of the underlying phenomenon.

Generative Adversarial Network Image Generation

Paper
Add Code

Generalized Demographic Parity for Group Fairness

1 code implementation • ICLR 2022 • Zhimeng Jiang, Xiaotian Han, Chao Fan, Fan Yang, Ali Mostafavi, Xia Hu

We show the understanding of GDP from the probability perspective and theoretically reveal the connection between GDP regularizer and adversarial debiasing.

Attribute Fairness

Paper
Code

EXACT: Scalable Graph Neural Networks Training via Extreme Activation Compression

no code implementations • ICLR 2022 • Zirui Liu, Kaixiong Zhou, Fan Yang, Li Li, Rui Chen, Xia Hu

Based on the implementation, we propose a memory-efficient framework called "EXACT", which for the first time demonstrates the potential and evaluates the feasibility of training GNNs with compressed activations.

Graph Learning

Paper
Add Code

Recursive Disentanglement Network

no code implementations • ICLR 2022 • Yixuan Chen, Yubin Shi, Dongsheng Li, Yujiang Wang, Mingzhi Dong, Yingying Zhao, Robert Dick, Qin Lv, Fan Yang, Li Shang

The feature space of deep models is inherently compositional.

Disentanglement Inductive Bias

Paper
Add Code

LODE: Deep Local Deblurring and A New Benchmark

1 code implementation • 19 Sep 2021 • Zerun Wang, Liuyu Xiang, Fan Yang, Jinzhao Qian, Jie Hu, Haidong Huang, Jungong Han, Yuchen Guo, Guiguang Ding

While recent deep deblurring algorithms have achieved remarkable progress, most existing methods focus on the global deblurring problem, where the image blur mostly arises from severe camera shake.

Deblurring

Paper
Code

Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss

2 code implementations • 9 Sep 2021 • Xing Cheng, Hezheng Lin, Xiangyu Wu, Fan Yang, Dong Shen

In this paper, we propose a multi-stream Corpus Alignment network with single gate Mixture-of-Experts (CAMoE) and a novel Dual Softmax Loss (DSL) to address the two kinds of heterogeneity.

Ranked #9 on Video Retrieval on MSVD (using extra training data)

Retrieval Text Retrieval +1

Paper
Code

LinEasyBO: Scalable Bayesian Optimization Approach for Analog Circuit Synthesis via One-Dimensional Subspaces

no code implementations • 1 Sep 2021 • Shuhan Zhang, Fan Yang, Changhao Yan, Dian Zhou, Xuan Zeng

A large body of literature has proved that the Bayesian optimization framework is especially efficient and effective in analog circuit synthesis.

Bayesian Optimization

Paper
Add Code

Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions

no code implementations • 30 Aug 2021 • Yang Wu, Dingheng Wang, Xiaotong Lu, Fan Yang, Guoqi Li, Weisheng Dong, Jianbo Shi

Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence.

Paper
Add Code

Adaptive Label Smoothing To Regularize Large-Scale Graph Training

no code implementations • 30 Aug 2021 • Kaixiong Zhou, Ninghao Liu, Fan Yang, Zirui Liu, Rui Chen, Li Li, Soo-Hyun Choi, Xia Hu

Graph neural networks (GNNs), which learn the node representations by recursively aggregating information from its neighbors, have become a predominant computational tool in many domains.

Node Clustering

Paper
Add Code

Actuarial-consistency and two-step actuarial valuations: a new paradigm to insurance valuation

no code implementations • 30 Aug 2021 • Karim Barigou, Daniël Linders, Fan Yang

This paper introduces new valuation schemes called actuarial-consistent valuations for insurance liabilities which depend on both financial and actuarial risks, which imposes that all actuarial risks are priced via standard actuarial principles.

Paper
Add Code

Density-Based Dynamic Curriculum Learning for Intent Detection

no code implementations • 24 Aug 2021 • Yantao Gong, Cao Liu, Jiazhen Yuan, Fan Yang, Xunliang Cai, Guanglu Wan, Jiansong Chen, Ruiyao Niu, Houfeng Wang

To handle this problem, we propose a density-based dynamic curriculum learning model.

Intent Detection

Paper
Add Code

RAIN: Reinforced Hybrid Attention Inference Network for Motion Forecasting

no code implementations • ICCV 2021 • Jiachen Li, Fan Yang, Hengbo Ma, Srikanth Malla, Masayoshi Tomizuka, Chiho Choi

Motion forecasting plays a significant role in various domains (e.g., autonomous driving, human-robot interaction), which aims to predict future motion sequences given a set of historical observations.

Motion Forecasting Trajectory Prediction

Paper
Add Code

Opinion Prediction with User Fingerprinting

1 code implementation • RANLP 2021 • Kishore Tumarada, Yifan Zhang, Fan Yang, Eduard Dragut, Omprakash Gnawali, Arjun Mukherjee

Experimental results show novel insights that were previously unknown such as better predictions for an increase in dynamic history length, the impact of the nature of the article on performance, thereby laying the foundation for further research.

Sentiment Analysis Time Series +1

Paper
Code

An Efficient Asynchronous Batch Bayesian Optimization Approach for Analog Circuit Synthesis

no code implementations • 28 Jun 2021 • Shuhan Zhang, Fan Yang, Dian Zhou, Xuan Zeng

A new strategy is proposed to better balance the exploration and exploitation and guarantee the diversity of the query points.

Bayesian Optimization

Paper
Add Code

An Efficient Batch Constrained Bayesian Optimization Approach for Analog Circuit Synthesis via Multi-objective Acquisition Ensemble

no code implementations • 28 Jun 2021 • Shuhan Zhang, Fan Yang, Changhao Yan, Dian Zhou, Xuan Zeng

After achieving the first feasible point, we favor the feasible region by adopting a specially designed penalization term to the acquisition function ensemble.

Bayesian Optimization valid

Paper
Add Code

A Scalable 256-Elements E-Band Phased-Array Transceiver for Broadband Communication

no code implementations • 20 Jun 2021 • Xu Li, Wenyao Zhai, Morris Repeta, Hua Cai, Tyler Ross, Kimia Ansari, Sam Tiller, Hari Krishna Pothula, Dong Liang, Fan Yang, Yibo Lyu, Songlin Shuai, Guangjian Wang, Wen Tong

For E-band wireless communications, a high gain steerable antenna with sub-arrays is desired to reduce the implementation complexity.

Paper
Add Code

Probabilistic Model Distillation for Semantic Correspondence

1 code implementation • CVPR 2021 • Xin Li, Deng-Ping Fan, Fan Yang, Ao Luo, Hong Cheng, Zicheng Liu

We address this problem with the use of a novel Probabilistic Model Distillation (PMD) approach which transfers knowledge learned by a probabilistic teacher model on synthetic data to a static student model with the use of unlabeled real image pairs.

Representation Learning Semantic correspondence

Paper
Code

Model-Based Counterfactual Synthesizer for Interpretation

no code implementations • 16 Jun 2021 • Fan Yang, Sahan Suresh Alva, Jiahao Chen, Xia Hu

To address these limitations, we propose a Model-based Counterfactual Synthesizer (MCS) framework for interpreting machine learning models.

counterfactual Inductive Bias

Paper
Add Code

From Paraphrasing to Semantic Parsing: Unsupervised Semantic Parsing via Synchronous Semantic Decoding

no code implementations • ACL 2021 • Shan Wu, Bo Chen, Chunlei Xin, Xianpei Han, Le Sun, Weipeng Zhang, Jiansong Chen, Fan Yang, Xunliang Cai

During synchronous decoding: the utterance paraphrasing is constrained by the structure of the logical form, therefore the canonical utterance can be paraphrased controlledly; the semantic decoding is guided by the semantics of the canonical utterance, therefore its logical form can be generated unsupervisedly.

Unsupervised semantic parsing

Paper
Add Code

MlTr: Multi-label Classification with Transformer

1 code implementation • 11 Jun 2021 • Xing Cheng, Hezheng Lin, Xiangyu Wu, Fan Yang, Dong Shen, Zhongyuan Wang, Nian Shi, Honglin Liu

The task of multi-label image classification is to recognize all the object labels presented in an image.

Ranked #12 on Multi-Label Classification on MS-COCO

Classification Multi-Label Classification +1

Paper
Code

CAT: Cross Attention in Vision Transformer

1 code implementation • 10 Jun 2021 • Hezheng Lin, Xing Cheng, Xiangyu Wu, Fan Yang, Dong Shen, Zhongyuan Wang, Qing Song, Wei Yuan

In this paper, we propose a new attention mechanism in Transformer termed Cross Attention, which alternates between attention within each image patch, rather than over the whole image, to capture local information, and attention between image patches divided from single-channel feature maps to capture global information.

Paper
Code

MST: Masked Self-Supervised Transformer for Visual Representation

no code implementations • NeurIPS 2021 • Zhaowen Li, Zhiyang Chen, Fan Yang, Wei Li, Yousong Zhu, Chaoyang Zhao, Rui Deng, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang

More importantly, the masked tokens together with the remaining tokens are further recovered by a global image decoder, which preserves the spatial information of the image and is more friendly to the downstream dense prediction tasks.

Language Modelling Linear evaluation +4

Paper
Add Code

Calibrating multi-dimensional complex ODE from noisy data via deep neural networks

no code implementations • 7 Jun 2021 • Kexuan Li, Fangfang Wang, Ruiqi Liu, Fan Yang, Zuofeng Shang

Our method is able to recover the ODE system without being subject to the curse of dimensionality and complicated ODE structure.

Paper
Add Code

Towards Compact CNNs via Collaborative Compression

1 code implementation • CVPR 2021 • Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji

Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression.

Neural Network Compression Tensor Decomposition

Paper
Code

ModelPS: An Interactive and Collaborative Platform for Editing Pre-trained Models at Scale

1 code implementation • 18 May 2021 • Yuanming Li, Huaizheng Zhang, Shanshan Jiang, Fan Yang, Yonggang Wen, Yong Luo

AI engineering has emerged as a crucial discipline to democratize deep neural network (DNN) models among software developers with a diverse background.

Model Editing

Paper
Code
