no code implementations • 21 May 2024 • Hadi Pouransari, Chun-Liang Li, Jen-Hao Rick Chang, Pavan Kumar Anasosalu Vasu, Cem Koc, Vaishaal Shankar, Oncel Tuzel
During training, we use variable sequence length and batch size, sampling simultaneously from all buckets with a curriculum.
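As one way to picture this scheme, here is a minimal sketch of length-bucketed sampling with a curriculum. All names (`make_buckets`, `sample_batch`, the bucket boundaries, the warmup schedule) are hypothetical illustrations, not the paper's actual implementation; the sketch only assumes the idea stated above: documents are grouped into buckets by length, and each step samples a (sequence length, batch size) pair so the token budget per batch stays roughly constant.

```python
import random

def make_buckets(doc_lengths, boundaries):
    """Assign each document index to the smallest bucket boundary that fits it.
    Documents longer than the largest boundary are dropped in this sketch."""
    boundaries = sorted(boundaries)
    buckets = {b: [] for b in boundaries}
    for i, n in enumerate(doc_lengths):
        for b in boundaries:
            if n <= b:
                buckets[b].append(i)
                break
    return buckets

def sample_batch(buckets, tokens_per_batch, step, warmup=1000, rng=random):
    """Pick a bucket (shorter sequences favored early in training as a toy
    curriculum) and size the batch to keep tokens-per-batch roughly constant."""
    boundaries = sorted(buckets)
    # Curriculum: weights shift from short-biased toward uniform over warmup.
    t = min(1.0, step / warmup)
    weights = [(1 - t) * (1.0 / (i + 1)) + t for i in range(len(boundaries))]
    seq_len = rng.choices(boundaries, weights=weights, k=1)[0]
    batch_size = max(1, tokens_per_batch // seq_len)
    docs = rng.choices(buckets[seq_len], k=batch_size) if buckets[seq_len] else []
    return seq_len, batch_size, docs

buckets = make_buckets([120, 900, 3000, 60, 1500], boundaries=[256, 1024, 4096])
seq_len, batch_size, docs = sample_batch(buckets, tokens_per_batch=8192, step=0)
```

Note that `batch_size * seq_len` never exceeds the token budget, so every bucket choice produces batches of comparable cost.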
no code implementations • 30 Nov 2023 • Karren D. Yang, Anurag Ranjan, Jen-Hao Rick Chang, Raviteja Vemulapalli, Oncel Tuzel
While these models can achieve high-quality lip articulation for speakers in the training set, they are unable to capture the full and diverse distribution of 3D facial motions that accompany speech in the real world.
1 code implementation • 29 Nov 2023 • Muhammed Kocabas, Jen-Hao Rick Chang, James Gabriel, Oncel Tuzel, Anurag Ranjan
We achieve state-of-the-art rendering quality at a rendering speed of 60 FPS while training ~100x faster than previous work.
1 code implementation • 23 Oct 2023 • Byeongjoo Ahn, Karren Yang, Brian Hamilton, Jonathan Sheaffer, Anurag Ranjan, Miguel Sarabia, Oncel Tuzel, Jen-Hao Rick Chang
Given audio recordings from 2-4 microphones and the 3D geometry and material of a scene containing multiple unknown sound sources, we estimate the sound anywhere in the scene.
no code implementations • 4 Oct 2023 • Yifan Jiang, Hao Tang, Jen-Hao Rick Chang, Liangchen Song, Zhangyang Wang, Liangliang Cao
Although fidelity and generalizability are greatly improved, training such a powerful diffusion model requires a vast volume of training data and model parameters, resulting in notoriously long training times and high computational costs.
no code implementations • 18 Sep 2023 • Hsuan Su, Ting-yao Hu, Hema Swetha Koppula, Raviteja Vemulapalli, Jen-Hao Rick Chang, Karren Yang, Gautam Varma Mantena, Oncel Tuzel
In this paper, we propose a new strategy for adapting ASR models to new target domains without any text or speech from those domains.
Automatic Speech Recognition (ASR) +5
no code implementations • CVPR 2023 • Jen-Hao Rick Chang, Wei-Yu Chen, Anurag Ranjan, Kwang Moo Yi, Oncel Tuzel
Specifically, we train a set transformer that, given a small number of local neighbor points along a light ray, provides the intersection point, the surface normal, and the material blending weights, which are used to render the outcome of this light ray.
no code implementations • CVPR 2023 • Anurag Ranjan, Kwang Moo Yi, Jen-Hao Rick Chang, Oncel Tuzel
We propose FaceLit, a generative framework capable of generating a 3D face that can be rendered under various user-defined lighting conditions and viewpoints, learned purely from in-the-wild 2D images without any manual annotation.
no code implementations • 27 Mar 2023 • Karren Yang, Ting-yao Hu, Jen-Hao Rick Chang, Hema Swetha Koppula, Oncel Tuzel
Here, we ask two fundamental questions about this strategy: when is synthetic data effective for personalization, and why is it effective in those cases?
Automatic Speech Recognition (ASR) +3
no code implementations • 21 Oct 2021 • Ting-yao Hu, Mohammadreza Armandpour, Ashish Shrivastava, Jen-Hao Rick Chang, Hema Koppula, Oncel Tuzel
With recent advances in speech synthesis, synthetic data is becoming a viable alternative to real data for training speech recognition models.
Automatic Speech Recognition (ASR) +2
no code implementations • 13 Oct 2021 • Jen-Hao Rick Chang, Martin Bresler, Youssouf Chherawala, Adrien Delaye, Thomas Deselaers, Ryan Dixon, Oncel Tuzel
We use the framework to optimize data synthesis and demonstrate significant improvement on handwriting recognition over a model trained on real data only.
no code implementations • 8 Oct 2021 • Dmitrii Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish Prabhu, Mohammad Rastegari, Oncel Tuzel
Token Pooling is a simple and effective operator that can benefit many architectures.
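As a rough illustration of the idea, the sketch below reduces a sequence of token embeddings to a smaller set by clustering them and keeping the cluster means. This is a simplified k-means-style stand-in, not the paper's exact operator; the function name and parameters are hypothetical.

```python
import numpy as np

def token_pooling(tokens, k, iters=10, seed=0):
    """Simplified token pooling: cluster N token embeddings into k groups
    and replace each group with its mean, reducing N tokens to k tokens."""
    rng = np.random.default_rng(seed)
    n, d = tokens.shape
    centers = tokens[rng.choice(n, size=k, replace=False)]  # copy via fancy indexing
    for _ in range(iters):
        # Assign each token to its nearest center (squared Euclidean distance).
        dists = ((tokens[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        # Recompute each center as its cluster mean (keep old center if empty).
        for j in range(k):
            members = tokens[assign == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

tokens = np.random.default_rng(1).normal(size=(64, 16))
pooled = token_pooling(tokens, k=8)
```

The point of the sketch is the interface: the operator maps an (N, d) token matrix to a (k, d) one, so downstream attention layers run on fewer tokens.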
no code implementations • 6 Oct 2021 • Jen-Hao Rick Chang, Ashish Shrivastava, Hema Swetha Koppula, Xiaoshuai Zhang, Oncel Tuzel
However, in an unsupervised setting, typical training algorithms for controllable sequence generative models suffer from a training-inference mismatch: the same sample serves as both the content and style input during training, but unpaired samples are given at inference.
no code implementations • 2 Nov 2020 • Ting-yao Hu, Ashish Shrivastava, Jen-Hao Rick Chang, Hema Koppula, Stefan Braun, Kyuyeon Hwang, Ozlem Kalinli, Oncel Tuzel
Our policy adapts the augmentation parameters based on the training loss of the data samples.
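One plausible form of such a policy is sketched below: each sample's training loss is mapped to an augmentation strength, applying stronger augmentation to easy (low-loss) samples and milder augmentation to hard ones. The function name, thresholds, and the exact mapping are hypothetical illustrations of the stated idea, not the paper's policy.

```python
import numpy as np

def adapt_augmentation(losses, low=0.5, high=2.0, min_s=0.1, max_s=1.0):
    """Hypothetical loss-adaptive policy: normalize each sample's loss into
    [0, 1] over a clamp range [low, high], then map it linearly to an
    augmentation strength in [min_s, max_s], with low loss -> max strength."""
    losses = np.asarray(losses, dtype=float)
    t = np.clip((losses - low) / (high - low), 0.0, 1.0)
    return max_s - t * (max_s - min_s)

strengths = adapt_augmentation([0.2, 1.0, 3.0])
```

The returned strengths could then scale any augmentation magnitude (e.g., mask widths or noise levels), so well-learned samples are perturbed aggressively while struggling samples are left closer to clean.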
Automatic Speech Recognition (ASR) +2
no code implementations • 2 May 2020 • Jen-Hao Rick Chang, Anat Levin, B. V. K. Vijaya Kumar, Aswin C. Sankaranarayanan
Multifocal displays, one of the classic approaches to satisfy the accommodation cue, place virtual content at multiple focal planes, each at a different depth.
no code implementations • 27 May 2018 • Jen-Hao Rick Chang, B. V. K. Vijaya Kumar, Aswin C. Sankaranarayanan
We present a virtual reality display that is capable of generating a dense collection of depth/focal planes.
no code implementations • CVPR 2016 • Jen-Hao Rick Chang, Aswin C. Sankaranarayanan, B. V. K. Vijaya Kumar
Random features are an approach for kernel-based inference on large datasets.
no code implementations • CVPR 2015 • Jen-Hao Rick Chang, Yu-Chiang Frank Wang
In this paper, we propose the propagation filter as a novel image filtering operator, with the goal of smoothing over neighboring image pixels while preserving image context like edges or textural regions.
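A minimal 1-D sketch of this idea is shown below. The key property (in contrast to, e.g., a bilateral filter that compares only endpoint intensities) is that a neighbor's weight is the product of photometric affinities between *adjacent* pixels along the path to it, so weights collapse once the path crosses an edge. This is a simplified illustration under assumed parameters, not the paper's full 2-D formulation.

```python
import numpy as np

def propagation_filter_1d(signal, radius=3, sigma=0.1):
    """Simplified 1-D propagation filter: each neighbor's weight is the
    product of Gaussian affinities between consecutive pixels on the path
    from the center, so smoothing stops at strong intensity edges."""
    x = np.asarray(signal, dtype=float)
    out = np.empty_like(x)
    for i in range(len(x)):
        wsum, vsum = 1.0, x[i]  # the center pixel gets weight 1
        for direction in (-1, 1):
            w = 1.0
            for step in range(1, radius + 1):
                j = i + direction * step
                if j < 0 or j >= len(x):
                    break
                prev = i + direction * (step - 1)
                # Multiply in the affinity between consecutive pixels on the path.
                w *= np.exp(-((x[j] - x[prev]) ** 2) / (2 * sigma ** 2))
                wsum += w
                vsum += w * x[j]
        out[i] = vsum / wsum
    return out

step_edge = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
smoothed = propagation_filter_1d(step_edge)
```

On the step edge above, pixels on each side average only with their own side: the affinity across the jump is essentially zero, so the edge survives filtering.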