Alex Hauptmann
Carnegie Mellon University
H-index: 93
North America-United States
Top articles of Alex Hauptmann
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Spae: Semantic pyramid autoencoder for multimodal generation with frozen llms | | Lijun Yu Yong Cheng Zhiruo Wang Vivek Kumar Wolfgang Macherey | 2023 |
MM-TTS: A Unified Framework for Multimodal, Prompt-Induced Emotional Text-to-Speech Synthesis | arXiv preprint arXiv:2404.18398 | Xiang Li Zhi-Qi Cheng Jun-Yan He Xiaojiang Peng Alexander G Hauptmann | 2024/4/29 |
Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin | | Gabriel Moreira Manuel Marques João Paulo Costeira Alexander Hauptmann | 2024 |
PhISANet: Phonetically Informed Speech Animation Network | | Salvador Medina Sarah L Taylor Carsten Stoll Gareth Edwards Alex Hauptmann | 2024/4/14 |
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward | arXiv preprint arXiv:2404.01258 | Ruohong Zhang Liangke Gui Zhiqing Sun Yihao Feng Keyang Xu | 2024/4/1 |
Adversarially Masked Video Consistency for Unsupervised Domain Adaptation | arXiv preprint arXiv:2403.16242 | Xiaoyu Zhu Junwei Liang Po-Yao Huang Alex Hauptmann | 2024/3/24 |
Breaking the limits of text-conditioned 3d motion synthesis with elaborative descriptions | | Yijun Qian Jack Urbanek Alexander G Hauptmann Jungdam Won | 2023 |
Towards open-domain twitter user profile inference | | Haoyang Wen Zhenxin Xiao Eduard Hovy Alexander G Hauptmann | 2023/7 |
Chartreader: A unified framework for chart derendering and comprehension without heuristic rules | | Zhi-Qi Cheng Qi Dai Alexander G Hauptmann | 2023 |
Robust Automatic Detection of Traffic Activity | | Alexander Hauptmann Lijun Yu Wenhe Liu Yijun Qian Zhiqi Cheng | 2023/6/30 |
Magvit: Masked generative video transformer | | Lijun Yu Yong Cheng Kihyuk Sohn José Lezama Han Zhang | 2023 |
Document Entity Retrieval with Massive and Noisy Pre-training | arXiv preprint arXiv:2306.08937 | Lijun Yu Jin Miao Xiaoyu Sun Jiayi Chen Alexander G Hauptmann | 2023/6/15 |
Documentnet: Bridging the data gap in document pre-training | | Lijun Yu Jin Miao Xiaoyu Sun Jiayi Chen Alexander G Hauptmann | 2023/12 |
Stmt: A spatial-temporal mesh transformer for mocap-based action recognition | CVPR 2023 | Xiaoyu Zhu Po-Yao Huang Junwei Liang Celso M de Melo Alexander Hauptmann | 2023/3/31 |
Leveraging body pose estimation for gesture recognition in human-robot interaction using synthetic data | | Xiaoyu Zhu Celso M de Melo Alexander Hauptmann | 2023/6/13 |
Language Model Beats Diffusion--Tokenizer is Key to Visual Generation | | Lijun Yu José Lezama Nitesh B. Gundavarapu Luca Versari Kihyuk Sohn | 2024 |
Data processing system for classifying keyed data representing inhaler device operation | | | 2023/6/13 |
Zero-shot and few-shot stance detection on varied topics via conditional generation | | Haoyang Wen Alexander G Hauptmann | 2023/7 |
Vehicle and Pedestrian Trajectory and Gap Estimation for Traffic Conflict Prediction | | Alexander Hauptmann | 2022/2/1 |
Gsrformer: Grounded situation recognition transformer with alternate semantic attention refinement | | Zhi-Qi Cheng Qi Dai Siyao Li Teruko Mitamura Alexander Hauptmann | 2022/10/10 |