Irfan Essa

Georgia Institute of Technology

H-index: 69

North America-United States

About Irfan Essa

Irfan Essa is a distinguished researcher at the Georgia Institute of Technology, with an h-index of 69 overall and 39 since 2020. He specializes in Computer Vision, Artificial Intelligence, Machine Learning, Computer Graphics, and Robotics.

His recent articles include:

Styledrop: Text-to-image synthesis of any style

Spae: Semantic pyramid autoencoder for multimodal generation with frozen llms

Image manipulation by text instruction

SLAIM: Robust Dense Neural SLAM for Online Tracking and Mapping

On the Efficacy of Text-Based Input Modalities for Action Anticipation

3D Semantic MapNet: Building Maps for Multi-Object Re-Identification in 3D

Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation

Towards Learning Discrete Representations via Self-Supervision for Wearables-Based Human Activity Recognition

Irfan Essa Information

University	Georgia Institute of Technology
Position	Distinguished Professor of Computing, Georgia Institute of Technology / Research Scientist, Google
Citations(all)	23709
Citations(since 2020)	6800
Cited By	19108
hIndex(all)	69
hIndex(since 2020)	39
i10Index(all)	190
i10Index(since 2020)	111

Irfan Essa Skills & Research Interests

Computer Vision

Artificial Intelligence

Machine Learning

Computer Graphics

Robotics

Top articles of Irfan Essa

Title	Journal	Author(s)	Publication Date
Styledrop: Text-to-image synthesis of any style	Advances in Neural Information Processing Systems	Kihyuk Sohn Lu Jiang Jarred Barber Kimin Lee Nataniel Ruiz ...	2024/2/13
Spae: Semantic pyramid autoencoder for multimodal generation with frozen llms		Lijun Yu Yong Cheng Zhiruo Wang Vivek Kumar Wolfgang Macherey ...	2023
Image manipulation by text instruction			2024/2/13
SLAIM: Robust Dense Neural SLAM for Online Tracking and Mapping	arXiv preprint arXiv:2404.11419	Vincent Cartillier Grant Schindler Irfan Essa	2024/4/17
On the Efficacy of Text-Based Input Modalities for Action Anticipation	arXiv preprint arXiv:2401.12972 (Under review)	Apoorva Beedu Karan Samel Irfan Essa	2024/1/23
3D Semantic MapNet: Building Maps for Multi-Object Re-Identification in 3D	arXiv preprint arXiv:2403.13190	Vincent Cartillier Neha Jain Irfan Essa	2024/3/19
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation	arXiv preprint arXiv:2401.05675	Seung Hyun Lee Yinxiao Li Junjie Ke Innfarn Yoo Han Zhang ...	2024/1/11
Towards Learning Discrete Representations via Self-Supervision for Wearables-Based Human Activity Recognition	Sensors	Harish Haresamudram Irfan Essa Thomas Ploetz	2024/2/15
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models	arXiv preprint arXiv:2305.16223	Xingqian Xu Jiayi Guo Zhangyang Wang Gao Huang Irfan Essa ...	2023/5/25
Words into action: Learning diverse humanoid robot behaviors using language guided iterative motion refinement	arXiv preprint arXiv:2310.06226	K Niranjan Kumar Irfan Essa Sehoon Ha	2023/10/10
Video textures		Alexander JV White Aphrodite Galata	2007/5/2
Videopoet: A large language model for zero-shot video generation	arXiv preprint arXiv:2312.14125	Dan Kondratyuk Lijun Yu Xiuye Gu José Lezama Jonathan Huang ...	2023/12/21
Visual prompt tuning for generative transfer learning		Kihyuk Sohn Huiwen Chang José Lezama Luisa Polania Han Zhang ...	2023
Slide Gestalt: Automatic Structure Extraction in Slide Decks for Non-Visual Access		Yi-Hao Peng Peggy Chi Anjuli Kannan Meredith Ringel Morris Irfan Essa	2023/4/19
Cascaded compositional residual learning for complex interactive behaviors	IEEE Robotics and Automation Letters	K Niranjan Kumar Irfan Essa Sehoon Ha	2023/6/14
Emergence of maps in the memories of blind navigation agents	AI Matters	Erik Wijmans Manolis Savva Irfan Essa Stefan Lee Ari S Morcos ...	2023/10/10
Integrating Noisy Knowledge into Language Representations for E-Commerce Applications		Karan Samel Jun Ma Zhengyang Wang Tong Zhao Irfan Essa	2023/12/15
Investigating enhancements to contrastive predictive coding for human activity recognition		Harish Haresamudram Irfan Essa Thomas Plötz	2023/3/13
Language Model Beats Diffusion--Tokenizer is Key to Visual Generation		Lijun Yu José Lezama Nitesh B. Gundavarapu Luca Versari Kihyuk Sohn ...	2024
Photorealistic video generation with diffusion models	arXiv preprint arXiv:2312.06662	Agrim Gupta Lijun Yu Kihyuk Sohn Xiuye Gu Meera Hahn ...	2023/12/11