Siddharth Dalmia

Carnegie Mellon University

H-index: 18

North America-United States

About Siddharth Dalmia

Siddharth Dalmia is a researcher at Carnegie Mellon University specializing in speech recognition, deep learning, and natural language processing. He has an h-index of 18, both overall and since 2020.

His recent articles include:

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

LLM augmented LLMs: Expanding capabilities through composition

Multimodal Modeling for Spoken Language Identification

Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems

LegoNN: Building modular encoder-decoder models

ESPnet-ST-v2: Multipurpose spoken language translation toolkit

Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models

Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization

Siddharth Dalmia Information

Citations (all): 1063
Citations (since 2020): 1003
Cited by: 243
h-index (all): 18
h-index (since 2020): 18
i10-index (all): 27
i10-index (since 2020): 27

Siddharth Dalmia Skills & Research Interests

Speech Recognition

Deep Learning

Natural Language Processing

Top articles of Siddharth Dalmia

Title: Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Journal: arXiv preprint arXiv:2403.05530
Author(s): Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, ...
Publication Date: 2024/3/8

Title: LLM augmented LLMs: Expanding capabilities through composition
Journal: arXiv preprint arXiv:2401.02412
Author(s): Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, ...
Publication Date: 2024/1/4

Title: Multimodal Modeling for Spoken Language Identification
Author(s): Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, ...
Publication Date: 2024/4/14

Title: Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems
Journal: arXiv preprint arXiv:2404.01616
Author(s): Frank Palma Gomez, Ramon Sanabria, Yun-hsuan Sung, Daniel Cer, Siddharth Dalmia, ...
Publication Date: 2024/4/2

Title: LegoNN: Building modular encoder-decoder models
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Author(s): Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, ...
Publication Date: 2023/7/17

Title: ESPnet-ST-v2: Multipurpose spoken language translation toolkit
Journal: arXiv preprint arXiv:2304.04596
Author(s): Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, ...
Publication Date: 2023/4/10

Title: Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models
Journal: arXiv preprint arXiv:2210.15734
Author(s): Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W Black, ...
Publication Date: 2022/10/27

Title: Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization
Journal: ICASSP 2022
Author(s): Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, ...
Publication Date: 2022

Title: CMU’s IWSLT 2022 Dialect Speech Translation System
Author(s): Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, ...
Publication Date: 2022/5

Title: CTC alignments improve autoregressive translation
Journal: arXiv preprint arXiv:2210.05200
Author(s): Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, ...
Publication Date: 2022/10/11

Title: Exploiting Compositionality in Sequence Models
Author(s): Siddharth Dalmia
Publication Date: 2022

Title: Two-pass low latency end-to-end spoken language understanding
Journal: arXiv preprint arXiv:2207.06670
Author(s): Siddhant Arora, Siddharth Dalmia, Xuankai Chang, Brian Yan, Alan Black, ...
Publication Date: 2022/7/14

Title: Branchformer: Parallel MLP-attention architectures to capture local and global context for speech recognition and understanding
Author(s): Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe
Publication Date: 2022/6/28

Title: A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Author(s): Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, ...
Publication Date: 2023/1/9

Title: FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Journal: SLT 2022
Author(s): Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, ...
Publication Date: 2022/5/25

Title: Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding
Journal: arXiv preprint arXiv:2106.15065
Author(s): Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, ...
Publication Date: 2021/6/29

Title: Highland Puebla Nahuatl Speech Translation Corpus for Endangered Language Documentation
Author(s): Jiatong Shi, Jonathan D Amith, Xuankai Chang, Siddharth Dalmia, Brian Yan, ...
Publication Date: 2021/6

Title: NoiseQA: Challenge set evaluation for user-centric question answering
Journal: arXiv preprint arXiv:2102.08345
Author(s): Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard Hovy, ...
Publication Date: 2021/2/16

Title: Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates
Author(s): Hirofumi Inaguma, Siddharth Dalmia, Brian Yan, Shinji Watanabe
Publication Date: 2021/12/13

Title: Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks
Journal: arXiv preprint arXiv:2105.00573
Author(s): Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe
Publication Date: 2021/5/2

Co-Authors

Graham Neubig, Carnegie Mellon University (H-index: 82)
Alan W Black, Carnegie Mellon University (H-index: 78)
Shinji Watanabe, Carnegie Mellon University (H-index: 74)
Florian Metze, Carnegie Mellon University (H-index: 53)
Ian Lane, Carnegie Mellon University (H-index: 34)
Xuankai Chang, Carnegie Mellon University (H-index: 23)
