Xinyuan Qian

About Xinyuan Qian

Xinyuan Qian is a researcher at the National University of Singapore specializing in speech processing, multimedia, and human-robot interaction, with an h-index of 10 overall and 10 since 2020.

His recent articles reflect a diverse array of research interests and contributions to the field:

Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention

GLMB 3D Speaker Tracking with Video-Assisted Multi-Channel Audio Optimization Functions

Visually Guided Binaural Audio Generation with Cross-Modal Consistency

LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism

Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training

M3TTS: Multi-modal text-to-speech of multi-scale style control for dubbing

Attention-Based End-to-End Differentiable Particle Filter for Audio Speaker Tracking

Adapting Pre-Trained Self-Supervised Learning Model for Speech Recognition with Light-Weight Adapters

Xinyuan Qian Information

University: National University of Singapore

Position: ___

Citations (all): 410

Citations (since 2020): 402

Cited By: 56

h-index (all): 10

h-index (since 2020): 10

i10-index (all): 10

i10-index (since 2020): 10


Xinyuan Qian Skills & Research Interests

speech processing

multimedia

human-robot interaction

Top articles of Xinyuan Qian

Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention

arXiv preprint arXiv:2404.18501

2024/4/29

GLMB 3D Speaker Tracking with Video-Assisted Multi-Channel Audio Optimization Functions

2024/4/14


Visually Guided Binaural Audio Generation with Cross-Modal Consistency

2024/4/14

LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism

2024/4/14

Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training

arXiv preprint arXiv:2404.00861

2024/4/1

M3TTS: Multi-modal text-to-speech of multi-scale style control for dubbing

Pattern Recognition Letters

2024/2/10

Attention-Based End-to-End Differentiable Particle Filter for Audio Speaker Tracking

2023/9/8

Adapting Pre-Trained Self-Supervised Learning Model for Speech Recognition with Light-Weight Adapters

Electronics

2024/1/1

Audio-Visual Temporal Forgery Detection Using Embedding-Level Fusion and Multi-Dimensional Contrastive Loss

IEEE Transactions on Circuits and Systems for Video Technology

2023/10/23

Audio-visual speaker tracking: Progress, challenges, and future directions

arXiv preprint arXiv:2310.14778

2023/10/23

Deep Cross-Modal Retrieval Between Spatial Image and Acoustic Speech

IEEE Transactions on Multimedia

2023/10/13

Audio Visual Speaker Localization from EgoCentric Views

arXiv preprint arXiv:2309.16308

2023/9/28

L³ F-TOUCH: A Wireless GelSight with Decoupled Tactile and Three-axis Force Sensing

IEEE Robotics and Automation Letters

2023/7/5

Self-Convolution for Automatic Speech Recognition

2023/6/4

Stream Attention Based U-Net for L3DAS23 Challenge

2023/6/4

Ripple sparse self-attention for monaural speech enhancement

2023/6/4

A miniaturised camera-based multi-modal tactile sensor

2023/5/29

Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding

arXiv preprint arXiv:2305.14049

2023/5/23

InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition

Network

2023/5

Device features based on linear transformation with parallel training data for replay speech detection

IEEE/ACM Transactions on Audio, Speech, and Language Processing

2023/4/17
