Zhiyao Duan

University of Rochester

H-index: 30

North America-United States

About Zhiyao Duan

Zhiyao Duan is a researcher at the University of Rochester with an h-index of 30 overall and 26 since 2020. He specializes in computer audition, music information retrieval, audio-visual processing, and machine learning.

His recent articles include:

SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription

MusicHiFi: Fast High-Fidelity Stereo Vocoding

Toward Fully Self-Supervised Multi-Pitch Estimation

Cacophony: An Improved Contrastive Audio-Text Model

Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech

Singfake: Singing voice deepfake detection

Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks and Channel Variation

SingNet: a real-time Singing Voice beat and Downbeat Tracking System

Zhiyao Duan Information

University: University of Rochester

Position: Electrical and Computer Engineering

Citations (all): 4181

Citations (since 2020): 3286

Cited by: 1832

h-index (all): 30

h-index (since 2020): 26

i10-index (all): 73

i10-index (since 2020): 59

Zhiyao Duan Skills & Research Interests

Computer Audition

Music Information Retrieval

Audio-Visual Processing

Machine Learning

Top articles of Zhiyao Duan

Title: SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription
Author(s): Yongyi Zang, Yi Zhong, Frank Cwitkowitz, Zhiyao Duan
Publication Date: 2024/4/14

Title: MusicHiFi: Fast High-Fidelity Stereo Vocoding
Journal: arXiv preprint arXiv:2403.10493
Author(s): Ge Zhu, Juan-Pablo Caceres, Zhiyao Duan, Nicholas J Bryan
Publication Date: 2024/3/15

Title: Toward Fully Self-Supervised Multi-Pitch Estimation
Journal: arXiv preprint arXiv:2402.15569
Author(s): Frank Cwitkowitz, Zhiyao Duan
Publication Date: 2024/2/23

Title: Cacophony: An Improved Contrastive Audio-Text Model
Journal: arXiv preprint arXiv:2402.06986
Author(s): Ge Zhu, Zhiyao Duan
Publication Date: 2024/2/10

Title: Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech
Journal: arXiv preprint arXiv:2311.14816
Author(s): Enting Zhou, You Zhang, Zhiyao Duan
Publication Date: 2023/11/24

Title: Singfake: Singing voice deepfake detection
Author(s): Yongyi Zang, You Zhang, Mojtaba Heydari, Zhiyao Duan
Publication Date: 2024/4/14

Title: Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks and Channel Variation
Author(s): You Zhang, Fei Jiang, Ge Zhu, Xinhui Chen, Zhiyao Duan
Publication Date: 2023/2/24

Title: SingNet: a real-time Singing Voice beat and Downbeat Tracking System
Author(s): Mojtaba Heydari, Ju-Chiang Wang, Zhiyao Duan
Publication Date: 2023/6/4

Title: Euterpe: A Web Framework for Interactive Music Systems
Journal: Journal of the Audio Engineering Society
Author(s): Yongyi Zang, Christodoulos Benetatos, Zhiyao Duan
Publication Date: 2023/11/16

Title: Transcription free filler word detection with Neural semi-CRFs
Author(s): Ge Zhu, Yujia Yan, Juan-Pablo Caceres, Zhiyao Duan
Publication Date: 2023/6/4

Title: EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis
Journal: arXiv preprint arXiv:2311.08667
Author(s): Ge Zhu, Yutong Wen, Marc-André Carbonneau, Zhiyao Duan
Publication Date: 2023/11/15

Title: SAMO: Speaker Attractor Multi-Center One-Class Learning For Voice Anti-Spoofing
Author(s): Siwen Ding, You Zhang, Zhiyao Duan
Publication Date: 2023/6/4

Title: Harmonic Analysis With Neural Semi-CRF
Author(s): Qiaoyu Yang, Frank Cwitkowitz, Zhiyao Duan
Publication Date: 2023/11/5

Title: HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields
Author(s): You Zhang, Yuxiang Wang, Zhiyao Duan
Publication Date: 2023/6/4

Title: Mitigating Cross-Database Differences for Learning Unified HRTF Representation
Author(s): Yutong Wen, You Zhang, Zhiyao Duan
Publication Date: 2023/10/22

Title: Grid-agnostic personalized head-related transfer function modeling with neural fields
Journal: The Journal of the Acoustical Society of America
Author(s): You Zhang, Yuxiang Wang, Mark Bocko, Zhiyao Duan
Publication Date: 2023/3/1

Title: Phase perturbation improves channel robustness for speech spoofing countermeasures
Journal: arXiv preprint arXiv:2306.03389
Author(s): Yongyi Zang, You Zhang, Zhiyao Duan
Publication Date: 2023/6/6

Title: Editorial for TISMIR Special Collection: Cultural Diversity in MIR Research
Author(s): Zhiyao Duan, Peter van Kranenburg, Juhan Nam, Preeti Rao
Publication Date: 2023/12/13

Title: ControlVC: Zero-shot voice conversion with time-varying controls on pitch and speed
Journal: arXiv preprint arXiv:2209.11866
Author(s): Meiying Chen, Zhiyao Duan
Publication Date: 2022/9/23

Title: A study of the robustness of raw waveform based speaker embeddings under mismatched conditions
Author(s): Ge Zhu, Frank Cwitkowitz, Zhiyao Duan
Publication Date: 2022/5/23

Co-Authors

Changshui Zhang
Tsinghua University
H-index: 83

Gaurav Sharma
University of Rochester
H-index: 69

Wendi B Heinzelman
University of Rochester
H-index: 60

Paris Smaragdis
University of Illinois at Urbana-Champaign
H-index: 50

Bryan Pardo
Northwestern University
H-index: 37
