Sound Search by Vocal Imitation

Publications

Peer-Reviewed Publications

[22] Yichi Zhang, Junbo Hu, Yiting Zhang, Bryan Pardo, and Zhiyao Duan, Vroom!: A search engine for sounds by vocal imitation queries, in Proc. 2020 Conference on Human Information Interaction and Retrieval (CHIIR), 2020, pp. 23-32.

[21] Sefik Emre Eskimez, Kazuhito Koishida, and Zhiyao Duan, Adversarial training for speech super-resolution, IEEE Journal of Selected Topics in Signal Processing, 2019. (accepted)

[20] Yichi Zhang, Yiting Zhang, and Zhiyao Duan, Sound search by text description or vocal imitation?, arXiv preprint arXiv:1907.08661, 2019.

[19] Bongjun Kim and Bryan Pardo, Sound Event Detection Using Point-labeled Data, in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019.

[18] Bongjun Kim and Bryan Pardo, Improving Content-based Audio Retrieval by Vocal Imitation Feedback, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019, pp. 4100-4104.

[17] Bongjun Kim, Madhav Ghei, Bryan Pardo, and Zhiyao Duan, Vocal Imitation Set: a dataset of vocally imitated sound events using the AudioSet ontology, in Proc. Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE), 2018.

[16] Yichi Zhang, Bryan Pardo, and Zhiyao Duan, Siamese style convolutional neural networks for sound search by vocal imitation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 2, pp. 429-441, 2019.

[15] Yichi Zhang and Zhiyao Duan, Visualization and interpretation of Siamese style convolutional neural networks for sound search by vocal imitation, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 2406-2410.

[14] Bongjun Kim, Leveraging user input and feedback for interactive sound event detection and annotation, in Proc. 23rd ACM International Conference on Intelligent User Interfaces (IUI), 2018, pp. 671-672.

[13] Bongjun Kim and Bryan Pardo, A human-in-the-loop system for sound event detection and annotation, ACM Transactions on Interactive Intelligent Systems (TiiS), pp. 13:1-23, 2018.

[12] Zhihan Zhou, Yichi Zhang, and Zhiyao Duan, Joint Speaker Diarization and Recognition Using Convolutional and Recurrent Neural Networks, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 2496-2500.

[11] Sefik Emre Eskimez, Zhiyao Duan, and Wendi Heinzelman, Unsupervised learning approach to feature analysis for automatic speech emotion recognition, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 5099-5103.

[10] Sefik Emre Eskimez, Peter Soufleris, Zhiyao Duan, and Wendi Heinzelman, Front-end speech enhancement for commercial speaker verification systems, Speech Communication, vol. 99, no. 101, pp. 101-113, 2018.

[9] Rui Lu, Zhiyao Duan, and Changshui Zhang, Multi-scale recurrent neural network for sound event detection, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 131-135.

[8] Yichi Zhang and Zhiyao Duan, IMINET: Convolutional Semi-Siamese Networks for Sound Search by Vocal Imitation, in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2017, pp. 304-308.

[7] Rui Lu, Zhiyao Duan, and Changshui Zhang, Metric learning based data augmentation for environmental sound classification, in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2017, pp. 1-5.

[6] Bongjun Kim and Bryan Pardo, A Human-in-the-loop System for Sound Event Detection and Annotation, accepted by ACM Transactions on Interactive Intelligent Systems, 2017.

[5] Bongjun Kim and Bryan Pardo, I-Sed: An Interactive Sound Event Detector, in Proc. International Conference on Intelligent User Interfaces (IUI), 2017, pp. 553-557.

[4] Rui Lu, Kailun Wu, Zhiyao Duan, and Changshui Zhang, Deep ranking: triplet MatchNet for music metric learning, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 121-125.

[3] Yichi Zhang and Zhiyao Duan, Supervised and unsupervised sound retrieval by vocal imitation, Journal of the Audio Engineering Society, vol. 64, no. 7, pp. 1-11, 2016.

[2] Yichi Zhang and Zhiyao Duan, IMISOUND: An unsupervised system for sound query by vocal imitation, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 2269-2273.

[1] Yichi Zhang and Zhiyao Duan, Retrieving sounds by vocal imitation recognition, in Proc. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2015, pp. 1-6.

Other Publications

[2] Rui Lu and Zhiyao Duan, Bidirectional GRU for Sound Event Detection, Technical Report for DCASE 2017 Challenge Task 3 Sound Event Detection, 2017.

[1] Yukun Chen, Yichi Zhang, and Zhiyao Duan, DCASE2017 Sound Event Detection Using Convolutional Neural Network, Technical Report for DCASE 2017 Challenge Task 3 Sound Event Detection, 2017.


