University of Rochester Multi-Modal Music Performance (URMP) Dataset

This project is supported by the National Science Foundation under grant No. 1741472, titled "BIGDATA: F: Audio-Visual Scene Understanding".
Disclaimer: Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Paper Describing URMP

Bochen Li*, Xinzhao Liu*, Karthik Dinesh, Zhiyao Duan, Gaurav Sharma, "Creating a multi-track classical music performance dataset for multi-modal music analysis: Challenges, insights, and applications", IEEE Transactions on Multimedia, 2018. (* equal contribution)

Overview

We introduce a dataset to facilitate audio-visual analysis of musical performances. The dataset comprises a number of simple multi-instrument musical pieces, each assembled from coordinated but separately recorded performances of the individual tracks. For each piece, we provide the musical score in MIDI format, the high-quality audio recordings of the individual instruments, and the video of the assembled piece. We anticipate that the dataset will be useful for multi-modal information retrieval tasks such as music source separation, transcription, and performance analysis, and that it will also serve as ground truth for evaluating performances.