Soundprism Examples

This is the companion webpage for the paper:

Zhiyao Duan and Bryan Pardo, Soundprism: an online system for score-informed source separation of music audio, IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 6, pp. 1205-1215, 2011. <pdf> <slides>

Source separation examples

As described in the paper, we compare Soundprism with three other source separation systems:

Ideally-aligned: ground-truth audio-score alignment + the separation stage of Soundprism.
Ganseman10: an offline score-informed source separation system [1].
MPET: multi-pitch estimation [2] & tracking [3] + the separation stage of Soundprism. Score information is not used in MPET.

1. We first present three examples from the Bach chorale dataset, each with a different polyphony (two, three, and four sources, respectively).

ID	Mixture	Sources	Soundprism	Ideally-aligned	Ganseman10	MPET
1	MIDI audio	violin	violin	violin	violin	violin
		clarinet	clarinet	clarinet	clarinet	clarinet
2	MIDI audio	clarinet	clarinet	clarinet	clarinet	clarinet
		saxophone	saxophone	saxophone	saxophone	saxophone
		bassoon	bassoon	bassoon	bassoon	bassoon
3	MIDI audio	violin	violin	violin	violin	violin
		clarinet	clarinet	clarinet	clarinet	clarinet
		saxophone	saxophone	saxophone	saxophone	saxophone
		bassoon	bassoon	bassoon	bassoon	bassoon

2. Next, we present two examples of realistic orchestral music from the RWC database [4], where the sources are mixed in a natural environment rather than recorded individually and then mixed artificially. For these examples we have neither the ground-truth sources nor the ground-truth audio-score alignment, so there are no "Ideally-aligned" results here.

ID	Mixture	Soundprism	Ganseman10	MPET
4	MIDI audio	clarinet	clarinet	clarinet
		viola	viola	viola
		cello	cello	cello
5	MIDI audio	violin1	violin1	violin1
		violin2	violin2	violin2
		viola	viola	viola
		cello	cello	cello

Score following results

We now show score following results for the examples above. We compare Soundprism with an offline audio-score alignment method:

Scorealign: open-source software based on the algorithm described in [5].

We show the results by time-warping the original mixture according to the output alignment of each score follower/aligner. To get a feel for the alignment results of these examples, follow these steps:

Import the score file (*.mid) to Audacity;
Import the time-warped audio file (*.wav) to Audacity;
Listen to the audio and watch how well it is aligned with the score.
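
This page does not say how the time-warped files were produced, so the following is only a minimal sketch of the general idea, not the method used for these examples. It assumes the alignment is available as matched pairs of anchor times (audio time, score time) in seconds, and it resamples a mono signal by piecewise-linear interpolation so that each anchor lands at its score time. The function name `time_warp` and the anchor format are hypothetical. This naive resampling also shifts pitch wherever the tempo changes; a pitch-preserving time-stretching method would avoid that.

```python
import numpy as np

def time_warp(audio, sr, anchors_audio, anchors_score):
    """Naively warp mono `audio` so each audio anchor time lands at its score time.

    anchors_audio, anchors_score: matched anchor times in seconds (assumed
    format), both increasing and of equal length. Output covers score time
    from 0 up to the last score anchor.
    """
    audio = np.asarray(audio, dtype=float)
    anchors_audio = np.asarray(anchors_audio, dtype=float)
    anchors_score = np.asarray(anchors_score, dtype=float)

    # Output time axis, expressed in score time.
    n_out = int(round(anchors_score[-1] * sr))
    out_times = np.arange(n_out) / sr

    # For each output instant, find the audio time to read from
    # (piecewise-linear interpolation between anchors).
    src_times = np.interp(out_times, anchors_score, anchors_audio)

    # Read the input at fractional sample positions (linear interpolation).
    src_pos = src_times * sr
    return np.interp(src_pos, np.arange(len(audio)), audio)


# Toy check: a ramp whose sample value equals its audio time (sr = 1 Hz).
ramp = np.arange(9, dtype=float)
warped = time_warp(ramp, 1, anchors_audio=[0.0, 2.0, 8.0],
                   anchors_score=[0.0, 4.0, 8.0])
print(warped)
```

In the toy check, the first two seconds of audio are stretched over four seconds of score time and the remaining six seconds are compressed into four, so the printed values show which audio instant is heard at each score instant.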

ID	Score	Original audio	Time-warped audio by Soundprism	Time-warped audio by Scorealign
1	MIDI	wave	wave	wave
2	MIDI	wave	wave	wave
3	MIDI	wave	wave	wave
4	MIDI	wave	wave	wave
5	MIDI	wave	wave	wave

References

[1] J. Ganseman, G. Mysore, P. Scheunders and J. Abel, "Source separation by score synthesis," in Proc. International Computer Music Conference (ICMC), New York, NY, June 2010.
[2] Z. Duan, B. Pardo and C. Zhang, "Multiple fundamental frequency estimation by modeling spectral peaks and non-peak regions," IEEE Trans. Audio Speech Language Process., vol. 18, no. 8, pp. 2121-2133, 2010.
[3] Z. Duan, J. Han and B. Pardo, "Song-level multi-pitch tracking by heavily constrained clustering," in Proc. IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2010, pp. 57-60.
[4] M. Goto, H. Hashiguchi, T. Nishimura and R. Oka, "RWC music database: popular, classical, and jazz music databases," in Proc. International Conference on Music Information Retrieval (ISMIR 2002), 2002, pp. 287-288.
[5] N. Hu, R.B. Dannenberg and G. Tzanetakis, "Polyphonic audio matching and alignment for music retrieval," in Proc. 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, New York, USA, 2003, pp. 185-188.

For any questions or comments, please contact us at zhiyaoduan00 AT gmail DOT com.