Overview
We propose a model that improves automatic music transcription by adding a perceptual objective based on differentiable rendering. The same model also permits automatic arrangement of a piece for musical instruments other than those in the original recording.
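To make the idea of a rendering-based objective concrete, here is a minimal sketch, not the method used for the samples on this page. It assumes a bank of sine oscillators as the renderer and an L1 distance between magnitude spectrograms as the perceptual objective. Both are stand-ins chosen for brevity, and the names `render_piano_roll` and `spectral_loss`, as well as all parameter values, are hypothetical.

```python
import numpy as np

def midi_to_hz(pitch):
    """Convert a MIDI pitch number to frequency in Hz (MIDI 69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((np.asarray(pitch, dtype=float) - 69.0) / 12.0)

def render_piano_roll(roll, lowest_pitch=21, frame_rate=100, sample_rate=16000):
    """Render a (pitches, frames) piano-roll with activations in [0, 1] to audio.

    Assumption: one sine oscillator per pitch, with its amplitude following the
    corresponding row of the roll. Only sums and products are applied to `roll`,
    so the same code is differentiable in an autodiff framework.
    """
    n_pitches, n_frames = roll.shape
    hop = sample_rate // frame_rate                      # samples per roll frame
    t = np.arange(n_frames * hop) / sample_rate          # time axis in seconds
    freqs = midi_to_hz(lowest_pitch + np.arange(n_pitches))
    oscillators = np.sin(2.0 * np.pi * freqs[:, None] * t[None, :])
    amplitudes = np.repeat(roll, hop, axis=1)            # hold each frame value
    return (amplitudes * oscillators).sum(axis=0)

def magnitude_spectrogram(wave, n_fft=512, hop=128):
    """Hann-windowed short-time magnitude spectrum, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    frames = np.stack([wave[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def spectral_loss(wave_a, wave_b):
    """Mean absolute difference between the two magnitude spectrograms."""
    diff = magnitude_spectrogram(wave_a) - magnitude_spectrogram(wave_b)
    return float(np.mean(np.abs(diff)))

# Usage: compare a correct and an incorrect transcription of a sustained middle C.
ground_truth = np.zeros((88, 50))
ground_truth[60 - 21, :] = 1.0                           # MIDI 60, held for 0.5 s
wrong_pitch = np.zeros((88, 50))
wrong_pitch[67 - 21, :] = 1.0                            # MIDI 67 instead

reference = render_piano_roll(ground_truth)
print(spectral_loss(reference, render_piano_roll(ground_truth)))
print(spectral_loss(reference, render_piano_roll(wrong_pitch)))
```

In this sketch the loss is zero when the predicted piano-roll matches the ground truth and positive when the pitch is wrong, so the error is measured on the rendered sound rather than on the piano-roll itself. Replacing NumPy with an autodiff library would let such a loss be backpropagated to the transcription model.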
Transcription
Bach - Prelude in C Major, BWV 846
Original data | Transcription |
---|---|
Input waveform | |
Ground-truth piano-roll ![]() | Predicted piano-roll ![]() |
Predicted onset matrix ![]() | |
Rendered from ground-truth piano-roll | Rendered from predicted piano-roll |
Chopin - Fantaisie-Impromptu in C# minor, Op. 66
Original data | Transcription |
---|---|
Input waveform | |
Ground-truth piano-roll ![]() | Predicted piano-roll ![]() |
Predicted onset matrix ![]() | |
Rendered from ground-truth piano-roll | Rendered from predicted piano-roll |
Schumann - Träumerei, “Kinderszenen” No. 7 in F major, Op. 15
Original data | Transcription |
---|---|
Input waveform | |
Ground-truth piano-roll ![]() | Predicted piano-roll ![]() |
Predicted onset matrix ![]() | |
Rendered from ground-truth piano-roll | Rendered from predicted piano-roll |
Arrangement
Orchestra to strings (Dvorak - Symphony No. 9, fourth movement)
The sounds of strings are stationary.
Original sound | Arrangement |
---|---|
![]() | ![]() |
Orchestra to organ (Holst - The Planets, Jupiter)
The sound of an organ is stationary.
Original sound | Arrangement |
---|---|
![]() | ![]() |
Orchestra to piano (Haydn - Menuet)
The sound of a piano is non-stationary.
Original sound | Arrangement |
---|---|
![]() | ![]() |