Overview

We propose a model that improves automatic music transcription by adding a perceptual objective computed through differentiable rendering. Because the transcription is rendered back to audio, the same model also permits automatic transposition of a piece to musical instruments other than the original.
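
To make the idea of a perceptual objective concrete, the sketch below renders a piano-roll to audio and compares it to the input waveform in a log-spectrogram domain. It is only an illustration under stated assumptions, not the model proposed here: the sinusoidal renderer, the L1 log-spectrogram distance, and the names `render`, `log_spectrogram`, `perceptual_loss`, `SAMPLE_RATE`, and `FRAME_LEN` are all hypothetical choices. NumPy is used for readability; the same operations written in an autodiff framework would let gradients flow from the loss back to the predicted piano-roll.

```python
import numpy as np

SAMPLE_RATE = 16_000  # Hz (assumed for this sketch)
FRAME_LEN = 512       # audio samples per piano-roll frame (assumed)


def midi_to_hz(pitch):
    """Convert MIDI pitch numbers to Hz (MIDI 69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((np.asarray(pitch, dtype=float) - 69.0) / 12.0)


def render(piano_roll, pitches):
    """Render a (frames, pitches) activation matrix as a sum of sinusoids.

    Only multiplies, adds, and sines are used, so the rendering is a
    smooth function of the piano-roll activations.
    """
    n_samples = piano_roll.shape[0] * FRAME_LEN
    t = np.arange(n_samples) / SAMPLE_RATE
    # Hold each frame's activation for FRAME_LEN samples.
    envelope = np.repeat(piano_roll, FRAME_LEN, axis=0)  # (samples, pitches)
    carriers = np.sin(2.0 * np.pi * t[:, None] * midi_to_hz(pitches)[None, :])
    return (envelope * carriers).sum(axis=1)


def log_spectrogram(audio):
    """Log-magnitude spectrum of non-overlapping, Hann-windowed frames."""
    frames = audio.reshape(-1, FRAME_LEN) * np.hanning(FRAME_LEN)
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))


def perceptual_loss(predicted_roll, target_audio, pitches):
    """Mean L1 distance between log-spectrograms of rendered and target audio."""
    rendered = render(predicted_roll, pitches)
    diff = log_spectrogram(rendered) - log_spectrogram(target_audio)
    return float(np.mean(np.abs(diff)))


# Toy example: three candidate pitches, eight frames.
pitches = np.array([60, 64, 67])  # C4, E4, G4
truth = np.zeros((8, 3))
truth[:4, 0] = 1.0                # C4 in the first half
truth[4:, 2] = 1.0                # G4 in the second half
target = render(truth, pitches)   # stands in for the input waveform

wrong = np.zeros((8, 3))
wrong[:, 1] = 1.0                 # E4 throughout: an incorrect transcription

loss_correct = perceptual_loss(truth, target, pitches)
loss_wrong = perceptual_loss(wrong, target, pitches)
```

In this toy setting the correct piano-roll reproduces the target exactly, so its loss is zero, while the incorrect one yields a positive loss. Swapping the sinusoidal carriers for another timbre in `render` is one way such a renderer could play the same piano-roll on a different instrument.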

Transcription

Bach - Prelude in C Major, BWV 846

Original data	Transcription
Input waveform
Ground-truth piano-roll	Predicted piano-roll
	Predicted onset matrix
Rendered from ground-truth piano-roll	Rendered from predicted piano-roll

Chopin - Fantaisie-Impromptu in C# minor, Op. 66

Original data	Transcription
Input waveform
Ground-truth piano-roll	Predicted piano-roll
	Predicted onset matrix
Rendered from ground-truth piano-roll	Rendered from predicted piano-roll

Schumann - Träumerei, “Kinderszenen” No. 7 in F major, Op. 15

Original data	Transcription
Input waveform
Ground-truth piano-roll	Predicted piano-roll
	Predicted onset matrix
Rendered from ground-truth piano-roll	Rendered from predicted piano-roll