To use this software in academic publications, please cite the following paper: Morsi A, Zhang H, Maezawa A, Dixon S, Serra X. Simulating piano performance mistakes for music learning. Proceedings of the 21st Sound and Music Computing Conference (SMC 2024); 2024 July 4-6; Porto, Portugal.
For other uses, please see the included LICENSE file.
This is the repository of SynMist, containing the synthesized mistake dataset as well as the Python scripts that generate mistakes in a taxonomical way. simulate_mistakes.py and lowlvl.py together contain the functions for the mid-level mistake scheduler and the low-level deviation functions. region_classifier.py contains the simple texture / technique region identifier.
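As a rough sketch of the kind of logic a texture / technique region identifier performs (the function name, inputs, and thresholds below are illustrative assumptions, not the actual API of region_classifier.py), a note could be assigned one of the texture labels used in sampling_prob.csv like this:

```python
# Illustrative sketch of texture/technique classification for a single note.
# The real region_classifier.py may use different names and heuristics.

def classify_note(note, concurrent_notes, neighbor_interval):
    """Assign a texture label to a note.

    note              -- the note under consideration (unused in this toy rule set)
    concurrent_notes  -- number of OTHER notes sounding at the same onset
    neighbor_interval -- interval in semitones to the next melodic note
    """
    if concurrent_notes >= 2:
        return "is_block_chords_note"   # three or more simultaneous notes
    if concurrent_notes == 1:
        return "is_double_note"         # exactly two simultaneous notes
    if abs(neighbor_interval) <= 2:
        return "is_scale_note"          # stepwise motion suggests a scale run
    return "others"

print(classify_note(None, 2, 0))  # is_block_chords_note
print(classify_note(None, 0, 2))  # is_scale_note
```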
python simulate_mistakes.py <input_midi_folder> <output_midi_folder> <run_id>

This processes all MIDI performance files in <input_midi_folder>, applies mistakes to them, and saves the resulting files in <run_id>/<output_midi_folder>. For example:

python simulate_mistakes.py --no_ts_annot '<path_prefix>/synthetic-mistake-study/data' repeat_test 'run11'
sampling_prob.csv lists several mistake types and their associated probabilities. Each probability should be interpreted as the probability of a given mistake type within a detected 'texture', conditioned on the decision that a mistake will occur at a note classified as belonging to that texture. This is crucial because there is no 'no mistake' probability. Consistent with this interpretation, each row should sum to 1 (summing down a column is not meaningful).
The sampling process is as follows: each note in the performance is classified as belonging to a textural region. We then specify the total number of mistakes to be introduced per file (this is the current approach). A playing attitude could additionally be specified to define the parameters of each deviation function, but this does not exist yet; those parameters are currently selected at random. The requested mistakes are then distributed over the notes, with each mistake's type sampled according to the per-texture probabilities.
Note that the current set of values is a placeholder and is not based on actual analysis.
index,forward_backward_insertion,mistouch,pitch_change,drag
is_double_note,0.4,0.3,0.1,0.2
is_scale_note,0.3,0.4,0.3,0
is_block_chords_note,0.2,0.1,0.4,0.3
others,0.1,0.2,0.2,0.2
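A minimal sketch of how these per-texture probabilities could drive sampling (the helper names and the CSV-loading details are assumptions for illustration, not the actual implementation in simulate_mistakes.py):

```python
import csv
import random

def load_sampling_probs(path):
    """Read the per-texture mistake-type probabilities into a dict of dicts,
    assuming the CSV layout shown above (first column 'index' is the texture)."""
    with open(path, newline="") as f:
        probs = {}
        for row in csv.DictReader(f):
            texture = row.pop("index")
            probs[texture] = {k: float(v) for k, v in row.items()}
    return probs

def sample_mistake_type(texture, probs, rng=random):
    """Sample one mistake type for a note of the given texture,
    weighted by that texture's row of probabilities."""
    types = list(probs[texture])
    weights = [probs[texture][t] for t in types]
    return rng.choices(types, weights=weights, k=1)[0]

# Example with in-memory values matching the scale-note row above:
probs = {"is_scale_note": {"forward_backward_insertion": 0.3,
                           "mistouch": 0.4, "pitch_change": 0.3, "drag": 0.0}}
print(sample_mistake_type("is_scale_note", probs))
```

Because 'drag' has weight 0 for scale notes, it is never drawn for that texture, which matches the conditional interpretation described above.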
The detailed analysis and annotation of individual mistakes from both datasets can be found here.
To access the Burgmüller dataset, please refer to the original paper's project page and find the download link.
For the augmented version of the expert-novice dataset (containing transcribed MIDIs and error annotations), please refer to the repository.
For the interviews with piano teachers, we put the evaluated samples into an online questionnaire; their anonymized comments and ratings are shown here.
Following the examples that adapt ASAP and AMAPS, you can create a similar adapter for any other dataset, as long as it is possible to extract a 1-D array of time values representing the locations of the annotations.
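For instance, a minimal adapter for a hypothetical dataset whose annotations are stored as one timestamp per line might look like the sketch below (the file format and function name are assumptions; follow the ASAP/AMAPS adapters in this repository for the exact interface expected by the scripts):

```python
import numpy as np

def load_annotation_times(path):
    """Return a 1-D numpy array of annotation times (in seconds),
    assuming a plain-text file with one float timestamp per line."""
    with open(path) as f:
        times = [float(line.strip()) for line in f if line.strip()]
    return np.asarray(times, dtype=float)
```

Whatever the on-disk format, the adapter only needs to end up with such a 1-D array of time values for the rest of the pipeline to use.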