MuseData File Format

MuseData™ is the primary encoding system used by the Center for Computer Assisted Research in the Humanities (CCARH). All musical information is entered and verified by the Center's personnel from established or specially commissioned editions of music. An online description of the MuseData file formats is available, taken from the book Beyond MIDI: The Handbook of Musical Codes.

MuseData files have the potential to exist in multiple formats generated from a common set of information. Most derivative encodings accommodate only some of the features included in the master MuseData encodings. The MuseData file format is designed to support applications in sound, graphics, and analysis. Derivative formats of the MuseData musical encodings which are currently in distribution are: MIDI1, MIDI+, and Humdrum.


MuseData File Organization

MuseData files are ASCII-based and are viewable in any text editor. Users should be aware that the number of files per movement and per work may vary from one format to another. Within the MuseData format this number may vary from one edition to another.

MuseData files are part-oriented. A movement from a composition is typically divided into several files collected in a directory for that movement. The parts for MuseData files are always labeled 01 for the first part in the score, 02 for the second part in the score, etc. A part file can also contain multiple lines of music, such as two flutes on one staff in an orchestral score, or two staves for piano music. MuseData files for different movements of a composition are found in separate directories, usually named for the movement number, e.g. 01, 02, etc.
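The directory layout described above can be walked with a few lines of Python. This is only a sketch: the function name list_part_files is hypothetical, and it assumes that movement directories and part files both have names beginning with a two-digit number (01, 02, ...). Actual distributions may use other names or add file extensions, so the pattern may need adjusting.

```python
import re
from pathlib import Path

# Assumption: movement directories and part files start with two digits.
TWO_DIGITS = re.compile(r"[0-9]{2}")


def list_part_files(work_dir):
    """Map each movement directory name to its part file names, in score order."""
    layout = {}
    for movement in sorted(Path(work_dir).iterdir()):
        if not (movement.is_dir() and TWO_DIGITS.match(movement.name)):
            continue  # skip anything that does not look like a movement
        layout[movement.name] = [
            part.name
            for part in sorted(movement.iterdir())
            if part.is_file() and TWO_DIGITS.match(part.name)
        ]
    return layout
```

Because the names are zero-padded, plain alphabetical sorting returns the parts in score order, with 01 first.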

The completeness of the information within files varies between two levels, which we call Stage 1 and Stage 2. Only Stage 2 files are recommended for serious applications.

The first pass in data entry (Stage 1) captures basic information such as the duration and pitch of notes. For example, there would normally be four files (Violin 1, Violin 2, Viola, Cello) for each movement of a string quartet. If the quartet movement begins in duple meter, changes to triple meter, and then reverts to duple meter, each metrical section will have its own set of parts. Thus there would be twelve files for the movement.
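The file count in this example is simply the number of parts multiplied by the number of metrical sections. The following sketch enumerates the combinations; the names stage1_files, parts, and meter_sections are illustrative only and are not part of the MuseData format.

```python
from itertools import product

parts = ["Violin 1", "Violin 2", "Viola", "Cello"]
meter_sections = ["duple", "triple", "duple"]


def stage1_files(parts, sections):
    """Return one (section number, meter, part) tuple per Stage 1 file."""
    return [
        (number, meter, part)
        for (number, meter), part in product(enumerate(sections, start=1), parts)
    ]


files = stage1_files(parts, meter_sections)
print(len(files))  # 12: four parts times three metrical sections
```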

The second pass in data entry (Stage 2) supplies all information which cannot be reliably captured from an electronic keyboard. This includes indications for tempo, dynamics, and articulation; text underlay; stem, beam, and slur information; and many other details which are essential for notational output of professional quality.

Human judgment is applied in Stage 2. Thus when the string-quartet movement cited above is converted to Stage 2, the three metrical sections for each instrument captured from keyboard input will be chained into one movement each. The movement will now have four data files (one each for Violin 1, Violin 2, Viola, Cello).
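The bookkeeping side of this chaining step can be modeled as grouping the Stage 1 sections by part and joining them in order. The sketch below covers only that grouping. The function chain_sections and the placeholder record strings are hypothetical, and the actual Stage 2 conversion also adds notational detail and editorial judgment that no such grouping captures.

```python
from collections import defaultdict


def chain_sections(stage1):
    """Join each part's metrical sections, in section order, into one list.

    stage1 maps (part, section_number) to a list of records for that section.
    """
    chained = defaultdict(list)
    for part, section in sorted(stage1):
        chained[part].extend(stage1[(part, section)])
    return dict(chained)


parts = ["Violin 1", "Violin 2", "Viola", "Cello"]
# Placeholder records standing in for the contents of each Stage 1 file.
stage1 = {(p, s): [f"{p} section {s}"] for p in parts for s in (1, 2, 3)}
stage2 = chain_sections(stage1)
print(len(stage1), len(stage2))  # 12 4
```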

Human judgment also supplies corrections and annotations to the data. Some kinds of errors (for example, incomplete measures) must be corrected for the data to make sense to user software. Matters which are more discretionary (such as optional alterations of ornaments or accidentals in earlier repertories) are usually left unchanged. Discretionary decisions are annotated in those files which allow for editorial markings.