MusicXML for Notation and Analysis

Recordare LLC

Reprinted with permission from The Virtual Score: Representation, Retrieval, Restoration, Walter B. Hewlett and Eleanor Selfridge-Field, eds., MIT Press, Cambridge, MA, 2001, pp. 113-124. Computing in Musicology 12. Copyright © 2001 Center for Computer Assisted Research in the Humanities.

Abstract

MusicXML is intended to represent common western musical notation from the seventeenth century onwards, including both classical and popular music. MusicXML is intended to support interchange between musical notation, performance, analysis, and retrieval applications. It is designed to be sufficient, not optimal, for these applications.
MusicXML is an XML-based music interchange language.[1] It is intended to represent common western musical notation from the seventeenth century onwards, including both classical and popular music. The language is designed to be extensible to future coverage of early music and less standard notation needs of twentieth and twenty-first century scores. (Non-western musical notations would probably be best represented through a separate XML language.)

MusicXML is intended to support interchange between musical notation, performance, analysis, and retrieval applications. It is therefore designed to be sufficient, not optimal, for these applications. MusicXML is not intended to supersede other formats that are optimized for specific musical applications, but to support sharing of musical data between applications. The development goal is to support interchange with any musical program for western notation with a published computer data format.

The current MusicXML converter software runs on Windows. As of October 2000, it reads from:
The current MusicXML software writes to:
MusicXML software currently provides complete coverage for both reading and writing MuseData files, and partial coverage of the other formats and applications. The NIFF, ETF, and MIDI converters use XML versions of these formats as intermediate data structures.

MusicXML adapts the MuseData and Humdrum formats to XML, adding features needed to cover more of music usage from the mid-nineteenth century to the present time.[2] These were chosen as starting points because they are two of the most powerful languages currently available for musical analysis and interchange.

One of Humdrum’s important features is its explicitly two-dimensional representation of music by part and by time. A hierarchical representation like XML cannot directly support this type of lattice structure, but automatic conversion between these two orderings is an adequate alternative. MusicXML uses Extensible Stylesheet Language Transformations (XSLT) programs to convert between two hierarchical representations: a part-wise score, where measures are nested within parts, and a time-wise score, where parts are nested within measures.

A Sample MusicXML Encoding

To give a flavor of MusicXML, here is an encoding of the beginning of the voice part of Robert Schumann's Op. 35 setting of Kerner's "Frage," illustrated in Figure 1.

Figure 1: Robert Schumann, start of the setting of Kerner's "Frage" and its representation in MusicXML.
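Returning to the two score orderings described above, the regrouping from part-wise to time-wise can be illustrated with a minimal sketch. It is written in Python with the standard-library ElementTree module, not in XSLT as the MusicXML software itself is, so it is an illustration of the idea and not the converter's actual code. The tiny input document and the element and attribute names (`score-partwise`, `score-timewise`, `part`, `measure`, `id`, `number`) are assumptions made for the example, and the function name `partwise_to_timewise` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified part-wise score (assumed element names).
PARTWISE = """
<score-partwise>
  <part id="P1">
    <measure number="1"><note><duration>4</duration></note></measure>
    <measure number="2"><note><duration>2</duration></note></measure>
  </part>
  <part id="P2">
    <measure number="1"><note><duration>1</duration></note></measure>
    <measure number="2"><note><duration>3</duration></note></measure>
  </part>
</score-partwise>
"""


def partwise_to_timewise(root):
    """Regroup part/measure nesting into measure/part nesting."""
    timewise = ET.Element("score-timewise")
    measures = {}  # measure number -> new <measure> element, in first-seen order
    for part in root.findall("part"):
        for measure in part.findall("measure"):
            number = measure.get("number")
            if number not in measures:
                measures[number] = ET.SubElement(timewise, "measure", number=number)
            # Each new <part> holds what this part plays in this measure.
            new_part = ET.SubElement(measures[number], "part", id=part.get("id"))
            new_part.extend(list(measure))
    return timewise


timewise = partwise_to_timewise(ET.fromstring(PARTWISE))
print(ET.tostring(timewise, encoding="unicode"))
```

The reverse conversion follows the same pattern with the roles of `part` and `measure` swapped. The sketch ignores everything outside the parts (headers, part lists), which a real converter would need to carry across.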
MusicXML score files do not represent presentation concepts such as pages and systems. The details of formatting will change based on different paper and display sizes. In the XML environment, formatting is handled separately from structure and semantics. The same applies for detailed interpretive performance information. Separate XML languages could be developed to represent individual printings and performances.

Each MusicXML score file represents a single movement. Multi-movement works and collections are represented in a MusicXML opus file, based on a separate DTD for linking and musicological data.

MusicXML documents are larger than previous text formats such as MuseData and Humdrum. However, XML documents compress well, and zip compression typically reduces the size of MusicXML files by a factor of 30. MusicXML files that are seven times larger than MuseData files when uncompressed are only twice as large when compressed.

MusicXML Score DTD Examples

The MusicXML score DTD is still under development as it is tested with more music, music formats, and musical applications. The complete DTD can be accessed via www.musicxml.org/xml.html. Some examples from the current version illustrate both the level of detail in MusicXML and some of the standardization issues that arise when defining an XML interchange language. In Figure 2 we see how a note element is defined.

Figure 2. Definition of a note element in a MusicXML DTD.

<!-- Internal entities to simplify note definitions -->
<!ENTITY % full-note "(chord?, (pitch | unpitched | rest))">
<!ENTITY % voice-track "(footnote?, level?, track?)">

Two elements within the note definition are pitch and tie.

Figure 3. Elements within the note definition.

<!ELEMENT pitch (step, alter?, octave)>
<!ELEMENT step (#PCDATA)>
<!ELEMENT alter (#PCDATA)>
<!ELEMENT octave (#PCDATA)>

<!-- Tie is an empty element with one attribute. -->
<!ENTITY % start-stop "(start | stop)">
<!ELEMENT tie EMPTY>
<!ATTLIST tie type %start-stop; #REQUIRED>

These definitions illustrate an interesting dichotomy in XML DTDs: element text is weakly typed, but attribute text can be strongly typed. Yet for most purposes, it is better design practice overall to put semantic data into elements rather than attributes (Harold 1999). Elements are generally easier to manipulate than attributes from within an XML program, and elements can have more complex structure than attributes can.

The weak typing of element text helps make XML DTDs more extensible for new
applications, but puts a heavier burden on documentation and software to handle
interoperability. In our current MusicXML software, pitch names and note types
are interpreted using American terminology.

MusicXML Analysis Examples

One limitation to computer-based musical analysis has been the tight coupling of representations to development tools. Humdrum tools require familiarity with Unix usage, while MuseData tools run in TenX, a non-standard DOS environment. In contrast, XML programming tools are available for all major industry programming languages and platforms. This lets the user rather than the representation language choose the programming environment, making for simpler development of musical applications.

Two main programming models are currently available for handling XML data: the Document Object Model (DOM) and the Simple API for XML (SAX) (Martin et al. 2000). The W3C's DOM interprets an entire XML document as a tree of nodes. SAX provides an alternative event-based, serial-access model, where an entire XML document need not be read into memory at once, but can instead be parsed on an as-needed basis. Tools for both models are available for many programming languages (e.g., Java, C++, Visual Basic) from many vendors. These examples use a DOM-based model coded using Microsoft Visual Basic.

Both of the analysis program examples are adapted from the problem list on the Humdrum web site. Say we want to investigate whether Bach’s pieces really have 90% of their notes in one of two durations (e.g., quarters and eighths, or eighths and sixteenths). We can do this by plotting a distribution of note durations on a bar chart, displayed together with a simple spreadsheet. Figure 4 shows the duration distribution for the second movement of Bach’s Cantata No. 6 (BWV 6). The top two note durations make up nearly 87% of the notes. This is not quite the 90% posed in the question, but still a more uneven distribution than is often seen. For retrieval purposes, an extended program could then look for the works in a given corpus with the most uneven distribution of note durations.

Figure 4. Duration distribution for Bach's Cantata No. 6 (BWV 6), second movement.

Note durations are represented as fractions in many musical codes, including
MuseData and NIFF. MusicXML follows MuseData’s example in encoding the
denominator (which changes rarely) in a separate element from the numerator,
which is coded individually for each note. Thus to build the distribution chart,
we need to search not only for Once this is done, we search the file for both

Figure 5. XPath addressing of notes and divisions.
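The duration count described above can be sketched as follows. This is a minimal illustration in Python using the standard-library ElementTree module, not the Visual Basic DOM program used for Figure 4. The sample score, the function name `duration_distribution`, and the assumption that the denominator lives in a `divisions` element inside `attributes` while each `note` carries a `duration` element are all choices made for the example. Rests are counted here along with pitched notes; whether they should be is a decision the analysis would need to make.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from fractions import Fraction

# Hypothetical one-measure score; element names are assumed for illustration.
SCORE = """
<score-partwise>
  <part id="P1">
    <measure number="1">
      <attributes><divisions>2</divisions></attributes>
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>2</duration></note>
      <note><pitch><step>D</step><octave>4</octave></pitch><duration>1</duration></note>
      <note><pitch><step>E</step><octave>4</octave></pitch><duration>1</duration></note>
      <note><rest/><duration>4</duration></note>
    </measure>
  </part>
</score-partwise>
"""


def duration_distribution(root):
    """Count notes by duration, expressed as a fraction of a quarter note."""
    counts = Counter()
    for part in root.iter("part"):
        divisions = None  # the rarely changing denominator, tracked per part
        for measure in part.findall("measure"):
            for child in measure:  # document order, so divisions changes apply in time
                if child.tag == "attributes":
                    text = child.findtext("divisions")
                    if text is not None:
                        divisions = int(text)
                elif child.tag == "note":
                    text = child.findtext("duration")
                    if text is None or divisions is None:
                        continue  # e.g. a note without a duration element
                    counts[Fraction(int(text), divisions)] += 1
    return counts


counts = duration_distribution(ET.fromstring(SCORE))
total = sum(counts.values())
top_two = sum(n for _, n in counts.most_common(2))
print(dict(counts))
print(f"top two durations cover {top_two / total:.0%} of the notes")
```

Walking each measure in document order matters because the denominator can change partway through a part; a flat search for all notes would lose track of which value applies.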
As another example, say we want to investigate whether there is a correlation between pitch and duration in a given score. The code logic is nearly the same. Instead of computing the counts in an array, we add pitch/duration pairs for each note to the spreadsheet. Rests are excluded, along with cue and grace notes (Figure 6).

Figure 6. XPath or a coordinate search of pitch and duration.
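A sketch of the pair collection just described, again in Python with ElementTree and not the original Visual Basic, might look like this. The element names (`rest`, `cue`, `grace`, `pitch`, `step`, `alter`, `octave`, `duration`, `divisions`) are assumed for the example, as is the sample score. Mapping each pitch to a MIDI-style number is also an assumption of this sketch; the text does not say how pitch was placed on the scatter-plot axis.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Hypothetical measure with a rest, a grace note, and a cue note to be skipped.
SCORE = """
<score-partwise>
  <part id="P1">
    <measure number="1">
      <attributes><divisions>2</divisions></attributes>
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>2</duration></note>
      <note><pitch><step>F</step><alter>1</alter><octave>4</octave></pitch><duration>1</duration></note>
      <note><rest/><duration>1</duration></note>
      <note><grace/><pitch><step>G</step><octave>4</octave></pitch></note>
      <note><cue/><pitch><step>A</step><octave>4</octave></pitch><duration>2</duration></note>
    </measure>
  </part>
</score-partwise>
"""


def pitch_number(pitch):
    """Map step/alter/octave to a MIDI-style number (assumed: C4 = 60)."""
    step = pitch.findtext("step")
    alter = int(pitch.findtext("alter", default="0"))
    octave = int(pitch.findtext("octave"))
    return 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter


def pitch_duration_pairs(root):
    """Collect (pitch, duration in quarter notes) for ordinary pitched notes."""
    pairs = []
    for part in root.iter("part"):
        divisions = None
        for measure in part.findall("measure"):
            for child in measure:
                if child.tag == "attributes":
                    text = child.findtext("divisions")
                    if text is not None:
                        divisions = int(text)
                elif child.tag == "note":
                    # Skip rests, cue notes, and grace notes.
                    if any(child.find(tag) is not None for tag in ("rest", "cue", "grace")):
                        continue
                    pitch = child.find("pitch")
                    text = child.findtext("duration")
                    if pitch is None or text is None or divisions is None:
                        continue
                    pairs.append((pitch_number(pitch), Fraction(int(text), divisions)))
    return pairs


for pitch, duration in pitch_duration_pairs(ET.fromstring(SCORE)):
    print(pitch, float(duration))
```

Each printed row corresponds to one spreadsheet row, with the two columns ready to be mapped to the axes of a scatter plot.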
Afterwards, we map the scatter-plot axes to the two columns in the spreadsheet to display the graph. Figure 7 shows a scatter-plot of pitch vs. duration for the first movement of Mozart’s String Quartet No. 7 (K. 169). As with most musical scores we have looked at so far, there is no correlation between the two.

Figure 7: Pitch/duration scatter-plot for Mozart’s Quartet No. 7 (K. 169), first movement.
Conclusions

Music information retrieval faces a tower-of-Babel problem. There is no musical format in widespread use today that overcomes MIDI's limitations as an interchange format between performance, notation, analysis, and retrieval applications. A problem for past interchange efforts has been the absence of commonly used formats for complex structured data in general. XML provides the technical foundation for an interchange format that is more powerful and expressive than the current MIDI format. Developing converters between existing formats and a single music XML document type definition has the potential to greatly simplify the tasks of music information retrieval. MusicXML attempts to provide a common document type definition that is well designed from musical, human, and computer perspectives.

Footnotes

[1] An abstract of this paper was presented as a poster session at the International Symposium on Music Information Retrieval, MusicIR (October 2000).

[2] I wish to thank Eleanor Selfridge-Field, Walter B. Hewlett, Barry Vercoe, and David Huron for their advice, encouragement, and prior work in musical score representation.

References and URLs

Bray, Tim, Jean Paoli, and Charles M. Sperberg-McQueen (eds.) (2000). Extensible Markup Language (XML) 1.0 (Second Edition). World Wide Web Consortium (W3C) Recommendation, October 6, 2000. www.w3.org/TR/2000/REC-xml-20001006

Clark, James (ed.) (1999). XSL Transformations (XSLT) Version 1.0. W3C Recommendation, November 16, 1999. www.w3.org/TR/1999/REC-xslt-19991116

Harold, Elliotte Rusty (1999). XML Bible. (Foster City, CA: IDG Books Worldwide).

Hewlett, Walter B. (1997). "MuseData: Multipurpose Representation" in Beyond MIDI: The Handbook of Musical Codes, ed. Eleanor Selfridge-Field (Cambridge, MA: The MIT Press), 402-447. www.ccarh.org/publications/books/beyondmidi/online/musedata/

Huron, David (1997). "Humdrum and Kern: Selective Feature Encoding" in Beyond MIDI: The Handbook of Musical Codes, ed. Eleanor Selfridge-Field (Cambridge, MA: The MIT Press), 375-401. dactyl.som.ohio-state.edu/Humdrum/

Martin, Didier et al. (2000). Professional XML. (Birmingham, UK: Wrox Press).

MusicIR (2000). International Symposium on Music Information Retrieval (Plymouth, MA; Oct. 23-25, 2000). U. Mass. Center for Intelligent Information Retrieval in conjunction with the Digital Libraries Phase II and the National Science Foundation. Abstracts at: ciir.cs.umass.edu/music2000

MusicXML (2000). www.musicxml.org/xml.html