
Generating simple MIDI files using Java, without using the Java Sound API

If you need to generate a MIDI file from a Java application, in nearly all cases you're better off using the standard Java Sound API, which has built-in support for it. Unfortunately, not all platforms where Java runs have support for the MIDI API, and even where it is supported it's a pretty heavy thing to use if all you want to do is play a few notes of music.

My interest in writing MIDI files using Java comes from developing music applications for the Android platform. Android applications are Java-based, but the API set is limited. In particular, there is no MIDI support. This is very strange, because the Android platform has a MIDI renderer and can play MIDI files. Even stranger, Android did go through a stage of including the standard MIDI APIs, but Google took them out. So now, if you want to generate music programmatically on Android, the only practical approach is to have your program write a file and feed it into the built-in media player.

Happily, if your music-generation needs are modest, it's not all that difficult to implement a Java class that writes a MIDI file. By 'modest' I mean that you can get by with one track and one controller channel, and you don't need to fiddle with such things as the tempo and key signature in the middle of a track. Polyphony is possible, so long as you're careful about your deltas (more on deltas below). The complete source code for my MIDI writer fits into one Java class (which I've attached to the end of this article), and has only about a hundred lines of Java, not including comments and test code. But this simple class is capable of generating MIDI files that can be played by the Android media player, and the Windows media player, among other things.

In this article I will describe the format of a minimal MIDI file, and describe some of the nasties you'll have to contend with when using Java to write one. At the end is a complete code example and some ideas on how to use it.

Format of a simple MIDI file

MIDI files can be as complicated as you like, but if all you need to do is output the notes for 'Mary had a little lamb' or something of similar complexity, you don't need all the bells and whistles. A simple MIDI file contains, as a minimum, the following elements.
  • A file header. If you settle on a timebase that suits you (see below) this will be the same for every file — 14 constant bytes
  • A track header. A constant four bytes
  • Four bytes to indicate the amount of track data, including the track footer. This number is in big-endian format.
  • The track data, which will usually consist of:
    • Metadata events, most importantly the tempo. Most of the defined metadata (key signature, time signature, etc) is used by editing tools, and irrelevant to players
    • Performance events — notes, controller changes, etc
    • The track footer — four constant bytes
I'm not going to describe all the header values — they should be pretty clear from the source code below.
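The layout listed above can be sketched by assembling the smallest useful file in memory. This is only an illustration under stated assumptions: the class name MinimalMidiLayout and its build method are invented for this example and are not part of the MidiFile class, although the byte values match the constant arrays in the source listing.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration: lays out a one-note MIDI file chunk by chunk.
class MinimalMidiLayout {
    static byte[] build() {
        int[] fileHeader = {
            0x4d, 0x54, 0x68, 0x64,   // "MThd"
            0x00, 0x00, 0x00, 0x06,   // six bytes of header data follow
            0x00, 0x00,               // format 0: a single track
            0x00, 0x01,               // one track
            0x00, 0x10                // 16 ticks per quarter note
        };
        int[] trackHeader = { 0x4d, 0x54, 0x72, 0x6b };  // "MTrk"
        int[] trackData = {
            0x00, 0xFF, 0x51, 0x03, 0x0F, 0x42, 0x40,  // tempo: 1,000,000 us per quarter
            0x00, 0x90, 60, 127,                       // delta 0: middle C on
            0x10, 0x80, 60, 0,                         // delta 16: middle C off
            0x01, 0xFF, 0x2F, 0x00                     // footer: end of track
        };

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int b : fileHeader) out.write(b);   // write(int) keeps the low 8 bits
        for (int b : trackHeader) out.write(b);

        // Track length as four big-endian bytes; it counts the footer
        // but not the track header or the length field itself
        int n = trackData.length;
        out.write(n >>> 24);
        out.write(n >>> 16);
        out.write(n >>> 8);
        out.write(n);

        for (int b : trackData) out.write(b);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(build().length + " bytes");
    }
}
```

Everything before the track data is fixed once the timebase is chosen, which is why the listing at the end of the article can keep it in a constant array.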

Two (related) aspects of the MIDI file format are crucial for the programmer, and not at all well documented — deltas and timebase. Connected to the issues of timebase and delta is the fact that notes in MIDI are not specified by duration (e.g., crotchet, quaver) but by their on-time and off-time. You have to turn a note on, and then a bit later turn it off.

The timebase of the file is the rate of timer ticks used to sequence the various MIDI events. All events have their times specified in terms of the number of timebase ticks from the last event. Speeding up the timebase increases the tempo of playback, but the relationship between timebase and tempo is fiddly. The basic unit of timing in a MIDI file is the quarter note. In principle, a quarter note is not the same as a crotchet, and the MIDI format respects that distinction; this is part of the reason for the fiddliness. However, for most musicians the distinction is unimportant, and I'm going to assume that a quarter note is the same as a crotchet, both in this discussion and in my source code.

The tempo of the track is expressed in microseconds per quarter note (crotchet). So to get a tempo of crotchet=60, this value needs to be set to 1,000,000. The timebase of the track is some multiple of the number of crotchets per second, but you set the multiple. The set value can be (in principle) between 1 and 32,767, because the top bit of the 16-bit field selects a different, SMPTE-based time format, but I doubt that the extremes of this range are particularly useful. Whatever value is set for the multiplier, one crotchet will be exactly that many ticks long. So if I set the multiplier to, say, 1000, then if I set a note which turns on at time zero and off at time 2000, I'll get a minim. A duration of 500 will get me a quaver, and so on.
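The tempo arithmetic can be sketched as follows. The class TempoMath and its method names are assumptions made for this example only; the event layout (delta, FF 51 03, then three big-endian bytes) is the same as the tempoEvent array in the source listing.

```java
// Hypothetical helper, not part of the article's MidiFile class.
class TempoMath {
    /** Microseconds per quarter note for a tempo in crotchets per minute. */
    static int microsPerQuarter(int beatsPerMinute) {
        if (beatsPerMinute <= 0)
            throw new IllegalArgumentException("tempo must be positive");
        return 60_000_000 / beatsPerMinute;
    }

    /** Set-tempo meta event: delta 0, FF 51 03, then three big-endian bytes. */
    static int[] tempoEvent(int beatsPerMinute) {
        int us = microsPerQuarter(beatsPerMinute);
        if (us > 0xFFFFFF)
            throw new IllegalArgumentException("tempo too slow for three bytes");
        return new int[] {
            0x00, 0xFF, 0x51, 0x03,
            (us >> 16) & 0xFF, (us >> 8) & 0xFF, us & 0xFF
        };
    }

    /** Length of one timebase tick in microseconds. */
    static double microsPerTick(int beatsPerMinute, int ticksPerQuarter) {
        return (double) microsPerQuarter(beatsPerMinute) / ticksPerQuarter;
    }

    public static void main(String[] args) {
        for (int b : tempoEvent(60)) System.out.printf("%02X ", b);
        System.out.println();
        System.out.println(microsPerTick(60, 16) + " us per tick");
    }
}
```

At crotchet=60 with 16 ticks per crotchet, each tick lasts 62,500 microseconds, which gives a feel for how coarse a small timebase is.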

So how do we choose the multiplier? The larger the multiplier, the faster the timebase and hence the more precise the time measurement can be. This is very important if we're sampling an actual performance on an instrument, but for simple note generation applications, we don't really need this timing precision. In fact, in simple applications, there's a good reason to set the timebase pretty low. Here's why.

If you use a high (fast, precise) timebase multiplier, then you need more bits to represent each time value. Not only does this make the MIDI file much larger, it also makes the program logic rather complicated. So long as each time measurement is less than 128 ticks, you can represent it in the file as a single byte. Beyond that point you have to tackle the ghastly process of splitting your number up into the 7-bit repeating chunks that the specification demands.
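For readers who do need deltas of 128 ticks or more, the 7-bit chunking can be sketched like this. It is a minimal illustration, assuming the standard MIDI variable-length encoding (most significant group first, top bit set on every byte except the last); the class name VariableLength is invented here, and the MidiFile class in this article does not use it.

```java
// Hypothetical helper: encodes a delta as a MIDI variable-length quantity.
class VariableLength {
    static int[] encode(int value) {
        if (value < 0 || value > 0x0FFFFFFF)
            throw new IllegalArgumentException("value must fit in 28 bits");

        int[] reversed = new int[4];
        int n = 0;
        reversed[n++] = value & 0x7F;            // final byte: top bit clear
        value >>= 7;
        while (value > 0) {
            reversed[n++] = (value & 0x7F) | 0x80;  // earlier bytes: top bit set
            value >>= 7;
        }

        // The groups were collected least significant first; flip them
        int[] out = new int[n];
        for (int i = 0; i < n; i++)
            out[i] = reversed[n - 1 - i];
        return out;
    }

    public static void main(String[] args) {
        for (int b : encode(128)) System.out.printf("%02X ", b);
        System.out.println();
    }
}
```

Values up to 127 still come out as a single byte, so files that stay within that limit are unchanged by this encoding.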

In the source code below I set the timebase multiplier to 16 ticks per crotchet. This means that the longest note or rest that can be specified in a single operation is a semibreve (=64 ticks) and the shortest is a hemidemisemiquaver (=1 tick). Setting the timebase this way leads to very compact files, not to mention very simple Java code. Of course, it won't suit all applications. One slight difficulty you'll have with a small timebase multiplier is getting accurate triplets — 16 does not divide very nicely by three. But there are prices to pay for being able to write a MIDI file with ~100 lines of Java, and this is one of them.
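One way to soften the triplet problem is to share out the leftover ticks so that the group as a whole still adds up. The sketch below is an assumption of mine rather than something the MidiFile class does; TickSplit and split are names made up for the example.

```java
// Hypothetical helper: divides a span of ticks into nearly equal parts.
class TickSplit {
    /** Returns 'parts' durations that sum exactly to totalTicks. */
    static int[] split(int totalTicks, int parts) {
        if (parts <= 0 || totalTicks < 0)
            throw new IllegalArgumentException("need parts > 0 and ticks >= 0");
        int base = totalTicks / parts;     // whole ticks every part receives
        int extra = totalTicks % parts;    // leftover ticks to hand out
        int[] out = new int[parts];
        for (int i = 0; i < parts; i++)
            out[i] = base + (i < extra ? 1 : 0);
        return out;
    }

    public static void main(String[] args) {
        // Three triplet quavers in the space of one 16-tick crotchet
        for (int d : split(16, 3)) System.out.print(d + " ");
        System.out.println();
    }
}
```

The notes are slightly uneven (6, 5 and 5 ticks in this case), but the following beat lands where it should, which usually matters more to the ear.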

The other, related complication to be aware of is the use of deltas. A delta is simply a time interval — expressed in timebase ticks — between two events. It is impossible to specify absolute times in a MIDI file: every event of any kind is measured from the previous event. This measurement is called a delta. The first event in the track need not have delta=0, because it's permissible (and normal) for the music to start part-way through a bar. But when you're generating sound, not music notation, and you've only got one track, there's no compelling reason to start with delta non-zero.

Thereafter, any event that has delta=0 means 'do this at as near as possible the same time as the previous event'. The way that time is measured in MIDI files means that there's no concept of a rest. If you want to give the illusion of a rest, you turn the note off at delta=N, and then start the next one at delta=rest_ticks rather than delta=0. So if, for example, you want a crotchet, then a crotchet rest, then another crotchet, the delta values will be 0 for the start of the first crotchet, 16 for the end of the first crotchet, then 16 again (the length of the rest) for the start of the second crotchet, which therefore sounds 32 ticks into the track. The rest is simply implied.

The complexity of using deltas for timing becomes particularly apparent when generating chords. If you turn on (say) three crotchet notes at delta=0, you'll need to turn one of them off at delta=16, but the other two turn off at delta=0, not delta=16. That, again, is because events are always measured from the previous event. When you've turned the first note off, if all the notes are to be the same length, then the others turn off at the same time as the first, i.e., at delta=0.
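If thinking in deltas is uncomfortable, one option is to describe events by absolute tick and convert afterwards. The following is a sketch under that assumption; AbsoluteToDelta and its array layout {absoluteTick, status, data1, data2} are invented for this example and do not appear in the MidiFile class.

```java
import java.util.Arrays;

// Hypothetical helper: turns absolute tick times into MIDI deltas.
class AbsoluteToDelta {
    /** Each input row is {absoluteTick, status, data1, data2}. */
    static int[][] toDeltas(int[][] timed) {
        int[][] sorted = timed.clone();
        // Sorting object arrays is stable, so simultaneous events
        // keep the order in which the caller listed them
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[0], b[0]));

        int[][] out = new int[sorted.length][];
        int previous = 0;
        for (int i = 0; i < sorted.length; i++) {
            int[] e = sorted[i];
            out[i] = new int[] { e[0] - previous, e[1], e[2], e[3] };
            previous = e[0];
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] events = {
            { 0, 0x90, 60, 127 }, { 0, 0x90, 64, 127 }, { 0, 0x90, 67, 127 },
            { 16, 0x80, 60, 0 }, { 16, 0x80, 64, 0 }, { 16, 0x80, 67, 0 },
            { 32, 0x90, 72, 127 },   // after a crotchet rest
            { 48, 0x80, 72, 0 }
        };
        for (int[] e : toDeltas(events)) System.out.print(e[0] + " ");
        System.out.println();
    }
}
```

For the chord followed by a rest and a single note, the deltas come out as 0 0 0 16 0 0 16 16: only the first note-off of the chord carries the crotchet, and the rest shows up as the delta on the next note-on. Each delta must still fit whatever encoding the writer supports, which is one byte (at most 127) in this article's class.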

Musicians, I think, tend to visualize note timings in terms of sequences of durations — crotchet, quaver rest, quaver, etc. Thinking in terms of deltas is awkward, but unavoidable in all but simple cases. In the source code below, I include a method that takes as its input a sequence of notes and rests of given duration, and derives the deltas internally. This works for simple monophonic lines of any duration, but if you need polyphony, I'm afraid you have to do some math. Sorry.

Once you've got your head around ticks and deltas, everything else is straightforward. Each MIDI event that goes in the file is preceded by a delta (which will always be one byte in my example). Most events are two to four bytes long. A 'note on', for example, is the byte 0x90 followed by a byte that represents the note, then a byte that represents the strike velocity. In my source code I only demonstrate note-on, note-off, and instrument change events, in addition to metadata events for tempo, etc. But all the others are very similar, and reasonably well documented on the Web.
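The byte layout of these events can be sketched as below. This is an illustration rather than the article's code: EventBytes, status and sevenBit are hypothetical names, and the channel parameter is an addition (the MidiFile class always writes channel 0). The range checks reflect the fact that note, velocity and a one-byte delta must each fit in seven bits.

```java
// Hypothetical helper for illustration; not part of the MidiFile class.
class EventBytes {
    static final int NOTE_OFF = 0x80;
    static final int NOTE_ON = 0x90;

    /** Combine an event type (high nibble) with a channel 0..15 (low nibble). */
    static int status(int type, int channel) {
        if (channel < 0 || channel > 15)
            throw new IllegalArgumentException("channel must be 0..15");
        return type | channel;
    }

    /** Reject anything that does not fit in seven bits. */
    static int sevenBit(String name, int value) {
        if (value < 0 || value > 127)
            throw new IllegalArgumentException(name + " must be 0..127");
        return value;
    }

    /** Delta, status byte, note number, strike velocity. */
    static int[] noteOn(int delta, int channel, int note, int velocity) {
        return new int[] {
            sevenBit("delta", delta), status(NOTE_ON, channel),
            sevenBit("note", note), sevenBit("velocity", velocity)
        };
    }

    /** Delta, status byte, note number, release velocity of zero. */
    static int[] noteOff(int delta, int channel, int note) {
        return new int[] {
            sevenBit("delta", delta), status(NOTE_OFF, channel),
            sevenBit("note", note), 0
        };
    }

    public static void main(String[] args) {
        for (int b : noteOn(0, 0, 60, 127)) System.out.printf("%02X ", b);
        System.out.println();
    }
}
```

Failing early on an out-of-range delta is a design choice: a value above 127 written as a single byte would be read by a player as the start of a multi-byte delta and would corrupt everything after it.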

Java issues with MIDI files

The principal problem with manipulating MIDI in Java is that the protocol is expressed in terms of both signed and unsigned bytes, and Java does not have an unsigned byte data type. The decision to make the byte data type signed only was, in my view, a lamentably daft one, but we're stuck with it. You can't say, in Java:
  byte x = 0xFF;
because 0xFF is too big to fit a signed byte. It doesn't even wrap around to -1, which is what C does. It simply won't compile.

Although it seems inefficient, the only way I've found to deal with this situation elegantly is to define all bytes as (signed) ints, and ignore the top three bytes completely. This works because when you cast an int to a byte it has exactly the effect of masking off the top three bytes. In my source code, the method intArrayToByteArray does this conversion, and is used before all file writes.
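The effect of the cast can be seen in a few lines. This is a small sketch, with ByteCasts, toByte and toUnsigned as names made up for the example.

```java
// Hypothetical demonstration of int-to-byte narrowing in Java.
class ByteCasts {
    /** Keeps the low eight bits; the top three bytes are discarded. */
    static byte toByte(int unsignedValue) {
        return (byte) unsignedValue;
    }

    /** Recovers the 0..255 value from a signed byte. */
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = toByte(0xFF);      // compiles, unlike "byte x = 0xFF;"
        System.out.println(b);              // the signed view of the bits
        System.out.println(toUnsigned(b));  // the value MIDI cares about
    }
}
```

The bit pattern written to the file is the same either way; only Java's interpretation of it as a number changes, which is why the masking step is needed when reading bytes back.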

The odd thing about this situation is that, for individual values, the cast in intArrayToByteArray changes only the type and not the working size. This is because JVMs manipulate all integers smaller than 32 bits as if they were 32 bits when they sit in local variables or on the operand stack. Computationally this makes sense: on a 32-bit CPU you don't gain anything by working in smaller chunks. Arrays are different, though. Mainstream JVMs store a byte array at one byte per element, so an array of ints takes roughly four times the memory of an array of bytes of the same dimension.

What all this means is that when we define a file header like this:

static final int header[] = new int[]
{
0x4d, 0x54, 0x68, 0x64....
we use a few dozen bytes more storage than we would if we were allowed to say:
static final unsigned byte header[] = new unsigned byte[]
{
0x4d, 0x54, 0x68, 0x64....
For constant tables this small the extra storage is negligible. But we do waste some CPU cycles casting blocks of data from int to byte, when this operation does not, in fact, have any discernible effect on any values. Oh well, that's Java for you.

The other complication to be aware of relates to endian-ness. The MIDI file format is big-endian, but its integers can be between one and four bytes long. If all integers were four bytes, we could rely on the big-endianness of Java data storage (which is independent of the machine architecture) to write out integers directly. But since they aren't, there are places where we have to do the math and split an integer into byte chains of various sizes. The source code below does this where it is necessary. You'll note that I've made some simplifications where it's unlikely that the full range of the number will be needed. This is just to speed things up on the Android platform, which isn't overwhelmingly fast at Java math.
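A general version of that splitting might look like the sketch below. The class BigEndian and its split method are hypothetical; the MidiFile class uses hand-written special cases instead.

```java
// Hypothetical helper: splits an int into 1 to 4 big-endian bytes.
class BigEndian {
    /** Returns 'width' values in the range 0..255, most significant first. */
    static int[] split(int value, int width) {
        if (width < 1 || width > 4)
            throw new IllegalArgumentException("width must be 1..4");
        int[] out = new int[width];
        for (int i = 0; i < width; i++) {
            int shift = 8 * (width - 1 - i);   // highest byte comes first
            out[i] = (value >>> shift) & 0xFF;
        }
        return out;
    }

    public static void main(String[] args) {
        // The three tempo bytes for 1,000,000 microseconds per crotchet
        for (int b : split(1_000_000, 3)) System.out.printf("%02X ", b);
        System.out.println();
    }
}
```

Bits above the requested width are silently dropped, so the caller is responsible for checking that the value fits.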

The MidiFile class

The attached MidiFile class can be used as follows:

MidiFile mf = new MidiFile();

// Insert some notes

mf.noteOnOffNow (MidiFile.CROTCHET, 60, 127); // 60=Middle C
// Etc...

mf.writeToFile ("somefile.mid");

That's all there is to it. The attached code demonstrates various other ways in which notes can be inserted.

Limitations of the code, and possible improvements

I've tried to find the simplest, fastest way to write an uncomplicated MIDI file, which will work on the Android Java platform. There are many limitations, some of which you'd almost certainly want to do something about in a serious application.
  • Maximum note length is a semibreve. For music that is made up of crotchets, and fractions and multiples of crotchets, I would recommend using constants CROTCHET, MINIM, rather than numbers. That way you're less likely to write a delta that is too long and which will break the player. Moreover, you only have to adjust the constants if you decide to change the timebase. Remember that, unless you change the code, the maximum delta is 127 (just under 2*SEMIBREVE). Dealing with longer deltas (which you'll need to with a faster timebase) is not hugely difficult, but it will make the files much larger, and considerably increase the complexity of the arithmetic.
  • Fixed tempo of crotchet=60. Refer to the definition of tempoEvent to see where to change this. The tempo is a number of microseconds in 3-byte big-endian format. Some math will be required.
  • Only one controller channel. The channel number is specified as the second four-bit quantity (the low nibble) of the status byte in each event message. So to send a note-on to channel 1, you need to set the value 0x90 in the noteOn method to 0x90+channel, and similarly for all the other event types.
  • The only performance events handled are note on/off and program change. But with a copy of the MIDI spec to hand, it wouldn't be hard to add other events.
  • Data is buffered in memory. The Java class accumulates all MIDI events and writes them out to file in one go. In most cases this isn't a huge problem, because the kind of application I envisage won't be playing hours of music. But it would be more memory efficient to write out data in chunks and then free the memory. Of course, once the file was complete you'd have to go back and fill in the track data length.
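The last limitation, writing as you go and filling in the track length afterwards, can be sketched with java.io.RandomAccessFile. This is one possible approach and not the method used by the MidiFile class; StreamedTrack and its write method are names invented for the example, and the caller is assumed to supply events that already start with their delta byte.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch: writes events as they arrive, then patches the length.
class StreamedTrack {
    static void write(String filename, int[][] events) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(filename, "rw")) {
            f.setLength(0);  // discard any previous contents
            // File header: format 0, one track, 16 ticks per quarter note
            f.write(new byte[] { 'M', 'T', 'h', 'd', 0, 0, 0, 6, 0, 0, 0, 1, 0, 16 });
            f.write(new byte[] { 'M', 'T', 'r', 'k' });

            long lengthPos = f.getFilePointer();
            f.writeInt(0);                       // placeholder for the track length
            long dataStart = f.getFilePointer();

            for (int[] event : events)
                for (int b : event)
                    f.write(b);                  // write(int) keeps the low 8 bits

            // Footer: delta 1, then the end-of-track meta event
            f.write(0x01); f.write(0xFF); f.write(0x2F); f.write(0x00);
            long dataEnd = f.getFilePointer();

            // Go back and fill in the real length; writeInt is big-endian
            f.seek(lengthPos);
            f.writeInt((int) (dataEnd - dataStart));
        }
    }

    public static void main(String[] args) throws IOException {
        write("streamed.mid", new int[][] {
            { 0x00, 0x90, 60, 127 },
            { 0x10, 0x80, 60, 0 }
        });
    }
}
```

Byte-at-a-time writes to a RandomAccessFile are unbuffered, so a real implementation would probably collect each chunk in a small byte array before writing it.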
An interesting exercise would be to extend the code so that you could specify multiple lines of music (voices) in terms of pitch and duration, and have the program merge them into a single track, calculating the deltas as it goes. This would significantly improve the class's ability to generate more complex music, but it's beyond the needs of my simple Android applications.

Annotated source code

The main method in this class demonstrates three different ways in which the MidiFile class can be used: to specify note on and note off events at explicit deltas; to specify notes as durations, on the understanding that one follows the other without rests; and to specify notes as an array of values and durations, and let the method noteSequenceFixedVelocity take care of the rests.
/*
  A simple Java class that writes a MIDI file

  (c)2011 Kevin Boone, all rights reserved
*/
package com.kevinboone.music;
import java.io.*;
import java.util.*;


public class MidiFile
{
  // Note lengths
  //  We are working with 16 ticks to the crotchet. So
  //  all the other note lengths can be derived from this
  //  basic figure. Note that the longest note we can
  //  represent with this code is one tick short of
  //  two semibreves (i.e., 8 crotchets)

  static final int SEMIQUAVER = 4;
  static final int QUAVER = 8;
  static final int CROTCHET = 16;
  static final int MINIM = 32;
  static final int SEMIBREVE = 64;

  // Standard MIDI file header, for one-track file
  // 4D, 54... are just magic numbers to identify the
  //  headers
  // Note that because we're only writing one track, we
  //  can for simplicity combine the file and track headers
  static final int header[] = new int[]
     {
     0x4d, 0x54, 0x68, 0x64, 0x00, 0x00, 0x00, 0x06,
     0x00, 0x00, // single-track format
     0x00, 0x01, // one track
     0x00, 0x10, // 16 ticks per quarter
     0x4d, 0x54, 0x72, 0x6B
     };

  // Standard footer
  static final int footer[] = new int[]
     {
     0x01, 0xFF, 0x2F, 0x00
     };

  // A MIDI event to set the tempo
  static final int tempoEvent[] = new int[]
     {
     0x00, 0xFF, 0x51, 0x03,
     0x0F, 0x42, 0x40 // Default 1 million usec per crotchet
     };

  // A MIDI event to set the key signature. This is irrelevant to
  //  playback, but necessary for editing applications
  static final int keySigEvent[] = new int[]
     {
     0x00, 0xFF, 0x59, 0x02,
     0x00, // C
     0x00  // major
     };


  // A MIDI event to set the time signature. This is irrelevant to
  //  playback, but necessary for editing applications
  static final int timeSigEvent[] = new int[]
     {
     0x00, 0xFF, 0x58, 0x04,
     0x04, // numerator
     0x02, // denominator (2==4, because it's a power of 2)
     0x30, // ticks per click (not used)
     0x08  // 32nd notes per crotchet 
     };

  // The collection of events to play, in time order
  protected Vector<int[]> playEvents;

  /** Construct a new MidiFile with an empty playback event list */
  public MidiFile()
  {
    playEvents = new Vector<int[]>();
  }


  /** Write the stored MIDI events to a file */
  public void writeToFile (String filename)
    throws IOException
  {
    FileOutputStream fos = new FileOutputStream (filename);


    fos.write (intArrayToByteArray (header));

    // Calculate the amount of track data
    // _Do_ include the footer but _do not_ include the 
    // track header

    int size = tempoEvent.length + keySigEvent.length + timeSigEvent.length
      + footer.length;

    for (int i = 0; i < playEvents.size(); i++)
      size += playEvents.elementAt(i).length;

    // Write out the track data size in big-endian format
    // Note that this math is only valid for up to 64k of data
    //  (but that's a lot of notes) 
    int high = size / 256;
    int low = size - (high * 256);
    fos.write ((byte) 0);
    fos.write ((byte) 0);
    fos.write ((byte) high);
    fos.write ((byte) low);


    // Write the standard metadata — tempo, etc
    // At present, tempo is stuck at crotchet=60 
    fos.write (intArrayToByteArray (tempoEvent));
    fos.write (intArrayToByteArray (keySigEvent));
    fos.write (intArrayToByteArray (timeSigEvent));

    // Write out the note, etc., events
    for (int i = 0; i < playEvents.size(); i++)
    {
      fos.write (intArrayToByteArray (playEvents.elementAt(i)));
    }

    // Write the footer and close
    fos.write (intArrayToByteArray (footer));
    fos.close();
  }


  /** Convert an array of integers which are assumed to contain
      unsigned bytes into an array of bytes */
  protected static byte[] intArrayToByteArray (int[] ints)
  {
    int l = ints.length;
    byte[] out = new byte[ints.length];
    for (int i = 0; i < l; i++)
    {
      out[i] = (byte) ints[i];
    }
    return out;
  }


  /** Store a note-on event */
  public void noteOn (int delta, int note, int velocity)
  {
  int[] data = new int[4];
  data[0] = delta;
  data[1] = 0x90;
  data[2] = note;
  data[3] = velocity;
  playEvents.add (data);
  }


  /** Store a note-off event */
  public void noteOff (int delta, int note)
  {
  int[] data = new int[4];
  data[0] = delta;
  data[1] = 0x80;
  data[2] = note;
  data[3] = 0;
  playEvents.add (data);
  }


  /** Store a program-change event at current position */
  public void progChange (int prog)
  {
  int[] data = new int[3];
  data[0] = 0;
  data[1] = 0xC0;
  data[2] = prog;
  playEvents.add (data);
  }


  /** Store a note-on event followed by a note-off event a note length
      later. There is no delta value — the note is assumed to
      follow the previous one with no gap. */
  public void noteOnOffNow (int duration, int note, int velocity)
  {
  noteOn (0, note, velocity);
  noteOff (duration, note);
  }


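  /** Store a sequence of notes and rests. The array holds
      (note, duration) pairs; a negative note value denotes a
      rest of the given duration. All notes are played at the
      same velocity. */
  public void noteSequenceFixedVelocity (int[] sequence, int velocity)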
  {
    boolean lastWasRest = false;
    int restDelta = 0;
    for (int i = 0; i < sequence.length; i += 2)
    {
      int note = sequence[i];
      int duration = sequence[i + 1];
      if (note < 0)
      {
        // This is a rest
        restDelta += duration;
        lastWasRest = true;
      }
      else
      {
        // A note, not a rest
        if (lastWasRest)
        {
          noteOn (restDelta, note, velocity);
          noteOff (duration, note);
        }
        else
        {
          noteOn (0, note, velocity);
          noteOff (duration, note);
        }
        restDelta = 0;
        lastWasRest = false;
      }
    }
  }


  /** Test method — creates a file test1.mid when the class
      is executed */
  public static void main (String[] args)
    throws Exception
  {
    MidiFile mf = new MidiFile();

    // Test 1 — play a C major chord

    // Turn on all three notes at start-of-track (delta=0) 
    mf.noteOn (0, 60, 127);
    mf.noteOn (0, 64, 127);
    mf.noteOn (0, 67, 127);

    // Turn off all three notes after one minim. 
    // NOTE each delta is measured from the previous event, so
    //  only _one_ of these note-offs has a non-zero delta. The
    //  second and third events are relative to the first
    mf.noteOff (MINIM, 60);
    mf.noteOff (0, 64);
    mf.noteOff (0, 67);

    // Test 2 — play a scale using noteOnOffNow
    //  We don't need any delta values here, so long as one
    //  note comes straight after the previous one 

    mf.noteOnOffNow (QUAVER, 60, 127);
    mf.noteOnOffNow (QUAVER, 62, 127);
    mf.noteOnOffNow (QUAVER, 64, 127);
    mf.noteOnOffNow (QUAVER, 65, 127);
    mf.noteOnOffNow (QUAVER, 67, 127);
    mf.noteOnOffNow (QUAVER, 69, 127);
    mf.noteOnOffNow (QUAVER, 71, 127);
    mf.noteOnOffNow (QUAVER, 72, 127);

    // Test 3 — play a short tune using noteSequenceFixedVelocity
    //  Note the rest inserted with a note value of -1

    int[] sequence = new int[]
      {
      60, QUAVER + SEMIQUAVER,
      65, SEMIQUAVER,
      70, CROTCHET + QUAVER,
      69, QUAVER,
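      // Approximate triplet: integer division makes each note
      //  QUAVER / 3 == 2 ticks, so the three total 6 ticks, not 8
      65, QUAVER / 3,
      62, QUAVER / 3,
      67, QUAVER / 3,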
      72, MINIM + QUAVER,
      -1, SEMIQUAVER,
      72, SEMIQUAVER,
      76, MINIM,
      };

    // What the heck — use a different instrument for a change
    mf.progChange (10);

    mf.noteSequenceFixedVelocity (sequence, 127);

    mf.writeToFile ("test1.mid");
  }
}

Copyright © 1994-2013 Kevin Boone. Updated Feb 08 2013