This post describes how to build the simplest possible MIDI app (which you can download already built for free from the iOS app store). It’s called Jamulator, like a calculator but for jamming.
The open-source code for the app is here.
If you want to build an app using MIDI, you will want a sound font, and you’ll probably want AudioKit. This version of AudioKit has been updated for Swift 3.2.
Why Use MIDI?
MIDI is a standard music protocol that lets you generate very realistic music in a very simple way. No more cheesy old-school computer music from the days of Pac-Man (which was pretty good back in 1980).
Video: https://www.youtube.com/watch?v=BxYzjjs6d1s
Nowadays many apps use recorded sound, but you will find that MIDI is a much more flexible way to make music than using recorded audio files. Using a selection of 128 high-quality instrument sounds that closely approximate their real-world counterparts, it will enable you to:
Make the music in your app sound "real".
Play chords and melodies.
Record music and play it back.
Change melody, harmony, rhythm, and pitch in response to events.
Caveat: Both the MIDI sound font and AudioKit are quite large, weighing in at about 150 MB for the font and more than 100 MB for AudioKit. AudioKit is optional; in this app, it is used only to display a piano keyboard. Smaller sound fonts are available, but they may be of lesser quality.
A Brief History of MIDI
In the early 1980s, synthesizers, drum machines, and even automatic bass players were being introduced into the mass musical instrument market. Just a few years before that, synthesizers were uncommon.
They were cumbersome to use, involving complicated patch bays similar to old-time telephone switchboards. Each sound, or “patch,” was hand-tuned using a variety of knobs. These synthesizers had no way to remember these settings, so players did their best to recreate sounds by hand when they were needed.
These early instruments had their own synthetic sounds and were not commonly used to reproduce the sounds of traditional instruments. A good example of the sound of early synthesis is The Edgar Winter Group’s “Frankenstein.”
Video: https://www.youtube.com/watch?v=RSLP1FCREBA (starting at 0:52)
Notwithstanding Edgar Winter, Pink Floyd, The Beatles, and such, working musicians were, on the whole, unenthused by these experimental sounds. Much more relevant to their working lives would be having high-quality prepackaged instrument sounds and the capability to record and play these back without complicated tape loop setups.
Enter the Musical Instrument Digital Interface, aka MIDI, with a very compact software protocol and standard cables connecting to standard hardware ports. Key players in the musical instrument industry got together and agreed on a standard so musicians could connect these devices together. Their focus was on performing, not twiddling knobs to get a brand new sound.
Apple’s Sound Infrastructure
You have to jump through quite a few hoops to get to the point where you can play a MIDI note using Apple’s Core Audio API. Learning how this plumbing fits together will enable you to use more Core Audio features in the future, including mixers and effects like delay and reverb.
The AUGraph
First you need a graph, specifically an AUGraph. You will add some nodes (you guessed it, AUNodes) to the graph in order to connect parts of the audio subsystem.
A graph with nodes is a very generic, non-audio-specific way of describing what’s getting connected. I like to think of the graph as a patch bay in my studio. Everything has to go through it in order to participate in the final sound output, but it’s just a connector. Nodes are like connections. You need a specific kind of node to connect to a specific kind of audio component.
You’ll need two specific nodes, a synth node and an output node, in order to make sounds with MIDI. The nodes are assigned to the graph, and an audio unit of type synth is assigned to the synth node.
```swift
var processingGraph: AUGraph?
var midisynthNode = AUNode()
var ioNode = AUNode()
var midisynthUnit: AudioUnit?
```
Start with the graph:
```swift
func initAudio() {
    // the graph is like a patch bay, where everything gets connected
    checkError(osstatus: NewAUGraph(&processingGraph))
}
```
You will notice that the processingGraph parameter that you pass to NewAUGraph is always passed by reference (i.e., with a leading & character) because the function expects an UnsafeMutablePointer. Unsafe pointer types are used all over the place in Apple’s audio APIs, so get used to them. This also forces you to declare audio variables as optionals and then unwrap them when you pass them as parameters, which is an anti-pattern in Swift.
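One more thing you will see in every snippet here is checkError(osstatus:). This post never shows it, and the version in the Jamulator repo may differ, but a minimal sketch that just logs failures might look like this:

```swift
import AudioToolbox

// Minimal sketch of a checkError(osstatus:) helper (assumed, not Jamulator's
// actual code): an OSStatus of 0 (noErr) means success, anything else is logged.
func checkError(osstatus: OSStatus) {
    if osstatus != 0 {
        print("Core Audio error: \(osstatus)")
    }
}
```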
Next we need an I/O node and a synth node:
```swift
// MARK: - Audio Init Utility Methods

private func createIONode() {
    var cd = AudioComponentDescription(
        componentType: OSType(kAudioUnitType_Output),
        componentSubType: OSType(kAudioUnitSubType_RemoteIO),
        componentManufacturer: OSType(kAudioUnitManufacturer_Apple),
        componentFlags: 0,
        componentFlagsMask: 0)
    checkError(osstatus: AUGraphAddNode(processingGraph!, &cd, &ioNode))
}

private func createSynthNode() {
    var cd = AudioComponentDescription(
        componentType: OSType(kAudioUnitType_MusicDevice),
        componentSubType: OSType(kAudioUnitSubType_MIDISynth),
        componentManufacturer: OSType(kAudioUnitManufacturer_Apple),
        componentFlags: 0,
        componentFlagsMask: 0)
    checkError(osstatus: AUGraphAddNode(processingGraph!, &cd, &midisynthNode))
}
```
These component descriptions look gnarly, but the componentType and componentSubType fields are the only fields that vary. Some of the options are delays, distortion, filters, reverb, and more. Check out Apple's Effect Audio Unit Subtypes to see what’s available.
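For instance, swapping in an effect type and subtype gives you an effect node instead of a synth. The sketch below is a hypothetical illustration (not part of Jamulator) using Apple's Reverb2 effect; the reverbNode variable is assumed:

```swift
// Hypothetical example: the same AUGraphAddNode pattern, but describing
// Apple's Reverb2 effect unit instead of the MIDI synth.
var reverbNode = AUNode()

private func createReverbNode() {
    var cd = AudioComponentDescription(
        componentType: OSType(kAudioUnitType_Effect),
        componentSubType: OSType(kAudioUnitSubType_Reverb2),
        componentManufacturer: OSType(kAudioUnitManufacturer_Apple),
        componentFlags: 0,
        componentFlagsMask: 0)
    checkError(osstatus: AUGraphAddNode(processingGraph!, &cd, &reverbNode))
}
```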
Now we get a reference to the synthesizer AudioUnit:
```swift
func initAudio() {
    // the graph is like a patch bay, where everything gets connected
    checkError(osstatus: NewAUGraph(&processingGraph))
    createIONode()
    createSynthNode()
    checkError(osstatus: AUGraphOpen(processingGraph!))
    checkError(osstatus: AUGraphNodeInfo(processingGraph!, midisynthNode, nil, &midisynthUnit))
}
```
The AudioUnit called midisynthUnit is the workhorse of this app. It gets passed to MusicDeviceMIDIEvent to change voices, turn notes on, and turn notes off.

The midisynthNode and the ioNode must be connected to each other in the graph by calling AUGraphConnectNodeInput. Then you are free to call AUGraphInitialize and AUGraphStart:
```swift
func initAudio() {
    // the graph is like a patch bay, where everything gets connected
    checkError(osstatus: NewAUGraph(&processingGraph))
    createIONode()
    createSynthNode()
    checkError(osstatus: AUGraphOpen(processingGraph!))
    checkError(osstatus: AUGraphNodeInfo(processingGraph!, midisynthNode, nil, &midisynthUnit))

    let synthOutputElement: AudioUnitElement = 0
    let ioUnitInputElement: AudioUnitElement = 0
    checkError(osstatus: AUGraphConnectNodeInput(processingGraph!,
                                                 midisynthNode, synthOutputElement,
                                                 ioNode, ioUnitInputElement))

    checkError(osstatus: AUGraphInitialize(processingGraph!))
    checkError(osstatus: AUGraphStart(processingGraph!))
}
```
Now the graph is populated, initialized, and started. It’s ready to receive some MIDI note commands.
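If you want to confirm that everything got wired up the way you intended, Core Audio can print the graph's nodes and connections to the console. This is just an optional debugging aid, not something Jamulator does:

```swift
// Optional debugging aid: dump the graph to the console.
// AUGraph is an OpaquePointer, so it is converted for CAShow.
CAShow(UnsafeMutableRawPointer(processingGraph!))
```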
Playing MIDI notes
These two methods send MIDI note-on and note-off commands to the synthesizer:
```swift
func noteOn(note: UInt8) {
    let noteCommand = UInt32(0x90 | midiChannel)
    let base = note - 48
    let octaveAdjust = (UInt8(octave) * 12) + base
    let pitch = UInt32(octaveAdjust)
    checkError(osstatus: MusicDeviceMIDIEvent(self.midisynthUnit!,
                                              noteCommand, pitch, UInt32(self.midiVelocity), 0))
}

func noteOff(note: UInt8) {
    let channel = UInt32(0)
    let noteCommand = UInt32(0x80 | channel)
    let base = note - 48
    let octaveAdjust = (UInt8(octave) * 12) + base
    let pitch = UInt32(octaveAdjust)
    checkError(osstatus: MusicDeviceMIDIEvent(self.midisynthUnit!,
                                              noteCommand, pitch, 0, 0))
}
```
MIDI note-on and note-off commands (and, for that matter, all MIDI channel messages) share the same pattern: the command byte is constructed by a bitwise OR of the upper four bits (called a nybble), which identify the command, with the lower four bits, which contain the channel to send the command to. For note-on, the upper nybble is 0x90; for note-off, it's 0x80. In this app, the lower channel nybble is always 0.
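Here is that bit pattern spelled out. The helper below is hypothetical, for illustration only, and is not part of Jamulator:

```swift
// Hypothetical helper: command nybble in the high four bits,
// channel number in the low four bits.
func midiStatus(command: UInt8, channel: UInt8) -> UInt32 {
    return UInt32(command | (channel & 0x0F))
}

let noteOnChannel0  = midiStatus(command: 0x90, channel: 0)  // 0x90
let noteOffChannel3 = midiStatus(command: 0x80, channel: 3)  // 0x83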
The piano keyboard, which is provided by AudioKit, shows two octaves at a time, defaulting to octaves 4 and 5. With 12 keys in an octave, the lowest note would be note 48. The piano keyboard knows nothing about the UISegmentedControl used for octave selection in Jamulator, so the octave is stripped out of the note the keyboard sends, and then the value from the octave control gets added back in, resulting in the final pitch.
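Here is that arithmetic with concrete numbers; the values below are assumed for illustration, not taken from the app:

```swift
// The keyboard's lowest key sends MIDI note 48 (octave 4 by default).
let note: UInt8 = 60                     // key one octave above the lowest key
let octave: UInt8 = 2                    // value from the octave UISegmentedControl
let base = note - 48                     // strip the keyboard's built-in octave -> 12
let pitch = UInt32(octave * 12 + base)   // final MIDI pitch -> 36
```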
Adding an AKKeyboardView
At the bottom of MIDIInstrumentViewController.swift, you will find an extension to the class. Extensions are a nice, clean way to implement a protocol, and the AKKeyboardView class requires conformance to the AKKeyboardDelegate protocol:
```swift
// MARK: - AKKeyboardDelegate
// the protocol for the piano keyboard needs methods to turn notes on and off
extension MIDIInstrumentViewController: AKKeyboardDelegate {
    func noteOn(note: MIDINoteNumber) {
        synth.noteOn(note: UInt8(note))
    }

    func noteOff(note: MIDINoteNumber) {
        synth.noteOff(note: UInt8(note))
    }
}
```
Now find the setUpPianoKeyboard method, which looks like this:
```swift
func setUpPianoKeyboard() {
    let keyboard = AKKeyboardView(frame: ScreenUtils.resizeRect(
        rect: CGRect(x: 40, y: 0, width: 687, height: 150)))
    keyboard.delegate = self
    keyboard.polyphonicMode = true  // allow more than one note at a time
    self.view.addSubview(keyboard)
}
```
There are other optional keyboard properties (not shown above), but you really only need the delegate. polyphonicMode is needed if you want to play more than one note at a time; playing chords is a good example of polyphony.
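For instance, with polyphonicMode on, a chord is just several note-ons before any note-offs. A minimal sketch, assuming the synth object used elsewhere in this post:

```swift
// C major triad: C4, E4, G4 in MIDI note numbers, sounded together.
let cMajor: [UInt8] = [60, 64, 67]
cMajor.forEach { synth.noteOn(note: $0) }
// ...later, when the chord should stop sounding:
cMajor.forEach { synth.noteOff(note: $0) }
```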
Loading a voice
Voices are also known as patches.
```swift
func loadPatch(patchNo: Int) {
    let channel = UInt32(0)
    var enabled = UInt32(1)
    var disabled = UInt32(0)
    patch1 = UInt32(patchNo)

    checkError(osstatus: AudioUnitSetProperty(
        midisynthUnit!,
        AudioUnitPropertyID(kAUMIDISynthProperty_EnablePreload),
        AudioUnitScope(kAudioUnitScope_Global),
        0,
        &enabled,
        UInt32(MemoryLayout<UInt32>.size)))

    let programChangeCommand = UInt32(0xC0 | channel)
    checkError(osstatus: MusicDeviceMIDIEvent(midisynthUnit!,
                                              programChangeCommand, patch1, 0, 0))

    checkError(osstatus: AudioUnitSetProperty(
        midisynthUnit!,
        AudioUnitPropertyID(kAUMIDISynthProperty_EnablePreload),
        AudioUnitScope(kAudioUnitScope_Global),
        0,
        &disabled,
        UInt32(MemoryLayout<UInt32>.size)))

    // the previous programChangeCommand just triggered a preload
    // this one actually changes to the new voice
    checkError(osstatus: MusicDeviceMIDIEvent(midisynthUnit!,
                                              programChangeCommand, patch1, 0, 0))
}
```
loadPatch is called whenever the user chooses a new voice. Note that in MIDI lingo, a program change command means to change voices.

Preload is enabled in order to load the new voice. After the program change command that triggers the preload is issued, you disable preload. Preload is a mode of the midiSynth, so you must turn it off again in order to actually change voices with the final program change command.
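As a usage example (patch numbers follow the zero-based General MIDI program list, and the synth object is the wrapper used throughout this post):

```swift
synth.loadPatch(patchNo: 0)    // Acoustic Grand Piano
synth.loadPatch(patchNo: 40)   // Violin
synth.loadPatch(patchNo: 56)   // Trumpet
```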
Now would you like some sounds to play with? Sounds good to me!
The sound font
I have to admit that calling a collection of musical instrument voices a sound font would never have occurred to me, but it really fits. And since there are 128 sounds, it’s almost like musical ASCII.
The sound font is loaded in the background, separately from the audio initialization:
```swift
func loadSoundFont() {
    var bankURL = Bundle.main.url(forResource: "FluidR3 GM2-2", withExtension: "SF2")
    checkError(osstatus: AudioUnitSetProperty(midisynthUnit!,
        AudioUnitPropertyID(kMusicDeviceProperty_SoundBankURL),
        AudioUnitScope(kAudioUnitScope_Global),
        0,
        &bankURL,
        UInt32(MemoryLayout<URL>.size)))
}
```
Here again, the midisynthUnit is referenced in order to set the sound bank URL property. There are many sound banks out on the web, so you might want to experiment with some of them.
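If you do experiment, a parameterized version of the loader might look like the sketch below. This is not Jamulator's actual code; the file name is a placeholder, and the font has to be bundled with the app:

```swift
// Hedged sketch: same property set as above, but with the file name passed in.
// "MyOtherBank" is a placeholder; any General MIDI-compatible .sf2 file works.
func loadSoundFont(named name: String) {
    guard var bankURL = Bundle.main.url(forResource: name, withExtension: "sf2") else {
        print("sound font \(name) not found in the app bundle")
        return
    }
    checkError(osstatus: AudioUnitSetProperty(midisynthUnit!,
        AudioUnitPropertyID(kMusicDeviceProperty_SoundBankURL),
        AudioUnitScope(kAudioUnitScope_Global),
        0,
        &bankURL,
        UInt32(MemoryLayout<URL>.size)))
}

// Usage: loadSoundFont(named: "MyOtherBank")
```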
Near the top of VoiceSelectorView.swift is an array of voice names for this sound bank, which conforms to the General MIDI 2 standard set of voices.
In MIDIInstrumentViewController.swift, the method loadVoices looks like this:
```swift
// load the voices in a background thread
// on completion, tell the voice selector so it can display them
// again, might only matter in the simulator
func loadVoices() {
    DispatchQueue.global(qos: .background).async {
        self.synth.loadSoundFont()
        self.synth.loadPatch(patchNo: 0)
        DispatchQueue.main.async {
            // don't let the user choose a voice until they finish loading
            self.voiceSelectorView.setShowVoices(show: true)
            // don't let the user use the sequencer until the voices are loaded
            self.setUpSequencer()
        }
    }
}
```
DispatchQueue.global(qos: .background).async is the current preferred way to execute code on a background thread.

The main UI thread gets called back via DispatchQueue.main.async at the end to remove the voices-loading message and replace it with the custom voice selector control, and also to initialize the sequencer, which unhides the UISegmentedControl that allows you to record and play sequences. So it was a small fib on my part to say this is the simplest possible MIDI app.
The sequencer is not part of the minimum feature set, but is included to illustrate some of the power of using MIDI. The code for sequencing is not explained in this post, but it shouldn’t be too hard to understand.
The 128 voices
The custom voice selection control shows 16 categories in its top half. When a category is selected, the eight voices in that category are shown in the bottom half of the control. There’s quite a variety of sounds here, ranging from pretty common instruments like piano, organ, and guitar, to exotic sounds like steel drum, sitar, and even sound effects like helicopter or gunshot.
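The arithmetic behind that layout is simple: 16 categories of 8 voices each gives the 128 General MIDI patches. The function below is a hypothetical illustration of that mapping, not the app's actual selector code:

```swift
// Hypothetical illustration of the 16 x 8 voice layout.
func patchNumber(categoryIndex: Int, voiceIndex: Int) -> Int {
    return categoryIndex * 8 + voiceIndex   // e.g. category 3, voice 2 -> patch 26
}
```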
One of my favorite things to do with Jamulator is to try out extremely high or low octave settings. The voices weren’t meant to be used that way, and you can make some intriguing sounds that seem to have nothing to do with the instrument you’ve selected. Take some time to explore the different voices.
Where To Go From Here
Want to experiment some more? Gene De Lisa’s software development blog has many interesting Swift/iOS audio articles, including The Great AVAudioUnitSampler workout.