Sonic Pi: Ruby as a Composition Tool

Like the blip of an intro on the front page says, my degree was originally in music. My running joke as a web dev is that neither has meaningfully required me to count past 32. And while my main concentration was vocals, I've since realized I should probably stop strictly calling this a nontechnical field, because my actual major was recording — even if I did primarily branch out into this for the sake of tracking my own material.

That last part fell off for a few reasons. First of all, I pretty quickly fell into tech work just by happenstance, and it happened to take. I also didn't have the space or resources or skillset to realistically amass a lot of different instruments. (Or other audio equipment.) I did pick up bass competently enough on a peer-pressure-induced lark, and it stuck after Scott Pilgrim (that was a joke... it was FLCL), and I learned enough piano in the course of my major to passably self-accompany, but six-string guitars elude me about as much as consistently organizing with a group of other people who play things.

But I also mostly learned to track live instruments, and the small, disorganized stabs I took at electronic music never stuck. Something about a whole other set of overwhelm around picking up synths as instruments, I guess, even if I'm pretty familiar with audio workstations conceptually. But more recently, after a series of constraints that put all the instruments I do have into storage, I've taken a dive back into what was also one of my first attempts to learn how to code: Sonic Pi. Ironically enough, as I've started making better sense of the language that underpins it, I've also started feeling some of my prior knowledge around audio engineering click in new and different ways.

Sonic Pi, created by Sam Aaron, is a very different beast from most audio applications: it's a software synth controlled entirely through code. It comes with its own control language (a domain-specific language, or DSL) that extends Ruby to map various music and audio concepts onto it. So for instance, you'll find note names as symbols, like :c4, corresponding to their equivalent MIDI note numbers. You'll find chord and scale constructors that take notes and chord/scale structures as arguments, such as chord(:d3, :maj7). There's a play that's used in conjunction with Ruby's native sleep (sort of... more on that in a second), and a play_pattern_timed that abridges this for you by taking a list of notes and a time interval. (Quarter/half/etc notes are just plain numbers here, and hopefully don't require more explanation.)
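
Just to make those pieces concrete before the real thing, a few throwaway lines (not part of the demo below, and the scale run is purely for illustration) might look like this:

# a single note, by name
play :c4
sleep 1

# a chord constructor: the notes in it play together
play chord(:d3, :maj7)
sleep 1

# a scale constructor, run as an ascending line
play_pattern_timed scale(:c3, :major), 0.5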

The goal of this project was to track one demo. Compose one instrumental backing, purely by writing code, without the use of anything this tool didn't come with out of the box. Because sure, I could use MIDI, or external samples, but then I'm back to rabbit-holing about other audio tools.

I did accomplish this, but it's a little long for this piece, so for now let's do something simpler. (And a little less depressing. Besides, my mic was missing for a while during that same storage shuffle, so I never got around to tracking vocals for that anyway.)

If you install the app itself, you can follow along by copy-pasting the code below. (Note that for length I won't be repeating everything, so at points where I mention I'm reusing sections, or for persistent values like bpm or synth settings, just scroll up.) The app also comes with a detailed set of documentation for the language out of the box, as well as a series of tutorials.

# bpm defaults to 60, but we can change this
# other time values will scale accordingly,
# including the time interval going into `sleep`
use_bpm 70
use_synth :pulse # this defaults to a sine wave

play :c2
sleep 0.25
play :d2
sleep 0.25
play :e2
sleep 0.25
play :g2
sleep 0.25

# these are mostly equivalent, except that this
#  also sets a `sustain` of 0.25 to each note
# (more on options in a second)
play_pattern_timed [ :c3, :d3, :e3, :g3 ], 0.25

Of course, I can also just demonstrate the audio itself:

(Note: Some examples below won't be full blocks of code, or just don't demonstrate audible changes, so I won't be doing this for all of them.)

Since this is built on top of plain Ruby, we can abridge this, and make it more flexible so we're not repeating a lot of code.

We'll define the whole sequence here:


def arpeggiate root, is_minor = false
  # ascending sequence
  # repeat the same pattern,
  #  at different octaves
  4.times do
    # modify the third based on the optional second argument
    #  we'll use this later
    third = 4
    third -= 1 if is_minor
    # Ruby supports trailing conditional statements. In addition to
    #  `if`, you can also check if the condition is *false*
    #  using `unless`

    sequence = [ 0, 2, third, 7 ].map { |note| root + note }
    
    play_pattern_timed sequence, 0.25

    # move the root of the sequence up one octave
    root += 12
  end

  # descending sequence
  # the same pattern, in reverse
  4.times do
    third = 8
    third += 1 if is_minor
    
    sequence = [ 0, 5, third, 10 ].map { |note| root - note }

    play_pattern_timed sequence, 0.25

    root -= 12
  end
end

And we can then run through this same pattern several times, at different starting points:

in_thread do
  # define synth tones
  # you can pass additional options into `play` 
  #  or `play_pattern_timed` directly, but you 
  #  can also set them as defaults up front
  # (amp indicates volume level, but there are of course more)
  use_synth :pulse
  use_synth_defaults amp: 0.1

  2.times do
    # parentheses are optional for functions in Ruby
    # `play 60` is the same as `play(60)`
    # but in some cases you may need clearer separation
    #  like `play chord(:a3, :maj7)`
    arpeggiate(:c3)
    arpeggiate(:a2, true)
  end

  arpeggiate(:f2)
  arpeggiate(:g2)
  arpeggiate(:ab2)
  arpeggiate(:bb2)
end 

You might be wondering about that in_thread do block.

Sonic Pi also uses threads to run code in parallel. So by wrapping the above in one, we can layer two separate "instruments" on top of each other.

We could also, say...


in_thread do
  # ...the block from earlier?
end

in_thread do
  use_synth :saw
  use_synth_defaults amp: 0.2

  melody = [
    :c5, :b4, :d5,
    :c5,
    :c5, :b4, :d5,
    :d5, :e5, :c5,
    :a4, :g4, :a4,
    :b4, :c5, :d5, :g5,
    :f5, :eb5, :d5, :c5,
    :g5, :f5, :eb5, :d5,
  ]
  
  # play_pattern_timed can also take a list of time intervals
  # you can also build that list out of smaller lists
  
  rhythm_a = [ 4, 2, 2 ]

  
  # the math operations here work on the lists themselves:
  #  multiplying a list repeats its contents, extending its length
  # so `[1] * 2` gives you `[1, 1]` and so on
  # (this is also equivalent to `Array.new(2, 1)`)
  # `play_pattern_timed` doesn't take nested arrays...
  rhythm = [
    rhythm_a,
    8,
    rhythm_a,
    [0.5] * 2, 7,
    rhythm_a,
    3, 1, [2] * 2,
    [ 1, [0.5] * 2, 6 ] * 2
  ].flatten
  # ...*but*, by using `flatten`, we can spread its contents
  #  into a single layer
  
  play_pattern_timed melody, rhythm
end

We could layer these in a couple of different ways to get a "choir" here. You can manually specify chords to play, but you might not have every part of one in a single section, or might want them spread out in specific ways. In that case, you could construct layered notes manually using ring and then just pass them into one list:

choir = [
  ring(:c5, :e5), ring(:c5, :e5), ring(:d5, :f5),
  ring(:c5, :e5),
  # ...and so on
]

# play_pattern_timed choir, [ 4, 2, 2, 8 ]

# as a side note, the `flatten` trick above wouldn't work here
#  because `ring`s are also lists, so they'd also get merged
# while `flatten` does allow you to specify depth, your
#  patterns might not be uniform enough for that
# instead, you can also spread out a single list's contents
#  (or splat, I guess, as the operator is called in Ruby)
#  into its parent by adding a leading `*`, such as:
#  `[ *[ ring(:c5, :e5) ] * 4, *[ ring(:b4, :d5) ] * 2 ]`
#  which would produce a single list of 6 rings

But real choirs don't all sing one rhythmic pattern in unison, and this doesn't need to either. We can also nest threads to share parts across different voices.

in_thread do
  use_synth :saw
  
  # using the outer scope that the deeper levels can access,
  #  we can construct shared rhythmic or melodic sections

  rhythm_a = [ 4, 2, 2 ]

  # effects also run in similar blocks. Signal flow starts at the deepest layer,
  #  and effects chains run from inner layers out.
  # mix represents the level of blend between wet and dry signal —
  #  that is, signal *with* the effect, and signal *without* it
  with_fx :reverb, mix: 0.2 do
    # so if I added another one here, it would run before the reverb
    # with_fx :echo do

  # we could also wrap the whole group in a single effect
  #  in a similar fashion to buses on a DAW
  # however, deeper grouping than that is pretty manual;
  #  there's no routing to speak of, and scopes limit
  #  you to working in a mostly top-down fashion
  # you can duplicate these things to a reasonable degree,
  #  but I'd personally recommend treating it as less
  #  of a production tool and more of an instrument

    # soprano
    in_thread do
      use_synth_defaults amp: 0.25
        
      melody = [
        :c5, :b4, :d5,
        :c5, #...
      ]

      # here we can take the pieces from above
      #  and construct a variation that exists
      #  only inside this block
      rhythm = [
        rhythm_a, #...
      ].flatten
        
      play_pattern_timed melody, rhythm
    end

    # alto
    in_thread do
      use_synth_defaults amp: 0.2

      # each block has its own scope, so they can each
      #  use these same variable names
      # because neither can directly access the other
      melody = [
        :c5, :b4, :d5,
        :c5, #...
      ]
        
      # you can assemble a different pattern here, using pieces
      #  from the outer layer
      
      # you could also construct longer ones outside — like, say:
      #  `rhythm_a = [rhythm_a1, rhythm_a2].flatten`
      #  and then modify them using list operations
      #  but I'm not going to get too deep into that here

      # the important part is:
      #  we can keep referencing the pieces used in the
      #  outer scope, while still being able to reuse *names*
      #  in each inner scope
      rhythm = [
        rhythm_a #...
      ].flatten

      play_pattern_timed melody, rhythm
    end
  end
end

We could give them different voicings with distinct rhythmic patterns. But the source piece (the Prelude, from Final Fantasy) has numerous arrangements that don't always add that much complexity to its layers, so we don't have to do that here.

So let's go back to the rings. Since play_pattern_timed adds a sustain value to each note, we can set that manually instead, slightly short of each interval, to leave a little separation between chords. It would look like:

# amp applies to the total volume of this synth — each individual
#  note will be quieter if you're playing multiple layers together
use_synth_defaults release: 0.2, amp: 0.2

# these rings have to be nested, because if not, `play_pattern_timed`
#  will see the underlying list and play the notes in a sequence
#  instead of together
play_pattern_timed [ring(:c5, :e5)], 4, sustain: 3.8
play_pattern_timed [ring(:b4, :d5)], 2, sustain: 1.8
play_pattern_timed [ring(:d5, :f5)], 2, sustain: 1.8
play_pattern_timed [ring(:c5, :e5)], 8, sustain: 7.8

# ...and so on

# this *could* be:

# play ring(notes), sustain: length
# sleep length

# but we don't *need* that if we're overriding sustain anyway

But that's kind of verbose, and manually handling the offsetting is kind of a pain in the ass, so we can define a function for this.

Before I get to that, though, let me explain some of these parameters above a little more. This is called an envelope — specifically an ADSR (Attack, Decay, Sustain, Release) envelope — and it refers to how the volume levels of a sound are shaped. The Sonic Pi docs have more detail on this, but to give you a simplified explanation of each:

Attack is the initial "strike" of a sound: the time it takes to reach its initial peak, coming up from zero. Quick examples would be the pluck of a guitar string, or the hammer of a piano. Slower ones would be the press of an accordion, or a bowed string that's slowly increasing in volume.

Decay is the time the sound takes to leave that peak. Think of holding down a piano key, or letting a guitar string ring after it's first hit. The note is continuing, but it still has a slow fade to it. The sound won't just continue indefinitely, even if you're still holding the key that made it ring out.

Sustain is the time a sound is held at a stable level, without the fadeout of the decay. Examples of sustain include a string being bowed at a consistent volume, or a vocal note that's being held.

Release is the time between letting go of the sound and it actually going silent. A piano key that you hit without holding it will still ring out for a brief moment. A vocalist may still let out a short exhale after letting go of a note. And so on.

Not every sound has every one of these. Synths, for instance, are generated tones that don't necessarily have any initial build or lingering trail to them. Other, more natural instruments will have these things (or not) in different proportions depending on how you play them. Here, we're simulating a choir section, but the goal isn't really to make it lifelike, so the only real aim is to make sure it's holding for the correct time and has a little bit of separation between chords.
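
To make that concrete, here's a rough sketch of how those options look on individual play calls. The numbers are arbitrary, just to exaggerate the contrast, and the second example also leans on sustain_level (the level the decay drops to), which I haven't otherwise used here:

use_synth :saw

# a slow swell: long attack, brief hold, gentle tail
play :c4, attack: 1.5, sustain: 0.5, release: 0.75
sleep 3

# a plucked feel: instant attack, a quick drop to a quieter level, short tail
play :c4, attack: 0, decay: 0.2, sustain_level: 0.3, sustain: 0.5, release: 0.1
sleep 1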

All of that being said, back to the function at hand. We can take the notes that make up each chord and define a shorthand for playing them:

def choral_rings notes, sus
  offset = sus < 1 ? 0.1 : 0.2
  
  # remember the `*` splat operator I mentioned before?
  # you can also use it to spread lists out into arguments
  play ring(*notes), sustain: sus - offset, release: offset
  sleep sus
end

Going further, we can make all of this loop indefinitely, like an actual video game, with live_loop.

Sonic Pi is largely built for live performance: the code inside a live_loop will run until you tell the program to stop. You can alter the contents as it's playing, and by rerunning the start command, the loop will update on the next run.

To do this, you'd replace the outer in_thread blocks with live_loop :some_unique_name. This gets a little more complex when we're talking about effects chains; they're recreated each time the loop runs, so it's cheaper on resources to run the effects outside the live_loop block, especially as you stack them. But we're not here to get deep into audio or software engineering right now. We're here to make blips blip.

live_loop :harp do
  # the same blocks...
end

live_loop :choir do
  # ...we just wrote

  # to make it game-accurate, only play this every
  #  *other* time the other loop runs:
  
  # sleep 64
end
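
As an aside on that effects point, if you did want to keep the reverb from being rebuilt on every pass, one shape that should work is declaring the effect once and starting the loop inside it. Roughly:

# a sketch, not what the final version below does:
#  the reverb is created once, and the loop runs inside it
with_fx :reverb, mix: 0.75 do
  live_loop :choir do
    # ...the same choral_rings calls
    sleep 64
  end
end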

Ultimately, the whole thing looks like this:


use_bpm 75

def arpeggiate note, is_minor = false
  # there's probably a cleaner way to reverse this
  # and the map operation *could* be nested
  # but for illustrative purposes, this is fine
  
  ascending_three = is_minor ? note + 3 : note + 4
  
  ascending = [note, note + 2, ascending_three, note + 7]
  ascending_arp = [
    *ascending,
    *ascending.map { |note| note + 12 },
    *ascending.map { |note| note + 24 },
    *ascending.map { |note| note + 36 }
  ]
  
  top = note + 48
  descending_three = is_minor ? top - 9 : top - 8
  
  descending = [top, top - 5, descending_three, top - 10]
  
  descending_arp = [
    *descending,
    *descending.map { |note| note - 12 },
    *descending.map { |note| note - 24 },
    *descending.map { |note| note - 36 }
  ]
  
  # unlike the version above, this just outputs a value
  #  to then be played in the next block, instead of playing
  #  them directly within the function
  # Ruby supports implicit returns — you can simply declare
  #  the value you want from the function on its last line,
  #  without needing to specify as much
  [*ascending_arp, *descending_arp]

  # this is equivalent:
  # return [*ascending_arp, *descending_arp]
end

arp_c = arpeggiate :c3
arp_a = arpeggiate :a2, true
arp_f = arpeggiate :f2
arp_g = arpeggiate :g2
arp_ab = arpeggiate :ab2
arp_bb = arpeggiate :bb2

live_loop :harp do
  use_synth :square # ha
  use_synth_defaults amp: 0.15
  
  2.times do
    play_pattern_timed arp_c, 0.25
    play_pattern_timed arp_a, 0.25
  end
  play_pattern_timed arp_f, 0.25
  play_pattern_timed arp_g, 0.25
  play_pattern_timed arp_ab, 0.25
  play_pattern_timed arp_bb, 0.25
end

def choral_rings notes, sus
  offset = sus < 1 ? 0.1 : 0.2
  
  play ring(*notes), sustain: sus - offset, release: offset
  sleep sus
end

live_loop :choir do
  use_synth :saw
  use_synth_defaults amp: 0.35
  
  sleep 64
  
  with_fx :reverb, mix: 0.75 do
    choral_rings [:c5, :e5], 4
    choral_rings [:b4, :d5], 2
    choral_rings [:d5, :f5], 2
    choral_rings [:c5, :e5], 8
    
    choral_rings [:c5, :e5], 4
    choral_rings [:b4, :d5], 2
    choral_rings [:d5, :f5], 2
    choral_rings [:d5, :f5], 0.5
    choral_rings [:e5, :g5], 0.5
    choral_rings [:c5, :e5], 7
    
    choral_rings [:a4, :c5], 4
    choral_rings [:g4, :b4], 2
    choral_rings [:a4, :c5], 2
    choral_rings [:b4, :d5], 3
    choral_rings [:c5, :e5], 1
    choral_rings [:d5, :f5], 2
    choral_rings [:b4, :g5], 2
    
    choral_rings [:d5, :f5], 1
    choral_rings [:c5, :eb5], 0.5
    choral_rings [:bb4, :d5], 0.5
    choral_rings [:ab4, :c5], 6
    
    choral_rings [:eb5, :g5], 1
    choral_rings [:d5, :f5], 0.5
    choral_rings [:c5, :eb5], 0.5
    choral_rings [:bb4, :d5], 6
  end
end

And sounds like this.

As much as tech work is usually discussed in terms of computer science (and as much as I've had ex-bosses neg me over my college major), programming is also art. And it's not just art when you're purposely using it to do something creative, like generating audio this way, or something more visual like designing layouts with CSS. What's understated is that writing code is a creative act, much like writing anything else. See, you're not just talking to a machine in the most optimized fashion possible. You're also talking to other people. And even talking to yourself. (You're not crazy, though. You're just a little unwell.) Code is ultimately text, and organized text at that. It gets read far more often than it gets written, so writing good code is about writing code that can be understood at a glance, whether that's by other people, or by you in six months.

But once in a while, the art of it really is the point in itself.