Gold Medal Games
Gameplay Made Great
Written: 12/18/2024

Dynamic Music - promising, but ultimately a failure

Summary

This is a game dev blog, and while software development is my background, as an Indie, I get the pleasure of doing all aspects of game development, including all the ones I am not particularly good at. I wanted unique music for each area in Earth Matters, and had the idea of making each song sound like music from its respective continent.

I had no clue what music styles exist around the world, but thankfully GarageBand sound loops cover a wide variety of instruments. While I'm not sure piecing loops together across several sound channels counts as composing music, that is pretty much how I go about it. It is a very time-consuming, trial-and-error approach. Given I'm not too musically inclined, I don't really have useful alternatives. Or do I?

What if I could automate, on the fly, the process I do manually? Then in theory there would be original, thematically correct music every time you play. This blog post covers how I approached it, and why I ultimately abandoned it for Earth Matters.


Approach

The concept was pretty straightforward: have several sound channels, each with its own set of loops. One channel for percussion, one for lead, bass, rhythm accents, etc. Randomly pick a sound-loop from each channel's set and start it playing. When a particular channel's loop ends, grab another one from its set at random and start it playing.

4-Channel Example (each row is one channel; time flows left to right)

Channel | Loops played over time →
Lead    | L4  L1  L5  L3
Bass    | B3  B1  B3
Drum    | D2  D3  D7  D3  D1  D4
Rhythm  | R5  R1  R4  R2  R1

The Code

This was all coded in C# using Unity. Since it is experimental, it is just hacked together. I debated even showing code, but figured it might help explain things / provide clarity. What I'm saying is you probably shouldn't cut & paste this into your own project as-is.

Object Structure
[System.Serializable]
public class MusicLoop
{
  public string name;
  public AudioClip clip;
  [Range(0f,1f)]
  public float volume;
  public float length;
  [HideInInspector]
  public AudioSource source;
}

[System.Serializable]
public class SongStream
{
  public string name;
  public Channel[] channels;
}

[System.Serializable]
public class Channel
{
  public string name;
  public MusicLoop[] loops;
  public int currentLoop;
  public Coroutine channelManager;
}

public SongStream[] songs;

A MusicLoop is the container for the actual sound file reference. You can assign a name to make it easier to identify in the inspector. You set the volume and specify the length of each clip so the playback timing works correctly; I did it to the nearest hundredth of a second. It also has a hidden AudioSource Unity component that will ultimately control the playing of the sound file.
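A side note: Unity's AudioClip already knows its own duration (AudioClip.length, in seconds), so the length field could in principle be filled in automatically during initialization instead of typed by hand. Something like this, inside the loop-setup code further below:

// Sketch: take the duration straight from the clip instead of entering it manually
s.length = s.clip.length;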

A SongStream is really just a named array of channels. The name is there so you can identify it more easily in the inspector, and has nothing to do with any functionality.

A Channel is an array of sound-loops, the currently playing loop's id, and a reference to the Coroutine that controls its playback timing. There is also a name for convenience when looking at the inspector.

Finally there is an array of SongStreams. Each SongStream is basically a song generator based on the style of loops contained in its channels. I grouped sound loops from the chosen regions of the world for each SongStream.

Initialization

This walks the data configured in the inspector for all sound-loops and channels, and adds an AudioSource component for each loop so it will actually play within Unity. It sets the reference to the clip (the sound file) and sets the volume on a per-clip basis. I put this in the Awake( ) method of the MonoBehaviour class.

foreach( SongStream st in songs )
{
  foreach( Channel ch in st.channels )
  {
    ch.currentLoop = -1;

    foreach( MusicLoop s in ch.loops )
    {
      // Give each loop its own AudioSource so Unity can actually play it
      s.source = gameObject.AddComponent<AudioSource>();
      s.source.clip = s.clip;
      s.source.volume = s.volume;
    }
  }
}
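For context, all of this lives on a single MonoBehaviour. Here is a rough sketch of the layout I'm assuming (the class name is just illustrative; currentSong is the field used by Play( ) and Stop( ) below):

// Rough layout sketch (class name is just for illustration)
public class DynamicMusic : MonoBehaviour
{
  public SongStream[] songs;      // configured in the inspector
  private int currentSong = -1;   // -1 means no song is currently playing

  void Awake()
  {
    // ... the initialization loops shown above ...
  }
}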
Starting and Stopping the "Music"

I created Play( ) and Stop( ) methods to control starting and stopping the music. The Play( ) method gets passed a songId, which represents a group of channels and their associated loops. The songId doesn't identify a specific song so much as the style of the song that will be generated.

public void Play( int songId )
{
  int i;

  // We only want 1 song playing at any given time, so make sure we stop the currently playing one first
  // Ideally this would crossfade. I think that is ideal anyway.
  //
  if( currentSong >= 0 )
  {
    Stop();
  }

  currentSong = songId;

  for( i = 0; i < songs[currentSong].channels.Length; i++ )
  {
    songs[currentSong].channels[i].channelManager = StartCoroutine( ChannelManager( currentSong, i ) );
  }
}

public void Stop()
{
  if( currentSong != -1 )
  {
    foreach( Channel ch in songs[currentSong].channels )
    {
      if( ch.currentLoop >= 0 )
      {
        StopCoroutine( ch.channelManager );
        ch.loops[ch.currentLoop].source.Stop();
      }
    }
  }
}

Play( ) first makes sure any currently playing song is stopped. It then sets the currentSong value to whatever songId was passed in. The coroutines are then started for each channel involved in the new song.

Stop( ) conversely stops the coroutine, and the currently playing loop, for each channel in the currently playing song. Remember, a "song" in this case is a dynamically generated thing made up of somewhat randomly chosen sound-loops.
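For a sense of how it gets used, a call site might look something like this (musicPlayer is just a reference to the component above, and areaSongId an index into songs; both names are mine for illustration):

// e.g. when the player enters a new area
musicPlayer.Play( areaSongId );

// e.g. when the player leaves the area, or on game over
musicPlayer.Stop();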

Channels working independently to make a song

One of the interesting aspects of this method is that these songs repeat portions at odd intervals, but never actually end. They are continually being stitched together. Each channel grabs a loop from its list whenever the previously playing one ends. The sound loops chosen for a song's channels need to be somewhat similar; at least that is how I approached it for this experiment. It seems like unrelated sound-loops would be really jarring to the listener.

Here is the code for the coroutine that makes what I had hoped would be the magic happen.

IEnumerator ChannelManager( int sid, int cid )
{
  while( true )
  {
    // Pick loop at random and start it playing
    //
    songs[sid].channels[cid].currentLoop = Random.Range( 0, songs[sid].channels[cid].loops.Length );
    songs[sid].channels[cid].loops[songs[sid].channels[cid].currentLoop].source.Play();

    yield return new WaitForSeconds( songs[sid].channels[cid].loops[songs[sid].channels[cid].currentLoop].length );
  }
}

The coroutine is fairly straightforward. Set the currentLoop for the song and channel to a random loop from that channel's set. Start the loop playing, and then wait until it finishes. Then do it all over again.
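One caveat I'll note: WaitForSeconds runs on scaled game time, while the AudioSource plays on its own clock, so anything that touches Time.timeScale (like pausing) would let the coroutine drift away from the audio. If that matters for your game, Unity's unscaled variant avoids it:

// Waits in real time instead of scaled game time, so a paused or slowed
// game doesn't desync the loop-switching from the audio that keeps playing
yield return new WaitForSecondsRealtime( songs[sid].channels[cid].loops[songs[sid].channels[cid].currentLoop].length );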


The Results

How did it sound? It was so bad, it was actually funny to listen to. While having multiple channels playing helps smooth even the most jarring transitions on any single channel, it was still bad. Fairly often 2 or 3 channels would end and need a new loop at nearly the same time, making for some abrupt sonic transitions. Those are definitely not something I would classify as music.

However, at any given point mid-loop the music can sound like, well, music, and then all of a sudden it doesn't. Or at least not the same song. I can't think of a description that does it justice, so here is an example clip generated with this method.

Original music?

Possible Improvements

When trying to create the audio for Earth Matters, I was working on a self-imposed deadline to get it launched. The initial results with this automated approach were so bad that I didn't feel like investing a lot of time in it. However, there were definitely some promising moments where it actually sounded good.

One thought would be to double the number of channels to have the loop that is ending crossfade with the next one. I never got around to coding and testing it. I may do so for my next game.
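Since each MusicLoop already gets its own AudioSource in this setup, the fade could probably be done by overlapping the outgoing and incoming loops directly rather than literally doubling the channels. Something like this untested sketch, kicked off a little before the current loop's length runs out:

// Untested sketch: fade the outgoing loop down while the incoming loop fades up
IEnumerator Crossfade( MusicLoop from, MusicLoop to, float fadeTime )
{
  float t = 0f;

  to.source.volume = 0f;
  to.source.Play();

  while( t < fadeTime )
  {
    t += Time.deltaTime;
    float k = t / fadeTime;

    from.source.volume = from.volume * ( 1f - k );
    to.source.volume = to.volume * k;

    yield return null;
  }

  from.source.Stop();
  from.source.volume = from.volume;   // restore for the next time this loop plays
}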

The improvement that would probably have the biggest impact is finding a way to make the channels complement each other. Right now, it is like 4 kids banging away on musical instruments; in other words, noise. So the thought would be to pick some sort of tempo progression (perhaps a wave) and have the other channels match their BPM to it. Channels could also take smart pauses; after all, there aren't many songs where every instrument is being played at all times.
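As a rough illustration of the smart-pause idea, the channel coroutine could occasionally sit out instead of always grabbing a new loop. The probability and rest length here are made-up values that would need tuning per channel:

// Inside ChannelManager's while loop, before picking the next loop:
// occasionally rest for a bit instead of playing (values are arbitrary)
if( Random.value < 0.2f )
{
  yield return new WaitForSeconds( restLength );   // restLength is a hypothetical per-channel setting
  continue;
}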

The reality, though, is that using Gen AI would in all likelihood produce better outcomes, meaning sounds that more closely resemble music.


Discoveries

GarageBand tries to help you out when you are composing in its editor. The first loop you place seems to dictate your BPM, and subsequent loops are silently modified to match it. The result is that the loops fit together better from a musical perspective.

When you export the sound-loops one by one, however, each keeps its original BPM. So clips that you drag and somewhat randomly place in GarageBand won't necessarily sound the same as their actual sound files. That made for an unpleasant surprise right in the middle of trying to build this out.