How to build a synthesizer with Python: part 0

Introduction & Architecture

mike | July 10, 2023, 4:22 a.m.

Have you ever wanted to build a musical synthesizer? Me too!

And if you’re anything like me, you have that one project that you’ve tried a few times but for whatever reason, you’ve never gotten it off the ground. It could be a program, a hobby project, or anything, really. I think most people have something like that. For me, building a synthesizer is that project. It’s something I’ve come at from a few angles over the past few years, but I never quite found an approach that worked. Until now. (!)

So what changed? Well, I’ve grown as a developer over the past few years, for sure. Looking back at my first few attempts at building a synthesizer, I can track the progress of how far I’ve come. And of course, the time I spent trying and failing at this project was invaluable. But the biggest difference this time is that I got help.

Let’s be honest, if you’re reading this post, ChatGPT is probably already on your radar. You may have even already used it for coding help. Personally, I’ve used it for anything from recommending old school games to play to getting help building this blog. But what it’s really enabled me to do is learn things that I once thought were beyond my grasp.

So why am I writing this tutorial if you can just ask ChatGPT for help like I did? There are a few reasons: first and foremost, because I love sharing things. And I hope that even one person will get the same joy and satisfaction out of building a synthesizer that I have. But I also aim to provide content that would be difficult or impossible to get out of ChatGPT, like step-by-step instructions and a working example project.

What

Okay, now that that's out of the way, let's talk about what we'll actually be building, how we'll do it, and what tools we'll need.

The goal of this tutorial series is to build a proof-of-concept software synthesizer in the Python programming language. The finished product will be a 4-voice polyphonic synth capable of accepting MIDI input. Each voice will have 2 mixable oscillators, and we'll also implement a low-pass filter and a delay effect.

Here's an example of what it can sound like:

CAUTION: It can be a bit loud

What won't we cover? We will (in general) not cover Python programming concepts, in-depth digital signal processing (DSP), or any synthesizer components not mentioned above. We won't be implementing an ADSR envelope for this demo project. If you'd like to see a working example of an ADSR envelope, check out the original project that this demo is based on.

requirements.txt

The original project was written on macOS and ported to a Windows 10 machine. I've tried to keep it as platform-agnostic as possible, but the instructions will be written for Windows first, as that's what I assume most users are running. The program should work with little to no modification on any platform, though some platforms may need extra dependencies installed.

There are only a few prerequisites. Since this is not a Python programming guide, I do expect that you have some coding experience, or at least a very strong hacker ethic. I will try to write the instructions so that if you follow them carefully, you can piece together the program with only a little coding experience. That said, multi-threaded programming and DSP experience are nice-to-haves.

As far as hardware and software you will need: a computer with Python 3.10 installed, and a MIDI controller that can be connected to the computer. Any USB MIDI controller should work. I used an Akai MPK mini 3 while building this project, which I highly recommend.
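If you'd like to set up your environment right away, here's a sketch of what a requirements.txt for this project could look like. A caveat: NumPy and PyAudio are the libraries this series actually uses (more on them below), but mido and python-rtmidi are my assumed choice for the MIDI side, since they're a common pairing for reading MIDI input in Python.

```
# requirements.txt -- a minimal sketch, not the project's official file.
# numpy and PyAudio are used throughout this series; mido + python-rtmidi
# are an assumed (but common) stack for reading MIDI input in Python.
numpy
PyAudio
mido
python-rtmidi
```

Install everything with pip install -r requirements.txt. Note that PyAudio wraps the PortAudio library, which may need to be installed separately on some platforms (for example, via Homebrew on macOS).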

And that’s it! That’s all you need. In the rest of this post, I’ll go over the basic architecture of the synth as well as a few helpful concepts. If you can’t wait to start coding, or you’re already familiar with DSP, you can go ahead and skip to part 1.

Architecture of a synthesizer

Still here? Excellent. Let’s talk about what component pieces a synthesizer needs. In my view, a bare bones soft-synth needs a way to:

  • generate an audio signal
  • play the signal out loud
  • control the signal

And not to mention, all of this needs to happen in real time. Here's a graphic showing how we could represent that functionality as Python modules:

We'll generate our audio signal using NumPy, we'll play it out loud using PyAudio, and we'll manipulate the synthesis parameters via MIDI. Our program will need four threads: the main thread, one for the MIDI listener, one for the synthesizer, and one for the PyAudio stream.

When the program starts, it will create a MIDI listener and a synthesizer. The MIDI listener thread's job is to attach to a MIDI controller, listen for messages from it, interpret them, and send the interpreted messages to the synthesizer thread. The synthesizer sets up a signal source and connects it to a PyAudio stream. Once the audio stream is opened, the synthesizer's main job is to provide an API for manipulating the synthesis parameters.
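To make that flow concrete, here's a minimal sketch of the startup logic. All of the names here (midi_listener, synthesizer, the messages queue) are placeholders of my own; the real modules take shape in the next part.

```python
import queue
import threading

def midi_listener(messages: queue.Queue) -> None:
    """Placeholder for the MIDI listener thread: attach to a controller,
    interpret its messages, and forward them to the synthesizer."""
    while True:
        # In the real program we'd block on the MIDI port here, then:
        # messages.put(interpreted_message)
        threading.Event().wait(1.0)  # placeholder wait so this sketch doesn't spin

def synthesizer(messages: queue.Queue) -> None:
    """Placeholder for the synthesizer thread: set up the signal source,
    open the PyAudio stream, then apply parameter changes as they arrive."""
    while True:
        msg = messages.get()  # block until the listener sends something
        print("update synthesis parameters:", msg)

if __name__ == "__main__":
    messages: queue.Queue = queue.Queue()  # thread-safe channel between threads
    threading.Thread(target=midi_listener, args=(messages,), daemon=True).start()
    threading.Thread(target=synthesizer, args=(messages,), daemon=True).start()
    # That's three threads so far (main + two workers); PyAudio supplies the
    # fourth by running its stream callback on an internal thread.
    threading.Event().wait()  # keep the main thread alive
```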

A few DSP concepts

How does a computer output audio, anyway? More complete details are outside the scope of this tutorial, but it does help to have a basic model in our heads.

Basically, the computer generates a series of discrete values, which are fed to a digital-to-analog converter, or DAC. The DAC then outputs a continuous electrical signal to an amplifier, which boosts the signal and drives a transducer inside a speaker to produce sound. A transducer is a device that converts one kind of signal into another; in this case, electrical to mechanical.

The computer has no way of outputting a smooth, continuous signal like a sine wave. The best we can do is slice the sine wave into very small segments, called frames, and send them to the DAC one at a time. Each frame represents the value of the sine wave at a single point in time. By changing the value of the DAC’s analog output very quickly, we can simulate the continuous signal of a wave. The process of slicing a continuous signal into a series of discrete frames is called sampling, and the number of frames per second is called the sample rate.
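Here's what sampling looks like in NumPy. This isn't the project's oscillator code (that comes in a later part), just a demonstration of slicing one second of a 440 Hz sine wave into discrete frames:

```python
import numpy as np

SAMPLE_RATE = 44_100  # frames per second
DURATION = 1.0        # seconds
FREQUENCY = 440.0     # Hz (concert A)

# One timestamp per frame: 44,100 evenly spaced instants across 1 second.
t = np.arange(int(SAMPLE_RATE * DURATION)) / SAMPLE_RATE

# The value of the sine wave at each of those instants.
frames = np.sin(2 * np.pi * FREQUENCY * t).astype(np.float32)

print(frames.shape)  # (44100,) -- one second of audio as discrete frames
```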

Our program works by generating a very short audio clip and feeding it to the stream many times per second. In the source code I'll refer to this short audio clip as a chunk. Every time the PyAudio stream needs more data, it invokes a callback function we provide to generate another chunk, which is just an array of 32-bit floats in the range [-1, 1].
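Here's a sketch of what that callback arrangement looks like with PyAudio. The sine wave and the phase bookkeeping are my own stand-ins for the synth's real chunk generation, but the stream setup mirrors what we'll do in the actual project:

```python
import time
import numpy as np
import pyaudio

SAMPLE_RATE = 44_100
CHUNK_SIZE = 1024
FREQUENCY = 440.0

phase = 0  # frames rendered so far, so the wave stays continuous across chunks

def callback(in_data, frame_count, time_info, status):
    """Invoked by PyAudio every time the stream needs another chunk."""
    global phase
    t = (phase + np.arange(frame_count)) / SAMPLE_RATE
    # Half amplitude to be kind to your ears; values stay within [-1, 1].
    chunk = (0.5 * np.sin(2 * np.pi * FREQUENCY * t)).astype(np.float32)
    phase += frame_count
    return (chunk.tobytes(), pyaudio.paContinue)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32, channels=1, rate=SAMPLE_RATE,
                output=True, frames_per_buffer=CHUNK_SIZE,
                stream_callback=callback)

time.sleep(2)  # let the callback thread run; you should hear a quiet 440 Hz tone
stream.stop_stream()
stream.close()
p.terminate()
```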

The diagram represents digitally generating a 1.25 Hz sine wave at a sample rate of 15 Hz. As you can see, over the span of 1 second, we divide the wave into 15 discrete values, which we pass to the DAC 4 at a time. The DAC scans these values and updates its analog output for each one. In the example diagram the sample rate is 15 Hz, but in real-world applications it's commonly 44,100 Hz, meaning the DAC updates 44,100 times per second. Our application will typically use a chunk size of 1024 frames; at 44,100 Hz, each chunk covers about 23 ms of audio, so the callback fires roughly 43 times per second.

What about MIDI?

MIDI is a widely used and relatively easy-to-implement standard for electronic musical instruments. It makes sense for us to use because of the availability of controllers and software libraries. We won't implement a fully compliant MIDI synthesizer, but MIDI provides a straightforward way for us to control the synth.

In the context of our program, we'll handle MIDI messages. MIDI messages are encoded packets of musical information; they typically specify an action followed by a parameter or two. The actions we'll handle are things like note_on, note_off, and control_change. MIDI parameters are typically integer values between 0 and 127. In practice, that means there are 128 different pitches we can specify, ranging from C-1 (8.176 Hz) to G9 (12,543.9 Hz). This covers all 88 keys of a piano and then some. It's possible to extend this range and to use microtonal values, but that's another topic.
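The note-number-to-frequency mapping is standardized: note 69 is A4 at 440 Hz, and every 12 semitones doubles the frequency. That gives us a one-line conversion we can sanity-check against the values above:

```python
def midi_note_to_hz(note: int) -> float:
    """Convert a MIDI note number (0-127) to its frequency in Hz.
    Note 69 is A4 (440 Hz); each step is one equal-tempered semitone."""
    return 440.0 * 2 ** ((note - 69) / 12)

print(midi_note_to_hz(0))    # ~8.176 Hz  (C-1, the lowest MIDI note)
print(midi_note_to_hz(19))   # ~24.499 Hz (the pitch in the example below)
print(midi_note_to_hz(127))  # ~12543.9 Hz (G9, the highest)
```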

As an example, note_on messages specify two parameters: pitch and velocity. Since we're using a MIDI library, we don't have to worry about the raw bytes of the MIDI message; instead, we'll deal with a Message object. So we might see a message with attributes type=note_on, note=19, velocity=127. This means: play a note with pitch #19 (24.499 Hz) at the highest velocity. Velocity corresponds to how hard/fast the key is pressed.

The other type of message we'll handle is control_change. These messages specify a control and a value; like before, both are integers between 0 and 127. So we might receive a message telling us to set control number 65 to value 82. Think of value 0 as a knob turned all the way left, and value 127 as the knob turned all the way right. We can use control_change messages to set things like volume or the cutoff frequency of a low-pass filter, as in the sketch below.
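Here's a rough sketch of how we might dispatch those two message types, assuming mido as the MIDI library (the Message attributes shown match mido's). Control number 65 is just the knob from the example above, and the cutoff range is an illustrative choice:

```python
import mido

def scale_control(value: int, lo: float, hi: float) -> float:
    """Map a 0-127 control value onto an arbitrary parameter range."""
    return lo + (value / 127) * (hi - lo)

def handle_message(msg: mido.Message) -> None:
    """Dispatch an incoming MIDI message to the right synth action."""
    if msg.type == "note_on" and msg.velocity > 0:
        print(f"start voice: note {msg.note}, velocity {msg.velocity}")
    elif msg.type == "note_off" or (msg.type == "note_on" and msg.velocity == 0):
        # Some controllers send note_on with velocity 0 instead of note_off.
        print(f"stop voice: note {msg.note}")
    elif msg.type == "control_change":
        if msg.control == 65:  # hypothetical knob: low-pass filter cutoff
            cutoff = scale_control(msg.value, 20.0, 20_000.0)
            print(f"set cutoff to {cutoff:.0f} Hz")

with mido.open_input() as port:  # opens the default MIDI input port
    for msg in port:
        handle_message(msg)
```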

All in all, MIDI offers a pretty simple way to control our synth.

One last thing

In the next part, we'll get started building the synthesizer in earnest. Before we start coding, I just want to say I hope you'll get some enjoyment out of making the computer do bleep bloops like I did. Obviously, I hope you'll stick around till the end, but I appreciate you giving this any time at all.

Happy coding!