How to build a synthesizer with python: part 1

Stream player

mike | July 10, 2023, 4:24 a.m.

In this tutorial we’ll walk through using PyAudio to build a simple audio stream player. By the end of the post you should have a 440 Hz sine wave coming out of your speaker. On the way we’ll create a temporary sine wave generator to produce the stream and a StreamPlayer class to consume it.

Let’s make some noise!

Project setup

The first and most important tool we’ll need is Python 3.10 or newer. We need at least 3.10 because we’re using structural pattern matching (the match statement). There are a few ways to get Python. If you’re using Windows, I recommend getting it from the Microsoft Store. For macOS users, I’d recommend installing Python 3.10 with Homebrew. And for Linux users, Ubuntu 22.04 ships with 3.10; otherwise, you’ll most likely need to build from source.
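
If you’re not sure which version you have, a quick check like the one below can tell you before you go any further. This is only a sketch: the helper name supports_pattern_matching is made up for illustration and isn’t part of the project.

```python
import sys


def supports_pattern_matching(version_info=sys.version_info):
    """Return True if this Python version has the match statement (3.10+)."""
    # Slicing works for plain tuples as well as for sys.version_info.
    return tuple(version_info[:2]) >= (3, 10)


if not supports_pattern_matching():
    print("This tutorial needs Python 3.10 or newer; found", sys.version.split()[0])
```

Run it with whichever interpreter you plan to use (python, python3 or python3.10). If it prints nothing, you’re good to go.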

NOTE to macOS and Linux users: you might find it undesirable to bind python to version 3.10, so until we have a virtual environment set up you may want to use python3 or python3.10, depending on what is on your PATH variable.

One last thing: whenever you see a code block that starts with $, that is a shell prompt. You'll want to run everything but the $. So in the example

$ some command /with/a/path

You only want to run some command /with/a/path in your shell.

Set up a virtual environment

Now that we have Python 3.10 installed, let’s create our project and set up a virtual environment. Open up a terminal (on Windows, I recommend Terminal from the Microsoft Store). Use cd to navigate to the folder where you want to save the project. Next create a new directory and cd into it:

$ mkdir toy-synth
$ cd toy-synth

Now we want to create a python virtual environment. The reason we do this is so that we can keep track of our project dependencies. Keeping track of project dependencies lets us easily tell others what is necessary to run our program, and it makes it easier to move our code to a different machine.

You might already have the virtualenv python module installed on your machine. If not, or you’re not sure, run:

$ pip3 install virtualenv

Then once it’s installed:

$ python3 -m virtualenv venv

You should see a few lines of output letting you know that a new virtual environment named “venv” was created. If you ls in the project directory, you should see a new folder called “venv”. Now we need to activate the virtual environment:

Windows: .\venv\Scripts\activate

macOS and Linux: source ./venv/bin/activate

(venv) should appear before your shell prompt. If you close the shell session, you’ll need to cd into the project directory and activate the virtual environment again. From here on out, unless otherwise specified, I’ll assume you’re working from a shell session open to your project directory with the virtual environment active.
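
If you’re ever unsure whether the virtual environment is really active, you can also ask Python itself. The snippet below is a best-effort sketch; in_virtual_environment is a hypothetical helper name, and the check relies on how venv and virtualenv set sys.prefix.

```python
import sys


def in_virtual_environment():
    """Best-effort check for an active virtual environment."""
    # Old virtualenv releases set sys.real_prefix; venv and current
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtual_environment())
```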

Set up a git repository (optional)

The project will work without it, but I’d recommend tracking your files with git. Initialize a new repo with

$ git init

Now you can create a file named .gitignore (the . is important) in your project directory. Add the following to .gitignore:

__pycache__/
*.py[cod]
*$py.class
venv/

Create a new python project

Now we’re ready to start coding! Open your project folder in a text editor (I use Visual Studio Code). We’ll create a top-level python module called “synth”. Create a folder directly under the project folder and name it synth. Now, inside the synth folder create a file called __init__.py. This file is what makes the folder a python module. Create another file called __main__.py. This file is what makes the module executable, and it’s also where we’ll write the first code.

Open __main__.py. Let’s set up some logging to test the program. Copy the following code into the file:

import logging

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, 
                        format='%(asctime)s [%(levelname)s] %(module)s [%(funcName)s]: %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    
    log = logging.getLogger(__name__)
    log.info(
        """
    __
   |  |
 __|  |___             ______         __        __
/__    __/          __|______|__     |  |      |  |
   |  |            |  |      |  |    |__|______|  |
   |  |     __     |  |      |  |       |______   |
   |__|____|__|    |__|______|__|        ______|__|
      |____|          |______|          |______|

                                                           __              __
                                                          |  |            |  |
    _______         __        __     __    ____         __|  |___         |  |   ____
 __|_______|       |  |      |  |   |  |__|____|__     /__    __/         |  |__|____|__
|__|_______        |__|______|  |   |   __|    |  |       |  |            |   __|    |  |
   |_______|__        |______   |   |  |       |  |       |  |     __     |  |       |  |
 __________|__|        ______|__|   |  |       |  |       |__|____|__|    |  |       |  |
|__________|          |______|      |__|       |__|          |____|       |__|       |__|
        """
    )

First we import the logging module from the standard library. We’ll need this to do any type of serious logging.

You’ve probably seen the line if __name__ == "__main__": in a python program before. But if not, here’s the basic idea: every python module has some special global variables associated with it. One of them is the __name__ variable. Accessing this variable will always give you the name of the module that the variable is accessed from. Normally this is the name of the file without the .py extension, prefixed with the names of any packages (folders) that contain it. The exception is that the entry point to the program is always named __main__, even if the file name is different. Our file name happens to also be __main__ in this case, because we want to make the module executable.
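
To see __name__ in action, here is a small sketch that uses two standard library modules as stand-ins. Note that a module living inside a package reports its full dotted name.

```python
import json
import logging.handlers

# An imported module reports its import name, including enclosing packages.
print(json.__name__)              # json
print(logging.handlers.__name__)  # logging.handlers

# The file that started the program reports "__main__" instead.
print(__name__)
```

If you save this as a file and run it directly, the last line prints __main__. If you import that file from somewhere else, it prints the file’s module name instead.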

Next, we set up the logger. We specify a basic configuration, which tells the logger to print some useful info like the time and function name before each log message. After we set up the logger, we get an instance of it and print out some character art with the log level INFO. For more information on logging, check out the python docs.
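
If you’re curious what that format string produces, the sketch below builds a log record by hand and formats it with the same format and datefmt values. The record’s contents (the name "synth", the message "hello" and the function name "main") are made up for illustration.

```python
import logging

formatter = logging.Formatter(
    fmt="%(asctime)s [%(levelname)s] %(module)s [%(funcName)s]: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)

# Build a record by hand instead of going through a logger, so we can
# look at the formatted text directly.
record = logging.LogRecord(
    name="synth",
    level=logging.INFO,
    pathname="synth/__main__.py",
    lineno=1,
    msg="hello",
    args=None,
    exc_info=None,
    func="main",
)

line = formatter.format(record)
print(line)  # <timestamp> [INFO] __main__ [main]: hello
```

The module field comes from the file name in pathname, which is why it shows up as __main__ here.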

Alright, time to run our test. At the command line, run:

$ python -m synth

You should get a log printout of the character art.

StreamPlayer class

Now that we’ve set up our project, let’s give ourselves a way to play audio. We’ll use the PyAudio library, so go ahead and install it with:

$ pip install pyaudio

And since we’ve installed a new package, let’s create a requirements.txt file to keep track of it.

$ pip freeze > requirements.txt

This will generate a list of packages we’ve installed in the virtual environment and put it in a file called requirements.txt. We’ll do this every time we add a new package.

Now we can write the stream player code. We’ll create a new submodule under the synth module called “playback”. So create a folder inside the synth folder and name it playback. Make an __init__.py file in the new folder. And finally, create a new file in the playback folder called stream_player.py.

The directory structure looks something like this:

|
+-toy-synth
  |
  |-.gitignore
  |-requirements.txt
  |
  +-venv
  |
  +-synth
    |
    |-__init__.py
    |-__main__.py
    | 
    +-playback
      |
      |-__init__.py
      |-stream_player.py
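
If you’d rather script the folder setup than click through your editor, something like the following would create the same package layout. It’s only a sketch: create_skeleton is a hypothetical helper, and the demo runs in a temporary directory so it doesn’t touch your real project.

```python
import tempfile
from pathlib import Path


def create_skeleton(project_root):
    """Create the synth package layout under project_root (hypothetical helper)."""
    root = Path(project_root)
    playback = root / "synth" / "playback"
    playback.mkdir(parents=True, exist_ok=True)
    for path in (
        root / "synth" / "__init__.py",
        root / "synth" / "__main__.py",
        playback / "__init__.py",
        playback / "stream_player.py",
    ):
        path.touch(exist_ok=True)
    # Report what exists, relative to the project root, with forward slashes.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))


# Try it in a throwaway directory so nothing in your real project is touched.
with tempfile.TemporaryDirectory() as scratch:
    created = create_skeleton(scratch)
    print(created)
```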

Now open the stream_player.py file. At the top of the file, add the following imports:

import pyaudio
import logging

And just below that, add a class declaration with an init method.

class StreamPlayer:
    def __init__(self, sample_rate: int, frames_per_chunk: int, input_delegate):
        self.log = logging.getLogger(__name__)
        self.sample_rate = sample_rate
        self.frames_per_chunk = frames_per_chunk
        self.input_delegate = input_delegate
        self.pyaudio_interface = pyaudio.PyAudio()
        self._output_stream = None

The init method sets a few member variables we need and instantiates a PyAudio interface. Nothing too fancy going on yet. Next, we’ll add some getter and setter definitions for the variables that are passed in to the constructor. This isn’t totally necessary, but it’s always good practice to do validation on your data, especially in a dynamically typed language like Python. Go ahead and add the following below the init method:

    @property
    def sample_rate(self):
        """
        The sample rate of the audio stream.
        This is the number of data points (frames) per second.
        """
        return self._sample_rate
    
    @sample_rate.setter
    def sample_rate(self, value):
        try:
            if (int_value := int(value)) > 0:
                self._sample_rate = int_value
            else:
                raise ValueError
        except (TypeError, ValueError):
            self.log.error(f"Couldn't set sample_rate with value {value}")

    @property
    def frames_per_chunk(self):
        """
        The size of the chunks of data that are passed to the output stream.
        """
        return self._frames_per_chunk
    
    @frames_per_chunk.setter
    def frames_per_chunk(self, value):
        try:
            if (int_value := int(value)) > 0:
                self._frames_per_chunk = int_value
            else:
                raise ValueError
        except (TypeError, ValueError):
            self.log.error(f"Couldn't set frames_per_chunk with value {value}")

    @property
    def input_delegate(self):
        """
        This should be an iterator which returns the BYTES of an ndarray (aka calling tobytes() on it)
        of size <frames_per_chunk>
        """
        return self._input_delegate
    
    @input_delegate.setter
    def input_delegate(self, value):
        try:
            _ = iter(value)
            self._input_delegate = value
        except TypeError:
            self.log.error(f"Could not set input delegate with value {value}")

The setter methods do some light validation on the variables when we want to set them. For example, the sample_rate setter makes sure that the value passed in can be converted to an int and is greater than zero. A value that fails the check, such as "abc" or -1, is logged as an error and not stored.
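
One thing to be aware of with the log-and-continue approach is that a rejected value leaves the attribute unset, so the mistake only surfaces later when something reads it. If you’d prefer to fail right away, a stricter variant of the same property pattern could look like the sketch below. AudioSettings is a hypothetical stand-in class used only for illustration; it is not part of the project.

```python
class AudioSettings:
    """Hypothetical stand-in used only to illustrate the setter pattern."""

    def __init__(self, sample_rate):
        self.sample_rate = sample_rate  # goes through the setter below

    @property
    def sample_rate(self):
        return self._sample_rate

    @sample_rate.setter
    def sample_rate(self, value):
        try:
            int_value = int(value)
        except (TypeError, ValueError):
            raise ValueError(f"sample_rate must be an integer, got {value!r}") from None
        if int_value <= 0:
            raise ValueError(f"sample_rate must be positive, got {int_value}")
        self._sample_rate = int_value


settings = AudioSettings("48000")
print(settings.sample_rate)
```

Which style you pick is a design choice: logging keeps the program running, while raising makes a bad setting impossible to miss.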

Let’s add a few more methods to the class:

    def play(self):
        """
        Start the output stream
        """
        if self._output_stream is None:
            self._output_stream = self.pyaudio_interface.open(format = pyaudio.paFloat32,
                                                              channels = 1,
                                                              rate = self.sample_rate,
                                                              output = True,
                                                              stream_callback=self.audio_callback,
                                                              frames_per_buffer=self.frames_per_chunk)

        self._output_stream.start_stream()
    
    def stop(self):
        """
        Stop the output stream
        """
        if self._output_stream is None:
            return
        else:
            self._output_stream.stop_stream()
            self._output_stream.close()
            self.pyaudio_interface.terminate()

    def audio_callback(self, in_data, frame_count, time_info, status):
        """
        The audio callback is called by the pyaudio interface when it needs more data.
        """
        frames = next(self.input_delegate)
        return (frames, pyaudio.paContinue)
    
    def is_active(self):
        """
        Used to determine if the output stream is currently active.
        """
        if self._output_stream is None:
            return False
        
        return self._output_stream.is_active()

The play, stop, and is_active methods act as an API to our stream player. They provide the primary way of interacting with our stream player.

The audio_callback method is our implementation of a function that is required by the PyAudio interface. You may notice that it simply calls next() on the iterator that was passed in through the constructor (the __init__ method).
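
If the next() call feels a bit magical, here is a small sketch of the same hand-off without any audio involved. The names counting_chunks and fake_callback are made up for this example; the real callback receives extra arguments from PyAudio and returns audio data.

```python
def counting_chunks(frames_per_chunk):
    """Hypothetical source: yields lists of consecutive frame numbers."""
    start = 0
    while True:
        yield list(range(start, start + frames_per_chunk))
        start += frames_per_chunk


def fake_callback(source):
    """Mimics the shape of audio_callback without needing PyAudio."""
    return next(source)


source = counting_chunks(4)
first = fake_callback(source)
second = fake_callback(source)
print(first, second)

# A plain list is iterable but is not an iterator, so next() rejects it.
try:
    next([0, 1, 2, 3])
except TypeError as error:
    list_error = error
    print("next() on a list fails:", error)
```

Each call picks up where the previous one left off, which is exactly what we need for a continuous stream. The last part is worth remembering: the input_delegate setter only checks iter(value), which a list would pass, so make sure you hand the player a generator or another real iterator.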

So what is going on here? Let’s focus on the play method. Essentially this function opens an audio output stream with the parameters that we specify. The exact details are abstracted away from us by the PyAudio library, but the idea is that we’re telling the operating system that we want to start playing audio, and the parameters we supply allow the library to tell the OS how to do that. Once we’ve created the stream object, then we have to start it as well.

Let’s go over the parameters we passed in.

The format parameter specifies the data type of our audio stream. One common convention for audio streams is to use 32 bit floats in the range [-1, 1], which is what we specify here.
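
To make the 32 bit float format a little more concrete, the sketch below packs a few samples the way a float32 stream expects them: four bytes per sample, in the machine’s native byte order. The helper to_float32_bytes is hypothetical and uses only the standard library; in the project itself NumPy does this job.

```python
from array import array


def to_float32_bytes(samples):
    """Clamp samples to [-1, 1] and pack them as 32-bit floats (hypothetical helper)."""
    clamped = [max(-1.0, min(1.0, float(s))) for s in samples]
    # Type code "f" is a C float: 4 bytes per sample, native byte order.
    return array("f", clamped).tobytes()


chunk = to_float32_bytes([0.0, 0.5, -0.5, 1.7])
print(len(chunk))  # 4 samples * 4 bytes each
```

The out-of-range 1.7 is clamped to 1.0 before packing. Samples outside [-1, 1] generally end up clipped somewhere on the way to the speaker, so it’s nicer to keep them in range yourself.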

The parameter channels = 1 specifies that this is a mono stream.

The rate parameter sets the sample rate. The sample rate is essentially how many data points per second will flow through the stream. We’ll dig further into the sample rate and frames per chunk terms shortly.

output = True tells PyAudio that this is an output stream, meant for playing back audio out loud, as opposed to an input stream which is for capturing audio.

The stream_callback parameter takes a function that the output stream can call when it needs more data. This is the mechanism by which the stream is connected to the audio source.

Finally, the frames_per_buffer parameter tells the stream object how many frames to bite off at once. This parameter is closely related to the responsiveness of our synthesizer: a chunk has to be generated in full before it can be played, so larger chunks mean more delay between a change and hearing it. You usually want to set it as low as possible without causing hiccups in the audio output.
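
To get a feel for the numbers, the amount of audio in one chunk is just frames per chunk divided by the sample rate. Here’s a quick sketch, assuming a 44100 Hz sample rate (the value we’ll use later in this post); chunk_duration_ms is a made-up helper name.

```python
def chunk_duration_ms(frames_per_chunk, sample_rate):
    """Milliseconds of audio held in one chunk."""
    return 1000.0 * frames_per_chunk / sample_rate


for frames in (256, 1024, 4096):
    print(frames, "frames ->", round(chunk_duration_ms(frames, 44100), 1), "ms")
```

At 1024 frames that works out to roughly 23 ms per chunk. This is only the chunk length itself; the OS and audio hardware add their own buffering on top.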

Testing the stream player

Okay, now that we have a way to play an audio stream, let’s give ourselves a temporary way to generate one. We’ll need to install a new package into our virtual environment.

$ pip install numpy
$ pip freeze > requirements.txt

Then, in the synth module folder, create a file called settings.py and add the lines

sample_rate = 44100
frames_per_chunk = 1024

Now open your __main__.py file and add a few imports:

from time import sleep

import numpy as np

from . import settings
from .playback.stream_player import StreamPlayer

Just as a quick aside, I like to keep my imports organized by standard library imports first, then third party, then project modules. You'll see me follow this pattern throughout the project.

Now, right below the imports, let’s write a generator.

def sine_generator(frequency, amplitude, sample_rate, frames_per_chunk):
    """
    A generator which yields a sine wave of frequency <frequency> and amplitude <amplitude>.
    """
    chunk_duration = frames_per_chunk / sample_rate
    chunk_start_time = 0.0
    chunk_end_time = chunk_duration
    phase = 0.0
    while True:
        # Generate the wave
        if frequency <= 0.0:
            if frequency < 0.0:
                log.error("Overriding negative frequency to 0")
            amplitude = 0.0
            wave = np.zeros(frames_per_chunk)
        
        else:
            wave = amplitude * np.sin(phase + (2 * np.pi * frequency) * np.linspace(chunk_start_time, chunk_end_time, frames_per_chunk, endpoint=False))

        # Update the state variables for next time
        chunk_start_time = chunk_end_time
        chunk_end_time += chunk_duration

        yield wave.astype(np.float32)

We’ll give this code a permanent home and go over how it works in the next tutorial, but for now just know that it generates a new chunk every time the audio callback calls next() on it.
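
The one property worth checking now is that consecutive chunks line up, so there’s no click at the chunk boundaries. The sketch below is a pure-Python stand-in for the generator above, not the project code: it uses math instead of NumPy and tracks an integer frame counter rather than start and end times.

```python
import math


def sine_chunks(frequency, amplitude, sample_rate, frames_per_chunk):
    """Pure-Python stand-in for the NumPy generator above (illustration only)."""
    frame = 0  # index of the first frame in the next chunk
    while True:
        yield [
            amplitude * math.sin(2 * math.pi * frequency * (frame + i) / sample_rate)
            for i in range(frames_per_chunk)
        ]
        frame += frames_per_chunk


chunks = sine_chunks(frequency=440.0, amplitude=0.5, sample_rate=44100, frames_per_chunk=4)
joined = next(chunks) + next(chunks)

# Computing the same eight frames in one go gives the same values,
# so there is no jump at the chunk boundary.
continuous = [0.5 * math.sin(2 * math.pi * 440.0 * n / 44100) for n in range(8)]
print(all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(joined, continuous)))
```

Counting frames as integers is one way to avoid the small rounding errors that can build up when you keep adding a floating point duration, though for a short test like ours either approach is fine.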

Now below the character art, still inside the if __name__ == "__main__": block (so indented one level), add:

# Create a sine wave generator
sine_wave_generator = sine_generator(frequency=440.0, amplitude=0.5, sample_rate=settings.sample_rate, frames_per_chunk=settings.frames_per_chunk)

# Create a stream player
stream_player = StreamPlayer(sample_rate=settings.sample_rate, frames_per_chunk=settings.frames_per_chunk, input_delegate=sine_wave_generator)
stream_player.play()
while True:
    sleep(1)

The __main__.py file now looks like:

import logging
from time import sleep

import numpy as np

from . import settings
from .playback.stream_player import StreamPlayer

def sine_generator(frequency, amplitude, sample_rate, frames_per_chunk):
    """
    A generator which yields a sine wave of frequency <frequency> and amplitude <amplitude>.
    """
    chunk_duration = frames_per_chunk / sample_rate
    chunk_start_time = 0.0
    chunk_end_time = chunk_duration
    phase = 0.0
    while True:
        # Generate the wave
        if frequency <= 0.0:
            if frequency < 0.0:
                log.error("Overriding negative frequency to 0")
            amplitude = 0.0
            wave = np.zeros(frames_per_chunk)
        
        else:
            wave = amplitude * np.sin(phase + (2 * np.pi * frequency) * np.linspace(chunk_start_time, chunk_end_time, frames_per_chunk, endpoint=False))

        # Update the state variables for next time
        chunk_start_time = chunk_end_time
        chunk_end_time += chunk_duration

        yield wave.astype(np.float32)

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, 
                        format='%(asctime)s [%(levelname)s] %(module)s [%(funcName)s]: %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    
    log = logging.getLogger(__name__)
    log.info(
        """
    __
   |  |
 __|  |___             ______         __        __
/__    __/          __|______|__     |  |      |  |
   |  |            |  |      |  |    |__|______|  |
   |  |     __     |  |      |  |       |______   |
   |__|____|__|    |__|______|__|        ______|__|
      |____|          |______|          |______|

                                                           __              __
                                                          |  |            |  |
    _______         __        __     __    ____         __|  |___         |  |   ____
 __|_______|       |  |      |  |   |  |__|____|__     /__    __/         |  |__|____|__
|__|_______        |__|______|  |   |   __|    |  |       |  |            |   __|    |  |
   |_______|__        |______   |   |  |       |  |       |  |     __     |  |       |  |
 __________|__|        ______|__|   |  |       |  |       |__|____|__|    |  |       |  |
|__________|          |______|      |__|       |__|          |____|       |__|       |__|
        """
    )

    # Create a sine wave generator
    sine_wave_generator = sine_generator(frequency=440.0, amplitude=0.5, sample_rate=settings.sample_rate, frames_per_chunk=settings.frames_per_chunk)

    # Create a stream player
    stream_player = StreamPlayer(sample_rate=settings.sample_rate, frames_per_chunk=settings.frames_per_chunk, input_delegate=sine_wave_generator)
    stream_player.play()
    while True:
        sleep(1)

Before we run the program, be sure to start with your volume turned very low or off, as this can be quite loud. After you start the program, adjust the volume up until it’s at a comfortable level.

You can start the program with python -m synth. You should hear a 440 Hz sine wave! (CAUTION: Loud)

You can exit the program with ctrl-c.
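
Pressing ctrl-c raises KeyboardInterrupt inside the sleep loop, so as written the program exits with a traceback and never calls stop(). If that bothers you, one possible way to shut down cleanly is sketched below. The helper run_until_interrupted is hypothetical, and FakePlayer is a stand-in with the same play, stop and is_active methods so the example runs without an audio device.

```python
from time import sleep


def run_until_interrupted(player, poll_seconds=1.0, sleep_fn=sleep):
    """Hypothetical helper: play until Ctrl-C or until the stream ends, then stop."""
    player.play()
    try:
        while player.is_active():
            sleep_fn(poll_seconds)
    except KeyboardInterrupt:
        pass  # Ctrl-C is the expected way out, so no traceback is needed
    finally:
        player.stop()


class FakePlayer:
    """Stand-in with the same play/stop/is_active methods; no audio device needed."""

    def __init__(self):
        self.calls = []

    def play(self):
        self.calls.append("play")

    def is_active(self):
        return True

    def stop(self):
        self.calls.append("stop")


def press_ctrl_c(_seconds):
    """Simulates the user pressing Ctrl-C during the sleep."""
    raise KeyboardInterrupt


player = FakePlayer()
run_until_interrupted(player, sleep_fn=press_ctrl_c)
print(player.calls)
```

With the real class, you would pass your stream_player to the helper in place of the play() call and the while loop at the end of __main__.py.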

Code up to this point.