mike | July 10, 2023, 4:24 a.m.
In this tutorial we’ll walk through using PyAudio to build a simple audio stream player. By the end of the post you should have a 440 Hz sine wave coming out of your speaker. On the way we’ll create a temporary sine wave generator to produce the stream and a StreamPlayer class to consume it.
Let’s make some noise!
The first and most important tool we’ll need to install is Python 3.10. We do need Python >= 3.10 because we’re using pattern matching. That said, there are a few ways to get Python. If you’re using Windows I recommend getting it from the Microsoft Store. For macOS users, I’d recommend installing Python 3.10 with homebrew. And for Linux users, Ubuntu 22.04 ships with 3.10, otherwise you’ll most likely need to build from source.
NOTE to macOS and Linux users: you might find it undesirable to bind python to version 3.10, so until we have a virtual environment set up you may want to use python3 or python3.10, depending on what is on your PATH variable.
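If you’re not sure which interpreter a given command points to, a small check like the one below can help. This is only a sketch: the helper name meets_minimum is made up for illustration and is not part of the project.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old for this tutorial"
    print(sys.version.split()[0], status)
```

Save it to a scratch file and run it with python, python3, and python3.10 in turn to see which of them reports a new enough version.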
One last thing: whenever you see a code block that starts with $, that is a shell prompt. You’ll want to run everything but the $. So in the example
$ some command /with/a/path
you only want to run some command /with/a/path in your shell.
Now that we have Python 3.10 installed, let’s create our project and set up a virtual environment. Open up a terminal (on Windows, I recommend Terminal from the Microsoft Store). Use cd to navigate to the folder where you want to save the project. Next create a new directory and cd into it:
$ mkdir toy-synth
$ cd toy-synth
Now we want to create a python virtual environment. The reason we do this is so that we can keep track of our project dependencies. Keeping track of them lets us easily tell others what is necessary to run our program, and makes it easier to transfer our code to a different machine.
You might already have the virtualenv python module installed on your machine. If not, or you’re not sure, run:
$ pip3 install virtualenv
Then once it’s installed:
$ python3 -m virtualenv venv
You should see a few lines of output letting you know that a new virtual environment named “venv” was created. If you ls in the project directory, you should see a new folder called “venv”. Now we need to activate the virtual environment:
Windows: .\venv\Scripts\activate
macOS and Linux: source ./venv/bin/activate
(venv) should appear before your shell prompt. If you close the shell session, you’ll need to cd into the project directory and activate the virtual environment again. From here on out, unless otherwise specified, I’ll assume you’re working from a shell session open to your project directory with the virtual environment active.
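If you ever lose track of whether the environment is active, you can ask Python itself. The sketch below relies on sys.prefix and sys.base_prefix differing inside a virtual environment, which holds for the standard venv module and for recent virtualenv releases; very old virtualenv versions behaved differently, so treat this as an assumption. The function name is hypothetical.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a virtual environment."""
    # Inside an environment, sys.prefix points at the environment folder,
    # while sys.base_prefix points at the Python it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter prefix:", sys.prefix)
```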
It isn’t required for the project to work, but I’d recommend tracking your files with git. Initialize a new repo with
$ git init
Now you can create a file named .gitignore (the . is important) in your project directory. Add the following to .gitignore:
__pycache__/
*.py[cod]
*$py.class
venv/
Now we’re ready to start coding! Open your project folder in a text editor (I use Visual Studio Code). We’ll create a top-level python module called “synth”. Create a folder directly under the project folder and name it synth. Now, inside the synth folder create a file called __init__.py. This file is what makes the folder a python module. Create another file called __main__.py. This file is what makes the module executable, and it’s also where we’ll write the first code.
Open __main__.py. Let’s set up some logging to test the program. Copy the following code into the file:
import logging
if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG,
                        format='%(asctime)s [%(levelname)s] %(module)s [%(funcName)s]: %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    log = logging.getLogger(__name__)
    log.info(
"""
__
| |
__| |___ ______ __ __
/__ __/ __|______|__ | | | |
| | | | | | |__|______| |
| | __ | | | | |______ |
|__|____|__| |__|______|__| ______|__|
|____| |______| |______|
__ __
| | | |
_______ __ __ __ ____ __| |___ | | ____
__|_______| | | | | | |__|____|__ /__ __/ | |__|____|__
|__|_______ |__|______| | | __| | | | | | __| | |
|_______|__ |______ | | | | | | | __ | | | |
__________|__| ______|__| | | | | |__|____|__| | | | |
|__________| |______| |__| |__| |____| |__| |__|
"""
)
First we import the logging module from the standard library. We’ll need this to do any type of serious logging.
You’ve probably seen the line if __name__ == "__main__": in a python program before. But if not, here’s the basic idea: every python module has some special global variables associated with it. One of them is the __name__ variable. Accessing this variable gives you the name of the module that the variable is accessed from. Normally this is the name of the file, without the .py extension. The exception is that the entry point to the program is always named __main__, even if the file name is different. Our file name happens to also be __main__ in this case, because we want to make the module executable.
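To make the idea concrete, here is a tiny sketch. The describe_module helper is hypothetical and only there to show the two cases side by side.

```python
def describe_module(name):
    """Explain what a given __name__ value means."""
    if name == "__main__":
        return "running as the program's entry point"
    return f"imported as module {name!r}"


# The first line depends on how this file is started; the second is fixed.
print(describe_module(__name__))
print(describe_module("synth.playback.stream_player"))
```

Run the file directly and the first line reports the entry point. Import it from another file and the first line reports the module name instead.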
Next, we set up the logger. We specify a basic configuration, which tells the logger to print some useful info like the time and function name before each log message. After we set up the logger, we get an instance of it and print out some character art with the log level INFO. For more information on logging, check out the python docs.
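If you’d like to see what that format string produces without touching the project, the sketch below sends one message through the same format into an in-memory buffer. The logger name "format_demo" and the helper function are assumptions made for this example only.

```python
import io
import logging


def format_demo(message):
    """Log one INFO message through the tutorial's format and return the text."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter(
        fmt="%(asctime)s [%(levelname)s] %(module)s [%(funcName)s]: %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    ))
    demo_log = logging.getLogger("format_demo")  # hypothetical logger name
    demo_log.setLevel(logging.DEBUG)
    demo_log.propagate = False  # keep the demo out of the root logger's output
    demo_log.addHandler(handler)
    try:
        demo_log.info(message)
    finally:
        demo_log.removeHandler(handler)
    return buffer.getvalue()


print(format_demo("hello"), end="")
```

The printed line starts with a timestamp, then the level in brackets, the name of the file the call came from, the calling function in brackets, and finally the message.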
Alright, time to run our test. At the command line, run:
$ python -m synth
You should get a log printout of the character art.
Now that we’ve set up our project, let’s go ahead and give ourselves a way to play audio. We’ll use the PyAudio library, so let’s go ahead and install it with:
$ pip install pyaudio
And since we’ve installed a new package, let’s create a requirements.txt file to keep track of it.
$ pip freeze > requirements.txt
This will generate a list of packages we’ve installed in the virtual environment and put it in a file called requirements.txt. We’ll do this every time we add a new package.
Now we can write the stream player code. We’ll create a new submodule under the synth module called “playback”. So create a folder inside the synth folder and name it playback. Make an __init__.py file in the new folder. And finally, create a new file in the playback folder called stream_player.py.
The directory structure looks something like this:
|
+-toy-synth
  |
  |-.gitignore
  |-requirements.txt
  |
  +-venv
  |
  +-synth
    |
    |-__init__.py
    |-__main__.py
    |
    +-playback
      |
      |-__init__.py
      |-stream_player.py
Now open the stream_player.py file. At the top of the file, add the following imports:
import pyaudio
import logging
And just below that, add a class declaration with an init method.
class StreamPlayer:
    def __init__(self, sample_rate: int, frames_per_chunk: int, input_delegate):
        self.log = logging.getLogger(__name__)
        self.sample_rate = sample_rate
        self.frames_per_chunk = frames_per_chunk
        self.input_delegate = input_delegate
        self.pyaudio_interface = pyaudio.PyAudio()
        self._output_stream = None
The init method sets a few member variables we need and instantiates a PyAudio interface. Nothing too fancy going on yet. Next, we’ll add some getter and setter definitions for the variables that are passed in from the constructor. This isn’t totally necessary, but it’s always good practice to do validation on your data, especially in a dynamically typed language like Python. Go ahead and add the following below the init method:
    @property
    def sample_rate(self):
        """
        The sample rate of the audio stream.
        This is the number of data points (frames) per second.
        """
        return self._sample_rate

    @sample_rate.setter
    def sample_rate(self, value):
        try:
            if (int_value := int(value)) > 0:
                self._sample_rate = int_value
            else:
                raise ValueError
        except (TypeError, ValueError):
            # int() raises TypeError for values like None, ValueError for bad strings
            self.log.error(f"Couldn't set sample_rate with value {value}")

    @property
    def frames_per_chunk(self):
        """
        The size of the chunks of data that are passed to the output stream.
        """
        return self._frames_per_chunk

    @frames_per_chunk.setter
    def frames_per_chunk(self, value):
        try:
            if (int_value := int(value)) > 0:
                self._frames_per_chunk = int_value
            else:
                raise ValueError
        except (TypeError, ValueError):
            self.log.error(f"Couldn't set frames_per_chunk with value {value}")

    @property
    def input_delegate(self):
        """
        This should be an iterator which returns the BYTES of an ndarray (aka calling tobytes() on it)
        of size <frames_per_chunk>
        """
        return self._input_delegate

    @input_delegate.setter
    def input_delegate(self, value):
        try:
            _ = iter(value)
            self._input_delegate = value
        except TypeError:
            self.log.error(f"Could not set input delegate with value {value}")
The setter methods do some light validation on the variables when we want to set them. For example, the sample_rate setter makes sure that the parameter passed in can be converted to an int and that the result is greater than zero.
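The validation pattern in those setters can be tried on its own. The sketch below pulls it into a standalone function, which is an illustration and not part of the StreamPlayer class; the name parse_positive_int is made up. Instead of logging an error, it returns None when the value is rejected.

```python
def parse_positive_int(value):
    """Return `value` as a positive int, or None if it can't be one."""
    try:
        # The walrus operator converts and names the result in one step.
        if (int_value := int(value)) > 0:
            return int_value
    except (TypeError, ValueError):
        # TypeError: int(None) and similar; ValueError: int("abc") and similar.
        pass
    return None


for candidate in ("44100", 1024.0, 0, -5, "abc", None):
    print(repr(candidate), "->", parse_positive_int(candidate))
```

Note that int() truncates floats, so a value like 1024.9 would be accepted as 1024. Whether that is acceptable is a design choice.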
Let’s add a few more methods to the class:
    def play(self):
        """
        Start the output stream
        """
        if self._output_stream is None:
            self._output_stream = self.pyaudio_interface.open(format=pyaudio.paFloat32,
                                                              channels=1,
                                                              rate=self.sample_rate,
                                                              output=True,
                                                              stream_callback=self.audio_callback,
                                                              frames_per_buffer=self.frames_per_chunk)
        self._output_stream.start_stream()

    def stop(self):
        """
        Stop the output stream
        """
        if self._output_stream is None:
            return
        else:
            self._output_stream.stop_stream()
            self._output_stream.close()
            self.pyaudio_interface.terminate()

    def audio_callback(self, in_data, frame_count, time_info, status):
        """
        The audio callback is called by the pyaudio interface when it needs more data.
        """
        frames = next(self.input_delegate)
        return (frames, pyaudio.paContinue)

    def is_active(self):
        """
        Used to determine if the output stream is currently active.
        """
        if self._output_stream is None:
            return False
        return self._output_stream.is_active()
The play, stop, and is_active methods act as an API to our stream player. They provide the primary way of interacting with it.
The audio_callback method is our implementation of a function that is required by the PyAudio interface. You may notice that it simply calls next() on the iterator that was passed in through the constructor (the __init__ method).
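The pull model behind audio_callback can be tried without any audio hardware. The sketch below is a stand-in, not the real class: PA_CONTINUE is a placeholder for pyaudio.paContinue so the example runs without PyAudio installed, and silent_chunks is a hypothetical source that uses the standard library array module instead of NumPy.

```python
import array

PA_CONTINUE = 0  # placeholder standing in for pyaudio.paContinue


def silent_chunks(frames_per_chunk):
    """Hypothetical source: yields chunks of float32 silence as bytes, forever."""
    chunk = array.array("f", [0.0] * frames_per_chunk).tobytes()
    while True:
        yield chunk


def make_callback(source):
    """Build a callback with the same four-argument shape as audio_callback."""
    def callback(in_data, frame_count, time_info, status):
        # Each call pulls exactly one chunk from the iterator.
        return (next(source), PA_CONTINUE)
    return callback


callback = make_callback(silent_chunks(1024))
data, flag = callback(None, 1024, {}, 0)
print(len(data), "bytes for 1024 mono float32 frames")
```

The point of the design is that the player never needs to know where the audio comes from. Anything that supports next() and hands back one chunk per call can be plugged in.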
So what is going on here? Let’s focus on the play
method. Essentially this function opens an audio output stream with the parameters that we specify. The exact details are abstracted away from us by the PyAudio library, but the idea is that we’re telling the operating system that we want to start playing audio, and the parameters we supply allows the library to tell the OS how to do that. Once we’ve created the stream object, then we have to start it as well.
Let’s go over the parameters we passed in.
The format parameter specifies the data type of our audio stream. One common convention for audio streams is to use 32 bit floats in the range [-1, 1], which is what we specify here.
The parameter channels = 1 specifies that this is a mono stream.
The rate parameter sets the sample rate. The sample rate is essentially how many data points per second will flow through the stream. We’ll dig further into the sample rate and frames per chunk terms shortly.
output = True tells PyAudio that this is an output stream, meant for playing back audio out loud, as opposed to an input stream which is for capturing audio.
The stream_callback parameter takes a function that the output stream can call when it needs more data. This is the mechanism by which the stream is connected to the audio source.
Finally, the frames_per_buffer parameter tells the stream object how much data to bite off at once. This parameter is closely related to the responsiveness of our synthesizer. You usually want to set it as low as possible without causing hiccups in the audio output.
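To get a feel for how frames_per_buffer relates to responsiveness, it helps to convert a chunk size into the amount of time it covers. The sketch below does that arithmetic. The helper name and the 44100 Hz sample rate are example values chosen for illustration, and the figure is only the duration of one chunk, not a measurement of total output latency.

```python
def chunk_duration_ms(frames_per_chunk, sample_rate):
    """Length of one chunk of audio in milliseconds."""
    return 1000.0 * frames_per_chunk / sample_rate


for frames in (256, 512, 1024, 2048):
    duration = chunk_duration_ms(frames, 44100)
    print(f"{frames:>5} frames at 44100 Hz = {duration:.1f} ms")
```

At 44100 Hz a 1024-frame chunk covers roughly 23 ms of audio, so any change to the sound source can take at least that long to be heard.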
Okay, now that we have a way to play an audio stream, let’s give ourselves a temporary way to generate one. We’ll need to install a new package into our virtual environment.
$ pip install numpy
$ pip freeze > requirements.txt
Then, in the synth module folder, create a file called settings.py and add the lines
sample_rate = 44100
frames_per_chunk = 1024
Now open your __main__.py file and add a few imports:
from time import sleep
import numpy as np
from . import settings
from .playback.stream_player import StreamPlayer
Just as a quick aside, I like to keep my imports organized by standard library imports first, then third party, then project modules. You'll see me follow this pattern throughout the project.
Now, right below the imports, let’s write a generator.
def sine_generator(frequency, amplitude, sample_rate, frames_per_chunk):
    """
    A generator which yields a sine wave of frequency <frequency> and amplitude <amplitude>.
    """
    chunk_duration = frames_per_chunk / sample_rate
    chunk_start_time = 0.0
    chunk_end_time = chunk_duration
    phase = 0.0
    while True:
        # Generate the wave
        if frequency <= 0.0:
            if frequency < 0.0:
                log.error("Overriding negative frequency to 0")
            amplitude = 0.0
            wave = np.zeros(frames_per_chunk)
        else:
            wave = amplitude * np.sin(phase + (2 * np.pi * frequency) * np.linspace(chunk_start_time, chunk_end_time, frames_per_chunk, endpoint=False))
        # Update the state variables for next time
        chunk_start_time = chunk_end_time
        chunk_end_time += chunk_duration
        yield wave.astype(np.float32)
We’ll give this code a permanent home and go over how it works in the next tutorial, but for now just know that it generates a new chunk every time the audio callback calls next() on it.
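One property worth checking in any chunked generator is that consecutive chunks line up, so the wave doesn’t click at chunk boundaries. The sketch below is a variation for experimenting, not the tutorial’s generator: it uses the standard library math module and plain lists instead of NumPy, and it tracks position with an integer frame counter rather than accumulating float start and end times. The name sine_chunks is hypothetical.

```python
import math


def sine_chunks(frequency, amplitude, sample_rate, frames_per_chunk):
    """Yield lists of samples, tracking position with an integer frame counter."""
    start_frame = 0
    while True:
        yield [
            amplitude * math.sin(2 * math.pi * frequency * (start_frame + i) / sample_rate)
            for i in range(frames_per_chunk)
        ]
        start_frame += frames_per_chunk


gen = sine_chunks(440.0, 0.5, 44100, 1024)
first, second = next(gen), next(gen)

# The second chunk should start where one long, unbroken wave would be at frame 1024.
expected = 0.5 * math.sin(2 * math.pi * 440.0 * 1024 / 44100)
print("chunks line up:", abs(second[0] - expected) < 1e-12)
```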
Now below the character art, add:
    # Create a sine wave generator
    sine_wave_generator = sine_generator(frequency=440.0, amplitude=0.5, sample_rate=settings.sample_rate, frames_per_chunk=settings.frames_per_chunk)
    # Create a stream player
    stream_player = StreamPlayer(sample_rate=settings.sample_rate, frames_per_chunk=settings.frames_per_chunk, input_delegate=sine_wave_generator)
    stream_player.play()
    while True:
        sleep(1)
The __main__.py file now looks like:
import logging
from time import sleep

import numpy as np

from . import settings
from .playback.stream_player import StreamPlayer


def sine_generator(frequency, amplitude, sample_rate, frames_per_chunk):
    """
    A generator which yields a sine wave of frequency <frequency> and amplitude <amplitude>.
    """
    chunk_duration = frames_per_chunk / sample_rate
    chunk_start_time = 0.0
    chunk_end_time = chunk_duration
    phase = 0.0
    while True:
        # Generate the wave
        if frequency <= 0.0:
            if frequency < 0.0:
                log.error("Overriding negative frequency to 0")
            amplitude = 0.0
            wave = np.zeros(frames_per_chunk)
        else:
            wave = amplitude * np.sin(phase + (2 * np.pi * frequency) * np.linspace(chunk_start_time, chunk_end_time, frames_per_chunk, endpoint=False))
        # Update the state variables for next time
        chunk_start_time = chunk_end_time
        chunk_end_time += chunk_duration
        yield wave.astype(np.float32)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG,
                        format='%(asctime)s [%(levelname)s] %(module)s [%(funcName)s]: %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    log = logging.getLogger(__name__)
    log.info(
"""
__
| |
__| |___ ______ __ __
/__ __/ __|______|__ | | | |
| | | | | | |__|______| |
| | __ | | | | |______ |
|__|____|__| |__|______|__| ______|__|
|____| |______| |______|
__ __
| | | |
_______ __ __ __ ____ __| |___ | | ____
__|_______| | | | | | |__|____|__ /__ __/ | |__|____|__
|__|_______ |__|______| | | __| | | | | | __| | |
|_______|__ |______ | | | | | | | __ | | | |
__________|__| ______|__| | | | | |__|____|__| | | | |
|__________| |______| |__| |__| |____| |__| |__|
"""
)
    # Create a sine wave generator
    sine_wave_generator = sine_generator(frequency=440.0, amplitude=0.5, sample_rate=settings.sample_rate, frames_per_chunk=settings.frames_per_chunk)
    # Create a stream player
    stream_player = StreamPlayer(sample_rate=settings.sample_rate, frames_per_chunk=settings.frames_per_chunk, input_delegate=sine_wave_generator)
    stream_player.play()
    while True:
        sleep(1)
Before we run the program, be sure to start with your volume turned very low or off, as this can be quite loud. After you start the program, adjust the volume up until it’s at a comfortable level.
You can start the program with python -m synth. You should hear a 440 Hz sine wave! (CAUTION: Loud)
You can exit the program with ctrl-c.
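Exiting with ctrl-c raises a KeyboardInterrupt inside the sleep loop, which means the stop method is never called. One possible way to shut down more cleanly is sketched below. It is an optional variation, not part of the tutorial’s code: run_until_interrupted is a hypothetical helper, and FakePlayer is a stand-in with the same play, stop, and is_active methods so the example runs without audio hardware.

```python
from time import sleep


def run_until_interrupted(player, poll_seconds=1.0):
    """Start playback, wait until Ctrl-C or the stream ends, then always stop."""
    player.play()
    try:
        while player.is_active():
            sleep(poll_seconds)
    except KeyboardInterrupt:
        pass  # Ctrl-C is the expected way to quit
    finally:
        player.stop()


class FakePlayer:
    """Hypothetical stand-in with StreamPlayer's play/stop/is_active methods."""

    def __init__(self, active_polls):
        self.remaining = active_polls
        self.stopped = False

    def play(self):
        pass

    def is_active(self):
        # Report "active" for a fixed number of polls, then go inactive.
        self.remaining -= 1
        return self.remaining >= 0

    def stop(self):
        self.stopped = True


fake = FakePlayer(active_polls=2)
run_until_interrupted(fake, poll_seconds=0.01)
print("stop was called:", fake.stopped)
```

In the real program you would pass the stream_player object instead of the fake one.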