The Deeptone SDK can be used to process a real-time audio stream.
The data fed in through input_generator should be 16-bit PCM with a sample rate of 16 kHz. If a different sample rate is provided, the audio will be resampled, but the results may be less accurate.
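As a rough illustration of the format requirement above, the following sketch checks a WAV file before it is streamed. It relies only on Python's standard wave module; the helper name check_wav_format is hypothetical and not part of the SDK.

```python
import wave

EXPECTED_SAMPLE_RATE = 16_000  # Hz, as stated in the text above
EXPECTED_SAMPLE_WIDTH = 2      # bytes per sample, i.e. 16-bit PCM


def check_wav_format(path):
    """Return a list of format problems; the list is empty if the file matches.

    Hypothetical helper for illustration only, not an SDK function.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        width = wav.getsampwidth()
        rate = wav.getframerate()
        if width != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {width * 8}-bit, expected 16-bit")
        if rate != EXPECTED_SAMPLE_RATE:
            problems.append(f"sample rate is {rate} Hz, expected 16000 Hz")
    return problems
```

A file with a different sample rate can still be passed on, since the text says it will be resampled; the check only makes the mismatch visible early.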
Configuration options and outputs
The available configuration options and output types depend on the SDK language.
Available configuration options
The following arguments can be passed to the process_stream method:
- input_generator - a generator that yields byte arrays representing properly sampled audio data
- models - the list of model names to use for the audio analysis
- output_period - how often (in milliseconds, a multiple of 64) the output of the models should be returned
A generator is returned that yields one output for every output_period milliseconds of the provided input. Each output contains timestamped results from the requested models.
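Because output_period must be a multiple of 64, it can help to validate or round a requested value before starting a stream. The sketch below is one way to do that. Both helper names are hypothetical, and the requirement that the value be positive is an assumption rather than something the text states.

```python
def validate_output_period(output_period):
    """Return output_period unchanged if it is a positive multiple of 64 ms.

    Hypothetical helper; requiring a positive value is an assumption.
    """
    if isinstance(output_period, bool) or not isinstance(output_period, int):
        raise ValueError("output_period must be an integer number of milliseconds")
    if output_period <= 0 or output_period % 64 != 0:
        raise ValueError(
            f"output_period must be a positive multiple of 64, got {output_period}"
        )
    return output_period


def nearest_valid_period(requested_ms):
    """Round a requested period to the nearest multiple of 64 ms (at least 64)."""
    return max(64, round(requested_ms / 64) * 64)
```

For example, a request for roughly one result per second would be rounded with nearest_valid_period(1000) and the result passed as output_period.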
You can use the process_stream method to process a stream of audio. You will need to provide a valid generator that yields audio bytes. Below are two examples, in which we:
- open an audio file and stream bytes from that file, or
- stream bytes using a microphone as the input source
1. Streaming bytes from an audio file
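A minimal sketch of this case, assuming the file is a 16-bit PCM WAV file and that engine is an already initialised SDK object exposing process_stream with the arguments listed earlier. How the engine is created is not shown here, and wav_chunks and analyse_file are hypothetical names.

```python
import wave


def wav_chunks(path, frames_per_chunk=1024):
    """Yield raw PCM bytes from a WAV file, frames_per_chunk frames at a time."""
    with wave.open(path, "rb") as wav:
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:  # end of file reached
                break
            yield chunk


def analyse_file(engine, path, models, output_period=1024):
    """Pass the file's bytes to engine.process_stream and return its generator.

    `engine` is assumed to be an initialised SDK object; the keyword names
    follow the argument list in the text above.
    """
    return engine.process_stream(
        input_generator=wav_chunks(path),
        models=models,
        output_period=output_period,
    )
```

Reading the file in chunks rather than all at once mimics how audio arrives in a real-time setting and keeps memory use constant.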
2. Streaming bytes from a microphone
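One possible shape for a microphone generator is sketched below. It assumes the third-party PyAudio package for audio capture and a mono input, neither of which is confirmed by the text; microphone_chunks and open_pyaudio_reader are hypothetical names. The generator itself only needs a callable that returns bytes, so it can be exercised without audio hardware.

```python
def microphone_chunks(read_chunk, frames_per_chunk=1024):
    """Yield audio bytes by calling read_chunk(frames_per_chunk) repeatedly.

    Stops when read_chunk returns an empty value. A live microphone never
    does, so the stream then runs until the caller stops iterating.
    """
    while True:
        data = read_chunk(frames_per_chunk)
        if not data:
            break
        yield data


def open_pyaudio_reader(rate=16000, frames_per_buffer=1024):
    """Open the default microphone and return its read method.

    Assumes PyAudio is installed; it is imported lazily so the generator
    above works without it. Mono capture is an assumption.
    """
    import pyaudio

    audio = pyaudio.PyAudio()
    stream = audio.open(
        format=pyaudio.paInt16,  # 16-bit PCM
        channels=1,
        rate=rate,
        input=True,
        frames_per_buffer=frames_per_buffer,
    )
    return stream.read
```

The generator returned by microphone_chunks(open_pyaudio_reader()) could then be supplied as input_generator.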
You can find even more detailed recipes on using a microphone in the Gender model recipes section.
In either of those two cases, the returned object is a generator that will yield results for every output_period milliseconds of input.
The output of the script would be something like:
You can find more detailed recipes for real-time processing of microphone input in the Gender model recipes section.