[Tutor] How to check whether audio bytes contain empty noise or actual voice/signal?

Fri Oct 25 19:42:34 EDT 2024

On 26/10/24 05:25, marc nicole via Tutor wrote:
> Hello Python fellows,
> 
> I hope this question is not very far from the main topic of this list, but
> I have a hard time finding a way to check whether audio data samples
> contain empty noise or actual significant voice/noise.
> 
> I am using PyAudio to collect the sound through my PC mic as follows:
> 
> import pyaudio
> 
> FRAMES_PER_BUFFER = 1024
> FORMAT = pyaudio.paInt16
> CHANNELS = 1
> RATE = 48000
> RECORD_SECONDS = 2
> audio = pyaudio.PyAudio()
> stream = audio.open(format=FORMAT,
>                  channels=CHANNELS,
>                  rate=RATE,
>                  input=True,
>                  frames_per_buffer=FRAMES_PER_BUFFER,
>                  input_device_index=2)
> data = stream.read(FRAMES_PER_BUFFER)
> 
> 
> I want to know whether data contains voice signals or empty sound.
> Note that the variable always contains bytes (empty or sound) if I print
> it.
> 
> Is there a straightforward "easy way" to check whether data is filled with
> empty noise or whether somebody has made noise/spoken?

If it were "easy" then there would be articles and tutorials aplenty...

Signal processing is a very involved topic.

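A Fourier Transform can be thought of as converting a graph of signal
amplitude against time into its frequency components. Speech can then be
looked for in the bands where a voice carries most of its energy
(telephony, for example, keeps roughly 300-3400 Hz).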
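
A minimal sketch of that idea, using only the standard library: it
decodes one chunk of paInt16 bytes and reports what share of the
spectral energy falls in a speech-like band. The name
band_energy_fraction and the 300-3400 Hz limits are my assumptions, not
anything PyAudio supplies, and a plain (slow) DFT is used only to avoid
extra dependencies; numpy.fft would be the usual choice.

```python
import array
import cmath
import math

RATE = 48000  # samples per second, as in the question


def band_energy_fraction(data, rate=RATE, low_hz=300.0, high_hz=3400.0):
    """Share (0.0-1.0) of spectral energy between low_hz and high_hz.

    Hypothetical helper: `data` is assumed to hold mono, native-endian
    16-bit samples (FORMAT = paInt16, CHANNELS = 1 in the question).
    """
    samples = array.array("h")
    samples.frombytes(data)
    n = len(samples)
    if n == 0:
        return 0.0
    # Precompute the n complex roots of unity used by the DFT.
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    total = in_band = 0.0
    # Skip bin 0 (the DC offset) and stop at the Nyquist frequency.
    for k in range(1, n // 2 + 1):
        coeff = sum(s * twiddle[(k * t) % n] for t, s in enumerate(samples))
        energy = abs(coeff) ** 2
        total += energy
        if low_hz <= k * rate / n <= high_hz:
            in_band += energy
    return in_band / total if total else 0.0
```

Calling band_energy_fraction(data) on the bytes returned by
stream.read() gives a number that should rise when someone speaks.
Where to draw the line is something to find by experiment, since fans
and hum can also put energy in that band.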

Filtering allows wanted frequencies to be kept and unwanted ones to be
removed (probably not useful, per spec).

Spectral Analysis is similar to the above, but looks at how the frequency
content changes over time.
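
One way to sketch "frequency content over time" with nothing but the
standard library is to cut the recording into frames and measure the
power at one chosen frequency in each frame. The Goertzel algorithm is
used here purely as my choice for illustration (a spectrogram built with
numpy or scipy would be more usual), and the names goertzel_power and
power_over_time are hypothetical.

```python
import array
import math


def goertzel_power(samples, rate, freq_hz):
    """Power of the freq_hz component in one frame (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / rate)
    prev = prev2 = 0.0
    for x in samples:
        # Second-order recurrence; the right-hand side uses the old values.
        prev, prev2 = x + coeff * prev - prev2, prev
    return prev * prev + prev2 * prev2 - coeff * prev * prev2


def power_over_time(data, rate, freq_hz, frame_len=1024):
    """Power at freq_hz for each successive full frame of 16-bit mono audio."""
    samples = array.array("h")
    samples.frombytes(data)  # assumes native-endian 16-bit samples
    return [
        goertzel_power(samples[start:start + frame_len], rate, freq_hz)
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]
```

Running power_over_time() for a handful of frequencies gives a crude
table of frequency against time; a jump from one frame to the next
suggests that something started (or stopped) making sound.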

Time-Domain analysis stays at the level of the current code. Try
graphing the sample values. A lead-in period (of "silence") should
enable identification of background/technical noise. Perhaps thereafter,
the presence of sound over-and-above the "background" will be sufficient
for your purposes (use-case not stated).
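
A rough sketch of that last suggestion, assuming the bytes are mono,
native-endian 16-bit samples as in the question. The helper names (rms,
noise_floor, has_sound) and the factor of 3 are mine and purely
illustrative.

```python
import array
import math


def rms(data):
    """Root-mean-square amplitude of one chunk of 16-bit samples."""
    samples = array.array("h")
    samples.frombytes(data)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def noise_floor(lead_in_chunks):
    """Average RMS over chunks recorded while nobody is making a sound."""
    levels = [rms(chunk) for chunk in lead_in_chunks]
    if not levels:
        raise ValueError("need at least one lead-in chunk")
    return sum(levels) / len(levels)


def has_sound(data, floor, factor=3.0):
    """True when the chunk is clearly louder than the measured background."""
    return rms(data) > floor * factor
```

With the stream from the question, something like
floor = noise_floor([stream.read(FRAMES_PER_BUFFER) for _ in range(20)])
during a quiet moment, followed by
has_sound(stream.read(FRAMES_PER_BUFFER), floor), would be the intended
use. The factor is a guess to be tuned against your own microphone and
room.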

-- 
Regards,
=dn