Advice for Audio classifier based on Voice Activity Detection

user783746 linux.experi at gmail.com
Fri May 22 14:19:54 EDT 2015



I am writing a program to classify recorded phone call audio files (WAV) 
as either containing at least some human voice or containing no voice 
(only DTMF, dial tones, ringtones, noise). I tried implementing a simple 
VAD (voice activity detector) using ZCR (zero-crossing rate) and signal 
energy, but these features confuse DTMF and dial-tone files with files 
that contain voice.
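
For reference, a minimal sketch of the kind of frame-level energy and ZCR 
computation meant here. The function name frame_features, the 8 kHz sample 
rate, and the 50 ms frame / 20 ms hop sizes are assumptions chosen for 
illustration, not values from my actual program, and the input is a 
synthetic tone rather than a real recording.

```python
import numpy as np

def frame_features(signal, frame_len=400, hop=160):
    """Return (energy, zcr) arrays with one value per frame.

    frame_len and hop are in samples; the defaults are illustrative
    (50 ms frames with a 20 ms hop at an assumed 8 kHz sample rate).
    """
    signal = np.asarray(signal, dtype=np.float64)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    energy = np.empty(n_frames)
    zcr = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        # Mean squared amplitude of the frame.
        energy[i] = np.mean(frame ** 2)
        # Fraction of adjacent sample pairs whose sign differs.
        signs = np.sign(frame)
        signs[signs == 0] = 1  # treat exact zeros as positive
        zcr[i] = np.mean(signs[1:] != signs[:-1])
    return energy, zcr

# Synthetic stand-in for a dial tone: one second of a 440 Hz sine.
sr = 8000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
energy, zcr = frame_features(tone)
```

A steady tone like this gives nearly constant energy and ZCR in every 
frame, so a fixed threshold on either value alone treats it the same as 
any other sound of similar loudness.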

I also tried a machine-learning approach using an SVM (support vector 
machine) trained on MFCCs (mel-frequency cepstral coefficients). The 
results were worse than with the previous approach.
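
A rough sketch of an SVM setup of this kind in scikit-learn follows. The 
feature matrix is random synthetic data standing in for per-file MFCC 
vectors (13 values per file is an assumption), and the feature scaling 
step, the RBF kernel, and the 5-fold cross-validation are illustrative 
choices, not a description of what I actually ran.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder features: in a real run each row would be the
# MFCC summary of one WAV file (for example the mean over all frames).
rng = np.random.default_rng(0)
n_per_class, n_mfcc = 40, 13
voice = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, n_mfcc))
non_voice = rng.normal(loc=1.5, scale=1.0, size=(n_per_class, n_mfcc))

X = np.vstack([voice, non_voice])
y = np.array([1] * n_per_class + [0] * n_per_class)  # 1 = voice, 0 = non-voice

# Standardize each feature before the SVM, then score with 5-fold
# stratified cross-validation.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5)
print("mean accuracy: %.3f" % scores.mean())
```

Cross-validated accuracy on held-out files gives a fairer comparison with 
the threshold-based detector than accuracy on the training files.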

I need someone to advise me a little in this domain; I have no previous 
experience in machine learning or AI. I am willing to put a good amount 
of time into it.

I am comfortable working in MATLAB, SciPy, NumPy, scikit-learn, and Python.

Thank you


More information about the scikit-image mailing list