22 May 2015, 2:04 p.m.
On 2015-05-22 11:19:54, user783746 <linux.experi@gmail.com> wrote:
I am writing a program to classify recorded phone-call audio files (WAV) as either containing at least some human voice or containing no voice (only DTMF, dial tones, ringtones, noise). I tried implementing a simple VAD (voice activity detector) using ZCR (zero-crossing rate) and signal energy, but these features confuse DTMF and dial-tone files with voice files.
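For reference, a minimal sketch of the two features mentioned above, computed per frame on a synthetic signal. This is not the poster's code: the function name `frame_features`, the 256-sample frame length, and the 8 kHz sample rate are assumptions chosen for illustration. A synthetic DTMF tone produces high energy and a nonzero ZCR in every frame, which may be why fixed thresholds on these two features struggle to separate such tones from voice.

```python
import math

def frame_features(samples, frame_len=256):
    """Return a list of (energy, zcr) tuples, one per non-overlapping frame.

    energy: mean squared amplitude of the frame
    zcr:    fraction of adjacent sample pairs whose sign differs
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        features.append((energy, zcr))
    return features

# Assumed telephone-style sample rate (hypothetical, for illustration only).
SAMPLE_RATE = 8000

# One second of DTMF digit "1": the sum of a 697 Hz and a 1209 Hz sine.
dtmf = [
    0.5 * math.sin(2 * math.pi * 697 * n / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1209 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE)
]
silence = [0.0] * SAMPLE_RATE

dtmf_features = frame_features(dtmf)
silence_features = frame_features(silence)
```

Comparing `dtmf_features` with the same features computed on real speech frames would show how much the two classes overlap in a given data set.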
I also tried a machine-learning approach using an SVM (support vector machine) trained on MFCC coefficients. The results were worse than with the previous approach.
The problem description is pretty vague, but I guess that you're better off asking on the scipy or scikit-learn lists.

Regards
Stéfan