[Numpy-discussion] Help Convolution with binaural filters(HRTFs)

arthur de conihout arthurdeconihout at gmail.com
Thu May 27 10:31:36 EDT 2010


""Can you maybe give some hint?""
The most commonly used model for HRTF implementation is the one referred to
as "minimum-phase filter plus pure delay". It is composed of:
-a minimum-phase filter, which accounts for the magnitude spectrum of the HRTF
-and a pure delay, which represents the temporal information contained in
the HRTF

If H(f) is the HRTF to be implemented, the corresponding minimum-phase part
Hmin is given by the standard Hilbert-transform relation:

Hmin(f) = |H(f)| * exp(j*phi_min(f)),  with phi_min(f) = -Hilbert{ ln|H(f)| }

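As a sketch of that construction (the function name is hypothetical), the
minimum-phase impulse response can be obtained from the HRTF magnitude alone
with the standard real-cepstrum (homomorphic) method, using only numpy:

```python
import numpy as np

def minimum_phase_from_magnitude(mag, eps=1e-12):
    """Build a minimum-phase impulse response whose FFT magnitude matches
    `mag` (a full-length, conjugate-symmetric magnitude spectrum), via the
    real-cepstrum (homomorphic) method."""
    n = len(mag)
    # Real cepstrum of the log-magnitude (eps guards against log(0)).
    cepstrum = np.fft.ifft(np.log(np.maximum(mag, eps))).real
    # Fold the cepstrum: keep c[0], double the causal part, keep the
    # Nyquist bin (even n), zero the anticausal part.
    window = np.zeros(n)
    window[0] = 1.0
    window[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        window[n // 2] = 1.0
    # exp(FFT(folded cepstrum)) is the minimum-phase spectrum.
    return np.fft.ifft(np.exp(np.fft.fft(window * cepstrum))).real
```

For an already minimum-phase filter such as h = [1, 0.5] (zero at -0.5,
inside the unit circle) this reconstructs h itself from |H(f)| alone.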

2010/5/27 Friedrich Romstedt <friedrichromstedt at gmail.com>

> 2010/5/27 arthur de conihout <arthurdeconihout at gmail.com>:
> > I try to make me clearer on my project :
> > [...]
>
> I think I understood now.  Thank you for explanation.
>
> >     original = [s / 2.0**15 for s in original]
> >
> >     nframes = filtre.getnframes()
> >     nchannels = filtre.getnchannels()
> >     filtre = struct.unpack_from("%dh" % (nframes * nchannels),
> >                                 filtre.readframes(nframes * nchannels))
> >     filtre = [s / 2.0**15 for s in filtre]
> >
> >     result = numpy.convolve(original, filtre)
>
> In addition to what David pointed out, it should also suffice to
> normalise only the impulse response channel.
>
> Furthermore, I would spend some thought on what "normalised" actually
> means.  I think it can only be understood in the Fourier domain.  Taking
> the conservation of energy into account, i.e. the conservation of the
> L2 norm of the impulse response function under Fourier transformation,
> it can also be normalised in the time domain by setting the
> time-domain L2 norm to unity.  (This is quite different from maximum
> normalisation.)  Then the energy content of the signal before and
> after the processing is identical, i.e. the filter emphasises some
> frequencies at the expense of others, which are diminished in volume.
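A minimal sketch of that L2 normalisation (example values hypothetical); by
Parseval's theorem the time-domain and frequency-domain energies agree, so
a unit time-domain L2 norm also fixes the spectral energy:

```python
import numpy as np

# Hypothetical impulse response; any 1-D array would do.
h = np.array([0.9, 0.4, -0.2, 0.1])

# Normalise so the time-domain L2 norm is unity.
h_unit = h / np.linalg.norm(h)

# Parseval: sum |h|^2 == (1/N) * sum |FFT(h)|^2, so this is 1.0 too.
spec_energy = np.sum(np.abs(np.fft.fft(h_unit)) ** 2) / len(h_unit)
```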
>
> In my opinion a better approach might be to use the (already directly)
> determined transfer function.  This can be normalised by the power of
> the input signal, which can be determined e.g. by a reference
> measurement without any obstruction at a defined distance, I would say.
>
> >     result = [int(sample * 2.0**15) for sample in result]
> >     filtered.writeframes(struct.pack('%dh' % len(result), *result))
>
> It's a mystery to me too why you observe this octave jump you reported
> on.  I guess it's an octave, because you can compensate by doubling
> the sampling rate.  Can you check whether your playback program loads
> the output file as single- or double-channel?  I also suspect some
> relation between your observation of "incomplete convolution" and this
> pitch change.
>
> And now, I'm quite puzzled by the way you handle multiple channels in
> the input file.  You possibly load a two-channel input file, extract
> *all* the data, i.e. data from ch1, ch2, ch1, ch2, ..., and then
> convolve this?  For single-channel input this is correct, but when
> your input data is two-channel, it explains on the one hand why your
> program maybe doesn't work properly and on the other why you get
> pitch halving (you double the length of each frame).  I think it is
> possible that you didn't notice stranger phenomena, since the impulse
> response function is applied to each datum individually.
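To make the channel point concrete, here is a sketch (function and variable
names hypothetical) that de-interleaves the stereo frame data, convolves
each channel with the impulse response separately, and re-interleaves:

```python
import numpy as np

def convolve_stereo(interleaved, impulse_response):
    """Convolve each channel of interleaved stereo data separately,
    instead of convolving the raw ch1, ch2, ch1, ch2, ... stream."""
    data = np.asarray(interleaved, dtype=float)
    left, right = data[0::2], data[1::2]      # de-interleave
    h = np.asarray(impulse_response, dtype=float)
    out_l = np.convolve(left, h)
    out_r = np.convolve(right, h)
    # Re-interleave the two filtered channels.
    out = np.empty(2 * len(out_l))
    out[0::2], out[1::2] = out_l, out_r
    return out
```

With the identity filter [1.0] this returns the input unchanged, which is a
quick sanity check that the interleaving round-trips correctly.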
>
> > I had a look at what you sent me; I'm on my way to understanding it.
> > Maybe your initialisation tests will allow me to tell the difference
> > between the various wav formats?
>
> Exactly!  It should be able to decode at least most wavs, with
> different sample widths and different channel counts.  But I think I
> will myself in future stick to David's advice ...
>
> > I want to be able to handle every format (16-bit unsigned, 32-bit).
> > What precautions do I have to take in the filtering?  Do the filter
> > and the original have to be in the same format, or?
> > Thank you
>
> All these problems would go away if you used the transfer function
> directly.  It should be present as some function which can easily be
> interpolated to the frequency points your FFT of the input signal
> yields.  Interpolating the time-domain impulse response is not a good
> idea, since it already assumes band-limitedness, which is not
> necessarily fulfilled, I guess.  It's also much more complicated.
>
> When you measure the transfer function, how can you reconstruct a
> unique impulse response?  Do you measure the phases too?  When you
> excite with white noise, take the autocorrelation of the result and
> Fourier transform it, the phases of the transfer function are lost;
> you only obtain (squared) amplitudes.  Of course, multiplying in
> Fourier space by the transfer function is the same as convolving with
> its inverse Fourier transform, but I'm not yet convinced that doing
> it in the time domain is a good idea, though ... Can you maybe give
> some hint?
>
> This means you essentially convolve with the autocorrelation function
> itself - but this one is symmetric and therefore not causal.  I think
> it's strange when sound starts before it starts in the input ... the
> impulse response must be causal.  Anyway, this problem also remains
> when you multiply by the plain real numbers in the Fourier domain
> which result from the Fourier transform of the autocorrelation
> function - it's still acausal.  The only way I currently see to
> circumvent this is to measure the phases of the transfer function too
> - i.e. to do a frequency sweep; white-noise autocorrelation is then,
> as far as I can tell, not sufficient.
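The acausality claim can be checked numerically: the inverse Fourier
transform of |H|^2 is the (circular) autocorrelation of h, which is
symmetric about zero lag and therefore has energy at "negative" times even
when h itself is causal. A small sketch (example values hypothetical):

```python
import numpy as np

# A causal impulse response (hypothetical example values).
h = np.zeros(16)
h[:3] = [1.0, 0.5, 0.25]

H = np.fft.fft(h)
# Phase discarded: only |H|^2 survives a white-noise/autocorrelation
# measurement.  Its inverse transform is the circular autocorrelation
# of h ...
r = np.fft.ifft(np.abs(H) ** 2).real
# ... which is symmetric, r[k] == r[N-k]: nonzero values appear at
# negative lags (the end of the array), i.e. the filter is acausal.
```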
>
> Sorry, I feel it didn't come out very clearly ...
>
> Friedrich
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>