[SciPy-User] RIFF header vs Scipy for odd length payloads

Warren Weckesser warren.weckesser at gmail.com
Tue Sep 1 14:20:00 EDT 2015


On Tue, Sep 1, 2015 at 10:23 AM, Joseph Codadeen <jdmc80 at hotmail.com> wrote:

> Hi,
>
> (tried posting this before with no luck, retrying)
>
> I am a scipy newbie.
>
> The RIFF specification states:
>
> http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt (definitive guide?)
>
>      ckSize    A 32-bit unsigned value identifying the
>                size of ckData. This size value does not
>                include the size of the ckID or ckSize
>                fields or the pad byte at the end of
>                ckData.
>      ckData    Binary data of fixed or variable size. The
>                start of ckData is word-aligned with
>                respect to the start of the RIFF file. If
>                the chunk size is an odd number of bytes, a
>                pad byte with value zero is written after
>                ckData. Word aligning improves access speed
>                (for chunks resident in memory) and
>                maintains compatibility with EA IFF. The
>                ckSize value does not include the pad byte.
>
>      <WORD>    16-bit unsigned        unsigned int
>                quantity in Intel
>                format
>
> However, if I do this and read my HFP wav file via scipy,
>
>     framerate, data = scipy.io.wavfile.read(filepath)
>
> it complains with:
>
>     string size must be a multiple of element size
>
> A bit more debugging added to my test code and numpy (multiarray/ctors.c)
> gives:
>
> Sample file is 16 bits, note that 24 bit samples do not work in scipy
> Got error type "ValueError"
> Analysis of the wav file encountered a problem: "slen: 48683, itemsize: 2
> - string size must be a multiple of element size"
>
> i.e. my payload length is odd, reflecting the actual payload as per the
> spec. The length of the file reflects the additional pad byte.
>
> So for odd length payloads:
> * The spec says not to add the pad byte to the payload length,
> but only to the file length.
> * scipy likes the payload length to be even.
> * If I add the pad byte to the payload length and the file length,
> scipy is happy.
> * If I want to follow the spec then no one can load my files into scipy.
>
> Am I misunderstanding something?
>
> What is the correct thing to do in this case?
> * Follow the spec
> * Follow scipy
> * Fix scipy
>
> I believe it should be to fix scipy unless I am looking at the wrong spec.
> The spec came from
> http://www.digitalpreservation.gov/formats/fdd/fdd000001.shtml
>
> I have tried this on scipy version 0.16.0 on Ubuntu 14.04 LTS
>
> Thanks.
>


Could you provide a link to a wav file that demonstrates the problem?

How many bits per sample is your file?  (Sorry, the answer is not clear to
me from your email.)  Scipy's wav reader does not support 24 bit files.
If your file is 24 bit, you can try wavio, a small module I wrote
specifically to read 24 bit wav files into a numpy array:
https://github.com/WarrenWeckesser/wavio
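
One way to answer the bits-per-sample question without scipy is to ask
Python's standard-library wave module for the sample width. This is only a
minimal sketch: the helper name bits_per_sample is made up for illustration,
and it assumes an uncompressed PCM file that the wave module can parse.

```python
import io
import wave


def bits_per_sample(source):
    """Return the sample width in bits of a PCM WAV file.

    `source` may be a file path or an open binary file object.
    (Hypothetical helper, not part of scipy or wavio.)
    """
    with wave.open(source, "rb") as wav:
        # getsampwidth() reports bytes per sample, e.g. 2 for 16-bit audio.
        return wav.getsampwidth() * 8


# Usage: build a tiny 16-bit mono file in memory and inspect it.
demo = io.BytesIO()
with wave.open(demo, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)      # 2 bytes per sample = 16 bits
    writer.setframerate(8000)
    writer.writeframes(b"\x00\x00" * 4)
demo.seek(0)
print(bits_per_sample(demo))
```

A result of 24 would mean the file is outside what scipy's reader supports,
as noted above.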


Warren

P.S. For anyone reading this, there is also an issue on github:
https://github.com/scipy/scipy/issues/5175
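
For reference, the pad-byte rule quoted from the spec above (ckSize excludes
the pad byte, and an odd-sized ckData is followed by one zero byte) can be
sketched as a small chunk walker. This is an illustration under stated
assumptions, not scipy's actual reader: the names iter_riff_chunks and
make_chunk are hypothetical, and only top-level chunks of a little-endian
RIFF file are handled.

```python
import io
import struct


def iter_riff_chunks(stream):
    """Yield (chunk_id, ck_size, data_offset) for each top-level RIFF chunk."""
    header = stream.read(12)
    if len(header) < 12 or header[:4] != b"RIFF":
        raise ValueError("not a RIFF file")
    # The RIFF size field counts everything after itself:
    # the 4-byte form type plus all sub-chunks, pad bytes included.
    riff_size = struct.unpack("<I", header[4:8])[0]
    end = 8 + riff_size
    pos = 12
    while pos + 8 <= end:
        stream.seek(pos)
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            break  # truncated file
        chunk_id, ck_size = struct.unpack("<4sI", chunk_header)
        yield chunk_id, ck_size, pos + 8
        # ckSize does not include the pad byte, so step over one
        # extra byte whenever the payload length is odd.
        pos += 8 + ck_size + (ck_size & 1)


def make_chunk(chunk_id, payload):
    """Serialize one chunk, appending the zero pad byte for odd payloads."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


# Usage: a 3-byte chunk (odd, so padded) followed by a 4-byte "data" chunk.
body = (b"WAVE"
        + make_chunk(b"odd ", b"\x01\x02\x03")
        + make_chunk(b"data", b"\x00\x00\x01\x00"))
riff = b"RIFF" + struct.pack("<I", len(body)) + body
chunks = list(iter_riff_chunks(io.BytesIO(riff)))
for chunk_id, ck_size, offset in chunks:
    print(chunk_id, ck_size, offset)
```

The walker keeps reporting the unpadded ckSize to the caller and uses the
padded length only to find the next chunk header, which is how the spec text
reads.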


