[Numpy-discussion] seeking advice on a fast string->array conversion

Darren Dale dsdale24 at gmail.com
Tue Nov 16 09:29:54 EST 2010


Sorry, I accidentally hit send long before I was finished writing. But
to answer your question, they contain many 2048-element multi-channel
analyzer spectra.

Darren

On Tue, Nov 16, 2010 at 9:26 AM, william ratcliff
<william.ratcliff at gmail.com> wrote:
> Actually,
> I do use spec when I have synchrotron experiments.  But why are your files so
> large?
>
> On Nov 16, 2010 9:20 AM, "Darren Dale" <dsdale24 at gmail.com> wrote:
>> I am wrapping up a small package to parse a particular ascii-encoded
>> file format generated by a program we use heavily here at the lab. (In
>> the unlikely event that you work at a synchrotron, and use Certified
>> Scientific's "spec" program, and are actually interested, the code is
>> currently available at
>> https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/
>> .)
>>
>> I have been benchmarking the project against another python package
>> developed by a colleague, which is an extension module written in pure
>> C. My python/cython project takes about twice as long to parse and
>> index a file (~0.8 seconds for 100MB), which is acceptable. However,
>> actually converting ascii strings to numpy arrays, which is done using
>> numpy.fromstring, takes a factor of 10 longer than the extension
>> module. So I am wondering about the performance of np.fromstring:
>>
>> import time
>> import numpy as np
>> s = b'1 ' * 2048 * 1200
>> d = time.time()
>> x = np.fromstring(s, dtype=float, sep=' ')  # sep=' ' so the ascii text is parsed, not reinterpreted as raw bytes
>> print time.time() - d
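>>
>> For comparison, here is a quick sketch of the same payload run through a
>> couple of plausible alternatives (np.array over str.split, and np.fromiter).
>> I have not timed these here; the little bench() helper is just for this
>> sketch, and the numbers are only meant as a starting point for seeing where
>> the time goes:
>>
>> import time
>> import numpy as np
>>
>> s = b'1 ' * 2048 * 1200  # same ascii payload as in the snippet above
>>
>> def bench(label, make):
>>     t0 = time.time()
>>     x = make()
>>     print label, time.time() - t0, x.shape
>>
>> # parse the text directly, as in the snippet above
>> bench('fromstring  ', lambda: np.fromstring(s, dtype=float, sep=' '))
>> # let Python split the string, then convert the pieces in one go
>> bench('array(split)', lambda: np.array(s.split(), dtype=float))
>> # stream the pieces through a generator instead of a temporary list
>> bench('fromiter    ', lambda: np.fromiter((float(v) for v in s.split()),
>>                                           dtype=float))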


