[SciPy-dev] Starting a datasets package, again

David Cournapeau david at ar.media.kyoto-u.ac.jp
Tue Jun 5 21:23:35 EDT 2007


Robert Kern wrote:
>
> The iris and oldfaithful packages you posted earlier were good. We might want to
> fiddle with the metadata later, but what you had is probably sufficient.
Those data were from r-base, and I thought that following our discussion 
on licensing, it would have been better not to use them.
>
>> Would it be ok to create such a package in the next few days with the 
>> incoming data ? I think that starting the actual package may encourage 
>> other people to join the wagon. Concerning the license, if the copyright 
>> holder requires to be cited in the sources, is it OK (I am a bit 
>> confused because modified BSD does not require to keep the 
>> acknowledgments, so I am not sure exactly how to apply it correctly in 
>> this case) ?
>
> It would not be okay to put a BSD license on that data. It would be making a
> false representation as to the actual terms attached to the data. But that's
> fine since they won't be distributed as part of scipy proper anyways and can
> have whatever license the authors deem appropriate. Personally, while I mind
> distributing non-open source *code* in scikits, I don't mind distributing
> non-open source, but redistributable datasets.
The thing I really like about the datasets in R is that any package can
depend on them for demos, examples, etc. I don't know much about
easy_install yet, but does its dependency tracking work well? For
example, if you install foo, which uses faithful in some examples, when
is the dependency resolved? Would it be OK to use the datasets in tests?

For the Old Faithful data, the answer I received from Prof. Azzalini
(whose article "A look at some data on the Old Faithful geyser" contains
the original data) is that an acknowledgment would be welcome. So if we
acknowledge it in the sources, is it OK to apply BSD? (The problem would
be if people using the data were required to acknowledge it as well,
right?)

David


