[Tutor] Suggestions Please
Alan Gauld
alan.gauld at btinternet.com
Tue Oct 7 11:39:08 CEST 2014
On 06/10/14 23:42, Phillip Pugh wrote:
> I am trying to decide if Python is the right toolset for me.
> I do a lot of data analytics.
It can almost certainly do what you want but there may be other
tools that do it better. However, data analytics is quite vague.
It depends on what kind of data and what kind of analysis.
> Can you point me to a really good, intuitive resource
Intuitive depends on the student.
But we also need to know what kind of data.
Is it stored in flat files?
In a SQL database (which one?)
In a NoSQL database (which one?)
Python can handle all of those but the tutorials involved
will all be different.
If you want a general introduction with some SQL database
specifics you can try my tutorial (see sig). Whether you
find it intuitive is another matter.
> I have one text file that is 500,000 + records..
That's not very big in modern computing terms.
You could probably just read that straight into memory.
> I need to read the file,
What kind of file? A database file such as FoxPro
or Access? Or a CSV export? Or something else?
> move "structured" data around and then write it to a new file.
What is structured about it? Fixed column width?
Fixed relative position? Binary format?
> The txt file has several data elements and is
> 300 characters per line.
> I am only interested in the first two fields.
> The first data element is 19 characters.
> The second data element is 6 characters.
There are two ways in Python to extract 'columns' from a file.
If you know the separator you can use either the csv module (best)
or string.split(<separator>) to create a list of fields.
If it's a fixed-length record (with potentially no separator)
you can use string slicing. In your case that would be
field1 = string[:19]; field2 = string[19:25]
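As a minimal sketch of the slicing approach, assuming the two fields
sit at the very start of each 300 character line with no separator
(the sample record and the name split_fixed are made up for
illustration):

```python
# Hypothetical record: a 19-character field, a 6-character field,
# then filler to pad the line out to 300 characters.
line = "ABCDEFGHIJKLMNOPQRS" + "123456" + "x" * 275


def split_fixed(record):
    """Return the first two fixed-width fields of a record."""
    field1 = record[:19]    # characters 0 to 18
    field2 = record[19:25]  # characters 19 to 24
    return field1, field2


field1, field2 = split_fixed(line)
print(field1)
print(field2)
```

Slices count from zero and exclude the end index, so [19:25] picks
up exactly six characters.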
> I want to rearrange the data by moving the 6 characters data
> in front of the 19 characters data
Do you need a separator?
> and then write the 25 character data to a new file.
The reading and writing of the files is straightforward; any tutorial
will show you that.
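For illustration only, here is one way the whole job might look,
assuming plain text input, fixed-width fields at the start of each
line and no separator in the output unless you ask for one. The
function name rearrange, the file names and the sample data are all
invented for the demo:

```python
import os
import tempfile


def rearrange(in_path, out_path, sep=""):
    """Write the 6-character field ahead of the 19-character field."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for line in infile:            # one record at a time
            record = line.rstrip("\n")
            field1 = record[:19]
            field2 = record[19:25]
            outfile.write(field2 + sep + field1 + "\n")


# Demo using throwaway files in a temporary directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "input.txt")
dst = os.path.join(workdir, "output.txt")
with open(src, "w") as f:
    f.write("A" * 19 + "123456" + "x" * 275 + "\n")
    f.write("B" * 19 + "654321" + "y" * 275 + "\n")

rearrange(src, dst)

with open(dst) as f:
    result = f.read().splitlines()
print(result)
```

Looping over the file object reads a line at a time, so the same
code works whether the file has 500 records or 500,000.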
> I have spent some time digging for the correct resource,
> However being new to Python and the syntax for the language
> makes it slow going. I would like to see if I can speed up
> the learning curve.
So far it sounds like you don't need any of the high-powered data
analysis tools like R or Pandas; you are just doing basic data
extraction and manipulation. For that, standard Python should
be fine and most tutorials include all you need.
If you look at mine the most relevant topics from the contents
are:
The raw materials - variables & data types
Looping - basic loops in Python
Branching - basic selection in python
Handling Files - files
Handling Text - text
and possibly
Working with Databases - using SQL in Python
You probably should read the CSV module documentation too.
I suspect it will do a lot of what you want.
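If the file turns out to be separated rather than fixed width, a
rough sketch with the csv module might look like this. The comma
separator and the sample rows are assumptions, and io.StringIO
stands in for real files so the example runs on its own:

```python
import csv
import io

# Hypothetical comma-separated input; a real script would pass
# file objects from open() instead of StringIO.
sample = io.StringIO("alpha,100,extra\nbeta,200,more\n")
out = io.StringIO()

reader = csv.reader(sample)
writer = csv.writer(out, lineterminator="\n")
for row in reader:
    # Keep only the first two fields, second one first.
    writer.writerow([row[1], row[0]])

swapped = out.getvalue()
print(swapped)
```

When reading and writing real files with the csv module, the
documentation recommends opening them with newline="".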
HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos