[Tutor] How to change the values in python code?

dn PythonList at DancesWithMice.info
Thu Nov 10 17:41:00 EST 2022


On 11/11/2022 08.33, samira wrote:
> Hi all
> 
> Thanks for your comments and sorry I did not get a chance to respond back.
> I did not wish to stir the pot, LOL just trying to get help. As a newbie it
> is not always easy to pass on the clear content.
> 
> As David mentioned below "the final number in the line (aka "last") is
> increasing, until line 1342 where 493 < 48075 from the previous line" .
> Thanks for pointing this out. However, this is just an example. This
> happens multiple times throughout the data with different numbers. That's
> where I need to find where this happens and get the python print" split
> here" so I can split the file at those locations. I have a large data set
> and manually finding it is impossible.
> 
> In simple language, I would need to write a code  that reads line by line
> and when the above happens, prints "split here" then actually splits the
> file at that location and saves it to a new file.  I know how to write a
> code to save it to file and all but I just can't get the first part right.
> 
> If not possible, it is all good and I will find another solution.


OK then, let's start from first-principles - to suit someone new to 
programming.


The best way to solve a multi-part problem, is to break it into smaller 
problems, or steps in the process.

1 be able to read the entire file
2 be able to identify the dependent-data on each row/record/line of the file
3 splitting into (potentially) multiple output files

The advantage of these smaller problems is that they are (potentially) 
easier to understand and solve.

Indeed, we may even further-divide sub-problem (3):

3a detect when the data directs that a split should happen
3b organise the output files

...and keep dividing down the problems until solutions become more 
evident. (the level at which this happens for you (as a 
Python-Apprentice may be more-detailed than for someone with years of 
experience - but who cares, because the objective is the solution!)


First Step:

Can you write code to solve (1)?

Prove it by adding a 'debug-print' to the loop, and thus being able to 
'see' that the file is being read.


Second Step:

Take one of the printed-lines of *actual data* output by the 'First 
Step'. Write a separate routine which takes that data as a 
literal/constant string-value, and selects the pertinent control-data 
from the line (the piece of data on which the 'Third Step' will depend - 
remember that word?)

If the code can achieve this successfully for one line, it can be 
located inside a code-loop, and perform same on each line of the 
data-file in-turn!

Thus, we have simplified the task, and are solving the sub-problem 
without any influence from other code, ie if we find an error, we know 
it is within this little routine, and has nothing to do with the logic 
in the 'First Step'! Also, there's no need to tackle every line at-once 
- deal with the sub-problem first, and then get more ambitious ideas!


New question:

Are you aware that there are different data-types? That strings are not 
the same as integers (for example). Have you come-across type()?
(if not, time for some research/study)

Accordingly, once the code is able to extract the pertinent field from 
the line check what data-type it is. Consider if that is what you need 
for Step 3a.


Method:

Above I used the term 'routine'. If you are using an editor/IDE, then 
this will be a Python script, ie a .py file. Fine! An alternative 
approach is that some IDEs, eg PyCharm, provide a "Python Console"; or 
if you are a Terminal user, firing-up python[3] will give access to an 
"interactive console", aka the REPL.

These tools are ideal for experimenting, in the sense of the two Steps 
outlined above - I've been shown code using a particular library which 
interacts with workbooks (spreadsheets) and used the Python Console to 
be sure I understood how it works and the differences between several 
grab-some-data methods!

Input: if you already have some code, or care to 'borrow' from solutions 
suggested earlier in this thread, it can be copy-pasted from the editor 
or email-message into the Python Console - to ease your effort.

Output: likewise, after solving a problem in the Python Console, it is a 
simple matter of copy-pasting the (successful) experiment into the 
editor as part of assembling your 'final code'!


You may already have these steps 'done', and only have to extract them 
from an existing attempt to be able to prove the sub-problem(s). That's 
fine. Recommend following the above steps anyway, if only to give you a 
sense of how to solve the next coding-problem which comes your way...


Reverting to this List:

Please copy-paste the two snippets into a reply-message.


You might also like to suggest (in English) how one might attack the 
sub-problems in the "Third Step".

Someone will happily verify your thinking, and perhaps point-out any 
errors or improvements. (list-members find it easier to help with 
existing code than with a 'blank canvas'!)

To get a head-start on such, please consider the sub-problem of perhaps 
multiple output files being output by the completed program[me]:
- what file-names will they need?
- where will the 'base-name' come-from?
- how will the multiple files be distinguished from each-other?

Thereafter, someone will be able to offer advice about tackling the last 
step(s) - but again, if you'd like to show some code, even attempts at a 
sub-solution, it will help you and help improve contributions 'here'.

-- 
Regards,
=dn


More information about the Tutor mailing list