[Tutor] Python text file read/compare

samira khoda samiraeastcoast at gmail.com
Mon Oct 3 16:14:20 EDT 2022


Hi

Thank you very much for your feedback. I just started working on the file
again. Unfortunately I am still not getting anywhere with the modifications
I made to the code. I don't know at what point I should create the new file,
write to it and close it. Basically, as I mentioned before, I need to find
where the timestamps jump down to zero or to a number lower than the start
time, so I can split the file there and make a new file. For your
information, the timestamps are in milliseconds.

Thanks in advance for your help. Please let me know if you need more
clarification to assist me.

kyk=( 'hfx-test-.txt'  , 'r')

line_numbers=[]
i=0
current_time=""
count=1

for line in kyk:
    if i < 488:
        i=0
        line_numbers.append(current_time)
        current_time=""
    else:
        current_time+=line
        i+=1

This is where it does not write to the file:

opf=open("kyk_txt_new_file.txt","w")
    if current_time <start_time:
        splitlines(True).add(line_number);
    opf.write("this is the timestamps after the line splits.")
    kyk.close()


Here are the first and last few lines of my text file data.

time stamps
-1.75, 1.08, 10.35, -0.10, -0.01, -0.01, 23.19, 488
-1.75, 1.12, 10.39, -0.10, -0.01, -0.01, 23.20, 521

9.65, -1.31, -1.95, -0.11, -0.06, -0.02, 22.05, 15339436
9.56, -1.32, -1.97, -0.10, -0.00, -0.01, 22.05, 15339495
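
A minimal sketch of pulling the timestamp out of one data line, assuming
each data line ends with a plain integer number of milliseconds in its last
comma-separated field. The helper name parse_timestamp is made up for
illustration.

```python
def parse_timestamp(line):
    """Return the last comma-separated field of a data line as an int.

    Assumption: the timestamp is the final column and is a whole number
    of milliseconds. int() ignores surrounding spaces and the newline.
    """
    fields = line.split(",")
    return int(fields[-1])


sample = "-1.75, 1.08, 10.35, -0.10, -0.01, -0.01, 23.19, 488\n"
print(parse_timestamp(sample))
```

A line that does not end in an integer, such as a header line, makes int()
raise ValueError, so real code would need to skip or handle such lines.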

I was also provided with the pseudocode below, which I am trying to
follow, if that helps to guide me along the way.

-> load sourceFile (a copy of the raw data file)

line_number = 0
start_time = 0
split_numbers = []
not_done = true

while(not_done):
       ->read line from sourcefile
       ->split line on ','
       ->convert last item on line to unsigned long and store in current_time

       if current_time < start_time:
              split_numbers.add(line_number)
              start_time = current_time
       if end_of_file:
              not_done = false;

for s in split_numbers:
       ->create newfile
       for i = 0, i < s, i++:
              ->read line from sourcefile
              ->write line to newfile
       ->close newfile

->Close sourcefile
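
One possible Python rendering of the pseudocode, offered as a sketch rather
than the intended solution. It makes two assumptions that the pseudocode
does not state: each line's timestamp is compared with the previous line's
timestamp (so the remembered time is updated on every line, not only inside
the if), and the output files are written in the same pass instead of in a
second loop. The function name split_on_time_reset, the prefix parameter and
the part_N.txt file names are invented for illustration.

```python
def split_on_time_reset(source_path, prefix="part_"):
    """Split source_path into numbered files at each timestamp reset.

    A new output file is started whenever a line's timestamp (last
    comma-separated field, in milliseconds) is lower than the previous
    line's. Returns the list of output file names that were written.
    """
    written = []
    out = None
    previous_time = None
    try:
        with open(source_path) as source:
            for line in source:
                if not line.strip():
                    continue  # skip blank lines
                try:
                    current_time = int(line.rsplit(",", 1)[-1])
                except ValueError:
                    continue  # skip headers or other non-data lines
                # Start a new file on the first data line and on every reset.
                if out is None or current_time < previous_time:
                    if out is not None:
                        out.close()
                    name = f"{prefix}{len(written) + 1}.txt"
                    out = open(name, "w")
                    written.append(name)
                out.write(line)
                previous_time = current_time
    finally:
        # Close the last output file even if reading failed part way.
        if out is not None:
            out.close()
    return written
```

Calling split_on_time_reset('hfx-test-.txt') would then write part_1.txt,
part_2.txt and so on, one file per run of increasing timestamps. Writing as
the file is read avoids having to remember line numbers and reread the
source.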

On Thu, Sep 29, 2022 at 6:46 PM Cameron Simpson <cs at cskk.id.au> wrote:

> On 29Sep2022 15:09, samira khoda <samiraeastcoast at gmail.com> wrote:
> >Below is the snapshot of the data. Basically I need to find where the
> >timestamps jump down to zero or the lowest number so I can split the file.
> >For your information the timestamps are in milliseconds on the last
> >column.
> >
> >[image: image.png]
>
> This list strips nontext attachments. Please just paste a few lines of
> example data directly into your message.
>
> >And below is the code I wrote but nothing came out of it.
>
> I'll make a few remarks about the code inline below.
>
> >with open('hfx-test-.txt') as kyk:
> >    op=''
> >    start=0
> >    count=1
> >    for x in kyk.read().split("\n"):
>
> This reads all the data and then breaks it up on newlines. For big files
> that can be expensive. Text files are iterables, yielding lines, so you
> can write this:
>
>      for line in kyk:
>          x = line.rstrip()   # this removes the newline
>
> >        if(x=='0'):
>
> You don't need brackets in Python if-statements:
>
>      if x =='0':
>
> >            if (start==1):
>
> If you don't get any files, presumably `start` is never equal to `1`.
>
> >                with open(str(count)+ '.txt', 'w') as opf:
> >                    opf.write(op)
> >                    opf.close()
>
> The `with open()` form does the `close()` for you, you do not need the
> `opf.close()`.
>
> >                    op=''
> >                    count+=1
>
> and I'd have these lines outside the `with` i.e. less indented.
>
> >        else:
> >            start=1
> >
> >    else:
> >        if(op==''):
> >          op = x
> >        else:
> >            op= op+ '\n' + x
>
> It looks like you're accumulating the `x` values as one big string.
> That will work, but it would be more common to accumulate a list, eg:
>
> Up the top: ops = [] # an empty list.
> Down here: ops.append(x)
> In the `opf` with-statement:
>
>      for op in ops:
>          print(op, file=opf)
>      ops = []  # we have written them, reset the list to empty
>
> >kyk.close()
>
> As with `opf`, the with statement closes the file for you. You don't
> need this line.
>
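
Put together, the list-accumulation and with-statement advice might look
like the sketch below. The helper name write_lines and the sample strings
are hypothetical, and the output goes to a temporary directory only so the
example can run anywhere.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write each string in lines to path, one per line."""
    with open(path, "w") as opf:
        for op in lines:
            print(op, file=opf)  # print() adds the newline
    # No opf.close() here: leaving the with-block closes the file.


ops = []                 # accumulate lines in a list, not one big string
ops.append("first line")
ops.append("second line")

out_path = os.path.join(tempfile.mkdtemp(), "1.txt")
write_lines(out_path, ops)
ops = []                 # written, so reset the list to empty
```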
> Finally, if your code's not doing what you intend, then _either_ the
> if-statements have the wrong tests _or_ the variables you're testing do
> not have the values you expect.
>
> Put some `print()` calls in the code to see what's going on. Examples:
>
>      for line in kyk:
>          x = line.rstrip()   # this removes the newline
>          print("x =", repr(x))
>          print("start =", start, "count =", count, "op = ", repr(op))
>
> That should show you the lines of data as you read them, and the values
> of the variables you're testing. Hopefully that should show you when
> things go wrong, and lead you to a fix.
>
> The `repr(x)` expression prints a representation of the value, which is
> particularly handy for things like strings.
>
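
A small illustration of why repr() helps here, using a made-up line that
carries a trailing space. Only the newline is stripped in this sketch so
that the space survives to be seen.

```python
line = "0 \n"            # hypothetical line: "0", a space, then a newline
x = line.rstrip("\n")    # strip only the newline, keep the space

print("x =", x)          # the trailing space is invisible in this output
print("x =", repr(x))    # repr() shows the quotes, so the space is visible
print(x == "0")          # False: the hidden space makes the test fail
```

A test such as x == '0' that never succeeds is often explained by this kind
of hidden whitespace.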
> Cheers,
> Cameron Simpson <cs at cskk.id.au>

