[Tutor] script guidelines

Tue Oct 3 14:17:56 EDT 2017

On 03/10/17 09:48, renukesh nk wrote:
> requirement:
> i have a directory , that contains multiple sub directories, each sub
> directory has multiple text and log files, my script fetches  the required
> lines from all the sub directories and stores it in one text file.
> 

I'm not too sure what you mean by writing the "required lines"
but I've added some general comments below

> but i want it to store separate text file for each  sub directory ,after
> fetching the contents. can anyone please help me where to edit my script.

It is better if you copy the script into the body of the email,
attachments often get stripped by the server so we can't see
them. In this case you got lucky...

Comments below:

> import zipfile,fnmatch,os
> import os, shutil, re, glob

you import os twice, its not a bug but it is a
potential future maintenance gotcha.

> from os.path import isfile, join

you use os.path.xxx so don't need this import

> from os import walk

You use os.walk so don't need this import.

> root_src_dir = r'E:\\New folder'
> root_dst_dir = "E:\\destination"

dst_dir should be a raw string too.
And why double quotes here and single quotes
above? It's good to be consistent.

> for src_dir, dirs, files in os.walk(root_src_dir):
>     dst_dir = src_dir.replace(root_src_dir, root_dst_dir, 1)
>     if not os.path.exists(dst_dir):
>        os.makedirs(dst_dir)
>     for file_ in files:
>        src_file = os.path.join(src_dir, file_)
>        dst_file = os.path.join(dst_dir, file_)
>        if os.path.exists(dst_file):
>            os.remove(dst_file)
>        shutil.copy(src_file, dst_dir)

This just copies everything so why not use
the shutil.copytree() function instead?

> rootPath = r"E:\\destination"

Rather than duplicate the string use root_dst_dir

> pattern = '*.zip'
> for root, dirs, files in os.walk(rootPath):

This is where you do stuff at a per directory level so
this is probably where you want to create your logfile
and write to it. But i'm still not sure what you are
looking to write...

>    for filename in fnmatch.filter(files, pattern):
>         print(os.path.join(root, filename))
>         zipfile.ZipFile(os.path.join(root,
>                         filename)).extractall(os.path.join(root,
>                         os.path.splitext(filename)[0]))

This chunk looks at every zip file and extracts the contents into the
same folder(what if there are duplicate filenames?) But you don't do
anything with those files?

>     os.chdir(rootPath)
>     for file in glob.glob(pattern):
>        f = open((file.rsplit(".", 1)[0]) + "ANA.txt", "w")
>
>        f.close()

Then you change to the destination root and iterate over any zip files
at the root level, opening a new file and immediately closing it again?

At this point I'm confused?
It would be good to put a comment before each block
explaining the intended purpose of the code - what
it is expected to achieve and why you are doing it.

> ##here folder output
> mypath =r"E:\\destination"

Again this could use root_dst_dir to avoid string
duplication (and potential error)

> newpath = os.path.expanduser("E:\\destination")

expanduser() replaces an initial tilde(~) with the homedir but
since you don't have an initial tilde its completely
redundant here.

> filenam = "1.txt"

> f = []
> path = ''
> path1 = []

These look suspiciously like bad variable names.
What is their purpose? Is 'path1' really a list?
Usually lists suggest a plural name, such as paths.

> for (dirpath, dirnames, filenames) in walk(mypath):
>     if isfile(join(mypath, dirpath)):

This should never be true since mypath is a directory
and dirpath is, by definition, the folder that walk()
is currently reading. But joining them makes no sense
since dirpath should already include mypath.

>         path1.extend(join(mypath, dirpath))
>     if os.path.isdir(join(mypath, dirpath)):

Likewise because you join() them the result
will probably not exist so the tst will fail.

>        for f in filenames:
>             path1.append(str(join(mypath, dirpath, f)))

If this does get called it will probably have
broken paths since you use join again.

If you just want the files use the list of files that
os.walk gave you - it has already separated the contents
into dirs and files as part of its return value.

> print(path1)
>
> newf = open(os.path.join(newpath, filenam ), "w+")

Why use 'w+'? You only need that if you are going to be
reading and writing the file - usually involving use
of tell()/seek() etc. You don't do any of that so
keep things simple and just use 'w'.

> myarray = {"ERROR", "error"}

Its not an array its a set

> for element in myarray:
>    elementstring = ''.join(element)

This is a very complex way of writing

elementstring = 'Error'

Although it does have a frisson of interest because the
actual set element chosen is theoretically arbitrary
since sets are unordered. It's not clear what you are
trying to achieve but I suspect this is not it.

> for f in path1:
>     openfile = open(os.path.join(path, f), "r")

Since path is an empty string this is effectively saying

     openfile = open(f, "r")

>     for line in openfile:
>        if elementstring in line:
>          newf.write(line)

> newf.close()
> openfile.close()

You should be closing openfile in the loop above where you open it.
But better still you should be using the

with open((f,'r') as openfile:

idiom which will auto close the file for you.

I'm still not clear on what you are trying to do(*)
but it looks like you are making it much more
difficult than I suspect it needs to be.

It will also be pretty slow since you walk the directory
tree multiple times when one copytree() followed by one
walk() should, I think, suffice.

(*)I think it's to copy the initial file structure
then search all files in the copy for lines including
the strings 'Error' or 'error' and write those lines to
a logfile, one per folder. But I'm not really certain.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos