[Tutor] Pass arguments from bash script to embedded python script
Cameron Simpson
cs at cskk.id.au
Wed Oct 23 17:58:59 EDT 2019
On 23Oct2019 15:08, Stephen P. Molnar <s.molnar at sbcglobal.net> wrote:
>I have revised my script to make use of the def function:
There's no "def function". A "def" statement defines a function.
Anyway, remarks inline below:
>fileList = []
>filesList = []
>
>for files in glob.glob("*.log"):
> fileName, fileExtension = os.path.splitext(files)
> fileList.append(fileName)
> filesList.append(files)
This iterates over a list of filenames. So "files" is a single filename;
I would not make this a plural.
Also, "fileList" and "filesList" are so similar that I would expect them
to cause confusion. And, in fact, they did confuse me later.
>fname = fileList
And here "fileList" is a list of filenames. So "fname" _should_ be
plural. This may seem like nitpicking, but getting this right is very
important for readability and therefore debugging. So much so that I
initially misread what your loop above did.
>for fname in fname:
> fname
This loop does nothing _inside_ the loop; why bother? However, its
control action is to iterate over the list "fname", assigning each value
to... "fname"!
The end result of this is that after the loop, "fname" is no longer a
list of filenames, it is now just the _last_ filename.
Again, keeping plurality consistent would probably have saved you from
this result.
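A tiny illustration of that rebinding, as a sketch only. The list contents and the names "names", "basename" and "basenames" are made up for the example; only "14-7" echoes your file.

```python
# Hypothetical names and values, just to show the rebinding.
names = ["14-5", "14-6", "14-7"]
for names in names:      # the loop variable reuses the list's name
    pass
# "names" is no longer the list; it is the last element.
print(names)

# With distinct singular/plural names the list survives the loop.
basenames = ["14-5", "14-6", "14-7"]
for basename in basenames:
    pass
print(basenames)
```

The first loop still visits every element, because the iterator is set up once before the name is rebound; the list is only lost afterwards.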
>fname1 = fname+'.log'
>fname2 = fname+'-dG'
>print('fname = ', fname)
>print('fname1 = ',fname)
>print('fname2 = ',fname2)
Ok, preparing a filename (by reassembling the stuff you undid earlier)
and the associated "-dG" name. And printing them out (which is fine, an
aid to debugging).
>def dG(filesList):
> data = np.genfromtxt(fname1, usecols=(1), skip_header=27,
>skip_footer=1, encoding=None)
> np.savetxt(fname2, data, fmt='%.10g', header=fname)
> return(data)
The function dG does not use its parameter "filesList". Why do you pass
it in?
Also, it is a source of bugs to use a parameter with the same name as a
global because inside the function you might work on the parameter,
_thinking_ you were working on the global. This is called "shadowing",
and linters will say something like "the parameter filesList shadows a
global of the same name" in order to point this out to you.
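Here is a small sketch of shadowing in isolation. The function "add_name" and the file names are invented for the illustration; they are not from your script.

```python
filesList = ["a.log", "b.log"]   # module-level (global) list

def add_name(filesList, name):
    # Inside the function, "filesList" is the parameter; it hides
    # the global of the same name.
    filesList = filesList + [name]   # rebinds the local name only
    return filesList

result = add_name(filesList, "c.log")
print(result)      # the new list built inside the function
print(filesList)   # the global list, unchanged
```

Anyone reading the function body has to stop and work out which "filesList" is meant, which is exactly the confusion the linter warning is about.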
Then within the function you use the _global_ names "fname", "fname1"
and "fname2". Normally a function should not depend on global variables;
that is why we pass parameters to it. The whole point is to encapsulate
its task as a generic method of doing something, _not_ dependent on
any outside state.
I would have written this function thus:
    def dG(basic_name):
        src_filename = basic_name + '.log'
        dst_filename = basic_name + '-dG'
        data = np.genfromtxt(src_filename, usecols=(1), skip_header=27, skip_footer=1, encoding=None)
        np.savetxt(dst_filename, data, fmt='%.10g', header=basic_name)
        return data
so that it has no dependence on external global names.
>data = dG(filesList)
Again, the function never uses "filesList" - there's no point in passing
it in. I would have used the revised function above and gone:
    data = dG(fname)
>It seems to work with one little (actually major) problem. The only
>result saved is for the last file in the list 14-7.log.
>Which s the last file in the list.
That is because of the earlier for-loop I pointed out, which puts just
the last filename into fname.
Regardless, your script will only ever process one file because the call
to dG() is not inside any kind of iteration; it will only be run once.
Consider this:
    for fname in fileList:
        data = dG(fname)
        print(data)
which calls the function (in this case my revised function) once for
each base name in fileList (the names without ".log", since the revised
dG() adds the extension itself).
You could put a print call inside dG() to see which filename it was
processing, to make things more obvious.
Finally, I recommend avoiding global variables altogether - they are a
rich source of bugs, particularly when some function quietly uses a
global. Instead you can put _all_ the code into functions, eg:
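    import glob
    import os
    import sys
    import numpy as np

    def dG(......):
        ..... as above ...

    def main(argv):
        fileList = []
        filesList = []
        for files in glob.glob("*.log"):
            fileName, fileExtension = os.path.splitext(files)
            fileList.append(fileName)
            filesList.append(files)
        # fileList holds the names without ".log", which is what dG() expects
        for fname in fileList:
            data = dG(fname)
            print(data)

    # call the main function, passing the command line arguments
    main(sys.argv)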
By structuring things this way there are no global variables and you
cannot accidentally use one in dG().
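Since the subject is about passing arguments in, here is a rough standard-library-only sketch of the same shape that actually looks at argv. It is an illustration under assumptions, not your script: "process" is a hypothetical stand-in for dG() that only reports the names it would use, and treating argv[1] as a directory to scan is my own invention.

```python
import glob
import os
import sys

def process(basic_name):
    # Hypothetical stand-in for dG(): return the source and
    # destination names it would work with.
    return basic_name + '.log', basic_name + '-dG'

def main(argv):
    # Assumption: argv[1], if present, is the directory to scan;
    # otherwise use the current directory.
    dirpath = argv[1] if len(argv) > 1 else '.'
    results = []
    for filename in sorted(glob.glob(os.path.join(dirpath, '*.log'))):
        basic_name, ext = os.path.splitext(filename)
        results.append(process(basic_name))
    return results

if __name__ == '__main__':
    for src, dst in main(sys.argv):
        print(src, '->', dst)
```

Everything the functions need arrives through their parameters, so there is one list of names and nothing to confuse it with.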
Cheers,
Cameron Simpson <cs at cskk.id.au>