[Tutor] What's in a name.

ThreeBlindQuarks threesomequarks at proton.me
Thu May 18 19:17:53 EDT 2023


I was thinking about a recent question that is now better understood.

Part of it may boil down to a choice of names. Sometimes a name makes people think a function or variable does something other than what it actually does.

The word I am talking about now is "suffix" and in some sense it means something appended to the end of something else. That has more specific meanings in areas of computer science where it may mean the part of a filename at the end after some marker like a period. Actually, I have seen multiple suffixes of sorts in a row: in an old operating system, a file.c;3 is the third version/copy of an edited file containing part or all of a program written in the C language. Many human languages let you pile one suffix after another to extend a word with many additional layers of meaning including gender and plurality and possessiveness and quite a few more in some languages.

So the OP had a task in which they read in a string, perhaps meant to be a filename, and had to remove a suffix perhaps looking like .xyz. They noticed a method called removesuffix() which sounded right.

But is it?

Nope. Not in general, as what the method is designed to do is remove an exact suffix you supply. It works on any text, as in "proportional".removesuffix("al"), which gives "proportion" with no periods in sight.
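
To make that concrete, here is a small sketch of what removesuffix() does with a few inputs. It assumes Python 3.9 or later, where the method was added; the sample strings are just illustrations.

```python
# str.removesuffix() needs Python 3.9 or later.
word = "proportional"

# The exact suffix is removed when the string ends with it.
trimmed = word.removesuffix("al")

# When the string does not end with the suffix, it comes back unchanged.
untouched = word.removesuffix("xyz")

# It only looks like a filename tool if you happen to pass in the period too.
c_name = "hello.c".removesuffix(".c")
h_name = "hello.c".removesuffix(".h")

print(trimmed)    # proportion
print(untouched)  # proportional
print(c_name)     # hello
print(h_name)     # hello.c
```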

So it looks like the OP thought about asking the person who typed in the text a second question: what suffix to omit. This strikes me as odd in several ways. The user could simply have typed the shorter version in the first place. And the user might provide ".xyz" or just "xyz" or anything else, and then removing that may fail or leave part behind such as the period. It does not look like a great one-liner to me.
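
A quick sketch of how that can go wrong, using a made-up filename and a few things a user might plausibly type when asked for the suffix:

```python
# Hypothetical input: the filename and the suffixes a user might type.
name = "report.xyz"
typed_suffixes = [".xyz", "xyz", "XYZ", "yz"]

# Map each typed suffix to what removesuffix() leaves behind.
results = {typed: name.removesuffix(typed) for typed in typed_suffixes}

for typed, left_over in results.items():
    print(repr(typed), "->", repr(left_over))

# '.xyz' -> 'report'
# 'xyz'  -> 'report.'      (the period is left behind)
# 'XYZ'  -> 'report.xyz'   (case differs, so nothing is removed)
# 'yz'   -> 'report.x'     (a partial suffix is happily removed)
```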

So the OP seems to have found removesuffix as a plan and then rather than go for a different plan, wanted to fix it by calculating the suffix without creating a variable name and passing that to removesuffix. It could be an ingenious solution if it worked but again, stepping back, is it?

As I see it, if the intent is to strip a suffix from a filename, and especially one that uses a period and may have more than one, as in modern operating systems, it can be better to hunt for a module that someone has written carefully and is well tested.

Consider a case where the user has chosen a file called hello.exe and you want to find a file called hello.c that corresponds to it. Using normal string-manipulation methods, you can do quite a bit, even if not inline. You can for example split the text of the name on a period and ignore the last component but how well will you handle a name like xyz.c.tar.gz in which you have multiple suffixes that can be peeled away by uncompressing and then removing an item from within an archive file? Finally, you need to back away from the C suffix if you really want what some might call a basename.
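
As a rough sketch of the split-on-a-period idea and where it starts to strain, consider the helper below. The name drop_last_part is made up for illustration and is not from any library.

```python
def drop_last_part(name):
    """Split on periods and drop the final component (hypothetical helper)."""
    parts = name.split(".")
    if len(parts) == 1:
        # No period at all, so there is nothing to drop.
        return name
    return ".".join(parts[:-1])


print(drop_last_part("hello.exe"))     # hello
print(drop_last_part("xyz.c.tar.gz"))  # xyz.c.tar
print(drop_last_part("README"))        # README
print(repr(drop_last_part(".bashrc"))) # '' (a leading period fools it)
```

One call only peels one layer from xyz.c.tar.gz, and a name that merely starts with a period ends up empty, so the do-it-yourself route needs decisions about each of these cases.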

So you can do this yourself and cover mainly the cases that matter to you. Or you can find functions that more easily match your need.

The OP was supplied with some suggestions including one that sort of fits part of what maybe they need by careful use of the walrus operator (or just multiple lines of code) to extract the suffix while retaining the original name. But I am guessing that in some ways it was not really a good thing to answer what was asked versus suggesting a reconsideration and doing anything else.
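
For what it is worth, here is a guess at the general shape such a walrus one-liner could take. This is not the code that was actually posted, and both function names are invented for illustration. It also shows how easily the naive version goes wrong when there is no period.

```python
def naive_strip(name):
    # Computes the suffix inline. If there is no period, rfind() returns -1
    # and name[-1:] is the last character, which then gets removed.
    return name.removesuffix(name[name.rfind("."):])


def walrus_strip(name):
    # The condition is evaluated first, so dot is bound before name[dot:].
    return name.removesuffix(name[dot:] if (dot := name.rfind(".")) != -1 else "")


print(naive_strip("hello.exe"))   # hello
print(naive_strip("README"))      # READM (oops)
print(walrus_strip("hello.exe"))  # hello
print(walrus_strip("README"))     # README
```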

I offered suggestions that can be explored without lots of details but each has plusses and minuses and often the names of the parts may not be meaningful. A function with a name like basename() may not be meaningful. But even the concept of removing an ending is flawed.

The goal is not really to remove the suffix but to end up with a name that lacks it. A straightforward alternate approach might be to identify what part of the name is before the first (or maybe last) period and keep that. It can be as simple as making a new empty string or list of characters and looping on the original while copying one character at a time until you hit a period, then stopping. It can be moving from the back until you hit a period, noting what position N you are at, and taking a slice of the original from the beginning up to just before the period, as in text[:N] or something like that. There are all kinds of functions that simply search for you and return the index at which something is found.
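
The three ideas in that paragraph might be sketched as follows. The function names are invented for this example; the string methods used (partition, rfind, slicing) are ordinary parts of str.

```python
def copy_until_period(text):
    """Copy characters one at a time and stop at the first period."""
    kept = []
    for ch in text:
        if ch == ".":
            break
        kept.append(ch)
    return "".join(kept)


def before_first_period(text):
    """Same result, letting partition() do the searching."""
    head, _sep, _tail = text.partition(".")
    return head


def before_last_period(text):
    """Search from the back and slice up to the period found there."""
    n = text.rfind(".")
    if n == -1:
        # No period: keep the whole thing rather than slicing with -1.
        return text
    return text[:n]


print(copy_until_period("xyz.c.tar.gz"))    # xyz
print(before_first_period("xyz.c.tar.gz"))  # xyz
print(before_last_period("xyz.c.tar.gz"))   # xyz.c.tar
print(before_last_period("README"))         # README
```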

And more generally you can use fairly simple but powerful regular expressions to match various parts and keep what parts you want in what order. Some may even focus in and only recognize specific sets of suffixes like .c and .h as your needs dictate.
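
A minimal sketch of the regular expression route, assuming you only care about .c and .h in the first function and want stem and suffixes as separate pieces in the second. The names strip_c_suffix and split_name are made up for this example.

```python
import re

# Only recognize a final .c or .h, nothing else.
C_SOURCE_SUFFIX = re.compile(r"\.[ch]\Z")

# A stem followed by zero or more period-led suffixes.
STEM_AND_SUFFIXES = re.compile(r"(?P<stem>.+?)(?P<suffixes>(?:\.[^.]+)*)")


def strip_c_suffix(name):
    """Remove a trailing .c or .h and leave every other name alone."""
    return C_SOURCE_SUFFIX.sub("", name)


def split_name(name):
    """Return (stem, suffixes); names that do not match come back whole."""
    match = STEM_AND_SUFFIXES.fullmatch(name)
    if match is None:
        return name, ""
    return match.group("stem"), match.group("suffixes")


print(strip_c_suffix("hello.c"))    # hello
print(strip_c_suffix("hello.cpp"))  # hello.cpp
print(split_name("xyz.c.tar.gz"))   # ('xyz', '.c.tar.gz')
print(split_name("README"))         # ('README', '')
```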

Bottom line is the Python object we talk about as being of type str has NOTHING to do with filenames. It has more broad uses. You cannot expect it to remove suffixes in filenames except in the same way it will remove suffixes in general text. But other places and objects may be a better choice. If there is an object I will call "filename" then it might have a method that does what you want and maybe you can write code like:

base = filename(input("Type a filename in full: ")).desuffixize()
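
The "filename" object and desuffixize() above are invented names, but the standard library has something close in pathlib, plus the older os.path.splitext(). A small sketch, using PurePath so nothing touches the actual filesystem:

```python
import os.path
from pathlib import PurePath

p = PurePath("hello.exe")
base = p.stem                         # name without the last suffix
last = p.suffix                       # the last suffix, period included
c_source = str(p.with_suffix(".c"))   # swap the last suffix for another

archive = PurePath("xyz.c.tar.gz")
all_suffixes = archive.suffixes       # every suffix, in order
archive_stem = archive.stem           # only the last suffix is dropped

# The older function returns a (root, extension) pair.
root, ext = os.path.splitext("xyz.c.tar.gz")

print(base, last, c_source)   # hello .exe hello.c
print(all_suffixes)           # ['.c', '.tar', '.gz']
print(archive_stem)           # xyz.c.tar
print(root, ext)              # xyz.c.tar .gz
```

Even here the multi-suffix case needs a decision from you: stem only peels off one layer, so getting from xyz.c.tar.gz down to xyz takes repeated steps or use of the suffixes list.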

Python people made a truly stupid statement years ago suggesting that the language generally has one obvious way to do things. In my experience, it pretty much NEVER has only one way and you have too many ways to choose from. Reading code written by others (especially without comments) can be almost painful as you try to figure out what exactly they are trying to do, which often is not on your short list of ways you might have done it.

So choosing names in your program can be a challenge as it is the first part of you communicating with anyone reading your code or using your function. Had the designers named removesuffix something like remove_end_characters_as_specified_if_they_exist() instead, it would be a pain to use but might not make you think it was the best choice for what the OP seems to want.
