Behavior of the for-else construct

Avi Gross avigross at verizon.net
Fri Mar 4 16:24:05 EST 2022


Om (unless your first name is Joshi),

Yes, your example captures some of my intent. I have not studied Django, but your example suggests it uses a special notation with a kind of brace pair, "{%" and "%}", like the ones I have talked about, which allows freedom in using what might otherwise need keywords. Anything between the braces can be an extension of the language, in a context where users do not put their own variable names.

But again, any language can be set up to do things its own way. In Python, I believe the body of a loop is not examined at all if the loop is skipped, so searching in it for some clause saying what to do if empty is not doable. The ELSE clause is a way to get attention if there has been no break.
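For anyone following along, here is a minimal sketch of the existing for-else behavior in Python. The function name and messages are made up for illustration; the point is only that the else suite runs when the loop ends without a break, which includes the case where the loop body never runs at all.

```python
def find_first_negative(numbers):
    """Describe the first negative number in numbers, if there is one."""
    for n in numbers:
        if n < 0:
            result = f"found {n}"
            break
    else:
        # Runs only when the loop finished without hitting break,
        # which also covers an empty sequence.
        result = "none found"
    return result
```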

I would not be shocked if some language has a FOR command return a value such as the number of iterations if run and something like None if not and allows something like:

result = for ...

That might return 0 or None if it was part of the language but it is not.
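Something in that spirit can be approximated today with an ordinary function. This is only a sketch of the idea, not a proposal: counted_for is a hypothetical helper, and the loop body has to be passed in as a callable.

```python
def counted_for(iterable, body):
    """Call body(item) for each item.

    Return the number of iterations, or None if the loop never ran.
    """
    count = 0
    for item in iterable:
        body(item)
        count += 1
    # A count of 0 means the body was never executed.
    return count or None
```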

Avi (my current first name)

-----Original Message-----
From: Om Joshi <om+python at omajoshi.com>
To: Avi Gross <avigross at verizon.net>
Cc: python-list <python-list at python.org>
Sent: Fri, Mar 4, 2022 3:04 pm
Subject: Re: Behavior of the for-else construct


I'm not sure if anyone has mentioned it on this thread, but with respect to your comment about adding either on.empty or a decorator, the Django template syntax uses

{% for x in iterator %}
 <h2>{{ x }}</h2>
{% empty %}
 <h2>Empty</h2>
{% endfor %}

and this seems to work quite well and be incredibly intuitive, at least for Django html templates. I took a look and it has other enhancements like a way to know when you are in the last iteration of the loop.

{% if forloop.last %}
...
{% endif %}

And also for the first item using forloop.first and a generalized counter in forloop.counter.
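Plain Python has no forloop object, but the same information can be produced by a small generator. A rough sketch, assuming a 1-based counter; LoopInfo and with_loop_info are names invented here and are not part of Django or the standard library. It reads one item ahead so it can tell when the current item is the last.

```python
from collections import namedtuple

LoopInfo = namedtuple("LoopInfo", "item counter first last")

def with_loop_info(iterable):
    """Yield LoopInfo(item, counter, first, last) for each item."""
    iterator = iter(iterable)
    try:
        current = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    counter = 1
    for upcoming in iterator:
        # There is a following item, so current is not the last one.
        yield LoopInfo(current, counter, counter == 1, False)
        current = upcoming
        counter += 1
    yield LoopInfo(current, counter, counter == 1, True)
```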


 ---- On Fri, 04 Mar 2022 13:45:21 -0600 Avi Gross via Python-list <python-list at python.org> wrote ----
 > {NOTE, after some diversion, this long message does revert a bit to the topic.}
 > 
 > Ah, Chris, the games we played when we were young and relatively immature!
 > 
 > Has anyone else played with embedding "escape sequences" or other gimmicks in unexpected places like filenames so that on the right terminals, such as the VT100, it displayed oddly or was hard to open or delete as what showed was not quite the underlying representation?
 > 
 > The main reason for the restrictions in olden days was cost. Everything was so costly including storage/memory and CPU time. Clearly it is a lot easier to have fixed-length filenames that fit into say 16 bytes, or storing multiple flags about file permissions as single bits, even if it meant lots of bit-twiddling or using masks to retrieve their values. We think nothing of creating structures that have many others embedded in them as attributes or function calls that allow a hundred optional arguments so that the function spends much of the time used just figuring out what was set before doing whatever calculation is required to fulfill the request.
 > 
 > I was reading a novel recently (Jack Reacher Series) where the main character is noticing how much technology has changed as they have been ignoring it for a decade or so. Everything seems to be coming up faster. My view was that if something seems ten times as fast as it was, it also probably is doing a hundred or ten thousand times as much to get that result.  The real speed changes are often counterbalanced by expecting to do more. A web page served may display a screen of text but to transmit it may include not just lots of padding in the HTML, but have all kinds of code such as in Java or JavaScript or lots of back and forth with the server to keep something like a graph displayed being refreshed ...
 > 
 > So back to filenames, the concept of having to search for long filenames that may not even be stored sequentially in large blocks that can be read (ahead) efficiently, may have seemed to be so illogical as not to be considered. So given that the shorter ones were not allowed to have embedded spaces, it made sense to treat them like tokens that could be broken up at whitespace. As mentioned, languages (or other programs) would often parse a command line and create something like this for the main program in C with suitable version in Python and other languages:
 > 
 >    main(int argc, char *argv[])
 > 
 > The code variations on the above do suppose that something has already parsed the command line that invoked them and partitioned it properly into individual strings placed in an array of such strings and also returned how many arguments it saw. Users invoking the program needed to be careful such as using double quotes around anything with embedded spaces, where allowed.
 > 
 > But like many paradigms, there can be a shift. Consider the fact that languages like Python are constantly parsing things like names. Can you create a variable name like "me first" with an embedded space or even other symbols normally reserved such as parentheses? Most languages do not like such things. It makes it hard to parse if not quoted in some unique way. Yet languages like R happily allow such constructs if placed in back quotes (grave accents?) as in `me & you` as a variable name or the name of a function. Of course, you then never use the darn name without the extra quotes.
 > 
 > Similarly, when you make an object like a DataFrame, can you include spaces and other things in the names of columns (or sometimes rows)? If so, is access possible only in some ways and not others?
 > 
 > The answer often is not simple. As Chris repeatedly highlights, making a language consistent as you play with features can be VERY hard and sometimes not quite possible without relaxing some rules or making exceptions. Sometimes the answer varies. In base R a data.frame can be given a column name like "me + you" which it then stores as "me...you" leading to odd results. But it happily returns that result if you ask for mydf$me using auto-completion. Spell it out fully and it won't find it! A later package added on makes modified data.frame objects called tibbles which do not autocomplete but do completely store and let you access the name so mydf$me fails and mydf$"me + you" or mydf$`me + you` works but oddly an alternative format like mydf[, "me + you"] works while the similar mydf[, `me + you`] fails!
 > 
 > My point is not about R but a more general one. I can rant about many other languages, LOL! Allowing spaces or other characters in what used to be a more easily viewable name that can be parsed easier, can lead to having to find every place such things are used and seeing if they can be made to work consistently. I show an example above where it is not consistent, in my view. 
 > 
 > But when humans view things and their perceptions differ, you are inviting disagreements about whatever you implement. You may end up having to make people do more than they would prefer such as quoting all variable names even if they do not "need" it. Wouldn't it be nice to be able to name a variable 1time at least if you felt like it? Many people have passwords like that. I think the answer is NO, not if it meant quoting every variable because there was no longer any reasonable way to parse programs.
 > 
 > The issue that started the discussion was different but in a sense similar. If you want to extend the functionality of a "for" loop in one of many possible ways, how do you design a way to specify it so it can both be unambiguously parsed and implemented while at the same time making sense to humans reading it and interpreting it using their human language skills.
 > 
 > I happen to think this is even harder for some who speak in languages other than English and have to program in languages loosely based on English. I am happy that I seem to think in English now but I was seven when I first encountered it after thinking in others. People who program do not all speak English or are more fluent in other languages. They may be used to other word orders for example. They may move verbs to the end of a sentence or place adjectives or other modifiers after versus before a word and forget about all the other games played where the same word means something completely different. To them ELSE may either mean nothing or the phrase IF ... ELSE may be said differently or adding a clause after the construct is not seen as natural.
 > 
 > So was this way of doing FOR ... ELSE the only or even best way, is what some of this debate is about.
 > 
 > I am thinking of a function in some languages that lets you specify what should happen in a later circumstance. In a language like R, you can specify one or more instances of on.exit(...) that are not run immediately. Each one can replace the commands in the previous one or add on to it. When the function they are defined in exits for any reason, it pauses to run any uncleared such commands. Clearly this works better if a language has a way to defer evaluation of code so there are no side effects.
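The closest thing I know of in Python's standard library is contextlib.ExitStack, which lets a function register callbacks that run when the with block is left for any reason. A small sketch, with a made-up function and log messages; note that ExitStack runs its callbacks in reverse order of registration.

```python
from contextlib import ExitStack

def process(log):
    """Register deferred actions, do some work, then return."""
    with ExitStack() as stack:
        # Nothing is appended yet; these calls are only registered.
        stack.callback(log.append, "cleanup A")
        stack.callback(log.append, "cleanup B")
        log.append("working")
        return "done"  # the callbacks run as the with block exits
```

Calling process([]) leaves the list as "working", "cleanup B", "cleanup A".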
 > 
 > So consider the suggestion of code that should be run if you have a loop and you break out of it. Could you design an alternate way to handle that other than adding an ELSE clause after the loop?
 > 
 > Clearly you could simply add a function called on.break() that can be used as described but only within the body of that loop. It might be something that can be set and unset as needed and when the loop is exited, the program implicitly checks to see if any code has been dynamically set and executes it. This clearly is not necessarily a good way or better way, but is an example of how you can implement something without using any key words. No need to think about forcing the use of ELSE versus a new keyword that may conflict with existing code. Yes, the name on.break may conflict but that is trivially handled in Python by invoking it with a full name that includes what module it is in or by creating an alias. 
 >  
 > So what about considering an alternate approach that does handle a for loop that does nothing? Would it create huge incompatibilities for something like:
 > 
 > for eye in range(0), on.empty=... :
 >     pass
 > 
 > In some languages, arbitrary additional arguments are allowed, and if not understood, are ignored. Python does not allow anything like the above. And in this case, the entire body of the for loop is never evaluated so no gimmicks inside the body are possible. A gimmick before it might work and I even wonder if there is room here for a decorator concept like:
 > 
 > @on.empty(...)
 > for eye in range(0):
 >     pass
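Neither spelling above is valid Python, since decorators apply only to function and class definitions. A similar effect can be had today by wrapping the iterable instead. A minimal sketch, where on_empty is a hypothetical helper and not an existing function:

```python
def on_empty(iterable, callback):
    """Yield every item of iterable; call callback() if there were none."""
    empty = True
    for item in iterable:
        empty = False
        yield item
    if empty:
        callback()

messages = []
for eye in on_empty(range(0), lambda: messages.append("loop was empty")):
    pass  # never reached here, because range(0) yields nothing
```

After this runs, messages holds the single string "loop was empty". The callback fires only once the wrapped iterable is exhausted without having produced anything.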
 > 
 > I am ending with a reminder. NOTHING I am writing here is meant to be taken seriously but merely as part of a well-intentioned debate to share ideas and not to win or lose but learn. Python is more than a language but also has aspects of a culture and we sometimes talk about whether something has a pythonic flavor or is pythonic versus translating it literally from a language like C rather than using the ideas common in python. The method chosen to implement the ELSE clause here may well be Pythonic and some of my attempts to show other ways may well not be. I am not one of those that find the current implementation to be the wrong one and will happily use it when I have code that can be done well that way. I am just discussing the issue and wider ones. Languages have an amazing variety of designs that fascinate me.
 > 
 > 
 > 
 > -----Original Message-----
 > From: Chris Angelico <rosuav at gmail.com>
 > To: python-list at python.org
 > Sent: Fri, Mar 4, 2022 12:46 pm
 > Subject: Re: Behavior of the for-else construct
 > 
 > 
 >  On Sat, 5 Mar 2022 at 02:02, Tim Chase <python.list at tim.thechases.com> wrote:
 > >
 > > On 2022-03-04 11:55, Chris Angelico wrote:
 > > > In MS-DOS, it was perfectly possible to have spaces in file names
 > >
 > > DOS didn't allow space (0x20) in filenames unless you hacked it by
 > > hex-editing your filesystem (which I may have done a couple times).
 > > However it did allow you to use 0xFF in filenames which *appeared* as
 > > a space in most character-sets.
 > 
 > Hmm, I'm not sure which APIs worked which way, but I do believe that I
 > messed something up at one point and made a file with an included
 > space (not FF, an actual 20) in it. Maybe it's something to do with
 > the (ancient) FCB-based calls. It was tricky to get rid of that file,
 > though I think it turned out that it could be removed by globbing,
 > putting a question mark where the space was.
 > 
 > (Of course, internally, MS-DOS considered that the base name was
 > padded to eight with spaces, and the extension padded to three with
 > spaces, so "READ.ME" would be "READ\x20\x20\x20\x20ME\x20", but that
 > doesn't count, since anything that enumerates the contents of a
 > directory would translate that into the way humans think of it.)
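The padding described above can be sketched in a few lines of Python. This is illustration only: padded_83_name is an invented name, and the sketch does no validation, upper-casing, or truncation of over-long names.

```python
def padded_83_name(filename):
    """Pad the base name to 8 characters and the extension to 3, using spaces."""
    base, _, ext = filename.partition(".")
    return base.ljust(8) + ext.ljust(3)
```

With this, padded_83_name("READ.ME") gives the 11-character string "READ\x20\x20\x20\x20ME\x20".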
 > 
 > > I may have caused a mild bit of consternation in school computer labs
 > > doing this. ;-)
 > 
 > Nice :)
 > 
 > > > Windows forbade a bunch of characters in file names
 > >
 > > Both DOS and Windows also had certain reserved filenames
 > >
 > > https://www.howtogeek.com/fyi/windows-10-still-wont-let-you-use-these-file-names-reserved-in-1974/
 > >
 > > that could cause issues if passed to programs.
 > 
 > Yup. All because, way back in the day, they didn't want to demand the
 > colon. If you actually *want* to use the printer device, for instance,
 > you could get a hard copy of a directory listing like this:
 > 
 > DIR >LPT1:
 > 
 > and it's perfectly clear that you don't want to create a file called
 > "LPT1", you want to send it to the printer. But noooooo it had to be
 > that you could just write "LPT1" and it would go to the printer.
 > 
 > > To this day, if you poke around on microsoft.com and change random
 > > bits of URLs to include one of those reserved filenames in the GET
 > > path, you'll often trigger a 5xx error rather than a 404 that you
 > > receive with random gibberish in the same place.
 > >
 > >   https://microsoft.com/…/asdfjkl → 404
 > >   https://microsoft.com/…/lpt1 → 5xx
 > >   https://microsoft.com/…/asdfjkl/some/path → 404
 > >   https://microsoft.com/…/lpt1/some/path → 5xx
 > >
 > > Just in case you aspire to stir up some trouble.
 > >
 > 
 > In theory, file system based URLs could be parsed such that, if you
 > ever hit one of those, it returns "Directory not found". In
 > practice... apparently they didn't do that.
 > 
 > As a side point, I've been increasingly avoiding any sort of system
 > whereby I take anything from the user and hand it to the file system.
 > The logic is usually more like:
 > 
 > If path matches "/static/%s":
 > 1) Get a full directory listing of the declared static-files directory
 > 2) Search that for the token given
 > 3) If not found, return 404
 > 4) Return the contents of the file, with cache markers
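The four steps above might look roughly like this in Python. This is a sketch under assumptions, not the code actually used on that server: find_static_file is a made-up name, and the 404 response and cache headers are left to the caller.

```python
import os

def find_static_file(static_dir, token):
    """Return the path of the file named token in static_dir, or None."""
    # Steps 1 and 2: list the directory and look for the token among
    # the names the file system itself reports.
    if token not in os.listdir(static_dir):
        return None  # step 3: the caller would answer 404
    path = os.path.join(static_dir, token)
    # Step 4: the caller serves this file; subdirectories are skipped.
    return path if os.path.isfile(path) else None
```

Because the user's token is only ever compared against names from the listing, something like "../secret" is never handed to the file system at all.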
 > 
 > Since Windows will never return "lpt1" in that directory listing, I
 > would simply never find it, never even try to open it. This MIGHT be
 > an issue with something that accepts file *uploads*, but I've been
 > getting paranoid about those too, so, uhh... my file upload system now
 > creates URLs that look like this:
 > 
 > https://sikorsky.rosuav.com/static/upload-49497888-6bede802d13c8d2f7b92ca9fac7c
 > 
 > That was uploaded as "pie.gif" but stored on the file system as
 > ~/stillebot/httpstatic/uploads/49497888-6bede802d13c8d2f7b92ca9fac7c
 > with some metadata stored elsewhere about the user-specified file
 > name. So hey, if you were to try to upload a file that had an NTFS
 > invalid character in it, I wouldn't even notice.
 > 
 > Maybe I'm *too* paranoid, but at least I don't have to worry about
 > file system attacks.
 > 
 > 
 > ChrisA
 > -- 
 > https://mail.python.org/mailman/listinfo/python-list
 > 