
On Tue, Feb 15, 2022 at 11:51:41PM +0900, Stephen J. Turnbull wrote:
> scanf just isn't powerful enough. For example, consider parsing user input dates: scanf("%d/%d/%d", &year, &month, &day). This is nice and simple, but handling "2022-02-15" as well requires a bit of thinking and several extra statements in C. In Python, I guess it would probably look something like
>
>     year, sep1, month, sep2, day = scanf("%d%c%d%c%d")
>     if not ('/' == sep1 == sep2 or '-' == sep1 == sep2):
>         raise DateFormatUnacceptableError
>     # range checks for month and day go here

Assuming that scanf raises if there is no match, I would probably go with:

    try:
        # Who writes ISO-8601 dates using slashes?
        day, month, year = scanf("%d/%d/%d")
        # %d yields an int, so test the value rather than its length
        if ALLOW_TWO_DIGIT_YEARS and 0 <= year < 100:
            year += 2000
    except ScanError:
        year, month, day = scanf("%d-%d-%d")

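For concreteness, here is a minimal sketch of what such a scanf might look like if built on top of re. Everything in it is an assumption made for illustration: no such function exists in the standard library, the names scanf, ScanError and _CONVERSIONS are hypothetical, only %d and %c are handled, and the text is passed in explicitly rather than read from anywhere. Unlike C's scanf, it also insists that the whole string matches and does not skip leading whitespace.

```python
import re


class ScanError(ValueError):
    """Hypothetical error raised when the text does not match the format."""


# Only two conversions are sketched here; C's scanf supports many more.
_CONVERSIONS = {
    "d": (r"([-+]?\d+)", int),  # optionally signed decimal integer
    "c": (r"(.)", str),         # exactly one character
}


def scanf(fmt, text):
    """Parse text according to fmt and return a tuple of converted values."""
    pattern = []
    converters = []
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt) and fmt[i + 1] in _CONVERSIONS:
            regex, convert = _CONVERSIONS[fmt[i + 1]]
            pattern.append(regex)
            converters.append(convert)
            i += 2
        else:
            # Any other character must appear literally in the input.
            pattern.append(re.escape(fmt[i]))
            i += 1
    match = re.fullmatch("".join(pattern), text)
    if match is None:
        raise ScanError(f"{text!r} does not match {fmt!r}")
    return tuple(conv(group) for conv, group in zip(converters, match.groups()))
```

With this sketch, scanf("%d/%d/%d", "15/02/2022") gives three ints, and the same format applied to "2022-02-15" raises ScanError, which is the behaviour the try/except relies on.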
> which isn't too bad, though. But
>
>     year, month, day = re.match(r"(\d+)[-/](\d+)[-/](\d+)").groups()
>     if not sep1 == sep2:
>         raise DateFormatUnacceptableError
>     # range checks for month and day go here

Doesn't that raise an exception?

    NameError: name 'sep1' is not defined

I think that

    year, sep1, month, sep2, day = re.match(r"(\d+)([-/])(\d+)([-/])(\d+)").groups()

might do it (until Tim or Chris tell me that actually is wrong). Or use \2 as you suggest later on.

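For what it's worth, the \2 variant might look something like the sketch below. The names DATE_RE and parse_date are mine, DateFormatUnacceptableError is defined locally only so that the snippet runs, and the input is passed in as an argument rather than read implicitly. The range checks for month and day are still left out.

```python
import re


class DateFormatUnacceptableError(ValueError):
    """Stand-in for the exception named in the thread (defined here for the sketch)."""


# Group 2 captures the first separator; \2 requires the second one to be identical.
DATE_RE = re.compile(r"(\d+)([-/])(\d+)\2(\d+)")


def parse_date(text):
    """Return (year, month, day) as ints, or raise if the format is unacceptable."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        raise DateFormatUnacceptableError(text)
    year, _sep, month, day = match.groups()
    return int(year), int(month), int(day)
```

Both "2022-02-15" and "2022/02/15" are accepted, while a mixed form such as "2022-02/15" fails to match, so no separate sep1 == sep2 comparison is needed.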
> expresses the intent a lot more clearly, I think.

Noooo, I don't think it does. The scanf (hypothetical) solution is a lot closer to my intent. But yes, regexes are more powerful: you can implement scanf using regexes, but you can't implement regexes using scanf.

-- 
Steve