[Python-ideas] More alternate constructors for builtin type
Steven D'Aprano
steve at pearwood.info
Tue May 7 20:20:13 EDT 2019
On Tue, May 07, 2019 at 07:17:00PM +0100, Oscar Benjamin wrote:
> The int function accepts all kinds of things e.g.
>
> >>> int('๒')
> 2
>
> However in my own code if that character ever got passed to int then
> it would definitely indicate either a bug in the code or data
> corruption so I'd rather have an exception.
If you ever get Thai users who would like to enter numbers in their own
language, they may be a tad annoyed that you consider that a bug.
> Admittedly the non-ASCII unicode digit example is not one that has
> actually caused me a problem
I don't see why it should cause a problem. An int is an int, regardless
of how it was spelled before conversion.
You probably don't lose any sleep over the relatively high probability
of a single flipped bit changing a '7' digit into a '6' digit, say. It
seems strange to worry about the enormously less likely data corruption
which just so happens to result in valid non-ASCII digits.
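To make the comparison concrete, here is a small sketch (the variable names are purely illustrative). In ASCII, '7' and '6' differ in exactly one bit, whereas a Thai digit occupies three bytes in UTF-8, so a single flipped bit cannot turn an ASCII digit into one.

```python
seven = ord('7')   # 0x37
six = ord('6')     # 0x36

# Flipping the least significant bit of '7' yields '6'.
flipped = chr(seven ^ 0b1)

# Number of bit positions in which the two code points differ.
differing_bits = bin(seven ^ six).count('1')

# A Thai digit needs a multi-byte UTF-8 sequence, not a single byte.
thai_two_length = len('๒'.encode('utf-8'))
```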
If your application can support user data of "123" for the int 123, why
would it matter if the user spelled it '๑๒๓' instead? You're not
obligated to output Thai digits if you don't have many Thai users, but
it just seems mean to reject Thai input if It Just Works.
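A quick sketch of that behaviour, using only the builtin int() and the standard unicodedata module (the variable names are illustrative):

```python
import unicodedata

thai = '๑๒๓'
ascii_digits = '123'

# int() accepts Unicode decimal digits, not only ASCII ones.
thai_value = int(thai)
ascii_value = int(ascii_digits)

# Each Thai digit carries its decimal value in the Unicode database.
digit_values = [unicodedata.decimal(ch) for ch in thai]

# Converting back to text gives ASCII digits, however the input was spelled.
round_tripped = str(thai_value)
```

The output side is unaffected: once converted, the int carries no memory of the script it was typed in.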
> but what I have had a problem with is
> floats. Given that a user of my code can pass in a float in place of a
> string the fact that int(1.5) gives 1 can lead to bugs or confusion.
So write a helper function and use that. Or specify a base:
    int(string, 0)
will support the usual Python literal formats (e.g. any of '123', '0x7b',
'0o173', '0b1111011') and will raise TypeError for a non-string such as
a float instead of converting it. If for some reason you only want to
support a single base, say, base 7, you can pass that base explicitly:
py> int('234', 7)
123
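One possible shape for such a helper is sketched below. The name parse_int is hypothetical; the only thing it relies on is that int() with an explicit base refuses anything that is not a str, bytes or bytearray.

```python
def parse_int(text):
    """Hypothetical helper: parse text as an int, refusing non-strings.

    With an explicit base, int() raises TypeError for floats and other
    non-string arguments, so 1.5 is never silently truncated to 1.
    """
    return int(text, 0)


literals = ['123', '0x7b', '0o173', '0b1111011']
values = [parse_int(s) for s in literals]

try:
    parse_int(1.5)
except TypeError:
    float_rejected = True
else:
    float_rejected = False

base_seven = int('234', 7)   # 2*49 + 3*7 + 4
```

One caveat with base 0: because it follows the rules for Python integer literals, a decimal string with leading zeros such as '0123' raises ValueError, whereas plain int('0123') returns 123.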
But user input seems like a good place to apply Postel's Law:
"Be conservative in what you output, be liberal in what you accept."
It shouldn't be any skin off your nose to accept '0x7b' or '๑๒๓' as well
as '123'.
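As a rough sketch of what that could look like in practice (read_number is a hypothetical name, and the fallback order is just one reasonable choice), the function below accepts plain decimal in any script as well as prefixed literals, while still rejecting non-strings:

```python
def read_number(text):
    """Hypothetical input parser: liberal about spelling, strict about type."""
    if not isinstance(text, str):
        raise TypeError('expected a string, got %s' % type(text).__name__)
    try:
        # Plain decimal in any Unicode script; surrounding whitespace
        # and leading zeros are fine here.
        return int(text)
    except ValueError:
        # Fall back to prefixed literals such as '0x7b' or '0b1111011'.
        return int(text, 0)


inputs = ['123', '0x7b', '๑๒๓', ' 007 ']
accepted = [read_number(s) for s in inputs]

# Conservative output: always plain ASCII decimal.
outputs = [str(n) for n in accepted]
```

Anything that neither form can parse still raises ValueError, so genuinely malformed input is not silently accepted.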
--
Steven