> On 12 July 2013 13:36, Zaur Shibzukhov <szport at gmail.com> wrote:
> > Hello!
> > Is it a good idea to allow
> > float('∞') to be float('inf') in python?
>Because it obviously means infinity -- much more so than "inf" does :)
Do you have the infinity symbol on your keyboard? I don't! So, for
me, should I ask for
Since Python 3, the Python developers have removed a lot of encodings from
the str.encode() method. They did it because they weren't sure how to
implement the feature in Python 3. They wanted it to be better.
I have an idea: add a built-in function called "convert".
convert(data, current_state, desired_state)
convert(data, from, to)
Real world examples:
dataBytes = b"hello"
dataUTF8_Str = "Ɠahhhh hi all ̮"
convert(dataUTF8_Str, encodings.UTF8, encodings.BYTES)
Returns: b'\xc6\x93ahhhh hi all \xcc\xae'
convert(dataBytes, encodings.BYTES, encodings.HEX)
convert(dataUTF8_Str, encodings.UTF8, encodings.ASCII)
Raises: TypeError: can't convert utf8 character "\u0193" to ascii
Some other encodings:
Maybe even INT?
Feel free to add suggestions!
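A rough sketch of how such a function could be layered on what the standard library already provides. The state names below are plain strings made up for illustration (the `encodings.UTF8` style constants in the proposal do not exist), and the sketch raises UnicodeDecodeError rather than the TypeError shown above, because that is what bytes.decode() does:

```python
import binascii

# Hypothetical state names, used only for this sketch.
UTF8, ASCII, BYTES, HEX = "utf8", "ascii", "bytes", "hex"


def convert(data, current_state, desired_state):
    """Convert data between a few representations by going through raw bytes."""
    # Step 1: bring the input to raw bytes.
    if current_state == BYTES:
        raw = bytes(data)
    elif current_state in (UTF8, ASCII):
        raw = data.encode(current_state)
    elif current_state == HEX:
        raw = binascii.unhexlify(data)
    else:
        raise ValueError("unknown source state: %r" % (current_state,))

    # Step 2: produce the requested representation from the raw bytes.
    if desired_state == BYTES:
        return raw
    if desired_state in (UTF8, ASCII):
        # Raises UnicodeDecodeError if the bytes are not valid in that encoding.
        return raw.decode(desired_state)
    if desired_state == HEX:
        return binascii.hexlify(raw)
    raise ValueError("unknown target state: %r" % (desired_state,))
```

With this sketch, convert(b"hello", BYTES, HEX) gives b'68656c6c6f', and asking for ASCII output from a string containing "\u0193" fails with an exception instead of silently dropping the character.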
For the benefit of those who read this in ASCII, I will include Unicode
translations in the following. I prefer code which is readable in ASCII (as
PEP 8 suggests), which is one reason that I dislike the proposal a little.
I had to go to the archives to even read the subject line.
Nevertheless, I think that, in the Unicode world, the proposal is sound.
The question was asked earlier why the Python int() and float() functions
do not allow Greek numbers, when they do allow numbers from many other
language character sets.
The answer is in the documentation for int():
> The numeric literals accepted include the digits 0 to 9 or any Unicode
> equivalent (code points with the Nd property).
The "Nd" characters are decimal digits of systems which use positional
notation (i.e. Arabic numbers). The Greeks used decimal numbers, but used
different symbols for one, ten, hundred, thousand, (etc.) and added them
together, much like the system of Roman numbers we are familiar with.
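The distinction can be checked from Python itself. The helper below is only an illustration (its name is made up); it reports a character's Unicode category, its numeric value if it has one, and whether int() accepts it:

```python
import unicodedata


def describe(ch):
    """Return (category, numeric value or None, int() result or None) for one character."""
    category = unicodedata.category(ch)
    numeric = unicodedata.numeric(ch, None)  # None when the character has no numeric value
    try:
        parsed = int(ch)
    except ValueError:
        parsed = None  # int() only accepts characters with the Nd property
    return category, numeric, parsed


print(describe("5"))        # ASCII digit
print(describe("\u0665"))   # ARABIC-INDIC DIGIT FIVE
print(describe("\u2167"))   # ROMAN NUMERAL EIGHT
```

The first two are category Nd and parse as 5. The Roman numeral has a numeric value of 8.0 but is category Nl, so int() rejects it.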
The int() parser expects Arabic formatted numbers, so it will not correctly
interpret other systems of notation. In order to read such numbers, you
need to use a parser which was built for them. PEP 313 suggested that a
parser for Roman formatted numbers be included in Python, and it was
Several algorithms for reading Roman numbers encoded using ASCII values
['i','v','x','L', (etc.)] have been published. The one I wrote goes a bit
further -- it also tries to read the value of unicodedata.numeric() for
each character of its input string, and sums them (sort of). It would,
therefore convert all of the Greek and other characters mentioned in this
thread and return a value for them. If a Greek author followed Roman
formatting rules it would return a _correct_ value, too. If, on the other
hand, he put a smaller valued digit on the left side of a larger digit, he
would probably not appreciate the resulting subtraction.
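The summing idea described above can be sketched roughly as follows. This is not the romanclass code, just a minimal illustration under the assumption that a smaller value to the left of a larger one is subtracted and everything else is added; the function names are made up:

```python
import unicodedata

# Values for the ASCII letters, which Unicode itself does not treat as numeric.
ASCII_ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def char_value(ch):
    """Value of one character: ASCII Roman letter first, then unicodedata.numeric()."""
    value = ASCII_ROMAN.get(ch.upper())
    if value is not None:
        return value
    return unicodedata.numeric(ch)  # raises ValueError for non-numeric characters


def roman_style_value(text):
    """Sum character values, subtracting any value that precedes a larger one."""
    values = [char_value(ch) for ch in text]
    total = 0
    for i, value in enumerate(values):
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total
```

Because the rule is purely positional, any characters with a numeric value go through the same arithmetic, which is why a smaller Greek numeral placed before a larger one ends up subtracted.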
> >>> import unicodedata
> >>> import romanclass as Roman
> >>> g2 = '\U0001015c'
> >>> unicodedata.name(g2)
> 'GREEK ACROPHONIC THESPIAN TWO'
> >>> g5000 = '\U00010172'
> >>> unicodedata.name(g5000)
> 'GREEK ACROPHONIC THESPIAN FIVE THOUSAND'
> >>> g5002 = g5000 + g2 # string concatenation (not addition)
> >>> g5002
> >>> Roman.Roman(g5002)
> >>> print(Roman.Roman(g5002))
> >>> # but -- since Roman math subtracts values on the left...
> >>> print(Roman.Roman(g2 + g5000))
This is all an unimportant side effect of my attempt to support actual
Unicode Roman numbers:
> >>> u'\u2167'
> >>> eight = Roman.Roman(u'\u2167')
> >>> print(eight + 10) # NOTE: mathematical addition
This all assumes that we are talking about Acrophonic (or Herodian or
Attic) numerals. The Greeks also used Alphabetic (also called Milesian,
Alexandrian, or Ionic) numerals. In that system, the value of pi ('\u03c0')
is 80 (and has nothing to do with the circumference of a circle.) That
usage, however, is not recognized by Unicode:
> >>> '\u03c0'
> >>> pi = '\u03c0'
> >>> unicodedata.name(pi)
> 'GREEK SMALL LETTER PI'
> >>> unicodedata.numeric(pi)
> Traceback (most recent call last):
> File "<pyshell#113>", line 1, in <module>
> ValueError: not a numeric character
[ as a complete side note: Greeks pronounce the name of that letter as
"pea" not "pie".]
That agrees with Unicode's non-recognition of the numeric value of ASCII
letters used in Roman numerals:
> >>> unicodedata.numeric('X')
> Traceback (most recent call last):
> File "<pyshell#114>", line 1, in <module>
> ValueError: not a numeric character
Any numeric usage requires a definition of how the string is to be parsed:
> >>> Roman.Roman('X')
> >>> float(Roman.Roman('X'))
So, forget all of this noise about all of the other possible things that
could be done with extended definitions of float(). Any of those would
require another definition, and another PEP. This proposal is for only one
thing -- to make the following happen:
>>> inf = '\u221e'
Mark me as +0
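For anyone who wants this behaviour today, a small wrapper gets most of the way there. This is only a sketch, and parse_float is a made-up name, not anything proposed for the standard library:

```python
def parse_float(text):
    """Like float(), but also accept U+221E INFINITY as a spelling of 'inf'."""
    # float() already understands "inf", "+inf" and "-inf", so substituting the
    # symbol is enough; anything malformed still raises ValueError.
    return float(text.replace("\u221e", "inf"))
```

A leading sign keeps working, so parse_float("-\u221e") gives negative infinity, and ordinary input such as "1.5" is passed through unchanged.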
The triple-quoted string literal is a great feature, but there is one
problem. When you use one, it forces you to break out of your current
indentation, which makes code look ugly. I propose a new way to define a
triple back-quote that works the same way regular triple quotes work, but
instead does some simple parsing of the data within the quotes to preserve
the flow of the code. Due to the brittle and sometimes ambiguous nature of
anything 'automatic', this feature is obviously not meant for data where
exact white space is needed. It would be great for docstrings, exception
messages and other types of text.
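For comparison, the kind of parsing described here can be approximated with the existing textwrap.dedent(), which strips the whitespace common to every line. A minimal sketch (the function name is made up) of keeping a message indented with the surrounding code:

```python
import textwrap


def describe():
    # The literal stays indented with the code; dedent() removes the common
    # leading whitespace and keeps any extra, relative indentation.
    message = textwrap.dedent("""\
        First line of the message.
            An indented detail line.
        Last line.
        """)
    return message


print(describe())
```

The difference from the proposal is that this happens through a function call at the point of use rather than through new literal syntax.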
Here is a short example of its usage:
The strange contortions of the "fast sum for lists" discussions got me
wondering about whether it was possible to rehabilitate reduce with a less
error-prone API. It was banished to functools in 3.0 because it was so
frequently used incorrectly, but now its disfavour seems to be causing
people to propose ridiculous things.
The 2.x reduce is modelled on map and filter: it accepts the combinator as
the first argument, and then the iterable, and finally an optional initial
value. The most common error was failing to handle the empty iterable case
sensibly by leaving out the initial value, so you got a TypeError instead
of returning a result.
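The failure mode is easy to reproduce with functools.reduce, which has the same signature. The two helper names below are made up for illustration:

```python
from functools import reduce
import operator


def total(values):
    # No initial value: fine for non-empty input, TypeError for empty input.
    return reduce(operator.add, values)


def safe_total(values):
    # With an initial value, the empty case simply returns that value.
    return reduce(operator.add, values, 0)


print(total([1, 2, 3]))
print(safe_total([]))
try:
    total([])
except TypeError as exc:
    print("TypeError:", exc)
```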
So, what if we instead added a new alternative API based on Haskell's
"fold" where the initial value is *mandatory*:
def fold(op, start, iterable):
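The function body did not survive in this copy of the message. A minimal sketch of what such a function could look like, assuming a plain left-to-right application of op (this is an illustration, not necessarily the original definition):

```python
import operator


def fold(op, start, iterable):
    """Left fold: combine start with each item in turn using op."""
    result = start
    for item in iterable:
        result = op(result, item)
    return result


# With an in-place operator the start value itself is mutated and returned,
# so a fresh list should be passed each time.
merged = fold(operator.iadd, [], [[1, 2], (3,), "ab"])
print(merged)
```

This is equivalent to functools.reduce(op, iterable, start); the only change is that the start value comes second and cannot be left out.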
Efficiently merging a collection of iterables into a list would then just
be:
data = fold(operator.iadd, [], iterables)
I'd personally be in favour of the notion of also allowing strings as the
first argument, so you could instead write:
data = fold("+=", [], iterables)
This could also be introduced as an alternative API in functools.
(Independent of this idea, it would actually be nice if the operator module
had a dictionary mapping from op symbols to names, like
operator.by_symbol["+="] giving operator.iadd)
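A sketch of how the symbol lookup and the string form could fit together. The operator module has no by_symbol attribute today, so the dictionary below is a hand-written stand-in covering only a few operators:

```python
import functools
import operator

# Hypothetical stand-in for the suggested operator.by_symbol mapping.
BY_SYMBOL = {
    "+": operator.add,
    "+=": operator.iadd,
    "-": operator.sub,
    "-=": operator.isub,
    "*": operator.mul,
    "*=": operator.imul,
}


def fold(op, start, iterable):
    """Fold with a mandatory start value; op may be a callable or a symbol string."""
    if isinstance(op, str):
        op = BY_SYMBOL[op]  # KeyError for symbols the table does not know
    return functools.reduce(op, iterable, start)


print(fold("+=", [], [[1], [2, 3]]))
print(fold("*", 1, [2, 3, 4]))
```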
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
Wow, that's a lot to take in.
I don't think things through very well sometimes.
I am not trying to "bring back" Python2, I am trying to see if we can fix
the problem that was always there that got worse in Python3.
I guess the bottom line for me is what you said:
> In most cases, there are other ways to do it--base64.encode(),
> binascii.hexlify(), etc. *It would be nice if there was a convenient and
> consistent way to do all of them instead of having to hunt around the stdlib
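To make the "hunting around" concrete, here are the places the same kinds of conversion currently live in Python 3. Only existing standard library calls are used; the variable names are just for illustration:

```python
import base64
import binascii
import codecs

raw = "hello".encode("utf-8")            # text -> bytes: a str method

hex_a = binascii.hexlify(raw)            # bytes -> hex: binascii module
hex_b = codecs.encode(raw, "hex")        # same result through the codecs module
b64 = base64.b64encode(raw)              # bytes -> base64: base64 module

print(hex_a, hex_b, b64)
```

All three produce bytes, but each is reached through a different module and a different calling convention.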
What does that do if I use it?
It could be used to convert a string to an integer (if applicable).
So have you thought of a solution for this problem?
This is a little off-topic. Can anyone tell me why we support numerals
in other alphabets but apparently not Greek?
On Fri, Jul 12, 2013 at 11:29 AM, Joshua Landau <joshua(a)landau.ws> wrote:
> On 12 July 2013 16:14, Gerald Britton <gerald.britton(a)gmail.com> wrote:
>> "Just because."
>> so, maybe we should have the interpreter spit out ∞ instead?
> I don't know whether this was a joke, but just as int("߅") spits out 5
> and not ߅, there is no reason that float("inf") should spit out
> anything other than "inf".
>> I get that we special case infinity. It's an IEEE thing. I can see
>> the next request coming: The various constants represented by unicode
> I don't see how one leads to the next. No one thinks that that's a good
> idea. This is a *very* restricted change that fits with what we have
> already done.
> I don't get the hostility to it. I do get the objections that this
> isn't needed or that float() has a more restricted scope but this
> overt dislike to this extent surprises me. This is a *minor* extension
> of the leniency there already is. I'm approximately neutral on the issue,
> but I'm definitely not as negative as a lot of the reviews it's
I've always found +=, -= and the like to be handy, but I had hoped that,
like so many other things in Python, there would be a generic form of this.
x += 5 could be expressed as x = ? + 5 perhaps.