Here is an idea: add a mark to the PyUnicode object that gives a fast
answer to the question of whether a string contains surrogate code
points. This mark has one of three possible states:
* String doesn't contain surrogates.
* String contains surrogates.
* It is still unknown.
We can combine this with the "is_ascii" flag into a 2-bit value:
* String is ASCII-only (and doesn't contain surrogates).
* String is not ASCII-only and doesn't contain surrogates.
* String is not ASCII-only and contains surrogates.
* String is not ASCII-only and it is still unknown whether it contains surrogates.
By default a string is created in the "unknown" state (if it is UCS2 or
UCS4). After the first request the state can be switched to "has
surrogates" or "has no surrogates". The state of the result of
concatenation or slicing can be determined from the states of the input
strings.
This will allow faster UTF-16 and UTF-32 encoding (and perhaps even
slightly faster UTF-8 encoding) and faster conversion to wchar_t* if the
string has no surrogates (which is true in most cases).
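To illustrate how the state could propagate, here is a rough pure-Python model of the idea. It is only a sketch: the real change would live in the C implementation of PyUnicode, and the names Surrogates, scan, concat_state and slice_state are made up for this example. The ASCII bit is left out to keep it short.

```python
from enum import Enum


class Surrogates(Enum):
    """Hypothetical tri-state mark; not an existing CPython flag."""
    NO = 0
    YES = 1
    UNKNOWN = 2


def scan(text):
    """Resolve an UNKNOWN state by looking for code points U+D800..U+DFFF."""
    if any("\ud800" <= ch <= "\udfff" for ch in text):
        return Surrogates.YES
    return Surrogates.NO


def concat_state(left, right):
    """State of a + b, derived from the states of a and b alone."""
    if Surrogates.YES in (left, right):
        return Surrogates.YES
    if Surrogates.UNKNOWN in (left, right):
        return Surrogates.UNKNOWN
    return Surrogates.NO


def slice_state(source):
    """State of a slice: only the absence of surrogates is inherited."""
    if source is Surrogates.NO:
        return Surrogates.NO
    # The slice may have cut the surrogates out, so a rescan would be needed.
    return Surrogates.UNKNOWN
```

Note that a slice of a "has surrogates" string falls back to "unknown" in this model, since nothing can be said about it without scanning again.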
What do you think about using pprint.pprint() to output the result of
evaluating an expression entered in an interactive Python session (and
in IDLE)?
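For anyone who wants to try this today, something close can be approximated by replacing sys.displayhook. This is only a sketch of the behaviour, not a proposed patch; the function name pprint_displayhook is made up here, and IDLE may need its own wiring.

```python
import builtins
import pprint
import sys


def pprint_displayhook(value):
    """Display interactive results with pprint instead of plain repr()."""
    if value is None:
        return
    # Mirror the default hook: clear "_" first so printing cannot recurse
    # through it, then remember the last result in "_".
    builtins._ = None
    pprint.pprint(value)
    builtins._ = value


sys.displayhook = pprint_displayhook
```

Putting this in the file named by PYTHONSTARTUP would enable it for every interactive session.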
I suppose this has already been proposed in the past, but I couldn't
find any online reference, so here goes.
When it comes to importing module constants I usually like being
explicit, and that's fine with me as long as all I have to do is:
>>> from resource import (RLIMIT_CORE, RLIMIT_CPU, RLIMIT_FSIZE)
Nevertheless, when the existence of certain constants depends on the
platform in use, I end up doing:
>>> import resource
>>> if hasattr(resource, "RLIMIT_MSGQUEUE"):  # Linux only
...     from resource import RLIMIT_MSGQUEUE
...
>>> if hasattr(resource, "RLIMIT_NICE"):  # Linux only
...     from resource import RLIMIT_NICE
...
...or worse, if for simplicity I want to import all RLIMIT_* constants,
I have to do this:
>>> import resource
>>> import sys
>>> for name in dir(resource):
...     if name.startswith('RLIMIT_'):
...         setattr(sys.modules[__name__], name, getattr(resource, name))
...
...or just give up and use:
from resource import *
...which of course will pollute the namespace with unnecessary stuff.
So why not just allow "from resource import RLIMIT_*" syntax?
Another interesting variation might be:
>>> from socket import AF_*, SOCK_*
>>> AF_INET, AF_INET6, SOCK_STREAM, SOCK_DGRAM
(2, 10, 1, 2)
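Until something like this exists as syntax, the same effect can be approximated with a small helper. The sketch below is just one way to do it; import_matching is a hypothetical name, not an existing function, and it uses shell-style patterns from fnmatch to stand in for the proposed "AF_*" notation.

```python
import fnmatch
import importlib


def import_matching(module_name, *patterns, namespace):
    """Copy each attribute of a module whose name matches any pattern.

    Hypothetical helper emulating "from module import PREFIX_*".
    Returns the sorted list of names that were copied.
    """
    module = importlib.import_module(module_name)
    imported = {}
    for name in dir(module):
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns):
            imported[name] = getattr(module, name)
    namespace.update(imported)
    return sorted(imported)


# Collect the constants into a dict here; pass globals() instead to
# bind them as module-level names.
ns = {}
names = import_matching("socket", "AF_*", "SOCK_*", namespace=ns)
print(ns["AF_INET"], ns["SOCK_STREAM"])
```

Unlike "from socket import *", only the names matching the given patterns end up in the target namespace.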
On the other hand, mixing "*" patterns with ordinary imports would be forbidden:
>>> from socket import AF_*, socket
  File "<stdin>", line 1
    from socket import AF_*, socket
                                  ^
SyntaxError: invalid syntax
Thoughts?
--- Giampaolo
https://code.google.com/p/pyftpdlib/
https://code.google.com/p/psutil/
https://code.google.com/p/pysendfile/