Unicode and Python - how often do you index strings?
Tim Chase
python.list at tim.thechases.com
Tue Jun 3 22:37:17 EDT 2014
On 2014-06-04 12:16, Chris Angelico wrote:
> On Wed, Jun 4, 2014 at 11:11 AM, Tim Chase
> <python.list at tim.thechases.com> wrote:
> > I then take row 2 and use it to make a mapping of header-name to a
> > slice-object for slicing the subsequent strings:
> >
> > slice(i.start(), i.end())
> >
> > print("EmpID = %s" % row[header_map["EMPID"]].strip())
> > print("Name = %s" % row[header_map["NAME"]].strip())
> >
> > which I presume uses string indexing under the hood.
>
> Yes, it's definitely going to be indexing. If strings were
> represented internally in UTF-8, each of those calls would need to
> scan from the beginning of the string, counting and discarding
> characters until it finds the place to start, then counting and
> retaining characters until it finds the place to stop. Definite
> example of what I'm looking for, thanks!
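A minimal sketch of the header-to-slice mapping quoted above. It assumes a fixed-width file whose header row holds column names separated by runs of spaces. The sample data, the column widths, and the regular expression are illustrative assumptions, not the original code.

```python
import re

# Hypothetical fixed-width layout; the real files are not shown in the thread.
LAYOUT = "%-8s%-16s%-6s"
header = LAYOUT % ("EMPID", "NAME", "DEPT")
rows = [
    LAYOUT % ("1001", "Alice Smith", "ENG"),
    LAYOUT % ("1002", "Bob Jones", "OPS"),
]

# Assumption: a column spans from the start of its name to the start of the
# next name, so each match covers the name plus its trailing spaces.
header_map = {
    m.group().strip(): slice(m.start(), m.end())
    for m in re.finditer(r"\S+\s*", header)
}

for row in rows:
    # Slicing with a slice object is ordinary string indexing.
    print("EmpID = %s" % row[header_map["EMPID"]].strip())
    print("Name = %s" % row[header_map["NAME"]].strip())
```

Every lookup such as row[header_map["NAME"]] indexes the string by character position, which is the operation under discussion.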
For what it's worth, most of the lines in each file are under ~2k, so
even O(N) or O(log N) indexing wouldn't be grievous. Noticeable, but
not grievous.
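To give a feel for the O(N) case, here is a rough sketch of slicing by code point over raw UTF-8 bytes. The helpers utf8_offset and utf8_slice are hypothetical names invented for illustration. This is not how CPython stores strings; it only shows the scan-and-count work a UTF-8 internal representation would imply, and it assumes well-formed input.

```python
def utf8_offset(data, n, start=0):
    """Return the byte offset reached after skipping n code points from start."""
    pos = start
    while n > 0:
        if pos >= len(data):
            raise IndexError("code point index out of range")
        pos += 1
        # Continuation bytes look like 0b10xxxxxx; skip the rest of this code point.
        while pos < len(data) and (data[pos] & 0xC0) == 0x80:
            pos += 1
        n -= 1
    return pos


def utf8_slice(data, start, stop):
    """Slice UTF-8 bytes by code point positions, scanning from the front."""
    begin = utf8_offset(data, start)               # count and discard
    end = utf8_offset(data, stop - start, begin)   # count and retain
    return data[begin:end]


text = "na\u00efve caf\u00e9 \u2603 row"
encoded = text.encode("utf-8")
# The byte-level scan agrees with ordinary string slicing.
assert utf8_slice(encoded, 2, 10).decode("utf-8") == text[2:10]
```

With lines of a couple of thousand characters, each slice costs at most a few thousand byte checks, which matches "noticeable, but not grievous".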
Glad my example could give you some fodder.
-tkc