[portland] detecting implicit encoding conversion

Christopher Hiller chiller at decipherinc.com
Thu Feb 25 03:14:52 CET 2010


Awesome!  Thanks.

On Wed, Feb 24, 2010 at 5:08 PM, jason kirtland <jek at discorporate.us> wrote:

> On Wed, Feb 24, 2010 at 4:49 PM, Christopher Hiller
> <chiller at decipherinc.com> wrote:
> > List,
> >
> > I'm having a difficult time with this particular problem.  I have a
> > codebase where I would like to find all occurrences of implicit decodes.
> > It's difficult to do this with grep, and I was wondering if there was
> > another way by means of decorators or monkeypatching or compiler/parse
> > tree analysis or something.  An example:
> >
> > foo = u'bar' + 'baz'
> >
> > This implicitly decodes "baz" using the system default encoding.  In my
> > case this encoding is ASCII.
> >
> > However -- and this is where problems can arise -- what if you had this:
> >
> > foo = u'bar' + 'büz'
> >
> > ...which raises a UnicodeDecodeError at runtime if your default encoding
> > is ASCII.
> >
> > Any ideas?  I'm having problems googling for solutions because I'm not
> > entirely sure what to google for.
>
> I went through this process myself recently.  The path I took was to
> switch out the default unicode codec with one that explodes, run the
> unit tests, and incrementally fix the problems.  The code is open
> source and you can snag it here:
>
> http://bitbucket.org/jek/flatland/src/75d8155a30a2/tests/__init__.py
> http://bitbucket.org/jek/flatland/src/75d8155a30a2/tests/_util.py
>
> The short version looks like:
>
> import codecs
>
> class NoCoercionCodec(codecs.Codec):
>     # Refuse every conversion routed through this codec.
>     def encode(self, input, errors='strict'):
>         raise UnicodeError("encoding coercion blocked")
>
>     def decode(self, input, errors='strict'):
>         raise UnicodeError("encoding coercion blocked")
>
> The real version is a little longer.  The stdlib does some implicit
> conversions, and in my case I didn't want those to explode.
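
The short version above defines the codec but does not show how it becomes the default. A minimal sketch of that wiring follows, under stated assumptions: the codec name 'nocoercion' and the helpers _search, install_as_default and is_blocked are hypothetical names chosen for illustration and may not match the linked flatland code. Making it the default relies on sys.setdefaultencoding, which exists only on Python 2, so that step is guarded; registering the codec and decoding through it explicitly works on any version.

```python
import codecs
import sys


class NoCoercionCodec(codecs.Codec):
    """Codec that refuses every conversion routed through it."""

    def encode(self, input, errors='strict'):
        raise UnicodeError("encoding coercion blocked")

    def decode(self, input, errors='strict'):
        raise UnicodeError("encoding coercion blocked")


CODEC_NAME = 'nocoercion'  # hypothetical name, not taken from the post


def _search(name):
    # Codec search function: answer only for our own codec name.
    if name != CODEC_NAME:
        return None
    codec = NoCoercionCodec()
    return codecs.CodecInfo(codec.encode, codec.decode, name=CODEC_NAME)


codecs.register(_search)


def install_as_default():
    """Python 2 only: make the exploding codec the default encoding."""
    if sys.version_info[0] != 2:
        raise RuntimeError("implicit str/unicode coercion is Python 2 only")
    reload(sys)  # site.py deletes sys.setdefaultencoding; reload restores it
    sys.setdefaultencoding(CODEC_NAME)


def is_blocked(data):
    # True if decoding through the codec raises, which is the intended effect.
    try:
        codecs.decode(data, CODEC_NAME)
    except UnicodeError:
        return True
    return False
```

On Python 2, calling install_as_default() at the top of the test suite should make an expression such as u'bar' + 'baz' raise UnicodeError, so each implicit decode shows up as a test failure with a traceback. As the reply notes, stdlib code that relies on implicit conversion would also fail under this bare version.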



-- 
christopher hiller
sr software engineer
decipher
34 nw 1st ave, ste 305
portland or 97209