[Tutor] output not in ANSI, converting char set to locale.getpreferredencoding()

Tue Aug 14 14:59:30 CEST 2012

> From: eryksun at gmail.com
> Date: Mon, 13 Aug 2012 15:12:04 -0400
> To: joel.goldstick at gmail.com
> CC: alan.gauld at btinternet.com; tutor at python.org
> Subject: Re: [Tutor] output not in ANSI
> 
> On Mon, Aug 13, 2012 at 2:04 PM, Joel Goldstick
> <joel.goldstick at gmail.com> wrote:
> >
> > I believe in this context the OP means ASCII.  ASCII became an ANSI
> > recognized standard many years ago
> 
> In Windows, ANSI refers to the locale-dependent 8-bit codepage. But
> there is no ANSI standard for Windows1252. It's a common misnomer in
> the OS dialog boxes and controls. Another MS misnomer is labeling
> UTF-16 as 'Unicode'.
> 
> @leon zaat
> 
> Process your text with Unicode. Open the file using codecs.open set to
> your platform's preferred encoding, e.g. 'cp1252' for Western,
> 'cp1251' for Cyrillic, or locale.getpreferredencoding() in general.
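The advice quoted above can be sketched roughly as follows. This is only a minimal illustration, not the code from my program: the temporary file path is hypothetical, and 'cp1252' is hard-coded so the result is the same on any machine (on the target system locale.getpreferredencoding() would take its place).

```python
import codecs
import os
import tempfile

# Assumption: a Western European Windows code page.
# On the target machine this could be locale.getpreferredencoding().
encoding = 'cp1252'

# Hypothetical output path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'example.txt')

# Unicode text, not a byte string (\xeb is the code point of e-diaeresis).
text = u'P.J. No\xebl Bakerstraat'

# codecs.open returns a file object that encodes unicode text on write.
with codecs.open(path, 'w', encoding) as f:
    f.write(text)

# Read the raw bytes back to see what actually landed in the file.
with open(path, 'rb') as f:
    data = f.read()
```

The point of the sketch is that the text stays unicode inside the program and is only turned into bytes at the moment it is written.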

I tried changing my code.
I now have this piece of code:

import csv
import codecs
import locale
import wx  # needed for wx.Frame below

# Global variable
bagObjecten = []
chartype=locale.getpreferredencoding()
#------------------------------------------------------------------------------
# BAGExtractPlus shows the main window of the BAG Extract+ tool
#------------------------------------------------------------------------------
class BAGExtractPlus(wx.Frame):

    #------------------------------------------------------------------------------
    # write the records
    #------------------------------------------------------------------------------
    def schrijfExportRecord(self, verblijfhoofd,identificatie):

        sql1="Select openbareruimtenaam, woonplaatsnaam  from nummeraanduiding where identificatie = '" + identificatie + "'"
        num= database.select(sql1)
        for row in num:
            openbareruimtenaam1=row[0]
            openbareruimtenaam=u'' + (openbareruimtenaam1.encode(chartype))  # <-- raises the error
            woonplaatsnaam1=row[1]  # second selected column
            woonplaatsnaam=u'' + (woonplaatsnaam1.encode(chartype))
            newrow=[openbareruimtenaam, woonplaatsnaam]
            verblijfhoofd.writerow(newrow)

    #--------------------------------------------------------------------------------------
    # Export the required data
    #--------------------------------------------------------------------------------------
    def ExportBestanden(self, event):
        ofile=codecs.open(r'D:\bestanden\BAG\adrescoordinaten.csv', 'wb', chartype)
        verblijfhoofd = csv.writer(ofile, delimiter=',',
                 quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
        counterVBO=2
        identificatie='0014010011066771'
        while 1 < counterVBO:
            hulpIdentificatie= identificatie
            sql="Select identificatie, hoofdadres, verblijfsobjectgeometrie  from verblijfsobject where "
            sql= sql + "identificatie > '" +  hulpIdentificatie + "'"
            vbo= database.select(sql)
            if not vbo:
                break
            else:
                for row in vbo:
                    identificatie=row[0]
                    verblijfobjectgeometrie=row[2]
                    self.schrijfExportRecord(verblijfhoofd, identificatie)

I highlighted in red the lines that I think are important.
When I try to convert openbareruimtenaam from the data below:
"P.J. Noël Bakerstraat";"Groningen"

I get the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)
on the line openbareruimtenaam=u'' + (openbareruimtenaam1.encode(chartype)).

I know that the default system encoding is ascii and that chartype=b'cp1252'.
But how can I bypass the ascii encoding?
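
A minimal sketch of what seems to be going on, under one assumption that is not confirmed: that the database hands back UTF-8 byte strings (the 0xc3 byte at position 7 would be the first byte of ë in UTF-8). In Python 2, calling .encode() on a byte string first decodes it with the default 'ascii' codec, and that hidden step appears to be what fails. The name raw below is a stand-in for row[0], and 'cp1252' stands in for chartype.

```python
# Stand-in for row[0]; assumed to be UTF-8 bytes as returned by the database.
raw = b'P.J. No\xc3\xabl Bakerstraat'

# Python 2 performs this ascii decode implicitly inside raw.encode(chartype).
# Doing it explicitly reproduces the same failure.
try:
    raw.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decode with the encoding the bytes are really in (assumed UTF-8),
# then encode to the target code page.
text = raw.decode('utf-8')        # unicode text
encoded = text.encode('cp1252')   # byte string in the target code page
```

If that assumption holds, decoding explicitly with 'utf-8' before encoding avoids the ascii step altogether; the default encoding itself does not need to be changed.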
