str and unicode proper usage

Dave Angel davea at ieee.org
Mon Dec 14 14:19:26 EST 2009



gizli wrote:
> Hi all,
>
> If an entire application operates on Unicode strings from UI to
> database, is there a use case for str() and unicode() functions? The
> application should be able to read/write files, open sockets and
> execute external processes and parse their output. From my own
> experiments, the open() command for files accepts unicode strings. I
> am just wondering if there is a place where str() would have to be
> used, other than the usual use case of converting a non-string python
> construct (such as an integer) into a string.
>
> The reason I am asking is that I work on a project with several other
> developers, and our NLS testing is not going so well. The major reason
> is (I think) that there are a lot of str() calls interspersed
> everywhere, so whenever a non-ASCII character ends up in those
> variables, the application breaks. My recommendation to the team was
> to remove these calls and leave only the necessary ones. However, I
> do not have a general answer for when a str() call is necessary.
>
> Thanks!
>
>   
Consider switching to Python 3.x if none of your third-party libraries
are incompatible with it.  There, the str type is always Unicode, and
string literals are Unicode by default.
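
A small sketch of what that looks like, assuming a Python 3
interpreter (the variable names are only illustrative):

```python
# Python 3: str holds Unicode code points, bytes holds raw 8-bit data.
text = "na\u00efve caf\u00e9"     # a plain literal is already Unicode
data = text.encode("utf-8")       # explicit conversion to 8-bit data

print(type(text).__name__, len(text))   # length counted in characters
print(type(data).__name__, len(data))   # length counted in bytes

# str() on a value that is already text just returns the same text,
# so it cannot trigger a hidden ASCII encode the way 2.x str() can.
same = str(text)
```

The two lengths differ because each accented character takes two
bytes in UTF-8 but counts as a single character in a str.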

But if 3.x isn't an option, I'd say you only need 8-bit strings when
doing I/O to 8-bit devices and files.  You might also need them when
talking to a program not under your own control.  If it's feasible,
convert input data to Unicode immediately, do all your processing
(including all literal strings) in Unicode, and convert back on output.
In 2.x, calling str() on a unicode object encodes it with the default
ASCII codec, which is why stray str() calls fail on non-ASCII text.
You may also need 8-bit strings for some OS calls, but if you're writing
portable code, those should be minimized.
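
As a rough illustration of the "decode on input, encode on output"
approach described above, here is a minimal sketch.  The helper names
(read_text, write_text), the temporary file, and the choice of UTF-8
are assumptions made for the example; the real encoding depends on
whatever produces and consumes your data.  It uses io.open, which is
available in 2.6+ and 3.x.

```python
import io
import os
import tempfile

ENCODING = "utf-8"  # assumption: the external data is UTF-8


def read_text(path):
    """Decode at the boundary: bytes on disk become Unicode in memory."""
    with io.open(path, "r", encoding=ENCODING) as f:
        return f.read()


def write_text(path, text):
    """Encode at the boundary: Unicode in memory becomes bytes on disk."""
    with io.open(path, "w", encoding=ENCODING) as f:
        f.write(text)


# Simulate an 8-bit source: raw bytes as a file or socket would supply them.
raw = u"caf\u00e9".encode(ENCODING)
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with io.open(path, "wb") as f:
    f.write(raw)

text = read_text(path)           # Unicode from here on
shouted = text.upper() + u"!"    # processing uses Unicode literals only
write_text(path, shouted)        # converted back to bytes on output

# Read the file back in binary mode to see what actually hit the disk.
with io.open(path, "rb") as f:
    result_bytes = f.read()
```

With this layout the only places that know about the byte encoding
are the two helpers, so no str() call is needed in the processing code.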

DaveA



