About size of Unicode string
Laszlo Zsolt Nagy
gandalf at geochemsource.com
Mon Jun 6 19:42:49 CEST 2005
Frank Abel Cancio Bello wrote:
>I need to know the size of a string object independently of its encoding. For
> len('123') == len('123'.encode('utf_8'))
>while the size of the '123' object is different from the size of
>I need to send a string in an HTTP request, so I need to know the length of the
>string to set the "Content-Length" header independently of its encoding.
This is from the RFC:
> The Content-Length entity-header field indicates the size of the
> entity-body, in decimal number of OCTETs, sent to the recipient or, in
> the case of the HEAD method, the size of the entity-body that would
> have been sent had the request been a GET.
> Content-Length = "Content-Length" ":" 1*DIGIT
> An example is
> Content-Length: 3495
> Applications SHOULD use this field to indicate the transfer-length of
> the message-body, unless this is prohibited by the rules in section
> 4.4 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4>.
> Any Content-Length greater than or equal to zero is a valid value.
> Section 4.4 describes how to determine the length of a message-body if
> a Content-Length is not given.
It looks to me as if the Content-Length header does not care which
encoding you use. It is very low-level stuff. The content length is given
in OCTETs and it represents the size of the body as it is actually sent,
so MIME types, charsets and the like do not enter into it. It is simply
the number of octets (bytes) transferred in the body. Try to write your
unicode strings into a StringIO and take
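
To make the octet counting concrete, here is a minimal sketch. It assumes Python 3 syntax, where str is the Unicode type and encode() returns bytes. The helper name content_length and the sample strings are hypothetical, chosen only for illustration. The idea is to encode the text first and then measure the encoded body, since that is what goes over the wire.

```python
# Minimal sketch, assuming Python 3 (str is Unicode, bytes are octets).
# content_length is a hypothetical helper name, not part of any library.

def content_length(text, encoding="utf-8"):
    """Encode text and return (body, number of octets in body)."""
    body = text.encode(encoding)
    # len() of a bytes object counts octets, which is the unit
    # the RFC uses for Content-Length.
    return body, len(body)

# Pure ASCII: one octet per character in UTF-8.
ascii_body, ascii_len = content_length("123")

# Non-ASCII text: character count and octet count differ in UTF-8.
name = "L\u00e1szl\u00f3"
utf8_body, utf8_len = content_length(name)
latin1_body, latin1_len = content_length(name, "latin-1")

# Header line built from the octet count of the body actually sent.
header = "Content-Length: %d" % utf8_len

print(len(name), utf8_len, latin1_len)
print(header)
```

Whatever encoding you pick, the number in the header has to be the length of the encoded bytes you actually send, not the len() of the text string.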