bit = "\xd8\xa3\xd9\x88\xd9\x87 \xd8\xa8\xd8\xaf\xd9\x8a\xd9\x84 \xd9\x85\xd9\x86 \xd9\x82\xd9\x88\xd9\x84\xd8\xaa\xd9\x8a \xd9\x88\xd8\xa7\xd9\x87\xd8\xa7"
# Here the data is bytes, but held in a str.
encode_bit = bytes(bit, "latin-1")  # Here I can tell the computer that it is bytes.
# Then it prints the same values, but as bytes.
print(encode_bit)
decode = encode_bit.decode()  # I can then decode it and get what I want.
print(decode)

bit = open("ss.txt", "r")
bit = bit.read()  # But here I read the file, and it holds the same bytes as above.
encode_bits = bytes(bit, "latin-1")  # But when I say it is bytes, it gets encoded, so \xa3 becomes \\xa3.
decode = encode_bits.decode()  # And when I decode it I get \xa3, not the same result as the first one.
print(decode)
On 23 Feb 2022, at 21:05, one last Day <rizr93172@gmail.com> wrote:
bit = "\xd8\xa3\xd9\x88\xd9\x87 \xd8\xa8\xd8\xaf\xd9\x8a\xd9\x84 \xd9\x85\xd9\x86 \xd9\x82\xd9\x88\xd9\x84\xd8\xaa\xd9\x8a \xd9\x88\xd8\xa7\xd9\x87\xd8\xa7"
Is "bit" bidirectional text? I see what looks like Arabic UTF-8 above.

Use the b prefix to make a bytes literal:

    bit = b"\xd8\xa3\xd9\x88\xd9\x87 \xd8\xa8\xd8\xaf\xd9\x8a\xd9\x84 \xd9\x85\xd9\x86 \xd9\x82\xd9\x88\xd9\x84\xd8\xaa\xd9\x8a \xd9\x88\xd8\xa7\xd9\x87\xd8\xa7"

Now you can decode that into a unicode string:

    text = bit.decode()

If you want to write that into a file as unicode:

    with open('ss.txt', 'w') as f:
        f.write(text)

If you want to write the UTF-8 bytes:

    with open('ss.txt', 'wb') as f:
        f.write(bit)

or

    with open('ss.txt', 'wb') as f:
        f.write(text.encode())

Note: encode() and decode() both default to UTF-8.
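To complete the round trip, the file can be read back either as text or as raw bytes. The following is a minimal sketch rather than part of the original reply: the temporary path and the variable names are illustrative, and it passes encoding="utf-8" explicitly because open() in text mode otherwise falls back to a platform-dependent default encoding.

```python
import os
import tempfile

# The first two words from the message, written as a bytes literal.
data = b"\xd8\xa3\xd9\x88\xd9\x87 \xd8\xa8\xd8\xaf\xd9\x8a\xd9\x84"
text = data.decode("utf-8")

# Illustrative temporary path, not the ss.txt from the thread.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Write as text, naming the encoding explicitly.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read back as text: open() decodes the UTF-8 bytes for us.
with open(path, "r", encoding="utf-8") as f:
    text_from_file = f.read()

# Read back as raw bytes, to be decoded by hand if needed.
with open(path, "rb") as f:
    bytes_from_file = f.read()

print(text_from_file == text)
print(bytes_from_file == data)
```

Either way of reading gives back what was written; the only difference is whether open() or the caller does the decoding.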
# Here the data is bytes, but held in a str.
encode_bit = bytes(bit, "latin-1")  # Here I can tell the computer that it is bytes.
Using latin-1 is not helping in this case. I hope you can see how to make the reading and writing of your text work now.
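For context on why the latin-1 step appears to work on the string literal: Latin-1 maps each code point from 0 to 255 onto the single byte with the same value, so encoding with it only turns a str that holds byte values back into those bytes. A small sketch (variable names are illustrative) suggesting that the result is the same as starting from a bytes literal in the first place:

```python
# A str whose characters happen to carry UTF-8 byte values.
as_str = "\xd8\xa3\xd9\x88\xd9\x87"
# The same values written directly as bytes.
as_bytes = b"\xd8\xa3\xd9\x88\xd9\x87"

# Latin-1 maps code points 0..255 one-to-one onto byte values.
recovered = as_str.encode("latin-1")
print(recovered == as_bytes)

# Both routes decode to the same Arabic text.
print(recovered.decode("utf-8") == as_bytes.decode("utf-8"))

# The str holds six characters; the decoded text holds three letters.
print(len(as_str), len(as_bytes.decode("utf-8")))
```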
# Then it prints the same values, but as bytes.
print(encode_bit)
decode = encode_bit.decode()  # I can then decode it and get what I want.
print(decode)

bit = open("ss.txt", "r")
bit = bit.read()  # But here I read the file, and it holds the same bytes as above.
encode_bits = bytes(bit, "latin-1")  # But when I say it is bytes, it gets encoded, so \xa3 becomes \\xa3.
decode = encode_bits.decode()  # And when I decode it I get \xa3, not the same result as the first one.
print(decode)
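One possible reading of the quoted code is that ss.txt contains the escape sequences as literal characters (a backslash, an x, and two hex digits) rather than the bytes themselves. That is an assumption, since the file's contents are not shown in the thread. Under that assumption, a sketch of turning such text back into Arabic could go through the documented unicode_escape codec first; the sample string below is made up for illustration.

```python
# Assumed file contents: backslash-x escapes stored as plain text,
# not real bytes. A raw string keeps the backslashes literal.
file_text = r"\xd8\xa3\xd9\x88\xd9\x87"

# Encoding with latin-1 alone leaves the backslashes in place.
still_escaped = bytes(file_text, "latin-1")
print(still_escaped)

# Step 1: interpret the \xNN escapes, giving a str of code points 0..255.
unescaped = file_text.encode("ascii").decode("unicode_escape")
# Step 2: turn those code points back into bytes, one byte each.
raw_bytes = unescaped.encode("latin-1")
# Step 3: decode the bytes as UTF-8 to get the actual text.
text = raw_bytes.decode("utf-8")

print(raw_bytes == b"\xd8\xa3\xd9\x88\xd9\x87")
print(len(text))
```

If the file can instead be written with the real UTF-8 bytes, as described earlier in the reply, none of these steps are needed.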
Barry
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/MMHWLH...
Code of Conduct: http://python.org/psf/codeofconduct/
participants (2)
- Barry Scott
- one last Day