ascii-unicode replacement
Andrea Valle
andrea.valle at unito.it
Thu Apr 5 13:28:20 EDT 2007
Hi to all,
I generated some text files with another language which cannot handle
Unicode.
As I need special characters in the resulting text files (IPA
extensions), my idea was to define some special ASCII sequences in the
text files, open the text files in Python, replace the special
sequences with Unicode and encode in UTF-8. I made some tests in the
console and everything seemed fine.
But my script keeps on raising exceptions related to encoding.
Sorry if it's obvious but I really can't figure out what to do.
The script follows.
Thanks a lot
-a-
# a class for replacing ascii with unicode
import codecs
import os

class Unicoder:
    def __init__(self, folder):
        files = os.listdir(folder)
        paths = []
        for x in files:
            paths.append(folder+"/"+x)
        self.files = paths
        # a list containing all the sc-generated .ly files

    def intoText(self, inFile):
        aFile = codecs.open(inFile, "r")
        text = aFile.read() # read all its content in text
        return text

    def replaceSpecials(self, text):
        replacementDict = (
            {"[O]":u"\u0254",
             "[U]":u"\u0277",
             "[E]":u"\u025b",
             "[o|]":u"\xf8",
             "[oe]":u"\u0153",
             "[e:]":u"\u0259",
             "[I]":u"\u026a",
             "[ae]":u"\xe6",
             "[A]":u"\u0251",
             "[Q]":u"\u0252",
             "[V]":u"\u028c"
            }
        )
        # hash table where to look up for replacement
        for ascii in replacementDict:
            print ascii
            utf = replacementDict[ascii]
            text = text.replace(ascii, utf.encode("utf-8"))
        return text

    def toFile(self, text, outFileName):
        outFile = codecs.open(outFileName, encoding='utf-8',
                              mode="w")
        outFile.write(text)
        outFile.close()

    def run(self):
        for aFileName in self.files:
            outFileName = aFileName.split(".")[0]+"UTF.ly"
            text = self.intoText(aFileName)
            text = self.replaceSpecials(text)
            self.toFile(text, outFileName)

if __name__ == "__main__":
    a = Unicoder("/musica/antigone/scores/")
# EOF
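
Editorial note: a likely source of the exceptions is that the script mixes byte strings and Unicode. intoText() opens the file without an encoding, so text is a byte string; replaceSpecials() then inserts UTF-8 encoded bytes into it; finally toFile() hands that byte string to a writer opened with encoding='utf-8', and Python 2 first tries to decode it implicitly as ASCII, which fails on the non-ASCII bytes. A minimal sketch of one way around this follows: decode on read, do the replacement on Unicode text, and let the file object encode on write. It assumes the input files are plain ASCII. The names REPLACEMENTS, replace_specials and convert, and the demo file names, are hypothetical; io.open is used here instead of the post's codecs.open (codecs.open with an explicit encoding argument would serve the same purpose).

```python
import io
import os
import tempfile

# Marker -> Unicode character; a subset of the table in the post.
REPLACEMENTS = {
    u"[O]": u"\u0254",
    u"[E]": u"\u025b",
    u"[ae]": u"\xe6",
}


def replace_specials(text):
    """Replace the ASCII markers in a Unicode string (no encoding here)."""
    for marker, char in REPLACEMENTS.items():
        text = text.replace(marker, char)
    return text


def convert(in_path, out_path):
    # Decode on read (assumption: the input is plain ASCII).
    with io.open(in_path, "r", encoding="ascii") as src:
        text = src.read()
    text = replace_specials(text)
    # Encode on write: the file object produces the UTF-8 bytes.
    with io.open(out_path, "w", encoding="utf-8") as dst:
        dst.write(text)


# Round trip on a throwaway file (hypothetical names and content).
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, "demo.ly")
out_path = os.path.join(tmp_dir, "demoUTF.ly")
with io.open(in_path, "w", encoding="ascii") as f:
    f.write(u"p[O]rta [ae]")
convert(in_path, out_path)
with io.open(out_path, "rb") as f:
    raw = f.read()  # UTF-8 encoded bytes of the converted text
```

The point of the design is that encoding happens in exactly one place, at the file boundary, so the replacement step never sees a mixture of bytes and Unicode.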
--------------------------------------------------
Andrea Valle
--------------------------------------------------
CIRMA - DAMS
Università degli Studi di Torino
--> http://www.cirma.unito.it/andrea/
--> andrea.valle at unito.it
--------------------------------------------------