Changing filenames from Greeklish => Greek (subprocess complain)
Νικόλαος Κούρας
nikos.gr33k at gmail.com
Thu Jun 6 16:39:56 EDT 2013
On Thursday, 6 June 2013 at 11:25:15 PM UTC+3, Lele Gaifax wrote:
> Νικόλαος Κούρας <nikos.gr33k at gmail.com> writes:
>
> > Now the error, after fixing that, changed to:
> >
> > [Thu Jun 06 22:13:49 2013] [error] [client 79.103.41.173] filename = fullpath.replace( '/home/nikos/public_html/data/apps/', '' )
> > [Thu Jun 06 22:13:49 2013] [error] [client 79.103.41.173] TypeError: expected bytes, bytearray or buffer compatible object
> >
> > MRAB has told me that I need to open those paths and filenames as bytestreams and not as unicode strings.
>
> Yes, that way the function will return a list of bytes
> instances. Knowing that, consider the following example, that should
> ring a bell:
>
> $ python3
> Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 09:59:04)
> [GCC 4.7.2] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> path = b"some/path"
> >>> path.replace('some', '')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: expected bytes, bytearray or buffer compatible object
> >>> path.replace(b'some', b'')
> b'/path'

Ah yes, very logical, I should have thought of that.
Thanks, here is what I have up until now, with many corrections.
#========================================================
import os

# Get filenames of the apps directory as bytestrings
path = os.listdir( b'/home/nikos/public_html/data/apps/' )

# iterate over all filenames in the apps directory
for filename in path:
    try:
        # Is this name encoded in utf-8?
        filename.decode('utf-8')
    except UnicodeDecodeError:
        # Decoding from UTF-8 failed, which means that the name is not valid utf-8
        # It appears that this filename is encoded in greek-iso, so decode from that and re-encode to utf-8
        new_filename = filename.decode('iso-8859-7').encode('utf-8')

        # rename filename from greek bytestreams --> utf-8 bytestreams
        old_path = b'/home/nikos/public_html/data/apps/' + filename
        new_path = b'/home/nikos/public_html/data/apps/' + new_filename
        os.rename( old_path, new_path )
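
The rename step can also be tried safely away from the real directory. Below is a minimal sketch of the same idea wrapped in a function and run against a throwaway temp directory. The function name rename_to_utf8, its fallback_encoding parameter and the demo file names are made up for illustration, and the demo assumes a filesystem that accepts arbitrary byte names (as Linux filesystems usually do).

```python
import os
import tempfile


def rename_to_utf8(directory, fallback_encoding='iso-8859-7'):
    """Rename entries of `directory` (a bytes path) whose names are not valid UTF-8.

    Returns a list of (old_name, new_name) pairs, both as bytes.
    """
    renamed = []
    # Passing a bytes path makes os.listdir() return bytes names
    for name in os.listdir(directory):
        try:
            name.decode('utf-8')
            continue  # already valid UTF-8, leave it alone
        except UnicodeDecodeError:
            pass
        # Assume the legacy encoding, then re-encode as UTF-8
        new_name = name.decode(fallback_encoding).encode('utf-8')
        os.rename(os.path.join(directory, name),
                  os.path.join(directory, new_name))
        renamed.append((name, new_name))
    return renamed


# Demo in a scratch directory, never in the real apps directory
demo_dir = os.fsencode(tempfile.mkdtemp())
legacy_name = b'\xc5\xf5\xf7\xde.exe'  # a Greek name encoded as ISO-8859-7
open(os.path.join(demo_dir, legacy_name), 'wb').close()
open(os.path.join(demo_dir, b'plain.txt'), 'wb').close()
renamed = rename_to_utf8(demo_dir)
```

Names that already decode as UTF-8 are skipped, so running the function a second time on the same directory should rename nothing.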
#========================================================
# Get filenames of the apps directory as unicode
path = os.listdir( '/home/nikos/public_html/data/apps/' )

# Load'em
for filename in path:
    try:
        # Check the presence of a file against the database and insert if it doesn't exist
        cur.execute('''SELECT url FROM files WHERE url = %s''', (filename,) )
        data = cur.fetchone()        # URL is unique, so should only be one

        if not data:
            # First time for file; primary key is automatic, hit is defaulted
            cur.execute('''INSERT INTO files (url, host, lastvisit) VALUES (%s, %s, %s)''', (filename, host, lastvisit) )
    except pymysql.ProgrammingError as e:
        print( repr(e) )
#========================================================
# Set of the filenames currently present in the apps directory.
# The INSERT above stores the bare filename as url, so compare bare filenames.
urls = set( path )

# Delete spurious
cur.execute('''SELECT url FROM files''')
data = cur.fetchall()

# Check database's urls against path's urls
for (url,) in data:        # with the default cursor each row is a one-element tuple
    if url not in urls:
        cur.execute('''DELETE FROM files WHERE url = %s''', (url,) )
==================================
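
The "delete spurious" step is really a set difference between what the database holds and what is on disk. Here is a minimal sketch of that idea, using an in-memory sqlite3 database as a stand-in for MySQL so it runs anywhere. This is an assumption for illustration only: the helper name prune_missing, the one-column table and the sample file names are made up, and sqlite3 uses ? placeholders where pymysql uses %s.

```python
import sqlite3


def prune_missing(cur, existing_names):
    """Delete rows of `files` whose url is not in existing_names.

    Returns the set of urls that were deleted.
    """
    cur.execute('SELECT url FROM files')
    # fetchall() returns one tuple per row, so take the first column
    stored = {row[0] for row in cur.fetchall()}
    stale = stored - set(existing_names)
    cur.executemany('DELETE FROM files WHERE url = ?',
                    [(url,) for url in stale])
    return stale


# Stand-in database with a reduced schema (hypothetical sample data)
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE files (url TEXT UNIQUE)')
cur.executemany('INSERT INTO files (url) VALUES (?)',
                [('a.exe',), ('b.exe',), ('gone.exe',)])

# Pretend only two of the three files still exist on disk
deleted = prune_missing(cur, ['a.exe', 'b.exe'])

cur.execute('SELECT url FROM files ORDER BY url')
remaining = [row[0] for row in cur.fetchall()]
```

Whatever is stored in the url column has to be in the same form as the names it is compared against (bare filename on both sides, or full path on both sides), otherwise every row looks stale.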
I think it's ready! But I want to hear from you before I try it! :)