Review Request of Python Code
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Wed Mar 9 00:10:51 EST 2016
On Wednesday 09 March 2016 15:18, subhabangalore at gmail.com wrote:
> I am trying to copy the code here, for your kind review.
>
> import MySQLdb
> import nltk
> def sql_connect_NewTest1():
This function says that it connects to the SQL database, but actually does
much more. It does too much. Split your big function into small functions
that do one thing each.
Your code has too many generic variable names like "var1" ("oh, this is
variable number 1? how useful to know!") and too many commented out dead
lines which make it hard to read. There are too many temporary variables
that get used once, then never used again. You should give your variables
names which explain what they are or what they are used for. You need to use
better comments: explain *why* you do things, don't just write a comment
that repeats what the code does:
    dict_open = open(...) #OPENING THE DICTIONARY FILE
That comment is useless. The code tells us that you are opening the
dictionary file.
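
As a rough illustration of the difference, here is a minimal sketch. The function name split_into_sentences is made up for this example and is not in your code; it only mirrors the split on "." that your code performs on each row.

```python
def split_into_sentences(text):
    """Return the pieces of text between full stops."""
    # Useless comment: "split the text on full stops" (the code says that).
    # Useful comment: explain why. Splitting on "." is a cheap stand-in for
    # real sentence detection, and it will wrongly break on abbreviations
    # such as "e.g." or on decimal numbers.
    return text.split(".")
```

The second comment tells the reader something they cannot see from the code alone: that the split is a known approximation, and where it goes wrong.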
Because I don't completely understand what your code is trying to do, I
cannot simplify the code or rewrite it very well. But I've tried. Try this,
and see if it helps. If not, try simplifying the code some more, explain
what it does better, and then we'll see if we can speed it up.
import MySQLdb
import nltk

def get_words(filename):
    """Return words from a dictionary file."""
    with open(filename, "r") as f:
        words = f.read().split()
    return words

def join_suffix(word, suffix):
    return word + "/" + suffix

def split_sentence(alist, size):
    """Split sentence (a list of words) into chunks of the given size."""
    return [alist[i:i+size] for i in range(0, len(alist), size)]

def process():
    db = MySQLdb.connect(host="localhost",
                         user="*****",
                         passwd="*****",
                         db="abcd_efgh")
    cur = db.cursor()
    cur.execute("SELECT * FROM newsinput limit 0,50;")
    dict_words = get_words("/python27/NewTotalTag.txt")
    words = []
    for row in cur.fetchall():
        lines = row[3].split(".")
        for line in lines:
            for word in line.split():
                if word in dict_words:
                    i = dict_words.index(word)
                    next_word = dict_words[i + 1]
                else:
                    next_word = "NA"
                words.append(join_suffix(word, next_word))
    db.close()
    chunks = split_sentence(words, 7)
    for chunk in chunks:
        print chunk
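
One likely speed-up, offered as a sketch rather than a tested fix: both "word in dict_words" and "dict_words.index(word)" scan the whole list, and they do so for every word in every row. Building a lookup table once should make each lookup roughly constant time. The names build_next_word_table and tag_word are made up for this example, and I am assuming the only thing you need from the dictionary file is "the entry that follows this one".

```python
def build_next_word_table(dict_words):
    """Map each entry to the entry that follows it in the list."""
    table = {}
    for word, following in zip(dict_words, dict_words[1:]):
        # setdefault keeps the first occurrence, matching list.index().
        table.setdefault(word, following)
    return table


def tag_word(word, table):
    # Hypothetical helper: gives the same result as the if/else in
    # process(), using one dict lookup instead of two list scans.
    return word + "/" + table.get(word, "NA")
```

In process(), you would build the table once, right after calling get_words(), and call tag_word(word, table) inside the innermost loop. One difference to be aware of: if a word matches only the very last entry in the file, this version gives "NA", where dict_words[i + 1] would raise IndexError.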
--
Steve