The variable 'train' is a parameter of main() that defaults to the file name 'train.csv', and it is used like this:

    def main(train='train.csv', test='test.csv', submit='logistic_pred.csv'):
        print "Reading dataset..."
        train_data = pd.read_csv(train)
        test_data = pd.read_csv(test)

Let me know if I need to post the full code.