Problem while reading files from hdfs using python
shalini.ravishankar at gmail.com
Sun Jan 25 21:23:12 CET 2015
I am trying to read (open) and write files in HDFS from inside a Python script, but I am getting an error. Can someone tell me what is wrong here?
Code (full): sample.py
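from subprocess import Popen, PIPE
print "Before Loop"
# Read the source file from HDFS through a pipe.
cat = Popen(["hadoop", "fs", "-cat", "./sample.txt"],
            stdout=PIPE)
# Write to HDFS from standard input ("-").
put = Popen(["hadoop", "fs", "-put", "-", "./modifiedfile.txt"],
            stdin=PIPE)
for line in cat.stdout:
    line += "Blah"
    put.stdin.write(line)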
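For reference, the read-modify-write pattern the script is aiming for can be tried without Hadoop. The following is only a minimal sketch, assuming Python 3; `pipe_with_suffix` and its parameters are hypothetical names used for illustration, and the two commands are passed in so that any pair of programs can stand in for `hadoop fs -cat` and `hadoop fs -put -`. It strips the line ending before appending the suffix (otherwise "Blah" lands at the start of the next line) and closes the writer's stdin so the writing command sees end-of-input.

```python
from subprocess import Popen, PIPE


def pipe_with_suffix(read_cmd, write_cmd, suffix=b"Blah"):
    """Stream lines from read_cmd's stdout into write_cmd's stdin.

    Each line gets `suffix` appended before its newline.
    Returns (reader exit code, writer exit code, number of lines copied).
    """
    reader = Popen(read_cmd, stdout=PIPE)
    writer = Popen(write_cmd, stdin=PIPE)
    count = 0
    for line in reader.stdout:
        # Drop the line ending, append the suffix, then restore a newline.
        writer.stdin.write(line.rstrip(b"\r\n") + suffix + b"\n")
        count += 1
    reader.stdout.close()
    # Closing stdin signals end-of-input so the writer can finish.
    writer.stdin.close()
    return reader.wait(), writer.wait(), count


# Assumed usage with the commands from the post (requires a Hadoop client):
# pipe_with_suffix(["hadoop", "fs", "-cat", "./sample.txt"],
#                  ["hadoop", "fs", "-put", "-", "./modifiedfile.txt"])
```

Checking both exit codes is a deliberate choice here: a non-zero value from either command shows which side of the pipe failed.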
When I execute:
hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.5.1.jar -file ./sample.py -mapper './sample.py' -input sample.txt -output fileRead
It executes without errors, but I cannot find the file it is supposed to create in HDFS (modifiedfile.txt).
And when I execute:
hadoop fs -getmerge ./fileRead/ file.txt
Inside file.txt, I got:
Can someone please tell me what I am doing wrong here? I don't think it reads from sample.txt.
I would really appreciate the help.
Thanks & Regards,
More information about the Python-list