Problem while reading files from hdfs using python

Shalini Ravishankar shalini.ravishankar at
Sun Jan 25 21:23:12 CET 2015

Hello Everyone,

I am trying to read (open) and write files in HDFS from inside a Python script, but I am having a problem. Can someone tell me what is wrong here?

Code (full):

    from subprocess import Popen, PIPE
    print "Before Loop"
    cat = Popen(["hadoop", "fs", "-cat", "./sample.txt"],
                stdout=PIPE)
    put = Popen(["hadoop", "fs", "-put", "-", "./modifiedfile.txt"],
                stdin=PIPE)
    for line in cat.stdout:
        line += "Blah"
        print line

When I execute:

    hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.5.1.jar -file ./ -mapper './' -input sample.txt -output fileRead

It executes properly, but I can't find the file modifiedfile.txt that it is supposed to create in HDFS.

And when I execute:

    hadoop fs -getmerge ./fileRead/ file.txt

Inside the file.txt, I got :

    Before Loop	
    Before Loop

Can someone please tell me what I am doing wrong here? I don't think it reads from sample.txt.
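
For reference, the read-modify-write pipeline described above could be sketched roughly as follows. This is only a sketch under assumptions, not a confirmed fix: it is written for Python 3 (the script above is Python 2), the helper names transform_stream and add_suffix are made up for illustration, and it has not been run against a real cluster. It forwards each modified line to the writer's stdin and closes that pipe at the end, and it appends "Blah" before the newline rather than after it.

```python
import subprocess
import sys


def add_suffix(raw_line):
    # Lines arrive as bytes; append the marker before the line ending.
    return raw_line.rstrip(b"\r\n") + b"Blah\n"


def transform_stream(read_cmd, write_cmd, transform):
    """Feed read_cmd's stdout, line by line, through transform into write_cmd's stdin.

    Hypothetical helper: returns the exit codes of both commands.
    """
    reader = subprocess.Popen(read_cmd, stdout=subprocess.PIPE)
    writer = subprocess.Popen(write_cmd, stdin=subprocess.PIPE)
    for raw_line in reader.stdout:
        writer.stdin.write(transform(raw_line))
    reader.stdout.close()
    # Closing stdin signals end of input so the writer can finish.
    writer.stdin.close()
    return reader.wait(), writer.wait()


# Intended use with the commands from the script above (needs the hadoop CLI):
#
#     codes = transform_stream(
#         ["hadoop", "fs", "-cat", "./sample.txt"],
#         ["hadoop", "fs", "-put", "-", "./modifiedfile.txt"],
#         add_suffix,
#     )
#     # Report on stderr: under Hadoop Streaming, stdout becomes job output.
#     sys.stderr.write("exit codes: %r\n" % (codes,))
```

Diagnostics go to stderr here because anything the mapper prints to stdout ends up in the job output, which is consistent with "Before Loop" showing up in file.txt.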

I would really appreciate the help.

Thanks & Regards,
Shalini Ravishankar.
