problem with regex

dimmaim at gmail.com dimmaim at gmail.com
Mon Apr 28 08:52:02 EDT 2014


I want to extract specific URLs from a text file, but I have run into some issues. First, when I copy just two lines from the file and assign them to a variable like this, it works, but only with triple quotes:
 
test='''_*_n.jpg","timelineCoverPhoto":"{\"focus\":{\"x\":0.5,\"y\":0.386925795053},\"photo\":{\"__type__\":{\"name\":\"Photo\"},\"image_lowres\":{\"uri\":\"https://fbcdn-photos-f-a.akamaihd.net/*-*-*/*_*_*_a.jpg\",\"width\":180,\"height\":179}}}","subscribeStatus":"IS_SUBSCRIBED","smallPictureUrl":"https://fbcdn-profile-a.akamaihd.net/*-*-*/s100x100/*_*_*_s.jpg","contactId":"*==","contactType":"USER","friendshipStatus":"ARE_FRIENDS","graphApiWriteId":"contact_*:*:*","hugePictureUrl":"https://fbcdn-profile-a.akamaihd.net/hprofile-ak-frc3/*_*_*_n.jpg","profileFbid":"1284503586","isMobilePushable":"NO","lookupKey":null,"name":{"displayName":"* *","firstName":"*","lastName":"*"},"nameSearchTokens":["*","*"],"phones":[],"phoneticName":{"displayName":null,"firstName":null,"lastName":null},"isMemorialized":false,"communicationRank":1.1144714,"canViewerSendGift":false,"canMessage":true}
*=={"bigPictureUrl":"https://fbcdn-profile-a.akamaihd.net/hprofile-ak-ash3/*.*.*.*/s200x200/*_*_*_n.jpg","timelineCoverPhoto":"{\"focus\":{\"x\":0.5,\"y\":0.49137931034483},\"photo\":{\"__type__\":{\"name\":\"Photo\"},\"image_lowres\":{\"uri\":\"https://fbcdn-photos-h-a.akamaihd.net/*-*-*/*_*_*_a.jpg\",\"width\":180,\"height\":135}}}","subscribeStatus":"IS_SUBSCRIBED","smallPictureUrl":"https://fbcdn-profile-a.akamaihd.net/*-*-*/*.*.*.*/s100x100/*_*_*_a.jpg","contactId":"*==","contactType":"USER","friendshipStatus":"ARE_FRIENDS","graphApiWriteId":"contact_*:*:*","hugePictureUrl":"https://fbcdn-profile-a.akamaihd.net/hprofile-ak-ash3/c0.0.540.540/*_*_*_n.jpg","profileFbid":"*","isMobilePushable":"YES","lookupKey":null,"name":{"displayName":"* *","firstName":"*","lastName":"*"},"nameSearchTokens":["*","*"],"phones":[],"phoneticName":{"displayName":null,"firstName":null,"lastName":null},"isMemorialized":false,"communicationRank":1.2158813,"canViewerSendGift":false,"canMessage":true}'''

uri = re.findall(r'''uri\":\"https://fbcdn-(a-z|photos)?([^\'" >]+)''',test)
print uri

This works fine and I get my result: [('photos', '-f-a.akamaihd.net/*-*-*/*_*_*_a.jpg'), ('photos', '-h-a.akamaihd.net/*-*-*/*_*_*_a.jpg')]

But if I take those same lines, save them to a text file exactly as they appear in the original (without the surrounding quotes), and do the following:

datafile=open('a.txt','r')
data_array=''
for line in datafile:
    data_array=data_array+line

uri = re.findall(r'''uri\":\"https://fbcdn-(a-z|photos)?([^\'" >]+)''',data_array)

Printing uri after this gives an empty list. What do I need to change to make it work on the lines of a text file?
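
One likely cause, sketched here under the assumption that a.txt really contains the backslashes shown in the sample (each inner quote stored as \"): in a normal, non-raw triple-quoted literal, Python turns every \" into a plain ", so the pasted test string holds no backslashes at all, while the text read from the file still does. In the pattern, \" matches only a quote character, so it cannot match the backslash that precedes each quote in the file. The snippet below uses a shortened excerpt of one line from the sample; the names pasted_text, file_text and tolerant are made up for illustration.

```python
import re

# Shortened excerpt pasted into a normal (non-raw) literal:
# Python converts each \" into a plain ", so no backslash survives.
pasted_text = '''\"uri\":\"https://fbcdn-photos-f-a.akamaihd.net/*-*-*/*_*_*_a.jpg\",\"width\":180'''

# The same excerpt as it would arrive from the file (assumption: the file
# stores the backslashes literally). A raw literal keeps them intact.
file_text = r'''\"uri\":\"https://fbcdn-photos-f-a.akamaihd.net/*-*-*/*_*_*_a.jpg\",\"width\":180'''

# Pattern from the post: \" matches a quote only, never a backslash.
original = r'''uri\":\"https://fbcdn-(a-z|photos)?([^\'" >]+)'''

# Hypothetical variant: \\? allows an optional literal backslash before
# each quote, and the backslash is excluded from the captured URL tail.
tolerant = r'''uri\\?":\\?"https://fbcdn-(a-z|photos)?([^\\'" >]+)'''

from_paste = re.findall(original, pasted_text)
from_file = re.findall(original, file_text)
fixed = re.findall(tolerant, file_text)

print(from_paste)  # one match, as in the triple-quoted test
print(from_file)   # empty list, as when reading a.txt
print(fixed)       # one match again
```

Printing repr(data_array) is a quick way to check whether the backslashes are actually present in the data read from the file. Separately, (a-z|photos) matches the literal text a-z rather than a range of letters; a range would be written [a-z].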


