Concatenating columns via Python
MRAB
python at mrabarnett.plus.com
Tue Jul 28 15:17:15 EDT 2015
On 2015-07-28 19:50, hannahgracemcdonald16 at gmail.com wrote:
> I extracted a table from a PDF so the data is quite messy and the data that should be in 1 row is in 3 columns, like so:
> year color location
> 1 1997 blue, MD
> 2 green,
> 3 and yellow
>
> So far my code is below, but I know I am missing data; I am just not sure what to put in it:
>
> # Simply read and split an example Table 4
> import sys
>
The indentation is messed up, which makes it hard to follow.
> # Assigning count number and getting rid of right space
> def main():
> count = 0
> pieces = []
> for line in open(infile, 'U'):
> if count < 130:
> data = line.replace('"', '').rstrip().split("\t")
> data = clean_data(data)
> if data[1] == "year" and data[1] != "":
If the first test is true, then the second test is definitely true, and
is unnecessary.
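A quick check of that claim, using a few made-up values for the second cell (the values are only for illustration):

```python
# Hypothetical values that data[1] might hold.
candidates = ["year", "", "1997"]

# The original two-part test and the shortened one-part test.
original = [c == "year" and c != "" for c in candidates]
simplified = [c == "year" for c in candidates]

# "year" is a non-empty string, so both tests always agree.
print(original == simplified)
```

So the condition can be shortened to data[1] == "year" without changing what the code does.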
> write_pieces(pieces)
> pieces = data
> str.join(pieces)
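str.join _returns_ its result; it doesn't change 'pieces', so the result
needs to be assigned or printed. It also has to be called on the
separator string, not on 'str' itself.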
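A short illustration, assuming the cells should be joined with tabs again (the separator and the sample values are guesses, not taken from the real file):

```python
# Made-up cells for one merged row.
pieces = ["1997", "blue, green, and yellow", "MD"]

# join is called on the separator and returns a new string;
# the list itself is left unchanged.
row = "\t".join(pieces)
print(row)
```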
> else:
> for i in range(len(data)):
> pieces[i] = pieces[i] + data[i]
> str.join(pieces)
>
> # Executing command to remove right space
> def clean_data(s):
> return [x.rstrip() for x in s]
>
> def write_pieces(pieces):
> print
>
> if __name__ == '__main__':
> infile = "file.txt"
> main()
>
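Here is a minimal sketch of one way the merging might be done. It rests on several assumptions about the file that I can't check: the cells are tab-separated, column 0 is the row number, the year is in column 1, and a row with an empty year cell continues the row before it. The name merge_rows and the sample lines are made up for the example.

```python
def merge_rows(lines):
    """Merge continuation rows into the record that precedes them.

    Assumes tab-separated cells with the year in column 1; a row whose
    year cell is empty is treated as a continuation of the previous row.
    """
    records = []
    for line in lines:
        # Strip only the newline so that trailing empty cells survive.
        cells = [c.strip() for c in line.replace('"', '').rstrip("\n").split("\t")]
        if len(cells) > 1 and cells[1]:
            # A non-empty year cell starts a new record.
            records.append(cells)
        elif records:
            current = records[-1]
            # Pad in case the continuation row has more cells.
            current.extend([""] * (len(cells) - len(current)))
            for i, cell in enumerate(cells):
                if i == 0 or not cell:
                    continue  # skip the row-number column and empty cells
                current[i] = (current[i] + " " + cell).strip()
    return records


# Made-up lines laid out like the example table.
sample = [
    "1\t1997\tblue,\tMD\n",
    "2\t\tgreen,\t\n",
    "3\t\tand yellow\t\n",
]
for record in merge_rows(sample):
    print("\t".join(record))
```

Note that it strips only the newline before splitting; calling rstrip() with no argument on the whole line would also remove trailing tabs, and with them the empty cells at the end of a row.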