Parsing file format to ensure file meets criteria
MRAB
python at mrabarnett.plus.com
Thu Dec 17 20:42:56 EST 2009
seafoid wrote:
> Hi folks,
>
> I am new to python and am having some trouble parsing a file.
>
> I wish to parse a file and ensure that the format meets certain
> restrictions.
>
> The file format is as below (abbreviated):
>
> c this is a comment
> p wcnf 1468 817439 186181
> 286 32 0
> 186191 -198 -1098 0
> 186191 98 -1098 1123 0
>
> Lines beginning c are comment lines and must precede all other lines.
>
> Lines beginning p are header lines with the numbers being 'nvar', 'nclauses'
> and 'hard' respectively.
>
> All other lines are clause lines. These must contain at least two integers
> followed by zero. There is no limit on the number of clause lines.
>
> Header lines must precede clause lines.
>
> In the above example:
> nvar = 1468
> nclauses = 817439
> hard = 186191
>
> Now for the interesting part...........
>
> The first number in a clause line = weight.
> All else are literals.
> Therefore, clause = weight + literals
>
> weight <= hard
> |literal| > 0
> |literal| <= nvar
> number of clause lines = nclauses
>
> My attempts thus far have been a dismal failure, computing is so viciously
> logical :confused:
>
> My main problem is that below:
>
> fname = raw_input('Please enter the name of the file: ')
>
> z = open(fname, 'r')
>
> z_list = [i.strip().split() for i in z]
>
> # here each line is converted to a list, all nested within a list - all
> # elements of the list are strings, even integers are converted to strings
>
> Question - how are nested lists indexed?
>
A list is indexed by integers:
>>> my_list = ['a', 'b', 'c']
>>> my_list[0]
'a'
A list of lists requires 2 subscripts, one for the list and the other
for the list in that list:
>>> my_list = [['a', 'b'], ['c', 'd']]
>>> my_list[0]
['a', 'b']
>>> my_list[0][1]
'b'
> I then attempted to extract the comment, headers and clauses from the nested
> list and assign them to a variable.
>
> I tried:
>
z_list is a list of lines, where each line is a list of words.
For example, if the file contains:
c this is a comment
p wcnf 1468 817439 186181
then z_list contains:
[['c', 'this', 'is', 'a', 'comment'], ['p', 'wcnf', '1468',
'817439', '186181']]
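As a rough illustration of how the two subscripts apply here, the sketch below builds the nested list from three lines of the example file and picks values out of it. The name sample_lines is only a stand-in for the open file, and the other variable names are made up for the example:

```python
# Stand-in for the file; a real run would iterate over open(fname) instead.
sample_lines = [
    "c this is a comment\n",
    "p wcnf 1468 817439 186181\n",
    "286 32 0\n",
]

# Same list comprehension as in the original post.
z_list = [line.strip().split() for line in sample_lines]

first_line = z_list[0]      # the whole first line, as a list of words
nvar_text = z_list[1][2]    # third word of the second line, still a string
nvar = int(nvar_text)       # int() turns the string into a number
```

The first subscript selects the line and the second selects the word within that line.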
> for inner in z_list:
>     for lists in inner:
>         if lists[0] == 'c':
>             comment = lists[:]
>         elif lists[0] == 'p':
>             header = lists[:]
>         else:
>             clause = lists[:]
> print comment, header, clause
>
> This does not work for some reasons which I understand. I have messed up the
> indexing and my assignment of variables is wrong.
>
> The aim was to extract the headers and comments and then be left with a
> nested list of clauses.
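Each item of z_list is already one line (a list of words), so the inner loop in the quoted code walks over the individual words, and lists[0] is then the first character of a word. That means a word such as 'comment' also matches the 'c' test. A single loop is probably enough. The sketch below is one possible way to do the split; split_sections is a hypothetical helper name, and skipping blank lines is an assumption, since the post does not say whether they are allowed:

```python
def split_sections(z_list):
    """Separate the comment, header and clause lines of a nested list."""
    comments, headers, clauses = [], [], []
    for words in z_list:
        if not words:               # assumption: blank lines are skipped
            continue
        if words[0] == 'c':         # first word of the line, not first character
            comments.append(words)
        elif words[0] == 'p':
            headers.append(words)
        else:
            clauses.append(words)
    return comments, headers, clauses


example = [
    ['c', 'this', 'is', 'a', 'comment'],
    ['p', 'wcnf', '1468', '817439', '186181'],
    ['286', '32', '0'],
]
comments, headers, clauses = split_sections(example)
```

Appending to lists, rather than assigning to a single variable, keeps every matching line instead of only the last one seen.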
>
> Then I intended to convert the strings within the clauses nested list back
> to integers and via indexing, check that all conditions are met. This would
> have involved also converting the numerical strings within the header to
> integers but the actual strings are proving a difficult problem to ignore.
>
> Any suggestions?
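It may be simpler to check each line as it is read, instead of building the whole nested list first. What follows is only a sketch of the rules as stated in the post, not a definitive checker. The function name check_wcnf and the error messages are invented for the example, and it assumes exactly one header line of the form 'p wcnf nvar nclauses hard' and that blank lines can be ignored:

```python
def check_wcnf(lines):
    """Return a list of error messages; an empty list means no problems found."""
    errors = []
    header = None          # becomes (nvar, nclauses, hard) once the 'p' line is read
    seen_other = False     # True after the first header or clause line
    n_clauses = 0
    for lineno, line in enumerate(lines, 1):
        words = line.split()
        if not words:      # assumption: blank lines are ignored
            continue
        if words[0] == 'c':
            if seen_other:
                errors.append("line %d: comment after header or clause" % lineno)
        elif words[0] == 'p':
            seen_other = True
            if header is not None or n_clauses:
                errors.append("line %d: unexpected header" % lineno)
                continue
            try:
                if len(words) != 5 or words[1] != 'wcnf':
                    raise ValueError
                header = tuple(int(w) for w in words[2:])
            except ValueError:
                errors.append("line %d: malformed header" % lineno)
        else:
            seen_other = True
            n_clauses += 1
            try:
                numbers = [int(w) for w in words]
            except ValueError:
                errors.append("line %d: non-integer in clause" % lineno)
                continue
            # At least two integers (weight and one literal) followed by zero.
            if len(numbers) < 3 or numbers[-1] != 0:
                errors.append("line %d: malformed clause" % lineno)
                continue
            if header is None:
                errors.append("line %d: clause before header" % lineno)
                continue
            nvar, nclauses, hard = header
            weight, literals = numbers[0], numbers[1:-1]
            if weight > hard:
                errors.append("line %d: weight exceeds hard" % lineno)
            for lit in literals:
                if not 0 < abs(lit) <= nvar:
                    errors.append("line %d: literal %d out of range" % (lineno, lit))
    if header is None:
        errors.append("no valid header line found")
    elif n_clauses != header[1]:
        errors.append("expected %d clauses, found %d" % (header[1], n_clauses))
    return errors


good = ["c comment", "p wcnf 3 2 10", "5 1 -2 0", "10 3 0"]
bad = ["p wcnf 3 2 10", "c late comment", "11 4 0"]
good_errors = check_wcnf(good)
bad_errors = check_wcnf(bad)
```

The function accepts any iterable of lines, so an open file can be passed in directly, for example check_wcnf(open(fname)). Collecting messages in a list, rather than stopping at the first problem, reports every violation in one pass.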
>
> If my mistakes are irritatingly stupid, please feel free to advise that I
> r.t.f.m (read the f**king manual). However, thus far the manual has helped
> me little.
>