Newbie: Check first two non-whitespace characters
Jussi Piitulainen
harvesting at is.invalid
Fri Jan 1 03:16:29 EST 2016
otaksoftspamtrap at gmail.com writes:
> I need to check a string over which I have no control for the first 2
> non-white space characters (which should be '[{').
>
> The string would ideally be: '[{...' but could also be something like
> ' [ { ....'.
>
> Best to use re and how? Something else?
No comment on whether re is good for your use case, but here is a comment
on how. First, some test data:
>>> data = '\r\n {\r\n\t[ "etc" ]}\n\n\n'
Then the actual comment - there's a special regex sequence, \S, to match a
non-whitespace character, and a function, re.finditer, to produce matches
on demand:
>>> import re
>>> black = re.compile(r'\S')
>>> matches = re.finditer(black, data)
Then the demonstration. This accesses the first, then second, then third
match:
>>> empty = re.match('', '')
>>> next(matches, empty).group()
'{'
>>> next(matches, empty).group()
'['
>>> next(matches, empty).group()
'"'
The empty match object provides an appropriate .group() when there is no
first or second (and so on) non-whitespace character in the data:
>>> matches = re.finditer(black, '\r\t\n')
>>> next(matches, empty).group()
''
>>> next(matches, empty).group()
''
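Putting the pieces together for the original question, here is a minimal
sketch of a check for the first two non-whitespace characters. The function
names first_non_whitespace and starts_with_bracket_brace are made up for
illustration, and itertools.islice is used in place of repeated next() calls
so that running out of matches simply yields a shorter string:

```python
import re
from itertools import islice

# \S matches any single non-whitespace character.
NON_WHITESPACE = re.compile(r'\S')

def first_non_whitespace(text, count=2):
    """Return the first `count` non-whitespace characters of `text`.

    The result is shorter than `count` (possibly empty) when the text
    does not contain that many non-whitespace characters.
    """
    matches = NON_WHITESPACE.finditer(text)  # lazy: matches are produced on demand
    return ''.join(m.group() for m in islice(matches, count))

def starts_with_bracket_brace(text):
    """Hypothetical helper: True if the first two non-whitespace characters are '[{'."""
    return first_non_whitespace(text, 2) == '[{'
```

With the strings from the question, starts_with_bracket_brace('[{...') and
starts_with_bracket_brace(' [ { ....') should both be true, while the test
data above, which starts with '{' and then '[', should give false.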