Regular expression problem

Chris Angelico rosuav at gmail.com
Sun Mar 10 19:08:44 CET 2013


On Mon, Mar 11, 2013 at 4:59 AM, Chris Angelico <rosuav at gmail.com> wrote:
> On Mon, Mar 11, 2013 at 4:42 AM, mukesh tiwari
> <mukeshtiwari.iiitm at gmail.com> wrote:
>> I am trying to solve this problem[1] using a regular expression. I wrote this code, but I am getting "time limit exceeded". Could someone please tell me how to make this code run faster?
>
> What is the time limit? I just tried it (Python 2.6 under Windows) and
> it finished in a humanly-immeasurable amount of time. Are you sure
> that STDIN (e.g. raw_input()) is where your test data is coming from?

Oops, reading comprehension fail. Time limit is 3s on a Pentium III.
I've no idea how long your code will take on that hardware, but I
doubt that it's taking three seconds. So my query regarding the source of
test data still stands. Can you put together an uber-simple test
program that just echoes the lines of input, to make sure it really is
coming off stdin?
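
Something along these lines would do as the echo test. It is only a
sketch, written for Python 3 (on 2.x the same loop over sys.stdin
works); the function name echo_lines is made up for illustration and
is not from your code:

```python
import sys

def echo_lines(stream=sys.stdin, out=sys.stdout):
    """Copy every line from stream to out; return how many lines were seen."""
    count = 0
    for line in stream:
        out.write(line)
        count += 1
    return count
```

Add a call to echo_lines() as the last line and submit that. If the
judge shows your input echoed back, the data really is arriving on
stdin.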

The problem description certainly does seem to imply stdin, but I
can't see why your code would take three seconds unless it's stalling
for some reason. Though perhaps on a P3 with the maximum 100 tests,
maybe that could take a while...

Something to try: Since you're using re.search(), which already looks
for a match anywhere in the string, see if you can drop the
complemented sets at the beginning [^~!@#$%^&*()<>?,.]* and end
[^~!@#$%^&*()<>?,.a-zA-Z0-9]* - they shouldn't be needed, and they're
going to be slow to process. Also, you can simplify this:

[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*

to this:

[a-zA-Z0-9][a-zA-Z0-9._]{4,}

The brace notation {4,} means "at least 4 repetitions, with no upper
limit".
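
If you want to convince yourself that the two spellings behave the
same, a quick check like this should do it. The sample strings are
invented for the test, and the helper name span_of is just for
illustration:

```python
import re

# The long-hand pattern and its brace-notation equivalent.
LONG = re.compile(
    r"[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*"
)
SHORT = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9._]{4,}")

# Made-up inputs: some long enough to match, some too short.
SAMPLES = ["abcde", "abcd", "a.b_c", ".abcde", "user_name.01", "", "ab!cdefg"]

def span_of(pattern, text):
    """Return the (start, end) of the first match, or None if there is none."""
    m = pattern.search(text)
    return m.span() if m else None

results = [(s, span_of(LONG, s), span_of(SHORT, s)) for s in SAMPLES]
for text, long_span, short_span in results:
    print(repr(text), long_span, short_span)
```

Both columns should agree on every line, since each pattern needs one
leading alphanumeric followed by at least four more characters from
the wider set.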

Try those out and see if you still get the results you want.
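
For the first suggestion, here is a rough way to see that the leading
and trailing sets don't change whether re.search() finds anything. It
assumes the pattern isn't anchored with ^ or $, which I can't tell
from here, and it uses the simplified middle part as a stand-in for
the rest of your regex; the sample strings are invented:

```python
import re

# Stand-in for the "interesting" part of the pattern (an assumption).
CORE = r"[a-zA-Z0-9][a-zA-Z0-9._]{4,}"
# The complemented sets quoted above.
LEAD = r"[^~!@#$%^&*()<>?,.]*"
TRAIL = r"[^~!@#$%^&*()<>?,.a-zA-Z0-9]*"

padded = re.compile(LEAD + CORE + TRAIL)
bare = re.compile(CORE)

SAMPLES = ["  hello_world  ", "!!abcde!!", "abcd", "x y z", "foo.bar9 tail", ""]

def found(pattern, text):
    """True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None

agree = all(found(padded, s) == found(bare, s) for s in SAMPLES)
print(agree)
```

Both sets can match the empty string, so wherever the bare pattern
matches, the padded one matches too, and vice versa. Only the matched
span can differ, so this matters if you use the match text rather than
just testing for a hit.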

ChrisA


