[Baypiggies] quick question: regex to stop naughty control characters
Shannon -jj Behrens
jjinux at gmail.com
Fri Apr 27 20:34:11 CEST 2007
Thanks, Kelly, I suspect you've nailed it!
Best Regards,
-jj
On 4/25/07, Kelly Yancey <kelly at nttmcl.com> wrote:
> Shannon -jj Behrens wrote:
> > Hi,
> >
> > I'm doing some form validation. I accept UTF-8 strings and decode
> > them to unicode objects. I would like to check that the strings are
> > no longer than 128 characters, and that they are "reasonable". I'm
> > using FormEncode with a regex that looks like r".{1,128}$". By
> > "reasonable", I think the only thing I want to prevent are control
> > characters. Now, I'm sure some Unicode whiz out there knows how to do
> > this with some funky Unicode regex magic, but I don't know how.
> > Anyone know the right way to do this? Should I be worried about more
> > than just control characters? I'm already taking care of HTML
> > escaping, SQL injection, etc.
> >
> > Thanks,
> > -jj
> >
>
> JJ,
>
> It ain't pretty, but how about this:
>
> ur"(?u)^[^\u0000-\u001f\u007f-\u009f]{1,128}$"
>
> If Python's re module implemented POSIX named character classes you
> could do this:
> r"(?u)^[^[:cntrl:]]{1,128}$"
>
> Or if it supported Unicode regular expressions as detailed in
> http://www.unicode.org/unicode/reports/tr18/, you could do this:
> r"(?u)^\P{Control}{1,128}$"
>
> But alas, we aren't there yet. :(
> https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1528154&group_id=5470
>
> I hope that works for you,
>
> Kelly
>
>
>
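
For anyone reading this on Python 3, where the ur"" prefix no longer exists, here is a minimal sketch of the control-character check discussed above. The names NO_CONTROL_RE, is_reasonable, and is_reasonable_by_category are made up for illustration and are not part of FormEncode. It assumes "control character" means the Unicode general category Cc (U+0000-U+001F and U+007F-U+009F). It also uses fullmatch() rather than ^...$, because "$" in Python's re also matches just before a trailing newline, so a value ending in "\n" could otherwise slip through.

```python
import re
import unicodedata

# Negated class: anything except C0 controls (U+0000-U+001F),
# DEL (U+007F) and C1 controls (U+0080-U+009F), 1 to 128 times.
NO_CONTROL_RE = re.compile(r"[^\u0000-\u001f\u007f-\u009f]{1,128}")


def is_reasonable(text):
    """Return True if text is 1-128 characters with no control characters."""
    # fullmatch() requires the whole string to match, so no anchors are
    # needed and a trailing newline is rejected like any other control.
    return NO_CONTROL_RE.fullmatch(text) is not None


def is_reasonable_by_category(text):
    """Same check, written against the Unicode general category 'Cc'."""
    if not 1 <= len(text) <= 128:
        return False
    return not any(unicodedata.category(ch) == "Cc" for ch in text)
```

Both functions take an already-decoded str. Whether to also reject other categories (for example format characters, Cf) depends on the application and is not covered by this sketch.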
--
http://jjinux.blogspot.com/