[Baypiggies] quick question: regex to stop naughty control characters

Shannon -jj Behrens jjinux at gmail.com
Fri Apr 27 20:39:13 CEST 2007


Thanks, guys!

-jj

On 4/27/07, Shannon -jj Behrens <jjinux at gmail.com> wrote:
> Thanks, Kelly, I suspect you've nailed it!
>
> Best Regards,
> -jj
>
> On 4/25/07, Kelly Yancey <kelly at nttmcl.com> wrote:
> > Shannon -jj Behrens wrote:
> > > Hi,
> > >
> > > I'm doing some form validation.  I accept UTF-8 strings and decode
> > > them to unicode objects.  I would like to check that the strings are
> > > no longer than 128 characters, and that they are "reasonable".  I'm
> > > using FormEncode with a regex that looks like r".{1,128}$".  By
> > > "reasonable", I think the only thing I want to prevent are control
> > > characters.  Now, I'm sure some Unicode whiz out there knows how to do
> > > this with some funky Unicode regex magic, but I don't know how.
> > > Anyone know the right way to do this?  Should I be worried about more
> > > than just control characters?  I'm already taking care of HTML
> > > escaping, SQL injection, etc.
> > >
> > > Thanks,
> > > -jj
> > >
> >
> >    JJ,
> >
> >    It ain't pretty, but how about this:
> >
> >         ur"(?u)^[^\u0000-\u001f\u007f-\u009f]{1,128}$"
> >
> >    If python's re module implemented POSIX named character classes you
> > could do this:
> >         r"(?u)^[^[:cntrl:]]{1,128}$"
> >
> > Or if it supported Unicode regular expressions as detailed in
> > http://www.unicode.org/unicode/reports/tr18/, you could do this:
> >         r"(?u)^\P{Control}{1,128}$"
> >
> > But alas, we aren't there yet. :(
> > https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1528154&group_id=5470
> >
> >    I hope that works for you,
> >
> >    Kelly
> >
> >
> >
>
>
> --
> http://jjinux.blogspot.com/
>


-- 
http://jjinux.blogspot.com/
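
For anyone reading this thread later, here is a minimal sketch of the check discussed above, assuming Python 3 (where every str is Unicode and the ur"" prefix no longer exists). It uses a negated character class covering the C0 controls, DEL, and the C1 controls, plus an alternative built on unicodedata for comparison. The names NO_CONTROLS, is_reasonable, and has_no_control_category are made up for illustration and are not part of FormEncode or the re module.

```python
import re
import unicodedata

# Negated class: anything EXCEPT the C0 controls (U+0000-U+001F),
# DEL (U+007F), and the C1 controls (U+0080-U+009F), 1 to 128 times.
NO_CONTROLS = re.compile(r"[^\u0000-\u001f\u007f-\u009f]{1,128}")


def is_reasonable(text):
    """Return True if text is 1-128 characters with no control characters."""
    # fullmatch anchors at both ends of the string, so no ^ or $ is needed.
    return NO_CONTROLS.fullmatch(text) is not None


def has_no_control_category(text):
    """Same check without a regex, using the Unicode general category Cc."""
    if not 1 <= len(text) <= 128:
        return False
    return not any(unicodedata.category(ch) == "Cc" for ch in text)


if __name__ == "__main__":
    for sample in ["hello", "caf\u00e9", "bell\x07", "line\n", ""]:
        print(repr(sample), is_reasonable(sample), has_no_control_category(sample))
```

One design note: the sketch uses fullmatch rather than a trailing $, because in Python $ also matches just before a newline at the end of the string, so a pattern ending in $ can let a trailing "\n" slip through when used with match or search. Whether other categories (for example format characters, Cf) should also be rejected depends on the application and is left open here.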

