How to convert a raw string r'\xdd' to '\xdd' more gracefully?
Jach Feng
jfong at ms4.hinet.net
Fri Dec 9 21:06:54 EST 2022
Weatherby,Gerard wrote on Friday, December 9, 2022 at 9:36:18 PM [UTC+8]:
> That’s actually more of a shell question than a Python question. How you pass certain control characters is going to depend on the shell, operating system, and possibly the keyboard you’re using. (e.g. https://www.alt-codes.net).
>
> Here’s a sample program. The dashes are to help show the boundaries of the string
>
> #!/usr/bin/env python3
> import argparse
> import logging
>
>
> parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
> parser.add_argument('data')
> args = parser.parse_args()
> print(f'Input\n: -{args.data}- length {len(args.data)}')
> for c in args.data:
>     print(f'{ord(c)} ', end='')
> print()
>
>
> Using bash on Linux:
>
> ./cl.py '^M
> '
> Input
> -
> - length 3
> 13 32 10
> From: Python-list <python-list-bounces+gweatherby=uchc... at python.org> on behalf of Jach Feng <jf... at ms4.hinet.net>
> Date: Thursday, December 8, 2022 at 9:31 PM
> To: pytho... at python.org <pytho... at python.org>
> Subject: Re: How to convert a raw string r'\xdd' to '\xdd' more gracefully?
> Jach Feng wrote on Wednesday, December 7, 2022 at 10:23:20 AM [UTC+8]:
> > s0 = r'\x0a'
> > At the moment it is done by
> >
> > def to1byte(matchobj):
> >     return chr(int('0x' + matchobj.group(1), 16))
> > s1 = re.sub(r'\\x([0-9a-fA-F]{2})', to1byte, s0)
> >
> > But is it really that difficult to do this simple thing?
> >
> > --Jach
> The whole story is:
>
> I had a script that accepts a positional argument via argparse. I would like this argument to be able to contain embedded control characters when required. So I made a post, "How to enter escape character in a positional string argument from the command line?", on Dec 5, but there was no response. I assumed there was no way of doing it and that I would have to convert the string after getting it from the command line.
>
> I did this conversion using the chr(int(...)) method but was not satisfied with it. That's why this post came out.
>
> At the moment the conversion is done in almost the same way as Peter's codecs.decode() method, but without needing to import the codecs module :-)
>
> def to1byte(matchobj):
>     return matchobj.group(0).encode().decode("unicode-escape")
> That’s actually more of a shell question than a Python question. How you pass certain control characters is going to depend on the shell, operating system, and possibly the keyboard you’re using. (e.g. https://www.alt-codes.net).
You are right; that's why I found later that it's easier to enter it using a preferred pattern. But there is a case, as moi mentioned in his previous post, that will cause a failure: when part of a Windows path in the form of \xdd just happens to appear in the string :-(
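
To illustrate that failure case, here is a minimal sketch. The helper name decode_hex_escapes, the constant HEX_ESCAPE, and the sample path are made up for this example; the substitution itself is the chr(int(...)) approach quoted above.

```python
import re

# Matches a literal backslash, "x", and exactly two hex digits.
HEX_ESCAPE = re.compile(r'\\x([0-9a-fA-F]{2})')

def decode_hex_escapes(text):
    # Hypothetical helper: replace each literal \xdd sequence
    # with the single character it names.
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

# Intended use: the literal four characters \x0a become one newline.
print(repr(decode_hex_escapes(r'abc\x0adef')))

# Pitfall: a path component that happens to look like \xdd is rewritten too.
# Here "\x64" is replaced by "d", so the path is silently corrupted.
print(repr(decode_hex_escapes(r'C:\temp\x64\tool.exe')))
```

The regular expression cannot tell an intended escape from a path component, so the caller would need some other way to mark which arguments should be converted.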