Re: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting

On Tue, Aug 11, 2015 at 12:52 PM, Wes Turner <wes.turner@gmail.com> wrote:
So, again, I am -1000 on (both of these PEPs) because they are just another way of making it too easy to do the wrong thing.

* #1 most prevalent security vulnerability: CWE-89 <http://cwe.mitre.org/data/definitions/89.html>: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')
  * ORM with parametrization, quoting, escaping and lists of reserved words
  * SQLAlchemy
* #2 most prevalent security vulnerability: CWE-78 <http://cwe.mitre.org/data/definitions/78.html>: Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
  * Command preparation library (which builds a tuple() for exec)
  * Sarge, subprocess.call(shell=False)

- [ ] DOC: (Something like this COULD/SHOULD be in the % and str.format docs as well)
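
To make the CWE-89 point concrete, here is a minimal sketch using the standard-library sqlite3 module. The table, the column names, and the hostile input are made up for illustration; the only point is the difference between building SQL with % and passing the value as a bound parameter.

```python
import sqlite3

# In-memory database with a tiny, made-up table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

# Hypothetical attacker-controlled value.
hostile = "nobody' OR '1'='1"

# Unsafe: the value is pasted into the SQL text, so its quotes
# change the structure of the query.
unsafe_sql = "SELECT name FROM users WHERE name = '%s'" % hostile
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the value travels separately from the SQL text as a bound parameter.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The interpolated query matches every row, while the parametrized one matches none, because the whole hostile string is compared as a single name.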

Isn't it already like this? It's no harder than:

    Popen('%s a.c' % cc, shell=True)

Heck, I used to do that when I started programming (I hadn't yet learned about injection stuff). If someone is uneducated about injection, they *will do it anyway*. The introduction of format strings (f-strings sounds like a certain word to me...) wouldn't make it any easier, really.
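
For contrast with the Popen('%s a.c' % cc, shell=True) line above, a small sketch of the argument-list form. It assumes a hostile value for cc and, so that it runs anywhere, uses the current Python interpreter as a stand-in for the compiler; the child process only prints the arguments it received.

```python
import subprocess
import sys

# Hypothetical attacker-controlled "compiler" name.
cc = "gcc; echo pwned"

# This is the single string a shell would be asked to parse with shell=True;
# the ';' would split it into two commands.
shell_string = "%s a.c" % cc

# With an argument list and no shell, each element reaches the child
# process as one argument, untouched. The child here just echoes its argv.
echo_argv = "import sys; print(sys.argv[1:])"
result = subprocess.run(
    [sys.executable, "-c", echo_argv, cc, "a.c"],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
print(received)
```

The hostile value arrives as one ordinary argument, which is the property the list form buys over string building.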

On Tue, Aug 11, 2015 at 1:37 PM, Ryan Gonzalez <rymg19@gmail.com> wrote:
Well, exactly. So I/we must grep for shell=True, %, .format(, .format_globals(**kwargs), and f" or f' and update static analysis tools (to essentially re-AST string.Template with merge(globals, locals, kwargs))

I would rather think of this as an opportunity to help avoid injection vectors. If there were a separate "interpolation provider", then something like

    os.system('dosomething {a} {b} {c}'.format(...))

could be written as (!cmd here being a special type of f-string that does command-line escaping, borrowing syntax from another thread a few days ago)

    os.system(!cmd'dosomething {a} {b} {c}')

This is both shorter and more resilient to injections. Essentially, you annotate a string as "this will be executed on the command line" and the interpolation adapts. This would make doing the right thing the same as doing the easy thing, and this would be good overall, no?

I don't know about you, but I don't know by heart how to escape arbitrary user input and deal with all of the corner cases. Yes, you can do this more safely with Popen, but that is quite a bit more effort. Also, oftentimes there is no such alternative, or it is very unwieldy (in SQL land this happens more often).
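
The !cmd'...' syntax above is only a proposal, but the behaviour it describes can be approximated today. The following is a rough sketch, not the proposed feature: ShellQuotingFormatter and shell_format are hypothetical names, built on string.Formatter and shlex.quote from the standard library. shlex.quote targets POSIX shells only.

```python
import shlex
import string


class ShellQuotingFormatter(string.Formatter):
    """Hypothetical formatter that shell-quotes every interpolated field."""

    def format_field(self, value, format_spec):
        # Apply the normal format spec first, then quote the result so the
        # shell sees it as exactly one word.
        formatted = super().format_field(value, format_spec)
        return shlex.quote(formatted)


def shell_format(template, *args, **kwargs):
    """Fill `template`, quoting each substituted value for a POSIX shell."""
    return ShellQuotingFormatter().format(template, *args, **kwargs)


hostile = "x; rm -rf /"
line = shell_format("dosomething {a} {b}", a=hostile, b="plain")
print(line)

# Splitting the line the way a POSIX shell would yields one word per value.
print(shlex.split(line))
```

Only the substituted values are quoted; the literal part of the template is trusted, which matches the idea of annotating the string as "this will be executed on the command line".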

On Tue, Aug 11, 2015 at 2:25 PM, Joonas Liik <liik.joonas@gmail.com> wrote:
I would rather think of this as an opportunity to help avoid injection vectors.
You get an "F" grade/letter/mark every time you build an f-string without defining what the user-supplied input and destination outputs could/would be.
sarge.run('do something {0} {1} {2}', a, b, c) is currently supported (and could/should be stdlib IMHO) https://sarge.readthedocs.org/en/latest/overview.html#why-not-just-use-subpr... . * (again, sorry) this adds ~subprocess compat to sarge: https://bitbucket.org/vinay.sajip/sarge/pull-requests/1/enh-add-call-check_c... ("ENH: Add call, check_call, check_output, CalledProcessError, expect_returncode")
So, IPython/Jupyter understands _repr_html_ (_repr_*_) methods, IDK why we couldn't have e.g. _repr_shell_path_, _repr_shell_cmdarg_, _repr_sql_sqlite_reserved_keywords_. Representing things for an output format which is expressed as a string but has control characters in order to separate data and code.
POSIX exec accepts a tuple (and does not parse ';' or '--').
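
A sketch of how the _repr_*_ idea might look in plain Python. _repr_html_ is the method name IPython/Jupyter already looks for; _repr_shell_cmdarg_ is the hypothetical name from the message above, and UserText and render are made-up helpers for illustration.

```python
import html
import shlex


class UserText:
    """Wraps untrusted text and knows how to represent itself per context."""

    def __init__(self, text):
        self.text = text

    def _repr_html_(self):
        # Escape characters that are markup in HTML.
        return html.escape(self.text)

    def _repr_shell_cmdarg_(self):
        # Hypothetical hook: quote as a single POSIX shell argument.
        return shlex.quote(self.text)


def render(value, context):
    """Look up the _repr_<context>_ hook and refuse values that lack one."""
    method = getattr(value, f"_repr_{context}_", None)
    if method is None:
        raise TypeError(f"{type(value).__name__} has no representation for {context!r}")
    return method()


value = UserText("<b>a b; c</b>")
print(render(value, "html"))
print(render(value, "shell_cmdarg"))
```

Refusing values without a matching hook, rather than falling back to str(), is a design choice in this sketch: it forces the caller to say which output format the data is headed for.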

On Tue, Aug 11, 2015 at 2:35 PM, Wes Turner <wes.turner@gmail.com> wrote:
A configuration object (passable as e.g. format(**conf)) more explicitly defines the scope, as variables that need to be

- [ ] escaped
- [ ] encoded
- [ ] translated
- [ ] concatenated
- [ ] mutated or not mutated
- [ ] formatted

in an ordered idempotent sequence.

    lookup = partial[kwargs, locals, globals]
    merged = merge(globals, locals, kwargs)

    .formatg(**kwargs)
    .format(lookup(kwargs))
    .formatl(**kwargs)

    uno = trans("one}")
    f"abc {uno}"
    ft"abc {uno}"
    eetcmf"abc {uno}"
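
One possible reading of the lookup/merge lines above, sketched with collections.ChainMap and str.format_map from the standard library. merged_lookup is a hypothetical helper, and the three dictionaries stand in for kwargs, locals, and globals; the search order (kwargs first, globals last) is an assumption.

```python
from collections import ChainMap


def merged_lookup(kwargs, local_vars, global_vars):
    """Search kwargs first, then locals, then globals, without copying them."""
    return ChainMap(kwargs, local_vars, global_vars)


# Stand-ins for the three scopes (illustrative values only).
global_vars = {"greeting": "hello from globals", "name": "nobody"}
local_vars = {"greeting": "hello from locals"}
kwargs = {"name": "uno"}

lookup = merged_lookup(kwargs, local_vars, global_vars)

# format_map accepts any mapping, so the scope is an explicit object
# rather than whatever happens to be visible at the call site.
message = "{greeting}, {name}".format_map(lookup)
print(message)
```

Passing the scope explicitly makes it possible to audit which names a template can reach, which is the concern the checklist above is getting at.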

Wes, I don't know you, but your contributions to this thread are adding more noise than light. I am not the only one who is exasperated at many of your posts. Please stop. -- --Guido van Rossum (python.org/~guido)

On Tue, Aug 11, 2015 at 1:22 PM, Wes Turner <wes.turner@gmail.com> wrote:
Maybe it would be helpful to think of string concatenation more in terms of compiling a template for serializable DOM (html, js, brython) / doctree (docutils, sphinx) / jinja nodes which have types (Path, CommandOption/Arg, [Tag, Attr]) and appropriate quoting, escaping, encoding, **and translation** rules according to a given output context.

    # because this is what could just not be:
    [os.system(f'echo "{cmd}"') for cmd in cmds]
    os.system(f"echo2 '{cmd}'")

What is the target output format for this string concatenation, most of the time?

participants (5)
- Eric V. Smith
- Guido van Rossum
- Joonas Liik
- Ryan Gonzalez
- Wes Turner