Triple-quoted strings and indentation
In general, I find triple-quoted strings to be very handy, particularly for standalone scripts. However, the fact that they have to be written in the left-hand column to avoid leading whitespace really grates, particularly when they're nested within a block or two -- it's a wart:

    try:
        options, args = getopt.getopt(sys.argv[1:], "cf:s")
    except getopt.GetoptError:
        print """Usage: dostuff <options>

    Options:
        -c            - blah blah
        -f <filename> - do stuff with file "filename"
        -s            - more blah"""
        sys.exit(1)

This really makes the code hard to read, as the indentation is all mixed up (visually).

I have written a patch that changes the way triple-quoted strings are scanned so that leading whitespace is ignored, in much the same way that PEP 257 handles it for docstrings. Largely I did this as a learning experience in hacking the parser, but it would be very nice IMO if something of this sort could be implemented in a future version of Python. To this end, I have sketched out a draft PEP (which was itself a good learning exercise in thinking out the issues of such a change). Should I post it here for discussion?

Andrew
On 7/5/05, Andrew Durdin wrote:
print """Usage: dostuff <options>

Options:
    -c            - blah blah
    -f <filename> - do stuff with file "filename"
    -s            - more blah"""
Isn't the standard idiom for this already:

    import textwrap
    ...
    print textwrap.dedent("""\
        Usage: dostuff <options>

        Options:
            -c            - blah blah
            -f <filename> - do stuff with file "filename"
            -s            - more blah""")

STeVe

--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy
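[For readers unfamiliar with the idiom Steve quotes: textwrap.dedent() removes the largest common leading whitespace from all non-blank lines, so the literal can be indented along with the code. A minimal self-contained sketch, written with Python 3's print function rather than the thread's Python 2 statement:]

```python
import textwrap

def usage_message():
    # The backslash after the opening quotes suppresses the leading
    # newline; textwrap.dedent() then strips the common indentation.
    return textwrap.dedent("""\
        Usage: dostuff <options>

        Options:
            -c            - blah blah
            -f <filename> - do stuff with file "filename"
            -s            - more blah""")

print(usage_message())
```

Every line of the literal sits flush with the surrounding code, yet the printed text carries no leading whitespace.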
"Andrew Durdin" wrote:
In general, I find triple-quoted strings to be very handy, particularly for standalone scripts. However, the fact that they have to be written in the left-hand column to avoid leading whitespace really grates, particularly when they're nested within a block or two
At present I think I would do

    usage_text = '''\
    text how I want it
    '''

perhaps in global context or at top of function, and then
    try:
        options, args = getopt.getopt(sys.argv[1:], "cf:s")
    except getopt.GetoptError:
        print usage_text
I long ago found it advantageous to pull message texts from scattered locations into a central place where they are easier to find and edit. It also makes the program logic easier to read, without a long block in the way. YMMV.

Doc strings, first meant for the code reader, need to be where they are. They also come before the code itself, so they don't interfere.
-- it's a wart:
That is rather extreme, and is definitely an opinion.
I have written a patch that changes the way triple-quoted strings are scanned so that leading whitespace is ignored.
And what if I want the leading whitespace left just the way I carefully put it? And what of existing code dependent on literals being as literal as they currently are? I think the soonest this could be considered is Python 3.0. Terry J. Reedy
On 7/6/05, Terry Reedy wrote:
Doc strings, first meant for the code reader, need to be where they are. They also come before the code itself, so don't interfere.
Doc strings are really not an issue, due to the conventions for processing whitespace in them (and also the fact that their primary use is for the reader of the code, even before any automated processing).
-- it's a wart:
That is rather extreme, and is definitely an opinion.
My opinion, certainly. However, I think the fact that workarounds for the leading whitespace issue are needed (PEP 257, textwrap.dedent("""\ <text begins on next line>)) points to it being more than that. But of course that is also my opinion :-)
And what if I want the leading whitespace left just the way I carefully put it?
You can still put leading whitespace as you want it -- there would just be a slightly different convention to follow -- I'll post what I wrote up so you can see the whole proposal: better than me repeating it all in bits.
And what of existing code dependent on literals being as literal as they currently are?
There would be some breakage, certainly -- though I believe it would be quite limited.
I think the soonest this could be considered is Python 3.0.
Quite likely so. Andrew
Here's the draft PEP I wrote up:


Abstract

Triple-quoted string (TQS henceforth) literals in Python preserve the formatting of the literal string, including newlines and whitespace. When a programmer desires no leading whitespace for the lines in a TQS, he must align all lines but the first in the first column, which differs from the syntactic indentation when a TQS occurs within an indented block. This PEP addresses this issue.


Motivation

TQS's are generally used in two distinct manners: as multiline text used by the program (typically command-line usage information displayed to the user) and as docstrings. Here's a hypothetical but fairly typical example of a TQS as a multiline string:

    if not interactive_mode:
        if not parse_command_line():
            print """usage: UTIL [OPTION] [FILE]...
    try `util -h' for more information."""
            sys.exit(1)

Here the second line of the TQS begins in the first column, which at a glance appears to occur after the close of both "if" blocks. This results in a discrepancy between how the code is parsed and how the user initially sees it, forcing the user to jump the mental hurdle of realising that the call to sys.exit() is actually within the second "if" block.

Docstrings, on the other hand, are usually indented to be more readable, which causes them to have extraneous leading whitespace on most lines. To counteract the problem, PEP 257 [1] specifies a standard algorithm for trimming this whitespace.

In the end, the programmer is left with a dilemma: either align the lines of his TQS to the first column, and sacrifice readability; or indent it to be readable, but have to deal with unwanted whitespace. This PEP proposes that TQS's should have a certain amount of leading whitespace trimmed by the parser, thus avoiding the drawbacks of the current behaviour.


Specification

Leading whitespace in TQS's will be dealt with in a similar manner to that proposed in PEP 257: "... strip a uniform amount of indentation from the second and further lines of the [string], equal to the minimum indentation of all non-blank lines after the first line. Any indentation in the first line of the [string] (i.e., up to the first newline) is insignificant and removed. Relative indentation of later lines in the [string] is retained."

Note that a line within the TQS that is entirely blank or consists only of whitespace will not count toward the minimum indent, and will be retained as a blank line (possibly with some trailing whitespace).

There are several significant differences between this proposal and PEP 257's docstring parsing algorithm:

* This proposal considers all lines to end at the next newline in the source code (whether escaped or not); PEP 257's algorithm only considers lines to end at the next (necessarily unescaped) newline in the parsed string.

* Only literal whitespace is counted; an escape such as \x20 will not be counted as indentation.

* Tabs are not converted to spaces.

* Blank lines at the beginning and end of the TQS will *not* be stripped.

* Leading whitespace on the first line is preserved, as is trailing whitespace on all lines.


Rationale

I considered several different ways of determining the amount of whitespace to be stripped, including:

1. Determined by the column (after allowing for expanded tabs) of the triple-quote:

       myverylongvariablename = """\
               This line is indented,
           But this line is not.
           Note the trailing newline:
           """

   + Easily allows all lines to be indented.
   - Easily leads to problems due to re-alignment of all but the first line when mixed tabs and spaces are used.
   - Forces programmers to use a particular level of indentation for continuing TQS's.
   - Unclear whether the lines should align with the triple-quote or immediately after it.
   - Not backward compatible with most non-docstrings.

2. Determined by the indent level of the second line of the string:

       myverylongvariablename = """\
       This line is not indented (and has no leading newline),
           But this one is.
       Note the trailing newline:
       """

   + Allows for flexible alignment of lines.
   + Mixed tabs and spaces should be fine (as long as they're consistent).
   - Cannot support an indent on the second line of the string (very bad!).
   - Not backward compatible with most non-docstrings.

3. Determined by the minimum indent level of all lines after the first:

       myverylongvariablename = """\
               This line is indented,
           But this line is not.
           Note the trailing newline:
           """

   + Allows for flexible alignment of lines.
   + Mixed tabs and spaces should be fine (as long as they're consistent).
   + Backward compatible with all docstrings and a majority of non-docstrings.
   - Support for indentation on all lines is not immediately obvious.

Overall, solution 3 provided the best balance of features, and (importantly) had the best backward compatibility. I thus consider it the most suitable.


Examples

The examples here are set out in pairs: the first of each pair shows how the TQS must currently be written to avoid indentation issues; the second shows how it can be written under this proposal (although some variation is possible). All examples are taken or adapted from the Python standard library or another real source.

1. Command-line usage information:

       def usage(outfile):
           outfile.write("""Usage: %s [OPTIONS] <file> [ARGS]

       Meta-options:
       --help                Display this help then exit.
       --version             Output version information then exit.
       """ % sys.argv[0])

   #------------------------#

       def usage(outfile):
           outfile.write("""Usage: %s [OPTIONS] <file> [ARGS]

               Meta-options:
               --help                Display this help then exit.
               --version             Output version information then exit.
               """ % sys.argv[0])

2. Embedded Python code:

       self.runcommand("""if 1:
           import sys as _sys
           _sys.path = %r
           del _sys
           \n""" % (sys.path,))

   #------------------------#

       self.runcommand("""\
           if 1:
               import sys as _sys
               _sys.path = %r
               del _sys
           \n""" % (sys.path,))

3. Unit testing:

       class WrapTestCase(BaseTestCase):
           def test_subsequent_indent(self):
               # Test subsequent_indent parameter
               expect = '''\
         * This paragraph will be filled, first
           without any indentation, and then with
           some (including a hanging indent).'''

               result = fill(self.text, 40,
                             initial_indent="  * ", subsequent_indent="    ")
               self.check(result, expect)

   #------------------------#

       class WrapTestCase(BaseTestCase):
           def test_subsequent_indent(self):
               # Test subsequent_indent parameter
               expect = '''\
                 * This paragraph will be filled, first
                   without any indentation, and then with
                   some (including a hanging indent).\
               '''

               result = fill(self.text, 40,
                             initial_indent="  * ", subsequent_indent="    ")
               self.check(result, expect)

Example 3 illustrates how indentation of all lines (by 2 spaces) is achieved with this proposal: the position of the closing triple-quote is used to determine the minimum indentation for the whole string. To avoid a trailing newline in the string, the final newline is escaped. Example 2 avoids the need for this construction by placing the first line (which is not indented) on the line after the triple-quote, and escaping the leading newline.


Backwards Compatibility

Uses of TQS's fall into two broad categories: those where indentation is significant, and those where it is not. Those in the latter (larger) category, which includes all docstrings, will remain effectively unchanged under this proposal. Docstrings in particular are usually trimmed according to the rules in PEP 257 before their value is used; the trimmed strings will be the same under this proposal as they are now.

Of the former category, the majority are those which have at least one line beginning in the first column of the source code; these will be entirely unaffected if left alone, but may be reformatted to increase readability (see example 1 above). However, a small number of strings in this first category depend on all lines (or all but the first) being indented. Under this proposal, these will need to be edited to ensure that the intended amount of whitespace is preserved. Examples 2 and 3 above show two different ways to reformat the strings for these cases. Note that in both examples, the overall indentation of the code is cleaner, producing more readable code.

Some evidence may be desired to support the claims made above regarding the distribution of the different uses of TQS's. I have begun some analysis to produce statistics for these; while still incomplete, I have some initial results for the Python 2.4.1 standard library (these figures should not be off by more than a small margin):

In the standard library (some 396,598 lines of Python code), there are 7,318 occurrences of TQS's, an average rate of one per 54 lines. Of these, 6,638 (90.7%) are docstrings; the remaining 680 (9.3%) are not. A further examination shows that only 64 (0.9%) of these have leading indentation on all lines (the only case where the proposed solution is not backward compatible). These must be manually checked to determine whether they will be affected; such a check reveals only 7-15 TQS's (0.1%-0.2%) that actually need to be edited.

Although small, the impact of this proposal on compatibility is still more than negligible; if accepted in principle, it might be better suited to be initially implemented as a __future__ feature, or perhaps relegated to Python 3000.


Implementation

An implementation for this proposal has been made; however, I have not yet made a patch file with the changes, nor do the changes yet extend to the documentation or other affected areas.


References

[1] PEP 257, Docstring Conventions, David Goodger, Guido van Rossum
    http://www.python.org/peps/pep-0257.html


Copyright

This document has been placed in the public domain.
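[To make the Specification concrete, here is a rough pure-Python model of the option-3 trimming rule -- an illustrative sketch, not Andrew's actual parser patch; it counts spaces only and ignores the tab and escaped-newline subtleties listed above:]

```python
def trim_tqs(text):
    """Model of the proposed TQS trimming: strip the minimum
    indentation of all non-blank lines after the first.  The first
    line is untouched; whitespace-only lines do not count toward the
    minimum and survive as (possibly empty) blank lines."""
    lines = text.split("\n")
    rest = lines[1:]
    # Minimum indent over the non-blank lines after the first.
    indents = [len(ln) - len(ln.lstrip(" ")) for ln in rest if ln.strip()]
    margin = min(indents) if indents else 0
    return "\n".join([lines[0]] + [ln[margin:] for ln in rest])

# An indented literal and its flush-left equivalent now come out equal:
sample = "Usage: dostuff <options>\n    \n    Options:\n        -c - blah"
print(trim_tqs(sample))
```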
Andrew Durdin wrote:
Here's the draft PEP I wrote up:
Well reasoned, and well written up, IMO. In particular, being able to preserve all leading whitespace by the simple expedient of putting the closing triple-quote in column zero and escaping the final newline is a nice feature.

However, while I prefer what you describe to Python's current behaviour, I am not yet convinced the backward compatibility pain is worth it. Adding yet-another-kind-of-string-literal (when we already have bytestrings on the horizon) is also unappealing.

Your readability examples are good, but the first two involve strings that should probably be module-level constants (as Terry described), and the test case involving expected output is probably better handled using doctest, which has its own mechanism for handling indentation.

Raw and unicode string literals need to be mentioned in the PEP. Which literals would the reformatting apply to? All 3? Only standard and unicode, leaving raw strings alone?

You should research the reasons why PEP 295 [1] was rejected, and describe in the new PEP how it differs from PEP 295 (unfortunately, PEP 295 was not updated with the rationale for rejection, but this post [2] looks like Guido putting the final nail in its coffin).

Regards,
Nick.

[1] http://www.python.org/peps/pep-0295.html
[2] http://mail.python.org/pipermail/python-dev/2002-July/026969.html

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com
On 7/6/05, Nick Coghlan wrote:
However, while I prefer what you describe to Python's current behaviour, I am not yet convinced the backward compatibility pain is worth it. Adding yet-another-kind-of-string-literal (when we already have bytestrings on the horizon) is also unappealing.
First off, thanks for your feedback--you raise some very good points that I have addressed insufficiently or not at all. I personally feel that backward compatibility issues provide the strongest argument against accepting the proposal (but obviously I find the rest of it favourable :-). It may not be particularly clear (that's why it's a draft) that I am not proposing another kind of string literal, but rather a change of rules for an existing one.
Your readability examples are good, but the first two involve strings that should probably be module level constants (as Terry described) and the test case involving expected output is probably better handled using doctest, which has its own mechanism for handling indentation.
I think the question of whether an inline string should be made a module-level constant (or moved to a separate file entirely) relates in general to higher-level considerations of readability and program structure (similar, for example, to those that would prompt one to refactor a function). IOW, the answer to that question would be the same with or without this proposal. In any case, I chose the examples (from different parts of the standard library) more because they illustrated different classes of usage for TQS's than because they were the best illustrations of readability improvement--perhaps something I should address.
Raw and unicode string literals need to be mentioned in the PEP. Which literals would the reformatting apply to? All 3? Only standard and unicode, leaving raw strings alone?
The proposal would apply to all 4 :-) -- normal, raw, unicode, and raw unicode.
You should research the reasons why PEP 295 [1] was rejected, and describe in the new PEP how it differs from PEP 295 (unfortunately PEP 295 was not updated with the rationale for rejection, but this post [2] looks like Guido putting the final nail in its coffin).
THANK YOU! In my research for this, I didn't come across PEP 295 at all -- perhaps due to the fact that it uses the term "multiline strings" exclusively, which is not how I would describe them at all. I will certainly address this in my next draft. Cheers, Andrew.
"Andrew Durdin" wrote:
Here's the draft PEP I wrote up:
I believe some current alternatives and concerns already expressed have not yet been included, and maybe should be.

Some of your examples look worse than needed by putting the first line after the triple-quote instead of escaping the first newline as you did elsewhere.

Having separate rules for doc strings and other tq strings would be a nuisance.

Terry J. Reedy
On 7/7/05, Terry Reedy wrote:
I believe there were some current alternatives and concerns already expressed that have not been included yet that maybe should be.
Yes; Nick pointed me to one, and I'll be looking at that and the related discussions before redrafting; I'll also have a further look for other similar proposals.
Some of your examples look worse than needed by putting the first line after the triple quote instead of escaping the first newline like you did elsewhere.
In general, I wanted to preserve as much as possible the way that the string was originally written (as these examples were taken and adapted from the standard library source). In the example with the embedded python code, I felt it was significantly clearer if the initial newline was escaped.
Having separate rules for doc strings and other tq strings would be a nuisance.
I totally agree -- and if the proposal as written gives that impression then I'll correct it. What I was trying to say about docstrings was that the change would have no effect on the result after processing them with docutils or anything else that follows PEP 257 -- which is very significant in terms of backward compatibility, as docstrings are AFAICT the leading use of TQS's (by a large margin). Cheers, Andrew.
On 7/5/05, Andrew Durdin wrote:
I have written a patch that changes the way triple-quoted strings are scanned so that leading whitespace is ignored in much the same way that pep 257 handles it for docstrings. Largely this was for a learning experience in hacking the parser, but it would be very nice IMO if something of this sort could be implemented in a future version of Python.
I don't think so. It smells too much of DWIM, which is very unpythonic. EIBTI. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
On 7/7/05, Guido van Rossum wrote:
I don't think so. It smells too much of DWIM, which is very unpythonic. EIBTI.
In what way? The scheme described is explicit, and consistently applied to all triple-quoted strings[*] -- although the rules are different to the current behaviour. On the other hand, my written proposal may not be clear or explicit, something I will attempt to remedy over the next few days. [*] Whether it should apply also to normal strings with escaped newlines is not something I have yet considered. Andrew
Andrew Durdin wrote:
On 7/7/05, Guido van Rossum wrote:
I don't think so. It smells too much of DWIM, which is very unpythonic. EIBTI.
In what way? The scheme described is explicit, and consistently applied to all triple-quoted strings[*] -- although the rules are different to the current behaviour. On the other hand, my written proposal may not be clear or explicit, something I will attempt to remedy over the next few days.
You are wrong. Current string literals are explicit. They are what you type. What you propose is to force all string literals to be /implicitly/ preprocessed by the compiler, an operation that /necessarily/ loses information.

The current mechanism that works behind the scenes for docstrings does /no/ preprocessing of string literals used as docstrings*. Why? Because the designers realized that doing so may break source that relies on those docstrings for precise indentation.

Right now, your (implicit preprocessing of triple-quoted strings) proposal may change the output of various report generation software. Specifically, programs that use a 'header-line' for the names of columns...

    print '''        column        column        column'''
[*] Whether it should apply also to normal strings with escaped newlines is not something I have yet considered.
When you have to start differentiating, or consider differentiating, how preprocessing occurs based on the existence or non-existence of escaped newlines, you should realize that this has a serious "Do what I mean" stink (as Guido has already stated, more politely).

I propose that we keep all string literals /literal/, not only those lacking triple-quoting. Any processing that needs to be done to /any/ string (literal or otherwise) should be explicitly asked for. Is that too much to ask?

- Josiah

[*]
    >>> def foo():
    ...     '''
    ...     x
    ...     y
    ...     z
    ...     '''
    ...
    >>> help(foo)
    Help on function foo in module __main__:

    foo()
        x
        y
        z

    >>> print foo.__doc__

        x
        y
        z
On 7/11/05, Josiah Carlson wrote:
You are wrong. Current string literals are explicit. They are what you type.
No they are not:

    >>> "I typed \x41, but got this!"
    'I typed A, but got this!'

What we have are not explicit string literals but *explicit rules*, forming part of the language definition and given in the documentation, that certain sequences in string literals are preprocessed by the compiler. People have learnt these rules and apply them unconsciously when reading the source -- but that can apply to any rule. For example, there's another explicit rule, that an "r" prefixed before the string literal uses a different set of rules, without \-escape sequences:

    >>> r"I typed \x41, but got this!"
    'I typed \\x41, but got this!'

The point is that processing \-escape sequences is just as implicit as my proposal: but tradition, and custom, and documentation make it *seem* explicit of itself. IOW, arguing that my proposal is "implicit" or "DWIM" is neither relevant nor valid -- but arguing either that it is confusing to a long-term Python user, or that it is not fully backward compatible, is valid, and these (and other) arguments against should be weighed up against those in favour. (Even some fairly recent and major changes to Python have been accepted despite having these two particular arguments against them, such as unified classes/types and nested scopes.)
When you have to start differentiating, or consider differentiating, how preprocessing occurs based on the existance or non-existance of escaped newlines, you should realize that this has a serious "Do what I mean" stink (as Guido has already stated, more politely).
What I am considering differentiating on here is a feature of Python
that is (at least) awkward and (at most) has a "serious stink" -- the
ability to escape newlines in a single-quoted [' or "] string with a
backslash, which has inconsistent or confusing behaviour:
>>> "This is a normal string\
... with an escaped newline."
'This is a normal stringwith an escaped newline.'
>>> r"This is a raw string\
... with an escaped newline."
'This is a raw string\\\nwith an escaped newline.'
This is not an issue with TQS's because they can naturally (i.e.
without escapes) span multiple lines.
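[The inconsistency Andrew shows above can be checked directly; the same escaped newline vanishes in a normal string but survives, backslash and all, in a raw string:]

```python
# In a normal literal the backslash-newline pair is removed entirely.
normal = "This is a normal string\
with an escaped newline."

# In a raw literal both the backslash and the newline are kept.
raw = r"This is a raw string\
with an escaped newline."

print(repr(normal))
print(repr(raw))
```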
Since your main objections above are much the same as Guido's, I'll
respond to his in this message also:
On 7/11/05, Guido van Rossum wrote:
The scheme may be explicitly spelled out in the language reference, but it is implicit to the human reader -- what you see is no longer what you get.
See discussion of explicitness vs. implicitness above.
I recommend that you give it up. You ain't gonna convince me.
Very likely. But given the number of times that similar proposals have been put forth in the past, it is reasonable to expect that they will be brought up again in the future by others if this is rejected -- and in that case, these others can simply be pointed to a thorough (but rejected) PEP that discusses the proposal, its variants, and the reasons for rejection.

And so -- while I still hope that you can be convinced (there's precedent ;-) -- I think a good, thorough PEP will be of benefit even if rejected. And, of course, such a PEP is bound to be more convincing than a hasty, ill-considered one. So I am rewriting my previous draft accordingly, and will submit it as a PEP when it's done.

Cheers,
Andrew.
Andrew Durdin wrote:
On 7/11/05, Josiah Carlson wrote:
You are wrong. Current string literals are explicit. They are what you type.
No they are not:
Apparently my disclaimer of "except in the case of the decades-old string escapes that were inherited from C, as well as unicode and 'raw' strings" didn't make it into the final draft of that email. It is not as though you are taking something that used to be invalid and making it valid, you are taking something that used to mean X and making it mean Y. Your proposed change /will/ cause incompatibility for some unknown number of modules which rely on the indentation of triple quoted strings. You should realize that people get angry when their APIs change the meaning of f(x), and you are asking for the language to do that. Have you a guess as to why you are getting resistance? - Josiah
On Jul 10, 2005, at 6:39 PM, Josiah Carlson wrote:
Andrew Durdin wrote:
On 7/11/05, Josiah Carlson wrote:
You are wrong. Current string literals are explicit. They are what you type.
No they are not:
Apparently my disclaimer of "except in the case of the decades-old string escapes that were inherited from C, as well as unicode and 'raw' strings" didn't make it into the final draft of that email.
It is not as though you are taking something that used to be invalid and making it valid, you are taking something that used to mean X and making it mean Y. Your proposed change /will/ cause incompatibility for some unknown number of modules which rely on the indentation of triple quoted strings. You should realize that people get angry when their APIs change the meaning of f(x), and you are asking for the language to do that. Have you a guess as to why you are getting resistance?
A better proposal would probably be another string prefix that means "dedent", but I'm still not sold. Doc processing software is clearly going to have to know how to dedent anyway in order to support existing code.

-bob
Bob Ippolito wrote:
A better proposal would probably be another string prefix that means "dedent", but I'm still not sold. doc processing software is clearly going to have to know how to dedent anyway in order to support existing code.
Agreed. It is easy enough for any doc-string extraction tool to do the dedenting based on the common whitespace prefix found in lines 2 - n of the string. And that works on all sorts of string literals.

--
Marc-Andre Lemburg
eGenix.com -- Professional Python Services directly from the Source  (#1, Jul 11 2005)
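[The tool-side dedent Marc-Andre describes -- the common whitespace prefix of lines 2 - n -- is straightforward to write; here is an illustrative sketch (my code, purely for demonstration, not from any existing doc tool):]

```python
def dedent_docstring(doc):
    # Dedent based on the common whitespace prefix of lines 2..n,
    # leaving the first line alone -- roughly what docstring
    # extraction tools already do (cf. PEP 257).
    lines = doc.expandtabs().split("\n")
    margin = None
    for ln in lines[1:]:
        stripped = ln.lstrip()
        if stripped:
            indent = len(ln) - len(stripped)
            margin = indent if margin is None else min(margin, indent)
    out = [lines[0].strip()]
    for ln in lines[1:]:
        out.append(ln[margin:].rstrip() if margin else ln.rstrip())
    # Drop leading/trailing blank lines, as PEP 257 recommends.
    return "\n".join(out).strip("\n")

print(dedent_docstring("\n    Summary line.\n\n        Indented detail.\n    "))
```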
On Mon, 2005-07-11 at 01:08, Bob Ippolito wrote:
A better proposal would probably be another string prefix that means "dedent", but I'm still not sold. doc processing software is clearly going to have to know how to dedent anyway in order to support existing code.
OTOH, adding another string prefix means doubling the total number of prefix combinations. The potential for this getting out of hand was the primary reason that string templates were implemented as a library function instead of as a string prefix. Personally, I'm not convinced that string literals need to change in any way. Dedentation should be handled (is handled?!) in the stdlib. -Barry
"Andrew Durdin" wrote:
Very likely. But given the number of times that similar proposals have been put forth in the past, it is reasonable to expect that they will be brought up again in the future by others if this is rejected -- and in that case, these others can simply be pointed to a thorough (but rejected) PEP that discusses the proposal and variants and reasons for rejection.
I agree that this would be useful. I also agree with Bob Ippolito that a new prefix might be better.

tjr
Terry Reedy wrote:
"Andrew Durdin"
wrote in message news:59e9fd3a050710202721851037@mail.gmail.com... Very likely. But given the number of times that similar proposals have been put forth in the past, it is reasonable to expect that they will be brought up again in the future by others, if this is rejected--and in that case, these other can simply be pointed to a thorough (but rejected) PEP that discusses the proposal and variants and reasons for rejection.
I agree that this would be useful. I also agree with Bob Ippolito that a new prefix might be better.
<plug> Why use a new syntax construct when you can do it with existing features? We already have str.split(), which is often used to postprocess string literals (in the Perl qw() style), so why not introduce str.dedent()?

Reinhold

--
Mail address is perfectly valid!
On 7/10/05, Andrew Durdin wrote:
On 7/7/05, Guido van Rossum wrote:
I don't think so. It smells too much of DWIM, which is very unpythonic. EIBTI.
In what way? The scheme described is explicit, and consistently applied to all triple-quoted strings[*] -- although the rules are different to the current behaviour. On the other hand, my written proposal may not be clear or explicit, something I will attempt to remedy over the next few days.
The scheme may be explicitly spelled out in the language reference, but it is implicit to the human reader -- what you see is no longer what you get.
[*] Whether it should apply also to normal strings with escaped newlines is not something I have yet considered.
I recommend that you give it up. You ain't gonna convince me. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
On 7/5/05, Andrew Durdin wrote:
I have written a patch that changes the way triple-quoted strings are scanned so that leading whitespace is ignored in much the same way that pep 257 handles it for docstrings. Largely this was for a learning experience in hacking the parser, but it would be very nice IMO if something of this sort could be implemented in a future version of Python.
I don't think so. It smells too much of DWIM, which is very unpythonic. EIBTI.
Another idea, which is much more conservative: textwrap.dedent is highly underrated and hidden. Why not make it a builtin or a string method?

Reinhold

--
Mail address is perfectly valid!
Reinhold Birkenfeld wrote:
Guido van Rossum wrote:
On 7/5/05, Andrew Durdin wrote:
I have written a patch that changes the way triple-quoted strings are scanned so that leading whitespace is ignored in much the same way that pep 257 handles it for docstrings. Largely this was for a learning experience in hacking the parser, but it would be very nice IMO if something of this sort could be implemented in a future version of Python.
I don't think so. It smells too much of DWIM, which is very unpythonic. EIBTI.
Another idea, which is much more conservative: textwrap.dedent is highly underrated and hidden. Why not make it a builtin or a string method?
Reinhold
Using Andrew's examples from the PEP:

    def usage(outfile):
        outfile.write(
            """Usage: %s [OPTIONS] <file> [ARGS]

            Meta-options:
            --help                Display this help then exit.
            --version             Output version information then exit.
            """.dedent() % sys.argv[0])

    self.runcommand("""\
        if 1:
            import sys as _sys
            _sys.path = %r
            del _sys
        \n""".dedent() % (sys.path,))

    class WrapTestCase(BaseTestCase):
        def test_subsequent_indent(self):
            # Test subsequent_indent parameter
            expect = '''\
              * This paragraph will be filled, first
                without any indentation, and then with
                some (including a hanging indent).
            '''.dedent().rstrip()

            result = fill(self.text, 40,
                          initial_indent="  * ", subsequent_indent="    ")
            self.check(result, expect)

And if the loading of the textwrap module is deferred to the first call to dedent(), then it wouldn't even need to incur any extra start-up overhead. Although that last example is a bad one, since you end up testing textwrap against itself ;)

Cheers,
Nick.
participants (10)

- Andrew Durdin
- Barry Warsaw
- Bob Ippolito
- Guido van Rossum
- Josiah Carlson
- M.-A. Lemburg
- Nick Coghlan
- Reinhold Birkenfeld
- Steven Bethard
- Terry Reedy