textwrap.dedent replaces tabs?

Frederic Rentsch anthra.norell at vtxmail.ch
Mon Dec 25 02:28:23 EST 2006


Tom Plunket wrote:
> Frederic Rentsch wrote:
>
>   
>> Following a call to dedent () it shouldn't be hard to translate leading 
>> groups of so many spaces back to tabs.
>>     
>
> Sure, but the point is more that I don't think it's valid to change to
> tabs in the first place.
>
> E.g.:
>
>  input = (' ' + '\t' + 'hello\n' +
>           '\t' + 'world')
>
>  output = textwrap.dedent(input)
>
> will yield all of the leading whitespace stripped, which IMHO is a
> violation of its stated function.  In this case, nothing should be
> stripped, because the leading whitespace in these two lines does not
> /actually/ match.  Sure, it visually matches, but that's not the point
> (although I can understand that that's a point of contention in the
> interpreter anyway, I would have no problem with it not accepting "1 tab
> = 8 spaces" for indentation...)  But that's another holy war.
>
>   
>> If I understand your problem, you want to restore the dedented line to 
>> its original composition if spaces and tabs are mixed and this doesn't 
>> work because the information doesn't survive dedent ().
>>     
>
> Sure, although would there be a case to be made to simply not strip the
> tabs in the first place?
>
> Like this, keeping current functionality and everything...  (although I
> would think if someone wanted tabs expanded, they'd call expandtabs on
> the input before calling the function!):
>
> def dedent(text, expand_tabs=True):
>     """dedent(text : string, expand_tabs : bool) -> string
>
>     Remove any whitespace that can be uniformly removed from the left
>     of every line in `text`, optionally expanding tabs before altering
>     the text.
>
>     This can be used e.g. to make triple-quoted strings line up with
>     the left edge of screen/whatever, while still presenting it in the
>     source code in indented form.
>
>     For example:
>
>         def test():
>             # end first line with \ to avoid the empty line!
>             s = '''\
>              hello
>             \t  world
>             '''
>             print repr(s)     # prints '     hello\n    \t  world\n    '
>             print repr(dedent(s))  # prints ' hello\n\t  world\n'
>     """
>     if expand_tabs:
>         text = text.expandtabs()
>     lines = text.split('\n')
>     
>     margin = None
>     for line in lines:
>         if margin is None:
>             content = line.lstrip()
>             if not content:
>                 continue
>             indent = len(line) - len(content)
>             margin = line[:indent]
>         elif not line.startswith(margin):
>             if len(line) < len(margin):
>                 content = line.lstrip()
>                 if not content:
>                     continue
>             while not line.startswith(margin):
>                 margin = margin[:-1]
>
>     if margin is not None and len(margin) > 0:
>         margin = len(margin)
>         for i in range(len(lines)):
>             lines[i] = lines[i][margin:]
>
>     return '\n'.join(lines)
>
> import unittest
>
> class DedentTest(unittest.TestCase):
>     def testBasicWithSpaces(self):
>         input = "\n   Hello\n      World"
>         expected = "\nHello\n   World"
>         self.assertEqual(expected, dedent(input))
>
>     def testBasicWithTabLeadersSpacesInside(self):
>         input = "\n\tHello\n\t   World"
>         expected = "\nHello\n   World"
>         self.assertEqual(expected, dedent(input, False))
>
>     def testAllTabs(self):
>         input = "\t\tHello\n\tWorld"
>         expected = "\tHello\nWorld"
>         self.assertEqual(expected, dedent(input, False))
>
>     def testFirstLineNotIndented(self):
>         input = "Hello\n\tWorld"
>         expected = input
>         self.assertEqual(expected, dedent(input, False))
>
>     def testMixedTabsAndSpaces(self):
>         input = "  \t Hello\n   \tWorld"
>         expected = "\t Hello\n \tWorld"
>         self.assertEqual(expected, dedent(input, False))
>         
> if __name__ == '__main__':
>     unittest.main()
> -tom!
>
>   
If this works, good for you. I can't say I understand your objective. 
(You dedent common leading tabs, except if preceded by common leading 
spaces (?)). Neither do I understand the existence of indentations made 
up of tabs mixed with spaces, but that is another topic.
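
One possible reading of that objective, sketched under stated assumptions
and not taken from Tom's code or from textwrap itself: the margin is the
longest run of whitespace characters that every non-blank line begins
with, compared character by character, so a tab never matches a space.
The names common_margin and dedent_exact are hypothetical, chosen only
for this illustration.

```python
import os.path


def common_margin(text):
    """Return the longest whitespace prefix shared by all non-blank lines.

    Hypothetical helper: prefixes are compared character by character, so a
    tab and a space never count as equal.
    """
    indents = []
    for line in text.split('\n'):
        content = line.lstrip()
        if content:  # whitespace-only lines do not constrain the margin
            indents.append(line[:len(line) - len(content)])
    # commonprefix compares character-wise and returns '' for an empty list
    return os.path.commonprefix(indents)


def dedent_exact(text):
    """Strip the common margin; whitespace-only lines become empty."""
    width = len(common_margin(text))
    return '\n'.join(line[width:] if line.strip() else ''
                     for line in text.split('\n'))
```

Under this rule Tom's first example, ' \thello\n\tworld', comes back
unchanged, because the two indents share no leading character.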
     I have wasted a lot of time on things of this nature, coding 
away before forming a clear conception in my mind of what my code was 
supposed to accomplish. Sounds stupid. But many problems seem trivial 
enough at first sight to create the illusion of perfect understanding. 
The encounter with the devil in the details can be put off but not 
avoided. Best to get it over with from the start and write an exhaustive 
formal description of the problem. Then follows an exhaustive formal 
description of the rules for its solution. The rules can then be morphed 
into code in a straightforward manner. In other words, coding should be 
the translation of a logical system into a language a machine 
understands. It should not be the construction of the logical system. 
This, anyway, is the conclusion I have arrived at, to my advantage I 
believe.

Frederic




