[IPython-dev] Pasting fix, unicode woes
epatters at enthought.com
Tue Sep 7 20:16:51 EDT 2010
I appreciate the desire for Unicode support (although I firmly believe
that source code itself should *always* be in ASCII). Unfortunately,
you're correct that there may be a fair amount of effort involved in
supporting Unicode robustly.
In short, I may not have time to get this done in the next week and a
half. We should discuss this further off-line, though.
On Mon, Sep 6, 2010 at 11:06 PM, Fernando Perez <fperez.net at gmail.com> wrote:
> Hey Evan,
> I just fixed the paste-trailing-newline annoyance:
> I think that approach is good, because it gives the user a chance to
> edit the code before actually executing, but otherwise just needs a
> simple return to execute.
> I do have one question though: why disallow unicode paste? People are
> quite likely to have non-ascii in their examples, and it seems odd to
> block them from pasting it in. Consider for example that I can't
> paste this:
> name = "Fernando Pérez"
> I consider the fact that I can't type my own name into ipython a bug :)
> I think the solution is to set the GUI encoding by default to UTF-8,
> with an option for the user to change that according to their
> preferences later. I had a quick go at it, but it was getting too
> complicated so I didn't commit anything anywhere. Here's the diff in
> case you find it useful as a starting point (I just reverted locally):
> (newkernel)amirbar[qt]> git diff
> diff --git a/IPython/frontend/qt/console/console_widget.py
> index d78cd63..f6ae9fd 100644
> --- a/IPython/frontend/qt/console/console_widget.py
> +++ b/IPython/frontend/qt/console/console_widget.py
> @@ -10,7 +10,7 @@ from PyQt4 import QtCore, QtGui
> # Local imports
> from IPython.config.configurable import Configurable
> from IPython.frontend.qt.util import MetaQObjectHasTraits
> -from IPython.utils.traitlets import Bool, Enum, Int
> +from IPython.utils.traitlets import Bool, Enum, Int, Str
> from ansi_code_processor import QtAnsiCodeProcessor
> from completion_widget import CompletionWidget
> @@ -37,6 +37,9 @@ class ConsoleWidget(Configurable, QtGui.QWidget):
> # non-positive number disables text truncation (not recommended).
> buffer_size = Int(500, config=True)
> + # The default encoding used by the GUI.
> + encoding = Str('utf-8')
> # Whether to use a list widget or plain text output for tab completion.
> gui_completion = Bool(False, config=True)
> @@ -233,7 +236,7 @@ class ConsoleWidget(Configurable, QtGui.QWidget):
> text = QtGui.QApplication.clipboard().text()
> if not text.isEmpty():
> - str(text)
> + text.encode(self.encoding)
> return True
> except UnicodeEncodeError:
> @@ -421,7 +424,8 @@ class ConsoleWidget(Configurable, QtGui.QWidget):
> # Remove any trailing newline, which confuses the GUI and
> # forces the user to backspace.
> - text = str(QtGui.QApplication.clipboard().text(mode)).rstrip()
> + raw = QtGui.QApplication.clipboard().text(mode).rstrip()
> + text = raw.encode(self.encoding)
> except UnicodeEncodeError:
> @@ -1034,7 +1038,7 @@ class ConsoleWidget(Configurable, QtGui.QWidget):
> - return str(cursor.selection().toPlainText())
> + return unicode(cursor.selection().toPlainText()).encode(self.encoding)
> def _get_cursor(self):
> """ Convenience method that returns a cursor for the current position.
> By the way, this isn't an odd corner case: in other countries, people
> are likely to have files and directories with unicode in them *all the
> time*, so this problem will hit us immediately once the code is out,
> I'm afraid.
> I saw multiple calls of the form str(some.Qt.Code()) that were
> throwing exceptions and decided to stop before I get myself too deep
> into Qt code I don't know well. But the right approach is probably to
> encapsulate all those into a single common call that manages the
> The tricky part, I suspect, will be to do the cursor positioning logic
> with unicode in play: you need to correctly compute the lengths in
> terms of characters on the unicode string (more precisely, the number
> of glyphs that the code points map to), not bytes on the raw one.
> Welcome to the wonderful world of unicode!
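To make the bytes-versus-characters point concrete, here is a minimal sketch in plain Python (no Qt involved). The helper name visible_length is hypothetical, and skipping combining marks is only an approximation of a glyph count, not full grapheme segmentation:

```python
# -*- coding: utf-8 -*-
import unicodedata

# The example name from the message, with 'é' as one precomposed code point.
name = u"Fernando P\u00e9rez"

# The same text as UTF-8 bytes: 'é' takes two bytes.
encoded = name.encode("utf-8")

# The same text in decomposed form: 'e' followed by U+0301 (combining acute).
decomposed = unicodedata.normalize("NFD", name)


def visible_length(text):
    """Approximate the number of glyphs in `text`.

    Hypothetical helper: counts code points that are not combining marks.
    This is an assumption-laden shortcut, not a complete implementation of
    Unicode grapheme cluster rules.
    """
    return sum(1 for ch in text if not unicodedata.combining(ch))


# Three different "lengths" for what the user sees as the same string.
byte_count = len(encoded)
code_point_count = len(decomposed)
glyph_count = visible_length(decomposed)
```

Cursor arithmetic based on byte_count or code_point_count would land one position off for this string; only glyph_count matches what is drawn on screen.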
> ps - and on py3k it's *only* unicode everywhere, so we might as well
> get this code right from the get go. Now that we have people starting
> to help towards py3, the last thing we should do is write a ton of new
> code that is unicode-unsafe for a py3 transition. We're not writing
> py3 code yet, but we should write *with an eye towards py3*.
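As a rough sketch of the "single common call" idea above, something like the following could replace the scattered str(...) calls. The names to_text and to_bytes are hypothetical, not existing IPython or PyQt functions, and the sketch assumes that a Qt string converts cleanly through unicode(...), as the last hunk of the diff does:

```python
# -*- coding: utf-8 -*-

# `unicode` on Python 2, `str` on Python 3, without naming either directly.
text_type = type(u"")


def to_text(value, encoding="utf-8"):
    """Return `value` as a unicode string (hypothetical helper).

    Byte strings are decoded with `encoding`. Any other object, such as a
    Qt string, is passed through text_type(...); that a QString converts
    this way is an assumption taken from the diff, not verified here.
    """
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)


def to_bytes(value, encoding="utf-8"):
    """Return `value` encoded as bytes, going through unicode first."""
    return to_text(value, encoding).encode(encoding)


# Usage with the example from the message.
name = u"Fernando P\u00e9rez"
raw = to_bytes(name)        # bytes suitable for sending to the kernel
round_trip = to_text(raw)   # back to unicode for display in the widget
```

Keeping the encoding decision in one pair of functions means the widget can work in unicode internally and only encode at the boundary, which is also the shape the code would need on py3.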