[Python-3000] PEP 368: Standard image protocol and class

Lino Mastrodomenico l.mastrodomenico at gmail.com
Sun Jul 1 03:00:29 CEST 2007


Here's the full text of the PEP's current draft, so you can comment
directly on it (thanks to Collin Winter for the suggestion):

PEP: 368
Title: Standard image protocol and class
Version: $Revision: 56133 $
Last-Modified: $Date: 2007-06-30 21:07:03 +0200 (sab, 30 giu 2007) $
Author: Lino Mastrodomenico <l.mastrodomenico at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Jun-2007
Python-Version: 2.6, 3.0
Post-History:


Abstract
========

The current situation of image storage and manipulation in the Python
world is extremely fragmented: almost every library that uses image
objects has implemented its own image class, incompatible with
everyone else's and often not very pythonic.  A basic RGB image class
exists in the standard library (``Tkinter.PhotoImage``), but is pretty
much unusable, and unused, for anything except Tkinter programming.

This fragmentation not only takes up valuable space in developers'
minds, but also makes the exchange of images between different
libraries (needed in relatively common use cases) slower and more
complex than it needs to be.

This PEP proposes to improve the situation by defining a simple and
pythonic image protocol/interface that can hopefully be accepted and
implemented by existing image classes inside and outside the standard
library *without breaking backward compatibility* with their existing
user bases.  In practice this is a definition of how a minimal
*image-like* object should look and act (in a similar way to the
``read()`` and ``write()`` methods in *file-like* objects).

The inclusion in the standard library of a class that provides basic
image manipulation functionality and implements the new protocol is
also proposed, together with a mixin class that helps add support
for the protocol to existing image classes.


Rationale
=========

A good way to have high quality modules ready for inclusion in the
Python standard library is to simply wait for natural selection among
competing external libraries to provide a clear winner with useful
functionality and a big user base.  Then the de-facto standard can be
officially sanctioned by including it in the standard library.

Unfortunately this approach hasn't worked well for the creation of a
dominant image class in the Python world: almost every third-party
library that requires an image object creates its own class
incompatible with the ones from other libraries.  This is a real
problem because it's entirely reasonable for a program to create and
manipulate an image using, e.g., PIL (the Python Imaging Library) and
then display it using wxPython or pygame.  But these libraries have
different and incompatible image classes, and the usual solution is to
manually "export" an image from the source to a (width, height,
bytes_string) tuple and "import" it creating a new instance in the
target format.  This approach *works*, but is both uglier and slower
than it needs to be.

Another "solution" that has sometimes been used is the creation of
specific adapters and/or converters from one class to another (e.g. PIL
offers the ``ImageTk`` module for converting PIL images to a class
compatible with the Tkinter one).  But this approach doesn't scale
well with the number of libraries involved and it's still annoying for
the user: if I have a perfectly good image object, why should I convert
it before passing it to the next method?  Why can't that method simply
accept my image as-is?

The problem isn't by any stretch limited to the three mentioned
libraries and probably has multiple causes, including two that IMO are
very important to understand before solving it:

* in today's computing world an image is a basic type not strictly
  tied to a specific domain.  This is why there will never be a clear
  winner between the image classes from the three libraries mentioned
  above (PIL, wxPython and pygame): they cover different domains and
  don't really compete with each other;

* the Python standard library has never provided a good image class
  that can be adopted or imitated by third-party modules.
  ``Tkinter.PhotoImage`` provides basic RGB functionality, but it's by
  far the slowest and ugliest of the bunch and it can be instantiated
  only after the Tkinter root window has been created.

This PEP tries to improve this situation in four ways:

1. It defines a simple and pythonic image protocol/interface (both on
   the Python and the C side) that can hopefully be accepted and
   implemented by existing image classes inside and outside the
   standard library *without breaking backward compatibility* with
   their existing user bases.

2. It proposes the inclusion in the standard library of three new
   classes:

   * ``ImageMixin`` provides almost everything necessary to implement
     the new protocol; its main purpose is to make it as simple as
     possible for existing libraries to support this interface, in
     some cases as simple as adding it to the list of base classes and
     making minor additions to the constructor.

   * ``Image`` is a subclass of ``ImageMixin`` and will add a
     constructor that can resize and/or convert an image between
     different pixel formats.  This is intended to provide a fast and
     efficient default implementation of the new protocol.

   * ``ImageSize`` is a minor helper class.  See below for details.

3. ``Tkinter.PhotoImage`` will implement the new protocol (mostly
   through the ``ImageMixin`` class) and all the Tkinter methods that
   can receive an image will be modified to accept any object that
   implements the interface.  As an aside the author of this PEP will
   collaborate with the developers of the most common external
   libraries to achieve the same goal (supporting the protocol in
   their classes and accepting any class that implements it).

4. New ``PyImage_*`` functions will be added to the CPython C API:
   they implement the C side of the protocol and accept as first
   parameter **any** object that supports it, even if it isn't an
   instance of the ``Image``/``ImageMixin`` classes.

The main effects for the end user will be a simplification of the
interchange of images between different libraries (if everything goes
well, any Python library will accept images from any other library)
and the out-of-the-box availability of the new ``Image`` class.  The
new class is intended to cover simple but common use cases like
cropping and/or resizing a photograph to the desired size and passing
it to an appropriate widget for display in a window, or darkening a
texture and passing it to a 3D library.

The ``Image`` class is not intended to replace or compete with PIL,
PythonMagick or NumPy, even if it provides a (very small) subset of
the functionality of these three libraries.  In particular PIL offers
very rich image manipulation features with *dozens* of classes,
filters, transformations and file formats.  The inclusion of PIL (or
something similar) in the standard library may, or may not, be a
worthy goal but it's completely outside the scope of this PEP.


Specification
=============

The ``imageop`` module is used as the *default* location for the new
classes and objects because it has for a long time hosted functions
that provided a somewhat similar functionality, but a new module may
be created if preferred (e.g. a new "``image``" or "``media``" module;
the latter may eventually include other multimedia classes).

``MODES`` is a new module level constant: it is a set of the pixel
formats supported by the ``Image`` class.  Any image object that
implements the new protocol is guaranteed to be formatted in one of
these modes, but libraries that accept images are allowed to support
only a subset of them.

These modes are in turn also available as module level constants (e.g.
``imageop.RGB``).

The following table is a summary of the modes currently supported and
their properties:

========= =============== ========= =========== ======================
  Name       Component    Bits per  Subsampling        Valid
             names        component                    intervals
========= =============== ========= =========== ======================
L         l (lowercase L) 8         no          full range
L16       l               16        no          full range
L32       l               32        no          full range
LA        l, a            8         no          full range
LA32      l, a            16        no          full range
RGB       r, g, b         8         no          full range
RGB48     r, g, b         16        no          full range
RGBA      r, g, b, a      8         no          full range
RGBA64    r, g, b, a      16        no          full range
YV12      y, cr, cb       8         1, 2, 2     16-235, 16-240, 16-240
JPEG_YV12 y, cr, cb       8         1, 2, 2     full range
CMYK      c, m, y, k      8         no          full range
CMYK64    c, m, y, k      16        no          full range
========= =============== ========= =========== ======================

When the name of a mode ends with a number, it represents the average
number of bits per pixel.  All the other modes simply use a byte per
component per pixel.
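
As a rough illustration of how the numeric suffixes relate to the
table, the sketch below computes the average number of bits per pixel
from the bits per component and the per-component subsampling factors.
The helper name ``average_bits_per_pixel`` is hypothetical and not part
of the proposal.

```python
from fractions import Fraction


def average_bits_per_pixel(bits_per_component, subsampling):
    """Average number of bits stored per pixel.

    ``subsampling`` holds one (horizontal, vertical) pair per component.
    """
    total = Fraction(0)
    for x_step, y_step in subsampling:
        # A component subsampled by (2, 2) stores one value per 2x2 block,
        # so it contributes only a quarter of its bits to each pixel.
        total += Fraction(bits_per_component, x_step * y_step)
    return total


full = ((1, 1),)
la32 = average_bits_per_pixel(16, full * 2)
rgb48 = average_bits_per_pixel(16, full * 3)
yv12 = average_bits_per_pixel(8, ((1, 1), (2, 2), (2, 2)))
print(la32, rgb48, yv12)
```

Under these assumptions the three results match the suffixes of
``LA32``, ``RGB48`` and ``YV12`` respectively.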

No palette modes or modes with fewer than 8 bits per component are
supported.  Welcome to the 21st century.

Here's a quick description of the modes and the rationale for their
inclusion; there are four groups of modes:

1. **grayscale** (``L*`` modes): they are heavily used in scientific
   computing (those people may also need a very high dynamic range and
   precision, hence ``L32``, the only mode with 32 bits per component)
   and sometimes it can be useful to consider a single component of a
   color image as a grayscale image (this is used by the individual
   planes of the planar images, see ``YV12`` below); the name of the
   component (``'l'``, lowercase letter L) stands for luminance, the
   second optional component (``'a'``) is the alpha value and
   represents the opacity of the pixels: alpha = 0 means full
   transparency, alpha = 255/65535 represents a fully opaque pixel;

2. **RGB\* modes**: the garden variety color images.  The optional
   alpha component has the same meaning as in grayscale modes;

3. **YCbCr**, a.k.a. YUV (``*YV12`` modes).  These modes are planar
   (i.e. the values of all the pixels for each component are stored in
   a consecutive memory area, instead of the usual arrangement where
   all the components of a pixel reside in consecutive bytes) and use
   a 1, 2, 2 (a.k.a. 4:2:0) subsampling (i.e. each pixel has its own Y
   value, but the Cb and Cr components are shared between groups of
   2x2 adjacent pixels) because this is the format that's by far the
   most common for YCbCr images.  Please note that the V (Cr) plane is
   stored before the U (Cb) plane.

   ``YV12`` is commonly used for MPEG2 (including DVDs), MPEG4 (both
   ASP/DivX and AVC/H.264) and Theora video frames.  Valid values for
   Y are in range(16, 236) (excluding 236), and valid values for Cb
   and Cr are in range(16, 241).  ``JPEG_YV12`` is similar to
   ``YV12``, but the three components can have the full range of 256
   values.  It's the native format used by almost all JPEG/JFIF files
   and by MJPEG video frames.  The "strangeness" of these two with
   respect to all the other supported modes derives from the fact that
   they are widely used that way by a lot of existing libraries and
   applications; this is also the reason why they are included (and
   the fact that they can't be losslessly converted to RGB because
   YCbCr is a bigger color space); the funny 4:2:0 planar arrangement
   of the pixel values is relatively easy to support because in most
   cases the three planes can be considered three separate grayscale
   images;

4. **CMYK\* modes** (cyan, magenta, yellow and black) are subtractive
   color modes, used for printing color images on dead trees.
   Professional designers love to pretend that they can't live without
   them, so here they are.
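
To make the planar 4:2:0 arrangement concrete, here is a minimal sketch
that computes where each plane of a ``YV12`` image would start inside a
single unpadded buffer.  It assumes 8 bits per component and the plane
order given above (Y, then Cr, then Cb); the function name
``yv12_plane_layout`` is hypothetical.

```python
def yv12_plane_layout(width, height):
    """Return ({name: (offset, plane_width, plane_height)}, total_bytes)."""
    if width % 2 or height % 2:
        raise ValueError("YV12 width and height must both be even")
    planes = {}
    offset = 0
    # The V (Cr) plane is stored before the U (Cb) plane.
    for name, x_step, y_step in (("y", 1, 1), ("cr", 2, 2), ("cb", 2, 2)):
        plane_width, plane_height = width // x_step, height // y_step
        planes[name] = (offset, plane_width, plane_height)
        # One byte per sample, so the plane size in bytes is its area.
        offset += plane_width * plane_height
    return planes, offset


planes, total = yv12_plane_layout(4, 2)
print(planes, total)
```

For a 4x2 image this gives 8 luma bytes followed by 2 bytes for each
chroma plane, i.e. 12 bits per pixel on average.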


Python API
----------

See the examples_ below.

In Python 2.x, all the new classes defined here are new-style classes.


Mode Objects
''''''''''''

The mode objects offer a number of attributes and methods that can be
used for implementing generic algorithms that work on different types
of images:

``components``

    The number of components per pixel (e.g. 4 for an RGBA image).

``component_names``

    A tuple of strings; see the column "Component names" in the above
    table.

``bits_per_component``

    8, 16 or 32; see "Bits per component" in the above table.

``bytes_per_pixel``

    ``components * bits_per_component // 8``, only available for
    non-planar modes (see below).

``planar``

    Boolean; ``True`` if the image components reside each in a
    separate plane.  Currently this happens if and only if the mode
    uses subsampling.

``subsampling``

    A tuple that for each component in the mode contains a tuple of
    two integers that represent the amount of downsampling in the
    horizontal and vertical direction, respectively.  In practice it's
    ``((1, 1), (2, 2), (2, 2))`` for ``YV12`` and ``JPEG_YV12`` and
    ``((1, 1),) * components`` for everything else.

``x_divisor``

    ``max(x for x, y in subsampling)``; the width of an image that
    uses this mode must be divisible by this value.

``y_divisor``

    ``max(y for x, y in subsampling)``; the height of an image that
    uses this mode must be divisible by this value.

``intervals``

    A tuple that for each component in the mode contains a tuple of
    two integers: the minimum and maximum valid value for the
    component.  Its value is ``((16, 235), (16, 240), (16, 240))`` for
    ``YV12`` and ``((0, 2 ** bits_per_component - 1),) * components``
    for everything else.

``get_length(iterable[integer]) -> int``

    The parameter must be an iterable that contains two integers: the
    width and height of an image; it returns the number of bytes
    needed to store an image of these dimensions with this mode.
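
The derived attributes above follow mechanically from
``component_names``, ``bits_per_component`` and ``subsampling``.  The
sketch below shows one way they could be computed; ``ModeInfo`` is a
hypothetical stand-in for a real mode object, and the free functions
mirror the attribute and method names only for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a mode object; only the fields needed here.
ModeInfo = namedtuple(
    "ModeInfo", "component_names bits_per_component subsampling")

RGB = ModeInfo(("r", "g", "b"), 8, ((1, 1),) * 3)
RGBA64 = ModeInfo(("r", "g", "b", "a"), 16, ((1, 1),) * 4)
YV12 = ModeInfo(("y", "cr", "cb"), 8, ((1, 1), (2, 2), (2, 2)))


def x_divisor(mode):
    return max(x for x, y in mode.subsampling)


def y_divisor(mode):
    return max(y for x, y in mode.subsampling)


def bytes_per_pixel(mode):
    # Per the text, a mode is planar exactly when it uses subsampling.
    if x_divisor(mode) > 1 or y_divisor(mode) > 1:
        raise AttributeError("bytes_per_pixel is undefined for planar modes")
    return len(mode.component_names) * mode.bits_per_component // 8


def get_length(mode, size):
    """Number of bytes needed for an image of the given (width, height)."""
    width, height = size
    if width % x_divisor(mode) or height % y_divisor(mode):
        raise ValueError("size is not divisible by the mode's divisors")
    # Count the stored samples of every component, then convert to bytes.
    samples = sum((width // x) * (height // y) for x, y in mode.subsampling)
    return samples * mode.bits_per_component // 8


print(get_length(RGB, (6, 9)), get_length(YV12, (4, 2)))
```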

Implementation detail: the modes are instances of a subclass of
``str`` and have a value equal to their name (e.g. ``imageop.RGB ==
'RGB'``) except for ``L32`` that has value ``'I'``.  This is only
intended for backward compatibility with existing PIL users; new code
that uses the image protocol proposed here should not rely on this
detail.
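
A minimal sketch of how a ``str`` subclass could satisfy this
implementation detail.  The class name ``Mode`` and its ``name``
attribute are assumptions made for the example; the PEP does not
define them.

```python
class Mode(str):
    """Hypothetical mode object: a string with an extra ``name`` attribute."""

    def __new__(cls, name, value=None):
        # The string value defaults to the mode name itself.
        self = super().__new__(cls, name if value is None else value)
        self.name = name
        return self


RGB = Mode("RGB")
L32 = Mode("L32", "I")  # string value differs from the name, as for PIL

print(RGB == "RGB", L32 == "I", L32.name)
```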


Image Protocol
''''''''''''''

Any object that supports the image protocol must provide the following
methods and attributes:

``mode``

    The format and the arrangement of the pixels in this image; it's
    one of the constants in the ``MODES`` set.

``size``

    An instance of the `ImageSize class`_; it's a named tuple of two
    integers: the width and the height of the image in pixels; both of
    them must be >= 1 and can also be accessed as the ``width`` and
    ``height`` attributes of ``size``.

``buffer``

    A sequence of integers between 0 and 255; they are the actual
    bytes used for storing the image data (i.e. modifying their values
    affects the image pixels and vice versa); the data has a
    row-major/C-contiguous order without padding and without any
    special memory alignment, even when there are more than 8 bits per
    component.  The only supported methods are ``__len__``,
    ``__getitem__``/``__setitem__`` (with both integers and slice
    indexes) and ``__iter__``; on the C side it implements the buffer
    protocol.

    This is a pretty low level interface to the image and the user is
    responsible for using the correct (native) byte order for modes
    with more than 8 bits per component and the correct value ranges
    for ``YV12`` images.  A buffer may or may not keep a reference to
    its image, but it's still safe (if useless) to use the buffer even
    after the corresponding image has been destroyed by the garbage
    collector (this will require changes to the image class of
    wxPython and possibly other libraries).  Implementation detail:
    this can be an ``array('B')``, a ``bytes()`` object or a
    specialized fixed-length type.
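
The row-major, unpadded layout means that the byte offset of a pixel
in a non-planar image can be computed directly.  The sketch below uses
a plain ``bytearray`` as a stand-in for the ``buffer`` of an ``RGB48``
image and the ``struct`` module to read and write 16-bit components in
native byte order without alignment; ``pixel_offset`` is a hypothetical
helper.

```python
import struct


def pixel_offset(x, y, width, bytes_per_pixel):
    """Byte offset of pixel (x, y) in a row-major buffer without padding."""
    return (y * width + x) * bytes_per_pixel


width, height = 4, 3
bytes_per_pixel = 6  # RGB48: three components of 16 bits each
buffer = bytearray(width * height * bytes_per_pixel)

offset = pixel_offset(2, 1, width, bytes_per_pixel)
# "=" selects native byte order with standard sizes and no alignment.
struct.pack_into("=3H", buffer, offset, 1000, 2000, 65535)
r, g, b = struct.unpack_from("=3H", buffer, offset)
print(offset, (r, g, b))
```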

``info``

    A ``dict`` object that can contain arbitrary metadata associated
    with the image (e.g. DPI, gamma, ICC profile, exposure time...);
    the interpretation of this data is beyond the scope of this PEP
    and probably depends on the library used to create and/or to save
    the image; if a method of the image returns a new image, it can
    copy or adapt metadata from its own ``info`` attribute (the
    ``ImageMixin`` implementation always creates a new image with an
    empty ``info`` dictionary).

| ``bits_per_component``
| ``bytes_per_pixel``
| ``component_names``
| ``components``
| ``intervals``
| ``planar``
| ``subsampling``

    Shortcuts for the corresponding ``mode.*`` attributes.

``map(function[, function...]) -> None``

    For every pixel in the image, maps each component through the
    corresponding function.  If only one function is passed, it is
    used repeatedly for each component.  This method modifies the
    image **in place** and is usually very fast (most of the time the
    functions are called only a small number of times, possibly only
    once for simple functions without branches), but it imposes a
    number of restrictions on the function(s) passed:

    * it must accept a single integer argument and return a number
      (``map`` will round the result to the nearest integer and clip
      it to ``range(0, 2 ** bits_per_component)``, if necessary);

    * it must *not* try to intercept any ``BaseException``,
      ``Exception`` or any unknown subclass of ``Exception`` raised by
      any operation on the argument (implementations may try to
      optimize the speed by passing funny objects, so even a simple
      ``"if n == 10:"`` may raise an exception: simply ignore it,
      ``map`` will take care of it); catching any other exception is
      fine;

    * it should be side-effect free and its result should not depend
      on values (other than the argument) that may change during a
      single invocation of ``map``.
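
One plausible way for an implementation to call each function only a
small number of times is a lookup table: the function is evaluated once
per possible component value and the table is then applied to the whole
buffer.  The sketch below does this for interleaved modes with 8 bits
per component; it is an assumption about a possible strategy, not the
algorithm mandated by this PEP, and ``build_lookup_table`` and
``map_in_place`` are hypothetical names.

```python
import math


def build_lookup_table(function, bits_per_component=8):
    """Evaluate ``function`` once for every possible component value."""
    top = 2 ** bits_per_component - 1
    table = []
    for value in range(top + 1):
        result = math.floor(function(value) + 0.5)  # round to nearest
        table.append(min(max(result, 0), top))      # clip to valid range
    return table


def map_in_place(buffer, components, *functions):
    """Apply one function per component to an interleaved 8-bit buffer."""
    if len(functions) == 1:
        functions = functions * components
    if len(functions) != components:
        raise TypeError("expected one function or one per component")
    for index, function in enumerate(functions):
        table = bytes(build_lookup_table(function))
        # Every ``components``-th byte belongs to the same component.
        buffer[index::components] = buffer[index::components].translate(table)


pixels = bytearray([10, 20, 30, 200, 100, 0])  # two RGB pixels
map_in_place(pixels, 3, lambda v: v * 2)
print(list(pixels))
```

Each function is called 256 times regardless of the image size, which
is why the restrictions on side effects matter.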

| ``rotate90() -> image``
| ``rotate180() -> image``
| ``rotate270() -> image``

    Return a copy of the image rotated 90, 180 or 270 degrees
    counterclockwise around its center.
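
To pin down the direction of rotation, here is a small sketch that
operates on a plain list of rows (top line first) instead of a real
image object.  The functions share the names of the protocol methods
only for readability; they are not the proposed implementation.

```python
def rotate90(rows):
    """Rotate 90 degrees counterclockwise: the right column becomes the top row."""
    return [list(column) for column in zip(*rows)][::-1]


def rotate180(rows):
    return [row[::-1] for row in rows[::-1]]


def rotate270(rows):
    """Rotate 270 degrees counterclockwise (90 degrees clockwise)."""
    return [list(column)[::-1] for column in zip(*rows)]


rows = [[1, 2, 3],
        [4, 5, 6]]
print(rotate90(rows))
print(rotate180(rows))
print(rotate270(rows))
```

Note that a 3x2 input becomes 2x3 after ``rotate90`` or ``rotate270``.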

``clip() -> None``

    Saturates invalid component values in ``YV12`` images to the
    minimum or the maximum allowed (see ``mode.intervals``); for other
    image modes this method is a very fast no-op; libraries that
    save/export ``YV12`` images are encouraged to always call this
    method, since intermediate operations (e.g. the ``map`` method)
    may assign to pixels values outside the valid intervals.
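
A minimal sketch of the saturation described above, working directly
on a ``bytearray`` laid out as a ``YV12`` buffer (luma plane first,
then the two chroma planes).  The function name ``clip_yv12`` is
hypothetical.

```python
def clip_yv12(buffer, width, height):
    """Saturate every byte of a YV12 buffer to its valid interval, in place."""
    luma_length = width * height
    for index in range(len(buffer)):
        # Luma samples come first; everything after them is chroma.
        low, high = (16, 235) if index < luma_length else (16, 240)
        buffer[index] = min(max(buffer[index], low), high)


# A 2x2 image: four luma bytes, then one Cr byte and one Cb byte.
data = bytearray([0, 16, 235, 255, 250, 3])
clip_yv12(data, 2, 2)
print(list(data))
```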

``split() -> tuple[image]``

    Returns a tuple of ``L``, ``L16`` or ``L32`` images corresponding
    to the individual components in the image.

Planar images also support attributes with the same names defined in
``component_names``: they contain grayscale (mode ``L``) images that
offer a view on the pixel values for the corresponding component; any
change to the subimages is immediately reflected on the parent image
and vice versa (their buffers refer to the same memory location).
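
The shared-memory behaviour can be illustrated with ``memoryview``
slices over a single ``bytearray``.  This is only a sketch of the idea
for a 4x2 ``YV12`` buffer; real subimages would be image objects rather
than bare memory views.

```python
width, height = 4, 2
# 8 luma bytes followed by 2 Cr bytes and 2 Cb bytes.
parent = bytearray([16] * (width * height) + [128] * 4)

view = memoryview(parent)
y_plane = view[:8]
cr_plane = view[8:10]
cb_plane = view[10:12]

# Writing through a plane view changes the parent buffer...
y_plane[1 * width + 1] = 200   # luma sample at x=1, y=1
# ...and writing to the parent is visible through the view.
parent[8] = 90

print(parent[5], cr_plane[0], cb_plane[0])
```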

Non-planar images offer the following additional methods:

``pixels() -> iterator[pixel]``

    Returns an iterator that iterates over all the pixels in the
    image, starting from the top line and scanning each line from left
    to right.  See below for a description of the `pixel objects`_.

``__iter__() -> iterator[line]``

    Returns an iterator that iterates over all the lines in the image,
    from top to bottom.  See below for a description of the `line
    objects`_.

``__len__() -> int``

    Returns the number of lines in the image (``size.height``).

``__getitem__(integer) -> line``

    Returns the line at the specified (y) position.

``__getitem__(tuple[integer]) -> pixel``

    The parameter must be a tuple of two integers; they are
    interpreted respectively as x and y coordinates in the image (0, 0
    is the top left corner) and a pixel object is returned.

``__getitem__(slice | tuple[integer | slice]) -> image``

    The parameter must be a slice or a tuple that contains two slices
    or an integer and a slice; the selected area of the image is
    copied and a new image is returned; ``image[x:y:z]`` is equivalent
    to ``image[:, x:y:z]``.

``__setitem__(tuple[integer], integer | iterable[integer]) -> None``

    Modifies the pixel at the specified position; ``image[x, y] =
    integer`` is a shortcut for ``image[x, y] = (integer,)`` for
    images with a single component.

``__setitem__(slice | tuple[integer | slice], image) -> None``

    Selects an area in the same way as the corresponding form of the
    ``__getitem__`` method and assigns to it a copy of the pixels from
    the image in the second argument, that must have exactly the same
    mode as this image and the same size as the specified area; the
    alpha component, if present, is simply copied and doesn't affect
    the other components of the image (i.e. no alpha compositing is
    performed).
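
The different forms of ``__getitem__`` and ``__setitem__`` can be told
apart by the type of the key.  The sketch below shows one possible
dispatch for a toy single-component image stored as a list of rows;
``ToyImage`` and ``_as_slice`` are hypothetical, and lines and pixels
are plain lists and integers here rather than the line and pixel
objects described in this PEP.

```python
def _as_slice(index):
    """Turn an integer index into a slice that selects just that item."""
    if isinstance(index, slice):
        return index
    return slice(index, index + 1 or None)  # "or None" keeps -1 working


class ToyImage:
    """Hypothetical single-component image stored as a list of rows."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __len__(self):
        return len(self.rows)  # number of lines

    def __getitem__(self, key):
        if isinstance(key, int):
            return self.rows[key]  # a line, selected by its y position
        if isinstance(key, slice):
            key = (slice(None), key)  # image[a:b] means image[:, a:b]
        x, y = key
        if isinstance(x, int) and isinstance(y, int):
            return self.rows[y][x]  # a pixel: x first, then y
        # Anything else selects an area, which is copied into a new image.
        return ToyImage(row[_as_slice(x)] for row in self.rows[_as_slice(y)])

    def __setitem__(self, key, value):
        if isinstance(key, slice):
            key = (slice(None), key)
        x, y = key
        if isinstance(x, int) and isinstance(y, int):
            self.rows[y][x] = value
            return
        targets = self.rows[_as_slice(y)]  # same row objects, not copies
        if len(targets) != len(value.rows):
            raise ValueError("source height does not match the selected area")
        for target, source in zip(targets, value.rows):
            if len(target[_as_slice(x)]) != len(source):
                raise ValueError("source width does not match the selected area")
            target[_as_slice(x)] = source


image = ToyImage([[1, 2, 3],
                  [4, 5, 6]])
mirrored = image[::-1, :]
image[:2, 1:] = ToyImage([[9, 8]])
print(mirrored.rows, image.rows)
```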

The ``mode``, ``size`` and ``buffer`` (including the address in memory
of the ``buffer``) never change after an image is created.

It is expected that, if PEP 3118 is accepted, all the image objects
will support the new buffer protocol; however, this is beyond the scope
of this PEP.


``Image`` and ``ImageMixin`` Classes
''''''''''''''''''''''''''''''''''''

The ``ImageMixin`` class implements all the methods and attributes
described above except ``mode``, ``size``, ``buffer`` and ``info``.
``Image`` is a subclass of ``ImageMixin`` that adds support for these
four attributes and offers the following constructor (please note that
the constructor is not part of the image protocol):

``__init__(mode, size, color, source)``

    ``mode`` must be one of the constants in the ``MODES`` set,
    ``size`` is a sequence of two integers (width and height of the
    new image); ``color`` is a sequence of integers, one for each
    component of the image, used to initialize all the pixels to the
    same value; ``source`` can be a sequence of integers of the
    appropriate size and format that is copied as-is in the buffer of
    the new image or an existing image; in Python 2.x ``source`` can
    also be an instance of ``str`` and is interpreted as a sequence of
    bytes.  ``color`` and ``source`` are mutually exclusive and if
    they are both omitted the image is initialized to transparent
    black (all the bytes in the buffer have value 16 in the ``YV12``
    mode, 255 in the ``CMYK*`` modes and 0 for everything else).  If
    ``source`` is present and is an image, ``mode`` and/or ``size``
    can be omitted; if they are specified and are different from the
    source mode and/or size, the source image is converted.

    The exact algorithms used for resizing and doing color space
    conversions may differ between Python versions and
    implementations, but they always give high quality results (e.g.:
    a cubic spline interpolation can be used for upsampling and an
    antialias filter can be used for downsampling images); any
    combination of mode conversion is supported, but the algorithm
    used for conversions to and from the ``CMYK*`` modes is pretty
    naïve: if you have the exact color profiles of your devices you
    may want to use a good color management tool such as LittleCMS.
    The new image has an empty ``info`` ``dict``.


Line Objects
''''''''''''

The line objects (returned, e.g., when iterating over an image)
support the following attributes and methods:

``mode``

    The mode of the image this line comes from.

``__iter__() -> iterator[pixel]``

    Returns an iterator that iterates over all the pixels in the line,
    from left to right.  See below for a description of the `pixel
    objects`_.

``__len__() -> int``

    Returns the number of pixels in the line (the image width).

``__getitem__(integer) -> pixel``

    Returns the pixel at the specified (x) position.

``__getitem__(slice) -> image``

    The selected part of the line is copied and a new image is
    returned; the new image will always have height 1.

``__setitem__(integer, integer | iterable[integer]) -> None``

    Modifies the pixel at the specified position; ``line[x] =
    integer`` is a shortcut for ``line[x] = (integer,)`` for images
    with a single component.

``__setitem__(slice, image) -> None``

    Selects a part of the line and assigns to it a copy of the pixels
    from the image in the second argument, that must have height 1, a
    width equal to the specified slice and the same mode as this line;
    the alpha component, if present, is simply copied and doesn't
    affect the other components of the image (i.e. no alpha
    compositing is performed).


Pixel Objects
'''''''''''''

The pixel objects (returned, e.g., when iterating over a line) support
the following attributes and methods:

``mode``

    The mode of the image this pixel comes from.

``value``

    A tuple of integers, one for each component.  Any iterable of the
    correct length can be assigned to ``value`` (it will be
    automagically converted to a tuple), but you can't assign to it an
    integer, even if the mode has only a single component: use, e.g.,
    ``pixel.l = 123`` instead.

``r, g, b, a, l, c, m, y, k``

    The integer values of each component; only those applicable for
    the current mode (in ``mode.component_names``) will be available.

| ``__iter__() -> iterator[int]``
| ``__len__() -> int``
| ``__getitem__(integer | slice) -> int | tuple[int]``
| ``__setitem__(integer | slice, integer | iterable[integer]) ->
                                                              None``

    These four methods emulate a fixed-length list of integers, one
    for each pixel component.
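
A pixel object can be thought of as a small view onto the image buffer.
The sketch below shows one way the attributes and methods above might
fit together for interleaved modes with 8 bits per component;
``ToyPixel`` and its constructor arguments are hypothetical and not
part of the proposal.

```python
class ToyPixel:
    """Hypothetical pixel view for modes with 8 bits per component."""

    def __init__(self, buffer, offset, component_names):
        # object.__setattr__ bypasses the component-aware __setattr__ below.
        object.__setattr__(self, "_buffer", buffer)
        object.__setattr__(self, "_offset", offset)
        object.__setattr__(self, "_names", tuple(component_names))

    @property
    def value(self):
        end = self._offset + len(self._names)
        return tuple(self._buffer[self._offset:end])

    def __getattr__(self, name):
        # Only reached when normal lookup fails, e.g. for "r" or "g".
        names = object.__getattribute__(self, "_names")
        if name in names:
            return self._buffer[self._offset + names.index(name)]
        raise AttributeError(name)

    def __setattr__(self, name, new_value):
        if name == "value":
            new_value = tuple(new_value)  # a bare integer raises TypeError
            if len(new_value) != len(self._names):
                raise ValueError("wrong number of components")
            end = self._offset + len(self._names)
            self._buffer[self._offset:end] = bytes(new_value)
        elif name in self._names:
            self._buffer[self._offset + self._names.index(name)] = new_value
        else:
            raise AttributeError(name)

    def __len__(self):
        return len(self._names)

    def __iter__(self):
        return iter(self.value)

    def __getitem__(self, index):
        return self.value[index]

    def __setitem__(self, index, new_value):
        components = list(self.value)
        components[index] = new_value
        self.value = components


data = bytearray([10, 20, 30, 40, 50, 60])   # two RGB pixels
pixel = ToyPixel(data, 3, ("r", "g", "b"))   # view on the second pixel
pixel.value = pixel.b, 0, pixel.r            # swap red and blue, zero green
print(list(data))
```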


``ImageSize`` Class
'''''''''''''''''''

``ImageSize`` is a named tuple, a class identical to ``tuple`` except
that:

* its constructor only accepts two integers, width and height; they
  are converted in the constructor using their ``__index__()``
  methods, so all the ``ImageSize`` objects are guaranteed to contain
  only ``int`` (or possibly ``long``, in Python 2.x) instances;

* it has a ``width`` and a ``height`` property that are equivalent to
  the first and the second number in the tuple, respectively;

* the string returned by its ``__repr__`` method is
  ``'imageop.ImageSize(width=%d, height=%d)' % (width, height)``.

``ImageSize`` is not usually instantiated by end-users, but can be
used when creating a new class that implements the image protocol,
since the ``size`` attribute must be an ``ImageSize`` instance.
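
The three points above are enough to sketch the class.  The following
is a minimal illustration written for Python 3, assuming nothing beyond
what the list states; it is not the reference implementation.

```python
import operator


class ImageSize(tuple):
    """Sketch of a two-element named tuple of (width, height)."""

    __slots__ = ()

    def __new__(cls, width, height):
        # operator.index() calls __index__(), so floats and strings
        # are rejected with a TypeError.
        return super().__new__(
            cls, (operator.index(width), operator.index(height)))

    width = property(operator.itemgetter(0))
    height = property(operator.itemgetter(1))

    def __repr__(self):
        return "imageop.ImageSize(width=%d, height=%d)" % (
            self.width, self.height)


size = ImageSize(6, 9)
width, height = size  # still unpacks like an ordinary tuple
print(repr(size), width, height)
```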


C API
-----

The available image modes are visible at the C level as ``PyImage_*``
constants of type ``PyObject *`` (e.g.: ``PyImage_RGB`` is
``imageop.RGB``).

The following functions offer a C-friendly interface to mode and image
objects (all the functions return ``NULL`` or -1 on failure):

``int PyImageMode_Check(PyObject *obj)``

    Returns true if the object ``obj`` is a valid image mode.

| ``int PyImageMode_GetComponents(PyObject *mode)``
| ``PyObject* PyImageMode_GetComponentNames(PyObject *mode)``
| ``int PyImageMode_GetBitsPerComponent(PyObject *mode)``
| ``int PyImageMode_GetBytesPerPixel(PyObject *mode)``
| ``int PyImageMode_GetPlanar(PyObject *mode)``
| ``PyObject* PyImageMode_GetSubsampling(PyObject *mode)``
| ``int PyImageMode_GetXDivisor(PyObject *mode)``
| ``int PyImageMode_GetYDivisor(PyObject *mode)``
| ``Py_ssize_t PyImageMode_GetLength(PyObject *mode, Py_ssize_t width,
                                     Py_ssize_t height)``

    These functions are equivalent to their corresponding Python
    attributes or methods.

``int PyImage_Check(PyObject *obj)``

    Returns true if the object ``obj`` is an ``Image`` object or an
    instance of a subtype of the ``Image`` type; see also
    ``PyObject_CheckImage`` below.

``int PyImage_CheckExact(PyObject *obj)``

    Returns true if the object ``obj`` is an ``Image`` object, but not
    an instance of a subtype of the ``Image`` type.

| ``PyObject* PyImage_New(PyObject *mode, Py_ssize_t width,
                          Py_ssize_t height)``

    Returns a new ``Image`` instance, initialized to transparent black
    (see ``Image.__init__`` above for the details).

| ``PyObject* PyImage_FromImage(PyObject *image, PyObject *mode,
                                Py_ssize_t width, Py_ssize_t height)``

    Returns a new ``Image`` instance, initialized with the contents of
    the ``image`` object rescaled and converted to the specified
    ``mode``, if necessary.

| ``PyObject* PyImage_FromBuffer(PyObject *buffer, PyObject *mode,
                                 Py_ssize_t width,
                                 Py_ssize_t height)``

    Returns a new ``Image`` instance, initialized with the contents of
    the ``buffer`` object.

``int PyObject_CheckImage(PyObject *obj)``

    Returns true if the object ``obj`` implements a sufficient subset
    of the image protocol to be accepted by the functions defined
    below, even if its class is not a subclass of ``ImageMixin``
    and/or ``Image``.  Currently it simply checks for the existence
    and correctness of the attributes ``mode``, ``size`` and
    ``buffer``.
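
Expressed in Python rather than C, a check of this kind might look
like the sketch below.  What counts as "correctness" is an assumption
here (a known mode, a size of two values >= 1 and a non-empty buffer);
``looks_like_image`` and ``ForeignImage`` are hypothetical names.

```python
def looks_like_image(obj, known_modes):
    """Return True if obj has plausible mode, size and buffer attributes."""
    try:
        width, height = obj.size
        return (
            obj.mode in known_modes
            and width >= 1
            and height >= 1
            and len(obj.buffer) > 0
        )
    except (AttributeError, TypeError, ValueError):
        return False


class ForeignImage:
    """Hypothetical third-party class that does not inherit from ImageMixin."""

    def __init__(self, mode, size, buffer):
        self.mode, self.size, self.buffer = mode, size, buffer


modes = {"L", "RGB"}
print(looks_like_image(ForeignImage("RGB", (2, 1), bytearray(6)), modes))
print(looks_like_image(object(), modes))
```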

| ``PyObject* PyImage_GetMode(PyObject *image)``
| ``Py_ssize_t PyImage_GetWidth(PyObject *image)``
| ``Py_ssize_t PyImage_GetHeight(PyObject *image)``
| ``int PyImage_Clip(PyObject *image)``
| ``PyObject* PyImage_Split(PyObject *image)``
| ``PyObject* PyImage_GetBuffer(PyObject *image)``
| ``int PyImage_AsBuffer(PyObject *image, const void **buffer,
                         Py_ssize_t *buffer_len)``

    These functions are equivalent to their corresponding Python
    attributes or methods; the image memory can be accessed only with
    the GIL and a reference to the image or its buffer held, and extra
    care should be taken for modes with more than 8 bits per
    component: the data is stored in native byte order and may
    **not** be aligned on 2 or 4 byte boundaries.


Examples
========

A few examples of common operations with the new ``Image`` class and
protocol::

    # create a new black RGB image of 6x9 pixels
    rgb_image = imageop.Image(imageop.RGB, (6, 9))

    # same as above, but initialize the image to bright red
    rgb_image = imageop.Image(imageop.RGB, (6, 9), color=(255, 0, 0))

    # convert the image to YCbCr
    yuv_image = imageop.Image(imageop.JPEG_YV12, source=rgb_image)

    # read the value of a pixel and split it into three ints
    r, g, b = rgb_image[x, y]

    # modify the magenta component of a pixel in a CMYK image
    cmyk_image[x, y].m = 13

    # modify the Y (luma) component of a pixel in a *YV12 image and
    # its corresponding subsampled Cr (red chroma)
    yuv_image.y[x, y] = 42
    yuv_image.cr[x // 2, y // 2] = 54

    # iterate over an image
    for line in rgb_image:
        for pixel in line:
            # swap red and blue, and set green to 0
            pixel.value = pixel.b, 0, pixel.r

    # find the maximum value of the red component in the image
    max_red = max(pixel.r for pixel in rgb_image.pixels())

    # count the number of colors in the image
    num_of_colors = len(set(tuple(pixel) for pixel in image.pixels()))

    # copy a block of 4x2 pixels near the upper right corner of an
    # image and paste it into the lower left corner of the same image
    image[:4, -2:] = image[-6:-2, 1:3]

    # create a copy of the image, except that the new image can have a
    # different (usually empty) info dict
    new_image = image[:]

    # create a mirrored copy of the image, with the left and right
    # sides flipped
    flipped_image = image[::-1, :]

    # downsample an image to half its original size using a fast, low
    # quality operation and a slower, high quality one:
    low_quality_image = image[::2, ::2]
    new_size = image.size.width // 2, image.size.height // 2
    high_quality_image = imageop.Image(size=new_size, source=image)

    # direct buffer access
    rgb_image[0, 0] = r, g, b
    assert tuple(rgb_image.buffer[:3]) == (r, g, b)


Backwards Compatibility
=======================

There are three areas touched by this PEP where backwards
compatibility should be considered:

* **Python 2.6**: new classes and objects are added to the ``imageop``
  module without touching the existing module contents; new methods
  and attributes will be added to ``Tkinter.PhotoImage`` and its
  ``__getitem__`` and ``__setitem__`` methods will be modified to
  accept integers, tuples and slices (currently they only accept
  strings).  All the changes provide a superset of the existing
  functionality, so no major compatibility issues are expected.

* **Python 3.0**: the legacy contents of the ``imageop`` module will
  be deleted, according to PEP 3108; everything defined in this
  proposal will work like in Python 2.x with the exception of the
  usual 2.x/3.0 differences (e.g. support for ``long`` integers and
  for interpreting ``str`` instances as sequences of bytes will be
  dropped).

* **external libraries**: the names and the semantics of the standard
  image methods and attributes are carefully chosen to allow some
  external libraries that manipulate images (including at least PIL,
  wxPython and pygame) to implement the new protocol in their image
  classes without breaking compatibility with existing code.  The only
  blatant conflicts between the image protocol and NumPy arrays are
  the value of the ``size`` attribute and the coordinates order in the
  ``image[x, y]`` expression.


Reference Implementation
========================

If this PEP is accepted, the author will provide a reference
implementation of the new classes in pure Python (that can run in
CPython, PyPy, Jython and IronPython) and a second one optimized for
speed in Python and C, suitable for inclusion in the CPython standard
library.  The author will also submit the required Tkinter patches.
All the code will be available in a version for Python 2.x and a
version for Python 3.0 (it is expected that the two versions will be
very similar and the Python 3.0 one will probably be generated almost
completely automatically).


Acknowledgments
===============

The implementation of this PEP, if accepted, is sponsored by Google
through the Google Summer of Code program.


Copyright
=========

This document has been placed in the public domain.


-- 
Lino Mastrodomenico
E-mail: l.mastrodomenico at gmail.com

