[Patches] [Patch #101120] add .get() to cgi.FieldStorage and cgi.FormContentDict

noreply@sourceforge.net noreply@sourceforge.net
Thu, 24 Aug 2000 05:57:10 -0700


Patch #101120 has been updated. 

Project: 
Category: library
Status: Open
Summary: add .get() to cgi.FieldStorage and cgi.FormContentDict

Follow-Ups:

Date: 2000-Aug-08 20:28
By: ping

Comment:
This actually adds all of the auxiliary dictionary
methods to FormContentDict (e.g. copy, update) since
it makes FormContentDict derive from UserDict.

__version__ has been incremented to 2.3.
-------------------------------------------------------

Date: 2000-Aug-15 21:38
By: tim_one

Comment:
Ping, if you don't make this a full patch (i.e., including doc changes, and tests if at all possible) Soon, I have to Postpone it.  Also please it assign it to someone capable of reviewing it (not me -- don't know anything about cgi).
-------------------------------------------------------

Date: 2000-Aug-16 06:36
By: tim_one

Comment:
Assigned to Barry (sorry, Barry!) for sanity-checking.

Ping pointed out in email that the module in question already documents that it "acts like a dict", and all the patch does is make that true in more cases.

Looks benign to me, but I've never used the module, so it could stand a quick scan from someone who isn't a total idiot.  Jeremy is probably the most natural choice for looking at this, but he won't be useful to us again until just a few days before 2.0b1 is scheduled to ship.  Don't want to wait that long.
-------------------------------------------------------

Date: 2000-Aug-16 06:41
By: tim_one

Comment:
Reassigned to Moshe, cuz Ping said he volunteered.  Way to go, Moshe!  If this release ever goes it, you'll be the only reason it does <wink>.

-------------------------------------------------------

Date: 2000-Aug-16 07:28
By: moshez

Comment:
OK, I gave it a short run around and it looked quite
all right. I do have on qualm, Ping: it's a bit awkward
to use .get() like that, because you still have to check
whether to use .value or not.

Here's an example of what I mean
(I hope if I surround it with PRE tags it will be protected):

<PRE>
print cgi.FieldStorage().get("hello", 
             cgi.MiniFieldStorage("hello", "world)).value
</PRE>

Because otherwise .get() doesn't buy you much. However, 
other then that it's great -- accept that there still
aren't docos and test cases. (So I've kept this "Open")

-------------------------------------------------------

Date: 2000-Aug-16 09:16
By: ping

Comment:
Okay, here's the new patch.  There is quite a bit more now:

1.  Fixed parse_qsl to honour the keep_blank_values argument.  Previously FieldStorage() would contain empty fields despite keep_blank_values being zero, contrary to intent.  Now empty fields will not be present unless keep_blank_values is supplied.  This could break code that expects fields to always be present, but it is also more in keeping with the way keep_blank_values was supposed to work.  (If compatibility is enough of a concern, we can default keep_blank_values to 1 to maintain the old behaviour.)

2.  New get() method on FieldStorage looks up the string value in the 'value' attribute.  When a multiple values are present, it looks up the 'value' attribute within each MiniFieldStorage and produces a list of strings.

3.  FormContentDict is derived from UserDict (same as the original patch).

4.  More tests: new tests for FieldStorage, since there were none before, and tests for the get() method.

5.  Documentation updates: describe the get() method, the keep_blank_values option, and simplify the examples with the convenient use of the get() method.  Fix a little formatting.  Standardize the capitalization of "Content-Type".  Reword the descriptions of parse_multipart and parse_header to make them more comprehensible.

Phew.
-------------------------------------------------------

Date: 2000-Aug-16 09:35
By: moshez

Comment:
OK, this one looks fine to me! I didn't test it again,
though (I assume Ping did) -- but the API seems much improved. I've even had a look over the documentation
and it looks great!

-------------------------------------------------------

Date: 2000-Aug-22 01:54
By: tim_one

Comment:
Ping, are you going to check this in?  If not, please assign it to someone else for checkin.
-------------------------------------------------------

Date: 2000-Aug-22 04:41
By: moshez

Comment:
Actually, it's now waiting for someone's pronouncement -- Guido had a problem with .get() and __getitem__ not being equivalent (the first does an implicit .value). Ping and I thought it best that way, but if the BDFL has strong feelings otherwise, he should pronounce. Assigned to him for final verdict. 
-------------------------------------------------------

Date: 2000-Aug-22 20:46
By: ping

Comment:
On Fri, 18 Aug 2000, Guido van Rossum wrote:
> If I understand the discussion below correctly, to get the
> value for 'foo' you must use
> 
>   x['foo'].value
> 
> but now you can also use
> 
>   x.get('foo')

This is true.

> ???  That's inconsistent compared to how these two behave for regular
> dictionaries!

I mulled this over a while when doing the patch.  Initially, i did
make form.get(x) behave exactly like form[x], and proceeded to try
to make the FieldStorage object behave as much like a dictionary as
possible (since the docs say it can be "accessed like a dictionary").

FormContentDict and its descendants clearly behave like a dictionary
since they are derived from UserDict.  But as i investigated
FieldStorage further, i found that it actually lacks lots of
dictionary behaviour: no update(), no clear(), no copy(), no items(),
no values().  The only dictionary-like semantics we really care about
are [key], keys(), and has_key().  Implementing items() and values()
would require an uncertain choice, since the FieldStorage represents
a multimap: do multi-valued fields appear as multiple key-value pairs
with the same key, or a single pair with a list for its value?

In the end, i decided to abandon true dictionary-like behaviour,
because the thing being represented is a bit too much more than a
dictionary.  By far the most common use of the FieldStorage (and
even the entire cgi module) is just to decode simple strings from
single-valued form fields, and the interface should do its best to
facilitate that common use.  I decided it was okay to break
consistency with regular dictionaries (a) as long as the behaviour
is clearly documented and (b) since it doesn't fully satisfy one's
expectations of a dictionary anyway.

Moshe's response drove home the point when he demonstrated that the
advantage of get() -- to provide a default when a value is missing --
is overwhelmed by the awkwardness of the .value interface.  If get()
returns a MiniFieldStorage, then to default the "spam" field to
"eggs" i have to say

    field = form.get("spam", "eggs")
    if type(field) is type(""):
        process(field)
    else:
        process(field.value)

or 
    field = form.get("spam", None)
    if field is None:
        process("eggs")
    else:
        process(field.value)

or

    field = form.get("spam", cgi.MiniFieldStorage("spam", "eggs"))
    process(field.value)

which are hardly any better than

    if form.has_key("spam"):
        process(form["spam"].value)
    else:
        process("eggs")

at all!

If get() returns a string, then i can just say

    process(form.get("spam", "eggs"))

and i can be on my way.


I have personally stayed away from FieldStorage because it's so
annoying to use; having to look up the "value" attribute on every
form field makes programs verbose and awkward.  This gets worse
when a field has multple values.  In fact, the usage example
provided in Doc/lib/libcgi.tex is a perfect demonstration:

    username = form["username"]
    if type(username) is type([]):
        usernames = ""
        for item in username:
            if username:
                usernames = usernames + "," + item.value
            else:
                usernames = item.value
    else:
        usernames = username.value

The above can be shortened somewhat with map(), but with the new
get() method, this can simply be written in the obvious way:
    value = form.get("username", "")
    if type(value) is type([]):
        usernames = ",".join(value)
    else:
        usernames = value

(The latter is what you would do with a FormContentDict, which
is why i find it so vastly more convenient than the FieldStorage.
It's slightly better, though: it also painlessly handles the case
where the "username" field is not present.)


As mentioned earlier, i'm more concerned about the default setting
for keep_blank_values.  The message i wrote providing a complete
explanation of this issue is included below so you don't have to
search through old e-mail to find it.


-- ?!ng


-------- the headers of the original message --------
>From ping@lfw.org Tue Aug 22 12:18:57 2000
Date: Wed, 16 Aug 2000 02:31:48 -0700 (PDT)
From: Ka-Ping Yee <ping@lfw.org>
To: Moshe Zadka <moshez@math.huji.ac.il>, bwarsaw@beopen.com,
     tpeters@beopen.com
Subject: Re: [Patch #101120] add .get() to cgi.FieldStorage and
    cgi.FormContentDict

-------- the message body, with some edits --------
Hi, guys.

This patch includes a "fix" to make cgi.parse_qsl honour its
"keep_blank_values" argument.  This fix might require a bit more
discussion since it could break compatibility, so i thought i
would air it for your consideration.


BACKGROUND
----------

cgi.FieldStorage(), cgi.parse(), cgi.parse_qs(), and cgi.parse_qsl()
all accept an optional "keep_blank_values" argument which is supposed
to indicate whether form fields with empty strings as values should
be included in the result.  In reality, only cgi.parse_qs() honours
this flag (cgi.parse() is okay because it uses cgi.parse_qs()).
cgi.parse_qsl() totally ignores it, and always includes all values
including empty ones.  cgi.FieldStorage(), which uses cgi.parse_qsl(),
similarly does not honour the "keep_blank_values" flag.

The patch contains a fix to make cgi.parse_qsl() honour this flag.


FOR
---

It's clear that this was the intent of the "keep_blank_values"
argument.  The doc string for parse_qsl says:

    """Parse a query given as a string argument.

        Arguments:

        qs: URL-encoded query string to be parsed

        keep_blank_values: flag indicating whether blank values in
            URL encoded queries should be treated as blank strings.­­
            A true value indicates that blanks should be retained as­
            blank strings.  The default false value indicates that
            blank values are to be ignored and treated as if they were
            not included.

        strict_parsing: flag indicating what to do with parsing errors.
            If false (the default), errors are silently ignored.
            If true, errors raise a ValueError exception.

       Returns a list, as God intended.
    """

Its current disregard for "keep_blank_values" clearly contradicts
this doc string.

The doc string for FieldStorage.__init__ says:

        """Constructor.  Read multipart/* until last part.

        Arguments, all optional:

        fp              : file pointer; default: sys.stdin
            (not used when the request method is GET)

        headers         : header dictionary-like object; default:
            taken from environ as per CGI spec

        outerboundary   : terminating multipart boundary
            (for internal use only)

        environ         : environment dictionary; default: os.environ

        keep_blank_values: flag indicating whether blank values in
            URL encoded forms should be treated as blank strings.­­
            A true value indicates that blanks should be retained as­
            blank strings.  The default false value indicates that
            blank values are to be ignored and treated as if they were
            not included.

        strict_parsing: flag indicating what to do with parsing errors.
            If false (the default), errors are silently ignored.
            If true, errors raise a ValueError exception.

        """

...and the current behaviour of FieldStorage.__init__ contradicts
what it says here about "keep_blank_values".


AGAINST
-------

Existing code which uses FieldStorage and relies on its current
behaviour could break.  If a form field is left blank by the user
and the form is submitted, that key will not appear in the
FieldStorage -- and an attempt to index it with form[key] will 
produce a KeyError where previously it yielded a MiniFieldStorage
containing an empty string.

The existing TeX documentation for the cgi module does not mention
empty values or the keep_blank_values argument at all, so to those
who have only read the library reference manual, this could be a
surprise.


OPTIONS
-------

The reason this never showed up before is that there have never
been *any* tests for FieldStorage in test_cgi.py!  Naughty, naughty.

One easy way to maintain compatibility is to change FieldStorage so
that the default value for keep_blank_values is 1 (previously this
was 0 but the FieldStorage *acted* as though it were 1).




-------------------------------------------------------

Date: 2000-Aug-23 16:35
By: jhylton

Comment:
Since we are in feature freeze, let's not worry about whether we should have a get method or not.  Please limit this patch to bug fixes and re-submit.  I'll be happy to accept.

Let's do a more thorough job of revising this module for 2.1.  It looks amazingly complicated to me.

-------------------------------------------------------

Date: 2000-Aug-24 08:27
By: ping

Comment:
Hi, Jeremy.

I actually think the motivations for these changes are pretty clear.  They directly address two things:

1.  The doc strings say keep_blank_values is supposed to control whether blank values are kept.  This currently doesn't work at all.

2.  The TeX documentation says the FieldStorage object is supposed to behave like a dictionary.  Dictionaries are expected to provide a .get() method, but the FieldStorage doesn't.

Both of these things are basically bugs.  The only decisions that remain have to do with backward compatibility when we fix #1 -- this is easily addressed by defaulting keep_blank_values to 1 -- and with determining the most effective way to implement .get() when we fix #2.

The implementation of .get() in this patch greatly simplifies most CGI scripts.  As shown in the examples in my previous comment, it can reduce four or five lines of mucking with the .value attribute to a single call.

-- ?!ng
-------------------------------------------------------

Date: 2000-Aug-24 12:57
By: moshez

Comment:
Guido:

Adding the functionality you want is great, but it shouldn't be called
get().  Call it getvalue() and I'm happy.

The words "behave like a dictionary" have such a vague meaning that
lacking a get() method can't be seen as a violation.  However having
a get() method that does something different *is* a violation!                 
-------------------------------------------------------

-------------------------------------------------------
For more info, visit:

http://sourceforge.net/patch/?func=detailpatch&patch_id=101120&group_id=5470