What confuses the newbies: An unscientific study

Nick Mathewson 9nick9m at alum.mit.edu
Sat Jul 21 00:00:57 EDT 2001


Every so often, people on comp.lang.python get into flamew^W heated
discussions about which aspects of the language ought to be simplified
for the benefit of non-programmers.

I thought it might be cool for me to use suck(1) to pull down all the
articles I could from comp.lang.python, find all the ones that
represented questions about the language, and tabulate which aspects
of the language people were misunderstanding.

CAVEATS:
  - This isn't scientific.  This is just what I found, on a single
    newsserver, as interpreted by me.  I didn't use any formal
    protocol in evaluating which issues were which.

  - These people definitely don't represent non-programmers.  These
    are just the people who asked questions on usenet.  Many of them
    probably have prior programming experience; _all_ of them at least
    know how to use email or a newsreader.

  - I skipped over questions about embedding, the Python C API, and
    other really advanced topics.  I also skipped over questions
    (e.g., "what was the access keyword for?") that didn't represent
    misunderstanding of Python.

  - I might have missed a few by mistake.  I only looked at messages
    which began new threads that were still available on my server at
    about 3 this afternoon.  I'm pretty sure I remember reading some
    stuff a couple of weeks ago that wasn't on here...

  - I'm mind-reading in some cases here... or rather, I'm interpreting
    the way that the users must have _expected_ the language to work
    in order to ask the questions or write the sample code.

  - Some messages are counted twice, if they show more than one
    piece of confusion.  

WHY I THINK THIS IS NEAT NEVERTHELESS:

  - All of these issues are real issues that confused at least one
    person, and may confuse others.  This isn't stuff that I imagine
    might confuse me, or stuff that I seem to remember having confused
    me once.  (I'm an easily confused kind of guy, but I'm leaving my
    confusion out of this.)

  - Many of them may be addressable by improving tools, tweaking
    the libraries, or twiddling error messages.

  - Even though non-programmer friendliness is an area where Python is
    intended to shine long-term, newbie-friendliness is nice too.

AND FINALLY:

  - If you recognize yourself as a poster of one of these questions, I
    hope you won't be offended.  I'm taking your confusion as a sign
    of possible weakness in Python, not in you. :)

Ok, here we go.  The most common confusions, as shown on my server:

1. NOT KNOWING WHERE MODULE DOCUMENTATION IS.

   A good fifteen or so people asked questions of the form, "How do
   I...", "Is there a module that can...", etc.  All of these
   questions were answered with references to the library reference,
   or to the Vaults.

   Perhaps once more of the Python Cookbook has been compiled, these
   things will be easier.

   BTW, the most common requests (in no particular order) were
   satisfied by XML/SGML parsing, windowing toolkits, popen,
   timeoutsocket, and StringIO.  Other requests turned out to be for
   chr and popen2.
  
 * A related kind of confusion: in 4 or so of these cases, users knew
   about a function that did _almost_ the right thing, and [instead
   of looking for the function they _really_ wanted] they searched
   for a way to make the almost-right function do the right thing.

   An example is somebody who knows that str(1.0/3) converts
   float->str, and who tries to pass str a precision parameter instead
   of using the % operator.
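
   A minimal sketch of that case, assuming a current Python 3
   interpreter (this post predates it, but the % operator behaves the
   same way here).  The variable names are only illustrative.

```python
# str() chooses its own precision; the % operator lets you pick one.
third = 1.0 / 3

default_text = str(third)       # precision decided by Python
short_text = "%.3f" % third     # exactly three digits after the point

print(default_text)
print(short_text)
```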

2. NOT SURE HOW VARIABLES WORK

   I counted 5 cases of people expecting variables to work in ways
   they didn't.  They broke down like this:
      A. Thinking that 'global' means world-global, not module-global.
      B. Confusing 'outer' self with 'inner' self in helper class.
      C. Thinking that variables in a function are global until
         assigned, and local thereafter.
      D. Writing 'x="b"; a.x' instead of 'a.b'
      E. Writing 'x="fn"; x()' instead of 'x=fn; x()'
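
   Cases D and E can be sketched roughly as follows, assuming a
   current Python 3 interpreter.  The class Account and the function
   fn are hypothetical, invented only for this illustration.

```python
class Account:
    """Hypothetical class, used only for illustration."""
    def __init__(self):
        self.b = 42

def fn():
    """Hypothetical function, used only for illustration."""
    return "called"

a = Account()

# Case D: x holds the *name* of an attribute.  a.x would look for an
# attribute literally called 'x'; getattr() looks up the name in x.
x = "b"
value = getattr(a, x)           # same result as a.b

# Case E: a string is not callable.  Bind the function object itself,
# or look the name up in a dictionary you control.
f = fn
direct = f()
table = {"fn": fn}
by_name = table["fn"]()
```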

   There wasn't a lot of pattern here, and I don't think much can be
   done to make this area of the language easier to understand...

   ...except for the fact that several of these errors fell into the
   next category, which is:

3. GOT ERROR MESSAGES THEY DIDN'T UNDERSTAND

   At least 4 people submitted exceptions that, when translated into
   English, would have told them how to solve their problems.  One
   other submitted an exception that told them about a problem which
   they understood, but left them unsure how to solve it.

   The biggest culprit was AttributeError.  While object instances now
   give a helpful "'X' instance has no attribute 'foo'", the old form
   "AttributeError: foo" still appears for many types.  Two people
   were not sure what this meant.

   One person didn't understand what 'TypeError: call of non-function
   (type string)' meant.

   One person knew what 'TypeError: cannot add type "int" to string'
   meant, but didn't know where to go from there.
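
   For the last case, a sketch of two common ways forward, assuming a
   current Python 3 interpreter.  The exact wording of the exception
   differs between Python versions, so the code does not rely on it.

```python
count = 3

try:
    message = "count: " + count     # mixing str and int raises TypeError
except TypeError as exc:
    error_text = str(exc)           # wording varies between versions
    message = None

# Two ways forward: convert explicitly, or use % formatting.
converted = "count: " + str(count)
formatted = "count: %d" % count
```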

4. EXPECTED LIBRARIES AND BUILTINS TO ACT DIFFERENTLY

   Four people, at least, expected libraries included with standard
   Python to work differently than they actually do.   
   
   One person expected 'remove' to remove a directory.

   One person expected the xml library to set the file name without
   using an InputSource object.

   Others had issues with windows stuff that I didn't understand. :)
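
   The 'remove' case, sketched under the assumption that the poster
   meant os.remove.  The file and directory names are made up, and
   everything happens inside a temporary directory.

```python
import os
import tempfile

# Work inside a throwaway directory so nothing real is touched.
base = tempfile.mkdtemp()
file_path = os.path.join(base, "example.txt")
dir_path = os.path.join(base, "subdir")

with open(file_path, "w") as fp:
    fp.write("hello\n")
os.mkdir(dir_path)

os.remove(file_path)    # removes a file
os.rmdir(dir_path)      # removes an empty directory

leftover = os.listdir(base)
os.rmdir(base)          # clean up the temporary directory itself
```

   For a directory that still has contents, shutil.rmtree is the
   usual tool.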

5. OTHER ISSUES:

   2 people expected strings to act mutably:
     1 expected string.strip to have side effects
     1 expected string slices to be assignable
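
   A sketch of both string cases, assuming a current Python 3
   interpreter, where strip is a string method rather than a function
   in the string module.

```python
text = "  hello  "
stripped = text.strip()     # returns a new string; text is unchanged

# Slice assignment is not allowed on strings; build a new one instead.
word = "hello"
try:
    word[0:1] = "J"
    assigned = True
except TypeError:
    assigned = False
changed = "J" + word[1:]
```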

   2 people had problems writing re patterns.
     1 didn't know to use raw strings
     1 didn't know to use any quotes at all.
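
   Why raw strings matter for re, as a small sketch.  The pattern and
   sample text are invented for illustration.

```python
import re

# In an ordinary string "\b" is a backspace character.  In a raw
# string r"\b" is a backslash plus 'b', which re reads as a word
# boundary.
plain = "\bcat\b"
raw = r"\bcat\b"

plain_match = re.search(plain, "a cat sat")
raw_match = re.search(raw, "a cat sat")
```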

   2 people had misinstalled Python.
 
   2 people had issues with aliasing lists
     1 tried to do a lst.remove within a loop over lst
     1 expected [[...]]*6 to perform a deep copy
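
   Both list cases in a short sketch, with made-up data.  The
   multiplier here is 2 rather than 6 to keep the output small.

```python
# Removing from a list while looping over it skips elements.
lst = [1, 2, 2, 3]
for item in lst:
    if item == 2:
        lst.remove(item)        # one of the 2s survives

# Looping over a copy avoids the problem.
safe = [1, 2, 2, 3]
for item in safe[:]:
    if item == 2:
        safe.remove(item)

# Multiplying a list of lists repeats references, not copies.
rows = [[0] * 3] * 2
rows[0][0] = 9                  # shows up in every row

independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 9           # only the first row changes
```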
 
   2 people were bitten by import
     1 expected that a second import of a module would reload it
        (why doesn't it give a warning?)
     1 imported a module with hidden side-effects
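
   A sketch of the import behaviour, assuming a current Python 3
   interpreter, where reloading is spelled importlib.reload (at the
   time of this post it was the builtin reload).  The json module is
   just a convenient stand-in.

```python
import importlib
import sys
import json

first = sys.modules["json"]

import json                     # second import: the code is not re-run
second = json

reloaded = importlib.reload(json)   # explicitly re-run the module
```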

   2 people didn't know about unpacking tuples
     1 person thought that, as in perl, 'x = funcReturningTuple()'
       would set x to the first element of the tuple.
     1 person just didn't know you could unpack tuples.
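
   Tuple unpacking in brief.  func_returning_tuple is a hypothetical
   stand-in for the function in the question.

```python
def func_returning_tuple():
    """Hypothetical stand-in for the function in the question."""
    return (1, 2)

x = func_returning_tuple()                  # x is the whole tuple
first, second = func_returning_tuple()      # unpack into two names
only_first = func_returning_tuple()[0]      # take just the first element
```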

   2 people were confused about floating point (fp).
     1 person thought that int/int should yield a float
     1 person didn't know that fp was inexact
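
   A sketch of both points, assuming a current Python 3 interpreter.
   Note that in Python 3 the / operator performs true division even
   on ints, unlike the interpreters discussed in this post, and // is
   the floor-division operator.

```python
import math

total = 0.1 + 0.2
exact = (total == 0.3)              # binary fp cannot represent 0.1 exactly
close = math.isclose(total, 0.3)    # compare with a tolerance instead

quotient = 7 / 2                    # true division in Python 3
floored = 7 // 2                    # floor division
```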

   1 person used 'varname' when they meant `varname`
   1 person used \\ as a separator and had trouble porting to unix
  >1 person wrote 'print "foo\n"' when they probably only wanted one
     newline.
   1 person tried to use 2.0 syntax (reading from the 2.0 manual)
     while using 1.5.
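
   Three of the one-off items above, sketched for a current Python 3
   interpreter.  Backquotes were shorthand for repr() and no longer
   exist in Python 3, and print is now a function; the names and path
   components below are made up.

```python
import os

varname = 42
quoted = 'varname'          # just the seven-character string
shown = repr(varname)       # what `varname` meant in older Pythons

# os.path.join uses whatever separator the platform expects.
path = os.path.join("data", "logs", "out.txt")

print("foo")                # print already appends one newline
```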

You-should-have-seen-me-when-I-was-learning'ly yrs,

-- 
 Nick Mathewson    <9 nick 9 m at alum dot mit dot edu> 
                     Remove 9's to respond.  No spam.


