[Python-3000] Support for PEP 3131

Ka-Ping Yee python at zesty.ca
Fri May 25 21:29:50 CEST 2007


On Fri, 25 May 2007, Josiah Carlson wrote:
> Apples and oranges to be sure, but there are no other statistics that
> anyone else is able to offer about use of non-ascii identifiers in Java,
> Javascript, C#, etc.

Let's see what we can find.  I made several attempts to search for
non-ASCII identifiers using google.com/codesearch and here's what I got.


Java or JavaScript (total: about 1480000 files found with "lang:java .")
------------------------------------------------------------------------

1.  lang:java ^[^"]*[^\s!-~].*=    (assignment to non-ASCII name)

    2 files with a UTF-8 BOM at the beginning; 1 file with non-ASCII
    in comments; 5 files with non-ASCII in strings; 2 files with
    non-ASCII elsewhere in source code:

    1.  moin-1.5.8/wiki/htdocs/applets/moinFCKplugins/.../lang/en.js
        UTF-8 BOM in middle of file.

    2.  SMSkyline.wdgt/fr.lproj/localizedStrings.js
        UTF-16 BOM at the beginning of a UTF-8 file. (!)


2.  lang:java ^[^"]*[^\s!-~]\w*\.  (method call on non-ASCII name)

    2 files with a UTF-8 BOM at the beginning; 13 files with non-ASCII
    in comments; 5 files with non-ASCII in strings; 3 files with
    non-ASCII elsewhere in source code:

    1.  struts-2.0.6/src/core/src/.../Editor2Plugin/FindReplaceDialog.js
        UTF-8 BOM in middle of file.

    2.  moin-1.5.8/wiki/htdocs/applets/moinFCKplugins/.../lang/en.js
        UTF-8 BOM in middle of file.

    3.  chickenfoot/chickenscratch/tests/findTest.js
        Non-breaking spaces embedded in indentation.


3.  lang:java ^\s*class.*[^\s!-~]    (class declaration)

    2 files with non-ASCII in strings; no other hits.


4.  lang:javascript ^\s*function.*[^\s!-~]   (function declaration)

    1 non-JavaScript file; 9 files with non-ASCII in comments;
    1 file with non-ASCII in strings; 1 file with non-ASCII elsewhere
    in source code:

    1.  google_hacks_3E_code/hack_61/zoom-google.user.js
        Thin spaces (U+2009) embedded in code.
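
The character class [^\s!-~] in these queries matches any character that
is neither whitespace nor printable ASCII ("!" through "~").  Here is a
minimal sketch of query 1 run locally in Python.  The sample lines and
the flags_line() helper are made up for illustration.  Python's re module
is not the Code Search engine, and re.ASCII is an assumption chosen so
that \s covers only ASCII whitespace and a non-breaking space counts as
a hit, as it apparently did in the results above.

```python
import re

# Query 1: a character that is neither ASCII whitespace nor printable
# ASCII, appearing before any double quote, with "=" later on the line.
# re.ASCII is an assumption: it keeps \s from matching U+00A0 and friends.
ASSIGN_NON_ASCII = re.compile(r'^[^"]*[^\s!-~].*=', re.ASCII)

def flags_line(line):
    """Return True if the line would be a hit for query 1 (hypothetical helper)."""
    return ASSIGN_NON_ASCII.search(line) is not None

samples = [
    'int gr\u00f6\u00dfe = 3;',        # non-ASCII identifier: hit
    'int size = 3;',                   # pure ASCII: no hit
    'String s = "gr\u00f6\u00dfe";',   # non-ASCII only inside a string: no hit
    '\ufeffvar x = 1;',                # BOM: hit, but no identifier involved
    '// gr\u00f6\u00dfe = width',      # comment: hit, a false positive
]
for line in samples:
    print(flags_line(line), ascii(line))
```

As the last two samples show, a hit only says that a non-ASCII character
precedes an "=" on the line, so each hit still has to be inspected to see
whether an identifier is involved.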


C# (total: about 266000 files found with "lang:c# .")
-----------------------------------------------------

5.  lang:c# ^[^"]*[^\s!-~].*=      (assignment to non-ASCII name)

    5 non-C# files; 6 files with a UTF-8 BOM at the beginning;
    9 files with non-ASCII in comments; 7 files with non-ASCII
    elsewhere in source code:

    1.  blam-1.8.4pre2/src/PreferencesDialog.cs
        Non-breaking spaces in the middle of the line.

    2.  BildschirmTennis2/BildschirmTennis2/Program1.cs
        Identifier containing non-ASCII.

    3.  Ukazkova reseni CS - Prakticke priklady/.../Exp_2_03/Class2.cs
        Identifier containing non-ASCII.

    4.  Rule.cs
        Identifier containing non-ASCII.

    5.  SharpIntroduction/ComplexExample/Zv?????tko.cs
        Identifier containing non-ASCII.

    6.  WitherwynWebDist/Witherwyn/Map.cs
        "Times" character in expression, probably a typo.

    7.  PDFsharp/XGraphicsLab/MainForm.cs
        Identifier containing non-ASCII.


6.  lang:c# ^[^"]*[^\s!-~]\w*\(    (function call on non-ASCII name)

    4 files with non-ASCII in comments; 6 files with non-ASCII
    elsewhere in source code:

    1.  BildschirmTennis2/BildschirmTennis2/Program1.cs
        Identifier containing non-ASCII.

    2.  SharpIntroduction/ComplexExample/Program.cs
        Identifier containing non-ASCII.

    3.  Ukazkova reseni CS - Prakticke priklady/.../Exp_2_03/Class1.cs
        Identifier containing non-ASCII.

    4.  ActiveRecord/Generator/.../RelationshipBuilderTestCase.cs
        Identifier containing non-ASCII, almost certainly a typo.

    5.  Sample1/Sample1/Program.cs
        Identifier containing non-ASCII.

    6.  Kap11/03/TEXT.CS
        Identifier containing non-ASCII.


7.  lang:c# ^\s*class.*[^\s!-~]    (class declaration)

    1 hit:

    1.  Kap06/03/Kalen.cs
        Identifier containing non-ASCII.
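
Many of the hits above are stray characters (BOMs, non-breaking spaces,
thin spaces, a "times" sign) rather than identifiers.  One rough way to
triage a hit is to look at the Unicode category of the offending
character.  The sketch below assumes Python's unicodedata module and a
hypothetical describe() helper; it is not necessarily how the results
above were classified.

```python
import unicodedata

def describe(ch):
    """Roughly label a non-ASCII character found in source (hypothetical helper)."""
    if ch == '\ufeff':
        return 'BOM'
    cat = unicodedata.category(ch)
    if cat == 'Zs':
        return 'space'    # e.g. U+00A0 no-break space, U+2009 thin space
    if cat.startswith('L'):
        return 'letter'   # could be part of an identifier
    return 'other'        # symbols such as U+00D7, the "times" sign

for ch in '\ufeff\u00a0\u2009\u00d7\u00f6':
    print('U+%04X' % ord(ch), describe(ch))
```

Only the 'letter' case is a candidate for a real non-ASCII identifier, and
even then the character may sit in a comment or a string.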


In summary, that means out of around 1.7 million Java, JavaScript,
and C# files that are indexed by Google Code Search, the only use
of non-ASCII identifiers I could find was in 12 C# files, and one
of those 12 occurrences is almost certainly a mistake.
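
As a rough scale check, the two totals quoted in the section headings can
be added up and compared with the number of hits.  This is plain
arithmetic on figures already in this message; the variable names are
only for illustration.

```python
java_js_files = 1480000   # total quoted for "lang:java ."
csharp_files = 266000     # total quoted for "lang:c# ."
hits = 12                 # C# files with non-ASCII identifiers

total = java_js_files + csharp_files
print(total)                                    # 1746000
print('about 1 file in %d' % (total // hits))   # about 1 file in 145500
```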


-- ?!ng

