[Python-3000] Support for PEP 3131
Ka-Ping Yee
python at zesty.ca
Fri May 25 21:29:50 CEST 2007
On Fri, 25 May 2007, Josiah Carlson wrote:
> Apples and oranges to be sure, but there are no other statistics that
> anyone else is able to offer about use of non-ascii identifiers in Java,
> Javascript, C#, etc.
Let's see what we can find. I made several attempts to search for
non-ASCII identifiers using google.com/codesearch and here's what I got.
Java or JavaScript (total: about 1480000 files found with "lang:java .")
------------------------------------------------------------------------
1. lang:java ^[^"]*[^\s!-~].*= (assignment to non-ASCII name)
2 files with a UTF-8 BOM at the beginning; 1 file with non-ASCII
in comments; 5 files with non-ASCII in strings; 2 files with
non-ASCII elsewhere in source code:
1. moin-1.5.8/wiki/htdocs/applets/moinFCKplugins/.../lang/en.js
UTF-8 BOM in middle of file.
2. SMSkyline.wdgt/fr.lproj/localizedStrings.js
UTF-16 BOM at the beginning of a UTF-8 file. (!)
2. lang:java ^[^"]*[^\s!-~]\w*\. (method call on non-ASCII name)
2 files with a UTF-8 BOM at the beginning; 13 files with non-ASCII
in comments; 5 files with non-ASCII in strings; 3 files with
non-ASCII elsewhere in source code:
1. struts-2.0.6/src/core/src/.../Editor2Plugin/FindReplaceDialog.js
UTF-8 BOM in middle of file.
2. moin-1.5.8/wiki/htdocs/applets/moinFCKplugins/.../lang/en.js
UTF-8 BOM in middle of file.
3. chickenfoot/chickenscratch/tests/findTest.js
Non-breaking spaces embedded in indentation.
3. lang:java ^\s*class.*[^\s!-~] (class declaration)
2 files with non-ASCII in strings; no other hits.
4. lang:javascript ^\s*function.*[^\s!-~] (function declaration)
1 non-JavaScript file; 9 files with non-ASCII in comments;
1 file with non-ASCII in strings; 1 file with non-ASCII elsewhere
in source code:
1. google_hacks_3E_code/hack_61/zoom-google.user.js
Thin spaces (U+2009) embedded in code.
C# (total: about 266000 files found with "lang:c# .")
-----------------------------------------------------
5. lang:c# ^[^"]*[^\s!-~].*= (assignment to non-ASCII name)
5 non-C# files; 6 files with a UTF-8 BOM at the beginning;
9 files with non-ASCII in comments; 7 files with non-ASCII
elsewhere in source code:
1. blam-1.8.4pre2/src/PreferencesDialog.cs
Non-breaking spaces in the middle of the line.
2. BildschirmTennis2/BildschirmTennis2/Program1.cs
Identifier containing non-ASCII.
3. Ukazkova reseni CS - Prakticke priklady/.../Exp_2_03/Class2.cs
Identifier containing non-ASCII.
4. Rule.cs
Identifier containing non-ASCII.
5. SharpIntroduction/ComplexExample/Zv?????tko.cs
Identifier containing non-ASCII.
6. WitherwynWebDist/Witherwyn/Map.cs
"Times" character in expression, probably a typo.
7. PDFsharp/XGraphicsLab/MainForm.cs
Identifier containing non-ASCII.
6. lang:c# ^[^"]*[^\s!-~]\w*\( (function call on non-ASCII name)
4 files with non-ASCII in comments; 6 files with non-ASCII
elsewhere in source code:
1. BildschirmTennis2/BildschirmTennis2/Program1.cs
Identifier containing non-ASCII.
2. SharpIntroduction/ComplexExample/Program.cs
Identifier containing non-ASCII.
3. Ukazkova reseni CS - Prakticke priklady/.../Exp_2_03/Class1.cs
Identifier containing non-ASCII.
4. ActiveRecord/Generator/.../RelationshipBuilderTestCase.cs
Identifier containing non-ASCII, almost certainly a typo.
5. Sample1/Sample1/Program.cs
Identifier containing non-ASCII.
6. Kap11/03/TEXT.CS
Identifier containing non-ASCII.
7. lang:c# ^\s*class.*[^\s!-~] (class declaration)
1 hit:
1. Kap06/03/Kalen.cs
Identifier containing non-ASCII.
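For anyone who wants to check the patterns locally: the character class
[^\s!-~] used in every query above matches any character that is neither
whitespace nor printable ASCII ('!' through '~'). Here is a rough Python
sketch of query 1. It assumes Python's re module with re.ASCII (so that \s
covers only ASCII whitespace) behaves like the Code Search engine for this
pattern, which I have not verified. The flags_line helper and the sample
lines are made up for illustration and do not come from the search results.

```python
import re

# Query 1: no double quote before the first character that is neither
# whitespace nor printable ASCII ('!'..'~'), with an '=' somewhere after it.
# re.ASCII restricts \s to ASCII whitespace (an assumption about how the
# search engine treats \s), so non-breaking spaces count as non-ASCII.
ASSIGN_NON_ASCII = re.compile(r'^[^"]*[^\s!-~].*=', re.ASCII)


def flags_line(line):
    """Return True if the pattern would report this line."""
    return ASSIGN_NON_ASCII.search(line) is not None


# Hypothetical sample lines (escapes used to keep this file pure ASCII).
samples = [
    'int count = 0;',                  # pure ASCII
    'int gr\u00f6\u00dfe = 0;',        # non-ASCII identifier
    'String s = "gr\u00f6\u00dfe";',   # non-ASCII only after a double quote
    'int x\u00a0= 1;',                 # non-breaking space before '='
    '// gr\u00f6\u00dfe = width',      # non-ASCII in a comment
]

for line in samples:
    print(flags_line(line), ascii(line))
```

Only the first and third samples are rejected. The comment and the
non-breaking space are both reported, which is the same kind of false
positive that shows up in the hit lists above and has to be weeded out
by hand.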
In summary, that means out of around 1.7 million Java, JavaScript,
and C# files that are indexed by Google Code Search, the only use
of non-ASCII identifiers I could find was in 12 C# files, and one
of those 12 occurrences is almost certainly a mistake.
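The tally behind those figures can be reproduced from the numbers quoted
above. A small sketch, using only the two file totals from the section
headers and the per-query counts of hits annotated "Identifier containing
non-ASCII" (the variable names are mine):

```python
# File totals quoted in the two section headers.
java_js_files = 1_480_000
csharp_files = 266_000
total_files = java_js_files + csharp_files

# Hits annotated "Identifier containing non-ASCII", keyed by C# query number.
identifier_hits = {5: 5, 6: 6, 7: 1}
total_hits = sum(identifier_hits.values())

print(total_files)  # 1746000
print(total_hits)   # 12
print(round(total_hits / total_files * 1_000_000, 1))  # hits per million files
```

Note that this counts occurrences per query; a file that turns up under
more than one query is counted each time.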
-- ?!ng