PEP 4XX: pyzaa "Improving Python ZIP Application Support"

https://docs.google.com/document/d/1MKXgPzhWD5wIUpoSQX7dxmqgTZVO6l9iZZis8dnr...

PEP: 4XX
Title: Improving Python ZIP Application Support
Author: Daniel Holth <dholth@gmail.com>
Status: Draft
Type: Standards Track
Python-Version: 3.4
Created: 30 March 2013
Post-History: 30 March 2013, 1 April 2013

Improving Python ZIP Application Support

Python has had the ability to execute directories or ZIP-format archives as scripts since version 2.6. When invoked with a zip file or directory as its first argument the interpreter adds that directory to sys.path and executes the __main__ module. These archives provide a great way to publish software that needs to be distributed as a single file script but is complex enough to need to be written as a collection of modules.

This feature is not as popular as it should be, mainly because no one’s heard of it because it wasn’t promoted as part of Python 2.6, but also because Windows users don’t have a file extension (other than .py) to associate with the launcher.

This PEP proposes to fix these problems by re-publicising the feature, defining the .pyz and .pyzw extensions as “Python ZIP Applications” and “Windowed Python ZIP Applications”, and providing some simple tooling to manage the format.

A New Python ZIP Application Extension

The Python 3.4 installer will associate .pyz and .pyzw “Python ZIP Applications” with the platform launcher so they can be executed. A .pyz archive is a console application and a .pyzw archive is a windowed application, indicating whether the console should appear when running the app.

Why not use .zip or .py? Users expect a .zip file would be opened with an archive tool, and users expect .py to be opened with a text editor. Both would be confusing for this use case.

For UNIX users, .pyz applications should be prefixed with a #! line pointing to the correct Python interpreter and an optional explanation.

    #!/usr/bin/env python3
    # This is a Python application stored in a ZIP archive.
    (binary contents of archive)

As background, ZIP archives are defined with a footer containing relative offsets from the end of the file. They remain valid when concatenated to the end of any other file. This feature is completely standard and is how self-extracting ZIP archives and the bdist_wininst installer format work.

Minimal Tooling: The pyzaa Module

This PEP also proposes including a simple application for working with these archives: The Python Zip Application Archiver “pyzaa” (rhymes with “huzzah” or “pizza”). “pyzaa” can archive or extract these files, compile bytecode, and can write the __main__ module if it is not present.

Usage

python -m pyzaa (pack | compile)

python -m pyzaa pack [-o path/name] [-m module.submodule:callable] [-c] [-w] [-p interpreter] directory:

    ZIP the contents of directory as directory.pyz or [-w] directory.pyzw. Adds the executable flag to the archive.

    -c compile .pyc files and add them to the archive
    -p interpreter include #!interpreter as the first line of the archive
    -o path/name archive is written to path/name.pyz[w] instead of dirname. The extension is added if not specified.
    -m module.submodule:callable __main__.py is written as “import module.submodule; module.submodule.callable()”

    pyzaa pack will warn if the directory contains C extensions or if it doesn’t contain __main__.py.

python -m pyzaa compile arcname.pyz[w]

    The Python files in arcname.pyz[w] are compiled and appended to the ZIP file.

A standard ZIP utility or Python’s zipfile module can unpack the archives.

FAQ

Q. Isn’t pyzaa just a very thin wrapper over zipfile and compileall?
A. Yes.

Q. How does this compete with existing sdist/bdist formats?
A. There is some overlap, but .pyz files are especially interesting as a way to distribute an installer. They may also prove useful as a way to deliver applications when users shouldn’t be asked to perform virtualenv + “pip install”.

References

[1] http://bugs.python.org/issue1739468 “Allow interpreter to execute a zip file”
[2] http://bugs.python.org/issue17359 “Feature is not documented”

Copyright

This document has been placed into the public domain.
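For readers who want to see what the "pack" step described in the draft amounts to, here is a minimal sketch built only on the standard zipfile module. It is not the pyzaa implementation: the function name pack and its parameters (interpreter, entry_point) are made up for illustration and only loosely mirror the -p and -m options above. It writes the #! line first, then the archive, and generates __main__.py from a "module:callable" string only when the directory does not already provide one.

```python
import os
import stat
import zipfile


def pack(directory, target, interpreter="/usr/bin/env python3", entry_point=None):
    """Write `directory` to `target` as a shebang-prefixed ZIP archive.

    Hypothetical helper for illustration; `target` should live outside
    `directory` so the archive does not try to include itself.
    """
    with open(target, "wb") as out:
        # The #! line comes first; ZIP readers locate the archive from the
        # records at the end of the file, so the prefix does not disturb them.
        out.write(("#!" + interpreter + "\n").encode("utf-8"))
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as archive:
            for root, dirs, files in os.walk(directory):
                dirs.sort()
                for name in sorted(files):
                    path = os.path.join(root, name)
                    archive.write(path, os.path.relpath(path, directory))
            # Generate __main__.py only if the directory did not supply one.
            if entry_point and "__main__.py" not in archive.namelist():
                module, _, func = entry_point.partition(":")
                archive.writestr(
                    "__main__.py",
                    "import {0}; {0}.{1}()\n".format(module, func),
                )
    # Add the executable flag, as the draft says "pack" should.
    mode = os.stat(target).st_mode
    os.chmod(target, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

With a directory containing hello.py that defines main(), calling pack("src", "hello.pyz", entry_point="hello:main") should give a file that runs as "python hello.pyz" and, on UNIX, directly as "./hello.pyz".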

On 4/1/2013 5:47 PM, Daniel Holth wrote:
users expect .py to be opened with a text editor.
This user expects .py to be executed as an executable script, and thinks that is the default after an installation of Python on Windows. Windows has a separate option, Edit, to use to edit things. But, I'm glad to see you write the PEP. I have an even thinner method of doing this, using .py extensions, that I've been using for several years now with Python 3, and wondered why it wasn't more popular. My equivalent of pyzaa, though, only performs the pack operation, and requires a bit of cooperation from the application (as a convenient way of storing the application-specific parameters, I build the invocation of pyzaa-equivalent into the application itself using a non-documented command-line option, and build to a different directory, to avoid overwriting application.py). Feel free to incorporate all or parts of that idea if it makes sense to you and sounds convenient.

What happens when -p is omitted? I'd hope it would add the interpreter used to create the zip (or at least the major version), but that may not be ideal for some reason that I haven't thought of yet. Everything else looks great. I'm really looking forward to this. Cheers, Steve

On Tue, Apr 2, 2013 at 1:20 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Question is whether ``/usr/bin/python3.3`` is better or ``/usr/bin/env python3.3``. I vote for the latter since it gets you the right thing without having to care about whether the interpreter moved or is being hidden by a user-installed interpreter. -Brett
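One way to picture the default being discussed here (what to write when -p is omitted) is the sketch below. It is purely hypothetical: default_shebang is not part of pyzaa, and the choice between pinning the minor version and only the major version is exactly the open question in this exchange.

```python
import sys


def default_shebang(version_info=sys.version_info, pin_minor=True):
    """Build an env-style #! line naming the interpreter that made the archive.

    Hypothetical helper: `pin_minor=True` gives e.g. python3.3, while
    `pin_minor=False` gives only the major version, e.g. python3.
    """
    if pin_minor:
        name = "python{}.{}".format(version_info[0], version_info[1])
    else:
        name = "python{}".format(version_info[0])
    # /usr/bin/env resolves the name via PATH, so a moved or user-installed
    # interpreter is still found.
    return "#!/usr/bin/env " + name
```

Called with no arguments, it describes whichever interpreter is running the packing tool.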

On Wed, Apr 3, 2013 at 1:26 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Pushed as Draft PEP 441, tooling prototyped (with less than awesome CLI) at https://bitbucket.org/dholth/pyzaa or https://crate.io/packages/pyzaa Thanks, Daniel Holth

On 2 April 2013 01:47, Daniel Holth <dholth@gmail.com> wrote:
There is a bug in Windows Powershell, which is apparently due to a bug in the underlying FindExecutable API, that can fail to recognise extensions which are longer than 3 characters properly. Rather than risk obscure bugs, I would suggest restricting the extensions to 3 characters. For the “Windowed Python ZIP Applications” case, could we use .pzw as the extension instead of .pyzw? Please don't shoot the messenger here - I'm not going to try to defend such a stupid Windows bug, but better to be safe in my view. Flames about Windows to /dev/null... Paul.

On 3 May 2013 20:40, "Paul Moore" <p.f.moore@gmail.com> wrote:
There is a bug in Windows Powershell, which is apparently due to a bug in the underlying FindExecutable API, that can fail to recognise extensions which are longer than 3 characters properly. Rather than risk obscure bugs, I would suggest restricting the extensions to 3 characters. For the “Windowed Python ZIP Applications” case, could we use .pzw as the extension instead of .pyzw? Please don't shoot the messenger here - I'm not going to try to defend such a stupid Windows bug, but better to be safe in my view. Flames about Windows to /dev/null...
I'm OK with the shortened extension. Cheers, Nick.

On 03/05/13 20:37, Paul Moore wrote:
Are you referring to this one? https://groups.google.com/group/microsoft.public.vb.general.discussion/brows... That's pretty old, is it still a problem? Besides, if I'm reading this properly: http://msdn.microsoft.com/en-us/library/bb776419(VS.85).aspx the issue is that they should be using AssocQueryString, not FindExecutable.
I've had Linux systems which associated OpenOffice docs with Archive Manager rather than OpenOffice. It's likely that at least some Linux systems will likewise decide that .pyz files are archives, not Python files, and open them in Archive Manager. I don't believe that it is Python's responsibility to work around bugs in desktop environments' handling of file associations. Many official Microsoft file extensions are four or more letters, e.g. docx. I don't see any value in making long-lasting decisions on file extensions based on (transient?) bugs that aren't our responsibility. -- Steven

Steven D'Aprano writes:
+0
Many official Microsoft file extensions are four or more letters, e.g. docx.
Give us a non-MS example, please. Nobody in their right mind would clash with a major MS product's naming conventions. Not even if their file format implements Digital-Ocular Coordination eXtensions. And a shell that borks the Borg's extensions won't make it in the market.
Getting these associations right is worth *something* to Python. I'm not in a position to say more than "it's positive". But I don't see why we really care about what the file extensions are as long as they serve the purpose of making it easy to figure out which files are in what format in a names-only list. I have to admit that "Windowed Python ZIP Application" is probably something I personally will only ever consider as an hypothesis, though.

On 04/05/13 15:13, Stephen J. Turnbull wrote:
I'm afraid I don't understand your question. Are you suggesting that four letter extensions are restricted to Microsoft products? If so, that would be an excellent reason to avoid .pyzw, but I don't believe that is the case. Common 4+ letter extensions include .html, .tiff, .jpeg, .mpeg, .midi, .java and .torrent. -- Steven

On Sat, May 4, 2013 at 4:15 PM, Steven D'Aprano <steve@pearwood.info> wrote:
We don't need examples of arbitrary data file extensions, we need examples of 4 letter extensions that are known to work correctly when placed on PATHEXT, including when called from PowerShell. In the absence of confirmation that 4-letter extensions work reliably in such cases, it seems wise to abbreviate the Windows GUI application extension as .pzw. I've also cc'ed Steve Dower, since investigation of this kind of Windows behavioural question is one of the things he offered distutils-sig help with after PyCon US :) Cheers, Nick. P.S. Steve, FYI, here is Paul's original concern: http://mail.python.org/pipermail/python-dev/2013-May/125928.html -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 4 May 2013 07:48, Nick Coghlan <ncoghlan@gmail.com> wrote:
Nick, thanks for passing this on. Your explanation of the issue is precisely correct. For information (I should have included this in the original message) here's the Powershell bug report I found: https://connect.microsoft.com/PowerShell/feedback/details/238550/power-shell... Unfortunately the link to the referenced discussion in that report is inaccessible :-( Paul

Thanks, Nick. I've been following along with this but haven't really been able to add anything. I can certainly say that I've never had any issue with more than 3 letters in an extension, and I deal with those every day (.pyproj, .csproj, .vcxproj.filters, etc). The PowerShell bug (which I hadn't heard of before) may be a complete non-issue, depending on how the associations are set up. To summarise the bug, when PowerShell invokes a command based on an extension in PATHEXT, only the first three characters of the extension are used to determine the associated program. I tested this by creating a file "test.txta" and adding ".TXTA" to my PATHEXT variable. Typing ".\test" in PowerShell opened the file in my text editor. This only affects PowerShell (cmd.exe handles it correctly) and only in the case where you don't specify the extension (".\test.txta" works fine, and with tab completion, this is probably more likely). It also ignores the associated command line and only uses the executable. (I'll pass this on to the PowerShell team, though I have no idea how they'll prioritise it, and of course there's probably no fix for existing versions.) Because we'd be claiming both .pyz and .pyzw, it's possible to work around this issue if we accept that .pyzw files may run with the .pyz program instead of the .pyzw program. Maybe it goes to something other than py.exe that could choose based on the extension. (Since other command-line arguments get stripped, adding an option to py.exe can't be done, and unless the current behaviour is for it to open .pyw files in pythonw.exe, I wouldn't want it to be different for .pyzw files.) However, anywhere in Windows that uses ShellExecute rather than FindExecutable will handle long extensions without any issue. AFAIK, this is everywhere except PowerShell, so I don't really see a strong case for breaking the w-suffix convention here. Cheers, Steve
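The truncation described above can be modelled in a few lines. This is a toy model of the behaviour as summarised in this message (only the first three characters of the extension are used), not PowerShell's actual lookup code; lookup_key and collisions are names invented for the sketch.

```python
from collections import defaultdict


def lookup_key(extension, limit=3):
    """Return the part of an extension the buggy lookup is assumed to see."""
    return extension.lstrip(".").lower()[:limit]


def collisions(extensions, limit=3):
    """Group extensions that become indistinguishable after truncation."""
    groups = defaultdict(list)
    for ext in extensions:
        groups[lookup_key(ext, limit)].append(ext)
    # Only groups with more than one member are actual clashes.
    return {key: exts for key, exts in groups.items() if len(exts) > 1}
```

Under this model collisions([".pyz", ".pyzw"]) reports a clash on "pyz", collisions([".pyz", ".pzw"]) reports none, and lookup_key(".TXTA") gives "txt", which matches the test.txta experiment.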

On 6 May 2013 20:46, Steve Dower <Steve.Dower@microsoft.com> wrote:
The form in which I hit the bug is that I tried to create a "script" extension so that "foo.script" would be a generic script with a #! extension specifying the interpreter. I was adding the extension to PATHEXT so that scripts would be run "inline" (displaying the output at the console prompt, rather than in a new transient console window) - again this is a Powershell-specific issue which does not affect CMD. But when I added .script to PATHEXT, the script ran, but in a separate console window, which flashed up too fast for me to see the output. (It may be that it's the clash with .scr screensaver files that caused the file to be treated as a windows executable rather than a console executable, but it's hard to tell when you can't see the error message :-()
Thanks. In my view, it's a vaguely irritating rough edge rather than a dealbreaker. But it does cause problems as here. And the number of rough edges in powershell when dealing with "traditional" console commands (e.g., see the point above about needing PATHEXT to get inline output, rather than just to be able to omit the extension) are sufficient to make the accumulation greater than the sum of its individual parts.
I'm not sure the behaviour is clearly defined enough to be certain of this (although I'll defer to someone who's looked at the Powershell source code :-)). In my experiments, it was frustratingly difficult to pin down the exact behaviour with any certainty. And given that the choice is over running a console executable or a Windows one, it can be particularly bad if the wrong one is run (console program pops up in a transient second window and the output gets lost, for example). Add the fact that the powershell behaviour is essentially undocumented, and it's hard to guarantee anything. On the plus side, I suspect (but haven't proved) that if the GUI extension (pyzw) gets misread as the console one (pyz) the behaviour is less serious, because PATHEXT is not as relevant for GUI programs.
To be blunt, I see no point in using a pair of extensions that are known to be broken, even if only in Powershell, over a pair that will work everywhere (but are no more than mildly less consistent with other cases - note that while there's a py/pyw pair, there is no pycw corresponding to pyc, or pyow corresponding to pyo). Paul

So the bug would just cause .pyzw files to be opened with py instead of pyw? Won't this be harmless? I think the worst that would happen would be that you get a redundant console window if you are not already running powershell inside a console. -- Richard

Everyone seems to like the first half of this simple PEP adding the extensions. The 3-letter extension for windowed apps can be "pzw" while the "pyz" extension for console apps stays the same. The second half, the tool https://bitbucket.org/dholth/pyzaa/src/tip/pyzaa.py?at=default is less mature, but there's not a whole lot to do in a simple tool that may serve more as an example: you can open any file with ZipFile in append mode, even one that is not a zip file and just contains the #!python shebang line. Thanks, Daniel
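The append-mode point can be tried directly with the standard library. The sketch below is not taken from pyzaa.py; write_stub and append_sources are hypothetical names. It first writes a file holding nothing but the #! line, then lets zipfile.ZipFile in "a" mode add the archive after it.

```python
import zipfile


def write_stub(path, interpreter="/usr/bin/env python3"):
    """Create a file that contains only the #! line (not yet a ZIP file)."""
    with open(path, "wb") as stub:
        stub.write(("#!" + interpreter + "\n").encode("utf-8"))


def append_sources(path, sources):
    """Append an archive built from `sources` (name -> source text) to `path`.

    In append mode zipfile adds a new archive to the end of a file that is
    not already a ZIP file, leaving the existing bytes in place.
    """
    with zipfile.ZipFile(path, "a", zipfile.ZIP_DEFLATED) as archive:
        for name, text in sources.items():
            archive.writestr(name, text)
```

For example, write_stub("app.pyz") followed by append_sources("app.pyz", {"__main__.py": "print('hi')\n"}) should leave a file whose first line is still the shebang and which the interpreter can run as "python app.pyz".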

Steven D'Aprano writes:
Give us a non-MS example, please.
I'm afraid I don't understand your question.
There were two problems mentioned. Paul worries about 4-letter extensions under PowerShell. You mentioned conflicts in Linux file managers. In both cases, a bug on Windows in detecting Microsoft products would kill (or at least seriously maim) a shell or file manager. I doubt many have ever existed, and surely they were detected *and* corrected pretty much immediately. My point is that such bug-awareness would not extend as strongly to extensions used by third-party free software.
Are you suggesting that four letter extensions are restricted to Microsoft products?
No, of course not.
Common 4+ letter extensions include .html, .tiff, .jpeg, .mpeg, .midi, .java and .torrent.
All of which (except perhaps .java and .torrent, which I bet are most commonly invoked not from shells but from IDEs and webbrowsers which have their own internal association databases) are commonly abbreviated to three letters on Windows, including in HTTP URLs which should have no such issues at all. That is consistent with my point (and Paul's, I believe). It doesn't prove anything, but given the decreasing importance of extensions for file typing on all systems, I think there's little penalty to being shortsighted and following the 3-character convention for extensions, especially on Windows.

On Sat, 04 May 2013 11:41:27 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
What would that have to do with the file extension? If some Linux systems decide that .ods and .pyz files are archives, it's probably because they *are* archives in their own right (though specialized ones). Probably the libmagic (used e.g. by the `file` command) wasn't up-to-date enough to specifically recognize OpenOffice documents, so it simply recognized the ZIP file structure and detected the file as such. Regards Antoine.

participants (11)
- Antoine Pitrou
- Brett Cannon
- Daniel Holth
- Glenn Linderman
- Nick Coghlan
- Paul Moore
- Richard Oudkerk
- Stefan Behnel
- Stephen J. Turnbull
- Steve Dower
- Steven D'Aprano