How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous
There is a long standing issue, (or at least I consider it a poor feature some others consider it compliance with other existing tools), with a couple of python standard libraries that I would like to raise an issue for & PR(s) to address but when it comes to putting tests in place the failing test case can potentially crash the host machine, sometimes in an unrecoverable manner, due to the potential to generate an infinite size file. Is there a way of proposing a change with tests that refuse to run in some situations or can be terminated based on time or resource limits being exceeded? Steve Barnes
If the file is supposed to be small, would patching write() in the test to accumulate the number of bytes written and fail the test when it goes over what it's supposed to be possible? What is the issue?
The issue is simple and simple enough for a beginner to fall foul of - test procedure: Change directory to any directory with files in totally a few 10s of megs (ideally but it doesn't matter much). python -m zipfile -c my_zipfile.zip . Wait a few minutes and press control-C then do a directory - you will find that my_zipfile.zip is huge (much bigger than the total of the files in the original directory). The same applies to tarfile. If you fail to hit control-C soon enough (how long depends on the spare space on the current drive) it will crash when it runs out of disk space, possibly, on some systems, requiring a format and re-install of the OS. The cause of the problem is that when zipfile &/or tarfile is archiving to a file in the current directory it tries to add the target file to the input list and you have an Ouroboros situation. The fix is also simple - exclude the target file from the permitted inputs. I have seen this happen in the wild, I do know that python is not the only archiving tool that suffers from this problem but that seems to me to be no reason not to address this. If write were patched to simply count the bytes but not write the file then the problem would not occur so I guess that it would need to be patched in a transparent manner. Steve Barnes -----Original Message----- From: remi.lapeyre@henki.fr <remi.lapeyre@henki.fr> Sent: 21 May 2020 13:56 To: python-ideas@python.org Subject: [Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous If the file is supposed to be small, would patching write() in the test to accumulate the number of bytes written and fail the test when it goes over what it's supposed to be possible? What is the issue? _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ORPDN4... Code of Conduct: http://python.org/psf/codeofconduct/
Hi Steve, Have you considered and rejected filing a bug on bugs.python.org? Or are you specifically concerned about how to write a test for this behavior that doesn't fill up the disk when the bug is present? On Unixoid systems there's a resource limit you can set to limit disk space IIRC (e.g. on my Mac, `ulimit -a` shows various limits including file size). Solving this by patching write() seems somewhat complex because you'd really have to patch open() to return an io.IO subclass that has a patched write() method. But I'm sure it can be done in the context of a specific test. I wouldn't try patching it for *all* tests though -- there are many tests that might fail due to such a change. It also wouldn't help for file operations made directly from C code. --Guido On Thu, May 21, 2020 at 6:57 AM Steve Barnes <GadgetSteve@live.co.uk> wrote:
The issue is simple and simple enough for a beginner to fall foul of - test procedure: Change directory to any directory with files in totally a few 10s of megs (ideally but it doesn't matter much). python -m zipfile -c my_zipfile.zip .
Wait a few minutes and press control-C then do a directory - you will find that my_zipfile.zip is huge (much bigger than the total of the files in the original directory). The same applies to tarfile.
If you fail to hit control-C soon enough (how long depends on the spare space on the current drive) it will crash when it runs out of disk space, possibly, on some systems, requiring a format and re-install of the OS.
The cause of the problem is that when zipfile &/or tarfile is archiving to a file in the current directory it tries to add the target file to the input list and you have an Ouroboros situation. The fix is also simple - exclude the target file from the permitted inputs.
I have seen this happen in the wild, I do know that python is not the only archiving tool that suffers from this problem but that seems to me to be no reason not to address this.
If write were patched to simply count the bytes but not write the file then the problem would not occur so I guess that it would need to be patched in a transparent manner.
Steve Barnes
-----Original Message----- From: remi.lapeyre@henki.fr <remi.lapeyre@henki.fr> Sent: 21 May 2020 13:56 To: python-ideas@python.org Subject: [Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous
If the file is supposed to be small, would patching write() in the test to accumulate the number of bytes written and fail the test when it goes over what it's supposed to be possible?
What is the issue? _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ORPDN4... Code of Conduct: http://python.org/psf/codeofconduct/ _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/OT5M4U... Code of Conduct: http://python.org/psf/codeofconduct/
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
From: Guido van Rossum <guido@python.org> Sent: 21 May 2020 16:59 To: Steve Barnes <GadgetSteve@live.co.uk> Cc: remi.lapeyre@henki.fr; python-ideas@python.org Subject: Re: [Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous Hi Steve, Have you considered and rejected filing a bug on bugs.python.org<http://bugs.python.org>? Or are you specifically concerned about how to write a test for this behavior that doesn't fill up the disk when the bug is present? On Unixoid systems there's a resource limit you can set to limit disk space IIRC (e.g. on my Mac, `ulimit -a` shows various limits including file size). Solving this by patching write() seems somewhat complex because you'd really have to patch open() to return an io.IO subclass that has a patched write() method. But I'm sure it can be done in the context of a specific test. I wouldn't try patching it for *all* tests though -- there are many tests that might fail due to such a change. It also wouldn't help for file operations made directly from C code. --Guido [Steve Barnes] Unfortunately we have no control over where the tests may be run – if run on Windows from the C: drive it could potentially brick the entire machine, (which of course some people might consider a bonus of course).
On Fri, May 22, 2020 at 08:09:29AM +0000, Steve Barnes wrote:
Unfortunately we have no control over where the tests may be run – if run on Windows from the C: drive it could potentially brick the entire machine, (which of course some people might consider a bonus of course).
I'm rather shocked to learn that (allegedly?) you can brick modern Windows by filling the drive up. If this was Windows 95 I might believe it, but Windows 7 or 10? Have you confirmed this or is it just a guess? As for strategies for the test... you could write the test to skip on Windows. Using `unittest`: @unittest.skipif(os.name == 'nt', 'dangerous to run on Windows') Does Windows have the capacity to create a file system on the fly and then mount it, as we can do in Linux? # Untested. dd if=/dev/zero of=/tmp/disk count=2048 mkfs.ntfs -v Untitled /tmp/disk sudo mount -t ntfs /tmp/disk /mnt/ In pseudo-code: - create a temporary file of 2048 bytes; - write a NTFS file system in that file; - mount that file system somewhere so it is visible; So have the test create a new file system on the fly, cd into that file system, run the test proper, and if it fills up, no harm done. If you cannot create a file system on the fly, perhaps you can prepare one before hand, manually, as part of the test requirements, and refuse to run if that scratch (pseudo)disk doesn't exist. -- Steven
On Fri, May 22, 2020 at 4:23 AM Steven D'Aprano <steve@pearwood.info> wrote:
In pseudo-code:
- create a temporary file of 2048 bytes; - write a NTFS file system in that file; - mount that file system somewhere so it is visible;
So have the test create a new file system on the fly, cd into that file system, run the test proper, and if it fills up, no harm done.
If that IS doable on all platforms we care about, it could be pretty handy for all kinds of tests that require writing files. yes, in most cases, temp files work fine, but this feels cleaner and easier to make make sure you don't leave any files laying around after your tests. It would be a handy utility to have available in unit tests in general. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 22/05/2020 12:15, Steven D'Aprano wrote:
On Fri, May 22, 2020 at 08:09:29AM +0000, Steve Barnes wrote:
Unfortunately we have no control over where the tests may be run – if run on Windows from the C: drive it could potentially brick the entire machine, (which of course some people might consider a bonus of course). I'm rather shocked to learn that (allegedly?) you can brick modern Windows by filling the drive up. If this was Windows 95 I might believe it, but Windows 7 or 10? Have you confirmed this or is it just a guess?
As for strategies for the test... you could write the test to skip on Windows. Using `unittest`:
@unittest.skipif(os.name == 'nt', 'dangerous to run on Windows')
Does Windows have the capacity to create a file system on the fly and then mount it, as we can do in Linux?
# Untested. dd if=/dev/zero of=/tmp/disk count=2048 mkfs.ntfs -v Untitled /tmp/disk sudo mount -t ntfs /tmp/disk /mnt/
In pseudo-code:
- create a temporary file of 2048 bytes; - write a NTFS file system in that file; - mount that file system somewhere so it is visible;
So have the test create a new file system on the fly, cd into that file system, run the test proper, and if it fills up, no harm done.
If you cannot create a file system on the fly, perhaps you can prepare one before hand, manually, as part of the test requirements, and refuse to run if that scratch (pseudo)disk doesn't exist.
I find having a RAM drive is very handy. Could this be another use for it? Rob Cliffe
On Fri, May 22, 2020 at 03:54:30PM +0100, Rob Cliffe via Python-ideas wrote:
I find having a RAM drive is very handy. Could this be another use for it?
Could be, if there's a standard way to create a RAM drive on all windows machines without needing to install specialist RAM drive software from a third party. -- Steven
On Windows https://freetechtutors.com/create-virtual-hard-disk-using-diskpart-windows/ gives a nice description of creating a virtual disk with only operating system commands. Note that it shows the commands being used interactively but it can also be scripted by starting diskpart /s script.txt - it does need to be run as admin. Steve Barnes -----Original Message----- From: Steven D'Aprano <steve@pearwood.info> Sent: 25 May 2020 15:34 To: python-ideas@python.org Subject: [Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous On Fri, May 22, 2020 at 03:54:30PM +0100, Rob Cliffe via Python-ideas wrote:
I find having a RAM drive is very handy. Could this be another use for it?
Could be, if there's a standard way to create a RAM drive on all windows machines without needing to install specialist RAM drive software from a third party. -- Steven _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/Q4DEI6... Code of Conduct: http://python.org/psf/codeofconduct/
Forgot to mention - according to https://docs.microsoft.com/en-us/windows-server/administration/windows-comma... this process applies to Windows 10, Windows 8.1, Windows 8, Windows 7, Windows Server 2019, Windows Server 2016, Windows Server 2012 R2, Windows Server 2012, and Windows Server 2008 R2, Windows Server 2008 which I think covers all of the currently supported versions of Windows. -----Original Message----- From: Steve Barnes <GadgetSteve@live.co.uk> Sent: 25 May 2020 18:56 To: Steven D'Aprano <steve@pearwood.info>; python-ideas@python.org Subject: [Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous On Windows https://freetechtutors.com/create-virtual-hard-disk-using-diskpart-windows/ gives a nice description of creating a virtual disk with only operating system commands. Note that it shows the commands being used interactively but it can also be scripted by starting diskpart /s script.txt - it does need to be run as admin. Steve Barnes -----Original Message----- From: Steven D'Aprano <steve@pearwood.info> Sent: 25 May 2020 15:34 To: python-ideas@python.org Subject: [Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous On Fri, May 22, 2020 at 03:54:30PM +0100, Rob Cliffe via Python-ideas wrote:
I find having a RAM drive is very handy. Could this be another use for it?
Could be, if there's a standard way to create a RAM drive on all windows machines without needing to install specialist RAM drive software from a third party. -- Steven _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/Q4DEI6... Code of Conduct: http://python.org/psf/codeofconduct/ _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/W5L2YP... Code of Conduct: http://python.org/psf/codeofconduct/
It's nice to fantasize about RAM disks, but the code in zipfile.py is begging for a simpler solution, just pass in a wrapper that overrides the write() method (you may have to make a few changes but you can develop and test this separately from the fix for your bug). -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Mon, May 25, 2020 at 12:19 PM Guido van Rossum <guido@python.org> wrote:
It's nice to fantasize about RAM disks, but the code in zipfile.py is begging for a simpler solution, just pass in a wrapper that overrides the write() method (you may have to make a few changes but you can develop and test this separately from the fix for your bug).
indeed -- pretty standard use case for mocking, yes? But in cases where you might need a more complete file system (and or where C code might be doing the writing) , it would be nice to be able to mock an entire filesystem. -CHB
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/> _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/WEP6WS... Code of Conduct: http://python.org/psf/codeofconduct/
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Mon, May 25, 2020 at 10:59 AM Steve Barnes <GadgetSteve@live.co.uk> wrote:
On Windows https://freetechtutors.com/create-virtual-hard-disk-using-diskpart-windows/ gives a nice description of creating a virtual disk with only operating system commands. Note that it shows the commands being used interactively but it can also be scripted by starting diskpart /s script.txt -
...
it does need to be run as admin.
Well, darn. That seriously reduces its usefulness. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 5/25/20, Christopher Barker <pythonchb@gmail.com> wrote:
On Mon, May 25, 2020 at 10:59 AM Steve Barnes <GadgetSteve@live.co.uk> wrote:
On Windows https://freetechtutors.com/create-virtual-hard-disk-using-diskpart-windows/ gives a nice description of creating a virtual disk with only operating system commands. Note that it shows the commands being used interactively but it can also be scripted by starting diskpart /s script.txt -
...
it does need to be run as admin.
Well, darn. That seriously reduces its usefulness.
Creating and mounting a VHD can be implemented in PowerShell as well, but it requires installing the Hyper-V management tools and services (not the hypervisor itself). The hyper-v cmdlets for managing virtual disks (e.g. mount-vhd) also require administrator access, since the underlying API does (e.g. AttachVirtualDisk). A new system service and client application would have to be developed in order to allow standard users to manage virtual disks. Here's a PowerShell example to create, format, and mount a volume on a 10 MiB virtual disk. This example mounts the volume both at "V:/" and at "C:/Mount/vhd_mount": $vhdpath = 'C:\Mount\temp.vhdx' $mountpath = 'C:\Mount\vhd_mount' # create and mount the physical disk as a RAW volume new-vhd -path $vhdpath -fixed -sizebytes (10 -shl 20) mount-vhd -path $vhdpath # create a partition on the disk, format it as NTFS, and assign # the DOS device name "V:" and label "vhd" $nd = (get-vhd -path $vhdpath).DiskNumber new-volume -disknum $nd -filesys ntfs -drive V -friendly vhd # set a folder mountpoint mkdir $mountpath add-partitionaccesspath -drive V -accesspath $mountpath If no drive letter is desired, use the -accesspath option of the new-volume cmdlet instead of the -driveletter option. The following command dismounts the disk: dismount-vhd -path $vhdpath The mountpoint on the empty directory remains set but inaccessible once the disk is dismounted. You can can delete this directory if the disk won't be mounted again. Or, while the disk is mounted, you can remove the mountpoint via remove-partitionaccesspath.
-----Original Message----- From: Eryk Sun <eryksun@gmail.com> Sent: 02 June 2020 02:02 To: python-ideas@python.org Cc: Christopher Barker <pythonchb@gmail.com>; Steve Barnes <GadgetSteve@live.co.uk> Subject: Re: [Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous On 5/25/20, Christopher Barker <pythonchb@gmail.com> wrote:
On Mon, May 25, 2020 at 10:59 AM Steve Barnes <GadgetSteve@live.co.uk> wrote:
On Windows https://freetechtutors.com/create-virtual-hard-disk-using-diskpart-wi ndows/ gives a nice description of creating a virtual disk with only operating system commands. Note that it shows the commands being used interactively but it can also be scripted by starting diskpart /s script.txt -
...
it does need to be run as admin.
Well, darn. That seriously reduces its usefulness.
Creating and mounting a VHD can be implemented in PowerShell as well, but it requires installing the Hyper-V management tools and services (not the hypervisor itself). The hyper-v cmdlets for managing virtual disks (e.g. mount-vhd) also require administrator access, since the underlying API does (e.g. AttachVirtualDisk). A new system service and client application would have to be developed in order to allow standard users to manage virtual disks. Here's a PowerShell example to create, format, and mount a volume on a 10 MiB virtual disk. This example mounts the volume both at "V:/" and at "C:/Mount/vhd_mount": $vhdpath = 'C:\Mount\temp.vhdx' $mountpath = 'C:\Mount\vhd_mount' # create and mount the physical disk as a RAW volume new-vhd -path $vhdpath -fixed -sizebytes (10 -shl 20) mount-vhd -path $vhdpath # create a partition on the disk, format it as NTFS, and assign # the DOS device name "V:" and label "vhd" $nd = (get-vhd -path $vhdpath).DiskNumber new-volume -disknum $nd -filesys ntfs -drive V -friendly vhd # set a folder mountpoint mkdir $mountpath add-partitionaccesspath -drive V -accesspath $mountpath If no drive letter is desired, use the -accesspath option of the new-volume cmdlet instead of the -driveletter option. The following command dismounts the disk: dismount-vhd -path $vhdpath The mountpoint on the empty directory remains set but inaccessible once the disk is dismounted. You can can delete this directory if the disk won't be mounted again. Or, while the disk is mounted, you can remove the mountpoint via remove-partitionaccesspath. [Steve Barnes] Interesting Eryk. I have found that PyFilesystem2 (https://pypi.org/project/fs/) can create "file systems", both temporary and memory, for use from within python code without needing Admin permissions **but** at the moment they do not have any way of setting maximum sizes/available spaces.
On 25/05/2020 15:33, Steven D'Aprano wrote:
On Fri, May 22, 2020 at 03:54:30PM +0100, Rob Cliffe via Python-ideas wrote:
I find having a RAM drive is very handy. Could this be another use for it? Could be, if there's a standard way to create a RAM drive on all windows machines without needing to install specialist RAM drive software from a third party.
No, I downloaded a 3rd-party package. I'm not sure if that's "specialist sotware" by your definition. I won't name it here, but it was dead easy to install and use and works perfectly. Personally I'd recommend it.
On 05/27/2020 05:53 PM, Rob Cliffe via Python-ideas wrote:
No, I downloaded a 3rd-party package. I'm not sure if that's "specialist sotware" by your definition. I won't name it here, but it was dead easy to install and use and works perfectly. Personally I'd recommend it.
You'd recommend it, but you won't name it? Hmmm..... ;) -- ~Ethan~
OK, I didn't want to be guilty of advertising, but it is ImDisk and can, I believe, be downloaded from https://sourceforge.net/projects/imdisk-toolkit/ and I have used it happily for some years. Disclosure: I have no interest in or connection with, direct or indirect, financial or otherwise, this product or any company or individual that provides it, other than being a satisfied user. Other products are available; I haven't tried any of them so can't comment on them. Rob Cliffe On 28/05/2020 03:00, Ethan Furman wrote:
On 05/27/2020 05:53 PM, Rob Cliffe via Python-ideas wrote:
No, I downloaded a 3rd-party package. I'm not sure if that's "specialist sotware" by your definition. I won't name it here, but it was dead easy to install and use and works perfectly. Personally I'd recommend it.
You'd recommend it, but you won't name it? Hmmm..... ;)
-- ~Ethan~ _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/N3I46E... Code of Conduct: http://python.org/psf/codeofconduct/
On 21/05/2020 14:50, Steve Barnes wrote:
The issue is simple and simple enough for a beginner to fall foul of - test procedure: Change directory to any directory with files in totally a few 10s of megs (ideally but it doesn't matter much). python -m zipfile -c my_zipfile.zip .
Wait a few minutes and press control-C then do a directory - you will find that my_zipfile.zip is huge (much bigger than the total of the files in the original directory). The same applies to tarfile.
If you fail to hit control-C soon enough (how long depends on the spare space on the current drive) it will crash when it runs out of disk space, possibly, on some systems, requiring a format and re-install of the OS.
The cause of the problem is that when zipfile &/or tarfile is archiving to a file in the current directory it tries to add the target file to the input list and you have an Ouroboros situation. The fix is also simple - exclude the target file from the permitted inputs. Sorry - the moment I pressed Send I noticed the last sentence. Which is a better solution than mine. So Ignore what I just said:
Am I missing something here? IIUC zipfile.py tries to add the zipfile it's creating to the zipfile it's creating. That's a bug, pure and simple (and a dangerous one). It should be rewritten to build a list of the files it's going to zip up (before creating its own zipfile), then work from that list. Surely it can't be that hard?
I have seen this happen in the wild, I do know that python is not the only archiving tool that suffers from this problem but that seems to me to be no reason not to address this.
If write were patched to simply count the bytes but not write the file then the problem would not occur so I guess that it would need to be patched in a transparent manner.
Steve Barnes
-----Original Message----- From: remi.lapeyre@henki.fr <remi.lapeyre@henki.fr> Sent: 21 May 2020 13:56 To: python-ideas@python.org Subject: [Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous
If the file is supposed to be small, would patching write() in the test to accumulate the number of bytes written and fail the test when it goes over what it's supposed to be possible?
What is the issue? _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ORPDN4... Code of Conduct: http://python.org/psf/codeofconduct/ _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/OT5M4U... Code of Conduct: http://python.org/psf/codeofconduct/
On 21May2020 21:30, Rob Cliffe <rob.cliffe@btinternet.com> wrote:
On 21/05/2020 14:50, Steve Barnes wrote:
The issue is simple and simple enough for a beginner to fall foul of - test procedure: Change directory to any directory with files in totally a few 10s of megs (ideally but it doesn't matter much). python -m zipfile -c my_zipfile.zip . [... zip that zips its own archive...]
Am I missing something here?
Yes. The discussion isn't about the bug, but about containing the bug during testing.
IIUC zipfile.py tries to add the zipfile it's creating to the zipfile it's creating. That's a bug, pure and simple (and a dangerous one).
Which is the point of Steve's post. He's asking about good ways to write a test for something like this, which has bad/dangerous behaviour when the test fails. The whole point of the test is to trigger the bad behaviour but the secondary objective is to do it without zip exploding in the tester's face. The process here is: - bug exists (exploding zip) - write a test that fails when the bug exists - fix the bug - the test should now pass Step 2 eats an unbounded amount of disc (in this case, other bugs can go bad on other ways). Cheers, Cameron Simpson <cs@cskk.id.au>
On 21/05/2020 14:50, Steve Barnes wrote:
The issue is simple and simple enough for a beginner to fall foul of - test procedure: Change directory to any directory with files in totally a few 10s of megs (ideally but it doesn't matter much). python -m zipfile -c my_zipfile.zip .
Wait a few minutes and press control-C then do a directory - you will find that my_zipfile.zip is huge (much bigger than the total of the files in the original directory). The same applies to tarfile.
If you fail to hit control-C soon enough (how long depends on the spare space on the current drive) it will crash when it runs out of disk space, possibly, on some systems, requiring a format and re-install of the OS.
The cause of the problem is that when zipfile &/or tarfile is archiving to a file in the current directory it tries to add the target file to the input list and you have an Ouroboros situation. The fix is also simple - exclude the target file from the permitted inputs. Am I missing something here? IIUC zipfile.py tries to add the zipfile it's creating to the zipfile it's creating. That's a bug, pure and simple (and a dangerous one). It should be rewritten to build a list of the files it's going to zip up (before creating its own zipfile), then work from that list. Surely it can't be that hard?
I have seen this happen in the wild, I do know that python is not the only archiving tool that suffers from this problem but that seems to me to be no reason not to address this.
If write were patched to simply count the bytes but not write the file then the problem would not occur so I guess that it would need to be patched in a transparent manner.
Steve Barnes
-----Original Message----- From: remi.lapeyre@henki.fr <remi.lapeyre@henki.fr> Sent: 21 May 2020 13:56 To: python-ideas@python.org Subject: [Python-ideas] Re: How to propose a change with tests where the failing test case (current behaviour) is bad or dangerous
If the file is supposed to be small, would patching write() in the test to accumulate the number of bytes written and fail the test when it goes over what it's supposed to be possible?
What is the issue? _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ORPDN4... Code of Conduct: http://python.org/psf/codeofconduct/ _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/OT5M4U... Code of Conduct: http://python.org/psf/codeofconduct/
participants (9)
-
Cameron Simpson
-
Christopher Barker
-
Eryk Sun
-
Ethan Furman
-
Guido van Rossum
-
remi.lapeyre@henki.fr
-
Rob Cliffe
-
Steve Barnes
-
Steven D'Aprano