From PythonList at DancesWithMice.info Sun Sep 1 20:33:16 2024
From: PythonList at DancesWithMice.info (dn)
Date: Mon, 2 Sep 2024 12:33:16 +1200
Subject: Formatting a str as a number - Okay, one more related thing...
In-Reply-To: <9e13c5d9-fda3-4d7d-a87b-3dfd1c51f6ed@mrabarnett.plus.com>
References: <4Wtgt14jXFznWHr@mail.python.org>
 <9e13c5d9-fda3-4d7d-a87b-3dfd1c51f6ed@mrabarnett.plus.com>
Message-ID:

On 1/09/24 06:55, MRAB via Python-list wrote:
> On 2024-08-31 06:31, Gilmeh Serda via Python-list wrote:
>> On Fri, 30 Aug 2024 05:22:17 GMT, Gilmeh Serda wrote:
>>
>>> f"{int(number):>20,}"
>>
>> I can find "," (comma) and I can find "_" (underscore) but how about " "
>> (space)?
>>
>> Or any other character, for that matter?
>>
>> Any ideas?
>>
>> Of course I can do f"{123456:>20_}".replace("_", " "), just thought there
>> might be something else my search mojo fails on.
>>
> The format is described here:
>
> https://docs.python.org/3/library/string.html#formatspec
>
> A space is counted as a fill character.

Rather than strict formatting, you may be venturing into
"internationalisation/localisation" thinking.

Different cultures/languages present numeric-amounts in their own ways.
For example, a decimal point may look like a dot or period to some
(there's two words for the same symbol from different English-language
cultures!), whereas in Europe the symbol others call a comma is used, eg
123.45 or 123,45 - and that's just one complication of convention...

For your reading pleasure, please review "locales"
(https://docs.python.org/3/library/locale.html)

Here's an example:

   Country     Integer          Float
       USA     123,456     123,456.78
    France     123 456     123 456,78
     Spain     123.456     123.456,78
  Portugal      123456      123456,78
    Poland     123 456     123 456,78

Here's some old code, filched from somewhere (above web.ref?) and
updated today to produce the above:-

""" PythonExperiments:locale_numbers.py
    Demonstrate numeric-presentations in different cultures (locales).
"""

__author__ = "dn, IT&T Consultant"
__python__ = "3.12"
__created__ = "PyCharm, 02 Jan 2021"
__copyright__ = "Copyright © 2024~"
__license__ = "GNU General Public License v3.0"

# PSL
import locale

locales_to_compare = [
    ( "USA", "en_US", ),
    ( "France", "fr_FR", ),
    ( "Spain", "es_ES", ),
    ( "Portugal", "pt_PT", ),
    ( "Poland", "pl_PL", ),
    ]

print( "\n   Country     Integer          Float" )
for country_name, locale_identifier in locales_to_compare:
    locale.setlocale( locale.LC_ALL, locale_identifier, )
    print( F"{country_name:>10}", end="  ", )
    print( locale.format_string("%10d", 123456, grouping=True, ), end="", )
    print( locale.format_string("%15.2f", 123456.78, grouping=True, ) )

--
Regards,
=dn

From mk1853387 at gmail.com Mon Sep 2 10:00:15 2024
From: mk1853387 at gmail.com (marc nicole)
Date: Mon, 2 Sep 2024 16:00:15 +0200
Subject: Getting a Process.start() error pickle.PicklingError: Can't pickle <type 'module'>: it's not found as __builtin__.module with Python 2.7
Message-ID:

Hello,

I am using Python 2.7 on Windows 10 and I want to launch a process
independently of the rest of the code so that the execution continues while
the started process proceeds. I am using Process().start() from Python 2.7
as follows:

from multiprocessing import Process

def do_something(text):
    print(text)

if __name__ == "__main__":
    q = Process(target=do_something,args=("somecmd") )
    q.start()
    # following code should execute right after the q.start() call (not until it returns)
    .....

But getting the error at the call of Process().start():
pickle.PicklingError: Can't pickle <type 'module'>: it's not found as
__builtin__.module

anybody could provide an alternative to call the function do_something() in
a separate thread ?
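
[Editorial sketch] One way to run `do_something()` in the background while
the main code carries on is the standard `threading` module. This is a
minimal sketch, assuming Python 3 and a task that only needs to run
concurrently, not in a separate process; the helper name
`start_in_background` and the `results` list are hypothetical and exist only
for illustration. It also shows that `("somecmd")` is a plain string, whereas
`args` expects a tuple such as `("somecmd",)`.

```python
import threading


def do_something(text, results):
    # Background work; here it only records the argument it was given.
    results.append(text)


def start_in_background(fn, *args):
    # *args is already a tuple, which is what Thread(args=...) expects.
    # Written by hand, a one-element tuple needs the trailing comma:
    # ("somecmd",) is a tuple, ("somecmd") is just the string "somecmd".
    worker = threading.Thread(target=fn, args=args)
    worker.start()
    return worker


if __name__ == "__main__":
    results = []
    worker = start_in_background(do_something, "somecmd", results)
    print("main code continues right after start()")
    worker.join()  # wait only at the point where the result is needed
    print(results)
```

Returning the `Thread` object lets the caller decide whether and when to
`join()`; nothing forces the main code to wait.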

From barry at barrys-emacs.org Mon Sep 2 11:36:20 2024
From: barry at barrys-emacs.org (Barry Scott)
Date: Mon, 2 Sep 2024 16:36:20 +0100
Subject: Getting a Process.start() error pickle.PicklingError: Can't pickle <type 'module'>: it's not found as __builtin__.module with Python 2.7
In-Reply-To:
References:
Message-ID:

> On 2 Sep 2024, at 15:00, marc nicole via Python-list wrote:
>
> I am using Python 2.7 on Windows 10

Why? Install Python 3.12 and it will be easier to get help and support.
If you have legacy that still needs porting then you can install 3.12 along side
the unsupported 3.12.

Barry

From mk1853387 at gmail.com Tue Sep 3 05:34:53 2024
From: mk1853387 at gmail.com (marc nicole)
Date: Tue, 3 Sep 2024 11:34:53 +0200
Subject: [Tutor] Getting a Process.start() error pickle.PicklingError: Can't pickle <type 'module'>: it's not found as __builtin__.module with Python 2.7
In-Reply-To:
References:
Message-ID:

Hello Alan,

Thanks for the reply, Here's the code I tested for the debug:

import time
from multiprocessing import Process

def do_Something():
    print('hello world!')

def start(fn):
    p = Process(target=fn, args=())
    p.start()

def ghello():
    print ("hello world g")

def fhello():
    print('hello world f')

if __name__ == "__main__":
    start(do_something)
    print("executed")
    exit(0)

but neither "Hello World" or "Executed" are displayed in the console which
finishes normally without returning any message.

Module naming is OK and don't think it is a problem related to that.

Now the question: when to use Process/multiprocessing and when to use
threading in Python? Is there a distinctive use case that can showcase when
to use either? Are they interchangeable?

Note that using threading, the console DID display the messages correctly!

Thanks.

On Tue, 3 Sept 2024 at 10:48, Alan Gauld via Tutor wrote:

> On 02/09/2024 15:00, marc nicole via Python-list wrote:
> > Hello,
> >
> > I am using Python 2.7 on Windows 10
>
> Others have pointed out that 2.7 is unsupported and has
> been for many years now. It's also inferior in most
> respects including its error reporting.
> If possible you should upgrade to 3.X
>
> > from multiprocessing import Process
> > def do_something(text):
> >     print(text)
> > if __name__ == "__main__":
> >     q = Process(target=do_something,args=("somecmd") )
> >     q.start()
> >     # following code should execute right after the q.start() call
>
> So what does happen? If you put a print statement here does it execute
> before or after the error message? It might make things easier to
> debug (clearer error traceback) if you put the code to create the thread
> into a separate function?
>
> def do_Something(text)...
>
> def start(fn):
>    q = Process....
>    q.start()
>
> if __name_....
>    start(do_something)
>    print('Something here')
>
>
> > But getting the error at the call of Process().start():
> > pickle.PicklingError: Can't pickle <type 'module'>: it's not found as
> > __builtin__.module
>
> But please show us the full error trace even if it's not much.
>
> Also check your module naming, is there a possibility
> you've named your file do_something.py or similar?
> (I'm guessing the function is what is being pickled?)
>
> > anybody could provide an alternative to call the function do_something() in
> > a separate thread ?
>
> Why not just use the Threading module?
> If it's as simple as just running something in a
> thread multiprocessing is probably not needed.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>
>
> _______________________________________________
> Tutor maillist - Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>

From alan at csail.mit.edu Mon Sep 2 03:55:15 2024
From: alan at csail.mit.edu (Alan Bawden)
Date: Mon, 02 Sep 2024 03:55:15 -0400
Subject: Python told me a Joke
Message-ID: <868qwafq1o.fsf@williamsburg.bawden.org>

Python 3.10.5 (v3.10.5:f37715, Jul 10 2022, 00:26:17) [GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> x,_,z = [1,2,3]

Works as expected.

Now I didn't expect the following to work (but Python sometimes
surprises me!), so I tried:

>>> x,2,z = [1,2,3]
  File "<stdin>", line 1
    x,2,z = [1,2,3]
    ^^^^^^^^^^^
SyntaxError: invalid syntax. Maybe you meant '==' or ':=' instead of '='?

Yeah, that makes sense, no surprises today... Except "maybe you meant
'=='..." caught my attention. _Could_ that be what someone would want
in this situation I wondered? So I tried:

>>> x,2,z == [1,2,3]
(1, 2, False)

Now that made me laugh.

- Alan

[ Some people reading this will be tempted to explain what's really
  going on here -- it's not hard to understand. But please remember that
  a joke is never funny if you have to explain it. ]

From janburse at fastmail.fm Mon Sep 2 04:05:30 2024
From: janburse at fastmail.fm (Mild Shock)
Date: Mon, 2 Sep 2024 10:05:30 +0200
Subject: Python told me a Joke
In-Reply-To: <868qwafq1o.fsf@williamsburg.bawden.org>
References: <868qwafq1o.fsf@williamsburg.bawden.org>
Message-ID:

You can try:

>>> 1,2 == 2,2
(1, True, 2)

It's the same as:

>>> 1, (2 == 2), 2
(1, True, 2)

Hope this helps!
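
[Editorial sketch] The grouping described in the reply above can be checked
without evaluating anything, by asking the standard `ast` module how the
expression parses. A minimal sketch, assuming Python 3.8 or later (where
literals parse as `ast.Constant`); the filename string "<example>" is an
arbitrary label:

```python
import ast

# Parse the expression from the post as a single expression.
tree = ast.parse("1,2 == 2,2", mode="eval")
outer = tree.body

# The comma binds more loosely than ==, so the outermost node is a tuple
# whose middle element is the comparison.
print(type(outer).__name__)
print([type(element).__name__ for element in outer.elts])

# Evaluating the parsed tree gives the result shown in the post.
print(eval(compile(tree, "<example>", "eval")))  # (1, True, 2)
```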

Alan Bawden wrote:
> Python 3.10.5 (v3.10.5:f37715, Jul 10 2022, 00:26:17) [GCC 4.9.2] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x,_,z = [1,2,3]
>
> Works as expected.
>
> Now I didn't expect the following to work (but Python sometimes
> surprises me!), so I tried:
>
> >>> x,2,z = [1,2,3]
>   File "<stdin>", line 1
>     x,2,z = [1,2,3]
>     ^^^^^^^^^^^
> SyntaxError: invalid syntax. Maybe you meant '==' or ':=' instead of '='?
>
> Yeah, that makes sense, no surprises today... Except "maybe you meant
> '=='..." caught my attention. _Could_ that be what someone would want
> in this situation I wondered? So I tried:
>
> >>> x,2,z == [1,2,3]
> (1, 2, False)
>
> Now that made me laugh.
>
> - Alan
>
> [ Some people reading this will be tempted to explain what's really
>   going on here -- it's not hard to understand. But please remember that
>   a joke is never funny if you have to explain it. ]
>

From geodandw at gmail.com Mon Sep 2 13:31:37 2024
From: geodandw at gmail.com (geodandw)
Date: Mon, 2 Sep 2024 13:31:37 -0400
Subject: Getting a Process.start() error pickle.PicklingError: Can't pickle <type 'module'>: it's not found as __builtin__.module with Python 2.7
In-Reply-To:
References:
Message-ID:

On 9/2/24 11:36, Barry Scott wrote:
>
>
>> On 2 Sep 2024, at 15:00, marc nicole via Python-list wrote:
>>
>> I am using Python 2.7 on Windows 10
>
> Why? Install Python 3.12 and it will be easier to get help and support.
> If you have legacy that still needs porting then you can install 3.12 along side
> the unsupported 3.12.
>
>
> Barry
>
I think you mean alongside the unsupported 2.7.

From sjeik_appie at hotmail.com Tue Sep 3 17:49:09 2024
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Tue, 03 Sep 2024 23:49:09 +0200
Subject: Synchronise annotations -> docstring
Message-ID:

Hi,

Are there any tools that check whether type annotations and Numpydoc
strings are consistent?
I did find this Vim plugin:
https://lxyuan0420.github.io/posts/til-vim-pydocstring-plugin. Looks
incredibly useful, but I haven't tried it yet.

Thanks!
AJ

From Keith.S.Thompson+u at gmail.com Tue Sep 3 16:47:13 2024
From: Keith.S.Thompson+u at gmail.com (Keith Thompson)
Date: Tue, 03 Sep 2024 13:47:13 -0700
Subject: Process.start
References:
Message-ID: <87wmjs1n3i.fsf@nosuchdomain.example.com>

ram at zedat.fu-berlin.de (Stefan Ram) writes:
> marc nicole wrote or quoted:
>>Thanks for the reply, Here's the code I tested for the debug:
>>print("executed")
>>but neither "Hello World" or "Executed" are displayed in the console which
>
>   It shouldn't spit out "Executed" 'cause there's a lowercase
>   "e" in the mix. Talk about sweating the small stuff!
>
>   That 'if __name__ == "__main__"' jazz? It's barking up the wrong
>   tree here, just muddying the waters. I'd 86 that clause for now.
>
>   In your start() function call, you're rockin' "do_something()",
>   but the actual function's defined as "do_Something()" with a
>   capital "S". Python's all about that case sensitivity.
>
>   Dropping that "exit(0)" bomb right after firing up the process?
>   That's like bailing on a gnarly wave before you even catch it.
>   It might pull the plug on the main process before the kid process
>   has a chance to strut its stuff.
>
>   Those "ghello" and "fhello" functions? They're just chillin'
>   there, not pulling their weight!
[...]

Stefan, you've recently started using a lot of slang in your posts. I
suggest that this is counterproductive. For me, it makes your posts
more difficult to read. I can imagine that it would be even more
difficult for readers whose first language is not English.

You also indent your own new text, which is exactly the opposite of
common Usenet conventions. (You've been doing this for a long time.)

Please consider prioritizing your readers' convenience over whatever
benefit you derive from your unconventional posting style.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u at gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From norman.robins59 at gmail.com Tue Sep 3 20:34:13 2024
From: norman.robins59 at gmail.com (Norman Robins)
Date: Tue, 3 Sep 2024 17:34:13 -0700
Subject: Trouble with mocking
Message-ID:

I'm somewhat new to mocking for unit tests.

I have some code like this:

In foo/bar/baz.py I have 2 functions I want to mock, one calls the other:

def function1_to_mock():
    .
    .
    .

def function2_to_mock():
    function1_to_mock()

In foo/bar/main.py I import 1 of these and call it:

from .baz import function2_to_mock

def some_function():
    function1_to_mock()
    .
    .
    .

I want to mock both function1_to_mock and function2_to_mock

In my test I do this:

def function1_to_mock(kid):
    return MOCKED_VALUE

@pytest.fixture(autouse=True)
def mock_dependencies():
    with patch('foo.bar.baz.function1_to_mock') as mock_function1_to_mock, \
         patch('foo.bar.main.function2_to_mock') as mock_function2_to_mock:
        mock_function2_to_mock.return_value = {
            'this': 'that'
        }
        yield mock_function1_to_mock, mock_function2_to_mock

def test_main(mock_dependencies):
    some_function()

When some_function is called the real function1_to_mock is called instead
of my mock.

Can someone please let me know how to properly mock these 2 functions.

Thanks!

From avi.e.gross at gmail.com Wed Sep 4 01:19:12 2024
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Wed, 4 Sep 2024 01:19:12 -0400
Subject: [Tutor] Getting a Process.start() error pickle.PicklingError: Can't pickle <type 'module'>: it's not found as __builtin__.module with Python 2.7
In-Reply-To:
References:
Message-ID: <059301dafe89$fac57470$f0505d50$@gmail.com>

Unfortunately, Alan, even though 2.7 was considered pickled, people keep
taking it back out of the bottle and wondering why it does not work so well!

There are companies like Microsoft and Samsung that let people know their OS
on their devices will no longer be supported with updates and some apps may
no longer work if downloaded. And, yet, I bet for years afterwards, people
will refuse to upgrade because they don't want to replace equipment or even
learn a new slightly different interface.

Having said that, I understand many people are stuck for various reasons and
are required to use whatever version is officially allowed. For some
questions, answers may still be provided. There are some workarounds or even
newer packages designed to do what is not otherwise available.

But many of us here may not be answering the questions as we have no reason
to be able to access the old software or interest.

-----Original Message-----
From: Tutor On Behalf Of Alan Gauld via Tutor
Sent: Tuesday, September 3, 2024 4:41 AM
To: tutor at python.org
Cc: python-list at python.org
Subject: Re: [Tutor] Getting a Process.start() error pickle.PicklingError: Can't pickle <type 'module'>: it's not found as __builtin__.module with Python 2.7

On 02/09/2024 15:00, marc nicole via Python-list wrote:
> Hello,
>
> I am using Python 2.7 on Windows 10

Others have pointed out that 2.7 is unsupported and has
been for many years now. It's also inferior in most
respects including its error reporting.
If possible you should upgrade to 3.X

> from multiprocessing import Process
> def do_something(text):
>     print(text)
> if __name__ == "__main__":
>     q = Process(target=do_something,args=("somecmd") )
>     q.start()
>     # following code should execute right after the q.start() call

So what does happen? If you put a print statement here does it execute
before or after the error message? It might make things easier to
debug (clearer error traceback) if you put the code to create the thread
into a separate function?

def do_Something(text)...

def start(fn):
   q = Process....
   q.start()

if __name_....
   start(do_something)
   print('Something here')


> But getting the error at the call of Process().start():
> pickle.PicklingError: Can't pickle <type 'module'>: it's not found as
> __builtin__.module

But please show us the full error trace even if it's not much.

Also check your module naming, is there a possibility
you've named your file do_something.py or similar?
(I'm guessing the function is what is being pickled?)

> anybody could provide an alternative to call the function do_something() in
> a separate thread ?

Why not just use the Threading module?
If it's as simple as just running something in a
thread multiprocessing is probably not needed.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From guenther.sohler at gmail.com Wed Sep 4 11:27:40 2024
From: guenther.sohler at gmail.com (Guenther Sohler)
Date: Wed, 4 Sep 2024 17:27:40 +0200
Subject: Crash when launching python
Message-ID:

Hi,

My "Project" is to integrate python support into OpenSCAD. It runs quite
well, but there are still issues on MacOS. On My MacOS it works, but it
crashes when I ship the DMG files.
It looks very much like python is not able to find the "startup" python
files and therefore crashes.

Is it possible to turn on debugging and to display on the console, where
python is loading files from ?
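
[Editorial sketch] One way to see where an interpreter looks for its files
is to report the relevant `sys` attributes at startup. This is a minimal
sketch, assuming the snippet can be run inside the packaged application; the
helper name `describe_python_paths` is hypothetical. Separately, starting
Python with `-v`, or setting the `PYTHONVERBOSE` environment variable, traces
each import as it happens.

```python
import sys


def describe_python_paths():
    """Collect the locations the running interpreter uses to find its files."""
    lines = [
        f"executable  : {sys.executable}",
        f"prefix      : {sys.prefix}",
        f"exec_prefix : {sys.exec_prefix}",
        "sys.path    :",
    ]
    lines.extend(f"    {entry}" for entry in sys.path)
    return "\n".join(lines)


if __name__ == "__main__":
    # Written to stderr so the report is still visible if stdout is redirected.
    print(describe_python_paths(), file=sys.stderr)
```

If the interpreter crashes before any Python code runs, this report will not
appear at all, which is itself a hint that the standard library was not
found at startup.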

Thank you

From PythonList at DancesWithMice.info Wed Sep 4 15:25:13 2024
From: PythonList at DancesWithMice.info (dn)
Date: Thu, 5 Sep 2024 07:25:13 +1200
Subject: Crash when launching python
In-Reply-To:
References:
Message-ID: <0585fdf2-70fa-42af-adf3-d90010f752d4@DancesWithMice.info>

On 5/09/24 03:27, Guenther Sohler via Python-list wrote:
> Hi,
>
> My "Project" is to integrate python support into OpenSCAD. It runs quite
> well, but
> there are still issues on MacOS. On My MacOS it works, but it crashes when
> I ship
> the DMG files.
> It looks very much like python is not able to find the "startup" python
> files and therefore crashes.
>
> Is it possible to turn on debugging and to display on the console, where
> python is loading files from ?

(am not a Mac user)

Starting with 'the basics', are you familiar with:

5. Using Python on a Mac
https://docs.python.org/3/using/mac.html
(and the more general preceding sections)

This doc likely includes mention of such parameters:

1.2. Environment variables
https://docs.python.org/3/using/cmdline.html#environment-variables

Here is a library for programmatic manipulation:

site - Site-specific configuration hook
https://docs.python.org/3/library/site.html#module-site

Please let us know how things progress...

--
Regards,
=dn

From barry at barrys-emacs.org Wed Sep 4 15:48:21 2024
From: barry at barrys-emacs.org (Barry Scott)
Date: Wed, 4 Sep 2024 20:48:21 +0100
Subject: Crash when launching python
In-Reply-To:
References:
Message-ID:

> On 4 Sep 2024, at 16:27, Guenther Sohler via Python-list wrote:
>
> Is it possible to turn on debugging and to display on the console, where
> python is loading files from ?
>

I assume you have a .app that is then packaged into a .dmg.

It will be the .app that you need to either build with a debug version of
your code or after building the .app edit the debug code into it.

Do you know that .app files are a tree of files?
You can right-click on an .app in Finder and it will have a "Show Package
Contents" option.

Or, using the terminal, you can:

cd .app/Contents

then have a look around.

Beware that you cannot use print to stdout for a .app as its stdin/stdout
do not go anywhere useful.

What I do is use code like this in the main function:

    sys.stdout = open( '/home/barry/debug.log', 'w', 1 )
    sys.stderr = sys.stdout

Now you can use print(); look in the debug.log to see what happened.
Also any tracebacks will end up in the debug.log.

Barry

From greg.ewing at canterbury.ac.nz Wed Sep 4 21:25:08 2024
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 5 Sep 2024 13:25:08 +1200
Subject: Crash when launching python
In-Reply-To:
References:
Message-ID:

On 5/09/24 7:48 am, Barry Scott wrote:
> Beware that you cannot use print to stdout for a .app as its stdin/stdout do not go anywhere useful.

You can invoke the executable inside the package from the Terminal.
Normally it's in the .app/Contents/MacOS subdirectory. The name
varies, but there's usually just one executable file in there. Run that
from a shell and you should see anything written to stdout or stderr.

--
Greg

From barry at barrys-emacs.org Thu Sep 5 09:51:01 2024
From: barry at barrys-emacs.org (Barry)
Date: Thu, 5 Sep 2024 14:51:01 +0100
Subject: Crash when launching python
In-Reply-To:
References:
Message-ID: <2C9E37C8-ECEF-4015-84A9-279598A28AA1@barrys-emacs.org>

> On 5 Sep 2024, at 02:32, Greg Ewing via Python-list wrote:
>
> Normally it's in the .app/Contents/MacOS subdirectory. The name
> varies, but there's usually just one executable file in there. Run that
> from a shell and you should see anything written to stdout or stderr.

I recall that does not always work for app code that expects the macOS
options that are passed in when launching from GUI.

Barry

From 711 at spooky.mart Thu Sep 5 00:40:10 2024
From: 711 at spooky.mart (711 Spooky Mart)
Date: Thu, 5 Sep 2024 04:40:10 -0000
Subject: PyBitmessage is not dead. Ignore the FUD.
Message-ID:

from [chan] bitmessage

PyBitmessage is not dead. Ignore the FUD.

I think Peter and gang just got tired of responding to this recurring
claim. I wouldn't even expect a rebuttal from them at this point.

Newer forks have been under development, one in Python3 and one in
Rust.

Here is the PyBitmessage fork in Python3:

https://github.com/kashikoibumi/PyBitmessage

Somebody should help the maintainer. I am no longer a Pythonista or I
would. I dumped Python after the 2.7 sunset as I expect they will
likely break Py3 equally bad some day. I will be occasionally
reviewing Koibumi's bash scripts, build scripts and documentation
looking for errors or points of improvement.

Spooky Mart Channel
[chan] 711
always open | stay spooky
https://bitmessage.org

From 711 at spooky.mart Thu Sep 5 00:42:20 2024
From: 711 at spooky.mart (711 Spooky Mart)
Date: Thu, 5 Sep 2024 04:42:20 -0000
Subject: Unofficial PyBitmessage port to run with Python3 and PyQt5
Message-ID:

from
https://www.reddit.com/r/bitmessage/comments/1d5ff18/unofficial_pybitmessage_port_to_run_with_python3/

Unofficial PyBitmessage port to run with Python3 and PyQt5

The official PyBitmessage still runs with outdated Python2 and PyQt4.

Recently, I'm trying to port PyBitmessage to run with Python3 and
PyQt5. Although it's unofficial version and still should have some
bugs, it seems now running as expected as long as I use it. I use
PyBitmessage mostly for communications on chans. One-to-one messaging
is not tested well since I have no friends at all.

The source code is published at GitHub:
https://github.com/kashikoibumi/PyBitmessage . The default branch
'py3qt' is most matured among other many branches.

If you try to use it, at first backup your PyBitmessage databases and
settings which are found in $HOME/.config/PyBitmessage/ if you are using
Linux.

Any bug reports or comments are welcome.

I'm using it on Devuan GNU+Linux Daedalus which is mostly compatible
to Debian GNU/Linux bookworm except systemd utilizations. All
dependencies are installed from Devuan (Debian) packages.

Spooky Mart Channel
[chan] 711
always open | stay spooky
https://bitmessage.org

From 711 at spooky.mart Thu Sep 5 00:47:39 2024
From: 711 at spooky.mart (711 Spooky Mart)
Date: Thu, 5 Sep 2024 04:47:39 -0000
Subject: Python3 Fork of BMWrapper
Message-ID:

from https://github.com/kashikoibumi/bmwrapper

bmwrapper is a poorly hacked together python script to let Thunderbird
and PyBitmessage communicate, similar to AyrA's (generally much
better) application: Bitmessage2Mail.

I'm on Linux, and don't feel like dealing with wine. So I wrote this
to fill the same role as B2M, until the source code was released.
(Which has since been
open-sourced: https://github.com/AyrA/BitMailServer)

The script (usually) parses outgoing messages to strip the ugly email
header information and put quoted text in PyBitmessage's '---'
delimited form. Attached images are included, base64 encoded, in an
img tag. Incoming messages are likewise parsed to reconstruct an
email, with attachment. This works...most of the time, and I've tried
to make it fail gracefully when something goes wrong.

Spooky Mart Channel
[chan] 711
always open | stay spooky
https://bitmessage.org

From 711 at spooky.mart Thu Sep 5 00:53:05 2024
From: 711 at spooky.mart (711 Spooky Mart)
Date: Thu, 5 Sep 2024 04:53:05 -0000
Subject: BitChan (python project)
Message-ID:

from https://github.com/813492291816/BitChan

BitChan is a decentralized anonymous imageboard inspired by BitBoard
and built on top of Bitmessage with Tor, I2P, and GnuPG.

BitChan solves a number of security and free speech problems that have
plagued most imageboards. Centralized imageboards can be taken offline
or hijacked and can leak user data. BitChan reduces the likelihood of
this by being decentralized, allowing each user to host their own
instance of the software, requiring all connections to go through
Tor/I2P, and not requiring JavaScript.

Users of centralized forums often have to deal with overzealous
moderators and sometimes even pressure from state powers that tend to
suffocate the forum's culture. BitChan's moderation is multifaceted,
but to be brief, the option exists to create entirely unmoderatable
boards to post content on. Due to its decentralized design, BitChan
cannot be moderated by its developers, the government, or any other
entity. Indeed, there is no way to disconnect BitChan from the
internet, and as long as people are still running Bitmessage, BitChan
is completely untouchable.

Spooky Mart Channel
[chan] 711
always open | stay spooky
https://bitmessage.org

From sch at fedora.email Fri Sep 6 04:05:14 2024
From: sch at fedora.email (Schimon Jehudah)
Date: Fri, 6 Sep 2024 11:05:14 +0300
Subject: Unofficial PyBitmessage port to run with Python3 and PyQt5
In-Reply-To:
References:
Message-ID: <20240906110514.3c03ec9b@workstation.localdomain>

Good day, 711 Spooky Mart!

Congratulations and thank you for investing efforts to enhance
PyBitmessage, as it is an important telecommunication means.

I use Arch Linux, and I would be happy to help you to test.

I have several tasks with Python, mostly on XMPP, so I am not sure I
would be available soon to help with coding.

Kind regards,
Schimon

On Thu, 5 Sep 2024 04:42:20 -0000
711 Spooky Mart via Python-list wrote:

> from
> https://www.reddit.com/r/bitmessage/comments/1d5ff18/unofficial_pybitmessage_port_to_run_with_python3/
>
> Unofficial PyBitmessage port to run with Python3 and PyQt5
>
> The official PyBitmessage still runs with outdated Python2 and PyQt4.
>
> Recently, I'm trying to port PyBitmessage to run with Python3 and
> PyQt5. Although it's unofficial version and still should have some
> bugs, it seems now running as expected as long as I use it. I use
> PyBitmessage mostly for communications on chans. One-to-one messaging
> is not tested well since I have no friends at all.
>
> The source code is published at GitHub:
> https://github.com/kashikoibumi/PyBitmessage . The default branch
> 'py3qt' is most matured among other many branches.
>
> If you try to use it, at first backup your PyBitmessage databases and
> settings which are found in $HOME/.config/PyBitmessage/ if you are using
> Linux.
>
> Any bug reports or comments are welcome.
>
> I'm using it on Devuan GNU+Linux Daedalus which is mostly compatible
> to Debian GNU/Linux bookworm except systemd utilizations. All
> dependencies are installed from Devuan (Debian) packages.
>
> Spooky Mart Channel
> [chan] 711
> always open | stay spooky
> https://bitmessage.org
>

From sch at fedora.email Fri Sep 6 04:00:30 2024
From: sch at fedora.email (Schimon Jehudah)
Date: Fri, 6 Sep 2024 11:00:30 +0300
Subject: BitChan (python project)
In-Reply-To:
References:
Message-ID: <20240906110030.6e9871df@workstation.localdomain>

Greetings, 711!

This is very good!

Do you know of Plebbit?

It might be good to interoperate with Plebbit too.

https://plebbit.com/

Kind regards,
Schimon

On Thu, 5 Sep 2024 04:53:05 -0000
711 Spooky Mart via Python-list wrote:

> from https://github.com/813492291816/BitChan
>
> BitChan is a decentralized anonymous imageboard inspired by BitBoard
> and built on top of Bitmessage with Tor, I2P, and GnuPG.
>
> BitChan solves a number of security and free speech problems that have
> plagued most imageboards. Centralized imageboards can be taken offline
> or hijacked and can leak user data. BitChan reduces the likelihood of
> this by being decentralized, allowing each user to host their own
> instance of the software, requiring all connections to go through
> Tor/I2P, and not requiring JavaScript.
>
> Users of centralized forums often have to deal with overzealous
> moderators and sometimes even pressure from state powers that tend to
> suffocate the forum's culture. BitChan's moderation is multifaceted,
> but to be brief, the option exists to create entirely unmoderatable
> boards to post content on. Due to its decentralized design, BitChan
> cannot be moderated by its developers, the government, or any other
> entity. Indeed, there is no way to disconnect BitChan from the
> internet, and as long as people are still running Bitmessage, BitChan
> is completely untouchable.
>
> Spooky Mart Channel
> [chan] 711
> always open | stay spooky
> https://bitmessage.org
>

From sch at fedora.email Fri Sep 6 04:07:31 2024
From: sch at fedora.email (Schimon Jehudah)
Date: Fri, 6 Sep 2024 11:07:31 +0300
Subject: Python3 Fork of BMWrapper
In-Reply-To:
References:
Message-ID: <20240906110731.78010405@workstation.localdomain>

Good day, 711 Spooky Mart!

Did you consider to add support for IRC or XMPP too?

Best regards,
Schimon

On Thu, 5 Sep 2024 04:47:39 -0000
711 Spooky Mart via Python-list wrote:

> from https://github.com/kashikoibumi/bmwrapper
>
> bmwrapper is a poorly hacked together python script to let Thunderbird
> and PyBitmessage communicate, similar to AyrA's (generally much
> better) application: Bitmessage2Mail.
>
> I'm on Linux, and don't feel like dealing with wine. So I wrote this
> to fill the same role as B2M, until the source code was released.
> (Which has since been
> open-sourced: https://github.com/AyrA/BitMailServer)
>
> The script (usually) parses outgoing messages to strip the ugly email
> header information and put quoted text in PyBitmessage's '---'
> delimited form. Attached images are included, base64 encoded, in an
> img tag. Incoming messages are likewise parsed to reconstruct an
> email, with attachment. This works...most of the time, and I've tried
> to make it fail gracefully when something goes wrong.
>
> Spooky Mart Channel
> [chan] 711
> always open | stay spooky
> https://bitmessage.org
>

From sch at fedora.email Fri Sep 6 04:10:31 2024
From: sch at fedora.email (Schimon Jehudah)
Date: Fri, 6 Sep 2024 11:10:31 +0300
Subject: PyBitmessage is not dead. Ignore the FUD.
In-Reply-To:
References:
Message-ID: <20240906111031.459736cb@workstation.localdomain>

Greetings!

I am interested in adding support for Bitmessage to Slixfeed news bot.

Support is currently provided to XMPP and it will be extended to Email,
IRC and Session.

https://git.xmpp-it.net/sch/Slixfeed

Schimon

On Thu, 5 Sep 2024 04:40:10 -0000
711 Spooky Mart via Python-list wrote:

> from [chan] bitmessage
>
> PyBitmessage is not dead. Ignore the FUD.
>
> I think Peter and gang just got tired of responding to this recurring
> claim. I wouldn't even expect a rebuttal from them at this point.
>
> Newer forks have been under development, one in Python3 and one in
> Rust.
>
> Here is the PyBitmessage fork in Python3:
>
> https://github.com/kashikoibumi/PyBitmessage
>
> Somebody should help the maintainer. I am no longer a Pythonista or I
> would. I dumped Python after the 2.7 sunset as I expect they will
> likely break Py3 equally bad some day. I will be occasionally
> reviewing Koibumi's bash scripts, build scripts and documentation
> looking for errors or points of improvement.
>
> Spooky Mart Channel
> [chan] 711
> always open | stay spooky
> https://bitmessage.org
>

From lukasz at langa.pl Sat Sep 7 10:26:06 2024
From: lukasz at langa.pl (Łukasz Langa)
Date: Sat, 7 Sep 2024 16:26:06 +0200
Subject: [RELEASE] Python 3.13.0RC2, 3.12.6, 3.11.10, 3.10.15, 3.9.20, and 3.8.20 are now available!
Message-ID: <81BAD7AA-08FF-4E55-B52D-161B92A5D385@langa.pl>

Hi there!

A big joint release today. Mostly security fixes but we also have the final release candidate of 3.13 so let's start with that!

Python 3.13.0RC2

Final opportunity to test and find any show-stopper bugs before we bless and release 3.13.0 final on October 1st.

Get it here: Python Release Python 3.13.0rc2 | Python.org

Call to action

We strongly encourage maintainers of third-party Python projects to prepare their projects for 3.13 compatibility during this phase, and where necessary publish Python 3.13 wheels on PyPI to be ready for the final release of 3.13.0. Any binary wheels built against Python 3.13.0rc2 will work with future versions of Python 3.13. As always, report any issues to the Python bug tracker.

Please keep in mind that this is a preview release and while it's as close to the final release as we can get it, its use is not recommended for production environments.

Core developers: time to work on documentation now

Are all your changes properly documented?
Are they mentioned in What's New?
Did you notice other changes you know of to have insufficient documentation?

As a reminder, until the final release of 3.13.0, the 3.13 branch is set up so that the Release Manager (@thomas) has to merge the changes. Please add him (@Yhg1s on GitHub) to any changes you think should go into 3.13.0. At this point, unless something critical comes up, it should really be documentation only. Other changes (including tests) will be pushed to 3.13.1.

New features in Python 3.13

- A new and improved interactive interpreter, based on PyPy's, featuring multi-line editing and color support, as well as colorized exception tracebacks.
- An experimental free-threaded build mode, which disables the Global Interpreter Lock, allowing threads to run more concurrently. The build mode is available as an experimental feature in the Windows and macOS installers as well.
- A preliminary, experimental JIT, providing the ground work for significant performance improvements.
- The locals() builtin function (and its C equivalent) now has well-defined semantics when mutating the returned mapping, which allows debuggers to operate more consistently.
- The (cyclic) garbage collector is now incremental, which should mean shorter pauses for collection in programs with a lot of objects.
- A modified version of mimalloc is now included, optional but enabled by default if supported by the platform, and required for the free-threaded build mode.
- Docstrings now have their leading indentation stripped, reducing memory use and the size of .pyc files. (Most tools handling docstrings already strip leading indentation.)
- The dbm module has a new dbm.sqlite3 backend that is used by default when creating new files.
- The minimum supported macOS version was changed from 10.9 to 10.13 (High Sierra). Older macOS versions will not be supported going forward.
- WASI is now a Tier 2 supported platform. Emscripten is no longer an officially supported platform (but Pyodide continues to support Emscripten).
- iOS is now a Tier 3 supported platform, with Android on the way as well.

Python 3.12.6

This is an expedited release for 3.12 due to security content. The schedule returns back to regular programming in October.

One notable change for macOS users: as mentioned in the previous release of 3.12, this release drops support for macOS versions 10.9 through 10.12. Versions of macOS older than 10.13 haven't been supported by Apple since 2019, and maintaining support for them has become too difficult. (All versions of Python 3.13 have already dropped support for them.)

Get it here: Python Release Python 3.12.6 | Python.org

92 commits.

Python 3.11.10

Python 3.11 joins the elite club of security-only versions with no binary installers.

Get it here: Python Release Python 3.11.10 | Python.org

28 commits.

Python 3.10.15

Get it here: Python Release Python 3.10.15 | Python.org

24 commits.

Python 3.9.20

Get it here: Python Release Python 3.9.20 | Python.org

22 commits.

Python 3.8.20

Python 3.8 is very close to End of Life (see the Release Schedule). Will this be the last release of 3.8 ever? We'll see... but now I think I jinxed it.

Get it here: Python Release Python 3.8.20 | Python.org

22 commits.

Security content in today's releases

- gh-123678 and gh-116741: Upgrade bundled libexpat to 2.6.3 to fix CVE-2024-28757, CVE-2024-45490, CVE-2024-45491 and CVE-2024-45492.
- gh-118486: os.mkdir() on Windows now accepts mode of 0o700 to restrict the new directory to the current user. This fixes CVE-2024-4030 affecting tempfile.mkdtemp() in scenarios where the base temporary directory is more permissive than the default.
- gh-123067: Fix quadratic complexity in parsing "-quoted cookie values with backslashes by http.cookies. Fixes CVE-2024-7592.
- gh-113171: Fixed various false positives and false negatives in IPv4Address.is_private, IPv4Address.is_global, IPv6Address.is_private, IPv6Address.is_global. Fixes CVE-2024-4032.
- gh-67693: Fix urllib.parse.urlunparse() and urllib.parse.urlunsplit() for URIs with path starting with multiple slashes and no authority. Fixes CVE-2015-2104.
- gh-121957: Fixed missing audit events around interactive use of Python, now also properly firing for python -i, as well as for python -m asyncio. The event in question is cpython.run_stdin.
- gh-122133: Authenticate the socket connection for the socket.socketpair() fallback on platforms where AF_UNIX is not available like Windows.
- gh-121285: Remove backtracking from tarfile header parsing for hdrcharset, PAX, and GNU sparse headers. That's CVE-2024-6232.
- gh-114572: ssl.SSLContext.cert_store_stats() and ssl.SSLContext.get_ca_certs() now correctly lock access to the certificate store, when the ssl.SSLContext is shared across multiple threads.
- gh-102988: email.utils.getaddresses() and email.utils.parseaddr() now return ('', '') 2-tuples in more situations where invalid email addresses are encountered instead of potentially inaccurate values. Add optional strict parameter to these two functions: use strict=False to get the old behavior, accept malformed inputs. getattr(email.utils, 'supports_strict_parsing', False) can be used to check if the strict parameter is available. This improves the CVE-2023-27043 fix.
- gh-123270: Sanitize names in zipfile.Path to avoid infinite loops (gh-122905) without breaking contents using legitimate characters. That's CVE-2024-8088.
- gh-121650: email headers with embedded newlines are now quoted on output. The generator will now refuse to serialize (write) headers that are unsafely folded or delimited; see verify_generated_headers. That's CVE-2024-6923.
- gh-119690: Fixes data type confusion in audit events raised by _winapi.CreateFile and _winapi.CreateNamedPipe.
- gh-116773: Fix instances of <_overlapped.Overlapped object at 0xXXX> still has pending operation at deallocation, the process may crash.
- gh-112275: A deadlock involving pystate.c's HEAD_LOCK in posixmodule.c at fork is now fixed.

Stay safe and upgrade!

Upgrading is highly recommended to all users of affected versions.

Thank you for your support

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organization contributions to the Python Software Foundation.

--
Łukasz Langa @ambv
on behalf of your friendly release team,

Ned Deily @nad
Steve Dower @steve.dower
Pablo Galindo Salgado @pablogsal
Łukasz Langa @ambv
Thomas Wouters @thomas
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From Karsten.Hilbert at gmx.net Sat Sep 7 11:48:01 2024
From: Karsten.Hilbert at gmx.net (Karsten Hilbert)
Date: Sat, 7 Sep 2024 17:48:01 +0200
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
Message-ID:

Dear all,

up to now I had been thinking this is a wise idiom (in code
that need not care whether it fails to do what it tries to
do^1):

    conn = psycopg2.connection(...)
    curs = conn.cursor()
    try:
        curs.execute(SOME_SQL)
    except PSYCOPG2-Exception:
        some logging being done, and, yes, I
        can safely inhibit propagation^1
    finally:
        conn.commit()    # will rollback, if SOME_SQL failed
        conn.close()

So today I had to learn that conn.commit() may very well
raise a DB related exception, too:

    psycopg2.errors.SerializationFailure: could not serialize access due to read/write dependencies among transactions
    DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt.
    TIP: The transaction might succeed if retried.

Now, what is the proper placement of the .commit() ?

(doing "with ...
as conn:" does not free me of committing appropriately) Should I try: curs.execute(SOME_SQL) conn.commit() except PSYCOPG2-Exception: some logging being done, and, yes, I can safely inhibit propagation^1 finally: conn.close() # which should .rollback() automagically in case we had not reached to .commit() ? Thanks for insights, Karsten #------------------------------- ^1: This particular code is writing configuration defaults supplied in-code when no value is yet to be found in the database. If it fails, no worries, the supplied default is used by follow-on code and storing it is re-tried next time around. #------------------------------- Exception details: Traceback (most recent call last): File "/usr/share/gnumed/Gnumed/wxpython/gmGuiMain.py", line 3472, in OnInit frame = gmTopLevelFrame(None, id = -1, title = _('GNUmed client'), size = (640, 440)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/share/gnumed/Gnumed/wxpython/gmGuiMain.py", line 191, in __init__ self.LayoutMgr = gmHorstSpace.cHorstSpaceLayoutMgr(self, -1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/share/gnumed/Gnumed/wxpython/gmHorstSpace.py", line 215, in __init__ self.top_panel = gmTopPanel.cTopPnl(self, -1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/share/gnumed/Gnumed/wxpython/gmTopPanel.py", line 52, in __init__ wxgTopPnl.wxgTopPnl.__init__(self, *args, **kwargs) File "/usr/share/gnumed/Gnumed/wxGladeWidgets/wxgTopPnl.py", line 33, in __init__ self._TCTRL_patient_selector = cActivePatientSelector(self, wx.ID_ANY, "") ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/share/gnumed/Gnumed/wxpython/gmPatSearchWidgets.py", line 1295, in __init__ cfg.get2 ( File "/usr/share/gnumed/Gnumed/pycommon/gmCfg.py", line 248, in get2 self.set ( File "/usr/share/gnumed/Gnumed/pycommon/gmCfg.py", line 367, in set rw_conn.commit() # will rollback if transaction failed ^^^^^^^^^^^^^^^^ psycopg2.errors.SerializationFailure: could not serialize access due to 
read/write dependencies among transactions DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt. TIP: The transaction might succeed if retried. 2024-08-20 22:17:04 INFO gm.cfg [140274204403392 UpdChkThread-148728] (/usr/share/gnumed/Gnumed/pycommon/gmCfg.py::get2() #148): creating option [horstspace.update.consider_latest_branch] with default [True] 2024-08-20 22:17:04 DEBUG gm.db_pool [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmConnectionPool.py::exception_is_connection_loss() #667): interpreting: could not serialize access due to read/write dependencies among transactions DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt. TIP: The transaction might succeed if retried. 2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #170): exception: could not serialize access due to read/write dependencies among transactions DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt. TIP: The transaction might succeed if retried. 
2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #171): type: 2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #172): list of attributes: 2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #178): add_note: 2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #178): args: ('could not serialize access due to read/write dependencies among transactions\nDETAIL: Reason code: Canceled on identification as a pivot, during commit attempt.\nTIP: The transaction might succeed if retried.\n',) 2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #178): cursor: None 2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #178): diag: 2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #178): pgcode: 40001 2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #178): pgerror: ERROR: could not serialize access due to read/write dependencies among transactions DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt. TIP: The transaction might succeed if retried. 
2024-08-20 22:17:04 DEBUG gm.logging [140274459512896 MainThread] (/usr/share/gnumed/Gnumed/pycommon/gmLog2.py::log_stack_trace() #178): with_traceback: -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B From rob.cliffe at btinternet.com Sat Sep 7 12:11:57 2024 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Sat, 7 Sep 2024 17:11:57 +0100 Subject: psycopg2: proper positioning of .commit() within try: except: blocks In-Reply-To: References: Message-ID: <610d3e06-63f3-459a-b060-e0d2737dfa9e@btinternet.com> On 07/09/2024 16:48, Karsten Hilbert via Python-list wrote: > Dear all, > > unto now I had been thinking this is a wise idiom (in code > that needs not care whether it fails to do what it tries to > do^1): > > conn = psycopg2.connection(...) > curs = conn.cursor() > try: > curs.execute(SOME_SQL) > except PSYCOPG2-Exception: > some logging being done, and, yes, I > can safely inhibit propagation^1 > finally: > conn.commit() # will rollback, if SOME_SQL failed > conn.close() > > So today I head to learn that conn.commit() may very well > raise a DB related exception, too: > > psycopg2.errors.SerializationFailure: could not serialize access due to read/write dependencies among transactions > DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt. > TIP: The transaction might succeed if retried. > > Now, what is the proper placement of the .commit() ? > > (doing "with ... as conn:" does not free me of committing appropriately) > > Should I > > try: > curs.execute(SOME_SQL) > conn.commit() > except PSYCOPG2-Exception: > some logging being done, and, yes, I > can safely inhibit propagation^1 > finally: > conn.close() # which should .rollback() automagically in case we had not reached to .commit() > > ? > > Thanks for insights, > Karsten I would put the curs.execute and the conn.commit in separate try...except blocks.? That way you know which one failed, and can put appropriate info in the log, which may help trouble-shooting. 
(The general rule is to keep try...except blocks small.? And of course only catch the exceptions you are interested in, which you seem to be already doing.) Best wishes Rob Cliffe From Karsten.Hilbert at gmx.net Sat Sep 7 15:44:36 2024 From: Karsten.Hilbert at gmx.net (Karsten Hilbert) Date: Sat, 7 Sep 2024 21:44:36 +0200 Subject: psycopg2: proper positioning of .commit() within try: except: blocks In-Reply-To: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> Message-ID: Am Sat, Sep 07, 2024 at 09:46:03AM -0700 schrieb Adrian Klaver: > >unto now I had been thinking this is a wise idiom (in code > >that needs not care whether it fails to do what it tries to > >do^1): > > > > conn = psycopg2.connection(...) > > In the above do you have: > > https://www.psycopg.org/docs/extensions.html#psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE > > psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE I do indeed. > Or is that in some other concurrent transaction? In fact in that codebase all transactions -- running concurrently or not -- are set to SERIALIZABLE. They are not psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT, for that matter. > > curs = conn.cursor() > > try: > > curs.execute(SOME_SQL) > > except PSYCOPG2-Exception: > > some logging being done, and, yes, I > > can safely inhibit propagation^1 > > finally: > > conn.commit() # will rollback, if SOME_SQL failed > > It will if you use with conn:, otherwise it up to you to do the rollback() > > Are you are doing a rollback() in except PSYCOPG2-Exception: ? No I don't but - to my understanding - an ongoing transaction is being closed upon termination of the hosting connection. Unless .commit() is explicitely being issued somewhere in the code that closing of a transaction will amount to a ROLLBACK. 
In case of SQL having failed within a given transaction a COMMIT will fail-but-rollback, too (explicit ROLLBACK would succeed while a COMMIT would fail and, in-effect, roll back). IOW, when SOME_SQL has failed it won't matter that I close the connection with conn.commit() and it won't matter that conn.commit() runs a COMMIT on the database -- an open transaction having run that failed SQL will still roll back as if ROLLBACK had been issued. Or else my mental model is wrong. https://www.psycopg.org/docs/connection.html#connection.close In the particular case I was writing about the SQL itself succeeded but then the COMMIT failed due to serialization. I was wondering about where to best place any needed conn.commit(). My knee-jerk reaction was to then put it last in the try: block... All this is probably more related to Python than to PostgreSQL. Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B From Karsten.Hilbert at gmx.net Sat Sep 7 16:45:24 2024 From: Karsten.Hilbert at gmx.net (Karsten Hilbert) Date: Sat, 7 Sep 2024 22:45:24 +0200 Subject: psycopg2: proper positioning of .commit() within try: except: blocks In-Reply-To: <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> Message-ID: Am Sat, Sep 07, 2024 at 01:03:34PM -0700 schrieb Adrian Klaver: > In the case you show you are doing commit() before the close() so any errors in the > transactions will show up then. My first thought would be to wrap the commit() in a > try/except and deal with error there. 
Right, and this was suggested elsewhere ;)

And, yeah, the actual code is much more involved :-D

#------------------------------------------------------------------------
def __safely_close_cursor_and_rollback_close_conn(close_cursor=None, rollback_tx=None, close_conn=None):
    if close_cursor:
        try:
            close_cursor()
        except PG_ERROR_EXCEPTION as pg_exc:
            _log.exception('cannot close cursor')
            gmConnectionPool.log_pg_exception_details(pg_exc)
    if rollback_tx:
        try:
            # need to rollback so ABORT state isn't retained in pooled connections
            rollback_tx()
        except PG_ERROR_EXCEPTION as pg_exc:
            _log.exception('cannot rollback transaction')
            gmConnectionPool.log_pg_exception_details(pg_exc)
    if close_conn:
        try:
            close_conn()
        except PG_ERROR_EXCEPTION as pg_exc:
            _log.exception('cannot close connection')
            gmConnectionPool.log_pg_exception_details(pg_exc)

#------------------------------------------------------------------------
def run_rw_queries (
    link_obj:_TLnkObj=None,
    queries:_TQueries=None,
    end_tx:bool=False,
    return_data:bool=None,
    get_col_idx:bool=False,
    verbose:bool=False
) -> tuple[list[dbapi.extras.DictRow], dict[str, int] | None]:
    """Convenience function for running read-write queries.

    Typically (part of) a transaction.

    Args:
        link_obj: None, cursor, connection
        queries:

        * a list of dicts [{'cmd': , 'args': or )
        * to be executed as a single transaction
        * the last query may usefully return rows, such as:

            SELECT currval('some_sequence');
                or
            INSERT/UPDATE ... RETURNING some_value;

        end_tx:

        * controls whether the transaction is finalized (eg.
          COMMITted/ROLLed BACK) or not, this allows the
          call to run_rw_queries() to be part of a framing
          transaction
        * if link_obj is a *connection* then "end_tx" will
          default to False unless it is explicitly set to
          True which is taken to mean "yes, you do have full
          control over the transaction" in which case the
          transaction is properly finalized
        * if link_obj is a *cursor* we CANNOT finalize the
          transaction because we would need the connection for that
        * if link_obj is *None* "end_tx" will, of course, always
          be True, because we always have full control over the
          connection, not ending the transaction would be pointless

        return_data:

        * if true, the returned data will include the rows
          the last query selected
        * if false, it returns None instead

        get_col_idx:

        * True: the returned tuple will include a dictionary
          mapping field names to column positions
        * False: the returned tuple includes None instead of a field mapping dictionary

    Returns:

        * (None, None) if last query did not return rows
        * ("fetchall() result", ) if last query returned any rows and "return_data" was True

        * for *index* see "get_col_idx"
    """
    assert queries is not None, ' must not be None'

    if link_obj is None:
        conn = get_connection(readonly = False)
        curs = conn.cursor()
        conn_close = conn.close
        tx_commit = conn.commit
        tx_rollback = conn.rollback
        curs_close = curs.close
        notices_accessor = conn
    else:
        conn_close = lambda *x: None
        tx_commit = lambda *x: None
        tx_rollback = lambda *x: None
        curs_close = lambda *x: None
        if isinstance(link_obj, dbapi._psycopg.cursor):
            curs = link_obj
            notices_accessor = curs.connection
        elif isinstance(link_obj, dbapi._psycopg.connection):
            if end_tx:
                tx_commit = link_obj.commit
                tx_rollback = link_obj.rollback
            curs = link_obj.cursor()
            curs_close = curs.close
            notices_accessor = link_obj
        else:
            raise ValueError('link_obj must be cursor, connection or None but not [%s]' % link_obj)

    for query in queries:
        try:
            args = query['args']
        except KeyError:
            args = None
        try:
            curs.execute(query['cmd'], args)
            if verbose:
                gmConnectionPool.log_cursor_state(curs)
            for notice in notices_accessor.notices:
                _log.debug(notice.replace('\n', '/').replace('\n', '/'))
            del notices_accessor.notices[:]
        # DB related exceptions
        except dbapi.Error as pg_exc:
            _log.error('query failed in RW connection')
            gmConnectionPool.log_pg_exception_details(pg_exc)
            for notice in notices_accessor.notices:
                _log.debug(notice.replace('\n', '/').replace('\n', '/'))
            del notices_accessor.notices[:]
            __safely_close_cursor_and_rollback_close_conn (
                curs_close,
                tx_rollback,
                conn_close
            )
            # privilege problem ?
            if pg_exc.pgcode == PG_error_codes.INSUFFICIENT_PRIVILEGE:
                details = 'Query: [%s]' % curs.query.decode(errors = 'replace').strip().strip('\n').strip().strip('\n')
                if curs.statusmessage != '':
                    details = 'Status: %s\n%s' % (
                        curs.statusmessage.strip().strip('\n').strip().strip('\n'),
                        details
                    )
                if pg_exc.pgerror is None:
                    msg = '[%s]' % pg_exc.pgcode
                else:
                    msg = '[%s]: %s' % (pg_exc.pgcode, pg_exc.pgerror)
                raise gmExceptions.AccessDenied (
                    msg,
                    source = 'PostgreSQL',
                    code = pg_exc.pgcode,
                    details = details
                )
            # other DB problem
            gmLog2.log_stack_trace()
            raise
        # other exception
        except Exception:
            _log.exception('error running query in RW connection')
            gmConnectionPool.log_cursor_state(curs)
            for notice in notices_accessor.notices:
                _log.debug(notice.replace('\n', '/').replace('\n', '/'))
            del notices_accessor.notices[:]
            gmLog2.log_stack_trace()
            __safely_close_cursor_and_rollback_close_conn (
                curs_close,
                tx_rollback,
                conn_close
            )
            raise

    data = None
    col_idx = None
    if return_data:
        try:
            data = curs.fetchall()
        except Exception:
            _log.exception('error fetching data from RW query')
            gmLog2.log_stack_trace()
            __safely_close_cursor_and_rollback_close_conn (
                curs_close,
                tx_rollback,
                conn_close
            )
            raise
        if get_col_idx:
            col_idx = get_col_indices(curs)
    curs_close()
    tx_commit()
    conn_close()
    return (data, col_idx)

#------------------------------------------------------------------------

Best,
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA
AC80 7D4F C89B

From Karsten.Hilbert at gmx.net Sat Sep 7 17:20:57 2024
From: Karsten.Hilbert at gmx.net (Karsten Hilbert)
Date: Sat, 7 Sep 2024 23:20:57 +0200
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
In-Reply-To: <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On Sat, Sep 07, 2024 at 02:09:28PM -0700, Adrian Klaver wrote:

> >Right, and this was suggested elsewhere ;)
> >
> >And, yeah, the actual code is much more involved :-D
> >
>
> I see that.
>
> The question is does the full code you show fail?
>
> The code sample you show in your original post is doing something very different than
> what you show in your latest post. At this point I do not understand the exact problem
> we are dealing with.

We are not dealing with an unsolved problem. I had been
asking for advice where to best place that .commit() call in
case I am overlooking benefits and drawbacks of choices.

The

    try:
        do something
    except:
        log something
    finally:
        .commit()

cadence is fairly Pythonic and elegant in that it ensures
the .commit() will always be reached regardless of exceptions
being thrown or not and them being handled or not.

It is also insufficient because the .commit() itself may
elicit exceptions (from the database).

So there's choices:

Ugly:

    try:
        do something
    except:
        log something
    finally:
        try:
            .commit()
        except:
            log some more

Fair but not feeling quite safe:

    try:
        do something
        .commit()
    except:
        log something

Boring and repetitive and safe(r):

    try:
        do something
    except:
        log something
    try:
        .commit()
    except:
        log something

I eventually opted for the last version, except for factoring
out the second try: except: block.
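
For illustration only, a minimal sketch of that last variant with the second try/except factored out into a helper. FakeConnection, FakeDBError, commit_or_log() and store_default() are hypothetical names invented for this sketch; real code would presumably use a psycopg2 connection and catch psycopg2.Error instead. The stub only exists so that the commit can be made to fail on demand.

```python
import logging

log = logging.getLogger("commit_demo")


class FakeDBError(Exception):
    """Hypothetical stand-in for the driver's exception class."""


class FakeConnection:
    """Hypothetical stub connection; commit() can be told to fail."""

    def __init__(self, commit_fails=False):
        self.commit_fails = commit_fails
        self.committed = False

    def execute(self, sql):
        pass  # pretend the statement succeeded

    def commit(self):
        if self.commit_fails:
            raise FakeDBError("could not serialize access")
        self.committed = True


def commit_or_log(conn):
    """The factored-out second try/except: True if the commit went through."""
    try:
        conn.commit()
    except FakeDBError:
        log.exception("commit failed")
        return False
    return True


def store_default(conn, sql):
    """Run one statement, log any DB error, then attempt the commit."""
    try:
        conn.execute(sql)
    except FakeDBError:
        log.exception("statement failed")
    return commit_or_log(conn)
```

As in the variant it mirrors, the commit is still attempted after a failed statement; the helper only makes sure that a failing commit is logged rather than propagated.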

Best,
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

From greg.ewing at canterbury.ac.nz Sat Sep 7 20:48:50 2024
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 8 Sep 2024 12:48:50 +1200
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
In-Reply-To:
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 8/09/24 9:20 am, Karsten Hilbert wrote:
>     try:
>         do something
>     except:
>         log something
>     finally:
>         .commit()
>
> cadence is fairly Pythonic and elegant in that it ensures
> the .commit() will always be reached regardless of exceptions
> being thrown or not and them being handled or not.

That seems wrong to me. I would have thought the commit should only
be attempted if everything went right.

What if there's a problem in your code that causes a non-SQL-related
exception when some but not all of the SQL statements in the
transaction have been issued? The database doesn't know something
has gone wrong, so it will happily commit a partially-completed
transaction and possibly corrupt your data.

This is how I normally do things like this:

    try:
        do something
        .commit()
    except:
        log something
        .rollback()

Doing an explicit rollback ensures that the transaction is always
rolled back if it is interrupted for any reason.
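
A runnable sketch of this commit-inside-try, rollback-in-except pattern. It uses the standard-library sqlite3 module as a stand-in for psycopg2, which is an assumption made only so the example needs no database server; the account table and the transfer() and balances() helpers are invented for the demonstration.

```python
import logging
import sqlite3

log = logging.getLogger("rollback_demo")


def transfer(conn, simulate_bug=False):
    """Move 10 units from 'a' to 'b'; return True only if committed."""
    try:
        conn.execute("UPDATE account SET balance = balance - 10 WHERE name = 'a'")
        if simulate_bug:
            # stand-in for a non-SQL failure between the two statements
            raise RuntimeError("bug in the middle of the transaction")
        conn.execute("UPDATE account SET balance = balance + 10 WHERE name = 'b'")
        conn.commit()
    except Exception:
        log.exception("transfer failed, rolling back")
        conn.rollback()
        return False
    return True


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM account"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 100)])
conn.commit()

failed = transfer(conn, simulate_bug=True)
after_failure = balances(conn)    # the half-done transfer was rolled back
succeeded = transfer(conn)
after_success = balances(conn)    # both updates were committed together
```

With the rollback in the except branch, the interrupted transfer leaves both balances untouched, and only the complete transfer becomes visible.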
-- Greg From Karsten.Hilbert at gmx.net Sun Sep 8 07:06:19 2024 From: Karsten.Hilbert at gmx.net (Karsten Hilbert) Date: Sun, 8 Sep 2024 13:06:19 +0200 Subject: psycopg2: proper positioning of .commit() within try: except: blocks In-Reply-To: References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com> Message-ID: Am Sun, Sep 08, 2024 at 12:48:50PM +1200 schrieb Greg Ewing via Python-list: > On 8/09/24 9:20 am, Karsten Hilbert wrote: > > try: > > do something > > except: > > log something > > finally: > > .commit() > > > >cadence is fairly Pythonic and elegant in that it ensures the > >the .commit() will always be reached regardless of exceptions > >being thrown or not and them being handled or not. > > That seems wrong to me. I would have thought the commit should only > be attempted if everything went right. > > What if there's a problem in your code that causes a non-SQL-related > exception when some but not all of the SQL statements in the > transaction bave been issued? The database doesn't know something > has gone wrong, so it will happily commit a partially-completed > transaction and possibly corrupt your data. A-ha ! try: run_some_SQL_that_succeeds() print(no_such_name) # tongue-in-cheek 1 / 0 # for good measure except SOME_DB_ERROR: print('some DB error, can be ignored for now') finally: commit() which is wrong, given that the failing *Python* statements may very well belong into the *business level* "transaction" which a/the database transaction is part of. See, that's why I was asking in the first place :-) I was overlooking implications. 
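
The hazard can be reproduced in a few lines. This is a sketch only: it uses the standard-library sqlite3 module instead of psycopg2 (an assumption, so that it runs without a server), and the account table and the function name are made up for the demonstration.

```python
import sqlite3


def transfer_with_finally_commit(conn):
    """Deliberately flawed: commits in finally, whatever happened before."""
    try:
        conn.execute("UPDATE account SET balance = balance - 10 WHERE name = 'a'")
        1 / 0  # stand-in for any non-SQL bug in the business logic
        conn.execute("UPDATE account SET balance = balance + 10 WHERE name = 'b'")
    except sqlite3.Error:
        print("some DB error, can be ignored for now")
    finally:
        conn.commit()  # runs even while ZeroDivisionError is propagating


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 100)])
conn.commit()

try:
    transfer_with_finally_commit(conn)
except ZeroDivisionError:
    pass  # the Python-level failure still reaches the caller

conn.rollback()  # too late: the first UPDATE has already been committed
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

Here the debit from 'a' is persisted while the credit to 'b' never happens, which is exactly the partially-completed transaction described above.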
Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B From Karsten.Hilbert at gmx.net Sun Sep 8 07:13:37 2024 From: Karsten.Hilbert at gmx.net (Karsten Hilbert) Date: Sun, 8 Sep 2024 13:13:37 +0200 Subject: psycopg2: proper positioning of .commit() within try: except: blocks In-Reply-To: References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com> Message-ID: Am Sun, Sep 08, 2024 at 12:48:50PM +1200 schrieb Greg Ewing via Python-list: > On 8/09/24 9:20 am, Karsten Hilbert wrote: > > try: > > do something > > except: > > log something > > finally: > > .commit() > > > >cadence is fairly Pythonic and elegant in that it ensures the > >the .commit() will always be reached regardless of exceptions > >being thrown or not and them being handled or not. > > That seems wrong to me. I would have thought the commit should only > be attempted if everything went right. It is only attempted when "everything" went right. The fault in my thinking was what the "everything" might encompass. When some SQL fails it won't matter whether I say conn.commit() or conn.rollback() or, in fact, nothing at all -- the (DB !) transaction will be rolled back in any case. However, that reasoning missed this: > What if there's a problem in your code that causes a non-SQL-related > exception when some but not all of the SQL statements in the > transaction bave been [-- even successfully --] issued? Still, in this code pattern: > try: > do something > .commit() > except: > log something it doesn't technically matter whether I say .commit or .rollback here: > .rollback() ... but ... > Doing an explicit rollback ensures that the transaction is always > rolled back if it is interrupted for any reason. 
explicit is better than implicit ;-)

Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

From rob.cliffe at btinternet.com Sun Sep 8 09:58:03 2024
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Sun, 8 Sep 2024 14:58:03 +0100
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
In-Reply-To:
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID: <62133ebd-a4a3-471b-9acc-3a988b4fcbd7@btinternet.com>

On 07/09/2024 22:20, Karsten Hilbert via Python-list wrote:
> On Sat, Sep 07, 2024 at 02:09:28PM -0700, Adrian Klaver wrote:
>
>>> Right, and this was suggested elsewhere ;)
>>>
>>> And, yeah, the actual code is much more involved :-D
>>>
>> I see that.
>>
>> The question is does the full code you show fail?
>>
>> The code sample you show in your original post is doing something very different than
>> what you show in your latest post. At this point I do not understand the exact problem
>> we are dealing with.
> We are not dealing with an unsolved problem. I had been
> asking for advice where to best place that .commit() call in
> case I am overlooking benefits and drawbacks of choices.
>
> The
>
>     try:
>         do something
>     except:
>         log something
>     finally:
>         .commit()
>
> cadence is fairly Pythonic and elegant in that it ensures that
> the .commit() will always be reached regardless of exceptions
> being thrown or not and them being handled or not.
>
> It is also insufficient because the .commit() itself may
> elicit exceptions (from the database).
>
> So there's choices:
>
> Ugly:
>
>     try:
>         do something
>     except:
>         log something
>     finally:
>         try:
>             .commit()
>         except:
>             log some more
>
> Fair but not feeling quite safe:
>
>     try:
>         do something
>         .commit()
>     except:
>         log something
>
> Boring and repetitive and safe(r):
>
>     try:
>         do something
>     except:
>         log something
>     try:
>         .commit()
>     except:
>         log something
>
> I eventually opted for the last version, except for factoring
> out the second try: except: block.
>
> Best,
> Karsten
> --
> GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

Unless I'm missing something, the 1st & 3rd versions always do the commit() even after
the first bit fails, which seems wrong.

I suggest the 1st version but replacing "finally" by "else". Then the try-commit-except
will not be executed if the "something" fails.

Perhaps the extra indentation of the second try block is a bit ugly, but it is more
important that it does the right thing.

If it is convenient (it may not be) to put the whole thing in a function, you may feel
that the following is less ugly:

    try:
        do something
    except:
        log something
        return
    try:
        .commit()
    except:
        log some more
    return

Best wishes
Rob Cliffe

From Karsten.Hilbert at gmx.net Sun Sep 8 10:13:49 2024
From: Karsten.Hilbert at gmx.net (Karsten Hilbert)
Date: Sun, 8 Sep 2024 16:13:49 +0200
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
In-Reply-To: <62133ebd-a4a3-471b-9acc-3a988b4fcbd7@btinternet.com>
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com> <62133ebd-a4a3-471b-9acc-3a988b4fcbd7@btinternet.com>
Message-ID:

On Sun, Sep 08, 2024 at 02:58:03PM +0100, Rob Cliffe via Python-list wrote:

> >Ugly:
> >
> >    try:
> >        do something
> >    except:
> >        log something
> >    finally:
> >        try:
> >            .commit()
> >        except:
> >            log some more
> >
> >Fair but not feeling quite safe:
> >
> >    try:
> >        do something
> >        .commit()
> >    except:
> >        log something
> >
> >Boring and repetitive and safe(r):
> >
> >    try:
> >        do something
> >    except:
> >        log something
> >    try:
> >        .commit()
> >    except:
> >        log something
> >
> >I eventually opted for the last version, except for factoring
> >out the second try: except: block.
> Unless I'm missing something, the 1st & 3rd versions always do the commit() even after
> the first bit fails, which seems wrong.

Well, it does run a Python function called "commit". That
function will call "COMMIT" on the database. The end result
will be correct (at the SQL level) because the COMMIT will
not effect a durable commit of data when the SQL in "do
something" has failed.

We have, however, established that there may well be other
things belonging to the running business level transaction
which may fail and which should lead to data not being
committed despite being technically correct at the SQL level.

> I suggest the 1st version but replacing "finally" by "else". Then the try-commit-except
> will not be executed if the "something" fails.

The whole point was to consolidate the commit into one place.
It is unfortunately named, though. It should be called
"close_transaction".

> Perhaps the extra indentation of the second try block is a bit ugly, but it is more
> important that it does the right thing.
> If it is convenient (it may not be) to put the whole thing in a function, you may feel
> that the following is less ugly:

The whole thing does reside inside a function but the exit-early pattern

>     try:
>         do something
>     except:
>         log something
>         return
>     try:
>         .commit()
>     except:
>         log some more
>     return

won't help as there's more stuff to be done inside that function.
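
For comparison, the "else" variant suggested in the quoted text does not need an early return, so the function can carry on afterwards. The following is a minimal sketch under assumptions: sqlite3 stands in for psycopg2, and the names save_then_carry_on, make_db and the items table are invented for the example. Note that it only handles database errors; any other exception skips the commit but propagates to the caller, who would still have to roll back.

```python
import logging
import sqlite3

_log = logging.getLogger(__name__)


def save_then_carry_on(conn, values):
    """Return True if the values were committed; never exits early."""
    committed = False
    try:
        for value in values:
            conn.execute("INSERT INTO items VALUES (?)", (value,))
    except sqlite3.Error:
        _log.exception("statement failed")
        conn.rollback()
    else:
        # Reached only if the try body raised nothing at all.
        try:
            conn.commit()
            committed = True
        except sqlite3.Error:
            _log.exception("commit failed")
            conn.rollback()
    # ... further work in the same function would go here ...
    return committed


def make_db():
    """Create a throwaway in-memory database with a NOT NULL column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (value INTEGER NOT NULL)")
    conn.commit()
    return conn


def demo():
    """Return (ok_after_bad_input, rows_after_bad, ok_after_good, rows_after_good)."""
    conn = make_db()
    count = "SELECT count(*) FROM items"
    bad = save_then_carry_on(conn, [1, None])  # None violates NOT NULL
    rows_bad = conn.execute(count).fetchone()[0]
    good = save_then_carry_on(conn, [1, 2])
    rows_good = conn.execute(count).fetchone()[0]
    return bad, rows_bad, good, rows_good
```

The design point is that the commit lives in exactly one branch, and both failure branches leave the connection in a clean state before the function continues.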
Thanks,
Karsten

For what it's worth here's the current state of code:

#------------------------------------------------------------------------
def __safely_close_cursor_and_rollback_close_conn(close_cursor=None, rollback_tx=None, close_conn=None):
    if close_cursor:
        try:
            close_cursor()
        except PG_ERROR_EXCEPTION as pg_exc:
            _log.exception('cannot close cursor')
            gmConnectionPool.log_pg_exception_details(pg_exc)
    if rollback_tx:
        try:
            # need to rollback so ABORT state isn't retained in pooled connections
            rollback_tx()
        except PG_ERROR_EXCEPTION as pg_exc:
            _log.exception('cannot rollback transaction')
            gmConnectionPool.log_pg_exception_details(pg_exc)
    if close_conn:
        try:
            close_conn()
        except PG_ERROR_EXCEPTION as pg_exc:
            _log.exception('cannot close connection')
            gmConnectionPool.log_pg_exception_details(pg_exc)

#------------------------------------------------------------------------
def __log_notices(notices_accessor=None):
    for notice in notices_accessor.notices:
        _log.debug(notice.replace('\n', '/').replace('\n', '/'))
    del notices_accessor.notices[:]

#------------------------------------------------------------------------
def __perhaps_reraise_as_permissions_error(pg_exc, curs):
    if pg_exc.pgcode != PG_error_codes.INSUFFICIENT_PRIVILEGE:
        return

    # privilege problem -- normalize as GNUmed exception
    details = 'Query: [%s]' % curs.query.decode(errors = 'replace').strip().strip('\n').strip().strip('\n')
    if curs.statusmessage != '':
        details = 'Status: %s\n%s' % (
            curs.statusmessage.strip().strip('\n').strip().strip('\n'),
            details
        )
    if pg_exc.pgerror is None:
        msg = '[%s]' % pg_exc.pgcode
    else:
        msg = '[%s]: %s' % (pg_exc.pgcode, pg_exc.pgerror)
    raise gmExceptions.AccessDenied (
        msg,
        source = 'PostgreSQL',
        code = pg_exc.pgcode,
        details = details
    )

#------------------------------------------------------------------------
def run_rw_queries (
    link_obj:_TLnkObj=None,
    queries:_TQueries=None,
    end_tx:bool=False,
    return_data:bool=None,
    get_col_idx:bool=False,
    verbose:bool=False
) -> tuple[list[dbapi.extras.DictRow], dict[str, int] | None]:
    """Convenience function for running read-write queries.

    Typically (part of) a transaction.

    Args:
        link_obj: None, cursor, connection
        queries:

        * a list of dicts [{'cmd': , 'args': or )
        * to be executed as a single transaction
        * the last query may usefully return rows, such as:

            SELECT currval('some_sequence');
                or
            INSERT/UPDATE ... RETURNING some_value;

        end_tx:

        * controls whether the transaction is finalized (eg.
          COMMITted/ROLLed BACK) or not, this allows the
          call to run_rw_queries() to be part of a framing
          transaction
        * if link_obj is a *connection* then "end_tx" will
          default to False unless it is explicitly set to
          True which is taken to mean "yes, you do have full
          control over the transaction" in which case the
          transaction is properly finalized
        * if link_obj is a *cursor* we CANNOT finalize the
          transaction because we would need the connection for that
        * if link_obj is *None* "end_tx" will, of course, always
          be True, because we always have full control over the
          connection, not ending the transaction would be pointless

        return_data:

        * if true, the returned data will include the rows
          the last query selected
        * if false, it returns None instead

        get_col_idx:

        * True: the returned tuple will include a dictionary
          mapping field names to column positions
        * False: the returned tuple includes None instead of a field mapping dictionary

    Returns:

        * (None, None) if last query did not return rows
        * ("fetchall() result", ) if last query returned any rows and "return_data" was True

        * for *index* see "get_col_idx"
    """
    assert queries is not None, ' must not be None'
    assert isinstance(link_obj, (dbapi._psycopg.connection, dbapi._psycopg.cursor, type(None))), ' must be None, a cursor, or a connection, but [%s] is of type (%s)' % (link_obj, type(link_obj))

    if link_obj is None:
        conn = get_connection(readonly = False)
        curs = conn.cursor()
        conn_close = conn.close
        tx_commit = conn.commit
        tx_rollback = conn.rollback
        curs_close = curs.close
        notices_accessor = conn
    else:
        conn_close = lambda *x: None
        tx_commit = lambda *x: None
        tx_rollback = lambda *x: None
        curs_close = lambda *x: None
        if isinstance(link_obj, dbapi._psycopg.cursor):
            curs = link_obj
            notices_accessor = curs.connection
        elif isinstance(link_obj, dbapi._psycopg.connection):
            curs = link_obj.cursor()
            curs_close = curs.close
            notices_accessor = link_obj
            if end_tx:
                tx_commit = link_obj.commit
                tx_rollback = link_obj.rollback
    for query in queries:
        try:
            args = query['args']
        except KeyError:
            args = None
        try:
            curs.execute(query['cmd'], args)
            if verbose:
                gmConnectionPool.log_cursor_state(curs)
            __log_notices(notices_accessor)
        # DB related exceptions
        except dbapi.Error as pg_exc:
            _log.error('query failed in RW connection')
            gmConnectionPool.log_pg_exception_details(pg_exc)
            __log_notices(notices_accessor)
            __safely_close_cursor_and_rollback_close_conn (
                curs_close,
                tx_rollback,
                conn_close
            )
            __perhaps_reraise_as_permissions_error(pg_exc, curs)
            # not a permissions problem
            gmLog2.log_stack_trace()
            raise
        # other exceptions
        except Exception:
            _log.exception('error running query in RW connection')
            gmConnectionPool.log_cursor_state(curs)
            __log_notices(notices_accessor)
            gmLog2.log_stack_trace()
            __safely_close_cursor_and_rollback_close_conn (
                curs_close,
                tx_rollback,
                conn_close
            )
            raise
    if not return_data:
        curs_close()
        tx_commit()
        conn_close()
        return (None, None)

    data = None
    try:
        data = curs.fetchall()
    except Exception:
        _log.exception('error fetching data from RW query')
        gmLog2.log_stack_trace()
        __safely_close_cursor_and_rollback_close_conn (
            curs_close,
            tx_rollback,
            conn_close
        )
        raise

    col_idx = None
    if get_col_idx:
        col_idx = get_col_indices(curs)
    curs_close()
    tx_commit()
    conn_close()
    return (data, col_idx)

#------------------------------------------------------------------------

--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

From greg.ewing at canterbury.ac.nz Sun Sep 8 21:33:24 2024
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 9 Sep 2024
13:33:24 +1200
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
In-Reply-To:
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 8/09/24 11:03 pm, Jon Ribbens wrote:
> On 2024-09-08, Greg Ewing wrote:
>>     try:
>>         do something
>>         .commit()
>>     except:
>>         log something
>>         .rollback()
>
> What if there's an exception in your exception handler? I'd put the
> rollback in the 'finally' handler, so it's always called.

Good point. Putting the rollback first would be safer.

--
Greg

From greg.ewing at canterbury.ac.nz Sun Sep 8 21:48:32 2024
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 9 Sep 2024 13:48:32 +1200
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
In-Reply-To:
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com> <62133ebd-a4a3-471b-9acc-3a988b4fcbd7@btinternet.com>
Message-ID:

On 9/09/24 2:13 am, Karsten Hilbert wrote:
> For what it's worth here's the current state of code:

That code doesn't inspire much confidence in me. It's far too
convoluted with too much micro-management of exceptions.

I would much prefer to have just *one* place where exceptions are
caught and logged.
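
One way to get a single place where exceptions are caught and logged is a small context manager. The sketch below is not taken from the code under discussion: it assumes sqlite3 as a stand-in for psycopg2, and the name transaction() and the items table are hypothetical. It rolls back before logging, in line with the "rollback first" remark earlier in this message.

```python
import logging
import sqlite3
from contextlib import contextmanager

_log = logging.getLogger(__name__)


@contextmanager
def transaction(conn):
    """Commit if the block succeeds; otherwise roll back, log once, re-raise."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()  # first, so a failure while logging cannot skip it
        _log.exception("transaction rolled back")
        raise


def demo():
    """Return row counts after a failed block and after a successful one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (value INTEGER)")
    conn.commit()
    count = "SELECT count(*) FROM items"
    try:
        with transaction(conn):
            conn.execute("INSERT INTO items VALUES (1)")
            raise KeyError("business-level failure, not an SQL error")
    except KeyError:
        pass
    after_failure = conn.execute(count).fetchone()[0]
    with transaction(conn):
        conn.execute("INSERT INTO items VALUES (2)")
    after_success = conn.execute(count).fetchone()[0]
    return after_failure, after_success
```

Callers then write only the "do something" part inside a with block, and the commit, rollback and logging decisions live in one function.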
--
Greg

From Karsten.Hilbert at gmx.net Mon Sep 9 04:40:14 2024
From: Karsten.Hilbert at gmx.net (Karsten Hilbert)
Date: Mon, 9 Sep 2024 10:40:14 +0200
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
In-Reply-To:
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com> <62133ebd-a4a3-471b-9acc-3a988b4fcbd7@btinternet.com>
Message-ID:

On Mon, Sep 09, 2024 at 01:48:32PM +1200, Greg Ewing via Python-list wrote:

> That code doesn't inspire much confidence in me. It's far too
> convoluted with too much micro-management of exceptions.
>
> I would much prefer to have just *one* place where exceptions are
> caught and logged.

I am open to suggestions.

Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

From Karsten.Hilbert at gmx.net Mon Sep 9 04:56:47 2024
From: Karsten.Hilbert at gmx.net (Karsten Hilbert)
Date: Mon, 9 Sep 2024 10:56:47 +0200
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
In-Reply-To:
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com> <62133ebd-a4a3-471b-9acc-3a988b4fcbd7@btinternet.com>
Message-ID:

On Mon, Sep 09, 2024 at 01:48:32PM +1200, Greg Ewing via Python-list wrote:

> That code doesn't inspire much confidence in me. It's far too
> convoluted with too much micro-management of exceptions.

It is catching two exceptions, re-raising both of them,
except for re-raising one of them as another kind of
exception. What would you do differently, and how?

> I would much prefer to have just *one* place where exceptions are
> caught and logged.

There's, of course, a top level handler which logs and
user-displays-as-appropriate any exceptions. This is code
from a much larger codebase.
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

From jon+usenet at unequivocal.eu Sun Sep 8 07:03:21 2024
From: jon+usenet at unequivocal.eu (Jon Ribbens)
Date: Sun, 8 Sep 2024 11:03:21 -0000 (UTC)
Subject: psycopg2: proper positioning of .commit() within try: except: blocks
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 2024-09-08, Greg Ewing wrote:
> On 8/09/24 9:20 am, Karsten Hilbert wrote:
>>     try:
>>         do something
>>     except:
>>         log something
>>     finally:
>>         .commit()
>>
>> cadence is fairly Pythonic and elegant in that it ensures the
>> the .commit() will always be reached regardless of exceptions
>> being thrown or not and them being handled or not.
>
> That seems wrong to me. I would have thought the commit should only
> be attempted if everything went right.
>
> What if there's a problem in your code that causes a non-SQL-related
> exception when some but not all of the SQL statements in the
> transaction bave been issued? The database doesn't know something
> has gone wrong, so it will happily commit a partially-completed
> transaction and possibly corrupt your data.
>
> This is how I normally do things like this:
>
>     try:
>         do something
>         .commit()
>     except:
>         log something
>         .rollback()
>
> Doing an explicit rollback ensures that the transaction is always
> rolled back if it is interrupted for any reason.

What if there's an exception in your exception handler? I'd put the
rollback in the 'finally' handler, so it's always called. If you've
already called 'commit' then the rollback does nothing of course.
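
The "commit in try, rollback in finally" arrangement described above can be sketched as follows. As an assumption for the sake of a self-contained example, sqlite3 is used in place of psycopg2, and insert_all() and the items table are invented names. With sqlite3, calling rollback() when no transaction is open is harmless, which is what makes the unconditional call in finally safe here.

```python
import sqlite3


def insert_all(conn, values, fail_before_commit=False):
    """Insert values as one transaction; roll back unless commit was reached."""
    try:
        for value in values:
            conn.execute("INSERT INTO items VALUES (?)", (value,))
        if fail_before_commit:
            raise RuntimeError("exception unrelated to SQL")
        conn.commit()
    finally:
        # Does nothing after a successful commit, undoes partial work otherwise.
        conn.rollback()


def demo():
    """Return row counts after a failed run and after a successful run."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (value INTEGER)")
    conn.commit()
    count = "SELECT count(*) FROM items"
    try:
        insert_all(conn, [1, 2], fail_before_commit=True)
    except RuntimeError:
        pass
    after_failure = conn.execute(count).fetchone()[0]
    insert_all(conn, [3, 4])
    after_success = conn.execute(count).fetchone()[0]
    return after_failure, after_success
```

Because the rollback no longer lives in an except block, an error raised while logging or handling the first exception cannot prevent it from running.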

From jon+usenet at unequivocal.eu Mon Sep 9 05:13:40 2024
From: jon+usenet at unequivocal.eu (Jon Ribbens)
Date: Mon, 9 Sep 2024 09:13:40 -0000 (UTC)
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 2024-09-08, Lawrence D'Oliveiro wrote:
> On Sun, 8 Sep 2024 11:03:21 -0000 (UTC), Jon Ribbens wrote:
>> What if there's an exception in your exception handler? I'd put the
>> rollback in the 'finally' handler, so it's always called. If you've
>> already called 'commit' then the rollback does nothing of course.
>
> In any DBMS worth its salt, rollback is something that happens
> automatically if the transaction should fail to complete for any reason.
>
> This applies for any failure reason, up to and including a program or
> system crash.

If it's a program or system crash, sure, but anything less than that -
how would the database even know, unless the program told it?

From jon+usenet at unequivocal.eu Mon Sep 9 06:00:11 2024
From: jon+usenet at unequivocal.eu (Jon Ribbens)
Date: Mon, 9 Sep 2024 10:00:11 -0000 (UTC)
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 2024-09-09, Lawrence D'Oliveiro wrote:
> On Mon, 9 Sep 2024 09:13:40 -0000 (UTC), Jon Ribbens wrote:
>> On 2024-09-08, Lawrence D'Oliveiro wrote:
>>> On Sun, 8 Sep 2024 11:03:21 -0000 (UTC), Jon Ribbens wrote:
>>>> What if there's an exception in your exception handler? I'd put the
>>>> rollback in the 'finally' handler, so it's always called. If you've
>>>> already called 'commit' then the rollback does nothing of course.
>>>
>>> In any DBMS worth its salt, rollback is something that happens
>>> automatically if the transaction should fail to complete for any
>>> reason.
>>>
>>> This applies for any failure reason, up to and including a program or
>>> system crash.
>>
>> If it's a program or system crash, sure, but anything less than that -
>> how would the database even know, unless the program told it?
>
> The database only needs to commit when it is explicitly told. Anything
> less -- no commit.

So the Python code is half-way through a transaction when it throws
a (non-database-related) exception and that thread of execution is
aborted. The database connection returns to the pool, and is re-used
by another thread which continues using it to perform a different
sequence of operations ... ending in a COMMIT, which commits
one-and-a-half transactions.

From Karsten.Hilbert at gmx.net Mon Sep 9 13:28:23 2024
From: Karsten.Hilbert at gmx.net (Karsten Hilbert)
Date: Mon, 9 Sep 2024 19:28:23 +0200
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
In-Reply-To:
References: <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On Mon, Sep 09, 2024 at 10:00:11AM -0000, Jon Ribbens via Python-list wrote:

> > The database only needs to commit when it is explicitly told. Anything
> > less -- no commit.
>
> So the Python code is half-way through a transaction when it throws
> a (non-database-related) exception and that thread of execution is
> aborted. The database connection returns to the pool, and is re-used
> by another thread which continues using it to perform a different
> sequence of operations ... ending in a COMMIT, which commits
> one-and-a-half transactions.

Right, but that's true only when writable connections are
being pooled, which should be avoidable in many cases.

Any pool worth its salt should roll back any potentially
pending transaction of a connection when that pooled
connection is given back to it. Unless explicitly told not to.
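
The "rollback on return to the pool" behaviour mentioned above can be sketched with a deliberately tiny, hypothetical pool. TinyPool is not a real library class, and sqlite3 is assumed here only as a stand-in for a psycopg2 connection so the example runs on its own.

```python
import sqlite3
from contextlib import contextmanager


class TinyPool:
    """Hypothetical one-connection pool, for illustration only."""

    def __init__(self, conn):
        self._conn = conn

    @contextmanager
    def connection(self):
        try:
            yield self._conn
        finally:
            # Discard whatever the borrower left uncommitted.
            self._conn.rollback()


def demo():
    """Return the rows left after one failing and one successful borrower."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (who TEXT)")
    conn.commit()
    pool = TinyPool(conn)
    try:
        with pool.connection() as c:
            c.execute("INSERT INTO log VALUES ('first borrower')")
            raise RuntimeError("non-database failure, no commit, no rollback")
    except RuntimeError:
        pass
    with pool.connection() as c:
        c.execute("INSERT INTO log VALUES ('second borrower')")
        c.commit()
    return [row[0] for row in conn.execute("SELECT who FROM log")]
```

Without the rollback in connection(), the second borrower's commit would also make the first borrower's half-finished insert durable, which is the "one-and-a-half transactions" case quoted above.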
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

From Karsten.Hilbert at gmx.net Mon Sep 9 13:29:27 2024
From: Karsten.Hilbert at gmx.net (Karsten Hilbert)
Date: Mon, 9 Sep 2024 19:29:27 +0200
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
In-Reply-To:
References: <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On Mon, Sep 09, 2024 at 10:00:11AM -0000, Jon Ribbens via Python-list wrote:

> So the Python code is half-way through a transaction when it throws
> a (non-database-related) exception and that thread of execution is
> aborted. The database connection returns to the pool,

How does it return to the pool?

Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

From jon+usenet at unequivocal.eu Mon Sep 9 15:00:17 2024
From: jon+usenet at unequivocal.eu (Jon Ribbens)
Date: Mon, 9 Sep 2024 19:00:17 -0000 (UTC)
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
References: <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 2024-09-09, Karsten Hilbert wrote:
> On Mon, Sep 09, 2024 at 10:00:11AM -0000, Jon Ribbens via Python-list wrote:
>> So the Python code is half-way through a transaction when it throws
>> a (non-database-related) exception and that thread of execution is
>> aborted. The database connection returns to the pool,
>
> How does it return to the pool?

It's just any circumstance in which a bit of your code uses a database
"cursor" (which isn't a cursor) that it didn't create moments before.

From jon+usenet at unequivocal.eu Mon Sep 9 17:12:51 2024
From: jon+usenet at unequivocal.eu (Jon Ribbens)
Date: Mon, 9 Sep 2024 21:12:51 -0000 (UTC)
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 2024-09-09, Lawrence D'Oliveiro wrote:
> On Mon, 9 Sep 2024 10:00:11 -0000 (UTC), Jon Ribbens wrote:
>> On 2024-09-09, Lawrence D'Oliveiro wrote:
>>> The database only needs to commit when it is explicitly told. Anything
>>> less -- no commit.
>>
>> So the Python code is half-way through a transaction when it throws a
>> (non-database-related) exception and that thread of execution is
>> aborted. The database connection returns to the pool ...
>
> The DBMS connection is deleted.

How does that happen then?

From jon+usenet at unequivocal.eu Tue Sep 10 04:38:30 2024
From: jon+usenet at unequivocal.eu (Jon Ribbens)
Date: Tue, 10 Sep 2024 08:38:30 -0000 (UTC)
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 2024-09-09, Lawrence D'Oliveiro wrote:
> On Mon, 9 Sep 2024 21:12:51 -0000 (UTC), Jon Ribbens wrote:
>> On 2024-09-09, Lawrence D'Oliveiro wrote:
>>> On Mon, 9 Sep 2024 10:00:11 -0000 (UTC), Jon Ribbens wrote:
>>>> On 2024-09-09, Lawrence D'Oliveiro wrote:
>>>>> The database only needs to commit when it is explicitly told.
>>>>> Anything less -- no commit.
>>>>
>>>> So the Python code is half-way through a transaction when it throws a
>>>> (non-database-related) exception and that thread of execution is
>>>> aborted. The database connection returns to the pool ...
>>>
>>> The DBMS connection is deleted.
>>
>> How does that happen then?
>
> You write code to do it.

Ok. So we've moved away from "In any DBMS worth its salt, rollback is
something that happens automatically" and now you're saying it isn't
automatic after all, "you write code to do it".

That was my point. The database provides the tools, but it isn't psychic.

From Karsten.Hilbert at gmx.net Tue Sep 10 11:56:24 2024
From: Karsten.Hilbert at gmx.net (Karsten Hilbert)
Date: Tue, 10 Sep 2024 17:56:24 +0200
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
In-Reply-To:
References:
Message-ID:

On Tue, Sep 10, 2024 at 08:38:30AM -0000, Jon Ribbens via Python-list wrote:

> Ok. So we've moved away from "In any DBMS worth its salt, rollback is
> something that happens automatically"

Nope. The original post asked something entirely different.

> and now you're saying it isn't automatic after all,

No again, such shenanigans only start to happen when pooling
is brought into the equation.

Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

From jon+usenet at unequivocal.eu Tue Sep 10 12:20:22 2024
From: jon+usenet at unequivocal.eu (Jon Ribbens)
Date: Tue, 10 Sep 2024 16:20:22 -0000 (UTC)
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
References:
Message-ID:

On 2024-09-10, Karsten Hilbert wrote:
> On Tue, Sep 10, 2024 at 08:38:30AM -0000, Jon Ribbens via Python-list wrote:
>> Ok. So we've moved away from "In any DBMS worth its salt, rollback is
>> something that happens automatically"
>
> Nope. The original post asked something entirely different.

No it didn't.

>> and now you're saying it isn't automatic after all,
>
> No again, such shenanigans only start to happen when pooling
> is brought into the equation.

No they don't.
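
Both positions in this exchange can be observed side by side. The sketch below is an illustration under assumptions, not a statement about PostgreSQL: it uses the embedded sqlite3 module and a temporary file, and the function names are made up. It compares a connection that is dropped mid-transaction with one that stays open after an application-level error.

```python
import os
import sqlite3
import tempfile


def count_rows(path):
    """Count rows in table t using a fresh connection."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT count(*) FROM t").fetchone()[0]
    finally:
        conn.close()


def demo():
    """Return (rows_after_drop, still_in_transaction, rows_after_late_commit)."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        setup = sqlite3.connect(path)
        setup.execute("CREATE TABLE t (x INTEGER)")
        setup.commit()
        setup.close()

        # Case 1: the connection goes away mid-transaction.
        dropped = sqlite3.connect(path)
        dropped.execute("INSERT INTO t VALUES (1)")
        dropped.close()  # never committed, so the insert is discarded
        rows_after_drop = count_rows(path)

        # Case 2: the connection survives an application-level error.
        alive = sqlite3.connect(path)
        alive.execute("INSERT INTO t VALUES (2)")
        try:
            1 / 0
        except ZeroDivisionError:
            pass  # handled in Python, but nobody told the database
        still_in_transaction = alive.in_transaction
        alive.commit()  # later, unrelated code commits on the same connection
        alive.close()
        rows_after_late_commit = count_rows(path)
        return rows_after_drop, still_in_transaction, rows_after_late_commit
    finally:
        os.remove(path)
```

In the first case the database discards the work on its own; in the second the transaction simply stays open until the program commits or rolls back.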

From jon+usenet at unequivocal.eu Tue Sep 10 18:48:36 2024
From: jon+usenet at unequivocal.eu (Jon Ribbens)
Date: Tue, 10 Sep 2024 22:48:36 -0000 (UTC)
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 2024-09-10, Lawrence D'Oliveiro wrote:
> On Tue, 10 Sep 2024 08:38:30 -0000 (UTC), Jon Ribbens wrote:
>
>> On 2024-09-09, Lawrence D'Oliveiro wrote:
>>>
>>> On Mon, 9 Sep 2024 21:12:51 -0000 (UTC), Jon Ribbens wrote:
>>>>
>>>> On 2024-09-09, Lawrence D'Oliveiro wrote:
>>>>>
>>>>> On Mon, 9 Sep 2024 10:00:11 -0000 (UTC), Jon Ribbens wrote:
>>>>>>
>>>>>> On 2024-09-09, Lawrence D'Oliveiro wrote:
>>>>>>>
>>>>>>> The database only needs to commit when it is explicitly told.
>>>>>>> Anything less -- no commit.
>>>>>>
>>>>>> So the Python code is half-way through a transaction when it throws
>>>>>> a (non-database-related) exception and that thread of execution is
>>>>>> aborted. The database connection returns to the pool ...
>>>>>
>>>>> The DBMS connection is deleted.
>>>>
>>>> How does that happen then?
>>>
>>> You write code to do it.
>>
>> Ok. So we've moved away from "In any DBMS worth its salt, rollback is
>> something that happens automatically" and now you're saying it isn't
>> automatic after all, "you write code to do it".
>
> The database code already performs that function. As far as the client is
> concerned, the function happens automatically.

... but only if "you write code to do it".

> And it's not just code, it's data. The database structures on persistent
> storage are also carefully designed with transaction safety in mind. So
> any partial transaction data saved on persistent storage that remains
> after a system crash can be identified as such and discarded, leaving the
> database in its pre-transaction state.

Yes, nobody's disputing that.
A good database will do what you tell it, and keep
the data you give it. But what if you tell it the wrong thing or give
it the wrong data? It's like, for example, a RAID array will save you
from a faulty disk, but will not save you from the software writing
incorrect data, which the RAID array will then faithfully copy across
to all the disks overwriting the good data.

From jon+usenet at unequivocal.eu Wed Sep 11 17:12:01 2024
From: jon+usenet at unequivocal.eu (Jon Ribbens)
Date: Wed, 11 Sep 2024 21:12:01 -0000 (UTC)
Subject: psycopg2 positioning of .commit() (Posting On Python-List Prohibited)
References: <7cb50df2-9c76-477f-91c9-e149c7637104@aklaver.com> <5ee80b84-f04b-454d-ab39-45572e0751a1@aklaver.com> <4a1b12fc-24b7-4c7e-b1f2-6ec9c5f341c2@aklaver.com>
Message-ID:

On 2024-09-11, Lawrence D'Oliveiro wrote:
> On Tue, 10 Sep 2024 22:48:36 -0000 (UTC), Jon Ribbens wrote:
>> But what if you tell it the wrong thing ...
>
> To get back to the original point of this thread, all that rigmarole to
> try to ensure to call "rollback" in case of an exception is completely
> unnecessary: the DBMS will take care of that for you.

No, it won't.

From usenet at andyburns.uk Thu Sep 12 10:18:08 2024
From: usenet at andyburns.uk (Andy Burns)
Date: Thu, 12 Sep 2024 15:18:08 +0100
Subject: PyBitmessage is not dead. Ignore the FUD.
In-Reply-To:
References:
Message-ID:

711 Spooky Mart wrote:

> PyBitmessage is not dead.
> https://bitmessage.org

It may help with looking "not dead" to have a changelog that has
actually changed within the last 8 years?

From sjeik.appie at gmail.com Sun Sep 15 05:12:32 2024
From: sjeik.appie at gmail.com (AJ)
Date: Sun, 15 Sep 2024 09:12:32 -0000 (UTC)
Subject: Synchronise annotations -> docstring
In-Reply-To:
References:
Message-ID:

On 9/4/24 00:21, ram at zedat.fu-berlin.de wrote:
>Albert-Jan Roskam wrote or quoted:
>>Are there any tools that check whether type annotations and Numpydoc
>>strings are consistent?
>
> According to one webpage, the "sphinx-autodoc-typehints" extension
> lets you roll with Python 3 annotations for documenting the types
> of arguments and return values of functions.
>
> So, you'd have a "single source of truth" again to keep everything
> chill and straightforward.
>

Thanks, I'll have a look. I'm currently using pdoc (or was it pdoc3?)
but I could go back to Sphinx. I like Markdown better than
reStructuredText, though.

From ml at fam-goebel.de Wed Sep 18 10:49:39 2024
From: ml at fam-goebel.de (Ulrich Goebel)
Date: Wed, 18 Sep 2024 16:49:39 +0200
Subject: Python 3.8 or later on Debian?
Message-ID: <20240918164939.45076d602132a78fa1b04dc9@fam-goebel.de>

Hi,

Debian Linux seems to love Python 3.7 - that is shown by apt-get list, and it's installed on my Debian Server.

But I need at least Python 3.8

Is there a repository which I can give to apt to get Python 3.8 or later?

Or do I really have to install and compile these versions manually? I'm not a friend of things so deep in the system...

Greetings
Ulrich

--
Ulrich Goebel

From alexander at neilson.net.nz Wed Sep 18 18:49:02 2024
From: alexander at neilson.net.nz (Alexander Neilson)
Date: Thu, 19 Sep 2024 10:49:02 +1200
Subject: Python 3.8 or later on Debian?
In-Reply-To: <20240918164939.45076d602132a78fa1b04dc9@fam-goebel.de>
References: <20240918164939.45076d602132a78fa1b04dc9@fam-goebel.de>
Message-ID:

Python 3.7 is part of Buster (Debian old old stable)

If you moved to Debian bullseye you would get offered 3.9 (old stable)

Currently the stable version (Bookworm) would give you 3.11

I am not aware of anyone maintaining a repo for old Debian versions to get newer Python versions.
But I know in the past I did build newer Python versions (mostly on Raspberry Pis)

Regards
Alexander

Alexander Neilson
Neilson Productions Limited
021 329 681
alexander at neilson.net.nz

> On 19 Sep 2024, at 10:42, Ulrich Goebel via Python-list wrote:
>
> Hi,
>
> Debian Linux seems to love Python 3.7 - that is shown by apt-get list, and it's installed on my Debian Server.
>
> But I need at least Python 3.8
>
> Is there a repository which I can give to apt to get Python 3.8 or later?
>
> Or do I really have to install and compile these versions manually? I'm not a friend of things so deep in the system...
>
> Greetings
> Ulrich
>
> --
> Ulrich Goebel
> --
> https://mail.python.org/mailman/listinfo/python-list

From PythonList at DancesWithMice.info Wed Sep 18 19:40:50 2024
From: PythonList at DancesWithMice.info (dn)
Date: Thu, 19 Sep 2024 11:40:50 +1200
Subject: Python 3.8 or later on Debian?
In-Reply-To: <20240918164939.45076d602132a78fa1b04dc9@fam-goebel.de>
References: <20240918164939.45076d602132a78fa1b04dc9@fam-goebel.de>
Message-ID: <57b74bcb-6789-4895-bbfa-bd6166ef4751@DancesWithMice.info>

On 19/09/24 02:49, Ulrich Goebel via Python-list wrote:
> Hi,
>
> Debian Linux seems to love Python 3.7 - that is shown by apt-get list, and it's installed on my Debian Server.
>
> But I need at least Python 3.8
>
> Is there a repository which I can give to apt to get Python 3.8 or later?
>
> Or do I really have to install and compile these versions manually? I'm not a friend of things so deep in the system...

Assumptions:
1 "need" for a particular project, cf system-wide
2 use of a virtual-environment for project(s)

Try pyenv (https://github.com/pyenv/pyenv). It offers a list of Python
versions. When one is downloaded, pyenv builds that version for you -
assuming you have the build-environment software in place. (this is
where my lack of Debian knowledge may become obvious)

Thereafter, within the project's virtual-environment, you can select
which (installed version of) Python is to be used.
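
Whichever route is taken, a project can confirm from inside its virtual environment that the selected interpreter meets the stated floor. This is only a sketch; the MINIMUM constant and the function names are made up for the example, with 3.8 taken from the question.

```python
import sys

MINIMUM = (3, 8)  # the floor mentioned in the question


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """True if the (major, minor) part of version_info is at least minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def require_minimum():
    """Exit with a readable message when the running interpreter is too old."""
    if not meets_minimum():
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in MINIMUM)
        raise SystemExit("Python %s or later is required, found %s" % (wanted, found))
```

Calling require_minimum() at the top of an entry-point script turns a confusing syntax or import error on an old interpreter into a one-line explanation.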
Am sure there are plenty of how-to-installs. Here's one:
https://bgasparotto.com/install-pyenv-ubuntu-debian

Am using pyenv to support multiple projects initially built during the
reign of multiple Python versions (which now update annually - next is
about two weeks away).

--
Regards,
=dn

From list1 at tompassin.net Wed Sep 18 19:27:29 2024
From: list1 at tompassin.net (Thomas Passin)
Date: Wed, 18 Sep 2024 19:27:29 -0400
Subject: Python 3.8 or later on Debian?
In-Reply-To: <20240918164939.45076d602132a78fa1b04dc9@fam-goebel.de>
References: <20240918164939.45076d602132a78fa1b04dc9@fam-goebel.de>
Message-ID:

On 9/18/2024 10:49 AM, Ulrich Goebel via Python-list wrote:
> Hi,
>
> Debian Linux seems to love Python 3.7 - that is shown by apt-get list, and it's installed on my Debian Server.
>
> But I need at least Python 3.8
>
> Is there a repository which I can give to apt to get Python 3.8 or later?
>
> Or do I really have to install and compile these versions manually? I'm not a friend of things so deep in the system...

My Debian 12 VM has Python 3.11. You must have a very old version of
Debian. On some VMs (not Debian, I think) I have had other Python
versions alongside the system's, e.g., 3.11 and 3.12. I didn't compile
them myself. You will have to search for a repository with the right
package. But upgrade your system first!

From mats at wichmann.us Thu Sep 19 13:09:53 2024
From: mats at wichmann.us (Mats Wichmann)
Date: Thu, 19 Sep 2024 11:09:53 -0600
Subject: Python 3.8 or later on Debian?
In-Reply-To: <20240918164939.45076d602132a78fa1b04dc9@fam-goebel.de>
References: <20240918164939.45076d602132a78fa1b04dc9@fam-goebel.de>
Message-ID:

On 9/18/24 08:49, Ulrich Goebel via Python-list wrote:
> Hi,
>
> Debian Linux seems to love Python 3.7 - that is shown by apt-get list, and it's installed on my Debian Server.
>
> But I need at least Python 3.8
>
> Is there a repository which I can give to apt to get Python 3.8 or later?
>
> Or do I really have to install and compile these versions manually? I'm not a friend of things so deep in the system...

Not going to pile on and tell you you must upgrade...

You can use a tool like pyenv to build Python IF another answer doesn't
present itself - it knows how to build just about any version (not just
CPython, but PyPy, Anaconda and more). The Real Python folks have written
a fairly complete description (plus of course there's the project's own
documentation):

https://realpython.com/intro-to-pyenv/

From vinay_sajip at yahoo.co.uk Fri Sep 20 13:14:31 2024
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Fri, 20 Sep 2024 17:14:31 +0000 (UTC)
Subject: ANN: A new version (0.5.3) of python-gnupg has been released.
References: <615892778.15932749.1726852471985.ref@mail.yahoo.com>
Message-ID: <615892778.15932749.1726852471985@mail.yahoo.com>

What Changed?
=============

This is an enhancement and bug-fix release, and all users are encouraged to upgrade.

Brief summary:

* Fix #117: Add WKD (Web Key Directory) support for auto-locating keys. Thanks to Myzel394 for the patch.
* Fix #237: Ensure local variable is initialized even when an exception occurs.
* Fix #239: Remove logging of decryption result.

This release [2] has been signed with my code signing key:

Vinay Sajip (CODE SIGNING KEY)
Fingerprint: CA74 9061 914E AC13 8E66 EADB 9147 B477 339A 9B86

Recent changes to PyPI don't show the GPG signature with the download links. An alternative download source where the signatures are available is at [4].

The source code repository is at [1].

Documentation is available at [5].

As always, your feedback is most welcome (especially bug reports [3], patches and suggestions for improvement, or any other points via this group).

Enjoy!
Cheers

Vinay Sajip

[1] https://github.com/vsajip/python-gnupg
[2] https://pypi.org/project/python-gnupg/0.5.3
[3] https://github.com/vsajip/python-gnupg/issues
[4] https://github.com/vsajip/python-gnupg/releases/
[5] python-gnupg - A Python wrapper for GnuPG

From loris.bennett at fu-berlin.de Fri Sep 20 04:42:14 2024
From: loris.bennett at fu-berlin.de (Loris Bennett)
Date: Fri, 20 Sep 2024 10:42:14 +0200
Subject: Common objects for CLI commands with Typer
Message-ID: <87tteayavt.fsf@zedat.fu-berlin.de>

Hi,

Apologies if the following description is too brief - I can expand if no
one knows what I'm on about, but maybe a short description is enough.

I am developing a command line application using Typer. Most commands
need to do something in a database and also do LDAP stuff. Currently
each command creates its own Database and LDAP objects, since each
command forms an entry point to the program.

With Typer, is there a way I can define the equivalent of class
attributes at a single point which are then available to all commands?

Cheers,

Loris

--
This signature is currently under constuction.

From nntp.mbourne at spamgourmet.com Fri Sep 20 15:00:57 2024
From: nntp.mbourne at spamgourmet.com (Mark Bourne)
Date: Fri, 20 Sep 2024 20:00:57 +0100
Subject: Trouble with mocking
In-Reply-To:
References:
Message-ID:

Norman Robins wrote:
> I'm somewhat new to mocking for unit tests.
>
> I have some code like this:
>
> In foo/bar/baz.py I have 2 function I want to mock, one calls the other"
> def function1_to_mock():
>     .
>     .
>     .
>
> def function2_to_mock():
>     function1_to_mock()
>
> In foo/bar/main.py I import 1 of these and call it"
> from .baz import function2_to_mock
>
> def some_function():
>     function1_to_mock()

I'm assuming this is supposed to be calling `function2_to_mock`?
(Otherwise the import should be for `function1_to_mock`, but then the
fact that `function2_to_mock` also calls `function1_to_mock` would be
irrelevant)

>     .
>     .
>     .
>
> I want to mock both function1_to_mock and function2_to_mock
>
> In my test I do this:
>
> def function1_to_mock(kid):
>     return MOCKED_VALUE
>
> @pytest.fixture(autouse=True)
> def mock_dependencies():
>     with patch(foo.bar.baz.function1_to_mock') as mock_function1_to_mock, \
>          patch('foo.bar.main.function2_to_mock') as mock_function2_to_mock:
>         mock_function2_to_mock.return_value = {
>             'this': 'that
>         }
>         yield mock_function1_to_mock, mock_function2_to_mock
>
> def test_main(mock_dependencies):
>     some_function()
>
> When some_function is called the real function1_to_mock is called instead
> of my mock.
>
> Can someone please let me know how to properly mock these 2 functions.
>
> Thanks!

In `foo/bar/main.py`, the line:
```
from .baz import function2_to_mock
```
creates a reference named `function2_to_mock` in `main.py` referring to
the function in `foo/bar/baz.py`.

When you patch `foo.bar.baz.function2_to_mock`, you're patching the
reference in `foo.bar.baz`, but the reference in `foo.bar.main` still
refers to the original function object.

There are at least a couple of ways around this. The way I prefer is to
change `main.py` to import the `baz` module rather than just the function:
```
from . import baz

def some_function():
    baz.function2_to_mock()
```

Here, `main.py` has a reference to the `baz` module rather than the
individual function object. It looks up `function2_to_mock` in `baz`
just before calling it so, when the `baz` module is patched so that
`baz.function2_to_mock` refers to a mock, the call in `main.py` gets the
mock and calls that rather than the original function.

There's no memory saving from importing just the functions you need from
`baz` - the whole module is still loaded on import and remains in
memory, it's just that `main` only gets a reference to the one function.
The effect is similar to doing something like:
```
from . import baz
function2_to_mock = baz.function2_to_mock
del baz
```
...including the fact that, after the import, the reference to
`function2_to_mock` in `main` is just a copy of the reference in `baz`,
hence not getting updated by the patch.

The other way around it is to patch `main.function2_to_mock` instead of
patching `foo.bar.baz.function2_to_mock`.

See also the "Where to patch" section of the `unittest.mock`
documentation.

Note that, since you're patching `function2_to_mock`, you don't
necessarily need to patch `function1_to_mock` as well. The mock of
`function2_to_mock` won't call `function1_to_mock` (or its mock)
regardless of whether `function1_to_mock` has been patched, unless you
set the mock of `function2_to_mock` to do so. Patching
`function1_to_mock` is only needed if there are other calls to it that
you need to mock.

--
Mark.

From martin.nilsson4 at skola.malmo.se Fri Sep 20 06:52:08 2024
From: martin.nilsson4 at skola.malmo.se (Martin Nilsson)
Date: Fri, 20 Sep 2024 12:52:08 +0200
Subject: Bug in 3.12.5
Message-ID: <2E08BDE8-27C6-4CF7-805A-BA5CB4E5DB21@skola.malmo.se>

Dear Sirs !

The attached program doesn't work in 3.12.5, but in 3.9 it worked.

Best Regards
Martin Nilsson

From cs at cskk.id.au Fri Sep 20 18:15:34 2024
From: cs at cskk.id.au (Cameron Simpson)
Date: Sat, 21 Sep 2024 08:15:34 +1000
Subject: Bug in 3.12.5
In-Reply-To: <2E08BDE8-27C6-4CF7-805A-BA5CB4E5DB21@skola.malmo.se>
References: <2E08BDE8-27C6-4CF7-805A-BA5CB4E5DB21@skola.malmo.se>
Message-ID:

On 20Sep2024 12:52, Martin Nilsson wrote:
>The attached program doesn't work in 3.12.5, but in 3.9 it worked.

This mailing list discards attachments. Please include your code inline
in the message text.
Thanks,
Cameron Simpson

From Keith.S.Thompson+u at gmail.com Fri Sep 20 16:20:24 2024
From: Keith.S.Thompson+u at gmail.com (Keith Thompson)
Date: Fri, 20 Sep 2024 13:20:24 -0700
Subject: Bug in 3.12.5
References: <2E08BDE8-27C6-4CF7-805A-BA5CB4E5DB21@skola.malmo.se>
Message-ID: <87setuqdpz.fsf@nosuchdomain.example.com>

Martin Nilsson writes:
> The attached program doesn't work in 3.12.5, but in 3.9 it worked.

Attachments don't show up either on the mailing list or the newsgroup.
Try again with the program inline in your post (if it's not too long).

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u at gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From barry at barrys-emacs.org Sat Sep 21 01:38:05 2024
From: barry at barrys-emacs.org (Barry)
Date: Sat, 21 Sep 2024 06:38:05 +0100
Subject: Common objects for CLI commands with Typer
In-Reply-To: <87tteayavt.fsf@zedat.fu-berlin.de>
References: <87tteayavt.fsf@zedat.fu-berlin.de>
Message-ID: <28833A4D-B57C-4195-87BF-FAAF9EFF5F19@barrys-emacs.org>

> On 20 Sep 2024, at 21:01, Loris Bennett via Python-list wrote:
>
> Hi,
>
> Apologies if the following description is to brief - I can expand if no
> one knows what I'm on about, but maybe a short description is enough.
>
> I am developing a command line application using Typer. Most commands
> need to do something in a database and also do LDAP stuff. Currently
> each command creates its own Database and LDAP objects, since each
> command forms an entry point to the program.
>
> With Typer, is there a way I can define the equivalent of class
> attributes at a single point which are then available to all commands?

I do not know typer. But the general solution is to create an instance of your class
and tell typer to call a member function of the instance.

app = Application()
...
typer.set_callback(app.my_handler)

Barry

>
> Cheers,
>
> Loris
>
> --
> This signature is currently under constuction.
> --
> https://mail.python.org/mailman/listinfo/python-list
>

From 2QdxY4RzWzUUiLuE at potatochowder.com Sat Sep 21 06:40:37 2024
From: 2QdxY4RzWzUUiLuE at potatochowder.com (2QdxY4RzWzUUiLuE at potatochowder.com)
Date: Sat, 21 Sep 2024 06:40:37 -0400
Subject: Common objects for CLI commands with Typer
In-Reply-To: <28833A4D-B57C-4195-87BF-FAAF9EFF5F19@barrys-emacs.org>
References: <87tteayavt.fsf@zedat.fu-berlin.de> <28833A4D-B57C-4195-87BF-FAAF9EFF5F19@barrys-emacs.org>
Message-ID:

On 2024-09-21 at 06:38:05 +0100, Barry via Python-list wrote:

> > On 20 Sep 2024, at 21:01, Loris Bennett via Python-list wrote:
> >
> > Hi,
> >
> > Apologies if the following description is to brief - I can expand if no
> > one knows what I'm on about, but maybe a short description is enough.
> >
> > I am developing a command line application using Typer. Most commands
> > need to do something in a database and also do LDAP stuff. Currently
> > each command creates its own Database and LDAP objects, since each
> > command forms an entry point to the program.
> >
> > With Typer, is there a way I can define the equivalent of class
> > attributes at a single point which are then available to all commands?
>
> I do not know typer. But the general solution is to create an instance of your class
> and tell typer to call member function of the instance.
>
> app = Application()
> ...
> typer.set_callback(app.my_handler)

Despite the fact that "everything is an object" in Python, you don't
have to put data or functions inside classes or objects. I also know
nothing about Typer, but there's nothing wrong with functions in a
module.

There's also nothing wrong with writing a function that creates and
returns the database and LDAP connections (perhaps as instances of
application-level classes), and calling that function from within each
command.

DRY. Yeah, yeah, yeah.
:-/ So there's one line at the top of each command
that initializes things, and possibly a line at the bottom to close
those things down. Turn those lines into a context manager, which is
actually a sub-framework inside Typer. Don't convolute/complicate your
design to eliminate one line at the top of each command.

Go ahead, accuse me of writing FORTRAN (all caps, no numbers or
qualifiers, as $deity intended) in Python. But neither optimize
prematurely nor invoke the Inner Platform Effect to save one or two
lines in your not-yet-written commands, either.

Sorry for the rant. :-)

Simple is better than complex.
Complex is better than complicated.

HTH.

From barry at barrys-emacs.org Mon Sep 23 14:00:10 2024
From: barry at barrys-emacs.org (Barry Scott)
Date: Mon, 23 Sep 2024 19:00:10 +0100
Subject: Common objects for CLI commands with Typer
In-Reply-To:
References: <87tteayavt.fsf@zedat.fu-berlin.de> <28833A4D-B57C-4195-87BF-FAAF9EFF5F19@barrys-emacs.org>
Message-ID: <1E3ED29E-161E-430C-9E99-F89266472ADB@barrys-emacs.org>

> On 21 Sep 2024, at 11:40, Dan Sommers via Python-list wrote:
>
> Despite the fact that "everything is an object" in Python, you don't
> have to put data or functions inside classes or objects. I also know
> nothing about Typer, but there's nothing wrong with functions in a
> module.

Python is great in allowing you to pick your style.
A few lines in a module, a couple of functions, or classes.

But once your code gets big, the discipline of using classes helps
maintenance. Code with lots of globals is problematic.
Barry

From 2QdxY4RzWzUUiLuE at potatochowder.com Mon Sep 23 15:51:49 2024
From: 2QdxY4RzWzUUiLuE at potatochowder.com (2QdxY4RzWzUUiLuE at potatochowder.com)
Date: Mon, 23 Sep 2024 14:51:49 -0500
Subject: Common objects for CLI commands with Typer
In-Reply-To: <1E3ED29E-161E-430C-9E99-F89266472ADB@barrys-emacs.org>
References: <87tteayavt.fsf@zedat.fu-berlin.de> <28833A4D-B57C-4195-87BF-FAAF9EFF5F19@barrys-emacs.org> <1E3ED29E-161E-430C-9E99-F89266472ADB@barrys-emacs.org>
Message-ID:

On 2024-09-23 at 19:00:10 +0100, Barry Scott wrote:

> > On 21 Sep 2024, at 11:40, Dan Sommers via Python-list wrote:

> But once your code gets big the discipline of using classes helps
> maintenance. Code with lots of globals is problematic.

Even before your code gets big, discipline helps maintenance. :-)

Every level of your program has globals. An application with too many
classes is no better (or worse) than a class with too many methods, or a
module with too many functions. Insert your own definitions of (and
tolerances for) "too many," which will vary in flexibility.

(And as was alluded to elsewhere in this thread, you could probably
deduce the original and/or preferred programming languages of people
with certain such definitions. But I digress.)

$ python -m this|grep Namespaces
Namespaces are one honking great idea -- let's do more of those!

From annada at tilde.green Mon Sep 23 03:44:00 2024
From: annada at tilde.green (Annada Behera)
Date: Mon, 23 Sep 2024 13:14:00 +0530
Subject: Beazley's Problem
In-Reply-To: <87plow4v4p.fsf@nightsong.com>
References: <87tte941ko.fsf@nightsong.com> <87plow4v4p.fsf@nightsong.com>
Message-ID: <0709b4b8b0bbf2a32d53649d1a6fbefbcd44a68a.camel@tilde.green>

The "next-level math trick" Newton-Raphson has nothing to do with
functional programming. I have written solvers in purely iterative
style. As far as I know, Newton-Raphson is the opposite of functional
programming, as you iteratively solve for the root.
Functional programming is stateless: you are not allowed to store any
state (such as the current best guess of the root).

-----Original Message-----
From: Paul Rubin
Subject: Re: Beazley's Problem
Date: 09/22/2024 01:49:50 AM
Newsgroups: comp.lang.python

ram at zedat.fu-berlin.de (Stefan Ram) writes:
>   It's hella rad to see you bust out those "next-level math tricks"
>   with just a single line each!

You might like:

https://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf

The numerics stuff starts on page 9.

From annada at tilde.green Tue Sep 24 04:25:57 2024
From: annada at tilde.green (Annada Behera)
Date: Tue, 24 Sep 2024 13:55:57 +0530
Subject: Beazley's Problem
In-Reply-To: <87h6a5lx30.fsf@nightsong.com>
References: <87tte941ko.fsf@nightsong.com> <87plow4v4p.fsf@nightsong.com> <0709b4b8b0bbf2a32d53649d1a6fbefbcd44a68a.camel@tilde.green> <87h6a5lx30.fsf@nightsong.com>
Message-ID: <08bddb548dce214b1d41432e92d431d0ef304929.camel@tilde.green>

-----Original Message-----
From: Paul Rubin
Subject: Re: Beazley's Problem
Date: 09/24/2024 05:52:27 AM
Newsgroups: comp.lang.python

>> def f_prime(x: float) -> float:
>>     return 2*x
>
>You might enjoy implementing that with automatic differentiation (not
>to be confused with symbolic differentiation) instead.
>
>http://blog.sigfpe.com/2005/07/automatic-differentiation.html

Before I knew automatic differentiation, I thought neural network
backpropagation was magic. Coding up reverse-mode autodiff is a little
trickier than forward-mode autodiff, though.

(a) Forward-mode autodiff takes less space (just a dual component for
every input variable) but can need more passes. For a function
f: R -> R^m, forward mode computes all the derivatives in a single pass,
i.e. O(1) passes, but it needs O(m) passes for f: R^m -> R.

(b) Reverse-mode autodiff requires you to build a computation graph,
which takes space, but it is faster when there are many inputs. For a
function f: R^m -> R, it needs a single pass, i.e. O(1) passes, and
vice versa (O(m) passes for f: R -> R^m).
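
As a rough illustration of the forward-mode idea in (a), here is a
minimal sketch using dual numbers. The `Dual` class and the `derivative`
helper are hypothetical names invented for this example, not part of any
library, and only addition and multiplication are supported.

```python
class Dual(object):
    """A number value + deriv*eps with eps**2 == 0.

    The deriv component carries the derivative along with the value.
    """

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants (derivative 0).
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def derivative(func, x):
    """Evaluate func once on a dual number seeded with derivative 1."""
    return func(Dual(x, 1.0)).deriv


def f(x):
    return x * x


print(derivative(f, 3.0))  # 6.0, matching f_prime(x) = 2*x
```

One evaluation of `f` yields both the value and the derivative with
respect to the single seeded input, which is why a function with m
inputs would need m such passes.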
Almost all neural network training these days uses reverse-mode autodiff.

From mk1853387 at gmail.com Wed Sep 25 13:24:49 2024
From: mk1853387 at gmail.com (marc nicole)
Date: Wed, 25 Sep 2024 19:24:49 +0200
Subject: How to stop a specific thread in Python 2.7?
Message-ID:

Hello guys,

I want to know how to kill a specific running thread (say by its id)

for now I run and kill a thread like the following:
# start thread
thread1 = threading.Thread(target= self.some_func(), args=( ...,), )
thread1.start()
# kill the thread
event_thread1 = threading.Event()
event_thread1.set()

I know that set() will kill all running threads, but if there was thread2
as well and I want to kill only thread1?

Thanks!

From cs at cskk.id.au Wed Sep 25 16:44:09 2024
From: cs at cskk.id.au (Cameron Simpson)
Date: Thu, 26 Sep 2024 06:44:09 +1000
Subject: How to stop a specific thread in Python 2.7?
In-Reply-To:
References:
Message-ID:

On 25Sep2024 19:24, marc nicole wrote:
>I want to know how to kill a specific running thread (say by its id)
>
>for now I run and kill a thread like the following:
># start thread
>thread1 = threading.Thread(target= self.some_func(), args=( ...,), )
>thread1.start()
># kill the thread
>event_thread1 = threading.Event()
>event_thread1.set()
>
>I know that set() will kill all running threads, but if there was thread2
>as well and I want to kill only thread1?

No, `set()` doesn't kill a thread at all. It sets the `Event`, and each
thread must be checking that event regularly, and quitting if it becomes
set.

You just need a per-thread Event instead of a single Event for all the
threads.

Cheers,
Cameron Simpson

From mk1853387 at gmail.com Wed Sep 25 16:56:58 2024
From: mk1853387 at gmail.com (marc nicole)
Date: Wed, 25 Sep 2024 22:56:58 +0200
Subject: How to stop a specific thread in Python 2.7?
In-Reply-To:
References:
Message-ID:

How to create a per-thread event in Python 2.7?

On Wed, 25 Sept 2024, 22:47 Cameron Simpson via Python-list, <
python-list at python.org> wrote:

> On 25Sep2024 19:24, marc nicole wrote:
> >I want to know how to kill a specific running thread (say by its id)
> >
> >for now I run and kill a thread like the following:
> ># start thread
> >thread1 = threading.Thread(target= self.some_func(), args=( ...,), )
> >thread1.start()
> ># kill the thread
> >event_thread1 = threading.Event()
> >event_thread1.set()
> >
> >I know that set() will kill all running threads, but if there was thread2
> >as well and I want to kill only thread1?
>
> No, `set()` doesn't kill a thread at all. It sets the `Event`, and each
> thread must be checking that event regularly, and quitting if it becomes
> set.
>
> You just need a per-thread Event instead of a single Event for all the
> threads.
>
> Cheers,
> Cameron Simpson
> --
> https://mail.python.org/mailman/listinfo/python-list
>

From cs at cskk.id.au Wed Sep 25 21:06:03 2024
From: cs at cskk.id.au (Cameron Simpson)
Date: Thu, 26 Sep 2024 11:06:03 +1000
Subject: How to stop a specific thread in Python 2.7?
In-Reply-To:
References:
Message-ID:

On 25Sep2024 22:56, marc nicole wrote:
>How to create a per-thread event in Python 2.7?

Every time you make a Thread, make an Event. Pass it to the thread
worker function and keep it to hand for your use outside the thread.

From mk1853387 at gmail.com Wed Sep 25 21:34:05 2024
From: mk1853387 at gmail.com (marc nicole)
Date: Thu, 26 Sep 2024 03:34:05 +0200
Subject: [Tutor] How to stop a specific thread in Python 2.7?
In-Reply-To:
References:
Message-ID:

Could you show a python code example of this?

On Thu, 26 Sept 2024, 03:08 Cameron Simpson, wrote:

> On 25Sep2024 22:56, marc nicole wrote:
> >How to create a per-thread event in Python 2.7?
>
> Every time you make a Thread, make an Event. Pass it to the thread
> worker function and keep it to hand for your use outside the thread.
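
A minimal sketch of the per-thread Event approach described above. The
names `worker` and `start_worker` are made up for illustration, and the
sketch assumes the worker's loop is able to check its event regularly.
It uses only `threading` calls that exist in both Python 2.7 and
Python 3. Note that `target` is given the function itself (`worker`),
not the result of calling it.

```python
import threading


def worker(name, stop_event, results):
    """Do a little work repeatedly until this thread's own event is set."""
    count = 0
    while not stop_event.is_set():
        count += 1
        # wait() doubles as a sleep that ends early once the event is set
        stop_event.wait(0.01)
    results[name] = count


def start_worker(name, results):
    """Create one Event per Thread and hand both back to the caller."""
    stop_event = threading.Event()
    thread = threading.Thread(target=worker,
                              args=(name, stop_event, results))
    thread.start()
    return thread, stop_event


results = {}
thread1, stop1 = start_worker("thread1", results)
thread2, stop2 = start_worker("thread2", results)

stop1.set()        # ask only thread1 to finish
thread1.join()
print(thread2.is_alive())   # thread2 has not been asked to stop

stop2.set()        # now stop thread2 as well
thread2.join()
```

Keeping the `(thread, event)` pair together, for example in a dict keyed
by a name of your choosing, is one way to pick out a specific thread
later.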
> _______________________________________________
> Tutor maillist - Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>

From olegsivokon at gmail.com Wed Sep 25 16:14:43 2024
From: olegsivokon at gmail.com (Left Right)
Date: Wed, 25 Sep 2024 22:14:43 +0200
Subject: How to stop a specific thread in Python 2.7?
In-Reply-To:
References:
Message-ID:

That's one of the "disadvantages" of threads: you cannot safely stop a
thread. Of course you could try, but that's never a good idea. The
reason for this is that threads share memory. They might be holding
locks that will never be released if the thread is killed. They might
(partially) modify the shared state observed by other threads in such
a way that it becomes unusable to other threads.

So... if you want to kill a thread, I'm sorry to say this: you will
have to bring down the whole process, there's really no other way, and
that's not Python-specific, this is just the design of threads.

On Wed, Sep 25, 2024 at 7:26 PM marc nicole via Python-list wrote:
>
> Hello guys,
>
> I want to know how to kill a specific running thread (say by its id)
>
> for now I run and kill a thread like the following:
> # start thread
> thread1 = threading.Thread(target= self.some_func(), args=( ...,), )
> thread1.start()
> # kill the thread
> event_thread1 = threading.Event()
> event_thread1.set()
>
> I know that set() will kill all running threads, but if there was thread2
> as well and I want to kill only thread1?
>
> Thanks!
> --
> https://mail.python.org/mailman/listinfo/python-list

From asifali.ha at gmail.com Fri Sep 27 02:17:12 2024
From: asifali.ha at gmail.com (Asif Ali Hirekumbi)
Date: Fri, 27 Sep 2024 11:47:12 +0530
Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API
Message-ID:

Dear Python Experts,

I am working with the Kenna Application's API to retrieve vulnerability data.
The API endpoint provides a single, massive JSON file in gzip format,
approximately 60 GB in size. Handling such a large dataset in one go is
proving to be quite challenging, especially in terms of memory management.

I am looking for guidance on how to efficiently stream this data and
process it in chunks using Python. Specifically, I am wondering if there's
a way to use the requests library or any other libraries that would allow
us to pull data from the API endpoint in a memory-efficient manner.

Here are the relevant API endpoints from Kenna:

- Kenna API Documentation
- Kenna Vulnerabilities Export

If anyone has experience with similar use cases or can offer any advice, it
would be greatly appreciated.

Thank you in advance for your help!

Best regards
Asif Ali

From arj.python at gmail.com Mon Sep 30 01:49:21 2024
From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer)
Date: Mon, 30 Sep 2024 09:49:21 +0400
Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API
In-Reply-To:
References:
Message-ID:

Idk if you tried Polars, but it seems to work well with JSON data

import polars as pl
pl.read_json("file.json")

Kind Regards,

Abdur-Rahmaan Janhangeer
about | blog
github
Mauritius

On Mon, Sep 30, 2024 at 8:00 AM Asif Ali Hirekumbi via Python-list <
python-list at python.org> wrote:

> Dear Python Experts,
>
> I am working with the Kenna Application's API to retrieve vulnerability
> data. The API endpoint provides a single, massive JSON file in gzip format,
> approximately 60 GB in size. Handling such a large dataset in one go is
> proving to be quite challenging, especially in terms of memory management.
>
> I am looking for guidance on how to efficiently stream this data and
> process it in chunks using Python. Specifically, I am wondering if there's
> a way to use the requests library or any other libraries that would allow
> us to pull data from the API endpoint in a memory-efficient manner.
>
> Here are the relevant API endpoints from Kenna:
>
> - Kenna API Documentation
> - Kenna Vulnerabilities Export
>
> If anyone has experience with similar use cases or can offer any advice, it
> would be greatly appreciated.
>
> Thank you in advance for your help!
>
> Best regards
> Asif Ali
> --
> https://mail.python.org/mailman/listinfo/python-list
>

From asifali.ha at gmail.com Mon Sep 30 02:41:30 2024
From: asifali.ha at gmail.com (Asif Ali Hirekumbi)
Date: Mon, 30 Sep 2024 12:11:30 +0530
Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API
In-Reply-To:
References:
Message-ID:

Thanks Abdur Rahmaan.
I will give it a try !

Thanks
Asif

On Mon, Sep 30, 2024 at 11:19 AM Abdur-Rahmaan Janhangeer <
arj.python at gmail.com> wrote:

> Idk if you tried Polars, but it seems to work well with JSON data
>
> import polars as pl
> pl.read_json("file.json")
>
> Kind Regards,
>
> Abdur-Rahmaan Janhangeer
> about | blog
> github
> Mauritius
>
>
> On Mon, Sep 30, 2024 at 8:00 AM Asif Ali Hirekumbi via Python-list <
> python-list at python.org> wrote:
>
>> Dear Python Experts,
>>
>> I am working with the Kenna Application's API to retrieve vulnerability
>> data. The API endpoint provides a single, massive JSON file in gzip format,
>> approximately 60 GB in size. Handling such a large dataset in one go is
>> proving to be quite challenging, especially in terms of memory management.
>>
>> I am looking for guidance on how to efficiently stream this data and
>> process it in chunks using Python. Specifically, I am wondering if there's
>> a way to use the requests library or any other libraries that would allow
>> us to pull data from the API endpoint in a memory-efficient manner.
>>
>> Here are the relevant API endpoints from Kenna:
>>
>> - Kenna API Documentation
>> - Kenna Vulnerabilities Export
>>
>> If anyone has experience with similar use cases or can offer any advice, it
>> would be greatly appreciated.
>>
>> Thank you in advance for your help!
>>
>> Best regards
>> Asif Ali
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>>
>

From olegsivokon at gmail.com Mon Sep 30 04:41:44 2024
From: olegsivokon at gmail.com (Left Right)
Date: Mon, 30 Sep 2024 10:41:44 +0200
Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API
In-Reply-To:
References:
Message-ID:

Whether and to what degree you can stream JSON depends on JSON
structure. In general, however, JSON cannot be streamed (but commonly
it can be).

Imagine a pathological case of this shape: 1... <60GB of digits>. This
is still a valid JSON (it doesn't have any limits on how many digits a
number can have). And you cannot parse this number in a streaming way
because in order to do that, you need to start with the least
significant digit.

Typically, however, JSON can be parsed incrementally. The format is
conceptually very simple to write a parser for. There are plenty of
parsers that do that, for example, this one:
https://pypi.org/project/json-stream/ . But, I'd encourage you to do
it yourself. It's fun, and the resulting parser should end up at less
than some 50 LoC. Also, it allows you to incorporate your desired
output more closely into your parser.

On Mon, Sep 30, 2024 at 8:44 AM Asif Ali Hirekumbi via Python-list wrote:
>
> Thanks Abdur Rahmaan.
> I will give it a try !
>
> Thanks
> Asif
>
> On Mon, Sep 30, 2024 at 11:19 AM Abdur-Rahmaan Janhangeer <
> arj.python at gmail.com> wrote:
>
> > Idk if you tried Polars, but it seems to work well with JSON data
> >
> > import polars as pl
> > pl.read_json("file.json")
> >
> > Kind Regards,
> >
> > Abdur-Rahmaan Janhangeer
> > about | blog
> > github
> > Mauritius
> >
> >
> > On Mon, Sep 30, 2024 at 8:00 AM Asif Ali Hirekumbi via Python-list <
> > python-list at python.org> wrote:
> >
> >> Dear Python Experts,
> >>
> >> I am working with the Kenna Application's API to retrieve vulnerability
> >> data.
The API endpoint provides a single, massive JSON file in gzip > >> format, > >> approximately 60 GB in size. Handling such a large dataset in one go is > >> proving to be quite challenging, especially in terms of memory management. > >> > >> I am looking for guidance on how to efficiently stream this data and > >> process it in chunks using Python. Specifically, I am wondering if there's > >> a way to use the requests library or any other libraries that would allow > >> us to pull data from the API endpoint in a memory-efficient manner. > >> > >> Here are the relevant API endpoints from Kenna: > >> > >> - Kenna API Documentation > >> > >> - Kenna Vulnerabilities Export > >> > >> > >> If anyone has experience with similar use cases or can offer any advice, > >> it > >> would be greatly appreciated. > >> > >> Thank you in advance for your help! > >> > >> Best regards > >> Asif Ali > >> -- > >> https://mail.python.org/mailman/listinfo/python-list > >> > > > -- > https://mail.python.org/mailman/listinfo/python-list From info at egenix.com Mon Sep 30 06:37:17 2024 From: info at egenix.com (eGenix Team) Date: Mon, 30 Sep 2024 12:37:17 +0200 Subject: ANN: Python Meeting Düsseldorf - 02.10.2024 Message-ID: <1536c336-a91f-4a8e-8eaf-cbbe0f64612c@egenix.com> /This announcement targets a local user group meeting in Düsseldorf, Germany/ Announcement Python Meeting Düsseldorf - October 2024 A meeting of Python enthusiasts and interested people in a relaxed atmosphere. *2 October 2024, 18:00* Room 1, 2nd floor of the Bürgerhaus Stadtteilzentrum Bilk, Düsseldorfer Arcaden, Bachstr. 145, 40217 Düsseldorf Program Talks registered so far: * Detlef Lannert: /*pyinfra as an alternative to Ansible*/ * Marc-André Lemburg: /*Rapid web app development with Panel*/ * Detlef Lannert: /*Low-cost objects as alternatives to dictionaries?*/
* Charlie Clark: /*Editing ZIP files with Python*/ Further talks are welcome. If you are interested, please get in touch at info at pyddf.de. Start time and location We meet at 18:00 at the Bürgerhaus in the Düsseldorfer Arcaden. The Bürgerhaus shares its entrance with the swimming pool and is located next to the entrance to the Düsseldorfer Arcaden underground car park. Above the entrance there is a large "Schwimm' in Bilk" logo. Behind the door, turn directly left to the two elevators, then go up to the 2nd floor. The entrance to Room 1 is immediately on the left as you leave the elevator. >>> Entrance in Google Street View *Important*: Please only register if you are absolutely sure that you will attend. Given the limited number of seats, we have no sympathy for short-notice cancellations or no-shows. Introduction The Python Meeting Düsseldorf is a regular event in Düsseldorf aimed at Python enthusiasts from the region. Our PyDDF YouTube channel, where we publish videos of the talks after the meetings, gives a good overview of the talks. The meeting is organized by eGenix.com GmbH, Langenfeld, in cooperation with Clark Consulting & Research, Düsseldorf. Format The Python Meeting Düsseldorf uses a mix of (lightning) talks and open discussion. Talks can be registered in advance or brought up spontaneously during the meeting. A projector with HDMI and Full HD resolution is available. To register a (lightning) talk, please send an informal email to info at pyddf.de Cost sharing The Python Meeting Düsseldorf is organized by Python users for Python users. Since the meeting room, projector, internet and drinks incur costs, we ask participants for a contribution of EUR 10.00 incl. 19% VAT. School pupils and students pay EUR 5.00 incl. 19% VAT. We ask all participants to bring the amount in cash.
Registration Since we can only host 25 people in the rented room, we ask you to register in advance. *Meeting registration*: please via Meetup Further information Further information is available on the meeting's website: https://pyddf.de/ Have fun! -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 30 2024) >>> Python Projects, Coaching and Support ... https://www.egenix.com/ >>> Python Product Development ... https://consulting.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 https://www.egenix.com/company/contact/ https://www.malemburg.com/ From barry at barrys-emacs.org Mon Sep 30 11:30:19 2024 From: barry at barrys-emacs.org (Barry) Date: Mon, 30 Sep 2024 16:30:19 +0100 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: Message-ID: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> > On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list wrote: > > > import polars as pl > pl.read_json("file.json") > > This is not going to work unless the computer has a lot more than 60GiB of RAM. As later suggested a streaming parser is required. Barry From grant.b.edwards at gmail.com Mon Sep 30 11:44:50 2024 From: grant.b.edwards at gmail.com (Grant Edwards) Date: Mon, 30 Sep 2024 11:44:50 -0400 (EDT) Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API References: Message-ID: <4XHQPG4LzsznVwM@mail.python.org> On 2024-09-30, Left Right via Python-list wrote: > Whether and to what degree you can stream JSON depends on JSON > structure.
In general, however, JSON cannot be streamed (but commonly > it can be). > > Imagine a pathological case of this shape: 1... <60GB of digits>. This > is still a valid JSON (it doesn't have any limits on how many digits a > number can have). And you cannot parse this number in a streaming way > because in order to do that, you need to start with the least > significant digit. Which is how Arabic numbers were originally parsed, but when westerners adopted them from a R->L written language, they didn't flip them around to match the L->R written language into which they were being adopted. So now long numbers can't be parsed as a stream in software. They should have anticipated this problem back in the 13th century and flipped the numbers around. From list1 at tompassin.net Mon Sep 30 12:11:46 2024 From: list1 at tompassin.net (Thomas Passin) Date: Mon, 30 Sep 2024 12:11:46 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> Message-ID: <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net> On 9/30/2024 11:30 AM, Barry via Python-list wrote: > > >> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list wrote: >> >> >> import polars as pl >> pl.read_json("file.json") >> >> > > This is not going to work unless the computer has a lot more than 60GiB of RAM. > > As later suggested a streaming parser is required. Streaming won't work because the file is gzipped. You have to receive the whole thing before you can unzip it. Once unzipped it will be even larger, and all in memory.
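For what it's worth, Barry's remark that a streaming parser is required can be illustrated with the standard library alone. This is only a minimal sketch under stated assumptions, not what anyone in the thread actually ran: it assumes the payload is a single top-level JSON array whose elements are all objects (the thread never shows the real layout of the Kenna export, so that shape is a guess), and the helper name iter_array_objects is made up for illustration. It feeds text chunks to json.JSONDecoder.raw_decode and yields each object as soon as it is complete, so only the current object and the unparsed tail are kept in memory.

```python
import json


def iter_array_objects(chunks):
    """Yield the objects of one top-level JSON array from an iterable of str chunks.

    Hypothetical helper: assumes every array element is a JSON object.
    """
    decoder = json.JSONDecoder()
    buf = ""
    started = False
    for chunk in chunks:
        buf += chunk
        while True:
            buf = buf.lstrip()
            if not buf:
                break  # nothing left to look at; fetch the next chunk
            if not started:
                if buf[0] != "[":
                    raise ValueError("expected a top-level JSON array")
                buf = buf[1:]
                started = True
                continue
            if buf[0] == ",":
                buf = buf[1:]  # separator between elements
                continue
            if buf[0] == "]":
                return  # end of the array
            if buf[0] != "{":
                raise ValueError("this sketch only handles arrays of objects")
            try:
                obj, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # the object is still incomplete; wait for more data
            yield obj
            buf = buf[end:]


# Made-up sample data, cut into small pieces to imitate a chunked download.
records = [{"id": i, "cve": "CVE-%d" % i} for i in range(5)]
text = json.dumps(records)
pieces = [text[i:i + 7] for i in range(0, len(text), 7)]
parsed = list(iter_array_objects(pieces))
assert parsed == records
```

Two caveats on the design: a truncated object and a malformed one look the same to this loop (it simply keeps buffering), and each retry re-parses the pending object from its start, so very large individual objects get expensive. Bytes coming off a network or a decompressor would also need an incremental UTF-8 decoder in front of it.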
From rosuav at gmail.com Mon Sep 30 13:00:21 2024 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 1 Oct 2024 03:00:21 +1000 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net> References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net> Message-ID: On Tue, 1 Oct 2024 at 02:20, Thomas Passin via Python-list wrote: > > On 9/30/2024 11:30 AM, Barry via Python-list wrote: > > > > > >> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list wrote: > >> > >> > >> import polars as pl > >> pl.read_json("file.json") > >> > >> > > > > This is not going to work unless the computer has a lot more than 60GiB of RAM. > > > > As later suggested a streaming parser is required. > > Streaming won't work because the file is gzipped. You have to receive > the whole thing before you can unzip it. Once unzipped it will be even > larger, and all in memory. Streaming gzip is perfectly possible. You may be thinking of PKZip which has its EOCD at the end of the file (although it may still be possible to stream-decompress if you work at it). ChrisA From 2QdxY4RzWzUUiLuE at potatochowder.com Mon Sep 30 14:28:33 2024 From: 2QdxY4RzWzUUiLuE at potatochowder.com (2QdxY4RzWzUUiLuE at potatochowder.com) Date: Mon, 30 Sep 2024 14:28:33 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: <4XHQPG4LzsznVwM@mail.python.org> References: <4XHQPG4LzsznVwM@mail.python.org> Message-ID: On 2024-09-30 at 11:44:50 -0400, Grant Edwards via Python-list wrote: > On 2024-09-30, Left Right via Python-list wrote: > > Whether and to what degree you can stream JSON depends on JSON > > structure. In general, however, JSON cannot be streamed (but commonly > > it can be). > > > > Imagine a pathological case of this shape: 1... <60GB of digits>.
This > > is still a valid JSON (it doesn't have any limits on how many digits a > > number can have). And you cannot parse this number in a streaming way > > because in order to do that, you need to start with the least > > significant digit. > > Which is how Arabic numbers were originally parsed, but when > westerners adopted them from a R->L written language, they didn't flip > them around to match the L->R written language into which they were > being adopted. Interesting. > So now long numbers can't be parsed as a stream in software. They > should have anticipated this problem back in the 13th century and > flipped the numbers around. What am I missing? Handwavingly, start with the first digit, and as long as the next character is a digit, multiply the accumulated result by 10 (or the appropriate base) and add the next value. Oh, and handle scientific notation as a special case, and perhaps fail spectacularly instead of recovering gracefully in certain edge cases. And in the pathological case of a single number with 60 billion digits, run out of memory (and complain loudly to the person who claimed that the file contained a "dataset"). But why do I need to start with the least significant digit? From grant.b.edwards at gmail.com Mon Sep 30 14:41:46 2024 From: grant.b.edwards at gmail.com (Grant Edwards) Date: Mon, 30 Sep 2024 14:41:46 -0400 (EDT) Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API References: <4XHQPG4LzsznVwM@mail.python.org> Message-ID: <4XHVKQ2G9wznXbM@mail.python.org> On 2024-09-30, Dan Sommers via Python-list wrote: > On 2024-09-30 at 11:44:50 -0400, > Grant Edwards via Python-list wrote: > >> On 2024-09-30, Left Right via Python-list wrote: >> > [...] >> > Imagine a pathological case of this shape: 1... <60GB of digits>. This >> > is still a valid JSON (it doesn't have any limits on how many digits a >> > number can have).
And you cannot parse this number in a streaming way >> > because in order to do that, you need to start with the least >> > significant digit. >> >> Which is how Arabic numbers were originally parsed, but when >> westerners adopted them from a R->L written language, they didn't >> flip them around to match the L->R written language into which they >> were being adopted. > > Interesting. > >> So now long numbers can't be parsed as a stream in software. They >> should have anticipated this problem back in the 13th century and >> flipped the numbers around. > > What am I missing? Handwavingly, start with the first digit, and as > long as the next character is a digit, multiply the accumulated > result by 10 (or the appropriate base) and add the next value. > [...] But why do I need to start with the least significant digit? Excellent question. That's actually a pretty standard way to parse numeric literals. I accepted the claim at face value that in JSON there is something that requires parsing numeric literals from the least significant end -- but I can't think of why the usual algorithms used by other languages' lexers for yonks wouldn't work for JSON. -- Grant From rosuav at gmail.com Mon Sep 30 14:46:35 2024 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 1 Oct 2024 04:46:35 +1000 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <4XHQPG4LzsznVwM@mail.python.org> Message-ID: On Tue, 1 Oct 2024 at 04:30, Dan Sommers via Python-list wrote: > > But why do I need to start with the least > significant digit? If you start from the most significant, you don't know anything about the number until you finish parsing it. There's almost nothing you can say about a number given that it starts with a particular sequence (since you don't know how MANY digits there are). However, if you know the LAST digits, you can make certain statements about it (trivial examples being whether it's odd or even).
It's not very, well, significant. But there's something to it. And it extends nicely to p-adic numbers, which can have an infinite number of nonzero digits to the left of the decimal: https://en.wikipedia.org/wiki/P-adic_number ChrisA From list1 at tompassin.net Mon Sep 30 13:57:05 2024 From: list1 at tompassin.net (Thomas Passin) Date: Mon, 30 Sep 2024 13:57:05 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net> Message-ID: <848d6843-d919-4a43-80e1-768fb8da2139@tompassin.net> On 9/30/2024 1:00 PM, Chris Angelico via Python-list wrote: > On Tue, 1 Oct 2024 at 02:20, Thomas Passin via Python-list > wrote: >> >> On 9/30/2024 11:30 AM, Barry via Python-list wrote: >>> >>> >>>> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list wrote: >>>> >>>> >>>> import polars as pl >>>> pl.read_json("file.json") >>>> >>>> >>> >>> This is not going to work unless the computer has a lot more than 60GiB of RAM. >>> >>> As later suggested a streaming parser is required. >> >> Streaming won't work because the file is gzipped. You have to receive >> the whole thing before you can unzip it. Once unzipped it will be even >> larger, and all in memory. > > Streaming gzip is perfectly possible. You may be thinking of PKZip > which has its EOCD at the end of the file (although it may still be > possible to stream-decompress if you work at it). > > ChrisA You're right, that's what I was thinking of.
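To make the point about gzip concrete, here is a small sketch of stream decompression using only the standard library's zlib module. It is an illustration under assumptions rather than a recipe from the thread: the generator name iter_gunzip and the sample payload are invented, and it handles a single gzip member only. Passing wbits = 16 + zlib.MAX_WBITS to zlib.decompressobj asks zlib to expect a gzip header and trailer, and each compressed chunk is then inflated as it arrives without holding the whole file.

```python
import gzip
import zlib


def iter_gunzip(chunks):
    """Yield decompressed bytes for an iterable of gzip-compressed byte chunks.

    Hypothetical helper: single-member gzip streams only.
    """
    # 16 + MAX_WBITS selects gzip framing instead of a raw zlib stream.
    inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = inflater.decompress(chunk)
        if data:
            yield data
    tail = inflater.flush()
    if tail:
        yield tail


# Made-up payload, compressed and then fed back in 64-byte pieces.
payload = b'{"id": 1, "severity": "high"}' * 1000
blob = gzip.compress(payload)
pieces = (blob[i:i + 64] for i in range(0, len(blob), 64))
restored = b"".join(iter_gunzip(pieces))
assert restored == payload
```

In the original question the chunks would come from the HTTP response; with the requests library that is usually done with stream=True and iter_content(), though whether the Kenna endpoint cooperates with that is an assumption. A highly compressible input can still expand a lot per chunk, which is what the max_length argument of decompress() is for.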
From olegsivokon at gmail.com Mon Sep 30 15:30:06 2024 From: olegsivokon at gmail.com (Left Right) Date: Mon, 30 Sep 2024 21:30:06 +0200 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net> References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net> Message-ID: > Streaming won't work because the file is gzipped. You have to receive > the whole thing before you can unzip it. Once unzipped it will be even > larger, and all in memory. GZip is specifically designed to be streamed. So, that's not a problem (in principle), but you would need to have a streaming GZip parser; a quick search of PyPI revealed this package: https://pypi.org/project/gzip-stream/ . On Mon, Sep 30, 2024 at 6:20 PM Thomas Passin via Python-list wrote: > > On 9/30/2024 11:30 AM, Barry via Python-list wrote: > > > > > >> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list wrote: > >> > >> > >> import polars as pl > >> pl.read_json("file.json") > >> > >> > > > > This is not going to work unless the computer has a lot more than 60GiB of RAM. > > > > As later suggested a streaming parser is required. > > Streaming won't work because the file is gzipped. You have to receive > the whole thing before you can unzip it. Once unzipped it will be even > larger, and all in memory.
> -- > https://mail.python.org/mailman/listinfo/python-list From list1 at tompassin.net Mon Sep 30 14:05:36 2024 From: list1 at tompassin.net (Thomas Passin) Date: Mon, 30 Sep 2024 14:05:36 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> Message-ID: On 9/30/2024 11:30 AM, Barry via Python-list wrote: > > >> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list wrote: >> >> >> import polars as pl >> pl.read_json("file.json") >> >> > > This is not going to work unless the computer has a lot more than 60GiB of RAM. > > As later suggested a streaming parser is required. There is also the json-stream library, on PyPI at https://pypi.org/project/json-stream/ From 2QdxY4RzWzUUiLuE at potatochowder.com Mon Sep 30 18:16:03 2024 From: 2QdxY4RzWzUUiLuE at potatochowder.com (2QdxY4RzWzUUiLuE at potatochowder.com) Date: Mon, 30 Sep 2024 18:16:03 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <4XHQPG4LzsznVwM@mail.python.org> Message-ID: On 2024-10-01 at 04:46:35 +1000, Chris Angelico via Python-list wrote: > On Tue, 1 Oct 2024 at 04:30, Dan Sommers via Python-list > wrote: > > > > But why do I need to start with the least > > significant digit? > > If you start from the most significant, you don't know anything about > the number until you finish parsing it. There's almost nothing you can > say about a number given that it starts with a particular sequence > (since you don't know how MANY digits there are). However, if you know > the LAST digits, you can make certain statements about it (trivial > examples being whether it's odd or even). But that wasn't the question.
Sure, under certain circumstances and for specific use cases and/or requirements, there might be arguments to read potential numbers as strings and possibly not have to parse them completely before accepting or rejecting them. And if I start with the least significant digit and the number happens to be written in scientific notation and/or has a decimal point, then I can't tell whether it's odd or even until I further process the whole thing anyway. > It's not very, well, significant. But there's something to it. And it > extends nicely to p-adic numbers, which can have an infinite number of > nonzero digits to the left of the decimal: > > https://en.wikipedia.org/wiki/P-adic_number In Common Lisp, integers can be written in any integer base from two to thirty six, inclusive. So knowing the last digit doesn't tell you whether an integer is even or odd until you know the base anyway. Curiously, we agree: if you move the goal posts arbitrarily, then some algorithms that parse JSON numbers will fail. From grant.b.edwards at gmail.com Mon Sep 30 18:54:52 2024 From: grant.b.edwards at gmail.com (Grant Edwards) Date: Mon, 30 Sep 2024 18:54:52 -0400 (EDT) Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API References: <4XHQPG4LzsznVwM@mail.python.org> Message-ID: <4XHbxS5jl4znVGD@mail.python.org> On 2024-09-30, Dan Sommers via Python-list wrote: > In Common Lisp, integers can be written in any integer base from two > to thirty six, inclusive. So knowing the last digit doesn't tell > you whether an integer is even or odd until you know the base > anyway. I had to think about that for an embarrassingly long time before it clicked.
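The point about bases may click faster with a couple of concrete values. The sketch below uses Python's built-in int(string, base), which accepts bases 2 through 36, as a stand-in for the Common Lisp reader; the helper name is_even is invented for illustration. The same digit string can be odd in one base and even in another, so the last digit alone settles nothing until the base is known.

```python
def is_even(digits: str, base: int) -> bool:
    """Return True if the integer written as `digits` in `base` is even."""
    return int(digits, base) % 2 == 0


# Same digits, different bases, different parity.
print(is_even("11", 10))  # False: eleven is odd
print(is_even("11", 3))   # True: 1*3 + 1 = 4 is even
print(is_even("12", 10))  # True: twelve is even
print(is_even("12", 3))   # False: 1*3 + 2 = 5 is odd
```

In an even base the last digit does decide parity, because every higher place value is a multiple of two; in an odd base every digit contributes.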
From rosuav at gmail.com Mon Sep 30 19:09:07 2024 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 1 Oct 2024 09:09:07 +1000 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: <4XHbxS5jl4znVGD@mail.python.org> References: <4XHQPG4LzsznVwM@mail.python.org> <4XHbxS5jl4znVGD@mail.python.org> Message-ID: On Tue, 1 Oct 2024 at 08:56, Grant Edwards via Python-list wrote: > > On 2024-09-30, Dan Sommers via Python-list wrote: > > > In Common Lisp, integers can be written in any integer base from two > > to thirty six, inclusive. So knowing the last digit doesn't tell > > you whether an integer is even or odd until you know the base > > anyway. > > I had to think about that for an embarrassingly long time before it > clicked. The only part I'm not clear on is what identifies the base. If you're going to write numbers little-endian, it's not that hard to also write them with a base indicator before the digits. But, whatever. This is a typical tangent and people are argumentative for no reason. I was just trying to add some explanatory notes to why little-endian does make more sense than big-endian. ChrisA From 2QdxY4RzWzUUiLuE at potatochowder.com Mon Sep 30 20:06:57 2024 From: 2QdxY4RzWzUUiLuE at potatochowder.com (2QdxY4RzWzUUiLuE at potatochowder.com) Date: Mon, 30 Sep 2024 20:06:57 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <4XHQPG4LzsznVwM@mail.python.org> <4XHbxS5jl4znVGD@mail.python.org> Message-ID: On 2024-10-01 at 09:09:07 +1000, Chris Angelico via Python-list wrote: > On Tue, 1 Oct 2024 at 08:56, Grant Edwards via Python-list > wrote: > > > > On 2024-09-30, Dan Sommers via Python-list wrote: > > > > > In Common Lisp, integers can be written in any integer base from two > > > to thirty six, inclusive. So knowing the last digit doesn't tell > > > you whether an integer is even or odd until you know the base > > > anyway.
> > > > I had to think about that for an embarrassingly long time before it > > clicked. > > The only part I'm not clear on is what identifies the base. If you're > going to write numbers little-endian, it's not that hard to also write > them with a base indicator before the digits [...] In Common Lisp, you can write integers as #nnR[digits], where nn is the decimal representation of the base (possibly without a leading zero), the # and the R are literal characters, and the digits are written in the intended base. So the input #16fFFFF is read as the integer 65535. You can also set or bind the global variable *read-base* (yes, the asterisks are part of the name) to an integer between 2 and 36, and then anything that looks like an integer in that base is interpreted as such (including literals in programs). The literals I described above are still handled correctly no matter the current value of *read-base*. So if the value of *read-base* is 16, then the input FFFF is read as the integer 65535 (as is the input #16rFFFF). (Pedants may point out details I omitted. I admit to omitting them.) IIRC, certain [old 8080 and Z-80?] assemblers used to put the base indicator at the end. So 10 meant, well, 10, but 10H meant 16 and 10b meant 2 (IDK; the capital H and the lower case b both look right to me). I don't recall numbers written from least significant digit to most significant digit (big and little endian *storage*, yes, but not the digits when presented to or read from a human). From Keith.S.Thompson+u at gmail.com Mon Sep 30 21:48:02 2024 From: Keith.S.Thompson+u at gmail.com (Keith Thompson) Date: Mon, 30 Sep 2024 18:48:02 -0700 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API References: <4XHQPG4LzsznVwM@mail.python.org> <4XHbxS5jl4znVGD@mail.python.org> Message-ID: <87jzesr3u5.fsf@nosuchdomain.example.com> 2QdxY4RzWzUUiLuE at potatochowder.com writes: [...]
> In Common Lisp, you can write integers as #nnR[digits], where nn is the > decimal representation of the base (possibly without a leading zero), > the # and the R are literal characters, and the digits are written in > the intended base. So the input #16fFFFF is read as the integer 65535. Typo: You meant #16RFFFF, not #16fFFFF. -- Keith Thompson (The_Other_Keith) Keith.S.Thompson+u at gmail.com void Void(void) { Void(); } /* The recursive call of the void */ From olegsivokon at gmail.com Mon Sep 30 15:34:07 2024 From: olegsivokon at gmail.com (Left Right) Date: Mon, 30 Sep 2024 21:34:07 +0200 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net> Message-ID: > What am I missing? Handwavingly, start with the first digit, and as > long as the next character is a digit, multiply the accumulated result > by 10 (or the appropriate base) and add the next value. Oh, and handle > scientific notation as a special case, and perhaps fail spectacularly > instead of recovering gracefully in certain edge cases. And in the > pathological case of a single number with 60 billion digits, run out of > memory (and complain loudly to the person who claimed that the file > contained a "dataset"). But why do I need to start with the least > significant digit? You probably forgot that it has to be _streaming_. Suppose you parse the first digit: can you hand this information over to an external function to process the parsed data? -- No! because you don't know the magnitude yet. What about two digits? -- Same thing. You cannot leave the parser code until you know the magnitude (otherwise the information is useless to the external code). So, even if you have enough memory and don't care about special cases like scientific notation: yes, you will be able to parse it, but it won't be a streaming parser.
On Mon, Sep 30, 2024 at 9:30 PM Left Right wrote: > > > Streaming won't work because the file is gzipped. You have to receive > > the whole thing before you can unzip it. Once unzipped it will be even > > larger, and all in memory. > > GZip is specifically designed to be streamed. So, that's not a > problem (in principle), but you would need to have a streaming GZip > parser; a quick search of PyPI revealed this package: > https://pypi.org/project/gzip-stream/ . > > On Mon, Sep 30, 2024 at 6:20 PM Thomas Passin via Python-list > wrote: > > > > On 9/30/2024 11:30 AM, Barry via Python-list wrote: > > > > > > > > >> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list wrote: > > >> > > >> > > >> import polars as pl > > >> pl.read_json("file.json") > > >> > > >> > > > > > > This is not going to work unless the computer has a lot more than 60GiB of RAM. > > > > > > As later suggested a streaming parser is required. > > > > Streaming won't work because the file is gzipped. You have to receive > > the whole thing before you can unzip it. Once unzipped it will be even > > larger, and all in memory. > > -- > > https://mail.python.org/mailman/listinfo/python-list
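The digit-by-digit approach debated above can be written down in a few lines, which may help separate the two claims. This is a sketch with an invented function name (parse_unsigned) that covers unsigned decimal integers only, with no sign, fraction or exponent. It reads characters most-significant-first from any iterator and needs no lookahead beyond the current character, which is the sense in which it works on a stream; the value is nevertheless only available once the last digit has been seen, which is the sense in which the number itself cannot be handed on piecemeal.

```python
def parse_unsigned(chars):
    """Accumulate an unsigned decimal integer from an iterable of characters.

    Hypothetical helper: stops at the first non-digit character.
    """
    value = 0
    seen = 0
    for ch in chars:
        if not ("0" <= ch <= "9"):
            break  # a delimiter ends the number
        # Shift what we have one decimal place left and add the new digit.
        value = value * 10 + (ord(ch) - ord("0"))
        seen += 1
    if seen == 0:
        raise ValueError("no digits found")
    return value


print(parse_unsigned(iter("65535")))  # 65535
print(parse_unsigned(iter("123, 456")))  # 123
```

Note that the loop consumes the delimiter it stops on; a real tokenizer would have to push that character back or remember it.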