From 2QdxY4RzWzUUiLuE at potatochowder.com Tue Oct 1 11:34:45 2024 From: 2QdxY4RzWzUUiLuE at potatochowder.com (2QdxY4RzWzUUiLuE at potatochowder.com) Date: Tue, 1 Oct 2024 11:34:45 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: <87jzesr3u5.fsf@nosuchdomain.example.com> References: <4XHQPG4LzsznVwM@mail.python.org>

<4XHbxS5jl4znVGD@mail.python.org>

<87jzesr3u5.fsf@nosuchdomain.example.com> Message-ID: On 2024-09-30 at 18:48:02 -0700, Keith Thompson via Python-list wrote: > 2QdxY4RzWzUUiLuE at potatochowder.com writes: > [...] > > In Common Lisp, you can write integers as #nnR[digits], where nn is the > > decimal representation of the base (possibly without a leading zero), > > the # and the R are literal characters, and the digits are written in > > the intended base. So the input #16fFFFF is read as the integer 65535. > > Typo: You meant #16RFFFF, not #16fFFFF. Yep. Sorry. From 2QdxY4RzWzUUiLuE at potatochowder.com Tue Oct 1 11:47:24 2024 From: 2QdxY4RzWzUUiLuE at potatochowder.com (2QdxY4RzWzUUiLuE at potatochowder.com) Date: Tue, 1 Oct 2024 11:47:24 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID:

On 2024-09-30 at 21:34:07 +0200,
Regarding "Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API,"
Left Right via Python-list wrote:

> > What am I missing? Handwavingly, start with the first digit, and as
> > long as the next character is a digit, multiply the accumulated result
> > by 10 (or the appropriate base) and add the next value. Oh, and handle
> > scientific notation as a special case, and perhaps fail spectacularly
> > instead of recovering gracefully in certain edge cases. And in the
> > pathological case of a single number with 60 billion digits, run out of
> > memory (and complain loudly to the person who claimed that the file
> > contained a "dataset"). But why do I need to start with the least
> > significant digit?
>
> You probably forgot that it has to be _streaming_. Suppose you parse
> the first digit: can you hand this information over to an external
> function to process the parsed data? -- No! because you don't know the
> magnitude yet. What about two digits? -- Same thing. You cannot
> leave the parser code until you know the magnitude (otherwise the
> information is useless to the external code).

If I recognize the first digit, then I *can* hand that over to an external function to accumulate the digits that follow.

> So, even if you have enough memory and don't care about special cases
> like scientific notation: yes, you will be able to parse it, but it
> won't be a streaming parser.

Under that constraint, I'm not sure I can parse anything. How can I parse a string (and hand it over to an external function) until I've found the closing quote?

How much state can a parser maintain (before it invokes an external function) and still be considered streaming? I fear that we may be getting hung up on terminology rather than solving the problem at hand.
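For readers following along, the most-significant-digit-first accumulation described above can be sketched roughly as follows. This is only an illustration, not code from anyone in the thread: the names `parse_integer` and `DIGITS` are made up here, and the sketch assumes plain unsigned decimal integers (no sign, no fraction, no exponent).

```python
from typing import Iterator, Optional, Tuple

# Hypothetical helper: the ASCII decimal digits only (an assumption; JSON
# numbers also allow a sign, a fraction and an exponent, all ignored here).
DIGITS = frozenset("0123456789")


def parse_integer(chars: Iterator[str]) -> Tuple[int, Optional[str]]:
    """Consume decimal digits from `chars`, most significant digit first.

    Returns the accumulated value and the first non-digit character read
    (None at end of input), so the caller can resume parsing from there.
    """
    value = 0
    seen_digit = False
    for ch in chars:
        if ch in DIGITS:
            # Multiply what we have so far by the base and add the new digit.
            value = value * 10 + int(ch)
            seen_digit = True
        else:
            if not seen_digit:
                raise ValueError(f"expected a digit, got {ch!r}")
            return value, ch  # hand back the one character of lookahead
    if not seen_digit:
        raise ValueError("expected a digit, got end of input")
    return value, None


stream = iter("65535, 42]")
value, lookahead = parse_integer(stream)
print(value, repr(lookahead))
```

The only state carried between characters is the running value and, at the end, the single non-digit character that terminated the number. Whether that counts as "streaming" is exactly the terminology question being argued.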
From thomas at python.org Tue Oct 1 12:39:31 2024 From: thomas at python.org (Thomas Wouters) Date: Tue, 1 Oct 2024 09:39:31 -0700 Subject: [RELEASE] Python 3.13.0rc3 and 3.12.7 released. Message-ID:

This is not the release you're looking for... (unless you're looking for 3.12.7.)

Because no plan survives contact with reality, instead of the actual Python 3.13.0 release we have a new Python 3.13 release candidate today. Python 3.13.0rc3 rolls back the incremental cyclic garbage collector (GC), which was added in one of the alpha releases. The incremental GC had more significant performance regressions in specific workloads than we expected. Rather than try to fiddle with its details in the hope of fixing them (and not making anything else worse) we decided to revert back to the old GC in 3.13. Work on the incremental GC will continue in 3.14. We also took the opportunity to fix some other (rare) bugs and issues found in 3.13.0rc2. The final release of Python 3.13.0 will now happen next week, Monday October 7th.

In an effort to return to normalcy, we've also released Python 3.12.7 as scheduled, despite the expedited release a month ago. It's important to be regular!

3.13.0rc3

https://www.python.org/downloads/release/python-3130rc3/

The final cut of 3.13.0 (really, honest). Besides the incremental GC revert it contains a small number of other fixes, as well as many documentation improvements and testsuite improvements (~145 changes in total).

Call to action

We strongly encourage maintainers of third-party Python projects to prepare their projects for 3.13 compatibilities during this phase, and where necessary publish Python 3.13 wheels on PyPI to be ready for the final release of 3.13.0. Any binary wheels built against Python 3.13.0rc1 and later will work with future versions of Python 3.13. As always, report any issues to the Python bug tracker.

Please keep in mind that this is a preview release and while it's as close to the final release as we can get it, its use is not recommended for production environments. Next week, though!

New features in Python 3.13

- A new and improved interactive interpreter, based on PyPy's, featuring multi-line editing and color support, as well as colorized exception tracebacks.
- An *experimental* free-threaded build mode, which disables the Global Interpreter Lock, allowing threads to run more concurrently. The build mode is available as an experimental feature in the Windows and macOS installers as well.
- A preliminary, *experimental* JIT, providing the ground work for significant performance improvements.
- The locals() builtin function (and its C equivalent) now has well-defined semantics when mutating the returned mapping, which allows debuggers to operate more consistently.
- A modified version of mimalloc is now included, optional but enabled by default if supported by the platform, and required for the free-threaded build mode.
- Docstrings now have their leading indentation stripped, reducing memory use and the size of .pyc files. (Most tools handling docstrings already strip leading indentation.)
- The dbm module has a new dbm.sqlite3 backend that is used by default when creating new files.
- The minimum supported macOS version was changed from 10.9 to 10.13 (High Sierra). Older macOS versions will not be supported going forward.
- WASI is now a Tier 2 supported platform. Emscripten is no longer an officially supported platform (but Pyodide continues to support Emscripten).
- iOS is now a Tier 3 supported platform.
- Android is now a Tier 3 supported platform as well.

Python 3.12.7

https://www.python.org/downloads/release/python-3127/

A small release since 3.12.6 was only a month ago, but nevertheless 3.12.7 contains ~120 bug fixes, build improvements and documentation changes.

More resources

- Python 3.13 Online Documentation
- PEP 719, Python 3.13 Release Schedule
- Report bugs at Issues · python/cpython · GitHub.
- Help fund Python directly (or via GitHub Sponsors), and support the Python community.

Enjoy the new releases

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organization contributions to the Python Software Foundation.

Regards from a positively *melting* Menlo Park for some reason this time,

Your release team,
Thomas Wouters
Łukasz Langa
Ned Deily
Steve Dower

From olegsivokon at gmail.com Tue Oct 1 17:03:01 2024 From: olegsivokon at gmail.com (Left Right) Date: Tue, 1 Oct 2024 23:03:01 +0200 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID:

> If I recognize the first digit, then I *can* hand that over to an
> external function to accumulate the digits that follow.

And what is that external function going to do with this information? The point is you didn't parse anything if you just sent the digit. You just delegated the parsing further. Parsing is only meaningful if you extracted some information, but your idea is, essentially "what if I do nothing?".

> Under that constraint, I'm not sure I can parse anything. How can I parse a string (and hand it over to an external function) until I've found the closing quote?

Nobody says that parsing a number is the only pathological case. You, however, exaggerate by saying you cannot parse _anything_. You can parse booleans or null, for example. There's no problem there.

Again, I think you misunderstand what streaming is for. Let me remind: it's for processing information as it comes, potentially, indefinitely. This has far more important implications than what you find in computer science. For example, some mathematicians use the same argument to show that real numbers are either fiction or useless: consider adding two real numbers (where real numbers are potentially infinite strings of decimal digits after the period) -- there's no way to prove that such an addition is possible because you would need an infinite proof for that (because you need to start adding from the least significant digit).

In principle, any language that has infinite words will have the same problem with streaming. If you ever pondered h/w or low-level protocols s.a. SCSI or IP, you'd see that they are specifically designed in such a way as to never have infinite words (because they must be amenable to streaming). Consider also an interesting consequence of SCSI not being able to have infinite words: this means, besides other things that fsync() is nonsense! :) If you aren't familiar with the concept: UNIX filesystem API suggests that it's possible to destage arbitrary large file (or a chunk of file) to disk. But SCSI is built of finite "words" and to describe an arbitrary large file you'd need to list all the blocks that constitute the file! And that's why fsync() and family are so hated by people who deal with storage: the only way to implement fsync() in compliance with the standard is to sync _everything_ (and it hurts!)

On Tue, Oct 1, 2024 at 5:49 PM Dan Sommers via Python-list wrote:
>
> On 2024-09-30 at 21:34:07 +0200,
> Regarding "Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API,"
> Left Right via Python-list wrote:
>
> > > What am I missing? Handwavingly, start with the first digit, and as
> > > long as the next character is a digit, multiply the accumulated result
> > > by 10 (or the appropriate base) and add the next value. Oh, and handle
> > > scientific notation as a special case, and perhaps fail spectacularly
> > > instead of recovering gracefully in certain edge cases. And in the
> > > pathological case of a single number with 60 billion digits, run out of
> > > memory (and complain loudly to the person who claimed that the file
> > > contained a "dataset"). But why do I need to start with the least
> > > significant digit?
> >
> > You probably forgot that it has to be _streaming_. Suppose you parse
> > the first digit: can you hand this information over to an external
> > function to process the parsed data? -- No! because you don't know the
> > magnitude yet. What about two digits? -- Same thing. You cannot
> > leave the parser code until you know the magnitude (otherwise the
> > information is useless to the external code).
>
> If I recognize the first digit, then I *can* hand that over to an
> external function to accumulate the digits that follow.
>
> > So, even if you have enough memory and don't care about special cases
> > like scientific notation: yes, you will be able to parse it, but it
> > won't be a streaming parser.
>
> Under that constraint, I'm not sure I can parse anything. How can I
> parse a string (and hand it over to an external function) until I've
> found the closing quote?
>
> How much state can a parser maintain (before it invokes an external
> function) and still be considered streaming? I fear that we may be
> getting hung up on terminology rather than solving the problem at hand.
> --
> https://mail.python.org/mailman/listinfo/python-list

From greg.ewing at canterbury.ac.nz Tue Oct 1 17:48:24 2024 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 2 Oct 2024 10:48:24 +1300 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID: On 1/10/24 8:34 am, Left Right wrote: > You probably forgot that it has to be _streaming_. Suppose you parse > the first digit: can you hand this information over to an external > function to process the parsed data? -- No! because you don't know the > magnitude yet. By that definition of "streaming", no parser can ever be streaming, because there will be some constructs that must be read in their entirety before a suitably-structured piece of output can be emitted. The context of this discussion about integers is the claim that they *could* be parsed incrementally if they were written little endian instead of big endian, but the same argument applies either way. -- Greg From greg.ewing at canterbury.ac.nz Tue Oct 1 18:07:41 2024 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 2 Oct 2024 11:07:41 +1300 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID: On 2/10/24 10:03 am, Left Right wrote: > Consider also an interesting > consequence of SCSI not being able to have infinite words: this means, > besides other things that fsync() is nonsense! :) If you aren't > familiar with the concept: UNIX filesystem API suggests that it's > possible to destage arbitrary large file (or a chunk of file) to disk. > But SCSI is built of finite "words" and to describe an arbitrary large > file you'd need to list all the blocks that constitute the file! I don't follow. What fsync() does is ensure that any data buffered in the kernel relating to the file is sent to the storage device. It can send as many blocks of data over SCSI as required to achieve this. There's no requirement for it to be atomic at the level of the interface between the kernel and the hardware. Some devices do their own buffering in ways that are invisible to the software, so fsync() can't guarantee that the data is actually written to the storage medium. But that's a problem stemming from the design of the hardware, not the design of the protocol for communicating with the hardware. > the only way to implement fsync() in compliance with the > standard is to sync _everything_ Again I'm not sure what you mean here. It may be difficult for the kernel to track down exactly what data is relevant to a particular file, and so the kernel programmers take the easy way out and just implement fsync() as sync(). But again that has nothing to do with the protocol. -- Greg From avi.e.gross at gmail.com Tue Oct 1 19:26:52 2024 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Tue, 1 Oct 2024 19:26:52 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID: <020101db1459$65b0c4d0$31124e70$@gmail.com>

This discussion has become less useful.

We can all agree that in Computer Science, real infinities are avoided, and frankly, need not be taken seriously in any serious program.

You can store all kinds of infinities quite compactly as in a transcendental number you can derive to as many decimal points as you like. Want 1/7 to a thousand decimal places, no problem. You can be given a digit 1 and a digit 7 and asked to do a division to as many digits as you wish in a deterministic manner. I can think of quite a few generators that could easily supply the next digit, or just keep giving the next element from 142857 each time from a circular loop.

Sines, cosines, pi, e and so on, can often be calculated to arbitrary precision by evaluating things like infinite Taylor Series as many times as needed up to the precision of the data holding the number as you move along.

Similar ideas allow generators to give you as many primes as you want, and no more.

So, if you can store arbitrary python code as part of your JSON, you can send quite a bit of somewhat compressed data.

The real problem is how the JSON is set up. If you take umpteen data structures and wrap them all in something like a list, then it may be a tad hard to stream as you may not necessarily be examining the contents till the list finishes gigabytes later. But if, instead, you send lots of smaller parts, such as perhaps sending each row of something like a data.frame individually, the other side can recombine them incrementally to a larger structure such as a data.frame and do some logic on it as it streams, such as keeping only some columns and discarding the rest, or applying filters that only keep rows you care about. And, of course, all rows could be appended to one and perhaps more .CSV files as well so if you need multiple passes on the data, it can now be processed locally in various modes, including "streamed".
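As a rough illustration of the row-at-a-time idea above, here is a minimal Python sketch that reads one JSON object per line, keeps only the rows and columns of interest, and appends them to a CSV file. It assumes the sender emits newline-delimited JSON, which nothing in this thread confirms for the Kenna API; the helper names and the field names `id`, `severity` and `note` are invented for the example.

```python
import csv
import io
import json
from typing import Callable, Iterable, Iterator, List, TextIO


def stream_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per non-blank input line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def filter_to_csv(
    lines: Iterable[str],
    out: TextIO,
    columns: List[str],
    keep: Callable[[dict], bool],
) -> int:
    """Write the selected columns of the rows accepted by `keep`.

    Only one record is held in memory at a time. Returns the number of
    rows written.
    """
    # extrasaction="ignore" silently drops the columns we did not ask for.
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    written = 0
    for record in stream_records(lines):
        if keep(record):
            writer.writerow(record)
            written += 1
    return written


# Stand-ins for a network response and an output file (hypothetical data).
source = io.StringIO(
    '{"id": 1, "severity": 9, "note": "x"}\n'
    '{"id": 2, "severity": 3, "note": "y"}\n'
)
out = io.StringIO()
rows = filter_to_csv(source, out, ["id", "severity"], lambda r: r["severity"] >= 5)
print(rows)
print(out.getvalue())
```

In real use `source` would be any iterable of text lines (an open file, or a line iterator over an HTTP response) and `out` a file opened with `newline=""`, as the `csv` module documentation recommends.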
I think that for some purposes, it makes some sense to not stream anything but results. I mean consider any database that allows a remote login and SQL commands that only stream results. If I only want info on records about company X between July 1 and September 15 of a particular year and only if the amount paid remains zero or is less than the amount owed, ... -----Original Message----- From: Python-list On Behalf Of Greg Ewing via Python-list Sent: Tuesday, October 1, 2024 5:48 PM To: python-list at python.org Subject: Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API On 1/10/24 8:34 am, Left Right wrote: > You probably forgot that it has to be _streaming_. Suppose you parse > the first digit: can you hand this information over to an external > function to process the parsed data? -- No! because you don't know the > magnitude yet. By that definition of "streaming", no parser can ever be streaming, because there will be some constructs that must be read in their entirety before a suitably-structured piece of output can be emitted. The context of this discussion about integers is the claim that they *could* be parsed incrementally if they were written little endian instead of big endian, but the same argument applies either way. -- Greg -- https://mail.python.org/mailman/listinfo/python-list From 2QdxY4RzWzUUiLuE at potatochowder.com Tue Oct 1 20:20:59 2024 From: 2QdxY4RzWzUUiLuE at potatochowder.com (2QdxY4RzWzUUiLuE at potatochowder.com) Date: Tue, 1 Oct 2024 20:20:59 -0400 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID:

On 2024-10-01 at 23:03:01 +0200, Left Right wrote:

> > If I recognize the first digit, then I *can* hand that over to an
> > external function to accumulate the digits that follow.
>
> And what is that external function going to do with this information?
> The point is you didn't parse anything if you just sent the digit.
> You just delegated the parsing further. Parsing is only meaningful if
> you extracted some information, but your idea is, essentially "what if
> I do nothing?".

If the parser detects the first digit of a number, then the parser can read digits one at a time (i.e., "streaming"), assimilate and accumulate the value of the number being parsed, and successfully finish parsing the number when it reads a non-digit. Whether the function that accumulates the value during the process is internal or external isn't relevant; the point is that it is possible to parse integers from most significant digit to least significant digit under a streaming model (and if you're sufficiently clever, you can even write partial results to external storage and/or another transmission protocol, thus allowing for numbers bigger (as measured by JSON or your internal representation) than your RAM).

At most, the parser has to remember the non-digit character it read so that it (the parser) can begin to parse whatever comes after the number. Does that break your notion of "streaming"?

Why do I have to start with the least significant digit?

> > Under that constraint, I'm not sure I can parse anything. How can I
> > parse a string (and hand it over to an external function) until I've
> > found the closing quote?
>
> Nobody says that parsing a number is the only pathological case. You,
> however, exaggerate by saying you cannot parse _anything_. You can
> parse booleans or null, for example. There's no problem there.

My intent was only to repeat what you implied: that any parser that reads its input until it has parsed a value is not streaming.

So how much information can the parser keep before you consider it not to be "streaming"?

[...]

> In principle, any language that has infinite words will have the same
> problem with streaming [...]

So what magic allows anyone to stream any JSON file over SCSI or IP? Let alone some kind of "live stream" that by definition is indefinite, even if it only lasts a few tenths of a second?

> [...] If you ever pondered h/w or low-level
> protocols s.a. SCSI or IP [...]

I spent a good deal of my career designing and implementing all manner of communications protocols, from transmitting and receiving single bits over a wire all the way up to what are now known as session and presentation layers. Some imposed maximum lengths in certain places; some allowed for indefinite amounts of data to be transferred from one end to the other without stopping, resetting, or overflowing. And yet somehow, the universe never collapsed.

If you believe that some implementation of fsync fails to meet a specification, or fails to work correctly on files containing JSON, then file a bug report.

From greg.ewing at canterbury.ac.nz Wed Oct 2 01:27:54 2024 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 2 Oct 2024 18:27:54 +1300 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

<020101db1459$65b0c4d0$31124e70$@gmail.com> Message-ID: On 2/10/24 12:26 pm, avi.e.gross at gmail.com wrote: > The real problem is how the JSON is set up. If you take umpteen data > structures and wrap them all in something like a list, then it may be a tad > hard to stream as you may not necessarily be examining the contents till the > list finishes gigabytes later. Yes, if you want to process the items as they come in, you might be better off sending a series of separate JSON strings, rather than one JSON string containing a list. Or, use a specialised JSON parser that processes each item of the list as soon as it's finished parsing it, instead of collecting the whole list first. -- Greg From guenther.sohler at gmail.com Wed Oct 2 09:26:47 2024 From: guenther.sohler at gmail.com (Guenther Sohler) Date: Wed, 2 Oct 2024 15:26:47 +0200 Subject: Python crash together with threads Message-ID: My Software project is working fine in most of the cases (www.pythonscad.org) however I am right now isolating a scenario, which makes it crash permanently. It does not happen with Python 3.11.6 (and possibly below), it happens with 3.12 and above It does not happen when not using Threads. However due to the architecture of the program I am forced to evaluate some parts in main thread and some parts in a dedicated Thread. The Thread is started with QThread(QT 5.0) whereas I am quite sure that program flows do not overlap. 
When I just execute my 1st very simple Python function inside the newly created thread, like:

    PyObject *a = PyFloat_FromDouble(3.3);

my program crashes with this Stack trace

0 0x00007f6837fe000f in _PyInterpreterState_GET () at ./Include/internal/pycore_pystate.h:179
#1 get_float_state () at Objects/floatobject.c:38
#2 PyFloat_FromDouble (fval=3.2999999999999998) at Objects/floatobject.c:136
#3 0x00000000015a021f in python_testfunc() ()
#4 0x0000000001433301 in CGALWorker::work() ()
#5 0x0000000000457135 in CGALWorker::qt_static_metacall(QObject*, QMetaObject::Call, int, void**) ()
#6 0x00007f68364d0f9f in void doActivate(QObject*, int, void**) () at /lib64/libQt5Core.so.5
#7 0x00007f68362e66ee in QThread::started(QThread::QPrivateSignal) () at /lib64/libQt5Core.so.5
#8 0x00007f68362e89c4 in QThreadPrivate::start(void*) () at /lib64/libQt5Core.so.5
#9 0x00007f6835cae19d in start_thread () at /lib64/libc.so.6
#10 0x00007f6835d2fc60 in clone3 () at /lib64/libc.so.6

I suspect that this is a null pointer here:

       See also _PyInterpreterState_Get()
       and _PyGILState_GetInterpreterStateUnsafe(). */
    static inline PyInterpreterState* _PyInterpreterState_GET(void) {
        PyThreadState *tstate = _PyThreadState_GET();
    #ifdef Py_DEBUG
        _Py_EnsureTstateNotNULL(tstate);
    #endif
    # <<----------- suspect state is nullpointer
        return tstate->interp;
    }

Any clues what's going on here, and how I can mitigate that?

From olegsivokon at gmail.com Wed Oct 2 02:05:02 2024 From: olegsivokon at gmail.com (Left Right) Date: Wed, 2 Oct 2024 08:05:02 +0200 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID: > By that definition of "streaming", no parser can ever be streaming, > because there will be some constructs that must be read in their > entirety before a suitably-structured piece of output can be > emitted. In the same email you replied to, I gave examples of languages for which parsers can be streaming (in general): SCSI or IP. For some languages (eg. everything in the context-free family) streaming parsers are _in general_ impossible, because there are pathological cases like the one with parsing numbers. But this doesn't mean that you cannot come up with a parser that is only useful _sometimes_. And, in practice, languages like XML or JSON do well with streaming, even though in general it's impossible. I'm sorry if this comes as a surprise. On one hand I don't want to sound condescending, on the other hand, this is something that you'd typically study in automata theory class. Well, not exactly in the very same words, but you should be able to figure this stuff out if you had that class. From rosuav at gmail.com Wed Oct 2 09:59:41 2024 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 2 Oct 2024 23:59:41 +1000 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID: On Wed, 2 Oct 2024 at 23:53, Left Right via Python-list wrote: > In the same email you replied to, I gave examples of languages for > which parsers can be streaming (in general): SCSI or IP. You can't validate an IP packet without having all of it. Your notion of "streaming" is nonsensical. ChrisA From rosuav at gmail.com Wed Oct 2 18:51:01 2024 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 3 Oct 2024 08:51:01 +1000 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID: On Thu, 3 Oct 2024 at 08:48, Left Right wrote: > > > You can't validate an IP packet without having all of it. Your notion > > of "streaming" is nonsensical. > > Whoa, whoa, hold your horses! "nonsensical" needs a little bit of > justification :) > > It seems you don't understand the difference between words and > languages! In my examples, IP _protocol_ is the language, sequences of > IP packets are the words in the language. A language is amenable to > streaming if the words of the language are repetition of sequences of > symbols of the alphabet of fixed length. This is, essentially, like > saying that the words themselves are regular. One single IP packet is all you can parse. You're playing shenanigans with words the way Humpty Dumpty does. IP packets are not sequences, they are individuals. ChrisA From lkrupp at invalid.pssw.com.invalid Wed Oct 2 17:06:03 2024 From: lkrupp at invalid.pssw.com.invalid (Louis Krupp) Date: Wed, 2 Oct 2024 15:06:03 -0600 Subject: Python crash together with threads In-Reply-To: References:

Message-ID: <%AiLO.42528$s7Ce.9174@fx46.iad> On 10/2/2024 7:26 AM, Guenther Sohler wrote: > My Software project is working fine in most of the cases > (www.pythonscad.org) > however I am right now isolating a scenario, which makes it crash > permanently. > > It does not happen with Python 3.11.6 (and possibly below), it happens with > 3.12 and above > It does not happen when not using Threads. > > However due to the architecture of the program I am forced to evaluate some > parts in main thread and some parts in a dedicated Thread. The Thread is > started with QThread(QT 5.0) > whereas I am quite sure that program flows do not overlap. > > When I just execute my 1st very simple Python function inside the newly > created thread, like: > > PyObject *a = PyFloat_FromDouble(3.3); > > my program crashes with this Stack trace > > 0 0x00007f6837fe000f in _PyInterpreterState_GET () at > ./Include/internal/pycore_pystate.h:179 > #1 get_float_state () at Objects/floatobject.c:38 > #2 PyFloat_FromDouble (fval=3.2999999999999998) at > Objects/floatobject.c:136 > #3 0x00000000015a021f in python_testfunc() () > #4 0x0000000001433301 in CGALWorker::work() () > #5 0x0000000000457135 in CGALWorker::qt_static_metacall(QObject*, > QMetaObject::Call, int, void**) () > #6 0x00007f68364d0f9f in void doActivate(QObject*, int, void**) () > at /lib64/libQt5Core.so.5 > #7 0x00007f68362e66ee in QThread::started(QThread::QPrivateSignal) () at > /lib64/libQt5Core.so.5 > #8 0x00007f68362e89c4 in QThreadPrivate::start(void*) () at > /lib64/libQt5Core.so.5 > #9 0x00007f6835cae19d in start_thread () at /lib64/libc.so.6 > #10 0x00007f6835d2fc60 in clone3 () at /lib64/libc.so.6 > > > I suspect, that this is a Null pointer here > See also _PyInterpreterState_Get() > and _PyGILState_GetInterpreterStateUnsafe(). 
*/ > static inline PyInterpreterState* _PyInterpreterState_GET(void) { > PyThreadState *tstate = _PyThreadState_GET(); > #ifdef Py_DEBUG > _Py_EnsureTstateNotNULL(tstate); > #endif > # <<----------- suspect state is nullpointer > return tstate->interp; > } > > any clues , whats going on here, and how I can mitigate that ? Can you post a small, self-contained program that demonstrates the problem? Louis From olegsivokon at gmail.com Wed Oct 2 18:48:10 2024 From: olegsivokon at gmail.com (Left Right) Date: Thu, 3 Oct 2024 00:48:10 +0200 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID: > You can't validate an IP packet without having all of it. Your notion > of "streaming" is nonsensical. Whoa, whoa, hold your horses! "nonsensical" needs a little bit of justification :) It seems you don't understand the difference between words and languages! In my examples, IP _protocol_ is the language, sequences of IP packets are the words in the language. A language is amenable to streaming if the words of the language are repetition of sequences of symbols of the alphabet of fixed length. This is, essentially, like saying that the words themselves are regular. So, the follow-up question from you to me should be: how come strictly context-free languages can still be parsed with streaming parsers? -- And the answer to that is it's possible to approximate context-free languages with regular languages. In fact, this is a very interesting subject, which unfortunately is usually overlooked in automata classes. It's interesting in a sense that it's very accessible to the students who already mastered the understanding of regular and context-free formalisms. So, streaming parsers (eg. SAX) are written for a regular language that approximates XML. This is because in practice we will almost never encounter more than N nesting levels in an XML, more than N characters in an element name etc. (for some large enough N). Something which allows us to create a regular language from a context-free one. NB. "Nonsensical" has a very precise meaning, when it comes to discussing the truth value of a proposition, which I think you also somehow didn't know about. You seem to use "nonsensical" as a synonym to "wrong". But, unbeknownst to you, you said something else. You actually implied that there's no way to tell if my notion of streaming is correct or not. But, for the future reference: my notion of streaming is correct, and you would do better learning some materials about it before jumping to conclusions. 
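The bounded-nesting argument in the message above can be illustrated with a toy Python sketch. This is not how SAX or any real XML/JSON parser is implemented; it only shows that once nesting depth is capped at some N, a recognizer for balanced brackets needs a fixed, finite amount of state. The function name and the restriction to `[` and `]` are assumptions made for the example.

```python
from typing import Iterable


def balanced_within_depth(symbols: Iterable[str], max_depth: int) -> bool:
    """Check that '[' and ']' are balanced and never nest deeper than max_depth.

    The only state is `depth`, an integer in 0..max_depth, so for a fixed
    max_depth this is a finite-state recognizer that reads one symbol at a
    time and never looks back.
    """
    depth = 0
    for sym in symbols:
        if sym == "[":
            if depth == max_depth:
                return False  # deeper than the bounded approximation allows
            depth += 1
        elif sym == "]":
            if depth == 0:
                return False  # closing bracket with nothing open
            depth -= 1
        # any other symbol leaves the state unchanged
    return depth == 0


print(balanced_within_depth("[[1, 2], [3]]", max_depth=2))
print(balanced_within_depth("[[[deep]]]", max_depth=2))
```

Inputs nested deeper than `max_depth` are rejected even though they are well-formed in the unbounded language, which is the price of the approximation.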
From olegsivokon at gmail.com Wed Oct 2 18:56:36 2024 From: olegsivokon at gmail.com (Left Right) Date: Thu, 3 Oct 2024 00:56:36 +0200 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID: > One single IP packet is all you can parse. I worked for an undisclosed company which manufactures h/w for ISPs (4- and 8-unit boxes you mount on a rack in a datacenter). Essentially, big-big routers. So, I had the pleasure of writing software that parses IP _protocol_, and let me tell you: you have no idea what you just wrote. But, like I wrote earlier: you don't understand the distinction between languages and words. And in general, are just being stubborn and rude because you are trying to prove a point to someone you don't like, but, in reality, you just look more and more ridiculous. On Thu, Oct 3, 2024 at 12:51 AM Chris Angelico wrote: > > On Thu, 3 Oct 2024 at 08:48, Left Right wrote: > > > > > You can't validate an IP packet without having all of it. Your notion > > > of "streaming" is nonsensical. > > > > Whoa, whoa, hold your horses! "nonsensical" needs a little bit of > > justification :) > > > > It seems you don't understand the difference between words and > > languages! In my examples, IP _protocol_ is the language, sequences of > > IP packets are the words in the language. A language is amenable to > > streaming if the words of the language are repetition of sequences of > > symbols of the alphabet of fixed length. This is, essentially, like > > saying that the words themselves are regular. > > One single IP packet is all you can parse. You're playing shenanigans > with words the way Humpty Dumpty does. IP packets are not sequences, > they are individuals. > > ChrisA From ethan at stoneleaf.us Wed Oct 2 21:57:51 2024 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 2 Oct 2024 18:57:51 -0700 Subject: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: Message-ID: This thread is derailing. Please consider it closed. 
-- ~Ethan~ Moderator From greg.ewing at canterbury.ac.nz Thu Oct 3 03:08:35 2024 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 3 Oct 2024 20:08:35 +1300 Subject: doRe: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API In-Reply-To: References: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org> <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>

Message-ID: On 3/10/24 11:48 am, Left Right wrote: > So, streaming parsers (eg. SAX) are written for a regular language > that approximates XML. SAX doesn't parse a whole XML document, it parses small pieces of it independently and passes them on. It's more like a lexical analyser than a parser in that respect. -- Greg From olegsivokon at gmail.com Thu Oct 3 17:01:53 2024 From: olegsivokon at gmail.com (Left Right) Date: Thu, 3 Oct 2024 23:01:53 +0200 Subject: Python crash together with threads In-Reply-To: References: Message-ID: > whereas I am quite sure that program flows do not overlap. You can never be sure of this in Python. Virtually all objects in Python are allocated on the heap, so instantiating integers, doing simple arithmetic etc. -- all of this requires synchronization because it will allocate memory from a shared pool. The description of _PyThreadState_GET states that callers must hold the GIL. Does your code do that? It's not possible to divine that from the stack trace, but you'd probably know that. On Wed, Oct 2, 2024 at 3:29 PM Guenther Sohler via Python-list wrote: > > My Software project is working fine in most of the cases > (www.pythonscad.org) > however I am right now isolating a scenario, which makes it crash > permanently. > > It does not happen with Python 3.11.6 (and possibly below), it happens with > 3.12 and above > It does not happen when not using Threads. > > However due to the architecture of the program I am forced to evaluate some > parts in main thread and some parts in a dedicated Thread. The Thread is > started with QThread(QT 5.0) > whereas I am quite sure that program flows do not overlap. 
> > When I just execute my 1st very simple Python function inside the newly > created thread, like: > > PyObject *a = PyFloat_FromDouble(3.3); > > my program crashes with this Stack trace > > 0 0x00007f6837fe000f in _PyInterpreterState_GET () at > ./Include/internal/pycore_pystate.h:179 > #1 get_float_state () at Objects/floatobject.c:38 > #2 PyFloat_FromDouble (fval=3.2999999999999998) at > Objects/floatobject.c:136 > #3 0x00000000015a021f in python_testfunc() () > #4 0x0000000001433301 in CGALWorker::work() () > #5 0x0000000000457135 in CGALWorker::qt_static_metacall(QObject*, > QMetaObject::Call, int, void**) () > #6 0x00007f68364d0f9f in void doActivate(QObject*, int, void**) () > at /lib64/libQt5Core.so.5 > #7 0x00007f68362e66ee in QThread::started(QThread::QPrivateSignal) () at > /lib64/libQt5Core.so.5 > #8 0x00007f68362e89c4 in QThreadPrivate::start(void*) () at > /lib64/libQt5Core.so.5 > #9 0x00007f6835cae19d in start_thread () at /lib64/libc.so.6 > #10 0x00007f6835d2fc60 in clone3 () at /lib64/libc.so.6 > > > I suspect, that this is a Null pointer here > See also _PyInterpreterState_Get() > and _PyGILState_GetInterpreterStateUnsafe(). */ > static inline PyInterpreterState* _PyInterpreterState_GET(void) { > PyThreadState *tstate = _PyThreadState_GET(); > #ifdef Py_DEBUG > _Py_EnsureTstateNotNULL(tstate); > #endif > # <<----------- suspect state is nullpointer > return tstate->interp; > } > > any clues , whats going on here, and how I can mitigate that ? > -- > https://mail.python.org/mailman/listinfo/python-list From dciprus at cisco.com Thu Oct 3 18:12:15 2024 From: dciprus at cisco.com (Dan Ciprus (dciprus)) Date: Thu, 3 Oct 2024 22:12:15 +0000 Subject: [Tutor] How to stop a specific thread in Python 2.7? In-Reply-To: References:

Message-ID: I'd be interested too :-). On Thu, Sep 26, 2024 at 03:34:05AM GMT, marc nicole via Python-list wrote: >Could you show a python code example of this? > > >On Thu, 26 Sept 2024, 03:08 Cameron Simpson, wrote: > >> On 25Sep2024 22:56, marc nicole wrote: >> >How to create a per-thread event in Python 2.7? >> >> Every time you make a Thread, make an Event. Pass it to the thread >> worker function and keep it to hand for your use outside the thread. >> _______________________________________________ >> Tutor maillist - Tutor at python.org >> To unsubscribe or change subscription options: >> https://mail.python.org/mailman/listinfo/tutor >> >-- >https://mail.python.org/mailman/listinfo/python-list -- Dan Ciprus [ curl -L http://git.io/unix ] From cs at cskk.id.au Thu Oct 3 19:17:19 2024 From: cs at cskk.id.au (Cameron Simpson) Date: Fri, 4 Oct 2024 09:17:19 +1000 Subject: [Tutor] How to stop a specific thread in Python 2.7? In-Reply-To: References: Message-ID: On 03Oct2024 22:12, Dan Ciprus (dciprus) wrote: >I'd be interested too :-). Untested sketch:

    from threading import Event, Thread

    # Note: the keyword-only E and the [E, *a] unpacking are Python 3 syntax;
    # on Python 2.7 use E = kw.pop('E', None) and args=[E] + list(a) instead.
    def make_thread(target, *a, E=None, **kw):
        '''
        Make a new Event E and Thread T, pass `[E,*a]` as the target
        positional arguments. A shared preexisting Event may be supplied.
        Return a 2-tuple of `(T,E)`.
        '''
        if E is None:
            E = Event()
        T = Thread(target=target, args=[E, *a], kwargs=kw)
        return T, E

Something along those lines. Cheers, Cameron Simpson From mk1853387 at gmail.com Sat Oct 5 13:55:36 2024 From: mk1853387 at gmail.com (marc nicole) Date: Sat, 5 Oct 2024 19:55:36 +0200 Subject: How to check whether lip movement is significant using face landmarks in dlib? Message-ID: I am trying to assess whether the lips of a person are moving too much while the mouth is closed (to conclude they are chewing). 
I try to assess the lip movement through landmarks (dlib) : Inspired by the mouth example ( https://github.com/mauckc/mouth-open/blob/master/detect_open_mouth.py#L17), and using it before the following function (as a primary condition for telling the person is chewing), I wrote the following function:

    from scipy.spatial import distance as dist  # assumed, as in the referenced script

    def lips_aspect_ratio(shape):
        # grab the indexes of the facial landmarks for the lip
        (mStart, mEnd) = (61, 68)
        lip = shape[mStart:mEnd]
        print(len(lip))
        # compute the euclidean distances between the two sets of
        # vertical lip landmarks (x, y)-coordinates
        # to reach landmark 68 I need to get lip[7] not lip[6] (while I get lip[7] I get IndexOutOfBoundError)
        A = dist.euclidean(lip[1], lip[6])  # 62, 68
        B = dist.euclidean(lip[3], lip[5])  # 64, 66
        # compute the euclidean distance between the horizontal
        # lip landmark (x, y)-coordinates
        C = dist.euclidean(lip[0], lip[4])  # 61, 65
        # compute the lip aspect ratio
        mar = (A + B) / (2.0 * C)
        # return the lip aspect ratio
        return mar

How to define an aspect ratio for the lips to conclude they are moving significantly? Is the mentioned function able to tell whether the lips are significantly moving while the mouth is closed? From ml at fam-goebel.de Sat Oct 5 16:27:33 2024 From: ml at fam-goebel.de (Ulrich Goebel) Date: Sat, 5 Oct 2024 22:27:33 +0200 Subject: Best Practice Virtual Environment Message-ID: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> Hi, I learned to use virtual environments wherever possible, and I learned to pip install the required packages there. That works quite nice at home. Now I come to deploy a Python script on a debian linux server, making it usable for a couple of users there. Debian (or even Python3 itself) doesn't allow to pip install required packages system wide, so I have to use virtual environments even there. But is it right, that I have to do that for every single user? Can someone give me a hint to find a howto for that? 
Best regards Ulrich -- Ulrich Goebel From cs at cskk.id.au Sat Oct 5 17:59:56 2024 From: cs at cskk.id.au (Cameron Simpson) Date: Sun, 6 Oct 2024 08:59:56 +1100 Subject: Best Practice Virtual Environment In-Reply-To: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> References: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> Message-ID: On 05Oct2024 22:27, Ulrich Goebel wrote: >Debian (or even Python3 itself) doesn't allow to pip install required >packages system wide, This is generally a good thing. You might modify a critical system-used package. >But is it right, that I have to do that for every single user? No. Just make a shared virtualenv, eg in /usr/local or /opt somewhere. Have the script commence with: #!/path/to/the/shared/venv/bin/python and make it readable and executable. Problem solved. Cheers, Cameron Simpson From list1 at tompassin.net Sat Oct 5 17:31:34 2024 From: list1 at tompassin.net (Thomas Passin) Date: Sat, 5 Oct 2024 17:31:34 -0400 Subject: Best Practice Virtual Environment In-Reply-To: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> References: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> Message-ID: <10ddef1d-d1e1-4614-8958-1f1c278c1ce1@tompassin.net> On 10/5/2024 4:27 PM, Ulrich Goebel via Python-list wrote: > Hi, > > I learned to use virtual environments where ever possible, and I learned to pip install the required packages there. > > That works quite nice at home. Now I come to deploy a Python script on a debian linux server, making it usable for a couple of users there. > > Debian (or even Python3 itself) doesn't allow to pip install required packages system wide, so I have to use virtual environments even there. But is it right, that I have to do that for every single user? > > Can someone give me a hint to find an howto for that? One alternative is to install a different version of Python without replacing the system's version. For example, if the system uses Python 3.11, install Python 3.12. 
That way there is no risk of breaking system operation, and you can install what you like where you like. From Karsten.Hilbert at gmx.net Sat Oct 5 18:21:09 2024 From: Karsten.Hilbert at gmx.net (Karsten Hilbert) Date: Sun, 6 Oct 2024 00:21:09 +0200 Subject: Best Practice Virtual Environment In-Reply-To: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> References: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> Message-ID: Am Sat, Oct 05, 2024 at 10:27:33PM +0200 schrieb Ulrich Goebel via Python-list: > Debian (or even Python3 itself) doesn't allow to pip install required packages system wide, so I have to use virtual environments even there. But is it right, that I have to do that for every single user? > > Can someone give me a hint to find an howto for that? AFAICT the factual consensus appears to be install modules as packaged by the system you won't need anything else If you do find how to cleanly install non-packaged modules in a system-wide way (even if that means installing every application into its own *system-wide* venv) - do let me know. Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B From Karsten.Hilbert at gmx.net Sun Oct 6 09:44:02 2024 From: Karsten.Hilbert at gmx.net (Karsten Hilbert) Date: Sun, 6 Oct 2024 15:44:02 +0200 Subject: Best Practice Virtual Environment In-Reply-To: References: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> Message-ID: Am Sun, Oct 06, 2024 at 12:21:09AM +0200 schrieb Karsten Hilbert via Python-list: > Am Sat, Oct 05, 2024 at 10:27:33PM +0200 schrieb Ulrich Goebel via Python-list: > > > Debian (or even Python3 itself) doesn't allow to pip install required packages system wide, so I have to use virtual environments even there. But is it right, that I have to do that for every single user? > > > > Can someone give me a hint to find an howto for that? 
> > If you do find how to cleanly install non-packaged modules > in a system-wide way (even if that means installing every > application into its own *system-wide* venv) - do let me > know. It seems dh-virtualenv is one way to do it. On Debian. Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B From transreductionist at gmail.com Sun Oct 6 13:30:24 2024 From: transreductionist at gmail.com (transreductionist) Date: Sun, 6 Oct 2024 13:30:24 -0400 Subject: Best Practice Virtual Environment In-Reply-To: References: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> Message-ID: This is how we handle this problem at a large organization. In the repository there are a number of build scripts. For convenience we use poetry (poetry.toml) to manage the virtual environment. A pyproduct.toml is used to define dependencies, how tests are run, the linter config, etc. So there are scripts for poetry lock, poetry install, and whatever else is needed. A user pulls down the repository and runs 1. poetry lock 2. poetry install And they have their environment with the proper dependencies. On Sun, Oct 6, 2024, 09:47 Karsten Hilbert via Python-list < python-list at python.org> wrote: > Am Sun, Oct 06, 2024 at 12:21:09AM +0200 schrieb Karsten Hilbert via > Python-list: > > > Am Sat, Oct 05, 2024 at 10:27:33PM +0200 schrieb Ulrich Goebel via > Python-list: > > > > > Debian (or even Python3 itself) doesn't allow to pip install required > packages system wide, so I have to use virtual environments even there. But > is it right, that I have to do that for every single user? > > > > > > Can someone give me a hint to find an howto for that? > > > > If you do find how to cleanly install non-packaged modules > > in a system-wide way (even if that means installing every > > application into its own *system-wide* venv) - do let me > > know. > > It seems dh-virtualenv is one way to do it. On Debian. 
> > Karsten > -- > GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B > -- > https://mail.python.org/mailman/listinfo/python-list > From transreductionist at gmail.com Sun Oct 6 13:31:09 2024 From: transreductionist at gmail.com (transreductionist) Date: Sun, 6 Oct 2024 13:31:09 -0400 Subject: Best Practice Virtual Environment In-Reply-To: References: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> Message-ID: byproduct.toml On Sun, Oct 6, 2024, 13:30 transreductionist wrote: > This is how we handle this problem at a large organization. > > In the repository there are a number of build scripts. For convenience we > use poetry (poetry.toml) to manage the virtual environment. A > pyproduct.toml is used to define dependencies, how tests are run, the > linter config, etc. > > So there are scripts for poetry lock, poetry install, and whatever else is > needed. > > A user pulls down the repository and runs > 1. poetry lock > 2. poetry install > And they have their environment with the proper dependencies. > > On Sun, Oct 6, 2024, 09:47 Karsten Hilbert via Python-list < > python-list at python.org> wrote: > >> Am Sun, Oct 06, 2024 at 12:21:09AM +0200 schrieb Karsten Hilbert via >> Python-list: >> >> > Am Sat, Oct 05, 2024 at 10:27:33PM +0200 schrieb Ulrich Goebel via >> Python-list: >> > >> > > Debian (or even Python3 itself) doesn't allow to pip install required >> packages system wide, so I have to use virtual environments even there. But >> is it right, that I have to do that for every single user? >> > > >> > > Can someone give me a hint to find an howto for that? >> > >> > If you do find how to cleanly install non-packaged modules >> > in a system-wide way (even if that means installing every >> > application into its own *system-wide* venv) - do let me >> > know. >> >> It seems dh-virtualenv is one way to do it. On Debian. 
>> >> Karsten >> -- >> GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B >> -- >> https://mail.python.org/mailman/listinfo/python-list >> > From antoon.pardon at vub.be Sun Oct 6 16:19:10 2024 From: antoon.pardon at vub.be (Antoon Pardon) Date: Sun, 6 Oct 2024 22:19:10 +0200 Subject: Beazley's Problem In-Reply-To: <0709b4b8b0bbf2a32d53649d1a6fbefbcd44a68a.camel@tilde.green> References: <87tte941ko.fsf@nightsong.com> <87plow4v4p.fsf@nightsong.com> <0709b4b8b0bbf2a32d53649d1a6fbefbcd44a68a.camel@tilde.green> Message-ID: On 23/09/2024 at 09:44, Annada Behera via Python-list wrote: > The "next-level math trick" Newton-Raphson has nothing to do with > functional programming. I have written solvers in purely iterative > style. What is your point? Any problem solved in a functional style can also be solved in a purely iterative style. So you having written something in an iterative style doesn't contradict Newton-Raphson being expressible in a functional style. > As far as I know, Newton-Raphson is the opposite of functional > programming as you iteratively solve for the root. Functional programming > is stateless where you are not allowed to store any state (current best > guess root). That doesn't prevent you from passing state along as a parameter, usually in some helper function. -- Antoon Pardon. From thomas at python.org Mon Oct 7 14:57:24 2024 From: thomas at python.org (Thomas Wouters) Date: Mon, 7 Oct 2024 11:57:24 -0700 Subject: [RELEASE] Python 3.13.0 (final) released Message-ID: After all the shenanigans two weeks ago - everyone discovering nasty little problems in release candidate 2 - the last week was suspiciously quiet, and therefore I can finally say: Python 3.13.0 is now available https://www.python.org/downloads/release/python-3130/ This is the stable release of Python 3.13.0 Python 3.13.0 is the newest major release of the Python programming language, and it contains many new features and optimizations compared to Python 3.12. 
(Compared to the last release candidate, 3.13.0rc3, 3.13.0 contains two small bug fixes and some documentation and testing changes.)

Major new features of the 3.13 series, compared to 3.12

Some of the major new features and changes in Python 3.13 are:

New features

- A new and improved interactive interpreter, based on PyPy's, featuring multi-line editing and color support, as well as colorized exception tracebacks.
- An *experimental* free-threaded build mode, which disables the Global Interpreter Lock, allowing threads to run more concurrently. The build mode is available as an experimental feature in the Windows and macOS installers as well.
- A preliminary, *experimental* JIT, providing the groundwork for significant performance improvements.
- The locals() builtin function (and its C equivalent) now has well-defined semantics when mutating the returned mapping, which allows debuggers to operate more consistently.
- A modified version of mimalloc is now included, optional but enabled by default if supported by the platform, and required for the free-threaded build mode.
- Docstrings now have their leading indentation stripped, reducing memory use and the size of .pyc files. (Most tools handling docstrings already strip leading indentation.)
- The dbm module has a new dbm.sqlite3 backend that is used by default when creating new files.
- The minimum supported macOS version was changed from 10.9 to 10.13 (High Sierra). Older macOS versions will not be supported going forward.
- WASI is now a Tier 2 supported platform. Emscripten is no longer an officially supported platform (but Pyodide continues to support Emscripten).
- iOS is now a Tier 3 supported platform.
- Android is now a Tier 3 supported platform.

Typing

- Support for type defaults in type parameters.
- A new type narrowing annotation, typing.TypeIs.
- A new annotation for read-only items in TypedDicts.
- A new annotation for marking deprecations in the type system.

Removals and new deprecations

- PEP 594 (Removing dead batteries from the standard library) scheduled removals of many deprecated modules: aifc, audioop, chunk, cgi, cgitb, crypt, imghdr, mailcap, msilib, nis, nntplib, ossaudiodev, pipes, sndhdr, spwd, sunau, telnetlib, uu, xdrlib, lib2to3.
- Many other removals of deprecated classes, functions and methods in various standard library modules.
- C API removals and deprecations. (Some removals present in alpha 1 were reverted in alpha 2, as the removals were deemed too disruptive at this time.)
- New deprecations, most of which are scheduled for removal from Python 3.15 or 3.16.

For more details on the changes to Python 3.13, see What's new in Python 3.13.

More resources

- Online Documentation
- PEP 719, 3.13 Release Schedule
- Report bugs at Issues · python/cpython · GitHub.
- Help fund Python directly (or via GitHub Sponsors), and support the Python community.

We hope you enjoy the new releases! Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organization contributions to the Python Software Foundation.

Choo-choo from the release train, Your release team, Thomas Wouters Ned Deily Steve Dower Łukasz Langa

From olegsivokon at gmail.com Sun Oct 6 07:42:18 2024 From: olegsivokon at gmail.com (Left Right) Date: Sun, 6 Oct 2024 13:42:18 +0200 Subject: Best Practice Virtual Environment In-Reply-To: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> References: <20241005222733.fd60f7e672e849aa63c8b360@fam-goebel.de> Message-ID: Hi. The advice here is from a perspective of someone who does this professionally, for large, highly loaded systems. This doesn't necessarily apply to your case / not to the full extent. > Debian (or even Python3 itself) doesn't allow to pip install required packages system wide, so I have to use virtual environments even there. 
But is it right, that I have to do that for every single user? 1. Yes, you can install packages system-wide with pip, but you don't need to. 2. pip is OK to install requirements once, to figure out what they are (in dev. environment). It's bad for production environment: it's slow, inconsistent, and insecure. For more context: pip dependency resolution is especially slow when installing local interdependent packages. Sometimes it can take up to a minute per package. Inconsistency comes from pip not using package checksums and signatures (by default): so, if the package being installed was updated w/o version update, to pip it's going to be the same package. Not just that, for some packages pip has to resort to building them from source, in which case nobody can guarantee the end result. Insecurity comes from Python allowing out-of-index package downloads during install. You can distribute your package through PyPI, while its dependency will point to a random Web site in a country with very permissive laws (and, essentially, just put malware on your computer). It's impossible to properly audit such situations because the outside Web site doesn't have to provide any security guarantees. To package anything Linux-related, use the packaging mechanism provided by the flavor of Linux you are using. In the case of Debian, use DEB. Don't use virtual environments for this (it's possible to roll the entire virtual environment into a DEB package, but that's a bad idea). The reason to do this is so that your package plays nice with other Python packages available as DEB packages. This will allow your users to use a consistent interface when dealing with installing packages, and to avoid situation when an out-of-bound tool installed something in the same path where dpkg will try to install the same files, but coming from a legitimate package. 
If you package the whole virtual environment, you might run into problems with locating native libraries linked from Python native modules. You will make it hard to audit the installation, especially when it comes to certificates, TLS etc. stuff that, preferably, should be handled in a centralized way by the OS. Of course, countless times I've seen developers do the exact opposite of what I'm suggesting here. Also, the big actors in the industry s.a. Microsoft and Amazon do the exact opposite of what I suggest. I have no problem acknowledging this and still maintaining that they are wrong and I'm right :) But, you don't have to trust me! From michael.stemper at gmail.com Mon Oct 7 09:35:32 2024 From: michael.stemper at gmail.com (Michael F. Stemper) Date: Mon, 7 Oct 2024 08:35:32 -0500 Subject: Correct syntax for pathological re.search() Message-ID: I'm trying to discard lines that include the string "\sout{" (which is TeX, for those who are curious. I have tried: if not re.search("\sout{", line): if not re.search("\sout\{", line): if not re.search("\\sout{", line): if not re.search("\\sout\{", line): But the lines with that string keep coming through. What is the right syntax to properly escape the backslash and the left curly bracket? -- Michael F. Stemper No animals were harmed in the composition of this message. From michael.stemper at gmail.com Mon Oct 7 10:14:53 2024 From: michael.stemper at gmail.com (Michael F. Stemper) Date: Mon, 7 Oct 2024 09:14:53 -0500 Subject: Correct syntax for pathological re.search() In-Reply-To: References:

Message-ID: On 07/10/2024 08.56, Stefan Ram wrote: > "Michael F. Stemper" wrote or quoted: >> if not re.search("\\sout\{", line): > > So, if you're not down to slap an "r" before your string literals, > you're going to end up doubling down on every backslash. Never heard of that before, but it did the trick. > Long story short, those double backslashes in your regex? > They'll be quadrupling up in your Python string literal! > for line in lines: > product = re.search( "\\\\sout\\{", line ) This also worked. For now, I'll use the "r" in a cargo-cult fashion, until I decide which syntax I prefer. (Is there any reason that one or the other is preferable?) Thanks for your help, Mike -- Michael F. Stemper Economists have correctly predicted seven of the last three recessions. From jon+usenet at unequivocal.eu Mon Oct 7 11:43:59 2024 From: jon+usenet at unequivocal.eu (Jon Ribbens) Date: Mon, 7 Oct 2024 15:43:59 -0000 (UTC) Subject: Correct syntax for pathological re.search() References:

Message-ID: On 2024-10-07, Stefan Ram wrote: > "Michael F. Stemper" wrote or quoted: >>For now, I'll use the "r" in a cargo-cult fashion, until I decide which >>syntax I prefer. (Is there any reason that one or the other is preferable?) > > I'd totally go with the r-style notation! > > It's got one bummer though - you can't end such a string literal with > a backslash. But hey, no biggie, you could use one of those notations: > > main.py > > path = r'C:\Windows\example' + '\\' > > print( path ) > > path = r''' > C:\Windows\example\ > '''.strip() > > print( path ) > > stdout > > C:\Windows\example\ > C:\Windows\example\ > > . ... although of course in this example you should probably do neither of those things, and instead do: from pathlib import Path path = Path(r'C:\Windows\example') since in a Path the trailing '\' or '/' is unnecessary. Which leaves very few remaining uses for a raw-string with a trailing '\'... From pieter-l at vanoostrum.org Tue Oct 8 13:50:14 2024 From: pieter-l at vanoostrum.org (Pieter van Oostrum) Date: Tue, 08 Oct 2024 19:50:14 +0200 Subject: Correct syntax for pathological re.search() References:

Message-ID: ram at zedat.fu-berlin.de (Stefan Ram) writes: > "Michael F. Stemper" wrote or quoted: > > path = r'C:\Windows\example' + '\\' > You could even omit the '+'. Then the concatenation is done at parsing time instead of run time. -- Pieter van Oostrum www: http://pieter.vanoostrum.org/ PGP key: [8DAE142BE17999C4] From Karsten.Hilbert at gmx.net Tue Oct 8 14:30:34 2024 From: Karsten.Hilbert at gmx.net (Karsten Hilbert) Date: Tue, 8 Oct 2024 20:30:34 +0200 Subject: Correct syntax for pathological re.search() In-Reply-To: References: Message-ID: Am Mon, Oct 07, 2024 at 08:35:32AM -0500 schrieb Michael F. Stemper via Python-list: > I'm trying to discard lines that include the string "\sout{" (which is TeX, for > those who are curious. I have tried: > if not re.search("\sout{", line): > if not re.search("\sout\{", line): > if not re.search("\\sout{", line): > if not re.search("\\sout\{", line): unwanted_tex = '\sout{' if unwanted_tex not in line: do_something_with_libreoffice() Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B From python at mrabarnett.plus.com Tue Oct 8 15:07:04 2024 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 8 Oct 2024 20:07:04 +0100 Subject: Correct syntax for pathological re.search() In-Reply-To: References: Message-ID: <1e13579f-693d-44cf-a563-7c0c9767e04e@mrabarnett.plus.com> On 2024-10-08 19:30, Karsten Hilbert via Python-list wrote: > Am Mon, Oct 07, 2024 at 08:35:32AM -0500 schrieb Michael F. Stemper via Python-list: > >> I'm trying to discard lines that include the string "\sout{" (which is TeX, for >> those who are curious. 
I have tried: >> if not re.search("\sout{", line): >> if not re.search("\sout\{", line): >> if not re.search("\\sout{", line): >> if not re.search("\\sout\{", line): > > unwanted_tex = '\sout{' > if unwanted_tex not in line: do_something_with_libreoffice() > That should be: unwanted_tex = r'\sout{' or: unwanted_tex = '\\sout{' From python at mrabarnett.plus.com Tue Oct 8 15:11:40 2024 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 8 Oct 2024 20:11:40 +0100 Subject: Correct syntax for pathological re.search() In-Reply-To: References: Message-ID: On 2024-10-07 14:35, Michael F. Stemper via Python-list wrote: > I'm trying to discard lines that include the string "\sout{" (which is TeX, for > those who are curious. I have tried: > if not re.search("\sout{", line): > if not re.search("\sout\{", line): > if not re.search("\\sout{", line): > if not re.search("\\sout\{", line): > > But the lines with that string keep coming through. What is the right syntax to > properly escape the backslash and the left curly bracket? > String literals use backslash as an escape character, so it needs to be escaped, or you need to use a "raw" string. However, regex also uses backslash as an escape character. That means that a literal backslash in a regex that's in a plain string literal needs to be doubly-escaped, once for the string literal and again for the regex. From Karsten.Hilbert at gmx.net Tue Oct 8 16:17:49 2024 From: Karsten.Hilbert at gmx.net (Karsten Hilbert) Date: Tue, 8 Oct 2024 22:17:49 +0200 Subject: Correct syntax for pathological re.search() In-Reply-To: <1e13579f-693d-44cf-a563-7c0c9767e04e@mrabarnett.plus.com> References: <1e13579f-693d-44cf-a563-7c0c9767e04e@mrabarnett.plus.com> Message-ID: On Tue, Oct 08, 2024 at 08:07:04PM +0100, MRAB via Python-list wrote: > >unwanted_tex = '\sout{' > >if unwanted_tex not in line: do_something_with_libreoffice() > > > That should be: > > unwanted_tex = r'\sout{' Hm. 
Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> tex = '\sout{' >>> tex '\\sout{' >>> Am I missing something ? Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B From alan at csail.mit.edu Tue Oct 8 16:59:48 2024 From: alan at csail.mit.edu (Alan Bawden) Date: Tue, 08 Oct 2024 16:59:48 -0400 Subject: Correct syntax for pathological re.search() References: <1e13579f-693d-44cf-a563-7c0c9767e04e@mrabarnett.plus.com>

Message-ID: <864j5mfgzf.fsf@williamsburg.bawden.org> Karsten Hilbert writes: Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> tex = '\sout{' >>> tex '\\sout{' >>> Am I missing something ? You're missing the warning it generates: > python -E -Wonce Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> tex = '\sout{' :1: DeprecationWarning: invalid escape sequence '\s' >>> From python at mrabarnett.plus.com Tue Oct 8 18:10:03 2024 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 8 Oct 2024 23:10:03 +0100 Subject: Correct syntax for pathological re.search() In-Reply-To: <864j5mfgzf.fsf@williamsburg.bawden.org> References: <1e13579f-693d-44cf-a563-7c0c9767e04e@mrabarnett.plus.com>

<864j5mfgzf.fsf@williamsburg.bawden.org> Message-ID: <3ab03165-185b-45f7-9fba-1918b83afdd8@mrabarnett.plus.com> On 2024-10-08 21:59, Alan Bawden via Python-list wrote: > Karsten Hilbert writes: > > Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> tex = '\sout{' > >>> tex > '\\sout{' > >>> > > Am I missing something ? > > You're missing the warning it generates: > > > python -E -Wonce > Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> tex = '\sout{' > :1: DeprecationWarning: invalid escape sequence '\s' > >>> You got lucky that \s is invalid. If it had been \t you would've got a tab character. Historically, Python treated invalid escape sequences as literals, but it's deprecated now and will become an outright error in the future (probably) because it often hides a mistake, such as the aforementioned \t being treated as a tab character when the user expected it to be a literal backslash followed by the letter t. (This can occur within Windows file paths written in plain string literals.) From avi.e.gross at gmail.com Tue Oct 8 19:43:35 2024 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Tue, 8 Oct 2024 19:43:35 -0400 Subject: Signing off In-Reply-To: <007701d89150$1dea86b0$59bf9410$@gmail.com> References: <008a01d890e0$756336a0$6029a3e0$@gmail.com> <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk> <007701d89150$1dea86b0$59bf9410$@gmail.com> Message-ID: <012d01db19db$e42d7c40$ac8874c0$@gmail.com> Just a final brief note. I am leaving the python community so don't worry that anything happened to me. I have a disagreement with the direction some people are taking with the python community that is my issue and that probably will not bother most people. 
I have lots of other interests including many other programming languages and it is time I stopped using python when I have so much else to choose from. My best wishes to everyone here. Avi From Karsten.Hilbert at gmx.net Wed Oct 9 14:06:10 2024 From: Karsten.Hilbert at gmx.net (Karsten Hilbert) Date: Wed, 9 Oct 2024 20:06:10 +0200 Subject: Correct syntax for pathological re.search() In-Reply-To: <864j5mfgzf.fsf@williamsburg.bawden.org> References: <1e13579f-693d-44cf-a563-7c0c9767e04e@mrabarnett.plus.com>

<864j5mfgzf.fsf@williamsburg.bawden.org> Message-ID: Am Tue, Oct 08, 2024 at 04:59:48PM -0400 schrieb Alan Bawden via Python-list: > Karsten Hilbert writes: > > Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> tex = '\sout{' > >>> tex > '\\sout{' > >>> > > Am I missing something ? > > You're missing the warning it generates: > > :1: DeprecationWarning: invalid escape sequence '\s' I knew it'd be good to ask :-D Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B From martin.stopka at gmail.com Thu Oct 10 09:07:20 2024 From: martin.stopka at gmail.com (stopa) Date: Thu, 10 Oct 2024 15:07:20 +0200 Subject: dis.get_instructions not showing CACHE instructions Message-ID: Hello, I noticed the change in dis module, no longer requiring show_caches to be set to True to show cache instructions. However I am not able to display them with get_instructions. Is there by any chance some bug preventing me to see them? Thanks Martin From vinay_sajip at yahoo.co.uk Thu Oct 10 10:50:38 2024 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 10 Oct 2024 14:50:38 +0000 (UTC) Subject: Announcement: distlib 0.3.9 released on PyPI References: <517012813.550286.1728571838732.ref@mail.yahoo.com> Message-ID: <517012813.550286.1728571838732@mail.yahoo.com> Version 0.3.9 of distlib has recently been released on PyPI [1]. For newcomers, distlib is a library of packaging functionality which is intended to be usable as the basis for third-party packaging tools. The main changes in this release are as follows: * Merge #215: Preload script wrappers on Windows to assist with a pip issue. * Fix #220: Remove duplicated newline in shebang of windows launcher. * Fix #222: Support mounting wheels that use extensions without an EXTENSIONS file. * Fix #224: Do not use the absolute path to cache wheel extensions. * Fix #225: Add support for wheel compatibility with the limited API. 
* Fix #230: Add handling for cross-compilation environments. A more detailed change log is available at [2]. Please try it out, and if you find any problems or have any suggestions for improvements, please give some feedback using the issue tracker at [3]. Regards, Vinay Sajip [1] https://pypi.org/project/distlib/0.3.9/ [2] https://distlib.readthedocs.io/en/latest/overview.html#change-log-for-distlib [3] https://github.com/pypa/distlib/issues/new/choose From barry at barrys-emacs.org Thu Oct 10 12:53:37 2024 From: barry at barrys-emacs.org (Barry) Date: Thu, 10 Oct 2024 17:53:37 +0100 Subject: dis.get_instructions not showing CACHE instructions In-Reply-To: References: Message-ID: <0FF12307-A10D-489B-8BF7-B397A93D698D@barrys-emacs.org> > On 10 Oct 2024, at 14:18, stopa via Python-list wrote: > > Hello, > I noticed the change in dis module, no longer requiring show_caches to be > set to True to show cache instructions. However I am not able to display > them with get_instructions. > Is there by any chance some bug preventing me to see them? We need more information to be able to comment. What version of Python do you see this working for? In what version of Python do you see it change? Can you show an example function that demonstrates the issue please. Barry > > Thanks > > Martin > -- > https://mail.python.org/mailman/listinfo/python-list > From martin.stopka at gmail.com Thu Oct 10 13:31:57 2024 From: martin.stopka at gmail.com (stopa) Date: Thu, 10 Oct 2024 19:31:57 +0200 Subject: dis.get_instructions not showing CACHE instructions In-Reply-To: <0FF12307-A10D-489B-8BF7-B397A93D698D@barrys-emacs.org> References: <0FF12307-A10D-489B-8BF7-B397A93D698D@barrys-emacs.org> Message-ID: Oh god I am sorry :/ I somehow missed information about cache_info field. I was expecting to see those cache instructions as normal opcodes. So it's working as expected. Thanks for your help. M. 
On Thu, 10 Oct 2024 at 18:53, Barry wrote: > > > > On 10 Oct 2024, at 14:18, stopa via Python-list > wrote: > > > > Hello, > > I noticed the change in dis module, no longer requiring show_caches to be > > set to True to show cache instructions. However I am not able to display > > them with get_instructions. > > Is there by any chance some bug preventing me to see them? > > We need more information to be able to comment. > > What version of Python do you see this working for? > In what version of Python do you see it change? > > Can you show an example function that demonstrates the issue please. > > Barry > > > > > Thanks > > > > Martin > > -- > > https://mail.python.org/mailman/listinfo/python-list > > > > From dciprus at cisco.com Fri Oct 11 14:32:40 2024 From: dciprus at cisco.com (Dan Ciprus (dciprus)) Date: Fri, 11 Oct 2024 18:32:40 +0000 Subject: [Tutor] How to stop a specific thread in Python 2.7? In-Reply-To: References: Message-ID: Thank you for the hint ! On Fri, Oct 04, 2024 at 09:17:19AM GMT, Cameron Simpson wrote: >On 03Oct2024 22:12, Dan Ciprus (dciprus) wrote: >>I'd be interested too :-). > >Untested sketch: > > def make_thread(target, *a, E=None, **kw): > ''' > Make a new Event E and Thread T, pass `[E,*a]` as the target >positional arguments. > A shared preexisting Event may be supplied. > Return a 2-tuple of `(T,E)`. > ''' > if E is None: > E = Event() > T = Thread(target=target, args=[E, *a], kwargs=kw) > return T, E > >Something along those lines. > >Cheers, >Cameron Simpson -- Dan Ciprus [ curl -L http://git.io/unix ] -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: not available URL: From avi.e.gross at gmail.com Fri Oct 11 17:13:07 2024 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Fri, 11 Oct 2024 17:13:07 -0400 Subject: Correct syntax for pathological re.search() In-Reply-To: References: Message-ID: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> Is there some utility function out there that can be called to show what the regular expression you typed in will look like by the time it is ready to be used? Obviously, life is not that simple as it can go through multiple layers with each dealing with a layer of backslashes. But for simple cases, ... -----Original Message----- From: Python-list On Behalf Of Gilmeh Serda via Python-list Sent: Friday, October 11, 2024 10:44 AM To: python-list at python.org Subject: Re: Correct syntax for pathological re.search() On Mon, 7 Oct 2024 08:35:32 -0500, Michael F. Stemper wrote: > I'm trying to discard lines that include the string "\sout{" (which is > TeX, for those who are curious. I have tried: > if not re.search("\sout{", line): if not re.search("\sout\{", line): > if not re.search("\\sout{", line): if not re.search("\\sout\{", > line): > > But the lines with that string keep coming through. What is the right > syntax to properly escape the backslash and the left curly bracket? $ python Python 3.12.6 (main, Sep 8 2024, 13:18:56) [GCC 14.2.1 20240805] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import re >>> s = r"testing \sout{WHADDEVVA}" >>> re.search(r"\\sout{", s) You want a literal backslash, hence, you need to escape everything. It is not enough to escape the "\s" as "\\s", because that only takes care of Python's demands for escaping "\". 
You also need to escape the "\" for the RegEx as well, or it will read it like it means "\s", which is the RegEx for a space character and therefore your search doesn't match, because it reads it like you want to search for " out{". Therefore, you need to escape it either as per my example, or by using four "\" and no "r" in front of the first quote, which also works: >>> re.search("\\\\sout{", s) You don't need to escape the curly braces. We call them "seagull wings" where I live. -- Gilmeh Sometimes I simply feel that the whole world is a cigarette and I'm the only ashtray. -- https://mail.python.org/mailman/listinfo/python-list From python at mrabarnett.plus.com Fri Oct 11 20:37:55 2024 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 12 Oct 2024 01:37:55 +0100 Subject: Correct syntax for pathological re.search() In-Reply-To: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> References: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> Message-ID: On 2024-10-11 22:13, AVI GROSS via Python-list wrote: > Is there some utility function out there that can be called to show what the > regular expression you typed in will look like by the time it is ready to be > used? > > Obviously, life is not that simple as it can go through multiple layers with > each dealing with a layer of backslashes. > > But for simple cases, ... > Yes. It's called 'print'. :-) > > > -----Original Message----- > From: Python-list On > Behalf Of Gilmeh Serda via Python-list > Sent: Friday, October 11, 2024 10:44 AM > To: python-list at python.org > Subject: Re: Correct syntax for pathological re.search() > > On Mon, 7 Oct 2024 08:35:32 -0500, Michael F. Stemper wrote: > >> I'm trying to discard lines that include the string "\sout{" (which is >> TeX, for those who are curious. I have tried: >> if not re.search("\sout{", line): if not re.search("\sout\{", line): >> if not re.search("\\sout{", line): if not re.search("\\sout\{", >> line): >> >> But the lines with that string keep coming through. 
What is the right >> syntax to properly escape the backslash and the left curly bracket? > > $ python > Python 3.12.6 (main, Sep 8 2024, 13:18:56) [GCC 14.2.1 20240805] on linux > Type "help", "copyright", "credits" or "license" for more information. >>>> import re >>>> s = r"testing \sout{WHADDEVVA}" >>>> re.search(r"\\sout{", s) > > > You want a literal backslash, hence, you need to escape everything. > > It is not enough to escape the "\s" as "\\s", because that only takes care > of Python's demands for escaping "\". You also need to escape the "\" for > the RegEx as well, or it will read it like it means "\s", which is the > RegEx for a space character and therefore your search doesn't match, > because it reads it like you want to search for " out{". > > Therefore, you need to escape it either as per my example, or by using > four "\" and no "r" in front of the first quote, which also works: > >>>> re.search("\\\\sout{", s) > > > You don't need to escape the curly braces. We call them "seagull wings" > where I live. > From hjp-python at hjp.at Sat Oct 12 06:59:58 2024 From: hjp-python at hjp.at (Peter J. Holzer) Date: Sat, 12 Oct 2024 12:59:58 +0200 Subject: Correct syntax for pathological re.search() In-Reply-To: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> References: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> Message-ID: <20241012105958.cbctekv7vustleha@hjp.at> On 2024-10-11 17:13:07 -0400, AVI GROSS via Python-list wrote: > Is there some utility function out there that can be called to show what the > regular expression you typed in will look like by the time it is ready to be > used? I assume that by "ready to be used" you mean the compiled form? No, there doesn't seem to be a way to dump that. You can p = re.compile("\\\\sout{") print(p.pattern) but that just prints the input string, which you could do without compiling it first. But - without having looked at the implementation - it's far from clear that the compiled form would be useful to the user. 
It's probably some kind of state machine, and a large table of state transitions isn't very readable. There are a number of websites which visualize regular expressions. Those are probably better for debugging a regular expression than anything the re module could reasonably produce (although with the caveat that such a web site would use a different implementation and therefore might produce different results). hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp at hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From list1 at tompassin.net Sat Oct 12 08:51:57 2024 From: list1 at tompassin.net (Thomas Passin) Date: Sat, 12 Oct 2024 08:51:57 -0400 Subject: Correct syntax for pathological re.search() In-Reply-To: <20241012105958.cbctekv7vustleha@hjp.at> References: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> <20241012105958.cbctekv7vustleha@hjp.at> Message-ID: <966b510d-9bd7-4472-a858-7e042d78461d@tompassin.net> On 10/12/2024 6:59 AM, Peter J. Holzer via Python-list wrote: > On 2024-10-11 17:13:07 -0400, AVI GROSS via Python-list wrote: >> Is there some utility function out there that can be called to show what the >> regular expression you typed in will look like by the time it is ready to be >> used? > > I assume that by "ready to be used" you mean the compiled form? > > No, there doesn't seem to be a way to dump that. You can > > p = re.compile("\\\\sout{") > print(p.pattern) > > but that just prints the input string, which you could do without > compiling it first. It prints the escaped version, so you can see if you escaped the string as you intended. In this case, the print will display '\\sout{'. That's worth something. 
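A minimal sketch of the point made above, not taken from any of the posts: printing a pattern shows the characters the regex engine receives, which makes it easy to compare different spellings of the same search. The sample lines and the variable names are made up for illustration, and `re.escape` is a standard-library helper that nobody in the thread mentioned.

```python
import re

# Hypothetical sample input; only the "\sout{" marker comes from the thread.
lines = [r"keep this line", r"drop \sout{struck out} text"]

# Three spellings of the same regex: a literal backslash, "sout", a literal "{".
raw_pattern = r"\\sout\{"               # raw string literal
plain_pattern = "\\\\sout\\{"           # plain literal, each backslash doubled again
escaped_pattern = re.escape("\\sout{")  # let re add the regex-level escapes

# print shows the characters the regex engine actually receives.
for pattern in (raw_pattern, plain_pattern, escaped_pattern):
    print(pattern)

# Keep only the lines that do not contain the marker.
kept = [line for line in lines if not re.search(raw_pattern, line)]
print(kept)
```

All three patterns print the same text, so any of them can be passed to re.search(). For a fixed substring like this one, the plain `in` test suggested earlier in the thread avoids the regex layer of escaping altogether.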
> > But - without having looked at the implementation - it's far from clear > that the compiled form would be useful to the user. It's probably some > kind of state machine, and a large table of state transitions isn't very > readable. > > There are a number of websites which visualize regular expressions. > Those are probably better for debugging a regular expression than > anything the re module could reasonably produce (although with the > caveat that such a web site would use a different implementation and > therefore might produce different results). > > hp > > From avi.e.gross at gmail.com Sat Oct 12 10:10:41 2024 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Sat, 12 Oct 2024 10:10:41 -0400 Subject: Correct syntax for pathological re.search() In-Reply-To: <20241012105958.cbctekv7vustleha@hjp.at> References: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> <20241012105958.cbctekv7vustleha@hjp.at> Message-ID: <003201db1cb0$85ac8760$91059620$@gmail.com> Peter, Matthew understood what I was hinting at in one way and you in another. The question asked how to add some power of two backslashes or make other changes, so the RE functionality sees what you want. The goal is to see what happens when one or more intermediate evaluations may change the string. So, a simple print may suffice as a parallel way to force the same evaluations. Thomas made his point. And, I am starting to feel like I need to change my name to something like Luke since this discussion must be gospel. FYI, I was not planning on posting at all. Time to detach. -----Original Message----- From: Python-list On Behalf Of Peter J. 
Holzer via Python-list Sent: Saturday, October 12, 2024 7:00 AM To: python-list at python.org Subject: Re: Correct syntax for pathological re.search() On 2024-10-11 17:13:07 -0400, AVI GROSS via Python-list wrote: > Is there some utility function out there that can be called to show what the > regular expression you typed in will look like by the time it is ready to be > used? I assume that by "ready to be used" you mean the compiled form? No, there doesn't seem to be a way to dump that. You can p = re.compile("\\\\sout{") print(p.pattern) but that just prints the input string, which you could do without compiling it first. But - without having looked at the implementation - it's far from clear that the compiled form would be useful to the user. It's probably some kind of state machine, and a large table of state transitions isn't very readable. There are a number of websites which visualize regular expressions. Those are probably better for debugging a regular expression than anything the re module could reasonably produce (although with the caveat that such a web site would use a different implementation and therefore might produce different results). hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp at hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!" From list1 at tompassin.net Sat Oct 12 09:06:54 2024 From: list1 at tompassin.net (Thomas Passin) Date: Sat, 12 Oct 2024 09:06:54 -0400 Subject: Correct syntax for pathological re.search() In-Reply-To: References: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> Message-ID: On 10/11/2024 8:37 PM, MRAB via Python-list wrote: > On 2024-10-11 22:13, AVI GROSS via Python-list wrote: >> Is there some utility function out there that can be called to show >> what the >> regular expression you typed in will look like by the time it is ready >> to be >> used? 
>> >> Obviously, life is not that simple as it can go through multiple >> layers with >> each dealing with a layer of backslashes. >> >> But for simple cases, ... >> > Yes. It's called 'print'. :-) There is section in the Python docs about this backslash subject. It's titled "The Backslash Plague" in https://docs.python.org/3/howto/regex.html You can also inspect the compiled expression to see what string it received after all the escaping: >>> import re >>> >>> re_string = '\\w+\\\\sub' >>> re_pattern = re.compile(re_string) >>> >>> # Should look as if we had used r'\w+\\sub' >>> print(re_pattern.pattern) \w+\\sub >> -----Original Message----- >> From: Python-list > bounces+avi.e.gross=gmail.com at python.org> On >> Behalf Of Gilmeh Serda via Python-list >> Sent: Friday, October 11, 2024 10:44 AM >> To: python-list at python.org >> Subject: Re: Correct syntax for pathological re.search() >> >> On Mon, 7 Oct 2024 08:35:32 -0500, Michael F. Stemper wrote: >> >>> I'm trying to discard lines that include the string "\sout{" (which is >>> TeX, for those who are curious. I have tried: >>> ?? if not re.search("\sout{", line): if not re.search("\sout\{", line): >>> ?? if not re.search("\\sout{", line): if not re.search("\\sout\{", >>> ?? line): >>> >>> But the lines with that string keep coming through. What is the right >>> syntax to properly escape the backslash and the left curly bracket? >> >> $ python >> Python 3.12.6 (main, Sep? 8 2024, 13:18:56) [GCC 14.2.1 20240805] on >> linux >> Type "help", "copyright", "credits" or "license" for more information. >>>>> import re >>>>> s = r"testing \sout{WHADDEVVA}" >>>>> re.search(r"\\sout{", s) >> >> >> You want a literal backslash, hence, you need to escape everything. >> >> It is not enough to escape the "\s" as "\\s", because that only takes >> care >> of Python's demands for escaping "\". 
You also need to escape the "\" for >> the RegEx as well, or it will read it like it means "\s", which is the >> RegEx for a space character and therefore your search doesn't match, >> because it reads it like you want to search for " out{". >> >> Therefore, you need to escape it either as per my example, or by using >> four "\" and no "r" in front of the first quote, which also works: >> >>>>> re.search("\\\\sout{", s) >> >> >> You don't need to escape the curly braces. We call them "seagull wings" >> where I live. >> > From martin.schoon at gmail.com Tue Oct 15 16:16:41 2024 From: martin.schoon at gmail.com (Martin Schöön) Date: 15 Oct 2024 20:16:41 GMT Subject: Old matplotlib animation now fails Message-ID: Some years ago I created a Python program that reads GPS data and creates an animation stored in an mp4 file. Not very elegant but it worked. Not very original as it was based on the example found here: https://shorturl.at/dTCZZ Last time it worked was about a year ago. Since then I have moved to a later version of Debian and Conda and as a consequence a later version of Python 3 (now 3.12.2). Now my code fails. I have downloaded the latest version of the example and it also fails. It is the second to last line that throws an error: l.set_data(x0, y0) The error message drills down to something called "/home/.../matplotlib/lines.py", line 1289, in set_xdata and tells me 'x must be a sequence' I have started to dig around in matplotlib's documentation but my strategy is clearly wanting. I don't really know where to start looking for information on how to correct my code. Hence, this call for help. Any ideas? 
TIA /Martin From python at mrabarnett.plus.com Tue Oct 15 19:38:01 2024 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 16 Oct 2024 00:38:01 +0100 Subject: Old matplotlib animation now fails In-Reply-To: References: Message-ID: <2136e51b-c556-4bb3-bcb3-d7299ae80be5@mrabarnett.plus.com> On 2024-10-15 21:16, Martin Schöön via Python-list wrote: > Some years ago I created a Python program that reads GPS data and > creates an animation stored in an mp4 file. Not very elegant but it > worked. Not very original as it was based on the example found here: > > https://shorturl.at/dTCZZ > > Last time it worked was about a year ago. Since then I have moved to a > later version of Debian and Conda and as a consequence a later version > of Python 3 (now 3.12.2). > > Now my code fails. I have downloaded the latest version of the example > and it also fails. > > It is the second to last line that throws an error: > > l.set_data(x0, y0) > > The error message drills down to something called > "/home/.../matplotlib/lines.py", line 1289, in set_xdata > > and tells me 'x must be a sequence' > > I have started to dig around in matplotlib's documentation but my > strategy is clearly wanting. I don't really know where to start > looking for information on how to correct my code. Hence, this > call for help. > > Any ideas? > This is from the help: """ Help on function set_data in module matplotlib.lines: set_data(self, *args) Set the x and y data. Parameters ---------- *args : (2, N) array or two 1D arrays See Also -------- set_xdata set_ydata """ So, the arguments should be arrays. For example: x0, y0 = np.array([0.0]), np.array([0.0]) Has the API changed at some point? From hugo at python.org Wed Oct 16 04:09:13 2024 From: hugo at python.org (Hugo van Kemenade) Date: Wed, 16 Oct 2024 11:09:13 +0300 Subject: [RELEASE] Python 3.14.0 alpha 1 is now available Message-ID: It's now time for a new alpha of a new version of Python! 
https://www.python.org/downloads/release/python-3140a1/

**This is an early developer preview of Python 3.14**

# Major new features of the 3.14 series, compared to 3.13

Python 3.14 is still in development. This release, 3.14.0a1, is the first of seven planned alpha releases.

Alpha releases are intended to make it easier to test the current state of new features and bug fixes and to test the release process.

During the alpha phase, features may be added up until the start of the beta phase (2025-05-06) and, if necessary, may be modified or deleted up until the release candidate phase (2025-07-22). Please keep in mind that this is a preview release and its use is **not** recommended for production environments.

Many new features for Python 3.14 are still being planned and written. Among the major new features and changes so far:

* PEP 649 (https://peps.python.org/pep-0649/): deferred evaluation of annotations (https://docs.python.org/3.14/whatsnew/3.14.html#pep-649-deferred-evaluation-of-annotations)
* Improved error messages (https://docs.python.org/3.14/whatsnew/3.14.html#improved-error-messages)
* (Hey, **fellow core developer,** if a feature you find important is missing from this list, let Hugo know (hugo at python.org).)

The next pre-release of Python 3.14 will be 3.14.0a2, currently scheduled for 2024-11-19.

# More resources

* Online documentation: https://docs.python.org/3.14/
* PEP 745, 3.14 Release Schedule: https://peps.python.org/pep-0745/
* Report bugs at https://github.com/python/cpython/issues
* Help fund Python and its community: https://www.python.org/psf/donations/

# And now for something completely different

π (or pi) is a mathematical constant, approximately 3.14, for the ratio of a circle's circumference to its diameter. It is an irrational number, which means it cannot be written as a simple fraction of two integers. When written as a decimal, its digits go on forever without ever repeating a pattern.
Here's 76 digits of π:

3.141592653589793238462643383279502884197169399375105820974944592307816406286

Piphilology is the creation of mnemonics to help remember digits of π.

In a pi-poem, or "piem", the number of letters in a word equals the corresponding digit. This covers 9 digits, 3.14159265:

> *How I wish I could recollect pi easily today!*

One of the most well-known covers 15 digits, 3.14159265358979:

> *How I want a drink, alcoholic of course, after the heavy chapters involving quantum mechanics!*

Here's a 35-word piem in the shape of a circle, 3.1415926535897932384626433832795728:

It's a fact
A ratio immutable
Of circle round and width,
Produces geometry's deepest conundrum.
For as the numerals stay random,
No repeat lets out its presence,
Yet it forever stretches forth.
Nothing to eternity.

The Guinness World Record for memorising the most digits is held by Rajveer Meena, who recited 70,000 digits blindfold in 2015. The unofficial record is held by Akira Haraguchi, who recited 100,000 digits in 2006.

# Enjoy the new release

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organization contributions to the Python Software Foundation.

Regards from a bright and colourful Helsinki,

Your release team,
Hugo van Kemenade
Ned Deily
Steve Dower
Łukasz Langa

From martin.schoon at gmail.com Wed Oct 16 04:20:10 2024
From: martin.schoon at gmail.com (Martin Schöön)
Date: 16 Oct 2024 08:20:10 GMT
Subject: Old matplotlib animation now fails
References:

Message-ID: On 2024-10-15, Stefan Ram wrote: > Martin Schöön wrote or quoted: >>l.set_data(x0, y0) > > Well, I got to say, it's pretty rad that you're rocking Python! > That language is the bee's knees, for real. > > As for your question, here's my two cents off the cuff: > Could it be that the newer Matplotlib versions are jonesing > for something like "l.set_data( [ x0 ],[ y0 ])" in that spot? > Thanks, that was quick and adding square brackets fixed my code. Me rocking Python? /Martin From martin.schoon at gmail.com Wed Oct 16 04:23:17 2024 From: martin.schoon at gmail.com (Martin Schöön) Date: 16 Oct 2024 08:23:17 GMT Subject: Old matplotlib animation now fails References: <2136e51b-c556-4bb3-bcb3-d7299ae80be5@mrabarnett.plus.com> Message-ID: On 2024-10-15, MRAB wrote: > On 2024-10-15 21:16, Martin Schöön via Python-list wrote: >> Some years ago I created a Python program that reads GPS data and >> It is the second to last line that throws an error: >> >> l.set_data(x0, y0) >> >> The error message drills down to something called >> "/home/.../matplotlib/lines.py", line 1289, in set_xdata >> >> and tells me 'x must be a sequence' >> > """ > Help on function set_data in module matplotlib.lines: > > set_data(self, *args) > Set the x and y data. > > Parameters > ---------- > *args : (2, N) array or two 1D arrays > > See Also > -------- > set_xdata > set_ydata > """ > > So, the arguments should be arrays: > > For example: > > x0, y0 = np.array([0.0]), np.array([0.0]) > > Has the API changed at some point? > So it seems. Thanks for the quick reply. 
/Martin From roland.em0001 at googlemail.com Wed Oct 16 14:32:32 2024 From: roland.em0001 at googlemail.com (Roland Müller) Date: Wed, 16 Oct 2024 21:32:32 +0300 Subject: Common objects for CLI commands with Typer In-Reply-To: References: <87tteayavt.fsf@zedat.fu-berlin.de> <28833A4D-B57C-4195-87BF-FAAF9EFF5F19@barrys-emacs.org> <1E3ED29E-161E-430C-9E99-F89266472ADB@barrys-emacs.org> Message-ID: On 9/23/24 22:51, Dan Sommers via Python-list wrote: > On 2024-09-23 at 19:00:10 +0100, > Barry Scott wrote: > >>> On 21 Sep 2024, at 11:40, Dan Sommers via Python-list wrote: >> But once your code gets big the discipline of using classes helps >> maintenance. Code with lots of globals is problematic. > Even before your code gets big, discipline helps maintenance. :-) > > Every level of your program has globals. An application with too many > classes is no better (or worse) than a class with too many methods, or a > module with too many functions. Insert your own definitions of (and > tolerances for) "too many," which will vary in flexibility. > I think the need for classes arises when you need objects, that is, sets of variables with an identity that may be created N times. Classes are object factories. A second aspect is inheritance: classes may inherit from other classes and reuse existing functionality and data structures for their objects. In cases where you only need to encapsulate a single set of data and functions, modules are the best choice. From martin.schoon at gmail.com Wed Oct 16 11:52:55 2024 From: martin.schoon at gmail.com (Martin Schöön) Date: 16 Oct 2024 15:52:55 GMT Subject: Old matplotlib animation now fails References:

Message-ID:

On 2024-10-16, Stefan Ram wrote:
> Martin Schöön wrote or quoted:
>>Me rocking Python?
>
>|to rock
>|1. To use. To make do with, usually to great effect.
>|"You don't need to make up the guest bed; we can rock the couch."
> Urban Dictionary (2005) - Aaron Peckham (editor) (1979-04-03/),
> Andrews McMeel Publishing, Kansas City
>
That is a use and meaning of rock I was not aware of.

An example of what I use this Python code for (track top right):
https://shorturl.at/m3ZKp
(Youtube's compression algorithm clearly did not like this video.)

/Martin

From bowman at montana.com Wed Oct 16 17:47:08 2024
From: bowman at montana.com (rbowman)
Date: 16 Oct 2024 21:47:08 GMT
Subject: Old matplotlib animation now fails
References:

Message-ID:

On 16 Oct 2024 08:20:10 GMT, Martin Schöön wrote:
> On 2024-10-15, Stefan Ram wrote:
>> Martin Schöön wrote or quoted:
>>>l.set_data(x0, y0)
>>
>> Well, I got to say, it's pretty rad that you're rocking Python!
>> That language is the bee's knees, for real.
>>
>> As for your question, here's my two cents off the cuff:
>> Could it be that the newer Matplotlib versions are jonesing for
>> something like "l.set_data( [ x0 ],[ y0 ])" in that spot?
>>
> Thanks, that was quick and adding square brackets fixed my code.
>
> Me rocking Python?
>
> /Martin

You have to understand Stefan tries to use American slang, not always
entirely accurately. I think 'bee's knees' died out around 1931.

From news at cct-net.co.uk Wed Oct 16 18:30:42 2024
From: news at cct-net.co.uk (Chris Townley)
Date: Wed, 16 Oct 2024 23:30:42 +0100
Subject: Old matplotlib animation now fails
In-Reply-To:
References:

Message-ID:

On 16/10/2024 22:47, rbowman wrote:
> On 16 Oct 2024 08:20:10 GMT, Martin Schöön wrote:
>
>> On 2024-10-15, Stefan Ram wrote:
>>> Martin Schöön wrote or quoted:
>>>> l.set_data(x0, y0)
>>>
>>> Well, I got to say, it's pretty rad that you're rocking Python!
>>> That language is the bee's knees, for real.
>>>
>>> As for your question, here's my two cents off the cuff:
>>> Could it be that the newer Matplotlib versions are jonesing for
>>> something like "l.set_data( [ x0 ],[ y0 ])" in that spot?
>>>
>> Thanks, that was quick and adding square brackets fixed my code.
>>
>> Me rocking Python?
>>
>> /Martin
>
> You have to understand Stefan tries to use American slang, not always
> entirely accurately. I think 'bee's knees' died out around 1931.
>
Not sure about America, but the bee's knees is still in common use in
the UK

--
Chris

From bowman at montana.com Wed Oct 16 23:19:17 2024
From: bowman at montana.com (rbowman)
Date: 17 Oct 2024 03:19:17 GMT
Subject: Old matplotlib animation now fails
References:

Message-ID:

On Wed, 16 Oct 2024 23:30:42 +0100, Chris Townley wrote:
> Not sure about America, but the bee's knees is still in common use in
> the UK

https://en.wikipedia.org/wiki/Bee's_knees

That version? A local bakery makes a honey flavored pastry they call
'bee's knees' but using it in a conversation would be campy.

From hjp-python at hjp.at Fri Oct 18 17:09:41 2024
From: hjp-python at hjp.at (Peter J. Holzer)
Date: Fri, 18 Oct 2024 23:09:41 +0200
Subject: Correct syntax for pathological re.search()
In-Reply-To: <966b510d-9bd7-4472-a858-7e042d78461d@tompassin.net>
References: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> <20241012105958.cbctekv7vustleha@hjp.at> <966b510d-9bd7-4472-a858-7e042d78461d@tompassin.net>
Message-ID: <20241018210941.f5azh2lvz7cxzcy5@hjp.at>

On 2024-10-12 08:51:57 -0400, Thomas Passin via Python-list wrote:
> On 10/12/2024 6:59 AM, Peter J. Holzer via Python-list wrote:
> > On 2024-10-11 17:13:07 -0400, AVI GROSS via Python-list wrote:
> > > Is there some utility function out there that can be called to show what the
> > > regular expression you typed in will look like by the time it is ready to be
> > > used?
> >
> > I assume that by "ready to be used" you mean the compiled form?
> >
> > No, there doesn't seem to be a way to dump that. You can
> >
> >     p = re.compile("\\\\sout{")
> >     print(p.pattern)
> >
> > but that just prints the input string, which you could do without
> > compiling it first.
>
> It prints the escaped version,

Did you mean the *un*escaped version? Well, yeah, that's what print
does.

> so you can see if you escaped the string as you intended. In this
> case, the print will display '\\sout{'.

print("\\\\sout{") will do the same.

It seems to me that for any string s which is a valid regular expression
(i.e. re.compile doesn't throw an exception)

    assert re.compile(s).pattern == s

holds.

So it doesn't give you anything you didn't already know.
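The claim above can be checked directly. What follows is only a minimal sketch using the standard `re` module and the pattern from this thread; the sample text `\sout{x}` is made up for illustration and is not from the original question.

```python
import re

# The pattern from the thread, written as an ordinary (non-raw) literal.
# The source code shows four backslashes; the str object holds two.
s = "\\\\sout{"

# print() shows the string as the regex engine will receive it.
print(s)  # prints \\sout{

p = re.compile(s)

# .pattern is simply the string that was passed in, not a dump of the
# compiled form, so it tells us nothing that s did not already tell us.
assert p.pattern == s

# The compiled pattern matches a literal backslash followed by "sout{".
sample = "\\sout{x}"  # hypothetical input: \sout{x}
match = p.search(sample)
print(match.group())  # prints \sout{
```

Writing the pattern as a raw string, r"\\sout{", yields the same str object contents, which is usually the easier way to see what the regex engine gets.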
As a trivial example, the regular expressions r"\\sout{" and r"\\sout\{"
are equivalent (the \ before the { is redundant). Yet
re.compile(s).pattern preserves the difference between the two strings.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp at hjp.at      |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

From nospam at please.ty Fri Oct 18 18:15:23 2024
From: nospam at please.ty (jak)
Date: Sat, 19 Oct 2024 00:15:23 +0200
Subject: Correct syntax for pathological re.search()
In-Reply-To:
References: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> <20241012105958.cbctekv7vustleha@hjp.at> <966b510d-9bd7-4472-a858-7e042d78461d@tompassin.net> <20241018210941.f5azh2lvz7cxzcy5@hjp.at>
Message-ID:

Peter J. Holzer wrote:
> As a trivial example, the regular expressions r"\\sout{" and r"\\sout\{"
> are equivalent (the \ before the { is redundant). Yet
> re.compile(s).pattern preserves the difference between the two strings.

Hi,
Allow me to be fussy: r"\\sout{" and r"\\sout\{" are similar but not
equivalent. If you omit the backslash, the parser will have to determine
whether the brace is part of a regular expression {n, m} quantifier and
will take more time. Some online regex testers give these results:

r"\\sout{"  : 1 match ( 7 steps, 620 µs )
r"\\sout\{" : 1 match ( 7 steps, 360 µs )

From hjp-python at hjp.at Mon Oct 21 15:10:49 2024
From: hjp-python at hjp.at (Peter J. Holzer)
Date: Mon, 21 Oct 2024 21:10:49 +0200
Subject: Correct syntax for pathological re.search()
In-Reply-To:
References: <011301db1c22$5e7519c0$1b5f4d40$@gmail.com> <20241012105958.cbctekv7vustleha@hjp.at> <966b510d-9bd7-4472-a858-7e042d78461d@tompassin.net> <20241018210941.f5azh2lvz7cxzcy5@hjp.at>

Message-ID: <20241021191049.iclg7pmpfrpkel55@hjp.at>

On 2024-10-19 00:15:23 +0200, jak via Python-list wrote:
> Peter J. Holzer wrote:
> > As a trivial example, the regular expressions r"\\sout{" and r"\\sout\{"
> > are equivalent (the \ before the { is redundant). Yet
> > re.compile(s).pattern preserves the difference between the two strings.
>
> Allow me to be fussy: r"\\sout{" and r"\\sout\{" are similar but not
> equivalent.

They are. Both will match the 6 character string

0005c \ REVERSE SOLIDUS
00073 s LATIN SMALL LETTER S
0006f o LATIN SMALL LETTER O
00075 u LATIN SMALL LETTER U
00074 t LATIN SMALL LETTER T
0007b { LEFT CURLY BRACKET

> If you omit the backslash, the parser will have to determine if the
> brace is part of regular expression {n, m} and will take more time.

Yes, that's the parser. But the result of parsing will be the same: The
pattern will end in a literal left curly bracket.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp at hjp.at      |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

From jacob.kruger.work at gmail.com Tue Oct 22 08:03:14 2024
From: jacob.kruger.work at gmail.com (Jacob Kruger)
Date: Tue, 22 Oct 2024 14:03:14 +0200
Subject: Capturing screenshots and recording audio in an ongoing basis, and submitting data to a RESTFul API
Message-ID: <5d797501-cc7d-46e8-9f72-098a5c4d0748@gmail.com>

Hi there - know this might be a silly question, but asking anyway...

As in, know these formats/data-types are probably not really possible to
compress any more than they already are.
Have managed to sort out capturing screenshots repeatedly, while
recording audio in the background, using combination of PIL's ImageGrab,
and pyaudio, and can then use moviepy, which is a sort of wrapper
around/interface to the FFMPEG command-line utility - this all allows me
to record forms of screencast recordings, setting my own forms of
time-frames, etc. in terms of the looping interval when I want to capture
screenshots, etc., before then combining them into video clips with the
audio recording merged in as a background track.

All works fine, but, we want to use this as a form of monitoring service
for call-centre staff, at times, and, the only real remaining issue is
file-size/data in terms of both hard-drive storage space, and, bandwidth
in terms of submitting resulting data to a RESTFul API.

For example, a test video clip, generated using the libvpx codec,
resulting in a .webm file, with a total length of 14 seconds, has a file
size of 100KB.

Also, don't think it's really relevant, but, am then just using things
like requests module to submit data to the RESTFul API, which have used
flask to implement.

So, know this question might be a waste of time since have already played
around with selecting the video codec that generates the smallest
resulting file-size, and, not sure if might be able to drop image
snapshot file sizes by using something like grayscale, which moviepy
doesn't want to work with directly during generating original video
clips, but just wondering if there might be any way to try converting
binary data into smaller data chunks to then upload these via my RESTFul
API, where could then convert them back to multimedia formats, etc.?

Any thoughts/suggestions on this type of thing, and, on that note, all of
this will be running as something like a background service on
call-centre staff's windows 11 machines, if relevant.?
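On the question of squeezing the binary data further before upload, here is a rough sketch of why a generic compressor is unlikely to help with codec output. It uses only the standard `zlib` and `random` modules. The two payloads are stand-ins chosen for illustration, not real recordings: pseudo-random bytes approximate an already-encoded .webm or PNG, and a repeating pattern approximates raw, unencoded data.

```python
import random
import zlib

# Assumption for illustration: codec output is already entropy-coded, so
# it behaves statistically much like random bytes. Seeded for repeatability.
rng = random.Random(0)
codec_like = bytes(rng.getrandbits(8) for _ in range(100_000))

# Assumption for illustration: raw data with a lot of repetition.
raw_like = b"\x00\x10\x20\x30" * 25_000

sizes = {}
for label, payload in (("codec-like", codec_like), ("raw-like", raw_like)):
    packed = zlib.compress(payload, level=9)
    # The round trip must be lossless, or the server could not rebuild the file.
    assert zlib.decompress(packed) == payload
    sizes[label] = (len(payload), len(packed))
    print(f"{label}: {len(payload)} -> {len(packed)} bytes")
```

If the real clips behave like the codec-like payload, the savings would have to come from the capture side (fewer frames, lower resolution, lower audio bitrate) rather than from a second compression pass before the upload.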
As in, if there might be some way to store data and then generate
multimedia later on, on the server handling the RESTFul API, that could
also work, but, main thing is to both save storage data on workstations,
as well as limit amount of bandwidth required overall since the number of
target machines could easily be enough to use up a lot of bandwidth,
etc., so, what we are looking into at the moment relates to only
triggering recordings at certain times on certain machines, in between.

Thanks in advance

---
Jacob Kruger
+2782 413 4791
"Resistance is futile!...Acceptance is versatile..."

From sjeik_appie at hotmail.com Wed Oct 23 13:07:14 2024
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Wed, 23 Oct 2024 19:07:14 +0200
Subject: Chardet oddity
Message-ID:

Today I used chardet.detect in the repl and it returned windows-1252
(incorrect, because it later resulted in a UnicodeDecodeError). When I ran
chardet as a script (which uses UniversalDetector) this returned
MacRoman. Isn't chardet.detect the correct way? I've used this method many
times.

# Interpreter
>>> contents = open(FILENAME, "rb").read()
>>> chardet.detect(content)
{'encoding': 'Windows-1252', 'confidence': 0.7282676610947401, 'language': ''}

# Terminal
$ python -m chardet FILENAME
FILENAME: MacRoman with confidence 0.7167379080370483

Thanks!
Albert-Jan

From nntp.mbourne at spamgourmet.com Wed Oct 23 15:42:00 2024
From: nntp.mbourne at spamgourmet.com (Mark Bourne)
Date: Wed, 23 Oct 2024 20:42:00 +0100
Subject: Chardet oddity
In-Reply-To:
References:

Message-ID:

Albert-Jan Roskam wrote:
> Today I used chardet.detect in the repl and it returned windows-1252
> (incorrect, because it later resulted in a UnicodeDecodeError). When I ran
> chardet as a script (which uses UniversalDetector) this returned
> MacRoman. Isn't chardet.detect the correct way? I've used this method many
> times.
> # Interpreter
> >>> contents = open(FILENAME, "rb").read()
> >>> chardet.detect(content)

Is that copy and pasted from the terminal, or retyped with possible
transcription errors? As written, you've assigned the bytes read from the
file to `contents`, but passed `content` (with no "s") to
`chardet.detect` - so the result would depend on whatever was previously
assigned to `content`.

> {'encoding': 'Windows-1252', 'confidence': 0.7282676610947401, 'language':
> ''}
> # Terminal
> $ python -m chardet FILENAME
> FILENAME: MacRoman with confidence 0.7167379080370483
> Thanks!
> Albert-Jan

--
Mark.

From c.buhtz at posteo.jp Thu Oct 24 03:33:04 2024
From: c.buhtz at posteo.jp (c.buhtz at posteo.jp)
Date: Thu, 24 Oct 2024 07:33:04 +0000
Subject: shutil.rmtree() fails when used in Fedora (rpm) "mock" environment
Message-ID: <4a13731716200669342338ae409e73ca@posteo.de>

Hello,

I am upstream maintainer of "Back In Time" [1] investigating an issue a
distro maintainer from Fedora reported [2] to me.

On one hand Fedora seems to use a tool called "mock" to build packages in
a chroot environment. On the other hand the test suite of "Back In Time"
does read and write to the real file system. One test fails because a
temporary directory is cleaned up using shutil.rmtree(). Please see the
output below.

I am not familiar with Fedora and "mock". So I am not able to reproduce
this on my own. It seems the Fedora maintainer also has no clue how to
solve it or why it happens.

Can you please have a look (especially at the line "assert func is
os.lstat").
Maybe you have an idea what is the intention behind this error raised by an "assert" statement inside "shutil.rmtree()". Thanks in advance, Christian Buhtz [1] --