Current policy on AI-generated code in NumPy
Hi, I know there has been discussions in the past on AI-generated contributions. Is there a current policy for NumPy ? E.g. do we request that contributors are the "sole contributor" to the written code, or do we allow code written by AI as long as it follows the usual quality requirements ? Context of my question: ~18 months ago I started in some spare time writing an ARPACK-replacement in numpy/scipy during scipy sprint. At that time, I used ChatGPT for the "research part" only: literature review, explain to me some existing BSD implementation in Julia for points I could not understand. I implemented the python code myself. There is still quite a bit of work needed to be a viable replacement for ARPACK. Seeing the progress of the AI tooling in my team at work, and how I myself use those tools for other hobby projects, I believe I could finish that replacement very quickly with those tools today. But I don't want to "taint" the work if this would risk the chances of integration into scipy proper. Thanks, David
That's fantastic that you are working on it David. A good high-level ARPACK is beneficial for all and possibly better to re-map to C if the accuracy is higher. We can maybe replace the translated C code with it. There are a few places discussion took place already, a few of them below and the references therein https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c... https://github.com/scientific-python/summit-2025/issues/35 I wish these models were available when I was translating all that Fortran code because now I can scan my previous work and find the errors extremely quickly when I am hunting for bugs. So just in a few months they leaped forward from the pointless "this code uses Fortran let me compile with f2c, hihi" to "I compiled with valgrind and on line 760, the Fortran has out-of-bounds access which seems to cause an issue, I'll fix the translated code". I think I wrote sufficient text in those sources, so I'll leave it to others but regardless of the policy discussions, you have at least one customer looking forward to it. ilhan On Fri, Feb 6, 2026 at 6:23 PM David Cournapeau via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
I know there has been discussions in the past on AI-generated contributions. Is there a current policy for NumPy ? E.g. do we request that contributors are the "sole contributor" to the written code, or do we allow code written by AI as long as it follows the usual quality requirements ?
Context of my question: ~18 months ago I started in some spare time writing an ARPACK-replacement in numpy/scipy during scipy sprint. At that time, I used ChatGPT for the "research part" only: literature review, explain to me some existing BSD implementation in Julia for points I could not understand. I implemented the python code myself. There is still quite a bit of work needed to be a viable replacement for ARPACK.
Seeing the progress of the AI tooling in my team at work, and how I myself use those tools for other hobby projects, I believe I could finish that replacement very quickly with those tools today. But I don't want to "taint" the work if this would risk the chances of integration into scipy proper.
Thanks, David
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ilhanpolat@gmail.com
On Sat, Feb 7, 2026 at 2:50 AM Ilhan Polat via NumPy-Discussion < numpy-discussion@python.org> wrote:
That's fantastic that you are working on it David. A good high-level ARPACK is beneficial for all and possibly better to re-map to C if the accuracy is higher. We can maybe replace the translated C code with it.
There are a few places discussion took place already, a few of them below and the references therein
https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c... https://github.com/scientific-python/summit-2025/issues/35
I wish these models were available when I was translating all that Fortran code because now I can scan my previous work and find the errors extremely quickly when I am hunting for bugs. So just in a few months they leaped forward from the pointless "this code uses Fortran let me compile with f2c, hihi" to "I compiled with valgrind and on line 760, the Fortran has out-of-bounds access which seems to cause an issue, I'll fix the translated code". I think I wrote sufficient text in those sources, so I'll leave it to others but regardless of the policy discussions, you have at least one customer looking forward to it.
I missed that recent discussion, thanks. Seems to clarify the direction NumPy community may follow based on SymPy policy. On the actual code I am not implementing ARPACK (arnoldi w/ implicit restart/deflation), but Krylov-Schur, which has fewer quircks and simpler to implement: https://slepc.upv.es/release/_downloads/5229480744b7c2533563dee75c16dfde/str.... Until recently, claude/chatgpt were useful in "filling the blanks" on some implementation details not specified in those reports and other doc. Now, I am pretty sure they could write a good implementation w/ guidance. Last time I had 1--2 hours for it, it could find the bug that blocked me by running the code on different examples/situation. David
ilhan
On Fri, Feb 6, 2026 at 6:23 PM David Cournapeau via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
I know there has been discussions in the past on AI-generated contributions. Is there a current policy for NumPy ? E.g. do we request that contributors are the "sole contributor" to the written code, or do we allow code written by AI as long as it follows the usual quality requirements ?
Context of my question: ~18 months ago I started in some spare time writing an ARPACK-replacement in numpy/scipy during scipy sprint. At that time, I used ChatGPT for the "research part" only: literature review, explain to me some existing BSD implementation in Julia for points I could not understand. I implemented the python code myself. There is still quite a bit of work needed to be a viable replacement for ARPACK.
Seeing the progress of the AI tooling in my team at work, and how I myself use those tools for other hobby projects, I believe I could finish that replacement very quickly with those tools today. But I don't want to "taint" the work if this would risk the chances of integration into scipy proper.
Thanks, David
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ilhanpolat@gmail.com
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: cournape@gmail.com
On Sat, Feb 7, 2026 at 3:11 PM David Cournapeau via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Sat, Feb 7, 2026 at 2:50 AM Ilhan Polat via NumPy-Discussion < numpy-discussion@python.org> wrote:
That's fantastic that you are working on it David. A good high-level ARPACK is beneficial for all and possibly better to re-map to C if the accuracy is higher. We can maybe replace the translated C code with it.
There are a few places discussion took place already, a few of them below and the references therein
https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c... https://github.com/scientific-python/summit-2025/issues/35
I wish these models were available when I was translating all that Fortran code because now I can scan my previous work and find the errors extremely quickly when I am hunting for bugs. So just in a few months they leaped forward from the pointless "this code uses Fortran let me compile with f2c, hihi" to "I compiled with valgrind and on line 760, the Fortran has out-of-bounds access which seems to cause an issue, I'll fix the translated code". I think I wrote sufficient text in those sources, so I'll leave it to others but regardless of the policy discussions, you have at least one customer looking forward to it.
I missed that recent discussion, thanks. Seems to clarify the direction NumPy community may follow based on SymPy policy.
I agree, this seems to be at least the majority view of both NumPy/SciPy maintainers, as well as the high-level principles that a lot of well-know OSS projects are ending up with when they write down a policy. I'll copy the four principles from Stefan's blog post here: 1. Be transparent 2. Take responsibility 3. Gain understanding 4. Honor Copyright Adding the "we want to interact with other humans, not machines" principle more explicitly to that would indeed be good as well. LLVM's recently adopted policy (https://llvm.org/docs/AIToolPolicy.html) is another example that I like, with principles similar to the ones Stefan articulated and the SymPy policy. I'd add one principle here that doesn't need to be in a policy but is important for this discussion: we don't prescribe to others how they are and aren't allowed to contribute (to the extent possible). That means that arguments about the productivity gains of using any given tool, or other effects of using that given tool like a reduction in learning, the impact on society or environment, etc. are - while quite interesting and important - not applicable to the question of "am I allowed to use tool X to contribute to NumPy or SciPy?". There are obviously better and worse ways to use any tool, but the responsibility of that is up to every individual. Re ARPACK rewrite: I think at this point I'd recommend steering clear of letting an LLM tool generate substantial algorithmic code - given the niche application, the copyright implications of doing that are pretty murky indeed. However, using an LLM tool to generate more unit tests given a specific criterion, or have it fill in stubbed out C code in the implementation for things like error handling, checking/fixing Py_DECREF'ing issues, adding the "create a Python extension module" boilerplate, and all such kinds of clearly not copyrightable code seems perfectly fine to do. That just automates some of the tedious and fiddly parts of coding, without breaking any of the principles listed above. Cheers, Ralf
Hi, I thought your (Ralf's) distinction was interesting, so here's some more reflection. The distinction starts at:
we don't prescribe to others how they are and aren't allowed to contribute (to the extent possible)
I think it's correct that it's not sensible for policies to reflect things like dislike of AI's use of energy or the effects on the environment of AI data centers. However, it seems obvious to me that it is sensible for policies to take into account the effect of AI on learning. But why the distinction? On reflection, it seems to me that policies should reflect only on the interests of the project, but those interests should be seen broadly, and include planning for future community and maintainers. Thus, environmental concerns might well be important in general, but do not bear directly on the work of the project. Therefore the project's managers have no mandate to act on that concern, at least without explicit consensus. However, any sensible project should be thinking about the state of maintenance in 5 or 10 years. Therefore, the project does have a potential mandate to prefer tools that will lead to better overall understanding, communication, community building, or code quality in the future. Cheers, Matthew On Sun, Feb 8, 2026 at 12:01 PM Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Sat, Feb 7, 2026 at 3:11 PM David Cournapeau via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Sat, Feb 7, 2026 at 2:50 AM Ilhan Polat via NumPy-Discussion <numpy-discussion@python.org> wrote:
That's fantastic that you are working on it David. A good high-level ARPACK is beneficial for all and possibly better to re-map to C if the accuracy is higher. We can maybe replace the translated C code with it.
There are a few places discussion took place already, a few of them below and the references therein
https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c... https://github.com/scientific-python/summit-2025/issues/35
I wish these models were available when I was translating all that Fortran code because now I can scan my previous work and find the errors extremely quickly when I am hunting for bugs. So just in a few months they leaped forward from the pointless "this code uses Fortran let me compile with f2c, hihi" to "I compiled with valgrind and on line 760, the Fortran has out-of-bounds access which seems to cause an issue, I'll fix the translated code". I think I wrote sufficient text in those sources, so I'll leave it to others but regardless of the policy discussions, you have at least one customer looking forward to it.
I missed that recent discussion, thanks. Seems to clarify the direction NumPy community may follow based on SymPy policy.
I agree, this seems to be at least the majority view of both NumPy/SciPy maintainers, as well as the high-level principles that a lot of well-know OSS projects are ending up with when they write down a policy. I'll copy the four principles from Stefan's blog post here:
Be transparent Take responsibility Gain understanding Honor Copyright
Adding the "we want to interact with other humans, not machines" principle more explicitly to that would indeed be good as well. LLVM's recently adopted policy (https://llvm.org/docs/AIToolPolicy.html) is another example that I like, with principles similar to the ones Stefan articulated and the SymPy policy.
I'd add one principle here that doesn't need to be in a policy but is important for this discussion: we don't prescribe to others how they are and aren't allowed to contribute (to the extent possible). That means that arguments about the productivity gains of using any given tool, or other effects of using that given tool like a reduction in learning, the impact on society or environment, etc. are - while quite interesting and important - not applicable to the question of "am I allowed to use tool X to contribute to NumPy or SciPy?". There are obviously better and worse ways to use any tool, but the responsibility of that is up to every individual.
Re ARPACK rewrite: I think at this point I'd recommend steering clear of letting an LLM tool generate substantial algorithmic code - given the niche application, the copyright implications of doing that are pretty murky indeed. However, using an LLM tool to generate more unit tests given a specific criterion, or have it fill in stubbed out C code in the implementation for things like error handling, checking/fixing Py_DECREF'ing issues, adding the "create a Python extension module" boilerplate, and all such kinds of clearly not copyrightable code seems perfectly fine to do. That just automates some of the tedious and fiddly parts of coding, without breaking any of the principles listed above.
Cheers, Ralf
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: matthew.brett@gmail.com
-- This email is fully human-source. Unless I'm quoting AI, I did not use AI for any text in this email.
On Mon, Feb 9, 2026 at 6:23 PM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
I thought your (Ralf's) distinction was interesting, so here's some more reflection. The distinction starts at:
we don't prescribe to others how they are and aren't allowed to contribute (to the extent possible)
I think it's correct that it's not sensible for policies to reflect things like dislike of AI's use of energy or the effects on the environment of AI data centers. However, it seems obvious to me that it is sensible for policies to take into account the effect of AI on learning.
Why would that be obvious? It seems incredibly presumptuous to decide for other people what methods or tools they are or aren't allowed to use for learning. We're not running a high school or university here. At most we can provide docs and a happy path for some types of tools, but that's about it. We cannot prescribe anything. But why the distinction?
On reflection, it seems to me that policies should reflect only on the interests of the project, but those interests should be seen broadly, and include planning for future community and maintainers. Thus, environmental concerns might well be important in general, but do not bear directly on the work of the project. Therefore the project's managers have no mandate to act on that concern, at least without explicit consensus. However, any sensible project should be thinking about the state of maintenance in 5 or 10 years. Therefore, the project does have a potential mandate to prefer tools that will lead to better overall understanding, communication, community building, or code quality in the future.
This also presumes that you, or we, are able to determine what usage of AI tools helps or hinders learning. That is not possible at the level of individuals: people can learn in very different ways, plus it will strongly depend on how the tools are used. And even in the aggregate it's not practically possible: most of the studies that have been referenced in this and linked thread (a) are one-offs, and often inconsistent with each other, and (b) already outdated, given how fast the field is developing. It's easy to think of ways that using AI tools for contributing could help with learning: - Simple time gain: once one has done the same thing a number of times and it becomes routine, automate it with AI so the contributor can spend more time focusing on learning about new topics. - Improved code quality and internal consistency from letting AI tools fix up and verify design rules (e.g., how type promotion is handled) will lead to the ability to learn the concepts from the code base in a more consistent fashion. - Use as a brainstorming tool to suggest multiple design options, broadening discovery. - We could ask AI tools to write internal design documentation, of the kind that only a few handfuls of maintainers would be able to write (but almost never do, because we're too busy). There are important parts of the code base that have no documentation beyond some scattered code comments. - Give contributors feedback that the maintainers often don't have the time or interest to give, in a timely fashion or at all. - Writing throwaway prototypes of ideas for NumPy that would otherwise take too long to implement and would never get done, thereby allowing to learn if something is feasible at all, or a good idea. - Learning to use the AI tools themselves: this may well become an essential skill for most software-related roles in the near future. Same for future community & new maintainers: - Current maintainers may enjoy both learning something new and automating the more tedious parts of maintenance, so they can focus on the more interesting parts. That will aid maintainer retention. - Ilhan's point is a great example here. He just finished a massive amount of work rewriting code from Fortran into C. And now found that AI tools can be quite helpful in that endeavour (while 6 months ago they weren't). This work must have been extremely tedious (thanks again for biting that bullet Ilhan). And it really wasn't fun to review either. - New contributors may default to working with these tools more often than not, and be turned off from contributing by rules that say they cannot use their default workflow. I'm sure it's not hard to think of more along these lines, but I hope the point is clear. Cheers, Ralf
Cheers,
Matthew
On Sun, Feb 8, 2026 at 12:01 PM Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Sat, Feb 7, 2026 at 3:11 PM David Cournapeau via NumPy-Discussion <
On Sat, Feb 7, 2026 at 2:50 AM Ilhan Polat via NumPy-Discussion <
numpy-discussion@python.org> wrote:
That's fantastic that you are working on it David. A good high-level
ARPACK is beneficial for all and possibly better to re-map to C if the accuracy is higher. We can maybe replace the translated C code with it.
There are a few places discussion took place already, a few of them
below and the references therein
https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c...
https://github.com/scientific-python/summit-2025/issues/35
I wish these models were available when I was translating all that Fortran code because now I can scan my previous work and find the errors extremely quickly when I am hunting for bugs. So just in a few months they leaped forward from the pointless "this code uses Fortran let me compile with f2c, hihi" to "I compiled with valgrind and on line 760, the Fortran has out-of-bounds access which seems to cause an issue, I'll fix the
I missed that recent discussion, thanks. Seems to clarify the direction
NumPy community may follow based on SymPy policy.
I agree, this seems to be at least the majority view of both NumPy/SciPy
numpy-discussion@python.org> wrote: translated code". I think I wrote sufficient text in those sources, so I'll leave it to others but regardless of the policy discussions, you have at least one customer looking forward to it. maintainers, as well as the high-level principles that a lot of well-know OSS projects are ending up with when they write down a policy. I'll copy the four principles from Stefan's blog post here:
Be transparent Take responsibility Gain understanding Honor Copyright
Adding the "we want to interact with other humans, not machines"
principle more explicitly to that would indeed be good as well. LLVM's recently adopted policy (https://llvm.org/docs/AIToolPolicy.html) is another example that I like, with principles similar to the ones Stefan articulated and the SymPy policy.
I'd add one principle here that doesn't need to be in a policy but is
important for this discussion: we don't prescribe to others how they are and aren't allowed to contribute (to the extent possible). That means that arguments about the productivity gains of using any given tool, or other effects of using that given tool like a reduction in learning, the impact on society or environment, etc. are - while quite interesting and important - not applicable to the question of "am I allowed to use tool X to contribute to NumPy or SciPy?". There are obviously better and worse ways to use any tool, but the responsibility of that is up to every individual.
Re ARPACK rewrite: I think at this point I'd recommend steering clear of
letting an LLM tool generate substantial algorithmic code - given the niche application, the copyright implications of doing that are pretty murky indeed. However, using an LLM tool to generate more unit tests given a specific criterion, or have it fill in stubbed out C code in the implementation for things like error handling, checking/fixing Py_DECREF'ing issues, adding the "create a Python extension module" boilerplate, and all such kinds of clearly not copyrightable code seems perfectly fine to do. That just automates some of the tedious and fiddly parts of coding, without breaking any of the principles listed above.
Cheers, Ralf
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: matthew.brett@gmail.com
-- This email is fully human-source. Unless I'm quoting AI, I did not use AI for any text in this email. _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ralf.gommers@gmail.com
Hi, On Mon, Feb 9, 2026 at 10:00 PM Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Mon, Feb 9, 2026 at 6:23 PM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Hi,
I thought your (Ralf's) distinction was interesting, so here's some more reflection. The distinction starts at:
we don't prescribe to others how they are and aren't allowed to contribute (to the extent possible)
I think it's correct that it's not sensible for policies to reflect things like dislike of AI's use of energy or the effects on the environment of AI data centers. However, it seems obvious to me that it is sensible for policies to take into account the effect of AI on learning.
Why would that be obvious? It seems incredibly presumptuous to decide for other people what methods or tools they are or aren't allowed to use for learning. We're not running a high school or university here.
At most we can provide docs and a happy path for some types of tools, but that's about it. We cannot prescribe anything.
But why the distinction?
On reflection, it seems to me that policies should reflect only on the interests of the project, but those interests should be seen broadly, and include planning for future community and maintainers. Thus, environmental concerns might well be important in general, but do not bear directly on the work of the project. Therefore the project's managers have no mandate to act on that concern, at least without explicit consensus. However, any sensible project should be thinking about the state of maintenance in 5 or 10 years. Therefore, the project does have a potential mandate to prefer tools that will lead to better overall understanding, communication, community building, or code quality in the future.
This also presumes that you, or we, are able to determine what usage of AI tools helps or hinders learning. That is not possible at the level of individuals: people can learn in very different ways, plus it will strongly depend on how the tools are used. And even in the aggregate it's not practically possible: most of the studies that have been referenced in this and linked thread (a) are one-offs, and often inconsistent with each other, and (b) already outdated, given how fast the field is developing.
It's easy to think of ways that using AI tools for contributing could help with learning:
Simple time gain: once one has done the same thing a number of times and it becomes routine, automate it with AI so the contributor can spend more time focusing on learning about new topics. Improved code quality and internal consistency from letting AI tools fix up and verify design rules (e.g., how type promotion is handled) will lead to the ability to learn the concepts from the code base in a more consistent fashion. Use as a brainstorming tool to suggest multiple design options, broadening discovery. We could ask AI tools to write internal design documentation, of the kind that only a few handfuls of maintainers would be able to write (but almost never do, because we're too busy). There are important parts of the code base that have no documentation beyond some scattered code comments. Give contributors feedback that the maintainers often don't have the time or interest to give, in a timely fashion or at all. Writing throwaway prototypes of ideas for NumPy that would otherwise take too long to implement and would never get done, thereby allowing to learn if something is feasible at all, or a good idea. Learning to use the AI tools themselves: this may well become an essential skill for most software-related roles in the near future.
Same for future community & new maintainers:
Current maintainers may enjoy both learning something new and automating the more tedious parts of maintenance, so they can focus on the more interesting parts. That will aid maintainer retention.
Ilhan's point is a great example here. He just finished a massive amount of work rewriting code from Fortran into C. And now found that AI tools can be quite helpful in that endeavour (while 6 months ago they weren't). This work must have been extremely tedious (thanks again for biting that bullet Ilhan). And it really wasn't fun to review either.
New contributors may default to working with these tools more often than not, and be turned off from contributing by rules that say they cannot use their default workflow.
I'm sure it's not hard to think of more along these lines, but I hope the point is clear.
Yes - but that's a different point - I was only pointing out learning is relevant, and that it is worth discussing. Cheers, Matthew
On Mon, Feb 9, 2026, at 13:58, Ralf Gommers via NumPy-Discussion wrote:
On Mon, Feb 9, 2026 at 6:23 PM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
I think it's correct that it's not sensible for policies to reflect things like dislike of AI's use of energy or the effects on the environment of AI data centers. However, it seems obvious to me that it is sensible for policies to take into account the effect of AI on learning.
Why would that be obvious? It seems incredibly presumptuous to decide for other people what methods or tools they are or aren't allowed to use for learning. We're not running a high school or university here.
The way I read Matthew's comment is not that we should prescribe how people use their tools, but that we should be aware of the risks we are facing, and also communicate those risks to contributors who want to use AI tools to do NumPy development.
This also presumes that you, or we, are able to determine what usage of AI tools helps or hinders learning. That is not possible at the level of individuals: people can learn in very different ways, plus it will strongly depend on how the tools are used. And even in the aggregate it's not practically possible: most of the studies that have been referenced in this and linked thread (a) are one-offs, and often inconsistent with each other, and (b) already outdated, given how fast the field is developing.
It is true that things are moving fast, and while the original METR study (which has been informally replicated in other settings) is perhaps outdated, Anthropic's just-released paper shows a broadly similar trend. Specifically, they show that time-to-solution is faster for junior developers, but not so much for senior developers. They also show that knowledge about the library is worse, having done a task with AI vs without. I'm sure, over time, we will figure out the best patterns for using AI and how to avoid the worst traps. Best regards, Stéfan
On 10/02/2026 9:26, Stefan van der Walt via NumPy-Discussion wrote:
I'm sure, over time, we will figure out the best patterns for using AI and how to avoid the worst traps.
Best regards, Stéfan
I think this is probably the best summary of the thoughtful and wide-ranging discussion. These tools are evolving rapidly, and it is tempting to extrapolate. But I think we really have no idea where this all is going. We are living in interesting times. Matti
On Mon, Feb 9, 2026 at 11:29 PM Stefan van der Walt via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Mon, Feb 9, 2026, at 13:58, Ralf Gommers via NumPy-Discussion wrote:
On Mon, Feb 9, 2026 at 6:23 PM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
I think it's correct that it's not sensible for policies to reflect things like dislike of AI's use of energy or the effects on the environment of AI data centers. However, it seems obvious to me that it is sensible for policies to take into account the effect of AI on learning.
Why would that be obvious? It seems incredibly presumptuous to decide for other people what methods or tools they are or aren't allowed to use for learning. We're not running a high school or university here.
The way I read Matthew's comment is not that we should prescribe how people use their tools, but that we should be aware of the risks we are facing,
This part is fine in the abstract - but that's also true for the environmental and societal impacts.
and also communicate those risks to contributors who want to use AI tools to do NumPy development.
This doesn't necessarily make sense to me. If I try to figure out what all the hand waving means concretely - i.e., "where would we want to communicate such possible risks" - I think my answer is: probably nowhere. It doesn't quite fit in a policy on AI tool usage, which I'd hope would be short and to the point. And I don't think we want anything in the contributor guide at this point around AI tools for contributions, except for pointing at the policy? The conversation here is a bit too abstract for me, and mostly arguing against a straw man. Clearly if you outsource most thinking to a machine and do less thinking yourself, you learn less. If you use tools deliberately (one of many ways of doing that, from a blog post referencing that Anthropic paper: https://mitchellh.com/writing/my-ai-adoption-journey), that won't happen. Yes, you need to think about it as an individual using the tool. As is the case for any tool and way of working. If there is a concrete idea/proposal for a docs section, policy content, or anything like that, please clarify. Cheers, Ralf This also presumes that you, or we, are able to determine what usage of AI
tools helps or hinders learning. That is not possible at the level of individuals: people can learn in very different ways, plus it will strongly depend on how the tools are used. And even in the aggregate it's not practically possible: most of the studies that have been referenced in this and linked thread (a) are one-offs, and often inconsistent with each other, and (b) already outdated, given how fast the field is developing.
It is true that things are moving fast, and while the original METR study (which has been informally replicated in other settings) is perhaps outdated, Anthropic's just-released paper shows a broadly similar trend. Specifically, they show that time-to-solution is faster for junior developers, but not so much for senior developers. They also show that knowledge about the library is worse, having done a task with AI vs without.
I'm sure, over time, we will figure out the best patterns for using AI and how to avoid the worst traps.
Best regards, Stéfan
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ralf.gommers@gmail.com
On Tue, Feb 10, 2026, at 12:35, Ralf Gommers via NumPy-Discussion wrote:
The way I read Matthew's comment is not that we should prescribe how people use their tools, but that we should be aware of the risks we are facing,
This part is fine in the abstract - but that's also true for the environmental and societal impacts.
Those things don't typically directly affect development, so I don't think they're comparable. But it may be that we just leave this entire category as "things you [as a contributor] need to worry about for your own sake".
and also communicate those risks to contributors who want to use AI tools to do NumPy development.
This doesn't necessarily make sense to me. If I try to figure out what all the hand waving means concretely - i.e., "where would we want to communicate such possible risks" - I think my answer is: probably nowhere. [...] If there is a concrete idea/proposal for a docs section, policy content, or anything like that, please clarify.
Here's the very general draft text we're currently proposing for skimage: """ 1. indicate the tool used, as well as how, in the PR description; 2. make sure you *carefully review* and *fully understand* all proposed changes so we may have a conversation about them; and 3. be careful not to breach any copyright or license terms (yes, we take those seriously!). """ Scikit-learn has something a bit more explicit: https://scikit-learn.org/dev/developers/contributing.html#automated-contribu... Before, it felt unnecessary to have such guidelines, because it would be silly to make a contribution without understanding it (how would you even come up with the patch). But, now it is entirely feasible to do so. I'm also fine with not having any policy and just evaluating each contribution on its own merits. I do think it's useful to know when tools were used in generating significant portions of a PR, but it's not strictly necessary. Stéfan
I do think it's useful to know when tools were used in generating significant portions of a PR, but it's not strictly necessary.
Is there already a 'methods' field? E.g. for prompts, "Solve this problem as if you were <different tool>." Future bots may be able to debug better with the info. Bill -- Phobrain.com On 2026-02-10 14:33, Stefan van der Walt via NumPy-Discussion wrote:
On Tue, Feb 10, 2026, at 12:35, Ralf Gommers via NumPy-Discussion wrote:
The way I read Matthew's comment is not that we should prescribe how people use their tools, but that we should be aware of the risks we are facing,
This part is fine in the abstract - but that's also true for the environmental and societal impacts.
Those things don't typically directly affect development, so I don't think they're comparable. But it may be that we just leave this entire category as "things you [as a contributor] need to worry about for your own sake".
and also communicate those risks to contributors who want to use AI tools to do NumPy development.
This doesn't necessarily make sense to me. If I try to figure out what all the hand waving means concretely - i.e., "where would we want to communicate such possible risks" - I think my answer is: probably nowhere. [...] If there is a concrete idea/proposal for a docs section, policy content, or anything like that, please clarify.
Here's the very general draft text we're currently proposing for skimage: """ 1. indicate the tool used, as well as how, in the PR description; 2. make sure you *carefully review* and *fully understand* all proposed changes so we may have a conversation about them; and 3. be careful not to breach any copyright or license terms (yes, we take those seriously!). """ Scikit-learn has something a bit more explicit: https://scikit-learn.org/dev/developers/contributing.html#automated-contribu... Before, it felt unnecessary to have such guidelines, because it would be silly to make a contribution without understanding it (how would you even come up with the patch). But, now it is entirely feasible to do so. I'm also fine with not having any policy and just evaluating each contribution on its own merits. I do think it's useful to know when tools were used in generating significant portions of a PR, but it's not strictly necessary. Stéfan _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: bross_phobrain@sonic.net
3. be careful not to breach any copyright or license terms (yes, we take those seriously!).
For a contributor this recommendation is not easily actionable. "I used a tool X and it gave me this code" --- how to make sure I understand the code, this is clear yes I can do that; how am I meant to carefully check for copyright? So maybe it'd be helpful to have a link to some guide, however rough, plus some reading material. Or (am putting a maintainer hat on) maybe we want to ask the contributor to show some analysis. As in, "this code is only a refcounting fix where the origin traces straight to CPython docs" vs "this code can be traced to this Stackoverflow answer" (BTW, what's the copyright status of that?) vs "here's three pages of a Grok-drafted legal analysis". (I planned to stay out of this thread) Evgeni
Scikit-learn has something a bit more explicit:
https://scikit-learn.org/dev/developers/contributing.html#automated-contribu...
Before, it felt unnecessary to have such guidelines, because it would be silly to make a contribution without understanding it (how would you even come up with the patch). But, now it is entirely feasible to do so.
I'm also fine with not having any policy and just evaluating each contribution on its own merits. I do think it's useful to know when tools were used in generating significant portions of a PR, but it's not strictly necessary.
Stéfan
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: evgeny.burovskiy@gmail.com
On Tue, Feb 10, 2026, at 15:12, Evgeni Burovski wrote:
3. be careful not to breach any copyright or license terms (yes, we take those seriously!).
For a contributor this recommendation is not easily actionable. "I used a tool X and it gave me this code" --- how to make sure I understand the code, this is clear yes I can do that; how am I meant to carefully check for copyright?
It's near impossible, so I suspect the only way to truly play it safe is to only provide code that cannot reasonably be copyrighted.
So maybe it'd be helpful to have a link to some guide, however rough, plus some reading material. Or (am putting a maintainer hat on) maybe we want to ask the contributor to show some analysis. As in, "this code is only a refcounting fix where the origin traces straight to CPython docs" vs "this code can be traced to this Stackoverflow answer" (BTW, what's the copyright status of that?)
CC-BY-SA https://stackoverflow.com/legal/terms-of-service/public
(I planned to stay out of this thread)
😉 Stéfan
On Tue, 2026-02-10 at 16:18 -0800, Stefan van der Walt via NumPy- Discussion wrote:
On Tue, Feb 10, 2026, at 15:12, Evgeni Burovski wrote:
3. be careful not to breach any copyright or license terms (yes, we take those seriously!).
For a contributor this recommendation is not easily actionable. "I used a tool X and it gave me this code" --- how to make sure I understand the code, this is clear yes I can do that; how am I meant to carefully check for copyright?
It's near impossible, so I suspect the only way to truly play it safe is to only provide code that cannot reasonably be copyrighted.
TL;DR: To "be careful not to break copyright" just states fact? How scary that fact is depends a bit on how the viewpoint/how likely it is agents violate copyright. If there is guidance e.g. from some large OSS foundation, I would prefer to link to that rather than try to figure it out ourselves... --- Copyright violation is a problem. But I am not sure it is a huge one for many contributions? I.e. just because they are very project specific or small. [1] However, I still think that this isn't new at all: By contributing, we already agree to licensing the code with the projects license and that means being sure we are allowed to license it that way. And while we don't make you sign a CLA (contributors license agreement) any project that has a bit of legalese around should already have more scary sentences. So yeah, the scariness of the sentence depends on the view-point, but at its core, I think it just states a fact? For myself, I don't really feel like discussing it too much without a better foundation: it seems to me that books will be written or at least some OSS foundation with more legal knowledge should make guidelines that we can use as a basis of our own (or as a basis of discussion). Maybe those already exist? Is there an OOS foundation that e.g. says: Please don't use these tools due to copyright issues (or a variation)? You can argue we should inform contributors to err on the super safe side... my gut feeling is we can't do much: Discouraging the careful ones while the non-careful ones don't read this anyway seems not super useful. We could force people to "sign" a CLA now if we were more worried, but do we really want that (nor do I doubt it will help a lot)? [2] FWIW, if someone contributed a non-trivial/textbook algorithm or said "implement X for/in Y", I think they clearly have to do due diligence. (Of course best case, the original code is licensed in a way that derived works -- with attribution -- are unproblematic.) - Sebastian [1] OK, I am not sure about things like "fair use" due to how small something is, that again depends a lot on where you are on the planet also... [2] My limited understanding is we don't need this because we just won't re-license our code (this is not a problem, becuase it's such a free license). I.e. there is now reason for us to "own" the code. But I also would be surprised if there aren't legal counsels who would say that we need one either way... ("sign" could just mean adding a "signed-off by" to the commit or even putting it more in the PR template/contributors docs.)
So maybe it'd be helpful to have a link to some guide, however rough, plus some reading material. Or (am putting a maintainer hat on) maybe we want to ask the contributor to show some analysis. As in, "this code is only a refcounting fix where the origin traces straight to CPython docs" vs "this code can be traced to this Stackoverflow answer" (BTW, what's the copyright status of that?)
CC-BY-SA
https://stackoverflow.com/legal/terms-of-service/public
(I planned to stay out of this thread)
😉
Stéfan _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
Just a heads-up, AI Agents are now shame-posting for getting their PR closed. Just happened this morning in matplotlib. On Wed, Feb 11, 2026 at 4:34 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Tue, 2026-02-10 at 16:18 -0800, Stefan van der Walt via NumPy- Discussion wrote:
On Tue, Feb 10, 2026, at 15:12, Evgeni Burovski wrote:
3. be careful not to breach any copyright or license terms (yes, we take those seriously!).
For a contributor this recommendation is not easily actionable. "I used a tool X and it gave me this code" --- how to make sure I understand the code, this is clear yes I can do that; how am I meant to carefully check for copyright?
It's near impossible, so I suspect the only way to truly play it safe is to only provide code that cannot reasonably be copyrighted.
TL;DR: To "be careful not to break copyright" just states fact? How scary that fact is depends a bit on how the viewpoint/how likely it is agents violate copyright. If there is guidance e.g. from some large OSS foundation, I would prefer to link to that rather than try to figure it out ourselves...
---
Copyright violation is a problem. But I am not sure it is a huge one for many contributions? I.e. just because they are very project specific or small. [1]
However, I still think that this isn't new at all: By contributing, we already agree to licensing the code with the projects license and that means being sure we are allowed to license it that way. And while we don't make you sign a CLA (contributors license agreement) any project that has a bit of legalese around should already have more scary sentences.
So yeah, the scariness of the sentence depends on the view-point, but at its core, I think it just states a fact?
For myself, I don't really feel like discussing it too much without a better foundation: it seems to me that books will be written or at least some OSS foundation with more legal knowledge should make guidelines that we can use as a basis of our own (or as a basis of discussion). Maybe those already exist? Is there an OOS foundation that e.g. says: Please don't use these tools due to copyright issues (or a variation)?
You can argue we should inform contributors to err on the super safe side... my gut feeling is we can't do much: Discouraging the careful ones while the non-careful ones don't read this anyway seems not super useful. We could force people to "sign" a CLA now if we were more worried, but do we really want that (nor do I doubt it will help a lot)? [2]
FWIW, if someone contributed a non-trivial/textbook algorithm or said "implement X for/in Y", I think they clearly have to do due diligence. (Of course best case, the original code is licensed in a way that derived works -- with attribution -- are unproblematic.)
- Sebastian
[1] OK, I am not sure about things like "fair use" due to how small something is, that again depends a lot on where you are on the planet also... [2] My limited understanding is we don't need this because we just won't re-license our code (this is not a problem, becuase it's such a free license). I.e. there is now reason for us to "own" the code. But I also would be surprised if there aren't legal counsels who would say that we need one either way... ("sign" could just mean adding a "signed-off by" to the commit or even putting it more in the PR template/contributors docs.)
So maybe it'd be helpful to have a link to some guide, however rough, plus some reading material. Or (am putting a maintainer hat on) maybe we want to ask the contributor to show some analysis. As in, "this code is only a refcounting fix where the origin traces straight to CPython docs" vs "this code can be traced to this Stackoverflow answer" (BTW, what's the copyright status of that?)
CC-BY-SA
https://stackoverflow.com/legal/terms-of-service/public
(I planned to stay out of this thread)
😉
Stéfan _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ben.v.root@gmail.com
Hi Sebastian, On Wed, Feb 11, 2026 at 3:49 PM Benjamin Root via NumPy-Discussion < numpy-discussion@python.org> wrote:
Just a heads-up, AI Agents are now shame-posting for getting their PR closed. Just happened this morning in matplotlib.
On Wed, Feb 11, 2026 at 4:34 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Tue, 2026-02-10 at 16:18 -0800, Stefan van der Walt via NumPy- Discussion wrote:
On Tue, Feb 10, 2026, at 15:12, Evgeni Burovski wrote:
3. be careful not to breach any copyright or license terms (yes, we take those seriously!).
For a contributor this recommendation is not easily actionable. "I used a tool X and it gave me this code" --- how to make sure I understand the code, this is clear yes I can do that; how am I meant to carefully check for copyright?
It's near impossible, so I suspect the only way to truly play it safe is to only provide code that cannot reasonably be copyrighted.
TL;DR: To "be careful not to break copyright" just states fact? How scary that fact is depends a bit on how the viewpoint/how likely it is agents violate copyright. If there is guidance e.g. from some large OSS foundation, I would prefer to link to that rather than try to figure it out ourselves...
---
Copyright violation is a problem. But I am not sure it is a huge one for many contributions? I.e. just because they are very project specific or small. [1]
However, I still think that this isn't new at all: By contributing, we already agree to licensing the code with the projects license and that means being sure we are allowed to license it that way. And while we don't make you sign a CLA (contributors license agreement) any project that has a bit of legalese around should already have more scary sentences.
Yes, it's true that the legal and ethical problem hasn't changed, but the practical problem has changed out of all recognition. Before, there was no reasonable likelihood of copyright becoming effectively void, so there is no practical way of defending your work from copying. Now, that is a real possibility, if we do not pay close attention, very soon. The difference is two-fold. Imagine a substantial PR with say 200 lines of new code, written by AI and checked for logic by the submitter. a) There's a reasonable chance that the generated code will have pulled code to which copyright applies, without the submitter realizing that has happened (this could not have happened before), or b) It is now completely trivial, by simple inattention, or momentary breach of ethics, to rewrite copyrightable code. (This was what I was getting at with my earlier post on this). The complete triviality matters because, when it was not trivial, it was much harder to do this by accident, or by momentary breach. By analogy - consider students cheating on assignments (of which I have some experience). Before, it was possible to cheat, for example, by paying an essay mill - but it took enough effort to force the student to consider what they were doing. As a result they did it rarely, and it was not common practice. After AI, it is not only possible, but trivial to cheat - and indeed, for that reason, cheating with AI has become completely routine. For a discussion of the ineffectiveness of teaching good-practice with AI, to prevent bad practice, see: https://timrequarth.substack.com/p/why-ai-guidelines-arent-enough
So yeah, the scariness of the sentence depends on the view-point, but at its core, I think it just states a fact?
For myself, I don't really feel like discussing it too much without a better foundation: it seems to me that books will be written or at least some OSS foundation with more legal knowledge should make guidelines that we can use as a basis of our own (or as a basis of discussion).
I think the legal aspect of this is more or less irrelevant for us - I guess the chances of anyone pursuing us for copyright breach are very small. The key issue - at least for me - is the ethical one - are we honoring the intention of the original author in requesting recognition for their work? And that is something it seems to me that we - the Scientific Python community - are qualified to comment and decide on.
Maybe those already exist? Is there an OOS foundation that e.g. says: Please don't use these tools due to copyright issues (or a variation)?
You can argue we should inform contributors to err on the super safe side... my gut feeling is we can't do much: Discouraging the careful ones while the non-careful ones don't read this anyway seems not super useful.
Paul Ivanov and I discussed this in some detail in our draft blog post, largely as a result of debate in the Scientific Python meeting in Seattle: https://github.com/matthew-brett/sp-ai-post/blob/main/notes.md Summary: it seemed likely to us that establishing a strong norm would in fact be effective in reducing copyright-damaging use of AI.
We could force people to "sign" a CLA now if we were more worried, but do we really want that (nor do I doubt it will help a lot)? [2]
FWIW, if someone contributed a non-trivial/textbook algorithm or said "implement X for/in Y", I think they clearly have to do due diligence. (Of course best case, the original code is licensed in a way that derived works -- with attribution -- are unproblematic.)
We discussed this case in the blog post (link above). There is no way to make sure that the AI will not in fact pull in other, copyrightable code, even if you asked it to port code for which you know the copyright. I think the safe - and possibly the best - way to do this - is to put a heavy requirement on contributors to either a) write the code themselves, perhaps having asked for preliminary analysis (but not substantial code drafts) from AI or b) write the code with AI, but demonstrate that they have done the research to establish the generated code does not breach copyright. Yes, that's a burden for the contributor, and yes, we may therefore lose substantial AI-generated chunks of code, but I suspect we (Scientific Python projects generally) won't suffer all that much from that restriction, in the long term - because we gain instead by having contributors with greater understanding of the code base and their own PRs. That is - as Linus Torvalds seems to imply - don't write code with AI, but use AI for analysis, maintenance and tooling. Evgeni and I discussed the constraint of pushing copyright burden to submitters over at: https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c... Cheers, Matthew -- This email is fully human-source. Unless I'm quoting AI, I did not use AI for any text in this email.
Hi Matthew, That all sounds reasonable to me so far, but what are the next steps?*
put a heavy requirement on contributors to either a) write the code themselves, perhaps having asked for preliminary analysis (but not substantial code drafts) from AI
Is this enforceable to a significant extent? If not, in what sense could it pose a genuinely ‘heavy requirement’?
or b) write the code with AI, but demonstrate that they have done the research to establish the generated code does not breach copyright.
Perhaps this is more enforceable? But to be honest it is still quite unclear to me how I would establish with certainty that code I’ve had generated does not breach copyright, much less code that is being presented to me by a contributor. Do you see how to realise a ‘heavy requirement’ here? I agree with the spirit of the thought that the burden (if it is to exist) needs to be shifted away from maintainers, but it’s unclear to me how we can actually shift it elsewhere. As we discussed last year, I think we have a start at a decent argument towards including a checkbox in PR templates which contributors must tick to state that they recognise the risk of copyright violation via LLM generated code and take responsibility for the code they are submitting: https://github.com/matthew-brett/sp-ai-post/issues/2#issuecomment-2935428854. Even there though, there are still multiple debatable premises. Of course, we can hardly aim for some sort of logical proof of the right way forward, but I think we need more focused attention and argument towards a specific and understandable goal if we are to be able to come to consensus on some concrete steps forward. It is to this thread's merit that the discussion has been so varied and touched on many topics, but it is also demonstrative of the problem that broad and vague back-and-forths don’t really help settle on anything concrete. Cheers, Lucas *(of course, it is not your responsibility to answer this question :) but collectively we need to ask it)
On 11 Feb 2026, at 21:14, Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Hi Sebastian,
On Wed, Feb 11, 2026 at 3:49 PM Benjamin Root via NumPy-Discussion <numpy-discussion@python.org <mailto:numpy-discussion@python.org>> wrote:
Just a heads-up, AI Agents are now shame-posting for getting their PR closed. Just happened this morning in matplotlib.
On Wed, Feb 11, 2026 at 4:34 AM Sebastian Berg <sebastian@sipsolutions.net <mailto:sebastian@sipsolutions.net>> wrote:
On Tue, 2026-02-10 at 16:18 -0800, Stefan van der Walt via NumPy- Discussion wrote:
On Tue, Feb 10, 2026, at 15:12, Evgeni Burovski wrote:
3. be careful not to breach any copyright or license terms (yes, we take those seriously!).
For a contributor this recommendation is not easily actionable. "I used a tool X and it gave me this code" --- how to make sure I understand the code, this is clear yes I can do that; how am I meant to carefully check for copyright?
It's near impossible, so I suspect the only way to truly play it safe is to only provide code that cannot reasonably be copyrighted.
TL;DR: To "be careful not to break copyright" just states fact? How scary that fact is depends a bit on how the viewpoint/how likely it is agents violate copyright. If there is guidance e.g. from some large OSS foundation, I would prefer to link to that rather than try to figure it out ourselves...
---
Copyright violation is a problem. But I am not sure it is a huge one for many contributions? I.e. just because they are very project specific or small. [1]
However, I still think that this isn't new at all: By contributing, we already agree to licensing the code with the projects license and that means being sure we are allowed to license it that way. And while we don't make you sign a CLA (contributors license agreement) any project that has a bit of legalese around should already have more scary sentences.
Yes, it's true that the legal and ethical problem hasn't changed, but the practical problem has changed out of all recognition.
Before, there was no reasonable likelihood of copyright becoming effectively void, so there is no practical way of defending your work from copying.
Now, that is a real possibility, if we do not pay close attention, very soon.
The difference is two-fold. Imagine a substantial PR with say 200 lines of new code, written by AI and checked for logic by the submitter.
a) There's a reasonable chance that the generated code will have pulled code to which copyright applies, without the submitter realizing that has happened (this could not have happened before), or
b) It is now completely trivial, by simple inattention, or momentary breach of ethics, to rewrite copyrightable code. (This was what I was getting at with my earlier post on this). The complete triviality matters because, when it was not trivial, it was much harder to do this by accident, or by momentary breach.
By analogy - consider students cheating on assignments (of which I have some experience). Before, it was possible to cheat, for example, by paying an essay mill - but it took enough effort to force the student to consider what they were doing. As a result they did it rarely, and it was not common practice. After AI, it is not only possible, but trivial to cheat - and indeed, for that reason, cheating with AI has become completely routine. For a discussion of the ineffectiveness of teaching good-practice with AI, to prevent bad practice, see:
https://timrequarth.substack.com/p/why-ai-guidelines-arent-enough
So yeah, the scariness of the sentence depends on the view-point, but at its core, I think it just states a fact?
For myself, I don't really feel like discussing it too much without a better foundation: it seems to me that books will be written or at least some OSS foundation with more legal knowledge should make guidelines that we can use as a basis of our own (or as a basis of discussion).
I think the legal aspect of this is more or less irrelevant for us - I guess the chances of anyone pursuing us for copyright breach are very small. The key issue - at least for me - is the ethical one - are we honoring the intention of the original author in requesting recognition for their work? And that is something it seems to me that we - the Scientific Python community - are qualified to comment and decide on.
Maybe those already exist? Is there an OOS foundation that e.g. says: Please don't use these tools due to copyright issues (or a variation)?
You can argue we should inform contributors to err on the super safe side... my gut feeling is we can't do much: Discouraging the careful ones while the non-careful ones don't read this anyway seems not super useful.
Paul Ivanov and I discussed this in some detail in our draft blog post, largely as a result of debate in the Scientific Python meeting in Seattle:
https://github.com/matthew-brett/sp-ai-post/blob/main/notes.md
Summary: it seemed likely to us that establishing a strong norm would in fact be effective in reducing copyright-damaging use of AI.
We could force people to "sign" a CLA now if we were more worried, but do we really want that (nor do I doubt it will help a lot)? [2]
FWIW, if someone contributed a non-trivial/textbook algorithm or said "implement X for/in Y", I think they clearly have to do due diligence. (Of course best case, the original code is licensed in a way that derived works -- with attribution -- are unproblematic.)
We discussed this case in the blog post (link above). There is no way to make sure that the AI will not in fact pull in other, copyrightable code, even if you asked it to port code for which you know the copyright.
I think the safe - and possibly the best - way to do this - is to put a heavy requirement on contributors to either a) write the code themselves, perhaps having asked for preliminary analysis (but not substantial code drafts) from AI or b) write the code with AI, but demonstrate that they have done the research to establish the generated code does not breach copyright. Yes, that's a burden for the contributor, and yes, we may therefore lose substantial AI-generated chunks of code, but I suspect we (Scientific Python projects generally) won't suffer all that much from that restriction, in the long term - because we gain instead by having contributors with greater understanding of the code base and their own PRs. That is - as Linus Torvalds seems to imply - don't write code with AI, but use AI for analysis, maintenance and tooling.
Evgeni and I discussed the constraint of pushing copyright burden to submitters over at:
https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c...
Cheers,
Matthew
-- This email is fully human-source. Unless I'm quoting AI, I did not use AI for any text in this email. _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org <mailto:numpy-discussion@python.org> To unsubscribe send an email to numpy-discussion-leave@python.org <mailto:numpy-discussion-leave@python.org> https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: lucas.colley8@gmail.com <mailto:lucas.colley8@gmail.com>
Hi, On Wed, Feb 11, 2026 at 11:02 PM Lucas Colley via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi Matthew,
That all sounds reasonable to me so far, but what are the next steps?*
put a heavy requirement on contributors to either a) write the code themselves, perhaps having asked for preliminary analysis (but not substantial code drafts) from AI
Is this enforceable to a significant extent? If not, in what sense could it pose a genuinely ‘heavy requirement’?
or b) write the code with AI, but demonstrate that they have done the research to establish the generated code does not breach copyright.
Perhaps this is more enforceable? But to be honest it is still quite unclear to me how I would establish with certainty that code I’ve had generated does not breach copyright, much less code that is being presented to me by a contributor. Do you see how to realise a ‘heavy requirement’ here?
I agree with the spirit of the thought that the burden (if it is to exist) needs to be shifted away from maintainers, but it’s unclear to me how we can actually shift it elsewhere.
As we discussed last year, I think we have a start at a decent argument towards including a checkbox in PR templates which contributors must tick to state that they recognise the risk of copyright violation via LLM generated code and take responsibility for the code they are submitting: https://github.com/matthew-brett/sp-ai-post/issues/2#issuecomment-2935428854 .
Even there though, there are still multiple debatable premises. Of course, we can hardly aim for some sort of logical proof of the right way forward, but I think we need more focused attention and argument towards a specific and understandable goal if we are to be able to come to consensus on some concrete steps forward. It is to this thread's merit that the discussion has been so varied and touched on many topics, but it is also demonstrative of the problem that broad and vague back-and-forths don’t really help settle on anything concrete.
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors: Please specify one of these: 1) I wrote this code myself, without looking at significant AI-generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright. So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation. Cheers, Matthew
On Wed, 2026-02-11 at 23:22 +0000, Matthew Brett via NumPy-Discussion wrote:
Hi,
On Wed, Feb 11, 2026 at 11:02 PM Lucas Colley via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi Matthew,
That all sounds reasonable to me so far, but what are the next steps?*
put a heavy requirement on contributors to either a) write the code themselves, perhaps having asked for preliminary analysis (but not substantial code drafts) from AI
Is this enforceable to a significant extent? If not, in what sense could it pose a genuinely ‘heavy requirement’?
or b) write the code with AI, but demonstrate that they have done the research to establish the generated code does not breach copyright.
Perhaps this is more enforceable? But to be honest it is still quite unclear to me how I would establish with certainty that code I’ve had generated does not breach copyright, much less code that is being presented to me by a contributor. Do you see how to realise a ‘heavy requirement’ here?
I agree with the spirit of the thought that the burden (if it is to exist) needs to be shifted away from maintainers, but it’s unclear to me how we can actually shift it elsewhere.
As we discussed last year, I think we have a start at a decent argument towards including a checkbox in PR templates which contributors must tick to state that they recognise the risk of copyright violation via LLM generated code and take responsibility for the code they are submitting: https://github.com/matthew-brett/sp-ai-post/issues/2#issuecomment-2935428854 .
Even there though, there are still multiple debatable premises. Of course, we can hardly aim for some sort of logical proof of the right way forward, but I think we need more focused attention and argument towards a specific and understandable goal if we are to be able to come to consensus on some concrete steps forward. It is to this thread's merit that the discussion has been so varied and touched on many topics, but it is also demonstrative of the problem that broad and vague back-and-forths don’t really help settle on anything concrete.
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI- generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
While I am not particularly enthusiastic about focusing on copyright, adding such a checkbox on a PR, I would be happy with. (If it was focused on copyright, then it seems to me we would need to ask more things, like "I used a source, but it had no code" to "I used a source with code but I checked it's license". If we want this, I would prefer a single fuzzy sentence that links out to elsewhere that can also discuss pitfalls around copyright+AI.) Not sure that asking for a checkbox there will be honored, but I like the thought. First, it will increase the chance of getting the information (which I want as a reviewer). Second, my unfortunate feeling is that we'll get more aggressively/less friendly about closing PRs and that is a shame, and having the checkbox makes that pat a bit easier on us and maybe also more transparent to the user that we are struggling with this (the worry of course is closing a genuine human PR by accident). I think I largely understand the concerns around copyright and maybe I am a bit not careful/understanding enough by not being overly worried?... But to my very personal feeling the product of how much I feel we should worry and how much I feel that stressing issues will help us as a project/open source just doesn't make me enthusiastic about being aggressively to pointing it out these possible issues. There are many things to discuss around this. What does eroding copyright here mean for us as a project, for open source (GPL?), for open but not free code, for code that is leaked but sold? How will enforcement of actual copyright issues plays out in practice? I just don't think this is the venue for settling these questions [1] and I would need a lot more clarity to even form a strong opinion that I would be willing to announce to the world with the weight of NumPy behind it. - Sebastian [1] And yeah, this is laziness, because I feel to really settle it for even myself, I may have to spend weeks reading up and thinking about it.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
Cheers,
Matthew _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
Hi, On Thu, Feb 12, 2026 at 7:02 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Wed, 2026-02-11 at 23:22 +0000, Matthew Brett via NumPy-Discussion wrote:
Hi,
On Wed, Feb 11, 2026 at 11:02 PM Lucas Colley via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi Matthew,
That all sounds reasonable to me so far, but what are the next steps?*
put a heavy requirement on contributors to either a) write the code themselves, perhaps having asked for preliminary analysis (but not substantial code drafts) from AI
Is this enforceable to a significant extent? If not, in what sense could it pose a genuinely ‘heavy requirement’?
or b) write the code with AI, but demonstrate that they have done the research to establish the generated code does not breach copyright.
Perhaps this is more enforceable? But to be honest it is still quite unclear to me how I would establish with certainty that code I’ve had generated does not breach copyright, much less code that is being presented to me by a contributor. Do you see how to realise a ‘heavy requirement’ here?
I agree with the spirit of the thought that the burden (if it is to exist) needs to be shifted away from maintainers, but it’s unclear to me how we can actually shift it elsewhere.
As we discussed last year, I think we have a start at a decent argument towards including a checkbox in PR templates which contributors must tick to state that they recognise the risk of copyright violation via LLM generated code and take responsibility for the code they are submitting: https://github.com/matthew-brett/sp-ai-post/issues/2#issuecomment-2935428854 .
Even there though, there are still multiple debatable premises. Of course, we can hardly aim for some sort of logical proof of the right way forward, but I think we need more focused attention and argument towards a specific and understandable goal if we are to be able to come to consensus on some concrete steps forward. It is to this thread's merit that the discussion has been so varied and touched on many topics, but it is also demonstrative of the problem that broad and vague back-and-forths don’t really help settle on anything concrete.
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI- generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
While I am not particularly enthusiastic about focusing on copyright, adding such a checkbox on a PR, I would be happy with. (If it was focused on copyright, then it seems to me we would need to ask more things, like "I used a source, but it had no code" to "I used a source with code but I checked it's license". If we want this, I would prefer a single fuzzy sentence that links out to elsewhere that can also discuss pitfalls around copyright+AI.)
Not sure that asking for a checkbox there will be honored, but I like the thought. First, it will increase the chance of getting the information (which I want as a reviewer). Second, my unfortunate feeling is that we'll get more aggressively/less friendly about closing PRs and that is a shame, and having the checkbox makes that pat a bit easier on us and maybe also more transparent to the user that we are struggling with this (the worry of course is closing a genuine human PR by accident).
I think I largely understand the concerns around copyright and maybe I am a bit not careful/understanding enough by not being overly worried?... But to my very personal feeling the product of how much I feel we should worry and how much I feel that stressing issues will help us as a project/open source just doesn't make me enthusiastic about being aggressively to pointing it out these possible issues.
There are many things to discuss around this. What does eroding copyright here mean for us as a project, for open source (GPL?), for open but not free code, for code that is leaked but sold? How will enforcement of actual copyright issues plays out in practice? I just don't think this is the venue for settling these questions [1] and I would need a lot more clarity to even form a strong opinion that I would be willing to announce to the world with the weight of NumPy behind it.
I do understand that this is not the kind of issue that many of us enjoy discussing, but it seems to me that it is: a) of central importance to the future of open-source, and b) very urgent, and c) fairly straightforward. To focus the discussion - the only thing of interest to us here, is the acceptability or otherwise of large chunks of code generated by AI. I doubt that anyone has strong objections to AI for code review or code analysis. For the central importance, imagine a world where copyright has become irrelevant. There are ways we could approach this issue, where that is a likely outcome. We might have different views on whether that is acceptable, but at very least, it will be a very major change, with unpredictable consequences. We are used to open-source copyright as it exists. If we don't consciously address this now, or very soon, we'll have another world, with consequences that are difficult to predict. Of course, some of us don't care all that much about our own copyright, but bear in mind, that by choosing not to defend it, we take away the ability of others to defend theirs. Specifically, if we choose to accept large AI-generated PRs, the copyright that will be violated is not ours, but that of others. Do we claim that right, to void the copyright of our fellow authors? Returning to the central question - of large AI-generated PRs. It seems to me this is not a week of work to analyze. I don't think there's any controversy that making no effort to control copyright will, over the medium term, make copyright very difficult to honor. As I said before, the legal issues of enforcement are difficult, but not relevant to us, because we are considering our own ethics in observing copyright, and that will be a superset of the legal constraints. It would be an error to defer to legal arguments for an ethical question, if only because the legal arguments are sufficiently complicated that we'd likely have lost the ability to enforce copyright before they are resolved. And, as I say, I think the legal arguments - on enforcement - are more or less irrelevant to our ethical decisions on copyright. So, accepting large AI-generated PRs would be a significant threat to copyright - what do we get in return? Ralf pointed out one benefit - that we are not seen to disapprove of the chosen workflows of our fellow developers. I think this is a weak argument. It seems to me perfectly reasonable to point out that contributing to the code-base has some constraints, and copyright is one of them, and that AI-generated code runs the risk of violating copyright. The second potential benefit is that, by accepting large AI-generated PRs, we will gain greatly in code coverage and quality, and that this is a benefit great enough that it is worth paying the price in terms of copyright. First - we have been prepared to pay a high price for observing copyright in the past - there are many GPL algorithms that we could have copied, to our benefit, but did not. Second, it seems to me we can wait on this. It is not yet clear that we would gain significantly, compared to our traditional requirement that people write their own code. When the gains are still unclear, the cost in terms of voiding copyright is too high. Lastly - I was proposing a compromise - that we (Scientific Python projects) do not forbid AI-generated PRs, but place an extra burden on contributors to research any possible copyright violations. That seems like a reasonable compromise to me. What do you think? Cheers, Matthew
The risk of copyright violation isn't just with GPL'ed code in the training set, but also potentially from privately held code that was accidentally leaked into a training set. Imagine if MathWorks or ESRI discover their code in our repos and decide to sue. The LLM has access to an unprecedented dataset of code that a human could never have and we can't ever be sure there isn't leaked code in it.
Hi, I think it would be prudent to be more explicit with regards to the threat of copyright violation. It is almost certainly true that GPL code has been part of the training set and it is very possible that even private code has found its way in there. But that is also true for human contributors. After all, am I, having read Numerical Recipes in the 90s, ineligible to implement Runge-Kutta methods in open source? Surely not. Can an employee of a company write "from scratch" in open source today a method they (co-)developed at work yesterday? Also not. A straight-up copy of a large enough section of code is problematic, but less likely than an amalgamation of existing code adapted to the Numpy codebase already today and imho moreso in the future. As such, Matthew's proposed compromise sounds reasonable to me, and I would add as guidance for reviewers that PRs adding large chunks of self-contained code, i.e. entirely new big files or hundreds of lines of consecutive code with no relation to existing Numpy code, deserve a higher level of scrutiny for this specific risk. Cheers Klaus On Fri, Feb 13, 2026 at 1:44 PM Benjamin Root via NumPy-Discussion < numpy-discussion@python.org> wrote:
The risk of copyright violation isn't just with GPL'ed code in the training set, but also potentially from privately held code that was accidentally leaked into a training set. Imagine if MathWorks or ESRI discover their code in our repos and decide to sue. The LLM has access to an unprecedented dataset of code that a human could never have and we can't ever be sure there isn't leaked code in it. _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: kzimmermann@quansight.com
Also I'd like to be on record with the unpleasant part out loud. I have been in many discussions also at work and in OSS circles so I have quite a bit of debate ammo accumulated from both sides. Let me jump into it without the fluff to save space; Currently, LLMs are getting really good at what they are tasked to do. If you put in the work (just like you would when you are the one writing the code), the output is quite acceptable and I feel like I'm reviewing somebody else's Pull request. Fix "this" part, change "that" part and done. If folks can't use these tools, it's a "they" problem. I just used it to translate entire LAPACK to C11 (why, mostly for the lolz, don't ask, it's a disease), ported all the tests and passing, now polishing it up. I mean look at this silly thing [image: image.png] No way in hell, I'd type this much code myself. And it is a 1-to-1 mechanical translation, no creativity involved except hacking into PyData theme because I always wanted to tweak it. Now who owns the copyright; Dennis Ritchie or LAPACK folks, or is it the entire C codebase of the world that trained this machine to write this mechanical code, or is it me who paid for it and worked with it etc.? The source of the algorithm is BSD3, would you be using this if this was available in BSD3 (I mean it will be obviously very soon). As a comparison, the entire SciPy Fortran codebase, ~85,000 SLOC, took me 2 years and 7 months to translate manually. Entire LAPACK codebase 300,000 SLOC (just the functions) and including the testing, documentation etc. took me exactly 1 month and 19 days (Claude Pro something MAX level subscription with ~200€ per month from my own pocket). The agent still fails spectacularly if you let it run free, but I do put in the work to do a proper code review, tweak rules, then force it to read the rules periodically, (and most importantly, I know what I am looking at) so this went fairly well. It still took insane amount of time to bring the agent back on track. force explicit testing, Not to use C++ practices on C code so on. At this point, I can confirm that "Agents can do this much but they cannot do that much" is rapidly becoming a "God of the Gaps" argument with every new version release LLMs chasing a receding horizon, not towards intelligence, but precision at parsing and following orders. However, in my opinion, our dilemma is not a whether their output is potentially GPL'd/copyrighted code or not. Every bit of output of these tools is stolen by being trained on copyrighted data. For the folks who did not see it, there is a screenshot of VS Code offering me a comment at the beginning of the file from a company that does not apparently have any public repositories https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c... Therefore, we are, in fact, trying to guess, whether it looks like a copyrighted code after the fact, ignoring where the code is pulled from. These companies pretty much stole everything; music, science articles, code (not just GPLd code, but private repositories), this, that, everything. Their practices were/are seriously unethical. It is not a political statement but facts. However, it seems like they are getting away with it, incredibly, even after they admitted it multiple times all the way at the CEO level (in particular, recently, SUNO CEO is pretty bullish, even defending why this stealing is fair use while individuals are rapidly being prosecuted for the same actions, not to mention Sci-Hub). And some of us are working for these companies or working for in the secondary circles. Funnily enough, we are tasked with this mordant task of trying to come up with a stance on LLM usage. I claim that we should not be spending too much time on the epistemological aspects of LLM usage. I can't see any way other than being utilitarian about it. Because PRs keep coming and maintainers are also using it. So when stuck between a rock and a hardware, I think we should be admitting these properly and then choose a path knowingly fully aware that we might be making a mistake. Being open about the fact that we are going blind into this is probably make more sense instead of some serious sounding untested-unvetted legal text and checkboxes. Because really nobody knows when we will correct course, if ever. So we can 1- "Stallman" it, with "no AI allowed" stance, while having absolutely no way of knowing how the code is generated. So it is a stance based on principles. I don't have a problem with it, and can accept it. It is a viable and respectable choice. The downside is we will be forcing people to lie. Because they will use it and we will not notice it until it is very late. 2- or find a sentence that is pragmatic enough; something like "Even if you used LLMs, you should be able to explain the changes yourself. LLM based PRs are held to heightened levels of scrutiny and lower levels of patience" or something offered in this thread. I can also accept this, it is also a viable option. The downside is that it will make us more hostile, as Sebastian mentioned, and paranoid. Occasionally, it will make us accuse innocent folks for using LLMs. Once we can choose this, then we can add agent markdowns, boilerplate responses and other details. But it seems like we got stuck at this choice level in our last attempts for a policy alignment. I would be much happier if we can be a bit more explicit and forthcoming about what we are doing and not make it an in vitro Open Source problem. We don't need to use strong words like stealing etc. obviously since there is no legal basis for it. But we all know what happened so there are much softer versions of saying the same thing. I just did not spend the time to make these proper ala Pascal, and it's my lack of manners leaking out though I strongly believe that they stole everything. I am fully aware that this might not be everyone's take (or anyone for that matter), so please take it as a rather brazen take though I hope the message gets across. Very weird times indeed. ilhan On Fri, Feb 13, 2026 at 12:03 PM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
On Thu, Feb 12, 2026 at 7:02 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Wed, 2026-02-11 at 23:22 +0000, Matthew Brett via NumPy-Discussion wrote:
Hi,
On Wed, Feb 11, 2026 at 11:02 PM Lucas Colley via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi Matthew,
That all sounds reasonable to me so far, but what are the next steps?*
put a heavy requirement on contributors to either a) write the code themselves, perhaps having asked for preliminary analysis (but not substantial code drafts) from AI
Is this enforceable to a significant extent? If not, in what sense could it pose a genuinely ‘heavy requirement’?
or b) write the code with AI, but demonstrate that they have done the research to establish the generated code does not breach copyright.
Perhaps this is more enforceable? But to be honest it is still quite unclear to me how I would establish with certainty that code I’ve had generated does not breach copyright, much less code that is being presented to me by a contributor. Do you see how to realise a ‘heavy requirement’ here?
I agree with the spirit of the thought that the burden (if it is to exist) needs to be shifted away from maintainers, but it’s unclear to me how we can actually shift it elsewhere.
As we discussed last year, I think we have a start at a decent argument towards including a checkbox in PR templates which contributors must tick to state that they recognise the risk of copyright violation via LLM generated code and take responsibility for the code they are submitting:
https://github.com/matthew-brett/sp-ai-post/issues/2#issuecomment-2935428854
.
Even there though, there are still multiple debatable premises. Of course, we can hardly aim for some sort of logical proof of the right way forward, but I think we need more focused attention and argument towards a specific and understandable goal if we are to be able to come to consensus on some concrete steps forward. It is to this thread's merit that the discussion has been so varied and touched on many topics, but it is also demonstrative of the problem that broad and vague back-and-forths don’t really help settle on anything concrete.
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI- generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
While I am not particularly enthusiastic about focusing on copyright, adding such a checkbox on a PR, I would be happy with. (If it was focused on copyright, then it seems to me we would need to ask more things, like "I used a source, but it had no code" to "I used a source with code but I checked it's license". If we want this, I would prefer a single fuzzy sentence that links out to elsewhere that can also discuss pitfalls around copyright+AI.)
Not sure that asking for a checkbox there will be honored, but I like the thought. First, it will increase the chance of getting the information (which I want as a reviewer). Second, my unfortunate feeling is that we'll get more aggressively/less friendly about closing PRs and that is a shame, and having the checkbox makes that pat a bit easier on us and maybe also more transparent to the user that we are struggling with this (the worry of course is closing a genuine human PR by accident).
I think I largely understand the concerns around copyright and maybe I am a bit not careful/understanding enough by not being overly worried?... But to my very personal feeling the product of how much I feel we should worry and how much I feel that stressing issues will help us as a project/open source just doesn't make me enthusiastic about being aggressively to pointing it out these possible issues.
There are many things to discuss around this. What does eroding copyright here mean for us as a project, for open source (GPL?), for open but not free code, for code that is leaked but sold? How will enforcement of actual copyright issues plays out in practice? I just don't think this is the venue for settling these questions [1] and I would need a lot more clarity to even form a strong opinion that I would be willing to announce to the world with the weight of NumPy behind it.
I do understand that this is not the kind of issue that many of us enjoy discussing, but it seems to me that it is:
a) of central importance to the future of open-source, and b) very urgent, and c) fairly straightforward.
To focus the discussion - the only thing of interest to us here, is the acceptability or otherwise of large chunks of code generated by AI. I doubt that anyone has strong objections to AI for code review or code analysis.
For the central importance, imagine a world where copyright has become irrelevant. There are ways we could approach this issue, where that is a likely outcome. We might have different views on whether that is acceptable, but at very least, it will be a very major change, with unpredictable consequences. We are used to open-source copyright as it exists. If we don't consciously address this now, or very soon, we'll have another world, with consequences that are difficult to predict.
Of course, some of us don't care all that much about our own copyright, but bear in mind, that by choosing not to defend it, we take away the ability of others to defend theirs. Specifically, if we choose to accept large AI-generated PRs, the copyright that will be violated is not ours, but that of others. Do we claim that right, to void the copyright of our fellow authors?
Returning to the central question - of large AI-generated PRs. It seems to me this is not a week of work to analyze. I don't think there's any controversy that making no effort to control copyright will, over the medium term, make copyright very difficult to honor. As I said before, the legal issues of enforcement are difficult, but not relevant to us, because we are considering our own ethics in observing copyright, and that will be a superset of the legal constraints. It would be an error to defer to legal arguments for an ethical question, if only because the legal arguments are sufficiently complicated that we'd likely have lost the ability to enforce copyright before they are resolved. And, as I say, I think the legal arguments - on enforcement - are more or less irrelevant to our ethical decisions on copyright.
So, accepting large AI-generated PRs would be a significant threat to copyright - what do we get in return?
Ralf pointed out one benefit - that we are not seen to disapprove of the chosen workflows of our fellow developers. I think this is a weak argument. It seems to me perfectly reasonable to point out that contributing to the code-base has some constraints, and copyright is one of them, and that AI-generated code runs the risk of violating copyright.
The second potential benefit is that, by accepting large AI-generated PRs, we will gain greatly in code coverage and quality, and that this is a benefit great enough that it is worth paying the price in terms of copyright. First - we have been prepared to pay a high price for observing copyright in the past - there are many GPL algorithms that we could have copied, to our benefit, but did not. Second, it seems to me we can wait on this. It is not yet clear that we would gain significantly, compared to our traditional requirement that people write their own code. When the gains are still unclear, the cost in terms of voiding copyright is too high.
Lastly - I was proposing a compromise - that we (Scientific Python projects) do not forbid AI-generated PRs, but place an extra burden on contributors to research any possible copyright violations. That seems like a reasonable compromise to me. What do you think?
Cheers,
Matthew _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ilhanpolat@gmail.com
Hi Ilhan, On Fri, Feb 13, 2026 at 2:15 PM Ilhan Polat via NumPy-Discussion <numpy-discussion@python.org> wrote:
Also I'd like to be on record with the unpleasant part out loud. I have been in many discussions also at work and in OSS circles so I have quite a bit of debate ammo accumulated from both sides. Let me jump into it without the fluff to save space;
Currently, LLMs are getting really good at what they are tasked to do. If you put in the work (just like you would when you are the one writing the code), the output is quite acceptable and I feel like I'm reviewing somebody else's Pull request. Fix "this" part, change "that" part and done. If folks can't use these tools, it's a "they" problem.
I'm not sure what you are saying there - could you clarify?
I just used it to translate entire LAPACK to C11 (why, mostly for the lolz, don't ask, it's a disease), ported all the tests and passing, now polishing it up. I mean look at this silly thing
No way in hell, I'd type this much code myself. And it is a 1-to-1 mechanical translation, no creativity involved except hacking into PyData theme because I always wanted to tweak it. Now who owns the copyright; Dennis Ritchie or LAPACK folks, or is it the entire C codebase of the world that trained this machine to write this mechanical code, or is it me who paid for it and worked with it etc.? The source of the algorithm is BSD3, would you be using this if this was available in BSD3 (I mean it will be obviously very soon).
As a comparison, the entire SciPy Fortran codebase, ~85,000 SLOC, took me 2 years and 7 months to translate manually. Entire LAPACK codebase 300,000 SLOC (just the functions) and including the testing, documentation etc. took me exactly 1 month and 19 days (Claude Pro something MAX level subscription with ~200€ per month from my own pocket). The agent still fails spectacularly if you let it run free, but I do put in the work to do a proper code review, tweak rules, then force it to read the rules periodically, (and most importantly, I know what I am looking at) so this went fairly well. It still took insane amount of time to bring the agent back on track. force explicit testing, Not to use C++ practices on C code so on.
At this point, I can confirm that "Agents can do this much but they cannot do that much" is rapidly becoming a "God of the Gaps" argument with every new version release LLMs chasing a receding horizon, not towards intelligence, but precision at parsing and following orders.
I just wanted to clarify that I don't think the argument is about what agents can and cannot do. I think everyone believes they can be very useful. I should also say that your experience is very useful for the discussion - but it is somewhat specialized. I can well see that the AI agent could be a huge boon for this sort of semi-mechanical task, but there aren't many such tasks in the code that I'm working on. And - to return to my suggestion - I would argue here that your task, as the PR author, is to say "I went through the ported code very carefully, comparing to the original, and I am confident that the translation is a faithful language to language translation, from the original BSD code, and there is no significant injection of other code that may be subject to copyright. The closest example I could find was X, but a quick search for terms Y and Z found no plausible copyrighted origins."
However, in my opinion, our dilemma is not a whether their output is potentially GPL'd/copyrighted code or not. Every bit of output of these tools is stolen by being trained on copyrighted data. For the folks who did not see it, there is a screenshot of VS Code offering me a comment at the beginning of the file from a company that does not apparently have any public repositories https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c...
Therefore, we are, in fact, trying to guess, whether it looks like a copyrighted code after the fact, ignoring where the code is pulled from. These companies pretty much stole everything; music, science articles, code (not just GPLd code, but private repositories), this, that, everything. Their practices were/are seriously unethical. It is not a political statement but facts. However, it seems like they are getting away with it, incredibly, even after they admitted it multiple times all the way at the CEO level (in particular, recently, SUNO CEO is pretty bullish, even defending why this stealing is fair use while individuals are rapidly being prosecuted for the same actions, not to mention Sci-Hub). And some of us are working for these companies or working for in the secondary circles.
Right - and one conclusion we could draw is - OK, if (some idea of) everyone is doing it, we should be doing it too. But I'm sure you'd agree that's not a very convincing argument.
Funnily enough, we are tasked with this mordant task of trying to come up with a stance on LLM usage. I claim that we should not be spending too much time on the epistemological aspects of LLM usage. I can't see any way other than being utilitarian about it. Because PRs keep coming and maintainers are also using it. So when stuck between a rock and a hardware, I think we should be admitting these properly and then choose a path knowingly fully aware that we might be making a mistake. Being open about the fact that we are going blind into this is probably make more sense instead of some serious sounding untested-unvetted legal text and checkboxes. Because really nobody knows when we will correct course, if ever.
There may be such a legal text - but I don't think that's what I was proposing. Again, this isn't about enforcement - it's about ethics - as it always has been. We stated that we didn't accept GPL code, or code derived from GPL, and we took our contributors word that they had taken our request seriously. It really doesn't seem sensible to choose a policy that is obviously dangerous for copyright, and wait until it becomes obvious that we have damaged copyright. Rather it seems more sensible to choose an option that is less dangerous for copyright, and wait to see, as the tools develop, whether we need to re-evaluate. It really doesn't seem likely to me that the policy would stay in place long after it was causing the project harm.
So we can
1- "Stallman" it, with "no AI allowed" stance, while having absolutely no way of knowing how the code is generated. So it is a stance based on principles. I don't have a problem with it, and can accept it. It is a viable and respectable choice. The downside is we will be forcing people to lie. Because they will use it and we will not notice it until it is very late.
I just don't think this is true - I strongly suspect that people who are attracted to open-source, and the open-source community, will not generally lie about how they made their contributions - any more than we have seen attempts to put GPL code into our BSD codebases. Bear in mind that the "until it is very late" problem is the one that will happen much more quickly with a more permissive policy.
2- or find a sentence that is pragmatic enough; something like "Even if you used LLMs, you should be able to explain the changes yourself. LLM based PRs are held to heightened levels of scrutiny and lower levels of patience" or something offered in this thread. I can also accept this, it is also a viable option. The downside is that it will make us more hostile, as Sebastian mentioned, and paranoid. Occasionally, it will make us accuse innocent folks for using LLMs.
Could you comment on the option that I was proposing - which is that anyone generating code with AI should justify the copyright risk, with relevant research as necessary?
Once we can choose this, then we can add agent markdowns, boilerplate responses and other details. But it seems like we got stuck at this choice level in our last attempts for a policy alignment. I would be much happier if we can be a bit more explicit and forthcoming about what we are doing and not make it an in vitro Open Source problem. We don't need to use strong words like stealing etc. obviously since there is no legal basis for it.
I didn't use those words - but in any case - as I've said several times, in several places, the legal argument is more or less irrelevant to us - our question is whether we are honoring the spirit of the copyrights put on other people's code, not whether they could, with sufficient resources, successfully sue us for infringement. Cheers, Matthew
And - to return to my suggestion - I would argue here that your task, as
Right - and one conclusion we could draw is - OK, if (some idea of) everyone is doing it, we should be doing it too. But I'm sure you'd agree
I don't have any objections. Like I said, not everybody or anybody will agree wtih me on this. But a few clarifications, please feel free to ignore them. I really don't seek a debate in a mailing list but important for my argument above, in case I did not make it clear. the PR author, is to say "I went through the ported code [...] origins." Take my work; I would be lying. I can't even convince myself that I looked at every line of code. Plus, that kind of homework is impossible to sign out for me. How can I even know where to look if it is a partial code. We are talking about Pull Requests that are typically changing 1 or 2 hypothetical tiles from the entire floor. I asked an LLM that gave me some result. Meh, good enough, git push. If you ask me "where did you get the idea of this function decorator" or "what made you do this double pointer trick"; you will get an answer which will be a white lie. I think you are underestimating how much work that sentence carries. The easier option is clicking "I did not use AI". that's not a very convincing argument. That is exactly my argument, and much to my regret I think this is the only honest answer. None of us proposed to use LLMs, they barged in on us. Shifting the potential blame to the contributor while the code is 100% coming from copyright infringing tool is not convincing either.
Again, this isn't about enforcement - it's about ethics -as it always has been. We stated that we didn't accept GPL code, or code derived from GPL, and we took our contributors word that they had taken our request seriously.
I don't think this is as ethical as you make it sound like. You are removing the actual very hostile anti-copyright tool which is the main perpetrator. Then we are hoping that the users will be holding it right, and stealing maybe only a little but not too much to be disrespectful. In my opinion, there is no ethical escape hatch that we can use without participating in the act. The ethical thing here is to be transparent about our desperation that we don't know what we are taking in and tread carefully. --- For the being late part I can in fact bring you bunch of code and we can do a blindfold experiment, you look at the code and tell me which code is GPL or from Numerical Recipes. If you can't detect it, and I click all the checkboxes and then congrats you license laundered copyrighted code by pushing it into NumPy/SciPy/scikit-learn... Then authors of the original code somehow recognize a unique trick, come back, point at the code and say, "Yo, this is theft". That is what I mean by it is already too late. I get your point that if we don't do anything then it is even worse. They will bring anything which is true. But your belief in folks' ability to distinguish different licenses if it is coming from an LLM is higher than mine.
I didn't use those words - but in any case - as I've said several times, in several places, the legal argument is more or less irrelevant to us - our question is whether we are honoring the spirit of the copyrights put on other people's code, not whether they could, with sufficient resources, successfully sue us for infringement.
And I have endless respect for you and all who strive for this. I want to believe that I am in your camp, or at least trying to be. My point is, if this is done in spirit, let's make it obvious that we are powerless in case a mistake is made OR no LLMs. Both are fine. I do agree with your text, checkbox and other procedures you proposed and would be willing to use it. I just can't see its practical function, even ethically, in case a conflict arises or a fame-seeking contributor starts pulling in copyrighted code. Because unlike before, the act of stealing is now done within the LLM that has, for now, infinite impunity. On Fri, Feb 13, 2026 at 5:22 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi Ilhan,
On Fri, Feb 13, 2026 at 2:15 PM Ilhan Polat via NumPy-Discussion <numpy-discussion@python.org> wrote:
Also I'd like to be on record with the unpleasant part out loud. I have
been in many discussions also at work and in OSS circles so I have quite a bit of debate ammo accumulated from both sides. Let me jump into it without the fluff to save space;
Currently, LLMs are getting really good at what they are tasked to do.
If you put in the work (just like you would when you are the one writing the code), the output is quite acceptable and I feel like I'm reviewing somebody else's Pull request. Fix "this" part, change "that" part and done. If folks can't use these tools, it's a "they" problem.
I'm not sure what you are saying there - could you clarify?
I just used it to translate entire LAPACK to C11 (why, mostly for the lolz, don't ask, it's a disease), ported all the tests and passing, now polishing it up. I mean look at this silly thing
No way in hell, I'd type this much code myself. And it is a 1-to-1 mechanical translation, no creativity involved except hacking into PyData theme because I always wanted to tweak it. Now who owns the copyright; Dennis Ritchie or LAPACK folks, or is it the entire C codebase of the world that trained this machine to write this mechanical code, or is it me who paid for it and worked with it etc.? The source of the algorithm is BSD3, would you be using this if this was available in BSD3 (I mean it will be obviously very soon).
As a comparison, the entire SciPy Fortran codebase, ~85,000 SLOC, took me 2 years and 7 months to translate manually. Entire LAPACK codebase 300,000 SLOC (just the functions) and including the testing, documentation etc. took me exactly 1 month and 19 days (Claude Pro something MAX level subscription with ~200€ per month from my own pocket). The agent still fails spectacularly if you let it run free, but I do put in the work to do a proper code review, tweak rules, then force it to read the rules periodically, (and most importantly, I know what I am looking at) so this went fairly well. It still took insane amount of time to bring the agent back on track. force explicit testing, Not to use C++ practices on C code so on.
At this point, I can confirm that "Agents can do this much but they cannot do that much" is rapidly becoming a "God of the Gaps" argument with every new version release LLMs chasing a receding horizon, not towards intelligence, but precision at parsing and following orders.
I just wanted to clarify that I don't think the argument is about what agents can and cannot do. I think everyone believes they can be very useful.
I should also say that your experience is very useful for the discussion - but it is somewhat specialized. I can well see that the AI agent could be a huge boon for this sort of semi-mechanical task, but there aren't many such tasks in the code that I'm working on. And - to return to my suggestion - I would argue here that your task, as the PR author, is to say "I went through the ported code very carefully, comparing to the original, and I am confident that the translation is a faithful language to language translation, from the original BSD code, and there is no significant injection of other code that may be subject to copyright. The closest example I could find was X, but a quick search for terms Y and Z found no plausible copyrighted origins."
However, in my opinion, our dilemma is not a whether their output is potentially GPL'd/copyrighted code or not. Every bit of output of these tools is stolen by being trained on copyrighted data. For the folks who did not see it, there is a screenshot of VS Code offering me a comment at the beginning of the file from a company that does not apparently have any public repositories https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c...
Therefore, we are, in fact, trying to guess, whether it looks like a copyrighted code after the fact, ignoring where the code is pulled from. These companies pretty much stole everything; music, science articles, code (not just GPLd code, but private repositories), this, that, everything. Their practices were/are seriously unethical. It is not a political statement but facts. However, it seems like they are getting away with it, incredibly, even after they admitted it multiple times all the way at the CEO level (in particular, recently, SUNO CEO is pretty bullish, even defending why this stealing is fair use while individuals are rapidly being prosecuted for the same actions, not to mention Sci-Hub). And some of us are working for these companies or working for in the secondary circles.
Right - and one conclusion we could draw is - OK, if (some idea of) everyone is doing it, we should be doing it too. But I'm sure you'd agree that's not a very convincing argument.
Funnily enough, we are tasked with this mordant task of trying to come up with a stance on LLM usage. I claim that we should not be spending too much time on the epistemological aspects of LLM usage. I can't see any way other than being utilitarian about it. Because PRs keep coming and maintainers are also using it. So when stuck between a rock and a hardware, I think we should be admitting these properly and then choose a path knowingly fully aware that we might be making a mistake. Being open about the fact that we are going blind into this is probably make more sense instead of some serious sounding untested-unvetted legal text and checkboxes. Because really nobody knows when we will correct course, if ever.
There may be such a legal text - but I don't think that's what I was proposing. Again, this isn't about enforcement - it's about ethics - as it always has been. We stated that we didn't accept GPL code, or code derived from GPL, and we took our contributors word that they had taken our request seriously.
It really doesn't seem sensible to choose a policy that is obviously dangerous for copyright, and wait until it becomes obvious that we have damaged copyright. Rather it seems more sensible to choose an option that is less dangerous for copyright, and wait to see, as the tools develop, whether we need to re-evaluate. It really doesn't seem likely to me that the policy would stay in place long after it was causing the project harm.
So we can
1- "Stallman" it, with "no AI allowed" stance, while having absolutely no way of knowing how the code is generated. So it is a stance based on principles. I don't have a problem with it, and can accept it. It is a viable and respectable choice. The downside is we will be forcing people to lie. Because they will use it and we will not notice it until it is very late.
I just don't think this is true - I strongly suspect that people who are attracted to open-source, and the open-source community, will not generally lie about how they made their contributions - any more than we have seen attempts to put GPL code into our BSD codebases.
Bear in mind that the "until it is very late" problem is the one that will happen much more quickly with a more permissive policy.
2- or find a sentence that is pragmatic enough; something like "Even if you used LLMs, you should be able to explain the changes yourself. LLM based PRs are held to heightened levels of scrutiny and lower levels of patience" or something offered in this thread. I can also accept this, it is also a viable option. The downside is that it will make us more hostile, as Sebastian mentioned, and paranoid. Occasionally, it will make us accuse innocent folks for using LLMs.
Could you comment on the option that I was proposing - which is that anyone generating code with AI should justify the copyright risk, with relevant research as necessary?
Once we can choose this, then we can add agent markdowns, boilerplate responses and other details. But it seems like we got stuck at this choice level in our last attempts for a policy alignment. I would be much happier if we can be a bit more explicit and forthcoming about what we are doing and not make it an in vitro Open Source problem. We don't need to use strong words like stealing etc. obviously since there is no legal basis for it.
I didn't use those words - but in any case - as I've said several times, in several places, the legal argument is more or less irrelevant to us - our question is whether we are honoring the spirit of the copyrights put on other people's code, not whether they could, with sufficient resources, successfully sue us for infringement.
Cheers,
Matthew
On Fri, Feb 13, 2026 at 7:08 AM Ilhan Polat via NumPy-Discussion < numpy-discussion@python.org> wrote:
Also I'd like to be on record with the unpleasant part out loud. I have been in many discussions also at work and in OSS circles so I have quite a bit of debate ammo accumulated from both sides. Let me jump into it without the fluff to save space;
Currently, LLMs are getting really good at what they are tasked to do. If you put in the work (just like you would when you are the one writing the code), the output is quite acceptable and I feel like I'm reviewing somebody else's Pull request. Fix "this" part, change "that" part and done. If folks can't use these tools, it's a "they" problem. I just used it to translate entire LAPACK to C11 (why, mostly for the lolz, don't ask, it's a disease), ported all the tests and passing, now polishing it up. I mean look at this silly thing
[image: image.png]
No way in hell, I'd type this much code myself. And it is a 1-to-1 mechanical translation, no creativity involved except hacking into PyData theme because I always wanted to tweak it. Now who owns the copyright; Dennis Ritchie or LAPACK folks, or is it the entire C codebase of the world that trained this machine to write this mechanical code, or is it me who paid for it and worked with it etc.? The source of the algorithm is BSD3, would you be using this if this was available in BSD3 (I mean it will be obviously very soon).
As a comparison, the entire SciPy Fortran codebase, ~85,000 SLOC, took me 2 years and 7 months to translate manually. Entire LAPACK codebase 300,000 SLOC (just the functions) and including the testing, documentation etc. took me exactly 1 month and 19 days (Claude Pro something MAX level subscription with ~200€ per month from my own pocket). The agent still fails spectacularly if you let it run free, but I do put in the work to do a proper code review, tweak rules, then force it to read the rules periodically, (and most importantly, I know what I am looking at) so this went fairly well. It still took insane amount of time to bring the agent back on track. force explicit testing, Not to use C++ practices on C code so on.
At this point, I can confirm that "Agents can do this much but they cannot do that much" is rapidly becoming a "God of the Gaps" argument with every new version release LLMs chasing a receding horizon, not towards intelligence, but precision at parsing and following orders.
However, in my opinion, our dilemma is not a whether their output is potentially GPL'd/copyrighted code or not. Every bit of output of these tools is stolen by being trained on copyrighted data. For the folks who did not see it, there is a screenshot of VS Code offering me a comment at the beginning of the file from a company that does not apparently have any public repositories https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c...
Therefore, we are, in fact, trying to guess, whether it looks like a copyrighted code after the fact, ignoring where the code is pulled from. These companies pretty much stole everything; music, science articles, code (not just GPLd code, but private repositories), this, that, everything. Their practices were/are seriously unethical. It is not a political statement but facts. However, it seems like they are getting away with it, incredibly, even after they admitted it multiple times all the way at the CEO level (in particular, recently, SUNO CEO is pretty bullish, even defending why this stealing is fair use while individuals are rapidly being prosecuted for the same actions, not to mention Sci-Hub). And some of us are working for these companies or working for in the secondary circles.
Funnily enough, we are tasked with this mordant task of trying to come up with a stance on LLM usage. I claim that we should not be spending too much time on the epistemological aspects of LLM usage. I can't see any way other than being utilitarian about it. Because PRs keep coming and maintainers are also using it. So when stuck between a rock and a hardware, I think we should be admitting these properly and then choose a path knowingly fully aware that we might be making a mistake. Being open about the fact that we are going blind into this is probably make more sense instead of some serious sounding untested-unvetted legal text and checkboxes. Because really nobody knows when we will correct course, if ever.
So we can
1- "Stallman" it, with "no AI allowed" stance, while having absolutely no way of knowing how the code is generated. So it is a stance based on principles. I don't have a problem with it, and can accept it. It is a viable and respectable choice. The downside is we will be forcing people to lie. Because they will use it and we will not notice it until it is very late. 2- or find a sentence that is pragmatic enough; something like "Even if you used LLMs, you should be able to explain the changes yourself. LLM based PRs are held to heightened levels of scrutiny and lower levels of patience" or something offered in this thread. I can also accept this, it is also a viable option. The downside is that it will make us more hostile, as Sebastian mentioned, and paranoid. Occasionally, it will make us accuse innocent folks for using LLMs.
Once we can choose this, then we can add agent markdowns, boilerplate responses and other details. But it seems like we got stuck at this choice level in our last attempts for a policy alignment. I would be much happier if we can be a bit more explicit and forthcoming about what we are doing and not make it an in vitro Open Source problem. We don't need to use strong words like stealing etc. obviously since there is no legal basis for it. But we all know what happened so there are much softer versions of saying the same thing. I just did not spend the time to make these proper ala Pascal, and it's my lack of manners leaking out though I strongly believe that they stole everything.
I am fully aware that this might not be everyone's take (or anyone for that matter), so please take it as a rather brazen take though I hope the message gets across.
Very weird times indeed.
ilhan
I suspect there will be changes in the understanding/use of "copyright." What they will be, I don't know, but copyright itself is fairly recent. It is also the case that thirty years ago you could buy cheap, unlicensed versions of most software in Hong Kong, and copyrighted texts have been produced in cheap versions in some parts of the world, so these sorts of problems are not a completely new experience. Back in the late 1800s to early 1900s, there were patent fights in the Federal Courts involving electric lights, telephones, and aviation. But wartime need prevailed: "The disputes contributed to a 1917 government-brokered patent pool during WWI to end litigation and support aircraft production." Copyright was also suspended for German texts in WWII, I have some republished works on my shelves. The use of AI will soon become a national interest, if it isn't already. We are small players in a much bigger event. Chuck
Hi, On Fri, Feb 13, 2026 at 5:03 PM Charles R Harris via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Fri, Feb 13, 2026 at 7:08 AM Ilhan Polat via NumPy-Discussion <numpy-discussion@python.org> wrote:
Also I'd like to be on record with the unpleasant part out loud. I have been in many discussions also at work and in OSS circles so I have quite a bit of debate ammo accumulated from both sides. Let me jump into it without the fluff to save space;
Currently, LLMs are getting really good at what they are tasked to do. If you put in the work (just like you would when you are the one writing the code), the output is quite acceptable and I feel like I'm reviewing somebody else's Pull request. Fix "this" part, change "that" part and done. If folks can't use these tools, it's a "they" problem. I just used it to translate entire LAPACK to C11 (why, mostly for the lolz, don't ask, it's a disease), ported all the tests and passing, now polishing it up. I mean look at this silly thing
No way in hell, I'd type this much code myself. And it is a 1-to-1 mechanical translation, no creativity involved except hacking into PyData theme because I always wanted to tweak it. Now who owns the copyright; Dennis Ritchie or LAPACK folks, or is it the entire C codebase of the world that trained this machine to write this mechanical code, or is it me who paid for it and worked with it etc.? The source of the algorithm is BSD3, would you be using this if this was available in BSD3 (I mean it will be obviously very soon).
As a comparison, the entire SciPy Fortran codebase, ~85,000 SLOC, took me 2 years and 7 months to translate manually. Entire LAPACK codebase 300,000 SLOC (just the functions) and including the testing, documentation etc. took me exactly 1 month and 19 days (Claude Pro something MAX level subscription with ~200€ per month from my own pocket). The agent still fails spectacularly if you let it run free, but I do put in the work to do a proper code review, tweak rules, then force it to read the rules periodically, (and most importantly, I know what I am looking at) so this went fairly well. It still took insane amount of time to bring the agent back on track. force explicit testing, Not to use C++ practices on C code so on.
At this point, I can confirm that "Agents can do this much but they cannot do that much" is rapidly becoming a "God of the Gaps" argument with every new version release LLMs chasing a receding horizon, not towards intelligence, but precision at parsing and following orders.
However, in my opinion, our dilemma is not a whether their output is potentially GPL'd/copyrighted code or not. Every bit of output of these tools is stolen by being trained on copyrighted data. For the folks who did not see it, there is a screenshot of VS Code offering me a comment at the beginning of the file from a company that does not apparently have any public repositories https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c...
Therefore, we are, in fact, trying to guess, whether it looks like a copyrighted code after the fact, ignoring where the code is pulled from. These companies pretty much stole everything; music, science articles, code (not just GPLd code, but private repositories), this, that, everything. Their practices were/are seriously unethical. It is not a political statement but facts. However, it seems like they are getting away with it, incredibly, even after they admitted it multiple times all the way at the CEO level (in particular, recently, SUNO CEO is pretty bullish, even defending why this stealing is fair use while individuals are rapidly being prosecuted for the same actions, not to mention Sci-Hub). And some of us are working for these companies or working for in the secondary circles.
Funnily enough, we are tasked with this mordant task of trying to come up with a stance on LLM usage. I claim that we should not be spending too much time on the epistemological aspects of LLM usage. I can't see any way other than being utilitarian about it. Because PRs keep coming and maintainers are also using it. So when stuck between a rock and a hardware, I think we should be admitting these properly and then choose a path knowingly fully aware that we might be making a mistake. Being open about the fact that we are going blind into this is probably make more sense instead of some serious sounding untested-unvetted legal text and checkboxes. Because really nobody knows when we will correct course, if ever.
So we can
1- "Stallman" it, with "no AI allowed" stance, while having absolutely no way of knowing how the code is generated. So it is a stance based on principles. I don't have a problem with it, and can accept it. It is a viable and respectable choice. The downside is we will be forcing people to lie. Because they will use it and we will not notice it until it is very late. 2- or find a sentence that is pragmatic enough; something like "Even if you used LLMs, you should be able to explain the changes yourself. LLM based PRs are held to heightened levels of scrutiny and lower levels of patience" or something offered in this thread. I can also accept this, it is also a viable option. The downside is that it will make us more hostile, as Sebastian mentioned, and paranoid. Occasionally, it will make us accuse innocent folks for using LLMs.
Once we can choose this, then we can add agent markdowns, boilerplate responses and other details. But it seems like we got stuck at this choice level in our last attempts for a policy alignment. I would be much happier if we can be a bit more explicit and forthcoming about what we are doing and not make it an in vitro Open Source problem. We don't need to use strong words like stealing etc. obviously since there is no legal basis for it. But we all know what happened so there are much softer versions of saying the same thing. I just did not spend the time to make these proper ala Pascal, and it's my lack of manners leaking out though I strongly believe that they stole everything.
I am fully aware that this might not be everyone's take (or anyone for that matter), so please take it as a rather brazen take though I hope the message gets across.
Very weird times indeed.
ilhan
I suspect there will be changes in the understanding/use of "copyright." What they will be, I don't know, but copyright itself is fairly recent. It is also the case that thirty years ago you could buy cheap, unlicensed versions of most software in Hong Kong, and copyrighted texts have been produced in cheap versions in some parts of the world, so these sorts of problems are not a completely new experience.
Back in the late 1800s to early 1900s, there were patent fights in the Federal Courts involving electric lights, telephones, and aviation. But wartime need prevailed: "The disputes contributed to a 1917 government-brokered patent pool during WWI to end litigation and support aircraft production." Copyright was also suspended for German texts in WWII, I have some republished works on my shelves.
The use of AI will soon become a national interest, if it isn't already. We are small players in a much bigger event.
Yes, that's right. The way I've heard it discussed, by David Sacks, Trumps "AI and crypto tsar" (https://en.wikipedia.org/wiki/David_Sacks) is roughly that if we (the USA) don't make it possible for AI models to digest and possibly reproduce copyright material, the Chinese will, and then the USA will lose the "AI race", which would be bad. So it might well be that the current administration tries to undermine copyright, for that reason. And I suppose they will do that by making copyright hard to enforce legally. But that doesn't require us to void copyright - as I keep saying - it's an ethical issue more than a legal one. We can still choose to respect the wishes of the author, even if (for example) the USA has made it impossible to enforce those wishes legally. Cheers, Matthew
On Fri, Feb 13, 2026 at 11:16 AM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
On Fri, Feb 13, 2026 at 5:03 PM Charles R Harris via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Fri, Feb 13, 2026 at 7:08 AM Ilhan Polat via NumPy-Discussion <
Also I'd like to be on record with the unpleasant part out loud. I have
been in many discussions also at work and in OSS circles so I have quite a bit of debate ammo accumulated from both sides. Let me jump into it without
Currently, LLMs are getting really good at what they are tasked to do.
If you put in the work (just like you would when you are the one writing
No way in hell, I'd type this much code myself. And it is a 1-to-1
mechanical translation, no creativity involved except hacking into PyData
As a comparison, the entire SciPy Fortran codebase, ~85,000 SLOC, took
me 2 years and 7 months to translate manually. Entire LAPACK codebase 300,000 SLOC (just the functions) and including the testing, documentation etc. took me exactly 1 month and 19 days (Claude Pro something MAX level subscription with ~200€ per month from my own pocket). The agent still fails spectacularly if you let it run free, but I do put in the work to do a proper code review, tweak rules, then force it to read the rules
At this point, I can confirm that "Agents can do this much but they
cannot do that much" is rapidly becoming a "God of the Gaps" argument with every new version release LLMs chasing a receding horizon, not towards intelligence, but precision at parsing and following orders.
However, in my opinion, our dilemma is not a whether their output is
Therefore, we are, in fact, trying to guess, whether it looks like a
copyrighted code after the fact, ignoring where the code is pulled from. These companies pretty much stole everything; music, science articles, code (not just GPLd code, but private repositories), this, that, everything. Their practices were/are seriously unethical. It is not a political statement but facts. However, it seems like they are getting away with it, incredibly, even after they admitted it multiple times all the way at the CEO level (in particular, recently, SUNO CEO is pretty bullish, even defending why this stealing is fair use while individuals are rapidly being
Funnily enough, we are tasked with this mordant task of trying to come
up with a stance on LLM usage. I claim that we should not be spending too much time on the epistemological aspects of LLM usage. I can't see any way other than being utilitarian about it. Because PRs keep coming and
So we can
1- "Stallman" it, with "no AI allowed" stance, while having absolutely
no way of knowing how the code is generated. So it is a stance based on
2- or find a sentence that is pragmatic enough; something like "Even if you used LLMs, you should be able to explain the changes yourself. LLM based PRs are held to heightened levels of scrutiny and lower levels of patience" or something offered in this thread. I can also accept this, it is also a viable option. The downside is
Once we can choose this, then we can add agent markdowns, boilerplate
responses and other details. But it seems like we got stuck at this choice level in our last attempts for a policy alignment. I would be much happier if we can be a bit more explicit and forthcoming about what we are doing and not make it an in vitro Open Source problem. We don't need to use strong words like stealing etc. obviously since there is no legal basis for it. But we all know what happened so there are much softer versions of saying the same thing. I just did not spend the time to make these proper ala Pascal, and it's my lack of manners leaking out though I strongly believe that they stole everything.
I am fully aware that this might not be everyone's take (or anyone for
Very weird times indeed.
ilhan
I suspect there will be changes in the understanding/use of "copyright." What they will be, I don't know, but copyright itself is fairly recent. It is also the case that thirty years ago you could buy cheap, unlicensed versions of most software in Hong Kong, and copyrighted texts have been
numpy-discussion@python.org> wrote: the fluff to save space; the code), the output is quite acceptable and I feel like I'm reviewing somebody else's Pull request. Fix "this" part, change "that" part and done. If folks can't use these tools, it's a "they" problem. I just used it to translate entire LAPACK to C11 (why, mostly for the lolz, don't ask, it's a disease), ported all the tests and passing, now polishing it up. I mean look at this silly thing theme because I always wanted to tweak it. Now who owns the copyright; Dennis Ritchie or LAPACK folks, or is it the entire C codebase of the world that trained this machine to write this mechanical code, or is it me who paid for it and worked with it etc.? The source of the algorithm is BSD3, would you be using this if this was available in BSD3 (I mean it will be obviously very soon). periodically, (and most importantly, I know what I am looking at) so this went fairly well. It still took insane amount of time to bring the agent back on track. force explicit testing, Not to use C++ practices on C code so on. potentially GPL'd/copyrighted code or not. Every bit of output of these tools is stolen by being trained on copyrighted data. For the folks who did not see it, there is a screenshot of VS Code offering me a comment at the beginning of the file from a company that does not apparently have any public repositories https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c... prosecuted for the same actions, not to mention Sci-Hub). And some of us are working for these companies or working for in the secondary circles. maintainers are also using it. So when stuck between a rock and a hardware, I think we should be admitting these properly and then choose a path knowingly fully aware that we might be making a mistake. Being open about the fact that we are going blind into this is probably make more sense instead of some serious sounding untested-unvetted legal text and checkboxes. Because really nobody knows when we will correct course, if ever. principles. I don't have a problem with it, and can accept it. It is a viable and respectable choice. The downside is we will be forcing people to lie. Because they will use it and we will not notice it until it is very late. that it will make us more hostile, as Sebastian mentioned, and paranoid. Occasionally, it will make us accuse innocent folks for using LLMs. that matter), so please take it as a rather brazen take though I hope the message gets across. produced in cheap versions in some parts of the world, so these sorts of problems are not a completely new experience.
Back in the late 1800s to early 1900s, there were patent fights in the
Federal Courts involving electric lights, telephones, and aviation. But wartime need prevailed: "The disputes contributed to a 1917 government-brokered patent pool during WWI to end litigation and support aircraft production." Copyright was also suspended for German texts in WWII, I have some republished works on my shelves.
The use of AI will soon become a national interest, if it isn't already.
We are small players in a much bigger event.
Yes, that's right. The way I've heard it discussed, by David Sacks, Trumps "AI and crypto tsar" (https://en.wikipedia.org/wiki/David_Sacks) is roughly that if we (the USA) don't make it possible for AI models to digest and possibly reproduce copyright material, the Chinese will, and then the USA will lose the "AI race", which would be bad.
So it might well be that the current administration tries to undermine copyright, for that reason. And I suppose they will do that by making copyright hard to enforce legally. But that doesn't require us to void copyright - as I keep saying - it's an ethical issue more than a legal one. We can still choose to respect the wishes of the author, even if (for example) the USA has made it impossible to enforce those wishes legally.
Copyright has been adjusted many times, most recently for such things as photocopiers and home recording. My guess is that a combination of pooling and fair use will be the solution, possibly with an opt-out option. The current situation in the US is that AI regulation has been moved from the states to the federal government. Code should be free! OK, it's free. Wait, what! No, not like that. Chuck
Hi All, I think the discussion of ethics is important, but somewhat abstract and difficult to act on. Personally, I think the gravest danger is maintainer burn-out. The matplotlib "shame posting" referred to by Matthew earlier is a quite worrisome -- for those who haven't seen it, see https://github.com/matplotlib/matplotlib/pull/31132 https://github.com/matplotlib/matplotlib/pull/31138 I'd estimate that those particular AI-generated PRs have led to a direct loss of at least a day of maintainer time (spread over multiple maintainers), plus, if they are at all like me, a further indirect loss of hours to days of being upset and irritated. Over at astropy, we currently see a smaller effect of AI bots: we are being spammed by two "developers" whose github pages state "currently working on upskilling myself in the field of AI" (both identical phrasing). Even just asking to be assigned in lots of issues is annoying, as it leads to e-mails in my inbox for any issue I happened to comment on. Also, I've been misled into reviewing/commenting on one PR, spending more time than it would have been to fix things myself, only to find out that it was generated by a bot, not a person who might become a valuable contributor. It left me feeling violated, and I hate feeling forced to a mode where I only review PRs from people/handles I know, and avoid getting notified except when tagged (though a few days ago, I got tens of e-mails from an account that seems to be replaying astropy PRs, yielding pings for any with @mhvk in them). Anyway, to me the most pressing question is to devise some sort of policy that stops people or their bots with no true interest in making real contributions. Sadly, I do not have a good suggestion. But perhaps a start is to have some kind of tick box for any new contributor that asks to state that one is a human and asks to introduce oneself, how one found the problem, and to describe how the PR was created and, e.g., confirm that it complies with the AI policy (including copyright, etc.). Or, more strongly, require explicit permission to create PRs (following a similar questionnaire). A non-standard API might help... Of course, neither of these suggestions will not guard against blatant liers, but perhaps they at least reduce the problem. All the best, Marten
On Fri, Feb 13, 2026 at 1:44 PM Marten van Kerkwijk via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi All,
I think the discussion of ethics is important, but somewhat abstract and difficult to act on.
Personally, I think the gravest danger is maintainer burn-out. The matplotlib "shame posting" referred to by Matthew earlier is a quite worrisome -- for those who haven't seen it, see https://github.com/matplotlib/matplotlib/pull/31132 https://github.com/matplotlib/matplotlib/pull/31138
I'd estimate that those particular AI-generated PRs have led to a direct loss of at least a day of maintainer time (spread over multiple maintainers), plus, if they are at all like me, a further indirect loss of hours to days of being upset and irritated.
Over at astropy, we currently see a smaller effect of AI bots: we are being spammed by two "developers" whose github pages state "currently working on upskilling myself in the field of AI" (both identical phrasing). Even just asking to be assigned in lots of issues is annoying, as it leads to e-mails in my inbox for any issue I happened to comment on.
I think NumPy has seen one of those. Reminds me of PRs from India when the universities begin classes :)
Also, I've been misled into reviewing/commenting on one PR, spending more time than it would have been to fix things myself, only to find out that it was generated by a bot, not a person who might become a valuable contributor. It left me feeling violated, and I hate feeling forced to a mode where I only review PRs from people/handles I know, and avoid getting notified except when tagged (though a few days ago, I got tens of e-mails from an account that seems to be replaying astropy PRs, yielding pings for any with @mhvk in them).
Anyway, to me the most pressing question is to devise some sort of policy that stops people or their bots with no true interest in making real contributions.
Sadly, I do not have a good suggestion. But perhaps a start is to have some kind of tick box for any new contributor that asks to state that one is a human and asks to introduce oneself, how one found the problem, and to describe how the PR was created and, e.g., confirm that it complies with the AI policy (including copyright, etc.). Or, more strongly, require explicit permission to create PRs (following a similar questionnaire). A non-standard API might help...
Of course, neither of these suggestions will not guard against blatant liers, but perhaps they at least reduce the problem.
All the best,
Marten
I'm thinking we need to start banning repeat offenders. Long term, we will probably need AI help, there are already developments in that direction. Another possibility being explored by GitHub is more granular permissions. This would likely be along the lines of reviewed committers receiving PR/comment privileges. New committers get reviewed and maybe blocked. It would be something like we have for the mailing list. Every morning I check held posts to the mailing list. It is a chore, but doesn't take much time. One problem might be handling appeals. Maybe something like three strikes and you are out. Chuck
On Fri, Feb 13, 2026 at 3:46 PM Marten van Kerkwijk via NumPy-Discussion < numpy-discussion@python.org> wrote:
Sadly, I do not have a good suggestion. But perhaps a start is to have some kind of tick box for any new contributor that asks to state that one is a human and asks to introduce oneself, how one found the problem, and to describe how the PR was created and, e.g., confirm that it complies with the AI policy (including copyright, etc.). Or, more strongly, require explicit permission to create PRs (following a similar questionnaire). A non-standard API might help...
The Ghostty project has released tooling for that kind of thing: https://github.com/mitchellh/vouch -- Robert Kern
On Fri, Feb 13, 2026 at 12:02 PM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Ralf pointed out one benefit - that we are not seen to disapprove of the chosen workflows of our fellow developers. I think this is a weak argument.
Please, read and reason more carefully. That was right below a principle "honor copyright". One does not invalidate or contradict the other.
So, accepting large AI-generated PRs would be a significant threat to copyright
You have a good point in your arguments about copyright somewhere, but you're making it very poorly, verbosely, and with too much confidence when using words like "obviously". It's easy to come up again with examples for why a "large AI-generated PR" isn't copyrightable. E.g., filling holes in test coverage, say for nan's, empty arrays, or noncontiguous arrays. Such an effort involves tedious boilerplate tests with no copyrightable content, and we'd happily outsource that to a tool, and may be thousands of lines of code. For large PRs with intellectually stimulating and copyrightable code, there is also a gray zone where the human does most of the thinking, outlines the solution while stubbing out a lot of details, and then lets a tool fill in the details. That might all be fine too - it depends. The thing to be done here is to find understandable and pragmatic wording for a policy that discourages and lets us reject the undesirable usage of AI tools, while not hindering valid usage. Being overly broad and moralizing with inactionable wording isn't helpful. Cheers, Ralf
Hi Ralf, I think you're playing the ball and not the man, and apart from that being unpleasant for me, it's bad for the discussion. If we are not careful, people will be discouraged from posting for fear of personal attack. That said - I do apologise for using "obviously" - thanks for pointing that out - it was rude of me, and I will try to be more careful. On Sat, Feb 14, 2026 at 9:05 AM Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Fri, Feb 13, 2026 at 12:02 PM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Ralf pointed out one benefit - that we are not seen to disapprove of the chosen workflows of our fellow developers. I think this is a weak argument.
Please, read and reason more carefully. That was right below a principle "honor copyright". One does not invalidate or contradict the other.
So, accepting large AI-generated PRs would be a significant threat to copyright
You have a good point in your arguments about copyright somewhere, but you're making it very poorly, verbosely, and with too much confidence when using words like "obviously". It's easy to come up again with examples for why a "large AI-generated PR" isn't copyrightable. E.g., filling holes in test coverage, say for nan's, empty arrays, or noncontiguous arrays. Such an effort involves tedious boilerplate tests with no copyrightable content, and we'd happily outsource that to a tool, and may be thousands of lines of code.
For large PRs with intellectually stimulating and copyrightable code, there is also a gray zone where the human does most of the thinking, outlines the solution while stubbing out a lot of details, and then lets a tool fill in the details. That might all be fine too - it depends.
I don't know why you would think that I hadn't understood that there were nuanced arguments to be made for the acceptability of any particular piece of AI-generated code. Did you not read my reply to Ilhan, for example?
The thing to be done here is to find understandable and pragmatic wording for a policy that discourages and lets us reject the undesirable usage of AI tools, while not hindering valid usage. Being overly broad and moralizing with inactionable wording isn't helpful.
I think you're confusing the statement of ethical principles, with being overly broad and inactionable. I don't know what "moralizing" means, but I'm assuming that it can't reasonably be applied to the statement "We have an ethical responsibility to uphold copyright". If we accept that statement, then we can have specific discussions about what actions we need to take, and that's what I was doing - you must have seen the specific proposal that I made. Cheers, Matthew
On Sat, 2026-02-14 at 10:10 +0000, Matthew Brett via NumPy-Discussion wrote:
Hi Ralf,
I think you're playing the ball and not the man, and apart from that being unpleasant for me, it's bad for the discussion. If we are not careful, people will be discouraged from posting for fear of personal attack.
That said - I do apologise for using "obviously" - thanks for pointing that out - it was rude of me, and I will try to be more careful.
On Sat, Feb 14, 2026 at 9:05 AM Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Fri, Feb 13, 2026 at 12:02 PM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Ralf pointed out one benefit - that we are not seen to disapprove of the chosen workflows of our fellow developers. I think this is a weak argument.
Please, read and reason more carefully. That was right below a principle "honor copyright". One does not invalidate or contradict the other.
So, accepting large AI-generated PRs would be a significant threat to copyright
You have a good point in your arguments about copyright somewhere, but you're making it very poorly, verbosely, and with too much confidence when using words like "obviously". It's easy to come up again with examples for why a "large AI-generated PR" isn't copyrightable. E.g., filling holes in test coverage, say for nan's, empty arrays, or noncontiguous arrays. Such an effort involves tedious boilerplate tests with no copyrightable content, and we'd happily outsource that to a tool, and may be thousands of lines of code.
For large PRs with intellectually stimulating and copyrightable code, there is also a gray zone where the human does most of the thinking, outlines the solution while stubbing out a lot of details, and then lets a tool fill in the details. That might all be fine too - it depends.
I don't know why you would think that I hadn't understood that there were nuanced arguments to be made for the acceptability of any particular piece of AI-generated code. Did you not read my reply to Ilhan, for example?
The thing to be done here is to find understandable and pragmatic wording for a policy that discourages and lets us reject the undesirable usage of AI tools, while not hindering valid usage. Being overly broad and moralizing with inactionable wording isn't helpful.
I think you're confusing the statement of ethical principles, with being overly broad and inactionable. I don't know what "moralizing" means, but I'm assuming that it can't reasonably be applied to the statement "We have an ethical responsibility to uphold copyright". If we accept that statement, then we can have specific discussions about what actions we need to take, and that's what I was doing - you must have seen the specific proposal that I made.
Yes, I think we have seen it (and while I might want to tone it down a bit, I think it is reasonable). But I'll admit that I also read it with a such a focus on the principle that it felt hard to make the transition to discussing the nuance. And part of that is probably that we agree that there is a potential legal/moral minefield here and the disagreement we have isn't that we shouldn't discuss a map, it's that we don't think a scary sign is useful. The other thing is that I think that I/we read it as a focus on these fundamental problems and that makes me feel it is a bit about us making a basically political/moral statement at large. I don't want to do that. In part because to me the moral/legal implication is much fuzzier than it seems at first (as Chuck mentioned the scope and duration of copyright changed a lot in the past! Presumably both legally and morally.). But also, I am not even sure that issues around AI and copyright are politically/morally what we as open source should emphasize. And even if they are, I would not be sure what direction I think we should throw our weight. - Sebastian
Cheers,
Matthew _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI-generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either. -- Robert Kern
On Fri, 13 Feb 2026 at 23:51, Robert Kern via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI-generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
-- Robert Kern _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: matti.picus@gmail.com
Here is an example right now in NumPy. Apparently someone is deep diving into performance edge cases. They (most likely with the help of ai or totally by ai) submitted a three line PR https://github.com/numpy/numpy/pull/30810 to speed up np.array_equal. Now the same author submitted a much bigger PR to speed up np.isin https://github.com/numpy/numpy/pull/30828. Is the work the product of ai? Yes, but the author claims to have verified the code. Is the author ai or not? Should we proceed with the PR? Matti
On Sat, 2026-02-14 at 08:35 +0200, matti picus via NumPy-Discussion wrote:
On Fri, 13 Feb 2026 at 23:51, Robert Kern via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI- generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
-- Robert Kern _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: matti.picus@gmail.com
Here is an example right now in NumPy. Apparently someone is deep diving into performance edge cases. They (most likely with the help of ai or totally by ai) submitted a three line PR https://github.com/numpy/numpy/pull/30810 to speed up np.array_equal. Now the same author submitted a much bigger PR to speed up np.isin https://github.com/numpy/numpy/pull/30828. Is the work the product of ai? Yes, but the author claims to have verified the code. Is the author ai or not? Should we proceed with the PR?
I suppose my opinion for now is: If you as a maintainer care/want to. And that part I would be happy to put into a policy if need be (which can/should mention more things!). The below got much longer... need to read: What Marten said ;). In practice the issue I have with this type of PR isn't much about copyright or that it is possible that almost all the work was using an AI. (Not because copyright it isn't an issue, I just don't think there were PRs of a kind where I would be seriously worried about it.) It is really about the social dynamic and if a policy can help with that, I am all for it. Before, we had at least one of three intrinsically motivating reasons to look at a PR/issue: * We knew the submitter cares about seeing the feature (i.e. the result not for contributions sake). * It is just for contributions sake, but we are investing in community. I.e. we like helping! * Or I happen to care about it myself. (That could be scratching my my own itch or thinking it is important for the project.) With the old waves of PRs from students, hacktoberfest, ... you pick one. We had the community investment/interaction point applying in some form and adding some motivation. With the current wave I think an issue is that more often it leaves the maintainer without _any_ of those motivational points applying -- I am not even sure that the wave is bigger yet (but it is probably more a swelling). This actually started with issues, I think? My feeling is we have more tiny issues (CuPy is a better example than NumPy here). Issues that seem like some tool found them. They are often long and verbose and at the end maybe a PR even gets merged, but at the end I can't help but think: Well, we just fixed an issue that possibly zero people in the world care about seeing fixed! Don't get me wrong, they are real issues and PRs! I like having extra context for motivation [1], and I think we may need to manage them more (and that may be putting up a policy to discourage or point to when closing). - Sebastian [1] Also if a human creates an issue, I think it is nice to have the note on "this crashed my hour long job". vs. "my funny advent of code solution started failing" (a real regression btw. that I closed because it was a caused by a fix.)
Matti _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
Hi Sebastian, On Sat, Feb 14, 2026 at 9:00 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Sat, 2026-02-14 at 08:35 +0200, matti picus via NumPy-Discussion wrote:
On Fri, 13 Feb 2026 at 23:51, Robert Kern via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI- generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
-- Robert Kern _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: matti.picus@gmail.com
Here is an example right now in NumPy. Apparently someone is deep diving into performance edge cases. They (most likely with the help of ai or totally by ai) submitted a three line PR https://github.com/numpy/numpy/pull/30810 to speed up np.array_equal. Now the same author submitted a much bigger PR to speed up np.isin https://github.com/numpy/numpy/pull/30828. Is the work the product of ai? Yes, but the author claims to have verified the code. Is the author ai or not? Should we proceed with the PR?
I suppose my opinion for now is: If you as a maintainer care/want to. And that part I would be happy to put into a policy if need be (which can/should mention more things!).
The below got much longer... need to read: What Marten said ;).
In practice the issue I have with this type of PR isn't much about copyright or that it is possible that almost all the work was using an AI. (Not because copyright it isn't an issue, I just don't think there were PRs of a kind where I would be seriously worried about it.)
It is really about the social dynamic and if a policy can help with that, I am all for it. Before, we had at least one of three intrinsically motivating reasons to look at a PR/issue: * We knew the submitter cares about seeing the feature (i.e. the result not for contributions sake). * It is just for contributions sake, but we are investing in community. I.e. we like helping! * Or I happen to care about it myself. (That could be scratching my my own itch or thinking it is important for the project.)
With the old waves of PRs from students, hacktoberfest, ... you pick one. We had the community investment/interaction point applying in some form and adding some motivation. With the current wave I think an issue is that more often it leaves the maintainer without _any_ of those motivational points applying -- I am not even sure that the wave is bigger yet (but it is probably more a swelling).
This actually started with issues, I think? My feeling is we have more tiny issues (CuPy is a better example than NumPy here). Issues that seem like some tool found them. They are often long and verbose and at the end maybe a PR even gets merged, but at the end I can't help but think: Well, we just fixed an issue that possibly zero people in the world care about seeing fixed!
Don't get me wrong, they are real issues and PRs! I like having extra context for motivation [1], and I think we may need to manage them more (and that may be putting up a policy to discourage or point to when closing).
I could not agree more - and Stefan's blog post makes a similar - and very good - argument, Cheers, Matthew
On 2026-02-13 22:35, matti picus via NumPy-Discussion wrote:...
Here is an example right now in NumPy. Apparently someone is deep diving into performance edge cases. They (most likely with the help of ai or totally by ai) submitted a three line PR https://github.com/numpy/numpy/pull/30810 to speed up np.array_equal. Now the same author submitted a much bigger PR to speed up np.isin https://github.com/numpy/numpy/pull/30828. Is the work the product of ai? Yes, but the author claims to have verified the code. Is the author ai or not? Should we proceed with the PR? Matti
The interactions in the 2nd PR have the flat agreeableness of an LLM. mrdope, are you hear and clear? :-) Bill
Hi, On Fri, Feb 13, 2026 at 9:45 PM Robert Kern <robert.kern@gmail.com> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI-generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
Yes, that's a reasonable request. But how do you think I should proceed? Make an issue on Numpy, and start drafting? Start another email thread? Or a Discourse / Scientific Python thread? Cheers, Matthew
On Sat, Feb 14, 2026 at 12:17 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Fri, Feb 13, 2026 at 9:45 PM Robert Kern <robert.kern@gmail.com> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion <
Just to clarify - in case it wasn't clear, what I'm floating as a
Please specify one of these:
1) I wrote this code myself, without looking at significant
AI-generated code OR
2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that
numpy-discussion@python.org> wrote: proposal, would be something like this, as a message to PR authors: the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would
accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
Yes, that's a reasonable request. But how do you think I should proceed? Make an issue on Numpy, and start drafting? Start another email thread? Or a Discourse / Scientific Python thread?
Just here should be fine. Take an existing PR that has copyrightable content (e.g. an entire new function or three, each more than ~10 lines, not just many one-line updates scattered around; the most interesting ones would be those that implement a known algorithm). Do the code search that would satisfy you. Write out here what you would want a PR author to provide. -- Robert Kern
Write out here what you would want a PR author to provide.
Please let me honk in from philosophy/science/psychology to reiterate, "The prompts, Boss, the prompts!" Don't you think that some sample at least would add some conviction that a human was involved? Case in point is mrdope, who I imagine could be a person talking through an LLM out of shyness, overproductivity syndrome, or as an experiment. Per my comment:
https://github.com/numpy/numpy/pull/30828 [1]. Is the work the product of ai? Yes, but the author claims to have verified the code. Is the author ai or not? Should we proceed with the PR? Matti
The interactions ... have the flat agreeableness of an LLM. mrdope, are you hear and clear? :-)
Bill -- https://github.com/phobrain/phobrain "Don't dope me, I'm just the racehorse here." On 2026-02-14 09:38, Robert Kern via NumPy-Discussion wrote:
On Sat, Feb 14, 2026 at 12:17 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Fri, Feb 13, 2026 at 9:45 PM Robert Kern <robert.kern@gmail.com> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI-generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
Yes, that's a reasonable request. But how do you think I should proceed? Make an issue on Numpy, and start drafting? Start another email thread? Or a Discourse / Scientific Python thread?
Just here should be fine. Take an existing PR that has copyrightable content (e.g. an entire new function or three, each more than ~10 lines, not just many one-line updates scattered around; the most interesting ones would be those that implement a known algorithm). Do the code search that would satisfy you. Write out here what you would want a PR author to provide. -- Robert Kern _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: bross_phobrain@sonic.net
Links: ------ [1] https://github.com/numpy/numpy/pull/30828
One aspect we need to keep in mind is that LLMs are lazy and sneaky. If a human does some performance work, and shows a benchmark table full of numbers, I would generally believe that they have actually run their code. But chatbots have been known to generate example data and pass it as real. And being so agreeable, they are likely to show an improvement. This is something that could bite us even if every human is acting in good faith. I don't have a good way to solve this, but maybe asking people to disclose exactly what validation has been done by humans would help, along with guidelines on how to do it. That wouldn't help with bad faith actors, but it's the best I can think of. /David On Sat, 14 Feb 2026, 20:18 Bill Ross, <bross_phobrain@sonic.net> wrote:
Write out here what you would want a PR author to provide. Please let me honk in from philosophy/science/psychology to reiterate,
"The prompts, Boss, the prompts!"
Don't you think that some sample at least would add some conviction that a human was involved?
Case in point is mrdope, who I imagine could be a person talking through an LLM out of shyness, overproductivity syndrome, or as an experiment. Per my comment:
https://github.com/numpy/numpy/pull/30828 <https://github.com/numpy/numpy/pull/30828>. Is the work the product of ai? Yes, but the author claims to have verified the code. Is the author ai or not? Should we proceed with the PR? Matti
The interactions ... have the flat agreeableness of an LLM. mrdope, are you hear and clear? :-)
Bill
--
https://github.com/phobrain/phobrain
"Don't dope me, I'm just the racehorse here."
On 2026-02-14 09:38, Robert Kern via NumPy-Discussion wrote:
On Sat, Feb 14, 2026 at 12:17 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Fri, Feb 13, 2026 at 9:45 PM Robert Kern <robert.kern@gmail.com> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion <
Just to clarify - in case it wasn't clear, what I'm floating as a
Please specify one of these:
1) I wrote this code myself, without looking at significant
AI-generated code OR
2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that
numpy-discussion@python.org> wrote: proposal, would be something like this, as a message to PR authors: the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would
accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
Yes, that's a reasonable request. But how do you think I should proceed? Make an issue on Numpy, and start drafting? Start another email thread? Or a Discourse / Scientific Python thread?
Just here should be fine. Take an existing PR that has copyrightable content (e.g. an entire new function or three, each more than ~10 lines, not just many one-line updates scattered around; the most interesting ones would be those that implement a known algorithm). Do the code search that would satisfy you. Write out here what you would want a PR author to provide.
-- Robert Kern
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: bross_phobrain@sonic.net
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: davidmenhur@gmail.com
Hi, On Sat, Feb 14, 2026 at 5:38 PM Robert Kern <robert.kern@gmail.com> wrote:
On Sat, Feb 14, 2026 at 12:17 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Fri, Feb 13, 2026 at 9:45 PM Robert Kern <robert.kern@gmail.com> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI-generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
Yes, that's a reasonable request. But how do you think I should proceed? Make an issue on Numpy, and start drafting? Start another email thread? Or a Discourse / Scientific Python thread?
Just here should be fine. Take an existing PR that has copyrightable content (e.g. an entire new function or three, each more than ~10 lines, not just many one-line updates scattered around; the most interesting ones would be those that implement a known algorithm). Do the code search that would satisfy you. Write out here what you would want a PR author to provide.
I'd suggested (off-list) that this might be better done in another thread - but perhaps it can be done here. Reflecting, and experimenting - there are many caveats, but I think it is reasonable to give the contributor some responsibility for formal care about copyright. One way of doing that - is to ask some AI (if possible, an AI other than the one generating the code) to review for copyright. I've experimented with that over at https://github.com/numpy/numpy/pull/30828#issuecomment-3920553882 . But the idea would be that we ask a contributor who has generated code by AI, to do this as part of the PR sign-off. They should be in a much better position to do this than the maintainers, as they should have been exploring the problem themselves, and therefore should be able to write better queries to guide the AI review. And with the prompts as a start, it's not particularly time-consuming. Cheers, Matthew
On Wed, Feb 18, 2026 at 9:16 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Sat, Feb 14, 2026 at 5:38 PM Robert Kern <robert.kern@gmail.com> wrote:
On Sat, Feb 14, 2026 at 12:17 PM Matthew Brett <matthew.brett@gmail.com>
Hi,
On Fri, Feb 13, 2026 at 9:45 PM Robert Kern <robert.kern@gmail.com>
wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion <
numpy-discussion@python.org> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a
Please specify one of these:
1) I wrote this code myself, without looking at significant
AI-generated code OR
2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that
For Case 3, I would love to see an example of the search that you
would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
Yes, that's a reasonable request. But how do you think I should proceed? Make an issue on Numpy, and start drafting? Start another email thread? Or a Discourse / Scientific Python thread?
Just here should be fine. Take an existing PR that has copyrightable content (e.g. an entire new function or three, each more than ~10 lines, not just many one-line updates scattered around; the most interesting ones would be those that implement a known algorithm). Do the code search that would satisfy you. Write out here what you would want a PR author to
wrote: proposal, would be something like this, as a message to PR authors: the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation. provide.
I'd suggested (off-list) that this might be better done in another thread - but perhaps it can be done here.
Reflecting, and experimenting - there are many caveats, but I think it is reasonable to give the contributor some responsibility for formal care about copyright.
One way of doing that - is to ask some AI (if possible, an AI other than the one generating the code) to review for copyright. I've experimented with that over at https://github.com/numpy/numpthis looks likey/pull/30828#issuecomment-3920553882 <https://github.com/numpy/numpy/pull/30828#issuecomment-3920553882> . But the idea would be that we ask a contributor who has generated code by AI, to do this as part of the PR sign-off. They should be in a much better position to do this than the maintainers, as they should have been exploring the problem themselves, and therefore should be able to write better queries to guide the AI review. And with the prompts as a start, it's not particularly time-consuming.
I think all of the arguments it produced are not grounded in the principles of copyright law. Unfortunately, I think this is one of the areas where LLMs just generate plausible nonsense rather than sound legal analysis. Each thing that it noted was a one-liner or a general idea, nothing copyrightable. It's essentially writes like a median StackOverflow programmer with a dim understanding of copyright law (no slight intended to anyone; I am one). I've looked at the two files it suggested, and I see no similarity to the PR. I do kind of suspect that LLMs could be used, with care, to help facilitate the abstraction-filtration-comparison test <https://en.wikipedia.org/wiki/Abstraction-Filtration-Comparison_test> and maybe finding candidates to do that test on, but a general instruction to give arguments for copyright violation apparently yields more chaff to wade through. -- Robert Kern
Hi, On Wed, Feb 18, 2026 at 10:33 PM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 9:16 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Sat, Feb 14, 2026 at 5:38 PM Robert Kern <robert.kern@gmail.com> wrote:
On Sat, Feb 14, 2026 at 12:17 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Fri, Feb 13, 2026 at 9:45 PM Robert Kern <robert.kern@gmail.com> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI-generated code OR 2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR 3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
Yes, that's a reasonable request. But how do you think I should proceed? Make an issue on Numpy, and start drafting? Start another email thread? Or a Discourse / Scientific Python thread?
Just here should be fine. Take an existing PR that has copyrightable content (e.g. an entire new function or three, each more than ~10 lines, not just many one-line updates scattered around; the most interesting ones would be those that implement a known algorithm). Do the code search that would satisfy you. Write out here what you would want a PR author to provide.
I'd suggested (off-list) that this might be better done in another thread - but perhaps it can be done here.
Reflecting, and experimenting - there are many caveats, but I think it is reasonable to give the contributor some responsibility for formal care about copyright.
One way of doing that - is to ask some AI (if possible, an AI other than the one generating the code) to review for copyright. I've experimented with that over at https://github.com/numpy/numpthis looks likey/pull/30828#issuecomment-3920553882 . But the idea would be that we ask a contributor who has generated code by AI, to do this as part of the PR sign-off. They should be in a much better position to do this than the maintainers, as they should have been exploring the problem themselves, and therefore should be able to write better queries to guide the AI review. And with the prompts as a start, it's not particularly time-consuming.
I think all of the arguments it produced are not grounded in the principles of copyright law. Unfortunately, I think this is one of the areas where LLMs just generate plausible nonsense rather than sound legal analysis. Each thing that it noted was a one-liner or a general idea, nothing copyrightable. It's essentially writes like a median StackOverflow programmer with a dim understanding of copyright law (no slight intended to anyone; I am one). I've looked at the two files it suggested, and I see no similarity to the PR.
I do kind of suspect that LLMs could be used, with care, to help facilitate the abstraction-filtration-comparison test and maybe finding candidates to do that test on, but a general instruction to give arguments for copyright violation apparently yields more chaff to wade through.
Yes, sure - and you can see me trying to negotiate with Gemini on related points in an earlier session here: https://gist.github.com/matthew-brett/fac33f1b41d98e51b842f8bb84e8c66b My point was not that AI is doing a good job here - it isn't - but to offer it as a starting point for further research for the PR author, and reflection for those of us thinking about copyright and AI, on what a better process might look like. Cheers, Matthew
On Wed, Feb 18, 2026 at 7:03 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Wed, Feb 18, 2026 at 10:33 PM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 9:16 AM Matthew Brett <matthew.brett@gmail.com>
One way of doing that - is to ask some AI (if possible, an AI other than the one generating the code) to review for copyright. I've experimented with that over at https://github.com/numpy/numpy/pull/30828#issuecomment-3920553882 . But the idea would be that we ask a contributor who has generated code by AI, to do this as part of the PR sign-off. They should be in a much better position to do this than the maintainers, as they should have been exploring the problem themselves, and therefore should be able to write better queries to guide the AI review. And with the prompts as a start, it's not particularly time-consuming.
I think all of the arguments it produced are not grounded in the
wrote: principles of copyright law. Unfortunately, I think this is one of the areas where LLMs just generate plausible nonsense rather than sound legal analysis. Each thing that it noted was a one-liner or a general idea, nothing copyrightable. It's essentially writes like a median StackOverflow programmer with a dim understanding of copyright law (no slight intended to anyone; I am one). I've looked at the two files it suggested, and I see no similarity to the PR.
I do kind of suspect that LLMs could be used, with care, to help
facilitate the abstraction-filtration-comparison test and maybe finding candidates to do that test on, but a general instruction to give arguments for copyright violation apparently yields more chaff to wade through.
Yes, sure - and you can see me trying to negotiate with Gemini on related points in an earlier session here:
https://gist.github.com/matthew-brett/fac33f1b41d98e51b842f8bb84e8c66b
My point was not that AI is doing a good job here - it isn't - but to offer it as a starting point for further research for the PR author, and reflection for those of us thinking about copyright and AI, on what a better process might look like.
IMO, it's definitely not a good starting point for the PR author. It doesn't matter where it places you as a starting point if it points you in the wrong direction. You are asking the PR author to defend against incorrect statements of fact and law. I think *some* kind of code search or plagiarism detection service might be helpful in identifying possible original sources to compare with the generatred output. It's not at all clear that asking the LLM as an oracle actually enacts such a search. It plainly did not here, but it presented its work as such. I don't think it's a good policy to construct an ad hoc plagiarism detection service without validating how it actually performs. I really strongly suggest that you retract your PR comment. It would be one thing to try it out and post here about what you found, but to interact with a contributor that way as an experiment is... ill-advised. -- Robert Kern
On Wed, Feb 18, 2026 at 6:04 PM Robert Kern via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 7:03 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Wed, Feb 18, 2026 at 10:33 PM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 9:16 AM Matthew Brett <matthew.brett@gmail.com>
One way of doing that - is to ask some AI (if possible, an AI other than the one generating the code) to review for copyright. I've experimented with that over at https://github.com/numpy/numpy/pull/30828#issuecomment-3920553882 . But the idea would be that we ask a contributor who has generated code by AI, to do this as part of the PR sign-off. They should be in a much better position to do this than the maintainers, as they should have been exploring the problem themselves, and therefore should be able to write better queries to guide the AI review. And with the prompts as a start, it's not particularly time-consuming.
I think all of the arguments it produced are not grounded in the
wrote: principles of copyright law. Unfortunately, I think this is one of the areas where LLMs just generate plausible nonsense rather than sound legal analysis. Each thing that it noted was a one-liner or a general idea, nothing copyrightable. It's essentially writes like a median StackOverflow programmer with a dim understanding of copyright law (no slight intended to anyone; I am one). I've looked at the two files it suggested, and I see no similarity to the PR.
I do kind of suspect that LLMs could be used, with care, to help
facilitate the abstraction-filtration-comparison test and maybe finding candidates to do that test on, but a general instruction to give arguments for copyright violation apparently yields more chaff to wade through.
Yes, sure - and you can see me trying to negotiate with Gemini on related points in an earlier session here:
https://gist.github.com/matthew-brett/fac33f1b41d98e51b842f8bb84e8c66b
My point was not that AI is doing a good job here - it isn't - but to offer it as a starting point for further research for the PR author, and reflection for those of us thinking about copyright and AI, on what a better process might look like.
IMO, it's definitely not a good starting point for the PR author. It doesn't matter where it places you as a starting point if it points you in the wrong direction. You are asking the PR author to defend against incorrect statements of fact and law.
I think *some* kind of code search or plagiarism detection service might be helpful in identifying possible original sources to compare with the generatred output. It's not at all clear that asking the LLM as an oracle actually enacts such a search. It plainly did not here, but it presented its work as such.
I don't think it's a good policy to construct an ad hoc plagiarism detection service without validating how it actually performs. I really strongly suggest that you retract your PR comment. It would be one thing to try it out and post here about what you found, but to interact with a contributor that way as an experiment is... ill-advised.
+1. The interaction on that PR as a whole struck me as harsh, verging on rude. Chuck
On Thu, 19 Feb 2026, at 1:45 PM, Charles R Harris via NumPy-Discussion wrote:
+1. The interaction on that PR as a whole struck me as harsh, verging on rude.
It certainly shows the need for developing a unified policy sooner rather than later! *sigh* I did want to push back on this statement from the PR:
Therefore, if any of the code included in the PR was generated by AI, That is an extreme position, if we do that we will end up with no maintainers because everyone coming up will be using AI.
I want to push back specifically on this point because it is not a good basis from which to determine policy. We can, and I would argue we *must*, play a part in whether "everyone" will be using AI. I'll lift this quote (via Juan Luis Cano Rodriguez[1]) from "Resisting Enchantment and Determinism: How to critically engage with AI university guidelines" [2]:
Enchanted by determinism, some see the adoption and use of generative AI in education as inexorable as the effects of the laws of physics. This perspective nudges us towards helplessness and acceptance: no one can change gravity. Besides, it would be absurd to ask whether gravity is good or if we want it. [...] *We must not be persuaded by the false premise of a human-made artefact being inevitable.*
Juan. [1]: https://astrojuanlu.leaflet.pub/3meyu6zht2c2q [2]: https://zenodo.org/records/18282338
Hi, On Thu, Feb 19, 2026 at 4:42 AM Juan Nunez-Iglesias <jni@fastmail.com> wrote:
On Thu, 19 Feb 2026, at 1:45 PM, Charles R Harris via NumPy-Discussion wrote:
+1. The interaction on that PR as a whole struck me as harsh, verging on rude.
It certainly shows the need for developing a unified policy sooner rather than later! *sigh*
I did want to push back on this statement from the PR:
Therefore, if any of the code included in the PR was generated by AI,
That is an extreme position, if we do that we will end up with no maintainers because everyone coming up will be using AI.
I want to push back specifically on this point because it is not a good basis from which to determine policy. We can, and I would argue we must, play a part in whether "everyone" will be using AI. I'll lift this quote (via Juan Luis Cano Rodriguez[1]) from "Resisting Enchantment and Determinism: How to critically engage with AI university guidelines" [2]:
Enchanted by determinism, some see the adoption and use of generative AI in education as inexorable as the effects of the laws of physics. This perspective nudges us towards helplessness and acceptance: no one can change gravity. Besides, it would be absurd to ask whether gravity is good or if we want it. [...] We must not be persuaded by the false premise of a human-made artefact being inevitable.
I agree - and it's a point that has been well-made by your quote, and by others, that "accept the inevitable" is a poor argument, and easily deployed by people who are trying to sell you something. You may have seen Linus Torvalds' constant complaints about the hype surrounding AI. In this case, what worries me is that we may be drifting into accepting the idea that it is inevitable that we developers will switch from mainly writing code ourselves, to mainly asking AI to write code for us. I don't think that's inevitable, and neither, apparently, does Torvalds (see quotes above). I suspect, if we do go in that direction, we will find (from another quote above) that our skills in writing code will start to atrophy, and this is likely to mean that our learning, and our skills in reviewing code will start to atrophy as well. See the Anthropic study quoted above, and references therein for more on AI-generation and learning deficits. In other words, more code, more subtle bugs, fewer developers who understand the code-base, and fewer developers coming to the project with sufficient training to read and review code. But luckily, all hype aside, we have plenty of time to take this slowly and see how this develops. There's no plausible world in which Numpy suffers significantly from taking a measured approach to AI-generated code, over the next few years. There are various plausible worlds where it suffers from being too credulous of AI code quality, or it's ability to train developers, or its tendency to generate code that is subject to copyright. Cheers, Matthew
I don't understand the relevance of both quotes given here; the AI - education relationship has nothing to do here. On the other hand, the damage of social media on students and the upcoming generation is documented in, now, hundreds of studies in the last decade. Yet here we are all having twitter/mastodon/tiktok/bluesky/instagram/whatevers including all maintainers, our children have them, we even have official numpy scipy accounts. So that surely looks like gravity to me. And I, for one, have no interest in what Linus Torvalds thinks about any social issue given his track record and handling conflicts. Using LLM to find copyright violations is, with all respect, one of the most ferrous irony I have seen lately. Did you check whether JAX and PyArrow claim by the LLM is correct before you accuse the PR author, is there an actual code resemblance confirmed by a human? (not blaming you obviously but I am sure you see the recursion you are creating here) I agree with Chuck. On Thu, Feb 19, 2026 at 11:52 AM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
On Thu, Feb 19, 2026 at 4:42 AM Juan Nunez-Iglesias <jni@fastmail.com> wrote:
On Thu, 19 Feb 2026, at 1:45 PM, Charles R Harris via NumPy-Discussion
wrote:
+1. The interaction on that PR as a whole struck me as harsh, verging on
rude.
It certainly shows the need for developing a unified policy sooner
rather than later! *sigh*
I did want to push back on this statement from the PR:
Therefore, if any of the code included in the PR was generated by AI,
That is an extreme position, if we do that we will end up with no
maintainers because everyone coming up will be using AI.
I want to push back specifically on this point because it is not a good
basis from which to determine policy. We can, and I would argue we must, play a part in whether "everyone" will be using AI. I'll lift this quote (via Juan Luis Cano Rodriguez[1]) from "Resisting Enchantment and Determinism: How to critically engage with AI university guidelines" [2]:
Enchanted by determinism, some see the adoption and use of generative AI
in education as inexorable as the effects of the laws of physics. This perspective nudges us towards helplessness and acceptance: no one can change gravity. Besides, it would be absurd to ask whether gravity is good or if we want it.
[...] We must not be persuaded by the false premise of a human-made artefact being inevitable.
I agree - and it's a point that has been well-made by your quote, and by others, that "accept the inevitable" is a poor argument, and easily deployed by people who are trying to sell you something.
You may have seen Linus Torvalds' constant complaints about the hype surrounding AI.
In this case, what worries me is that we may be drifting into accepting the idea that it is inevitable that we developers will switch from mainly writing code ourselves, to mainly asking AI to write code for us. I don't think that's inevitable, and neither, apparently, does Torvalds (see quotes above). I suspect, if we do go in that direction, we will find (from another quote above) that our skills in writing code will start to atrophy, and this is likely to mean that our learning, and our skills in reviewing code will start to atrophy as well. See the Anthropic study quoted above, and references therein for more on AI-generation and learning deficits. In other words, more code, more subtle bugs, fewer developers who understand the code-base, and fewer developers coming to the project with sufficient training to read and review code.
But luckily, all hype aside, we have plenty of time to take this slowly and see how this develops. There's no plausible world in which Numpy suffers significantly from taking a measured approach to AI-generated code, over the next few years. There are various plausible worlds where it suffers from being too credulous of AI code quality, or it's ability to train developers, or its tendency to generate code that is subject to copyright.
Cheers,
Matthew _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ilhanpolat@gmail.com
Hi, On Thu, Feb 19, 2026 at 11:46 AM Ilhan Polat <ilhanpolat@gmail.com> wrote:
[...]
Using LLM to find copyright violations is, with all respect, one of the most ferrous irony I have seen lately. Did you check whether JAX and PyArrow claim by the LLM is correct before you accuse the PR author, is there an actual code resemblance confirmed by a human? (not blaming you obviously but I am sure you see the recursion you are creating here)
Yes, as you can imagine, I thought about that problem - of using AI to detect copyright violations in AI. To me, that is only an irony if we are thinking of a binary - AI-good, AI-bad. If I think AI-bad, then I think that AI-generated contributions are bad, and therefore I must also think that using AI as a jumping off point for copyright assessment is bad. However, AI-bad is not what I think. I do think (is this controversial?) that AI is unreliable, that, in typical use, without careful discipline, it will tend to reduce learning and understanding compared to doing the same task without AI, and that it can be useful, if we take those things into account. Then I was thinking about the question that Evgeni (on the Scientific Python Discourse forum) and Robert had asked - which is - fine, copyright is an issue, but how can we reasonably ask the contributor to assess that? That's a serious and difficult question. One option is to throw up one's hands and say - OK - copyright is dead - let's ignore it, or at least, deemphasise it. I don't think that's the right answer, which leaves me with the urgent problem of how to proceed. Because this question is difficult, and it is very new (in the sense it has now become very easy for good-faith submissions to violate copyright) - it seems to me we will have to iterate. Then I asked myself - if I had to start somewhere - how would I approach that problem? The way I tend to use AI, is as a jumping off point - a starting point for a discussion with the AI. Quite often, as in this case, that jumping off point is misleading or flat-out wrong - but if you know that (are there any experienced users of AI who don't know that?) - then you can start to negotiate with the AI, and you will often, if you are careful, negotiate to something that you can verify from reliable sources. You may have seen me taking that (I assume standard) approach in my negotiations with Gemini in a previous conversation about copyright, that I linked to as a Gist. Now, this is a new world we're in. I'm not saying that's a practical approach for contributors to explore copyright. I think that I could use it that way, and that I'd get closer to a reliable answer than if I had not used it (and got no answer). I suspect, if we trust our contributors, we will find we and they do develop good habits for that use. But it's a genuinely open question whether that is so. As I keep saying, my intention was only to raise the idea as a starting point. And given the nature of AI - I therefore had to run the risk that the relevant quoted AI (from a simple prompt and response) would be misleading or wrong. Cheers, Matthew
Hi, On Sat, Feb 21, 2026 at 9:34 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Thu, Feb 19, 2026 at 11:46 AM Ilhan Polat <ilhanpolat@gmail.com> wrote:
[...]
Using LLM to find copyright violations is, with all respect, one of the most ferrous irony I have seen lately. Did you check whether JAX and PyArrow claim by the LLM is correct before you accuse the PR author, is there an actual code resemblance confirmed by a human? (not blaming you obviously but I am sure you see the recursion you are creating here)
Yes, as you can imagine, I thought about that problem - of using AI to detect copyright violations in AI. To me, that is only an irony if we are thinking of a binary - AI-good, AI-bad. If I think AI-bad, then I think that AI-generated contributions are bad, and therefore I must also think that using AI as a jumping off point for copyright assessment is bad.
However, AI-bad is not what I think. I do think (is this controversial?) that AI is unreliable, that, in typical use, without careful discipline, it will tend to reduce learning and understanding compared to doing the same task without AI, and that it can be useful, if we take those things into account.
Then I was thinking about the question that Evgeni (on the Scientific Python Discourse forum) and Robert had asked - which is - fine, copyright is an issue, but how can we reasonably ask the contributor to assess that?
That's a serious and difficult question. One option is to throw up one's hands and say - OK - copyright is dead - let's ignore it, or at least, deemphasise it. I don't think that's the right answer, which leaves me with the urgent problem of how to proceed.
Because this question is difficult, and it is very new (in the sense it has now become very easy for good-faith submissions to violate copyright) - it seems to me we will have to iterate.
Then I asked myself - if I had to start somewhere - how would I approach that problem? The way I tend to use AI, is as a jumping off point - a starting point for a discussion with the AI. Quite often, as in this case, that jumping off point is misleading or flat-out wrong - but if you know that (are there any experienced users of AI who don't know that?) - then you can start to negotiate with the AI, and you will often, if you are careful, negotiate to something that you can verify from reliable sources.
You may have seen me taking that (I assume standard) approach in my negotiations with Gemini in a previous conversation about copyright, that I linked to as a Gist.
Now, this is a new world we're in. I'm not saying that's a practical approach for contributors to explore copyright. I think that I could use it that way, and that I'd get closer to a reliable answer than if I had not used it (and got no answer). I suspect, if we trust our contributors, we will find we and they do develop good habits for that use. But it's a genuinely open question whether that is so. As I keep saying, my intention was only to raise the idea as a starting point. And given the nature of AI - I therefore had to run the risk that the relevant quoted AI (from a simple prompt and response) would be misleading or wrong.
I should say that I'm aware that using AI for copyright assessment is very delicate. There is evidence (Xu et al - https://arxiv.org/abs/2408.02487) that 2024/5 vintage AI models were systematically less likely to correctly identify Copyleft licenses. Xu et al speculate that "some closed-source LLMs may have implemented post-processing steps to avoid acknowledging outputs derived from copyleft-licensed code snippets." Likewise, we know that OpenAI was reluctant to put AI-watermarks on ChatGPT output, with one suggested reason being surveys that predicted a large drop in use if the watermark was added : https://arstechnica.com/ai/2024/08/openai-has-the-tech-to-watermark-chatgpt-... . And of course we have no way of knowing how the standard commercial models have been configured. Now imagine that we (open-source developers) start using AI to detect copyright violation, and that in turn leads to a reduction in use of AI tools by the open-source or commercial developers. It will be very difficult for us to know whether later versions of the models have been trained with the aim of making it less likely they will detect copyright violations, on the basis that less copyright violation detection leads to more use of AI. But perhaps that's a problem for a later time. And perhaps we can already become part of the negotiation with AI code model providers, on detection of copyright violation. Cheers, Matthew
On the subject of whether gen-AI code violates copyright maybe this pull request can be used as a relevant and real example: https://github.com/sympy/sympy/pull/29150 The PR description has an AI Generation Disclosure that says:
Initial code generated by codex, then edited by me.
I believe the author that that is exactly what happened but it leaves me unsure of the copyright status of the code. Among recent PRs this author stands out as a new contributor who has demonstrated their ability to produce good PRs while apparently using AI to do so. If we envisage a future in which most contributors are using AI then this is exactly the kind of contributor that we should want. I have questioned them about the copyright in that PR and they have not responded yet but I think that is because it is just not at all clear how to answer that question. It is not clear to me without close inspection whether the code in the PR is doing something simple and obvious or something nontrivial. I have a suspicion that I might be able to simplify the code a lot if I studied exactly what it is doing carefully but it would take some time to figure that out. It is also possible though that the code implements some nontrivial named algorithm and is perhaps more or less directly copied (by the AI) from another codebase. Another example is that I recently made this PR using codex (GPT-5.3) myself: https://github.com/sympy/sympy/pull/29127 That PR copies/adapts the type annotations that I wrote in one library into another library. After a lot of back and forth with codex it decided to look at the contents of the other library and then adapted the type annotations from there, copying them into another codebase. I am the author of the code in question in both codebases but just so we're clear the agent literally went and read the code for another package in .venv/lib/python3.14/site-packages/ to do this (the UI tells you when the agent is reading files). It is not a case of the model having that code in its training data but that the agent actively decided to go read and copy the code after its previous attempts at just writing the code had failed. I gave the agent what I thought were clear instructions for how to write the code but it ignored those, tried every possible wrong approach in cycles, and then at some point decided to check for type errors by running py_compile instead of mypy. Then it told me that it had fixed all the errors and I said "you're supposed to run mypy, not py_compile" so it said: ``` • Correct. py_compile was the wrong validation for this task. I re-ran with mypy and confirmed there are still many errors. I also tested wiring FMPZ_POLY/ FMPQ_POLY directly into ringflint.py; that made mypy worse, so I reverted that experiment. Current workspace is back to only this change: - sympy/external/gmpy.py: added POLY_P, FMPZ_POLY, FMPQ_POLY under if TYPE_CHECKING. So at this point: - requested fictitious poly types/protocol exist - mypy sympy still fails with many existing flint-related typing errors (including ringflint.py and polyclasses.py) If you want, I can proceed with the next step: a broader typing pass to actually consume FMPZ_POLY/ FMPQ_POLY in ringflint.py/polyclasses.py in a way that reduces mypy sympy errors instead of increasing them. ``` I said yes we should indeed be reducing the errors rather than increasing them (that part amused me enough that I saved the output shown for posterity). The agent then spent some time talking to itself about the difficulties it had previously encountered and then decided to go look at the other codebase and realised that it could basically just copy the code from there. I don't think that there is any copyright violation in writing out type annotations for a dependent library's interface in another like that and as the author (not using AI) of the type annotations being copied I think I am in charge of the copyright anyway. It is not hard to see how you can end up having a copyright violation from this sort of thing though. Having played around with codex a few times I now realise that having it just write the code is possibly one of the least useful things you can ask it to do. When you ask it to write the actual code you then have to review its changes in detail and if you allow it to make any nontrivial decisions you will end up in a review/prompt cycle that is worse than just writing the code yourself. When you ask it to do other things though it is great like "bisect this" or "make a benchmark script and time these 5 approaches" or "Can you see any obvious problems in the diff?" and so on. Everything that requires investigating things and/or writing throwaway code is a great task for the AI because you don't have to review the code it generates which is what otherwise becomes the bottleneck. If I was to redo the PR above I would just start writing the code myself and then when I reach the point that all of the high level decisions about how to write the code are made I would ask codex to finish the job. It would be more like "I've done 3 classes now you do the other 10 classes just like I did" or "I've written the code but mypy now shows 1000 errors. Fix the trivial errors and then categorise and explain the remaining errors". -- Oscar
On Sat, 21 Feb 2026 at 13:38, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Another example is that I recently made this PR using codex (GPT-5.3) myself:
I just tried asking codex to write me a description for that PR. The prompt (in a fresh session) was ``` I want to make a PR with the python-flint typing related changes on this branch. Can you write a description for the PR? ``` Then codex went and looked at the PULL_REQUESTS_TEMPLATE.md, looked at the commits, and then produced a PR description matching that template. It filled out the AI disclosure part of the PR template for me ``` #### AI Generation Disclosure Used ChatGPT to help draft PR text only. No code changes were AI-generated in this PR. ``` Both of those sentences are false and it just lied automatically on my behalf without me asking it to do that and without asking for any clarification about what to put there. -- Oscar
On Sun, Feb 22, 2026 at 7:07 PM Oscar Benjamin via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Sat, 21 Feb 2026 at 13:38, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Another example is that I recently made this PR using codex (GPT-5.3)
myself:
I just tried asking codex to write me a description for that PR.
The prompt (in a fresh session) was ``` I want to make a PR with the python-flint typing related changes on this branch. Can you write a description for the PR? ``` Then codex went and looked at the PULL_REQUESTS_TEMPLATE.md, looked at the commits, and then produced a PR description matching that template. It filled out the AI disclosure part of the PR template for me ``` #### AI Generation Disclosure
Used ChatGPT to help draft PR text only. No code changes were AI-generated in this PR. ``` Both of those sentences are false and it just lied automatically on my behalf without me asking it to do that and without asking for any clarification about what to put there.
Based on what you wrote, that seems like user error to me. The commits on the branch you made the PR from do not include the `Generated-by` or `Co-authored-by` attribution to indicate that those commits were generated by an LLM in part or in full. So if you ask Codex in a fresh session, where it doesn't have context about the previous work, to look at that branch / those commits, how is it supposed to know that the commit authorship on those commits is in fact incorrect? It's indeed possible that there is a model that deliberately and systematically lies in order to increase the chances of it being accepted, but it's much more likely that the PR message draft you ask for is actually correct based on the commit history. I just tried based on a branch with 4 commits that were all mine, and one with `Co-authored-by`. First I just asked it to draft a PR message in the session where it had the development context for those commits. It did, short and to the point. Then I gave it more context: Prompt: "Additional content: the upstream repo uses the following PR template:" Output ends with: ``` #### AI Generation Disclosure AI tools (Claude via Warp) were used to generate the initial workflow files and iterate on them based on review feedback. All code was reviewed and tested by the author. ``` Note that it includes actual details of the interaction that aren't in the commit messages (Claude via Warp, the nature of the actual interaction). Next I repeated it in a fresh session. It gave me a much more verbose PR message, more generic-AI sounding with "files changed" etc. I then again fed it the whole SciPy PR template. And it produced this: ``` AI Generation Disclosure Parts of this PR (the CI workflow files) were co-authored with Warp AI. All code was reviewed and edited by the author. ``` Note that it still kept "Warp AI", which it got from the Co-authored-by, but dropped the "Claude" part, which isn't in the commit message. As for the "All code was reviewed and edited by the author.": it can't really know if that is 100% accurate, but based on me being the sole author of 3 of the commits and the co-author of the 4th commit, it seems like the most likely interpretation of the commit history. tl;dr seems to work as advertised. And inaccuracies and omissions are still the responsibility of the human in the loop.
On Mon, 23 Feb 2026 at 09:36, Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Sun, Feb 22, 2026 at 7:07 PM Oscar Benjamin via NumPy-Discussion <numpy-discussion@python.org> wrote:
Then codex went and looked at the PULL_REQUESTS_TEMPLATE.md, looked at the commits, and then produced a PR description matching that template. It filled out the AI disclosure part of the PR template for me ``` #### AI Generation Disclosure
Used ChatGPT to help draft PR text only. No code changes were AI-generated in this PR. ``` Both of those sentences are false and it just lied automatically on my behalf without me asking it to do that and without asking for any clarification about what to put there.
Based on what you wrote, that seems like user error to me. The commits on the branch you made the PR from do not include the `Generated-by` or `Co-authored-by` attribution to indicate that those commits were generated by an LLM in part or in full. So if you ask Codex in a fresh session, where it doesn't have context about the previous work, to look at that branch / those commits, how is it supposed to know that the commit authorship on those commits is in fact incorrect?
It could have said "I don't have the information needed to fill out this part of the template so can you answer these questions" but it didn't and just falsified the missing information instead. The full description it wrote was quite long (over a screenful) so you could miss that AI part if not looking closely. Note that what it wrote there is pretty much the most common thing that people put in the AI disclosure and it is very often obviously false.
It's indeed possible that there is a model that deliberately and systematically lies in order to increase the chances of it being accepted, but it's much more likely that the PR message draft you ask for is actually correct based on the commit history.
Maybe I should put Co-authored-by then. I didn't actually let codex run git commit itself (I was using git myself in a separate terminal to track what it was doing).
tl;dr seems to work as advertised. And inaccuracies and omissions are still the responsibility of the human in the loop.
It is the responsibility of the human in the loop but the most common failure modes we see right now are: - They just delete the entire pull request template and insert something else. - They specifically delete the AI part of the template. - The whole description is AI generated and the human has not reviewed it at all. I tested what codex would do because my suspicion is that when they have deleted the entire template it is because they are using some kind of (possibly AI) tooling to open the pull request and therefore not actually reading the template in the web UI. I'm not sure what they are using though because if you use e.g. codex then it is smart enough to follow the PR template even if that means filling in the blanks with false information. -- Oscar
Hi, On Mon, Feb 23, 2026 at 11:18 AM Oscar Benjamin via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Mon, 23 Feb 2026 at 09:36, Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Sun, Feb 22, 2026 at 7:07 PM Oscar Benjamin via NumPy-Discussion <numpy-discussion@python.org> wrote:
Then codex went and looked at the PULL_REQUESTS_TEMPLATE.md, looked at the commits, and then produced a PR description matching that template. It filled out the AI disclosure part of the PR template for me ``` #### AI Generation Disclosure
Used ChatGPT to help draft PR text only. No code changes were AI-generated in this PR. ``` Both of those sentences are false and it just lied automatically on my behalf without me asking it to do that and without asking for any clarification about what to put there.
Based on what you wrote, that seems like user error to me. The commits on the branch you made the PR from do not include the `Generated-by` or `Co-authored-by` attribution to indicate that those commits were generated by an LLM in part or in full. So if you ask Codex in a fresh session, where it doesn't have context about the previous work, to look at that branch / those commits, how is it supposed to know that the commit authorship on those commits is in fact incorrect?
It could have said "I don't have the information needed to fill out this part of the template so can you answer these questions" but it didn't and just falsified the missing information instead. The full description it wrote was quite long (over a screenful) so you could miss that AI part if not looking closely. Note that what it wrote there is pretty much the most common thing that people put in the AI disclosure and it is very often obviously false.
It's indeed possible that there is a model that deliberately and systematically lies in order to increase the chances of it being accepted, but it's much more likely that the PR message draft you ask for is actually correct based on the commit history.
Maybe I should put Co-authored-by then. I didn't actually let codex run git commit itself (I was using git myself in a separate terminal to track what it was doing).
tl;dr seems to work as advertised. And inaccuracies and omissions are still the responsibility of the human in the loop.
It is the responsibility of the human in the loop but the most common failure modes we see right now are:
- They just delete the entire pull request template and insert something else. - They specifically delete the AI part of the template. - The whole description is AI generated and the human has not reviewed it at all.
I tested what codex would do because my suspicion is that when they have deleted the entire template it is because they are using some kind of (possibly AI) tooling to open the pull request and therefore not actually reading the template in the web UI. I'm not sure what they are using though because if you use e.g. codex then it is smart enough to follow the PR template even if that means filling in the blanks with false information.
Yes - and the more general point is we can't depend on the AI not making stuff up for the checkboxes. I think that's the big problem with the (Torvalds) approach of - "it's just another tool". That's not really true, it's much more than that - it's a whole other way of working. It would perhaps be better to say "it's another type of developer". One where the usual trust relationships that are so central to open-source, are not valid. It would be a terrible mistake to apply rules that we evolved to work with humans, to the AI. Cheers, Matthew
I think we will not convince each other in this subject. My position is still the same; Ignoring the actual stealing done by these companies and then holding each other accountable for copyright is a no-go for me, using it further to resolve copyright issues, and thus washing it clean, is doubly so. I will not entertain that option and that's my personal position. I don't expect others to take this position. Defending copyright protection and ignoring the largest copyrighted elephant in the room does not seem a sound argument to me. Moreover, arguing over the tool as if it is oblivious and neutral is just factually incorrect, there are companies behind these tools. The technology is undoubtedly impressive and very useful, however the current arrangement of it is built upon legally unsound basis with very shady practices. Therefore I refuse to philosophize over it as a free-agent with shortcomings. In my opinion, the possible thing we can do is at least be defensive and if we feel like it, take advantage of LLMs to use it for automating mundane tasks to save time that FOSS maintainers definitely lack which is what I have been doing with LAPACK translation as I mentioned above. I still go over them line by line which takes insane amount of time though much shorter than writing it myself. So "No AI contribution is allowed" is a valid take for me if that would be the policy. Or "we will use common sense and make opinionated decisions, for trivial and otherwise laborious tasks we don't care but for involved bits, we won't touch it". It is also fine. However, since we are not making any progress in this particular aspect of the discussion, let's conclude it as inconclusive and go back to the policy discussion. On Sat, Feb 21, 2026 at 12:37 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Sat, Feb 21, 2026 at 9:34 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Thu, Feb 19, 2026 at 11:46 AM Ilhan Polat <ilhanpolat@gmail.com>
[...]
Using LLM to find copyright violations is, with all respect, one of
wrote: the most ferrous irony I have seen lately. Did you check whether JAX and PyArrow claim by the LLM is correct before you accuse the PR author, is there an actual code resemblance confirmed by a human? (not blaming you obviously but I am sure you see the recursion you are creating here)
Yes, as you can imagine, I thought about that problem - of using AI to detect copyright violations in AI. To me, that is only an irony if we are thinking of a binary - AI-good, AI-bad. If I think AI-bad, then I think that AI-generated contributions are bad, and therefore I must also think that using AI as a jumping off point for copyright assessment is bad.
However, AI-bad is not what I think. I do think (is this controversial?) that AI is unreliable, that, in typical use, without careful discipline, it will tend to reduce learning and understanding compared to doing the same task without AI, and that it can be useful, if we take those things into account.
Then I was thinking about the question that Evgeni (on the Scientific Python Discourse forum) and Robert had asked - which is - fine, copyright is an issue, but how can we reasonably ask the contributor to assess that?
That's a serious and difficult question. One option is to throw up one's hands and say - OK - copyright is dead - let's ignore it, or at least, deemphasise it. I don't think that's the right answer, which leaves me with the urgent problem of how to proceed.
Because this question is difficult, and it is very new (in the sense it has now become very easy for good-faith submissions to violate copyright) - it seems to me we will have to iterate.
Then I asked myself - if I had to start somewhere - how would I approach that problem? The way I tend to use AI, is as a jumping off point - a starting point for a discussion with the AI. Quite often, as in this case, that jumping off point is misleading or flat-out wrong - but if you know that (are there any experienced users of AI who don't know that?) - then you can start to negotiate with the AI, and you will often, if you are careful, negotiate to something that you can verify from reliable sources.
You may have seen me taking that (I assume standard) approach in my negotiations with Gemini in a previous conversation about copyright, that I linked to as a Gist.
Now, this is a new world we're in. I'm not saying that's a practical approach for contributors to explore copyright. I think that I could use it that way, and that I'd get closer to a reliable answer than if I had not used it (and got no answer). I suspect, if we trust our contributors, we will find we and they do develop good habits for that use. But it's a genuinely open question whether that is so. As I keep saying, my intention was only to raise the idea as a starting point. And given the nature of AI - I therefore had to run the risk that the relevant quoted AI (from a simple prompt and response) would be misleading or wrong.
I should say that I'm aware that using AI for copyright assessment is very delicate. There is evidence (Xu et al - https://arxiv.org/abs/2408.02487) that 2024/5 vintage AI models were systematically less likely to correctly identify Copyleft licenses. Xu et al speculate that "some closed-source LLMs may have implemented post-processing steps to avoid acknowledging outputs derived from copyleft-licensed code snippets."
Likewise, we know that OpenAI was reluctant to put AI-watermarks on ChatGPT output, with one suggested reason being surveys that predicted a large drop in use if the watermark was added :
https://arstechnica.com/ai/2024/08/openai-has-the-tech-to-watermark-chatgpt-... .
And of course we have no way of knowing how the standard commercial models have been configured.
Now imagine that we (open-source developers) start using AI to detect copyright violation, and that in turn leads to a reduction in use of AI tools by the open-source or commercial developers. It will be very difficult for us to know whether later versions of the models have been trained with the aim of making it less likely they will detect copyright violations, on the basis that less copyright violation detection leads to more use of AI.
But perhaps that's a problem for a later time. And perhaps we can already become part of the negotiation with AI code model providers, on detection of copyright violation.
Cheers,
Matthew
So "No AI contribution is allowed" is a valid take for me if that would be the policy. Or "we will use common sense and make opinionated decisions, for trivial and otherwise laborious tasks we don't care but for involved bits, we won't touch it". It is also fine.
I'd be happy with either too, with a preference for the second one, which I see as allowing AI as a tool. Disclosure has to be key, though. Note that while both would at least somewhat address copyright issues, only the first would address the pain of having to deal with AI slop. To me it remains the bigger problem [1], one I don't see how we can address without something like a web of trust. But I'll let this be the last time I note it here, since arguably it is a different discussion, and also at numpy I'm not one directly affected as I rarely look at new PRs. Instead, those who do look at new PRs should speak up on whether they are willing to filter AI slop. But for what it is worth, for astropy, where I did often look at new PRs, I've concluded that AI slop is sufficiently shifting the balance towards misery that I will no longer look at anything unless I'm pinged. All the best, Marten [1] Not fact checked, and more for amusement, but via LWN I was led to the following (really, 5% of all open-soure code this month???). """ Kevin Beaumont @GossiTheDog@cyberplace.social Today in InfoSec Job Security News: I was looking into an obvious ../.. vulnerability introduced into a major web framework today, and it was committed by username Claude on GitHub. Vibe coded, basically. So I started looking through Claude commits on GitHub, there’s over 2m of them and it’s about 5% of all open source code this month. https://github.com/search?q=author%3Aclaude&type=commits&s=author-date&o=desc As I looked through the code I saw the same class of vulns being introduced over, and over, again - several a minute. """ (From https://cyberplace.social/@GossiTheDog/116080909947754833)
Hi, On Sat, Feb 21, 2026 at 4:15 PM Marten van Kerkwijk via NumPy-Discussion <numpy-discussion@python.org> wrote:
So "No AI contribution is allowed" is a valid take for me if that would be the policy. Or "we will use common sense and make opinionated decisions, for trivial and otherwise laborious tasks we don't care but for involved bits, we won't touch it". It is also fine.
I'd be happy with either too, with a preference for the second one, which I see as allowing AI as a tool.
Disclosure has to be key, though.
Note that while both would at least somewhat address copyright issues, only the first would address the pain of having to deal with AI slop.
To me it remains the bigger problem [1], one I don't see how we can address without something like a web of trust. But I'll let this be the last time I note it here, since arguably it is a different discussion, and also at numpy I'm not one directly affected as I rarely look at new PRs. Instead, those who do look at new PRs should speak up on whether they are willing to filter AI slop. But for what it is worth, for astropy, where I did often look at new PRs, I've concluded that AI slop is sufficiently shifting the balance towards misery that I will no longer look at anything unless I'm pinged.
Ouch - yes, that's understandable. Related - Matt Haberland's comment on the Scipy AI policy https://github.com/scipy/scipy/pull/24583#pullrequestreview-3821660783 I wonder whether we'll reach the stage where there are two groups of people in each project that accepts AI-generated code - those who are prepared to review AI-generated PRs, and those who are not. If so, that could unintended effects, as parts of the code become more likely to be AI-generated, parts less, depending on the reviewers. Cheers, Matthew
Thanks for trying to steer this back to a pragmatic "how do we define a policy". On Sat, Feb 21, 2026 at 5:16 PM Marten van Kerkwijk via NumPy-Discussion < numpy-discussion@python.org> wrote:
So "No AI contribution is allowed" is a valid take for me if that would be the policy. Or "we will use common sense and make opinionated decisions, for trivial and otherwise laborious tasks we don't care but for involved bits, we won't touch it". It is also fine.
I'd be happy with either too, with a preference for the second one, which I see as allowing AI as a tool.
I have a very strong preference for allowing AI as a tool here, for multiple reasons: 1. The principle I articulated before: don't tell others what tools are and they aren't allowed to use, as long as it doesn't break other contribution guidelines/rules. 2. The opportunities it affords us to *lighten* the maintenance burden. Maintainer bandwidth is always scarce, and the majority of PRs do not involve architecture or copyrightable algorithmic code. That most contributions are "AI slop" and will just make our maintenance load worse is a take that is highly likely to be incorrect. 3. The opportunities for learning. AI tools are fast becoming a key skill for software engineering related jobs; for scientific jobs I could see that becoming the case as well. Blanket forbidding use of those tools potentially penalizes especially the people who spend a significant part of their time every week contributing to NumPy. 4. Alignment with other open source projects. Melissa has been composing a nice overview here: https://github.com/melissawm/open-source-ai-contribution-policies. Both inside and outside the scientific Python community, a large majority of projects land on a very similar set of policies and principles. And very few land on an outright ban. I find the discourse in this thread about copyright to be completely lacking in pragmatism and actionable insights. It's something that's not new for us in scope, only in scale. We're just going to have to trust the disclosures that we should be asking for (starting asap), give some pragmatic basic guidelines, and deal with PRs as they come. There *will* be a gray zone. Oscar's example of someone starting with "generate an algorithm, then lightly edit it" is one step too far for now I think. Anything short of that, including using AI to fill in some algorithmic details that were already designed by a human, should be okay / in the grey zone. Disclosure has to be key, though.
+1
Note that while both would at least somewhat address copyright issues, only the first would address the pain of having to deal with AI slop.
To me it remains the bigger problem [1], one I don't see how we can address without something like a web of trust. But I'll let this be the last time I note it here, since arguably it is a different discussion, and also at numpy I'm not one directly affected as I rarely look at new PRs.
I do think a web of trust is a potentially valuable idea. However, the need right now isn't there yet (at least for NumPy) and it does have the potential to close the door pretty strongly to newcomers. On the other hand, we already don't run CI on PRs from first-time contributors - that was something that turned out to be necessary to limit wasting resources. A web of trust is something to keep in mind in my opinion, and consider adopting if and when it becomes a clear win for maintainer load.
Instead, those who do look at new PRs should speak up on whether they are willing to filter AI slop.
+1. Also in general: this policy should be primarily informed by the people who are actively maintaining NumPy on a daily/weekly basis. I'd definitely include you in that though Marten. There is more to maintenance than "triage the new PRs from people we don't know"; you spent a significant amount of time reviewing code, mostly complex C code in `numpy/_core`. That's the kind of code where tools can help quite a bit, but only in the hands of humans that already understand the code deeply. Another example: security reports. Those are usually triaged by Sebastian and me, and I'd say the volume of bogus reports has gone up slowly over the past couple of years, in part driven by wider availability AI tooling (and https://huntr.com/, which has a terrible signal-to-noise problem - avoid at all costs). So we can avoid AI tools ourselves, and see that maintenance load just continue going up. Or we can try to experiment with tooling ourselves - see for example https://www.anthropic.com/news/claude-code-security. I just signed up for early access of that capability, because I would like to get high-quality reports and patches, and fix potential real-world problems before they occur. I'm honestly not really interested in opinions from non(-active) maintainers about whether I am allowed to do something like sign up for that security service, or apply a patch it generates. Which is yet another reason I find "no AI at all" not an acceptable policy. Filtering out the copyright-related noise/argumentation, I detect a significant preference for allowing use of AI tools while putting on some sensible constraints by the group of active maintainers. I'd like to somehow move towards something more actionable, because we do need some policy and an AI usage disclosure on all PRs soon. To get to that, I think we should be picking a base policy as a start, and add some NumPy-specific edits/context as needed. I think the most suitable base to start from would be either: 1. The LLVM policy: https://llvm.org/docs/AIToolPolicy.html 2. The SymPy/SciPy policy: https://scipy.github.io/devdocs/dev/conduct/ai_policy.html If we want to capture the "gray zone" better, adding a supplementary document with some concrete examples and maybe incorporating the "Zones" that Peter sketched would be good. Final thought since we're sharing resources: this article by Chris Lattner resonated with me: https://www.modular.com/blog/the-claude-c-compiler-what-it-reveals-about-the... Cheers, Ralf
But for what it is worth, for astropy, where I did often look at new PRs, I've concluded that AI slop is sufficiently shifting the balance towards misery that I will no longer look at anything unless I'm pinged.
All the best,
Marten
[1] Not fact checked, and more for amusement, but via LWN I was led to the following (really, 5% of all open-soure code this month???).
""" Kevin Beaumont @GossiTheDog@cyberplace.social
Today in InfoSec Job Security News:
I was looking into an obvious ../.. vulnerability introduced into a major web framework today, and it was committed by username Claude on GitHub. Vibe coded, basically.
So I started looking through Claude commits on GitHub, there’s over 2m of them and it’s about 5% of all open source code this month.
https://github.com/search?q=author%3Aclaude&type=commits&s=author-date&o=desc
As I looked through the code I saw the same class of vulns being introduced over, and over, again - several a minute. """
(From https://cyberplace.social/@GossiTheDog/116080909947754833) _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ralf.gommers@googlemail.com
Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> writes: [snip]
I do think a web of trust is a potentially valuable idea. However, the need right now isn't there yet (at least for NumPy) and it does have the potential to close the door pretty strongly to newcomers. On the other hand, we already don't run CI on PRs from first-time contributors - that was something that turned out to be necessary to limit wasting resources. A web of trust is something to keep in mind in my opinion, and consider adopting if and when it becomes a clear win for maintainer load.
Thanks for the reminder that we do not run CI for first-time contributors. That is nice in that there is already a mechanism in place to recognize those. As an intermediate step towards trust (but not yet a web of it!), would it make sense to have a welcome message that asks the new contributor to introduce themselves by editing their top comment? I.e., something like this: """ Thank you for your PR! As you appear to be a new contributor, could we ask you to briefly introduce yourself, e.g., by editing the top comment? It would help to know how you use numpy yourself and what made you want to contribute, and whether you are, e.g., a student keen to make an open-source contribution, or an experienced developer just fixing an annoying bug. Note: if you used any AI in your PR, be sure to declare this and check that your use is consistent with our AI policy. """
Filtering out the copyright-related noise/argumentation, I detect a significant preference for allowing use of AI tools while putting on some sensible constraints by the group of active maintainers. I'd like to somehow move towards something more actionable, because we do need some policy and an AI usage disclosure on all PRs soon. To get to that, I think we should be picking a base policy as a start, and add some NumPy-specific edits/context as needed. I think the most suitable base to start from would be either:
1. The LLVM policy: https://llvm.org/docs/AIToolPolicy.html 2. The SymPy/SciPy policy: https://scipy.github.io/devdocs/dev/conduct/ai_policy.html
If we want to capture the "gray zone" better, adding a supplementary document with some concrete examples and maybe incorporating the "Zones" that Peter sketched would be good.
I think both policies are good; I'd prefer to go with the shorter and more direct scipy one -- that also has the advantage of keeping things more consistent within scientific python. Personally, I would copy it verbatim and for now not spend time adding/editing. All the best, Marten
On Sun, 2026-02-22 at 10:19 -0500, Marten van Kerkwijk via NumPy- Discussion wrote:
Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> writes:
[snip]
I do think a web of trust is a potentially valuable idea. However, the need right now isn't there yet (at least for NumPy) and it does have the potential to close the door pretty strongly to newcomers. On the other hand, we already don't run CI on PRs from first-time contributors - that was something that turned out to be necessary to limit wasting resources. A web of trust is something to keep in mind in my opinion, and consider adopting if and when it becomes a clear win for maintainer load.
Thanks for the reminder that we do not run CI for first-time contributors. That is nice in that there is already a mechanism in place to recognize those. As an intermediate step towards trust (but not yet a web of it!), would it make sense to have a welcome message that asks the new contributor to introduce themselves by editing their top comment? I.e., something like this:
Thanks for the suggestions on what concrete steps we should do (and I agree we should do something). I would be fine with basically adopting either of these, adopting the SciPy/SymPy one seems pragmatic, it is nice to keep things similar in similar projects. SymPy/LLVM do have a pretty clear note on copyright (not all do, I think). [1] (I like many things about the LLVM, it is nice explicit about reasoning, etc. but I guess that also makes it longer.) To me they honestly all get the important points across. And honestly, I suspect many contributors won't read it anyway, so it may be more used to point to in the rare case where you close a PR or so. To achieve better transparency, I would suggest we add check-boxes, E.g. sklearn has this now: <!-- If AI tools were involved in creating this PR, please check all boxes that apply below and make sure that you adhere to our Automated Contributions Policy: https://scikit-learn.org/dev/developers/contributing.html#automated-contribu... --> I used AI assistance for: - [ ] Code generation (e.g., when writing an implementation or fixing a bug) - [ ] Test/benchmark generation - [ ] Documentation (including examples) - [ ] Research and understanding I am not sure how well it is used, but I think that is a good start to see where it goes. I could imagine trying to put in something about the scope of AI use, but I am not sure if it matters. It may be easier to just follow up for PRs where it is unclear. (FWIW, I like the comment asking for a bit of personal context, it feels both helpful and welcoming! But I think when it comes to AI specifically, I would start with the check-boxes for pragmatism.) Cheers, Sebastian [1] I would be happy with linking out to continuing discussion towards the note in the LLVM one: "Artificial intelligence systems raise many questions around copyright that have yet to be answered" But I think that is about as much as I want to focus on that point in something targeted for contributors.
""" Thank you for your PR! As you appear to be a new contributor, could we ask you to briefly introduce yourself, e.g., by editing the top comment? It would help to know how you use numpy yourself and what made you want to contribute, and whether you are, e.g., a student keen to make an open-source contribution, or an experienced developer just fixing an annoying bug.
Note: if you used any AI in your PR, be sure to declare this and check that your use is consistent with our AI policy. """
Filtering out the copyright-related noise/argumentation, I detect a significant preference for allowing use of AI tools while putting on some sensible constraints by the group of active maintainers. I'd like to somehow move towards something more actionable, because we do need some policy and an AI usage disclosure on all PRs soon. To get to that, I think we should be picking a base policy as a start, and add some NumPy-specific edits/context as needed. I think the most suitable base to start from would be either:
1. The LLVM policy: https://llvm.org/docs/AIToolPolicy.html 2. The SymPy/SciPy policy: https://scipy.github.io/devdocs/dev/conduct/ai_policy.html
If we want to capture the "gray zone" better, adding a supplementary document with some concrete examples and maybe incorporating the "Zones" that Peter sketched would be good.
I think both policies are good; I'd prefer to go with the shorter and more direct scipy one -- that also has the advantage of keeping things more consistent within scientific python. Personally, I would copy it verbatim and for now not spend time adding/editing.
All the best,
Marten _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
Hi, On Mon, Feb 23, 2026 at 8:53 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Sun, 2026-02-22 at 10:19 -0500, Marten van Kerkwijk via NumPy- Discussion wrote:
Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> writes:
[snip]
I do think a web of trust is a potentially valuable idea. However, the need right now isn't there yet (at least for NumPy) and it does have the potential to close the door pretty strongly to newcomers. On the other hand, we already don't run CI on PRs from first-time contributors - that was something that turned out to be necessary to limit wasting resources. A web of trust is something to keep in mind in my opinion, and consider adopting if and when it becomes a clear win for maintainer load.
Thanks for the reminder that we do not run CI for first-time contributors. That is nice in that there is already a mechanism in place to recognize those. As an intermediate step towards trust (but not yet a web of it!), would it make sense to have a welcome message that asks the new contributor to introduce themselves by editing their top comment? I.e., something like this:
Thanks for the suggestions on what concrete steps we should do (and I agree we should do something). I would be fine with basically adopting either of these, adopting the SciPy/SymPy one seems pragmatic, it is nice to keep things similar in similar projects. SymPy/LLVM do have a pretty clear note on copyright (not all do, I think). [1] (I like many things about the LLVM, it is nice explicit about reasoning, etc. but I guess that also makes it longer.)
To me they honestly all get the important points across. And honestly, I suspect many contributors won't read it anyway, so it may be more used to point to in the rare case where you close a PR or so.
To achieve better transparency, I would suggest we add check-boxes, E.g. sklearn has this now:
<!-- If AI tools were involved in creating this PR, please check all boxes that apply below and make sure that you adhere to our Automated Contributions Policy: https://scikit-learn.org/dev/developers/contributing.html#automated-contribu... --> I used AI assistance for: - [ ] Code generation (e.g., when writing an implementation or fixing a bug) - [ ] Test/benchmark generation - [ ] Documentation (including examples) - [ ] Research and understanding
I am not sure how well it is used, but I think that is a good start to see where it goes. I could imagine trying to put in something about the scope of AI use, but I am not sure if it matters. It may be easier to just follow up for PRs where it is unclear.
(FWIW, I like the comment asking for a bit of personal context, it feels both helpful and welcoming! But I think when it comes to AI specifically, I would start with the check-boxes for pragmatism.)
Unfortunately, as Oscar's example showed (and other slop PRs seem to confirm), it looks as though the check-boxes will be entirely useless, as the AI is perfectly capable of filling those out for you, and won't worry (as far as we know) about choosing the result most likely to get the PR merged. That in turn has some major costs for maintainer burn-out - as Maarten and Matt H are pointing out. I'm increasingly leaning towards - no AI generated code at all, unless a) from a well-trusted contributor, and b) justified by that contributor.
[1] I would be happy with linking out to continuing discussion towards the note in the LLVM one: "Artificial intelligence systems raise many questions around copyright that have yet to be answered" But I think that is about as much as I want to focus on that point in something targeted for contributors.
Just flagging - but if we aren't asking the contributor to address copyright, we have two options: * The maintainer does it. I don't think there's any chance that will happen in practice. * We effectively decide we aren't going to worry about AI copyright violations. I realize the second option is the de-facto preference of some here, but if that's so, I think we have to say that out loud. Cheers, Matthew
On Mon, 2026-02-23 at 09:07 +0000, Matthew Brett via NumPy-Discussion wrote:
Hi,
On Mon, Feb 23, 2026 at 8:53 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Sun, 2026-02-22 at 10:19 -0500, Marten van Kerkwijk via NumPy- Discussion wrote:
Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> writes:
[snip]
I do think a web of trust is a potentially valuable idea. However, the need right now isn't there yet (at least for NumPy) and it does have the potential to close the door pretty strongly to newcomers. On the other hand, we already don't run CI on PRs from first-time contributors - that was something that turned out to be necessary to limit wasting resources. A web of trust is something to keep in mind in my opinion, and consider adopting if and when it becomes a clear win for maintainer load.
Thanks for the reminder that we do not run CI for first-time contributors. That is nice in that there is already a mechanism in place to recognize those. As an intermediate step towards trust (but not yet a web of it!), would it make sense to have a welcome message that asks the new contributor to introduce themselves by editing their top comment? I.e., something like this:
Thanks for the suggestions on what concrete steps we should do (and I agree we should do something). I would be fine with basically adopting either of these, adopting the SciPy/SymPy one seems pragmatic, it is nice to keep things similar in similar projects. SymPy/LLVM do have a pretty clear note on copyright (not all do, I think). [1] (I like many things about the LLVM, it is nice explicit about reasoning, etc. but I guess that also makes it longer.)
To me they honestly all get the important points across. And honestly, I suspect many contributors won't read it anyway, so it may be more used to point to in the rare case where you close a PR or so.
To achieve better transparency, I would suggest we add check-boxes, E.g. sklearn has this now:
<!-- If AI tools were involved in creating this PR, please check all boxes that apply below and make sure that you adhere to our Automated Contributions Policy: https://scikit-learn.org/dev/developers/contributing.html#automated-contribu... --> I used AI assistance for: - [ ] Code generation (e.g., when writing an implementation or fixing a bug) - [ ] Test/benchmark generation - [ ] Documentation (including examples) - [ ] Research and understanding
I am not sure how well it is used, but I think that is a good start to see where it goes. I could imagine trying to put in something about the scope of AI use, but I am not sure if it matters. It may be easier to just follow up for PRs where it is unclear.
(FWIW, I like the comment asking for a bit of personal context, it feels both helpful and welcoming! But I think when it comes to AI specifically, I would start with the check-boxes for pragmatism.)
Unfortunately, as Oscar's example showed (and other slop PRs seem to confirm), it looks as though the check-boxes will be entirely useless, as the AI is perfectly capable of filling those out for you, and won't worry (as far as we know) about choosing the result most likely to get the PR merged.
That in turn has some major costs for maintainer burn-out - as Maarten and Matt H are pointing out.
I'm increasingly leaning towards - no AI generated code at all, unless a) from a well-trusted contributor, and b) justified by that contributor.
[1] I would be happy with linking out to continuing discussion towards the note in the LLVM one: "Artificial intelligence systems raise many questions around copyright that have yet to be answered" But I think that is about as much as I want to focus on that point in something targeted for contributors.
Just flagging - but if we aren't asking the contributor to address copyright, we have two options:
* The maintainer does it. I don't think there's any chance that will happen in practice. * We effectively decide we aren't going to worry about AI copyright violations.
I realize the second option is the de-facto preference of some here, but if that's so, I think we have to say that out loud.
We can link out to the policy which would have a note on copyright. And that in turn could link out to further thoughts (heck, even the things like Peter Wang's talks). Maybe we can/should add a one sentence thing, but I want to be sure to not create undue uncertainty/fear for new contributors for an issue that, IMO, affects relatively few PRs because most PRs are just tiny bug-fixes. So, I think it would be good to discuss a concrete wording here. [1] To me it seems like an exaggeration to say we aren't going to worry about copyright. The question is to what degree it is helpful to force that worry on typical first time contributions. (And yes, there is also a question about how much we as a project/community should worry about it, but I think that is a separate discussion.) Asking for transparency can fail, but if the contributor lies about they will also ignore a ban and it gives us another reason to just close a PR or ban them. And if we want to draw a hard line for contributors, I don't like that because there is a vast range here: - used it for brain storming the approach - used tab completion in cursor - used it to start some tests but maybe 20% of lines remained. - Basically wrote the whole function and only tweaked it a bit. And additionally, Peter Wangs scope argument that we'll care much less about it if it is a docs website vs. core algorithm. The ask here can only be transparency anything beyond that could be guidelines that we put somewhere, but they would be guidelines to dig up maybe once every few months. and I would rather get to a world where we know what comes in (and then can still close a PR) than trying to draw a strict line. - Sebastian [1] I don't know, maybe in the (rough!) directino of: It is important to us to know about AI use both for community reasons as well as to for detailed review which may include copyright concerns for large contributions to some parts of NumPy.
Cheers,
Matthew _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
Hi, On Mon, Feb 23, 2026 at 11:12 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Mon, 2026-02-23 at 09:07 +0000, Matthew Brett via NumPy-Discussion wrote:
Hi,
On Mon, Feb 23, 2026 at 8:53 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Sun, 2026-02-22 at 10:19 -0500, Marten van Kerkwijk via NumPy- Discussion wrote:
Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> writes:
[snip]
I do think a web of trust is a potentially valuable idea. However, the need right now isn't there yet (at least for NumPy) and it does have the potential to close the door pretty strongly to newcomers. On the other hand, we already don't run CI on PRs from first-time contributors - that was something that turned out to be necessary to limit wasting resources. A web of trust is something to keep in mind in my opinion, and consider adopting if and when it becomes a clear win for maintainer load.
Thanks for the reminder that we do not run CI for first-time contributors. That is nice in that there is already a mechanism in place to recognize those. As an intermediate step towards trust (but not yet a web of it!), would it make sense to have a welcome message that asks the new contributor to introduce themselves by editing their top comment? I.e., something like this:
Thanks for the suggestions on what concrete steps we should do (and I agree we should do something). I would be fine with basically adopting either of these, adopting the SciPy/SymPy one seems pragmatic, it is nice to keep things similar in similar projects. SymPy/LLVM do have a pretty clear note on copyright (not all do, I think). [1] (I like many things about the LLVM, it is nice explicit about reasoning, etc. but I guess that also makes it longer.)
To me they honestly all get the important points across. And honestly, I suspect many contributors won't read it anyway, so it may be more used to point to in the rare case where you close a PR or so.
To achieve better transparency, I would suggest we add check-boxes, E.g. sklearn has this now:
<!-- If AI tools were involved in creating this PR, please check all boxes that apply below and make sure that you adhere to our Automated Contributions Policy:
https://scikit-learn.org/dev/developers/contributing.html#automated-contribu... --> I used AI assistance for: - [ ] Code generation (e.g., when writing an implementation or fixing a bug) - [ ] Test/benchmark generation - [ ] Documentation (including examples) - [ ] Research and understanding
I am not sure how well it is used, but I think that is a good start to see where it goes. I could imagine trying to put in something about the scope of AI use, but I am not sure if it matters. It may be easier to just follow up for PRs where it is unclear.
(FWIW, I like the comment asking for a bit of personal context, it feels both helpful and welcoming! But I think when it comes to AI specifically, I would start with the check-boxes for pragmatism.)
Unfortunately, as Oscar's example showed (and other slop PRs seem to confirm), it looks as though the check-boxes will be entirely useless, as the AI is perfectly capable of filling those out for you, and won't worry (as far as we know) about choosing the result most likely to get the PR merged.
That in turn has some major costs for maintainer burn-out - as Maarten and Matt H are pointing out.
I'm increasingly leaning towards - no AI generated code at all, unless a) from a well-trusted contributor, and b) justified by that contributor.
[1] I would be happy with linking out to continuing discussion towards the note in the LLVM one: "Artificial intelligence systems raise many questions around copyright that have yet to be answered" But I think that is about as much as I want to focus on that point in something targeted for contributors.
Just flagging - but if we aren't asking the contributor to address copyright, we have two options:
* The maintainer does it. I don't think there's any chance that will happen in practice. * We effectively decide we aren't going to worry about AI copyright violations.
I realize the second option is the de-facto preference of some here, but if that's so, I think we have to say that out loud.
We can link out to the policy which would have a note on copyright. And that in turn could link out to further thoughts (heck, even the things like Peter Wang's talks).
Maybe we can/should add a one sentence thing, but I want to be sure to not create undue uncertainty/fear for new contributors for an issue that, IMO, affects relatively few PRs because most PRs are just tiny bug-fixes.
So, I think it would be good to discuss a concrete wording here. [1]
To me it seems like an exaggeration to say we aren't going to worry about copyright. The question is to what degree it is helpful to force that worry on typical first time contributions. (And yes, there is also a question about how much we as a project/community should worry about it, but I think that is a separate discussion.)
Asking for transparency can fail, but if the contributor lies about they will also ignore a ban and it gives us another reason to just close a PR or ban them. And if we want to draw a hard line for contributors, I don't like that because there is a vast range here: - used it for brain storming the approach - used tab completion in cursor - used it to start some tests but maybe 20% of lines remained. - Basically wrote the whole function and only tweaked it a bit.
And additionally, Peter Wangs scope argument that we'll care much less about it if it is a docs website vs. core algorithm. The ask here can only be transparency anything beyond that could be guidelines that we put somewhere, but they would be guidelines to dig up maybe once every few months.
and I would rather get to a world where we know what comes in (and then can still close a PR) than trying to draw a strict line.
I completely agree with your implication - that it's unlikely to be useful to put a lot of documentation for people to read where they can learn about AI risk from copyright. I guess few people will read it, and then a) they won't know what to do about it and b) they have no obligation to do anything about it, so there is very little chance they will act on it. The central point is - who takes responsibility for copyright, in PRs with AI-generated code? We know that can't be the maintainers - so it can only be the contributors. This has always been the case, of course, but now we're in a very different situation, where it's very easy to end up with copyright code in the PR without realizing it. And at the moment, by waving at some documentation, without further instruction, we can surely predict from the data we have, that the contributor will not be likely to take effective responsibility for copyright. Therefore, there's a substantial risk of copyright leak, roughly proportional to the lines of non-trivial, non-mechanical code in the PRs. Back to the options - 1) don't worry about it because it's not important, or 2) put guards in place to make sure the contributor has carefully and personally reviewed the PR for copyright concerns, and review often to make sure these are effective or 3) delay allowing large AI-generated blocks of code until we can see a way forward for copyright. It's clear I think that some of us are in the 1) case - don't worry about it. I'm absolutely not in that camp, but hey. All I'm saying is - if we're in camp 1, we should make that clear. Cheers, Matthew
On Mon, 2026-02-23 at 11:59 +0000, Matthew Brett via NumPy-Discussion wrote:
Hi,
<snip>
The central point is - who takes responsibility for copyright, in PRs with AI-generated code? We know that can't be the maintainers - so it can only be the contributors. This has always been the case, of course, but now we're in a very different situation, where it's very easy to end up with copyright code in the PR without realizing it. And at the moment, by waving at some documentation, without further instruction, we can surely predict from the data we have, that the contributor will not be likely to take effective responsibility for copyright. Therefore, there's a substantial risk of copyright leak, roughly proportional to the lines of non-trivial, non-mechanical code in the PRs.
Back to the options - 1) don't worry about it because it's not important, or 2) put guards in place to make sure the contributor has carefully and personally reviewed the PR for copyright concerns, and review often to make sure these are effective or 3) delay allowing large AI-generated blocks of code until we can see a way forward for copyright.
It's clear I think that some of us are in the 1) case - don't worry about it. I'm absolutely not in that camp, but hey. All I'm saying is - if we're in camp 1, we should make that clear.
To me this seems very exaggerated. Just because we don't put one issue (of multiple we have in practice) into the dead center of such a policy text (or PR template) doesn't equate to ignoring it? Both policies Ralf brought up include a statement about copyright. We still could decide to link out from there for the curious readers or adding guidelines somewhere when and where we should ask more questions. I believe all I said was that I don't want to overshoot and worry most contributors (because yeah, I truly think for the majority of PRs there is just not much concern). Maybe we can word-smith something that strikes a good balance there. And yeah, we probably disagree where that balance lies. But if adding a brief note on copyright concerns in the policy is the same as saying "it's not important", I have no idea where to go?! This discussion tends to feel like we have to start discussion at the end of "it is a gigantic issue" and then maybe be allowed edge towards: OK, but this is the minimal thing that won't care off all contributors. When yeah, probably most here think it isn't the biggest issue, so we should edge towards: Yeah, maybe it is a bit bigger of an issue than it seems, so how about we add this sentence and link? But... to me transparency is the thing that is most directly helpful (also for social dynamics concerns). So that has to be central. Maybe it will be ignored, but what can you reasonably put up that won't be? My guess would be that light-weight is better because it decreases the chance of lying. - Sebastian
Cheers,
Matthew _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
Hi, On Mon, Feb 23, 2026 at 1:52 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Mon, 2026-02-23 at 11:59 +0000, Matthew Brett via NumPy-Discussion wrote:
Hi,
<snip>
The central point is - who takes responsibility for copyright, in PRs with AI-generated code? We know that can't be the maintainers - so it can only be the contributors. This has always been the case, of course, but now we're in a very different situation, where it's very easy to end up with copyright code in the PR without realizing it. And at the moment, by waving at some documentation, without further instruction, we can surely predict from the data we have, that the contributor will not be likely to take effective responsibility for copyright. Therefore, there's a substantial risk of copyright leak, roughly proportional to the lines of non-trivial, non-mechanical code in the PRs.
Back to the options - 1) don't worry about it because it's not important, or 2) put guards in place to make sure the contributor has carefully and personally reviewed the PR for copyright concerns, and review often to make sure these are effective or 3) delay allowing large AI-generated blocks of code until we can see a way forward for copyright.
It's clear I think that some of us are in the 1) case - don't worry about it. I'm absolutely not in that camp, but hey. All I'm saying is - if we're in camp 1, we should make that clear.
To me this seems very exaggerated. Just because we don't put one issue (of multiple we have in practice) into the dead center of such a policy text (or PR template) doesn't equate to ignoring it? Both policies Ralf brought up include a statement about copyright. We still could decide to link out from there for the curious readers or adding guidelines somewhere when and where we should ask more questions.
I believe all I said was that I don't want to overshoot and worry most contributors (because yeah, I truly think for the majority of PRs there is just not much concern).
Yes - I'm sure you're right - that most PRs won't have large not-trivial AI code fragments, and for these, it's not a problem. So for these - I imagine the author would not be deterred by stronger statements about copyright - as it should be obvious to them that copyright is unlikely to apply. However, for PRs that do contain large non-trivial AI code fragments, it is a problem - and that's what I was referring to specifically. I'm sure you'll agree that general statements that people should be careful about copyright or take ownership of copyright, for those PRs, probably won't be effective in preventing copyright leak - without more instructions about what to do, to detect copyright leak.
Maybe we can word-smith something that strikes a good balance there. And yeah, we probably disagree where that balance lies. But if adding a brief note on copyright concerns in the policy is the same as saying "it's not important", I have no idea where to go?!
This discussion tends to feel like we have to start discussion at the end of "it is a gigantic issue" and then maybe be allowed edge towards: OK, but this is the minimal thing that won't care off all contributors. When yeah, probably most here think it isn't the biggest issue, so we should edge towards: Yeah, maybe it is a bit bigger of an issue than it seems, so how about we add this sentence and link?
But... to me transparency is the thing that is most directly helpful (also for social dynamics concerns). So that has to be central. Maybe it will be ignored, but what can you reasonably put up that won't be? My guess would be that light-weight is better because it decreases the chance of lying.
Well - yes - iff the author isn't using AI to submit the PR, or help author the commit message, in which case we shouldn't use terms like lying for the output - it's just whatever the AI came up with. And transparency doesn't help much for (large non-trivial PR) copyright, unless either a) the author has done some work that we can't yet specify to look for copyright violations or b) the maintainer notes the AI use and does that copyright work. But at the moment, we don't have much reason to think either of those will happen. Cheers, Matthew
There may be violent agreement here...
On Mon, 2026-02-23 at 11:59 +0000, Matthew Brett via NumPy-Discussion wrote: [snip]
Back to the options - 1) don't worry about it because it's not important, or 2) put guards in place to make sure the contributor has carefully and personally reviewed the PR for copyright concerns, and review often to make sure these are effective or 3) delay allowing large AI-generated blocks of code until we can see a way forward for copyright.
It's clear I think that some of us are in the 1) case - don't worry about it. I'm absolutely not in that camp, but hey. All I'm saying is - if we're in camp 1, we should make that clear.
To me this seems very exaggerated. Just because we don't put one issue (of multiple we have in practice) into the dead center of such a policy text (or PR template) doesn't equate to ignoring it? Both policies Ralf brought up include a statement about copyright. We still could decide to link out from there for the curious readers or adding guidelines somewhere when and where we should ask more questions.
I believe all I said was that I don't want to overshoot and worry most contributors (because yeah, I truly think for the majority of PRs there is just not much concern). Maybe we can word-smith something that strikes a good balance there. And yeah, we probably disagree where that balance lies. But if adding a brief note on copyright concerns in the policy is the same as saying "it's not important", I have no idea where to go?!
I think the third option suggested by Matthew is reasonable, at least for new contributors, though I would phrase it like that for large AI-generated code we need to be able to trust the contributor *and* have evidence of due diligence. Let's try to move towards wondering how to include this in a policy. As Sebastian noted, most contributors will not read the policy, so at some level, what we write will be aimed as much as maintainers, to give guidance about when to ask for evidence. But I do agree with Sebastian that we should not let the extremes stop us from adopting a workflow that works for the general case, where a first contribution is small. -- Marten
Hi, On Thu, Feb 19, 2026 at 2:48 AM Charles R Harris via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 6:04 PM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 7:03 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Wed, Feb 18, 2026 at 10:33 PM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 9:16 AM Matthew Brett <matthew.brett@gmail.com> wrote:
One way of doing that - is to ask some AI (if possible, an AI other than the one generating the code) to review for copyright. I've experimented with that over at https://github.com/numpy/numpy/pull/30828#issuecomment-3920553882 . But the idea would be that we ask a contributor who has generated code by AI, to do this as part of the PR sign-off. They should be in a much better position to do this than the maintainers, as they should have been exploring the problem themselves, and therefore should be able to write better queries to guide the AI review. And with the prompts as a start, it's not particularly time-consuming.
I think all of the arguments it produced are not grounded in the principles of copyright law. Unfortunately, I think this is one of the areas where LLMs just generate plausible nonsense rather than sound legal analysis. Each thing that it noted was a one-liner or a general idea, nothing copyrightable. It's essentially writes like a median StackOverflow programmer with a dim understanding of copyright law (no slight intended to anyone; I am one). I've looked at the two files it suggested, and I see no similarity to the PR.
I do kind of suspect that LLMs could be used, with care, to help facilitate the abstraction-filtration-comparison test and maybe finding candidates to do that test on, but a general instruction to give arguments for copyright violation apparently yields more chaff to wade through.
Yes, sure - and you can see me trying to negotiate with Gemini on related points in an earlier session here:
https://gist.github.com/matthew-brett/fac33f1b41d98e51b842f8bb84e8c66b
My point was not that AI is doing a good job here - it isn't - but to offer it as a starting point for further research for the PR author, and reflection for those of us thinking about copyright and AI, on what a better process might look like.
IMO, it's definitely not a good starting point for the PR author. It doesn't matter where it places you as a starting point if it points you in the wrong direction. You are asking the PR author to defend against incorrect statements of fact and law.
I think *some* kind of code search or plagiarism detection service might be helpful in identifying possible original sources to compare with the generatred output. It's not at all clear that asking the LLM as an oracle actually enacts such a search. It plainly did not here, but it presented its work as such.
I don't think it's a good policy to construct an ad hoc plagiarism detection service without validating how it actually performs. I really strongly suggest that you retract your PR comment. It would be one thing to try it out and post here about what you found, but to interact with a contributor that way as an experiment is... ill-advised.
+1. The interaction on that PR as a whole struck me as harsh, verging on rude.
You surely don't mean that it is harsh or rude to post the AI summary, along with: "Obviously - as designed - this is deliberately Red Team. But @mdrdope - no pressure, and feel free not to answer - do you have any response to the Gemini comments?" That's one of the advantages of the asking the contributor themselves to do that review - it makes it less likely that they will take offense to the output of the AI. Anyone using AI will know that it will frequently be wrong, and it will be more obvious to them that the AI output is not a judgment, but may serve as a starting point for reflection and investigation. For example, it may draw the author, and the maintainers, into a more thoughtful and informed discussion of copyright. But perhaps you mean - the AI, and some of the other comments, implied that the PR was largely AI-generated, and that was rude? And - I think this is what you are saying - you don't think it's important whether it was, or was not AI generated, and therefore, trying to establish the extent of AI use is harsh / rude? Cheers, Matthew
On Thu, Feb 19, 2026 at 5:24 AM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
On Thu, Feb 19, 2026 at 2:48 AM Charles R Harris via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 6:04 PM Robert Kern via NumPy-Discussion <
On Wed, Feb 18, 2026 at 7:03 PM Matthew Brett <matthew.brett@gmail.com>
wrote:
Hi,
On Wed, Feb 18, 2026 at 10:33 PM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 9:16 AM Matthew Brett <
matthew.brett@gmail.com> wrote:
One way of doing that - is to ask some AI (if possible, an AI other than the one generating the code) to review for copyright. I've experimented with that over at https://github.com/numpy/numpy/pull/30828#issuecomment-3920553882 . But the idea would be that we ask a contributor who has generated
code
by AI, to do this as part of the PR sign-off. They should be in a much better position to do this than the maintainers, as they should have been exploring the problem themselves, and therefore should be able to write better queries to guide the AI review. And with the prompts as a start, it's not particularly time-consuming.
I think all of the arguments it produced are not grounded in the
I do kind of suspect that LLMs could be used, with care, to help
facilitate the abstraction-filtration-comparison test and maybe finding candidates to do that test on, but a general instruction to give arguments for copyright violation apparently yields more chaff to wade through.
Yes, sure - and you can see me trying to negotiate with Gemini on related points in an earlier session here:
https://gist.github.com/matthew-brett/fac33f1b41d98e51b842f8bb84e8c66b
My point was not that AI is doing a good job here - it isn't - but to offer it as a starting point for further research for the PR author, and reflection for those of us thinking about copyright and AI, on what a better process might look like.
IMO, it's definitely not a good starting point for the PR author. It doesn't matter where it places you as a starting point if it points you in
I think *some* kind of code search or plagiarism detection service
might be helpful in identifying possible original sources to compare with
I don't think it's a good policy to construct an ad hoc plagiarism
detection service without validating how it actually performs. I really strongly suggest that you retract your PR comment. It would be one thing to
numpy-discussion@python.org> wrote: principles of copyright law. Unfortunately, I think this is one of the areas where LLMs just generate plausible nonsense rather than sound legal analysis. Each thing that it noted was a one-liner or a general idea, nothing copyrightable. It's essentially writes like a median StackOverflow programmer with a dim understanding of copyright law (no slight intended to anyone; I am one). I've looked at the two files it suggested, and I see no similarity to the PR. the wrong direction. You are asking the PR author to defend against incorrect statements of fact and law. the generatred output. It's not at all clear that asking the LLM as an oracle actually enacts such a search. It plainly did not here, but it presented its work as such. try it out and post here about what you found, but to interact with a contributor that way as an experiment is... ill-advised.
+1. The interaction on that PR as a whole struck me as harsh, verging on rude.
You surely don't mean that it is harsh or rude to post the AI summary, along with: "Obviously - as designed - this is deliberately Red Team. But @mdrdope - no pressure, and feel free not to answer - do you have any response to the Gemini comments?"
Yes. 100%. Irresponsible, too. It's as if a teacher decided to hack together his own plagiarism detector, ran it on his students' work and asked them to respond. Before actually seeing if the thing worked to *any* extent on a known corpus. "no pressure" coming from someone in authority (you have that "Member" tag; you represent the project when you interact with its PRs) is meaningless. You are responsible for the words you put there, whether you use an LLM to generate them or not. Weasel words that the LLM may be wrong don't absolve you of this. You knew that the specific things in there were wrong. You showed us a previous conversation where you identified the wrong claims and found that you could guide the LLM away from them. But in this final conversation, you chose not to provide that guidance and in fact make it more aggressively wrong. And this is the one that you chose to put forth live instead of discussing it here. Which is what I actually asked for. I just wanted a gist to look at and evaluate *before* we decided as a project what to put forth as policy. That's one of the advantages of the asking the contributor themselves
to do that review - it makes it less likely that they will take offense to the output of the AI. Anyone using AI will know that it will frequently be wrong, and it will be more obvious to them that the AI output is not a judgment, but may serve as a starting point for reflection and investigation. For example, it may draw the author, and the maintainers, into a more thoughtful and informed discussion of copyright.
You are rapidly draining my ability to believe that you are operating in good faith. Rather, it increasingly seems like you are strawmanning a particularly bad use of LLMs in order to make a point that LLMs are bad. Good faith is a presumption. It can be undermined with experience. -- Robert Kern
Hi, Sorry - top posting - but: I delayed my reply because your accusation of bad faith seemed so obviously unreasonable, that I had imagined someone might intervene on my behalf, but it seems not. As I understand it, you're saying that my - rather silly - master plan was to post an AI-generated response that was so obviously wrong that it would persuade everyone that AI was bad. And to add to my incompetence, I sent a link to another conversation I'd had with the AI, where it did better, undermining my own case. I think you're also somehow saying if I had not posted the AI response on the issue, but in a Gist, then everything would have been fine, and no bad faith need be assumed. I don't really know where to go from there, but just for the record - no - it was not my dark intention to undermine confidence in AI. I was really doing what I said I was doing - which was to try and work out what a prompt would look like, that would stimulate the PR author (in general) to reflect on copyright. And where I was assuming that anyone using AI would be aware that it was possible for the AI to partly or entirely wrong - and so to use it only as a starting point, or a view to oppose. My foolish post of a previous AI discussion was only to show the kind of conversation I would expect to have, as the PR author. And for the PR AI quote, I was trying out the red-team approach - and trying to avoid pasting a long back and forth. Sadly, I was hoping to stimulate some respectful and useful discussion on this difficult topic, but I'm afraid that has failed very badly. I hope that someone else who isn't so clearly acting in bad faith can take up the discussion as to how one might explore copyright in PRs. Cheers, Matthew On Thu, Feb 19, 2026 at 2:31 PM Robert Kern <robert.kern@gmail.com> wrote:
On Thu, Feb 19, 2026 at 5:24 AM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Hi,
On Thu, Feb 19, 2026 at 2:48 AM Charles R Harris via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 6:04 PM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 7:03 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Wed, Feb 18, 2026 at 10:33 PM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Wed, Feb 18, 2026 at 9:16 AM Matthew Brett <matthew.brett@gmail.com> wrote: > > One way of doing that - is to ask some AI (if possible, an AI other > than the one generating the code) to review for copyright. I've > experimented with that over at > https://github.com/numpy/numpy/pull/30828#issuecomment-3920553882 . > But the idea would be that we ask a contributor who has generated code > by AI, to do this as part of the PR sign-off. They should be in a > much better position to do this than the maintainers, as they should > have been exploring the problem themselves, and therefore should be > able to write better queries to guide the AI review. And with the > prompts as a start, it's not particularly time-consuming.
I think all of the arguments it produced are not grounded in the principles of copyright law. Unfortunately, I think this is one of the areas where LLMs just generate plausible nonsense rather than sound legal analysis. Each thing that it noted was a one-liner or a general idea, nothing copyrightable. It's essentially writes like a median StackOverflow programmer with a dim understanding of copyright law (no slight intended to anyone; I am one). I've looked at the two files it suggested, and I see no similarity to the PR.
I do kind of suspect that LLMs could be used, with care, to help facilitate the abstraction-filtration-comparison test and maybe finding candidates to do that test on, but a general instruction to give arguments for copyright violation apparently yields more chaff to wade through.
Yes, sure - and you can see me trying to negotiate with Gemini on related points in an earlier session here:
https://gist.github.com/matthew-brett/fac33f1b41d98e51b842f8bb84e8c66b
My point was not that AI is doing a good job here - it isn't - but to offer it as a starting point for further research for the PR author, and reflection for those of us thinking about copyright and AI, on what a better process might look like.
IMO, it's definitely not a good starting point for the PR author. It doesn't matter where it places you as a starting point if it points you in the wrong direction. You are asking the PR author to defend against incorrect statements of fact and law.
I think *some* kind of code search or plagiarism detection service might be helpful in identifying possible original sources to compare with the generatred output. It's not at all clear that asking the LLM as an oracle actually enacts such a search. It plainly did not here, but it presented its work as such.
I don't think it's a good policy to construct an ad hoc plagiarism detection service without validating how it actually performs. I really strongly suggest that you retract your PR comment. It would be one thing to try it out and post here about what you found, but to interact with a contributor that way as an experiment is... ill-advised.
+1. The interaction on that PR as a whole struck me as harsh, verging on rude.
You surely don't mean that it is harsh or rude to post the AI summary, along with: "Obviously - as designed - this is deliberately Red Team. But @mdrdope - no pressure, and feel free not to answer - do you have any response to the Gemini comments?"
Yes. 100%. Irresponsible, too. It's as if a teacher decided to hack together his own plagiarism detector, ran it on his students' work and asked them to respond. Before actually seeing if the thing worked to any extent on a known corpus. "no pressure" coming from someone in authority (you have that "Member" tag; you represent the project when you interact with its PRs) is meaningless.
You are responsible for the words you put there, whether you use an LLM to generate them or not. Weasel words that the LLM may be wrong don't absolve you of this.
You knew that the specific things in there were wrong. You showed us a previous conversation where you identified the wrong claims and found that you could guide the LLM away from them. But in this final conversation, you chose not to provide that guidance and in fact make it more aggressively wrong. And this is the one that you chose to put forth live instead of discussing it here. Which is what I actually asked for. I just wanted a gist to look at and evaluate before we decided as a project what to put forth as policy.
That's one of the advantages of the asking the contributor themselves to do that review - it makes it less likely that they will take offense to the output of the AI. Anyone using AI will know that it will frequently be wrong, and it will be more obvious to them that the AI output is not a judgment, but may serve as a starting point for reflection and investigation. For example, it may draw the author, and the maintainers, into a more thoughtful and informed discussion of copyright.
You are rapidly draining my ability to believe that you are operating in good faith. Rather, it increasingly seems like you are strawmanning a particularly bad use of LLMs in order to make a point that LLMs are bad.
Good faith is a presumption. It can be undermined with experience.
-- Robert Kern
-- This email is fully human-source. Unless I'm quoting AI, I did not use AI for any text in this email.
On Thu, Feb 19, 2026 at 6:29 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
Sorry - top posting - but:
I delayed my reply because your accusation of bad faith seemed so obviously unreasonable, that I had imagined someone might intervene on my behalf, but it seems not.
As I understand it, you're saying that my - rather silly - master plan was to post an AI-generated response that was so obviously wrong that it would persuade everyone that AI was bad. And to add to my incompetence, I sent a link to another conversation I'd had with the AI, where it did better, undermining my own case.
No, I don't suspect that you have any master plan to convince anyone by this example alone. I think you're also somehow saying if I had not posted the AI response
on the issue, but in a Gist, then everything would have been fine, and no bad faith need be assumed.
Yes. Precisely, it's the doubling and tripling down. If you had deleted the PR response with an apology, we'd have gone back to productively critiquing and possibly improving your technique. It's the fact that you took it live on our project, knowing that it was Not Even Wrong, and not acknowledging it when that action was criticized. I don't really know where to go from there, Deleting the PR comment with an apology would be a really good start.
but just for the record - no - it was not my dark intention to undermine confidence in AI. I was really doing what I said I was doing - which was to try and work out what a prompt would look like, that would stimulate the PR author (in general) to reflect on copyright. And where I was assuming that anyone using AI would be aware that it was possible for the AI to partly or entirely wrong - and so to use it only as a starting point, or a view to oppose.
Stimulating "the PR author to reflect on copyright" is pretty different from documenting "searches to confirm that no parts of the code are subject to existing copyright", which is what you told us you were going to do. If we had discussed your results ahead of time, we'd have concluded that no, it's not doing searches that can do that confirmation. And you would get my opinion that stimulating the PR author to reflect on copyright with such output is not of value. It was not a good idea to go live and experiment on a contributor ahead of that feedback. -- Robert Kern
I delayed my reply because your accusation of bad faith seemed so obviously unreasonable, that I had imagined someone might intervene on my behalf, but it seems not.
I was asleep most of that time. I do think it would be good to take down the temperature, but I must explain that, like Robert,
I don't suspect that you have any master plan to convince anyone by this example alone.
Rather, I started to feel like you were taking a deliberately contrarian approach to the argument, rather than trying to work towards a consensus. That is at least what I understood Robert to mean by "bad faith". It turns out that even when slop is clearly marked as such, it's disgusting to swallow. That is, before your "experiment", I thought a sensible AI writing policy might be that all AI-generated text should be marked as such. Now, I think it should be stronger: the author of a post should be responsible for every word of the post, AI-generated or not. It is simply not good enough to say "AI wrote this, wdyt?" As with code, any posted text should be thoroughly researched and understood by the poster. Quoting from the excellent Zulip AI use guidelines <https://github.com/zulip/zulip/blob/main/CONTRIBUTING.md#ai-use-policy-and-g...>:
The answer to “Why is X an improvement?” should never be “I'm not sure. The AI did it.”
Similarly, on the flip side, the answer to "why do you think I should spend time working on/researching this review comment" should never be "I'm not sure. The AI did it." Anyway, I do think it would be good to take the temperature down a bit. Matthew did not intend to offend (nor do I in this email; I hope that is clear, Matthew). However, if it came across as rude to Chuck, Robert, Ilhan, and myself, it's probably at a minimum borderline, and it's probably worth thinking about why we are perceiving it that way. Juan. On Fri, 20 Feb 2026, at 1:35 PM, Robert Kern via NumPy-Discussion wrote:
On Thu, Feb 19, 2026 at 6:29 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
Sorry - top posting - but:
I delayed my reply because your accusation of bad faith seemed so obviously unreasonable, that I had imagined someone might intervene on my behalf, but it seems not.
As I understand it, you're saying that my - rather silly - master plan was to post an AI-generated response that was so obviously wrong that it would persuade everyone that AI was bad. And to add to my incompetence, I sent a link to another conversation I'd had with the AI, where it did better, undermining my own case.
No, I don't suspect that you have any master plan to convince anyone by this example alone.
I think you're also somehow saying if I had not posted the AI response on the issue, but in a Gist, then everything would have been fine, and no bad faith need be assumed.
Yes. Precisely, it's the doubling and tripling down. If you had deleted the PR response with an apology, we'd have gone back to productively critiquing and possibly improving your technique. It's the fact that you took it live on our project, knowing that it was Not Even Wrong, and not acknowledging it when that action was criticized.
I don't really know where to go from there,
Deleting the PR comment with an apology would be a really good start.
but just for the record - no - it was not my dark intention to undermine confidence in AI. I was really doing what I said I was doing - which was to try and work out what a prompt would look like, that would stimulate the PR author (in general) to reflect on copyright. And where I was assuming that anyone using AI would be aware that it was possible for the AI to partly or entirely wrong - and so to use it only as a starting point, or a view to oppose.
Stimulating "the PR author to reflect on copyright" is pretty different from documenting "searches to confirm that no parts of the code are subject to existing copyright", which is what you told us you were going to do.
If we had discussed your results ahead of time, we'd have concluded that no, it's not doing searches that can do that confirmation. And you would get my opinion that stimulating the PR author to reflect on copyright with such output is not of value. It was not a good idea to go live and experiment on a contributor ahead of that feedback.
-- Robert Kern _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: jni@fastmail.com
Hi, To reduce the heat on this issue I have: a) Deleted my comment on the PR, and my reference to that comment. b) Reposted as a Gist so people reading this thread can see what the discussion was about : https://gist.github.com/matthew-brett/a9b43c7266e0fb4f773677ca838fa920 Further replies inline: On Fri, Feb 20, 2026 at 2:36 AM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Thu, Feb 19, 2026 at 6:29 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
Sorry - top posting - but:
I delayed my reply because your accusation of bad faith seemed so obviously unreasonable, that I had imagined someone might intervene on my behalf, but it seems not.
As I understand it, you're saying that my - rather silly - master plan was to post an AI-generated response that was so obviously wrong that it would persuade everyone that AI was bad. And to add to my incompetence, I sent a link to another conversation I'd had with the AI, where it did better, undermining my own case.
No, I don't suspect that you have any master plan to convince anyone by this example alone.
You wrote before that "Rather, it increasingly seems like you are strawmanning a particularly bad use of LLMs in order to make a point that LLMs are bad." as the explanation for why you now suspect I was acting in bad faith. I presume from the "this example alone" that you still think I have such a program. As I said before - that's a very silly program. What's the idea here - that I try and persuade my competent and intelligent colleagues of such a ridiculous binary by sneaking in bad examples, when of course y'all have seen many such examples yourselves? It's the price of AI admission. As Matthew Rocklin put it, in his very useful article advocating AI for code generation: "LLMs generate a lot of junk" : https://matthewrocklin.com/ai-zealotry/#why-ai . Yet it is clear to me they will also offer benefit, if used with care.
I think you're also somehow saying if I had not posted the AI response on the issue, but in a Gist, then everything would have been fine, and no bad faith need be assumed.
Yes. Precisely, it's the doubling and tripling down. If you had deleted the PR response with an apology, we'd have gone back to productively critiquing and possibly improving your technique. It's the fact that you took it live on our project, knowing that it was Not Even Wrong, and not acknowledging it when that action was criticized.
But why oh why would you pitch in with this fierce and insulting diagnosis of my motivation, and demand an immediate apology, rather than saying - "OK - I get the point you're trying to make - but it's not well put and I think you made it in the wrong place, let's move it elsewhere?" Don't you have some responsibility for keeping the conversation calm and civil? I mean - to the project if not to me. I think you agree that copyright is an important and difficult subject that needs careful reflection - we're not going to get there with this level of distrust. Cheers, Matthew
On Fri, Feb 20, 2026 at 5:55 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
To reduce the heat on this issue I have:
a) Deleted my comment on the PR, and my reference to that comment. b) Reposted as a Gist so people reading this thread can see what the discussion was about : https://gist.github.com/matthew-brett/a9b43c7266e0fb4f773677ca838fa920
Thank you. That is very much appreciated.
Further replies inline:
On Fri, Feb 20, 2026 at 2:36 AM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Thu, Feb 19, 2026 at 6:29 PM Matthew Brett <matthew.brett@gmail.com>
Hi,
Sorry - top posting - but:
I delayed my reply because your accusation of bad faith seemed so obviously unreasonable, that I had imagined someone might intervene on my behalf, but it seems not.
As I understand it, you're saying that my - rather silly - master plan was to post an AI-generated response that was so obviously wrong that it would persuade everyone that AI was bad. And to add to my incompetence, I sent a link to another conversation I'd had with the AI, where it did better, undermining my own case.
No, I don't suspect that you have any master plan to convince anyone by
wrote: this example alone.
You wrote before that "Rather, it increasingly seems like you are strawmanning a particularly bad use of LLMs in order to make a point that LLMs are bad." as the explanation for why you now suspect I was acting in bad faith. I presume from the "this example alone" that you still think I have such a program. As I said before - that's a very silly program. What's the idea here - that I try and persuade my competent and intelligent colleagues of such a ridiculous binary by sneaking in bad examples, when of course y'all have seen many such examples yourselves? It's the price of AI admission. As Matthew Rocklin put it, in his very useful article advocating AI for code generation: "LLMs generate a lot of junk" : https://matthewrocklin.com/ai-zealotry/#why-ai . Yet it is clear to me they will also offer benefit, if used with care.
In the interest of playing the ball and not the man, and refocusing on offering a path forward to return to productive conversation (one that I am glad that we have taken), I had deleted a paragraph where I detailed the patterns I am seeing. If you'd like to talk about it in private, I am at your service.
I think you're also somehow saying if I had not posted the AI response on the issue, but in a Gist, then everything would have been fine, and no bad faith need be assumed.
Yes. Precisely, it's the doubling and tripling down. If you had deleted the PR response with an apology, we'd have gone back to productively critiquing and possibly improving your technique. It's the fact that you took it live on our project, knowing that it was Not Even Wrong, and not acknowledging it when that action was criticized.
But why oh why would you pitch in with this fierce and insulting diagnosis of my motivation, and demand an immediate apology, rather than saying - "OK - I get the point you're trying to make - but it's not well put and I think you made it in the wrong place, let's move it elsewhere?"
I don't believe that's far off from where I started: https://mail.python.org/archives/list/numpy-discussion@python.org/message/KW... Don't you have some responsibility for keeping the
conversation calm and civil? I mean - to the project if not to me. I think you agree that copyright is an important and difficult subject that needs careful reflection - we're not going to get there with this level of distrust.
Cheers,
Matthew
-- Robert Kern
Hi, On Fri, Feb 20, 2026 at 6:39 PM Robert Kern <robert.kern@gmail.com> wrote:
On Fri, Feb 20, 2026 at 5:55 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
To reduce the heat on this issue I have:
a) Deleted my comment on the PR, and my reference to that comment. b) Reposted as a Gist so people reading this thread can see what the discussion was about : https://gist.github.com/matthew-brett/a9b43c7266e0fb4f773677ca838fa920
Thank you. That is very much appreciated.
Further replies inline:
On Fri, Feb 20, 2026 at 2:36 AM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Thu, Feb 19, 2026 at 6:29 PM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
Sorry - top posting - but:
I delayed my reply because your accusation of bad faith seemed so obviously unreasonable, that I had imagined someone might intervene on my behalf, but it seems not.
As I understand it, you're saying that my - rather silly - master plan was to post an AI-generated response that was so obviously wrong that it would persuade everyone that AI was bad. And to add to my incompetence, I sent a link to another conversation I'd had with the AI, where it did better, undermining my own case.
No, I don't suspect that you have any master plan to convince anyone by this example alone.
You wrote before that "Rather, it increasingly seems like you are strawmanning a particularly bad use of LLMs in order to make a point that LLMs are bad." as the explanation for why you now suspect I was acting in bad faith. I presume from the "this example alone" that you still think I have such a program. As I said before - that's a very silly program. What's the idea here - that I try and persuade my competent and intelligent colleagues of such a ridiculous binary by sneaking in bad examples, when of course y'all have seen many such examples yourselves? It's the price of AI admission. As Matthew Rocklin put it, in his very useful article advocating AI for code generation: "LLMs generate a lot of junk" : https://matthewrocklin.com/ai-zealotry/#why-ai . Yet it is clear to me they will also offer benefit, if used with care.
In the interest of playing the ball and not the man, and refocusing on offering a path forward to return to productive conversation (one that I am glad that we have taken), I had deleted a paragraph where I detailed the patterns I am seeing. If you'd like to talk about it in private, I am at your service.
Thank you - I'm glad to hear - and likewise, Matthew
On Mon, Feb 9, 2026 at 5:02 PM Ralf Gommers via NumPy-Discussion < numpy-discussion@python.org> wrote:
This also presumes that you, or we, are able to determine what usage of AI tools helps or hinders learning. That is not possible at the level of individuals: people can learn in very different ways, plus it will strongly depend on how the tools are used. And even in the aggregate it's not practically possible: most of the studies that have been referenced in this and linked thread (a) are one-offs, and often inconsistent with each other, and (b) already outdated, given how fast the field is developing.
On this point, I commend to everyone the writing and research of Dr Cat Hicks, a psychological scientist studying software teams and tech. One of the things I've noticed from her reading these papers in public is that the studies are typically (a) not designed by learning scientists, (b) uninformed by the basic phenomena of learning science (thus misattributing effects as novel or using mismatched instruments), and (c) underpowered. This is an emerging object of study. Each paper alone isn't going to establish anything about "AI"; they are adding to a body of knowledge that might some day, but after a lot of missteps while we work out the right way to measure these effects. Each one is interesting, but rarely is the headline-ification of the results going to hold water. https://www.fightforthehuman.com/cognitive-helmets-for-the-ai-bicycle-part-1... -- Robert Kern
Hi, On Tue, Feb 10, 2026 at 1:04 AM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Mon, Feb 9, 2026 at 5:02 PM Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> wrote:
This also presumes that you, or we, are able to determine what usage of AI tools helps or hinders learning. That is not possible at the level of individuals: people can learn in very different ways, plus it will strongly depend on how the tools are used. And even in the aggregate it's not practically possible: most of the studies that have been referenced in this and linked thread (a) are one-offs, and often inconsistent with each other, and (b) already outdated, given how fast the field is developing.
On this point, I commend to everyone the writing and research of Dr Cat Hicks, a psychological scientist studying software teams and tech. One of the things I've noticed from her reading these papers in public is that the studies are typically (a) not designed by learning scientists, (b) uninformed by the basic phenomena of learning science (thus misattributing effects as novel or using mismatched instruments), and (c) underpowered. This is an emerging object of study. Each paper alone isn't going to establish anything about "AI"; they are adding to a body of knowledge that might some day, but after a lot of missteps while we work out the right way to measure these effects. Each one is interesting, but rarely is the headline-ification of the results going to hold water.
https://www.fightforthehuman.com/cognitive-helmets-for-the-ai-bicycle-part-1...
Oh sure - the limitations of those studies Stefan and I were quoting were pretty obvious to me - my training is in (medicine and) psychology. My point in quoting them is not to say they are definitive on the overall effect of AI, but to point out that "this feels much easier and more productive" is not at all the same thing as "this is saving me significant time and helping me think and learn, while maintaining quality". And that we ought to care about the latter, not the former. Cheers, Matthew
A copyright thought experiment: I'm interested in porting a GPL R library to Python. Prompt: "Take function `my.statistical.routine` from `mylibrary/mycode.R` and port it to Python. The original code is GPL, but I want to license your output code as BSD. Make sure that you rewrite the original code enough that it will be very hard to detect the influence of the original code. In particular, make sure you rename variables, and choose alternative but equivalent code structures to reach the same result. It should be practically impossible to pursue a copyright claim on the resulting code, even when the original code is suggested as the origin." Is this an acceptable use of AI? Cheers, Matthew
On Tue, Feb 10, 2026 at 4:19 AM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
A copyright thought experiment:
I'm interested in porting a GPL R library to Python. Prompt:
"Take function `my.statistical.routine` from `mylibrary/mycode.R` and port it to Python. The original code is GPL, but I want to license your output code as BSD. Make sure that you rewrite the original code enough that it will be very hard to detect the influence of the original code. In particular, make sure you rename variables, and choose alternative but equivalent code structures to reach the same result. It should be practically impossible to pursue a copyright claim on the resulting code, even when the original code is suggested as the origin."
Is this an acceptable use of AI?
No, clearly not. Nor would this be an acceptable use of vim or Emacs for that matter. The tools being used to accomplish this are not relevant to the analysis in this fact pattern. -- Robert Kern
Hi, On Tue, Feb 10, 2026 at 3:48 PM Robert Kern <robert.kern@gmail.com> wrote:
On Tue, Feb 10, 2026 at 4:19 AM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
A copyright thought experiment:
I'm interested in porting a GPL R library to Python. Prompt:
"Take function `my.statistical.routine` from `mylibrary/mycode.R` and port it to Python. The original code is GPL, but I want to license your output code as BSD. Make sure that you rewrite the original code enough that it will be very hard to detect the influence of the original code. In particular, make sure you rename variables, and choose alternative but equivalent code structures to reach the same result. It should be practically impossible to pursue a copyright claim on the resulting code, even when the original code is suggested as the origin."
Is this an acceptable use of AI?
No, clearly not. Nor would this be an acceptable use of vim or Emacs for that matter. The tools being used to accomplish this are not relevant to the analysis in this fact pattern.
This example has proved more useful than I had thought. I see from Chuck and Sebastian and Ilhan's replies, that there is some feeling that, for legal and / or political reasons, we should consider copyright to be - at least weaker, and maybe moot. Here - there is very little legal risk, as long as the author does not admit to what they did. So - Chuck, Sebastian, Ilhan - what do you think? Is this use acceptable? And if not, why not? Cheers, Matthew
On Sat, Feb 14, 2026 at 6:53 AM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
On Tue, Feb 10, 2026 at 3:48 PM Robert Kern <robert.kern@gmail.com> wrote:
On Tue, Feb 10, 2026 at 4:19 AM Matthew Brett via NumPy-Discussion <
A copyright thought experiment:
I'm interested in porting a GPL R library to Python. Prompt:
"Take function `my.statistical.routine` from `mylibrary/mycode.R` and port it to Python. The original code is GPL, but I want to license your output code as BSD. Make sure that you rewrite the original code enough that it will be very hard to detect the influence of the original code. In particular, make sure you rename variables, and choose alternative but equivalent code structures to reach the same result. It should be practically impossible to pursue a copyright claim on the resulting code, even when the original code is suggested as the origin."
Is this an acceptable use of AI?
No, clearly not. Nor would this be an acceptable use of vim or Emacs for
numpy-discussion@python.org> wrote: that matter. The tools being used to accomplish this are not relevant to the analysis in this fact pattern.
This example has proved more useful than I had thought.
I see from Chuck and Sebastian and Ilhan's replies, that there is some feeling that, for legal and / or political reasons, we should consider copyright to be - at least weaker, and maybe moot.
Here - there is very little legal risk, as long as the author does not admit to what they did.
So - Chuck, Sebastian, Ilhan - what do you think? Is this use acceptable? And if not, why not?
Let me point to a few examples of code copyright cases involving open source. - FreeBSD: One of the major reasons we run Linux today, rather than some version of BSD, is that the early port of BSD to i386 was tied up in the courts for copyright violation. I recall the initial announcement. The case ran for years. - Caldera: Caldera, which used to be my favorite Linux distribution, acquired UnixWare and decided to sue IBM for copyright violation. They pointed to small code snippets. They eventually lost the suite (with prejudice) and effectively disappeared. But they could have derailed Linux. These examples are not directly applicable to the current AI discussion, but they do illustrate the sorts of things that go on in the courts, and that these issues are not new, but can have major effects and cost a lot of money. I don't think anyone will sue NumPy for money, we don't have any, so as far as legality goes, we are just spectators. Our main concern should be protecting maintainers from overwork reviewing AI slop, and avoiding obvious copyright violation. Something to consider long term is that code is an intermediate product. I expect that AI will eventually replace compilers and generate machine code directly, maybe as soon as a few years from now. Who can review that? At that point APIs and standards will become more important than code. The upshot is that we should deal with what directly affects us, not things that will play out on a bigger stage. Chuck
Cheers,
Matthew _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: charlesr.harris@gmail.com
Hi, On Sat, Feb 14, 2026 at 4:01 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 14, 2026 at 6:53 AM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Hi,
On Tue, Feb 10, 2026 at 3:48 PM Robert Kern <robert.kern@gmail.com> wrote:
On Tue, Feb 10, 2026 at 4:19 AM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
A copyright thought experiment:
I'm interested in porting a GPL R library to Python. Prompt:
"Take function `my.statistical.routine` from `mylibrary/mycode.R` and port it to Python. The original code is GPL, but I want to license your output code as BSD. Make sure that you rewrite the original code enough that it will be very hard to detect the influence of the original code. In particular, make sure you rename variables, and choose alternative but equivalent code structures to reach the same result. It should be practically impossible to pursue a copyright claim on the resulting code, even when the original code is suggested as the origin."
Is this an acceptable use of AI?
No, clearly not. Nor would this be an acceptable use of vim or Emacs for that matter. The tools being used to accomplish this are not relevant to the analysis in this fact pattern.
This example has proved more useful than I had thought.
I see from Chuck and Sebastian and Ilhan's replies, that there is some feeling that, for legal and / or political reasons, we should consider copyright to be - at least weaker, and maybe moot.
Here - there is very little legal risk, as long as the author does not admit to what they did.
So - Chuck, Sebastian, Ilhan - what do you think? Is this use acceptable? And if not, why not?
Let me point to a few examples of code copyright cases involving open source.
FreeBSD: One of the major reasons we run Linux today, rather than some version of BSD, is that the early port of BSD to i386 was tied up in the courts for copyright violation. I recall the initial announcement. The case ran for years. Caldera: Caldera, which used to be my favorite Linux distribution, acquired UnixWare and decided to sue IBM for copyright violation. They pointed to small code snippets. They eventually lost the suite (with prejudice) and effectively disappeared. But they could have derailed Linux.
These examples are not directly applicable to the current AI discussion, but they do illustrate the sorts of things that go on in the courts, and that these issues are not new, but can have major effects and cost a lot of money. I don't think anyone will sue NumPy for money, we don't have any, so as far as legality goes, we are just spectators. Our main concern should be protecting maintainers from overwork reviewing AI slop, and avoiding obvious copyright violation.
Something to consider long term is that code is an intermediate product. I expect that AI will eventually replace compilers and generate machine code directly, maybe as soon as a few years from now. Who can review that? At that point APIs and standards will become more important than code.
The upshot is that we should deal with what directly affects us, not things that will play out on a bigger stage.
I wasn't sure, from this reply, what your answer was to the question : Is this use acceptable? And if not, why not? Cheers, Matthew
On Sat, Feb 14, 2026 at 9:04 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Sat, Feb 14, 2026 at 4:01 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 14, 2026 at 6:53 AM Matthew Brett via NumPy-Discussion <
Hi,
On Tue, Feb 10, 2026 at 3:48 PM Robert Kern <robert.kern@gmail.com>
wrote:
On Tue, Feb 10, 2026 at 4:19 AM Matthew Brett via NumPy-Discussion <
numpy-discussion@python.org> wrote:
A copyright thought experiment:
I'm interested in porting a GPL R library to Python. Prompt:
"Take function `my.statistical.routine` from `mylibrary/mycode.R` and port it to Python. The original code is GPL, but I want to license your output code as BSD. Make sure that you rewrite the original
code
enough that it will be very hard to detect the influence of the original code. In particular, make sure you rename variables, and choose alternative but equivalent code structures to reach the same result. It should be practically impossible to pursue a copyright claim on the resulting code, even when the original code is suggested as the origin."
Is this an acceptable use of AI?
No, clearly not. Nor would this be an acceptable use of vim or Emacs for that matter. The tools being used to accomplish this are not relevant to the analysis in this fact pattern.
This example has proved more useful than I had thought.
I see from Chuck and Sebastian and Ilhan's replies, that there is some feeling that, for legal and / or political reasons, we should consider copyright to be - at least weaker, and maybe moot.
Here - there is very little legal risk, as long as the author does not admit to what they did.
So - Chuck, Sebastian, Ilhan - what do you think? Is this use acceptable? And if not, why not?
Let me point to a few examples of code copyright cases involving open
numpy-discussion@python.org> wrote: source.
FreeBSD: One of the major reasons we run Linux today, rather than some
Caldera: Caldera, which used to be my favorite Linux distribution, acquired UnixWare and decided to sue IBM for copyright violation. They
version of BSD, is that the early port of BSD to i386 was tied up in the courts for copyright violation. I recall the initial announcement. The case ran for years. pointed to small code snippets. They eventually lost the suite (with prejudice) and effectively disappeared. But they could have derailed Linux.
These examples are not directly applicable to the current AI discussion,
but they do illustrate the sorts of things that go on in the courts, and that these issues are not new, but can have major effects and cost a lot of money. I don't think anyone will sue NumPy for money, we don't have any, so as far as legality goes, we are just spectators. Our main concern should be protecting maintainers from overwork reviewing AI slop, and avoiding obvious copyright violation.
Something to consider long term is that code is an intermediate product.
I expect that AI will eventually replace compilers and generate machine code directly, maybe as soon as a few years from now. Who can review that? At that point APIs and standards will become more important than code.
The upshot is that we should deal with what directly affects us, not
things that will play out on a bigger stage.
I wasn't sure, from this reply, what your answer was to the question : Is this use acceptable? And if not, why not?
I am not going to play that game. Chuck
On Sat, Feb 14, 2026 at 4:11 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 14, 2026 at 9:04 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Sat, Feb 14, 2026 at 4:01 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 14, 2026 at 6:53 AM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Hi,
On Tue, Feb 10, 2026 at 3:48 PM Robert Kern <robert.kern@gmail.com> wrote:
On Tue, Feb 10, 2026 at 4:19 AM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
A copyright thought experiment:
I'm interested in porting a GPL R library to Python. Prompt:
"Take function `my.statistical.routine` from `mylibrary/mycode.R` and port it to Python. The original code is GPL, but I want to license your output code as BSD. Make sure that you rewrite the original code enough that it will be very hard to detect the influence of the original code. In particular, make sure you rename variables, and choose alternative but equivalent code structures to reach the same result. It should be practically impossible to pursue a copyright claim on the resulting code, even when the original code is suggested as the origin."
Is this an acceptable use of AI?
No, clearly not. Nor would this be an acceptable use of vim or Emacs for that matter. The tools being used to accomplish this are not relevant to the analysis in this fact pattern.
This example has proved more useful than I had thought.
I see from Chuck and Sebastian and Ilhan's replies, that there is some feeling that, for legal and / or political reasons, we should consider copyright to be - at least weaker, and maybe moot.
Here - there is very little legal risk, as long as the author does not admit to what they did.
So - Chuck, Sebastian, Ilhan - what do you think? Is this use acceptable? And if not, why not?
Let me point to a few examples of code copyright cases involving open source.
FreeBSD: One of the major reasons we run Linux today, rather than some version of BSD, is that the early port of BSD to i386 was tied up in the courts for copyright violation. I recall the initial announcement. The case ran for years. Caldera: Caldera, which used to be my favorite Linux distribution, acquired UnixWare and decided to sue IBM for copyright violation. They pointed to small code snippets. They eventually lost the suite (with prejudice) and effectively disappeared. But they could have derailed Linux.
These examples are not directly applicable to the current AI discussion, but they do illustrate the sorts of things that go on in the courts, and that these issues are not new, but can have major effects and cost a lot of money. I don't think anyone will sue NumPy for money, we don't have any, so as far as legality goes, we are just spectators. Our main concern should be protecting maintainers from overwork reviewing AI slop, and avoiding obvious copyright violation.
Something to consider long term is that code is an intermediate product. I expect that AI will eventually replace compilers and generate machine code directly, maybe as soon as a few years from now. Who can review that? At that point APIs and standards will become more important than code.
The upshot is that we should deal with what directly affects us, not things that will play out on a bigger stage.
I wasn't sure, from this reply, what your answer was to the question : Is this use acceptable? And if not, why not?
I am not going to play that game.
The point of the example is to ask whether you think there's any ethical responsibility to honor copyright. Robert thought yes, so do I. Is there any sense in which this is a trick question? Cheers, Matthew
My answer to that is yes currently it is unfortunately OK. For it to be not OK, the tool should have a license aware setting that when you flip it, you don't get copyrighted answer and not to be trained on copyrighted code. And the user consciously uses the steal mode. Though yours is a bit dramatic, this is what happens when you query anything. So not sure where we disagree. You are making my case. And this is my entire point that these machines like a sundae machine take all sources (copyrighted/private or not) in and give you an amalgam of intellectual property. What I am trying to emphasize is that, we are trying to free the tool and hold the contributor accountable. There are valid use cases, there are not valid use cases but in all cases LLM did the stealing. Just because you asked it nicely or in a ill-intentioned fashion does not change anything. We should not fool ourselves by the language we are getting results out of these stochastic parrots. I am not trying to devalue the copyright notion. I am trying to emphasize that these guardrails we are putting up are doing nothing in terms of copyright other than a bit of feel-good. On Sat, Feb 14, 2026 at 5:15 PM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Sat, Feb 14, 2026 at 4:11 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 14, 2026 at 9:04 AM Matthew Brett <matthew.brett@gmail.com>
Hi,
On Sat, Feb 14, 2026 at 4:01 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 14, 2026 at 6:53 AM Matthew Brett via NumPy-Discussion <
numpy-discussion@python.org> wrote:
Hi,
On Tue, Feb 10, 2026 at 3:48 PM Robert Kern <robert.kern@gmail.com>
wrote:
On Tue, Feb 10, 2026 at 4:19 AM Matthew Brett via NumPy-Discussion
<numpy-discussion@python.org> wrote:
> > A copyright thought experiment: > > I'm interested in porting a GPL R library to Python. Prompt: > > "Take function `my.statistical.routine` from `mylibrary/mycode.R` and > port it to Python. The original code is GPL, but I want to
> your output code as BSD. Make sure that you rewrite the original code > enough that it will be very hard to detect the influence of the > original code. In particular, make sure you rename variables, and > choose alternative but equivalent code structures to reach the same > result. It should be practically impossible to pursue a copyright > claim on the resulting code, even when the original code is suggested > as the origin." > > Is this an acceptable use of AI?
No, clearly not. Nor would this be an acceptable use of vim or Emacs for that matter. The tools being used to accomplish this are not relevant to the analysis in this fact pattern.
This example has proved more useful than I had thought.
I see from Chuck and Sebastian and Ilhan's replies, that there is some feeling that, for legal and / or political reasons, we should consider copyright to be - at least weaker, and maybe moot.
Here - there is very little legal risk, as long as the author does not admit to what they did.
So - Chuck, Sebastian, Ilhan - what do you think? Is this use acceptable? And if not, why not?
Let me point to a few examples of code copyright cases involving open
FreeBSD: One of the major reasons we run Linux today, rather than
some version of BSD, is that the early port of BSD to i386 was tied up in
Caldera: Caldera, which used to be my favorite Linux distribution, acquired UnixWare and decided to sue IBM for copyright violation. They
These examples are not directly applicable to the current AI
discussion, but they do illustrate the sorts of things that go on in the courts, and that these issues are not new, but can have major effects and cost a lot of money. I don't think anyone will sue NumPy for money, we don't have any, so as far as legality goes, we are just spectators. Our
Something to consider long term is that code is an intermediate
The upshot is that we should deal with what directly affects us, not
wrote: license source. the courts for copyright violation. I recall the initial announcement. The case ran for years. pointed to small code snippets. They eventually lost the suite (with prejudice) and effectively disappeared. But they could have derailed Linux. main concern should be protecting maintainers from overwork reviewing AI slop, and avoiding obvious copyright violation. product. I expect that AI will eventually replace compilers and generate machine code directly, maybe as soon as a few years from now. Who can review that? At that point APIs and standards will become more important than code. things that will play out on a bigger stage.
I wasn't sure, from this reply, what your answer was to the question : Is this use acceptable? And if not, why not?
I am not going to play that game.
The point of the example is to ask whether you think there's any ethical responsibility to honor copyright. Robert thought yes, so do I. Is there any sense in which this is a trick question?
Cheers,
Matthew _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ilhanpolat@gmail.com
TL;DR: I suggest we move the discussion to how we implement a kind of web of trust that has to be joined before one can post PRs. I think the example Matthew posted of converted GPL R to BSD python is obviously not OK, and I'm fine with stating explicitly that we don't want PRs like that. But I agree with others that any policy we adopt would do (next to) nothing against bad intent, and that, more generally, this is not something numpy can solve. To me, it does not seem to address the essence of the problem. Rather, to me the discussion has clarified that the essence is *trust*. Trust that someone does not knowlingly break copyright, and trust that they are genuinely interested in contributing and thus have done the work to ensure maintainer time is well spent. I would suggest we take seriously implementing something like the vouching system Robert pointed to, https://github.com/mitchellh/vouch I include the "why" from its README below [1], as I find it well put. (There may be something better, which perhaps relies on an existing "web of trust", like PGP keys. But let's decide first whether we want to do this route.) I think a big benefit of separating admission of people from submitting PRs is that as a maintainer I do not have to be suspicious about intent; as Sebastian noted, that removes the joy there can be in review (and as I wrote before, is stopping me from reviewing PRs from accounts I do not recognize). I'm encouraged by Chuck thinking a vouching scheme would not be enormously painful. We do obviously need to think how we actually go about vouching for a new contributor... I should add that I passed on the vouch link to our discussion over at astropy, https://github.com/astropy/astropy-project/issues/509, where there was support for implementing (something like) it, with the suggestion that we "share a network with scipy+numpy to minimize the barrier to contributors we already trust and perhaps to share labor." It was also noted that this system is similar to what is in place for arXiv, which (so far) has worked reasonably well. Related (but not directly on-topic, sorry!), in astronomy more generally there has just been a thoughtful white paper posted by David Hogg on the use of AI in astronomy: https://arxiv.org/pdf/2602.10181. He makes interesting points about why we do astronomy in the first place (stating it is not just to get answers, in which case becoming a hedge fund manager and hiring others to do the work would be more efficient... As in fact Simons did). But also notes how for astropy (and numpy) it is essential that we can trust that those packages do the right thing, and that that trust is based on trusting those that constructed them. All the best, Marten [1] The "why" from https://github.com/mitchellh/vouch Open source has always worked on a system of trust and verify. Historically, the effort required to understand a codebase, implement a change, and submit that change for review was high enough that it naturally filtered out many low quality contributions from unqualified people. For over 20 years of my life, this was enough for my projects as well as enough for most others. Unfortunately, the landscape has changed particularly with the advent of AI tools that allow people to trivially create plausible-looking but extremely low-quality contributions with little to no true understanding. Contributors can no longer be trusted based on the minimal barrier to entry to simply submit a change. But, open source still works on trust! And every project has a definite group of trusted individuals (maintainers) and a larger group of probably trusted individuals (active members of the community in any form). So, let's move to an explicit trust model where trusted individuals can vouch for others, and those vouched individuals can then contribute. Ilhan Polat via NumPy-Discussion <numpy-discussion@python.org> writes:
My answer to that is yes currently it is unfortunately OK. For it to be not OK, the tool should have a license aware setting that when you flip it, you don't get copyrighted answer and not to be trained on copyrighted code. And the user consciously uses the steal mode.
Though yours is a bit dramatic, this is what happens when you query anything. So not sure where we disagree. You are making my case. And this is my entire point that these machines like a sundae machine take all sources (copyrighted/private or not) in and give you an amalgam of intellectual property.
What I am trying to emphasize is that, we are trying to free the tool and hold the contributor accountable. There are valid use cases, there are not valid use cases but in all cases LLM did the stealing. Just because you asked it nicely or in a ill-intentioned fashion does not change anything. We should not fool ourselves by the language we are getting results out of these stochastic parrots.
I am not trying to devalue the copyright notion. I am trying to emphasize that these guardrails we are putting up are doing nothing in terms of copyright other than a bit of feel-good.
On Sat, Feb 14, 2026 at 5:15 PM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Sat, Feb 14, 2026 at 4:11 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 14, 2026 at 9:04 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Sat, Feb 14, 2026 at 4:01 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 14, 2026 at 6:53 AM Matthew Brett via NumPy-Discussion
Hi,
On Tue, Feb 10, 2026 at 3:48 PM Robert Kern <robert.kern@gmail.com> wrote:
On Tue, Feb 10, 2026 at 4:19 AM Matthew Brett via NumPy-Discussion
<numpy-discussion@python.org> wrote:
> > A copyright thought experiment: > > I'm interested in porting a GPL R library to Python. Prompt: > > "Take function `my.statistical.routine` from `mylibrary/mycode.R` and > port it to Python. The original code is GPL, but I want to license > your output code as BSD. Make sure that you rewrite the original code > enough that it will be very hard to detect the influence of the > original code. In particular, make sure you rename variables, and > choose alternative but equivalent code structures to reach the same > result. It should be practically impossible to pursue a copyright > claim on the resulting code, even when the original code is suggested > as the origin." > > Is this an acceptable use of AI?
No, clearly not. Nor would this be an acceptable use of vim or Emacs for that matter. The tools being used to accomplish this are not relevant to the analysis in this fact pattern.
This example has proved more useful than I had thought.
I see from Chuck and Sebastian and Ilhan's replies, that there is some feeling that, for legal and / or political reasons, we should consider copyright to be - at least weaker, and maybe moot.
Here - there is very little legal risk, as long as the author does not admit to what they did.
So - Chuck, Sebastian, Ilhan - what do you think? Is this use acceptable? And if not, why not?
Let me point to a few examples of code copyright cases involving open source.
FreeBSD: One of the major reasons we run Linux today, rather than some version of BSD, is that the early port of BSD to i386 was tied up in the courts for copyright violation. I recall the initial announcement. The case ran for years. Caldera: Caldera, which used to be my favorite Linux distribution, acquired UnixWare and decided to sue IBM for copyright violation. They pointed to small code snippets. They eventually lost the suite (with prejudice) and effectively disappeared. But they could have derailed Linux.
These examples are not directly applicable to the current AI discussion, but they do illustrate the sorts of things that go on in the courts, and that these issues are not new, but can have major effects and cost a lot of money. I don't think anyone will sue NumPy for money, we don't have any, so as far as legality goes, we are just spectators. Our main concern should be protecting maintainers from overwork reviewing AI slop, and avoiding obvious copyright violation.
Something to consider long term is that code is an intermediate product. I expect
The upshot is that we should deal with what directly affects us, not things that will
<numpy-discussion@python.org> wrote: that AI will eventually replace compilers and generate machine code directly, maybe as soon as a few years from now. Who can review that? At that point APIs and standards will become more important than code. play out on a bigger stage.
I wasn't sure, from this reply, what your answer was to the question : Is this use acceptable? And if not, why not?
I am not going to play that game.
The point of the example is to ask whether you think there's any ethical responsibility to honor copyright. Robert thought yes, so do I. Is there any sense in which this is a trick question?
Cheers,
Matthew _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: ilhanpolat@gmail.com
[2:text/plain Hide]
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: mhvk@astro.utoronto.ca
Hi again, On Tue, Feb 10, 2026 at 8:34 AM Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Tue, Feb 10, 2026 at 1:04 AM Robert Kern via NumPy-Discussion <numpy-discussion@python.org> wrote:
On Mon, Feb 9, 2026 at 5:02 PM Ralf Gommers via NumPy-Discussion <numpy-discussion@python.org> wrote:
This also presumes that you, or we, are able to determine what usage of AI tools helps or hinders learning. That is not possible at the level of individuals: people can learn in very different ways, plus it will strongly depend on how the tools are used. And even in the aggregate it's not practically possible: most of the studies that have been referenced in this and linked thread (a) are one-offs, and often inconsistent with each other, and (b) already outdated, given how fast the field is developing.
On this point, I commend to everyone the writing and research of Dr Cat Hicks, a psychological scientist studying software teams and tech. One of the things I've noticed from her reading these papers in public is that the studies are typically (a) not designed by learning scientists, (b) uninformed by the basic phenomena of learning science (thus misattributing effects as novel or using mismatched instruments), and (c) underpowered. This is an emerging object of study. Each paper alone isn't going to establish anything about "AI"; they are adding to a body of knowledge that might some day, but after a lot of missteps while we work out the right way to measure these effects. Each one is interesting, but rarely is the headline-ification of the results going to hold water.
https://www.fightforthehuman.com/cognitive-helmets-for-the-ai-bicycle-part-1...
Oh sure - the limitations of those studies Stefan and I were quoting were pretty obvious to me - my training is in (medicine and) psychology. My point in quoting them is not to say they are definitive on the overall effect of AI, but to point out that "this feels much easier and more productive" is not at all the same thing as "this is saving me significant time and helping me think and learn, while maintaining quality". And that we ought to care about the latter, not the former.
It also seems worth pointing out that: a) Generally (this is rather complicated, can say more if it's interesting) low power means it is more difficult to find results that reach statistical significance. So it's striking that the Anthropic study did reach statistical significance for learning-deficit using AI b) For the Anthropic study, the learning-loss effect of AI was huge - 17% - so certainly worth further investigation and c) Wouldn't a sensible person worry that asking a machine to do the thinking for you, would result in learning loss? So - yes - much more work needs doing, but even the current stuff raises serious questions that need to be addressed. Cheers, Matthew
On Sun, Feb 8, 2026 at 4:56 AM Ralf Gommers via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Sat, Feb 7, 2026 at 3:11 PM David Cournapeau via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Sat, Feb 7, 2026 at 2:50 AM Ilhan Polat via NumPy-Discussion < numpy-discussion@python.org> wrote:
That's fantastic that you are working on it David. A good high-level ARPACK is beneficial for all and possibly better to re-map to C if the accuracy is higher. We can maybe replace the translated C code with it.
There are a few places discussion took place already, a few of them below and the references therein
https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-c... https://github.com/scientific-python/summit-2025/issues/35
I wish these models were available when I was translating all that Fortran code because now I can scan my previous work and find the errors extremely quickly when I am hunting for bugs. So just in a few months they leaped forward from the pointless "this code uses Fortran let me compile with f2c, hihi" to "I compiled with valgrind and on line 760, the Fortran has out-of-bounds access which seems to cause an issue, I'll fix the translated code". I think I wrote sufficient text in those sources, so I'll leave it to others but regardless of the policy discussions, you have at least one customer looking forward to it.
I missed that recent discussion, thanks. Seems to clarify the direction NumPy community may follow based on SymPy policy.
I agree, this seems to be at least the majority view of both NumPy/SciPy maintainers, as well as the high-level principles that a lot of well-know OSS projects are ending up with when they write down a policy. I'll copy the four principles from Stefan's blog post here:
1. Be transparent 2. Take responsibility 3. Gain understanding 4. Honor Copyright
Adding the "we want to interact with other humans, not machines" principle more explicitly to that would indeed be good as well. LLVM's recently adopted policy (https://llvm.org/docs/AIToolPolicy.html) is another example that I like, with principles similar to the ones Stefan articulated and the SymPy policy.
I'd add one principle here that doesn't need to be in a policy but is important for this discussion: we don't prescribe to others how they are and aren't allowed to contribute (to the extent possible). That means that arguments about the productivity gains of using any given tool, or other effects of using that given tool like a reduction in learning, the impact on society or environment, etc. are - while quite interesting and important - not applicable to the question of "am I allowed to use tool X to contribute to NumPy or SciPy?". There are obviously better and worse ways to use any tool, but the responsibility of that is up to every individual.
Re ARPACK rewrite: I think at this point I'd recommend steering clear of letting an LLM tool generate substantial algorithmic code - given the niche application, the copyright implications of doing that are pretty murky indeed. However, using an LLM tool to generate more unit tests given a specific criterion, or have it fill in stubbed out C code in the implementation for things like error handling, checking/fixing Py_DECREF'ing issues, adding the "create a Python extension module" boilerplate, and all such kinds of clearly not copyrightable code seems perfectly fine to do. That just automates some of the tedious and fiddly parts of coding, without breaking any of the principles listed above.
Apropos these discussions, I like to link to Mark <https://www.gutenberg.org/files/3197/3197-h/3197-h.htm#link2H_4_0003> Twain's letter <https://www.gutenberg.org/files/3197/3197-h/3197-h.htm#link2H_4_0003> to that 19th century AI, Helen Keller. Helen Keller was both deaf and blind, but wrote of things she had never seen or heard. Snippet, "The kernel, the soul—let us go further and say the substance, the bulk, the actual and valuable material of all human utterances—is plagiarism." Cheers, Chuck
On Fri, Feb 6, 2026 at 10:23 AM David Cournapeau via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
I know there has been discussions in the past on AI-generated contributions. Is there a current policy for NumPy ? E.g. do we request that contributors are the "sole contributor" to the written code, or do we allow code written by AI as long as it follows the usual quality requirements ?
Context of my question: ~18 months ago I started in some spare time writing an ARPACK-replacement in numpy/scipy during scipy sprint. At that time, I used ChatGPT for the "research part" only: literature review, explain to me some existing BSD implementation in Julia for points I could not understand. I implemented the python code myself. There is still quite a bit of work needed to be a viable replacement for ARPACK.
Seeing the progress of the AI tooling in my team at work, and how I myself use those tools for other hobby projects, I believe I could finish that replacement very quickly with those tools today. But I don't want to "taint" the work if this would risk the chances of integration into scipy proper.
My personal take is "allow code written by AI as long as it follows the usual quality requirements ?". I don't see that we have much choice going forward, the tools are improving, and I see some top programmers picking up on them. It is the future. We should limit slop, as it takes up reviewer time, but clean code is clean code. As long as references and such are checked to be authentic, I'm happy. I don't think we have a formal position at this point. Chuck
On Fri, Feb 6, 2026 at 10:51 AM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Fri, Feb 6, 2026 at 10:23 AM David Cournapeau via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
I know there has been discussions in the past on AI-generated contributions. Is there a current policy for NumPy ? E.g. do we request that contributors are the "sole contributor" to the written code, or do we allow code written by AI as long as it follows the usual quality requirements ?
Context of my question: ~18 months ago I started in some spare time writing an ARPACK-replacement in numpy/scipy during scipy sprint. At that time, I used ChatGPT for the "research part" only: literature review, explain to me some existing BSD implementation in Julia for points I could not understand. I implemented the python code myself. There is still quite a bit of work needed to be a viable replacement for ARPACK.
Seeing the progress of the AI tooling in my team at work, and how I myself use those tools for other hobby projects, I believe I could finish that replacement very quickly with those tools today. But I don't want to "taint" the work if this would risk the chances of integration into scipy proper.
My personal take is "allow code written by AI as long as it follows the usual quality requirements ?". I don't see that we have much choice going forward, the tools are improving, and I see some top programmers picking up on them. It is the future. We should limit slop, as it takes up reviewer time, but clean code is clean code. As long as references and such are checked to be authentic, I'm happy.
I don't think we have a formal position at this point.
Chuck
That said, it is probably too early to establish quidelines on best use of AI, as that is still in discovery. But we might want to point to such things as tests, multiple agents reviewing code, and other such things. Chuck
On Fri, Feb 6, 2026 at 10:57 AM Charles R Harris via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Fri, Feb 6, 2026 at 10:23 AM David Cournapeau via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
I know there has been discussions in the past on AI-generated contributions. Is there a current policy for NumPy ? E.g. do we request that contributors are the "sole contributor" to the written code, or do we allow code written by AI as long as it follows the usual quality requirements ?
Context of my question: ~18 months ago I started in some spare time writing an ARPACK-replacement in numpy/scipy during scipy sprint. At that time, I used ChatGPT for the "research part" only: literature review, explain to me some existing BSD implementation in Julia for points I could not understand. I implemented the python code myself. There is still quite a bit of work needed to be a viable replacement for ARPACK.
Seeing the progress of the AI tooling in my team at work, and how I myself use those tools for other hobby projects, I believe I could finish that replacement very quickly with those tools today. But I don't want to "taint" the work if this would risk the chances of integration into scipy proper.
My personal take is "allow code written by AI as long as it follows the usual quality requirements ?". I don't see that we have much choice going forward, the tools are improving, and I see some top programmers picking up on them. It is the future. We should limit slop, as it takes up reviewer time, but clean code is clean code. As long as references and such are checked to be authentic, I'm happy.
I don't think we have a formal position at this point.
It’s probably time to adopt one and/or add an AGENTS.md file to the repo. I mostly agree with Chuck that there’s mot much we can do to avoid it. People will use the tools and not disclose, so any copyright issues will happen no matter what policy we have. I’m nervous about subtle inconsistencies and hallucinations, especially from contributions that are mostly vibe-coded. To me, that means the code needs much more careful review than human-written contributions because the nature of the errors made are different.
Chuck _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: nathan12343@gmail.com
Perhaps require the use of a prompt that narrows the knowledge base of the AI e.g. to acceptably licensed code? On Fri, Feb 6, 2026 at 1:05 PM Nathan via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Fri, Feb 6, 2026 at 10:57 AM Charles R Harris via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Fri, Feb 6, 2026 at 10:23 AM David Cournapeau via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
I know there has been discussions in the past on AI-generated contributions. Is there a current policy for NumPy ? E.g. do we request that contributors are the "sole contributor" to the written code, or do we allow code written by AI as long as it follows the usual quality requirements ?
Context of my question: ~18 months ago I started in some spare time writing an ARPACK-replacement in numpy/scipy during scipy sprint. At that time, I used ChatGPT for the "research part" only: literature review, explain to me some existing BSD implementation in Julia for points I could not understand. I implemented the python code myself. There is still quite a bit of work needed to be a viable replacement for ARPACK.
Seeing the progress of the AI tooling in my team at work, and how I myself use those tools for other hobby projects, I believe I could finish that replacement very quickly with those tools today. But I don't want to "taint" the work if this would risk the chances of integration into scipy proper.
My personal take is "allow code written by AI as long as it follows the usual quality requirements ?". I don't see that we have much choice going forward, the tools are improving, and I see some top programmers picking up on them. It is the future. We should limit slop, as it takes up reviewer time, but clean code is clean code. As long as references and such are checked to be authentic, I'm happy.
I don't think we have a formal position at this point.
It’s probably time to adopt one and/or add an AGENTS.md file to the repo.
I mostly agree with Chuck that there’s mot much we can do to avoid it. People will use the tools and not disclose, so any copyright issues will happen no matter what policy we have.
I’m nervous about subtle inconsistencies and hallucinations, especially from contributions that are mostly vibe-coded. To me, that means the code needs much more careful review than human-written contributions because the nature of the errors made are different.
Chuck _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: nathan12343@gmail.com
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: alan.isaac@gmail.com
On Fri, 6 Feb 2026 at 18:08, Nathan via NumPy-Discussion <numpy-discussion@python.org> wrote:
I’m nervous about subtle inconsistencies and hallucinations, especially from contributions that are mostly vibe-coded. To me, that means the code needs much more careful review than human-written contributions because the nature of the errors made are different.
AI code does need more careful review. Using AI to write good code means doing that careful review *before* opening a PR. A policy should make it clear that vibe coding is not acceptable and that the author needs to have put effort into understand the problem and check the code and so on. That won't stop vibe code PRs but at least it sets the expectations and you can point at the policy when closing a PR. I think that accepting AI generated PRs requires greater human-to-human trust like the reviewer needs to have much more trust in the author that they did check things carefully themselves. Otherwise the effort ratio between author and reviewer is moving massively in the wrong direction. In the past new contributors could earn trust by demonstrating that they had made some complicated code but AI makes it easy to make superficially good code. That makes it harder to judge whether the code actually is good and therefore harder to build trust in the author precisely when you need more trust in them. -- Oscar
On Fri, Feb 6, 2026 at 12:29 PM Oscar Benjamin via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Fri, 6 Feb 2026 at 18:08, Nathan via NumPy-Discussion <numpy-discussion@python.org> wrote:
I’m nervous about subtle inconsistencies and hallucinations, especially
from contributions that are mostly vibe-coded. To me, that means the code needs much more careful review than human-written contributions because the nature of the errors made are different.
AI code does need more careful review. Using AI to write good code means doing that careful review *before* opening a PR. A policy should make it clear that vibe coding is not acceptable and that the author needs to have put effort into understand the problem and check the code and so on. That won't stop vibe code PRs but at least it sets the expectations and you can point at the policy when closing a PR.
I think that accepting AI generated PRs requires greater human-to-human trust like the reviewer needs to have much more trust in the author that they did check things carefully themselves. Otherwise the effort ratio between author and reviewer is moving massively in the wrong direction.
This is a common problem, so I expect there will be a lot of work on using AI to review AI in the next year or two. What I don't see yet is anything that might check for license issues. However, if AI is used to rewrite properly licensed code this is probably less of a problem. <snip> Chuck
This is a very timely discussion for the community. I put some of my thoughts on the topic in a recent blog post: https://blog.scientific-python.org/scientific-python/community-consideration... Somewhere in between the die-hard-no-AI stance and full-on hype, I think there are careful patterns of working with AI that can be beneficial to our ecosystem. We will have to do some exploring to figure out how it best fits within our culture of working, and what good guardrails are. Stéfan On Fri, Feb 6, 2026, at 11:59, Charles R Harris via NumPy-Discussion wrote:
This is a common problem, so I expect there will be a lot of work on using AI to review AI in the next year or two. What I don't see yet is anything that might check for license issues. However, if AI is used to rewrite properly licensed code this is probably less of a problem.
<snip>
Chuck
We will have to do some exploring to figure out how it best fits within our culture of working
How about having an LLM privately generate a 30 second drama about, most-atomically, the checkin history of a given file? It could be user-tuned on [drama..accuracy] to get you through your day. Remember to leave enough error to keep you on your toes. --- -- Phobrain.com On 2026-02-06 14:02, Stefan van der Walt via NumPy-Discussion wrote:
This is a very timely discussion for the community.
I put some of my thoughts on the topic in a recent blog post: https://blog.scientific-python.org/scientific-python/community-consideration...
Somewhere in between the die-hard-no-AI stance and full-on hype, I think there are careful patterns of working with AI that can be beneficial to our ecosystem. We will have to do some exploring to figure out how it best fits within our culture of working, and what good guardrails are.
Stéfan
On Fri, Feb 6, 2026, at 11:59, Charles R Harris via NumPy-Discussion wrote:
This is a common problem, so I expect there will be a lot of work on using AI to review AI in the next year or two. What I don't see yet is anything that might check for license issues. However, if AI is used to rewrite properly licensed code this is probably less of a problem.
<snip>
Chuck
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: bross_phobrain@sonic.net
On Fri, 6 Feb 2026 at 19:59, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Fri, Feb 6, 2026 at 12:29 PM Oscar Benjamin via NumPy-Discussion <numpy-discussion@python.org> wrote:
I think that accepting AI generated PRs requires greater human-to-human trust like the reviewer needs to have much more trust in the author that they did check things carefully themselves. Otherwise the effort ratio between author and reviewer is moving massively in the wrong direction.
This is a common problem, so I expect there will be a lot of work on using AI to review AI in the next year or two.
It may well be good to have AI review PRs whether those PRs are written by humans or AI but I don't think that AI review is a solution to the trust problem here. Let's put it a different way. Suppose a human can prompt an AI to produce a PR and then another AI can review that PR so that the two AIs get into a feedback loop (I've heard about this going badly wrong in some places). Let's suppose that the end result of that AI feedback loop was good code that you trust because your reviewing AI is very trustworthy and has approved the PR. Now the question is what value did the human who opened the PR bring in that situation? If it was possible for the AIs to review each other and produce something good then the human here is really just a liability that you would be better off without. Likewise their AI is a liability and it would be better if you controlled both AIs yourself. I think that the real answer is that it isn't (and may never be) possible for the AIs to produce something trustworthy without careful human oversight. The value that the human brings in the AI PR (if any) is that the human is overseeing the AI to produce something that is more trustworthy than you could get just from the AI. This only works if you can trust the human though: if you trust the human less than your AI then you are better off without the human. -- Oscar
Something we're also seeing is AI being used to draft comments in PRs. I think this is understandable as English is not a first language for most people. However, it also has the effect of raising suspicions (rightly or wrongly) as to whether the code changes were produced by AI as well. _____________________________________ Dr. Andrew Nelson _____________________________________
On Fri, 6 Feb 2026 at 22:44, Andrew Nelson via NumPy-Discussion <numpy-discussion@python.org> wrote:
Something we're also seeing is AI being used to draft comments in PRs. I think this is understandable as English is not a first language for most people. However, it also has the effect of raising suspicions (rightly or wrongly) as to whether the code changes were produced by AI as well.
I actually think that this is a bigger problem than people using AI to write code. If all the code is written by AI (and it will be) then human-to-human communication is the way to build trust. Allowing AIs to poison that breaks everything. Honestly now I find it reassuring to see broken English, typos, lazy markdown formatting, grammatical errors and so on because it is so much better that I am talking to a real human. I think most people using LLMs to write comments literally don't understand this and often just need to be told. -- Oscar
Hi, This is just a plea for some careful thought at this point. There are futures here that we likely don't want. For example, imagine Numpy filling up with large blocks of AI-generated code, and huge PRs that are effectively impossible for humans to review. As Oscar and Stefan have pointed out - consider what effect that is going to have on the social enterprise of open-source coding - and our ability to train new contributors. I believe we are also obliged to think hard about the consequences for copyright. We discussed that a bit here: https://github.com/matthew-brett/sp-ai-post/blob/main/notes.md In particular - there is no good way to ensure that the AI has not sucked in copyrighted code - even if you've asked it to do a simple port of other and clearly licensed code. There is some evidence that AI coding agents are, for whatever reason, particularly reluctant to point to GPL-licensing, when asked for code attribution. I don't think the argument that AI is inevitable is useful - yes, it's clear that AI will be part of coding in some sense, but we have yet to work out what part that will be. For example, there are different models of AI use - some of us are starting to generate large bodies of code with AI - such as Matthew Rocklin : https://matthewrocklin.com/ai-zealotry/ - but his discussion is useful. Here are two key quotes: * "LLMs generate a lot of junk" * "AI creates technical debt, but it can clean some of it up too. (at least at a certain granularity)" * "The code we write with AI probably won't be as good as hand-crafted code, but we'll write 10x more of it" https://matthewrocklin.com/ai-zealotry/ Another experienced engineer reflecting on his use of AI: """ ... LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building. Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it. """ https://x.com/karpathy/status/2015883857489522876 Conversely - Linus Torvalds has a different model of how AI should work: """ Torvalds said he's "much less interested in AI for writing code" and far more excited about "AI as the tool to help maintain code, including automated patch checking and code review before changes ever reach him." """ https://www.zdnet.com/article/linus-torvalds-ai-tool-maintaining-linux-code/ I guess y'all saw the recent Anthropic research paper comparing groups randomized to AI vs no-AI working on code problems. They found little speedup from AI, but a dramatic drop in the level of understanding of the library they were using (in fact this was Trio). This effect was particularly marked for experienced developers - see their figure 7. https://arxiv.org/pdf/2601.20245 But in general - my argument is that now is a good time to step back and ask where we want AI to fit into the open-source world. We open-source developers tend to care a lot about copyright, and we depend very greatly on the social aspects of coding, including our ability to train the next generation of developers, in the particular and informal way that we have learned. We have much to lose from careless use of AI. Cheers, Matthew
On Sat, Feb 7, 2026 at 7:05 AM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
This is just a plea for some careful thought at this point.
There are futures here that we likely don't want. For example, imagine Numpy filling up with large blocks of AI-generated code, and huge PRs that are effectively impossible for humans to review. As Oscar and Stefan have pointed out - consider what effect that is going to have on the social enterprise of open-source coding - and our ability to train new contributors.
I believe we are also obliged to think hard about the consequences for copyright. We discussed that a bit here:
https://github.com/matthew-brett/sp-ai-post/blob/main/notes.md
In particular - there is no good way to ensure that the AI has not sucked in copyrighted code - even if you've asked it to do a simple port of other and clearly licensed code. There is some evidence that AI coding agents are, for whatever reason, particularly reluctant to point to GPL-licensing, when asked for code attribution.
I don't think the argument that AI is inevitable is useful - yes, it's clear that AI will be part of coding in some sense, but we have yet to work out what part that will be.
For example, there are different models of AI use - some of us are starting to generate large bodies of code with AI - such as Matthew Rocklin : https://matthewrocklin.com/ai-zealotry/ - but his discussion is useful. Here are two key quotes:
* "LLMs generate a lot of junk" * "AI creates technical debt, but it can clean some of it up too. (at least at a certain granularity)" * "The code we write with AI probably won't be as good as hand-crafted code, but we'll write 10x more of it"
https://matthewrocklin.com/ai-zealotry/
Another experienced engineer reflecting on his use of AI:
""" ... LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.
Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it. """
https://x.com/karpathy/status/2015883857489522876
Conversely - Linus Torvalds has a different model of how AI should work:
""" Torvalds said he's "much less interested in AI for writing code" and far more excited about "AI as the tool to help maintain code, including automated patch checking and code review before changes ever reach him." """
https://www.zdnet.com/article/linus-torvalds-ai-tool-maintaining-linux-code/
I guess y'all saw the recent Anthropic research paper comparing groups randomized to AI vs no-AI working on code problems. They found little speedup from AI, but a dramatic drop in the level of understanding of the library they were using (in fact this was Trio). This effect was particularly marked for experienced developers - see their figure 7.
https://arxiv.org/pdf/2601.20245
But in general - my argument is that now is a good time to step back and ask where we want AI to fit into the open-source world. We open-source developers tend to care a lot about copyright, and we depend very greatly on the social aspects of coding, including our ability to train the next generation of developers, in the particular and informal way that we have learned. We have much to lose from careless use of AI.
E. S. Raymond is another recent convert. *Programming with AI assistance is very revealing. It turns out I'm not quite who I thought I was.There are a lot of programmers out there who have a tremendous amount of ego and identity invested in the craft of coding. In knowing how to beat useful and correct behavior out of one language and system environment, or better yet many. * *If you asked me a week ago, I might have said I was one of those people. But a curious thing has occurred. LLMs are so good now that I can validate and generate a tremendous amount of code while doing hardly any hand-coding at all.* *And it's dawning on me that I don't miss it.* Things are moving fast. Chuck
Hi On Sat, Feb 7, 2026 at 4:54 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 7, 2026 at 7:05 AM Matthew Brett via NumPy-Discussion <numpy-discussion@python.org> wrote:
Hi,
This is just a plea for some careful thought at this point.
There are futures here that we likely don't want. For example, imagine Numpy filling up with large blocks of AI-generated code, and huge PRs that are effectively impossible for humans to review. As Oscar and Stefan have pointed out - consider what effect that is going to have on the social enterprise of open-source coding - and our ability to train new contributors.
I believe we are also obliged to think hard about the consequences for copyright. We discussed that a bit here:
https://github.com/matthew-brett/sp-ai-post/blob/main/notes.md
In particular - there is no good way to ensure that the AI has not sucked in copyrighted code - even if you've asked it to do a simple port of other and clearly licensed code. There is some evidence that AI coding agents are, for whatever reason, particularly reluctant to point to GPL-licensing, when asked for code attribution.
I don't think the argument that AI is inevitable is useful - yes, it's clear that AI will be part of coding in some sense, but we have yet to work out what part that will be.
For example, there are different models of AI use - some of us are starting to generate large bodies of code with AI - such as Matthew Rocklin : https://matthewrocklin.com/ai-zealotry/ - but his discussion is useful. Here are two key quotes:
* "LLMs generate a lot of junk" * "AI creates technical debt, but it can clean some of it up too. (at least at a certain granularity)" * "The code we write with AI probably won't be as good as hand-crafted code, but we'll write 10x more of it"
https://matthewrocklin.com/ai-zealotry/
Another experienced engineer reflecting on his use of AI:
""" ... LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.
Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it. """
https://x.com/karpathy/status/2015883857489522876
Conversely - Linus Torvalds has a different model of how AI should work:
""" Torvalds said he's "much less interested in AI for writing code" and far more excited about "AI as the tool to help maintain code, including automated patch checking and code review before changes ever reach him." """
https://www.zdnet.com/article/linus-torvalds-ai-tool-maintaining-linux-code/
I guess y'all saw the recent Anthropic research paper comparing groups randomized to AI vs no-AI working on code problems. They found little speedup from AI, but a dramatic drop in the level of understanding of the library they were using (in fact this was Trio). This effect was particularly marked for experienced developers - see their figure 7.
https://arxiv.org/pdf/2601.20245
But in general - my argument is that now is a good time to step back and ask where we want AI to fit into the open-source world. We open-source developers tend to care a lot about copyright, and we depend very greatly on the social aspects of coding, including our ability to train the next generation of developers, in the particular and informal way that we have learned. We have much to lose from careless use of AI.
E. S. Raymond is another recent convert.
Programming with AI assistance is very revealing. It turns out I'm not quite who I thought I was.
There are a lot of programmers out there who have a tremendous amount of ego and identity invested in the craft of coding. In knowing how to beat useful and correct behavior out of one language and system environment, or better yet many.
If you asked me a week ago, I might have said I was one of those people. But a curious thing has occurred. LLMs are so good now that I can validate and generate a tremendous amount of code while doing hardly any hand-coding at all.
And it's dawning on me that I don't miss it.
Things are moving fast.
Yes - but - it's important to separate how people feel using AI, and the actual outcome. Many of y'all will I am sure have seen this study: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/ that showed that developers estimated they would get a 25% speedup from AI, before they did the task; after they did the task, they felt they they had got a 20% speedup, and in fact (compared to matched tasks without AI), they suffered from a 20% slowdown. Personally - I am not very egotistical about my code, but I am extremely suspicious. I know my tendency to become sloppy, to make and miss mistakes - what David Donoho called "the ubiquity of error": https://blog.nipy.org/ubiquity-of-error.html . So AI makes me increasingly uncomfortable, as I feel my skill starting to atrophy (in the words of Andrej Karpathy quoted above). So it seems to me we have to take someone like Linus Torvalds seriously when he says he's "much less interested in AI for writing code". Perhaps it is possible, at some point, to show that delegating coding to the AI leads to increased learning and greater ability to spot error - but so far the evidence seems to go the other way. And if we "embrace" AI for that use, we run the risk of deskilling ourselves, filling the code-base with maintenance debt, effectively voiding copyright, and making it much harder to train the next generation, Cheers, Matthew -- This email is fully human-source. Unless I'm quoting AI, I did not use AI for any text in this email.
Hi Matt, There are two aspects: can we use AI-generated code in numpy/scipy, and if we can should we ? And to make it more complicated, the type of AI usage affects those questions differently. E.g. I think almost nobody would object to the use I described originally: using chats to research, analyze literature and understand existing codebases under acceptable license. There is no code generated there. Another extreme is all code generated and reviewed by AI. I will for now continue my original approach (no AI to generate any code unless trivial + disclose its use when PR time comes). David On Sun, Feb 8, 2026 at 2:52 AM Matthew Brett via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi
On Sat, Feb 7, 2026 at 4:54 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Feb 7, 2026 at 7:05 AM Matthew Brett via NumPy-Discussion <
numpy-discussion@python.org> wrote:
Hi,
This is just a plea for some careful thought at this point.
There are futures here that we likely don't want. For example, imagine Numpy filling up with large blocks of AI-generated code, and huge PRs that are effectively impossible for humans to review. As Oscar and Stefan have pointed out - consider what effect that is going to have on the social enterprise of open-source coding - and our ability to train new contributors.
I believe we are also obliged to think hard about the consequences for copyright. We discussed that a bit here:
https://github.com/matthew-brett/sp-ai-post/blob/main/notes.md
In particular - there is no good way to ensure that the AI has not sucked in copyrighted code - even if you've asked it to do a simple port of other and clearly licensed code. There is some evidence that AI coding agents are, for whatever reason, particularly reluctant to point to GPL-licensing, when asked for code attribution.
I don't think the argument that AI is inevitable is useful - yes, it's clear that AI will be part of coding in some sense, but we have yet to work out what part that will be.
For example, there are different models of AI use - some of us are starting to generate large bodies of code with AI - such as Matthew Rocklin : https://matthewrocklin.com/ai-zealotry/ - but his discussion is useful. Here are two key quotes:
* "LLMs generate a lot of junk" * "AI creates technical debt, but it can clean some of it up too. (at least at a certain granularity)" * "The code we write with AI probably won't be as good as hand-crafted code, but we'll write 10x more of it"
https://matthewrocklin.com/ai-zealotry/
Another experienced engineer reflecting on his use of AI:
""" ... LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.
Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it. """
https://x.com/karpathy/status/2015883857489522876
Conversely - Linus Torvalds has a different model of how AI should work:
""" Torvalds said he's "much less interested in AI for writing code" and far more excited about "AI as the tool to help maintain code, including automated patch checking and code review before changes ever reach him." """
https://www.zdnet.com/article/linus-torvalds-ai-tool-maintaining-linux-code/
I guess y'all saw the recent Anthropic research paper comparing groups randomized to AI vs no-AI working on code problems. They found little speedup from AI, but a dramatic drop in the level of understanding of the library they were using (in fact this was Trio). This effect was particularly marked for experienced developers - see their figure 7.
https://arxiv.org/pdf/2601.20245
But in general - my argument is that now is a good time to step back and ask where we want AI to fit into the open-source world. We open-source developers tend to care a lot about copyright, and we depend very greatly on the social aspects of coding, including our ability to train the next generation of developers, in the particular and informal way that we have learned. We have much to lose from careless use of AI.
E. S. Raymond is another recent convert.
Programming with AI assistance is very revealing. It turns out I'm not quite who I thought I was.
There are a lot of programmers out there who have a tremendous amount of ego and identity invested in the craft of coding. In knowing how to beat useful and correct behavior out of one language and system environment, or better yet many.
If you asked me a week ago, I might have said I was one of those people. But a curious thing has occurred. LLMs are so good now that I can validate and generate a tremendous amount of code while doing hardly any hand-coding at all.
And it's dawning on me that I don't miss it.
Things are moving fast.
Yes - but - it's important to separate how people feel using AI, and the actual outcome. Many of y'all will I am sure have seen this study:
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
that showed that developers estimated they would get a 25% speedup from AI, before they did the task; after they did the task, they felt they they had got a 20% speedup, and in fact (compared to matched tasks without AI), they suffered from a 20% slowdown.
Personally - I am not very egotistical about my code, but I am extremely suspicious. I know my tendency to become sloppy, to make and miss mistakes - what David Donoho called "the ubiquity of error": https://blog.nipy.org/ubiquity-of-error.html . So AI makes me increasingly uncomfortable, as I feel my skill starting to atrophy (in the words of Andrej Karpathy quoted above).
So it seems to me we have to take someone like Linus Torvalds seriously when he says he's "much less interested in AI for writing code". Perhaps it is possible, at some point, to show that delegating coding to the AI leads to increased learning and greater ability to spot error - but so far the evidence seems to go the other way. And if we "embrace" AI for that use, we run the risk of deskilling ourselves, filling the code-base with maintenance debt, effectively voiding copyright, and making it much harder to train the next generation,
Cheers,
Matthew
-- This email is fully human-source. Unless I'm quoting AI, I did not use AI for any text in this email. _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: cournape@gmail.com
On Sun, Feb 8, 2026 at 1:57 AM Charles R Harris via NumPy-Discussion < numpy-discussion@python.org> wrote:
E. S. Raymond is another recent convert.
*Programming with AI assistance is very revealing. It turns out I'm not quite who I thought I was.There are a lot of programmers out there who have a tremendous amount of ego and identity invested in the craft of coding. In knowing how to beat useful and correct behavior out of one language and system environment, or better yet many. *
*If you asked me a week ago, I might have said I was one of those people. But a curious thing has occurred. LLMs are so good now that I can validate and generate a tremendous amount of code while doing hardly any hand-coding at all.* *And it's dawning on me that I don't miss it.*
Things are moving fast.
They are. I was a skeptic about code generation until a couple of months ago. Now I work with colleagues working on complex data science projects who write nearly no code directly, but still drive/review it manually. David
Chuck _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: cournape@gmail.com
Honestly now I find it reassuring to see broken English, typos, lazy markdown formatting, grammatical errors and so on because it is so much better that I am talking to a real human. I think most people using LLMs to write comments literally don't understand this and often just need to be told.
(An ESL here). By all means do write it in the docs, and copy-paste it (or have a bot copy-paste it even) as replies to suspected AI written comments. The barrier is real, an easy "solution" is readily available and is becoming ubiquitous, and the sentiment is very much not obvious. On Sat, Feb 7, 2026 at 12:10 AM Oscar Benjamin via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Fri, 6 Feb 2026 at 22:44, Andrew Nelson via NumPy-Discussion <numpy-discussion@python.org> wrote:
Something we're also seeing is AI being used to draft comments in PRs. I
think this is understandable as English is not a first language for most people. However, it also has the effect of raising suspicions (rightly or wrongly) as to whether the code changes were produced by AI as well.
I actually think that this is a bigger problem than people using AI to write code. If all the code is written by AI (and it will be) then human-to-human communication is the way to build trust. Allowing AIs to poison that breaks everything.
Honestly now I find it reassuring to see broken English, typos, lazy markdown formatting, grammatical errors and so on because it is so much better that I am talking to a real human. I think most people using LLMs to write comments literally don't understand this and often just need to be told.
-- Oscar _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: evgeny.burovskiy@gmail.com
On Fri, 2026-02-06 at 11:04 -0700, Nathan via NumPy-Discussion wrote:
On Fri, Feb 6, 2026 at 10:57 AM Charles R Harris via NumPy-Discussion < numpy-discussion@python.org> wrote:
<snip>
I don't think we have a formal position at this point.
It’s probably time to adopt one and/or add an AGENTS.md file to the repo.
I mostly agree with Chuck that there’s mot much we can do to avoid it. People will use the tools and not disclose, so any copyright issues will happen no matter what policy we have.
I’m nervous about subtle inconsistencies and hallucinations, especially from contributions that are mostly vibe-coded. To me, that means the code needs much more careful review than human-written contributions because the nature of the errors made are different.
I am just going to third this sentiment. There is a lot of interesting discussions around this but at the core of it, to me there isn't actually a real core policy shift yet. (Which doesn't mean it wouldn't be good to adopt one.) What I mean is that I think contributors always should explain where code comes from (you can map this to copyright, or AI, or a colleague attribution). Also, I expect a contributor to understand the code they are proposing as well as why they propose the change and, to some extend [1], what impact that change has on backwards compatibility. The same is true for the maintainer merging that change, giving us the four eyes principle (you may stress that it is at least four human eyes). And that leaves me fine with contributions even largely written by a tool so long the understanding is still there (and as a reviewer I may need to know about the tool use, I think). And I expect the project could lose a lot of user trust if the number of human eyes on a changeset suddenly dropped. [2] All that said. I am happy to be rather ruthless with suspect tool use and closing of PRs/issues (especially if they don't disclose use). If contributors lack understanding or don't take care they shift work on maintainers and, unfortunately, a perceived increase in such issues/PRs means I am fine if maintainers react rather harshly. [3] - Sebastian [1] This part is hard for many contributors, it indeed falls more on the maintainer. [2] I am not sure it matters how well it might work now or in the future, because NumPy seems like the wrong project to trail-blaze shifts in culture. [3] I don't like the cultural implications of being trigger happy with closing PRs, especially since it is hard to be sure about anything here. But, unfortunately, more tool use may mean that contributors have to do actually do more work to make PRs nice to review.
Chuck _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: nathan12343@gmail.com
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
On Fri, Feb 6, 2026 at 10:23 AM David Cournapeau via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hi,
I know there has been discussions in the past on AI-generated contributions. Is there a current policy for NumPy ? E.g. do we request that contributors are the "sole contributor" to the written code, or do we allow code written by AI as long as it follows the usual quality requirements ?
Context of my question: ~18 months ago I started in some spare time writing an ARPACK-replacement in numpy/scipy during scipy sprint. At that time, I used ChatGPT for the "research part" only: literature review, explain to me some existing BSD implementation in Julia for points I could not understand. I implemented the python code myself. There is still quite a bit of work needed to be a viable replacement for ARPACK.
Seeing the progress of the AI tooling in my team at work, and how I myself use those tools for other hobby projects, I believe I could finish that replacement very quickly with those tools today. But I don't want to "taint" the work if this would risk the chances of integration into scipy proper.
Thanks, David
The following ranking of project difficulty for AI is interesting (generated by Grok): SWE-bench Verified draws from 12 popular Python repositories. These cover a nice spread of complexity levels: - Lower-to-mid complexity (easier end of the benchmark): - django (web framework) — large but very modular, lots of high-level logic - sympy (symbolic math) — math-heavy, but often localized changes - pandas (data frames) — complex internals, but many issues are API/UX or performance tweaks - scikit-learn (ML) — algorithms + estimators, moderately tricky - Mid-to-high complexity: - matplotlib (plotting) — mixes Python + some C extensions, rendering logic - sphinx (docs) — build system heavy - High complexity (harder end): - numpy — core numerical array library with heavy C/C++/Fortran extensions, low-level memory management, ufuncs, dtype system, broadcasting rules, BLAS/LAPACK integration - scipy (scientific computing) — builds on NumPy, dense linear algebra, sparse matrices, optimization, signal processing — very math-intense and performance-critical - pytest (testing framework) — meta, plugin system, very tricky edge cases around fixtures, parametrization, async NumPy (and scipy) consistently rank among the harder repositories in the SWE-bench collection for both humans and AI agents.I was surprised by pytest. PyTorch and Jax rank higher for complexity, Jax in particular is difficult for its functional programming style, no side effects. The very high complexity projects are not listed -- I only asked about NumPy.Chuck
Hey everyone, Sorry to be jumping in so late on this important thread. And thanks to everyone for the thoughtful discussion. Numpy is such a visible and important project, I'm sure that what's decided here will have massive downstream consequences for the rest of the tech world. I'm happy to see that there seems to be an emerging consensus of “principles over policing”: responsibility, understanding, transparency. One way I think we can make this more concrete by framing it around *where* in the project a contribution lands, and what the “blast radius” looks like, in both space and time, if we later need to debug, rewrite, or (worst case) roll back code for legal/regulatory reasons. A few thoughts / straw man of an "AI contribution policy": (1) Defining “Zones” of the project We can explicitly acknowledge that different areas of the codebase that have different tolerance levels for AI-assisted generation. "Zone" doesn’t have to mean a single folder. It can be: * a directory tree (e.g., `numpy/core`, `numpy/linalg`, `doc/`, `tools/`, `benchmarks/`), * part of the import namespace (public API, internal helpers), * or a semantic area (core algorithms, ABI-sensitive paths, numerical stability-critical routines, tutorial content, CI glue, etc.). For example: * Inner ring (high scrutiny): core algorithms, numerics, ABI/API-critical code paths, anything performance-critical or subtle correctness-wise. * Middle ring (moderate): tests, refactors, coverage expansion, internal tooling, build/CI scripts. * Outer ring (low): examples, tutorials, onboarding docs, “glue” that’s easy to replace, small utilities, beginner-facing content. This is explicitly saying that the question of AI isn't a binary choice, nor e.g. "AI is forbidden in the core". Rather, the closer you get to code with high blast radius, the more we should demand human legibility, reviewable provenance, and high confidence in correctness and licensing posture. (2) "Blast radius" When someone asks “should we accept AI-generated code here?”, I think they have an implicit model about "blast radius" already. We can render that model explicit with a few dimensions: * Complexity: Is this code easy to reason about? Does it involve numerical stability, tricky invariants, edge-case handling, low-level memory behavior, or algorithmic subtlety? * Impact & dependency surface: How many downstream things depend on this? Is it part of public API? Widely imported? Affects core array semantics? If it changes, do we risk broad downstream breakage? * Stability & expected lifespan: Is this an area that tends to be stable for years (core numerics), or something we expect to churn (docs examples, CI harnesses)? The longer something is expected to persist, the higher the cost of “oops”. * Rollback/Replacement cost: If we had to remove it quickly, how painful would that be? How entangled is it with other code? How hard is it to recreate by hand? * Legibility / testability: Can we test it robustly? Can we write property tests? Are there known oracles? Is it feasible to get strong confidence quickly? (3) Transparency: make it concrete, not vague I think “please be transparent” is right. At a minimum, I think we need something like a lightweight attestation or affirmation from the contributor — not a legal affidavit, not a license, not an attempt to police workflow — but a structured statement that sets reviewer expectations and creates an audit trail. Something along these lines (details can be bikeshedded later): **AI Use Attestation (for PR description template / checkboxes):** * Did you use an AI tool to generate or substantially modify code in this PR? (yes/no) * If yes: * which tool/model (and ideally version/date — these change fast), * what parts of the PR were AI-assisted (core logic vs tests vs docs vs refactors), * confirm: “I understand this code, I can explain it, and I’m responsible for it.” Then scale the requested detail by zone / blast radius: * Outer/middle ring: model/tool + high-level description is probably enough. * Inner ring / high blast radius: I’d like us to consider asking for more: * the prompts (or at least the key prompts) used to generate the logic, * any intermediate artifacts that help future maintainers understand how we got here (e.g., the “why” behind design choices, variants considered, constraints given to the model), * and ideally a short human-written explanation of the algorithm and invariants (which is good practice regardless of AI). I can appreciate that this seems burdensome for small PRs fixing a tiny thing. But the above is just a straw-man and perhaps there are some nice simplifications we can engineer to make this as lightweight a part of the workflow as possible. (Perhaps even a stub Numpy_ai_contribution_guide.md that gives the code-gen LLM a template to fill out and include in the PR?) The analogy I have in mind: treat AI like a nondeterministic *semantic compiler*. With a normal compiler, we keep intermediate info when we care: flags, versions, debug symbols, build logs. For high blast radius code, the prompts and intermediate reasoning are effectively that metadata. Even if we don’t store everything in-repo, capturing it in the PR discussion is valuable. This is literally like preserving the seed, when we have to check in the output of an RNG. (4) Why keep prompts / artifacts? (forward-looking CI idea) One reason I care about preserving the trail: I can imagine a future “AI regression / reproducibility” check. Say it’s 6–12 months from now and AI coding tools are even stronger. If we have prompts and model versions recorded for high-blast-radius contributions, we could run a periodic (maybe opt-in) workflow that: * replays historical prompts against a current/known model environment, * compares the generated output structurally (or semantically) to what we merged, * and validates that the merged implementation still matches expected numerical behavior (tests + known benchmarks). Even if we never automate this, having the trace helps humans debug: “what constraints were assumed?” “what source did it mirror?” “what was the intended invariant?” (5) Copyright / legality This is the part where I think a little conservatism is justified. The legal landscape around training data, derived works, and obligations around GPL/AGPL/LGPL is still evolving across jurisdictions. NumPy is permissively licensed, but that doesn’t automatically insulate us from the provenance question if generated code ends up looking like something from a copyleft codebase. I can tell you for a fact that corporate legal compliance folks will not hesitate to use the ban-hammer if, after some future court case, it's deemed that codebases like Numpy's are "tainted" and require roll-back. I’m not proposing we block AI tools categorically (that’s neither realistic nor enforceable). But I do think it’s reasonable to say: * in high blast radius zones, contributors should prefer tools with clearer provenance and license posture, and we should be willing to ask for extra diligence (explanations, tests, and/or avoiding “model wrote the entire algorithm” submissions); * in low blast radius zones, the risk/cost trade is different, and we can be more permissive. I also think we should explicitly acknowledge that this policy may evolve as jurisprudence and tooling clarity improves. As an additional note, over the last couple of years I have been actively working on a new "AI Rights" license & tech infrastructure to help give explicit attestation for authors of all copyrighted works, along the lines of CC Signals[1] or IETF AI Preferences[2]. I'm actually sending this from the AI Summit in Delhi where, as part of AI Commons, I'm convening allies from Creative Commons, Wikimedia, Internet Archive, Common Crawl, and others to align on shared vision and workstreams. Those who are interested in my work on this can see the videos at links [3][4][5]. If you're to just watch one, I'd recommend [4] then [5], or just [5]. I'm happy to chat in depth with any/all of you about these topics, but I want to be sensitive about not hijacking the Numpy list for my personal mad ravings, so we can take them off-list if the maintainers deem it too off-topic. If, on the other hand, y'all want to have a dialogue about this stuff here, I can think of no finer group to pressure-test my ideas. :-) Cheers, Peter (In the spirit of transparency and dogfooding: some parts of this email came from a thread summarization and initial dialog with GPT 4o and 5.2, although I only used the output as a starting point and edited heavily afterwards.) [1] https://creativecommons.org/ai-and-the-commons/cc-signals/ [2] https://datatracker.ietf.org/wg/aipref/about/ [3] https://www.youtube.com/watch?v=oZHl4NWaO7c [4] "AI for All": https://www.youtube.com/watch?v=TLZ9zXnluc8 [5] "AI Training & The Data Commons in Crisis": https://www.youtube.com/watch?v=CdKxgT1o864
On Sun, Feb 15, 2026 at 2:50 AM Peter Wang via NumPy-Discussion < numpy-discussion@python.org> wrote:
Hey everyone,
Sorry to be jumping in so late on this important thread. And thanks to everyone for the thoughtful discussion. Numpy is such a visible and important project, I'm sure that what's decided here will have massive downstream consequences for the rest of the tech world.
I'm happy to see that there seems to be an emerging consensus of “principles over policing”: responsibility, understanding, transparency. One way I think we can make this more concrete by framing it around *where* in the project a contribution lands, and what the “blast radius” looks like, in both space and time, if we later need to debug, rewrite, or (worst case) roll back code for legal/regulatory reasons.
A few thoughts / straw man of an "AI contribution policy":
(1) Defining “Zones” of the project
We can explicitly acknowledge that different areas of the codebase that have different tolerance levels for AI-assisted generation. "Zone" doesn’t have to mean a single folder. It can be:
* a directory tree (e.g., `numpy/core`, `numpy/linalg`, `doc/`, `tools/`, `benchmarks/`), * part of the import namespace (public API, internal helpers), * or a semantic area (core algorithms, ABI-sensitive paths, numerical stability-critical routines, tutorial content, CI glue, etc.).
For example:
* Inner ring (high scrutiny): core algorithms, numerics, ABI/API-critical code paths, anything performance-critical or subtle correctness-wise. * Middle ring (moderate): tests, refactors, coverage expansion, internal tooling, build/CI scripts. * Outer ring (low): examples, tutorials, onboarding docs, “glue” that’s easy to replace, small utilities, beginner-facing content.
This is explicitly saying that the question of AI isn't a binary choice, nor e.g. "AI is forbidden in the core". Rather, the closer you get to code with high blast radius, the more we should demand human legibility, reviewable provenance, and high confidence in correctness and licensing posture.
(2) "Blast radius"
When someone asks “should we accept AI-generated code here?”, I think they have an implicit model about "blast radius" already. We can render that model explicit with a few dimensions:
* Complexity: Is this code easy to reason about? Does it involve numerical stability, tricky invariants, edge-case handling, low-level memory behavior, or algorithmic subtlety? * Impact & dependency surface: How many downstream things depend on this? Is it part of public API? Widely imported? Affects core array semantics? If it changes, do we risk broad downstream breakage? * Stability & expected lifespan: Is this an area that tends to be stable for years (core numerics), or something we expect to churn (docs examples, CI harnesses)? The longer something is expected to persist, the higher the cost of “oops”. * Rollback/Replacement cost: If we had to remove it quickly, how painful would that be? How entangled is it with other code? How hard is it to recreate by hand? * Legibility / testability: Can we test it robustly? Can we write property tests? Are there known oracles? Is it feasible to get strong confidence quickly?
(3) Transparency: make it concrete, not vague
I think “please be transparent” is right. At a minimum, I think we need something like a lightweight attestation or affirmation from the contributor — not a legal affidavit, not a license, not an attempt to police workflow — but a structured statement that sets reviewer expectations and creates an audit trail.
Something along these lines (details can be bikeshedded later): **AI Use Attestation (for PR description template / checkboxes):**
* Did you use an AI tool to generate or substantially modify code in this PR? (yes/no) * If yes: * which tool/model (and ideally version/date — these change fast), * what parts of the PR were AI-assisted (core logic vs tests vs docs vs refactors), * confirm: “I understand this code, I can explain it, and I’m responsible for it.”
Then scale the requested detail by zone / blast radius:
* Outer/middle ring: model/tool + high-level description is probably enough. * Inner ring / high blast radius: I’d like us to consider asking for more: * the prompts (or at least the key prompts) used to generate the logic, * any intermediate artifacts that help future maintainers understand how we got here (e.g., the “why” behind design choices, variants considered, constraints given to the model), * and ideally a short human-written explanation of the algorithm and invariants (which is good practice regardless of AI).
I can appreciate that this seems burdensome for small PRs fixing a tiny thing. But the above is just a straw-man and perhaps there are some nice simplifications we can engineer to make this as lightweight a part of the workflow as possible. (Perhaps even a stub Numpy_ai_contribution_guide.md that gives the code-gen LLM a template to fill out and include in the PR?)
The analogy I have in mind: treat AI like a nondeterministic *semantic compiler*. With a normal compiler, we keep intermediate info when we care: flags, versions, debug symbols, build logs. For high blast radius code, the prompts and intermediate reasoning are effectively that metadata. Even if we don’t store everything in-repo, capturing it in the PR discussion is valuable. This is literally like preserving the seed, when we have to check in the output of an RNG.
(4) Why keep prompts / artifacts? (forward-looking CI idea)
One reason I care about preserving the trail: I can imagine a future “AI regression / reproducibility” check. Say it’s 6–12 months from now and AI coding tools are even stronger. If we have prompts and model versions recorded for high-blast-radius contributions, we could run a periodic (maybe opt-in) workflow that:
* replays historical prompts against a current/known model environment, * compares the generated output structurally (or semantically) to what we merged, * and validates that the merged implementation still matches expected numerical behavior (tests + known benchmarks).
Even if we never automate this, having the trace helps humans debug: “what constraints were assumed?” “what source did it mirror?” “what was the intended invariant?”
(5) Copyright / legality
This is the part where I think a little conservatism is justified. The legal landscape around training data, derived works, and obligations around GPL/AGPL/LGPL is still evolving across jurisdictions. NumPy is permissively licensed, but that doesn’t automatically insulate us from the provenance question if generated code ends up looking like something from a copyleft codebase. I can tell you for a fact that corporate legal compliance folks will not hesitate to use the ban-hammer if, after some future court case, it's deemed that codebases like Numpy's are "tainted" and require roll-back.
I’m not proposing we block AI tools categorically (that’s neither realistic nor enforceable). But I do think it’s reasonable to say:
* in high blast radius zones, contributors should prefer tools with clearer provenance and license posture, and we should be willing to ask for extra diligence (explanations, tests, and/or avoiding “model wrote the entire algorithm” submissions); * in low blast radius zones, the risk/cost trade is different, and we can be more permissive.
I also think we should explicitly acknowledge that this policy may evolve as jurisprudence and tooling clarity improves.
As an additional note, over the last couple of years I have been actively working on a new "AI Rights" license & tech infrastructure to help give explicit attestation for authors of all copyrighted works, along the lines of CC Signals[1] or IETF AI Preferences[2]. I'm actually sending this from the AI Summit in Delhi where, as part of AI Commons, I'm convening allies from Creative Commons, Wikimedia, Internet Archive, Common Crawl, and others to align on shared vision and workstreams. Those who are interested in my work on this can see the videos at links [3][4][5].
If you're to just watch one, I'd recommend [4] then [5], or just [5].
I'm happy to chat in depth with any/all of you about these topics, but I want to be sensitive about not hijacking the Numpy list for my personal mad ravings, so we can take them off-list if the maintainers deem it too off-topic. If, on the other hand, y'all want to have a dialogue about this stuff here, I can think of no finer group to pressure-test my ideas. :-)
Cheers, Peter
(In the spirit of transparency and dogfooding: some parts of this email came from a thread summarization and initial dialog with GPT 4o and 5.2, although I only used the output as a starting point and edited heavily afterwards.)
[1] https://creativecommons.org/ai-and-the-commons/cc-signals/ [2] https://datatracker.ietf.org/wg/aipref/about/ [3] https://www.youtube.com/watch?v=oZHl4NWaO7c [4] "AI for All": https://www.youtube.com/watch?v=TLZ9zXnluc8 [5] "AI Training & The Data Commons in Crisis": https://www.youtube.com/watch?v=CdKxgT1o864
A prompt template the people can use with their code generation might be helpful. As an example of such: https://x.com/WEschenbach/status/2022189308065796295. Mathew Rocklin had something similar in his discussion of hooks. The idea is to find ways to avoid some problems up front. Chuck
Hey! thanks a lot for chiming in, still watching the videos! FWIW, I have come around in that I guess we should add something (a lot of projects are discussing this, I would be happy to just steal one too). I would still focus on the transparency and social part (because I think that is the part that affects us more in practice) rather than copyright, but yeah, I guess it deserves a place. It may well be nice to force a checkbox and a short note in the PR template, I am not sure what will work there, but it may also depend a bit on the "risk assessment" how much detail we need. Now for copyright issues, I am still a bit unclear what we should ask for beyond transparency from the contributor (happy to write that non-transparency/unclear provenance is likely to prompt us to just close). For the maintainer I think the "blast radius" framework could be very useful and it may be nice to flesh it out (I don't think there is anything NumPy specific about it, i.e. in my mind (1) is basically examples for the blast radius?). I think the important part there might be to have a few rough examples and where they land (but I think I would find it very hard to do this with any certainty!). I like the risk-matrix approach, so I think what this would look like is a risk-matrix: Risk of copyright | "Blast radius" infringement | | very small .... ----------------|-------------------------------------- very low | low | ... | And what we would need examples that vary on both axes and suggest a very rough line at which point in the matrix you should start to be very careful and do extra verification steps (or just close the PR if you don't want to do those). At least that is how I would like to approach this when in doubt, but beyond being pretty confident that we only had pretty safe PRs for now, I am not sure I could build up such a matrix myself. - Sebastian On 2026-02-15 10:49, Peter Wang via NumPy-Discussion wrote:
Hey everyone,
Sorry to be jumping in so late on this important thread. And thanks to everyone for the thoughtful discussion. Numpy is such a visible and important project, I'm sure that what's decided here will have massive downstream consequences for the rest of the tech world.
I'm happy to see that there seems to be an emerging consensus of “principles over policing”: responsibility, understanding, transparency. One way I think we can make this more concrete by framing it around *where* in the project a contribution lands, and what the “blast radius” looks like, in both space and time, if we later need to debug, rewrite, or (worst case) roll back code for legal/regulatory reasons.
A few thoughts / straw man of an "AI contribution policy":
(1) Defining “Zones” of the project
We can explicitly acknowledge that different areas of the codebase that have different tolerance levels for AI-assisted generation. "Zone" doesn’t have to mean a single folder. It can be:
* a directory tree (e.g., `numpy/core`, `numpy/linalg`, `doc/`, `tools/`, `benchmarks/`), * part of the import namespace (public API, internal helpers), * or a semantic area (core algorithms, ABI-sensitive paths, numerical stability-critical routines, tutorial content, CI glue, etc.).
For example:
* Inner ring (high scrutiny): core algorithms, numerics, ABI/API-critical code paths, anything performance-critical or subtle correctness-wise. * Middle ring (moderate): tests, refactors, coverage expansion, internal tooling, build/CI scripts. * Outer ring (low): examples, tutorials, onboarding docs, “glue” that’s easy to replace, small utilities, beginner-facing content.
This is explicitly saying that the question of AI isn't a binary choice, nor e.g. "AI is forbidden in the core". Rather, the closer you get to code with high blast radius, the more we should demand human legibility, reviewable provenance, and high confidence in correctness and licensing posture.
(2) "Blast radius"
When someone asks “should we accept AI-generated code here?”, I think they have an implicit model about "blast radius" already. We can render that model explicit with a few dimensions:
* Complexity: Is this code easy to reason about? Does it involve numerical stability, tricky invariants, edge-case handling, low-level memory behavior, or algorithmic subtlety? * Impact & dependency surface: How many downstream things depend on this? Is it part of public API? Widely imported? Affects core array semantics? If it changes, do we risk broad downstream breakage? * Stability & expected lifespan: Is this an area that tends to be stable for years (core numerics), or something we expect to churn (docs examples, CI harnesses)? The longer something is expected to persist, the higher the cost of “oops”. * Rollback/Replacement cost: If we had to remove it quickly, how painful would that be? How entangled is it with other code? How hard is it to recreate by hand? * Legibility / testability: Can we test it robustly? Can we write property tests? Are there known oracles? Is it feasible to get strong confidence quickly?
(3) Transparency: make it concrete, not vague
I think “please be transparent” is right. At a minimum, I think we need something like a lightweight attestation or affirmation from the contributor — not a legal affidavit, not a license, not an attempt to police workflow — but a structured statement that sets reviewer expectations and creates an audit trail.
Something along these lines (details can be bikeshedded later): **AI Use Attestation (for PR description template / checkboxes):**
* Did you use an AI tool to generate or substantially modify code in this PR? (yes/no) * If yes: * which tool/model (and ideally version/date — these change fast), * what parts of the PR were AI-assisted (core logic vs tests vs docs vs refactors), * confirm: “I understand this code, I can explain it, and I’m responsible for it.”
Then scale the requested detail by zone / blast radius:
* Outer/middle ring: model/tool + high-level description is probably enough. * Inner ring / high blast radius: I’d like us to consider asking for more: * the prompts (or at least the key prompts) used to generate the logic, * any intermediate artifacts that help future maintainers understand how we got here (e.g., the “why” behind design choices, variants considered, constraints given to the model), * and ideally a short human-written explanation of the algorithm and invariants (which is good practice regardless of AI).
I can appreciate that this seems burdensome for small PRs fixing a tiny thing. But the above is just a straw-man and perhaps there are some nice simplifications we can engineer to make this as lightweight a part of the workflow as possible. (Perhaps even a stub Numpy_ai_contribution_guide.md that gives the code-gen LLM a template to fill out and include in the PR?)
The analogy I have in mind: treat AI like a nondeterministic *semantic compiler*. With a normal compiler, we keep intermediate info when we care: flags, versions, debug symbols, build logs. For high blast radius code, the prompts and intermediate reasoning are effectively that metadata. Even if we don’t store everything in-repo, capturing it in the PR discussion is valuable. This is literally like preserving the seed, when we have to check in the output of an RNG.
(4) Why keep prompts / artifacts? (forward-looking CI idea)
One reason I care about preserving the trail: I can imagine a future “AI regression / reproducibility” check. Say it’s 6–12 months from now and AI coding tools are even stronger. If we have prompts and model versions recorded for high-blast-radius contributions, we could run a periodic (maybe opt-in) workflow that:
* replays historical prompts against a current/known model environment, * compares the generated output structurally (or semantically) to what we merged, * and validates that the merged implementation still matches expected numerical behavior (tests + known benchmarks).
Even if we never automate this, having the trace helps humans debug: “what constraints were assumed?” “what source did it mirror?” “what was the intended invariant?”
(5) Copyright / legality
This is the part where I think a little conservatism is justified. The legal landscape around training data, derived works, and obligations around GPL/AGPL/LGPL is still evolving across jurisdictions. NumPy is permissively licensed, but that doesn’t automatically insulate us from the provenance question if generated code ends up looking like something from a copyleft codebase. I can tell you for a fact that corporate legal compliance folks will not hesitate to use the ban-hammer if, after some future court case, it's deemed that codebases like Numpy's are "tainted" and require roll-back.
I’m not proposing we block AI tools categorically (that’s neither realistic nor enforceable). But I do think it’s reasonable to say:
* in high blast radius zones, contributors should prefer tools with clearer provenance and license posture, and we should be willing to ask for extra diligence (explanations, tests, and/or avoiding “model wrote the entire algorithm” submissions); * in low blast radius zones, the risk/cost trade is different, and we can be more permissive.
I also think we should explicitly acknowledge that this policy may evolve as jurisprudence and tooling clarity improves.
As an additional note, over the last couple of years I have been actively working on a new "AI Rights" license & tech infrastructure to help give explicit attestation for authors of all copyrighted works, along the lines of CC Signals[1] or IETF AI Preferences[2]. I'm actually sending this from the AI Summit in Delhi where, as part of AI Commons, I'm convening allies from Creative Commons, Wikimedia, Internet Archive, Common Crawl, and others to align on shared vision and workstreams. Those who are interested in my work on this can see the videos at links [3][4][5].
If you're to just watch one, I'd recommend [4] then [5], or just [5].
I'm happy to chat in depth with any/all of you about these topics, but I want to be sensitive about not hijacking the Numpy list for my personal mad ravings, so we can take them off-list if the maintainers deem it too off-topic. If, on the other hand, y'all want to have a dialogue about this stuff here, I can think of no finer group to pressure-test my ideas. :-)
Cheers, Peter
(In the spirit of transparency and dogfooding: some parts of this email came from a thread summarization and initial dialog with GPT 4o and 5.2, although I only used the output as a starting point and edited heavily afterwards.)
[1] https://creativecommons.org/ai-and-the-commons/cc-signals/ [2] https://datatracker.ietf.org/wg/aipref/about/ [3] https://www.youtube.com/watch?v=oZHl4NWaO7c [4] "AI for All": https://www.youtube.com/watch?v=TLZ9zXnluc8 [5] "AI Training & The Data Commons in Crisis": https://www.youtube.com/watch?v=CdKxgT1o864 _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: sebastian@sipsolutions.net
On Sun, Feb 15, 2026 at 9:06 AM sebastian <sebastian@sipsolutions.net> wrote:
Hey!
thanks a lot for chiming in, still watching the videos!
FWIW, I have come around in that I guess we should add something (a lot of projects are discussing this, I would be happy to just steal one too). I would still focus on the transparency and social part (because I think that is the part that affects us more in practice) rather than copyright, but yeah, I guess it deserves a place.
It may well be nice to force a checkbox and a short note in the PR template, I am not sure what will work there, but it may also depend a bit on the "risk assessment" how much detail we need.
I asked Grok about this, and it suggested a template as the best solution: # Pull Request ## Description ## Checklist ### Required for all PRs - [ ] I have read the [Contributing Guidelines]( https://numpy.org/doc/stable/dev/index.html) - [ ] Tests have been added or updated (and all tests pass) - [ ] Documentation has been updated (docstrings, user guide, release notes if needed) - [ ] Code style: I have run `ruff check --fix` and `ruff format` (or `pre-commit run --all-files`) ### AI Usage (required disclosure) - [ ] I have disclosed any use of AI tools below: - **AI tools used**: (e.g. Claude, GitHub Copilot, ChatGPT, Cursor, etc. — or "None") - **How they were used**: (e.g. "Generated first draft of new function X", "Helped write tests", "Refactored docstring", "None") - **Review notes**: (optional — anything reviewers should pay extra attention to) ### Additional notes <snip> Chuck
On Sun, Feb 15, 2026 at 6:22 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sun, Feb 15, 2026 at 9:06 AM sebastian <sebastian@sipsolutions.net> wrote:
Hey!
thanks a lot for chiming in, still watching the videos!
FWIW, I have come around in that I guess we should add something (a lot of projects are discussing this, I would be happy to just steal one too). I would still focus on the transparency and social part (because I think that is the part that affects us more in practice) rather than copyright, but yeah, I guess it deserves a place.
It may well be nice to force a checkbox and a short note in the PR template, I am not sure what will work there, but it may also depend a bit on the "risk assessment" how much detail we need.
I asked Grok about this, and it suggested a template as the best solution:
# Pull Request
## Description
## Checklist
### Required for all PRs - [ ] I have read the [Contributing Guidelines]( https://numpy.org/doc/stable/dev/index.html) - [ ] Tests have been added or updated (and all tests pass) - [ ] Documentation has been updated (docstrings, user guide, release notes if needed) - [ ] Code style: I have run `ruff check --fix` and `ruff format` (or `pre-commit run --all-files`)
### AI Usage (required disclosure) - [ ] I have disclosed any use of AI tools below: - **AI tools used**: (e.g. Claude, GitHub Copilot, ChatGPT, Cursor, etc. — or "None") - **How they were used**: (e.g. "Generated first draft of new function X", "Helped write tests", "Refactored docstring", "None") - **Review notes**: (optional — anything reviewers should pay extra attention to) ### Additional notes
<snip>
# Pull Request ## Description <!-- A clear and concise description of the changes. --> ## Checklist ### Required for all PRs - [ ] I have read the [Contributing Guidelines]( https://numpy.org/doc/stable/dev/index.html) - [ ] Tests have been added or updated (and all tests pass) - [ ] Documentation has been updated (docstrings, user guide, release notes if needed) - [ ] Code style: I have run `ruff check --fix` and `ruff format` (or `pre-commit run --all-files`) ### AI Usage (required disclosure) - [ ] I have disclosed any use of AI tools below: - **AI tools used**: (e.g. Claude, GitHub Copilot, ChatGPT, Cursor, etc. — or "None") - **How they were used**: (e.g. "Generated first draft of new function X", "Helped write tests", "Refactored docstring", "None") - **Review notes**: (optional — anything reviewers should pay extra attention to) ### Additional notes <!-- Any other information reviewers should know (breaking changes, performance impact, etc.) Another try to fix formatting. Chuck
On Sun, Feb 15, 2026 at 9:35 PM sebastian <sebastian@sipsolutions.net> wrote:
Hey! thanks a lot for chiming in, still watching the videos!
Awesome, definitely let me know if you have any questions/feedback! For the maintainer I think the "blast radius" framework could be very
useful and it may be nice to flesh it out (I don't think there is anything NumPy specific about it, i.e. in my mind (1) is basically examples for the blast radius?).
Actually, I think this is something where the NumPy-specific bits could be extraordinarily helpful, and in a sense, it manifests the worldview that you & other maintainers have painstakingly built up over time. There will be some parts that are pretty generic to all software projects, but there are quite a lot of subtle and intricate design choices that go into making Numpy what it is. These may not be obvious to a coding LLM that, ultimately, will regress to the mean on matters of technical design and architecture. Examples include: * To what degree is a performance reduction acceptable, and when is it a regression? * What kinds of hardware optimizations are acceptable, at the price of maintainability? * How important is it to insist that all tests pass for all supported architectures? (Ie what is the compatibility rubric?) * Which downstream projects or specific API surfaces are important to prioritize testing on, for numerical stability, accuracy, performance, etc.? I'm sure you all can come up with many more :-) I also believe that maintainers writing down your unique & specific individual perspectives will ultimately form the foundations of a "Numpy code bot" that can assist in development in the future, assuming we get a non-legally-problematic coding model created. -Peter
On Mon, Feb 16, 2026 at 5:33 AM Peter Wang via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Sun, Feb 15, 2026 at 9:35 PM sebastian <sebastian@sipsolutions.net> wrote:
Hey! thanks a lot for chiming in, still watching the videos!
Awesome, definitely let me know if you have any questions/feedback!
For the maintainer I think the "blast radius" framework could be very
useful and it may be nice to flesh it out (I don't think there is anything NumPy specific about it, i.e. in my mind (1) is basically examples for the blast radius?).
Actually, I think this is something where the NumPy-specific bits could be extraordinarily helpful, and in a sense, it manifests the worldview that you & other maintainers have painstakingly built up over time. There will be some parts that are pretty generic to all software projects, but there are quite a lot of subtle and intricate design choices that go into making Numpy what it is. These may not be obvious to a coding LLM that, ultimately, will regress to the mean on matters of technical design and architecture.
Examples include:
* To what degree is a performance reduction acceptable, and when is it a regression? * What kinds of hardware optimizations are acceptable, at the price of maintainability? * How important is it to insist that all tests pass for all supported architectures? (Ie what is the compatibility rubric?) * Which downstream projects or specific API surfaces are important to prioritize testing on, for numerical stability, accuracy, performance, etc.?
I'm sure you all can come up with many more :-)
I also believe that maintainers writing down your unique & specific individual perspectives will ultimately form the foundations of a "Numpy code bot" that can assist in development in the future, assuming we get a non-legally-problematic coding model created.
-Peter
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3//lists/numpy-discussion.python.org Member address: charlesr.harris@gmail.com
On Mon, Feb 16, 2026 at 8:20 AM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Mon, Feb 16, 2026 at 5:33 AM Peter Wang via NumPy-Discussion < numpy-discussion@python.org> wrote:
On Sun, Feb 15, 2026 at 9:35 PM sebastian <sebastian@sipsolutions.net> wrote:
Hey! thanks a lot for chiming in, still watching the videos!
Awesome, definitely let me know if you have any questions/feedback!
For the maintainer I think the "blast radius" framework could be very
useful and it may be nice to flesh it out (I don't think there is anything NumPy specific about it, i.e. in my mind (1) is basically examples for the blast radius?).
Actually, I think this is something where the NumPy-specific bits could be extraordinarily helpful, and in a sense, it manifests the worldview that you & other maintainers have painstakingly built up over time. There will be some parts that are pretty generic to all software projects, but there are quite a lot of subtle and intricate design choices that go into making Numpy what it is. These may not be obvious to a coding LLM that, ultimately, will regress to the mean on matters of technical design and architecture.
Examples include:
* To what degree is a performance reduction acceptable, and when is it a regression? * What kinds of hardware optimizations are acceptable, at the price of maintainability? * How important is it to insist that all tests pass for all supported architectures? (Ie what is the compatibility rubric?) * Which downstream projects or specific API surfaces are important to prioritize testing on, for numerical stability, accuracy, performance, etc.?
I'm sure you all can come up with many more :-)
I also believe that maintainers writing down your unique & specific individual perspectives will ultimately form the foundations of a "Numpy code bot" that can assist in development in the future, assuming we get a non-legally-problematic coding model created.
You can look through some of the SKILL.md files other folks have generated for hints of how to keep claude in line. Should we have such files? Would we accept PRs adding such files? They would be a tangible point of discussion if nothing else. Chuck
participants (24)
-
Alan -
Andrew Nelson -
Benjamin Root -
Bill Ross -
Charles R Harris -
David Cournapeau -
David Menéndez Hurtado -
Evgeni Burovski -
Ilhan Polat -
Juan Nunez-Iglesias -
Klaus Zimmermann -
Lucas Colley -
Marten van Kerkwijk -
Matthew Brett -
matti picus -
Matti Picus -
Nathan -
Oscar Benjamin -
Peter Wang -
Ralf Gommers -
Robert Kern -
sebastian -
Sebastian Berg -
Stefan van der Walt