[Tutor] Borrowing restricted code

Steven D'Aprano steve at pearwood.info
Wed Dec 5 19:45:43 EST 2018


On Wed, Dec 05, 2018 at 11:22:35AM -0500, Avi Gross wrote:

> I am NOT advocating copying code. Not even "free" code.
> 
> I am saying there may be times you want to package the code for special
> purposes.

"Packaging" the code IS copying the code.


> Perhaps someone can enlighten me on a subtle aspect here.
> 
> What does it mean to include something?

https://en.wiktionary.org/wiki/include


[... description of how #include works in C ...]
> The above is not meant to be a precise description but indicates that your
> code was rarely all your own. You could literally set it up so if someone
> else left their code unguarded, you could arrange to grab text or library
> code into your finished executable.

The precise details of what an individual programming language means by 
"including" code will, naturally, depend on the language in question.

This is not really the place to go into a detailed description of all 
the possible variations (this is supposed to be a *Python* forum after 
all) but briefly there are at least two kinds of code which we might 
"include": source code and compiled object code.

When "including" source code, the interpreter might read a command like 
"load spam", look up a file "spam", read the contents into memory, parse 
it and run it through the interpreter as if it had been copied and 
pasted into the original file in place of the "load" line.
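
A rough Python sketch of that kind of textual inclusion, for comparison 
with import. The include() helper and the sample source text are made up 
for illustration; nothing in Python itself works this way by default.

```python
def include(source_text, namespace, filename="<included>"):
    """Run source_text as if it had been pasted into `namespace`."""
    # compile() turns the text into a code object; exec() then runs it
    # using the caller's namespace rather than a fresh module namespace.
    code = compile(source_text, filename, "exec")
    exec(code, namespace)


# The "including" code already defines `greeting`...
namespace = {"greeting": "hello"}
# ...and the included text can see it and add names alongside it.
include("message = greeting + ', spam'", namespace)
print(namespace["message"])
```

The included text shares the caller's names directly, which is what 
distinguishes this from import, where the code runs in its own module 
namespace.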

When "including" object code, the compiler (or often a separate program 
called a linker) can make a copy of the object code and insert that code 
directly into the object code it is generating. This is called "static 
linking".

An alternative is to leave the object code where it is, but instead 
insert instructions for how to access it at runtime. This is called 
"dynamic linking".

https://kb.iu.edu/d/akqn

Python's import is (I think) closer to dynamic linking than the others.


> In python, you can generally have access to the python code of what you can
> import or at least to the byte code. But there may be different rules
> attached to several categories.
> 
> If you use "import" you don't so much copy as USE the code. I mean the
> interpreter pauses evaluating the current file and opens the one you asked
> for and reads it as if it were your code.

That is not entirely wrong, but it's not quite right either.

The action of the import command is surprisingly complex, and so I may 
have got some subtle details wrong. But the high-level overview of what 
the interpreter does when you run "import spam" is as follows.

1. Look for an already loaded module "spam"; if the interpreter finds 
one, it creates a new variable called "spam" in the current namespace, 
and assigns that module object to that name. The import is complete.

# Pseudo-code
if "spam" in sys.modules:
    spam = sys.modules["spam"]
    return


2. If no such already loaded module, then search a set of known 
locations for a library called "spam":

# Pseudo-code
for location in sys.path:
    for filename in os.listdir(location):
        name, ext = os.path.splitext(filename)
        if name == "spam":
            if ext == ".py":
                read the source code from spam.py into memory
                compile it into bytecode in memory
                write out the bytecode to spam.pyc
            elif ext == ".pyc":
                read the bytecode from spam.pyc into memory
            else:
                # other cases handled here, e.g. packages, zip files,
                # C libraries (.dll or .so), other extensions etc.
            # Assuming we get to here...
            create a module object in memory
            run that bytecode, using that module object as the namespace
            cache the module object in sys.modules['spam']
            spam = module object
            return
# no such "spam" module or package found
raise ImportError
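
The real search can be inspected without actually importing anything, 
using importlib.util.find_spec from the standard library. This is only 
a sketch: the describe() helper is my own name, and the module names are 
arbitrary examples.

```python
import importlib.util
import sys


def describe(name):
    """Report where the import system would find the module `name`."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing on the search path matched; a real import would fail.
        return None
    return {
        "name": spec.name,
        "origin": spec.origin,             # usually a file path
        "already_loaded": name in sys.modules,
    }


print(describe("json"))
print(describe("sys"))
print(describe("no_such_module_spam_xyz"))
```

For a module that cannot be found, find_spec returns None rather than 
raising an exception.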


So you can see that modules are only executed the first time the 
interpreter imports them, not on subsequent imports.
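
A small experiment to check that claim. It writes a throwaway module 
(the name "spam_demo" is invented for this sketch) into a temporary 
directory, imports it twice, and counts how many times the module body 
actually ran.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

MODULE_NAME = "spam_demo"  # hypothetical module, created just for this test


def import_twice():
    """Import a fresh module twice; return (same object?, times executed)."""
    with tempfile.TemporaryDirectory() as tmp:
        # The module body prints a line every time it is executed.
        with open(os.path.join(tmp, MODULE_NAME + ".py"), "w") as f:
            f.write("print('executing spam_demo')\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        captured = io.StringIO()
        try:
            with contextlib.redirect_stdout(captured):
                first = importlib.import_module(MODULE_NAME)
                second = importlib.import_module(MODULE_NAME)
        finally:
            # Tidy up so the experiment leaves no trace behind.
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
    return first is second, captured.getvalue().count("executing")


same_object, executions = import_twice()
print(same_object, executions)
```

The second import finds the module in sys.modules and just hands back the 
same module object, so the print in the module body runs only once.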

There really isn't a distinction to make between code treated "as if it 
were your code" and other code. All code is treated the same.

How could it not be? The interpreter cannot know which modules or 
libraries you wrote, which were collaborative efforts between you and 
other people, and which were written by other people.



> Some actions are transient. A line
> like "5+3" evaluates to 8 and you ignore it. But many and perhaps most lines
> end up creating instantiations of objects (yes, classes are also objects)
> which then sit in memory.

Or get used and then disposed of by the garbage collector.


> Functions are also objects and pretty much contain
> within them everything you need to reconstruct the function if you know
> where to look. Is what is copied really different than the original text
> used? 

Yes.
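
To illustrate: a function object carries compiled bytecode plus some 
metadata, not the text you typed. In this sketch add_spam is just a 
made-up example function.

```python
def add_spam(n):
    # This comment exists only in the source file, not in the function.
    return n + 1


code = add_spam.__code__
print(type(code.co_code))   # the compiled bytecode, as a bytes object
print(code.co_varnames)     # names of the local variables
print(code.co_filename)     # where the source came from, not the source
```

Comments and layout are gone from the function object; tools that show 
you the source, such as inspect.getsource, go back and read the original 
file.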


> If you import the base modules that come with python, can you assume it is
> freely given with few strings other than the kind discussed?

There is no need to assume anything: you can read the licence. When you 
start the interactive interpreter, it says:

Type "help", "copyright", "credits" or "license" for more information.


so the licence is always at your fingertips.
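
The copyright notice is also available from code, via the sys module. A 
minimal sketch (the helper name is my own):

```python
import sys


def copyright_lines():
    """Return the non-empty lines of the interpreter's copyright notice."""
    return [line for line in sys.copyright.splitlines() if line.strip()]


for line in copyright_lines():
    print(line)
```

At the interactive prompt, typing license() pages through the full 
licence text.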

If that's too much trouble, then you can trust the millions of other 
people who use Python every day, and believe them when they say the 
interpreter and the standard library are available under a permissive 
licence.

Unless you are distributing a copy of Python or its libraries to others, 
you don't need to care about the details. That's the point of using a 
permissive licence.


[...]
> Back to the main topic, for me. What kind of uses might be considered legal
> or at least not worth prosecuting?

Under the current copyright maximalist domain, virtually nothing is 
legal unless you are the copyright owner, or have a licence from the 
copyright owner. (Disclaimer: I am not a lawyer.)

You may have the right to *read* the code (if it isn't a Trade Secret), 
but that's about as far as it goes. In principle, copying the code in 
any way (whether by a literal "copy and paste" into your text editor, or 
by merely re-implementing the code from memory after reading it) could 
be deemed to be either copyright infringement or plagiarism or both.

Especially under the current academic standards of zero-tolerance for 
so-called "plagiarism" in the US, which are extreme and hypocritical. 
But that's off-topic.

There's a theoretical "Fair Use" right in some countries (but not in 
Australia!) that could allow you to copy small, trivial chunks, perhaps 
as much as 10% of the work, under certain circumstances, but lawyers 
LOVE arguing about what is or isn't Fair Use because such arguments can 
go on and on for months.

There are also theoretical arguments that the code in question was so 
trivial that there should be no copyright on it in the first place. Just 
as you can't copyright the sentence "I ate a spam sandwich" in isolation 
(only as part of a more substantial work), so you can't copyright a 
trivial isolated line of code. Again, lawyers love arguing about what is 
or isn't trivial.

In principle, you might even argue independent invention. If you 
happened to come up with substantially the same work independently, 
perhaps because there is only a single obvious way to do something, then 
no copying took place and so you may be deemed to have not infringed 
copyright. (But independent invention is no defence against patent 
infringement.)

So you can see why many organisations are so paranoid about having 
licences for every line of code they use. Failure to be fully licenced 
could be *incredibly* time-consuming and expensive if they get into a 
legal dispute. The only way to win is to avoid getting into a legal 
dispute in the first place.

As for what is "not worth prosecuting", there are no copyright police 
who troll the internet looking for copied lines of code. Nobody is going 
to be scanning your Github repos looking for infringement (at least not 
yet, and I'm sure that by now the anti-plagiarism software industry is 
looking to do something like that...)

Generally the copyright owner has to bring a civil complaint and sue 
you.

In the USA, the TSA and immigration do look for infringing software (at 
least sometimes) but they mostly care about commercial-scale piracy. If 
the customs official is feeling especially obnoxious and/or diligent, 
they might demand to know why you don't have a Windows licence key on 
your laptop[1] or poke around looking for pirated videos on behalf of 
Hollywood and co. Deep pockets speak loudly.

But they're not going to open up your Python folder and demand to see 
licences for everything or question whether or not you copy code from 
Stackoverflow without permission.


> The law in various countries likely
> differs, Alan and I are definitely using different legal systems but
> possibly not as different as where Asad is. (His phone number would be in
> India.)

Most countries in the world abide by the Berne convention on 
copyrights, so probably not as different as you may be imagining.

India, in particular, is a signatory to the Berne Convention, the 
Universal Copyright Convention of both Geneva and Paris, and the TRIPS 
agreement (but not the WIPO Copyright Treaty) so the differences will be 
relatively minor.

https://en.wikipedia.org/wiki/Berne_Convention

https://en.wikipedia.org/wiki/List_of_parties_to_international_copyright_agreements



> Clearly if you make a product and sell it, then borrowing code can
> be a serious thing. 

Indeed. Most FOSS (Free Open Source Software) projects turn a blind eye 
to trivial non-compliance unless it involves commercial products. Even 
Richard Stallman is probably not going to chase you for copying a few 
lines of GPLed software.


> If you are a student doing a homework assignment and
> will never use it again, few might care especially if you use small amounts
> and don't claim it is your own work.

I wouldn't be so sure about that, especially in US macademia. (Warning: 
may contain nuts.) Their rules for plagiarism are insanely strict and 
exceedingly hypocritical, because the macademics enforcing those rules 
against students don't even come close to living up to the same 
standards themselves.


> But the question I would like answered is DOES IT MAKE A DIFFERENCE how you
> borrow the use of the code? 

Not legally. Copying is copying, whether you cut and paste in a text 
editor, drag and drop a file from one disk to another, download it from 
the Pirate Bay, or retype it from memory.


> We talked about the normal intended method:
> 
> Load the file into a place on your local machine.

That's a copy. If you are not permitted to make that copy, you are 
already infringing right there.


> Write one or more python files and one of them imports it.
> 
> A second method I suggested WHEN NEEDED AND ALLOWED is to find the code you
> need, perhaps on another machine where you can download it, or in some other
> public repository. Copy the minimum you need into your own code, with
> perhaps enough attribution and explanation.

That's a copy. If you are not permitted to make that copy, you are
already infringing right there.

In addition, "enough attribution" may not be sufficient to avoid charges 
of plagiarism.

Example: I've seen at least one case of a student given a reprimand for 
plagiarism after giving a one-line quote, acknowledging the author, but 
failing to give a formal reference including page number. 



> Now what about a third way? No edit needed. Simply use a program like "cat"
> to concatenate multiple files

That's a copy. If you are not permitted to make that copy, you are
already infringing right there.

(How do you get the copy of the file you pass to cat, if not by making a 
copy?)






[1] Because it is unthinkable that you might be running some OS other 
than Windows, one which doesn't require a licence key.


-- 
Steve

