[Edu-sig] Things to come

Dinu C. Gherman gherman@darwin.in-berlin.de
Mon, 15 May 2000 19:00:23 +0200


Hello,

although I do not see *that much* value in discussing formats
on this list at length, I'd like to add some short comments 
about PDF. I'm certainly not an expert, but I did write
a semi-useful PDF parser and have used some tools that others
may not have.

PDF was never designed to be a collaborative editing format.
It was designed to blur the lines between paper and web,
while adding certain multimedia capabilities that PostScript
did not have at all. Ultimately, it is still a page
description language like PS, although much enriched by
now. So one can hardly blame PDF/PS for not being something
it never pretended to be.

The comparison with executable binaries is closer to the
truth, although it is by no means impossible to do something
other than print a PDF on paper or view it onscreen. In fact,
there is a whole "prepress" industry of companies providing 
PDF modification tools, AGFA being one of the bigger fish in 
the pool. 

While this is likely often limited to color and font
manipulation (PDF does *not* contain only graphics), it is
absolutely *not* impossible to do more with PDF. As I promised
to be brief, just this much: about 5 years ago I was using an
application called Taylor on NeXTSTEP that would let you edit
PS code *graphically* on something like a Pentium-90, and PDF
is more or less just wrapped, enriched PostScript...

PDF is a proprietary format only in the sense that it was
largely (fully?) developed by one company. But with the entire
format being openly specified, I can see no point in saying
this is some evil thing limiting people's freedom to edit a
PDF document. Nobody in the world prevents you from writing
your own PDF generation kit that puts the same text you see
on the visible part of the file into some hidden, compressed,
searchable (?) ASCII stream that you can rather easily
extract with even simple Python tools...
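
To make "simple Python tools" a bit more concrete, here is a
minimal sketch, not a real PDF parser. It assumes the text sits
in Flate-compressed (or uncompressed) streams and is shown with
the Tj operator as plain literal strings without escaped
parentheses. The name extract_text is made up for illustration.

```python
import re
import zlib

# Everything between the "stream" and "endstream" keywords of an object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A literal string followed by the Tj (show text) operator.
# Simplification: escaped or nested parentheses are not handled.
TEXT_RE = re.compile(rb"\((.*?)\)\s*Tj", re.DOTALL)


def extract_text(pdf_bytes):
    """Return the strings shown with Tj, in the order they occur."""
    found = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj ignores the end-of-line left after the data
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Assumption: anything that is not zlib data is plain text.
            data = raw
        for text in TEXT_RE.findall(data):
            found.append(text.decode("latin-1"))
    return found
```

Calling extract_text(open("some.pdf", "rb").read()) would return
a list of strings for files that match these assumptions; real
documents with other filters, font encodings or TJ arrays need
considerably more work.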

So, let's not make this a religious war without a good reason.

Regards,

Dinu

-- 
Dinu C. Gherman
................................................................
"The only possible values [for quality] are 'excellent' and 'in-
sanely excellent', depending on whether lives are at stake or 
not. Otherwise you don't enjoy your work, you don't work well, 
and the project goes down the drain." 
                    (Kent Beck, "Extreme Programming Explained")



Steve Morris wrote:
> 
> Kirby Urner writes:
>  > BTW, I don't consider PDFs all that web-unready (despite
>  > previous poster's remarks).  Just download and print.  Sure,
>  > you can't easily change the text, but not all published
>  > documents are open source code in that sense.
> 
> As "previous poster" mentioned above, allow me to comment. I didn't say
> that PDFs were web-unready. PDF is an excellent format for
> paper-replacing electronic publishing where copyright protection is the issue
> and the author wishes to protect and control content. PDF is a
> graphical format which strips all intellectual content from electronic
> representation and turns it into a picture. This is a feature if you
> want to limit electronic access to the data (editing, searching, copy
> and paste etc.) Also PDF is a portable graphical format that can be
> represented on any OS and printer.
> 
> I am not an FSF fanatic that thinks all intellectual property should
> be free. There is a place for protecting access and control of
> copyright material. It is an important part of commerce AND
> innovation. The creator and/or owner of intellectual property has the
> legal right to put controls on that property and this is a good
> thing. The point I want to make is that using PDF `IS' a control on user
> access and probably shouldn't be used if that is not the intent.
> 
> Specifically (although perhaps not clearly) what I was suggesting is
> that PDF is a poor format for collaborative efforts where the users
> might be expected to modify and enhance the materials (translate to
> other formats etc) and perhaps contribute the enhancements back to the
> community. Thus it is a poor format for documentation intended for the
> public domain or its various sourceware equivalents.
> 
> Putting it another way pdf is the documentation equivalent of
> distributing software as executables and not source code.
> 
>  > As an author/writer, I produce finished products, as well
>  > as works in progress.  The concept of plagiarism still
>  > holds (i.e. if it's not your own work, don't pretend that
>  > it is; credit your sources).
> 
> Hmmm... I will give the benefit of the doubt and assume that this
> comment was general and not a characterization of people who dislike
> pdf file format; even though the juxtaposition made me a little queasy
> when I read it. I'm from the old school. I believe that when students
> cheat they should be expelled and only allowed back in with reasonable
> proof of contrition. Equivalent fate should follow professional
> plagiarism or other forms of misrepresentation.