Can anyone recommend a good introduction to C...

Alex Martelli aleaxit at yahoo.com
Wed Mar 7 11:42:36 EST 2001


"Werner Schiendl" <ws-news at gmx.at> wrote in message
news:983967284.246700 at newsmaster-04.atnet.at...
> > > That is of course a point, if you need to let the program inspect you will
> > > try to keep it as simple as possible.
> > > But given the same functionality, I think a C++ program will not be more
> > > complex.
> >
> > The program will not be, the language is.  It's a hard-to-judge tradeoff.
> >
>
> As to my knowledge, the application itself is inspected for safety critical
> appliances.
> And you must not change anything afterwards in the software or you need to
> re-inspect the software.
>
> What use would it be to check the language?

Complete, detailed, total knowledge of the many tripwires that the
C++ language deploys for you to stumble on is needed on the part of
the code-inspection team (ideally, all members thereof).  This is
rarer than you might think -- I'm easily in the upper centile of C++
programmers according to Brainbench's tests, yet I'm fully aware
that there _are_ areas I'm fuzzy on (and, yes, they DO come up by
mistake time and again).

Further, a language that is very complicated is likelier to have
buggy implementations.  Again, the bugs need not be in obscure areas
*that you can easily avoid*; for example, name-resolution (and you
can hardly avoid NAMES, can you?-) does NOT fully conform to the
C++ standard in several popular C++ implementations (Microsoft, Borland,
gcc -- last time I looked at each, but that wasn't *AGES* ago:-).

A language with a simpler standard makes it likelier...:
    [A] that all code-inspectors will be fully aware of
        each and every provision of the language,
    [B] that implementations will be reasonably bug-free
        and fully implement the language standard (at
        least if said standard HAS been around a while:-).

This is what makes using a complicated language such as C++ a hard
to judge tradeoff for security-critical code.  On one hand, *IF* you
could fully trust the C++ implementation (libraries included), the
amount of code to be painstakingly checked would be vastly reduced
(a factor of two would not be surprising!); on the other hand, the
checking is going to be harder (in terms of cost-per-line-checked),
since the checkers need complete knowledge of the complicated (sigh)
language, *AND* said language has a nasty habit of implicitly
invoking code on your behalf -- it's easier to check if you SEE
everything that is happening, but, in C++, you often don't (ctors
and dtors in particular are being invoked left and right every
time you breathe, it seems at times:-).
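A minimal sketch of that implicit invocation, assuming a made-up type
Tracked (not from any library) that merely counts how often its
constructors and destructor run.  The call by_value(t) names no
constructor at all, yet one runs:

```cpp
// Hypothetical type: counts how often its special members run.
struct Tracked {
    static int constructions;  // default plus copy constructions
    static int destructions;
    Tracked() { ++constructions; }
    Tracked(const Tracked&) { ++constructions; }
    ~Tracked() { ++destructions; }
};
int Tracked::constructions = 0;
int Tracked::destructions = 0;

// Takes its argument by value, so every call copies the caller's object.
void by_value(Tracked) {}

// Returns how many constructor calls a single by_value(t) call produced.
int constructions_for_one_call() {
    Tracked t;                            // the one construction you can see
    int before = Tracked::constructions;
    by_value(t);                          // copy constructor runs implicitly here
    return Tracked::constructions - before;
}
```

An inspector reading only the call site sees a plain function call;
the copy constructor and the matching destructor have to be inferred
from the signature of by_value.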


> > I do not: C++ *forces* "any number of features" on you -- you can't
> > easily subset it and ENSURE features outside of the subset are not
> > intruding.
>
> Why this?

I guess it's because it was never designed with "ability to subset
easily and effectively" among the design-goals.  Lord knows it had
enough extremely hard design goals already, and its complication
stems from stretching to meet them all.

> You can use templates. But you can leave them at the side with no ill
> effects.

But then you say goodbye to the Standard Library -- it's just about
fully implemented in terms of templates!  Without templates, no
strings, no data structures, no algorithms, no auto-pointers...
(oops, I seem to have slipped into recitation of the Heart of
Wisdom Sutra -- but, hey, it *IS* true...!-).

In practice, templates are not optional -- you may strive to avoid
defining new ones, but, if you are to get any substantial utility
out of C++, it's a lost cause to avoid using all the ones that the
standard library is built of.

(If the choice for a project was, C++ or C, in full awareness of
all the tradeoffs, I would generally choose C++; but, if it was
"C++ without templates" or C, I would have ABSOLUTELY _NO_ DOUBTS
choosing C -- if you must cut out 90% of C++'s immediate pluses,
you might as well go all out for simplicity).

> You can use multiple inheritance. But, as well, you can leave it out if you
> don't really need or like it.

Yes, multiple inheritance CAN be fully avoided -- it doesn't happen
"behind your back" like other things do:-).

Things you _cannot_ avoid include, for example, exceptions.  They
are a VERY powerful mechanism (and an excellent reason to choose
C++: error-processing is absolutely essential), but _simple_ they
aren't (I'm not claiming they could have been made any simpler
within the design-goals of C++, mind you).  But, there's more...:

> You can use function/method overloading. But you can as well select a
> distinct name.

You cannot set your C++ compiler to turn off the overloading
feature.  And overloading IS an easy thing to do by mistake.

Its existence synergizes with templates, and subtleties of
name-resolution, to manufacture splendidly-subtle bugs, ones
very likely to escape even the most-careful and knowledgeable
human code-inspector (you MAY be able to get third party code
checking tools that CAN be set to warn on policy violations,
including the existence of overloading -- I have not had much
luck with my past experience of such source-checkers, but it
HAS been a while in this case, so things may have changed).
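One well-known way overload resolution can surprise, as a hedged
sketch: the function name describe is invented for illustration.
A string literal decays to a pointer, and pointer-to-bool is a
standard conversion, which outranks the user-defined conversion to
std::string:

```cpp
#include <string>

// Two hypothetical overloads that look unambiguous at a glance.
std::string describe(bool) { return "bool"; }
std::string describe(const std::string&) { return "string"; }

// Passing a string literal picks the bool overload, since
// const char* -> bool is a standard conversion while
// const char* -> std::string is a user-defined one.
std::string describe_literal() { return describe("hello"); }

// Passing an actual std::string picks the overload one expects.
std::string describe_string() { return describe(std::string("hello")); }
```

Neither overload is wrong on its own; the trouble only appears once
both exist, which is exactly what a line-by-line inspection tends
to miss.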

Avoiding overloads of *NON*-renameable functions (where the
language does NOT allow you to select a distinct name and
still get the same functionality), constructors above all,
is also basically impossible.  How CAN you avoid, at the
_very_ least, overloading constructors for default and copy
(for canonical classes)?!

And those (overloaded) functions *DON'T GET EXPLICITLY
NAMED IN MANY CASES* -- you find yourself _magically and
implicitly_ invoking a constructor (maybe of a temporary
object...) and the related destructor... all inserted for
you, behind the scenes, care of the compiler.  Nor can
you avoid 'slicing' if you do ANY public inheritance --
the language is designed so you can't avoid it, because
if A extends B, an A object IS accepted wherever a
const B& is desired (e.g., the copy constructor of B,
which is magically invoked anytime a B object is taken
by-value as a function argument, for example).
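A small sketch of the slicing just described, with hypothetical
class names Base and Derived (standing in for B and A above).  The
by-value parameter is built through Base's copy constructor, so the
Derived part is silently dropped:

```cpp
#include <string>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const { return "Derived"; }  // overrides Base::name
};

// By value: the argument is copied into a plain Base, losing the Derived part.
std::string name_by_value(Base b) { return b.name(); }

// By reference: no copy is made, so the dynamic type survives.
std::string name_by_ref(const Base& b) { return b.name(); }
```

Given a Derived object d, name_by_value(d) yields "Base" while
name_by_ref(d) yields "Derived"; the two call sites look identical.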

> You can use the new streams syntax introduced with C++, and it may have
> additional benefits.
> But nobody hinders you using the traditional APIs read, write, and friends.

Formatting becomes a huge problem if you decide to eschew
iostreams, as the language then offers no other language-supplied
way to perform formatting (read and write are Unix _system calls_
related to UN-formatted I/O, and hardly a solution to that).
In practice, you'd have to build (or acquire and certify) a
completely separate formatting-and-IO library (as you would for
C, since its 'standard I/O' is among its weakest parts -- the
BSD I/O library, where explicit buffer-size arguments are always
passed, would be a much preferable approach).
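As one illustration of the "explicit buffer-size argument" style,
here is a sketch using std::snprintf (standardised later, in C99 and
C++11, so an assumption relative to the compilers discussed here).
The helper name format_reading and the 32-byte buffer are made up
for the example:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats "label=value" into a fixed-size buffer.
// Returns an empty string if formatting fails or the result would not fit.
std::string format_reading(const char* label, int value) {
    char buffer[32];
    // The buffer size is passed explicitly, so snprintf never writes past it.
    int needed = std::snprintf(buffer, sizeof buffer, "%s=%d", label, value);
    if (needed < 0 || static_cast<std::size_t>(needed) >= sizeof buffer) {
        return std::string();  // error, or output would have been truncated
    }
    return std::string(buffer, static_cast<std::size_t>(needed));
}
```

The size travels with every call, so an inspector can check each
call site locally instead of reasoning about buffer sizes elsewhere.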


> > > family of classes does not have that kind of problem.
> >
> > No, but it does have other issues -- e.g.,
> >     std::string oneletter('A', 1);
> > does NOT do what one might expect, nor does it give any warning
> > in the implementations I have at hand (oneletter becomes a string
> > of 65 characters all equal to '\001', not one of 1 character
> > equal to 'A' -- oops, arguments the wrong way around...!-).
>
> Yes, trading characters as integral types is maybe not a very good decision
> from today's view on things.

But, knowing the language has made this non-changeable-now decision,
the constructors of std::string _should_ have been carefully designed
to avoid smacking head-on against it.  This is one of the few aspects
of the standard C++ libraries that I consider an outright defect.
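For reference, a short sketch of the two argument orders side by
side, assuming an ASCII execution character set (so 'A' == 65); the
function names are invented.  Both calls select the (count,
character) constructor, since char and int convert to each other
implicitly:

```cpp
#include <string>

// Arguments the wrong way around: 'A' becomes the count (65 in ASCII)
// and 1 becomes the fill character '\001'.
std::string swapped_arguments() { return std::string('A', 1); }

// Intended order: count first, then the fill character.
std::string intended_arguments() { return std::string(1, 'A'); }
```

Both lines compile cleanly, and whether any compiler warns about
the first depends on the implementation and its warning settings.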

> However, I guess that would not do that much bad and at least not introduce
> a security problem that turns out after years the software is in use.

If any serious testing is done (and it had BETTER be, NO MATTER
how carefully one code-inspects!), then NO problem should ever
'turn out after years the software is in use'.

If testing is skimped on, then "the buffer which is dimensioned
differently from what the code author and inspectors think it
is" MAY be excremental matter suspended over a ventilator and
just waiting for the thin thread it hangs on to snap -- today,
tomorrow, or three years from now.  E.g.,

    static const int buffer_size = 100;
    std::string largebuffer('X', buffer_size);
    if(data_to_copy.size() > buffer_size) {
        throw too_much_data_for_the_buffer();
    }
    memcpy(largebuffer.data(), data_to_copy.data(),
        data_to_copy.size());
    // or std::copy with all the .begin() and .end()
    // you wish -- just to give a false sense of
    // security...:-)

The author and inspectors may even feel particularly proud
of the named buffer_size, the careful use of const rather
than bleak old #define, the painstaking checking, and the
exception raised if the check fails...!-)

If, during tests, the data_to_copy.size() happens to be all
the time <= 88 OR > 100, then here we have a buffer overrun
just waiting to happen... the first time data_to_copy has
a size between 89 and 100, inclusive.
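Where the 88 comes from, plus one possible repair, as a sketch
under stated assumptions: ASCII is assumed ('X' == 88), and the
function names and the choice of std::length_error are mine, not
part of the snippet above.  The repair builds the buffer with the
count first and checks against the buffer's actual size rather than
the constant:

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

const std::size_t buffer_size = 100;

// The swapped constructor call: 'X' (88 in ASCII) is taken as the count,
// and buffer_size is narrowed to the fill character.
std::size_t swapped_buffer_length() {
    return std::string('X', buffer_size).size();
}

// Hypothetical corrected version: count first, fill character second,
// and the check uses buffer.size(), which cannot disagree with the buffer.
std::string copy_into_buffer(const std::string& data_to_copy) {
    std::string buffer(buffer_size, 'X');
    if (data_to_copy.size() > buffer.size()) {
        throw std::length_error("too much data for the buffer");
    }
    std::copy(data_to_copy.begin(), data_to_copy.end(), buffer.begin());
    return buffer;
}
```

Checking against buffer.size() would have caught the problem even
with the arguments swapped, since 89 bytes of data would then be
rejected instead of overrunning an 88-byte buffer.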

You think these things don't happen in reality...?  How much
code-inspection and unit-testing have you done on C++ code?!

As the "C++ guru" (among other responsibilities) at my
employer, I routinely get called to help with mysterious
problems (and I _wish_ I was more often called for design
and code inspections *before* any problems show up, but that's
another issue:-).  I assure you that they DO happen, even to
coders who SHOULD know best -- sometimes code is written
late, by very tired people rushing against deadlines, or
(worse, for some:-) early in the morning, before enough
coffee (or, depending on taste, cola) has been imbibed to
raise the individual's awareness above amoeba-level...!-)


Note I'm *NOT* saying that C is better on this specific
task: I very specifically said:

> > The trade-off is no doubt a win for std::string here -- it
> > removes many more issues than it introduces.  But it's not


But it would be a _very_ serious mistake to assert:

> I think security is not a problem of C++...

because of COURSE it is.  The language's complexity is a
serious security MINUS, and MUST be carefully and responsibly
weighed against its pluses to choose the best tool.


> You can easily write C or even Python code that does introduce security
> problems.

Of course you can.  But, the simplicity of the language
makes it easier for such problems to be caught in code
inspection (that's ASSUMING code inspection is not skimped
upon for security-critical code, or you're dogmeat ANYWAY!-).

Each language carefully avoids introducing a false sense
of security -- C is clearly so low-level everything needs
to be checked, Python does no static type-checks and so
just as clearly needs unit-testing and code-inspection.

Note that C++ does too, but it MAY lull the unwary into
easily-misinterpreted assertions such as the ones you've
made just above:-).  [Maybe Eiffel, since it DOES strive SO
darn hard (and does such a good job, but still of necessity
incomplete... it can't solve the halting problem, can it?-)
to check everything and a half, is the worst from the point
of view of false sense of security...?-) -- but I think C++
has more foot-height tripwires ready:-)].


Alex





