[Cython] New function (pointer) syntax.

Robert Bradshaw robertwb at gmail.com
Fri Nov 7 09:22:05 CET 2014


On Thu, Nov 6, 2014 at 1:35 PM, C Blake <cblake at pdos.csail.mit.edu> wrote:
> I think you should just use the C declarator syntax.  Cython already
> allows you to say "cdef int *foo[10]".

Quick: is that a pointer to an array or 10 pointers to ints? Yes, I
know what it is, but the thing is without knowing C (well) it's not
immediately obvious what the precedence should be.

Cython's target audience includes lots of people who don't know C well.
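
For anyone who wants to check their answer to the quiz above, here is a
minimal C sketch (the names bar, value and row, and the function name, are
made up for illustration). It assumes nothing beyond a standard C compiler
and shows the two readings side by side: without parentheses, [] binds
tighter than *.

```c
#include <assert.h>

/* Returns 1 once every check below has passed. */
int declarator_precedence_demo(void)
{
    int *foo[10];    /* [] binds tighter than *: array of 10 pointers to int */
    int (*bar)[10];  /* parentheses force the other reading:
                        one pointer to an array of 10 ints */

    int value = 7;
    int row[10] = {0};

    foo[0] = &value; /* each element of foo is an int* */
    bar = &row;      /* bar points at a whole int[10] */
    (*bar)[3] = 42;  /* writes row[3] through the pointer */

    assert(sizeof foo == 10 * sizeof(int *)); /* ten pointers */
    assert(sizeof *bar == 10 * sizeof(int));  /* one array of ten ints */
    assert(*foo[0] == 7);
    assert(row[3] == 42);
    return 1;
}
```

Calling declarator_precedence_demo() from a main() runs the checks; the
sizeof comparisons are what distinguish the two declarations.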

> Declarators aren't bad - just
> poorly taught, though I can see some saying those are the same thing.

If they were good, they would be easy to learn, no expert teaching required.

> More below.  I absolutely like the declarator one the most, and the
> lambda one second most.  Declarator style makes it far easier to take
> C code/.h files using function pointers over to Cython.  So, this
> discussion also depends on whether you view Cython as a bridge to
> C libs or its own island/bias toward pure Py3.
>
> One other proposal that might appease Stefan's "Python lexical cues"
> where he was missing "def" would be to take the Python3 function
> definition header syntax and strip just variable names.  I.e., keep the
> ":"s
>     def foo(x: type1, y: type2, z: type3) -> type0: pass
> goes to
>     (: type1, : type2, : type3) -> type0 [ident]
>
> I don't think that looks like anything else that might be valid in C or
> Python.  It does just what C does - strip variable names from a function
> header, but the ":"s maybe key your brain into a different syntax mode
> since they are arguably more rare in C.  (Besides stripping names, C has
> some extra parens for precedence/associativity - which admittedly are
> tuned to make use-expressions simpler than type-expressions.)  Anyway,
> I don't really like my own proposal above.  Just mentioning it for
> completeness in case the ":"s help anyone.
>
>
> Robert wrote:
>>I really hope, in the long run, people won't have to manually do these
>>declarations.
>
> I agree they'll always be there somehow and also with Stefan's comments
> about the entry bar.  So, most people not needing them most of the time
> doesn't remove the question.
>
>
>>I am curious, when you read "cdef int * p" do you parse this as "cdef
>>(int*) p" or "cdef int (*p)" 'cause for me no matter how well I know
>>it's the latter, I think the former (i.e. I think "I'm declaring a
>>variable of type p that's of int pointer type.")
>>[..]
>>Essentially everyone thinks "cdef type var" even though that's not
>>currently the true grammar.
>>[..]
>>The reason ctypedefs help, and are so commonly used with function
>>pointers, is because the existing syntax is just so horrendously bad.
>>If there's a clear way to declare a function taking a single float and
>>returning a single float, no typedef needed.
>
> No, no, no.  Look, generations of C/C++ programmers have been done a
> monumental disservice by textbooks/code/style guides that suggest
> "int*  p" is *any less* confusing than spacing "2+3/x" as "2+3 / x".
> Early on in my C exposure someone pointed this out and I've never been
> confused since.  It's a syntax-semantics confusion.  Concrete syntax
> has always been right associative dereference *.  In this syntax family,
> the moment any operators []/*/() are introduced, you have to start
> formatting it/thinking of it as a real expression, and that formatting
> should track the syntax not semantics like in your head "pointer
> to"/indirection speak or whatever.

syntax != semantics => badness

> Spacing it as if * were left
> associative to the type name is misleading at best.
>
> If you can only think of type decls in general as (type, var) pairs
> syntactically and semantically then *of course* you find typedefs more
> clear.  They make pairing more explicit & shrink the expression tree to
> be more trivial.  You should *still* space the typedef itself in a way
> suggestive of the actual concrete syntax -- "typedef int *p" (or
> "ctypedef") just like you shouldn't write "2+3 / x".  You should still
> not essentially think of "ctypedef type var" *either*, but rather
> "typedef basetype expr".  In short, "essentially everyone" *should* think
> and be taught and have it reinforced and "gel" by spacing that "basetype
> expr" is the syntax to create type-var bindings semantically, and only
> perceive "type var" as just one simple case.

You are reinforcing the point that declarators are not intuitive, but
rather one has to be forcibly hit over the head with them, because
they're easy to misunderstand and abuse.

> Taking significance of
> space in Python/Cython one step further, "int* p" could even be a hard
> syntax error, but I'm not seriously proposing that change.

:)

Distinguishing any "a* b" from "a *b" would be a major change to the
tokenizer. Indentation, not all whitespace, is significant.
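
As a point of comparison, C itself ignores this spacing. The sketch below
is plain C, not Cython, and assumes a C11 compiler for _Generic; the macro
and function names are hypothetical. It shows that "int* p, q" and
"int *r, s" produce the same kinds of variables however the * is spaced.

```c
#include <assert.h>

/* Compile-time type probe (C11): 1 if x has type int*, else 0. */
#define IS_INT_POINTER(x) _Generic((x), int *: 1, default: 0)

/* Returns 1 once every check below has passed. */
int whitespace_demo(void)
{
    int* p, q;  /* spaced like "type var, var", yet only p is a pointer */
    int *r, s;  /* same token sequence apart from the names */

    q = 1;
    s = 2;
    p = &q;
    r = &s;

    assert(IS_INT_POINTER(p) == 1);
    assert(IS_INT_POINTER(q) == 0);  /* q is a plain int */
    assert(IS_INT_POINTER(r) == 1);
    assert(IS_INT_POINTER(s) == 0);  /* and so is s */
    assert(*p + *r == 3);
    return 1;
}
```

The * attaches to each declarator, not to the base type, which is exactly
the trap the "type var" spacing sets.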

> I really do not think it is "essentially everyone".  You know better as you
> said anyway, but are in conflict with yourself, I think syntax-semantics
> wise.
>
> Semantically, pointer indirection loads from an address before using,
> and sure that can be confusing to new programmers in its own right.
> Trying to unravel that confusion with anti-syntax spacing/thought
> cascades the wrong way out of the whole situation and contributes
> to your blocked assimilation.  *If* the space guides you or barring
> space parens guide you, you quickly get to never forgetting that types
> are inside-out/inverse/what-I-get-if specifications.  Note that this
> is as it would be if +,/ had tricky concepts somehow "fixable"
> by writing "2+3 / x" all the time.  Arithmetic isn't a binding..so
> the analogy is hard to complete, but my point is ad nauseam at this
> stage (or even sooner! ;-).  Undermine syntax with contrary whitespace
> and of course it will seem bad/be harder.  It might even lock you in
> to thought patterns that make it really hard to think about it how
> you know you "ought" to.
>
> Anyway, more the point, Cython has "cdef" not "py3def" or "javadef" or
> whatever, after all.  There is even the cdef: blocks where all the decls
> and inits look eminently C-like.  Having an alternative ptr syntax from C
> only for functions but not arrays seems wrong to me *unless* it's really
> part of a general Py3 annotations syntax move.

It's about getting rid of declarators, so when one says "add types"
one literally goes from

    def foo(arg): ...

to

    def foo([type] arg): ...

rather than

    def foo([base_type] [decl_stuff]arg[more_decl_stuff]): ...

To use your terminology, to bring the syntax in line with the
semantics. I've also seen lots of people, good coders even, struggle
with C function pointer declarations.
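
To make the function pointer case concrete, here is a small C sketch of
"a function taking a single float and returning a single float" written
both with raw declarators and with a typedef. All names (halve, twice,
unary_float_fn, pick_raw and so on) are invented for this illustration.

```c
#include <assert.h>

static float halve(float x) { return x / 2.0f; }
static float twice(float x) { return 2.0f * x; }

/* Raw declarator: the parameter name fn sits in the middle of its type. */
static float apply_raw(float (*fn)(float), float x) { return fn(x); }

/* With a typedef the parameter reads as a plain "type name" pair. */
typedef float (*unary_float_fn)(float);
static float apply_typedef(unary_float_fn fn, float x) { return fn(x); }

/* Returning a function pointer without a typedef: the function's own
   name and parameter list end up nested inside the return type. */
static float (*pick_raw(int use_twice))(float)
{
    return use_twice ? twice : halve;
}

/* The same function declared through the typedef. */
static unary_float_fn pick_typedef(int use_twice)
{
    return use_twice ? twice : halve;
}

/* Returns 1 once every check below has passed. */
int function_pointer_demo(void)
{
    assert(apply_raw(halve, 8.0f) == 4.0f);
    assert(apply_typedef(twice, 8.0f) == 16.0f);
    assert(pick_raw(1)(3.0f) == 6.0f);
    assert(pick_typedef(0)(3.0f) == 1.5f);
    return 1;
}
```

pick_raw and pick_typedef have identical types; only the typedef version
can be read left to right.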

> Stefan was in some
> mypy-in-Py3 thread last Summer.  If the situation were Cython just
> moving that way to be a Py3-with-mypy compiler with some extras for C
> integration, that would be a different story to me.  Then trying to make
> function pointers seem like definitions sans-names might make more sense.
> It would really ease moves to Cython from pure Py3 users used to the type
> annotations "for checking" to apply to code generation.  That may be such
> a big move from Cython now that it might almost deserve some "Cython3"
> fork that could also drop other redundant things (like pre-memory views
> NumPy integration that seems not as good as more general memviews or
> something).

Yeah, deleting the old way isn't going to happen any time soon.

> All that being said, using lambda syntax to signify types as well as
> values seems an interesting idea, if it would really work.

I'm fairly convinced it would work, despite being a bit clunky looking.

> But we
> know C declarators will work for all things C.  I don't see why we
> shouldn't just stick to that.  Right now and with declarators you can
> tell people you just need to learn C decl syntax and Python.  That's
> a VERY simple thing to say (even if it requires some learning work,
> and even if it isn't 100% exact).  There is a lot of documentation
> out there to learn C decls.

Yes, the one (I'd argue only) thing C declarators have going for them is
that that's what's used in C.

- Robert

