[Tutor] Access HBase

Laura Creighton lac at openend.se
Sun Jul 12 12:46:15 CEST 2015


In a message of Sat, 11 Jul 2015 23:46:56 -0400, Michelle Meiduo Wu writes:
>Thanks a lot! 
>
>Do you know anything about HappyBase compared with Jython?
>
>Best,
>Michelle

I don't know anything at all about HappyBase, and next to nothing about
Hadoop.  But I know quite a bit about Jython.

The Python you get from python.org (and from ActiveState) is written
in the C programming language.  That is why some of us often call it
CPython.  However, there are other implementations of Python that
are not written in C, and one of them is called Jython.  Jython is
the Python programming language implemented in Java.
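
If you are ever unsure which implementation is running your script,
the standard library can tell you.  A minimal sketch (the function
name describe_interpreter is made up for illustration):

```python
import platform


def describe_interpreter():
    """Return the name of the Python implementation running this code."""
    # The result is a string such as 'CPython' or 'Jython'.
    return platform.python_implementation()


print(describe_interpreter())
```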

Now, in the Greater Python Programming World, most libraries you
would like to import and include in your own programs are written
in Python itself.  They will be perfectly happy running under CPython
or under Jython.  There are also a whole lot of libraries that are
written in C.  Getting other things written in C to talk to CPython
is work, but it is a pretty straightforward business to wrap your
C libraries and get them to work with CPython.
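
To give a feel for what wrapping looks like, here is a small sketch
using ctypes, which is one of several ways to do it and ships in the
CPython standard library.  It assumes a Unix-like system where the C
standard library can be located; the helper name c_strlen is made up
for illustration.

```python
import ctypes
import ctypes.util

# Locate the C standard library.  The file name differs per platform,
# and find_library may return None on some systems.
libc_path = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_path)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Call the C strlen function on a bytes object."""
    return libc.strlen(data)


print(c_strlen(b"hello"))
```

Real libraries need more declarations than this, but the shape is
the same: load the library, describe each function's arguments and
return type, then call it from Python.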

Getting something written in some other language than C to work with
CPython is much harder.  C++ libraries get wrapped often enough that
we have standard ways of doing that, too, but it is a lot more work.
And C++ is, by design, a whole lot like C.

It is very rare that CPython developers bother to wrap something
written in any other language at all.  So the world is full of
perfectly great libraries written in Java and the CPython world
basically never uses them because going from the Java world to the C
world is even harder.

On the Java side things are similar.  If you like your Java world,
but want Python syntax, not Java syntax -- you would like to program
in Python -- you use Jython.  If you want to use a library and it is
written in Python, you import that, and it just works.  If you want
to use a library and it is written in Java, usually -- not all of the
time, but most of the time -- you can wrap it very easily, import it,
and it works.  And if you want to use a C or a C++ library, things
get harder.
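
As an illustration of the Java case, the sketch below imports a
class from the Java standard library the way Jython allows.  The
function name java_list_demo is made up, and the try/except is only
there so that the same file can still be loaded under CPython, where
there is no 'java' package.

```python
def java_list_demo():
    """Build a java.util.ArrayList from Python, when running under Jython."""
    try:
        # Under Jython, Java packages can be imported like Python modules.
        from java.util import ArrayList
    except ImportError:
        # CPython has no 'java' package, so we end up here.
        return None
    names = ArrayList()
    names.add("HBase")
    names.add("Hadoop")
    return [str(n) for n in names]


print(java_list_demo())
```

Under CPython this prints None, because the import fails.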

But your goal here is to talk to Apache HBase, and that is written in
Java.  Somebody wanted to reach it from other languages enough that
HBase ships a Thrift gateway, which CPython can talk to.  All I know
about that is that using Thrift directly was reportedly hard and
error-prone enough that the HappyBase developers wrote HappyBase to
wrap it in something easier to use.  If you want to develop in CPython
this is probably where you will end up.
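
As a rough sketch only, based on the calls HappyBase documents
(Connection, table, put, row), writing a row and reading it back
might look like the code below.  The helper names store_and_fetch
and demo, the table name 'mytable' and the column family 'cf' are
all made up for illustration, and demo needs the happybase package
plus a running HBase Thrift server.

```python
def store_and_fetch(table, row_key, data):
    """Write one row to an HBase table and read it back.

    'table' is expected to behave like a happybase Table object.
    """
    table.put(row_key, data)
    return table.row(row_key)


def demo(host):
    """Hypothetical usage against a Thrift server running on 'host'."""
    import happybase  # third-party package, not in the standard library

    connection = happybase.Connection(host)
    try:
        # 'mytable' and the column family 'cf' are made-up names and
        # are assumed to exist already.
        table = connection.table("mytable")
        return store_and_fetch(table, b"row-1", {b"cf:greeting": b"hello"})
    finally:
        connection.close()
```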
(But somebody tomorrow could come by with information on a library
I have never heard of, of course.)

The other way to go after this is to develop your code with Jython
instead of CPython.  Then you follow the instructions here:
http://wiki.apache.org/hadoop/Hbase/Jython

and you are good to go.

You just write to HBase directly using Python syntax.

The next thing you need to do is see what other libraries you need to
use for your work, and see if they exist for Jython.  For instance,
if you need NumPy, you cannot use Jython at this time.  People are
working on this problem over here: http://jyni.org/  where they are
trying to build a better way to interface Jython with C and C++
programs, but they aren't close to being done yet.
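
One quick way to do that check is to run a small script under the
interpreter you plan to use and see which imports fail.  A minimal
sketch (the function name missing_modules is made up, and 'numpy'
just stands in for whatever your project depends on):

```python
def missing_modules(names):
    """Return the module names that this interpreter cannot import."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


print(missing_modules(["json", "numpy"]))
```

Run it once with CPython and once with Jython and compare the two
lists.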

Laura


