DCOracle2 performance tuning

Tony Colston agcolston at buckman.com
Mon Nov 25 17:27:25 EST 2002


I have noticed also that Python is quite a bit slower if you leave the
arraysize parameter alone when using cx_Oracle. If you change the
arraysize on the cursor you can get much better performance.

The amount of data that is being pulled across the network makes a
HUGE difference of course also.

For a large sized table in rows (250,000) and in columns 56 the time
to pull * from the table in my case is about 250 seconds. If you
change the arraysize to 50 then you will pull the data in about half
the time (123 seconds).

Changing the arraysize to greater than 50 in my case brought only
slightly better results which also makes sense.

If you are not pulling back the entire table then time decreases a lot
also, 72 seconds for 3 fields. If change the arraysize then you can
get down to 13 seconds.

The important thing is pick and choose the amount of data that you
really want to pull across the network. Most applications will not
need to pull an entire table of data like this... or you would hope.

The other thing that you need to consider for Oracle at least is the
execution plan for your queries. In both of my cases you always get a
full table scan. Once again this makes sense cause you are pulling ALL
of the data. Also depending on your Oracle version you MUST have
tables with valid statistics or none of your testing will matter at
all... since the Oracle optimizer relies on stats.

Probably the only thing that bothers me out of this is that I wrote a
simliar testing program for Java and got much better performance for
an out of the box experience. The Java test program took only 24
seconds to pull the fields... as compared to the 72 seconds for
Python. When you change the arraysize in Python on the cursor you can
get the same performance and even better. I did not take the time to
optimize the Java version of the test though I imagine I could get
better than 24 seconds there also.

Tony


SQL> select count(*) from order_invoice_detail;

  COUNT(*)
----------
    243727
    
SQL> select count(*) from all_tab_columns
  2  where table_name = 'ORDER_INVOICE_DETAIL' and
  3  owner = 'GRASP';

  COUNT(*)
----------
        56


----------------------------------------------------------------------
Table Access Full on order_invoice_detail
Case 1.
Select * from order_invoice_detail (no arraysize set)
Total time:  252.318585463
Total time:  251.900257156
Total time:  249.472460327

Case 2.
Select * from order_invoice_detail (arraysize = 50)
Total time:  123.798607644
Total time:  122.675535298
Total time:  122.581093128

Case 3.
Select * from order_invoice_detail (arraysize = 100)
Total time:  118.943308158
Total time:  119.487127275
Total time:  117.969179044

----------------------------------------------------------------------
Table access full
Case 10.
select sales_order_no, extended_price, industry_lookup_id from
order_invoice_detail (no arraysize set)
Total time:  72.1730837299
Total time:  72.516651418
Total time:  70.934486925

Case 11.
select sales_order_no, extended_price, industry_lookup_id from
order_invoice_detail (arraysize=50)
Total time:  17.6729447458
Total time:  17.6347633568
Total time:  17.621397971

Case 12.
select sales_order_no, extended_price, industry_lookup_id from
order_invoice_detail (arraysize=100)
Total time:  13.1321153946
Total time:  13.1474268378
Total time:  13.1184346055

Case 13.
select sales_order_no, extended_price, industry_lookup_id from
order_invoice_detail (arraysize=20)
Total time:  23.0417743037
Total time:  23.005110426
Total time:  23.0129290175



More information about the Python-list mailing list