[TriPython] Places to look at performance tuning

Tue Jun 6 16:47:26 EDT 2017

I have played with some calls to cProfile, using in more of a call the
server.py file I use for dev testing.

And this comes to your question, yes using the ORM.  Which I would wager
could be the performance bottle neck as the record set returned grows.

On Tue, Jun 6, 2017 at 4:44 PM, James Whisnant <jwhisnant at gmail.com> wrote:

>    I recommend profiling your code to see exactly where it is slow. It may
> or
>    may not be what you expect. [1]https://pypi.python.org/
> pypi/profilehooks
>    is an easy to use decorate your functions to profile them. It does sound
>    like some of your database queries are slow. This may help you pinpoint
>    where and why. Are you using the SQLAlchemy ORM or Core? The ORM can
> have
>    a large overhead (in some cases) as compared to the SQLAlchemy Core.
>
>    On Tue, Jun 6, 2017 at 4:28 PM, Ken MacKenzie <[2]ken at mack-z.com>
> wrote:
>
>      ** **At present the SQL table has about 8 fields that make up the
>      primary key,
>      ** **that PK creates an index and is the only index at present.
>      ** **I don't think sql selection is my bottle neck. **here is why.
>      ** **basically my route is like this:
>      ** **/type/fiscal_year/fiscal_period/entity
>      ** **entity is the only one that is "optional"
>      ** **When an entity is specified I get a return in a second or two.**
>      Granted a
>      ** **much smaller record set.
>      ** **When entity is omitted I get about 50x the result count and that
> is
>      where
>      ** **I get to a 20 second return time.
>      ** **Those results at first made me think the problem was io bound on
>      the
>      ** **router till I saw that the first return byte was so behind.
>      ** **Am I misunderstanding your suggestion that additional indexes
> would
>      ** **improve the speed at which the DB returns the data, because I am
>      ** **interpreting your suggestion as one to improve the selection
> speed.
>      ** **On Tue, Jun 6, 2017 at 4:18 PM, George Gergues
>      ** **<[1][3]george.gergues at gmail.com> wrote:
>
>      ** ** **** **For SQL table **add at least one index. it will improve
>      table
>      ** ** **scans.
>      ** ** **** **On Jun 6, 2017 11:44, "Ken MacKenzie"
>      <[1][2][4]ken at mack-z.com> wrote:
>
>      ** ** **** ** **** **So I am in the demo and test phase of an early
>      ReSTful API
>      ** ** **for
>      ** ** **** ** **reporting.
>      ** ** **** ** **** **Currently a wider scale report set request hits
> the
>      ** ** **following
>      ** ** **** ** **marks:
>      ** ** **** ** **** **TTFB: ~20s
>      ** ** **** ** **** **Record Count: ~92k
>      ** ** **** ** **** **Download Size: 15.8MB
>      ** ** **** ** **** **Details:
>      ** ** **** ** **** **Web Server: NGINX
>      ** ** **** ** **** **Python App Server: Gunicorn
>      ** ** **** ** **** **Web Framework: Falcon
>      ** ** **** ** **** **Python version: 3.5 (in a venv)
>      ** ** **** ** **** **DB: MS SQL Server Express using SQL Alchemy +
>      pyodbc
>      ** ** **** ** **** **Webserver OS: CentOS 7
>      ** ** **** ** **** **Gunicorn is setup with 4 workers, on a private
>      port, nginx
>      ** ** **does a
>      ** ** **** ** **proxy
>      ** ** **** ** **** **pass to the port
>      ** ** **** ** **** **DB Details, the table in question has a total of
>      about 8
>      ** ** **million
>      ** ** **** ** **rows.**
>      ** ** **** ** **** **Sample query execution within SQL Server Mgmt
>      Studio is ~7s
>      ** ** **** ** **** **So my question is which of the following would
> be a
>      better
>      ** ** **target
>      ** ** **** ** **to
>      ** ** **** ** **** **improve performance, or do I need to as my
>      performance
>      ** ** **should be
>      ** ** **** ** **** **considered good enough.** I mean the server in
> this
>      case is
>      ** ** **a
>      ** ** **** ** **surplus dual
>      ** ** **** ** **** **core desktop right now.
>      ** ** **** ** **** **add gzip compression to nginx for proxys
>      ** ** **** ** **** **switch gunicorn to use a unix socket instead of a
>      tcp port
>      ** ** **** ** **** **consider leaner SQL and JSON marshaling requests
>      instead of
>      ** ** **ORM's
>      ** ** **** ** **and
>      ** ** **** ** **** **dictionary bundles.
>      ** ** **** ** **** **Appreciate and advice or suggestions.** Thank
> you.
>
>      ** ** **** ** **_______________________________________________
>      ** ** **** ** **TriZPUG mailing list
>      ** ** **** ** **[2][3][5]TriZPUG at python.org
>      ** ** **** **
>      **[3][4][6]https://mail.python.org/mailman/listinfo/trizpug
>      ** ** **** ** **[4][5][7]http://tripython.org is the Triangle Python
>      Users Group
>
>      References
>
>      ** **Visible links
>      ** **1. mailto:[8]george.gergues at gmail.com
>      ** **2. mailto:[9]ken at mack-z.com
>      ** **3. mailto:[10]TriZPUG at python.org
>      ** **4. [11]https://mail.python.org/mailman/listinfo/trizpug
>      ** **5. [12]http://tripython.org/
>
>      _______________________________________________
>      TriZPUG mailing list
>      [13]TriZPUG at python.org
>      [14]https://mail.python.org/mailman/listinfo/trizpug
>      [15]http://tripython.org is the Triangle Python Users Group
>
> p
>
>
-------------- next part --------------
   I have played with some calls to cProfile, using in more of a call the
   server.py file I use for dev testing.
   And this comes to your question, yes using the ORM.** Which I would wager
   could be the performance bottle neck as the record set returned grows.
   On Tue, Jun 6, 2017 at 4:44 PM, James Whisnant <[1]jwhisnant at gmail.com>
   wrote:

     ** **I recommend profiling your code to see exactly where it is slow. It
     may or
     ** **may not be what you expect.
     [1][2]https://pypi.python.org/pypi/profilehooks
     ** **is an easy to use decorate your functions to profile them. It does
     sound
     ** **like some of your database queries are slow. This may help you
     pinpoint
     ** **where and why. Are you using the SQLAlchemy ORM or Core? The ORM
     can have
     ** **a large overhead (in some cases) as compared to the SQLAlchemy
     Core.

     ** **On Tue, Jun 6, 2017 at 4:28 PM, Ken MacKenzie
     <[2][3]ken at mack-z.com> wrote:

     ** ** **** **At present the SQL table has about 8 fields that make up
     the
     ** ** **primary key,
     ** ** **** **that PK creates an index and is the only index at present.
     ** ** **** **I don't think sql selection is my bottle neck. **here is
     why.
     ** ** **** **basically my route is like this:
     ** ** **** **/type/fiscal_year/fiscal_period/entity
     ** ** **** **entity is the only one that is "optional"
     ** ** **** **When an entity is specified I get a return in a second or
     two.**
     ** ** **Granted a
     ** ** **** **much smaller record set.
     ** ** **** **When entity is omitted I get about 50x the result count and
     that is
     ** ** **where
     ** ** **** **I get to a 20 second return time.
     ** ** **** **Those results at first made me think the problem was io
     bound on
     ** ** **the
     ** ** **** **router till I saw that the first return byte was so behind.
     ** ** **** **Am I misunderstanding your suggestion that additional
     indexes would
     ** ** **** **improve the speed at which the DB returns the data, because
     I am
     ** ** **** **interpreting your suggestion as one to improve the
     selection speed.
     ** ** **** **On Tue, Jun 6, 2017 at 4:18 PM, George Gergues
     ** ** **** **<[1][3][4]george.gergues at gmail.com> wrote:

     ** ** **** ** **** **For SQL table **add at least one index. it will
     improve
     ** ** **table
     ** ** **** ** **scans.
     ** ** **** ** **** **On Jun 6, 2017 11:44, "Ken MacKenzie"
     ** ** **<[1][2][4][5]ken at mack-z.com> wrote:

     ** ** **** ** **** ** **** **So I am in the demo and test phase of an
     early
     ** ** **ReSTful API
     ** ** **** ** **for
     ** ** **** ** **** ** **reporting.
     ** ** **** ** **** ** **** **Currently a wider scale report set request
     hits the
     ** ** **** ** **following
     ** ** **** ** **** ** **marks:
     ** ** **** ** **** ** **** **TTFB: ~20s
     ** ** **** ** **** ** **** **Record Count: ~92k
     ** ** **** ** **** ** **** **Download Size: 15.8MB
     ** ** **** ** **** ** **** **Details:
     ** ** **** ** **** ** **** **Web Server: NGINX
     ** ** **** ** **** ** **** **Python App Server: Gunicorn
     ** ** **** ** **** ** **** **Web Framework: Falcon
     ** ** **** ** **** ** **** **Python version: 3.5 (in a venv)
     ** ** **** ** **** ** **** **DB: MS SQL Server Express using SQL Alchemy
     +
     ** ** **pyodbc
     ** ** **** ** **** ** **** **Webserver OS: CentOS 7
     ** ** **** ** **** ** **** **Gunicorn is setup with 4 workers, on a
     private
     ** ** **port, nginx
     ** ** **** ** **does a
     ** ** **** ** **** ** **proxy
     ** ** **** ** **** ** **** **pass to the port
     ** ** **** ** **** ** **** **DB Details, the table in question has a
     total of
     ** ** **about 8
     ** ** **** ** **million
     ** ** **** ** **** ** **rows.**
     ** ** **** ** **** ** **** **Sample query execution within SQL Server
     Mgmt
     ** ** **Studio is ~7s
     ** ** **** ** **** ** **** **So my question is which of the following
     would be a
     ** ** **better
     ** ** **** ** **target
     ** ** **** ** **** ** **to
     ** ** **** ** **** ** **** **improve performance, or do I need to as my
     ** ** **performance
     ** ** **** ** **should be
     ** ** **** ** **** ** **** **considered good enough.** I mean the server
     in this
     ** ** **case is
     ** ** **** ** **a
     ** ** **** ** **** ** **surplus dual
     ** ** **** ** **** ** **** **core desktop right now.
     ** ** **** ** **** ** **** **add gzip compression to nginx for proxys
     ** ** **** ** **** ** **** **switch gunicorn to use a unix socket
     instead of a
     ** ** **tcp port
     ** ** **** ** **** ** **** **consider leaner SQL and JSON marshaling
     requests
     ** ** **instead of
     ** ** **** ** **ORM's
     ** ** **** ** **** ** **and
     ** ** **** ** **** ** **** **dictionary bundles.
     ** ** **** ** **** ** **** **Appreciate and advice or suggestions.**
     Thank you.

     ** ** **** ** **** ** **_______________________________________________
     ** ** **** ** **** ** **TriZPUG mailing list
     ** ** **** ** **** ** **[2][3][5][6]TriZPUG at python.org
     ** ** **** ** **** **
     ** ** ****[3][4][6][7]https://mail.python.org/mailman/listinfo/trizpug
     ** ** **** ** **** ** **[4][5][7][8]http://tripython.org is the Triangle
     Python
     ** ** **Users Group

     ** ** **References

     ** ** **** **Visible links
     ** ** **** **1. mailto:[8][9]george.gergues at gmail.com
     ** ** **** **2. mailto:[9][10]ken at mack-z.com
     ** ** **** **3. mailto:[10][11]TriZPUG at python.org
     ** ** **** **4. [11][12]https://mail.python.org/mailman/listinfo/trizpug
     ** ** **** **5. [12][13]http://tripython.org/

     ** ** **_______________________________________________
     ** ** **TriZPUG mailing list
     ** ** **[13][14]TriZPUG at python.org
     ** ** **[14][15]https://mail.python.org/mailman/listinfo/trizpug
     ** ** **[15][16]http://tripython.org is the Triangle Python Users Group

     p

References

   Visible links
   1. mailto:jwhisnant at gmail.com
   2. https://pypi.python.org/pypi/profilehooks
   3. mailto:ken at mack-z.com
   4. mailto:george.gergues at gmail.com
   5. mailto:ken at mack-z.com
   6. mailto:TriZPUG at python.org
   7. https://mail.python.org/mailman/listinfo/trizpug
   8. http://tripython.org/
   9. mailto:george.gergues at gmail.com
  10. mailto:ken at mack-z.com
  11. mailto:TriZPUG at python.org
  12. https://mail.python.org/mailman/listinfo/trizpug
  13. http://tripython.org/
  14. mailto:TriZPUG at python.org
  15. https://mail.python.org/mailman/listinfo/trizpug
  16. http://tripython.org/