[TriPython] Places to look at performance tuning
Ken MacKenzie
ken at mack-z.com
Tue Jun 6 16:47:26 EDT 2017
I have played with some calls to cProfile, mostly by invoking it from the
server.py file I use for dev testing.
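For anyone following along, here is a minimal sketch of that kind of cProfile call. It is not the actual server.py code: build_rows and profile_call are hypothetical names, and build_rows only stands in for whatever per-record work a request handler does.

```python
import cProfile
import io
import pstats


def build_rows(n):
    """Hypothetical stand-in for the per-record work done by a request handler."""
    return [{"id": i, "amount": i * 1.5} for i in range(n)]


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Render the ten most expensive entries, sorted by cumulative time.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


rows, report = profile_call(build_rows, 1000)
print(report)
```

Sorting by cumulative time tends to show quickly whether the time goes to the database call or to the Python code that runs after it.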
And that brings me to your question: yes, I am using the ORM, which I would
wager could be the performance bottleneck as the returned record set grows.
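One rough way to test that wager without touching the real stack is sketched below. Everything in it is an assumption for illustration: sqlite3 stands in for SQL Server + pyodbc, the Record class stands in for an ORM-mapped class, and the table and column names are made up. It only compares building one object per row against marshaling the raw rows straight to JSON, so it says nothing about SQLAlchemy itself.

```python
import json
import sqlite3
import time

QUERY = "SELECT entity, fiscal_year, amount FROM report ORDER BY rowid"
COLUMNS = ("entity", "fiscal_year", "amount")


class Record:
    """Hypothetical per-row object, standing in for an ORM-mapped class."""

    def __init__(self, entity, fiscal_year, amount):
        self.entity = entity
        self.fiscal_year = fiscal_year
        self.amount = amount

    def to_dict(self):
        return {
            "entity": self.entity,
            "fiscal_year": self.fiscal_year,
            "amount": self.amount,
        }


def make_db(n):
    """Create an in-memory table with n made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE report (entity TEXT, fiscal_year INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO report VALUES (?, ?, ?)",
        (("E%d" % (i % 50), 2017, i * 1.25) for i in range(n)),
    )
    return conn


def via_objects(conn):
    """Build one object per row, then convert each object to a dict."""
    objs = [Record(*row) for row in conn.execute(QUERY)]
    return json.dumps([o.to_dict() for o in objs])


def via_rows(conn):
    """Marshal the raw row tuples directly, with no intermediate objects."""
    return json.dumps([dict(zip(COLUMNS, row)) for row in conn.execute(QUERY)])


def timed(func, conn):
    """Return (payload, elapsed seconds) for one call of func."""
    start = time.perf_counter()
    body = func(conn)
    return body, time.perf_counter() - start


conn = make_db(92_000)  # roughly the record count from the original report
body_a, secs_a = timed(via_objects, conn)
body_b, secs_b = timed(via_rows, conn)
print("objects: %.3fs  rows: %.3fs  same payload: %s"
      % (secs_a, secs_b, body_a == body_b))
```

The timings will vary by machine; the useful part is that both paths produce the same payload, so any difference is pure per-row overhead.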
On Tue, Jun 6, 2017 at 4:44 PM, James Whisnant <jwhisnant at gmail.com> wrote:
> I recommend profiling your code to see exactly where it is slow. It may or
> may not be what you expect. https://pypi.python.org/pypi/profilehooks
> is an easy-to-use way to decorate your functions to profile them. It does
> sound like some of your database queries are slow. This may help you
> pinpoint where and why. Are you using the SQLAlchemy ORM or Core? The ORM
> can have a large overhead (in some cases) as compared to the SQLAlchemy
> Core.
>
> On Tue, Jun 6, 2017 at 4:28 PM, Ken MacKenzie <ken at mack-z.com> wrote:
>
> > At present the SQL table has about 8 fields that make up the primary
> > key; that PK creates an index and is the only index at present.
> > I don't think SQL selection is my bottleneck. Here is why.
> > Basically my route is like this:
> > /type/fiscal_year/fiscal_period/entity
> > entity is the only one that is "optional".
> > When an entity is specified I get a return in a second or two. Granted,
> > that is a much smaller record set.
> > When entity is omitted I get about 50x the result count, and that is
> > where I get to a 20 second return time.
> > Those results at first made me think the problem was IO bound on the
> > router, till I saw that the first return byte was so far behind.
> > Am I misunderstanding your suggestion that additional indexes would
> > improve the speed at which the DB returns the data? I ask because I am
> > interpreting your suggestion as one to improve the selection speed.
> > On Tue, Jun 6, 2017 at 4:18 PM, George Gergues
> > <george.gergues at gmail.com> wrote:
> >
> > > For the SQL table, add at least one index. It will improve table scans.
> > > On Jun 6, 2017 11:44, "Ken MacKenzie" <ken at mack-z.com> wrote:
> > >
> > > > So I am in the demo and test phase of an early ReSTful API for
> > > > reporting.
> > > > Currently a wider scale report set request hits the following marks:
> > > > TTFB: ~20s
> > > > Record Count: ~92k
> > > > Download Size: 15.8MB
> > > > Details:
> > > > Web Server: NGINX
> > > > Python App Server: Gunicorn
> > > > Web Framework: Falcon
> > > > Python version: 3.5 (in a venv)
> > > > DB: MS SQL Server Express using SQLAlchemy + pyodbc
> > > > Webserver OS: CentOS 7
> > > > Gunicorn is set up with 4 workers on a private port; nginx does a
> > > > proxy pass to the port.
> > > > DB details: the table in question has a total of about 8 million rows.
> > > > Sample query execution within SQL Server Mgmt Studio is ~7s.
> > > > So my question is which of the following would be a better target to
> > > > improve performance, or do I even need to, as my performance should
> > > > be considered good enough? I mean, the server in this case is a
> > > > surplus dual core desktop right now.
> > > > add gzip compression to nginx for proxies
> > > > switch gunicorn to use a unix socket instead of a TCP port
> > > > consider leaner SQL and JSON marshaling requests instead of ORMs and dictionary bundles
> > > > Appreciate any advice or suggestions. Thank you.
>
> > > > _______________________________________________
> > > > TriZPUG mailing list
> > > > TriZPUG at python.org
> > > > https://mail.python.org/mailman/listinfo/trizpug
> > > > http://tripython.org is the Triangle Python Users Group