Mailman 3 Question regarding custom contextFactory for twisted.web.client.Agent - Twisted-web

Question regarding custom contextFactory for twisted.web.client.Agent

Jason Litzinger

12 May 2017 12 May '17

8:56 p.m.

Hello, I have an optimization question in the realm of "before I do something foolish..." regarding the contextFactory passed to twisted.web.client.Agent. Background: I have (several) twisted applications running on a Cortex-A5 system, one of which interfaces with a web service using the twisted.web.client.Agent API*. The problem I've observed is that requests to this web service result in a higher than desired CPU load. To understand the source of the load I isolated the server interactions and profiled. Investigation: The profile data revealed the load in question stemmed from TLS related calls, specifically optionsForClientTLS, which is called once per connection. In looking (briefly) at what optionsForClientTLS does, it seemed (perhaps wrongly) that there was an opportunity for optimization in my specific case. The optimization is to create a custom contextFactory that caches the connection creator since the host is not changing. I prototyped an implementation and the load was reduced by a factor of ten. Question: Is it safe/sane to reuse a connection creator if the host to which I'm connecting is not going to change? Thanks, -Jason Litzinger * Technically this code is inside an open source third party SDK, but were I in their shoes I'd write it the same way, so I've no issues with their implementation.

Show replies by date

Glyph

12 May 12 May

11:24 p.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

On May 12, 2017, at 1:56 PM, Jason Litzinger wrote:

Hello,

I have an optimization question in the realm of "before I do something foolish..." regarding the contextFactory passed to twisted.web.client.Agent.

Background:

I have (several) twisted applications running on a Cortex-A5 system, one of which interfaces with a web service using the twisted.web.client.Agent API*. The problem I've observed is that requests to this web service result in a higher than desired CPU load. To understand the source of the load I isolated the server interactions and profiled.

Investigation:

The profile data revealed the load in question stemmed from TLS related calls, specifically optionsForClientTLS, which is called once per connection. In looking (briefly) at what optionsForClientTLS does, it seemed (perhaps wrongly) that there was an opportunity for optimization in my specific case.

The optimization is to create a custom contextFactory that caches the connection creator since the host is not changing. I prototyped an implementation and the load was reduced by a factor of ten.

Question:

Is it safe/sane to reuse a connection creator if the host to which I'm connecting is not going to change?

This sounds eminently sensible, and in fact sounds like Twisted, or Treq at least, really ought to do something like this on our end with some kind of small LRU cache, keeping the most-used N hostnames (where N defaults to some small number, say, 20). I'm surprised to hear it is such a big optimization, but surprises like that are entirely the point of performance testing! Nevertheless, thanks for asking! I really wish people would ask questions about client TLS more often before doing something potentially insecure :). In this case, the client connection creator is used to produce client connections anyway, and the connections are not shared, so there should be no worrisome mingling of state between connections. We do host a performance test suite here: http://speed.twistedmatrix.com http://speed.twistedmatrix.com/ and the benchmarks are here https://github.com/twisted-infra/twisted-benchmarks https://github.com/twisted-infra/twisted-benchmarks if your test is not adequately represented in an existing benchmark. -g

Jason Litzinger

14 May 14 May

2:36 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

This sounds eminently sensible, and in fact sounds like Twisted, or Treq at least, really ought to do something like this on our end with some kind of small LRU cache, keeping the most-used N hostnames (where N defaults to some small number, say, 20). I'm surprised to hear it is such a big optimization, but surprises like that are entirely the point of performance testing! Thanks for taking a look!

I like the idea of an LRU for the last N hosts. Given I have something prototyped that I will expand on, that isn't much of a stretch. If you think it might be useful to Twisted then I can try to iterate the patch until it is either rejected or accepted (following the well documented steps for contributing of course), thoughts? If it fits more in treq then I'm more than happy to share the code, but, since I'm not currently using treq the amount of effort I can realistically put towards it isn't as high, though that project is very appealing. On the size of the performance increase -- I was surprised as well, and there's always the possibility I made an error, but it was consistent across the five tests I ran. Regardless, I'll be doing more profiling as I work toward the final implementation.

...

Nevertheless, thanks for asking! I really wish people would ask questions about client TLS more often before doing something potentially insecure :). In this case, the client connection creator is used to produce client connections anyway, and the connections are not shared, so there should be no worrisome mingling of state between connections.

Agree. I've come to believe that assumptions mixed with TLS/security related issues are my worst enemy. -Jason * Completely unrelated - I still owe on my committment to ticket 6597...yeah, life happened. Luckily I saved my notes now that I'm finding some time here and there.

Glyph

15 May 15 May

1:49 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

On May 13, 2017, at 7:36 PM, Jason Litzinger wrote:

...
This sounds eminently sensible, and in fact sounds like Twisted, or Treq at least, really ought to do something like this on our end with some kind of small LRU cache, keeping the most-used N hostnames (where N defaults to some small number, say, 20). I'm surprised to hear it is such a big optimization, but surprises like that are entirely the point of performance testing! Thanks for taking a look!

I like the idea of an LRU for the last N hosts. Given I have something prototyped that I will expand on, that isn't much of a stretch. If you think it might be useful to Twisted then I can try to iterate the patch until it is either rejected or accepted (following the well documented steps for contributing of course), thoughts?

This sounds like exactly what you should do. (Keeping in mind that very nearly 100% of patches are "rejected" on their first attempt, so please do resubmit :))

...

If it fits more in treq then I'm more than happy to share the code, but, since I'm not currently using treq the amount of effort I can realistically put towards it isn't as high, though that project is very appealing.

Probably the best place to put it would be Twisted proper. I just mentioned Treq because sometimes utility functionality like this can become surprisingly complex and have a lot of dependencies, and pin Treq it's easier to make the case that it's a high-level API that should just always do the right thing, as opposed to Agent which needs to preserve maximum flexibility for all HTTP client applications.

...

On the size of the performance increase -- I was surprised as well, and there's always the possibility I made an error, but it was consistent across the five tests I ran. Regardless, I'll be doing more profiling as I work toward the final implementation.

Like I said, if we were never surprised we wouldn't have to have tests :-).

...

...
Nevertheless, thanks for asking! I really wish people would ask questions about client TLS more often before doing something potentially insecure :). In this case, the client connection creator is used to produce client connections anyway, and the connections are not shared, so there should be no worrisome mingling of state between connections. Agree. I've come to believe that assumptions mixed with TLS/security related issues are my worst enemy.

Not just yours; the latin for "HTTP-only hist" is "hostis humani generis".

...

-Jason

* Completely unrelated - I still owe on my committment to ticket 6597...yeah, life happened. Luckily I saved my notes now that I'm finding some time here and there.

We do what we must, because we can. -glyph

Jason Litzinger

22 May 22 May

4:41 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

We do host a performance test suite here: http://speed.twistedmatrix.com http://speed.twistedmatrix.com/ and the benchmarks are here https://github.com/twisted-infra/twisted-benchmarks https://github.com/twisted-infra/twisted-benchmarks if your test is not adequately represented in an existing benchmark.

I'm working on the previously discussed change (ticket 9138) and finally took a (brief) look at the benchmark suite. Assuming master is the correct branch, from what I can tell the existing benchmarks for the Agent class use only HTTP, and the SSL tests are a lean client/server, not an HTTPS client/server. Am I missing something? Thanks! -Jason

Glyph

8:58 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

On May 21, 2017, at 9:41 PM, Jason Litzinger wrote:

...
We do host a performance test suite here: http://speed.twistedmatrix.com http://speed.twistedmatrix.com/ and the benchmarks are here https://github.com/twisted-infra/twisted-benchmarks https://github.com/twisted-infra/twisted-benchmarks if your test is not adequately represented in an existing benchmark.

I'm working on the previously discussed change (ticket 9138) and finally took a (brief) look at the benchmark suite. Assuming master is the correct branch, from what I can tell the existing benchmarks for the Agent class use only HTTP, and the SSL tests are a lean client/server, not an HTTPS client/server.

Am I missing something?

A quick glance at the tests confirms your suspicions: we test TLS, and we test HTTP, but we do not test HTTP over TLS. This is definitely something that could be rectified. Thanks for spotting it - please submit a PR to fix it :) -glyph

Jason Litzinger

23 May 23 May

12:38 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

Thanks for spotting it - please submit a PR to fix it :) Copy that, was planning on it, though originally the context was in comparison to an existing benchmark.

I assume the process/coding policies match submissions for Twisted Proper? I briefly searched for docs specific to benchmark updates but didn't see anything glaring. I'm sure I'll need some guidance as to the best way to do this in a fully isolated environment. For example, the host cert is somewhat tricky. The simplest thing would be to serve up a self-signed cert, but I think a realistic benchmark needs to validate a full chain and do hostname verification. Is a PR discussion the best place for these questions? Cheers, -Jason

Glyph

5:54 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

On May 22, 2017, at 5:38 PM, Jason Litzinger wrote:

...
Thanks for spotting it - please submit a PR to fix it :) Copy that, was planning on it, though originally the context was in comparison to an existing benchmark.

I assume the process/coding policies match submissions for Twisted Proper? I briefly searched for docs specific to benchmark updates but didn't see anything glaring.

It's a bit less formal, but largely similar. The main difference is that you should just file a Github issue rather than a Trac ticket.

...

I'm sure I'll need some guidance as to the best way to do this in a fully isolated environment. For example, the host cert is somewhat tricky. The simplest thing would be to serve up a self-signed cert, but I think a realistic benchmark needs to validate a full chain and do hostname verification. Is a PR discussion the best place for these questions?

You can keep asking questions here. The benchmark could use the Cryptography x509 layer to do setup for a custom CA cert, similar to the way many unit tests in Twisted already do... -glyph

Jean-Paul Calderone

12:16 p.m.

On Tue, May 23, 2017 at 1:54 AM, Glyph wrote:

...

...
On May 22, 2017, at 5:38 PM, Jason Litzinger wrote:

...
Thanks for spotting it - please submit a PR to fix it :) Copy that, was planning on it, though originally the context was in comparison to an existing benchmark.

I assume the process/coding policies match submissions for Twisted Proper? I briefly searched for docs specific to benchmark updates but didn't see anything glaring.

It's a bit less formal, but largely similar. The main difference is that you should just file a Github issue rather than a Trac ticket.

...
I'm sure I'll need some guidance as to the best way to do this in a fully isolated environment. For example, the host cert is somewhat tricky. The simplest thing would be to serve up a self-signed cert, but I think a realistic benchmark needs to validate a full chain and do hostname verification. Is a PR discussion the best place for these questions?

You can keep asking questions here.

The benchmark could use the Cryptography x509 layer to do setup for a custom CA cert, similar to the way many unit tests in Twisted already do...

The cryptography APIs for making certs are pretty straightforward and well documented. But if another example helps, here's some code that creates a self-signed ca cert and a client cert with an intermediate cert in between: https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd... https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd... Jean-Paul

Jason Litzinger

24 May 24 May

2:59 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

The cryptography APIs for making certs are pretty straightforward and well documented. But if another example helps, here's some code that creates a self-signed ca cert and a client cert with an intermediate cert in between:

https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd...

https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd...

Thanks! I always find that working examples helps an API click for me, so this is much appreciated. I'll post here when I submit the PR. -Jason

...

_______________________________________________ Twisted-web mailing list Twisted-web@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web

Jason Litzinger

3 Jul 3 Jul

4:58 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

The cryptography APIs for making certs are pretty straightforward and well documented. But if another example helps, here's some code that creates a self-signed ca cert and a client cert with an intermediate cert in between:

https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd...

https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd...

I'm diving into adding benchmarks for testing HTTPS and this has been very helpful. I did have one question, in the cert function neither the pubkey or privkey parameters are used, rather, a_key is always used as both the public and the signing key. Is that intentional? Or, should the public key be the pubkey value and the signing key the privkey value? Meaning, each cert uses the supplied public key and the signing order is: a signs a a signs b b signs c If so, do you want me to send you a PR for this change? Thanks again, -Jason

Glyph

6:15 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

On Jul 2, 2017, at 9:58 PM, Jason Litzinger wrote:

...
The cryptography APIs for making certs are pretty straightforward and well documented. But if another example helps, here's some code that creates a self-signed ca cert and a client cert with an intermediate cert in between:

https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd...

https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd...

I'm diving into adding benchmarks for testing HTTPS and this has been very helpful. I did have one question, in the cert function neither the pubkey or privkey parameters are used, rather, a_key is always used as both the public and the signing key. Is that intentional?

Or, should the public key be the pubkey value and the signing key the privkey value? Meaning, each cert uses the supplied public key and the signing order is:

a signs a a signs b b signs c

If so, do you want me to send you a PR for this change?

I don't know much about txkube, but my understanding of certificate chains suggests that this is what it is trying to test. It looks like this might succeed by accident since everything is signed by the root, and therefore trusted? It also appears to be implied that the chain is verified somehow, but the invariant is actually just about the subject/issuer pairs: https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd... https://github.com/LeastAuthority/txkube/blob/faa0374fcef6d089af39a98310f1bd... so something else must be validating the chain. -glyph

Jean-Paul Calderone

11:08 a.m.

On Mon, Jul 3, 2017 at 12:58 AM, Jason Litzinger wrote:

...

...
The cryptography APIs for making certs are pretty straightforward and well documented. But if another example helps, here's some code that creates a self-signed ca cert and a client cert with an intermediate cert in between:

https://github.com/LeastAuthority/txkube/blob/ faa0374fcef6d089af39a98310f1bd798eb54b08/src/txkube/test/ test_authentication.py#L17-L29

https://github.com/LeastAuthority/txkube/blob/ faa0374fcef6d089af39a98310f1bd798eb54b08/src/txkube/test/ test_authentication.py#L276-L309

I'm diving into adding benchmarks for testing HTTPS and this has been very helpful. I did have one question, in the cert function neither the pubkey or privkey parameters are used, rather, a_key is always used as both the public and the signing key. Is that intentional?

Or, should the public key be the pubkey value and the signing key the

...

privkey value? Meaning, each cert uses the supplied public key and the signing order is:

a signs a a signs b b signs c

If so, do you want me to send you a PR for this change?

Heya Jason, Thanks for pointing this out. Yes, it's a bug. Also, it turns out re-using serial numbers is a bad idea too. I don't think their mistake *currently* hurts the tests but it would be great to get them both fixed. So a PR would be quite welcome. Regarding Glyph's comments about where the chain is verified - that's still handled by the TLS library. The Chain invariant in _authentication.py is just a superficial sanity check to catch the common problem of "a-b-c" vs "c-b-a" chain certificate order. Thanks! Jean-Paul

Jason Litzinger

4 Jul 4 Jul

3:54 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

...

Thanks for pointing this out. Yes, it's a bug. Also, it turns out re-using serial numbers is a bad idea too. I don't think their mistake *currently* hurts the tests but it would be great to get them both fixed. So a PR would be quite welcome. No problem. I noticed most of the branches merged into txkube correspond to issues, do you want an issue first? Also, is the branch naming convention <issue>.<branch-name>?

-Jason

Jean-Paul Calderone

12:30 p.m.

On Mon, Jul 3, 2017 at 11:54 PM, Jason Litzinger wrote:

...

...
Thanks for pointing this out. Yes, it's a bug. Also, it turns out re-using serial numbers is a bad idea too. I don't think their mistake *currently* hurts the tests but it would be great to get them both fixed. So a PR would be quite welcome. No problem. I noticed most of the branches merged into txkube correspond to issues, do you want an issue first? Also, is the branch naming convention <issue>.<branch-name>?

Yes and yes. Thanks!

...

-Jason

_______________________________________________ Twisted-web mailing list Twisted-web@twistedmatrix.com https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web

Jason Litzinger

6 Jul 6 Jul

5:45 a.m.

New subject: Question regarding custom contextFactory for twisted.web.client.Agent

On Mon, May 22, 2017 at 10:54:06PM -0700, Glyph wrote:

...

The benchmark could use the Cryptography x509 layer to do setup for a custom CA cert, similar to the way many unit tests in Twisted already do... FYI, WIP here:

https://github.com/jlitzingerdev/twisted-benchmarks/commit/37ca9e5fe64a94e2a... I will be doing more testing/cleanup before submitting a PR. I'd also like to write a benchmark using persistent connections as that is a different but interesting path (as indicated in web.py). That of course means one server/many connections. Cheers, -Jason

2484

Age (days ago)

2539

Last active (days ago)

List overview

Download

15 comments

3 participants

participants (3)

Glyph
Jason Litzinger
Jean-Paul Calderone

Question regarding custom contextFactory for twisted.web.client.Agent

tags

participants (3)