[Pandas-dev] pandas new infrastructure (OVH donation)

Marc Garcia garcia.marc at gmail.com
Thu Jan 12 07:58:55 EST 2023


Hi all, an update on this.

OVH added dedicated hardware to their public cloud, and we can now create
metal instances on demand. I created one, isolated a CPU, and I'm running
our benchmark suite on it for the last 50 commits of the project. Each
run takes 2 hours on the cheapest dedicated server, which costs 170 EUR per
month. I don't think more expensive instances will make a difference: they
seem to have more cores, but I'm not sure a single core (what we're
using) would be any faster. If everything looks good when the current run
finishes, I'll set things up so benchmarks start running automatically on
that machine, and results are published.
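
To put those numbers in perspective, here is a rough back-of-the-envelope
sketch. It assumes the machine runs benchmarks back to back with no idle time
and that a month has 30 days; both are assumptions for illustration, not
measurements, and the function names are made up for this example.

```python
# Rough capacity estimate for a single benchmark server.
# Figures taken from this thread: 2 hours per run, 170 EUR per month,
# a backlog of 50 commits.
# Assumptions (not measured): runs execute back to back, 30-day month.

HOURS_PER_RUN = 2
MONTHLY_COST_EUR = 170
BACKLOG_COMMITS = 50
DAYS_PER_MONTH = 30  # assumption


def runs_per_day(hours_per_run):
    """Number of full benchmark runs that fit in 24 hours on one core."""
    return 24 / hours_per_run


def backlog_hours(commits, hours_per_run):
    """Total hours needed to benchmark a backlog of commits sequentially."""
    return commits * hours_per_run


def cost_per_run(monthly_cost, hours_per_run, days_per_month):
    """Server cost attributed to one run if the machine is always busy."""
    return monthly_cost / (runs_per_day(hours_per_run) * days_per_month)


if __name__ == "__main__":
    print("runs per day:", runs_per_day(HOURS_PER_RUN))
    print("backlog hours:", backlog_hours(BACKLOG_COMMITS, HOURS_PER_RUN))
    print("EUR per run:",
          round(cost_per_run(MONTHLY_COST_EUR, HOURS_PER_RUN, DAYS_PER_MONTH), 2))
```

Under those assumptions, one server handles at most 12 runs per day, so it can
only keep up if the project averages 12 or fewer benchmarked commits per day.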

In parallel, I'm doing some research and some tests to see if we can detect
performance regressions in PRs. I don't think it makes much sense to do it
with asv, but I think it's feasible to build something specific. I'll send
an update with the results of the research and a proposal when I'm ready.
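
To make the idea concrete, the simplest form of such a check could look like
the sketch below. This is only an illustration, not the proposal itself:
`is_regression` is a hypothetical helper, the timing samples are assumed to
come from repeated runs of one benchmark on the base branch and on the PR, and
the 1.1 threshold is an arbitrary assumption.

```python
from statistics import median


def is_regression(base_times, pr_times, factor=1.1):
    """Flag a regression when the PR's median time exceeds the base
    branch's median time by more than `factor` (assumed threshold)."""
    if not base_times or not pr_times:
        raise ValueError("both samples must be non-empty")
    # The median is less sensitive to occasional noisy runs than the mean.
    return median(pr_times) > factor * median(base_times)


if __name__ == "__main__":
    base = [1.00, 1.10, 0.90]  # made-up timings in seconds
    pr = [1.50, 1.40, 1.60]    # made-up timings in seconds
    print(is_regression(base, pr))
```

A real implementation would need to deal with noise much more carefully
(more samples, a statistical test, stable hardware), which is what the
research is about.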

If you have any feedback, just let me know.

On Tue, Nov 15, 2022 at 3:05 PM Marc Garcia <garcia.marc at gmail.com> wrote:

> Quick update about the new infrastructure.
>
> - New hosting for the website seems to be working just fine, no issues
> detected. I just stopped nginx on the old server, so if anything there is
> still being used we will hopefully notice. If there are no issues and
> no objections, I'll switch off the server in a few days.
>
> - We should be able to start using dedicated hardware for the benchmarks
> from our OVH cloud account in December. The servers will work like regular
> cloud instances, but on dedicated hardware. We'll be doing some tests to try
> to get more stability in the benchmarks, and hopefully we can get something
> even better than what we have now once the OVH hardware is ready.
>
> On Thu, Nov 10, 2022 at 9:31 PM Marc Garcia <garcia.marc at gmail.com> wrote:
>
>> Oh, I forgot we were not using the rendered asv website from the old
>> server. We're using nginx, so I can easily make pandas.pydata.org/speed
>> show the content from that url. But I guess we can also check them directly
>> in the github pages url, not sure if it makes a difference.
>>
>> Let me know if it's useful, and I'll set it up. Thanks for the info!
>>
>> On Thu, Nov 10, 2022, 21:18 Richard Shadrach <rhshadrach at gmail.com>
>> wrote:
>>
>>> > Besides the open PR, the only thing missing is the benchmarks at
>>> > pandas.pydata.org/speed. The link is not working now, since I haven't
>>> > moved the benchmarks yet. But before moving them, we should also make the
>>> > changes in the benchmarks repo, so benchmark results start to synchronize
>>> > with the new server. Can someone with access to the server take care of
>>> > it, please? (DM me for the new server info.)
>>>
>>> The link https://asv-runner.github.io/asv-collection/pandas/ is being
>>> automatically updated. Can we point to this URL for now, given that we may
>>> be changing how the benchmarks are run? If it's desirable to have the
>>> benchmarks results on the docs server and our current solution is deemed to
>>> be the long term one, I can work on the synchronization. However, I'm
>>> reluctant to put in that work if it's just going to go away in favor of
>>> the easier solution.
>>>
>>> Best,
>>> Richard
>>>
>>>
>>> On Wed, Nov 9, 2022 at 11:50 PM Marc Garcia <garcia.marc at gmail.com>
>>> wrote:
>>>
>>>> Some updates (the ones shared in yesterday's call, and some new ones).
>>>>
>>>> The cloud (bucket) storage didn't seem convenient for different
>>>> reasons, so I moved forward with a regular Ubuntu instance (the cheapest, 2
>>>> cores, 7 GB RAM, 24 EUR/month). I have now moved all the traffic to the new
>>>> instance, and since we only serve static files, the instance seems
>>>> to be more than enough to handle our traffic (I didn't see CPU or RAM
>>>> exceed 4% usage in the time I've been monitoring the resources). I've got a
>>>> PR open (#49614) to start syncing our web/docs with the new server. In a few
>>>> hours I'll stop nginx on the old server (I confirmed there is no
>>>> traffic already; since we use Cloudflare, our DNS changes are immediate).
>>>> And in a few days I'll switch off the instance in Rackspace.
>>>>
>>>> Besides the open PR, the only thing missing is the benchmarks at
>>>> pandas.pydata.org/speed. The link is not working now, since I haven't
>>>> moved the benchmarks yet. But before moving them, we should also make the
>>>> changes in the benchmarks repo, so benchmark results start to synchronize
>>>> with the new server. Can someone with access to the server take care of it,
>>>> please? (DM me for the new server info.)
>>>>
>>>> On running the benchmarks in OVH, the VM instances don't seem to be
>>>> stable enough to keep track of performance over time, as expected.
>>>> Full results of the tests I did are in this repo:
>>>> https://gitlab.com/datapythonista/pandas_ovh_benchmarks . OVH is
>>>> checking the best way to give us access to dedicated hardware, and I'll
>>>> continue with that once we've got it. In parallel, I'm planning to
>>>> do some tests to see if it would be feasible to use valgrind's cachegrind
>>>> (or equivalent) so that instead of measuring time, we measure CPU cycles.
>>>> That should make benchmarking much easier and faster, as any hardware would
>>>> work, and benchmarks could be run in parallel. With a dedicated server
>>>> we're likely to only be able to use a single core to have stable results,
>>>> which means that we can only run one benchmark suite per server every 3
>>>> hours. But implementing the cachegrind approach can be tricky.
>>>>
>>>> About CIrun, as you say Joris, it's like a middleman between our
>>>> hardware (the OVH OpenStack API to create/delete instances) and GitHub
>>>> Actions. We need to add an extra YAML file with the CIrun configuration,
>>>> and other than that we should be able to use OVH hardware directly from our
>>>> current CI jobs without changes (except, I assume, one entry saying which
>>>> instance we want to use for the jobs running on OVH).
>>>>
>>>> Please let me know if you have any feedback, in particular if you see any
>>>> problem with our website that could be caused by the migration.
>>>>
>>>> Cheers,
>>>>
>>>> On Thu, Nov 10, 2022 at 12:43 AM Joris Van den Bossche <
>>>> jorisvandenbossche at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Sat, 5 Nov 2022 at 15:24, Marc Garcia <garcia.marc at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> pandas has received a donation from OVHcloud
>>>>>> <https://www.ovhcloud.com/> to support the project infrastructure,
>>>>>> with OVHcloud public cloud credits (an initial amount of 10,000 EUR for a
>>>>>> period of one year). OVH is open to sponsoring longer term and also other
>>>>>> projects of the ecosystem (or NumFOCUS as a whole), but we started with
>>>>>> this to get feedback at a smaller scale first.
>>>>>>
>>>>>> The credits will be used initially for:
>>>>>> - Hosting of the pandas website
>>>>>> - Running the pandas benchmarks
>>>>>> - Speeding up the project CI
>>>>>>
>>>>>> Below I detail what I have in mind to set up for each. If anyone is
>>>>>> interested in getting involved, or has ideas or comments, please let me
>>>>>> know. I'll publish updates here as there is progress on this.
>>>>>>
>>>>>>
>>>>>> - Website: I'm planning to experiment with splitting the website in two
>>>>>> (it'll be transparent for users). The website and the stable docs, which
>>>>>> receive most of the traffic, can probably be stored in Cloudflare Pages.
>>>>>> We're already using Cloudflare as a CDN, so instead of using it as a cache,
>>>>>> we can publish the documents there. The rest of the docs (old versions and
>>>>>> the dev version) can be hosted in OVHcloud bucket storage. Response
>>>>>> times may be a bit slower, but our website is bigger than the Cloudflare
>>>>>> quota, and having rarely accessed old docs in a CDN seems unnecessary
>>>>>> anyway.
>>>>>>
>>>>>
>>>>> Splitting like that makes sense! (_if_ it is within quota, we could
>>>>> maybe consider keeping the dev docs, and only move old docs to bucket
>>>>> storage?)
>>>>>
>>>>>
>>>>>>
>>>>>> - Benchmarks: OVHcloud instances have guaranteed hardware, and we'll
>>>>>> be checking if this is enough for the results of the benchmarks to be
>>>>>> consistent over runs, or if there is too much variability and we need
>>>>>> dedicated hardware. If consistency is good enough that would be great,
>>>>>> since our benchmarks mostly use one core, and using dedicated hardware is
>>>>>> likely to waste a lot of resources, since most servers will likely
>>>>>> have 16 cores or more. We'll discuss with OVH if dedicated hardware is
>>>>>> needed, as at the moment their public cloud doesn't offer it (there is an
>>>>>> alpha for providing dedicated instances, but we need to check with them).
>>>>>>
>>>>>> - Faster CI: Our GitHub runners are small, and most builds take
>>>>>> around one hour or more to finish. We should be able to use bigger OVH
>>>>>> instances for our existing CI pretty easily, via their OpenStack API and
>>>>>> CIrun.
>>>>>>
>>>>>
>>>>> I am not familiar with CIrun, but quickly checking it, that would
>>>>> basically be using our current github actions but through their
>>>>> "self-hosted" runner feature?
>>>>>
>>>>>
>>>>>> _______________________________________________
>>>>>> Pandas-dev mailing list
>>>>>> Pandas-dev at python.org
>>>>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>>>>
>>>>
>>>