Hi,

I wrote two scripts based on INADA-san's work to (1) download the source code of the PyPI top 5000 projects and (2) search for a regex in these projects (compressed source archives).

You can use these tools if you work on an incompatible Python or C API change to estimate how many projects are impacted.

The HPy project created a Git repository for a similar need (latest update in June 2021):
https://github.com/hpyproject/top4000-pypi-packages

There are also online services for code search:

* GitHub: https://github.com/search
* https://grep.app/ (I haven't tried it yet)
* Debian: https://codesearch.debian.net/

(1) Download

Script:
https://github.com/vstinner/misc/blob/main/cpython/download_pypi_top.py

Usage: download_pypi_top.py PATH

It uses this JSON file:
https://hugovk.github.io/top-pypi-packages/top-pypi-packages-30-days.min.jso...

From this service:
https://hugovk.github.io/top-pypi-packages/

As of December 1, out of 5000 projects, it only downloads 4760 tarball and ZIP archives: I guess that 240 projects don't provide a source archive. The archives take around 5.2 GB of disk space.

(2) Code search

First, I used the fast and nice "ripgrep" tool with the command "rg -zl REGEX path/*.{zip,gz,bz2,tgz}" (-z searches in ZIP and tarball archives). But it doesn't show the path inside the archive, and it searches in files generated by Cython whereas I wanted to ignore these files.

So I wrote a short Python script which decompresses tarball and ZIP archives in memory and looks for a regex:
https://github.com/vstinner/misc/blob/main/cpython/search_pypi_top.py

Usage: search_pypi_top.py "REGEX" output_filename

The command line parsing is hardcoded, and so is pypi_dir = "PYPI-2021-12-01-TOP-5000" :-D

It ignores files generated by Cython and .so binary files (Linux dynamic libraries).

While "rg" is very fast, my script is very slow.
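
For illustration, the in-memory search described above could be sketched roughly as follows. This is not the actual search_pypi_top.py: the function names, the CYTHON_MARKER check (it assumes that files generated by Cython start with a "Generated by Cython" comment) and the ".so" filter are assumptions made for this example.

```python
import re
import tarfile
import zipfile

# Assumption: files generated by Cython carry this comment near the top.
CYTHON_MARKER = b"Generated by Cython"


def iter_archive_files(filename):
    """Yield (member_name, content_bytes) for each regular file of an archive.

    Nothing is extracted to disk: members are read into memory.
    """
    if filename.endswith(".zip"):
        with zipfile.ZipFile(filename) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield info.filename, zf.read(info)
    else:
        # Mode "r" lets tarfile detect gzip/bzip2/xz compression itself.
        with tarfile.open(filename) as tf:
            for member in tf:
                if member.isfile():
                    yield member.name, tf.extractfile(member).read()


def search_archive(filename, regex):
    """Return a list of (member_name, line_number, line) for matching lines."""
    pattern = re.compile(regex.encode())
    results = []
    for name, content in iter_archive_files(filename):
        if name.endswith(".so"):
            # Skip Linux dynamic libraries.
            continue
        if CYTHON_MARKER in content[:1000]:
            # Skip files that look generated by Cython.
            continue
        for lineno, line in enumerate(content.splitlines(), start=1):
            if pattern.search(line):
                results.append((name, lineno, line.decode("utf-8", "replace")))
    return results
```

Calling search_archive() on each archive of the download directory with a regex such as r"PyUnicode_AsUTF8" gives the path inside the archive and the line number of every match, which is the information that "rg -zl" does not show.
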
I don't mind the slowness, though: once the regex is written, I only need to run the search once, so I can wait 10-15 min ;-) I prefer to wait longer and get a more accurate result. Also, there is room for enhancement, like running multiple jobs in different processes or threads.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.