tail
Barry
barry at barrys-emacs.org
Mon May 9 16:58:45 EDT 2022
> On 9 May 2022, at 17:41, ram at zedat.fu-berlin.de wrote:
>
> Barry Scott <barry at barrys-emacs.org> writes:
>> Why use tiny chunks? You can read 4KiB as fast as 100 bytes
>
> When optimizing code, it helps to be aware of the orders of
> magnitude
That is true and well known to me; now show how what I said is wrong.
The OS is going to DMA at least 4 KiB, and with read-ahead more like 64 KiB.
So I can get that into Python's memory on the same timescale as 1 byte,
because it is the setup of the I/O that is expensive, not the bytes
transferred.
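
If you want to see the effect, here is a rough sketch (mine, not code from
this thread; the file name is just a placeholder) that times reading the
same file with different chunk sizes:

import time

def read_in_chunks(path, chunk_size):
    """Read the whole file chunk_size bytes at a time; return total bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            total += len(data)
    return total

if __name__ == "__main__":
    path = "some_large_file.bin"            # placeholder: any reasonably large file
    for size in (1, 100, 4096, 65536):      # the 1-byte case will be painfully slow
        start = time.perf_counter()
        nbytes = read_in_chunks(path, size)
        elapsed = time.perf_counter() - start
        print(f"chunk={size:6d}  bytes={nbytes}  time={elapsed:.3f}s")

The per-call setup cost dominates: the 4 KiB and 64 KiB runs finish in
roughly comparable time, while the tiny-chunk runs are orders of magnitude
slower for the same amount of data.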
Barry
> . Code that is more cache-friendly is faster, that is,
> code that holds data in a single region of memory and that uses
> regular patterns of access. Chandler Carruth talked about this,
> and I made some notes when watching the video of his talk:
>
> CPUS HAVE A HIERARCHICAL CACHE SYSTEM
> (from a 2014 talk by Chandler Carruth)
>
> One cycle on a 3 GHz processor              1 ns
> L1 cache reference                        0.5 ns
> Branch mispredict                           5 ns
> L2 cache reference                          7 ns              14x L1 cache
> Mutex lock/unlock                          25 ns
> Main memory reference                     100 ns              20x L2, 200x L1
> Compress 1K bytes with Snappy           3,000 ns
> Send 1K bytes over 1 Gbps network      10,000 ns    0.01 ms
> Read 4K randomly from SSD             150,000 ns    0.15 ms
> Read 1 MB sequentially from memory    250,000 ns    0.25 ms
> Round trip within same datacenter     500,000 ns    0.5  ms
> Read 1 MB sequentially from SSD     1,000,000 ns    1    ms   4x memory
> Disk seek                          10,000,000 ns   10    ms   20x datacenter RT
> Read 1 MB sequentially from disk   20,000,000 ns   20    ms   80x memory, 20x SSD
> Send packet CA->Netherlands->CA   150,000,000 ns  150    ms
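
(An aside from me, not part of the notes above: here is a small sketch of
the access-pattern point, summing the same list sequentially and then in a
shuffled order. The shuffled pass makes irregular memory references and is
typically noticeably slower, even with CPython's interpreter overhead.)

import random
import time

N = 5_000_000
data = list(range(N))          # int objects laid out roughly in allocation order
indices = list(range(N))
random.shuffle(indices)        # same indices, irregular order

start = time.perf_counter()
total = sum(data[i] for i in range(N))     # regular, cache-friendly walk
print("sequential:", time.perf_counter() - start, total)

start = time.perf_counter()
total = sum(data[i] for i in indices)      # scattered, cache-hostile accesses
print("shuffled:  ", time.perf_counter() - start, total)
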
>
> . Remember how recently people here talked about how you cannot
> copy text from a video? Then, how did I do it? Turns out, for my
> operating system, there's a screen OCR program! So I did this OCR
> and then manually corrected a few wrong characters, and was done!
>