Flushing buffer on file copy on linux
Antoine Pitrou
solipsis at pitrou.net
Wed Aug 15 11:26:02 EDT 2012
J <dreadpiratejeff <at> gmail.com> writes:
>
> Now, the problem I have is that linux tends to buffer data writes to a
> device, and I want to work around that. When run in normal non-stress
> mode, the program is slow enough that the linux buffers flush and put
> the file on disk before the hash occurs. However, when run in stress
> mode, what I'm finding is that it appears that the files are possibly
> being hashed while still in the buffer, before being flushed to disk.
Your analysis is partly wrong. It is true that the files can be hashed from
in-memory buffers; but even if you flush the buffers to disk using standard
techniques (such as fsync()), those buffers still exist in memory (the kernel
keeps them in its page cache), and therefore the file will still be hashed
from memory (for obvious efficiency reasons).
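To make the point concrete, here is a minimal sketch of the "standard
technique" (the function names copy_and_fsync and sha256_of are made up for
illustration, they are not from your program). The fsync() call makes sure the
data reaches the device, but the hash that follows will most likely still be
computed from the cached pages, not from the disk:

```python
import hashlib
import os
import shutil


def copy_and_fsync(src, dst, chunk_size=1 << 20):
    """Copy src to dst, then ask the kernel to write dst's data to the device."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
        fout.flush()             # push Python's own buffer down to the kernel
        os.fsync(fout.fileno())  # push the kernel's dirty pages to the device


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks; the reads may be served from the page cache."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

After copy_and_fsync() returns, the data is on the device, but nothing has
been evicted from memory, so sha256_of(dst) does not prove that the device
returns the right bytes.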
I don't think there's a portable way to get rid of the in-memory buffers
entirely, but under Linux you can write "1" to the special file
/proc/sys/vm/drop_caches:
$ sudo sh -c "echo 1 > /proc/sys/vm/drop_caches"
Or, to quote the /proc man page:
    /proc/sys/vm/drop_caches (since Linux 2.6.16)
        Writing to this file causes the kernel to drop clean
        caches, dentries and inodes from memory, causing that
        memory to become free.

        To free pagecache, use echo 1 > /proc/sys/vm/drop_caches;
        to free dentries and inodes, use echo 2 >
        /proc/sys/vm/drop_caches; to free pagecache, dentries and
        inodes, use echo 3 > /proc/sys/vm/drop_caches.

        Because this is a nondestructive operation and dirty
        objects are not freeable, the user should run sync(8)
        first.
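
If you would rather do this from inside the Python program than from a shell,
something along these lines should work. This is only a sketch: the function
name drop_page_cache and its parameters are hypothetical, the process must run
as root to write to the real file, and os.sync() requires Python 3.3 or later
(on older versions you could call the sync command instead):

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_page_cache(control_file=DROP_CACHES, mode="1"):
    """Write dirty data to disk, then ask the kernel to drop clean caches.

    mode is "1" (pagecache), "2" (dentries and inodes) or "3" (both),
    as described in the man page excerpt above.
    """
    if mode not in ("1", "2", "3"):
        raise ValueError("mode must be '1', '2' or '3'")
    os.sync()  # dirty pages are not freeable, so write them out first
    with open(control_file, "w") as f:
        f.write(mode + "\n")
```

Calling drop_page_cache() between the copy and the hash should force the hash
to read the file back from the device. Note that it drops the cache for the
whole system, not only for your file, so other programs will slow down for a
while.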
Regards
Antoine.
--
Software development and contracting: http://pro.pitrou.net