DragonFlyBSD/src 6aaf5cb sys/vfs/hammer2 hammer2_flush.c hammer2_io.c

hammer2 - performance pass

* Remove the vfs.hammer2.cluster_write sysctl and stop using
  cluster_write() for block device I/O.  It interacted badly with
  the common unlock/lock sequences on chains, which acquire and
  retire the DIO (and usually the underlying buffer as well) many
  times before the data really needs to be committed.

  This greatly reduces unnecessary writes to disk.

* Increase HAMMER2_FLUSH_DEPTH_LIMIT to 60.  It was set to 10 for
  debugging purposes, which created O(N^2) overhead in
  hammer2_flush().  20,000 dirty inodes would translate to
  30 million chain scans, resulting in CPU-bound stalls for long
  periods of time.

  Fixing this value reduces a 20,000 dirty inode flush to around
  200,000 chain scans (over 100x fewer).

* Use hammer2_chain_ref_hold() and hammer2_chain_drop_unhold()
  to reduce the amount of buffer cache buffer cycling that occurs
  during a flush, by retaining the DIO associated with a parent
  chain across its unlock/recurse/relock sequence.

  The number of buffers held locked is limited by the flush recursion
  depth.
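
The hold/unhold idea in the last item can be illustrated with a toy
model.  This is only a sketch under stated assumptions: the toy_*
names and struct toy_chain are hypothetical and are not the hammer2
API or its data structures.  The model assumes that unlocking a chain
retires its buffer unless an extra hold is outstanding, and it counts
how often the parent's buffer has to be acquired while the flush
unlocks, recurses, and relocks around each child.

```c
#include <assert.h>

/* Toy stand-in for a chain and its backing device buffer (DIO). */
struct toy_chain {
    int locked;      /* 1 while the chain is locked */
    int holds;       /* extra holds keeping the buffer resident */
    int dio_loaded;  /* 1 while the buffer is resident */
    int dio_loads;   /* number of times the buffer was acquired */
};

static void toy_lock(struct toy_chain *c)
{
    assert(!c->locked);
    c->locked = 1;
    if (!c->dio_loaded) {       /* buffer was retired, acquire again */
        c->dio_loaded = 1;
        c->dio_loads++;
    }
}

static void toy_unlock(struct toy_chain *c)
{
    assert(c->locked);
    c->locked = 0;
    if (c->holds == 0)          /* no hold outstanding: retire buffer */
        c->dio_loaded = 0;
}

/* Hypothetical analogue of taking a hold before unlocking. */
static void toy_ref_hold(struct toy_chain *c)
{
    c->holds++;
}

/* Hypothetical analogue of releasing that hold after relocking. */
static void toy_drop_unhold(struct toy_chain *c)
{
    assert(c->holds > 0);
    c->holds--;
    if (c->holds == 0 && !c->locked)
        c->dio_loaded = 0;
}

/*
 * Simulate flushing a parent with nchildren children.  The parent is
 * unlocked around each recursion and relocked afterwards.  Returns
 * the number of times the parent's buffer had to be acquired.
 */
static int toy_flush_parent(int nchildren, int use_hold)
{
    struct toy_chain parent = { 0, 0, 0, 0 };
    int i;

    toy_lock(&parent);
    for (i = 0; i < nchildren; i++) {
        if (use_hold)
            toy_ref_hold(&parent);
        toy_unlock(&parent);
        /* ... recursion into child i would happen here ... */
        toy_lock(&parent);
        if (use_hold)
            toy_drop_unhold(&parent);
    }
    toy_unlock(&parent);
    return parent.dio_loads;
}
```

In this model toy_flush_parent(4, 0) acquires the buffer 5 times,
while toy_flush_parent(4, 1) acquires it once.  Each level of
recursion keeps at most one such hold outstanding, which is why the
number of held buffers is bounded by the recursion depth.
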
Delta     File
+53 -4    sys/vfs/hammer2/hammer2_flush.c
+21 -8    sys/vfs/hammer2/hammer2_io.c
+0 -3     sys/vfs/hammer2/hammer2_vfsops.c
+0 -1     sys/vfs/hammer2/hammer2.h
+74 -16   4 files
