Bug 143732 - [patch] mtree(8) does a full hierarchy walk when requested to just update structure (-u -e)
Summary: [patch] mtree(8) does a full hierarchy walk when requested to just update str...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: Unspecified
Hardware: Any Any
: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-02-10 19:30 UTC by David Naylor
Modified: 2014-12-16 14:33 UTC (History)
1 user (show)

See Also:


Attachments
file.diff (1.78 KB, patch)
2010-02-10 19:30 UTC, David Naylor
no flags Details | Diff
mtree.diff (4.42 KB, patch)
2010-09-01 08:43 UTC, David Naylor
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description David Naylor 2010-02-10 19:30:01 UTC
mtree does a full hierarchy walk when requested to update a (directory) structure.  This is unnecessary as mtree need only visit the directories or files in question.  This can have particular performance impact if the path's structure is extensive (i.e. /usr/local with 1000+ ports installed) and the structure to update is relatively short or when IO is slow (i.e. stacked unionfs).  

This has real world impact as this mtree command is run every time a port is installed.

Fix: The attached patch prevents a hierarchy walk if only an update is requested.  It simulated the structure returned from fts_read as required by compare.  

An improvement in performance could be implemented using chdir (as fts_* does) however that would require a pervasive change to existing code.  

Patch attached with submission follows:
How-To-Repeat: The following script simulates slow IO (using stacked unionfs).  Note that the time mtree takes is similar to find when unpatched but a fraction of the time (between 60% to 80% quicker [1 - new/old]).  I expect this performance to be better when the path is populated with extra content (e.g. installed ports).  

The script should be run as root in an empty root (the number of overlayed unionfs are specified as the first command, please see http://lists.freebsd.org/pipermail/freebsd-current/2010-January/015040.html for possible problems [system freeze, kstack overflow]).  

On my computer with 155 overlayed unionfs the performance I got:
# ./unionfs_mtree.sh 155 | tail -n 1
mtree 155:       123.77 real         0.03 user       112.67 sys (unpatched)
mtree 155:        29.94 real         0.00 user        29.81 sys (patched)

The script (unionfs_mtree.sh):

#!/bin/sh
set -e

mount_unionfs() {

  count=1
  while [ $count -lt $1 ]
  do
    mount -rt unionfs -o noatime $count top
    count=$(($count + 1))
  done

}

umount_unionfs() {

  count=$1
  while [ $count -gt 1 ]
  do
    count=$(($count - 1))
    umount top
  done

}

mkdir -p top

[ $# -eq 1 ] || (echo "Please specify directory count!!!"; false)

for i in `jot $1`
do
  mount_unionfs $i
  mkdir -p $i
  mount -t unionfs -o noatime $i top

  set +e
  trap "true" INT TERM EXIT
  echo -n "find $i:  "
  time find top > /dev/null
  status=$?

  if [ $status -eq 0 ]
  then
    echo -n "mtree $i: "
    time /usr/sbin/mtree -U -f /usr/ports/Templates/BSD.local.dist -d -e -p top
    status=$?
  fi
  trap - INT TERM EXIT
  set -e

  umount top
  umount_unionfs $i
  [ $status -eq 0 ] || (echo "Terminated"; false)
done
Comment 1 David Naylor 2010-02-15 16:34:46 UTC
Using my "real" world benchmark (see http://unix.derkeiler.com/Mailing-
Lists/FreeBSD/current/2010-01/msg00453.html) I have achieved a 20% speedup 
using the previously attached patch.  My results:

# time ./ports-union-builder.sh (old mtree)
     8123.25 real      2280.53 user      6319.77 sys

# time ./ports-union-builder.sh (new mtree)
     6456.11 real      2272.07 user      5778.74 sys

By my estimated the hierarchical walking of mtree resulted in an additional 28 
minutes real time and 9 minutes system time.
Comment 2 David Naylor 2010-09-01 08:43:53 UTC
Thanks to the feedback from the hackers ML I've updated the patch.  It now is 
more aggressive in using the optimisation and handles buffer overflows (that 
previously went unhandled by mtree).  

Please see attached for the patch.
Comment 3 David Naylor freebsd_committer freebsd_triage 2014-12-16 14:33:49 UTC
This performance issue no longer appears reproducable.