% ls -l /var/db/services.db
-rw-r--r--  1 root  wheel  2097920 Jan 29 04:05 /var/db/services.db

I've noted this in various places before, but there is no way that this file should be 2MB. I've added a review to exclude installation of /var/db/services.db for small installs, but I think the "answer" is to investigate what exactly is causing this file to explode in size.
87% of the file is zero bytes.

>>> f = file("/var/db/services.db").read()
>>> zb = 0
>>> for b in f:
...     if b == '\x00': zb += 1
...
>>> print zb
1836320

$ ls -l /var/db/services.db
-rw-r--r--  1 root  wheel  2097920 Jan 16 15:16 /var/db/services.db

1836320 / 2097920 = 0.87530506
As for whether a Berkeley DB file can be VACUUMed: maybe. http://stackoverflow.com/questions/8722687/berkeley-db-file-compression says yes for 5.x, but I don't think 5.x is what we ship in base (the libc db is 1.85). There is a port: databases/db5.
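For reference, a minimal sketch of what compaction looks like with the 5.x API from the databases/db5 port. This is illustrative only: 5.x cannot open the 1.85-format file that services_mkdb writes, so the database would first have to be converted (e.g. db_dump185 | db_load), and the file name and include/lib paths below are assumptions.

/*
 * Sketch: compacting a hash database with the Berkeley DB 5.x API
 * (databases/db5 port).  "services5.db" is assumed to be a 5.x-format
 * database; the db 1.85 file from services_mkdb cannot be opened
 * directly.
 *
 * cc -I/usr/local/include/db5 compact.c -L/usr/local/lib/db5 -ldb
 */
#include <stdio.h>
#include <string.h>
#include <db.h>

int
main(void)
{
	DB *dbp;
	DB_COMPACT c_data;
	int ret;

	if ((ret = db_create(&dbp, NULL, 0)) != 0) {
		fprintf(stderr, "db_create: %s\n", db_strerror(ret));
		return (1);
	}
	if ((ret = dbp->open(dbp, NULL, "services5.db", NULL,
	    DB_UNKNOWN, 0, 0)) != 0) {
		dbp->err(dbp, ret, "open");
		return (1);
	}
	/* DB_FREE_SPACE returns emptied pages to the filesystem. */
	memset(&c_data, 0, sizeof(c_data));
	if ((ret = dbp->compact(dbp, NULL, NULL, NULL, &c_data,
	    DB_FREE_SPACE, NULL)) != 0)
		dbp->err(dbp, ret, "compact");
	else
		printf("pages truncated: %u\n",
		    (unsigned)c_data.compact_pages_truncated);
	dbp->close(dbp, 0);
	return (ret == 0 ? 0 : 1);
}

Note that hash-database compaction was only added in the 5.x series, so even with conversion this only demonstrates the API, not that it would reclaim the zero bytes here.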
For entertainment, I ran services_mkdb against a services file with only 1 entry in it. The file was still 2MB!
I'm guessing that this is an initialization problem: the hash is being set up to handle way more elements than are really needed here.

% wc -l /etc/services
    2495 /etc/services

HASHINFO hinfo = {
	.bsize = 256,
	.ffactor = 4,
	.nelem = 32768,
	.cachesize = 1024,
	.hash = NULL,
	.lorder = 0
};

If I change the HASHINFO to be slightly less over-engineered (and less future-proof), I can get the *empty* services file down to 260k, but that's not really a huge improvement for a basically empty file. Should it be that big? I didn't really think I was going to have to go and learn Berkeley DB this week. :-)

Index: services_mkdb.c
===================================================================
--- services_mkdb.c	(revision 318297)
+++ services_mkdb.c	(working copy)
@@ -68,10 +68,10 @@
 static void	usage(void);
 
 HASHINFO hinfo = {
-	.bsize = 256,
-	.ffactor = 4,
-	.nelem = 32768,
-	.cachesize = 1024,
+	.bsize = 48,
+	.ffactor = 1,
+	.nelem = 4096,
+	.cachesize = 256,
 	.hash = NULL,
 	.lorder = 0
 };

-rw-r--r--  1 sbruno  sbruno  262720 May 15 12:04 services.db
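For anyone who wants to experiment with these parameters without rebuilding services_mkdb each time, here's a minimal standalone sketch that creates an empty hash database through the same dbopen(3) / db 1.85 interface in libc and reports the resulting on-disk size. The scratch path is an assumption.

/*
 * Sketch: create an empty db 1.85 hash database with a given HASHINFO
 * and print its on-disk size, to see how bsize/ffactor/nelem tuning
 * affects the initial file size.  /tmp/hinfo-test.db is just a
 * hypothetical scratch path.
 */
#include <sys/stat.h>
#include <db.h>
#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	HASHINFO hinfo = {
		.bsize = 48,		/* bucket size, in bytes */
		.ffactor = 1,		/* desired keys per bucket */
		.nelem = 4096,		/* expected number of elements */
		.cachesize = 256,
		.hash = NULL,		/* default hash function */
		.lorder = 0		/* host byte order */
	};
	DB *db;
	struct stat sb;

	db = dbopen("/tmp/hinfo-test.db", O_CREAT | O_RDWR | O_TRUNC,
	    0644, DB_HASH, &hinfo);
	if (db == NULL)
		err(1, "dbopen");
	(void)(db->close)(db);

	if (stat("/tmp/hinfo-test.db", &sb) == -1)
		err(1, "stat");
	printf("empty db size: %jd bytes\n", (intmax_t)sb.st_size);
	return (0);
}

Varying nelem and bsize in this sketch should show directly how much of the 2MB is nothing but pre-allocated, never-filled buckets.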