Disk-level Clustering for NetBSD

These patches add disk-level clustering to NetBSD-1.6. There's an explanation of what they implement over on kerneltrap.org. When I get some hard figures on their performance, I'll put them up here.

The Patches

The copyright on these patches is held by Network Storage Solutions, Inc., who are now in Chapter 11 bankruptcy proceedings [though that link is dead]. They've released the patches under a BSD license; see the notice at the top of the subr_cluster.c file. It should look real familiar.

I've updated the patches to apply against NetBSD-current as of early February 2004. There's only one patch file in this set.

The following struck-out info covers the mechanics of applying the older patches from 2002.

There are two parts. There would be just one if I had known how to get cvs diff to work like diff -N, but at the time I made the patch, I didn't. (Some kind soul has since told me how that's done.) They apply to NetBSD-current as of 4 Sep 2002.

First, there's a patch to download and apply: clust-diff. The diff was taken from the sys/ directory so run patch from there, something like:

	csh% cd /usr/src/sys
	csh% patch < cluster-diff

This diff has been hand-edited to remove some stuff that isn't relevant. If the diff is cocked up, let me know.

Second, there's a new kernel source file to add: subr_cluster.c. Install it like:

	csh% cp subr_cluster.c /usr/src/sys/kern

I think the subr_cluster.c code might better belong in subr_disk.c, with its prototypes in <sys/disk.h>. However, the hysterical porpoises are happy where it is for now, in its own file with prototypes in <sys/buf.h>.

Kernel Options

Some kernel options of interest:

SD_CLUSTER

Enables clustering by the sd driver. I've tested this and it works.

WD_CLUSTER

Enables clustering by the wd driver. I haven't tested this yet, but the code looks an awful lot like the sd code, so I'm betting it works. Of course, I haven't compiled it yet, either.

MAX_CLUSTERS

Default number of buffers allowed to be clustered together; defaults to the number of pages that fit in MAXPHYS (16 on an i386), on the assumption that you ordinarily transfer at least a page of data at a time

MAX_CLUSTEREDVA

Default amount of virtual memory mapped by clusters; defaults to 32M

CLUSTER_STATS

Enable this option if you want the stats listed in the next section enabled for non-DIAGNOSTIC kernels

These values arguably need tweaking. In particular, MAX_CLUSTEREDVA should probably be scaled to some percentage of total VA.
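
To make the defaults above concrete, here's a rough sketch of how they could be derived. This is not lifted from the patch: MAXPHYS and PAGE_SIZE are hard-coded to the values I'd assume for an i386 (64K and 4K) so that it builds outside the kernel, and scaled_clusteredva() is a made-up helper showing what "some percentage of total VA" might look like.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed i386 values, hard-coded so this compiles in userland. */
#define MAXPHYS   (64 * 1024)
#define PAGE_SIZE 4096

/* Default: as many buffers as there are pages in one MAXPHYS transfer. */
#ifndef MAX_CLUSTERS
#define MAX_CLUSTERS (MAXPHYS / PAGE_SIZE)
#endif

/* Default: 32M of virtual address space for cluster mappings. */
#ifndef MAX_CLUSTEREDVA
#define MAX_CLUSTEREDVA (32 * 1024 * 1024)
#endif

/*
 * Hypothetical helper (not in the patch): size the cluster VA limit
 * as a percentage of total VA instead of using a fixed constant.
 */
static size_t
scaled_clusteredva(size_t total_va, unsigned percent)
{
	assert(percent <= 100);
	/* Divide first so the multiply cannot overflow size_t. */
	return (total_va / 100) * percent;
}
```

With those assumed values MAX_CLUSTERS works out to 16, which matches the i386 figure quoted above.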

Global Variables Controlling/Observing Clustering

Some globals of interest:

dosdcluster

Boolean controlling whether clusters are built by the sd driver; defaults to 1

dowdcluster

Ditto for the wd driver; defaults to 1

max_clusters

Maximum number of buffers that can be clustered together into a single disk xfer; defaults to MAX_CLUSTERS, above

max_clusteredva

Total amount of virtual address space mapped by clusters; defaults to MAX_CLUSTEREDVA, above

totalclusters

Counts the number of clusters built by the disk driver

currentclusters

The current number of clusters in transit to/from the disk

missedclusters

A count of the number of clusters that could have been built, but weren't because of a resource shortage like not enough RAM

The last three are enabled for DIAGNOSTIC kernels or those compiled with the CLUSTER_STATS option. Ideally, all would be sysctl'ed and per-disk. And mebbe a global max on the VA used by clusters would be good, too.
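
Here's a toy userland model of how the globals above could interact. It is a sketch for illustration only, not the code in subr_cluster.c: cluster_start(), cluster_done() and clusteredva_inuse are names I've invented, and the initial values are just the assumed i386 defaults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Globals named as in the list above, set to the assumed defaults. */
static int    dosdcluster     = 1;
static int    max_clusters    = 16;
static size_t max_clusteredva = 32UL * 1024 * 1024;

static unsigned long totalclusters;
static unsigned long currentclusters;
static unsigned long missedclusters;

/* Hypothetical: VA currently mapped by in-flight clusters. */
static size_t clusteredva_inuse;

/*
 * Hypothetical admission check: may nbufs buffers totalling len bytes
 * go to the disk as a single cluster?
 */
static bool
cluster_start(int nbufs, size_t len)
{
	if (!dosdcluster || nbufs < 2 || nbufs > max_clusters)
		return false;		/* not a candidate for clustering */
	if (len > max_clusteredva - clusteredva_inuse) {
		missedclusters++;	/* could have clustered, but out of VA */
		return false;
	}
	clusteredva_inuse += len;
	totalclusters++;
	currentclusters++;
	return true;
}

/* Hypothetical completion hook: release the VA and drop the in-flight count. */
static void
cluster_done(size_t len)
{
	assert(currentclusters > 0 && len <= clusteredva_inuse);
	clusteredva_inuse -= len;
	currentclusters--;
}
```

In this model only a VA shortage bumps missedclusters; a request that is over max_clusters or arrives with clustering switched off is simply not treated as a cluster candidate.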

