An introduction to the device mapper
====================================

The goal of this driver is to support volume management.  
The driver enables the definition of new block devices composed of
ranges of sectors of existing devices.  This can be used to define
disk partitions - or logical volumes.  This light-weight kernel
component can support user-space tools for logical volume management.

The driver maps ranges of sectors for the new logical device onto
'mapping targets' according to a mapping table.  Currently the mapping 
table must be supplied to the driver through an ioctl interface.
Earlier versions of the driver also had a custom file system interface 
(dmfs), but we stopped work on this because of pressure of time.

The mapping table consists of an ordered list of rules of the form:
  <start> <length> <target> [<target args> ...]
which map <length> sectors beginning at <start> to a target.

Every sector on the new device must be specified - there must be no
gaps between the rules.  The first rule has <start> = 0. 
Each subsequent rule starts from the previous <start> + <length> + 1.

When a sector of the new logical device is accessed, the make_request
function looks up the correct target and then passes the request on to
the target to perform the remapping according to its arguments.

The following targets are available:
  linear
  striped
  error
  snapshot
  mirror

The 'linear' target takes as arguments a target device name (eg
/dev/hda6) and a start sector and maps the range of sectors linearly
to the target.

The 'striped' target is designed to handle striping across physical
volumes.  It takes as arguments the number of stripes and the striping
chunk size followed by a list of pairs of device name and sector.

The 'error' target causes any I/O to the mapped sectors to fail.  This
is useful for defining gaps in the new logical device.

The 'snapshot' target supports asynchronous snapshots.
See http://people.sistina.com/~thornber/snap_performance.html.

The 'mirror' target is used to implement pvmove.

In normal scenarios the mapping tables will remain small.
A btree structure is used to hold the sector range -> target mapping.
Since we know all the entries in the btree in advance we can make a
very compact tree, omitting pointers to child nodes as child node
locations can be calculated.

Benchmarking with bonnie++ suggests that this is certainly no slower
than current LVM.


Sistina UK
Updated 30/04/2003