14 - Disk Setup



14.1 - Disks and Partitions

The details of setting up disks in OpenBSD vary between platforms, so you should read the installation instructions in the INSTALL.<arch> file for your platform to determine the specifics for your system.

Drive identification

OpenBSD handles mass storage with two drivers on most platforms, depending upon the normal command set that kind of device supports: wd(4) for IDE-like disks and sd(4) for SCSI-like disks.

The first drive of a particular type identified by OpenBSD will be drive '0', the second will be '1', etc. So, the first IDE-like disk will be wd0, and the third SCSI-like disk will be sd2. If you have two SCSI-like drives and three IDE-like drives on a system, you would have sd0, sd1, wd0, wd1, and wd2 on that machine. The order is based on the order they are found during hardware discovery at boot, so adding or removing a drive can change the identity of other drives at the next boot.

Partitioning

For historical reasons, the term "partition" is regularly used for two different things in OpenBSD, and this leads to some confusion.

The two types of "partitions" are disklabel partitions, which hold OpenBSD file systems and are managed with disklabel(8), and fdisk partitions (also called Partition Table partitions), which are managed with fdisk(8).

All OpenBSD platforms use disklabel(8) as the primary way to manage OpenBSD filesystem partitions, but only some platforms also require using fdisk(8) to manage Partition Table partitions. On the platforms that use fdisk partitions, one fdisk partition is used to hold all of the OpenBSD file systems; this partition is then sliced up into disklabel partitions. These disklabel partitions are labeled "a" through "p". A few of these are "special": "c" always covers the entire disk, and on the boot disk "a" is the root partition and "b" is the swap partition.

Partition identification

An OpenBSD filesystem is identified by the disk it is on, plus the file system partition on that disk. So, file systems may be identified by identifiers like "sd0a" (the "a" partition of the first "sd" device), "wd2h" (the "h" partition of the third "wd" device), or "sd1c" (the entire second sd device). The device files would be /dev/sd0a for the block device and /dev/rsd0a for the "raw" (character) device.
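
The naming rule above can be sketched in a few lines of sh. The function names are made up for this illustration; they only build path strings and never open a device.

```shell
# Hypothetical helpers: build device file paths from short names.
# Nothing here touches a real device.

# Block device for a partition name such as "sd0a".
block_dev() {
    printf '/dev/%s\n' "$1"
}

# "Raw" (character) device for the same partition name.
raw_dev() {
    printf '/dev/r%s\n' "$1"
}

# Whole-disk device for a drive name such as "wd1" (the 'c' partition).
whole_disk() {
    printf '/dev/%sc\n' "$1"
}

block_dev sd0a     # prints /dev/sd0a
raw_dev sd0a       # prints /dev/rsd0a
whole_disk wd1     # prints /dev/wd1c
```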

Some utilities will let you use the "shortcut" name of a partition (e.g., "sd0d") or a drive (e.g., "wd1") instead of the actual device name ("/dev/sd0d" or "/dev/wd1c", respectively).

Note again that if you put data on wd2d, then later remove wd1 from the system and reboot, your data is now on wd1d, as your old wd2 is now wd1. However, a drive's identification won't change after boot, so if a USB drive is unplugged or fails, it won't change the identification of other drives until reboot.

Disklabel Unique Identifiers

Disks can also be identified by Disklabel Unique Identifiers (DUIDs), 16-digit hexadecimal numbers managed by the diskmap(4) device. A DUID is generated automatically as a random number when a disklabel is first created, though it defaults to all zeros on existing (pre-OpenBSD 4.8) labels. disklabel(8) can be used to change the UID if desired. These UIDs are "persistent" -- if you identify your disks this way, drive "f18e359c8fa2522b" will always be f18e359c8fa2522b, no matter in what order or how it is attached. You can specify partitions on the disk by appending a period and the partition letter. For example, f18e359c8fa2522b.d is the 'd' partition of the disk f18e359c8fa2522b and will ALWAYS refer to the same chunk of storage, no matter in what order the device is attached to the system, or what kind of interface it is attached to.

These UIDs can be used to identify the disks almost anywhere a partition or device would be specified, for example in /etc/fstab or in command lines. Of course, disks and partitions may also be identified in the traditional way, by device, unit number and partition (e.g., /dev/sd1f), and this can be done interchangeably.
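
As a rough illustration of the DUID notation described above, the following sh sketch checks whether a string has the shape of a DUID, with or without a partition suffix. The function name is hypothetical, and accepting upper-case hex digits is an assumption made here; the check is purely textual and does not consult diskmap(4).

```shell
# Hypothetical helper: succeed if $1 looks like a DUID (16 hex digits),
# optionally followed by a period and a partition letter "a" through "p".
# Assumption: upper-case hex digits are accepted as well.
is_duid() {
    duid_disk=${1%%.*}               # part before the first period
    duid_rest=${1#"$duid_disk"}      # "" or ".<letter>"
    case "$duid_disk" in
        *[!0-9a-fA-F]*) return 1 ;;  # contains a non-hex character
    esac
    [ "${#duid_disk}" -eq 16 ] || return 1
    case "$duid_rest" in
        ""|.[a-p]) return 0 ;;
        *) return 1 ;;
    esac
}

is_duid f18e359c8fa2522b.d && echo "looks like a DUID partition"
is_duid sd0a || echo "sd0a is a traditional device name"
```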

It is worth noting that the DUID is a property of the disklabel, though as OpenBSD only supports one disklabel per disk, this is mostly academic.

14.2 - Using fdisk(8)

Be sure to check the fdisk(8) man page.

fdisk(8) is used on some platforms (i386, amd64, macppc, zaurus and armish) to create a partition recognized by the system boot ROM, into which the OpenBSD disklabel partitions can be placed. Other platforms do not need or use fdisk(8). fdisk(8) can also be used for manipulations of the Master Boot Record (MBR), which can impact all operating systems on a computer. Unlike the fdisk-like programs on some other operating systems, OpenBSD's fdisk assumes you know what you want to do, and for the most part, it will let you do what you need to do, making it a powerful tool to have on hand. It will also let you do things you shouldn't or didn't intend to do, so it must be used with care.

Normally, only one OpenBSD fdisk partition will be placed on a disk. That partition will be subdivided by disklabel into OpenBSD filesystem partitions.

To just view your partition table using fdisk, use:

# fdisk sd0

Which will give an output similar to this:

Disk: sd0       geometry: 553/255/63 [8883945 Sectors]
Offset: 0       Signature: 0xAA55
         Starting       Ending       LBA Info:
 #: id    C   H  S -    C   H  S [       start:      size   ]
------------------------------------------------------------------------
*0: A6    3   0  1 -  552 254 63 [       48195:     8835750 ] OpenBSD     
 1: 12    0   1  1 -    2 254 63 [          63:       48132 ] Compaq Diag.
 2: 00    0   0  0 -    0   0  0 [           0:           0 ] unused      
 3: 00    0   0  0 -    0   0  0 [           0:           0 ] unused      

In this example we are viewing the fdisk output of the first SCSI-like drive. We can see the OpenBSD partition (id A6) and its size. The * tells us that the OpenBSD partition is the bootable partition.
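
The start and size columns in the listing above can be checked with simple arithmetic. This is a small sketch, assuming 512-byte sectors; the helper names are made up for the example.

```shell
# Sector counts are copied from the fdisk listing above.
# Assumption: 512-byte sectors, so 2048 sectors per megabyte (2^20 bytes).

# First sector after a partition: start + size.
end_sector() {
    echo $(( $1 + $2 ))
}

# Size in whole megabytes, rounded down.
sectors_to_mb() {
    echo $(( $1 / 2048 ))
}

end_sector 63 48132         # 48195: partition 1 ends where partition 0 starts
end_sector 48195 8835750    # 8883945: the disk's total sector count
sectors_to_mb 8835750       # 4314
```

Here the OpenBSD partition ends exactly at the last sector of the disk, and the Compaq diagnostic partition sits directly in front of it.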

In the previous example we just viewed our information. What if we want to edit our partition table? Well, to do so we must use the -e flag. This will bring up a command line prompt to interact with fdisk.

# fdisk -e wd0
Enter 'help' for information
fdisk: 1> help
        help            Command help list
        manual          Show entire OpenBSD man page for fdisk
        reinit          Re-initialize loaded MBR (to defaults)
        setpid          Set the identifier of a given table entry
        disk            Edit current drive stats
        edit            Edit given table entry
        flag            Flag given table entry as bootable
        update          Update machine code in loaded MBR
        select          Select extended partition table entry MBR
        swap            Swap two partition entries
        print           Print loaded MBR partition table
        write           Write loaded MBR to disk
        exit            Exit edit of current MBR, without saving changes
        quit            Quit edit of current MBR, saving current changes
        abort           Abort program without saving current changes
fdisk: 1> 

The help output above is an overview of the commands you can use when you choose the -e flag.

fdisk tricks and tips

14.3 - Using OpenBSD's disklabel(8)

What is disklabel(8)?

First, be sure to read the disklabel(8) man page.

The details of setting up disks in OpenBSD vary somewhat between platforms. For i386, amd64, macppc, zaurus, and armish, disk setup is done in two stages. First, the OpenBSD slice of the hard disk is defined using fdisk(8), then that slice is subdivided into OpenBSD partitions using disklabel(8).

All OpenBSD platforms, however, use disklabel(8) as the primary way to manage OpenBSD partitions. Platforms that also use fdisk(8) place all the disklabel(8) partitions in a single fdisk partition.

Labels hold certain information about your disk, like your drive geometry and information about the filesystems on the disk. The disklabel is then used by the bootstrap program to access the drive and to know where filesystems are contained on the drive. You can read more in-depth information about disklabel in the disklabel(5) man page.

On some platforms, disklabel helps overcome architecture limitations on disk partitioning. For example, on i386, you can have 4 primary partitions, but with disklabel(8), you use one of these 'primary' partitions to store all of your OpenBSD partitions (for example, 'swap', '/', '/usr', '/var', etc.), and you still have 3 more partitions available for other OSs.

disklabel(8) during OpenBSD's install

One of the major parts of OpenBSD's install is your initial creation of labels. During the install you use disklabel(8) to create your separate partitions. As part of the install process, you can define your mount points from within disklabel(8), but you can change these later in the install or post-install, as well.

There is not one "right" way to label a disk, but there are many wrong ways. Before attempting to label your disk, see this discussion on partitioning and partition sizing.

For an example of using disklabel(8) during install, see the Custom disklabel layout part of the Installation Guide.

Using disklabel(8) after install

After install, one of the most common reasons to use disklabel(8) is to look at how your disk is laid out. The following command will show you the current disklabel, without modifying it:

# disklabel wd0 <-- Or whatever disk device you'd like to view
type: ESDI
disk: ESDI/IDE disk
label: SAMSUNG HD154UI 
duid: d920a43a5a56ad5f
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 2907021
total sectors: 2930277168
boundstart: 64
boundend: 2930272065
drivedata: 0 

16 partitions:
#                size           offset  fstype [fsize bsize  cpg]
  a:          1024064               64  4.2BSD   2048 16384    1 # /
  b:          4195296          1024128    swap                   
  c:       2930277168                0  unused                   
  d:          4195296          5219424  4.2BSD   2048 16384    1 # /usr
  e:          4195296          9414720  4.2BSD   2048 16384    1 # /tmp
  f:         20972448         13610016  4.2BSD   2048 16384    1 # /var
  h:          2097632         34582464  4.2BSD   2048 16384    1 # /home

Note how this disk has only part of its disk space allocated at this time. disklabel(8) offers two different modes for editing the disklabel: a built-in command-driven editor (this is how you installed OpenBSD originally) and a full editor, such as vi(1). You may find the command-driven editor "easier", as it guides you through all the steps and provides help upon request, but the full-screen editor has definite uses, too.

Let's add a partition to the above system.

Warning: Any time you are fiddling with your disklabel, you are putting all the data on your disk at risk. Make sure your data is backed up before editing an existing disklabel!

We will use the built-in command-driven editor, which is invoked using the "-E" option to disklabel(8).

# disklabel -E wd0
...
> a k
offset: [36680096] 
size: [2893591969] 1T
Rounding to cylinder: 2147483536
FS type: [4.2BSD] 
> p m
OpenBSD area: 64-2930272065; size: 1430796.9M; free: 364310.8M
#                size           offset  fstype [fsize bsize  cpg]
  a:           500.0M               64  4.2BSD   2048 16384    1 # /
  b:          2048.5M          1024128    swap                   
  c:       1430799.4M                0  unused                   
  d:          2048.5M          5219424  4.2BSD   2048 16384    1 # /usr
  e:          2048.5M          9414720  4.2BSD   2048 16384    1 # /tmp
  f:         10240.5M         13610016  4.2BSD   2048 16384    1 # /var
  h:          1024.2M         34582464  4.2BSD   2048 16384    1 # /home
  k:       1048575.9M         36680192  4.2BSD   8192 65536    1 
> q
Write new label?: [y] 

In this case, disklabel(8) was kind enough to calculate a good starting offset for the partition. In many cases, it will be able to do this, but if you have "holes" in the disklabel (i.e., you deleted a partition, or you just like making your life miserable) you may need to sit down with a paper and pencil to calculate the proper offset. Note that while disklabel(8) does some sanity checking, it is very possible to do things very wrong here. Be careful, and understand the meaning of the numbers you are entering.
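
The paper-and-pencil calculation can also be sketched in sh. The helper below is hypothetical: it reads "letter size offset" lines (in sectors, sorted by offset, with the 'c' partition left out), reports any holes, and prints the first sector after the last partition. The numbers fed to it are copied from the label shown earlier in this section.

```shell
# Hypothetical helper: check a partition list for holes.
# $1 is the first usable sector (boundstart in the label).
# Input lines on stdin: "letter size offset", sorted by offset.
check_layout() {
    expected=$1
    while read -r letter size offset; do
        if [ "$offset" -gt "$expected" ]; then
            echo "hole before $letter: $(( offset - expected )) sectors"
        fi
        expected=$(( offset + size ))   # where the next partition may start
    done
    echo "next free offset: $expected"
}

check_layout 64 <<'EOF'
a 1024064 64
b 4195296 1024128
d 4195296 5219424
e 4195296 9414720
f 20972448 13610016
h 2097632 34582464
EOF
```

For this label there are no holes, and the result, 36680096, is the default offset disklabel offered for the new 'k' partition. The label as written shows 'k' at 36680192, so disklabel may still adjust the value it finally uses.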

On most OpenBSD platforms, there are sixteen disklabel partitions available, labeled "a" through "p" (some "specialty" systems may have only eight). Every disklabel should have a 'c' partition, with an "fstype" of "unused", that covers the entire physical drive. If your disklabel is not like this, it must be fixed; the "D" option (below) can help. Never try to use the "c" partition for anything other than accessing the raw sectors of the disk, and do not attempt to create a file system on "c". On the boot device, "a" is reserved for the root partition, and "b" is the swap partition, but only the boot device makes these distinctions. Other devices may use all fifteen partitions other than "c" for file systems.

Disklabel tricks and tips

14.4 - Adding extra disks in OpenBSD

Once you get your disk installed PROPERLY, you need to use fdisk(8) (only on the platforms that use it, such as i386 and amd64) and disklabel(8) to set up your disk in OpenBSD.

On platforms that use fdisk, start with it. Other architectures can skip this step. In the example below, we're adding a third SCSI-like drive to the system.

# fdisk -i sd2

This will initialize the disk's "real" partition table for exclusive use by OpenBSD. Next you need to create a disklabel for it. This will seem confusing.

# disklabel -e sd2

(screen goes blank, your $EDITOR comes up)
type: SCSI
...bla...
sectors/track: 63
total sectors: 6185088
...bla...
16 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  c:  6185088        0    unused        0     0         # (Cyl.    0 - 6135)
  d:  1405080       63    4.2BSD     1024  8192    16   # (Cyl.    0*- 1393*)
  e:  4779945  1405143    4.2BSD     1024  8192    16   # (Cyl. 1393*- 6135)

First, ignore the 'c' partition; it's always there and is needed for programs like disklabel to function. The fstype for OpenBSD is 4.2BSD. Total sectors is the total size of the disk. Say this is a 3 gigabyte disk. Three gigabytes in disk manufacturer terms is 3000 megabytes. So divide 6185088/3000 (use bc(1)). You get 2061. So, to make up partition sizes for a, d, e, f, g, ... just multiply X*2061 to get X megabytes of space on that partition. The offset for your first new partition should be the same as the "sectors/track" reported earlier in disklabel's output. For us it is 63. The offset for each partition afterwards should be the offset of the previous partition plus the size of that partition (the 'c' partition plays no part in this calculation). In the listing above, 'e' starts at 63 + 1405080 = 1405143.
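
The same arithmetic can be done with sh instead of bc(1). This is only a sketch of the calculation described above, using the numbers from the example label; the variable and function names are invented for the illustration.

```shell
# Numbers from the example label above.
total_sectors=6185088
disk_mb=3000          # "manufacturer" megabytes, as in the text
first_offset=63       # sectors/track

# Sectors per megabyte for this disk (integer division).
per_mb=$(( total_sectors / disk_mb ))        # 2061

# Size in sectors for a partition of $1 megabytes.
size_for_mb() {
    echo $(( $1 * per_mb ))
}

# 'e' starts where 'd' ends and runs to the end of the disk.
d_size=1405080
e_offset=$(( first_offset + d_size ))        # 1405143
e_size=$(( total_sectors - e_offset ))       # 4779945

# A single partition covering everything after the first track.
whole_size=$(( total_sectors - first_offset ))   # 6185025

echo "per_mb=$per_mb e_offset=$e_offset e_size=$e_size whole=$whole_size"
```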

Or, if you just want one partition on the disk, say you will use the whole thing for web storage or a home directory or something, just take the total size of the disk and subtract the sectors per track from it. 6185088-63 = 6185025. Your partition is

    d:  6185025       63    4.2BSD     1024  8192    16 

If all this seems needlessly complex, you can just use disklabel -E to get the same partitioning mode that you got on your install disk! There, you can just use "96M" to specify "96 megabytes", or 96G for 96 gigs.

That was a lot. But you are not finished. Finally, you need to create the filesystem on that disk using newfs(8).

# newfs sd2d 

Or whatever your disk was named as per OpenBSD's disk numbering scheme. (Look at the output from dmesg(8) to see what your disk was named by OpenBSD.)

Now figure out where you are going to mount this new partition you just created. Say you want to put it on /u. First, make the directory /u. Then, mount it.

# mount /dev/sd2d /u

Finally, add it to /etc/fstab(5).

/dev/sd2d /u ffs rw 1 1

What if you need to migrate an existing directory like /usr/local? You should mount the new drive in /mnt and copy /usr/local to the /mnt directory. Example:

# cd /usr/local && pax -rw -p e . /mnt

Edit the /etc/fstab(5) file to show that the /usr/local partition is now /dev/sd2d (your freshly formatted partition). Example:

/dev/sd2d /usr/local ffs rw 1 1

Reboot into single user mode with boot -s, move the existing /usr/local to /usr/local-backup (or delete it if you feel lucky) and create an empty directory /usr/local. Then reboot the system, and voila, the files are there!

14.5 - How is swap handled?

14.5.1 - About swap

Historically, all kinds of rules have been tossed about to guide administrators on how much swap to configure on their machines. The problem, of course, is there are few "normal" applications.

One non-obvious use for swap is to be a place the kernel can dump a copy of what is in core in the event of a system panic for later analysis. For this to work, you must have a swap partition (not a swap file) at least as large as your RAM. By default, the system will save a copy of this dump to /var/crash on reboot, so if you wish to be able to do this automatically, you will need sufficient free space on /var. However, you can also bring the system up single-user, and use savecore(8) to dump it elsewhere.
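
The size rule for crash dumps can be expressed as a simple comparison. This is a sketch, assuming 512-byte sectors; the function name is hypothetical, and the sector count in the usage line is the 'b' partition from the disklabel example in section 14.3.

```shell
# swap_fits_dump RAM_MB SWAP_SECTORS
# Succeed if the swap partition is at least as large as RAM.
# Assumption: 512-byte sectors, i.e. 2048 sectors per megabyte.
swap_fits_dump() {
    [ $(( $2 / 2048 )) -ge "$1" ]
}

# A 4195296-sector 'b' partition (about 2048M) on a machine with 2048M of RAM:
if swap_fits_dump 2048 4195296; then
    echo "swap partition can hold a crash dump"
else
    echo "swap partition is smaller than RAM"
fi
```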

Many types of systems may be appropriately configured with no swap at all. For example, firewalls should not swap in normal operation. Machines with flash storage generally should not swap. If your firewall is flash based, you may benefit (slightly) by not allocating a swap partition, though in most other cases, a swap partition won't actually hurt anything; most disks have more than enough space to allocate a little to swap.

There are all kinds of tips about optimizing swap (where on the disk, separate disks, etc.), but if you find yourself in a situation where optimizing swap is an issue, you probably need more RAM. In general, the best optimization for swap is to not need it.

In OpenBSD, swap is managed with the swapctl(8) program, which adds, removes, lists and prioritizes swap devices and files.

14.5.2 - Swapping to a partition

On OpenBSD, the 'b' partition of the boot drive is used by default and automatically for swap. No configuration is needed for this to take place. If you do not wish to use swap on the boot disk, do not define a "b" partition. If you wish to use swap on other partitions or on other disks, you need to define these partitions in /etc/fstab with lines something like:

/dev/sd3b none swap sw 0 0
/dev/sd3d none swap sw 0 0

14.5.3 - Swapping to a file

(Note: if you are looking to swap to a file because you are getting "virtual memory exhausted" errors, you should try raising the per-process limits first with csh(1)'s unlimit, or sh(1)'s ulimit.)

Sometimes, your initial guess about how much swap you need proves to be wrong, and you have to add additional swap space, occasionally in a hurry (as in, "Geez, at the rate it is burning swap, we'll be wedged in five minutes"). If you find yourself in this position, adding swap space as a file on an existing file system can be a quick fix.

The file must not reside on a filesystem which has SoftUpdates enabled (they are disabled by default). To start out, you can see how much swap you currently have and how much you are using with the swapctl(8) utility. You can do this by using the command:

$ swapctl -l
Device      512-blocks     Used    Avail Capacity  Priority
swap_device      65520        8    65512     0%    0

This shows the devices currently being used for swapping and their current statistics. In the above example there is only one device, named "swap_device". This is the predefined area on disk that is used for swapping. (It shows up as partition b when viewing disklabels.) As you can also see in the above example, that device isn't getting much use at the moment, but for the purposes of this document, we will act as if an extra 32M is needed.

The first step to setting up a file as a swap device is to create the file. It's best to do this with the dd(1) utility. Here is an example of creating the file /var/swap that is 32M in size.

$ sudo dd if=/dev/zero of=/var/swap bs=1k count=32768
32768+0 records in
32768+0 records out
33554432 bytes transferred in 20 secs (1677721 bytes/sec)

Once this has been done, we can turn on swapping to that file. First restrict its permissions, then add it with swapctl(8):

$ sudo chmod 600 /var/swap
$ sudo swapctl -a /var/swap

Now we need to check to see if it has been correctly added to the list of our swap devices.

$ swapctl -l
Device      512-blocks     Used    Avail Capacity  Priority
swap_device      65520        8    65512     0%    0
/var/swap        65536        0    65536     0%    0
Total           131056        8   131048     0%
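
The numbers in the dd(1) and swapctl(8) output above fit together as follows. This is a small sketch of the unit conversions only; the helper names are made up for the example.

```shell
# dd count for a swap file of $1 megabytes when bs=1k is used.
mb_to_dd_count() {
    echo $(( $1 * 1024 ))
}

# swapctl -l reports 512-byte blocks: two per kilobyte.
kb_to_512_blocks() {
    echo $(( $1 * 2 ))
}

count=$(mb_to_dd_count 32)            # 32768, the count= value used above
blocks=$(kb_to_512_blocks "$count")   # 65536, the size shown for /var/swap
total=$(( 65520 + blocks ))           # 131056, the Total line

echo "count=$count blocks=$blocks total=$total"
```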

Now that the file is set up and swapping is being done, you need to add a line to your /etc/fstab file so that this file is also configured at the next boot. If this line is not added, you won't have this swap device configured after a reboot.

$ cat /etc/fstab
/dev/wd0a / ffs rw 1 1
/var/swap /var/swap swap sw 0 0

14.6 - Soft Updates

Soft Updates is based on an idea proposed by Greg Ganger and Yale Patt and developed for FreeBSD by Kirk McKusick. Soft Updates imposes a partial ordering on the buffer cache operations, which permits the requirement for synchronous writing of directory entries to be removed from the FFS code. The result is a large increase in disk write performance.

Enabling soft updates must be done with a mount-time option. When mounting a partition with the mount(8) utility, you can specify that you wish to have soft updates enabled on that partition. Below is a sample /etc/fstab(5) entry that has one partition sd0a that we wish to have mounted with soft updates.

/dev/sd0a / ffs rw,softdep 1 1
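
When checking whether a file system uses soft updates (for instance before placing a swap file on it), the options field of its fstab entry can be inspected. The sketch below is a hypothetical helper that works on a single fstab-style line passed as text; it does not read /etc/fstab itself.

```shell
# Hypothetical helper: succeed if the fstab-style line in $1
# lists "softdep" among its mount options (the fourth field).
has_softdep() {
    read -r hs_dev hs_mnt hs_type hs_opts hs_rest <<EOF
$1
EOF
    case ",$hs_opts," in
        *,softdep,*) return 0 ;;
        *) return 1 ;;
    esac
}

has_softdep "/dev/sd0a / ffs rw,softdep 1 1" && echo "soft updates enabled"
has_softdep "/dev/wd0a / ffs rw 1 1" || echo "soft updates not enabled"
```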

Note to sparc users: Do not enable soft updates on sun4 or sun4c machines. These architectures support only a very limited amount of kernel memory and cannot use this feature. However, sun4m machines are fine.

14.7 - How do OpenBSD/i386 and OpenBSD/amd64 boot?

The boot process for OpenBSD/i386 and OpenBSD/amd64 is not trivial, and understanding how it works can be useful to troubleshoot a problem when things don't work. There are four key pieces to the boot process:
  1. Master Boot Record (MBR): The Master Boot Record is the first 512 bytes on the disk. It contains the primary partition table and a small program to load the Partition Boot Record (PBR). Note that in some environments, the term "MBR" is used to refer to only the code portion of this first block on the disk, rather than the whole first block (including the partition table). It is critical to understand the meaning of "initialize the MBR" -- in the terminology of OpenBSD, it would involve rewriting the entire MBR, clearing any existing partition table, not just the code, as it might on some systems. You will often not want to do this. Instead, use fdisk(8)'s "-u" command line option ("fdisk -u wd0") to (re)install the MBR boot code.

    While OpenBSD includes its own MBR code, you are not obliged to use it, as virtually any MBR code can boot OpenBSD. The MBR is manipulated by the fdisk(8) program, which is used both to edit the partition table, and also to install the MBR code on the disk.

    OpenBSD's MBR announces itself with the message:

    Using drive 0, partition 3.
    
    showing the disk and partition it is about to load the PBR from. In addition to the obvious, it also shows a trailing period ("."), which indicates this machine is capable of using LBA translation to boot. If the machine were incapable of using LBA translation, the above period would have been replaced with a semicolon (";"), indicating CHS translation:
    Using drive 0, partition 3;
    
    Note that the trailing period or semicolon can be used as an indicator of the "new" OpenBSD MBR, introduced with OpenBSD 3.5.

  2. Partition Boot Record (PBR): The Partition Boot Record, also called the PBR or biosboot(8) (after the name of the file that holds the code) is the first 512 bytes of the OpenBSD partition of the disk. The PBR is the "first-stage boot loader" for OpenBSD. It is loaded by the MBR code, and has the task of loading the OpenBSD second-stage boot loader, boot(8). Like the MBR, the PBR is a very tiny section of code and data, only 512 bytes, total. That's not enough to have a fully filesystem-aware application, so rather than having the PBR locate /boot on the disk, the BIOS-accessible location of /boot is physically coded into the PBR at installation time.

    The PBR is installed by installboot(8), which is further described later in this document. The PBR announces itself with the message:

    Loading...
    
    printing a dot for every file system block it attempts to load. Again, the PBR shows whether it is using LBA or CHS to load. If it has to use CHS translation, it displays a message with a semicolon:
    Loading;...
    

  3. Second Stage Boot Loader, /boot: /boot is loaded by the PBR, and has the task of accessing the OpenBSD file system through the machine's BIOS, and locating and loading the actual kernel. boot(8) also passes various options and information to the kernel.

    boot(8) is an interactive program. After it loads, it attempts to locate and read /etc/boot.conf, if it exists (which it does not on a default install), and processes any commands in it. Unless instructed otherwise by /etc/boot.conf, it then gives the user a prompt:

    probing: pc0 com0 com1 apm mem[636k 190M a20=on]
    disk: fd0 hd0+
    >> OpenBSD/i386 BOOT 3.21
    boot>
    
    It gives the user (by default) five seconds to start giving it other tasks, but if none are given before the timeout, it starts its default behavior: loading the kernel, bsd, from the root partition of the first hard drive.

    The second-stage boot loader probes (examines) your system hardware through the BIOS (as the OpenBSD kernel is not loaded). Above, you can see a few things it looked for and found, including serial ports (com0, com1), memory, and the BIOS disk drives (fd0, hd0). The '+' character after the "hd0" indicates that the BIOS has told /boot that this disk can be accessed via LBA. When doing a first-time install, you will sometimes see a '*' after a hard disk -- this indicates a disk that does not seem to have a valid OpenBSD disk label on it.

  4. Kernel: /bsd: This is the goal of the boot process, to have the OpenBSD kernel loaded into RAM and properly running. Once the kernel has loaded, OpenBSD accesses the hardware directly, no longer through the BIOS.

So, the very start of the boot process could look like this:

Using drive 0, partition 3.                      <- MBR
Loading....                                      <- PBR
probing: pc0 com0 com1 apm mem[636k 190M a20=on] <- /boot
disk: fd0 hd0+
>> OpenBSD/i386 BOOT 3.21
boot>
booting hd0a:/bsd 4464500+838332 [58+204240+181750]=0x56cfd0
entry point at 0x100120

[ using 386464 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993       <- Kernel
        The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2013 OpenBSD.  All rights reserved.  http://www.OpenBSD.org

OpenBSD 5.4 (GENERIC) #37: Tue Jul 30 12:05:01 MDT 2013
    deraadt@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
   ...

What can go wrong

As the PBR is very small, its range of error messages is pretty limited, and somewhat cryptic. The error messages are detailed in the biosboot(8) manual page.

For more information on the i386 boot process, see:

14.8 - What are the issues regarding large drives with OpenBSD?

OpenBSD supports both FFS and FFS2 (also known as UFS and UFS2) file systems. FFS is the historic OpenBSD file system; FFS2 is new as of 4.3. Before looking at the limits of each system, we need to look at some more general system limits.

Of course, the abilities of a file system and the abilities of particular hardware are two different things. A newer 250G IDE hard disk may have issues on older (pre >137G standards) interfaces (though for the most part, they work just fine), some very old SCSI adapters have been seen to have problems with more modern drives, and some older BIOSs will hang when they encounter a modern sized hard disk. You must respect the abilities of your hardware and boot code, of course.

Partition size and location limitations

Unfortunately, the full ability of the OS isn't available until AFTER the OS has been loaded into memory. The boot process has to utilize (and is thus limited by) the system's boot ROM.

For this reason, the entire /bsd file (the kernel) must be located on the disk within the boot ROM addressable area. This means that on some older i386 systems, the root partition must be completely within the first 504M, but newer computers may have limits of 2G, 8G, 32G, 128G or more. It is worth noting that many relatively new computers which support larger than 128G drives actually have BIOS limitations of booting only from within the first 128G. You can use these systems with large drives, but your root partition must be within the space supported by the boot ROM.

Note that it is possible to install a 40G drive on an old 486 and load OpenBSD on it as one huge partition, and think you have successfully violated the above rule. However, it might come back to haunt you in a most unpleasant way: some time later, you copy a new kernel over /bsd, reboot, and the system hangs.

Why? Because when you copied "over" the new /bsd file, it didn't overwrite the old one, it got relocated to a new location on the disk, probably outside the 504M range the BIOS supported. The boot loader was unable to fetch the file /bsd, and the system hung.

To get OpenBSD to boot, the boot loaders (biosboot(8) and /boot in the case of i386/amd64) and the kernel (/bsd) must be within the boot ROM's supported range, and within their own abilities. To play it safe, the rule is simple:

The entire root partition must be within the computer's BIOS (or boot ROM) addressable space.
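
Whether a given root partition satisfies this rule can be checked from its disklabel offset and size. This is a sketch, assuming 512-byte sectors and a BIOS limit that you already know; the function name is hypothetical, and the numbers in the usage line are the 'a' partition from the disklabel example in section 14.3.

```shell
# root_within_limit OFFSET SIZE LIMIT_MB
# Succeed if a partition starting at sector OFFSET with SIZE sectors
# ends at or below LIMIT_MB megabytes from the start of the disk.
# Assumption: 512-byte sectors, i.e. 2048 sectors per megabyte.
root_within_limit() {
    rw_end=$(( $1 + $2 ))        # first sector after the partition
    rw_limit=$(( $3 * 2048 ))    # limit expressed in sectors
    [ "$rw_end" -le "$rw_limit" ]
}

# A 500M root partition at offset 64 against a 504M BIOS limit:
if root_within_limit 64 1024064 504; then
    echo "root partition is inside the addressable area"
else
    echo "root partition extends past the limit"
fi
```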

Some non-i386 users think they are immune to this; however, most platforms have some kind of boot ROM limitation on disk size. Finding out for sure what the limit is, however, can be difficult.

This is another good reason to partition your hard disk, rather than using one large partition.

fsck(8) time and memory requirements

Another consideration with large file systems is the time and memory required to fsck(8) the file system after a crash or power interruption. One should not put a 120G file system on a system with 32M of RAM and expect it to successfully fsck(8) after a crash. A rough guideline is the system should have at least 1M of available memory for every 1G of disk space to successfully fsck the disk. Swap can be used here, but at a very significant performance penalty, so severe that it is usually unacceptable, except in special cases.

The time required to fsck the drive may become a problem as the file system size expands, but you only have to fsck the disk space that is actually allocated to mounted filesystems. This is another reason NOT to allocate all your disk space Just Because It Is There. Keeping file systems mounted RO or not mounted helps keep them from needing to be fsck(8)ed after tripping over the power cord. Reducing the number of inodes (using the -i option of newfs) can also improve fsck time -- assuming you really don't need them.

Don't forget that if you have multiple disks on the system, they could all end up being fsck(8)ed after a crash at the same time, so they could require more RAM than a single disk.
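
The rough guideline above (1M of memory per 1G of file system, summed over file systems that may be checked at the same time) can be sketched as follows. The function name is made up for the example, and the rule itself is only the rule of thumb from the text, not a guarantee.

```shell
# fsck_mem_ok RAM_MB FS_GB [FS_GB ...]
# Succeed if RAM_MB covers the rule-of-thumb requirement of
# 1M per 1G for all listed file systems together.
fsck_mem_ok() {
    fm_ram=$1
    shift
    fm_need=0
    for fm_gb in "$@"; do
        fm_need=$(( fm_need + fm_gb ))
    done
    [ "$fm_ram" -ge "$fm_need" ]
}

# The example from the text: a 120G file system on a 32M machine.
fsck_mem_ok 32 120 || echo "not enough memory to fsck 120G"

# Two file systems (120G and 100G) on a machine with 256M.
fsck_mem_ok 256 120 100 && echo "256M should be enough for 220G"
```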

FFS vs. FFS2

Using FFS, OpenBSD supports an individual file system of up to 2^31-1, or 2,147,483,647, blocks, and as each block is 512 bytes, that's a tiny amount less than 1T. FFS2 is capable of much larger file systems, though other limits will be reached long before the file system limits will be reached.
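
The "tiny amount less than 1T" can be worked out directly. This sketch assumes a shell with 64-bit arithmetic, which is typical on current systems but not guaranteed everywhere.

```shell
# Assumption: the shell's arithmetic is at least 64 bits wide.
ffs_max_blocks=$(( (1 << 31) - 1 ))         # 2147483647 blocks
ffs_max_bytes=$(( ffs_max_blocks * 512 ))   # 1099511627264 bytes
one_tb=$(( 1 << 40 ))                       # 1T = 2^40 bytes

# Difference between 1T and the FFS limit: one 512-byte block.
shortfall=$(( one_tb - ffs_max_bytes ))

echo "FFS limit: $ffs_max_bytes bytes, $shortfall bytes short of 1T"
```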

The boot/installation kernels only support FFS, not FFS2, so key system partitions (/, /usr, /var, /tmp) should not be FFS2, or severe maintenance problems can arise (there should be no reason for those partitions to be that large, anyway). For this reason, very large partitions should only be used for "non-system" partitions, for example, /home, /var/www/, /bigarray, etc.

Note that not all controllers and drivers support large disks. For example, ami(4) has a limit of 2TB per logical volume. Always be aware of what was available when a controller or interface was manufactured, and don't just rely on "the connectors fit".

Larger than 2TB disks

The MBR system used on PCs only directly understands disks up to 2TB in size. fdisk(8) will show larger than 2TB disks as 2TB. This does not in any way hinder OpenBSD's ability to utilize larger disks, as the MBR is used only to bootstrap the OS. Once the OS is running, the file systems are defined by the disklabel, which does not have a 2TB limit.

To use a larger than 2TB disk, create an OpenBSD partition on the disk using fdisk, as large as fdisk will let you make it. When you label the disk with disklabel(8), use the "b" option to set the OpenBSD boundaries (which default to the size of the OpenBSD fdisk partition) to cover the entire disk. Now you can create your partitions as you wish. You must still respect the abilities of your BIOS, which only understands fdisk partitions, so your 'a' partition should be entirely within the fdisk-managed part of the disk, in addition to any other BIOS limitations.

14.9 - Installing Bootblocks - i386/amd64 specific

OpenBSD has a very robust boot loader that is quite indifferent to drive geometries. However, it is sensitive to where the file /boot resides on the disk. If you do something that causes boot(8) to be moved to a new place on the disk (actually, a new inode), you will "break" your system, preventing it from booting properly. To fix your boot block so that you can boot normally, just put a boot CDROM in your drive (or use a boot floppy) and at the boot prompt, type "boot hd0a:/bsd" to force it to boot from the first hard disk (and not the CD or floppy). Your machine should come up normally. You now need to reinstall the first-stage boot loader (biosboot(8)) based on the position of the /boot file, using the installboot(8) program.

Our example will assume your boot disk is sd0 (but for IDE it would be wd0, etc.):

# cd /usr/mdec; ./installboot /boot biosboot sd0
Note that "/boot" is the physical location of the file "boot" you wish to use when the system boots normally as the system is currently mounted. If your situation were a little different and you had booted from the CD and mounted your 'a' partition on /mnt, this would probably be "/mnt/boot" instead. installboot(8) does two things here -- it installs the file "biosboot" to where it needs to be in the Partition Boot Record, and modifies it with the physical location of the "/boot" file.
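
The two cases described above differ only in the path given for the "boot" file. The sketch below builds the command line for either case; installboot_cmd is a hypothetical helper written for this example, and it only prints the command so you can review it before running it yourself.

```shell
# installboot_cmd ROOT_PREFIX DISK -> prints (does not run) the command.
# ROOT_PREFIX is where the 'a' partition is mounted: "" when the system
# is running from its own disk, "/mnt" when the disk is mounted on /mnt.
installboot_cmd() {
    prefix=$1
    disk=$2
    echo "cd /usr/mdec && ./installboot ${prefix}/boot biosboot ${disk}"
}

installboot_cmd "" sd0       # system booted from its own disk
installboot_cmd /mnt wd0     # IDE disk mounted on /mnt
```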

14.10 - Preparing for disaster: Backing up and Restoring from tape

Introduction:

If you plan on running what might be called a production server, it is advisable to have some form of backup in the event one of your fixed disk drives fails, or the data is otherwise lost.

This information will assist you in using the standard dump(8)/restore(8) utilities provided with OpenBSD. More advanced backup utilities, such as "Amanda" and "Bacula" are available through packages for backing up multiple servers to disk and tape.

Backing up to tape:

Backing up to tape requires knowledge of where your file systems are mounted. You can determine how your filesystems are mounted using the mount(8) command at your shell prompt. You should get output similar to this:

# mount
/dev/sd0a on / type ffs (local)
/dev/sd0h on /usr type ffs (local)

In this example, the root (/) filesystem resides physically on sd0a which indicates a SCSI-like fixed disk 0, partition a. The /usr filesystem resides on sd0h, which indicates SCSI-like fixed disk 0, partition h.

Another example of a more advanced mount table might be:

# mount
/dev/sd0a on / type ffs (local)
/dev/sd0d on /var type ffs (local)
/dev/sd0e on /home type ffs (local)
/dev/sd0h on /usr type ffs (local)

In this more advanced example, the root (/) filesystem resides physically on sd0a. The /var filesystem resides on sd0d, the /home filesystem on sd0e and finally /usr on sd0h.

To back up your machine you will need to feed dump the name of each fixed disk partition. Here is an example of the commands needed to back up the simpler mount table listed above:

# /sbin/dump -0au -f /dev/nrst0 /dev/rsd0a
# /sbin/dump -0au -f /dev/nrst0 /dev/rsd0h
# mt -f /dev/rst0 rewind  

For the more advanced mount table example, you would use something similar to:

# /sbin/dump -0au -f /dev/nrst0 /dev/rsd0a
# /sbin/dump -0au -f /dev/nrst0 /dev/rsd0d
# /sbin/dump -0au -f /dev/nrst0 /dev/rsd0e
# /sbin/dump -0au -f /dev/nrst0 /dev/rsd0h  
# mt -f /dev/rst0 rewind  

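You can review the dump(8) man page to learn exactly what each command line switch does. Here is a brief description of the parameters used above:

0 - Perform a level 0 dump, which backs up everything
a - Auto-size: write until the tape drive reports end-of-media, rather than estimating the tape length
u - Update the file /etc/dumpdates to record when the backup was performed
f - Which tape device to use (/dev/nrst0 in this case)

Finally, which partition to back up (/dev/rsd0a, etc.)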

The mt(1) command is used at the end to rewind the drive. Review the mt man page for more options (such as eject).
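
The mapping from the mount table to the dump commands is mechanical: each /dev/sd0X block device becomes the raw device /dev/rsd0X. As a sketch only, the hypothetical helper below reads mount(8)-style lines and prints the matching commands rather than running them. It assumes the tape device names used in the examples above and only considers file systems of type ffs.

```shell
# dump_cmds -> reads mount(8)-style lines on stdin and prints one level 0
# dump command per ffs file system, using the raw disk device, followed
# by the rewind command. Nothing is executed.
dump_cmds() {
    awk '
        $2 == "on" && $4 == "type" && $5 == "ffs" {
            dev = $1
            sub(/^\/dev\//, "/dev/r", dev)   # /dev/sd0a -> /dev/rsd0a
            print "/sbin/dump -0au -f /dev/nrst0 " dev
        }'
    echo "mt -f /dev/rst0 rewind"
}

# Sample input copied from the simpler mount table above.
dump_cmds <<'EOF'
/dev/sd0a on / type ffs (local)
/dev/sd0h on /usr type ffs (local)
EOF
```

On a live system you could pipe the output of mount into dump_cmds and review the result before using it.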

If you are unsure of your tape device name, use dmesg to locate it. An example tape drive entry in dmesg might appear similar to:

st0 at scsibus0 targ 5 lun 0: <ARCHIVE, Python 28388-XXX, 5.28>

You may have noticed that when backing up, the tape drive is accessed as device name "nrst0" instead of the "st0" name that is seen in dmesg. When you access st0 as nrst0 you are accessing the same physical tape drive, but telling the drive not to rewind at the end of the job (the "n") and accessing the device in raw mode (the "r"). To back up multiple file systems to a single tape, be sure you use the non-rewind device. If you use a rewind device (rst0) to back up multiple file systems, each dump will overwrite the previous one on the tape. You can find a more elaborate description of various tape drive devices in the dump man page.

If you wanted to write a small script called "backup", it might look something like this:

echo "  Starting Full Backup..."
/sbin/dump -0au -f /dev/nrst0 /dev/rsd0a
/sbin/dump -0au -f /dev/nrst0 /dev/rsd0d
/sbin/dump -0au -f /dev/nrst0 /dev/rsd0e
/sbin/dump -0au -f /dev/nrst0 /dev/rsd0h
echo
echo -n "  Rewinding Drive, Please wait..."
mt -f /dev/rst0 rewind
echo "Done."
echo

If scheduled nightly backups are desired, cron(8) could be used to launch your backup script automatically.

It will also be helpful to document (on a scrap of paper) how large each file system needs to be. You can use "df -h" to determine how much space each partition is currently using. This will be handy when the drive fails and you need to recreate your partition table on the new drive.

Restoring your data will also help reduce fragmentation. To ensure you get all files, the best way of backing up is rebooting your system in single user mode. File systems do not need to be mounted to be backed up. Don't forget to mount root (/) r/w after rebooting in single user mode or your dump will fail when trying to write out dumpdates. Enter "bsd -s" at the boot> prompt for single user mode.

Viewing the contents of a dump tape:

After you've backed up your file systems for the first time, it would be a good idea to briefly test your tape and be sure the data on it is as you expect it should be.

You can use the following example to review a catalog of files on a dump tape:

# /sbin/restore -tvs 1 -f /dev/rst0

This lists the files that exist on the 1st partition of the dump tape. Following along from the above examples, 1 would be your root (/) file system.

To see what resides on the 2nd tape partition and send the output to a file, you would use a command similar to:

# /sbin/restore -tvs 2 -f /dev/rst0 > /home/me/list.txt

If you have a mount table like the simple one, 2 would be /usr. If yours is a more advanced mount table, 2 might be /var or another fs. The sequence number matches the order in which the file systems are written to tape.
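
Since the sequence number is simply the position of a file system in the order it was dumped, it can be looked up from that list. The helper below is a hypothetical sketch for this purpose; it assumes you know the order in which your backup script wrote the file systems to tape.

```shell
# dump_seq TARGET FS... -> prints the 1-based position of TARGET in the
# list of file systems, given in the order they were dumped to tape.
dump_seq() {
    target=$1
    shift
    n=0
    for fs in "$@"; do
        n=$((n + 1))
        if [ "$fs" = "$target" ]; then
            echo "$n"
            return 0
        fi
    done
    return 1   # TARGET was not dumped to this tape
}

# Order used by the more advanced example: / /var /home /usr
seq=$(dump_seq /home / /var /home /usr)
echo "/sbin/restore -tvs $seq -f /dev/rst0"
```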

Restoring from tape:

The example scenario listed below would be useful if your fixed drive has failed completely. In the event you want to restore a single file from tape, review the restore man page and pay attention to the interactive mode instructions.

If you have prepared properly, replacing a disk and restoring your data from tape can be a very quick process. The standard OpenBSD install/boot floppy already contains the required restore utility as well as the binaries required to partition and make your new drive bootable. In most cases, this floppy and your most recent dump tape are all you'll need to get back up and running.

After physically replacing the failed disk drive, the basic steps to restore your data are as follows:

14.11 - Mounting disk images in OpenBSD

To mount a disk image (ISO images, disk images created with dd, etc.) in OpenBSD you must configure a vnd(4) device. For example, if you have an ISO image located at /tmp/ISO.image, you would take the following steps to mount the image.

# vnconfig vnd0 /tmp/ISO.image
# mount -t cd9660 /dev/vnd0c /mnt

Notice that since this is an ISO-9660 image, as used by CDs and DVDs, you must specify type of cd9660 when mounting it. The same is true for any other image type: the file system type must match the image, e.g. you must use type ext2fs when mounting Linux disk images.

To unmount the image use the following commands.

# umount /mnt
# vnconfig -u vnd0

For more information, refer to the vnconfig(8) man page.
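
The same four steps apply to other image types, with only the file system type changing. The sketch below prints the command sequence for a given image; vnd_mount_cmds and the image path /tmp/linux.img are made up for this example. It assumes the file system starts at the beginning of the image, so the 'c' partition is the right one to mount. An image of a whole partitioned disk would need its disklabel examined first.

```shell
# vnd_mount_cmds IMAGE FSTYPE MOUNTPOINT -> prints (does not run) the
# commands to attach IMAGE to vnd0 and mount it, then to detach it again.
vnd_mount_cmds() {
    image=$1
    fstype=$2
    mnt=$3
    echo "vnconfig vnd0 $image"
    echo "mount -t $fstype /dev/vnd0c $mnt"
    echo "umount $mnt"
    echo "vnconfig -u vnd0"
}

vnd_mount_cmds /tmp/linux.img ext2fs /mnt
```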

14.12 - Help! I'm getting errors with IDE DMA!

DMA IDE transfers, supported by pciide(4), are unreliable with many combinations of older hardware.

OpenBSD is aggressive and attempts to use the highest DMA Mode it can configure. This will cause corruption of data transfers in some configurations because of buggy motherboard chipsets, buggy drives, and/or noise on the cables. Luckily, Ultra-DMA modes protect data transfers with a CRC to detect corruption. When the Ultra-DMA CRC fails, OpenBSD will print an error message and try the operation again.

wd2a:  aborted command, interface CRC error reading fsbn 64 of 64-79
(wd2 bn 127; cn 0 tn 2 sn 1), retrying

After failing a couple of times, OpenBSD will downgrade to a slower (hopefully more reliable) Ultra-DMA mode. If Ultra-DMA mode 0 is hit, then the drive downgrades to PIO mode.

UDMA errors are often caused by low quality or damaged cables. Cable problems should usually be the first suspect if you get many DMA errors or unexpectedly low DMA performance. It is also a bad idea to put the CD-ROM on the same channel with a hard disk.

If replacing cables does not resolve the problem and OpenBSD does not successfully downgrade, or the process causes your machine to lock hard, or causes excessive messages on the console and in the logs, you may wish to force the system to use a lower level of DMA or UDMA by default. This can be done by using UKC or config(8) to change the flags on the wd(4) device.

14.14 - Why does df(1) tell me I have over 100% of my disk used?

People are sometimes surprised to find they have negative available disk space, or more than 100% of a filesystem in use, as shown by df(1).

When a filesystem is created with newfs(8), some of the available space is held in reserve from normal users. This provides a margin of error when you accidentally fill the disk, and helps keep disk fragmentation to a minimum. The default reserve is 5% of the disk capacity, so if the root user has been carelessly filling the disk, you may see up to 105% of the available capacity in use.

If the 5% value is not appropriate for you, you can change it with the tunefs(8) command.
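
To see where the 105% figure comes from, the percentage can be worked out by hand: usage is measured against the space left after the reserve is subtracted. The function below is a simplified sketch of that calculation, not the actual code used by df(1). It truncates instead of rounding, and the block counts are made-up sample values.

```shell
# df_capacity USED TOTAL [RESERVE_PCT] -> prints the integer percentage
# of the non-reserved space that is in use (truncated, not rounded).
df_capacity() {
    used=$1
    total=$2
    reserve=${3:-5}
    usable=$(( total - total * reserve / 100 ))
    echo $(( used * 100 / usable ))
}

df_capacity 500000 1000000     # half of the blocks are in use
df_capacity 1000000 1000000    # root has filled the reserve as well
```

With the default 5% reserve, a completely full file system works out to 105% of the space available to normal users.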

14.15 - Recovering partitions after deleting the disklabel

If you have a damaged partition table, there are various things you can attempt to do to recover it.

Firstly, panic. You usually do so anyways, so you might as well get it over with. Just don't do anything stupid. Panic away from your machine. Then relax, and see if the steps below won't help you out.

A copy of the disklabel for each disk is saved in /var/backups as part of the daily system maintenance. Assuming you still have the var partition, you can simply read the output, and put it back into disklabel.
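
As a sketch of putting a saved label back, disklabel(8) can restore a label from a file with its -R option. The helper below only prints the command for review; restore_label_cmd is a hypothetical name, and the backup file name shown is an assumption, so check the contents of /var/backups on your own system first.

```shell
# restore_label_cmd DISK BACKUP_FILE -> prints (does not run) the command
# that loads a saved label back onto DISK.
restore_label_cmd() {
    disk=$1
    backup=$2
    echo "disklabel -R $disk $backup"
}

# The backup file name below is an assumption; list /var/backups to see
# what the daily maintenance actually saved on your system.
restore_label_cmd sd0 /var/backups/disklabel.sd0.current
```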

In the event that you can no longer see that partition, there are two options: fix enough of the disc so you can see it, or fix enough of the disc so that you can get your data off. Depending on what happened, one or the other of those may be preferable (with dying discs you want the data first; with sloppy fingers you can just have the label).

The first tool you need is scan_ffs(8) (note the underscore, it isn't called "scanffs"). scan_ffs(8) will look through a disc, and try and find partitions and also tell you what information it finds about them. You can use this information to recreate the disklabel. If you just want /var back, you can recreate the partition for /var, and then recover the backed up label and add the rest from that.

disklabel(8) will update both the kernel's understanding of the disklabel, and then attempt to write the label to disk. Therefore, even if the area of the disk containing the disklabel is unreadable, you will be able to mount(8) it until the next reboot.

14.16 - Can I access data on filesystems other than FFS?

Yes. Other supported filesystems include: ext2 (Linux), ISO9660 and UDF (CD-ROM, DVD media), FAT (MS-DOS and Windows), NFS, NTFS (Windows). Some of them have limited, for instance read-only, support.

We will give a general overview on how to use one of these filesystems under OpenBSD. To be able to use a filesystem, it must be mounted. For details and mount options, please consult the mount(8) manual page, and that of the mount command for the filesystem you will be mounting, e.g. mount_msdos, mount_ext2fs, ...

First, you must know on which device your filesystem is located. This can be simply your first hard disk, wd0 or sd0, but it may be less obvious. All recognized and configured devices on your system are mentioned in the output of the dmesg(8) command: a device name, followed by a one-line description of the device. For example, my first CD-ROM drive is recognized as follows:

cd0 at scsibus0 targ 0 lun 0: <COMPAQ, DVD-ROM LTD163, GQH3> SCSI0 5/cdrom removable

For a much shorter list of available disks, you can use sysctl(8). The command

# sysctl hw.disknames
will show all disks currently known to your system, for example:
hw.disknames=cd0:,cd1:,wd0:,fd0:,cd2:
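
If you want the disk names one per line, for example to loop over them in a script, the sysctl output can be split up with standard tools. The function below is a small sketch written for this example; it drops anything after the colon in each entry, such as a DUID.

```shell
# disk_names -> reads a "hw.disknames=..." line on stdin and prints one
# disk name per line, dropping anything after the colon (such as a DUID).
disk_names() {
    sed -e 's/^hw\.disknames=//' | tr ',' '\n' | sed -e 's/:.*$//' -e '/^$/d'
}

# Sample line copied from the text; on a live system you would pipe
# the output of "sysctl hw.disknames" into disk_names instead.
echo 'hw.disknames=cd0:,cd1:,wd0:,fd0:,cd2:' | disk_names
```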

At this point, it is time to find out which partitions are on the device, and in which partition the desired filesystem resides. Therefore, we examine the device using disklabel(8). The disklabel contains a list of partitions, with a maximum number of 16. Partition c always indicates the entire device. Partitions a-b and d-p are used by OpenBSD. Partitions i-p may be automatically allocated to file systems of other operating systems. In this case, I'll be viewing the disklabel of my hard disk, which contains a number of different filesystems.

NOTE: OpenBSD was installed after the other operating systems on this system, and during the install a disklabel containing partitions for the native as well as the foreign filesystems was installed on the disk. However, if you install foreign filesystems after the OpenBSD disklabel was already installed on the disk, you need to add or modify them manually afterwards. This will be explained in this subsection.

# disklabel wd0

# using MBR partition 2: type A6 off 20338290 (0x1365672) size 29318625 (0x1bf5de1)
# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: ST340016A       
duid: d920a43a5a56ad5f
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 78165360
boundstart: 20338290
boundend: 49656915
drivedata: 0 

16 partitions:
#             size        offset  fstype [fsize bsize  cpg]
  a:        408366      20338290  4.2BSD   2048 16384   16 # /
  b:       1638000      20746656    swap
  c:      78165360             0  unused
  d:       4194288      22384656  4.2BSD   2048 16384   16 # /usr
  e:        409248      26578944  4.2BSD   2048 16384   16 # /tmp
  f:      10486224      26988192  4.2BSD   2048 16384   16 # /var
  g:      12182499      37474416  4.2BSD   2048 16384   16 # /home
  i:         64197            63 unknown
  j:      20274030         64260 unknown
  k:       1975932      49656978   MSDOS
  l:       3919797      51632973 unknown
  m:       2939832      55552833  ext2fs
  n:       5879727      58492728  ext2fs
  o:      13783707      64372518  ext2fs 

As can be seen in the above output, the OpenBSD partitions are listed first. Next to them are a number of ext2 partitions and one MSDOS partition, as well as a few 'unknown' partitions. On i386 and amd64 systems, you can usually find out more about those using the fdisk(8) utility. For the curious reader: partition i is a maintenance partition created by the vendor, partition j is an NTFS partition and partition l is a Linux swap partition.
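
One way to tell the native partitions from the foreign ones is to compare each partition's extent with the boundstart and boundend values in the label. The sketch below does this for two partitions from the output above; the function names are made up for this example, and all values are in sectors.

```shell
# Values copied from the disklabel output above (all in sectors).
boundstart=20338290
boundend=49656915

# part_end OFFSET SIZE -> prints the first sector after the partition.
part_end() {
    echo $(( $1 + $2 ))
}

# in_openbsd_area OFFSET SIZE -> succeeds if the partition lies entirely
# between boundstart and boundend.
in_openbsd_area() {
    [ "$1" -ge "$boundstart" ] && [ "$(part_end "$1" "$2")" -le "$boundend" ]
}

in_openbsd_area 37474416 12182499 && echo "g (/home) is inside the OpenBSD area"
in_openbsd_area 49656978 1975932 || echo "k (MSDOS) is outside the OpenBSD area"
```

Partition g ends exactly at boundend, while partition k starts just past it.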

Once you have determined which partition it is you want to use, you can move to the final step: mounting the filesystem contained in it. Most filesystems are supported in the GENERIC kernel: just have a look at the kernel configuration file, located in the /usr/src/sys/arch/<arch>/conf directory. If you want to use one of the filesystems not supported in GENERIC, you will need to build a custom kernel.

When you have gathered the information needed as mentioned above, it is time to mount the filesystem. Let's assume a directory /mnt/otherfs exists, which we will use as a mount point where we will mount the desired filesystem. In this example, we will mount the ext2 filesystem in partition m:

# mount -t ext2fs /dev/wd0m /mnt/otherfs

If you plan to use this filesystem regularly, you may save yourself some time by inserting a line for it in /etc/fstab, for example something like:

/dev/wd0m /mnt/otherfs ext2fs rw,noauto,nodev,nosuid 0 0
Notice the 0 values in the fifth and sixth fields. This means we do not require the filesystem to be dumped or checked using fsck. Generally, those are things you want to have handled by the native operating system associated with the filesystem.

14.16.1 - The partitions are not in my disklabel! What should I do?

If you install foreign filesystems on your system (often the result of adding a new operating system) after you have already installed OpenBSD, a disklabel will already be present, and it will not be updated automatically to contain the new foreign filesystem partitions. If you wish to use them, you need to add or modify these partitions manually using disklabel(8).

As an example, I have modified one of my existing ext2 partitions: using Linux's fdisk program, I've reduced the size of the 'o' partition (see disklabel output above) to 1G. We will be able to recognize it easily by its starting position (offset: 64372518) and size (13783707). Note that these values are sector numbers, and that using sector numbers (not megabytes or any other measure) is the most exact and safest way of reading this information.

Before the change, the partition looked like this using OpenBSD's fdisk(8) utility (leaving only relevant output):

# fdisk wd0
. . .
Offset: 64372455        Signature: 0xAA55
         Starting       Ending       LBA Info:
 #: id    C   H  S -    C   H  S [       start:      size   ]
------------------------------------------------------------------------
 0: 83 4007   1  1 - 4864 254 63 [    64372518:    13783707 ] Linux files*
. . .
As you can see, the starting position and size are exactly those reported by disklabel(8) earlier. (Don't be confused by the value indicated by "Offset": it is referring to the starting position of the extended partition in which the ext2 partition is contained.)

After changing the partition's size from Linux, it looks like this:

# fdisk wd0
. . .
Offset: 64372455        Signature: 0xAA55
         Starting       Ending       LBA Info:
 #: id    C   H  S -    C   H  S [       start:      size   ]
------------------------------------------------------------------------
 0: 83 4007   1  1 - 4137 254 63 [    64372518:     2104452 ] Linux files*
. . .
Now this needs to be changed using disklabel(8). For instance, you can issue disklabel -e wd0, which will invoke an editor specified by the EDITOR environment variable (default is vi). Within the editor, change the last line of the disklabel to match the new size:
  o:       2104452      64372518  ext2fs
Save the disklabel to disk when finished. Now that the disklabel is up to date again, you should be able to mount your partitions as described above.
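
To relate the sector counts in this example to more familiar sizes, multiply by the sector size. The helper below is a sketch that assumes 512 byte sectors, as shown in the disklabel output, and truncates to whole megabytes.

```shell
# sectors_to_mb SECTORS -> prints the size in whole megabytes (truncated),
# assuming 512-byte sectors as shown in the disklabel output.
sectors_to_mb() {
    echo $(( $1 * 512 / 1048576 ))
}

sectors_to_mb 13783707    # 'o' partition before resizing
sectors_to_mb 2104452     # 'o' partition after resizing to about 1G
```

Keep using the sector values themselves when editing the disklabel; the megabyte figures are only a sanity check.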

You can follow a very similar procedure to add new partitions.

14.17 - Can I use a flash memory device with OpenBSD?

14.17.1 - Flash memory as a portable storage device

Normally, the memory device should be recognized upon plugging it into a port of your machine. Shortly after inserting it, a number of messages are written to the console by the kernel. For instance, when I plug in my USB flash memory device, I see the following on my console:
umass0 at uhub1 port 1 configuration 1 interface 0
umass0: LEXR PLUG DRIVE LEXR PLUG DRIVE, rev 1.10/0.01, addr 2
umass0: using SCSI over Bulk-Only
scsibus2 at umass0: 2 targets
sd0 at scsibus2 targ 1 lun 0: <LEXAR, DIGITAL FILM, /W1.> SCSI2 0/direct removable
sd0: 123MB, 512 bytes/sec, 251904 sec total
These lines indicate that the umass(4) (USB mass storage) driver has been attached to the memory device, and that it is using the SCSI system. The last two lines are the most important ones: they say to which device node the memory device has been attached, and what the total amount of storage space is. If you somehow missed these lines, you can still see them afterwards with the dmesg(8) command. Any CHS geometry reported for the device, such as the one in its disklabel, is rather fictitious, as the flash memory is being treated like any regular SCSI disk.

We will discuss two scenarios below.

The device is new/empty and you want to use it with OpenBSD only

You will need to initialize a disklabel onto the device, and create at least one partition. Please read Using OpenBSD's disklabel and the disklabel(8) manual page for details about this.

In this example I created just one partition a in which I will place a FFS filesystem:

# newfs sd0a
Warning: inode blocks/cyl group (125) >= data blocks (62) in last
    cylinder group. This implies 1984 sector(s) cannot be allocated.
/dev/rsd0a:     249856 sectors in 122 cylinders of 64 tracks, 32 sectors
        122.0MB in 1 cyl groups (122 c/g, 122.00MB/g, 15488 i/g)
super-block backups (for fsck -b #) at:
 32,
Let's mount the filesystem we created in the a partition on /mnt/flashmem. Create the mount point first if it does not exist.
# mkdir /mnt/flashmem
# mount /dev/sd0a /mnt/flashmem

You received the memory device from someone with whom you want to exchange data

There is a considerable chance the other person is not using OpenBSD, so there may be a foreign filesystem on the memory device. Therefore, we will first need to find out which partitions are on the device, as described in FAQ 14 - Foreign Filesystems.

# disklabel sd0

# /dev/rsd0c:
type: SCSI
disk: SCSI disk
label: DIGITAL FILM    
flags:
bytes/sector: 512
sectors/track: 32
tracks/cylinder: 64
sectors/cylinder: 2048
cylinders: 123
total sectors: 251904
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0 

16 partitions:
#             size        offset  fstype [fsize bsize  cpg]
  c:        251904             0  unused      0     0      # Cyl     0 -   122 
  i:        250592            32   MSDOS                   # Cyl     0*-   122*
As can be seen in the disklabel output above, there is only one partition i, containing a FAT filesystem created on a Windows machine. As usual, the c partition indicates the entire device.

Let's now mount the filesystem in the i partition on /mnt/flashmem.

# mount -t msdos /dev/sd0i /mnt/flashmem
Now you can start using it just like any other disk.

WARNING: You should always unmount the filesystem before unplugging the memory device. If you don't, the filesystem may be left in an inconsistent state, which may result in data corruption.

Upon detaching the memory device from your machine, you will again see the kernel write messages about this to the console:

umass0: at uhub1 port 1 (addr 2) disconnected
sd0 detached
scsibus2 detached
umass0 detached

14.17.2 - Flash memory as bootable storage

One can also use flash memory in various forms as a bootable disk with OpenBSD. This can be done with both USB devices (assuming your computer can boot from a USB flash device; not all can), or with a non-USB (e.g., CF) device with an IDE or SATA adapter. (Non-USB devices attached with a USB adapter are treated as USB devices.) In some cases, you may actually use a device in both ways (load the media in a USB adapter, but run it in an IDE adapter).

A flash device attached to a USB port will show up as a sd(4) SCSI-like device. When attached to an IDE adapter, it will show up as a wd(4) device.

In the case of flash media in an IDE adapter, it can be booted from any system that could boot from an IDE hard disk on the same adapter. In every sense, the system sees the flash media as an IDE disk. Simply configure the hardware appropriately, then install OpenBSD to the flash disk as normal.

In the case of booting from a USB device, your system must be able to boot from the USB device without being distracted by other devices on the system. Note that if your intention is to make a portable boot environment on a USB device, you really want to use DUIDs, rather than the traditional "/dev/sd0X" notation. The USB device will show up as a SCSI disk, sometimes sd0. Without DUIDs, if you plug this device into a system which already has a few SCSI-like disks (e.g., devices attached to an ahci(4) interface) on it, it will probably end up with a different identifier, which will complicate carrying the flash device from system to system, as you would have to update /etc/fstab. Using DUIDs completely resolves this issue.
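
In /etc/fstab, a DUID entry takes the form of the DUID, a dot, and the partition letter, in place of the /dev/sd0X name. As a sketch, the hypothetical helper below rewrites device-name entries for one disk into that form. The fstab lines are invented for this example, and the DUID is the sample value from the disklabel output in section 14.16; use the duid line from your own device's disklabel.

```shell
# fstab_to_duid DISK DUID -> reads fstab lines on stdin and rewrites
# /dev/DISKx entries as DUID.x, leaving all other lines untouched.
fstab_to_duid() {
    disk=$1
    duid=$2
    sed -e "s|^/dev/${disk}\([a-p]\)|${duid}.\1|"
}

# Hypothetical fstab for a flash device that showed up as sd0 at
# install time.
fstab_to_duid sd0 d920a43a5a56ad5f <<'EOF'
/dev/sd0a / ffs rw 1 1
/dev/sd0d /home ffs rw,nodev,nosuid 1 2
EOF
```

The output is written to standard output only, so you can compare it with the existing file before replacing anything.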

Some notes:

14.17.3 - How do I create a bootable "Live" USB device?

It is very easy to create a bootable USB flash (or other!) drive that can be used as a "live" OpenBSD system without installing OpenBSD on the local hard disk of a machine. Obviously, the target machine must be bootable from a USB device, but the initial load can actually be done from any machine with a USB interface.

Some reasons you may want to do this:

Creating such a "live OpenBSD drive" is simple. That's it!

There are some things you may want to do after the install to improve your results:

14.18 - Optimizing disk performance

Disk performance is a significant factor in the overall speed of your computer. It becomes increasingly important when your computer is hosting a multi-user environment (users of all kinds, from those who log-in interactively to those who see you as a file-server or a web-server). Data storage constantly needs attention, especially when your partitions run out of space or when your disks fail. OpenBSD has a few options to increase the speed of your disk operations.

14.18.1 - Soft updates

An important tool that can be used to speed up your system is softupdates. One of the slowest operations in the traditional BSD file system is updating metainfo (which happens, among other times, when you create or delete files and directories). Softupdates attempts to update metainfo in RAM instead of writing to the hard disk each and every single metainfo update. Another effect of this is that the metainfo on disk should always be complete, although not always up to date. You can read more about softupdates in the Softupdates FAQ entry.

14.18.2 - Size of the namei() cache

The name-to-inode translation (a.k.a., namei()) cache controls the speed of pathname to inode(5) translation. If a tool such as systat(1) shows a large number of namei() cache misses, a reasonable way to derive a value for the cache is to examine the system's current computed value with sysctl(8) (which calls this parameter "kern.maxvnodes") and to increase this value until either the namei() cache hit rate improves or it is determined that the system does not benefit substantially from a larger namei() cache. After the value has been determined, you can set it at system startup time with sysctl.conf(5).
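
When comparing hit rates before and after a change, it helps to reduce the counters to a single percentage. The sketch below does that and prints a line in the format sysctl.conf(5) expects. Both function names and all numbers are made up for illustration; they are not recommended values.

```shell
# namei_hit_pct HITS MISSES -> prints the cache hit rate as a whole percent.
namei_hit_pct() {
    echo $(( $1 * 100 / ($1 + $2) ))
}

# sysctl_conf_line VALUE -> prints a line suitable for sysctl.conf(5).
sysctl_conf_line() {
    echo "kern.maxvnodes=$1"
}

namei_hit_pct 9200 800     # made-up counters: 92 percent hits
sysctl_conf_line 20000     # made-up value, not a recommendation
```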

14.19 - Why aren't we using async mounts?

Question: "I simply do "mount -u -o async /" which makes one package I use (which insists on touching a few hundred things from time to time) usable. Why is async mounting frowned upon and not on by default (as it is in some other unixen)? Isn't it a much simpler, and therefore, a safer way of improving performance in some applications?"

Answer: "Async mounts are indeed faster than sync mounts, but they are also less safe. What happens in case of a power failure? Or a hardware problem? The quest for speed should not sacrifice the reliability and the stability of the system. Check the man page for mount(8)."

             async   All I/O to the file system should be done asynchronously.
                     This is a dangerous flag to set since it does not
                     guarantee to keep a consistent file system structure on
                     the disk.  You should not use this flag unless you are
                     prepared to recreate the file system should your system
                     crash.  The most common use of this flag is to speed up
                     restore(8) where it can give a factor of two speed
                     increase.

On the other hand, when you are dealing with temp data that you can recreate from scratch after a crash, you can gain speed by using a separate partition for that data only, mounted async. Again, do this only if you don't mind the loss of all the data in the partition when something goes wrong. For this reason, mfs(8) partitions are mounted asynchronously, as they will get wiped and recreated on a reboot anyway.

14.20 - Duplicating your root partition: altroot

OpenBSD provides an "altroot" facility in the daily scripts. If the environment variable ROOTBACKUP=1 is set in either /etc/daily.local or root's crontab(5), and a partition is specified in /etc/fstab as mounting to /altroot with the mount options of "xx", every night the entire contents of the root partition will be duplicated to the /altroot partition. Assuming we want to back up our root partition to wd1a, we add the following to /etc/fstab
/dev/wd1a /altroot ffs xx 0 0
and set the appropriate environment variable in /etc/daily.local:
# echo ROOTBACKUP=1 >>/etc/daily.local
As the altroot process will capture your /etc directory, this will make sure any configuration changes there are updated daily.

This is a "disk image" copy done with dd(1), not a file-by-file copy, so your /altroot partition should be exactly the same size as your root partition or larger. Also, excessively large root partitions should be avoided so the process does not take too long.

For full redundancy, the rest of the partitions should be duplicated as well, using softraid(4), dump(8)/restore(8), rsync, etc. This can be done manually, or as part of a regular schedule, such as the weekly.local, daily.local, or monthly.local scripts.

Generally, you will want your "altroot" partition to be on a different disk that has been configured to be fully bootable should the primary disk fail. It is possible to have an "altroot" on the same disk as your boot drive, but the benefit of this is limited.

Note that we did not specify the altroot device by DUID, but by device name. We probably want to be pushing from the boot device to the secondary device, which can end up changing if the drive order is changed. For this reason, you may want to specify the root and altroot in /etc/fstab as a device name, not a DUID.

14.21 - How do I use softraid(4)?

Softraid(4) works by emulating a scsibus(4) with sd(4) devices made by combining a number of OpenBSD disklabel(8) partitions ("chunks") into a virtual disk ("volume") with the desired RAID strategy ("discipline"), such as RAID0, RAID1, RAID4, RAID5 or Crypto. (Note: only RAID0, RAID1 and Crypto are fully supported at the moment.)

This virtual disk is treated as any other disk -- first partitioned with fdisk(8) (on fdisk platforms) and then with disklabel(8), partitions have file systems made, mounted, then used.

Some words on RAID in general:

14.21.1 - Doing the install

Softraid disk space can be created using free (or added!) chunks of disk space after install, but that is just a special (and simple) case of adding softraid during install.

The tools to assemble your softraid system are in the basic OpenBSD install (for adding softraid devices after install), but they are also available on the CD-ROM and bsd.rd installation kernels. They do not exist on the floppies due to space issues; one simple work-around is to do a very minimal OpenBSD install from floppy, then boot from bsd.rd on your installed system and re-build as desired.

The installation process will be a little different than the standard OpenBSD install, as you will want to drop to the shell and create your softraid(4) drive before doing the install. Once the softraid(4) disk is created, you will perform the install relatively normally, placing the partitions you wish to be RAIDed on the newly configured drive.

You can pre-create just the RAID partitions and assemble them into a softraid(4) volume and let the installer do the rest, but it is probably easier to also manually create your root and swap partitions before invoking the installer.

This does mean you will have to carefully set up the disk before invoking the installer, making sure you manually do a few steps that the installer normally takes you through.

The install kernel only has the /dev entries for one wd(4) device and one sd(4) device on boot, so you will need to create more disk devices to set up your softraid device. This process is normally done automatically by the installer, but you haven't yet run the installer, and you will be adding a disk that didn't exist at boot. For example, if we needed to support a second and third wd(4) device and a second sd(4) device (remember, the softraid devices will be sd(4) devices), you could do the following from the shell prompt:

# cd /dev
# sh MAKEDEV wd1 wd2 sd1
You now have full support for sd0, sd1, wd0, wd1 and wd2.

You will need to properly fdisk(8) the physical drives (if appropriate for your platform -- make sure you set up the second disk so it is bootable!) and then use disklabel(8) to set up the partitions.

The fdisk(8) steps below will put an MBR on the disk and an OpenBSD partition on the disk. IF you wish to use the entire disk for OpenBSD (i.e., have NOTHING else on the disk), you can do this with a simple one-liner for each drive:

# fdisk -iy wd0
# fdisk -iy wd1
(Do be sure you understand what those lines do to any data that was on your disk before using it blindly!) Otherwise, you will need to create an OpenBSD partition within the new disks.

Create the partitions for softraid

When creating the softraid(4) partition, give it the type of "RAID" rather than the normal "4.2BSD" or "swap". In this case, we will create our desired partitions on wd0:
# disklabel -E wd0
Label editor (enter '?' for help at any prompt)
> a a
offset: [64] ENTER 
size: [30282461] 500m
Rounding to cylinder (16065 sectors): 1028096
FS type: [4.2BSD] ENTER 
> a b
offset: [1028160]  ENTER 
size: [29254365] 500m
Rounding to cylinder (16065 sectors): 1028160
FS type: [swap] ENTER 
> a m
offset: [3148740]  ENTER
size: [28226205] 10g
Rounding to cylinder (16065 sectors): 20980890
FS type: [4.2BSD] RAID
> q
Write new label?: [y] ENTER

Now, we need to prep out the second disk to match key parts of the first disk's layout. Since we are using the /altroot system, we will want an 'a' partition on the secondary disk the same size as the primary's 'a'. We want the system to run off the second drive as it would the first, so we will want to have a similar sized swap partition (though a little bigger or smaller will not hurt). We will also want a RAID partition the same size as the primary. If the RAID partitions are not the same size, the smaller of the two will dictate the final RAID volume size.
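
The "smaller chunk wins" rule is easy to work through by hand, but a short sketch may help when planning layouts. The two functions below are hypothetical helpers; sectors_to_mb assumes 512-byte sectors, as in the disklabel session above.

```shell
# mirror_sectors SIZE_A SIZE_B
# A RAID1 volume can be no larger than its smallest chunk, so print
# the smaller of two chunk sizes (both given in sectors).
mirror_sectors() {
    if [ "$1" -le "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# sectors_to_mb SECTORS
# Convert a sector count to whole megabytes.
# Assumes 512-byte sectors, i.e. 2048 sectors per megabyte.
sectors_to_mb() {
    echo $(( $1 / 2048 ))
}
```

For example, pairing the 20980890-sector RAID partition created above with a hypothetical 20964825-sector partition on the second disk would give a volume based on the smaller figure. Expect the finished volume to be slightly smaller than the chunk itself: the dmesg output later in this section reports 20980362 sectors for a 20980890-sector chunk, presumably because softraid keeps some of the chunk for its own metadata.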

In short...you really want to just repeat the above allocation process on the second drive, wd1.

Assembling the RAID volume

We will assume that your two RAID partitions are on wd0m and wd1m. Note that the RAID partition letter is arbitrary; it does not need to be the same letter on the secondary drive, but keeping them the same will make keeping track of things that much easier for you.

Note that since softraid(4) has to look around a bit to find evidence of arrays it needs to assemble, if your disk has been used for softraid previously, you may find it very helpful to use dd(1) to clear the first megabyte or so from each partition before going any further:

# dd if=/dev/zero of=/dev/rwd0m bs=1m count=1
   ...
# dd if=/dev/zero of=/dev/rwd1m bs=1m count=1
   ...
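
If you find yourself clearing several partitions, the dd(1) step can be wrapped in a small function. This is a sketch, not an OpenBSD tool: zero_head is a hypothetical name, the block size is spelled out in bytes rather than as "1m", and conv=notrunc is added so the function can also be tried safely on a scratch file. It destroys whatever is at the start of the target, so check the device name twice.

```shell
# zero_head TARGET
# Overwrite the first megabyte of TARGET with zeros.
# TARGET is normally a raw partition device such as /dev/rwd0m.
zero_head() {
    # bs is given in bytes (1048576 = 1 megabyte).
    # conv=notrunc keeps the rest of a regular file intact, which
    # only matters when practicing on a file instead of a device.
    dd if=/dev/zero of="$1" bs=1048576 count=1 conv=notrunc 2>/dev/null
}
```

Used on the two RAID partitions from this example:

# zero_head /dev/rwd0m
# zero_head /dev/rwd1m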

We now create our new softraid(4) disk using bioctl(8):

# bioctl -c 1 -l /dev/wd0m,/dev/wd1m softraid0
This creates a RAID1 volume ("-c 1"), using the listed partitions ("-l /dev/wd0m,/dev/wd1m"), using the softraid0 driver. If there are no other sd(4) devices on this system, this will become sd0. Note that if you are creating multiple RAID devices, either on one disk or on multiple devices, you will always use the softraid0 virtual disk interface driver; you won't be using "softraid1" or others. Remember, the "softraid0" there is a virtual RAID controller, and you can hang many virtual disks off this controller.

This will create a new disk, "sd0" (assuming there are no other sd(4) devices on your system). This device will now show on the system console and dmesg as a newly installed device:

scsibus1 at softraid0: 1 targets
sd0 at scsibus2 targ 0 lun 0: <OPENBSD, SR RAID 1, 005> SCSI2 0/direct fixed
sd0: 10244MB, 512 bytes/sec, 20980362 sec total
showing that we now have a new SCSI bus, and a new disk. This volume will be automatically detected and assembled from this point onwards when the system boots.
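
Since the sd(4) number of the volume depends on what else is attached, it can be handy to pull the softraid volumes out of the dmesg text. The following is a sketch that assumes the attach line has the same shape as the example above ("sdN at ... <OPENBSD, SR discipline, ...>"); softraid_volumes is a hypothetical name.

```shell
# softraid_volumes
# Read dmesg text on standard input and print, for each softraid
# volume found, its sd(4) device name and discipline.
softraid_volumes() {
    # Matches attach lines shaped like the one shown above:
    #   sd0 at scsibus2 targ 0 lun 0: <OPENBSD, SR RAID 1, 005> ...
    sed -n 's/^\(sd[0-9][0-9]*\) at .*<OPENBSD, SR \([^,]*\),.*/\1 \2/p'
}
```

On a running system this would be used as:

# dmesg | softraid_volumes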

Because the new device probably has a lot of garbage where you expect an MBR and disklabel, zeroing the first chunk of the new disk is highly recommended, if you didn't zero the component parts above:

# dd if=/dev/zero of=/dev/rsd0c bs=1m count=1
You are now ready to install OpenBSD on your system. Perform the install as normal by invoking "install" at the boot media command prompt. Be careful to select "custom" layout for disklabel when prompted, otherwise your RAID partition will be overwritten! Use the 'n' option of disklabel to define the mount point for your root partition, and create all the partitions on your new softraid disk (sd0 in our example here) that should be there, rather than on your non-RAID disks.

Now you can reboot your system, and if you have done all properly, it will automatically assemble your RAID set, and mount the appropriate partitions.

14.21.3 - Softraid notes

Complications when other sd(4) disks exist

Softraid disks are assembled after all other IDE, SATA, SAS and SCSI disks are attached. As a result, if the number of sd(4) devices changes (either by adding or removing devices -- or if a device fails), the identifier of the softraid disk will change. For this reason, it is very important to use DUIDs (Disklabel Unique Identifiers) rather than drive names in your fstab(5) file.

You may not want to specify the root device by DUID.
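
To see which fstab(5) entries still refer to a disk by device name, something like the following sketch can be used. It assumes that a DUID entry starts with the 16 hex digit identifier followed by a dot and the partition letter, while a device-name entry starts with /dev/; fstab_by_name is a hypothetical helper.

```shell
# fstab_by_name FSTAB
# Print the device and mount point of every entry that names its
# disk as /dev/... rather than by DUID.
fstab_by_name() {
    awk '$1 ~ /^\/dev\// { print $1, $2 }' "$1"
}
```

Run it against the live file:

# fstab_by_name /etc/fstab

Anything it prints is a candidate for review, not automatically a mistake: as noted above, the root and altroot entries may deliberately be left as device names.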

Three disk RAID1?

Softraid supports RAID1 with more than two "chunks", and the man page examples show a three disk RAID1 configuration. RAID1 simply duplicates the data across all the chunks of storage: two give full redundancy, three give additional fault tolerance. The advantage of RAID1 with three (or more!) disks or chunks is that in the event of one disk failure, you STILL have complete redundancy. Think of it as a hot spare that doesn't need time to rebuild! In theory, a three disk RAID1 is slower on writes than a two disk RAID1, though it should be much faster on writes than a rebuilding two disk RAID1.
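
A three-chunk mirror uses the same bioctl(8) invocation as the two-chunk example earlier in this section, with one more entry in the "-l" list. The sketch below only builds and prints the command line, so it can be reviewed before anything is run; softraid_create_cmd is a hypothetical helper, and wd2m is an assumed third RAID partition that does not appear elsewhere in this example.

```shell
# softraid_create_cmd LEVEL CHUNK [CHUNK ...]
# Print (do not run) the bioctl(8) command that would create a
# softraid volume of the given level from the listed chunks.
softraid_create_cmd() {
    level=$1
    shift
    # Join the remaining arguments with commas, as -l expects.
    # IFS is changed only inside the command substitution's subshell.
    list=$(IFS=,; echo "$*")
    echo "bioctl -c $level -l $list softraid0"
}
```

For three chunks:

# softraid_create_cmd 1 /dev/wd0m /dev/wd1m /dev/wd2m
bioctl -c 1 -l /dev/wd0m,/dev/wd1m,/dev/wd2m softraid0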

14.21.4 - Disaster recovery

This is the section you want to skip over, but don't. This is the reason for RAID -- if disks never failed, you wouldn't add the complexity of RAID to your system! Unfortunately, as failures are very difficult to list comprehensively, there is a strong probability that the event you experience won't be described exactly here, but if you take the time to understand the strategies here, and the WHY, hopefully you can use them to recover from whatever situations come your way.

Keep in mind, failures are often not simple. The author of this article had a drive in a hardware RAID solution develop a short across the power feed. In addition to the drive itself, this required replacing the power supply, the RAID enclosure and the power supply of a second computer he used to verify the drive was actually dead, and then restoring the data from backup, as he didn't properly configure the replacement enclosure.

The steps needed for system recovery can be performed in single user mode, or from the install kernel (bsd.rd).

If you plan on practicing softraid recovery (and we HIGHLY suggest you do so!), you may find it helpful to zero a drive you remove from the array before you attempt to return it to the array. Not only does this more accurately simulate replacing the drive with a new one, it will avoid the confusion that can result when the system detects the remains of a softraid array.

Recovery from a failure will often be a two-stage event -- the first stage is bringing the system back up to a running state, the second stage is to rebuild the failed array. The two stages may be separated by some time if you don't have a replacement drive handy.

Recovery from drive failure: secondary

This is relatively easy. You may have to remove the failed disk to get the system back up.

When you are ready to repair the system, you will replace the failed drive, create the RAID and other disklabel partitions, then rebuild the mirror. Assuming your RAID volume is sd0, and you are replacing the failed device with wd1m, the following process should work:

Recovery from drive failure: primary

Many PC-like computers can not boot from a second drive if the primary drive has failed but is still attached, unless it is so dead it isn't detected. Many can not boot from a drive that isn't the "primary", even if there is no other drive.

In general, if your primary drive fails, you will have to remove it, and in many cases "promote" your secondary drive to primary configuration before the system will boot. This may involve re-jumpering the disk, plugging the disk into another port or some other variation. Of course, what is on the secondary disk has to not only include your RAID partition, but also has to be functionally bootable.

Once you have the system back up on the secondary disk and a new disk in place, you rebuild as above.

Recovery from "shuffling" your disks

What if you have four disks in your system, say, sd0, sd1, sd2, and sd3, and for reasons of hardware replacement or upgrade, you end up with the drives out of the machine, and lose track of which was which?

Fortunately, softraid handles this very well: it considers the disks "roaming", but will successfully rebuild your arrays. However, the boot disk in the machine has to be bootable, and if you just made changes in the root partition before doing this, you probably want to be sure you didn't boot from your altroot partition by mistake.

14.21.5 - Softraid Crypto

Cryptographic softraid(4) volumes are set up rather simply:

Once this is set up, you can then "unlock" the crypto volume when desired with:
# bioctl -c C -l /dev/sd1m softraid0
Passphrase: My Crypto Pass Phrase
softraid0: CRYPTO volume attached as sd1
You can then mount the encrypted volume's partitions using mount as usual.

To disconnect a crypto volume (rendering it unusable again), dismount any file systems and use the following (where the encrypted volume is sd1):

# bioctl -d sd1
The man page for this looks a little scary, as the -d command is described as "deleting" the volume, but in the case of crypto, it just deactivates the encrypted volume so it can't be accessed until it is activated again with the passphrase.

Many other options are available with softraid, and new features are being added and improvements made, so do consult the man pages for bioctl(8) and softraid(4) on your system.

I forgot my passphrase!

Sorry. This is real encryption, there's not a back door or magic unlocking tool. If you lose your passphrase, your data on your softraid crypto volume will be unusable.

