Disks: The Alto File System
This document describes the disk formats used in the Alto File System.
It also describes a "disk object," a Bcpl software construct that is
used to interface low-level disk drivers with packages that implement
higher-level objects, such as streams.
The primary focus of the description will be for the "standard" Alto
disks: either (1) up to 2 Diablo Model 31 disk drives or (2) one Diablo
Model 44 disk drive. The low-level drivers for these disks are called
"Bfs" (Basic File System). With minor modifications, the description
below applies to the Trident Model T80 and T300 disk drives, when
formatted for Alto file system conventions. The differences are
flagged with the string [Trident]. Low-level drivers for the Trident
disks are called "Tfs."
Relocatable binary files for the BFS are kept in BFSBrs.dm. The
sources, command files, and test program (described later in this
document) are kept in BFSSources.dm Relocatable binary
files for the TFS are kept in TFS.dm; sources are kept on
2. File and Disk Structure
This section describes the conventions of the Alto file system. The
files AltoFileSys.D and Bfs.D contain Bcpl structure declarations that
correspond to this description ([Trident]: See also "Tfs.D").
The unit of transfer between disk and memory, and hence that of the
file system, is the disk sector. Each sector has three fields: a 2-
word header, an 8-word label, and a 256-word data page. ([Trident]:
The fields are a 2-word header, a 10-word label, and a 1024-word data
A sector is identified by a disk address; there are two kinds of disk
addresses, real and virtual. The hardware deals in real addresses,
which have a somewhat arbitrary format. An unfortunate consequence is
that the real addresses for all the pages on a disk unit are sparse in
the set of 16 bit integers. To correct this defect, virtual addresses
have been introduced. They have the property that the pages of a disk
unit which holds n pages have virtual addresses 0 ... (n-1).
Furthermore, the ordering of pages by virtual address is such that
successive pages in the virtual space are usually sequential on the
disk. As a result, assigning a sequence of pages to consecutive
virtual addresses will ensure that they can be read in as fast as
Copyright Xerox Corporation 1982
Disks & Bfs April 10, 1982 5:46 PM 2
2.1. Legal Alto Files
An Alto file is a data structure that contains two sorts of
information: some is mandatory, and is required for all legal files;
the remainder is "hints". Programs that operate on files should
endeavor to keep the hints accurate, but should never depend on the
accuracy of a hint.
A legal Alto file consists of a sequence of pages held together by a
doubly-linked list recorded in the label fields. Each label contains
the mandatory information:
The forward and backward links, recorded as real disk addresses.
A page number which gives the position of the page in the file;
pages are numbered from 0.
A count of the number of characters of data in the page (numchars).
This may range from 0 (for a completely empty page) to 512 (for a
completely full page). ([Trident]: A full page contains 2048
A real file id, which is a three-word unique identifier for the
file. The user normally deals with virtual file ids (see the
discussion of file pointers, below), which are automatically
converted into real file ids when a label is needed.
Three bits in the file id deserve special mention:
Directory: This bit is on if the file is itself a directory file.
This information is used by the disk Scavenger when trying to re-
build a damaged disk data structure.
Random: This bit is currently unused.
NoLog: This bit is no longer used, but many existing files are
likely to have it set.
Leader Page: Page 0 of a file is called the leader page; it contains no
file data, but only a collection of file properties, all of which are
hints. The structure LD in AltoFileSys.D declares the format of a
leader page, which contains the following standard items:
The file name, a hint so that the Scavenger can enter this file in
a directory if it is not already in one.
The times for creation, last read and last write, interpreted as
A file's creation date is a stamp generated when the
information in the file is created. When a file is copied
(without modification), the creation date should be copied
with it. When a file is modified in any way (either in-place
or as a result of being overwritten by newly-created
information), a new creation date should be generated.
A file's write date is updated whenever that file is
physically written on a given file system.
Disks & Bfs April 10, 1982 5:46 PM 3
A file's read date is updated whenever that file is
physically read from within a given file system.
A pointer to the directory in which the file is thought to be
entered (zeroes imply the system directory SysDir).
A "hint" describing the last page of the file.
A "consecutive" bit which is a hint that the pages of the file lie
at consecutive virtual disk addresses.
The changeSerial field related to version numbering: whenever a
new version of a file "foo" is made, the changeSerial field of all
other files "foo" (i.e., older versions) is incremented. Thus, a
program that wishes to be sure that it is using the most recent
version of a file can verify that changeSerial=0. If a program
keeps an FP as a hint for a file, and is concerned about the
relative position of that file in the list of version numbers, it
can also keep and verify the changeSerial entry of the file.
Version numbers have been deimplemented.
These standard items use up about 40 words of the leader page. The
remaining space is available for storing other information in blocks
which start with a one word header containing type and length fields.
A zero terminates the list. The structure FPROP in AltoFileSys.d
defines the header format. The only standard use of this facility is
to record the logical shape of the disk in the leader page of SysDir.
Data: The first data byte of a file is the first byte of page 1.
In a legal file with n pages, the label field of page i must contain:
A next link with the real disk address of page (i+1), or 0 if i=n-1.
A previous link with the real disk address of page (i-1), or 0 if
A page number between 0 and (n-1), inclusive.
A numchars word = 512 if i>FP.serialNumber.word1
portion of the file pointer. This allows the directory and random
bits to be set in the file id.
If useOldFp is true, then filePtr already points to a legal file;
the purpose of calling CreateDiskFile is to re-write all the labels
of the existing file with the new serial number, and to re-
initialize the leader page. The data contents of the original file
are lost. Note that this process effectively "deletes" the file
described by filePtr when CreateDiskFile is called, and makes a new
file; the FP for the new file is returned in filePtr.
If pageBuf is supplied, it is written on the leader page of the new
file after setting the creation date and directory FP hint (if
supplied). If pageBuf is omitted, a minimal leader page is created.
DeleteDiskPages(disk, CA, firstDA, filePtr, firstPage, newFp,
Arguments beyond firstPage are optional. Deletes the pages of a file,
starting with the page whose number is firstPage and whose disk address
is firstDA. CA is a page-sized buffer which is clobbered by the
routine. hintLastPage is as described under ActOnDiskPages.
If newFp is supplied and nonzero, it (rather than freePageFp) is
installed as the FP of the file, and the pages are not deallocated.
6. Allocating Disk Space
The disk class also contains routines for allocating space and for
converting between virtual and real disk addresses. In most cases,
users need not call these routines directly, as the four routines given
above (ActOnDiskPages, WriteDiskPages, DeleteDiskPages, CreateDiskFile)
manage disk addresses and disk space internally.
AssignDiskPage(disk, virtualDA, nil) returns the virtual disk address
of the first free page following virtualDA, according to the bit table,
Disks & Bfs April 10, 1982 5:46 PM 11
and sets the corresponding bit. It does not do any checking that the
page is actually free (but WriteDiskPages does). If there are no free
pages it returns -1. If it is called with three arguments, it returns
true if (virtualDA+1) is available without assigning it.
If virtualDA is eofDA, AssignDiskPage makes a free-choice assignment.
The disk object remembers the virtual DA of the last page assigned and
uses it as the first page to attempt to assign next time AssignDiskPage
is called with a virtualDA of eofDA. This means that you can force a
file to be created starting at a particular virtual address by means of
the following strategy:
ReleaseDiskPage(disk, AssignDiskPage(disk, desiredVDA-1))
CreateDiskFile(disk, ...) // or whatever (e.g., OpenFile)
ReleaseDiskPage(disk, virtualDA) marks the page as free in the bit
table. It does not write anything on the disk (but DeleteDiskPages
VirtualDiskDA(disk, lvRealDA) returns the virtual disk address, given a
real disk address in rv lvRealDA. (The address, lvRealDA, is passed
because a real disk address may occupy more than 1 word.) This
procedure returns eofDA if the real disk address is zero (end-of-file),
and fillInDA if the real disk address does not correspond to a legal
virtual disk address in this file system.
RealDiskDA(disk, virtualDA, lvRealDA) computes the real disk address
and stores it in rv lvRealDA. The function returns true if the virtual
disk address is legal, i.e. within the bounds of disk addresses for the
given "disk." Otherwise, it returns false.
7. Lower-level Disk Access
The transfer routines described previously have the property that all
disk activity occurs during calls to the routines; the routines wait
for the requested disk transfers to complete before returning.
Consequently, disk transfers cannot conveniently be overlapped with
computation, and the number of pages transferred consecutively at full
disk speed is generally limited by the number of buffers that a caller
is able to supply in a single call.
It is also possible to use the disk routines at a lower level in order
to overlap transfers with computation and to transfer pages at the full
speed of the disk (assuming the file is consecutively allocated on the
disk and the amount of computation per page is kept relatively small).
The necessary generic disk operations and other information are
available to permit callers to operate the low-level disk routines in a
device-independent fashion for most applications.
This level makes used of a Command Block Zone (CBZ), part of whose
structure is public and defined in Disks.d, and the rest of which is
private to the implementation. The general idea is that a CBZ is set
up with empty disk command blocks in it. A free block is obtained from
the CBZ with GetDiskCb and sent to the disk with DoDiskCommand. When
it is sent to the disk, it is also put on the queue which GetDiskCb
uses, but GetDiskCb waits until the disk is done with the command
before returning it, and also checks for errors.
Disks & Bfs April 10, 1982 5:46 PM 12
If you plan to use these routines, read the code for ActOnDiskPages to
find out how they are intended to be called. An example of use of
these routines in a disk-independent fashion (i.e., using only the
public definitions in Disks.d) may be found in the DiskStreamsScan
module of the Operating System. Only in unusual applications should it
be necessary to make use of the implementation-dependent information in
Bfs.d or Tfs.d.
InitializeDiskCBZ(disk, cbz, firstPage, length, retry, lvErrorRoutine).
CBZ is the address of a block of length words which can be used to
store CBs. It takes at least three CBs to run the disk at full speed;
the disk object contains the values DSK.lengthCBZ (fixed overhead) and
DSK.lengthCB (size of each command block) which may be used to compute
the required length (that is, length should be at least
lengthCBZ+3*lengthCB). FirstPage is used to initialize the currentPage
field of the cbz. Retry is a label used for an error return, as
described below. lvErrorRoutine is an error routine for unrecoverable
errors, described below; it defaults to a routine that simply invokes
SysErr. The arguments after firstPage can be omitted if an existing
CBZ is being reinitialized, and they will remain unchanged from the
cb = GetDiskCb(disk, cbz, dontClear[false], returnIfNoCB[false])
returns the next CB for the CBZ. If the next CB is empty (i.e., it has
never been passed to DoDiskCommand), GetDiskCb simply zeroes it and
returns it. However, if the next CB is still on the disk command
queue, GetDiskCb waits until the disk has finished with it. Before
returning a CB, GetDiskCb checks for errors, and handles them as
described below. If there is no error, GetDiskCb updates the nextDA
and currentNumChars cells in the CBZ, then calls
cbz>>CBZ.cleanupRoutine(disk, cb, cbz). Next, unless dontClear is
true, the CB is zeroed. Finally, the CB is returned as the value of
GetDiskCb. If returnIfNoCB is true, GetDiskCb returns zero if there
are no CBs in the CBZ or the next CB is still on the disk command
If the next CB has suffered an error, then GetDiskCb instead takes the
following actions. First it increments cbz>>CBZ.errorCount. If this
number is ge the value disk>>DSK.retryCount, GetDiskCb calls the error
routine which was passed to InitializeDiskCBZ; the way this is done is
explained in the description of ActOnDiskPages above. (If the error
routine returns, GetDiskCb will proceed as if an error hadn't
occurred.) Otherwise, after doing a restore on the disk if errorCount
ge disk>>DSK.retryCount/2, it reinitializes the CBZ with firstPage
equal to the page with the error, and returns to cbz>>CBZ.retry (which
was initialized by InitializeDiskCBZ) instead of returning normally.
The idea is that the code following the retry label will retry all the
incomplete commands, starting with the one whose page number is
cbz>>CBZ.currentPage and whose disk address is cbz>>CBZ.errorDA.
DoDiskCommand(disk, cb, CA, DA, filePtr, pageNumber, action, nextCb)
Constructs a disk command in cb with data address CA, virtual disk
address DA, serial and version number taken from the virtual file id in
filePtr, page number taken from pageNumber, and disk command specified
by action. The nextCb argument is optional; if supplied and nonzero,
DoDiskCommand will "chain" the current CB's label address to nextCb, in
such a way that the DL.next word will fall into nextCb>>CB.diskAddress.
DoDiskCommand expects the cb to be zeroed, except that the following
Disks & Bfs April 10, 1982 5:46 PM 13
fields may be preset; if they are zero the indicated default is
labelAddress lv cb>>CB.label
If DA eq fillInDA, the real disk address in the command is not set (the
caller should have either set it explicitly or passed the CB as the
nextCb argument for a previous command). Actions are checked for
The public cells in the CBZ most likely to be of interest are the
client: information of the caller's choosing (e.g., a pointer to a
related higher-level data structure such as a stream.)
cleanupRoutine: the cleanup routine called by GetDiskCb (defaulted
to Noop by InitializeDiskCBZ).
currentPage: set to the firstPage argument of InitializeDiskCBZ and
not touched by the other routines. (Note, however, that GetDiskCb
calls InitializeDiskCBZ when a retry is about to occur, so when
control arrives at the retry label, currentPage will be set to the
page number of the command that suffered the error.)
errorDA: set by GetDiskCb to the virtual disk address of the command
that suffered an error.
nextDA: set by GetDiskCb to the virtual disk address of the page
following the one whose CB is being returned. (This information is
obtained from the next pointer in the current page's label. Note
that errorDA and nextDA are actually the same cell, but they are
used in non-conflicting circumstances.)
currentNumChars: set by GetDiskCb to the numChars of the page whose
CB is being returned.
head: points to the first CB on GetDiskCb's queue; contains zero if
the queue is empty.
8. Error Codes
The following errors are generated by the BFS. Similar errors are
generated by other instances of a disk object.
1101 unrecoverable disk error
1102 disk full
1103 bad disk action
1104 control block queues fouled up
1105 attempt to create a file without creation ability
1106 can't create an essential file during NewDisk
1107 bit table problem during NewDisk
1108 attempt to access nonexistant bit table page
Disks & Bfs April 10, 1982 5:46 PM 14
9. Implementation -- Bfs
The implementation expects a structure BFSDSK to be passed as the
"disk" argument to the routines. The initial portion of this structure
is the standard DSK structure followed by a copy of the DiskDescriptor
header and finally some private instance data for the disk in use.
(Note: The Alto operating system maintains a static sysDisk that points
to such a structure for disk drive 0.)
Bfs ("Basic File System") is the name for a package of routines that
implement the disk class for the standard Alto disks (either Diablo
Model 31 drives or a single Diablo Model 44 drive). The definitions
(in addition to those in AltoFileSys.D and Disks.D) are contained in
Bfs.D. The code comes in two "levels:" a "base" for reading and
writing existing files (implements ActOnDiskPages, RealDiskDA and
VirtualDiskDA only); and a "write" level for creating, deleting,
lengthening and shortening files (implements WriteDiskPages,
CreateDiskFile, DeleteDiskPages, AssignDiskPage, ReleaseDiskPage). The
source files BfsBase.Bcpl, Dvec.Bcpl and BfsMl.Asm comprise the base
level; files BfsWrite.Bcpl BfsCreate.bcpl, BfsClose.bcpl, and
BfsDDMgr.bcpl implement the write level.
BfsMakeFpFromLabel(fp, la) constructs a virtual file id in the file
pointer fp from the real file id in the label la.
disk = BFSInit(diskZone, allocate[false], driveNumber, ddMgr,
freshDisk[false], tempZone[diskZone]) returns a disk object for
driveNumber or zero. The permanent data structures for the disk are
allocated from diskZone; temporary free storage needed during the
initialization process is allocated from tempZone. If allocate is
true, the machinery for allocating and deallocating disk space is
enabled. If it is enabled, a small DDMgr object and a 256 word buffer
will be extracted from diskZone in order to buffer the bit table. A
single DDMgr, created by calling 'ddMgr = CreateDDMgr(zone)', can
manage both disks. If freshDisk is true, BFSInit does not attempt to
open and read the DiskDescriptor file. This operation is essential for
creating a virgin file system.
success = BFSNewDisk(zone, driveNum, nDisks[number spinning],
nTracks[physical size], dirLen, nSectors[physical size]) creates
a virgin Alto file system on the specified drive and returns true if
successful. The zone must be capable of supplying about 1000 words of
storage. The logical size of the file system may be different from the
physical size of driveNum: it may span both disks (a 'double-disk file
system'), or it may occupy fewer tracks (a model 44 used as a model
31). The length in words of SysDir, the master directory, is specified
by dirLen. Some machines that emulate Altos implement 14 sectors per
BFSExtendDisk(zone, disk, nDisks, nTracks) extends (i.e. adds pages to)
the filesystem on 'disk'. Presumably 'nDisks' or 'nTracks' or both is
bigger than the corresponding parameters currently in disk. A single
model 31 may be extended to a double model 31 or a single model 44 or a
double model 44, and a single model 44 may be extended to a double
model 44. The zone must be capable of supplying about 750 words of
0 = BFSClose(disk, dontFree[false]) destroys the disk object in an
Disks & Bfs April 10, 1982 5:46 PM 15
orderly way. If dontFree is true, the ddMgr for the disk is not
destroyed; presumably it is still in use by the other disk. (Note that
this procedure is the one invoked by the CloseDisk generic operation.)
BFSWriteDiskDescriptor(disk) insures that any important state saved in
memory is correctly written on the disk.
virtualDA = BFSFindHole(disk, nPages) attempts to find a contiguous
hole nPages long in disk. It returns the virtual disk address of the
first page of a hole if successful, else -1.
BFSTryDisk(drive, track, sector) returns true if a seek command to
the specified track on the specified drive is successful. Note that
the drive argument can contain an imbedded partition number. Seeks to
track zero will fail if the drive is not on line. Seeks to track
BFS31NTracks+1 will fail if the drive is a model 31.
10. Implementation -- Tfs
Operation and implementation of the Trident T80 disks is described in
separate documentation under the heading "TFS/TFU" in Alto Subsystems
BFSTest is the test and utility program for the Basic File System. It
performs many of the same functions as TFU (in fact much of its code is
lifted from TFU), except that commands which are better provided by
other subsystems such as the Executive (copy, rename, delete, etc.) are
omitted. It has a conventional command scanner and implements the
ERASE formats one or more disks as an Alto file system. It asks
you to specify the number of disks, cylinders and sectors. Any
sectors marked "incorrigable" (by the CERTIFY command, below) are
not included in the file system and will never again cause trouble
unless the disk is erased with DiEx. The OS erase command also
preserves this bad spot information.
EXERCISE creates, deletes, reads, writes and positions files the same
way that normal programs do, and checks the results which normal
programs do not do. These high-level operations cause patterns of
disk commands which are quite different from those generated by
lower-level tests such as DiEx.
Exercise assumes that a file system exists on DP0 (and perhaps
extends on to DP1). It creates as many test files (named
Test.001, Test.002, ...) as will fit in the file system, filling
each file with a carefully chosen test pattern. When it is done,
it deletes all of its files. One 'pass' consists of stepping
through the test files, performing a randomly chosen operation on
the file, and checking the results. The duration and throughness
of a pass depends on the amount of free space on the disks.
Disks & Bfs April 10, 1982 5:46 PM 16
Ideally, the disk(s) under test have just been erased with the
ERASE command, below.
While running, exercise looks for commands from the keyboard. The
current commands are:
Q Quit Delete all test files and stop.
S StopOnError Wait until a character is typed.
All test files are 100 pages long. Each page of a file has the
page number in its first and last words and a data pattern in the
middle 254 words. The data pattern is constant throughout a file,
consisting of a single one-bit in a word of zeros or a single
zero-bit in a word of ones. Files are read and written with
ReadBlock and WriteBlock using buffers whose lengths are not
multiples of the page size. The operations are:
Write Write the entire file with the data pattern.
Read Read the entire file checking the data pattern.
Delete Delete the file, create it again and then write
Copy Copy the file to some other randomly chosen
file. If both disks are being tested, one
third of the time pick a destination file on
the other disk.
Position Position to twenty randomly chosen pages in the
file. Check that the first word of the page is
indeed the page number. One third of the time
dirty the stream by writing the page number in
the last word of the page.
CERTIFY tests one or more disks for bad spots and marks such pages
"incorrigable" so that they will not be included in subsequent
file systems. Ideally, a disk should be certified before being
used. Certify can be stopped at any time by hitting any key.
A bad spot is any sector which gives three or more checksum
errors. If the Scavenger encounters a bad spot, it will also mark
the sector incorrigable. Alto and Dorado disks almost never have
bad spots; but Dolphin disks typically have a few, so CERTIFYing
is a must for trouble-free operation. Note that bad spot
information will be lost if a certified pack is written by
COPYDISK or DIEX.
CREATEFILE attempts to create a contiguous file of a specified size.
If it can't, it creates a file with a minimum number of page runs.
PARTITION allows you to set the disk partition on which BFSTest will
operate. This command is only available on Dolphins and Dorados.
A thorough, high-level "acceptance test" for the disk subsystem of an
Alto or Alto-emulating machine (i.e. a Dolphin or Dorado) consists of
at least 100 passes of CERTIFY with less than 5 bad spots (any bad
spots on cylinder 0 are unacceptable), followed by ERASE, followed by
at least 10 passes of EXERCISE with no errors.