Find - a file searching subsystem



The Find subsystem allows you  to search text files at very  high speed
on an Alto.  Examples of such files might be an address/telephone list,
a source program, or a library catalog.

Find looks for every occurrence of a pattern in a file, much like doing
repeated Jump commands in Bravo.  A pattern is an ordinary character
sequence, with the following exceptions:
      # in  a pattern means  "any character at  all", e.g. CAP  and CUP
count as occurrences of the pattern C#P.
      ~ in a  pattern means "allow one  character in the  occurrence to
disagree  with  the  corresponding  character  in  the  pattern".   For
example, LAP, CUP, and CAT all count as occurrences of the pattern ~CAP
(or CAP~ or C~AP).  Two  ~s mean "allow two disagreements", and  so on.
Note  that "disagreement"  only  means substituting  one  character for
another: it does not include insertions (e.g. CLAP for  CAP), deletions
(CP for CAP), or transpositions (CPA for CAP).
      To search for a literal # or ~, type a ' before it, e.g. to search
for the character sequence ATOM #, type ATOM '#.
      Unless you use  the /c (Case)  switch described below,  upper and
lower case letters are considered identical, e.g. Cap, cap, and CAP all
count as occurrences of CAP or of cap.
      Unless  you use  the /s  (Space) switch  described  below, blanks
(spaces) in the file  are completely ignored, e.g.  C A P counts  as an
occurrence of CAP; blanks in the pattern are also ignored.
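
The pattern rules above can be sketched in Python as follows.  This is
only an illustration of the rules as stated, not Find's actual
implementation (which is not described here).  The names parse_pattern
and find_occurrences, and the keyword arguments case and spaces (standing
in for the /c and /s switches), are hypothetical.  The sketch treats line
ends as ordinary characters, which is an assumption.

```python
def parse_pattern(pattern, keep_spaces=False):
    """Return (elements, allowed_mismatches) for a Find-style pattern.

    Each element is ("any", None) for an unquoted #, or ("lit", ch).
    """
    elements = []
    mismatches = 0
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "'" and i + 1 < len(pattern):
            # A ' makes the next character literal, even # or ~.
            elements.append(("lit", pattern[i + 1]))
            i += 2
            continue
        if ch == "~":
            mismatches += 1          # each ~ allows one disagreement
        elif ch == "#":
            elements.append(("any", None))
        elif ch == " " and not keep_spaces:
            pass                     # blanks in the pattern are ignored
        else:
            elements.append(("lit", ch))
        i += 1
    return elements, mismatches


def find_occurrences(text, pattern, case=False, spaces=False):
    """Return start offsets (in the original text) of all occurrences."""
    elements, allowed = parse_pattern(pattern, keep_spaces=spaces)
    if not elements:
        return []
    # Keep (offset, char) pairs so results refer to the original text
    # even after blanks have been dropped.
    chars = [(i, c) for i, c in enumerate(text) if spaces or c != " "]
    fold = (lambda c: c) if case else str.lower
    hits = []
    for start in range(len(chars) - len(elements) + 1):
        bad = 0
        for k, (kind, want) in enumerate(elements):
            got = chars[start + k][1]
            if kind == "lit" and fold(want) != fold(got):
                bad += 1
                if bad > allowed:
                    break
        else:
            hits.append(chars[start][0])
    return hits
```

Note that a disagreement is counted position by position, so the sketch
never shifts the pattern to account for an inserted or deleted
character, in keeping with the rule for ~ above.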

There are two ways to invoke Find.  The first way just searches  a file
for one pattern:
      >Find filename pattern
(Since the Executive does something special about @, #, %, *, ↑,  and ;
in command  lines, you  must precede  any of  these characters  in your
pattern  by a  '.   This is  in  addition to  any  's you  may  need as
described in the preceding  paragraph.)  The second way  only specifies
the file:
      >Find filename
and Find then prompts you repeatedly for patterns.  To leave  Find when
using it this way, use  shift-Swat or type an empty pattern  (just type
<return> when  Find says Pattern:).   You can also  use Find  to search
several files together, by invoking it with
      >Find/m filename1 ... filenamen
which also prompts you for patterns.
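
The Executive quoting rule for the one-line form can be sketched as a
small helper that inserts a ' before each character the Executive treats
specially.  The name quote_for_executive is hypothetical, and the sketch
follows only the rule stated above; how the Executive itself handles a '
that is already in the pattern is not described here.

```python
# Characters the Executive treats specially in command lines.
EXECUTIVE_SPECIALS = "@#%*↑;"


def quote_for_executive(pattern):
    """Insert a ' before every character the Executive treats specially."""
    return "".join(
        "'" + ch if ch in EXECUTIVE_SPECIALS else ch for ch in pattern
    )
```

For example, the wildcard pattern C#P would be typed as C'#P, and a
pattern that already quotes a literal # for Find ('#) would be typed
as ''#.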

In any of the above command lines, you can also use the /c and/or /s
switches described above, i.e. any of the forms
      >Find/s filename pattern
      >Find/s filename
      >Find/ms filename1 ... filenamen
The switches may be in any order or combination, e.g.
      >Find/csm filename1 ... filenamen
tells Find to search  filename1 ... filenamen treating upper  and lower
case as different  and not ignoring spaces.   This also applies  to the
switches described below.
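
As an illustration of "any order or combination", the following Python
sketch splits a command word such as Find/csm into the program name and
a set of switch letters, so that /csm, /msc, and /scm are equivalent.
The function name parse_command is hypothetical, and rejecting unknown
letters is an assumption; this document does not say how Find treats
them.

```python
# Switch letters taken from the alphabetic summary of switches.
KNOWN_SWITCHES = {
    "a": "write All matches on file",
    "b": "item = text between Blank lines",
    "c": "distinguish between upper and lower Case",
    "m": "Multiple files",
    "o": "write Octal position on Find.Matches",
    "p": "item = Bravo Paragraph",
    "s": "consider Spaces significant",
    "v": "write Verbatim on Find.Matches",
    "w": "only Write on Find.Matches, don't display",
}


def parse_command(word):
    """Split e.g. 'Find/csm' into ('Find', {'c', 's', 'm'})."""
    name, _, letters = word.partition("/")
    switches = set(letters)          # a set, so order does not matter
    unknown = switches - set(KNOWN_SWITCHES)
    if unknown:
        # Assumption: unknown letters are treated as an error.
        raise ValueError("unknown switch(es): " + "".join(sorted(unknown)))
    return name, switches
```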


                             ------------
                   Copyright Xerox Corporation 1979


Find                       November 6, 1979                           2




After completing the search, Find  displays at the top of the  screen a
summary of the form:
      79 occurrences, 1200 ms, 150 pages
giving the total number  of occurrences, the time in  milliseconds, and
the number of disk pages in the file.  In the remainder of  the screen,
Find displays the  line containing each  occurrence of a  pattern, with
the occurrence indicated  in boldface.  To the  left of the  line, Find
displays the character  position in the  file where the  occurrence was
found.  After each screenful, Find waits for you to type <space> if you
want more, or <del> if you don't.

In addition to displaying matches on the screen, Find always writes the
lines containing matches on a file called Find.Matches.  Normally, Find
only writes  those matches which  it displayed, so  if you  stopped the
display of matches with <del>, only those matches you actually saw will
appear on the file.  However, if you use the /a (All) switch, Find will
write all matches on the file, not just the ones you saw  displayed; if
you use the /w (Write only) switch, Find will write all matches  on the
file and not display them at all.
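
The three cases can be summarized in a short Python sketch.  The
function plan_output and its arguments are hypothetical: total is the
number of matches found, and seen is the number displayed before you
typed <del> (equal to total if you never stopped the display).

```python
def plan_output(total, seen, switches=frozenset()):
    """Return (matches displayed, matches written on Find.Matches)."""
    if "w" in switches:
        return 0, total        # /w: write everything, display nothing
    if "a" in switches:
        return seen, total     # /a: write everything, display as usual
    return seen, seen          # default: write only what was displayed
```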

What Find finds  for you is all  the "items" containing  occurrences of
the  pattern.   Normally an  "item"  is  just a  single  line  of text,
delimited by  <cr> on both  ends.  However, Find  also knows  about two
other kinds of items:  Bravo paragraphs, and groups of  lines separated
from each other by a blank line.  If you use the /p (Paragraph) switch,
Find will  show (display  and write on  Find.Matches) the  entire Bravo
paragraph containing the  occurrence.  If you  use the /b  (Blank line)
switch, Find will show everything surrounding the occurrence, from the
nearest preceding blank line to the nearest following blank line.
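
A minimal Python sketch of the line and blank-line item rules follows.
The function enclosing_item is hypothetical.  It assumes lines end with
<cr> (written "\r" here) and that a line containing only spaces counts
as blank, which this document does not specify.  Bravo paragraphs (/p)
are not sketched because the Bravo file format is not described here.

```python
def enclosing_item(text, pos, mode="line", eol="\r"):
    """Return the item of text that contains character offset pos.

    mode="line":  the single line containing pos (the default item).
    mode="blank": the lines around pos up to the nearest blank lines.
    """
    lines = text.split(eol)
    # Locate the line containing pos.
    offset = 0
    for index, line in enumerate(lines):
        if pos <= offset + len(line):
            break
        offset += len(line) + len(eol)
    if mode == "line":
        return lines[index]
    # Grow the item backward and forward until a blank line is reached.
    first = index
    while first > 0 and lines[first - 1].strip() != "":
        first -= 1
    last = index
    while last + 1 < len(lines) and lines[last + 1].strip() != "":
        last += 1
    return eol.join(lines[first:last + 1])
```

With an address list whose entries are separated by blank lines, the
blank-line mode returns the whole entry rather than just the line that
matched.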

So that you can examine Find.Matches with Bravo, Find  normally removes
any  character  sequences  that  Bravo  might  confuse  with   its  own
formatting information.  There are two exceptions to this.  If  you ask
for entire paragraphs (/p  switch), Find preserves the  formatting.  If
for  some  reason  you  want the  characters  around  the  match copied
regardless of their possible  interpretation by Bravo (e.g. if  you are
searching a binary file or some unusual kind of text file), you can use
the /v (Verbatim) switch, which instructs Find not to  remove sequences
that look like Bravo formatting; if you do this, you will  probably not
be able to read the file into Bravo with the ordinary Get  command, but
should use the ↑Z (unformatted Get) command instead.

Find  normally  displays,  but  does  not  write  on  Find.Matches, the
position of  each occurrence within  the file, in  octal.  If  you want
this number written on Find.Matches as well, use the /o (Octal) switch.
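
Since the position is given in octal, it may need converting before use
with tools that expect decimal offsets.  A small Python sketch follows;
the function names are hypothetical, and the exact layout Find uses for
the number (width, leading zeros) is not stated here.

```python
def to_octal(position):
    """Format a character position as octal digits."""
    return format(position, "o")


def from_octal(digits):
    """Convert octal digits, as shown by Find, to an integer position."""
    return int(digits, 8)
```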

Find produces a large number of error messages.  The messages
      Pattern too long
      Can't preallocate
      RAM full
all mean the same  thing, namely that your  pattern is too long  or too
complicated (unfortunately,  it is too  complicated to  explain exactly
what "too complicated" means).  The message
      Can't load RAM
means that  your Alto has  old or non-standard  ROMs and Find  can't do
what it needs to do:  you should contact a hardware  maintainer.  (This
should never happen on Alto II's.)

Find has many obvious limitations.  They can all be removed  if people
complain about them.  The  following features could also be  added upon
request:
      Boolean combinations of matches, maybe.
      Ability to work with Trident disks.
      Possibly other features requested by users.
Programmers  should note  that the  file searching  capability  is also
available as a library package (called FindPkg), so programs can use it
as well as people.

Alphabetic summary of switches:
      /a - write All matches on file
      /b - item = text between Blank lines
      /c - distinguish between upper and lower Case
      /m - Multiple files
      /o - write Octal position on Find.Matches
      /p - item = Bravo Paragraph
      /s - consider Spaces significant
      /v - write Verbatim on Find.Matches (don't strip possible
           formatting)
      /w - only Write on Find.Matches, don't display

History of changes:

Release of October 30, 1979

      Added /o (write octal position), /v (verbatim output  of matches,
i.e. don't flush Bravo formatting), /a (write all matches to file), and
/w  (only  write  matches, don't  display).   Fixed  bugs  which caused
display garbage and occasional crashes when lines were very long, and
an infinite loop when searching files containing <del>s.  Changed default
to remove  Bravo formatting from  matches file unless  /p or  /v switch
set.

Release of January 16, 1978

      Added  /c  (distinguish  upper  and  lower  case),  /p   (item  =
paragraph), and /b (item = between blank lines) switches.