ref: 945bd1823f970922595fd1cdcf0a9030c7ffe910
parent: 79008d44c010419040878e1bf0c438f37f5163bd
author: henesy <unknown>
date: Sat Nov 3 21:28:43 EST 2018
init 2
--- /dev/null
+++ b/INSTALL
@@ -1,0 +1,86 @@
+Installing hosted Inferno from source
+
+Overview
+
+ Like the native kernels, emu relies on several auxiliary
+libraries (the source of which it often shares with the
+native kernels). Emu itself is built by the mkfile in the
+emu subdirectory containing the platform-specific source for
+the host platform. Each library has its own mkfile; the
+various components are made in the right order by the
+mkfile at the root of the Inferno tree. The mkfile for
+each platform will also invoke mk recursively to make the
+appropriate libraries for a given configuration.
+
+ The Unix variants of emu are generally covered by
+`POSIX' (with common extensions), but each Unix port has one
+file that differs considerably from the others, namely
+emu/platform/os.c; the differences correspond to the
+different ways under Unix of implementing kernel-scheduled
+threads efficiently.
+
+ There are working emu versions for FreeBSD/386,
+Irix/mips, Linux/386, NetBSD/386, MacOSX/386, MacOSX/power,
+Plan 9, Solaris/sparc, and Windows (NT, 2000 and Explorer
+plug-in). Each platform typically uses mechanisms specific
+to the host operating system to implement Inferno's internal
+thread/process structure. POSIX threads have often been
+found to be insufficient (poorly implemented) on some plat-
+forms, and if so are avoided. See kproc in emu/*/os.c.
+
+ Source is included for ports to HP/UX (S800 architec-
+ture), Solaris/386, and Unixware, in case someone wishes to
+take them up now, but we have not determined their fitness.
+
+ The Plan 9 hosted implementation is unusual in that it
+supports several processor types: 386, mips, power (Power
+PC) and sparc. Furthermore, all versions of emu can be
+built on any processor type, in the usual way for Plan 9.
+
+ Otherwise, as distributed, emu for a platform can only
+be built when running on that platform.
+
+ One unusual variant makes the whole of Inferno a plug-
+in for Microsoft's Internet Explorer, giving the same envi-
+ronment for Inferno applications running in an HTML page as
+is provided by hosted or native Inferno. That is, there is
+not a distinct `applet' environment with special programming
+interfaces. The source for the various plug-in components
+is found in /tools/plugin and /usr/internet within the
+Inferno tree; they use the version of emu defined by the
+configuration file /emu/Nt/ie.
+
+Build steps
+
+ All the libraries and executables can be built in a
+tree containing only the source code. To do that for a sup-
+ported variant of hosted Inferno, on Unix or Plan 9, do the
+following in the root of the Inferno tree:
+
+1 Edit mkconfig to reflect your host environment,
+ specifically ROOT (which must be an absolute path
+ name), SYSHOST and OBJTYPE. The comments in the file
+ should help you choose.
+
+2 Run makemk.sh to rebuild the mk command, which is
+ used to build everything else.
+
+3 Set PATH (or path on Plan 9) to include the bin
+ directory for the platform, which will now contain the
+ mk binary just built. On Unix, export PATH.
+
+4 Then mk nuke to remove any extraneous object files.
+
+5 Finally, mk install to create and install the
+ libraries, limbo compiler, emu for hosted Inferno,
+ and auxiliary commands. The rules do that in an order
+ that ensures that the commands or libraries needed by a
+ later stage are built and installed first. (Note that
+ a plain mk will not suffice, because it does not put
+ the results in the search path.)
+
+Doing something similar on Windows or Plan 9 currently
+requires the executable for mk to be available in the
+search path, since there is no equivalent of makemk.sh.
+Otherwise the procedure is the same. On Plan 9, of course,
+the host system's normal version of mk should be adequate.
--- /dev/null
+++ b/README.md
@@ -1,0 +1,9 @@
+Inferno® is a distributed operating system, originally developed at Bell Labs, but now developed and maintained by Vita Nuova® as Free Software. Applications written in Inferno's concurrent programming language, Limbo, are compiled to its portable virtual machine code (Dis), to run anywhere on a network in the portable environment that Inferno provides. Unusually, that environment looks and acts like a complete operating system.
+
+The use of a high-level language and virtual machine is sensible but mundane. The interesting thing is the system's representation of services and resources. They are represented in a file-like name hierarchy. Programs access them using only the file operations open, read/write, and close. The 'files' may of course represent stored data, but may also be devices, network and protocol interfaces, dynamic data sources, and services. The approach unifies and provides basic naming, structuring, and access control mechanisms for all system resources. A single file-service protocol (the same as Plan 9's 9P) makes all those resources available for import or export throughout the network in a uniform way, independent of location. An application simply attaches the resources it needs to its own per-process name hierarchy ('name space').
+
+The system can be used to build portable client and server applications. It makes it straightforward to build lean applications that share all manner of resources over a network, without the cruft of much of the 'Grid' software one sees.
+
+Inferno can run 'native' on various ARM, PowerPC, SPARC and x86 platforms but also 'hosted', under an existing operating system (including FreeBSD, Irix, Linux, MacOS X, Plan 9, and Solaris), again on various processor types.
+
+This Bitbucket project includes source for the basic applications, Inferno itself (hosted and native), all supporting software, including the native compiler suite, essential executables and supporting files.
--- /dev/null
+++ b/doc/20010618.ms
@@ -1,0 +1,941 @@
+.TL
+Inferno 3rd Edition \- June 2001 Revision
+.br
+Release Notes and Errata
+.AI
+Vita Nuova
+support@vitanuova.com
+.br
+18 June 2001
+.SP 4
+.NH 1
+Installation
+.LP
+If you have a previous version of Inferno installed, this one must not
+be installed over it.
+If you have a Lucent Inferno 2.3 release from 1999 or earlier, you
+should make a completely new installation from this CD.
+If you have installed the Vita Nuova `Binary and Limited Source' CD you
+should also make a new installation from this CD.
+If you have previously installed the Vita Nuova Full Source CD from July 2000,
+see the section `Updating the July 2000 release' at the end of this document.
+(You could also make a fresh installation in a new directory, or remove the old
+release and install this one if there are no files you wish to keep.)
+.LP
+Follow the installation instructions in ``Installation of the Inferno Software''
+at the end of Volume 2.
+The printed copy has several mistakes:
+.IP \(bu
+Page 327 notes that on Unix
+if the installation is done as user
+.CW root
+but a user
+.CW inferno
+exists the files will be owned by
+.CW inferno .
+They are not.
+Indeed, on Unix there is no reason to install the package as
+the super-user
+.CW root .
+Do the installation as the user, perhaps
+.CW inferno ,
+that is to own the files.
+.IP \(bu
+It refers on page 327
+to the Windows installation program as
+.CW Nt-386.exe .
+The correct name is used on the next
+page in the actual instructions, namely
+.CW \einstall\esetup.exe .
+.IP \(bu
+The Windows
+.CW setup.exe
+will offer to create the target installation directory if it does not exist.
+On all other systems, you must first create the directory yourself.
+.IP \(bu
+Solaris and some other commercial Unix systems
+do not read the `Joliet' format section of the CD,
+which uses Unicode names, and
+they
+force the names in the non-Joliet portion on the CD to lower case, sometimes
+with hyphens converted to underscores.
+The installation script on Solaris therefore appears as
+.CW install/solaris_sparc.sh .
+The installation script itself works despite this.
+Alternatively, if the CD is mounted with the option
+.CW nomaplcase
+the system will not force the names to lower case
+and all names will appear as we intended
+(see the Solaris manual entry for
+.I mount_hsfs ).
+Future CDs will use Rock Ridge format as well as Joliet to avoid the problem.
+(We attempted that this time
+but ran into trouble on Windows systems because of an error in the CD writing program.)
+Linux and FreeBSD show the names we intended.
+.LP
+When installing on Windows NT or Windows 2000, the installation program determines
+if the current user belongs to the
+.CW Administrators
+group.
+If so, the Start Menu entry is created in the
+.CW "All Users"
+profile, otherwise it is created in the user's private profile.
+On Windows 95 and Windows 98, if the user has a private profile the
+Start Menu entry is created there,
+otherwise it is created in the main system Start Menu.
+Furthermore,
+.CW setup.exe
+looks for
+.CW "Start Menu"
+to install the shortcut, but that name is locale-dependent,
+and thus the shortcut will not be installed correctly in non-English locales,
+although Inferno itself will be installed successfully.
+.LP
+When the installation completes normally,
+it prints ``installation complete'', but on some platforms it can then print
+``Killed: ...'' followed by the command line used in the installation shell script.
+Provided it has said ``installation complete'' and there were no errors noted
+prior to that, the installation has succeeded.
+The ``Killed'' message results from the installation software
+running inside the Inferno environment having shut down that environment.
+It does not mean that installation failed.
+.LP
+If an installation does fail, for instance by running out of space,
+delete everything in the target directory before retrying the installation.
+Also see the ``Known Problems'' section below.
+.NH 1
+CD Number
+.LP
+There is a set of six unique numbers on the back of the CD case.
+You should keep the case or record the numbers:
+they identify your subscription, and you will need them
+to gain access to the subscribers' services we provide electronically.
+(They are randomly generated and usable in a 6/49 lottery!)
+.NH 1
+Hosted Operating System versions
+.LP
+The software was compiled on the operating system versions listed below.
+UNIX systems show the output from
+.CW "uname -a"
+and the version of the C compiler,
+if known.
+See the section
+.I "Known problems"
+below for a list of known problems for any given release.
+.TS
+center;
+lf(B) lf(R)w(4i) .
+Windows Nt Windows NT4.0 SP4
+Linux T{
+Linux vespa 2.2.9-19mdk #1 Wed May 19 19:53:00 GMT 1999 i586 ...
+.br
+(gcc-2.91.66)
+T}
+Solaris T{
+SunOS pazzo 5.6 Generic_105181-03 sun4u sparc SUNW,Ultra-5_10
+.br
+(gcc 2.95)
+T}
+FreeBSD 4.x T{
+FreeBSD outside 4.0-RELEASE FreeBSD 4.0-RELEASE #0: ... i386
+.br
+(gcc 2.95.2)
+T}
+HP/UX T{
+HP-UX hpserv1 B.10.20 A 9000/715 2013314861 ...
+.br
+(c89)
+T}
+IRIX 5.3 T{
+IRIX invece 5.3 11091812 IP22 mips
+.br
+(MIPS cc)
+T}
+Plan9/x86 Third Edition, updates to 5th June 2001
+.TE
+.LP
+The Windows version has also been tested and
+used extensively on the following variants:
+.DS
+.ft B
+Windows '95
+Windows '98
+Windows Me
+.ft R
+.DE
+We have also installed and run the system under Windows 2000, both
+client and server, but there is a problem with the cursor under Windows 2000 client (see below).
+.LP
+Inferno source code is included for the following, but they have been neither
+built nor tested:
+.DS
+.ft B
+Plan 9 (mips, sparc, power)
+Solaris/386
+Unixware v 2.3
+.ft R
+.DE
+.NH 1
+Known problems
+.LP
+See
+.CW www.vitanuova.com/inferno/
+for current pointers to information about Inferno.
+You should particularly check the Frequently Asked Questions at
+.P1
+www.vitanuova.com/inferno/faq.html
+.P2
+and the current Bugs list at
+.P1
+www.vitanuova.com/inferno/bugs.html
+.P2
+The Subscriber area will include online updates after 6 July 2001.
+.LP
+Now, the bad news:
+.IP \(bu
+The Windows NT installation program will create a Start Menu shortcut that invokes
+.I emu
+with a
+.CW "-g800x600"
+argument. The
+.I emu
+display will be incorrect if the screen width available is less than 800 pixels and is not a multiple
+of four. The problem can be seen if the PC is configured with a resolution of 800x600 pixels
+and the Microsoft Office shortcut bar is active. The simplest fix is to change the
+.I emu
+shortcut to use a
+.CW "-g788x600"
+argument instead.
+.IP \(bu
+As mentioned above, the Windows NT installation program looks only for
+.CW "Start Menu"
+not the locale-dependent name.
+.IP \(bu
+Changing the cursor image does not work under Windows 2000 Client;
+this mainly affects
+.I acme ,
+which changes the cursor when rearranging or resizing frames.
+.IP \(bu
+The HP version of hosted Inferno was generated by HP/UX version B.10.20.
+On the HP platform,
+.CW emu
+can currently only be run in interpreted mode; the compiled mode (ie,
+.CW -c1
+option)
+will fault.
+.IP \(bu
+Some Inferno calls return error strings provided by Windows, without modification.
+They can be obscure:
+for instance, ``windows error 10049'' is produced by network calls
+that attempt to use symbolic names when
+.I cs (8)
+has not been started.
+.IP \(bu
+On all systems, the mapping from Inferno names and permissions to the underlying
+operating system's names and permissions needs more work, particularly on Windows.
+Common problems include:
+.RS
+.IP \-
+File names in the Inferno environment cannot be longer than 27 bytes.
+.IP \-
+.I Ftpfs
+cannot access files with long names or names containing spaces.
+.IP \-
+The contents of
+.CW /dev/user
+on Windows will contain the Windows user name, which can contain spaces.
+.IP \-
+Files created inside the Inferno hierarchy by host system applications
+can sometimes have odd permissions when accessed within Inferno.
+.IP \-
+Read-only files and open files cannot be removed under Windows.
+.LP
+In general, the mapping between Inferno users and groups and Windows/Nt users and groups
+is systematic, as described in
+.I sys-stat (2):
+.QS
+.I Emu
+attempts to maintain a limited but consistent map
+between Inferno and NT worlds, specifically between Inferno
+names and NT security IDs.
+Special NT group `Everyone'
+represents `other' for file permissions. The Inferno uid is
+the file owner under NT; the Inferno gid reported is the
+first user in the file's ACL that is neither the owner nor
+Everyone; failing that, the gid is the file's owner.
+.QE
+.LP
+The effects of this mapping are sometimes peculiar: for instance, something that
+is thought of as a user appears as the group name in
+.I ls ,
+and vice-versa.
+Either the implementation or the mapping might need to be rethought.
+.RE
+.IP \(bu
+.I bufio (2)
+maintains an internal list of files open for output,
+to support its little-used
+.CW flush -all
+operation.
+Unfortunately, that means that if several processes
+use the same Bufio module instance (the result of a single
+.CW load )
+concurrently for output to different files, they must separately interlock the open, create and close calls.
+.IP \(bu
+The file
+.CW utils/5l/thumb.c
+contains comments in the C++ style, which (reasonably enough) are not accepted by the
+ANSI C compiler on the HP/UX system we used, and thus the current
+.CW 5l
+is not compiled for HP/UX; they will be changed to ANSI C comments by
+one of the online updates.
+.NH 1
+Contents
+.LP
+This revision offers the following, compared to the July 2000 release.
+.SH
+.I "Repairs and changes"
+.IP \(bu
+Many bug fixes and improvements appear throughout.
+Many commands have been revised to print usage and diagnostics consistently, and to give a reliable exit status for
+.CW sh .
+.IP \(bu
+The shell
+.CW sh
+has repairs and several visible changes:
+.I sh-expr (1)
+implements a
+.CW %
+operator for remainder;
+a new
+.CW @
+operator creates a sub-shell to execute a command, allowing the calling shell to be insulated
+by
+.CW pctl
+(see
+.I sh-std (1))
+from changes to name space and environment;
+environment variables are stored in printable form;
+a newline is allowed after a caret;
+and everyone's favourite change, the colon character
+.CW : ' `
+is no longer reserved in argument words
+(allowing URLs to be given without quoting).
+.IP \(bu
+Inferno's Acme has been revised to match the version in Plan 9 (Third Edition).
+In particular, the
+.CW Edit
+built-in has been added, allowing the use of structural regular expressions
+and the
+.CW sam
+command language for efficient editing within
+Acme,
+replacing the
+.CW /acme/edit
+suite of commands.
+The Acme panes have acquired a similar touch of colour.
+.IP \(bu
+Charon has also had many fixes and improvements, particularly to
+Javascript extraction, frames and layout code.
+Sometimes the `fix' requires making Charon mimic other browsers'
+interpretation of incorrect HTML.
+The PNG image format is now supported.
+Cookies are enabled by default (too many sites use them);
+parsing and production of the cookie file have been repaired.
+The progress bar is more compact.
+Support for longer SSL keys is enabled by default.
+.IP \(bu
+The software installation commands
+.CW install/*
+have been extensively revised,
+and documented by
+.I archfs (4)
+and
+.I create (8).
+They are based (though not entirely) on Russ Cox's update
+package for Plan 9.
+.IP \(bu
+.I format (8)
+can format an ordinary file, for instance to prepare flash partition
+contents in a hosted environment for a native Inferno device.
+.IP \(bu
+.CW mount
+allows the certificate file to be named directly.
+.IP \(bu
+.CW ns
+correctly quotes the fields in its output.
+.IP \(bu
+.CW /services/server/config
+no longer gives the unauthenticated
+.CW nobody ') (`
+option to
+.I styx (8);
+some unused entries have also been deleted.
+.IP \(bu
+.I srv (8)
+passes all arguments to servers it spawns, including the command name.
+It also ensures each server has its own process group, file descriptors
+and name space.
+.IP \(bu
+.I stack (1)
+has a new
+.CW -p
+option to add names to the source file search list.
+.IP \(bu
+.I deflate (2)
+correctly detects end-of-file;
+.I inflate (2)
+correctly decodes the combined compressed code-length tables
+.IP \(bu
+Tk's handling of objects in canvases is better:
+raise and lower work properly; stipple is implemented.
+.IP \(bu
+Tk now knows that a window pops up when resized, and adjusts
+the z-order accordingly.
+.IP \(bu
+Tk's scroller always expands the fraction of a
+.CW moveto
+when it evaluates the scrolling command.
+.SH
+.I "New commands and modules"
+.IP \(bu
+.I ftpfs (4)
+provides a way to make a remote FTP site visible in the Inferno name space
+(it was documented in the printed manual but not previously included with the system).
+.IP \(bu
+.I listen (1)
+provides a convenient way to listen for incoming calls to one or more Inferno services,
+with optional use of
+.I ssl (3)
+for authentication and encryption.
+This can replace the clumsy use of
+.I srv (8).
+.IP \(bu
+.I dial
+(see
+.I listen (1))
+is the complement of
+.I listen ;
+it dials a service, with optional authentication and encryption.
+.IP \(bu
+.I lockfs (4)
+enforces multiple reader, exclusive writer access to the contents of a name space.
+.IP \(bu
+The device
+.I prof (3)
+serves a name space for controlling the profiling of Limbo modules,
+and retrieving the resulting data.
+The module
+.I profile (2)
+offers a convenient interface, relating the profiling data to the source code.
+Finally,
+.I prof (1)
+is the command line interface to enable profiling and display the results.
+.IP \(bu
+.CW uuencode
+and
+.CW uudecode
+interpret a format used to encode binary data printably on Usenet and in mail messages;
+see
+.I uuencode (1).
+.IP \(bu
+.I Read (1)
+writes to standard output the result of a single
+.I sys-read (2)
+of a given number of bytes from standard input,
+with optional seek offset.
+(See
+.I getlines
+in
+.I sh-std (1)
+for a way to read a line from standard input.)
+.IP \(bu
+.I Tcs (1)
+uses
+.I convcs (2)
+to offer character set conversion on files.
+.IP \(bu
+.I wm-misc (1)
+mentions
+.CW wm/mand ,
+a browser for fractals,
+and
+.CW wm/polyhedra ,
+a polyhedra viewer
+.IP \(bu
+.I cfg (2)
+provides a module to read configuration files.
+.IP \(bu
+.I dividers (2)
+provides user-draggable dividing lines to separate Tk widgets,
+allowing screen space to be allocated to widgets by dragging a dividing line.
+.IP \(bu
+.I imagefile (2)
+offers support for reading PNG image files
+.SH
+.I "Interface changes and extensions"
+.IP \(bu
+.CW Url
+.CW /module/url.m ) (
+now requires an
+.CW init
+function to be invoked before other functions in the module
+.IP \(bu
+.I convcs (2)
+has changed its interface significantly (see the manual page);
+it also offers support for UTF-7.
+Even the module name has changed, to
+.CW Convcs
+(from
+.CW ConvCS ).
+.IP \(bu
+.I plumber (8)
+now returns an error to a message's sender if it cannot be plumbed,
+as was previously documented;
+it handles
+.CW ^
+correctly in regular expressions
+.IP \(bu
+.I readdir (2)
+returns all file names in union directories
+.IP \(bu
+.I string (2)'s
+quoting and unquoting functions are correct and more efficient
+.IP \(bu
+.I styxlib (2)
+forces an internal process into a new empty name space to
+allow detection of the last unmount of the served space by a file server
+.IP \(bu
+.I translate (2)
+supports writing Unicode characters in hexadecimal using Limbo's \f5\eu\fP\fIXXXX\fP
+syntax
+.IP \(bu
+.I workdir (2)
+returns a better guess at the current directory on native Inferno
+(hosted implementation will be supported in the next update)
+.IP \(bu
+.I cs (8)
+keeps a cache of recent translations;
+.I ipsrv (8)
+uses UDP/IP by default, not TCP/IP, for domain name lookup
+.IP \(bu
+.I httpd (8)
+can now be said to work as documented
+.IP \(bu
+.I wm-sh (1)
+provides a new control file
+.CW /chan/shctl
+to allow it to be kept informed
+of the current directory (and
+.CW /lib/wmsetup
+defines a Shell function to do that)
+and to implement buttons below the title bar,
+as with
+.I mash-tk (1).
+.IP \(bu
+A collection of playing card images has been added in
+.CW /icons/cards .
+.IP \(bu
+.CW /lib/wmsetup
+defines a
+.CW cd
+function to keep
+.CW wm/sh
+informed of the current directory;
+the
+.CW Web
+menu item is now called
+.CW Charon ;
+a new
+.CW Manual
+entry invokes
+.I wm-man (1);
+and the game
+.CW Tetris
+has been added to the
+.CW Misc
+submenu.
+.IP \(bu
+The manual pages
+.I draw-font (2),
+.I draw-image (2),
+and
+.I draw-screen (2)
+now document
+.CW display
+and
+.CW screen
+members of various data structures.
+.IP \(bu
+.I security-auth (2)
+and
+.I security-login (2)
+note that
+.CW keyring.m
+must be included.
+.IP \(bu
+In
+.I sh (2),
+the function
+.CW exec
+has been renamed
+.CW run ,
+to reflect more accurately what it does.
+.IP \(bu
+In
+.I sys-pctl (2),
+the type of the second parameter to
+.CW Sys->pctl
+is actually
+.CW "list of int"
+not
+.CW "list of ref Sys->FD" .
+.IP \(bu
+.I ssl (3)
+documents the new
+.CW encalg
+and
+.CW hashalg
+files, which list the supported algorithms.
+.IP \(bu
+.I canvas (9)
+describes the new
+.CW -winding
+option
+.IP \(bu
+.CW /services/cs/services
+lists the Software Download Server port
+and removes others that are obsolete.
+.IP \(bu
+.CW /services/server/config
+no longer gives the unauthenticated
+.CW nobody ') (`
+option to
+.I styx (8);
+some unused entries have also been deleted.
+.SH
+.I "Limbo compiler"
+.IP \(bu
+The Limbo compiler correctly clears reference values (eg,
+.CW list ,
+.CW ref
+and
+.CW array )
+where necessary when they go out of scope (eg, in loops), causing the storage to be reclaimed,
+and the values to be correctly
+.CW nil
+when the scope is next entered.
+.IP \(bu
+A bug that caused temporaries sometimes to be reused too early has been fixed.
+.IP \(bu
+The compile-time evaluation of floating-point
+.CW >
+no longer calculates
+.CW >=
+instead.
+.IP \(bu
+A little context has been added to `syntax error' messages.
+.SH
+.I "Compilers and architectures"
+.IP \(bu
+The system now supports the Thumb variant of the ARM architecture,
+including full interworking of Thumb code and 32-bit ARM code;
+there is a Thumb JIT compiler for the virtual machine, and support for
+Thumb mode in the disassembler and debugger.
+The Thumb compiler is
+.CW tc ;
+the existing ARM linker
+.CW 5l
+links both ARM and Thumb code.
+.IP \(bu
+A bug in
+.CW 5l
+has been fixed that could very occasionally place a literal pool inside a case table.
+.IP \(bu
+The compilers find include files correctly under Nt.
+.SH
+.I "Hosted and Native Inferno"
+.IP \(bu
+.I Emu
+has the following fixes and improvements:
+.RS
+.IP \-
+support for logging of memory pool usage
+.IP \-
+FreeBSD support uses
+.I rfork
+not
+.I pthreads .
+.IP \-
+Linux support no longer relies on being able to set the TSS register.
+.IP \-
+A memory leak when a kernel process exits has been fixed.
+.IP \-
+A start has been made on tidying up the audio support; the
+.CW svp.c
+file has gone and platforms that do not currently support audio
+no longer must include a stub driver.
+.IP \-
+The Linux and
+FreeBSD ports use an alternative implementation of Inferno graphics
+under X11 that should allow the program to run under 16-bit graphics and higher.
+.RE
+.IP \(bu
+For the native kernels only:
+.RS
+.IP \-
+The native kernel implements Rob Pike's `lexical names' (as in Plan 9), which helps
+provide more sensible data to
+.I workdir (2).
+The hosted Inferno implementation of lexical names will appear in a future update.
+Note that
+.CW /os/port/ns.c
+no longer exists, and native kernel configuration files should no longer mention
+.CW ns
+in the
+.CW port
+section.
+Drivers that serve a directory hierarchy might need to support the
+.CW DEVDOTDOT
+value for the table index in their
+.CW devgen
+function.
+.IP \-
+.CW os/ip/bootp.c
+will correctly time out if there is no response.
+.CW os/ip/ihbootp.c
+will now work with an RFC1084 BOOTP server.
+Note that both are likely to be replaced in future by
+Limbo applications that do BOOTP and DHCP.
+.IP \-
+.CW os/pc/cga.c
+provides a replacement for
+.CW screen.c
+for use when only a simple CGA console is needed or possible.
+.IP \-
+The file descriptor array is correctly freed when the file descriptor
+group is closed, fixing a memory leak.
+.IP \-
+A few files that were left off the July 2000 CD have been included this time.
+.IP \-
+ARM/Thumb kernels have been implemented; those implementations
+are not included in this update because the hardware is specialised, but ask if you could use them.
+.RE
+.LP
+The remaining points are common to both hosted and native Inferno:
+.IP \(bu
+A write on a closed pipe does not produce an exception if the process has been killed.
+.IP \(bu
+.I Devbwrite
+will not lose memory if an error is raised by the device driver (eg, interrupt).
+.IP \(bu
+.I ssl (3)
+has
+.CW encalg
+and
+.CW hashalg
+files
+.IP \(bu
+.I Kfs (3)
+offers
+.CW readonly
+and
+.CW readwrite
+commands, and an
+.CW ro
+(readonly)
+option to the
+.CW init
+request.
+.IP \(bu
+.I Srv (3)
+rejects attempts to create an unusable name containing a `/'.
+.IP \(bu
+The
+.CW NODEVS
+flag of
+.I sys-pctl (2)
+is correctly copied when the name space is duplicated.
+.\" the following is only in the 386 version
+.\" .IP \(bu
+.\" .I Prog (3)
+.\" can optionally give the correct Dis pc for the debugger even for compiled modules; if
+.\" .CW cflag
+.\" (or
+.\" .CW -c
+.\" option for
+.\" .I emu )
+.\" is 2 not 1, a mapping table is retained after JIT compilation.
+.SH
+.I "SA1100 native"
+.IP \(bu
+A new
+.CW archether
+function in
+.CW arch*.c
+(when required)
+is invoked by
+.CW devether.c
+to discover the configuration of Ethernet devices.
+.IP \(bu
+.CW devuart.c
+supports the use of XON/XOFF;
+the first line of the
+.CW stat
+file gives the current UART settings;
+it no longer panics if an overrun occurs;
+and
+.CW setlength
+accesses the correct structures.
+.IP \(bu
+The CS8900 driver
+.CW ether8900.c
+is more general, supporting the I/O port interface as well as the memory interface.
+.IP \(bu
+Obsolete definitions have been removed from several include files
+(eg,
+.CW io.h )
+and obsolete code (typically #ifdef'd) has been removed from source files.
+.NH 1
+IDEA™
+.LP
+The software includes an implementation of the IDEA encryption algorithm,
+for non-commercial use.
+IDEA was patented by Ascom-Tech AG (European patent EP 0 482 154 B1, US patent number
+US005214703, and patent filed in Japan).
+At time of writing, there was no licence fee required for noncommercial use.
+If you intend to use IDEA encryption commercially with Inferno, you should consult
+.CW http://www.it-sec.com/idea_lic_policy.html
+for the current licensing policy of iT_SEC Systec Ltd, which currently holds the patents and trademark.
+Note that IDEA is not required for
+successful use of Inferno.
+It is not enabled by default by
+.I rstyxd (8),
+and otherwise is used only by SSL3 (for the browser), in
+.CW /appl/lib/crypt/ssl3.b ,
+where it can be disabled.
+.NH 1
+Updating the July 2000 Full Source release
+.LP
+A set of update packages for the July 2000 release is provided in the directory
+.CW /updates
+on this June 2001 CD.
+You should install the updates as the host operating system user who owns the Inferno files and directories
+on your system.
+You might like to take a backup copy of the existing tree, just in case.
+Do the following to update the installation.
+(Next time we hope to provide a more automated scheme.
+You might check
+.CW www.vitanuova.com/inferno/
+to see if there are any more recent instructions.)
+.IP 1.
+Copy the directory
+.CW updates
+and its subdirectories from the CD to a directory
+.CW updates
+in the root of your existing Inferno hierarchy.
+Make sure there is sufficient space in the file system holding that hierarchy.
+If all platform files are copied and installed, about 45 Mbytes will be needed
+to hold the compressed update packages in
+.CW updates ,
+with a further
+35 Mbytes needed for a temporary uncompressed copy
+of the largest package, and allow 10 Mbytes for new additional material, giving
+90 Mbytes in all.
+The
+.CW updates
+directory can be removed after installation.
+.IP 2.
+Start the existing Inferno
+.I emu ;
+it will be quicker if you use the
+.CW -c1
+option to force compiled mode.
+It is best to use only the Inferno console; do not start the window system, since
+the updates will change files in the running system.
+On the other hand, it is a good idea to make the window in the host
+operating system a scrolling one, so that you can scroll back to
+see any errors.
+All following commands are run in the Inferno environment.
+.IP 3.
+Change to the directory
+.CW updates
+and load the standard Inferno shell module:
+.RS
+.P1
+cd /updates
+load std
+.P2
+.RE
+.NE 1i
+.IP 4.
+Unpack the updated installation software:
+.RS
+.P1
+gunzip <install.tgz | {cd /; gettar}
+.P2
+.RE
+.IP 5.
+Update the installed Inferno, source and utility source directories
+using the script
+.CW applybase :
+.RS
+.P1
+sh ./applybase
+.P2
+.LP
+That script updates the
+.CW inferno ,
+.CW src
+and
+.CW utils
+packages.
+You will see one warning:
+.P1
+skipping /dis/install/inst.dis: locally modified
+.P2
+because that file was updated by step 4.
+You might see other warnings if you have modified any other files from the original release.
+.RE
+.IP 6.
+Update the platform-specific files for one or more of your platforms.
+The directories are named after the platforms:
+.CW Solaris
+for Solaris,
+.CW Plan9
+for Plan 9,
+.CW Nt
+for all Windows systems,
+and so on.
+For each
+.I platform
+that you run, do:
+.RS
+.P1
+for (a in \fIplatform\fP/*) {install/inst -v -t $a}
+.P2
+.RE
+.IP 7.
+Quit
+.I emu .
+The new version of
+.I emu
+will be called
+.CW emu.new
+in the platform-specific directory
+(eg,
+.CW Solaris/sparc/bin/emu.new ).
+Rename the old
+.CW emu
+file as
+.CW emu.old ,
+then rename the new
+.CW emu.new
+as
+.CW emu
+on Plan 9 and Unix systems,
+or
+.CW emu.exe
+on Windows.
+When run, it should announce itself as
+``Inferno Third Edition (18 Jun 2001)''.
binary files /dev/null b/doc/20010618.pdf differ
--- /dev/null
+++ b/doc/20011003.ms
@@ -1,0 +1,758 @@
+.TL
+Inferno 3rd Edition \- 3 October 2001 Update
+.br
+Release Notes
+.AI
+Vita Nuova
+support@vitanuova.com
+.br
+3 October 2001
+.SP 4
+.NH 1
+Contents
+.LP
+This set of updates applies to the 18 June 2001 base.
+The installed software must therefore be that of 18 June 2001, whether installed directly,
+or the result of updating the original July 2000 release with update packages.
+.LP
+This update offers the following, compared to the June 2001 release.
+.SH
+.I "Repairs and changes"
+.IP \(bu
+Various minor bug fixes and improvements appear here and there.
+They include corrected usage messages, status returned to the shell on error,
+diagnosing failures to load library modules,
+and use of the
+.CW %r
+format to improve diagnostics.
+Many
+.I wm
+applications adjust their display to suit small screens (eg, on the Compaq iPAQ);
+currently the applications do some of the work themselves but it will soon be automatic.
+Other changes are listed below.
+.IP \(bu
+.I Date (1)
+takes the options
+.CW -u
+(show UTC/GMT)
+and
+.CW -n
+(print time as a number).
+If given a number of seconds as its argument,
+.I date
+takes that as the time to convert.
+.IP \(bu
+.I Format (8)
+supports tiny partitions.
+.IP \(bu
+.CW /appl/env.b
+uses
+.I env (2)
+to access environment variables.
+.IP \(bu
+For packages created after Saturday 8 September 2001,
+.CW install/create
+converts the leading
+.CW 10
+to the letter
+.CW A
+so that the names of update files remain in sorted order.
+.IP \(bu
+.CW install/inst
+takes a
+.CW -c
+option to cause it to carry on even if some files or directories cannot be made or updated.
+(This option is intended for special applications and should not generally be used when applying system updates.)
+.IP \(bu
+.CW install/install
+takes a
+.CW -P
+option that lists the packages to install.
+The
+.CW -g
+option causes
+.CW install
+to install all available packages.
+.IP \(bu
+The installation software regards files that have had carriage returns inserted
+before newlines as identical to the original files when deciding when files
+have been modified locally.
+.IP \(bu
+The
+.CW wm
+applications now check the whole string received on the window
+control channel returned by
+.I wmlib (2)'s
+.CW titlebar ,
+not just an initial letter;
+in particular, they check for
+.CW "exit"
+not just
+.CW 'e' .
+.IP \(bu
+.I Wm (1)
+takes the option
+.CW -s
+to suppress the start menu.
+.IP \(bu
+.I Keyring-gensk (2)
+and
+.I createsignerkey (8)
+provide experimental support for the RSA algorithm for signatures instead
+of the default El-Gamal algorithm;
+.CW /keyring/rsaalg.c
+is new.
+.IP \(bu
+.CW /keyring/egalg.c
+uses the correct structure for Public keys.
+.IP \(bu
+.I Listen (1)
+has a new option
+.CW -i
+that takes a shell command for each listener to run to set up appropriate
+context (eg, name space) before listening for incoming calls.
+.IP \(bu
+.I man (1)
+allows non-numerics in section names.
+.IP \(bu
+.CW /appl/cmd/mkfile
+now includes
+.CW /mkfiles/mksubdirs
+and has acquired the list of
+.CW cmd
+subdirectories from
+.CW /appl/mkfile .
+.IP \(bu
+.I Puttar
+gives warnings not fatal errors when files or directories cannot be found.
+.IP \(bu
+.I sh-expr (1)
+implements the
+.CW !=
+operator.
+.IP \(bu
+.I Tail (1)
+no longer gives an array bound error when a binary file does not end with a newline.
+.IP \(bu
+.CW tiny/sh
+does not break when
+.I filepat (2)
+is not available.
+.IP \(bu
+.I Webgrab (1)
+has several repairs to its HTTP protocol implementation, making it work with multi-homed servers.
+.IP \(bu
+.CW wm/sendmail
+no longer fails to save messages when requested(!).
+.IP \(bu
+.I Arg (2)
+allows re-use by ensuring that its globals are reinitialised by its
+.CW init
+function.
+.IP \(bu
+.I Bufio (2)
+correctly implements relative seek.
+.IP \(bu
+.I Convcs (2)
+provides several more character sets.
+.IP \(bu
+.I Cs (8)
+does not complain if it is already running, suppressing a previously confusing diagnostic.
+.IP \(bu
+.CW lib/deflate
+no longer fails on the output of some PC versions of
+.I gzip .
+.IP \(bu
+.I Wmlib (2)
+adapts a little better to different font and screen sizes (though more remains to be done);
+on small screens, defined as those less than 480 pixels wide,
+.I wm (1)
+puts all windows at the screen origin by default.
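+.LP
+The renaming done by
+.CW install/create ,
+listed among the repairs above, can be illustrated by a small C sketch.
+It assumes that package names are decimal times in seconds, which grew from nine to ten digits at 1000000000;
+the helper name
+.CW pkgname
+is hypothetical and is not taken from the installation software.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not the install/create source: map a decimal
 * time stamp to a package name that keeps plain string ordering.
 * A 10-digit stamp beginning "10" has that prefix replaced by "A",
 * giving a 9-character name that sorts after any 9-digit stamp.
 */
static void
pkgname(const char *stamp, char *out, size_t n)
{
	if(strlen(stamp) == 10 && strncmp(stamp, "10", 2) == 0)
		snprintf(out, n, "A%s", stamp+2);	/* "1000000000" becomes "A00000000" */
	else
		snprintf(out, n, "%s", stamp);		/* shorter stamps are used as-is */
}
```

+.LP
+Under those assumptions the nine-digit name
+.CW 999999999
+compares less than
+.CW A00000000
+as a string, whereas an unconverted
+.CW 1000000000
+would have sorted before it.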
+.SH
+.I "Interface changes and extensions"
+.IP \(bu
+.I Wm-deb (1)
+has a
+.CW stack
+button to bring up the stack window if it has previously been dismissed.
+It also allows breakpoints to be set in modules that have not yet been loaded.
+.IP \(bu
+.CW wm/memory 's
+display has been redesigned to be more informative.
+.IP \(bu
+.CW wm/tetris
+allows the use of a stylus (or mouse) to guide the placement of pieces.
+.IP \(bu
+.CW wm/view
+supports PNG format.
+.IP \(bu
+.CW wm/view
+has a new option
+.CW -i
+for use in
+.I plumbing (6)
+files, to cause it to listen for messages from the
+.I plumber (8).
+Existing plumbing files that invoke
+.CW wm/view
+will typically need to change to add that option for image viewing
+using the plumber to work as expected.
+.IP \(bu
+.I Newns (2)
+provides more general option parsing by using
+.I arg (2),
+and does error checking unless the
+.CW -i
+option is given to the
+.I namespace (6)
+commands.
+Its internal
+.CW mount
+command also accepts the new options
+.CW -k
+.I keyfile
+to select the source of the authentication key, and
+.CW -C
+.I alg
+to select the encryption algorithm.
+.IP \(bu
+.I Plumber (8)
+takes a
+.CW -v
+option to cause it to log the contents of messages (for debugging plumbing applications), and
+also the option
+.CW -c
+.I wmchan
+to select an alternative window manager channel instead of
+.CW /chan/wm
+when the
+.CW -w
+option is used.
+.SH
+.I "Tk changes and extensions"
+.LP
+Many of the Tk changes enforce rules and do more error-checking than before,
+but there are several extensions and interface changes as well.
+.IP \(bu
+Tk applications must create the parent widget before its children.
+Currently the check to enforce this rule has been suppressed, but it will be enabled in future.
+.IP \(bu
+Widget names are now checked for well-formedness: neither a trailing dot nor a double dot is allowed.
+.IP \(bu
+When text in a text widget is deleted, embedded windows in
+that text are only deleted if they are descendants of the text
+widget.
+.IP \(bu
+Text widget now redisplays correctly when an embedded window
+is destroyed.
+.IP \(bu
+Text widget now checks for embedded windows in the text that have been destroyed since they were added to the text.
+.IP \(bu
+Widgets packed under a destroyed widget that are not
+descendants of that widget are now removed correctly
+from the packing hierarchy.
+.IP \(bu
+.CW -activebackground
+now changes the border appropriately, same as
+.CW -background .
+.IP \(bu
+.CW scrollbar
+now returns currently activated part when
+.CW activate
+is called
+with no arguments.
+.IP \(bu
+Only one part of a scrollbar may be active at any one time.
+.IP \(bu
+Tk can now distinguish between a null argument
+.CW {}
+and a missing argument.
+.IP \(bu
+Creating an embedded window in a text widget at index 1.0
+no longer causes the packer to go into an infinite loop.
+.IP \(bu
+Changing the options on an embedded window in a text
+widget previously caused an uninitialised pointer access.
+.IP \(bu
+Changing the window associated with an embedded window item
+in text and canvas widgets previously did not disassociate the old window correctly.
+.IP \(bu
+Changing the window associated with an embedded window item
+in a text widget previously did not set the widget's size appropriately.
+.IP \(bu
+The
+.CW -baseline
+alignment option for embedded windows in text widgets previously
+did not calculate the line height correctly.
+.IP \(bu
+The
+.CW -relief
+setting for buttons is now restored after enter/leave or selection.
+.IP \(bu
+Buttons 4, 5 and 6 have been added (for the iPAQ).
+.IP \(bu
+Tk no longer crashes if the `grab' changes during the processing of a mouse event.
+.IP \(bu
+A new event
+.CW <Destroy>
+can be bound to a widget to receive notification when it is destroyed (eg, by the destruction of
+a parent widget).
+The
+.CW <Configure>
+event is propagated to slaves as well as the configured master.
+These two changes make it easier to implement pseudo-widgets such as
+.I dividers (2).
+.IP \(bu
+.CW -anchor
+has been implemented for labels;
+.CW -justify
+should be implemented as documented.
+.IP \(bu
+Submenus are unmapped correctly.
+.IP \(bu
+Tk detects command loops (by limiting recursion depth).
+.IP \(bu
+.CW canvas
+has a new boolean option
+.CW -buffered
+that controls whether the whole canvas, or just the visible region, is allocated an off-screen buffer image.
+It defaults to the visible area only.
+.IP \(bu
+.CW canvas
+has new operations
+.CW screenx
+and
+.CW screeny
+to map canvas coordinates to screen coordinates.
+.SH
+.I "New commands and modules"
+.IP \(bu
+A collection of small and tiny playing card images has been added, in
+.CW /icons/smallcards
+and
+.CW /icons/tinycards .
+.IP \(bu
+.CW install/wfind
+lists the versions of a given file in a set of installation packages.
+.IP \(bu
+.I Touchcal (8)
+provides touch-screen calibration; it runs both inside and outside the window
+manager
+.I wm (1).
+Both internal and external interfaces are completely different from previous versions.
+.IP \(bu
+.I Wm-keyboard (1)
+describes new commands
+.CW wm/keyboard
+and
+.CW wm/pen
+that provide soft keyboard and single-stroke gesture recognition for touch screen devices.
+.IP \(bu
+.I Gamesrv (4)
+provides a file system interface for multi-player networked games;
+.I gamesrv (2)
+provides the interface for the game-specific engines loaded on demand by the game server.
+.IP \(bu
+.CW utils/awk
+is a new directory containing the source for a version of
+.I awk
+for use in doing Inferno ports for systems that lack it (or a sufficiently recent version), including Windows.
+It is not currently made automatically for any system.
+It is covered by its own licence; see the
+.CW README
+and
+.CW NOTICE
+files in that directory.
+.SH
+.I "Limbo compiler"
+.IP \(bu
+The compiler now adds a source file name (relative to the Inferno root) to each Dis
+file, to allow
+.I debug (2)
+and thus the debugger
+.CW wm/deb
+and other commands such as
+.I profile (1)
+and
+.I stack (1)
+to find source
+and
+.CW .sbl
+files without prompting.
+.IP \(bu
+The initialisation of large arrays avoids deep recursion, preventing a trap on Nt
+and a large stack on other platforms.
+.IP \(bu
+Overflow is avoided when sorting integers for case statements.
+.SH
+.I "Compilers and architectures"
+.IP \(bu
+The linker
+.CW 5l
+has a critical fix in
+.CW utils/5l/span.c
+for a bug in the flushing of literal pools.
+.IP \(bu
+.CW 5coff
+has a small change to make the output conform to actual practice
+not COFF documentation.
+.SH
+.I "Hosted and Native Inferno"
+.IP \(bu
+.I Emu
+has the following fixes and improvements:
+.RS
+.IP \(bu
+Trap handling on Windows now (we hope) does all that is required
+to work on many versions, variants, updates and releases.
+.IP \(bu
+The cursor appears correctly under Windows 2000.
+.IP \(bu
+Windows
+.I emu
+passes page up, down scroll, pause, insert, delete and print
+characters through to
+.CW /dev/keyboard .
+.IP \(bu
+.CW styx.c
+prevents bad Styx messages from causing trouble.
+.IP \(bu
+.CW devenv.c
+returns
+.CW "file exists"
+if an attempt is made to create an existing name;
+it implements
+.CW ORCLOSE .
+.IP \(bu
+.CW devroot.c
+makes directories mode 555 not 777.
+.RE
+.IP \(bu
+For the native kernels only:
+.RS
+.IP \(bu
+.I Env (3)
+is now provided for native kernels.
+To add it to a kernel, change the kernel configuration file as follows:
+.RS
+.IP 1.
+Include the device driver
+.CW env
+in the
+.CW dev
+section.
+.IP 2.
+Include the support file
+.CW env
+in the
+.CW port
+section.
+.IP 3.
+Include the name
+.CW /env
+in the
+.CW root
+section.
+.LP
+To exclude it from a kernel, include the support file
+.CW noenv
+in the
+.CW port
+section.
+You should only do this if you are trying to make a small highly specialised kernel;
+general applications are likely to make more use of
+.CW /env
+now that it is there,
+eventually to replace
+.CW sysenv
+and to select locales.
+.RE
+.IP \(bu
+Common floating-point emulator code has moved from platform-specific directories to
+.CW /os/port/fpi.c
+and
+.CW /os/port/fpimem.c ,
+with corresponding changes to configuration files and
+.CW mkfiles .
+.IP \(bu
+The scheduling code in
+.CW /os/port/proc.c
+has changed to support wait-for-interrupt.
+If no process can be scheduled, the platform-specific function
+.CW "void idlehands(void)"
+is called, with interrupts
+.I off
+(unlike the function of the same name in Plan 9).
+On most platforms, it currently is an empty function defined by
+.CW #define
+in
+.CW fns.h ,
+and the scheduler effectively spins waiting for an interrupt to make a kernel process ready,
+but on the iPAQ and a few other platforms it uses the hardware-specific
+``wait for interrupt'' function, for power saving.
+.IP \(bu
+A new package
+.CW ipaq
+is available that populates
+.CW /os/ipaq
+with the preliminary Inferno port to the Compaq iPAQ.
+.IP \(bu
+.CW /os/ip
+has incorporated bug fixes and improvements from Plan 9 to
+keep the source code up to date:
+.RS
+.IP \-
+.CW /net/ndb
+has been added, to allow for future changes in IP configuration code
+.IP \-
+permissions are checked more carefully;
+.CW wstat
+is implemented
+.IP \-
+.CW Conv
+structures are now unlocked on
+.CW close
+by
+.CW devip.c
+not by each protocol's implementation;
+the
+.CW car
+lock for connect/announce no longer exists, because the conversation itself is locked
+.IP \-
+some missing
+.CW waserror
+calls have been added
+.IP \-
+ensure local port is unique across existing conversations
+.IP \-
+.CW tos
+can be set for a conversation by a
+.CW tos
+control message, and is retained during routing
+.IP \-
+.CW qdiscard
+in
+.CW qio.c
+returns the number of bytes discarded
+.IP \-
+protocol handlers
+.CW esp.c ,
+.CW gre.c ,
+.CW icmp.c ,
+.CW ip.c ,
+.CW ipifc.c ,
+.CW ipmux.c ,
+.CW rudp.c
+and
+.CW tcp.c
+have consequentially changed;
+the TCP/IP implementation most extensively;
+.CW il.c
+and
+.CW udp.c
+have not yet been realigned with Plan 9
+.IP \-
+medium drivers use the structure-member initialisation extension of Plan 9 C,
+to insulate driver source text from changes in the layout of the
+.CW Medium
+structure
+.RE
+.IP \(bu
+The SA1100 UART driver now correctly pushes input up the stack when the FIFO empties.
+.RE
+.LP
+The remaining points are common to both hosted and native Inferno:
+.IP \(bu
+.I Cons (3)
+implements the file
+.CW kprint
+to capture Inferno console messages; and a file
+.CW jit
+that can be used to set the compile-on-the-fly option dynamically or read its current state.
+.IP \(bu
+There is a new kernel function:
+.RS
+.DS
+.ft 5
+char* seprint(char *buf, char *ebuf, char *fmt, ...);
+.ft P
+.DE
+which puts a formatted result into
+.CW buf
+never writing beyond
+.CW ebuf-1
+(including the trailing null byte).
+It returns the address of the next available byte in
+.I buf .
+.RE
+.IP \(bu
+.CW kfs 's
+.CW Eexist
+error has become
+.CW Eexists
+to remove a clash with the new
+.CW Eexist
+name in
+.CW error.h .
+.IP \(bu
+.CW exportfs.c
+maintains offsets in exported directories correctly.
+.IP \(bu
+The undocumented
+.CW devaudit.c
+has been removed.
+.IP \(bu
+Some Limbo profiler bugs have been fixed.
+.IP \(bu
+A race for the use of a shared semaphore has been fixed in
+.CW devprog.c .
+.IP \(bu
+.CW devprog.c
+has a new debugging event:
+.CW load
+.I filename
+corresponds to the execution of a Dis
+.CW load
+instruction.
+.IP \(bu
+.CW devdraw.c
+implements
+.CW readpixels
+from a window.
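+.LP
+The
+.CW seprint
+convention noted above (format into a buffer bounded by an end pointer, and return where the next write should start)
+can be sketched in portable C.
+The stand-in below is built on
+.CW vsnprintf
+purely for illustration; the names
+.CW seprint_sketch
+and
+.CW describe
+are hypothetical, and the behaviour on truncation is an assumption rather than a statement about the kernel source.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Stand-in for illustration only, not the kernel implementation.
 * Formats into buf, never writing beyond ebuf-1 (null byte included),
 * and returns a pointer to the terminating null byte, which is where
 * the next call should start writing.
 */
static char*
seprint_sketch(char *buf, char *ebuf, const char *fmt, ...)
{
	va_list arg;
	int n;

	if(buf >= ebuf)
		return buf;		/* no room at all: write nothing */
	va_start(arg, fmt);
	n = vsnprintf(buf, ebuf-buf, fmt, arg);
	va_end(arg);
	if(n < 0){
		*buf = '\0';		/* formatting error: leave an empty string */
		return buf;
	}
	if(n >= ebuf-buf)
		return ebuf-1;		/* truncated: null byte sits in the last slot */
	return buf+n;
}

/* Build one line from several pieces without tracking the space left by hand. */
static char*
describe(char *buf, char *ebuf, const char *name, int mode)
{
	char *p;

	p = seprint_sketch(buf, ebuf, "%s", name);
	p = seprint_sketch(p, ebuf, " mode %o", mode);
	return p;
}
```

+.LP
+Because each call returns the next free position, calls can be chained with the same
+.CW ebuf ,
+and output that would overflow is simply cut short.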
+.NH 1
+Updating the software
+.LP
+You should install the updates as the host operating system user who owns the Inferno files and directories
+on your system.
+You might like to take a backup copy of the existing tree, just in case.
+Do the following to update the installation.
+.IP 1.
+Fetch the update archives required, namely
+.CW inferno.tgz ,
+.CW src.tgz ,
+.CW utils.tgz
+and any platform-specific packages required for your installation.
+If you are running Windows, for instance, you will need
+.CW Nt.tgz ;
+if running Plan 9, you will need
+.CW Plan9.tgz .
+These are gzip'd tar files containing files starting with the directory
+name
+.CW updates/20011003 .
+Unpack each in your Inferno root directory.
+For instance, you can unpack
+.CW inferno.tgz
+using the Inferno commands:
+.RS
+.P1
+cd /
+gunzip <inferno.tgz | gettar
+.P2
+The following instructions assume that the unpacked files are visible in the Inferno hierarchy.
+Make sure there is sufficient space in the file system holding that hierarchy.
+Each archive can be removed after unpacking, and the
+.CW updates
+directory can be removed after installation.
+.RE
+.IP 2.
+After unpacking the archives,
+(re)start
+the existing Inferno
+.I emu ;
+it will be quicker if you use the
+.CW -c1
+option to force compiled mode.
+It is best to use only the Inferno console; do not start the window system, since
+the updates will change files in the running system.
+On the other hand, it is a good idea to make the window in the host
+operating system a scrolling one, so that you can scroll back to
+see any errors.
+All following commands are run in the Inferno environment.
+.IP 3.
+Change to the directory containing the updates:
+.RS
+.P1
+cd /updates/20011003
+.P2
+.RE
+.NE 1i
+.IP 4.
+Updated installation software was included in
+.CW inferno.tgz .
+You must first unpack that installation software, as follows:
+.RS
+.P1
+sh ./unpacktools
+.P2
+.RE
+.NE 1i
+.IP 5.
+Update the installed Inferno, source and utility source directories
+using the script
+.CW applybase :
+.RS
+.P1
+sh ./applybase
+.P2
+.LP
+That script updates the
+.CW inferno ,
+.CW src
+and
+.CW utils
+packages.
+You might see warnings if you have modified any non-configuration files from the original release.
+.RE
+.IP 6.
+Update one or more platform-specific files for your platform(s).
+The directories are named after the platforms:
+.CW Solaris
+for Solaris,
+.CW Plan9
+for Plan 9,
+.CW Nt
+for all Windows systems,
+and so on.
+For each
+.I platform
+that you run, do:
+.RS
+.P1
+sh ./applyplat \fIplatform\fP
+.P2
+For instance, if you use Windows, run
+.P1
+sh ./applyplat Nt
+.P2
+The iPAQ distribution is installed the same way (it is just another platform):
+.P1
+sh ./applyplat ipaq
+.P2
+which populates
+.CW /os/ipaq
+in the Inferno tree.
+.RE
+.IP 7.
+Quit
+.I emu .
+The new version of
+.I emu
+will be called
+.CW emu.new
+in the platform-specific directory
+(eg,
+.CW Solaris/sparc/bin/emu.new ).
+Rename the old
+.CW emu
+file as
+.CW emu.old ,
+then rename the new
+.CW emu.new
+as
+.CW emu
+on Plan 9 and Unix systems,
+or
+.CW emu.exe
+on Windows.
+When run, it should announce itself as
+``Inferno Third Edition (3 October 2001)''.
binary files /dev/null b/doc/20011003.pdf differ
--- /dev/null
+++ b/doc/20020628.ms
@@ -1,0 +1,1030 @@
+.TL
+Inferno 3rd Edition \- 28 June 2002 Update
+.br
+Release Notes
+.AI
+Vita Nuova
+support@vitanuova.com
+.br
+28 June 2002
+.SP 4
+.NH 1
+Base for update
+.LP
+This set of updates applies to the 18 June 2001 base.
+The installed software must therefore be that of 18 June 2001, whether installed directly,
+or the result of updating the original July 2000 release with update packages to the
+18 June 2001 level.
+This set of updates can be applied, however, whether or not the previous update
+of 3 October 2001 was installed; this package includes all those changes too.
+.NH 1
+Contents
+.LP
+This update offers the following, compared to the 3 October 2001 release.
+.LP
+The following sections provide short
+summaries of the more interesting or important changes.
+There are many more minor or cosmetic changes.
+.SH
+.I "New commands and modules"
+.IP \(bu
+.I Fc (1)
+is a floating-point calculator.
+.IP \(bu
+A new page
+.I man (6)
+describes the formatting commands used in manual pages.
+.IP \(bu
+Support for system-level memory monitoring has been made available
+as an optional component of
+.CW emu ,
+with the interface provided by a new driver
+.CW /emu/devmem.c .
+It is not compiled in by default,
+since it is relatively intrusive.
+Details of how to enable it will be provided with the applications that use it.
+.IP \(bu
+.I Strings (1)
+extracts strings from a file.
+.IP \(bu
+.CW 9export
+(see
+.I export (4))
+serves a name space to a 9P client (mainly useful for
+interacting with Third Edition Plan 9 systems at the moment).
+.IP \(bu
+.CW cdfs ,
+which was described by
+.I dossrv (4)
+but not actually shipped is now included.
+.IP \(bu
+.CW csquery ,
+described in
+.I cs (8),
+queries the connection service and prints the result.
+.IP \(bu
+.CW newer ,
+described by
+.I ftest (1),
+is a shell predicate that compares the modification times on two files.
+.IP \(bu
+.I Print (2)
+is a new module that provides an interface to directly-driven printers.
+.SH
+.I "Repairs and changes"
+.IP \(bu
+.CW cp
+has been reworked extensively; amongst other things, it can now safely traverse
+arbitrary name spaces without becoming confused by
+complex mounts.
+.IP \(bu
+.I Listen (1)
+and
+.CW dial
+both accept a
+.CW -A
+option to cause them to authenticate a connection.
+.IP \(bu
+.I Diff (1)
+compares binary files and also does not quit (during recursive diffs) when it finds
+a file it cannot open.
+.IP \(bu
+.I Ls (1)
+implements the
+.CW -u
+and
+.CW -t
+sorting operations properly.
+A new
+.CW -e
+option prints the time as a decimal integer (seconds since the system Epoch).
+.IP \(bu
+.I Sh
+regards all of `../', `./', and `#' at the front of a command name as a request to use the name as-is.
+.IP \(bu
+.I Tail
+has been changed to avoid some boundary cases.
+.IP \(bu
+.I Touch
+uses standard argument processing and returns an error exit status if it fails.
+.IP \(bu
+.I Zeros
+accepts new options
+.CW -r
+to generate random bytes
+and
+.CW -v
+to produce blocks with a given value.
+.IP \(bu
+The network games software in
+.CW /appl/games
+uses port 6660 for its signer, not the standard Inferno signer's port.
+.IP \(bu
+.I Charon :
+disguises itself as Mozilla to satisfy agent-name probes; retries fetches on error; supports multiple windows;
+detects HTML vs plain text correctly; has many Javascript fixes, including fixing a day one bug
+in operator precedence (!).
+In general, it
+has undergone extensive changes,
+particularly to the cookies and Javascript implementation.
+It also insists that it be run under the window manager
+.I wm (1).
+It can be set to plumb schemes that it does not implement internally.
+.IP \(bu
+The installation software now uses the numeric time as-is to name output packages;
+the temporary fix introduced in October of converting leading
+.CW 10
+to
+.CW A
+has been removed.
+.IP \(bu
+.CW /appl/lib/dis.b
+prints offsets from
+.CW MP
+correctly (it previously printed them
+as offsets from
+.CW FP ).
+.IP \(bu
+.CW /appl/lib/parseman.b
+accounts for enough extra
+.I troff
+commands to display
+.I man (6)
+correctly.
+.IP \(bu
+.CW /appl/lib/profile.b
+binds
+.CW #P
+only if needed.
+.IP \(bu
+.CW /appl/lib/translate.b
+.I translate (2)) (
+now computes the right hash value.
+.IP \(bu
+Several run-time checks that were missing or incomplete for array access and slicing have been added.
+.IP \(bu
+.CW lib/isrv
+no longer starts a new shell.
+.IP \(bu
+.I Wm-ftree (1)
+has several new options:
+.CW -E ,
+.CW -p
+and
+.CW -d .
+.IP \(bu
+.CW wm/rt
+now correctly updates stack size (if set).
+.IP \(bu
+.I Xd (1)
+correctly implements
+.CW -r
+to suppress repeated lines.
+.IP \(bu
+.I gettar (1)
+documents the
+.B lstar
+output format.
+.IP \(bu
+.I man (6)
+describes the manual page documentation format.
+.SH
+.I "Interface changes and extensions"
+.IP \(bu
+.I Security-password (2)
+provides a new entry
+.CW setpwfile
+to allow the default name
+.CW /keydb/password
+to be changed.
+.IP \(bu
+.I Dossrv (4)
+has changed extensively, to support rename and long names in FAT format correctly.
+It uses a smaller number of IO buffers.
+.IP \(bu
+.I Createsignerkey (8)
+has a completely different invocation, documented in that manual page,
+with a conventional option structure replacing a rather idiosyncratic chain
+of optional arguments.
+For now the old syntax is still quietly supported, but only
+for the benefit of any shell scripts that might so use it; if you have any,
+please change them to use the new syntax before the old usage finally vanishes in a later release.
+.IP \(bu
+.I Sum (1)
+documents the new
+.CW sha1sum
+command.
+.IP \(bu
+.I Bufio (2)
+returns an error if a seek fails rather than stopping the process(!).
+.IP \(bu
+.CW /lib/convcs
+has several more character sets:
+.CW koi8-r ,
+.CW windows-1250
+and
+.CW windows-1252 .
+.IP \(bu
+.I Keyring-sha (2)
+documents two new functions,
+.CW hmac_sha1
+and
+.CW hmac_md5 ,
+which are keyed versions of the corresponding secure hashing functions,
+as defined by RFC2104.
+The new definitions are in
+.CW /module/keyring.m .
+.IP \(bu
+.I Keyring-getstring (2)
+makes it clear that it does not
+.I provide
+delimited I/O but rather
+.I requires
+it (eg, as provided by
+.I ssl (3))
+for successful operation.
+.IP \(bu
+.I Security-ssl (2)
+no longer requires
+.CW #D
+to be bound into the name space (since it can only be used locally in any case).
+Consequently,
+.CW bind
+calls have been removed from several modules that used SSL.
+.IP \(bu
+.CW /appl/lib/crypt/ssl3.b
+has several critical bug fixes, allowing secure connections to work correctly in Charon;
+an important bug fix was also made to
+.CW /appl/lib/crypt/x509.b .
+.IP \(bu
+.I Imageremap
+has been changed to allow concurrent use.
+.IP \(bu
+.I Translate (6)
+and
+.CW /appl/lib/translate.b
+have changed as required to put the locale-specific dictionaries in directory
+\f(CW/locale/\fP\fIlocale\f(CW/dict\fP;
+the directory for a chosen
+.I locale
+is then normally bound onto
+.CW /locale/dict ,
+where
+.CW translate
+looks by default.
+.IP \(bu
+The documentation for
+.I button (9)
+no longer claims that
+.CW -padx
+and
+.CW -pady
+are supported options for that widget.
+.IP \(bu
+.CW sys.m
+has some new constants defined for use by a later Styx revision.
+.SH
+.I "Tk changes and extensions"
+.IP \(bu
+The canvas code provides extensions to Tk:
+.CW "grab set tag" ,
+.CW "grab release tag" ,
+and
+.CW "grab ifunset tag" .
+See
+.I canvas (9)
+for details.
+.IP \(bu
+Also in canvases, object-specific hit tests have been added, rather than using a bounding box
+in most cases.
+.IP \(bu
+Borderwidth defaults to zero in the entry widget.
+.IP \(bu
+Tk multiplexes the cursor amongst top-level windows.
+.IP \(bu
+Text and other items selected use foreground/background colours
+.SH
+.I "Limbo compiler"
+.IP \(bu
+Constant tuple and adt values can be used as values in
+.CW con
+constant declarations.
+.IP \(bu
+In an array value, the use of reference values for different
+.CW pick
+alternatives of the same adt no longer draws a diagnostic
+but causes the array value to be an array of
+.CW ref
+to the adt, not a particular pick alternative.
+.IP \(bu
+The C language stub declarations support
+.CW pick
+adts.
+.IP \(bu
+The alignment of
+.CW big
+and
+.CW real
+is now correct in C stubs.
+.IP \(bu
+The string escape
+.CW \ef
+is form-feed (to simplify translation of C programs).
+.SH
+.I "Compilers and utilities"
+.IP \(bu
+.I Iar 's
+source directory has been renamed from
+.CW ar
+to
+.CW iar .
+.IP \(bu
+.CW mk
+understands the long-name table in Windows' archive files, preventing
+spurious out-of-date status and other confusing results in builds.
+.IP \(bu
+.CW sqz
+and
+.CW zqs
+can compress larger ARM and PowerPC executables.
+.IP \(bu
+.CW kprof
+uses a table now provided by
+.CW /dev/kprof
+to provide profiling results to the resolution selected by a given platform.
+.SH
+.I "Hosted and Native Inferno"
+.LP
+Changes common to hosted and native kernels:
+.IP \(bu
+The
+.CW HOSTMODEL
+and
+.CW TARGMODEL
+for Plan 9 are now
+.CW Plan9
+not, rather confusingly,
+.CW Inferno .
+A good few source files have been renamed accordingly.
+.IP \(bu
+The memory allocation functions have been modified slightly to support the addition of
+memory monitoring and profiling.
+Some functions have also been added to the C library supporting
+.I emu
+and the kernels,
+with an eye to starting the revision of Styx, based on the current 9P2000.
+Consequently, the
+include file
+.CW lib9.h
+for all platforms has been modified:
+to add new functions such as
+.CW setmalloctag ;
+to make the types of parameters to the
+.CW malloc
+calls uniform (and reflect the documentation, as it happens),
+so that sizes are always unsigned;
+to change the type of a parameter to
+.CW strchr
+and
+.CW strrchr
+from
+.CW char
+to
+.CW int ;
+to remove obsolete definitions such as
+.CW UMFILE
+and
+.CW UMDIR ;
+and to add new functions for use inside the kernels.
+Several new functions are strictly for internal use
+(and thus might change in future), and
+they have not been added to section 10.
+One exception is the new function
+.CW tokenize ,
+documented in
+.I getfields (10.2)
+and used by
+.I parsecmd (10.2).
+.I Getfields
+replaces the function called
+.CW parsefields ,
+which has been removed, and all calls to it changed to use
+.CW getfields .
+The higher aim of some of these otherwise fussy changes is
+to work towards making Inferno and Plan 9 drivers interchangeable
+(as best we can).
+.IP \(bu
+To help decouple authentication methods from identity setting, two new files
+.CW caphash
+and
+.CW capuse
+have been added to
+.CW #c
+(a temporary location for them), supported by code in
+.CW auth.c
+.CW /os/port/auth.c ). (
+Their use will soon replace the
+.CW setid
+functionality of
+.I keyring-auth .
+They will not be documented until then.
+.IP \(bu
+.CW /emu/exception.c
+and
+.CW /os/port/exception.c
+no longer trap the use of the
+.CW exit
+statement (by accident of implementation).
+.IP \(bu
+.CW /emu/dis.c
+and
+.CW /os/port/dis.c
+do not leave a process in Broken state when it
+receives an exception because a kernel memory allocation failed.
+.IP \(bu
+The JIT compilation of case statements has fixed a day one bug
+that caused a degenerate form of
+.CW case
+(with only a default arm) to be compiled incorrectly.
+.IP \(bu
+The 386 JIT will allow rescheduling, and the scheduling is finer grained on the ARM.
+.IP \(bu
+ARM code generation produces correct code for some list accesses that were previously wrong
+(the Dis operator HEADM).
+.IP \(bu
+.I Emu
+has the following changes specific to it:
+.RS
+.IP \-
+The mouse event queue in
+.CW devcon.c
+is now a circular queue without locks as in the native kernel;
+this prevents a scheduling problem under Linux.
+.IP \-
+The
+.CW READSTR
+constant defined by the native kernels is now also defined by
+.CW emu .
+.IP \-
+The function previously called
+.CW rendezvous
+is now called
+.CW erendezvous
+to avoid a type clash with a library function on Plan 9
+(its `rendezvous' function takes an unsigned long as first parameter, not
+.CW void* ).
+.CW lib9.h
+has changed accordingly where required.
+.IP \-
+.CW devcon.c
+acquires
+.CW caphash
+and
+.CW capuse .
+.IP \-
+A new file
+.CW auth.c
+must be included in every build.
+.IP \-
+Several drivers in
+.CW emu
+have been changed to use
+.RS
+.P1
+ switch((ulong)c->qid.path ...)
+.P2
+.RE
+to force 32-bit operations to be used when
+.CW path
+eventually becomes 64 bits.
+.IP \-
+Also in
+.CW emu
+the function
+.CW oserrstr
+now takes a buffer length, rather than assuming
+.CW ERRLEN .
+.IP \-
+Obsolete code for time and directory mode conversions (!) has been removed from
+.CW os-*.c .
+.IP \-
+Obsolete code for `daemonize' has been removed from
+.CW os-*.c
+and
+.CW lib9 .
+.RE
+.IP \(bu
+For the native kernels:
+.RS
+.IP \-
+.CW /os/port/devprof.c
+has been added but is not yet supported (it is not configured into any kernel,
+so no existing kernels are affected by its presence).
+.IP \-
+.CW /os/port/devcons.c
+has
+.CW caphash
+and
+.CW capuse
+.IP \-
+Every kernel configuration file must now include
+.CW auth
+in the
+.CW port
+section to include
+.CW /os/port/auth.c .
+.RE
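The `(ulong)c->qid.path` cast described in the emu notes above can be sketched in isolation. This is a minimal illustration and not code from the Inferno tree: the `Qid` and `Chan` structures are cut down to the single field used, the `Q*` constants and the `qidname` function are hypothetical names, and `path` is declared as a 64-bit `uvlong` to model the planned widening. The cast narrows the value to 32 bits only where `ulong` is 32 bits wide, as on the 32-bit hosts of the time; on an LP64 host it leaves the width unchanged.

```c
#include <assert.h>
#include <string.h>

typedef unsigned long ulong;
typedef unsigned long long uvlong;

/* Hypothetical, cut-down stand-ins for the kernel structures. */
typedef struct Qid { uvlong path; } Qid;	/* path modelled as 64 bits */
typedef struct Chan { Qid qid; } Chan;

/* Hypothetical file identifiers for a small device. */
enum { Qdir, Qcons, Qcaphash, Qcapuse };

/* Map a channel's qid path to a file name, switching on the narrowed value. */
static const char*
qidname(Chan *c)
{
	switch((ulong)c->qid.path){	/* cast keeps the switch on ulong operations */
	case Qdir:
		return "dir";
	case Qcons:
		return "cons";
	case Qcaphash:
		return "caphash";
	case Qcapuse:
		return "capuse";
	default:
		return "unknown";
	}
}
```

Because the case labels are small constants, the result is the same whether the switch expression is 32 or 64 bits wide; the cast only affects the arithmetic the compiler generates.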
+...#.bp
+...#.NH 1
+...#Description by source file
+...#.LP
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/appl/charon T{
+...#plumbs schemes that aren't built-in, if on an authorised list
+...#.br
+...#retries on error (but not POST)
+...#.br
+...#identifies itself as Mozilla to pass agent-name tests.
+...#.br
+...#bug fix: doesn't loop (exhausting memory) if a file ends with bad data.
+...#.br
+...#private version of Url
+...#.br
+...#CCI removed
+...#.br
+...#charon_code and charon_guiwm removed
+...#.br
+...#multiple windows
+...#.br
+...#detects HTML vs text correctly
+...#.br
+...#separate layout and gui display
+...#.br
+...#T}
+...#/appl/cmd/diff.b T{
+...#compare binary files as documented
+...#.br
+...#don't quit if files can't be opened
+...#.br
+...#T}
+...#/appl/cmd/strings.b T{
+...#new command
+...#.br
+...#T}
+...#/appl/cmd/sh T{
+...#../ / ./ and # starting a (command) name all cause it to be treated as absolute
+...#.br
+...#T}
+...#/appl/lib/auth.b T{
+...#allow nobody even if setid is 0 provided it appears in the algorithm list
+...#.br
+...#don't bind #D
+...#.br
+...#T}
+...#/appl/lib/createsignerkey.b T{
+...#arguments
+...#.br
+...#don't bother with random
+...#.br
+...#error status
+...#.br
+...#T}
+...#.TE
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/appl/lib/crypt/ssl3.b T{
+...#delete session id on alert
+...#.br
+...#T}
+...#/appl/lib/ecmascript T{
+...#correct precedence for operators
+...#.br
+...#delete HTML comments
+...#.br
+...#T}
+...#/appl/lib/isrv.b T{
+...#don't start a new shell
+...#.br
+...#T}
+...#/appl/lib/login.b T{
+...#don't bind #D
+...#.br
+...#T}
+...#/appl/lib/logind.b T{
+...#don't bind #D
+...#.br
+...#don't load (unused) Random
+...#.br
+...#minor internal changes.
+...#.br
+...#T}
+...#/appl/lib/profile.b T{
+...#memory profiling
+...#.br
+...#T}
+...#/appl/lib/srv.b T{
+...#be sure to FORKFD so file descriptors don't accumulate in parent
+...#.br
+...#T}
+...#/appl/lib/ssl.b T{
+...#don't require #D to be bound since it can only be used locally
+...#.br
+...#T}
+...#/appl/lib/usb/* T{
+...#see TI925 release
+...#.br
+...#T}
+...#/appl/lib/wmlib.b T{
+...#calculate correct size of file browser (eg when full screen)
+...#.br
+...#don't mess with the cursor
+...#.br
+...#T}
+...#/appl/minicharon T{
+...#moan if no window manager
+...#.br
+...#don't trap if module not yet loaded
+...#.br
+...#T}
+...#/appl/wm/bounce.b
+...#/appl/wm/man.b
+...#/appl/wm/mand.b
+...#/appl/wm/polyhedra.b
+...#/appl/wm/wm.b T{
+...#correct window size in full screen mode
+...#.br
+...#T}
+...#/appl/wm/reversi.b T{
+...#use fittoscreen(0)
+...#.br
+...#T}
+...#/appl/wm/polyhedra.b T{
+...#sys->sleep(0) to yield
+...#.br
+...#T}
+...#/man/2/security-ssl T{
+...#bind not required
+...#.br
+...#conventions documented accurately
+...#.br
+...#T}
+...#/man/2/keyring-getstring T{
+...#makes it clear that it doesn't PROVIDE delimited i/o, but EXPECTS it (eg, via ssl(3))
+...#.br
+...#T}
+...#.TE
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/crypt/jacobi.c T{
+...#add missing return statement
+...#.br
+...#T}
+...#/emu/alloc.c T{
+...#changes for monitoring
+...#.br
+...#long for size throughout
+...#.br
+...#poolrealloc, now used by malloc
+...#.br
+...#poolmsize
+...#.br
+...#malloc and realloc tagging data with pc of allocation
+...#.br
+...#ud -> lud etc
+...#.br
+...#T}
+...#/emu/chan.c T{
+...#space isn't frog, experimentally
+...#.br
+...#T}
+...#/emu/dat.h T{
+...#READSTR definition
+...#.br
+...#Rept definition (will change)
+...#.br
+...#add BusyGC
+...#.br
+...#remove support for %N
+...#.br
+...#T}
+...#/emu T{
+...#oserrstr takes buffer size (consequential changes throughout)
+...#.br
+...#no %N
+...#.br
+...#Plan 9's HOSTMODEL -> Plan9 not Inferno (!)
+...#.br
+...#msize -> hmsize in some cases
+...#.br
+...#rendezvous -> erendezvous
+...#.br
+...#T}
+...#/emu/devcon.c T{
+...#no %N, Nconv
+...#.br
+...#remove debugging memout file
+...#.br
+...#use of (ulong)c->qid.path ...
+...#.br
+...#remove logmsg calls
+...#.br
+...#T}
+...#/emu/devprof.c T{
+...#memory profiling
+...#.br
+...#T}
+...#/emu/devprog.c T{
+...#msize -> hmsize because malloc and heap addresses are different
+...#.br
+...#T}
+...#/emu/dis.c T{
+...#instrument garbage collections
+...#.br
+...#force periodic garbage collection passes when not idle (BusyGC)
+...#.br
+...#T}
+...#/emu/discall.c T{
+...#tag QLock structures with owner
+...#.br
+...#T}
+...#/emu/fns.h T{
+...#oserrstr definition
+...#.br
+...#obsolete gsleep defn removed
+...#.br
+...#T}
+...#/emu/main.c T{
+...#obsolete gsleep defn removed
+...#.br
+...#T}
+...#.TE
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/emu/os-* T{
+...#dflag becomes simply don't enable graphics terminal (ie, save/restore tty state)
+...#.br
+...#daemonize calls removed [do it from outside]
+...#.br
+...#rendezvous -> erendezvous
+...#.br
+...#some files had obsolete timeconv and dirmodconv code, now removed
+...#.br
+...#T}
+...#/emu/proc.c T{
+...#provisional rpt code (will change, will move)
+...#.br
+...#T}
+...#/emu/vlrt-Nt.c T{
+...#use dat.h not local definitions
+...#.br
+...#T}
+...#/include/interp.h T{
+...#force HEAP_ALIGN to provide extra cell in heap header for heap profiling
+...#.br
+...#new functions: heapmonitor, hmsize
+...#.br
+...#utfnlen removed (local to interp/runt.c)
+...#.br
+...#T}
+...#/include/pool.h T{
+...#sizes are now unsigned
+...#.br
+...#poolcompact defn, new poolmsize, poolrealloc
+...#.br
+...#T}
+...#/include/tk.h T{
+...#extra state to control cursor
+...#.br
+...#T}
+...#/interp/comp-386.c T{
+...#implement and enable interpreter rescheduling when JIT enabled
+...#.br
+...#T}
+...#/interp/comp-arm.c
+...#/interp/comp-thumb.c T{
+...#change arm rescheduling check to be similar to 386
+...#.br
+...#T}
+...#/interp/gc.c T{
+...#instrument garbage collector
+...#.br
+...#add heapmonitor hook for devmem.c
+...#.br
+...#T}
+...#/interp/heap.c T{
+...#define heapmonitor hook (default: nil)
+...#.br
+...#call it at appropriate places
+...#.br
+...#change // to /* comment
+...#.br
+...#add hmsize to account for alloc.c changes
+...#.br
+...#T}
+...#/interp/keyring.c T{
+...#ensure keyring i/o functions return "failure" as intended (improves diagnostic of login/logind)
+...#.br
+...#T}
+...#/interp/math.c T{
+...#min -> minimum (to avoid clash with C macro)
+...#.br
+...#T}
+...#/interp/runt.c T{
+...#check for nil arrays in utf functions, and negative offsets
+...#.br
+...#T}
+...#.TE
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/interp/string.c T{
+...#msize -> hmsize
+...#.br
+...#T}
+...#/interp/tk.c T{
+...#cursor switching between apps
+...#.br
+...#T}
+...#/interp/validstk.c T{
+...#msize -> hmsize
+...#.br
+...#T}
+...#/interp/xec.c T{
+...#check that slice offset isn't negative
+...#.br
+...#T}
+...#/kern T{
+...#new function utfecpy, added to directory and mkfile
+...#.br
+...#strchr, strrchr argument -> int not char
+...#.br
+...#T}
+...#/kfs T{
+...#replace DEBUG by KFSDEBUG
+...#.br
+...#remove obsolete malloc definition
+...#.br
+...#ensure HFILES includes emu dat.h and fns.h
+...#.br
+...#T}
+...#/lib9/errstr-* T{
+...#add support for buffer limit to oserrstr
+...#.br
+...#T}
+...#/lib9 T{
+...#exits argument shouldn't be const
+...#.br
+...#add utfecpy
+...#.br
+...#remove log-* and logging stuff from print.c
+...#.br
+...#remove printcol from doprint.c
+...#.br
+...#T}
+...#/man/1/gettar T{
+...#document lstar's format
+...#.br
+...#T}
+...#/man/* T{
+...#extraneous/incorrect cross-references corrected
+...#.br
+...#T}
+...#/usr/inferno/mkfile T{
+...#-Inferno -> -Plan9 for several things
+...#.br
+...#(because of HOSTMODEL/TARGMODEL change)
+...#.br
+...#mkfile-Plan9-* TARGMODEL -> Plan9 not Inferno
+...#.br
+...#T}
+...#/os/ip/ip.c T{
+...#adjust length after options stripped
+...#.br
+...#T}
+...#/os/ip/kernel.h T{
+...#strrchr char -> int
+...#.br
+...#T}
+...#/os/port/alloc.c T{
+...#int -> ulong in sizes
+...#.br
+...#same memory allocation conventions as /emu/alloc.c (re sizing, quanta)
+...#.br
+...#strip last aspects of audit
+...#.br
+...#other changes similar to /emu/alloc.c
+...#.br
+...#T}
+...#/appl/acme T{
+...#raise an non-nil exception, not empty string
+...#.br
+...#T}
+...#/appl/charon T{
+...#error when no window manager running
+...#.br
+...#cookies code being added/improved
+...#.br
+...#java script fixes/enhancements
+...#.br
+...#T}
+...#/appl/cmd/install/install.b global is default now
+...#/appl/cmd/install/wrap.b extra package name check
+...#/appl/cmd/sh code tidy up
+...#/appl/ebook ongoing changes
+...#/appl/lib/dis.b offset from fp to mp fix
+...#/appl/lib/dividers.b extra bind command to fix dividers issue
+...#/appl/lib/ecmascript.b T{
+...#parsing of '/' improved (division or start of
+...#regular expression)
+...#.br
+...#T}
+...#/appl/lib/parseman.b extra troff commands accounted for
+...#/appl/lib/profile.b bind of #P only if needed
+...#/appl/lib/translate.b hash code fix
+...#/appl/wm/c4.b evaluation function improvement
+...#/appl/wm/pen.b namechan() call moved to fix bug
+...#/appl/wm/polyhedra.b cosmetic changes
+...#/appl/wm/readmail.b slight code improvements
+...#/appl/wm/sendmail.b ditto
+...#/man/1/ebook text improvements
+...#/man/1/sh-std ditto
+...#/man/1/sum sha1sum added
+...#/man/2/keyring hmac_sha1, hmac_md5 added
+...#/man/3/kprof slight rewrite
+...#/man/4/export 9export added
+...#/man/6/translate new BUGS section comment
+...#/man/9/button padx, pady removed
+...#/man/9/canvas added grab commands
+...#/module/keyring.m added hmac routines
+...#/module/sys.m added DM* constants for future use
+...#/emu/devcon.c T{
+...#caphash and capuse files added for future
+...#development
+...#.br
+...#T}
+...#/emu/dis.c added Enomem check to broken progs
+...#/emu/exception.c added couple of extra string checks
+...#/emu/exportfs.c T{
+...#nexterror() replaced by return to give
+...#better error recovery
+...#.br
+...#T}
+...#/emu/os-Nt.c prints on console when ran out of kernel processes
+...#image/bezier.c getbezsplinepts() added
+...#interp/comp-arm.c headm bug fix, removed dodgy code
+...#interp/comp-power.c removed dodgy code
+...#interp/comp-thumb.c headm bug fix, removed dodgy code
+...#interp/keyring.c T{
+...#generalization of certain keyring functions
+...#.br
+...#addition of hmac routines
+...#.br
+...#T}
+...#interp/tk.c tkcursorcmd() removed temporarily
+...#/os/ip/devip.c ipremove() replaced by devremove()
+...#/os/ip/ip.c fragoff now a ulong
+...#/os/ip/tcp.c extra safety checks
+...#/os/port/devcons.c caphash and capuse files added
+...#/os/port/devkprof.c general improvements
+...#/os/port/dis.c Enomem check on broken progs
+...#/os/port/exception.c added couple of extra string checks
+...#/os/port/exportfs.c T{
+...#nexterror() replaced by return to give
+...#better error recovery
+...#.br
+...#check against correct file offset when reading directories
+...#.br
+...#T}
+...#/os/port/utils.c parsefields(), stroll() removed
+...#/tk/canvs.c T{
+...#extensions to tk: grab set tag, grab release tag,
+...#.br
+...#grab ifunset tag
+...#.br
+...#T}
+...#/tk/ctext.c T{
+...#text widget tag highlight fix
+...#text widget tag index fix
+...#.br
+...#T}
+...#/tk/entry.c borderwidth default to 0 in entry widget
+...#/tk/menus.c menu button release fix (off by the borderwidth bug)
+...#/tk/scrol.c T{
+...#scrollbar selection fixes (off by 1 bugs)
+...#.br
+...#autorepeat code added but disabled
+...#.br
+...#T}
+...#/tk/utils.c tkinsidepoly() function
+...#/tk/xdata.c unused globals removed
+...#/tk/* T{
+...#draw methods take extra parameter
+...#.br
+...#hit methods added
+...#.br
+...#tkcfirsttag(), tkclasttag() fixes
+...#.br
+...#tkrunpack() argument type change
+...#.br
+...#T}
+...#.TE
binary files /dev/null b/doc/20020628.pdf differ
--- /dev/null
+++ b/doc/20020715.ms
@@ -1,0 +1,1033 @@
+.TL
+Inferno 3rd Edition \- 15 July 2002 Experimental Update
+.br
+Release Notes
+.AI
+Vita Nuova
+support@vitanuova.com
+.br
+15 July 2002
+.SP 4
+.NH 1
+Base for update
+.LP
+This experimental set of updates applies to the 18 June 2001 base.
+The installed software must therefore be that of 18 June 2001, whether installed directly,
+or the result of updating the original July 2000 release with update packages to the
+18 June 2001 level.
+This set of updates can be applied, however, whether or not the previous updates
+of 3 October 2001 and 28 June 2002 were installed; this package includes all those changes too.
+.NH 1
+Contents
+.LP
+This update offers the following, compared to the 28 June 2002 update.
+.LP
+The main change is that the Plan 9 hosted implementation supports Plan 9 Fourth Edition.
+This has affected the portability interface for both hosted and native software.
+That, and the introduction of `lexical names' into
+.I emu
+cause this update to be classified as `experimental'.
+.LP
+The following sections provide short
+summaries of the more interesting or important changes.
+There are many more minor or cosmetic changes.
+.SH
+.I "New commands and modules"
+.SH
+.I "Repairs and changes"
+.IP \(bu
+.CW cp
+has been reworked extensively; amongst other things, it can now safely traverse
+arbitrary name spaces without becoming confused by
+complex mounts.
+.IP \(bu
+.I Listen (1)
+and
+.CW dial
+both accept a
+.CW -A
+option that causes them to authenticate a connection.
+.IP \(bu
+.I Diff (1)
+compares binary files and also does not quit (during recursive diffs) when it finds
+a file it cannot open.
+.IP \(bu
+.I Ls (1)
+implements the
+.CW -u
+and
+.CW -t
+sorting operations properly.
+A new
+.CW -e
+option prints the time as a decimal integer (seconds since the system Epoch).
+.IP \(bu
+.I Sh
+regards all of `../', `./', and `#' at the front of a command name as a request to use the name as-is.
+.IP \(bu
+.I Tail
+has been changed to avoid some boundary cases.
+.IP \(bu
+.I Touch
+uses standard argument processing and returns an error exit status if it fails.
+.IP \(bu
+.I Zeros
+accepts new options
+.CW -r
+to generate random bytes
+and
+.CW -v
+to produce blocks with a given value.
+.IP \(bu
+The network games software in
+.CW /appl/games
+uses port 6660 for its signer, not the standard Inferno signer's port.
+.IP \(bu
+.I Charon :
+disguises itself as Mozilla to satisfy agent-name probes; retries fetches on error; supports multiple windows;
+detects HTML vs plain text correctly; has many Javascript fixes, including fixing a day one bug
+in operator precedence (!).
+In general, it
+has undergone extensive changes,
+particularly to the cookies and Javascript implementation.
+It also insists that it be run under the window manager
+.I wm (1).
+It can be set to plumb schemes that it does not implement internally.
+.IP \(bu
+The installation software now uses the numeric time as-is to name output packages;
+the temporary fix introduced in October of converting leading
+.CW 10
+to
+.CW A
+has been removed.
+.IP \(bu
+.CW /appl/lib/dis.b
+prints offsets from
+.CW MP
+correctly (it previously printed them
+as offsets from
+.CW FP ).
+.IP \(bu
+.CW /appl/lib/parseman.b
+accounts for enough extra
+.I troff
+commands to display
+.I man (6)
+correctly.
+.IP \(bu
+.CW /appl/lib/profile.b
+binds
+.CW #P
+only if needed.
+.IP \(bu
+.CW /appl/lib/translate.b
+.I translate (2)) (
+now computes the right hash value.
+.IP \(bu
+Several run-time checks that were missing or incomplete for array access and slicing have been added.
+.IP \(bu
+.CW lib/isrv
+no longer starts a new shell.
+.IP \(bu
+.I Wm-ftree (1)
+has several new options:
+.CW -E ,
+.CW -p
+and
+.CW -d .
+.IP \(bu
+.CW wm/rt
+now correctly updates stack size (if set).
+.IP \(bu
+.I Xd (1)
+correctly implements
+.CW -r
+to suppress repeated lines.
+.IP \(bu
+.I Gettar (1)
+documents the
+.B lstar
+output format.
+.IP \(bu
+.I Man (6)
+describes the manual page documentation format.
+.SH
+.I "Interface changes and extensions"
+.IP \(bu
+.I Security-password (2)
+provides a new entry
+.CW setpwfile
+to allow the default name
+.CW /keydb/password
+to be changed.
+.IP \(bu
+.I Dossrv (4)
+has changed extensively, to support rename and long names in FAT format correctly.
+It uses a smaller number of IO buffers.
+.IP \(bu
+.I Createsignerkey (8)
+has a completely different invocation, documented in that manual page,
+with a conventional option structure replacing a rather idiosyncratic chain
+of optional arguments.
+The old syntax is still quietly supported for the time being, but only
+for the benefit of any shell scripts that might so use it; if you have any,
+please change them to use the new syntax before the old usage finally vanishes in a later release.
+.IP \(bu
+.I Sum (1)
+documents the new
+.CW sha1sum
+command.
+.IP \(bu
+.I Bufio (2)
+returns an error if a seek fails rather than stopping the process(!).
+.IP \(bu
+.CW /lib/convcs
+has several more character sets:
+.CW koi8-r ,
+.CW windows-1250
+and
+.CW windows-1252 .
+.IP \(bu
+.I Keyring-sha (2)
+documents two new functions,
+.CW hmac_sha1
+and
+.CW hmac_md5 ,
+which are keyed versions of the corresponding secure hashing functions,
+as defined by RFC2104.
+The new definitions are in
+.CW /module/keyring.m .
+.IP \(bu
+.I Keyring-getstring (2)
+makes it clear that it does not
+.I provide
+delimited I/O but rather
+.I requires
+it (eg, as provided by
+.I ssl (3))
+for successful operation.
+.IP \(bu
+.I Security-ssl (2)
+no longer requires
+.CW #D
+to be bound into the name space (since it can only be used locally in any case).
+Consequently,
+.CW bind
+calls have been removed from several modules that used SSL.
+.IP \(bu
+.CW /appl/lib/crypt/ssl3.b
+has several critical bug fixes, allowing secure connections to work correctly in Charon;
+an important bug fix was also made to
+.CW /appl/lib/crypt/x509.b .
+.IP \(bu
+.I Imageremap
+has been changed to allow concurrent use.
+.IP \(bu
+.I Translate (6)
+and
+.CW /appl/lib/translate.b
+have changed as required to put the locale-specific dictionaries in directory
+\f(CW/locale/\fP\fIlocale\f(CW/dict\fP;
+the directory for a chosen
+.I locale
+is then normally bound onto
+.CW /locale/dict ,
+where
+.CW translate
+looks by default.
+.IP \(bu
+The documentation for
+.I button (9)
+no longer claims that
+.CW -padx
+and
+.CW -pady
+are supported options for that widget.
+.IP \(bu
+.CW sys.m
+has some new constants defined for use by a later Styx revision.
+.SH
+.I "Tk changes and extensions"
+.IP \(bu
+The canvas code provides extensions to Tk:
+.CW "grab set tag" ,
+.CW "grab release tag" ,
+and
+.CW "grab ifunset tag" .
+See
+.I canvas (9)
+for details.
+.IP \(bu
+Also in canvases, object-specific hit tests have been added, rather than using a bounding box
+in most cases.
+.IP \(bu
+Borderwidth defaults to zero in the entry widget.
+.IP \(bu
+Tk multiplexes the cursor amongst top-level windows.
+.IP \(bu
+Text and other items selected use foreground/background colours.
+.SH
+.I "Limbo compiler"
+.IP \(bu
+Constant tuple and adt values can be used as values in
+.CW con
+constant declarations.
+.IP \(bu
+In an array value, the use of reference values for different
+.CW pick
+alternatives of the same adt no longer draws a diagnostic
+but causes the array value to be an array of
+.CW ref
+to the adt, not a particular pick alternative.
+.IP \(bu
+The C language stub declarations support
+.CW pick
+adts.
+.IP \(bu
+The alignment of
+.CW big
+and
+.CW real
+is now correct in C stubs.
+.IP \(bu
+The string escape
+.CW \ef
+is form-feed (to simplify translation of C programs).
+.SH
+.I "Compilers and utilities"
+.IP \(bu
+.I Iar 's
+source directory has been renamed from
+.CW ar
+to
+.CW iar .
+.IP \(bu
+.CW mk
+understands the long-name table in Windows' archive files, preventing
+spurious out-of-date status and other confusing results in builds.
+.IP \(bu
+.CW sqz
+and
+.CW zqs
+can compress larger ARM and PowerPC executables.
+.IP \(bu
+.CW kprof
+uses a table now provided by
+.CW /dev/kprof
+to provide profiling results to the resolution selected by a given platform.
+.SH
+.I "Hosted and Native Inferno"
+.LP
+Changes common to hosted and native kernels:
+.IP \(bu
+The
+.CW HOSTMODEL
+and
+.CW TARGMODEL
+for Plan 9 are now
+.CW Plan9
+not, rather confusingly,
+.CW Inferno .
+A good few source files have been renamed accordingly.
+.IP \(bu
+The memory allocation functions have been modified slightly to support the addition of
+memory monitoring and profiling.
+Some functions have also been added to the C library supporting
+.I emu
+and the kernels,
+with an eye to starting the revision of Styx, based on the current 9P2000.
+Consequently, the
+include file
+.CW lib9.h
+for all platforms has been modified:
+to add new functions such as
+.CW setmalloctag ;
+to make the types of parameters to the
+.CW malloc
+calls uniform (and reflect the documentation, as it happens),
+so that sizes are always unsigned;
+to change the type of a parameter to
+.CW strchr
+and
+.CW strrchr
+from
+.CW char
+to
+.CW int ;
+to remove obsolete definitions such as
+.CW UMFILE
+and
+.CW UMDIR ;
+and to add new functions for use inside the kernels.
+Several new functions are strictly for internal use
+(and thus might change in future), and
+they have not been added to section 10.
+One exception is the new function
+.CW tokenize ,
+documented in
+.I getfields (10.2)
+and used by
+.I parsecmd (10.2).
+.I Getfields
+replaces the function called
+.CW parsefields ,
+which has been removed, and all calls to it changed to use
+.CW getfields .
+The higher aim of some of these otherwise fussy changes is
+to work towards making Inferno and Plan 9 drivers interchangeable
+(as best we can).
+.IP \(bu
+To help decouple authentication methods from identity setting, two new files
+.CW caphash
+and
+.CW capuse
+have been added to
+.CW #c
+(a temporary location for them), supported by code in
+.CW auth.c
+.CW /os/port/auth.c ). (
+Their use will soon replace the
+.CW setid
+functionality of
+.I keyring-auth .
+They will not be documented until then.
+.IP \(bu
+.CW /emu/exception.c
+and
+.CW /os/port/exception.c
+no longer trap the use of the
+.CW exit
+statement (by accident of implementation).
+.IP \(bu
+.CW /emu/dis.c
+and
+.CW /os/port/dis.c
+do not leave a process in Broken state when it
+receives an exception because a kernel memory allocation failed.
+.IP \(bu
+A day one bug in the JIT compilation of case statements has been fixed;
+it caused a degenerate form of
+.CW case
+(with only a default arm) to be compiled incorrectly.
+.IP \(bu
+The 386 JIT now allows rescheduling, and scheduling is finer grained on the ARM.
+.IP \(bu
+ARM code generation produces correct code for some list accesses that were previously wrong
+(the Dis operator HEADM).
+.IP \(bu
+.I Emu
+has the following changes specific to it:
+.RS
+.IP \-
+The mouse event queue in
+.CW devcon.c
+is now a circular queue without locks as in the native kernel;
+this prevents a scheduling problem under Linux.
+.IP \-
+The
+.CW READSTR
+constant defined by the native kernels is now also defined by
+.CW emu .
+.IP \-
+The function previously called
+.CW rendezvous
+is now called
+.CW erendezvous
+to avoid a type clash with a library function on Plan 9
+(its `rendezvous' function takes an unsigned long as first parameter, not
+.CW void* ).
+.CW lib9.h
+has changed accordingly where required.
+.IP \-
+.CW devcon.c
+acquires
+.CW caphash
+and
+.CW capuse .
+.IP \-
+A new file
+.CW auth.c
+must be included in every build.
+.IP \-
+Several drivers in
+.CW emu
+have been changed to use
+.RS
+.P1
+ switch((ulong)c->qid.path ...)
+.P2
+.RE
+to force 32-bit operations to be used when
+.CW path
+eventually becomes 64 bits.
+.IP \-
+Also in
+.CW emu
+the function
+.CW oserrstr
+now takes a buffer length, rather than assuming
+.CW ERRLEN .
+.IP \-
+Obsolete code for time and directory mode conversions (!) has been removed from
+.CW os-*.c .
+.IP \-
+Obsolete code for `daemonize' has been removed from
+.CW os-*.c
+and
+.CW lib9 .
+.RE
+.IP \(bu
+For the native kernels:
+.RS
+.IP \-
+.CW /os/port/devprof.c
+has been added but is not yet supported (it is not configured into any kernel,
+so no existing kernels are affected by its presence).
+.IP \-
+.CW /os/port/devcons.c
+has
+.CW caphash
+and
+.CW capuse .
+.IP \-
+Every kernel configuration file must now include
+.CW auth
+in the
+.CW port
+section to include
+.CW /os/port/auth.c .
+.RE
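The lock-free circular queue mentioned above for the mouse events in `devcon.c` might look roughly like the following. It is a sketch of the general single-producer, single-consumer technique and not the actual `devcon.c` code: `Mevent`, `Mqueue`, `Nevent`, `mqput` and `mqget` are all hypothetical names. The scheme relies on each index being written by one side only; on a modern multiprocessor it would also need memory barriers or C11 atomics, which are omitted here.

```c
#include <assert.h>

enum { Nevent = 8 };	/* hypothetical size; one slot always stays empty */

typedef struct Mevent { int x, y, buttons; } Mevent;

typedef struct Mqueue {
	Mevent	ev[Nevent];
	volatile int	wi;	/* next slot to write; changed only by the producer */
	volatile int	ri;	/* next slot to read; changed only by the consumer */
} Mqueue;

/* Producer side: returns 1 if the event was queued, 0 if the queue was full. */
static int
mqput(Mqueue *q, Mevent e)
{
	int next;

	next = (q->wi + 1) % Nevent;
	if(next == q->ri)
		return 0;	/* full: drop the event rather than block */
	q->ev[q->wi] = e;
	q->wi = next;		/* publish only after the slot is filled */
	return 1;
}

/* Consumer side: returns 1 and fills *e if an event was waiting, else 0. */
static int
mqget(Mqueue *q, Mevent *e)
{
	if(q->ri == q->wi)
		return 0;	/* empty */
	*e = q->ev[q->ri];
	q->ri = (q->ri + 1) % Nevent;
	return 1;
}
```

Dropping events when the queue is full is one possible policy for a producer that must not block; since neither side ever waits on a lock held by the other, there is no lock for the host scheduler to mishandle.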
+...#.bp
+...#.NH 1
+...#Description by source file
+...#.LP
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/appl/charon T{
+...#plumbs schemes that aren't built-in, if on an authorised list
+...#.br
+...#retries on error (but not POST)
+...#.br
+...#identifies itself as Mozilla to pass agent-name tests.
+...#.br
+...#bug fix: doesn't loop (exhausting memory) if a file ends with bad data.
+...#.br
+...#private version of Url
+...#.br
+...#CCI removed
+...#.br
+...#charon_code and charon_guiwm removed
+...#.br
+...#multiple windows
+...#.br
+...#detects HTML vs text correctly
+...#.br
+...#separate layout and gui display
+...#.br
+...#T}
+...#/appl/cmd/diff.b T{
+...#compare binary files as documented
+...#.br
+...#don't quit if files can't be opened
+...#.br
+...#T}
+...#/appl/cmd/strings.b T{
+...#new command
+...#.br
+...#T}
+...#/appl/cmd/sh T{
+...#../ / ./ and # starting a (command) name all cause it to be treated as absolute
+...#.br
+...#T}
+...#/appl/lib/auth.b T{
+...#allow nobody even if setid is 0 provided it appears in the algorithm list
+...#.br
+...#don't bind #D
+...#.br
+...#T}
+...#/appl/lib/createsignerkey.b T{
+...#arguments
+...#.br
+...#don't bother with random
+...#.br
+...#error status
+...#.br
+...#T}
+...#.TE
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/appl/lib/crypt/ssl3.b T{
+...#delete session id on alert
+...#.br
+...#T}
+...#/appl/lib/ecmascript T{
+...#correct precedence for operators
+...#.br
+...#delete HTML comments
+...#.br
+...#T}
+...#/appl/lib/isrv.b T{
+...#don't start a new shell
+...#.br
+...#T}
+...#/appl/lib/login.b T{
+...#don't bind #D
+...#.br
+...#T}
+...#/appl/lib/logind.b T{
+...#don't bind #D
+...#.br
+...#don't load (unused) Random
+...#.br
+...#minor internal changes.
+...#.br
+...#T}
+...#/appl/lib/profile.b T{
+...#memory profiling
+...#.br
+...#T}
+...#/appl/lib/srv.b T{
+...#be sure to FORKFD so file descriptors don't accumulate in parent
+...#.br
+...#T}
+...#/appl/lib/ssl.b T{
+...#don't require #D to be bound since it can only be used locally
+...#.br
+...#T}
+...#/appl/lib/usb/* T{
+...#see TI925 release
+...#.br
+...#T}
+...#/appl/lib/wmlib.b T{
+...#calculate correct size of file browser (eg when full screen)
+...#.br
+...#don't mess with the cursor
+...#.br
+...#T}
+...#/appl/minicharon T{
+...#moan if no window manager
+...#.br
+...#don't trap if module not yet loaded
+...#.br
+...#T}
+...#/appl/wm/bounce.b
+...#/appl/wm/man.b
+...#/appl/wm/mand.b
+...#/appl/wm/polyhedra.b
+...#/appl/wm/wm.b T{
+...#correct window size in full screen mode
+...#.br
+...#T}
+...#/appl/wm/reversi.b T{
+...#use fittoscreen(0)
+...#.br
+...#T}
+...#/appl/wm/polyhedra.b T{
+...#sys->sleep(0) to yield
+...#.br
+...#T}
+...#/man/2/security-ssl T{
+...#bind not required
+...#.br
+...#conventions documented accurately
+...#.br
+...#T}
+...#/man/2/keyring-getstring T{
+...#makes it clear that it doesn't PROVIDE delimited i/o, but EXPECTS it (eg, via ssl(3))
+...#.br
+...#T}
+...#.TE
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/crypt/jacobi.c T{
+...#add missing return statement
+...#.br
+...#T}
+...#/emu/alloc.c T{
+...#changes for monitoring
+...#.br
+...#long for size throughout
+...#.br
+...#poolrealloc, now used by malloc
+...#.br
+...#poolmsize
+...#.br
+...#malloc and realloc tagging data with pc of allocation
+...#.br
+...#ud -> lud etc
+...#.br
+...#T}
+...#/emu/chan.c T{
+...#space isn't frog, experimentally
+...#.br
+...#T}
+...#/emu/dat.h T{
+...#READSTR definition
+...#.br
+...#Rept definition (will change)
+...#.br
+...#add BusyGC
+...#.br
+...#remove support for %N
+...#.br
+...#T}
+...#/emu T{
+...#oserrstr takes buffer size (consequential changes throughout)
+...#.br
+...#no %N
+...#.br
+...#Plan 9's HOSTMODEL -> Plan9 not Inferno (!)
+...#.br
+...#msize -> hmsize in some cases
+...#.br
+...#rendezvous -> erendezvous
+...#.br
+...#T}
+...#/emu/devcon.c T{
+...#no %N, Nconv
+...#.br
+...#remove debugging memout file
+...#.br
+...#use of (ulong)c->qid.path ...
+...#.br
+...#remove logmsg calls
+...#.br
+...#T}
+...#/emu/devprof.c T{
+...#memory profiling
+...#.br
+...#T}
+...#/emu/devprog.c T{
+...#msize -> hmsize because malloc and heap addresses are different
+...#.br
+...#T}
+...#/emu/dis.c T{
+...#instrument garbage collections
+...#.br
+...#force periodic garbage collection passes when not idle (BusyGC)
+...#.br
+...#T}
+...#/emu/discall.c T{
+...#tag QLock structures with owner
+...#.br
+...#T}
+...#/emu/fns.h T{
+...#oserrstr definition
+...#.br
+...#obsolete gsleep defn removed
+...#.br
+...#T}
+...#/emu/main.c T{
+...#obsolete gsleep defn removed
+...#.br
+...#T}
+...#.TE
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/emu/os-* T{
+...#dflag becomes simply don't enable graphics terminal (ie, save/restore tty state)
+...#.br
+...#daemonize calls removed [do it from outside]
+...#.br
+...#rendezvous -> erendezvous
+...#.br
+...#some files had obsolete timeconv and dirmodconv code, now removed
+...#.br
+...#T}
+...#/emu/proc.c T{
+...#provisional rpt code (will change, will move)
+...#.br
+...#T}
+...#/emu/vlrt-Nt.c T{
+...#use dat.h not local definitions
+...#.br
+...#T}
+...#/include/interp.h T{
+...#force HEAP_ALIGN to provide extra cell in heap header for heap profiling
+...#.br
+...#new functions: heapmonitor, hmsize
+...#.br
+...#utfnlen removed (local to interp/runt.c)
+...#.br
+...#T}
+...#/include/pool.h T{
+...#sizes are now unsigned
+...#.br
+...#poolcompact defn, new poolmsize, poolrealloc
+...#.br
+...#T}
+...#/include/tk.h T{
+...#extra state to control cursor
+...#.br
+...#T}
+...#/interp/comp-386.c T{
+...#implement and enable interpreter rescheduling when JIT enabled
+...#.br
+...#T}
+...#/interp/comp-arm.c
+...#/interp/comp-thumb.c T{
+...#change arm rescheduling check to be similar to 386
+...#.br
+...#T}
+...#/interp/gc.c T{
+...#instrument garbage collector
+...#.br
+...#add heapmonitor hook for devmem.c
+...#.br
+...#T}
+...#/interp/heap.c T{
+...#define heapmonitor hook (default: nil)
+...#.br
+...#call it at appropriate places
+...#.br
+...#change // to /* comment
+...#.br
+...#add hmsize to account for alloc.c changes
+...#.br
+...#T}
+...#/interp/keyring.c T{
+...#ensure keyring i/o functions return "failure" as intended (improves diagnostic of login/logind)
+...#.br
+...#T}
+...#/interp/math.c T{
+...#min -> minimum (to avoid clash with C macro)
+...#.br
+...#T}
+...#/interp/runt.c T{
+...#check for nil arrays in utf functions, and negative offsets
+...#.br
+...#T}
+...#.TE
+...#.TS
+...#lf(CW)w(2.5i) lf(R)w(4i) .
+...#/interp/string.c T{
+...#msize -> hmsize
+...#.br
+...#T}
+...#/interp/tk.c T{
+...#cursor switching between apps
+...#.br
+...#T}
+...#/interp/validstk.c T{
+...#msize -> hmsize
+...#.br
+...#T}
+...#/interp/xec.c T{
+...#check that slice offset isn't negative
+...#.br
+...#T}
+...#/kern T{
+...#new function utfecpy, added to directory and mkfile
+...#.br
+...#strchr, strrchr argument -> int not char
+...#.br
+...#T}
+...#/kfs T{
+...#replace DEBUG by KFSDEBUG
+...#.br
+...#remove obsolete malloc definition
+...#.br
+...#ensure HFILES includes emu dat.h and fns.h
+...#.br
+...#T}
+...#/lib9/errstr-* T{
+...#add support for buffer limit to oserrstr
+...#.br
+...#T}
+...#/lib9 T{
+...#exits argument shouldn't be const
+...#.br
+...#add utfecpy
+...#.br
+...#remove log-* and logging stuff from print.c
+...#.br
+...#remove printcol from doprint.c
+...#.br
+...#T}
+...#/man/1/gettar T{
+...#document lstar's format
+...#.br
+...#T}
+...#/man/* T{
+...#extraneous/incorrect cross-references corrected
+...#.br
+...#T}
+...#/usr/inferno/mkfile T{
+...#-Inferno -> -Plan9 for several things
+...#.br
+...#(because of HOSTMODEL/TARGMODEL change)
+...#.br
+...#mkfile-Plan9-* TARGMODEL -> Plan9 not Inferno
+...#.br
+...#T}
+...#/os/ip/ip.c T{
+...#adjust length after options stripped
+...#.br
+...#T}
+...#/os/ip/kernel.h T{
+...#strrchr char -> int
+...#.br
+...#T}
+...#/os/port/alloc.c T{
+...#int -> ulong in sizes
+...#.br
+...#same memory allocation conventions as /emu/alloc.c (re sizing, quanta)
+...#.br
+...#strip last aspects of audit
+...#.br
+...#other changes similar to /emu/alloc.c
+...#.br
+...#T}
+...#/appl/acme T{
+...#raise a non-nil exception, not empty string
+...#.br
+...#T}
+...#/appl/charon T{
+...#error when no window manager running
+...#.br
+...#cookies code being added/improved
+...#.br
+...#JavaScript fixes/enhancements
+...#.br
+...#T}
+...#/appl/cmd/install/install.b global is default now
+...#/appl/cmd/install/wrap.b extra package name check
+...#/appl/cmd/sh code tidy up
+...#/appl/ebook ongoing changes
+...#/appl/lib/dis.b offset from fp to mp fix
+...#/appl/lib/dividers.b extra bind command to fix dividers issue
+...#/appl/lib/ecmascript.b T{
+...#parsing of '/' improved (division or start of
+...#regular expression)
+...#.br
+...#T}
+...#/appl/lib/parseman.b extra troff commands accounted for
+...#/appl/lib/profile.b bind of #P only if needed
+...#/appl/lib/translate.b hash code fix
+...#/appl/wm/c4.b evaluation function improvement
+...#/appl/wm/pen.b namechan() call moved to fix bug
+...#/appl/wm/polyhedra.b cosmetic changes
+...#/appl/wm/readmail.b slight code improvements
+...#/appl/wm/sendmail.b ditto
+...#/man/1/ebook text improvements
+...#/man/1/sh-std ditto
+...#/man/1/sum sha1sum added
+...#/man/2/keyring hmac_sha1, hmac_md5 added
+...#/man/3/kprof slight rewrite
+...#/man/4/export 9export added
+...#/man/6/translate new BUGS section comment
+...#/man/9/button padx, pady removed
+...#/man/9/canvas added grab commands
+...#/module/keyring.m added hmac routines
+...#/module/sys.m added DM* constants for future use
+...#/emu/devcon.c T{
+...#caphash and capuse files added for future
+...#development
+...#.br
+...#T}
+...#/emu/dis.c added Enomem check to broken progs
+...#/emu/exception.c added couple of extra string checks
+...#/emu/exportfs.c T{
+...#nexterror() replaced by return to give
+...#better error recovery
+...#.br
+...#T}
+...#/emu/os-Nt.c prints on console when ran out of kernel processes
+...#image/bezier.c getbezsplinepts() added
+...#interp/comp-arm.c headm bug fix, removed dodgy code
+...#interp/comp-power.c removed dodgy code
+...#interp/comp-thumb.c headm bug fix, removed dodgy code
+...#interp/keyring.c T{
+...#generalization of certain keyring functions
+...#.br
+...#addition of hmac routines
+...#.br
+...#T}
+...#interp/tk.c tkcursorcmd() removed temporarily
+...#/os/ip/devip.c ipremove() replaced by devremove()
+...#/os/ip/ip.c fragoff now a ulong
+...#/os/ip/tcp.c extra safety checks
+...#/os/port/devcons.c caphash and capuse files added
+...#/os/port/devkprof.c general improvements
+...#/os/port/dis.c Enomem check on broken progs
+...#/os/port/exception.c added couple of extra string checks
+...#/os/port/exportfs.c T{
+...#nexterror() replaced by return to give
+...#better error recovery
+...#.br
+...#check against correct file offset when reading directories
+...#.br
+...#T}
+...#/os/port/utils.c parsefields(), stroll() removed
+...#/tk/canvs.c T{
+...#extensions to tk: grab set tag, grab release tag,
+...#.br
+...#grab ifunset tag
+...#.br
+...#T}
+...#/tk/ctext.c T{
+...#text widget tag highlight fix
+...#.br
+...#text widget tag index fix
+...#.br
+...#T}
+...#/tk/entry.c borderwidth default to 0 in entry widget
+...#/tk/menus.c menu button release fix (off by the borderwidth bug)
+...#/tk/scrol.c T{
+...#scrollbar selection fixes (off by 1 bugs)
+...#.br
+...#autorepeat code added but disabled
+...#.br
+...#T}
+...#/tk/utils.c tkinsidepoly() function
+...#/tk/xdata.c unused globals removed
+...#/tk/* T{
+...#draw methods take extra parameter
+...#.br
+...#hit methods added
+...#.br
+...#tkcfirsttag(), tkclasttag() fixes
+...#.br
+...#tkrunpack() argument type change
+...#.br
+...#T}
+...#.TE
+.ig
+lib9.h all changed
+Storeinc and IEEE FP parameters default in math/dtoa.c
+Fconv -> Fmt
+doprint -> vseprint
+errstr -> add int size
+icossin, icossin2 -> image.h
+with ICOSSCALE
+Plan9 hosted include files different structure:
+Dir9p1 and Qid9p1
+#define Dir Dir9p1
+Fourth Edition system call interface
+except for Dir* functions
+under ifdef Inferno4, Qid -> Qid9p1, dirstat -> v3dirstat, etc.
+and those map
+Dir.length -> vlong
+Plan9 hosted include files quite different.
+#endif
+Styx module (styx(2)), dossrv, cdfs, acme all use it
+getcallerpc-$SYSTARG-$OBJTYPE.$O
+getwd-posix.$O
+lock-*.$O
+some types long -> int, some int ->long
+all conversion functions change
+detachscreen
+null if-else body
+main should call quotefmtinstall()
+..
+pc: draw screen; screen.$O removed from mkfile
+pc/mouse.c -> pc/ps2mouse.c
+pc config files updated to new ip stack
+dbg references removed/commented out.
+ether2114x driver provided
+env added
+fault removed
+emu and port print.c
+bug in native/hosted directory reading
+lexical names
+pctl etc more efficient for non-blocking ones
+solaris sets sa_handler not sa_sigaction for sigILL
--- /dev/null
+++ b/doc/INSTALL1.ms
@@ -1,0 +1,148 @@
+.\" this is an extract of port.ms: change that too if needed
+.pl 9999
+.SH
+Installing hosted Inferno from source
+.SH
+Overview
+.PP
+Like the native kernels,
+.CW emu
+relies on several auxiliary libraries (the source of which
+it often shares with the native kernels).
+Emu itself is built by the
+.CW mkfile
+in the
+.CW emu
+subdirectory containing the platform-specific source for the host platform.
+Each library has its own
+.CW mkfile ;
+the various components are made in the right order by the
+.CW mkfile
+at the root of the Inferno tree.
+The
+.CW mkfile
+for each platform will also invoke
+.CW mk
+recursively to make the appropriate libraries
+for a given configuration.
+.PP
+The Unix variants of emu are generally covered by `POSIX' (with common extensions),
+but one file differs considerably from port to port,
+namely \f5emu/\fP\fIplatform\fP\f5/os.c\fP; the differences
+reflect the different ways of implementing kernel-scheduled
+threads efficiently under Unix.
+.PP
+There are working emu versions
+for
+FreeBSD/386,
+Irix/mips,
+Linux/386,
+NetBSD/386,
+MacOSX/386,
+MacOSX/power,
+Plan 9,
+Solaris/sparc,
+and Windows (NT, 2000 and Explorer plug-in).
+Each platform typically uses mechanisms specific to the host operating
+system to implement Inferno's internal thread/process structure.
+POSIX threads have been found to be inadequate (poorly implemented)
+on some platforms, and are avoided there.
+See
+.CW kproc
+in
+.CW emu/*/os.c .
+.PP
+Source is included for ports to HP/UX (S800 architecture),
+Solaris/386, and Unixware, in case someone wishes to take them up now,
+but we have not determined their fitness.
+.PP
+The Plan 9 hosted implementation is unusual in that it supports
+several processor types:
+.CW 386 ,
+.CW mips ,
+.CW power
+(Power PC)
+and
+.CW sparc .
+Furthermore, all versions of
+.CW emu
+can be built on any processor type, in the usual way for Plan 9.
+.PP
+Otherwise, as distributed,
+.CW emu
+for a platform can only be built when running on that platform.
+.PP
+One unusual variant makes the whole of Inferno a plug-in for Microsoft's
+Internet Explorer, giving the same environment for Inferno applications
+running in an HTML page as is provided by hosted or native Inferno.
+That is, there is not a distinct `applet' environment with special programming interfaces.
+The source for the various plug-in components is found in
+.CW /tools/plugin
+and
+.CW /usr/internet
+within the Inferno tree; they use the version of
+.I emu
+defined by the configuration file
+.CW /emu/Nt/ie .
+.SH
+Build steps
+.PP
+All the libraries and executables can be built in a tree containing only the source code.
+To do that for a supported variant of hosted Inferno, on Unix or Plan 9, do the following
+in the root of the Inferno tree:
+.nr Ci 0 +1
+.de Xx
+.IP \\n+(Ci
+..
+.Xx
+Edit
+.CW mkconfig
+to reflect your host environment,
+specifically ROOT (which must be an absolute path name), SYSHOST and OBJTYPE.
+The comments in the file should help you choose.
+.Xx
+Run
+.CW makemk.sh
+to rebuild the
+.CW mk
+command, which is used to build everything else.
+.Xx
+Set
+.CW PATH
+(or
+.CW path
+on Plan 9)
+to include the
+.CW bin
+directory for the platform, which will now contain the
+.CW mk
+binary just built.
+On Unix, export
+.CW PATH .
+.Xx
+Then
+.CW "mk nuke"
+to remove any extraneous object files.
+.Xx
+Finally,
+.CW "mk install"
+to create and install the libraries,
+.CW limbo
+compiler,
+.CW emu
+for hosted Inferno, and auxiliary commands.
+The rules do that in an order that ensures that the commands or libraries
+needed by a later stage are built and installed first.
+(Note that a plain
+.CW mk
+will not suffice, because it does not put the results in the search path.)
+.LP
+Doing something similar on Windows or Plan 9 currently requires the executable for
+.CW mk
+to be available in the search path,
+since there is no equivalent of
+.CW makemk.sh .
+Otherwise the procedure is the same.
+On Plan 9, of course, the host system's normal version of
+.CW mk
+should be adequate.
--- /dev/null
+++ b/doc/acid.ms
@@ -1,0 +1,2519 @@
+.am DS
+.ft I
+..
+.am DE
+.ft R
+..
+.ta 1i 2.3i 4.5i (optional to set tabs)
+.TL
+Acid Reference Manual
+.AU
+Phil Winterbottom
+philw@plan9.bell-labs.com
+.FS
+\l'1i'
+.br
+Previously appeared with minor differences as the
+``Acid Manual'' in
+.I "Plan 9 Programmer's Manual, Volume 2 (Second Edition)" .
+.FE
+.SH
+Introduction
+.PP
+Acid is a general purpose, source level symbolic debugger.
+The debugger is built around a simple command language.
+The command language, distinct from the language of the program being debugged,
+provides a flexible user interface that allows the debugger
+interface to be customized for a specific application or architecture.
+Moreover, it provides an opportunity to write test and
+verification code independently of a program's source code.
+Acid is able to debug multiple
+processes provided they share a common set of symbols, such as the processes in
+a threaded program.
+.PP
+Like other language-based solutions, Acid presents a poor user interface but
+provides a powerful debugging tool.
+Application of Acid to hard problems is best approached by writing functions off-line
+(perhaps loading them with the
+.CW include
+function or using the support provided by
+.I acme (1)),
+rather than by trying to type intricate Acid operations
+at the interactive prompt.
+.PP
+Acid allows the execution of a program to be controlled by operating on its
+state while it is stopped and by monitoring and controlling its execution
+when it is running. Each program action that causes a change
+of execution state is reflected by the execution
+of an Acid function, which may be user defined.
+A library of default functions provides the functionality of a normal debugger.
+.PP
+On Plan 9, a process is controlled by writing messages to a control file in the
+.I proc (3)
+file system. Each control message has a corresponding Acid function, which
+sends the message to the process. These functions take a process id
+.I pid ) (
+as an
+argument. The memory and text file of the program may be manipulated using
+the indirection operators. The symbol table, including source cross reference,
+is available to an Acid program. The combination allows complex operations
+to be performed both in terms of control flow and data manipulation.
+.SH
+Input format and \f(CWwhatis\fP
+.PP
+Comments start with
+.CW //
+and continue to the end of the line.
+Input is a series of statements and expressions separated by semicolons.
+At the top level of the interpreter, the builtin function
+.CW print
+is called automatically to display the result of all expressions except function calls.
+A unary
+.CW +
+may be used as a shorthand to force the result of a function call to be printed.
+.PP
+Also at the top level, newlines are treated as semicolons
+by the parser, so semicolons are unnecessary when evaluating expressions.
+.PP
+When Acid starts, it loads the default program modules,
+enters interactive mode, and prints a prompt. In this state Acid accepts
+either function definitions or statements to be evaluated.
+In this interactive mode
+statements are evaluated immediately, while function definitions are
+stored for later invocation.
+.PP
+The
+.CW whatis
+operator can be used to report the state of identifiers known to the interpreter.
+With no argument,
+.CW whatis
+reports the names of all defined Acid functions; when supplied with an identifier
+as an argument it reports any variable, function, or type definition
+associated with the identifier.
+Because of the way the interpreter handles semicolons,
+the result of a
+.CW whatis
+statement can be returned directly to Acid without adding semicolons.
+A syntax error or interrupt returns Acid to the normal evaluation
+mode; any partially evaluated definitions are lost.
+.SH
+Using the Library Functions
+.PP
+After loading the program binary, Acid loads the portable and architecture-specific
+library functions that form the standard debugging environment.
+These files are Acid source code and are human-readable.
+The following example uses the standard debugging library to show how
+language and program interact:
+.P1
+% acid /bin/ls
+/bin/ls:mips plan 9 executable
+
+/sys/lib/acid/port
+/sys/lib/acid/mips
+acid: new()
+75721: system call _main ADD $-0x14,R29
+75721: breakpoint main+0x4 MOVW R31,0x0(R29)
+acid: bpset(ls)
+acid: cont()
+75721: breakpoint ls ADD $-0x16c8,R29
+acid: stk()
+At pc:0x0000141c:ls /sys/src/cmd/ls.c:87
+ls(s=0x0000004d,multi=0x00000000) /sys/src/cmd/ls.c:87
+ called from main+0xf4 /sys/src/cmd/ls.c:79
+main(argc=0x00000000,argv=0x7ffffff0) /sys/src/cmd/ls.c:48
+ called from _main+0x20 /sys/src/libc/mips/main9.s:10
+acid: PC
+0xc0000f60
+acid: *PC
+0x0000141c
+acid: ls
+0x0000141c
+.P2
+The function
+.CW new()
+creates a new process and stops it at the first instruction.
+This change in state is reported by a call to the
+Acid function
+.CW stopped ,
+which is called by the interpreter whenever the debugged program stops.
+.CW Stopped
+prints the status line giving the pid, the reason the program stopped
+and the address and instruction at the current PC.
+The function
+.CW bpset
+makes an entry in the breakpoint table and plants a breakpoint in memory.
+The
+.CW cont
+function continues the process, allowing it to run until some condition
+causes it to stop. In this case the program hits the breakpoint placed on
+the function
+.CW ls
+in the C program. Once again the
+.CW stopped
+routine is called to print the status of the program. The function
+.CW stk
+prints a C stack trace of the current process. It is implemented using
+a builtin Acid function that returns the stack trace as a list; the code
+that formats the information is all written in Acid.
+The Acid variable
+.CW PC
+holds the address of the
+cell where the current value of the processor register
+.CW PC
+is stored. By indirecting through
+the value of
+.CW PC
+the address where the program is stopped can be found.
+All of the processor registers are available by the same mechanism.
+.SH
+Types
+.PP
+An Acid variable has one of four types:
+.I integer ,
+.I float ,
+.I list ,
+or
+.I string .
+The type of a variable is inferred from the type of the right-hand
+side of the assignment expression which last set its value.
+Referencing a variable that has not yet
+been assigned draws a "used but not set" error. Many of the operators may
+be applied to more than
+one type; for these operators the action of the operator is determined by
+the types of its operands. The action of each operator is defined in the
+.I Expressions
+section of this manual.
+.SH
+Variables
+.PP
+Acid has three kinds of variables: variables defined by the symbol table
+of the debugged program, variables that are defined and maintained
+by the interpreter as the debugged program changes state, and variables
+defined and used by Acid programs.
+.PP
+Some examples of variables maintained by the interpreter are the register
+pointers listed by name in the Acid list variable
+.CW registers ,
+and the symbol table listed by name and contents in the Acid variable
+.CW symbols .
+.PP
+The variable
+.CW pid
+is updated by the interpreter to select the most recently created process
+or the process selected by the
+.CW setproc
+builtin function.
+.SH 1
+Formats
+.PP
+In addition to a type, variables have formats. The format is a code
+letter that determines the printing style and the effect of some of the
+operators on that variable. The format codes are derived from the format
+letters used by
+.I db (1).
+By default, symbol table variables and numeric constants
+are assigned the format code
+.CW X ,
+which specifies 32-bit hexadecimal.
+Printing a variable with this code yields the output
+.CW 0x00123456 .
+The format code of a variable may be changed from the default by using the
+builtin function
+.CW fmt .
+This function takes two arguments, an expression and a format code. After
+the expression is evaluated the new format code is attached to the result
+and forms the return value from
+.CW fmt .
+The backslash operator is a short form of
+.CW fmt .
+The format supplied by the backslash operator must be the format character
+rather than an expression.
+If the result is assigned to a variable the new format code is maintained
+in the variable. For example:
+.P1
+acid: x=10
+acid: print(x)
+0x0000000a
+acid: x = fmt(x, 'D')
+acid: print(x, fmt(x, 'X'))
+10 0x0000000a
+acid: x
+10
+acid: x\eo
+12
+.P2
+The supported format characters are:
+.RS
+.IP \f(CWo\fP
+Print two-byte integer in octal.
+.IP \f(CWO\fP
+Print four-byte integer in octal.
+.IP \f(CWq\fP
+Print two-byte integer in signed octal.
+.IP \f(CWQ\fP
+Print four-byte integer in signed octal.
+.IP \f(CWB\fP
+Print four-byte integer in binary.
+.IP \f(CWd\fP
+Print two-byte integer in signed decimal.
+.IP \f(CWD\fP
+Print four-byte integer in signed decimal.
+.IP \f(CWY\fP
+Print eight-byte integer in signed decimal.
+.IP \f(CWx\fP
+Print two-byte integer in hexadecimal.
+.IP \f(CWX\fP
+Print four-byte integer in hexadecimal.
+.IP \f(CWu\fP
+Print two-byte integer in unsigned decimal.
+.IP \f(CWU\fP
+Print four-byte integer in unsigned decimal.
+.IP \f(CWZ\fP
+Print eight-byte integer in unsigned decimal.
+.IP \f(CWf\fP
+Print single-precision floating point number.
+.IP \f(CWF\fP
+Print double-precision floating point number.
+.IP \f(CWg\fP
+Print a single-precision floating point number in string format.
+.IP \f(CWG\fP
+Print a double-precision floating point number in string format.
+.IP \f(CWb\fP
+Print byte in hexadecimal.
+.IP \f(CWc\fP
+Print byte as an ASCII character.
+.IP \f(CWC\fP
+Like
+.CW c ,
+with
+printable ASCII characters represented normally and
+others printed in the form \f(CW\ex\fInn\fR.
+.IP \f(CWs\fP
+Interpret the addressed bytes as UTF characters
+and print successive characters until a zero byte is reached.
+.IP \f(CWr\fP
+Print a two-byte integer as a rune.
+.IP \f(CWR\fP
+Print successive two-byte integers as runes
+until a zero rune is reached.
+.IP \f(CWY\fP
+Print successive eight-byte integers in hexadecimal.
+.IP \f(CWi\fP
+Print as machine instructions.
+.IP \f(CWI\fP
+As
+.CW i
+above, but print the machine instructions in
+an alternate form if possible:
+.CW sunsparc
+and
+.CW mipsco
+reproduce the manufacturers' syntax.
+.IP \f(CWa\fP
+Print the value in symbolic form.
+.RE
+.SH
+Complex types
+.PP
+Acid permits the definition of the layout of memory.
+The usual method is to use the
+.CW -a
+flag of the compilers to produce Acid-language descriptions of data structures (see
+.I 2c (1))
+although such definitions can be typed interactively.
+The keywords
+.CW complex ,
+.CW adt ,
+.CW aggr ,
+and
+.CW union
+are all equivalent; the compiler uses the synonyms to document the declarations.
+A complex type is described as a set of members, each containing a format letter,
+an offset in the structure, and a name. For example, the C structure
+.P1
+struct List {
+ int type;
+ struct List *next;
+};
+.P2
+is described by the Acid statement
+.P1
+complex List {
+ 'D' 0 type;
+ 'X' 4 next;
+};
+.P2
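+.PP
+As a rough illustration of what such a declaration encodes, the following Python
+sketch reads the two members of
+.CW List
+from a byte buffer using the format letter and offset of each member.
+It is a model only, not part of Acid: the names
+.CW LIST_LAYOUT ,
+.CW CODES
+and
+.CW read_complex
+are hypothetical, and the little-endian, 32-bit encodings are assumptions.

```python
import struct

# (format letter, offset, name), mirroring the Acid declaration of List.
LIST_LAYOUT = [("D", 0, "type"), ("X", 4, "next")]

# Assumed encodings for this sketch: 'D' is a signed 32-bit integer,
# 'X' an unsigned 32-bit integer, both little-endian.
CODES = {"D": "<i", "X": "<I"}


def read_complex(memory, base, layout):
    """Decode every member of a complex type located at address base."""
    return {
        name: struct.unpack_from(CODES[fmt], memory, base + offset)[0]
        for fmt, offset, name in layout
    }


# A made-up memory image holding two List nodes, at addresses 0 and 8.
memory = struct.pack("<iI", 10, 8) + struct.pack("<iI", -46, 0)

first = read_complex(memory, 0, LIST_LAYOUT)
second = read_complex(memory, first["next"], LIST_LAYOUT)
print(first, second)
```

+The sketch makes the point that a complex type is no more than a table of
+(format, offset, name) entries applied to a base address.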
+.SH
+Scope
+.PP
+Variables are global unless they are either parameters to functions
+or are declared as
+.CW local
+in a function body. Parameters and local variables are available only in
+the body of the function in which they are instantiated.
+Variables are dynamically bound: if a function declares a local variable
+with the same name as a global variable, the global variable will be hidden
+whenever the function is executing.
+For example, if a function
+.CW f
+has a local called
+.CW main ,
+any function called below
+.CW f
+will see the local version of
+.CW main ,
+not the external symbol.
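+.PP
+Dynamic binding can be modelled as a stack of frames searched from the most
+recent call outwards.
+The Python sketch below is such a model, written under that assumption;
+it is not how the Acid interpreter is implemented, and the names
+.CW Env ,
+.CW lookup
+and
+.CW call
+are hypothetical.

```python
class Env:
    """Toy model of dynamic binding: a stack of name -> value frames."""

    def __init__(self, global_vars):
        self.frames = [dict(global_vars)]

    def lookup(self, name):
        # Search the most recently entered frame first, globals last.
        for frame in reversed(self.frames):
            if name in frame:
                return frame[name]
        raise NameError(name)

    def call(self, local_vars, body):
        # Entering a function pushes its locals; leaving pops them.
        self.frames.append(dict(local_vars))
        try:
            return body(self)
        finally:
            self.frames.pop()


def g(env):
    # g has no local called main, so it sees whichever binding is innermost.
    return env.lookup("main")


def f(env):
    # f simply calls g; any local of f stays visible while g runs.
    return env.call({}, g)


env = Env({"main": 0x1020})              # made-up address for the symbol main
seen_directly = env.call({}, g)          # g called from the top level
seen_below_f = env.call({"main": 7}, f)  # f has a local called main
```

+Called from the top level,
+.CW g
+sees the global; called below a function with a local
+.CW main ,
+it sees the local instead.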
+.SH 1
+Addressing
+.PP
+Since the symbol table specifies addresses,
+to access the value of program variables
+an extra level of indirection
+is required relative to the source code.
+For consistency, the registers are maintained as pointers as well; Acid variables with the names
+of processor registers point to cells holding the saved registers.
+.PP
+The location in a file or memory image associated with
+an address is calculated from a map
+associated with the file.
+Each map contains one or more quadruples (\c
+.I t ,
+.I b ,
+.I e ,
+.I f \|),
+defining a segment named
+.I t
+(usually
+.CW text ,
+.CW data ,
+.CW regs ,
+or
+.CW fpregs )
+mapping addresses in the range
+.I b
+through
+.I e
+to the part of the file
+beginning at
+offset
+.I f .
+The memory model of a Plan 9 process assumes
+that segments are disjoint. There
+can be more than one segment of a given type (e.g., a process
+may have more than one text segment) but segments
+may not overlap.
+An address
+.I a
+is translated
+to a file address
+by finding a segment
+for which
+.I b
+\(<=
+.I a
+<
+.I e ;
+the location in the file
+is then
+.I a
++
+.I f
+\-
+.I b .
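+.PP
+The translation can be sketched in a few lines of Python, assuming the
+segment test is
+.I b
+\(<=
+.I a
+<
+.I e .
+The
+.CW Segment
+type, the
+.CW translate
+function and the map values are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    name: str    # t: segment name, for example "text" or "data"
    base: int    # b: first address covered by the segment
    end: int     # e: first address past the segment
    offset: int  # f: file offset at which the segment begins


def translate(segments, a):
    """Return the file location for address a, or None if it is unmapped."""
    for seg in segments:
        if seg.base <= a < seg.end:
            return a + seg.offset - seg.base
    return None


# Made-up map with two disjoint segments.
example_map = [
    Segment("text", 0x1000, 0x5000, 0x20),
    Segment("data", 0x6000, 0x8000, 0x4020),
]

print(translate(example_map, 0x141C))
```

+Because segments are disjoint, at most one segment can match, so the order of
+the search does not affect the result.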
+.PP
+Usually,
+the text and initialized data of a program
+are mapped by segments called
+.CW text
+and
+.CW data .
+Since a program file does not contain bss, stack, or register data,
+these data are
+not mapped by the data segment.
+The text segment is mapped similarly in the memory image of
+a normal (i.e., non-kernel) process.
+However, the segment called
+.CW *data
+maps memory from the beginning to the end of the program's data space.
+This region contains the program's static data, the bss, the
+heap and the stack. A segment
+called
+.CW *regs
+maps the registers;
+.CW *fpregs
+maps the floating point registers (if they exist).
+.PP
+Sometimes it is useful to define a map with a single segment
+mapping the region from 0 to 0xFFFFFFFF; such a map
+allows the entire file to be examined
+without address translation. The builtin function
+.CW map
+examines and modifies Acid's map for a process.
+.SH 1
+Name Conflicts
+.PP
+Name conflicts between keywords in the Acid language, symbols in the program,
+and previously defined functions are resolved when the interpreter starts up.
+Each name is made unique by prefixing enough
+.CW $
+characters to the front of the name to make it unique. Acid reports
+a list of each name change at startup. The report looks like this:
+.P1
+/bin/sam: mips plan 9 executable
+/lib/acid/port
+/lib/acid/mips
+Symbol renames:
+ append=$append T/0xa4e40
+acid:
+.P2
+The symbol
+.CW append
+is both a keyword and a text symbol in the program. The message reports
+that the text symbol is now named
+.CW $append .
+.SH
+Expressions
+.PP
+Operators have the same
+binding and precedence as in C.
+For operators of equal precedence, expressions are evaluated from left to right.
+.SH 1
+Boolean expressions
+.PP
+If an expression is evaluated for a boolean condition the test
+performed depends on the type of the result. If the result is of
+.I integer
+or
+.I float
+type the result is true if the value is non-zero. If the expression is a
+.I list
+the result is true if there are any members in the list.
+If the expression is a
+.I string
+the result is true if there are any characters in the string.
+.DS
+ primary-expression:
+ identifier
+ identifier \f(CW:\fP identifier
+ constant
+ \f(CW(\fP expression \f(CW)\fP
+ \f(CW{\fP elist \f(CW}\fP
+
+ elist:
+ expression
+ elist , expression
+.DE
+An identifier may be any legal Acid variable. The colon operator returns the
+address of parameters or local variables in the current stack of a program.
+For example:
+.P1
+*main:argc
+.P2
+prints the number of arguments passed into main. Local variables and parameters
+can only be referenced after the frame has been established. It may be necessary to
+step a program over the first few instructions of a breakpointed function to properly set
+the frame.
+.PP
+Constants follow the same lexical rules as C.
+A list of expressions delimited by braces forms a list constructor.
+A new list is produced by evaluating each expression when the constructor is executed.
+The empty list is formed from
+.CW {} .
+.P1
+acid: x = 10
+acid: l = { 1, x, 2\eD }
+acid: x = 20
+acid: l
+{0x00000001 , 0x0000000a , 2 }
+.P2
+.SH 1
+Lists
+.PP
+Several operators manipulate lists.
+.DS
+ list-expression:
+ primary-expression
+ \f(CWhead\fP primary-expression
+ \f(CWtail\fP primary-expression
+ \f(CWappend\fP expression \f(CW,\fP primary-expression
+ \f(CWdelete\fP expression \f(CW,\fP primary-expression
+.DE
+The
+.I primary-expression
+for
+.CW head
+and
+.CW tail
+must yield a value of type
+.I list .
+If there are no elements in the list the value of
+.CW head
+or
+.CW tail
+will be the empty list. Otherwise
+.CW head
+evaluates to the first element of the list and
+.CW tail
+evaluates to the rest.
+.P1
+acid: head {}
+{}
+acid: head {1, 2, 3, 4}
+0x00000001
+acid: tail {1, 2, 3, 4}
+{0x00000002 , 0x00000003 , 0x00000004 }
+.P2
+The first operand of
+.CW append
+and
+.CW delete
+must be an expression that yields a
+.I list .
+.CW Append
+places the result of evaluating
+.I primary-expression
+at the end of the list.
+The
+.I primary-expression
+supplied to
+.CW delete
+must evaluate to an integer;
+.CW delete
+removes the
+.I n 'th
+item from the list, where
+.I n
+is the integral value of
+.I primary-expression .
+List indices are zero-based.
+.P1
+ acid: append {1, 2}, 3
+ {0x00000001 , 0x00000002 , 0x00000003 }
+ acid: delete {1, 2, 3}, 1
+ {0x00000001 , 0x00000003 }
+.P2
+.PP
+Assigning a list to a variable copies a reference to the list; if a list variable
+is copied it still points at the same list. To copy a list, the elements must
+be copied piecewise using
+.CW head
+and
+.CW append .
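+.PP
+Python lists share this reference behaviour, so they can serve as a rough
+model of the difference between assigning a list and copying it piecewise.
+The sketch is only an analogy: the helper names are hypothetical, and Python's
+.CW append
+method modifies a list in place, whereas Acid's
+.CW append
+is an operator that yields a list.

```python
def head(l):
    """First element, or the empty list when there are no elements."""
    return l[0] if l else []


def tail(l):
    """Everything after the first element."""
    return l[1:]


def copy_list(l):
    """Copy a list element by element, using only head and tail to read it."""
    out = []
    while l:
        out.append(head(l))
        l = tail(l)
    return out


a = [1, 2, 3]
alias = a                # assignment: both names refer to one list
duplicate = copy_list(a) # piecewise copy: a distinct list, equal contents
```

+Here
+.CW alias
+and
+.CW a
+are the same list, while
+.CW duplicate
+is a separate list with equal elements.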
+.SH 1
+Operators
+.PP
+.DS
+ postfix-expression:
+ list-expression
+ postfix-expression \f(CW[\fP expression \f(CW]\fP
+ postfix-expression \f(CW(\fP argument-list \f(CW)\fP
+ postfix-expression \f(CW.\fP tag
+ postfix-expression \f(CW->\fP tag
+ postfix-expression \f(CW++\fP
+ postfix-expression \f(CW--\fP
+
+ argument-list:
+ expression
+ argument-list , expression
+.DE
+The
+.CW [
+.I expression
+.CW ]
+operator performs indexing.
+The indexing expression must yield a value of
+.I integer
+type, say
+.I n .
+The operation depends on the type of
+.I postfix-expression .
+If the
+.I postfix-expression
+yields an
+.I integer
+it is assumed to be the base address of an array in the memory image.
+The index offsets into this array; the size of the array members is
+determined by the format associated with the
+.I postfix-expression .
+If the
+.I postfix-expression
+yields a
+.I string
+the index operator fetches the
+.I n 'th
+character
+of the string. If the index points beyond the end
+of the string, a zero is returned.
+If the
+.I postfix-expression
+yields a
+.I list
+then the indexing operation returns the
+.I n 'th
+item of the list.
+If the list contains fewer than
+.I n
+items the empty list
+.CW {}
+is returned.
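+.PP
+The three cases can be summarised by the following Python sketch.
+It is a model under stated assumptions, not Acid's implementation:
+the function name
+.CW index ,
+the little-endian byte order, the use of a character's code point as the
+result for strings, and the treatment of negative indices as out of range
+are all choices made for the example.

```python
def index(value, n, memory=b"", width=4):
    """Model of postfix-expression[n] for strings, lists and integers."""
    if isinstance(value, str):
        # Strings: the n'th character, or zero beyond the end.
        return ord(value[n]) if 0 <= n < len(value) else 0
    if isinstance(value, list):
        # Lists: the n'th item, or the empty list beyond the end.
        return value[n] if 0 <= n < len(value) else []
    # Integers: value is the base address of an array in the memory image;
    # width stands for the element size implied by the format code.
    addr = value + n * width
    return int.from_bytes(memory[addr:addr + width], "little")


image = bytes(range(16))  # made-up memory image: bytes 0x00 through 0x0f
print(index("abc", 1), index([10, 20], 5), index(4, 1, image, 2))
```

+The
+.CW width
+parameter plays the role of the format attached to the
+.I postfix-expression :
+the same base address indexes different bytes under a two-byte format than
+under a four-byte one.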
+.PP
+The
+.CW ++
+and
+.CW --
+operators increment and decrement integer variables.
+The amount of increment or decrement depends on the format code. These postfix
+operators return the value of the variable before the increment or decrement
+has taken place.
+.DS
+ unary-expression:
+ postfix-expression
+ \f(CW++\fP unary-expression
+ \f(CW--\fP unary-expression
+
+ unary-operator: one of
+ \f(CW*\fP \f(CW@\fP \f(CW+\fP \f(CW-\fP ~ \f(CW!\fP
+.DE
+The operators
+.CW *
+and
+.CW @
+are the indirection operators.
+.CW @
+references a value from the text file of the program being debugged.
+The size of the value depends on the format code. The
+.CW *
+operator fetches a value from the memory image of a process. If either
+operator appears on the left-hand side of an assignment statement, either the file
+or memory will be written. The file can only be modified when Acid is invoked
+with the
+.CW -w
+option.
+The prefix
+.CW ++
+and
+.CW --
+operators perform the same operation as their postfix counterparts but
+return the value after the increment or decrement has been performed. Since the
+.CW ++
+and
+.CW *
+operators fetch and increment the correct amount for the specified format,
+the following function prints correct machine instructions on a machine with
+variable length instructions, such as the 68020 or 386:
+.P1
+ defn asm(addr)
+ {
+ addr = fmt(addr, 'i');
+ loop 1, 10 do
+ print(*addr++, "\en");
+ }
+.P2
+The operators
+.CW ~
+and
+.CW !
+perform bitwise and logical negation respectively. Their operands must be of
+.I integer
+type.
+.DS
+ cast-expression:
+ unary-expression
+ unary-expression \f(CW\e\fP format-char
+ \f(CW(\fP complex-name \f(CW)\fP unary-expression
+.DE
+A unary expression may be preceded by a cast. The cast has the effect of
+associating the value of
+.I unary-expression
+with a complex type structure.
+The result may then be dereferenced using the
+.CW .
+and
+.CW ->
+operators.
+.PP
+An Acid variable may be associated with a complex type
+to enable accessing the type's members:
+.P1
+acid: complex List {
+ 'D' 0 type;
+ 'X' 4 next;
+};
+acid: complex List lhead
+acid: lhead.type
+10
+acid: lhead = ((List)lhead).next
+acid: lhead.type
+-46
+.P2
+Note that the
+.CW next
+field cannot be given a complex type automatically.
+.PP
+When entered at the top level of the interpreter,
+an expression of complex type
+is treated specially.
+If the type is called
+.CW T
+and an Acid function also called
+.CW T
+exists,
+then that function will be called with the expression as its argument.
+The compiler options
+.CW -a
+and
+.CW -aa
+will generate Acid source code defining such complex types and functions; see
+.I 2c (1).
+.PP
+A
+.I unary-expression
+may be qualified with a format specifier using the
+.CW \e
+operator. This has the same effect as passing the expression to the
+.CW fmt
+builtin function.
+.DS
+ multiplicative-expression:
+ cast-expression
+ multiplicative-expression \f(CW*\fP multiplicative-expression
+ multiplicative-expression \f(CW/\fP multiplicative-expression
+ multiplicative-expression \f(CW%\fP multiplicative-expression
+.DE
+These operate on
+.I integer
+and
+.I float
+types and perform the expected operations:
+.CW *
+multiplication,
+.CW /
+division,
+.CW %
+modulus.
+.DS
+ additive-expression:
+ multiplicative-expression
+ additive-expression \f(CW+\fP multiplicative-expression
+ additive-expression \f(CW-\fP multiplicative-expression
+.DE
+These operators perform as expected for
+.I integer
+and
+.I float
+operands.
+Unlike in C,
+.CW +
+and
+.CW -
+do not scale the addition based on the format of the expression.
+This means that
+.CW i=i+1
+will always add 1 but
+.CW i++
+will add the size corresponding to the format stored with
+.CW i .
+If both operands are of either
+.I string
+or
+.I list
+type then addition is defined as concatenation.
+Adding a string and an integer is treated as concatenation
+with the Unicode character corresponding to the integer.
+Subtraction is undefined for strings and lists.
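+.PP
+A minimal sketch of the difference between
+.CW +
+and
+.CW ++ ;
+the variable
+.CW p
+and the address are hypothetical, and output is omitted because it depends on the session:
+.P1
+p = fmt(0x1000, 'X');    // illustrative address, stored with format X
+p = p+1;                 // adds exactly 1, whatever the format
+p++;                     // advances by the size of an X-format object
+.P2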
+.DS
+ shift-expression:
+ additive-expression
+ shift-expression \f(CW<<\fP additive-expression
+ shift-expression \f(CW>>\fP additive-expression
+.DE
+The
+.CW >>
+and
+.CW <<
+operators perform bitwise right and left shifts respectively. Both
+require operands of
+.I integer
+type.
+.DS
+ relational-expression:
+ relational-expression \f(CW<\fP shift-expression
+ relational-expression \f(CW>\fP shift-expression
+ relational-expression \f(CW<=\fP shift-expression
+ relational-expression \f(CW>=\fP shift-expression
+
+ equality-expression:
+ relational-expression
+ relational-expression \f(CW==\fP equality-expression
+ relational-expression \f(CW!=\fP equality-expression
+.DE
+The comparison operators are
+.CW <
+(less than),
+.CW >
+(greater than),
+.CW <=
+(less than or equal to),
+.CW >=
+(greater than or equal to),
+.CW ==
+(equal to) and
+.CW !=
+(not equal to). The result of a comparison is 0
+if the condition is false, otherwise 1. The relational operators can only be
+applied to operands of
+.I integer
+and
+.I float
+type. The equality operators apply to all types. Comparing mixed types is legal.
+Mixed integer and float compare on the integral value. Other mixtures are always unequal.
+Two lists are equal if they
+have the same number of members and a pairwise comparison of the members results
+in equality.
+.DS
+ AND-expression:
+ equality-expression
+ AND-expression \f(CW&\fP equality-expression
+
+ XOR-expression:
+ AND-expression
+ XOR-expression \f(CW^\fP AND-expression
+
+ OR-expression:
+ XOR-expression
+ OR-expression \f(CW|\fP XOR-expression
+.DE
+These operators perform bitwise logical operations and apply only to the
+.I integer
+type.
+The operators are
+.CW &
+(bitwise and),
+.CW ^
+(exclusive or) and
+.CW |
+(inclusive or).
+.DS
+ logical-AND-expression:
+ OR-expression
+ logical-AND-expression \f(CW&&\fP OR-expression
+
+ logical-OR-expression:
+ logical-AND-expression
+ logical-OR-expression \f(CW||\fP logical-AND-expression
+.DE
+The
+.CW &&
+operator returns 1 if both of its operands evaluate to boolean true, otherwise 0.
+The
+.CW ||
+operator returns 1 if either of its operands evaluates to boolean true,
+otherwise 0.
+.SH
+Statements
+.PP
+.DS
+ \f(CWif\fP expression \f(CWthen\fP statement \f(CWelse\fP statement
+ \f(CWif\fP expression \f(CWthen\fP statement
+.DE
+The
+.I expression
+is evaluated as a boolean. If its value is true the statement after
+the
+.CW then
+is executed, otherwise the statement after the
+.CW else
+is executed. The
+.CW else
+portion may be omitted.
+.DS
+ \f(CWwhile\fP expression \f(CWdo\fP statement
+.DE
+In a while loop, the
+.I statement
+is executed while the boolean
+.I expression
+evaluates
+true.
+.DS
+ \f(CWloop\fP startexpr, endexpr \f(CWdo\fP statement
+.DE
+The two expressions
+.I startexpr
+and
+.I endexpr
+are evaluated prior to loop entry.
+.I Statement
+is evaluated while the value of
+.I startexpr
+is less than or equal to
+.I endexpr .
+Both expressions must yield
+.I integer
+values. The value of
+.I startexpr
+is
+incremented by one for each loop iteration.
+Note that there is no explicit loop variable; the
+.I expressions
+are just values.
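+.PP
+Because the loop has no variable of its own, a body that needs an index must keep one itself.
+The following is a sketch only; the function name
+.CW tally
+and the variable
+.CW i
+are hypothetical.
+.P1
+defn tally(n)
+{
+    local i;
+    i = 0;
+    loop 1, n do
+        i = i+1;     // the body runs once for each value from 1 to n
+    return i;
+}
+.P2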
+.DS
+ \f(CWreturn\fP expression
+.DE
+.CW return
+terminates execution of the current function and returns to its caller.
+The value of the function is given by
+.I expression .
+Since
+.CW return
+requires an argument, nil-valued functions should return the empty list
+.CW {} .
+.DS
+ \f(CWlocal\fP variable
+.DE
+The
+.CW local
+statement creates a local instance of
+.I variable ,
+which exists for the duration
+of the instance of the function in which it is declared. Binding is dynamic: the local variable,
+rather than the previous value of
+.I variable ,
+is visible to called functions.
+After a return from the current function the previous value of
+.I variable
+is
+restored.
+.PP
+If Acid is interrupted, the values of all local variables are lost,
+as if the function returned.
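+.PP
+Dynamic binding can be illustrated with a short sketch; the names
+.CW show ,
+.CW outer
+and
+.CW depth
+are hypothetical.
+.P1
+defn show()
+{
+    print(depth, "\en");    // prints whichever depth is currently bound
+}
+defn outer()
+{
+    local depth;
+    depth = 1;
+    show();                 // show sees outer's local depth
+}
+.P2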
+.DS
+ \f(CWdefn\fP function-name \f(CW(\fP parameter-list \f(CW)\fP body
+
+ parameter-list:
+ variable
+ parameter-list , variable
+
+ body:
+ \f(CW{\fP statement \f(CW}\fP
+.DE
+Functions are introduced by the
+.CW defn
+statement. The definition of parameter names suppresses any variables
+of the same name until the function returns. The body of a function is a list
+of statements enclosed by braces.
+.SH
+Code variables
+.PP
+Acid permits the delayed evaluation of a parameter to a function. The parameter
+may then be evaluated at any time with the
+.CW eval
+operator. Such parameters are called
+.I "code variables
+and are defined by prefixing their name with an asterisk in their declaration.
+.PP
+For example, this function wraps up an expression for later evaluation:
+.P1
+acid: defn code(*e) { return e; }
+acid: x = code(v+atoi("100")\eD)
+acid: print(x)
+(v+atoi("100"))\eD;
+acid: eval x
+<stdin>:5: (error) v used but not set
+acid: v=5
+acid: eval x
+105
+.P2
+.SH
+Source Code Management
+.PP
+Acid provides the means to examine source code. Source code is
+represented by lists of strings. Builtin functions provide mapping
+from address to lines and vice-versa. The default debugging environment
+has the means to load and display source files.
+.SH
+Builtin Functions
+.PP
+The Acid interpreter has a number of builtin functions, which cannot be redefined.
+They perform machine- or operating system-specific tasks such as
+symbol table and process management.
+The following section presents a description of each builtin function.
+The notation
+.CW {}
+is used to denote the empty list, which is the default value of a function that
+does not execute a
+.CW return
+statement.
+The type and number of parameters for each function are specified in the
+description; where a parameter can be of any type it is specified as type
+.I item .
+.de Ip
+.KS
+.LP
+.tl '\f2\\$1\fP\ \ \f(CW\\$2(\f2\\$3\f(CW)\f1''\\$4'
+.IP
+..
+.de Ex
+.KE
+.KS
+.IP
+.ft CW
+.ta 4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n
+.nf
+.in +4n
+.br
+..
+.de Ee
+.fi
+.ft 1
+.br
+.KE
+..
+.\"
+.\"
+.\"
+.Ip integer access string "Check if a file can be read
+.CW Access
+returns the integer 1 if the file name in
+.I string
+can be read by the builtin functions
+.CW file ,
+.CW readfile ,
+or
+.CW include ,
+otherwise 0. A typical use of this function is to follow
+a search path looking for a source file; it is used by
+.CW findsrc .
+.Ex
+if access("main.c") then
+ return file("main.c");
+.Ee
+.\"
+.\"
+.\"
+.Ip float atof string "Convert a string to float
+.CW atof
+converts the string supplied as its argument into a floating point
+number. The function accepts strings in the same format as the C
+function of the same name. The value returned has the format code
+.CW f .
+.CW atof
+returns the value 0.0 if it is unable to perform the conversion.
+.Ex
+acid: +atof("10.4e6")
+1.04e+07
+.Ee
+.\"
+.\"
+.\"
+.Ip integer atoi string "Convert a string to an integer
+.CW atoi
+converts the argument
+.I string
+to an integer value.
+The function accepts strings in the same format as the C function of the
+same name. The value returned has the format code
+.CW D .
+.CW atoi
+returns the integer 0 if it is unable to perform a conversion.
+.Ex
+acid: +atoi("-1255")
+-1255
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP error string "Generate an interpreter error
+.CW error
+generates an error message and returns the interpreter to interactive
+mode. If an Acid program is running, it is aborted.
+Processes being debugged are not affected. The values of all local variables are lost.
+.CW error
+is commonly used to stop the debugger when some interesting condition arises
+in the debugged program.
+.Ex
+while 1 do {
+ step();
+ if *main != @main then
+ error("memory corrupted");
+}
+.Ee
+.\"
+.\"
+.\"
+.Ip list file string "Read the contents of a file into a list
+.CW file
+reads the contents of the file specified by
+.I string
+into a list.
+Each element in the list is a string corresponding to a line in the file.
+.CW file
+breaks lines at the newline character, but the newline
+characters are not returned as part of each string.
+.CW file
+returns the empty list if it encounters an error opening or reading the data.
+.Ex
+acid: print(file("main.c")[0])
+#include <u.h>
+.Ee
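+.IP
+Since the result is an ordinary list, it can be walked with
+.CW head
+and
+.CW tail .
+A sketch, in which the variable
+.CW lines
+is hypothetical:
+.P1
+lines = file("main.c");
+while lines do {
+    print(head lines, "\en");    // newlines were stripped, so add one
+    lines = tail lines;
+}
+.P2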
+.\"
+.\"
+.\"
+.Ip integer filepc string "Convert source address to text address
+.CW filepc
+interprets its
+.I string
+argument as a source file address in the form of a file name and line offset.
+.CW filepc
+uses the symbol table to map the source address into a text address
+in the debugged program. The
+.I integer
+return value has the format
+.CW X .
+.CW filepc
+returns an address of -1 if the source address is invalid.
+The source file address uses the same format as
+.I acme (1).
+This function is commonly used to set breakpoints from the source text.
+.Ex
+acid: bpset(filepc("main:10"))
+acid: bptab()
+ 0x00001020 usage ADD $-0xc,R29
+.Ee
+.\"
+.\"
+.\"
+.Ip item fmt item,fmt "Set print, \f(CW@\fP and \f(CW*\fP formats
+.CW fmt
+evaluates the expression
+.I item
+and sets the format of the result to
+.I fmt .
+The format of a value determines how it will be printed and
+what kind of object will be fetched by the
+.CW *
+and
+.CW @
+operators. The
+.CW \e
+operator is a short-hand form of the
+.CW fmt
+builtin function. The
+.CW fmt
+function leaves the format of the
+.I item
+unchanged.
+.Ex
+acid: main=fmt(main, 'i') // as instructions
+acid: print(main\eX, "\et", *main)
+0x00001020 ADD $-64,R29
+.Ee
+.\"
+.\"
+.\"
+.Ip list fnbound integer "Find start and end address of a function
+.CW fnbound
+interprets its
+.I integer
+argument as an address in the text of the debugged program.
+.CW fnbound
+returns a list containing two integers corresponding to
+the start and end addresses of the function containing the supplied address.
+If the
+.I integer
+address is not in the text segment of the program then the empty list is returned.
+.CW fnbound
+is used by
+.CW next
+to detect stepping into new functions.
+.Ex
+acid: print(fnbound(main))
+{0x00001050, 0x000014b8}
+.Ee
+.\"
+.\"
+.\"
+.Ip list follow integer "Compute follow set
+The follow set is defined as the set of program counter values that could result
+from executing an instruction.
+.CW follow
+interprets its
+.I integer
+argument as a text address, decodes the instruction at
+that address and, with the current register set, builds a list of possible
+next program counter values. If the instruction at the specified address
+cannot be decoded
+.CW follow
+raises an error.
+.CW follow
+is used to plant breakpoints on
+all potential paths of execution. The following code fragment
+plants breakpoints on top of all potential following instructions.
+.Ex
+lst = follow(*PC);
+while lst do
+{
+ *head lst = bpinst;
+ lst = tail lst;
+}
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP include string "Take input from a new file
+.CW include
+opens the file specified by
+.I string
+and uses its contents as command input to the interpreter.
+The interpreter restores input to its previous source when it encounters
+either an end of file or an error.
+.CW include
+can be used to incrementally load symbol table information without
+leaving the interpreter.
+.Ex
+acid: include("/sys/src/cmd/acme/syms")
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP interpret string "Take input from a string
+.CW interpret
+evaluates the
+.I string
+expression and uses its result as command input for the interpreter.
+The interpreter restores input to its previous source when it encounters
+either the end of string or an error. The
+.CW interpret
+function allows Acid programs to write Acid code for later evaluation.
+.Ex
+acid: interpret("main+10;")
+0x0000102a
+.Ee
+.\"
+.\"
+.\"
+.Ip string itoa integer "Convert integer to string
+.CW itoa
+takes an integer argument and converts it into an ASCII string
+in the
+.CW D
+format. This function is commonly used to build
+.CW rc
+command lines.
+.Ex
+acid: rc("cat /proc/"+itoa(pid)+"/segment")
+Stack 7fc00000 80000000 1
+Data 00001000 00009000 1
+Data 00009000 0000a000 1
+Bss 0000a000 0000c000 1
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP kill integer "Kill a process
+.CW kill
+writes a kill control message into the control file of the process
+specified by the
+.I integer
+pid.
+If the process was previously installed by
+.CW setproc
+it will be removed from the list of active processes.
+If the
+.I integer
+has the same value as
+.CW pid ,
+then
+.CW pid
+will be set to 0.
+To continue debugging, a new process must be selected using
+.CW setproc .
+For example, to kill all the active processes:
+.Ex
+while proclist do {
+ kill(head proclist);
+ proclist = tail proclist;
+}
+.Ee
+.\"
+.\"
+.\"
+.Ip list map list "Set or retrieve process memory map
+.CW map
+either retrieves all the mappings associated with a process or sets a single
+map entry to a new value.
+If the
+.I list
+argument is omitted then
+.CW map
+returns a list of lists. Each sublist has four values and describes a
+single region of contiguous addresses in the
+memory or file image of the debugged program. The first entry is the name of the
+mapping. If the name begins with
+.CW *
+it denotes a map into the memory of an active process.
+The second and third values specify the base and end
+address of the region and the fourth number specifies the offset in the file
+corresponding to the first location of the region.
+A map entry may be set by supplying a list in the same format as the sublist
+described above. The name of the mapping must match a region already defined
+by the current map.
+Maps are set automatically for Plan 9 processes and some kernels; they may
+need to be set by hand for other kernels and programs that run on bare hardware.
+.Ex
+acid: map({"text", _start, end, 0x30})
+.Ee
+.\"
+.\"
+.\"
+.Ip integer match item,list "Search list for matching value
+.CW match
+compares each item in
+.I list
+using the equality operator
+.CW ==
+with
+.I item .
+The
+.I item
+can be of any type. If the match succeeds the result is the integer index
+of the matching value, otherwise -1.
+.Ex
+acid: list={8,9,10,11}
+acid: print(list[match(10, list)]\eD)
+10
+.Ee
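+.IP
+Since a failed search yields -1, the result should be checked before it is used as an index.
+A sketch, in which
+.CW addr
+is a hypothetical variable holding an address and
+.CW bplist
+is the breakpoint list kept by the standard library:
+.P1
+if match(addr, bplist) >= 0 then
+    print("breakpoint already set\en");
+.P2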
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP newproc string "Create a new process
+.CW newproc
+starts a new process with an argument vector constructed from
+.I string .
+The argument vector excludes the name of the program to execute and
+each argument in
+.I string
+must be space separated. A new process can accept no more
+than 512 arguments. The internal variable
+.CW pid
+is set to the pid of the newly created process. The new pid
+is also appended to the list of active processes stored in the variable
+.CW proclist .
+The new process is created then halted at the first instruction, causing
+the debugger to call
+.CW stopped .
+The library functions
+.CW new
+and
+.CW win
+should be used to start processes when using the standard debugging
+environment.
+.Ex
+acid: newproc("-l .")
+56720: system call _main ADD $-0x14,R29
+.Ee
+.\"
+.\"
+.\"
+.Ip string pcfile integer "Convert text address to source file name
+.CW pcfile
+interprets its
+.I integer
+argument as a text address in the debugged program. The address and symbol table
+are used to generate a string containing the name of the source file
+corresponding to the text address. If the address does not lie within the
+program the string
+.CW ?file?
+is returned.
+.Ex
+acid: print("Now at ", pcfile(*PC), ":", pcline(*PC))
+Now at ls.c:46
+.Ee
+.\"
+.\"
+.\"
+.Ip integer pcline integer "Convert text address to source line number
+.CW pcline
+interprets its
+.I integer
+argument as a text address in the debugged program. The address and symbol table
+are used to generate an integer containing the line number in the source file
+corresponding to the text address. If the address does not lie within the
+program the integer 0 is returned.
+.Ex
+acid: +file("main.c")[pcline(main)]
+main(int argc, char *argv[])
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP print item,item,... "Print expressions
+.CW print
+evaluates each
+.I item
+supplied in its argument list and prints it to standard output. Each
+argument will be printed according to its associated format character.
+When the interpreter is executing, output is buffered and flushed every
+5000 statements or when the interpreter returns to interactive mode.
+.CW print
+accepts a maximum of 512 arguments.
+.Ex
+acid: print(10, "decimal ", 10\eD, "octal ", 10\eo)
+0x0000000a decimal 10 octal 000000000012
+acid: print({1, 2, 3})
+{0x00000001 , 0x00000002 , 0x00000003 }
+acid: print(main, main\ea, "\et", @main\ei)
+0x00001020 main ADD $-64,R29
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP printto string,item,item,... "Print expressions to file
+.CW printto
+offers a limited form of output redirection. The first
+.I string
+argument is used as the path name of a new file to create.
+Each
+.I item
+is then evaluated and printed to the newly created file. When all items
+have been printed the file is closed.
+.CW printto
+accepts a maximum of 512 arguments.
+.Ex
+acid: printto("/env/foo", "hello")
+acid: rc("echo -n $foo")
+hello
+.Ee
+.\"
+.\"
+.\"
+.Ip string rc string "Execute a shell command
+.CW rc
+evaluates
+.I string
+to form a shell command. A new command interpreter is started
+to execute the command. The Acid interpreter blocks until the command
+completes. The return value is the empty string
+if the command succeeds, otherwise the exit status of the failed command.
+.Ex
+acid: rc("B "+itoa(-pcline(addr))+" "+pcfile(addr));
+.Ee
+.\"
+.\"
+.\"
+.Ip string readfile string "Read file contents into a string
+.CW readfile
+takes the contents of the file specified by
+.I string
+and returns its contents as a new string.
+If
+.CW readfile
+encounters a zero byte in the file, it terminates.
+If
+.CW readfile
+encounters an error opening or reading the file then the empty list
+is returned.
+.CW readfile
+can be used to read the contents of device files whose lines are not
+terminated with newline characters.
+.Ex
+acid: ""+readfile("/dev/label")
+helix
+.Ee
+.\"
+.\"
+.\"
+.Ip string reason integer "Print cause of program stoppage
+.CW reason
+uses machine-dependent information to generate a string explaining
+why a process has stopped. The
+.I integer
+argument is the value of an architecture dependent status register,
+for example
+.CW CAUSE
+on the MIPS.
+.Ex
+acid: print(reason(*CAUSE))
+system call
+.Ee
+.\"
+.\"
+.\"
+.Ip integer regexp pattern,string "Regular expression match
+.CW regexp
+matches the
+.I pattern
+string supplied as its first argument with the
+.I string
+supplied as its second.
+If the pattern matches the result is the value 1, otherwise 0.
+.Ex
+acid: print(regexp(".*bar", "foobar"))
+1
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP setproc integer "Set debugger focus
+.CW setproc
+selects the default process used for memory and control operations. It effectively
+shifts the focus of control between processes. The
+.I integer
+argument specifies the pid of the process to look at.
+The variable
+.CW pid
+is set to the pid of the selected process. If the process is being
+selected for the first time its pid is added to the list of active
+processes
+.CW proclist .
+.Ex
+acid: setproc(68382)
+acid: procs()
+>68382: Stopped at main+0x4 setproc(68382)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP start integer "Restart execution
+.CW start
+writes a
+.CW start
+message to the control file of the process specified by the pid
+supplied as its
+.I integer
+argument.
+.CW start
+draws an error if the process is not in the
+.CW Stopped
+state.
+.Ex
+acid: start(68382)
+acid: procs()
+>68382: Running at main+0x4 setproc(68382)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP startstop integer "Restart execution, block until stopped
+.CW startstop
+performs the same actions as a call to
+.CW start
+followed by a call to
+.CW stop .
+The
+.I integer
+argument specifies the pid of the process to control. The process
+must be in the
+.CW Stopped
+state.
+Execution is restarted, the debugger then waits for the process to
+return to the
+.CW Stopped
+state. A process will stop if a startstop message has been written to its control
+file and any of the following conditions becomes true: the process executes or returns from
+a system call, the process generates a trap or the process receives a note.
+.CW startstop
+is used to implement single stepping.
+.Ex
+acid: startstop(pid)
+75374: breakpoint ls ADD $-0x16c8,R29
+.Ee
+.\"
+.\"
+.\"
+.Ip string status integer "Return process state
+.CW status
+uses the pid supplied by its
+.I integer
+argument to generate a string describing the state of the process.
+The string corresponds to the state returned by the
+sixth column of the
+.I ps (1)
+command.
+A process must be in the
+.CW Stopped
+state to modify its memory or registers.
+.Ex
+acid: ""+status(pid)
+Stopped
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP stop integer "Wait for a process to stop
+.CW stop
+writes a
+.CW stop
+message to the control file of the process specified by the
+pid supplied as its
+.I integer
+argument.
+The interpreter blocks until the debugged process enters the
+.CW Stopped
+state.
+A process will stop if a stop message has been written to its control
+file and any of the following conditions becomes true: the process executes or returns from
+a system call, the process generates a trap, the process is scheduled or the
+process receives a note.
+.CW stop
+is used to wait for a process to halt before planting a breakpoint since Plan 9
+only allows a process's memory to be written while it is in the
+.CW Stopped
+state.
+.Ex
+defn bpset(addr) {
+ if (status(pid)!="Stopped") then {
+ print("Waiting...\en");
+ stop(pid);
+ }
+ ...
+}
+.Ee
+.\"
+.\"
+.\"
+.Ip list strace pc,sp,linkreg "Stack trace
+.CW strace
+generates a list of lists corresponding to procedures called by the debugged
+program. Each sublist describes a single stack frame in the active process.
+The first element is an
+.I integer
+of format
+.CW X
+specifying the address of the called function. The second element is the value
+of the program counter when the function was called. The third and fourth elements
+contain lists of parameter and automatic variables respectively.
+Each element of these lists
+contains a string with the name of the variable and an
+.I integer
+value of format
+.CW X
+containing the current value of the variable.
+The arguments to
+.CW strace
+are the current value of the program counter, the current value of the
+stack pointer, and the address of the link register. All three parameters
+must be integers.
+The setting of
+.I linkreg
+is architecture dependent. On the MIPS linkreg is set to the address of saved
+.CW R31 ,
+on the SPARC to the address of saved
+.CW R15 .
+For the other architectures
+.I linkreg
+is not used, but must point to valid memory.
+.Ex
+acid: print(strace(*PC, *SP, linkreg))
+{{0x0000141c, 0xc0000f74,
+{{"s", 0x0000004d}, {"multi", 0x00000000}},
+{{"db", 0x00000000}, {"fd", 0x000010a4},
+{"n", 0x00000001}, {"i", 0x00009824}}}}
+.Ee
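+.IP
+The result can be walked frame by frame.
+The following sketch prints, for each frame, the called function and the point of call in symbolic form; the variables
+.CW stk
+and
+.CW frame
+are hypothetical, and
+.CW linkreg
+is assumed to be set as described above.
+.P1
+stk = strace(*PC, *SP, linkreg);
+while stk do {
+    frame = head stk;
+    // frame[0]: address of the called function; frame[1]: pc at the call
+    print(fmt(frame[0], 'a'), " called from ", fmt(frame[1], 'a'), "\en");
+    stk = tail stk;
+}
+.P2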
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP waitstop integer "Wait for a process to stop
+.CW waitstop
+writes a waitstop message to the control file of the process specified by the
+pid supplied as its
+.I integer
+argument.
+The interpreter will remain blocked until the debugged process enters the
+.CW Stopped
+state.
+A process will stop if a waitstop message has been written to its control
+file and any of the following conditions becomes true: the process generates a trap
+or receives a note. Unlike
+.CW stop ,
+the
+.CW waitstop
+function is passive; it does not itself cause the program to stop.
+.Ex
+acid: waitstop(pid)
+75374: breakpoint ls ADD $-0x16c8,R29
+.Ee
+.\"
+.\"
+.\"
+.SH
+Library Functions
+.PP
+A standard debugging environment is provided by modules automatically
+loaded when
+Acid is started.
+These modules are located in the directory
+.CW /sys/lib/acid .
+These functions may be overridden, personalized, or added to by code defined in
+.CW $home/lib/acid .
+The implementation of these functions can be examined using the
+.CW whatis
+operator and then modified during debugging sessions.
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP Bsrc integer "Load text editor with source
+.CW Bsrc
+interprets the
+.I integer
+argument as a text address. The text address is used to produce a pathname
+and line number suitable for the external
+.CW B
+command
+of the text editor
+(e.g.,
+.I acme (1)).
+.CW Bsrc
+builds a shell
+command to invoke
+.CW B ,
+which either selects an existing source file or loads a new source file into
+the editor.
+The line of source corresponding to the text address is then selected.
+In the following example
+.CW stopped
+is redefined so that
+the editor
+follows and displays the source line currently being executed.
+.Ex
+defn stopped(pid) {
+ pstop(pid);
+ Bsrc(*PC);
+}
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP Fpr "" "Display double precision floating registers
+For machines equipped with floating point,
+.CW Fpr
+displays the contents of the floating point registers as double precision
+values.
+.Ex
+acid: Fpr()
+F0 0. F2 0.
+F4 0. F6 0.
+F8 0. F10 0.
+\&...
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP Ureg integer "Display contents of Ureg structure
+.CW Ureg
+interprets the integer passed as its first argument as the address of a
+kernel
+.CW Ureg
+structure. Each element of the structure is retrieved and printed.
+The size and contents of the
+.CW Ureg
+structure are architecture dependent.
+This function can be used to decode the first argument passed to a
+.I notify (2)
+function after a process has received a note.
+.Ex
+acid: Ureg(*notehandler:ur)
+ status 0x3000f000
+ pc 0x1020
+ sp 0x7ffffe00
+ cause 0x00004002
+\&...
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP acidinit "" "Interpreter startup
+.CW acidinit
+is called by the interpreter after all
+modules have been loaded at initialization time.
+It is used to set up machine specific variables and the default source path.
+.CW acidinit
+should not be called by user code.
+.KE
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP addsrcdir string "Add element to source search path
+.CW addsrcdir
+interprets its string argument as a new directory
+.CW findsrc
+should search when looking for source code files.
+.CW addsrcdir
+draws an error if the directory is already in the source search path. The search
+path may be examined by looking at the variable
+.CW srcpath .
+.Ex
+acid: rc("9fs fornax")
+acid: addsrcdir("/n/fornax/sys/src/cmd")
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP asm integer "Disassemble machine instructions
+.CW asm
+interprets its integer argument as a text address from which to disassemble
+machine instructions.
+.CW asm
+prints the instruction address in symbolic and hexadecimal form, then prints
+the instructions with addressing modes. Up to twenty instructions will
+be disassembled.
+.CW asm
+stops disassembling when it reaches the end of the current function.
+Instructions are read from the file image using the
+.CW @
+operator.
+.Ex
+acid: asm(main)
+main 0x00001020 ADD $-0x64,R29
+main+0x4 0x00001024 MOVW R31,0x0(R29)
+main+0x8 0x00001028 MOVW R1,argc+4(FP)
+main+0xc 0x0000102c MOVW $bin(SB),R1
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP bpdel integer "Delete breakpoint
+.CW bpdel
+removes a previously set breakpoint from memory.
+The
+.I integer
+supplied as its argument must be the address of a previously set breakpoint.
+The breakpoint address is deleted from the active breakpoint list
+.CW bplist ,
+then the original instruction is copied from the file image to the memory
+image so that the breakpoint is removed.
+.Ex
+acid: bpdel(main+4)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP bpset integer "Set a breakpoint
+.CW bpset
+places a breakpoint instruction at the address specified
+by its
+.I integer
+argument, which must be in the text segment.
+.CW bpset
+draws an error if a breakpoint has already been set at the specified address.
+A list of current breakpoints is maintained in the variable
+.CW bplist .
+Unlike in
+.I db (1),
+breakpoints are left in memory even when a process is stopped, and
+the process must exist, perhaps by being
+created by either
+.CW new
+or
+.CW win ,
+in order to place a breakpoint.
+.CW Db "" (
+accepts breakpoint commands before the process is started.)
+On the
+MIPS and SPARC architectures,
+breakpoints at function entry points should be set 4 bytes into the function
+because the
+instruction scheduler may fill
+.CW JAL
+branch delay slots with the first instruction of the function.
+.Ex
+acid: bpset(main+4)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP bptab "" "List active breakpoints
+.CW bptab
+prints a list of currently installed breakpoints. The list contains the
+breakpoint address in symbolic and hexadecimal form as well as the instruction
+the breakpoint replaced. Breakpoints are not maintained across process creation
+using
+.CW new
+and
+.CW win .
+They are maintained across a fork, but care must be taken to keep control of
+the child process.
+.Ex
+acid: bpset(ls+4)
+acid: bptab()
+ 0x00001420 ls+0x4 MOVW R31,0x0(R29)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP casm "" "Continue disassembly
+.CW casm
+continues to disassemble instructions from where the last
+.CW asm
+or
+.CW casm
+command stopped. Like
+.CW asm ,
+this command stops disassembling at function boundaries.
+.Ex
+acid: casm()
+main+0x10 0x00001030 MOVW $0x1,R3
+main+0x14 0x00001034 MOVW R3,0x8(R29)
+main+0x18 0x00001038 MOVW $0x1,R5
+main+0x1c 0x0000103c JAL Binit(SB)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP cont "" "Continue program execution
+.CW cont
+restarts execution of the currently active process.
+If the process is stopped on a breakpoint, the breakpoint is first removed,
+the program is single stepped, the breakpoint is replaced and the program
+is then set executing. This may cause
+.CW stopped()
+to be called twice.
+.CW cont
+causes the interpreter to block until the process enters the
+.CW Stopped
+state.
+.Ex
+acid: cont()
+95197: breakpoint ls+0x4 MOVW R31,0x0(R29)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP dump integer,integer,string "Formatted memory dump
+.CW dump
+interprets its first argument as an address, its second argument as a
+count and its third as a format string.
+.CW dump
+fetches an object from memory at the current address and prints it according
+to the format. The address is incremented by the number of bytes specified by
+the format and the process is repeated count times. The format string is any
+combination of format characters, each preceded by an optional count.
+For each object,
+.CW dump
+prints the address in hexadecimal, a colon, the object and then a newline.
+.CW dump
+uses
+.CW mem
+to fetch each object.
+.Ex
+acid: dump(main+35, 4, "X2bi")
+0x00001043: 0x0c8fa700 108 143 lwc2 r0,0x528f(R4)
+0x0000104d: 0xa9006811 0 0 swc3 r0,0x0(R24)
+0x00001057: 0x2724e800 4 37 ADD $-0x51,R23,R31
+0x00001061: 0xa200688d 6 0 NOOP
+0x0000106b: 0x2710c000 7 0 BREAK
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP findsrc string "Use source path to load source file
+.CW findsrc
+interprets its
+.I string
+argument as a source file. Each directory in the source path is searched
+in turn for the file. If the file is found, the source text is loaded using
+.CW file
+and stored in the list of active source files called
+.CW srctext .
+The name of the file is added to the source file name list
+.CW srcfiles .
+Users are unlikely to call
+.CW findsrc
+from the command line, but may use it from scripts to preload source files
+for a debugging session. This function is used by
+.CW src
+and
+.CW line
+to locate and load source code. The default search path for the MIPS
+is
+.CW ./ ,
+.CW /sys/src/libc/port ,
+.CW /sys/src/libc/9sys ,
+.CW /sys/src/libc/mips .
+.Ex
+acid: findsrc(pcfile(main));
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP fpr "" "Display single precision floating registers
+For machines equipped with floating point,
+.CW fpr
+displays the contents of the floating point registers as single precision
+values. When the interpreter stores or manipulates floating point values
+it converts them into double precision values.
+.Ex
+acid: fpr()
+F0 0. F1 0.
+F2 0. F3 0.
+F4 0. F5 0.
+\&...
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP func "" "Step while in function
+.CW func
+single steps the active process until it leaves the current function
+by either calling another function or returning to its caller.
+.CW func
+will execute a single instruction after leaving the current function.
+.Ex
+acid: func()
+95197: breakpoint ls+0x8 MOVW R1,R8
+95197: breakpoint ls+0xc MOVW R8,R1
+95197: breakpoint ls+0x10 MOVW R8,s+4(FP)
+95197: breakpoint ls+0x14 MOVW $0x2f,R5
+95197: breakpoint ls+0x18 JAL utfrrune(SB)
+95197: breakpoint utfrrune ADD $-0x18,R29
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP gpr "" "Display general purpose registers
+.CW gpr
+prints the values of the general purpose processor registers.
+.Ex
+acid: gpr()
+R1 0x00009562 R2 0x000010a4 R3 0x00005d08
+R4 0x0000000a R5 0x0000002f R6 0x00000008
+\&...
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP labstk integer "Print stack trace from label
+.CW labstk
+performs a stack trace from a Plan 9
+.I label .
+The kernel
+and C compilers store continuations in a common format. Since the
+compilers all use caller save conventions, a continuation may be saved by
+storing a
+.CW PC
+and
+.CW SP
+pair. This data structure is called a label and is used by the
+C function
+.CW longjmp
+and the kernel to schedule threads and processes.
+.CW labstk
+interprets its
+.I integer
+argument as the address of a label and produces a stack trace for
+the thread of execution. The value of the function
+.CW ALEF_tid
+is a suitable argument for
+.CW labstk .
+.Ex
+acid: labstk(*mousetid)
+At pc:0x00021a70:Rendez_Sleep+0x178 rendez.l:44
+Rendez_Sleep(r=0xcd7d8,bool=0xcd7e0,t=0x0) rendez.l:5
+ called from ALEF_rcvmem+0x198 recvmem.l:45
+ALEF_rcvmem(c=0x000cd764,l=0x00000010) recvmem.l:6
+\&...
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP lstk "" "Stack trace with local variables
+.CW lstk
+produces a long format stack trace.
+The stack trace includes each function in the stack,
+where it was called from, and the value of the parameters and automatic
+variables for each function.
+.CW lstk
+displays the value rather than the address of each variable and all
+variables are assumed to be integers in format
+.CW X .
+To print a variable in its correct format use the
+.CW :
+operator to find the address and apply the appropriate format before indirection
+with the
+.CW *
+operator. It may be necessary to single step a couple of instructions into
+a function to get a correct stack trace because the frame pointer adjustment
+instruction may get scheduled down into the body of the function.
+.Ex
+acid: lstk()
+At pc:0x00001024:main+0x4 ls.c:48
+main(argc=0x00000001,argv=0x7fffefec) ls.c:48
+ called from _main+0x20 main9.s:10
+ _argc=0x00000000
+ _args=0x00000000
+ fd=0x00000000
+ buf=0x00000000
+ i=0x00000000
+.Ee
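+As a minimal sketch of that idiom, assuming the process is stopped in
+.CW main
+of the
+.CW ls
+example above, the parameter
+.CW argc
+could be fetched as a decimal integer rather than in the default
+.CW X
+format. The result is omitted because it depends on the state of the process.
+.Ex
+acid: *fmt(main:argc, 'D')
+.Ee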
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP mem integer,string "Print memory object
+.CW mem
+interprets its first
+.I integer
+argument as the address of an object to be printed according to the
+format supplied in its second
+.I string
+argument.
+The format string can be any combination of format characters, each preceded
+by an optional count.
+.Ex
+acid: mem(bdata+0x326, "2c2Xb")
+P = 0xa94bc464 0x3e5ae44d 19
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP new "" "Create new process
+.CW new
+starts a new copy of the debugged program. The new program is started
+with the program arguments set by the variable
+.CW progargs .
+The new program is stopped in the second instruction of
+.CW main .
+The breakpoint list is reinitialized.
+.CW new
+may be used several times to instantiate several copies of a program
+simultaneously. The user can rotate between the copies using
+.CW setproc .
+.Ex
+acid: progargs="-l"
+acid: new()
+60: external interrupt _main ADD $-0x14,R29
+60: breakpoint main+0x4 MOVW R31,0x0(R29)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP next "" "Step through language statement
+.CW next
+steps through a single language level statement without tracing down
+through each statement in a called function. For each statement,
+.CW next
+prints the machine instructions executed as part of the statement. After
+the statement has executed, source lines around the current program
+counter are displayed.
+.Ex
+acid: next()
+60: breakpoint Binit+0x4 MOVW R31,0x0(R29)
+60: breakpoint Binit+0x8 MOVW f+8(FP),R4
+binit.c:93
+ 88
+ 89 int
+ 90 Binit(Biobuf *bp, int f, int mode)
+ 91 {
+>92 return Binits(bp, f, mode, bp->b, BSIZE);
+ 93 }
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP notestk integer "Stack trace after receiving a note
+.CW notestk
+interprets its
+.I integer
+argument as the address of a
+.CW Ureg
+structure passed by the kernel to a
+.I notify (2)
+function during note processing.
+.CW notestk
+uses the
+.CW PC ,
+.CW SP ,
+and link register from the
+.CW Ureg
+to print a stack trace corresponding to the point in the program where the note
+was received.
+To get a valid stack trace on the MIPS and SPARC architectures from a notify
+routine, the program must stop in a new function called from the notify routine
+so that the link register is valid and the notify routine's parameters are
+addressable.
+.Ex
+acid: notestk(*notify:ur)
+Note pc:0x00001024:main+0x4 ls.c:48
+main(argc=0x00000001,argv=0x7fffefec) ls.c:48
+ called from _main+0x20 main9.s:10
+ _argc=0x00000000
+ _args=0x00000000
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP pfl integer "Print source file and line
+.CW pfl
+interprets its argument as a text address and uses it to print
+the source file and line number corresponding to the address. The output
+has the same format as file addresses in
+.I acme (1).
+.Ex
+acid: pfl(main)
+ls.c:48
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP procs "" "Print active process list
+.CW procs
+prints a list of active processes attached to the debugger. Each process
+produces a single line of output giving the pid, process state, the address
+the process is currently executing, and the
+.CW setproc
+command required to make that process current.
+The current process is marked in the first column with a
+.CW >
+character. The debugger maintains a list of processes in the variable
+.CW proclist .
+.Ex
+acid: procs()
+>62: Stopped at main+0x4 setproc(62)
+ 60: Stopped at Binit+0x8 setproc(60)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP pstop integer "Print reason process stopped
+.CW pstop
+prints the status of the process specified by the
+.I integer
+pid supplied as its argument.
+.CW pstop
+is usually called from
+.CW stopped
+every time a process enters the
+.CW Stopped
+state.
+.Ex
+acid: pstop(62)
+0x0000003e: breakpoint main+0x4 MOVW R31,0x0(R29)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP regs "" "Print registers
+.CW regs
+prints the contents of both the general and special purpose registers.
+.CW regs
+calls
+.CW spr
+then
+.CW gpr
+to display the contents of the registers.
+.KE
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP source "" "Summarize source data base
+.CW source
+prints the directory search path followed by a list of currently loaded
+source files. The source management functions
+.CW src
+and
+.CW findsrc
+use the search path to locate and load source files. Source files are
+loaded incrementally into a source data base during debugging. A list
+of loaded files is stored in the variable
+.CW srcfiles
+and the contents of each source file in the variable
+.CW srctext .
+.Ex
+acid: source()
+/n/bootes/sys/src/libbio/
+./
+/sys/src/libc/port/
+/sys/src/libc/9sys/
+/sys/src/libc/mips/
+ binit.c
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP spr "" "Print special purpose registers
+.CW spr
+prints the contents of the processor control and memory management
+registers. Where possible, the contents of the registers are decoded
+to provide extra information; for example the
+.CW CAUSE
+register on the MIPS is
+printed both in hexadecimal and using the
+.CW reason
+function.
+.Ex
+acid: spr()
+PC 0x00001024 main+0x4 ls.c:48
+SP 0x7fffef68 LINK 0x00006264 _main+0x28 main9.s:12
+STATUS 0x0000ff33 CAUSE 0x00000024 breakpoint
+TLBVIR 0x000000d3 BADVADR 0x00001020
+HI 0x00000004 LO 0x00001ff7
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP src integer "Print lines of source
+.CW src
+interprets its
+.I integer
+argument as a text address and uses this address to print 5 lines
+of source before and after the address. The current line is marked with a
+.CW >
+character.
+.CW src
+uses the source search path maintained by
+.CW source
+and
+.CW addsrcdir
+to locate the required source files.
+.Ex
+acid: src(*PC)
+ls.c:47
+ 42 Biobuf bin;
+ 43
+ 44 #define HUNK 50
+ 45
+ 46 void
+>47 main(int argc, char *argv[])
+ 48 {
+ 49 int i, fd;
+ 50 char buf[64];
+ 51
+ 52 Binit(&bin, 1, OWRITE);
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP step "" "Single step process
+.CW step
+causes the debugged process to execute a single machine level instruction.
+If the program is stopped on a breakpoint set by
+.CW bpset
+it is first removed, the single step executed, and the breakpoint replaced.
+.CW step
+uses
+.CW follow
+to predict the addresses the program counter may hold after the current instruction
+has been executed. A breakpoint is placed at each of these predicted addresses
+and the process is started. When the process stops the breakpoints are removed.
+.Ex
+acid: step()
+62: breakpoint main+0x8 MOVW R1,argc+4(FP)
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP stk "" "Stack trace
+.CW stk
+produces a short format stack trace. The stack trace includes each function
+in the stack, where it was called from, and the value of the parameters.
+The short format omits the values of automatic variables.
+Parameters are assumed to be integer values in the format
+.CW X ;
+to print a parameter in the correct format use the
+.CW :
+operator to obtain its address, apply the correct format, and use the
+.CW *
+indirection operator to find its value.
+It may be necessary to single step a couple of instructions into
+a function to get a correct stack trace because the frame pointer adjustment
+instruction may get scheduled down into the body of the function.
+.Ex
+acid: stk()
+At pc:0x00001028:main+0x8 ls.c:48
+main(argc=0x00000002,argv=0x7fffefe4) ls.c:48
+ called from _main+0x20 main9.s:10
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP stmnt "" "Execute a single statement
+.CW stmnt
+executes a single language level statement.
+.CW stmnt
+displays each machine level instruction as it is executed. When the executed
+statement is completed the source for the next statement is displayed.
+Unlike
+.CW next ,
+the
+.CW stmnt
+function will trace down through function calls.
+.Ex
+acid: stmnt()
+62: breakpoint main+0x18 MOVW R5,0xc(R29)
+62: breakpoint main+0x1c JAL Binit(SB)
+62: breakpoint Binit ADD $-0x18,R29
+binit.c:91
+ 89 int
+ 90 Binit(Biobuf *bp, int f, int mode)
+>91 {
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP stopped integer "Report status of stopped process
+.CW stopped
+is called automatically by the interpreter
+every time a process enters the
+.CW Stopped
+state, such as when it hits a breakpoint.
+The pid is passed as the
+.I integer
+argument. The default implementation just calls
+.CW pstop ,
+but the function may be changed to provide more information or perform fine control
+of execution. Note that
+.CW stopped
+should return; for example, calling
+.CW step
+in
+.CW stopped
+will recur until the interpreter runs out of stack space.
+.Ex
+acid: defn stopped(pid) {
+ if *lflag != 0 then error("lflag modified");
+ }
+acid: progargs = "-l"
+acid: new();
+acid: while 1 do step();
+<stdin>:7: (error) lflag modified
+acid: stk()
+At pc:0x00001220:main+0x200 ls.c:54
+main(argc=0x00000001,argv=0x7fffffe8) ls.c:48
+ called from _main+0x20 main9.s:10
+.Ee
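+A replacement need not raise an error. As a further sketch, using only
+functions described in this section, the following definition reports the
+stop with
+.CW pstop
+and then shows the source around the saved program counter. It returns
+without restarting the process, so it does not recur.
+.Ex
+acid: defn stopped(pid) {
+	pstop(pid);
+	src(*PC);
+	}
+.Ee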
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP symbols string "Search symbol table
+.CW symbols
+uses the regular expression supplied by
+.I string
+to search the symbol table for symbols whose name matches the
+regular expression.
+.Ex
+acid: symbols("main")
+main T 0x00001020
+_main T 0x0000623c
+.Ee
+.\"
+.\"
+.\"
+.Ip \f(CW{}\fP win "" "Start new process in a window
+.CW win
+performs exactly the same function as
+.CW new
+but uses the window system to create a new window for the debugged process.
+The variable
+.CW progargs
+supplies arguments to the new process.
+The environment variable
+.CW $8½srv
+must be set to allow the interpreter to locate the mount channel for the
+window system.
+The window is created in the top left corner of the screen and is
+400x600 pixels in size. The
+.CW win
+function may be modified to alter the geometry.
+The window system will not be able to deliver notes in the new window
+since the pid of the created process is not passed when the server is
+mounted to create a new window.
+.Ex
+acid: win()
+.Ee
binary files /dev/null b/doc/acid.pdf differ
--- /dev/null
+++ b/doc/acidpaper.ms
@@ -1,0 +1,1327 @@
+.TL
+Acid: A Debugger Built From A Language
+.AU
+.I "Phil Winterbottom"
+.AI
+.I "Lucent Technologies Inc"
+.AB
+.FS
+\l'1i'
+.br
+Originally appeared in
+.I
+Proc. of the Winter 1994 USENIX Conf.,
+.R
+pp. 211-222,
+San Francisco, CA;
+and subsequently in the
+.I "Plan 9 Programmer's Manual, Volume 2 (Second Edition)" .
+.FE
+Acid is an unusual source-level symbolic debugger for Plan 9. It is implemented
+as a language interpreter with specialized primitives that provide
+debugger support. Programs written in the language manipulate
+one or more target processes; variables in the language represent the
+symbols, state, and resources of those processes.
+This structure allows complex
+interaction between the debugger and the target program and
+provides a convenient method of parameterizing differences between
+machine architectures.
+Although some effort is required to learn
+the debugging language, the richness and flexibility of the
+debugging environment encourage new ways of reasoning about the way
+programs run and the conditions under which they fail.
+.AE
+.NH
+Introduction
+.PP
+The size and complexity
+of programs have increased in proportion to processor speed and memory but
+the interface between debugger and programmer has changed little.
+Graphical user interfaces have eased some of the tedious
+aspects of the interaction. A graphical interface is a convenient
+means for navigating through source and data structures but provides
+little benefit for process control.
+The introduction of a new concurrent language, Alef [Win93], emphasized the
+inadequacies of the existing Plan 9 [Pike90] debugger
+.I db ,
+a distant relative of
+.I adb ,
+and made it clear that a new debugger was required.
+.PP
+Current debuggers like
+.I dbx ,
+.I sdb ,
+and
+.I gdb
+are limited to answering only the questions their authors
+envisage. As a result, they supply a plethora
+of specialized commands, each attempting to anticipate
+a specific question a user may ask.
+When a debugging situation arises that is beyond the scope
+of the command set, the tool is useless.
+Further,
+it is often tedious or impossible to reproduce an anomalous state
+of the program, especially when
+the state is embedded in the program's data structures.
+.PP
+Acid applies some ideas found in CAD software used for
+hardware test and simulation.
+It is based on the notion that the state and resources of a program
+are best represented and manipulated by a language. The state and resources,
+such as memory, registers, variables, type information and source code
+are represented by variables in the language.
+Expressions provide a computation mechanism and control
+statements allow repetitive or selective interpretation based
+on the result of expression evaluation.
+The heart of the Acid debugger is an interpreter for a small typeless
+language whose operators mirror the operations
+of C and Alef, which in turn correspond well to the basic operations of
+the machine. The interpreter itself knows nothing of the underlying
+hardware; it deals with the program state and resources
+in the abstract.
+Fundamental routines to control
+processes, read files, and interface to the system are implemented
+as builtin functions available to the interpreter.
+The actual debugger functionality is coded
+in Acid; commands are implemented as Acid functions.
+.PP
+This language-based approach has several advantages.
+Most importantly, programs written in Acid, including most of the
+debugger itself, are inherently portable.
+Furthermore, Acid avoids the limitations other debuggers impose when
+debugging parallel programs. Instead of embedding a fixed
+process model in the debugger, Acid allows the
+programmer to adapt the debugger to handle an
+arbitrary process partitioning or program structure.
+The ability to
+interact dynamically with an executing process provides clear advantages
+over debuggers constrained to probe a static image.
+Finally, the Acid language is a powerful vehicle for expressing
+assertions about logic, process state, and the contents of data structures.
+When combined with dynamic interaction it allows a
+limited form of automated program verification without requiring
+modification or recompilation of the source code.
+The language is also an
+excellent vehicle for preserving a test suite for later regression testing.
+.PP
+The debugger may be customized by its users; standard
+functions may be modified or extended to suit a particular application
+or preference.
+For example, the kernel developers in our group require a
+command set supporting assembler-level debugging while the application
+programmers prefer source-level functionality.
+Although the default library is biased toward assembler-level debugging,
+it is easily modified to provide a convenient source-level interface.
+The debugger itself does not change; the user combines primitives
+and existing Acid functions in different ways to
+implement the desired interface.
+.NH
+Related Work
+.PP
+DUEL [Gol93], an extension to
+.I gdb
+[Stal91], proposes using a high level expression evaluator to solve
+some of these problems. The evaluator provides iterators to loop over data
+structures and conditionals to control evaluation of expressions.
+The author shows that complex state queries can be formulated
+by combining concise expressions but this only addresses part of the problem.
+A program is a dynamic entity; questions asked when the program is in
+a static state are meaningful only after the program has been `caught' in
+that state. The framework for manipulating the program is still as
+primitive as the underlying debugger. While DUEL provides a means to
+probe data structures it entirely neglects the most beneficial aspect
+of debugging languages: the ability to control processes. Acid is structured
+around a thread of control that passes between the interpreter and the
+target program.
+.PP
+The NeD debugger [May92] is a set of extensions to TCL [Ous90] that provide
+debugging primitives. The resulting language, NeDtcl, is used to implement
+a portable interface between a conventional debugger, pdb [May90], and
+a server that executes NeDtcl programs operating on the target program.
+Execution of the NeDtcl programs implements the debugging primitives
+that pdb expects.
+NeD is targeted at multi-process debugging across a network,
+and proves the flexibility of a language as a means of
+communication between debugging tools. Whereas NeD provides an interface
+between a conventional debugger and the process it debugs, Acid is the
+debugger itself. While NeD has some of the ideas
+found in Acid it is targeted toward a different purpose. Acid seeks to
+integrate the manipulation of a program's resources into the debugger
+while NeD provides a flexible interconnect between components of
+the debugging environment. The choice of TCL is appropriate for its use
+in NeD but is not suitable for Acid. Acid relies on the coupling of the type
+system with expression evaluation, which is the root of its design,
+to provide the debugging primitives.
+.PP
+Dalek [Ols90] is an event based language extension to gdb. State transitions
+in the target program cause events to be queued for processing by the
+debugging language.
+.PP
+Acid has many of the advantages of same-process or
+.I local
+.I agent
+debuggers, like Parasight [Aral], without the need for dynamic linking or
+shared memory.
+Acid improves on the ideas of these other systems by completely integrating
+all aspects of the debugging process into the language environment. Of
+particular importance is the relationship between Acid variables,
+program symbols, source code, registers and type information. This
+integration is made possible by the design of the Acid language.
+.PP
+Interpreted languages such as Lisp and Smalltalk are able to provide
+richer debugging environments through more complete information than
+their compiled counterparts. Acid is a means to gather and represent
+similar information about compiled programs through cooperation
+with the compilation tools and library implementers.
+.NH
+Acid the Language
+.PP
+Acid is a small interpreted language targeted to its debugging task.
+It focuses on representing program state and addressing data rather than
+expressing complex computations. Program state is
+.I addressable
+from an Acid program.
+In addition to parsing and executing expressions and providing
+an architecture-independent interface to the target process,
+the interpreter supplies a mark-and-scan garbage collector
+to manage storage.
+.PP
+Every Acid session begins with the loading of the Acid libraries.
+These libraries contain functions, written in Acid, that provide
+a standard debugging environment including breakpoint management,
+stepping by instruction or statement, stack tracing, and
+access to variables, memory, and registers.
+The library contains 600 lines of Acid code and provides
+functionality similar to
+.I dbx .
+Following the loading of the system library, Acid loads
+user-specified libraries; this load sequence allows the
+user to augment or override the standard commands
+to customize the debugging environment. When all libraries
+are loaded, Acid issues an interactive prompt and begins
+evaluating expressions entered by the user. The Acid `commands'
+are actually invocations of builtin primitives or previously defined
+Acid functions. Acid evaluates each expression as it is entered and
+prints the result.
+.NH
+Types and Variables
+.PP
+Acid variables are of four basic types:
+.I integer ,
+.I string ,
+.I float ,
+and
+.I list .
+The type of a variable is inferred from the type of the right-hand side of
+an assignment expression.
+Many of the operators can be applied to more than
+one type; for these operators the action of the operator is determined
+by the type of its operands.
+For example,
+the
+.CW +
+operator adds
+.I integer
+and
+.I float
+operands, and concatenates
+.I string
+and
+.I list
+operands.
+Lists are the only complex type in Acid; there are no arrays, structures
+or pointers. Operators provide
+.CW head ,
+.CW tail ,
+.CW append
+and
+.CW delete
+operations.
+Lists can also be indexed like arrays.
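+.PP
+A short sketch of these operations follows. The variable name
+.CW l
+is arbitrary, indices are assumed to count from zero, and results are
+omitted because they depend on the formats attached to the values.
+.P1
+acid: l = {1, 2, 3}		// build a list
+acid: head l			// first element
+acid: tail l			// list without its first element
+acid: l = append l, 4		// add an element at the end
+acid: l = delete l, 0		// remove the element at index 0
+acid: l[1]			// index like an array
+acid: l + {5, 6}		// + concatenates lists
+.P2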
+.PP
+Acid has two levels of scope: global and local.
+Function parameters and variables declared in a function body
+using the
+.CW local
+keyword are created at entry to the function and
+exist for the lifetime of the function.
+Global variables are created by assignment and need not be declared.
+All variables and functions in the program
+being debugged are entered in the Acid symbol table as global
+variables during Acid initialization.
+Conflicting variable names are resolved by prefixing enough `$' characters
+to make them unique.
+Syntactically, Acid variables and target program
+symbols are referenced identically.
+However, the variables are managed differently in the Acid
+symbol table and the user must be aware of this distinction.
+The value of an Acid variable is stored in the symbol
+table; a reference returns the value.
+The symbol table entry for a variable or function in the target
+program contains the address of that symbol in the image
+of the program. Thus, the value of a program variable is
+accessed by indirect reference through the Acid
+variable that has the same name; the value of an Acid variable is the
+address of the corresponding program variable.
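+.PP
+The distinction can be sketched as follows. Here
+.CW x
+is an Acid variable, while
+.CW lflag
+stands for a hypothetical global variable in the target program.
+Output is omitted.
+.P1
+acid: x = 10		// Acid variable: the symbol table holds the value
+acid: x			// a reference yields the value
+acid: lflag		// program symbol: yields the address of lflag
+acid: *lflag		// indirection fetches the value from the process
+.P2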
+.NH
+Control Flow
+.PP
+The
+.CW while
+and
+.CW loop
+statements implement looping.
+The former
+is similar to the same statement in C.
+The latter evaluates starting and ending expressions yielding
+integers and iterates while an incrementing loop index
+is within the bounds of those expressions.
+.P1
+acid: i = 0; loop 1,5 do print(i=i+1)
+0x00000001
+0x00000002
+0x00000003
+0x00000004
+0x00000005
+acid:
+.P2
+The traditional
+.CW if-then-else
+statement implements conditional execution.
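+.PP
+As an illustrative sketch (the function name
+.CW countdown
+is invented for this example), a
+.CW while
+loop and an
+.CW if-then-else
+statement can be combined in a function body. Braces group the
+statements controlled by
+.CW do ,
+.CW then ,
+and
+.CW else .
+.P1
+defn countdown(n) {
+	while n > 0 do {
+		if n == 1 then {
+			print("last\en");
+		} else {
+			print(fmt(n, 'D'), "\en");
+		}
+		n = n-1;
+	}
+}
+.P2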
+.NH
+Addressing
+.PP
+Two indirection operators allow Acid to access values in
+the program being debugged.
+The
+.CW *
+operator fetches a value from the memory image of an
+executing process;
+the
+.CW @
+operator fetches a value from the text file of the process.
+When either operator appears on the left side of an assignment, the value
+is written rather than read.
+.PP
+The indirection operator must know the size of the object
+referenced by a variable.
+The Plan 9 compilers neglect to include this
+information in the program symbol table, so Acid cannot
+derive this information implicitly.
+Instead Acid variables have formats.
+The format is a code
+letter specifying the printing style and the effect of some of the
+operators on that variable.
+The indirection operators look at the format code to determine the
+number of bytes to read or write.
+The format codes are derived from the format letters used by
+.I db .
+By default, symbol table variables and numeric constants
+are assigned the format code
+.CW 'X'
+which specifies 32-bit hexadecimal.
+Printing such a variable yields output of the form
+.CW 0x00123456 .
+An indirect reference through the variable fetches 32 bits
+of data at the address indicated by the variable.
+Other formats specify various data types; for example,
+.CW i
+is an instruction,
+.CW D
+a signed 32-bit decimal, and
+.CW s
+a null-terminated string.
+The
+.CW fmt
+function
+allows the user to change the format code of a variable
+to control the printing format and
+operator side effects.
+This function evaluates the expression supplied as the first
+argument, attaches the format code supplied as the second
+argument to the result and returns that value.
+If the result is assigned to a variable,
+the new format code applies to
+that variable. For convenience, Acid provides the
+.CW \e
+operator as a shorthand infix form of
+.CW fmt .
+For example:
+.P1
+acid: x=10
+acid: x // print x in hex
+0x0000000a
+acid: x = fmt(x, 'D') // make x type decimal
+acid: print(x, fmt(x, 'X'), x\eX) // print x in decimal & hex
+10 0x0000000a 0x0000000a
+acid: x // print x in decimal
+10
+acid: x\eo // print x in octal
+000000000012
+.P2
+The
+.CW ++
+and
+.CW --
+operators increment or decrement a variable by an amount
+determined by its format code. Some formats imply a non-fixed size.
+For example, the
+.CW i
+format code disassembles an instruction into a string.
+On a 68020, which has variable length instructions:
+.P1
+acid: p=main\ei // p=addr(main), type INST
+acid: loop 1,5 do print(p\eX, @p++) // disassemble 5 instr's
+0x0000222e LEA 0xffffe948(A7),A7
+0x00002232 MOVL s+0x4(A7),A2
+0x00002236 PEA 0x2f($0)
+0x0000223a MOVL A2,-(A7)
+0x0000223c BSR utfrrune
+acid:
+.P2
+Here,
+.CW main
+is the address of the function of the same name in the program under test.
+The loop retrieves the five instructions beginning at that address and
+then prints the address and the assembly language representation of each.
+Notice that the stride of the increment operator varies with the size of
+the instruction: the
+.CW MOVL
+at
+.CW 0x0000223a
+is a two byte instruction while the three before it are each four bytes long.
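+.PP
+Because the two indirection operators read from different places, the same
+formatted address can be fetched both ways. The following sketch, with
+output omitted, reads the first instruction of
+.CW main
+from the running process and from the text file. The two would be
+expected to differ only if the process image has been modified, for
+example by a breakpoint.
+.P1
+acid: p = main\ei	// address of main, instruction format
+acid: *p		// instruction in the memory image of the process
+acid: @p		// instruction stored in the text file
+.P2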
+.PP
+Registers are treated as normal program variables referenced
+by their symbolic assembler language names.
+When a
+process stops, the register set is saved by the kernel
+at a known virtual address in the process memory map.
+The Acid variables associated with the registers point
+to the saved values and the
+.CW *
+indirection operator can then be used to read and write the register set.
+Since the registers are accessed via Acid variables they may
+be used in arbitrary expressions.
+.P1
+acid: PC // addr of saved PC
+0xc0000f60
+acid: *PC // contents of PC
+0x0000623c
+acid: *PC\ea
+main
+acid: *R1=10 // modify R1
+acid: asm(*PC+4) // disassemble @ PC+4
+main+0x4 0x00006240 MOVW R31,0x0(R29)
+main+0x8 0x00006244 MOVW $setR30(SB),R30
+main+0x10 0x0000624c MOVW R1,_clock(SB)
+.P2
+Here, the saved
+.CW PC
+is stored at address
+.CW 0xc0000f60 ;
+its current content is
+.CW 0x0000623c .
+The
+.CW a ' `
+format code converts this value to a string specifying
+the address as an offset beyond the nearest symbol.
+After setting the value of register
+.CW R1 ,
+the example uses the
+.CW asm
+command to disassemble a short section of code beginning
+at four bytes beyond the current value of the
+.CW PC .
+.NH
+Process Interface
+.PP
+A program executing under Acid is monitored through the
+.I proc
+file system interface provided by Plan 9.
+Textual messages written to the
+.CW ctl
+file control the execution of the process.
+For example, writing
+.CW waitstop
+to the control file causes the write to block until the target
+process enters the kernel and is stopped. When the process is stopped
+the write completes. The
+.CW startstop
+message starts the target process and then does a
+.CW waitstop
+action.
+Synchronization between the debugger and the target process is determined
+by the actions of the various messages. Some operate asynchronously to the
+target process and always complete immediately, others block until the
+action completes. The asynchronous messages allow Acid to control
+several processes simultaneously.
+.PP
+The interpreter has builtin functions named after each of the control
+messages. The functions take a process id as argument.
+Any time a control message causes the program to execute instructions
+the interpreter performs two actions when the control operation has completed.
+The Acid variables pointing at the register set are fixed up to point
+at the saved registers, and then
+the user defined function
+.CW stopped
+is executed.
+The
+.CW stopped
+function may print the current address,
+line of source or instruction and return to interactive mode. Alternatively
+it may traverse a complex data structure, gather statistics and then set
+the program running again.
+.PP
+Several Acid variables are maintained by the debugger rather than the
+programmer.
+These variables allow generic Acid code to deal with the current process,
+architecture specifics or the symbol table.
+The variable
+.CW pid
+is the process id of the current process Acid is debugging.
+The variable
+.CW symbols
+contains a list of lists where each sublist contains the symbol
+name, its type and the value of the symbol.
+The variable
+.CW registers
+contains a list of the machine-specific register names. Global symbols in the target program
+can be referenced directly by name from Acid. Local variables
+are referenced using the colon operator as \f(CWfunction:variable\fP.
+.NH
+Source Level Debugging
+.PP
+Acid provides several builtin functions to manipulate source code.
+The
+.CW file
+function reads a text file, inserting each line into a list.
+The
+.CW pcfile
+and
+.CW pcline
+functions each take an address as an argument.
+The first
+returns a string containing the name of the source file
+and the second returns an integer containing the line number
+of the source line containing the instruction at the address.
+.P1
+acid: pcfile(main) // file containing main
+main.c
+acid: pcline(main) // line # of main in source
+11
+acid: file(pcfile(main))[pcline(main)] // print that line
+main(int argc, char *argv[])
+acid: src(*PC) // print statements nearby
+ 9
+ 10 void
+>11 main(int argc, char *argv[])
+ 12 {
+ 13 int a;
+.P2
+In this example, the three primitives are combined in an expression to print
+a line of source code associated with an address.
+The
+.CW src
+function prints a few lines of source
+around the address supplied as its argument. A companion routine,
+.CW Bsrc ,
+communicates with the external editor
+.CW sam .
+Given an address, it loads the corresponding source file into the editor
+and highlights the line containing the address. This simple interface
+is easily extended to more complex functions.
+For example, the
+.CW step
+function can select the current file and line in the editor
+each time the target program stops, giving the user a visual
+trace of the execution path of the program. A more complete interface
+allowing two way communication between Acid and the
+.CW acme
+user interface [Pike93] is under construction. A filter between the debugger
+and the user interface provides interpretation of results from both
+sides of the interface. This allows the programming environment to
+interact with the debugger and vice-versa, a capability missing from the
+.CW sam
+interface.
+The
+.CW src
+and
+.CW Bsrc
+functions are both written in Acid code using the file and line primitives.
+Acid provides library functions to step through source level
+statements and functions. Furthermore, addresses in Acid expressions can be
+specified by source file and line.
+Source code is manipulated in the Acid
+.I list
+data type.
+.NH
+The Acid Library
+.PP
+The following examples define some useful commands and
+illustrate the interaction of the debugger and the interpreter.
+.P1
+defn bpset(addr) // set breakpoint
+{
+ if match(addr, bplist) >= 0 then
+ print("bkpoint already set:", addr\ea, "\en");
+ else {
+ *fmt(addr, bpfmt) = bpinst; // plant it
+ bplist = append bplist, addr; // add to list
+ }
+}
+.P2
+The
+.CW bpset
+function plants a breakpoint in memory. The function starts by
+using the
+.CW match
+builtin to
+search the breakpoint list to determine if a breakpoint is already
+set at the address.
+The indirection operator, controlled by the format code returned
+by the
+.CW fmt
+primitive, is used to plant the breakpoint in memory.
+The variables
+.CW bpfmt
+and
+.CW bpinst
+are Acid global variables containing the format code specifying
+the size of the breakpoint instruction and the breakpoint instruction
+itself.
+These
+variables are set by architecture-dependent library code
+when the debugger first attaches to the executing image.
+Finally the address of the breakpoint is
+appended to the breakpoint list,
+.CW bplist .
+.P1
+defn step() // single step
+{
+ local lst, lpl, addr, bput;
+
+ bput = 0; // sitting on bkpoint
+ if match(*PC, bplist) >= 0 then {
+ bput = fmt(*PC, bpfmt); // save current addr
+ *bput = @bput; // replace it
+ }
+
+ lst = follow(*PC); // get follow set
+
+ lpl = lst;
+ while lpl do { // place breakpoints
+ *(head lpl) = bpinst;
+ lpl = tail lpl;
+ }
+
+ startstop(pid); // do the step
+
+ while lst do { // remove breakpoints
+ addr = fmt(head lst, bpfmt);
+ *addr = @addr; // replace instr.
+ lst = tail lst;
+ }
+ if bput != 0 then
+ *bput = bpinst; // restore breakpoint
+}
+.P2
+The
+.CW step
+function executes a single assembler instruction.
+If the
+.CW PC
+is sitting
+on a breakpoint, the address and size of
+the breakpoint are saved.
+The breakpoint instruction
+is then removed using the
+.CW @
+operator to fetch
+.CW bpfmt
+bytes from the text file and to place them into the memory
+of the executing process using the
+.CW *
+operator.
+The
+.CW follow
+function is an Acid
+builtin which returns a follow-set: a list of instruction addresses which
+could be executed next.
+If the instruction stored at the
+.CW PC
+is a branch instruction, the
+list contains the addresses of the next instruction and
+the branch destination; otherwise, it contains only the
+address of the next instruction.
+The follow-set is then used to replace each possible following
+instruction with a breakpoint instruction. The original
+instructions need not be saved; they remain
+in their unaltered state in the text file.
+The
+.CW startstop
+builtin writes the `startstop' message to the
+.I proc
+control file for the process named
+.CW pid .
+The target process executes until some condition causes it to
+enter the kernel, in this case, the execution of a breakpoint.
+When the process blocks, the debugger regains control and invokes the
+Acid library function
+.CW stopped
+which reports the address and cause of the blockage.
+The
+.CW startstop
+function completes and returns to the
+.CW step
+function where
+the follow-set is used to replace the breakpoints placed earlier.
+Finally, if the address of the original
+.CW PC
+contained a breakpoint, it is replaced.
+.PP
+Notice that this approach to process control is inherently portable;
+the Acid code is shared by the debuggers for all architectures.
+Acid variables and builtin functions provide a transparent interface
+to architecture-dependent values and functions. Here the breakpoint
+value and format are referenced through Acid variables and the
+.CW follow
+primitive masks the differences in the underlying instruction set.
+.PP
+The
+.CW next
+function, similar to the
+.I dbx
+command of the same name,
+is a simpler example.
+This function steps through
+a single source statement but steps over function calls.
+.P1
+defn next()
+{
+ local sp, bound;
+
+ sp = *SP; // save starting SP
+ bound = fnbound(*PC); // begin & end of fn.
+ stmnt(); // step 1 statement
+ pc = *PC;
+ if pc >= bound[0] && pc < bound[1] then
+ return {};
+
+ while (pc<bound[0] || pc>bound[1]) && sp>=*SP do {
+ step();
+ pc = *PC;
+ }
+ src(*PC);
+}
+.P2
+The
+.CW next
+function
+starts by saving the current stack pointer in a local variable.
+It then uses the Acid library function
+.CW fnbound
+to return the addresses of the first and last instructions in
+the current function in a list.
+The
+.CW stmnt
+function executes a single source statement and then uses
+.CW src
+to print a few lines of source around the new
+.CW PC .
+If the new value of the
+.CW PC
+remains in the current function,
+.CW next
+returns.
+When the executed statement is a function call or a return
+from a function, the new value of the
+.CW PC
+is outside the bounds calculated by
+.CW fnbound
+and the test of the
+.CW while
+loop is evaluated.
+If the statement was a return, the new value of the stack pointer
+is greater than the original value and the loop completes without
+execution.
+Otherwise, the loop is entered and instructions are continually
+executed until the value of the
+.CW PC
+is between the bounds calculated earlier. At that point, execution
+ceases and a few lines of source in the vicinity of the
+.CW PC
+are printed.
+.PP
+Acid provides concise and elegant expression for control and
+manipulation of target programs. These examples demonstrate how a
+few well-chosen primitives can be combined to create a rich debugging environment.
+.NH
+Dealing With Multiple Architectures
+.PP
+A single binary of Acid may be used to debug a program running on any
+of the five processor architectures supported by Plan 9. For example,
+Plan 9 allows a user on a MIPS to import the
+.I proc
+file system from an i486-based PC and remotely debug a program executing
+on that processor.
+.PP
+Two levels of abstraction provide this architecture independence.
+On the lowest level, a Plan 9 library supplies functions to
+decode the file header of the program being debugged and
+select a table of system parameters
+and a jump vector of architecture-dependent
+functions based on the magic number.
+Among these functions are byte-order-independent
+access to memory and text files, stack manipulation, disassembly,
+and floating point number interpretation.
+The second level of abstraction is supplied by Acid.
+It consists of primitives and approximately 200 lines
+of architecture-dependent Acid library code that interface the
+interpreter to the architecture-dependent library.
+This layer performs functions such as mapping register names to
+memory locations, supplying breakpoint values and sizes,
+and converting processor specific data to Acid data types.
+An example of the latter is the stack trace function
+.CW strace ,
+which uses the stack traversal functions in the
+architecture-dependent library to construct a list of lists describing
+the context of a process. The first level of list selects
+each function in the trace; subordinate lists contain the
+names and values of parameters and local variables of
+the functions. Acid commands and library functions that
+manipulate and display process state information operate
+on the list representation and are independent of the
+underlying architecture.
+.NH
+Alef Runtime
+.PP
+Alef is a concurrent programming language,
+designed specifically for systems programming, which supports both
+shared variable and message passing paradigms.
+Alef borrows the C expression syntax but implements
+a substantially different type system.
+The language provides a rich set of
+exception handling, process management, and synchronization
+primitives, which rely on a runtime system.
+Alef program bugs are often deadlocks, synchronization failures,
+or non-termination caused by locks being held incorrectly.
+In such cases, a process stalls deep
+in the runtime code and it is clearly
+unreasonable to expect a programmer using the language
+to understand the detailed
+internal semantics of the runtime support functions.
+.PP
+Instead, there is an Alef support library, coded in Acid, that
+allows the programmer to interpret the program state in terms of
+Alef operations. Consider the example of a multi-process program
+stalling because of improper synchronization. A stack trace of
+the program indicates that it is waiting for an event in some
+obscure Alef runtime
+synchronization function.
+The function itself is irrelevant to the
+programmer; of greater importance is the identity of the
+unfulfilled event.
+Commands in the Alef support library decode
+the runtime data structures and program state to report the cause
+of the blockage in terms of the high-level operations available to
+the Alef programmer.
+Here, the Acid language acts
+as a communications medium between Alef implementer and Alef user.
+.NH
+Parallel Debugging
+.PP
+The central issue in parallel debugging is how the debugger is
+multiplexed between the processes comprising
+the program.
+Acid has no intrinsic model of process partitioning; it
+only assumes that parallel programs share a symbol table,
+though they need not share memory.
+The
+.CW setproc
+primitive attaches the debugger to a running process
+associated with the process ID supplied as its argument
+and assigns that value to the global variable
+.CW pid ,
+thereby allowing simple rotation among a group of processes.
+Further, the stack trace primitive is driven by parameters
+specifying a unique process context, so it is possible to
+examine the state of cooperating processes without switching
+the debugger focus from the process of interest.
+Since Acid is inherently extensible and capable of
+dynamic interaction with subordinate processes, the
+programmer can define Acid commands to detect and control
+complex interactions between processes.
+In short, the programmer is free to specify how the debugger reacts
+to events generated in specific threads of the program.
+.PP
+The support for parallel debugging in Acid depends on a crucial kernel
+modification: when the text segment of a program is written (usually to
+place a breakpoint), the segment is cloned to prevent other threads
+from encountering the breakpoint. Although this incurs a slight performance
+penalty, it is of little importance while debugging.
+.NH
+Communication Between Tools
+.PP
+The Plan 9 Alef and C compilers do not
+embed detailed type information in the symbol table of an
+executable file.
+However, they do accept a command line option causing them to
+emit descriptions of complex data types
+(e.g., aggregates and abstract data types)
+to an auxiliary file.
+The vehicle for expressing this information is Acid source code.
+When an Acid debugging session is
+subsequently started, that file is loaded with the other Acid libraries.
+.PP
+For each complex object in the program the compiler generates
+three pieces of Acid code.
+The first is a table describing the size and offset of each
+member of the complex data type. Following is an Acid function,
+named the same as the object, that formats and prints each member.
+Finally, Acid declarations associate the
+Alef or C program variables of a type with the functions
+to print them.
+The three forms of declaration are shown in the following example:
+.P1
+struct Bitmap {
+ Rectangle 0 r;
+ Rectangle 16 clipr;
+ 'D' 32 ldepth;
+ 'D' 36 id;
+ 'X' 40 cache;
+};
+.P2
+.P1
+defn
+Bitmap(addr) {
+ complex Bitmap addr;
+ print("Rectangle r {\en");
+ Rectangle(addr.r);
+ print("}\en");
+ print("Rectangle clipr {\en");
+ Rectangle(addr.clipr);
+ print("}\en");
+ print(" ldepth ", addr.ldepth, "\en");
+ print(" id ", addr.id, "\en");
+ print(" cache ", addr.cache, "\en");
+};
+
+complex Bitmap darkgrey;
+complex Bitmap Window_settag:b;
+.P2
+The
+.CW struct
+declaration specifies decoding instructions for the complex type named
+.CW Bitmap .
+Although the syntax is superficially similar to a C structure declaration,
+the semantics differ markedly: the C declaration specifies a layout, while
+the Acid declaration tells how to decode it.
+The declaration specifies a type, an offset, and name for each
+member of the complex object. The type is either the name of another
+complex declaration, for example,
+.CW Rectangle ,
+or a format code.
+The offset is the number of bytes from the start
+of the object to the member
+and the name is the member's name in the Alef or C declaration.
+This type description is a close match for C and Alef, but is simple enough
+to be language independent.
+.PP
+The
+.CW Bitmap
+function expects the address of a
+.CW Bitmap
+as its only argument.
+It uses the decoding information contained in the
+.CW Bitmap
+structure declaration to extract, format, and print the
+value of each member of the complex object pointed to by
+the argument.
+The Alef compiler emits code to call other Acid functions
+where a member is another complex type; here,
+.CW Bitmap
+calls
+.CW Rectangle
+to print its contents.
+.PP
+The
+.CW complex
+declarations associate Alef variables with complex types.
+In the example,
+.CW darkgrey
+is the name of a global variable of type
+.CW Bitmap
+in the program being debugged.
+Whenever the name
+.CW darkgrey
+is evaluated by Acid, it automatically calls the
+.CW Bitmap
+function with the address of
+.CW darkgrey
+as the argument.
+The second
+.CW complex
+declaration associates a local variable or parameter named
+.CW b
+in function
+.CW Window_settag
+with the
+.CW Bitmap
+complex data type.
+.PP
+Acid borrows the C operators
+.CW .
+and
+.CW ->
+to access the decoding parameters of a member of a complex type.
+Although this representation is sufficiently general for describing
+the decoding of both C and Alef complex data types, it may
+prove too restrictive for target languages with more complicated
+type systems.
+Further, the assumption that the compiler can select the proper
+Acid format code for each basic type in the language is somewhat
+naive. For example, when a member of a complex type is a pointer,
+it is assigned a hexadecimal type code; integer members are always
+assigned a decimal type code.
+This heuristic proves inaccurate when an integer field is a
+bit mask or set of bit flags which are more appropriately displayed
+in hexadecimal or octal.
+.NH
+Code Verification
+.PP
+Acid's ability to interact dynamically with
+an executing program allows passive test and
+verification of the target program. For example,
+a common concern is leak detection in programs using
+.CW malloc .
+Of interest are two items: finding memory that was allocated
+but never freed and detecting bad pointers passed to
+.CW free .
+An auxiliary Acid library contains Acid functions to
+monitor the execution of a program and detect these
+faults, either as they happen or in the automated
+post-mortem analysis of the memory arena.
+In the following example, the
+.CW sort
+command is run under the control of the
+Acid memory leak library.
+.P1
+helix% acid -l malloc /bin/sort
+/bin/sort: mips plan 9 executable
+/lib/acid/port
+/lib/acid/mips
+/lib/acid/malloc
+acid: go()
+now
+is
+the
+time
+<ctrl-d>
+is
+now
+the
+time
+27680 : breakpoint _exits+0x4 MOVW $0x8,R1
+acid:
+.P2
+The
+.CW go
+command creates a process and plants
+breakpoints at the entry to
+.CW malloc
+and
+.CW free .
+The program is then started and continues until it
+exits or stops. If the reason for stopping is anything
+other than the breakpoints in
+.CW malloc
+and
+.CW free ,
+Acid prints the usual status information and returns to the
+interactive prompt.
+.PP
+When the process stops on entering
+.CW malloc ,
+the debugger must capture and save the address that
+.CW malloc
+will return.
+After saving a stack
+trace so the calling routine can be identified, it places
+a breakpoint at the return address and restarts the program.
+When
+.CW malloc
+returns, the breakpoint stops the program,
+allowing the debugger
+to grab the address of the new memory block from the return register.
+The address and stack trace are added to the list of outstanding
+memory blocks, the breakpoint is removed from the return point, and
+the process is restarted.
+.PP
+When the process stops at the beginning of
+.CW free ,
+the memory address supplied as the argument is compared to the list
+of outstanding memory blocks. If it is not found, an error message
+and a stack trace of the call are reported; otherwise, the
+address is deleted from the list.
+.PP
+When the program exits, the list of outstanding memory blocks contains
+the addresses of all blocks that were allocated but never freed.
+The
+.CW leak
+library function traverses the list producing a report describing
+the allocated blocks.
+.P1 1m
+acid: leak()
+Lost a total of 524288 bytes from:
+ malloc() malloc.c:32 called from dofile+0xe8 sort.c:217
+ dofile() sort.c:190 called from main+0xac sort.c:161
+ main() sort.c:128 called from _main+0x20 main9.s:10
+Lost a total of 64 bytes from:
+ malloc() malloc.c:32 called from newline+0xfc sort.c:280
+ newline() sort.c:248 called from dofile+0x110 sort.c:222
+ dofile() sort.c:190 called from main+0xac sort.c:161
+ main() sort.c:128 called from _main+0x20 main9.s:10
+Lost a total of 64 bytes from:
+ malloc() malloc.c:32 called from realloc+0x14 malloc.c:129
+ realloc() malloc.c:123 called from bldkey+0x358 sort.c:1388
+ buildkey() sort.c:1345 called from newline+0x150 sort.c:285
+ newline() sort.c:248 called from dofile+0x110 sort.c:222
+ dofile() sort.c:190 called from main+0xac sort.c:161
+ main() sort.c:128 called from _main+0x20 main9.s:10
+acid: refs()
+data...bss...stack...
+acid: leak()
+acid:
+.P2
+The presence of a block in the allocation list does not imply
+it is there because of a leak; for instance, it may have been
+in use when the program terminated.
+The
+.CW refs()
+library function scans the
+.I data ,
+.I bss ,
+and
+.I stack
+segments of the process looking for pointers
+into the allocated blocks. When one is found, the block is deleted from
+the outstanding block list.
+The
+.CW leak
+function is used again to report the
+blocks remaining allocated and unreferenced.
+This strategy proves effective in detecting
+disconnected (but non-circular) data structures.
+.PP
+The leak detection process is entirely passive.
+The program is not
+specially compiled and the source code is not required.
+As with the Acid support functions for the Alef runtime environment,
+the author of the library routines has encapsulated the
+functionality of the library interface
+in Acid code.
+Any programmer may then check a program's use of the
+library routines without knowledge of either implementation.
+The performance impact of running leak detection is great
+(about 10 times slower),
+but it has not prevented interactive programs like
+.CW sam
+and the
+.CW 8½
+window system from being tested.
+.NH
+Code Coverage
+.PP
+Another common component of software testing is
+.I coverage
+analysis.
+The purpose of the test is to determine which paths through the code have
+not been executed while running the test suite.
+This is usually
+performed by a combination of compiler support and a reporting tool run
+on the output generated by statements compiled into the program.
+The compiler emits code that
+logs the progress of the program as it executes basic blocks and writes the
+results to a file. The file is then processed by the reporting tool
+to determine which basic blocks have not been executed.
+.PP
+Acid can perform the same function in a language independent manner without
+modifying the source, object or binary of the program. The following example
+shows
+.CW ls
+being run under the control of the Acid coverage library.
+.P1
+philw-helix% acid -l coverage /bin/ls
+/bin/ls: mips plan 9 executable
+/lib/acid/port
+/lib/acid/mips
+/lib/acid/coverage
+acid: coverage()
+acid
+newstime
+profile
+tel
+wintool
+2: (error) msg: pid=11419 startstop: process exited
+acid: analyse(ls)
+ls.c:102,105
+ 102: return 1;
+ 103: }
+ 104: if(db[0].qid.path&CHDIR && dflag==0){
+ 105: output();
+ls.c:122,126
+ 122: memmove(dirbuf+ndir, db, sizeof(Dir));
+ 123: dirbuf[ndir].prefix = 0;
+ 124: p = utfrrune(s, '/');
+ 125: if(p){
+ 126: dirbuf[ndir].prefix = s;
+.P2
+The
+.CW coverage
+function begins by looping through the text segment placing
+breakpoints at the entry to each basic block. The start of each basic
+block is found using the Acid builtin function
+.CW follow .
+If the list generated by
+.CW follow
+contains more than one
+element, then the addresses mark the start of basic blocks. A breakpoint
+is placed at each address to detect entry into the block. If the result
+of
+.CW follow
+is a single address then no action is taken, and the next address is
+considered. Acid maintains a list of
+breakpoints already in place and avoids placing duplicates (an address may be
+the destination of several branches).
+.PP
+After placing the breakpoints the program is set running.
+Each time a breakpoint is encountered
+Acid deletes the address from the breakpoint list, removes the breakpoint
+from memory and then restarts the program.
+At any instant the breakpoint list contains the addresses of basic blocks
+which have not been executed.
+The
+.CW analyse
+function reports the lines of source code bounded by basic blocks
+whose addresses have not been deleted from the breakpoint list.
+These are the basic blocks which have not been executed.
+Program performance is almost unaffected since each breakpoint is executed
+only once and then removed.
+.PP
+The library contains a total of 128 lines of Acid code.
+An obvious extension of this algorithm could be used to provide basic block
+profiling.
+.NH
+Conclusion
+.PP
+Acid has two areas of weakness. As with
+other language-based tools like
+.I awk ,
+a programmer must learn yet another language to step beyond the normal
+debugging functions and use the full power of the debugger.
+Second, the command line interface supplied by the
+.I yacc
+parser is inordinately clumsy.
+Part of the problem relates directly to the use of
+.I yacc
+and could be circumvented with a custom parser.
+However, structural problems would remain: Acid often requires
+too much typing to execute a simple
+command.
+A debugger should prostitute itself to its users, doing whatever
+is wanted with a minimum of encouragement; commands should be
+concise and obvious. The language interface is more consistent than
+an ad hoc command interface but is clumsy to use.
+Most of these problems are addressed by an Acme interface
+which is under construction. This should provide the best of
+both worlds: graphical debugging and access to the underlying Acid
+language when required.
+.PP
+The name space clash between Acid variables, keywords, program variables,
+and functions is unavoidable.
+Although it rarely affects a debugging session, it is annoying
+when it happens and is sometimes difficult to circumvent.
+The current renaming scheme
+is too crude; the new names are too hard to remember.
+.PP
+Acid has proved to be a powerful tool whose applications
+have exceeded expectations.
+Of its strengths, portability, extensibility and parallel debugging support
+were by design and provide the expected utility.
+In retrospect,
+its use as a tool for code test and verification and as
+a medium for communicating type information and encapsulating
+interfaces has provided unanticipated benefits and altered our
+view of the debugging process.
+.NH
+Acknowledgments
+.PP
+Bob Flandrena was the first user and helped prepare the paper.
+Rob Pike endured three buggy Alef compilers and a new debugger
+in a single sitting.
+.NH
+References
+.LP
+[Pike90] R. Pike, D. Presotto, K. Thompson, H. Trickey,
+``Plan 9 from Bell Labs'',
+.I
+UKUUG Proc. of the Summer 1990 Conf.,
+.R
+London, England,
+1990.
+.LP
+[Gol93] M. Golan, D. Hanson,
+``DUEL -- A Very High-Level Debugging Language'',
+.I
+USENIX Proc. of the Winter 1993 Conf.,
+.R
+San Diego, CA,
+1993.
+.LP
+[Lin90] M. A. Linton,
+``The Evolution of DBX'',
+.I
+USENIX Proc. of the Summer 1990 Conf.,
+.R
+Anaheim, CA,
+1990.
+.LP
+[Stal91] R. M. Stallman, R. H. Pesch,
+``Using GDB: A guide to the GNU source level debugger'',
+Technical Report, Free Software Foundation,
+Cambridge, MA,
+1991.
+.LP
+[Win93] P. Winterbottom,
+``Alef Reference Manual'',
+reprinted in this volume.
+.LP
+[Pike93] Rob Pike,
+``Acme: A User Interface for Programmers'',
+.I
+USENIX Proc. of the Winter 1994 Conf.,
+.R
+San Francisco, CA,
+reprinted in this volume.
+.LP
+[Ols90] Ronald A. Olsson, Richard H. Crawford, and W. Wilson Ho,
+``Dalek: A GNU, improved programmable debugger'',
+.I
+USENIX Proc. of the Summer 1990 Conf.,
+.R
+Anaheim, CA.
+.LP
+[May92] Paul Maybee,
+``NeD: The Network Extensible Debugger'',
+.I
+USENIX Proc. of the Summer 1992 Conf.,
+.R
+San Antonio, TX.
+.LP
+[Aral] Ziya Aral, Ilya Gertner, and Greg Schaffer,
+``Efficient debugging primitives for multiprocessors'',
+.I
+Proceedings of the Third International Conference on Architectural
+Support for Programming Languages and Operating Systems,
+.R
+SIGPLAN notices Nr. 22, May 1989.
binary files /dev/null b/doc/acidpaper.pdf differ
--- /dev/null
+++ b/doc/acidtut.ms
@@ -1,0 +1,1062 @@
+.de d0
+.nr dP +1
+.nr dV +1p
+..
+.de d1
+.nr dP -1
+.nr dV -1p
+..
+.nr dT 4
+.de Af \" acid function
+.CW "\\$1(" "\fI\\$2\fP\f(CW)\fP"
+..
+.TL
+Native Kernel Debugging with Acid
+.AU
+Tad Hunt
+tad@plan9.bell-labs.com
+.br
+Lucent Technologies Inc
+.br
+(Revised 22 May 2000 by Vita Nuova)
+.SH
+Introduction
+.PP
+This tutorial provides an introduction to the Acid debugger. It assumes that you are familiar with the features of a typical source-level debugger. The Acid debugger is built round a command language with a syntax similar to C.
+This tutorial is not an
+introduction to Acid as a whole, but
+offers a brief tour
+of the basic built-in and standard library functions,
+especially those needed for debugging native Inferno kernels on a target board.
+.PP
+Acid was originally developed by Phil Winterbottom
+to help debug multi-threaded programs in
+the concurrent language Alef, and provide more sophisticated
+debugging for C programs.
+In the paper
+.I "Acid: A Debugger Built From a Language" ,
+Winterbottom
+discusses Acid's design, including some worked examples of unusual
+applications of Acid to find memory leaks and assist code coverage analysis.
+Following that is the
+.I "Acid Reference Manual" ,
+also by Phil Winterbottom,
+which gives a more precise specification of the Acid debugging language and its libraries.
+.SH
+Preliminaries -- the environment
+.PP
+Acid runs under the host operating system used for cross-development,
+in the same way as the Inferno compilers.
+Before running either the compilers or Acid, the following
+environment variables must be set appropriately:
+.TS
+center;
+lf(CW) lf(R)w(4i) .
+ROOT T{
+the directory in which Inferno lives (eg,
+.CW /usr/inferno ).
+T}
+SYSHOST T{
+.I host
+operating system type:
+.CW Nt ,
+.CW Solaris ,
+.CW Plan9 ,
+.CW Linux
+or
+.CW FreeBSD
+T}
+OBJTYPE T{
+.I host
+machine's architecture type:
+.CW 386 ,
+.CW sparc ,
+.CW mips ,
+or
+.CW powerpc
+T}
+.TE
+They might be set by a login shell profile
+(eg,
+Unix
+.CW ".profile" ,
+or
+Plan 9
+.CW lib/profile ).
+Also ensure that the directory
+.P1
+$ROOT/$SYSHOST/$OBJTYPE/bin
+.P2
+is on your search path.
+For example, on a Solaris sparc, one might use:
+.P1
+ROOT=\fIinferno_root\fP
+SYSHOST=Solaris
+OBJTYPE=sparc
+ACIDLIB=$ROOT/lib/acid
+PATH=$ROOT/$SYSHOST/$OBJTYPE/bin:$PATH
+export ROOT ACIDLIB PATH OBJTYPE SYSHOST
+.P2
+where
+.I "inferno_root"
+is the directory in which Inferno lives (eg,
+.CW "/usr/inferno" ).
+.SH
+An Example Program
+.PP
+The first example is not kernel code, but a small program that
+will be compiled but not run, to demonstrate basic Acid commands for
+source and object file inspection.
+The code is shown below:
+.P1
+int
+factorial(int n)
+{
+ if (n == 1)
+ return 1;
+ return n * factorial(n-1);
+}
+
+int f;
+void
+main(void)
+{
+ f = factorial(5);
+}
+
+void
+_main(void)
+{
+ main();
+}
+.P2
+.SH
+Compiling and Linking
+.PP
+The first step is to create an executable. The example shows the process for creating ARM executables. Substitute the appropriate compiler and linker for other CPU types.
+.P1
+% 5c factorial.c
+% 5l -o factorial factorial.5
+% ls
+factorial
+factorial.5
+factorial.c
+.P2
+.SH
+Starting Acid
+.PP
+Even without the target machine on which
+to run the program, many Acid features are available.
+The following command starts debugging the
+.CW "factorial"
+executable. Note that, upon startup, Acid will attempt to load some libraries from the directory specified in the
+.CW "ACIDLIB"
+environment variable (defaults to
+.CW "/usr/inferno/lib/acid" ).
+It will also attempt to load the file
+.CW "$HOME/lib/acid" ,
+in which you can place commands to be executed during startup.
+.P1
+% acid factorial
+factorial:Arm plan 9 executable
+
+$ROOT/lib/acid/port
+$ROOT/lib/acid/arm
+acid:
+.P2
+.SH
+Exploring the Executable
+.PP
+To find out what symbols are in the program:
+.P1
+acid: symbols("")
+etext T 0x00001068
+f D 0x00002000
+setR12 D 0x00002ffc
+end B 0x00002008
+bdata D 0x00002000
+edata D 0x00002008
+factorial T 0x00001020
+main T 0x00001048
+_main T 0x0000105c
+acid:
+.P2
+The output from the
+.CW symbols()
+function is similar to the output from the
+.I nm (10.1)
+command. The first column is the symbol name, the second column gives the section the symbol is in, and the third column is the address of the symbol.
+.PP
+There is also a
+.CW "symbols"
+global variable. Variables and functions can have the same names. It holds the list of symbol information that the
+.CW symbols
+function uses to generate the table:
+.d0
+.P1
+acid: symbols
+{{"etext", T, 0x00001068}, {"f", D, 0x00002000}, {"setR12", D, 0x00002ffc},
+ {"end", B, 0x00002008}, {"bdata", D, 0x00002000}, {"edata", D, 0x00002008},
+ {"factorial", T, 0x00001020}, {"main", T, 0x00001048}, {"_main", T, 0x00001
+05c}}
+acid:
+.P2
+.d1
+In large programs, finding the symbol you are interested in from a list that may be thousands of lines long would be difficult. The string argument of
+.CW symbols()
+is a regular expression against which to match symbols.
+All symbols that contain the pattern will be displayed. For example:
+.P1
+acid: symbols("main")
+main T 0x00001048
+_main T 0x0000105c
+acid: symbols("^main")
+main T 0x00001048
+acid:
+.P2
+The
+.CW symbols
+function is written in the
+.I acid
+command language and lives in the
+.CW "port"
+library
+.CW $ACIDLIB/port ). (
+.P1
+defn symbols(pattern)
+{
+ local l, s;
+
+ l = symbols;
+ while l do {
+ s = head l;
+ if regexp(pattern, s[0]) then
+ print(s[0], "\t", s[1], "\t", s[2], "\n");
+ l = tail l;
+ }
+}
+.P2
+Acid retrieves the list of symbols from the executable and turns each one into a global variable whose value is the address of the symbol. If the symbol clashes with a builtin name or keyword or a previously defined function, enough
+.CW "$"
+characters are prepended to the name to make it unique. The list of such renamings is printed at startup.
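+.PP
+Because library functions such as
+.CW symbols
+are ordinary Acid code, you can define similar helpers of your own, for instance in
+.CW "$HOME/lib/acid" .
+The following is a minimal sketch, not part of the distributed library: the name
+.CW nsyms
+is hypothetical, and it assumes only the
+.CW symbols
+list and the
+.CW regexp
+builtin shown above. It counts the symbols matching a pattern instead of printing them:
+.P1
+// hypothetical helper, not in $ACIDLIB/port
+defn nsyms(pattern)
+{
+	local l, s, n;
+
+	n = 0;
+	l = symbols;
+	while l do {
+		s = head l;		// s is {name, type, address}
+		if regexp(pattern, s[0]) then
+			n = n+1;
+		l = tail l;
+	}
+	return n;
+}
+.P2
+For the
+.CW factorial
+executable,
+.CW nsyms("main")
+should count the two symbols
+.CW main
+and
+.CW _main
+listed earlier.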
+.PP
+Most acid functions operate on addresses. For example, to view the source code for a given address, use the
+.CW src
+function:
+.P1
+acid: src(main)
+/usr/jrf/factorial.c:10
+ 5 return n * factorial(n-1);
+ 6 }
+ 7
+ 8 int f;
+ 9 void
+>10 main(void)
+ 11 {
+ 12 f = factorial(5);
+ 13 }
+ 14
+ 15 void
+.P2
+The
+.Af "src" addr
+function displays a section of source code, with the line containing the address passed as an argument in the middle of the display. To print the assembly code beginning at a given address, use the
+.CW asm()
+function.
+.P1
+acid: asm(factorial)
+factorial 0x00001020 MOVW.W R14,#-0x8(R13)
+factorial+0x4 0x00001024 CMP.S $#0x1,R0
+factorial+0x8 0x00001028 MOVW.EQ $#0x1,R0
+factorial+0xc 0x0000102c RET.EQ.P #0x8(R13)
+factorial+0x10 0x00001030 MOVW R0,n+0(FP)
+factorial+0x14 0x00001034 SUB $#0x1,R0,R0
+factorial+0x18 0x00001038 BL factorial
+factorial+0x1c 0x0000103c MOVW n+0(FP),R2
+factorial+0x20 0x00001040 MUL R2,R0,R0
+factorial+0x24 0x00001044 RET.P #0x8(R13)
+main 0x00001048 MOVW.W R14,#-0x8(R13)
+acid:
+.P2
+The output contains the symbolic address (symbol name+offset, where symbol name is the name of the enclosing function) followed by the absolute address, followed by the disassembled code.
+The
+.Af "asm" addr
+function prints the assembly beginning at
+.I "addr"
+and ending after either 30 lines have been printed, or the end of the function has been reached. The
+.CW "casm()"
+function continues the assembly listing from where it left off, even past the end of the function and into the next one.
+.P1
+acid: casm()
+main+0x4 0x0000104c MOVW $#0x5,R0
+main+0x8 0x00001050 BL factorial
+main+0xc 0x00001054 MOVW R0,$f-SB(SB)
+main+0x10 0x00001058 RET.P #0x8(R13)
+_main 0x0000105c MOVW.W R14,#-0x4(R13)
+acid:
+.P2
+All the functions presented so far are written in the acid command language. To see the source of a command written in the acid command language, use the builtin command
+.CW "whatis [" "\fIname\fP\f(CW ]\fP."
+It prints the definition of the optional argument
+.I "name" .
+If
+.I "name"
+is an Acid builtin,
+.CW whatis
+prints
+.CW "builtin function" .
+.P1
+acid: whatis casm
+defn casm() {
+ asm(lasmaddr);
+}
+acid:
+acid: whatis atof
+builtin function
+acid:
+.P2
+If
+.I name
+is a variable, it prints the type of variable, and for the integer type, gives the format code used to print the value:
+.P1
+acid: whatis pid
+integer variable format D
+acid:
+.P2
+With no arguments,
+.CW whatis
+lists all available functions:
+.P1
+acid: whatis
+Bsrc bpmask follow new sh
+_bpconddel bpneq func newproc source
+_bpcondset bpor gpr next spr
+_stk bpprint include notestk spsrch
+access bppush interpret params src
+acidinit bpset itoa pcfile start
+addsrcdir bptab kill pcline startstop
+asm casm kstk pfl status
+atof cont labstk print stk
+atoi debug line printto stmnt
+bpaddr dump linkreg procs stop
+bpand error lkstk rc stopped
+bpconddel file locals readfile strace
+bpcondset filepc lstk reason symbols
+bpdel findsrc map regexp waitstop
+bpderef fmt match regs
+bpeq fnbound mem setproc
+acid:
+.P2
+The
+.Af "Bsrc" addr
+function brings up an editor on the line containing
+.I "addr" .
+It simply invokes a shell script named
+.CW "B"
+that takes two arguments,
+.I "-line"
+and
+.I "file" .
+The shell script invokes
+.CW "$EDITOR +"
+.I "line file" .
+If unset,
+.CW "EDITOR"
+defaults to
+.I vi .
+The shell script, or the
+.CW Bsrc
+function can be easily rewritten to work with your favorite editor.
+.PP
+Entering a symbol name by itself will print the address of the symbol. Prefixing the symbol name with a
+.CW "*"
+will print the value at the address in the variable. Continuing to use our
+.CW "factorial"
+example:
+.P1
+acid: f
+0x00002000
+acid: *f
+0x00000000
+acid:
+.P2
+.SH
+Remote Debugging
+.PP
+Now that you have a basic understanding of how to explore the executable, it is time to examine a real remote debugging session.
+.PP
+We'll use the SA1100 keyboard driver as an example. Examining the kernel configuration file, you'll see the following:
+.P1
+dev
+ keyboard
+link driver/keyboard port
+ scanfujn860 kbd.h keycodes.h
+link ./../driver plat
+ kbdfujitsu ./../common/ssp.h \e
+ /driver/keyboard/kbd.h \e
+ /driver/keyboard/keycodes.h
+port
+ const char *defaultkeyboard = "fujitsu";
+ const char *defaultkeytable = "scanfujn860";
+ int debugkeys = 1; /* 1 = enabled, 0 = disabled */
+.P2
+This describes the pieces of the keyboard driver which are linked into the kernel. The source code lives in two places,
+.CW "$ROOT/os/driver/keyboard" ,
+and
+.CW "$ROOT/os/plat/sa1100/driver" .
+.PP
+The next step is to build a kernel. Use the
+.I mk
+target
+.CW acid
+to ensure that the Acid symbolic debugging data is
+produced.
+For example:
+.P1
+% mk 'CONF=sword' acid isword.p9.gz
+.P2
+This creates the Acid file
+.CW isword.acid ,
+containing Acid declarations describing kernel structures,
+the kernel executable
+.CW isword.p9 ;
+and finally
+.I gzip s
+a copy of the kernel in
+.CW isword.p9.gz
+to load onto the device. Next, copy the gzipped image onto the device and then boot it. Follow the directions found elsewhere for details of this process.
+.PP
+From a shell prompt on the target device, start the remote debugger by writing the letter
+.CW r
+(for run) to
+.CW "#b/dbgctl" .
+Next, start Acid in remote debug mode, specifying the serial port it is connected to with the
+.CW "-R"
+option.
+.CW "$CONF"
+is the name of the configuration file used, for example
+.CW "sword" .
+.P1
+% acid -R /dev/cua/b -l i$CONF.acid i$CONF
+isword:Arm plan 9 executable
+$ROOT/lib/acid/port
+i$CONF.acid
+$ROOT/lib/acid/arm
+/usr/jrf/lib/acid
+acid:
+.P2
+You are now debugging the kernel that is running on the target device. All of the previously listed commands work as described before; in addition, many more commands are available.
+.SH
+Kernel Process Listing
+.PP
+To get a list of kernel processes, use the
+.CW "ps()"
+function:
+.P1
+acid: ps()
+PID PC PRI STATE NAME
+1 0x00054684 5 Queueing interp
+2 0x00000000 1 Wakeme consdbg
+3 0x00000000 5 Wakeme tcpack
+4 0x00000000 5 Wakeme Fs.sync
+5 0x00000000 4 Wakeme touchscreen
+6 0x00054684 5 Queueing dis
+7 0x00059788 5 Wakeme dis
+8 0x00054684 5 Queueing dis
+9 0x00054684 5 Queueing dis
+10 0x00054684 5 Wakeme dis
+11 0x0004c26c 1 Running dbg
+acid:
+.P2
+The
+.CW "PC"
+column shows the address the process was executing at when the
+.CW ps
+command retrieved statistics on it. The
+.CW "PRI"
+column lists process priorities. The smaller the number the higher the process priority. Notice that the kernel process (kproc) running the debugger is the highest priority process in the system. The only process you will ever see in the
+.CW "Running"
+state while executing the
+.CW ps
+command will be the debugger, since it is gathering information about the other processes.
+.SH
+Breakpoints
+.PP
+Breakpoints in Inferno, unlike those of most traditional kernel debuggers, are conditional breakpoints. At least two conditions must be met: the address and the process id. A breakpoint will only be taken when execution for a specific kernel process reaches the specified address. The user can create additional conditions that are evaluated if the address and process id match. If evaluation of these conditions results in a nonzero value, the breakpoint is taken; otherwise it is ignored and execution continues.
+.PP
+Again, the best way to proceed is with an example:
+.P1
+acid: setproc(7)
+.P2
+The
+.Af setproc pid
+function selects a kproc to which later commands will be applied;
+the one with process ID (\fIpid\fP)
+in this case.
+.P1
+acid: bpset(keyboardread)
+Waiting...
+7: stopped flush8to4+0x18c MOVW (R3<<#4),R3
+.P2
+After selecting a kproc, we set a breakpoint at the address referred to by the
+.CW "keyboardread"
+symbol. As described before, the value of a global variable created from a symbol in the executable is the address of the symbol. In this case the address is the first instruction in the
+.CW "keyboardread()"
+function. Notice that setting a breakpoint stops the kproc from executing. A bit later, we'll see how to get it to continue execution.
+.PP
+Next, display the list of breakpoints using
+.CW "bptab()" :
+.P1
+acid: bptab()
+ID PID ADDR CONDITIONS
+0 7 keyboardread 0x0003c804 { }
+.P2
+The first column is a unique number that identifies the breakpoint. The second column is the process ID in which the breakpoint will be taken. The third and fourth columns are the address of the breakpoint, first in symbolic form, then in numeric form. Finally, the last column is a list of conditions to evaluate whenever the kproc specified in the
+.CW "PID"
+column hits the address specified in the
+.CW "ADDR"
+column. When they match, the list of conditions is evaluated. If the result is nonzero, the breakpoint is taken. Since we used the simplified breakpoint creation function,
+.CW "bpset()" ,
+there are no additional conditions. Later on, we'll see how to set conditional breakpoints.
+.PP
+Start the selected kproc executing again, and wait for it to hit the breakpoint.
+.P1
+acid: cont()
+.P2
+The
+.CW "cont()"
+function will not return until a breakpoint has been hit, and there is no way to interrupt it. This means you should only set breakpoints that will be hit, otherwise you'll have to reboot the target device and restart your debugging session.
+.PP
+To continue our example, repeatedly hit new line (return, enter)
+on the keyboard on the target device, until the breakpoint occurs:
+.P1
+break 0: pid 7: stopped keyboardread SUB $#0xa4,R13,R13
+acid:
+.P2
+This message, followed by the return of the interactive prompt, tells you that a breakpoint was hit. It gives the breakpoint id, the kernel process id, then the symbolic address at which execution halted, followed by the disassembly of the instruction at that address.
+.PP
+The
+.CW "kstk()"
+function prints a kernel stack trace, beginning with the current frame, all the way back to the call that started the kproc. For each function, it gives the name, arguments, source file, and line number, followed by the symbolic address, source file, and line number of the caller.
+.d0
+.P1
+acid: kstk()
+At pc:247812:keyboardread /usr/inferno/os/driver/keyboard/devkey
+board.c:350
+keyboardread(offset=0x0000009d,buf=0x001267f8,n=0x00000001) /usr
+/inferno/os/driver/keyboard/devkeyboard.c:350
+ called from kchanio+0x9c /usr/inferno/os/port/sysfile.c:
+75
+kchanio(buf=0x001267f8,n=0x00000001,mode=0x00000000) /usr/infern
+o/os/port/sysfile.c:64
+ called from consread+0x144 /usr/inferno/os/driver/port/d
+evcons
+consread(offset=0x0000009d,buf=0x0043d4fc,n=0x00000400,c=0x0044e
+c38) /
+usr/inferno/os/driver/port/devcons.c:357
+ called from kread+0x164 /usr/inferno/os/port/sysfile.c:2
+97
+kread(fd=0x00000006,n=0x00000400,va=0x0043d4fc) /usr/inferno/os/
+port/sysfile.c:272
+ called from Sys_read+0x84 /usr/inferno/os/port/inferno.c
+:244
+Sys_read() /usr/inferno/os/port/inferno.c:229
+ called from mcall+0x98 /usr/inferno/interp/xec.c:590
+mcall() /usr/inferno/interp/xec.c:569
+ called from xec+0x128 /usr/inferno/interp/xec.c:1098
+xec(p=0x0044edd8) /usr/inferno/interp/xec.c:1077
+ called from vmachine+0xbc /usr/inferno/os/port/dis.c:706
+vmachine() /usr/inferno/os/port/dis.c:677
+ called from _main+0x50 /usr/inferno/os/plat/sa1100/infern
+o/main.c:237
+acid:
+.P2
+.d1
+There is another kernel stack dump function,
+.CW "lkstk()"
+which shows the same information as
+.CW "kstk()"
+plus the names and values of local variables. Notice that in addition to the
+`called from'
+information, each local variable and its value is listed on a line by itself.
+.d0
+.P1
+acid: lkstk()
+At pc:247812:keyboardread /usr/inferno/os/driver/keyboard/devkeyboard.
+c:350
+keyboardread(offset=0x00000018,buf=0x001267f9,n=0x00000001) /usr/inferno
+/os/driver/keyboard/devkeyboard.c:350
+ called from kchanio+0x9c /usr/inferno/os/port/sysfile.c:75
+ tmp=0x00000000
+kchanio(buf=0x001267f9,n=0x00000001,mode=0x00000000) /usr/inferno/os/por
+t/sysfile.c:64
+ called from consread+0x144 /usr/inferno/os/driver/port/devcons
+ c=0x0045a858
+ r=0x00000001
+consread(offset=0x00000015,buf=0x0043d4fc,n=0x00000400,c=0x0044ec38) /us
+r/inferno/os/driver/port/devcons.c:357
+ called from kread+0x164 /usr/inferno/os/port/sysfile.c:297
+ r=0x00000001
+ ch=0x0000006c
+ eol=0x00000000
+ i=0x00000000
+ mt=0x60000053
+ tmp=0x0007317c
+ l=0x0044ec38
+ p=0x00049754
+kread(fd=0x00000006,n=0x00000400,va=0x0043d4fc) /usr/inferno/os/port/sys
+file.c:272
+ called from Sys_read+0x84 /usr/inferno/os/port/inferno.c:244
+ c=0x0044ec38
+ dir=0x00000000
+Sys_read() /usr/inferno/os/port/inferno.c:229
+ called from mcall+0x98 /usr/inferno/interp/xec.c:590
+ f=0x0044eff0
+ n=0x00000400
+mcall() /usr/inferno/interp/xec.c:569
+ called from xec+0x128 /usr/inferno/interp/xec.c:1098
+ ml=0x0043d92c
+ f=0x0044eff0
+xec(p=0x0044edd8) /usr/inferno/interp/xec.c:1077
+ called from vmachine+0xbc /usr/inferno/os/port/dis.c:706
+vmachine() /usr/inferno/os/port/dis.c:677
+ called from _main+0x50 /usr/inferno/os/plat/sa1100/inferno/main.
+c:237
+ r=0x0044edd8
+ o=0x0044ee50
+.P2
+.d1
+The
+.CW "step()"
+function allows the currently selected process to execute a single instruction, and then stop.
+.P1
+acid: step()
+break 1: pid 7: stopped keyboardread+0x4 MOVW R14,#0x0(R13)
+acid:
+.P2
+The
+.CW "bpdel" (
+.I id )
+command deletes the breakpoint identified by
+.I id :
+.P1
+acid: bpdel(0)
+.P2
+The
+.CW "start()"
+command places the kproc back into the state it was in when it was stopped.
+.P1
+acid: start(7)
+acid:
+.P2
+Now let's look at how to set conditional breakpoints.
+.d0
+.P1
+acid: bpcondset(7, keyboardread, {bppush(_startup), bpderef()})
+Waiting...
+7: stopped sched+0x20 MOVW #0xffffff70(R12),R6
+acid: bptab()
+ID PID ADDR CONDITIONS
+0 7 keyboardread 0x0003c804 {
+ {"p", 0x00008020}
+ {"*", 0x00000000} }
+acid: *_startup = 0
+acid: cont()
+.P2
+.d1
+Conditional breakpoints are set with
+.CW "bpcondset()" .
+It takes three arguments: the kernel process id, the address, and a list of stack-based operations that are executed if the pid and addr match. The operations push values onto the stack, and if at the end of execution a nonzero value is on the top of the stack, the breakpoint is taken. Examining the list of breakpoints with the
+.CW "bptab()"
+function shows the list of conditions to apply. The list is a bit confusing to read, but the
+.CW """p"""
+means push and the
+.CW """*"""
+means
+.I dereference .
+.PP
+No matter how much you type on the keyboard, this particular breakpoint will never be taken. That's because before continuing, we set the value at the address
+.CW "_startup"
+to zero, so whenever execution reaches
+.CW "keyboardread"
+in kproc number 7, it pushes the address
+.CW "_startup" ,
+then pops it and pushes the word at that address. Since the top of the stack is zero, the breakpoint is ignored.
+.PP
+This contrived example may not be all that useful, but you can use a similar method in your driver to examine some state before making the decision to take the breakpoint.
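+.PP
+A condition that tests a kernel variable against a constant follows the same pattern. The sketch below assumes that the
+.CW debugkeys
+variable from the configuration file shown earlier is a global integer in the running kernel, and that kproc 7 is still the one of interest; the messages Acid prints in response are omitted.
+.P1
+acid: setproc(7)
+acid: bpcondset(7, keyboardread, {bppush(debugkeys),
+	bpderef(), bppush(1), bpeq()})
+.P2
+When kproc 7 reaches
+.CW keyboardread ,
+the list pushes the address of
+.CW debugkeys ,
+replaces it with the word stored there, pushes the constant 1, and compares the two. The breakpoint is taken only if
+.CW debugkeys
+is 1.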
+.SH
+Examining Registers
+.PP
+There are three commands to dump registers:
+.CW gpr() ,
+.CW spr()
+and
+.CW "regs()" .
+The
+.CW "gpr()"
+function dumps the general purpose registers,
+.CW "spr()"
+dumps special purpose registers (such as the
+.CW "PC"
+and
+.CW "LINK"
+registers), and
+.CW "regs()"
+dumps both:
+.d0
+.P1
+acid: regs()
+PC 0x0004a3b0 sched+0x20 /home/tad/inf2.1/os/port/proc.c:82
+LINK 0x0004b8e8 kchanio+0xa4 /home/tad/inf2.1/os/port/sysfile.c:75
+SP 0x00453c4c
+R0 0x00458798 R1 0x000fdf9c R2 0x0003c804 R3 0x00000000
+R4 0xffffffff R5 0x00000001 R6 0x00458798 R7 0x00000001
+R8 0x001267f8 R9 0x00000000 R10 0x0044ee50 R11 0x00029f9c
+R12 0x000fc854
+acid:
+.P2
+.d1
+.SH
+Complex Types
+.PP
+When reading in the symbol table, Acid treats all of the symbols in the executable as pointers to integers. This is fine for global integer variables, but it makes examining more complex types difficult. Luckily there is a solution. Acid allows you to create a description for more complex types, and a function which will automatically be called for these complex types. In fact, the compiler can automatically generate the acid code to describe these complex types. For example, if we wanted to print out the devtab structure for the keyboard driver, we can just give its name:
+.P1
+acid: whatis keyboarddevtab
+integer variable format a complex Dev
+acid: keyboarddevtab
+ dc 107
+ name 0x0010e0ea
+ reset 0x0003c3fc
+ init 0x0003c438
+ attach 0x0003c5dc
+ clone 0x000480d0
+ walk 0x0003c600
+ stat 0x0003c640
+ open 0x0003c680
+ create 0x0004881c
+ close 0x0003c768
+ read 0x0003c804
+ bread 0x0004883c
+ write 0x0003c968
+ bwrite 0x00048900
+ remove 0x00048978
+ wstat 0x00048998
+acid:
+.P2
+Acid knows the keyboarddevtab variable is of type Dev, and it prints it by invoking the function Dev(keyboarddevtab).
+.P1
+acid: whatis Dev
+complex Dev {
+ 'D' 0 dc;
+ 'X' 4 name;
+ 'X' 8 reset;
+ 'X' 12 init;
+ 'X' 16 attach;
+ 'X' 20 clone;
+ 'X' 24 walk;
+ 'X' 28 stat;
+ 'X' 32 open;
+ 'X' 36 create;
+ 'X' 40 close;
+ 'X' 44 read;
+ 'X' 48 bread;
+ 'X' 52 write;
+ 'X' 56 bwrite;
+ 'X' 60 remove;
+ 'X' 64 wstat;
+};
+.P3
+defn Dev(addr) {
+ complex Dev addr;
+ print("\etdc\et",addr.dc,"\en");
+ print("\etname\et",addr.name\eX,"\en");
+ print("\etreset\et",addr.reset\eX,"\en");
+ print("\etinit\et",addr.init\eX,"\en");
+ print("\etattach\et",addr.attach\eX,"\en");
+ print("\etclone\et",addr.clone\eX,"\en");
+ print("\etwalk\et",addr.walk\eX,"\en");
+ print("\etstat\et",addr.stat\eX,"\en");
+ print("\etopen\et",addr.open\eX,"\en");
+ print("\etcreate\et",addr.create\eX,"\en");
+ print("\etclose\et",addr.close\eX,"\en");
+ print("\etread\et",addr.read\eX,"\en");
+ print("\etbread\et",addr.bread\eX,"\en");
+ print("\etwrite\et",addr.write\eX,"\en");
+ print("\etbwrite\et",addr.bwrite\eX,"\en");
+ print("\etremove\et",addr.remove\eX,"\en");
+ print("\etwstat\et",addr.wstat\eX,"\en");
+}
+.P2
+Notice the complex type definition and the function to print the type both have the same name. If we know that an address is the address of a complex type, even though acid may not
+(say we're storing multiple types of data in a void pointer),
+we can print the complex type by calling the type printing function ourselves.
+.P1
+acid: print(fmt(keyboarddevtab, 'X'))
+0x00106d50
+acid: Dev(0x00106d50)
+ dc 107
+ name 0x0010e0ea
+ reset 0x0003c3fc
+ init 0x0003c438
+ attach 0x0003c5dc
+ clone 0x000480d0
+ walk 0x0003c600
+ stat 0x0003c640
+ open 0x0003c680
+ create 0x0004881c
+ close 0x0003c768
+ read 0x0003c804
+ bread 0x0004883c
+ write 0x0003c968
+ bwrite 0x00048900
+ remove 0x00048978
+ wstat 0x00048998
+acid:
+.P2
+.SH
+Conclusion
+.PP
+This introduction to using Acid for remote debugging Inferno kernels should be enough to get you started. As a tutorial, it only describes how to use some of the features of the debugger, and does not attempt to describe how to do advanced debugging such as writing your own functions, or modifying existing ones. Exploring the source, setting breakpoints, single stepping through code, and examining the contents of variables are the usual uses of a debugger. This tutorial gives examples of all of these.
+.PP
+For a more in-depth discussion of the acid command language, and how to write your own acid functions, see the manual page
+.I acid (10.1)
+and Phil Winterbottom's papers on the Acid Debugger,
+reprinted in this volume.
+.TL
+Appendix
+.LP
+There are two important differences between Acid as described in the
+accompanying paper, and Acid as distributed with Inferno for use in
+kernel debugging.
+.SH
+Connecting Acid to the remote Inferno kernel
+.PP
+A remote Plan 9 kernel can be debugged in the same
+way as a Plan 9 user process, using the
+file server
+.I rdbfs (4).
+It is a user-level file server on Plan 9 that
+uses a special debugging protocol on a serial connection to
+the remote kernel, but on the Plan 9 side serves a file system interface
+like that of
+.I proc (3),
+for use by Acid.
+Acid therefore does not need any special code to access the remote kernel's memory,
+or exert control over it.
+.PP
+Inferno's version of Acid currently runs under the host operating systems,
+which do not support such a mechanism (except for Plan 9).
+Instead, Acid itself provides a special debugging protocol,
+with (host) platform-specific interface code to access a serial port.
+This might well be addressed in future by implementing the native kernel debugger
+in Limbo.
+.SH
+Handling of breakpoints
+.PP
+.de Ip
+.KS
+.LP
+.tl '\f2\\$1\fP\ \ \f(CW\\$2(\f2\\$3\f(CW)\f1''\\$4'
+.IP
+..
+.de Ex
+.KE
+.KS
+.IP
+.ft CW
+.ta 4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n +4n
+.nf
+.in +4n
+.br
+..
+.de Ee
+.fi
+.ft 1
+.br
+.in -4n
+.KE
+..
+The following functions are provided by the Acid library
+.CW $ROOT/lib/acid/$OBJTYPE
+for use in native kernel debugging.
+In several cases they change the behavior described in the Acid manual.
+The functions are:
+.P1
+ id = bpset(addr)
+ id = bpcondset(pid, addr, list)
+ bppush(val)
+ bpderef()
+ bpmask()
+ bpeq()
+ bpneq()
+ bpand()
+ bpor()
+ bptab()
+ addr = bpaddr(id)
+ bpdel(id)
+ bpconddel(id)
+.P2
+.PP
+With traditional breakpoints, when a program reaches an address at which a breakpoint is set, execution is halted, and the debugger is notified. In applications programming, this type of breakpoint is sufficient because communicating the break in execution to the debugger is handled by the operating system. The traditional method of handling breakpoints breaks down when the program being debugged is the kernel. A breakpoint cannot entirely suspend the execution of the kernel because there is no other program that can handle the communication to the debugger.
+.PP
+Some operating systems solve this problem by including a
+`mini' operating system,
+a self-contained program within the kernel that has its own code to handle the hardware used to communicate with the remote debugger or user. There are many problems with this mechanism. First, the debugger code that lives inside the kernel must duplicate a lot of code contained elsewhere in the kernel. This makes the kernel much bigger, and can increase maintenance costs. Typically this type of debug support treats the kernel as having a single thread of control, so a breakpoint stops everything while the user decides what to do about it. The only places in the kernel breakpoints cannot be set are in the debugger itself, and in the code that handles notifying the debugger of the breakpoint.
+.PP
+The Inferno kernel takes a different approach. The remote debug support is provided by a device driver that makes use of kernel services. Communication with the remote debugger is handled by a kernel process dedicated entirely to that task. All breakpoints can be considered to be minimally conditional on two values: first, the address at which to take the break, and second, the kernel process in which to take it. This method allows the kernel debugger to be implemented as a regular Inferno device driver. The device driver can make use of all the APIs available to device drivers; it does not need to be self contained. Additionally, conditional breakpoints can be set anywhere in the kernel, with two exceptions. As with traditional debugger implementations, breakpoints cannot be set in the code that handles notifying the debugger of the breakpoint. Unlike traditional implementations, the code that handles the execution and evaluation of the conditions applied to the breakpoint is the only other place breakpoints
+cannot be set. Since both of these parts of the kernel code are self contained, the user can set breakpoints in any other kernel routines. For example, the user could set a breakpoint in
+.CW kread() ,
+for a given kernel process, but the debugger can still call
+.CW kread()
+itself.
+.PP
+Use of conditional breakpoints can help make the debugging process more efficient. If there is a bug that occurs in the Nth iteration of a loop, with unconditional breakpoints, user intervention is required N-1 times before reaching the state the bug occurs in. Conditional breakpoints give the user the ability to automatically check the value of N, and only take the breakpoint when it reaches the critical value.
+.PP
+The following changed
+and additional functions in the Acid library provide access
+to this extended breakpoint support:
+.SH
+Setting Breakpoints
+.LP
+.\"
+.\"
+.\"
+.Ip integer bpset integer "Set a breakpoint
+.CW bpset
+places an unconditional breakpoint for the currently
+selected kernel process at the address specified
+by its
+.I integer
+argument.
+It returns the ID of the newly created breakpoint, or the nil list on error.
+It is simply shorthand for a call
+.Ex
+bpcondset(pid, addr, {})
+.Ee
+where
+.I pid
+is the global variable identifying the currently selected process,
+.I addr
+is the user-supplied address for the breakpoint,
+and
+.CW {}
+is the empty list, signifying no conditions.
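+.IP
+Expressed in Acid, the equivalence can be sketched as follows. The name
+.CW mybpset
+is hypothetical, chosen so as not to clash with the library's own
+.CW bpset ;
+this illustrates the description above and is not the library source.
+.P1
+// hypothetical illustration of the shorthand
+defn mybpset(addr)
+{
+	return bpcondset(pid, addr, {});
+}
+.P2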
+.Ip integer bpcondset "pid,addr,list" "Set conditional breakpoint
+Sets a conditional breakpoint at addr for the kernel process identified by
+.I pid .
+The
+.I list
+argument is a list of operations that are executed when execution reaches
+.I addr .
+If execution results in a non-zero value on the top of the stack, the breakpoint is taken, otherwise it is skipped.
+The
+.I list
+is in reverse polish notation format, and has these operations:
+.Ex
+PUSH
+DEREF (pop val, push *(ulong*)val)
+MASK (pop mask, pop value, push value & mask)
+EQ (pop v1, pop v2, push v1 == v2)
+NEQ (pop v1, pop v2, push v1 != v2)
+AND (pop v1, pop v2, push v1 && v2)
+OR (pop v1, pop v2, push v1 || v2)
+.Ee
+Condition lists are executed in a single pass, starting with the first command in the list, ending with the last. If a nonzero value is on the top of the stack at the end of execution, the breakpoint is taken, otherwise it is skipped.
+.RS
+.LP
+In effect, there are two mandatory conditions: the address of the breakpoint and the kernel process id. These two conditions must be met for the condition list to be processed. If they are met, the entire condition list is processed; there is no short-circuit evaluation path.
+.LP
+For example, given the following code fragment:
+.P1 +.4i
+int i;
+
+for(i=0; i<1000; i++) {
+ ...
+}
+.P2
+the following call to
+.CW bpcondset()
+sets a conditional breakpoint to be taken when execution reaches
+.I addr
+in kernel process
+.I pid
+and
+.CW i
+equals 500 (the 501st pass through the loop, since
+.CW i
+starts at 0):
+.P1 +.4i
+bpcondset(pid, addr, {bppush(i),
+ bpderef(), bppush(500), bpeq()});
+.P2
+.RE
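+.LP
+Masks allow periodic conditions as well. The following is a sketch only, assuming, as the example above does, that
+.CW i
+is visible to Acid as a global symbol. The list would take the breakpoint only when the low four bits of
+.CW i
+are zero, that is, on every sixteenth value of
+.CW i :
+.P1 +.4i
+bpcondset(pid, addr, {bppush(i), bpderef(),
+	bppush(15), bpmask(), bppush(0), bpeq()});
+.P2
+Following the operation table: the value of
+.CW i
+is fetched, masked with 15, and the result compared with 0.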
+.SH
+Condition List Construction
+.LP
+.Ip list bppush val "Construct breakpoint stack
+Push val onto the stack.
+.KE
+.Ip list bpderef "" "Construct breakpoint stack
+Replace the value at the top of the stack with the value found at the address it names: pop the top of the stack, treat it as a ulong*, and push the value stored at that address.
+.Ex
+addr = pop();
+push(*(ulong*)addr);
+.Ee
+.Ip list bpmask "" "Construct breakpoint stack
+Replace the top two values on the stack with the value obtained by masking the second value on the stack with the top of the stack.
+.Ex
+mask = pop();
+value = pop();
+push(value & mask);
+.Ee
+.Ip list bpeq "" "Construct breakpoint stack
+Comparison of the top two values on the stack. Replace the top two values on the stack with a 1 if the values are equal, or a zero if they are not.
+.Ex
+v1 = pop();
+v2 = pop();
+push(v1 == v2);
+.Ee
+.Ip list bpneq "" "Construct breakpoint stack
+Negative comparison of the top two values on the stack. Replace the top two values on the stack with a 0 if the values are equal, or 1 if they are not.
+.Ex
+v1 = pop();
+v2 = pop();
+push(v1 != v2);
+.Ee
+.Ip list bpand "" "Construct breakpoint stack
+Logical and of the top two values on the stack. Replace the top two values on the stack with a 1 if both are nonzero, or a 0 otherwise.
+.Ex
+v1 = pop();
+v2 = pop();
+push(v1 && v2);
+.Ee
+.Ip list bpor "" "Construct breakpoint stack
+Logical or of the top two values on the stack. Replace the top two values on the stack with a 1 if either is nonzero, 0 otherwise.
+.Ex
+v1 = pop();
+v2 = pop();
+push(v1 || v2);
+.Ee
+.SH
+Breakpoint Status
+.LP
+.Ip {} bptab "" "List active breakpoints
+Prints the list of breakpoints containing the following information in order: breakpoint number, kernel process id, breakpoint address, and the list of conditions to execute to determine if the breakpoint will be taken.
+.Ex
+acid: bptab()
+ID PID ADDR CONDITIONS
+0 1 consread+0x20 0x216cc {}
+acid:
+.Ee
+.Ip integer bpaddr id "Address of breakpoint
+Returns the address the breakpoint identified by
+.I id
+is set to trigger on.
+.KE
+.SH
+Deleting breakpoints
+.Ip {} bpdel id "Delete breakpoint
+Delete the breakpoint identified by
+.I id .
+Shorthand for bpconddel().
+.KE
+.Ip {} bpconddel id "Delete conditional breakpoint
+Delete the conditional breakpoint identified by the integer
+.I id .
+.KE
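+.LP
+Several breakpoints can be removed in one step with a small helper. This is a sketch only: the name
+.CW bpdelall
+is hypothetical, and it assumes nothing beyond
+.CW bpdel
+and Acid's list operators.
+.P1
+// hypothetical helper: delete every breakpoint whose ID is in ids
+defn bpdelall(ids)
+{
+	local l;
+
+	l = ids;
+	while l do {
+		bpdel(head l);
+		l = tail l;
+	}
+}
+.P2
+For example,
+.CW "bpdelall({0, 1})"
+would delete breakpoints 0 and 1.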
binary files /dev/null b/doc/acidtut.pdf differ
--- /dev/null
+++ b/doc/acme/acme.ms
@@ -1,0 +1,1295 @@
+.de EX
+.nr x \\$1v
+\\!h0c n \\nx 0
+..
+.de FG \" start figure caption: .FG filename.ps verticalsize
+.KF
+.BP \\$1 \\$2
+.sp .5v
+.EX \\$2v
+.ps -1
+.vs -1
+..
+.de fg \" end figure caption (yes, it is clumsy)
+.ps
+.vs
+.br
+\l'1i'
+.KE
+..
+.TL
+Acme: A User Interface for Programmers
+.AU
+.I "Rob Pike
+.I rob@plan9.bell-labs.com
+.SP .22i exactly
+.AB
+.FS
+\l'1i'
+.br
+Originally appeared in
+.I
+Proc. of the Winter 1994 USENIX Conf.,
+.R
+pp. 223-234,
+San Francisco, CA
+.br
+Reprinted in
+.I "Plan 9 Programmer's Manual: Volume 2"
+(Second Edition), AT&T 1995.
+.FE
+A hybrid of window system, shell, and editor, Acme gives text-oriented
+applications a clean, expressive, and consistent style of interaction.
+Traditional window systems support interactive client programs and offer libraries of
+pre-defined operations such as pop-up menus
+and buttons to promote a consistent
+user interface among the clients.
+Acme instead provides its clients with a fixed user interface and
+simple conventions to encourage its uniform use.
+Clients access the facilities of Acme through a file system interface;
+Acme is in part a file server that exports device-like files that may be
+manipulated to access and control the contents of its windows.
+Written in a concurrent programming language,
+Acme is structured as a set of communicating processes that neatly subdivide
+the various aspects of its tasks: display management, input, file server, and so on.
+.PP
+Acme attaches distinct functions to the three mouse buttons:
+the left selects text;
+the middle executes textual commands;
+and the right combines context search and file opening
+functions to integrate the various applications and files in
+the system.
+.PP
+Acme works well enough to have developed
+a community that uses it exclusively.
+Although Acme discourages the traditional style of interaction
+based on typescript windows\(emteletypes\(emits
+users find Acme's other services render
+typescripts obsolete.
+.AE
+.SH
+History and motivation
+.PP
+The usual typescript style of interaction with
+Unix and its relatives is an old one.
+The typescript\(eman intermingling of textual commands and their
+output\(emoriginates with the scrolls of paper on teletypes.
+The advent of windowed terminals has given each user what
+amounts to an array of teletypes, a limited and unimaginative
+use of the powers of bitmap displays and mice.
+Systems like the Macintosh
+that do involve the mouse as an integral part of the interaction
+are geared towards general users, not experts, and certainly
+not programmers.
+Software developers, at least on time-sharing systems, have been left behind.
+.FG ./acme.fig1 5i
+Figure 1. A small Acme screen\(emnormally it runs on a larger display\(emdemonstrating
+some of the details discussed in the text.
+The right column contains some guide files,
+a mailbox presented by Acme's mail program,
+the columnated display of files in Acme's own source directory,
+a couple of windows from the OED browser,
+a debugger window,
+and an error window showing diagnostics from a compilation.
+The left column holds a couple of source files
+.CW dat.h "" (
+and
+.CW acme.l ),
+another debugger window displaying a stack trace,
+and a third source file
+.CW time.l ). (
+.CW Time.l
+was opened from the debugger by clicking the right mouse button
+on a line in the stack window;
+the mouse cursor landed on the offending line of
+.CW acme.l
+after a click on the compiler message.
+.fg
+.PP
+Some programs have mouse-based editing of
+text files and typescripts;
+ones I have built include
+the window systems
+.CW mux
+[Pike88]
+and
+.CW 8½
+[Pike91]
+and the text editor
+Sam [Pike87].
+These have put the programmer's mouse to some productive work,
+but not wholeheartedly. Even experienced users of these programs
+often retype text that could be grabbed with the mouse,
+partly because the menu-driven interface is imperfect
+and partly because the various pieces are not well enough integrated.
+.PP
+Other programs\(emEMACS [Stal93] is the prime example\(emoffer a high
+degree of integration but with a user interface built around the
+ideas of cursor-addressed terminals that date from the 1970's.
+They are still keyboard-intensive and
+dauntingly complex.
+.PP
+The most ambitious attempt to face these issues was the Cedar
+system, developed at Xerox [Swei86].
+It combined a new programming language, compilers,
+window system, even microcode\(ema complete system\(emto
+construct a productive, highly
+integrated and interactive environment
+for experienced users of compiled languages.
+Although successful internally, the system was so large
+and so tied to specific hardware that it never fledged.
+.PP
+Cedar was, however, the major inspiration for Oberon [Wirt89],
+a system of similar scope but much smaller scale.
+Through careful selection of Cedar's ideas, Oberon shows
+that its lessons can be applied to a small, coherent system
+that can run efficiently on modest hardware.
+In fact, Oberon probably
+errs too far towards simplicity: a single-process system
+with weak networking, it seems an architectural throwback.
+.PP
+Acme is a new program,
+a combined window system, editor, and shell,
+that applies
+some of the ideas distilled by Oberon.
+Where Oberon uses objects and modules within a programming language (also called Oberon),
+Acme uses files and commands within an existing operating system (Plan 9).
+Unlike Oberon, Acme does not yet have support for graphical output, just text.
+At least for now, the work on Acme has concentrated on
+producing the smoothest user interface possible for a programmer
+at work.
+.PP
+The rest of this paper describes Acme's interface,
+explains how programs can access it,
+compares it to existing systems,
+and finally presents some unusual aspects of its implementation.
+.SH
+User interface
+.PP
+.FG ./acme.fig2 3i
+Figure 2. An Acme window showing a section of code.
+The upper line of text is the tag containing the file name,
+relevant commands, and a scratch area (right of the vertical bar);
+the lower portion of the window is the
+body, or contents, of the file.
+Here the scratch area contains a command for the middle button
+.CW mk ) (
+and a word to search for with the right button
+.CW cxfidalloc ). (
+The user has just
+clicked the right button on
+.CW cxfidalloc
+and Acme has searched for the word, highlighted it,
+and moved the mouse cursor there. The file has been modified:
+the center of the layout box is black and the command
+.CW Put
+appears in the tag.
+.fg
+Acme windows are arrayed in columns (Figure 1) and are used more
+dynamically than in an environment like X Windows or
+.CW 8½
+[Sche86, Pike91].
+The system frequently creates them automatically and the user
+can order a new one with a single mouse button click.
+The initial placement of a new window is determined
+automatically, but the user may move an existing window anywhere
+by clicking or dragging a
+.I "layout box
+in the upper left corner of
+the window.
+.PP
+Acme windows have two parts: a
+.I tag
+holding a single line of text,
+above a
+.I body
+holding zero or more lines (Figure 2).
+The body typically contains an image of a file being edited
+or the editable output of a
+program, analogous to an
+EMACS shell
+window. The tag contains
+the name of the window
+(usually the name of the associated
+file or directory), some built-in commands, and a scratch area to hold arbitrary text.
+If a window represents a directory, the name in the tag ends with
+a slash and the body contains a list of the names of the files
+in the directory.
+Finally, each non-empty body holds a scroll bar at the left of the text.
+.PP
+Each column of windows also has a layout box and a tag.
+The tag has no special meaning, although Acme pre-loads it with a few
+built-in commands.
+There is also a tag across the whole display, also loaded with
+helpful commands and a list of active processes started
+by Acme.
+.PP
+Typing with the keyboard and selecting with the left button are as in
+many other systems, including the Macintosh,
+.CW 8½ ,
+and Sam.
+The middle and right buttons are used, somewhat like the left button,
+to `sweep' text, but the indicated text is treated in a way
+that depends on the text's location\(em\f2context\f1\(emas well as its content.
+This context, based on the directory of the file containing the text,
+is a central component of Acme's style of interaction.
+.PP
+Acme has no single notion of `current directory'.
+Instead, every command, file name,
+action, and so on is interpreted or executed in the directory named by the
+tag of the window containing the command. For example, the string
+.CW mammals
+in a window labeled
+.CW /lib/
+or
+.CW /lib/insects
+will be interpreted as the file name
+.CW /lib/mammals
+if such a file exists.
+.PP
+Throughout Acme, the middle mouse button is used to execute commands
+and the right mouse button is used to locate and select files and text.
+Even when there are no true files on which to operate\(emfor example
+when editing mail messages\(emAcme and its applications use
+consistent extensions of these basic functions.
+This idea is as vital to Acme as icons are to the Macintosh.
+.PP
+The middle button executes commands: text swept with the button
+pressed is underlined; when the button is released, the underline is
+removed and the indicated text is executed.
+A modest number of commands are recognized as built-ins: words like
+.CW Cut ,
+.CW Paste ,
+and
+.CW New
+name
+functions performed directly by Acme.
+These words often appear in tags to make them always available,
+but the tags are not menus: any text anywhere in Acme may be a command.
+For example, in the tag or body of any window one may type
+.CW Cut ,
+select it with the left button, use the middle button to execute it,
+and watch it disappear again.
+.PP
+If the middle button indicates a command that is not recognized as a built-in,
+it is executed in the directory
+named by the tag of the window holding the text.
+Also, the file to be executed is searched for first in that directory.
+Standard input is connected to
+.CW /dev/null ,
+but standard and error outputs are connected to an Acme window,
+created if needed, called
+\f2dir\f(CW/+Errors\f1 where
+.I dir
+is the directory of the window.
+(Programs that need interactive input use a different interface, described below.)
+A typical use of this is to type
+.CW mk
+(Plan 9's
+.CW make )
+in the scratch area in the tag of a C source window, say
+.CW /sys/src/cmd/sam/regexp.c ,
+and execute it.
+Output, including compiler errors, appears in the window labeled
+.CW /sys/src/cmd/sam/+Errors ,
+so file names in the output are associated with the windows and directory
+holding the source.
+The
+.CW mk
+command remains in the tag, serving as a sort of menu item for the associated
+window.
+.PP
+Like the middle button, the right button is used to indicate text by sweeping it out.
+The indicated text is not a command, however, but the argument of a generalized
+search operator.
+If the text, perhaps after appending it to the directory of the window containing it,
+is the name of an existing file, Acme creates a new window to hold the file
+and reads it in. It then moves the mouse cursor to that window. If the file is
+already loaded into Acme, the mouse motion happens but no new window is made.
+For example, indicating the string
+.CW sam.h
+in
+.P1
+#include "sam.h"
+.P2
+in a window on the file
+.CW /sys/src/cmd/sam/regexp.c
+will open the file
+.CW /sys/src/cmd/sam/sam.h .
+.PP
+If the file name is followed immediately by a colon and a legal address in
+Sam notation (for example a line number or a regular expression delimited in
+slashes or a comma-separated compound of such addresses), Acme highlights
+the target of that address in the file and places the mouse there. One may jump to
+line 27 of
+.CW dat.h
+by indicating with the right button the text
+.CW dat.h:27 .
+If the file is not already open, Acme loads it.
+If the file name is null, for example if the indicated string is
+.CW :/^main/ ,
+the file is assumed to be that of the window containing the string.
+Such strings, when typed and evaluated in the tag of a window, amount to
+context searches.
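+.PP
+The name and address rules above can be sketched in a few lines.
+What follows is a minimal illustration in Python, not Acme's own code:
+the function names
+.CW split_target
+and
+.CW resolve
+are hypothetical, the split at the first colon assumes that file names contain no colon,
+and the check that the file actually exists is left out.

```python
def split_target(text):
    """Split indicated text into (file name, address).

    The address is None when the text contains no colon.
    Assumption: the file name itself never contains a colon.
    """
    name, colon, address = text.partition(":")
    return name, (address if colon else None)


def resolve(text, tag_name):
    """Interpret indicated text relative to the window named by tag_name.

    A null file name means the window's own file; a relative name is
    appended to the directory part of the tag name.
    """
    name, address = split_target(text)
    if name == "":
        return tag_name, address
    if not name.startswith("/"):
        # Directory part of the tag: everything up to the last slash.
        directory = tag_name[: tag_name.rfind("/") + 1]
        name = directory + name
    return name, address
```

+Under these assumptions the string
+.CW dat.h:27
+splits into the name
+.CW dat.h
+and the address
+.CW 27 ,
+and a null file name yields the name in the tag of the window itself.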
+.PP
+If the indicated text is not the name of an existing file, it is taken to be literal
+text and is searched for in the body of the window containing the text, highlighting
+the result as if it were the result of a context search.
+.PP
+For the rare occasion when a file name
+.I is
+just text to search for, it can be selected with the left button and used as the
+argument to a built-in
+.CW Look
+command that always searches for literal text.
+.SH
+Nuances and heuristics
+.PP
+A user interface should not only provide the necessary functions, it should also
+.I feel
+right.
+In fact, it should almost not be felt at all; when one notices a
+user interface, one is distracted from the job at hand [Pike88].
+To approach this invisibility, some of Acme's properties and features
+are there just to make the others easy to use.
+Many are based on a fundamental principle of good design:
+let the machine do the work.
+.PP
+Acme tries to avoid needless clicking and typing.
+There is no `click-to-type', eliminating a button click.
+There are no pop-up or pull-down menus, eliminating the mouse action needed to
+make a menu appear.
+The overall design is intended to make text on the screen useful without
+copying or retyping; the ways in which this happens involve
+the combination of many aspects of the interface.
+.PP
+Acme tiles its windows and places them automatically
+to avoid asking the user to place and arrange them.
+For this policy to succeed, the automatic placement must behave well enough
+that the user is usually content with the location of a new window.
+The system will never get it right all the time, but in practice most
+windows are used at least for a while where Acme first places them.
+There have been several complete rewrites of the
+heuristics for placing a new window,
+and with each rewrite the system became
+noticeably more comfortable. The rules are as follows, although
+they are still subject to improvement.
+The window appears in the `active' column, that most recently used for typing or
+selecting.
+Executing and searching do not affect the choice of active column,
+so windows of commands and such do not draw new windows towards them,
+but rather let them form near the targets of their actions.
+Output (error) windows always appear towards the right, away from
+edited text, which is typically kept towards the left.
+Within the column, several competing desires are balanced to decide where
+and how large the window should be:
+large blank spaces should be consumed;
+existing text should remain visible;
+existing large windows should be divided before small ones;
+and the window should appear near the one containing the action that caused
+its creation.
+.PP
+Acme binds some actions to chords of mouse buttons.
+These include
+.CW Cut
+and
+.CW Paste
+so these common operations can be done without
+moving the mouse.
+Another is a way to apply a command in one window to text (often a file name)
+in another, avoiding the actions needed to assemble the command textually.
+.PP
+Another way Acme avoids the need to move the mouse is instead to move the cursor
+to where it is likely to be used next. When a new window is made, Acme
+moves the cursor to the new window; in fact, to the selected text in that window.
+When the user deletes a newly made window, the cursor is
+returned to where it was before the window was made,
+reducing the irritation of windows that pop up to report annoying errors.
+.PP
+When a window is moved, Acme moves the cursor to the layout box in
+its new place, to permit further adjustment without moving the mouse.
+For example, when a click of the left mouse button on the layout box grows
+the window, the cursor moves to the new location of the box so repeated clicks,
+without moving the mouse, continue to grow it.
+.PP
+Another form of assistance the system can offer is to supply precision in
+pointing the mouse. The best-known form of this is `double-clicking' to
+select a word rather than carefully sweeping out the entire word.
+Acme provides this feature, using context to decide whether to select
+a word, line, quoted string, parenthesized expression, and so on.
+But Acme takes the idea much further by applying it to execution
+and searching.
+A
+.I single
+click, that is, a null selection, with either the middle or right buttons,
+is expanded automatically to indicate the appropriate text containing
+the click. What is appropriate depends on the context.
+.PP
+For example, to execute a single-word command
+such as
+.CW Cut ,
+it is not necessary to sweep the entire word; just clicking the button once with
+the mouse pointing at the word is sufficient. `Word'
+means the largest string of likely file name characters surrounding the location
+of the click: click on a file name, run that program.
+On the right button, the rules are more complicated because
+the target of the click might be a file name, file name with address,
+or just plain text. Acme examines the text near the click to find
+a likely file name;
+if it finds one, it checks that it names an existing file (in the directory named in the tag, if the name is relative)
+and if so, takes that as the result, after extending it with any address
+that may be present. If there is no file with that name, Acme
+just takes the largest alphanumeric string under the click.
+The effect is a natural overloading of the button to refer to plain text as
+well as file names.
+.PP
+First, though, if the click occurs over the left-button-selected text in the window,
+that text is taken to be what is selected.
+This makes it easy to skip through the occurrences of a string in a file: just click
+the right button
+on some occurrence of the text in the window (perhaps after typing it in the tag)
+and click once for each subsequent occurrence. It isn't even necessary to move
+the mouse between clicks; Acme does that.
+To turn a complicated command into a sort of menu item, select it:
+thereafter, clicking the middle button on it will execute the full command.
+.PP
+As an extra feature, Acme recognizes file names in angle brackets
+.CW <>
+as names of files in standard directories of include files,
+making it possible for instance to look at
+.CW <stdio.h>
+with a single click.
+.PP
+Here's an example to demonstrate how the actions and defaults work together.
+Assume
+.CW /sys/src/cmd/sam/regexp.c
+is
+open and has been edited. We write it (execute
+.CW Put
+in the tag; once the file is written, Acme removes the word from the tag)
+and type
+.CW mk
+in the tag. We execute
+.CW mk
+and get some errors, which appear in a new window labeled
+.CW /sys/src/cmd/sam/+Errors .
+The cursor moves automatically to that window.
+Say the error is
+.P1
+main.c:112: incompatible types on assignment to `pattern'
+.P2
+We move the mouse slightly and click the right button
+at the left of the error message; Acme
+makes a new window, reads
+.CW /sys/src/cmd/sam/main.c
+into it, selects line 112
+and places the mouse there, right on the offending line.
+.SH
+Coupling to existing programs
+.PP
+Acme's syntax for file names and addresses makes it easy for other programs
+to connect automatically to Acme's capabilities. For example, the output of
+.P1
+grep -n variable *.[ch]
+.P2
+can be used to help Acme step through the occurrences of a variable in a program;
+every line of output is potentially a command to open a file.
+The file names need not be absolute, either: the output
+appears in a window labeled with the directory in which
+.CW grep
+was run, from which Acme can derive the full path names.
+.PP
+When necessary, we have changed the output of some programs,
+such as compiler error messages, to match
+Acme's syntax.
+Some might argue that it shouldn't be necessary to change old programs,
+but sometimes programs need to be updated when systems change,
+and consistent output benefits people as well as programs.
+A historical example is the retrofitting of standard error output to the
+early Unix programs when pipes were invented.
+.PP
+Another change was to record full path names in
+the symbol table of executables, so line numbers reported by the debugger
+are absolute names that may be used directly by Acme; it's not necessary
+to run the debugger in the source directory. (This aids debugging
+even without Acme.)
+.PP
+A related change was to add lines of the form
+.P1
+#pragma src "/sys/src/libregexp"
+.P2
+to header files; coupled with Acme's ability to locate a header file,
+this provides a fast, keyboardless way to get the source associated with a library.
+.PP
+Finally, Acme directs the standard output of programs it runs to
+windows labeled by the directory in which the program is run.
+Acme's splitting of the
+output into directory-labeled windows is a small feature that has a major effect:
+local file names printed by programs can be interpreted directly by Acme.
+By indirectly coupling the output of programs to the input,
+it also simplifies the management of software that occupies multiple
+directories.
+.SH
+Coupling to new programs
+.PP
+Like many Plan 9 programs,
+Acme offers a programmable interface to
+other programs by acting as a file server.
+The best example of such a file server is the window system
+.CW 8½
+[Pike91],
+which exports files with names such as
+.CW screen ,
+.CW cons ,
+and
+.CW mouse
+through which applications may access the I/O capabilities of the windows.
+.CW 8½
+provides a
+.I distinct
+set of files for each window and builds a private file name space
+for the clients running `in' each window;
+clients in separate windows see distinct files with the same names
+(for example
+.CW /dev/mouse ).
+Acme, like the process file system [PPTTW93], instead associates each
+window with a directory of files; the files of each window are visible
+to any application.
+This difference reflects a difference in how the systems are used:
+.CW 8½
+tells a client what keyboard and mouse activity has happened in its window;
+Acme tells a client what changes that activity wrought on any window it asks about.
+Putting it another way,
+.CW 8½
+enables the construction of interactive applications;
+Acme provides the interaction for applications.
+.PP
+The root of
+Acme's file system is mounted using Plan 9 operations on the directory
+.CW /mnt/acme .
+In
+that root directory appears a directory for each window, numbered with the window's identifier,
+analogous to a process identifier, for example
+.CW /mnt/acme/27 .
+The window's directory
+contains 6 files:
+.CW /mnt/acme/27/addr ,
+.CW body ,
+.CW ctl ,
+.CW data ,
+.CW event ,
+and
+.CW tag .
+The
+.CW body
+and
+.CW tag
+files contain the text of the respective parts of the window; they may be
+read to recover the contents. Data written to these files is appended to the text;
+.CW seeks
+are ignored.
+The
+.CW addr
+and
+.CW data
+files provide random access to the contents of the body.
+The
+.CW addr
+file is written to set a character position within the body; the
+.CW data
+file may then be read to recover the contents at that position,
+or written to change them.
+(The tag is assumed
+small and special-purpose enough not to need special treatment.
+Also,
+.CW addr
+indexes by character position, which is not the same as byte offset
+in Plan 9's multi-byte character set [Pike93]).
+The format accepted by the
+.CW addr
+file is exactly the syntax of addresses within the user interface,
+permitting regular expressions, line numbers, and compound addresses
+to be specified. For example, to replace the contents of lines 3 through 7,
+write the text
+.P1
+3,7
+.P2
+to the
+.CW addr
+file, then write the replacement text to the
+.CW data
+file. A zero-length write deletes the addressed text; further writes extend the replacement.
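+.PP
+A client could wrap that two-step sequence in a small helper.
+The sketch below is written in Python purely for illustration;
+the function name
+.CW replace_range
+is hypothetical, and keeping both files open for the duration of the
+operation is an assumption rather than a documented requirement.

```python
import os


def replace_range(window_dir, address, replacement):
    """Write an address to addr, then the replacement text to data.

    window_dir is a window directory such as /mnt/acme/27.
    Both files are opened write-only and unbuffered, so the address
    is delivered before the replacement text.
    """
    addr_fd = os.open(os.path.join(window_dir, "addr"), os.O_WRONLY)
    try:
        data_fd = os.open(os.path.join(window_dir, "data"), os.O_WRONLY)
        try:
            os.write(addr_fd, address.encode("utf-8"))
            os.write(data_fd, replacement.encode("utf-8"))
        finally:
            os.close(data_fd)
    finally:
        os.close(addr_fd)
```

+With this helper, replacing lines 3 through 7 of window 27 would be a
+single call naming the window directory, the address
+.CW 3,7 ,
+and the new text.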
+.PP
+The control file,
+.CW ctl ,
+may be written with commands to effect actions on the window; for example
+the command
+.P1
+name /adm/users
+.P2
+sets the name in the tag of the window to
+.CW /adm/users .
+Other commands allow deleting the window, writing it to a file, and so on.
+Reading the
+.CW ctl
+file recovers a fixed-format string containing 5 textual numbers\(emthe window
+identifier, the number of characters in the tag, the number in the body,
+and some status information\(emfollowed by the text of the tag, up to a newline.
+.PP
+The last file,
+.CW event ,
+is the most unusual.
+A program reading a window's
+.CW event
+file is notified of all changes to the text of the window, and
+is asked to interpret all middle- and right-button actions.
+The data passed to the program is fixed-format and reports
+the source of the action (keyboard, mouse, external program, etc.),
+its location (what was pointed at or modified), and its nature (change,
+search, execution, etc.).
+This message, for example,
+.P1
+MI15 19 0 4 time
+.P2
+reports that actions of the mouse
+.CW M ) (
+inserted in the body (capital
+.CW I )
+the 4 characters of
+.CW time
+at character positions 15 through 19; the zero is a flag word.
+Programs may apply their own interpretations of searching and
+execution, or may simply reflect the events back to Acme,
+by writing them back to the
+.CW event
+file, to have the default interpretation applied.
+Some examples of these ideas in action are presented below.
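+.PP
+As a rough illustration of how a client might decode such a message,
+here is a sketch in Python.
+It covers only the layout of the single example shown above;
+the field names
+.CW start ,
+.CW end ,
+.CW flag ,
+and
+.CW count
+are labels chosen for this sketch, and other message layouts are not handled.

```python
import re
from typing import NamedTuple


class Event(NamedTuple):
    origin: str  # source of the action, e.g. "M" for the mouse
    kind: str    # nature of the action, e.g. "I" for insertion in the body
    start: int   # first character position
    end: int     # last character position
    flag: int    # flag word
    count: int   # number of characters of text that follow
    text: str    # the text itself


# Two single-character codes, four space-separated numbers, then the text.
_EVENT = re.compile(r"(.)(.)(\d+) (\d+) (\d+) (\d+) (.*)", re.DOTALL)


def parse_event(message):
    """Parse one event message of the form 'MI15 19 0 4 time'."""
    match = _EVENT.fullmatch(message)
    if match is None:
        raise ValueError("unrecognized event message: %r" % message)
    origin, kind, start, end, flag, count, text = match.groups()
    return Event(origin, kind, int(start), int(end), int(flag), int(count), text)
```

+Applied to the message above, the sketch reports a mouse action that
+inserted the 4 characters of
+.CW time
+at positions 15 through 19 with a flag word of zero.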
+.PP
+Notice that changes to the window are reported
+after the fact; the program is told about them but is not required to act
+on them. Compare this to a more traditional interface in which a program
+is told, for example, that a character has been typed on the keyboard and
+must then display and interpret it.
+Acme's style stems from the basic model of the system, in which any
+number of agents\(emthe keyboard, mouse, external programs
+writing to
+.CW data
+or
+.CW body ,
+and so on\(emmay
+change the contents of a window.
+The style is efficient: many programs are content
+to have Acme do most of the work and act only when the editing is completed.
+An example is the Acme mail program, which can ignore the changes
+made to a message being composed
+and just read its body when asked to send it.
+A disadvantage is that some traditional ways of working are impossible.
+For example, there is no way `to turn off echo': characters appear on the
+screen and are read from there; no agent or buffer stands between
+the keyboard and the display.
+.PP
+There are a couple of other files made available by Acme in its root directory
+rather than in the directory of each window.
+The text file
+.CW /mnt/acme/index
+holds a list of all window names and numerical identifiers,
+somewhat analogous to the output of the
+.CW ps
+command for processes.
+The most important, though, is
+.CW /mnt/acme/new ,
+a directory that makes new windows, similar to the
+.CW clone
+directory in the Plan 9 network devices [Pres93].
+The act of opening any file in
+.CW new
+creates a new Acme window; thus the shell command
+.P1
+grep -n var *.c > /mnt/acme/new/body
+.P2
+places its output in the body of a fresh window.
+More sophisticated applications may open
+.CW new/ctl ,
+read it to discover the new window's identifier, and then
+open the window's other files in the numbered directory.
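+.PP
+That sequence might look like the following Python sketch.
+The function name
+.CW new_window
+is hypothetical; the code assumes that the window identifier is the first
+number in the control string, and it returns the open control file on the
+assumption that the caller will want to keep using it.

```python
import os


def new_window(acme_root="/mnt/acme"):
    """Open new/ctl and return (open ctl file, path of the window's directory).

    Under Acme, opening the file is what creates the window.
    """
    ctl = open(os.path.join(acme_root, "new", "ctl"), "r")
    # Assumption: the first textual number is the window identifier.
    fields = ctl.readline().split()
    window_id = int(fields[0])
    return ctl, os.path.join(acme_root, str(window_id))
```

+The returned path names the numbered directory in which the window's
+other files can then be opened.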
+.SH
+Acme-specific programs
+.PP
+Although Acme is in part an attempt to move beyond typescripts,
+they will probably always have utility.
+The first program written for Acme was therefore one
+to run a shell or other traditional interactive application
+in a window, the Acme analog of
+.CW xterm .
+This program,
+.CW win ,
+has a simple structure:
+it acts as a two-way intermediary between Acme and the shell,
+cross-connecting the standard input and output of the shell to the
+text of the window.
+The style of interaction is modeled after
+.CW mux
+[Pike88]: standard output is added to the window at the
+.I "output point;
+text typed after the output point
+is made available on standard input when a newline is typed.
+After either of these actions, the output point is advanced.
+This is different from the working of a regular terminal,
+permitting cut-and-paste editing of an input line until the newline is typed.
+Arbitrary editing may be done to any text in the window.
+The implementation of
+.CW win ,
+using the
+.CW event ,
+.CW addr ,
+and
+.CW data
+files, is straightforward.
+.CW Win
+needs no code for handling the keyboard and mouse; it just monitors the
+contents of the window. Nonetheless, it allows Acme's full editing to be
+applied to shell commands.
+The division of labor between
+.CW win
+and
+.CW Acme
+contrasted with
+.CW xterm
+and the X server demonstrates how much work Acme handles automatically.
+.CW Win
+is implemented by a single source file 560 lines long and has no graphics code.
+.PP
+.CW Win
+uses the middle and right buttons to connect itself in a consistent way
+with the rest of Acme.
+The middle button still executes commands, but in a style more suited
+to typescripts. Text selected with the middle button is treated as if
+it had been typed after the output point, much like a similar feature in
+.CW xterm
+or
+.CW 8½ ,
+and therefore causes it to be `executed' by the application running in the window.
+Right button actions are reflected back to Acme but refer to the appropriate
+files because
+.CW win
+places the name of the current directory in the tag of the window.
+If the shell is running, a simple shell function replacing the
+.CW cd
+command can maintain the tag as the shell navigates the file system.
+This means, for example, that a right button click on a file mentioned in an
+.CW ls
+listing opens the file within Acme.
+.PP
+Another Acme-specific program is a mail reader that begins by presenting,
+in a window, a listing of the messages in the user's mailbox, one per line.
+Here the middle and right button actions are modified to refer to
+mail commands
+and messages, but the change feels natural.
+Clicking the right button on a line creates a new window and displays the
+message there, or, if it's already displayed, moves the mouse to that window.
+The metaphor is that the mailbox is a directory whose constituent files are messages.
+The mail program also places some relevant commands in the tag lines of
+the windows; for example, executing the word
+.CW Reply
+in a message's tag creates a new window
+in which to compose a message to the sender of the original;
+.CW Post
+then dispatches it.
+In such windows, the addressee is just a list of names
+on the first line of the body, which may be edited to add or change recipients.
+The program also monitors the mailbox, updating the `directory' as new messages
+arrive.
+.PP
+The mail program is as simple as it sounds; all the work of interaction,
+editing, and management of the display is done by Acme.
+The only
+difficult sections of the 1200
+lines of code concern honoring the external protocols for managing
+the mailbox and connecting to
+.CW sendmail .
+.PP
+One of the things Acme does not provide directly is a facility like
+Sam's command language to enable actions such as global substitution;
+within Acme, all editing is done manually.
+It is easy, though, to write external programs for such tasks.
+In this, Acme comes closer to the original intent of Oberon:
+a directory,
+.CW /acme/edit ,
+contains a set of tools for repetitive editing and a template
+or `guide' file that gives examples
+of its use.
+Acme's editing guide,
+.CW /acme/edit/guide ,
+looks like this:
+.P1
+e file | x '/regexp/' | c 'replacement'
+e file:'0,$' | x '/.*word.*\en/' | p -n
+e file | pipe command args ...
+.P2
+The syntax is reminiscent of Sam's command language, but here the individual
+one-letter commands are all stand-alone programs connected by pipes.
+Passed along the pipes are addresses, analogous to structural expressions
+in Sam terminology.
+The
+.CW e
+command, unlike that of Sam, starts the process by generating the address
+(default dot, the highlighted selection) in the named files.
+The other commands are as in Sam:
+.CW p
+prints the addressed text on standard output (the
+.CW -n
+option is analogous to that of
+.CW grep ,
+useful in combination with the right mouse button);
+.CW x
+matches a regular expression to the addressed (incoming) text,
+subdividing the text;
+.CW c
+replaces the text; and so on. Thus, global substitution throughout a file,
+which would be expressed in Sam as
+.P1
+0,$ x/regexp/ c/replacement/
+.P2
+in Acme's editor becomes
+.P1
+e 'file:0,$' | x '/regexp/' | c 'replacement'
+.P2
+.PP
+To use the Acme editing commands, open
+.CW /acme/edit/guide ,
+use the mouse and keyboard to edit one of the commands to the right form,
+and execute it with the middle button.
+Acme's context rules find the appropriate binaries in
+.CW /acme/edit
+rather than
+.CW /bin ;
+the effect is to turn
+.CW /acme/edit
+into a toolbox containing tools and instructions (the guide file) for their use.
+In fact, the source for these tools is also there, in the directory
+.CW /acme/edit/src .
+This setup allows some control of the file name space for binary programs;
+not only does it group related programs, it permits the use of common
+names for uncommon jobs. For example, the single-letter names would
+be unwise in a directory in everyone's search path; here they are only
+visible when running editing commands.
+.PP
+In Oberon,
+such a collection would be called a
+.I tool
+and would consist
+of a set of entry points in a module and a menu-like piece of text containing
+representative commands that may be edited to suit and executed.
+There is, in fact, a tool called
+.CW Edit
+in Oberon.
+To provide related functionality,
+Acme exploits the directory and file structure of the underlying
+system, rather than the module structure of the language;
+this fits well with Plan 9's
+file-oriented philosophy.
+Such tools are central to the working of Oberon but they are
+less used in Acme, at least so far.
+The main reason is probably that Acme's program interface permits
+an external program to remain executing in the background, providing
+its own commands as needed (for example, the
+.CW Reply
+command in the mail program); Oberon uses tools to
+implement such services because it must invoke
+a fresh program for each command.
+Also,
+Acme's better integration allows more
+basic functions to be handled internally; the right mouse button
+covers a lot of the basic utility of the editing tools in Oberon.
+Nonetheless, as more applications are written for Acme,
+many are sure to take this Oberon tool-like form.
+.SH
+Comparison with other systems
+.PP
+Acme's immediate ancestor is Help [Pike92], an experimental system written
+a few years ago as a first try at exploring some of Oberon's ideas
+in an existing operating system.
+Besides much better engineering, Acme's advances over Help
+include the actions of the right button (Help had nothing comparable),
+the ability to connect long-running programs to the user interface
+(Help had no analog of the
+.CW event
+file),
+and the small but important change to split command output into
+windows labeled with the directory in which the commands run.
+.PP
+Most of Acme's style, however, derives from the user interface and window
+system of Oberon [Wirt89, Reis91].
+Oberon includes a programming language and operating system,
+which Acme instead borrows from an existing system, Plan 9.
+When I first saw Oberon, in 1988, I was struck by the
+simplicity of its user interface, particularly its lack of menus
+and its elegant use of multiple mouse buttons.
+The system seemed restrictive, though\(emsingle process,
+single language, no networking, event-driven programming\(emand
+failed to follow through on some of its own ideas.
+For example, the middle mouse button had to be pointed accurately and
+the right button was essentially unused.
+Acme does follow through:
+to the basic idea planted by Oberon, it adds
+the ability to run on different operating systems and hardware,
+connection to existing applications including
+interactive ones such as shells and debuggers,
+support for multiple processes,
+the right mouse button's features,
+the default actions and context-dependent properties
+of execution and searching,
+and a host of little touches such as moving the mouse cursor that make the system
+more pleasant.
+At the moment, though, Oberon does have one distinct advantage: it incorporates
+graphical programs well into its model, an issue Acme has not yet faced.
+.PP
+Acme shares with the Macintosh a desire to use the mouse well and it is
+worth comparing the results.
+The mouse on the Macintosh has a single button, so menus are essential
+and the mouse must frequently move a long way
+to reach the appropriate function.
+An indication that this style has trouble is that applications provide
+keyboard sequences to invoke menu selections and users often prefer them.
+A deeper comparison is that the Macintosh uses pictures where Acme uses text.
+In contrast to pictures, text can be edited quickly, created on demand,
+and fine-tuned to the job at hand; consider adding an option to a command.
+It is also self-referential; Acme doesn't need menus because any text can be
+in effect a menu item.
+The result is that, although a Macintosh screen is certainly prettier and probably
+more attractive, especially to beginners, an Acme screen is more dynamic
+and expressive, at least for programmers and experienced users.
+.PP
+For its role in the overall system,
+Acme most resembles EMACS [Stal93].
+It is tricky to compare Acme to EMACS, though, because there are
+many versions of EMACS and, since it is fully programmable, EMACS
+can in principle do anything Acme does.
+Also, Acme is much younger and therefore has not
+had the time to acquire as many features.
+The issue therefore is less what the systems can be programmed to do than
+how they are used.
+The EMACS versions that come closest to Acme's style are those that
+have been extended to provide a programming environment, usually
+for a language such as LISP [Alle92, Lucid92].
+For richness of the existing interface, these EMACS versions are certainly superior to Acme.
+On the other hand, Acme's interface works equally well already for a variety
+of languages; for example, one of its most enthusiastic users works almost
+exclusively in Standard ML, a language nothing like C.
+.PP
+Where Acme excels is in the smoothness of its interface.
+Until recently, EMACS did not support the mouse especially well,
+and even with the latest version providing features such as `extents'
+that can be programmed to behave much like Acme commands,
+many users don't bother to upgrade.
+Moreover, in the versions that provide extents,
+most EMACS packages don't take advantage of them.
+.PP
+The most important distinction is just that
+EMACS is fundamentally keyboard-based, while
+Acme is mouse-based.
+.PP
+People who try Acme find it hard to go back to their previous environment.
+Acme automates so much that to return to a traditional interface
+is to draw attention to the extra work it requires.
+.SH
+Concurrency in the implementation
+.PP
+Acme is about 8,000 lines of code in Alef, a concurrent object-oriented language syntactically similar to C [Alef].
+Acme's structure is a set of communicating
+processes in a single address space.
+One subset of the processes drives the display and user interface,
+maintaining the windows; other processes forward mouse and keyboard
+activity and implement the file server interface for external programs.
+The language and design worked out well;
+as explained elsewhere [Pike89, Gans93, Reppy93],
+user interfaces built with concurrent systems
+can avoid the clumsy
+top-level event loop typical of traditional interactive systems.
+.PP
+An example of the benefits of the multi-process style
+is the management of the state of open
+files held by clients of the file system interface.
+The problem is that some I/O requests,
+such as reading the
+.CW event
+file, may block if no data is available, and the server must
+maintain the state of (possibly many) requests until data appears.
+For example,
+in
+.CW 8½ ,
+a single-process window system written in C, pending requests were queued in
+a data structure associated with each window.
+After activity in the window that might complete pending I/O,
+the data structure was scanned for requests that could now finish.
+This structure did not fit well with the rest of the program and, worse,
+required meticulous effort
+to guarantee correct behavior under all conditions
+(consider raw mode, reads of partial lines, deleting a window,
+multibyte characters, etc.).
+.PP
+Acme instead creates a new dedicated process
+for each I/O request.
+This process coordinates with the rest of the system
+using Alef's synchronous communication;
+its state implicitly encodes the state of
+the I/O request and obviates the need for queuing.
+The passage of the request through Acme proceeds as follows.
+.PP
+Acme contains a file server process, F, that executes a
+.CW read
+system call to receive a Plan 9 file protocol (9P) message from the client [AT&T92].
+The client blocks until Acme answers the request.
+F communicates with an allocation process, M,
+to acquire an object of type
+.CW Xfid
+(`executing fid'; fid is a 9P term)
+to hold the request.
+M sits in a loop (reproduced in Figure 2) waiting for either a request for
+a new
+.CW Xfid
+or notification that an existing one has finished its task.
+When an
+.CW Xfid
+is created, an associated process, X,
+is also made.
+M queues idle
+.CW Xfids ,
+allocating new ones only when the list is empty.
+Thus, there is always a pool of
+.CW Xfids ,
+some executing, some idle.
+.PP
+The
+.CW Xfid
+object contains a channel,
+.CW Xfid.c ,
+for communication with its process;
+the unpacked message; and some associated functions,
+mostly corresponding to 9P messages such as
+.CW Xfid.write
+to handle a 9P write request.
+.PP
+The file server process F parses the message to see its nature\(emopen,
+close, read, write, etc. Many messages, such as directory
+lookups, can be handled immediately; these are responded to directly
+and efficiently
+by F without invoking the
+.CW Xfid ,
+which is therefore maintained until the next message.
+When a message, such as a write to the display, requires the attention
+of the main display process and interlocked access to its data structures,
+F enables X
+by sending a function pointer on
+.CW Xfid.c .
+For example, if the message is a write, F executes
+.P1
+x->c <-= Xfid.write;
+.P2
+which sends
+the address of
+.CW Xfid.write
+on
+.CW Xfid.c ,
+waking up X.
+.PP
+The
+.CW Xfid
+process, X, executes a simple loop:
+.P1
+void
+Xfid.ctl(Xfid *x)
+{
+ for(;;){
+ (*<-x->c)(x); /* receive and execute message */
+ bflush(); /* synchronize bitmap display */
+ cxfidfree <-= x; /* return to free list */
+ }
+}
+.P2
+Thus X
+will wake up with the address of a function to call (here
+.CW Xfid.write )
+and execute it; once that completes, it returns itself to the pool of
+free processes by sending its address back to the allocator.
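+.PP
+The hand-off among F, M, and X can be sketched in a language with similar channels.
+The following is a minimal illustration in Go, not Acme's Alef source:
+the names
+.CW allocator ,
+.CW serve ,
+and the
+.CW reply
+field are assumptions made for the example, there is no 9P parsing,
+and every request is dispatched to an Xfid.

```go
package main

import "fmt"

// Xfid is a hypothetical stand-in for Acme's Xfid object.
type Xfid struct {
	c     chan func(*Xfid) // analogous to Xfid.c: carries the function to run
	msg   string           // stands in for the unpacked 9P message
	reply chan string      // where this sketch delivers the answer (an assumption)
}

// ctl mirrors Xfid.ctl: receive a function, execute it, return to the free list.
func (x *Xfid) ctl(free chan<- *Xfid) {
	for f := range x.c {
		f(x)
		free <- x
	}
}

// write plays the role of Xfid.write.
func write(x *Xfid) { x.reply <- "wrote " + x.msg }

// allocator plays the role of M: it hands out idle Xfids and creates a new
// one, together with its process, only when none is idle.
func allocator(alloc chan<- *Xfid, free chan *Xfid) {
	var idle []*Xfid
	for {
		if len(idle) == 0 {
			x := &Xfid{c: make(chan func(*Xfid))}
			go x.ctl(free)
			idle = append(idle, x)
		}
		select {
		case alloc <- idle[len(idle)-1]: // a request for an Xfid
			idle = idle[:len(idle)-1]
		case x := <-free: // an Xfid has finished its task
			idle = append(idle, x)
		}
	}
}

// serve plays the role of F for one request that needs an Xfid.
func serve(alloc <-chan *Xfid, msg string) string {
	x := <-alloc
	reply := make(chan string, 1)
	x.msg, x.reply = msg, reply
	x.c <- write // the analogue of: x->c <-= Xfid.write
	return <-reply
}

func main() {
	alloc, free := make(chan *Xfid), make(chan *Xfid)
	go allocator(alloc, free)
	fmt.Println(serve(alloc, "hello"))
	fmt.Println(serve(alloc, "world"))
}
```

+As in the description above, the state of a pending request lives entirely
+in the process that is executing it; the sketch keeps no queue of requests.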
+.PP
+Although this sequence may seem complicated, it is just a few lines
+of code and is in fact far simpler
+than the management of the I/O queues in
+.CW 8½ .
+The hard work of synchronization is done by the Alef run time system.
+Moreover, the code worked the first time, which cannot be said for the code in
+.CW 8½ .
+.SH
+Undo
+.PP
+Acme provides a general undo facility like that of Sam, permitting
+textual changes to be unwound arbitrarily.
+The implementation is superior to Sam's, though,
+with much higher performance and the ability to `redo' changes.
+.PP
+Sam uses
+a multi-pass algorithm that builds
+a transcript of changes to be made simultaneously
+and then executes them atomically.
+This was thought necessary because the elements of a repetitive
+command such as a global substitution should all be applied to the same
+initial file and implemented simultaneously; forming the complete
+transcript before executing any of the changes avoids the
+cumbersome management of addresses in a changing file.
+Acme, however, doesn't have this problem; global substitution
+is controlled externally and may be made incrementally by exploiting
+an observation: if the changes are sorted in address order and
+executed in reverse, changes will not invalidate the addresses of
+pending changes.
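+.PP
+The observation can be illustrated with a short sketch, written here in Go
+rather than taken from Acme.
+The
+.CW Change
+type and
+.CW applyReverse
+function are hypothetical names, and the sketch assumes the changes do not overlap
+and are all addressed against the original text.

```go
package main

import (
	"fmt"
	"sort"
)

// Change replaces the bytes in [Start, End) of the original text with Text.
type Change struct {
	Start, End int
	Text       string
}

// applyReverse sorts the changes by address and applies them from the
// highest address down, so each edit leaves the addresses of the
// still-pending (lower) changes untouched.
func applyReverse(text string, changes []Change) string {
	sorted := append([]Change(nil), changes...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start < sorted[j].Start })
	for i := len(sorted) - 1; i >= 0; i-- {
		c := sorted[i]
		text = text[:c.Start] + c.Text + text[c.End:]
	}
	return text
}

func main() {
	// Replace both occurrences of "cat", at byte offsets 0 and 8.
	changes := []Change{{0, 3, "mouse"}, {8, 11, "mouse"}}
	fmt.Println(applyReverse("cat and cat", changes))
}
```

+Applied in forward order, the first replacement would lengthen the text by two
+bytes and the second address would have to be adjusted; in reverse order no
+adjustment is needed.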
+.PP
+Acme therefore avoids the initial transcript. Instead, changes are applied
+directly to the file, with an undo transcript recorded in a separate list.
+For example, when text is added to a window, it is added directly and a record
+of what to delete to restore the state is appended to the undo list.
+Each undo action and the file are marked with a sequence number;
+actions with the same sequence number are considered a unit
+to be undone together.
+The invariant state of the structure
+is that the last action in the undo list applies to the current state of the file,
+even if that action is one of a related set from, for example, a global substitute.
+(In Sam, a related set of actions needed to be undone simultaneously.)
+To undo an action, pop the last item on the undo list, apply it to the file,
+revert it, and append it to a second, redo list.
+To redo an action, do the identical operation with the lists interchanged.
+The expensive operations occur
+only when actually undoing; in normal editing the overhead is minor.
+For example, Acme reads files about seven times faster than Sam, partly
+because of this improvement and partly because of a cleaner implementation.
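+.PP
+The two-list scheme described above can be sketched as follows.
+This is an illustration in Go under stated assumptions, not Acme's implementation:
+the type and method names are hypothetical, the text is held in a string
+rather than a temporary file, and discarding the redo list on a fresh edit
+is a choice made for the sketch.

```go
package main

import "fmt"

// Action is one reversible edit: replace n bytes at pos with text.
// Actions with the same seq form a unit and are undone together.
type Action struct {
	seq, pos, n int
	text        string
}

// File holds the text plus the undo and redo lists.
type File struct {
	text       string
	seq        int
	undo, redo []Action
}

// apply performs a on the text and returns the action that reverses it.
func (f *File) apply(a Action) Action {
	old := f.text[a.pos : a.pos+a.n]
	f.text = f.text[:a.pos] + a.text + f.text[a.pos+a.n:]
	return Action{a.seq, a.pos, len(a.text), old}
}

// Mark starts a new unit of undo.
func (f *File) Mark() { f.seq++ }

// Edit changes the text directly and records how to restore it.
func (f *File) Edit(pos, n int, text string) {
	f.undo = append(f.undo, f.apply(Action{f.seq, pos, n, text}))
	f.redo = nil // assumption of this sketch: a fresh edit discards redo history
}

// shift pops every action sharing the last sequence number from one list,
// applies it, and appends its inverse to the other list.
func (f *File) shift(from, to *[]Action) {
	if len(*from) == 0 {
		return
	}
	seq := (*from)[len(*from)-1].seq
	for len(*from) > 0 && (*from)[len(*from)-1].seq == seq {
		a := (*from)[len(*from)-1]
		*from = (*from)[:len(*from)-1]
		*to = append(*to, f.apply(a))
	}
}

// Undo and Redo are the same operation with the lists interchanged.
func (f *File) Undo() { f.shift(&f.undo, &f.redo) }
func (f *File) Redo() { f.shift(&f.redo, &f.undo) }

func main() {
	f := &File{text: "a-b-c"}
	f.Mark()
	f.Edit(3, 1, "+") // a substitution made in reverse address order
	f.Edit(1, 1, "+")
	fmt.Println(f.text)
	f.Undo()
	fmt.Println(f.text)
	f.Redo()
	fmt.Println(f.text)
}
```

+Normal editing costs one append to the undo list per change; the work of
+inverting actions is paid only when undoing or redoing.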
+.PP
+Acme uses a temporary file to hold the text, keeping in memory only the
+visible portion, and therefore can edit large files comfortably
+even on small-memory machines such as laptops.
+.SH
+Future
+.PP
+Acme is still under development.
+Some things are simply missing.
+For example, Acme should support non-textual graphics, but this is being
+deferred until it can be done using a new graphics model being developed
+for Plan 9. Also, it is undecided how Acme's style of interaction should best be
+extended to graphical applications.
+On a smaller scale, although the system feels smooth and comfortable,
+work continues to tune the heuristics and
+try new ideas for the user interface.
+.PP
+There need to be more programs that use Acme. Browsers for
+Usenet and AP News articles, the Oxford English Dictionary, and other
+such text sources exist, but more imaginative applications will
+be necessary to prove that Acme's approach is viable.
+One that has recently been started is an interface to the debugger Acid [Wint94],
+although it is still
+unclear what form it will ultimately take.
+.PP
+Acme shows that it is possible to make a user interface a stand-alone component
+of an interactive environment. By absorbing more of the interactive
+functionality than a simple window system, Acme off-loads much of the
+computation from its applications, which helps keep them small and
+consistent in their interface. Acme can afford to dedicate
+considerable effort to making that interface as good as possible; the result
+will benefit the entire system.
+.PP
+Acme is complete and useful enough to attract users.
+Its comfortable user interface,
+the ease with which it handles multiple tasks and
+programs in multiple directories,
+and its high level of integration
+make it addictive.
+Perhaps most telling,
+Acme shows that typescripts may not be the most
+productive interface to a time-sharing system.
+.SH
+Acknowledgements
+.PP
+Howard Trickey, Acme's first user, suffered buggy versions gracefully and made
+many helpful suggestions. Chris Fraser provided the necessary insight for the Acme editing
+commands.
+.SH
+References
+.LP
+[Alef] P. Winterbottom,
+``Alef Language Reference Manual'',
+.I
+Plan 9 Programmer's Manual,
+.R
+AT&T Bell Laboratories,
+Murray Hill, NJ,
+1992;
+revised in this volume.
+.br
+[Alle92]
+.I
+Allegro Common Lisp User Guide, Vol 2,
+.R
+Chapter 14, "The Emacs-Lisp Interface".
+March 1992.
+.br
+[AT&T92] Plan 9 Programmer's Manual, Murray Hill, New Jersey, 1992.
+.br
+[Far89] Far too many people, XTERM(1), Massachusetts Institute of Technology, 1989.
+.br
+[Gans93] Emden R. Gansner and John H. Reppy, ``A Multi-threaded Higher-order User Interface Toolkit'', in
+.I
+Software Trends, Volume 1,
+User Interface Software,
+.R
+Bass and Dewan (Eds.),
+John Wiley & Sons 1993,
+pp. 61-80.
+.br
+[Lucid92] Richard Stallman and Lucid, Inc.,
+.I
+Lucid GNU EMACS Manual,
+.R
+March 1992.
+.br
+[Pike87] Rob Pike, ``The Text Editor \f(CWsam\fP'', Softw. - Pract. and Exp., Nov 1987, Vol 17 #11, pp. 813-845; reprinted in this volume.
+.br
+[Pike88] Rob Pike, ``Window Systems Should Be Transparent'', Comp. Sys., Summer 1988, Vol 1 #3, pp. 279-296.
+.br
+[Pike89] Rob Pike, ``A Concurrent Window System'', Comp. Sys., Spring 1989, Vol 2 #2, pp. 133-153.
+.br
+[PPTTW93] Rob Pike, Dave Presotto, Ken Thompson, Howard Trickey, and Phil Winterbottom, ``The Use of Name Spaces in Plan 9'',
+Op. Sys. Rev., Vol. 27, No. 2, April 1993, pp. 72-76,
+reprinted in this volume.
+.br
+[Pike91] Rob Pike, ``8½, the Plan 9 Window System'', USENIX Summer Conf. Proc., Nashville, June, 1991, pp. 257-265,
+reprinted in this volume.
+.br
+[Pike92] Rob Pike, ``A Minimalist Global User Interface'', Graphics Interface '92 Proc., Vancouver, 1992, pp. 282-293. An earlier version appeared under the same title in USENIX Summer Conf. Proc., Nashville, June, 1991, pp. 267-279.
+.br
+[Pike93] Rob Pike and Ken Thompson, ``Hello World or Καλημέρα κόσμε or
+\f(Jpこんにちは 世界\fP'', USENIX Winter Conf. Proc., San Diego, 1993, pp. 43-50,
+reprinted in this volume.
+.br
+[Pres93] Dave Presotto and Phil Winterbottom, ``The Organization of Networks in Plan 9'', Proc. Usenix Winter 1993, pp. 271-287, San Diego, CA,
+reprinted in this volume.
+.br
+[Reis91] Martin Reiser, \fIThe Oberon System,\fP Addison Wesley, New York, 1991.
+.br
+[Reppy93] John H. Reppy,
+``CML: A higher-order concurrent language'', Proc. SIGPLAN'91 Conf. on Programming, Lang. Design and Impl., June, 1991, pp. 293-305.
+.br
+[Sche86] Robert W. Scheifler and Jim Gettys,
+``The X Window System'',
+ACM Trans. on Graph., Vol 5 #2, pp. 79-109.
+.br
+[Stal93] Richard Stallman,
+.I
+Gnu Emacs Manual, 9th edition, Emacs version 19.19,
+.R
+MIT.
+.br
+[Swei86] Daniel Swinehart, Polle Zellweger, Richard Beach, and Robert Hagmann,
+``A Structural View of the Cedar Programming Environment'',
+ACM Trans. Prog. Lang. and Sys., Vol. 8, No. 4, pp. 419-490, Oct. 1986.
+.br
+[Wint94] Philip Winterbottom, ``Acid: A Debugger based on a Language'', USENIX Winter Conf. Proc., San Francisco, CA, 1993,
+reprinted in this volume.
+.br
+[Wirt89] N. Wirth and J. Gutknecht, ``The Oberon System'', Softw. - Prac. and Exp., Sep 1989, Vol 19 #9, pp 857-894.
binary files /dev/null b/doc/acme/acme.pdf differ
--- /dev/null
+++ b/doc/acme/mkfile
@@ -1,0 +1,4 @@
+<../fonts
+
+acme.ps:D: acme.ms
+ {echo $FONTS; cat acme.ms} | troff -mpm -mpictures -mnihongo | lp -dstdout >acme.ps
--- /dev/null
+++ b/doc/asm.ms
@@ -1,0 +1,1394 @@
+.ft CW
+.ta 8n +8n +8n +8n +8n +8n +8n
+.ft
+.TL
+A Manual for the Plan 9 assembler
+.AU
+.I "Rob Pike"
+.AI
+rob@plan9.bell-labs.com
+.SH
+Machines
+.PP
+There is an assembler for each of the MIPS, SPARC, Intel 386,
+Motorola 68020 and 68000, IBM Power PC, DEC Alpha, and ARM.
+The 68020 assembler,
+.CW 2a ,
+is the oldest and in many ways the prototype.
+The assemblers are really just variations of a single program:
+they share many properties such as left-to-right assignment order for
+instruction operands and the synthesis of macro instructions
+such as
+.CW MOVE
+to hide the peculiarities of the load and store structure of the machines.
+To keep things concrete, the first part of this manual is
+specifically about the 68020.
+At the end is a description of the differences among
+the other assemblers.
+.ig
+.PP
+The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
+is a prerequisite for this manual.
+..
+.SH
+Registers
+.PP
+All pre-defined symbols in the assembler are upper-case.
+Data registers are
+.CW R0
+through
+.CW R7 ;
+address registers are
+.CW A0
+through
+.CW A7 ;
+floating-point registers are
+.CW F0
+through
+.CW F7 .
+.PP
+A pointer in
+.CW A6
+is used by the C compiler to point to data, enabling short addresses to
+be used more often.
+The value of
+.CW A6
+is constant and must be set during C program initialization
+to the address of the externally-defined symbol
+.CW a6base .
+.PP
+The following hardware registers are defined in the assembler; their
+meaning should be obvious given a 68020 manual:
+.CW CAAR ,
+.CW CACR ,
+.CW CCR ,
+.CW DFC ,
+.CW ISP ,
+.CW MSP ,
+.CW SFC ,
+.CW SR ,
+.CW USP ,
+and
+.CW VBR .
+.PP
+The assembler also defines several pseudo-registers that
+manipulate the stack:
+.CW FP ,
+.CW SP ,
+and
+.CW TOS .
+.CW FP
+is the frame pointer, so
+.CW 0(FP)
+is the first argument,
+.CW 4(FP)
+is the second, and so on.
+.CW SP
+is the local stack pointer, where automatic variables are held
+(SP is a pseudo-register only on the 68020);
+.CW 0(SP)
+is the first automatic, and so on as with
+.CW FP .
+Finally,
+.CW TOS
+is the top-of-stack register, used for pushing parameters to procedures,
+saving temporary values, and so on.
+.PP
+The assembler and loader track these pseudo-registers so
+the above statements are true regardless of what has been
+pushed on the hardware stack, pointed to by
+.CW A7 .
+The name
+.CW A7
+refers to the hardware stack pointer, but beware of mixed use of
+.CW A7
+and the above stack-related pseudo-registers, which will cause trouble.
+Note, too, that the loader observes that the
+.CW PEA
+instruction alters SP and thus inserts
+a corresponding pop before all returns.
+The assembler accepts a label-like name to be attached to
+.CW FP
+and
+.CW SP
+uses, such as
+.CW p+0(FP) ,
+to help document that
+.CW p
+is the first argument to a routine.
+The name goes in the symbol table but has no significance to the result
+of the program.
+.SH
+Referring to data
+.PP
+All external references must be made relative to some pseudo-register,
+either
+.CW PC
+(the virtual program counter) or
+.CW SB
+(the ``static base'' register).
+.CW PC
+counts instructions, not bytes of data.
+For example, to branch to the second following instruction, that is,
+to skip one instruction, one may write
+.P1
+ BRA 2(PC)
+.P2
+Labels are also allowed, as in
+.P1
+ BRA return
+ NOP
+return:
+ RTS
+.P2
+When using labels, there is no
+.CW (PC)
+annotation.
+.PP
+The pseudo-register
+.CW SB
+refers to the beginning of the address space of the program.
+Thus, references to global data and procedures are written as
+offsets to
+.CW SB ,
+as in
+.P1
+ MOVL $array(SB), TOS
+.P2
+to push the address of a global array on the stack, or
+.P1
+ MOVL array+4(SB), TOS
+.P2
+to push the second (4-byte) element of the array.
+Note the use of an offset; the complete list of addressing modes is given below.
+Similarly, subroutine calls must use
+.CW SB :
+.P1
+ BSR exit(SB)
+.P2
+File-static variables have syntax
+.P1
+ local<>+4(SB)
+.P2
+The
+.CW <>
+will be filled in at load time by a unique integer.
+.PP
+When a program starts, it must execute
+.P1
+ MOVL $a6base(SB), A6
+.P2
+before accessing any global data.
+(On machines such as the MIPS and SPARC that cannot load a register
+in a single instruction, constants are loaded through the static base
+register. The loader recognizes code that initializes the static
+base register and treats it specially. You must be careful, however,
+not to load large constants on such machines when the static base
+register is not set up, such as early in interrupt routines.)
+.SH
+Expressions
+.PP
+Expressions are mostly what one might expect.
+Where an offset or a constant is expected,
+a primary expression with unary operators is allowed.
+A general C constant expression is allowed in parentheses.
+.PP
+Source files are preprocessed exactly as in the C compiler, so
+.CW #define
+and
+.CW #include
+work.
+.SH
+Addressing modes
+.PP
+The simple addressing modes are shared by all the assemblers.
+Here, for completeness, follows a table of all the 68020 addressing modes,
+since that machine has the richest set.
+In the table,
+.CW o
+is an offset, which if zero may be elided, and
+.CW d
+is a displacement, which is a constant between -128 and 127 inclusive.
+Many of the modes listed have the same name;
+scrutiny of the format will show what default is being applied.
+For instance, indexed mode with no address register supplied operates
+as though a zero-valued register were used.
+For "offset" read "displacement."
+For "\f(CW.s\fP" read one of
+.CW .L
+or
+.CW .W
+followed by
+.CW *1 ,
+.CW *2 ,
+.CW *4 ,
+or
+.CW *8
+to indicate the size and scaling of the data.
+.IP
+.TS
+l lfCW.
+data register R0
+address register A0
+floating-point register F0
+special names CAAR, CACR, etc.
+constant $con
+floating point constant $fcon
+external symbol name+o(SB)
+local symbol name<>+o(SB)
+automatic symbol name+o(SP)
+argument name+o(FP)
+address of external $name+o(SB)
+address of local $name<>+o(SB)
+indirect post-increment (A0)+
+indirect pre-decrement -(A0)
+indirect with offset o(A0)
+indexed with offset o()(R0.s)
+indexed with offset o(A0)(R0.s)
+external indexed name+o(SB)(R0.s)
+local indexed name<>+o(SB)(R0.s)
+automatic indexed name+o(SP)(R0.s)
+parameter indexed name+o(FP)(R0.s)
+offset indirect post-indexed d(o())(R0.s)
+offset indirect post-indexed d(o(A0))(R0.s)
+external indirect post-indexed d(name+o(SB))(R0.s)
+local indirect post-indexed d(name<>+o(SB))(R0.s)
+automatic indirect post-indexed d(name+o(SP))(R0.s)
+parameter indirect post-indexed d(name+o(FP))(R0.s)
+offset indirect pre-indexed d(o()(R0.s))
+offset indirect pre-indexed d(o(A0))
+offset indirect pre-indexed d(o(A0)(R0.s))
+external indirect pre-indexed d(name+o(SB))
+external indirect pre-indexed d(name+o(SB)(R0.s))
+local indirect pre-indexed d(name<>+o(SB))
+local indirect pre-indexed d(name<>+o(SB)(R0.s))
+automatic indirect pre-indexed d(name+o(SP))
+automatic indirect pre-indexed d(name+o(SP)(R0.s))
+parameter indirect pre-indexed d(name+o(FP))
+parameter indirect pre-indexed d(name+o(FP)(R0.s))
+.TE
+.in
+.SH
+Laying down data
+.PP
+Placing data in the instruction stream, say for interrupt vectors, is easy:
+the pseudo-instructions
+.CW LONG
+and
+.CW WORD
+(but not
+.CW BYTE )
+lay down the value of their single argument, of the appropriate size,
+as if it were an instruction:
+.P1
+ LONG $12345
+.P2
+places the long 12345 (base 10)
+in the instruction stream.
+(On most machines,
+the only such operator is
+.CW WORD
+and it lays down 32-bit quantities.
+The 386 has all three:
+.CW LONG ,
+.CW WORD ,
+and
+.CW BYTE .
+The AMD64 adds
+.CW QUAD
+for 64-bit values.)
+.PP
+Placing information in the data section is more painful.
+The pseudo-instruction
+.CW DATA
+does the work, given two arguments: an address at which to place the item,
+including its size,
+and the value to place there. For example, to define a character array
+.CW array
+containing the characters
+.CW abc
+and a terminating null:
+.P1
+ DATA array+0(SB)/1, $'a'
+ DATA array+1(SB)/1, $'b'
+ DATA array+2(SB)/1, $'c'
+ GLOBL array(SB), $4
+.P2
+or
+.P1
+ DATA array+0(SB)/4, $"abc\ez"
+ GLOBL array(SB), $4
+.P2
+The
+.CW /1
+gives the number of bytes to define,
+.CW GLOBL
+makes the symbol global, and the
+.CW $4
+says how many bytes the symbol occupies.
+Uninitialized data is zeroed automatically.
+The character
+.CW \ez
+is equivalent to the C
+.CW \e0 .
+The string in a
+.CW DATA
+statement may contain a maximum of eight bytes;
+build larger strings piecewise.
+Two pseudo-instructions,
+.CW DYNT
+and
+.CW INIT ,
+allow the (obsolete) Alef compilers to build dynamic type information during the load
+phase.
+The
+.CW DYNT
+pseudo-instruction has two forms:
+.P1
+ DYNT , ALEF_SI_5+0(SB)
+ DYNT ALEF_AS+0(SB), ALEF_SI_5+0(SB)
+.P2
+In the first form,
+.CW DYNT
+defines the symbol to be a small unique integer constant, chosen by the loader,
+which is some multiple of the word size. In the second form,
+.CW DYNT
+defines the second symbol in the same way,
+places the address of the most recently
+defined text symbol in the array specified by the first symbol at the
+index defined by the value of the second symbol,
+and then adjusts the size of the array accordingly.
+.PP
+The
+.CW INIT
+pseudo-instruction takes the same parameters as a
+.CW DATA
+statement. Its symbol is used as the base of an array and the
+data item is installed in the array at the offset specified by the most recent
+.CW DYNT
+pseudo-instruction.
+The size of the array is adjusted accordingly.
+The
+.CW DYNT
+and
+.CW INIT
+pseudo-instructions are not implemented on the 68020.
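+.PP
+Because a string in a
+.CW DATA
+statement holds at most eight bytes, longer strings are tedious to write by hand.
+The following Go sketch generates the statements piecewise.
+It is a hypothetical helper, not part of the Plan 9 tools; it assumes the string
+is printable ASCII with no quotes or backslashes, relies on the zeroing of
+uninitialized data for the terminating null, and does not check that the
+assembler accepts every resulting chunk size.

```go
package main

import (
	"fmt"
	"strings"
)

// dataStatements returns assembler text that defines the symbol name as the
// bytes of s followed by a zero byte, using DATA strings of at most 8 bytes.
func dataStatements(name, s string) string {
	var b strings.Builder
	for off := 0; off < len(s); off += 8 {
		end := off + 8
		if end > len(s) {
			end = len(s)
		}
		// address, size of this piece, and the piece itself
		fmt.Fprintf(&b, "\tDATA\t%s+%d(SB)/%d, $\"%s\"\n", name, off, end-off, s[off:end])
	}
	// One extra byte, left uninitialized and therefore zero, ends the string.
	fmt.Fprintf(&b, "\tGLOBL\t%s(SB), $%d\n", name, len(s)+1)
	return b.String()
}

func main() {
	fmt.Print(dataStatements("msg", "hello, world"))
}
```

+For the twelve-byte string in
+.CW main ,
+the sketch emits one eight-byte piece, one four-byte piece, and a
+.CW GLOBL
+of thirteen bytes.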
+.SH
+Defining a procedure
+.PP
+Entry points are defined by the pseudo-operation
+.CW TEXT ,
+which takes as arguments the name of the procedure (including the ubiquitous
+.CW (SB) )
+and the number of bytes of automatic storage to pre-allocate on the stack,
+which will usually be zero when writing assembly language programs.
+On machines with a link register, such as the MIPS and SPARC,
+the special value -4 instructs the loader to generate no PC save
+and restore instructions, even if the function is not a leaf.
+Here is a complete procedure that returns the sum
+of its two arguments:
+.P1
+TEXT sum(SB), $0
+ MOVL arg1+0(FP), R0
+ ADDL arg2+4(FP), R0
+ RTS
+.P2
+An optional middle argument
+to the
+.CW TEXT
+pseudo-op is a bit field of options to the loader.
+Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
+the program.
+For example,
+.P1
+TEXT sum(SB), 1, $0
+ MOVL arg1+0(FP), R0
+ ADDL arg2+4(FP), R0
+ RTS
+.P2
+will not be profiled; the first version above would be.
+Subroutines with peculiar state, such as system call routines,
+should not be profiled.
+.PP
+Setting the 2 bit allows multiple definitions of the same
+.CW TEXT
+symbol in a program; the loader will place only one such function in the image.
+This bit was emitted only by the Alef compilers.
+.PP
+Subroutines to be called from C should place their result in
+.CW R0 ,
+even if it is an address.
+Floating point values are returned in
+.CW F0 .
+Functions that return a structure to a C program
+receive as their first argument the address of the location to
+store the result;
+.CW R0
+is unused in the calling protocol for such procedures.
+A caller is responsible for saving any registers it needs preserved across a call,
+so a subroutine is free to use any registers without saving them (``caller saves'').
+.CW A6
+and
+.CW A7
+are the exceptions as described above.
+.SH
+When in doubt
+.PP
+If you get confused, try using the
+.CW -S
+option to
+.CW 2c
+and compiling a sample program.
+The standard output is valid input to the assembler.
+.SH
+Instructions
+.PP
+The instruction set of the assembler is not identical to that
+of the machine.
+It is chosen to match what the compiler generates, augmented
+slightly by specific needs of the operating system.
+For example,
+.CW 2a
+does not distinguish between the various forms of
+.CW MOVE
+instruction: move quick, move address, etc. Instead the context
+does the job. For example,
+.P1
+ MOVL $1, R1
+ MOVL A0, R2
+ MOVW SR, R3
+.P2
+generates official
+.CW MOVEQ ,
+.CW MOVEA ,
+and
+.CW MOVESR
+instructions.
+A number of instructions do not have the syntax necessary to specify
+their entire capabilities. Notable examples are the bitfield
+instructions, the
+multiply and divide instructions, etc.
+For a complete set of generated instruction names (in
+.CW 2a
+notation, not Motorola's) see the file
+.CW /sys/src/cmd/2c/2.out.h .
+Despite its name, this file contains an enumeration of the
+instructions that appear in the intermediate files generated
+by the compiler, which correspond exactly to lines of assembly language.
+.PP
+The MC68000 assembler,
+.CW 1a ,
+is essentially the same, honoring the appropriate subset of the instructions
+and addressing modes.
+The definitions of these are, nonetheless, part of
+.CW 2.out.h .
+.SH
+Laying down instructions
+.PP
+The loader modifies the code produced by the assembler and compiler.
+It folds branches,
+copies short sequences of code to eliminate branches,
+and discards unreachable code.
+The first instruction of every function is assumed to be reachable.
+The pseudo-instruction
+.CW NOP ,
+which you may see in compiler output,
+means no instruction at all, rather than an instruction that does nothing.
+The loader discards all
+.CW NOP 's.
+.PP
+To generate a true
+.CW NOP
+instruction, or any other instruction not known to the assembler, use a
+.CW WORD
+pseudo-instruction.
+Such instructions on RISCs are not scheduled by the loader and must have
+their delay slots filled manually.
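+.PP
+As a sketch for the MIPS, assuming the standard encodings
+.CW 0x00000027
+for
+.CW "nor $0,$0,$0"
+and
+.CW 0x0000000f
+for
+.CW sync
+(check both against the processor manual before relying on them):
+.P1
+ WORD $0x00000027 /* a true no-op, hand-encoded */
+ WORD $0x0000000f /* sync, hand-encoded */
+.P2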
+.SH
+MIPS
+.PP
+The registers are only addressed by number:
+.CW R0
+through
+.CW R31 .
+.CW R29
+is the stack pointer;
+.CW R30
+is used as the static base pointer, the analogue of
+.CW A6
+on the 68020.
+Its value is the address of the global symbol
+.CW setR30(SB) .
+The register holding returned values from subroutines is
+.CW R1 .
+When a function is called, space for the first argument
+is reserved at
+.CW 0(FP)
+but in C (not Alef) the value is passed in
+.CW R1
+instead.
+.PP
+The loader uses
+.CW R28
+as a temporary. The system uses
+.CW R26
+and
+.CW R27
+as interrupt-time temporaries. Therefore none of these registers
+should be used in user code.
+.PP
+The control registers are not known to the assembler.
+Instead they are numbered registers
+.CW M0 ,
+.CW M1 ,
+etc.
+Use this trick to access, say,
+.CW STATUS :
+.P1
+#define STATUS 12
+ MOVW M(STATUS), R1
+.P2
+.PP
+Floating point registers are called
+.CW F0
+through
+.CW F31 .
+By convention,
+.CW F24
+must be initialized to the value 0.0,
+.CW F26
+to 0.5,
+.CW F28
+to 1.0, and
+.CW F30
+to 2.0;
+this is done by the operating system.
+.PP
+The instructions and their syntax are different from those of the manufacturer's
+manual.
+There are no
+.CW lui
+and kin; instead there are
+.CW MOVW
+(move word),
+.CW MOVH
+(move halfword),
+and
+.CW MOVB
+(move byte) pseudo-instructions. If the operand is unsigned, the instructions
+are
+.CW MOVHU
+and
+.CW MOVBU .
+The order of operands is from left to right in dataflow order, just as
+on the 68020 but not as in MIPS documentation.
+This means that the
+.CW Bcond
+instructions are reversed with respect to the book; for example, a
+.CW va
+.CW BGTZ
+generates a MIPS
+.CW bltz
+instruction.
+.PP
+The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
+It understands the 64-bit instructions
+.CW MOVV ,
+.CW MOVVL ,
+.CW ADDV ,
+.CW ADDVU ,
+.CW SUBV ,
+.CW SUBVU ,
+.CW MULV ,
+.CW MULVU ,
+.CW DIVV ,
+.CW DIVVU ,
+.CW SLLV ,
+.CW SRLV ,
+and
+.CW SRAV .
+The assembler does not have any cache, load-linked, or store-conditional instructions.
+.PP
+Some assembler instructions are expanded into multiple instructions by the loader.
+For example the loader may convert the load of a 32 bit constant into an
+.CW lui
+followed by an
+.CW ori .
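+.PP
+As an illustration (the constant is arbitrary, and the exact expansion is the loader's choice), a move such as
+.P1
+ MOVW $0x12345678, R1
+.P2
+might be laid down as the pair of MIPS instructions
+.P1
+ lui $1, 0x1234
+ ori $1, $1, 0x5678
+.P2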
+.PP
+Assembler instructions should be laid out as if there
+were no load, branch, or floating point compare delay slots;
+the loader will rearrange\(em\f2schedule\f1\(emthe instructions
+to guarantee correctness and improve performance.
+The only exception is that the correct scheduling of instructions
+that use control registers varies from model to model of machine
+(and is often undocumented) so you should schedule such instructions
+by hand to guarantee correct behavior.
+The loader generates
+.P1
+ NOR R0, R0, R0
+.P2
+when it needs a true no-op instruction.
+Use exactly this instruction when scheduling code manually;
+the loader recognizes it and schedules the code before it and after it independently. Also,
+.CW WORD
+pseudo-ops are scheduled like no-ops.
+.PP
+The
+.CW NOSCHED
+pseudo-op disables instruction scheduling
+(scheduling is enabled by default);
+.CW SCHED
+re-enables it.
+Branch folding, code copying, and dead code elimination are
+disabled for instructions that are not scheduled.
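+.PP
+A hand-scheduled sequence might be bracketed as follows.
+This is a sketch only: it reuses the
+.CW STATUS
+definition shown earlier, and the number of no-ops needed after a control register access is an assumption that depends on the processor model.
+.P1
+ NOSCHED
+ MOVW M(STATUS), R1
+ NOR R0, R0, R0 /* delay slot filled by hand */
+ SCHED
+.P2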
+.SH
+SPARC
+.PP
+Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
+Registers have numerical names only:
+.CW R0
+through
+.CW R31 .
+Forget about register windows: Plan 9 doesn't use them at all.
+The machine has 32 global registers, period.
+.CW R1
+[sic] is the stack pointer.
+.CW R2
+is the static base register, with value the address of
+.CW setSB(SB) .
+.CW R7
+is the return register and also the register holding the first
+argument to a C (not Alef) function, again with space reserved at
+.CW 0(FP) .
+.CW R14
+is the loader temporary.
+.PP
+Floating-point registers are exactly as on the MIPS.
+.PP
+The control registers are known by names such as
+.CW FSR .
+The instructions to access these registers are
+.CW MOVW
+instructions, for example
+.P1
+ MOVW Y, R8
+.P2
+for the SPARC instruction
+.P1
+ rdy %r8
+.P2
+.PP
+Move instructions are similar to those on the MIPS: pseudo-operations
+that turn into appropriate sequences of
+.CW sethi
+instructions, adds, etc.
+Instructions read from left to right. Because the arguments are
+flipped to
+.CW SUBCC ,
+the condition codes are not inverted as on the MIPS.
+.PP
+Accesses through an alternate address space identifier (ASI) name the ASI after the register; for example, to move a word from ASI 2:
+.P1
+ MOVW (R7, 2), R8
+.P2
+The syntax for double indexing is
+.P1
+ MOVW (R7+R8), R9
+.P2
+.PP
+The SPARC's instruction scheduling is similar to the MIPS's.
+The official no-op instruction is:
+.P1
+ ORN R0, R0, R0
+.P2
+.SH
+i386
+.PP
+The assembler assumes 32-bit protected mode.
+The register names are
+.CW SP ,
+.CW AX ,
+.CW BX ,
+.CW CX ,
+.CW DX ,
+.CW BP ,
+.CW DI ,
+and
+.CW SI .
+The stack pointer (not a pseudo-register) is
+.CW SP
+and the return register is
+.CW AX .
+There is no physical frame pointer but, as for the MIPS,
+.CW FP
+is a pseudo-register that acts as
+a frame pointer.
+.PP
+Opcode names are mostly the same as those listed in the Intel manual
+with an
+.CW L ,
+.CW W ,
+or
+.CW B
+appended to identify 32-bit,
+16-bit, and 8-bit operations.
+The exceptions are loads, stores, and conditionals.
+All load and store opcodes to and from general registers, special registers
+(such as
+.CW CR0 ,
+.CW CR3 ,
+.CW GDTR ,
+.CW IDTR ,
+.CW SS ,
+.CW CS ,
+.CW DS ,
+.CW ES ,
+.CW FS ,
+and
+.CW GS )
+or memory are written
+as
+.P1
+ MOV\f2x\fP src,dst
+.P2
+where
+.I x
+is
+.CW L ,
+.CW W ,
+or
+.CW B .
+Thus to get
+.CW AL
+use a
+.CW MOVB
+instruction. If you need to access
+.CW AH ,
+you must mention it explicitly in a
+.CW MOVB :
+.P1
+ MOVB AH, BX
+.P2
+There are many examples of illegal moves, for example,
+.P1
+ MOVB BP, DI
+.P2
+that the loader actually implements as pseudo-operations.
+.PP
+The names of conditions in all conditional instructions
+.CW J , (
+.CW SET )
+follow the conventions of the 68020 instead of those of the Intel
+assembler:
+.CW JOS ,
+.CW JOC ,
+.CW JCS ,
+.CW JCC ,
+.CW JEQ ,
+.CW JNE ,
+.CW JLS ,
+.CW JHI ,
+.CW JMI ,
+.CW JPL ,
+.CW JPS ,
+.CW JPC ,
+.CW JLT ,
+.CW JGE ,
+.CW JLE ,
+and
+.CW JGT
+instead of
+.CW JO ,
+.CW JNO ,
+.CW JB ,
+.CW JNB ,
+.CW JZ ,
+.CW JNZ ,
+.CW JBE ,
+.CW JNBE ,
+.CW JS ,
+.CW JNS ,
+.CW JP ,
+.CW JNP ,
+.CW JL ,
+.CW JNL ,
+.CW JLE ,
+and
+.CW JNLE .
+.PP
+The addressing modes have syntax like
+.CW AX ,
+.CW (AX) ,
+.CW (AX)(BX*4) ,
+.CW 10(AX) ,
+and
+.CW 10(AX)(BX*4) .
+The offsets from
+.CW AX
+can be replaced by offsets from
+.CW FP
+or
+.CW SB
+to access names, for example
+.CW extern+5(SB)(AX*2) .
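+.PP
+A short sketch combining these modes; the names
+.CW lookup ,
+.CW table ,
+and
+.CW i
+are hypothetical, and
+.CW table
+is assumed to be an external array of 32-bit words:
+.P1
+TEXT lookup(SB), $0
+ MOVL i+0(FP), CX /* first argument */
+ MOVL table+0(SB)(CX*4), AX /* AX = table[i] */
+ RET
+.P2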
+.PP
+Other notes: Non-relative
+.CW JMP
+and
+.CW CALL
+have a
+.CW *
+added to the syntax.
+Only
+.CW LOOP ,
+.CW LOOPEQ ,
+and
+.CW LOOPNE
+are legal loop instructions. Only
+.CW REP
+and
+.CW REPN
+are recognized repeaters. These are not prefixes, but rather
+stand-alone opcodes that precede the strings, for example
+.P1
+ CLD; REP; MOVSL
+.P2
+Segment override prefixes in
+.CW MOD/RM
+fields are not supported.
+.SH
+AMD64
+.PP
+The assembler's conventions are similar to those for the 386, above.
+The architecture provides extra fixed-point registers
+.CW R8
+to
+.CW R15 .
+All registers are 64 bit, but instructions access low-order 8, 16 and 32 bits
+as described in the processor handbook.
+For example,
+.CW MOVL
+to
+.CW AX
+puts a value in the low-order 32 bits and clears the top 32 bits to zero.
+Literal operands are limited to signed 32 bit values, which are sign-extended
+to 64 bits in 64 bit operations; the exception is
+.CW MOVQ ,
+which allows 64-bit literals.
+MMX registers are
+.CW M0
+to
+.CW M7 ,
+and
+XMM registers are
+.CW X0
+to
+.CW X15 .
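+.PP
+The following lines sketch the rules for literals and partial registers; the registers are chosen arbitrarily and the values in the comments follow from the rules just stated:
+.P1
+ MOVL $-1, AX /* AX = 0x00000000FFFFFFFF */
+ MOVQ $-1, AX /* AX = 0xFFFFFFFFFFFFFFFF */
+ ADDQ $-1, BX /* literal sign-extended to 64 bits */
+ MOVQ $0x123456789ABCDEF0, CX /* full 64-bit literal */
+.P2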
+.PP
+There are many new instructions, including the MMX and XMM media instructions,
+and conditional move instructions.
+As with the 386 instruction names,
+all new 64-bit integer instructions, and the MMX and XMM instructions
+uniformly use
+.CW L
+for `long word' (32 bits) and
+.CW Q
+for `quad word' (64 bits).
+Some instructions use
+.CW O
+(`octword') for 128-bit values, where the processor handbook
+variously uses
+.CW O
+or
+.CW DQ .
+The assembler also consistently uses
+.CW PL
+for `packed long' in
+XMM instructions, instead of
+.CW Q ,
+.CW DQ
+or
+.CW PI .
+Either
+.CW MOVL
+or
+.CW MOVQ
+can be used to move values to and from control registers, even when
+the registers might be 64 bits.
+The assembler often accepts the handbook's name to ease conversion
+of existing code (but remember that the operand order is uniformly
+source then destination).
+.PP
+C's
+.CW "long long"
+type is 64 bits, but passed and returned by value, not by reference.
+More notably, C pointer values are 64 bits, and thus
+.CW "long long"
+and
+.CW "unsigned long long"
+are the only integer types wide enough to hold a pointer value.
+The C compiler and library use the XMM floating-point instructions, not
+the old 387 ones, although the latter are implemented by assembler and loader.
+The compiler provides external registers,
+allocated from
+.CW R15
+down.
+.PP
+The calling conventions are different from the 386.
+.CW CALL
+pushes, and
+.CW RET
+pops a 64-bit return address on the stack.
+If the first argument is an integer or pointer, it is passed in a register, namely
+.CW BP
+(which can be referred to in assembly code by the pseudonym
+.CW RARG ).
+.CW AX
+holds the return value from subroutines as before.
+Floating-point results are returned in
+.CW X0 ,
+although currently the first parameter is not passed in a register if floating-point.
+All parameters less than 8 bytes in length have 8 byte slots reserved on the stack
+to preserve alignment and simplify variable-length argument list access,
+including the first parameter when passed in a register,
+although bytes 4 to 7 are not initialized.
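+.PP
+A sketch of a two-argument function under these conventions; the name
+.CW add
+and the parameter name
+.CW b
+are hypothetical, and both parameters are assumed to be 32-bit integers:
+.P1
+TEXT add(SB), $0
+ MOVL RARG, AX /* first argument, passed in BP */
+ ADDL b+8(FP), AX /* second argument, in its 8-byte slot */
+ RET
+.P2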
+.PP
+The assembler assumes 64-bit mode unless a
+.CW MODE
+pseudo-operation is given:
+.P1
+ MODE $32
+.P2
+to change to 32-bit mode.
+The effect is mainly to diagnose instructions that are illegal in
+the given mode, but the loader will also assume 32-bit operands and addresses,
+and 32-bit PC values for call and return.
+.SH
+Alpha
+.PP
+On the Alpha, all registers are 64 bits. The architecture handles 32-bit values
+by giving them a canonical format (sign extension in the case of integer registers).
+Registers are numbered
+.CW R0
+through
+.CW R31 .
+.CW R0
+holds the return value from subroutines, and also the first parameter.
+.CW R30
+is the stack pointer,
+.CW R29
+is the static base,
+.CW R26
+is the link register, and
+.CW R27
+and
+.CW R28
+are linker temporaries.
+.PP
+Floating point registers are numbered
+.CW F0
+to
+.CW F31 .
+.CW F28
+contains
+.CW 0.5 ,
+.CW F29
+contains
+.CW 1.0 ,
+and
+.CW F30
+contains
+.CW 2.0 .
+.CW F31
+is always
+.CW 0.0
+on the Alpha.
+.PP
+The extension character for
+.CW MOV
+follows DEC's notation:
+.CW B
+for byte (8 bits),
+.CW W
+for word (16 bits),
+.CW L
+for long (32 bits),
+and
+.CW Q
+for quadword (64 bits).
+Byte and ``word'' loads and stores may be made unsigned
+by appending a
+.CW U .
+.CW S
+and
+.CW T
+refer to IEEE floating point single precision (32 bits) and double precision (64 bits), respectively.
+.SH
+PowerPC
+.PP
+The PowerPC follows the Plan 9 model set by the MIPS and SPARC,
+not the elaborate ABIs.
+The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
+there is no support for the older POWER instructions.
+Registers are
+.CW R0
+through
+.CW R31 .
+.CW R0
+is initialized to zero; this is done by C start up code
+and assumed by the compiler and loader.
+.CW R1
+is the stack pointer.
+.CW R2
+is the static base register, with value the address of
+.CW setSB(SB) .
+.CW R3
+is the return register and also the register holding the first
+argument to a C function, with space reserved at
+.CW 0(FP)
+as on the MIPS.
+.CW R31
+is the loader temporary.
+The external registers in Plan 9's C are allocated from
+.CW R30
+down.
+.PP
+Floating point registers are called
+.CW F0
+through
+.CW F31 .
+By convention, several registers are initialized
+to specific values; this is done by the operating system.
+.CW F27
+must be initialized to the value
+.CW 0x4330000080000000
+(used by float-to-int conversion),
+.CW F28
+to the value 0.0,
+.CW F29
+to 0.5,
+.CW F30
+to 1.0, and
+.CW F31
+to 2.0.
+.PP
+As on the MIPS and SPARC, the assembler accepts arbitrary literals
+as operands to
+.CW MOVW ,
+and also to
+.CW ADD
+and others where `immediate' variants exist,
+and the loader generates sequences
+of
+.CW addi ,
+.CW addis ,
+.CW oris ,
+etc. as required.
+The register indirect addressing modes use the same syntax as the SPARC,
+including double indexing when allowed.
+.PP
+The instruction names are generally derived from the Motorola ones,
+subject to slight transformation:
+the
+.CW . ' `
+marking the setting of condition codes is replaced by
+.CW CC ,
+and when the letter
+.CW o ' `
+represents `OE=1' it is replaced by
+.CW V .
+Thus
+.CW add ,
+.CW addo.
+and
+.CW subfzeo.
+become
+.CW ADD ,
+.CW ADDVCC
+and
+.CW SUBFZEVCC .
+As well as the three-operand conditional branch instruction
+.CW BC ,
+the assembler provides pseudo-instructions for the common cases:
+.CW BEQ ,
+.CW BNE ,
+.CW BGT ,
+.CW BGE ,
+.CW BLT ,
+.CW BLE ,
+.CW BVC ,
+and
+.CW BVS .
+The unconditional branch instruction is
+.CW BR .
+Indirect branches use
+.CW "(CTR)"
+or
+.CW "(LR)"
+as target.
+.PP
+Load or store operations are replaced by
+.CW MOV
+variants in the usual way:
+.CW MOVW
+(move word),
+.CW MOVH
+(move halfword with sign extension), and
+.CW MOVB
+(move byte with sign extension, a pseudo-instruction),
+with unsigned variants
+.CW MOVHZ
+and
+.CW MOVBZ ,
+and byte-reversing
+.CW MOVWBR
+and
+.CW MOVHBR .
+`Load or store with update' versions are
+.CW MOVWU ,
+.CW MOVHU ,
+and
+.CW MOVBZU .
+Load or store multiple is
+.CW MOVMW .
+The exceptions are the string instructions, which are
+.CW LSW
+and
+.CW STSW ,
+and the reservation instructions
+.CW lwarx
+and
+.CW stwcx. ,
+which are
+.CW LWAR
+and
+.CW STWCCC ,
+all with operands in the usual data-flow order.
+Floating-point load or store instructions are
+.CW FMOVD ,
+.CW FMOVDU ,
+.CW FMOVS ,
+and
+.CW FMOVSU .
+The register to register move instructions
+.CW fmr
+and
+.CW fmr.
+are written
+.CW FMOVD
+and
+.CW FMOVDCC .
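+.PP
+A sketch of a test-and-set built on the reservation instructions; the name
+.CW tas
+and the labels are hypothetical, and a real implementation would also need the memory barrier instructions omitted here:
+.P1
+TEXT tas(SB), $0
+ MOVW R3, R4 /* first argument: address of the lock word */
+ MOVW $1, R5
+again:
+ LWAR (R4), R3 /* lwarx: load the word and set a reservation */
+ CMP R3, $0
+ BNE done /* already set: return the old value in R3 */
+ STWCCC R5, (R4) /* stwcx.: store only if the reservation holds */
+ BNE again /* reservation was lost: retry */
+done:
+ RETURN
+.P2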
+.PP
+The assembler knows the commonly used special purpose registers:
+.CW CR ,
+.CW CTR ,
+.CW DEC ,
+.CW LR ,
+.CW MSR ,
+and
+.CW XER .
+The rest, which are often architecture-dependent, are referenced as
+.CW SPR(n) .
+The segment registers of the 60x series are similarly
+.CW SEG(n) ,
+but
+.I n
+can also be a register name, as in
+.CW SEG(R3) .
+Moves between special purpose registers and general purpose ones,
+when allowed by the architecture,
+are written as
+.CW MOVW ,
+replacing
+.CW mfcr ,
+.CW mtcr ,
+.CW mfmsr ,
+.CW mtmsr ,
+.CW mtspr ,
+.CW mfspr ,
+.CW mftb ,
+and many others.
+.PP
+The fields of the condition register
+.CW CR
+are referenced as
+.CW CR(0)
+through
+.CW CR(7) .
+They are used by the
+.CW MOVFL
+(move field) pseudo-instruction,
+which produces
+.CW mcrf
+or
+.CW mtcrf .
+For example:
+.P1
+ MOVFL CR(3), CR(0)
+ MOVFL R3, CR(1)
+ MOVFL R3, $7, CR
+.P2
+They are also accepted in
+the conditional branch instruction, for example
+.P1
+ BEQ CR(7), label
+.P2
+Fields of the
+.CW FPSCR
+are accessed using
+.CW MOVFL
+in a similar way:
+.P1
+ MOVFL FPSCR, F0
+ MOVFL F0, FPSCR
+ MOVFL F0, $7, FPSCR
+ MOVFL $0, FPSCR(3)
+.P2
+producing
+.CW mffs ,
+.CW mtfsf ,
+or
+.CW mtfsfi
+as appropriate.
+.SH
+ARM
+.PP
+The assembler provides access to
+.CW R0
+through
+.CW R14
+and the
+.CW PC .
+The stack pointer is
+.CW R13 ,
+the link register is
+.CW R14 ,
+and the static base register is
+.CW R12 .
+.CW R0
+is the return register and also the register holding
+the first argument to a subroutine.
+The assembler supports the
+.CW CPSR
+and
+.CW SPSR
+registers.
+It also knows about coprocessor registers
+.CW C0
+through
+.CW C15 .
+Floating registers are
+.CW F0
+through
+.CW F7 ,
+.CW FPSR
+and
+.CW FPCR .
+.PP
+As with the other architectures, loads and stores are called
+.CW MOV ,
+e.g.
+.CW MOVW
+for load word or store word, and
+.CW MOVM
+for
+load or store multiple,
+depending on the operands.
+.PP
+Addressing modes are supported by suffixes to the instructions:
+.CW .IA
+(increment after),
+.CW .IB
+(increment before),
+.CW .DA
+(decrement after), and
+.CW .DB
+(decrement before).
+These can only be used with the
+.CW MOV
+instructions.
+The move multiple instruction,
+.CW MOVM ,
+defines a range of registers using brackets, e.g.
+.CW [R0-R12] .
+The special
+.CW MOVM
+addressing mode bits
+.CW W ,
+.CW U ,
+and
+.CW P
+are written in the same manner, for example,
+.CW MOVM.DB.W .
+A
+.CW .S
+suffix allows a
+.CW MOVM
+instruction to access user
+.CW R13
+and
+.CW R14
+when in another processor mode.
+Shifts and rotates in addressing modes are supported by binary operators
+.CW <<
+(logical left shift),
+.CW >>
+(logical right shift),
+.CW ->
+(arithmetic right shift), and
+.CW @>
+(rotate right); for example
+.CW "R7>>R2"
+or
+.CW "R2@>2" .
+The assembler does not support indexing by a shifted expression;
+only names can be doubly indexed.
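+.PP
+For example, a sketch of saving and restoring registers on the stack and of a shifted operand (the choice of registers is arbitrary):
+.P1
+ MOVM.DB.W [R0-R3], (R13) /* push R0 through R3 */
+ MOVM.IA.W (R13), [R0-R3] /* pop them again */
+ ADD R1<<2, R0 /* R0 += R1*4 */
+.P2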
+.PP
+Any instruction can be followed by a suffix that makes the instruction conditional:
+.CW .EQ ,
+.CW .NE ,
+and so on, as in the ARM manual, with synonyms
+.CW .HS
+(for
+.CW .CS )
+and
+.CW .LO
+(for
+.CW .CC ),
+for example
+.CW ADD.NE .
+Arithmetic
+and logical instructions
+can have a
+.CW .S
+suffix, as ARM allows, to set condition codes.
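+.PP
+A sketch of both kinds of suffix in a small loop; the label and the registers are arbitrary:
+.P1
+loop:
+ SUB.S $1, R2 /* decrement R2 and set the condition codes */
+ ADD.NE $4, R0 /* executed only if the result was non-zero */
+ BNE loop
+.P2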
+.PP
+The syntax of the
+.CW MCR
+and
+.CW MRC
+coprocessor instructions is largely as in the manual, with the usual adjustments.
+The assembler directly supports only the ARM floating-point coprocessor
+operations used by the compiler:
+.CW CMP ,
+.CW ADD ,
+.CW SUB ,
+.CW MUL ,
+and
+.CW DIV ,
+all with
+.CW F
+or
+.CW D
+suffix selecting single or double precision.
+Floating-point load or store become
+.CW MOVF
+and
+.CW MOVD .
+Conversion instructions are also specified by moves:
+.CW MOVWD ,
+.CW MOVWF ,
+.CW MOVDW ,
+.CW MOVFW ,
+.CW MOVFD ,
+and
+.CW MOVDF .
binary files /dev/null b/doc/asm.pdf differ
binary files /dev/null b/doc/backmatter.pdf differ
--- /dev/null
+++ b/doc/bltj.ms
@@ -1,0 +1,1073 @@
+.TL
+The Inferno Operating System
+.AU
+Sean Dorward
+Rob Pike
+David Leo Presotto
+Dennis M. Ritchie
+Howard Trickey
+Phil Winterbottom
+.AI
+Computing Science Research Center
+Lucent Technologies, Bell Labs
+Murray Hill, New Jersey
+USA
+.FS
+.FA
+Originally appeared in the
+.I "Bell Labs Technical Journal" ,
+Vol. 2, No. 1, Winter 1997, pp. 5-18.
+.br
+Minor revisions have been made by Vita Nuova to reflect subsequent changes to Inferno.
+.br
+Copyright © 1997 Lucent Technologies Inc. All rights reserved.
+.FE
+.AB
+Inferno is an operating system for creating and supporting distributed services.
+It was originally developed by the Computing Science Research Center of Bell Labs, the R&D arm of Lucent Technologies, and
+further developed by other groups in Lucent.
+.LP
+Inferno was designed specifically as a commercial product, both for licensing
+in the marketplace and for use within new Lucent offerings.
+It encapsulates many years of Bell Labs research in operating systems, languages, on-the-fly compilers, graphics, security, networking and portability.
+.AE
+.SH
+Introduction
+.LP
+Inferno is intended to be used in a variety of network environments, for example those supporting
+advanced telephones, hand-held devices, TV set-top boxes attached to cable or satellite systems, and inexpensive Internet computers, but also in conjunction with traditional computing systems.
+.LP
+The most visible new environments involve cable television, direct satellite broadcast, the Internet, and other networks. As the entertainment, telecommunications, and computing industries converge and interconnect, a variety of public data networks are emerging, each potentially as useful and profitable as the telephone system. Unlike the telephone system, which started with standard terminals and signaling, these networks are developing in a world of diverse terminals, network hardware, and protocols. Only a well-designed, economical operating system can insulate the various providers of content and services from the equally varied transport and presentation
+platforms. Inferno is a network operating system for this new world.
+.LP
+Inferno's definitive strength lies in its portability and versatility across several dimensions:
+.IP •
+Portability across processors: it currently runs on Intel, Sparc, MIPS, ARM, HP-PA, and PowerPC architectures and is readily portable to others.
+.IP •
+Portability across environments: it runs as a stand-alone operating system on small terminals, and also as a user application under Windows NT, Windows 95, Unix (Irix, Solaris, FreeBSD, Linux, AIX, HP/UX) and Plan 9. In all of these environments, Inferno applications see an identical interface.
+.IP •
+Distributed design: the identical environment is established at the user's terminal and at the server, and each may import the resources (for example, the attached I/O devices or networks) of the other. Aided by the communications facilities of the run-time system, applications may be split easily (and even dynamically) between client and server.
+.IP •
+Minimal hardware requirements: it runs useful applications stand-alone on machines with as little as 1 MB of memory, and does not require memory-mapping hardware.
+.IP •
+Portable applications: Inferno applications are written in the type-safe language Limbo, whose binary representation is identical over all platforms.
+.IP •
+Dynamic adaptability: applications may, depending on the hardware or other resources available, load different program modules to perform a specific function. For example, a video player application might use any of several different decoder modules.
+.LP
+Underlying the design of Inferno is a model of the diversity of application areas it intends to stimulate. Many providers are interested in purveying media and services: telephone network service providers, WWW servers, cable companies, merchants, various information providers.
+There are many connection technologies: ordinary telephone modems, ISDN, ATM, the Internet, analog broadcast or cable TV, cable modems, digital video on demand, and other interactive TV systems.
+.LP
+Applications more clearly related to Lucent's current and planned product offerings include
+control of switches and routers, and the associated operations system facilities needed to support them.
+For example, Inferno software controls an IP switch/router for voice and data being
+developed by Lucent's Bell Labs research and Network Systems organizations.
+An Inferno-based firewall (Signet) is being used to secure outside access to the Research
+Internet connection.
+.LP
+Finally, there are existing or potential hardware endpoints. Some are in consumers' homes: PCs,
+game consoles, newer set-top boxes. Some are inside the networks themselves: nodes for billing, network monitoring or provisioning. The higher ends of these spectra, epitomized by fully interactive TV with video on demand, may be fascinating, but have developed more slowly than expected. One reason is the cost of the set-top box, especially its memory requirements. Portable terminals, because of weight and cost considerations, are similarly constrained.
+.LP
+Inferno is parsimonious enough in its resource requirements to support interesting applications on today's hardware, while being versatile enough to grow into the future. In particular, it enables developers to create applications that will work across a range of facilities. An example: an interactive shopping catalog that works in text mode over a POTS modem, shows still pictures (perhaps with audio) of the merchandise over ISDN, and includes video clips over digital cable.
+.LP
+Clearly not everyone who deploys an Inferno-based solution will want to span the whole range of possibilities, but the system architecture should be constrained only by the desired markets and the available interconnection and server technologies, not by the software.
+.SH
+Inferno interfaces
+.LP
+The role of the Inferno system is to
+.I "create"
+several standard interfaces for its applications:
+.IP •
+Applications use various resources internal to the system, such as a consistent virtual machine that runs the application programs, together with library modules that perform services ranging from simple string manipulation to more sophisticated graphics services for dealing with text, pictures,
+higher-level toolkits, and video.
+.IP •
+Applications exist in an external environment containing resources such as data files that can be read and manipulated, together with objects that are named and manipulated like files but are more active. Devices (for example a hand-held remote control, an MPEG decoder or a network interface) present themselves to the application as files.
+.IP •
+Standard protocols exist for communication within and between separate machines running Inferno, so that applications can cooperate.
+.LP
+At the same time, Inferno
+.I uses
+interfaces supplied by an existing environment, either bare hardware or standard operating systems and protocols.
+.LP
+Most typically, an Inferno-based service would consist of many relatively cheap terminals running Inferno as a native system, and a smaller number of large machines running Inferno as a hosted system. On these server machines Inferno might interface to databases, transaction systems, existing OA&M facilities, and other resources provided under the native operating system. The Inferno applications themselves would run either on the client or server machines, or both.
+.SH
+External Environment of Inferno Applications
+.LP
+The purpose of most Inferno applications is to present information or media to the user; thus applications must locate the information sources in the network and construct a local representation of them. The information flow is not one-way: the user's terminal (whether a network computer, TV set-top, PC, or videophone) is also an information source and its devices represent resources to applications. Inferno draws heavily on the design of the Plan 9 operating system [1] in the way it presents resources to these applications.
+.LP
+The design has three principles.
+.IP •
+All resources are named and accessed like files in a forest of hierarchical file systems.
+.IP •
+The disjoint resource hierarchies provided by different services are joined together into a single private hierarchical
+.I "name space" .
+.IP •
+A communication protocol, called
+.I "Styx" ,
+is applied uniformly to access these resources, whether local or remote.
+.LP
+In practice, most applications see a fixed set of files organized as a directory tree. Some of the files contain ordinary data, but others represent more active resources. Devices are represented as files, and device drivers (such as a modem, an MPEG decoder, a network interface, or the TV screen) attached to a particular hardware box present themselves as small directories. These directories typically contain two files,
+.CW "data"
+and
+.CW "ctl" ,
+which respectively perform actual device input/output and control operations. System services also live behind file names. For example, an Internet domain name server might be attached to an agreed-upon name (say
+.CW "/net/dns" );
+after writing to this file a string representing a symbolic Internet domain name, a subsequent read from the file would return the corresponding numeric Internet address.
+.LP
+The glue that connects the separate parts of the resource name space together is the Styx protocol.
+Within an instance of Inferno, all the device drivers and other internal resources respond to the procedural version of Styx. The Inferno kernel implements a
+.I "mount driver"
+that transforms file system operations into remote procedure calls for transport over a network. On the other side of the connection, a server unwraps the Styx messages and implements them using resources local to it. Thus, it is possible to import parts of the name space (and thus resources) from other machines.
+.LP
+To extend the example above, it is unlikely that a set-top box would store the code needed for an Internet domain name-server within itself. Instead, an Internet browser would import the
+.CW "/net/dns"
+resource into its own name space from a server machine across a network.
+.LP
+The Styx protocol lies above and is independent of the communications transport layer; it is readily carried over TCP/IP, PPP, ATM or various modem transport protocols.
+.SH
+Internal Environment of Inferno Applications
+.LP
+Inferno applications are written in a new language called Limbo [2], which was designed specifically for the Inferno environment. Its syntax is influenced by C and Pascal, and it supports the standard data types common to them, together with several higher-level data types such as lists, tuples, strings, dynamic arrays, and simple abstract data types.
+.LP
+In addition, Limbo supplies several advanced constructs carefully integrated into the Inferno virtual machine. In particular, a communication mechanism called a
+.I "channel"
+is used to connect different Limbo tasks on the same machine or across the network.
+A channel transports typed data in a machine-independent fashion, so that complex data structures (including channels themselves) may be passed between Limbo tasks or attached to files in the name space for language-level communication between machines.
+.LP
+Multi-tasking is supported directly by the Limbo language: independently scheduled threads of control may be spawned, and an
+.CW "alt"
+statement is used to coordinate the channel communication
+between tasks (that is,
+.CW "alt"
+is used to select one of several channels that are ready to communicate).
+By building channels and tasks into the language and its virtual machine, Inferno encourages a communication style that is easy to use and safe.
+.LP
+Limbo programs are built of
+.I "modules" ,
+which are self-contained units with a well-defined interface
+containing functions (methods), abstract data types, and constants defined by the module and visible outside it. Modules are accessed dynamically; that is, when one module wishes to make use of another, it dynamically executes a
+.CW "load"
+statement naming the desired module, and uses a returned handle to access the new module.
+When the module is no longer in use, its storage and code will be released.
+The flexibility of the modular structure contributes to the smallness of typical Inferno applications, and also to their adaptability.
+For example, in the shopping catalog described above,
+the application's main module checks dynamically for the existence of the video resource.
+If it is unavailable, the video-decoder module is never loaded.
+.LP
+Limbo is fully type-checked at compile- and run-time; for example, pointers, besides being more
+restricted than in C, are checked before being dereferenced, and the type-consistency of a dynamically loaded module is checked when it is loaded. Limbo programs run safely on a machine
+without memory-protection hardware.
+Moreover, all Limbo data and program objects are subject to
+a garbage collector, built deeply into the Limbo run-time system. All system data objects are tracked by the virtual machine and freed as soon as they become unused. For example, if an application task creates a graphics window and then terminates, the window automatically disappears the instant the last reference to it has gone away.
+.LP
+Limbo programs are compiled into byte-codes representing instructions for a virtual machine called
+Dis. The architecture of the arithmetic part of Dis is a simple 3-address machine, supplemented with a few specialized operations for handling some of the higher-level data types like arrays and strings. Garbage collection is handled below the level of the machine language; the scheduling of tasks is similarly hidden. When loaded into memory for execution, the byte-codes are expanded
+into a format more efficient for execution; there is also an optional on-the-fly compiler that turns a Dis instruction stream into native machine instructions for the appropriate real hardware. This can be done efficiently because Dis instructions match well with the instruction-set architecture of today's machines. The resulting code executes at a speed approaching that of compiled C.
+.LP
+Underlying Dis is the Inferno kernel, which contains the interpreter and on-the-fly compiler as well as memory management, scheduling, device drivers, protocol stacks, and the like.
+The kernel also contains the core of the file system (the name evaluator and the code that turns file system operations into remote procedure calls over communications links) as well as the small file systems implemented internally.
+.LP
+Finally, the Inferno virtual machine implements several standard modules internally. These include
+.CW "Sys" ,
+which provides system calls and a small library of useful routines (e.g. creation of network connections, string manipulations). Module
+.CW "Draw"
+is a basic graphics library that handles raster graphics, fonts, and windows. Module
+.CW "Prefab"
+builds on
+.CW "Draw"
+to provide structured complexes containing images and text inside of windows; these elements may be scrolled, selected, and changed by the methods of
+.CW "Prefab" .
+Module
+.CW "Tk"
+is an all-new implementation of the Tk graphics toolkit [18], with a Limbo interface. A
+.CW "Math"
+module encapsulates the procedures for numerical programming.
+.SH
+The Environment of the Inferno System
+.LP
+Inferno creates a standard environment for applications. Identical application programs can run
+under any instance of this environment, even in distributed fashion, and see the same resources.
+Depending on the environment in which Inferno itself is implemented, there are several versions of the Inferno kernel, Dis/Limbo interpreter, and device driver set.
+.LP
+When running as the native operating system, the kernel includes all the low-level glue (interrupt handlers, graphics and other device drivers) needed to implement the abstractions presented to applications.
+For a hosted system, for example under Unix, Windows NT or Windows 95, Inferno runs as a set of ordinary processes.
+Instead of mapping its device-control functionality to real hardware,
+it adapts to the resources provided by the operating system under which it runs.
+For example, under Unix, the graphics library might be implemented using the X window system and the networking using the socket interface; under Windows, it uses the native Windows graphics and Winsock calls.
+.LP
+Inferno is, to the extent possible, written in standard C and most of its components are independent of the many operating systems that can host it.
+.SH
+Security in Inferno
+.LP
+Inferno provides security of communication, resource control, and
+system integrity.
+.LP
+Each external communication channel may be transmitted in the clear,
+accompanied by message digests to prevent corruption, or encrypted to
+prevent corruption and interception. Once communication is set up,
+the encryption is transparent to the application. Key exchange is
+provided through standard public-key mechanisms; after key exchange,
+message digesting and line encryption likewise use standard symmetric
+mechanisms.
+.LP
+Inferno is secure against erroneous or malicious applications, and
+encourages safe collaboration between mutually suspicious service
+providers and clients. The resources available to applications appear
+exclusively in the name space of the application, and standard
+protection modes are available. This applies to data, to
+communication resources, and to the executable modules that constitute
+the applications. Security-sensitive resources of the system are
+accessible only by calling the modules that provide them; in
+particular, adding new files and servers to the name space is
+controlled and is an authenticated operation. For example, if the
+network resources are removed from an application's name space, then
+it is impossible for it to establish new network connections.
+.LP
+Object modules may be signed by trusted authorities who guarantee
+their validity and behavior, and these signatures may be checked by
+the system when the modules are accessed.
+.LP
+Although Inferno provides a rich variety of authentication and security
+mechanisms, as detailed below, few application programs need to
+be aware of them or explicitly include coding to make use of them.
+Most often, access to resources across a secure communications link
+is arranged in advance by the larger system in which the application operates.
+For example, when a client system uses a server system
+and connection authentication or link encryption is appropriate,
+the server resources will most naturally be supplied
+as a part of the application's name space.
+The communications channel that carries the Styx protocol
+can be set to authenticate or encrypt; thereafter,
+all use of the resource is automatically protected.
+.SH
+Security mechanisms
+.LP
+Authentication and digital signatures are performed using
+public key cryptography. Public keys are certified by
+Inferno-based or other certifying authorities that sign the public keys with their
+own private key.
+.LP
+Inferno uses encryption for:
+.IP •
+mutual authentication of communicating parties;
+.IP •
+authentication of messages between these parties; and
+.IP •
+encryption of messages between these parties.
+.LP
+The cryptographic algorithms provided by Inferno
+include the SHA, MD4, and MD5 secure hashes;
+Elgamal public key signatures and signature verification [4];
+RC4 encryption;
+DES encryption;
+and public key exchange based on the Diffie-Hellman scheme.
+The public key signatures use keys with moduli up to 4096 bits,
+512 bits by default.
+.LP
+There is no generally accepted national or international authority
+for storing or generating public or private encryption keys.
+Thus Inferno includes tools for using or implementing a trusted authority,
+but it does not itself provide the authority,
+which is an administrative function.
+Consequently, an organization using Inferno (or any other security
+and key-distribution scheme) must design its system to suit its
+own needs, and in particular decide whom to trust as a Certifying
+Authority (CA). However, the Inferno design is sufficiently flexible
+and modular to accommodate the protocols likely to be attractive in practice.
+.LP
+The certifying authority that signs a user's
+public key determines the size of the key and the public key
+algorithm used. Tools provided with
+Inferno use these signatures for authentication. Library
+interfaces are provided for Limbo programs to sign and verify
+signatures.
+.LP
+Generally, authentication is performed using public key cryptography. Parties
+register by having their public keys signed by the certifying authority (CA).
+The signature covers a secure hash (SHA, MD4, or MD5) of
+the name of the party, the party's public key, and an expiration time. The signature,
+which contains the name of the signer, along with the signed information,
+is termed a
+.I "certificate" .
+.LP
+When parties communicate, they use the Station to Station protocol[5] to
+establish the identities of the two parties and to create a mutually known secret.
+This STS protocol uses the Diffie-Hellman algorithm [6] to create this shared
+secret.
+The protocol is protected against replay attacks by choosing new random
+parameters for each conversation. It is secured against `man in
+the middle' attacks by having the parties exchange certificates and then
+digitally sign key parts of the protocol. To masquerade as another
+party an attacker would have to be able to forge that party's signature.
+.SH
+Line Security
+.LP
+A network conversation can be secured against modification alone
+or against both modification and snooping. To secure against
+modification, Inferno can append a secure MD5 or SHA hash (called a digest),
+.P1
+hash(secret, message, messageid)
+.P2
+to each message.
+.I "Messageid"
+is a 32-bit number that starts at 0 and is incremented by
+one for each message sent. Thus messages cannot be
+changed, removed, reordered, or inserted into the stream without knowing
+the secret or breaking the secure hash algorithm.
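+.LP
+As a minimal sketch of such a digest, the following Limbo program hashes the secret, the message, and a message number in sequence.
+It assumes the
+.CW "sha1"
+function and
+.CW "SHA1dlen"
+constant as declared in
+.CW "keyring.m"
+in later Inferno editions (worth checking against the installed include file). The function
+.CW "digest"
+and the four-byte big-endian encoding of the message number are illustrative assumptions, not the layout used by Inferno's own line-security code.
+.P1
+implement Digest;
+
+include "sys.m";
+    sys: Sys;
+include "draw.m";
+include "keyring.m";
+    kr: Keyring;
+
+Digest: module
+{
+    init: fn(ctxt: ref Draw->Context, argv: list of string);
+};
+
+# digest hashes secret, msg, and id in that order.
+# The big-endian encoding of id is an assumption of this sketch.
+digest(secret, msg: array of byte, id: int): array of byte
+{
+    idbuf := array[4] of byte;
+    idbuf[0] = byte (id>>24);
+    idbuf[1] = byte (id>>16);
+    idbuf[2] = byte (id>>8);
+    idbuf[3] = byte id;
+
+    d := array[Keyring->SHA1dlen] of byte;
+    # a nil digest argument means more data is still to come
+    state := kr->sha1(secret, len secret, nil, nil);
+    state = kr->sha1(msg, len msg, nil, state);
+    # the final call stores the finished digest in d
+    kr->sha1(idbuf, len idbuf, d, state);
+    return d;
+}
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+    sys = load Sys Sys->PATH;
+    kr = load Keyring Keyring->PATH;
+    if(kr == nil){
+        sys->fprint(sys->fildes(2), "digest: can't load Keyring: %r\en");
+        return;
+    }
+    d := digest(array of byte "secret", array of byte "hello", 0);
+    for(i := 0; i < len d; i++)
+        sys->print("%.2x", int d[i]);
+    sys->print("\en");
+}
+.P2
+.LP
+Because the message number is part of the hashed data, a receiver that keeps its own count will compute a different digest for a replayed or reordered message.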
+.LP
+To secure against snooping, Inferno supports encryption of the complete conversation
+using either RC4 or DES, the latter in either cipher block chaining (DESCBC)
+or electronic code book (DESECB) mode.
+.LP
+Inferno uses the same encapsulation format as Netscape's Secure Sockets Layer [7].
+It is possible to encapsulate
+a message stream in multiple encapsulations to provide varying degrees of
+security.
+.SH
+Random Numbers
+.LP
+The strength of cryptographic algorithms depends in part on the strength
+of the random numbers
+used for choosing keys, Diffie-Hellman parameters, initialization vectors, etc.
+Inferno achieves this in two steps: a slow (100 to 200 bit
+per second) random bit stream comes from sampling the low order bits of a
+free running counter whenever a clock ticks. The clock must be unsynchronized,
+or at least poorly synchronized, with the counter. This generator is then used to
+alter the state of a faster pseudo-random number generator.
+Both the slow and fast generators were tested on a number of architectures
+using self correlation, random walk, and repeatability tests.
+.SH
+Introduction to Limbo
+.LP
+Limbo is the application programming language for the Inferno operating system. Although Limbo looks syntactically like C, it has a number of features that make it easier to use, safer, and more suited to the heterogeneous, networked Inferno environment: a rich set of basic types, strong typing, garbage collection, concurrency, communications, and modules. Limbo may be interpreted or compiled `just in time' for efficient, portable execution.
+.LP
+This paper introduces the language by studying an example of a complete, useful Limbo program. The program illustrates general programming as well as aspects of concurrency, graphics, module loading, and other features of Limbo and Inferno.
+.SH
+The problem
+.LP
+Our example program is a stripped-down version of the Inferno[14] program
+.CW "view" ,
+which displays graphical image files on the screen, one per window. This version sacrifices some functionality, generality, and error-checking but performs the basic job. The files may be in either GIF[12, 13] or JPEG[19] format and must be converted before display, or they may already be in the Inferno standard format that needs no conversion.
+.CW "View"
+`sniffs' each file to determine what processing it requires, maps the colors if necessary, creates a new window, and copies the converted image to it. Each window is given a title bar across the top to identify it and hold the buttons to move and delete the window.
+.SH
+The Source
+.LP
+Here is the complete Limbo source for our version of
+.CW "view" ,
+annotated with line numbers for easy reference (Limbo, of course, does not use line numbers). Subsequent sections explain the workings of the program. Although the program is too large to absorb as a first example without some assistance, it's worth skimming before moving to the next section, to get an idea of the style of the language. Control syntax derives from C[11], while declaration syntax comes from the Pascal family of languages[17]. Limbo borrows features from a number of languages (e.g., tuples on lines 45 and 48) and introduces a few new ones (e.g., explicit module loading on lines 88 and 90).
+.P1
+ 1  implement View;
+.P3
+ 2  include "sys.m";
+ 3      sys: Sys;
+.P3
+ 4  include "draw.m";
+ 5      draw: Draw;
+ 6      Rect, Display, Image: import draw;
+.P3
+ 7  include "bufio.m";
+.P3
+ 8  include "imagefile.m";
+.P3
+ 9  include "tk.m";
+10      tk: Tk;
+.P3
+11  include "wmlib.m";
+12      wmlib: Wmlib;
+.P3
+13  include "string.m";
+14      str: String;
+.P3
+15  View: module
+16  {
+17      init: fn(ctxt: ref Draw->Context,
+                  argv: list of string);
+18  };
+.P3
+19  init(ctxt: ref Draw->Context,
+          argv: list of string)
+20  {
+21      sys = load Sys Sys->PATH;
+22      draw = load Draw Draw->PATH;
+23      tk = load Tk Tk->PATH;
+24      wmlib = load Wmlib Wmlib->PATH;
+25      str = load String String->PATH;
+26      wmlib->init();
+.P3
+27      imageremap := load Imageremap
+                  Imageremap->PATH;
+28      bufio := load Bufio Bufio->PATH;
+.P3
+29      argv = tl argv;
+30      if(argv != nil
+         && str->prefix("-x ", hd argv))
+31          argv = tl argv;
+.P3
+32      viewer := 0;
+33      while(argv != nil){
+34          file := hd argv;
+35          argv = tl argv;
+.P3
+36          im := ctxt.display.open(file);
+37          if(im == nil){
+38              idec := filetype(file);
+39              if(idec == nil)
+40                  continue;
+.P3
+41              fd := bufio->open(file,
+                          Bufio->OREAD);
+42              if(fd == nil)
+43                  continue;
+.P3
+44              idec->init(bufio);
+45              (ri, err) := idec->read(fd);
+46              if(ri == nil)
+47                  continue;
+.P3
+48              (im, err) = imageremap->remap(
+                          ri, ctxt.display, 1);
+49              if(im == nil)
+50                  continue;
+51          }
+.P3
+52          spawn view(ctxt, im, file,
+                      viewer++);
+53      }
+54  }
+.P3
+55  view(ctxt: ref Draw->Context,
+          im: ref Image, file: string,
+          viewer: int)
+56  {
+57      corner := string(25+20*(viewer%5));
+.P3
+58      (nil, file) = str->splitr(file, "/");
+59      (t, menubut) := wmlib->titlebar(ctxt.screen,
+              " -x "+corner+" -y "+corner+
+              " -bd 2 -relief raised",
+              "View: "+file, Wmlib->Hide);
+.P3
+60      event := chan of string;
+61      tk->namechan(t, event, "event");
+.P3
+62      tk->cmd(t, "frame .im -height " +
+              string im.r.dy() +
+              " -width " +
+              string im.r.dx());
+63      tk->cmd(t, "bind . <Configure> "+
+              "{send event resize}");
+64      tk->cmd(t, "bind . <Map> "+
+              "{send event resize}");
+65      tk->cmd(t, "pack .im -side bottom"+
+              " -fill both -expand 1");
+66      tk->cmd(t, "update");
+.P3
+67      t.image.draw(posn(t), im, ctxt.display.ones, im.r.min);
+68      for(;;) alt{
+69      menu := <-menubut =>
+70          if(menu == "exit")
+71              return;
+72          wmlib->titlectl(t, menu);
+73      <-event =>
+74          t.image.draw(posn(t), im,
+                  ctxt.display.ones, im.r.min);
+75      }
+76  }
+.P3
+77  posn(t: ref Tk->Toplevel): Rect
+78  {
+79      minx := int tk->cmd(t,
+              ".im cget -actx");
+80      miny := int tk->cmd(t,
+              ".im cget -acty");
+81      maxx := minx + int tk->cmd(t,
+              ".im cget -actwidth");
+82      maxy := miny + int tk->cmd(t,
+              ".im cget -actheight");
+.P3
+83      return ((minx, miny), (maxx, maxy));
+84  }
+.P3
+85  filetype(file: string): RImagefile
+86  {
+87      if(len file>4
+         && file[len file-4:]==".gif")
+88          r := load RImagefile
+                  RImagefile->READGIFPATH;
+89      if(len file>4
+         && file[len file-4:]==".jpg")
+90          r = load RImagefile
+                  RImagefile->READJPGPATH;
+91      return r;
+92  }
+.P2
+.SH
+Modules
+.LP
+Limbo programs are composed of modules that are loaded and linked at run-time. Each Limbo source file is the implementation of a single module; here line 1 states that this file implements a module called
+.CW "View" ,
+whose declaration appears in the
+.CW "module"
+declaration on lines 15-18. The declaration states that the module has one publicly visible element, the function
+.CW "init" .
+Other functions and variables defined in the file will be compiled into the module but only accessible internally.
+.LP
+The function
+.CW "init"
+has a type signature (argument and return types) that makes it callable from the Inferno shell, a convention not made explicit here. The type of
+.CW "init"
+allows
+.CW "View"
+to be invoked by typing, for example,
+.P1
+view *.jpg
+.P2
+at the Inferno command prompt to view all the JPEG files in a directory. This interface is all that is required for the module to be callable from the shell; all programs are constructed from modules, and some modules are directly callable by the shell because of their type. In fact the shell invokes
+.CW "View"
+by loading it and calling
+.CW "init" ,
+not for example through the services of a system
+.CW "exec"
+function as in a traditional operating system.
+.LP
+Not all modules, of course, implement shell commands; modules are also used to construct libraries, services, and other program components. The module
+.CW "View"
+uses the services of other modules for I/O, graphics, file format conversion, and string processing. These modules are identified on lines 2-14. Each module's interface is stored in a public `include file' that holds a definition of a module much like lines 15-18 of the
+.CW "View"
+program. For example, here is an excerpt from the include file
+.CW "sys.m" :
+.P1
+Sys: module
+{
+    PATH: con "$Sys";
+
+    FD: adt    # File descriptor
+    {
+        fd: int;
+    };
+
+    OREAD: con 0;
+    OWRITE: con 1;
+    ORDWR: con 2;
+
+    open: fn(s: string, mode: int): ref FD;
+    print: fn(s: string, *): int;
+    read: fn(fd: ref FD, buf: array of byte, n: int): int;
+    write: fn(fd: ref FD, buf: array of byte, n: int): int;
+};
+.P2
+This defines a module type, called
+.CW "Sys" ,
+that has functions with familiar names like
+.CW "open"
+and
+.CW "print" ,
+constants like
+.CW "OREAD"
+to specify the mode for opening a file, an aggregate type
+.CW "adt" ) (
+called
+.CW "FD" ,
+returned by
+.CW "open" ,
+and a constant string called
+.CW "PATH" .
+.LP
+After including the definition of each module,
+.CW "View"
+declares variables to access the module. Line 3, for example, declares the variable
+.CW "sys"
+to have type
+.CW "Sys" ;
+it will be used to hold a reference to the implementation of the module. Line 6 imports a number of types from the
+.CW "draw"
+(graphics) module to simplify their use; this line states that the implementation of these types is by default to be that provided by the module referenced by the variable
+.CW "draw" .
+Without such an
+.CW "import"
+statement, calls to methods of these types would require explicit mention of the module providing the implementation.
+.LP
+Unlike most module languages, which resolve unbound references to modules automatically, Limbo requires explicit `loading' of module implementations.
+Although this requires more bookkeeping, it allows a program to have fine control over the loading (and unloading) of modules, an important property in the small-memory systems in which Inferno is intended to run.
+Also, it allows easy garbage collection of unused modules and allows multiple implementations to serve a single interface, a style of programming we will exploit in
+.CW "View" .
+.LP
+Declaring a module variable such as
+.CW "sys"
+is not sufficient to access a module; an implementation must also be loaded and bound to the variable. Lines 21-25 load the implementations of the standard modules used by
+.CW "View" .
+The
+.CW "load"
+operator, for example
+.P1
+sys = load Sys Sys->PATH;
+.P2
+takes a type
+.CW "Sys" ), (
+the file name of the implementation
+.CW "Sys->PATH" ), (
+and loads it into memory. If the implementation matches the specified type, a reference to the implementation is returned and stored in the variable
+.CW "sys" ). (
+If not, the constant
+.CW "nil"
+will be returned to indicate an error. Conventionally, the
+.CW "PATH"
+constant defined by a module names the default implementation. Because
+.CW "Sys"
+is a built-in module provided by the system, it has a special form of name; other modules'
+.CW "PATH"
+variables name files containing actual code. For example,
+.CW "Wmlib->PATH"
+is \f5"/dis/lib/wmlib.dis"\fP.
+Note, though, that the name of the implementation of the module in a
+.CW "load"
+statement can be any string.
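+.LP
+The
+.CW "View"
+listing does not test the results of its
+.CW "load"
+expressions. A minimal sketch of such a test, in a hypothetical module named
+.CW "Loadcheck" ,
+might look like this; the choice of
+.CW "Bufio"
+as the module to load is only for illustration.
+.P1
+implement Loadcheck;
+
+include "sys.m";
+    sys: Sys;
+include "draw.m";
+include "bufio.m";
+
+Loadcheck: module
+{
+    init: fn(ctxt: ref Draw->Context, argv: list of string);
+};
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+    sys = load Sys Sys->PATH;
+
+    # load yields nil if the file is missing or its type does not match
+    bufio := load Bufio Bufio->PATH;
+    if(bufio == nil){
+        # %r formats the most recent system error string
+        sys->fprint(sys->fildes(2), "can't load %s: %r\en", Bufio->PATH);
+        return;
+    }
+    sys->print("loaded %s\en", Bufio->PATH);
+}
+.P2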
+.LP
+Line 26 initializes the
+.CW "wmlib"
+module by invoking its
+.CW "init"
+function (unrelated to the
+.CW "init"
+of
+.CW "View" ).
+Note the use of the
+.CW "->"
+operator to access the member function of the module. The next two lines load modules, but add a new wrinkle: they also
+.I "declare"
+and
+.I "initialize"
+the module variables storing the reference. Limbo declarations have the general form
+.P1
+\fIvar\fP: \fItype\fP = \fIvalue\fP;
+.P2
+If the type is missing, it is taken to be the type of the value, so for example,
+.P1
+bufio := load Bufio Bufio->PATH;
+.P2
+on line 28 declares a variable of type
+.CW "Bufio"
+and initializes it to the result of the
+.CW "load"
+expression.
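+.LP
+The declaration forms can be compared side by side in a short sketch. The module name
+.CW "Decl"
+and the variable names are hypothetical and chosen only for illustration.
+.P1
+implement Decl;
+
+include "sys.m";
+    sys: Sys;
+include "draw.m";
+
+Decl: module
+{
+    init: fn(ctxt: ref Draw->Context, argv: list of string);
+};
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+    # sys was declared above, so this is a plain assignment
+    sys = load Sys Sys->PATH;
+
+    # explicit type and initial value
+    width: int = 25;
+    # type omitted: int and string are taken from the values
+    height := 20;
+    title := "View";
+
+    sys->print("%s: %d by %d\en", title, width, height);
+}
+.P2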
+.SH
+The main loop
+.LP
+The
+.CW "init"
+function takes two parameters, a graphics context,
+.CW "ctxt" ,
+for the program and a list of command-line argument strings,
+.CW "argv" .
+.CW "Argv"
+is a
+.CW "list"
+.CW "of"
+.CW "string" ;
+strings are a built-in type in Limbo and lists are a built-in form of constructor. Lists have several operations defined:
+.CW "hd"
+(head) returns the first element in the list,
+.CW "tl"
+(tail) the remainder after the head, and
+.CW "len"
+(length) the number of elements in the list.
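+.LP
+As a small sketch of these three operations, the following hypothetical module
+.CW "Args"
+prints the number of arguments it receives and then each argument after the program name.
+.P1
+implement Args;
+
+include "sys.m";
+    sys: Sys;
+include "draw.m";
+
+Args: module
+{
+    init: fn(ctxt: ref Draw->Context, argv: list of string);
+};
+
+init(nil: ref Draw->Context, argv: list of string)
+{
+    sys = load Sys Sys->PATH;
+
+    sys->print("%d arguments\en", len argv);
+    # skip the program name, if present
+    if(argv != nil)
+        argv = tl argv;
+    # walk the list: hd is the current element, tl the rest
+    for(; argv != nil; argv = tl argv)
+        sys->print("%s\en", hd argv);
+}
+.P2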
+.LP
+Line 29 throws away the first element of
+.CW "argv" ,
+which is conventionally the name of the program being invoked by the shell, and lines 30-31 ignore a geometry argument passed by the window system. The loop from lines 33 to 53 processes each file named in the remaining arguments; when
+.CW "argv"
+is a
+.CW "nil"
+list, the loop is complete. Line 34 picks off the next file name and line 35 updates the list.
+.LP
+Line 36 is the first method call we have seen:
+.P1
+im := ctxt.display.open(file);
+.P2
+The parameter
+.CW "ctxt"
+is an
+.CW "adt"
+that contains all the relevant information for the program to access its graphics environment. One of its elements, called
+.CW "display" ,
+represents the connection to the frame buffer on which the program may write. The
+.CW "adt"
+.CW "display"
+(whose type is imported on line 6) has a member function
+.CW "open"
+that reads a named image file into the memory associated with the frame buffer, returning a reference to the new image. (In X[20] terminology,
+.CW "display"
+represents a connection to the server and
+.CW "open"
+reads a pixmap from a file and instantiates it on that server.)
+.LP
+The
+.CW "display.open"
+method succeeds only if the file exists and is in the standard Inferno image format. If it fails, it will return
+.CW "nil"
+and lines 38-50 will attempt to convert the file into the right form.
+.SH
+Decoding the file
+.LP
+Line 38 calls
+.CW "filetype"
+to determine what format the file has. The simple version here, on lines 85-92, just looks at the file suffix to determine the type. A realistic implementation would work harder, but even this version illustrates the utility of program-controlled loading of modules.
+.LP
+The decoding interface for an image file format is specified by the module type
+.CW "RImagefile" .
+However, unlike the other modules we have looked at,
+.CW "RImagefile"
+has a number of implementations. If the file is a GIF file,
+.CW "filetype"
+returns the implementation of
+.CW "RImagefile"
+that decodes GIFs; if it is a JPEG file,
+.CW "filetype"
+returns an implementation that decodes JPEGs. In either case, the
+.CW "read"
+method has the same interface. Since reference variables like
+.CW "r"
+are implicitly initialized to
+.CW "nil" ,
+that is what
+.CW "filetype"
+will return if it does not recognize the image format.
+.LP
+Thus,
+.CW "filetype"
+accepts a file name and returns the implementation of a module to decode it.
+.LP
+A couple of other points about
+.CW "filetype" .
+First, the expression
+.CW "file[len file-4:]"
+is a
+.I "slice"
+of the string
+.CW "file" ;
+it creates a string holding the last four characters of the file name. The colon separates the starting and ending indices of the slice; the missing second index defaults to the end of the string. As with lists,
+.CW "len"
+returns the number of characters (not bytes; Limbo uses Unicode[21] throughout) in the string.
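+.LP
+A brief sketch of string slicing follows; the module name
+.CW "Slice"
+and the file name are hypothetical.
+.P1
+implement Slice;
+
+include "sys.m";
+    sys: Sys;
+include "draw.m";
+
+Slice: module
+{
+    init: fn(ctxt: ref Draw->Context, argv: list of string);
+};
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+    sys = load Sys Sys->PATH;
+
+    file := "picture.gif";
+    # missing second index: slice runs to the end of the string
+    suffix := file[len file-4:];
+    # both indices given: everything before the suffix
+    stem := file[0:len file-4];
+
+    sys->print("%s %s %d\en", stem, suffix, len file);
+}
+.P2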
+.LP
+Second, and more important, this version of
+.CW "filetype"
+loads the decoder module anew every time it is called, which is clearly inefficient. It's easy to do better, though: just store the module in a global, as in this fragment:
+.P1
+readjpg: RImagefile;
+filetype(...)...
+{
+    if(isjpg()){
+        if(readjpg == nil)
+            readjpg = load RImagefile
+                RImagefile->READJPGPATH;
+        return readjpg;
+    }
+}
+.P2
+The program can form its own policies on loading and unloading modules based on time/space or other tradeoffs; the system does not impose its own.
+.LP
+Returning to the main loop, after the type of the file has been discovered, line 41 opens the file for I/O using the buffered I/O package. Line 44 calls the
+.CW "init"
+function of the decoder module, passing it the instance of the buffered I/O module being used. (If we were caching decoder modules, this call to
+.CW "init"
+would be done only when the decoder is first loaded.) Finally, the Limbo-characteristic line 45 reads in the file:
+.P1
+(ri, err) := idec->read(fd);
+.P2
+The
+.CW "read"
+method of the decoder does the hard job of cracking the image format, which is beyond the scope of this paper. The result is a
+.I "tuple" :
+a pair of values. The first element of the pair is the image, while the second is an error string. If all goes well,
+.CW "err"
+will be
+.CW "nil" ;
+if there is a problem, however,
+.CW "err"
+may be printed by the application to report what went wrong. The interesting property of this style of error reporting, common to Limbo programs, is that an error can be returned even if the decoding was successful (that is, even if
+.CW "ri"
+is non-
+.CW "nil" ).
+For example, the error may be recoverable, in which case it is worth returning the result but also worth reporting that an error did occur, leaving the application to decide whether to display the error or ignore it.
+.CW "View" "\ " (
+ignores it, for brevity.)
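+.LP
+The same convention is easy to follow in one's own functions. In this sketch the function
+.CW "half"
+and the module
+.CW "Tuple"
+are hypothetical; the point is only that a usable result and an error string can be returned together.
+.P1
+implement Tuple;
+
+include "sys.m";
+    sys: Sys;
+include "draw.m";
+
+Tuple: module
+{
+    init: fn(ctxt: ref Draw->Context, argv: list of string);
+};
+
+# half returns n/2 and, for odd n, a warning as well
+half(n: int): (int, string)
+{
+    if(n%2 != 0)
+        return (n/2, "odd value rounded down");
+    return (n/2, nil);
+}
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+    sys = load Sys Sys->PATH;
+
+    (h, err) := half(7);
+    # the caller decides whether to report the error
+    if(err != nil)
+        sys->fprint(sys->fildes(2), "half: %s\en", err);
+    sys->print("%d\en", h);
+}
+.P2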
+.LP
+In a similar manner, line 48 remaps the colors from the incoming colormap associated with the file to the standard Inferno color map. The result is an image ready to be displayed.
+.SH
+Creating a process
+.LP
+By line 52 in the main loop, we have an image ready in the variable
+.CW "im"
+and use the Limbo primitive
+.CW "spawn"
+to create a new process to display that image on the screen.
+.CW "Spawn"
+operates on a function call, creating a new process to execute that function. The process doing the spawning, here the main loop, continues immediately, while the new process begins execution in the specified function with the specified parameters. Thus line 52 begins a new process in the function
+.CW "view"
+with arguments the graphics context, the image to display, the file name, and a unique identification number used in placing the windows.
+.LP
+The new process shares with the calling process all variables except the stack. Shared memory can therefore be used to communicate between them; for synchronization, a more sophisticated mechanism is needed, a subject we will cover in the section on communications.
+.SH
+Starting Tk
+.LP
+The function
+.CW "view"
+uses the Inferno Tk graphics toolkit (a re-implementation for Limbo of Ousterhout's Tcl/Tk toolkit [18]) to place the image on the screen in a new window. Line 57 computes the position of the corner of the window, using the viewer number to stagger the positions of successive windows. The
+.CW "string"
+keyword is a conversion; in this example the conversion does an automatic translation from an integer expression into a decimal representation of the number. Thus
+.CW "corner"
+is a string variable, a form more useful in the calls to the Tk library.
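+.LP
+The conversions work in both directions, as this sketch suggests. The module name
+.CW "Conv"
+and the viewer number are hypothetical.
+.P1
+implement Conv;
+
+include "sys.m";
+    sys: Sys;
+include "draw.m";
+
+Conv: module
+{
+    init: fn(ctxt: ref Draw->Context, argv: list of string);
+};
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+    sys = load Sys Sys->PATH;
+
+    viewer := 7;
+    # integer expression to its decimal representation
+    corner := string (25+20*(viewer%5));
+    # decimal string back to an integer
+    n := int corner + 1;
+
+    sys->print("-x %s -y %s (%d)\en", corner, corner, n);
+}
+.P2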
+.LP
+The Inferno Tk implementation uses Limbo as its controlling language.
+Rather than building a rich procedural interface, the interface passes strings to a generic Tk command processor, which returns strings as results.
+This is similar to the use of Tk within Tcl, but with most of the control flow, arithmetic, and so on written in Limbo.
+.LP
+A good introduction to the style is the function
+.CW "posn"
+on lines 77-84. The calls to
+.CW "tk->cmd"
+evaluate the textual command in the context defined by the
+.CW "Tk->Toplevel"
+variable
+.CW "t"
+(created on line 59 and passed to
+.CW "posn" );
+the result is a decimal integer, converted to binary by the explicit
+.CW "int"
+conversion. On line 83, all the coordinates of the rectangle are known, and the function returns a nested tuple defining the rectangular position of the
+.CW ".im"
+component of the Toplevel. This tuple is automatically promoted to the
+.CW "Rect"
+type by the return statement.
+.LP
+Back in function
+.CW "view" ,
+line 58 uses a function from the higher-level
+.CW "String"
+module to strip the directory prefix from the file name, leaving the basename for use in the banner of the window. Note that one component of the tuple is nil; the value of this component is discarded.
+Line 59 calls the window manager function
+.CW "wmlib->titlebar"
+to establish a title bar on the window.
+The arguments are
+.CW "ctxt.screen" ,
+a data structure representing the window stack on the frame buffer,
+a string specifying the position and properties of the new window, the window's
+label, and the set of control buttons required.
+The
+.CW "+"
+operator on strings performs concatenation.
+The window is labelled \f5"View: "\fP
+followed by the file basename, with a control button to hide the window.
+Titlebars always include a control button to dismiss the window.
+(The position and properties argument is more commonly nil or the empty string,
+leaving the choice of position and style to the window manager.)
+The first value
+in the tuple returned by
+.CW "wmlib->titlebar"
+is a reference to a `top-level' widget (a window) upon which the program will assemble its display.
+.SH
+Communications
+.LP
+The second value in the tuple
+returned from
+.CW "wmlib->titlebar"
+is a built-in Limbo type called a channel
+.CW "chan" "" (
+is the keyword). A channel is a communications mechanism in the manner of Hoare's CSP[15]. Two processes that wish to communicate do so using a shared channel; data sent on the channel by one process may be received by another process. The communication is
+.I "synchronous" :
+both processes must be ready to communicate before the data changes hands, and if one is not ready the other blocks until it is. Channels are a feature of the Limbo language: they have a declared type
+.CW "chan" "" (
+.CW "of"
+.CW "int" ,
+.CW "chan"
+.CW "of"
+.CW "list"
+.CW "of"
+.CW "string" ,
+etc.) and only data of the correct type may be sent. There is no restriction on what may be sent; one may even send a channel on a channel. Channels therefore serve both to communicate and to synchronize.
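+.LP
+A minimal sketch of processes communicating and synchronizing over a channel follows. The module
+.CW "Spawnchan"
+and the function
+.CW "worker"
+are hypothetical; each spawned process reports its number on a shared channel, and the main process waits for all of them.
+.P1
+implement Spawnchan;
+
+include "sys.m";
+    sys: Sys;
+include "draw.m";
+
+Spawnchan: module
+{
+    init: fn(ctxt: ref Draw->Context, argv: list of string);
+};
+
+worker(id: int, done: chan of int)
+{
+    sys->print("worker %d running\en", id);
+    # blocks until init is ready to receive
+    done <-= id;
+}
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+    sys = load Sys Sys->PATH;
+
+    done := chan of int;
+    for(i := 0; i < 3; i++)
+        spawn worker(i, done);
+    # one receive per worker; the order of arrival is not fixed
+    for(i = 0; i < 3; i++)
+        sys->print("worker %d finished\en", <-done);
+}
+.P2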
+.LP
+Channels are used throughout Inferno to provide interfaces to system functions. The threading and communications primitives in Limbo are not designed to implement efficient multicomputer algorithms, but rather to provide an elegant way to build active interfaces to devices and other programs.
+.LP
+One example is the
+.CW "menubut"
+channel returned by
+.CW "wmlib->titlebar" ,
+a channel of textual commands sent by the window manager. The expression on line 69,
+.P1
+menu := <-menubut
+.P2
+receives the next message on the channel and assigns it to the variable menu. The communications operator,
+.CW "<-" ,
+receives a datum when prefixed to a channel and transmits a datum when combined with an assignment operator (e.g.
+.CW "channel<-=2" ).
+This use of menubut appears inside an
+.CW "alt"
+(alternation) statement, a construct we'll discuss later.
+.LP
+Lines 60 and 61 create and register a new channel,
+.CW "event" ,
+to be used by the Tk module to report user interface events. Lines 62-66 use simple Tk operations to make the window in which the image may be drawn. Lines 63 and 64 bind events within this window to messages to be sent on the channel
+.CW "event" .
+For example, line 63 defines that when the configuration of the window is changed, presumably by actions of the window manager, the string
+\f5"resize"\fP
+is to be transmitted on
+.CW "event"
+for interpretation by the application. This translation of events into messages on explicit channels is fundamental to the Limbo style of programming.
+.SH
+Displaying the image
+.LP
+The payoff occurs on line 67, which steps outside the Tk model to draw the image
+.CW "im"
+directly on the window:
+.P1
+t.image.draw(posn(t), im, ctxt.display.ones, im.r.min);
+.P2
+.CW "Posn"
+calculates where on the screen the image is to go. The
+.CW "draw"
+method is the fundamental graphics operation in Inferno, whose design is outside our scope here. In this statement, it just copies the pixels from
+.CW "im"
+to the window's own image,
+.CW "t.image" ;
+the argument
+.CW "ctxt.display.ones"
+is a mask that selects every pixel.
+.SH
+Multi-way communications
+.LP
+Once the image is on the screen,
+.CW "view"
+waits for any changes in the status of the window. Two things may happen: either the buttons on the title bar may be used, in which case a message will appear on
+.CW "menubut" ,
+or a configuration or mapping operation will apply to the window, in which case a message will appear on
+.CW "event" .
+.LP
+The Limbo
+.CW "alt"
+statement provides control when more than one communication may proceed. Analogous to a
+.CW "case"
+statement, the
+.CW "alt"
+evaluates a set of expressions and executes the statements associated with the correct expression. Unlike a
+.CW "case" ,
+though, the expressions in an
+.CW "alt"
+must each be a communication, and the
+.CW "alt"
+will execute the statements associated with the communication that can first proceed. If none can proceed, the
+.CW "alt"
+waits until one can; if more than one can proceed, it chooses one randomly.
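+.LP
+The behaviour just described can be sketched in isolation as follows. The module name
+.CW "Altdemo" ,
+the function
+.CW "sender" ,
+and the channel names are assumptions made for this sketch; it is not the loop from
+.CW "view" .
+.P1
+implement Altdemo;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+
+Altdemo: module
+{
+	init: fn(nil: ref Draw->Context, nil: list of string);
+};
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+	a := chan of string;
+	b := chan of string;
+	spawn sender(a, "from a");
+	spawn sender(b, "from b");
+	# take two messages, in whichever order they become ready
+	for(i := 0; i < 2; i++)
+		alt {
+		s := <-a =>
+			sys->print("a: %s\en", s);
+		s := <-b =>
+			sys->print("b: %s\en", s);
+		}
+}
+
+# hypothetical helper: sends one string and exits
+sender(c: chan of string, s: string)
+{
+	c <-= s;
+}
+.P2
+The order of the two lines of output is not fixed: it depends on which sender is ready first, and on the random choice when both are.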
+.LP
+Thus the loop on lines 68-75 processes the messages generated by the two classes of actions. When the window is moved or resized, line 73 will receive a \f5"resize"\fP
+message due to the bindings on lines 63 and 64. The message is discarded but the action of receiving it triggers the repainting of the newly placed window on line 74. Similarly, buttons on the title bar cause a message to be sent on
+.CW "menubut" ,
+and the value of that is examined to see if it is
+\f5"exit"\fP,
+which should be handled locally, or anything else, which can be passed on to the underlying library.
+.SH
+Cleanup
+.LP
+If the exit button is pushed, line 71 will return from
+.CW "view" .
+Since
+.CW "view"
+was the top-level function in this process, the process will exit, freeing all its resources. All memory, open file descriptors, windows, and other resources held by the process will be garbage collected when the return executes.
+.LP
+The Limbo garbage collector [16] uses a hybrid scheme that combines reference counting to reclaim memory the instant its last reference disappears with a real-time sweeping algorithm that runs as an idle-time process to reclaim unreferenced circular structures.
+The instant-free property means that system resources like file descriptors and windows can be tied to the collector for recovery as soon as they become unused; there is no pause until a sweeper discovers them.
+This property allows Inferno to run in smaller memory arenas than are required for efficient mark-and-sweep algorithms, as well as providing an extra level of programmer convenience.
+.SH
+Summary
+.LP
+Inferno supplies a rich environment for constructing distributed applications that are portable\-in fact identical\-even when running on widely divergent underlying hardware. Its unique advantage over other solutions is that it encompasses not only a virtual machine, but also a complete virtual operating system including network facilities.
+.SH
+Acknowledgment
+.LP
+The cryptographic elements of Inferno owe much
+to the cryptographic library of Lacy et al. [22].
+.SH
+References
+.LP
+.nr PS -1
+.nr VS -1
+.IP 1.
+R. Pike, D. Presotto, S. Dorward, B. Flandrena, K. Thompson, H. Trickey, and P. Winterbottom. ``Plan 9 from Bell Labs'',
+.I "J. Computing Systems"
+8:3, Summer 1995, pp. 221-254.
+.IP 2.
+S. Dorward, R. Pike, and P. Winterbottom. ``Programming in Limbo'',
+.I "IEEE Compcon 97 Proceedings" ,
+1997.
+.IP 3.
+J. K. Ousterhout.
+.I "Tcl and the Tk Toolkit" ,
+Addison-Wesley, 1994.
+.IP 4.
+T. Elgamal, ``A Public-Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms'',
+.I "Advances in Cryptology: Proceedings of CRYPTO 84, "
+Springer Verlag, 1985, pp. 10-18
+.IP 5.
+B. Schneier, ``Applied Cryptography'', Wiley, 1996, p. 516
+.IP 6.
+D. Stinson, ``Cryptography, Theory and Practice'',
+.I "CRC Press" ,
+1996, p. 271
+.IP 7.
+K. Hickman and T. Elgamal, ``The SSL Protocol (V3.0)'',
+.I "IETF Internet-draft"
+.IP 8.
+S. M. Bellovin and M. Merritt, ``Encrypted Key Exchange: Password-Based Protocols Secure Against Dictionary Attack'', Proceedings of the 1992 IEEE Computer Society Conference on Research in Security and Privacy, 1992, pp. 72-84
+.IP 9.
+M. Blaze, J. Feigenbaum, J. Lacy, ``Decentralized Trust Management'',
+.I "Proceedings 1996 IEEE Symposium on Security and Privacy" ,
+May 1996
+.IP 10.
+R. Rivest and B. Lampson, ``SDSI - A Simple Distributed Security Architecture'', unpublished,
+.I "http://theory.lcs.mit.edu/~rivest/sdsi10.ps"
+.IP 11.
+.I "American National Standard for Information Systems Programming Language C" ,
+American National Standards Institute, X3.159-1989.
+.IP 12.
+.I "GIF Graphics Interchange Format: A standard defining a mechanism for the storage and transmission of bitmap-based graphics information" ,
+CompuServe Incorporated, Columbus, OH, 1987.
+.IP 13.
+.I "GIF Graphics Interchange Format: Version 89a" ,
+CompuServe Incorporated, Columbus, OH, 1990.
+.IP 14.
+S. Dorward et al., ``Inferno'',
+.I "IEEE Compcon 97 Proceedings" ,
+1997.
+.IP 15.
+C. A. R. Hoare, ``Communicating Sequential Processes''.
+.I "Comm. ACM"
+21:8, pp. 666-677, 1978.
+.IP 16.
+L. Huelsbergen and P. Winterbottom, ``Very Concurrent Mark & Sweep Garbage Collection without Fine-Grain Synchronization'', submitted to
+.I "International Conference on Functional Programming" ,
+Amsterdam, 1997.
+.IP 17.
+K. Jensen and N. Wirth,
+.I "Pascal User Manual and Report" .
+Springer-Verlag, 1974.
+.IP 18.
+John K. Ousterhout,
+.I "Tcl and the Tk Toolkit" ,
+Addison-Wesley, 1994.
+.IP 19.
+W. B. Pennebaker and J. L. Mitchell,
+.I "JPEG Still Image Data Compression" ,
+Van Nostrand Reinhold, New York, 1992.
+.IP 20.
+R. W. Scheifler, J. Gettys, and R. Newman,
+.I "X Window System" ,
+Digital Press, 1988.
+.IP 21.
+The Unicode Consortium,
+.I "The Unicode Standard, Version 2.0, "
+Addison Wesley, 1996.
+.IP 22.
+J. B. Lacy, D. P. Mitchell, and W. M. Schell, ``CryptoLib: Cryptography in Software,''
+.I "UNIX Security Symposium IV Proceedings" ,
+USENIX Association, 1993 pp. 1-17.
+.nr PS +1
+.nr VS +1
binary files /dev/null b/doc/bltj.pdf differ
--- /dev/null
+++ b/doc/changes.ms
@@ -1,0 +1,2053 @@
+.\"<-xtx-*> tbl changes.ms | troff -ms | lp -d stdout
+.FP palatino
+.ps 9
+.nr PS 9
+.vs 11
+.nr VS 11
+.nr dP 1
+.nr dV 1p
+.nr dT 4m
+.nr XT 4
+.TL
+System and Interface Changes to Inferno
+.AU
+C H Forsyth
+.br
+Vita Nuova
+.br
+forsyth@vitanuova.com
+.br
+9 June 2003
+.SH
+Overview
+.LP
+This paper describes some of the changes made to Inferno
+interfaces as they stood in the published Third Edition manuals,
+to form the current Fourth Edition of the system,
+and the broad effects on internal and external interfaces.
+Changes include: extensions to the Limbo language;
+new instructions in Dis and the virtual machine; extra content
+in Dis object files; structure of the source tree; configuration of
+.CW emu ;
+replacement of the window system with changes to the client interface;
+commands renamed, replaced, and removed;
+revised support for network booting;
+9P2000 becomes the basis for Styx;
+a graphics model offering alpha-blended compositing and general pixel structure;
+and improvements to Tk.
+.NH 1
+Limbo
+.LP
+Exceptions and fixed point have been added to the Limbo language.
+They are described in more detail in separate notes by John Firth,
+shortly to be available on the Vita Nuova web site
+.CW www.vitanuova.com .
+Channels can now be buffered.
+A form of polymorphism is now available in Limbo.
+.NH 2
+Exceptions
+.LP
+Discussion of exceptions will be restricted here to implications for existing source code.
+The most obvious changes are that
+.CW Sys->rescue ,
+.CW Sys->rescued ,
+.CW Sys->unrescue
+and
+.CW Sys->raise
+have vanished.
+Instead the exception handling is expressed using constructions in the Limbo language.
+Named exceptions can be declared and used (these are described in the note by Firth), and
+they are declared as part of the type of functions that raise them.
+There is also a general `failure' exception that effectively subsumes the old
+.CW Sys->rescue
+scheme, including run-time errors such as `out of memory' that can happen in almost any function.
+Unlike named exceptions, a `failure' exception can be raised or caught by any function,
+and its value is a string.
+The
+.CW raise
+statement raises an exception.
+This is most obvious in commands that wish to produce an `exit status'.
+Instead of
+.P1
+sys->raise("fail:usage");
+.P2
+one must now write
+.P1
+raise "fail:usage";
+.P2
+(That is one of the more common source changes required to Third Edition Limbo commands,
+since that was the most common use of exceptions before.)
+A block can have an
+.CW exception
+handler:
+.P1
+{
+ a := array[128] of byte;
+ dosomething(a);
+} exception e {
+"out of memory:*" =>
+ sys->print("i need more space: %s\en", e);
+"fail:*" =>
+ sys->print("exit status: %s\en", e);
+"*" =>
+ sys->print("unexpected error: %s\en", e);
+ raise; # propagate it
+}
+.P2
+If an exception is raised during the execution of the block (including functions it calls),
+execution of the block is abandoned, and control transfers to the appropriate exception handler
+(which is outside the block).
+Because the compiler and run-time system know the scope of the exception,
+values such as
+.CW a
+above are correctly reclaimed on exit from the faulty block.
+Unhandled failures are propagated to callers; unhandled named exceptions (currently) become failures.
+.LP
+A process group can cause unhandled exceptions in any process in the group either to
+propagate to all members of the group, or to be propagated to the process group leader
+after destroying the other processes in the group.
+This makes it easier to program recovery from exceptions within a group of concurrent processes.
+For instance, if a process is expected to send to another on a channel, but fails unexpectedly instead
+(eg, because memory was exhausted),
+instead of leaving the intended recipient blocked on a receive operation, it can be sent
+an exception to notify it of the failure of the other process, allowing it to take appropriate recovery action.
+(This could sometimes be programmed using the
+.CW wait
+file of
+.I prog (3),
+but not always.)
+.LP
+Exception handling is intended for recovering from disaster.
+We still think it is better Limbo style
+to use tuples, channels and processes to make ordinary error handling explicit.
+The few attempts to use failure exceptions to achieve `pretty' but peculiar control flow have had exactly the usual
+effect of making the code hard to follow and error-prone.
+.NH 2
+Channels
+.LP
+Buffered channels have been added:
+.P1
+c := chan [N] of int;
+.P2
+where
+.I N
+is an integer value,
+creates a channel that will allow up to
+.I N
+integer values to be sent to it, with no intervening receive, without blocking the sender.
+If
+.I N
+is zero, the channel is unbuffered, equivalent to plain
+.CW "chan of int" ,
+and synchronises sender and receiver as before.
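+.LP
+As a small sketch of the difference (the module name
+.CW Bufdemo
+is invented for illustration), a single process can send to a buffered channel and then receive from it itself, which would block forever on an unbuffered channel:
+.P1
+implement Bufdemo;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+
+Bufdemo: module
+{
+	init: fn(nil: ref Draw->Context, nil: list of string);
+};
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+	c := chan[2] of int;
+	c <-= 1;	# buffer has room: no receiver is needed
+	c <-= 2;	# buffer is now full; a third send would block
+	a := <-c;
+	b := <-c;
+	sys->print("%d %d\en", a, b);
+}
+.P2
+The values are received in the order in which they were sent.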
+.LP
+The restriction that a given channel value could not be sent to or received from in two
+.CW alt
+statements simultaneously has been removed.
+.NH 2
+Polymorphism
+.LP
+John Firth has implemented a form of parametric polymorphism in Limbo.
+It too will be described in a separate note.
+Currently we are still fussing over aspects of the constraint syntax
+and some other implications of the most general form, and since some aspects are
+therefore subject to change, including syntax, we have not yet published the details.
+We think it is possible to use the following subset without having to change the code later:
+.IP 1.
+Function declarations can be parametrised by one or more type variables.
+For example:
+.RS
+.P1
+reverse[T](l: list of T): list of T
+{
+ rl: list of T;
+ for(; l != nil; l = tl l)
+ rl = hd l :: rl;
+ return rl;
+}
+.P2
+Such a function can then be invoked on any compatible set of values.
+The function invocation does not specify the type (the compiler does type unification on the parameters).
+Thus the above can be used as:
+.P1
+l1: list of string;
+l2: list of ref Item;
+l3: list of list of string;
+l1 = reverse(l1);
+l2 = reverse(l2);
+l3 = reverse(l3);
+.P2
+.RE
+.IP 2.
+ADTs can also be parametrised:
+.P1
+Tree: adt[T] {
+ v: T;
+ l, r: cyclic ref Tree[T];
+};
+.P2
+allowing declaration of
+.CW "Tree[ref Item]"
+and
+.CW "Tree[string]"
+for instance.
+.IP 3.
+Values of the parametrised type can only be declared, assigned, passed as parameters, returned,
+or sent down channels.
+The only types that can be used as actual parameter types are reference types (ie,
+.CW ref " ADT,"
+.CW array ,
+.CW chan ,
+.CW list
+and
+.CW module ),
+and
+.CW string
+(which is a value type but is implemented using a reference).
+At some point we shall allow a function such as
+.CW reverse
+above to be invoked with any compatible type (not just reference types) but
+that requires changes to Dis and the virtual machine not yet made.
+.LP
+The formal type parameters can be further constrained by listing a set
+of operations that they must have (which currently implies the actual parameters
+must be ADT types with compatible operations).
+We are not completely happy with the current constraint syntax, and some other
+aspects of the scheme, and so that will
+be described here later once we have settled it.
+.NH 1
+Dis and virtual machine
+.LP
+To support the Limbo changes and extensions, some new operators were added to
+the virtual machine.
+(We also added a
+.CW casel
+operator to allow
+.CW case
+statements to work on
+.CW big
+values.)
+Modules that have exception handlers also have a (new) exception table,
+added to the Dis object format.
+Furthermore, we moved the import table used by the
+.CW load
+operator out of the Dis data space into the object format
+(which also makes it available for inspection by
+.CW wm/rt
+amongst others).
+.LP
+There is now an internal interface to set conditions under
+which modules must be signed to be loaded, and to check a signature on a module.
+Appropriate stubs are defined when module signing is not configured; if
+.I sign (3)
+is configured, however, it replaces them by ones that enforce its signing policy.
+.NH 1
+Window manager
+.LP
+The window manager
+.I wm (1)
+has been reimplemented by Roger Peppe.
+It now multiplexes pointer and keyboard input to applications,
+and manages windows on the display.
+.I Tk (2)
+no longer manages windows from inside the kernel.
+In some ways the structure is closer to that of
+.I mux (1)
+and more specifically the design described in Rob Pike's paper ``A Concurrent Window System''.
+It is possible to import and export window system environments between hosts.
+.LP
+This is one of the bigger causes of source file changes, although many of them
+can be done by global substitutions (eg, using
+.I acme (1)).
+Appendix A gives details.
+.CW Wmlib
+is no longer the application's interface to the window system.
+Instead that is done through a new
+.CW Tkclient
+module; see
+.I tkclient (2).
+(It uses a different
+.CW Wmlib
+as an auxiliary module,
+and also uses a new
+.CW Titlebar
+module to allow the look of the window decoration to be changed more easily).
+An application acquires a window by a call to
+.CW Tkclient->toplevel ;
+starts pointer or keyboard input if desired by calling
+.CW Tkclient->startinput ;
+and puts the window on screen (after sending it Tk configuration commands)
+using
+.CW Tkclient->onscreen .
+Nothing appears on screen until that is called (which amongst other things avoids the resizing on start up that afflicted
+the original scheme).
+.CW Onscreen
+gives it a connection to the window manager for pointer, keyboard and control input,
+with a separate channel for each.
+When it receives data from any of the channels
+(typically using
+.CW alt )
+it must pass it to Tk using calls to appropriate
+.CW Tkclient
+functions.
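+.LP
+The sequence above might look roughly as follows in a small client.
+This is a sketch, not a definitive implementation: the module name
+.CW Wmdemo
+and the label widget are invented, and the channel names and the calls
+.CW tk->keyboard ,
+.CW tk->pointer
+and
+.CW tkclient->wmctl
+are assumptions taken from the idiom in the released
+.I tkclient (2)
+page, which should be checked for the exact signatures.
+.P1
+implement Wmdemo;
+
+include "sys.m";
+include "draw.m";
+include "tk.m";
+	tk: Tk;
+include "tkclient.m";
+	tkclient: Tkclient;
+
+Wmdemo: module
+{
+	init: fn(ctxt: ref Draw->Context, nil: list of string);
+};
+
+init(ctxt: ref Draw->Context, nil: list of string)
+{
+	tk = load Tk Tk->PATH;
+	tkclient = load Tkclient Tkclient->PATH;
+	tkclient->init();
+
+	# acquire a window; nothing is visible yet
+	(top, wmctl) := tkclient->toplevel(ctxt, "", "Demo", Tkclient->Appl);
+	tk->cmd(top, "label .l -text {hello}; pack .l; update");
+
+	# put the window on screen, then ask for keyboard and pointer input
+	tkclient->onscreen(top, nil);
+	tkclient->startinput(top, "kbd" :: "ptr" :: nil);
+
+	for(;;) alt {
+	c := <-top.ctxt.kbd =>
+		tk->keyboard(top, c);
+	p := <-top.ctxt.ptr =>
+		tk->pointer(top, *p);
+	s := <-top.ctxt.ctl or
+	s = <-top.wreq or
+	s = <-wmctl =>
+		tkclient->wmctl(top, s);	# control requests, including exit
+	}
+}
+.P2
+Each of the three kinds of input arrives on its own channel, so one
+.CW alt
+is enough to serve them all.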
+.LP
+The toolbar used by the old
+.I wm
+is now provided by a separate program
+.CW wm/toolbar
+(see
+.I toolbar (1)),
+and it is
+.CW toolbar
+that interprets the
+.CW /lib/wmsetup
+file.
+.CW Wm
+invokes
+.CW wm/toolbar
+by default so most users will see no difference, but it does make it easier to develop alternative interfaces.
+More visible is that
+.CW wm/logon
+is now a
+.I client
+of the window manager, and must be invoked as follows:
+.P1
+wm/wm wm/logon
+.P2
+.LP
+Applications need not even use
+.I tk (2).
+There is an interface for
+.CW draw -only
+clients,
+.I wmclient (2).
+.NH 1
+Inferno source tree
+.LP
+The structure of the Inferno source tree has changed in the following ways.
+.NH 2
+Library source
+.LP
+The
+.CW image
+and
+.CW memimage
+directories have gone, replaced by
+.CW libdraw
+and
+.CW libmemdraw .
+The directories in the Inferno root that contain the source for libraries
+now
+always have names starting `\f5lib\f1':
+.CW libcrypt ,
+.CW libinterp ,
+.CW libkeyring ,
+.CW libmath ,
+etc.
+.NH 2
+Emu source
+.LP
+The
+.CW emu
+directory now contains a subdirectory structure similar to the
+.CW os
+kernels, and uses a similar configuration file (parts list) to say what goes in
+a given instance of
+.CW emu .
+This allows platform-dependent selection of drivers, libraries and even
+.CW #/
+(ie,
+.I root (3))
+contents to be done easily.
+.LP
+The top directory,
+.CW /emu ,
+contains:
+.CW mkfile
+that simply moves to the platform configured by
+.CW /mkconfig ,
+allowing builds in the Inferno root as before;
+a subdirectory
+.CW port
+containing portable code (including some code shared by several platforms, such as
+.CW devfs-posix.c );
+and a subdirectory for each hosting platform, distinguished by an upper-case initial letter.
+Current platforms include
+.CW FreeBSD ,
+.CW Irix ,
+.CW Linux ,
+.CW Nt
+(for all Windows platforms after 95),
+.CW Plan9 ,
+.CW Solaris ,
+and several others.
+.NH 2
+Emu configuration
+.LP
+Each platform-specific directory contains a configuration file with the
+same structure and indeed similar contents to the ones used for the native kernel.
+The default configuration file is called
+.CW emu .
+Another can be chosen, again in a similar way to the native kernel, by using
+.P1
+mk 'CONF=\fIcfile\fP'
+.P2
+where
+.I cfile
+is the name of the configuration file.
+The name of the resulting executable file contains the configuration file name but depends on the platform:
+it is \fIcfile\fP\f5.exe\fP on Windows, \f5o.\fP\fIcfile\fP on Unix systems, and \f58.\fP\fIcfile\fP on 386 Plan 9 systems.
+The configuration file format and contents are documented for all types of kernels by
+.I conf (10.6).
+.NH 2
+Tk source
+.LP
+The Tk implementation in
+.CW libtk
+has been made more modular.
+It allows a significantly different `style' to be implemented,
+and although that is by no means trivial to do, there is at least an interface to do it.
+We hope to change various aspects of the standard style further, but that has not yet been done.
+.NH 1
+Commands and modules
+.LP
+There are new commands and library modules, others have become obsolete and been removed,
+and a few existing ones have been given new names (typically when ones with similar function have been
+collected together).
+The biggest change has been to
+.I wm (1),
+which retains the same name but slightly different invocation and completely different
+implementation,
+as discussed above.
+Here I shall simply note the bigger changes, rather than discuss new functionality.
+.NH 2
+Renamed commands
+.LP
+As part of a mild reorganisation of the
+.CW /appl
+and
+.CW /dis
+trees, we have moved commands out of
+.CW /dis/lib
+so that it now contains only library modules except for a few commands left
+there temporarily for compatibility.
+Commands themselves have sometimes been shuffled to subdirectories,
+often copying seemingly better structure from Plan 9,
+so that authentication commands are
+.CW auth/ ...,
+naming service commands are
+.CW ndb/ ...,
+and
+IP-specific commands are
+.CW ip/ "... ."
+.LP
+One noticeable change is that
+.CW lib/cs
+is now
+.CW ndb/cs .
+More dramatically, the command
+.CW lib/srv
+(ie,
+.I srv (8))
+has been replaced by
+.I sh (1)
+scripts, all described by
+.I svc (8),
+that contain appropriate calls to
+.I listen (1)
+after setting up any locally-desired environment.
+.LP
+Other commands have also moved:
+.IP •
+.CW lib/plumber
+is now simply
+.CW plumber
+.IP •
+.CW lib/bootp
+and
+.CW lib/tftpd
+have become
+.CW ip/bootpd
+and
+.CW ip/tftpd ,
+documented in
+.I bootpd (8)
+.IP •
+.CW lib/virgild
+has become
+.CW ip/virgild
+(see
+.I virgild (8))
+.IP •
+.CW lib/chatsrv ,
+.CW lib/rdbgsrv
+and
+.CW cpuslave
+have moved to
+.CW auxi
+(ie,
+.CW /dis/auxi
+and
+.CW /appl/cmd/auxi )
+.IP •
+.CW csquery
+has become
+.CW ndb/csquery
+.NH 2
+New or newly-documented commands
+.IP •
+an authentication server (signer) can use
+.I keyfs (4)
+to store its keys securely in the encrypted file
+.CW /keydb/keys
+(instead of the unencrypted
+.CW /keydb/password ),
+and run
+.I keysrv (4)
+to offer secure change of password remotely.
+They are typically started, with other signing services, by
+.CW svc/auth
+described in
+.I svc (8).
+.IP •
+.CW /dis/auth
+and
+.CW /appl/cmd/auth
+contain commands related to authentication;
+they rely on
+.I keyfs (4)
+in most cases.
+The older ones that use
+.CW /keydb/passwd
+are still in
+.CW /dis/lib
+and
+.CW /appl/lib
+during the transition
+.IP •
+.I dns (8)
+has replaced the
+.CW lib/ipsrv
+implementation of
+.I srv (2);
+when used, it must be started before
+.CW ndb/cs .
+.I Srv (2)
+has reverted to being a hosted-only interface to the hosting system's native
+DNS resolver.
+It is automatically used by
+.I cs (8)
+if it cannot find
+.I dns (8),
+and
+.I dns (8)
+will also use it if available before consulting the DNS network.
+.IP •
+.I chgrp (1),
+.I cpuview (1),
+.I grid (1),
+.I 9660srv (4),
+.I cpuslave (4),
+.I dossrv (4),
+.I keyfs (4),
+.I keysrv (4),
+.I nsslave (4),
+.I palmsrv (4),
+.I registry (4),
+.I rioimport ,
+.I styxchat (1),
+.I styxlisten ,
+.I wmexport ,
+.I wmimport ,
+and
+.I uniq (1)
+are new
+.IP •
+the multiplayer games software previously in
+.CW /appl/games
+has been replaced by a related but significantly different system in
+.CW /appl/spree .
+(Also see
+.I spree (2)
+for supporting modules.)
+.IP •
+.I Registry (4)
+provides dynamic registration and location of services using sets of attribute/value pairs,
+through a name space.
+.I Registries (2)
+provides a convenient Limbo interface for registration and query.
+.NH 2
+Commands removed
+.IP •
+.CW lib/csget
+(see
+.I cs (8)
+for its replacement
+.CW csquery )
+.IP •
+the undocumented and obsolete commands
+.CW lib/isrv
+and
+.CW lib/istyxd
+have been removed, since either the
+.CW none
+authentication protocol, or the
+.CW -A
+option to
+.CW mount
+can be used if no authentication is needed
+.IP •
+.CW lib/srv
+has been replaced by
+.I svc (8)
+as mentioned above.
+.IP •
+.CW getenv
+and
+.CW setenv
+have been removed since the Shell provides alternatives
+.IP •
+.CW wm/license
+is no longer needed
+.NH 2
+New modules
+.LP
+There are library modules to support: registries and configuration files of attribute/value pairs;
+Internet address parsing and manipulation; management of windows and subwindows (used by
+.I wm (1)
+itself); timers; Styx; Styx servers; exception handling; memory
+and performance profiling; Freetype interface; parsing Palm databases; and navigating XML files (without reading them all into memory) and interpreting style sheets.
+.NH 1
+Styx
+.LP
+Styx was derived from the 9P protocol used by Plan 9 in 1995, with changes that reflected the requirements
+of the Inferno project of the time, mainly by removing features that were thought too closely tied to the Plan 9
+environment.
+Some 9P messages were removed, particularly those
+that incorporated details of the Plan 9 authentication methods;
+Styx moved authentication outside the file service protocol.
+Other changes eliminated file locking and append-only files.
+Some restrictions that 9P imposed were retained, however, such as limiting file names to 27 bytes.
+This last restriction is fine for synthetic network services, but
+has been troublesome when trying to access Unix and Windows systems, amongst others.
+.LP
+A recent revision of 9P adds support for much longer file names
+and takes the opportunity to improve other aspects of the protocol.
+It also removes details of authentication algorithms from the protocol.
+The Styx implementation now uses the new version of 9P as the default file service protocol.
+(It is possible that for interoperation with older Inferno systems the system will be able to
+interact with both old and new versions of Styx.)
+.NH 2
+Protocol changes
+.LP
+The messages
+.CW Tauth
+and
+.CW Tversion
+are new to Styx.
+.CW Tversion
+includes negotiation (at connection start) of the message size and protocol version;
+it also introduces a new session.
+.CW Tauth
+obtains access to a special authentication file if the server requires
+authentication within a Styx session.
+.CW Tclone
+has been replaced by a more elaborate form of
+.CW Twalk
+that allows zero to MAXWELEM (16) elements to be walked, perhaps to a new fid, in a single message,
+returning a sequence of qid values in
+.CW Rwalk .
+(A clone is simply a walk of a fid to a new fid with zero elements.)
+A walk of several elements can return partial results if the walk of the first element succeeds but
+subsequent ones fail.
+A partial walk leaves the state of the fids unchanged.
+.CW Ropen
+and
+.CW Rcreate
+return a suggested size for atomic I/O on the fid (0 means `not given').
+All strings are variable length, and consequently
+.CW Twstat
+and
+.CW Rstat
+data is variable length and formatted differently.
+Data returned from
+.CW Tread
+of a directory is similarly changed, because
+directory entries are not fixed length.
+.CW Tnop
+has gone.
+.LP
+Tags remain 16-bit integers, but fids and counts
+become 32-bit integers (mainly of interest to large systems),
+and qids have a different structure.
+Previously a qid was a pair of 32-bit integers, path and vers, where
+path had the top bit set for a directory.
+Now a qid is a triple: a 64-bit path, 32-bit vers, and 8-bit type.
+The type is defined to be the top 8 bits of the file's mode.
+The path does not have the top bit set for a directory, and indeed the
+path value is not interpreted by the protocol.
+There are now bits in the file mode for append-only and exclusive-use
+files (new for Inferno), and for authentication files (new for both Plan 9 and Inferno).
+The stat information includes the user name that last caused the file's mtime to be changed.
+All strings in the protocol are variable length: file names, attach names, user names, and error text.
+.LP
+The message format on the wire is significantly different.
+The message size is negotiated for a connection by
+.CW Tversion ,
+and messages can be large, allowing much more data to be sent in single
+.CW Twrite
+and
+.CW Rread
+messages.
+The header includes a 32-bit message size, making it easy to find message boundaries without
+parsing the contents.
+Strings are
+represented as a 16-bit size followed by the string's UTF-8 encoding (without zero byte).
+R-messages do not carry a copy of the fid from the T-message.
+Padding bytes have gone.
+The order of some fields has changed of course to match message parameter changes.
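+.LP
+As an illustration of the string representation, the following sketch packs a string into a message buffer.
+The module name
+.CW Strpack
+and the function
+.CW pstring
+are chosen for this example only; the little-endian byte order of the 16-bit size is taken from the 9P2000 definition and is not stated above.
+.P1
+implement Strpack;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+
+Strpack: module
+{
+	init: fn(nil: ref Draw->Context, nil: list of string);
+};
+
+# store s at offset o of a as a 16-bit size (low byte first)
+# followed by its UTF-8 bytes; return the offset after the string
+pstring(a: array of byte, o: int, s: string): int
+{
+	sa := array of byte s;	# conversion yields the UTF-8 encoding
+	n := len sa;
+	a[o] = byte n;
+	a[o+1] = byte (n>>8);
+	a[o+2:] = sa;	# no terminating zero byte
+	return o+2+n;
+}
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+	a := array[2+5] of byte;
+	n := pstring(a, 0, "hello");
+	size := int a[0] | (int a[1] << 8);
+	sys->print("%d bytes, size field %d\en", n, size);
+}
+.P2
+A caller must allocate two bytes more than the length of the UTF-8 encoding for each string it packs.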
+.LP
+Authentication of the connection itself, and optionally
+establishing the keys for digesting and encryption,
+is done before the protocol starts, in both Inferno and Plan 9.
+Details will follow on the protocol for that, and Limbo interfaces.
+For now, it can be assumed that the old authentication messages can still be used,
+even after a more flexible protocol has been implemented.
+.CW Tauth
+can be used to authenticate particular accesses within such a session, but
+implies trust by the server that the client system will not cheat its users.
+(That trust is typically established by the connection level authentication which is needed
+anyway for link encryption, and thus for single-user clients further authentication
+seems extraneous in most cases.)
+Most Inferno services that run as file servers within a system (eg,
+.CW 9660srv )
+will, like Plan 9's, reply to
+.CW Tauth
+with an
+.CW Rerror
+stating ``authentication not required''.
+Access to them when exported is typically controlled as now by verifying the incoming connection.
+.NH 2
+Limbo interface changes
+.LP
+Because Limbo's interface to file service via
+.CW Sys
+and other modules uses Limbo
+.CW string
+for names, and that is inherently
+variable length, there are no interface changes required for that aspect of the protocol change,
+and consequently no source changes
+(in contrast to the introduction of 9P2000 in C implementations).
+Similarly the Inferno directory reading interfaces remain unchanged.
+.LP
+The `directory mode' bit previously called
+.CW CHDIR
+is now called
+.CW DMDIR .
+It is used
+.I only
+in
+.CW Dir.mode .
+.CW CHDIR
+is no longer defined, partly because it was used both
+in
+.CW Dir.mode
+and
+.CW Qid.path ,
+and the latter instances must change (discussed below).
+There are bits (new to Inferno) for
+.CW DMAPPEND
+(append-only file),
+.CW DMEXCL
+(exclusive-use file),
+and
+.CW DMAUTH
+(authentication file).
+The protocol can return the user name of the user that caused
+.CW mtime
+to be changed on a file; that is now available as
+.CW Dir.muid .
+.LP
+The structure of
+.CW Qid
+has changed.
+Previously a Qid had a 32-bit
+.CW path
+and a 32-bit version number,
+.CW vers .
+The top bit
+.CW CHDIR ) (
+of
+.CW path
+was set iff the Qid was that of a directory.
+The
+.CW path
+is now 64 bits (which is
+.CW big
+in Limbo and
+.CW vlong
+in the kernel), and there is no longer the convention that the top bit of
+.CW path
+must be 1 for a directory.
+Instead, there is a new, separate
+.CW type
+field (called
+.CW qtype
+in Limbo)
+that has the value of the top 8 bits of the file's mode.
+Each bit \f5DM\fIx\f1 in
+.CW Dir.mode
+has a corresponding bit \f5QT\fIx\f1
+in
+.CW Qid.qtype :
+.CW QTDIR ,
+.CW QTAPPEND ,
+.CW QTEXCL
+and
+.CW QTAUTH .
+The bit
+.CW QTDIR
+.I must
+be set in the
+.CW Qid.qtype
+for a directory, and only then.
+There is an extra constant
+.CW QTFILE
+that is defined to be zero, and is used for clarity when neither
+.CW QTDIR
+nor
+.CW QTAUTH
+is set.
+.LP
+In Styx file servers, changes are required to reflect the slightly different set of message types
+and a few new parameters, but the main changes are:
+handling zero or more name elements at once in
+.CW Twalk
+and
+.CW Rwalk ;
+changing
+.CW CHDIR
+to
+.CW DMDIR
+in
+.CW Dir.mode
+(easy);
+the use of the new
+.CW Qid.qtype
+field
+and
+.CW QTDIR
+instead of
+.CW CHDIR
+in
+.CW Qid.path
+(a little more effort);
+and (typically) the insertion of casts to force
+.CW Qid.path
+to
+.CW int
+and thus ensure the use of 32-bit operations except where 64-bit paths really are needed
+(hardly ever in synthetic file servers).
+The new modules for use by file servers are discussed in the next section.
+.LP
+The revised definition of
+.CW Twstat
+in
+.I stat (5),
+and thus
+.CW sys->wstat ,
+provides for ``don't care'' values in
+.CW Dir
+that are tedious to provide directly; a new adt value
+.CW Sys->nulldir
+provides the right initial value for a
+.CW Dir
+which is then changed as needed for
+.CW wstat .
+.SH
+.I "Examples"
+.LP
+Create a directory:
+.P1
+ \fIold:\f5
+fd := sys->create(name, Sys->OREAD, Sys->CHDIR | 8r777);
+
+ \fInew:\f5
+fd := sys->create(name, Sys->OREAD, Sys->DMDIR | 8r777); # not CHDIR
+.P2
+.LP
+Make Qids
+for a file and a directory:
+.P1
+ \fIold:\f5
+Qdir, Qdata: con iota;
+qd := Sys->Qid(Sys->CHDIR | Qdir, 0);
+qf := Sys->Qid(Qdata, 0);
+
+ \fInew:\f5
+Qdir, Qdata: con iota;
+qd := Sys->Qid(big Qdir, 0, Sys->QTDIR);
+qf := Sys->Qid(big Qdata, 0, Sys->QTFILE);
+.P2
+.LP
+Test if a file is a directory:
+.P1
+ \fIold:\f5
+isdir(d: Sys->Dir): int
+{
+ return (d.mode & Sys->CHDIR) != 0;
+\fIOR:\f5
+ return (d.qid.path & Sys->CHDIR) != 0;
+}
+
+ \fInew:\f5
+isdir(d: Sys->Dir): int
+{
+ return (d.mode & Sys->DMDIR) != 0;
+\fIOR:\f5
+ return (d.qid.qtype & Sys->QTDIR) != 0;
+}
+.P2
+.LP
+To keep
+.CW big
+values out of the computation except where they are required, one can cast the path to
+.CW int :
+.P1
+case int dir.qid.path {
+Qdir =>
+ ...
+Qdata =>
+ ...
+Qctl =>
+ ...
+}
+.P2
+Of course with the Dis change mentioned above,
+.CW case
+can now be applied to
+.CW big
+values, so it is no longer necessary to add the cast (as it once was).
+Even so, 32-bit operations are faster when they suffice.
+.NH 2
+Styx protocol in Limbo: Styx and Styxservers
+.LP
+A new module
+.CW Styx ,
+defined by
+.CW styx.m ,
+provides access to the Styx protocol messages, as variants of pick adts
+.CW Tmsg
+and
+.CW Rmsg .
+(There was an old, undocumented
+.CW Styx
+module but this new interface is completely different.)
+It is used by several file servers, such as
+.CW dossrv ,
+.CW cdfs ,
+and the new
+.CW logfs .
+See the attached manual page.
+There are several implementations with the same signature, covering different
+combinations of old and new Inferno and old and new protocols.
+There are slight differences in the application code for old and new
+systems because of the changed
+type and structure of
+.CW Qid .
+The versions that talk the old protocol need to store some internal state,
+and are intended only to meet compatibility requirements during the transition.
+.LP
+Many file service applications, however, serve a simple name space,
+requiring more than can be done with
+.CW file2chan ,
+but wishing some help in handling the protocol details.
+Two new modules
+.CW Styxservers
+and
+.CW Nametree
+are provided to make such applications easier to write.
+They are closely related and thus both modules are defined by
+.CW styxservers.m .
+.LP
+.CW Styxservers
+provides help in handling fids and interpreting the Styx requests for navigating a
+name space, and provides a reasonable set of default actions,
+allowing the application to focus on implementing
+read and write access to the files in the name space.
+It uses
+.CW Styx
+to talk to the Styx client on a connection.
+It interacts with the application through a channel interface and
+the
+.CW Navigator
+adt to navigate an abstract
+representation of the application's name space.
+The module can be used on its own, with the application doing the work
+of replying to those queries itself, or it can get extra help in the common cases from
+.CW Nametree .
+.CW Nametree
+provides a
+.CW Tree
+adt and operations for the application to build an abstract representation of a name space
+and maintain it dynamically quite simply, and it exports the channel interface used by
+.CW Styxservers
+for navigation, thus connecting the two, but leaving the application in complete
+control of the name space contents viewed by Styx.
+See the manual pages
+.I styxservers (2)
+and
+.I styxservers-nametree (2),
+attached.
+The latter includes a short working example of combining the two modules.
+.LP
+The previous release of the system had a module
+.CW Styxlib
+that combined the functions of
+.CW Styx
+and
+.CW Styxservers .
+It remains for a time for transition, but newer applications should use either
+.CW Styx
+or
+.CW Styxservers .
+.LP
+A new command
+.I styxchat (8)
+exchanges Styx messages with a server, reading a textual representation of T-messages
+on standard input.
+It can be helpful when testing a Styx server implementation.
+(It was originally developed to test the
+.CW Styx
+module implementations in several configurations.)
+See the attached manual page for details.
+It also supports an option that allows it to act as a server,
+printing T-messages as they are received from clients, and
+reading R-messages in a textual form from standard input for replies.
+.NH 2
+Device driver changes
+.LP
+Most of the differences for most drivers are relatively minor
+(in
+.CW diff
+terms).
+.LP
+Throughout the hosted and native kernels:
+.IP \(bu
+.CW Qid
+now is the structure:
+.RS
+.P1
+struct Qid {
+ vlong path;
+ ulong vers;
+ uchar type;
+};
+.P2
+The
+.CW type
+field has values
+.CW QTDIR ,
+.CW QTFILE ,
+.CW QTAPPEND ,
+etc.
+The test previously written
+.P1
+if(qid.path & CHDIR)
+.P2
+is now written
+.P1
+if(qid.type & QTDIR)
+.P2
+Because of that change, the various
+.CW switch
+statements in the drivers that previously read
+.P1
+switch(c->qid.path){
+.P2
+or
+.P1
+switch(c->qid.path & ~CHDIR){
+.P2
+now read
+.P1
+switch((ulong)c->qid.path){
+.P2
+to keep operations to 32 bits (except where otherwise required).
+.RE
+.IP \(bu
+The first entry of a driver's
+.CW Dirtab
+.I must
+be an entry for
+\f5"."\fP,
+if the driver uses
+.CW devgen
+to help implement
+.I walk ,
+.I stat ,
+.I devdirread
+or
+.I open
+operations.
+.IP \(bu
+Offsets passed to the driver's
+.I read
+and
+.I write
+entry points are
+64-bit
+.CW vlong ,
+not 32-bit
+.CW ulong .
+.IP \(bu
+The
+.I stat
+entry point has an extra buffer size parameter:
+.RS
+.P1
+int \fIxyz\f5stat(Chan *c, uchar *dp, int n)
+.P2
+It also returns an integer: the size of the result.
+.CW Devstat
+accepts the extra parameter and returns an appropriate result:
+.P1
+static int
+\fIxyz\f5stat(Chan *c, uchar *dp, int n)
+{
+ return devstat(c, dp, n, \fIxyz\f5dir, nelem(\fIxyz\f5dir), devgen);
+}
+.P2
+.RE
+.IP \(bu
+The biggest change is to
+.I walk .
+It has the signature:
+.RS
+.P1
+Walkqid *\fIxyz\f5walk(Chan *c, Chan *nc, char **names, int nname);
+.P2
+and it allows zero or more elements to be walked in a single call,
+returning its result in a newly-allocated
+.CW Walkqid
+structure:
+.P1
+struct Walkqid {
+ Chan* clone;
+ int nqid;
+ Qid qid[1];
+};
+.P2
+Note that the array
+.CW Walkqid.qid
+must actually hold up to
+.I nname
+Qids, and thus is allocated as follows:
+.P1
+wq = smalloc(sizeof(Walkqid)+(nname-1)*sizeof(Qid));
+.P2
+The driver must take care that the space is reclaimed if
+.CW error
+is called before its
+.I walk
+function returns, by using
+.CW waserror
+as required.
+Fortunately,
+.CW devwalk
+looks after the details of
+.I walk
+and
+.CW Walkqid
+for most drivers:
+.P1
+static Walkqid*
+\fIxyz\f5walk(Chan* c, Chan *nc, char** name, int nname)
+{
+ return devwalk(c, nc, name, nname, \fIxyz\f5dir,
+ nelem(\fIxyz\f5dir), devgen);
+}
+.P2
+.RE
+.IP \(bu
+The
+.I clone
+entry point has gone, since cloning is seen by a driver as a particular form of call to its
+.I walk
+entry,
+where the parameter values satisfy:
+.RS
+.P1
+c != nc && nname == 0
+.P2
+One difference is that a node can be cloned and walked in a single operation,
+in other words
+.CW nname
+can be non-zero,
+and the incoming
+.CW nc
+is often nil and a new
+.CW Chan
+must be allocated.
+Note that if the driver found it adequate to call
+.CW devclone
+previously, then
+the new
+.CW devwalk
+will
+generally look after it as well.
+.CW Devclone
+remains for use as a utility function for the few drivers that need to
+clone a channel themselves,
+in their
+.I walk
+operations or elsewhere.
+.RE
+.IP \(bu
+The
+.I detach
+entry has been renamed
+.I shutdown
+(it was never the opposite of
+.I attach ).
+The stub
+.CW devshutdown
+can be used by devices that do not need it.
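+.LP
+The allocation of a
+.CW Walkqid
+described in the list above can be sketched in portable C as follows.
+This is a sketch under stated assumptions, not kernel code:
+.CW calloc
+stands in for
+.CW smalloc ,
+.CW Chan
+is left opaque, the fixed-width types stand in for
+.CW vlong ,
+.CW ulong
+and
+.CW uchar ,
+and the helper name
+.CW walkqid_alloc
+is hypothetical.
+.P1
```c
#include <stdint.h>
#include <stdlib.h>

typedef struct Chan Chan;	/* opaque stand-in for the kernel's Chan */

typedef struct Qid {
	int64_t  path;	/* vlong in the kernel */
	uint32_t vers;	/* ulong in the kernel */
	uint8_t  type;	/* QTDIR, QTFILE, ... */
} Qid;

typedef struct Walkqid {
	Chan *clone;
	int   nqid;
	Qid   qid[1];	/* really holds up to nname entries */
} Walkqid;

/*
 * Allocate a zeroed Walkqid with room for nname Qids.
 * Unlike the one-line form in the text, a count of zero is clamped so
 * that the request never drops below sizeof(Walkqid).
 */
Walkqid*
walkqid_alloc(int nname)
{
	size_t extra = nname > 1 ? (size_t)(nname - 1) : 0;

	return calloc(1, sizeof(Walkqid) + extra * sizeof(Qid));
}
```
+.P2
+A caller fills
+.CW qid
+from index zero, counts the entries in
+.CW nqid ,
+and must release the structure on every error path.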
+.LP
+For drivers that serve a simple name space using the functions of
+.CW dev.c
+(described in
+.I devattach (10.2)),
+only a handful of simple changes are required.
+Most are pointed out by the compilers as type clashes.
+The main exception is the need for a
+.CW Dirtab
+to have its first entry be an entry for \f5"."\fP if the
+.CW Dirtab
+will be passed to
+.CW devgen
+via
+.CW devwalk ,
+.CW devstat
+and
+.CW devdirread .
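+.LP
+As a hedged sketch of the conventions above, the following portable C
+models a table whose first entry is \f5"."\fP, a directory test on the
+type byte, and a
+.CW switch
+on the low 32 bits of the path.
+The
+.CW Dirtab
+layout shown is a simplified stand-in, and the names
+.CW xyzdir ,
+.CW dirtab_ok
+and
+.CW xyzname
+are hypothetical.
+.P1
```c
#include <stdint.h>
#include <string.h>

#define QTDIR  0x80u
#define QTFILE 0x00u
#define DMDIR  0x80000000u

typedef struct Qid {
	int64_t  path;
	uint32_t vers;
	uint8_t  type;
} Qid;

/* Simplified stand-in for the kernel's Dirtab (field layout assumed). */
typedef struct Dirtab {
	const char *name;
	Qid         qid;
	int64_t     length;
	uint32_t    perm;
} Dirtab;

enum { Qdir, Qdata, Qctl };

/* The first entry describes the directory itself and is named ".". */
Dirtab xyzdir[] = {
	{ ".",       { Qdir,  0, QTDIR  }, 0, DMDIR | 0555 },
	{ "xyzdata", { Qdata, 0, QTFILE }, 0, 0666 },
	{ "xyzctl",  { Qctl,  0, QTFILE }, 0, 0222 },
};

#define nelem(x) (sizeof(x) / sizeof((x)[0]))

/* Check the "." convention before a table is handed to generic helpers. */
int
dirtab_ok(const Dirtab *tab, size_t ntab)
{
	return ntab > 0
	    && strcmp(tab[0].name, ".") == 0
	    && (tab[0].qid.type & QTDIR) != 0;
}

/* Dispatch on the low 32 bits of the path; directories are found by type. */
const char*
xyzname(Qid qid)
{
	if (qid.type & QTDIR)
		return ".";
	switch ((uint32_t)qid.path) {
	case Qdata:
		return "xyzdata";
	case Qctl:
		return "xyzctl";
	}
	return NULL;
}
```
+.P2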
+.NH 1
+Sys module changes
+.LP
+.NH 2
+Sys: name change(s)
+.LP
+The name
+.CW ERRLEN
+has become
+.CW ERRMAX
+(since it is the limit to any error string, not its necessary length).
+.CW NAMELEN
+has been removed,
+to allow each instance to be found (by compilation) and either removed
+(where it was simply limiting the length of a file name), or replaced by
+.CW NAMEMAX
+where it was used as a buffer size to read in names such as
+.CW /dev/sysname
+or
+.CW /dev/user .
+.NH 2
+Sys: file sizes
+.LP
+The Styx protocol has always supported 64-bit file sizes and file offsets.
+The Inferno interface has not.
+.CW Sys
+has changed so that length and offset values become
+.CW big ,
+specifically:
+file size
+.CW Dir.length ,
+the offset parameter to
+.CW seek ,
+and
+.CW seek 's
+result.
+.LP
+These and the Qid changes account for quite a few changes in
+our own source tree.
+Typically, applications did things like this:
+.P1
+ \fIold:\f5
+buf := array[d.length] of byte;
+
+sys->seek(fd, 0, Sys->SEEKSTART);
+off := sys->seek(fd, 0, Sys->SEEKRELA); rec := off + HDRLEN;
+for(offset := 0; offset < d.length; offset += RECSIZE){
+ sys->seek(fd, offset, Sys->SEEKSTART);
+ ...
+}
+.P2
+The compiler now objects in each case because
+.CW big
+values are now appearing where
+.CW int
+is required, or conversely.
+In some cases it is obvious that adding a cast is correct;
+in others it is worth considering whether the calculation should indeed
+be
+.CW big
+because file sizes for instance can in practice exceed the range of a
+signed integer without too much trouble today, especially when the `file'
+is a storage device.
+The case that some people like and some dislike is:
+.P1
+if(sys->seek(fd, big offset, Sys->SEEKSTART) < big 0) ...
+.P2
+where the
+.CW "big 0"
+is needed because
+.CW sys->seek
+returns
+.CW big ,
+and there are no `usual arithmetic conversions' as in C.
+(Given the tangle that several languages have made of such conversions, perhaps
+being strict is correct.)
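+.LP
+The reason for doing the whole calculation in 64 bits, rather than casting
+at the last moment, can be illustrated with a small C sketch.
+The record layout
+.CW HDRLEN "" (
+and
+.CW RECSIZE )
+and both function names are hypothetical.
+.P1
```c
#include <stdint.h>

enum { HDRLEN = 512, RECSIZE = 4096 };	/* hypothetical record layout */

/* Byte offset of record i, computed entirely in 64 bits. */
int64_t
record_offset(int64_t i)
{
	return (int64_t)HDRLEN + i * (int64_t)RECSIZE;
}

/* Nonzero if the offset would have fitted the old 32-bit signed interface. */
int
fits_int32(int64_t off)
{
	return off >= INT32_MIN && off <= INT32_MAX;
}
```
+.P2
+With these values a file of about half a million records already lies
+beyond the range of a signed 32-bit offset.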
+.NH 2
+Sys: export
+.LP
+.CW Sys->export
+now has the signature:
+.P1
+export: fn(c: ref Sys->FD, dir: string, flag: int): int;
+.P2
+allowing a directory
+.I dir
+other than \f5"/"\f1
+to be exported.
+It replaces the
+.CW exportdir
+function of (later) Third Edition.
+.NH 2
+Sys: Styx support
+.LP
+The revision of Styx has caused three calls to be added:
+.P1
+fauth: fn(fd: ref Sys->FD, aname: string): ref Sys->FD;
+fversion: fn(fd: ref Sys->FD, msize: int, version: string): (int, string);
+iounit: fn(fd: ref Sys->FD): int;
+.P2
+.CW Fversion
+initialises a Styx session on connection
+.I fd ,
+sending the message size
+.I msize
+and protocol version string
+.I version ;
+it returns a tuple giving the message size and version returned by the Styx server.
+It is rarely called directly; the
+.CW mount
+operation does it automatically on an uninitialised connection.
+.LP
+.CW Fauth
+sends a Styx
+.CW Tauth
+message on connection
+.I fd ,
+and if successful, returns a file descriptor that
+refers to an authentication file provided by the file server,
+which may be read and written by
+.CW Sys->read
+and
+.CW Sys->write
+to implement the authentication protocol(s) supported by the server.
+.CW Fauth
+is needed only when the server requires authentication.
+.LP
+.CW Iounit
+returns the `atomic IO unit' suggested for the file
+.I fd
+by its file server when it was opened.
+.NH 2
+Sys: mount
+.LP
+The
+.CW mount
+system call has acquired a second file descriptor parameter:
+.P1
+mount: fn(fd: ref Sys->FD, afd: ref Sys->FD, on: string,
+ flags: int, spec: string): int;
+.P2
+.I Afd
+is nil if the file server is known not to require authentication within a Styx session.
+(The connection might itself have been authenticated previously, for instance,
+and most file servers such as
+.CW dossrv ,
+.CW ftpfs
+and
+.CW dbfs
+are invoked to provide services to an already-authenticated user, and therefore
+do not require authentication within a session.)
+If the server does require authentication,
+.I afd
+refers to a file descriptor returned by a previous
+.CW fauth
+on connection
+.I fd ,
+on which an authentication protocol has subsequently been executed as required by the file server connected to
+.I fd .
+.NH 2
+Sys: other new system calls
+.LP
+There are two more new system calls:
+.P1
+fd2path: fn(fd: ref Sys->FD): string;
+werrstr: fn(s: string): int;
+.P2
+.CW Fd2path
+returns the path name under which the file descriptor
+.I fd
+was originally opened (if known).
+One result is that
+.I workdir (2)
+produces reasonable results for the name of the current directory
+in the presence of mounts and binds.
+.LP
+.CW Werrstr
+sets the per-process system error string to
+.I s ,
+to allow a Limbo function to save and restore an error string over
+other system calls, to report its own errors in the same way
+as the system calls do, or to annotate the error from a system call
+for its own caller.
+.NH 2
+Sys: directory reading
+.LP
+The
+.I sys-dirread (2)
+system call's signature has changed:
+.P1
+dirread: fn(fd: ref Sys->FD): (int, array of Sys->Dir);
+.P2
+Previously it accepted an array of
+.CW Dir
+to fill and returned a count;
+now it returns a tuple containing the count and the array of values read.
+The change was needed because the representation of directory entries
+is now variable length, and it is difficult to limit the number returned
+(it is possible, but all the methods have disadvantages).
+.CW Dirread
+still reads a directory incrementally, requesting a block of directory entries
+of reasonable size from the file server, and unpacking them into the returned array.
+Use
+.I readdir (2)
+to read whole directories at once.
+.NH 1
+Bufio
+.LP
+There are several changes to
+.CW Bufio :
+.P1
+Iobuf: adt {
+ ...
+ seek: fn(b: self ref Iobuf, n: big, where: int): big;
+ offset: fn(b: self ref Iobuf): big;
+};
+# flush: fn(); # deleted
+.P2
+The module-level function
+.CW Bufio->flush
+has been removed
+(\fInot\fP
+.CW Iobuf.flush ),
+to allow concurrent use of a single
+.CW Bufio
+instance; applications must
+.CW close
+or
+.CW flush
+each output file explicitly.
+.LP
+As a result of the change to 64-bit offsets for
+.CW Sys->seek ,
+.CW Iobuf.seek
+also accepts and returns
+.CW big
+offsets.
+.CW Iobuf.offset
+is new, and returns the current file offset in bytes, taking account of any buffering.
+.LP
+.CW Iobuf.flush
+has been extended to flush any data buffered on input files.
+.NH 1
+Draw
+.LP
+The graphics model represented by the
+.I draw (3)
+device and the
+.CW Draw
+module is significantly different, including support for a range of pixel formats,
+and compositing in the drawing operations.
+Most source code that uses Images
+directly will require some changes, but the scope of them is limited: needing only extra
+or different parameter values to individual operations, not radical restructuring.
+The following changes affect most non-Tk graphics application code:
+.IP \(bu
+Pixels in an
+.CW Image
+can now be more than 8 bits and have a more flexible structure
+(eg, several colour channels, and an optional alpha channel, of up to 8 bits each).
+To support that, the old
+.CW ldepth
+field has gone, replaced by a channel descriptor
+.CW chans
+of type
+.CW Chans ,
+which describes the pixel structure, and an integer
+.CW depth
+field, which gives the total pixel size (depth) in bits.
+.IP \(bu
+The colour parameters are now 32-bit RGBA values
+(red, green, blue and alpha components, 8 bits each; the big-endian order,
+with red in the most significant byte, applies only when the value is an
+.CW int ).
+.IP \(bu
+The graphics subsystem supports Porter-Duff compositing,
+combining a destination image with a source image (within an optional matte)
+according to a compositing operator.
+The interpretation of the old `mask' Image parameter to
+.CW draw
+and
+.CW gendraw
+has changed.
+Previously it provided a simple binary mask;
+it now provides a `matte', and its
+alpha channel shapes the source image and adds partial transparencies.
+If the matte parameter is nil, the source image is used unmodified.
+If it lacks an alpha channel, one is computed from the matte image colour channels.
+The drawing operations
+.CW draw ,
+.CW gendraw ,
+.CW line ,
+.CW text ,
+and so on,
+have all got variants
+.CW drawop ,
+.CW gendrawop ,
+.CW lineop ,
+.CW textop ,
+and so on,
+each taking an extra final parameter that specifies a Porter-Duff
+compositing operator from a set predefined by
+.CW Draw :
+.CW SoverD ,
+.CW SinD ,
+.CW DatopS ,
+and so on.
+In each case,
+.CW S
+refers to the source image (within a matte, if provided), and
+.CW D
+refers to the destination image.
+Most of them are useful only when either or both of the source and destination images have got
+alpha channels (or a matte is used to shape the source).
+The old function names without the
+.CW op
+suffix use the most common compositing operation
+.CW Draw->SoverD ,
+drawing the source image over the destination,
+taking account of the shaping of the source and destination images by their alpha channels,
+with the source further shaped by the optional matte.
+Thus
+.CW Image.draw
+continues to do the `obvious' thing.
+.IP \(bu
+There are new colour map conversion functions.
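+.LP
+The
+.CW SoverD
+operator mentioned in the list above can be sketched for a single 32-bit
+RGBA pixel as follows.
+This is only an illustration: it assumes pre-multiplied colour components,
+ignores the matte, and rounds to nearest, which need not match the
+arithmetic of the real implementation.
+The names
+.CW over1
+and
+.CW sover_d
+are hypothetical.
+.P1
```c
#include <stdint.h>

/* One pre-multiplied 8-bit channel: s + d*(255 - sa)/255, rounded, clamped. */
static uint32_t
over1(uint32_t s, uint32_t d, uint32_t sa)
{
	uint32_t v = s + (d * (255 - sa) + 127) / 255;

	return v > 255 ? 255 : v;
}

/* Source over destination for RGBA pixels with alpha in the low byte. */
uint32_t
sover_d(uint32_t src, uint32_t dst)
{
	uint32_t sa = src & 0xFF;
	uint32_t out = 0;
	int shift;

	for (shift = 24; shift >= 0; shift -= 8) {
		uint32_t s = (src >> shift) & 0xFF;
		uint32_t d = (dst >> shift) & 0xFF;

		out |= over1(s, d, sa) << shift;
	}
	return out;
}
```
+.P2
+An opaque source replaces the destination, and a fully transparent source
+leaves it unchanged, which is the `obvious' behaviour of the old calls.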
+.LP
+The
+.CW Chans
+adt is the following:
+.P1
+Chans: adt
+{
+ # interpret standard channel string
+ mk: fn(s: string): Chans;
+ # standard printable form
+ text: fn(c: self Chans): string;
+ # equality
+ eq: fn(c: self Chans, d: Chans): int;
+ # bits per pixel
+ depth: fn(c: self Chans): int;
+};
+.P2
+Values are created by
+.CW Chans.mk ,
+which accepts a string that is a sequence of channel descriptors,
+each being a letter representing a channel type followed by an integer giving the channel's size (depth, width) in bits.
+The letters include:
+.CW r ,
+.CW g
+and
+.CW b
+for red, green and blue;
+.CW a
+for alpha;
+.CW k
+(!) for greyscale; and
+.CW x
+for padding (``unspecified'', ``don't care'').
+Thus
+.CW Chans.mk("r8g8b8a8")
+produces a descriptor for a 32-bit pixel with 8-bit colour and alpha components.
+The same descriptor is used in the revised
+.I image (6)
+format, although the older image file format with ldepth only is still recognised.
+Given a Chans value
+.I c ,
+\fIc\fP\f5.text()\fP returns such a descriptor for it as a string.
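+.LP
+The descriptor syntax is simple enough that computing a pixel depth from it
+can be sketched in a few lines of C.
+The function name
+.CW chandepth
+is hypothetical; accepting the letter
+.CW m
+for a colour-mapped channel is an assumption beyond the letters listed above,
+and the limit of 8 bits per channel follows the description given earlier.
+.P1
```c
#include <ctype.h>
#include <string.h>

/*
 * Total bits per pixel for a descriptor such as "r8g8b8a8",
 * or -1 if the string is not letter-digits pairs.
 */
int
chandepth(const char *s)
{
	int depth = 0;

	if (s == NULL || *s == 0)
		return -1;
	while (*s != 0) {
		int bits = 0;

		if (strchr("rgbakxm", *s) == NULL)	/* channel type letter */
			return -1;
		s++;
		if (!isdigit((unsigned char)*s))	/* a size must follow */
			return -1;
		while (isdigit((unsigned char)*s)) {
			bits = bits * 10 + (*s - '0');
			if (bits > 8)	/* channels are at most 8 bits */
				return -1;
			s++;
		}
		if (bits == 0)
			return -1;
		depth += bits;
	}
	return depth;
}
```
+.P2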
+.LP
+When
+.CW newimage
+previously was called with a specific value for
+.CW ldepth ,
+an appropriate
+.CW Chans
+value must replace it.
+A few common variants are defined as constants of type
+.CW Chans
+in
+.CW Draw .
+(We extended the Limbo compiler last year to support the use of
+.CW con
+with adt and tuple constants with this use in mind.)
+For example, the value
+.CW Draw->CMAP8
+is the descriptor for the 8-bit deep
+.I rgbv
+colour-mapped Image format previously used by Inferno.
+The list of predefined channels includes:
+.TS
+center tab(@);
+cfI cfI cfI cfI
+n lf(CW) n lw(3i) .
+Old ldepth@Name@Bit depth@Description
+0@GREY1@1@single 1-bit deep greyscale channel
+1@GREY2@2@single 2-bit deep greyscale
+2@GREY4@4@single 4-bit deep greyscale
+\-@GREY8@8@single 8-bit deep greyscale
+3@CMAP8@8@single 8-bit deep \fIrgbv\f1 colour-mapped channel
+\-@RGB15@15@three channels RGB: r5g5b5
+\-@RGB16@16@three channels RGB: r5g6b5
+\-@RGB24@24@three channels RGB: r8g8b8
+\-@RGBA32@32@four channels: RGB and alpha: r8g8b8a8
+.TE
+.LP
+The use of
+.CW Chans
+instead of
+.CW ldepth
+means that calls to
+.CW Display.newimage
+must be changed.
+For instance:
+.P1
+\fI(old)\f5
+buffer := display.newimage(r.inset(3), t.image.ldepth, 0, Draw->White);
+.P2
+becomes
+.P1
+\fI(new)\f5
+buffer := display.newimage(r.inset(3), t.image.chans, 0, Draw->White);
+.P2
+There is an obvious difference: the
+use of
+.CW t.image.chans
+instead of
+.CW t.image.ldepth
+to create a buffer Image with the same pixel structure as
+.CW t .
+There is, however, another difference.
+The final colour parameter to
+.CW newimage
+is also different in structure: in the new graphics model, it is a 32-bit integer value giving RGBA
+components,
+not a colour map index, and the name
+.CW Draw->White
+has the value
+.CW 16rFFFFFFFF
+not
+.CW 0 .
+Because a symbolic name was used, however, the source need not change.
+As another example,
+.CW Draw->Palegreyblue
+is
+.CW "int 16r4993DDFF" .
+Note the final
+.CW FF
+for the alpha component (creating a fully opaque colour).
+When the top bit is set, the
+.CW int
+cast shown here is needed to force the otherwise
+.CW big
+value to 32 bits.
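+.LP
+The layout of a colour value can be made concrete with a short C sketch
+that packs four 8-bit components into 32 bits, red first and alpha last.
+The function names
+.CW rgba
+and
+.CW alpha_of
+are hypothetical; the two constants checked are the ones quoted above.
+.P1
```c
#include <stdint.h>

/* Pack components with red in the top byte and alpha in the bottom byte. */
uint32_t
rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
	return (uint32_t)r << 24 | (uint32_t)g << 16 | (uint32_t)b << 8 | a;
}

/* Extract the alpha component (0xFF is fully opaque). */
uint8_t
alpha_of(uint32_t colour)
{
	return (uint8_t)(colour & 0xFF);
}
```
+.P2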
+.LP
+The values of colour components are now uniformly expressed as
+intensity, so that a pixel with all zero colour components is black and
+one with all colour components at maximum (all ones, full intensity)
+is white.
+The
+.I rgbv
+map has therefore been reversed.
+Given a map index,
+.CW Display.cmap2rgba
+returns the 32-bit RGBA format used as a parameter in other calls.
+All colour components are
+.I linear
+values, as required for compositing to work properly;
+gamma correction is done as required by the display subsystem.
+.LP
+The colour components of a pixel with an alpha component are always
+.I pre-multiplied
+by the alpha value, following Porter and Duff, as further justified by Alvy Ray Smith and Jim Blinn.
+``Thus a 50% red is
+.CW 16r7F00007F
+not
+.CW 16rFF00007F .''
+The function
+.CW Draw->setalpha
+does the computation.
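+.LP
+The pre-multiplication itself is simple arithmetic, sketched here in C.
+The function name
+.CW premultiply
+is hypothetical, and whether
+.CW Draw->setalpha
+truncates in exactly this way is an assumption; the sketch does reproduce
+the 50% red value quoted above.
+.P1
```c
#include <stdint.h>

/*
 * Scale the colour components of an RGBA value by alpha/255 and
 * install alpha as the new alpha component (the old one is ignored).
 */
uint32_t
premultiply(uint32_t colour, uint8_t alpha)
{
	uint32_t r = (colour >> 24) & 0xFF;
	uint32_t g = (colour >> 16) & 0xFF;
	uint32_t b = (colour >> 8) & 0xFF;

	r = r * alpha / 255;
	g = g * alpha / 255;
	b = b * alpha / 255;
	return r << 24 | g << 16 | b << 8 | alpha;
}
```
+.P2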
+.LP
+Because of the changes to colours and the replacement of simple masks by mattes, the Images
+.CW Display.ones
+and
+.CW Display.zeros
+are no longer defined.
+Instead, when they were intended to represent colours, the new Images
+.CW Display.black
+and
+.CW Display.white
+provide the obvious colours.
+When
+.CW ones
+and
+.CW zeros
+were used as masks, the new predefined Images
+.CW Display.opaque
+and
+.CW Display.transparent
+are used instead as constant mattes, with alpha channels (fully opaque and fully transparent, respectively).
+As noted above, where
+.CW Display.ones
+was used as a mask parameter in drawing operations, one can
+simply specify a nil Image as a matte (`no matte') instead.
+(That has been allowed for quite some time and is in use but might not be widely known.)
+.LP
+For example, Charon allocated a mask using:
+.P1
+dpicmask = display.newimage(pic.r, 0, 0, Draw->White);
+.P2
+which becomes
+.P1
+dpicmask = display.newimage(pic.r, Draw->GREY1, 0, Draw->Opaque);
+.P2
+where
+.CW GREY1
+is a constant value of the
+.CW Chans
+adt type, predefined by Draw, for Images that have a single 1-bit deep grey channel (ie, a bitmap).
+(Note that to form a fully-opaque matte,
+.CW Draw->Opaque
+was used for clarity, not
+.CW Draw->White ;
+.CW Draw->Transparent
+could also be used, as the basis for building a matte with transparency.)
+.LP
+A small if obscure change is that
+.CW Screen.newwindow
+has a new parameter:
+.P1
+newwindow: fn(screen: self ref Screen, r: Rect,
+ backing: int, color: int): ref Image;
+.P2
+The
+.I backing
+parameter should usually be
+.CW Draw->Refbackup ,
+except for windows allocated on an image that already has got backing store
+assigned, for instance because it is an image on a screen on an existing window image, in which case it should be
+.CW Draw->Refnone ,
+because the parent window already provides the backing.
+.LP
+As a small but helpful change, the adt
+.CW Draw->Pointer
+has a new element
+.CW msec
+that reports a
+relative time stamp in milliseconds.
+.LP
+The
+.CW Draw->Context
+content is significantly different, for the benefit of the new
+window system implementation.
+.NH 1
+Tk module
+.LP
+There is a new function in
+.CW Tk :
+.P1
+quote: fn(s: string): string;
+.P2
+.CW Quote
+returns string
+.I s
+quoted according to Tk's `\f5{}\f1' quoting conventions.
+It replaces
+.CW Wmlib->tkquote .
+.LP
+There is a new widget type:
+.I panel (9).
+A panel instance can be packed and otherwise manipulated in the same way as any other Tk widget.
+An image is associated with it by calling
+.CW Tk->putimage
+defined in
+.I tk (2).
+The associated images can be drawn on directly by the application, using all the operations provided by
+.CW Draw .
+The coordinates of the changed rectangle must be given to Tk
+using the
+.CW panel
+widget command
+.CW dirty ;
+that part of the image will be redrawn if necessary at the next Tk
+.CW update .
+A panel has no default bindings.
+See
+.I panel (9)
+for details.
+.LP
+For example,
+.CW wm/coffee
+now uses the following:
+.P1
+r := Rect((0, 0), (400, 300));
+buffer := display.newimage(r, t.image.chans, 0, Draw->Black);
+tk->cmd(t, "panel .f.p -bd 3 -relief flat");
+tk->cmd(t, "pack .f.p -fill both -expand 1");
+tk->cmd(t, "update");
+org := buffer.r.min;
+tk->putimage(t, ".f.p", buffer, nil);
+.P2
+When it has updated the
+.CW buffer ,
+it tells Tk:
+.P1
+tk->cmd(t, ".f.p dirty; update");
+.P2
+In this case the whole image is marked dirty, but
+.CW dirty
+can be given an optional rectangle parameter to restrict redrawing.
+.LP
+.CW Tk->putimage
+and
+.CW Tk->getimage
+replace
+.CW imageput
+and
+.CW imageget .
+.NH 1
+Selectfile, Tabs and Dialog
+.LP
+The functions
+.CW filename ,
+.CW mktabs
+(and
+.CW tabsctl ),
+.CW dialog
+and
+.CW getstring
+have been moved to separate new modules, to allow those aspects of the
+user interface to be changed by replacing the implementations,
+and to allow standard modules to be provided for picking colours (for instance).
+.CW Selectfile
+acquires
+.CW filename ,
+.CW Tabs
+acquires the `tabs' Tk pseudo-widget, and
+.CW Dialog
+acquires
+.CW dialog ,
+which is renamed
+.CW prompt ,
+and
+.CW getstring .
+In cases where the functions took a
+.CW Tk->Toplevel
+as a parameter to specify a
+.CW parent
+window,
+they now take a
+.CW Draw->Context
+and (parent)
+.CW Image
+parameter;
+given a Toplevel
+.CW t ,
+use
+.CW t.image .
+See
+.I dialog (2),
+.I selectfile (2)
+and
+.I tabs (2).
+.TL
+Appendix A: Tk client conversion
+.LP
+.I Wm (1)
+applications now have to feed their own pointer and keyboard
+input to Tk. The window manager is now kept informed about the placement
+of windows.
+.LP
+A Tk toplevel now holds a window manager context:
+.P1
+Wmcontext: adt
+{
+ kbd: chan of int; # incoming characters from keyboard
+ ptr: chan of ref Pointer; # incoming stream of mouse positions
+ ctl: chan of string; # commands from wm to application
+ wctl: chan of string; # commands from application to wm
+ images: chan of ref Image; # exchange of images
+ connfd: ref Sys->FD; # connection control
+ ctxt: ref Context;
+};
+.P2
+It contains some channels on which the window manager
+sends information to the application, and a file
+descriptor that can be used to write requests to the window
+manager.
+The channels used directly by the application are:
+.RS
+.IP \f(CWkbd\fP
+characters typed by the user (pass them to
+.CW Tk->keyboard )
+.IP \f(CWptr\fP
+pointer events (pass them to
+.CW Tk->pointer )
+.IP \f(CWctl\fP
+application control requests.
+Passing these to
+.CW Tkclient->wmctl
+will do the default action.
+Requests starting with an exclamation mark
+.CW ! ) (
+can cause the application's image to change.
+.RE
+.LP
+The toplevel also holds a channel
+.CW wreq
+on which it sends application
+control requests; these have the same form as those
+sent on
+.CW Wmcontext.ctl ,
+and can be forwarded to
+.CW Tkclient->wmctl
+in the same way.
+.LP
+Control requests currently understood by
+.I wm (1)
+are:
+.RS
+.IP "\f(CW!reshape \fItag\fP \fIreqid\fP \fIminx\fP \fIminy\fP \fImaxx\fP \fImaxy\fP [\fIhow\fP]\fR
+.br
+Reshape the window referenced by
+.I tag ,
+creating a new image if
+.I tag
+did not previously exist.
+.I Reqid
+is ignored.
+.I How
+can be one of:
+.RS
+.IP \f(CWplace\fP 15
+.I Wm
+attempts to find a suitable patch of screen real estate on which to place
+the window; the size of the given rectangle
+is taken to be the minimum size for that window.
+.IP \f(CWexact\fP
+Reshape to the exact rectangle requested.
+This is the default if
+.I how
+is not given.
+.IP \f(CWonscreen\fP
+The given rectangle is adjusted so that it is no bigger than the available
+screen space, and is entirely on screen.
+.RE
+.IP "\f(CWdelete \fItag\fP\fR
+.br
+Delete the image associated with
+.I tag .
+.IP "\f(CWraise\fP
+.br
+Raise the window
+.IP "\f(CWlower\fP
+.br
+Lower the window
+.IP "\f(CW!move \fItag\fP \fIreqid\fP \fIstartx\fP \fIstarty\fP\fR
+.br
+Request the user to move the window to a new place.
+.I Startx
+and
+.I starty
+are the coordinates of the pointer when the request was initiated.
+.IP "\f(CW!size \fItag\fP\fR
+.br
+Request the user to resize the window.
+.RE
+.LP
+To convert a typical Tk application, do the following.
+.IP 1.
+Use an editor to make the following changes:
+.RS
+.TS
+tab(@);
+cfI cfI
+lf(CW) lf(CW) .
+Old@New
+Wmlib@Tkclient
+wmlib@tkclient
+tkclient->titlebar@tkclient->toplevel
+tkclient->titlectl@tkclient->wmctl
+tkclient->taskbar@tkclient->settitle
+tk->imageput@tk->putimage
+tk->imageget@tk->getimage
+.TE
+.RE
+.IP 2.
+Insert the following code at the top of the central
+.CW alt
+statement.
+The names
+.CW wmctl ' `
+and
+.CW top ' `
+will need changing to the appropriate variables in the program:
+.RS
+.P1
+s := <-top.ctxt.kbd =>
+ tk->keyboard(top, s);
+s := <-top.ctxt.ptr =>
+ tk->pointer(top, *s);
+s := <-top.ctxt.ctl or
+s = <-top.wreq or
+s = <-wmctl =>
+ tkclient->wmctl(top, s);
+.P2
+.RE
+.IP 3.
+Add the following just after the Tk configuration code and
+before the main processing starts:
+.RS
+.P1
+tkclient->onscreen(top, nil);
+tkclient->startinput(top, "kbd"::"ptr"::nil);
+.P2
+This is possibly the easiest part to forget.
+.RE
+.LP
+Be careful of cases where the main loop calls a blocking function
+that itself relies on keyboard or mouse input.
+The easiest solution can be to spawn a thread to handle the
+keyboard and mouse independently.
binary files /dev/null b/doc/changes.pdf differ
--- /dev/null
+++ b/doc/compiler.ms
@@ -1,0 +1,1174 @@
+.TL
+Plan 9 C Compilers \(dg
+.AU
+.I "Ken Thompson"
+.AI
+ken@plan9.bell-labs.com
+.AB
+.FS
+\l'1i'
+.br
+\(dg Originally appeared, in a different form, in
+.I
+Proceedings of the Summer 1990 UKUUG Conference,
+.R
+pp. 41-51,
+London, 1990.
+This version first appeared in
+.I "Plan 9 Programmer's Manual, Volume 2 (Second Edition)" .
+The Plan 9 compiler suite forms the basis for the portable Inferno compiler suite,
+making this paper still relevant.
+.FE
+This paper describes the overall structure and function of the Plan 9 C compilers.
+A more detailed implementation document
+for any one of the compilers
+is yet to be written.
+.AE
+.NH
+Introduction
+.LP
+There are many compilers in the series.
+Eight of the compilers (MIPS 3000, SPARC, Intel 386, AMD64, Power PC, ARM, DEC Alpha, and Motorola 68020)
+are considered active and are used to compile
+current versions of Plan 9 or Inferno.
+Several others (Motorola 68000, Intel 960, AMD 29000) have had only limited use, such as
+to program peripherals or experimental devices.
+.NH
+Structure
+.LP
+The compiler is a single program that produces an
+object file.
+Combined in the compiler are the traditional
+roles of preprocessor, lexical analyzer, parser, code generator,
+local optimizer,
+and first half of the assembler.
+The object files are binary forms of assembly
+language,
+similar to what might be passed between
+the first and second passes of an assembler.
+.LP
+Object files and libraries
+are combined by a loader
+program to produce the executable binary.
+The loader combines the roles of second half
+of the assembler, global optimizer, and loader.
+The names of the compilers, loaders, and assemblers
+are as follows:
+.DS
+.ta 1.5i
+.de Ta
+\\$1 \f(CW\\$2\fP \f(CW\\$3\fP \f(CW\\$4\fP
+..
+.Ta SPARC kc kl ka
+.Ta PowerPC qc ql qa
+.Ta MIPS vc vl va
+.Ta Motorola\ 68000 1c 1l 1a
+.Ta Motorola\ 68020 2c 2l 2a
+.Ta ARM 5c 5l 5a
+.Ta AMD64 6c 6l 6a
+.Ta DEC\ Alpha 7c 7l 7a
+.Ta Intel\ 386 8c 8l 8a
+.Ta AMD\ 29000 9c 9l 9a
+.DE
+There is a further breakdown
+in the source of the compilers into
+object-independent and
+object-dependent
+parts.
+All of the object-independent parts
+are combined into source files in the
+directory
+.CW /sys/src/cmd/cc .
+The object-dependent parts are collected
+in a separate directory for each compiler,
+for example
+.CW /sys/src/cmd/vc .
+All of the code,
+both object-independent and
+object-dependent,
+is machine-independent
+and may be cross-compiled and executed on any
+of the architectures.
+.NH
+The Language
+.LP
+The compiler implements ANSI C with some
+restrictions and extensions
+[ANSI90].
+Most of the restrictions are due to
+personal preference, while
+most of the extensions were to help in
+the implementation of Plan 9.
+There are other departures from the standard,
+particularly in the libraries,
+that are beyond the scope of this
+paper.
+.NH 2
+Register, volatile, const
+.LP
+The keyword
+.CW register
+is recognized syntactically
+but is semantically ignored.
+Thus taking the address of a
+.CW register
+variable is not diagnosed.
+The keyword
+.CW volatile
+disables all optimizations, in particular registerization, of the corresponding variable.
+The keyword
+.CW const
+generates warnings (if warnings are enabled by the compiler's
+.CW -w
+option) of non-constant use of the variable,
+but does not affect the generated code.
+.NH 2
+The preprocessor
+.LP
+The C preprocessor is probably the
+biggest departure from the ANSI standard.
+.LP
+The preprocessor built into the Plan 9 compilers does not support
+.CW #if ,
+although it does handle
+.CW #ifdef
+and
+.CW #include .
+If it is necessary to be more standard,
+the source text can first be run through the separate ANSI C
+preprocessor,
+.CW cpp .
+.NH 2
+Unnamed substructures
+.LP
+The most important and most heavily used of the
+extensions is the declaration of an
+unnamed substructure or subunion.
+For example:
+.DS
+.CW
+.ta .1i .6i 1.1i 1.6i
+ typedef
+ struct lock
+ {
+ int locked;
+ } Lock;
+
+ typedef
+ struct node
+ {
+ int type;
+ union
+ {
+ double dval;
+ float fval;
+ long lval;
+ };
+ Lock;
+ } Node;
+
+ Lock* lock;
+ Node* node;
+.R
+.DE
+The declaration of
+.CW Node
+has an unnamed substructure of type
+.CW Lock
+and an unnamed subunion.
+One use of this feature allows elements of the
+subunit to be referenced as if they were in
+the outer structure.
+Thus
+.CW node->dval
+and
+.CW node->locked
+are legitimate references.
+.LP
+When an outer structure is used
+in a context that is only legal for
+an unnamed substructure,
+the compiler promotes the reference to the
+unnamed substructure.
+This is true for references to structures and
+for references to pointers to structures.
+This happens in assignment statements and
+in argument passing where prototypes have been
+declared.
+Thus, continuing with the example,
+.DS
+.CW
+.ta .1i .6i 1.1i 1.6i
+ lock = node;
+.R
+.DE
+would assign a pointer to the unnamed
+.CW Lock
+in
+the
+.CW Node
+to the variable
+.CW lock .
+Another example,
+.DS
+.CW
+.ta .1i .6i 1.1i 1.6i
+ extern void lock(Lock*);
+ func(...)
+ {
+ ...
+ lock(node);
+ ...
+ }
+.R
+.DE
+will pass a pointer to the
+.CW Lock
+substructure.
+.LP
+Finally, in places where context is insufficient to identify the unnamed structure,
+the type name (it must be a
+.CW typedef )
+of the unnamed structure can be used as an identifier.
+In our example,
+.CW &node->Lock
+gives the address of the anonymous
+.CW Lock
+structure.
+.NH 2
+Structure displays
+.LP
+A structure cast followed by a list of expressions in braces is
+an expression with the type of the structure and elements assigned from
+the corresponding list.
+Structures are now almost first-class citizens of the language.
+It is common to see code like this:
+.DS
+.CW
+.ta .1i
+ r = (Rectangle){point1, (Point){x,y+2}};
+.R
+.DE
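+.LP
+C99 has a closely similar construct, the compound literal.
+The following is a minimal sketch, not taken from the Plan 9 sources;
+the
+.CW Point
+and
+.CW Rectangle
+definitions and the function
+.CW mkrect
+are stand-ins assumed here for illustration.

```c
/* Hypothetical stand-ins for the graphics types used in the text. */
typedef struct Point { int x, y; } Point;
typedef struct Rectangle { Point min, max; } Rectangle;

/* Build a rectangle from one corner and a computed point, using displays. */
Rectangle
mkrect(Point point1, int x, int y)
{
	Rectangle r;

	r = (Rectangle){point1, (Point){x, y+2}};
	return r;
}
```

+A display can also appear directly as a function argument,
+which avoids naming a temporary.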
+.NH 2
+Initialization indexes
+.LP
+In initializers of arrays,
+one may place a constant expression
+in square brackets before an initializer.
+This causes the next initializer to assign
+the indicated element.
+For example:
+.DS
+.CW
+.ta .1i .6i 1.6i
+ enum errors
+ {
+ Etoobig,
+ Ealarm,
+ Egreg
+ };
+ char* errstrings[] =
+ {
+ [Ealarm] "Alarm call",
+ [Egreg] "Panic: out of mbufs",
+ [Etoobig] "Arg list too long",
+ };
+.R
+.DE
+In the same way,
+individual structure members may
+be initialized in any order by preceding the initialization with
+.CW .tagname .
+Both forms allow an optional
+.CW = ,
+to be compatible with a proposed
+extension to ANSI C.
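+.LP
+The form with
+.CW =
+is the one that C99 later standardized as designated initializers.
+As a hedged sketch, the example above can be written so that a C99 compiler accepts it;
+the
+.CW Conf
+structure is hypothetical and is included only to show the member form.

```c
enum errors { Etoobig, Ealarm, Egreg };

/* Array elements initialized out of order; '=' is required in standard C. */
char *errstrings[] = {
	[Ealarm]  = "Alarm call",
	[Egreg]   = "Panic: out of mbufs",
	[Etoobig] = "Arg list too long",
};

/* Hypothetical structure, members initialized in any order by name. */
typedef struct Conf { int nproc; int nmach; } Conf;
Conf conf = { .nmach = 1, .nproc = 100 };
```

+Because each string is tied to its enumerator,
+reordering the enumeration does not silently mismatch the table.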
+.NH 2
+External register
+.LP
+The declaration
+.CW extern
+.CW register
+will dedicate a register to
+a variable on a global basis.
+It can be used only under special circumstances.
+External register variables must be identically
+declared in all modules and
+libraries.
+The feature is not intended for efficiency,
+although it can produce efficient code;
+rather it represents a unique storage class that
+would be hard to get any other way.
+On a shared-memory multi-processor,
+an external register is
+one-per-processor and neither one-per-procedure (automatic)
+nor one-per-system (external).
+It is used for two variables in the Plan 9 kernel,
+.CW u
+and
+.CW m .
+.CW U
+is a pointer to the structure representing the currently running process
+and
+.CW m
+is a pointer to the per-machine data structure.
+.NH 2
+Long long
+.LP
+The compilers accept
+.CW long
+.CW long
+as a basic type meaning 64-bit integer.
+On all of the machines
+this type is synthesized from 32-bit instructions.
+.NH 2
+Pragma
+.LP
+The compilers accept
+.CW #pragma
+.CW lib
+.I libname
+and pass the
+library name string uninterpreted
+to the loader.
+The loader uses the library name to
+find libraries to load.
+If the name contains
+.CW $O ,
+it is replaced with
+the single character object type of the compiler
+(e.g.,
+.CW v
+for the MIPS).
+If the name contains
+.CW $M ,
+it is replaced with
+the architecture type for the compiler
+(e.g.,
+.CW mips
+for the MIPS).
+If the name starts with
+.CW /
+it is an absolute pathname;
+if it starts with
+.CW .
+then it is searched for in the loader's current directory.
+Otherwise, the name is searched from
+.CW /$M/lib .
+Such
+.CW #pragma
+statements in header files guarantee that the correct
+libraries are always linked with a program without the
+need to specify them explicitly at link time.
+.LP
+They also accept
+.CW #pragma
+.CW packed
+.CW on
+(or
+.CW yes
+or
+.CW 1 )
+to cause subsequently declared data, until
+.CW #pragma
+.CW packed
+.CW off
+(or
+.CW no
+or
+.CW 0 ),
+to be laid out in memory tightly packed in successive bytes, disregarding
+the usual alignment rules.
+Accessing such data can cause faults.
+.LP
+Similarly,
+.CW #pragma
+.CW profile
+.CW off
+(or
+.CW no
+or
+.CW 0 )
+causes subsequently declared functions, until
+.CW #pragma
+.CW profile
+.CW on
+(or
+.CW yes
+or
+.CW 1 ),
+to be marked as unprofiled.
+Such functions will not be profiled when
+profiling is enabled for the rest of the program.
+.LP
+Two
+.CW #pragma
+statements allow type-checking of
+.CW print -like
+functions.
+The first, of the form
+.P1
+#pragma varargck argpos error 2
+.P2
+tells the compiler that the second argument to
+.CW error
+is a
+.CW print
+format string (see the manual page
+.I print (2))
+that specifies how to format
+.CW error 's
+subsequent arguments.
+The second, of the form
+.P1
+#pragma varargck type "s" char*
+.P2
+says that the
+.CW print
+format verb
+.CW s
+processes an argument of
+type
+.CW char* .
+If the compiler's
+.CW -F
+option is enabled, the compiler will use this information
+to report type violations in the arguments to
+.CW print ,
+.CW error ,
+and similar routines.
+.NH
+Object module conventions
+.LP
+The overall conventions of the runtime environment
+are important
+to runtime efficiency.
+In this section,
+several of these conventions are discussed.
+.NH 2
+Register saving
+.LP
+In the Plan 9 compilers,
+the caller of a procedure saves the registers.
+With caller-saves,
+the leaf procedures can use all the
+registers and never save them.
+If you spend a lot of time at the leaves,
+this seems preferable.
+With callee-saves,
+the saving of the registers is done
+in the single point of entry and return.
+If you are interested in space,
+this seems preferable.
+In both,
+there is a degree of uncertainty
+about what registers need to be saved.
+Callee-saved registers make it difficult to
+find variables in registers in debuggers.
+Callee-saved registers also complicate
+the implementation of
+.CW longjmp .
+The convincing argument is
+that with caller-saves,
+the decision to registerize a variable
+can include the cost of saving the register
+across calls.
+For a further discussion of caller- vs. callee-saves,
+see the paper by Davidson and Whalley [Dav91].
+.LP
+In the Plan 9 operating system,
+calls to the kernel look like normal procedure
+calls, which means
+the caller
+has saved the registers and the system
+entry does not have to.
+This makes system calls considerably faster.
+Since this is a potential security hole,
+and can lead to non-determinism,
+the system may eventually save the registers
+on entry,
+or more likely clear the registers on return.
+.NH 2
+Calling convention
+.LP
+Older C compilers maintain a frame pointer, which is at a known constant
+offset from the stack pointer within each function.
+For machines where the stack grows towards zero,
+the argument pointer is at a known constant offset
+from the frame pointer.
+Since the stack grows down in Plan 9,
+the Plan 9 compilers
+keep neither an
+explicit frame pointer nor
+an explicit argument pointer;
+instead they generate addresses relative to the stack pointer.
+.LP
+On some architectures, the first argument to a subroutine is passed in a register.
+.NH 2
+Functions returning structures
+.LP
+Structures longer than one word are awkward to implement
+since they do not fit in registers and must
+be passed around in memory.
+Functions that return structures
+are particularly clumsy.
+The Plan 9 compilers pass the return address of
+a structure as the first argument of a
+function that has a structure return value.
+Thus
+.DS
+.CW
+.ta .1i .6i 1.1i 1.6i
+ x = f(...)
+.R
+.DE
+is rewritten as
+.DS
+.CW
+.ta .1i .6i 1.1i 1.6i
+ f(&x, ...)\f1.
+.R
+.DE
+This saves a copy and makes the compilation
+much less clumsy.
+A disadvantage is that if you call this
+function without an assignment,
+a dummy location must be invented.
+.LP
+There is also a danger of calling a function
+that returns a structure without declaring
+it as such.
+With ANSI C function prototypes,
+this error need never occur.
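+.LP
+The rewriting can be pictured at the source level.
+This is only an illustrative sketch under assumed names
+.CW Big , (
+.CW f ,
+and
+.CW f_rewritten
+are hypothetical);
+the compiler performs the transformation internally rather than on C text.

```c
typedef struct Big { long a, b, c; } Big;	/* hypothetical multi-word structure */

/* What the programmer writes: a function with a structure return value. */
Big
f(long n)
{
	Big r = { n, n+1, n+2 };

	return r;
}

/* Roughly what is arranged: the caller supplies the address of the result. */
void
f_rewritten(Big *ret, long n)
{
	ret->a = n;
	ret->b = n+1;
	ret->c = n+2;
}
```

+A caller that ignores the result of
+.CW f
+must still supply some address,
+which is the dummy location mentioned above.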
+.NH
+Implementation
+.LP
+The compiler is divided internally into
+four machine-independent passes,
+four machine-dependent passes,
+and an output pass.
+The next nine sections describe each pass in order.
+.NH 2
+Parsing
+.LP
+The first pass is a YACC-based parser
+[Joh79].
+Declarations are interpreted immediately,
+building a block structured symbol table.
+Executable statements are put into a parse tree
+and collected,
+without interpretation.
+At the end of each procedure,
+the parse tree for the function is
+examined by the other passes of the compiler.
+.LP
+The input stream of the parser is
+a pushdown list of input activations.
+The preprocessor
+expansions of
+macros
+and
+.CW #include
+are implemented as pushdowns.
+Thus there is no separate
+pass for preprocessing.
+.NH 2
+Typing
+.LP
+The next pass distributes typing information
+to every node of the tree.
+Implicit operations on the tree are added,
+such as type promotions and taking the
+address of arrays and functions.
+.NH 2
+Machine-independent optimization
+.LP
+The next pass performs optimizations
+and transformations of the tree, such as converting
+.CW &*x
+and
+.CW *&x
+into
+.CW x .
+Constant expressions are converted to constants in this pass.
+.NH 2
+Arithmetic rewrites
+.LP
+This is another machine-independent optimization.
+Subtrees of add, subtract, and multiply of integers are
+rewritten for easier compilation.
+The major transformation is factoring:
+.CW 4+8*a+16*b+5
+is transformed into
+.CW 9+8*(a+2*b) .
+Such expressions arise from address
+manipulation and array indexing.
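+.LP
+The equivalence of the two forms can be checked directly.
+The sketch below is only a sanity check of the algebra;
+the function names
+.CW before
+and
+.CW after
+are assumptions made for the example.

```c
/* Original form, as it might fall out of array indexing. */
long
before(long a, long b)
{
	return 4 + 8*a + 16*b + 5;
}

/* Factored form: constants folded and the common factor 8 pulled out. */
long
after(long a, long b)
{
	return 9 + 8*(a + 2*b);
}
```

+The factored form needs two multiplications by powers of two and two additions,
+which map well onto shifts and address arithmetic.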
+.NH 2
+Addressability
+.LP
+This is the first of the machine-dependent passes.
+The addressability of a processor is defined as the set of
+expressions that is legal in the address field
+of a machine language instruction.
+The addressability of different processors varies widely.
+At one end of the spectrum are the 68020 and VAX,
+which allow a complex mix of incrementing,
+decrementing,
+indexing, and relative addressing.
+At the other end is the MIPS,
+which allows only registers and constant offsets from the
+contents of a register.
+The addressability can be different for different instructions
+within the same processor.
+.LP
+It is important to the code generator to know when a
+subtree represents an address of a particular type.
+This is done with a bottom-up walk of the tree.
+In this pass, the leaves are labeled with small integers.
+When an internal node is encountered,
+it is labeled by consulting a table indexed by the
+labels on the left and right subtrees.
+For example,
+on the 68020 processor,
+it is possible to address an
+offset from a named location.
+In C, this is represented by the expression
+.CW *(&name+constant) .
+This is marked addressable by the following table.
+In the table,
+a node represented by the left column is marked
+with a small integer from the right column.
+Marks of the form
+.CW A\s-2\di\u\s0
+are addressable while
+marks of the form
+.CW N\s-2\di\u\s0
+are not addressable.
+.DS
+.B
+.ta .1i 1.1i
+ Node Marked
+.CW
+ name A\s-2\d1\u\s0
+ const A\s-2\d2\u\s0
+ &A\s-2\d1\u\s0 A\s-2\d3\u\s0
+ A\s-2\d3\u\s0+A\s-2\d1\u\s0 N\s-2\d1\u\s0 \fR(note that this is not addressable)\fP
+ *N\s-2\d1\u\s0 A\s-2\d4\u\s0
+.R
+.DE
+Here there is a distinction between
+a node marked
+.CW A\s-2\d1\u\s0
+and a node marked
+.CW A\s-2\d4\u\s0
+because the address operator of an
+.CW A\s-2\d4\u\s0
+node is not addressable.
+So to extend the table:
+.DS
+.B
+.ta .1i 1.1i
+ Node Marked
+.CW
+ &A\s-2\d4\u\s0 N\s-2\d2\u\s0
+ N\s-2\d2\u\s0+N\s-2\d1\u\s0 N\s-2\d1\u\s0
+.R
+.DE
+The full addressability of the 68020 is expressed
+in 18 rules like this,
+while the addressability of the MIPS is expressed
+in 11 rules.
+When one ports the compiler,
+this table is usually initialized
+so that leaves are labeled as addressable and nothing else.
+The code produced is poor,
+but porting is easy.
+The table can be extended later.
+.LP
+This pass also rewrites some complex operators
+into procedure calls.
+Examples include 64-bit multiply and divide.
+.LP
+In the same bottom-up pass of the tree,
+the nodes are labeled with a Sethi-Ullman complexity
+[Set70].
+This number is roughly the number of registers required
+to compile the tree on an ideal machine.
+An addressable node is marked 0.
+A function call is marked infinite.
+A unary operator is marked as the
+maximum of 1 and the mark of its subtree.
+A binary operator with equal marks on its subtrees is
+marked with a subtree mark plus 1.
+A binary operator with unequal marks on its subtrees is
+marked with the maximum mark of its subtrees.
+The actual values of the marks are not too important,
+but the relative values are.
+The goal is to compile the harder
+(larger mark)
+subtree first.
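+.LP
+The marking rules above can be sketched in a few lines of C.
+This is not the compiler's code:
+the node kinds, the
+.CW Node
+layout, the use of
+.CW INT_MAX
+for an infinite mark, and the treatment of calls as leaves
+are all simplifying assumptions.

```c
#include <limits.h>

enum { Oaddr, Ocall, Ounary, Obinary };	/* hypothetical node kinds */
enum { Inf = INT_MAX };			/* stands for an infinite mark */

typedef struct Node Node;
struct Node {
	int	op;
	Node	*left;
	Node	*right;	/* unused for unary operators */
	int	mark;
};

static int
imax(int a, int b)
{
	return a > b ? a : b;
}

/* Label the tree bottom-up and return the mark of the root. */
int
sucalc(Node *n)
{
	int l, r;

	switch(n->op){
	case Oaddr:		/* addressable node */
		n->mark = 0;
		break;
	case Ocall:		/* function call */
		n->mark = Inf;
		break;
	case Ounary:
		n->mark = imax(1, sucalc(n->left));
		break;
	default:		/* Obinary */
		l = sucalc(n->left);
		r = sucalc(n->right);
		if(l == r && l != Inf)
			n->mark = l + 1;	/* equal marks: one more register */
		else
			n->mark = imax(l, r);	/* unequal, or both infinite */
		break;
	}
	return n->mark;
}
```

+A code generator would then visit the child with the larger mark first.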
+.NH 2
+Code generation
+.LP
+Code is generated by recursive
+descent.
+The Sethi-Ullman complexity completely guides the
+order.
+The addressability defines the leaves.
+The only difficult part is compiling a tree
+that has two infinite (function call)
+subtrees.
+In this case,
+one subtree is compiled into the return register
+(usually the most convenient place for a function call)
+and then stored on the stack.
+The other subtree is compiled into the return register
+and then the operation is compiled with
+operands from the stack and the return register.
+.LP
+There is a separate boolean code generator that compiles
+conditional expressions.
+This is fundamentally different from compiling an arithmetic expression.
+The result of the boolean code generator is the
+position of the program counter and not an expression.
+The boolean code generator makes extensive use of De Morgan's rule.
+The boolean code generator is an expanded version of that described
+in chapter 8 of Aho, Sethi, and Ullman
+[Aho87].
+.LP
+There is a considerable amount of talk in the literature
+about automating this part of a compiler with a machine
+description.
+Since this code generator is so small
+(less than 500 lines of C)
+and easy,
+it hardly seems worth the effort.
+.NH 2
+Registerization
+.LP
+Up to now,
+the compiler has operated on syntax trees
+that are roughly equivalent to the original source language.
+The previous pass has produced machine language in an internal
+format.
+The next two passes operate on the internal machine language
+structures.
+The purpose of the next pass is to reintroduce
+registers for heavily used variables.
+.LP
+All of the variables that can be
+potentially registerized within a procedure are
+placed in a table.
+(Suitable variables are any automatic or external
+scalars that do not have their addresses extracted.
+Some constants that are hard to reference are also
+considered for registerization.)
+Four separate data flow equations are evaluated
+over the procedure on all of these variables.
+Two of the equations are the normal set-behind
+and used-ahead
+bits that define the life of a variable.
+The two new bits tell if a variable life
+crosses a function call ahead or behind.
+By examining a variable over its lifetime,
+it is possible to get a cost
+for registerizing.
+Loops are detected and the costs are multiplied
+by three for every level of loop nesting.
+Costs are sorted and the variables
+are replaced by available registers on a greedy basis.
+.LP
+The 68020 has two different
+types of registers.
+For the 68020,
+two different costs are calculated for
+each variable life and the register type that
+affords the better cost is used.
+Ties are broken by counting the number of available
+registers of each type.
+.LP
+Note that externals are registerized together with automatics.
+This is done by evaluating the semantics of a ``call'' instruction
+differently for externals and automatics.
+Since a call goes outside the local procedure,
+it is assumed that a call references all externals.
+Similarly,
+externals are assumed to be set before an ``entry'' instruction
+and assumed to be referenced after a ``return'' instruction.
+This makes sure that externals are in memory across calls.
+.LP
+The overall results are satisfactory.
+It would be nice to be able to do this processing in
+a machine-independent way,
+but it is impossible to get all of the costs and
+side effects of different choices by examining the parse tree.
+.LP
+Most of the code in the registerization pass is machine-independent.
+The major machine-dependency is in
+examining a machine instruction to ask if it sets or references
+a variable.
+.NH 2
+Machine code optimization
+.LP
+The next pass walks the machine code
+for opportunistic optimizations.
+For the most part,
+this is highly specific to a particular
+processor.
+One optimization that is performed
+on all of the processors is the
+removal of unnecessary ``move''
+instructions.
+Ironically,
+most of these instructions were inserted by
+the previous pass.
+There are two patterns that are repeatedly
+matched and replaced until no more matches are
+found.
+The first tries to remove ``move'' instructions
+by relabeling variables.
+.LP
+When a ``move'' instruction is encountered,
+if the destination variable is set before the
+source variable is referenced,
+then all of the references to the destination
+variable can be renamed to the source and the ``move''
+can be deleted.
+This transformation uses the reverse data flow
+set up in the previous pass.
+.LP
+An example of this pattern is depicted in the following
+table.
+The pattern is in the left column and the
+replacement action is in the right column.
+.DS
+.CW
+.ta .1i .6i 1.6i 2.1i 2.6i
+ MOVE a->b \fR(remove)\fP
+.R
+ (sequence with no mention of \f(CWa\fP)
+.CW
+ USE b USE a
+.R
+ (sequence with no mention of \f(CWa\fP)
+.CW
+ SET b SET b
+.R
+.DE
+.LP
+Experiments have shown that it is marginally
+worthwhile to rename uses of the destination variable
+with uses of the source variable up to
+the first use of the source variable.
+.LP
+The second transform will do relabeling
+without deleting instructions.
+When a ``move'' instruction is encountered,
+if the source variable has been set prior
+to the use of the destination variable
+then all of the references to the source
+variable are replaced by the destination and
+the ``move'' is inverted.
+Typically,
+this transformation will alter two ``move''
+instructions and allow the first transformation
+another chance to remove code.
+This transformation uses the forward data flow
+set up in the previous pass.
+.LP
+Again,
+the following is a depiction of the transformation where
+the pattern is in the left column and the
+rewrite is in the right column.
+.DS
+.CW
+.ta .1i .6i 1.6i 2.1i 2.6i
+ SET a SET b
+.R
+ (sequence with no use of \f(CWb\fP)
+.CW
+ USE a USE b
+.R
+ (sequence with no use of \f(CWb\fP)
+.CW
+ MOVE a->b MOVE b->a
+.R
+.DE
+Iterating these transformations
+will usually get rid of all redundant ``move'' instructions.
+.LP
+A problem with this organization is that the costs
+of registerization calculated in the previous pass
+must depend on how well this pass can detect and remove
+redundant instructions.
+Often,
+a fine candidate for registerization is rejected
+because of the cost of instructions that are later
+removed.
+.NH 2
+Writing the object file
+.LP
+The last pass walks the internal assembly language
+and writes the object file.
+The object file is reduced in size by about a factor
+of three with simple compression
+techniques.
+The most important aspect of the object file
+format is that it is independent of the compiling machine.
+All integer and floating numbers in the object
+code are converted to known formats and byte
+orders.
+.NH
+The loader
+.LP
+The loader is a multiple pass program that
+reads object files and libraries and produces
+an executable binary.
+The loader also does some minimal
+optimizations and code rewriting.
+Many of the operations performed by the
+loader are machine-dependent.
+.LP
+The first pass of the loader reads the
+object modules into an internal data
+structure that looks like binary assembly language.
+As the instructions are read,
+code is reordered to remove
+unconditional branch instructions.
+Conditional branch instructions are inverted
+to prevent the insertion of unconditional branches.
+The loader will also make a copy of a few instructions
+to remove an unconditional branch.
+.LP
+The next pass allocates addresses for
+all external data.
+Typical of processors is the MIPS,
+which can reference ±32K bytes from a
+register.
+The loader allocates the register
+.CW R30
+as the static pointer.
+The value placed in
+.CW R30
+is the base of the data segment plus 32K.
+It is then cheap to reference all data in the
+first 64K of the data segment.
+External variables are allocated to
+the data segment
+with the smallest variables allocated first.
+If all of the data cannot fit into the first
+64K of the data segment,
+then usually only a few large arrays
+need more expensive addressing modes.
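+.LP
+The arithmetic behind the 32K bias can be sketched as follows.
+The names
+.CW Bias ,
+.CW staticbase ,
+and
+.CW cheap
+are hypothetical;
+the sketch assumes only what is stated above,
+namely a signed 16-bit offset from the static pointer.

```c
#include <stdint.h>

enum { Bias = 32*1024 };

/* Value placed in the static pointer for a data segment starting at base. */
uint32_t
staticbase(uint32_t base)
{
	return base + Bias;
}

/* Return 1 if addr is reachable with a signed 16-bit offset from the static pointer. */
int
cheap(uint32_t base, uint32_t addr)
{
	int64_t off;

	off = (int64_t)addr - (int64_t)staticbase(base);
	return off >= INT16_MIN && off <= INT16_MAX;
}
```

+With the bias, offsets from \-32768 to 32767 cover exactly the first 64K of the data segment;
+without it, only the first 32K would be reachable cheaply.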
+.LP
+For the MIPS processor,
+the loader makes a pass over the internal
+structures,
+exchanging instructions to try
+to fill ``delay slots'' with useful work.
+If a useful instruction cannot be found
+to fill a delay slot,
+the loader will insert
+``noop''
+instructions.
+This pass is very expensive and does not
+do a good job.
+About 40% of all instructions are in
+delay slots.
+About 65% of these are useful instructions and
+35% are ``noops.''
+The vendor-supplied assembler does this job
+more effectively,
+filling about 80%
+of the delay slots with useful instructions.
+.LP
+On the 68020 processor,
+branch instructions come in a variety of
+sizes depending on the relative distance
+of the branch.
+Thus the size of branch instructions
+can be mutually dependent.
+The loader uses a multiple pass algorithm
+to resolve the branch lengths
+[Szy78].
+Initially, all branches are assumed minimal length.
+On each subsequent pass,
+the branches are reassessed
+and expanded if necessary.
+When no more expansions occur,
+the locations of the instructions in
+the text segment are known.
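+.LP
+The iteration described above can be sketched as follows.
+This is a simplified illustration, not the loader's code and not the algorithm of [Szy78]:
+the two branch sizes, the reach of a short branch, and the
+.CW Ins
+structure are assumptions chosen for the example
+and do not reflect actual 68020 encodings.

```c
enum {
	Short = 2,	/* hypothetical size of a short branch, in bytes */
	Long = 4,	/* hypothetical size of a long branch */
	Minoff = -128,	/* assumed reach of a short branch */
	Maxoff = 127,
};

typedef struct Ins Ins;
struct Ins {
	int	isbranch;	/* nonzero for a branch instruction */
	int	target;		/* index of the branch target */
	int	size;		/* size in bytes; set by the caller for non-branches */
	int	pc;		/* assigned offset in the text segment */
};

/* Assign offsets, expanding branches until none changes; return the text size. */
int
layout(Ins *p, int n)
{
	int i, pc, d, changed;

	for(i = 0; i < n; i++)
		if(p[i].isbranch)
			p[i].size = Short;	/* start with minimal length */
	do{
		changed = 0;
		pc = 0;
		for(i = 0; i < n; i++){
			p[i].pc = pc;
			pc += p[i].size;
		}
		for(i = 0; i < n; i++){
			if(!p[i].isbranch || p[i].size != Short)
				continue;
			d = p[p[i].target].pc - (p[i].pc + p[i].size);
			if(d < Minoff || d > Maxoff){
				p[i].size = Long;	/* out of reach: expand */
				changed = 1;
			}
		}
	}while(changed);
	return pc;
}
```

+Since branches only grow, the loop terminates after at most one pass per branch plus one.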
+.LP
+On the MIPS processor,
+all instructions are one size.
+A single pass over the instructions will
+determine the locations of all addresses
+in the text segment.
+.LP
+The last pass of the loader produces the
+executable binary.
+A symbol table and other tables are
+produced to help the debugger to
+interpret the binary symbolically.
+.LP
+The loader places absolute source line numbers in the symbol table.
+The name and absolute line number of all
+.CW #include
+files are also placed in the
+symbol table so that the debuggers can
+associate object code to source files.
+.NH
+Performance
+.LP
+The following is a table of the source size of the MIPS
+compiler.
+.DS
+.ta .1i .6i
+ lines module
+ \0509 machine-independent headers
+ 1070 machine-independent YACC source
+ 6090 machine-independent C source
+
+ \0545 machine-dependent headers
+ 6532 machine-dependent C source
+
+ \0298 loader headers
+ 5215 loader C source
+.DE
+.LP
+The following table shows timing
+of a test program
+that plays checkers, running on a MIPS R4000.
+The test program is 26 files totaling 12600 lines of C.
+The execution time does not significantly
+depend on library implementation.
+Since no other compiler runs on Plan 9,
+the Plan 9 tests were done with the Plan 9 operating system;
+the other tests were done on the vendor's operating system.
+The hardware was identical in both cases.
+The optimizer in the vendor's compiler
+is reputed to be extremely good.
+.DS
+.ta .1i .9i
+ \0\04.49s Plan 9 \f(CWvc\fP \f(CW-N\fP compile time (opposite of \f(CW-O\fP)
+ \0\01.72s Plan 9 \f(CWvc\fP \f(CW-N\fP load time
+ 148.69s Plan 9 \f(CWvc\fP \f(CW-N\fP run time
+
+ \015.07s Plan 9 \f(CWvc\fP compile time (\f(CW-O\fP implicit)
+ \0\01.66s Plan 9 \f(CWvc\fP load time
+ \089.96s Plan 9 \f(CWvc\fP run time
+
+ \014.83s vendor \f(CWcc\fP compile time
+ \0\00.38s vendor \f(CWcc\fP load time
+ 104.75s vendor \f(CWcc\fP run time
+
+ \043.59s vendor \f(CWcc\fP \f(CW-O\fP compile time
+ \0\00.38s vendor \f(CWcc\fP \f(CW-O\fP load time
+ \076.19s vendor \f(CWcc\fP \f(CW-O\fP run time
+
+ \0\08.19s vendor \f(CWcc\fP \f(CW-O3\fP compile time
+ \035.97s vendor \f(CWcc\fP \f(CW-O3\fP load time
+ \071.16s vendor \f(CWcc\fP \f(CW-O3\fP run time
+.DE
+.LP
+To compare the Intel compiler,
+a program that is about 40% bit manipulation and
+about 60% single precision floating point was
+run on the same 33 MHz 486, once under Windows
+compiled with the Watcom compiler, version 10.0,
+in 16-bit mode and once under
+Plan 9 in 32-bit mode.
+The Plan 9 execution time was 27 sec while the Windows
+execution time was 31 sec.
+.NH
+Conclusions
+.LP
+The new compilers compile
+quickly,
+load slowly,
+and produce
+medium quality
+object code.
+The compilers are relatively
+portable,
+requiring but a couple of weeks' work to
+produce a compiler for a different computer.
+For Plan 9,
+where we needed several compilers
+with specialized features and
+our own object formats,
+this project was indispensable.
+It is also necessary for us to
+be able to freely distribute our compilers
+with the Plan 9 distribution.
+.LP
+Two problems have come up in retrospect.
+The first has to do with the
+division of labor between compiler and loader.
+Plan 9 runs on multi-processors and as such
+compilations are often done in parallel.
+Unfortunately,
+all compilations must be complete before loading
+can begin.
+The load is then single-threaded.
+With this model,
+any shift of work from compile to load
+results in a significant increase in real time.
+The same is true of libraries that are compiled
+infrequently and loaded often.
+In the future,
+we may try to put some of the loader work
+back into the compiler.
+.LP
+The second problem comes from
+the various optimizations performed over several
+passes.
+Often optimizations in different passes depend
+on each other.
+Iterating the passes could compromise efficiency,
+or even loop.
+We see no real solution to this problem.
+.NH
+References
+.LP
+[Aho87] A. V. Aho, R. Sethi, and J. D. Ullman,
+.I
+Compilers \- Principles, Techniques, and Tools,
+.R
+Addison Wesley,
+Reading, MA,
+1987.
+.LP
+[ANSI90] \f2American National Standard for Information Systems \-
+Programming Language C\f1, American National Standards Institute, Inc.,
+New York, 1990.
+.LP
+[Dav91] J. W. Davidson and D. B. Whalley,
+``Methods for Saving and Restoring Register Values across Function Calls'',
+.I
+Software\-Practice and Experience,
+.R
+Vol 21(2), pp. 149-165, February 1991.
+.LP
+[Joh79] S. C. Johnson,
+``YACC \- Yet Another Compiler Compiler'',
+.I
+UNIX Programmer's Manual, Seventh Ed., Vol. 2A,
+.R
+AT&T Bell Laboratories,
+Murray Hill, NJ,
+1979.
+.LP
+[Set70] R. Sethi and J. D. Ullman,
+``The Generation of Optimal Code for Arithmetic Expressions'',
+.I
+Journal of the ACM,
+.R
+Vol 17(4), pp. 715-728, 1970.
+.LP
+[Szy78] T. G. Szymanski,
+``Assembling Code for Machines with Span-dependent Instructions'',
+.I
+Communications of the ACM,
+.R
+Vol 21(4), pp. 300-308, 1978.
binary files /dev/null b/doc/compiler.pdf differ
--- /dev/null
+++ b/doc/descent/descent.ms
@@ -1,0 +1,2056 @@
+.de EX
+.nr x \\$1v
+\\!h0c n \\nx 0
+..
+.de FG \" start figure caption: .FG filename.ps verticalsize
+.KF
+.BP \\$1 \\$2
+.sp .5v
+.EX \\$2v
+.ps -1
+.vs -1
+..
+.de fg \" end figure caption (yes, it is clumsy)
+.ps
+.vs
+.br
+.KE
+..
+.TL
+A Descent into Limbo
+.AU
+Brian W. Kernighan
+.AI
+bwk@bell-labs.com
+.br
+Revised April 2005 by Vita Nuova
+.AB
+.DS B
+.ps -2
+.vs -1
+``If, reader, you are slow now to believe
+What I shall tell, that is no cause for wonder,
+For I who saw it hardly can accept it.''
+.ft R
+ Dante Alighieri, \fIInferno\fP, Canto XXV.
+.ps +2
+.vs +1
+.DE
+.LP
+Limbo is a new programming language, designed by
+Sean Dorward, Phil Winterbottom, and Rob Pike.
+Limbo borrows from, among other things,
+C (expression syntax and control flow),
+Pascal (declarations),
+Winterbottom's Alef (abstract data types and channels),
+and Hoare's CSP and Pike's Newsqueak (processes).
+Limbo is strongly typed, provides automatic garbage collection,
+supports only very restricted pointers,
+and compiles into machine-independent byte code for execution on
+a virtual machine.
+.LP
+This paper is a brief introduction to Limbo.
+Since Limbo is an integral part of the Inferno system,
+the examples here illustrate not only
+the language but also a certain amount about how to write
+programs to run within Inferno.
+.AE
+.NH 1
+Introduction
+.LP
+This document is a quick look at the basics
+of Limbo; it is not a replacement for the reference manual.
+The first section is a short overview of
+concepts and constructs;
+subsequent sections illustrate the language with examples.
+Although Limbo is intended to be used in Inferno,
+which emphasizes networking and graphical interfaces,
+the discussion here begins with standard text-manipulation
+examples, since they require less background to understand.
+.SH
+Modules:
+.LP
+A Limbo program is a set of modules that cooperate
+to perform a task.
+In source form, a module consists of a
+.CW "module"
+declaration that specifies the public interface \- the functions,
+abstract data types,
+and constants that the module makes visible to other modules \-
+and an implementation that provides the actual code.
+By convention, the module declaration is usually placed in a separate
+.CW ".m"
+file so it can be included by other modules,
+and the implementation is stored in a
+.CW ".b"
+file.
+Modules may have multiple implementations,
+each in a separate implementation file.
+.LP
+Modules are always loaded dynamically, at run time: the Limbo
+.CW "load"
+operator fetches the code and performs run-time type checking.
+Once a module has been loaded, its functions can be called.
+Several instances of the same module type can be in use at once,
+with possibly different implementations.
+.LP
+Limbo is strongly typed; programs are checked at compile time,
+and further when modules are loaded.
+The Limbo compiler compiles each source file into a
+machine-independent byte-coded
+.CW ".dis"
+file that can be loaded at run time.
+.SH
+Functions and variables:
+.LP
+Functions are associated with specific modules, either directly or
+as members of abstract data types within a module.
+Functions are visible outside their module only
+if they are part of the module interface.
+If the target module is loaded, specific names
+can be used in a qualified form like
+.CW "sys->print"
+or without the qualifier if imported with an explicit
+.CW "import"
+statement.
+.LP
+Besides normal block structure within functions,
+variables may have global scope within a module;
+module data can be accessed via the module pointer.
+.SH
+Data:
+.LP
+The numeric types are:
+.RS
+.TS
+lf(CW) lf(R)w(3i) .
+byte	unsigned, 8 bits
+int	signed, 32 bits
+big	signed, 64 bits
+real	IEEE long float, 64 bits
+.TE
+.RE
+The size and signedness of integral types are
+as specified above, and will be the same everywhere.
+Character constants are enclosed in single quotes
+and may use escapes like
+.CW "'\en'"
+or
+.CW "'\eudddd'" ,
+but the characters themselves
+are in Unicode and have type
+.CW "int" .
+There is no enumeration type, but there is a
+.CW "con"
+declaration that creates a named constant, and a special
+.CW "iota"
+operation that can be used to generate unique values.
+.LP
+Limbo also provides
+Unicode strings,
+arrays of arbitrary types,
+lists of arbitrary types,
+tuples (in effect, unnamed structures with unnamed members of arbitrary types),
+abstract data types or adt's (in effect, named structures with function
+members as well as data members),
+reference types (in effect, restricted pointers that can point only to adt objects),
+and
+typed channels (for passing objects between processes).
+.LP
+A channel is a mechanism for synchronized communication.
+It provides a place for one process to send or receive
+an object of a specific type;
+the attempt to send or receive blocks until a matching receive or send
+is attempted by another process.
+The
+.CW "alt"
+statement selects randomly but fairly among channels
+that are ready to read or write.
+The
+.CW "spawn"
+statement creates a new process that,
+except for its stack, shares memory with other processes.
+Processes are pre-emptively scheduled by the Inferno kernel.
+(What Inferno calls processes are sometimes called ``threads'' in
+other operating systems.)
+.LP
+Limbo performs automatic garbage collection, so there is no
+need to free dynamically created objects.
+Objects are deleted and their resources freed when
+the last reference to them goes away.
+This release of resources happens immediately
+(``instant free'') for non-cyclic structures;
+release of cyclic data structures might be delayed but will happen eventually.
+(The language allows the programmer to ensure a given structure is non-cyclic
+when required.)
+.SH
+Operators and expressions:
+.LP
+Limbo provides many of C's operators,
+but not the
+.CW "?:"
+or
+`comma' (sequential execution) operators.
+Pointers, or `references', created with
+.CW "ref" ,
+are restricted compared to C: they can only refer to adt values on the heap.
+There is no
+.CW "&"
+(address of) operator, nor is address arithmetic possible.
+Arrays are also reference types, however,
+and array slicing is supported;
+together these replace
+many of C's pointer constructions.
+.LP
+There are no implicit coercions between types,
+and only a handful of explicit casts.
+The numeric types
+.CW "byte" ,
+.CW "int" ,
+etc., can be used to convert a numeric expression, as in
+.P1
+nl := byte 10;
+.P2
+and
+.CW "string"
+can be used as a unary operator to convert any numeric expression
+to a string (in
+.CW "%g"
+format) and to convert an array of bytes in UTF-8 format to a Limbo
+.CW string
+value.
+In the other direction, the cast
+.CW "array of byte"
+converts a string to its UTF-8 representation in an array of bytes.
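+.LP
+The following is a minimal sketch of these conversions, not taken from
+the Inferno sources; the module name
+.CW Casts
+and the sample values are invented for illustration, and the
+surrounding declarations are the standard ones for a command,
+described with the first example below.
+.P1
+implement Casts;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+
+Casts: module
+{
+	init: fn(ctxt: ref Draw->Context, args: list of string);
+};
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+
+	s := "caf\eu00e9";	# a string with one non-ASCII character
+	b := array of byte s;	# its UTF-8 representation
+	t := string b;	# and back to a string
+	n := string 42;	# numeric expression to string
+	sys->print("%d %d %s %s\en", len s, len b, t, n);
+}
+.P2
+Assuming the escape
+.CW "\eu00e9"
+is accepted in string constants as it is in character constants,
+.CW "len s"
+is 4 and
+.CW "len b"
+is 5, since the accented letter occupies two bytes in UTF-8.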
+.SH
+Statements:
+.LP
+Statements and control flow in Limbo are similar to those in C.
+A statement is an expression followed by a semicolon,
+or a sequence of statements enclosed in braces.
+The similar control flow statements are
+.P1
+if (\fIexpr\fP) \fIstat\fP
+if (\fIexpr\fP) \fIstat\fP else \fIstat\fP
+while (\fIexpr\fP) \fIstat\fP
+for (\fIexpr\fP; \fIexpr\fP; \fIexpr\fP) \fIstat\fP
+do \fIstat\fP while (\fIexpr\fP) ;
+return \fIexpr\fP ;
+exit ;
+.P2
+The
+.CW "exit"
+statement terminates a process and frees its resources.
+There is also a
+.CW "case"
+statement analogous to C's
+.CW "switch" ,
+but it differs in that it also supports string and range tests,
+and more critically, control does not ``fall through'' from one arm of the case to the next
+but stops without requiring an explicit
+.CW break
+(in that respect it is closer to Pascal's
+.CW case
+statement, hence the change of name).
+A
+.CW "break"
+or
+.CW "continue"
+followed by a label
+causes a break out of, or the next iteration of, the enclosing
+construct that is labeled with the same label.
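+.LP
+As a sketch of these two features together (the module name
+.CW Kinds ,
+the function
+.CW kind
+and the label
+.CW scan
+are all invented for illustration; the enclosing declarations
+are the standard ones for a command):
+.P1
+implement Kinds;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+
+Kinds: module
+{
+	init: fn(ctxt: ref Draw->Context, args: list of string);
+};
+
+kind(c: int): string
+{
+	k := "other";
+	case c {
+	'0' to '9' =>
+		k = "digit";	# a range test
+	'a' to 'z' or 'A' to 'Z' =>
+		k = "letter";	# two ranges in one arm
+	' ' or '\et' or '\en' =>
+		k = "space";
+	}
+	return k;
+}
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+	s := "ab 12!x";
+scan:
+	for (i := 0; i < len s; i++) {
+		case kind(s[i]) {	# a case on strings
+		"space" =>
+			continue scan;	# next iteration of the labeled loop
+		"other" =>
+			break scan;	# leave the loop, not just the case
+		}
+		sys->print("%c is a %s\en", s[i], kind(s[i]));
+	}
+}
+.P2
+No arm needs a
+.CW break
+to stop it running into the next one; the labels are needed only because
+the
+.CW break
+and
+.CW continue
+here are meant to act on the enclosing loop rather than on the
+.CW case .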
+.LP
+Comments begin with
+.CW "#"
+and extend to the end of the line.
+There is no preprocessor, but an
+.CW "include"
+statement can be used to include source code, usually module declaration files.
+.SH
+Libraries:
+.LP
+Limbo has an extensive and growing set of standard libraries,
+each implemented as a module.
+A handful of these
+(notably
+.CW "Sys" ,
+.CW "Draw" ,
+and
+.CW "Tk" )
+are included in the Inferno kernel because they will be
+needed to support almost any Limbo program.
+Among the others are
+.CW "Bufio" ,
+a buffered I/O package based on Plan 9's Bio;
+.CW "Regex" ,
+for regular expressions;
+and
+.CW "Math" ,
+for mathematical functions.
+Some of the examples that follow provide the sort
+of functionality that would suit a library module.
+.NH 1
+Examples
+.LP
+The examples in this section are each complete, in the sense that they
+will run as presented; I have tried to avoid code fragments
+that merely illustrate syntax.
+.NH 2
+Hello, World
+.LP
+The first example is the traditional ``hello, world'',
+in the file
+.CW "hello.b" :
+.P1
+implement Hello;
+
+include "sys.m";
+ sys: Sys;
+include "draw.m";
+
+Hello: module
+{
+ init: fn(ctxt: ref Draw->Context, args: list of string);
+};
+
+init(ctxt: ref Draw->Context, args: list of string)
+{
+ sys = load Sys Sys->PATH;
+ sys->print("hello, world\en");
+}
+.P2
+An implementation file implements a single module,
+named in the
+.CW "implement"
+declaration at the top of the file.
+The two
+.CW "include"
+lines copy interface definitions from two other modules,
+.CW "Sys"
+(which describes a variety of system functions like
+.CW "print" ),
+and
+.CW "Draw"
+(which describes a variety of graphics types and functions,
+only one of which,
+.CW "Context" ,
+is used here).
+.LP
+The
+.CW "module"
+declaration defines the external interface that this module
+presents to the rest of the world.
+In this case, it's a single function named
+.CW "init" .
+Since this module is to be called from a command interpreter
+(shell), by convention its
+.CW "init"
+function takes two arguments,
+the graphical context
+and a list of strings, the command-line arguments,
+though neither is used here.
+This is like
+.CW "main"
+in a C program.
+Essentially all of the other examples begin with this standard code.
+Commands are unusual, though, in that a command's module declaration
+appears in the same file as its implementation.
+.LP
+Most modules have a more extensive set of declarations; for example,
+.CW "draw.m"
+is 298 lines of constants, function prototypes, and
+type declarations for graphics types like
+.CW "Point"
+and
+.CW "Rect" ,
+and
+.CW "sys.m"
+is 160 lines of declarations for functions like
+.CW "open" ,
+.CW "read" ,
+and
+.CW "print" .
+Most module declarations are therefore stored in separate files,
+conventionally suffixed with
+.CW ".m" ,
+so they can be included in other modules.
+The system library module declaration files are collected in the
+.CW module
+directory at the root of the Inferno source tree.
+Modules that are components of a single program are typically
+stored in that program's source directory.
+.LP
+The last few lines of
+.CW "hello.b"
+are the implementation of the
+.CW "init"
+function, which loads the
+.CW "Sys"
+module, then calls its
+.CW "print"
+function.
+By convention, each module declaration includes a pathname constant
+that points to the code for the module; this is the second parameter
+.CW "Sys->PATH"
+of the
+.CW "load"
+statement.
+Note that the
+.CW Draw
+module is not loaded because none of its functions is used, but
+it is included to define the type
+.CW Draw->Context .
+.SH
+Compiling and Running Limbo Programs
+.LP
+With this much of the language described,
+we can compile and run this program.
+On Unix or Windows, the command
+.P1
+$ limbo -g hello.b
+.P2
+creates
+.CW "hello.dis" ,
+a byte-coded version of the program for the Dis
+virtual machine.
+The
+.CW "-g"
+argument adds a symbol table, useful for subsequent debugging.
+(Another common option is
+.CW -w ,
+which causes the compiler to produce helpful warnings about possible errors.)
+The program can then be run as
+.CW "hello"
+in Inferno; this shows execution under the Inferno emulator
+on a Unix system:
+.P1
+$ limbo -g hello.b
+$ emu
+; /usr/bwk/hello
+hello, world
+;
+.P2
+From within Inferno, it's also possible to run a
+program by selecting it from a menu.
+In any case, as the program runs, it loads as necessary other modules that it uses.
+.NH 2
+A Graphical "Hello World"
+.LP
+The following module creates and displays a window containing only
+a button with the label ``hello, world'' as shown in the screen shot in Figure 1.
+.P1
+implement Hello2;
+
+include "sys.m";
+ sys: Sys;
+include "draw.m";
+ draw: Draw;
+include "tk.m";
+ tk: Tk;
+include "tkclient.m";
+ tkclient: Tkclient;
+
+Hello2: module
+{
+ init: fn(ctxt: ref Draw->Context, args: list of string);
+};
+
+init(ctxt: ref Draw->Context, args: list of string)
+{
+ sys = load Sys Sys->PATH;
+ tk = load Tk Tk->PATH;
+ tkclient = load Tkclient Tkclient->PATH;
+
+ tkclient->init();
+
+ (t, nil) := tkclient->toplevel(ctxt, "", "Hello", Tkclient->Plain);
+
+ tk->cmd(t, "button .b -text {hello, world}");
+ tk->cmd(t, "pack .b");
+ tk->cmd(t, "update");
+
+ tkclient->onscreen(t, nil);
+
+ sys->sleep(10000); # wait 10 seconds
+}
+.P2
+.FG "f1.ps" 3i
+.ce
+.I "Figure 1. `Hello, world' button."
+.fg
+This is not very exciting, but it illustrates the absolute
+minimum required to get a picture on the screen.
+The
+.CW "Tk"
+module is modeled closely after John Ousterhout's Tk interface toolkit,
+but Limbo is used as the programming language instead of Tcl.
+The Inferno version
+is similar in functionality to the original Tk
+but it does not support any Tcl constructs,
+such as variables, procedures, or expression evaluation,
+since all processing is done using Limbo.
+There are ten functions in the
+.CW "Tk"
+interface, only one of which
+is used here:
+.CW "cmd" ,
+which executes a command string.
+(It is the most commonly used
+.CW Tk
+function.)
+.LP
+Tk itself displays graphics and handles mouse and keyboard interaction
+within a window.
+There can however be many different windows on a display.
+A separate window manager,
+.CW wm ,
+multiplexes control of input and output among those windows.
+The module
+.CW Tkclient
+provides the interface between the window manager and Tk.
+Its function
+.CW "toplevel" ,
+used above,
+makes a top-level window and returns a reference to it, for subsequent use by Tk.
+The contents of the window are prepared by calls to
+.CW tk->cmd
+before the window is finally displayed by the call to
+.CW onscreen .
+(The second parameter to
+.CW onscreen ,
+a string,
+controls the position and style of window;
+here we take the default by making that
+.CW nil .)
+.LP
+Note that
+.CW Tkclient
+must also be explicitly initialized by calling its
+.CW init
+function after loading.
+This is a common convention, although some modules do
+not require it (typically those built in
+to the system, such as
+.CW Sys
+or
+.CW Tk ).
+.LP
+The
+.CW "sleep"
+delays exit for 10 seconds so the button can be seen.
+If you try to interact with the window, for instance by pressing the button,
+you will see no response.
+That is because the program has not done what is required to receive mouse or keyboard input in the window.
+In a real application, some action would also be bound to pressing the button.
+Such actions are handled by setting up a connection (a `channel') from
+the Tk module to one's own code, and processing the
+messages (`events') that appear on this channel.
+The Tk module and its interface to the window manager
+are explained in more detail later,
+as are a couple of other constructions,
+after we have introduced processes and channels.
+.NH 2
+Echo
+.LP
+The next example,
+.CW "echo" ,
+prints its command-line arguments.
+Declarations are the same as in the first
+example, and have been omitted.
+.P1
+# declarations omitted...
+
+init(ctxt: ref Draw->Context, args: list of string)
+{
+	sys = load Sys Sys->PATH;
+
+	args = tl args;	# skip over program name
+	for (s := ""; args != nil; args = tl args)
+		s += " " + hd args;
+	if (s != "")	# something was stored in s
+		sys->print("%s\en", s[1:]);
+}
+.P2
+The arguments are stored in a
+.CW "list" .
+Lists may be of any type;
+.CW "args"
+is a
+.CW "list"
+.CW "of"
+.CW "string" .
+There are three list operators:
+.CW "hd"
+and
+.CW "tl"
+return the head and tail of a list, and
+.CW "::"
+adds a new element to the head.
+In this example, the
+.CW "for"
+loop walks along the
+.CW "args"
+list until the end,
+accumulating the head element
+.CW "hd args" ), (
+then advancing
+.CW "args = tl args" ). (
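+.LP
+The echo program never needs
+.CW "::" ,
+so here is a small sketch that does: it pushes each argument
+on the front of a new list, which leaves the arguments in reverse order,
+then prints them one per line.
+The module name
+.CW Revargs
+and the variable
+.CW rev
+are invented for illustration.
+.P1
+implement Revargs;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+
+Revargs: module
+{
+	init: fn(ctxt: ref Draw->Context, args: list of string);
+};
+
+init(nil: ref Draw->Context, args: list of string)
+{
+	sys = load Sys Sys->PATH;
+
+	rev: list of string;	# starts out nil, the empty list
+	for (args = tl args; args != nil; args = tl args)
+		rev = hd args :: rev;	# new element goes on the head
+	for (; rev != nil; rev = tl rev)
+		sys->print("%s\en", hd rev);
+}
+.P2
+Like
+.CW echo ,
+this assumes that the argument list always begins with the program name.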
+.LP
+The value
+.CW "nil"
+is the ``undefined'' or ``explicitly empty'' value
+for non-numeric types.
+.LP
+The operator
+.CW ":="
+combines the declaration of a variable and assignment of a value to it.
+The type of the variable on the left of
+.CW ":="
+is the type
+of the expression on the right.
+Thus, the expression
+.P1
+s := ""
+.P2
+in the
+.CW "for"
+statement
+declares a string
+.CW "s"
+and initializes it to empty;
+if after the loop,
+.CW "s"
+is not empty,
+something has been written in it.
+By the way, there is no distinction between the values
+.CW "nil"
+and
+\f5""\fP
+for strings.
+.LP
+The
+.CW "+"
+and
+.CW "+="
+operators concatenate strings.
+The expression
+.CW "s[1:]"
+is a
+.I slice
+of the string
+.CW "s"
+that starts at index 1
+(the second character of the string) and goes
+to the end; this excludes the unwanted
+blank at the beginning of
+.CW "s" .
+.NH 2
+Word Count
+.LP
+The word count program
+.CW "wc"
+reads its standard input
+and counts the number of lines, words, and characters.
+Declarations have again been omitted.
+.P1
+# declarations omitted...
+
+init(nil: ref Draw->Context, args: list of string)
+{
+	sys = load Sys Sys->PATH;
+	buf := array[1] of byte;
+
+	stdin := sys->fildes(0);
+
+	OUT: con 0;
+	IN: con 1;
+
+	state := OUT;
+	nl := 0; nw := 0; nc := 0;
+	for (;;) {
+		n := sys->read(stdin, buf, 1);
+		if (n <= 0)
+			break;
+		c := int buf[0];
+		nc++;
+		if (c == '\en')
+			nl++;
+		if (c == ' ' || c == '\et' || c == '\en')
+			state = OUT;
+		else if (state == OUT) {
+			state = IN;
+			nw++;
+		}
+	}
+	sys->print("%d %d %d\en", nl, nw, nc);
+}
+.P2
+.LP
+This program contains several instances of the
+.CW ":="
+operator.
+For example, the line
+.P1
+ nl := 0; nw := 0; nc := 0;
+.P2
+declares three integer variables
+and assigns zero to each.
+.LP
+A Limbo program starts with three open files for standard
+input, standard output, and standard error, as in Unix.
+The line
+.P1
+ stdin := sys->fildes(0);
+.P2
+declares a variable
+.CW "stdin"
+and assigns the corresponding file descriptor to it.
+The type of
+.CW "stdin"
+is whatever the type of
+.CW "sys->fildes(0)"
+is, and it's possible to get by without
+ever knowing the name of that type.
+(We will return to this shortly.)
+.NE 3v
+.LP
+The lines
+.P1
+ OUT: con 0;
+ IN: con 1;
+.P2
+declare two integer constants with values zero and one.
+There is no
+.CW "enum"
+type in Limbo; the
+.CW "con"
+declaration is the closest equivalent.
+When the values are arbitrary, a different form is normally used:
+.P1
+ OUT, IN: con iota;
+.P2
+The operator
+.CW "iota" ,
+when used in
+.CW con
+declarations, produces the sequence of values 0, 1, ...,
+one value in turn for each name declared in the same declaration.
+It can appear in more complex expressions:
+.P1
+ M1, M2, M4, M8: con 1 << iota;
+ N1, N3, N5, N7: con (2*iota)+1;
+.P2
+The first example generates a set of bitmask values; the second generates a
+sequence of odd numbers.
+.LP
+Given the declarations of
+.CW "IN"
+and
+.CW "OUT" ,
+the line
+.P1
+ state := OUT;
+.P2
+declares
+.CW "state"
+to be an integer with initial value zero.
+.LP
+The line
+.P1
+ buf := array[1] of byte;
+.P2
+declares
+.CW "buf"
+to be a one-element array of
+.CW "byte" s.
+Arrays are indexed from zero, so
+.CW "buf[0]"
+is the only element.
+Arrays in Limbo are dynamic, so this array is created at
+the point of the declaration.
+An alternative would be to declare the array and
+create it in separate statements:
+.P1
+ buf : array of byte; # no size at declaration
+
+ buf = array[1] of byte; # size needed at creation
+.P2
+.LP
+Limbo does no automatic coercions between types,
+so an explicit coercion is required to convert the
+single byte read from
+.CW "stdin"
+into an
+.CW "int"
+that can be used in subsequent comparisons with
+.CW "int" 's;
+this is done by the line
+.P1
+ c := int buf[0];
+.P2
+which declares
+.CW "c"
+and assigns the integer value of the input byte to it.
+.NH 2
+Word Count Version 2
+.LP
+The word count program above tacitly assumes that its input is
+in the ASCII subset of Unicode, since it reads
+input one byte at a time instead of one Unicode character
+at a time.
+If the input contains any multi-byte Unicode characters,
+this code is plain wrong.
+The assignment to
+.CW "c"
+is a specific example: the integer value of the first byte
+of a multi-byte Unicode character is not the character.
+.LP
+There are several ways to address this shortcoming.
+Among the possibilities are
+rewriting to use the
+.CW "Bufio"
+module, which does string I/O,
+or checking each input byte sequence to see if it is
+a multi-byte character.
+The second version of word counting uses
+.CW "Bufio" .
+This example will also illustrate rules for accessing objects
+within modules.
+.P1
+# declarations omitted...
+
+include "bufio.m";
+ bufio: Bufio;
+ Iobuf: import bufio;
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+	bufio = load Bufio Bufio->PATH;
+	if (bufio == nil) {
+		sys->fprint(sys->fildes(2), "wc: can't load %s: %r\en", Bufio->PATH);
+		raise "fail:load";
+	}
+
+	stdin := sys->fildes(0);
+	iob := bufio->fopen(stdin, bufio->OREAD);
+	if (iob == nil) {
+		sys->fprint(sys->fildes(2), "wc: can't open stdin: %r\en");
+		raise "fail:open";
+	}
+
+	OUT, IN: con iota;
+
+	state := OUT;
+	nl := big 0; nw := big 0; nc := big 0;
+	for (;;) {
+		c := iob.getc();
+		if (c == Bufio->EOF)
+			break;
+		nc++;
+		if (c == '\en')
+			nl++;
+		if (c == ' ' || c == '\et' || c == '\en')
+			state = OUT;
+		else if (state == OUT) {
+			state = IN;
+			nw++;
+		}
+	}
+	sys->print("%bd %bd %bd\en", nl, nw, nc);
+}
+.P2
+The lines
+.P1
+include "bufio.m";
+ bufio: Bufio;
+.P2
+include the declarations from
+.CW "bufio.m"
+and declare a variable
+.CW "bufio"
+that will serve as a handle when we load an implementation of the
+.CW "Bufio"
+module.
+(The use of a module's type in lower case as the name of a loaded instance is a common convention in Limbo programs.)
+With this handle, we can
+refer to the functions and types
+the module defines, which are in the file
+.CW "/usr/inferno/module/bufio.m"
+(the full name might be different on your system).
+Parts of this declaration are shown here:
+.P1
+Bufio: module	# edited to fit your screen
+{
+	PATH: con "/dis/bufio.dis";
+	EOF: con -1;
+	Iobuf: adt {
+		fd: ref Sys->FD;	# the file
+		buffer: array of byte;	# the buffer
+		# other variables omitted
+		getc: fn(b: self ref Iobuf) : int;
+		gets: fn(b: self ref Iobuf, sep: int) : string;
+		close: fn(b: self ref Iobuf);
+	};
+	open: fn(name: string, mode: int) : ref Iobuf;
+	fopen: fn(fd: ref Sys->FD, mode: int) : ref Iobuf;
+};
+.P2
+.LP
+The
+.CW "Bufio"
+module defines
+.CW "open"
+and
+.CW "fopen"
+functions that return references to an
+.CW "Iobuf" ;
+this is much like a
+.CW "FILE*"
+in the C standard I/O library.
+A reference is necessary so that all uses
+refer to the same entity, the object maintained by the module.
+.LP
+Given the name of a module (e.g.,
+.CW "Bufio" ),
+how do we refer to its contents?
+It is always possible to use fully-qualified names,
+and the
+.CW "import"
+statement permits certain abbreviations.
+We must also distinguish between the name of the module itself
+and a specific implementation returned by
+.CW "load" ,
+such as
+.CW "bufio" .
+.LP
+The fully-qualified name of a type or constant from a module
+is
+.P1
+\fIModulename\fP->\fIname\fP
+.P2
+as in
+.CW "Bufio->Iobuf"
+or
+.CW "Bufio->EOF" .
+To refer to members of an adt or functions or variables from a module, however,
+it is necessary to use a module value instead of a module name:
+although the interface
+is always the same, the implementations of different instances
+of a module will be different, and we must refer to a specific
+implementation.
+A fully-qualified name is
+.P1
+\fImoduleval\fP->\fIfunctionname\fP
+\fImoduleval\fP->\fIvariablename\fP
+\fImoduleval\fP->\fIadtname\fP.\fImembername\fP
+.P2
+where adt members can be variables or functions.
+Thus:
+.P1
+iob: ref bufio->Iobuf;
+...
+bufio->open(...)
+bufio->iob.getc()
+bufio->iob.fd
+.P2
+It is also legal to refer to module types, constants, and variables
+with a module handle, as in
+.CW "bufio->EOF" .
+.LP
+An
+.CW "import"
+statement makes a specific list of names from
+a module accessible without need for a fully-qualified name.
+Each name must be imported explicitly, and adt member names
+can not be imported.
+Thus, the line
+.P1
+Iobuf: import bufio;
+.P2
+imports the adt name
+.CW "Iobuf" ,
+which means that functions within that adt (like
+.CW "getc" )
+can be used
+without module qualification, i.e., without
+.CW "bufio->" .
+(It is still necessary to say
+.CW "iob.getc()"
+for reasons given below.)
+In all cases, imported names must be unique.
+.LP
+The second parameter of
+.CW "load"
+is a string giving the location of the module implementation,
+typically a
+.CW ".dis"
+file.
+(The string need not be static.)
+Some modules are part of the system;
+these have location names that begin with
+.CW "$"
+but are otherwise the same for users.
+By convention, modules include a constant called
+.CW "PATH"
+that points to their default location.
+.LP
+The call to
+.CW "bufio->fopen"
+attaches the I/O buffer to the already open file
+.CW "stdin" ;
+this is rather like
+.CW "fdopen"
+in
+.CW "stdio" .
+.LP
+The function
+.CW "iob.getc"
+returns the next Unicode character,
+or
+.CW "bufio->EOF"
+if end of file was encountered.
+.LP
+A close look at the calls to
+.CW "sys->fprint"
+shows a new format conversion character,
+.CW "%r" ,
+for which there is no corresponding argument in the
+expression list.
+The value of
+.CW "%r"
+is the text of the most recent system error message.
+.LP
+Several other small changes make the example more realistic:
+the program keeps its counts as
+.CW big
+to cope with larger files (hence the use of
+.CW %bd
+as the output format);
+it prints diagnostics on the standard error stream,
+.CW sys->fildes(2) ,
+using
+.CW sys->fprint ,
+a variant of
+.CW sys->print
+that takes an explicit file descriptor;
+and it returns an error status to its caller (typically the shell) by
+raising an exception.
+.NH 2
+An Associative Array Module
+.LP
+This section describes a module that implements a conventional
+associative array (a hash table
+pointing to chained lists of name-value strings).
+This module is meant to be part of a larger program,
+not a standalone program like the previous examples.
+.LP
+The
+.CW "Hashtab"
+module stores a name-value pair as a tuple of
+.CW "(string,"
+.CW "string)" .
+A tuple is a type consisting of an ordered collection
+of objects, each with its own type.
+The hash table implementation uses several different tuples.
+.LP
+The hash table module defines a type to hold the
+data, using an
+.CW "adt"
+declaration.
+An adt defines a type and optionally a set of functions
+that manipulate an object of that type.
+Since it provides only the ability to group variables and functions,
+it is like a really slimmed-down version of a C++ class,
+or a slightly fancier C
+.CW "struct" .
+In particular, an adt does not provide information hiding
+(all member names are visible if the adt itself is visible),
+does not support inheritance,
+and has no constructors, destructors or overloaded method names.
+It is different from C or C++, however: when an adt is declared by a
+.CW module
+declaration, the adt's implementation (the bodies of its functions)
+will be defined by the module's implementation, and there can be more than one.
+To create an instance of an adt,
+.P1
+\fIadtvar\fP := \fIadtname\fP(\fIlist of values for all members, in order\fP);
+\fIadtvar\fP := ref \fIadtname\fP(\fIlist of values for all members, in order\fP);
+.P2
+Technically these are casts, from tuple to adt;
+that is, the adt is created from a tuple that
+specifies all of its members in order.
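+.LP
+As a concrete sketch of both forms (the module name
+.CW Adtdemo
+and the adt
+.CW Pair
+are invented for illustration and are not part of the hash table module):
+.P1
+implement Adtdemo;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+
+Adtdemo: module
+{
+	init: fn(ctxt: ref Draw->Context, args: list of string);
+};
+
+Pair: adt {
+	name: string;
+	count: int;
+};
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+
+	p := Pair("apple", 3);	# an adt value
+	rp := ref Pair("pear", 5);	# a reference to an adt on the heap
+	rp.count++;
+	sys->print("%s %d, %s %d\en", p.name, p.count, rp.name, rp.count);
+}
+.P2
+The values in each list are given in the order in which the members
+.CW name
+and
+.CW count
+are declared.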
+.LP
+The
+.CW "Hashtab"
+module contains an
+.CW "adt"
+declaration for a type
+.CW "Table" ;
+the operations are a function
+.CW "alloc"
+for initial allocation
+(in effect a constructor),
+a hash function, and methods to add and look up elements by name.
+Here is the module declaration, which is contained in file
+.CW "hashtab.m" :
+.nr dT 4
+.nr dP \n(dP+1
+.P1
+Hashtab: module
+{
+	PATH: con "/usr/bwk/hashtab.dis";	# temporary name
+
+	Table: adt {
+		tab: array of list of (string, string);
+
+		alloc: fn(n: int) : ref Table;
+
+		hash: fn(ht: self ref Table, name: string) : int;
+		add: fn(ht: self ref Table, name: string, val: string);
+		lookup: fn(ht: self ref Table, name: string) : (int, string);
+	};
+};
+.P2
+.nr dT 8
+.nr dP \n(dP-1
+The implementation is in file
+.CW "hashtab.b" :
+.P1
+implement Hashtab;
+
+include "hashtab.m";
+
+Table.alloc(n: int) : ref Table
+{
+	return ref Table(array[n] of list of (string,string));
+}
+
+Table.hash(ht: self ref Table, s: string) : int
+{
+	h := 0;
+	for (i := 0; i < len s; i++)
+		h = (h << 1) ^ int s[i];
+	h %= len ht.tab;
+	if (h < 0)
+		h += len ht.tab;
+	return h;
+}
+
+Table.add(ht: self ref Table, name: string, val: string)
+{
+	h := ht.hash(name);
+	for (p := ht.tab[h]; p != nil; p = tl p) {
+		(tname, nil) := hd p;
+		if (tname == name) {
+			# illegal: hd p = (tname, val);
+			return;
+		}
+	}
+	ht.tab[h] = (name, val) :: ht.tab[h];
+}
+
+Table.lookup(ht: self ref Table, name: string) : (int, string)
+{
+	h := ht.hash(name);
+	for (p := ht.tab[h]; p != nil; p = tl p) {
+		(tname, tval) := hd p;
+		if (tname == name)
+			return (1, tval);
+	}
+	return (0, "");
+}
+
+.P2
+This is intentionally simple-minded, to focus on the language
+rather than efficiency or flexibility.
+The function
+.CW "Table.alloc"
+creates and returns a
+.CW "Table"
+whose array has the specified number of elements,
+each of which is an initially empty list of
+.CW "(string,"
+.CW "string)" .
+.LP
+The
+.CW "hash"
+function is trivial; the only interesting point
+is the
+.CW "len"
+operator, which returns the number of items in a string, array or list.
+For a string,
+.CW "len"
+.CW "s"
+is the number of Unicode characters.
+.LP
+The
+.CW "self"
+declaration says that the first
+argument of every call of this function is implicit, and refers to the
+value itself; this argument does not appear in the actual parameter list at any call site.
+The \f5self\fP keyword
+is similar to
+.CW "this"
+in C++.
+.LP
+The
+.CW "lookup"
+function searches down the appropriate list for
+an instance of the
+.CW "name"
+argument.
+If a match is found,
+.CW "lookup"
+returns a tuple consisting of 1 and the value field;
+if no match is found, it returns a tuple of 0 and an empty string.
+These return values match the function return type,
+.CW "(int,"
+.CW "string)" .
+.LP
+The line
+.P1
+ (tname, tval) := hd p;
+.P2
+shows a tuple on the left side of a declaration-assignment.
+This splits the pair of strings referred to by
+.CW "hd"
+.CW "p"
+into components and assigns them to the newly declared variables
+.CW "tname"
+and
+.CW "tval" .
+.LP
+The
+.CW "add"
+function is similar;
+it searches the right list for an instance of
+the name.
+If none is found,
+.P1
+ ht.tab[h] = (name, val) :: ht.tab[h];
+.P2
+combines the name and value into a tuple, then uses
+.CW "::"
+to stick it on the front of the proper list.
+.LP
+The line
+.P1
+ (tname, nil) := hd p;
+.P2
+in the loop body is a less obvious use of a tuple.
+In this case, only the first component, the name,
+is assigned, to a variable
+.CW "tname"
+that is declared here.
+The other component is ``assigned'' to
+.CW "nil" ,
+which causes it to be ignored.
+.LP
+The line
+.P1
+ # illegal: hd p = (tname, val);
+.P2
+is commented out because it's illegal:
+Limbo does not permit the assignment of a new name-value
+to a list element;
+list elements are immutable.
+.LP
+To create a new
+.CW "Table" ,
+add some values, then retrieve one, we can write:
+.P1
+ nvtab := Table.alloc(101); # make a Table
+
+ nvtab.add("Rob", "Pike");
+ nvtab.add("Howard", "Trickey");
+ (p, phil) := nvtab.lookup("Phil");
+ (q, sean) := nvtab.lookup("Sean");
+.P2
+Note that the
+.CW "ref"
+.CW "Table"
+argument does not appear in these calls;
+the
+.CW "self"
+mechanism renders it unnecessary.
+Remember that a module using
+.CW Table
+must
+.CW import
+it from some instance of
+.CW Hashtab ,
+or qualify all references to it by a module value.
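+.LP
+A complete client might look like the following sketch.
+The module name
+.CW Hashtest
+and the messages are invented for illustration;
+the error handling copies the style of the second word count program.
+.P1
+implement Hashtest;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+include "hashtab.m";
+	hashtab: Hashtab;
+	Table: import hashtab;
+
+Hashtest: module
+{
+	init: fn(ctxt: ref Draw->Context, args: list of string);
+};
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+	hashtab = load Hashtab Hashtab->PATH;
+	if (hashtab == nil) {
+		sys->fprint(sys->fildes(2), "hashtest: can't load %s: %r\en", Hashtab->PATH);
+		raise "fail:load";
+	}
+
+	nvtab := Table.alloc(101);
+	nvtab.add("Rob", "Pike");
+
+	(found, val) := nvtab.lookup("Rob");
+	if (found)
+		sys->print("Rob -> %s\en", val);
+	(found, val) = nvtab.lookup("Phil");
+	if (!found)
+		sys->print("Phil is not in the table\en");
+}
+.P2
+The
+.CW import
+names the handle
+.CW hashtab ,
+so the module must be loaded before any
+.CW Table
+function is called.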
+.NH 2
+An AWK-like Input Module
+.LP
+This example presents a simple module based on Awk's input mechanism:
+it reads input a line at a time from a list of files,
+splits each line into an array of
+.CW "NF+1"
+strings (the original input line and the individual fields), and
+sets
+.CW "NF" ,
+.CW "NR" ,
+and
+.CW "FILENAME" .
+It comes in the usual two parts, a module:
+.P1
+.nr dP \n(dP+1
+.nr dT 4
+Awk: module
+{
+ PATH: con "/usr/bwk/awk.dis";
+
+ init: fn(args: list of string);
+ getline: fn() : array of string;
+ NR: fn() : int;
+ NF: fn() : int;
+ FILENAME: fn() : string;
+};
+.P2
+.nr dP \n(dP-1
+.nr dT 8
+and an implementation:
+.nr dP \n(dP+1
+.nr dT 4
+.P1
+implement Awk;
+
+include "sys.m";
+ sys: Sys;
+include "bufio.m";
+ bufio: Bufio;
+Iobuf: import bufio;
+ iobuf: ref Iobuf;
+
+include "awk.m";
+
+_NR: int;
+_NF: int;
+_FILENAME: string;
+args: list of string;
+
+.P3
+init(av: list of string)
+{
+	args = tl av;
+	if (len args == 0)	# no args => stdin
+		args = "-" :: nil;
+
+	sys = load Sys Sys->PATH;
+	bufio = load Bufio Bufio->PATH;
+}
+
+.P3
+getline() : array of string
+{
+	t := array[100] of string;
+	fl: list of string;
+
+  top:
+	while (args != nil) {
+		if (_FILENAME == nil) {	# advance to next file
+			_FILENAME = hd args;
+			if (_FILENAME == "-")
+				iobuf = bufio->fopen(sys->fildes(0), bufio->OREAD);
+			else
+				iobuf = bufio->open(_FILENAME, bufio->OREAD);
+			if (iobuf == nil) {
+				sys->fprint(sys->fildes(2), "can't open %s: %r\en", _FILENAME);
+				args = nil;
+				return nil;
+			}
+		}
+
+.P3
+		s := iobuf.gets('\en');
+		if (s == nil) {
+			iobuf.close();
+			_FILENAME = nil;
+			args = tl args;
+			continue top;
+		}
+
+.P3
+		t[0] = s[0:len s - 1];
+		_NR++;
+		(_NF, fl) = sys->tokenize(t[0], " \et\en\er");
+		for (i := 1; fl != nil; fl = tl fl)
+			t[i++] = hd fl;
+		return t[0:i];
+	}
+	return nil;
+}
+
+NR() : int { return _NR; }
+NF() : int { return _NF; }
+FILENAME() : string { return _FILENAME; }
+.P2
+.nr dT 8
+.nr dP \n(dP-1
+Since
+.CW "NR" ,
+.CW "NF"
+and
+.CW "FILENAME"
+should not be modified by users, they
+are accessed as functions; the actual variables have
+related names like
+.CW "_NF" .
+It would also be possible to make them ordinary variables
+in the
+.CW "Awk"
+module, and refer to them via a module value (i.e.,
+.CW awk->NR ).
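+.LP
+As a sketch of that alternative (not the declaration used in this paper),
+the module could export the variables directly;
+the drawback is that any client could then assign to them:
+.P1
+Awk: module
+{
+	PATH: con "/usr/bwk/awk.dis";
+
+	NR: int;	# records read so far
+	NF: int;	# fields in the current record
+	FILENAME: string;	# file currently being read
+
+	init: fn(args: list of string);
+	getline: fn() : array of string;
+};
+.P2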
+.LP
+The
+.CW "tokenize"
+function in the line
+.P1
+ (_NF, fl) = sys->tokenize(t[0], " \et\en\er");
+.P2
+breaks the argument string
+.CW "t[0]"
+into tokens, as separated by the characters of the second argument.
+It returns a tuple consisting of a length and a list
+of tokens.
+Note that this module has an
+.CW "init"
+function that must be called explicitly before
+any of its other functions are called.
+.NH 2
+A Simple Formatter
+.LP
+This program is a simple-minded text formatter, modeled after
+.CW "fmt" ,
+that tests the Awk module:
+.P1
+implement Fmt;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+
+Fmt: module
+{
+	init: fn(nil: ref Draw->Context, args: list of string);
+};
+
+include "awk.m";
+	awk: Awk;
+	getline, NF: import awk;
+
+out: array of string;
+nout: int;
+length: int;
+linelen := 65;
+
+.P3
+init(nil: ref Draw->Context, args: list of string)
+{
+	t: array of string;
+	out = array[100] of string;
+
+	sys = load Sys Sys->PATH;
+	awk = load Awk Awk->PATH;
+	if (awk == nil) {
+		sys->fprint(sys->fildes(2), "fmt: can't load %s: %r\en",
+			Awk->PATH);
+		raise "fail:load";
+	}
+	awk->init(args);
+
+	nout = 0;
+	length = 0;
+	while ((t = getline()) != nil) {
+		nf := NF();
+		if (nf == 0) {
+			printline();
+			sys->print("\en");
+		} else for (i := 1; i <= nf; i++) {
+			if (length + len t[i] > linelen)
+				printline();
+			out[nout++] = t[i];
+			length += len t[i] + 1;
+		}
+	}
+	printline();
+}
+.P3
+printline()
+{
+	if (nout == 0)
+		return;
+	for (i := 0; i < nout-1; i++)
+		sys->print("%s ", out[i]);
+	sys->print("%s\en", out[i]);
+	nout = 0;
+	length = 0;
+}
+.P2
+The functions
+.CW "getline"
+and
+.CW "NF"
+have been imported so their names need no qualification.
+It is more usual Limbo style to use explicit references such as
+.CW sys->read
+or
+.CW Bufio->EOF
+for clarity, and import only adts (and perhaps commonly used constants).
+.NH 2
+Channels and Communications
+.LP
+Another approach to a formatter is to use one process to fetch words and
+pass them to another process that formats and prints them.
+This is easily done with a channel, as in this
+alternative version:
+.P1
+# declarations omitted...
+
+WORD, BREAK, EOF: con iota;
+wds: chan of (int, string);
+
+init(nil: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+	bufio = load Bufio Bufio->PATH;
+
+	stdin := sys->fildes(0);
+	iob = bufio->fopen(stdin, bufio->OREAD);
+
+	wds = chan of (int, string);
+	spawn getword(wds);
+	putword(wds);
+}
+
+.P3
+getword(wds: chan of (int, string))
+{
+	while ((s := iob.gets('\en')) != nil) {
+		(n, fl) := sys->tokenize(s, " \et\en");
+		if (n == 0)
+			wds <-= (BREAK, "");
+		else for ( ; fl != nil; fl = tl fl)
+			wds <-= (WORD, hd fl);
+	}
+	wds <-= (EOF, "");
+}
+
+.P3
+putword(wds: chan of (int, string))
+{
+	for (length := 0;;) {
+		(wd, s) := <-wds;
+		case wd {
+		BREAK =>
+			sys->print("\en\en");
+			length = 0;
+		WORD =>
+			if (length + len s > 65) {
+				sys->print("\en");
+				length = 0;
+			}
+			sys->print("%s ", s);
+			length += len s + 1;
+		EOF =>
+			sys->print("\en");
+			exit;
+		}
+	}
+}
+.P2
+This omits declarations and error checking in the interest
+of brevity.
+.LP
+The channel passes a tuple of
+.CW "int" , (
+.CW "string" );
+the
+.CW "int"
+indicates what kind of string is present \-
+a real word, a break caused by an empty input line,
+or
+.CW "EOF" .
+.LP
+The
+.CW "spawn"
+statement creates a separate process by calling the specified function;
+except for its own stack,
+this process shares memory with the process that spawned it.
+Any synchronization between processes is handled by channels.
+.LP
+The operator
+.CW "<-="
+sends an expression to a channel;
+the operator
+.CW "<-"
+receives from a channel.
+(Receive is combined here with
+.CW ":="
+to receive a tuple, and assign its elements to newly-declared variables.)
+In this example,
+.CW "getword"
+and
+.CW "putword"
+alternate, because each input word
+is sent immediately on the shared channel,
+and no subsequent word is processed until the previous one has been
+received and printed.
+.LP
+The
+.CW "case"
+statement consists of a list of case values,
+which must be string or numeric constants, followed by
+.CW "=>"
+and associated code.
+The value
+.CW "*"
+(not used here) labels the default.
+Multiple labels can be used, separated by the
+.CW "or"
+operator,
+and ranges of values can appear delimited by
+.CW "to" ,
+as in
+.P1
+ 'a' to 'z' or 'A' to 'Z' =>
+.P2
+Remember that control does not flow from one case arm to the next, unlike C,
+thus no
+.CW break
+statements appear.
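+.LP
+As a small illustration (the function
+.CW classify
+is hypothetical and not part of the formatter), a
+.CW case
+with ranges, multiple labels and a default arm might look like this:
+.P1
+classify(c: int): string
+{
+	kind: string;
+	case c {
+	'a' to 'z' or 'A' to 'Z' =>
+		kind = "letter";
+	'0' to '9' =>
+		kind = "digit";
+	* =>
+		kind = "other";	# default arm
+	}
+	return kind;
+}
+.P2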
+.NH 2
+Tk and Interface Construction
+.LP
+Inferno supports a rather complete implementation of
+the Tk interface toolkit developed by John Ousterhout.
+In other environments, Tk is normally accessed from
+Tcl programs, although there are also versions for Perl,
+Scheme and other languages that call Ousterhout's C code.
+The Inferno Tk was implemented from scratch, and is meant to be called
+from Limbo programs.
+As we saw earlier,
+there is a module declaration
+.CW "tk.m"
+and a kernel module
+.CW "Tk" .
+.LP
+The
+.CW "Tk"
+module provides all the widgets of the original Tk
+with almost all their options,
+the
+.CW "pack"
+command for geometry management,
+and the
+.CW "bind"
+command for attaching code to user actions.
+It also provides a
+.CW grid
+command to simplify the common case of objects arranged in a matrix or grid.
+In this implementation
+.CW "Tk"
+commands are
+written as strings and presented to one function,
+.CW "tk->cmd" ;
+Limbo calls this function and captures
+its return value, which is the string that the Tk command produces.
+For example, widget creation commands like
+.CW "button"
+return the widget name, so this will be the string
+returned by
+.CW "tk->cmd" .
+.LP
+There is one unconventional aspect:
+the use of channels to send data and events from the interface
+into the Limbo program.
+To create a widget, as we saw earlier, one writes
+.P1
+tk->cmd(t, "button .b -text {Push me} -command {send cmd .bpush}");
+.P2
+to create a button
+.CW ".b"
+and attach a command to be executed when the button is pushed.
+That command sends
+the (arbitrary) string
+.CW ".bpush"
+on the channel named
+.CW "cmd" .
+The Limbo code that reads from this channel will look
+for the string
+.CW ".bpush"
+and act accordingly.
+The function
+.CW "tk->namechan"
+establishes a correspondence between a Limbo channel value
+and a channel named as a string in the Tk module.
+When an event occurs in a Tk widget with a
+.CW "-command"
+option,
+.CW "send"
+causes the string to be sent on the channel and the Limbo code
+can act on it.
+The program will often use a
+.CW "case"
+to process the strings that might appear on the channel,
+particularly when the same channel is used for several widgets.
+.LP
+We observed earlier that
+.CW Tk
+provides a user interface for an application's window,
+but there might be many windows on the screen.
+Normally, a graphical application is meant to run under
+the window manager
+.CW "wm"
+as a window that can be managed,
+reshaped, etc.
+This is done by calling functions in the module
+.CW "Tkclient" ,
+which provides the interface between
+.CW Tk
+and
+.CW wm .
+.LP
+Several functions must be called to create a window,
+put it on the screen, and start giving it input.
+We have already seen
+.CW Tkclient 's
+.CW toplevel
+for window creation and
+.CW onscreen
+to give a window space on the screen.
+Input arrives from several sources:
+from the mouse and keyboard, from the
+higher-level Tk widgets such as buttons,
+and from the window manager itself.
+In Limbo, each input source is represented by a channel, either given to the program
+by the window manager, or associated with one by
+.CW namechan ,
+as above.
+.LP
+This is all illustrated in the complete program below, which
+implements a trivial version of Etch-a-Sketch, shown in action in Figure 2.
+.FG "f3.ps" 4.8i
+.ce
+.I "Figure 2. Etch-a-Sketch display."
+.fg
+.nr dT 4
+.nr dP \n(dP+1
+.P1
+implement Etch;
+
+include "sys.m";
+	sys: Sys;
+include "draw.m";
+include "tk.m";
+	tk: Tk;
+include "tkclient.m";
+	tkclient: Tkclient;
+
+Etch: module
+{
+	init: fn(ctxt: ref Draw->Context, args: list of string);
+};
+.P3
+init(ctxt: ref Draw->Context, nil: list of string)
+{
+	sys = load Sys Sys->PATH;
+	tk = load Tk Tk->PATH;
+	tkclient = load Tkclient Tkclient->PATH;
+
+	tkclient->init();
+
+	(t, winctl) := tkclient->toplevel(ctxt, nil, "Etch", Tkclient->Appl);
+
+	cmd := chan of string;
+	tk->namechan(t, cmd, "cmd");
+	tk->cmd(t, "canvas .c -height 400 -width 600 -background white");
+	tk->cmd(t, "frame .f");
+	tk->cmd(t, "button .f.c -text {Clear} -command {send cmd clear}");
+	tk->cmd(t, "button .f.d -text {Done} -command {send cmd quit}");
+	tk->cmd(t, "pack .f.c .f.d -side left -fill x -expand 1");
+	tk->cmd(t, "pack .c .f -side top -fill x");
+	tk->cmd(t, "bind .c <ButtonPress-1> {send cmd b1down %x %y}");
+	tk->cmd(t, "bind .c <Button-1-Motion> {send cmd b1motion %x %y}");
+	tk->cmd(t, "update");
+
+	tkclient->startinput(t, "ptr" :: "kbd" :: nil);
+	tkclient->onscreen(t, nil);
+
+	lastx, lasty: int;
+	for (;;) {
+		alt {
+		s := <-cmd =>
+			(nil, cmdstr) := sys->tokenize(s, " \et\en");
+			case hd cmdstr {
+			"quit" =>
+				exit;
+			"clear" =>
+				tk->cmd(t, ".c delete all; update");
+			"b1down" =>
+				lastx = int hd tl cmdstr;
+				lasty = int hd tl tl cmdstr;
+				cstr := sys->sprint(".c create line %d %d %d %d -width 2",
+					lastx, lasty, lastx, lasty);
+				tk->cmd(t, cstr);
+			"b1motion" =>
+				x := int hd tl cmdstr;
+				y := int hd tl tl cmdstr;
+				cstr := sys->sprint(".c create line %d %d %d %d -width 2",
+					lastx, lasty, x, y);
+				tk->cmd(t, cstr);
+				lastx = x; lasty = y;
+			}
+
+		p := <-t.ctxt.ptr =>
+			tk->pointer(t, *p);
+
+		c := <-t.ctxt.kbd =>
+			tk->keyboard(t, c);
+
+		ctl := <-winctl or
+		ctl = <-t.ctxt.ctl or
+		ctl = <-t.wreq =>
+			tkclient->wmctl(t, ctl);
+		}
+		tk->cmd(t, "update");
+	}
+}
+.P2
+.nr dT 8
+.nr dP \n(dP-1
+.LP
+The function
+.CW "toplevel"
+returns a tuple containing the
+.CW Tk->Toplevel
+for the new window and a channel upon which the
+window manager will send messages for events such as
+hitting the exit button.
+An earlier example assigned the channel value to
+.CW nil ,
+discarding it; here it is assigned the name
+.CW winctl .
+The parameters to
+.CW toplevel
+include a graphics context
+.CW ctxt
+where the window will be created,
+a configuration string (simply
+.CW nil
+here),
+the program name (which appears in the window's ``title bar'' if it has one),
+and a value
+.CW Tkclient->Appl
+that denotes a style of window suitable for most applications.
+Note that
+.CW ctxt
+was one of the arguments to
+.CW init .
+(We do not use the argument list for
+.CW init ,
+and so declare it as
+.CW nil ).
+.LP
+The program creates a canvas for drawing,
+a button to clear the canvas, and a button to quit.
+The sequence of calls to
+.CW "tk->cmd"
+creates the picture and sets up the bindings.
+The buttons are created with a
+.CW -command
+to send a suitable string on channel
+.CW cmd ,
+and two
+.CW bind
+commands make the same channel the target
+for messages about mouse button presses and movement in the canvas.
+Note the
+.CW %x
+and
+.CW %y
+parameters in the latter case to include the mouse's coordinates in the string.
+.LP
+The window manager sends keyboard and mouse input
+to the currently selected window using two more channels
+.CW t.ctxt.kbd
+and
+.CW t.ctxt.ptr .
+A further channel
+.CW t.wreq
+is used by the
+.CW Tk
+module itself to request changes to the window displaying
+.CW Toplevel
+.CW t .
+.LP
+Now there are many channels carrying events:
+one for the buttons and canvas created by the drawing program
+itself, one for the mouse, one for the keyboard,
+and three for window management.
+We use an
+.CW "alt"
+statement to select from events on any of those channels.
+The expression
+.P1
+s := <-cmd
+.P2
+declares a variable
+.CW "s"
+of the type carried by the channel
+.CW "cmd" ,
+i.e., a
+.CW "string" ;
+when a string is received on the channel, the assignment is executed,
+and the subsequent
+.CW case
+decodes the message.
+The channel
+.CW t.ctxt.ptr
+carries references to
+.CW Draw->Pointer
+values, which give the state and position of the pointing device
+(mouse or stylus).
+They are handed as received to
+.CW tk->pointer
+for processing by Tk.
+Similarly, Unicode characters from the keyboard are given to Tk using
+.CW tk->keyboard .
+Internally, Tk hands those values on to the various widgets for processing, possibly
+resulting in messages being sent on one of the other channels.
+Finally, a value received from any of the
+.CW "winctl" ,
+.CW t.ctxt.ctl
+or
+.CW t.wreq
+channels is passed back to
+.CW Tkclient 's
+.CW "wmctl"
+function to be handled there.
+.LP
+As another example,
+here is the startup code for an implementation of
+Othello, adapted from a Java version
+by Muffy Barkocy, Arthur van Hoff, and Ben Fry.
+.nr dT 4
+.nr dP \n(dP+1
+.P1
+init(ctxt: ref Draw->Context, args: list of string)
+{
+	sys = load Sys Sys->PATH;
+	tk = load Tk Tk->PATH;
+	tkclient = load Tkclient Tkclient->PATH;
+
+	sys->pctl(Sys->NEWPGRP, nil);
+
+	tkclient->init();
+.P3
+	(t, winctl) := tkclient->toplevel(ctxt, nil, "Othello", Tkclient->Appl);
+.P3
+	cmd := chan of string;
+	tk->namechan(t, cmd, "cmd");
+	tk->cmd(t, "canvas .c -height 400 -width 400 -background green");
+	tk->cmd(t, "frame .f");
+	tk->cmd(t, "label .f.l -text {Othello?} -background white");
+	tk->cmd(t, "button .f.c -text {Reset} -command {send cmd Reset}");
+	tk->cmd(t, "button .f.d -text {Quit} -command {send cmd Quit}");
+	tk->cmd(t, "pack .f.l .f.c .f.d -side left -fill x -expand 1");
+	tk->cmd(t, "pack .c .f -side top -fill x");
+	tk->cmd(t, "bind .c <ButtonRelease-1> {send cmd B1up %x %y}");
+
+	for (i := 1; i < 9; i++)
+		for (j := 1; j < 9; j++) {
+			coord := sys->sprint("%d %d %d %d",
+				SQ*i, SQ*j, SQ*(i+1), SQ*(j+1));
+			tk->cmd(t, ".c create rectangle " + coord +
+				" -outline black -width 2");
+		}
+	tk->cmd(t, "update");
+	lasterror(t, "init");
+	tkclient->startinput(t, "ptr" :: "kbd" :: nil);
+	tkclient->onscreen(t, nil);
+
+	board = array[10] of {* => array[10] of int};
+	score = array[10] of {* => array[10] of int};
+	reinit();
+.P3
+	for (;;) {
+		alt {
+		s := <- cmd =>
+			(n, l) := sys->tokenize(s, " \et");
+			case hd l {
+			"Quit" =>
+				exit;
+			"Reset" =>
+				reinit();
+			"B1up" =>
+				x := int hd tl l;
+				y := int hd tl tl l;
+				mouseUp(x, y);
+			}
+
+		p := <-t.ctxt.ptr =>
+			tk->pointer(t, *p);
+
+		c := <-t.ctxt.kbd =>
+			tk->keyboard(t, c);
+
+		ctl := <-winctl or
+		ctl = <-t.ctxt.ctl or
+		ctl = <-t.wreq =>
+			tkclient->wmctl(t, ctl);
+		}
+	}
+}
+.P2
+.nr dP \n(dP-1
+.nr dT 8
+.FG "f2.ps" 4.8i
+.ce
+.I "Figure 3. Screen shot of Inferno display showing Othello window."
+.fg
+.LP
+If some call to the
+.CW "Tk"
+module results in an error,
+an error string is made available in a pseudo-variable
+.CW "lasterror"
+maintained by
+.CW "Tk" .
+When this variable is read, it is reset.
+The function
+.CW "lasterror"
+shows how to test and print this variable:
+.P1
+lasterror(t: ref Tk->Toplevel, where: string)
+{
+ s := tk->cmd(t, "variable lasterror");
+ if (s != nil)
+ sys->print("%s: tk error %s\en", where, s);
+}
+.P2
+In general, the Inferno implementation of
+.CW "Tk"
+does not provide variables except for a few special ones like this.
+The most common instance is a variable that links
+a set of radiobuttons.
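+.LP
+A related sketch, assuming that a failed command makes
+.CW tk->cmd
+return a string beginning with
+.CW ! :
+the hypothetical helper
+.CW tkcmds
+runs a list of commands and reports each failure as it happens,
+rather than testing
+.CW lasterror
+afterwards.
+.P1
+tkcmds(t: ref Tk->Toplevel, cmds: list of string)
+{
+	for (; cmds != nil; cmds = tl cmds) {
+		e := tk->cmd(t, hd cmds);
+		# assumption: an error result starts with '!'
+		if (e != nil && e[0] == '!')
+			sys->fprint(sys->fildes(2), "tk error: %s: %s\en", hd cmds, e);
+	}
+}
+.P2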
+.NH 2
+Acknowledgements
+.LP
+I am very grateful to
+Steven Breitstein,
+Ken Clarkson,
+Sean Dorward,
+Eric Grosse,
+Doug McIlroy,
+Rob Pike,
+Jon Riecke,
+Dennis Ritchie,
+Howard Trickey,
+Phil Winterbottom,
+and
+Margaret Wright
+for explaining mysteries of Limbo and Inferno
+and for valuable suggestions on this paper.
binary files /dev/null b/doc/descent/descent.pdf differ
--- /dev/null
+++ b/doc/descent/mkfile
@@ -1,0 +1,15 @@
+<../fonts.pal
+
+descent.pdf:D: descent.ps
+
+descent.ps:D: descent.ms f1.ps f2.ps f3.ps mkfile
+ {echo $FONTS; cat descent.ms} | tbl | troff -mpm -mpictures | dpost >$target
+
+%.ps: %.gif
+ dpost <$stem.gif >$stem.ps
+
+%.ps: %.bit
+ aux/p9bitpost -b100 <$stem.bit >$stem.ps
+
+%.pdf: %.ps
+ ps2pdf <$stem.ps >$stem.pdf
--- /dev/null
+++ b/doc/dev.ms
@@ -1,0 +1,497 @@
+.TL
+Program Development under Inferno
+.AU
+Roger Peppé
+rog@vitanuova.com
+.SH
+Introduction
+.PP
+Inferno provides a set of programs that, used in
+combination, provide a powerful development environment
+in which to write Limbo programs.
+.I Limbo (1)
+is the compiler for the Limbo language; there
+are versions that run inside and outside the Inferno
+environment.
+.I Acme (1)
+is an integrated window system and editor, and the
+preferred source-code editing tool within Inferno.
+The Limbo debugger,
+.I wm-deb (1),
+allows interactive inspection of running Limbo programs.
+.I Stack (1)
+allows a quick inspection of the execution stack of a
+currently running process.
+.SH
+Getting started
+.PP
+This document assumes that you have already installed
+Inferno and have an Inferno window open,
+running the Inferno window manager,
+.I wm (1).
+The document
+\&``Installing Inferno'' in this volume has details on this.
+If running within emu, it is worth giving Inferno
+as large a window as possible, as it cannot be resized later.
+This paper assumes that you are using a three-button mouse,
+as Acme is not practical to use without one.
+(If you have a two-button mouse with a ``mouse wheel'',
+the wheel can be used as the middle button.)
+The first thing to do is to get Acme going. By clicking
+on the Vita Nuova logo at the bottom left of the window,
+you can display a menu naming some preconfigured commands.
+If this has an ``Acme'' entry, then just clicking on that entry
+will start Acme. If not, then click on the ``Shell'' entry,
+and type
+.P1
+acme
+.P2
+to start it up. The Acme window should then appear,
+filling most of the screen (the window manager toolbar
+should still be visible).
+.SH
+Acme basics
+.PP
+For a general overview and the rationale behind Acme, see ``Acme:
+A User Interface for Programmers'', elsewhere in this volume,
+and for detailed documentation, see
+.I acme (1).
+The basics are as follows:
+.PP
+Acme windows are text-only and organised into columns.
+A distinctive feature of Acme is that there are no graphical
+title bars to windows; instead, each window (and additionally
+each column, and the whole Acme window itself) has
+a textual
+.I tag ,
+which can be edited at will, and is initially primed to contain
+a few appropriate commands.
+.PP
+An Acme command is just represented by text; any textual
+command word may be executed simply by clicking with the middle
+mouse button on the word. (See ``Acme mouse commands'', below).
+If Acme recognizes the word that has been clicked on
+as one of its internal commands (e.g. Put, Undo), then it will take the appropriate
+action; otherwise it will run the text as a shell command.
+(See
+.I sh (1)).
+.SH
+Acme mouse commands
+.PP
+Mouse usage within Acme is somewhat more versatile
+than in most other window systems. Each of the three
+mouse buttons has its own action, and there are also
+actions bound to
+.I chords
+of mouse buttons (i.e. mouse buttons depressed simultaneously).
+Mouse buttons are numbered from left (1) to right (3).
+Button 1 follows similar conventions to other window systems -
+it selects text; a double click will select a line if at the beginning or end
+of a line, or match brackets if on a bracket character, or select
+a word otherwise.
+Button 2, as mentioned above, executes an
+Acme command; a single click with button 2 will execute
+the single word under the click, otherwise the swept text
+will be executed.
+Button 3 is a general ``look'' operator; if the text under the
+click represents a filename, then Acme will open a new
+window for the file and read it in, otherwise it will search
+within the current window for the next occurrence of the
+text.
+Clicking button 2 or button 3 on some text already selected
+by button 1 causes the click to refer exactly to the text
+selected, rather than gathering likely-looking characters
+from around the click as is the default.
+.PP
+There are two mouse chord sequences which are
+commonly used in Acme (and you will find that some
+other programs in the system also recognise these sequences,
+e.g.
+.I wm-sh (1)).
+They are both available once some text
+has been selected by dragging the mouse with button 1,
+but before the button has been released. At this point,
+touching button 2 will delete the selected text and save
+it in Acme's
+.I snarf
+buffer; clicking button 3 replaces the selected text with the contents
+of the snarf buffer. Before button 1 has been released,
+these two buttons reverse each other's actions, so, for
+example, selecting some text with button 1, keeping button 1
+held down, then clicking button 2 and button 3 in succession,
+will save the selected text in the snarf buffer while leaving the
+original intact.
+The following table summarises the mouse commands in
+Acme:
+.KS
+.TS
+center box;
+l l .
+B1	Select text.
+B2	Execute text.
+B3	Open file or search for text.
+B1-B2	Cut text.
+B1-B3	Paste text.
+B2-B3	Cancel the pending B2 action.
+B3-B2	Cancel the pending B3 action.
+.TE
+.ce
+.I "Acme mouse command summary"
+.KE
+
+.SH
+Scrolling and resizing Acme windows
+.PP
+The scroll bars in Acme are somewhat different from
+conventional scroll bars (including the scroll bars found
+in other parts of Inferno). Clicking, or dragging, with
+button-2 on the scrollbar acts the most like the conventional
+behaviour, namely that the further down the scroll bar
+you click, the further down the file you are shown.
+.PP
+True to form, however, Acme also puts
+the other buttons to use: button-1 and button-3
+move backwards and forwards through the file respectively.
+The nearer the mouse is to the top of the scrollbar, the
+slower the movement. Holding one of these buttons
+down on the scrollbar will cause the scrolling motion
+to auto-repeat, so it is easy to scroll gently through the
+entire file, for instance.
+.PP
+The small square at the top left of each Acme window is
+the handle for resizing the window. Dragging this square
+from one place to another (within Acme) will move the
+window to the new place. A single button click in this square
+will grow the window: button 1 grows it a little bit; button 2
+grows it as much as possible without obscuring the other
+window titles in the column; button 3 grows it so it covers
+the whole column (all other windows in the column are
+obscured).
+.SH
+Creating a new file
+.PP
+All Limbo programs are composed of
+.I modules
+and each module is stored in its own file. To write a Limbo
+program, you need to write at least one module,
+the Limbo
+.I "source file" ,
+which will then be compiled into Dis code which can
+then be run by the Inferno Virtual Machine (VM).
+The first step is to decide where to store the file.
+When Acme starts up, it creates a new window containing
+a list of all the files in the directory in which it was started
+(usually your home directory). As a consequence of the
+mouse rules above, a click of button-3 on any of those
+filenames in that window will open a new window
+showing that file or, if it is a directory, a list of the
+files and directories it contains.
+.PP
+An important aspect of Acme's mouse commands is
+that each command is interpreted
+.I "relative to the window's current directory" ,
+where the current directory is determined from
+the filename in the window's tag. For instance,
+Acme commands executed in the tag or body of
+a window on the file
+.CW "/usr/joebloggs/myfile.txt"
+would run in the directory
+.CW /usr/joebloggs .
+.PP
+So, to create a new file in Acme, first open the
+directory in which to create the file. (If this is
+your home directory, then it's probably already on the screen;
+otherwise, you can just type (anywhere) the name of
+the directory, and button-3 click on it. If the directory
+does not exist, then no window will be created.)
+Then, within the directory's window or its tag,
+choose a name,
+.I filename ,
+for your file (I'll use
+.CW myprog
+from here on,
+for explanatory convenience),
+type the text:
+.P1
+New \fIfilename\fP.b
+.P2
+select this text (the Escape key can also be used to highlight
+text that you have just typed), and button-2 click on it.
+This should create a new empty window in which you
+can edit your Limbo source file. It will also create a
+window giving a warning that the file does not
+currently exist - you can get rid of this by clicking
+with button-2 on the text
+.CW Del
+in the tag of that window.
+.SH
+Editing the source file
+.PP
+You can now edit text in the new window.
+Type in the following program:
+.P1
+implement Myprog;
+include "sys.m";
+ sys: Sys;
+include "draw.m";
+
+Myprog: module {
+ init: fn(nil: ref Draw->Context, argv: list of string);
+};
+
+init(nil: ref Draw->Context, argv: list of string)
+{
+ sys = load Sys Sys->PATH;
+ sys->print("Hello, world\en");
+}
+.P2
+When typing it in, note that two new commands have appeared
+in the tag of the new window:
+.CW Put
+and
+.CW Undo .
+.CW Put
+saves the file;
+.CW Undo
+undoes the last change to the file, and successive
+executions of
+.CW Undo
+will move further back in time. In case you move
+too far back accidentally, there is also
+.CW Redo ,
+which redoes a change that you have just undone.
+Changes in the body of any window in Acme can be undone
+this way.
+.PP
+Click with button-2 on the
+.CW Put
+command, and the file is now saved and ready to be
+compiled. If you have problems at this point (say
+Acme complains about not being able to write the
+file), you have probably chosen an inappropriate
+directory, one in which you do not have write permission,
+in which to put the file. In this case you can change the
+name of the file simply by editing its name in the window's
+tag, and clicking on
+.CW Put
+again.
+.SH
+Compiling the source file
+.PP
+Now, you are in a position to compile the Limbo program.
+Although you can execute the Limbo compiler directly
+from the tag of the new file's window, it is usually more
+convenient to do it from a shell window. To start a shell
+window, type
+.CW win '' ``
+at the right of the tag of the new file's window, select
+it, and click with button-2 on it.
+A new window should appear showing a shell prompt (usually
+.CW "; " '' ``
+or
+.CW "% " ''). ``
+At this prompt, you can type any of the commands mentioned
+in Section 1 of the Programmer's Manual.
+Note that, following Acme's usual rule, the shell has
+started up in the same directory as the new file;
+typing
+.P1
+lc
+.P2
+at the prompt will show all the files in the directory,
+including hopefully the newly written Limbo file.
+.PP
+Type the following command to the shell:
+.P1
+limbo -g myprog.b
+.P2
+If you typed in the example program correctly,
+then you'll get a short pause, and then another shell
+prompt. This indicates a successful compilation (no
+news is good news), in which case you will now have
+two new files in the current directory,
+.CW myprog.sbl
+and
+.CW myprog.dis .
+The
+.CW -g
+option to the
+.CW limbo
+command directed it to produce the
+.CW myprog.sbl
+file, which contains symbolic information
+relating the source code to the Dis executable file.
+The
+.CW myprog.dis
+file contains the actual executable file.
+At this point, if you type
+.CW lc ,
+to get a listing of the files in the current directory,
+and then click with button-2 on the
+.CW myprog.dis
+file, you should see the output ``Hello, world''.
+You could also just type
+.CW myprog
+at the shell prompt.
+.PP
+If you are normal, however, the above compilation
+probably failed because of some mistyped characters
+in the source code; and for larger newly created programs,
+in my experience, this
+is almost invariably the case.
+If you got no errors in the above
+compilation, try changing
+.CW sys->print
+to
+.CW print ,
+saving the file and compiling it again,
+then continue with the next section.
+.SH
+Finding compilation errors
+.PP
+When the Limbo compiler finds errors, it prints
+the errors, one per line, each one looking something
+like the following:
+.P1
+myprog.b:13: print is not declared
+.P2
+This shows the filename where the error has occurred,
+its line number in the file, and a description of the error.
+Acme's button-3 mouse clicking makes it extremely easy
+to see where in the source code the error has occurred.
+Click with button-3 anywhere in the filename on the
+line of the compilation error, and Acme will automatically
+take the cursor to the file of that name and highlight
+the correct line.
+.PP
+If there had been no currently appropriate open Acme
+window representing the file, then a new one would
+be created, and the appropriate line selected.
+.PP
+Edit
+.CW myprog.b
+until you have a program that compiles successfully
+and produces the ``Hello, world'' output.
+For a program as simple as this, that's all there
+is to it - you now know the essential stages involved in
+writing a Limbo program; there's just the small matter
+of absorbing the Limbo language and familiarising
+yourself with the libraries (``The Limbo Programming Language''
+elsewhere in this volume,
+and
+.I intro (2)
+are the two essential starting points here).
+.SH
+Finding run-time errors
+.PP
+For larger programs, there is the problem of programs
+that die unexpectedly with a run-time error. This
+will happen when, for instance, a Limbo program uses a reference
+that has not been initialised, or refers to an out-of-bounds
+array element.
+.PP
+When a Limbo program dies with a run-time exception,
+it does not go away completely, but remains hanging
+around, dormant, in a
+.I broken
+state; the state that it was in when it died may
+now be examined at leisure. To experiment with this,
+edit the Myprog module above to delete the line
+that loads the
+.CW Sys
+module
+.CW "sys = load Sys" ...), (
+and recompile the program.
+.PP
+This time when you come to run
+.CW myprog ,
+it will die, printing a message like:
+.P1
+sh: 319 "Myprog":module not loaded
+.P2
+The number
+.CW 319
+is the
+.I "process id"
+(or just
+.I pid )
+of the broken process. The command
+.CW ps ,
+which shows all currently running processes,
+can be used at this point - you will see a line like this:
+.P1
+ 319 245 rog broken 64K Myprog
+.P2
+The first number is the pid of the process;
+the second is the
+.I "process group"
+id of the process; the third field gives the
+owner of the process; the fourth gives its state
+(broken, in this case); the fifth shows the current
+size of the process, and the last gives the name
+of the module that the process is currently running.
+.PP
+The
+.CW stack
+command can be used to quickly find the line
+at which the process has broken; type:
+.P1
+ stack \fIpid\fP
+.P2
+where
+.I pid
+is the number mentioned in the ``module not loaded''
+message (319 in this case).
+It produces something like the following output:
+.P1
+init() myprog.b:12.1, 29
+unknown fn() Module /dis/sh.dis PC 1706
+.P2
+As usual, a quick button-3 click on the
+.CW myprog.b
+part of the first line takes you to the appropriate
+part of the source file. The reason that the program
+has died here is that, in Limbo, all external modules
+must be explicitly loaded before they can be used; to
+try to call an uninitialised module is an error
+and causes an exception.
+.SH
+More sophisticated debugging
+.PP
+.CW Stack
+is fine for getting a quick summary of the state
+in which a program has died, but there are
+times when such a simple post-mortem analysis
+is inadequate. The
+.CW wm/deb
+(see
+.I wm-deb\fR(1))\fP
+command provides an interactive windowing
+debugger for such occasions.
+It runs outside Acme,
+in the default window system. A convenient way
+to start debugging an existing process is
+to raise
+.CW wm/task
+(``Task Manager'' on the
+main menu), select with the mouse the process
+to debug, and click ``Debug''. This will start
+.CW wm/deb
+on that process. Before it can start, the debugger will ask
+for the names of any source files that it has not been
+able to find (usually this includes the source for
+the shell, as the module being debugged is often
+started by the shell, and so the top-level function will
+be in the shell's module).
+.PP
+.CW Wm/deb
+can be used to debug multiple threads, to inspect
+the data structures in a thread, and to interactively
+step through the running of a thread (single stepping).
+See
+.I wm-deb (1)
+for details.
+
+\" further afield?
+\" other development tools?
+\" tools to come?
binary files /dev/null b/doc/dev.pdf differ
--- /dev/null
+++ b/doc/dis.ms
@@ -1,0 +1,1824 @@
+.so /sys/lib/tmac/tmac.uni
+.TL
+Dis Virtual Machine Specification
+.AU
+.I "Lucent Technologies Inc"
+.I "30 September 1999"
+
+.I "Extensively revised by Vita Nuova Limited"
+.I "5 June 2000, 9 January 2003"
+.NH 1
+Introduction
+.LP
+The Dis virtual machine provides the execution environment for programs running under the Inferno operating system. The virtual machine models a CISC-like, three operand, memory-to-memory architecture. Code can either be interpreted by a C library or compiled on-the-fly into machine code for the target architecture.
+.LP
+This paper defines the virtual machine informally.
+A separate paper by Winterbottom and Pike[2] discusses its design.
+The Dis object file format is also defined here.
+Literals and keywords are in
+.CW typewriter
+typeface.
+.NH 1
+Addressing Modes
+.SH
+Operand Size
+.LP
+Operand sizes are defined as follows: a byte is 8 bits, a word or pointer is 32 bits, a float is 64 bits, a big integer is 64 bits. The operand size of each instruction is encoded explicitly by the operand code. The operand size and type are specified by the last character of the instruction mnemonic:
+.IP
+.TS
+lf(CW) lfR .
+W word, 32-bit two's complement
+B byte, 8-bit unsigned
+F float, 64-bit IEEE format
+L big, 64-bit two's complement
+P pointer
+C Unicode string encoded in UTF-8
+M memory
+MP memory containing pointers
+.TE
+.LP
+Two more operand types are defined to provide `short'
+types for use by languages other than Limbo:
+signed 16-bit integers, called `short word'
+here, and 32-bit IEEE format floating-point numbers, called `short float' or `short real' here.
+Support for them is limited to conversion to and from words or floats respectively;
+the instructions are marked below with a dagger (†).
+.SH
+Memory Organization
+.LP
+Memory for a thread is divided into several separate regions. The code segment stores either a decoded virtual machine instruction stream suitable for execution by the interpreter or flash compiled native machine code for the host CPU. Neither type of code segment is addressable from the instruction set. At the object code level, PC values are offsets, counted in instructions, from the beginning of the code space.
+.LP
+Data memory is a linear array of bytes, addressed using 32-bit pointers. Words are stored in the native representation of the host CPU. Data types larger than a byte must be stored at addresses aligned to
+a multiple of the data size. A thread executing a module has access to two regions of addressable data memory. A module pointer
+.CW "mp" \& (
+register) defines a region of global storage for a particular module, a frame pointer
+.CW "fp" \& (
+register) defines the current activation record or frame for the thread. Frames are allocated dynamically from a stack by function call and return instructions. The stack is extended automatically from the heap.
+.LP
+The
+.CW mp
+and
+.CW fp
+registers cannot be addressed directly and can therefore be modified only by call and return instructions.
+.SH
+Effective Addresses
+.LP
+Each instruction can potentially address three operands. The source and destination operands are general, but the middle operand can use any address mode except double indirect. If the middle operand of a three address instruction is omitted, it is assumed to be the same as the destination operand.
+.LP
+The general operands generate an effective address from three basic modes: immediate, indirect and double indirect. The assembler syntax for each mode is:
+.IP
+.TS
+lf(CW) lfR .
+10(fp) 30-bit signed indirect from fp
+20(mp) 30-bit signed indirect from mp
+$0x123 30-bit signed immediate value
+10(20(fp)) two 16-bit unsigned offsets double indirect from fp
+10(20(mp)) two 16-bit unsigned offsets double indirect from mp
+.TE
+.SH
+Garbage Collection
+.LP
+The Dis machine performs both reference counted and real time mark and sweep garbage collection. This hybrid approach allows code to be generated in several styles: pure reference counted, mark and sweep, or a hybrid of the two approaches. Compiler writers have the freedom to choose how specific types are handled by the machine to optimize code for performance or language implementation. Instruction selection determines which algorithm will be applied to specific types.
+.LP
+When using reference counting, pointers are a special operand type and should only be manipulated using the pointer instructions in order to ensure the correct functioning of the garbage collector. Every memory location that stores a pointer must be known to the interpreter so that it can be initialized and deallocated correctly. The information is transmitted in the form of type descriptors in the object module. Each type descriptor contains a bit vector for a particular type where each bit corresponds to a word in memory. Type descriptors are generated automatically by the Limbo compiler. The assembler syntax for a type descriptor is:
+.P1
+desc $10, 132, "001F"
+.P2
+The first parameter is the descriptor number, the second is the size in bytes, and the third a pointer map. The map contains a list of hex bytes where each byte maps eight 32-bit words. The most significant bit represents the lowest memory address.
+A one bit indicates a pointer in memory. The map need not have an entry for every byte and unspecified bytes are assumed zero.
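+.LP
+The pointer map rules above can be illustrated with a minimal C sketch.
+The function name is_pointer_word and its interface are hypothetical,
+introduced only for this illustration; it is not the interpreter's own code.
+It assumes the map is given as a string of hex digits, two per map byte.
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Convert one hex digit to its value; returns -1 for a non-hex character. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: return 1 if word number `word` (origin 0) of an
 * object is marked as a pointer by the hex pointer map `map`, else 0.
 * Words beyond the end of the map are treated as non-pointers, following
 * the rule that unspecified map bytes are assumed zero.
 */
int is_pointer_word(const char *map, size_t word)
{
	size_t nbytes, byteidx;
	int hi, lo, byte;

	assert(map != NULL);
	nbytes = strlen(map) / 2;	/* two hex digits per map byte */
	byteidx = word / 8;		/* each map byte covers eight words */
	if (byteidx >= nbytes)
		return 0;
	hi = hexval(map[2*byteidx]);
	lo = hexval(map[2*byteidx + 1]);
	if (hi < 0 || lo < 0)
		return 0;		/* malformed digits: treated as no pointer in this sketch */
	byte = (hi << 4) | lo;
	/* the most significant bit corresponds to the lowest address */
	return (byte >> (7 - word % 8)) & 1;
}
```
+.LP
+For the map "001F" in the example descriptor, this sketch reports words 11 to 15
+as pointers and every other word as a non-pointer.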
+.LP
+Throughout this description, the symbolic constant
+.CW H
+refers to a nil pointer.
+.NH 1
+Instruction Set
+.SH
+add\fIx\fP \- Add
+.P1
+Syntax: addb src1, src2, dst
+ addf src1, src2, dst
+ addw src1, src2, dst
+ addl src1, src2, dst
+Function: dst = src1 + src2
+.P2
+.LP
+The
+.CW "add"
+instructions compute the sum of the operands addressed by
+.CW "src1"
+and
+.CW "src2"
+and store the result in the
+.CW "dst"
+operand. For
+.CW "addb"
+the result is truncated to eight bits.
+.SH
+addc \- Add strings
+.P1
+Syntax: addc src1, src2, dst
+Function: dst = src1 + src2
+.P2
+.LP
+The
+.CW "addc"
+instruction concatenates the two UTF strings pointed to by
+.CW "src1"
+and
+.CW "src2" ;
+the result is placed in the pointer addressed by
+.CW "dst" .
+If both pointers are
+.CW "H"
+the result will be a zero length string rather than
+.CW "H" .
+.SH
+alt \- Alternate between communications
+.P1
+Syntax: alt src, dst
+.P2
+The
+.CW "alt"
+instruction selects between a set of channels ready to communicate. The
+.CW src
+argument is the address of a structure of the following form:
+.P1
+struct Alt {
+ int nsend; /* Number of senders */
+ int nrecv; /* Number of receivers */
+ struct {
+ Channel* c; /* Channel */
+ void* val; /* Address of lval/rval */
+ } entry[];
+};
+.P2
+The vector is divided into two sections; the first lists the channels ready to send values, the second lists channels either ready to receive or an array of channels each of which may be ready to receive. The counts of the sender and receiver channels are stored as the first and second words addressed by
+.CW src .
+An
+.CW "alt"
+instruction proceeds by testing each channel for readiness to communicate. A ready channel is added to a list. If the list is empty after each channel has been considered, the thread blocks at the
+.CW "alt"
+instruction waiting for a channel to become ready; otherwise, a channel is picked at random from the ready set.
+.LP
+The
+.CW "alt"
+instruction then uses the selected channel to perform the communication using the
+.CW "val"
+address as either a source for send or a destination for receive. The numeric index of the selected vector element is placed in
+.CW "dst" .
+.SH
+and\fIx\fP \- Logical AND
+.P1
+Syntax: andb src1, src2, dst
+ andw src1, src2, dst
+ andl src1, src2, dst
+Function: dst = src1 & src2
+.P2
+The instructions compute the bitwise AND of the two operands addressed by
+.CW "src1"
+and
+.CW "src2"
+and store the result in the
+.CW "dst"
+operand.
+.SH
+beq\fIx\fP \- Branch equal
+.P1
+Syntax: beqb src1, src2, dst
+ beqc src1, src2, dst
+ beqf src1, src2, dst
+ beqw src1, src2, dst
+ beql src1, src2, dst
+Function: if src1 == src2 then pc = dst
+.P2
+If the
+.CW "src1"
+operand is equal to the
+.CW "src2"
+operand, then control is transferred to the program counter specified by the
+.CW "dst"
+operand.
+.SH
+bge\fIx\fP \- Branch greater or equal
+.P1
+Syntax: bgeb src1, src2, dst
+ bgec src1, src2, dst
+ bgef src1, src2, dst
+ bgew src1, src2, dst
+ bgel src1, src2, dst
+Function: if src1 >= src2 then pc = dst
+.P2
+If the
+.CW "src1"
+operand is greater than or equal to the
+.CW "src2"
+operand, then control is transferred to the program counter specified by the
+.CW "dst"
+operand. This instruction performs a signed comparison.
+.SH
+bgt\fIx\fP \- Branch greater
+.P1
+Syntax: bgtb src1, src2, dst
+ bgtc src1, src2, dst
+ bgtf src1, src2, dst
+ bgtw src1, src2, dst
+ bgtl src1, src2, dst
+Function: if src1 > src2 then pc = dst
+.P2
+If the
+.CW "src1"
+operand is greater than the
+.CW "src2"
+operand, then control is transferred to the program counter specified by the
+.CW "dst"
+operand. This instruction performs a signed comparison.
+.SH
+ble\fIx\fP \- Branch less than or equal
+.P1
+Syntax: bleb src1, src2, dst
+ blec src1, src2, dst
+ blef src1, src2, dst
+ blew src1, src2, dst
+ blel src1, src2, dst
+Function: if src1 <= src2 then pc = dst
+.P2
+If the
+.CW "src1"
+operand is less than or equal to the
+.CW "src2"
+operand, then control is transferred to the program counter specified by the
+.CW "dst"
+operand. This instruction performs a signed comparison.
+.SH
+blt\fIx\fP \- Branch less than
+.P1
+Syntax: bltb src1, src2, dst
+ bltc src1, src2, dst
+ bltf src1, src2, dst
+ bltw src1, src2, dst
+ bltl src1, src2, dst
+Function: if src1 < src2 then pc = dst
+.P2
+If the
+.CW "src1"
+operand is less than the
+.CW "src2"
+operand, then control is transferred to the program counter specified by the
+.CW "dst"
+operand.
+.SH
+bne\fIx\fP \- Branch not equal
+.P1
+Syntax: bneb src1, src2, dst
+ bnec src1, src2, dst
+ bnef src1, src2, dst
+ bnew src1, src2, dst
+ bnel src1, src2, dst
+Function: if src1 != src2 then pc = dst
+.P2
+If the
+.CW "src1"
+operand is not equal to the
+.CW "src2"
+operand, then control is transferred to the program counter specified by the
+.CW "dst"
+operand.
+.SH
+call \- Call local function
+.P1
+Syntax: call src, dst
+Function: link(src) = pc
+ frame(src) = fp
+ mod(src) = 0
+ fp = src
+ pc = dst
+.P2
+The
+.CW "call"
+instruction performs a function call to a routine in the same module. The
+.CW "src"
+argument specifies a frame created by
+.CW "new" .
+The current value of
+.CW "pc"
+is stored in link(src), the current value of
+.CW "fp"
+is stored in frame(src) and the module link register is set to 0. The value of
+.CW "fp"
+is then set to
+.CW "src"
+and control is transferred to the program counter specified by
+.CW dst .
+.SH
+case \- Case compare integer and branch
+.P1
+Syntax: case src, dst
+Function: pc = 0..i: dst[i].pc where
+ dst[i].lo <= src && src < dst[i].hi
+.P2
+The
+.CW "case"
+instruction jumps to a new location specified by a range of values. The
+.CW "dst"
+operand points to a table in memory containing
+.CW "i"
+entries. Each entry is three words long: the first word specifies a low value, the second word specifies a high value, and the third word specifies a program counter. The first word of the table gives the number of entries. The
+.CW "case"
+instruction searches the table for the first matching value where the
+.CW "src"
+operand is greater than or equal to the low word and less than the high word. Control is transferred to the program counter stored in the third word of the matching entry.
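+.LP
+The table search can be sketched in C as follows.
+The name case_lookup is hypothetical, and the value returned when no
+entry matches (-1) is an assumption of this sketch; the text above does
+not say what happens in that case.
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: tab[0] holds the entry count; entry i occupies
 * tab[1+3*i] (low), tab[2+3*i] (high) and tab[3+3*i] (pc).
 * Returns the pc of the first entry with low <= src < high, or -1 when
 * no entry matches (the no-match result is an assumption of this sketch).
 */
int32_t case_lookup(const int32_t *tab, int32_t src)
{
	int32_t i, n;

	assert(tab != NULL);
	n = tab[0];
	for (i = 0; i < n; i++) {
		const int32_t *e = &tab[1 + 3*i];
		if (src >= e[0] && src < e[1])
			return e[2];
	}
	return -1;
}
```
+.LP
+With a table of two entries covering [0, 10) and [10, 20), a source value
+of 9 selects the first entry's program counter and 10 selects the second.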
+.SH
+casec \- Case compare string and branch
+.P1
+Syntax: casec src, dst
+Function: pc = 0..i: dst[i].pc where
+ dst[i].lo <= src && src < dst[i].hi
+.P2
+The
+.CW "casec"
+instruction jumps to a new location specified by a range of string constants. The table is the same as described for the
+.CW case
+instruction.
+.SH
+cons\fIx\fP \- Allocate new list element
+.P1
+Syntax: consb src, dst
+ consc src, dst
+ consf src, dst
+ consl src, dst
+ consm src, dst
+ consmp src, dst
+ consp src, dst
+ consw src, dst
+Function: p = new(src, dst)
+ dst = p
+.P2
+The
+.CW "cons"
+instructions add a new element to the head of a list. A new list element is composed from the
+.CW "src"
+operand and a pointer to the head of an extant list specified by
+.CW "dst" .
+The resulting element is stored back into
+.CW "dst" .
+.SH
+cvtac \- Convert byte array to string
+.P1
+Syntax: cvtac src, dst
+Function: dst = string(src)
+.P2
+The
+.CW "src"
+operand must be an array of bytes, which is converted into a character string and stored in
+.CW "dst" .
+The new string is a copy of the bytes in
+.CW "src" .
+.SH
+cvtbw \- Convert byte to word
+.P1
+Syntax: cvtbw src, dst
+Function: dst = src & 0xff
+.P2
+A byte is fetched from the
+.CW "src"
+operand, extended to the size of a word, and then stored into
+.CW "dst" .
+.SH
+cvtca \- Convert string to byte array
+.P1
+Syntax: cvtca src, dst
+Function: dst = array(src)
+.P2
+The
+.CW "src"
+operand must be a string which is converted into an array of bytes and stored in
+.CW "dst" .
+The new array is a copy of the characters in
+.CW "src" .
+.SH
+cvtcf \- Convert string to real
+.P1
+Syntax: cvtcf src, dst
+Function: dst = (float)src
+.P2
+The string addressed by the
+.CW "src"
+operand is converted to a floating point value and stored in the
+.CW "dst"
+operand. Initial white space is ignored; conversion ceases at the first character in the string that is not part of the representation of the floating point value.
+.SH
+cvtcl \- Convert string to big
+.P1
+Syntax: cvtcl src, dst
+Function: dst = (big)src
+.P2
+The string addressed by the
+.CW "src"
+operand is converted to a big integer and stored in the
+.CW "dst"
+operand. Initial white space is ignored; conversion ceases at the first non-digit in the string.
+.SH
+cvtcw \- Convert string to word
+.P1
+Syntax: cvtcw src, dst
+Function: dst = (int)src
+.P2
+The string addressed by the
+.CW "src"
+operand is converted to a word and stored in the
+.CW "dst"
+operand. Initial white space is ignored; after a possible sign, conversion ceases at the first non-digit in the string.
+.SH
+cvtfc \- Convert real to string
+.P1
+Syntax: cvtfc src, dst
+Function: dst = string(src)
+.P2
+The floating point value addressed by the
+.CW "src"
+operand is converted to a string and stored in the
+.CW "dst"
+operand. The string is a floating point representation of the value.
+.SH
+cvtfw \- Convert real to word
+.P1
+Syntax: cvtfw src, dst
+Function: dst = (int)src
+.P2
+The floating point value addressed by
+.CW "src"
+is converted into a word and stored into
+.CW "dst" .
+The floating point value is rounded to the nearest integer.
+.SH
+cvtfl \- Convert real to big
+.P1
+Syntax: cvtfl src, dst
+Function: dst = (big)src
+.P2
+The floating point value addressed by
+.CW "src"
+is converted into a big integer and stored into
+.CW "dst" .
+The floating point value is rounded to the nearest integer.
+.SH
+cvtfr \- Convert real to short real†
+.P1
+Syntax: cvtfr src, dst
+Function: dst = (short float)src
+.P2
+The floating point value addressed by
+.CW "src"
+is converted to a short (32-bit) floating point value and stored into
+.CW "dst" .
+The value is rounded to the nearest value representable in the short format.
+.SH
+cvtlc \- Convert big to string
+.P1
+Syntax: cvtlc src, dst
+Function: dst = string(src)
+.P2
+The big integer addressed by the
+.CW "src"
+operand is converted to a string and stored in the
+.CW "dst"
+operand. The string is the decimal representation of the big integer.
+.SH
+cvtlw \- Convert big to word
+.P1
+Syntax: cvtlw src, dst
+Function: dst = (int)src
+.P2
+The big integer addressed by the
+.CW "src"
+operand is converted to a word and stored in the
+.CW "dst"
+operand.
+.SH
+cvtsw \- Convert short word to word†
+.P1
+Syntax: cvtsw src, dst
+Function: dst = (int)src
+.P2
+The short word addressed by the
+.CW "src"
+operand is converted to a word and stored in the
+.CW "dst"
+operand.
+.SH
+cvtwb \- Convert word to byte
+.P1
+Syntax: cvtwb src, dst
+Function: dst = (byte)src;
+.P2
+The
+.CW "src"
+operand is converted to a byte and stored in the
+.CW "dst"
+operand.
+.SH
+cvtwc \- Convert word to string
+.P1
+Syntax: cvtwc src, dst
+Function: dst = string(src)
+.P2
+The word addressed by the
+.CW "src"
+operand is converted to a string and stored in the
+.CW "dst"
+operand. The string is the decimal representation of the word.
+.SH
+cvtwl \- Convert word to big
+.P1
+Syntax: cvtwl src, dst
+Function: dst = (big)src;
+.P2
+The word addressed by the
+.CW "src"
+operand is converted to a big integer and stored in the
+.CW "dst"
+operand.
+.SH
+cvtwf \- Convert word to real
+.P1
+Syntax: cvtwf src, dst
+Function: dst = (float)src;
+.P2
+The word addressed by the
+.CW "src"
+operand is converted to a floating point value and stored in the
+.CW "dst"
+operand.
+.SH
+cvtws \- Convert word to short word†
+.P1
+Syntax: cvtws src, dst
+Function: dst = (short)src;
+.P2
+The word addressed by the
+.CW "src"
+operand is converted to a short word and stored in the
+.CW "dst"
+operand.
+.SH
+cvtlf \- Convert big to real
+.P1
+Syntax: cvtlf src, dst
+Function: dst = (float)src;
+.P2
+The big integer addressed by the
+.CW "src"
+operand is converted to a floating point value and stored in the
+.CW "dst"
+operand.
+.SH
+cvtrf \- Convert short real to real†
+.P1
+Syntax: cvtrf src, dst
+Function: dst = (float)src;
+.P2
+The short (32 bit) floating point value addressed by the
+.CW "src"
+operand is converted to a 64-bit floating point value and stored in the
+.CW "dst"
+operand.
+.SH
+div\fIx\fP \- Divide
+.P1
+Syntax: divb src1, src2, dst
+ divf src1, src2, dst
+ divw src1, src2, dst
+ divl src1, src2, dst
+Function: dst = src2/src1
+.P2
+The
+.CW "src2"
+operand is divided by the
+.CW "src1"
+operand and the quotient is stored in the
+.CW "dst"
+operand. Division by zero causes the thread to terminate.
+.SH
+exit \- Terminate thread
+.P1
+Syntax: exit
+Function: exit()
+.P2
+The executing thread terminates. All resources held in the stack are deallocated.
+.SH
+frame \- Allocate frame for local call
+.P1
+Syntax: frame src1, src2
+Function: src2 = fp + src1->size
+ initmem(src2, src1);
+.P2
+The frame instruction creates a new stack frame
+for a call to a function in the same module. The frame is initialized according to the type descriptor supplied as the
+.CW src1
+operand. A pointer to the newly created frame is stored in the
+.CW src2
+operand.
+.SH
+goto \- Computed goto
+.P1
+Syntax: goto src, dst
+Function: pc = dst[src]
+.P2
+The
+.CW "goto"
+instruction performs a computed goto. The
+.CW "src"
+operand must be an integer index into a table of PC values specified by the
+.CW "dst"
+operand.
+.SH
+head\fIx\fP \- Head of list
+.P1
+Syntax: headb src, dst
+ headf src, dst
+ headm src, dst
+ headmp src, dst
+ headp src, dst
+ headw src, dst
+ headl src, dst
+Function: dst = hd src
+.P2
+The
+.CW "head"
+instructions make a copy of the first data item stored in a list. The
+.CW "src"
+operand must be a list of the correct type. The first item is copied into the
+.CW "dst"
+operand. The list is not modified.
+.SH
+indc \- Index by character
+.P1
+Syntax: indc src1, src2, dst
+Function: dst = src1[src2]
+.P2
+The
+.CW "indc"
+instruction indexes Unicode strings. The
+.CW "src1"
+operand must be a string. The
+.CW "src2"
+operand must be an integer specifying the origin-0 index in
+.CW src1
+of the (Unicode) character to store in the
+.CW "dst"
+operand.
+.SH
+indx \- Array index
+.P1
+Syntax: indx src1, dst, src2
+Function: dst = &src1[src2]
+.P2
+The
+.CW "indx"
+instruction computes the effective address of an array element. The
+.CW "src1"
+operand must be an array created by the
+.CW "newa"
+instruction. The
+.CW "src2"
+operand must be an integer. The effective address of the
+.CW "src2"
+element of the array is stored in the
+.CW "dst"
+operand.
+.SH
+ind\fIx\fP \- Index by type
+.P1
+Syntax: indb src1, dst, src2
+ indw src1, dst, src2
+ indf src1, dst, src2
+ indl src1, dst, src2
+Function: dst = &src1[src2]
+.P2
+The
+.CW "indb" ,
+.CW "indw" ,
+.CW "indf"
+and
+.CW "indl"
+instructions index arrays of the basic types. The
+.CW "src1"
+operand must be an array created by the
+.CW "newa"
+instruction. The
+.CW "src2"
+operand must be a non-negative integer index less than the array size. The effective address of the element at the index is stored in the
+.CW "dst"
+operand.
+.SH
+insc \- Insert character into string
+.P1
+Syntax: insc src1, src2, dst
+Function: src1[src2] = dst
+.P2
+The
+.CW "insc"
+instruction inserts a character into an existing string.
+The index in
+.CW "src2"
+must be a non-negative integer less than the length of the string plus one.
+(The character will be appended to the string if the index is equal to
+the string's length.)
+The
+.CW "src1"
+operand must be a string (or nil).
+The character to insert must be a valid 21-bit Unicode value represented as a word.
+.SH
+jmp \- Branch always
+.P1
+Syntax: jmp dst
+Function: pc = dst
+.P2
+Control is transferred to the location specified by the
+.CW "dst"
+operand.
+.SH
+lea \- Load effective address
+.P1
+Syntax: lea src, dst
+Function: dst = &src
+.P2
+The
+.CW "lea"
+instruction computes the effective address of the
+.CW "src"
+operand and stores it in the
+.CW "dst"
+operand.
+.SH
+lena \- Length of array
+.P1
+Syntax: lena src, dst
+Function: dst = nelem(src)
+.P2
+The
+.CW "lena"
+instruction computes the length of the array specified by the
+.CW "src"
+operand and stores it in the
+.CW "dst"
+operand.
+.SH
+lenc \- Length of string
+.P1
+Syntax: lenc src, dst
+Function: dst = utflen(src)
+.P2
+The
+.CW "lenc"
+instruction computes the number of characters in the UTF string addressed by the
+.CW "src"
+operand and stores it in the
+.CW "dst"
+operand.
+.SH
+lenl \- Length of list
+.P1
+Syntax: lenl src, dst
+Function: dst = 0;
+ for(l = src; l; l = tl l)
+ dst++;
+.P2
+The
+.CW "lenl"
+instruction computes the number of elements in the list addressed by the
+.CW "src"
+operand and stores the result in the
+.CW "dst"
+operand.
+.SH
+load \- Load module
+.P1
+Syntax: load src1, src2, dst
+Function: dst = load src2 src1
+.P2
+The
+.CW "load"
+instruction loads a new module into the heap. The module might optionally be compiled into machine code depending on the module header. The
+.CW "src1"
+operand is a pathname to the file containing the object code for the module. The
+.CW "src2"
+operand specifies the address
+of a linkage descriptor for the module (see below).
+A reference to the newly loaded module is stored in the
+.CW "dst"
+operand.
+If the module could not be loaded for any reason, then
+.CW "dst"
+will be set to
+.CW H .
+.LP
+The linkage descriptor referenced by the
+.CW src2
+operand is a table in data space that lists the functions
+imported by the current module from the module to be loaded.
+It has the following layout:
+.P1
+int nentries;
+struct { /* word aligned */
+ int sig;
+ byte name[]; /* UTF encoded name, 0-terminated */
+} entry[];
+.P2
+The
+.CW nentries
+value gives the number of entries in the table and can be zero.
+It is followed by that many linkage entries.
+Each entry is aligned on a word boundary; there can therefore
+be padding before each structure.
+The entry names the imported function in the UTF-encoded string in
+.CW name ,
+which is terminated by a byte containing zero.
+The MD5 hash of the function's type signature is given in the value
+.CW sig .
+For each entry, the
+.CW load
+instruction checks that a function with the same name and signature
+exists in the newly loaded module.
+Otherwise the load will fail and
+.CW dst
+will be set to
+.CW H .
+.LP
+The entries in the linkage descriptor form an array of linkage records
+(internal to the virtual machine) associated with the
+module pointer returned in
+.CW dst ,
+that is indexed by operators
+.CW mframe ,
+.CW mcall
+and
+.CW mspawn
+to refer to functions in that module.
+The linkage scheme provides a level of indirection that allows
+a module to be loaded using any module declaration that is a valid
+subset of the implementation module's declaration,
+and allows entry points to be added to modules without invalidating
+calling modules.
+.SH
+lsr\fIx\fP \- Logical shift right
+.P1
+Syntax: lsrw src1, src2, dst
+ lsrl src1, src2, dst
+Function: dst = (unsigned)src2 >> src1
+.P2
+The
+.CW "lsr"
+instructions shift the
+.CW "src2"
+operand right by the number of bits specified by the
+.CW "src1"
+operand, replacing the vacated bits by 0, and store the result in the
+.CW "dst"
+operand. Shift counts less than 0 or greater than the number of bits in the object have undefined results.
+This instruction is included for support of languages other than Limbo,
+and is not used by the Limbo compiler.
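+.LP
+A minimal C sketch of the logical shift follows; the function names
+mirror the mnemonics but are otherwise hypothetical.
+The result is returned as an unsigned bit pattern, and the sketch rejects
+a count equal to the operand width, which is stricter than the rule above,
+because such a shift is undefined in C.
```c
#include <assert.h>
#include <stdint.h>

/* lsrw sketch: shift the 32-bit word src2 right by src1 bits, filling with zeros. */
uint32_t lsrw(int32_t src1, int32_t src2)
{
	assert(src1 >= 0 && src1 < 32);	/* stricter than the text: C leaves a 32-bit shift undefined */
	return (uint32_t)src2 >> src1;
}

/* lsrl sketch: the same operation on a 64-bit big. */
uint64_t lsrl(int32_t src1, int64_t src2)
{
	assert(src1 >= 0 && src1 < 64);
	return (uint64_t)src2 >> src1;
}
```
+.LP
+The cast to an unsigned type before shifting is what guarantees that
+vacated bits are filled with 0 rather than copies of the sign bit.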
+.SH
+mcall \- Inter-module call
+.P1
+Syntax: mcall src1, src2, src3
+Function: link(src1) = pc
+ frame(src1) = fp
+ mod(src1) = current_moduleptr
+ current_moduleptr = src3->moduleptr
+ fp = src1
+ pc = src3->links[src2]->pc
+.P2
+The
+.CW "mcall"
+instruction calls a function in another module. The first argument specifies a new frame for the called procedure and must have been built using the
+.CW "mframe"
+instruction.
+The
+.CW "src3"
+operand is a module reference generated by a successful
+.CW "load"
+instruction.
+The
+.CW "src2"
+operand specifies the index for the called
+function in the array of linkage records associated with that module reference
+(see the
+.CW load
+instruction).
+.SH
+mframe \- Allocate inter-module frame
+.P1
+Syntax: mframe src1, src2, dst
+Function: dst = fp + src1->links[src2]->t->size
+ initmem(dst, src1->links[src2])
+.P2
+The
+.CW mframe
+instruction allocates a new frame for a procedure call into another module. The
+.CW src1
+operand specifies the location of a module pointer created as the result of a successful load instruction. The
+.CW src2
+operand specifies the index for the called function in
+the array of linkage records associated
+with that module pointer (see the
+.CW load
+instruction).
+A pointer to the initialized frame is stored in
+.CW dst .
+The
+.CW src2
+operand specifies the linkage number of the function to be called in the module specified by
+.CW src1 .
+.SH
+mnewz \- Allocate object given type from another module
+.P1
+Syntax: mnewz src1, src2, dst
+Function: dst = malloc(src1->types[src2]->size)
+ initmem(dst, src1->types[src2]->map)
+.P2
+The
+.CW mnewz
+instruction allocates and initializes a new
+area of memory.
+The
+.CW src1
+operand specifies the location of a module pointer created as the result of a successful load instruction.
+The size of the new memory area and the location of
+pointers within it are specified by the
+.CW src2
+operand, which gives a
+type descriptor number within that module.
+Space not occupied by pointers is initialized to zero.
+A pointer to the initialized object is stored in
+.CW dst .
+This instruction is not used by Limbo; it was added to implement other languages.
+.SH
+mod\fIx\fP \- Modulus
+.P1
+Syntax: modb src1, src2, dst
+ modw src1, src2, dst
+ modl src1, src2, dst
+Function: dst = src2 % src1
+.P2
+The modulus instructions compute the remainder of the
+.CW "src2"
+operand divided by the
+.CW "src1"
+operand and store the result in
+.CW "dst" .
+The operator preserves the condition that the absolute value of a%b is less than the absolute value of
+.CW "b" ;
+.CW "(a/b)*b + a%b"
+is always equal to
+.CW a .
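+.LP
+The operand order and the identity above can be checked with a short C sketch.
+It assumes that the division truncates toward zero, as the C99 operators
+do; the function names are hypothetical.
```c
#include <assert.h>
#include <stdint.h>

/* modw sketch: note the operand order, dst = src2 % src1. */
int32_t modw(int32_t src1, int32_t src2)
{
	assert(src1 != 0);
	assert(!(src2 == INT32_MIN && src1 == -1));	/* overflow case, undefined in C */
	return src2 % src1;
}

/* divw sketch with the same operand order, dst = src2 / src1. */
int32_t divw(int32_t src1, int32_t src2)
{
	assert(src1 != 0);
	assert(!(src2 == INT32_MIN && src1 == -1));	/* overflow case, undefined in C */
	return src2 / src1;
}
```
+.LP
+Under that assumption the remainder takes the sign of the dividend
+.CW src2 ,
+so dividing -7 by 3 gives quotient -2 and remainder -1.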
+.SH
+mov\fIx\fP \- Move scalar
+.P1
+Syntax: movb src, dst
+ movw src, dst
+ movf src, dst
+ movl src, dst
+Function: dst = src
+.P2
+The move operators perform assignment. The value specified by the
+.CW "src"
+operand is copied to the
+.CW "dst"
+operand.
+.SH
+movm \- Move memory
+.P1
+Syntax: movm src1, src2, dst
+Function: memmove(&dst, &src1, src2)
+.P2
+The
+.CW "movm"
+instruction copies memory from the
+.CW "src1"
+operand to the
+.CW "dst"
+operand for
+.CW "src2"
+bytes. The
+.CW "src1"
+and
+.CW "dst"
+operands specify the effective address of the memory rather than a pointer to the memory.
+.SH
+movmp \- Move memory and update reference counts
+.P1
+Syntax: movmp src1, src2, dst
+Function: decmem(&dst, src2)
+ memmove(&dst, &src1, src2->size)
+ incmem(&src1, src2)
+.P2
+The
+.CW "movmp"
+instruction performs the same function as the
+.CW "movm"
+instruction but increments the reference count of pointers contained in the data type. For each pointer specified by the
+.CW "src2"
+type descriptor, the corresponding pointer reference count in the destination is decremented. The
+.CW "movmp"
+instruction then copies memory from the
+.CW "src1"
+operand to the
+.CW "dst"
+operand for the number of bytes described by the type descriptor. For each pointer specified by the type descriptor the corresponding pointer reference count in the source is incremented.
+.SH
+movp \- Move pointer
+.P1
+Syntax: movp src, dst
+Function: destroy(dst)
+ dst = src
+ incref(src)
+.P2
+The
+.CW "movp"
+instruction copies a pointer adjusting the reference counts to reflect the new pointers.
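+.LP
+The reference count adjustment can be sketched in C as follows.
+The Obj type, the helper names and the representation of
+.CW H
+as a null pointer are all assumptions made for this illustration.
+The sketch counts the new reference before releasing the old one, a
+reordering of the steps listed above that keeps the case where source
+and destination already hold the same pointer safe.
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical heap object: only the reference count matters here. */
typedef struct Obj {
	int ref;
} Obj;

#define H ((Obj*)0)	/* nil; representing H as a null pointer is an assumption of this sketch */

static void incref(Obj *o)
{
	if (o != H)
		o->ref++;
}

/* Drop one reference and free the object when none remain. */
static void destroy(Obj *o)
{
	if (o != H && --o->ref == 0)
		free(o);
}

/* Allocate an object holding a single reference; returns H on failure. */
Obj *newobj(void)
{
	Obj *o = malloc(sizeof *o);
	if (o != H)
		o->ref = 1;
	return o;
}

/* movp sketch: copy src into *dst, keeping the counts consistent. */
void movp(Obj *src, Obj **dst)
{
	Obj *old;

	assert(dst != NULL);
	old = *dst;
	incref(src);	/* count the new reference first so that src == *dst is safe */
	*dst = src;
	destroy(old);
}
```
+.LP
+Moving a pointer over itself leaves its count unchanged, and moving
+.CW H
+into a slot releases the reference the slot held.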
+.SH
+movpc \- Move program counter
+.P1
+Syntax: movpc src, dst
+Function: dst = PC(src);
+.P2
+The
+.CW "movpc"
+instruction computes the actual address of an immediate PC value. The
+.CW "dst"
+operand is set to the actual machine address of the instruction addressed by the
+.CW "src"
+operand. This instruction must be used to calculate PC values for computed branches.
+.SH
+mspawn \- Module spawn function
+.P1
+Syntax: mspawn src1, src2, src3
+Function: fork();
+ if(child){
+ link(src1) = 0
+ frame(src1) = 0
+ mod(src1) = src3->moduleptr
+ current_moduleptr = src3->moduleptr
+ fp = src1
+ pc = src3->links[src2]->pc
+ }
+.P2
+The
+.CW "mspawn"
+instruction creates a new thread, which starts executing a function in another module.
+The first argument specifies a new frame for the called procedure and must have been built using the
+.CW "mframe"
+instruction.
+The
+.CW "src3"
+operand is a module reference generated by a successful
+.CW "load"
+instruction.
+The
+.CW "src2"
+operand specifies the index for the called function in
+the array of linkage records associated with that module reference (see the
+.CW load
+instruction above).
+.SH
+mul\fIx\fP \- Multiply
+.P1
+Syntax: mulb src1, src2, dst
+ mulw src1, src2, dst
+ mulf src1, src2, dst
+ mull src1, src2, dst
+Function: dst = src1 * src2
+.P2
+The
+.CW src1
+operand is multiplied by the
+.CW src2
+operand and the product is stored in the
+.CW dst
+operand.
+.SH
+nbalt \- Non-blocking alternate
+.P1
+Syntax: nbalt src, dst
+.P2
+The
+.CW "nbalt"
+instruction has the same operands and function as
+.CW "alt" ,
+except that if no channel is ready to communicate, the instruction does not block. When no channels are ready, control is transferred to the PC in the last element of the table addressed by
+.CW dst .
+.SH
+negf \- Negate real
+.P1
+Syntax: negf src, dst
+Function: dst = -src
+.P2
+The floating point value addressed by the
+.CW "src"
+operand is negated and stored in the
+.CW "dst"
+operand.
+.SH
+new, newz \- Allocate object
+.P1
+Syntax: new src, dst
+ newz src, dst
+Function: dst = malloc(src->size);
+ initmem(dst, src->map);
+.P2
+The
+.CW "new"
+instruction allocates and initializes a new area of memory. Its size and the locations of pointers within it are specified by the type descriptor number given as the
+.CW "src"
+operand. A pointer to the newly allocated object is placed in
+.CW "dst" .
+Any space not occupied by pointers has an undefined value.
+.LP
+The
+.CW "newz"
+instruction additionally guarantees that all non-pointer values are set to zero.
+It is not used by Limbo.
+.SH
+newa, newaz \- Allocate array
+.P1
+Syntax: newa src1, src2, dst
+ newaz src1, src2, dst
+Function: dst = malloc(src2->size * src1);
+ for(i = 0; i < src1; i++)
+ initmem(dst + i*src2->size, src2->map);
+.P2
+The
+.CW "newa"
+instruction allocates and initializes an array. The number of elements is specified by the
+.CW "src1"
+operand. The type of each element is specified by the type descriptor number given as the
+.CW "src2"
+operand.
+Space not occupied by pointers has an undefined value.
+The
+.CW newaz
+instruction additionally guarantees that all non-pointer values are set to zero;
+it is not used by Limbo.
+.SH
+newc\fIx\fP \- Allocate channel
+.P1
+Syntax: newcw dst
+ newcb dst
+ newcl dst
+ newcf dst
+ newcp dst
+ newcm src, dst
+ newcmp src, dst
+Function: dst = new(Channel)
+.P2
+The
+.CW "newc"
+instruction allocates a new channel of the specified type and stores a reference to the channel in
+.CW "dst" .
+For the
+.CW "newcm"
+instruction the source specifies the number of bytes of memory used by values sent on the channel (see the
+.CW movm
+instruction above).
+For the
+.CW "newcmp"
+instruction the first operand specifies a type descriptor giving the length of the structure and the location of pointers within the structure (see the
+.CW movmp
+instruction above).
+.SH
+or\fIx\fP \- Logical OR
+.P1
+Syntax: orb src1, src2, dst
+ orw src1, src2, dst
+ orl src1, src2, dst
+Function: dst = src1 | src2
+.P2
+These instructions compute the bitwise OR of the two operands addressed by
+.CW "src1"
+and
+.CW "src2"
+and store the result in the
+.CW "dst"
+operand.
+.SH
+recv \- Receive from channel
+.P1
+Syntax: recv src, dst
+Function: dst = <-src
+.P2
+The
+.CW "recv"
+instruction receives a value from some other thread on the channel specified by the
+.CW "src"
+operand. Communication is synchronous, so the calling thread will block until a corresponding
+.CW "send"
+or
+.CW "alt"
+is performed on the channel. The type of the received value is determined by the channel type and the
+.CW "dst"
+operand specifies where to place the received value.
+.SH
+ret \- Return from function
+.P1
+Syntax: ret
+Function: npc = link(fp)
+ mod = mod(fp)
+ fp = frame(fp)
+ pc = npc
+.P2
+The
+.CW "ret"
+instruction returns control to the instruction after the call of the current function.
+.SH
+send \- Send to channel
+.P1
+Syntax: send src, dst
+Function: dst <-= src
+.P2
+The
+.CW "send"
+instruction sends a value from this thread to some other thread on the channel specified by the
+.CW "dst"
+operand. Communication is synchronous, so the calling thread will block until a corresponding
+.CW "recv"
+or
+.CW "alt"
+is performed on the channel. The type of the sent value is determined by the channel type and the
+.CW "src"
+operand specifies where to retrieve the sent value.
+.SH
+shl\fIx\fP \- Shift left arithmetic
+.P1
+Syntax: shlb src1, src2, dst
+ shlw src1, src2, dst
+ shll src1, src2, dst
+Function: dst = src2 << src1
+.P2
+The
+.CW "shl"
+instructions shift the
+.CW "src2"
+operand left by the number of bits specified by the
+.CW "src1"
+operand and store the result in the
+.CW "dst"
+operand. Shift counts less than 0 or greater than the number of bits in the object have undefined results.
+.SH
+shr\fIx\fP \- Shift right arithmetic
+.P1
+Syntax: shrb src1, src2, dst
+ shrw src1, src2, dst
+ shrl src1, src2, dst
+Function: dst = src2 >> src1
+.P2
+The
+.CW "shr"
+instructions shift the
+.CW "src2"
+operand right by the number of bits specified by the
+.CW "src1"
+operand and store the result in the
+.CW "dst"
+operand. Shift counts less than 0 or greater than the number of bits in the object have undefined results.
+.SH
+slicea \- Slice array
+.P1
+Syntax: slicea src1, src2, dst
+Function: dst = dst[src1:src2]
+.P2
+The
+.CW "slicea"
+instruction creates a new array, which contains the elements from the index at
+.CW "src1"
+to the index
+.CW "src2-1" .
+The new array is a reference array which points at the elements in the initial array. The initial array will remain allocated until both arrays are no longer referenced.
+.SH
+slicec \- Slice string
+.P1
+Syntax: slicec src1, src2, dst
+Function: dst = dst[src1:src2]
+.P2
+The
+.CW "slicec"
+instruction creates a new string, which contains characters from the index at
+.CW "src1"
+to the index
+.CW "src2-1" .
+Unlike
+.CW "slicea" ,
+the new string is a copy of the elements from the initial string.
+.SH
+slicela \- Assign to array slice
+.P1
+Syntax: slicela src1, src2, dst
+Function: dst[src2:] = src1
+.P2
+The
+.CW "src1"
+and
+.CW "dst"
+operands must be arrays of equal types. The
+.CW "src2"
+operand is a non-negative integer index. The
+.CW "src1"
+array is assigned to the array slice
+.CW "dst[src2:]" ;
+.CW "src2 + nelem(src1)"
+must not exceed
+.CW "nelem(dst)" .
+.SH
+spawn \- Spawn function
+.P1
+Syntax: spawn src, dst
+Function: fork();
+ if(child)
+ dst(src);
+.P2
+The
+.CW "spawn"
+instruction creates a new thread and calls the function specified by the
+.CW "dst"
+operand. The argument frame passed to the thread function is specified by the
+.CW "src"
+operand and should have been created by the
+.CW "frame"
+instruction.
+.SH
+sub\fIx\fP \- Subtract
+.P1
+Syntax: subb src1, src2, dst
+ subf src1, src2, dst
+ subw src1, src2, dst
+ subl src1, src2, dst
+Function: dst = src2 - src1
+.P2
+The
+.CW "sub"
+instructions subtract the operand addressed by
+.CW "src1"
+from the operand addressed by
+.CW "src2"
+and store the result in the
+.CW "dst"
+operand. For
+.CW "subb" ,
+the result is truncated to eight bits.
+.SH
+tail \- Tail of list
+.P1
+Syntax: tail src, dst
+Function: dst = src->next
+.P2
+The
+.CW "tail"
+instruction takes the list specified by the
+.CW "src"
+operand and creates a reference to a new list with the head removed, which is stored in the
+.CW "dst"
+operand.
+.SH
+tcmp \- Compare types
+.P1
+Syntax: tcmp src, dst
+Function: if(typeof(src) != typeof(dst))
+ error("typecheck");
+.P2
+The
+.CW "tcmp"
+instruction compares the types of the two pointers supplied by the
+.CW "src"
+and
+.CW "dst"
+operands. The comparison will succeed if the two pointers were created from the same type descriptor or the
+.CW "src"
+operand is
+.CW "nil" ;
+otherwise, a typecheck error is raised. The
+.CW "dst"
+operand must be a valid pointer.
+.SH
+xor\fIx\fP \- Exclusive OR
+.P1
+Syntax: xorb src1, src2, dst
+ xorw src1, src2, dst
+ xorl src1, src2, dst
+Function: dst = src1 ^ src2
+.P2
+These instructions compute the bitwise exclusive-OR of the two operands addressed by
+.CW "src1"
+and
+.CW "src2"
+and store the result in the
+.CW "dst"
+operand.
+.NH 1
+Object File Format
+.LP
+An object file defines a single module. The file has the following structure:
+.P1
+Objfile
+{
+ Header;
+ Code_section;
+ Type_section;
+ Data_section;
+ Module_name;
+ Link_section;
+};
+.P2
+The following data types are used in the description of the file encoding:
+.IP
+.TS
+lf(CW) lw(4i)fR .
+OP T{
+encoded integer operand, encoding selected by the two most significant bits as follows:
+.nf
+00 signed 7 bits, 1 byte
+.br
+10 signed 14 bits, 2 bytes
+.br
+11 signed 30 bits, 4 bytes
+T}
+B unsigned byte
+W 32 bit signed integer
+F canonicalized 64-bit IEEE754 floating point value
+SO 16 bit unsigned small offset from register
+SI 16 bit signed immediate value
+LO 30 bit signed large offset from register
+.TE
+.LP
+All binary values are encoded in two's complement format, most significant byte first.
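+.LP
+The three operand encodings can be illustrated with a short decoder.
+The following Python sketch is not taken from any Inferno source: the name
+.CW decode_op
+is hypothetical, and treating a leading
+.CW 01
+as the negative half of the one-byte range is an assumption that follows from
+reading the 7-bit form as a signed value, not something the table above states.
```python
def decode_op(buf, pos=0):
    """Decode one OP-encoded integer from buf starting at pos.

    Returns (value, next_pos). Illustrative sketch only.
    """
    c = buf[pos]
    top = c & 0xC0
    if top in (0x00, 0x40):
        # One byte: 7-bit two's complement value, bit 6 acting as the sign.
        # ASSUMPTION: a leading 01 selects the negative half of this range.
        value = c - 0x80 if c & 0x40 else c
        return value, pos + 1
    if top == 0x80:
        # Two bytes: 14-bit two's complement, most significant byte first.
        raw = ((c & 0x3F) << 8) | buf[pos + 1]
        if raw & 0x2000:
            raw -= 0x4000
        return raw, pos + 2
    # top == 0xC0, four bytes: 30-bit two's complement, big-endian.
    raw = ((c & 0x3F) << 24) | (buf[pos + 1] << 16) | (buf[pos + 2] << 8) | buf[pos + 3]
    if raw & 0x20000000:
        raw -= 0x40000000
    return raw, pos + 4
```
+.LP
+Returning the next position alongside the value lets a caller read consecutive
+.CW OP
+fields, such as those of the header, by threading the position through successive calls.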
+.SH
+The Header Section
+.P1
+Header
+{
+ OP: magic_number;
+ Signature;
+ OP: runtime_flag;
+ OP: stack_extent;
+ OP: code_size;
+ OP: data_size;
+ OP: type_size;
+ OP: link_size;
+ OP: entry_pc;
+ OP: entry_type;
+};
+.P2
+The magic number is defined as 819248
+(symbolically
+.CW XMAGIC ),
+for modules that have not been signed cryptographically, and 923426
+(symbolically
+.CW "SMAGIC" ),
+for modules that contain a signature.
+On the Inferno system, the symbolic names
+.CW "XMAGIC"
+and
+.CW SMAGIC
+are defined by the C include file
+.CW "/include/isa.h"
+and the Limbo module
+.CW /module/dis.m .
+.LP
+The signature field is only present if the magic number is
+.CW "SMAGIC" .
+It has the form:
+.P1
+Signature
+{
+ OP: length;
+ array[length] of byte: signature;
+};
+.P2
+A digital signature is defined by a length, followed by an array of untyped bytes.
+Data within the signature should identify the signing authority, algorithm, and data to be signed.
+.LP
+The
+.CW runtime_flag
+is a bit mask that defines various execution options for a Dis module. The flags currently defined are:
+.P1
+MUSTCOMPILE = 1<<0
+DONTCOMPILE = 1<<1
+SHAREMP = 1<<2
+.P2
+The
+.CW "MUSTCOMPILE"
+flag indicates that a
+.CW "load"
+instruction should draw an error if the implementation is unable to compile the module into native instructions using a just-in-time compiler.
+.LP
+The
+.CW "DONTCOMPILE"
+flag indicates that the module should not be compiled into native instructions, even though it is the default for the runtime environment. This flag may be set to allow debugging or to save memory.
+.LP
+The
+.CW "SHAREMP"
+flag indicates that all instances of the module should share the same module data. There is no implicit synchronization between threads using the shared data.
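+.LP
+Because the flags are independent bits, each can be tested with a mask.
+A minimal Python sketch follows; the function name
+.CW runtime_flags
+is hypothetical, and only the three flag values listed above come from this document.
```python
# Flag values as listed in the specification above.
MUSTCOMPILE = 1 << 0
DONTCOMPILE = 1 << 1
SHAREMP = 1 << 2


def runtime_flags(flag):
    """Return the names of the known flags set in a runtime_flag value."""
    known = (
        ("MUSTCOMPILE", MUSTCOMPILE),
        ("DONTCOMPILE", DONTCOMPILE),
        ("SHAREMP", SHAREMP),
    )
    # Bits outside the three defined flags are silently ignored here.
    return [name for name, bit in known if flag & bit]
```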
+.LP
+The
+.CW stack_extent
+value indicates the number of bytes by which the thread stack of this module should be extended in the event that procedure calls exhaust the allocated stack. While stack extension is transparent to programs, increasing this value may improve the efficiency of execution at the expense of using more memory.
+.LP
+The
+.CW code_size
+is a count of the number of instructions stored in the Code_section.
+.LP
+The
+.CW data_size
+gives the size in bytes of the module's global data, which is initialized
+by evaluating the contents of the data section.
+.LP
+The
+.CW type_size
+is a count of the number of type descriptors stored in the Type_section.
+.LP
+The
+.CW link_size
+is a count of the number of external linkage directives stored in the Link_section.
+.LP
+The
+.CW entry_pc
+is an integer index into the instruction stream that is the default entry point for this module. The
+.CW entry_pc
+should point to the first instruction of a function. Instructions are numbered from a program counter value of zero.
+.LP
+The
+.CW entry_type
+is the index of the type descriptor that corresponds to the function entry point set by
+.CW entry_pc .
+.SH
+The Code Section
+.LP
+The code section describes a sequence of instructions for the virtual machine. An instruction is encoded as follows:
+.P1
+Instruction
+{
+ B: opcode;
+ B: address_mode;
+ Middle_data;
+ Source_data;
+ Dest_data;
+};
+.P2
+.LP
+The
+.CW opcode
+specifies the instruction to execute, encoded as follows:
+.IP
+.TS
+tab(:);
+l l l l l .
+00 nop:20 headb:40 mulw:60 blew:80 shrl
+01 alt:21 headw:41 mulf:61 bgtw:81 bnel
+02 nbalt:22 headp:42 divb:62 bgew:82 bltl
+03 goto:23 headf:43 divw:63 beqf:83 blel
+04 call:24 headm:44 divf:64 bnef:84 bgtl
+05 frame:25 headmp:45 modw:65 bltf:85 bgel
+06 spawn:26 tail:46 modb:66 blef:86 beql
+07 runt:27 lea:47 andb:67 bgtf:87 cvtlf
+08 load:28 indx:48 andw:68 bgef:88 cvtfl
+09 mcall:29 movp:49 orb:69 beqc:89 cvtlw
+0A mspawn:2A movm:4A orw:6A bnec:8A cvtwl
+0B mframe:2B movmp:4B xorb:6B bltc:8B cvtlc
+0C ret:2C movb:4C xorw:6C blec:8C cvtcl
+0D jmp:2D movw:4D shlb:6D bgtc:8D headl
+0E case:2E movf:4E shlw:6E bgec:8E consl
+0F exit:2F cvtbw:4F shrb:6F slicea:8F newcl
+10 new:30 cvtwb:50 shrw:70 slicela:90 casec
+11 newa:31 cvtfw:51 insc:71 slicec:91 indl
+12 newcb:32 cvtwf:52 indc:72 indw:92 movpc
+13 newcw:33 cvtca:53 addc:73 indf:93 tcmp
+14 newcf:34 cvtac:54 lenc:74 indb:94 mnewz
+15 newcp:35 cvtwc:55 lena:75 negf:95 cvtrf
+16 newcm:36 cvtcw:56 lenl:76 movl:96 cvtfr
+17 newcmp:37 cvtfc:57 beqb:77 addl:97 cvtws
+18 send:38 cvtcf:58 bneb:78 subl:98 cvtsw
+19 recv:39 addb:59 bltb:79 divl:99 lsrw
+1A consb:3A addw:5A bleb:7A modl:9A lsrl
+1B consw:3B addf:5B bgtb:7B mull:9B eclr
+1C consp:3C subb:5C bgeb:7C andl:9C newz
+1D consf:3D subw:5D beqw:7D orl:9D newaz
+1E consm:3E subf:5E bnew:7E xorl
+1F consmp:3F mulb:5F bltw:7F shll
+.TE
+.LP
+The
+.CW address_mode
+byte specifies the addressing mode of each of the three operands: middle, source and destination. The source and destination operands are encoded by three bits and the middle operand by two bits. The bits are packed as follows:
+.P1
+bit 7 6 5 4 3 2 1 0
+ m1 m0 s2 s1 s0 d2 d1 d0
+.P2
+The middle operand is encoded as follows:
+.IP
+.TS
+lf(CW) lf(CW) lw(3i)fR .
+00 \fInone\fP no middle operand
+01 $SI small immediate
+10 SO(FP) small offset indirect from FP
+11 SO(MP) small offset indirect from MP
+.TE
+.LP
+The source and destination operands are encoded as follows:
+.IP
+.TS
+lf(CW) lf(CW) lw(3i)fR .
+000 LO(MP) offset indirect from MP
+001 LO(FP) offset indirect from FP
+010 $OP 30 bit immediate
+011 \fInone\fP no operand
+100 SO(SO(MP)) double indirect from MP
+101 SO(SO(FP)) double indirect from FP
+110 \fIreserved\fP
+111 \fIreserved\fP
+.TE
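+.LP
+The packing of the
+.CW address_mode
+byte can be sketched in a few lines of Python.
+The function and table names below are hypothetical; the bit positions and mode names are those given in the tables above.
```python
# Mode names copied from the two tables above.
MIDDLE_MODES = {0b00: "none", 0b01: "$SI", 0b10: "SO(FP)", 0b11: "SO(MP)"}
OPERAND_MODES = {
    0b000: "LO(MP)",
    0b001: "LO(FP)",
    0b010: "$OP",
    0b011: "none",
    0b100: "SO(SO(MP))",
    0b101: "SO(SO(FP))",
    0b110: "reserved",
    0b111: "reserved",
}


def split_address_mode(b):
    """Split an address_mode byte into (middle, source, dest) mode names."""
    middle = (b >> 6) & 0x3  # bits 7-6
    source = (b >> 3) & 0x7  # bits 5-3
    dest = b & 0x7           # bits 2-0
    return MIDDLE_MODES[middle], OPERAND_MODES[source], OPERAND_MODES[dest]
```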
+.LP
+The
+.CW middle_data
+field is only present if the middle operand specifier of the address_mode is not `none'.
+If the field is present it is encoded as an
+.CW "OP" .
+.LP
+The
+.CW source_data
+and
+.CW dest_data
+fields are present only if the corresponding
+.CW address_mode
+field is not `none'.
+For offset indirect and immediate modes the field contains a single
+.CW "OP" .
+For double indirect modes the values are encoded as two
+.CW "OP"
+values: the first value is the register indirect offset, and the second value is the final indirect offset. The offsets for double indirect addressing cannot be larger than 16 bits.
+.SH
+The Type Section
+.LP
+The type section contains type descriptors describing the layout of pointers within data types. The format of each descriptor is:
+.P1
+Type_descriptor
+{
+ OP: desc_number;
+ OP: size;
+ OP: number_ptrs;
+ array[number_ptrs] of B: map;
+};
+.P2
+.LP
+The
+.CW desc_number
+is a small integer index used to identify the descriptor to instructions such as
+.CW "new" .
+.LP
+The
+.CW "size"
+field is the size in bytes of the memory described by this type.
+.LP
+The
+.CW number_ptrs
+field gives the size in bytes of the
+.CW "map"
+array.
+.LP
+The
+.CW "map"
+array is a bit vector where each bit corresponds to a word in memory.
+The most significant bit corresponds to the lowest address.
+For each bit in the map,
+the word at the corresponding offset in the type is a pointer iff the bit is set to 1.
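+.LP
+As an illustration of the bit ordering, the following Python sketch lists the byte offsets of the pointers described by a map.
+The function name
+.CW pointer_offsets
+is hypothetical, and the default word size of 4 bytes is an assumption made for the example.
```python
def pointer_offsets(map_bytes, word_size=4):
    """Return byte offsets of pointer words described by a type map.

    ASSUMPTION: words are word_size bytes long (4 by default).
    """
    offsets = []
    for i, byte in enumerate(map_bytes):
        for bit in range(8):
            # The most significant bit of each byte describes the lowest address.
            if byte & (0x80 >> bit):
                offsets.append((i * 8 + bit) * word_size)
    return offsets
```
+.LP
+With these assumptions, a map consisting of the single byte
+.CW 16rA0
+marks the words at offsets 0 and 8 as pointers.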
+.SH
+The Data Section
+.LP
+The data section encodes the contents of the
+.CW "MP"
+data for the module. The section contains a sequence of items; each item contains
+a control byte and an offset into the section,
+followed by one or more data items.
+A control byte of zero marks the end of the data section.
+Otherwise, it gives the type of data to be loaded and selects between
+two representations of an item:
+.P1
+Short_item
+{
+ B: code;
+ OP: offset;
+ array[code & 16rF] of type[code>>4]: data;
+};
+.P3
+Long_item
+{
+ B: code;
+ OP: count;
+ OP: offset;
+ array[count] of type[code>>4]: data;
+};
+.P2
+A
+.CW Short_item
+is generated for 15 or fewer items; otherwise a
+.CW "Long_item"
+is generated. In a
+.CW "Long_item"
+the count field (bottom 4 bits of code) is set to zero and the count follows as an
+.CW "OP" .
+The top 4 bits of code determine the type of the datum.
+The defined values are:
+.IP
+.TS
+lf(CW) lw(3i)f(R) .
+0001 8 bit bytes
+0010 32 bit words
+0011 utf encoded string
+0100 real value IEEE754 canonical representation
+0101 Array
+0110 Set array address
+0111 Restore load address
+1000 64 bit big
+.TE
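+.LP
+The split of the control byte into a type and a count can be sketched in Python as follows.
+The function name
+.CW split_control_byte
+and the short type labels are hypothetical; the bit layout and codes are those given above.
```python
# Type codes from the table above (top 4 bits of the control byte).
DATA_TYPES = {
    0b0001: "byte",
    0b0010: "word",
    0b0011: "string",
    0b0100: "real",
    0b0101: "array",
    0b0110: "set array address",
    0b0111: "restore load address",
    0b1000: "big",
}


def split_control_byte(code):
    """Return (type_name, count, is_long_item), or None at end of section."""
    if code == 0:
        return None  # a zero control byte ends the data section
    kind = DATA_TYPES.get(code >> 4, "undefined")
    count = code & 0x0F
    # A zero count field means the real count follows as an OP (Long_item).
    return kind, count, count == 0
```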
+.LP
+The byte, word, real and big operands are encoded as sequences
+of bytes (of appropriate length) in big-endian form, converted to native
+format before being stored in the data space.
+The `string' code takes a UTF-encoded sequence of
+.CW count
+bytes, which is converted to an array of 21-bit Unicode values stored in an
+implementation-dependent structure on
+the heap; a 4-byte pointer to the string descriptor is stored in the data space.
+The `array' code takes two 4-byte operands: the first is the index of the array's type
+descriptor in the type section; the second is the length of the array to be created.
+The result in memory is a 4-byte pointer to an implementation-dependent
+array descriptor in the heap.
+.LP
+Each item's data is stored at the address formed by adding the
+.CW offset
+in that item to a base address maintained by the loader.
+Initially that address is the base of the data space of the module instance.
+A new base for loading subsequent items can be set or restored by
+the following operations, used to initialize arrays.
+The `set array address' item must appear immediately following an `array'
+item.
+Its operand is a 4-byte big-endian integer that gives an index into that
+array, at which address subsequent data should be loaded; the
+previous load address is stacked internally.
+Subsequent data will be loaded at offsets from the new base address.
+The `restore load address' item has no operands; it pops a load address
+from the internal address stack and makes that the new
+base address.
+.SH
+The Module Name
+.LP
+The module name immediately follows the data section.
+It contains the name of the implementation module, in UTF encoding,
+terminated by a zero byte.
+.SH
+The Link Section
+.LP
+The link section contains an array of external linkage items:
+the list of functions exported by this module.
+Each item describes one exported function in the following form:
+.P1
+Linkage_item
+{
+ OP: pc;
+ OP: desc_number;
+ W: sig;
+ array[] of byte: name;
+};
+.P2
+The
+.CW pc
+is the instruction number of the function's entry point.
+The
+.CW desc_number
+is the index, in the type section, of the type descriptor for the function's stack frame.
+The
+.CW sig
+word is a 32-bit hash of the function's type signature.
+Finally,
+the name of the function is stored as a variable length array of bytes
+in UTF-8 encoding,
+with the end of the array marked by a zero byte.
+The names of member functions of an exported adt are qualified
+by the name of the adt.
+The next linkage item, if any, follows immediately.
+.NH 1
+Symbol Table File Format
+.LP
+The object file format does not include type information for debuggers.
+The Limbo compiler can optionally produce a separate symbol table file.
+Its format is defined in the entry
+.I sbl (6)
+of [1].
+.NH 1
+References
+.IP 1.
+.I "Inferno Programmer's Manual"
+(Third Edition),
+Volume 1 (`the manual'),
+Vita Nuova Holdings Limited, June 2000.
+.IP 2.
+P Winterbottom and R Pike,
+``The Design of the Inferno Virtual Machine'',
+reprinted in this volume.
binary files /dev/null b/doc/dis.pdf differ
--- /dev/null
+++ b/doc/ebookimp.ms
@@ -1,0 +1,389 @@
+.TL
+Navigating Large XML Documents on Small Devices
+.AU
+Roger Peppe
+.AI
+Vita Nuova
+.br
+April 2002
+.AB
+Browsing eBooks on platforms with limited memory presents an
+interesting problem: how can memory usage be bounded despite
+the need to view documents that may be much larger than the
+available memory? A simple interface to an XML parser enables
+this whilst retaining much of the ease of access afforded
+by XML parsers that read all of a document into memory at once.
+.AE
+.SH
+Introduction
+.LP
+The Open Ebook Publication Structure was devised by the Open Ebook Forum
+in order to ``provide a specification for representing the content of electronic
+books''. It is based on many existing standards, notably XML and HTML.
+An Open eBook publication consists of a set of documents bound together
+with an Open eBook package file which enumerates all the documents,
+pictures and other items that make up the book.
+.LP
+The underlying document format is essentially HTML compatible,
+which is where the first problem arises: HTML was not designed to
+make it easy to view partial sections of a document. Conventionally
+an entire HTML document is read in at once and rendered onto
+the device. When viewing an eBook on a limited-memory device,
+however, this may not be possible; books tend to be fairly large.
+For such a device, the ideal format would keep the book itself
+in non-volatile storage (e.g. flash or disk) and make it possible
+for the reader to seek to an arbitrary position in the book and render
+what it finds there.
+.LP
+This is not possible in an HTML or XML document, as the
+arbitrarily nested nature of the format means that every
+position in the document has some unknown surrounding context,
+which cannot be discovered without reading sequentially through
+the document from the beginning.
+.SH
+SAX and DOM
+.LP
+There are two conventional programming interfaces to an XML
+parser. A SAX parser provides a stream of XML entities, leaving
+it up to the application to maintain the context. It is not possible
+to rewind the stream, except, perhaps, to the beginning.
+Using a SAX parser is
+fairly straightforward, but awkward: the stream-like nature
+of the interface does not map well to the tree-like structure
+that is XML. A DOM parser reads a whole document into an internal
+data structure representation, so a program can treat it exactly
+as a tree. This also enables a program to access parts of the
+document in an arbitrary order.
+The DOM approach is all very well for small documents, but for large
+documents the memory usage can rapidly grow to exceed
+the available memory capacity. For eBook documents, this is unacceptable.
+.SH
+A different approach
+.LP
+The XML parser used in the eBook browser is akin to a SAX parser,
+in that only a little of the XML structure is held in memory at one time.
+The first significant difference is that the XML entities returned are
+taken from one level of the tree: if the program does not wish to
+see the contents of a particular XML tag, it is trivial to skip over it.
+The second significant difference is that random access is possible.
+This possibility comes from the observation that if we have visited
+a part of the document we can record the context that we found there
+and restore it later if necessary. In this scheme, if we wish to return later to
+a part of a document that we are currently at, we can create a ``mark'',
+a token that holds the current context; at some later time we can use
+that mark to return to this position.
+.LP
+The eBook browser uses this technique to enable random access
+to the document on a page-by-page basis. Moreover, a mark
+can be written to external storage, thus allowing an external
+``index'' into the document so it is not always necessary to
+read the entire document from the start in order to jump to a particular
+page in that document.
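+.LP
+The idea of a mark as a saved context can be sketched with a toy scanner.
+The Python below is not the Limbo implementation: the class
+.CW MarkingScanner
+and its token pattern are hypothetical, it handles only attribute-free tags, and it holds the whole text in memory, which a real reader would avoid by seeking within the file instead.
```python
import re

# Matches an opening, closing or empty tag without attributes, or a run of text.
TOKEN = re.compile(r"<(/?)(\w+)\s*(/?)>|([^<]+)")


class MarkingScanner:
    """Toy scanner whose marks pair a position with the enclosing tags."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.open_tags = []

    def mark(self):
        # The mark records where we are and which tags surround us.
        return (self.pos, tuple(self.open_tags))

    def goto(self, m):
        # Restoring both parts avoids rescanning from the start.
        self.pos, tags = m
        self.open_tags = list(tags)

    def next(self):
        """Return ('open' | 'close' | 'text', value), or None at the end."""
        while True:
            m = TOKEN.match(self.text, self.pos)
            if m is None:
                return None
            self.pos = m.end()
            closing, name, empty, text = m.groups()
            if text is not None:
                if text.strip():
                    return ("text", text.strip())
                continue  # skip whitespace-only text
            if closing:
                self.open_tags.pop()
                return ("close", name)
            if not empty:
                self.open_tags.append(name)
            return ("open", name)
```
+.LP
+Because the mark in this sketch is just a position and a tuple of names, it could be written out as a string and read back later, provided the text has not changed in the meantime.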
+.SH
+The programming interface
+.LP
+The interface is implemented by a module named
+.CW Xml ,
+which provides a
+.CW Parser
+adt which gives access to the contents of an XML document.
+XML items are represented by an
+.CW Item
+pick adt with one branch of the pick corresponding to each
+type of item that might be encountered.
+.LP
+The interface to the parser looks like this:
+.P1
+open: fn(f: string, warning: chan of (Locator, string)): (ref Parser, string);
+Parser: adt {
+ next: fn(p: self ref Parser): ref Item;
+ down: fn(p: self ref Parser);
+ up: fn(p: self ref Parser);
+ mark: fn(p: self ref Parser): ref Mark;
+ atmark: fn(p: self ref Parser, m: ref Mark): int;
+ goto: fn(p: self ref Parser, m: ref Mark);
+ str2mark: fn(p: self ref Parser, s: string): ref Mark;
+};
+.P2
+To start parsing an XML document, it must first be
+.CW open ed;
+.CW warning
+is a channel on which non-fatal error messages will be sent
+if they are encountered during the parsing of the document.
+It can be nil, in which case warnings are ignored.
+If the document is opened successfully, a new
+.CW Parser
+adt, say
+.I p ,
+is returned.
+Calling
+.CW \fIp\fP.next
+returns the next XML item at the current level of the tree. If there
+are no more items in the current branch at the current level, it
+returns
+.CW nil .
+When a
+.CW Tag
+item is returned,
+.CW \fIp\fP.down
+can be used to descend ``into'' that tag; subsequent calls of
+.CW \fIp\fP.next
+will return XML items contained within the tag,
+and
+.CW \fIp\fP.up
+returns to the previous level.
+.LP
+An
+.CW Item
+is a pick adt:
+.P1
+Item: adt {
+ fileoffset: int;
+ pick {
+ Tag =>
+ name: string;
+ attrs: Attributes;
+ Text =>
+ ch: string;
+ ws1, ws2: int;
+ Process =>
+ target: string;
+ data: string;
+ Doctype =>
+ name: string;
+ public: int;
+ params: list of string;
+ Stylesheet =>
+ attrs: Attributes;
+ Error =>
+ loc: Locator;
+ msg: string;
+ }
+};
+.P2
+.CW Item.Tag
+represents an XML tag, empty or not. The XML
+fragments
+.CW "<tag></tag>" '' ``
+and
+.CW "<tag />" '' ``
+look identical from the point of view of this interface.
+A
+.CW Text
+item holds text found between tags, with adjacent whitespace merged
+and whitespace at the beginning and end of the text elided.
+The fields
+.CW ws1
+and
+.CW ws2
+are non-zero if there was originally whitespace at the beginning
+or end of the text respectively.
+.CW Process
+represents an XML processing request, as found between
+.CW "<?....?>" '' ``
+delimiters.
+.CW Doctype
+and
+.CW Stylesheet
+are items found in an XML document's prolog, the
+former representing a
+.CW "<!DOCTYPE...>" '' ``
+document type declaration, and the latter an XML
+stylesheet processing request.
+.LP
+Most applications that process documents
+will wish to ignore all items other than
+.CW Tag
+and
+.CW Text .
+To this end, it is conventional to define a ``front-end'' function
+to return desired items, discard others, and take an appropriate
+action when an error is encountered. Here's an example:
+.P1
+nextitem(p: ref Parser): ref Item
+{
+ while ((gi := p.next()) != nil) {
+ pick i := gi {
+ Error =>
+ sys->print("error at %s:%d: %s\n",
+ i.loc.systemid, i.loc.line, i.msg);
+ exit;
+ Process =>
+ ; # ignore
+ Stylesheet =>
+ ; # ignore
+ Doctype =>
+ ; # ignore
+ * =>
+ return gi;
+ }
+ }
+ return nil;
+}
+.P2
+When
+.CW nextitem
+encounters an error, it exits; it might instead handle the
+error another way, say by raising an exception to be caught at the
+outermost level of the parsing code.
+.SH
+A small example
+.LP
+Suppose we have an XML document that contains some data that we would
+like to extract, ignoring the rest of the document. For this example we will
+assume that the data is held within
+.CW <data>
+tags, which contain zero or more
+.CW <item>
+tags, holding the actual data as text within them.
+Tags that we do not recognize we choose to ignore.
+So for example, given the following XML document:
+.P1
+<metadata>
+ <a>hello</a>
+ <b>goodbye</b>
+</metadata>
+<data>
+ <item>one</item>
+ <item>two</item>
+ <item>three</item>
+</data>
+<data>
+ <item>four</item>
+</data>
+.P2
+we wish to extract all the data items, but ignore everything inside
+the
+.CW <metadata>
+tag. First, let us define another little convenience function to get
+the next XML tag, ignoring extraneous items:
+.P1
+nexttag(p: ref Parser): ref Item.Tag
+{
+ while ((gi := nextitem(p)) != nil) {
+ pick i := gi {
+ Tag =>
+ return i;
+ }
+ }
+ return nil;
+}
+.P2
+Assuming that the document has already been opened,
+the following function scans through the document, looking
+for top level
+.CW <data>
+tags, and ignoring others:
+.P1
+document(p: ref Parser)
+{
+ while ((i := nexttag(p)) != nil) {
+ if (i.name == "data") {
+ p.down();
+ data(p);
+ p.up();
+ }
+ }
+}
+.P2
+The function to parse a
+.CW <data>
+tag is almost as straightforward; it scans for
+.CW <item>
+tags and extracts any textual data contained therein:
+.P1
+data(p: ref Parser)
+{
+ while ((i := nexttag(p)) != nil) {
+ if (i.name == "item") {
+ p.down();
+ if ((gni := p.next()) != nil) {
+ pick ni := gni {
+ Text =>
+ sys->print("item data: %s\n", ni.ch);
+ }
+ }
+ p.up();
+ }
+ }
+}
+.P2
+The above program is all very well and works fine, but
+suppose that the document that we are parsing is very
+large, with data items scattered through its length, and that
+we wish to access those items in an order that is not necessarily
+that in which they appear in the document.
+This is quite straightforward; every time we see a
+data item, we record the current position with a mark.
+Assuming the global declaration:
+.P1
+marks: list of ref Mark;
+.P2
+the
+.CW document
+function might become:
+.P1
+document(p: ref Parser)
+{
+ while ((i := nexttag(p)) != nil) {
+ if (i.name == "data") {
+ p.down();
+ marks = p.mark() :: marks;
+ p.up();
+ }
+ }
+}
+.P2
+At some later time, we can access the data items arbitrarily,
+for instance:
+.P1
+ for (m := marks; m != nil; m = tl m) {
+ p.goto(hd m);
+ data(p);
+ }
+.P2
+If we wish to store the data item marks in some external index
+(in a file, perhaps), the
+.CW Mark
+adt provides a
+.CW str
+function which returns a string representation of the mark.
+.CW Parser 's
+.CW str2mark
+function can later be used to recover the mark. Care must
+be taken that the document it refers to has not been changed,
+otherwise it is likely that the mark will be invalid.
+.SH
+The eBook implementation
+.LP
+The Open eBook reader software uses the primitives described above
+to maintain display-page-based access to arbitrarily large documents
+while trying to bound memory usage.
+Unfortunately it is difficult to unconditionally bound memory usage,
+given that any element in an XML document may be arbitrarily
+large. For instance, a perfectly legal document might have 100MB
+of continuous text containing no tags whatsoever. The described
+interface would attempt to put all this text in one single item, rapidly
+running out of memory! Similar types of problems can occur when
+gathering the items necessary to format a particular tag.
+For instance, to format the first row of a table, it is necessary to lay out
+the entire table to determine the column widths.
+.LP
+I chose to make the simplifying assumption that top-level items within
+the document would be small enough to fit into memory.
+From the point of view of the display module, the document
+looks like a simple sequence of items, one after another.
+One item might cover more than one page, in which case a different
+part of it will be displayed on each of those pages.
+.LP
+One difficulty is that the displayed size of an item depends on many
+factors, such as stylesheet parameters, size of installed fonts, etc.
+When a document is read, the page index must have been created
+from the same document with the same parameters. It is difficult in
+general to enumerate all the relevant parameters; they would need
+to be stored inside or alongside the index, and any change would invalidate
+the index. Instead of doing this, as the document is being displayed,
+the eBook display program constantly checks to see if the results
+it is getting from the index match with the results it is getting
+when actually laying out the document. If the results differ, the
+index is remade; the discrepancy will hopefully not be noticed by
+the user!
binary files /dev/null b/doc/ebookimp.pdf differ
--- /dev/null
+++ b/doc/fonts
@@ -1,0 +1,8 @@
+# mkfile rules to get fonts in Lucida Sans.
+# if you don't have Lucida fonts, change this next line to
+# FONTS=''
+FONTS='.fp 1 R LucidaSans
+.fp 2 I LucidaSansI
+.fp 3 B LucidaSansB
+.fp 5 CW LucidaCW
+'
--- /dev/null
+++ b/doc/fonts.bem
@@ -1,0 +1,7 @@
+# FONTS=''
+FONTS='.fp 1 R BemboBookMTStd-Regular
+.fp 2 I BemboBookMTStd-Italic
+.fp 3 B BemboBookMTStd-Bold
+.fp 4 BI BemboBookMTStd-BoldIt
+....fp 5 L LucidaCW
+'
--- /dev/null
+++ b/doc/fonts.pal
@@ -1,0 +1,9 @@
+# mkfile rules to get fonts in Lucida Sans.
+# if you don't have Lucida fonts, change this next line to
+# FONTS=''
+FONTS='.fp 1 R PA
+.fp 2 I PI
+.fp 3 B PB
+.fp 4 PX
+....fp 5 L LucidaCW
+'
binary files /dev/null b/doc/frontmatter.pdf differ
--- /dev/null
+++ b/doc/gridinstall.ms
@@ -1,0 +1,136 @@
+.FP palatino
+.TL
+Installing the Vita Nuova grid software
+.AU
+Vita Nuova
+.br
+5 May 2005
+.NH 1
+Package contents
+.LP
+The installation CD contains software for both the grid client and the server (scheduler),
+in separate directories in the root directory of the CD:
+.B client
+and
+.B server .
+.NH 1
+Client software
+.LP
+The grid client software will be installed on Windows NT4/2000/XP machines,
+in the directory (folder)
+.CW C:\eVNClient .
+.IP 1.
+On a Windows machine with the CD loaded,
+use Windows Explorer (or equivalent) to move to the directory named
+.CW \eclient\einstall
+on the CD.
+.IP 2.
+Double-click
+.CW setup.exe
+in that directory.
+It will display a new window that prompts for a destination directory.
+The directory need not exist but if it does, it should be empty.
+The default should be
+.CW C:\eVNClient .
+You can change the name if required (eg, because
+.CW C:
+lacks space), but
+you will then need to edit several files, as discussed below,
+and make appropriate changes to the instructions below.
+Hit the
+.SM ENTER
+key to start installation.
+The program will prompt for permission to create the directory if it does not already exist.
+It will then populate it with all files required by the client.
+.IP 3.
+Move in Explorer to the directory
+.CW C:\eVNClient\egrid\eslave .
+Check that the file
+.CW schedaddr
+contains the right address for your scheduler machine.
+If you changed the drive letter, you must also change the
+four
+.CW .bat
+files in the directory to replace the
+.CW C:
+drive letter by the one you used.
+.IP 4.
+You can now add the grid client as a Windows service by running the appropriate
+.CW .bat
+file on the client.
+Use
+.RS
+.IP
+\f5install_service.bat\fP
+for Windows 2000 and Windows XP
+.IP \fIOR\fP
+\f5install_service_nt4.bat\fP
+for Windows NT4
+.LP
+Just double-clicking in Explorer on the chosen name should install the service.
+.RE
+.LP
+Once installed as a service the client software will start automatically when
+the client machine next boots.
+You can start it manually using the Windows Services Manager in the usual way.
+There are two
+.CW .bat
+files to remove the service (when desired): \f5remove_service.bat\fP
+for 2000/XP and \f5remove_service_nt4.bat\fP for NT4.
+.LP
+The manual page
+.I scheduler-intro (1)
+in the PDF file
+.CW \escheduler.pdf
+on the CD gives more details on running and configuring the client software.
+The manual page
+.I scheduler-monitor (1)
+in the same PDF file describes the use of the Client Monitor software.
+.LP
+On Windows machines you can remove the directory
+.CW C:\eVNClient\eLinux
+to reduce the space required on Windows clients.
+.NH 1
+Server software
+.LP
+The grid server software will be installed on Linux (Redhat 8 or 9), in the directory
+.CW /grid/inferno ,
+which should either not exist or be empty.
+.LP
+Linux will usually mount the CD at
+.CW /mnt/cdrom .
+.IP 1.
+In a shell (`New Terminal') window, type the following command:
+.P1
+sh /mnt/cdrom/server/install/Linux-grid-386.sh
+.P2
+Assuming it has permission to do so, it will populate
+.CW /grid/inferno
+with the Inferno distribution, including the grid scheduler components.
+.IP 2.
+The file
+.CW /mnt/cdrom/server/install/gridsched.sh
+contains a Bourne shell script that can be copied to an appropriate
+place on your system, or used as the basis for one of your own,
+to simplify starting the scheduler.
+In particular it sets the right bin directory in
+.CW PATH
+to find Inferno's
+.I emu ,
+and starts
+.I emu
+with the right parameters to find the
+.CW /grid/inferno
+directory and start the scheduler in the right environment.
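+.LP
+The requirement that
+.CW /grid/inferno
+either not exist or be empty can be checked before the install script is run.
+The following is a minimal sketch in POSIX shell; it is not on the CD, and the
+function name
+.CW dir_absent_or_empty
+is an assumption made for illustration.
```shell
# Succeed if $1 does not exist, or exists as a directory with no entries.
# dir_absent_or_empty is a hypothetical helper, not part of the grid software.
dir_absent_or_empty() {
    dir=$1
    [ -e "$dir" ] || return 0      # nothing there: acceptable
    [ -d "$dir" ] || return 1      # exists but is not a directory
    # ls -A lists every entry except . and ..; no output means an empty directory
    [ -z "$(ls -A "$dir")" ]
}
```
+A caller might run the install script from step 1 only when
+.CW "dir_absent_or_empty /grid/inferno"
+succeeds, and stop with a message otherwise.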
+.LP
+Now check that
+.CW /grid/inferno/grid/master/config
+contains the right network address for your scheduler.
+The manual pages
+.I scheduler-intro (1)
+and
+.I scheduler (1)
+in
+.CW scheduler.pdf
+on the CD give more details on invoking the scheduler in
+different ways.
binary files /dev/null b/doc/gridinstall.pdf differ
--- /dev/null
+++ b/doc/hotchips.ms
@@ -1,0 +1,125 @@
+.TL
+The design of the Inferno virtual machine
+.AU
+.I "Phil Winterbottom"
+.I "Rob Pike"
+.AI
+.I "Bell Labs, Lucent Technologies"
+.FS
+Originally appeared in
+.I "IEEE Compcon 97 Proceedings" ,
+1997.
+.FE
+.SP .22i exactly
+.AB
+Virtual machines are an important component of modern portable environments such as Inferno and Java because they provide an architecture-independent representation of executable code. Their performance is critical to the success of such environments, but they are difficult to design well because they are subject to conflicting goals. On the one hand, they offer a way to hide the differences between instruction architectures; on the other, they must be implemented efficiently on a variety of underlying machines. A comparison of the engineering and evolution of the Inferno and Java virtual machines provides insight into the tradeoffs in their design and implementation. We argue that the design of virtual machines should be rooted in the nature of modern processors, not language interpreters, with an eye towards on-the-fly compilation rather than interpretation or special-purpose silicon.
+.AE
+.SH
+Dis, the Inferno Virtual Machine
+.LP
+In early 1995, we set out to apply the ideas of the Plan 9 operating system [1] to a wider range of devices and networks. The resulting system, Inferno [2], is a small operating system and execution environment that supports application portability across a wide variety of processors and operating systems. Unaware of the contemporary work to establish Java [3] from the technology of the Oak project, we independently concluded that a virtual machine (VM) was a necessary component of such a system [4]. Because of improvements in processor speed and the feasibility of on-the-fly compilers, a VM can execute quickly enough to be economically viable.
+.LP
+The Inferno virtual machine, called Dis, has several unusual aspects to its design: the instruction set, the module system, and the garbage collector.
+.LP
+The Dis instruction set provides a close match to the architecture of existing processors. Instructions are of the form
+.DS
+.ft I
+OP src1, src2, dst
+.ft P
+.DE
+The
+.I "src1"
+and
+.I "dst"
+operands specify general addresses or arbitrary-sized constants, while the
+.I "src2"
+operand is restricted to smaller constants and stack offsets to reduce code space. Each operand specifies an address either in the stack frame of the executing procedure or in the global data of its module.
+.LP
+The types of operands are set by the instructions. Basic types are
+.CW "word"
+(32-bit signed),
+.CW "big"
+(64-bit signed),
+.CW "byte"
+(8-bit unsigned),
+.CW "real"
+(64-bit IEEE floating point), and pointer (implementation-dependent). The instruction set follows the example of CISC processors, providing three-operand memory-to-memory operations for arithmetic, data motion, and so on. It also has instructions to allocate memory, to load modules, and to create, synchronize, and communicate between processes.
+.LP
+A module is the unit of dynamically loaded code and data. Modules are loaded by a VM instruction that returns a pointer to a method table for the module. That pointer is managed by the VM's garbage collector, so code and data for the module are garbage collected like any other memory. Type safety is preserved by checking method types at module load time using an MD5 signature of the type.
+.LP
+Memory management is intimately tied to the instruction set of the VM. Dis uses a hybrid garbage collection scheme: most garbage is collected by simple reference counting, while a real-time coloring collector gathers cyclic data. Because reference counting is an exact rather than conservative form of garbage collection, the type of all data items must be known to the VM run-time system. For this reason, the language-to-VM compiler generates a type descriptor for all compound types. This descriptor reports the location of all pointers within the type, allowing the VM to track references as items are copied.
+.SH
+Garbage collection
+.LP
+Memory dominates the cost of small systems, so the VM should be designed to keep memory usage as small as possible. Through reference-counted garbage collection, Dis reclaims memory the moment it becomes unused. Reference counting also eliminates the need for a large arena as required for efficient mark-and-sweep collection. Both these results reduce the memory requirements of the VM and its applications.
+.LP
+Compare this to the Java VM, whose instruction set makes it difficult to track references as objects are copied. This biases against reference counting, so JVM implementations choose lazier techniques such as mark-and-sweep, inducing a larger arena and delayed collection, both of which increase the memory use and therefore the cost of the overall system.
+.SH
+Issues in compiling
+.LP
+It is easy to interpret the individual instructions of a stack-based virtual machine (SM) such as the Java virtual machine (JVM), because most operands are implicit. However, a high-level language implementation of the interpreter generates more memory traffic than the equivalent set of instructions in a memory transfer machine (MM) such as Dis. Consider the code to execute
+.P1
+c = a + b;
+.P2
+An SM would execute this by a code burst such as this, which we have annotated with its memory traffic using
+.I L
+for load and
+.I S
+for store:
+.P1
+push a # \fILS\fP
+push b # \fILS\fP
+add # \fILLS\fP
+store c # \fILS\fP
+.P2
+The corresponding MM code burst would be the plain three-operand instruction
+.P1
+add a,b,c # \fILLS\fP
+.P2
+When interpreting, the extra memory traffic of the SM is masked by the time saved by not decoding any operand fields. The operand fields are implicit in the SM instructions, while in the MM they are explicit: three operand fields must be decoded in every instruction, even those without operands.
+.LP
+When compiling, the tradeoffs are different. Clearly, either design can produce the same native instructions from its just-in-time compiler (JIT), but in the SM case most of the work must be done in the JIT, whereas in the MM design the front end has done most of the work and the JIT can be substantially simpler and faster.
+.LP
+A JIT for an SM is forced to do most of the work of register allocation in the JIT itself. Because the types of stack cells change as the program executes, the JIT must track their types as it compiles. In an MM, however, the architecture maps well to native instructions. This produces a continuum of register allocation strategies from none, to simple mapping of known cells to registers, to flow-based register allocation. Most of the work of any of these strategies can be done in the language-to-VM compiler. It can generate code for an infinite-register machine, and the JIT can then allocate as many as are available in the native architecture. Again, this distribution of work keeps the JIT simple.
+.SH
+Processors
+.LP
+The same issues that face the JIT writer also face the designer of special-purpose processors to support a VM. Register allocation in the JIT is analogous to register relabeling in silicon, and an SM design adds unnecessary complexity to an already difficult problem. One might argue that a stack-based processor design would mitigate the difficulties, but our experience with the implementation of a stack machine in the AT&T Crisp microprocessor [5] leads us to believe that stack architectures are inherently slower than register-based machines. Their design lengthens the critical path by replacing simple registers with a complex stack cache mechanism.
+.LP
+In other words, it is a better idea to match the design of the VM to the processor than the other way around.
+.LP
+Dis fits this criterion better, but we do not plan to implement Dis in silicon. The idea of a VM is to be architecture-independent; offering a special processor to run it negates the original goal by favoring one instruction set. Ignoring that for the moment, though, there could still be two reasons to consider designing silicon for Dis: performance and cost.
+.LP
+On performance, history shows that language-specific CPUs are not competitive. The investment in the special design takes energy away from the systems issues that ultimately dominate performance. Performance gains realized through language-specific support tend to be offset by parallel improvements in general-purpose processors during the life cycle of the CPU.
+.LP
+Dis compiles quickly into native code that runs only 30-50% slower than native C. At the current rate of processor improvement, that is only a few months of processor design time. It is wiser to focus on improving execution on commodity, general purpose processors than on inventing a new architecture.
+.LP
+The issue of cost is more subtle. Dis is close enough to familiar architectures that a special chip with high integration of systems facilities could be cost-effective on small platforms. The real reason for that, though, is that the memory management design of the virtual machine makes it easy to implement Dis in small memory. By contrast, whatever cost gains an integrated Java processor might realize will likely be lost in the extra memory required by its conservative garbage collection scheme [6].
+.SH
+References
+.nr PS -1
+.nr VS -1
+.IP 1.
+R. Pike, D. Presotto, S. Dorward, B. Flandrena, K. Thompson, H. Trickey, and P. Winterbottom. ``Plan 9 from Bell Labs'',
+.I "J. Computing Systems"
+8:3, Summer 1995, pp. 221-254.
+.IP 2.
+Dorward, S., et al., ``Inferno'',
+.I "IEEE Compcon 97 Proceedings" ,
+1997.
+.IP 3.
+Arnold, K. and Gosling, J.,
+.I "The Java Programming Language" ,
+Addison-Wesley, 1996.
+.IP 4.
+Nori, K. V., Ammann, U., Nageli, H. H., and Jacobi, Ch., ``Pascal P Implementation notes'', in Barron, D. W. (ed.),
+.I "Pascal\-The Language and its Implementation" ,
+Wiley, 1981, pp. 125-170.
+.IP 5.
+Ditzel, D. R. and McLellan, R., ``Register Allocation for Free: The C Machine Stack Cache'',
+.I "Proc. of Symp. on Arch. Supp. for Prog. Lang. and Op. Sys." ,
+March, 1982, pp. 48-56.
+.IP 6.
+Case, B., ``Implementing the Java Virtual Machine'',
+.I "Microprocessor Report" ,
+March 25, 1996, pp. 12-17.
binary files /dev/null b/doc/hotchips.pdf differ
--- /dev/null
+++ b/doc/install.ms
@@ -1,0 +1,1423 @@
+.de EX
+.nr x \\$1v
+\\!h0c n \\nx 0
+..
+.de FG \" start figure caption: .FG filename.ps verticalsize
+.KF
+.BP \\$1 \\$2
+.sp .5v
+.EX \\$2v
+.ps -1
+.vs -1
+..
+.de fg \" end figure caption (yes, it is clumsy)
+.ps
+.vs
+.br
+\l'1i'
+.KE
+..
+\" step numbers
+.nr ,s 0 1
+.af ,s a
+.am NH
+.nr ,s 0 1
+..
+.de Sn \" .Sn "step"
+•\ Step \\n(H1\\n+(,s: \\$1
+..
+.de Ss
+.P1
+.B
+.Sn "\\$1"
+.P2
+..
+.TL
+Installing the Inferno Software
+.AU
+Vita Nuova
+.br
+support@vitanuova.com
+.br
+12 June 2003
+.SP 4
+.LP
+Inferno can run as either a native operating system, in the usual way, or as a
+.I hosted
+virtual operating system,
+running as an application on another operating system.
+This paper explains how to install Inferno from the distribution media
+to a hosted environment and how to configure the system for
+basic networking.
+.LP
+Inferno can run as a hosted virtual operating system on top of
+Plan 9, Unix or Windows.
+In this paper, the term
+.I Unix
+is used to cover all supported variants, currently FreeBSD, Linux, HP/UX, Irix and Solaris,
+and the term
+.I Windows
+covers Microsoft Windows (98, Me, NT, 2000, and XP).
+(Windows 98 might first require installation of the Unicode layer update from Microsoft.)
+.NH
+Preparation
+.LP
+You should ensure at least 150 Mbytes of free space on the filesystem.
+The installation program will copy files from the distribution CD to a
+directory on the filesystem called the
+.I inferno_root
+directory.
+You can choose the location of this directory.
+If you are installing to a multiuser filesystem outside your control, a subdirectory of your home
+directory might be most sensible. If you plan to share the Inferno
+system with other users then common choices for
+.I inferno_root
+are
+.CW /usr/inferno
+on Unix and Plan 9 systems, and
+.CW c:\einferno
+on Windows systems.
+Where these appear in examples in this paper you should substitute
+your own
+.I inferno_root
+directory.
+.Ss "Choose the \fIinferno_root\fP directory."
+Ensure that the user who will run the installation program has
+appropriate filesystem permissions to create the
+.I inferno_root
+directory and
+files and subdirectories beneath it.
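+.LP
+The free-space requirement can be checked from a Unix shell before starting.
+The following is a sketch only: the helper names
+.CW free_kb
+and
+.CW has_free_kb
+are assumptions, and it takes 150 Mbytes to mean 153600 kilobytes.
```shell
# Print the free space, in kilobytes, on the filesystem that holds $1.
# free_kb and has_free_kb are hypothetical helpers for illustration.
free_kb() {
    # df -P prints one POSIX-format line per filesystem; -k uses 1024-byte units.
    # The fourth column of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the filesystem holding $1 has at least $2 kilobytes free.
has_free_kb() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```
+For example,
+.CW "has_free_kb $HOME 153600"
+succeeds when the filesystem holding the home directory has room for the installation.
+The column parsing assumes the filesystem name printed by
+.CW df
+contains no spaces.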
+.NH
+Copying Files
+.LP
+On all platforms the files will be owned by the user doing the installation,
+except for installation onto a FAT file system (eg, on Windows), where the files
+appear to be owned by
+.CW Everyone
+because FAT does not record ownership.
+.Ss "Insert the distribution CD into the CD drive."
+On Unix and Plan 9,
+mount the CD at a suitable location on the filesystem; call this location
+.I cd_path .
+On Windows, note the drive letter of the CD; call this drive letter
+.I cd_drive .
+The files will be copied by an Inferno hosted installation program which runs
+directly from the CD.
+The directory
+.CW /install
+on the CD contains an installation program for each supported platform \- a shell
+script for Unix and Plan 9 and an executable for Windows.
+The Plan 9 install script is called
+.CW Plan9.rc
+and determines the CPU type from the environment variable
+.CW cputype .
+The Unix install scripts all have names of the form
+.CW \fIhost_os\fP-\fIhost_arch\fP.sh
+where
+.I host_os
+will be one of:
+.CW FreeBSD ,
+.CW Linux ,
+or
+.CW Solaris
+and
+.I host_arch
+will be one of:
+.CW 386 ,
+.CW mips ,
+.CW power
+or
+.CW sparc .
+Most platforms offer just the one obvious combination.
+The Windows installation program is called
+.CW setup.exe ;
+it is used on all varieties of Windows.
+The next step describes how to begin the installation by running the program
+that corresponds to your host system.
+.Ss "Run the installation script."
+The installation program will copy files from the CD to the filesystem.
+The Windows installation program will also create registry entries and add
+an Inferno item to the Windows
+.I start
+menu.
+On Plan 9, run
+.P1
+rc \fIcd_path\fP/install/Plan9.rc \fIinferno_root\fP
+.P2
+Where
+.I inferno_root
+is the path to the chosen Inferno root directory. The CPU architecture
+will be inferred from the environment variable
+.CW cputype .
+On Unix, run
+.P1
+sh \fIcd_path\fP/install/\fIhost_os\fP-\fIhost_arch\fP.sh \fIinferno_root\fP
+.P2
+Where
+.I host_os
+is the Unix variant name
+.CW FreeBSD , (
+.CW Irix ,
+.CW Linux
+or
+.CW Solaris ).
+.I host_arch
+is the CPU type (eg,
+.CW 386 ),
+and
+.I inferno_root
+is the path to the chosen Inferno directory.
+On Windows, run
+.P1
+\fIcd_drive\f(CW:\einstall\esetup.exe
+.P2
+The Windows installation program will ask you to choose the location of the installation
+directory on the hard disk.
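+.LP
+The naming rule for the Unix install scripts can be written down as a small shell function.
+This is a sketch under assumptions: the function name
+.CW install_script
+is hypothetical, and it accepts only the variant and CPU names listed in this paper.
```shell
# Print the path of the Unix install script for a CD mount point ($1),
# host OS name ($2) and CPU type ($3).
# install_script is a hypothetical helper, not something shipped on the CD.
install_script() {
    cd_path=$1
    host_os=$2
    host_arch=$3
    case $host_os in
    FreeBSD|Irix|Linux|Solaris) ;;
    *) echo "unknown host_os: $host_os" >&2; return 1 ;;
    esac
    case $host_arch in
    386|mips|power|sparc) ;;
    *) echo "unknown host_arch: $host_arch" >&2; return 1 ;;
    esac
    echo "$cd_path/install/$host_os-$host_arch.sh"
}
```
+With the CD mounted at
+.CW /mnt/cdrom ,
+the command
+.CW "install_script /mnt/cdrom Linux 386"
+prints
+.CW /mnt/cdrom/install/Linux-386.sh ,
+which can then be passed to
+.CW sh
+together with the chosen
+.I inferno_root .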
+.LP
+On all platforms, a copy of Inferno running from the CD
+installs the various installation packages on the CD into the
+.I inferno_root
+subtree on the filesystem.
+Whichever platform runs the installation, support for all platforms is installed.
+.LP
+Inferno is now installed, but it needs to be configured
+for your site.
+The process acts as a quick tour of parts of the system.
+The main tasks are to add local parameters to the network data base,
+and to set up the authentication system.
+If you are going to run Inferno standalone, for instance to experiment with Limbo
+and the file serving interface,
+most of what follows can be deferred indefinitely.
+It is still worthwhile skimming through it, because the first few sections tell how
+to start up Inferno with correct parameters (eg, root directory and graphics resolution).
+(A configuration program that runs under the window system would be more convenient,
+and fairly easy to do, but that has not yet been done.)
+.NH
+Running Inferno
+.LP
+Inferno host executables are all kept in a single directory corresponding
+to the combination of host operating system and CPU architecture \- the Inferno
+.CW bin
+directory.
+.P1
+\fIinferno_root\fP/\fIhost_os\fP/\fIhost_arch\fP/bin
+.P2
+(On Windows the path might need
+.CW \e
+not
+.CW /
+of course.)
+That directory can be added to the search path of the host system's command interpreter,
+and that process will be described first, although as discussed later one can use a script
+instead and that is sometimes more convenient.
+(Of course, the script will still need to refer to that directory.)
+.LP
+.I "Plan 9:\ \ "
+Plan 9 users should add a line to their
+.CW lib/profile
+file that binds this directory after their
+.CW /bin
+directory.
+.P1
+bind -a /usr/inferno/Plan9/$cputype/bin /bin
+.P2
+The bind is done after the existing
+.I bin
+directory to avoid hiding the existing Plan 9 compilers.
+If, at a later stage, you build either the hosted or native Inferno kernels for ARM or StrongARM
+you should ensure that the Inferno compilers are used rather than
+the Plan 9 compilers, since they differ in the implementation of
+floating-point instructions (the Plan 9 ARM suite uses a byte order that is more plausible
+than the order ARM dictates but therefore wrong).
+That difference is likely to be resolved at some point but it has not yet been done.
+.LP
+.I "Windows:\ \ "
+The
+.I host_os
+is always
+.CW Nt
+(even for Windows 98, 2000 or XP)
+and
+.I host_arch
+is always
+.CW 386
+and the installation program will create an entry on the
+.I "start menu"
+to invoke Inferno.
+For Unix systems or Windows systems in which Inferno will be started
+from a command shell, the environment variable
+.CW PATH
+should be set to include the Inferno
+.CW bin
+directory.
+For Windows 95 and Windows 98 this should be done in the
+.CW \eautoexec.bat
+file by adding a line like
+.P1
+PATH=c:\einferno\eNt\e386\ebin;%PATH%
+.P2
+You will need to reboot Windows to have the system reread the
+.CW \eautoexec.bat
+file.
+For Windows NT and Windows 2000 modify the
+.CW Path
+environment variable through
+.I "Control Panel -> System -> Environment" .
+.LP
+If you are using an MKS or Cygwin Unix-like shell environment,
+you might instead set:
+.P1
+PATH="c:/inferno/Nt/386/bin;$PATH"
+.P2
+and export it if necessary.
+.LP
+.I "Unix:\ \ "
+For Unix systems, for
+.CW sh
+derivatives, the environment variable
+.CW PATH
+should be set to include the Inferno
+.CW bin
+directory.
+This might be done in your
+.CW .profile
+file by adding a line like
+.P1
+PATH="/usr/inferno/Linux/386/bin:$PATH"
+.P2
+Don't forget to ensure that
+.CW PATH
+is exported.
+You may need to log out and back in again for the changes to take effect.
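+.LP
+If the profile may be read more than once, the directory can be added only when it is not already present.
+The following is a minimal sketch for
+.CW sh
+derivatives; the function name
+.CW add_to_path
+is an assumption, not something supplied with Inferno.
```shell
# Prepend directory $1 to PATH unless it is already listed, then export PATH.
# add_to_path is a hypothetical helper for use in a .profile file.
add_to_path() {
    case ":$PATH:" in
    *":$1:"*) ;;                  # already present: leave PATH unchanged
    *) PATH="$1:$PATH" ;;
    esac
    export PATH
}
```
+A profile would then contain a line such as
+.CW "add_to_path /usr/inferno/Linux/386/bin"
+in place of the plain assignment.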
+.KS
+.Ss "Start Inferno."
+Hosted Inferno is run by invoking an executable called
+.I emu .
+.KE
+On Windows, select the Inferno option from the
+.I "start menu" .
+This will invoke
+.I emu
+with appropriate arguments to find its files in
+.I inferno_root .
+If you need to change any of the options passed to
+.I emu
+when invoked from the
+.I "start menu"
+you need to do this by clicking the right mouse button
+on the Windows task bar and choosing
+.I "Properties -> Start Menu Programs -> Advanced"
+to modify the shortcut used for Inferno.
+For Unix and Plan 9, you will need to tell
+.I emu
+where to find the Inferno file tree by passing it the
+.CW -r\fIrootpath\f(CW
+command line option. For example
+.P1
+emu -r/usr/john/inferno
+.P2
+Without the
+.CW -r
+option it will look for the file tree in
+.CW /usr/inferno
+on Plan 9 and Unix and, when invoked from the command line on Windows,
+the default is
+.CW \einferno
+on the current drive.
+(The Windows start menu by contrast has already been set to use the right directory by the installation software.)
+.LP
+When using graphics,
+.I emu
+will use a window with a resolution of 640 x 480 pixels by default. To use a larger resolution
+you will need to pass
+.I emu
+an option
+.CW -g\fIXsize\f(CWx\fIYsize\f(CW
+on the command line. So, for example, to invoke
+.I emu
+as above but with a resolution of 1024 x 768 pixels the full command line
+would be
+.P1
+emu -r/usr/john/inferno -g1024x768
+.P2
+When invoked in this way
+.I emu
+displays a command window running the Inferno shell
+.CW /dis/sh.dis .
+To avoid typing the command line options each time you invoke
+.I emu
+you can store them in the environment variable
+.CW EMU
+which is interrogated when
+.I emu
+is started and might as well be set alongside the
+.CW PATH
+environment variable if the same configuration options are to be used on
+each invocation.
+.P1
+set EMU="-rd:\eDocuments and Settings\ejohn\einferno -g1024x768"
+.P2
+for Windows.
+.P1
+EMU=(-r/usr/john/inferno -g1024x768)
+.P2
+for Plan 9, and
+.P1
+EMU="-r/usr/john/inferno -g1024x768"
+.P2
+for Unix.
+An alternative to using the
+.CW EMU
+environment variable is to place the correct invocation in a
+script file (or batch file, for Windows) and invoke that instead
+of running
+.I emu
+directly.
+It is important to note that for Windows the
+.CW -r
+option also serves to indicate both the drive and directory on to which the software
+has been installed. Without a drive letter the system will assume the
+current drive and will fail if the user changes to an alternative drive.
+Once the environment variables or scripts are set up, as described above, invoking plain
+.P1
+emu
+.P2
+or the appropriate script file,
+should result in it starting up Inferno's command interpreter
+.I sh (1),
+which prompts with a semicolon:
+.P1
+;
+.P2
+You can add a further option
+.CW -c1
+to start up
+.I emu
+in a
+mode in which the system compiles a module's
+Dis operations to native machine instructions when a module
+is loaded.
+(See the
+.I emu (1)
+manual page.)
+In
+.I compile
+mode programs that do significant computation will run much faster.
+Whether in compiled or interpreted mode you should now have a functional
+hosted Inferno system.
+When Inferno starts the initial
+.CW /dis/sh.dis
+it reads commands from the file
+.CW /lib/sh/profile
+before becoming interactive. See the manual pages for the shell
+.I sh (1)
+to learn more about tailoring the initial environment.
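+.LP
+The script-file alternative to the
+.CW EMU
+variable can be sketched for Unix as a shell function.
+The names
+.CW start_inferno
+and
+.CW EMU_PROG
+are assumptions made for illustration; only the
+.CW -r
+and
+.CW -g
+options and the 640 x 480 default come from the description above.
```shell
# Start emu with root directory $1 and optional geometry $2 (default 640x480).
# start_inferno and EMU_PROG are hypothetical names; EMU_PROG lets the
# executable be replaced, for instance by a full path to emu.
start_inferno() {
    root=$1
    geom=${2:-640x480}
    # Each option is passed as a single argument, so a root directory
    # containing spaces is handed to emu intact.
    "${EMU_PROG:-emu}" "-r$root" "-g$geom"
}
```
+Calling
+.CW "start_inferno /usr/john/inferno 1024x768"
+then runs the same command line as the example given earlier.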
+.LP
+The semicolon is the default shell prompt. From this command window
+you should be able to see the installed Inferno files and directories
+.P1
+lc /
+.P2
+The command
+.I lc
+presents the contents of its directory argument in columnar fashion to standard
+output in the command window.
+.P1
+; lc /
+FreeBSD/ Unixware/ icons/ libkern/ man/ prof/
+Hp/ acme/ include/ libkeyring/ mkconfig prog/
+Inferno/ appl/ keydb/ libmath/ mkfile services/
+Irix/ asm/ legal/ libmemdraw/ mkfiles/ tmp/
+LICENCE chan/ lib/ libmemlayer/ mnt/ tools/
+Linux/ dev/ lib9/ libtk/ module/ usr/
+MacOSX/ dis/ libbio/ licencedb/ n/ utils/
+NOTICE doc/ libcrypt/ limbo/ net/ wrap/
+Nt/ emu/ libdraw/ locale/ nvfs/
+Plan9/ env/ libfreetype/ mail/ o/
+Solaris/ fonts/ libinterp/ makemk.sh os/
+;
+.P2
+Only the files and directories in and below the
+.I inferno_root
+directory on the host filesystem are immediately visible to an Inferno process;
+these files are made visible in the root of the Inferno file namespace.
+If you wish to import or export files
+from and to the host filesystem you will need to use tools on your
+host to move them in or out of the Inferno visible portion of your host
+filesystem (see the manual pages
+.I os (1)
+and
+.I cmd (3)
+for an interface to host commands).
+(We plan to make such access direct, but the details are still being worked out.)
+From this point onwards in this paper all file paths not qualified with
+.I inferno_root
+are assumed to be in the Inferno namespace.
+Files created in the host filesystem will be created with the user id of
+the user that started
+.I emu
+and on Unix systems with that user's group id.
+.NH
+Setting the site's time zone
+.LP
+Time zone settings are defined by
+files in the directory
+.CW /locale .
+The setting affects only how the time is displayed; the internal representation does not vary.
+For instance, the file
+.CW /locale/GMT
+defines Greenwich Mean Time,
+.CW /locale/GB-Eire
+defines time zones for Great Britain and the Irish Republic
+(GMT and British Summer Time), and
+.CW /locale/US_Eastern
+defines United States
+Eastern Standard Time and Eastern Daylight Time.
+The time zone settings used by applications are read
+(by
+.I daytime (2))
+from the file
+.CW /locale/timezone ,
+which is initially a copy of
+.CW /locale/GB-Eire .
+If displaying time as the time in London is adequate, you need change nothing.
+To set a different time zone for the whole site,
+copy the appropriate time zone file into
+.CW /locale/timezone :
+.P1
+cp /locale/US_Eastern /locale/timezone
+.P2
+To set a different time zone for a user or window,
+.I bind (1)
+the file containing the time zone setting over
+.CW /locale/timezone ,
+either in the user's profile or in a name space description file:
+.P1
+bind /locale/US_Eastern /locale/timezone
+.P2
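+.LP
+Because the
+.I inferno_root
+subtree of the host filesystem appears as the root of the Inferno name space,
+the site-wide copy could presumably also be made from a host shell on Unix.
+The following is a sketch under that assumption; the function name
+.CW set_site_timezone
+is hypothetical.
```shell
# Copy the zone file named $2 over locale/timezone beneath the Inferno root $1.
# set_site_timezone is a hypothetical helper run on the host, not inside Inferno.
set_site_timezone() {
    root=$1
    zone=$2
    if [ ! -f "$root/locale/$zone" ]; then
        echo "no such time zone file: $root/locale/$zone" >&2
        return 1
    fi
    cp "$root/locale/$zone" "$root/locale/timezone"
}
```
+For instance,
+.CW "set_site_timezone /usr/inferno US_Eastern"
+has the same effect as the
+.CW cp
+command shown above when run inside Inferno.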
+.NH
+Running the
+Window Manager
+.I wm
+.LP
+Graphical Inferno programs normally run under the window manager
+.I wm (1).
+Inferno has a simple editor,
+.I wm/edit ,
+that can be used to edit the Inferno configuration files.
+The `power environment' for editing and program development is
+.I acme (1),
+but rather than throwing you in at the deep end, we shall stick to
+the simpler one for now.
+If you already know Acme from
+Plan 9, however, or perhaps Wily from Unix, feel free to use Inferno's
+.I acme
+instead of
+.I edit .
+.Ss "Start the window manager."
+Invoke
+.I wm
+by typing
+.P1
+wm/wm
+.P2
+You should see a new window open with a blue-grey background and a small
+.I "Vita Nuova"
+logo in the bottom left hand corner. Click on the logo with mouse button 1
+to reveal a small menu.
+Selecting the
+.I Edit
+entry will start
+.I wm/edit .
+In common with most
+.I wm
+programs the editor has three small buttons in a line at its top right hand corner.
+Clicking on the X button, the rightmost button,
+will close the program down. The leftmost of the three buttons will allow the window
+to be resized \- after clicking it drag the window from a point near to either one of its
+edges or one of its corners. The middle button will minimise the window, creating
+an entry for it in the application bar along the bottom of the main
+.I wm
+window. You can restore a minimised window by clicking on its entry in the application bar.
+The initial
+.I wm
+configuration is determined by the contents of the shell
+script
+.CW /lib/wmsetup
+(see
+.I toolbar (1)
+and
+.I sh (1)).
+.Ss "Open a shell window."
+Choose the
+.I shell
+option from the menu to open up a shell window. The configuration of Inferno
+will be done from this shell window.
+.NH
+Manual Pages
+.LP
+Manual pages for all of the system commands are available from a shell
+window. Use the
+.I man
+or
+.I wm/man
+commands. For example,
+.P1
+man wm
+.P2
+will give information about
+.I wm .
+And
+.P1
+man man
+.P2
+will give information about using
+.I man .
+.I Wm/man
+makes use of the Tk text widget to produce slightly more
+attractive output than the plain command
+.I man .
+Here, and in other Inferno documentation, you will see references to manual page
+entries of the form \fIcommand\f(CW(\fIsection\f(CW)\fR.
+You can display the manual page for the command by running
+.P1
+man \fIcommand\fP
+.P2
+or
+.P1
+man \fIsection\fP \fIcommand\fP
+.P2
+if the manual page appears in more than one section.
+.NH
+Initial Namespace
+.LP
+The initial Inferno namespace is built
+by placing the root device '#/' (see
+.I root (3))
+at the root of the namespace and binding
+.nr ,i 0 1
+.af ,i i
+.IP \n+(,i)
+the host filesystem device '#U' (see
+.I fs (3))
+containing the
+.I inferno_root
+subtree of the host filesystem at the root of the Inferno filesystem,
+.IP \n+(,i)
+the console device '#c' (see
+.I cons (3))
+in
+.CW /dev ,
+.IP \n+(,i)
+the prog device '#p' (see
+.I prog (3))
+onto
+.CW /prog ,
+.IP \n+(,i)
+the IP device '#I' (see
+.I ip (3))
+in
+.CW /net ,
+and
+.IP \n+(,i)
+the environment device '#e' (see
+.I env (3))
+at
+.CW /dev/env .
+.rr ,i
+.LP
+You can see the sequence of commands required to construct the current namespace
+by running
+.P1
+ns
+.P2
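+.LP
+As a rough sketch, the bindings described above correspond to
+.I bind (1)
+commands of the following form.
+This is not the literal output of
+.I ns ;
+in particular, the choice of the
+.CW -a
+(add after, forming a union) and
+.CW -c
+(allow file creation) flags shown here is an assumption, and a real namespace contains further entries.
+.P1
+bind -ac '#U' /        # host file system at the root
+bind -a '#c' /dev      # console device
+bind '#p' /prog        # prog device
+bind -a '#I' /net      # IP device
+bind '#e' /dev/env     # environment device
+.P2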
+.NH
+Inferno's network
+.LP
+If you are just going to use Inferno for local Limbo programming, and not use its
+networking interface, you can skip to the section ``Adding new users'' at the end of this document.
+You can always come back to this step later.
+.LP
+To use IP networking, the IP device
+.I ip (3)) (
+must have been bound into
+.CW /net .
+Typing
+.P1
+ls -l /net
+.P2
+(see
+.I ls (1))
+should result in something like
+.P1
+--rw-rw-r-- I 0 network john 0 May 31 07:11 /net/arp
+--rw-rw-r-- I 0 network john 0 May 31 07:11 /net/ndb
+d-r-xr-xr-x I 0 network john 0 May 31 07:11 /net/tcp
+d-r-xr-xr-x I 0 network john 0 May 31 07:11 /net/udp
+.P2
+There might be many more names on some systems.
+.LP
+A system running Inferno, whether native or hosted, can by agreement attach to any or all resources that
+are in the name space of another Inferno system (or even its own).
+That requires:
+.IP •
+the importing system must know where to find them
+.IP •
+the exporting system must agree to export them
+.IP •
+the two systems must authenticate the access (not all resources will be permitted to all systems or users)
+.IP •
+the conversation can be encrypted to keep it safe from prying eyes and interference
+.LP
+On an Inferno network, there is usually one secure machine that acts as authentication server.
+All other systems variously play the rôles of server and client as required: any system can import some resources (or none)
+and export others (or none), simultaneously, and differently in different name spaces.
+In the following sections, we shall write as though there were three distinct machines:
+authentication server (signer); server (exporting resources); and client (importing resources).
+With Inferno, you can achieve a similar effect on a single machine by starting up distinct
+instances of
+.I emu
+instead.
+That is the easiest way to become familiar with the process (and also avoids having to install
+the system on several machines to start).
+It is still worthwhile setting up a secured
+authentication server later, especially if you are using Windows on a FAT file system
+where the host system's file protections are limited.
+.LP
+We shall now configure Inferno to allow each of the functions listed above:
+.IP •
+change the network database to tell where to find local network resources
+.IP •
+set up the authentication system, specifically the authentication server or `signer'
+.IP •
+start network services (two distinct sets: one for the authentication services and the other for
+all other network services)
+.NH
+Network database files
+.LP
+In both hosted and native modes, Inferno uses a collection of text files
+of a particular form to store all details of network and service configuration.
+When running hosted, Inferno typically gets most of its data from the host operating system,
+and the database contains mainly Inferno-specific data.
+.LP
+The file
+.CW /lib/ndb/local
+is the root of the collection of network database files.
+The format is defined by
+.I ndb (6),
+but essentially it is a collection of groups of attribute/value pairs of the form
+\fIattribute\fP\f(CW=\fP\fIvalue\fP.
+Attribute names and most values are case-sensitive.
+.LP
+Related attribute/value pairs are grouped into database `entries'.
+An entry can span one or more
+lines: the first line starts with a non-blank character,
+and any subsequent lines in that entry start
+with white space (blank or tab).
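+.LP
+As a schematic illustration of that layout, the fragment below shows two entries.
+The attribute and value names are placeholders invented for this sketch, not names that Inferno defines;
+the trailing comments introduced by
+.CW #
+follow the style used in the site file shown in the next section.
+.P1
+attr1=value1 # first line of an entry: starts with a non-blank character
+ attr2=value2 # continuation line: starts with a blank or tab
+ attr3=value3
+
+attr4=value4 # a non-blank first character starts the next entry
+.P2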
+.NH 2
+Site parameters
+.LP
+The version of
+.CW /lib/ndb/local
+at time of writing looks like this:
+.P1
+database=
+ file=/lib/ndb/local
+ file=/lib/ndb/dns
+ file=/lib/ndb/inferno
+ file=/lib/ndb/common
+
+infernosite=
+ #dnsdomain=your.domain.com
+ #dns=1.2.3.4 # resolver
+ SIGNER=your_Inferno_signer_here
+ FILESERVER=your_Inferno_fileserver_here
+ smtp=your_smtpserver_here
+ pop3=your_pop3server_here
+ registry=your_registry_server
+.P2
+The individual files forming the database are listed in order in the
+.CW database
+entry.
+They can be ignored for the moment.
+The entry labelled
+.CW infernosite=
+defines a mapping from symbolic host names of the form
+.CW $\fIservice\f(CW
+to a host name, domain name, or a numeric Internet address.
+For instance, an application that needs an authentication service
+will refer to
+.CW $SIGNER
+and an Inferno naming service will translate that at run-time to the appropriate network name for
+that environment.
+Consequently,
+the entries above need to be customised for a given site.
+(The items that are commented out are not needed when the host's own DNS resolver is used instead
+of Inferno's own
+.I dns (8).)
+For example, our
+.CW infernosite
+entry in the
+.CW local
+file might look something like this:
+.P1
+infernosite=
+ dnsdomain=vitanuova.com
+ dns=200.1.1.11 # resolver
+ SIGNER=doppio
+ FILESERVER=doppio
+ smtp=doppio
+ pop3=doppio
+ registry=doppio
+.P2
+where
+.CW doppio
+is the host name of a machine that is offering the given services to Inferno,
+and
+.CW 200.1.1.11
+is the Internet address of a local DNS resolver.
+.Ss "Enter defaults for your site"
+.LP
+The only important names initially are:
+.IP \f(CWSIGNER\fP 20
+the host or domain name, or address of the machine that will act as signer
+.IP \f(CWregistry\fP
+the name or address of a machine that provides the local dynamic service
+.I registry (4)
+.IP \f(CWFILESERVER\fP
+the primary file server (actually needed only by clients with no storage of their ow