Input / Output

This section discusses SNOBOL5 input / output considerations.

FILE NAMES, MODIFIERS and other I/O considerations:

File names can be associated with input/output unit numbers as mentioned in the section on invoking SNOBOL5 above. File names (more correctly, path names with optional modifiers) are specified on the command line or as the optional fourth and fifth parameters of the INPUT or OUTPUT functions. If no fourth parameter is given for the INPUT or OUTPUT functions, the file name and modifiers previously assigned to the I/O unit (either on the command line or in a prior INPUT/OUTPUT function call) are used.

Path name syntax is described in WINDOWS and LINUX documentation. In SNOBOL5, "file modifiers" may be optionally specified. Each modifier begins with the hyphen character "-" followed by the modifier name. All of the modifiers follow the path name with or without intervening blanks.

On the command line files and modifiers are specified as follows:

    -nnn pathname -mod -mod...

Where nnn is the I/O unit number (1-999).
Where pathname is the path name of the file. Use double quotes if the file name contains blank characters.
Where mod is one of the file modifiers listed below. Several modifiers can follow the file's path name.

If -nnn is omitted, unit 5 is assumed, which is by default associated with the input variable "INPUT".

In the INPUT or OUTPUT functions, the pathname is the fourth parameter. The modifiers are a string in the fifth parameter.
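As an illustration of the fourth and fifth parameters, the following minimal sketch copies the lines of one file to another. The file names, the variable names src and dst, the unit numbers 1 and 2, and the record length 200 are placeholders chosen for this example and do not come from the SNOBOL5 distribution. The failure branch relies only on the rule, stated later in this section, that INPUT() and OUTPUT() fail for an illegal file name or incorrect modifiers.

```snobol
* Sketch only: file names, variable names and unit numbers are placeholders.
* Fourth parameter = path name, fifth parameter = modifier string.
        INPUT('src',1,200,'notes.txt','-vl')            :F(bad)
        OUTPUT('dst',2,,'copy.txt','-r')                :F(bad)
* Each read of src yields one record; the read fails at end of file.
loop    dst = src                                       :S(loop)F(END)
bad     OUTPUT = 'bad file name or modifiers'
END
```

Several modifiers can be placed in the same fifth-parameter string, for instance '-vl -ntabx'.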

All of the modifiers use lower case letters. The modifier names and their functions are described below.

1. The -a (ASCII) modifier specifies that the file is to be considered an ASCII file with records terminated by a line feed character (0Ah). The carriage return character (0Dh) is discarded on input. The end of the file is indicated by the actual end of the file or the end-of-file character (1Ah), whichever comes first. -a is the default. -a cancels the -b modifier.

2. The -ap (append) modifier specifies that writes should start at the end of the file, appending to it. (By contrast, -r specifies replace mode, in which the file is rewritten from the start.) If the file is ASCII (-a and -ef), then a check is made for the end of file character 1Ah at the end of the file. If it appears there, the next write will replace the end of file character. -ap is ignored for devices, such as the console. -ap implies not -r.
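A minimal sketch of the -ap modifier in use, assuming a hypothetical log file named run.log; the variable name logf and unit 2 are likewise placeholders. Each run of the program should add one line to the end of the file rather than overwrite it.

```snobol
* Sketch only: 'run.log', the variable logf and unit 2 are placeholders.
        OUTPUT('logf',2,,'run.log','-ap')               :F(END)
* Assigning to the associated variable writes one record.
        logf = DATE() ' run finished'
END
```

Replacing '-ap' with '-r' (the default) would instead rewrite the file from the start on each run.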

3. The -b (binary) modifier is the opposite of -a and indicates a binary file. Characters are read until the read buffer is full, except possibly for the last record read in the file, which may be shorter if the file size is not a multiple of the buffer size. The buffer size is given by the third parameter of the INPUT function. If -b is specified, -ntabx, -ncr and -nef are implied and include processing is off (-ni).
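The following sketch shows one way binary reads might be used: it reads a file in 512-character records and totals their sizes. The file name image.bin, the variable names and unit 1 are placeholders, and 512 is an arbitrary buffer size passed as the third parameter.

```snobol
* Sketch only: 'image.bin', the variable names and unit 1 are placeholders.
        INPUT('blk',1,512,'image.bin','-b')             :F(END)
        total = 0
* Every record holds 512 characters except possibly the last one.
loop    rec = blk                                       :F(done)
        total = total + SIZE(rec)                       :(loop)
done    OUTPUT = 'characters read: ' total
END
```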

4. The -cr (carriage return) modifier indicates that a carriage return character (0Dh) should be written before the line feed character (0Ah) at the end of writes. This defaults on for WINDOWS and off for LINUX. -cr implies -a and not -b. -ncr forces this off.

5. The -ef (end file char) modifier indicates that an end of file character (1Ah) should be written at the end of the file when it is closed. This defaults off for both WINDOWS and LINUX even though old versions of WINDOWS and DOS used this convention. -ef implies -a and not -b. -nef forces this off.

6. The -fl (fixed length) modifier indicates that input reads in ASCII mode (-a modifier) should pad the input buffer with blanks once the carriage return and line feed characters have been omitted. The size of the input buffer is specified as the third parameter of the INPUT function. The opposite is the -vl modifier, which stands for variable length. -fl is the default.

7. The -i (includes) modifier indicates that include processing is enabled for -a ASCII mode reads. Include processing is available both for reading the SNOBOL program source code as well as data. Specify includes as follows:

    -include "pathname"

Where pathname is the path name of a file. There should not be any blanks before the hyphen "-". If the pathname is relative (doesn't start with "/" or "\"), then the current directory is used. In addition, if the "SNOPATH" environment variable is set, the paths specified there are also checked for the file specified in the include line. As an example, more than one path can be specified for SNOPATH:

    SNOPATH=c:\mysnolib;d:\users\userid\alternatelib             for WINDOWS
    SNOPATH=/home/userid/mysnolib:/home/userid/alternatelib      for LINUX

Note that for WINDOWS the list separator is ';' and for LINUX it is ':'. In WINDOWS the name "snopath" is not case sensitive.

Includes can be nested. That is, an included file may include yet another file. -ni disables include processing for the I/O unit.
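A minimal sketch of an include line inside a program source file. The file name strutil.sno is a placeholder for this example; it stands for any file found in the current directory or in one of the SNOPATH directories.

```snobol
-include "strutil.sno"
* The include line above starts in column 1, with no blanks before the hyphen.
* 'strutil.sno' is a placeholder file name, not part of SNOBOL5.
        OUTPUT = 'main program continues here'
END
```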

8. The -tabx (tab expand) modifier is the default and causes tab characters to be expanded into blanks on input of ASCII (-a) files. The tabs expand into blanks such that the next character after the tab has an offset in the line which is the next multiple of 8. This eliminates the need to preprocess lines when they were written by some editor or other program/utility to eliminate the tab characters. On output, -tabx compresses the output lines (in ASCII -a mode) so that they use tabs where possible. Quotation marks terminate the tab expansion. -ntabx disables this tab expansion/compression.

9. The -r (replace) modifier causes writing to start at the beginning of the file, erasing any prior file content. -r is the default. Alternatively -ap is used to append data to an existing file.

10. The -std (standard i/o) modifier indicates the I/O is one of the "standard" I/O methods. In this case the file name must be one of "in", "out" or "err", corresponding to standard input, standard output or standard error, respectively.
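A minimal sketch of the -std modifier, associating a variable with standard error. The variable name errout and unit 2 are placeholders chosen for this example.

```snobol
* Sketch only: the variable errout and unit 2 are placeholders.
* With -std the file name must be "in", "out" or "err".
        OUTPUT('errout',2,,'err','-std')                :F(END)
        errout = 'warning: input file was empty'
END
```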

11. The -vl (variable length) modifier provides the ability to read variable length records for ASCII files (-a). The maximum record length is determined by the third parameter to the INPUT() function. The record ends at the line feed character; neither the carriage return nor the line feed character is included in the data read. Use of the -vl option is sometimes preferable to using &TRIM = 1 because less processing is involved and the reads reflect more precisely the content of the file. -vl applies to input on files read in ASCII mode (-a).
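To compare -vl with the default -fl, the following sketch reads the first line of the same file through two units and prints the size of each result. The file name notes.txt, the variable names, the unit numbers and the buffer size 80 are placeholders. &TRIM is set to 0 so that no trimming hides the difference. Going by the descriptions above, the -fl read should be padded with blanks to the buffer size, while the -vl read should contain only the characters of the line.

```snobol
* Sketch only: 'notes.txt', the variable names and units 1 and 2 are placeholders.
        &TRIM = 0
        INPUT('fix',1,80,'notes.txt','-fl')             :F(END)
        INPUT('var',2,80,'notes.txt','-vl')             :F(END)
        a = fix                                         :F(END)
        b = var                                         :F(END)
        OUTPUT = 'size with -fl: ' SIZE(a)
        OUTPUT = 'size with -vl: ' SIZE(b)
END
```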

12. The -dir (directory) modifier specifies that the name specified in the INPUT function is a path to a directory rather than a file. Reads do not return a string, but rather a predefined data type called "DirEnt". This is returned on each read of a directory. The fields in DirEnt are:

DEname - name of the entry
DEtype - type of the entry:
    "FILE" for files (Windows and Linux)
    "DIR"  for directories (Windows and Linux)
    "LINK" for symbolic link (Windows and Linux)
    "UNKN" when unknown (Windows and Linux)
    "CHAR" for character device (Linux)
    "BLOK" for block device (Linux)
    "SOCK" for socket (Linux)
    "DEV"  for device (Windows)
DEsize - size of file in bytes
DEdate - last local write time in same format as DATE() function

The following is some example code to list the contents of a directory:

         IDENT(&OS,"Windows") INPUT('in',1,,'C:\windows','-dir')
         IDENT(&OS,"Linux") INPUT('in',1,,'/etc','-dir')
loop     x = in                                              :F(loopend)
         OUTPUT = RPAD(DEtype(x),4) ' ' DEdate(x) ' ' LPAD(DEsize(x),10) ' ' DEname(x)  :(loop)
loopend
end

SUMMARY and RULES
Summary. (*=default):

     -b     Binary                  vs   *-a      ASCII
    *-tabx  Tab expand/compress     vs    -ntabx  No tab expand or compress
     -ap    Append to file          vs   *-r      Replace file
     -ef    End of file char        vs    -nef    No end of file character
    *-i     Include processing      vs    -ni     No include processing
     -vl    Variable length read    vs   *-fl     Fixed length read
    *-cr    Carriage returns        vs    -ncr    No carriage returns
     -std   Name is standard "in" "out" or "err"
     -dir   Name is a directory

Rules:

    -b     implies  -ntabx -ni
    -tabx  implies  -a
    -vl    implies  -a
    -cr    implies  -a
    -dir   implies  all others ignored

If an I/O unit is not assigned to a file when starting to read or write, it is then assigned to the standard device "in" -std or "out" -std. INPUT() and OUTPUT() functions fail if the file name is not legal or if the modifiers are incorrect.

