Common Lisp the Language, 2nd Edition

23.1.1. Pathnames

All file systems dealt with by Common Lisp are forced into a common framework, in which files are named by a Lisp data object of type pathname.

A pathname always has six components, described below. These components are the common interface that allows programs to work the same way with different file systems; the mapping of the pathname components into the concepts peculiar to each file system is taken care of by the Common Lisp implementation.

host
The name of the file system on which the file resides.

device
Corresponds to the "device" or "file structure" concept in many host file systems: the name of a (logical or physical) device containing files.

directory
Corresponds to the "directory" concept in many host file systems: the name of a group of related files (typically those belonging to a single user or project).

name
The name of a group of files that can be thought of as the "same" file.

type
Corresponds to the "filetype" or "extension" concept in many host file systems; identifies the type of file. Files with the same names but different types are usually related in some specific way, for instance, one being a source file, another the compiled form of that source, and a third the listing of error messages from the compiler.

version
Corresponds to the "version number" concept in many host file systems. Typically this is a number that is incremented every time the file is modified.
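
As a minimal sketch, the six components can be read back with the standard accessor functions. The helper pathname-components and the component values used here are hypothetical, chosen only for illustration. The directory is written in the list format described in section 23.1.3, and the host that comes back depends on the implementation.

```lisp
(defun pathname-components (pathname)
  "Return the six components of PATHNAME as a property list."
  (let ((p (pathname pathname)))
    (list :host (pathname-host p)
          :device (pathname-device p)
          :directory (pathname-directory p)
          :name (pathname-name p)
          :type (pathname-type p)
          :version (pathname-version p))))

;; A pathname built directly from components; no file need exist.
(defparameter *example*
  (make-pathname :directory '(:absolute "src") :name "report" :type "txt"))

(getf (pathname-components *example*) :name)   ; "report"
(getf (pathname-components *example*) :type)   ; "txt"
```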

Note that a pathname is not necessarily the name of a specific file. Rather, it is a specification (possibly only a partial specification) of how to access a file. A pathname need not correspond to any file that actually exists, and more than one pathname can refer to the same file. For example, the pathname with a version of "newest" may refer to the same file as a pathname with the same components except a certain number as the version. Indeed, a pathname with version "newest" may refer to different files as time passes, because the meaning of such a pathname depends on the state of the file system. In file systems with such facilities as "links," multiple file names, logical devices, and so on, two pathnames that look quite different may turn out to address the same file. To access a file given a pathname, one must do a file system operation such as open.

Two important operations involving pathnames are parsing and merging. Parsing is the conversion of a namestring (which might be something supplied interactively by the user when asked to supply the name of a file) into a pathname object. This operation is implementation-dependent, because the format of namestrings is implementation-dependent. Merging takes a pathname with missing components and supplies values for those components from a source of defaults.
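
The two operations might be sketched as follows. The namestring "notes.txt" and the directory names are assumptions made for illustration. Because namestring syntax is implementation-dependent, no particular result is claimed for the parsing step; the merging step is built from explicit components so that it does not depend on that syntax.

```lisp
;; Parsing: a namestring becomes a pathname object. Which components
;; are filled in depends on the implementation's namestring syntax.
(defparameter *parsed* (parse-namestring "notes.txt"))

;; Merging: a pathname with missing components takes the missing
;; ones from a source of defaults.
(defparameter *partial* (make-pathname :name "notes"))
(defparameter *defaults*
  (make-pathname :directory '(:absolute "home" "user") :type "txt"))
(defparameter *merged* (merge-pathnames *partial* *defaults*))

(pathname-name *merged*)        ; "notes", kept from *partial*
(pathname-type *merged*)        ; "txt", supplied by *defaults*
(pathname-directory *merged*)   ; (:absolute "home" "user"), supplied by *defaults*
```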

Not all of the components of a pathname need to be specified. If a component of a pathname is missing, its value is nil. Before the file system interface can do anything interesting with a file, such as opening the file, all the missing components of a pathname must be filled in (typically from a set of defaults). Pathnames with missing components may be used internally for various purposes; in particular, parsing a namestring that does not specify certain components will result in a pathname with missing components.
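
One way to see which components are still missing is to test each accessor's value against nil. The function missing-components below is a hypothetical helper, not part of Common Lisp, and the list it returns for a given pathname will vary between implementations (the host, for example, is often filled in with a default).

```lisp
(defun missing-components (pathname)
  "Return the keywords naming those components of PATHNAME that are NIL."
  (loop for (key accessor) in '((:host pathname-host)
                                (:device pathname-device)
                                (:directory pathname-directory)
                                (:name pathname-name)
                                (:type pathname-type)
                                (:version pathname-version))
        when (null (funcall accessor pathname))
          collect key))

;; Only the name is supplied, so the type (among others) is missing.
(missing-components (make-pathname :name "notes"))
```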

change_begin
X3J13 voted in January 1989 (PATHNAME-UNSPECIFIC-COMPONENT) to permit any component of a pathname to have the value :unspecific, meaning that the component simply does not exist, for file systems in which such a value makes sense. (For example, a UNIX file system usually does not support version numbers, so the version component of a pathname for a UNIX host might be :unspecific. Similarly, the file type is usually regarded in a UNIX file system as the part of a name after a period, but some file names contain no periods and therefore have no file types.)

When a pathname is converted to a namestring, the values nil and :unspecific have the same effect: they are treated as if the component were empty (that is, they each cause the component not to appear in the namestring). When merging, however, only a nil value for a component will be replaced with the default for that component; the value :unspecific will be left alone as if the field were filled.

The results are undefined if :unspecific is supplied to a file system in a component for which :unspecific does not make sense for that file system.

Programming hint: portable programs should be prepared to handle the value :unspecific in the device, directory, type, or version field in some implementations. Portable programs should not explicitly place :unspecific in any field because it might not be permitted in some situations, but portable programs may sometimes do so implicitly (by copying such a value from another pathname, for example).
change_end
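
Following the programming hint above, a portable program might treat nil and :unspecific alike when it only needs the text of a component, just as namestring conversion does. The helpers component-absent-p and type-text are hypothetical names introduced for this sketch. Note that the sketch never places :unspecific into a pathname itself.

```lisp
(defun component-absent-p (value)
  "True if VALUE is NIL or :UNSPECIFIC, both of which vanish from a namestring."
  (or (null value) (eq value :unspecific)))

(defun type-text (pathname)
  "Return the type of PATHNAME as text, or \"\" when it is NIL or :UNSPECIFIC."
  (let ((type (pathname-type pathname)))
    (if (component-absent-p type)
        ""
        (princ-to-string type))))

(type-text (make-pathname :name "readme" :type "txt"))   ; "txt"
(type-text (make-pathname :name "readme"))               ; ""
```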

old_change_begin
A component of a pathname can also be the keyword :wild. This is only useful when the pathname is being used with a directory-manipulating operation, where it means that the pathname component matches anything. The printed representation of a pathname typically designates :wild by an asterisk; however, this is host-dependent.
old_change_end

See section 23.1.4 for a discussion of new wildcard pathname facilities.
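
As a brief sketch, a pathname with a :wild name can be constructed and then tested with wild-pathname-p, one of the facilities discussed in section 23.1.4. The variable name and the type "lisp" are assumptions made for illustration.

```lisp
;; Matches any name with type "lisp" when used with a
;; directory-manipulating operation.
(defparameter *all-lisp-files* (make-pathname :name :wild :type "lisp"))

(wild-pathname-p *all-lisp-files*)          ; true: some component is wild
(wild-pathname-p *all-lisp-files* :name)    ; true: the name is :wild
(wild-pathname-p *all-lisp-files* :type)    ; false: the type is a plain string
(wild-pathname-p (make-pathname :name "main" :type "lisp"))   ; false
```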

What values are allowed for components of a pathname depends, in general, on the pathname’s host. However, in order for pathnames to be usable in a system-independent way, certain global conventions are adhered to. These conventions are stronger for the type and version than for the other components, since the type and version are explicitly manipulated by many programs, while the other components are usually treated as something supplied by the user that just needs to be remembered and copied from place to place.

The type is always a string, nil, or :wild. It is expected that most programs that deal with files will supply a default type for each file.

The version is either a positive integer or a special symbol. The meanings of nil and :wild have been explained above. The keyword :newest refers to the largest version number that already exists in the file system when reading a file, or to a version number greater than any already existing in the file system when writing a new file. Some Common Lisp implementors may choose to define other special version symbols. Some semi-standard names, suggested but not required to be supported by every Common Lisp implementation, are :oldest, to refer to the smallest version number that exists in the file system; :previous, to refer to the version previous to the newest version; and :installed, to refer to a version that is officially installed for users (as opposed to a working or development version). Some Common Lisp implementors may also choose to attach a meaning to non-positive version numbers (a typical convention is that 0 is synonymous with :newest and -1 with :previous), but such interpretations are implementation-dependent.

The host may be a string, indicating a file system, or a list of strings, of which the first names the file system and the rest may be used for such a purpose as inter-network routing.

old_change_begin
The device, directory, and name can each be a string (with host-dependent rules on allowed characters and length) or possibly some other Common Lisp data structure (in which case such a component is said to be structured and has an implementation-dependent format). Structured components may be used to handle such file system features as hierarchical directories. Common Lisp programs do not need to know about structured components unless they do host-dependent operations. Specifying a string as a pathname component for a host that requires a structured component will cause conversion of the string to the appropriate form.
old_change_end

change_begin
X3J13 voted in June 1989 (PATHNAME-SUBDIRECTORY-LIST) to define a specific format for structured directories (see section 23.1.3).

X3J13 voted in June 1989 (PATHNAME-COMPONENT-VALUE) to approve the following clarifications and specifications of precisely what are valid values for the various components of a pathname.

Pathname component value strings never contain the punctuation characters that are used to separate fields in a namestring (for example, slashes and periods as used in UNIX file systems). Punctuation characters appear only in namestrings. Characters used as punctuation can appear in pathname component values with a non-punctuation meaning if the file system allows it (for example, UNIX file systems allow a file name to begin with a period).

When examining pathname components, conforming programs must be prepared to encounter any of the following situations:

Any component can be nil, which means the component has not been specified.
Any component can be :unspecific, which means the component has no meaning in this particular pathname.
The device, directory, name, and type can be strings.
The host can be any object, at the discretion of the implementation.
The directory can be a list of strings and symbols as described in section 23.1.3.
The version can be any symbol or any integer. The symbol :newest refers to the largest version number that already exists in the file system when reading, overwriting, appending, superseding, or directory-listing an existing file; it refers to the smallest version number greater than any existing version number when creating a new file. Other symbols and integers have implementation-defined meaning. It is suggested, but not required, that implementations use positive integers starting at 1 as version numbers, recognize the symbol :oldest to designate the smallest existing version number, and use keyword symbols for other special versions.

When examining wildcard components of a wildcard pathname, conforming programs must be prepared to encounter any of the following additional values in any component or any element of a list that is the directory component:

The symbol :wild, which matches anything.
A string containing implementation-dependent special wildcard characters.
Any object, representing an implementation-dependent wildcard pattern.
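
A program that examines components might dispatch on the cases listed above roughly as follows. The function classify-component and the keywords it returns are hypothetical. A string containing implementation-dependent wildcard characters is reported simply as a string here, since recognizing such characters portably is not possible.

```lisp
(defun classify-component (value)
  "Classify VALUE, a pathname component, by the kinds a program may encounter."
  (cond ((null value) :missing)                  ; not specified
        ((eq value :unspecific) :unspecific)     ; no meaning in this pathname
        ((eq value :wild) :wild)                 ; matches anything
        ((stringp value) :string)
        ((consp value) :list)                    ; a structured directory
        ((integerp value) :integer)              ; a version number
        ((symbolp value) :symbol)                ; e.g. :newest or :oldest
        (t :other)))                             ; implementation-dependent object

(classify-component (pathname-type (make-pathname :name "main" :type "lisp")))
;; :STRING
(classify-component (pathname-directory
                     (make-pathname :directory '(:absolute "src"))))
;; :LIST
```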

When constructing a pathname from components, conforming programs must follow these rules:

Any component may be nil. Specifying nil for the host may, in some implementations, result in using a default host rather than an actual nil value.
The host, device, directory, name, and type may be strings. There are implementation-dependent limits on the number and type of characters in these strings. A plausible assumption is that letters (of a single case) and digits are acceptable to most file systems.
The directory may be a list of strings and symbols as described in section 23.1.3. There are implementation-dependent limits on the length and contents of the list.
The version may be :newest.
Any component may be taken from the corresponding component of another pathname. When the two pathnames are for different file systems (in implementations that support multiple file systems), an appropriate translation occurs. If no meaningful translation is possible, an error is signaled. The definitions of "appropriate" and "meaningful" are implementation-dependent.
When constructing a wildcard pathname, the name, type, or version may be :wild, which matches anything.
An implementation might support other values for some components, but a portable program should not use those values. A conforming program can use implementation-dependent values but this can make it non-portable; for example, it might work only with UNIX file systems.
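
A sketch of construction that stays within these rules is given below. All names and component strings are made up for illustration; they use only lowercase letters, in keeping with the suggestion that letters of a single case and digits are acceptable to most file systems.

```lisp
(defparameter *source*
  (make-pathname :directory '(:absolute "proj") :name "main" :type "lisp"))

;; Take the components of *source* as defaults, but supply a new type
;; and ask for the newest version.
(defparameter *listing*
  (make-pathname :type "lst" :version :newest :defaults *source*))

;; Take a single component from another pathname.
(defparameter *notes*
  (make-pathname :name (pathname-name *source*) :type "txt"))

(pathname-name *listing*)        ; "main", taken from *source*
(pathname-type *listing*)        ; "lst"
(pathname-directory *listing*)   ; (:absolute "proj"), taken from *source*
(pathname-name *notes*)          ; "main"
```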

change_end

The best way to compare two pathnames for equality is with equal, not eql. (On pathnames, eql is simply the same as eq.) Two pathname objects are equal if and only if all the corresponding components (host, device, and so on) are equivalent. (Whether or not uppercase and lowercase letters are considered equivalent in strings appearing in components depends on the file name conventions of the file system.) Pathnames that are equal should be functionally equivalent.
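
For illustration, two pathnames built separately from the same components compare as equal, while a difference in any one component makes them unequal. The variable names and component strings are hypothetical.

```lisp
(defparameter *first*  (make-pathname :name "data" :type "csv"))
(defparameter *second* (make-pathname :name "data" :type "csv"))
(defparameter *other*  (make-pathname :name "data" :type "txt"))

(equal *first* *second*)   ; true: every corresponding component is equivalent
(equal *first* *other*)    ; false: the types differ
```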

old_change_begin
Some host file systems have features that do not fit into this pathname model. For instance, directories might be accessible as files; there might be complicated structure in the directories or names; or there might be a way to specify a directory relative to a "current" directory, such as the < syntax in MULTICS or the special ".." file name of UNIX. Such features are not allowed for by the standard Common Lisp file system interface. An implementation is free to accommodate such features in its pathname representation and provide a parser that can process such specifications in namestrings; such features are then likely to work within that single implementation. However, note that once a program depends explicitly on any such features, it will not be portable.
old_change_end

change_begin
X3J13 voted in June 1989 (PATHNAME-SUBDIRECTORY-LIST) to define a specific format for structured directories (see section 23.1.3), so some of the specific examples in the previous paragraph no longer apply, but the principle is still correct.
change_end
