Common Lisp the Language, 2nd Edition
The treatment of line divisions is one of the most difficult issues in designing portable software, simply because there is so little agreement among operating systems. Some use a single character to delimit lines; the recommended ASCII character for this purpose is the line feed character LF (also called the new line character, NL), but some systems use the carriage return character CR. Much more common is the two-character sequence CR followed by LF. Frequently line divisions have no representation as a character but are implicit in the structuring of a file into records, each record containing a line of text. A deck of punched cards has this structure, for example.
Common Lisp provides an abstract interface by requiring that there be a single character, #\Newline, that within the language serves as a line delimiter. (The language C has a similar requirement.)
An implementation of Common Lisp must translate between this internal
single-character representation and whatever external representation(s)
may be used.
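As a small illustration of the single-character model, the sketch below builds a two-line string in memory and inspects it. The variable name *two-lines* is invented for this example. Only string streams are involved, so no external representation comes into play.

```lisp
;; Build "alpha", a line division, then "beta" in a string stream.
(defparameter *two-lines*
  (with-output-to-string (s)
    (write-string "alpha" s)
    (terpri s)                  ; ends the line by writing #\Newline
    (write-string "beta" s)))

;; 5 characters + 1 line delimiter + 4 characters
(length *two-lines*)                              ; => 10

;; The delimiter is the single character #\Newline.
(char= (char *two-lines* 5) #\Newline)            ; => T

;; The ~% directive of FORMAT produces the same character.
(string= *two-lines* (format nil "alpha~%beta"))  ; => T
```

Whatever the host operating system uses on disk, the program itself sees exactly one character per line division.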
Implementation note: How the character called #\Newline is represented internally is not specified here, but it is strongly suggested that the ASCII LF character be used in Common Lisp implementations that use the ASCII character encoding. The ASCII CR character is a workable, but in most cases inferior, alternative.
When the first edition was written it was not yet clear that UNIX would
become so widely accepted. The decision to represent the line delimiter
as a single character has proved to be a good one.
The requirement that a line division be represented as a single character has certain consequences. A character string written in the middle of a program in such a way as to span more than one line must contain exactly one character to represent each line division. Consider this code fragment:
(setq a-string "This string
contains
forty-two characters.")
Between g and c there must be exactly one character, #\Newline; a two-character sequence, such as #\Return and then #\Newline, is not acceptable, nor is the absence of a character. The same is true between s and f.
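The character count can be checked directly. The following is a minimal sketch, assuming the source text is read by an implementation that translates the external line division to #\Newline as required; the name *a-string* is chosen for this example only.

```lisp
;; The same literal as in the fragment above, spanning three lines.
(defparameter *a-string* "This string
contains
forty-two characters.")

;; 11 + 1 + 8 + 1 + 21 characters
(length *a-string*)                             ; => 42

;; Exactly two line delimiters, each a single #\Newline.
(count #\Newline *a-string*)                    ; => 2

;; One sits between g and c, the other between s and f.
(position #\Newline *a-string*)                 ; => 11
(position #\Newline *a-string* :from-end t)     ; => 20
```

If an implementation left a carriage return in the string when reading a file with CR/LF line endings, the length would come out as 44, which is precisely the situation the requirement rules out.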
When the character #\Newline is written to an output file, the Common Lisp implementation must take the appropriate action to produce a line division. This might involve writing out a record or translating #\Newline to a CR/LF sequence.
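One observable consequence is that text written to a file and read back should be unchanged from the program's point of view, whatever the file contains externally. The sketch below assumes a conforming implementation, a file opened with its default external format, and a writable current directory. The function name round-trip-through-file and the file name newline-demo.txt are hypothetical.

```lisp
(defun round-trip-through-file (string pathname)
  "Write STRING to PATHNAME, read it back character by character,
delete the file, and return the characters that were read."
  (with-open-file (out pathname :direction :output
                                :if-exists :supersede
                                :if-does-not-exist :create)
    (write-string string out))
  (prog1
      (with-open-file (in pathname :direction :input)
        (with-output-to-string (copy)
          ;; READ-CHAR returns NIL at end of file here.
          (loop for ch = (read-char in nil nil)
                while ch
                do (write-char ch copy))))
    ;; The input stream is already closed at this point.
    (delete-file pathname)))

;; Each ~% becomes one #\Newline going out and one coming back in.
(let ((text (format nil "first~%second~%")))
  (string= text (round-trip-through-file text "newline-demo.txt")))
```

The file itself may hold LF, CR/LF, or records. The comparison concerns only what the Lisp program observes.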
Implementation note: If an implementation uses the ASCII character encoding, uses the CR/LF sequence externally to delimit lines, uses LF to represent #\Newline internally, and supports #\Return as a data object corresponding to the ASCII character CR, the question arises as to what action to take when the program writes out #\Return followed by #\Newline. It should first be noted that #\Return is not a standard Common Lisp character, and the action to be taken when #\Return is written out is therefore not defined by the Common Lisp language. A plausible approach is to buffer the #\Return character and suppress it if and only if the next character is #\Newline (the net effect is to generate a CR/LF sequence). Another plausible approach is simply to ignore the difficulty and declare that writing #\Return and then #\Newline results in the sequence CR/CR/LF in the output.
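The two approaches can be compared with a small model. This is only a sketch, not how any particular implementation works: the function external-sequence and its :buffer-return keyword are invented here, external CR and LF are modelled as the keywords :CR and :LF, and the code assumes an implementation in which #\Return exists and is distinct from #\Newline.

```lisp
(defun external-sequence (string &key (buffer-return t))
  "Model the external output for STRING as a list in which :CR and :LF
stand for the external characters. #\\Newline always becomes :CR :LF.
When BUFFER-RETURN is true, a #\\Return is held back and dropped if and
only if the next character is #\\Newline."
  (let ((out '())
        (pending-return nil))
    (flet ((emit (x) (push x out)))
      (loop for ch across string
            do (when pending-return
                 ;; A held-back Return is written unless Newline follows.
                 (unless (char= ch #\Newline)
                   (emit :cr))
                 (setf pending-return nil))
               (cond ((char= ch #\Newline)
                      (emit :cr)
                      (emit :lf))
                     ((and buffer-return (char= ch #\Return))
                      (setf pending-return t))
                     ((char= ch #\Return)
                      (emit :cr))
                     (t
                      (emit ch))))
      ;; A Return at the very end of the text is written after all.
      (when pending-return
        (emit :cr)))
    (nreverse out)))

(defparameter *sample*
  (coerce (list #\a #\Return #\Newline #\b) 'string))

(external-sequence *sample*)                     ; => (#\a :CR :LF #\b)
(external-sequence *sample* :buffer-return nil)  ; => (#\a :CR :CR :LF #\b)
```

The first call models the buffering approach, the second the approach that ignores the difficulty and yields CR/CR/LF.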