Common Lisp the Language, 2nd Edition
Next: Defining New Series
Up: Optimization
Previous: Basic
Restrictions
Even if a series expression satisfies all of the restrictions above, it
still may not be possible to transform the expression into a loop. The
sole remaining problem is that if a series is used in two places, the
two uses may place incompatible constraints on the times at which series
elements should be produced.
The series expression below shows a situation where this problem
arises. The expression creates a series x
of the elements
in a list. It then creates a normalized series by dividing each element
of x
by the sum of the elements in x
. Finally,
the expression returns the maximum of the normalized elements.
(let ((x (scan '(1 2 5 2)))) ;Warning message issued
(collect-max (#M/ x (series (collect-sum x))))) => 1/2
The two uses of x
in the expression place contradictory
constraints on the way pipelined evaluation must proceed;
collect-sum
requires that all of the elements of
x
be produced before the sum can be returned, and
series
requires that its input be available before it can
start to produce its output. However, #M/
requires that the
first element of x
be available at the same time as the
first element of the output of series
. For pipelining to
work, the first element of the output of series
(and
therefore the output of collect-sum
) must be available
before the second element of x
is produced. Unfortunately,
this is impossible.
The essence of the inconsistency above is the cycle of constraints used in the argument. This in turn stems from a cycle in the data flow graph underlying the expression. In figure A-1 function calls are represented by boxes and data flow is represented by arrows. Simple arrows indicate the flow of series values and cross-hatched arrows indicate the flow of non-series values.
—————————————————————- Figure A-1: A Constraint Cycle in a Series Expression +——+ +—–+ +—–+ –>| scan |———————————–>| #M/ |–>| max |–> +——+ \ +—–+ +——–+ / +—–+ +—–+ –>| sum |-++++->| series |– +—–+ +——–+ —————————————————————-
Given a data flow graph corresponding to a series expression, a
constraint cycle is a closed oriented loop of data flow arcs
such that each arc is traversed exactly once and no non-series arc is
traversed backward. (Series data flow arcs can be traversed in either
direction.) A constraint cycle is said to pass through an input
or output port when exactly one of the arcs in the cycle touches the
port. In figure A-1 the data
flow arcs touching scan
, sum
,
series
, and #M/
form a constraint cycle. Note
that if the output of scan
were not a series, this loop
would not be a constraint cycle, because there would be no valid way to
traverse it. Also note that while the constraint cycle passes through
all the other ports it touches, it does not pass through the output of
scan
.
Whenever a constraint cycle passes through a non-series output, an
argument analogous to the one above can be constructed and therefore
pipelining will be impossible. When this situation arises, a warning
message is issued identifying the problematical port and the cycle
passing through it. For instance, the warning triggered by the example
above states that the constraint cycle associated with
scan
, collect-sum
, series
, and
#M/
passes through the non-series output of
collect-sum
.
Given this kind of detailed information, it is easy to alleviate the problem. To start with, every cycle must contain at least one function that has two series data flows leaving it. At worst, the cycle can be broken by duplicating this function (and any functions computing series used by it). For instance, the example above can be rewritten as shown below.
(let ((x (scan '(1 2 5 2)))
(sum (collect-sum (scan '(1 2 5 2)))))
(collect-max (#M/ x (series sum))))
=> 1/2
It would be easy enough to automatically apply code copying to break problematical constraint cycles. However, this is not done for two reasons. First, there is considerable virtue in maintaining the property that each function in a series expression turns into one piece of computation in the loop produced. Users can be confident that series expressions that look simple and efficient actually are simple and efficient. Second, with a little creativity, constraint problems can often be resolved in ways that are much more efficient than copying code. In the example above, the conflict can be eliminated efficiently by interchanging the operation of computing the maximum with the operation of normalizing an element.
(let ((x (scan '(1 2 5 2))))
(/ (collect-max x) (collect-sum x))) => 1/2
The restriction that optimizable series expressions cannot contain constraint cycles that pass through non-series outputs places limitations on the qualitative character of optimizable series expressions. In particular, they all must have the general form of creating some number of series using scanners, computing various intermediate series using transducers, and then computing one or more summary results using collectors. The output of a collector cannot be used in the intermediate computation unless it is the output of a separate subexpression.
It is worthy of note that the last expression above fixes the constraint conflict by moving the non-series output out of the cycle, rather than by breaking the cycle. This illustrates the fact that constraint cycles that do not pass through non-series outputs do not necessarily cause problems. They cause problems only if they pass through off-line ports.
A series input port or series output port of a series function is on-line if and only if it is processed in lockstep with all the other on-line ports as follows: the initial element of each on-line input is read, then the initial element of each on-line output is written, then the second element of each on-line input is read, then the second element of each on-line output is written, and so on. Ports that are not on-line are off-line. If all of the series ports of a function are on-line, the function is said to be on-line; otherwise, it is off-line. (The above extends the standard definition of the term on-line so that it applies to individual ports as well as whole functions.)
If all of the ports a cycle passes through are on-line, the lockstep processing of these ports guarantees that there cannot be any conflicts between the constraints associated with the cycle. However, passing through an off-line port leads to the same kinds of problems as passing through a non-series output.
Most of the series functions are on-line. In particular, scanners and
collectors are all on-line as are many transducers. However, the
transducers in section A.2.4
are off-line. In particular, the series inputs of catenate
,
choose-if
, chunk
, expand
,
mask
, mingle
, positions
, and
subseries
along with the series outputs of
choose
, split
, and split-if
are
off-line.
In summary, the fourth and final restriction is that for
optimization to be possible, a series expression cannot contain a
constraint cycle that passes through a non-series output or an off-line
port. Whenever this restriction is violated a warning message
is issued. Violations can be fixed either by breaking the cycle or
restructuring the computation so that the offending port is removed from
the cycle.
Next: Defining New Series
Up: Optimization
Previous: Basic
Restrictions
AI.Repository@cs.cmu.edu