The Good IR: Multiple Returns
This post is part of a seres:
The Good IR (BasicBlocks and control flow instructions)
The Good IR: Other Control Flow Structures
The Good IR: Instructions and Values
The Good IR: Reporting Semantic Errors via Type Checking
The Good IR: Multiple Returns
—–
Why Have Early Returns?
I’ve been meaning for awhile now to do a post on a tricky bit of language modeling that throws a monkey wrench into what would otherwise be completely straightforward: early returns. The ability for a function to have more than one return point is quite common. In fact, for recursive functions, it’s a mandatory convenience. Consider, the classic(ly bad) example of recursion: factorial. Notice that, without the ability to return early, the function would be much more awkward, and control-flow less clear.
unsigned int factorial(unsigned int n) { if (n <= 1) // handle base case return 1; // and return early else return n * factorial(n-1); } |
In discussion with labmates, we found no less than 3 issues which are affected by the allowance of early returns in our language:
- Dominator Tree: The presence of a return in one branch of an IfElse will cause the JoinBlock to be dominated by the other branch, rather than just the IfHeader.
- Dead Code: The presence of a return may cause some of the following code to never execute. A pass to eliminate the dead code might adversely affect some PhiInstruction.
- Return Semantics: For any function which returns a value, we must be able to guarantee that all code paths set a valid return value.
Non-local Changes in the Dominator Tree
Except for the presence of early returns, it would be possible (via Single-Pass Generation of Static Single-Assignment Form for Structured Languages by Brandis and Mössenböck) to create the Dominator Tree during parsing.
To illustrate the difference, let’s first consider the ‘normal case’:
Code | Control Flow Graph | Dominator Tree | |
---|---|---|---|
|
We can clearly see how both the ControlFlowGraph and DominatorTree are constructable during parse. The Parser’s method for IfStatement
can easily hold local references to each of these blocks. Connecting the edges for each data structure is simply a matter of following the diagram. Unfortunately, as we see in the next example, the presence of a ReturnStatement disrupts the simplicity of this approach.
Code | Control Flow Graph | Dominator Tree | |
---|---|---|---|
|
In this case, I’ve placed a ReturnStatement in the ThenBranch(1) of the IfThenElse control structure. At this point I’d like to draw attention to what is a rather inconvenient side-effect: The ReturnStatement causes a ControlFlowEdge to the function’s ExitBlock. The ThenBranch(1) no longer has a ControlFlowEdge to the JoinBlock(3), while the ElseBranch(2) retains its ControlFlowEdge to the JoinBlock(3). This situation means that the ElseBranch(2) now dominates the JoinBlock(3). Irratatingly, a ReturnStatement in one branch causes a side-effect in the other branch. The non-local effects of the ReturnStatement complicate attempts to keep an up-to-date version of the Dominance Tree during a parse.
One option to mitigate this unfortunate circumstance is to delay the linking of DominanceTreeEdges until after both sides of the IfStatement have been parsed. Handling the situation at this point during the parse requires that we introduce a mechanism whereby a recursive descendant parse (down a branch) can propagate information (back up through all callers) about having encountered a ReturnStatement. Such an approach might work, but I convinced myself through some adversarial counter-examples that the task would not be clean. It took some time to realize that the pictures above misrepresent the difficulty of the situation. In the face of arbitrary nested control flow structures, it essentially amounts to solving a not-so-simple problem: detect that all code paths within a branch terminate in a ReturnStatement.
Another option, is to give up. Rather than attempt to build the Dominance Tree during parse, build it in a separate pass. (some papers exist on this topic, but I think it’s relatively easy to figure out through the above drawings) After the Parser has finished, the ControlFlowGraph is already built and can be reliably walked by a DominanceTreeGenerator.
Returns Produce Dead Code
Introducing a ReturnStatement into a program can cause the code following it to no longer execute at runtime (Cases 3 and 4, below). Such code is considered dead, and we should not emit it into the final executable. However, I found, after examining all the possibilities, that some situations (Case 5, below) are difficult to detect. Moreover, it wasn’t immediately clear whether the Parser (because it can detect some situations) or some other portion of the compiler (because the Parser can’t detect all situations) should be delegated with the responsibility of removing the dead code. Let’s walk through all possible situations to obtain a more clear idea how a ReturnStatement can create dead code.
Looking back over all the examples, we notice a pattern: that illustrations which include empty blocks following a block with a ReturnStatement more clearly emphasize the general problem. In all of the above cases, it’s clear that DeadCode can be spotted easily in the ControlFlowGraph by discovering which blocks have no incoming edges. In the last case, it’s clear that DeadCode can form its own entire subgraph. From this we can take away the following lesson: DeadCode is easier to identify in the ControlFlowGraph, than it is during a parse.
I have before advocated (Other Control Flow Structures) the insertion of EmptyBlocks to make the ControlFlowGraph more uniform in appearance and thus aid analysis. Let’s apply that same advice again: When the parser encounters a ReturnStatement it should end the current block with a jump to the ExitBlock and create a new block in which to place any following code.
Now, we always have a ControlFlowGraph which features the same general control structures as before. The Parser doesn’t have to worry about detecting DeadCode, that can be handled in a separate walk over all BasicBlocks after parsing is finished. The Parser’s code for creating ControlFlowEdges in IfStatement and WhileLoop remains completely unaffected. We still face some semantic choices though:
- DeadCode is an Error. The DeadCode pass can report its findings directly, and halt the compilation process.
- DeadCode is a Warning. The DeadCode pass can report its findings, and allow the compilation process to continue.
- DeadCode must remain syntactically and semantically valid. The Parser’s willingness to parse the DeadCode and
issue
instructions (type-checked at construction) into BasicBlocks following the ReturnStatement ensure that any syntax errors will be caught. Choosing the other way (DeadCode can be syntactically or semantically invalid) is actually more difficult, because the Parser would have to be placed into a DeadCode mode (the challenges have been discussed already).
Also, I should note, that if later optimizations create DeadCode, then you might run the DeadCode pass multiple times. If this is the case, then you will likely only want to issue a warning or quit in error the first time that the DeadCode pass is run. It would be confusing for a programmer to be told about DeadCode that resulted only from a transformation of the IR, because such DeadCode would not necessarily be map-able to a source line of program text.
How DeadCode Affects Phi’s
Discussion | Control Flow Graph |
---|---|
I’d like to repeat one of the above examples, and trace through it’s parsing.
Now that the Parser has finished the generation of the ControlFlowGraph, the compiler runs a DeadCodeElimination pass.
|
Clearly, a DeadCode pass might interfere with the Phi’s. After some careful thought, and a bit of discussion with my labmates, I was able to resolve the issue: Run the DeadCode pass first, and allow it to change Phi’s of two (or more) arguments into Phi’s of a single argument. I’ve advocated Aycock’s Simple Generation of Static Single-Assignment Form, which requires a cleanup pass to get rid of the unnecessary Phi’s. It is no problem to extend that approach to also handle Phi’s of a single argument, and ensure that the compiler performs the PhiCleanup pass after DeadCodeElimination.
Semantic Checking of the Return Value
In our language, we have Procedures, which return no value, and Functions, which must return a value the same as that declared in the function signature. We can conveniently model Procedures by creating them as a Function returning a VoidType. In this manner, we can fit them into a reasoning framework, where they do not appear as a special case. It’s straightforward (especially with the type-checking framework I introduced previously) to check at every ReturnStatement that the return value is type compatible with the Function currently being parsed.
However, if Functions are permitted multiple return sites, we are still faced with a tricky semantic check: the Parser must verify that all paths through the Function have a return. To clarify how this check can be accomplished, I’d like to compare two pieces of code which accomplish the same task.
Exhibit A | Exhibit B | ||
---|---|---|---|
|
|
Although both of these functions accomplish the same task, they do so in different ways. Exhibit B
allocates a temporary variable and pushes the ReturnStatement to the end of the Function. The value returned is whatever was last assigned to the variable _ret
. The transformation from Exhibit A
is completely mechanical, but serves to draw attention of our decision to have a single ExitBlock for every function.
Semantically, it’s possible for us to separate the assignment of a return value from that of exiting the procedure. When the Parser reaches a function declaration, it can create in the EntryBlock an UninitializedValue of the same type as the function, and bind that value to a ReturnValue for the function. Throughout the parse the ReturnValue acts as any local variable. At each occurrence of ReturnStatement, the Parser can check that the return value is compatible, and updates the return’s entry in the StateVector. Semantically, this is equivalent to assigning to the ReturnValue. The type-checking framework will automatically report an error during an assignment to a VoidType, thereby handling an adversarial coder’s attempt to return an actual value from a Procedure. Each ReturnStatement also creates a branch to the ExitBlock, where a Phi(ReturnValue) tracks all incoming updates to the ReturnValue.
When the Parser is finished parsing the function, a query to the Phi(ReturnValue) can quickly check whether any of its arguments reference the UninitializedValue created at the beginning of the function. If so, then there exists at least one path through the function which did not have a ReturnStatement. In this way the Phi(ReturnValue) allows us to very easily check the execution path semantics of functions. It also operates no differently than any other Phi for a local variable.
Aside: The introduction of an internal temporary local for holding the ReturnValue reminds me of those old languages (Fortran, Pascal) where returns values were syntactically accomplished by assigning to an undeclared variable of the same name as the function itself. Those old farts had some interesting wisdom. Too bad it looks confusing in retrospect, and requires implementation knowledge to fully grasp.
The statement “Introducing a ReturnStatement into a program can cause the code following it to no longer execute at runtime” is only true of lazy grammars which don’t force a ReturnStatement to be the last in a BasicBlock.