New Terminology in Information Flow Research
Information flow is about tracking the flows of information within a computer program, i.e. what values influence other values as the program executes. Denning and Denning looked at this problem in the late 1970’s [1, 2] and distinguished between flows that occur due to a data dependence (such as assignment) and flows which occur due to control dependence (such as an if-else branch). Let’s now examine each of the possible ways in which information can flow between variables in a computer.
Explicit Flows
The most direct that information can flow between variable is through an assignment statement. For example, the simple line of code: b := a
causes information to be transferred a→b
. However, we could also have a series of dependent assignments, in which information will flow a→b→c
. This kind of indirect flow, a→c
can be seen in the series of statements: b := f(..., a, ...); c := g(..., b, ...);
.
Implicit Flows
Since, we are trying to assess the influence of some variables on other variables, we must also take into account control flow within the program. For example, a variable might be given a particular value only if some condition is satisfied. I’ve constructed an example of this kind of implicit flow x→y
:
bool x := ... bool y := 1 bool z := 0 if (x == 0) then y := 0 else y := 1 z := 1
Any observer examining the value of y
after the code is run combined with an a-priori knowledge (gained from a static analysis) can infer the value of x
in the conditional. Lest we think that such powers of inference are not worth investigating, consider if the observer is some code that an attacker was able to inject into the system and x
represents a secret pin or bit of authentication. Further scrutinizing the above code, we will go through each of the two possible cases (for observers restricted to inspecting the variable y
:
x == 0
: The variable y
is modified to 0
. The executed branch exerts a direct influence on the value of observed variable. The observer can determine which branch was taken by inspecting the value of the modified variable.
x == 1
: Even though y
is modified, it’s value remains unchanged. Because there are only two branches, and y
is set to a unique value in each one, the observer can still determine the value of of the conditional variable x
.
Let’s now change the game slightly, and restrict the observer to inspecting the variable z
. There are still two cases to consider:
x == 0
: The variable z
is left untouched. But, because the observer knows the value was left unchanged, it can still determine which branch was taken.
x == 1
: The variable z
is modified, and thus directly informs the observer of which branch was taken.
It should be clear now, that if an attacker can inject code like that above, they can use knowledge of branches not taken to infer values of variables used in the branch conditional. In a world of binary branches, observing variables left unmodified can be just as revealing as a direct assignment.
We therefore have to carefully distinguish between information gained via branches taken (accessible by dynamic analysis at runtime) from those not taken (accessible by static analysis of the source code). Denning’s work did not fully distinguish these two cases; so some recent famous work [3] re-used the term ‘indirect’ for implicit flows from branches taken at runtime. However, in seeking clarity, I now prefer the term active flows for implicit flows in branches taken at runtime, and passive flows for implicit flows from branches not taken. (these choices were made in consultation with my postdoc, who first noticed that ‘indirect’ had been accidentally overloaded).
Let’s summarize this terminology in a nice table:
Category | Descriptor | Example | Flow | Analysis |
---|---|---|---|---|
Explicit | Direct |
b := a |
a→b |
Dataflow |
Indirect |
b = f(..., a, ...) c = g(..., b, ...) |
a→c |
Dataflow | |
Implicit | Active |
x := true if (x) y := 1 else ... |
x→y |
Control Flow (dynamic) |
Passive |
x := true z := 0 if (x) ... else z := 1 |
x→z |
Control Flow (static) |
References
[1] @article{denning1976lattice, title={A lattice model of secure information flow}, author={Denning, D.E.}, journal={Communications of the ACM}, volume={19}, number={5}, pages={236--243}, year={1976}, publisher={ACM} } [2] @article{denning1977certification, title={Certification of programs for secure information flow}, author={Denning, D.E. and Denning, P.J.}, journal={Communications of the ACM}, volume={20}, number={7}, pages={504--513}, year={1977}, publisher={ACM} } [3] @inproceedings{jang2010empirical, title={An empirical study of privacy-violating information flows in JavaScript web applications}, author={Jang, D. and Jhala, R. and Lerner, S. and Shacham, H.}, booktitle={Proceedings of the 17th ACM conference on Computer and communications security}, pages={270--283}, year={2010}, organization={ACM} }