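Consider a C program in which a function id returns its pointer parameter p unchanged, and main declares two local integers x and y and makes two calls, storing the result of id(&x) in a pointer u and the result of id(&y) in a pointer v.
A pointer analysis computes a mapping from pointer expressions to the set of allocation sites of the objects they may point to. For this program, an idealized, fully precise analysis would conclude that u can point only to main::x and v only to main::y, while p can point to either main::x or main::y.
(Here X::Y denotes the stack allocation holding the local variable Y of the function X.)
However, a context-insensitive analysis such as Andersen's or Steensgaard's algorithm would lose precision when analyzing the calls to id: it would conclude that u and v, like p, can each point to either main::x or main::y.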
As a form of static analysis, fully precise pointer analysis can be shown to be undecidable.[1] Most approaches are sound, but range widely in performance and precision. Many design decisions affect both the precision and the performance of an analysis; often (but not always) lower precision yields higher performance. These choices include whether the analysis is flow-sensitive and whether it is context-sensitive.[2][3]
Pointer analysis algorithms convert the raw pointer usages collected from a program (assignments of one pointer to another, or assignments that make a pointer point to another variable) into a graph of what each pointer can point to.[4]
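As a rough illustration of that conversion, the C sketch below solves inclusion (subset) constraints in the style of Andersen's algorithm by re-applying them until nothing changes. The variable names, the encoding of the four constraint kinds, and the small example program in the comment are assumptions made for illustration; practical implementations use worklists and sparse sets rather than this naive loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical variables; bit i of a set stands for variable i. */
enum { X, Y, P, Q, R, S, NVARS };
typedef unsigned PtsSet;

typedef enum {
    ADDR_OF,   /* dst = &src */
    COPY,      /* dst = src  */
    LOAD,      /* dst = *src */
    STORE      /* *dst = src */
} Kind;

typedef struct { Kind kind; int dst, src; } Constraint;

/* Union `add` into pts[v]; report whether the set grew. */
static bool grow(PtsSet pts[], int v, PtsSet add) {
    PtsSet merged = pts[v] | add;
    bool changed = merged != pts[v];
    pts[v] = merged;
    return changed;
}

/* Re-apply every constraint until no set grows. Sets only ever grow and
 * are bounded, so the loop terminates. */
void solve(const Constraint *cs, size_t n, PtsSet pts[NVARS]) {
    for (int v = 0; v < NVARS; v++) pts[v] = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            const Constraint c = cs[i];
            switch (c.kind) {
            case ADDR_OF: changed |= grow(pts, c.dst, 1u << c.src); break;
            case COPY:    changed |= grow(pts, c.dst, pts[c.src]);  break;
            case LOAD:    /* dst gets whatever src's targets point to */
                for (int o = 0; o < NVARS; o++)
                    if ((pts[c.src] >> o) & 1u)
                        changed |= grow(pts, c.dst, pts[o]);
                break;
            case STORE:   /* each target of dst gets what src points to */
                for (int o = 0; o < NVARS; o++)
                    if ((pts[c.dst] >> o) & 1u)
                        changed |= grow(pts, o, pts[c.src]);
                break;
            }
        }
    }
}

/* Example input: p = &x; q = &y; r = &p; *r = q; s = *r; */
const Constraint example[] = {
    { ADDR_OF, P, X }, { ADDR_OF, Q, Y }, { ADDR_OF, R, P },
    { STORE, R, Q },   { LOAD, S, R },
};
```

Solving the example gives r the set {p}, q the set {y}, and both p and s the set {x, y}. Because statement order is ignored, p keeps x in its set even though the store through r would overwrite it at run time.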
Steensgaard's algorithm and Andersen's algorithm are common context-insensitive, flow-insensitive algorithms for pointer analysis. They are often used in compilers, and have implementations in SVF[5] and LLVM.
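Steensgaard's algorithm differs from an inclusion-based solver in that it merges points-to targets by unification, which can be implemented with a union-find structure. The sketch below is a simplified version under stated assumptions: it handles only address-of and copy assignments, the node indices are hypothetical, and the constraints encode an assumed id program (p = &x; p = &y; u = p; v = p).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical node indices; nodes past NUM_VARS are created on demand. */
enum { X, Y, P, U, V, NUM_VARS, MAX_NODES = 32 };

static int parent[MAX_NODES];    /* union-find forest over nodes          */
static int pointee[MAX_NODES];   /* node a class points to, or -1 if none */
static int num_nodes;

void init(void) {
    num_nodes = NUM_VARS;
    for (int i = 0; i < MAX_NODES; i++) { parent[i] = i; pointee[i] = -1; }
}

/* Representative of a's class (with path halving). */
int find(int a) {
    while (parent[a] != a) { parent[a] = parent[parent[a]]; a = parent[a]; }
    return a;
}

/* Merge two classes; if both already point somewhere, merge the targets too. */
void unify(int a, int b) {
    a = find(a); b = find(b);
    if (a == b) return;
    int ta = pointee[a], tb = pointee[b];
    parent[b] = a;
    if (ta == -1) pointee[a] = tb;
    else if (tb != -1) unify(ta, tb);
}

/* The single class that a's class points to, created on demand. */
int target(int a) {
    a = find(a);
    if (pointee[a] == -1) {
        assert(num_nodes < MAX_NODES);
        pointee[a] = num_nodes++;
    }
    return find(pointee[a]);
}

/* p = &x : x joins the class that p points to. */
void assign_addr(int p, int x) { unify(target(p), x); }

/* p = q : the targets of p and q are merged (p and q themselves are not). */
void assign_copy(int p, int q) {
    int tp = target(p), tq = target(q);
    unify(tp, tq);
}

bool may_point_to(int p, int x) {
    int t = pointee[find(p)];
    return t != -1 && find(t) == find(x);
}

/* Assumed id program, analyzed without contexts. */
void analyze_id_program(void) {
    init();
    assign_addr(P, X);   /* id(&x) binds p */
    assign_addr(P, Y);   /* id(&y) binds p */
    assign_copy(U, P);   /* u receives the returned p */
    assign_copy(V, P);   /* v receives the returned p */
}
```

After analyze_id_program(), x and y end up in the same class, so p, u and v are all reported as possibly pointing to either of them.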
Many approaches to flow-insensitive pointer analysis can be understood as forms of abstract interpretation, where heap allocations are abstracted by their allocation site (i.e., a program location).[6]
Many flow-insensitive algorithms are specified in Datalog, including those in the Soot analysis framework for Java.[7]
Context-sensitive algorithms achieve higher precision, generally at the cost of some performance, by analyzing each procedure several times, once per context.[8] Most analyses use a "context-string" approach, where a context is a list of entries (common choices of context entry include call sites, allocation sites, and types).[9] To ensure termination (and, more generally, scalability), such analyses generally use a k-limiting approach, where the context has a fixed maximum size and the least recently added elements are removed as needed.[10] Three common variants of context-sensitive, flow-insensitive analysis are call-site sensitivity, object sensitivity, and type sensitivity.[11]
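The k-limiting step can be sketched in C as a bounded list that discards its oldest entry when full. The type name Context, the function push_call_site, the use of integer call-site ids, and the limit K = 2 are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <string.h>

enum { K = 2 };                 /* assumed maximum context length */

typedef struct {
    int entries[K];             /* context entries, oldest first */
    int len;                    /* number of entries in use      */
} Context;

/* Append a call site; if the context is already full, drop the least
 * recently added entry first so the length never exceeds K. */
Context push_call_site(Context ctx, int call_site) {
    if (ctx.len == K) {
        memmove(ctx.entries, ctx.entries + 1,
                (K - 1) * sizeof ctx.entries[0]);
        ctx.len = K - 1;
    }
    ctx.entries[ctx.len++] = call_site;
    return ctx;
}
```

Pushing call sites 3, 4 and 7 in that order onto an empty context leaves the context holding 4 and 7. Because the number of distinct contexts of length at most K is finite, the analysis cannot keep creating new contexts indefinitely, even for recursive programs.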
In call-site sensitivity, the points-to set of each variable (the set of abstract heap allocations each variable could point to) is further qualified by a context consisting of a list of call sites in the program. These contexts abstract the control flow of the program.
The following example demonstrates how call-site sensitivity can achieve higher precision than a flow-insensitive, context-insensitive analysis. Suppose a function id returns its pointer parameter p unchanged, and main, which has locals x and y, calls it twice: at call site main.3 it assigns id(&x) to u, and at call site main.4 it assigns id(&y) to v.
For this program, a context-insensitive analysis would (soundly but imprecisely) conclude that p can point to either the allocation holding x or that of y, so u and v may alias, and both could point to either allocation.
A call-site-sensitive analysis would analyze id twice, once for main.3 and once for main.4, and the points-to facts for p would be qualified by the call site, enabling the analysis to deduce that when main returns, u can only point to the allocation holding x and v can only point to the allocation holding y.
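The difference between the two analyses can be made concrete with a small C sketch. It is hand-specialized to this one example rather than a general analysis: the program in the comment is an assumed reconstruction, and the names analyze, slot and Result are hypothetical. The only thing that changes between the two modes is whether the facts about p are stored per call site or in one shared slot.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed program, reconstructed from the discussion above:
 *   int *id(int *p) { return p; }
 *   main: int x; int y;
 *         int *u = id(&x);   // call site main.3
 *         int *v = id(&y);   // call site main.4
 */
enum { OBJ_X, OBJ_Y };                /* abstract objects main::x, main::y */
enum { MAIN_3, MAIN_4, NUM_SITES };   /* call sites of id                  */
typedef unsigned PtsSet;              /* bit i set: may point to object i  */

typedef struct { PtsSet u, v; } Result;

/* With call-site sensitivity each call site has its own slot for p;
 * without it, every call shares slot 0. */
static int slot(bool call_site_sensitive, int site) {
    return call_site_sensitive ? site : 0;
}

Result analyze(bool call_site_sensitive) {
    PtsSet p[NUM_SITES] = { 0 };      /* points-to facts for p, per context */

    /* Parameter passing: the argument flows into p under the call's context. */
    p[slot(call_site_sensitive, MAIN_3)] |= 1u << OBJ_X;   /* id(&x) */
    p[slot(call_site_sensitive, MAIN_4)] |= 1u << OBJ_Y;   /* id(&y) */

    /* Return: each caller reads back p from the context of its own call. */
    Result r;
    r.u = p[slot(call_site_sensitive, MAIN_3)];
    r.v = p[slot(call_site_sensitive, MAIN_4)];
    return r;
}
```

Here analyze(false) gives both u and v the set {main::x, main::y}, while analyze(true) gives u only main::x and v only main::y.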
In an object-sensitive analysis, the points-to set of each variable is qualified by the abstract heap allocation of the receiver object of the method call. Unlike call-site sensitivity, object sensitivity is non-syntactic or non-local: the context entries are derived during the points-to analysis itself.[12]
Type sensitivity is a variant of object sensitivity in which the allocation site of the receiver object is replaced by the class (type) that contains the method in which the receiver object is allocated.[13] This yields no more contexts than an object-sensitive analysis would use, and typically fewer, which generally means better performance.
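To see why type sensitivity cannot produce more contexts, consider the sketch below, which assumes a context depth of one. The allocation-site labels, class names and helper functions are all hypothetical; the point is only that several allocation sites inside the same class collapse into a single type context.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical allocation sites. `in_class` is the class whose method
 * contains the allocation, not the class of the allocated object. */
typedef struct { const char *site; const char *in_class; } AllocSite;

enum { NUM_SITES = 4 };
const AllocSite sites[NUM_SITES] = {
    { "Parser.parse/new#1", "Parser"  },
    { "Parser.parse/new#2", "Parser"  },
    { "Parser.error/new#1", "Parser"  },
    { "Printer.show/new#1", "Printer" },
};

/* Context chosen for a method call whose receiver was allocated at `s`. */
const char *object_context(const AllocSite *s) { return s->site; }
const char *type_context(const AllocSite *s)   { return s->in_class; }

/* Count the distinct contexts that a selection function produces
 * (quadratic scan, which is fine for a sketch of this size). */
int count_contexts(const char *(*pick)(const AllocSite *)) {
    int distinct = 0;
    for (int i = 0; i < NUM_SITES; i++) {
        int seen = 0;
        for (int j = 0; j < i; j++)
            if (strcmp(pick(&sites[i]), pick(&sites[j])) == 0) seen = 1;
        distinct += !seen;
    }
    return distinct;
}
```

With these four sites, object sensitivity distinguishes four contexts while type sensitivity distinguishes two, one per enclosing class.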
[1] Reps, Thomas (2000-01-01). "Undecidability of context-sensitive data-dependence analysis". ACM Transactions on Programming Languages and Systems. 22 (1): 162–186. doi:10.1145/345099.345137. ISSN 0164-0925. S2CID 2956433.
[2] Ryder, Barbara G. (2003). "Dimensions of Precision in Reference Analysis of Object-Oriented Programming Languages". Compiler Construction, 12th International Conference, CC 2003, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2003, Warsaw, Poland, April 7–11, 2003, Proceedings. pp. 126–137. doi:10.1007/3-540-36579-6_10.
[3] Hind.
[4] Zyrianov, Vlas; Newman, Christian D.; Guarnera, Drew T.; Collard, Michael L.; Maletic, Jonathan I. (2019). "srcPtr: A Framework for Implementing Static Pointer Analysis Approaches" (PDF). ICPC '19: Proceedings of the 27th IEEE International Conference on Program Comprehension. Montreal, Canada: IEEE. https://www.zyrianov.org/papers/ICPC19.pdf
[5] Sui, Yulei; Xue, Jingling (2016). "SVF: interprocedural static value-flow analysis in LLVM" (PDF). CC'16: Proceedings of the 25th international conference on compiler construction. ACM. https://yuleisui.github.io/publications/cc16.pdf
[6] Smaragdakis, Yannis; Bravenboer, Martin; Lhoták, Ondrej (2011-01-26). "Pick your contexts well". Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages. POPL '11. Austin, Texas, USA: Association for Computing Machinery. pp. 17–30. doi:10.1145/1926385.1926390. ISBN 978-1-4503-0490-0. S2CID 6451826.
[7] Antoniadis, Tony; Triantafyllou, Konstantinos; Smaragdakis, Yannis (2017-06-18). "Porting doop to Soufflé". Proceedings of the 6th ACM SIGPLAN International Workshop on State of the Art in Program Analysis. SOAP 2017. Barcelona, Spain: Association for Computing Machinery. pp. 25–30. doi:10.1145/3088515.3088522. ISBN 978-1-4503-5072-3. S2CID 3074689.
[8] Smaragdakis & Balatsouras, p. 29.
[9] Thiessen, Rei; Lhoták, Ondřej (2017-06-14). "Context transformations for pointer analysis". ACM SIGPLAN Notices. 52 (6): 263–277. doi:10.1145/3140587.3062359. ISSN 0362-1340.
[10] Li et al., pp. 1:4.
[11] Smaragdakis & Balatsouras.
[12] Smaragdakis & Balatsouras, p. 37.
[13] Smaragdakis & Balatsouras, p. 39.