The challenge of structural code-coverage preservation is to ensure that a given structural code coverage of a program is preserved while the program is transformed into another program. This scenario is shown in Figure 1. Of course, when a program is transformed, its set of basic blocks, its set of program decisions, or its program scopes may also change. As shown in Figure 1, the interesting question is whether a concrete code transformation preserves the structural code coverage of interest.
When transforming a program, we are interested in the program properties that must be maintained by the code transformation such that a structural code coverage achieved on the original program by a test-data set is preserved on the transformed program. Based on these properties, one can adjust a source-to-source transformer or a compiler to use only those optimizations that preserve the intended structural code coverage.
These coverage-preservation properties have to ensure that whenever the code coverage is fulfilled at the original program by some test data, this coverage is also fulfilled at the transformed program with the same test data.
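In symbols, this requirement can be sketched as follows. Here $P_1$ and $P_2$ denote the original and the transformed program, $\mathit{TD}$ a test-data set, and $\mathit{coverage}(P, \mathit{TD})$ the statement that $\mathit{TD}$ achieves the chosen structural coverage on $P$. These names are assumed for illustration and need not match the article's own notation.

```latex
\forall\, \mathit{TD}.\quad
  \mathit{coverage}(P_1, \mathit{TD}) \;\Rightarrow\; \mathit{coverage}(P_2, \mathit{TD})
```

The quantification over all test-data sets reflects that the property is meant to hold independently of the concrete test data.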
The code coverage preservation can be applied to any type of code transformation, for example, to a source-to-source transformer or a compiler.
In the first step, we have to determine for each code transformation of the code transformer whether it preserves a given structural code coverage. We call this the coverage profile of a code transformation. The determination of the coverage profile is shown in Figure 2. The structural code coverage metrics of interest have to be formalized, and based on this formalization the coverage preservation criteria have to be determined. The coverage preservation criteria together with a description of a code optimization are used to calculate the coverage profile of that optimization. The construction of a formal model of the code optimization in Figure 2 is an intermediate step that is necessary if one wants to use formal verification to determine the coverage profile. If the coverage profile is determined manually, such a formal model of the code optimization is not needed.
In the second step, the coverage preservation has to be integrated into the code transformer. As an example we assume the code transformer is a compiler, as shown in Figure 3. This coverage-preserving compiler will have an input parameter to set the code coverage metric to be preserved. The coverage-preserving compiler can have two operation modes.
Safe Mode
In this mode the coverage-preserving compiler will apply only those code optimizations that preserve the given code coverage metric. With this operation mode we assure coverage preservation at the cost of a potential degradation of performance.
Full-Optimization Mode
In this mode the coverage-preserving compiler will apply all code transformations but it will emit a warning whenever a code transformation has been used that does not ensure the preservation of the given coverage metrics. The warning message should be as specific as possible to support the user in determining additional test data to regain code coverage for the optimized code.
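The two operation modes can be illustrated with a minimal C sketch. It assumes a hypothetical coverage profile stored as one flag per metric for each optimization. The type names, the function `apply_optimization`, and the warning text are illustrative assumptions and do not describe an existing compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical list of coverage metrics the user may ask to preserve. */
typedef enum { COV_SC, COV_CC, COV_DC, COV_MCDC, COV_SPC, COV_COUNT } coverage_metric;

typedef enum { MODE_SAFE, MODE_FULL_OPTIMIZATION } compiler_mode;

/* Assumed coverage profile of one optimization:
   preserves[m] is true if the optimization is known to preserve metric m. */
typedef struct {
    const char *name;
    bool preserves[COV_COUNT];
} optimization;

/* Returns true if the optimization is to be applied. In full-optimization
   mode a warning is counted whenever preservation is not assured. */
bool apply_optimization(const optimization *opt, coverage_metric metric,
                        compiler_mode mode, int *warnings)
{
    assert(opt != NULL && warnings != NULL && metric < COV_COUNT);
    if (opt->preserves[metric])
        return true;                  /* preserving: applied in both modes */
    if (mode == MODE_SAFE)
        return false;                 /* safe mode: skip the optimization */
    fprintf(stderr, "warning: %s may not preserve the selected coverage metric\n",
            opt->name);
    ++*warnings;
    return true;                      /* full-optimization mode: apply and warn */
}
```

In safe mode the function rejects any optimization whose profile does not list the requested metric. In full-optimization mode it accepts the optimization and records a warning, which a real tool would make as specific as possible.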
The determination of the coverage profile for a given code transformation and the realization of a coverage-preserving compiler are not the focus of this article. Within this article we present the foundation for such a coverage preservation framework and discuss issues that challenge its applicability.
In the following we present coverage preservation criteria for several variants of structural code-coverage metrics. The important aspect is that these preservation criteria are independent of the concrete test data that achieve the structural code coverage at the original program.
4.1. Preserving Statement Coverage (SC)
Equation (15) of Theorem 4.1 provides a coverage preservation criterion for statement coverage. Equation (15) essentially says that for each basic block of the transformed program there exists a basic block of the original program such that reaching the original block with a given test vector implies that the transformed block is also reached with the same test vector.
Theorem 4.1 (Preservation of SC).
Assuming that a set of test data achieves statement coverage on a given program, (15) provides a sufficient criterion for guaranteeing preservation of statement coverage on a transformed program. If no further knowledge about the program and the test data is assumed, the criterion is also necessary.
Proof.
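Preservation of SC: Part 1, showing sufficiency: Since the test data are assumed to achieve SC on the original program, every basic block of the original program is reached by at least one test vector. Since (15) states that for each basic block of the transformed program there is a basic block of the original program whose reaching inputs all reach the transformed block as well, it follows that every basic block of the transformed program is also reached by at least one test vector. Thus, SC is preserved at the transformed program.
Part 2, showing necessity by indirect proof: Assuming there exists a basic block of the transformed program such that (15) is violated for all basic blocks of the original program, then each basic block of the original program is reached by at least one input that does not reach this block of the transformed program. If the test data consist of exactly those inputs, then this block of the transformed program is never reached although SC holds in the original program, which implies that SC is not preserved.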
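The statement-coverage criterion described above can be sketched in C for a small finite input domain. The sketch assumes that the set of inputs reaching a basic block is known and encoded as a bitmask with one bit per input. The names `input_set`, `is_subset`, `achieves_sc`, and `sc_preserved` are hypothetical and introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: at most 32 distinct inputs, bit i set means input i. */
typedef uint32_t input_set;

static bool is_subset(input_set a, input_set b) { return (a & ~b) == 0; }

/* Test data achieve SC if every block is reached by at least one test vector. */
bool achieves_sc(const input_set *reach, size_t n, input_set test_data)
{
    assert(reach != NULL);
    for (size_t i = 0; i < n; ++i)
        if ((reach[i] & test_data) == 0)
            return false;
    return true;
}

/* Reading of the criterion: for each block of the transformed program there
   is a block of the original program whose reaching inputs all reach it. */
bool sc_preserved(const input_set *reach_p1, size_t n1,
                  const input_set *reach_p2, size_t n2)
{
    assert(reach_p1 != NULL && reach_p2 != NULL);
    for (size_t j = 0; j < n2; ++j) {
        bool witness = false;
        for (size_t i = 0; i < n1 && !witness; ++i)
            witness = is_subset(reach_p1[i], reach_p2[j]);
        if (!witness)
            return false;
    }
    return true;
}
```

Note that the check never looks at the test data, which mirrors the independence from concrete test data stressed in the text. `achieves_sc` is included only to try out individual test-data sets against the result.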
4.2. Preserving Condition Coverage (CC)
To define a coverage preservation criterion for CC (Theorem 4.2) we use the auxiliary predicate given in (16).
The predicate is TRUE only if the given set of input data includes at least the true-satisfiability valuation or the false-satisfiability valuation of the given expression, where the expression is either a condition or a decision. The predicate is used in the coverage preservation criterion for CC (and also DC) to test whether the evaluation of an expression of the original program to both TRUE and FALSE implies that the test data include at least one element of the input-data set needed for the coverage of an expression in the transformed program.
Equation (17) states that for each condition of the transformed program there exists at least one condition of the original program whose coverage implies that the transformed condition evaluates to TRUE, and there exists at least one condition of the original program whose coverage implies that the transformed condition evaluates to FALSE.
Theorem 4.2 (Preservation of CC).
Assuming that a set of test data achieves condition coverage on a given program, (17) provides a sufficient criterion for guaranteeing preservation of condition coverage on a transformed program. Without further knowledge about the program and the test data, the criterion is also necessary.
Proof.
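Preservation of CC: Part 1, showing sufficiency: Since the test data are assumed to achieve CC on the original program, each condition of the original program evaluates to TRUE for at least one test vector and to FALSE for at least one test vector. Since (17) states that for each condition of the transformed program the predicate of (16) holds for some condition of the original program with respect to the true-satisfiability valuation of the transformed condition, and for some condition of the original program with respect to its false-satisfiability valuation, it follows that each condition of the transformed program also evaluates to TRUE for at least one test vector and to FALSE for at least one test vector. Thus, CC is preserved at the transformed program.
Part 2, showing necessity by indirect proof: Assuming there exists a condition of the transformed program such that for all conditions of the original program the predicate of (16) fails either
(a) with respect to the true-satisfiability valuation of the transformed condition, or
(b) with respect to the false-satisfiability valuation of the transformed condition,
then it is possible that
(a) no test vector evaluates the transformed condition to TRUE, or
(b) no test vector evaluates the transformed condition to FALSE,
which in both cases violates the preservation of CC.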
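The auxiliary predicate and the condition-coverage criterion described in this section can be sketched in C in the same bitmask style. This is one possible reading of the prose description, not the article's formal definition. The type `condition` and the functions `includes_true_or_false` and `cc_preserved` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: at most 32 distinct inputs, bit i set means input i. */
typedef uint32_t input_set;

/* A condition is modeled by the inputs that make it TRUE and those
   that make it FALSE (its true/false-satisfiability valuations). */
typedef struct {
    input_set when_true;
    input_set when_false;
} condition;

static bool is_subset(input_set a, input_set b) { return (a & ~b) == 0; }

/* Auxiliary predicate: the input set id includes the true- or the
   false-satisfiability valuation of expression x. */
static bool includes_true_or_false(const condition *x, input_set id)
{
    return is_subset(x->when_true, id) || is_subset(x->when_false, id);
}

/* For each transformed condition, some original condition must imply its
   TRUE evaluation and some original condition its FALSE evaluation. */
bool cc_preserved(const condition *p1, size_t n1,
                  const condition *p2, size_t n2)
{
    assert(p1 != NULL && p2 != NULL);
    for (size_t j = 0; j < n2; ++j) {
        bool true_ok = false, false_ok = false;
        for (size_t i = 0; i < n1; ++i) {
            true_ok  = true_ok  || includes_true_or_false(&p1[i], p2[j].when_true);
            false_ok = false_ok || includes_true_or_false(&p1[i], p2[j].when_false);
        }
        if (!(true_ok && false_ok))
            return false;
    }
    return true;
}
```

With this model, a transformed condition that is simply the inverse of an original condition passes the check, whereas a condition that is TRUE only for a strict subset of the inputs of every original valuation does not.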
Simplification of the CC Preservation Criteria
The goal of defining the coverage preservation criterion is to decide for a set of code transformations whether they could potentially disrupt the structural code coverage achieved on the original program. Typically, when checking the preservation of structural code coverage, one would simplify (17) by just checking whether each condition of the transformed program is kept unchanged or is simply inverted. This results in the simpler criterion given in (20).
Working with the simple constraint of (20) may be sufficient in practice when analyzing the effect of concrete code transformations, since many transformations do not modify the conditions within a decision, but only their grouping into decisions. The simplified criterion admits only those code transformations that do not introduce new conditions with new unique satisfiability by the test data. Further, some transformations just invert a condition, which can also be checked with this simplified criterion.
4.3. Preserving Decision Coverage (DC)
To define a coverage preservation criterion for DC (Theorem 4.3) we use the auxiliary predicate given in (16), which is also used for preserving CC.
Equation (21) of Theorem 4.3 provides a coverage preservation criterion for decision coverage. Equation (21) essentially says that for each decision of the transformed program there exists at least one decision of the original program whose coverage implies that the transformed decision evaluates to TRUE, and there exists at least one decision of the original program whose coverage implies that the transformed decision evaluates to FALSE.
Theorem 4.3 (Preservation of DC).
Assuming that a set of test data achieves decision coverage on a given program, (21) provides a sufficient criterion for guaranteeing preservation of decision coverage on a transformed program. Without further knowledge about the program and the test data, the criterion is also necessary.
Proof.
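Preservation of DC: Part 1, showing sufficiency: Since the test data are assumed to achieve DC on the original program, each decision of the original program evaluates to TRUE for at least one test vector and to FALSE for at least one test vector. Since (21) states that for each decision of the transformed program
(1) the predicate of (16) holds for some decision of the original program with respect to the true-satisfiability valuation of the transformed decision, and
(2) the predicate of (16) holds for some decision of the original program with respect to the false-satisfiability valuation of the transformed decision,
it follows that each decision of the transformed program also evaluates to TRUE for at least one test vector and to FALSE for at least one test vector. Thus, DC is preserved at the transformed program.
Part 2, showing necessity by indirect proof: Assuming there exists a decision of the transformed program such that for all decisions of the original program the predicate of (16) fails either
(a) with respect to the true-satisfiability valuation of the transformed decision, or
(b) with respect to the false-satisfiability valuation of the transformed decision,
then it is possible that
(a) no test vector evaluates the transformed decision to TRUE, or
(b) no test vector evaluates the transformed decision to FALSE,
which in both cases violates the preservation of DC.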
Guaranteeing Decision Coverage
Guaranteeing the preservation of a structural code coverage criterion that depends on the coverage of decisions of a program is challenging, since there are many ways to re-group conditions into hierarchies of decisions without changing the program semantics.
The criterion given in (21) imposes quite strong restrictions on the performed code transformations, since it requires that for each decision of the transformed program there is an adequate decision of the original program such that decision coverage is preserved. For example, consider the following code transformation:
Original code:
    if (a==3 && b==2)
        c();
Transformed code:
    if (a==3)
        if (b==2)
            c();
Such a transformation is quite typical when source code is transformed into assembly code. The only decision in the original code is (a==3 && b==2), whereas the transformed code contains two decisions. Even with decision coverage on the original code, numerous code transformations are possible that do not preserve decision coverage.
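The loss of decision coverage in this example can be reproduced with a small C sketch. It records which outcomes each decision takes for a given set of test vectors. The helper names (`test_vector`, `outcome`, `dc_on_original`, `dc_on_transformed`) are hypothetical, and the call to c() is omitted because only the decision outcomes matter here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int a; int b; } test_vector;

/* Records whether a decision has been seen as TRUE and as FALSE. */
typedef struct { bool seen_true; bool seen_false; } outcome;

static void record(outcome *o, bool value)
{
    if (value) o->seen_true = true; else o->seen_false = true;
}

static bool covered(const outcome *o) { return o->seen_true && o->seen_false; }

/* Original code: the single decision (a==3 && b==2). */
bool dc_on_original(const test_vector *td, size_t n)
{
    assert(td != NULL);
    outcome d = { false, false };
    for (size_t i = 0; i < n; ++i)
        record(&d, td[i].a == 3 && td[i].b == 2);
    return covered(&d);
}

/* Transformed code: two nested decisions (a==3) and (b==2);
   the inner one is evaluated only if the outer one is TRUE. */
bool dc_on_transformed(const test_vector *td, size_t n)
{
    assert(td != NULL);
    outcome outer = { false, false }, inner = { false, false };
    for (size_t i = 0; i < n; ++i) {
        bool a_is_3 = (td[i].a == 3);
        record(&outer, a_is_3);
        if (a_is_3)
            record(&inner, td[i].b == 2);
    }
    return covered(&outer) && covered(&inner);
}
```

For the test vectors (a=3, b=2) and (a=0, b=2), the original decision takes both outcomes, but the inner decision of the transformed code is never FALSE. Adding a vector such as (a=3, b=5) regains decision coverage on the transformed code.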
Thus, it would be useful to have another criterion to guarantee decision coverage at the transformed program. Equation (22) provides a sufficient criterion for guaranteeing decision coverage on the transformed program, assuming that condition coverage is fulfilled on the original program.
The new criterion requires a different, but not stronger, structural code coverage at the original code to guarantee decision coverage at the transformed code. This criterion is typically more flexible when generating assembly code (which typically does not have control-flow statements with complex decisions). Further, if condition decision coverage (CDC) is fulfilled at the original program, one may choose between the criteria of (21) and (22) to guarantee decision coverage at the transformed program.
4.4. Preserving MCDC
Preserving MCDC coverage on a transformed program is especially challenging, since the code transformation may produce arbitrary groupings of conditions into decisions. In particular, the requirement that each condition can independently influence the outcome of its decision is rather complex to check.
Because the MCDC coverage preservation criterion is rather complex, we derive it in two steps. First, we describe a rather naive criterion that is relatively easy to understand. This criterion is sufficient but not necessary (too strict). Second, we describe a "realistic" (more detailed) criterion that is sufficient and necessary.
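The independence requirement behind MCDC can be illustrated with a minimal C sketch of a unique-cause check for one pair of test vectors. It assumes that the outcomes of the conditions of a decision are packed into a bit vector (bit i holds the outcome of condition i) and that the decision is given as a function of that vector. The names `decision_fn`, `shows_independence`, and `and_decision` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A decision maps the packed outcomes of its conditions to TRUE or FALSE. */
typedef bool (*decision_fn)(uint32_t conds);

/* Unique-cause check: v1 and v2 differ only in condition i,
   and the outcome of the decision differs as well. */
bool shows_independence(decision_fn d, unsigned i, uint32_t v1, uint32_t v2)
{
    assert(d != NULL && i < 32);
    bool only_i_differs = ((v1 ^ v2) == (UINT32_C(1) << i));
    return only_i_differs && (d(v1) != d(v2));
}

/* Example decision with two conditions: c0 && c1. */
bool and_decision(uint32_t conds)
{
    return (conds & 1u) != 0 && (conds & 2u) != 0;
}
```

For the decision c0 && c1, the pair (c0=1, c1=1) and (c0=0, c1=1) shows the independent influence of c0, whereas the pair (c0=1, c1=0) and (c0=0, c1=0) does not, because the decision is FALSE in both cases. A preservation criterion has to guarantee that such pairs still exist for every condition after the conditions have been regrouped.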
A Naive Coverage Preservation Criterion
A sufficient but not necessary coverage preservation criterion for MCDC is given in (23). The predicate symbol in (23) is used in the same way as in the realistic criterion: it expresses that only input data that fulfill MCDC at the original program have to be considered for coverage preservation.
This naive criterion is not necessary since it requires that the coverage of the conditions in the transformed program is preserved by a single condition from the original program.
Another drawback of this naive criterion is that it is based on a concrete set of test data that is used to achieve MCDC at the original program. To ensure coverage preservation in general, it would be necessary to ensure that the criterion holds for all possible sets of test data that achieve MCDC at the original program, which tends to be intractable in practice.
A Realistic Coverage Preservation Criterion
To define an easier testable (but more complicated) coverage preservation criterion for MCDC (Theorem 4.4) we use the auxiliary predicate
given in (24). The predicate
is similar to the predicate symbol
, with the difference that it performs the control check on all members of two sets of input data. The predicate
is used for the coverage preservation of MCDC to test whether the condition
of the original program refers to TRUE for one input data set
or
and refers to FALSE for the other. Besides
, also the predicate
(10) is used to describe the preservation criterion for MCDC coverage
The criterion given in (25) states that for each condition
of a decision
of the transformed program there exist two sets of input data
and
whose members achieve the
criterion needed for MCDC coverage. Further, there has to be a condition of the original program such that the
is a subset of either the true-satisfiability valuation or the false-satisfiability valuation (tested with the predicate
).
the same requirement as
.
Theorem 4.4 (Preservation of MCDC).
Assuming that a set of test data achieves MCDC coverage on a given program, (25) provides a sufficient criterion for guaranteeing preservation of MCDC coverage on a transformed program. Without further knowledge about the program and the test data, the criterion is also necessary.
Proof.
Preservation of MCDC: Part 1, showing sufficiency: Since
is assumed to achieve MCDC on
, it holds for each
and for each
that there exist at least two test vectors
such that
. Since
as defined in (10) for each condition is the formal definition of MCDC it directly follows that
is a sufficient criterion to ensure that MCDC is preserved at program
.
Part 2, showing necessity by indirect proof: Assuming there exists a decision
with a condition
such that for all input-data subsets
it either holds that
then it is possible that
(a) for all conditions in the original program condition coverage is not fulfilled (this case is already excluded by the assumption of having MCDC coverage at the original program),
(b) there is no MCDC coverage at the original program (this case is also excluded by the assumption of having MCDC coverage at the original program), or
(c) the test data do not provide MCDC coverage at the transformed program,
which in each case violates the preservation of MCDC: Cases (a) and (b) violate the preservation of MCDC since they contradict the requirement that MCDC is achieved at the original program. Case (c) states that there exists a condition in the transformed program for which there are no test data to achieve unique cause coverage, which is required for MCDC.
4.5. Preserving Scoped Path Coverage (SPC)
To define a coverage preservation criterion for SPC (Theorem 4.5) we use the auxiliary predicate given in (31).
The predicate is TRUE only if there is at least one condition from the given set of conditions whose true-satisfiability valuation is a subset of the given input data, or at least one condition from that set whose false-satisfiability valuation is a subset of the given input data. The predicate is used in the coverage preservation criterion for SPC to test whether, for a condition in the transformed program with a given true/false-satisfiability valuation, there exist two conditions in the original program whose true/false coverage is a subset of that valuation.
As stated in Theorem 4.5, (32) provides a coverage preservation criterion for SPC. Equation (32) says that for each scoped path of the transformed program there exists a scoped path of the original program such that the reachability of the first basic block of the original path implies the reachability of the first basic block of the transformed path. Further, (32) states that for each condition of the transformed path that has to be evaluated to TRUE, there exists a condition of a scoped path in the original program that implies the TRUE evaluation of the transformed condition (by the predicate of (31)). Finally, (32) states that for each condition of the transformed path that has to be evaluated to FALSE, there exists a condition of a scoped path in the original program that implies the FALSE evaluation of the transformed condition (by the predicate of (31)).
Theorem 4.5 (Preservation of SPC).
Assuming that a set of test data achieves scoped path coverage on a given program, (32) provides a sufficient criterion for guaranteeing preservation of scoped path coverage on a transformed program. Without further knowledge about the program and the test data, the criterion is also necessary.
Proof.
Preservation of SPC: Part 1, showing sufficiency: Since
is assumed to achieve SPC on
, it holds for each
and each
with
that there exists test data
with
Since (32) states that
it follows that
As (32) also states that
it follows that
Finally, as (32) states that
it follows that
Thus, SPC is preserved at
.
Part 2, showing necessity by indirect proof: Assuming there exists a scoped program path
such that for all scoped program paths
it either holds that
then at least one of the following cases is possible:
(a) the first basic block of a scoped program path of the transformed program is not executed with the given test data,
(b) one of the conditions of a scoped program path of the transformed program that has to be evaluated to TRUE evaluates to FALSE for all test vectors of the test data, or
(c) one of the conditions of a scoped program path of the transformed program that has to be evaluated to FALSE evaluates to TRUE for all test vectors of the test data,
which in each case violates the preservation of SPC.