The challenge of structural code-coverage preservation is to ensure that a given structural code coverage of a program is preserved when the program is transformed into another program. This scenario is shown in Figure 1. Of course, when a program is transformed, the set of basic blocks, the set of program decisions, or the program scopes may also change. As shown in Figure 1, the interesting question is whether a concrete code transformation preserves the structural code coverage of interest.

When transforming a program, we are interested in the program properties that the code transformation must maintain so that a structural code coverage achieved on the original program by a set of test data carries over to the transformed program. Based on these properties, one can adjust a source-to-source transformer or a compiler to use only those optimizations that preserve the intended structural code coverage.

These coverage-preservation properties have to ensure that whenever the code coverage is fulfilled on the original program by some test data, this coverage is also fulfilled on the transformed program with the same test data.

Code-coverage preservation can be applied to any type of code transformation, for example, to a source-to-source transformer or a compiler.

In the first step, we have to determine for each code transformation of the code transformer whether it preserves a given structural code coverage. We call this the *coverage profile* of a code transformation. The determination of the coverage profile is shown in Figure 2. The structural code coverage metrics of interest have to be formalized, and based on that the coverage preservation criteria have to be determined. The coverage preservation criteria, together with a description of a code optimization, are used to calculate the coverage profile of that optimization. The construction of a formal model of the code optimization in Figure 2 is an intermediate step that is necessary if one wants to use formal verification to determine the coverage profile. If the coverage profile is determined manually, such a formal model of the code optimization is not needed.

In the second step, the coverage preservation has to be integrated into the code transformer. As an example we assume the code transformer is a compiler, as shown in Figure 3. This coverage-preserving compiler will have an input parameter to set the code coverage metric to be preserved. The coverage-preserving compiler can have two operation modes.

Safe Mode

In this mode the coverage-preserving compiler will apply only those code optimizations that preserve the given code coverage metric. With this operation mode we assure coverage preservation at the cost of a potential degradation of performance.

Full-Optimization Mode

In this mode the coverage-preserving compiler will apply all code transformations but it will emit a warning whenever a code transformation has been used that does not ensure the preservation of the given coverage metrics. The warning message should be as specific as possible to support the user in determining additional test data to regain code coverage for the optimized code.
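The two operation modes can be sketched as a small pass selector. This is only an illustration under stated assumptions: the optimization names `opt_a`, `opt_b`, `opt_c`, the contents of `COVERAGE_PROFILE`, and the function `select_passes` are hypothetical and do not come from the article or from any real compiler.

```python
from enum import Enum


class Mode(Enum):
    SAFE = "safe"
    FULL = "full"


# Hypothetical coverage profiles: for each optimization, the set of
# coverage metrics it is assumed to preserve. The entries are made up
# for illustration only.
COVERAGE_PROFILE = {
    "opt_a": {"SC", "CC", "DC"},
    "opt_b": {"SC"},
    "opt_c": set(),
}


def select_passes(requested, metric, mode):
    """Return (passes to run, warnings) for the given operation mode."""
    passes, warnings = [], []
    for name in requested:
        if metric in COVERAGE_PROFILE.get(name, set()):
            passes.append(name)
        elif mode is Mode.FULL:
            # Full-optimization mode: apply the pass anyway, but warn.
            passes.append(name)
            warnings.append(f"{name} may not preserve {metric}")
        # Safe mode: a pass without a preservation guarantee is skipped.
    return passes, warnings


requested = ["opt_a", "opt_b", "opt_c"]
print(select_passes(requested, "DC", Mode.SAFE))
print(select_passes(requested, "DC", Mode.FULL))
```

In safe mode only `opt_a` is applied; in full-optimization mode all three passes run and two warnings are produced.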

The determination of the coverage profile for a given code transformation and the realization of a coverage-preserving compiler are not the focus of this article. Within this article we present the foundation for such a coverage preservation framework and discuss issues that challenge its applicability.

In the following we present coverage preservation criteria for several variants of structural code-coverage metrics. The important aspect is that these preservation criteria are independent of the concrete test data that achieve the structural code coverage at the original program.

### 4.1. Preserving Statement Coverage (SC)

Equation (15) of Theorem 4.1 provides a coverage preservation criterion for statement coverage. Equation (15) essentially says that for each basic block of the transformed program there exists a basic block of the original program such that reaching the original block with a given test vector implies that the transformed block is also reached with the same test vector.

Theorem 4.1 (Preservation of SC).

Assuming that a set of test data achieves statement coverage on a given program, (15) provides a sufficient criterion for guaranteeing preservation of statement coverage on a transformed program. Without further knowledge about the program and the test data, the criterion is also necessary.

Proof.

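*Preservation of SC*: Part 1, showing sufficiency: Since the test data are assumed to achieve SC on the original program, each basic block of the original program is reached by at least one test vector. Since (15) states that for each basic block of the transformed program there is a basic block of the original program whose reaching inputs all reach the transformed block as well, each basic block of the transformed program is also reached by at least one test vector. Thus, SC is preserved on the transformed program.

Part 2, showing necessity by indirect proof: Assume there exists a basic block of the transformed program such that, for every basic block of the original program, the inputs reaching the original block are not all inputs reaching the transformed block. Then each basic block of the original program is reached by at least one input that does not reach the transformed block. If the test data consist of exactly those inputs, the transformed block is never reached although SC holds on the original program, which implies that SC is not preserved.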
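The subset reading of the SC criterion can be checked mechanically on a small, finite input space. The following is a minimal sketch, assuming each basic block is described by the set of inputs that reach it; the function name `preserves_sc` and the toy programs are hypothetical.

```python
def preserves_sc(reach_p1, reach_p2):
    """Check the SC preservation criterion on explicit input sets.

    reach_p1 / reach_p2 map a basic-block name to the (non-empty) set of
    inputs that reach this block in the original / transformed program.
    The criterion holds if every transformed block is reached by all
    inputs that reach some original block.
    """
    return all(
        any(inputs_b <= inputs_b2 for inputs_b in reach_p1.values())
        for inputs_b2 in reach_p2.values()
    )


# Toy example over the input space {0, 1, 2, 3}.
p1 = {"b1": {0, 1, 2, 3}, "b2": {2, 3}}
p2_ok = {"b1'": {0, 1, 2, 3}, "b2'": {1, 2, 3}}
p2_bad = {"b1'": {0, 1, 2, 3}, "b2'": {3}}

print(preserves_sc(p1, p2_ok))   # True
print(preserves_sc(p1, p2_bad))  # False
```

For `p2_bad`, the single test input `2` reaches both blocks of the original program but never reaches `b2'`, which is the kind of counterexample constructed in Part 2 of the proof.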

### 4.2. Preserving Condition Coverage (CC)

To define a coverage preservation criterion for CC (Theorem 4.2) we use the auxiliary predicate given in (16).

The predicate is TRUE only if the given set of input data includes at least the true-satisfiability valuation or the false-satisfiability valuation of the given expression, which is either a condition or a decision. The predicate is used in the coverage preservation criteria of CC (and also DC) to test whether evaluating some expression of the original program to both TRUE and FALSE implies that the test data include at least one element of the input set needed for the coverage of an expression in the transformed program.

Equation (17) states that for each condition of the transformed program there exists at least one condition of the original program whose coverage implies that the transformed condition evaluates to TRUE, and at least one condition of the original program whose coverage implies that the transformed condition evaluates to FALSE.

Theorem 4.2 (Preservation of CC).

Assuming that a set of test data achieves condition coverage on a given program, (17) provides a sufficient criterion for guaranteeing preservation of condition coverage on a transformed program. Without further knowledge about the program and the test data, the criterion is also necessary.

Proof.

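*Preservation of CC*: Part 1, showing sufficiency: Since the test data are assumed to achieve CC on the original program, each condition of the original program evaluates to TRUE for at least one test vector and to FALSE for at least one test vector. Since (17) states that for each condition of the transformed program the original program contains a condition whose coverage forces a TRUE evaluation and a condition whose coverage forces a FALSE evaluation of the transformed condition, it follows that each condition of the transformed program also evaluates to TRUE for at least one test vector and to FALSE for at least one test vector.

Thus, CC is preserved on the transformed program.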

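Part 2, showing necessity by indirect proof: Assume there exists a condition of the transformed program such that for all conditions of the original program one of the following holds:

(a) covering the original condition does not imply that the transformed condition evaluates to TRUE, or

(b) covering the original condition does not imply that the transformed condition evaluates to FALSE.

Then it is possible that

(a) no test vector makes the transformed condition evaluate to TRUE, or

(b) no test vector makes the transformed condition evaluate to FALSE,

which in both cases violates the preservation of CC.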
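The auxiliary predicate of (16) and the CC criterion of (17) can be tried out on explicit input sets. This is a sketch under assumptions: each condition is modeled as a pair of non-empty sets (inputs that make it TRUE, inputs that make it FALSE), and the names `touches` and `preserves_cc` are chosen here for illustration.

```python
def touches(input_set, cond):
    """True if input_set includes the TRUE set or the FALSE set of cond.

    cond is a pair (iv_t, iv_f): the inputs that evaluate the condition
    to TRUE and to FALSE, respectively (both assumed non-empty).
    """
    iv_t, iv_f = cond
    return iv_t <= input_set or iv_f <= input_set


def preserves_cc(conds_p1, conds_p2):
    """For each transformed condition, some original condition must force
    a TRUE evaluation and some original condition a FALSE evaluation."""
    return all(
        any(touches(iv_t2, c) for c in conds_p1)
        and any(touches(iv_f2, c) for c in conds_p1)
        for (iv_t2, iv_f2) in conds_p2
    )


# Toy example over the input space {0, 1, 2, 3}.
original = [({0, 1}, {2, 3})]
inverted = [({2, 3}, {0, 1})]   # same condition, negated
narrowed = [({0}, {1, 2, 3})]   # TRUE only for input 0

print(preserves_cc(original, inverted))  # True
print(preserves_cc(original, narrowed))  # False
```

For `narrowed`, the test data `{1, 2}` achieve CC on the original condition but never evaluate the transformed condition to TRUE.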

Simplification of the CC Preservation Criteria

The goal of defining the coverage preservation criterion is to decide for a set of code transformations whether they could potentially disrupt the structural code coverage achieved on the original program. Typically, when checking the preservation of structural code coverage, one would simplify (17) by just checking whether each condition is kept unchanged or is simply inverted. This would result in the simpler criterion given in (20).

Working with the simple constraint of (20) may be sufficient in practice when analyzing the effect of concrete code transformations, since many transformations do not modify the conditions within a decision, but only their grouping into decisions. The simplified criterion admits only those code transformations that do not introduce new conditions with a new, unique satisfiability by the test data. Further, some transformations just invert a condition, which can also be checked with this simplified criterion.
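The simplified check can be sketched as a comparison of valuation pairs. The exact form of (20) is not reproduced here; the sketch assumes it means that every transformed condition has the same TRUE and FALSE input sets as some original condition, either directly or swapped. The name `preserves_cc_simple` is hypothetical.

```python
def preserves_cc_simple(conds_p1, conds_p2):
    """Each transformed condition equals or inverts some original condition.

    Conditions are pairs (iv_t, iv_f) of input sets; an inverted condition
    has the two sets swapped.
    """
    return all(
        any((t2, f2) in ((t1, f1), (f1, t1)) for (t1, f1) in conds_p1)
        for (t2, f2) in conds_p2
    )


original = [({0, 1}, {2, 3})]
kept = [({0, 1}, {2, 3})]
inverted = [({2, 3}, {0, 1})]
narrowed = [({0}, {1, 2, 3})]

print(preserves_cc_simple(original, kept))      # True
print(preserves_cc_simple(original, inverted))  # True
print(preserves_cc_simple(original, narrowed))  # False
```

The check only needs equality tests, which is what makes it attractive when classifying concrete transformations.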

### 4.3. Preserving Decision Coverage (DC)

To define a coverage preservation criterion for DC (Theorem 4.3) we use the auxiliary predicate given in (16), which is also used for preserving CC.

Equation (21) of Theorem 4.3 provides a coverage preservation criterion for decision coverage. Equation (21) essentially says that for each decision of the transformed program there exists at least one decision of the original program whose coverage implies that the transformed decision evaluates to TRUE, and at least one decision of the original program whose coverage implies that the transformed decision evaluates to FALSE.

Theorem 4.3 (Preservation of DC).

Assuming that a set of test data achieves decision coverage on a given program, (21) provides a sufficient criterion for guaranteeing preservation of decision coverage on a transformed program. Without further knowledge about the program and the test data, the criterion is also necessary.

Proof.

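*Preservation of DC*: Part 1, showing sufficiency: Since the test data are assumed to achieve DC on the original program, each decision of the original program evaluates to TRUE for at least one test vector and to FALSE for at least one test vector. Since (21) states that for each decision of the transformed program

(1) there exists a decision of the original program whose coverage implies that the transformed decision evaluates to TRUE, and

(2) there exists a decision of the original program whose coverage implies that the transformed decision evaluates to FALSE,

it follows that each decision of the transformed program also evaluates to TRUE for at least one test vector and to FALSE for at least one test vector. Thus, DC is preserved on the transformed program.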

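Part 2, showing necessity by indirect proof: Assume there exists a decision of the transformed program such that for all decisions of the original program one of the following holds:

(a) covering the original decision does not imply that the transformed decision evaluates to TRUE, or

(b) covering the original decision does not imply that the transformed decision evaluates to FALSE.

Then it is possible that

(a) no test vector makes the transformed decision evaluate to TRUE, or

(b) no test vector makes the transformed decision evaluate to FALSE,

which in both cases violates the preservation of DC.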

Guaranteeing Decision Coverage

Guaranteeing the preservation of a structural code coverage criterion that depends on the coverage of decisions of a program is challenging, since there are many ways to re-group conditions into hierarchies of decisions without changing the program semantics.

The criterion given in (21) imposes quite strong restrictions on the performed code transformations, since it requires that for each decision there is an adequate decision of the original program such that *decision coverage* is preserved. For example, consider the following code transformation:

Original code:

    if (a==3 && b==2)
        c();

Transformed code:

    if (a==3)
        if (b==2)
            c();

Such a transformation is quite typical when source code is translated into assembly code. The only decision in the original code is (a==3 && b==2), whereas the transformed code contains the two decisions (a==3) and (b==2). Given *decision coverage* on the original code, numerous possible code transformations do not preserve *decision coverage*.
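The loss of decision coverage under this transformation can be reproduced with a small simulation. The sketch below models both code variants in Python and records the outcomes of each decision; the helper names and the chosen test vectors are assumptions made for illustration.

```python
def decision_outcomes(program, tests):
    """Collect, per decision, the set of outcomes observed over all tests."""
    seen = {}
    for a, b in tests:
        for name, value in program(a, b):
            seen.setdefault(name, set()).add(value)
    return seen


def original(a, b):
    # if (a==3 && b==2) c();
    yield "a==3 && b==2", (a == 3 and b == 2)


def transformed(a, b):
    # if (a==3) if (b==2) c();
    yield "a==3", a == 3
    if a == 3:  # the inner decision is only evaluated on this path
        yield "b==2", b == 2


def has_dc(seen, decisions):
    """Decision coverage: every decision was seen both TRUE and FALSE."""
    return all(seen.get(d, set()) == {True, False} for d in decisions)


tests = [(3, 2), (0, 0)]
print(has_dc(decision_outcomes(original, tests), ["a==3 && b==2"]))      # True
print(has_dc(decision_outcomes(transformed, tests), ["a==3", "b==2"]))   # False
```

The two test vectors cover the single decision of the original code, but the decision (b==2) of the transformed code is never evaluated to FALSE. Adding the test vector `(3, 0)` would regain decision coverage on the transformed code.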

Thus, it would be useful to have another criterion to guarantee decision coverage on the transformed program. Equation (22) provides a sufficient criterion for guaranteeing decision coverage on the transformed program, assuming that *condition coverage* is fulfilled on the original program.

The new criterion requires a different, but not stronger, structural code coverage on the original code to guarantee *decision coverage* on the transformed code. This criterion is typically more flexible when generating assembly code (which typically does not have control-flow statements with complex decisions). Further, if *condition decision coverage* (CDC) is fulfilled on the original program, one may choose between the criteria of (21) and (22) to guarantee *decision coverage* on the transformed program.

### 4.4. Preserving MCDC

Preserving MCDC coverage on a transformed program is especially challenging, since the code transformation may produce arbitrary groupings of conditions into decisions. In particular, the requirement that each condition can independently influence the outcome of its decision is rather complex to check.

As the MCDC coverage preservation criterion is rather complex, we derive it in two steps. First, we describe a rather naive criterion that is relatively easy to understand. This criterion is sufficient but not necessary (too strict). Second, we describe a "realistic" (more detailed) criterion that is sufficient and necessary.
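To make the independence requirement concrete, the following sketch checks the unique-cause form of MCDC for a single decision: for every condition there must be two test vectors that differ only in that condition and yield different decision outcomes. The function names and the representation of test vectors as tuples of condition values are assumptions for illustration; short-circuit evaluation is not modeled.

```python
from itertools import combinations


def independence_pairs(decision, vectors, index):
    """Pairs of vectors that differ only in condition `index` and flip the decision."""
    return [
        (u, v)
        for u, v in combinations(vectors, 2)
        if u[index] != v[index]
        and all(u[i] == v[i] for i in range(len(u)) if i != index)
        and decision(u) != decision(v)
    ]


def achieves_mcdc(decision, vectors, n_conditions):
    """Unique-cause MCDC: every condition has at least one independence pair."""
    return all(
        independence_pairs(decision, vectors, i) for i in range(n_conditions)
    )


def decision(c):
    # Models a decision with two conditions: c[0] && c[1]
    return c[0] and c[1]


full = [(True, True), (False, True), (True, False)]
weak = [(True, True), (False, False)]

print(achieves_mcdc(decision, full, 2))  # True
print(achieves_mcdc(decision, weak, 2))  # False
```

The set `weak` evaluates the decision and both conditions to TRUE and to FALSE, yet it contains no pair of vectors that isolates the effect of a single condition.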

A Naive Coverage Preservation Criterion

A sufficient but not necessary coverage preservation criterion for MCDC is given in (23). Its predicate symbol is used in the same way as in the realistic criterion: it expresses that only input data that fulfill MCDC on the original program have to be considered for coverage preservation.

This naive criterion is not necessary, since it requires that the coverage of the conditions in the transformed program be preserved by a single condition of the original program.

Another drawback of this naive criterion is that it is based on a concrete set of test data that are used to achieve MCDC at the original program. To ensure coverage preservation in general, it would be necessary to ensure that the criterion holds for all possible sets of test data that achieve MCDC at the original program, which tends to be intractable in practice.

A Realistic Coverage Preservation Criterion

To define an easier-to-test (but more complicated) coverage preservation criterion for MCDC (Theorem 4.4) we use the auxiliary predicate given in (24). It resembles the earlier auxiliary predicate, with the difference that it performs the check on all members of two sets of input data. The predicate is used for the coverage preservation of MCDC to test whether the condition of the original program evaluates to TRUE for one of the two input data sets and to FALSE for the other. In addition, the predicate (10) is used to describe the preservation criterion for MCDC coverage.

The criterion given in (25) states that for each condition of a decision of the transformed program there exist two sets of input data whose members achieve the criterion needed for MCDC coverage. Further, there has to be a condition of the original program such that one of the two input sets is a subset of its true-satisfiability valuation and the other is a subset of its false-satisfiability valuation (tested with the predicate given in (24)).

Theorem 4.4 (Preservation of MCDC).

Assuming that a set of test data achieves MCDC coverage on a given program, (25) provides a sufficient criterion for guaranteeing preservation of MCDC coverage on a transformed program. Without further knowledge about the program and the test data, the criterion is also necessary.

Proof.

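*Preservation of MCDC*: Part 1, showing sufficiency: Since the test data are assumed to achieve MCDC on the original program, for each decision of the original program and for each of its conditions there exist at least two test vectors that show the independent effect of that condition on the decision. Since the predicate defined in (10), applied to each condition of the transformed program, is the formal definition of MCDC, it directly follows that (25) is a sufficient criterion to ensure that MCDC is preserved on the transformed program.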

Part 2, showing necessity by indirect proof: Assume there exists a decision of the transformed program with a condition such that, for all pairs of input-data subsets, at least one part of (25) fails. Then at least one of the following cases is possible:

(a) for all conditions in the original program, condition coverage is not fulfilled; this case is already excluded by the assumption of having MCDC coverage on the original program;

(b) there is no MCDC coverage on the original program; this case is already excluded by the assumption of having MCDC coverage on the original program;

(c) the test data do not provide MCDC coverage on the transformed program.

Each case violates the preservation of MCDC: cases (a) and (b) contradict the requirement that MCDC is achieved on the original program, and case (c) states that there exists a condition in the transformed program for which there are no test data to achieve *unique cause* coverage, which is required for MCDC.

### 4.5. Preserving Scoped Path Coverage (SPC)

To define a coverage preservation criterion for SPC (Theorem 4.5) we use the auxiliary predicate given in (31).

The predicate is TRUE only if there is at least one condition in the given set of conditions whose true-satisfiability valuation is a subset of the given input data, or at least one condition in that set whose false-satisfiability valuation is a subset of the given input data. The predicate is used in the coverage preservation criterion of SPC to test whether, for a condition in the transformed program with a given true- or false-satisfiability valuation, there exist two conditions in the original program whose true or false coverage is a subset of that valuation.

As stated in Theorem 4.5, (32) provides a coverage preservation criterion for SPC. Equation (32) says that for each scoped path of the transformed program there exists a scoped path of the original program such that the reachability of the first basic block of the original path implies the reachability of the first basic block of the transformed path. Further, (32) states that for each condition of the transformed path that has to evaluate to TRUE, there exists a condition of a scoped path in the original program that implies the TRUE evaluation of that condition (by the predicate given in (31)). Finally, (32) states that for each condition of the transformed path that has to evaluate to FALSE, there exists a condition of a scoped path in the original program that implies the FALSE evaluation of that condition (again by the predicate given in (31)).

Theorem 4.5 (Preservation of SPC).

Assuming that a set of test data achieves scoped path coverage on a given program, (32) provides a sufficient criterion for guaranteeing preservation of scoped path coverage on a transformed program. Without further knowledge about the program and the test data, the criterion is also necessary.

Proof.

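*Preservation of SPC*: Part 1, showing sufficiency: Since the test data are assumed to achieve SPC on the original program, for each scoped path of the original program there exist test data that execute this path.

Since (32) states that the reachability of the first basic block of an original scoped path implies the reachability of the first basic block of the transformed scoped path, it follows that the first basic block of each transformed scoped path is reached.

As (32) also states that each condition of the transformed path that has to evaluate to TRUE is implied by a condition of a scoped path in the original program, it follows that these conditions evaluate to TRUE for the test data.

Finally, as (32) states that each condition of the transformed path that has to evaluate to FALSE is implied by a condition of a scoped path in the original program, it follows that these conditions evaluate to FALSE for the test data.

Thus, SPC is preserved on the transformed program.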

Part 2, showing necessity by indirect proof: Assume there exists a scoped program path of the transformed program such that, for all scoped program paths of the original program, at least one part of (32) fails. Then at least one of the following cases is possible:

(a) the first basic block of a scoped program path of the transformed program is not executed with the given test data;

(b) one of the conditions of a scoped program path of the transformed program that has to evaluate to TRUE evaluates to FALSE for all test vectors;

(c) one of the conditions of a scoped program path of the transformed program that has to evaluate to FALSE evaluates to TRUE for all test vectors.

Each of these cases violates the preservation of SPC.