 Review
 Open Access
Research on recovery strategy in embedded realtime main memory databases
EURASIP Journal on Embedded Systems volume 2016, Article number: 9 (2016)
Abstract
In order to recover data from embedded realtime main memory databases effectively and efficiently, this paper proposes a realtime log-based recovery approach. To meet the realtime requirements of embedded systems, we classify consistency in realtime main memory databases into data consistency and transaction consistency, analyze both theoretically, design rules for a correct recovery strategy, and propose realtime log-based recovery algorithms for different types of transactions. The experiments show that the proposed approach is more effective and efficient than both the traditional recovery method and the method in the eXtremeDB database system.
Review
With the development of embedded systems, the application of databases in embedded systems [1] has become a hotspot in both industry and academia. Embedded systems work in an environment without manual intervention, so when a fault occurs, they need to diagnose it and recover from it automatically [2]. Main memory databases [3, 4] greatly reduce I/O operations while running and can satisfy the realtime requirement of embedded systems, so the databases implemented in embedded systems usually work in main memory.
In realtime main memory databases [5–7], the main copy of the database resides in volatile RAM, where the data is very vulnerable, so recovery is necessary. Meanwhile, realtime main memory databases perform few I/O operations, and recovery is the only part that affects I/O performance, so the performance of recovery is critical for realtime main memory databases [8, 9]. While recovering from a fault, realtime main memory databases need to satisfy multiple constraints [10, 11], and this poses a huge challenge for designing reasonable recovery strategies.
Checkpointing with memory snapshots [12, 13] is a commonly used program recovery strategy, but the overhead of storing the state of a running program is very high, so it is not suitable for embedded applications. In addition, logs in embedded systems record the behavior of those systems, and researchers have used different logs to design different recovery strategies, such as the partition log [14], realtime log [15], remote log [16], and operation log [17]. However, these strategies take only the realtime requirement into consideration and ignore other specific requirements of embedded systems, so they cannot be applied to the embedded environment efficiently. The method proposed in [13] studies recovery in main memory, but it is based on virtual memory snapshots. To improve realtime ability, Levy and Silberschatz [18] proposed an incremental recovery strategy for main memory databases.
In this paper, we analyze the consistency constraints in embedded realtime main memory databases from the perspectives of both data and transaction. Then we design some rules that an efficient recovery strategy must obey in embedded realtime main memory databases. Finally, we propose corresponding recovery algorithms for different tasks of embedded realtime main memory databases.
Analysis of consistency in embedded realtime main memory databases
In this section, we analyze the consistency of embedded realtime main memory databases from the perspective of both data and transaction.
Data consistency
The embedded realtime main memory databases include three types of data, i.e., image objects, derived objects, and invariant objects.
The objects of real world are sensed by sensors, and their values are written into the databases. The values written into the databases are image objects. An image object is an image of a real world object at some instant, and each image object has its own sampling timestamp and external validity interval.
A derived object is computed from a group of image objects during transaction processing. The timestamp of a derived object is the instant at which the transaction finishes, and its validity interval is the intersection of the validity intervals of all image objects in the group.
An invariant object is a constant that does not change over time. Its validity is not affected by time, so it is also called a non-time series data object.
As there is a validity interval for each image object and derived object, both of them are time series data objects. The sampling time and computing time of time series data are valid only within an interval starting from the system’s current time.
Definition 1. If VI(X) is far less than AT(X), i.e., \( VI(X) \ll AT(X) \), then X is short time-limited data.
The data consistency of embedded realtime systems includes internal consistency, external consistency, and mutual consistency.
Definition 2. X is internally consistent if and only if it satisfies the predefined integrity and consistency constraints of traditional database systems.
Internal consistency here is the same notion as in traditional database systems, and it concerns only the internal world of the database system.
Definition 3. X is externally consistent if and only if it satisfies \( t \le ST(X) + VI(X) \).
External consistency requires that the sampled data in a database lag behind the real world by no more than a bounded time.
Definition 4. A group of related data used for making decisions or deriving new data is a mutually consistent set R, and each R is associated with a mutual validity interval \( R_{mvi} \).
Definition 5. Let \( R = \{X_1, X_2, \ldots, X_n\} \). Then R is mutually consistent if and only if, \( \forall X_i, X_j \in R \) with \( i \ne j \), \( |ST(X_i) - ST(X_j)| \le R_{mvi} \).
If R is used to generate new data, then the mutual consistency is used to assure the values in R are generated within the common validity interval.
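Definitions 3 and 5 can be checked mechanically. The following is a minimal Python sketch, not the authors' implementation: the `DataObject` class and the function names are hypothetical, and it assumes that ST(X) is the sampling timestamp, VI(X) is the length of the validity interval, and t is the current time, all in the same time unit.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List


@dataclass(frozen=True)
class DataObject:
    name: str
    st: float  # ST(X): sampling timestamp (assumed reading of the symbol)
    vi: float  # VI(X): length of the validity interval (assumed reading)


def externally_consistent(x: DataObject, t: float) -> bool:
    """Definition 3: X is externally consistent at time t iff t <= ST(X) + VI(X)."""
    return t <= x.st + x.vi


def mutually_consistent(objs: List[DataObject], r_mvi: float) -> bool:
    """Definition 5: any two sampling timestamps differ by at most R_mvi."""
    return all(abs(a.st - b.st) <= r_mvi for a, b in combinations(objs, 2))


# Hypothetical sensor readings used only for illustration.
temperature = DataObject("temperature", st=10.0, vi=5.0)
pressure = DataObject("pressure", st=12.0, vi=5.0)

print(externally_consistent(temperature, 14.0))              # True
print(externally_consistent(temperature, 16.0))              # False
print(mutually_consistent([temperature, pressure], 3.0))     # True
print(mutually_consistent([temperature, pressure], 1.0))     # False
```

Checking every unordered pair once with an absolute difference is equivalent to checking the signed difference for every ordered pair.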
Transaction consistency
Embedded realtime main memory database systems interact with the real world in two ways. The first is recording the states and events of the real world in the database, and the second is performing actions that affect the real world. Accordingly, embedded realtime transactions can be classified into data receiving transactions, data processing transactions, and manipulating transactions.
Data receiving transactions sample the external environment periodically and write the sampled values into the database. This kind of transaction generates one image object per period, and it is a write-only, non-blocking hard realtime transaction.
Data processing transactions perform read-only operations on image objects, periodically or nonperiodically, and read and write derived objects or invariant objects. This kind of transaction does not interact with the real world and is a soft realtime transaction.
Manipulating transactions read all kinds of data in a database and perform a set of actions \( AS(T) = \{A_i \mid 1 \le i \le h\} \) to control the embedded system. If a transaction of this kind exceeds its validity interval, the results can be disastrous, so it is also a hard realtime transaction. Manipulating transactions are read-only with respect to the database, so they do not affect its consistency, but they can change the state of the real world.
As with data consistency, transaction consistency in embedded realtime main memory database systems includes internal consistency, external consistency, and mutual consistency.
Definition 6. T is internally consistent if and only if the values it reads and/or writes satisfy the predefined internal integrity and consistency constraints of traditional database systems.
Definition 7. T is externally consistent if and only if \( t \le D(T) \) and, \( \forall X_i \in DS(T) \), \( t \le ST(X_i) + VI(X_i) \).
The external consistency of embedded realtime transactions requires that each transaction is within its validity interval and that every data object it reads or writes is within its own validity interval.
Theorem 1. Let MVI(T) be the minimum of the validity end instants \( ST(X_i) + VI(X_i) \) over the data objects that T reads or writes. Then the final terminal instant of T is \( D_R(T) = \min(D(T), MVI(T)) \).
Proof: If \( MVI(T) < t < D(T) \), then \( \exists X_i \in DS(T) \) such that \( t > ST(X_i) + VI(X_i) \); that is, some \( X_i \) has lost its external consistency, which violates the external consistency constraint on the data objects that T reads or writes. Conversely, if \( D(T) < MVI(T) \) and \( t > D(T) \), then T exceeds its validity interval, which violates the external consistency constraint of T itself. Hence \( D_R(T) = \min(D(T), MVI(T)) \).
Definition 8. T is mutually consistent if and only if, \( \forall X_i, X_j \in DS(T) \) with \( i \ne j \), \( |ST(X_i) - ST(X_j)| \le R_{mvi}(T) \).
The mutual consistency of embedded realtime transactions means that the difference between the sampling timestamps of any two data objects is no larger than the given value \( R_{mvi}(T) \).
Similarly, when T is both externally consistent and mutually consistent, it is time consistent. A valid submission of a transaction in embedded realtime systems depends not only on internal consistency but also on time consistency. So, we have the following corollary.
Corollary 1. T is consistent if and only if the following constraints are satisfied at the same time:
(1) \( \forall X_i, X_i \in DS(T) \);
(2) \( CT(T) \le D_R(T) \);
(3) \( \forall X_i \in RS(T) \), \( RT_T(X_i) \le ST(X_i) + VI(X_i) \);
(4) \( \forall X_i, X_j \in RS(T) \) with \( i \ne j \), \( |ST(X_i) - ST(X_j)| \le R_{mvi}(T) \).
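Theorem 1 and Corollary 1 together give a commit-time check. The sketch below is one hedged way to express it in Python. It assumes CT(T) is the commit instant, D(T) the deadline, and RT_T(X_i) the instant at which T read X_i; these readings, the `Access` class, and the function names are assumptions for illustration. Condition (1) is incomplete in the text, so the sketch takes it as a caller-supplied flag and assumes it stands for internal consistency. For brevity, a single list plays the role of both DS(T) and RS(T).

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List


@dataclass(frozen=True)
class Access:
    st: float  # ST(X_i): sampling timestamp
    vi: float  # VI(X_i): length of the validity interval
    rt: float  # RT_T(X_i): instant at which T read X_i (assumed reading)


def final_deadline(d: float, accesses: List[Access]) -> float:
    """Theorem 1: D_R(T) = min(D(T), MVI(T))."""
    mvi = min((a.st + a.vi for a in accesses), default=float("inf"))
    return min(d, mvi)


def is_consistent(internal_ok: bool, ct: float, d: float,
                  accesses: List[Access], r_mvi: float) -> bool:
    """Conditions (1)-(4) of Corollary 1, with (1) supplied by the caller."""
    deadline_ok = ct <= final_deadline(d, accesses)             # condition (2)
    reads_ok = all(a.rt <= a.st + a.vi for a in accesses)       # condition (3)
    mutual_ok = all(abs(a.st - b.st) <= r_mvi
                    for a, b in combinations(accesses, 2))      # condition (4)
    return internal_ok and deadline_ok and reads_ok and mutual_ok


# Hypothetical values: two reads, transaction deadline D(T) = 20.
accesses = [Access(st=10.0, vi=5.0, rt=11.0), Access(st=12.0, vi=5.0, rt=12.5)]
print(final_deadline(20.0, accesses))                    # 15.0
print(is_consistent(True, 14.0, 20.0, accesses, 3.0))    # True
print(is_consistent(True, 16.0, 20.0, accesses, 3.0))    # False
```

In the last call the commit instant 16 is still before D(T) = 20, but it is past MVI(T) = 15, which is exactly the case Theorem 1 rules out.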
Rules for correct recovery strategy
Taking the internal consistency and time consistency of transactions and data in embedded realtime main memory databases into consideration, we present some rules for correct recovery strategies.
Non-time series data recovery rule
Rule 1. If T has not been submitted, then for every \( X_i \in US(T) \) satisfying \( S_t(X_i) = UI_T(X_i) \), execute the undo operation.
Rule 2. If T has been submitted, then for every \( X_i \in US(T) \) satisfying \( S_t(X_i) \ne UI_T(X_i) \), execute the redo operation.
Rules 1 and 2 recover data so that they satisfy the internal consistency constraint. Non-time series data are subject only to the internal consistency constraint, so these two rules can also be used to recover non-time series data.
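Rules 1 and 2 amount to a two-way decision per data object. A minimal Python sketch follows, assuming that S_t(X_i) is the value currently stored in the database and UI_T(X_i) is the value that T's update would install; the function name and the string return values are hypothetical.

```python
def recover_non_time_series(submitted: bool, state, update_image) -> str:
    """Return the recovery operation for one object in US(T).

    state        -- S_t(X_i), the value found in the database (assumed reading)
    update_image -- UI_T(X_i), the value written by T (assumed reading)
    """
    if not submitted and state == update_image:
        return "undo"   # Rule 1: an unsubmitted update reached the database
    if submitted and state != update_image:
        return "redo"   # Rule 2: a submitted update is missing from the database
    return "none"       # the database already reflects the correct outcome


print(recover_non_time_series(False, 42, 42))  # undo
print(recover_non_time_series(True, 7, 42))    # redo
print(recover_non_time_series(True, 42, 42))   # none
```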
Time series data recovery rule
Rule 3. If \( \exists X_i \in US(T) \) satisfying \( S_t(X_i) = UI_T(X_i) \) and \( t \le ST(X_i) + VI(X_i) \), then, whether or not T has been submitted, no recovery operation is needed for \( X_i \).
Rule 4. If \( \exists X_i \in US(T) \) satisfying \( S_t(X_i) \ne UI_T(X_i) \) and \( t \le ST(X_i) + VI(X_i) \), then execute the redo operation for \( X_i \).
Rule 5. If \( \exists X_i \in US(T) \) satisfying \( t > ST(X_i) + VI(X_i) \), then resample by starting the data receiving transaction of \( X_i \).
Theorem 2. Rules 3–5 can recover the consistency between the internal and external states of time series data.
Proof: The recovery of time series data \( X_i \) depends on the consistency between its internal state \( S_t(X_i) \) and its external state \( UI_T(X_i) \), not on whether the transaction has been submitted.
When \( t \le ST(X_i) + VI(X_i) \), if \( S_t(X_i) \ne UI_T(X_i) \), i.e., the internal and external states of \( X_i \) are not consistent, then, whether or not T has been submitted, the redo operation should be executed according to \( UI_T(X_i) \) (Rule 4); if \( S_t(X_i) = UI_T(X_i) \), i.e., the internal and external states of \( X_i \) are consistent, then, whether or not T has been submitted, no recovery operation is needed (Rule 3).
When \( t > ST(X_i) + VI(X_i) \), executing an undo or redo operation is meaningless, and the data receiving transaction should be restarted immediately to resample \( X_i \) and restore the consistency between its internal and external states (Rule 5).
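The case analysis in the proof of Theorem 2 maps directly onto a small decision function. The Python sketch below is illustrative only: the function name and return strings are hypothetical, and it assumes the same readings of S_t, UI_T, ST, and VI as the rules above, with t the instant at which recovery runs.

```python
def recover_time_series(state, update_image, t: float, st: float, vi: float) -> str:
    """Return the recovery operation for one time series object X_i.

    The submission status of T is deliberately not a parameter:
    Rules 3-5 do not depend on it.
    """
    if t > st + vi:
        return "resample"   # Rule 5: the value has expired, sample it again
    if state != update_image:
        return "redo"       # Rule 4: still valid, but the database is stale
    return "none"           # Rule 3: still valid and already up to date


# Hypothetical object sampled at ST = 10 with VI = 5.
print(recover_time_series(21.5, 21.5, t=12.0, st=10.0, vi=5.0))  # none
print(recover_time_series(20.0, 21.5, t=12.0, st=10.0, vi=5.0))  # redo
print(recover_time_series(20.0, 21.5, t=16.0, st=10.0, vi=5.0))  # resample
```

Testing expiry first reflects the proof: once the validity interval has passed, neither undo nor redo is meaningful, so the state comparison is skipped.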
Real world state recovery rule
In embedded realtime applications, if a transaction has been submitted and has changed the state of the real world, no recovery is needed; if the transaction has not been submitted, compensation is needed to reverse the changes it has made to the real world.
Rule 6. If T has not been submitted, then for each action that has already been executed, i.e., \( \forall A_i \in AS(T) \), execute a compensation or recovery task for \( A_i \).
Theorem 3. Rule 6 can recover the consistency of the real world state.
Proof: Manipulating transactions are read-only, so they do not violate the consistency of data objects. The atomicity of a manipulating transaction requires that either all actions of T, \( AS(T) = \{A_i \mid 1 \le i \le h\} \), are executed or none of them is.
Let \( OAS(T) = \{A_j \mid 1 \le j \le h\} \) be the set of actions of T that have been executed when a fault occurs. According to Rule 6, when \( OAS(T) \ne \emptyset \) and \( OAS(T) \ne AS(T) \), we compensate and recover for every \( A_j \in OAS(T) \). So, the real world states that have been changed can be recovered correctly.
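Rule 6 can be sketched as a loop over the executed actions OAS(T). In the Python sketch below, the action names, the `compensators` mapping, and the reverse-order traversal are assumptions made for illustration; the text requires only that every executed action of an unsubmitted transaction be compensated and does not prescribe an order.

```python
from typing import Callable, Dict, List


def compensate(submitted: bool, executed: List[str],
               compensators: Dict[str, Callable[[], None]]) -> List[str]:
    """Apply Rule 6 and return the actions that were compensated.

    executed     -- OAS(T), in the order the actions were carried out
    compensators -- hypothetical mapping from an action to its compensating task
    """
    if submitted:
        return []                       # submitted transactions need no recovery
    compensated = []
    for action in reversed(executed):   # newest first (a design assumption)
        compensators[action]()
        compensated.append(action)
    return compensated


# Hypothetical actuator commands used only for illustration.
log: List[str] = []
compensators = {
    "open_valve": lambda: log.append("close_valve"),
    "start_pump": lambda: log.append("stop_pump"),
}
print(compensate(False, ["open_valve", "start_pump"], compensators))
print(log)
```

Undoing the most recent action first is a common choice when later actions depend on earlier ones, as with a pump that should stop before its valve closes.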
Transaction restart rule
Absence of manual intervention is a typical feature of embedded realtime databases, so the database system should restart transactions automatically when faults occur. Two kinds of transactions need to be restarted. The first kind is periodic transactions whose restart period has passed or whose running time has exceeded the running period. The second kind is nonperiodic transactions that did not finish successfully but still satisfy all consistency constraints.
Rule 7. For a periodic transaction T, if T does not finish normally, or T finishes normally and satisfies \( t \ge BT(T) + P(T) \), then restart T.
Rule 8. For a nonperiodic transaction T that does not finish normally, restart T if the following conditions are satisfied at the same time:
(1) \( t + EET(T) \le D_R(T) \);
(2) \( \forall X_i \in RS(T) \), \( t \le ST(X_i) + VI(X_i) \);
(3) \( \forall X_i, X_j \in RS(T) \) with \( i \ne j \) and \( 1 \le i, j \le n \), \( |ST(X_i) - ST(X_j)| \le R_{mvi}(T) \).
Rule 8 mirrors Corollary 1: when a fault occurs, a transaction can be restarted only if all of its consistency constraints can still be satisfied.
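The two restart rules can be written as predicates evaluated at recovery time t. The Python sketch below is a hedged illustration: it assumes BT(T) is the begin time of the current period, P(T) the period, and EET(T) the estimated execution time, and it represents RS(T) as a list of hypothetical (ST, VI) pairs.

```python
from itertools import combinations
from typing import List, Tuple


def restart_periodic(finished_normally: bool, t: float, bt: float, period: float) -> bool:
    """Rule 7: restart after an abnormal end, or once the next period is due."""
    return (not finished_normally) or t >= bt + period


def restart_nonperiodic(finished_normally: bool, t: float, eet: float, d_r: float,
                        reads: List[Tuple[float, float]], r_mvi: float) -> bool:
    """Rule 8: restart only if a rerun can still finish consistently.

    reads -- (ST(X_i), VI(X_i)) pairs for the objects in RS(T)
    """
    if finished_normally:
        return False
    in_time = t + eet <= d_r                                  # condition (1)
    still_valid = all(t <= st + vi for st, vi in reads)       # condition (2)
    mutual = all(abs(a[0] - b[0]) <= r_mvi
                 for a, b in combinations(reads, 2))          # condition (3)
    return in_time and still_valid and mutual


reads = [(9.0, 5.0), (10.0, 5.0)]
print(restart_periodic(True, t=10.0, bt=0.0, period=5.0))                 # True
print(restart_nonperiodic(False, 10.0, 2.0, 15.0, reads, r_mvi=2.0))      # True
print(restart_nonperiodic(False, 10.0, 6.0, 15.0, reads, r_mvi=2.0))      # False
```

In the last call a rerun would end at 16, after D_R(T) = 15, so restarting would only waste CPU time on a transaction that cannot meet its deadline.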
Log-based recovery strategy
In order to recover from faults, embedded realtime main memory databases need to log the time and triggered actions for each transaction and data object. These logs include realtime transaction logs, data logs, and action logs. Taking into account the limits of CPU, storage, and energy in embedded systems, we propose the following data recovery strategies based on the rules in the last section.
Strategy 1. If X is short time-limited time series data, then there is no need to log the updates of X.
Strategy 2. If \( \frac{\left| AFI(X_i) - BFI(X_i) \right|}{BFI(X_i)} \ge \delta(X_i) \), then log the current data update operation; otherwise, log nothing.
Strategy 3. Update time series data objects immediately; that is, update the state of the database before a transaction is submitted.
Strategy 4. Update non-time series data objects in deferred mode; that is, update the state of the database when a transaction is submitted.
Strategies 1 and 2 can greatly reduce the overhead of logging updates of time series data and also speed up recovery. Strategy 3 ensures that the latest states of time series data are written to the database, which reduces the redo operations for time series data. Strategy 4 removes the need for undo logs and undo recovery of non-time series data, which further reduces the overhead of storage and recovery.
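Strategies 1 and 2 together decide whether an update of a time series object is logged at all. The Python sketch below reads the condition in Strategy 2 as a relative change between the value before the update, BFI(X_i), and the value after it, AFI(X_i); that reading, the function name, and the handling of a zero BFI are assumptions, since the text does not define them.

```python
def should_log(bfi: float, afi: float, delta: float,
               short_time_limited: bool = False) -> bool:
    """Decide whether to log one update of a time series object.

    bfi   -- BFI(X_i), value before the update (assumed reading)
    afi   -- AFI(X_i), value after the update (assumed reading)
    delta -- delta(X_i), the state change threshold
    """
    if short_time_limited:
        return False            # Strategy 1: the value expires too soon to be worth logging
    if bfi == 0:
        return afi != bfi       # assumption: the ratio is undefined, so log any change
    return abs(afi - bfi) / abs(bfi) >= delta   # Strategy 2


# Hypothetical temperature readings with a 5% threshold.
print(should_log(20.0, 22.0, delta=0.05))                            # True
print(should_log(20.0, 20.5, delta=0.05))                            # False
print(should_log(20.0, 22.0, delta=0.05, short_time_limited=True))   # False
```

With a 5% threshold, a change from 20.0 to 20.5 (2.5%) is skipped, which is where the saving in log buffer space would come from.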
Based on the above strategies, we propose corresponding recovery algorithms for data receiving transactions, control (manipulating) transactions, and data processing transactions.
Experiments
Experimental setting
In the experiments, we implement the proposed log-based recovery algorithm on the eXtremeDB embedded database [19] and compare it with the traditional recovery method and the method in eXtremeDB. The experiments use a small database, and the operations include insert, delete, and modification. Query operations are not included, because they do not change the data in the database and the recovery strategy does not need to consider them. We mainly compare the system overhead, the overtime transaction ratio (the ratio of transactions that exceed the validity interval), and the rejecting service time (downtime). The meanings and values of the experimental parameters in eXtremeDB are given in Table 1.
Experimental results
Firstly, we compare the CPU utilization and log buffer utilization of the three approaches; the results are shown in Figs. 1 and 2, respectively. The CPU utilization of the proposed approach is higher than that of the other two, because the proposed approach uses main memory to store data and has the highest throughput. The log buffer utilization of the proposed approach is the lowest, which means that it logs only the necessary data and uses the log buffer most efficiently.
Secondly, we compare the ratio of transactions exceeding the validity interval in Fig. 3 and the average rejecting service time in Fig. 4. The ratio of transactions exceeding the validity interval is also the ratio of missed transactions. Figure 3 shows that the proposed approach has the fewest missed transactions. Rejecting service time is also called downtime. Figure 4 shows that the proposed approach has the lowest average downtime.
Next, for the proposed approach, we observe how the overtime transaction ratio changes under different values of “per_short” (short time-limited data ratio) and “threshold” (time series data state change threshold); the results are shown in Figs. 5 and 6, respectively. In Fig. 5, the order of the overtime transaction ratios for different values of per_short is 0 > 0.5 > 0.1 > 0.3 > 0.2, which means that per_short must be selected carefully to minimize the overtime transaction ratio. Here, per_short = 0.2 is the best. In Fig. 6, the order of the overtime transaction ratios for different values of threshold is the same as that for per_short, so the same conclusion holds for threshold.
Finally, we observe the time series data ratio of the proposed approach under different update modes; the results are shown in Fig. 7. The figure shows that the hybrid of deferred and immediate update modes has the lowest time series data ratio, which means that the hybrid update mode eliminates the overhead of undo recovery for the invariant data objects and thus reduces the ratio of transactions exceeding the validity interval.
Conclusions
In this paper, we study the problem of data recovery strategy in embedded realtime main memory databases. Because of the realtime requirement in embedded systems, consistency in embedded realtime main memory databases differs from that in traditional databases. We analyzed both data consistency and transaction consistency in embedded realtime main memory databases, designed rules for a correct recovery strategy, and proposed realtime log-based recovery algorithms for different types of transactions. The experiments show that the proposed approach is more effective and efficient than both the traditional recovery method and the method in the eXtremeDB database system. The proposed recovery algorithm can be integrated into the eXtremeDB database and thus provide better recovery performance. Integrating the proposed algorithm into other main memory databases will be our future work.
References
1. A Nori, Mobile and embedded databases, in Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data (ACM, 2007), pp. 1175–1177
2. V Narayanan, Y Xie, Reliability concerns in embedded system designs. Computer 39(1), 118–120 (2006)
3. H Garcia-Molina, K Salem, Main memory database systems: an overview. Knowl. Data Eng. IEEE Trans. 4(6), 509–516 (1992)
4. J Stankovic, SH Son, J Hansson, Misconceptions about real-time databases. Computer 32(6), 29–36 (1999)
5. K Ramamritham, Real-time databases. Distrib. Parallel Databases 1(2), 199–226 (1993)
6. G Özsoyoğlu, RT Snodgrass, Temporal and real-time databases: a survey. Knowl. Data Eng. IEEE Trans. 7(4), 513–532 (1995)
7. K Ramamritham, SH Son, LC Dipippo, Real-time databases and data services. Real-Time Syst. 28(2–3), 179–215 (2004)
8. KH Kim, HO Welch, Distributed execution of recovery blocks: an approach for uniform treatment of hardware and software faults in real-time applications. Comput. IEEE Trans. 38(5), 626–636 (1989)
9. RM Sivasankaran, K Ramamritham, JA Stankovic et al., Data placement, logging and recovery in real-time active databases, in Active and Real-Time Database Systems (ARTDB-95) (Springer, London, 1996), pp. 226–241
10. NR Soparkar, A Silberschatz, HF Korth, Time-constrained transaction management: real-time constraints in database transaction systems (Kluwer Academic Publishers, 1996)
11. MI Seltzer, MA Olson, Challenges in embedded database system administration, in Proceeding of the Embedded System Workshop, 1999, pp. 29–31
12. GM Liao, JP Li, Research on timely recovery technology of memory database, in Wavelet Active Media Technology and Information Processing (ICWAMTIP), 2012 International Conference on (IEEE, 2012), pp. 268–271
13. A Kemper, T Neumann, HyPer: a hybrid OLTP&OLAP main memory database system based on virtual memory snapshots, in Data Engineering (ICDE), 2011 IEEE 27th International Conference on (IEEE, 2011), pp. 195–206
14. KY Lam, TW Kuo, Real-time database systems: architecture and techniques (Kluwer Academic Publishers, 2001)
15. LC Shu, JA Stankovic, SH Son, Achieving bounded and predictable recovery using real-time logging. Comput. J. 47(3), 373–394 (2004)
16. T Niklander, K Raatikainen, Using logs to increase availability in real-time main-memory database, in Parallel and Distributed Processing (Springer, Berlin Heidelberg, 2000), pp. 720–726
17. N Malviya, A Weisberg, S Madden et al., Rethinking main memory OLTP recovery, in Data Engineering (ICDE), 2014 IEEE 30th International Conference on (IEEE, 2014), pp. 604–615
18. E Levy, A Silberschatz, Incremental recovery in main memory database systems. Knowl. Data Eng. IEEE Trans. 4(6), 529–540 (1992)
19. MC Majhi, AK Behera, NM Kulshreshtha et al., ExtremeDB: a unified web repository of extremophilic archaea and bacteria, 2013
Acknowledgements
The work was supported by the following funds: Hunan Provincial Natural Science Foundation of China (Grant No.2015JJ6043); Hunan University of Science and Engineering; Scientific Research Fund of Hunan Provincial Education Department(Grant No.12A054); The Construct Program of the Key Discipline in Hunan University of Science and Engineering(Circuits and Systems).
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 Embedded system
 Realtime main memory database
 Recovery strategy
 Consistency