Briefly describe the various implementations of the process pairs concept. Comment on how process pairs may be useful in implementing a fault-tolerant distributed DBMS.
Answers
Answer:
We have referred to “reliability” and “availability” of the database a number of times
so far without defining these terms precisely. Specifically, we mentioned these terms
in conjunction with data replication, because the principal method of building a
reliable system is to provide redundancy in system components. We also claimed
in Chapter 1 that the distribution of data enhances system reliability. However, the
distribution of the database or the replication of data items is not sufficient to make
the distributed DBMS reliable. A number of protocols need to be implemented within
the DBMS to exploit this distribution and replication in order to make operations
more reliable.
A reliable distributed database management system is one that can continue
to process user requests even when the underlying system is unreliable. In other
words, even when components of the distributed computing environment fail, a
reliable distributed DBMS should be able to continue executing user requests without
violating database consistency.
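To connect this definition to the process pairs idea raised in the question, here is a minimal sketch in Python of one possible variant: a primary process that checkpoints every update to a backup, which takes over when the primary fails. The class names (Process, ProcessPair), the in-memory dictionary state, and the synchronous checkpoint are all assumptions made for illustration; they are not taken from the text or from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """One member of a hypothetical process pair holding key-value state."""
    name: str
    alive: bool = True
    state: dict = field(default_factory=dict)


class ProcessPair:
    """Primary/backup pair: the primary checkpoints each update to the backup."""

    def __init__(self):
        self.primary = Process("primary")
        self.backup = Process("backup")

    def write(self, key, value):
        # Fail over first if the primary is down, so the request is still served.
        if not self.primary.alive:
            self.fail_over()
        self.primary.state[key] = value
        # Checkpoint (assumed synchronous): copy the update to the backup
        # before acknowledging the request.
        if self.backup.alive:
            self.backup.state[key] = value
        return "ok"

    def read(self, key):
        if not self.primary.alive:
            self.fail_over()
        return self.primary.state.get(key)

    def fail_over(self):
        # The backup takes over with the last checkpointed state, and a fresh
        # backup is started from a copy of that state.
        if not self.backup.alive:
            raise RuntimeError("both processes of the pair have failed")
        new_backup = Process("new-backup", state=dict(self.backup.state))
        self.primary, self.backup = self.backup, new_backup
```

A short usage example: after a write, marking the primary as failed and then reading the same key is served by the former backup.

```python
pair = ProcessPair()
pair.write("x", 1)
pair.primary.alive = False  # simulate a crash of the primary
value = pair.read("x")      # the former backup now answers the request
```

Because every acknowledged update already exists on the backup, the takeover loses no acknowledged work, which is the property a fault-tolerant DBMS component would rely on in this sketch.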
The purpose of this chapter is to discuss the reliability features of a distributed
DBMS. From Chapter 10 the reader will recall that the reliability of a distributed
DBMS refers to the atomicity and durability properties of transactions. Two specific
aspects of reliability protocols that need to be discussed in relation to these properties
are the commit and the recovery protocols. In that sense, in this chapter we relax one
of the major assumptions of Chapter 11: that the underlying distributed system is
fully reliable and does not experience any hardware or software failures. Furthermore,
the commit protocols discussed in this chapter constitute the support provided by the
distributed DBMS for the execution of commit commands in transactions.
The organization of this chapter is as follows. We start with a definition of the
fundamental reliability concepts and reliability measures in Section 12.1. In Section
12.2 we discuss the reasons for failures in distributed systems and focus on the types
of failures in distributed DBMSs. Section 12.3 focuses on the functions of the local
recovery manager and provides an overview of reliability measures in centralized
DBMSs. This discussion forms the foundation for the distributed commit and recovery
protocols, which are introduced in Section 12.4. In Sections 12.5 and 12.6 we present
detailed protocols for dealing with site failures and network partitioning, respectively.