Notes and Study Materials

Differences Between Distributed Processing and Distributed Databases

 

 

In distributed processing, a database’s logical processing is shared among two or more physically independent sites that are connected through a network. For example, the data input/output (I/O), data selection, and data validation might be performed on one computer, and a report based on that data might be created on another computer.

 

A distributed database, on the other hand, stores a logically related database over two or more physically independent sites. The sites are connected via a computer network. In contrast, the distributed processing system uses only a single-site database but shares the processing chores among several sites. In a distributed database system, a database is composed of several parts known as database fragments. The database fragments are located at different sites and can be replicated among various sites. Each database fragment is, in turn, managed by its local database process.
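The idea of fragments, replication, and one logical database can be sketched in a few lines of Python. This is only an illustration: the fragment names, site names, and the `catalog` and `local_data` structures are hypothetical, and a real DDBMS keeps this information in its distributed data dictionary.

```python
# Hypothetical catalog: fragment name -> sites that hold a copy of it.
catalog = {
    "CUSTOMER_EAST": ["site_a", "site_c"],  # replicated at two sites
    "CUSTOMER_WEST": ["site_b"],            # stored at one site only
}

# Hypothetical local storage: site -> fragment name -> rows.
local_data = {
    "site_a": {"CUSTOMER_EAST": [{"id": 1, "region": "east"}]},
    "site_b": {"CUSTOMER_WEST": [{"id": 2, "region": "west"},
                                 {"id": 3, "region": "west"}]},
    "site_c": {"CUSTOMER_EAST": [{"id": 1, "region": "east"}]},
}


def read_fragment(name):
    """Read one fragment from the first site listed for it in the catalog."""
    site = catalog[name][0]
    return local_data[site][name]


def read_logical_table():
    """Rebuild the logical CUSTOMER table as the union of its fragments."""
    rows = []
    for name in sorted(catalog):
        rows.extend(read_fragment(name))
    return rows
```

Each replicated fragment is read from a single copy only, so the replica at `site_c` does not produce duplicate rows in the logical table.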


Distributed Database Management System (DDBMS) Components

 

 

The different components of DDBMS are as follows:

• Computer workstations or remote devices (sites or nodes) that form the network system. The distributed database system must be independent of the computer system hardware.

• Network hardware and software components that reside in each workstation or device. The network components allow all sites to interact and exchange data. Because the components—computers, operating systems, network hardware, and so on—are likely to be supplied by different vendors, it is best to ensure that distributed database functions can be run on multiple platforms.


Distribution Transparency

 

 

Distribution transparency allows a physically dispersed database to be managed as though it were a centralized database. The level of transparency supported by the DDBMS varies from system to system. Three levels of distribution transparency are recognized:

• Fragmentation transparency is the highest level of transparency. The end user or programmer does not need to know that a database is partitioned. Therefore, neither fragment names nor fragment locations are specified prior to data access.

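• Location transparency exists when the end user or programmer must specify the database fragment names but does not need to specify where those fragments are located.

• Local mapping transparency exists when the end user or programmer must specify both the fragment names and their locations.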
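The difference between the levels comes down to how much the caller has to spell out. The Python sketch below illustrates this with three query functions. The fragment names (E1, E2, E3), the site names, and the sample rows are all hypothetical, and the lowest level, where the caller names both the fragment and its site, is the one usually called local mapping transparency.

```python
# Hypothetical catalog and data, for illustration only.
FRAGMENT_SITES = {"E1": "site_ny", "E2": "site_atl", "E3": "site_mia"}
SITE_DATA = {
    "site_ny": {"E1": [{"name": "Ann", "birth_year": 1955}]},
    "site_atl": {"E2": [{"name": "Bob", "birth_year": 1970}]},
    "site_mia": {"E3": [{"name": "Cy", "birth_year": 1958}]},
}


def query_local_mapping(fragment, site, predicate):
    """Lowest level: the caller names both the fragment and its site."""
    return [row for row in SITE_DATA[site][fragment] if predicate(row)]


def query_location_transparent(fragments, predicate):
    """The caller names the fragments; the sites are looked up."""
    rows = []
    for fragment in fragments:
        site = FRAGMENT_SITES[fragment]
        rows.extend(query_local_mapping(fragment, site, predicate))
    return rows


def query_fragmentation_transparent(predicate):
    """Highest level: the caller names neither fragments nor sites."""
    return query_location_transparent(sorted(FRAGMENT_SITES), predicate)
```

With fragmentation transparency the caller writes a single query against the logical table, for example `query_fragmentation_transparent(lambda r: r["birth_year"] < 1960)`, and the fragment and site lookups happen behind the scenes.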


Distributed Concurrency Control and Two-Phase Commit Protocol

 

 

Concurrency control becomes especially important in the distributed database environment because multisite, multiple-process operations are more likely to create data inconsistencies and deadlocked transactions than single-site systems are. For example, the transaction processor (TP) component of a DDBMS must ensure that all parts of the transaction are completed at all sites before a final COMMIT is issued to record the transaction.

Suppose that each transaction operation was committed by each local data processor (DP), but one of the DPs could not commit the transaction’s results. Such a scenario would yield the problems illustrated in the following figure.

The transaction(s) would yield an inconsistent database, with its inevitable integrity problems, because committed data cannot be uncommitted. The solution to the problem illustrated in the figure is a two-phase commit protocol.
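The core of a two-phase commit can be sketched as follows. This is a simplified illustration, not a production protocol: the `DataProcessor` class, its `can_commit` flag, and the state names are hypothetical, and logging, timeouts, and crash recovery are left out.

```python
class DataProcessor:
    """Hypothetical local DP that takes part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether the local work succeeded
        self.state = "INIT"

    def prepare(self):
        # Phase 1: vote yes only if the local part can be made durable.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self):
        self.state = "COMMITTED"

    def abort(self):
        self.state = "ABORTED"


def two_phase_commit(participants):
    """Coordinator logic: commit everywhere or abort everywhere."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = [dp.prepare() for dp in participants]

    # Phase 2 (decision): commit only if every vote was yes.
    if all(votes):
        for dp in participants:
            dp.commit()
        return "COMMITTED"

    for dp in participants:
        dp.abort()
    return "ABORTED"
```

Because no DP commits until every DP has voted yes, a single failing site leads to a global abort instead of a partially committed transaction.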


Performance Transparency and Query Optimization in DDBMS

 

 

One of the most important functions of a database is its ability to make data available. Because all data reside at a single site in a centralized database, the DBMS must evaluate every data request and find the most efficient way to access the local data.

 

In contrast, the DDBMS makes it possible to partition a database into several fragments, thereby rendering the query translation more complicated, because the DDBMS must decide which fragment of the database to access.

 

In addition, the data may be replicated at several different sites. Data replication makes the access problem even more complex, because the DDBMS must decide which copy of the data to access. The DDBMS uses query optimization techniques to deal with such problems and to ensure acceptable database performance.
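The choice between replicated copies can be pictured as a small cost comparison. The sketch below assumes a deliberately simple, made-up cost model (a fixed network latency plus a per-row transfer cost); real optimizers weigh many more factors, and the site names and numbers are hypothetical.

```python
def estimated_cost(replica, rows_needed):
    """Assumed cost model: fixed latency plus a per-row transfer cost (ms)."""
    return replica["latency_ms"] + replica["ms_per_row"] * rows_needed


def cheapest_copy(replicas, rows_needed):
    """Pick the replica with the lowest estimated cost for this request."""
    return min(replicas, key=lambda r: estimated_cost(r, rows_needed))


# Hypothetical replicas of the same fragment at two sites.
replicas = [
    {"site": "site_a", "latency_ms": 5, "ms_per_row": 0.5},
    {"site": "site_b", "latency_ms": 50, "ms_per_row": 0.01},
]
```

Under these assumed numbers, a request for 10 rows is cheaper at `site_a` (low latency), while a request for 1,000 rows is cheaper at `site_b` (low per-row cost), so the best copy depends on the query.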


Client/Server Vs. DDBMS or Advantages of Client/Server Architecture

 

 

Client/server architecture refers to the way in which computers interact to form a system. The client/server architecture features a user of resources, or a client, and a provider of resources, or a server.

The client/server architecture can be used to implement a DBMS in which the client is the transaction processor (TP) and the server is the data processor (DP). Client/server interactions in a DDBMS are carefully scripted.

The client (TP) interacts with the end user and sends a request to the server (DP). The server receives, schedules, and executes the request, selecting only those records that are needed by the client. The server then sends the data to the client only when the client requests the data.
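The request/response exchange described above can be sketched with two small classes. This is a minimal in-process illustration: the `Client` and `Server` classes and their method names are hypothetical, and a real system would send the request over a network.

```python
class Server:
    """Plays the DP role: holds the data and executes the selection."""

    def __init__(self, records):
        self._records = records

    def execute(self, field, value):
        # Only the matching records are returned, not the whole table.
        return [r for r in self._records if r.get(field) == value]


class Client:
    """Plays the TP role: turns a user request into a server request."""

    def __init__(self, server):
        self._server = server

    def request(self, field, value):
        return self._server.execute(field, value)
```

The selection runs on the server, so the client receives only the records it asked for.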


Different Levels of Data and Process Distribution

 

 

Current database systems can be classified on the basis of how process distribution and data distribution are supported.

For example, a DBMS may store data in a single site (centralized DB) or in multiple sites (distributed DB) and may support data processing at a single site or at multiple sites. The different types of data and process distribution methods are as follows.
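The two dimensions can be combined into a small classification function. The labels used here (SPSD, MPSD, MPMD) are the ones commonly found in database textbooks and are an assumption, since the notes do not name them; the function and its parameters are hypothetical.

```python
def classify(process_sites, data_sites):
    """Classify a system by the number of processing sites and data sites."""
    if process_sites < 1 or data_sites < 1:
        raise ValueError("site counts must be at least 1")
    if data_sites == 1:
        # Centralized data: processing is either local or shared.
        if process_sites == 1:
            return "SPSD (single-site processing, single-site data)"
        return "MPSD (multiple-site processing, single-site data)"
    if process_sites > 1:
        return "MPMD (multiple-site processing, multiple-site data)"
    # Data spread over several sites implies processing at several sites.
    return "not applicable"
```

For instance, `classify(3, 1)` describes several workstations sharing one central database, while `classify(3, 3)` describes a fully distributed database.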


Transaction Transparency

 

 

Transaction transparency is a DDBMS property that ensures that database transactions will maintain the distributed database’s integrity and consistency. Remember that a DDBMS database transaction can update data stored in many different computers connected in a network. Transaction transparency ensures that the transaction will be completed only when all database sites involved in the transaction complete their part of the transaction.

Distributed database systems require complex mechanisms to manage transactions and to ensure the database’s consistency and integrity. To understand how the transactions are managed, you should know the basic concepts governing remote requests, remote transactions, distributed transactions, and distributed requests.

 

Distributed Requests and Distributed Transactions:
