Notes and Study Materials

Distributed Database Design

 

 

Whether the database is centralized or distributed, the design principles and concepts are same. However, the design of a distributed database introduces three new issues:

• How to partition the database into fragments.

• Which fragments to replicate.

• Where to locate those fragments and replicas.

Data fragmentation and data replication deal with the first two issues and data allocation deals with the third issue.

Add a comment

Business Intelligence and Its Architecture

 

 

Business intelligence (BI)1 is a term used to describe a comprehensive, cohesive, and integrated set of tools and processes used to capture, collect, integrate, store, and analyze data with the purpose of generating and presenting information used to support business decision making.

BI is a framework that allows a business to transform data into information, information into knowledge, and knowledge into wisdom. Implementing BI in an organization involves capturing not only business data (internal and external) but also the metadata, or knowledge about the data. BI provides a well-orchestrated framework for the management of data that works across all levels of the organization. BI involves the following general steps:

Add a comment

Operational Data Vs Decision Support Data

 

 

Operational data and decision support data serve different purposes. Most operational data are stored in a relational database in which the structures (tables) tend to be highly normalized.

Operational data storage is optimized to support transactions that represent daily operations. For example, each time an item is sold, it must be accounted for.

Operational Data Vs Decision Support Data

 

Customer data, inventory data, and so on, are in a frequent update mode. To provide effective update performance, operational systems store data in many tables, each with a minimum number of fields.

From the data analyst’s point of view, decision support data differ from operational data in three main areas: time span, granularity, and dimensionality.

Operational Data Vs decision support data

Add a comment

The Data Warehouse and Data Mart

 

 

Data Warehouse is “an integrated, subject-oriented, time-variant, nonvolatile collection of data that provides support for decision making.” To understand that definition, let’s take a more detailed look at its components.

 

• Integrated:

 

The data warehouse is a centralized, consolidated database that integrates data derived from the entire organization and from multiple sources with diverse formats. Data integration implies that all business entities, data elements, data characteristics, and business metrics are described in the same way throughout the enterprise.

Add a comment

Online Analytical Processing and its Characteristics

 

 

The need for more intensive decision support prompted the introduction of a new generation of tools. Those new tools, called online analytical processing (OLAP), create an advanced data analysis environment that supports decision making, business modeling, and operations research.

 

OLAP systems share four main characteristics:

Add a comment

Relational OLAP

 

 

Relational online analytical processing (ROLAP) provides OLAP functionality by using relational databases and familiar relational query tools to store and analyze multidimensional data. That approach builds on existing relational technologies and represents a natural extension to all of the companies that already use relational database management systems within their organizations. ROLAP adds the following extensions to traditional RDBMS technology:

 

• Multidimensional data schema support within the RDBMS.

• Data access language and query performance optimized for multidimensional data.

• Support for very large databases (VLDBs).

Add a comment

Star Schema and Its Components

 

 

The star schema is a data-modeling technique used to map multidimensional decision support data into a relational database. In effect, the star schema creates the near equivalent of a multidimensional database schema from the existing relational database. The star schema was developed because existing relational modeling techniques, ER, and normalization did not yield a database structure that served advanced data analysis requirements well.

The basic star schema has four components: facts, dimensions, attributes, and attribute hierarchies.

Add a comment

Data Warehouse Implementation

 

 

Organization-wide information system development is subject to many constraints. Some of the constraints are based on available funding. Others are a function of management’s view of the role played by an IS department and of the extent and depth of the information requirements. Add the constraints imposed by corporate culture, and you understand why no single formula can describe perfect data warehouse development. Therefore, rather than proposing a single data warehouse design and implementation methodology. The following are the few factors that appear to be common to data warehousing.

Add a comment