TRADITIONAL DATA WAREHOUSES (Chapter 3.5 in “All AWS Data Analytics Services”)

Schematic of an OLAP Cube Used in Traditional Data Warehouses

Schematic of an OLAP Cube Used in Traditional Data Warehouses

3.5  TRADITIONAL DATA WAREHOUSES

A traditional data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis, and is considered a core component of business intelligence.

Online analytical processing (OLAP) cubes are multi-dimensional data structures that traditional Data Warehouse use to contain the data that you import. The cubes divide the data into subsets that are defined by dimensions.

In a dimensional approach, transaction data are partitioned into “facts”, which are generally numeric transaction data, and “dimensions“, which are the reference information that gives context to the facts. For example, a sales transaction can be broken up into facts such as the number of products ordered and the total price paid for the products, and into dimensions such as order date, customer name, product number, order ship-to and bill-to locations, and salesperson responsible for receiving the order.

A key advantage of a dimensional approach is that the data warehouse is easier for the user to understand and to use. Also, the retrieval of data from the data warehouse tends to operate very quickly. Dimensional structures are easy to understand for business users, because the structure is divided into measurements/facts and context/dimensions. Facts are related to the organization’s business processes and operational system whereas the dimensions surrounding them contain context about the measurement. Another advantage offered by dimensional model is that it does not involve a relational database every time. Thus, this type of modeling technique is very useful for end-user queries in data warehouse.

The main disadvantages of the dimensional approach are the following:

  1. In order to maintain the integrity of facts and dimensions, loading the data warehouse with data from different operational systems is extremely complicated and time-consuming.
  2. The mapping of the disparate data sources involved very complex mapping of each column, done through the “T” portion of ETL (Extract, Transform & Load). Depending on how much data was to be mapped, the types of disparate data sources, and data competion & cleanliness often led this process to take many, many months done by highly-skilled, highly-paid specialists.
  3. It is difficult to modify the data warehouse structure if the organization adopting the dimensional approach changes the way in which it does business. Normally, a new cube would have to be made to answer a different set of analytical questions.

In Chapter 9, “Amazon Redshift Data Warehouse”, a comparison is done between Amazon Redshift’s modern approach to a Data Warehousing vs. this traditional approach to Data Warehousing.

Read the previous post here.

Read the next post here.

#gottaluvAWS! #gottaluvAWS Marketplace!

This entry was posted in Dimensions & Measures, OLAP Cubes, Traditional Data Warehousing. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s