Difference between revisions of "OLAP"

From TedYunWiki
Jump to navigation Jump to search
Line 38: Line 38:
  
 
=== OLAP Queries ===
 
=== OLAP Queries ===
; Join -> Filter -> Group -> Aggregate
+
; Join Filter Group Aggregate
; Performance
+
* Performance
:# Inherently very slow: special indexes, query processing techniques
+
*# Inherently very slow: special indexes, query processing techniques
:# Extensive use of materialized views
+
*# Extensive use of materialized views
 +
 
 +
=== Data Cube (Multidimensional OLAP) ===
 +
* Dimension data forms axes of "cube"
 +
* Fact (dependent) data in cells)
 +
* Aggregated data on sides, edges, corner
 +
 
 +
=== Drill-down and Roll-up ===
 +
* Drilling-down corresponds to ''adding'' a '''GROUP BY''' attribute.
 +
* Rolling-up corresponds to ''removing'' a '''GROUP BY''' attribute.

Revision as of 05:04, 13 November 2013

Two Types of Database Activity

  1. OLTP (Online Transaction Processing)
    • short transactions
    • simple queries
    • touch small portions of data
    • frequent update
  2. OLAP (Online Analytical Processing)
    • long transactions
    • complex queries
    • touch large portions of the data
    • infrequent updates

Terminologies

  • Data Warehousing

Bring data from operational(OLTP) sources into a single "warehouse" for (OLAP) analysis

  • Decision Support System (DSS)

Infrastructure for data analysis, e.g., data warehouse tuned for OLAP

Star Schema

  • Fact table

Updated frequently, often append-only, very large
e.g. sales transactions, course enrollments, page view

  • Dimension tables

Updated infrequently, not as large
e.g. stores, items, customers, students, courses, webpages, users, advertisers

  • Fact table references dimension tables. Hence, the star shape.

Example

Fact table
Sales(storeID, itemID, custID, qty, price)
dimension attributes: storeID, itemID, custID (=> foreign keys)
dependent attributes: qty, price
Dimenstion tables
Store(storeID, city, state)
Item(itemID, category, brand, color, size)
Customer(custID, name, address)

OLAP Queries

Join → Filter → Group → Aggregate
  • Performance
    1. Inherently very slow: special indexes, query processing techniques
    2. Extensive use of materialized views

Data Cube (Multidimensional OLAP)

  • Dimension data forms axes of "cube"
  • Fact (dependent) data in cells)
  • Aggregated data on sides, edges, corner

Drill-down and Roll-up

  • Drilling-down corresponds to adding a GROUP BY attribute.
  • Rolling-up corresponds to removing a GROUP BY attribute.