Difference between revisions of "OLAP"

From TedYunWiki
Jump to navigation Jump to search
 
(11 intermediate revisions by the same user not shown)
Line 29: Line 29:
 
; Fact table
 
; Fact table
 
: Sales(storeID, itemID, custID, qty, price)
 
: Sales(storeID, itemID, custID, qty, price)
:: ''dimension attributes'': storeID, itemID, custID (=> foreign keys)
+
:* ''dimension attributes'': storeID, itemID, custID (=> ''foreign keys'')
:: ''dependent attributes'': qty, price
+
:* ''dependent attributes'': qty, price
  
 
; Dimenstion tables
 
; Dimenstion tables
Line 36: Line 36:
 
: Item(itemID, category, brand, color, size)
 
: Item(itemID, category, brand, color, size)
 
: Customer(custID, name, address)
 
: Customer(custID, name, address)
 +
 +
=== OLAP Queries ===
 +
; Join → Filter → Group → Aggregate
 +
* Performance
 +
*# Inherently very slow: special indexes, query processing techniques
 +
*# Extensive use of materialized views
 +
 +
=== Data Cube (Multidimensional OLAP) ===
 +
* Dimension data forms axes of "cube"
 +
* Fact (dependent) data in cells)
 +
* Aggregated data on sides, edges, corner
 +
 +
=== Drill-down and Roll-up ===
 +
* Drilling-down corresponds to ''adding'' a '''GROUP BY''' attribute.
 +
* Rolling-up corresponds to ''removing'' a '''GROUP BY''' attribute.
 +
 +
=== SQL Constructs ===
 +
Two queries to support (multidimensional) OLAP: '''WITH CUBE''', '''WITH ROLLUP'''
 +
 +
Select dimension-attrs, aggregates
 +
From tables
 +
Where conditions
 +
Group By dimension-attrs <u>With Cube</u>
 +
=> Add to result: faces, edges, and corner of cube using NULL values
 +
 +
Select dimension-attrs, aggregates
 +
From tables
 +
Where conditions
 +
Group By dimension-attrs <u>With Rollup</u>
 +
=> For hierarchical dimensions, portion of With Cube.
 +
 +
=== OLAP Queries ===
 +
* Star join
 +
* Drill-down and roll-up
 +
* "Slice" and "dice"
 +
* ''With Cube'' and ''With Rollup''
 +
 +
=== References ===
 +
# Coursera Introduction to Databases (Chapter 15)
 +
# http://en.wikipedia.org/wiki/Online_analytical_processing
 +
# http://en.wikipedia.org/wiki/OLAP_cube
 +
# http://en.wikipedia.org/wiki/Essbase

Latest revision as of 05:58, 13 November 2013

Two Types of Database Activity

  1. OLTP (Online Transaction Processing)
    • short transactions
    • simple queries
    • touch small portions of data
    • frequent update
  2. OLAP (Online Analytical Processing)
    • long transactions
    • complex queries
    • touch large portions of the data
    • infrequent updates

Terminologies

  • Data Warehousing

Bring data from operational(OLTP) sources into a single "warehouse" for (OLAP) analysis

  • Decision Support System (DSS)

Infrastructure for data analysis, e.g., data warehouse tuned for OLAP

Star Schema

  • Fact table

Updated frequently, often append-only, very large
e.g. sales transactions, course enrollments, page view

  • Dimension tables

Updated infrequently, not as large
e.g. stores, items, customers, students, courses, webpages, users, advertisers

  • Fact table references dimension tables. Hence, the star shape.

Example

Fact table
Sales(storeID, itemID, custID, qty, price)
  • dimension attributes: storeID, itemID, custID (=> foreign keys)
  • dependent attributes: qty, price
Dimenstion tables
Store(storeID, city, state)
Item(itemID, category, brand, color, size)
Customer(custID, name, address)

OLAP Queries

Join → Filter → Group → Aggregate
  • Performance
    1. Inherently very slow: special indexes, query processing techniques
    2. Extensive use of materialized views

Data Cube (Multidimensional OLAP)

  • Dimension data forms axes of "cube"
  • Fact (dependent) data in cells)
  • Aggregated data on sides, edges, corner

Drill-down and Roll-up

  • Drilling-down corresponds to adding a GROUP BY attribute.
  • Rolling-up corresponds to removing a GROUP BY attribute.

SQL Constructs

Two queries to support (multidimensional) OLAP: WITH CUBE, WITH ROLLUP

Select dimension-attrs, aggregates
From tables
Where conditions
Group By dimension-attrs With Cube

=> Add to result: faces, edges, and corner of cube using NULL values

Select dimension-attrs, aggregates
From tables
Where conditions
Group By dimension-attrs With Rollup

=> For hierarchical dimensions, portion of With Cube.

OLAP Queries

  • Star join
  • Drill-down and roll-up
  • "Slice" and "dice"
  • With Cube and With Rollup

References

  1. Coursera Introduction to Databases (Chapter 15)
  2. http://en.wikipedia.org/wiki/Online_analytical_processing
  3. http://en.wikipedia.org/wiki/OLAP_cube
  4. http://en.wikipedia.org/wiki/Essbase