OLAP

From TedYunWiki
Revision as of 05:58, 13 November 2013 by Tedyun (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Two Types of Database Activity

  1. OLTP (Online Transaction Processing)
    • short transactions
    • simple queries
    • touch small portions of data
    • frequent update
  2. OLAP (Online Analytical Processing)
    • long transactions
    • complex queries
    • touch large portions of the data
    • infrequent updates

Terminologies

  • Data Warehousing

Bring data from operational(OLTP) sources into a single "warehouse" for (OLAP) analysis

  • Decision Support System (DSS)

Infrastructure for data analysis, e.g., data warehouse tuned for OLAP

Star Schema

  • Fact table

Updated frequently, often append-only, very large
e.g. sales transactions, course enrollments, page view

  • Dimension tables

Updated infrequently, not as large
e.g. stores, items, customers, students, courses, webpages, users, advertisers

  • Fact table references dimension tables. Hence, the star shape.

Example

Fact table
Sales(storeID, itemID, custID, qty, price)
  • dimension attributes: storeID, itemID, custID (=> foreign keys)
  • dependent attributes: qty, price
Dimenstion tables
Store(storeID, city, state)
Item(itemID, category, brand, color, size)
Customer(custID, name, address)

OLAP Queries

Join → Filter → Group → Aggregate
  • Performance
    1. Inherently very slow: special indexes, query processing techniques
    2. Extensive use of materialized views

Data Cube (Multidimensional OLAP)

  • Dimension data forms axes of "cube"
  • Fact (dependent) data in cells)
  • Aggregated data on sides, edges, corner

Drill-down and Roll-up

  • Drilling-down corresponds to adding a GROUP BY attribute.
  • Rolling-up corresponds to removing a GROUP BY attribute.

SQL Constructs

Two queries to support (multidimensional) OLAP: WITH CUBE, WITH ROLLUP

Select dimension-attrs, aggregates
From tables
Where conditions
Group By dimension-attrs With Cube

=> Add to result: faces, edges, and corner of cube using NULL values

Select dimension-attrs, aggregates
From tables
Where conditions
Group By dimension-attrs With Rollup

=> For hierarchical dimensions, portion of With Cube.

OLAP Queries

  • Star join
  • Drill-down and roll-up
  • "Slice" and "dice"
  • With Cube and With Rollup

References

  1. Coursera Introduction to Databases (Chapter 15)
  2. http://en.wikipedia.org/wiki/Online_analytical_processing
  3. http://en.wikipedia.org/wiki/OLAP_cube
  4. http://en.wikipedia.org/wiki/Essbase