Difference between revisions of "OLAP"

From TedYunWiki
Jump to navigation Jump to search
Line 51: Line 51:
 
* Drilling-down corresponds to ''adding'' a '''GROUP BY''' attribute.
 
* Drilling-down corresponds to ''adding'' a '''GROUP BY''' attribute.
 
* Rolling-up corresponds to ''removing'' a '''GROUP BY''' attribute.
 
* Rolling-up corresponds to ''removing'' a '''GROUP BY''' attribute.
 +
 +
=== SQL Constructs ===
 +
; '''WITH CUBE''' and '''WITH ROLLUP'''
 +
<pre>Select dimension-attrs, aggregates
 +
From tables
 +
Where conditions
 +
Group By dimension-attrs <u>With Cube</u></pre>

Revision as of 05:07, 13 November 2013

Two Types of Database Activity

  1. OLTP (Online Transaction Processing)
    • short transactions
    • simple queries
    • touch small portions of data
    • frequent update
  2. OLAP (Online Analytical Processing)
    • long transactions
    • complex queries
    • touch large portions of the data
    • infrequent updates

Terminologies

  • Data Warehousing

Bring data from operational(OLTP) sources into a single "warehouse" for (OLAP) analysis

  • Decision Support System (DSS)

Infrastructure for data analysis, e.g., data warehouse tuned for OLAP

Star Schema

  • Fact table

Updated frequently, often append-only, very large
e.g. sales transactions, course enrollments, page view

  • Dimension tables

Updated infrequently, not as large
e.g. stores, items, customers, students, courses, webpages, users, advertisers

  • Fact table references dimension tables. Hence, the star shape.

Example

Fact table
Sales(storeID, itemID, custID, qty, price)
dimension attributes: storeID, itemID, custID (=> foreign keys)
dependent attributes: qty, price
Dimenstion tables
Store(storeID, city, state)
Item(itemID, category, brand, color, size)
Customer(custID, name, address)

OLAP Queries

Join → Filter → Group → Aggregate
  • Performance
    1. Inherently very slow: special indexes, query processing techniques
    2. Extensive use of materialized views

Data Cube (Multidimensional OLAP)

  • Dimension data forms axes of "cube"
  • Fact (dependent) data in cells)
  • Aggregated data on sides, edges, corner

Drill-down and Roll-up

  • Drilling-down corresponds to adding a GROUP BY attribute.
  • Rolling-up corresponds to removing a GROUP BY attribute.

SQL Constructs

WITH CUBE and WITH ROLLUP
Select dimension-attrs, aggregates
From tables
Where conditions
Group By dimension-attrs <u>With Cube</u>