(Solved): CIS 355: Business Data Warehousing and Dimensional Modeling - 4th April 2020...
CIS 355: Business Data Warehousing and Dimensional Modeling
(Term: Spring 2021, BDA Program)
1) Duration: 30 minutes
2) This is a closed book quiz.
3) Be brief and neat.
4) Try to answer all questions.
Question 1: (1.5 points)
Determine whether the following statements are true or false. If false, explain why the statement is false, and how to make it true.
1. ETL (or data load) in data warehouse is an easy task and takes little time.
2. Overlapping dimensions refer to the situation where two dimension tables have “just enough in common” to support drill-across.
3. Drill across refers to an operation that navigates among different levels of data ranging from the most summarized to the most detailed.
Question 2: (2.5 points)
The following is a partial Star schema that models student enrollment in courses taught by faculty members at ASU.
a. Why there is no measurement in the “enrollment_fact” table? What type of fact table is it?
b. Is it possible to add a tracking fact/measurement to the “enrollment_fact” table? What value will the tracking fact take?
c. If we want to answer the question “what course is the most popular course?” where popularity of the course is measured by the number of students enrolled in a course, what tables do you need? What SQL query operations do you need?
Question 3: (4 points)
The following is the ETL data flow task we discussed in class to load data (from transaction DB “AdventureWorks2014”) into the dimension table “DimSalesTerritory” in the data warehouse. Based on the data flow, answer the following questions:
a. What does the node “OLE DB Source” do in the data flow task?
b. There are two branches beneath the “Slowly Changing Dimension” node, please briefly discuss (using two sentences maximum) what each of the branches does.
c. When the data flow task is executed, we noticed that 10 rows of data were affected from “OLE DB Source” to “Lookup” to “Slowly Changing Dimension.” However, no action (data affected) seems to be taken from “Slowly Changing Dimension” node to “Insert Destination” and “OLE DB Command” (see image below). Why is it the case?
d. Now suppose that the name of the territory for TerritoryID = 3 has been changed from ‘Central’ to ‘Midwest’ in the transaction DB “AdventureWorks2014” since the last ETL run. If we re-run the ETL process, which path of the data flow task will be affected and what are the number of rows (records) affected on the path?