VicRoads would like to have a good understanding of the areas and times of the day that have high traffic so they could consider measures to address that.
They have some data available from the traffic collection system and have hired you as a consultant to build a data warehouse and an interface that will allow them to undertake further analysis of the traffic data.
The assignment is divided into four major tasks:
(i) You FIRST need to upload the data from the.CSV file provided by VicRoads (available on LMS) into an operational database. It is up to you to decide the number of tables and propose a normalised structure to accommodate this data.
The .CSV file supplied has 225146 records and has the following fields:
FLOW_ID – The unique identifier of the traffic flow measured; LINK_ID – The unique identifier of the road link being measured; LINK_DESC – A description of the road link being measured; FLOW_DIR – The direction of the traffic flow;
PERIOD_TYPE – The type of period when traffic was measured;
F00 to F33 – The traffic flow measured at each hour, i.e. 0:00, 1:00….23:00
(ii) Your SECOND task is to design a data warehouse to enable various analytics on the traffic data. You need to design the data warehouse using a star (or snowflake) schema and populate the schema with the appropriate data from the underlying database that you created at point (i), using SQL queries. You may need to create other data as applicable (e.g. for time dimension).
(iii) Your THIRD task is to provide a web interface in QlikView® that can help VicRoads managers to navigate and analyze the data in the proposed data warehouse.
The interface will allow users to answer queries including but not limited to:
a) What are the top 5 busiest links during hours public holidays, school holidays and normal time (school term)? What is the average traffic on those links during those periods?
b) What are the 3 least busy links during hours public holidays, school holidays and normal time (school term)?
c) What are the peak time hours for Monash Freeway links? What about Princes Way links?
d) What is the trend of traffic volumes during the day, during public holidays, school holidays and normal time (school term)?
e) During public holidays and school holidays, are mornings, afternoons or evenings most popular for travel? What is the average traffic during those periods?
What parameters/selections you provide the user with, and whether the answers to these questions will be shown in the QlikView© application as graphs, tables, or a combination of both, are your design decisions.
(iv) Your FOURTH task is to create an implementation report documenting all the steps of the solution. The report should include at a minimum:
• A cover page with the name of the solution, and names & IDs of the students in the group;
• The description of the operational database created at point (i);
• The description of the star schema (or snowflake schema) created at point (ii); all tables should be defined, including keys as applicable;
• Screenshots for the interface built at point (iii), explaining the UI components used;
• An appendix for all source code, matching the solution (i.e. you’ll need to have source code to create and populate the operational database, and source code to create and populate the data warehouse);
MARKING GUIDELINE (TOTAL: 100 MARKS)
Backend Design and Implementation [50 marks]
Operational database design & implementation [10 marks]
StarSchema/SnowFlake DW Schema Design [15 marks]
Write and implement query scripts to populate the DW Schema [25 marks]
Interface and Analytics [30 marks]
QlikView application implemented [10 marks]
User is able to answer each of the questions a) to e) listed in the task 3 [5 * 4 =20 marks]
Documentation and Presentation [20 marks]
Professional implementation report (structure, fluency, grammar) [10 marks]
Demo presentation skills (structure, clarity) [10 marks]