[Tasks and Requirements]:
The content of your technical report should include the following A-H aspects:
A. A meaningful title (5-20 words) followed by your name and student ID
The title describes your topic and indicates your research direction, so it is very important. Tips for choosing it are listed below.
[Topic selection tips]:
Tip 1: The title may be narrowed down further from the title of your Assignment 1.
Tip 2: Make the best use of Practicals to implement your data analysis project.
First, learn the basic skills of using Weka to prepare, process and analyze data in Practicals 1-3, and focus on Practical 6 (Performance Evaluation). Practical 6 covers the skills needed to compare multiple data mining methods on different metrics, which directly supports the comparison and evaluation of techniques in your technical report.
Then, you can focus on one of the following techniques for analysing different types of data according to your topic:
. Practical 4 or Practical 5 for processing general numeric data
. Practical 7 Predicting Time Series
. Practical 8 Text Mining
. Practical 9 Image Analysis
. Others learnt by yourself
Practical 10 Recommendation provides an application example using techniques in Practicals 4-9.
Tip 3: An example. If the title in Assignment 1 is "Using Classification Method to Discover Events from Twitter Data", then the title in Assignment 2 may be "Using Decision Tree to Predict an Event Trend from Twitter Data". If you choose this topic for Assignment 2, you can focus on Practical 8, which gives details on classifying short text documents using Weka. Here, the event trend can be replaced by "a user's preference trend for a commercial product", and the Twitter data (taking one tweet as one short text document) can be replaced by other short text data, such as users' comments on the product. You may prefer Decision Tree (J48 in Weka), as shown in this title; however, you will also present a counterpart method for comparison.
B. An abstract (100-200 words)
C. An introduction of a data analytic application background, motivation and aim (200-300 words)
D. A summary of your dataset, including data type (general numeric data, short text, time series, image, etc.), data size, data quality and data pre-processing (300-500 words)
E. The main data mining techniques you adopt to satisfy your application aim (800-1000 words)
In this section, you will point out
. whether it is a clustering problem or classification problem,
. which data mining algorithm you will adopt to analyse your data, what the steps of the adopted algorithm are, and what its advantages and disadvantages are,
. which counterpart algorithm may be an alternative choice to analyse your data.
F. Evaluation and demonstration (800-1000 words)
. what the difference is between your adopted algorithm and the counterpart algorithm. You may use the performance evaluation skills learnt in Practical 6 to compare them, and you must demonstrate (i.e., show the resulting accuracies as evidence) why one is better than the other.
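As a rough illustration of the kind of evidence expected in this section, the sketch below computes accuracy, precision and recall for two sets of predictions against the same true labels. It is a minimal sketch only: the function names (confusion_counts, evaluate) are hypothetical, the labels are made up for illustration and are not real results, and in practice these figures would come from your own Weka evaluation output.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive):
    """Count TP, FP, FN and TN, treating `positive` as the positive class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def evaluate(actual, predicted, positive):
    """Return accuracy, precision and recall for one set of predictions."""
    c = confusion_counts(actual, predicted, positive)
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": c["tp"] / actual_pos if actual_pos else 0.0,
    }


# Made-up labels for illustration only (NOT real experimental results).
actual      = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg"]
adopted     = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg", "pos"]
counterpart = ["pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos", "pos", "neg"]

for name, predicted in [("adopted", adopted), ("counterpart", counterpart)]:
    print(name, evaluate(actual, predicted, positive="pos"))
```

Reporting more than one metric side by side, on the same test data, makes the comparison between the two algorithms easier to justify than quoting a single accuracy figure.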
G. Conclusions (100-200 words)
H. List of References (IEEE and Harvard are preferred).