The recognition of human activities has become a task of high interest for medical, military, and security applications. For instance, patients with diabetes, obesity, or heart disease are often required to follow a well-defined exercise routine as part of their treatment. Recognizing activities such as walking, running, or cycling is therefore useful for giving the caregiver feedback about the patient’s behavior. Likewise, patients with dementia and other mental pathologies could be monitored to detect abnormal activities and thereby prevent undesirable consequences.
In such IoT applications, proper software engineering and data engineering are especially important for managing the software development life cycle and for making data useful to machine learning models. Many software engineers are primarily interested in aggregating raw data and turning it into useful, ordered, and structured formats. A typical flowchart of sensor-based human activity
This assignment involves the following subtasks:
1. Use Agile to manage this IoT application development (e.g., develop backlog, create sprint, and monitor the sprint progress). The backlog and each sprint along with each week’s sprint progress burndown chart shall be recorded in the final submission document.
2. Based on the given workshop materials, write Python code to load the data and extract the corresponding features from the given dataset.
3. Test and evaluate the two given machine learning models (KNN and SVM) and the application in general, and record the test results and evaluation summary in the final submission document.
4. Refactor the source code according to the design pattern lecture to make the code easier to understand and more extensible. The code shall be managed on GitHub, and both the code and its GitHub version control history will be reviewed.
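One possible way to approach the extensibility goal in subtask 4 is to keep each feature in a small registry, so that new features can be added without editing the extraction routine. The sketch below is only an illustration under assumptions: the names FEATURES, register_feature, and extract_features are hypothetical and do not come from the example code or the lecture, and the toy window contains made-up values.

```python
from typing import Callable, Dict

import numpy as np

# Hypothetical registry: maps a feature name to a function that reduces
# a (samples, channels) window to one value per channel.
FEATURES: Dict[str, Callable[[np.ndarray], np.ndarray]] = {
    "min": lambda w: w.min(axis=0),
    "max": lambda w: w.max(axis=0),
    "mean": lambda w: w.mean(axis=0),
}


def register_feature(name: str, func: Callable[[np.ndarray], np.ndarray]) -> None:
    """Add a new feature without touching the extraction code."""
    FEATURES[name] = func


def extract_features(window: np.ndarray) -> np.ndarray:
    """Concatenate every registered feature into one flat vector."""
    return np.concatenate([func(window) for func in FEATURES.values()])


# Toy window: 4 samples from 3 channels (values are made up).
window = np.array([[0.0, 1.0, 2.0],
                   [1.0, 2.0, 3.0],
                   [2.0, 3.0, 4.0],
                   [3.0, 4.0, 5.0]])

base = extract_features(window)       # 3 features x 3 channels
register_feature("std", lambda w: w.std(axis=0))
extended = extract_features(window)   # 4 features x 3 channels
print(base.shape, extended.shape)
```

With this layout, trying an additional feature is a single register_feature call, which also keeps each change small and easy to trace in the version control history.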
The source data comes from a public dataset (the DaLiAc dataset, which contains 6 sensors’ data for 13 activities). The work involves refining and cleaning that data and extracting significant features through statistical analysis for use in artificial intelligence and machine learning systems.
Example code is provided for reference. You may need to learn how to use the Python libraries NumPy and pandas. The machine learning modules, which use scikit-learn, are given, although some understanding of them is recommended (we will cover only the basics to avoid overlap with other courses).
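As a first orientation to pandas and NumPy, the following minimal sketch loads a small table of sensor readings, smooths it, and converts it to a NumPy array. Everything specific here is an assumption for illustration: the in-memory sample, the column names, and the file layout are made up and may not match the real dataset files, and the moving average is a simple stand-in rather than the filter used in the workshop code.

```python
import io

import numpy as np
import pandas as pd

# Made-up sample standing in for one recording file: three accelerometer
# columns followed by an activity label (hypothetical layout).
SAMPLE_CSV = """\
0.0,0.1,9.8,1
0.3,0.1,9.7,1
0.6,0.2,9.9,1
0.9,0.2,9.8,1
1.2,0.3,9.6,1
"""

COLUMNS = ["acc_x", "acc_y", "acc_z", "label"]  # hypothetical names

# The sample has no header row, so column names are supplied explicitly.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), header=None, names=COLUMNS)

# Centered moving average over 3 samples (an illustrative stand-in for
# the signal filter; edge samples use the values that are available).
signals = df[["acc_x", "acc_y", "acc_z"]]
smoothed = signals.rolling(window=3, center=True, min_periods=1).mean()

# NumPy array of the cleaned signal, ready for feature extraction.
data = smoothed.to_numpy()
print(df.shape, data.shape)
print(data[:, 0].min(), data[:, 0].max(), data[:, 0].mean())
```

For a real recording, the io.StringIO object would be replaced by the path to a file in the dataset folder, with the column list adjusted to that file's actual layout.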
It is recommended that the human activity recognition IoT system be developed in four sprints.
1. Data loading and preprocessing: In this stage, based on the workshop materials provided, you first need to visualize the sensor data to get an idea of the underlying human activity patterns. Using the given code, apply the signal filtering and visualize the cleaned data. You don’t have to visualize all the sensor data; visualizing one specific pair of accelerometer and gyroscope signals is enough.
2. Feature engineering for sensor data: In this stage, you need to extract features from the cleaned sensor data. In the example code, the min, max, and mean values of three accelerometers in the wrist sensor are extracted as features of each human activity. In this assignment, you need to focus on feature engineering: try to extract more features from more sensors based on the Week 3 lecture notes, and investigate how different features influence the performance of the human activity recognition IoT application. You can then use the GIVEN code to construct the training datasets and train the different GIVEN machine learning models on the training feature set. The code of the recognition models is GIVEN; it uses KNN and SVM classifiers to learn human activity recognition. You are required to analyze 13 activities based on 19 participants, so your training and test sets shall contain all 13 activities.
3. Testing: After training a model, you should evaluate and test the application. Classification accuracy is a simple metric for measuring the performance of a trained model. In addition, a confusion matrix clearly shows how well the model recognizes each activity (testing of machine learning models and the confusion matrix will be covered in the Week 4 lecture notes). Both evaluation metrics are also GIVEN in the example code.
4. Code refactoring and version control: The given example code reflects state-of-the-art engineering for IoT. Please refactor the code to make it easier to read and understand (e.g., by adding comments) and more extensible (using the design pattern and software refactoring techniques taught in the unit). The changes shall be reflected in the GitHub version control history.
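The training and testing steps described in sprints 2 and 3 can be sketched roughly as follows. This is not the GIVEN code: the feature matrix is synthetic (3 made-up activities instead of 13), and the hyperparameters, the feature scaling, and the stratified split are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature set: 3 activities, 60 windows each,
# 9 features per window (e.g. min/max/mean of three channels).
n_classes, n_per_class, n_features = 3, 60, 9
X = np.vstack([
    rng.normal(loc=4.0 * c, scale=1.0, size=(n_per_class, n_features))
    for c in range(n_classes)
])
y = np.repeat(np.arange(n_classes), n_per_class)

# Stratify so every activity appears in both the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Fit the scaler on the training data only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Assumed hyperparameters; the example code may use different ones.
models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        # Rows are true activities, columns are predicted activities.
        "confusion": confusion_matrix(y_test, y_pred),
    }
    print(name, results[name]["accuracy"])
    print(results[name]["confusion"])
```

Off-diagonal entries of the confusion matrix show which activities are mistaken for one another, which is often more informative than accuracy alone when comparing feature sets.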
Using the example code:
1. Download the DaLiAc dataset from the link in reference 1. Unzip all files into a folder called ‘dataset’, then put the example code ‘har.py’ and the ‘dataset’ folder in the same directory.
2. Install Python 3.x and the libraries NumPy, pandas, scikit-learn, and Matplotlib, following the guidance in the weekly labs.
3. Run har.py