Wearable Sensor Data | Jay Skaria's Portfolio

Activity Recognition from Wearable Sensor Data

Overview:

This project explores the application of Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) models to classify human activities using time-series data collected from wearable sensors. The dataset includes motion signals captured at 50 Hz from sensors on the left ankle and right lower arm of healthy participants. The goal is to leverage temporal modeling techniques to accurately recognize a range of physical activities, with implications for remote monitoring, rehabilitation, and personalized digital health interventions.

Tools & Technologies:

Google Colab/Jupyter, Python, NumPy, pandas, Matplotlib, scikit-learn, TensorFlow, RNN, LSTM

Background:

Current physical activity guidelines recommend that adults engage in 150–300 minutes of moderate or 75–150 minutes of vigorous aerobic activity per week. mHealth technologies, through wearables, can play a key role in monitoring adherence and enhancing patient engagement. In this project, we applied machine learning to classify activity types from wearable sensor data collected from 9 adult participants at Sunnybrook Health Sciences Centre in Toronto. These models help assess the feasibility of using mHealth systems to track physical activity in a clinical or consumer health setting.

Research Questions:

1. How accurately can we classify human activity from time-series data collected by wearable sensors?

2. How do RNN and LSTM models compare in classifying sequential motion data?

Rationale:

Mobile health (mHealth) is transforming how healthcare is delivered, especially with the proliferation of wearable sensors capable of continuously capturing real-time physiological and biomechanical data. Such devices can monitor activity, detect deviations in movement patterns, and facilitate preventive care, which are crucial in chronic disease management and rehabilitation. Machine Learning (ML), and particularly deep learning methods like RNNs and LSTMs, offer powerful tools to analyze sequential sensor data, uncover patterns, and predict outcomes in health contexts.

Dataset:

Source: Wearable sensor data from Sunnybrook Health Sciences Centre, Toronto

Participants: 9 healthy adults (median age: 22.5, median BMI: 24); 1 participant removed due to activity distribution imbalance

Sensors: Accelerometer and gyroscope. Each sensor outputs measurements along three axes (X, Y, Z), captured at 50 Hz (50 samples per second). The readings are annotated with activity labels ranging from sedentary behaviours to dynamic movements.

Placement: Left ankle and right lower arm

Sampling Rate: 50 Hz

Activity Labels: 12 labeled activities including stillness, walking, cycling, jumping, and running

Data Pre-Processing & ML Modeling:

We chose RNN and LSTM models due to their ability to model temporal dependencies in sequential data. LSTMs are particularly suited for long-range dependencies and help mitigate the vanishing gradient problem seen in basic RNNs. A decision tree model was also included as a computationally simple and interpretable baseline using a sliding window approach and simplified activity groupings. This allowed us to compare deep learning models with a traditional ML approach and evaluate trade-offs in accuracy, interpretability, and computational complexity.
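To make the vanishing-gradient remark above concrete, here is the standard argument for a plain RNN. The tanh activation and the symbols W_hh, W_xh, b, and h_k are assumptions for illustration; this write-up does not state which activation the RNN used.

```latex
% Plain RNN recurrence (tanh activation assumed for illustration)
h_k = \tanh\left(W_{hh} h_{k-1} + W_{xh} x_k + b\right)

% Jacobian of one step, and of T - t chained steps
\frac{\partial h_k}{\partial h_{k-1}} = \operatorname{diag}\left(1 - h_k^2\right) W_{hh},
\qquad
\frac{\partial h_T}{\partial h_t} = \prod_{k=t+1}^{T} \operatorname{diag}\left(1 - h_k^2\right) W_{hh}

% Every entry of 1 - h_k^2 lies in [0, 1], so in the spectral norm:
\left\lVert \frac{\partial h_T}{\partial h_t} \right\rVert \le \lVert W_{hh} \rVert^{\,T - t}
```

If the norm of W_hh is below 1, this bound shrinks exponentially with the distance T - t, so in a window of 1500 timesteps the earliest samples contribute almost no gradient. LSTM cells add an additive cell-state path that eases this decay, though it does not remove it entirely.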

Exploration: No missing values; no highly correlated features (all correlations < 0.8)
Cleaning: Participant 9 removed due to outlier activity distribution
Normalization: StandardScaler applied to features; fit on training set only
Label Encoding: Activities encoded as integers 0 to 12 (class 0 is "Nothing"; classes 1 to 12 are the labeled activities); additional categorical grouping (low, medium, high intensity) for tree-based model
Segmentation:
- RNN/LSTM: 30-second windows (1500 timesteps); zero-padding for incomplete sequences
- Decision Tree: 1000-record sliding windows with a stride of 100
Sampling Strategy:
- Random undersampling of majority class (activity 0)
- Random oversampling of minority class (activity 12)
Modeling:
- RNN: 1 RNN layer + 2 dense layers, 36 hidden units, 15 epochs, batch size 16
- LSTM: Similar structure to RNN, using LSTM units
- Decision Tree: Max depth = 3, on grouped activity intensities
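
The two segmentation schemes listed above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the 12-channel layout (2 placements x 2 sensors x 3 axes), the non-overlapping 30-second windows, and padding at the end of the last window are all assumptions.

```python
import numpy as np

def fixed_windows(signal, window_len=1500):
    """Split a (timesteps, channels) array into non-overlapping windows.

    The last window is zero-padded at the end if it is incomplete
    (padding position is an assumption; the write-up does not specify it).
    """
    n_steps, n_channels = signal.shape
    n_windows = -(-n_steps // window_len)  # ceiling division
    padded = np.zeros((n_windows * window_len, n_channels), dtype=signal.dtype)
    padded[:n_steps] = signal
    return padded.reshape(n_windows, window_len, n_channels)

def sliding_windows(signal, window_len=1000, stride=100):
    """Return overlapping windows; a trailing remainder shorter than one window is dropped."""
    n_steps, n_channels = signal.shape
    starts = range(0, n_steps - window_len + 1, stride)
    if not starts:
        return np.empty((0, window_len, n_channels), dtype=signal.dtype)
    return np.stack([signal[s:s + window_len] for s in starts])

# Synthetic recording: 64 seconds at 50 Hz, 12 channels (hypothetical layout)
rng = np.random.default_rng(0)
signal = rng.normal(size=(3200, 12))

rnn_windows = fixed_windows(signal)      # 30-second windows for RNN/LSTM
tree_windows = sliding_windows(signal)   # 1000-record windows, stride 100
print(rnn_windows.shape, tree_windows.shape)
```

With a stride of 100, consecutive decision-tree windows share 900 of their 1000 records, so windows from the same recording should stay on the same side of any train/test split to avoid leakage.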

Results:

Our primary model, the Recurrent Neural Network (RNN), achieved an overall accuracy of 63%. It performed well on class 0 (‘Nothing’), with a precision of 63%, recall of 99%, and an F1-score of 77%, correctly identifying 104 instances. However, the model failed to classify any examples from the remaining classes (1 to 12), yielding precision, recall, and F1-scores of 0% for all other activities.

The LSTM model showed a slight improvement, with an overall accuracy of 64%. On class 0, precision rose to 78% and the F1-score to 85%, while recall fell to 93%. The model also began to correctly classify a few minority classes (e.g., 3, 4, 9, and 11), though recall for these remained low. Many other classes still had zero precision and recall, indicating they were not being predicted at all.

The decision tree model, using a sliding window approach on grouped activity intensities, achieved the highest overall accuracy of 76%. It demonstrated strong performance for class 0, with a precision of 77%, recall of 97%, and an F1-score of 86%, correctly classifying 1,423 instances. The model performed moderately on a few other classes:

Class 1: Precision 61%, Recall 36%, F1-score 45%
Class 3: Precision 75%, Recall 34%, F1-score 47%

However, it completely failed to classify class 2, resulting in 0% precision, recall, and F1-score for that category.

Although accuracy offers a general measure of overall model performance, it is not ideal for this classification task because of the significant class imbalance. The dataset is heavily skewed toward class 0 ("Nothing"), meaning that a model can achieve high accuracy simply by predicting the majority class correctly while failing to identify minority classes altogether. For example, our RNN model achieved 63% accuracy primarily by correctly identifying class 0, but it failed to predict any other activity, resulting in 0% precision, recall, and F1-scores for classes 1–12. This highlights how accuracy can mask poor performance on underrepresented but clinically relevant classes.
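
The effect described above can be reproduced with a toy example. The sketch below uses synthetic labels (not the project's data) and a hypothetical "always predict class 0" model; the helper names are illustrative. It shows accuracy staying at 63% while the macro-averaged F1-score collapses.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = (precision, recall, f1)
    return metrics

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(metrics):
    """Unweighted mean of per-class F1, so every class counts equally."""
    return sum(f1 for _, _, f1 in metrics.values()) / len(metrics)

# Synthetic labels: 63 windows of class 0, 37 spread over classes 1-3
y_true = [0] * 63 + [1] * 15 + [2] * 12 + [3] * 10
y_pred = [0] * len(y_true)  # a model that only ever predicts the majority class

metrics = per_class_metrics(y_true, y_pred)
acc = accuracy(y_true, y_pred)
mf1 = macro_f1(metrics)
print(f"accuracy={acc:.2f}  class-0 F1={metrics[0][2]:.2f}  macro F1={mf1:.2f}")
```

Macro-averaged F1 or per-class recall are therefore more informative summary metrics than accuracy when the minority classes are the ones that matter.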

Conclusion:

Our models achieved poor to moderate accuracy, with a noticeable bias toward class 0 (no activity). Despite applying resampling techniques, the RNN model struggled to classify activities beyond the dominant class, likely due to limitations in model complexity, sequence length, and vanishing gradients. The LSTM model offered marginal improvements by better capturing temporal dependencies, but many minority classes remained poorly predicted. The decision tree model, although simpler, achieved the highest overall accuracy and performed moderately well on classes 1 and 3. Because it was trained on grouped activity intensities rather than the individual activity classes, its accuracy is not directly comparable to that of the RNN and LSTM.

Importantly, computational efficiency should be considered when selecting a modeling approach. RNNs and LSTMs imposed significant computational demands due to their sequential nature and longer training times. In contrast, the decision tree model offered faster training and inference, making it more practical for real-time or resource-constrained environments, albeit with lower granularity and flexibility in capturing temporal dynamics.

Future improvements should focus on:

- Enhancing model architectures (e.g., CNN-LSTM hybrids or transformers)
- Refining sampling strategies (e.g., SMOTE, class weighting)
- Exploring alternative models like XGBoost for interpretability and efficiency
- Expanding the dataset for better generalization
- Validating models on external datasets
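
As one illustration of the class-weighting idea in the list above, the sketch below computes inverse-frequency weights with the common "balanced" heuristic, N / (K * n_c). The class counts are hypothetical and the function name is illustrative; this was not part of the project.

```python
def balanced_class_weights(class_counts):
    """Weight each class by N / (K * n_c).

    N is the total number of samples, K the number of classes,
    and n_c the number of samples in class c.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * count)
            for label, count in class_counts.items()}

# Hypothetical window counts per class (NOT the project's real distribution)
class_counts = {0: 600, 1: 300, 2: 100}
weights = balanced_class_weights(class_counts)
for label, weight in weights.items():
    print(f"class {label}: weight {weight:.3f}")
```

A dictionary like this can be passed to Keras `Model.fit` through its `class_weight` argument, so that errors on rare classes cost more during training without discarding or duplicating any windows.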

Improving both performance and computational scalability will be key to building robust, real-world mHealth activity recognition systems.

Keywords:

mHealth, Human Activity Recognition, RNN, LSTM, Time-Series Classification, Wearable Sensors, Deep Learning, Decision Tree, Health Informatics, Class Imbalance