
An always up-to-date, comprehensive human activity recognition (HAR) resource list, continuously scanned and auto-updated from Papers with Code, with 53 datasets integrated across all modalities.

130 GitHub Stars | 123 Curated Resources | 8 Categories | Last Refreshed: 3 hours ago
Datasets | Frameworks and Libraries | Pretrained Models | Tutorials and Courses | Key Papers | Competitions and Challenges | Tools and Utilities | Related Awesome Lists

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me skeleton action recognition resources from awesome-human-activity-recognition"

Installation instructions →

What's inside

Frameworks and Libraries

  • 2s-AGCN [Skeleton Action Recognition]

    Two-stream adaptive graph convolutional network for skeleton-based action recognition from CVPR 2019.

  • aeon [Wearable Sensor HAR]

    Unified Python toolkit for time series including classification, clustering, and anomaly detection.

  • CTR-GCN [Skeleton Action Recognition]

    Channel-wise topology refinement graph convolution for skeleton-based action recognition from ICCV 2021.

  • DeepConvLSTM [Wearable Sensor HAR]

    Reference implementation of the convolutional LSTM architecture for wearable activity recognition.

  • Hang-Time HAR [Wearable Sensor HAR]

    Basketball activity recognition from a single wrist-worn inertial sensor using deep learning.

  • HD-GCN [Skeleton Action Recognition]

    Hierarchically decomposed graph convolutional network for skeleton-based action recognition from ICCV 2023.

Datasets

  • ActivityNet [Vision (RGB / Depth)]

    Temporal action detection benchmark with 20k untrimmed YouTube videos across 200 classes.

  • ActivityNet Captions [Multimodal and Egocentric]

    Dense video captioning and temporal grounding with 20k videos and 100k captions.

  • AMASS [Skeleton and Mocap]

    Unified SMPL motion capture parameters from 40+ datasets covering 16k minutes and 344 subjects.

  • AVA [Vision (RGB / Depth)]

    Spatio-temporal action detection with 430 movie clips and 80 atomic action labels with bounding boxes.

  • Babel [Skeleton and Mocap]

    Motion-language alignment dataset with 43 hours and 3.7k sequences annotated with SMPL and text labels.

  • BEHAVE [Emerging and Frontier]

    RGB-D human-object interaction with 3D pose spanning 321 sequences of 8 subjects interacting with 20 objects.

Competitions and Challenges

Key Papers

  • Attend and Discriminate [Wearable and Sensor HAR]

    Abedin et al., IMWUT 2021, attention mechanisms for multi-sensor HAR.

  • C3D: Learning Spatiotemporal Features [Foundational]

    Tran et al., ICCV 2015, pioneering 3D convolutions for video feature learning.

  • DeepConvLSTM [Wearable and Sensor HAR]

    Ordonez and Roggen, Sensors 2016, establishing deep learning for wearable activity recognition.

  • Deep Learning for HAR: A Survey [Surveys]

    Li et al., ACM Computing Surveys 2022, comprehensive review of deep learning approaches for HAR.

  • I3D: Quo Vadis Action Recognition [Foundational]

    Carreira and Zisserman, CVPR 2017, inflating 2D ImageNet architectures to 3D video.

  • InternVideo2 [Transformer Era (2020 onwards)]

    Wang et al., ECCV 2024, scaling video foundation models to 6B parameters across 60+ benchmarks.

Related Awesome Lists

Tutorials and Courses

Tools and Utilities

  • Decord

    Efficient GPU-accelerated video reader for deep learning training pipelines.

  • MediaPipe

    Google's on-device ML framework for pose estimation, hand tracking, and gesture recognition.

  • MMAction2 Model Zoo

    Pretrained checkpoints and configs for 100+ action recognition models.

  • OpenPose

    Real-time multi-person keypoint detection for skeleton extraction from video.

  • Papers with Code - HAR Leaderboards

    Live SOTA tracking across all major HAR benchmarks.

  • vid2player

    Character animation from video input, useful for activity recognition visualization.

Pretrained Models

  • InternVideo2 Model Zoo

    6B-parameter video-language model checkpoints on Hugging Face for action recognition and retrieval.

  • MotionBERT Checkpoints

    Pretrained motion encoder transferable to 3D pose estimation, action recognition, and mesh recovery.

  • MVD

    Masked video distillation pretrained model competitive with VideoMAE on downstream action recognition.

  • UniFormerV2

    Efficient video transformer with multi-scale tokens achieving 90.0% top-1 on Kinetics-400.

  • VideoMAE V2

    Billion-parameter video foundation model pretrained on millions of clips, finetunable for action recognition.

Showing a sample of 123 resources. View the full list on GitHub →