International Business Machines Corporation
Method and System for Video Action Classification by Mixing 2D and 3D Features

Abstract:

A method, system, and computer program product for video action classification. A first video frame and a first plurality of video frames are selected from a received video. The first video frame is processed with a 2D convolutional neural network processing pathway to extract spatial features classifying the first video frame, and the first plurality of video frames is processed with a 3D convolutional neural network processing pathway to extract spatiotemporal features classifying the first plurality of video frames. The spatial features are then combined with the spatiotemporal features to generate a classification label for the video action.
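The two-pathway flow described in the abstract can be illustrated with a small, dependency-free sketch. It is not the claimed implementation. Everything specific here is an assumption made for illustration: the function names (`features_2d`, `features_3d`, `classify`), the choice of frame 0 as the "first video frame", the leading window as the "first plurality", concatenation as the way features are combined, and random untrained kernels and weights in place of learned networks.

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

def features_2d(frame, kernels):
    """Spatial pathway: valid 2D cross-correlation, ReLU, global average per kernel."""
    h, w = len(frame), len(frame[0])
    out = []
    for k in kernels:
        kh, kw = len(k), len(k[0])
        vals = [
            relu(sum(frame[i + a][j + b] * k[a][b]
                     for a in range(kh) for b in range(kw)))
            for i in range(h - kh + 1)
            for j in range(w - kw + 1)
        ]
        out.append(sum(vals) / len(vals))
    return out

def features_3d(clip, kernels):
    """Spatiotemporal pathway: the same operation with an added time axis."""
    t, h, w = len(clip), len(clip[0]), len(clip[0][0])
    out = []
    for k in kernels:
        kt, kh, kw = len(k), len(k[0]), len(k[0][0])
        vals = [
            relu(sum(clip[p + c][i + a][j + b] * k[c][a][b]
                     for c in range(kt) for a in range(kh) for b in range(kw)))
            for p in range(t - kt + 1)
            for i in range(h - kh + 1)
            for j in range(w - kw + 1)
        ]
        out.append(sum(vals) / len(vals))
    return out

def classify(video, kernels2d, kernels3d, weights, clip_len=4):
    frame = video[0]          # assumption: the "first video frame" is index 0
    clip = video[:clip_len]   # assumption: the "first plurality" is the leading window
    # Assumption: features are combined by simple concatenation.
    fused = features_2d(frame, kernels2d) + features_3d(clip, kernels3d)
    # Linear classifier over the fused features; the label is the arg-max score.
    scores = [sum(w * f for w, f in zip(row, fused)) for row in weights]
    label = max(range(len(scores)), key=scores.__getitem__)
    return label, fused

def random_tensor(rng, *shape):
    """Nested lists of uniform random values with the given shape."""
    if len(shape) == 1:
        return [rng.uniform(-1.0, 1.0) for _ in range(shape[0])]
    return [random_tensor(rng, *shape[1:]) for _ in range(shape[0])]

rng = random.Random(0)
video = random_tensor(rng, 8, 6, 6)         # 8 grayscale frames of 6x6 pixels
kernels2d = random_tensor(rng, 3, 3, 3)     # three 3x3 spatial kernels
kernels3d = random_tensor(rng, 2, 2, 3, 3)  # two 2x3x3 spatiotemporal kernels
weights = random_tensor(rng, 5, 5)          # 5 classes x (3 + 2) fused features
label, fused = classify(video, kernels2d, kernels3d, weights)
```

Here `fused` holds three spatial values followed by two spatiotemporal values, and `label` is the index of the highest-scoring class. A practical system would replace the hand-written loops with trained 2D and 3D convolutional networks from a deep learning framework.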

Status:

Application

Type:

Utility

Filing date:

14 May 2020

Issue date:

18 Nov 2021