Spatiotemporal Interaction Residual Networks with Pseudo3D for Video Action Recognition
pseudo3D architecture
spatiotemporal representation learning
two-branch network
video action recognition
Journal
Sensors (Basel, Switzerland)
ISSN: 1424-8220
Abbreviated title: Sensors (Basel)
Country: Switzerland
NLM ID: 101204366
Publication information
Publication date: 01 Jun 2020
History:
received: 27 Mar 2020
revised: 23 May 2020
accepted: 25 May 2020
entrez: 5 Jun 2020
pubmed: 5 Jun 2020
medline: 5 Jun 2020
Status: epublish
Abstract
Action recognition is a significant and challenging topic in the fields of sensing and computer vision. Two-stream convolutional neural networks (CNNs) and 3D CNNs are the two mainstream deep learning architectures for video action recognition. To combine them into one framework and further improve performance, we propose a novel deep network, named the spatiotemporal interaction residual network with pseudo3D (STINP). The STINP possesses three advantages. First, the STINP consists of two branches constructed on residual networks (ResNets) to simultaneously learn the spatial and temporal information of the video. Second, the STINP integrates the pseudo3D block into the residual units of the spatial branch, which ensures that the spatial branch not only learns the appearance features of the objects and scene in the video, but also captures the potential interaction information among consecutive frames. Finally, the STINP adopts a simple but effective multiplication operation to fuse the spatial and temporal branches, which guarantees that the learned spatial and temporal representations interact with each other throughout training. Experiments were conducted on two classic action recognition datasets, UCF101 and HMDB51. The experimental results show that the proposed STINP provides better performance for video action recognition than other state-of-the-art algorithms.
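Two ideas from the abstract can be illustrated concretely: the pseudo3D factorization, which replaces a full t×k×k 3D convolution with a 1×k×k spatial kernel followed by a t×1×1 temporal kernel, and the element-wise multiplication used to fuse the two branches. The sketch below is a minimal NumPy illustration under assumed layer sizes, not the authors' implementation; channel counts and feature-map shapes are illustrative assumptions.

```python
import numpy as np

# Pseudo3D idea: a t x k x k 3D kernel is factorized into a 1 x k x k
# spatial kernel followed by a t x 1 x 1 temporal kernel.
# Channel counts below are illustrative, not from the paper.
C_in, C_out, t, k = 64, 64, 3, 3
params_3d = C_out * C_in * t * k * k                        # full 3D conv
params_p3d = C_out * C_in * k * k + C_out * C_out * t       # factorized
print(params_3d, params_p3d)  # the factorized form uses fewer parameters

# Multiplicative fusion: the element-wise product of the spatial and
# temporal branch feature maps lets each branch gate the other.
spatial_feat = np.random.rand(8, 64, 14, 14)   # (frames, channels, H, W)
temporal_feat = np.random.rand(8, 64, 14, 14)
fused = spatial_feat * temporal_feat
assert fused.shape == spatial_feat.shape
```

The parameter count drops from C_out·C_in·t·k² to C_out·C_in·k² + C_out²·t, which is the usual motivation for pseudo3D-style factorizations alongside the ability to reuse 2D pretrained weights.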
Identifiers
pubmed: 32492842
pii: s20113126
doi: 10.3390/s20113126
pmc: PMC7308980
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: National Natural Science Foundation of China
ID: 61672150, 61702092, 61907007, 61602221
Agency: Fund of the Jilin Provincial Science and Technology Department
ID: 20190201305JC, 20180201089GX
Agency: Fund of Education Department of Jilin Province
ID: JJKH20190355KJ, JJKH20190294KJ, JJKH20190291KJ
Agency: Fundamental Research Funds for the Central Universities
ID: 2412019FZ049