Talk Title: Multi-cue Based 3D Residual Network for Action Recognition
Speaker: Professor Ruili Wang
Professor Wang is a Professor of Artificial Intelligence at Massey University, Auckland, New Zealand. He received his PhD in Computer Science from Dublin City University (Dublin, Ireland), his Bachelor's degree from Huazhong University of Science and Technology (Wuhan, China), and his Master's degree from Northeastern University (Shenyang, China). His current research areas include language and speech processing, machine learning and data mining, and computer vision and image processing. He is an Associate Editor (or an editorial board member) for the journals IEEE Transactions on Emerging Topics in Computational Intelligence, Knowledge and Information Systems (Springer), Applied Soft Computing (Elsevier), Neurocomputing (Elsevier), and Complex and Intelligent Systems (Springer). He has published more than 140 papers, of which 90 are in peer-reviewed journals. He has supervised 19 PhD and 8 Master's students to completion. He was awarded a Marsden grant in 2013 for machine learning and its application to speech processing, as well as Seed Projects from the National Science Challenges of New Zealand.
Human action recognition aims to automatically identify specified actions in a video. It has many applications, such as intelligent video surveillance, human-computer interaction, and smart hospital care. The appearance cue plays an important role in still images, while the motion cue is essential for recognizing actions in videos. Recently, deep learning-based models have achieved state-of-the-art performance on video tasks. In particular, 3D convolutional neural networks (CNNs) are a natural architecture for video data. However, current 3D CNNs for action recognition have some deficiencies. A multi-cue 3D convolutional neural network (M3D) is proposed to overcome these deficiencies. Further, we propose a deeper residual multi-cue 3D convolution model (R-M3D) to improve the representation ability and obtain more representative video features. Experimental results show that the proposed models outperform current 3D CNN models and achieve better performance than state-of-the-art models on the UCF101 and HMDB51 datasets.
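To illustrate the general idea of a 3D residual block of the kind the talk builds on, the following is a minimal PyTorch sketch. It is not the speaker's M3D/R-M3D architecture; the class name, channel count, and layer sizes are illustrative assumptions. The key point is that 3D convolutions slide over time as well as space, so a video clip is processed as a five-dimensional tensor.

```python
# Minimal sketch of a 3D residual block, as used in 3D CNNs for video.
# Illustrative only -- not the M3D/R-M3D model from the talk; all layer
# sizes and names here are assumptions.
import torch
import torch.nn as nn


class Residual3DBlock(nn.Module):
    """Two 3x3x3 convolutions over (time, height, width) with an identity shortcut."""

    def __init__(self, channels: int):
        super().__init__()
        # padding=1 with kernel_size=3 preserves the temporal and spatial dims
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm3d(channels)
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm3d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, frames, height, width)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # identity shortcut: input and output shapes match, so they can be summed
        return self.relu(out + x)


# A toy "clip": batch of 2, 4 feature channels, 8 frames at 16x16 resolution
clip = torch.randn(2, 4, 8, 16, 16)
block = Residual3DBlock(channels=4)
features = block(clip)
```

Because the residual shortcut requires matching shapes, the block leaves the tensor dimensions unchanged; deeper residual 3D networks stack such blocks, interleaving strided convolutions to downsample in space and time.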