Title: Multi-cue Based 3D Residual Network for Action Recognition
Time: 2:00 p.m., December 15, 2019
Venue: Room A521, College of Computer Science
Speaker: Prof. Ruili Wang
Speaker biography:
Professor Wang is a Professor of Artificial Intelligence at Massey University, Auckland, New Zealand. He received his PhD in Computer Science from Dublin City University (Dublin, Ireland), his Bachelor's degree from Huazhong University of Science and Technology (Wuhan, China), and his Master's degree from Northeastern University (Shenyang, China). His current research areas include language and speech processing, machine learning and data mining, and computer vision and image processing. He is an Associate Editor (or an editorial board member) of the journals IEEE Transactions on Emerging Topics in Computational Intelligence, Knowledge and Information Systems (Springer), Applied Soft Computing (Elsevier), Neurocomputing (Elsevier), and Complex and Intelligent Systems (Springer). He has published more than 140 papers, of which 90 are in peer-reviewed journals, and has supervised 19 PhD and 8 Master's students to completion. He was awarded a Marsden grant in 2013 for machine learning and its application to speech processing, as well as Seed Projects from the National Science Challenges of New Zealand.
Abstract:
Human action recognition aims to automatically identify specified actions in a video. It has many applications, such as intelligent video surveillance, human-computer interaction, and smart hospital care. The appearance cue plays an important role in still images, while the motion cue is crucial for recognizing actions in video. Recently, deep learning-based models have achieved state-of-the-art performance on video tasks. In particular, 3D convolutional neural networks (CNNs) are a natural architecture for video data. However, current 3D CNNs for action recognition have some deficiencies. We propose a multi-cue 3D convolutional neural network (M3D) to overcome these deficiencies. Further, we propose a deeper residual multi-cue 3D convolution model (R-M3D) to improve representation ability and obtain more representative video features. Experimental results show that our proposed models outperform current 3D CNN models and achieve better performance than state-of-the-art models on the UCF101 and HMDB51 datasets.
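To illustrate why 3D CNNs are a natural fit for video, the sketch below implements a single 3D convolution by hand in NumPy: unlike a 2D convolution over one image, the kernel also slides along the time axis, so each output value aggregates appearance and motion information across several consecutive frames. This is a minimal illustrative example only, not the speaker's M3D/R-M3D architecture; all names and shapes are chosen for demonstration.

```python
import numpy as np

def conv3d(video, kernel):
    """Valid (no-padding) 3D convolution of a single-channel video clip.

    video:  array of shape (T, H, W) -- frames, height, width
    kernel: array of shape (t, h, w) -- spans time as well as space
    """
    T, H, W = video.shape
    t, h, w = kernel.shape
    out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):          # slide along time
        for j in range(out.shape[1]):      # slide along height
            for k in range(out.shape[2]):  # slide along width
                # Each output value mixes t consecutive frames,
                # capturing motion as well as appearance.
                out[i, j, k] = np.sum(video[i:i+t, j:j+h, k:k+w] * kernel)
    return out

video = np.random.rand(8, 16, 16)   # an 8-frame clip of 16x16 images
kernel = np.random.rand(3, 3, 3)    # a 3x3x3 spatiotemporal filter
features = conv3d(video, kernel)
print(features.shape)               # (6, 14, 14): valid conv shrinks each axis
```

In a deep residual variant of this idea, stacks of such spatiotemporal convolutions are wrapped in identity shortcuts (y = x + F(x)), which is what allows the deeper R-M3D-style models discussed in the talk to be trained effectively.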
Organizers:
College of Computer Science and Technology, Jilin University
College of Software, Jilin University
Institute of Computer Science and Technology, Jilin University
Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education
National Experimental Teaching Demonstration Center of Computer Science, Jilin University