A bag of words approach to subject specific 3D human pose interaction classification with random decision forests

Deng, Jingjing; Xie, Xianghua; Daubney, Ben

doi:10.1016/j.gmod.2013.10.006

Journal article 1013 views 232 downloads

A bag of words approach to subject specific 3D human pose interaction classification with random decision forests

Jingjing Deng, Xianghua Xie

, Ben Daubney

Graphical Models, Volume: 76, Issue: 3, Pages: 162 - 171

Swansea University Authors: Jingjing Deng, Xianghua Xie

PDF | Accepted Manuscript
Download (2.89MB)

Check full text

DOI (Published version): 10.1016/j.gmod.2013.10.006

Abstract

In this work, we investigate whether it is possible to distinguish conversational interactions from observing human motion alone, in particular subject specific gestures in 3D. We adopt Kinect sensors to obtain 3D displacement and velocity measurements, followed by wavelet decomposition to extract l...

Full description

Published in:	Graphical Models
ISSN:	15240703
Published:	Elsevier 2014
Online Access:	Check full text
URI:	https://cronfa.swan.ac.uk/Record/cronfa49635

first_indexed	2019-03-20T13:59:09Z
last_indexed	2020-12-08T04:03:06Z
id	cronfa49635
recordtype	SURis
fullrecord	<?xml version="1.0"?><rfc1807><datestamp>2020-12-07T13:21:59.4785185</datestamp><bib-version>v2</bib-version><id>49635</id><entry>2019-03-20</entry><title>A bag of words approach to subject specific 3D human pose interaction classification with random decision forests</title><swanseaauthors><author><sid>6f6d01d585363d6dc1622640bb4fcb3f</sid><firstname>Jingjing</firstname><surname>Deng</surname><name>Jingjing Deng</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>b334d40963c7a2f435f06d2c26c74e11</sid><ORCID>0000-0002-2701-8660</ORCID><firstname>Xianghua</firstname><surname>Xie</surname><name>Xianghua Xie</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2019-03-20</date><deptcode>MACS</deptcode><abstract>In this work, we investigate whether it is possible to distinguish conversational interactions from observing human motion alone, in particular subject specific gestures in 3D. We adopt Kinect sensors to obtain 3D displacement and velocity measurements, followed by wavelet decomposition to extract low level temporal features. These features are thengeneralized to form a visual vocabulary that can be further generalized to a set of topics from temporal distributions of visual vocabulary. A subject specific supervised learning approach based on Random Forests is used to classify the testing sequences to seven different conversational scenarios. These conversational scenarios concerned in this workhave rather subtle differences among them. Unlike typical action or event recognition, each interaction in our case contain many instances of primitive motions and actions, many of which are shared among different conversation scenarios. That is the interactions we are concerned with are not micro or instant events, such as hugging and high-five, but rather interactions over a period of time that consists rather similar individual motions, micro actions and interactions. We believe this is among one of the first work that is devoted to subject specific conversational interaction classification using 3D pose features and to show this task is indeed possible.</abstract><type>Journal Article</type><journal>Graphical Models</journal><volume>76</volume><journalNumber>3</journalNumber><paginationStart>162</paginationStart><paginationEnd>171</paginationEnd><publisher>Elsevier</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>15240703</issnPrint><issnElectronic/><keywords>Human interaction, Action recognition, Human pose, Random forests, Bag of words</keywords><publishedDay>31</publishedDay><publishedMonth>5</publishedMonth><publishedYear>2014</publishedYear><publishedDate>2014-05-31</publishedDate><doi>10.1016/j.gmod.2013.10.006</doi><url>http://www.sciencedirect.com/science/article/pii/S1524070313000337</url><notes/><college>COLLEGE NANME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm/><lastEdited>2020-12-07T13:21:59.4785185</lastEdited><Created>2019-03-20T10:10:34.4837235</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Jingjing</firstname><surname>Deng</surname><order>1</order></author><author><firstname>Xianghua</firstname><surname>Xie</surname><orcid>0000-0002-2701-8660</orcid><order>2</order></author><author><firstname>Ben</firstname><surname>Daubney</surname><order>3</order></author></authors><documents><document><filename>0049635-01042019171033.pdf</filename><originalFilename>gmod.pdf</originalFilename><uploaded>2019-04-01T17:10:33.3000000</uploaded><type>Output</type><contentLength>3079687</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><embargoDate>2019-04-01T00:00:00.0000000</embargoDate><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807>
spelling	2020-12-07T13:21:59.4785185 v2 49635 2019-03-20 A bag of words approach to subject specific 3D human pose interaction classification with random decision forests 6f6d01d585363d6dc1622640bb4fcb3f Jingjing Deng Jingjing Deng true false b334d40963c7a2f435f06d2c26c74e11 0000-0002-2701-8660 Xianghua Xie Xianghua Xie true false 2019-03-20 MACS In this work, we investigate whether it is possible to distinguish conversational interactions from observing human motion alone, in particular subject specific gestures in 3D. We adopt Kinect sensors to obtain 3D displacement and velocity measurements, followed by wavelet decomposition to extract low level temporal features. These features are thengeneralized to form a visual vocabulary that can be further generalized to a set of topics from temporal distributions of visual vocabulary. A subject specific supervised learning approach based on Random Forests is used to classify the testing sequences to seven different conversational scenarios. These conversational scenarios concerned in this workhave rather subtle differences among them. Unlike typical action or event recognition, each interaction in our case contain many instances of primitive motions and actions, many of which are shared among different conversation scenarios. That is the interactions we are concerned with are not micro or instant events, such as hugging and high-five, but rather interactions over a period of time that consists rather similar individual motions, micro actions and interactions. We believe this is among one of the first work that is devoted to subject specific conversational interaction classification using 3D pose features and to show this task is indeed possible. Journal Article Graphical Models 76 3 162 171 Elsevier 15240703 Human interaction, Action recognition, Human pose, Random forests, Bag of words 31 5 2014 2014-05-31 10.1016/j.gmod.2013.10.006 http://www.sciencedirect.com/science/article/pii/S1524070313000337 COLLEGE NANME Mathematics and Computer Science School COLLEGE CODE MACS Swansea University 2020-12-07T13:21:59.4785185 2019-03-20T10:10:34.4837235 Faculty of Science and Engineering School of Mathematics and Computer Science - Computer Science Jingjing Deng 1 Xianghua Xie 0000-0002-2701-8660 2 Ben Daubney 3 0049635-01042019171033.pdf gmod.pdf 2019-04-01T17:10:33.3000000 Output 3079687 application/pdf Accepted Manuscript true 2019-04-01T00:00:00.0000000 true eng
title	A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
spellingShingle	A bag of words approach to subject specific 3D human pose interaction classification with random decision forests Jingjing Deng Xianghua Xie
title_short	A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
title_full	A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
title_fullStr	A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
title_full_unstemmed	A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
title_sort	A bag of words approach to subject specific 3D human pose interaction classification with random decision forests
author_id_str_mv	6f6d01d585363d6dc1622640bb4fcb3f b334d40963c7a2f435f06d2c26c74e11
author_id_fullname_str_mv	6f6d01d585363d6dc1622640bb4fcb3f_*_Jingjing Deng b334d40963c7a2f435f06d2c26c74e11_*_Xianghua Xie
author	Jingjing Deng Xianghua Xie
author2	Jingjing Deng Xianghua Xie Ben Daubney
format	Journal article
container_title	Graphical Models
container_volume	76
container_issue	3
container_start_page	162
publishDate	2014
institution	Swansea University
issn	15240703
doi_str_mv	10.1016/j.gmod.2013.10.006
publisher	Elsevier
college_str	Faculty of Science and Engineering
hierarchytype
hierarchy_top_id	facultyofscienceandengineering
hierarchy_top_title	Faculty of Science and Engineering
hierarchy_parent_id	facultyofscienceandengineering
hierarchy_parent_title	Faculty of Science and Engineering
department_str	School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science
url	http://www.sciencedirect.com/science/article/pii/S1524070313000337
document_store_str	1
active_str	0
description	In this work, we investigate whether it is possible to distinguish conversational interactions from observing human motion alone, in particular subject specific gestures in 3D. We adopt Kinect sensors to obtain 3D displacement and velocity measurements, followed by wavelet decomposition to extract low level temporal features. These features are thengeneralized to form a visual vocabulary that can be further generalized to a set of topics from temporal distributions of visual vocabulary. A subject specific supervised learning approach based on Random Forests is used to classify the testing sequences to seven different conversational scenarios. These conversational scenarios concerned in this workhave rather subtle differences among them. Unlike typical action or event recognition, each interaction in our case contain many instances of primitive motions and actions, many of which are shared among different conversation scenarios. That is the interactions we are concerned with are not micro or instant events, such as hugging and high-five, but rather interactions over a period of time that consists rather similar individual motions, micro actions and interactions. We believe this is among one of the first work that is devoted to subject specific conversational interaction classification using 3D pose features and to show this task is indeed possible.
published_date	2014-05-31T19:42:16Z
_version_	1821345197807632384
score	11.04748

A bag of words approach to subject specific 3D human pose interaction classification with random decision forests

Similar Items