WORLDCOMP'15 Tutorial: Prof. Wenbing Zhao

Last modified 2015-07-12 11:22

Human Motion Tracking and Recognition with Microsoft Kinect
Prof. Wenbing Zhao
Associate Professor, Department of Electrical and Computer Engineering
Cleveland State University, USA

Date & Time: July 28, 2015 (05:40pm; estimated duration: about 2+ hours)
Location: Ballroom 5


    Launched in 2010, Microsoft Kinect is one of the most popular game controllers in recent years, having sold more than 24 million units as of February 2013. Kinect allows users to naturally interact with a computer or game console with gestures and/or voice commands. With such widespread popularity in the market, Microsoft Kinect has attracted researchers to investigate its applications in many fields beyond video games, including healthcare, education, retail, training, virtual reality, robotics, sign languages, and other areas. Moreover, researchers have intensively studied fundamental techniques for human motion tracking and analysis using Microsoft Kinect.

    In this tutorial, we present a comprehensive review of the applications of the Microsoft Kinect sensor in various domains and recent studies on human motion tracking and recognition that power these applications. This tutorial will contain the following 5 sections:

    Section 1 introduces the Kinect technology, including its features, technical specifications, programming interfaces, and depth sensing techniques.

    Section 2 describes various applications of the Kinect technology. The applications are categorized into the areas of healthcare (physical therapy, operating room assistance, and fall detection and prevention), virtual reality and gaming, natural user interfaces, robotics control and interaction, retail services, workplace safety training, speech and sign language recognition, 3D recognition, and education and performing arts.

    Section 3 covers research on human motion tracking. In this section, we discuss the computer vision techniques used to achieve human pose and skeleton estimation. The foundation for human pose and skeleton estimation is per-pixel body part classification, which is followed by estimating hypotheses of body joint positions by finding local centroids of the body part probability mass using mean shift mode detection. We also introduce the research on hand detection and hand pose estimation.
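
    The mode-finding step mentioned above can be illustrated with a minimal sketch. It assumes the classifier has already produced, for one body part, a list of 3D pixel positions and a per-pixel probability for that part; the function name `mean_shift_mode`, the Gaussian kernel, and the `bandwidth` value are illustrative assumptions, not the Kinect implementation.

```python
import math


def mean_shift_mode(points, weights, start, bandwidth=0.1, tol=1e-6, max_iter=100):
    """Climb from `start` to a local mode of a weighted Gaussian kernel density.

    points  -- 3D positions of pixels (hypothetical input from a classifier)
    weights -- per-pixel probability of belonging to the body part
    """
    current = tuple(start)
    for _ in range(max_iter):
        total = 0.0
        acc = [0.0] * len(current)
        for p, w in zip(points, weights):
            # Squared distance from the current estimate to this pixel.
            d2 = sum((a - b) ** 2 for a, b in zip(p, current))
            # Gaussian kernel scaled by the body part probability.
            k = w * math.exp(-d2 / (2.0 * bandwidth ** 2))
            total += k
            for i, a in enumerate(p):
                acc[i] += k * a
        if total == 0.0:
            break  # no pixel contributes; keep the current estimate
        new = tuple(a / total for a in acc)
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(new, current)))
        current = new
        if shift < tol:
            break  # converged to a local centroid of the probability mass
    return current


if __name__ == "__main__":
    # A tight cluster of likely "hand" pixels plus one unlikely outlier.
    pts = [(0.0, 0.0, 1.0), (0.02, 0.0, 1.0), (-0.02, 0.0, 1.0),
           (0.0, 0.02, 1.0), (0.0, -0.02, 1.0), (1.0, 1.0, 2.0)]
    wts = [0.9, 0.8, 0.8, 0.8, 0.8, 0.3]
    print(mean_shift_mode(pts, wts, start=(0.05, 0.05, 1.0)))
```

    In this sketch the distant, low-probability outlier has almost no influence on the result, which is the reason a mode is preferred over a plain weighted average of all pixels.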

    Section 4 presents research on human motion recognition. Unlike human motion tracking, which focuses on the recognition of human body joints and segments, human motion recognition aims to understand the semantics of human gestures and activities. A gesture typically involves one or two hands, and possibly body poses, to convey some concrete meaning, such as waving the hand to say goodbye. An activity usually refers to a sequence of full body movements that a person performs, such as walking, running, or brushing teeth, which does not necessarily convey a meaning to the computer or to other persons. Rehabilitation exercises form a special type of activity. The approaches used in gesture and activity recognition are divided into the following categories:

      • Algorithmic-based recognition: In this approach, a gesture or an activity is recognized based on a set of manually defined rules. Algorithmic-based recognition is popular in gaming and healthcare applications because the gestures and/or activities are usually very well defined, relatively simple, and repetitive in nature. Each gesture or activity normally has a pre-defined starting and ending pose that can be used to delineate an iteration of the gesture or activity, which makes the algorithmic-based approach a good fit for such application domains.
      • Direct-matching-based recognition: In this approach, the unknown gesture or activity is directly compared with a set of templates. Dynamic time warping (DTW) is the best-known technique for analyzing the similarity between two temporal sequences that may vary in time and speed, by finding an optimal alignment between them. Typically one sequence is an unknown sequence to be classified and the other is a pre-classified reference sequence. The difference between the two sequences is expressed as the distance between them. In addition to DTW, other direct matching methods include the maximum correlation coefficient and Earth Mover's Distance.
      • Machine-learning-based motion recognition: This approach typically relies on one or more sophisticated statistical models, such as the Hidden Markov Model (HMM), Artificial Neural Networks (ANNs), Support Vector Machine (SVM), Decision Forest, and Adaboost, to capture the unique characteristics of a gesture or an activity. Most such models have a large number of parameters, which have to be determined in a training step based on pre-labeled motion data (including both data for the gesture to be recognized and other motion data that are known not to be the specific gesture). In general, the larger the feature set used for classification, the larger the training dataset required. For some models, such as ANNs, additional modeling parameters have to be manually tuned to achieve good classification accuracy. Furthermore, regression-based methods have also been used for motion recognition.
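
    As an illustration of the direct-matching approach, the following is a minimal sketch of DTW-based template classification. The function names `dtw_distance` and `classify`, the Euclidean frame distance, and the one-dimensional "hand height" features are assumptions made for the example; a real system would use joint positions or angles from the sensor.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])  # per-frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both sequences
    return cost[n][m]


def classify(unknown, templates):
    """Return the label of the reference sequence closest to `unknown`."""
    return min(templates, key=lambda label: dtw_distance(unknown, templates[label]))


if __name__ == "__main__":
    # Hypothetical pre-classified templates: normalized hand height over time.
    templates = {
        "raise": [(0.0,), (0.5,), (1.0,)],
        "lower": [(1.0,), (0.5,), (0.0,)],
    }
    # The same "raise" motion performed more slowly (more frames).
    unknown = [(0.0,), (0.1,), (0.4,), (0.6,), (0.9,), (1.0,)]
    print(classify(unknown, templates))  # prints "raise"
```

    Because the alignment may repeat frames of either sequence, the slow performance still matches the short "raise" template closely, which is the property that makes DTW tolerant of differences in timing and speed.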

    Section 5 describes how to use the Kinect SDK and the Unity3D framework to develop practical and powerful Kinect applications for human motion tracking and recognition. A live demo will be included.


    This tutorial will enable you to understand the Microsoft Kinect technology for both the original Xbox 360 Kinect sensor and the latest Xbox One Kinect sensor, and to learn the various algorithms and techniques for human motion tracking and recognition. In particular, this tutorial will help you:
      • Learn the features and programming interfaces of the Kinect sensor
      • Understand the under-the-hood depth sensing technologies used in Kinect
      • Identify innovative use of the Kinect motion sensor in your application domains
      • Gain insight into how to perform human pose and skeleton estimation from the depth data
      • Learn various cutting-edge approaches to human motion recognition
      • Learn how to build a practical and powerful Kinect application that is able to track and recognize human gestures and activities.


    This tutorial is intended for faculty, graduate students, engineers, and scientists who want to learn how to design and implement Kinect applications and carry out human motion tracking and recognition research with Kinect. In particular, this tutorial may be of interest to participants of the following WORLDCOMP’15 conferences:

      • The 19th International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV'15)
      • International Conference on Health Informatics and Medical Systems (HIMS'15)
      • The 11th International Conference on Frontiers in Education: Computer Science and Computer Engineering (FECS'15)
      • The 17th International Conference on Artificial Intelligence (ICAI'15)


    Dr. Zhao is currently an Associate Professor at the Department of Electrical and Computer Engineering, Cleveland State University. He earned his Ph.D. at the University of California, Santa Barbara, under the supervision of Drs. Moser and Melliar-Smith, in 2002. Dr. Zhao has authored a research monograph titled “Building Dependable Distributed Systems” and published over 50 papers in the area of fault tolerant and dependable systems (three of them won best paper awards). A selected list of publications directly relevant to the proposed tutorial is provided below (The entire publication list can be seen at:

      • Zhao, W. (2014). Building Dependable Distributed Systems. Scrivener Publishing and John Wiley & Sons.
      • Zhao, W. (2015). Fast Paxos Made Easy: Theory and Implementation. International Journal of Distributed Systems and Technologies (IJDST), 6(1), 15-33.
      • Chai, H., Zhang, H., Zhao, W., Melliar-Smith, P. M., & Moser, L. E. (2013). Toward Trustworthy Coordination of Web Services Business Activities. IEEE Transactions on Services Computing, 6(2), 276-288.
      • Zhang, H., Chai, H., Zhao, W., Melliar-Smith, P. M., & Moser, L. E. (2011). Trustworthy Coordination of Web Services Atomic Transactions. IEEE Transactions on Parallel and Distributed Systems, 23(8), 1551-1565.
      • Zhang, H., & Zhao, W. (2012). Concurrent Byzantine Fault Tolerance for Software-Transactional-Memory Based Applications. International Journal of Future Computer and Communication, 1(1), 47-50. Won the best paper award at the International Conference on Distributed Computing Engineering, July 2-3, 2012, Hong Kong.
      • Zhao, W. (2007, October). BFT-WS: A Byzantine fault tolerance framework for web services. In EDOC Conference Workshop, 2007. EDOC'07. Eleventh International IEEE (pp. 89-96). IEEE. Won the best paper award.
      • Zhao, W. (2014). Application-aware Byzantine fault tolerance. In Proceedings of the 12th IEEE International Conference on Dependable Autonomic and Secure Computing, August 24-27, Dalian, China.
      • Chai, H., & Zhao, W. (2014). Byzantine fault tolerant event stream processing for autonomic computing. In Proceedings of the 12th IEEE International Conference on Dependable Autonomic and Secure Computing, August 24-27, Dalian, China.
      • Chai, H., & Zhao, W. (2014). Towards trustworthy complex event processing. In Proceedings of the 5th IEEE International Conference on Software Engineering and Service Science, June 27-29, Beijing, China.
      • Chai, H., & Zhao, W. (2014). Byzantine fault tolerance for services with commutative operations. In Proceedings of the 11th IEEE International Conference on Services Computing, June 27-July 2, Anchorage, Alaska.
      • Zhao, W., & Babi, M. (2013, April). Byzantine fault tolerant collaborative editing. In Information and Communications Technologies (IETICT 2013), IET International Conference on (pp. 233-240). IET.
      • Chai, H., & Zhao, W. (2013). Byzantine fault tolerance for session-oriented multi-tiered applications. International Journal of Web Science, 2(1), 113-125.
      • Chai, H., & Zhao, W. (2012). Interaction Patterns for Byzantine Fault Tolerance, in Proceedings of WSE 2012, Springer CCIS Vol. 342, pp. 180-188.
      • Zhang, H., Zhao, W., Moser, L. E., & Melliar-Smith, P. M. (2011). Design and implementation of a Byzantine fault tolerance framework for non-deterministic applications. IET software, 5(3), 342-356.
      • Zhao, W., & Zhang, H. (2009). Proactive service migration for long-running Byzantine fault-tolerant systems. IET software, 3(2), 154-164.
      • Zhao, W. (2008, December). Integrity-Preserving Replica Coordination for Byzantine Fault Tolerant Systems. In Parallel and Distributed Systems, 2008. ICPADS'08. 14th IEEE International Conference on (pp. 447-454). IEEE.
      • Zhao, W., & Villaseca, F. E. (2008, July). Byzantine fault tolerance for electric power grid monitoring and control. In Embedded Software and Systems, 2008. ICESS'08. International Conference on (pp. 129-135). IEEE.
      • Zhao, W., & Zhang, H. (2008, July). Byzantine fault tolerant coordination for web services business activities. In Services Computing, 2008. SCC'08. IEEE International Conference on (Vol. 1, pp. 407-414). IEEE.
      • Zhao, W. (2007, September). A byzantine fault tolerant distributed commit protocol. In Dependable, Autonomic and Secure Computing, 2007. DASC 2007. Third IEEE International Symposium on (pp. 37-46). IEEE.