Human Pose Estimation is a task in the field of computer vision that involves identifying
and capturing the positions and orientations of the human body. This is typically done
by predicting the locations of specific keypoints, such as hands, head, and elbows, in an
image. Human Pose Estimation has various applications in different industries, including
robotics, augmented reality, gaming, accessibility, sports, and security.
Grazper Technologies ApS, the partner for this thesis, is working on developing a real-time
3D human pose estimation system using a multicamera setup. The primary application of
this system is in the field of security. However, the system requires multiple cameras to view
the same scene from different angles. This requirement limits its usability, especially in security
applications, where it is unlikely that more than one or two cameras point at the same location
at the same time.
The aim of the present thesis is to study whether we can improve the 3D pose estimation
in these cases by incorporating knowledge about foot contact. To do so, we will acquire
an IoT-connected sole pair that can make pressure measurements, and incorporate it into
Grazper’s current video acquisition setup.
During the course of the thesis, we designed a reliable, stable, and automated data
acquisition setup, enabling Grazper to easily record high-quality datasets with the potential
to obtain synchronized ground-truth sole pressure signals. We demonstrate the feasibility of
predicting sole pressure from the pose using deep learning techniques. Finally, we
show how sole contact can enhance the performance of a pose detector in scenarios with
fewer cameras.
These results offer a proof of concept for future AI solutions and demonstrate the
potential of this technique for further development.