OpenMoCap: Rethinking Optical Motion Capture under Real-world Occlusion

ACM Multimedia 2025
Chen Qian, Danyang Li, Xinran Yu, Zheng Yang, Qiang Ma
Tsinghua University, Beijing, China
Comparison of reconstructed human skeletons obtained by different methods under complex motions.

A 3-minute introductory video

Abstract

Optical motion capture is a foundational technology driving advancements in cutting-edge fields such as virtual reality and film production. However, system performance suffers severely under large-scale marker occlusions common in real-world applications. An in-depth analysis identifies two primary limitations of current models: (i) the lack of training datasets accurately reflecting realistic marker occlusion patterns, and (ii) the absence of training strategies designed to capture long-range dependencies among markers. To tackle these challenges, we introduce the CMU-Occlu dataset, which incorporates ray tracing techniques to realistically simulate practical marker occlusion patterns. Furthermore, we propose OpenMoCap, a novel motion-solving model designed specifically for robust motion capture in environments with significant occlusions. Leveraging a marker-joint chain inference mechanism, OpenMoCap enables simultaneous optimization and construction of deep constraints between markers and joints. Extensive comparative experiments demonstrate that OpenMoCap consistently outperforms competing methods across diverse scenarios, while the CMU-Occlu dataset opens the door for future studies in robust motion solving. The proposed OpenMoCap is integrated into the MoSen MoCap system for practical deployment.

Problem Overview

Optical Motion Capture (MoCap) in real-world scenarios often suffers from severe and long-term marker occlusions. As illustrated in Fig. 1, even with multiple cameras, certain markers inevitably become invisible due to body self-occlusion or limited viewpoints. The absence of these markers can lead to a drastic degradation in the performance of existing solvers, underscoring the need for models that remain robust under real-world occlusions.

Ray-traced simulation of visible and occluded markers
Fig. 1: Marker occlusion in Optical MoCap.
Comparison under occlusion (RoMo vs. OpenMoCap)
Fig. 2: Performance of solvers in a real MoCap scenario.

Framework

OpenMoCap is designed to handle real-world marker occlusions through a two-stage architecture. First, a Position Solver recovers the locations of both visible and occluded markers and estimates joint positions, powered by our Marker–Joint Chain Inference Mechanism that builds long-range dependencies between markers and joints. Then, a Rotation Solver takes the refined positions to predict joint rotations using a stacked attention-based model. Together, these components enable OpenMoCap to deliver accurate motion reconstruction even under severe occlusions.

Fig. 3: Overview of OpenMoCap.
Fig. 4: Marker-Joint Chain Inference Mechanism.
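
To make the two-stage design concrete, below is a minimal PyTorch sketch of how a Position Solver and a Rotation Solver could be wired together. The marker/joint counts, embedding sizes, learned joint-query tokens, and transformer layout are illustrative assumptions, not the released OpenMoCap architecture; the shared marker-joint token sequence only approximates the role of the Marker-Joint Chain Inference Mechanism.

# Illustrative two-stage solver sketch (assumed shapes and modules;
# not the released OpenMoCap implementation).
import torch
import torch.nn as nn

N_MARKERS, N_JOINTS, D = 56, 24, 256  # assumed marker/joint counts

class PositionSolver(nn.Module):
    """Recovers occluded marker positions and estimates joint positions.

    Markers and joints are embedded as one token sequence so that
    self-attention can build long-range marker-joint dependencies
    (a stand-in for the Marker-Joint Chain Inference Mechanism).
    """
    def __init__(self):
        super().__init__()
        self.marker_embed = nn.Linear(4, D)           # (x, y, z, visible)
        self.joint_query = nn.Parameter(torch.randn(N_JOINTS, D))
        layer = nn.TransformerEncoderLayer(D, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.marker_head = nn.Linear(D, 3)            # completed markers
        self.joint_head = nn.Linear(D, 3)             # joint positions

    def forward(self, markers):                       # (B, N_MARKERS, 4)
        B = markers.shape[0]
        tokens = torch.cat(
            [self.marker_embed(markers),
             self.joint_query.expand(B, -1, -1)], dim=1)
        h = self.encoder(tokens)
        return (self.marker_head(h[:, :N_MARKERS]),   # (B, N_MARKERS, 3)
                self.joint_head(h[:, N_MARKERS:]))    # (B, N_JOINTS, 3)

class RotationSolver(nn.Module):
    """Predicts per-joint rotations from the completed positions."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(3, D)
        layer = nn.TransformerEncoderLayer(D, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.rot_head = nn.Linear(D, 6)  # 6D rotation representation

    def forward(self, joints):                        # (B, N_JOINTS, 3)
        return self.rot_head(self.encoder(self.embed(joints)))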

Dataset

We introduce the CMU-Occlu dataset, which incorporates realistic marker occlusion patterns and thereby overcomes the overly simplistic and unrealistic occlusion assumptions of existing optical MoCap datasets. The dataset can also serve as a benchmark for evaluating optical motion capture solvers.

dataset comparison
Fig. 5: Comparison of marker occlusion patterns of different datasets.
Dataset demonstration
OpenMoCap trained on the CMU-Occlu dataset exhibits superior reconstruction capability.
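
As a rough illustration of how ray tracing can decide marker visibility, the sketch below casts a ray from each camera to each marker and tests it against the body mesh using the Moller-Trumbore intersection test; a marker seen by fewer than two cameras cannot be triangulated and is marked occluded. The function names, the two-view threshold, and the brute-force triangle loop are assumptions for illustration, not the actual CMU-Occlu generation pipeline.

# Illustrative ray-traced occlusion simulation (assumed logic only).
import numpy as np

def ray_hits_triangle(origin, direction, tri, eps=1e-9):
    """Moller-Trumbore ray/triangle test; returns hit distance t or None."""
    v0, v1, v2 = tri
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = e1 @ p
    if abs(det) < eps:
        return None                       # ray parallel to triangle
    inv = 1.0 / det
    s = origin - v0
    u = (s @ p) * inv
    if u < 0 or u > 1:
        return None
    q = np.cross(s, e1)
    v = (direction @ q) * inv
    if v < 0 or u + v > 1:
        return None
    t = (e2 @ q) * inv
    return t if t > eps else None

def marker_visible(camera, marker, body_triangles, margin=1e-3):
    """Occluded if the camera-to-marker segment hits the body mesh first."""
    direction = np.asarray(marker, float) - np.asarray(camera, float)
    dist = np.linalg.norm(direction)
    direction = direction / dist
    for tri in body_triangles:
        t = ray_hits_triangle(camera, direction, tri)
        if t is not None and t < dist - margin:
            return False
    return True

def visibility_mask(cameras, markers, body_triangles, min_views=2):
    """Markers seen by fewer than `min_views` cameras cannot be triangulated."""
    views = np.array([[marker_visible(c, m, body_triangles)
                       for m in markers] for c in cameras])
    return views.sum(axis=0) >= min_views  # (n_markers,) boolean mask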

Demo

Below are qualitative results of OpenMoCap. Our method robustly reconstructs accurate human poses despite significant marker occlusions, demonstrating its effectiveness and reliability for practical applications.

Qualitative demo under severe occlusions

Comparison & Application

We conduct experiments on both the CMU and CMU-Occlu datasets to evaluate OpenMoCap against state-of-the-art methods. Tab. 1 reports quantitative results, where OpenMoCap consistently achieves the best performance across all metrics. In addition, we collected real-world MoCap data in which an actor performs a Russian twist, and compared the reconstructions obtained by different solvers, as illustrated in Fig. 6.

Tab. 1: Comparison of MoCap solvers on CMU and CMU-Occlu datasets.
Dataset     Metric     MoSh++   MoCap-Solver   Local MoCap   RoMo   OpenMoCap
CMU         JPE (cm)   2.58     2.56           0.94          0.89   0.41
CMU         JOE (°)    9.40     6.51           3.59          3.43   2.52
CMU-Occlu   JPE (cm)   2.72     2.95           1.23          1.16   0.46
CMU-Occlu   JOE (°)    9.68     6.83           3.80          3.54   2.60
Qualitative results
Fig. 6: Qualitative results on real-world capture data.
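
For reference, the two metrics in Tab. 1 can be computed as in the sketch below, assuming JPE is the mean Euclidean joint position error in centimeters and JOE is the mean geodesic angle between predicted and ground-truth joint rotations; the paper's exact evaluation protocol may differ.

# Hedged sketch of the Tab. 1 metrics (assumed definitions).
import numpy as np

def jpe(pred_pos, gt_pos):
    """Joint Position Error (cm): mean Euclidean distance.

    pred_pos, gt_pos: (frames, joints, 3) arrays in centimeters.
    """
    return np.linalg.norm(pred_pos - gt_pos, axis=-1).mean()

def joe(pred_rot, gt_rot):
    """Joint Orientation Error (degrees): mean geodesic rotation angle.

    pred_rot, gt_rot: (frames, joints, 3, 3) rotation matrices.
    """
    # Relative rotation R = R_pred^T @ R_gt; its angle is acos((trace - 1) / 2).
    rel = np.einsum('...ji,...jk->...ik', pred_rot, gt_rot)
    trace = np.trace(rel, axis1=-2, axis2=-1)
    angle = np.arccos(np.clip((trace - 1.0) / 2.0, -1.0, 1.0))
    return np.degrees(angle).mean()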

MoCap System

This study is powered by our self-developed motion capture system, which achieves sub-millimeter 3D accuracy at a hardware cost of less than $1,000 per camera.

motion capture camera
Fig. 7: Motion capture camera.
motion capture UI
Fig. 8: Motion capture UI.
motion capture scenario
Fig. 9: Motion capture scenario.

Contact

If you have any questions, please feel free to contact us.

BibTeX

@inproceedings{qian2025openmocap,
  title={OpenMoCap: Rethinking Optical Motion Capture under Real-world Occlusion},
  author={Qian, Chen and Li, Danyang and Yu, Xinran and Yang, Zheng and Ma, Qiang},
  booktitle={Proceedings of the 33rd ACM International Conference on Multimedia},
  pages={7529--7537},
  year={2025}
}