Real-Time Camera Motion Tracking in Planar View Scenarios
Luis Alvarez, Luis Gomez, Pedro Henriquez, Javier Sanchez
CTIM. Centro de I+D de Tecnologías de la Imagen
Universidad de Las Palmas de G.C.

Abstract

We propose a novel method for real-time camera motion tracking in planar view scenarios. The method relies on the geometry of a tripod, an initial estimation of the camera pose for the first video frame, and a primitive tracking procedure. This procedure uses lines and circles as primitives, which are extracted by applying a CART (Classification and Regression Tree). We have applied the proposed method to HD (High Definition) videos of soccer matches. Experimental results show that our proposal can process high-definition video in real time. We validate the procedure by inserting virtual content into the video sequence.

Experiments and results

To assess the performance of the proposed method, we have tracked the camera motion in two different kinds of sequences: scale soccer court models (1440 x 809 pixels) and real scenes from soccer matches in HD (1920 x 1080 pixels). The experiments were executed on an Intel Core i7 2.00 GHz processor with 4 GB of RAM. The camera motion tracking procedure takes 3 milliseconds per frame in the scale model sequence (1440 x 809) and 5 milliseconds per frame in the real HD sequence (1920 x 1080). These times include the primitive tracking and the camera parameter computation; image loading time is not considered because it depends strongly on the system architecture. These results show that the proposal can be applied to real-time video processing.

To illustrate the quality and accuracy of the camera motion tracking, we have used the camera parameters obtained by the tracking procedure to insert graphics into the video. When the camera motion and the intrinsic parameters are known, we can synchronize the real camera with a virtual camera and render objects with the same perspective. We chose these experiments because this kind of application requires an accurate and fast camera motion tracking computation.
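For a planar scene, inserting a virtual graphic reduces to mapping its field-plane coordinates into the image through the estimated plane-to-image homography. The following sketch (names and the numeric homography are illustrative, not taken from the paper) projects the corners of a virtual banner placed on the field plane:

```python
import numpy as np

def project_points(H, pts):
    """Map 2-D field-plane points to image pixels via a 3x3 homography H."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
    proj = pts_h @ H.T
    return proj[:, :2] / proj[:, 2:3]  # de-homogenize

# Hypothetical homography: a simple scale plus translation of field coordinates.
H = np.array([[20.0,  0.0, 100.0],
              [ 0.0, 20.0,  50.0],
              [ 0.0,  0.0,   1.0]])

# Corners of a 5 m x 2 m virtual banner lying on the field plane (meters).
banner = np.array([[0.0, 0.0], [5.0, 0.0], [5.0, 2.0], [0.0, 2.0]])
corners_px = project_points(H, banner)  # pixel positions for rendering
```

Once the corner pixels are known, the graphic can be warped and blended into the frame; with the tracked camera parameters this stays consistent across the whole sequence.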

Scale model sequence


Real sequence (HD)

Conclusions

In this paper we have proposed a new method for camera motion tracking in planar view scenarios. The method has several interesting features: it relies on a very general tripod geometry, which allows it to be used with any camera platform setting; it attains real-time performance on HD video sequences while preserving high calibration accuracy; and it is very robust, since it works properly even when only a small number of primitives is detected. This is an important feature because some of the soccer field lines are usually occluded by the players.

To solve this problem, we first assume that the camera is mounted on a tripod, which is a common situation in practice, and study the geometry of the tripod from a mathematical point of view. This assumption greatly simplifies the calibration problem and allows us to recover the frame calibration in situations where general calibration techniques fail. One of the main novelties of the tripod model is that the tripod rotation center and the camera projection center do not necessarily coincide.
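The effect of this non-coincidence can be made explicit in the projection model. In the notation below (introduced here for illustration; the paper's own parametrization may differ), O is the tripod rotation center, R the tripod rotation, d the fixed displacement from the rotation center to the camera projection center, and K the intrinsic matrix:

```latex
% Camera center C and projection of a world point X under tripod rotation R:
C(R) = O + R\,d, \qquad
\mathbf{x} \simeq K\, R^{\top} \bigl( X - C(R) \bigr)
% When d = 0 the camera center is fixed (C = O) and a pure-rotation
% homography model suffices; d \neq 0 gives the more general tripod geometry.
```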

To perform the camera motion tracking we propose a method based on homography estimation and primitive tracking. To track the primitives through the whole video sequence we use a CART (Classification and Regression Tree). The decision tree is estimated by a learning procedure based on the RGB image channels and a suitable training set. We present experiments using HD videos of sport events (soccer matches) on both scale soccer court models and real scenarios. The experimental results show that the method is fast and accurate: the average processing time is 5 milliseconds per HD frame and the projection error is very small. In terms of computational complexity, the key point is that decision tree evaluation is very fast and the method is local, i.e., we only need to process a neighborhood around the predicted primitive location. As an application of the proposed method, we have shown that it is straightforward to seamlessly insert virtual objects into a video sequence.
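The CART-based primitive extraction can be sketched as a per-pixel RGB classifier evaluated only on a small neighborhood around the predicted primitive location. The training values and library choice below are illustrative assumptions (the paper does not specify an implementation); here we use scikit-learn's CART implementation:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training set: RGB values labelled as field-line pixel (1)
# vs. grass/background pixel (0). Values are illustrative only.
X_train = np.array([
    [230, 235, 228], [210, 215, 205], [245, 250, 240],   # white line pixels
    [ 40, 120,  35], [ 55, 140,  50], [ 30, 100,  25],   # grass pixels
])
y_train = np.array([1, 1, 1, 0, 0, 0])

# Fit a shallow CART; evaluation is then a handful of comparisons per pixel,
# which is why the per-frame cost stays low.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# Classify only a small neighborhood around the predicted primitive location
# (the method is local); here a toy patch of four RGB pixels.
patch = np.array([[220, 225, 218], [ 45, 125,  40],
                  [238, 242, 235], [ 50, 130,  45]])
labels = tree.predict(patch)  # 1 = line pixel, 0 = background
```

The classified line pixels then feed the homography estimation for the current frame.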

Acknowledgements

This research has been partially supported by the MICINN project MTM2010-17615 (Ministerio de Ciencia e Innovación, Spain). We thank MEDIAPRODUCCION S.L. for providing the real HD video used in the numerical experiments.