Reducing Video Vibration in Two Ways: Simple Moving Average (SMA) and Kalman Filter (KF)

8 min read · Apr 16, 2021

1. Background

As technology has advanced, shooting video has become more common than ever. People can shoot video with a DSLR or a smartphone. Professional photographers usually use a video stabilizer to keep the footage steady and improve its quality, but such devices are too expensive for most amateur photographers.

So in this article, we propose two methods for reducing video vibration and improving video quality. One smooths the camera motion with a Simple Moving Average (SMA); the other predicts the camera motion with a Kalman Filter (KF). The following sections introduce the tools we used, how we estimate the camera motion, and how we reduce the abnormal vibration of the video.

2. Material

All methods in this article were implemented and validated on Windows 10, using C++ in Visual Studio. Development relied on two main SDKs: OpenCV and FFmpeg.

OpenCV is a widely used image-processing library that supports many programming languages (C++, Python, Java, ...). FFmpeg, which is written in C, is a library for processing media (e.g. transcoding, muxing and streaming). FFmpeg also makes up for OpenCV's shortcomings in generating media files: to the best of my knowledge, OpenCV only provides one output format (.avi) for media files and cannot capture the audio track of the media.

In this article, all the image-processing steps mentioned below are implemented with OpenCV. Everything else (getting metadata, decoding, encoding and generating the media file) is handled by FFmpeg.

3. Methods

a. Motion Estimation

To find the camera motion, we need information from two consecutive frames. More specifically, we can track how blocks (or pixels) change between frames to estimate the camera motion.

The choice of blocks affects the performance of camera motion estimation. With well-chosen blocks, we can easily estimate camera motion such as translation, rotation and scaling. As the picture below (Fig 3.) shows, blocks that contain corners satisfy our needs.

Fig 3. The motion detected with a block that contains a corner

Once the corners are found, we can determine the camera motion with optical flow. The intensity of the pixels changes as the objects in the frame or the camera moves; this pattern of apparent motion is called optical flow. The video below shows the optical flow of the corners:

The optical flow of the corners can be treated as the motion of the corners. We can then use random sample consensus (RANSAC) to estimate the camera motion.

RANSAC is an iterative method that fits a model to a dataset and determines which samples are inliers and which are outliers. Its advantage is that it still gives a reasonable result even if the dataset contains many outliers, because it keeps iterating until a sufficient number of inliers is found.
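The idea can be illustrated with a minimal sketch. To keep it short, it assumes the camera motion is a pure translation, whereas the article estimates a full affine transformation; the names `Vec2` and `ransacTranslation`, the tolerance and the fixed iteration count are all hypothetical choices made for this example.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical type: displacement of one tracked corner between two frames.
struct Vec2 {
    double x;
    double y;
};

// Simplified RANSAC where the model is a single translation (dx, dy).
// Each iteration picks one flow vector as a candidate model, counts how many
// flow vectors agree with it within `tol` pixels, and remembers the candidate
// with the most inliers. The result is the mean of that candidate's inliers.
Vec2 ransacTranslation(const std::vector<Vec2>& flows, int iterations,
                       double tol, unsigned seed = 42) {
    if (flows.empty()) return {0.0, 0.0};
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, flows.size() - 1);

    std::size_t bestCount = 0;
    Vec2 best{0.0, 0.0};
    for (int it = 0; it < iterations; ++it) {
        const Vec2 candidate = flows[pick(rng)];
        double sumX = 0.0;
        double sumY = 0.0;
        std::size_t count = 0;
        for (const Vec2& f : flows) {
            const double ex = f.x - candidate.x;
            const double ey = f.y - candidate.y;
            if (ex * ex + ey * ey <= tol * tol) {  // inlier test
                sumX += f.x;
                sumY += f.y;
                ++count;
            }
        }
        if (count > bestCount) {  // keep the model with the most support
            bestCount = count;
            best = {sumX / static_cast<double>(count),
                    sumY / static_cast<double>(count)};
        }
    }
    return best;
}
```

Flow vectors that belong to independently moving objects end up as outliers and do not affect the returned translation, which is why this kind of estimator suits camera motion estimation.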

After the camera motion is estimated, an affine transformation is calculated frame by frame from it.

An affine transformation is a geometric transformation that includes translation, rotation, scaling and shearing. In our method, we only take the 2D transformation into account.
An affine transformation is usually represented by a transformation matrix. More info about the transformation matrix:

File:2D affine transformation matrix.svg

English: Illustration of the effect of applying various 2D affine transformation matrices on a unit square. Note that…

zh.wikipedia.org
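For reference, the 2D case can be written out as below. This is the standard matrix form of a 2D affine transformation; the symbols a, b, c, d, t_x, t_y, s and θ are notation chosen here and do not come from the article's code.

```latex
\begin{bmatrix} x' \\ y' \end{bmatrix}
=
\begin{bmatrix} a & b & t_x \\ c & d & t_y \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix},
\qquad
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
=
s \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\quad \text{(rotation by } \theta \text{ with uniform scale } s\text{)}
```

The last column (t_x, t_y) carries the translation, and the left 2x2 block carries rotation, scaling and shearing.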

In the next section, we show how to reduce the abnormal vibration with a moving average and a Kalman filter, respectively.

b. Video Stabilization

We obtained the camera motion and the affine transformation in the previous section. From the resulting transformation matrix, we can get the rotation angle (da) and the translations along the x-axis (dx) and y-axis (dy). These three parameters are the cornerstone of reducing the vibration of the video.
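A minimal sketch of this extraction is shown here. It assumes the matrix is a 2x3 affine matrix stored row by row and that it contains only rotation, uniform scale and translation (no shear); the names `Motion`, `Affine2x3` and `decompose` are hypothetical.

```cpp
#include <array>
#include <cmath>

// Hypothetical container for the three per-frame motion parameters.
struct Motion {
    double dx;  // translation along the x-axis, in pixels
    double dy;  // translation along the y-axis, in pixels
    double da;  // rotation angle, in radians
};

// 2x3 affine matrix in row-major order: [a b tx; c d ty].
using Affine2x3 = std::array<double, 6>;

// Reads dx, dy and da from the matrix.
// For rotation plus uniform scale, a = s*cos(da) and c = s*sin(da),
// so atan2(c, a) recovers the angle regardless of the scale s.
Motion decompose(const Affine2x3& m) {
    return {m[2], m[5], std::atan2(m[3], m[0])};
}
```

Using `atan2` rather than `acos` or `asin` keeps the sign of the angle and stays valid when the matrix also carries a scale factor.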

I. Simple Moving Average(SMA)

Simple Moving Average (SMA) is a method for smoothing data. It is often used to analyze the trend of data and is widely used in finance. Fig 4. shows how SMA works.

In Fig 4, the window size is set to 7 data points and the window moves 1 point at a time. The new value of a data point is the average of the 3 previous points, the 3 following points and the point itself, as in Round 4 in Fig 4. If fewer than 3 previous or following points exist, the average is calculated from the points that do exist in the window, as in Round 1 and Round 16 in Fig 4.

In this article, the window size is set to 151 data points and the window moves 1 point at a time. The three parameters (dx, dy, da) are converted into trajectories (x, y, a), and SMA smooths these trajectories. Fig 5. shows the result after SMA: the green curve is the original trajectory and the red curve is the trajectory after SMA.

Fig 5. The result of x, y and a after SMA

The smoothed data is converted back into transformation matrices, which are then applied to every frame. In the video below, the left side is the original video and the right side is the video after SMA.
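One possible way to do this conversion is sketched here. It follows the approach of the learnopencv.com post listed in the references (cumulative sum to build the trajectory, then shifting each per-frame motion by the gap between the smoothed and the original trajectory); whether the article's code does exactly this is an assumption, and the function names are hypothetical.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Cumulative sum: per-frame motion (e.g. dx) -> trajectory (e.g. x).
std::vector<double> toTrajectory(const std::vector<double>& deltas) {
    std::vector<double> trajectory(deltas.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < deltas.size(); ++i) {
        sum += deltas[i];
        trajectory[i] = sum;
    }
    return trajectory;
}

// Shifts each per-frame motion by (smoothed - original) trajectory, so that
// the accumulated motion follows the smoothed curve.
// All three vectors are assumed to have the same length.
std::vector<double> correctedDeltas(const std::vector<double>& deltas,
                                    const std::vector<double>& trajectory,
                                    const std::vector<double>& smoothed) {
    std::vector<double> out(deltas.size());
    for (std::size_t i = 0; i < deltas.size(); ++i) {
        out[i] = deltas[i] + (smoothed[i] - trajectory[i]);
    }
    return out;
}

// Rebuilds a 2x3 row-major matrix [cos -sin dx; sin cos dy] from the
// corrected parameters (rotation and translation only, no scale or shear).
std::array<double, 6> toMatrix(double dx, double dy, double da) {
    return {std::cos(da), -std::sin(da), dx,
            std::sin(da),  std::cos(da), dy};
}
```

The same three steps would be run independently for x, y and a, and the resulting matrix is what gets applied to each frame.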

II. Kalman Filter(KF)

In short, a Kalman filter predicts a value from the previous value, then updates (corrects) its parameters (gain and noise) using the new value, the previous value and an observed value to improve the estimate. The observed value usually comes from another source: for example, the new and previous values may be positions while the observed value is a velocity or an acceleration. KF is often used to predict or estimate trajectories, e.g. in PDR (Pedestrian Dead Reckoning), motion estimation and so on. If you want to learn more about KF, please see the Kalman filter links in the Reference section.

Because a second source is not easy to find for this topic and would increase the computational cost, the KF in this article only uses the previous value and the estimated value to update the parameters. As a result, dx, dy and da are re-predicted via KF to avoid unwanted jitter. Fig 6 shows the result of dx, dy and da after KF: the green curve is the original dx, dy and da, and the red curve is the result after KF.

The video smoothed by KF is shown below:

4. Discussion

Compared with KF, the video after SMA looks more stable, but SMA needs data that comes after the current data point. In contrast, KF only needs the previous data and the current data to estimate the new current value. Moreover, KF may have to be optimized by tuning its parameters, namely Q (the covariance of the process noise) and R (the covariance of the observation noise). Overall, both methods are capable of reducing video vibration, and they suit different contexts. SMA reduces vibration better but needs all the data, so it fits offline processing. KF only needs the previous data but still leaves some annoying vibration from the original video; it may be used on a real-time stream if enough computing power is available.

5. Conclusion

We proposed and compared two methods for reducing video vibration and improving video quality: Simple Moving Average (SMA) and Kalman Filter (KF). Both methods proved capable of reducing vibration, and each suits a different context. SMA, which reduces vibration better but needs all frames of the video, can be used in an offline context. KF, which only needs the previous frames and the current frame, can be used in a real-time context (e.g. live streaming) if enough computing power is available.

You can get all the source code for this article on GitHub.

ben60523/video_stabilization

This repository provides a way to post-process media file to decrease the vibration of the video which is caused by…

github.com

6. References

Flycam Zest Power Video Stabilizer (5-15kg)

Description Guarantees Perfect Workflow in a Future-Proof & Modular Design. Professional Sled Quickly Adapts…

www.proaim.be

Energizer Smart Photography Series Bluetooth 3-Axis Gimbal Video Stabilizer

The Smart Photography Series Bluetooth 3-Axis Gimbal Video Stabilizer from Energizer provides the angles and stability…

www.bhphotovideo.com

Kalman filter - Wikipedia

In statistics and control theory, Kalman filtering, also known as linear quadratic estimation ( LQE), is an algorithm…

en.wikipedia.org

Video Stabilization Using Point Feature Matching in OpenCV

In this post, we will learn how to implement a simple Video Stabilizer using a technique called Point Feature Matching…

learnopencv.com

How a Kalman filter works, in pictures

I have to tell you about the Kalman filter, because what it does is pretty damn amazing. Surprisingly few software…

www.bzarg.com

OpenCV: Optical Flow

Prev Tutorial: Meanshift and Camshift In this chapter, We will understand the concepts of optical flow and its…

docs.opencv.org

Affine Transformation

An affine transformation is any transformation that preserves collinearity (i.e., all points lying on a line initially…

mathworld.wolfram.com

The RANSAC (Random Sample Consensus) Algorithm

The RANSAC algorithm [ 1] is an algorithm for robust fitting of models in the presence of many data outliers. The…

homepages.inf.ed.ac.uk

What Is Transformation Matrix and How to Use It

When you work with objects in a PDF file using the PDFium library, you can use the SetMatrix functions to transform the…

forum.patagames.com

Multi-Sensor Fusion Approach for Improving Map-Based Indoor Pedestrian Localization - PubMed

The interior space of large-scale buildings, such as hospitals, with a variety of departments, is so complicated that…

pubmed.ncbi.nlm.nih.gov