Three-dimensional calibration

Before the system can be used to track markers, the system first needs to be calibrated. This is done by placing a calibration frame in front of the cameras and instructing the software to do an automatic calibration. Figure 1 shows a drawing of a calibration frame. The frame is a cube with a white marker on each corner and also a marker in the center. The center marker has a coordinate of (0;0;0). The axis convention is also shown in Figure 1.

Figure 1 - Example of a calibration frame from the camera's point of view

1 The coordinate systems used

There are several coordinate systems used, which are explained below.

1.1 The physical coordinate system

This coordinate system is the physical coordinates of the points in the physical environment. These coordinates are 3D and are measured in meters. The convention of the physical coordinate system is shown in Figure 1 above.

1.2 The image coordinate system

This coordinate system's origin lies at the camera's focal point and the Zb=0 plane is parallel to the CCD plane. The Zb-axis lies on the camera's optical axis. The units is just as with the physical coordinate system meters. The center of the image on the CCD is (0;0;fk). When the camera image is viewed on screen, the positive Xb-axis points to the right and the positive Yb-axis points to the top. Note that the Xb-axis lies in the horizontal (X-Z)f plane and not the horizontal plane on the CCD.

1.3 The pixel co-ordinate system

This co-ordinate system is 2D and lies on the CCD plane. The distances are measured in pixels, and the top left corner of the image is (0;0). The positive Xp-axis points to the right, while the positive Yp-axis points to the bottom. The angle between the Xp-axis and Xb-axis is Ozk.

2 The parameters

The calibration process consists the determination of 7 parameters for each camera:

The first parameter is fk, the focal distance of the camera. The focal distance is the distance between the focal point of the camera lens and the CCD plane, measured in meter.

The second set of parameters,(Xk;Yk;Zk), is the physical co-ordinate of the camera's focal point.

The third set parameters are three angle parameters:

Oxk is the horizontal angle between the Zf-axis and the camera's optical axis. Standing behind the camera, the Oxk angle is increased by turning the camera's front to the left.

Oyk is the vertical angle between the (X-Z)f plane and the camera's optical axis. Standing behind the camera, the Oyk angle is increased by turning the camera's front down.

Ozk is the rotation of the camera's CCD plane about its optical axis. Standing behind the camera, the Ozk angle is increased by turning the camera anti clockwise.

Figure 2 shows the angle conventions used for Oxk and Oyk.

Figure 2 - Angle conventions

Another parameter needs to be determined, but is only needed during calibration:

Dk is the distance between camera k's focal point and where the optical axis intercepts the origin of the physical coordinate system. This parameter determines the perspective of the image. This parameter is only used with the rotation model, and is invalid in other cases, as the optical axis not always intercepts origin of the physical coordinate system.

The following angle parameters form the link between the physical coordinate system and the image coordinate system.

axki is the angle in the horizontal plane between the line between marker i to camera k's focal point and the camera's optical axis. This angle is the same in the physical and the image coordinate systems.

ayki is the angle in the vertical plane between the line between marker i to camera k's focal point and the (X-Z)b plane in the image coordinate system. This angle is the same in the physical and the image coordinate systems.

There are also parameters inherent to the camera and do not need to be calculated. These parameters determines the relationship between the pixel and image coordinates.

The first set of parameters, (xpo;ypo), is the (0;0) point of the image coordinates in the pixel coordinate system in pixels. This point corresponds with the center of the CCD plane and is (376;291) for the CCD element that was used.

The second set of parameters, (xps;yps), is the size of one pixel element on the CCD plane in meters. This was (6.5µm;6.25µm) for the CCD element used.

3 Relationship between the coordinate systems

Figure 3 shows the setup from the top. Top right is the left camera, while the right camera is bottom right. The origin of the physical coordinate system is on the middle left. Bottom left is marker i with coordinate (Xi;Yi;Zi). The relationship between Oxk and axki can be easily seen below.

Figure 3 - Top view of the setup

The equations in the physical coordinate system will now be derived. After this the same will be done for the pixel- and image coordinate systems.

3.1 The physical coordinate system

Figure 4 represents a line between marker i and camera k's focal point in the physical coordinate system.

Figure 4 - The physical coordinate system

By looking at Figure 5.4, the following equations can be derived:

Yk = Yi + rki sin(Oyk + ayk)

bki = rki cos(Oyk + ayk)

Xk = Xi + bki sin(Oxk + axk)

Zk = Zi + bki cos(Oxk + axk)

Note that these equations are not entirely correct, but as long as Oyk is small, the error will be minimal. From the above equations, the following equation for the line between the focal point of camera k to marker i with co-ordinates (Xi;Yi;Zi) can be derived.

Xi = Xk - rki * cos(Oyk + ayki) * sin(Oxk + axki)

Yi = Yk - rki * sin(Oyk + ayki)

Zi = Zk - rki * cos(Oyk + ayki) * cos(Oxk + axki)        (1)

The last equation can be substituted into the above two equations to eliminate rki. The following two equations results from this:

Xi = Xk + (Zi-Zk) * tan(Oxk + axki)

Yi = Yk + (Zi-Zk) * tan(Oyk + ayki)/cos(Oxk + axki)        (2)

Above equation is known as the formal model in the rest of this text. The determination of the angles axki and ayki from the pixel coordinates will be explained in the following section.

3.2 The pixel- and image co-ordinate systems

The output of the marker recognition algorithm is the pixel co-ordinates of the markers. Figure 5 shows the relationship between the pixel and image coordinates on the CCD plane.

Figure 5 - Relationship between pixel and image coordinate systems.

There are two steps needed to convert from pixel coordinates to image coordinates. First the pixel coordinates should be scaled and translated. Secondly the obtaining coordinates should be rotated by Ozk to get the image coordinates of a marker. The pixel coordinates (xpki;ypki) are translated as follows to coordinates (xki;yki):

xki = xpki * xps - Xpo

yki = Ypo - ypki * yps (3)

where:

(xps;yps) the size of a pixel in image coordinates.

(xpo;ypo) the (0;0) punt of the image coordinates in the

pixel coordinate system (center of CCD plane).

After the coordinates, (xki;yki), of the marker has been calculated, it is necessary to take the rotation of the CCD element about the camera's optical axis into account to get the image coordinates (xbki;ybki) of a marker.

xbki = xki cos(Ozk) - yki sin(Ozk)

ybki = xki sin(Ozk) + yki cos(Ozk) (4)

Figure 6 represents the image co-ordinate system. The Z-axis of the image coordinate system is the optical axis of the camera. The point (0;0;0) is the point (Xk;Yk;Zk) in the physical coordinate system. To keep the same angle convention as with the physical coordinate system, the angles axki and ayki should be negated.

Figure 6 - Image co-ordinate system

The following equations can be derived from Figure 5.6:

xbki = rbki cos(-ayki) sin(-axki)

ybki = rbki sin(-ayki)

fk = rbki cos(-ayki)cos(-axki)

The last equation can be substituted into the first two to get rid of rbki. The new equations are as follows:

xbki = fk * tan(-axki)

ybki = fk * tan(-ayki)/cos(axki)

These equations can be rearranged to get the angles axki and ayki from the image co-ordinates of a point.

(5)

The calibration process will now be explained in more detail.

4 The calibration process

Before any calibration can be done, it is necessary for the user to enter the physical coordinates of the calibration frame into the computer. Since these coordinates only change when the calibration frame is changed, the software will automatically save these values. The computer model of the calibration frame will be known as the calibration mask in the rest of this text.

The calibration process works on the principle of transforming the calibration mask using a model to a 2D camera image toe. This calculated 2D image is then compared to the actual image of the calibration mask. The parameters, which also serve as input to the model, is then iteratively modified until the two images are as close as possible to each other.

Figure 7 shows a generalized flowchart of calibration algorithm.

Figure 7 - Generalized flowchart of calibration algorithm

First values for the parameters will be selected, secondly the calibration mask will be transformed via the model to a 2D image. The total error is than calculated. This numerical solution process will be explained in more detail in paragraph 7.

The most logic choice of transformation model is the formal model, based on the equations 2. The formal model can however not be used until the exact value of fk has been determined. Another model, the rotation model, is therefore used to calculate fk. Hereafter the formal model is used to determine the rest of the parameters.

The classification of the markers will now be explained.

4.1 Classification of the markers

The first step of the calibration process is to recognize the 9 markers of the calibration framework using the marker recognition algorithm. Classification is then based on sorting the markers. The markers are first sorted from high to low. The four highest markers are then sorted from left to right, the same is also done for to four lowest. The order of the markers in the right camera image is then swapped for the numbers to correspond with the left image. Figure 8 shows how the markers in the camera image look after being correctly sorted and numbered.

Left image Right image

Figure 8 - Example of camera images

Table 1 shows the positions of each marker after classification.

Table 1 - Position of markers
 
Marker nr Position
0 Left Back Bottom
1 Left Front Bottom
2 Right Back Bottom
3 Right Front Bottom
4   Center  
5 Left Back Top
6 Left Front Top
7 Right Back Top
8 Right Front Top

The two transformation models are now explained.

4.2 The rotation model

The purpose of this model is to determine the accurate values of Dk and fk. This model rotates the calibration mask first about the Y-axis, and then the X-axis and lastly about the Z-axis. Compensation for the perspective is then applied. The image is then scaled to be the same size as the real image, thereafter the difference between the two images is calculated.

Only four of the parameters, namely. Oxk, Oyk, Ozk and Dk are selected. The values of these parameters are not known currently and initial values must be selected. The best choice for initial values for angles is 0°, while a best choice for Dk is 1.5m. The best choice for the initial value of fk is the camera lens's specified focal distance. The true focal distance will be close to this value.

With both models, it is necessary to rotate the calibration mask through the angle Ozk. Instead of rotating the calibration mask, it is easier to rotate the real image, especially since this rotation already forms part of the equations in 5 to calculate (axki;ayki) from the 2D camera image.

The steps needed for the transformation of the calibration mask to a 2D image using the rotation model will now be explained.

Step 1 - Calculate angle coordinates (axki;ayki) from camera image:

The angle coordinates (axki;ayki) of the markers are first calculated from the camera image by using the equations in 5. The correct value of fk is in this stage still unknown, the values of (axki;ayki) will therefore be inaccurate.

Step 2 - Force camera's optical axis through physical origin:

This model can only provide valid results if the optical axis of the camera runs through the origin of the physical coordinate system. Put differently, the center marker must lie on the (0;0) point in image coordinates. This is done by subtracting angle coordinates (axk4;ayk4) from the angle coordinates of all the markers. The markers' compensated image coordinates (mxki;myki) are now calculated. The last two steps are done using the following equations:

mxki = fk*tan(axk4-axki)

myki = fk*tan(ayk4-ayki)/cos(axki-axk4) (8)

After above has been used, (mxk4;myk4) will be (0;0), which means that the optical axis runs through marker 4, which is the origin of the physical coordinate system. As the same value of fk is use here and in 5, the values (mxki;myki) will be correct.

Step 3 - Rotate calibration mask first about Y-axis and then about X-axis:

Firstly the calibration mask is rotated about the Y-axis through the angle Oxk and then about the X-axis through Oyk. In both cases a positive angle is a counter-clockwise rotation. The following matrix equation is used for the rotations:

The 3D physical coordinate of the calibration mask's point is (Xi;Yi;Zi), while the rotated calibration mask point is (xi;yi;zi). This point is now on the image coordinate system.

Step 4 - Simulate the perspective on the rotated mask:

The real camera image contains perspective, the calculated image of the calibration mask also needs perspective if the two images need to be compared. The perspective is a scale factor, the further a point from the camera, the smaller is its u and v co-ordinates. The distance referred to here is not the 3D distance vector, but only that vector's component that is parallel with the optical axis. Figure 9 shows a side view. Both lines L1 and L2 is of the same length, but because L2 is closer to the camera than L1, L2's image, l2, longer than L1?s image, l1.

Figure 9 - Perspective of an image

The scale factor, Pki, is calculated as follows:

After Pki is calculated, the coordinates (xki;yki) are scaled to get (uki;vki):

uki = xki * Pki

vki = yki * Pki

The coordinates (uki;vki) are the 2D image coordinates of the transformed calibration mask.

Step 5 - Scale the calculated 2D image of the mask:

The effect of the focal distance of the camera has not been taken into account so far, the calculated image will therefore not be of the same size as the camera image. An average scale factor, Sk, must be determined to scale the 2D calibration mask image to the same size as the real 2D camera image of the calibration frame.

Because marker 4?s u and v co-ordinates are zero, they are not taken into account in the calculation of Sk. The calculated 2D image of the calibration mask is now scaled to be of the same size as the camera image using the following equations:

uski=uki*Sk

vski=vki*Sk

Step 6 - Calculate total error between two images:

The total error distances between the true 2D points and the calibration mask's 2D points is now calculated.

Step 7 - Determine camera's focal distance, fk:

After Sk is calculated, the focal distance of the camera can be calculated. Figure 10 shows a line with length L and its image on the CCD plane with length l.

Figure 10 - Calculation of focal distance

The following equation can be derived from Figure 5.10:

This equation can be re-written as follows:

fk = Dk * Sk

The process is now repeated until Terror has reached a minimum value. After fk and Dk has been determined accurately, the formal model is used to determine the values of qxk and qyk accurately.

4.3 The formal model

The values of Dk and fk are now accurately determined y the rotation model. Due to the fact that the rotation model forces the camera's optical axis through the origin of the physical coordinate system, the values of Oxk and Oyk would be wrong. The purpose of the formal model is to determine these angles accurately. It is not necessary to determine a new value of Ozk, the rotation model and the formal model use exactly the same implementation for this angle, Ozk is thus already determined accurately enough with the rotation model.

Only two of the additional parameters namely. Oxk and Oyk are selected. The steps needed for the transformation of the calibration mask to a 2D image using the formal model is now explained.

Step 1 - Calculate image coordinates (xbki;ybki) from camera image:

The image coordinates (xbki;ybki) of the markers are first calculated from the camera images using the equations in 4.

Step 2 - Calculate camera focal point (Xk;Yk;Zk):

The camera's focal point's physical coordinates can be determined by substituting marker 4's physical coordinates, Xi=Yi=Zi=0 and rki=Dk, into the equations in 1:

Xk = Dk * cos(Oyk+ayk4) * sin(Oxk+axk4)

Yk = Dk * sin(Oyk+ayk4)

Zk = Dk * cos(Oyk+ayk4) * sin(Oxk+axk4) (5)

Step 3 - Calculate 2D coordinate from 3D mask coordinate:

This step transforms a 3D calibration mask point, (Xi;Yi;Zi), to a 2D image point, (uki;vki). The equations in (2) and (5) may be re-written as follows to give the image coordinates of the physical coordinates:

from equation (2):

from equation (5):

uki=fk*tan(-axki)

vki=fk*tan(-ayki)/cos(axki)

Step 4 - Calculate total error between two images:

Unlike the rotation model, it is not necessary to scale the image, the total error, Terror, can therefore be calculated directly as follows:

The process is repeated until the total error has reached a minimum value. Al the parameters have now been determined accurately. Due to factors such as noise and camera resolution, the error will not be zero.

5 Determination of a marker's 3D co-ordinate from 2 2D camera images

After the system has been calibrated, the system may be sued to determine the 3D coordinates of markers. The first step is to recognize the markers and to calculate their pixel coordinates using the marker recognition algorithm. After this the markers need to be classified correctly.

The following two parametric equations can be derived for the two lines that runs from the marker to the two cameras in the physical coordinate system. The left camera's focal point is (XL;YL;ZL) while the right camera's focal point is (XR;YR;ZR).

The equation for the left camera:

XiL = XL + (Zi-ZL) * tan(OxL + axLi)

YiL = YL + (Zi-ZL) * tan(OyL + ayLi)/cos(OxL + axLi)

The equation for the right camera:

XiR = XR + (Zi-ZR) * tan(OxR + axRi)

YiR = YR + (Zi-ZR) * tan(OyR + ayRi)/cos(OxR + axRi)

The marker lies on the point where these two lines cross each other. The software substitutes values for Zi until the distance, A, between the two points (XiL;YiL;Zi) and (XiR;YiR;Zi) has reached a minimum. The distance A is calculated as follows:

The initial value of Zi may be ZL or ZR, depending which one is the smallest. The distance between the two points will normally never be zero due to inaccuracies with the calibration and marker recognition processes. When the value of Zi has been determined, it is best to calculate the average values of Xi and Yi from the two lines as follows:

and 

In the next section factors effecting accuracy are discussed.

6 Factors determining accuracy

The factors determining accuracy of the calibration process is now discussed.

6.1 Noise on the camera image

Noise on the camera image causes that stationary markers are not always detected in the same location. This causes noise in the 3D coordinates of a point. By taking the results from consecutive frames and calculating the averages, the noise is minimized. This can only be done with static markers, like with calibration, but will not work if moving markers is tracked. The only solution for noise with moving markers is to filter the 3D coordinates.

6.2 The sturdiness of the setup

The calibration of the setup will be effected if any one of the cameras is moved. The camera stands should therefore be solid and out of the way to ensure somebody does not accidentally walk into them. Care should also be taking with applications like gait analyses where a treadmill my cause vibrations that may upset the cameras.

6.3 Lighting

The markers need to be lighted equally over its surface which points to the cameras to ensure that the marker is round in the camera image. All markers should also be of the same intensity if possible. If this is not the case, the marker positions will be calculated inaccurately. The simplest of ensuring proper lighting is to place a spotlight on the top of each camera. Uniform ambient light is also very important, as too strong a spotlight might cause unwanted reflections.

6.4 The accuracy of the calibration frame

The calibration algorithm can only function properly and give accurate results if the calibration frame is sturdy and the physical coordinates of the marker on the frame are known accurately. In the best case scenario the calibration of the setup can only be as accurate as the calibration mask.

6.5 Lens distortion

The larger the lens opening, the larger is the lens distortion as it is very difficult and expensive to make a lens accurate over its whole surface. Furthermore are all colours not refracted the same amount. These are factors which lead to image smear and distortion. By using proper lighting, the lens opening can be kept small, thus minimizing lens distortion.

7 Implementation of the calibration process in software

Calibration is done by choosing an intelligent value for a parameter, the calibration mask is then transformed using a model to a 2D image and the total error, Terror, between the camera image and the calculated image is then calculated. This process is repeated until the chosen value for the parameter provides a minimum error value. The numerical solution method is explained in this section.

The advantage of the numerical method is that no closed form equation is needed for Terror. The transfer function between the parameter and Terror needs to comply to a few conditions. Firstly the transfer function must have a minimum value, otherwise there does not exist a minimum value for Terror. Secondly the transfer function may only have one minimum point and no other minimum points. If there is more than one local minimum point, the algorithm may select the wrong minimum point and give an incorrect result.

Figure 11 shows an example of a graph of Terror versus parameter P, which is an angle in this case.

Figure 11 - Graph of Terror versus P

This transfer function has more than one minimum point and does not comply to the conditions. It can be made compliable by limited the values of P to Area 2. In the practical system all parameters lie in Area 2, if this was not the case, the marker would not be able to be sorted correctly. The second condition is therefore satisfied.

7.1 The numerical solution method

The easiest way of determining a value is by starting with the smallest value and then to increment the parameter until a minimum value Terror is reached. The drawback of this method is that it takes too long to find the answer. Figure 12 shows the flow chart of the algorithm which is used in the practical system.

Figure 12 - Flow chart of numerical solutions method

The initial value of P is not critical. As long as it lies in the right area, the correct minimum point will be reached. The initial value of I should not be too small, otherwise the algorithm must go through too many iterations before the answer is reached. If the initial value of I is too large, the value of P may leave the valid area which will give the wrong answer. The value of Imin depends on the accuracy needed for P. Imin must be smaller than the smallest unit of P, a factor of 10 times smaller is a safe value.

This algorithm can also find a solution for one parameter at a time. The next paragraph explains how the solution to more than one parameter is found.

7.2 The handling of more than one parameter

The same numerical method is used. A solution to one of the parameters is found, while the rest of the parameters are kept constant. This process is repeated for each parameter.

There are several orders in which solutions are found for the parameters. The order used does not influence the result, but may influence the number of iterations needed to get the solution. Figure 13 shows the order used by the software for the rotation model.

Figure 13 - Solution order of parameters

As can be seen, solutions are found for the parameters in the order Oxk, Oyk, Ozk and Dk. With the formal model it is only needed to find solutions for Oxk and Oyk. When the calibration process went through all the parameters, it is known as one iteration.

The number of iterations that is needed to obtain accurate values for the parameters in the practical system is less than 10.

9 Alternative calibration frames

There are several ways of building a calibration frame. The first frame consisted of a wire frame, with marbles painted white serving as markers. Unfortunately this calibration frame was not sturdy enough, and it was impossible to determine the position of the markers accurately.

The second calibration frame was made out of wood which was spray painted matt black. The marbles are spray painted matt white. The two top front markers were left out is this frame as the remainder 7 markers is adequate for calibration. Figure 14 shows a front and side view of the new calibration frame.

Figure 14 - Alternative calibration frame

The calibration algorithm itself stays the same for the new frame, except that there are now only 7 markers. Table 5.2 shows how the markers are classified for the new calibration frame.

Table 2 - Position of markers
 
Marker nr Position
0 Left Back Bottom
1 Left Front Bottom
2 Right Back Bottom
3 Right Front Bottom
4   Center  
5 Left Back Top
6 Right Back Top

By comparing Table 5.2 and Table 5.1 it can be seen that markers 0 to 5 stays the same, markers 6 and 8 are left out and marker 7 becomes marker 6.

10 Practical results

The calibration frame with 7 markers was used to test the systems. The size of the calibration frame was approximately 0.3m x 0.3m x 0.3m. The cameras' focal points were about 2m from the center marker. The angle between the cameras was about 50°.

The system was first calibrated, thereafter the positions of the markers of the calibration frame were determined using the tracker. The 3D error distance between the determined and actual positions was calculated. The maximum error was 1.5mm, while the average error for all the markers was 0.88mm. The maximum error is thus under 0.5% of the calibration frame's size. The biggest contributor to this error was camera noise.