
Cylindrical Pixel Transfer

The goal of cylindrical pixel transfer is to estimate, automatically and reliably, the pixel correspondences between the reference-views and the visual prediction. The available knowledge consists of the camera focal length in pixels; the landmark position, orientation and reference-views; and an estimate of the current robot pose. In addition, it is known that learned landmarks lie on planar or "almost planar" surfaces.

The basic concept of interpolating cylindrical panoramic images was shown in figure 1. This is equivalent to computing 3D points from image correspondences and projecting them into a new target image. McMillan and Bishop [19] devised an efficient method for transferring known image disparity values between cylindrical panoramic images to a new virtual view. Their approach uses the angular disparity (associated with each cylindrical pair) to automatically generate warps that map reference views to arbitrary cylindrical or planar views.

The angular disparity can be estimated in different ways depending on the available knowledge. For example, by manually or automatically specifying a sparse set of corresponding points visible in both reference-views, by knowing or recovering the camera internal parameters, and by exploiting epipolar geometry, a dense set of corresponding points can be recovered. Examples of procedures which exploit epipolar relations to recover dense correspondences from a sparse set of corresponding points can be found in the literature (McMillan and Bishop [19], Faugeras [9], Blanc, Livatino and Mohr [3], [2]).

In the case of [14], the angular disparity is inferred from previously estimated correspondences along cylindrical epipolar lines. These correspondences allow the geometry of the landmark surface to be estimated from the 3D positions of the landmark center and of a minimum of two landmark corners, together with the result of the planarity test. Landmarks are required to lie on one planar or "almost planar" surface; nevertheless, the same procedure could be applied to landmarks lying on more than one surface, provided the surfaces are known.

The proposed rendering system takes as input the cylindrical reference-views of the landmarks, along with the map of angular disparities. This information is used to automatically generate image warps that map landmark reference-views to arbitrary cylindrical landmark-views. Note that the generated warps are capable of describing perspective effects and occlusions (using a simple visibility algorithm that guarantees back-to-front ordering [19]).

Figure: The top figure illustrates pixel correspondences in two different cylindrical panoramas and the related angular disparity. The bottom figure illustrates the cylindrical-to-cylindrical mapping based on angular disparity through an example (which considers a workspace floor-map). The bottom figure also includes images of the reference panoramas, the visual predictions and the current observation.


The cylindrical-to-cylindrical mapping is illustrated in figure 2. Each angular disparity value, $ \Delta_{\gamma,v}$, can be obtained as in equation 1. Note that $ (\gamma,v)$ are the pixel coordinates in the panorama, where $ \gamma$ is an angle while $ v$ is the pixel row.


$\displaystyle \Delta_{(W^{left}_{\gamma},W^{left}_{v})} = W^{right}_{\gamma} - W^{left}_{\gamma}$     (1)

where $ (W^{left}_{\gamma},W^{left}_{v})$ denotes a generic pixel in the left cylindrical reference-view (labeled "left"), identified by the angle $ \gamma$ and the ordinate $ v$, and $ (W^{right}_{\gamma},W^{right}_{v})$ denotes the corresponding pixel in the right reference-view, i.e. the ordinate $ v$ reached at the corresponding angle $ \gamma$.

Knowing the angular disparity for each landmark pixel, it can be converted, for each position ($ \gamma$, $ v$) on the left cylinder, into an image flow vector field ( $ \gamma + \Delta_{\gamma,v}$, $ v(\gamma + \Delta_{\gamma,v})$). The top row of figure 2 illustrates this conversion.
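As a small illustration (not part of the original system), the sketch below converts a per-pixel angular-disparity map into the horizontal component of such a flow field; the array layout, the angle-per-column relation and the function name are assumptions made here.

import numpy as np

def disparity_to_flow(disparity_rad, width):
    """disparity_rad[v, col] holds Delta_(gamma,v) in radians for each left-panorama pixel."""
    gamma = 2.0 * np.pi * np.arange(width) / width          # angle gamma associated with each column
    gamma_target = gamma[None, :] + disparity_rad           # gamma + Delta_(gamma,v)
    col_target = (gamma_target * width / (2.0 * np.pi)) % width
    return col_target                                       # target rows follow the epipolar curve (equation 4)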

The disparity values can then be transferred from the known cylindrical pair, centered at $ (C^{left}_{x}, C^{left}_y, C^{left}_z)$ and $ (C^{right}_{x}, C^{right}_y, C^{right}_z)$ (which respectively represent the left and the right camera positions), to a new cylindrical projection at an arbitrary position, $ (C^{virt}_{x}, C^{virt}_y, C^{virt}_z)$, using the following equations, where $ \tau$ is the rotation offset which aligns the angular orientation of the cylinders to a common frame,


\begin{displaymath}\begin{array}{l}
a = ( C^{right}_x - C^{virt}_x ) \cos{(\tau - W^{left}_{\gamma})} + ( C^{right}_y - C^{virt}_y ) \sin{(\tau - W^{left}_{\gamma})} \\
b = ( C^{right}_y - C^{left}_y ) \cos{(\tau - W^{left}_{\gamma})} - ( C^{right}_x - C^{left}_x ) \sin{(\tau - W^{left}_{\gamma})} \\
c = ( C^{virt}_y - C^{left}_y ) \cos{(\tau - W^{left}_{\gamma})} - ( C^{virt}_x - C^{left}_x ) \sin{(\tau - W^{left}_{\gamma})}
\end{array}\end{displaymath}     (2)


$\displaystyle \cot{\Omega_{(W^{virt}_{\gamma},W^{virt}_{v})}} = \frac{a + b \hspace{0.1cm} \cot{\Delta_{(W^{left}_{\gamma},W^{left}_{v})}}}{c}$     (3)

The resulting $ \Omega_{(W^{virt}_{\gamma},W^{virt}_{v})}$ is the angular disparity between the generic pixel in the left cylindrical reference-view, $ W^{left}_{\gamma,v}$, and the corresponding pixel in the virtual cylindrical view. In this way, each resulting angular disparity value, $ \Omega_{\gamma,v}$, can be converted, for each position ($ \gamma$, $ v$) on the left cylinder, into an image flow vector field ( $ \gamma + \Omega_{\gamma,v}$, $ v(\gamma + \Omega_{\gamma,v})$) using the epipolar relation given by equation 4.
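A minimal numerical sketch of the disparity transfer of equations 2 and 3 is given below; it follows the reconstruction of equation 2 written above, the function and argument names are ours, and only the planar (x, y) components of the camera centers enter the computation.

import numpy as np

def transfer_cot_disparity(delta, gamma_left, c_left, c_right, c_virt, tau):
    """Return cot(Omega) for a left pixel at angle gamma_left with left-right disparity delta."""
    ca, sa = np.cos(tau - gamma_left), np.sin(tau - gamma_left)
    a = (c_right[0] - c_virt[0]) * ca + (c_right[1] - c_virt[1]) * sa   # equation 2
    b = (c_right[1] - c_left[1]) * ca - (c_right[0] - c_left[0]) * sa
    c = (c_virt[1] - c_left[1]) * ca - (c_virt[0] - c_left[0]) * sa
    cot_omega = (a + b / np.tan(delta)) / c                             # equation 3
    return cot_omega   # Omega is the arccot branch consistent with the viewing geometry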


$\displaystyle W^{virt}_{v}(W^{virt}_{\gamma}) = \frac{M_x \cos{(\tau - W^{virt}_{\gamma})} + M_y \sin{(\tau - W^{virt}_{\gamma})}}{M_z} + C_v$     (4)

where


\begin{displaymath}\left[
\begin{array}{c}
M_x \\
M_y \\
M_z
\end{array}\right]
=
\left[
\begin{array}{ccc}
0 & -( C^{left}_z - C^{virt}_z ) & C^{left}_y - C^{virt}_y \\
C^{left}_z - C^{virt}_z & 0 & -( C^{left}_x - C^{virt}_x ) \\
-( C^{left}_y - C^{virt}_y ) & C^{left}_x - C^{virt}_x & 0
\end{array}\right]
\left[
\begin{array}{c}
\cos{(\tau - W^{left}_{\gamma})} \\
\sin{(\tau - W^{left}_{\gamma})} \\
C_v - W^{left}_{v}
\end{array}\right]\end{displaymath}     (5)

and


$\displaystyle W^{virt}_{\gamma} = W^{left}_{\gamma} + \Omega_{\gamma,v}$     (6)

where $ \tau$ is, as before, the rotation offset which aligns the angular orientation of the cylinders to a common frame, and $ C_v$ is the ordinate $ v$ of the scan-line onto which the center of projection projects (i.e. the ordinate of the line of zero elevation).

Equation 4 gives a concise expression for the curve $ W^{virt}_{v}(W^{virt}_{\gamma})$ (i.e. the cylindrical epipolar line) formed by the projection of a ray onto the surface of one cylinder (labeled "virt"), where the ray is specified by its position on some other cylinder (labeled "left").
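The following sketch evaluates equations 4-6 using the cross-product reading of equation 5 given above; the variable names are ours, and it assumes the cylinder centers lie at the same height, as for a robot moving on a floor.

import numpy as np

def virtual_pixel(gamma_left, v_left, omega, c_left, c_virt, tau, C_v):
    """Pixel on the virtual cylinder corresponding to the left pixel (gamma_left, v_left)."""
    gamma_virt = gamma_left + omega                        # equation 6
    d_left = np.array([np.cos(tau - gamma_left),           # ray through the left pixel
                       np.sin(tau - gamma_left),
                       C_v - v_left])
    M = np.cross(np.asarray(c_left) - np.asarray(c_virt), d_left)    # equation 5
    v_virt = (M[0] * np.cos(tau - gamma_virt) +
              M[1] * np.sin(tau - gamma_virt)) / M[2] + C_v          # equation 4
    return gamma_virt, v_virt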

Once the angular disparity, $ \Delta_{\gamma,v}$, has been used to transfer the disparity values from the reference cylinder to the new viewing position, each estimated pixel in the virtual cylinder, $ W^{virt}_{i,j}$, is projected onto the virtual camera image-plane, becoming $ W^{virt-plan}_{s,t}$, in order to generate the landmark visual prediction. The visual prediction is converted to a planar image so that it can be compared with the current landmark observation. Figure 3 illustrates the mapping from the cylindrical to the planar image.

Figure: The figure illustrates the mapping from cylindrical to planar image.
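One possible form of this cylindrical-to-planar projection is sketched below for a pinhole virtual camera whose optical axis points at cylinder angle gamma_0; the focal lengths, the principal point (c_s, c_t) and the sign conventions are assumptions made here for illustration.

import numpy as np

def cyl_to_plane(gamma, v, gamma_0, f_cyl, f_plan, c_s, c_t, C_v):
    """Map a virtual-cylinder pixel (gamma, v) to planar image coordinates (s, t)."""
    phi = gamma - gamma_0                                   # horizontal angle off the optical axis
    s = c_s + f_plan * np.tan(phi)                          # column on the planar image
    t = c_t + f_plan * (v - C_v) / (f_cyl * np.cos(phi))    # row: rescaled cylinder elevation
    return s, t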

The proposed landmark pixel transfer procedure starts with a forward mapping (from reference to virtual) of the landmark corners (whose positions have previously been established by stereo-matching). This results in a region of the virtual cylinder delimited by the four projected corners. Each pixel inside the delimited region is then mapped forward or inversely, depending on the required performance. Since high texture fidelity has priority in this case, a forward mapping is adopted in the compression case and an inverse mapping in the enlargement case.
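The corner-first strategy might be organised as in the sketch below; the helpers project_corner, forward_splat and inverse_sample are hypothetical placeholders, and the area comparison is only one way of deciding between the compression and enlargement cases.

import numpy as np

def polygon_area(pts):
    """Shoelace area of a quadrilateral given as an (N, 2) array of pixel coordinates."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def warp_landmark(ref_patch, corners_ref, project_corner, forward_splat, inverse_sample):
    """Warp a landmark patch: the four corners first, then the pixels they enclose."""
    corners_virt = np.array([project_corner(c) for c in corners_ref])   # forward-map the corners
    if polygon_area(corners_virt) <= polygon_area(np.asarray(corners_ref)):
        return forward_splat(ref_patch, corners_virt)        # compression case: forward mapping
    return inverse_sample(ref_patch, corners_virt)            # enlargement case: inverse mapping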

In summary, the proposed technique of cylindrical pixel transfer allows for establishing pixel correspondences between the landmark reference-views and the virtual prediction. In particular, two reference-views were considered. Experiments (Livatino [14]) showed that the proposed technique allows for reliable matches between prediction and observation even in the presence of significant positional errors (indicated by clear displacements, in the image-plane, between prediction and observation).

