Pose Estimation using Markers
Image courtesy of TagSLAM
Throughout my research, I had to track the $SE(3)$ pose of various objects for several applications. In this post I'll summarize the ROS-based implementation of some of the marker-based methods. Most of the methods I'll cover require only a USB cam; the exception is the last one, which requires a full OptiTrack motion capture system. We use OptiTrack at the lab for the finer applications.
In the following, I'll use the notation $T^{1}_{0}$ to denote the homogeneous transformation matrix from coordinate frame 1 to frame 0. A detailed explanation of marker-based transformations can be found Here
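To make the notation concrete, here is a minimal numpy sketch (frame names and values are hypothetical) showing how homogeneous transforms compose and invert, e.g. $T^{2}_{0} = T^{1}_{0} T^{2}_{1}$:
import numpy as np

def make_T(R, t):
    # Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector translation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical frames: frame 1 sits 1 m along the x-axis of frame 0,
# frame 2 is rotated 90 degrees about the z-axis of frame 1
T_1_to_0 = make_T(np.eye(3), [1.0, 0.0, 0.0])                  # T^1_0
T_2_to_1 = make_T(np.array([[0., -1., 0.],
                            [1.,  0., 0.],
                            [0.,  0., 1.]]), [0.0, 0.0, 0.0])  # T^2_1
T_2_to_0 = T_1_to_0 @ T_2_to_1                                 # T^2_0 = T^1_0 T^2_1
T_0_to_2 = np.linalg.inv(T_2_to_0)                             # the inverse maps the other way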
AprilTag Markers
AprilTags come in several different families, depending on how many bits a tag can represent. Some tag families have large, coarse bit blocks (16h5), while others are finer and smaller (36h11, the most common). For a lower-resolution camera, consider the 16h5 family; for better resolution I suggest 36h11; and for circular objects you should use the 21h7 family.
A fully detailed explanation of how to generate tags is described here:
Measuring the tag Size
When entering the size of your tag for pose estimation (I'll explain where later), it is important to know what "size" actually measures. The measurement called "size" is shown as "edge length" in the image below.
The tag size should not be measured from the outside of the tag. The tag size is defined as the distance between the detection corners, or alternatively, the length of the edge between the white border and the black border. The following illustration marks the detection corners with red X's and the tag size with a red arrow for a tag from the 48h12Custom tag family.
Coordinate system
The coordinate system has the origin at the camera center. The z-axis points from the camera center out the camera lens. The x-axis is to the right in the image taken by the camera, and y is up.
apriltag_ros
Installation
Starting with a working ROS installation (Kinetic and Melodic are supported):
export ROS_DISTRO=melodic # Set this to your distro, e.g. kinetic or melodic
source /opt/ros/$ROS_DISTRO/setup.bash # Source your ROS distro
mkdir -p ~/catkin_ws/src # Make a new workspace
cd ~/catkin_ws/src # Navigate to the source space
git clone https://github.com/AprilRobotics/apriltag.git # Clone Apriltag library
git clone https://github.com/AprilRobotics/apriltag_ros.git # Clone Apriltag ROS wrapper
cd ~/catkin_ws # Navigate to the workspace
rosdep install --from-paths src --ignore-src -r -y # Install any missing packages
catkin build # Build all packages in the workspace (catkin_make_isolated will work also)
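source ~/catkin_ws/devel/setup.bash # Overlay the workspace so ROS can find the new packages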
The package works as shown in the figure below. The following default input topics are subscribed to (and can be remapped based on your needs):
- /camera/image_rect: a sensor_msgs/Image topic which contains the image (e.g. a frame of a video stream coming from a camera). The image is assumed to be undistorted, i.e. produced by a pinhole camera. I recommend using the image_proc node, which is meant to sit between the camera driver and the vision-processing nodes. image_proc removes camera distortion from the raw image stream and, if necessary, converts Bayer or YUV422 format image data to color.
- /camera/camera_info: a sensor_msgs/CameraInfo topic which contains the camera calibration matrix in /camera/camera_info/K. One can obtain a specific camera's K via camera intrinsics calibration using any camera calibration method (Option 1, Option 2, Option 3 and so on).
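As a quick sanity check of your calibration, a minimal rospy sketch (topic name as above) can print the 3x3 intrinsics matrix K from the CameraInfo stream:
import numpy as np
import rospy
from sensor_msgs.msg import CameraInfo

def callback(msg):
    K = np.array(msg.K).reshape(3, 3)  # K is stored row-major as 9 floats
    rospy.loginfo("fx=%.1f fy=%.1f cx=%.1f cy=%.1f", K[0, 0], K[1, 1], K[0, 2], K[1, 2])

rospy.init_node('print_camera_info')
rospy.Subscriber('/camera/camera_info', CameraInfo, callback)
rospy.spin()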
The AprilTag ROS package takes in a rectified camera feed and returns a list of detected tags and their 3D poses. In order for this to work, the software needs to know which tags it is looking for and how large they are. The behavior of the ROS wrapper is therefore fully defined by two configuration files: config/tags.yaml (which defines the tags and tag bundles to look for) and config/settings.yaml (which configures the core AprilTag 2 algorithm itself). The following topics are then output:
- /tf: the relative pose between the camera frame and each detected tag's or tag bundle's frame (specified in tags.yaml), published using tf.
- /tag_detections: the same information as provided by the /tf topic, but as a custom message carrying the tag ID(s), size(s) and geometry_msgs/PoseWithCovarianceStamped pose information.
- /tag_detections_image: the same image as input on /camera/image_rect, but with the detected tags highlighted.
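For example, a minimal listener for /tag_detections could look like the sketch below (note that the package's AprilTagDetection message stores id and size as arrays, since a single detection may refer to a multi-tag bundle):
import rospy
from apriltag_ros.msg import AprilTagDetectionArray

def callback(msg):
    for det in msg.detections:
        # det.pose is a geometry_msgs/PoseWithCovarianceStamped
        p = det.pose.pose.pose.position
        rospy.loginfo("tag(s) %s at (%.3f, %.3f, %.3f)", list(det.id), p.x, p.y, p.z)

rospy.init_node('tag_listener')
rospy.Subscriber('/tag_detections', AprilTagDetectionArray, callback)
rospy.spin()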
settings.yaml parameters
This file configures the detection algorithm's parameters (most are self-explanatory):
tag_family: 'tag36h11' # Tag family
tag_border: 1 # Size (in bits) of the black border. Always 1 if made by optitag
tag_threads: 2 # Number of detection threads. Tune per your CPU
tag_decimate: 1.0 # Reduce the resolution of the image by this number. Increases speed at the sacrifice of detecting smaller tags
tag_blur: 0.0 # tag_blur>0 blurs the image and tag_blur<0 sharpens the image
tag_refine_edges: 1 # improves edge detection and therefore improves pose estimation. Light computation
tag_refine_decode: 0 # reduces false negative detection rate. Medium computation
tag_refine_pose: 0 # improves pose estimation accuracy. Heavy computation
tag_debug: 0 # save debug images to ~/.ros Careful if running with video
publish_tf: true # publish tag/bundle poses to /tf topic
tags.yaml parameters
This file tells the algorithm what tags to look for in the environment and how large they are so they can be placed in 3D space. The software assumes distance measurements are in meters
and orientation is in quaternions
(One can use tf.transformation
package for other transformations)
standalone_tags:
[
{id: 10, size: 0.15},
{id: 20, size: 0.1},
{id: 30, size: 0.07}
]
tag_bundles:
[
{
name: 'my_bundle',
layout:
[
{id: 0, size: 0.05, x: 0.0000, y: 0.0000, z: 0.0, qw: 1.0, qx: 0.0, qy: 0.0, qz: 0.0},
{id: 4, size: 0.05, x: 0.0548, y: -0.0522, z: 0.0, qw: 1.0, qx: 0.0, qy: 0.0, qz: 0.0},
{id: 3, size: 0.05, x: -0.0580, y: -0.0553, z: 0.0, qw: 1.0, qx: 0.0, qy: 0.0, qz: 0.0},
{id: 2, size: 0.05, x: 0.0543, y: 0.0603, z: 0.0, qw: 1.0, qx: 0.0, qy: 0.0, qz: 0.0},
{id: 1, size: 0.05, x: -0.0582, y: 0.0573, z: 0.0, qw: 1.0, qx: 0.0, qy: 0.0, qz: 0.0}
]
}
]
As we can see in the above configuration, we can define two tracking methods: standalone_tags and tag_bundles. For standalone tags, i.e. where each marker represents a unique object in the environment, we provide an ID and a size for each tag we want to detect. tag_bundles is a newer feature which lets you track an object represented by multiple tags, to overcome occlusions and the like. Upon detection of one or more tags in the bundle, the software reports the 6-DOF pose of the bundle's origin. When creating a bundle, you specify a list of tags; each tag has a tag ID, a size, and a 6-DOF pose relative to the bundle's origin.
Important notes mentioned by the authors:
- No tag ID should appear twice with different sizes (this creates ambiguity in the detection)
- No tag ID should appear twice in the image (this creates ambiguity in the detection)
- It is fine for a tag with the same ID to be listed both in standalone_tags and in tag_bundles, as long as it has the same size.
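When filling in the qw, qx, qy, qz fields of a bundle layout, the tf.transformations package mentioned above is handy; note that quaternion_from_euler returns the quaternion in (x, y, z, w) order. A short sketch with a hypothetical tag orientation:
import math
from tf.transformations import quaternion_from_euler

# Hypothetical tag yawed 90 degrees about z relative to the bundle origin
qx, qy, qz, qw = quaternion_from_euler(0.0, 0.0, math.pi / 2)  # (roll, pitch, yaw)
print('qw: %.4f, qx: %.4f, qy: %.4f, qz: %.4f' % (qw, qx, qy, qz))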
A complete launch file which:
- Opens the camera stream from the video_device with width and height parameters.
- Launches the image_proc node to remove distortions.
- Loads the settings.yaml and tags.yaml configuration files onto the parameter server.
- Runs the apriltag_ros core node and publishes the detected transformations.
<launch>
<node name="camera" pkg="usb_cam" type="usb_cam_node" output="screen" >
<param name="video_device" value="/dev/video4" />
<param name="image_width" value="1280" />
<param name="image_height" value="720" />
<param name="pixel_format" value="yuyv" />
<param name="camera_frame_id" value="rgb_cam_link" />
</node>
<node pkg="image_proc" type="image_proc" name="iamge_proc_node" ns="camera" />
<arg name="launch_prefix" default="" /> <!-- set to value="gdbserver localhost:10000" for remote debugging -->
<arg name="node_namespace" default="apriltag_ros_continuous_node" />
<arg name="camera_name" default="/camera" />
<arg name="camera_frame" default="camera" />
<arg name="image_topic" default="image_rect" />
<!-- Set parameters -->
<rosparam command="load" file="$(find apriltag_ros)/config/settings.yaml" ns="$(arg node_namespace)" />
<rosparam command="load" file="$(find apriltag_ros)/config/tags.yaml" ns="$(arg node_namespace)" />
<node pkg="apriltag_ros" type="apriltag_ros_continuous_node" name="$(arg node_namespace)" clear_params="true" output="screen" launch-prefix="$(arg launch_prefix)" >
<!-- Remap topics from those used in code to those on the ROS network -->
<remap from="image_rect" to="$(arg camera_name)/$(arg image_topic)" />
<remap from="camera_info" to="$(arg camera_name)/camera_info" />
<param name="camera_frame" type="str" value="$(arg camera_frame)" />
<param name="publish_tag_detections_image" type="bool" value="true" /> <!-- default: false -->
</node>
</launch>
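Since publish_tf is set to true, the camera-to-tag transform can also be queried directly from the tf tree. A minimal sketch, assuming a standalone tag with ID 10 published under the frame name tag_10 (check your actual /tf tree for the exact frame names):
import rospy
import tf2_ros

rospy.init_node('tag_tf_listener')
buf = tf2_ros.Buffer()
listener = tf2_ros.TransformListener(buf)
rate = rospy.Rate(10)
while not rospy.is_shutdown():
    try:
        # Pose of the tag frame expressed in the camera frame, i.e. T^{tag}_{camera}
        tf_msg = buf.lookup_transform('camera', 'tag_10', rospy.Time(0))
        t = tf_msg.transform.translation
        rospy.loginfo("tag_10 at (%.3f, %.3f, %.3f)", t.x, t.y, t.z)
    except (tf2_ros.LookupException, tf2_ros.ConnectivityException,
            tf2_ros.ExtrapolationException):
        pass
    rate.sleep()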
ArTag Markers
In this part I'll explain the usage of the ar_track_alvar package for AR tag tracking. This package is a ROS wrapper for Alvar, an open source AR tag tracking library.
The ar_track_alvar package has several main functionalities:
- Generating AR tags of varying size, resolution, and data/ID encoding.
- Identifying and tracking the pose of individual AR tags, optionally integrating Kinect depth data (when a Kinect is available) for better pose estimates.
- Identifying and tracking the pose of bundles consisting of multiple tags. This allows for more stable pose estimates, robustness to occlusions, and tracking of multi-sided objects.
Alvar features adaptive thresholding to handle a variety of lighting conditions, optical flow based tracking for more stable pose estimation, and an improved tag identification method that does not significantly slow down as the number of tags increases.
Installation
For melodic-devel:
sudo apt-get install ros-melodic-ar-track-alvar ros-melodic-ar-track-alvar-msgs ros-melodic-image-proc
Or just clone and build the repo from GitHub.
In order to use AR markers properly with your camera, be sure to add the camera model to the launch command (after completing the camera_calibration process described in the AprilTag section).
Generating AR tags
Two PDF files are in the markers directory, containing tags 0-8 and 9-17, respectively. Alternatively, we can get the tags from ar_track_alvar and resize them to our use. If you want to generate your own markers with different ID numbers, border widths, or sizes, run:
rosrun ar_track_alvar createMarker 0 -s 10.0
This will create a MarkerData_0.png file that stores a 10cm x 10cm marker with ID 0. Print this file on a sheet of paper.
Due to differences in printer setups, the actual size of the printed marker may be different. Make sure the marker_size
parameter represents the actual size (in centimeters) of the AR tag.
Tracking
To identify and track the poses of (possibly) multiple AR tags, each considered individually, the package uses the individualMarkers node, which takes the following parameters:
- marker_size (double) – the width, in centimeters, of one side of the black square marker border.
- max_new_marker_error (double) – a threshold determining when new markers can be detected under uncertainty.
- max_track_error (double) – a threshold determining how much tracking error can be observed before a tag is considered to have disappeared.
- camera_image (string) – the name of the topic that provides camera frames for detecting the AR tags. This can be mono or color, but should be an UNrectified image, since rectification takes place in this package.
- camera_info (string) – the name of the topic that provides the camera calibration parameters so that the image can be rectified.
- output_frame (string) – the name of the frame that the published Cartesian locations of the AR tags will be relative to.
IMPORTANT: The node assumes that a Kinect camera is being used, so that depth data can be integrated for better pose estimates. If you are not using a Kinect or do not wish to use the depth-data improvements, use individualMarkersNoKinect instead.
To use the package, we need to create a new launch file. Inside the launch/ directory, add alvar_track.launch with the following content:
<launch>
<arg name="marker_frame_id" default="world"/>
<arg name="user_marker_size" default="7.0"/>
<arg name="camera_model" default="astra_pro" doc="model type [astra_pro, realsense_d435, raspicam]"/>
<arg name="camera_namespace" default="camera"/>
<arg name="rgb_camera_info_url" default="package://open_manipulator_p_camera/camera_info/$(arg camera_model).yaml" />
<arg name="depth_camera_info_url" default="" />
<include file="$(find ar_track_alvar)/launch/pr2_indiv_no_kinect.launch">
<arg name="marker_size" value="$(arg user_marker_size)" />
<arg name="max_new_marker_error" value="0.08" />
<arg name="max_track_error" value="0.2" />
<arg name="cam_image_topic" value="$(arg camera_namespace)/image_raw" />
<arg name="cam_info_topic" value="$(arg camera_namespace)/camera_info" />
<arg name="output_frame" value="$(arg marker_frame_id)" />
</include>
</launch>
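The detections are published on the ar_pose_marker topic as an ar_track_alvar_msgs/AlvarMarkers message. A minimal listener sketch:
import rospy
from ar_track_alvar_msgs.msg import AlvarMarkers

def callback(msg):
    for m in msg.markers:
        # m.pose is a geometry_msgs/PoseStamped, expressed in output_frame
        p = m.pose.pose.position
        rospy.loginfo("marker %d at (%.3f, %.3f, %.3f)", m.id, p.x, p.y, p.z)

rospy.init_node('alvar_listener')
rospy.Subscriber('ar_pose_marker', AlvarMarkers, callback)
rospy.spin()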
Bundle
Sometimes it is advantageous to treat “bundles” of multiple tags as a single unit. For example, this can allow for the estimation of the pose of a many-sided object, even when some of the tags cannot be seen. A tag bundle is defined by an XML file that lists a set of tag IDs and their positions relative to a master
tag. The master tag always comes first in the XML file and defines a coordinate system for the rest of the tags.
IMPORTANT: this coordinate system is different from the standard system used in ROS! In this system, when facing the tag, positive-z comes out of the front of the tag toward the viewer, positive-x is to the right, and positive-y is up.
To create a bundle, first choose which tag you want to be the master tag. Treat the center of the master tag as (0,0,0). Then, after placing the rest of the tags, measure the x, y, and z coordinates of each of the 4 corners of every tag, relative to the master tag origin. Enter these measurements for each tag into the XML file, starting with the lower-left corner and progressing counter-clockwise around the tag. After creating the XML file, all you need to do is add the following parameter to your launch file:
- bundle_files (multiple strings) – a list of XML file names, one for each bundle you wish to detect.
An example XML file for 2 markers:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<multimarker markers="2">
<marker index="8" status="1">
<corner x="-2.2" y="-2.2" z="0.0" />
<corner x="2.2" y="-2.2" z="0.0" />
<corner x="2.2" y="2.2" z="0.0" />
<corner x="-2.2" y="2.2" z="0.0" />
</marker>
<marker index="9" status="1">
<corner x="-2.2" y="11.8" z="0.0" />
<corner x="2.2" y="11.8" z="0.0" />
<corner x="2.2" y="16.2" z="0.0" />
<corner x="-2.2" y="16.2" z="0.0" />
</marker>
</multimarker>
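Measuring all four corners by hand is error-prone. If a tag lies flat in the master tag's x-y plane, its corners follow directly from its center and edge length; here is a small sketch (a hypothetical helper, not part of the package) that reproduces the marker 9 block above from its center (0, 14, 0) and 4.4 cm edge:
def corner_block(index, cx, cy, cz, size):
    # Print corner entries lower-left first, counter-clockwise, in centimeters
    h = size / 2.0
    corners = [(-h, -h), (h, -h), (h, h), (-h, h)]
    print('<marker index="%d" status="1">' % index)
    for dx, dy in corners:
        print('    <corner x="%.1f" y="%.1f" z="%.1f" />' % (cx + dx, cy + dy, cz))
    print('</marker>')

corner_block(9, 0.0, 14.0, 0.0, 4.4)  # matches the second marker above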
ArUco Marker
Todo.
OptiTrack
Todo.