Mapping, Perception and More

Well, I took a really nice course last semester, all about autonomous driving. I really liked it because it was a hands-on course, with some of the nicest projects I have worked on. The course introduced some of the core functions of an autonomous driving system, such as localization and mapping, spatial perception, and route planning, and later on focused on deep learning solutions.

Throughout the course, I thought it would be great to write a brief description of the projects I have done, as a reference for projects I might do later.

Project No.1

TODO: add the github folder here


  • Mapping and Localization
  • Probabilistic Occupancy Grid
  • Iterative Closest Point (ICP)

Modeling and understanding the environment is a crucial task for autonomous robotics, in particular for mobile robots. Occupancy grid mapping refers to a family of algorithms in probabilistic robotics that address the problem of generating maps from noisy and uncertain sensor measurements, under the assumption that the robot pose is known. The basic idea of the occupancy grid is to represent a map of the environment as an evenly spaced field of binary random variables, each representing the presence of an obstacle at the corresponding location in the environment. In this project, we experimented with real captured driving data - the KITTI dataset. The dataset contains 6 hours of traffic scenarios recorded at 10–100 Hz using a variety of sensor modalities, such as high-resolution color and grayscale stereo cameras, a Velodyne 3D laser scanner, and a high-precision GPS/IMU inertial navigation system.

The goal of an occupancy mapping algorithm is to estimate the posterior probability over maps given the data: $p(m\mid z_{1:t},x_{1:t})$, where $m$ is the map, $z_{1:t}$ is the set of measurements from time 1 to $t$, and $x_{1:t}$ is the set of robot poses from time 1 to $t$. If we let $m_{i}$ denote the grid cell with index $i$, then the notation $p(m_{i})$ represents the probability that cell $i$ is occupied. The standard approach, then, is to break the problem down into smaller problems of estimating $p(m_{i}\mid z_{1:t},x_{1:t})$ for all grid cells $m_{i}$. Each of these estimation problems is then a binary problem. This breakdown is convenient but does lose some of the structure of the problem, since it does not enable modeling dependencies between neighboring cells. Instead, the posterior of a map is approximated by factoring it into $p(m\mid z_{1:t},x_{1:t})=\prod _{i}p(m_{i}\mid z_{1:t},x_{1:t})$.
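To make the per-cell formulation concrete, here is a minimal sketch of my own (not from the course material) showing the log-odds representation typically used for each binary cell estimate, and the factorized map posterior as a product of independent per-cell probabilities:

```python
import numpy as np

def log_odds(p):
    # Convert probability to log-odds: l = log(p / (1 - p))
    return np.log(p / (1.0 - p))

def prob(l):
    # Inverse mapping: recover probability from log-odds
    return 1.0 - 1.0 / (1.0 + np.exp(l))

# Example per-cell posteriors p(m_i | z_{1:t}, x_{1:t});
# the factorized map posterior is their product.
cells = np.array([0.9, 0.2, 0.5])
map_posterior = np.prod(cells)

# The log-odds transform is invertible, so either form can be stored.
assert np.allclose(prob(log_odds(cells)), cells)
```

Storing log-odds rather than probabilities turns the Bayes filter update into a simple addition per cell, which is why it is the standard choice for occupancy grids.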

Hmm, I think that’s enough intro, let’s dig into the implementation. For that I used the pykitti python package, a lightweight package providing a minimal set of tools for working with the KITTI dataset. Basically, after passing the path of the recording, it provides each of the sensors' data via generators for easy sequential access. This package assumes that you have already downloaded the calibration data associated with the sequences you want to work on, and that the directory structure is unchanged from the original structure laid out in the KITTI zip files.
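A loading sketch along these lines (the paths are placeholders for your local KITTI raw-data root, and the access pattern follows pykitti's raw-data interface as I understand it):

```python
import os

# Placeholder paths - point basedir at your local KITTI raw-data root,
# laid out exactly as in the original KITTI zip files.
basedir = "/path/to/kitti/raw"
date, drive = "2011_09_26", "0001"

if os.path.isdir(basedir):
    import pykitti

    # pykitti.raw lazily loads one recording; sensor data are exposed
    # through generators for sequential access.
    dataset = pykitti.raw(basedir, date, drive)

    first_pose = dataset.oxts[0].T_w_imu   # 4x4 SE(3) pose from the GPS/IMU
    first_scan = next(iter(dataset.velo))  # Nx4 Velodyne points (x, y, z, reflectance)
```

The guard simply skips the load when the data is not present, so the snippet is safe to run as-is.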

In the GitHub folder above, you can find the implementation of the OccupancyMap class, which creates and updates the occupancy & threshold maps. I implemented the update function using the inverse sensor model described in Chapter 9 of “Probabilistic Robotics” by Sebastian Thrun et al. [2]. In each of the update functions, I looped over the perceptual field of the point cloud with respect to the current car position and updated the cell value of the occupancy map using the log-odds form of the Bayes filter: $l(m_{i}\mid z_{1:t},x_{1:t}) = l(m_{i}\mid z_{t},x_{t}) + l(m_{i}\mid z_{1:t-1},x_{1:t-1}) - l_{0}(m_{i})$, where $l_{0}(m_{i})$ is the prior log-odds of cell occupancy.
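A minimal sketch of that per-cell update (the sensor-model constants `L_OCC` and `L_FREE` are illustrative values I chose, not the ones from the project code):

```python
import math

L0 = 0.0                    # prior log-odds (corresponds to p = 0.5)
L_OCC, L_FREE = 0.85, -0.4  # assumed inverse-sensor-model constants

def update_cell(l_prev, hit):
    """One Bayes-filter step in log-odds form:
    l_t = l_{t-1} + inverse_sensor_model - l_0."""
    return l_prev + (L_OCC if hit else L_FREE) - L0

# A cell observed as occupied three times in a row drifts toward p = 1.
l = L0
for _ in range(3):
    l = update_cell(l, hit=True)
p = 1.0 / (1.0 + math.exp(-l))  # back to probability
```

Because the prior here is 0.5 ($l_0 = 0$), each hit simply adds `L_OCC`; with an informative prior the subtraction of $l_0$ keeps it from being double-counted across updates.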

The whole procedure of the occupancy map creation:

  1. At the beginning, I load the dataset and extract the rigid SE(3) pose transformations using dataset.oxts.T_w_imu, taking the first measurement as the origin.
  2. Load the rigid transformation from the IMU to the Velodyne, which is the constant transformation matrix I used to transform the current measured point cloud to world coordinates at each iteration.
  3. For each car transformation matrix w.r.t. the origin:
    • Get the pose of the car
    • Transform the PCL to world coordinates - loop over the current PCL and multiply each point by the current lidar transformation matrix.
    • Importance sampling - keep only the cells with more than one measurement in the vertical direction.
    • Update the occupancy and threshold maps.
    • Visualize the scene map, the current point cloud, and the current occupancy map.
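The core of the loop above can be sketched as follows, on synthetic data. The grid resolution, grid size, and log-odds constants are assumptions of mine, and free-space ray casting and visualization are omitted; only the transform, the more-than-one-hit filter, and the occupied-cell update are shown:

```python
import numpy as np

RES = 0.5              # grid resolution in meters (assumed)
GRID = 200             # grid side length in cells (assumed)
L0, L_OCC = 0.0, 0.85  # prior and "occupied" log-odds (assumed)

log_odds_map = np.full((GRID, GRID), L0)

def world_to_grid(xy):
    # Discretize world (x, y) to cell indices, origin at the grid center.
    return np.floor(xy / RES).astype(int) + GRID // 2

def update(pcl_h, T_w_velo):
    """Transform one homogeneous Nx4 scan to the world frame and mark
    the hit cells as occupied (free-space updates omitted here)."""
    pts_w = (T_w_velo @ pcl_h.T).T[:, :3]
    cells = world_to_grid(pts_w[:, :2])
    # Keep only cells hit by more than one point of the scan
    # (the "more than one measurement in the vertical direction" filter).
    uniq, counts = np.unique(cells, axis=0, return_counts=True)
    uniq = uniq[counts > 1]
    inb = np.all((uniq >= 0) & (uniq < GRID), axis=1)
    r, c = uniq[inb].T
    log_odds_map[r, c] += L_OCC - L0

# Synthetic scan: two points land in the same cell, one point elsewhere.
scan = np.array([[1.0, 1.0, 0.0, 1.0],
                 [1.1, 1.1, 0.5, 1.0],
                 [5.0, 5.0, 0.2, 1.0]])
update(scan, np.eye(4))  # identity pose: Velodyne frame == world frame
```

With this filter, only the doubly-hit cell is updated, which suppresses spurious single returns at the cost of discarding sparse but real obstacles.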

Project No.2


Project No.4



Osher Azulay

My research interests include robotic manipulation and deep reinforcement learning.