Occupancy Grids vs Max Height Images

I am attempting to use the stretch_navigation ROS package to autonomously map my current environment. I am using explore_lite, a ROS package that sends goals to move_base, which allows it to explore what it calls "frontiers", i.e. unexplored areas at the edges of the global map. Currently, I am running into a problem while navigating autonomously because my environment has "floating" objects such as desks: the LIDAR sensor only picks up the legs and thinks the robot can move into the space between them, which would cause the stem of the Stretch 2 to collide with the surface of the desk. Is it possible for me to map this environment using only the LIDAR sensor, or would I have to combine a 3D map such as FUNMAP's with the 2D map created from the LIDAR? If the latter is the only way, any suggestions on tackling this?


Hi @mlalji, good question! When you use lidar for navigation, your robot "sees" a slice of the world at the height the sensor is located (about 17cm off the ground). For example, if there were a tall chair in front of Stretch, the only obstacles that would reflect lidar pulses at that height are the chair's four legs.

The ROS Navigation Stack (which is what powers stretch_navigation) uses an algorithm called Bresenham’s Line Algorithm to add “free area” and “obstacles” to its map.

Bresenham’s line algorithm rasterizes the lidar pulse line to a grid map, marking each cell before the line’s end as free space, and marking the cells at the line’s end as obstacles. By repeating this procedure for the ~500 lidar pulses seen in a single lidar scan, the NavStack starts to build its map.
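To make the idea concrete, here is a minimal sketch (not the NavStack's actual code) of how a single lidar ray could be rasterized onto an occupancy grid with Bresenham's line algorithm. The grid layout, cell values, and the `mark_ray` helper are illustrative assumptions.

```python
import numpy as np

UNKNOWN, FREE, OBSTACLE = -1, 0, 100  # cell values, following ROS OccupancyGrid conventions

def bresenham(x0, y0, x1, y1):
    """Yield the grid cells along the line from (x0, y0) to (x1, y1)."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx, sy = (1 if x1 > x0 else -1), (1 if y1 > y0 else -1)
    err = dx - dy
    while True:
        yield x0, y0
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x0 += sx
        if e2 < dx:
            err += dx
            y0 += sy

def mark_ray(grid, robot_cell, hit_cell):
    """Mark one lidar ray: free space up to the hit, an obstacle at the hit."""
    cells = list(bresenham(*robot_cell, *hit_cell))
    for cx, cy in cells[:-1]:
        grid[cy, cx] = FREE
    hx, hy = cells[-1]
    grid[hy, hx] = OBSTACLE

# Example: a 10x10 map, robot at cell (0, 0), one pulse hitting an obstacle at cell (7, 4)
grid = np.full((10, 10), UNKNOWN, dtype=np.int8)
mark_ray(grid, (0, 0), (7, 4))
```

Repeating `mark_ray` for every pulse in a scan, from every pose along the robot's path, is how the map gradually fills in.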

Since the lidar only sees four clumps of obstacles from the chair's four legs, the NavStack will mark the space between them as free and attempt to travel through it. Unfortunately, this problem cannot be solved using only the lidar sensor.


While the NavStack represents its map as an occupancy grid, FUNMAP uses a different map representation called a "max height image". The depth camera in Stretch's head returns a "registered point cloud", which means each 3D point is associated with an RGB color by transforming the depth camera's multiple sensors into a single optical frame.

By transforming the point cloud into a map frame and merging multiple clouds from a head scan (panning the head right to left), FUNMAP builds up a max height image.

In a max height image (MHI), each pixel represents an x-y location on the map. In the pixel's RGB channels, the RGB color from the registered point cloud is stored. In the height channel, the z height of the 3D point in the map frame is stored.
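As a rough illustration (not FUNMAP's actual implementation), one way to accumulate a registered point cloud that has already been transformed into the map frame into an MHI might look like this; the resolution, image size, and the `build_max_height_image` name are assumptions.

```python
import numpy as np

def build_max_height_image(points_xyz, colors_rgb, resolution=0.01, size=500):
    """Accumulate a registered point cloud (in the map frame) into a max height image:
    for each x-y cell, keep the highest z seen and the color of that point.

    points_xyz: (N, 3) array of x, y, z coordinates in meters (map frame)
    colors_rgb: (N, 3) array of uint8 RGB colors registered to the points
    """
    height = np.zeros((size, size), dtype=np.float32)   # height channel (0 = no data)
    rgb = np.zeros((size, size, 3), dtype=np.uint8)     # color channels

    # Convert metric x-y coordinates to pixel indices and keep points inside the image.
    ix = (points_xyz[:, 0] / resolution).astype(int)
    iy = (points_xyz[:, 1] / resolution).astype(int)
    valid = (ix >= 0) & (ix < size) & (iy >= 0) & (iy < size)

    for x, y, z, c in zip(ix[valid], iy[valid], points_xyz[valid, 2], colors_rgb[valid]):
        if z > height[y, x]:          # keep only the tallest point per cell
            height[y, x] = z
            rgb[y, x] = c
    return height, rgb
```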

Writing an algorithm to translate a max height image into an occupancy grid should be easier to tackle after seeing how these maps work. The task is to fill out an occupancy grid with data from the height channel of the MHI. Where the MHI pixels are zero, no points were seen by the head camera, so the corresponding grid cells should be marked as unknown. Where the MHI pixels are at the floor threshold, the corresponding cells should be marked as free space. Finally, where the MHI pixels are above the floor threshold, the cells should be marked as obstacles.
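Putting that together, a minimal sketch of the MHI-to-occupancy-grid conversion could look like the following; the `floor_threshold` value and the function name are assumptions, and a real implementation would also need to carry over resolution and origin metadata so the NavStack can use the result.

```python
import numpy as np

UNKNOWN, FREE, OBSTACLE = -1, 0, 100  # ROS OccupancyGrid cell values

def mhi_to_occupancy_grid(height, floor_threshold=0.05):
    """Convert the height channel of a max height image into an occupancy grid.

    height: 2D array of max z heights in meters (0 where the camera saw no points)
    floor_threshold: heights at or below this are treated as floor (free space)
    """
    grid = np.full(height.shape, UNKNOWN, dtype=np.int8)
    grid[(height > 0) & (height <= floor_threshold)] = FREE   # floor-level points
    grid[height > floor_threshold] = OBSTACLE                 # anything taller blocks the base
    return grid
```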


Hello @bshah! What I have to do is very clear to me now. Thank you!
