SLAM - The Key Technology for Autonomous Vehicles in Logistics (Prof. Cyrill Stachniss, University of Bonn)

Show notes

In today's episode we will take a deep dive into the topic of Simultaneous Localization and Mapping (SLAM).

SLAM is a method used by autonomous vehicles, such as AMRs in logistics, that lets a vehicle build a map and localize itself in that map at the same time. SLAM algorithms allow vehicles to map out unknown environments.

It’s one of those things we should all know about in more detail in order to understand how autonomous vehicles in logistics are able to do what they do.

Today's guest is a leading expert in the field of SLAM.

Cyrill Stachniss is a full professor at the University of Bonn and heads the Photogrammetry and Robotics Lab.

In his research, he focuses on probabilistic techniques for mobile robotics, perception, and navigation. The main application areas of his research are autonomous service robots, agricultural robotics, and self-driving cars. He has co-authored over 250 publications, has won several best paper awards, and has coordinated multiple large research projects on the national and European level.

This podcast is hosted by Marco Prüglmeier.

Together Marco and Cyrill cover the following topics:

A definition and explanation of SLAM (Simultaneous Localization and Mapping)
Where SLAM is being used
How the SLAM algorithm actually evolved over time and what Carl Friedrich Gauss has to do with it
The different categories of SLAM systems, how they work and what use cases they are best suited for
How SLAM systems are deployed in the world of logistics
Advantages of SLAM for robots in logistics
What problems and issues to look out for when implementing SLAM in logistics and warehouse situations with highly repetitive or highly dynamic environments
What the latest fields of research to advance SLAM technology are

Helpful links:

Webinar with JYSK and GreyOrange: https://bvl-digital.de/webinare/sendeplan/how-flexible-automation-helped-jysk-cope-with-the-unexpected-peaks-during-the-pandemic-and-support-their-ecommerce-growth/

YouTube Video on SLAM with Cyrill: https://www.youtube.com/watch?v=87S82fh4rI4

Another YouTube Video on SLAM with Cyrill: https://www.youtube.com/watch?v=BuRCJ2fegcc

To connect with Cyrill on LinkedIn: https://www.linkedin.com/in/cyrill-stachniss-736233173/

Show transcript

00:00:00: Hello and welcome to the Logistics Tribe.

00:00:08: I'm Boris Felgendreher, founder of the Logistics Tribe, and today we will take a deep dive into the topic of simultaneous localization and mapping, SLAM for short.

00:00:17: SLAM is a method used for autonomous vehicles, such as AMRs in logistics, that lets you build a map and localize your vehicles in that map at the same time.

00:00:27: It's one of those things that you should probably know about in more detail in order to understand how autonomous vehicles in logistics are able to do what they do.

00:00:35: Today we have a leading expert in this field on the podcast to give you a real deep dive into the world of SLAM.

00:00:41: Cyrill Stachniss is a full professor at the University of Bonn and heads the Photogrammetry and Robotics Lab. In his research, he focuses on probabilistic techniques for mobile robotics, perception, and navigation.

00:00:52: The main application areas of his research are autonomous service robots, agricultural robotics, and self-driving cars. He has co-authored over 250 publications,

00:01:01: has won several best paper awards, and has coordinated multiple large research projects on the national and European level. This podcast episode is hosted by Marco Prüglmeier, himself a logistics robotics expert and aficionado.

00:01:14: Before we get started, a quick message from our great supporters at GreyOrange. GreyOrange and the German Logistics Association BVL will host a webinar together with the Danish homeware retailer JYSK.

00:01:24: The topic of the webinar will be how flexible automation helped JYSK cope with the unexpected peaks during the pandemic and support their e-commerce growth.

00:01:33: I will be moderating this webinar, so I'm really looking forward to it. The date is February 24th from 11 a.m. to noon CT.

00:01:40: If you're interested, I will leave a link in the show notes. I hope to see you there. All right, and now on to the show. Enjoy! Hi Cyrill, welcome to the Logistics Tribe. Nice to have you here on the show. Yeah, it's my pleasure to be here. Thanks for inviting me.

00:01:54: Yeah, I'm really looking forward to this talk with you. I remember that we talked,

00:02:00: I don't know, probably several years ago, about SLAM and technology and so on, and I just wanted to do that again and open it

00:02:10: for our audience here at the Logistics Tribe, and to go,

00:02:14: in order to dive a little bit deeper into the topic of SLAM, because SLAM in my eyes is really a growing

00:02:24: technology, or aspect, or algorithm, that is also used in logistics environments, and

00:02:32: that's why I think it really makes sense that we talk a little bit about SLAM. But first of all:

00:02:38: what is SLAM, Cyrill? Can you, as a professor, explain that in simple words, or is it

00:02:46: complicated? No, I'll do my very best to do that. So actually, SLAM stands for simultaneous localization and mapping. Another original term, which was used, let's say,

00:02:55: in parallel to SLAM, was actually CML, concurrent mapping and localization. These are the two terms

00:03:00: that were brought up initially, but SLAM simply sounded cooler, so people stuck with SLAM. So what it basically does is this:

00:03:07: what you want to do is build a map of the environment, of your surroundings, and you want to localize yourself in this map, and you want to do both at the same point in time. That's basically what the problem is about.

00:03:17: You can envision this as a small child waking up for the first time, opening his or her eyes, and then walking through the environment.

00:03:25: What this kid is doing is actually seeing what's around him or her: whatever chairs, objects, and walls are around.

00:03:34: And the child can walk and navigate the environment, gets views of those objects from different locations, from different viewpoints, and can use this to

00:03:43: estimate its own motion through the environment, and that's basically what a SLAM system is doing on its own. So we want to build a model of the environment; we call it a model, you can also call it a map,

00:03:54: and we want to know where we are on that map while we are navigating. And this is the "simultaneous" part: we want to do localization and mapping together.

00:04:03: I mean, compare this to existing systems such as the navigation device in your car, for example: this comes with a given map. The map is basically there, and you just localize yourself in that map with the GPS signal, for example.

00:04:15: But in SLAM you really want to do the localization and, at the same point in time, build the map

00:04:20: that you use in order to perform the localization, so it's somewhat more challenging than localization on its own. Yeah, I like this analogy that you gave us with the child looking around in the room.

00:04:31: I actually had the same analogy, maybe we were talking about it back then, but I always thought that you must have the map before looking around so that you can

00:04:42: basically calculate back where you are positioned in the room. But actually it's done at the same time, right? Yes, absolutely. So you really start without a map,

00:04:52: you don't know where you are, and you just open your eyes and try to build first a very local model. So you just, whatever,

00:04:59: look to the right, look to the left, and see what is right of you and left of you, and this starts to grow a known map in your own mind.

00:05:07: And the more you walk around, the more you see, with of course a rough ego-motion estimate, because if you make a step forward you know you have traveled probably a meter forward.

00:05:16: And this gives you an idea: when you see the next object, or something you've seen before, this allows you to estimate where you are relative to the other object.

00:05:24: And this allows you to build a map, and that's basically what SLAM does, in exactly the same way.

00:05:30: So you can really start without a map and navigate. It's the same if you take a train and go to a different city, a place you've never been before:

00:05:38: you're walking out of the train station, you have no idea what that city looks like or where you are, and

00:05:43: through walking through the environment yourself and seeing things around you, you can build a mental map,

00:05:48: a road map for example, of the surroundings of that train station where you started, and that's exactly what SLAM does.

00:05:56: Mm, okay, great. So the next question would be: what is it actually used for in technology? I mean,

00:06:05: of course, coming from logistics, I'm a big fan of AMRs and I know that we are using it there, but what else are the big use cases of SLAM?

00:06:16: So from my point of view, whenever you have a system which should do something on its own,

00:06:21: you need to have an environment model. So whenever you have a system, as you said, a mobile robot, which navigates through the environment and should do this in an autonomous fashion,

00:06:29: it needs to know where the system is and what the world around

00:06:34: the robot looks like in order to do the task. These can be robots operating in warehouses; we can see this with autonomous cars and autonomous trucks, which have the same problem that they need to solve. The environment just looks a little bit different, but in the end it's more or less the same problem.

00:06:48: But you can also see this for other tasks, if you think about AR applications or VR things, especially AR, when you

00:06:57: combine virtual objects, for example in goggles, with the real world around you: you need to estimate where you are and where those objects are. So if you have a virtual, whatever, glass and put it on a real table,

00:07:10: the system which generates this image of the virtual glass sitting on a table needs to know where the table is and

00:07:17: how you are positioned with respect to the table, in order to give you the impression that this virtual glass is actually embedded in the real world. And so there is a large number of applications like this. You can even think of moviemaking, where you

00:07:32: want to build a 3D model of a scene in order to then, whatever, do some camera motions in post-production which you may not have recorded before. So even there, people are trying to build highly accurate, photorealistic models of the environment,

00:07:47: typically using a lot of cameras or sensors like laser range scanners, in order to

00:07:52: build real 3D models of the real world in the virtual scene, to then generate new views, for example. So there are numerous applications, not just in robotics and logistics, not only in autonomous driving, but also in a large number of other disciplines. Basically, whenever

00:08:06: you have a system which either should do something on its own and act in an environment,

00:08:11: or where you want to give the user an impression of what the world looks like, for example by generating views from viewpoints

00:08:19: which you actually haven't seen, then those systems are super useful. And how did

00:08:25: this algorithm actually evolve over time? Because to me it almost looked like,

00:08:32: I don't know, five or six or seven years ago, bam, it was there. But there must have been something before that time, so maybe you can

00:08:42: give us a small overview of the timeline and how this

00:08:47: evolved, also coming out of the universities and so on. Yeah, so actually it's much, much older

00:08:53: than what you think. In the robotics community it came up in the late 80s and 90s, but it's actually much, much older. I would actually date it back 200 years, to Carl Friedrich Gauss,

00:09:06: who was actually performing SLAM by hand. What Gauss was doing

00:09:11: between 1820 and 1826 was measuring the Kingdom of Hanover, and what he was basically doing was taking distinct

00:09:19: objects in the world, like the tower of a church and other real landmarks, and measuring

00:09:25: distances and angles between those landmarks, and using this to measure the Kingdom of Hanover. This was the first application of a SLAM system,

00:09:34: just basically done by hand, and the computations were really, really complex. You needed a lot of people doing all the matrix inversions and solving the linear systems which were involved in there.

00:09:44: But basically it is a very, very similar technique to what we are using today. So if you ask me, I would say

00:09:51: SLAM is more or less 200 years old. Done by hand, I would never have imagined that. And even if you go back, whatever, to the 1880s, so even something which is

00:10:02: very, very far away for us, people like Helmert built super advanced systems in order to simplify computations, stuff which is surprisingly advanced and can be used in computers today

00:10:14: to make systems sparser, so easier to solve from a mathematical point of view, because they were really suffering, since they had to do all that by hand.

00:10:22: So, whatever, every iteration took them half a year to compute manually,

00:10:27: inverting huge matrices and stuff like this, and of course they invested a lot of brain power: how can we do this better, how can we parallelize stuff?

00:10:33: Things which again became really important over the last years in robotics and other communities. So we could probably have saved

00:10:42: 20 years of development in SLAM in robotics if we had actually gone back to Gauss and Helmert and had done that stuff properly. Retrospectively, I can clearly say that.

00:10:52: Wow, that's interesting. But then there was, anyway, some kind of a breakthrough,

00:10:58: like, I don't know, ten years ago or so, because then it became really popular. So what was the reason for that? I would even say it was around 2000, 2002, the time when SLAM was becoming more and more popular,

00:11:11: at least from an academic point of view. This was thanks to a couple of people, in Europe especially.

00:11:16: Henrik Christensen and others organized what is called a SLAM Summer School, where they were bringing young PhD students from all over Europe together,

00:11:25: organizing these events and a large exchange, and this made a big change in that more and more people got interested in the problem. It had been there in the robotics community probably since the beginning of the 90s or late 80s, with the first works, but it was a fairly small community.

00:11:39: I would say around 2000 it became more and more popular, and this was also the point in time when open source software became more popular, at least in the robotics community.

00:11:48: And so people were starting to publish their systems as open source. At that point in time you had to put years of development into building a first SLAM system, and there have been very interesting works coming out around 2000 to 2005, maybe.

00:12:01: That is when the first really operational SLAM systems were shipped, which are still in use today in a lot of applications, and a lot of open source systems are actually algorithms which date back to, whatever, 2005.

00:12:13: And this was probably a big boost. Then we were experimenting with how we can improve our algorithms and so on and so forth, and then, probably around

00:12:23: 2010, maybe a little bit earlier,

00:12:26: we actually moved back to the original ideas that Gauss already had. Before, we were looking into other techniques, so we didn't use the least squares approach which Gauss was proposing.

00:12:36: We went into things like Kalman filters or particle filters, other recursive state estimation techniques, to solve that problem.

00:12:43: But with having more compute and being able to do mathematical operations very fast, and having

00:12:48: good mathematical tools or libraries around which can solve linear systems more efficiently and faster, we actually reverted back to the least squares ideas, and a lot of the SLAM systems out there today

00:13:00: are basically least squares. And then things matured.

00:13:06: Systems have been released; there are basically, I would say, three optimization systems out there which are used today in SLAM systems, or at least cover the majority of them.

00:13:16: They became more robust, and this then

00:13:18: gave other people the chance to build SLAM systems more easily, and this is where things exploded. And then, with autonomous cars driving around, you want to build large-scale maps, also maps which users can manipulate or change,

00:13:30: which the least squares approach allows you to do better than other techniques before.

00:13:34: And why? Because of the calculation power, or what is the reason? The reason is a different one. The least squares approach is typically an offline approach, so you get all your data together, you solve the system, and you get

00:13:48: the best possible result out.

00:13:50: Originally the computations simply took too long, and people looked into techniques to do this with what we call sequential state estimation. That means you're trying to reuse your result and always just add the next sensor reading to it.

00:14:04: And in those systems it is much harder to change something. If you as a user realize, oh, the system actually did a good job but made a mistake at one particular place in the environment, it's very hard to fix this in filtering techniques. In the least squares approach you actually have all the data at hand, and you can go back and

00:14:19: delete some constraints in the optimization problem, and then, magically, everything falls into place because the user eliminated the mistake of the system.

00:14:26: So this is more due to the fact that the system allows for easier

00:14:31: user interaction, I'd say it that way. And yeah, this has to do with the underlying calculations

00:14:37: but also the way the data is used or processed in such systems. As we have you here as a professor, maybe we can even dive into the problem a little bit more, or the solution of the problem. You already mentioned the least squares

00:14:54: approach of Gauss being kind of a fundamental approach to it.

00:14:59: I imagine also that there is probably more than one type of SLAM, that there are multiple SLAMs. Maybe we can go into that first and then

00:15:09: take a little bit of a closer look at how one simple SLAM really works. Can we do that? Absolutely.

00:15:16: So if you want to categorize the SLAM systems which are out there, I would probably open up four categories.

00:15:22: If I look at it and don't go into the really old-school Gauss stuff, I start with what the robotics community has done. Most of the initial people, around Hugh Durrant-Whyte and

00:15:35: people like this, started with using Kalman filters, so well-known state estimation techniques that have been used in a lot of applications since the 1950s,

00:15:43: to do this recursive state estimation, and that was kind of the standard approach for several years.

00:15:50: What characterizes this type of approach is that it is based on the Kalman filter or extended Kalman filter, so certain variants.

00:15:57: Everything is assumed to be a Gaussian distribution, and you're basically living in a linear world, so having no nonlinear functions in your system, which is highly unrealistic in real-world situations.

00:16:07: But that's the assumption that those systems actually make, more or less.
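
The recursive estimation described above can be illustrated with a minimal sketch of a one-dimensional Kalman filter, assuming a robot that moves along a corridor and measures its own position. The function names and all motion, measurement, and noise values are illustrative assumptions and do not come from the episode. The belief is a single Gaussian (mean and variance), and both the motion and the measurement model are linear, which is exactly the assumption mentioned in the conversation.

```python
def kalman_predict(mean, var, motion, motion_var):
    """Prediction step: shift the belief by the motion and grow its variance."""
    return mean + motion, var + motion_var


def kalman_update(mean, var, measurement, measurement_var):
    """Correction step: blend prediction and measurement, weighted by variance."""
    gain = var / (var + measurement_var)  # Kalman gain, between 0 and 1
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


# Hypothetical run: three 1 m moves, each followed by a noisy position reading.
mean, var = 0.0, 1.0
for motion, measurement in [(1.0, 1.2), (1.0, 1.9), (1.0, 3.1)]:
    mean, var = kalman_predict(mean, var, motion, motion_var=0.5)
    mean, var = kalman_update(mean, var, measurement, measurement_var=0.4)

print(f"position estimate: {mean:.2f} m, variance: {var:.3f}")
```

Each new reading is folded into the previous result and then discarded, which is why this family is called recursive: the filter never revisits old data.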

00:16:13: Um, and so this was one group of approaches. They work well under good conditions, for limited applications and limited space; if the environment is not huge,

00:16:25: they actually work quite well, and things like harbors in Australia are completely operated by cranes which are running on extended Kalman filters for doing this.

00:16:34: So just one quick question: I always connect Kalman filters with different sensor inputs. Is that right? Is that

00:16:44: something that they are also used for, or am I wrong on that?

00:16:48: I think you're wrong to some degree on that. The sensor observations are actually fairly independent from the underlying state estimation system.

00:16:58: What you always have to do in the end is turn your sensor data into something that the state estimation system can use.

00:17:05: So if you think of a camera, we typically extract features in the environment, things like SIFT features or SURF or all the variants of different features which are out there.

00:17:14: And that basically means you're looking for distinct points in your image, things like corners for example, things you can recognize really well.

00:17:21: And then you [unclear] in the logistics world? Yes, exactly, so certain things that you can identify well in your image. That's the only thing which matters: you need to be able to identify them well in your image and always spot the same location in an image.

00:17:34: And then you're basically just estimating the location of these points and your own position with respect to those points in 3D.

00:17:41: And this is something that the Kalman filter can do really well. There are other techniques out there where you build floor-plan-like maps. You've probably seen this if you have a laser scanner.

00:17:51: There you're not extracting those features, but you're basically building what we call an occupancy grid map. That's basically like a floor plan: if you buy a new house you get a floor plan, and

00:17:59: these sensors basically generate maps which look like that.

00:18:02: And they are much harder to integrate into a Kalman filter because you don't have these distinct landmarks in there, but there are still ways you can do that, at least to some degree.

00:18:11: So in this sense, yes, the kind of sensor data that you're using to represent your environment matters, but it's less dependent on the state estimation technique than it probably sounds for most people. You need to extract it to some degree, but then you can actually do this.
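
To make the occupancy grid map mentioned above more concrete, here is a small sketch of how a single laser range reading could be entered into such a grid. It assumes the robot pose is already known and uses a common log-odds update. The function names, the grid size, the 0.1 m resolution, and the two increment constants are assumptions chosen for illustration, not values from the episode.

```python
import math

L_OCC, L_FREE = 0.85, -0.4  # assumed log-odds increments, not tuned values


def integrate_beam(log_odds, origin, angle, distance, resolution=0.1):
    """Update a log-odds grid (list of rows) with one range measurement."""
    steps = int(round(distance / resolution))
    visited, hit = [], None
    for i in range(steps + 1):
        x = origin[0] + i * resolution * math.cos(angle)
        y = origin[1] + i * resolution * math.sin(angle)
        cell = (int(round(y / resolution)), int(round(x / resolution)))
        if not (0 <= cell[0] < len(log_odds) and 0 <= cell[1] < len(log_odds[0])):
            break  # the beam left the mapped area
        if cell not in visited:
            visited.append(cell)
        if i == steps:
            hit = cell  # the beam ended here, so something reflected it
    for row, col in visited:
        # Cells the beam passed through look free, the end cell looks occupied.
        log_odds[row][col] += L_OCC if (row, col) == hit else L_FREE


def probability(log_odds_value):
    """Convert a log-odds value back into an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds_value))


# Hypothetical scan: three identical beams from (0.5 m, 1.0 m) along the x axis.
grid = [[0.0] * 20 for _ in range(20)]
for _ in range(3):
    integrate_beam(grid, origin=(0.5, 1.0), angle=0.0, distance=1.0)

print(f"end cell: {probability(grid[10][15]):.2f}, "
      f"cell on the way: {probability(grid[10][10]):.2f}")
```

Cells that were never observed stay at probability 0.5, which is what makes the result look like a floor plan with unexplored regions.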

00:18:26: Okay, got it. So this is the first group of approaches. Then the second group of approaches came up. The original ideas came from Doucet and others, and then Mike Montemerlo and also Giorgio Grisetti

00:18:38: have been pioneers in the particle-filter-based SLAM systems, where you

00:18:42: get rid of the idea that everything must be a Gaussian distribution, all the errors and things like this, and you go to a non-parametric representation allowing for multimodal distributions. So you can say: either I'm in street one or I'm in street two, but I know I'm nowhere else.

00:18:56: This is something you can't represent with a Gaussian belief, but you can with a particle-filter-based belief.

00:19:03: This has led to a new generation of SLAM systems. The most prominent one from the research perspective is probably FastSLAM,

00:19:10: and from the usability perspective GMapping, which also builds on the idea of FastSLAM but builds grid maps, and it's probably one of the most used 2D SLAM algorithms today. ROS and a lot of other components basically use the original implementation of GMapping from

00:19:23: 2004-2005.

00:19:26: And this has the great advantage that you get rid of this Gaussian assumption, and it also allows you to deal better with some nonlinearities that you have.

00:19:36: On the downside, it's quite memory-intensive and computationally expensive, and also has some limitations in the amount of uncertainty you can represent.

00:19:47: If you think about a system which is localized fairly well and can track its pose really well, the particle filter does an outstanding job.

00:19:53: If you are in environments with huge uncertainty, let's say I know I'm in Munich right now but I have no idea where in Munich, the uncertainty that you need to represent with particles becomes intractable in such large-uncertainty setups.

00:20:06: And therefore, for example for a GPS localization system, the particle filter would not be a good idea, but for most robotics applications the particle filter does a really good job.

00:20:16: And that's the second category of systems, which are still in place today and still a standard choice, but if you build a new SLAM system you would probably not start with particle filters anymore.
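
The "either street one or street two" belief from the particle filter discussion can be sketched in a few lines. This is a localization-only toy example, not FastSLAM or GMapping: it assumes a 10 m corridor with two identical doors at made-up positions and a robot that only knows it is standing at a door. The names `DOORS`, `SENSOR_SIGMA`, and `likelihood_of_seeing_door` are hypothetical.

```python
import math
import random

random.seed(0)
DOORS = [2.0, 7.0]      # hypothetical positions of two identical doors, in meters
SENSOR_SIGMA = 0.3      # assumed measurement noise of the door detector


def likelihood_of_seeing_door(x):
    """Likelihood of the observation 'I am at a door' for a particle at x."""
    return max(math.exp(-0.5 * ((x - d) / SENSOR_SIGMA) ** 2) for d in DOORS)


# Start with no idea where the robot is: particles spread over the corridor.
particles = [random.uniform(0.0, 10.0) for _ in range(2000)]

# Weight every particle by how well it explains the observation, then resample.
weights = [likelihood_of_seeing_door(x) for x in particles]
particles = random.choices(particles, weights=weights, k=len(particles))

near_first = sum(abs(x - DOORS[0]) < 1.0 for x in particles) / len(particles)
near_second = sum(abs(x - DOORS[1]) < 1.0 for x in particles) / len(particles)
mean = sum(particles) / len(particles)
print(f"near door 1: {near_first:.2f}, near door 2: {near_second:.2f}, mean: {mean:.2f}")
```

After resampling, the particles cluster around both doors. A single Gaussian would have to summarize this belief by its mean, which lies between the doors where almost no particle is located.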

00:20:26: And then the third group of approaches is the least squares approach, which then became popular again. This dates back to Gauss and all the ideas of using sparse matrices, so basically matrices which contain a lot of zeros, and that makes your linear system easier to solve.

00:20:41: That is the whole idea of least squares, and it became really popular, and most systems which are out there today use least squares SLAM.

00:20:49: The fourth category of systems are learning-based approaches, given that deep learning has had an impact on a lot of different disciplines, although I would say

00:20:59: I haven't seen the big breakthrough for SLAM, at least not for the core SLAM idea. There are certain things you can do really well with learning systems, but for other things I think the learning approach is not the right way to go,

00:21:11: especially if we are able to formulate and solve a problem mathematically,

00:21:16: maybe not with a closed-form solution, but with a really good idea of how to do it. And we have that for least squares; we have been doing these things for 200 years.

00:21:22: It doesn't make much sense from my point of view to run it on a learning algorithm, to train the learning algorithm to do the matrix multiplication, when you can actually specify the correct solution.

00:21:31: This doesn't make much sense to me.

00:21:33: There are, however, other parts of the SLAM problem where learning helps a lot, exactly those parts which you can't formulate or formalize that well.

00:21:42: Things like:

00:21:43: is image A equal to image B? Data association, so interpreting the sensor data: there learning is great. So it's probably the combination of learning together with one of the existing techniques, such as least squares,

00:21:56: which is probably where I see the future. So it's combining learning for the stuff you can't

00:22:03: formulate well mathematically, and then using the traditional least-squares-like approaches

00:22:09: for the part that you can actually write down mathematically and solve.

00:22:13: And that's the last category, and probably the new generation of SLAM systems will all fall into this category.

00:22:20: And the last one that you mentioned is fairly new now, right? It's only, I don't know, the last 10 years probably. Yeah, that's true, that's correct. All the deep learning stuff basically

00:22:32: came over from machine learning and computer vision and then into the robotics community. It maybe started in 2012, something like this,

00:22:40: with the first really successful deep learning approaches on image data,

00:22:43: and it is slowly becoming more and more popular to solve certain aspects of this. But the core thing that we typically call SLAM is still the least squares ideas. It's just about

00:22:53: how to turn the data which comes from our camera or laser scanner into a format that the least squares system can actually exploit best.

00:23:01: That's where learning comes into play, and that's where I see learning playing a big role. So those are four different types of SLAM.

00:23:10: I have one question, because I was assuming that you would come up with the types 2D SLAM and 3D SLAM,

00:23:20: but actually all of these four types can be applied in a 3D world or in a 2D world? Or can you

00:23:27: add some knowledge to that? Yes, absolutely. So in the end,

00:23:31: all the techniques I presented are basically state estimation techniques, so they fall into the category of estimating

00:23:37: a parameter which refers to a state, like where the robot is right now, and that state could be

00:23:44: a three-dimensional thing: an x location, a y location, and an orientation theta, for example, describing where the robot is looking. This would be the 2D world, where you need a 3D vector to represent it.

00:23:55: Or you go to a 3D environment where you have x, y, z and then yaw, pitch, and roll angles, so basically three angles for where you are looking, and where you are, which would be a 6D vector.

00:24:05: And for the underlying state estimation system it's actually

00:24:08: pretty irrelevant if you are estimating a 3D vector or a 6D vector. Of course 6D is more variables, so it's a bit more complex.

00:24:16: But in the end, from the calculation point of view, it's very similar.
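
The two state representations just described can be written down as simple data types. This is only a sketch: the class names `Pose2D` and `Pose3D` and the helper `state_vector` are hypothetical, and the yaw/pitch/roll parameterization follows the wording of the conversation rather than any particular library.

```python
from dataclasses import dataclass, astuple


@dataclass
class Pose2D:
    """Pose in a planar world: position plus heading (hypothetical type)."""
    x: float
    y: float
    theta: float  # heading in radians


@dataclass
class Pose3D:
    """Pose in a 3D world: position plus three orientation angles (hypothetical type)."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


def state_vector(poses):
    """Stack all poses into one flat list of unknowns for the estimator."""
    return [value for pose in poses for value in astuple(pose)]


planar = state_vector([Pose2D(0.0, 0.0, 0.0), Pose2D(3.0, 0.0, 0.0)])
spatial = state_vector([Pose3D(0.0, 0.0, 0.0, 0.0, 0.0, 0.0)])
print(len(planar), len(spatial))  # two 2D poses and one 3D pose: 6 unknowns each
```

The same `state_vector` helper handles both cases, which mirrors the point made above: the estimator sees a longer or shorter vector of unknowns, but the procedure stays the same.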

00:24:22: Okay, interesting. So you mentioned the least squares approach is basically the,

00:24:28: yeah, the foremost SLAM algorithm that is out there, so maybe we can dive into that one. How would it calculate? I know

00:24:38: it's hard to explain, but maybe we can try to explain it without going too deep, so that we understand the basic algorithm. Okay, yeah, absolutely. So

00:24:50: for what those systems do, I'm now looking into something which is called the pose graph, which is a simplified representation that you use for solving a least squares problem, or SLAM as a least squares problem, and at its core it's actually very easy to understand how that works.

00:25:04: What you're doing in the pose graph is building up a graph of poses, that's what the name says. So what is a graph and what is a pose?

00:25:12: A pose is the location of the robot together with its orientation, so this 3D or 6D vector I was talking about, and you want to represent this for every point in time. So let's say every second you store where the robot is right now and where it is looking.

00:25:26: So every second you will add a 6D vector to the state estimation problem that you want to solve.

00:25:32: So the longer you travel through an environment, the more of those poses you accumulate. You can picture this as walking through your favorite city, wherever you are: you start at the train station and you walk towards the, whatever, cathedral.

00:25:45: So every few meters you will create one of those poses, representing the trajectory that you've actually taken.

00:25:52: Okay, the problem is what you put into those variables. So what's the position where you are? Let's say at the train station

00:26:00: I close my eyes, I open my eyes, and I see: I'm at the center of my own world, my coordinate frame is (0, 0, 0), that's where I'm standing right now, and let's start from here. And then you say, okay, I walk.

00:26:12: You know you're walking forward, you know every step is maybe a meter, and let's say every

00:26:17: three meters you create one of those new poses. And you know, okay, my second pose is probably three meters away from my first pose, and the third pose is again three meters away from the second.

00:26:27: So what you are adding is basically constraints between those poses.

00:26:31: So you say, okay, this position should be 3 meters away from the other position, and maybe at some point in time you're turning; say, okay, I do a right turn, probably a 90-degree turn, more or less.

00:26:41: So I should have a 90-degree turn and then walk three meters again.

00:26:44: All this information is not perfect. It's noisy, because your step length is not precisely a meter; sometimes it's 95 cm, sometimes it's a meter twenty, and so nothing is perfect in here.
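
The effect of these imperfect step and turn measurements can be simulated with a short sketch. It assumes a walk around a square block in 3 m segments with 90-degree right turns, roughly 5 % noise on each measured distance, and about 2 degrees of noise on each measured turn. These noise levels, the route, and the function name `apply_step` are assumptions for illustration only.

```python
import math
import random

random.seed(1)


def apply_step(pose, distance, turn):
    """Turn by `turn` radians, then move `distance` meters forward."""
    x, y, theta = pose
    theta += turn
    return (x + distance * math.cos(theta), y + distance * math.sin(theta), theta)


# Walk around a square block: four sides, each made of four 3 m segments,
# with a 90-degree right turn at the start of sides two, three, and four.
commands = []
for side in range(4):
    for segment in range(4):
        turn = -math.pi / 2 if (segment == 0 and side > 0) else 0.0
        commands.append((3.0, turn))

true_pose = (0.0, 0.0, 0.0)   # where the walker really is
estimate = (0.0, 0.0, 0.0)    # where the noisy step counting says the walker is
for distance, turn in commands:
    true_pose = apply_step(true_pose, distance, turn)
    measured_distance = distance * random.gauss(1.0, 0.05)     # ~5 % length noise
    measured_turn = turn + random.gauss(0.0, math.radians(2))  # ~2 degree turn noise
    estimate = apply_step(estimate, measured_distance, measured_turn)

drift = math.hypot(estimate[0] - true_pose[0], estimate[1] - true_pose[1])
print(f"true end position: ({true_pose[0]:.2f}, {true_pose[1]:.2f}), drift: {drift:.2f} m")
```

The walker really ends up back at the start, but the chained measurements do not, because every small error is carried forward into all later poses.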

00:26:55: These are what we call soft constraints. You are basically collecting those soft constraints, and it now becomes interesting

00:27:02: when you do something that we call a loop closure, closing a loop. So you're walking through your environment

00:27:07: and at some point in time you realize: oh, I'm back at the train station, I was here before. And then you can say, okay, I'm now in front of the train station where I have been before, so I know

00:27:16: I can add an additional constraint that the place where I am right now is exactly the place where I started.

00:27:23: And then you basically create constraints between non-consecutive poses, so places you've been before. This is what we call place revisiting or loop closing: you're back at a place where you have been before and you say, I realize I'm here.

00:27:37: Let's add a constraint. This adds more constraints to your optimization problem, or least squares problem, that you're then solving. And this is the whole ingredient. What you're then trying to do in this least squares problem is to say:

00:27:50: what are these constraints that I actually added to my system?

00:27:54: You collect all of them and then say: how do I need to change the positions, how do I shift them around a little bit, let's say, can I push this position a meter forward or this one a meter backwards, so that the error, the disagreement between the constraints, gets smaller and smaller?

00:28:08: And so in the least squares approach you're basically trying to find a configuration of these nodes of the pose graph

00:28:15: such that the error which is introduced by these constraints is as small as possible. So you want to find the configuration in which all the constraints agree as well as possible. And this can be a very, very large problem, because if you think

00:28:29: that for every second you add, whatever, three dimensions to your state vector and you're traveling for an hour,

00:28:33: you end up with a very high-dimensional state vector with a lot of constraints, and so finding thousands or millions of parameters can be computationally expensive.
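
To make the "shift the poses around until the constraints agree" idea concrete, here is a minimal sketch of a pose graph on a 1D line, solved with NumPy's linear least squares. It is a deliberately simplified assumption: real pose graphs contain orientations, are nonlinear, and are solved iteratively. All measurement values are invented for illustration.

```python
import numpy as np

# Each constraint is (i, j, z): "pose j minus pose i was measured as z metres".
# The first four come from noisy step counting: two steps out, two steps back.
# The last one is a loop closure: pose 4 is recognised as the start, pose 0.
constraints = [
    (0, 1, 3.1),
    (1, 2, 2.9),
    (2, 3, -3.2),
    (3, 4, -3.0),
    (4, 0, 0.0),  # loop closure
]
num_poses = 5

# Build the linear system A @ x = b, one row per constraint plus one anchor row.
A = np.zeros((len(constraints) + 1, num_poses))
b = np.zeros(len(constraints) + 1)
for row, (i, j, z) in enumerate(constraints):
    A[row, j] += 1.0
    A[row, i] -= 1.0
    b[row] = z

# Anchor the first pose at the origin, otherwise the whole graph could slide freely.
A[-1, 0] = 1.0
b[-1] = 0.0

# Least squares: find the poses that minimise the sum of squared constraint errors.
x, *_ = np.linalg.lstsq(A, b, rcond=None)

# For comparison: simply adding up the step measurements, with no loop closure.
dead_reckoning = np.concatenate(([0.0], np.cumsum([z for _, _, z in constraints[:4]])))

print("dead reckoning:", np.round(dead_reckoning, 2))
print("optimised     :", np.round(x, 2))
```

Dead reckoning ends 0.2 m away from the start, which contradicts the loop closure. The least squares solution spreads that 0.2 m disagreement evenly over the five constraints, because all of them are weighted equally in this sketch.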

00:28:42: But that's what the least squares problem actually solves. And so you have those constraints from just walking forward, as I told you. That's what

00:28:51: odometry would probably be on a robot, so basically you're counting the revolutions of your wheels.

00:28:55: Let's say you're in a car, so you have something which we call an Ackermann drive, or you have another mobile robot which is a differential drive, or whatever the physics of the platform is. This gives you an idea of how much the platform moved.
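
The wheel-counting idea can be sketched for a differential drive as follows. This is a textbook-style approximation under assumed parameters: the function name, the wheel radius, and the wheel base are hypothetical, and real platforms add slippage and calibration errors on top.

```python
import math

def diff_drive_step(pose, rev_left, rev_right, wheel_radius, wheel_base):
    """Update a 2D pose (x, y, theta) from counted wheel revolutions.

    wheel_radius and wheel_base are in metres; rev_left and rev_right are the
    revolutions of the left and right wheel since the last update.
    """
    x, y, theta = pose
    dist_left = 2.0 * math.pi * wheel_radius * rev_left
    dist_right = 2.0 * math.pi * wheel_radius * rev_right

    dist = (dist_left + dist_right) / 2.0           # distance of the robot centre
    dtheta = (dist_right - dist_left) / wheel_base  # change in heading

    # Midpoint approximation: move along the average heading of this step.
    x += dist * math.cos(theta + dtheta / 2.0)
    y += dist * math.sin(theta + dtheta / 2.0)
    return (x, y, theta + dtheta)

# Both wheels turn once: the robot drives straight ahead.
straight = diff_drive_step((0.0, 0.0, 0.0), 1.0, 1.0, wheel_radius=0.1, wheel_base=0.5)

# Wheels turn in opposite directions: the robot rotates on the spot.
spin = diff_drive_step((0.0, 0.0, 0.0), -1.0, 1.0, wheel_radius=0.1, wheel_base=0.5)

print("straight:", straight)
print("spin    :", spin)
```

The relative motion computed this way is what would enter the pose graph as an odometry constraint between two consecutive poses.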

00:29:08: But additionally you also have a sensor on board with which you see the environment. This could be a camera, a laser range scanner,

00:29:14: or a lidar. A lidar, yeah, exactly. And then you're actually generating constraints between those lidar observations. You're saying: what I'm seeing right now, this local 3D model that has been generated from one position,

00:29:25: looks actually very, very similar

00:29:27: to the local 3D model of a place where we have been 20 minutes ago, so that's probably the same place. And then you can also add those constraints based on the observations that you're generating.

00:29:37: And therefore, the more sensor information you have, the more of those constraints you will get. And it could also be that the scan matching, so aligning laser scans, tells you that you moved a meter twenty forward,

00:29:48: and your step counter says it was probably one meter, so you have two constraints which disagree.

00:29:53: And depending on how much you trust your step length or your laser scan, you can say: I weight the laser scan a bit higher than my step length, so I trust the scan more, and you can

00:30:02: trade off the

00:30:04: information that these different observations provide. This makes it more complicated again, but then you can come up with optimal solutions to the SLAM problem that take all of that into account for your sensors.

00:30:15: But that's basically how it works: you need to turn your observations, or observation pairs to be precise, into those constraints.
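
The trade-off between two disagreeing constraints can be sketched as a weighted least squares estimate of a single forward motion. The 1.20 m and 1.00 m values come from the example above; the standard deviations are assumptions that express "I trust the scan more than my step length".

```python
def fuse(z_a, sigma_a, z_b, sigma_b):
    """Weighted least squares fusion of two measurements of the same quantity.

    Each measurement is weighted by the inverse of its variance, so the one
    with the smaller standard deviation pulls the estimate harder.
    """
    weight_a = 1.0 / sigma_a ** 2
    weight_b = 1.0 / sigma_b ** 2
    estimate = (weight_a * z_a + weight_b * z_b) / (weight_a + weight_b)
    sigma = (1.0 / (weight_a + weight_b)) ** 0.5
    return estimate, sigma

# Scan matching says 1.20 m (assumed sigma 0.05 m),
# step counting says 1.00 m (assumed sigma 0.10 m).
estimate, sigma = fuse(1.20, 0.05, 1.00, 0.10)
print(f"fused motion: {estimate:.2f} m, sigma {sigma:.3f} m")
```

With these assumed uncertainties the fused estimate lands at 1.16 m, much closer to the scan matcher than to the step counter.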

00:30:22: And then in the end you solve a soft-constraint problem, or a least squares problem.

00:30:27: We minimize the squared error just because it's easy to do with the squared error, and that's it, that's exactly what you're doing.

00:30:33: Okay, so if we take this explanation back to the logistics world now, we would have an AGV or an AMR.

00:30:43: And,

00:30:44: as you said already, this machine would be equipped with odometry, so it would basically count the steps, but very small steps, on the wheel.

00:30:54: So this is one sensor input.

00:30:56: And then you typically have your lidar, which you also use for safety, so that you stop if something is in front of you, but with which you also see your environment,

00:31:06: typically just in 2D. But you use these 2D scans

00:31:14: and you also put them into the least squares solution or algorithm. Yes, and just to make a small correction: what you're putting into this least squares problem is actually not the raw lidar scan.

00:31:29: What you're typically doing is this: you have this pose, and you attach the laser scan to this pose and say, okay, at this pose I got this laser scan.

00:31:39: And when a new laser scan comes in, you may find a similar-looking laser scan and say, okay, I have already seen a laser scan which looked like this in the past.

00:31:49: And then you say, okay, the laser scan that I saw 20 minutes ago and the laser scan I'm seeing right now are very similar, so I can align them using what we call scan matching.

00:31:57: The iterative closest point algorithm, or probably generalized ICP, would be the standard choice that you would use today to align those. If you can align them, you say, okay, I can align them, they look really similar. What you then know is that

00:32:10: the location where I have taken the scan right now is at a certain

00:32:14: offset from the position where we took the scan 20 minutes ago, let's say three meters to the right, one meter forward and 20 degrees rotated,

00:32:24: relative to the recording position. This gives you exactly one of those constraints that go into the least squares problem, and this is what connects the poses.

00:32:31: So in the end you're not really storing or using the laser scans themselves in the least squares problem, but the constraints that you can generate from those

00:32:39: laser scans. And that's a big and important difference. Why? Because in the end you can abstract

00:32:46: the sensor data away from the underlying state estimation.
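
How an alignment of two scans turns into a constraint can be sketched with the closed-form rigid alignment of paired 2D points. This is not a full ICP or generalized ICP implementation: ICP alternates between guessing which points correspond and solving for the transform, and the sketch below shows only the second part, under the assumption that the correspondences are already known. The scan points and the frame convention (x forward, y to the left) are invented for illustration.

```python
import math
import numpy as np

def align_2d(source, target):
    """Find rotation R and translation t minimising sum ||R @ s_i + t - t_i||^2
    for paired 2D points (rows of source and target), via an SVD."""
    mean_source = source.mean(axis=0)
    mean_target = target.mean(axis=0)
    cross_cov = (source - mean_source).T @ (target - mean_target)
    U, _, Vt = np.linalg.svd(cross_cov)
    # Guard against a reflection so that R is a proper rotation.
    det = np.linalg.det(Vt.T @ U.T)
    D = np.diag([1.0, 1.0 if det > 0 else -1.0])
    R = Vt.T @ D @ U.T
    t = mean_target - R @ mean_source
    return R, t

# A handful of scan points (assumed values), e.g. corners seen by the lidar.
scan_now = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 3.0], [-1.0, 1.0]])

# The same points expressed in the frame of the old pose: rotated by
# 20 degrees and shifted 1 m forward and 3 m to the right.
angle_true = math.radians(20.0)
R_true = np.array([[math.cos(angle_true), -math.sin(angle_true)],
                   [math.sin(angle_true), math.cos(angle_true)]])
t_true = np.array([1.0, -3.0])
scan_old_frame = scan_now @ R_true.T + t_true

R, t = align_2d(scan_now, scan_old_frame)
angle_deg = math.degrees(math.atan2(R[1, 0], R[0, 0]))
print(f"constraint: dx={t[0]:.2f} m, dy={t[1]:.2f} m, dtheta={angle_deg:.1f} deg")
```

The recovered offset (dx, dy, dtheta) is the only thing that would go into the pose graph; the scan points themselves stay attached to their poses.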

00:32:50: In the end you don't care if it's a 2D lidar or a 3D lidar or a lidar combined with a camera or an RGB camera, whatever it is. As soon as you can generate those constraints by something like scan matching or something like image matching,

00:33:01: you don't need to change the underlying algorithm, or only to a very, very limited degree, to take this information into account. Mhm. And even in the vehicle world, you would probably have a GPS.

00:33:11: How does a GPS come into the game? Let's say you have a GPS, but your GPS has an uncertainty of, whatever, 20 meters or so, or in a city block maybe even more: if you're in Venice, where the streets are super narrow, you're easily off by 300 meters.

00:33:25: And then you can still add soft constraints, saying that the place where I am right now

00:33:30: should be at that GPS location, at this longitude and latitude, but probably with an uncertainty of 300 meters.

00:33:36: So you allow the least squares system to basically push the pose around, but only within this 300-meter range, because I know roughly where I am, but not precisely.

00:33:46: You can combine these different sensing modalities, so you can add

00:33:50: GPS, you can add IMUs, you can add lidars, you can add cameras, whatever you have on your system. But you're still totally fine with just a lidar and just plain odometry.
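
A GPS reading enters the problem as a constraint on a single pose rather than between two poses. The sketch below again uses a 1D line and invented numbers: precise odometry (assumed sigma 0.1 m) between poses and coarse GPS fixes (assumed sigma 20 m, the smaller figure from the conversation) on every pose. Dividing each row by its sigma is one common way to express how much a constraint is trusted.

```python
import numpy as np

num_poses = 4

# Odometry: (i, j, measured x_j - x_i), trusted to about 0.1 m.
odometry = [(0, 1, 3.0), (1, 2, 3.0), (2, 3, 3.0)]
sigma_odometry = 0.1

# GPS: (i, measured absolute position of pose i), trusted to about 20 m.
gps = [(0, 100.0), (1, 120.0), (2, 95.0), (3, 110.0)]
sigma_gps = 20.0

rows, rhs = [], []
for i, j, z in odometry:
    row = np.zeros(num_poses)
    row[j], row[i] = 1.0, -1.0
    rows.append(row / sigma_odometry)  # small sigma -> large weight
    rhs.append(z / sigma_odometry)
for i, z in gps:
    row = np.zeros(num_poses)
    row[i] = 1.0
    rows.append(row / sigma_gps)       # large sigma -> small weight
    rhs.append(z / sigma_gps)

A = np.vstack(rows)
b = np.array(rhs)
x, *_ = np.linalg.lstsq(A, b, rcond=None)

print("poses  :", np.round(x, 2))
print("spacing:", np.round(np.diff(x), 3))
```

In this sketch the spacing between poses stays at essentially 3 m, as dictated by the odometry, while the noisy GPS fixes only decide where the chain as a whole sits on the line. No separate anchor for the first pose is needed, because the GPS constraints already pin down the absolute position.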

00:34:00: Okay, yeah, very interesting.

00:34:05: Now I actually have to admit that for the first time I really, fully understand how this works, great, so thank you very much for that.

00:34:15: But one question is left, about the first iteration, because what we usually do with AMRs is that we drive around and take the first

00:34:24: map of the building or of the logistics environment there, and we have to do that.

00:34:32: Is this actually the input that you need for the scan matching algorithm, so that you can find the

00:34:38: differences in the pose, or what?

00:34:43: Why do we need that? Okay, we need this because if we had perfect odometry, or nearly perfect odometry, then we wouldn't need

00:34:51: scan matching in the first place, so we could just trust the revolutions of the wheels to know where we are.

00:34:57: But the problem is that we don't have that, because we have slippage, we have, I don't know, drift or whatever. That's exactly what kills us. And therefore we say, okay, the good thing about the laser scanner is that it's a

00:35:10: pretty accurate sensor, so it provides really good distance measurements.

00:35:14: And therefore, if we align the scans, we get an idea of how much the platform has moved according to the laser scanner, so it's an independent measurement of that position.

00:35:23: And this provides us with this additional information.

00:35:26: So when you said that you initially build your first map of the environment, I would actually disagree and say this is already the key part of the SLAM problem that you're actually doing.

00:35:35: And there scan matching is of course a key ingredient, because the scan matcher provides you with an initial guess to begin with.

00:35:42: So you basically run scan matching, and this is kind of the starting point for your optimization problem. So the initial configuration that you store in all those different nodes is what your scan matcher has told you.

00:35:53: And then you start shifting them around to try to find a globally better solution than what the scan matcher has produced.

00:35:59: Because scan matching is really incremental, it will also drift over time, and through this global component you can actually compensate for this drift.

00:36:06: So the scan matching is basically finding the initial guess for your SLAM system. Okay. Generally speaking, what are the

00:36:16: big advantages for AMRs and logistics vehicles of using SLAM? Okay, I would even say you can't really do it well without SLAM, because in the end you need a map of the environment. So the question is how you get your map.

00:36:31: Either a human provides the map, because you have a robot which operates in a warehouse, the warehouse has been precisely measured once, nothing will change, everything stays the same.

00:36:39: Then fine, don't use SLAM, ship that pre-built, expert-built map to all your robots and let your robots run with this.

00:36:47: In reality this typically fails, because your environment will change. Nothing will stay exactly the same all the time, and you need to take those changes into account. Otherwise your system will get delocalized in those locations, because the robots basically just trust the observations and take the map as given.

00:37:01: But maybe a new shelf is installed in your warehouse, or things are moved away because you have bigger forklifts operating, whatever it is, you will have changes over time.

00:37:12: And those changes need to be taken into account, so your map will get updated, and you

00:37:16: may still want to run a SLAM system to check if the map that the SLAM system built is actually the same as the map from before, the given map,

00:37:24: and maybe just suggest changes to an operator, saying: hey, I think something has changed here in this area,

00:37:30: should we take this additional change into account and update the maps of all robots? That's something that you do, and maybe you want to have an operator in there to check if this is actually

00:37:38: correct, in order not to make your fleet of robots diverge just because one robot made a mistake. So this is something that you typically want to do.
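
The "suggest changes to an operator" step could look roughly like the sketch below. It assumes that both the given map and the freshly built SLAM map are occupancy grids of the same size and alignment, which the conversation does not specify; the grid contents, the threshold, and the function name are hypothetical.

```python
import numpy as np

def changed_cells(reference_map, current_map, occupied_above=0.5):
    """Return a boolean grid marking cells whose occupied/free status differs."""
    return (reference_map > occupied_above) != (current_map > occupied_above)

# Given map: a small warehouse with one wall along the top (1.0 = occupied).
reference_map = np.zeros((6, 8))
reference_map[0, :] = 1.0

# Map built by SLAM today: same wall, plus a newly installed shelf.
current_map = reference_map.copy()
current_map[3, 2:5] = 1.0

diff = changed_cells(reference_map, current_map)
changed_fraction = diff.mean()

if diff.any():
    row_idx, col_idx = np.nonzero(diff)
    bounding_box = (int(row_idx.min()), int(row_idx.max()),
                    int(col_idx.min()), int(col_idx.max()))
    print(f"{changed_fraction:.1%} of the map changed, "
          f"rows {bounding_box[0]}-{bounding_box[1]}, "
          f"columns {bounding_box[2]}-{bounding_box[3]}: ask an operator to confirm")
```

Keeping a human confirmation step before the update is pushed to all robots follows the point made above about not letting a single robot's mistake spread through the fleet.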

00:37:47: But there are other situations where the environment is not really like that. So even if you think about your own home and you have a

00:37:55: robot that cleans your floor, you can do this in a very stupid way, without a map, without SLAM, just saying: do a random walk.

00:38:02: So the first robots which were cleaning your floor were basically doing a random walk. You saw those systems running in completely weird,

00:38:09: random patterns through the environment: when they bump into something, they do a random turn and continue. That actually works pretty well, especially for hoovering.

00:38:17: But if you now want to have a more systematic cleaning, and maybe you want to have a wet wipe, where you don't want random walks because you see them on your floor,

00:38:26: you want this done systematically, and then you need a map of the environment, and then you need a SLAM system, because in my apartment, whatever, kids have stuff lying around, chairs will be rearranged,

00:38:35: the couch table is moved a bit from one week to another,

00:38:39: and this is a change in the environment. So those systems basically need to run SLAM, or at least a local SLAM, most of the time as well. At least initially you need to build a map of the environment, otherwise you cannot navigate well.

00:38:50: The same holds if you think about navigating with a car through a city: if you don't have a map of the environment and you don't have a sat nav and you don't have a model and you have no means to build up your mental model,

00:39:00: your internal brain SLAM,

00:39:02: then you have a pretty hard time. You can basically just go by a direction and say, okay, I think I'm driving in the direction of the sun, and hope I reach my destination. If you want to do better than that, if you want to do systematic planning, you need a map, and SLAM basically provides you with that map.

00:39:16: Okay, so big advantages, or it's even basically necessary. But on the other hand, what's the other side, what can go wrong?

00:39:29: I heard of things like, for example, buildings getting so

00:39:36: huge and so big and so repetitive, like you have a column every

00:39:44: ten meters and nothing is really there to make a distinction: am I now at column 15 or column 26? So this

00:39:53: might cause problems. Or, as you said, if you have

00:39:57: really big spaces that are regularly rearranged, can that be a problem? What do you think about that?

00:40:04: Yes, absolutely. There are different things which can go wrong. Even in the static setting, where the environment doesn't really change, you have this repetitive structure, for example, and this is what we call a data association problem. So you need to associate

00:40:17: your current observation with one of those previous observations.

00:40:19: And think about a highly repetitive environment, like an empty hospital where all the patient rooms look basically identical because they are built in exactly the same way.

00:40:30: If the robot is inside one of those rooms, it can say: yes, I'm at the center of the room, but I have no idea which room I'm actually in.

00:40:38: Unless it can read, whatever, door signs or something like this. That's a good way to actually fix that problem, but if you don't have that capability to read door signs,

00:40:47: it's highly ambiguous. So you don't know where you are: you know you're in the middle of the room, but it could be room 1, room 2, room 3, room 4 and so on.

00:40:56: And so this makes the data association problem challenging.
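
The ambiguity can be illustrated with a toy example: if all rooms look the same, appearance alone returns every room as a candidate, and only a position estimate from odometry can narrow the list down. The room layout, the function names, and the three-sigma gate are assumptions made for this sketch, not something stated in the conversation.

```python
import math

# Centres of four identical-looking rooms along a corridor (assumed layout, metres).
rooms = {
    "room 1": (5.0, 0.0),
    "room 2": (10.0, 0.0),
    "room 3": (15.0, 0.0),
    "room 4": (20.0, 0.0),
}

def candidates_by_appearance(rooms):
    """All rooms are built the same way, so the scan matches every one of them."""
    return list(rooms)

def gate_by_odometry(rooms, predicted_xy, sigma, n_sigma=3.0):
    """Keep only rooms within n_sigma standard deviations of the odometry estimate."""
    px, py = predicted_xy
    return [name for name, (x, y) in rooms.items()
            if math.hypot(x - px, y - py) <= n_sigma * sigma]

appearance_only = candidates_by_appearance(rooms)
good_odometry = gate_by_odometry(rooms, (10.4, 0.2), sigma=0.5)
poor_odometry = gate_by_odometry(rooms, (10.4, 0.2), sigma=2.0)

print("appearance only:", appearance_only)
print("good odometry  :", good_odometry)
print("poor odometry  :", poor_odometry)
```

With a tight odometry estimate a single room remains; with a loose one several rooms are still possible and the association stays ambiguous.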

00:40:59: What really helps there is having good odometry. The better your odometry, the more you can trust your odometry and eliminate most of these ambiguities, just by knowing that your odometry is actually pretty good. And that's actually one way how

00:41:11: it's done in practice fairly often, that you

00:41:14: trust your odometry to some degree. The other thing that you could do is to go back to a location where you know you can localize well. Let's say you have

00:41:24: a building where you have one central place and then a lot of corridors connecting it with all the rooms. So from time to time it makes sense for the robot to go back to the central area to localize itself well and to disambiguate the observations. That's one way.

00:41:36: The other thing you can do: in a lot of those optimization systems, the least squares stuff,

00:41:40: you can also do tricks to deal with mistakes. We call this dealing with outliers, so wrong data associations.

00:41:47: And there are some techniques for taking into account that something is potentially an error, so that you basically don't weight it so much. This also makes your system much more robust, and it can deal with this.

00:41:59: So this is one track of research, dealing with better data associations, which is still under investigation and is researched at universities but also at companies.
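
One common family of such tricks is robust weighting, where constraints with a large error get a smaller weight. The sketch below uses Huber weights inside iteratively reweighted least squares on the simplest possible problem, estimating one distance from several measurements. The choice of the Huber function, the threshold, and the data are assumptions for illustration; the conversation does not name a specific technique.

```python
import numpy as np

def huber_weights(residuals, delta):
    """Weight 1 for small residuals, delta / |r| for residuals beyond delta."""
    return delta / np.maximum(np.abs(residuals), delta)

def robust_mean(measurements, delta=1.0, iterations=20):
    """Iteratively reweighted least squares estimate of a single value."""
    estimate = np.median(measurements)
    for _ in range(iterations):
        weights = huber_weights(measurements - estimate, delta)
        estimate = np.sum(weights * measurements) / np.sum(weights)
    return estimate

# Four consistent measurements of a 3 m offset and one wrong data association.
measurements = np.array([3.0, 3.1, 2.9, 3.0, 13.0])

plain = measurements.mean()
robust = robust_mean(measurements)

print(f"plain least squares : {plain:.2f} m")
print(f"robust (Huber, IRLS): {robust:.2f} m")
```

The plain average is dragged to 5 m by the single outlier, while the down-weighted estimate stays near the consistent measurements.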

00:42:07: The second big thing is really dynamic environments: what happens if the world around you is not static? That's something which is key for autonomous driving,

00:42:15: but there you still have GPS information most of the time and pretty good incremental estimates, so that you don't increase your uncertainty too much over time.

00:42:25: But think about indoor environments, where we don't have this global positioning system which can

00:42:29: help you at least to eliminate a lot of the potential ambiguities. Then this is harder, and robots operating in really crowded spaces, this is still a challenge. And how do you represent space, how do you represent

00:42:43: common changes? Sometimes you have changes which are not highly dynamic, but people are moving things around. Think of a warehouse which is completely empty initially,

00:42:51: then a truck comes, a lot of pallets are unloaded and put anywhere, and for a robot with just a 2D laser scanner the world looks very, very different.

00:42:58: And how do you take that into account? How do you incorporate this in your map but still not invalidate the other map? So what are actually good representations for dynamic environments, or

00:43:08: maybe even for environments which have a changing structure but some typical repetitive pattern? So there is still a lot of research which is going on in this direction, or needs to go on, because I think we don't really have

00:43:18: a solution which can really deal with changing environments in a very robust manner.

00:43:23: So are those actually the newest research directions that SLAM is going in? So sorting out

00:43:30: things like those you mentioned, and the second one, incorporating or using machine learning or AI where it's necessary or where it helps?

00:43:41: Are those the two main paths that it's going down now, or what else are you actually researching?

00:43:49: We are still working on the data association problem. Here learning plays a big role, so learning systems can actually allow you to make better data associations or come up with better hypotheses, because sometimes it's a problem which is very hard to formulate: how similar are two images?

00:44:03: You can compare pixel values, but even if you have the same viewpoint, it's actually a hard thing to do. And so this is where learning helps in getting better data associations.

00:44:12: We are also trying to combine

00:44:16: pure geometry and semantics, so putting meaning to things, knowing that something you see is a door, is a chair, is a table, is a couch. So combining this geometric estimation with semantic estimation, and as soon as you go into the semantics world, you need to have learning algorithms, because that's the only way

00:44:32: to learn,

00:44:33: whatever, that this looks like a chair, that looks like a bed, that looks like a couch. And integrating this into SLAM systems is also one track of research that we are doing,

00:44:42: dealing with objects, basically having an object-based SLAM system in which you take objects in the environment into account. That is also something that we as humans partially use.

00:44:50: So we have an idea which objects are typically moving and which objects are typically static,

00:44:53: and we rely on the static ones. So if you walk through a city, it's better to localize based on buildings than on cars, and this is something that you can also give to autonomous robots if they have an awareness of what they are actually seeing, and it makes the system more robust.

00:45:07: And the dynamic environment representation is, from my point of view, still the thing which is most unclear, because we don't have a good model, at least from my point of view, that would solve all those problems.

00:45:16: Well, very interesting, Cyrill. Thank you very much for giving us some guidance on SLAM and the technology beneath it,

00:45:25: behind it, and I hope it was also interesting for our audience, our logistics audience. And thank you very much.

00:45:35: It was my pleasure to be here, and it was great talking to you. Whenever you want to dive deeper into SLAM,

00:45:41: call me, come back to me, I'm happy to help out. Yeah, one can really feel the fun that you have playing around with stuff like that. Thanks, thank you very much, bye bye.

00:45:55: Alright, that was a very nice deep dive into the topic of SLAM in logistics. I hope you learned as much as I did. If so, make sure to subscribe to the podcast so you don't miss any of the future episodes of The Logistics Tribe.

00:46:08: Thanks for listening, I'm Boris Felgendreher. Until next time.
