How does an autonomous vehicle learn to avoid pedestrians and other obstacles?
Guest writer William Sachiti from the Academy of Robotics explains how autonomous vehicles are able to navigate around obstacles and avoid pedestrians…
To teach an autonomous vehicle to navigate its way around obstacles and avoid pedestrians – even if one runs out in front of it – we need to start by gathering huge quantities of data. To do this, a data-gathering car is used.
These custom-made vehicles, such as the example above produced by Pilgrim Motorsports with guidance from the Academy of Robotics in the UK, carry sophisticated specialist camera and computing equipment to gather the required autonomous driving data.
Its job is to go around a town capturing visual data in the form of video footage from up to 12 cameras with a combined 360-degree view around the car, as well as feedback from sensors and infrared detectors. This is all to gain a comprehensive understanding of the road environment and the road’s users, particularly in residential areas.
This data is taken back to a bank of supercomputers which watch it over and over again to learn. This type of computer science is called machine learning, and it uses evolutionary neural networks. Neural networks are computer systems modelled on the human brain and nervous system, and we run computer algorithms on them. In this way the algorithms not only learn but also evolve with each iteration. This is not dissimilar to how we, as humans, take driving lessons and learn a little more with each session.
Much like a child is taught what objects are at school, we take images of scenes similar to the roads where the car will drive. From these scenes we mark out (annotate) what the objects are. Using machine learning, we feed the annotated data to an algorithm, which begins to compare images and learn the difference between a car, a pedestrian, a cyclist, the road, the sky, etc.
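The annotate-then-learn loop can be sketched in a few lines of code. This is purely an illustration, not the production system: real training uses deep neural networks over millions of annotated video frames, whereas here each “image” is reduced to an invented two-number feature vector and the “learning” is a simple nearest-centroid rule.

```python
# Toy sketch of learning from annotated images.
# Each "image" is a 2-element feature vector (invented for illustration).

def train(annotated):
    """Average the feature vectors seen for each label (the learning step)."""
    sums, counts = {}, {}
    for features, label in annotated:
        s = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lbl: [v / counts[lbl] for v in s] for lbl, s in sums.items()}

def classify(centroids, features):
    """Assign the label whose learned centroid is closest to the new image."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lbl: dist(centroids[lbl], features))

# Hand-annotated training data: (feature vector, label)
annotated = [
    ([0.9, 0.1], "car"), ([0.8, 0.2], "car"),
    ([0.1, 0.9], "pedestrian"), ([0.2, 0.8], "pedestrian"),
]
centroids = train(annotated)
print(classify(centroids, [0.85, 0.15]))  # -> car
print(classify(centroids, [0.15, 0.85]))  # -> pedestrian
```

The principle is the same at scale: annotated examples pull the model towards features that separate one class of object from another, so an unseen image can be labelled by similarity to what was learned.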
After some time of doing this, and of showing the computer more complex or harder-to-understand scenes, the algorithm eventually figures out the rest by applying what it has been taught to what it sees.
Now that the algorithm can tell what objects are, we attach multiple cameras looking in all directions. In real time, the algorithm is able to identify pretty much everything that is relevant in a scene. Using onboard supercomputers performing up to 7 trillion calculations per second, the camera data is interpreted to reveal something like the image below.
The next step is to predict what each person, car, bicycle and traffic light is going to do next.
In the real world, if your smartphone were to slip from your fingers and start to fall, you know it will hit the ground; it is a simple predictable action with an inevitable result.
Similarly, the vehicle is able to see and identify pedestrians, cars, bicycles, etc., then predict multiple realistic potential scenarios and take action based on which scenarios are most likely to happen.
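The simplest form of this kind of prediction can be sketched as extrapolation: if we have tracked an object’s recent positions, we can project where it will be a few moments from now. This is a hedged, minimal illustration (a constant-velocity assumption with made-up coordinates), not the vehicle’s actual prediction system, which weighs many possible scenarios at once.

```python
# Minimal sketch of short-horizon motion prediction.
# Assumes constant velocity: the object keeps moving the way it just did.

def predict_path(positions, steps):
    """Extrapolate future (x, y) positions from the last two observations."""
    (x0, y0), (x1, y1) = positions[-2], positions[-1]
    vx, vy = x1 - x0, y1 - y0  # velocity per frame
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, steps + 1)]

# A tracked pedestrian's last three positions (illustrative units)
track = [(0, 0), (2, 1), (4, 2)]
print(predict_path(track, 3))  # -> [(6, 3), (8, 4), (10, 5)]
```

A real system would generate several such candidate paths, attach a likelihood to each, and plan around the most probable ones.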
This is possible because, from the frame of reference of the car, everything is happening very, very slowly. It sees the world at 1,000 frames per second. To it, all objects on screen are moving as slowly as snails!
A combined view of the world
We then fuse the findings from each camera creating a combined view of the world as seen by the car. This combined view gives us a more accurate account of what is happening in the world around it.
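One simple way to picture this fusion is as merging the detections each camera reports into a single shared map. The sketch below is an assumption-laden toy: each camera’s position is a fixed offset (real systems use full calibrated geometric transforms), and two detections are treated as the same object if they carry the same label and land close together in the shared frame.

```python
# Toy sketch of fusing per-camera detections into one combined view.
# offsets: each camera's position relative to the car (a simplification).

def fuse(camera_reports, offsets, radius=1.0):
    """Merge detections from all cameras, de-duplicating nearby matches."""
    merged = []
    for cam, detections in camera_reports.items():
        ox, oy = offsets[cam]
        for label, (x, y) in detections:
            wx, wy = x + ox, y + oy  # transform into the shared car frame
            for m in merged:
                # Same label within `radius`? Treat as the same object.
                if m[0] == label and (m[1] - wx) ** 2 + (m[2] - wy) ** 2 <= radius ** 2:
                    break
            else:
                merged.append((label, wx, wy))
    return merged

# Two cameras see the same pedestrian from different angles
reports = {
    "front": [("pedestrian", (3.0, 1.2)), ("car", (8.0, -1.0))],
    "left": [("pedestrian", (5.2, 0.0))],
}
offsets = {"front": (2.0, 0.0), "left": (0.0, 1.0)}
print(fuse(reports, offsets))  # two objects, not three
```

Because the overlapping sightings reinforce each other, the combined view is more reliable than any single camera’s report – which is the point of fusing them.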
There is a similar process for a vehicle to know how to keep in lane, where the road is and where it needs to be driving.
The example below shows a vehicle driving through a residential street in the UK.
As the vehicle needs to give way, it highlights in red the areas where it cannot drive and in green the areas it considers free space on the road. An entire algorithm, with its own neural network, has been trained to understand just the road, taking into account details like texture, colour, obstructions, etc.
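The red/green idea can be pictured as a small grid of cells in front of the car, each marked drivable or not. This is a deliberately crude stand-in: the real system is a trained segmentation network operating on camera pixels, whereas here a cell is “red” simply if it contains a known obstacle.

```python
# Toy sketch of a drivable-space grid: "green" = free, "red" = blocked.

def free_space(width, depth, obstacles):
    """Build a depth x width grid and mark obstacle cells as red."""
    grid = [["green"] * width for _ in range(depth)]
    for (x, y) in obstacles:
        if 0 <= x < width and 0 <= y < depth:
            grid[y][x] = "red"
    return grid

# A 4-cell-wide, 3-cell-deep patch of road with two obstacles
grid = free_space(4, 3, obstacles=[(1, 0), (2, 2)])
for row in grid:
    print(row)
```

A planner would then only consider paths that pass through green cells, which is in spirit what the vehicle does when it gives way.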
We also have subsystems for reading road markings and traffic signs. These subsystems, each running in its own neural network, are combined to create one super view of the world as the car sees it.
The end result is that, currently, some of our test vehicles driven by neural networks are already outperforming their human counterparts in many scenarios.
The first smartphones were giant bricks which could not do much more than make phone calls. As time went on, they got more advanced and could do more.
Self-driving vehicles are the result of years of computer science and their arrival is the next step in the evolution of vehicles. First, we saw vehicles with cruise control, then cruise control with lane assist, then self-parking and now we’re moving onto self-driving.
The first autonomous cars will do an excellent job of driving themselves on very specific routes. With time, the vehicles will begin to drive more complex roads and routes; eventually, they will connect to each other and share data between each other. It is a step-by-step process.
I predict that we will begin to see passenger-carrying self-driving cars on the roads in earnest by 2020, followed by a period of mass adoption between 2021 and 2025. The first self-driving cars you will see on the road are likely to be autonomous cars which deliver goods and don’t carry people. This is a simple, low-risk start with a valid use. Our own autonomous delivery vehicle, Kar-go, is scheduled for trials later this year.
William Sachiti is the Founder and CEO of the Academy of Robotics, a UK company specialising in creating ground-breaking robotics technologies such as autonomous vehicles. Its first commercial solution, Kar-Go, is an autonomous delivery vehicle which, it claims, will reduce the last-mile costs associated with deliveries. www.academyofrobotics.co.uk