A brave new world

When developing new technologies, we tend to evaluate a given system on its performance in well-defined and narrow tasks, but a defining characteristic of intelligence lies in the system’s ability to interact with and adapt to its surroundings. Our current approach to developing autonomous vehicles very much follows the former pattern. The tasks most of today’s autonomous vehicles are designed to solve are shallow and simple, far from what a human would consider to be ‘navigation’.

A vehicle driving by itself on a highway has little ability (if any) to adapt to its environment. From the perspective of such an agent, the entire world consists of a lane. Although additional levels of complexity may be procedurally added to this world view, its goal is to stay centered in this lane and to avoid collisions. Deviating from the goal is allowed only when necessary, and only within certain margins. For the most part, the GPS device is the one in charge of route planning; the vehicle only has to follow the designated path. Driving in an urban area can be seen as the same idea of lane-keeping, but with traffic lights and signs, narrower margins for deviation, and a lot of extra obstacles.

The performance of our technology in such tasks is getting better every day. However, a major limitation of this approach is that it is rather hard to program a machine to view other road vehicles as decision-making agents. When people take driving lessons, they are taught the rules of traffic: what to do when they see a given sign, who has priority over whom in a given situation, and so on. In practice, though, this knowledge is only a tool that we use to judge how to drive around others. The amount of theory of mind that goes into our daily driving is often underappreciated. Take as an example a car that wants to merge into the lane you are driving in. We don’t simply derive what the driver is trying to do from his use of the blinkers; there is a whole set of cues that he is broadcasting to communicate his intentions to you. How close he is getting to the edge of his lane, his acceleration or deceleration, even the direction in which he is gazing (assuming you can see him) are all information we use to infer his intentions. This list might even extend to sources of information we are not consciously aware of. My point here is not that we should put more effort into implementing these communication behaviours in autonomous vehicles, but rather to highlight the driving efficiency to be gained when there is proper communication in traffic.

Now, teaching autonomous vehicles to detect and understand some of these behaviours is not the biggest issue. A whole other discussion is the extent to which such vehicles can be programmed to perform these physical behaviours themselves, as some of them might be too hard to implement reliably, especially to the safety standard that traffic regulations will require. What if instead there were a method to verbally communicate intentions? This idea might initially sound absurd in a human-to-human context. How would we even communicate across cars? It is not like we can install some radio device in every car and begin broadcasting every one of our intentions. Not only would it be incredibly distracting for a driver, but how would we reliably know that someone was trying to communicate a message specifically to us? Furthermore, we could never hope to hold several communication streams at once.

This is where autonomous vehicles begin to show their advantages over human drivers. Two agents could communicate their intentions, for all intents and purposes, instantly, and each agent could do so with everyone else independently. Similar to how devices such as phones and laptops automatically establish one-to-one connections to your home router, vehicles could dynamically communicate with every other smart vehicle in the vicinity. Once such a system is in place, optimizations to traffic beyond reducing the risk of accidents begin to make themselves apparent. Let’s say a traffic light has just turned green. A vehicle no longer has to wait for visual confirmation that the car in front has begun moving in order to start accelerating. Milliseconds after the light turns green, every agent would have received radio confirmation that every other agent in front of it is moving forward. In an intersection where all cars are autonomous, it would appear to a human observer as if all cars started moving simultaneously.
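To put rough numbers on the green light example, here is a minimal sketch in Python. It rests on simplifying assumptions that are mine and not measurements: every human driver needs the same fixed reaction time (1 second here) and only reacts once the car ahead has started, while connected vehicles all receive the same broadcast after a fixed latency (10 milliseconds here). Acceleration and safe following distances are ignored, and the function names are hypothetical.

```python
from typing import List


def human_queue_delays(n_cars: int, reaction_time_s: float = 1.0) -> List[float]:
    """Seconds after green at which each queued car starts moving.

    Car 0 reacts to the light; every later car reacts to the car ahead,
    so the delays accumulate down the line.
    """
    return [(i + 1) * reaction_time_s for i in range(n_cars)]


def connected_queue_delays(n_cars: int, latency_s: float = 0.01) -> List[float]:
    """Seconds after green at which each connected car starts moving.

    Every car hears the same broadcast, so the delay does not depend on
    the car's position in the queue.
    """
    return [latency_s for _ in range(n_cars)]


if __name__ == "__main__":
    cars = 10
    print("last human-driven car starts after", human_queue_delays(cars)[-1], "s")
    print("last connected car starts after", connected_queue_delays(cars)[-1], "s")
```

Under these assumed values, the tenth human driver starts moving ten seconds after the light changes, while the tenth connected car starts at the same moment as the first.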

In fact, most (if not all) traffic issues arise from humans only being capable of sub-optimal communication, in combination with slow reaction times. The solutions self-driving vehicles have to offer extend beyond traffic lights. Highway traffic jams during rush hour? They arise from literally the same issue as in our traffic light example. Human drivers have slow reaction times, and thus brake and accelerate one after another in a line, causing jams to form and propagate backwards. The YouTuber ‘CGP Grey’ has a great video on this very subject. His animations are a lot more satisfying than I could ever hope my explanations to be, and I highly recommend watching it.

For the sake of argument, say we banned human drivers from the road. We could equip autonomous vehicles with larger vocabularies to accommodate more abstract intentions such as ‘wanting to go to point X taking path Y’. If everyone on the road is aware of every intention and route every other vehicle is planning to take, we reach a point where traffic signs are no longer needed. Why would you need traffic lights at an intersection when every car can efficiently plan a path around the intended path of everyone else? The reason we humans require traffic rules is that we need an arbitrary rule-set to determine who has right of way over whom. For human drivers, intersections without traffic lights (let alone traffic signs) are incredibly inefficient, as everyone needs to slow down, see where everyone else is going, and carefully path around each other. In a world where every vehicle knows the precise location, velocity, and planned path of everyone else, there would hardly be a reason to use the brakes. A good analogy to such a system is how we humans walk in crowded public spaces. Although we do so unconsciously, we predict the movement of everyone around us, not only based on their velocity, but also based on where their body language indicates they plan on moving. This lets us walk while hardly ever bumping into other people. It is funny how we plan our own movement around others so effortlessly without explicit communication, yet the thought of vehicles being allowed to behave in the same manner is often considered outrageous.

Despite how cool such a world sounds, believing that all humans can be banned from driving overnight is quite naive. This, however, does not mean that an interconnected network of vehicles has no place on the road. After all, maintaining a traffic sign system costs billions a year (a single traffic light alone requires $8,000 a year in electricity costs). That isn’t a bad incentive to get us started. If we can optimize a simple traffic intersection locally through vehicle-to-vehicle communication, what could we accomplish once we plan traffic at the scale of entire cities?

Let’s take a step back. Before we have a fully autonomous interconnected network of vehicles, the traffic system will have to accommodate human drivers as well. Even if human drivers cannot communicate with autonomous vehicles directly, there is benefit to be had from their cars broadcasting information. The angle of the steering wheel, the use of the brake and gas pedals, and so on can convey useful information about a human driver’s intent for autonomous vehicles to use. This could be taken one step further by monitoring the driver’s gaze and attention with a dashboard camera, which could be used to better predict intent. Broadcasting this kind of information could have human-driven cars mimic autonomous vehicles to an extent. When no human drivers are detected in the surroundings and every vehicle is broadcasting high-fidelity information about its intent, autonomous vehicles could follow their highly optimized set of behaviours in order to improve traffic efficiency. On the other hand, as soon as a human driver is detected in the surroundings (or there is a vehicle broadcasting only limited data), driving behaviours could be altered so as to make traffic safer and more intuitive for humans.
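The switching rule described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the three fidelity levels, the two driving modes, and all names (`Fidelity`, `Neighbour`, `choose_mode`) are hypothetical and do not come from any existing vehicle-to-vehicle standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Fidelity(Enum):
    NONE = 0     # not broadcasting at all (old car, faulty radio, ...)
    LIMITED = 1  # e.g. a human-driven car sharing steering and pedal data
    FULL = 2     # autonomous vehicle sharing intent and planned path


class DrivingMode(Enum):
    OPTIMIZED = "optimized"
    HUMAN_FRIENDLY = "human_friendly"


@dataclass(frozen=True)
class Neighbour:
    vehicle_id: str
    fidelity: Fidelity
    human_driven: bool


def choose_mode(neighbours: Iterable[Neighbour]) -> DrivingMode:
    """Fall back to human-friendly driving as soon as any nearby road user
    is human-driven or is not broadcasting full-fidelity data."""
    for n in neighbours:
        if n.human_driven or n.fidelity is not Fidelity.FULL:
            return DrivingMode.HUMAN_FRIENDLY
    return DrivingMode.OPTIMIZED


if __name__ == "__main__":
    av = Neighbour("av-1", Fidelity.FULL, human_driven=False)
    human = Neighbour("car-7", Fidelity.LIMITED, human_driven=True)
    print(choose_mode([av, av]))
    print(choose_mode([av, human]))
```

The rule is deliberately conservative: a single uncertain neighbour is enough to leave the optimized mode.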

This brings me to another natural extension to the vehicle-to-vehicle communication vocabulary. There is no reason to stop at intent-related information. Any kind of data could be transmitted between vehicles. Take the example above of having vehicles decide how to drive based on whether a human driver is present in the environment. It is possible for some vehicle to be using the road and, for whatever reason, not be communicating, be it because it is an old vehicle without the capability to do so, because it is experiencing transmission problems, or because it is not a vehicle at all. Rather than having every autonomous vehicle independently detect non-communicating road users, it would be a lot more efficient if everyone also broadcast to each other the presence of every detected road user foreign to the network of autonomous vehicles. Also, why stop at foreign road users? If there are objects on the road that other vehicles could benefit from being aware of, those could be broadcast too. Maybe your autonomous car has no direct view of an obstacle on the road ahead (say the vehicle in front is blocking the view). Having the vehicles ahead of your car broadcast such potential dangers could allow for earlier and safer maneuvers.

Assuming the transmission bandwidth and the vehicle’s processing power are sufficient, there is no limit to what could be transmitted. In a perfect world, your autonomous vehicle could receive all sensor readings and path planning data from every other car in your surroundings. In practice, the combined resolution of all that sensory data exceeds what a single on-board computer can process. Each vehicle has enough trouble building its own representation of the environment. A more sensible approach would be for each vehicle to communicate its individual world model to others. A given agent would then integrate all received world models to create a wider, more accurate internal representation of the world. This kind of technology might sound futuristic, but in reality we already have functional pieces. Self-driving technology is already able to build 3-dimensional point clouds, merge separate world views, and recognize objects within them, all in real time.
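As a toy illustration of integrating received world models, the Python sketch below represents each model as a coarse occupancy grid: a mapping from a grid cell to the probability that the cell is occupied. Everything here is an assumption made for the example, including the grid representation, the premise that all vehicles already share one coordinate frame, and the rule of keeping the highest reported probability per cell. Real systems fuse far richer data with more careful statistics.

```python
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]         # (x, y) index of a grid cell in a shared frame
WorldModel = Dict[Cell, float]  # cell -> probability that it is occupied


def merge_world_models(models: Iterable[WorldModel]) -> WorldModel:
    """Combine several occupancy grids into one.

    For each cell, keep the highest occupancy probability any vehicle
    reported. This is a cautious simplification: an obstacle seen by one
    vehicle is never 'averaged away' by others that could not see it.
    """
    merged: WorldModel = {}
    for model in models:
        for cell, probability in model.items():
            merged[cell] = max(merged.get(cell, 0.0), probability)
    return merged


if __name__ == "__main__":
    own_view = {(0, 0): 0.2, (0, 1): 0.1}
    car_ahead = {(0, 0): 0.9, (1, 0): 0.5}  # sees an obstacle we cannot
    print(merge_world_models([own_view, car_ahead]))
```

The merged model covers cells neither vehicle could see on its own, which is the ‘wider’ representation the paragraph above describes.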

We have so far only concerned ourselves with short-range communication between vehicles, but road information can be useful to agents outside of your immediate surroundings. Take navigation services, for instance: they use traffic information to calculate the fastest routes based on traffic density. They thus benefit from knowing the locations of as many vehicles as possible. Vehicles could be equipped with broadband network technologies like 4G to connect them to the internet. Autonomous vehicles could thus have access to information from anywhere in the world. Suppose there has been an accident on your usual route to work and the road is blocked. Your vehicle could acquire this information from the internet as it is being uploaded by vehicles present at the location, and recalculate its own route on the fly. The information does not even need to be directly beneficial to road users. Maybe your car’s cameras saw a storm approaching, and a weather forecasting service might buy the sensor readings captured by your car. Or say you were riding in a truck through some remote town: Google might pay you for your data in order to keep their Street View service up to date. With sensors such as LiDAR, the resolution of your 3D data might be useful for a lot more purposes than you think. Maybe your local government would appreciate information on any recent potholes your vehicle detected. Even in places where a broadband connection is not available, road vehicles are large enough to be equipped with satellite receivers. These are hardly larger than a pizza box and are getting cheaper by the year. With upcoming projects such as Starlink, a single device could allow for a high-bandwidth internet connection worldwide.

The infrastructure behind buying and selling data is a whole topic of its own. I like to think there are two possible approaches to it: centralized and decentralized, each having its ups and downs. Centralized is the approach whereby everyone’s data would be managed by a central entity. I’ll bring up Google Street View again, as it serves as an appropriate example. As you travel through a location that is relatively outdated on Google’s servers, your recently gathered mapping of the terrain could be used to update their service. Whether the user is paid for this information or not is a different question. It might be that manufacturers take the ‘service cost’ approach where, by opting to use the vehicle and its integrated services, you give up any right to the collected data and agree to become a data gatherer for them. The upside of this approach is a simpler user experience, for which you obviously have to give up privacy and control.

I could see the decentralized approach taking a form similar to how peer-to-peer (P2P) networks work. To put it simply, users could put information up for sharing online, which other users can access. Whether this data is shared across the internet or via short-range radio transmission would probably depend on the content. Again, some data might preserve its value for long periods of time, be it because it originates from remote areas or from interesting time periods, which could allow you to put it up for download on the internet. Other data might not be worth storing long term. For example, traffic density information might not be useful to anyone after a few hours. However, vehicles driving in the opposite direction to yours 5 minutes after you acquired it might even be willing to pay a few cents for it. Given that we can expect most vehicles to be electric in the future, these are cents that could go into recharging the vehicle.

A question that such a decentralized system might bring up is how to ensure the data you acquire is reliable. In a centralized system, it is easy to require users to verify their identity and their information’s integrity. A solution for the decentralized system could come from cryptocurrencies. Some cryptocurrencies allow a summary of the traded data to be displayed publicly in the transaction details. If an agent decides to trade fraudulent information, it could be traced back to that agent, which would have consequences for its reputation.
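One way to picture such a public ‘summary’ is a cryptographic hash of the traded data. The Python sketch below is an assumption of mine and not how any particular cryptocurrency works: the seller publishes a SHA-256 digest alongside the transaction, and the buyer recomputes it over the data actually received. The function names and the example payload are hypothetical.

```python
import hashlib
import json


def summarize(payload: dict) -> str:
    """Return a SHA-256 hex digest of the payload.

    The payload is serialized with sorted keys and fixed separators so
    that the same data always produces the same digest.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_summary(payload: dict, published_summary: str) -> bool:
    """Check that the received payload is the one the seller committed to."""
    return summarize(payload) == published_summary


if __name__ == "__main__":
    # Hypothetical traffic density report offered for sale.
    report = {"road": "A4", "density": 0.7, "minute": 512}
    published = summarize(report)  # would be recorded with the transaction

    tampered = dict(report, density=0.1)
    print(matches_summary(report, published))    # same data as committed
    print(matches_summary(tampered, published))  # altered after the fact
```

Note that a matching digest only shows the data was not altered after the seller committed to it. It says nothing about whether the data was truthful in the first place, which is where the reputation mechanism would have to come in.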

On the topic of using your vehicle to make money, one very interesting idea is that of vehicle sharing. Given that most people only use their car for going to and coming back from work, the vehicle spends a lot of its lifetime unused. If vehicles are able to drive themselves, then after your car has driven you to work, it could drive itself back home and pick your kids up to take them to school. One can thus see how, by taking this idea of car sharing further, we could arrive at an era where we share several cars with large groups of people. Imagine the service Uber currently provides, but without anyone needing to drive the vehicle. Taxi services could simply become a network of self-driving vehicles trying to optimize human transportation. Such vehicles could even be programmed to refuel themselves. If the vehicles are owned by the municipality, one could then see it as another means of public transport. I don’t think this idea would end private vehicles altogether, as I am sure many people would not like the idea of sharing a car that spent the night driving drunk customers around, but I can imagine it reducing the need for so many privately owned cars parked all day everywhere on the street. One could also imagine a different system where your private car can be shared with a list of people to whom you have granted permission. Families, neighbours, or companies could have their private pools of cars. Whenever you want a ride, you could pull out your phone and, out of the vehicles you have been given permission to use, request the fastest available one to come pick you up.

Something I must bring up is that the communication infrastructure required for these ideas is by no means trivial to put in place. For the same reason you can’t travel to any place in the world and expect people to understand your native language, it is likely the autonomous vehicle industry will have to collectively agree upon a standard communication protocol. Imagine the trouble that might arise on the road if Android trucks were not able to speak to Apple cars. Although this might take industry and governmental effort, some years down the line we may have a system as well integrated and regulated as aircraft communication is today.

Although I think the technology might take years to begin shifting towards fully autonomous roads, I believe it is only a matter of time before these ideas are widespread enough for the industry to begin pushing them forward. We might just wake up one day and find ourselves having to explain to our grandchildren why people were ever allowed to drive.