Wapo journalist verifies that robotaxis fail to stop for pedestrians in a marked crosswalk 7 out of 10 times. Waymo admitted that it follows “social norms” rather than laws.
The reason is likely to compete with Uber, 🤦
Wapo article: https://www.washingtonpost.com/technology/2024/12/30/waymo-pedestrians-robotaxi-crosswalks/
Cross-posted from: https://mastodon.uno/users/rivoluzioneurbanamobilita/statuses/113746178244368036
People, and especially journalists, need to get this idea of robots as perfectly logical computer code out of their heads. These aren’t Asimov’s robots we’re dealing with. Journalists still cling to the idea that all computers are hard-coded. You still sometimes see people navel-gazing on self-driving cars, working the trolley problem. “Should a car veer into oncoming traffic to avoid hitting a child crossing the road?” The authors imagine that the creators of these machines hand-code every scenario, like a long series of if statements.
But that’s just not how these things are made. They are not programmed; they are trained. In the case of self-driving cars, they are simply given a bunch of video footage and radar records, and the accompanying driver inputs in response to those conditions. Then they try to map the radar and camera inputs to whatever the human drivers did. And they train the AI to do that.
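To make “trained, not programmed” concrete, here is a toy imitation-learning sketch. It is only an illustration of the idea described above, not a claim about how Waymo or anyone else builds their system. The logged data, the two features, and the `nearest_neighbor_policy` function are all made up for the example.

```python
# Hypothetical driving log: (distance to pedestrian in metres, speed in m/s)
# paired with what the human driver did: 1 = braked, 0 = kept going.
LOGGED_DRIVES = [
    ((5.0, 10.0), 1),
    ((8.0, 12.0), 1),
    ((30.0, 10.0), 0),
    ((25.0, 8.0), 0),
    ((12.0, 11.0), 0),  # a human who did not yield at the crosswalk
    ((11.0, 10.0), 0),  # and another one
]


def nearest_neighbor_policy(observation, log=LOGGED_DRIVES):
    """Copy the action from the most similar situation in the log."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, action = min(log, key=lambda item: squared_distance(item[0], observation))
    return action


# There is no "stop for pedestrians" rule anywhere in this code.
# The policy brakes only where the logged humans braked.
print(nearest_neighbor_policy((5.5, 10.0)))   # close to a logged braking case
print(nearest_neighbor_policy((11.5, 10.5)))  # close to the non-yielding cases
```

If the humans in the log roll through at 11 to 12 metres, so does the copy. Changing that means changing the data or the training objective, not editing an if statement.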
This behavior isn’t at all surprising. Self-driving cars, like any similar AI system, are not hard-coded, coldly logical machines. They are trained off us, off our responses, and they exhibit all of the mistakes and errors we make. The reason Waymo cars don’t stop at crosswalks is that human drivers don’t stop at crosswalks. The machine is simply copying us.
Training self-driving cars that way would be irresponsible, because they would behave unpredictably and could be really dangerous. In reality, self-driving cars use AI only for the tasks it is really good at, like object recognition (e.g. recognizing traffic signs, pedestrians and other vehicles). The car uses all this data to build a map of its surroundings and tries to predict what the other participants are going to do. Then it decides whether it’s safe to move the vehicle, and which path it should take. All of these things can be done algorithmically; AI is only necessary for object recognition.
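A rough sketch of that split, with the learned part stubbed out and everything downstream written as plain code. The `TrackedObject` fields, the constant-velocity prediction, and all of the numbers are assumptions for illustration; real stacks are far more involved.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    label: str   # would come from a learned classifier; hard-coded in this sketch
    x: float     # metres ahead of the car
    y: float     # metres left/right of the lane centre
    vx: float    # metres per second along the road
    vy: float    # metres per second across the road


def predict_position(obj, t):
    """Constant-velocity guess at where the object will be after t seconds."""
    return obj.x + obj.vx * t, obj.y + obj.vy * t


def path_is_clear(objects, car_speed=10.0, horizon_s=3.0, step_s=0.5,
                  lane_half_width=1.5, safety_gap=5.0):
    """Step forward in time and check whether anything ends up in our lane near us."""
    steps = int(horizon_s / step_s)
    for i in range(steps + 1):
        t = i * step_s
        car_x = car_speed * t  # where the car will be if it keeps going
        for obj in objects:
            ox, oy = predict_position(obj, t)
            if abs(oy) <= lane_half_width and abs(ox - car_x) <= safety_gap:
                return False
    return True


walker = TrackedObject("pedestrian", x=20.0, y=4.0, vx=0.0, vy=-1.5)
bin_ = TrackedObject("garbage bin", x=20.0, y=4.0, vx=0.0, vy=0.0)

print(path_is_clear([walker]))  # someone walking toward the lane
print(path_is_clear([bin_]))    # something sitting still beside the road
```

Note that `path_is_clear` never looks at `label`. In this sketch the classifier only feeds the tracker, and the go/no-go decision is ordinary, inspectable code.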
In cases such as this, just follow the money to find the incentives. Waymo wants to maximize their profits. This means maximizing how many customers they can serve as well as minimizing driving time to save on gas. How do you do that? Program their cars to be a bit more aggressive: don’t stop on yellow, don’t stop at crosswalks except to avoid a collision, drive slightly over the speed limit. And of course, lobby the shit out of every politician to pass laws allowing them to get away with breaking these rules.
According to some cursory research (read: Google), obstacle avoidance uses ML to identify objects, and uses those identities to predict their behavior. That stage leaves room for the same unpredictability, doesn’t it? Say you only have 51% confidence that a “thing” is a pedestrian walking a bike, 49% that it’s a bike on the move. The former has right of way and the latter doesn’t. Or even 70/30. 90/10.
There’s some level where you have to set the confidence threshold to choose a course of action and you’ll be subject to some ML-derived unpredictability as confidence fluctuates around it… right?
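A tiny sketch of that worry, assuming a single made-up threshold and a made-up stream of classifier confidences for the same object over consecutive frames:

```python
def decide(p_pedestrian, threshold=0.5):
    """Yield only if the classifier is at least `threshold` sure it is a pedestrian."""
    return "yield" if p_pedestrian >= threshold else "proceed"


# Hypothetical per-frame confidences hovering around the threshold.
confidences = [0.49, 0.51, 0.48, 0.52, 0.50]
decisions = [decide(p) for p in confidences]

# Count how often the decision changes from one frame to the next.
flips = sum(1 for a, b in zip(decisions, decisions[1:]) if a != b)

print(decisions)
print(flips)
```

The input barely moves, but the output changes its mind several times. Wherever the threshold is set, confidences that hover around it will do this.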
In such situations, the car should take the safest action and assume it’s a pedestrian.
But mechanically that’s just moving the confidence threshold to 100% which is not achievable as far as I can tell. It quickly reduces to “all objects are pedestrians” which halts traffic.
This would only be in ambiguous situations when the confidence level of “pedestrian” and “cyclist” are close to each other. If there’s an object with 20% confidence level that it’s a pedestrian, it’s probably not. But we’re talking about the situation when you have to decide whether to yield or not, which isn’t really safety critical.
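Roughly what I mean, as a sketch. The `margin` value and the function name are invented for the example, and this only covers the yield decision; collision avoidance is a separate matter.

```python
def yield_decision(p_pedestrian, p_cyclist, margin=0.3):
    """Decide whether to yield when an object is a pedestrian or a cyclist.

    If the two confidences are within `margin` of each other, the situation
    is ambiguous, so take the cautious option and treat it as a pedestrian.
    """
    if abs(p_pedestrian - p_cyclist) <= margin:
        return "yield"
    return "yield" if p_pedestrian > p_cyclist else "proceed"


print(yield_decision(0.51, 0.49))  # ambiguous, so treated as a pedestrian
print(yield_decision(0.20, 0.80))  # clearly a cyclist
print(yield_decision(0.90, 0.10))  # clearly a pedestrian
```

This doesn’t require 100% confidence and doesn’t turn every object into a pedestrian. It only biases the close calls toward yielding.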
The car should avoid any collisions with any object regardless of whether it’s a pedestrian, cyclist, cat, box, fallen tree or any other object, moving or not.
All of which takes you back to the headline, “Waymo trains its cars to not stop at crosswalks”. The company controls the input, it needs to be responsible for the results.
Some of these self-driving car companies have successfully lobbied to stop cities from ticketing their vehicles for traffic infractions. They claim these cars are so much better than human drivers, yet they won’t stand behind that claim. Instead they demand special rules for themselves and no consequences.
The machine can still be trained to actually stop at crosswalks the same way it is trained to not collide with other cars even though people do that.
I think the reason non-tech people find this so difficult to comprehend is the poor understanding of what problems are easy for (classically programmed) computers to solve versus ones that are hard.
if ( person_at_crossing ) then { stop }
To the layperson it makes sense that self-driving cars should be programmed this way. After all, this is a trivial problem for a human to solve. Just look, and if there is a person you stop. Easy peasy.
But for a computer, how do you know? What is a ‘person’? What is a ‘crossing’? How do we know if the person is ‘at/on’ the crossing as opposed to simply near it or passing by?
To me it’s this disconnect between the common understanding of computer capability and the reality that causes the misconception.
You can use that logic to say it would be difficult to do the right thing for all cases, but we can start with the ideal case.
I think you could liken it to training a young driver who doesn’t share a language with you. You can demonstrate the behavior you want once or twice, but unless all of the observations demonstrate the behavior you want, you can’t say “yes, we specifically told it to do that”
Most walkways are marked. The vehicle is able to identify obstructions in the road and things on the side of the road that are moving towards the road just like cross street traffic.
If (thing) is crossing the street, then stop. If (thing) is stationary near a marked crosswalk, stop, and if they don’t move within (x) seconds, then go.
You know, the same way people are supposed to handle the same situation.
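Written out as code, assuming the car already knows whether the thing is crossing or just standing near the crosswalk (which is the hard part). The 3-second default stands in for the unspecified (x) and is purely a placeholder.

```python
def crosswalk_action(is_crossing, near_crosswalk, seconds_waited, max_wait_s=3.0):
    """Return "stop" or "go" for one object near the road.

    is_crossing:    the object is moving across the street
    near_crosswalk: the object is standing near a marked crosswalk
    seconds_waited: how long the car has already been stopped for it
    """
    if is_crossing:
        return "stop"
    if near_crosswalk and seconds_waited < max_wait_s:
        return "stop"  # give them a chance to start crossing
    return "go"


print(crosswalk_action(True, True, 0.0))    # someone is crossing
print(crosswalk_action(False, True, 1.0))   # someone is waiting, car just stopped
print(crosswalk_action(False, True, 4.0))   # they still have not moved
```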
Most crosswalks in the US are not marked, and in all places I’m familiar with vehicles must stop or yield to pedestrians at unmarked crosswalks.
At unmarked crosswalks and marked but uncontrolled crosswalks we have to handle the situation with social cues about which direction the pedestrian wants to cross the street/road/highway and if they will feel safer crossing the road after a vehicle has passed than before (almost always for homeless pedestrians and frequently for pedestrians in moderate traffic).
If waymo can’t figure out if something intends or is likely to enter the highway they can’t drive a car. Those can be people at crosswalks, people crossing at places other than crosswalks, blind pedestrians crossing anywhere, deaf and blind pedestrians crossing even at controlled intersections, kids or wildlife or livestock running toward the road, etc.
Thing? Like a garbage bin? Or a sign?
Person, dog, cat, rolling cart, bicycle, etc.
If the car is smart enough to recognize a stationary stop sign, then it should be able to ignore a permanently mounted crosswalk sign or indicator light at a crosswalk and exclude those from the things that might move into the street. Or it could just stop and wait a couple of seconds if it isn’t sure.
A woman was killed by a self driving car because she walked her bicycle across the road. The car hadn’t been programmed to understand what a person walking a bicycle is. Its AI switched between classifying her as a pedestrian, cyclist, and “unknown”. It couldn’t tell whether to slow down, and then it hit her. The engineers forgot to add a category, and someone died.
It shouldn’t even matter what category things are when they are on the road. If anything larger than gravel is in the road the car should stop.
Difference is that humans (usually) come with empathy (or at least self-preservation) built in. With self-driving cars we aren’t building in empathy and self (or at least passenger) preservation, we’re hard-coding in scenarios where the law says they have to do X or Y.
Whether you call it programming or training, the designers still designed a car that doesn’t obey traffic laws.
People need to get it out of their heads that AI is some kind of magical monkey-see-monkey-do. AI isn’t magic, it’s just a statistical model. Garbage in = garbage out. If the machine fails because it’s only copying us, that’s not the machine’s fault, not AI’s fault, not our fault; it’s the programmer’s fault. It’s fundamentally no different than if they had designed a complicated set of logical rules to follow. Training a statistical model is programming.
Your whole “explanation” sounds like a tech-bro capitalist news conference sound bite released by a corporation to avoid guilt for running down a child in a crosswalk.
It’s not apologia. It’s illustrating the foundational limits of the technology. And it’s why I’m skeptical of most machine learning systems. You’re right that it’s a statistical model. But what people miss is that these models are black boxes. That is the crucial distinction between programming and training that I’m trying to get at. Imagine being handed a 10 million x 10 million matrix of real numbers and being told, “here, change this so it always stops at crosswalks.” It isn’t just some line of code that can be edited.
The distinction between training and programming is absolutely critical here. You cannot hand-wave away that distinction. These models are trained like we train animals. They aren’t taught through hard-coded rules.
And that is a fundamental limit of the technology. We don’t know how to program a computer to drive a car. Instead we only know how to make a computer mimic human driving behavior. And that means the computer can ultimately never perform better than an attentive, sober human with somewhat improved reaction time and visibility. But if there are any common errors that humans frequently make, they will be duplicated in the machine.
It’s obvious now that you literally don’t have any idea how programming or machine learning works, and so you think no one else does either. It is absolutely not some “black box” where the magic happens. That attitude (combined with your oddly misplaced condescension) is toxic and honestly kind of offensive. You can’t hand-wave away responsibility like this when doing any kind of engineering. That’s like first day ethics-101 shit.
That all sounds accurate, but what difference does it make how the shit works if the real world results are poor?
It’s telling that Tesla and Google, worth over 3 trillion dollars, haven’t been able to solve these issues.