this post was submitted on 30 Jun 2025
156 points (98.8% liked)
The blocker for Tesla is that it's processing 2D input in order to navigate 3D space. They use some AI trickery to build virtual anchor points from image stills across points in time to work around this and recover a 3D space, but the auto industry at large (not me) has collectively agreed this cannot overcome numerous serious challenges in realistic applications. The one people may be most familiar with is Mark Rober's test where the Tesla drives right into a wall painted to look like the road, Wile E. Coyote style, but this has real-world analogs such as complex weather.

Lidar and ultrasonics integrated into the chain of trust can mitigate a significant portion of the risk this issue causes (and already do for most ADAS systems); Volvo has shown that even low-resolution "cheap" lidar sensors without 360-degree coverage can offer most of these benefits. To be honest, I'm not certain the addition would fix everything; perhaps the engineering obstacles really were insurmountable... but from what I hear from the industry at large, from my friends in the space, and from my own common sense, I don't see how a wholly 2D implementation relying only on camera input can be anything but an insurmountable engineering challenge on the way to a final minimum viable product. From my understanding, it'd be like being told you have to use water, and only water, as your hydraulic fluid, or that you can only use a heat lamp to cook for your restaurant. It's just legitimately unsuitable for the purpose despite giving off the guise of doing the same work.
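To make that "chain of trust" point concrete, here's a rough Python toy I threw together (purely illustrative, not how any real ADAS stack is written; the names and thresholds are mine) showing why an independent depth measurement matters even when the camera is confident:

```python
# Toy illustration only: a camera-only pipeline has to *infer* depth, while a
# lidar return *measures* it. Fusing the two lets even a cheap, low-resolution
# lidar veto a confidently wrong camera estimate, e.g. a wall painted to look
# like open road.

from dataclasses import dataclass
from typing import Optional


@dataclass
class DepthEstimate:
    meters: float      # estimated distance to the nearest obstacle ahead
    confidence: float  # 0.0 .. 1.0, how much the source trusts itself


def fuse_forward_depth(camera: DepthEstimate, lidar: Optional[DepthEstimate]) -> float:
    """Return a conservative forward-distance estimate.

    Camera-only: there is no independent cross-check, so the inferred depth
    has to be trusted as-is. With lidar: take the closer of the two whenever
    the lidar is reasonably confident, since a direct time-of-flight
    measurement should overrule a learned guess about a textured surface.
    """
    if lidar is None:
        return camera.meters  # vision-only: nothing to catch a wrong inference
    if lidar.confidence >= 0.5:
        return min(camera.meters, lidar.meters)  # err toward the physical measurement
    return camera.meters


# The painted-wall scenario: the camera "sees" open road, the lidar sees a surface.
camera_guess = DepthEstimate(meters=80.0, confidence=0.9)   # fooled by the mural
lidar_return = DepthEstimate(meters=12.0, confidence=0.95)  # actual reflection

print(fuse_forward_depth(camera_guess, None))          # 80.0 -> brakes far too late
print(fuse_forward_depth(camera_guess, lidar_return))  # 12.0 -> obstacle flagged
```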
Also, I'd forgotten to mention: what you see in the on-screen representation is entirely divorced from the actual stack doing your driving. They're basically running a small video game on top of the virtual world map they build and rendering assets into it from there. It's meant to give you a reasonable look into what the car sees and might do, but they've confirmed it is in no way tied to the underlying neural decision network.
But that's exactly the point.
If the virtual map they're building from cameras is complete, correct and stable (and presumably meets some other criteria I can't think of off the top of my head), then the cameras would be sufficient.
The underlying neural decision network can still fuck things up from a correct virtual world map.
Now, how good is the virtual world map in real world conditions?
You're maybe missing the point, though? You can't take data, run it through what is essentially lossy compression, and then get the same data back out. The best you can do is a facsimile of it that suffers in some regard.
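If it helps, here's a toy pinhole-camera sketch (my own illustration, nothing to do with Tesla's actual pipeline; the focal length and points are made up) of what that "lossy compression" looks like: once you project 3D onto a 2D image plane, depth is divided away and can't be uniquely recovered.

```python
# Toy pinhole-camera projection. Projecting 3D -> 2D divides out the depth,
# so infinitely many different 3D points collapse onto the same pixel. That
# lost information is exactly what has to be re-guessed from motion, context,
# or learned priors.

FOCAL_LENGTH = 800.0  # arbitrary focal length in pixels


def project(x: float, y: float, z: float) -> tuple[float, float]:
    """Map a 3D point (camera coordinates, z = depth) to a 2D pixel."""
    return (FOCAL_LENGTH * x / z, FOCAL_LENGTH * y / z)


# Two very different scenes: a small obstacle 10 m away vs. one 4x as big and 4x as far.
near_point = (1.0, 0.5, 10.0)
far_point = (4.0, 2.0, 40.0)

print(project(*near_point))  # (80.0, 40.0)
print(project(*far_point))   # (80.0, 40.0) -> identical pixel, the depth is gone
```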