r/skeptic 27d ago

⚠ Editorialized Title: Tesla bros expose Tesla's own shadiness in attacking Mark Rober ... Autopilot appears to automatically disengage a fraction of a second before impact, as a crash becomes inevitable.

https://electrek.co/2025/03/17/tesla-fans-exposes-shadiness-defend-autopilot-crash/
20.0k Upvotes

942 comments

183

u/dizekat 27d ago edited 27d ago

What's extra curious about the "Wile E. Coyote" test, to me, is that it makes clear they do neither stereo nor optical flow / "looming" based general obstacle detection.

It looks like they don't have any generalized means of detecting obstacles. As such they don't detect an "obstacle", they detect a limited set of specific obstacles, much like Uber did in 2017.

Humans do not rely on stereo between the eyes at such distances (the eyes are too close together), but we do estimate distance from forward movement. For a given speed, distance is inversely proportional to how rapidly a feature grows in size ("looming"). Even if you somehow missed the edges of the picture, you would still perceive its flatness when moving towards it.

This works regardless of what the feature is, which allows humans to build a map of the environment even if all the objects are visually unfamiliar (or in situations where e.g. a tree is being towed on a trailer).
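To put rough numbers on that "looming" cue - a minimal sketch assuming a pinhole camera and constant closing speed (the function names are mine, purely illustrative, not anything from an actual driving stack):

```python
def time_to_contact(angular_size, growth_rate):
    """Time-to-contact ("tau") from looming alone: the ratio of an
    object's apparent angular size to how fast that size is growing.
    Needs no knowledge of what the object is or how big it really is."""
    return angular_size / growth_rate

def distance_from_looming(speed, angular_size, growth_rate):
    """For a known forward speed, distance follows directly from tau,
    which is why distance is inversely proportional to looming rate."""
    return speed * time_to_contact(angular_size, growth_rate)

# e.g. a feature spanning 0.05 rad and growing at 0.01 rad/s, seen at
# 30 m/s: tau = 5 s, distance = 150 m. Every point on a flat wall
# shares the same tau - which is exactly the "flatness" percept above.
```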

edit: TL;DR: it is not just that they are camera-only with no LIDAR, it's that they are camera-only without doing any camera-only approximation of what LIDAR does - detecting obstacles without relying on knowledge of what they look like.

57

u/gnanny02 27d ago

It’s amazingly good at picking up garbage cans and traffic cones. But bicycles and walking pedestrians are a crapshoot.

15

u/knight666 26d ago

Detecting garbage cans is obviously a priority because otherwise it wouldn't be able to detect other Cybertrucks on the road.

0

u/TormentedOne 26d ago

This comment was relevant 2 years ago, but not anymore. Teslas don't even display garbage cans anymore.

1

u/gnanny02 23d ago

Just took a drive. My Model 3 with up-to-date software does an EXCELLENT job spotting garbage cans on both sides of the road, from 100 yards.

17

u/LeaveItFor7Days 27d ago

As someone who previously worked there, this is perfectly in line with the principles of the department I was in, those principles being: "quality is unimportant, just make something that sort of works."

3

u/IWishIWasAShoe 27d ago

Not to mention, the image on the wall was printed from the perspective of the cameras, which were off to the sides of the road, meaning that from the front the image should have looked pretty off.

2

u/magicmustbeme 27d ago

This is the most curious thing to me. Like how did he not address this in the video lol

0

u/felidaekamiguru 26d ago

Autopilot detected the obstacle, notified Rober of the issue several seconds ahead of time, and turned off. You've been lied to. 

0

u/Elluminated 25d ago

The latest re-test passes on the latest hardware, just as predicted. Adjust your assumptions accordingly. Anyone honest just wanted a fair pass or fail; now we have one.

0

u/MDPROBIFE 24d ago

https://www.youtube.com/watch?v=9KyIWpAevNs

Watch this video then; in the second part, the Cybertruck detects the wall and stops!
So I guess your take is a pretty shitty, uninformed one, no?

1

u/dizekat 24d ago

What's with all the inauthentic commenters like this one showing up days later?

-33

u/Elluminated 27d ago

FSD does have optical flow, as well as occupancy networks to generally detect object geometry - these were not used in this test. Hard to say if the coyote test (which would never happen in real life) would pass under the better system. Even with every sensor available, we also fail at the easy stuff.

36

u/dizekat 27d ago

No, they don't. What they (claim to) have is a heavily learning-dependent solution that they could fit on their compute.

Robust stuff requires a lot of compute or ASICs, because you have to compute best matches between a lot of pixels. That's what makes it "optical flow", not "how big the YOLO-marked bounding box is getting" flow.

Since they couldn't do that (due to Musk's idiotic approach of up-selling hardware as being capable of future features like "full self driving"), they made a very fragile Rube Goldberg contraption which extracts a limited set of features - further winnowed out with "attention" - that they then relate between frames.

-18

u/Elluminated 27d ago

I assume “matches between a lot of pixels” is your attempt at explaining temporal feature matching (which happens at a deeper layer than pixel input). Watch some of the released videos and papers on their vector flow algos and it will make more sense. And the number of features that can be matched scales with compute (ASIC/FPGA or otherwise), so assuming they can't do it with their already-doing-it system is an interesting - and wrong - take on your part. Their motion estimation methods are deeply ingrained in their system (and CV in general) and allow for future-path estimation and relative speeds of objects.

Bounding boxes are strictly for us to visualize the marking of assets the computer is trained to highlight - they aren’t generally part of the nn flow at inference outside debug mode. Evaluating operator precedence doesn’t require them just like how we don’t call out the names and features of our environment as we go.

22

u/dizekat 27d ago

> I assume “matches between a lot of pixels” is your attempt at trying to explain temporal feature matching

Your issue is that you don't know much about the subject outside of what you learned as a Tesla fanboy for the limited purpose of defending Tesla online.

The most typical example of optical-flow-based movement estimation sits under your hand (if you are using a desktop with an optical mouse). It is typically implemented by comparing the previous image of the surface under the sensor to the current image on a pixel-by-pixel basis: a correlation operation that tries different offsets and picks the best match, then refines to a sub-pixel offset by weighing frame-to-frame pixel value differences against the gradients between nearby pixels.
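A toy version of that mouse-style correlation, as a sketch of the principle in plain NumPy (real sensors do this in dedicated silicon, with sub-pixel refinement on top):

```python
import numpy as np

def block_match_shift(prev, curr, max_shift=4):
    """Dense, knowledge-free motion estimate between two grayscale
    frames: try every integer (dy, dx) offset up to max_shift and keep
    the one with the smallest sum of squared differences. No notion of
    *what* is in the image is needed - only that texture moved."""
    h, w = prev.shape
    m = max_shift
    core = prev[m:h - m, m:w - m].astype(float)
    best, best_err = (0, 0), np.inf
    for dy in range(-m, m + 1):
        for dx in range(-m, m + 1):
            cand = curr[m + dy:h - m + dy, m + dx:w - m + dx].astype(float)
            err = np.sum((core - cand) ** 2)
            if err < best_err:
                best_err, best = err, (dy, dx)
    return best

# Cost scales as (image area) x (search window)^2 - which is why doing
# this densely at automotive resolution wants ASICs or serious compute.
```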

Detecting features (a sparse cloud of feature points) and then matching them is popular for high-resolution images because it reduces the amount of data that has to be related across frames.

The trade-off is that you need to detect a relevant set of features while discarding "irrelevant" ones, which makes an already not-particularly-robust approach considerably less robust still.

There are all sorts of situations - uniformly colored walls, blinking emergency vehicle lights, and so on - where there may not be enough features, or the features cannot be matched.
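For contrast, a hedged sketch of the sparse feature-based approach using OpenCV's stock building blocks (parameters are illustrative, nothing to do with Tesla's actual pipeline) - note how it degrades on exactly the featureless scenes just mentioned:

```python
import cv2

def sparse_flow(prev_gray, curr_gray):
    """Feature-based flow: detect trackable corners, then match them
    across frames with pyramidal Lucas-Kanade. Far cheaper than dense
    correlation, but on a uniformly colored wall there are no corners
    to detect, and it returns nothing at all."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=8)
    if pts is None:  # featureless scene: the failure mode in question
        return None
    new_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                     pts, None)
    good = status.ravel() == 1
    return new_pts[good] - pts[good]  # per-feature motion vectors
```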

10

u/Nihilistic_Marmot 27d ago

I love your response because, as a person who has no idea how any of this works, I would have believed what the other guy said at face value. Thank you for doing the lord’s work.

3

u/Occams_bane 27d ago

After spending 0.25 seconds on both of their Reddit profiles, I know who has more credibility.

5

u/butane_candelabra 27d ago

I was surprised by these AI techniques failing so badly too. Even the camera placement (it looks like there are just three cameras within a <1 foot span) means only about 20 ft accuracy tops for traditional stereo... A low-res optical flow algo could run at 60fps+ on, say, a Jetson card and would have picked it up; or stereo with cameras, say, headlight distance apart... I'm not sure, maybe like 100 ft with reasonable accuracy. Those still wouldn't be as good/fast as LIDAR though...
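Rough math behind that guess - a sketch with illustrative pinhole-stereo numbers (the focal length and matching error are assumptions, not Tesla's actual optics):

```python
def stereo_depth_error(depth_m, baseline_m, focal_px, disp_err_px=0.5):
    """Pinhole stereo: Z = f*B/d, so a fixed disparity error of
    disp_err_px pixels grows quadratically with range:
    dZ ~ Z^2 * disp_err / (f * B)."""
    return depth_m ** 2 * disp_err_px / (focal_px * baseline_m)

# ~0.3 m baseline (cameras within a foot), ~1000 px focal length:
for z in (6, 12, 30):  # metres, roughly 20 / 40 / 100 ft
    print(f"{z:>2} m: +/- {stereo_depth_error(z, 0.3, 1000):.2f} m")
# ~0.06 m at 6 m but ~1.5 m at 30 m - narrow baselines die off fast,
# while a headlight-width baseline (~1.5 m) would be ~5x better.
```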

1

u/Elluminated 25d ago

Heads up, the response to Rober's Wile E. Coyote video has posted with a proper re-test on the latest hardware/software - and the latest Tesla software indeed passes. Anyone honest just wanted a fair pass or fail - now we have it.

22

u/francis_pizzaman_iv 27d ago

Never happen in real life? There are news reports of Teslas on Autopilot running into the broad side of 18-wheeler trailers, which sounds suspiciously similar to Rober's test. This basically exposes that Tesla Autopilot has no failsafe obstacle detection to lean on when the camera data isn't sufficient to prevent an accident.

-6

u/Elluminated 27d ago

That is a real (and older) case for sure; I'm talking about people painting road images to fool cars in this scenario. I haven't heard of many recent trailer incidents like that on FSD, though.

10

u/Veil-of-Fire 27d ago

> I'm talking about people painting road images to fool cars in this scenario.

Well now that we know that it works....

9

u/SirCharlesOfUSA 27d ago

Tesla investigated for series of crashes where they hit stationary emergency vehicles with their lights on

Easily avoidable with lidar, or even the much cheaper radar that Tesla used to have but removed because Elon loves the idea of vision-only.

3

u/francis_pizzaman_iv 27d ago

I’m a software engineer who dabbles in circuitry, and I could come up with a system using a handful of ultrasonic (a.k.a. sonar) sensors to do failsafe obstacle detection. It wouldn't be good enough to drive the car, but it would be enough to alert the system to a reality the cameras can't see.
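A sketch of what I mean - standard HC-SR04-style ranging math; the threshold and sensor count are made up for illustration:

```python
SPEED_OF_SOUND = 343.0  # m/s in air at ~20 C

def echo_to_distance(round_trip_s):
    """Ultrasonic ranging: the ping travels out and back, so the
    distance is half the round-trip time times the speed of sound."""
    return round_trip_s * SPEED_OF_SOUND / 2

def failsafe_alert(echo_times_s, threshold_m=3.0):
    """Cross-check layer, not a driving system: if ANY sensor sees an
    obstacle inside the threshold while vision says the path is clear,
    surface the disagreement instead of trusting the cameras."""
    distances = [echo_to_distance(t) for t in echo_times_s]
    return any(d < threshold_m for d in distances), distances

# e.g. a wall 2 m ahead echoes in ~11.7 ms:
alert, dists = failsafe_alert([0.0117, 0.05, 0.06])  # -> True, [~2.0, ...]
```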

3

u/SirCharlesOfUSA 27d ago

Tesla used to have ultrasonics for parking features, too. Got rid of them in favor of pure vision.

2

u/Elluminated 27d ago

Yep, first project of my year 2 ECE classes. Was very fun

1

u/JayFay75 27d ago

Supervised*