Tangram Vision's AI-powered 3D sensor could transform robotic computer vision

ThatGeoGuy · on Nov 14, 2023

Hey all, senior engineer on the Tangram team here — we're really excited to be launching HiFi today!

Through past work with similar sensors [0][1], we've heard a lot of feedback about what people want from a depth camera! With our expertise in calibration & shipping sensor SDKs in the past, we saw this as an amazing opportunity to ship what we think is a big leap forward in sensing: 1.6MP cameras, AI capability (up to 8 TOPS / 8GiB onboard memory driven via TIDL), and what is most relevant to me: self-calibration and rock-solid software support.

I've spent the better part of my career building and helping develop multi-sensor systems! We hope you'll pick HiFi for your project (and even if you don't, none of our software is locked to any specific hardware vendor). HiFi is a great chance for us to flagship our software on a first-class piece of hardware, and we want to share that superpower with all of you!

[0]: https://gitlab.com/tangram-vision-oss/realsense-rust

[1]: Myself, as well as various members of the team used to work at what is now https://structure.io/

goosinmouse · on Nov 14, 2023

I'm not super informed on the space but i do try to keep up with different 3D sensing tech. What makes this a big leap forward over what we already have? I mean doesnt the iphone and most flagships already do 3D sensing?

reteltech · on Nov 14, 2023

Hi - I'm one of the founders of Tangram Vision here. It's a good question. This sensor in particular is focused on robotics, where the capabilities of 3D sensors are fairly different from what you'd find on an iPhone. In the case of HiFi, the leaps are in resolution (much higher than other depth sensors for robots), AI compute (about 5x the amount of the next competitor), and ease of integration with a robotic platform.

ThatGeoGuy · on Nov 14, 2023

It would perhaps be more accurate to say that this is a big leap forward compared to most existing off-the-shelf depth cameras for robotics. To address the iPhone specifically: you probably aren't going to mount iPhones on a bunch of production robots in the field.

Comparing to other alternatives in the robotics space (I've listed RealSense and Structure above, but there are others), there is somewhat of a laundry list of potential pitfalls and issues that we've seen folks trip over again and again.

Calibration is a big one, and a large part of what we're doing with HiFi is launching it with it's own automatic, self-calibration process (no fiducials). There are some device failures that a process like this wouldn't be able to handle, but the vast majority of calibration problems in the field result from difficult tooling or requirements, a need to supply one's own calibration software, or a combination of hardware and software that make the process difficult. A nickel for every time someone has to train a part-time operator to fix calibration in the field, and I'd own Amazon.

Depth quality and precision is another big pitfall — there are folks out there today using RealSense for their robot, but we've talked to a number of folks who just don't rely on the on-board depth. It's too noisy, it warps flat surfaces, etc. Lots of little details that on the surface you might not think about when just looking at a list of cameras! Putting our edge AI capabilities aside, the improved optics and compute available on the HiFi allow us to build a sensor that always provides good depth. That sounds like a baseline for this kind of tech, but there's plenty of examples otherwise on the market today!

Software is probably the other last big thing that we really want to leap forward on. We don't have too much to say about our SDK today, but when we launch it we hope to make working with these sensors a lot easier. I work with RealSense quite a bit (I am the maintainer of realsense-rust), and quite honestly what has been a solid overall hardware package for many years (until HiFi, I hope) is let down by how confusing it is to use librealsense2 in any meaningful project.

Needless to say, I think HiFi stands on some solid merits and I'm not sure it can be directly compared to other 3D sensors in e.g. iPhones, mostly because the expected use-case is so utterly different.

goosinmouse · on Nov 14, 2023

Appreciate the detailed response! Definitely seems like we've come a long way from when i heard about people using Kinect cameras, and look forward to all future advancements that you will contribute!

trzy · on Nov 14, 2023

Question: Luxonis seems comparable (I’ve used their products). Are you pursuing the same use cases? How do you plan to differentiate?

ThatGeoGuy · on Nov 14, 2023

I think in terms of use-cases there is bound to be a lot of overlap. Of course, in terms of comparable products, I'd have to mention that Luxonis sells a good number of different products, so understand that there's a necessary disclaimer here that HiFi isn't going to replace every possible option that Luxonis offers (especially their modules / full robot offerings, HiFi is just the sensor!).

In terms of how we differentiate ourselves, I think the main focus is going to be primarily in terms of depth quality and software. Our expertise in providing robust calibrations, combined with the improved optics on the HiFi allow us to produce much higher quality depth frames, more consistently, than what we see on the market today.

In fact, the whole purpose of building this sensor was to bring to market a 3D depth camera that provides good quality data, always, to better enable long-term autonomy. AI tools and capabilities enhance that data in a way that customers have told us existing market offerings are currently lacking.

As for software: We're a software company at heart, which helps, and HiFi is meant to be a flagship representation of what our software can power. We've used a lot of sensors, sensor APIs, SDKs, etc. and the number one thing we find is that these systems are complex, opaque, and difficult to debug or understand. A big part of designing software for me personally is producing software that is legible; not in the sense that one can literally read it (because reading code isn't easy), but in the sense that the software itself can be understood in the abstract. We're hoping that when we ship HiFi and the corresponding SDK that folks will appreciate the steps we've taken to make working with it understandable and obvious.

trzy · on Nov 14, 2023

Awesome! Thanks for the detailed write up and I hope to have an excuse in the next few months to check you guys out :)