Nvidia Research Turns 2D Photos into 3D Scenes (nvidia.com)
494 points by bcaulfield on March 25, 2022 | 227 comments



It would be really great to recreate loved ones after they have passed in some sort of digital space.

As I’ve gotten older, and my parents get older as well, I’ve been thinking more about what my life will be like in old age (and beyond too). I’ve also been thinking what I would want “heaven” to be. Eternal life doesn’t appeal to me much. Imagine living a quadrillion years. Even as a god, that would be miserable. That would be (by my rough estimate) the equivalent of 500 times the cumulative lifespans of all humans who have ever lived.

What I would really like is to see my parents and my beloved dog again, decades after they have passed (along with any living ones at that time). Being able to see them and speak to them one last time at the end of my life before fading into eternal darkness would be how I would want to go.

Anyway, there’s a free startup idea for anyone—recreate loved ones in VR so people can see them again.


There are so many things we invent with good intentions that in the end go terribly wrong, and I think this is one of those things. I think it's ok to mourn and remember the past, but moving on and accepting reality is important to a healthy life.

Let's be real though, the startup that makes this but appeals to our worst instincts will make bank. I can't imagine how much more messed up future generations will be as we keep making more dangerous technology that appeals to our primal instincts.


Let's be real: I've worked on the R&D stage of a Chinese research project for state-supported ancestor worship software where people's ancestors are recreated in 3D, their "ancestral home" is made available in pieces and parts the software user must purchase with real currency, and the user is encouraged to discuss their day-to-day life issues with their observing and consoling animated ancestors. The software is a complete Orwellian spy while masquerading as all your ancestors listening, offering advice, and demanding gifts that cost real currency. To say the least, I spooked the hell out of that situation.


Wow. I want to know every detail about this.


Me, too, but I'm betting that's already all they can say (and probably more than they should have).


"mourn and move on" is a somewhat Western concept of dealing with death. Plenty of cultures around the world have developed different practices, up to religious forms of ancestor worship.


These practices would typically be associated with the idea that the dead have a genuine existence beyond simply “existing in my memory of them” and would be restricted in the forms they take by surrounding ritual, so I’m not sure there’s a direct point of comparison, though?


Yes, there's a direct comparison. Practices change as times and technology change. When photography became cheap enough, practices like in-home shrines often changed to incorporate photographs of loved ones. I think it's reasonable to speculate that newer forms of digital representation can become part of these practices as well.


People often fear losing their memory of a loved one. A memento could help someone heal


Bs. People do different things. But nobody can sit around wallowing in the VR. Having seen what the death of my cousin and grandparents did to the family, moving on helped everyone heal in a way. VR would have been a torture to live in.


Perhaps starting your comment with "bs" is not a surefire way of getting your thoughts considered seriously..


Sigh. Point taken. You are right.


You are taking the worst possible interpretation of my comment. I was responding to literally the aspect of the GP that said "mourn.... move on"

It is a factual matter, easily discovered through simple searches, that non-Western cultures often take a different approach. Including, literally, ancestor worship. I am not making a judgement on which is better. I believe there are multiple healthy ways of dealing with grief, and "mourn & move on" can be one, but not the only one.

This is very far from an obsessive interaction with a semi-emulated version of a dead person in VR. However things wouldn't have to be that extreme.

In cultures that practice it, it's not uncommon for a small shrine to be set up in the home. And yet that is of limited access to family located further away, so a digital form of this not bound to a specific location could also be of use to people from these cultures. There is no reason that such things couldn't complement existing practices of honoring ancestors.


I don't know how a grieving person can maintain distance from VR when the memories of their loved ones are tied to it. Reminds me so much of the movie Reminiscence.


I think by "move on" they meant moving forward with life and getting through your grief.

You are absolutely right though that there are different ways of dealing with grief!


I'd say religion is the epitome of not moving on - or, moving on by imagining that it didn't really happen.


Possibly the comment was directed towards a Western consumer.


Possibly, I'd give it the benefit of the doubt. Either way, I think it is fair to note that other cultures see death & remembrance in different ways.


Curious how you find other cultures and belief systems to be? In Hinduism, at least, I can attest that there is the concept of "mourn and move on" in that the body is not even preserved; it is freed of its physical form through cremation. The fire in the funeral pyre is lit by hand by the eldest son - it's a poetic closure.


If you do a search for "ancestor worship" you'll find a few examples. It seems more common among Asian cultures, perhaps specifically those with a Buddhist tradition, but that's what I happen to be more familiar with so it may be common outside of that as well.


My gut reaction to 'recreating loved ones in VR' is that it would be torturous more than soothing.

Once a person dies, they're gone. The world is different, and nothing will bring them back. Spending time with a simulacrum isn't really spending time with a loved one.


Greg Egan's "Zendegi" is a very good exploration of that idea. Set in the fairly near future, so it's relatable.


That might be true for you but lots of people imagine talking to dead loved ones and imagine the responses and advice they'd give. Some of them do that reflection in front of a picture of their dead loved one. Seeing them in VR would just be an enhancement. No need to hate on it.


Heh, that could then be a vector for manipulation of individuals, through their dead loved ones.

"You know, you should really do XYZ..." (whatever the agenda is)

Pretty dystopian.:(


Don't knock it til you try it?

I mean, already in recent years people have made some very low fidelity 'resurrections' and gotten some measure of comfort from it, never mind the many years of history of people who visit gravestones to 'chat' (some even believing they get replies to some extent or another). When markov chains were hip, "talk with Charles Dickens!" (play with a markov model trained on his works) was at least interesting to some, GPT of course can do a lot better. Imagine we had actual superintelligent AI working on this, which actually tried to recreate brain models, or even restore a sense of existence to the resurrection that can continue independently and to their own delight rather than just being some static VR experience turned on and off whenever. (I jotted some thoughts a few years ago... https://www.thejach.com/view/2017/6/how_you_might_see_some_o...)

I'm in agreement though at least for now that even if you dialed up the fidelity to the point I can't point to anything objectively "off", let alone went with a low fidelity VR thing or just a chat bot, I'd still always think in the back of my mind that this isn't really the same person. Nevertheless, it could still be comforting to some, and interesting to others, and so if it's at all possible for potential future humans/ems/aligned AIs to work at it, they will.

Curiously I don't have the same back of the mind feeling at all when considering the idea of someone preserved with cryonics and then brought back as an em, the person would be the same to me, even if there was a bit of damage from the vitrification process that was error-corrected. I have a disagreement with a friend on that point, who thinks it would be something similar but not the same. Anyway, I think this is due to just having a much higher fidelity "source of truth" to work from that is the preserved adult brain, whereas someone supposedly information-theoretically dead requires a lot more guess-work, perhaps truly impossibly too much more even for a superintelligence, to bring back convincingly.


I suppose... but I'll always know that it's some sort of cheap trick to make me think they're still here when really their consciousness ceased.

Will the simulacrum age? Will it change? Will it ever surprise me or intrigue me? And if it does, is it something the dead person truly would have done?

It's sort of like a photograph; a photograph seems to capture reality but all it shows is some abstraction of a physical reality at one point in time. The photograph tells me nothing about the current reality of anything depicted within.

A simulacrum gives me the person I knew, when they died, and that's it. Perhaps comforting and interesting, but ultimately unfulfilling and unsettling I would imagine.


To me, the whole idea of a recreation of a person for my own daily comfort just cheapens the former existence of that person. It's one thing to have genuine photographs or even videos since there's an obvious delineation between memory and reality. Photos can make it even more obvious that the person is no longer alive. But to turn someone into an AI for the sake of coping (and denial) turns them into a product.

Maybe that's fine for some people. To me there's a line where it crosses over into offense. In no way do I think the status quo of my mental well-being is so important that I'd replace someone with a digital robot facsimile.


Have you lost someone you've loved?


Yes, but I get the unstated implication. I don't think it would be fair to apply it to me even though I think it may be fair to apply it to e.g. Kurzweil, unless he's made recent statements suggesting otherwise (I don't keep up with him). I currently have no expectation of seeing a convincing simulation/resurrection/recreation/continuation of any of them, or the ones I currently anticipate losing over the next few decades, even should I go on indefinitely living, and don't expect anything at all if/when I should die.


Would you want to have an avatar of them? I don't think I would.


If the 'avatar' is convincing and can have a continued existence as a person independent of my interaction with it, i.e. I don't "have" them as a form of possession, yes. It does seem better than 50/50 to me, though, that some (maybe all) wouldn't want continued existence and would decide to go back to not existing (for all I can tell). There may even be strong predictive signs of that in the brain models, such that they don't even need to be temporarily brought back and asked, or first made to listen to arguments, or just have some final-final talks with me/others before deciding. I'd accept that.

For less convincing avatars where the point is just my own benefit of conversing with something like them when I want, from slightly like them to eerily like them, for one it's a weak yes, for the others I'm more indifferent -- it'd be more in the realm of curiosity than desire, like talking to a historical figure or a fictional character. The weak yes I expect will get weaker (as it already has, despite non-linear flare-ups/resurgences where it's temporarily stronger) and eventually match the others after long enough.


> it would be torturous more than soothing

Relevant Asimov story: https://en.wikipedia.org/wiki/The_Dead_Past


I was going to suggest the HBO film "Reminiscence", which was panned as mediocre but I liked it for being Noir and somewhat original. It features a machine that can replay your own memories and record them for others -- useful for investigating crime, but otherwise highly addictive to the nostalgic.

Reading about The Dead Past though, there's a good deal of overlap with the plot device of "Devs", a periscope into deep history, which Nick Offerman's character uses to re-live moments of his daughter's short life, as Asimov's protagonist does.


Agreed, to me it sounds like the worst possible kind of uncanny valley.


It has the potential to be therapeutic. Perhaps only to be used under the supervision of a mental health professional.

I for one have many things I wish I had said to my parents before they died. I’d like to use it, but not be obsessed or consumed by it.


Precisely. This is going to be used to suck people into the metaverse.


>Imagine living a quadrillion years. Even as a god, that would be miserable.

This seems very subjective, I don't agree at all.


Seriously... the idea of living a quadrillion years is hardly an extension of living 100. You can still never go backward. You still need to plan your life. Death would still be a common occurrence.

I would much MUCH rather live much longer. I don't hate my life at all.


At some point you will have heard all possible jokes. What is life without humor?


That's nonsense. There are infinite possible jokes.


And memory is limited. What if one were a forgetful god. What if that is _already_ the case?

Alzheimer's patients who are told the same joke over and over, and it still brings them joy, possibly come to mind.


This is not true. Jokes that take more than (say) an hour to tell are boring and thus not really jokes.


The number of possible shuffles of a 52-card deck (52! ≈ 8 × 10^67) is in the same unfathomable league as the number of atoms in the universe. The number of topics that can be played with in a joke is beyond even that limited number.

Remember, jokes about Tinder/Grindr dates didn't even exist a decade ago.


Agreed it's subjective.

That said, what would you do with your quadrillion year long lifespan, assuming you're healthy during it? It's way past everything you can do and learn, and cosmic level events are not what the human mind can perceive as they slowly unfold...

I would like to live a couple thousand years though, as long as my loved ones get to be along for the ride. Though I wonder if my loved ones would eventually become my hated ones...


> It's way past everything you can do and learn

Such arrogance; humanity isn't all that far beyond its banging-rocks-together phase. We haven't even scratched the surface. There is so much more to learn.


It's not arrogance. There aren't an infinite number of things to do and learn. It may seem so when your lifespan is less than a century, but quadrillion years? You'd run out of things that make sense for a human being.


How do you know it's way past everything you can do and learn?

No one has ever had the experience to be able to say.


I'm fond of my loved ones, but I also like creating new ones. I also would like to see very long term plans come to fruition. It seems very much like a "640K ought to be enough for anybody" problem.

Works of art that require thousands of lifetimes to complete. Deep understanding of the physical world. Exploring the universe. Much more.


I get what you mean, but I don't think the human mind can contemplate things that take thousands of years to unfold; it's just not wired for that. Even things that take years are difficult for us to comprehend.

Re: creating new beings. How long till you also get bored of that? Remember we're talking about quadrillions of years.

An additional thing to consider: if you could live quadrillions of years and still be able to perceive things that take thousands or millions of years to unfold, wouldn't you subjectively be living the lifespan of a normal human being? That is, your life wouldn't feel long to your mind, but normal (your mind would have to adjust to run more slowly). Wouldn't you then regret your "short" lifespan and wish you could live longer?


It seems like you assume you would remain human for a quadrillion years. Why? I'd assume a technology to upload yourself into a better substrate (one you have full control over) will appear within next few hundred years. From that point there are no limits on what you can do or become.


Your answer is the only valid counterpoint, in my opinion. Since the human mind cannot perceive the passage of a quadrillion years, we would have to become something else, something inhuman.

I don't know whether this is something I would wish for myself. Maybe. I like being human. But maybe, who knows?


What do you like about being human?


This extends to a cohort of people living thousands of years, having memories of each other to thousands of years in the past. Plus they all need beds.


In Ursula Le Guin's book "Changing Planes", there's a story called "The Island of the Immortals." Spoiler alert: living forever is no fun.


If there was no tension the book would be no fun, so I don't think we should pay much attention to story lines created for entertainment.


I mean if you were a god your experience would be qualitatively different. You could make it pass by as fast as a minute takes for a human being.


“Mom do you remember the walk we took around the lake just before you died?”

“Yes what a wonderful day. I also fondly remember the Adidas sneakers I wore. Did you know they’re 25% off at Target this weekend only?”


This reminds me a lot of Black Mirror season 2 episode 1.

Always good to treasure the time we are given.


I miss Black Mirror. Some episodes were hit-n-miss, but many of them made me think. The best part was that each episode was mostly independent of the other.


and then you have to put your loved ones in the attic


Hah! I made this suggestion here about a couple years ago. It's been my dream-pet-project-I'll-never-create for a while now, but it seems like more talented and predisposed individuals have also considered it / may be working on it already...

Some interesting replies there

https://news.ycombinator.com/item?id=24073994


If that technology existed, a lot of people would live in misery knowing that they could be seeing their dog and their parents and speaking to them one last time, but only when their time to go comes. Either that, or people would be able to use it whenever they wanted or could pay for it, and it would become a new highly addictive and harmfully dissociative tech product that I venture to guess would be like 10x more harmful to society than Facebook ever was.

I don't even know if it would have some kind of therapeutic use or whatever. Yeah, it would be a wonderful experience, and I... ehh... let's say "dreamed" about similar experiences a few times and it was a powerful "dream". But being able to do it on demand would, I think, change how grieving works in the modern world so much! And we rely on the things that are gone not actually being there for us on demand to be able to move on.

It's a beautiful idea but it's as beautiful as the fear of nuclear MAD keeping the first world safe from war in their own land, not as beautiful as a poem or a flower or a good memory.


I suppose I'm less pessimistic. I assume that eternal life wouldn't be the same experience as the corporeal passage of time. Time would exist of course, but would be perceived either as its true essence, or at least as a higher dimensional projection. Rather than be anchored in time at a constant velocity, you would be able to move and exist independently from it.


Upload is a pretty good series on Amazon Prime right now that explores that. It's both amazing and horrifying.


You may also enjoy Marjorie Prime

https://en.m.wikipedia.org/wiki/Marjorie_Prime


I'm with you on this. I would love to be able to connect with ancestors I barely or never met, even if it only captures a fraction of their essence. I worked on this for a while but Microsoft holds a pretty broad patent on this concept, which scared me off.


We are not too far away from that [1].

[1] https://www.sfchronicle.com/projects/2021/jessica-simulation...


Isn't that the premise of the Amazon series Upload?


> It would be really great to recreate loved ones after they have past in some sort of digital space.

Tim Heidecker did this with his son Tom Cruise, Jr. Heidecker and it was quite moving: https://youtu.be/2fVQtmMN_ao?t=178


I think it would be a special kind of hell to have your resurrected loved ones sell you ads in virtual reality.


> It would be really great to recreate loved ones after they have passed

That’s probably the basis for most of human religion and philosophy; coping mechanisms.

I’ve always wondered if it might be possible to ”reverse engineer” parents or siblings from one’s own DNA.. at least visually.


You'd probably like Spielberg's A.I., or at least find it topical.


I can imagine people would also create their exes and talk to them after they've broken up.


What's fun is my mom has a bunch of 3D photo slides from the 1950s and 1960s, so we could bring those into stereo VR really easily. I still need to borrow my friend's slide scanner...


Sounds like the end of the movie: "A.I.".


You mean Westworld ?


More like "Things to do in Denver when you're dead."


> Eternal life doesn’t appeal to me much. Imagine living a quadrillion years. Even as a god, that would be miserable.

That isn't necessarily what eternal life would be. Indeed, eternity is not temporal at all.

For example, in the Catholic understanding of heaven, those in heaven exist in aeveternity or aevum[0], a state in between the temporal and the eternal. Eternity[1] is proper only to God and is by definition timeless, with no beginning or end, so it would not make sense to speak of quadrillions of years. Where God is concerned, there is, loosely, only a now with no beginning, no end, no past, and no future, only the present, to use temporal language analogically.

Furthermore, heaven is in part characterized by the beatific vision[2] which is an immediate, direct, and inexhaustible knowing of God (Being Himself) which is Man's ultimate and supreme happiness. In this life, knowledge of God is generally mediate like much of human knowledge.

In other words, thus understood, the best of this life is but a faint shadow of a shadow in comparison to the ultimate fulfillment of heaven which is Man's proper end.

[0] https://www.newadvent.org/summa/1010.htm#article5

[1] https://www.newadvent.org/cathen/05551b.htm

[2] https://www.newadvent.org/cathen/02364a.htm


This in no way represents my views, it's far from them. But I vouched for this comment because it deserves to be a part of this conversation.


“I’ve also been thinking what I would want “heaven” to be. Eternal life doesn’t appeal to me much.”

please tell me more!


My prediction/hope is that NeRFs will totally revolutionize the film/TV industry. I can imagine:

- Shooting a movie from a few cameras, creating a movie version of a NeRF using those angles, and then dynamically adding in other shots in post

- Using lighting and depth information embedded in NeRFs to assist in lighting/integrating CG elements

- Using NeRFs to generate virtual sets on LED walls (like those on The Mandalorian) from just a couple of photos of a location or a couple of renders of a scene (currently, the sets have to be built in a game engine and optimized for real time performance).


This sort of stuff (generating 3D assets from photographs of real objects) has been common for quite a while via photogrammetry. NeRFs are interesting because (in some cases) they can create renders that look higher quality with fewer photos, and they hint at the potential of future learned rendering models.


I did it for a project… it was SLOW! It gave decent results for a large area, but poor results for small details.

The real trick is textures.

When the photo is laid over and wrapped on the model, it looks great. But once you remove that, the raw mesh underneath is not as impressive.

I would really like to see the examples they had here without the texture laid on top.


There is no mesh here. NeRFs are 5D fields (colours are computed based on a 3D position vector plus a view direction vector) that are rendered volumetrically. So the "texture" is an integral part of the neural representation of the scene, not just an image applied to a mesh.

The cool part is that this also allows for capturing transparency, and any effects caused by lighting (including complex specular reflections) are embedded into the representation.
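
To make "rendered volumetrically" a bit more concrete, here is a rough NumPy sketch (hypothetical and simplified; the `field` callable is an assumption that stands in for a trained model returning colour and density for a batch of sample points) of how one pixel is composited from samples along a camera ray:

    import numpy as np

    def render_ray(field, origin, direction, near=0.0, far=4.0, n_samples=64):
        # sample depths along the ray and turn them into 3D points
        t = np.linspace(near, far, n_samples)
        points = origin + t[:, None] * direction
        rgb, sigma = field(points, direction)          # query the radiance field

        # standard alpha compositing: opacity per segment, transmittance so far
        deltas = np.diff(t, append=t[-1] + (t[1] - t[0]))
        alpha = 1.0 - np.exp(-sigma * deltas)
        trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
        weights = alpha * trans
        return (weights[:, None] * rgb).sum(axis=0)    # final pixel colour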


Nitpicking, but for the GP: the NeRF is the internal representation, but the output doesn't have to be 2D (i.e. ray traced).

There are examples of people outputting an SDF (and by extension geometry) from a NeRF, and projecting the original texture onto that would give some nice effects (live volumetric capture works best this way), though there would be some disparity where edges/occlusion aren't perfect, so you'd want to sample the NeRF's RGB anyway... although a lot of that is fuzzy at the edges too. A lot of incorrect transparency at edges looks great in the 2D renders (so much anti-aliasing and noise!) but less good for texturing.


A NeRF is not the same as an SDF though. NGP (the paper by Nvidia linked here) can train NeRFs and SDFs, but I don't know of any straightforward way of extracting an SDF from a NeRF.

And while it's true that there are methods for extracting a surface from a NeRF, achieving a high quality result can be challenging because you have to figure out what to do with regions that have low occupancy (i.e. regions that are translucent). Should you consider those regions as contained within the surface, or outside of it? Especially when dealing with things like hair, it's not obvious how to construct a surface based on a NeRF.


Given the same amount of compute as this used, photogrammetry would be about as fast.


Perhaps even making non-gimmicky live action 3d films.

Having 3d renders of the entire film without needing green screens and a bunch of balls seems like it would have to make some of the post processing work easier. You can add or remove elements. Adjust the camera angles. More effectively de-age actors. Heck, even create scenes whole cloth if an actor unexpectedly dies (since you still have their model).

Seems like you could also save some time having fewer takes. What you can fix in post would be dramatically expanded.

Best part for film makers, they are often using multiple cameras anyways. So this doesn't seem like it'd be too much of a stretch.


With this - and with all the footage you have of actors (their movies) - you basically have every actor's model.


Throw those into Copilot for filmmakers and away you go!


> - Using lighting and depth information embedded in NeRFs to assist in lighting/integrating CG elements

> - Using NeRFs to generate virtual sets on LED walls

Sounds like a powerful set of tools to defeat a number of image manipulation detection tricks, with limited effort once the process is set up as routine. State actor level information warfare will soon be a class of its own. Not just in terms of getting harder to detect, but more importantly in terms of becoming able to produce "quality" in high volume.


Computer games, VR and AR could also be pretty amazing uses for this technique.


RIP photo-realistic modelers


hmmm well I still think they will be in demand, for the same reason software developers will not be automated away. NeRF is really mind-bogglingly good, but there are still artifacts, and that is something modelers have a good eye for.

Having said that, it might be the end for any junior type of roles, for the same reason that GitHub Copilot really takes a bite out of the need to have a junior developer.

I'm very curious what will happen because it will become a sort of trend across other industries apart from legal or medical professions (peace of mind from human-in-the-loop).


Maybe we'll have people spend their time building IRL sculptures and spaces to get digitized.


People made clay sculptures of CG characters as a modeling technique for a long time. It’s still done, but digital sculpture tools are getting easier to use so it’s not as common as it was.


Not even a full 3D environment is required; just a bit of 6DOF and parallax would go a long way. I think VR video (ok, porn) has gone as far as it can without head tracking.


I'm pretty sure the porn industry will find many uses for this technology


Have you seen the first episode of Halo? There are multiple outdoor scenes where you feel sure it's a recording rather than a CGI render. The uncanny valley is almost crushed.


I can see this really taking off for football games. You'll be able to look at plays from all angles, zoom in/out, and get to _play_ with the game.


My, maybe too extreme, future fantasy version of this is turning existing movies into 3d movies you could watch in VR.


I'm thinking it would be something like: I want to be the baddy in Die Hard and want the protagonist to be Peter Griffin (cartoon version). The system feeds you the movie... I'm imagining there could be an industry for writers to create the off-screen plots of other characters, and principally it would be rendered with the same scenes as the original movie.


- help with product placement advertisements


It will boost cut-scenes in games as well.


Tangent

I wonder what happens to most people when they see innovations such as this. Over the years I have seen numerous mind-blowing AI achievements, which essentially feel like miracles. Yet literally after an hour I forget what I even saw. I don't find these innovations to leave a lasting impression on me or on the internet, except for the times when these solutions are released to the public for tinkering and they end up failing catastrophically.

I remember having the same feeling about chatbots and TTS technology literally ages ago, but at present, the practical use of these innovations feels very mediocre.


TTS is extremely useful - many people use car nav, TikTok, screen readers, etc even if they don't like voice assistants.

Neural TTS is so good now it's being used to fake the original voice actors in video game mods.


Besides being very helpful for edge-case uses such as for differently abled people (maybe this is subjective), I don't find any of these to be exciting or even mundane quality-of-life improvements at all for the general population.


I have the impression that now some of them really do end up in practical applications. Funnily enough, someone just today showed me a feature of his phone where you can select undesired objects in your photo and it will just replace them with a fitting background indistinguishable from the original photo.


It's rather entertaining when this happens in the opposite direction automatically too: https://twitter.com/mitchcohen/status/1476351601862483968


It was already debunked. If you scroll down in that Twitter thread, you read:

> Big news! I sent @sdw the original image. He theorized a leaf from a foreground tree obscured the face. I didn’t think anything was in view, and the closest tree is a Japanese Maple (smaller leaves). But he’s right! Here’s a video I just shot, showing the parallax. Wow!


Aw. But I guess it's better than too-fancy AI in the phone :)

TIL though, thank you!


The followup on that is... it didn't happen. There was a leaf in the foreground, and the depth of field in the photo was large enough that it was in focus rather than blurring. https://twitter.com/mitchcohen/status/1476951534160257026


In a follow up tweet it looks like there was a leaf on a tree in the foreground that obscured her face, not an AI replacement.


Magic eraser on Pixel?


I immediately think of profitable applications

Another comment mentioned education: I take classes, formal education, for things that are immediately applicable for me right at the time

So it's pretty consistent for me

Another thing that might be different is that I don't go for perfect or good enough, or worry that an existing alternative like "having an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem" might already exist, I just go for novelty and am selling to people that also like novelty


hmmm I really find this to be different from chatbots. In fact, it took a lot of skepticism for me to overcome before using GitHub Copilot, and then I saw a new reality where it became part of the process; albeit not as prolific, but enough to make me ponder what the next evolution might be.

For 3D modelers, this is huge, since it takes a lot of experience and grunt work to put the right touches on even a boilerplate 3D model. So much so that many game companies have outsourced non-human 3D modeling; this would certainly impact those markets.

1) It could further lower the cost and improve quality.

2) Studios could move back those time-consuming tasks on-shore and put an experienced in house artist/modeler to manage the production.

3) Hybrid of both

What I see here is that NeRF has a far greater impact on the 3D modeling/animating industry than GitHub Copilot. Another certainty is that we are going to see a faster rate of innovation. We are at a point where papers released merely months ago are being completely outpaced by others. The improvement in training time that NeRF now offers is insane, especially given how quickly this new approach came out.

We could be at a future where the release of AI achievements will not be able to keep up with published works. It would be as fast as somebody tweeting a new technique, only to be outdone by somebody else weeks or possibly days later.

Truly exciting times.


The problem is that when the thing is initially announced, it's not useful to anyone yet, because it's not productionized and released to the general public.

But then once it IS released to the general public, it's probably been at least several months, maybe even multiple years since the announcement, so people are like, "yawn, this is old news."


I would love to see this being popular in VR. I enjoy google earth in VR way too much (it is just 360 photos) and there are some 3d real scenes you can walk in


just like most education is useless and mostly forgotten. Only when we can apply it will it be meaningful


I think with any tech demo (or other corporate PR piece), it is good to assume the worst, because companies spin things to be as ducky as possible. This is a self-reinforcing cycle, because if two companies have identical products, then the best liar--er, marketer--will win.

(not to say this sort of behavior is exclusive to corporate PR. as the best and smartest person ever, I would never need to exaggerate my achievements on a job application, but others may)


I don't really understand why NeRFs would be particularly useful in more than a few niche cases, perhaps because I don't fully understand what they really are.

My impression is that you take a bunch of photos in various places and directions, then you use those as samples of a 3D function that describes the full scene, and optimize a neural network to minimize the difference between the true light field and what's described by the network. An approximation of the actual function, that fits the training data. The millions of coefficients are seen as a black box that somehow describes the scene when combined in a certain way, I guess mapping a camera pose to a rendered image? But why would that be better than some other data structure, like a mesh, a point cloud, or signed distance field, where you have the scene as structured data you can reason about? What happens if you want to animate part of a NeRF, or crop it, or change it in any way? Do you have to throw away all trained coefficients and start again from training data?

Can you use this method as a part of a more traditional photogrammetry pipeline and extract the result as a regular mesh? Nvidia seems to suggest that NeRFs are in some way better than meshes, but according to my flawed understanding they just seem unwieldy.


If you tried to repro these results (including time & space constraints) using traditional photogrammetry, you would be sorely disappointed.

Photogrammetry is great if you have a very solid object that is not shiny or translucent at all. You get a lot of surface color micro-detail and a bit of bumpy meso-detail.

But, if something is fuzzy, hairy, lacy, or smoky, you are straight-up out of luck. Don't even try.

If it is shiny, it can be difficult to capture at all --let alone capture the shine. Material capture techniques that are not "chalk sculpture" are rare, very limited and usually experimental.

NeRFs however are pretty much a photograph that you can walk around in. They have about as much structure as a photograph ;) But, that lets them not care about the mathematical definition of your scuffed-up, lacquered, iridescent, carbon fiber mirror frame and just show it as it looks from whatever angle.


Maybe NeRFs can be used as an intermediate step to reconstruct the scene, and then extract the surfaces and their materials to more conventional representations like meshes, textures, refraction index, etc. I guess the main benefit is that it fills in the undersampled areas in a scene, whether that's an occlusion, a reflection angle, or something else.

My main problem with them is that it seems as if all the data is unstructured and interdependent, not like pixels, voxels, or similar where you can clearly extract and manipulate parts of the data and know what it means. To use your photograph example, a digital photo is a simple grid of colored points, and it's easy to change them individually. A regular 3D scene is a collection of well defined vertices, triangles, materials etc, that is then rendered into a digital photo using a easy to describe process. A NeRF on the other hand seems to be more like enter camera pose => magic => inferred image.

Maybe I'm overthinking it and it doesn't have to be as general as our current formats; maybe a binary blob that can represent a static scene is fine for plenty of applications. But it feels needlessly complicated.


https://www.matthewtancik.com/nerf is able to sample the 3D volume somehow. "We can also convert the NeRF to a mesh using marching cubes."
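
As a rough sketch of that marching-cubes step (hypothetical; `density_fn` is an assumed stand-in for a trained model's density query, and the threshold value is an arbitrary choice):

    import numpy as np
    from skimage import measure  # scikit-image

    def nerf_to_mesh(density_fn, resolution=128, bound=1.0, threshold=25.0):
        # sample the learned density on a regular grid inside [-bound, bound]^3
        xs = np.linspace(-bound, bound, resolution)
        grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)
        sigma = density_fn(grid.reshape(-1, 3)).reshape(resolution, resolution, resolution)

        # extract an iso-surface from the density volume
        verts, faces, normals, _ = measure.marching_cubes(sigma, level=threshold)

        # map voxel indices back to world coordinates
        verts = verts / (resolution - 1) * 2 * bound - bound
        return verts, faces, normals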


That's more interesting than I realized. In this example, I assumed that the model was generating some sort of 3D mesh representing the woman. Is that not at all the case? Would this technique be unable to generate a model or volumetric information despite being able to reasonably render her from many directions?


No, there is no mesh. A NeRF is a neural network trained to work as a function f(x, y, z, θ, φ). You put in your viewing position (x, y, z) in 3D space and the direction (θ, φ) you're looking into (where θ and φ are the angles for up/down and left/right, respectively), and the function will output a tuple (r, g, b, σ) of the colour (r, g, b) and the material density (σ) of whatever you see at the pixel in that direction and from that position.

You can generate a mesh from the density information this function gives you, but for that you need to discretise the continuous densities you get out.


Minor correction: it's not your (i.e. the camera's) XYZ position that you input, but the position of the point whose RGBA you're trying to render.
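
Putting the parent's description together with this correction, a minimal, untrained sketch of the f(x, y, z, θ, φ) → (r, g, b, σ) network might look like the following (the layer sizes are arbitrary, and real models first encode the inputs with sin/cos features, as discussed further down the thread):

    import torch
    import torch.nn as nn

    class TinyNeRF(nn.Module):
        """Maps a 3D sample point plus a view direction to colour and density."""
        def __init__(self, hidden=256):
            super().__init__()
            self.trunk = nn.Sequential(
                nn.Linear(3, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )
            self.sigma_head = nn.Linear(hidden, 1)   # density depends on position only
            self.rgb_head = nn.Sequential(           # colour also depends on view direction
                nn.Linear(hidden + 3, hidden // 2), nn.ReLU(),
                nn.Linear(hidden // 2, 3), nn.Sigmoid(),
            )

        def forward(self, xyz, view_dir):
            h = self.trunk(xyz)                      # xyz: position of the sample point
            sigma = torch.relu(self.sigma_head(h))   # non-negative density
            rgb = self.rgb_head(torch.cat([h, view_dir], dim=-1))
            return rgb, sigma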


Also, the method is lossy as hell. Not something you'd ever use for an intermediate representation.


The pros in the vfx industry still all use reconstructed geometry. And yes, animating or cropping a Nerf is painful.

In my opinion, Nerf is more about showing progress in making AI memorize 3D scenes and the hope is that this will lead to actual understanding sometime in the future.


> What happens if you want to animate part of a NeRF, or crop it, or change it in any way? Do you have to throw away all trained coefficients and start again from training data?

You don’t change NeRF (the model). You change the point of view of an observer.


I mean, this is the parent post's point; the use case for a static photo or a static 3D NeRF is pretty limited.

With other structured data compositing and animating is relatively trivial.

It turns out that people have approached this problem before and you can composite nerf too (1) by sampling different functions over the volume.

…but, let’s not pretend.

The complaint is entirely valid. You’re taking a high resolution voxel grid and encoding it into a model.

Working with simple voxel data lets you do all kinds of normal image manipulation techniques, and it's not clear how you would do some of those with a NeRF.

Practically speaking, the applications you can use this for are therefore reasonably limited right now.

[1] - https://www.unite.ai/st-nerf-compositing-and-editing-for-vid...


> Working with simple voxel data lets you do all kinds of normal image manipulation techniques, and it's not clear how you would do some of those with a NeRF.

Any image transformation you can do on voxels you can straightforwardly transfer to nerfs. Voxel data is just a lookup table from discrete positions to material properties like color and density. When you apply a transform, you change the inputs (e.g. multiplying them with a rotation matrix) or the outputs (e.g. changing the color). If you want to do the same thing with a nerf that maps continuous positions and directions to material properties like color and density, just transform the inputs or the outputs.

The major difference is that with voxel data you can easily do output-modifying transformations directly on the stored representation, while for nerfs it might be cheaper to do it on the fly instead of redoing the training procedure to bake the change into the model.
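
For example, a rigid rotation of the scene can be expressed as a wrapper that transforms the query inputs before calling the original field (a hypothetical sketch; `nerf` is an assumed callable taking row-vector points and view directions and returning colour and density):

    import numpy as np

    def rotated_nerf(nerf, R):
        # Querying the rotated scene at p is the same as querying the original
        # scene at R^-1 p; for a rotation matrix, R^-1 = R^T, which with row
        # vectors reduces to p @ R.
        def field(points, dirs):
            return nerf(points @ R, dirs @ R)
        return field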


> Any image transformation you can do on voxels you can straightforwardly transfer to nerfs

No.

> it might be cheaper to do it on the fly instead of redoing the training procedure to bake the change into the model.

I think it’s a bit more complex than you imagine; it’s not “cheaper/not cheaper”; it’s literally the only way of doing it.

If you have a transformation f(x) that takes a pixel array as in input and returns a pixel array as an output, that is a trivial transformation.

If you have a transformation that takes a vector input f(x) and returns a pixel output, it’s seriously bloody hard to convert it to a “good” vector again.

Consider taking a layered svg and applying a box blur.

Now you want an svg again.

It's not a trivial problem. Lines blur and merge; you have to reconstruct an entirely new SVG.

Now you add the constraint in 3d; you can never have a full voxel representation in memory even temporarily because of memory constraints.

At best you're looking at applying voxel-level transformations on the fly to render specific views, and then retraining those into a new NeRF model.

I think that counts as … not straightforward.

Doing all your transformations on the fly is a lovely idea, but you gotta understand the reason NeRF exists is that the raw voxel data is too big to store in memory. It's simply not possible to dynamically run an image processing pipeline over that volume data in real time. You have to bake it into NeRFs to use it at all.


> Consider taking a layered svg and applying a box blur.

> Now you want an svg again.

> It’s not a trivial problem. Lines blur and merge, you have reconstruct an entirely new svg.

Nerfs are not svgs.

Consider taking a nerf and applying a box blur. Easy, a box blur on voxel data takes multiple samples within a box and averages them together, so to do the same thing to a nerf just take multiple samples and average them together.

That does get slower the more samples you need, but you never have to materialize a full voxel representation.
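
A hedged sketch of that argument in code (again, `nerf` is an assumed callable; the sample count trades accuracy for speed):

    import numpy as np

    def box_blurred_nerf(nerf, radius=0.05, n_samples=8):
        # Average jittered queries on the fly instead of materialising a voxel grid.
        def field(points, dirs):
            rgb_acc, sigma_acc = 0.0, 0.0
            for _ in range(n_samples):
                jitter = np.random.uniform(-radius, radius, size=points.shape)
                rgb, sigma = nerf(points + jitter, dirs)
                rgb_acc, sigma_acc = rgb_acc + rgb, sigma_acc + sigma
            return rgb_acc / n_samples, sigma_acc / n_samples
        return field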


You still have to process every voxel.

> Doing all your transformations on the fly is a lovely idea, but you gotta understand the reason NeRF exists is that the raw voxel data is too big to store in memory.

> It's simply not possible to dynamically run an image processing pipeline over that volume data in real time.


You are right; this is why approaches like Plenoxels are vastly faster than NeRF. They take the optimization approach of neural nets but combine it with a simple and regular data representation of the scene.

https://arxiv.org/pdf/2112.05131.pdf


This is great, and the paper+codebase they're referring to (but not linking, here [1]) is neat too.

The research is moving fast though, so if you want something almost as fast without specialized CUDA kernels (just plain pytorch) you're in luck: https://github.com/apchenstu/TensoRF

As a bonus you also get a more compact representation of the scene.

[1] https://github.com/NVlabs/instant-ngp


The key difference is that this TensoRF is discrete (like a JPG) whereas NeRFs are continuous (like vector graphics).


>The model requires just seconds to train on a few dozen still photos — plus data on the camera angles they were taken from — and can then render the resulting 3D scene within tens of milliseconds.

Generating the novel viewpoints is almost fast enough for VR, assuming you're tethered to a desktop computer with whatever GPUs they're using (probably the best setup possible).

The holy grail (by my estimation) is getting both the training and the rendering to fit into a VR frame budget. They'll probably achieve it soon with some very clever techniques that only require differential re-training as the scene changes. The result will be a VR experience with live people and objects that feels photorealistic, because it essentially is based on real photos.


You can just make a mesh once you have the NeRF, which is plenty fast for VR. (What's important isn't new perspective generation but scene training. New perspective generation doesn't have to be real time just within a reasonable time to preprocess and then make a mesh.)


The meshes I've seen from nerfs are pretty horrible and you lose many of the things that make nerfs interesting in the first place.


That doesn't make much sense. You should be able to get arbitrarily good meshes by generating more and more viewpoints.


OK. So that's using a render of a NeRF field as input into photogrammetry? Yeah - that might be more feasible. I was talking about directly generating a mesh from the NeRF density itself (marching cubes).

But NeRFs can capture and render things that are very difficult subjects for photogrammetry: fur, vegetation, reflective or transparent surfaces, etc.


> plus data on the camera angles they were taken from

Doesn't seem like much of a stretch to determine the angles as well.

E.g. a semi brute forced way with GANs


I've spent a lot of time thinking about this (i.e. taking a video and creating a 3D scene) and I don't think that it is feasible in most cases to have good accuracy. If you need to infer the angle, you need to make a lot of biased assumptions about things like velocity, position, etc., of the camera, and even if you were 99.9% accurate, that 0.1% inaccuracy is compounded over time. Now I'm not saying it's not possible, but I'd believe that if you want an accurate 3D scene, you'd rather be spending your computation budget on things other than determining those angles when it can simply be provided by hardware.


You're far too pessimistic (or maybe you don't know the field well). The problem of estimating the relative poses of the cameras responsible for a set of photos is a long standing and essentially "solved" problem in computer vision. I say "solved" because there is still active research (increasing accuracy, faster, more robust, etc.) but there are decades-old, well known techniques that any dedicated programmer could implement in a week.

If you're genuinely curious, look into structure from motion, visual odometry, or SLAM.
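
To give a sense of how routine this is, here is a hedged two-view sketch using off-the-shelf OpenCV calls (the intrinsics matrix K is assumed known, and the recovered translation comes out only up to scale):

    import cv2
    import numpy as np

    def relative_pose(img1, img2, K):
        # detect and match local features between the two images
        orb = cv2.ORB_create(4000)
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
        matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
        pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
        pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

        # essential matrix via RANSAC, then decompose into rotation + translation
        E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
        return R, t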


https://github.com/NVLabs/instant-ngp has a script that converts a video into frames and then uses COLMAP ([1]) to compute camera poses. You can then train a NeRF model within a few seconds.

It all works pretty well. Trying it on your own video is pretty straightforward.

1. https://colmap.github.io/


> even if you were 99.9% accurate, that 0.1% inaccuracy is compounded over time

Not really, with SLAM there are various algorithms to keep inaccuracy in check. Basically it works by a feedback loop of guessing an estimate for position and then updating it using landmarks.


You don't even need anything that fancy. Traditional structure-from-motion, or visual odometry gives accurate enough position estimations.

If you want to experiment, take a bunch (~100) of photos of an object, and use COLMAP to generate the poses. COLMAP implements a global SfM technique, so it will be very accurate but very slow.


You could train the initial model in sim and do sim2real finetuning. Someone must be working on this already


There is an explosion of NeRF papers:

https://github.com/yenchenlin/awesome-NeRF

It's possible to capture video / movement into NeRFs, possible to animate, relight, and compose multiple NeRF scenes, and a lot of papers are about making faster, more efficient, and higher quality NeRFs. Looks very promising.


I hope someone can take this, all the images of street view, recent images of places etc. and create a 3d environment covering as much of earth as possible to be used for an advanced Second Life or other purposes.


Self-driving car companies are already working on this. Check out: https://waymo.com/research/block-nerf/

NeRF is a very active research area, and the progress from 2020 to now has been nothing short of astonishing. In 5 years, I expect there to be fully generative NeRFs in research, i.e. describe a scene, and a NN produces a full 3D scene that you can interact with.


Can it be done with Google's existing Street View data?


In the demo video, they mention that they used a lot of footage from self-driving cars to produce that.

One thing I noticed is there are no pedestrians and cars in those scenes. So they must do a lot of work to filter them out by combining a lot of footage. Therefore, it likely can't be used (as-is) on the street view dataset...


Yes!!!! I always get frustrated whenever it does the weird stretch thing as soon as you move around in street view. Just jump to the next frame if you have to.


Can be done yes. Is it the best dataset to do what you describe, no.


This is where we were going with https://ayvri.com but we've moved on to other projects. We still operate and are profitable, but it was too early for the market.


My first thought seeing this is: darn, Facebook, with their metaverse, will be drinking this up for content. So much so that I wondered, would I be shocked if Facebook/Meta made a play to buy Nvidia? It certainly wouldn't shock me as much now as it would have before this, given how they are banking on the metaverse/VR being their new growth direction, what with the leveling off of their current services' user base after well over a decade and a half.

Certainly though, game-franchise films would become a lot more immersive, though I do hope that whole avenue doesn't become same-ish with this tech overly leaned upon.

But one thing is for sure: I can't wait to bullet-time The Wizard of Oz with this tech :).


Dude, this isn't 2015 anymore - Nvidia has a higher market cap than Meta.

The days of Meta being a giant that can continuously buy out other companies to keep staying alive are coming to an end.


Is the example the result of just 4 photos? Or more? Are there any other data available, spatial data attached to photos for example?

Why don't they explain the scope of the achievement properly?

edit: I don't think it is just 4 https://news.ycombinator.com/item?id=30810885


Actually, it is explained in the article; I somehow missed it:

    The model requires just seconds to train on a few dozen still photos — plus data on the camera angles they were taken from — and can then render the resulting 3D scene within tens of milliseconds
Pretty impressive, but less impressive than generating it from 4 photos (which imho the video suggests), which would be "real magic" level of impressiveness for me.


I am kinda skeptical, AI demos are impressive but the real world results are underwhelming.

How many resources does it take to generate images like that? Is this the most ideal situation?

Can you take images from the web and based on metadata make a better street view?

With all this AI where is one accessible translation service? or even an accent-adjusting service? or just good auto-subtitles?


This is essentially like a 3D JPEG, but instead of modelling the image with Discrete Cosine Transform (DCT) they use a neural net. So the neural net itself will learn just that one image and be able to reproduce it point by point, from various angles. A whole network for just one example, and an innovative way to look at what a neural network can be.


as someone who works in both AI and filmmaking, I remember losing my mind when this paper was first released a few weeks ago. It's absolute insanity what the folks at Nvidia have managed to accomplish in such a short time. The paper itself[0] is quite dense, but I recommend reading it -- they had to pull some fancy tricks to get performance to be as good as it is!

[0]https://nvlabs.github.io/instant-ngp/


[flagged]


No, really, it's a cool application of neural nets. It was unexpected when it first came up and it took a whole day to learn a scene, but a couple of years later it can be done in seconds.

I think the cool part is that a neural net can learn to produce (R,G,B) from (X,Y,Z,angle) by using a clever encoding trick with sin() and cos() for the input coordinates. And the fact that a neural net can be a frigging JPEG in 3D.
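
For the curious, that encoding trick is just a positional encoding of the input coordinates before they hit the MLP; a hypothetical minimal version (the number of frequencies is an arbitrary choice):

    import math
    import torch

    def positional_encoding(x, n_freqs=10):
        # lift coordinates into sin/cos features at exponentially growing frequencies,
        # which lets a small MLP memorise high-frequency detail of a single scene
        feats = [x]
        for i in range(n_freqs):
            feats.append(torch.sin((2.0 ** i) * math.pi * x))
            feats.append(torch.cos((2.0 ** i) * math.pi * x))
        return torch.cat(feats, dim=-1)

    # e.g. a batch of 3D sample points becomes 3 * (2 * 10 + 1) = 63 features each
    encoded = positional_encoding(torch.rand(1024, 3))   # shape (1024, 63)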


You're missing this work in the context of the field as a whole. Labs have been releasing papers boasting 2-4x speedups and getting them published at conferences, and then this group comes in and speeds up the original by 1000x. That's a huge leap in capability.


> Creating a 3D scene with traditional methods takes hours or longer, depending on the complexity and resolution of the visualization.

This just isn't true. I can create a 3D scene from 360-degree photos (even 4) in a minute or so using traditional methods, even open-source toolkits.

It doesn't look as good as this because it doesn't have a neural net smoothing the gaps, but it's not true that it takes hours to build 3D information from 2D images.


I think it's comparing to crunching through many more images to get a comparable quality scene.


Are there examples of this being used on large outdoor spaces?


Yes, Waymo did the whole San Francisco block: https://waymo.com/research/block-nerf/


There's something very uncanny-valley about that video. I can't decide if it's the smoothness of the shading on the textures or if it's the way the parallax perspective on the buildings sometimes is just a tiny bit off. I don't generally get motion sickness from VR but I feel like this would cause it.


You'll find this is true of all NeRFs if you spend time playing around with them. If a NeRF is trying to render part of an object that wasn't observed in the input images, it's going to look strange, since it's ultimately just guessing at the appearance. The NVidia example in the link has the benefit of focusing on a single entity that's centered in all of the input photographs - the effect is much more pronounced in large scale scenes with tons of objects, like the Waymo one. You can still see some of this distortion in the NVidia one - pay close attention to the backside of the woman's left shoulder. You'll see a faint haze or blur near her shoulder - the input images didn't contain a clear shot of it from multiple angles, so the model has to guess when rendering it.


I know that when doing typical 2D video-based rotoscoping it is possible to use frames from before/after the current frame to see data that is being blocked in the current frame. It's also common in restoration when removing scratches/hair in the gate/etc.

To that end, I wonder if exporting a similar bit of video from that same path exported as stills would be enough to generate the 3D version.


The nerf already does this - the problem is when part of the object doesn’t appear in any of the frames.


Could just be the lighting on the reconstruction. It's not meant to be photorealistic. It's freaking nuts how they can do this.


Woah, this video is way more interesting than the Nvidia polaroid teaser in the original link.


Still, NVIDIA's achievement (and Thomas Müller's in particular) is amazing. Thomas and his collaborators achieved an almost 1000x performance improvement through a combination of algorithmic and implementation tricks.

I highly recommend trying this at home:

https://nvlabs.github.io/instant-ngp/

https://github.com/NVlabs/instant-ngp

Very straightforward and gives better insight into what NeRF is than any shiny marketing demo.


Waymo needed 2.8 million images to create that scene; I wonder how many Nvidia would need? Or was the focus only on speed? I skimmed the article and didn't really find info on that.


Waymo essentially trained several NeRF models for Block-NeRF that are rendered together. It's conceivable that NVIDIA's instant-ngp could be used for that.
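Roughly: partition the scene, train one NeRF per block, and blend nearby blocks' renders at query time. A toy sketch of just that compositing step (the real Block-NeRF also uses appearance embeddings and visibility prediction; render_fn and the inverse-distance weighting here are my own simplifications):

    import numpy as np

    def composite_blocks(camera_pos, blocks, falloff=50.0):
        """Blend renders from the NeRF blocks near the camera.

        blocks: list of (block_center, render_fn) pairs, where render_fn(camera_pos)
                returns an H x W x 3 image rendered by that block's NeRF.
        """
        images, weights = [], []
        for center, render_fn in blocks:
            dist = np.linalg.norm(camera_pos - center)
            if dist > falloff:
                continue  # block too far away to contribute
            images.append(render_fn(camera_pos))
            weights.append(1.0 / (dist + 1e-6))  # nearer blocks dominate
        w = np.asarray(weights)
        w = w / w.sum()
        return np.tensordot(w, np.stack(images), axes=1)  # weighted-average image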


What's new about this? That it's faster? People have been reconstructing 3D images from multiple photos for over a decade. The experimental work today is constructing a 3D image from a single photo, using a neural net to fill in a reasonable model of the stuff you can't see.


Five years ago, I used common software to do this. I had to take hundreds of pictures of a scene, capturing as many angles and details as possible, and then pass them all to the computer. Stitching it all together took well beyond 24 hours.

Once I had a 3D model of the scene, I had to spend countless hours cleaning it up to make sure it was usable. Maybe things have improved in the last 5 years.

But this demo used 4 pictures. And apparently, it rendered the final image in seconds. That's what's new.


Did it really use only 4 pictures? Do you have a source? https://news.ycombinator.com/item?id=30810885


If I understand it correctly, it didn't make a 3D model, though. So you can't extract and reuse the result; you can only move around in it, and it creates an image for that viewpoint. But no meshes or textures.
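You can bake a mesh out afterwards, though, by sampling the learned density field on a grid and running marching cubes (I believe instant-ngp even ships an export along those lines). A rough sketch, assuming you have some density_fn that queries the trained network:

    import numpy as np
    from skimage import measure  # scikit-image's marching cubes

    def nerf_to_mesh(density_fn, resolution=256, bound=1.0, threshold=25.0):
        """Sample the NeRF's density on a regular grid and extract an isosurface.

        density_fn: maps an (N, 3) array of points to (N,) densities.
        threshold:  density level treated as the surface; needs tuning per scene.
        """
        lin = np.linspace(-bound, bound, resolution)
        xs, ys, zs = np.meshgrid(lin, lin, lin, indexing="ij")
        pts = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)
        sigma = density_fn(pts).reshape(resolution, resolution, resolution)
        verts, faces, normals, _ = measure.marching_cubes(sigma, level=threshold)
        # verts come back in voxel units; rescale into scene coordinates
        verts = verts / (resolution - 1) * 2 * bound - bound
        return verts, faces, normals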


It's not just faster, it's dramatically faster. They're achieving results that are better than SOTA in a fraction of the time. Wildly impressive work.


This type of thing looks like the future of Meta or even Zoom to me.


So the part that makes this interesting to me is the speed. My new desire in our video-conferencing world these days has been to have my camera on but running a corrected model of myself, so I can sustain apparent eye contact without needing to look directly at the camera.



Is there a video of this? I'm not sure what the connection is to the top photo/video/matrix-360 effect.

Was that created from a few photos? I didn't see any additional imagery below

--- Update

It looks like these are the four source photos: https://blogs.nvidia.com/wp-content/uploads/2022/03/NVIDIA-R...

Then it creates this 360 video from them: https://blogs.nvidia.com/wp-content/uploads/2022/03/2141864_...


    four source photos
Is it just 4 or are there more?

I find it hard to believe there are only 4. There is clearly more data in the video.

https://i2.paste.pics/645fe17e418b2cb1f6179e0b6671a170.png shows, for example, the back side of the camera (it is kinda visible, but much poorer compared to the video), and the existence of a second white sheet in the background. But correct me if it is only 4 and you have a source for that.


Nvidia is really turning into an AI powerhouse. The moat around CUDA helps, and their target customers aren't as stringent about budget, especially when the hardware cost is tiny compared to what they do with it.

I wonder if they could reach a trillion market cap.


it's not a matter of "if", but "when" they will reach a trillion dollar market cap.


Well, I think that is a little optimistic in the near term, considering there has never been a semiconductor player reaching that milestone without a fab; Nvidia would be the first. Its current P/E is 70, while the semiconductor industry average is only ~30. Realistically, Nvidia will need to triple its revenue to get there at a fair P/E. I could see Data Center reaching $30B of revenue per year within the next ~5 years, and that is already larger than Intel's record data-center revenue. We are still $30B short counting Gaming and Professional Visualisation, and Intel already has their GPU play ready in 2022 (assuming it is competitive). i.e. Nvidia will need to find another massive market to conquer to reach that trillion-dollar market cap status.
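For what it's worth, the back-of-the-envelope version of that argument, with every number treated as a rough assumption rather than financial data:

    # All figures are rough assumptions taken from the comment above.
    target_cap = 1_000_000_000_000  # $1T market cap
    fair_pe = 30                    # assumed "fair" semiconductor P/E

    required_earnings = target_cap / fair_pe
    print(f"Net income needed at a P/E of {fair_pe}: ${required_earnings / 1e9:.0f}B")
    # ~= $33B of annual net income; if margins stay roughly constant, that is
    # broadly consistent with the "triple their revenue" estimate above.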


AI and 3D content creation are becoming so exciting. Soon we'll have an idea and be able to make it with automated tools. Sure, having a deeper understanding of how 3D works will be beneficial, but it will no longer be the entry requirement.


I know that taste in comedy is seasonal (yes, there were people in a time who thought vaudeville was the cat's pajamas), but has anyone ever greeted a pun with anything other than a pained sigh?


It's ones like this that make me shake my head and go "Aiaiai."


Puns aren't to make people laugh, the pained sigh is the point. It's schadenfreude for the person making the pun.


> It's schadenfreude for the person making the pun.

Nah, if it is a joke at their own expense then it is "self-deprecating humor", something which is definitely designed to get a laugh. Humiliation fetish, maybe? Obviously nothing is funny past a certain point of deconstruction... especially if you find yourself defending the distinguishing difference of the "meta". Just stop making puns, easy.


Idk, personally I find wordplay quite punny -- though I almost always try to greet the person who made the pun, and not the pun itself (they're abstract, inanimate concepts, pretty difficult to say "hello!" to)

:P


Watch Bob's Burgers. The whole show is basically puns. I chuckle.


In terms of practical use - is there a pipeline to use the NeRF 3D scenes in Unreal Engine? How many photos do you need on average vs photogrammetry? 50% less?


This is not a textured polygon asset; it's a neural field, so the directional data is stored inside a neural network, I think using spherical harmonics.
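If it helps to picture the spherical-harmonics part: view-dependent color can be stored as a few SH coefficients per point and evaluated in the viewing direction at render time (that's how Plenoxels-style fields do it; as far as I know instant-ngp instead feeds an SH encoding of the direction into the network, but the flavor is the same). A minimal sketch up to degree 1, with made-up coefficients:

    import numpy as np

    SH_C0 = 0.28209479177387814  # Y_0^0 constant term
    SH_C1 = 0.4886025119029199   # scale for the three degree-1 terms

    def sh_color(coeffs, direction):
        """Evaluate view-dependent RGB from spherical-harmonic coefficients.

        coeffs:    (4, 3) array -- 4 SH coefficients per color channel.
        direction: unit-length 3-vector (the viewing direction).
        """
        x, y, z = direction
        basis = np.array([SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x])
        return np.clip(basis @ coeffs, 0.0, 1.0)  # RGB in [0, 1]

    # A point whose color shifts slightly with viewing angle.
    coeffs = np.array([[0.6, 0.4, 0.3],   # constant (view-independent) term
                       [0.1, 0.0, 0.0],   # y-dependent term
                       [0.0, 0.1, 0.0],   # z-dependent term
                       [0.0, 0.0, 0.1]])  # x-dependent term
    print(sh_color(coeffs, np.array([0.0, 0.0, 1.0])))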


That's ok, we just need some level of integration with UE (being able to integrate with the UE camera, etc.). Specifically, I'm interested in using this for LED green screens that sync to a live camera movement (Mandalorian style). Our pipeline uses UE, Nvidia cards and photogrammetry/plates/3D models atm; this could speed things up a lot and require fewer photos for creating the 3D backgrounds.


Next time someone asks "why does everyone in AI use NVidia and CUDA?", this is why.

They do high quality research and almost inevitably end up releasing the code and models. It's possible to reconstruct all that as a non-CUDA model, but when you want to use it, why would you when it's going to take months of work to get something that isn't as optimised?


Comment related to top comment

I was talking to someone 2 days ago who just died randomly, in their early 40's. It's trippy: I have data of this person's face, e.g. videos/base64 strings... it's eerie. Unanswered texts wondering what's wrong. My thinking is that I was only exposed to a part of this person; it won't fully be them if reproduced.


I'm guessing that if you can "detect/recognize" an object in 2D space, you could guesstimate its "missing dimension", i.e. depth.

If you detect an apple in a photo, you could quite reliably guess how the back looks.

Still very cool :)


Is anyone else kinda terrified?


This NeRF project is cool too.

https://github.com/bmild/nerf

I've been trying to get GANs to do this for a while, but NeRFs look like the perfect fit.


Would this make "I Dreamed a Dream" from Les Miserables less moving https://youtu.be/RxPZh4AnWyk ?


I'm curious, for those who work with NeRFs, what their results look like for random images as opposed to the 'nice' ones that are selected for publications/demos.


IIRC Microsoft had something like this years ago, but the results weren't nearly as smooth or natural looking. I can't remember what it was called, though.


I remember a very old video where they rendered 3d scenes for frames 1 and 5 (examples) but interpolated the frames in between, or something like that, instead of re-rendering the whole scene. Maybe they were only redrawing some triangles if the angle changed too much.

If it's that tech, I'm pretty sure it got dropped when 3D acceleration made it feasible to just re-render the whole scene every frame. Dedicated hardware won out over software tricks.


Photosynth?


I'm probably going to ramp up the number of photos I take in the hope that Google Photos auto-applies this tech.


Google has done some experiments using people's very similarly framed images from different times to create timelapse videos, so that hints to me they definitely have the content to try.

There was an article here some time back that showed streets in NYC, using this kind of idea with older photographs to put one in a street view of older NYC. So, yeah, I'm guessing it could be done. It might be weird with the different quality of images (modern digital, polaroids, Kodachrome, etc.).


I'm gonna do the same, particularly dog pics. I'd love to bring my old dogs back to virtual life and have old spanky (died a decade ago) running with lilybean (current version) in iMovie.


> Blink of an AI

I know this post adds nothing, but that one's well worth being pointed out.


I'm really looking forward to this technology getting applied to home improvement.


Are they using four photos or more?


I don't think so but I don't have a source. Please someone correct me if I am wrong https://news.ycombinator.com/item?id=30810885


I'm guessing at least six photos, judging from the six tripods.

(They could be stands for lights, but like I said, just guessing.)


Nvidia is leaving us all behind


What's the current state of research on true volumetric displays? That's what I'm excited for, although that takes less AI and more hardware, so quite a bit more difficult.


If you have a graphics card, which is unobtainable.


There are still plenty of them for OEMs/prebuilts; it's just that people don't want to buy the extra PC around it, right?


Just want to say I appreciate the cleverness of the title.


ENHANCE. ROTATE.

I mean, obviously generated images can't be used as proof in a court of law, but this feels like we're slipping into crummy USA show territory.



