Hacker News

This was the first AI thing to fill me with a feeling of existential dread.


What is with the hyperbole in this thread? This stuff sounds like incoherent noise. It is noticeably worse than AI audio stuff I heard 5 years ago. What is going on with the responses here?


I assume the stuff from 5 years ago was essentially spitting out MIDI output, which would then be fed into a traditional tool to play samples. So it's going to sound a lot sharper while being a lot less sophisticated. The real breakthrough here is that this generates everything from scratch and it still resembles the prompt.

One of the automated prompts was "Eminem anger rap". I'm confident that if you had shown me the audio without the prompt, I could have identified which artist it sounded like.

And this is just a basic first attempt at reusing a tool not even designed for audio. I can only imagine how powerful it could be after some trivial revisions like using GPT-3 to generate coherent lyrics.


I feel exactly the opposite way, but I suppose everyone has a different ear and taste. I think a good 3-4% of what this produces sounds damn amazing and beautiful. I've been vibing to it a lot. Fantastic stuff! There is also the feeling of shock and awe, like with ChatGPT, where you give it a prompt about a niche thing you think it will definitely not understand, and it turns out it understands it shockingly well. As an example, I just gave it the prompt "Avril 4th" and the result literally gave me chills.


Usage of an image generator to produce passable music fragments, even if they sound a bit distorted, is very surprising. That type of novelty is why we come here.


People did the same with GANs years ago, with similarly odd results. I do think the kinks will eventually be ironed out, but I don’t think this is it.



