Using AI this way seems like the equivalent of inventing spam for music to me. It's interesting, but basically we've figured enough out about the entropy and shape of an honest signal that we can automatically produce noise against it that cancels meaning, or subverts it.
The difference between simulating music and using formalisms to make discoveries about it seems like a matter of intent. As in, what's the difference between a horizon, a window viewing one, and a picture of one? The existence of an observer makes them related, and the position of the observer makes them different. That difference is probably analogous to what simulated music is to intentional music.
The existence of a listener makes it music, and the relationship of the listener makes it different.
These AI generated images are like cancelling-noise to images we already associate meaning with, which seems analogous to 1/f fractal or Perlin noise at a certain level of abstraction. Not to dismiss or trivialize the accomplishment at all, but when people create tech that is overwhelming to the senses like that, sometimes a new frame of reference can help.
If you like that, you can look up other music videos directed by Michel Gondry. He has incredible creativity and a DIY ethos that really sets him apart.
The film "Be Kind Rewind" is kind of a "mise en abîme" in that regard: directed by Gondry and featuring characters that show the same kind of cinematic craftiness.
The different elements in the scenery represent various sounds and instruments in the song.
Another of Gondry's videos that did a similar thing was Daft Punk - Around the World (https://www.youtube.com/watch?v=LKYPYj2XX80) where each of the different types of "people" represented different sounds in the source track.
I think they did a good job choosing a scene that is somewhat easier for CGI: artists have struggled for a long time to make acceleration and general changes of momentum line up with the way that objects do so in real life. It also makes the motion blur shader more or less a constant which is easy to simulate, and our brain fills in the rest :)
Star Guitar is from 2002, which was well into the era of "cheap computers are good enough now that we can manipulate each pixel" (after all, Terminator 2 and Jurassic Park were made in the early 90's, and music videos usually didn't need "cinematic" image quality because they were broadcast over TV).
If you look closely at the Star Guitar video, you can find "clipping artefacts" where the fragments are pieced together or fade in and out.
Well it's just a few minutes, but definitely the most impressive few minutes of the movie ;) AFAIK most scenes where dinosaurs are jumping and running around were computer-rendered.
Remember that scene where the protagonists are running with the dino swarm around them? The paths of the dinosaurs were laid out in the real world with tennis balls, and back then we wondered "wow they must have great image processing code to remove all those tracking markers". Turns out it wasn't algorithms but humans who removed them frame by frame, pixel by pixel (it didn't occur to us computer nerds that brute force makes the most sense if you're on a tight schedule).
I recall seeing a behind-the-scenes clip somewhere (perhaps on the Directors Label series that came out in the early 2000s) where Gondry talks about his brother piecing this together from video footage they'd captured. I can't find that clip but came across this very early exploration towards the final product: https://youtu.be/GF0-wGbRqEs
The first one is very similar because the syncopated bass is anticipated by a sixteenth note within the first beat of each measure. I guess that's common in the style.
This is scary, and what I feared would eventually happen. How long before an AI, trained on all of music, is better than every music producer at knowing how likeable something is? How long before fake influencers on IG and TikTok completely dominate by having more addictive personalities and videos to follow than real people?
There are two fallacies I think you've touched on here. One is that there is an "objectively best" music. Music is so tightly coupled to cultural movements, so subjective, and so broad in scope that it just can't be the case.
The second I feel is that people would still be interested in AI music in a meaningful way. I think AI music has a following now because it's interesting and novel, and that gives it a story, but once it's mainstream, that story is boring and people will go back to seeking real culturally relevant music.
Sure, the music on TikTok/IG and friends could be generated, but I'm not sure I care too much. Those platforms are almost entirely vapid and devoid of authenticity already; the music being fake too doesn't detract that much.
> one is that there is an "objectively best" music.
Case in point, I enjoy some shit. No way AI will replace lo-fi recorded-in-underground-bunker limited-cassette-tape-release atmospheric black metal / noise.
I mean it might, but here's a take from a different angle: art is not just the end result, it's the process and the story behind it. Nobody objectively gives a shit about a selfie from the early 1500's, but because of its provenance the Mona Lisa is considered one of the most valuable pieces of art out there. Nobody gives a shit about a database row with an ID, timestamp, user ID and the text "just setting up my twttr", but because it's the first tweet and it's put on a trading platform and it might appreciate in value, someone spent $2.5M on it.
You slag off TikTok because every generation will slag off whatever the following generation(s) do; that's normal. But the generation following you sees it differently, and in two decades they will still reference some of the more iconic clips they've seen. I mean, I do it with really bad movie voiceovers from nearly 20 years ago, as well as terrible porn intros. It's part of culture.
Anyway, there will be a place for AI-generated anything alongside the handcrafted stuff. Like how there is still a market for handcrafted goods alongside mass-production machines. Or hand-drawn art in the age of digital. Or physical valuables in the age of electronic money and cryptocurrencies.
I am old enough to remember when people were afraid that Muzak would take over.
It did not.
I was at a Chinese restaurant, around a year and a half ago, and realized that the music they played was a sort of “ersatz” music. It was familiar tunes, like Scarborough Fair, or Hotel California, slightly modified, and strung together. Vocals were basically wordless humming.
There is an entire industry, based on musicians, providing “stock music,” and it’s been around for a long time. Sort of a musical equivalent of Shutterstock. Some of this music is quite good. Most is fairly boring.
Music is way too connected to our emotions to be reliably synthesized. AI would need to advance to be able to produce emotions and creativity, before it would threaten the music industry.
Also, there’s always the “tabloid” aspect of the industry. There are artists that may be fairly unremarkable musicians, but generate a lot of press coverage. Unless the world of Questionable Content becomes real, I can’t see any Star headlines of AIs beating up paparazzi.
I would argue that a lot of this "stock music" is just a continuation of elevator music, which has been around for a very long time. There are musicians (often very good ones) who essentially made their whole careers doing this kind of music; look up James Last, for example.
Always seemed like the modern analogue to the older practice of artists only being able to make a living if they could secure patronage. The days of (semi) direct payment for physical copies of recorded media are starting to seem like an aberration rather than the rule.
> I think AI music has a following now because it's interesting and novel, and that gives it a story, but once it's mainstream, that story is boring
I'd have agreed 10 years ago, but I swear every time someone drives past with their stereo bumping, it's "robot music" (as I smart-assedly refer to heavily quantized/autotuned vocals). I guess it will probably still die out eventually, but that fad has been a lot longer-lived than I'd expected.
I don't think we need "objectively best" here—merely "subjectively best". And because each person could have their own AI DJ trained on an arbitrarily rich set of preferences and experiences from their one-person audience, we should expect that AI DJ to be subjectively the best for their respective human.
Obviously, this overlooks the shared experience element of music, but that's not so relevant in the case of online music we consume solo.
Maybe we call them dopamine melodies instead of music to underscore the effect rather than the art. The fact that most synthetic music today is hot garbage doesn’t convince me that we couldn’t connect with it in a meaningful way in the future. After all, things like GPT3 are more like talking to a hive of humans than a single alien.
I'm convinced tons of this stuff already exists as things like youtube channels of 'Best Chill Piano Ambient'. I think it's become impossible to tell neural-net generated stuff from the output of humans trying hard to define a functional genre, especially one that's effectively 'background' rather than didactic or challenging.
That's not even a close comparison. Humans make CGI, it's still directed and created by humans. The real comparison would be seeing if people would accept that movies aren't written and directed by humans anymore.
Music outside of pop essentially gains popularity through grassroots popularity contests. In the Top 40, people care way more about the personalities than the music; you could probably sneak some AI music in there, but you would still need the singer to sing.
Music is very often a cult of personality thing, even outside of pop. In those arenas authenticity is really important, someone not even making their music would be outed pretty quick.
Some examples of what I mean:
People go to clubs to hear DJs play, because they perceive the DJ to be generating the atmosphere. You couldn't get people to pay a cover charge to see that DJ's mix play without them present, even though it's technically very easy and would be exactly the same music.
Similarly, you can't get people to go to a concert of a recording of a band, that would be lame. People want the band.
Yes, a lot of people don't get this. Pop is as much about fashion and branding as sound, and an AI would need a virtual personality to get anywhere with that.
Which will likely happen within 10 years or so. Hatsune Miku is already a thing, and a few upgrades from now she'll probably appear to be running her own post-TikTok account.
J- and K-pop boy/girl bands are already run like this. The individual artists are more or less interchangeable and can be dropped at any time if they lose their looks or appeal. They don't get much/any creative freedom, and the performances are all externally choreographed.
It wouldn't take much to create a virtual version. Give it some virtual sass and you have a tame monster.
William Gibson's Neuromancer (1984) features a Jamaican space tug where the "righteous dub" played within is automatically generated by remixing a vast library of existing music. I was fascinated by the concept when I first read about it as a budding DJ, and it's kind of exciting to see it come to fruition, even if the single tune this particular AI seems to play is pretty insipid.
That is something that would be of interest to those working with splitting audio into 'stems' (drums, bass, voice, music). Algoriddim djay can do a version of that 'on the fly'. Serato Studio is a kind of remix 'on the fly', but there are many others. Both have automix settings that work very well, especially if you line up similarly grouped tracks.
Yeah, I've not used the "auto" features to that extent, but I've DJ'ed at a few events and festivals just for fun and it's super convenient to have software that has analyzed my music library for BPM and key. I still get to play whatever I want to play, but if I'm drawing a blank or just looking for inspiration, I can sort by all tracks within "n" BPM of the current track and in the same/compatible key.
Then if I really want to play something that wouldn't normally fit with an easy transition, I can always adjust the key or tempo on the fly. There are limits before it starts to sound weird, but that can also be a neat way to mess with a song: add a riff or a phrase from one song into the current one, but only because I'm able to halve the speed or chop up the time signature to match.
And that's just me as a barely fluent hobbyist. Experienced DJs can do some great stuff. Even a non-"mixing" DJ like a radio DJ can string together songs based on thematic connections, sneaky relations between artist or lyric or even the conditions in the room. Those can be a lot of fun and would be a lot harder to automate than simple beat matching.
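The BPM/key filtering described above can be sketched roughly like this. This is a hypothetical illustration, not any particular DJ software's API: the track objects, the `bpmWindow` default, and the Camelot-style key notation (1A-12A minor, 1B-12B major) are all assumptions.

```javascript
// Keys in Camelot notation; adjacent wheel positions and the same
// position's A/B pair are conventionally considered harmonically compatible.
const compatibleKeys = (key) => {
  const num = parseInt(key, 10);          // e.g. "8A" -> 8
  const mode = key.slice(-1);             // "A" (minor) or "B" (major)
  const wrap = (n) => ((n - 1 + 12) % 12) + 1;
  return [
    `${num}${mode}`,                      // same key
    `${wrap(num + 1)}${mode}`,            // one step up the wheel
    `${wrap(num - 1)}${mode}`,            // one step down the wheel
    `${num}${mode === "A" ? "B" : "A"}`,  // relative major/minor
  ];
};

// Return tracks within `bpmWindow` BPM of the current track, in a compatible key.
function suggestNext(library, current, bpmWindow = 6) {
  const keys = new Set(compatibleKeys(current.key));
  return library.filter(
    (t) =>
      t !== current &&
      Math.abs(t.bpm - current.bpm) <= bpmWindow &&
      keys.has(t.key)
  );
}
```

The bitwise detail is in `compatibleKeys`: real software would also handle pitch-shifted "adjusted" keys, but the core filter is just this intersection of a tempo window and a key set.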
I encourage you to read up on "Doctorin' the Tardis" and "The Manual", as well as "I wanna 1-2-1 with you".
These were productions from the artists behind the KLF that exposed how formulaic chart hits were. So I would not be surprised if we see AI produce Top 40 hits sometime in the future. However, I would be very surprised if AI could replace the vast majority of music that is out there.
How do you know we're not already at that tipping point, if not there then in other forms of influencer 'culture'?
Of course you're right, but I'll tell you what comes next.
Redefine 'dominate'. All that has happened is we've shown mass media, mass popularity, is better suited to the unreal. Humans need not apply here.
'Mass' anything, then gets less interesting as it is self-evidently a dead-end. And not 'everybody', but significant numbers of people, pursue something else, perhaps things that are unlikeable in an interesting way. The differences will always be less 'addictive' than the 'mass' stuff, but will overperform in other, definable ways.
As someone who has deleted Twitter and Facebook I think I'm correct in this notion that the most addictive, most 'mass' media sources are not 'good' in any normal human sense other than manipulation of that very instinct in all its forms…
When it gets good I do think this will decimate the industry that cranks out generic electronic music for YouTube videos. (Does that count as an industry?) That aspect is not so bad perhaps.
THE FUTURE:
Your phone will have an exploratory music feature. You dial in all sorts of weights... genre, tempo, density, melody, etc. You'll crank out a custom instrumental playlist at the tap of a button. This will probably make for great study/work music.
When it eventually overshadows normal instrumental music it's going to be a giant bummer for anyone hoping to make money doing it. Artists are going to have a lot of soul searching to do.
That's not how music works. My bet is we will see crappy music for ads replaced with AI generated stuff, but NFTs etc may actually mean a much more interesting market for real musicians moving forward..
I remember reading 10 or 20 years ago that dj/rap turntables had outsold guitars in the UK for the first time that year. That was my "wow, people don't play music any more" moment...
Considering I like metalcore, bagpipes, Enya, Bruno Mars, Pantera, The Blues Brothers/Chicago blues in general, Daft Punk, Monty Python songs, Loreena McKennitt, Led Zeppelin, Bach, ...
I absolutely doubt that any AI will ever produce something I genuinely like. Maybe some features in isolation, hopefully not multiple genre features in some unholy mish mash...
Don't forget that playing a musical instrument or watching crowds move to your beat is an incredible experience that is not going to be enjoyed by artificial systems, at least anytime soon.
The public personas of celebrities were always unrealistic and unattainable for ordinary people. Maybe replacing them with explicitly fictional people would lead to a healthier culture.
If it's made by AI then it has no inherent value. There was no labour associated with whatever AI produced. Open source AI trained on public models is the ultimate "hold my beer". What's the point of paying for anything when my old GTX 760 could generate it for me?
It's actually good music. The visual parts match the music pretty well, too (though it makes me anxious if I look at it longer than 3 seconds).
I do wonder, though, whether it just overfits and reproduces the original music with some minor tweaks.
Another interesting point would be knowing whether this music could be sold cheaply without legal troubles (or even distributed royalty-free). Most of the music I heard is pretty neat; I could imagine using it in YouTube videos, or as podcast jingles.
> though it makes me anxious if I look at it longer than 3 seconds
I've read that people with Trypophobia can often have similar reactions to these GAN type generated images. You may want to look into seeing if you have a mild case of it.
I don't have it to the point of phobia, but I do find images of multiple tiny holes (especially with eye-looking things inside) in animals and plants disturbing.
I think it is indeed a natural response. Animals/plants with lots of tiny holes look diseased to me.
By the way, if you do have this to the point of phobia, don't look at images of the Surinam Toad, whose young grow out of holes in the mother's back. Really, it's nightmarish.
Not afraid, just triggered, disgusted and uncomfortable (I get goosebumps), plus the image stays burned in my mind for hours, constantly triggering me, kind of like when you see a graphic image.
Can someone explain why this is impressive? Seems easy to correlate music features to visuals. It's been done for decades. What's special about this one?
I would love to see this used with music that's already extremely "algorithmic" and aesthetically lends itself to a computer-created process... IDM stuff like Access to Arasaka[0], Autechre[1], Qebrus[2] etc. Can I just take an artist's entire discography (or that of a few artists) and generate comparable material? Ahhhh can I have "more music" from my favorite artists!??! hehehe :)
This makes me uncomfortable, but also very intrigued. Something about the visuals "tickles the brain" but not in any particular way I can reason about. Cool!
I tried this recently but it's possible to get all the way to putting your card details in without it ever telling you how much it's going to charge you. I closed it.
What do you mean? Normally you should see the price at the start of checkout.
We are improving some stuff though:
- Better communication of what's included when you upgrade your project -> DONE
- Explain pricing on other pages as well, like home page -> TODO
Is there an explanation of how it works aside from just "It's a GAN"?
I'm guessing it trained a GAN on video frames and is navigating around the latent space (at random?) synchronized to the music, like the transition animations at https://www.thisfuckeduphomerdoesnotexist.com/ . I'm curious if there's something else to it.
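If that guess is right, the "navigation" part could look something like this minimal sketch: a random walk through a generator's latent space whose step size is driven by the audio's loudness envelope. The generator itself is hypothetical; this only produces the sequence of latent vectors you would feed it, one per video frame. The dimensionality and the 0.1 step scale are assumptions.

```javascript
const DIM = 512; // typical StyleGAN-ish latent size (an assumption)

// Box-Muller transform: one standard-normal sample from two uniforms
function randn() {
  const u = 1 - Math.random(), v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// envelope: one loudness value in [0, 1] per frame (e.g. per-frame RMS)
function latentWalk(envelope, dim = DIM) {
  let z = Array.from({ length: dim }, randn); // random starting point
  return envelope.map((loudness) => {
    // louder audio -> bigger step -> faster morphing on screen
    z = z.map((x) => x + 0.1 * loudness * randn());
    return z.slice();
  });
}
```

Beat-synchronized versions presumably do something similar, but jump toward a fresh target vector on each onset instead of drifting continuously.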
It started with just noise, but a bit later there was actual music, which was very impressive if indeed synthesized out of 'nothing' by a neural network: a chord progression something like I-V-IV-vi, a simple melody based on the same progression, a simple bassline...
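For reference, that progression spelled out as notes, assuming C major, root-position triads, and MIDI note numbers with middle C = 60 (the key is an assumption; only the Roman numerals come from the comment above):

```javascript
const C_MAJOR = [60, 62, 64, 65, 67, 69, 71]; // C D E F G A B

// Build a triad on a 1-indexed scale degree by stacking thirds.
function triad(degree) {
  const pick = (step) => {
    const i = degree - 1 + step;
    // fold scale steps past the octave back up by 12 semitones
    return C_MAJOR[i % 7] + 12 * Math.floor(i / 7);
  };
  return [pick(0), pick(2), pick(4)]; // root, third, fifth
}

// I-V-IV-vi in C major: C, G, F, A minor
const progression = [1, 5, 4, 6].map(triad);
```

The vi chord is what gives the loop its slightly wistful turn; the rest is the most common diatonic material there is, which fits the "simple melody based on the same progression" observation.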
I had lots of fun taking images from https://thispersondoesnotexist.com/ and feeding that to wombo.ai (converts static images of people to them singing a song)
It would be cool if videos like these can be generated in realtime to replace music visualizers like waveform or spectrograms. Even if it has no utility as a real visualizer, it is still nice to watch!
Those visuals are badass. I know the nn was trying to replicate “realistic” images, but this semi-organic nonsense vibe would actually be awesome at a concert.
a GAN for music videos, in the same vein as thispersondoesnotexist.com and thisworddoesnotexist.com - here's an index of them https://thisxdoesnotexist.com/
I'm asking if anyone here knows anything about it beyond what's implied in the URL (I'm aware of the other websites with similar URLs). Things like: how was it made, what are the inputs, is it generated on the fly each time you go to the page?
Beautiful, almost every frame is a perfectly plausible and very tasteful abstract painting. But it seems a bit obsessed with cars, not a very meaningful image. Would it be possible to nudge the net to other concepts?
It's picking a random video out of 14:
${urls[urls.length * Math.random() | 0]}
With the list loaded as an asset:
urls = ["Spu3eiOEJ-M", "BE2lZ-Ti1Wc", "lHpcYPfjiLs", "6w2WXRFJpAE", "jPBgu-IO6TI", "g4px8cFR3gc", "whD78YCQXoo", "4iN9738uASY", "2tSa701EftM", "hjinBNYEkb8", "jckJS8RNMbw", "gq5EQtSiJiE", "zhV3ecScgrA", "wVnt_CX0C64"]
Whatever I viewed, frequently morphed into car shapes, with an apparent emphasis on frontal and rear portions. A sort of pseudo intellectual ADHD pr0n vid for kinky vehicle computers, or very peripherally specialized humans.
There are only 14 videos linked to from the website (see my other comment on this thread). The video IDs could be seeds for generation, or some other identifier besides sequential numbers.