Apple would have no problem implementing something similar.
It's the brand, mindshare, and music store/service lead generation that are more difficult to replicate. Why get rid of an icon that's already on everyone's phones and could be a funnel to Apple Music instead of Spotify?
It's odd that they didn't try to purchase SoundHound then. The company has more evolved tech, and also offers voice recognition services beyond just music through Houndify.
If they bought SoundHound for the tech to bake into their own service, they'd still be competing with Shazam. If they bought Shazam for the tech, SoundHound would just exist as an alternative. Seems like an attempt to buy the "name brand" to get the tech and inherently beat the competition at the same time.
Note: I've never heard of SoundHound, though, so it might be popular in some places. Shazam is the name brand of music recognition, to the extent of being a verb.
Workflow wasn't a popular Android application that sent users to a competing service, though. I could see the app living on on iOS with the Spotify integration stripped out, but I seriously doubt it has a future on Android.
This has the smell of a comment written by someone with limited real-world experience. Simply writing down the list of problems you would have to solve to build Shazam would take an entire afternoon.
Yes, deep neural networks have proven remarkably useful for machine perception, but you would still need to collect a colossal amount of audio data, fingerprint all of it, build a low-latency processing infrastructure for making inferences, and convince a hundred million people to install your software to feed you copious real-world training data that you can use to improve model performance.
> and convince a hundred million people to install your software to feed you copious real-world training data that you can use to improve model performance.
That's actually the easy part. You already have the music. Distorting it by superimposing background noise is really not difficult.
Lol. When you superimpose noise, the original data is still there. When you have an FM radio playing staticky, heavily compressed music through crappy speakers in an acoustically terrible store, captured by a terrible microphone, and then compressed again, a significant amount of nonlinear distortion has taken place. That is extremely hard to model, and you would have to model it (or have real data) to train a neural network. Neural networks are extremely hard to train without excellent data.
I mean, you can easily find thousands of hours of music online. Recording background noise is easy (just go to a random bar where they are not playing music). Now simply add the two signals (you can shift them randomly to generate more data). You can also add some linear filtering if you like (just imagine random settings of an equalizer for starters).
This should give you enough data to build a proof of concept at least.
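To be concrete, the augmentation recipe above (mix music with randomly shifted noise, then apply random linear filtering) is only a few lines. The sketch below is a minimal illustration, assuming clips are mono NumPy float arrays at the same sample rate; the SNR range and FIR taps are arbitrary choices for the proof of concept, not anything Shazam actually does:

```python
import numpy as np

def augment(music, noise, rng=None):
    """Mix a clean music clip with background noise at a random offset,
    then apply a crude random 'equalizer' (a short random FIR filter)."""
    if rng is None:
        rng = np.random.default_rng()
    # Random circular shift of the noise to generate more variety.
    noise = np.roll(noise, rng.integers(len(noise)))
    # Mix at a random signal-to-noise ratio between 0 and 10 dB.
    snr_db = rng.uniform(0, 10)
    gain = np.sqrt(np.mean(music**2) / (np.mean(noise**2) * 10**(snr_db / 10)))
    mixed = music + gain * noise[:len(music)]
    # Random linear filtering: a short FIR filter with random taps
    # stands in for "random settings of an equalizer".
    taps = rng.uniform(-0.2, 0.2, size=8)
    taps[0] = 1.0
    return np.convolve(mixed, taps, mode="same")
```

Run this over every (clip, noise) pair with a few random seeds each and you have a labeled training set, for whatever a linear-only distortion model is worth.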
Illegally grabbing thousands of hours of music to train a commercial model hardly qualifies as fair use. Any company you build upon that would be tainted.
For sustaining:
In addition, you'll need to keep an updated catalog of music to identify new songs against, and most uses of a service like Shazam are to find the names of songs people aren't familiar with, so that catalog needs to be very fresh.
That means you'll have to grab some sort of feed, and either engage in large-scale music piracy for commercial gain or have access to a library of songs from many disparate music providers, such as ASCAP.
Background noise:
There are literally hundreds of different background-noise environments you need to train against, dozens of common microphone configurations, plus clipping and gain variations.
It's very much a problem where a proof of concept is neat but doesn't really get you anywhere.
Also, I'm not saying it's impossible or not worth doing (obviously it's possible and worth doing), just that a few minutes of thinking and Hacker News comments will hardly touch the breadth of difficulties involved in getting this to work even somewhat reliably.
Shazam doesn't actually let you improve the answer or report an incorrect guess. They're that confident in their results, even though it sometimes completely misses the genre and style of the music.
I'd be more curious to see you try to build this in an afternoon.
Also, it does a lot more than just find "slightly distorted" versions. It can catch a song in a noisy room where you can barely make out the song to begin with. A couple of months back it found a song while a very loud crowd was yelling over it. It's also able to tell versions of songs apart pretty well, and some remixes sound very close to the original.
Another thing you might be missing is just how fast it is, even on a slow mobile connection.
This is a heap of nonsense. Not only does this summarily dismiss the enormous challenges in digital signal processing required for removing arbitrary background audio, it exposes some confusion associated with the ideas of correlated random variables, inner products, and affine transformations.
Occasionally, a tiny percentage of the time, when a successful company makes an N-hundred-million-dollar purchase of a technology and a company and you don't understand why, it's because they have made a mistake.
The smart money, though, is on the main chance: you don't understand the purchase, or the problem domain, or both.
In this case I think you are overestimating the progress in NN and search, and underestimating the signal processing. Have you tried this with any significant corpus?
"Whack it through an FFT and do correlation" seems like one of the obvious solutions to the toy version of the problem, but this is exactly the sort of thing that usually falls apart in practice.
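To spell out what that toy version looks like (and why it's fragile), here's a hypothetical sketch, not anything Shazam-like: fingerprint a clip as its average magnitude spectrum, then pick the best-correlated catalog entry. It works fine on clean tones and falls apart under the real-world nonlinear distortions discussed upthread:

```python
import numpy as np

def spectral_signature(clip, frame=1024):
    """Toy fingerprint: the average magnitude spectrum over fixed-size frames."""
    n = len(clip) // frame * frame
    frames = clip[:n].reshape(-1, frame)
    return np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)

def match(query, catalog):
    """Return the catalog key whose signature correlates best with the query.
    `catalog` maps names to raw sample arrays."""
    q = spectral_signature(query)
    def corr(sig):
        return np.corrcoef(q, sig)[0, 1]
    return max(catalog, key=lambda name: corr(spectral_signature(catalog[name])))
```

Note that this throws away all timing information and degrades quickly with codec artifacts, room reverb, and microphone nonlinearity, which is exactly the point.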
Building the service is not usually the hard part, but building the ecosystem around it is. There were/are countless services similar to Facebook or Twitter, but only a few of them can be really successful because of herd mentality.
Thanks. This actually proves my point that the core concepts of Shazam can be implemented in a weekend. Of course, programming the front end etc. is more work, but that is beside the point.
> What's not straightforward is recognizing cover songs and the like. But that's not only non-trivial but AFAIK can't be done.
Well, you could translate the music into actual notes (or musical intervals), and use Smith-Waterman (or any more advanced and more recent technique) to find the song with the lowest edit-distance.
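As an illustration of that idea, a minimal Smith-Waterman scorer over note or interval sequences might look like the sketch below. The scoring constants are arbitrary, and real cover-song matching would need far more robust features than exact interval equality:

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local-alignment score between two note (or interval) sequences.
    A higher score means a longer, better-matching shared passage."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # DP matrix, floored at zero
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
            best = max(best, H[i][j])
    return best
```

Using intervals (differences between consecutive notes) rather than absolute notes makes the score invariant to transposition, which matters for covers.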
Yes, you can look at the frequency with the highest intensity in the FFT. This is the "dumb" version of converting music to notes (and is what I really meant, but didn't spell out for the sake of brevity).
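The "dumb" version really is a few lines, e.g. this sketch (assuming a mono NumPy array; no windowing or peak interpolation, so it's only accurate to within one FFT bin):

```python
import numpy as np

def dominant_pitches(clip, sample_rate, frame=2048):
    """Crude note extraction: take the peak-magnitude FFT bin of each
    fixed-size frame and convert it to a frequency in Hz."""
    n = len(clip) // frame * frame
    frames = clip[:n].reshape(-1, frame)
    spectra = np.abs(np.fft.rfft(frames, axis=1))
    peaks = spectra.argmax(axis=1)
    return peaks * sample_rate / frame
```

Mapping each frequency to the nearest MIDI note (and then to intervals) would give the symbolic sequence the edit-distance idea above needs.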
The thing is that the process looks for spectral patterns, let's call it "harmonic content per unit of time," not just notes. Mere notes would result in lots and lots of false positives.
Let's just agree that the process is not too far removed from my initial brief description, and should be simple to implement, as the article shows. For any competent signal processing engineer, this should all be evident, which was the main point.
Also, even if you have many false positives, you have already narrowed down the search, and this allows you to do more brute-force searching like computing cross-correlations.
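As a hypothetical sketch of that brute-force second stage: once a handful of candidates survive the cheap fingerprint filter, you can confirm a match by looking at the normalized cross-correlation peak (this assumes raw sample arrays; a real system would correlate features rather than waveforms):

```python
import numpy as np

def best_offset_score(query, candidate):
    """Peak of the normalized cross-correlation between a short query clip
    and a longer candidate track; near 1.0 means the query appears in it."""
    q = (query - query.mean()) / (np.std(query) + 1e-12)
    c = (candidate - candidate.mean()) / (np.std(candidate) + 1e-12)
    # Slide the query over every offset in the candidate ("valid" mode).
    xc = np.correlate(c, q, mode="valid") / len(q)
    return float(xc.max())
```

This is O(len(candidate) * len(query)) per candidate, which is exactly why you only run it on a shortlist.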
Where are you going to get 'the music'? There are millions and millions of hours of music out there, how are you going to gather and fingerprint it all?
Y'all realise they run a music service too, right? Having access to cross-platform data that gives them insight into bleeding-edge emerging/trending artists and songs is priceless.
Huh. I thought Apple completely killed Spotify. I personally had to switch because artists started doing exclusives with Apple only, and I really couldn't justify Spotify over Apple Music, even though I much preferred Spotify's experience. I've also noticed Apple's catalogue is much larger than Spotify's. Is Spotify getting better? I'd love to switch back.
I haven't used Apple Music before, but Spotify's ML recommenders are really impressive, and most of my favorite songs and artists were recommended to me via its "Discover Weekly" playlist. Its apps are super slick (cross-device play/control is really handy!), especially compared to iTunes.
Just buy a month of premium ($10/month or $5 if you're a student) and try it.
Apple Music has recommendation playlists as well: New Music Mix, Favourites Mix and Chill Mix which all get updated weekly based on your likes/dislikes and existing music collection.
I actually find Spotify apps to be far worse than iTunes at least on iOS. And the Apple Watch app for Apple Music is really impressive.
Apple's recommendations can be ridiculous, though. When I go to the "For You" tab in their iOS app, here is what is shown as I scroll down:
1. Favourites mix - i.e. the music I've played the most
2. Recently played - i.e. the music I've played recently
3. Tuesday's Playlists - the first of any real recommendations so far, but 4 of the 9 album covers it shows in the thumbnails are music I've played recently
4. Heavy Rotation - i.e. music I've played a lot, but not just recently
5. Tuesday's Albums - recommendations based on an artist (Waxahatchee) I've listened to
6. Artist Spotlight Playlists - a selection of playlists, including "Influences" and "Inspired By" playlists by artists that I don't listen to and are really unrelated to most of my collection.
7. New Releases
Finally, there's the wordy stuff I don't care about: social media posts.
Most of this stuff is not even bad ML (like the "Amazon recommends me vacuum cleaners because I searched for and bought a vacuum cleaner" problem); it is just literally showing me what I've listened to. I've tried the recommended playlists a handful of times and they don't really show me many new things; they remain pretty much unchanged over the weeks that I check them.
When you throw in the fact that they periodically delete all of the music I've downloaded, and nuked a chunk of my music collection after I signed up ... I have to say, my experience of Apple Music overall is pretty terrible.
Just as a single counterpoint: Spotify has Mixtape of the Week and two similar auto-generated playlists, and the songs in there are at best vaguely related to what I listen to on Spotify - I haven't found anything interesting in there.
The albums I haven't listened to in a while and might want to listen to again according to the app are those I listen to daily.
New releases are not sorted or filtered by genre, so I guess it is great that some pop or reggae artist has a new album out when I only listen to metal on Spotify?
Etc., etc. In other words, I use Spotify for totally unrelated reasons and switched from Google Play Music, but it has all the same faults; it just works better for some part of the target group. It is in no way perfect, or even good, with regard to its recommendation engine either.
Yeah, I agree with this. Spotify is a good experience, save one: their security is shit. After having my account hacked for the umpteenth time, I finally threw in the towel, since they clearly are not interested in securing their damn service, and jumped to Apple Music. The UI is not as good as Spotify's, but I don't worry about my account being hacked every week, and Apple Music's catalog is bigger. My wife found some obscure-ass Pakistani tune she had loved since she was a kid in Delhi. That was hardcore. Spotify never had much desi stuff.
I hope Spotify fixes their security woes. Either way I have no reason to leave Apple Music now.
Taylor Swift is back on Spotify; she's the only artist I ever noticed missing (and she's on Apple Music too).
Personally I think Spotify's recommendations, radio stations, and app (both mobile and desktop) are just more pleasant to use than Apple Music and iTunes.
For me it remains amazing and has more of the music I like than Apple, and its recommendation algorithms blow Apple's out of the water. I guess it depends what music you're into. Third-party integrations are also miles ahead of Apple Music's, such as the ability to use Amazon Echo and Fire TV as output devices.