If I listen to a bunch of music and at some point feel inspired and I create and produce my own original song, would it be in violation of copyright laws? I don’t think so, unless there was something in it that very closely resembled another artist’s copyrighted work, like sampling.
But times have changed. It is now very easy to go to a site like suno, type in a prompt describing the kind of music you want, and get decent results. I first played with this tech about a year ago (https://drawwith.ai/2024/01/04/discontent-with-the.html), and while I occasionally use it for gimmicky purposes, like creating a song for someone whose name is really hard to rhyme, or writing a song around a very specific phrase, I don't know that it has impacted the music industry much yet. But it does seem to have the potential to.
Generative AI is making a lot of hard things easy, such as producing a catchy song that isn't too painful to listen to. I bet if you took a song generated on suno and instead took the time to write, perform, and produce something like it yourself (assuming no direct copyright violation), there wouldn't be any legal issues with that.
But since this tool is so easy to use and produces something of decent quality, something of potential value, the assumption is that something was stolen. This may or may not be true. The AI model benefits from being trained on lots of other people's hard work. So is it different from a person creating these songs themselves in Garage Band or Logic Pro? I tend to think so. There should be compensation, since the model is creating something based on material that is not free. If the model had been trained only on sounds of nature (birds chirping, rocks falling, waterfalls crashing), there wouldn't be a problem with that, right? It's a tough question.
One approach that could alleviate this is a compensation model. But how would you take a generated song and find all the text, songs, and other content in the training set that influenced its generation? It feels like looking for a precious handful of needles in multiple silos full of haystacks. Maybe one approach is to embed the generated song into the same latent space as the tokenized elements of the training set (and their positions), then see which training content matches it most closely. I assume this is roughly how visual similarity search works. But this process probably has flaws. For one, how can we be sure the vectorization process is comprehensive enough to represent and relate similar parts or concepts in a musical work?
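To make the idea concrete, here is a minimal sketch of that kind of similarity ranking. It assumes you already have an embedding for the generated song and embeddings for each training item (how you produce those embeddings is the hard, unsolved part); the names and toy vectors below are made up for illustration. It just ranks training items by cosine similarity:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_influences(song_emb: np.ndarray, training_embs: dict) -> list:
    """Rank training items by similarity to the generated song's embedding."""
    scores = {name: cosine_similarity(song_emb, emb)
              for name, emb in training_embs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy 4-dimensional "embeddings" -- purely hypothetical stand-ins
training = {
    "song_a": np.array([0.9, 0.1, 0.0, 0.2]),
    "song_b": np.array([0.0, 1.0, 0.1, 0.0]),
    "nature_sounds": np.array([0.1, 0.0, 0.9, 0.4]),
}
generated = np.array([0.8, 0.2, 0.1, 0.3])

ranking = rank_influences(generated, training)
```

With the toy vectors above, "song_a" comes out as the closest match. A real system would face exactly the flaw raised earlier: the ranking is only as meaningful as the embedding space it's computed in.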
[update] It looks like DeepSeek has caused a bit of a stir in the AI world, and OpenAI has recently accused DeepSeek of "distillation" from its models, essentially taking OpenAI's content. Is that so different from OpenAI scraping as much of the internet as it could find without giving credit or attribution?
Wednesday January 29, 2025