Father, Hacker (Information Security Professional), Open Source Software Developer, Inventor, and 3D printing enthusiast

  • 0 Posts
  • 6 Comments
Joined 2 years ago
Cake day: June 23rd, 2023


  • stolen copied creations

    When something is stolen, the original owner no longer has it. In other words, stealing applies to physical things.

    Copying is what you’re talking about, and this isn’t some pointless, pedantic distinction. It’s a real distinction that matters from both a legal/policy standpoint and an ethical one.

    Stop calling copying stealing! This battle was won by everyday people (Internet geeks) against Hollywood and the music industry in the early 2000s. Don’t take it away from us. Let’s not go back to the “you wouldn’t download a car” world.



  • My argument is that the LLM is just a tool. It’s up to the person who used that tool to check for copyright infringement. Not the maker of the tool.

    Big company LLMs were trained on hundreds of millions of books. They’re using an algorithm that’s built on that training. To say that their output is somehow a derivative of hundreds of millions of works is true! But how do you decide how much to pay each author for that output? They don’t have to pay for the input; only distribution matters.

    My argument is that it’s far too diluted to matter. Far too many books were used to train it.

    If you train an AI with Stephen King’s works and nothing else, then yeah: maybe you have a copyright argument to make when you distribute the output of that LLM. But even then, probably not, because the output isn’t going to be identical. It’ll just be similar. You can’t copyright a style.

    Having said that, with the right prompt it would be easy to use that Stephen King LLM to violate his copyright. The point I’m making is that until someone actually uses such a prompt, no copyright violation has occurred. Even then, until the output is distributed publicly, it really isn’t anything of consequence.


  • If we’re going pie in the sky I would want to see any models built on work they didn’t obtain permission for to be shut down.

    I’m going to ask the tough question: Why?

    Search engines work because they can download and store everyone’s copyrighted works without permission. Take away that ability and we’d all lose the ability to search the Internet.

    Copyright law lets you download whatever TF you want. It isn’t until you distribute said copyrighted material that you violate copyright law.

    Before generative AI, Google screwed around internally with all those copyrighted works in dozens of different ways. They never asked permission from any of those copyright holders.

    Why is that OK, but doing the same with generative AI is not? I mean, really think about it! I’m not being ridiculous here; this is a serious distinction.

    If OpenAI did all the same downloading of copyrighted content as Google and screwed around with it internally to train AI, then never released a service to the public, would that be different?

    If I’m an artist who makes paintings and someone pays me to copy someone else’s copyrighted work, it’s on me to make sure I don’t do that. It’s not really the problem of the person who hired me unless they distribute the work.

    If I use a copier to copy a book, then start selling or giving away those copies, that’s my problem: I would’ve violated copyright law. But is it Xerox’s problem? Did they do anything wrong by making a device that can copy books?

    If you believe it’s not Xerox’s problem, then you’re on the side of the AI companies, because the companies that make LLMs available to the public aren’t actually distributing copyrighted works. They are, however, providing a tool that can (sort of) do that. Just like a copier.

    If you paid someone to study a million books and write a novel in the style of some other author, you have not violated any law. The same is true if you hire an artist to copy another artist’s style. So why is it illegal if an AI does it? Why is it wrong?

    My argument is that there’s absolutely nothing illegal about it. They’re clearly not distributing copyrighted works. Not intentionally, anyway. That’s on the user. If someone constructs a prompt with the intention of copying something as closely as possible… To me, that is no different than walking up to a copier with a book. You’re using a general-purpose tool specifically to do something that’s potentially illegal.

    So the real question is this: Do we treat generative AI like a copier or do we treat it like an artist?

    If you’re just angry that AI is taking people’s jobs, say that! Don’t beat around the bush with nonsense arguments about using works without permission… because that’s how search engines (and many other things) work. When it comes to using copyrighted works, not everything requires consent.