We Asked A.I. to Create the Joker. It Generated a Copyrighted Image.

Artists and researchers are exposing copyrighted material hidden within A.I. tools, raising fresh legal questions.

  • dragontamer@lemmy.world
    9 months ago

    Because this proves that the “AI”, at some level, is storing the data of the Joker movie screenshot somewhere inside of its training set.

    Likely because the “AI” was trained upon this image at some point. This has repercussions with regards to copyright law. It means the training set contains copyrighted data and the use of said training set could be argued as piracy.

    Legal discussions on how to talk about generative AI are only happening now that people can experiment with the technology. But it’s not like our laws have changed: copyright infringement is copyright infringement. If the training data is obviously copyright infringement, then the model must be retrained on more appropriate data.

    • abhibeckert@lemmy.world
      9 months ago

      But where is the infringement?

      This NYT article includes the same several copyrighted images and they surely haven’t paid any license. It’s obviously fair use in both cases and NYT’s claim that “it might not be fair use” is just ridiculous.

      Worse, the NYT also includes exact copies of the images, while the AI ones are just very close to the original. That’s like the difference between uploading a video of yourself playing a Taylor Swift cover and actually uploading one of Taylor Swift’s own music videos to YouTube.

      Even worse, the NYT intentionally distributed the copyrighted images, while Midjourney did so unintentionally and specifically states that it’s a breach of their terms of service. Your account might be banned if you’re caught using these prompts.

      • dragontamer@lemmy.world
        9 months ago

        But where is the infringement?

        Do the training weights contain the data? Are the servers copying that data on a mass scale, in a way that the original copyright holders don’t want or can’t control?

        • Auli@lemmy.ca
          9 months ago

          Their response will be: we don’t know, we can’t understand what it’s doing.

          • dragontamer@lemmy.world
            9 months ago

            Their response will be: we don’t know, we can’t understand what it’s doing.

            What the fuck is this kind of response? It’s just a fucking neural network running on GPUs with convolutional kernels. For fuck’s sake, turn on your damn brain.

            Generative AI is actually one of the easier subjects to comprehend here. It’s just calculus: use derivatives to backpropagate the error through the network and adjust the weights so that the error is minimized. Lather-rinse-repeat for a billion iterations on a mass of GPUs (i.e., 20 TFLOP compute systems) for several weeks.
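
The "derivatives that minimize error" loop described above can be sketched in a few lines. This is only a toy illustration, assuming a one-weight, one-bias linear model and made-up data; it is not a convolutional network and says nothing about how Midjourney is actually trained. In a real network, backpropagation applies the same chain rule layer by layer. The names `train`, `lr`, and `steps` are hypothetical.

```python
# Toy sketch: fit y = w*x + b by gradient descent on mean squared error.
# All names and data here are illustrative assumptions, not from any real system.

def train(xs, ys, lr=0.01, steps=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Forward pass: prediction error for every sample.
        errs = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Derivatives of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errs, xs))
        grad_b = 2.0 / n * sum(errs)
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = train(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

The learned `w` and `b` end up close to the 2 and 1 used to generate the data. Scaling the same idea to billions of weights and images is what takes the weeks of GPU time.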

            Come on, this stuff is well understood by Comp. Sci by now. It was already understood 20 years ago when I learned about it, and today, now that AI is all hype, more and more people understand the basics.

    • rsuri@lemmy.world
      9 months ago

      But it’s not like our laws have changed

      And that’s the problem. The internet has drastically reduced the cost of copying information, to the point where entirely new uses like this one are now possible. But those new uses are stifled by copyright law that originates from a time when the only cost was that people with Gutenberg presses would be prohibited from printing slightly cheaper books. And there’s no discussion of changing it, because the people who benefit from those laws literally are the media.

    • LainTrain@lemmy.dbzer0.com
      9 months ago

      By that logic I am also storing that image in my dataset, because I know and remember this exact image. I can reproduce it from memory too.

      • dragontamer@lemmy.world
        9 months ago

        You ever try to do a public performance of a copyrighted work, like “Happy Birthday to You”?

        You get sued. Even if it’s from memory. Welcome to copyright law. There’s a reason why every restaurant had to make up a new “Happy Happy Birthday, from the Birthday Crew” song.

        • LainTrain@lemmy.dbzer0.com
          9 months ago

          Yeah, but until I perform it without a license for profit, I don’t get sued.

          So it’s up to the user to make sure that any generated material that infringes copyright is not used.

          • dragontamer@lemmy.world
            9 months ago

            Otakon anime music videos make no profit, but they explicitly get a license from the RIAA to play songs in public.

            • LainTrain@lemmy.dbzer0.com
              9 months ago

              So? I’m not saying those are fair terms, and I would also prefer it if that were not the case, but AI isn’t performing in public any more than having a guitar with you in public is ripping off Metallica.

              • dragontamer@lemmy.world
                9 months ago

                You don’t need to perform “for profit” to get sued for copyright infringement.

                but AI isn’t performing in public any more than having a guitar with you in public is ripping off Metallica.

                Is the Joker image in that article derivative or substantially similar to a copyrighted work? Is the query available to anyone who uses Midjourney? Are the training weights being copied from server-to-server behind the scenes? Were the training weights derived from copyrighted data?

                • LainTrain@lemmy.dbzer0.com
                  9 months ago

                  Yes, and none of that matters in the slightest. By that logic the Library of Babel is also copyright infringement. By that logic my memory of the movie is copyright infringing even if I don’t do anything with it.

                  • dragontamer@lemmy.world
                    9 months ago

                    You’re taking a fictional work and trying to apply real world laws to it?

                    Copyright assumes that the Library of Babel would take up so much space that it’d be impossible to create.

                    Which is true. Every possible combination of letters, spaces, and characters would never fit on anything in today’s universe (be it a 24 TB Hard Drive, or even a collection of thousands of them).
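
A rough back-of-the-envelope check of the storage claim above, as a sketch only. The book dimensions (25 symbols, 410 pages of 40 lines of about 80 characters) are the ones given in Borges's story; the "one byte per book" figure is a deliberately generous assumption chosen to favor the drive, and the variable names are hypothetical.

```python
import math

# Dimensions from Borges's "The Library of Babel".
SYMBOLS = 25
CHARS_PER_BOOK = 410 * 40 * 80  # pages * lines * characters per line

# Number of distinct books is SYMBOLS ** CHARS_PER_BOOK; work in log10
# because the number itself is far too large to print usefully.
log10_books = CHARS_PER_BOOK * math.log10(SYMBOLS)

# Assumption: each book somehow fits in a single byte on a 24 TB drive.
DRIVE_BYTES = 24e12
log10_drives_needed = log10_books - math.log10(DRIVE_BYTES)

print(f"distinct books  ~ 10^{log10_books:,.0f}")
print(f"24 TB drives    ~ 10^{log10_drives_needed:,.0f}")
```

For comparison, the number of atoms in the observable universe is usually estimated at around 10^80, so even under that absurdly generous assumption the drive count has an exponent in the millions.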

                    Secondly: any computer-generated work is automatically non-copyrighted as per US Law.

        • LainTrain@lemmy.dbzer0.com
          9 months ago

          What’s the difference? I could be just some code in the simulation

          Edit: downvoted by people who unironically stan Ted Kaczynski