• Alphane Moon@lemmy.world · +124/-6 · 1 day ago (edited)

    And this is how you know that the American legal system should not be trusted.

    Mind you, I am not saying this is an easy case; it’s not. But the framing that piracy is wrong while ML training for profit is not is clearly based on oligarch interests and demands.

    • Arcka@midwest.social · +1 · 5 hours ago

      If this is the ruling which causes you to lose trust that any legal system (not just the US’) aligns with morality, then I have to question where you’ve been all this time.

    • themeatbridge@lemmy.world · +80/-11 · 1 day ago

      This is an easy case. Using published works to train AI without paying for the right to do so is piracy. The judge making this determination is an idiot.

      • Null User Object@lemmy.world · +26 · 1 day ago

        The judge making this determination is an idiot.

        The judge hasn’t ruled on the piracy question yet. The only thing that the judge has ruled on is, if you legally own a copy of a book, then you can use it for a variety of purposes, including training an AI.

        “But they didn’t own the books!”

        Right. That’s the part that’s still going to trial.

      • AbidanYre@lemmy.world · +54/-3 · 1 day ago

        You’re right. When you’re doing it for commercial gain, it’s not fair use anymore. It’s really not that complicated.

        • tabular@lemmy.world · +8/-1 · 1 day ago

          If you’re using the minimum amount, in a transformative way that doesn’t compete with the original copyrighted source, then it’s still fair use even if it’s commercial. (This is not saying that’s what LLMs are doing.)

    • FaceDeer@fedia.io · +27/-5 · 1 day ago

      You should read the ruling in more detail, the judge explains the reasoning behind why he found the way that he did. For example:

      Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16). But Authors cannot rightly exclude anyone from using their works for training or learning as such. Everyone reads texts, too, then writes new texts. They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable.

      This isn’t “oligarch interests and demands,” this is affirming a right to learn and that copyright doesn’t allow its holder to prohibit people from analyzing the things that they read.

      • Lemming6969@lemmy.world · +1 · 5 hours ago

        Except learning in this context is building a probability map that reinforces the exact text of the book. Given the right prompt, no new generative concepts come out, just the verbatim book text it was trained on.

        So it depends on the model I suppose and if the model enforces generative answers and blocks verbatim recitation.
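        The “probability map” idea can be sketched with a toy next-word model (a deliberate toy example, nothing like a real LLM’s scale): when the map is built from a single text, following the most likely successor at each step replays that text verbatim.

```python
from collections import defaultdict

# Toy "probability map": record which word follows which in the training text.
training_text = "the cat sat on a mat".split()
successors = defaultdict(list)
for current, following in zip(training_text, training_text[1:]):
    successors[current].append(following)

def generate(start, max_words):
    """Follow the most frequently observed successor at each step.
    With a single training text, this replays the training data verbatim."""
    out = [start]
    for _ in range(max_words):
        options = successors.get(out[-1])
        if not options:
            break  # no observed successor: stop generating
        out.append(max(options, key=options.count))
    return " ".join(out)

print(generate("the", 5))  # → "the cat sat on a mat"
```

        With many overlapping texts the successor lists diverge and output stops being verbatim, which is roughly where the legal and technical dispute lies.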

        • FaceDeer@fedia.io · +1 · 4 hours ago

          Again, you should read the ruling. The judge explicitly addresses this. The Authors claim that this is how LLMs work, and the judge says “okay, let’s assume that their claim is true.”

          Fourth, each fully trained LLM itself retained “compressed” copies of the works it had trained upon, or so Authors contend and this order takes for granted.

          Even on that basis he still finds that it’s not violating copyright to train an LLM.

          And I don’t think the Authors’ claim would hold up if challenged, for that matter. Anthropic chose not to challenge it because it didn’t make a difference to their case, but in actuality an LLM doesn’t store the training data verbatim within itself. It’s physically impossible to compress text that much.
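          The “physically impossible” point is simple arithmetic. A back-of-envelope sketch, using illustrative assumed numbers (the token and parameter counts below are not figures from the case):

```python
# Back-of-envelope check on the "compressed copies" claim.
# All numbers are illustrative assumptions, not figures from the ruling.
train_tokens = 10e12      # assumed training-set size in tokens
bytes_per_token = 4       # rough average for English text
model_params = 70e9       # assumed parameter count
bytes_per_param = 2       # fp16 weights

training_bytes = train_tokens * bytes_per_token   # ~40 TB of text
model_bytes = model_params * bytes_per_param      # ~140 GB of weights
ratio = training_bytes / model_bytes
print(f"Training text is ~{ratio:.0f}x larger than the model itself")
```

          Under these assumptions the weights would need a ~286:1 lossless compression ratio to archive the corpus verbatim, whereas good lossless text compressors manage roughly 4:1; memorized passages exist, but wholesale verbatim storage does not add up.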

      • Leon@pawb.social · +2 · 10 hours ago

        LLMs don’t learn, and they’re not people. Applying the same logic doesn’t make much sense.

        • FaceDeer@fedia.io · +2 · 8 hours ago

          The judge isn’t saying that they learn or that they’re people. He’s saying that training falls into the same legal classification as learning.

      • tamal3@lemmy.world · +1 · 11 hours ago

        Isn’t part of the issue here that they’re defaulting to LLMs being people, and having the same rights as people? I appreciate the “right to read” aspect, but it would be nice if this were more explicitly about people. Foregoing copyright law because there’s too much data is also insane, if that’s what’s happening. Claude should be required to provide citations “each time they recall it from memory”.

        Does Citizens United apply here? Are corporations people, and so LLMs are, too? If so, then IMO we should be writing legal documents with stipulations like “as per Citizens United,” so that eventually, when they overturn that insanity in my dreams, all of this new legal precedent doesn’t suddenly become a house of cards. IANAL.

        • FaceDeer@fedia.io · +3 · 8 hours ago

          Not even slightly, the judge didn’t rule anything like that. I’d suggest taking a read through his ruling, his conclusions start on page 9 and they’re not that complicated. In a nutshell, it’s just saying that the training of an AI doesn’t violate the copyright of the training material.

          How Anthropic got the training material is a separate matter; that part is going to an actual trial. This was a preliminary judgment on just the training part.

          Foregoing copyright law because there’s too much data is also insane, if that’s what’s happening.

          That’s not what’s happening. And Citizens United has nothing to do with this. It’s about the question of whether training an AI is something that can violate copyright.

      • kayazere@feddit.nl · +18/-1 · 1 day ago

        Yeah, but the issue is they didn’t buy a legal copy of the book. Once you own the book, you can read it as many times as you want. They didn’t legally own the books.

        • Null User Object@lemmy.world · +16 · 1 day ago

          Right, and that’s the, “but faces trial over damages for millions of pirated works,” part that’s still up in the air.

        • FaceDeer@fedia.io · +4/-1 · 21 hours ago

          Do you think AIs spontaneously generate? They are a tool that people use. I don’t want to give the AIs rights, it’s about the people who build and use them.

          • xavier666@lemm.ee · +1 · 13 hours ago

            “No officer, you can’t shoot me. I have a LLM in my pocket. Without me, it’ll stop learning”

      • realitista@lemmy.world · +5 · 1 day ago

        But AFAIK they actually didn’t acquire the legal rights even to read the stuff they trained from. There were definitely cases of pirated books used to train models.

        • FaceDeer@fedia.io · +9 · 1 day ago

          Yes, and that part of the case is going to trial. This was a preliminary judgment specifically about the training itself.

      • Alphane Moon@lemmy.world · +3/-1 · 1 day ago

        I will admit this is not a simple case. That being said, if you’ve lived in the US (and are aware of local mores) but you’re not American, you will have a different perspective on the US judicial system.

        How is right to learn even relevant here? An LLM by definition cannot learn.

        Where did I say analyzing a text should be restricted?

        • FaceDeer@fedia.io · +3 · 1 day ago

          How is right to learn even relevant here? An LLM by definition cannot learn.

          I literally quoted a relevant part of the judge’s decision:

          But Authors cannot rightly exclude anyone from using their works for training or learning as such.

          • Alphane Moon@lemmy.world · +4/-1 · 1 day ago

            I am not a lawyer. I am talking about reality.

            What does an LLM application (or training processes associated with an LLM application) have to do with the concept of learning? Where is the learning happening? Who is doing the learning?

            Who is stopping the individuals at the LLM company from learning or analysing a given book?

            From my experience living in the US, this is pretty standard American-style corruption. Lots of pomp, bombast, and roleplay of sorts, but the outcome is no different from any other country in deep need of judicial and anti-corruption reform.

            • FaceDeer@fedia.io · +3 · 1 day ago

              Well, I’m talking about the reality of the law. The judge equated training with learning and stated that there is nothing in copyright law that can prohibit it. Go ahead and read the judge’s ruling; it’s linked in the article. His conclusions start on page 9.

    • catloaf@lemm.ee · +8 · 1 day ago (edited)

      The order seems to say that the trained LLM and the commercial Claude product are not linked, which supports the decision. But I’m not sure how he came to that conclusion. I’m going to have to read the full order when I have time.

      This might be appealed, but I doubt it’ll be taken up by SCOTUS until there are conflicting federal court rulings.

      • Tagger@lemmy.world · +7/-4 · 1 day ago

        If you are struggling for time, just put the opinion into ChatGPT and ask for a summary. It will save you tonnes of time.