The New York Times is suing OpenAI and Microsoft for copyright infringement, claiming the two companies built their AI models by “copying and using millions” of the publication’s articles and now “directly compete” with its content as a result.

As outlined in the lawsuit, the Times alleges OpenAI and Microsoft’s large language models (LLMs), which power ChatGPT and Copilot, “can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style.” This “undermine[s] and damage[s]” the Times’ relationship with readers, the outlet alleges, while also depriving it of “subscription, licensing, advertising, and affiliate revenue.”

The complaint also argues that these AI models “threaten high-quality journalism” by hurting the ability of news outlets to protect and monetize content. “Through Microsoft’s Bing Chat (recently rebranded as ‘Copilot’) and OpenAI’s ChatGPT, Defendants seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment,” the lawsuit states.

The full text of the lawsuit can be found here

  • Zima@kbin.social
    11 months ago

    Are you trolling? If you design a car to combust gasoline without burning the lubricants but still end up burning them, it doesn’t mean the lubricants are needed for the combustion itself. Likewise, you have not made any nuanced argument explaining why memorization is necessary. I gave you an example where we know there is no memorization, and you ignored it.

    “Otherwise how would it create the words” is just saying you wouldn’t know.

    • EvilMonkeySlayer@kbin.social
      11 months ago

      So, me pointing out the flaw in your argument is trolling?

      What?

      If you choose to use weasel wording to try to get out of something, that is your call.

      • Zima@kbin.social
        11 months ago

        OK, I believe that you believe that. It’s OK. I have professional experience in this space, so you’re either not reading carefully or you don’t understand much about the topic.

        Perhaps you might want to reconsider this in more abstract terms. The engine example you ignored could help you with that.

        We have language models that don’t memorize and are simple enough that we can know this for certain. Do you really think that isn’t all we need to show that language models don’t necessarily have to memorize? You keep repeating the same (illogical) argument and ignoring the simpler arguments that disprove your claim.

        • EvilMonkeySlayer@kbin.social
          11 months ago

          So now it’s gone from “reasonable effort” to you being able to say, without any doubt, that all the trained models contain no copyrighted data at all?

          Come on. Make up your mind.