ANI vs. ChatGPT: A Complex, Curious, and Costly Copyright Litigation (?)

Bonjour,

Got the recent news about ChatGPT? I am sure you have. Here, I broach some questions that I deem demand deliberation.

Well, the news is that ChatGPT has gotten into legal trouble in India, as in many other places. Or, legally speaking, the issue of AI training and copyright (specifically, whether the use of copyrighted works to train AI constitutes infringement), which has been a subject of much debate globally, has landed on Indian soil. However, this discussion, long in the making, is already caught up in competing suggestions and ideas. In a way, the framing that here is a new technology to which copyright has to respond has already arrived (see Prof. Carys Craig’s think piece called the AI/Copyright Trap). In the discourse, some advocate for new exceptions or assess training under fair dealing/use directly (some countries have already made such laws); some claim author remuneration, tacitly acknowledging copyright infringement. Meanwhile, certain scholars, with whom I identify, argue that such training does not amount to infringement. The position, nevertheless, remains far from settled, at least so I believe. Anyway, in my tentative French, I can say that the AI/copyright debate has arrived in India en chair et en os (in the flesh, alive and kicking), ready to be ruminated upon and litigated.

But is the issue involved in any way novel? I doubt it. Why? In 2019, news broke of the JNU data depot (“a gigantic store of text and images extracted from 73 million journal articles dating from 1847 up to the present day.”). SpicyIP carried out an interesting discussion on whether the Indian Copyright Act would regard data mining as copyright infringement or as fair dealing. Very spicy discussion, it was! Please check it out if you have not. See, e.g., here, here, and here.

Returning to the ChatGPT case in 2024, my main concern is that the case landed on Indian (judicial) soil with pre(in)formed narratives. How? The news reports show that the defendant, OpenAI, has already called its use fair dealing/use under copyright law. I don’t know if this is a “faith-based” claim or stems from an intelligent litigation strategy made after assessing the risk of claiming no infringement. I lean towards the latter, partly because of the concerns raised by scholars (e.g., Akshat’s potent post) problematizing the dominant position of OpenAI in the relevant market. In any case, it’s worth noting that the court (going against the wave of granting temporary injunctions just like that) has considered it “a complex matter which requires deeper examination.” However, given the positions currently taken (as reported in the news, since the order is not out), it promises to be an exciting tale that will long echo in the hallowed halls of copyright jurisprudence.

This post, I pen, to ponder and point out four issues, some of which overlap. Let’s begin with the parties’ claim and then see where they would lead. Please know that I won’t wade deeper into them.

Let’s start with the initiative taker, ANI, the plaintiff, which has made two arguments: 1.) OpenAI had used its copyrighted material to train ChatGPT. The plaintiff added that it provides exclusive news items to its subscribers that are not intended for the public. However, these not-so-publicly available materials allegedly appear as part of ChatGPT’s responses. 2.) ChatGPT attributes false statements to the plaintiff, spreading misinformation. Thus, “it hurts not just … private rights, but also spread fake news and public disorder. There’s also a public angle to this.”

Au contraire, the defendant argues that ChatGPT does not store or reproduce ANI’s content. With that, it cites the idea-expression dichotomy, i.e., copyright law protects expressions, not ideas or facts. News content, argued the defendant, constitutes only a negligible fraction of ChatGPT’s training data, with the plaintiff’s contribution being even less significant. Additionally, it emphasizes that OpenAI has long provided tools for website owners to block their content from being accessed, presenting this as evidence of its “transparent and bona fide” practices from the outset. Finally, it challenges the Delhi High Court’s jurisdiction, arguing that OpenAI operates globally without servers or offices in India.

Alright, let’s get to the issues. 

1. Whether AI training constitutes infringement: Two possibilities arise. The first is that the court undertakes a subject matter analysis, checking whether ChatGPT’s use of ANI’s works constitutes “use” under Section 14 of the Copyright Act. As many have already argued, this is not a “use” of the kind the law envisages with respect to the economic rights of holders. Instead, it constitutes a non-expressive / non-consumptive use of the work that copyright does not concern itself with; see, e.g., Prof. Severine Dusollier’s article here. The second is to apply the existing fair dealing provisions (as OpenAI has claimed), with arguments available on both sides. (See Prashant Reddy’s 2019 piece arguing otherwise.)

Here’s a hiccup, however. The plaintiff asserted that ChatGPT has copied its “data” and infringed copyright. But remember, copyright law does not protect data as such. Although the defendant’s lawyer has called it a “Freudian slip” (i.e., “an unintentional error regarded as revealing subconscious feelings”), this would complicate the situation and drive the court to the first possibility I underscore above, i.e., whether the kind of use ChatGPT makes falls within the limited rights of authors under Section 14 of the Act. Simply put, is AI using these works as raw data without engaging with them as “expressive works”? Herein lies another hitch: the defendant’s assertion (or perhaps admission) that its use qualifies as “fair use” conveys a tacit acknowledgment of infringement. This means the use by ChatGPT was already “use” as per Section 14. I say this because a fair use claim presupposes a prima facie infringement. Reliance may be placed on the Madras High Court’s E.M. Forster And Anr. vs A.N. Parasuram, which remarked that “With the propositions relating to “Fair Dealing” we need not concern ourselves immediately, for that will arise only if it could be otherwise established for the appellants that there has been an infringement by substantial reproduction in the present case. If that is not made out, there is a failure at the threshold of the claim, and the question does not really arise whether Mr. Parasuram (respondent) could claim that he is protected by any of the objectives of “Fair Dealing.””

Nevertheless, let’s see how the Court would deal with this matter, as this case can have serious legislative implications. 

2. The appointment of an expert: The court’s mention of appointing an amicus curiae particularly piques my interest. For one, as far as my research goes, only practicing lawyers/advocates can be appointed amicus curiae (See here and here). Thus, an academic (who cannot practice in the courts) cannot be appointed in that role. However, the Court can appoint an academic as an “expert” under Rule 31 of the DHC IPD Rules, as done previously by Justice Prathiba M. Singh, who appointed Prof. Arul Scaria (from NLSIU, Bangalore) in a different case. It will be nice to watch this part, as this appointment will significantly impact the case outcome.

3. The question of “opt-out”: An amusing angle to this case is its mention of an opt-out scheme, as evidenced by this claim. As SpicyIP’s Md. Sabeeh succinctly puts it:

“Here, OpenAI seems to be taking the “opt-out” defense where it allows websites to be blocked if they are informed of any content infringement. However, ANI’s counsel flagged that there is difficulty even after blocking the website since the same content is reproduced by other websites.”

Akshat has explained this in his excellent post. However, I fail to fathom how it directly relates to copyright concerns (Am happy to be corrected!). Why does it matter if ChatGPT blocks a website or gives some websites an option to “opt out” of ChatGPT’s usage? Instead, the opt-out argument seems akin to saying, “If you don’t want your copyright to be violated, keep it out of my reach.” Besides irking my intuition, I wonder if opting out makes any sense, especially in cases like news, where copying of work by many websites is common. Just because a copyrighted work is publicly accessible doesn’t give someone the right to infringe upon it. I say “infringe” because OpenAI has tacitly accepted that by claiming fair use. In sum, at least theoretically, the burden should not fall on the copyright holder to prevent infringement; instead, it ought to be up to the user (ChatGPT here) to respect rights. The opt-out argument diverts attention from the core issue of whether training AI on copyrighted works constitutes infringement.

4. The question of jurisdiction: Not having an office in India doesn’t necessarily mean jurisdiction cannot be invoked. Reliance may be placed on Section 62 of the 1957 Act, which makes the place of “carrying on business” one of the relevant factors for invoking a particular jurisdiction. Moreover, a catena of cases has interpreted the provision liberally, or at least in a way that leaves ample scope to invoke jurisdiction. Consider, for example, World Wrestling Entertainment, Inc. vs. M/S Reshma Collection & Ors, where the court, equating Section 134(2) of the Trademarks Act, 1999 and Section 62(2) of the Copyright Act, 1957, noted: “The availability of transactions through the website at a particular place is virtually the same thing as a seller having shops in that place in the physical world.” Similarly, in Banyan Tree Holding (P) Limited vs A. Murali Krishna Reddy & Anr., the court noted that “For the purposes of a passing off action, or an infringement action where the Plaintiff is not carrying on business within the jurisdiction of a court, and in the absence of a long-arm statute, in order to satisfy the forum court that it has jurisdiction to entertain the suit, the Plaintiff would have to show that the Defendant “purposefully availed” itself of the jurisdiction of the forum court. For this it would have to be prima facie shown that the nature of the activity indulged in by the Defendant by the use of the website was with an intention to conclude a commercial transaction with the website user and that the specific targeting of the forum state by the Defendant resulted in an injury or harm to the Plaintiff within the forum state.” Given that ChatGPT is very much available to Indians, including those in Delhi, there is a prima facie argument for carrying on business in India, so the jurisdiction objection won’t be difficult to controvert. Check this post for an analysis of the issue.

Conclusion

The next hearing for the case is slated for January 28, 2025. Until then, we can only imagine (with a pre-given narrative) how the court will approach this costly question and costlier litigation. In the meantime, check out this piece I wrote, “Faith-Based Fair Dealing,” and this insightful multi-part series by Sneha Jain and Akshat Agarwal: Part 1, Part 2, Part 3, and Part 4.

Alright, that’s from my end today. See you around.

Relevant (Long) readings:

  1. Fiil-Flynn, S.M.; Butler, B.; Carroll, M.; Cohen-Sasson, O.; Craig, C.; Guibault, L.; Jaszi, P.; Jütte, B.J.; Katz, A.; Quintais, J.P.; Margoni, T.; Rocha de Souza, A.; Sag, M.; Samberg, R.; Schirru, L.; Senftleben, M.; Tur-Sinai, O.; Contreras, J.L., Legal reform to enhance global text and data mining research, Science.
  2. Senftleben, Martin, Compliance of National TDM Rules with International Copyright Law – An Overrated Nonissue? (IIC) 53, No. 10 (2022)
  3. Bracha, Oren, The Work of Copyright in the Age of Machine Production (September 24, 2023). U of Texas Law, Legal Studies Research Paper.
  4. Senftleben, Martin, Win-win: How to Remove Copyright Obstacles to AI Training While Ensuring Author Remuneration (and Why the European AI Act Fails to Do the Magic) (July 04, 2024). Chicago-Kent Law Review, Volume 98, Forthcoming.
  5. Luca Schirru, Allan Rocha de Souza, Claudia Chamas, Building a Text and Data Mining Limitation: The Brazilian Case, GRUR International, Volume 73, Issue 3, March 2024, Pages 217–222, [Paywalled]
