
How Does ChatGPT Answer Fair Use Questions?

By Treasa Bane February 21, 2023

We’ve all heard our “fair” share about ChatGPT—the powerful and controversial artificial intelligence chatbot from OpenAI that answers questions, writes essays, generates code, and chats with humans. In celebration of Fair Use Week 2023, we wanted to share selected strengths and weaknesses of the large language model’s current (Feb. 9 version) ability to generate human-like responses to questions about fair use.

In Scholarly Communication Services at Washington University Libraries, we’re often faced with a patron wanting a definitive yes or no answer to their fair use questions. The legal doctrine of fair use guides the everyday use of copyrighted material in the university. We replicated this approach in our questions to ChatGPT, and the AI assistant would almost always advise consulting a nebulous legal counsel or “otherwise qualified professional,” with the repetitious conclusion that, “ultimately, fair use is determined on a case-by-case basis and can be difficult to assess.”

Fair use is a limitation on copyright under which uses for purposes such as criticism, comment, news reporting, teaching, scholarship, or research are not infringements of copyright. The four fair use factors are as follows:

the purpose and character of the use, including whether the use is of a commercial nature or is for nonprofit educational purposes;
the nature of the work itself, or the degree to which the work that was used relates to the copyright’s purpose of encouraging creative expression;
the amount and substantiality of the work used in relation to the copyrighted work as a whole;
the effect of the use upon the potential market for or value of the copyrighted work.

Asking Questions of ChatGPT

The first question asked was, “Is uploading a Jacques Lacan essay to my university learning management system a fair use?” In summary, the response correctly and coherently explains that fair use is a legal doctrine that allows for limited and reasonable use of copyrighted material without obtaining permission from the copyright owner.

ChatGPT acknowledges that supporting classroom instruction (factor one) favors fair use, but that if the upload substitutes for the sale of the work (factor four), this weighs against fair use. ChatGPT is correct that a yes/no answer is not always possible in response to a fair use question (even if it were to pretend to be a lawyer).

Because this response indicated that some details weigh for and others weigh against fair use, the questioner provided additional clarifying details about each factor, namely that the essay is not creative and that no money is being made from the use. Details about the nature (factor two) and the amount and substantiality (factor three) of the work were addressed next, including that students would read two paragraphs of a seven-page essay and then discuss them in class. ChatGPT then concluded that the use could favor a fair use finding, noting that, while using a small portion of a work can be seen as favoring fair use, even a small portion can be considered substantial if it represents the heart of the work or its most valuable or significant part.

ChatGPT provided a decent answer on how to determine if a portion is valuable or significant, but follow-up questions aimed at analyzing market effect proved more complicated, and the answers fell short. It is reassuring that ChatGPT recognizes the importance of balancing all four factors, but it does not solicit bibliographic details about the essay or ask how it’s going to be used. Asking for those details would help instill confidence in those attempting to make a fair use analysis. In other words, ChatGPT can’t replicate a reference interview, the conversation between a librarian and a library patron.

A second question presented some stumbling blocks: “I want to develop a program that recommends readings based on bibliographies and citations. Can I use indexes, bibliographies, and citation information from various books to populate my program?”

ChatGPT answers that this is likely to be considered a fair use based on the purpose, but it reiterates the importance of the remaining factors, including the amount and substantiality of the portions used and the use’s impact on the market. Rather than teasing out details to aid ChatGPT’s analysis this time, we attempted to confirm an assumption: “Isn’t bibliographic and citation information considered data and therefore not protected by copyright?”

It correctly answers that this is generally true, but how it is expressed matters. It gives vague examples to demonstrate this idea of how data can be expressed, ultimately concluding that it’s difficult to determine.

The following two questions attempt to gain an understanding of the transformative purpose of the reading recommendation program the questioner would like to develop. ChatGPT spits out definitions and responds both times that it depends, making it clear that ChatGPT “understands” the connection between transformative use and fair use, but it cannot and will not apply these concepts to answer the question.

In the third question, “Can I use an evolutionary biology article in a live workshop demonstration of LaTex?” ChatGPT answers that despite educational purposes, this use would likely require the permission of the copyright owner. It adds that it may be possible to obtain permission from the copyright owner to use the work, either through direct negotiation or through a licensing service. This response notably omits reference to 17 U.S.C. §110(1), which allows performances and displays of works in the course of face-to-face teaching. This provision is crucial for educational and instructional activities and includes few limitations or conditions (i.e., the material is lawfully acquired, restricted to enrolled students, connected to instruction, and displayed in a classroom or similar place at the institution; this specific exception does not apply to the online delivery of digital media).

It’s interesting that licensing is brought up for articles but not earlier for the Lacan essay, because whether a license applies to a proposed use is almost always a consideration, regardless of the type of work at issue. Ideally, ChatGPT would refer to the potential existence of a license, and its potential implications, for all fair use questions. This interaction also made it clear that ChatGPT cannot determine whether the author and the rights holder are the same person, a detail of obvious importance.

ChatGPT manages one question at a time based on the information in the natural language prompt that the questioner provides. Occasionally it will build responses from previous questions—it recognized this third question as another fair use question, for example. ChatGPT might improve its ability to incorporate important details for the questioner to consider during a fair use analysis, as with the “heart of the work” aspect explored in the first question; however, ChatGPT is still behind in its ability to acknowledge licensing market considerations, among other things. ChatGPT cannot “think,” nor can it anticipate all of the possible clues that might matter to a situation undergoing a fair use analysis. ChatGPT can provide helpful information about fair use, but its responses should not be relied upon as a substitute for legal advice or analysis.

What ChatGPT Does and Doesn’t Do Well

Determining what constitutes fair use can be a complex and nuanced process that requires an understanding of legal principles and case law. One strength of ChatGPT’s response to fair use questions is its ability to provide questioners with general information about the principles of fair use. This can be helpful for those who are unfamiliar with the legal doctrine and want to learn more about it. ChatGPT can also provide examples of how fair use has been applied in specific cases. However, there are also limitations to ChatGPT’s ability to provide accurate and reliable information about fair use. The model’s responses may not take into account the specific facts and circumstances of a questioner’s situation, which can be critical in determining whether a particular use is fair. In addition, ChatGPT’s responses may not reflect the most current legal precedents and developments, as the model’s knowledge is limited by its training data and cutoff date.

ChatGPT seems to have found and used some of the best fair use resources, at various points recommending the US Copyright Office, ALA’s fair use evaluator tool, and the Code of Best Practices. Someone new to fair use may be served well by ChatGPT’s ability to pull facts. Those looking for help with analyzing and weighing each of the four factors will be disappointed and may turn to the recommended tools instead. ChatGPT cannot empower questioners to assert fair use as a right, and we must remember that we can lose the rights we don’t use.