What is Happening with Books and AI?
I can no longer open my various inboxes without seeing something about AI: newsletters remarking on the success or failure of various AI tech and startups, urgent calls to explore AI tools designed for social media and content managers, headlines about how AI is being used in exciting and terrible ways across industries… So, while I hesitated before writing this piece and possibly contributing to your own AI-related spam, I can’t help but recognize that stories about AI’s effects on the book world are intensifying and worth paying attention to. Not only is it part of my job to remain curious and vigilant about these updates, but the updates themselves are also interesting to anyone who pays close-ish attention to books and publishing.
As developments continue to crowd this intersection, I think what strikes me most is that they deal with so many different corners of the book world. AI is a technology pervasive enough to highlight some of the more subtle layers and facets that make up book production and book culture. It’s made me consider and reconsider the ethics of using AI as a professional publishing tool, the question of what defines creative work, and how AI might impact the average reader, to name a few philosophical exercises.
Even as I type, a new AI-related drama is unfolding, and the author reactions mirror broader sentiments about the use of copyrighted works to train AI, and Large Language Models (LLMs; we’ll get into these momentarily) in particular. To give you a sense of how quickly these controversies escalate, see this Gizmodo article recounting how a group of writers, including Jeff VanderMeer and Indra Das, pushed back against fiction analytics site Prosecraft after author Zach Rosenberg called it out for using copyrighted works to develop a data library. Rosenberg posted about it on X on August 7, and the post quickly garnered attention and calls from authors to remove their books from the library. Later that same day, after attempts at damage control, Prosecraft developer Benji Smith voluntarily took the site down.
As the Gizmodo article points out, Prosecraft isn’t exactly an LLM, the kind of model at the heart of many headlines about authors versus AI. Still, it’s not difficult to connect the escalation against the site with the broader pushback against, and advocacy concerning, the unauthorized and uncompensated use of human-created work to train AI.
Like many others, I began my hands-on exploration of AI with ChatGPT. Curious about how it worked, I read up on LLMs, an acronym theretofore unfamiliar to me. The thing to know about LLMs as we get into these stories is that they are trained on existing datasets (think books, articles, and other digitized resources) and generate text by predicting what is likely to come next. They work with what they’ve got, and what they’ve got is sometimes inaccurate, biased, or copyrighted.
This practice of scraping for training data lit a fire under the almost 8,000 writers who signed a letter to some of the biggest AI companies, calling on those companies to stop using writers’ works to train LLMs. While the letter, crafted by professional advocacy organization the Authors Guild, collected signatures from authors with sizable platforms, such as Alexander Chee and Nora Roberts, the most it could do was ask these companies to please compensate the people who authored these works. Those willing to spend the time and money on more aggressive measures, however, have filed lawsuits over the issue.
Sarah Silverman, Christopher Golden, and Richard Kadrey are three such plaintiffs. Their suits accuse OpenAI and Meta of copyright infringement and allege that the datasets OpenAI uses to train ChatGPT include works from “shadow library” websites that illegally distribute the plaintiffs’ and other authors’ works via torrent. The suits also allege that the trail of Meta’s LLaMA datasets leads to “shadow libraries.” The Verge reports that Matthew Butterick and Joseph Saveri, the lawyers representing the three authors, are also litigating against AI companies on behalf of artists, which brings us to another corner of publishing: book covers.
An argument between two authors and a publisher about the use of AI to generate book covers represents a slim wedge of the discussion around how AI might affect creative professionals’ access to work, and where we draw the line between art that requires a certain level of human effort and art generated by AI prompts. What we’re talking about is low-labor art with a matching price tag, which might be more attractive to businesses (like publishing houses) looking to cut costs, even if it means cutting out artists who create wholly original works.
That’s not even getting into lawsuits like the ones brought by Butterick and Saveri, which allege that the AI art generators built by companies like Stability AI and Midjourney scrape copyrighted art to generate images.
The issues mentioned above, and even problems like AI-generated stories making messes of submissions portals, indirectly affect readers, but mostly impact creators. So what about AI, book culture, and readership? Well, there is the problem of AI-generated books spamming Amazon where you might be looking for quality reads written by humans. But if what you’re most worried about is whether AI will disrupt the book culture you know and love, take some comfort in this Wired article that criticizes this prediction. The piece argues that people in tech often miss the point when they make big prognostications about new technology replacing or reinventing the way we read and enjoy books because they don’t seem to enjoy the practice themselves and are solving for a problem that doesn’t exist. I found myself nodding while reading this analysis.
While AI will continue to be challenged, developed, and used to shift how we work and interact with data, there will always be room for a sit-down with a good (human-authored) book.