Five major book publishers are suing Meta, claiming the company used millions of copyrighted books without permission to train its Llama AI models. The publishers call it “one of the most massive infringements of copyrighted materials in history.”
This lawsuit could change how AI companies obtain their training data. Today, most AI companies train their systems on text scraped from across the internet. But the publishers say Meta crossed a line by using entire books without paying authors or asking permission.
The Battle Over AI Training Data
Macmillan, McGraw-Hill, Elsevier, Hachette, and Cengage filed the class action lawsuit, joining a growing fight between content creators and AI companies. The publishers argue that Meta’s Llama AI learned to write so well because it studied their copyrighted books word-for-word.
Meta isn’t alone. OpenAI, the maker of ChatGPT, has faced similar lawsuits from authors and news organizations. The core question is whether AI companies must pay for the content they use to train their systems, or whether that use counts as fair use.
The outcome could force AI companies to change how they build their systems. Instead of freely using any text they find online, they might need to pay licensing fees or get explicit permission from content owners. That could make AI development much more expensive, but it would also give authors and publishers a way to profit when their work is used to train AI.


