This is an issue that has blown up over the past few days and something I wanted you as Write Squad subscribers to be informed on, hence the mid-week post.
Authors everywhere have been reeling with the reporting of the latest piracy/AI scandal.
If you’ve not seen the news, here’s a good wrap up of it.
In a nutshell, there’s a database of pirated books (which has existed since at least 2015), and recently good old Meta scraped the database to train its latest AI machine.
Bad? Yes. Deplorable? Yes. Surprising… I’m sorry to say, No.
As I mentioned, the LibGen database has been pirating books since at least 2015 (probably earlier). They have been taken to court numerous times and I believe they were even ordered to pay copyright royalties at one stage, but the problem is, no one knows who runs LibGen, so no one has ever been held accountable and LibGen continues to exist. Ah, the power of hiding behind the internet.
And this isn’t the first time Meta has been accused of, and faced legal action for, ‘stealing data for AI training’. Similar claims were brought to light in 2023 and again in 2024. And back in January of this year, the first news leaked that Meta had used LibGen for their AI training. I remember reading it. And, yes, being unsurprised. But until The Atlantic’s article earlier this week, it seems many were unaware.
Meta are of course standing by the ‘public domain, public use’ excuse and I’m no lawyer, but unfortunately I think in legal terms they have a case. A flimsy immoral one, but it’s what they’re standing by. They didn’t pirate the books, they simply accessed a database that shared said pirated books (and more) in the public domain. So, free for all, right? …Ah-hem….
There are two issues here.
The piracy. And, the unauthorised use of data for AI training.
Piracy has existed since the written form began. Thankfully, it was more easily ‘managed’ before the internet, but the explosion of the WWW and data sharing has made it both easier for pirates to operate and harder to detect them and bring them to justice.
If you’re old enough you’ll remember Napster, the notorious file sharing app that broke open the issue of sharing copyrighted data - in this case, music. Metallica and Dr Dre were among the first artists to start legal action, and the lawsuits that followed led to the shutdown of Napster. However, we all know that even before one illegal operation has been shut down, more have already surfaced.
I’m not here to debate the piracy issue. Piracy in any form, including digital piracy, is wrong. It’s illegal, it’s immoral and it’s downright deplorable. But we are on the back foot, fighting a battle that isn’t unwinnable, but which is certainly a long way from ever being ‘fixed’.
The next issue is the scraping of the internet to train AI models. This is what Meta and countless other companies - mega, big and small - are doing.
Big business sees the money to be made in AI technology. And as always, big business is only interested in the money. They don’t care for moral obligations, for copyright, unauthorised use or access issues, and they certainly don’t have a conscience. They never have, so why start now? Especially when the dollar signs are flashing brighter than ever before.
The issue is how fast AI is developing.
Again, we - the consumer, the creator, our governments - are on the back foot. Actually, we are leagues behind. Lightyears. The development of AI technology left the start gate before we even entered the race. And now we’re scrambling, wishing we’d paid attention sooner, done the training and fitness, because we are losing the race.
So, should we throw our hands up in the air?
No. Of course not.
But I also believe that we - and now I’m talking about creators as well as the everyday citizens of this world - need to stop and take notice.
We need to step out of our cocoon thinking that the world will carry on no matter what. It will. Of course it will. But while we might be waiting for governments and the legal system to ‘do something’ we need to start taking notice ourselves. We need to be aware and keep being aware.
Be aware of what?
Be aware that what we put on the internet - yes, this includes this thing you’re reading right now - is out there in cyberspace. It’s being crawled and scraped by bots. Your words. Your photos. Your videos. Your social media. Your data. Your information. Your IP. Your creativity.
Be aware that this includes social media, even if you think you have high privacy settings. Did you know Substack’s defaults allow your content to be accessed via web bots? You can amend these AI training settings and I highly recommend you do if you are using Substack. (It’s under your publication settings - under privacy. I suggest you make sure it’s on.)
Be aware this includes your own website. Do you have a crawler blocker on your website? You should. Is it effective? At the moment, for the most part. But again, it’s something you need to stay aware of.
Be aware of what is happening in AI. Keep up to date as best you can. Awareness doesn’t mean doing a degree on the subject, but reading news (from reputable sources) and being aware of the issues and developments.
Be aware of what’s happening online. I have known about the LibGen database for a couple of years now and I was actually shocked how many authors were not aware and were only made aware this past week. If we want to play in this online world we need to be proactive and take more notice of things. As much as we want it to be a positive and fun space, darkness lurks in shadows.
Be aware of how AI is being used. Understand the different forms that AI takes in our current world, and the world that is being developed. What is generative AI? What is predictive AI? What is AI assisted technology? If you don’t know the difference, find out.
Be aware of the AI that you are using right now and make an informed decision about whether you are okay using it or not. Google, in its simplest form, is a form of AI. The maps on your phone - AI. Yes, AI does have its place in today’s world. AI can be used for good. But only in certain forms. How much of a place AI has in your world is something you need to decide, and to do so, you need to be informed and understand. You need to be aware.
Be aware of your digital footprint. Be aware how you use social media and the internet. Full stop.
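On the crawler blocker point above: for a website, the usual starting place is a robots.txt file, which asks bots to stay out. Below is a minimal sketch in Python, using only the standard library, for checking which bots a given robots.txt turns away. The sample file contents and the bot names (GPTBot, CCBot) are examples chosen for illustration, and `blocked_agents` is a made-up helper name, not part of any tool. Keep in mind that robots.txt is a request, not a lock: a bot can simply ignore it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: asks two AI-related crawlers to stay out
# of the whole site, while leaving it open to everything else.
# (Assumption: these are the bots you want to block; adjust the names.)
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, path="/"):
    """Return the agents that the given robots.txt text disallows for `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


if __name__ == "__main__":
    # Googlebot falls through to the "*" rule and is allowed.
    print(blocked_agents(ROBOTS_TXT, ["GPTBot", "CCBot", "Googlebot"]))
    # prints ['GPTBot', 'CCBot']
```

To check a live site instead of a pasted file, `RobotFileParser` can fetch the file itself: call `set_url()` with the address of the site's robots.txt, then `read()`, before asking `can_fetch()`.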
Like all the other authors, writers, and creatives who have had their work stolen and illegally used by generative AI models, I’m pissed off. But I’m past the point of being angry. There’s work to be done.
I don’t want to diminish anyone’s shock or feelings, but it’s a fact. AI is here. It’s happened. It’s continuing to happen. But to stop the data scraping, the theft and the sinister violation of our lives we need to stand up.
It starts right here - with your awareness.
And it continues with your actions.
Your actions can include:
Firstly, being aware of the issues, the current standing, and the future that is being developed.
Make noise by taking a stand against the unauthorised use of, and blatant disregard for, the copyright and IP of creators. There are many authors’ guilds, societies and organisations gathering data and information to take action on behalf of creatives. You can find the Australian Society of Authors’ form here: https://docs.google.com/forms/d/e/1FAIpQLSe9YCy_3b6AhPiD5OnJ4S5gWmveYj8okfTrhldoupni2-yaiw/viewform?fbclid=PAZXh0bgNhZW0CMTEAAaZnYKpkVjCR5vaGcgtTBVMZkdfIh6J3REAMTBq-bjgIuBlKcGLVZhG-ODs_aem_TBjmoLJNNfTNmsLM86nyjQ (or through the link on their Instagram account).
Make noise to your publisher. Inform them if you suspect or have confirmed your book has been pirated. Publishers have a duty of care to their authors and their own business to take action in whatever form they can.
Make noise to your publisher when you are signing any future contracts. Check any AI clauses in your contract and if you do not consent, do not sign. Some publishers are starting to sign deals allowing AI training on ‘some’ of their titles (Ref: https://www.thebookseller.com/news/harpercollins-signs-contract-with-tech-company-to-use-limited-number-of-titles-to-train-ai).
Make noise to your local member of parliament. Bring the piracy and unauthorised use of your work (or anyone’s work for that matter) to their attention. Governments need to know this is an issue that affects everybody and they need to step up to the plate. Now. There’s been inaction for far too long.
Make noise by helping others understand the situation and issues surrounding the futures of our online world and fast-evolving AI.
Support authors by getting your books via legitimate means. Whether that’s the author’s own website, bricks and mortar shops, or reputable online platforms that align with your values. You can also support authors by borrowing their books from the library, and sharing books you love via word of mouth.
I wish I could say that I’m hopeful for the current state of the world in terms of our digital landscape. But I’m not.
The world wide web was supposed to make our world smaller. Bring people together. Be used for good.
Of course, we knew that there was always a section of our community that would take advantage of such a thing and use it for illegal and immoral purposes. We didn’t know just how fast it would get out of hand. We blinked and missed it.
Pondering the repercussions of being online
I have to admit I’ve been finding it hard to reconcile the online world lately. I wanted to be AI-curious, understand its benefits, but I’m fast coming to the conclusion the negatives outweigh the positives one-hundred-fold.
And, for all the positives and opportunities the online world has brought me personally, I can’t help but feel the space isn’t somewhere healthy right now. But I also know that there is no answer to this ongoing conundrum.
I’ve been stepping back from social media, and am being very conscious and intentional about how I spend my time online. Ironic that here I am. But that’s one of the positives of the online world, being able to share things in a positive and helpful way.
However, personally, my relationship with my own digital footprint, my interaction in the online landscape, as well as my creativity continues to be a question I ponder. It’s ongoing, and I’m not sure if there will ever be an answer.
In the meantime, if I had any advice for creatives, writers, for the everyday person living in today’s online world it would be - be aware. And be wary.
Thanks Jodi, I’ve been pondering my own life online and tbh I want to step back but I’m finding it increasingly difficult to do - it’s a crutch I don’t want, and in this political climate too, how does one stay informed yet remove themselves from social media? The expectation of publishers too is that writers will be ‘out there’. I find this all extremely crushing. I want to be a good writerly citizen too!
Thanks for sharing all your knowledge and insights on this very topical subject, Jodi. It's a minefield. My husband is in IT and has resisted social media since its inception whereas I was not so black and white about it. Now I find myself in a quandary, like everyone else. Pirated copies of my books have also been scraped to train AI. To me, it needs to be a corporate and government level of leadership to really put some boundaries around this new tech. It can't be up to the poor (figuratively and literally!) creators to try and combat these billionaires and their money-making agendas. Might go build a cabin in the woods...