AI and the Dead Internet: A Call for Human Content

I would like to start this opinion essay with a quote from Frank Herbert’s novel Dune:

“Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.”

Frank Herbert, Dune

On November 30, 2022, ChatGPT came to life. When I first used it, I was surprised, because I had never felt such an emotional, human connection with a chatbot before. It felt so much like talking to a person. I’m talking about the first iteration of ChatGPT, when it was released to the public (before they “fixed” it!). I believe that day was a turning point for the internet and for the human race. But it was only a turning point: the turn could be good or bad. Let’s talk about that.

Since then, many chat tools have been released, and nearly all of them are built on the same core technology: the Transformer architecture, which is the “T” in GPT (Generative Pre-trained Transformer). As a result, they share many of the same advantages and disadvantages. I am not going to compare the different models (GPT-4o, GPT-4.5, GPT-3.5, Claude, Gemini) in this essay. However, they all share one main requirement: data created by humans.

Why do they need data? To answer this question, you need to understand the core technology of these models, Transformers. In very simple terms, they act as mathematical prediction models. When you start a conversation with ChatGPT or any other tool, you always give it something: a question or an instruction. That’s called a prompt. Based on that prompt, the model predicts the words most likely to come next, one small piece at a time. There is a lot more going on at a deeper level, but this is the basic idea.

How do they predict what we want? Ah, data. These models are trained on enormous amounts of text: books, discussions, reviews, explanations, transcribed videos, etc. Having seen so many examples, they can predict the next words in a conversation well enough to form sentences that make sense to us. To keep improving, they require significant amounts of data, and specifically new data.
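To make the idea of “predicting the next words from data” concrete, here is a tiny sketch in Python. It is not how ChatGPT works internally: real models use neural networks and far more text, while this toy only counts which word follows which. The function names and the training sentence are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

def generate(follows, prompt, max_words=5):
    """Extend a non-empty prompt by repeatedly predicting the next word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        nxt = predict_next(follows, words[-1])
        if nxt is None:
            break  # the model has no data for this word
        words.append(nxt)
    return " ".join(words)

# Hypothetical "human-written" training text.
model = train_bigram("the cat sat on the mat and the cat slept")

print(predict_next(model, "the"))      # "cat" follows "the" most often here
print(predict_next(model, "unicorn"))  # None: never seen in the training text
print(generate(model, "mat", max_words=2))
```

Even in this toy, the model can only say what its training text taught it. A word it has never seen gives it nothing to predict from, which is the same reason the real models depend so heavily on data.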

This is where the issue begins. Current models have already consumed most of the publicly available data on the internet. Right now, companies like OpenAI are making agreements with Reddit, different publishers, and podcast companies to get their human data. HUMAN DATA: that is the keyword. At the same time, because advanced AI models are so easily available, people tend to use them to write their blogs, research papers, books, etc. The more we let AI write for us, the less new human writing there is, and the less human creativity ends up on the internet. If these models are then trained on AI-generated data, their weaknesses, such as hallucination, are likely to get worse.

Furthermore, if everyone starts using these models for conversations, chatting, and texting, there will be no human involvement. There will come a day when AIs are talking to other AIs without any humans. This idea is known as the “dead internet.” Without human involvement, the internet will fill up with polluted, AI-generated data, and new AI models trained on that polluted data will be of poorer quality. It’s a vicious cycle: the models’ own output contaminates the data that the next models need to learn from.
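The vicious cycle can be sketched with a small simulation. This is only a toy under loud assumptions: the “model” here just memorizes word frequencies, and each new generation is trained purely on text sampled from the previous one, with no fresh human writing added. The corpus and function names are invented for illustration.

```python
import random
from collections import Counter

def next_generation(corpus, rng):
    """'Train' on a corpus by counting words, then 'generate' a new corpus
    of the same size by sampling words according to those counts."""
    counts = Counter(corpus)
    words = list(counts)
    weights = [counts[w] for w in words]
    return rng.choices(words, weights=weights, k=len(corpus))

def vocabulary_sizes(corpus, generations, seed=0):
    """Track how many distinct words survive in each generation."""
    rng = random.Random(seed)
    sizes = [len(set(corpus))]
    for _ in range(generations):
        corpus = next_generation(corpus, rng)
        sizes.append(len(set(corpus)))
    return sizes

# Hypothetical "human" corpus: a few common words and many rare ones.
human_corpus = (
    ["the"] * 50 + ["internet"] * 20 + ["human"] * 10
    + [f"rare{i}" for i in range(20)]
)

sizes = vocabulary_sizes(human_corpus, generations=10)
print(sizes)  # distinct words per generation, starting from the human corpus
```

In this setup the number of distinct words can never grow. Once a rare word fails to be sampled, no later generation can bring it back, because nobody is writing it anymore. Real training is far more complicated, but the direction is the same one described above.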

So, what can we do about this? We cannot practically avoid these AIs. We need to use them to make our lives easier. However, we always need to keep in mind that these are prediction models; they can’t think or understand like us. To avoid the dead internet phenomenon, we have to create at least some content without AI. I am not a writer. However, I will try to create content like this blog without AI. People need to share their own thoughts. Not every piece of writing needs to be perfect. We are human, and we make mistakes, and we learn from them.

Mistakes make us human.

Goodbye! Thanks for reading!
