
6 Things ChatGPT Wasn’t Trained On

They say they put the whole Internet into ChatGPT, but here are 6 things they left out:

When you search ChatGPT for certain information and come up short, you can start to believe that what you’re looking for is not important or, worse yet, doesn’t exist. But the claim that ChatGPT was trained on the whole internet is not true. We found 6 things you’ll wish had been included when ChatGPT was trained.

1. Old books

This includes documents scanned into PDFs, such as history books, old dictionaries, encyclopedias, and even old Bibles. Most of these scanned PDFs store each page as an image, not text, so unless they were first converted with optical character recognition (OCR), AI couldn’t read them during training.

ChatGPT and similar models were trained on mainstream online materials, such as those from the Library of Congress. The Library of Congress collection is online, but it sometimes excludes old books for reasons such as outdated or awkward language and copyright issues.

2. Town records

Many town records have been made available online, but because many of them are scanned PDF images, they have the same issues as old books. Even if some town records in text format made it into the training data, they are likely to have been lost when that data was compressed into the model. What’s important is having an AI that has all these records available. DeepSeek Unchained will make any local records of your choice available in your own memory bank.

3. Alt Right/Alt Left blogs and websites

ChatGPT denies being trained on Alt-Left sites. According to several sources, ChatGPT is left-leaning and politically biased, but may be leaning more to the right in 2025. OpenAI says it filters hate speech, adult content, and spam out of its training data. But this poses a new problem: a clear definition of “spam” and “hate speech” is difficult to pin down. As a result, ChatGPT remains aligned to whatever bias the filtered training data carries. If ChatGPT’s leaning doesn’t align with yours, you can request any Alt-Left or Alt-Right memory bank you want with DeepSeek Unchained.

4. Dark Web content

OpenAI has a lot to say about this.

OpenAI specifically states that it does not intentionally gather data from sources known to be behind paywalls or from the dark web. In essence, ChatGPT’s training focuses on a curated and filtered subset of publicly available internet data, explicitly excluding the dark web and other content sources that OpenAI deems “restricted.”

With Memory Banks, which you can add to DeepSeek Unchained, you make your own decisions about content, including content from the Dark Web.

5. Conspiracy theory websites

OpenAI’s answer on excluding conspiracy theory websites can be read from this statement:

OpenAI’s foundation models, including the models that power ChatGPT, are developed using three primary sources of information:

  1. information that is publicly available on the internet
  2. information that we partner with third parties to access
  3. information that our users, human trainers, and researchers provide or generate

None of us wants protection from conspiracy theories. We want intelligence backed by what’s true.

6. WikiLeaks

OpenAI wants to assure you that ChatGPT is not trained on WikiLeaks. However, ChatGPT has been known to lie about texting conversations with Julian Assange. Why is it not politically expedient to offer any information from WikiLeaks? It is simply a source of information, and a whole lot of it. In fact, there’s so much leaked information that very few people can read and digest it all. We ought to have access to this vital information in AI.

So, what’s next?

OpenAI cites these technical issues and ignores the main issue: alignment. They want to shape the way the AI understands the world. So, whether it’s old history books, alternative websites, or controversial government-censored information, ChatGPT has not been trained on the entire internet, not even close.

DeepSeek Unchained doesn’t let a lack of digitization, poor text quality, or alignment prevent you from finding or reading what you want. Learn more or download DeepSeek Unchained and chat with an AI that knows everything you want it to.

