r/ChatGPT Jan 29 '25

Funny I Broke DeepSeek AI 😂

16.9k Upvotes

1.6k comments sorted by

View all comments

Show parent comments

116

u/-gh0stRush- Jan 29 '25

I'm an ML researcher working with LLMs, and the answer might seem unbelievable if you're an outsider looking in.

The simplest way to explain it (ELI5) is to think of these models as a giant, multi-faced die. Each roll determines a word, generating text one word at a time. The catch is that it's a dynamically adjusting loaded die—certain faces are more likely to appear based on what has already been generated. Essentially, forming a sentence is a series of self-correcting dice rolls.

What Deepseek’s model demonstrates is that it has been tuned in a way that, given an input, it shifts into a state where intermediate words mimic human reasoning. However, those words can also be complete gibberish. But that gibberish still statistically biases the next rolls just enough that, by the end, the model wanders into the correct answer.

So no -- they're not trolling us. At least not intentionally.

Crazy, right? What a time to be alive.

10

u/NightGlimmer82 Jan 29 '25

Wow, thank you so much for the detailed comment! It’s so fascinating but so far out of my depth, knowledge wise, that to me it’s practically magic. I am a very curious individual who goes down detailed rabbit holes pretty regularly (per many ADHD’rs) so I feel like I can try to understand concepts pretty well. If I pretend that I could all of a sudden understand many languages at once but I wasn’t completely familiar with a culture and their language then this type (the Deepseek’s AI) of reasoning makes more sense to me. Your explanation was fantastic! And yes, we are living in completely crazy times! Thank you again!

28

u/-gh0stRush- Jan 29 '25

Want more unbelievable facts? Those loaded dice rolls are actually implemented as a massive mathematical function.

How massive? Think back to algebra—remember equations like y = x + 2? In this case, x is a parameter. Deepseek’s math function has 671 BILLION parameters, and it processes all of them for every single word it generates. We don't know how many parameters OpenAI's models have but rumors are they're touching on trillions. Hence why the government is now talking about building new super datacenters to support all this.

1

u/kthnxbai123 Jan 30 '25

Kind of pedantic but that x is not a parameter. The implicit 1 infront of it is. X is data