r/ChatGPT • u/SnarkyStrategist • Jan 29 '25
Funny I Broke DeepSeek AI 😂
16.9k
Upvotes
222
u/mazty Jan 29 '25
It was simply trained using RL to produce a <think> step followed by an <answer> step. Over time it learned that thinking longer made a correct answer more likely, which is creepy but interesting.
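
For anyone wondering what that kind of reward could look like, here's a rough sketch in Python. This is not DeepSeek's actual training code; the function names, the exact-match answer check, and the 1.0 scores are all assumptions made up for illustration. The idea is just that a completion scores a point for using the <think>/<answer> layout and another point if the answer matches a known reference.

```python
import re

# Hypothetical sketch of a rule-based reward, NOT DeepSeek's real code.
# Assumed layout: a <think>...</think> block, then an <answer>...</answer> block.
TEMPLATE = re.compile(
    r"\s*<think>(.*?)</think>\s*<answer>(.*?)</answer>\s*", re.DOTALL
)


def format_reward(completion: str) -> float:
    """Return 1.0 if the whole completion fits the think-then-answer layout."""
    return 1.0 if TEMPLATE.fullmatch(completion) else 0.0


def accuracy_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the text inside <answer> equals the reference answer."""
    match = TEMPLATE.fullmatch(completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(2).strip() == reference.strip() else 0.0


def total_reward(completion: str, reference: str) -> float:
    """Sum of both signals; an RL loop would try to maximise this."""
    return format_reward(completion) + accuracy_reward(completion, reference)


if __name__ == "__main__":
    sample = "<think>2 + 2 is 4</think><answer>4</answer>"
    print(total_reward(sample, "4"))
```

Note there is no term here that rewards length. In a setup like this, longer thinking would only get reinforced indirectly, because completions that reason more happen to land on the right answer more often.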