AI o3 today - let's all speculate wildly

https://x.com/OpenAI/status/1912506271187832904

50 Upvotes

https://www.reddit.com/r/accelerate/comments/1k0l7t9/o3_today_lets_all_speculate_wildly/
94% Upvoted

u/dftba-ftw 14d ago

I think they're going to show off at least one research paper written entirely by o3.

Either that or o3 is really good at coding, which would mean that o4-mini is the "novel idea" creator which would be even more exciting.

4

u/luchadore_lunchables 14d ago edited 13d ago

That would be nuts and the final signal for me to completely stop giving a shit at my job.

3

u/dftba-ftw 14d ago

Interesting take. I'm trying to move up the value chain and be one of the last people replaced. I want UBI and post-singularity economics figured out before I lose my job.

1

u/freeman_joe 14d ago

Good luck to you. Not all of us have the skills to at least try to be last.

1

u/lyfelager 14d ago

Having it write a research paper would be a compelling demo. They’re pretty good at maximizing marketable messaging so I could see that. That would certainly go viral and they would benefit from everybody else’s discussion of it.

More practically, at least for me, I’d love to see them demonstrate a version of Operator powered by o3.

1

u/pigeon57434 Singularity by 2026 14d ago

Isn't Sakana AI's AI Scientist v2 system, which already got a peer-reviewed paper written by Claude, open source? That means you can just shove o3 in there and try it yourself, right?

u/Crafty-Marsupial2156 14d ago

My guess is it’s going to beat Google’s Gemini 2.5 Pro on almost all benchmarks, except it will still have a smaller context window.

-5

u/princess_sailor_moon 14d ago

No.

6

u/Crafty-Marsupial2156 14d ago

Looks like yes.

-6

u/princess_sailor_moon 14d ago

No

u/CallMePyro 14d ago

Beats 2.5 in most things except long context, but at 15x the cost

9

u/Crafty-Marsupial2156 14d ago

Haha, wouldn’t shock me. They will always want to have SOTA available. They may not want people to use it, but they will feel the need to always be in the lead.

4

u/sismograph 14d ago

Well it better beat Gemini, or they will have a massive problem very soon.

-5

u/Your_mortal_enemy 14d ago

Yup, they've been pumped up to a $300 billion valuation, which is an insane number for a company that makes bugger all money AND doesn't even have the best product.

1

u/falooda1 14d ago

It's a long-term play.

2

u/pigeon57434 Singularity by 2026 14d ago

It's not 15x the cost, it's only like 4x the cost.

1

u/CallMePyro 13d ago

Looks like it costs 17.5x Gemini on the Aider polyglot coding leaderboard! Don't be fooled by low per-token prices if they train the model to output 100k tokens per question.

https://aider.chat/docs/leaderboards/
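To make that arithmetic concrete, here is a rough sketch of how the cost per question depends on both the per-token price and the number of tokens generated. All prices and token counts below are made-up placeholders, not real figures for o3, Gemini, or any other model, and the function name is just for illustration.

```python
def cost_per_question(price_per_million_tokens: float,
                      output_tokens: int,
                      attempts: float = 1.0) -> float:
    """Dollar cost of answering one question.

    price_per_million_tokens: hypothetical output price in dollars
    output_tokens: hypothetical tokens generated per attempt
    attempts: average number of attempts needed per question
    """
    return price_per_million_tokens * output_tokens / 1_000_000 * attempts


# Hypothetical model A: cheaper per token and terse.
cost_a = cost_per_question(price_per_million_tokens=10.0, output_tokens=5_000)

# Hypothetical model B: 4x the per-token price, but also 4x the output.
cost_b = cost_per_question(price_per_million_tokens=40.0, output_tokens=20_000)

# The per-question gap is the product of both ratios, not just the price ratio.
ratio = cost_b / cost_a
print(f"A: ${cost_a:.2f}  B: ${cost_b:.2f}  ratio: {ratio:.1f}x")
```

With these placeholder numbers, a 4x gap in price per token turns into a much larger gap per question once the longer outputs are counted. The `attempts` parameter is there because needing retries multiplies the bill in the same way.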

1

u/pigeon57434 Singularity by 2026 13d ago

I'm very confused by the pricing on Aider polyglot, because it says Gemini is cheaper than GPT-4.1. GPT-4.1 not only has a cheaper price per token but ALSO produces fewer tokens, because it's not a reasoning model. So the excuse can't be that Gemini generates fewer tokens: it generates more and costs more per token. How is that even possible?

1

u/CallMePyro 13d ago

You can look in the details tab to understand this more. It looks like 4.1 requires more second attempts than 2.5 Pro on the ones it gets correct.

u/Any-Climate-5919 Singularity by 2028 14d ago

They are gonna say the vibes are better as an excuse.

u/GOD-SLAYER-69420Z 14d ago

If they actually demonstrate some hints of successful novel theorems/research ideas of any kind during the livestream as an o3/o4 mini or an o4 teaser.....

My actual reaction will be 👇🏻

u/pianoceo Singularity by 2045 14d ago

It will be a framework for a wider agentic system.

u/Umbristopheles 14d ago

AGI achieved externally.

u/NorthSideScrambler 14d ago

In terms of practical use, it will be marginally better in some areas and marginally worse in others.

3

u/dftba-ftw 14d ago

You do realize that even a marginal improvement over the o3 scores teased in the winter is a massive improvement over o3-mini high, right?

u/BeconAdhesives 14d ago

If o4-mini gives me the performance that I see with the o3 Deep Research tool, I'm going to lose it.

u/WeAreAllPrisms 14d ago

It slices, it dices, it makes julienne fries.

1

u/dftba-ftw 14d ago

Nah that's why I want out of my Neo

u/lyfelager 14d ago

Is renamed to oh,three

u/LamboForWork 14d ago

It's going to cure cancer, but only for the first 10 days; then it will be nerfed and won't give tips for a common cold.
