DeepSeek's owner asked R&D staff to hand in passports so they can't travel abroad. How does this make any sense considering DeepSeek open-sources everything?
Manus turns out to be just Claude Sonnet + 29 other tools, Reflection 70B vibes ngl
Manus turns out to be just Claude Sonnet + 29 other tools
So the much-hyped Manus AI agent from China turns out to be just Claude Sonnet + 29 other tools
Jokic, who was questionable tonight with an ankle injury, becomes the first player with 30 pts, 20 reb, 20 ast in a game in NBA history
QwQ on LiveBench - is better than Sonnet 3.7 (non-thinking)!
Severance - 2x08 "Sweet Vitriol" - Post-Episode Discussion
What's the point of local LLM for coding?
o1-like image generator next? This could be game-changing if it works!
OpenAI's next image generation will likely have some kind of chain of thought/inference time compute usage, probably based on GPT-4o. This could be very interesting.
GPT-4.5 creates a Louis CK style standup routine. This material is new afaik and genuinely funny. I haven't seen any model generate anything remotely close to this
How is Sesame not what everyone is talking about today? This blows ChatGPT Voice out of the water. I am in awe!
GPT-4.5 is a base model. Just compare other thinking models to their non-thinking versions to see what's coming.
The big week has started with an absolute banger!!!!! Claude 3.7 Sonnet absolutely crushes every single competitor in real-world coding tasks by a large margin
Claude 3.7 results in the Aider Polyglot benchmark
3.7 Sonnet LiveBench results are in
Everyone is catching up.
Imagine the balls on this AI putting some guy named Aristotle in the same line as the great man(child)
I am a little disappointed at how little innovation exists when it comes to products using the current LLMs
Grok is about to be permabanned by US and xAI closed lmao
I thought AI would build my app for me... Here's what actually happened...
Sonnet 3.5 is still the king, Grok 3 has been ridiculously over-hyped and other takeaways from my independent coding benchmarks
Despite all the hatred Sam Altman gets online for his double speak about jobs and hype tweets...
Looks like we're going to get GPT-4.5 early. Grok 3 Reasoning Benchmarks
People are seriously downplaying the performance of Grok 3