grepfru_it | 32 days ago | on: Workhorse LLMs: Why Open Source Models Dominate Cl...

I am curious about the need for 70 t/sec?
Aeolun | 32 days ago

Waiting minutes for your call to succeed is too frustrating?
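(A rough sketch of the latency stakes: time to finish is just response length divided by throughput. The ~1,000-token response length below is an assumed illustration, not a figure from the thread:)

    # Back-of-envelope generation latency at two throughputs.
    response_tokens = 1_000            # assumed typical response length
    for tps in (70, 10):
        print(f"{tps} t/s -> {response_tokens / tps:.0f} s to finish")
    # 70 t/s -> 14 s; 10 t/s -> 100 s, which is where "waiting minutes" starts.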
ekianjo | 32 days ago

Depends entirely on the use case. Not every LLM workflow is a chatbot.
jbellis | 31 days ago

No, but if you're not latency-sensitive you should probably be using DeepSeek v3 (cheaper than Flash, significantly smarter).
lostmsu | 31 days ago

What makes you believe DeepSeek is smarter than Flash 2.5? It is lower on all leaderboards.
jbellis | 31 days ago

You're right; I should clarify that I'm talking about no-thinking mode. Otherwise Flash goes from "a bit more expensive than DSv3" to "10x more expensive".
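(A hedged sketch of why a thinking mode shifts the cost picture: reasoning tokens are billed as output tokens, so the effective price of an answer scales with how much the model "thinks". The price and token counts below are hypothetical placeholders, not quoted rates for Flash or DeepSeek:)

    # Hypothetical output price and token counts, for illustration only.
    price_per_output_token = 0.6 / 1e6   # assumed $/token; not a real quote
    answer_tokens = 500                  # visible answer
    thinking_tokens = 4_500              # hidden reasoning tokens in thinking mode
    plain = answer_tokens * price_per_output_token
    thinking = (answer_tokens + thinking_tokens) * price_per_output_token
    print(f"no-thinking: ${plain:.6f}, thinking: ${thinking:.6f} (~{thinking / plain:.0f}x)")
    # Billing 10x as many output tokens makes the answer ~10x as expensive,
    # which is the jump the comment describes.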
cootsnuck | 31 days ago

High-concurrency voice AI systems.
grepfru_it | 29 days ago

Why are you self-hosting that?