atomicapple 13 days ago | on: Highly efficient matrix transpose in Mojo
I think the OP based the title on "This kernel achieves 1437.55 GB/s compared to the 1251.76 GB/s we get in CUDA" (a 14.8% improvement) rather than on the final kernels, for whatever reason.
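For anyone who wants to check the 14.8% figure, here is a quick sketch of the arithmetic. It assumes the percentage is the relative improvement of the quoted kernel bandwidth over the CUDA bandwidth; the function and variable names are mine, not from the article.

```python
def speedup_percent(new_gbps: float, baseline_gbps: float) -> float:
    """Relative improvement of new_gbps over baseline_gbps, in percent."""
    return (new_gbps / baseline_gbps - 1.0) * 100.0


# Bandwidth figures as quoted in the comment above (GB/s).
kernel_gbps = 1437.55  # "This kernel"
cuda_gbps = 1251.76    # "what we get in CUDA"

print(f"{speedup_percent(kernel_gbps, cuda_gbps):.1f}%")  # prints 14.8%
```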