Hacker News
Qwen2-7B-Instruct with TensorRT-LLM: consistently high tokens/SEC (inferless.com)
1 point by agcat on Sept 5, 2024 | past | 1 comment
Fast Cold-starts for Serverless GPU Inference is becoming a reality (inferless.com)
1 point by agcat on May 29, 2024 | past | 1 comment
LLMs Tokens/Second Benchmark (Mistral, Llama2, Gemma) – Independent Research (inferless.com)
2 points by agcat on March 25, 2024 | past
Show HN: Scale PDF Q&A App to 10K Users with GPUs – <$250/Mo (inferless.com)
7 points by agcat on March 4, 2024 | past | 2 comments
Finetune Phi-2 with DPO (inferless.com)
1 point by agcat on Feb 1, 2024 | past | 1 comment
The state of serverless GPUs (inferless.com)
1 point by goeldhru on Nov 7, 2023 | past
Deploying Hugging Face Models on Nvidia Triton Inference Server at Scale (inferless.com)
2 points by agcat on July 21, 2023 | past | 1 comment
The State of Serverless GPUs (inferless.com)
105 points by kiyanwang on April 28, 2023 | past | 76 comments
The State of Serverless GPUs (inferless.com)
1 point by dom_fr on April 18, 2023 | past
