Qwen2-7B-Instruct with TensorRT-LLM: consistently high tokens/SEC (inferless.com) | 1 point by agcat on Sept 5, 2024 | 1 comment
Fast Cold-starts for Serverless GPU Inference is becoming a reality (inferless.com) | 1 point by agcat on May 29, 2024 | 1 comment
LLMs Tokens/Second Benchmark (Mistral, Llama2, Gemma) – Independent Research (inferless.com) | 2 points by agcat on March 25, 2024
Show HN: Scale PDF Q&A App to 10K Users with GPUs – <$250/Mo (inferless.com) | 7 points by agcat on March 4, 2024 | 2 comments
Finetune Phi-2 with DPO (inferless.com) | 1 point by agcat on Feb 1, 2024 | 1 comment
The state of serverless GPUs (inferless.com) | 1 point by goeldhru on Nov 7, 2023
Deploying Hugging Face Models on Nvidia Triton Inference Server at Scale (inferless.com) | 2 points by agcat on July 21, 2023 | 1 comment
The State of Serverless GPUs (inferless.com) | 105 points by kiyanwang on April 28, 2023 | 76 comments
The State of Serverless GPUs (inferless.com) | 1 point by dom_fr on April 18, 2023