Submissions from inferless.com

		Qwen2-7B-Instruct with TensorRT-LLM: consistently high tokens/SEC (inferless.com)
		1 point by agcat on Sept 5, 2024 | 1 comment
		Fast Cold-starts for Serverless GPU Inference is becoming a reality (inferless.com)
		1 point by agcat on May 29, 2024 | 1 comment
		LLMs Tokens/Second Benchmark (Mistral, Llama2, Gemma) – Independent Research (inferless.com)
		2 points by agcat on March 25, 2024
		Show HN: Scale PDF Q&A App to 10K Users with GPUs – <$250/Mo (inferless.com)
		7 points by agcat on March 4, 2024 | 2 comments
		Finetune Phi-2 with DPO (inferless.com)
		1 point by agcat on Feb 1, 2024 | 1 comment
		The state of serverless GPUs (inferless.com)
		1 point by goeldhru on Nov 7, 2023
		Deploying Hugging Face Models on Nvidia Triton Inference Server at Scale (inferless.com)
		2 points by agcat on July 21, 2023 | 1 comment
		The State of Serverless GPUs (inferless.com)
		105 points by kiyanwang on April 28, 2023 | 76 comments
		The State of Serverless GPUs (inferless.com)
		1 point by dom_fr on April 18, 2023