

simonw 77 days ago | on: Mistral ships Le Chat – enterprise AI assistant th...

MLX reports peak memory usage at the end of the response. Otherwise I'll use Activity Monitor.

aukejw 77 days ago

I'm also trusting `get_peak_memory` + some small buffer for now.
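For readers who want to try the "peak plus a small buffer" approach, here is a minimal sketch. The helper names (`padded_budget_bytes`, `mlx_peak_bytes`) and the 10% default buffer are illustrative assumptions, not anything from the comment. `mx.get_peak_memory()` is the call in recent MLX releases as far as I know; older releases exposed it under `mx.metal`, so the lookup below tries both and returns `None` when MLX is not installed.

```python
import math
from typing import Optional


def padded_budget_bytes(peak_bytes: int, buffer_fraction: float = 0.1) -> int:
    """Pad a measured peak with a relative safety buffer.

    The 10% default is an arbitrary assumption; tune it for your workload.
    """
    if peak_bytes < 0 or buffer_fraction < 0:
        raise ValueError("peak_bytes and buffer_fraction must be non-negative")
    return peak_bytes + math.ceil(peak_bytes * buffer_fraction)


def mlx_peak_bytes() -> Optional[int]:
    """Return MLX's reported peak memory in bytes, or None if unavailable."""
    try:
        import mlx.core as mx
    except ImportError:
        return None
    # Recent releases: mx.get_peak_memory; older ones: mx.metal.get_peak_memory.
    getter = getattr(mx, "get_peak_memory", None)
    if getter is None:
        metal = getattr(mx, "metal", None)
        getter = getattr(metal, "get_peak_memory", None)
    return int(getter()) if getter is not None else None


if __name__ == "__main__":
    peak = mlx_peak_bytes()
    if peak is None:
        print("MLX not available; nothing to report")
    else:
        print(f"peak: {peak} bytes, budget: {padded_budget_bytes(peak)} bytes")
```

Run it at the end of a generation or benchmark step so the peak reflects the work you care about.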

That said, it reports accurate peak memory usage for tensors living on the GPU, but seems to miss some of the non-Metal overhead, however small (https://github.com/aukejw/mlx_transformers_benchmark/issues/...).
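One rough way to see memory outside the framework's own counter is to read the process's peak resident set size from the OS. This is a sketch using only the Python standard library, not the method used in the linked issue. The function name `peak_rss_bytes` is made up for illustration, and the sketch makes no assumption about how much of the Metal allocation shows up in RSS on a given machine.

```python
import resource
import sys


def peak_rss_bytes() -> int:
    """Peak resident set size of this process, in bytes (Unix only)."""
    max_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    return max_rss if sys.platform == "darwin" else max_rss * 1024


if __name__ == "__main__":
    print(f"process peak RSS: {peak_rss_bytes() / 2**20:.1f} MiB")
```

Comparing this number with the framework-reported peak after the same run gives a feel for the size of the gap, which is one way to pick the buffer.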
