Show HN: Austin-Tui – Spy inside a running Python program at no performance cost (github.com/p403n1x87)
256 points by p403n1x87 on Oct 27, 2020 | 36 comments



Two other profilers that also let you spy on a running Python program:

- py-spy: https://github.com/benfred/py-spy (written in Rust)

- pyflame: https://github.com/uber-archive/pyflame (C++, seems to be not maintained anymore)

The "no performance cost" thing is interesting: my experience writing a similar profiler is that there are a couple of things that can affect performance a little bit:

1. You have to make a lot of system calls to read the memory of the target process, and if you want to sample at a high rate then that does use some CPU. This can be an issue if you only have 1 CPU.

2. You have two choices when reading memory from a process: you can either race with the program and hope that you read its memory to get the function stack before it changes what function it's running (and you're likely to win the race, because C is faster than Python), or you can pause the program briefly while taking a sample. py-spy has an option to choose which one you want: https://github.com/benfred/py-spy#how-can-i-avoid-pausing-th...

This method definitely has much lower overhead than a tracing profiler that instruments every single function call, and in practice it works well.

One thing I think is nice about this kind of profiler is that reading memory from the target process sounds like a complicated thing, but it's not: you can see austin's code for reading memory here, and it's implemented for 3 platforms in just 130 lines of C: https://github.com/P403n1x87/austin/blob/877e2ff946ea5313e47...
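As a rough sketch of how simple that read really is, here's a hypothetical Linux-only equivalent in Python, calling `process_vm_readv(2)` through ctypes. The function name `read_process_memory` and the self-read demo are illustrative, not Austin's actual API; Austin's C code also covers the macOS and Windows equivalents.

```python
import ctypes
import ctypes.util
import os
import sys

class iovec(ctypes.Structure):
    # struct iovec from <sys/uio.h>
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]

def read_process_memory(pid, address, size):
    """Read `size` bytes at `address` in process `pid` via process_vm_readv(2)."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.process_vm_readv.restype = ctypes.c_ssize_t
    libc.process_vm_readv.argtypes = [
        ctypes.c_int,                           # pid
        ctypes.POINTER(iovec), ctypes.c_ulong,  # local iov(s)
        ctypes.POINTER(iovec), ctypes.c_ulong,  # remote iov(s)
        ctypes.c_ulong,                         # flags
    ]
    buf = ctypes.create_string_buffer(size)
    local = iovec(ctypes.cast(buf, ctypes.c_void_p), size)
    remote = iovec(address, size)
    if libc.process_vm_readv(pid, ctypes.byref(local), 1,
                             ctypes.byref(remote), 1, 0) != size:
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno))
    return buf.raw

out = None
if sys.platform == "linux":
    # Demo: read from our own address space (inspecting yourself needs no
    # extra privileges; attaching to another pid requires ptrace rights).
    target = ctypes.create_string_buffer(b"hello")
    out = read_process_memory(os.getpid(), ctypes.addressof(target), 5)
```

A real profiler does the same syscall, just pointed at another pid and at the addresses of the interpreter's frame structs.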


> you can either race with the program and hope that you read its memory to get the function stack before it changes what function it's running (and you're likely to win the race, because C is faster than Python), or you can pause the program briefly while taking a sample.

Py-spy defaults to blocking because the results can be pretty wrong otherwise: https://github.com/benfred/py-spy/issues/56 . You can see this problem profiling a program like https://github.com/benfred/py-spy/blob/master/tests/scripts/... with or without the nonblocking flag in py-spy - the nonblocking version produces garbage output.

Somewhat interestingly, this problem doesn't seem to occur with Ruby - and rbspy can get away without pausing the target program with only minor errors seen when profiling a similar function. I suspect this is because of differences between how the Ruby and Python interpreters store call stack information, but haven't had a chance to dig into the specifics.


Also, this kind of profiler is great because you can use it on any running Python program, which is pretty magical and very useful. (especially when it's an application you didn't write)

But it's not right for every use case: by design austin/py-spy can only really profile the whole program, and if you want to profile a specific function or endpoint in your program, something like PyInstrument https://github.com/joerick/pyinstrument (which includes Django middlewares & Flask decorators) is a lot more useful.
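To illustrate the "profile just one function" use case, here's a minimal sketch using the stdlib's cProfile (a deterministic/tracing profiler, unlike PyInstrument's sampler) scoped to a single call; `handler` and `fib` are made-up stand-ins for an endpoint.

```python
import cProfile
import io
import pstats

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def handler():
    # Stand-in for a single endpoint/request handler.
    return fib(18)

# Scope the profiler to just this call, not the whole program.
pr = cProfile.Profile()
pr.enable()
result = handler()
pr.disable()

report = io.StringIO()
pstats.Stats(pr, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

PyInstrument's middlewares/decorators wrap exactly this kind of enable/disable boundary around each request for you.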


For these kinds of things one could use

https://github.com/P403n1x87/austin-python

An early attempt at an APM that was specifically designed to measure endpoint response times was here

https://github.com/P403n1x87/austin/blob/apm/austin/apm.py

The problem with statistical profilers is that they might fail to catch every single call, especially short-lived ones.


All good points. The "no performance cost" is indeed more like "negligible performance cost". That's because multicore architectures are quite ubiquitous these days and standard Python applications are single-process. For multi-process Python applications, a busy profiler would certainly steal a good chunk of a core, so the impact might be noticeable in that case.

As for the race conditions, Austin does not introduce any pauses. Even if it did, there would be no guarantee that it paused at a "good" point, so there are no real benefits in terms of accuracy in pausing. Error rates are quite low anyway, so the actual benefit comes from not pausing at all.


>you're likely to win the race, because C is faster than Python

Heh, so the only reason there's "no performance cost" is the enormous performance cost of using Python in the first place?


Yes, enormous performance cost in CPU hours and enormous performance gain in human life hours. hehe


I seriously, seriously beg to differ.


Running Python is not a performance cost. The meaning behind "no performance cost" is that a tool like this is unlikely to impact the performance of the application being profiled. The fact that Python is not a "fast" programming language is a different matter.


The point is that it's only possible in the first place because Python leaves so much performance on the table. You can't snoop inside a C++ program in the same way - unless you throw a bunch of sleep() calls everywhere to slow it down, and then hey presto you can!

Or, to put it another way, taking Python from 500x SlowerThanCee to 501x is "negligible", but taking C from 1x to 2x slower isn't.


true that


Correct.


Note that tracing profilers don't need to be high overhead - Python is slow enough that efficient tracing can be mostly hidden. For example, https://functiontrace.com tends to have <10% overhead when tracing.
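For contrast, a toy tracing profiler built on the stdlib's `sys.setprofile` hook shows where that overhead comes from: the interpreter invokes the callback on every single call event (the names `work` and `helper` below are illustrative).

```python
import sys
from collections import Counter

calls = Counter()

def profiler(frame, event, arg):
    # The interpreter invokes this hook on *every* call event; that
    # per-call bookkeeping is the overhead a sampler avoids.
    if event == "call":
        calls[frame.f_code.co_name] += 1

def helper(i):
    return i * i

def work(n):
    total = 0
    for i in range(n):
        total += helper(i)
    return total

sys.setprofile(profiler)
result = work(100)
sys.setprofile(None)
print(calls["helper"], calls["work"])  # 100 1
```

Tools like functiontrace keep this cheap by doing the per-event work in native code rather than in a Python callback.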


This is a TUI for the profiler Austin, whose README has a bit more detail on what it can do:

https://github.com/P403n1x87/austin


What’s the difference between a TUI and a CLI?

(Also my preferred TUI acronym is tactile user interface).


Both use the terminal, but in different ways. A CLI uses the basic terminal functionality of writing and perhaps reading lines, or even simply accepts flags and runs non-interactively. A TUI uses the terminal's advanced features, treating the terminal as the basis for a kind of GUI, generally occupying its full area.

Vim, Midnight Commander, and htop have TUIs. They rely heavily on the terminal's 'control character' features to accomplish this. apt-get has a CLI, as its interactive IO is handled with printing lines and having the user submit lines (even if it's just the letter y).

Somewhere in the middle are interfaces like bash and zsh which make light use of the terminal's advanced features for things like auto-complete, but which don't take over the whole terminal area.

I'd count non-interactive applications like gcc and sort as command-line applications, although strictly speaking you could just as well use a graphical interface to configure their flags and run commands.

See also https://en.wikipedia.org/wiki/Text-based_user_interface


If a CLI program ceases to make sense when its standard-output is a physical line-printer onto a roll of paper, then it's a TUI.


God these explanations suck. Reminds me of the "two boats meet in an ocean" explanation of IRC.

A CLI (command line interface) just uses the streams to show and take in data.

A TUI uses escape codes or proprietary console commands (the latter on Windows, the former on most other things) to take control of the entire console window and display a user interface (hence Text User Interface or TUI). Usually mouse input is handled on modern systems too.
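For a concrete taste of those escape codes, here's a minimal sketch of the ANSI sequences a TUI emits to position the cursor and repaint the screen (the function names are made up for the demo).

```python
CSI = "\x1b["  # Control Sequence Introducer: ESC followed by '['

def move_to(row, col):
    # CUP (cursor position): rows/cols are 1-based.
    return f"{CSI}{row};{col}H"

def clear_screen():
    # ED (erase in display), parameter 2 = whole screen.
    return CSI + "2J"

def bold(text):
    # SGR (select graphic rendition): 1 = bold, 0 = reset.
    return f"{CSI}1m{text}{CSI}0m"

# A CLI would just print a line; a TUI paints cells at positions:
frame = clear_screen() + move_to(1, 1) + bold("CPU") + move_to(1, 10) + "42%"
print(repr(frame))
```

Libraries like curses generate these sequences for you (consulting the terminfo database for the terminal's actual dialect) instead of hard-coding them.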


CLI refers to interacting with a program by typing commands. Commands need not come from a terminal (a terminal need not even exist). They might be just passed in from a different program.

TUI refers to interacting with a program via the terminal—the terminal is the program's user interface (think 'nano', which you can't really run without a terminal). The terminal need not interact via commands, although many TUI programs also accept commands.



I would say that a CLI is primarily for entering commands, while a TUI is primarily for viewing and interacting with data.


Back in the MS-DOS days a TUI would be something done with Turbo Vision or Clipper, simulating GUIs with text.


Love the TUI! Which library did you use for it?


I wrote a blog post about the technology behind the TUI

https://p403n1x87.github.io/the-austin-tui-way-to-resourcefu...

It's a custom resource-based framework that uses curses as back-end.


This looks really interesting - thanks!

One question: in the first XML example (minimal-view.xml), why is the root element a <aui:MinimalView ...> but then the last line ends it with </aui:MiniTop> ? Is it a typo or is there something else going on?


Ah! Totally a typo, thanks for that :)


Can Austin attach to a Python process running in a Docker container from the host system? `sudo austin-tui -Cp <pid>` I used pyflame for that at some point.


That should work


looks awesome. how easy is it to use this to profile a webapp written in a framework like django? Open-metrics? just curious


Pyinstrument might be easier to profile web requests: https://github.com/joerick/pyinstrument#profile-a-web-reques...


One benefit of Austin-based tools is that they don't require any instrumentation/extra configuration. The low overhead means that you can just attach to an application that is running in production.


As easy as profiling with Austin itself. Just pass the pid of the web server:

sudo austin-tui -Cp <webserverpid>

See https://github.com/P403n1x87/austin#examples for more details. To avoid using sudo on Linux you can do

sudo setcap cap_sys_ptrace+ep `which austin`

and then simply

austin-tui -Cp <webserverpid>


Is it possible to spy on a Python script running in a Python 2 virtual environment?


yep :)


Great news, can't wait to try it on some long running reports.


Wasn't there lots of work around using dtrace for this?



