Show HN: Clang-uml – C++ to UML diagram generator based on Clang (github.com/bkryza)
79 points by bkryza on June 26, 2023 | 20 comments
Hi,

clang-uml is an open-source C++ to UML diagram generator, driven by YAML configuration files.

The main idea behind the project is to easily maintain up-to-date diagrams within a code-base or document legacy code.

The configuration file for clang-uml defines the types and contents of each generated diagram.

Diagrams can currently be generated in PlantUML and JSON formats.

Main features:
- class, sequence, package and include diagrams
- up to C++17 with support for C++20 concepts
- visualization of template specialization relationships
- declarative diagram content filtering based on namespaces, elements and relationships
- relationship inference from C++ containers, smart pointers and custom templates
- customizable interactive links in diagrams (SVG output only)
- generation of UML packages from namespaces or directories
- JSON output containing intermediate diagram model representation for custom processing
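
To make the relationship-inference item concrete, here is a minimal sketch of the kind of C++ input it refers to. The types (`Car`, `Wheel`, `Engine`, `Driver`) are hypothetical, and the relationship named in each comment is how such members are conventionally read in UML; it is an assumption, not captured clang-uml output.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types, invented purely for illustration.
struct Wheel {
    int diameter_mm;
};

struct Engine {
    int horsepower;
};

struct Driver {
    std::string name;
};

class Car {
public:
    Car(int horsepower, std::shared_ptr<Driver> driver)
        : engine_{std::make_unique<Engine>(Engine{horsepower})},
          driver_{std::move(driver)} {}

    void add_wheel(int diameter_mm) { wheels_.push_back(Wheel{diameter_mm}); }

    std::size_t wheel_count() const { return wheels_.size(); }
    int horsepower() const { return engine_->horsepower; }
    const std::string& driver_name() const { return driver_->name; }

private:
    // Assumed reading: a container member implies "Car has many Wheel".
    std::vector<Wheel> wheels_;
    // Assumed reading: an owning smart pointer implies "Car owns one Engine".
    std::unique_ptr<Engine> engine_;
    // Assumed reading: a shared pointer implies "Car refers to a Driver".
    std::shared_ptr<Driver> driver_;
};
```

None of the members names `Wheel`, `Engine` or `Driver` as a plain field type; the link to each class only appears as a template argument, which is why a diagram generator has to look inside containers and smart pointers to draw an edge at all.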

More features, usage information and examples are in the README at https://github.com/bkryza/clang-uml and in the online docs at https://clang-uml.github.io

In particular, check out the diagrams generated from test cases here: https://clang-uml.github.io/md_docs_2test__cases.html



Remember the days people thought they could build everything in UML diagrams, and have the code essentially be generated or trivially written from there? Fast forward to 2023 and we're generating UML diagrams from C++ code because it's easier. I love this.


The only reason UML didn't pan out was because it isn't code, and if it's not code, software engineers don't want to do it because software engineers are lazy when it comes to things that are neither code nor one-step-away-from-code tasks. They will go build a million different things instead of documenting. As a consequence, they lose their engineering title and get called developers.

UML works.

One can lead a horse to water... just like one can stand up ALL of the top, fully-functional UML tools for a software team at what used to be a performant defense contractor, and have literally ZERO people use it.

Same with system engineers and commitment. Requirements work. Model driven time wasting doesn't.


> Remember the days people thought they could build everything in UML diagrams, and have the code essentially be generated or trivially written from there?

Those days were a decade ago, and everyone with a license to software like Enterprise Architect did that already with a click of a button.

Even before that, this was within the reach of everyone with FLOSS tools like Umbrello.

What exactly was your point?

> Fast forward to 2023 and we're generating UML diagrams from C++ code because it's easier. I love this.

You missed the whole point of this tool. This tool is ideal for those of us who work on legacy projects and want a high level view of the project structure to make it easier to get a feel of the project.


> This tool is ideal for those of us who work on legacy projects and want a high level view of the project structure to make it easier to get a feel of the project.

Which is anyone more than a few months into any project developed by multiple people.

I continue to be amazed at how our entire field claims to consider code readability paramount, and yet invests approximately zero effort into making codebases legible.


I'd bet that with GPT-XX based code generation, UML sequence diagrams may make a resurgence. It's at least good to have some kind of automated 'design lint' to run, even if just to ensure a new feature or commit has not broken the high level assumptions/state machine. In the codebase I work in, I definitely could use something like this regularly.


That's funny, I was just thinking this the other day, because there was another HN thread that seemed largely critical of UML's legacy. I remember being a little surprised (with all the GPT craze) that no one had suggested a possible resurgence of UML (or something with similar features) for directing code generation.


Learning "true" UML seems like too much overhead. Perhaps something more flexible, in the direction of Ink & Switch's programmable ink, would be easier to pick up: https://www.inkandswitch.com/inkbase/


UML... the only pictures that _don't_ paint a thousand words!


Not sure what cursed C++ you've seen but UML is easily 100x worse.

There may be a few hundred good C++ programmers around but nobody can grok UML, if somebody says they can: they're a liar.


> Not sure what cursed C++ you've seen but UML is easily 100x worse.

I don't know what UML you've seen, but the one I typically see only expresses high level details and is created to convey the important details, and therefore is trivially simple to understand.


This is why code bases from the Java 1.4 days are so unwieldy.


Doxygen in its full glory, with all flags enabled, can come pretty close to this.

It has inheritance, composition and call graphs, it can include source code fragments, and everything is interlinked.

It is very useful to get you started in an unknown codebase.


Yes, absolutely, in fact clang-uml also uses Doxygen for its own documentation - although diagrams are 'self'-generated by clang-uml - for instance see here: https://clang-uml.github.io/structclanguml_1_1config_1_1conf...

However, the main difference between clang-uml and Doxygen is that in Doxygen the diagrams are a by-product of generated documentation. clang-uml aims to give you much more control over what you want in your diagrams, without having to use Doxygen or any other documentation tool, which possibly makes it more flexible in how you can use its output.

Obviously, which approach you prefer depends on your use case and needs.


Ideally we should write and organize our code such that it is easy to visualize. We shouldn’t write code and then visualize it. And we also shouldn’t think we have to abandon text-based editors when we want to do such.

People forget that higher level languages are only to help _people_ understand and organize code better.

I’m waiting for the first programming language or text editor that visualizes your code from the first line you type.


F# and OCaml are kind of great for this in my opinion.

With F# in particular, the code project/workspace has a root .fsproj file where you list the dependency order of files. The topmost file cannot depend on or call anything from any of the files below, and the bottommost file is the program's entry point, which depends on all the other files in the project and cannot itself be depended on. OCaml has the same rule, but the dependency order is automatically detected (which I comparatively dislike to be honest, as the .fsproj file serves as easy documentation - although I've seen tools that generate a graph for OCaml's dependency order).

The same dependency rule (code at the top of the file cannot call code below) is followed within a single file too by default. It is possible to opt out of this rule on an intra-file basis (within the same file), but I think doing so is discouraged.

I've found that much more helpful in understanding code than UML diagrams in all honesty.

I remember (before I learned F# or OCaml) reading some of VS Code's source code[0] for learning purposes and finding that class A referenced another class B, and that class B referenced class A which I found confusing. I managed to overcome my issue by drawing a dependency tree of functions which gave some chronological order and understanding.

[0] This to be precise: https://github.com/microsoft/vscode-textbuffer .


Anybody building open source tools for C++ is great, so thanks. Most confusion for me happens around dynamic dispatch. I want a good tool to generate and control dynamic call graphs.


>Most confusion for me happens around dynamic dispatch. I want a good tool to generate and control dynamic call graphs.

I use the profiler (e.g. gprof call graph) and good old "printf/log tracing" for this.

PS: See also -finstrument-functions in gcc : https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.h... and https://balau82.wordpress.com/2010/10/06/trace-and-profile-f...
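
A minimal sketch of how the -finstrument-functions approach can expose dynamic dispatch at run time, assuming GCC or Clang. The two `__cyg_profile_func_*` hooks are the ones the compiler calls when the flag is set; the counters and the `Shape`, `Triangle`, `Square` and `count_sides` names are hypothetical, added only for illustration.

```cpp
#include <cstdio>

// Hypothetical bookkeeping for this sketch.
static int g_depth = 0;        // current call nesting depth
static int g_enter_count = 0;  // number of function entries observed

extern "C" {

// Called on entry to every instrumented function when the program is
// built with -finstrument-functions. The attribute keeps the hook itself
// from being instrumented, which would otherwise recurse forever.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    ++g_enter_count;
    std::fprintf(stderr, "%*senter %p (called from %p)\n",
                 g_depth * 2, "", this_fn, call_site);
    ++g_depth;
}

// Called just before every instrumented function returns.
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    (void)call_site;  // unused in this sketch
    --g_depth;
    std::fprintf(stderr, "%*sexit  %p\n", g_depth * 2, "", this_fn);
}

}  // extern "C"

// Hypothetical class hierarchy with a virtual call whose target is only
// known at run time.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

struct Square : Shape {
    int sides() const override { return 4; }
};

// The call to s.sides() is dispatched dynamically; in an instrumented
// build the trace shows which override actually ran.
int count_sides(const Shape& s) { return s.sides(); }
```

Built with something like `g++ -g -finstrument-functions trace.cpp` plus a `main` that calls `count_sides`, the trace on stderr prints raw addresses; a tool such as `addr2line` can map them back to function names. Without the flag the hooks are never called and the program behaves as usual.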


Awesome! I'd been looking for something like this for the past few weeks. I'm mainly interested in sequence diagram generation, so I can visualise the program/code flow.


I used Umbrello (https://apps.kde.org/umbrello/) in the long long ago, might be worth checking out too if you haven't already.


This looks amazing for digging through large amounts of legacy code.



