More

CSDude · 2025-11-14T17:38:08 1763141888

I used Xmonad for a while, then switched to awesomewm used it for years. It was good on a 1366x768 screen to use space efficiently.

CSDude · 2025-10-22T22:10:35 1761171035

Imagine being a world class F1 driver and (someone) still have to upload your CV somewhere.

ddalex · 2025-10-23T17:16:18 1761239778

a couple of weeks ago Verstappen raced in a "Advanced-amateur" competition in Germany - he had to be "trained" by an official instructor in a restricted car because he hadn't raced there before

I imagine the instructor "What could I teach Verstappen now..."

CSDude · 2025-10-19T11:50:20 1760874620

There is an embedded one in DuckDB for a while now and it's great. I get the apeal of yours but this one is much easier to use for same cases:

https://duckdb.org/2025/03/12/duckdb-ui

mrs6969 · 2025-10-19T11:53:22 1760874802

This is not a self hosted one though. You can not use default ui offline, you can not guarantee data safety

jasonjmcghee · 2025-10-19T14:20:28 1760883628

It's very weird they don't offer it by default, but there are workarounds.

(You can use it offline)

https://github.com/duckdb/duckdb-ui/issues/62

ludicrousdispla · 2025-10-19T15:56:50 1760889410

some of the comments on that thread are surprising. Are people not aware that software can be bundled in such a way as to run on machines not having internet access?

dkdcio · 2025-10-19T16:08:16 1760890096

the background is this UI is the MotherDuck UI for their cloud SaaS app. MotherDuck is a VC-backed DuckDB SaaS company, not to be confused with DuckDB Labs or the DuckDB foundation

MotherDuck decided to take their web app UI and make it a locally usable extension via DuckDB. however as noted in that thread, the architecture is a bit odd as the actual page loads once the extension is running from MotherDuck’s servers (hence the online requirement)

I don’t think it’s intentionally malicious or bad design or anything, just how this extension came about (and sounds like they’re fixing it)

disclaimer: I do know and actively work with the MotherDuck folks, I’ve also worked w/ DuckDB Labs in the past

ludicrousdispla · 2025-10-19T16:26:37 1760891197

Thanks, any recommendations on where to find the best information reference/resource for DuckDB-Wasm?

hamandcheese · 2025-10-19T17:17:09 1760894229

> MotherDuck is a VC-backed DuckDB SaaS company, not to be confused with DuckDB Labs or the DuckDB foundation

Separate entities, but cooperative/comprised of many overlapping people, right?

matsonj · 2025-10-19T17:42:18 1760895738

Afaik no overlapping people

nirav72 · 2025-10-19T14:09:53 1760882993

Anyone know if there is a similar selfhosted/run local option?

Rhubarrbb · 2025-10-19T15:28:29 1760887709

`duckdb -ui` and you can launch a local server bound to 127.0.0.1

curtisblaine · 2025-10-19T20:30:16 1760905816

Doesn't do charts though (meaning: it does statistic histograms on your columns but no custom chart like OP's software does).

CSDude · 2025-09-22T16:04:22 1758557062

It's so opinionated but many people find it okay. And it's hard to install Arch successfully. Compared to Ubuntu Arch's package manager (also combined with AUR) are great.

I use every possible opportunity to say "Fuck Ubuntu Snaps"

dham · 2025-09-22T16:20:14 1758558014

>And it's hard to install Arch successfully.

archinstall. You can even select a DE in it

CSDude · 2025-09-22T17:07:24 1758560844

I only learned it with Omarchy after all of these years :(

CSDude · 2025-06-21T10:57:47 1750503467

Blanket statements like this miss the point. Not all data is waste. Especially high-cardinality, non-sampled traces. On a 4-core ClickHouse node, we handled millions of spans per minute. Even short retention windows provided critical visibility for debugging and analysis.

Sure, we should cut waste, but compression exists for a reason. Dropping valuable observability data to save space is usually shortsighted.

And storage isn't the bottleneck it used to be. Tiered storage with S3 or similar backends is cheap and lets you keep full-fidelity data without breaking the budget.

jiggawatts · 2025-06-21T12:53:40 1750510420

I agree with both you and the person you're replying to, but...

My centrist take is that data can be represented wastefully, which is often ignored.

Most "wide" log formats are implemented... naively. Literally just JSON REST APIs or the equivalent.

Years ago I did some experiments where I captured every single metric Windows Server emits every second.

That's about 15K metrics, down to dozens of metrics per process, per disk, per everything!

There is a poorly documented API for grabbing everything ('*') as a binary blob of a bunch of 64-bit counters. My trick was that I then kept the previous such blob and simply took the binary difference. This set most values to zero, so then a trivial run length encoding (RLE) reduced a few hundred KB to a few hundred bytes. Collect an hour of that, compress, and you can store per-second metrics collected over a month for thousands of servers in a few terabytes. Then you can apply a simple "transpose" transformation to turn this into a bunch of columns and get 1000:1 compression ratios. The data just... crunches down into gigabytes that can be queried and graphed in real time.

I've experimented with Open Telemetry, and its flagrantly wasteful data representations make me depressed.

Why must everything be JSON!?

nijave · 2025-06-21T18:13:59 1750529639

I think Prometheus works similar to this with some other tricks like compressing metric names.

OTEL can do gRPC and a storage backend can encode that however it wants. However, I do agree it doesn't seem like efficiency was at the forefront when designing OTEL

valyala · 2025-06-22T12:52:17 1750596737

These tricks are essential for every database optimized for metrics / logs / traces. For example, you can read on how VictoriaMetrics can compress production metrics to less than a byte per sample (every sample includes metric name, key=value labels, numeric metric value and metric timestamp with millisecond precision). https://faun.pub/victoriametrics-achieving-better-compressio...

pdimitar · 2025-06-21T20:52:58 1750539178

Very curious to read your code doing it. Thought of a very similar approach but never had the time. Are you keeping it somewhere?

jiggawatts · 2025-06-22T01:07:03 1750554423

I only ever got it to a proof of concept. The back end worked as advertised, the issue was that there are too many bugs in WMI so collecting that many performance counters had weird side effects.

Google was doing something comparable internally and this spawned some fun blog titles like “I have 64 cores but I can’t even move my mouse cursor.”

pdimitar · 2025-06-22T09:15:18 1750583718

Ah, I don't mean the Windows-specific stuff. I mean the binary diffing and RLE.

While not difficult, I am just curious how others approached it.

ofrzeta · 2025-06-21T11:22:57 1750504977

> Dropping valuable observability data to save space is usually shortsighted

That's a bit of a blanket statement, too :) I've seen many systems where a lot of stuff is logged without much thought. "Connection to database successful" - does this need to be logged on every connection request? Log level info, warning, debug? Codebases are full of this.

nijave · 2025-06-21T18:10:24 1750529424

Yes, it allows you to bisect a program to see the block of code between log statements where the program malfunctioned. More log statements slice the code into smaller blocks meaning less places to look.

citrin_ru · 2025-06-21T11:56:50 1750507010

Probably not very useful for prod (non debug) logging, but it’s useful when such events are tracked in metrics (success/failure, connect/response times). And modern databases (including ClickHouse) can compress metrics efficiently so not much space will be spent on a few metrics.

throwaway0665 · 2025-06-21T11:40:39 1750506039

There's always another log that could have been key to getting to the bottom of an incident. It's impossible to know completely what will be useful in advance.

vidro3 · 2025-06-21T23:32:06 1750548726

in our app each user polls for a resource availability every 5 mins. do we really need "connection successful" 500x per minute? i dont see this as breaking up the logs into smaller sections. i see it as noise. i'd much rather have a ton of "connection failed" whenever that occurs than the "success" constantly

CSDude · 2025-06-08T06:58:40 1749365920

I agree that education needs overhaul, it's scary for new comers, AI can make mistakes that you need to be careful (so does old StackOverflow answers) but let’s be honest: Most employers aren’t paying for your art or your dopamine.

throw93484i4 · 2025-06-08T07:11:50 1749366710

Most employers aren’t paying for your degrees!

Universities as we know them are obsolete.

CSDude · 2025-04-02T06:46:55 1743576415

I love the option for "Use a solid color background:" is Windows 95 background color. I love that color.

CSDude · 2025-03-18T16:30:39 1742315439

I’d commit 5000$ for a 20 inch macbook

CSDude · 2025-03-14T06:02:03 1741932123

For years, I just didn't get why replicated databases always stick with EBS and deal with its latency. Like, replication is already there, why not be brave and just go with local disks? At my previous orgs, where we ran Elasticsearch for temporary logs/metrics storage, I proposed we do exactly that since we didn't even have major reliability requirements. But I couldn't convince them back then, we ended up with even worse AWS Elasticsearch.

I get that local disks are finite, yeah, but I think the core/memory/disk ratio would be good enough for most use cases, no? There are plenty of local disk instances with different ratios as well, so I think a good balance could be found. You could even use local hard disk ones with 20TB+ disks for implementing hot/cold storage.

Big kudos to the PlanetScale team, they're like, finally doing what makes sense. I mean, even AWS themselves don't run Elasticsearch on local disks! Imagine running ClickHouse, Cassandra, all of that on local disks.

jiggawatts · 2025-03-14T07:57:24 1741939044

I looked into this with an idea of running SQL Server Availability Groups on the Azure Las_v3 series VMs, which have terabytes of local SSD.

The main issue was that after a stop-start event, the disks are wiped. SQL Server can’t automatically handle this, even if the rest of the cluster is fine and there are available replicas. It won’t auto repair the node that got reset. The scripting and testing required to work around this would be unsupportable in production for all but the bravest and most competent orgs.

hodgesrm · 2025-03-16T17:00:40 1742144440

There are a number of axes of performance that aren't covered in this [wonderful] article on storage performance. One of these is that EBS allows you to scale the VM up / down to change the amount of CPU & RAM available to process data on disk. We run several hundred ClickHouse clusters on this model. Rescaling to address performance issues is far more common than failures.

Example; you get a tenant performance issue on Sunday morning US time. The simplest fix is often rescale to a larger VM for the weekend, then get the A team working on the root cause first thing Monday. The incremental cost is minimal and avoids far more costly staff burnout.

CSDude · 2025-03-11T16:36:27 1741710987

For years, I just didn't get why replicated databases always stick with EBS and deal with its latency. Like, replication is already there, why not be brave and just go with local disks? At my previous orgs, where we ran Elasticsearch for temporary logs/metrics storage, I proposed we do exactly that since we didn't even have major reliability requirements. But I couldn't convince them back then, we ended up with even worse AWS Elasticsearch.

I get that local disks are finite, yeah, but I think the core/memory/disk ratio would be good enough for most use cases, no? There are plenty of local disk instances with different ratios as well, so I think a good balance could be found. You could even use local hard disk ones with 20TB+ disks for implementing hot/cold storage.

Big kudos to the PlanetScale team, they're like, finally doing what makes sense. I mean, even AWS themselves don't run Elasticsearch on local disks! Imagine running ClickHouse, Cassandra, all of that on local disks.

rcrowley · 2025-03-11T17:14:59 1741713299

Glad to find a kindred spirit. I proposed PlanetScale Metal (though our CEO Sam gets credit for the name) based on how we ran MySQL and Vitess at Slack for many, many years.

We earn our durability through MySQL replication and redundancy and reap the benefits in low, predictable latency.