
If information is important enough to bother someone for a copy of it, but not important enough to spend an hour ingesting, I'm not sure what to tell you.

The thing is: when working with the material afterwards, the important parts are the small details. The talk/recording is good for the high-level overview and for following the big picture, but for details it is annoying, as one has to jump around hunting for specific words and phrases. Something written, or an image/diagram, is a lot better for studying in depth.

And there lies the trouble with slides: during a talk they should support what is being said, but they are often abused as the handout for afterwards as well.


It sounds like you want detailed documentation. That’s fine, but that’s not what a talk is. A good talk isn’t a reference. And good documentation isn’t an engaging talk.

If people want that, produce two artifacts. Don't try to shoehorn a talk into being documentation. That's just a recipe for bad work.


It depends on what the talk is about. Of course Steve Jobs' oft-cited iPhone introduction didn't have any details for in-depth research later on; it was a high-level product introduction.

A technical talk, however, explains a concept or a tool and thus contains technical information to follow up on, but for that I need the exact words and phrases so I even know what to look for in the manual. And I probably want to follow it in the order it was presented (I hope they thought about that order!), whereas the manual is organized more as a reference.

So yeah, if you give a high-level marketing talk it doesn't matter, but then I also won't spend the time watching it a second time. If it has technical depth, then being able to follow that depth is good.


I have dealt with this issue before as well. If folks need something more in depth, I will use an LLM plus some massaging of my own to create a supporting document. Here is an example of a very disorganized conversation and the supporting document I made from it: https://www.danielvanzant.com/p/what-does-the-structure-of-l... It has clear definitions of the key terms, timestamps for the important moments, and links to external resources to learn more about any of the topics.

Slides should just have relative links to supporting content online that is accessible on the same website/domain and can be downloaded as a single zip.

It is not that complicated really, no need to reinvent the wheel.


I've been in this situation. I'll spend the hour watching, but I'll dislike the inefficiency. I consider it impolite.

Not a complete mitigation, but VLC et al. play back at 1.5x+. Highly recommended.

Lots of things fall into this category. Speech has very low information density per unit of time.

Thankfully, speech recognition and AI summarization are a thing now.


This type of phrasing is strange to me. I guess it depends on what you consider to be, and not to be, “information”.

Reading a bullet point summary of Moby Dick certainly would compress the time required to understand the plot.

Isn’t the prose or phrasing part of the transmission?


For most talks, I would say no. If I were going to a lecture by Pynchon (ha!), I would want to listen at 1x. For the 99% of conference talks that are mostly just a way of communicating technical data, a text transcription that is then reduced in word count by 50% is probably only a very small loss (if that), and a 90%+ time savings.

This gives me an idea for a website. All of the talks of a conference, audio transcribed and LLM summarized into 3-minute reads.

It might be worth doing the whole INFOCON archive…



Wait. I'm unclear what your point is.

Is it that asking for a copy is such an unreasonable burden that it should instead require a significant time investment from me?

I've sent many copies of many things I've made in my life. It's not so bad. And it's easily shared with many people at once.

Or is it that people can't ingest any meaningful information in less than an hour?

That's clearly not true either. A five minute article can contain extremely valuable insights. A 30 second conversation even more so.


The slides of a good presentation are worthless without the presentation itself. If the deck is valuable in and of itself, it could have just been an email or a Word doc in the first place.

Well, it's not the reality of most slides I've seen. Most of them seem to be a pretty good summary of the talk. Weirdly, some of them contain more information than the talk.

I do believe most presentations I've seen could've been an email or an article. So I guess I agree with you?


> I do believe most presentations I've seen could've been an email or an article. So I guess I agree with you?

Yeah, I really should have said that in my original post. Most presentations could have been a one-pager, and for any presentation worth sitting through, the slides aren't worth having.


My company records all presentations: it’s like sharing the slides, but better, since we just have the entire presentation again.

Always recording is a good practice I think. It's so cheap with video conferencing that you might as well. Even if nobody uses it later, it didn't cost much. And if you get that one presentation that provides stellar value it's a gift that keeps on giving.

I don't really agree that a recording is always better than the slides. Slides are a text medium, and as such can be searched. You can also go through them much, much faster than through a recording (even if you can listen at 2x). If you're just looking for something specific, slides can be much better.

And sometimes you need to get the whole experience. And then the recording is much better.


Hosting and operating the autoscaling of the various services (compute, pageserver, safekeeper, storage broker) that it takes to make all that work is complex enough that most folks would rather not. Same as any other "managed X" service.


This is the exact same attitude as people who threw tantrums about seatbelt laws in the 90s. It was wrong then, and it's wrong now. For mostly the same reasons.


Not using a variable is the same as flying out your car window?

It's more like the cashier smacks a peanut out of your hand saying you'll get fat.


Even when compiling in Debug mode! Where the analogy is closer to a friend who's applying for a cashier job asking you to role-play being a customer so they can practice, and then spitting in your face when you go "Hey, Joe! I'll take these" because "No real-life customer would do that! Start over."


> Even when compiling in Debug mode! closer to a friend asking you to role play being a customer so they could practice, and then spitting in your face saying "No customer would do that! Start over."

Thanks for the hearty laugh :D


If we're going to go with the seatbelt comparison then we'd have to flesh it out a bit:

Your new car sounds an alarm non-stop if any seat belt in the car is unfastened, whether or not a seat is occupied and whether or not the car is even on. This seems kind of silly, so you raise it with the manufacturer and they say that standard procedure is to just leave all seat belts fastened all the time.

You start doing that and quickly realize that it's much harder to remember to put your seat belt on than it was in your old car, because the car doesn't warn you when you've forgotten your seat belt and when you do remember on your own you have to sit up awkwardly to unfasten it before you can get under it and fasten it around you. You point this out to some fans of the manufacturer and they act like you're the weird one for not seeing how this makes you safer.


Seatbelt laws are still wrong, government has no business protecting me from myself.

But even from a utilitarian perspective, compilers do have warnings, and they could have just used that.
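To make the "just warn" option concrete, here is roughly how Rust handles it: unused locals are a lint that warns by default, can be silenced per binding with an underscore prefix, and can be promoted to a hard error by projects that want the stricter behavior. A minimal sketch (the function name is just a placeholder):

    // Warns by default ("unused variable: `scratch`") but still compiles,
    // in both debug and release builds. Projects wanting the hard error
    // can opt in with #![deny(unused_variables)] at the crate root.
    fn expensive_setup() -> u32 {
        42
    }

    fn main() {
        let scratch = expensive_setup(); // lint fires here

        // Prefixing with an underscore silences the lint for this binding.
        let _ignored = expensive_setup();

        println!("done");
    }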


> Seatbelt laws are still wrong, government has no business protecting me from myself.

If there are multiple people in a car and some choose to wear seatbelts and some choose not to, those who are not wearing seatbelts become a danger to everyone else as their bodies become in-vehicle projectiles.

Sure, I can understand the debate when it's just a single person in a car. But when a person's decision starts impacting others the debate is going to be very hard to win.


Even if it’s just you, you’d be leaving your mangled corpse on a public road for other people to deal with, which is a nuisance.

Like take the car crash out of the equation and imagine some cars came with an ejector switch that launches you through the windshield at 70 mph. This would not be allowed.


It also protects innocent bystanders from being forced to see your horrifyingly mangled body tossed on the ground in front of them in what could otherwise have been a crash with no injuries, minimized injuries, or at least contained injuries. Do you still think that law is an overstep, and if so, why? Genuine question; I have no horse in this race and am on the fence myself.


Because any restriction of freedom is bad in principle, and acceptance of such restrictions tends to create overreaching, totalitarian states/mafias. There are valid arguments for restricting an individual's freedom to harm another, but making sure no one can see your dead body right after you happen to crash your car definitely isn't one. It is very much infeasible and an absolute helicopter-mom type of concern.


Keeping with the analogy, yes, you should always wear your seatbelt on public roads (release), but that doesn't mean I feel like I need to buckle up just to move my car while staying in my own driveway (debug).


That's reasonable. I think major restrictions that cause you to need to refactor your code when going from debug to release are a footgun and a half, but that'd at least be defensible.


So proud of all the HN denizens that none of us asked lvass about their stance on mandatory baby seats in cars. Or those bars they use to pin you down in your car on roller coasters.


I would pay some real money for the Rust equivalent of this kind of material. Anyone got a book or repo they like and would recommend?


Most hardware-level observations, like the latency of various memory accesses or numeric operations, would be the same for the Rust code. As for higher-level abstractions, I've already started porting them to Rust <https://github.com/ashvardanian/less_slow.rs>.

Next, it would be exciting to implement a concurrent job-stealing graph algorithm in both languages to get a feel for their ergonomics in non-trivial problems. I can imagine it looks very different in Rust and C++, but before I get there, I'm looking for best practices for implementing nested associative containers with shared stateful allocators in Rust.

In C++, I've implemented them like this: <https://github.com/ashvardanian/less_slow.cpp/blob/8f32d65cc...>, even though I haven't seen many people doing that in public codebases. Any good examples for Rust?
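For reference, the shared-arena flavor of this on stable Rust usually goes through a crate like bumpalo, where every nested container borrows the same Bump and the whole structure is freed when the arena drops. A minimal sketch, assuming bumpalo with its "collections" feature enabled, and leaving the associative layer as the open question:

    // Cargo.toml (assumed): bumpalo = { version = "3", features = ["collections"] }
    use bumpalo::collections::Vec as BumpVec;
    use bumpalo::Bump;

    fn main() {
        // One arena shared by the outer container and every inner one.
        let arena = Bump::new();

        let mut adjacency = BumpVec::new_in(&arena);
        for node in 0..4u32 {
            let mut edges = BumpVec::new_in(&arena);
            edges.push((node + 1) % 4);
            edges.push((node + 2) % 4);
            adjacency.push(edges);
        }

        for (node, edges) in adjacency.iter().enumerate() {
            println!("node {} -> {} edges", node, edges.len());
        }
    }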


I’m definitely interested in seeing this kind of content in Rust, have you looked at Rayon’s implementation for work stealing yet? Can result in some very nice high-level code.
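For a taste of the ergonomics: rayon::join pushes one half of the work onto the current thread's deque, where idle workers can steal it, and the parallel-iterator API sits on the same pool. A minimal sketch (a recursive sum, not a graph algorithm yet):

    use rayon::prelude::*;

    // Each call to `join` exposes one half of the work for stealing.
    fn sum(slice: &[u64]) -> u64 {
        if slice.len() < 4096 {
            return slice.iter().sum();
        }
        let (left, right) = slice.split_at(slice.len() / 2);
        let (a, b) = rayon::join(|| sum(left), || sum(right));
        a + b
    }

    fn main() {
        let data: Vec<u64> = (0..1_000_000).collect();
        // Same result via the higher-level parallel iterator API.
        let total: u64 = data.par_iter().sum();
        assert_eq!(total, sum(&data));
        println!("{total}");
    }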


This is completely wrong. Kids can easily learn to read at age 5. A child who is working on "basic reading proficiency" at 8 is very behind and has not been well-served by the people responsible for raising them.


Yes, but still I would argue that books are the more appropriate reading material :)


Yeah, 100% with you on that.


Termdebug works great with gdb, and I get my usual editor features as well as the full functionality of gdb. Seems fine to me.

Before I switched from emacs I had an equivalently good setup with dap-mode.
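For anyone who hasn't tried it: termdebug ships with Vim 8.1+, so the setup is basically two commands (the binary path below is just an example):

    " load the bundled plugin, then start a session
    :packadd termdebug
    :Termdebug ./target/debug/myapp

    " drive gdb from its window, or with editor commands
    " like :Break, :Step, :Over, :Continue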


I think this mischaracterizes the state of the space. Iceberg is the winner of this competition, as of a few months ago. All major vendors who didn't directly invent one of the others now support Iceberg or have announced plans to do so.

Building lakehouse products on any table format but Iceberg, starting now, seems to me like it must be a mistake.


Yeah working in the data space I see a ton of customers using Iceberg and some using Delta Lake if they're already a Databricks shop. Virtually no Hudi.


I'd vote for you. God damn I wish this was the world we lived in.


Tacking on my own article about git bisect run. It really is an amazing little tool.

https://andrewrepp.com/git_bisect_run
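The whole loop is three commands; the known-good tag and the test command below are just placeholders:

    # HEAD is broken, v1.4.0 was the last known-good release
    git bisect start HEAD v1.4.0

    # exit 0 marks a commit good, 1-127 (except 125) bad, 125 skips it
    git bisect run cargo test --quiet

    git bisect reset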


This is mostly correct, but it's worth mentioning that Cloudberry substantially predates Greenplum going closed source. It just got quite a boost from that change happening. Different dev team too; afaik none of the original Greenplum team was involved with Cloudberry until very recently.

Also, Greenplum 7 tracks Postgres 14. Which is still old at this point, but not as bad as 12...

I also don't think I'd call the architecture ancient. Just very tightly coupled to postgres' own (as a fork of postgres that tries to ingest new versions from upstream every year or two) and paying the overhead of that choice in the modern landscape.

Source: former member of the Greenplum Kernel team.


Thanks for the context. In what way would you say Cloudberry lags behind Greenplum technology-wise? I see newer Greenplum versions have a lot of planner improvements.

Greenplum 7 is listed as tracking Postgres 12 in the release announcement [1], and the release notes for later 7.x versions don't mention anything. Is there a newer release with higher compatibility?

When I say ancient, I mean that it's a "classical" shared-nothing design where the database is partitioned and hosted as parallel, self-contained replica servers, where each node runs as a shard that could, in theory, be queried independently of the master database. This is in contrast to newer architectures where data is sharded at the heap level (e.g. Yugabyte, CockroachDB) and/or compute is separated from data (e.g. Aurora, ClickHouse, Neon, TiDB).

[1] https://greenplum.org/partition-in-greenplum-7-whats-new/


Cloudberry, last I checked, took their snapshot of all the Greenplum utilities way before the repos got archived and development went private. The backup/restore, DR, upgrade, and other such tools seem to leave a lot on the table. I haven't checked in a bit; it's possible they've picked back up some of that progress.

You're completely right, I had the wrong PG version in my memory. Embarrassing, thanks for catching that.


All the Greenplum utilities you mentioned here are also open-sourced and available for Cloudberry, but some of them are not in the main repo of Apache Cloudberry (this is more a matter of adhering to the Apache Software Foundation's regulations than a technical limitation).

Here is the unofficial roadmap of Cloudberry:

1. Continuously upgrading the PostgreSQL core version, maintaining compatibility with Greenplum Database, and strengthening the product's stability.

2. End-to-end performance optimization to support near real-time analytics, including streaming ingestion, vectorized batch processing, JIT compilation, incremental materialized views, the PAX storage format, etc.

3. Supporting lakehouse applications by fully integrating open data lake table formats represented by Apache Iceberg, Hudi, and Delta Lake.

4. Gradually transforming Cloudberry Database into a data foundation supporting AI/ML applications, based on Directory Table, pgvector, and PostgresML.


Delighted to see Greenplum mentioned in this article, and equally pleased to see Apache Cloudberry mentioned in the comments. Greenplum has been open source for nearly a decade, forming a fairly mature global open-source ecosystem, with many core developers distributed around the world (they were not necessarily hired by Pivotal/VMware/Broadcom). Greenplum was forked as Cloudberry not to outdo Greenplum Database, but to foster a more neutral and open community around an MPP database with a substantial global following. To that end, the project was donated to the Apache Software Foundation following Greenplum's decision to close its source. Since the project is in its early stages within the Apache incubator, our immediate goal is to build a solid foundation that adheres to Apache standards. Instead of introducing extensive new features, we are concentrating on developing a stable and compatible open-source alternative to Greenplum.

