
I’m in this position now. The longer I’ve been in it, the more I’ve come to realize it can be summarized as:

You experience some of the benefits of being a manager but bear all the responsibilities of managing others. It becomes challenging to make sound judgments when you must consider two different perspectives on a problem. Essentially, you’re taking on the duties of two jobs. I’ve found it incredibly difficult to step back and allow the team to make decisions without my input. My technical bias compels me to intervene when I perceive a decision as clearly incorrect. However, this approach hinders the team’s growth and may be perceived as micromanagement. While it’s a challenging position, it’s an excellent opportunity to explore management and determine whether it’s a long-term career path you’re interested in.


Obsidian's plug-in ecosystem is fantastic. I've used Obsidian for three years as a replacement for Notion, and I have never used the Graph mode. My Obsidian plugins enable automatic task synchronization with TickTick (where I manage my tasks) and allow me to set up features like templates. I strongly recommend giving it a spin.

The only downside for me is the inability to use it from a web browser. This isn't a major issue for my workflows.


What are you using to sync with TickTick?


You're missing the bigger picture. It isn't free to put content on the Internet. At a bare minimum, you have infrastructure and bandwidth costs. In many cases, someone publishes content on the internet hoping to attract people who will return for more of what they produce. Google acted as a broker, facilitating interactions between producers and consumers. Consumers would supply a query they wanted answered, and a producer would provide an answer or a space where answers could be found (in the recent era, replace "answer" with "product" or "storefront").

There was a mostly healthy interaction between the producers and consumers (I won't die on this hill; I understand the challenges of SEO and an advertisement-laden internet). With AI, Google is taking on the roles of both broker and provider. It aims to collect everyone's data and serve it as its own authoritative answer without any attribution to the source (or traffic back to the original source at all!).

In this new model, I am not incentivized to produce content on the internet; I am incentivized to simply sell my data to Google (or another centralized AI company), and that's it.

A clearer picture to help you understand what's going on: the internet of the past few decades was a bazaar marketplace. Every corner featured different shops with distinct artistic styles, showcasing a great deal of diversity. It was teeming with life. If you managed your storefront well, people would come back and you could grow. In this new era, we are moving to a centralized, top-down enterprise. Diversity of content and so many other important attributes (ethos, innovation, aesthetics) go out the window.


> You're missing the bigger picture. It isn't free to put content on the Internet. At a bare minimum, you have infrastructure and bandwidth costs.

While it technically isn't free, the cost is virtually zero for text and low-volume images these days. I run a few different websites for literally $0.

(Video and high-volume images are another story of course)


> A clearer picture to help you understand what's going on: the internet of the past few decades was a bazaar marketplace.

That internet died almost two decades ago. Not sure what you're talking about.


The web died. The internet is still a functional global IP network. For now.


For those who aren't in the video game development field and like technical content, I cannot recommend watching GDC's content enough (even better, attending!). Although this blog post isn't as detailed as some of their past talks, I appreciate that game developers have the opportunity to share wins like this externally.

If you enjoy technical content and video games, the following videos are hugely entertaining deep dives into ludicrously complicated bugs and the performance optimizations done to deliver gaming experiences that I myself take for granted.

- I Shot You First: Networking the Gameplay of Halo: Reach: https://www.youtube.com/watch?v=h47zZrqjgLc

- 8 Frames in 16ms: Rollback Networking in Mortal Kombat and Injustice 2: https://www.youtube.com/watch?v=7jb0FOcImdg

- It IS Rocket Science! The Physics of Rocket League Detailed: https://www.youtube.com/watch?v=ueEmiDM94IE

- https://gdcvault.com/play/1023220/Fighting-Latency-on-Call-o...


> When attached to an EBS–optimized instance, General Purpose SSD (gp2 and gp3) volumes are designed to deliver at least 90 percent of their provisioned IOPS performance 99 percent of the time in a given year. This means a volume is expected to experience under 90% of its provisioned performance 1% of the time. That’s 14 minutes of every day or 86 hours out of the year of potential impact. This rate of degradation far exceeds that of a single disk drive or SSD.

> This is not a secret, it's from the documentation. AWS doesn’t describe how failure is distributed for gp3 volumes, but in our experience it tends to last 1-10 minutes at a time. This is likely the time needed for a failover in a network or compute component. Let's assume the following: Each degradation event is random, meaning the level of reduced performance is somewhere between 1% and 89% of provisioned, and your application is designed to withstand losing 50% of its expected throughput before erroring. If each individual failure event lasts 10 minutes, every volume would experience about 43 events per month, with at least 21 of them causing downtime!

These are some seriously heavy-handed assumptions, and they completely disregard the data the company collects. First, the author assumes these failure events are distributed randomly and happen on a daily basis, ignoring that Amazon states its failure rate over a year ("99% of the time annually"). Second, they say that in practice failures last between 1 and 10 minutes, yet they then assume every failure lasts the full 10 minutes, completely ignoring the duration range they just introduced.
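
To spell out the arithmetic the post is doing, here is a back-of-the-envelope sketch under its own assumptions (every input below is the post's claim, not measured data):

    minutes_per_year = 365 * 24 * 60
    degraded = 0.01 * minutes_per_year        # the "1% of the time" budget
    print(degraded / 365, degraded / 60)      # ~14.4 min/day, ~87.6 hours/year

    # Assume every degradation event lasts the full 10 minutes:
    events_per_month = degraded / 12 / 10
    print(events_per_month)                   # ~43.8, their "about 43"

    # Assume delivered performance is uniform on [1%, 89%] of provisioned
    # and the application errors below 50%:
    p_outage = (50 - 1) / (89 - 1)
    print(events_per_month * p_outage)        # ~24, covering their "at least 21"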

Imagine your favorite pizza company claiming to deliver on time "99% of the time throughout a year." The author's logic is like saying, "The delivery driver knocks precisely 14 minutes late every day -- and each delay is 10 minutes exactly, no exceptions!". It completely ignores reality: sometimes your pizza is delivered a minute late, sometimes 10 minutes late, sometimes exactly on time for four months.

As a company sitting on useful real-world data, they should back up their claims with cold, hard numbers rather than arguments built on exaggeration. For transparency, my organization has seen 51 degraded EBS volume events in the past 3 years across ~10,000 EBS volumes. Of those events, 41 lasted less than one minute, nine lasted two minutes, and one lasted three minutes.


They are expanding on what the guarantee from AWS means; their statement is correct. They did not say the pizza place does this, they said the pizza place's guarantee allows for it. I don't see a problem.


Imagine you're studying for a test where you are given an image and need to answer with the correct class. To prepare, you're given a deck of flashcards with an image on the front and the class on the back.

(Random) You shuffle the deck every time you go through it. You're forced to learn the images and their classifications without relying on any specific sequence, as the data has no signal from sequence order.

(Fixed order) Every time you go through the deck, the images appear in the exact same order. Over time you may start to unconsciously memorize the sequence of flashcards, rather than the actual classification of each image.

When it comes to actually training a model, if batches are sampled sequentially from a dataset, the model risks learning correlations caused by the ordering of the data, resulting in poor generalization. In contrast, when you sample batches randomly, the model is encouraged to learn features from the data itself rather than from signals that are artifacts of the ordering.
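
Here's a minimal sketch of the two regimes (PyTorch with toy data; any framework's shuffling option expresses the same idea):

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Toy "flashcards": 8 one-pixel images with class labels 0-7.
    ds = TensorDataset(torch.arange(8).float().unsqueeze(1), torch.arange(8))

    shuffled = DataLoader(ds, batch_size=4, shuffle=True)   # reshuffled deck
    fixed = DataLoader(ds, batch_size=4, shuffle=False)     # fixed-order deck

    for epoch in range(2):
        # With shuffle=True the label order differs each epoch;
        # with shuffle=False it repeats exactly.
        print([y.tolist() for _, y in shuffled])
        print([y.tolist() for _, y in fixed])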


Why, then, are so many successful models trained with multiple passes through sequential data? Note, I'm not naive in this field, but random reads make my life as an infra person very difficult, and if they're not necessary, I'd rather not deal with them by switching to an approach that can't use readahead and other strategies for high-throughput IO.


Why did they take time to invent an entire FS?


The engineers at DeepSeek seem extremely smart and well-funded, so my guess is they looked at the alternatives and concluded none of them would allow them to make their models work well enough.


They did this back in their trading firm days, and...

Imagine that you have a sequence of numbers. You want to randomly select a window of, say, 1024 consecutive numbers as input to your model. Now, say you have n items in this sequence and you want to sample n/c windows in total (c is a constant, << 1024). How do you do a fixed shuffle?

The key is that the windows we want to read overlap. If we brute-force the fixed shuffle and materialize the expanded windows, we need to store 1024/c times the original data.
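
A tiny sketch of the trade-off (NumPy, with made-up sizes): sampling window offsets on the fly keeps storage at the original n but costs random reads, while materializing the shuffled windows multiplies storage by 1024/c.

    import numpy as np

    n, window, c = 10_000_000, 1024, 16      # made-up sizes for illustration
    num_windows = n // c

    rng = np.random.default_rng(0)           # fixed seed = reproducible shuffle
    starts = rng.integers(0, n - window, size=num_windows)

    # Reading each window at its offset: random reads, no extra storage.
    # Materializing instead stores num_windows * window numbers:
    print(num_windows * window / n)          # = window / c = 64x blowup here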

This isn't useful for LLMs, but hey, wonder how it started?


I guess I'm much more of the "materialize the shuffle asynchronously from the training loop" kind of person. I agree the materialization storage cost is very high, but that's normally been a cost I've been willing to accept.

As an ML infra guy I have had to debug a lot of failing jobs over the years, and randomizing datapipes are one of the hardest to debug. Sometimes there will be a "record-of-death" that randomly gets shuffled into a batch, but only causes problems when it is (extremely rarely) coupled with a few other records.

I guess I'll just have to update my priors and accept that inline synchronous randomization with random reads is a useful-enough access pattern in HPC that it should be optimized for. Certainly a lot more work and complexity, hence my question of just how necessary it is.


Yeah, I don't want to do this either. This is a super special case; after exploring alternatives with our researchers, it's unfortunately needed. As for the record-of-death, we made sure to serialize all RNG state so our data pipeline is perfectly reproducible even when starting from a checkpoint.
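
As a rough illustration of that kind of RNG-state checkpointing (a NumPy sketch, not our actual pipeline):

    import pickle
    import numpy as np

    rng = np.random.default_rng(1234)
    _ = rng.permutation(10)                  # pipeline consumes some randomness

    # Persist the generator state alongside the model checkpoint...
    with open("ckpt_rng.pkl", "wb") as f:
        pickle.dump(rng.bit_generator.state, f)

    # ...and restore it on resume, so the data order replays exactly.
    resumed = np.random.default_rng()
    with open("ckpt_rng.pkl", "rb") as f:
        resumed.bit_generator.state = pickle.load(f)
    assert (resumed.permutation(10) == rng.permutation(10)).all()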

Building a system for serving read-only data at NVMe SSD speed (as in IOPS) took surprisingly little effort, and is mostly enough for training data. Kudos to DeepSeek, who decided to spend the extra effort to build a full PFS and share it.


> “We once again need to raise more capital than we’d imagined. Investors want to back us but, at this scale of capital, need conventional equity and less structural bespokeness.”

Translation: They’re ditching the complex “capped-profit” approach so they can raise billions more and still talk about “benefiting humanity.” The nonprofit side remains as PR cover, but the real play is becoming a for-profit PBC that investors recognize. Essentially: “We started out philanthropic, but to fund monstrous GPU clusters and beat rivals, we need standard venture cash. Don’t worry, we’ll keep trumpeting our do-gooder angle so nobody panics about our profit motives.”

Literally a wolf in sheep’s clothing. Sam, you can’t serve two masters.


The point you are missing is that many of us do not have big tech careers. I am very fortunate to have one, but before I was hit by a stroke of luck, I was doing gig work paycheck to paycheck, barely making ends meet. When you can’t see more than two weeks ahead, which you cannot do living paycheck to paycheck, you don’t think about long-term consequences because you can’t afford to. For those who have nothing to do all day but hunt for exploits, the incentive to sell zero-days to any external party is too strong.


I think GP is suggesting an insider could introduce a bug, have a confederate "find" it, and split the money. At $5m I think more than a few big tech employees might decide to write themselves a new minivan.


I think you'd have a tough time deliberately putting something like that in at a large company. The cost of failure is losing a very good job.

If you discovered a vulnerability and sat on it for a future payout, that would be more likely, yet still risky.

Though it does come down to choosing to do crimes in the face of incentives and disincentives. Nothing unique here - humans break the rules all the time.


It's trivial for a motivated engineer to deliberately introduce bugs; most couldn't avoid introducing them if they tried. It wouldn't be too hard to pass one off as an honest mistake either. You might not even lose your job, as a lot of places have a "blameless culture".


You have a fair point in that a native experience will more than likely keep privacy in mind and have more resources to produce a potentially better experience. The concern is that these are unethical practices. Now, Apple doesn’t hold a monopoly over the smartphone market, but they do hold a monopoly on the actual app purchase market, so the monopoly question is debatable. Apple created the game (the marketplace), owns the game, and has private information on how the game is being played. They then use that information to increase the value of their own offerings. They have a statistically unfair advantage over any offering on their marketplace and can use it to build up their own products. I’m not a lawyer and can’t speak to the legality, but this is something that would be classified as at least suspicious, though it’d be hard to build a case on, if that’s possible at all.

Ethically, their practice of appearing to target successful offerings and undermining them is debatable. There’s no right answer here; it’s a matter of opinion. But this repeated behavior signals to anyone looking to build a commercial success on their marketplace: don’t be too successful. You have to remember that a multi-trillion-dollar international corporation is suppressing the success of small businesses (internationally, a <$1B business is considered small) whose employees believe in their mission. And you can see I keep referring to “their marketplace,” which they legitimately own, and I understand they may not be doing anything “illegal” as defined by current law. But when you take a multi-decade view of this practice, I think we’re going to start to see new laws drafted and regulations put in place. I’m not sure how you even begin to fight a “virtual monopoly” that has little physical monopoly dominance.

I will caveat that I do thoroughly enjoy Apple products and the seamless way Apple’s native applications integrate with each other. I do appreciate their craftsmanship and attention to detail. It just leaves a bad taste in my mouth to see a similar headline each of the last few years, where a closed environment controlled by one entity swallows more and more of the world’s resources and prevents others from growing.


I have taken a thorough look at the source and I very much enjoyed it.

The project highlights the utility of Bazel more than anything I have seen before. How long did it take you (and others) to become this fluent in Bazel and in working with it?

Additionally, you (Will) seem to have a very intuitive grasp of build processes in general; notably, many of the scripts are fun to dive into and have a very clean design. How did you familiarize yourself with those sorts of concepts? Do you have any readings or side-projects you’d recommend tackling to gain experience?


Thanks! I've used Google's internal version of Bazel, but it wasn't until this project that I had to spend a lot of effort getting Bazel working in a new project, which is honestly a lot of work and not very straightforward :(

What helped me the most was looking at other projects using Bazel with similar tech stacks and then assembling the pieces, e.g.:

- https://github.com/angular/components

- https://github.com/tensorflow/tensorboard

Alex Eagle of https://www.aspect.build/ has a lot of great resources on using Bazel - both on YouTube and the aspect website.

