MontyCarloHall's comments | Hacker News

The SPXT [0]/XMAG [1] ETFs are exactly what you're looking for.

[0] https://www.proshares.com/our-etfs/strategic/spxt (S&P minus tech stocks)

[1] https://www.defianceetfs.com/xmag/ (S&P minus "Magnificent 7")


Am I misreading, or does SPXT still hold >20% in GOOG, TSLA, META, and AMZN?

"Information technology" apparently just means Microsoft and Apple.


It's because none of those companies is classified as "Information Technology" under the official GICS criteria [0] used to categorize companies in the index. For instance, Meta and Google are in the "Communication Services" sector; Amazon is in "Consumer Discretionary." There are 69 companies in the S&P 500 in the "Information Technology" GICS sector [1], and all are excluded from SPXT.

[0] https://en.wikipedia.org/wiki/Global_Industry_Classification...

[1] https://en.wikipedia.org/wiki/List_of_S%26P_500_companies


Welp, time to see if my 401k provider supports them.

In a similar vein, e^pi - pi = 19.9990999792, as referenced in this XKCD: https://xkcd.com/217/


Also, (-1)^(-i) - pi = 19.999... ;)
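
(Same number in disguise: taking the principal branch, (-1)^(-i) = (e^(i*pi))^(-i) = e^(-i^2*pi) = e^pi, so subtracting pi gives exactly the quantity above.)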


Not really in a similar vein, because there's actually a good reason for this to be very close to an integer whereas there is no such reason for e^pi - pi.


No known reason :-)


Assuming those 20PB are hot/warm storage, S3 costs roughly $0.015/GB/month (50:50 average of S3 standard/infrequent access). That comes out to roughly $3.6M/year, before taking into account egress/retrieval costs. Does it really cost that much to maintain your own 20PB storage cluster?

If those 20PB are deep archive, the S3 Glacier bill comes out to around $235k/year, which also seems ludicrous: it does not cost six figures a year to maintain your own tape archive. That's the equivalent of a full-time sysadmin (~$150k/year) plus $100k in hardware amortization/overhead.
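
As a sanity check on that arithmetic, a quick sketch (the per-GB rates are assumptions based on published list prices, which vary by region and tier; Glacier Deep Archive's ~$0.00099/GB/month list price lands in the same ballpark as the figure above):

  $ python3 -c '
  gb = 20e6  # 20 PB in GB (decimal)
  print("S3 hot/warm:  $%.1fM/yr" % (gb * 0.015 * 12 / 1e6))
  print("Deep Archive: $%.0fk/yr" % (gb * 0.00099 * 12 / 1e3))
  '
  S3 hot/warm:  $3.6M/yr
  Deep Archive: $238k/yr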

The real advantage of S3 here is flexibility and ease-of-use. It's trivial to migrate objects between storage classes, and trivial to get efficient access to any S3 object anywhere in the world. Avoiding the headache of rolling this functionality yourself could well be worth $3.6M/year, but if this flexibility is not necessary, I doubt S3 is cheaper in any sense of the word.


Like most of AWS, it depends on whether you need what it provides. A 20PB tape system will have an initial cost in the low-to-mid six figures for the hardware and initial set of tapes. Do the copies need to be replicated geographically? What about completely offline copies? Reminds me of conversations with archivists: there's preservation, and then there's real preservation.


How the heck does anyone have that much data? I once built myself a compressed plaintext library from one of those data-hoarder sources that had almost every fiction book in existence, and that was like 4TB compressed (but it would've been much less if I'd bothered hunting for duplicates and dropped the non-English titles).

I suspect the only way you could have 20PB is if you have metrics you don't aggregate or keep ancient logs (why do you need to know that your auth service had a transient timeout a year ago?).


Lots of things can get to that much data, especially in aggregate. Off the top of my head: video/image hosting, scientific applications (genomics, high energy physics, the latter of which can generate PBs of data in a single experiment), finance (granular historic market/order data), etc.


In addition to what others have mentioned, before the "AI bubble", there was a "data science bubble" where every little signal about your users/everything had to be saved so that it could be analyzed later.


> Does it really cost that much to maintain your own 20PB storage cluster?

If you think S3 = storage cluster, then the answer is no.

If you think of S3 as what it actually is: scalable, high throughput, low latency, reliable, durable, low operational overhead, high uptime, encrypted, distributed, replicated storage with multiple tier1 uplinks to the internet, then the answer is yes.


>scalable, high throughput, low latency, reliable, durable, low operational overhead, high uptime, encrypted, distributed, replicated storage with multiple tier1 uplinks to the internet

If you need to tick all of those boxes for every single byte of 20PB worth of data, you are working on something very cool and unique. That's awesome.

That said, most entities who have 20PB of data only need to tick a couple of those boxes, usually encryption/reliability. Most of their 20PB will get accessed at most once a year, from a predictable location (i.e. on-prem), with a good portion never accessed at all. Or if it is regularly accessed (with concomitant low latency/high throughput requirements), it almost certainly doesn't need to be globally distributed with tier1 access. For these entities, a storage cluster and/or tape system is good enough. The problem is that they naïvely default to using S3, mistakenly thinking it will be cheaper than what they could build themselves for the capabilities they actually need.


Very cool. tl;dw: an inverted triple pendulum has 2^3 = 8 equilibria, since each arm of the pendulum can be either up or down (naturally, all but one of the equilibria are unstable), and this controller is able to make all 8*7 = 56 transitions between them.
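
A quick sanity check on those counts (pure combinatorics, nothing pendulum-specific; each equilibrium is just a configuration of the three arms):

  $ python3 -c '
  from itertools import product, permutations
  eq = list(product(("up", "down"), repeat=3))  # each of the 3 arms is up or down
  print(len(eq))                                # 8 equilibria
  print(len(list(permutations(eq, 2))))         # 56 ordered transitions between distinct equilibria
  '
  8
  56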

Control theory is one of those things that shouldn't possibly work, yet here we are.


>thousands of middle-class "bullshit jobs" are disappearing, but rather than being replaced by a wave of productive jobs [...] we're just seeing unemployment, underemployment.

Jobs are neither fungible nor mutually exclusive; there is no reason to assume that someone working in a bullshit job would thrive in a non-bullshit job that contributes to society in more productive ways, nor does the existence of bullshit jobs prevent people from working non-bullshit jobs. I hate to say it, but perhaps many people are employed in bullshit jobs because they are not capable of anything more challenging.


> I hate to say it, but perhaps many people are employed in bullshit jobs because they are not capable of anything more challenging.

Because bullshit jobs pay more. Your average engineer working on ad targeting or at a hedge fund makes a lot more than one working in, say, medicine.


"Bullshit job" has a specific meaning that's less about being in a pointless field-of-work (like adtech or many parts of fintech) and more about occupying a pointless role, regardless of the field. David Graeber (the originator of the term) gave the following examples [0]:

— Flunkies, who serve to make their superiors feel important, e.g., receptionists, administrative assistants, door attendants, store greeters

— Goons, who act to harm or deceive others on behalf of their employer, or to prevent other goons from doing so, e.g., lobbyists, corporate lawyers, telemarketers, public relations specialists

— Duct tapers, who temporarily fix problems that could be fixed permanently, e.g., programmers repairing shoddy code, airline desk staff who calm passengers with lost luggage

— Box tickers, who create the appearance that something useful is being done when it is not, e.g., survey administrators, in-house magazine journalists, corporate compliance officers, academic administration

— Taskmasters, who create extra work for those who do not need it, e.g., middle management, leadership professionals

[0] https://en.wikipedia.org/wiki/Bullshit_Jobs


My point stands. It's an incentive game. People work in BS fields because they pay more; people work BS jobs because, again, they pay well. There is no incentive to work somewhere else.


We live in a very complex system, beyond any one person's comprehension. Some people think devolved decision-making, allocating resources to things like advertising better, is the most efficient way to allocate resources. The invisible hand. How much is bullshit, and how much is just beyond your awareness? If you were king and allocated all the work yourself, would it be better? For whom? I'm doubtful about bullshit jobs.


I suspect you don’t actually hate to say it


Twitter is a concrete demonstration of this. There were so many prognostications [0] that Twitter would imminently implode after downsizing from ~8k to ~1.5k employees following Musk's takeover, and when these claims never came to pass, it was a wake-up call to the rest of the industry [1].

[0] https://news.ycombinator.com/item?id=34617964

[1] https://www.livemint.com/companies/news/elon-musk-fired-80-p...


Pretending the current iteration of Twitter is anything remotely comparable to what existed before is pretty ridiculous. Other than Grok, which is by far the worst of all the flavors of models out there (and, very technically, made by one of Musk's other companies), there haven't been any new features in years; even the terrible UI/UX has barely changed at all. Between the particular "slant" the site takes and the swarms of boosted bots out there, it became practically unusable for me in a very short period of time. I honestly don't understand people who still use it or what they could possibly get out of it. If there were any honest reporting on DAU/MAU, I'd bet a large part of my paycheck that it's way down from pre-Musk levels.


Grok is a separate product created by xAI. It is integrated into Twitter, yes, but they hired a ton of engineers at xAI to make that happen.


Those are due to deliberate policy changes from Musk to boost engagement of his right-wing sycophants, not due to any technical failings. From a strictly technological point-of-view, Twitter works just as well as it did pre-takeover, and certainly did not catastrophically collapse as many predicted.


I would categorize what happened to the site, and its being rendered unusable for anyone even halfway serious, as catastrophic. But perhaps my bar for the "smartest man in the world" is a little higher than "I can still get a 200 response from the site" (which, in terms of outages, has actually also gotten worse).


I agree that the site is barely usable, but that's entirely due to a shift in Twitter's userbase caused by top-down policy changes (e.g. boosting right-wing spam), not any engineering shortcomings.

If Musk had never purchased Twitter and Jack Dorsey performed the same reduction in engineering staff, I doubt the site would be materially different from how it was pre-Musk.


That's because software is immortal. It will continue to run even if you do nothing. What happens, though, is that stuff around it moves.

Of course twitter still works. Even with 0 engineers, it would still work. That's never been the goal of a software company. I can compile Mario 64 right here, right now, decades later. Should Nintendo just go home? Call it quits? Of course not.


It’s rhetoric like this that has created the market we have today.

The perceived success is not the same as actual success. Remember it is a private company and you don’t actually have any idea how bad the balance sheets were after the layoffs. Before the financial engineering that Musk did by using his other companies to invest in Twitter to preserve its valuation, the company was down almost 80%. [1] If public companies go down that route, they’ll very quickly find out what the actual impact of that model is.

[1] https://techcrunch.com/2024/09/29/fidelity-has-cut-xs-value-...


Twitter's failures are solely due to Musk's changes in corporate governance (e.g. boosting fringe right-wing content causing its existing userbase and advertisers to flee the platform), not due to any engineering problems caused by reducing headcount. Strictly from an engineering standpoint, Twitter works just as well as it did before Musk took it over.

As I wrote in another post, if Musk had never purchased Twitter and Jack Dorsey performed the same reduction in engineering staff, I doubt the site would be materially different from how it was pre-Musk.


> Twitter works just as well as it did before Musk took over

Just because it works on your phone doesn’t mean there are no engineering problems behind the scene. You’re just not aware of the problems that exist because it’s a private company and you’re not privy to the information.


> Twitter works just as well as it did before Musk took it over.

Not true. The main reason I stopped clicking Twitter links in the first place was the abysmal chance of a tweet actually loading rather than just displaying a generic "Failed to load. Try again?" after the takeover. I mean, it occasionally happened before as well, but it became the default behavior.

It lasted long enough that by the time they'd finally fixed it (over a year later), the platform had deteriorated into a right-wing cesspool anyway.


To be fair, a lot of the people fired were moderators, and it shows today.


>I use nucgen to generate a random 100M line FASTQ file and pipe it into different tools to compare their throughput with hyperfine.

This is a strange benchmark [0] -- here is what this random FASTQ looks like:

  $ nucgen -n 100000000 -l 20 | head -n8
  >seq.0
  TGGGGTAAATTGACAGTTGG
  >seq.1
  CTTCTGCTTATCGCCATGGC
  >seq.2
  AGCCATCGATTATATAGACA
  >seq.3
  ATACCCTAGGAGCTTGCGCA
There are going to be very few [*] repeated strings in this 100M line file, since each >seq.X header will be unique and there are roughly a trillion random 4-letter (ACGT) strings of length 20. So this is really assessing how well a hashtable deals with reallocating after being overloaded.

I did not have enough RAM to run a 100M line benchmark, but the following simple `awk` command performed ~15x faster on a 10M line benchmark (using the same hyperfine setup) versus the naïve `sort | uniq -c`, which isn't bad for something that comes standard with every *nix system.

  awk '{ x[$0]++ } END { for(y in x) { print y, x[y] }}' <file> | sort -k2,2nr
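
(The awk builds an in-memory hash table mapping each line to its count in a single pass; the trailing sort then orders the output by the count column, descending.)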
[0] https://github.com/noamteyssier/hist-rs/blob/main/justfile

[*] Birthday problem math says about 1,100, for 50M strings sampled from a pool of ~1.1T.
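
That estimate is the usual n^2/(2N) birthday approximation; a minimal sketch, assuming 50M sequences (two lines per record, as in the output above):

  $ python3 -c '
  n = 50e6     # number of length-20 sequences in a 100M-line file
  N = 4 ** 20  # ~1.1T possible length-20 ACGT strings
  print(round(n * n / (2 * N)))  # expected number of colliding pairs
  '
  1137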


The awk script is still probably the fastest way to do this, and it's even faster if you use gawk or something similar rather than the default awk. Most people also don't need the ordering, so you can get away with only the awk part and drop the sort.


Totally agree it's a bit of a weird benchmark - it was just the first thing I thought of to generate a huge number of lines to test throughput.

There are definitely other benchmarks we could try to test other characteristics as well.

I've actually just added the `awk` implementation you provided to the benchmarks as well.

Cheers!


>Ye Olde ARPAnet Kludge

Seems fitting that NetBSD's internal mailing lists still use ossified address syntax from a time before DNS.


You are figuratively killing me with your literalism.


Literally, anytime someone uses the term decimate, I assume they don't actually know what it means unless they explicitly state that they do.


These days some would've written, "you are literally killing me . . ." (which I find deplorable).


Using "literally" figuratively or, more precisely, as a hyperbolic intensifier [0], is a tradition employed by notable English writers who lived and died long before you were born.

[0] https://en.wikipedia.org/wiki/Literally


And someone would have replied "I could care less" (incorrectly implying they do care, even if it's a little bit) :(


I don't mind that one, because it doesn't risk destroying the usefulness of a previously unambiguous word.


It’s a bitter monkey's paw irony: when you ask FOSS advocates how developers would be paid in a fully FOSS world, where piracy cannot exist because all software is free, the answer is often “service contracts.”

The monkey's paw curls. Now we live in a world where software is nothing but service contracts and is more closed than ever.


There is still open source software, and it is still as free as ever.


And it routinely gets its license changed as the authors discover they actually need to make money in a capitalist world.


FOSS is a bad way to make money, but it is a great way for independently wealthy developers to get clout.


It's the Westphalian system, which includes not only (Protestant) capitalism but also scientific positivism, liberal humanism, and everything else, which we now call (post-/meta-)modern.

There's nothing we can do about all that, and for practical reasons we just accept the world as-is and tend to forget/ignore the reasons it is so. But for retaining cognitive sovereignty, I think it's good to remember that.


> Now we live in a world where software is nothing but service contracts and more closed than ever.

Indeed, and that software ends up optimized for service contract billing potential over usability.

