I know this is HN and we love these simplistic universal laws here (Hanlon's razor being another beloved law) but Occam’s razor isn’t really applicable here, or at the very least, it isn’t very helpful, and it certainly isn’t conclusive. That he was exfiltrated by the Chinese government also requires the same number of variables as that he was disappeared by the American government. So Occam’s Razor actually suggests nothing here.
Occam’s razor is a rule of thumb which suggests that if several hypotheses explain the same thing, the simplest one should be favored. When we have a missing person with several unknown variables, until we see some evidence, any guess is as good as any other. And no hypothesis should be ruled out.
On HN we also like Bayesian analysis, so instead of looking for the simplest explanation, we should be asking what we already know. What is the probability that this person was disappeared by the government, given that 3 other students have been disappeared by the government lately? Given that these 3 other students were also people of color? Etc.
I didn't either at first, I went and looked it up after enough people started using it to nullify/discredit what I had to say in serious conversation.
Murphy's Law Book 2, January 24, 1980
A book of jokes, meant to be humorous.
No previous literature related to it before then.
I can't help but laugh when people use it now, so I guess it's still funny... in a way.
It says a lot about the character of a person who takes a joke and uses it in a way that it was never intended to be used. Useful knowledge to keep in your back pocket if needed.
> When we have a missing person with several unknown variables, until we see some evidence, any guess is as good as any other.
Really? So that he was abducted by aliens is also on the table and has the same probability?
Obviously there can be some prior information that makes „he was exfiltrated by Chinese intelligence services“ more likely than „he was disappeared by US intelligence services“.
I’m not saying there is, but philosophically, you don’t need to know anything more than „A person with relations to China disappeared“ to be able to determine one hypothesis as the most likely one.
> Obviously there can be some prior information that makes „he was exfiltrated by Chinese intelligence services“ more likely than „he was disappeared by US intelligence services"
The obvious one being that if the FBI were arresting a spy, they would not be executing search warrants on his properties weeks after he went missing. They do that stuff immediately.
Instead, if we're speculating, this sounds far more like the FBI is playing catch-up to circumstances that became suspicious after a few weeks.
I answered the exfiltration likelihood in a sibling thread. The summary is that spies are very rare and it is not a likely explanation. Keeping the Bayesian hat on we can pretty much rule out aliens. We have never witnessed an alien abduction, so the chance of this being an alien abduction is actually infinitely smaller than that of exfiltration.
We can set the same prior probability to all the possibilities, that is fine. But given the evidence the posterior for alien abduction is very close to 0, if not just simply 0.
Given the recent pattern of behavior from the US government, we have seen quite a few people disappeared. The posterior for US government involvement is IMO much higher than exfiltration. That said, no government involvement is higher still (and is actually the hypothesis favored by Occam’s Razor).
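To make the prior/posterior argument above concrete, here is a minimal sketch of the update in Python. Every likelihood value is a made-up placeholder picked only to reproduce the ordering argued in this comment; none of it is an estimate of the real case, and the hypothesis names and the `posterior` helper are invented for illustration.

```python
# Illustrative Bayes update. Every number below is a hypothetical placeholder,
# not an estimate of anything about the real case.

def posterior(priors, likelihoods):
    """Return P(hypothesis | evidence) for each hypothesis via Bayes' rule."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}

hypotheses = ["no government involvement", "US government", "exfiltration", "aliens"]

# Same prior probability for every hypothesis.
priors = {h: 1 / len(hypotheses) for h in hypotheses}

# Hypothetical P(evidence | hypothesis). Aliens get 0 because no alien
# abduction has ever been observed.
likelihoods = {
    "no government involvement": 0.6,
    "US government": 0.3,
    "exfiltration": 0.1,
    "aliens": 0.0,
}

result = posterior(priors, likelihoods)
for h in hypotheses:
    print(f"{h}: {result[h]:.2f}")
```

With equal priors the posterior is just the normalized likelihoods, so the whole disagreement in this thread comes down to which likelihoods you find plausible.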
Wrapping your conspiracy theories into pseudo-science based on wild guesses doesn't make them more likely.
There is a long series of scientists in the US that were secretly working for the Chinese government. It is simply the most likely explanation that this is another instance of that.
It also doesn't make any sense that the US government would secretly "disappear", as you put it, this individual while they are openly deporting thousands elsewhere, often based on flimsy evidence. Furthermore, if you are secretly getting rid of someone, you do it to avoid taking responsibility. Why would you send the FBI to raid their residence afterwards? That immediately connects you to the disappearance, thereby sabotaging your own scheme.
I did say the likeliest explanation is no government involvement, but that USA involvement is still likelier than China’s involvement given recent pattern of conduct.
Chinese exfiltration is also a conspiracy theory. In fact Chinese exfiltration requires a larger conspiracy than USA disappearance. If you want to apply Occam’s Razor (which you shouldn’t), you should actually favor disappearance over exfiltration (but really you should favor no government involvement).
> There is a long series of scientists in the US that were secretly working for the Chinese government.
If that is so, can you provide me with a list? That list would actually have to be pretty long to make exfiltration the likeliest hypothesis. Like there would have to be more than a couple of dozen cases this century.
EDIT: I did some quick googling to find any sources which back your claim. This NY Post article is the strongest one (https://nypost.com/2025/02/20/us-news/us-science-labs-face-g...); it is basically a propaganda piece citing far right US politicians who claim extraordinary numbers (like 8000 scientists) without any evidence. As we know, extraordinary claims require extraordinary evidence.
I feel that the word conspiracy is not helping to bring clarity to the discussion. Covert operations are by definition conspiracies but that's not what people generally mean when they use the word conspiracy so let's just avoid using a word that might muddy the discourse unnecessarily.
Both the Chinese and USA governments engaged (or have been accused of engaging) in unlawful kidnapping of their own (and foreign) nationals from foreign countries over the last decades.
So I'm not sure how one can dismiss the possibility that this is indeed one of such cases.
Searching the DOJ homepage with the search term “spy” (to see which they have showcased) gives us 209 results, including a bunch of news around the 2009 conviction. The most recent news I found was from 2022 (https://www.justice.gov/archives/opa/pr/five-individuals-cha...), a very vague item which is actually more about harassing private citizens (presumably piggybacking on the anti-Asian hate following Covid). The most recent news about actual spying against the US government was from 2020 (https://www.justice.gov/archives/opa/pr/former-cia-officer-a...), about Alexander Yuk Ching Ma. He was actually found guilty in May 2024 (so he is missing from that Wikipedia list). The DOJ article about the conviction doesn’t use the word spy, so it is missing from my search, but here it is: https://www.justice.gov/archives/opa/pr/former-cia-officer-s....
Given the above, I think it is safe to bet that if he was arrested for being a spy, the DOJ would indeed want to showcase it. However, spies are very rare, and the likelihood of OP’s case being a spy is very low. It is much more likely that he disappeared for other reasons.
A guy was arrested for being a PRC spy, stealing sensitive documents from GE Vernova, 2-3 years ago. I remember it because he was using steganography to embed documents in landscape photos sent via email.
They also use various means to charge people. Lying to a federal agent is a serious crime. Almost any misuse of federal funds can be a serious crime.
Wouldn't "murder victim" be the Occam's razor explanation? We know his wife is still living in the house. We also know that statistically speaking your spouse is the person most likely to murder you. I'm not going to say anything more than that because the obvious next statement would be one that is based entirely on speculation, but it takes a lot less speculation than all these ridiculous James Bond fanfics I'm seeing.
> We also know that statistically speaking your spouse is the person most likely to murder you.
This is slightly misleading. While it is true that the single person with the highest probability of murdering you is your partner (at least, if you're a woman), it is more likely that the murderer is a stranger to the victim than the victim's spouse.
For example, in the FBI's 2011 dataset [1], of the 7076 murders by someone with a known relationship to the victim, 1295 were by a husband, wife, boyfriend, or girlfriend. This is compared to, of course, the 2700 murders committed by acquaintances and 1481 by strangers. From this, we can see that ~18% of these murders were committed by the victim's partner and ~60% by someone quite a bit further flung.
The reason why both statistics are simultaneously true is that most people have only a few partners (focusing the entire risk on a small group) whereas there are many thousands of people in the latter categories (creating a very diffuse risk).
So, if you were to bet on which person murdered someone, you would do well to guess one of their partners. If you had to guess if the victim knew their murderer, you should bet that they did not.
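For anyone who wants to check the arithmetic, a small Python sketch follows. The counts are the ones quoted above from the FBI's 2011 dataset; the group sizes (`partners`, `others`) are hypothetical round numbers chosen only to show how a diffuse risk dilutes per person, not figures from any dataset.

```python
# Counts quoted above from the FBI's 2011 dataset.
known_relationship = 7076
by_partner = 1295
by_acquaintance = 2700
by_stranger = 1481

partner_share = by_partner / known_relationship
further_flung_share = (by_acquaintance + by_stranger) / known_relationship

print(f"partner: {partner_share:.1%}")
print(f"acquaintance or stranger: {further_flung_share:.1%}")

# Hypothetical group sizes (assumptions for illustration only): a handful of
# partners versus thousands of acquaintances and strangers.
partners = 2
others = 10_000

per_partner = partner_share / partners
per_other = further_flung_share / others
print(f"risk per individual partner: {per_partner:.4f}")
print(f"risk per individual other person: {per_other:.6f}")
```

The group as a whole carries more of the risk, but once it is split across everyone in the group, each individual partner ends up far more likely than any individual acquaintance or stranger.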
Yeah, good example of the maximum entropy principle, [0] in which "the selected distribution is the one that makes the least claim to being informed beyond the stated prior data, that is to say the one that admits the most ignorance beyond the stated prior data", a.k.a. the most entropy. Occam's Razor must still be constrained to known facts, and you cannot construct a "simplest answer" out of data you simply do not have.
There's plenty of other public information. The FBI was searching his house for undisclosed reasons, so most definitely they have some kind of investigation open on him. And his degrees were issued by universities in Nanjing and Shanghai, China, strongly suggesting he is a Chinese National, perhaps naturalized to the United States.
That was known at the time of my post.
We also now know that at the time of his escape, Indiana University had terminated his professorship, and it has continued to disavow him and his wife.
I did not misuse Occam's razor, my knowledge of public facts simply exceeded yours.
Acts in public are not protected. This, however, is:
"The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no warrants shall issue, but upon probable cause, supported by oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized."
Posting on a publicly accessible website is the legal equivalent of standing on the street corner having a conversation. A police officer doesn't need to search anything because it's out in the open.
A conversation is ephemeral unless someone specifically goes out of their way to record it. That is not true by default for just about anything involving a computer and we all know it. Where's the line between obtuse and lying?
No. This is not a common practice. An Accounting / ERP system with full audit trails and data entry compliance is. Exporting data from such a system into Excel for analysis is. But Excel as the [re-reads] Primary data file is not a practice at all.
I've worked on dozens of Excel to ERP migrations. Excel is way more common as the primary data source, including for financial data. If you only look at financial data by volume (and not count) maybe ERP wins, but I wouldn't be sure.
> and was used for “consolidation, journals, business-critical reporting, and analysis.”
Not sure what 'journals' is supposed to mean here, but “consolidation, business-critical reporting, and analysis.” does not mean day-to-day operational use; it means high-level management reporting.
My position is that the article (or the consultants who are trying to sell the ERP) is exaggerating the role of the spreadsheet.
The phrase “consolidation, journals, business-critical reporting, and analysis.” is in the actual report referring to multiple spreadsheets (p60). But the Register has reported that as referring to the main spreadsheet. There's no way a top-level consolidated spreadsheet would also be journal entries for low level accounting.
And I note the report is by Deloitte, who I trust about as far as I can throw them. Although the organisation in question undoubtedly messed up here, Deloitte are angling for a nice juicy government ERP implementation lasting years and costing tens of millions.
PSA: every package you install from any package manager from browser extensions to npm/composer etc presents the risk of malware. Because the open source community lacks the financial resources to vet every single version of every package. Demanding this level of security from software provided at no cost that relies on open contributions is wholly unreasonable. If you need that, buy an IDE from a company financially capable of ensuring security and accept the limitations of their offering.
Mitigations like running in a VM might protect your dev workstation. But not code you put into production that relies on third parties.
> Demanding this level of security from software provided at no cost that relies on open contributions is wholly unreasonable
VS Code isn't some kind of hobby project by a couple of dudes on laptops with nothing but the best interests of the community at heart. It's a flagship IDE produced by one of the most valuable tech companies in the world, released for free as a loss leader in service to very specific corporate goals.
When a tech behemoth releases a free IDE as a loss leader and it drives out all of the scrappy open source projects one by one, I think it's reasonable to hold that tech behemoth to tech behemoth standards rather than scrappy open source project standards.
The marketplace isn't operated on a paid contract for vetted extensions. You vet the extensions you use. Most don't, and it's ok. Don't shift the blame and the cost onto Microsoft though, they don't have to offer it.
Yet Mozilla, for all the flak it gets, isn't paid a dime by its users, but does find resources to vet the most popular extensions. Everything I use is checked by them.
Raymond Hill (of uBlock fame) wasn't really impressed with how it is performed, but it's still much better than nothing (which is what MS apparently does).
VSCode is an IDE in name only, it's a glorified text editor, and a pretty mediocre one at that. The I in "IDE" stands for "integrated", like what you'd expect from JetBrains' products. Or even the real Visual Studio.
Paid JetBrains user here. What JetBrains gives you is self-implemented add-ons in their marketplace. These are perceived to have the same level of trust as the base product. Then there is the similar level of “might be malware” or “steal your infoware” from (possibly adapted) open source and third parties available on their marketplace.
As an example: Rider (https://www.jetbrains.com/rider/) - an IDE - comes with everything you could possibly need to build and compile .NET apps out of the box, while VSCode - a code editor - relies on extensions (and thus mostly the community surrounding VSCode) for this.
Or to make things more succinct:
* VSCode is an extensible code editor (like vim, neovim, Zed and Sublime)
* JetBrains Rider is a fully equipped Integrated Development Environment (like Microsoft Visual Studio or its direct sibling JetBrains IntelliJ IDEA)
And while extensions are optional within an IDE (and often solely used for increased productivity), more often than not they are a necessity in a code editor to even become productive.
I'm a big JetBrains fan, but this distinction is just silly. If you look at the way that JetBrains IDEs are packaged, the differences between IDEs all come down to extensions—which are enabled by default, which are available to install at all. IntelliJ Ultimate can be made to have all the features of PyCharm with the right extension combo. And occasionally they break out a new IDE by taking an extension and making it no longer available for installation elsewhere (like RustRover). The entire architecture is one of plugins.
"Integrated" isn't meant to contrast with a plugin-based system (otherwise JetBrains wouldn't count!), it's meant to contrast with a dev environment built out of a bunch of individual tools and terminal commands run separately.
Good point. In the old days, if someone had Eclipse but installed plugins for a language other than Java, we wouldn't suddenly downgrade Eclipse to a text editor.
> Because the open source community lacks the financial resources to vet every single version of every package.
I made the point elsewhere, but this seems to fail in the face of Debian and Red Hat and Canonical who have been publishing mostly-secure distros of exclusively open source software for decades now.
There's a reason why MS and NPM get caught by this sort of shenanigans, but it's not "open source".
Because the attack surface is smaller and more difficult to extract value out of. I think it’s been shown time and time again the more motivated your attacker the more difficult it is to defend and very visible popular platforms see more attacks. NPM and MS represent drastically larger platforms.
Uh... no. There is far (far) more code[1] shipped in the package repository of any Linux distro than in all the world's vscode extensions. Are you being serious? NPM arguably gets a little closer, but only a little.
No, the reason Linux is safe and modern distributors aren't is the "packaging" step. Debian volunteers package software that they understand to be high quality via existing community consensus. You can't just show up to Fedora and say "ship my junkware app", you need to convince the existing community that your stuff doesn't suck.
And that's worked extremely well for decades now, going all the way back to 2BSD being shipped above V7 Unix. The reason MS and NPM et al. abandoned it isn't just pure experience[2]. They don't want to wait for their repos to fill with good software, they want all the software in it now so that they don't get beaten by whoever their competitors are.
And this is the inevitable result. If you allow anyone to distribute software to your users then you allow everyone to distribute software to your users. And everyone includes a lot of bad people.
[1] With vastly more capability! The distro ships everything from firmware blobs and kernel drivers up through browser glitz and desktop customization. Talk about "attack surface"!
Remember, when we're triggered our reading comprehension goes down and we confuse emotion for facts. Did I say they ship more/less code? No, first I was talking about the user base size and the economic incentives for malicious users.
For the most popular package:
Debian: ~253K installs per month [1]
NPM: ~236M installs per month [2]
VSCode: ~158M installs total [3]
Obviously VSCode is hard to compare, but the most popular Debian package would need 52 years to achieve the total VSCode numbers so I'm sure it's safe to say VSCode beats Debian significantly on installs and NPM wins even more convincingly.
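The 52-year figure is easy to reproduce from the install numbers quoted above. A quick Python check, using only those figures (the variable names are mine):

```python
# Install figures quoted above.
debian_installs_per_month = 253_000
npm_installs_per_month = 236_000_000
vscode_installs_total = 158_000_000

# Months the most popular Debian package would need to match VSCode's total.
months_to_match_vscode = vscode_installs_total / debian_installs_per_month
years_to_match_vscode = months_to_match_vscode / 12

# How many times more monthly installs the top NPM package gets.
npm_vs_debian = npm_installs_per_month / debian_installs_per_month

print(f"Debian needs about {years_to_match_vscode:.0f} years to match VSCode's total")
print(f"NPM's top package gets about {npm_vs_debian:.0f}x Debian's monthly installs")
```

Note this compares a monthly rate against a lifetime total, which is why the VSCode comparison is only a rough one.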
Ok, but let's take a look at how much code is shipping which was your metric:
Debian: 242k submissions per month for amd64 [4]
NPM: ~50k new non-spam packages per month, ~800k new version submissions per month [5]
VSCode: No data available
I don't know how VSCode compares, but clearly NPM beats Debian, which makes sense because of how open it is and, more importantly, how many orders of magnitude more JS developers there are than Linux developers and how much more frequently they update their packages because the overhead is lower for creating a submission.
It's really easy to forget that the number of JS developers or people using IDEs is much larger than the number of Linux users. So NPM still beats Debian on this front. As for the security assumption and how good a job maintainers are doing, I'm not so sure on that either. The xz utils backdoor into SSH was found by a Microsoft employee (i.e. the community) not by Debian maintainers. It's not hard to imagine that the lack of notable security issues (particularly attempts recorded) actually indicates very little review, not that there's a higher bar because the maintainers are more talented or have better incentives for "reasons" - there's a reason Chrome was perceived as having better security than IE (it did - architecture was better) and STILL they see regular successful attacks bypassing all the mitigations.
Again, to reiterate in case the above got you triggered again - NPM & VSCode have significantly more users than Debian and that creates economic incentives for attackers. The capabilities of a vulnerability matter less unless you're a state actor because capabilities do not track economic results as strongly. This has so much evidence it shouldn't even need this kind of explanation. Remember when people said that Mac had better security? Well, it turns out Apple is dealing with all the same vulnerability and spam issues on a closed down system now that their popularity has gone up; again, economic incentives.
The "triggered" bit is just flaming. Please stop that.
But I'm not following how you get from popularity numbers to "attack surface". The latter is a term of art that reflects the amount of complexity on the "outside" of a software system that can be interacted with by an attacker. It correlates well with "amount of code". I don't see that it has any relation at all to number of installs.
I originally used attack surface imprecisely in terms of how many people you compromise with a single vulnerability. In other words the economic value of the attack. But also in the formal term of art, it's still true that NPM has a larger attack surface with many more weak points than something like Debian has. VSCode is trickier since it's a single application, so may not be from that perspective. However, it is basically running Chrome so it is still quite a large attack surface area.
But sure, let's use "amount of code" as a proxy. Debian has ~123GiB of source code [1] across ~65k packages [2] while NPM has 74 GiB [3] if I'm reading it correctly (other sources say 128 GiB) across 3.3 M packages [4]. Given that JS requires less code than C for equivalent functionality (due to a richer runtime & no memory management), any way you slice it, NPM is a much larger attack surface both in terms of number of opportunities and how valuable the attack is.
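One thing the totals above hide is how differently the code is spread out. A rough Python calculation of the average source size per package, using only the figures quoted in this comment (the 74 GiB NPM figure is the one the comment itself flags as uncertain):

```python
KIB_PER_GIB = 1024 * 1024

# Figures quoted above.
debian_source_gib = 123
debian_packages = 65_000
npm_source_gib = 74
npm_packages = 3_300_000

debian_avg_kib = debian_source_gib * KIB_PER_GIB / debian_packages
npm_avg_kib = npm_source_gib * KIB_PER_GIB / npm_packages

print(f"Debian: about {debian_avg_kib:.0f} KiB of source per package")
print(f"NPM: about {npm_avg_kib:.1f} KiB of source per package")
```

So by these numbers NPM is a very large number of very small packages, which fits the "many more weak points" framing better than a raw byte count does.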
You do realize this is Microsoft we're talking about here? Not merely a couple dudes in their bedroom doing this in their spare time? I guarantee you that a non-zero percentage of the code in VSCode was paid for.
Then they can pay those developers to sandbox vscode extensions at the very least. I like using vscode sometimes but I'm sure as shit not going to use it if my work bans installing extensions due to security risks.
> You do realize this is Microsoft we're talking about here?
Fiscal responsibility: required
> Not merely a couple dudes in their bedroom doing this in their spare time?
Fiscal responsibility: optional
I would also point out, the malware-infested extension we are talking about presents more as the “two guys in a bedroom” model (though possibly a state-sponsored actor).
I guess that's why Matthew Broderick's character had a script which dialed random numbers in a target area code (I think he used Sunnyvale, CA in the movie).
I wonder if anyone did that back in the day. Not sure how much the telco would have appreciated it ...
Never used an auto-dialer myself, but it would be trivial to code one. Just send ATDT<number> out the serial port and see if "CONNECT" comes back before timing out.
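A rough sketch of that loop in Python, for the curious. It assumes a Hayes-compatible modem and a port object whose `readline()` returns empty bytes on timeout (the way a serial port opened with a read timeout typically behaves). `probe_number`, `ModemPort` and `FakeModem` are names I made up, and the fake modem only exists so the sketch runs without hardware.

```python
from typing import Protocol


class ModemPort(Protocol):
    """Anything that looks like a serial port opened with a read timeout."""

    def write(self, data: bytes) -> int: ...
    def readline(self) -> bytes: ...


# Standard Hayes result codes that mean the call did not reach a modem.
FAILURE_CODES = {"NO CARRIER", "BUSY", "NO DIALTONE", "NO ANSWER", "ERROR"}


def probe_number(port: ModemPort, number: str, max_lines: int = 10) -> bool:
    """Dial one number and report whether the modem answered with CONNECT."""
    port.write(f"ATDT{number}\r".encode("ascii"))
    for _ in range(max_lines):
        line = port.readline()
        if not line:  # empty read: assumed to mean the port timed out
            return False
        text = line.decode("ascii", errors="replace").strip()
        if text.startswith("CONNECT"):
            return True
        if text in FAILURE_CODES:
            return False
        # Anything else (e.g. the modem echoing the command) is skipped.
    return False


class FakeModem:
    """Hypothetical stand-in for real hardware."""

    def __init__(self, answering):
        self.answering = set(answering)
        self.pending = []

    def write(self, data: bytes) -> int:
        number = data.decode("ascii").strip()[len("ATDT"):]
        reply = b"CONNECT 1200\r\n" if number in self.answering else b"NO CARRIER\r\n"
        self.pending = [reply]
        return len(data)

    def readline(self) -> bytes:
        return self.pending.pop(0) if self.pending else b""


modem = FakeModem(answering={"5550142"})
hits = [n for n in ("5550141", "5550142", "5550143") if probe_number(modem, n)]
print(hits)
```

A real version would also need to hang up after each hit and pace the calls, which this leaves out.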
Back in that time, I think a good rate was $0.01/minute for a local call on a consumer landline. Unlimited calling plans came later. Not attributing any intent to the telco, just saying, there would be no cost issue to motivate an investigation.
It definitely wasn't local - he was in Washington but dialed into Sunnyvale, CA.
I can't remember charges for local exchanges (same area code), but I only remember as far back as the late 80s. It was something like 10 cents a minute.
I remember all the ads about "friends and family" special rates/etc. Metering on voice calls persisted into the 2000s.
But the calls were very brief (if they did pick up) unless he got a "hit". So thousands of calls could have no charge.
That word has a meaning, and a model that isn’t politically aligned isn’t it.
That said, I’ve been using DeepSeek distilled to Qwen, which should yield an incredibly censored model if they had been censoring the models, but instead yields a pretty balanced model that is more than willing to talk about Tiananmen and Xi Jinping’s human rights failings.