Hacker News | isignal's comments

The EB5 visa, by comparison, has much clearer requirements. It is not clear whether this is intended as a backdoor to EB5 or a replacement.


There already was such a system with more concrete requirements. It is called the EB5 visa and has a path to a green card. What does this new method bring to the table?


It lacked Trump's face and branding.


Processes can die independently, so when a process dies while modifying a concurrent shared-memory data structure under a lock, the resulting state can be difficult to manage. Postgres, which uses shared-memory data structures, sometimes needs to kill all its backend processes because it cannot fully recover from such a state.

In contrast, no one thinks about what happens if a thread dies independently because the failure mode is joint.


> In contrast, no one thinks about what happens if a thread dies independently because the failure mode is joint.

In Rust, if a thread holding a mutex dies, the mutex becomes poisoned, and trying to acquire it leads to an error that has to be handled. As a consequence, every Rust developer who touches a mutex has to think about that failure mode, even if in 95% of cases the best answer is "let's exit when that happens".

The operating system tends to treat your whole process as one and shut down everything or nothing. But a thread can still crash on its own due to unhandled OOM, assertion failures, or any number of other issues.
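
The poisoning behavior described above can be reproduced with only the standard library. This is a minimal sketch, assuming the default panic=unwind configuration; the function name `lock_after_panic` is made up for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns a thread that panics while holding the lock, then reports what the
/// next `lock()` call observes. Returns (was_poisoned, value_seen).
/// Hypothetical helper, for illustration only.
fn lock_after_panic() -> (bool, i32) {
    let shared = Arc::new(Mutex::new(0));

    let worker = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let mut guard = shared.lock().unwrap();
            *guard = 1; // stands in for a half-finished update
            panic!("worker died while holding the lock");
        })
    };

    // join() returns Err because the worker panicked; the process keeps running.
    assert!(worker.join().is_err());

    // The next lock attempt returns Err(PoisonError). The caller has to decide
    // whether to give up or to take the possibly inconsistent data anyway.
    let outcome = match shared.lock() {
        Ok(guard) => (false, *guard),
        Err(poisoned) => (true, *poisoned.into_inner()),
    };
    outcome
}
```

Calling `lock_after_panic()` from `main` prints the worker's panic message to stderr and then returns normally, which is the "handle it or exit" decision point mentioned above.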


> But a thread can still crash on its own due to unhandled OOM, assertion failures, or any number of other issues

That's not really true on POSIX. Unless you're doing nutty things with clone(), or you actually have explicit code that calls pthread_exit() or gettid()/pthread_kill(), the whole process is always going to die at the same time.

POSIX signal dispositions are process-wide: the only way e.g. SIGSEGV kills a single thread is if you write an explicit handler which actually does that by hand. Unhandled exceptions usually raise SIGABRT, which works the same way.

** Just to expand a bit: there is a subtlety in that, while dispositions are process-wide, one individual thread does indeed take the signal. If the signal is handled, only that thread sees -EINTR from a blocking syscall; but if the signal is not handled, the default disposition affects all threads in the process simultaneously no matter which thread is actually signalled.


It would be nice if someday we got per-thread signal handlers to complement per-thread signal masking and per-thread alternate signal stacks.


You can sort of get that behavior on Linux using clone(..., ~CLONE_THREAD|~CLONE_SIGHAND|CLONE_VM, ...) (shorthand here for CLONE_VM set, with CLONE_THREAD and CLONE_SIGHAND left unset), which creates otherwise distinct processes that share an address space.

You can do all sorts of weird things like create threads which don't share file descriptors, threads which chdir() independently... except that CLONE_THREAD|~CLONE_SIGHAND and CLONE_SIGHAND|~CLONE_VM are disallowed.


I think this is conflating two different things. A Rust Mutex gets poisoned if the thread holding it panics, but that's not the same thing as evaporating into thin air. Destructors run while a panic unwinds (indeed this is how the Mutex poisons itself), and you usually have the option of catching panics if you want. In the panic=abort configuration, where you can't catch a panic, it takes down the whole process rather than just one thread, which is another way of making the same point here: you can't usually kill a thread independently of the whole process it's in, because lots of things (like locks) assume you'll never do that.
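
To make the unwinding point concrete, here is a small sketch using only the standard library. It assumes the default panic=unwind setting (under panic=abort the process would simply terminate), and the names `Cleanup`, `CLEANED_UP`, and `panic_and_recover` are hypothetical.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Set to true by `Cleanup::drop`, so the caller can observe that the
/// destructor ran. Hypothetical name, for illustration only.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

struct Cleanup;

impl Drop for Cleanup {
    fn drop(&mut self) {
        // Runs during normal scope exit and also while a panic unwinds.
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

/// Panics while a `Cleanup` value is live, catches the panic, and returns
/// (panic_was_caught, destructor_ran).
fn panic_and_recover() -> (bool, bool) {
    let result = panic::catch_unwind(|| {
        let _guard = Cleanup;
        panic!("simulated failure");
    });
    (result.is_err(), CLEANED_UP.load(Ordering::SeqCst))
}
```

The panicking code does not vanish: its destructor runs before `catch_unwind` returns, which is the same mechanism a `MutexGuard` relies on to mark its mutex as poisoned.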


This is a solvable problem, though: the literature is overflowing with lock-free implementations of common data structures. The real question is how much performance you have to sacrifice for the guarantee...
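
As a rough illustration of the lock-free style (a single atomic cell, not a full data structure), the sketch below uses a compare-and-swap retry loop. It uses threads as a stand-in for processes sharing memory, and the function names are made up for the example. The standard library already offers `AtomicU64::fetch_max`; the loop is written out only to show the pattern.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Raises `cell` to at least `candidate` without taking a lock.
/// A writer that stops partway through leaves the cell holding either the
/// old or the new value, never a held lock.
fn store_max(cell: &AtomicU64, candidate: u64) {
    let mut current = cell.load(Ordering::Relaxed);
    while candidate > current {
        match cell.compare_exchange_weak(
            current,
            candidate,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            Ok(_) => break,
            // Another writer got there first (or the weak CAS failed
            // spuriously); retry against the value actually observed.
            Err(observed) => current = observed,
        }
    }
}

/// Hypothetical driver: one thread per input, all updating the same cell.
fn concurrent_max(inputs: Vec<u64>) -> u64 {
    let cell = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|value| {
            let cell = Arc::clone(&cell);
            thread::spawn(move || store_max(&cell, value))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    cell.load(Ordering::Acquire)
}
```

The retry loop is where the performance cost mentioned above shows up: under contention, writers may repeat work instead of waiting once on a lock.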


Aren't the alternatives you mentioned, Iceberg and DuckDB, both storage solutions, while Spark is a way to express distributed compute? I'm a bit out of touch with this space. Is there a newer way to express distributed compute?


DuckDB is primarily a query engine. It does have a storage format, but one of its strengths is querying data where it already resides (e.g. a parquet file sitting in S3).

There are some examples[0] of enabling DuckDB to manage distributed workloads, but these are pretty experimental.

0 - https://www.definite.app/blog/smallpond


Thanks for the pointers!


I think what many people are finding out is they don’t really need distributed processing. DuckDB on a single node can get you really far, and it’s much simpler.


DuckDB is not only a storage solution. It can directly query a variety of file formats at rest, without having to re-store anything. That's one of its selling points: you can query across archival/log data stored in S3 (or wherever) without needing to "ingest" anything or double-pay to duplicate the data you've already stored.


I’m just getting into DuckDB lately and finding this feature so exciting. It’s a totally new paradigm. Such a great tool for scientists, and probably many other people. I wish I took it seriously sooner.


Not a new way like Ray, but a new way to express Spark super-efficiently (GPU-acceleration): https://news.ycombinator.com/item?id=43964505


Flink. It has more momentum than Spark right now.


"momentum" is a tricky word. Zig has more momentum than C++, but will it ever overtake the language? I'd bet not.


Well, it's not a tricky word, it's just wrong. Velocity, maybe. Or more probably acceleration.


Flink is designed around streaming first, while Spark is built around batch first, and you're likely best off selecting accordingly. That said, any streaming application likely needs batch processing to some degree. It comes down to latency vs throughput.


https://www.patentlyapple.com/2024/04/apple-now-makes-14-of-...

You are incorrect.

> Apple Inc has assembled $14 billion worth of iPhones in India in fiscal 2024, Bloomberg News reported on Wednesday.


Postgres supports indexing an arbitrary json document. https://www.postgresql.org/docs/current/datatype-json.html

Not sure if the query capabilities and syntax match azure docdb but the basic functionality should be workable.


Of course it does, but it's limited.

GIN does not support range searches (needed for <, <=, >, >=), prefix or wildcard, etc. It also doesn't support index-only scans, last I checked. You cannot efficiently ORDER BY a nested GIN value.

I recommend reading the paper.


The transaction log maintained from time 0 would be equivalent but too expensive to store compared to the tables.


If you relax your constraint to "retain logs for the past N days", you can fold the logs from T=0 to T=(today - N) into tables and still benefit from having snapshots from that cutoff onwards.
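
A minimal sketch of that idea, with an invented log format (`Entry`, keyed set/delete records) and the assumption that the log is sorted by day: entries older than the cutoff are folded into a snapshot table, and only the tail is retained.

```rust
use std::collections::BTreeMap;

/// One hypothetical log record: on `day`, set `key` to `value`
/// (`None` means delete the key).
#[derive(Clone, Debug)]
struct Entry {
    day: u32,
    key: String,
    value: Option<i64>,
}

type Table = BTreeMap<String, i64>;

/// Applies log entries, in order, on top of an existing table state.
fn replay(mut table: Table, log: &[Entry]) -> Table {
    for entry in log {
        match entry.value {
            Some(v) => {
                table.insert(entry.key.clone(), v);
            }
            None => {
                table.remove(&entry.key);
            }
        }
    }
    table
}

/// Folds every entry older than `cutoff_day` into a snapshot table and
/// returns the snapshot together with the retained tail of the log.
/// Assumes `log` is sorted by `day`.
fn compact(log: &[Entry], cutoff_day: u32) -> (Table, Vec<Entry>) {
    let split = log.partition_point(|e| e.day < cutoff_day);
    let snapshot = replay(Table::new(), &log[..split]);
    (snapshot, log[split..].to_vec())
}
```

Replaying the retained tail on top of the snapshot yields the same table as replaying the full log from T=0, while any state from the cutoff onwards can still be reconstructed from the tail.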


On the contrary, I’ve known plenty of sites that keep their logs.

Often written to tape, for obvious reasons.


Resale values are lower in the US mostly because they factor in the 7.5k USD tax credit and the state tax credit. There is plenty of demand for used Teslas, for example.


Similar in other countries but sometimes not as direct.

Various regulations set targets, which give manufacturers incentives to hit sales targets. This leads to discounts or great lease deals just before certain dates if targets aren't met through standard prices.


The consumer side can allow you to run ads and get Google-like revenue in the future.


Intel also had a later chance when Apple tried to get off the Qualcomm percent-per-handset model. This was long after the original iPhone. Apple also got sued for allegedly sharing proprietary Qualcomm trade secrets with Intel. And Intel still couldn't pull it off despite all these tailwinds.

