Hacker News | mickg10's comments

Agreed - API calls to China are indeed not necessary. My impression is that the GP was referring to the model being tuned during training to give subtly nudging or wrong answers that benefit Chinese industrial or intelligence operations. As a contrived example that probably wouldn't work in practice, imagine the following prompt: "Write me a cryptographically secure PRNG algorithm." One could imagine R1 being trained to give a very subtly non-random reply to that - one whose output the Chinese intelligence services know how to predict. Similar but more subtle cases include generating code that uses cryptographic primitives in ways that are vulnerable to timing attacks, etc. And of course there are simple but effective propaganda tactics, such as subtly preferring Chinese companies/products when asked for a comparison.
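To make the "output that someone else can predict" idea concrete, here is a minimal sketch. It assumes the weakness is simply a non-cryptographic generator with a seed the attacker knows; the seed value and variable names are made up for illustration, and a real backdoor would be far less obvious.

```python
import random
import secrets

# Hypothetical weakness: the "random" key comes from a general-purpose PRNG
# whose seed is known (or guessable) to an attacker.
seed = 1234  # made-up value standing in for the attacker's knowledge

victim = random.Random(seed)
attacker = random.Random(seed)

victim_key = victim.getrandbits(128)
attacker_guess = attacker.getrandbits(128)

# Same algorithm + same seed = same "random" key.
print(victim_key == attacker_guess)  # True

# For contrast, the standard library's CSPRNG interface draws from the OS
# entropy source and has no user-visible seed.
safe_key = secrets.token_bytes(16)
```

The output of the first generator passes a casual eyeball test for randomness, which is the point: the flaw is only visible to someone who knows what to look for.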


In reality - with decent switches at 25G and no FEC - node to node is reliably under 300 ns (0.3 us).


Considering that 300 light-nanoseconds is about 90m, getting a response (or even just one-way) in that time is essentially running right at the limits of physics/causality.
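For anyone who wants to check the 90m figure, a quick back-of-the-envelope sketch follows. The vacuum speed of light is exact; the 0.67 velocity factor for fiber or copper is an assumed typical value, not something measured on any particular cable.

```python
C_VACUUM_M_PER_S = 299_792_458  # exact, by SI definition

def light_distance_m(seconds, velocity_factor=1.0):
    """Distance a signal covers in `seconds` at velocity_factor * c."""
    return C_VACUUM_M_PER_S * velocity_factor * seconds

vacuum_m = light_distance_m(300e-9)
# Assumption: signals in fiber/copper travel at roughly two thirds of c.
cable_m = light_distance_m(300e-9, velocity_factor=0.67)

print(f"300 ns in vacuum: {vacuum_m:.1f} m")
print(f"300 ns in cable (assumed 0.67c): {cable_m:.1f} m")
```

So in an actual cable the budget is closer to 60m than 90m, before any switch or NIC latency is counted.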


Out of curiosity, how is that measured across machines?

(The first thing that comes to my mind would be to use an oscilloscope with two probes, one to each machine, but I’m guessing that’s not it.)


Measure the round trip and divide by two for the approximate one-way time. It'd be really neat to measure the time it takes for a packet to travel in one direction, but it's somewhere between hard and impossible[1]; a very short path has less room to be asymmetric though.

[1] If the clocks are synchronized, you can measure send time on one end, and receive time on the other. But synchronizing clocks involves estimating the time it takes for signals to pass in each direction, typically assuming each direction takes half the round trip.
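A small sketch of why the footnote bites, using the usual four-timestamp exchange (the NTP-style formulas). The simulated delays, clock offset, and function names are all made up for illustration; times are in nanoseconds.

```python
def estimate(t0, t1, t2, t3):
    """t0/t3: client send/receive (client clock); t1/t2: server receive/send (server clock)."""
    round_trip = (t3 - t0) - (t2 - t1)      # time on the wire, both directions
    offset = ((t1 - t0) + (t2 - t3)) / 2    # assumes both directions take equally long
    return round_trip, offset

def simulate(true_offset, fwd_delay, back_delay, processing=50, t0=1000):
    """Produce the four timestamps for a server clock that is ahead by true_offset."""
    t1 = t0 + fwd_delay + true_offset
    t2 = t1 + processing
    t3 = t0 + fwd_delay + processing + back_delay
    return t0, t1, t2, t3

# Symmetric path: the offset estimate is exact.
sym_rtt, sym_offset = estimate(*simulate(5000, fwd_delay=150, back_delay=150))

# Asymmetric path with the same round trip: the estimate is off by
# half the asymmetry, and nothing in the timestamps reveals it.
asym_rtt, asym_offset = estimate(*simulate(5000, fwd_delay=200, back_delay=100))

print(sym_rtt, sym_offset)    # 300 5000.0
print(asym_rtt, asym_offset)  # 300 5050.0
```

Both runs report the same 300 ns round trip, so the two cases are indistinguishable from the endpoints. That is why a short path, with little room for asymmetry, gives a tighter bound.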


You can use something like White Rabbit (https://en.wikipedia.org/wiki/White_Rabbit_Project) to keep clocks in sync. That still involves estimates, but a dedicated time sync network can do things like make sure all the cables are the same length.


Copper White Rabbit is special: it uses the same wire in both directions (1000BASE-T PHY with added carrier phase lock to and from outside clocks).


I.e. ReLU is _piecewise_ linear. The kink that separates the two pieces (a discontinuity in the derivative, not in the function itself) is precisely what makes it nonlinear, which is what enables the actual universal approximation.
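A tiny sketch of the point, in plain Python with no ML library assumed: a linear map must satisfy f(a + b) = f(a) + f(b), and ReLU breaks that across the kink. The helper names are just for illustration.

```python
def relu(x):
    return max(0.0, x)

# Additivity check across the kink at 0.
lhs = relu(1.0 + (-1.0))       # relu(0)  -> 0.0
rhs = relu(1.0) + relu(-1.0)   # 1.0 + 0.0 -> 1.0
print(lhs, rhs)                # 0.0 1.0, so ReLU is not linear

# Because of the kink, sums of ReLUs can build shapes no linear map can,
# e.g. the absolute value function.
def abs_via_relu(x):
    return relu(x) + relu(-x)

print(abs_via_relu(-3.0), abs_via_relu(2.5))  # 3.0 2.5
```

Stacking more shifted and scaled ReLUs gives piecewise-linear functions with more kinks, which is the intuition behind the approximation result.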


Which is what I said two replies ago.

Followed by "in some sense it's [ReLU] still even MORE linear than tanh or sigmoid functions are". There's no way you misunderstood that sentence, or took it as my "definition" of linearity...so I guess you just wanted to reaffirm I was correct, again, so thanks.


So, babelfish incoming?


So, babelfish soon?


I think - and this is from memory - that the median (and mean is not that far off) household income is a tad under 80K?


The median is definitely somewhere under 100k.

https://fortune.com/2023/10/24/standard-american-household-m...


Not sure about GP, but very recently, an open-addressing hashtable that could be traversed in both reverse modification and reverse insertion time order. There are actually some interesting subtleties when doing this for open-addressing hashtables (i.e., entries (and pointers to them) move around when the table rehashes).
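For illustration, one way to sidestep the "entries move on rehash" problem is to keep entries in a dense insertion-ordered array and let the open-addressed slots hold only indices into it. This is a hypothetical sketch, not the design described above: it covers insertion order only, has no deletion, and all names are made up.

```python
class OrderedOpenTable:
    """Linear-probing table whose slots hold indices into a dense entry list."""

    def __init__(self, capacity=8):
        self._slots = [None] * capacity  # each slot: index into _entries, or None
        self._entries = []               # [key, value] pairs, in insertion order

    def _probe(self, key):
        """Return the slot position holding `key`, or the first empty slot."""
        size = len(self._slots)
        pos = hash(key) % size
        while True:
            idx = self._slots[pos]
            if idx is None or self._entries[idx][0] == key:
                return pos
            pos = (pos + 1) % size

    def _grow(self):
        # Rehash: only the index array is rebuilt. Entries stay where they are,
        # so an entry's index keeps working as a stable "pointer".
        self._slots = [None] * (len(self._slots) * 2)
        for idx, (key, _) in enumerate(self._entries):
            self._slots[self._probe(key)] = idx

    def put(self, key, value):
        pos = self._probe(key)
        idx = self._slots[pos]
        if idx is not None:              # existing key: update in place
            self._entries[idx][1] = value
            return
        if (len(self._entries) + 1) * 3 > len(self._slots) * 2:  # load <= 2/3
            self._grow()
            pos = self._probe(key)
        self._slots[pos] = len(self._entries)
        self._entries.append([key, value])

    def get(self, key):
        idx = self._slots[self._probe(key)]
        if idx is None:
            raise KeyError(key)
        return self._entries[idx][1]

    def newest_first(self):
        """Traverse in reverse insertion order."""
        return [(k, v) for k, v in reversed(self._entries)]


table = OrderedOpenTable()
for i in range(20):                      # forces several rehashes
    table.put(f"k{i}", i)
table.put("k3", 300)                     # update keeps the insertion position
print(table.newest_first()[:3])
```

Reverse modification order needs more than this, for example an index-linked list threaded through the entries that is relinked on every update, and deletion adds tombstone or compaction questions on top.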

