mickg10's comments

mickg10 · 2025-01-31T21:51:30

Agreed - API calls to China are indeed not necessary. My impression is that the GP was referring to the model being tuned during training to give subtly nudging or wrong answers that benefit Chinese industrial or intelligence operations. For a (probably unworkable) example, imagine the following prompt: "Write me a cryptographically secure PRNG algorithm." One could imagine R1 being trained to give a very subtly non-random reply to that - one whose output the Chinese intelligence services know how to predict. Similar but more subtle things would be generating code that uses cryptographic primitives in ways that are subject to timing attacks, etc. And of course, simple but effective propaganda tactics, such as subtly preferring Chinese companies/products when asked for a comparison, and similar.
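To make the PRNG point concrete, here is a minimal sketch, and only an illustration of the general idea, not anything R1 is known to do. A generator whose internal state is known to a third party passes for random but is fully predictable. `KNOWN_SEED` and `weak_token` are hypothetical names; the seeded `random.Random` stands in for the "subtly non-random" generator, and the standard-library `secrets` module for an OS-backed one.

```python
import random
import secrets

# Hypothetical: a seed value that the attacker also knows.
KNOWN_SEED = 1234


def weak_token(rng: random.Random, nbytes: int = 16) -> bytes:
    """Token from a seeded Mersenne Twister: looks random, but is reproducible."""
    return bytes(rng.getrandbits(8) for _ in range(nbytes))


def strong_token(nbytes: int = 16) -> bytes:
    """Token drawn from the operating system's CSPRNG."""
    return secrets.token_bytes(nbytes)


victim = random.Random(KNOWN_SEED)
attacker = random.Random(KNOWN_SEED)

victim_tok = weak_token(victim)
predicted = weak_token(attacker)

# The attacker reproduces the victim's "random" token exactly.
print(victim_tok == predicted)  # True
```

The output of `weak_token` would pass a casual eyeball test, which is what makes this class of flaw hard to spot in generated code.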

mickg10 · on Nov 25, 2024

In reality, with decent switches at 25G and no FEC, node-to-node latency is reliably under 300 ns (0.3 µs).

znyboy · on Nov 25, 2024

Considering that 300 light-nanoseconds is about 90m, getting a response (or even just one-way) in that time is essentially running right at the limits of physics/causality.
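The arithmetic behind that figure, as a quick sketch. The speed of light in vacuum is exact by definition; the `velocity_factor` parameter is an assumption added here to show that signals in cable or fibre cover less distance in the same time.

```python
C_M_PER_S = 299_792_458  # speed of light in vacuum, exact by definition


def light_distance_m(seconds: float, velocity_factor: float = 1.0) -> float:
    """Distance covered in `seconds` at `velocity_factor` times c."""
    return C_M_PER_S * velocity_factor * seconds


one_way = light_distance_m(300e-9)
print(f"{one_way:.1f} m")  # 89.9 m

# Assumed velocity factor of roughly 2/3 for fibre; the real value depends on the medium.
in_fibre = light_distance_m(300e-9, velocity_factor=0.67)
```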

davekeck · on Nov 25, 2024

Out of curiosity, how is that measured across machines?

(The first thing that comes to my mind would be to use an oscilloscope with two probes, one to each machine, but I’m guessing that’s not it.)

toast0 · on Nov 25, 2024

Measure the round trip and divide by two for the approximate one-way time. It'd be really neat to measure the time it takes for a packet to travel in one direction, but it's somewhere between hard and impossible[1]; a very short path has less room to be asymmetric though.

[1] If the clocks are synchronized, you can measure send time on one end, and receive time on the other. But synchronizing clocks involves estimating the time it takes for signals to pass in each direction, typically assuming each direction takes half the round trip.
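A small sketch of why the footnote bites, using the usual four-timestamp exchange (the same arithmetic NTP uses). The `simulate` helper and its numbers are made up for illustration. Two paths with the same round trip but different one-way delays are indistinguishable from the timestamps alone, and the clock offset estimate is wrong by half the asymmetry.

```python
def estimate(t0, t1, t2, t3):
    """t0/t3 are read on the client clock, t1/t2 on the server clock."""
    rtt = (t3 - t0) - (t2 - t1)            # round trip, minus server processing
    offset = ((t1 - t0) + (t2 - t3)) / 2   # assumes both directions take rtt / 2
    return rtt, offset


def simulate(true_offset, fwd, rev, proc=0, t0=0):
    """Hypothetical exchange; all values in nanoseconds."""
    t1 = t0 + fwd + true_offset   # server receives (server clock)
    t2 = t1 + proc                # server replies (server clock)
    t3 = t2 - true_offset + rev   # client receives (client clock)
    return estimate(t0, t1, t2, t3)


symmetric = simulate(true_offset=1000, fwd=300, rev=300)
asymmetric = simulate(true_offset=1000, fwd=400, rev=200)

print(symmetric)   # (600, 1000.0)
print(asymmetric)  # (600, 1100.0)
```

Both runs report a 600 ns round trip; the second is off by 100 ns, which is (400 - 200) / 2.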

pkhuong · on Nov 25, 2024

You can use something like White Rabbit (https://en.wikipedia.org/wiki/White_Rabbit_Project) to keep clocks in sync. That still involves estimates, but a dedicated time sync network can do things like make sure all the cables are the same length.

namibj · on Nov 27, 2024

Copper White Rabbit is special: it uses the same wire in both directions (a 1000BASE-T PHY with added carrier phase lock to and from outside clocks).

mickg10 · on Oct 4, 2024

I.e. ReLU is _piecewise_ linear. The kink that separates the 2 pieces (a discontinuity in the slope, not in the function itself) is precisely what makes it nonlinear. Which is what enables the actual universal approximation.
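A quick check of that claim, as a sketch. A linear map must satisfy f(a + b) = f(a) + f(b); ReLU does on each side of zero but not across it. The `abs_via_relu` helper is just an illustration of building a non-linear function out of two ReLU pieces.

```python
def relu(x: float) -> float:
    return max(0.0, x)


# Additivity holds when both inputs sit on the same piece...
same_side = relu(1.0 + 2.0) == relu(1.0) + relu(2.0)

# ...and fails when they straddle the kink at 0.
a, b = 1.0, -1.0
lhs = relu(a + b)        # relu(0.0)
rhs = relu(a) + relu(b)  # 1.0 + 0.0

print(same_side, lhs, rhs)  # True 0.0 1.0


def abs_via_relu(x: float) -> float:
    """|x| as a sum of two ReLUs: non-linear, built from piecewise-linear parts."""
    return relu(x) + relu(-x)


print(abs_via_relu(-3.0))  # 3.0
```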

quantadev · on Oct 4, 2024

Which is what I said two replies ago.

Followed by "in some sense it's [ReLU] still even MORE linear than tanh or sigmoid functions are". There's no way you misunderstood that sentence, or took it as my "definition" of linearity...so I guess you just wanted to reaffirm I was correct, again, so thanks.

mickg10 · on May 13, 2024

So, babelfish incoming?

mickg10 · on May 13, 2024

So, babelfish soon?

mickg10 · on Oct 27, 2023

I think - and this is from memory - that the median (and mean is not that far off) household income is a tad under 80K?

mbfg · on Oct 27, 2023

The median is definitely somewhere under 100k.

https://fortune.com/2023/10/24/standard-american-household-m...

mickg10 · on June 27, 2017

Not sure about GP, but very recently, an open-addressing hashtable that could be traversed in both reverse modification and reverse insertion time order. There are actually some interesting subtleties when doing this for open-addressing hashtables (i.e., entries (and pointers to them) move around when the table rehashes).
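For a flavour of the rehash subtlety, a minimal sketch and not the actual implementation described above. It covers insertion order only and has no deletion. Entries are threaded into a doubly linked list by slot index, so every link becomes stale when the table grows and has to be rebuilt. The class and method names are made up for illustration.

```python
class OrderedOpenTable:
    """Linear-probing table whose entries are linked in insertion order."""

    _EMPTY = object()

    def __init__(self, capacity: int = 8):
        self._init_slots(capacity)

    def _init_slots(self, capacity: int) -> None:
        self._keys = [self._EMPTY] * capacity
        self._vals = [None] * capacity
        self._prev = [-1] * capacity  # slot of the entry inserted just before
        self._next = [-1] * capacity  # slot of the entry inserted just after
        self._head = self._tail = -1
        self._size = 0

    def _find_slot(self, key) -> int:
        cap = len(self._keys)
        i = hash(key) % cap
        # Load factor stays at or below 1/2, so an empty slot always exists.
        while self._keys[i] is not self._EMPTY and self._keys[i] != key:
            i = (i + 1) % cap
        return i

    def put(self, key, value) -> None:
        i = self._find_slot(key)
        if self._keys[i] is not self._EMPTY:
            self._vals[i] = value  # update in place; insertion order unchanged
            return
        if (self._size + 1) * 2 > len(self._keys):
            self._rehash(len(self._keys) * 2)
            i = self._find_slot(key)  # the old slot index is meaningless now
        self._keys[i], self._vals[i] = key, value
        self._prev[i], self._next[i] = self._tail, -1
        if self._tail != -1:
            self._next[self._tail] = i
        else:
            self._head = i
        self._tail = i
        self._size += 1

    def get(self, key, default=None):
        i = self._find_slot(key)
        return self._vals[i] if self._keys[i] is not self._EMPTY else default

    def _rehash(self, new_capacity: int) -> None:
        # Entries move to new slots, so walk the old list oldest-first
        # and reinsert, which rebuilds the links as a side effect.
        old = list(self.items())
        self._init_slots(new_capacity)
        for k, v in old:
            self.put(k, v)

    def items(self):
        i = self._head
        while i != -1:
            yield self._keys[i], self._vals[i]
            i = self._next[i]

    def items_newest_first(self):
        i = self._tail
        while i != -1:
            yield self._keys[i], self._vals[i]
            i = self._prev[i]


table = OrderedOpenTable()
for n in range(20):  # enough inserts to force several rehashes
    table.put(f"k{n}", n)
newest_first = [k for k, _ in table.items_newest_first()]
```

Rebuilding by reinsertion is the simple way out; anything that hands out long-lived pointers to entries needs a different scheme, since those pointers do not survive the rehash.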