
It was an interesting experience travelling to Italy and suddenly starting to get cookie banners on sites I visit daily that normally don't have them.


Interesting. Did they explain why?


They had this in the reply

> How to rectify: Ensure your privacy policy contains details about user data collection, handling, storage and sharing. Omission of any section is not allowed.

So I added a section for each. I could make the "Information We Collect" section less verbose for sure.


Does the privacy policy they demand follow any law, or is it just their own "you should do it this way"?


I'm honestly not sure.


Fully agree. The physics of solar panels on cars just doesn't work. It's bizarre that this is actively pursued by startups and concept cars from large manufacturers when just a quick back-of-the-napkin calculation shows why.

A car has about 5 m^2 of flat space on the roof/hood/trunk so that's the maximum surface area that can capture solar energy at any given time.

The peak solar power hitting that area is about 1,000 W/m^2.

The panels can't rotate to track the sun, so the effective area scales with the cosine of the sun's angle of incidence. You end up with about half as many effective sunlight hours as actual daylight hours, so in summer you get about 6 hours of effective sunlight.

Good panels in real-world conditions can give you 22% efficiency.

So in optimal conditions you get: 5 × 1,000 × 6 × 0.22 = 6,600 Wh = 6.6 kWh.

That reflects your best days. It can be dramatically less if it's cloudy or overcast, in winter, far from the equator, if the car is dirty or parked in shade, etc.

6.6 kWh is about one tenth of the battery in my Hyundai Kona EV. With very conservative highway driving, 6.6 kWh gets about 40 km of range, and about 50 km in city driving. It's what I get from plugging into my home charger for 30 minutes, or from a fast charger in about 3 minutes.

So aside from some very niche uses, there's no sense in massively increasing the cost and complexity of a car by installing solar panels. Far better to put the panels on the roof of the parking structure and just plug in for a few minutes while you park.
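The arithmetic above can be sketched in a few lines (a rough model using the assumed figures from this comment: 5 m^2 of panel, 1,000 W/m^2 peak irradiance, 6 effective sun-hours, 22% efficiency):

```python
# Back-of-the-napkin estimate of best-case daily solar yield on a car.
# All inputs are the rough assumptions stated above, not measured values.
area_m2 = 5                   # flat space on roof/hood/trunk
irradiance_w_per_m2 = 1000    # peak solar power per square meter
effective_sun_hours = 6       # summer, after the cosine penalty
panel_efficiency = 0.22       # good real-world panels

daily_yield_kwh = (area_m2 * irradiance_w_per_m2
                   * effective_sun_hours * panel_efficiency) / 1000
print(f"Best-case daily yield: {daily_yield_kwh:.1f} kWh")  # 6.6 kWh
```

Everything after that (clouds, winter, dirt, shade) only scales this number down.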


I was wondering the same and found these related papers:

https://arxiv.org/pdf/2309.08561
https://arxiv.org/pdf/2406.02649

I haven't really dug in yet but from a quick skim, it looks promising. They show a big improvement over Whisper on a medical dataset (F1 increased from 80.5% to 96.58%).

The inference time for the keyword detection is about 10 ms. If it scales linearly with the number of keywords, you could potentially scale to hundreds or thousands of keywords, but it really depends on how sensitive you are to latency. For real-time use with large vocabularies, my guess is you might still want to fine-tune.
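As a rough sanity check of that scaling argument (assuming, hypothetically, ~10 ms of detection cost per keyword; whether the paper's 10 ms is per keyword or per pass isn't clear from a skim):

```python
# Hypothetical linear-scaling latency estimate: detection cost grows
# with the number of keywords at ~10 ms each (an assumption, not a
# measured figure from the papers).
PER_KEYWORD_MS = 10

def total_latency_s(num_keywords: int) -> float:
    """Estimated per-inference latency in seconds under linear scaling."""
    return PER_KEYWORD_MS * num_keywords / 1000

for n in (1, 100, 1000):
    print(f"{n:>5} keywords -> ~{total_latency_s(n):.2f} s per inference")
```

At a thousand keywords that's ~10 s per pass under this assumption, which is why fine-tuning may still win for real-time use.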


Idk, 7 articles over 10 years isn't very strong evidence of a raging debate


Surely this isn't the totality of all discourse on the subject.


I do not even think it is morality, actually. Citizens of western countries would absolutely like to see these mosquitoes eliminated from their own countries; it is just that they are against eliminating them in Africa, Latin America, etc., in the name of being "concerned about the ecological impact" of doing so. Once global warming accelerates and some of those diseases, such as dengue and zika, become prevalent here in America, public opinion will magically change, haha.


Well you can count this western perspective as being in favor of mosquito elimination from the entire planet.


I hesitated adding an eighth link. Alas, it was too few.


the noise is pretty hard to stomach


The smell can be pretty bad too


Take a look at this one, might be a fit depending on your use case https://www.eself.ai/


Surprised nobody has pointed this out yet: this is not a GPT-4.5-level model.

The source for this claim is apparently a chart in the second tweet in the thread, which compares ERNIE-4.5 to GPT-4.5 across 15 benchmarks and shows that ERNIE-4.5 scores an average of 79.6 vs 79.14 for GPT-4.5.

The problem is that the benchmarks they included in the average are cherry-picked.

They included benchmarks on 6 Chinese language datasets (C-Eval, CMMLU, Chinese SimpleQA, CNMO2024, CMath, and CLUEWSC) along with many of the standard datasets that all of the labs report results for. On 4 of these Chinese benchmarks, ERNIE-4.5 outperforms GPT-4.5 by a big margin, which skews the whole average.

This is not how results are normally reported and (together with the name) seems like a deliberate attempt to misrepresent how strong the model is.

Bottom line, ERNIE-4.5 is substantially worse than GPT-4.5 on most of the difficult benchmarks, matches GPT-4.5 and other top models on saturated benchmarks, and is better only on (some) Chinese datasets.
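A toy illustration of the averaging effect (scores invented for illustration, not the actual reported numbers): a model that loses 9 of 15 benchmarks can still "win" the overall average if its wins are lopsided enough.

```python
# Toy benchmark scores as (model_a, model_b) pairs, invented for
# illustration only.
standard = [(72, 78)] * 9   # model A trails on the standard benchmarks
language = [(95, 70)] * 6   # but wins big on language-specific sets

scores = standard + language
avg_a = sum(a for a, _ in scores) / len(scores)
avg_b = sum(b for _, b in scores) / len(scores)
print(f"A: {avg_a:.1f}, B: {avg_b:.1f}")  # A: 81.2, B: 74.8
```

Model A loses 9 of the 15 head-to-heads yet comes out ahead on the average, which is exactly why the choice of which benchmarks go into the average matters.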


To try to avoid the inevitable long arguments about which benchmarks or sets of them are universally better: there is no such thing anymore. And even within benchmarks, we're increasingly squinting to see the difference.


Do the benchmarks reflect real-world usability? My feeling is that the benchmark numbers stop being meaningful above about 75%.

In a real problem you may need to get 100 things right in a chain, which means a 99% chance of getting each single step correct results in only a 37% chance of getting the correct end result. But creating a diverse test that can correctly identify 99%-correct results in complex domains sounds very hard, since the answers are often nuanced in details where correctness is hard to define and determine. From working in complex domains as a human: it often is not very clear whether something is right, wrong, or in a somewhat undefined and underexplored grey area. Yet we have to operate in those areas and then, over many iterations, converge on a result that works.

Not sure how such complex domains should be benchmarked and how we objectively would compare the results.
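The compounding math in the comment above checks out; for n independent steps with per-step accuracy p, the whole chain succeeds with probability p^n:

```python
# Probability that a chain of n independent steps all succeed,
# given per-step accuracy p: p ** n.
p, n = 0.99, 100
chain_success = p ** n
print(f"Per-step: {p:.0%}, whole chain: {chain_success:.1%}")  # ~36.6%
```

Even 99.9% per-step accuracy over 100 steps only gets you to about 90% end-to-end, which is what makes "above 75% on a benchmark" hard to interpret for chained real-world tasks.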


GPT-4.5's advantages are supposed to be in aspects that aren't being captured well in current benchmarks, so the claim would be shaky even if ERNIE's benchmarks actually showed better performance.


You know what's sad? Every Western company has been using this technique for a long time...


So, fairly accurate if you're Chinese?


It doesn't really matter what nationality or ethnicity you are, but if you communicate with the model in Chinese you might get better results from this model.

Then again, if they've misrepresented the strength of the model overall, there might be some other shenanigans with their results. The fact that their results show their model is worse than GPT-4.5 on 2 Chinese language benchmarks, while it's so much stronger on some of the others, is a bit weird.


Also worse than o1-mini on agentic tasks (page 29): a big drop, from 39% to 27%.


Interesting. I wonder if this is related to the model architecture and attention mechanism.

The author seems to be implying it could be: "Even a single mention of ‘code enhancement suggestions’ in our instructions seemed to hijack the model’s attention"


The attention is probably just latching onto strong statistical patterns. Obvious errors create sharp spikes in attention weights and drown out more subtle signals that can actually matter more.

