Hacker News | whoisjuan's comments

It's actually quite fascinating if you watch it for 5 minutes. Some models are overall bad, but others nail it in one minute and butcher it in the next.

It's perhaps the best example I have seen of model drift driven by just small, seemingly unimportant changes to the prompt.


> model drift driven by just small, seemingly unimportant changes to the prompt

What changes to the prompt are you referring to?

According to the comment on the site, the prompt is the following:

Create HTML/CSS of an analog clock showing ${time}. Include numbers (or numerals) if you wish, and have a CSS animated second hand. Make it responsive and use a white background. Return ONLY the HTML/CSS code with no markdown formatting.

The prompt doesn't seem to change.


The time given to the model. So the difference between two generations is just something trivially different like: "12:35" vs. "12:36"

Presumably the time is replaced with the actual current time at each generation. I wonder if they are actually generated every minute or if all 6480 permutations (720 minutes on a 12-hour clock face * 9 LLMs) were generated and are just shown on a schedule.
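The 6480 figure in the comment above can be checked with a short sketch. It assumes the count refers to the 720 distinct minute positions on a 12-hour analog face and to nine models; the prompt text is the one quoted earlier in the thread, while `clock_times`, `render_prompt`, and `NUM_MODELS` are hypothetical names used only for illustration.

```python
from string import Template

# Prompt quoted in the thread; ${time} is the only part that varies.
PROMPT = Template(
    "Create HTML/CSS of an analog clock showing ${time}. "
    "Include numbers (or numerals) if you wish, and have a CSS animated "
    "second hand. Make it responsive and use a white background. "
    "Return ONLY the HTML/CSS code with no markdown formatting."
)

NUM_MODELS = 9  # assumption taken from the comment ("9 llms")


def clock_times():
    """Yield every distinct H:MM time on a 12-hour analog face."""
    for hour in range(1, 13):
        for minute in range(60):
            yield f"{hour}:{minute:02d}"


def render_prompt(time_str):
    """Fill the time placeholder, as the site presumably does per generation."""
    return PROMPT.substitute(time=time_str)


times = list(clock_times())
total_generations = len(times) * NUM_MODELS

print(len(times), total_generations)
print(render_prompt("12:35"))
```

Two consecutive prompts built this way differ by a single character, which is the "trivially different" input the earlier comment describes.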

It is really interesting to watch them for a while. QWEN keeps outputting some really abstract interpretations of a clock, KIMI is consistently very good, GPT5's results line up exactly with my experience with its code output (overly complex and never working correctly)

We can't know how much is about the prompt though and how much is just stochastic randomness in the behavior of that model on that prompt, right? I mean, even given identical prompts, even at temp 0, models don't always behave identically... at least, as far as I know? Some of the reasons why are, I think, still a research question, but I think it's a fact nonetheless.

Kimi seems the only reliable one which is a bit surprising, and GPT 4o is consistently better than GPT 5 which on the other hand is unfortunately not surprising at all.

Did you guys change the pricing of Exa?

When I checked this a year or so ago, I might have gotten the impression that it was cheaper. Now, it costs the same as what Perplexity charges for search-grounded queries, which is the same as Google charges for Gemini queries with search.

So basically, one player sets a price, and everyone is anchored on that as the pricing for the entire category? I'm just genuinely interested in why every offering in this space is priced like this.

It seems a bit misaligned with how pure LLM queries are priced.

I have a product that would benefit from search grounding, but this pricing wouldn't work with my volume of queries.


We charge $5 per 1000 requests with our search and answer endpoints.

Perplexity charges the same on their lowest tier model, and three times as much for their more expensive models.

Gemini charges $35 per 1000 requests.

https://exa.ai/pricing

https://docs.perplexity.ai/guides/pricing

https://ai.google.dev/gemini-api/docs/pricing
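To make the comparison above concrete, here is a rough sketch of what those per-1000 prices imply at volume. The $5 and $35 figures come from the comment; the $15 figure is derived from "three times as much"; the monthly volume of one million requests is a made-up number for illustration, and `monthly_cost` is a hypothetical helper.

```python
# Prices in USD per 1000 requests, as stated in the comment above.
PRICE_PER_1000 = {
    "exa_search": 5.0,
    "perplexity_lowest_tier": 5.0,
    "perplexity_higher_tier": 15.0,  # derived: three times the lowest tier
    "gemini_with_search": 35.0,
}


def monthly_cost(requests, price_per_1000):
    """Cost in USD for a given number of requests at a per-1000 price."""
    return requests / 1000 * price_per_1000


# Hypothetical volume, not taken from the thread.
requests_per_month = 1_000_000

costs = {
    name: monthly_cost(requests_per_month, price)
    for name, price in PRICE_PER_1000.items()
}

for name, cost in costs.items():
    print(f"{name}: ${cost:,.0f} per month")
```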


What product of yours would benefit, if you don't mind my asking?


This is wild, but many studies have reached the same conclusion.

I remember reading somewhere that heart transplant recipients have random memory flashes that are not their memories, and sometimes they develop new personality traits.


A theory I have seen is that we tend to mix up cause and effect.

So, for example, a dangerous situation causes stress and stress causes the heart to beat faster, all normal. But make the heart beat faster through external means and it will also cause stress. So it is not clear which one is the cause and which one is the effect, probably some weird combination, with all sorts of feedbacks. Life is messy.

So get a heart that isn't yours and it will not beat in a familiar way, which, in turn, may be interpreted as changing emotions. And even if memories are entirely contained within the brain, what if the heartbeat is part of these memories? With a heart that reacts differently, the meaning of these memories may change.

For a tech analogy, in order to record a video game session, it is common to only record player input. If the game is deterministic, you just need to run the game with the recorded inputs and the session will be faithfully reproduced. It is much more compact than something like a video. Now imagine we change the game engine so that it responds slightly differently to inputs. Now, when replayed, the game will appear different. If we imagine memories are "replays" and the engine is our body, then altering our body will also alter our memories.
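The replay analogy can be sketched in a few lines. This is a toy model, not how any real game engine records sessions: the "game" is a single position on a line, and `run_session`, `original_engine`, and `altered_engine` are hypothetical names invented for the illustration.

```python
def run_session(engine, inputs, start=0):
    """Replay recorded inputs through an engine and return every position visited."""
    position = start
    trace = [position]
    for key in inputs:
        position = engine(position, key)
        trace.append(position)
    return trace


def original_engine(position, key):
    """Deterministic rules: left moves one step back, right one step forward."""
    step = {"left": -1, "right": 1}.get(key, 0)
    return position + step


def altered_engine(position, key):
    """Same inputs accepted, but "right" now moves twice as far."""
    step = {"left": -1, "right": 2}.get(key, 0)
    return position + step


# The "memory": only the inputs are stored, never the resulting positions.
recorded_inputs = ["right", "right", "left", "right"]

original_replay = run_session(original_engine, recorded_inputs)
second_replay = run_session(original_engine, recorded_inputs)
altered_replay = run_session(altered_engine, recorded_inputs)

print(original_replay)
print(altered_replay)
```

The same recording replays identically on the original engine and diverges on the altered one, even though the stored inputs never changed.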


> I remember reading somewhere that heart transplant recipients have random memory flashes that are not their memories, and sometimes they develop new personality traits.

Wild. Doesn't necessarily surprise me too much that the body stores some memories outside the brain, but it seems _very_ surprising that another body/brain can read and understand ones created by another. I'd assume that the whole mind and memory system is one big correlated mess, not essentially composed of data files in a ~standard encoding.


It would be hasty to assume that any memories would be transferable in such a way. If your hypothesis is that transplant recipients can have their memories altered by interpreting information carried by foreign organ cells, start by assuming they're reading junk data that they cannot decipher. Brains are great at turning junk data into something that feels real.


I would probably ascribe it to the procedure itself. Like I imagine if you put someone under, opened up their chest, took their heart out and then... put it back in - that the stress of that whole thing would be enough to seriously mess with your head.


You could probably test that theory. Just compare heart transplants against similarly invasive surgeries and see if the same effects exist.


That was my followup question, are the memories accurate (even as much as normal memories are), or are they nonsense? Or even better, it'd be fun if they're not completely nonsense, but corrupted in some understandable way (like people/places are substituted for instance). There's no way at all that memories are encoded as essentially mpeg files, so _something_ has to be wrong with them.

But yeah, you're right, odds seem good that they're just nonsense, but even then it just feels weird that the body can even interpret them as memories in the slightest.


Maybe it's all about encoding and it IS pretty standard? The brain can decode vision through tongue nerves [1] as long as it looks like vision data and is correlated with head movements. There were experiments with other senses sent through different means, or whole new senses (magnetic [2] and echolocation [3]). It looks like the brain is so flexible that anything resembling sensible information will be decoded.

[1] https://news.wisc.edu/a-taste-of-vision-device-translates-fr...

[2] https://blinry.org/compass-belt/

[3] https://www.physoc.org/magazine-articles/echolocation-in-peo...


Rabies virus induces the same behavior across different species (the victims in terminal state are terrified by swallowing liquids).


That sounds really interesting! Can you cite any articles or anything?




> In addition to changes in preferences, some recipients describe new aversions after receiving a donor heart. For example, a 5-year-old boy received the heart of a 3-year-old boy but was not informed about his donor’s age or cause of death. Despite this lack of information, he provided a vivid description of his donor after the surgery: “He’s just a little kid. He’s a little brother like about half my age. He got hurt bad when he fell down. He likes Power Rangers a lot I think, just like I used to. I don’t like them anymore though” (p. 70, [8]). Subsequently it was reported that his donor had died after falling from an apartment window while trying to reach a Power Ranger toy that had fallen onto the window ledge. After receiving his new heart, the recipient refused to touch or play with Power Rangers

This is the most fascinating thing I've read in a long time. Thanks for the link


There’s a similar story I’ve read before in a different paper about an organ donor who drowned and then the recipient developed an extreme aversion to water.

I don’t recall what the exact title or link to the article was though.


Man that seems like such a fantastical claim — but yeah, it does seem like the physical structure to support it could be there.


This is a very sneaky ethically gray company. Their app is not only of terrible quality but also full of dark patterns. I'm convinced that any revenue they make comes from people who can't figure out how to cancel. Stay away from it.


Ironic given the app’s purpose.


I'm a designer. I built brainglue.ai without Figma, a design system, or a UI library. I just went directly to code (react+tailwind) and let a style organically emerge.

I'm not saying that I'm a unicorn and that my idea-to-code-to-design execution is flawless, but I certainly believe that in this situation, if I hadn't done it this way, I wouldn't have done it at all. However, doing this would be wasteful or dumb in almost every other situation that requires my design output.

People pay for Figma precisely because it's a middle ground. It was a middle ground before, and it will continue to be unless something fundamental changes.


Nice design!

Small suggestion, I would preload the contents of each tab and the images in the circle after the main content is loaded. There was a good two second lag loading the images here in Australia.


Ohh nice catch. Will do. Thanks


AppleTV+ is a tiny business. It's nowhere near generating enough revenue to cover a $20B hole in content production costs.

Yes, Apple generates LOTS of revenue overall, but that doesn't justify bleeding cash on a business line that hasn't produced material returns and has no significant positive trajectory in sight.

It's clear that Apple saw this as their Prime Video bet on their services strategy, but that hasn't worked out. Just look at AppleTV+ market share. It's hilariously minuscule.


Apple can afford to play the long game here though.

TV+ is a nice value add on their bundled subscription package, so it may be driving more people to opt for that. I know it was a major factor in my decision and now I am playing Apple Arcade games and use Apple Music as my primary music service.

Operating TV+ as a halo or loss leader product to get people to try other services within the ecosystem could be a winning strategy for them. Also likely drives some hardware sales.


In what world does Apple need a “loss leader” to “drive hardware sales”?

They have some of the highest sales for high value products in the world.

The goal of this was obviously to scoop $10 more cream a month from a % of those huge sales numbers.

“Loss Leader” theory makes no sense next to Apple’s financial and sales reality.


The hardware sales I was talking about is their streaming box which is in no way dominant but is really nice. Also we have pretty much reached peak iPhone or close to it. These subscription services help keep people in the ecosystem as it raises switching costs a bit.

The article is implying that TV+ is losing money. I don’t know that it is but my point is that for Apple it is likely still worth keeping and growing the service even if they are losing money on it at the moment.


Apple TV doesn’t matter, I’m sure iPad watching dwarfs ATV many hundreds if not thousands of times over.

AppleTV exists to make existing mega fans happy the ecosystem extends to their TV not to sell things.


It's a "tiny" part of a services business with a billion subscribers, that generates $20B of revenue in a quarter. $20B in production costs over 4.5 years works out to less than $5B/year in costs, against competitors like Netflix and Disney+ that are spending $20B a year.
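The per-year figure in the comment above is simple division, sketched here for clarity. The inputs ($20B over 4.5 years, and roughly $20B a year for the competitors) are the commenter's numbers, not independently verified; the variable names are illustrative.

```python
# Figures as stated in the comment, in billions of USD.
apple_total_cost = 20.0        # reported content spend over the whole period
apple_years = 4.5              # period the spend is spread across
competitor_cost_per_year = 20.0

apple_cost_per_year = apple_total_cost / apple_years
spend_ratio = competitor_cost_per_year / apple_cost_per_year

print(f"Apple: about ${apple_cost_per_year:.2f}B per year")
print(f"Competitors spend about {spend_ratio:.1f}x as much per year")
```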

I actually haven't seen any numbers on their current market share, but I'll give you that they aren't anywhere near their competitors. I don't think the problem is that they're spending too much money.


I own a Model 3, and I honestly think it is the best car I have ever owned.

Despite that, I know people who have ruled out owning a Tesla because they believe the brand mirrors Elon Musk's public persona. They flat-out reject any Tesla product because the brand's visible face is someone they believe doesn't represent their values.

I'm unsure he understood the implications of becoming such a polarizing figure. It was totally unnecessary, yet that was his choice.


Or car buyers just want turn signal stalks or auto windshield wipers that work.


and can be refilled in 5 minutes.


I don't think it was a conscious or deliberate choice.

Hubris made him think he was the real Tony Stark, genius playboy philanthropist bullshit, but as talented as he is in some areas, his flaws are clearly visible.

But at the end of the day, I think that if Tesla was making valuable cars, it would not matter that much. The thing is that those cars also have many flaws, and a lot of undelivered promises around self-driving...


> Hubris made him think he was the real Tony Stark

Maybe it was the time Tony Stark praised his ideas.

https://www.youtube.com/watch?v=-wC4rLguuYI


"Republicans buy sneakers, too"

It does not take much business acumen to see that taking a political stance is going to anger someone. Best keep your mouth shut unless you are prepared for the consequences.

I doubt Elon’s personal views are much different from any other billionaire CEO. Yet the majority of them are not in the spotlight and drawing attention to themselves.


I don't doubt that a portion of potential buyers are turned off by Musk's politics and refuse to buy his cars. But a much, much larger group (in my opinion) are people who like gas-powered cars and have zero interest in EVs, no matter how large the subsidies are, how advanced the tech is, or how many chargers are out there. Ignoring this group, and focusing on those caught up with Musk's politics, is missing the big picture, I believe.


What other cars have you owned?


Counterpoint: It was 100% necessary. Telling the truth always is.


Which truth are you referring to?


So the "truth" would be his primary base of personal wealth is utterly derived from selling cars to...checking notes...the worst people in the country who deserve to be bullied and ridiculed? And that he has no choice but to educate the world about this truth? Even if it means destroying his personal reputation and businesses?

Or - and this is amply documented now - perhaps the "truth" is he's doing way too many drugs (WSJ), making horrible decisions (Supercharger), and per the deeply persuasive data in the linked thread, is killing the company?

Like, huh? What?


Elon Musk is not famous for honesty.


Not OP, but I don’t think test-driven development resonates with everyone who writes code.

I don’t want to write tests for everything. I just want to write the ones that matter.


That is a common misconception about TDD.

TDD is _about_ writing tests that matter, but most people think it is about writing all unit tests first.

If you are following TDD anywhere close to the way it is described, you will only be writing tests that relate to domain functionality first.

Note how it is described here, although it is terse.

https://martinfowler.com/bliki/TestDrivenDevelopment.html

The coverage-metric-as-a-goal writing style doesn't work for TDD; sorry you were exposed to that.

You are correct that model doesn't work.


> The coverage metric as a goal writing style doesn't work for TDD

Coverage is not a goal of TDD, but in practice you will have 100% coverage by following TDD as you would never have reason to write code that isn't covered by test.

Ultimately, the purpose of coverage tools is to let you know what you might have forgotten to clean up during a refactor, to help you remove what you missed.


TDD or not, why would you write tests for things that don't matter?

More importantly, why are you writing any code for things that don't matter?


I think productivity is lower in the winter, so I'm not sure about quality per se, but intuitively it makes sense that anything written in the winter months is less verbose.


GPTs is basically a ripoff of Poe by Quora. Quora’s CEO is Adam D’Angelo. Adam D’Angelo is one of OpenAI’s board members.

Make your own conclusions.


Never heard of Poe, I had to look it up.

> Poe lets you ask questions, get instant answers, and have back-and-forth conversations with AI. Gives access to GPT-4, gpt-3.5-turbo, Claude from Anthropic, and a variety of other bots.

I'm not sure I would call Poe a rip-off at all? Sounds like a bundled ChatGPT product.


Poe has allowed custom bots for over 6 months now. It's a very similar experience to creating/using [custom] GPTs.


But that was always available with ChatGPT? GPTs is just some new interface/market-ish to them no?

