sebastiennight · 2025-10-12T07:19:31 1760253571

Very interesting project, and I found two things particularly smart and well executed in the demo:

1. Using a "painter commenter" feedback loop to make sure the slides are correctly laid out with no overflowing or overlapping elements.

2. Having the audio/subtitles not read word-for-word the detailed contents that are added to the slides, but instead rewording that content to flow more naturally and be closer to how a human presenter would cover the slide.

A couple of things might possibly be improved in the prompts for the reasoning features, eg. in `answer_question_from_image.yaml`:

  1. Study the poster image along with the "questions" provided.
  2. For each question:
     • Decide if the poster clearly supports one of the four options (A, B, C, or D). If so, pick that answer.
     • Otherwise, if the poster does not have adequate information, use "NA" for the answer.
  3. Provide a brief reference indicating where in the poster you found the answer. If no reference is available (i.e., your answer is "NA"), use "NA" for the reference too.
  4. Format your output strictly as a JSON object with this pattern:
     {
       "Question 1": {
         "answer": "X",
         "reference": "some reference or 'NA'"
       },
       "Question 2": {
         "answer": "X",
         "reference": "some reference or 'NA'"
       },
       ...
     }

I'd assume you would get better results by asking for the reference first and then the answer. Otherwise you probably have quite a number of answers where the model just "knows" the answer and draws on its own training rather than on the image, which would bias the benchmark.
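To make the suggestion concrete, here is a minimal sketch, assuming the prompt keeps the same JSON shape and only swaps the two keys so the model has to cite the poster before committing to an answer. The helper names (`build_format_instructions`, `reference_comes_first`) are made up for illustration and are not from the project's repo.

```python
import json


def build_format_instructions(num_questions: int) -> str:
    """Build the JSON pattern with "reference" placed before "answer"."""
    pattern = {
        f"Question {i}": {
            "reference": "where in the poster the evidence is, or 'NA'",
            "answer": "A, B, C, D, or 'NA'",
        }
        for i in range(1, num_questions + 1)
    }
    return json.dumps(pattern, indent=2)


def reference_comes_first(raw_reply: str) -> bool:
    """Check that every question object lists "reference" before "answer"."""
    reply = json.loads(raw_reply)  # dicts keep the key order of the JSON text
    return all(
        list(entry)[:2] == ["reference", "answer"] for entry in reply.values()
    )
```

The check only verifies the key order in the reply. Whether the reordering actually reduces answers pulled from training data would still have to be measured on the benchmark.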

sebastiennight · 2025-10-10T22:25:51 1760135151

What is your general opinion on the wisdom and decision-making of politicians and heads of state that are above the age of 80?

RiverCrochet · 2025-10-11T00:34:06 1760142846

Anyone who is 81 or older is elderly, therefore they have a lot of wisdom, therefore their decisions are the best. My opinion completely aligns with this logic with no deviation.

sebastiennight · 2025-10-11T15:15:35 1760195735

Agreed. As a Frenchman, I can testify that the same logic allows anyone to take perfectly dreary 2022 Napa Valley wine, and turn it into a beautiful 1942 Saint-Emilion. You just need to age it 80 years.

lostlogin · 2025-10-11T00:48:13 1760143693

> elderly, therefore they have a lot of wisdom

That’s a hell of an assumption, though maybe this is missing a ‘/s’?

sebastiennight · 2025-10-06T19:19:19 1759778359

I hope your comment doesn't get downvoted into oblivion, because this is a common reaction that deserves to be addressed.

The issue isn't that your history/thoughts are harmless. The problem is that you might consider them harmless, but some authority in the future might decide that you're not one of the "good" citizens.

There are prior examples of this happening in history, eg. there was no reason to believe that your candid answering of a census question about religion in late 19th-century/early 20th-century Germany would ever lead to a young startup called International Business Machines helping your government hunt you down a few years later.

sebastiennight · 2025-10-02T19:12:25 1759432345

> So I guess USA is now gonna make war against Mexico/Canada, because they are divided?

Don't give them ideas. Once they are finished with Portland, LA and NYC, Canada is just next door

nmz · 2025-10-02T19:37:24 1759433844

Forgot chicago

sebastiennight · 2025-10-02T18:18:17 1759429097

Ever since I've heard of the meme that "modern men can't spend 24 hours without thinking of the Roman Empire", I haven't been able to escape it, even on days where my only contact with the outside world is HN.

I guess it's like a curse, once you've heard about it you're doomed.

And for anyone finding out about it just now, alea jacta est

frollogaston · 2025-10-02T23:29:29 1759447769

There used to be some SF Italian restaurant that showed first if you Googled "SPQR." Their SEO was stronger than Rome. I don't even live near there.

I think this meme has bumped the real SPQR back to the top.

sverhagen · 2025-10-02T19:45:18 1759434318

I need an faq or something then, to figure out what's wrong with me for never thinking of the Roman Empire. Except now then.

Moru · 2025-10-02T23:00:27 1759446027

It's only a certain type of man that they are talking about. We are not that type of men I guess. Can't say I know anyone with that problem to be honest. And yes, I have heard that saying before. Didn't work then and doesn't work now.

sph · 2025-10-03T09:38:08 1759484288

It's extremely easy if you're immersed in Southern European culture.

Moru -> Flag of Sardinia, whose Wikipedia page incidentally I was reading yesterday (the Four Moors, "is cuatru morus") -> Sardinian language -> grammatically still the closest language to Latin -> the everlasting glory of the Roman Empire

---

Reminds me of the fantastic, unreleased Monty Python sketch about memory association: https://youtu.be/KnpY46lOTX4?si=3Yb17jvGp-1vn6de&t=2058

Also, Monty Python -> The Life of Brian -> the everlasting glory of the Roman Empire

sd9 · 2025-10-02T22:22:02 1759443722

I just lost the game

ZeWaka · 2025-10-02T23:33:37 1759448017

michaelsshaw · 2025-10-02T23:56:34 1759449394

Join the club of people who acknowledge that there's much more interesting history than that, and you'll suddenly forget all about Rome.

AlecSchueler · 2025-10-03T10:51:12 1759488672

For me it's not so much what's interesting as what affects my day to day. I love Chinese history but I'm unlikely to come across anything today with origins in Chinese law, or traverse the path of a Chinese road, or use an interesting word with a Chinese etymology and an associated story from old China.

sebastiennight · 2025-10-03T06:17:54 1759472274

I love learning new history and I'm open to suggestions. Any less-trodden paths you'd recommend?

michaelsshaw · 2025-10-03T09:50:25 1759485025

For national history, Chinese is probably my favorite by far.

May I suggest you do a domain-specific history dive, such as the history of computing, the history of science or some other subject you may enjoy more. That's the real good stuff.

thecupisblue · 2025-10-03T06:59:21 1759474761

It's so funny to see this be a worldwide phenomenon. As someone who grew up playing in the ruins of Roman temples & villas and was obsessed with it as a child, it almost feels like people are talking about "some other Rome".

sebastiennight · 2025-10-03T09:59:36 1759485576

I grew up in a school system which taught us about "our ancestors, the Gauls"...

Which is fun if you're an Asterix fan, but one day you end up asking yourself - wait, we're in an ex-French colony here, but how much Gaul blood does anyone have in this place really?

kruffalon · 2025-10-02T19:08:04 1759432084

Thank you for reminding me, this is so fun to have bobbing around in the back of your mind! :D

sebastiennight · 2025-09-29T18:32:11 1759170731

It... literally is?

Or otherwise, can you share what you think the ratio is?

emp17344 · 2025-09-29T18:35:56 1759170956

No, 1 is 1 more than 0. There’s a certain sense in which you could say that 1 is infinitely greater than 0, but only in an abstract, unquantifiable way. In this case, it doesn’t make sense to say you’re “infinitely more productive” because you’re producing something rather than nothing.

sebastiennight · 2025-09-30T06:33:18 1759213998

It goes like this:

"For any positive "x", is 1 x times greater than 0? Well, 0 times x is lower than 1, and 1 divided by x is larger than 0."

So his productivity increased by more than twice, more than ten times, more than a billion times, more than a googol times, more than Rayo's number. The only mathematically useful way to quantify it is to say his productivity is infinitely larger. Unless you want to settle for "can't be compared", which is less informative.

jama211 · 2025-09-29T19:07:26 1759172846

I just read it as a turn of phrase that says exactly that, that it means they produce something rather than nothing.

stavros · 2025-09-30T00:35:21 1759192521

Only if you think that the phrase "two times more productive" is also nonsensical.

Fraterkes · 2025-09-29T18:45:13 1759171513

I think it's a pedantic point, but maybe they just meant that talking about 1 being multitudes greater than 0 implies multiplication. And since 1/0 is undefined that doesn't make much sense.

inopinatus · 2025-09-29T18:45:40 1759171540

Someone attributing all of their productivity to a given tool and none to their own ingenuity and experience is allocating 100% credit to that tool.

It is not a ratio, it is a proportion.

sebastiennight · 2025-09-27T18:54:21 1758999261

Beautifully amazing. Though I never got a chance to send a prompt, as all the rooms appear full to me. It's a bit frustrating to have no indication whether you're going to be in a queue, or if it's just random chance.

streetmeat · 2025-09-27T18:57:09 1758999429

Thanks! Sorry about that, I didn't expect so much usage, and with my 300 dollars of cloud credit I don't want to add more rooms cuz this is already only going to last the afternoon.

The queue is FIFO and resets when the round does. If you get in it, it adds a line above the text box letting you know your current position in the queue.

sebastiennight · 2025-09-27T07:58:41 1758959921

For things like this (extracting precise computed data from unstructured blobs) I find that it's often more effective to ask your AI tool to provide a program (I usually ask for a HTML page with a JS form, or a bookmarklet) that can do the actual math.

Otherwise you're just as likely to be getting hallucinated answers based on the AI model's existing biases and training (if it's an American model, it might start telling you the sentenced convicts are young male and non-white even without looking at the data on the page).

arcfour · 2025-09-27T11:15:35 1758971735

I have noticed AI getting much better at correctly doing math than it used to be. Not perfect, but nearly so, and a far cry from this being required for most simple math calculations. (My experience is largely with Claude.)

sebastiennight · 2025-09-28T10:52:05 1759056725

I've heard that there's been experimenting around giving thinking models access to tools inside of the "thinking" part, so that e.g. calculations could use a Python interpreter, which would give the illusion that the model did the math correctly.

Not sure if it's just OpenAI or if Anthropic has tried this too.

arcfour · 2025-09-28T12:27:41 1759062461

They have that, but I have been reading that new models are so good at math already (solving complex math problems) I am guessing it's generally not needed?

sebastiennight · 2025-09-28T20:25:18 1759091118

There is a large conceptual gap though, between "solving a complex math problem" which is navigating through logic/reasoning, versus "correctly predicting the next token in the multiplication of 2 or more large numbers".

E.g. if we've already worked out premises that A is larger than B, and 2C is smaller than B, then you can easily compute the next token in the sentence "Therefore C is..."

versus computing 123,287,211 times 971,222, where computing the first token is non-trivial, but computing what comes after "11973" in the result is even less obvious. (it would be tremendously easier if you were predicting the result backwards, starting with the last digit).
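A quick sketch of why the direction matters, using the same two numbers. It shows the arithmetic only and says nothing about how any particular model tokenizes digits; the helper name `last_digits_of_product` is made up for illustration.

```python
a, b = 123_287_211, 971_222
product = a * b


def last_digits_of_product(x: int, y: int, k: int) -> int:
    """Last k digits of x*y, using only the last k digits of each factor."""
    m = 10 ** k
    return ((x % m) * (y % m)) % m


# Right to left: the last 3 digits of the result depend only on the
# last 3 digits of each factor, so they can be settled locally.
tail_matches = last_digits_of_product(a, b, 3) == product % 1000

# Left to right: keeping only the leading 3 digits of each factor is not
# enough to get even the first 4 digits of the result right, because
# carries from the lower digits propagate upward.
rough = (a // 10**6) * (b // 10**3)  # 123 * 971
head_matches = str(rough)[:4] == str(product)[:4]

print(tail_matches, head_matches)
```

So the trailing digits can be produced from a small local slice of the input, while every leading digit may depend on all of the digits below it.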

There is some evidence that models actually "plan ahead" somewhat (something like guessing more than one token at a time, eg. when writing a line in a poem, the model has an "idea" of what the ending word will be) but there are limits to the reliability of that, vs. using a calculator tool.

sebastiennight · 2025-09-26T18:13:35 1758910415

All it sees is a big blob of text, some of which can be structured to differentiate turns between "assistant", "user", "developer" and "system".

In theory you could attach metadata (with timestamps) to these turns, or include the timestamp in the text.

It does not affect much, other than giving the possibility for the model to make some inferences (eg. that previous message was on a different date, so its "today" is not the same "today" as in the latest message).

To chronologically fade away the importance of a conversation turn, you would need to either add more metadata (weak), progressively compact old turns (unreliable) or post-train a model to favor more recent areas of the context.
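A minimal sketch of the "include the timestamp in the text" option, assuming each turn is a plain dict with hypothetical `role`, `ts` (Unix seconds) and `text` fields. This is not any vendor's chat API, just the flattening step.

```python
from datetime import datetime, timezone


def render_turns(turns: list[dict]) -> str:
    """Flatten chat turns into one text blob, prefixing each with its timestamp."""
    lines = []
    for turn in turns:
        stamp = datetime.fromtimestamp(turn["ts"], tz=timezone.utc)
        label = stamp.strftime("%Y-%m-%d %H:%M UTC")
        lines.append(f"[{turn['role']} @ {label}] {turn['text']}")
    return "\n".join(lines)
```

The model still only sees text, so the timestamps are just more tokens it may or may not pay attention to.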

sebastiennight · 2025-09-26T09:06:14 1758877574

Bold of you to think that all of these humans would need to be involved, vs. you getting a call from your sister's assistant directly