cccybernetic · 2025-05-21T13:29:22 1747834162

An incredibly written film that's infinitely quotable.

"Balls. We want the finest wines available to humanity. We want them here, and we want them now!"

"Here Hare Here"

"I feel like a pig shat in my head."

"I don't advise a haircut, man. All hairdressers are in the employment of the government. Hair are your aerials. They pick up signals from the cosmos and transmit them directly into the brain. This is the reason bald-headed men are uptight."

piltdownman · 2025-05-21T16:34:25 1747845265

While most would focus on Withnail and Danny's more on-the-nose moments, it's really Uncle Monty who steals every scene he features in.

In my opinion he is simply one of the most beautifully realised characters in all of comedy, his pompous and lugubrious Oxbridge eccentricities representing the last vestiges of a more genteel era that Withnail had just missed out on being a beneficiary of (Free to those who can afford it, very expensive to those who can't).

I can't dispute that his most quotable moments (and delivery) arise in his impeccable introduction, with the vainglorious delivery of his Hamlet monologue one of the most succinct and hilarious summations of a character in cinema:

'It is the most shattering experience of a young man's life when one morning he awakes and quite reasonably says to himself, "I will never play the Dane."'

Ditto the magnificent upper-class eccentricity of his agrarian/sexual aspirations:

"I think the carrot infinitely more fascinating than the geranium. The carrot has mystery. Flowers are essentially tarts. Prostitutes for the bees. There is a certain je ne sais quoi - oh, so very special - about a firm, young carrot...Excuse me..."

However, it is in his more maudlin moments that his dialogue - and the writing - truly shines, mourning the fin de siècle in his own way, with macabre little monologues hinting at his deep reserves of loss and despair.

'Oh my boys, my boys, we are at the end of an age! We live in a land of weather forecasts and breakfasts that set in, shat on by Tories, shovelled up by Labour, and here we are, we three; perhaps the last island of beauty... in the world'

My favourite, however, has to be the little appendix to his anecdotes of Oxford and his 'sensitive crimes' in a punt with his poetry-reciting lover:

"I sometimes wonder where Norman is now. Probably wintering with his mother in Guildford. A cat, rain, Vim under the sink, and both bars on. But old now, old. There can be no true beauty without decay."

Sheer brilliance, with its fingerprints all over British comedy almost 40 years on - with particular reference to the likes of Armando Iannucci, Edgar Wright, Simon Pegg, David Mitchell/Robert Webb, and Richard Ayoade, covering movies like 'Submarine' and TV like 'Peep Show' and 'Spaced' most notably.

wyclif · 2025-05-21T17:12:56 1747847576

Well said, piltdownman. Don't forget Harry Enfield, though!

sumo89 · 2025-05-22T08:35:41 1747902941

"We've gone on holiday by mistake" is forever in my head.

cccybernetic · 2025-02-05T19:36:10 1738784170

Most PDF parsers give you coordinate data (bounding boxes) for extracted text. Use these to draw highlights over your PDF viewer - users can then click the highlights to verify if the extraction was correct.

The tricky part is maintaining a mapping between your LLM extractions and these coordinates.

One way to do it would be with two LLM passes:

  1. First pass: Extract all important information from the PDF
  2. Second pass: "Hey LLM, find where each extraction appears in these bounded text chunks"

Not the cheapest approach since you're hitting the API twice, but it's straightforward!
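A rough sketch of what the second pass could look like, assuming your parser already gives you text chunks with bounding boxes. Everything here is hypothetical naming (`Chunk`, `build_locator_prompt`, `locate`), and `ask_llm` is just a stand-in for whatever client you use to call a model:

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Chunk:
    chunk_id: str
    page: int
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1) as reported by the PDF parser
    text: str


def build_locator_prompt(extractions: dict[str, str], chunks: list[Chunk]) -> str:
    """Second-pass prompt: ask the model which chunk holds each extracted value."""
    listing = "\n".join(f"[{c.chunk_id}] {c.text}" for c in chunks)
    return (
        "For each extracted field, reply with a JSON object mapping the field "
        "name to the id of the chunk that contains its value, or null.\n\n"
        f"Extracted fields:\n{json.dumps(extractions, indent=2)}\n\n"
        f"Chunks:\n{listing}"
    )


def locate(
    extractions: dict[str, str],
    chunks: list[Chunk],
    ask_llm: Callable[[str], str],  # hypothetical: takes a prompt, returns the model's reply
) -> dict[str, Chunk | None]:
    """Map each extracted field to the chunk (and so the bbox) the model points at."""
    by_id = {c.chunk_id: c for c in chunks}
    try:
        answer = json.loads(ask_llm(build_locator_prompt(extractions, chunks)))
    except json.JSONDecodeError:
        answer = {}
    if not isinstance(answer, dict):
        answer = {}
    result: dict[str, Chunk | None] = {}
    for field in extractions:
        chunk_id = answer.get(field)
        # Ids the model made up, or fields it skipped, resolve to None
        result[field] = by_id.get(chunk_id) if isinstance(chunk_id, str) else None
    return result
```

Fields that come back as `None` are the ones worth flagging in the viewer, since there is no highlight to click on for them.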

Jimmc414 · 2025-02-05T19:49:12 1738784952

Here's a PR that hasn't been accepted yet for some reason but seems to be having some success with the bounding boxes:

https://github.com/getomni-ai/zerox/pull/44

Related to

https://github.com/getomni-ai/zerox/issues/7

cccybernetic · 2025-02-05T19:27:06 1738783626

Shameless plug: I'm working on a startup in this space.

But the bounding box problem hits close to home. We've found Unstructured's API gives pretty accurate box coordinates, and with some tweaks you can make them even better. The tricky part is implementing those tweaks without burning a hole in your wallet.

xiaofei_ · 2025-02-17T18:07:17 1739815637

How is their API priced? I checked a few months ago and remembered it being expensive.

cccybernetic · 2024-12-10T18:16:59 1733854619

I built a web app that extracts data from documents, like PDFs, Word, etc. I've seen people say "GPT wrapper", but it consistently outperforms similar tools in the space. My main customer is a private equity fund that randomly reached out. I didn't know much at all about fintech, but it works and gets the job done.

I don't have a proper marketing site yet since I've been focused on building the app, but it's coming soon (hopefully...)

giarc · 2024-12-10T18:26:40 1733855200

How do you reduce errors or hallucinations? I recently uploaded a very clear PDF to meta.ai and asked it a few, very simple questions. It completely made up quotes, including page numbers, section numbers etc.

cccybernetic · 2024-12-10T18:59:51 1733857191

I don't feed documents directly to an LLM. First, extract and process the data in a structured way that maintains the hierarchy and metadata of the content (this is important!). Then convert this into a scheme that you can control — it doesn’t really matter what it is (JSON, XML, markdown). From there, feed this to the LLM in chunks. This will get you most of the way there.
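To make that concrete, here is a minimal sketch of the chunking step, assuming JSON as the format and a simple character budget per chunk. The `Block` fields and the `to_chunks` helper are illustrative names, not what any particular tool uses:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Block:
    block_id: str
    page: int
    section: list[str]  # heading path, e.g. ["2 Results", "2.1 Revenue"]
    text: str


def to_chunks(blocks: list[Block], max_chars: int = 2000) -> list[str]:
    """Serialise blocks to JSON and group them into chunks under a size budget.

    Each block keeps its id, page and heading path, so anything the LLM says
    about a chunk can be traced back to a place in the document.
    """
    chunks: list[str] = []
    current: list[dict] = []
    size = 0
    for block in blocks:
        item = asdict(block)
        item_len = len(json.dumps(item))
        # Start a new chunk when adding this block would exceed the budget
        if current and size + item_len > max_chars:
            chunks.append(json.dumps(current))
            current, size = [], 0
        current.append(item)
        size += item_len
    if current:
        chunks.append(json.dumps(current))
    return chunks
```

A block larger than the budget still gets its own chunk rather than being split, which keeps the metadata intact at the cost of an occasional oversized chunk.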

There are different ways to validate, but that's why maintaining hierarchy and metadata is so important. If you track this information properly, you can cross-check responses across different LLMs!
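One possible shape for that cross-check, as a sketch only: ask each model to return its answer together with the id of the block it took it from, then accept the answer only if the cited block really contains it and the models agree. The `Answer` type and both functions are made up for illustration:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Answer:
    value: str     # what the model claims the document says
    block_id: str  # the source block the model cites for it


def is_grounded(answer: Answer, blocks: dict[str, str]) -> bool:
    """True if the cited block exists and literally contains the value."""
    value = answer.value.strip()
    text = blocks.get(answer.block_id, "")
    return value != "" and value.casefold() in text.casefold()


def cross_check(answers: list[Answer], blocks: dict[str, str]) -> str | None:
    """Return the value only if every model is grounded and they all agree."""
    if not answers or not all(is_grounded(a, blocks) for a in answers):
        return None
    values = {a.value.strip().casefold() for a in answers}
    return answers[0].value.strip() if len(values) == 1 else None
```

The literal substring test is deliberately strict. It will reject paraphrased answers, so it suits fields like amounts, dates and names better than free-text summaries.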

gcanyon · 2024-12-11T05:38:35 1733895515

I'd like to learn more -- please email me (link in profile).

acrooks · 2024-12-10T22:53:24 1733871204

I'm interested. Can you email me (address in profile)?

laylower · 2024-12-11T11:34:48 1733916888

Could you please link your website?

cccybernetic · 2024-12-11T22:59:32 1733957972

Sure, you can try the demo at:

https://www.subsystem.ai/demo

cccybernetic · 2024-11-18T16:21:30 1731946890

I built a drag-and-drop document converter that extracts text into custom columns (for CSV) or keys (for JSON). You can schedule it to run at certain times and update a database as well.

I haven't had issues with hallucinations. If you're interested, my email is in my bio.

cccybernetic · on Aug 16, 2024

I haven't seen it framed this way, but yeah - well put.

cccybernetic · on July 16, 2024

This is a problem I’m working on.

I’m a software engineer at a major US research university developing AI-powered software to improve critical reading and writing skills in higher ed. The idea is to provide immediate, high-quality feedback to students, closing the “latency” of submitting something and waiting to hear back from your professor.

I do genuinely think AI can reshape teaching and learning, but it will be a slow iterative process. We can use it to scale what works (personalized learning and tutoring, helping students develop mastery/automaticity on topics, targeting areas where they struggle). It can also automate time-consuming tasks that bog teachers down.

If you're interested in pedagogy, AI, and tech, please reach out.

ilamont · on July 16, 2024

> I’m a software engineer at a major US research university developing AI-powered software to improve critical reading and writing skills in higher ed.

Oftentimes, the root cause of the critical reading problem is the quality of the writing that students are subjected to. My daughter recently showed me one of her economics readings, and said she couldn't understand it. It was 40 pages of convoluted academic writing like this:

Wibbels argues that developing countries face an inherently disadvantaged position in the world economy due to their dependence on foreign capital and an undiversified base of commodity exports as primary sources of hard currency. This dependent position relative to capital markets prevents developing countries from borrowing to engage in counter-cyclical aggregate demand management.

Is such language the optimal way to express ideas for comprehension by peers, students, and policymakers?

I hope your mission to improve writing skills in higher ed addresses the source of output - professors, teaching assistants, journal editors, and others who continue to promote outdated, inconsistent, and counterproductive academic writing styles.

Merik · on July 17, 2024

Claude 3.5 to the rescue:

“Wibbels claims that developing countries are at a disadvantage in the global economy for two main reasons:

1. They rely heavily on foreign investment.

2. They depend on exporting a limited range of raw materials to earn foreign currency.

Because of this weak economic position, developing countries struggle to borrow money when needed. This makes it hard for them to boost their economies during economic downturns, unlike wealthier nations that can more easily borrow and spend to stimulate growth.”

notachatbot1234 · on July 17, 2024

Are those really the same?

- "commodity exports" -> "raw materials"?

- "hard currency" -> "foreign currency"?

corimaith · on July 17, 2024

This does illustrate a problem: talking about complex topics or mechanisms requires specificity. Using short, simple sentences comes at the risk of making things seem overly vague and hand-wavy, or worse, misrepresenting the concept.

In continental philosophy or mathematical papers this gets all too apparent, as a lot of arguments hinge on very fine differences and nuances that need to be specified, or else people get the wrong idea.

pants2 · on July 17, 2024

Wow, I would have killed to have access to something like Claude when I was in school. I would have spent a lot less time stuck on problems or topics.

km3r · on July 17, 2024

As an interesting example, I passed that passage to ChatGPT 3.5 with the prompt "make the following passage much simpler to understand":

Wibbels says that poorer countries are in a tough spot in the global economy. They rely a lot on money from other countries and mainly sell raw materials to get cash. Because they're so reliant on foreign money, they can't borrow much to manage their economy when things are going bad.

This is a pretty basic prompt too; you can add qualifiers to fit your general audience: "easier to understand for a high school student". You could provide context of previous and subsequent passages as well. Yeah, it's not perfect, but the right prompt could provide at least a consistent output style.
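For what it's worth, templating this is only a few lines. A quick sketch, with a made-up function name and wording that you would want to tune for your own model:

```python
def build_simplify_prompt(passage, audience=None, before=None, after=None):
    """Assemble a simplification prompt with an optional audience and context."""
    instruction = "Make the following passage much simpler to understand"
    if audience:
        instruction += f" for {audience}"
    parts = [instruction + "."]
    # Surrounding passages are included as context only, not as text to rewrite
    if before:
        parts.append(f"Preceding context (do not rewrite):\n{before}")
    parts.append(f"Passage:\n{passage}")
    if after:
        parts.append(f"Following context (do not rewrite):\n{after}")
    return "\n\n".join(parts)


prompt = build_simplify_prompt(
    "Wibbels argues that developing countries face an inherently disadvantaged position...",
    audience="a high school student",
)
```

Keeping the instruction fixed and only swapping the audience is what would give you the consistent output style across readings.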

freejazz · on July 17, 2024

>Is such language the optimal way to express ideas for comprehension by peers, students, and policymakers?

Those are three completely different groups with completely different needs.

throwaway2037 · on July 17, 2024

This sounds great.

I have a roommate who is a weak non-native English speaker. However, he needs to write and submit scientific papers in English. He uses the "pro" version of ChatGPT to improve his written English. He said it is like having a 1:1 English language tutor because he gets nearly instant feedback when trying to rewrite a sentence or paragraph. I am native in English. He showed me some before and after examples. His message remained unchanged, but the revised versions were so much smoother. My point: He is not using ChatGPT to "cheat", rather to improve delivery of his message. I wrote about this previously here on HN. It received very mixed reviews.

3D30497420 · on July 17, 2024

I'm using ChatGPT to help me learn German. It is good, but as with all current AI it has a tendency to be very confidently wrong. This is especially true for more nuanced grammatical questions, such as "Why is this in dativ?" For that reason, I never feel I can fully trust it (and certainly wouldn't build a language-learning product around it). With that said however, it is a great addition to the various tools I use.

I think what your roommate is doing is fine. Being a native English speaker is a bit of a cheat anyhow. I very much appreciate the challenge of learning another language, so am not going to fault someone for using different tools to help improve their language. So long as the ideas are (relatively) original, the specific wording seems less important.

__loam · on July 16, 2024

I'm glad I got my degrees before people started trying to integrate bullshit generators into my education. I've been really frustrated with the conversation about the potential applications for this technology. These chatbots have no relationship with the truth or with knowledge, and are designed to agree with users and act accommodating regardless of how wrong someone is. We're talking about putting this tech between patients and doctors, students and teachers, and meanwhile McDonald's is rolling back deployments because it can't even take a fast food order accurately.

Merik · on July 17, 2024

I think you’re confusing the technology with a product developed using that technology. The prevalence of poorly implemented products, or the lack of fit of some products to a particular target market, does not inherently provide evidence for conclusions about the technology itself.

xanderlewis · on July 17, 2024

The lack of successful implementation is surely at least evidence that the technology might not be living up to the hype, though — no?

It’s like “but that wasn’t real communism”.

champdebloom · on July 17, 2024

I’m a teacher turned web developer building tools to help other teachers automate their menial admin tasks. I’d love to chat when you have a moment!

cccybernetic · on July 17, 2024

Absolutely, email is in my profile. Please reach out.

cccybernetic · on May 7, 2024

To add to this, John Perry Barlow, one of the Dead's two main lyricists, co-founded the Electronic Frontier Foundation.

dekhn · on May 8, 2024

Barlow wrote the "Declaration of the Independence of Cyberspace" (https://www.eff.org/cyberspace-independence) in 1996, famous in its time but also somewhat poignant, given the commercialization of the internet not long after.

Barlow was also big on the WELL, an early influential time-sharing messaging system (https://en.wikipedia.org/wiki/The_WELL) and a Bay Area legend from a time when the Bay Area dominated in both technology and music. In many ways Hacker News is an intellectual inheritor of the WELL (along with Usenet and Slashdot).

cccybernetic · on May 8, 2024

Awesome, thanks for sharing.

cccybernetic · on March 28, 2024

This presumes a theory of justice where "social benefit" is relevant at all — not everyone accepts this.

cccybernetic · on March 15, 2024

This is the hardware version of the (in)famous Hacker News comment on Dropbox:

> you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.