Hacker News

Awesome! This is a huge opportunity to help a lot of people (clients, subcontractors and builders). A lot of money and time is wasted on the current inefficiencies. We tried parsing construction plans for takeoffs in 2022-2023 but couldn’t get the AI part to work well enough (and still haven’t been able to, even with the latest ViT/CLIP models). There was a lot of interest though!

- You’re right, data is very hard to come by. I’m curious, how do you plan to get around this? Outsourcing human labeling? We found it to be a very difficult task.

- The subcontractors and local construction companies we talked to were overwhelmingly excited about the idea.

- For some people, getting this done, and done correctly, is their entire job. They sit on site holding the PDFs in their hands, manually counting and calculating. You can bet a lot of mistakes occur. They would absolutely love to have a digital assistant for this.

- Some of them (especially managers and owners) are quite technical and are using Bluebeam and other CAD software to make these calculations. It’s quite manual currently, but it gives great insight into a better solution. This led us to have the user manually select the symbol they wanted counted (something ML struggled to get right). Just getting the part counts (and highlighting them in the PDF) was a huge help!

- Impressive that you got square footage calculations correct! In our experience, there was way too much variation between architects (and multistep dimension labeling), which made it hard (even for humans) to get right. How has your model generalized OOD thus far?

- Are you planning to integrate voice? Many of the subcontractors we worked with are very low-tech. They usually talk with their clients in person, on the phone, or maybe by text. But they don’t use email or their smartphones for much.

I will be following your work! I have friends who would love to use this once it passes the human threshold.




I think parsing a whole blueprint with monolithic models is really difficult, but the constrained object detection/semantic segmentation problems are significantly more tractable. You can chain those CV models with VLMs to do things like get scale right. I'm always interested in novel HCI paradigms like voice!



