
In what kinds of workloads or usage patterns do you see the biggest performance gains vs traditional FaaS + storage stacks?


In a nutshell, data and AI workloads require fast rebuilding and vertical scaling:

1) you should not need to redeploy a Lambda just because you're now processing January and February instead of only January. In the same vein, you should not need to redeploy one if you upgrade from pandas to Polars: rebuilding functions is 15x faster than Lambda and 7x faster than Snowpark (-> https://arxiv.org/pdf/2410.17465)

2) the only way (even in popular orchestrators like Airflow, not just FaaS) to pass data around in DAGs is through object storage, which is slow and costly: we use Arrow as the intermediate data format, in memory and over the wire, with a bunch of optimizations around caching and zero-copy sharing to make the development loop extra fast and the compute usage efficient! (A minimal sketch of the zero-copy idea is below.)
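
Not Bauplan's actual internals, but to make the zero-copy point concrete, here is a minimal pyarrow sketch of two DAG steps sharing an Arrow table through a memory-mapped IPC file instead of round-tripping through object storage (the file name and data are made up):

    import pyarrow as pa
    import pyarrow.ipc as ipc

    # Step 1 of the DAG produces an Arrow table and writes it as an IPC file.
    table = pa.table({"month": ["2025-01", "2025-02"], "rides": [120, 95]})
    with pa.OSFile("step1_output.arrow", "wb") as sink:
        with ipc.new_file(sink, table.schema) as writer:
            writer.write_table(table)

    # Step 2 memory-maps that file: the Arrow buffers are backed by the OS
    # page cache, so the table is read without copying the data into Python.
    with pa.memory_map("step1_output.arrow", "r") as source:
        shared = ipc.open_file(source).read_all()

    print(shared.num_rows)  # 2, with no Parquet round-trip through S3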

Our current customers run near real-time analytics pipelines (Kafka -> S3 / Iceberg -> Bauplan run -> Bauplan query), DS / AI workloads, and WAP (write-audit-publish) patterns for data ingestion.
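
For readers unfamiliar with WAP, here is a toy, filesystem-only sketch of the pattern (not how Bauplan implements it, just the shape of the idea): land the new batch in a staging location, audit it, and only publish it if the check passes.

    import os
    import shutil
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Toy write-audit-publish: paths, data, and the audit rule are illustrative.
    os.makedirs("stage", exist_ok=True)
    os.makedirs("prod", exist_ok=True)
    staging, published = "stage/batch.parquet", "prod/batch.parquet"

    # Write: the new batch lands in staging only.
    batch = pa.table({"id": [1, 2, 3], "amount": [10.0, 12.5, None]})
    pq.write_table(batch, staging)

    # Audit: reject the batch if any amount is null.
    if pq.read_table(staging).column("amount").null_count == 0:
        # Publish: promote the audited batch so downstream consumers see it.
        shutil.move(staging, published)
    else:
        print("audit failed, batch stays unpublished")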


What does the tech stack look like?


Is the course focused on LLMs used to generate text, or does it also cover other kinds of testing, like search, images, etc.?


Cool! What's next on the roadmap?


The main thing we need to add is metadata filtering, as that's required for a lot of use cases. We're also thinking about adding hybrid search support and multi-factor ranking.
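
Not from their codebase, but to make "hybrid search" concrete: a toy sketch of blending a dense (vector) score with a lexical (BM25-style) score, with made-up documents, scores, and weight.

    # Toy hybrid ranking: combine a dense (vector) score with a lexical score.
    def hybrid_score(dense, lexical, alpha=0.6):
        # alpha=1.0 -> pure vector search, alpha=0.0 -> pure keyword search
        return alpha * dense + (1 - alpha) * lexical

    candidates = {
        "doc_1": {"dense": 0.82, "lexical": 0.40},
        "doc_2": {"dense": 0.65, "lexical": 0.90},
    }
    ranked = sorted(
        candidates.items(),
        key=lambda kv: hybrid_score(kv[1]["dense"], kv[1]["lexical"]),
        reverse=True,
    )
    print(ranked)  # doc_2 ranks first once lexical relevance is weighted in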


Congrats on the launch! I'm an inveterate search nerd, so I can't help but ask: how did you implement search?


What are you using for the search tech?


Cassandra and Elasticsearch


That statement is both kind of true and, well, revisionist. Originally there was a strong focus on logics, clean and comprehensive modeling of the world through large, complicated ontologies, the adoption of super impractical representation languages, etc. It wasn't until rebellious sub-communities went rogue and pushed for pragmatic simplifications that any of it got widespread impact at all. So here's to the crazy ones, I guess.


One thing that works well for me is going working-to-working. Get a simple, de-scoped, incomplete, probably crappy version done end-to-end. Now it's not about finishing, it's about improving. And if it was worth building in the first place, it will beg for improvement. And then it's easier to just keep turning the crank, working-to-working.


In The Pragmatic Programmer, this is called the "tracer bullet" approach. It's basically another name for end-to-end, but with the emphasis that by completing a slice end-to-end you get feedback much faster (like the feedback tracer bullets give you when shooting at a target).


So... MVP?


Hmm, how about... End-to-end: "can we really implement it?". Tracer-bullet: "can we adjust our aim?". MVP: "can we make a user happy?". Though there's more to each.

So then market-fit exploration benefits from tracer-bullet, but you might do iterative reimplementation instead. And it might be overkill for a one-shot market test. MVP is largely orthogonal to end-to-end - "Submit button emails the founder, who does the thing by hand over breakfast" is a fine MVP but isn't very end-to-end. And a non-MVP, non-product end-to-end can be tracer-bullet or not: a soundly architected, low-debt end-to-end, yes; a hackathon-grade, high-debt one (where adjusting means rewriting) or a throwaway end-to-end exploratory spike, no.


Hey @afandian, can I interview you about the tech stack you all used to build search at Crossref.org?


Always happy to share more details about how we build open scholarly infrastructure! Can do here, or reply with contact details.


Are you planning to implement search and filter tags?

