So, the AWS Lambda in the data product example is a bit of a red herring: it's only used as the outer process to create branches and launch Bauplan pipelines through the Python client (https://github.com/BauplanLabs/data-products-with-bauplan/bl...).
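To make the "outer process" point concrete, the handler boils down to roughly this - a minimal sketch, not a copy of the linked repo: the branch and project names are made up, and the client method names follow the usual pattern but you should check the SDK docs for exact signatures.

    # Hedged sketch: the Lambda is only the "host" process; all compute runs on
    # the Bauplan platform. Names and signatures are illustrative, not verbatim.
    import bauplan


    def handler(event, context):
        client = bauplan.Client()  # credentials picked up from the environment

        # Isolate the run on its own branch of the lakehouse (name is made up)
        branch = "etl.every_five_minutes"
        if not client.has_branch(branch):
            client.create_branch(branch, from_ref="main")

        # Launch the pipeline: the dockerized functions execute on the platform,
        # the Lambda just blocks on the client call and returns
        client.run(project_dir="./pipeline", ref=branch)
        return {"status": "ok", "branch": branch}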
It can be your laptop, an Airflow task, a Prefect flow, a Step Function, or a cron job on a VM - it's the "host" process (for the data product we picked Lambda because it's the easiest way for people to "run small Python stuff every 5 minutes" - here is a Prefect example: https://www.prefect.io/blog/prefect-on-the-lakehouse-write-a...).
When you interact with the Bauplan lakehouse, all the compute happens on Bauplan, nothing happens in the Lambda: think of launching a Snowflake query from a Lambda - the client lives in the Lambda, but all the work is done in the Snowflake cloud. Unlike many (all?) other lakehouses, Bauplan is code-first, so you can program the entire branching and merging pattern with a few lines of code, offloading the runtime to the platform.
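For instance, a write-audit-publish flow is just a few client calls. Again a sketch: the quality check and the exact method signatures here are assumptions for illustration.

    import bauplan

    client = bauplan.Client()

    # Branch the lakehouse (zero-copy) so writes stay isolated from main
    client.create_branch("audit_branch", from_ref="main")

    # Run the pipeline against the branch; the compute happens on the platform
    client.run(project_dir="./pipeline", ref="audit_branch")

    # Audit the result on the branch before anyone else can see it
    # (the check below is illustrative; query returns an Arrow table)
    table = client.query("SELECT COUNT(*) AS n FROM my_table", ref="audit_branch")
    if table.to_pylist()[0]["n"] > 0:
        # Publish: merge the branch back into main
        client.merge_branch(source_ref="audit_branch", into_branch="main")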
The platform itself runs on standard EC2 instances, which run the dockerized functions needed for execution - typically we manage the EC2 in a single-tenant, PrivateLink, SOC 2-compliant account we own, for simplicity, but nothing prevents the VMs from living somewhere else (as long as connectivity is fine, etc.). Our philosophy is that you should not have to worry about the infra side, so even in the BYOC case we take care of managing it.
Does it help clarify the mental model?