Interesting. Daft currently validates types and column names only at runtime. The flow looks like:
1. Construct a dataframe (performs schema inference)
2. Access the (now well-typed) columns and apply operations to them, with each access and operation validated against the inferred schema.
Unfortunately, step (1) can only happen at runtime, not at type-checking time, because it requires running schema inference logic. Step (2) depends on step (1) because the expressions of computation are "resolved" against those inferred types.
However, if we can move (1) to type-checking time by using user-provided type hints in place of schema inference, we may be able to find a way to propagate this information through to mypy.
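To make the idea concrete, here is a rough sketch in plain Python of what "type hints in place of schema inference" could look like. This is not Daft's API: `UserSchema` and `TypedFrame` are hypothetical names made up for illustration, and the checks here still run at runtime. The point is only that the schema now lives in class annotations, which is information a static checker (or a mypy plugin) could read without running any inference.

```python
from typing import Any, Dict, get_type_hints


class UserSchema:
    """Hypothetical user-declared schema: column names mapped to Python types."""

    id: int
    name: str
    score: float


class TypedFrame:
    """Toy stand-in for a dataframe whose schema comes from type hints (not Daft's API)."""

    def __init__(self, schema_cls: type, data: Dict[str, list]) -> None:
        # Read the schema from the class annotations instead of inspecting the data.
        self.schema: Dict[str, Any] = get_type_hints(schema_cls)
        missing = set(self.schema) - set(data)
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        self.data = data

    def column_type(self, name: str) -> Any:
        # Resolve the column name against the declared schema, not an inferred one.
        if name not in self.schema:
            raise KeyError(f"unknown column {name!r}; declared: {sorted(self.schema)}")
        return self.schema[name]


df = TypedFrame(UserSchema, {"id": [1, 2], "name": ["a", "b"], "score": [0.5, 0.9]})
id_type = df.column_type("id")  # resolved from the annotation on UserSchema
```

Getting mypy itself to reject something like `df.column_type("nope")` would need more machinery than this (for example a plugin, or generics over the schema class), which is the open part of the question.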
Would love to continue the discussion further as an Issue/Discussion on our GitHub!