It's hard to say what length should return, because it could mean different things. How many u8 values are there? How many graphemes are present?
If it's just counting how many u8s are in a slice, that's trivial. But storing the total number of variable-length graphemes isn't. I assume that's why Data.Text's length function is O(n), and why character-counting length functions in other languages with UTF-16 or UTF-8 strings are linear as well.
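To make the cost difference concrete, here is a minimal Python sketch (not how Data.Text or any particular library does it). The function name utf8_code_point_count and the sample string are made up for illustration. The byte length is stored with the buffer, while the number of code points has to be recovered by scanning every byte and skipping UTF-8 continuation bytes. Note that this counts code points, not graphemes, which would need the Unicode segmentation rules on top.

```python
def utf8_code_point_count(data: bytes) -> int:
    """Count code points in valid UTF-8 by skipping continuation bytes.

    Continuation bytes have the bit pattern 10xxxxxx; every other byte
    starts a new code point. The whole buffer must be visited: O(n).
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


# Sample text (an assumption for illustration): "naive" with a precomposed
# i-diaeresis, a space, and three CJK characters.
sample = "na\u00efve \u65e5\u672c\u8a9e"
encoded = sample.encode("utf-8")

byte_len = len(encoded)                       # O(1): stored with the buffer
code_points = utf8_code_point_count(encoded)  # O(n): linear scan

print(byte_len, code_points)
```

The two numbers differ as soon as the text contains anything outside ASCII, which is why "length" needs a stated unit.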
Graphemes? Strings can be expected to be normalized first, so that they contain the "right" kind of characters (in particular, plain letters followed by their combining diacritical marks versus a smaller number of precomposed accented letters), before anyone asks about their length, meaning how many characters they contain. If you want to count "graphemes", you can have a different function and possibly a separately cached result.
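A rough sketch of that idea in Python, using the standard unicodedata module. The CountedUtf8 class is hypothetical, invented here only to show a normalized string that carries its cached code point count next to its UTF-8 bytes; it is not an API from any library mentioned in this thread.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
precomposed = unicodedata.normalize("NFC", decomposed)  # single U+00E9

# Not every sequence has a precomposed form: 'q' + acute stays two code points.
no_precomposed = unicodedata.normalize("NFC", "q\u0301")


class CountedUtf8:
    """Hypothetical immutable string: normalize once, cache the count."""

    def __init__(self, raw: str, form: str = "NFC") -> None:
        text = unicodedata.normalize(form, raw)
        self.data = text.encode("utf-8")  # storage as UTF-8 bytes
        self.code_points = len(text)      # computed once, then an O(1) lookup


counted = CountedUtf8(decomposed)
print(len(decomposed), len(precomposed), len(no_precomposed))
print(len(counted.data), counted.code_points)
```

The caveat is that normalization only narrows the gap: sequences without a precomposed form still take several code points, so a true grapheme count would remain a separate function with its own cached result.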
That's the theory. In practice, it is readily apparent from the Data.Text page at https://hackage.haskell.org/package/text-1.2.2.2/docs/Data-T... that the authors care about performance only selectively ("fusion" of buffer allocations is fun; the relatively messy data structures needed to cache important data such as string length are not), and that they don't care enough about Unicode to add serious string-level abstractions over the Unicode tables exposed in the Char type.