Yeah, often the best way to tackle exactly-once delivery to a database is a uniqueness constraint, but that isn't free: there's the index cost, the additional write cost, and the collision error has to be handled when it's thrown back to the client (something many applications don't handle well).
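To make the collision handling concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `processed`, the `message_id` column, and the helpers `open_store` and `record_once` are made up for illustration. A real database and driver will raise a different exception type, but the shape is the same: let the constraint reject the duplicate and treat that error as "already done" instead of letting it bubble up as a failure.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store; the PRIMARY KEY is the uniqueness constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed ("
        "message_id TEXT PRIMARY KEY, "
        "payload TEXT NOT NULL)"
    )
    return conn


def record_once(conn: sqlite3.Connection, message_id: str, payload: str) -> bool:
    """Insert the message; return False if this message_id was already stored."""
    try:
        # 'with conn' commits on success and rolls back if the insert raises.
        with conn:
            conn.execute(
                "INSERT INTO processed (message_id, payload) VALUES (?, ?)",
                (message_id, payload),
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate delivery: the constraint fired, so skip instead of failing.
        return False


if __name__ == "__main__":
    store = open_store()
    print(record_once(store, "msg-1", "hello"))  # first delivery
    print(record_once(store, "msg-1", "hello"))  # redelivery of the same message
```

Note the costs from above are all still there: the primary key is an index that has to be maintained on every write, and every caller has to know that a `False` return is not an error.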
Stateful services are far harder to scale than stateless ones. Typically a stateless service can be scaled out horizontally with relative ease, but once state storage is involved this becomes harder. You can scale vertically, but only so far. You can scale horizontally, but then you typically need to introduce some sort of sharding, consistent hashing, etc. so that each service instance has a consistent view of the world without needing to connect to every database instance.
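For what the consistent hashing part might look like, here is a rough sketch of a hash ring in Python. The `HashRing` class, the `db-N` node names, and the choice of 64 virtual nodes are all assumptions for illustration, not anything from a specific library. The idea is that every service instance can compute locally which database instance owns a key, so it only needs to talk to that one.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node is placed at several points to even out the key spread.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = HashRing(["db-0", "db-1", "db-2"])
    print(ring.node_for("user:42"))
```

The useful property is that adding a node only moves the keys that the new node takes over; everything else stays where it was. That helps with rebalancing, but you still have to migrate the moved keys' data, which is part of why this is harder than scaling a stateless service.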
Not sure where the expectation of things being free comes from.
If your starting point is stateless then you can consider the tradeoff of introducing state vs. processing the same request multiple times.