For an example of a reward model that doesn't include "C" explicitly consider a reward model defined to be the count of the one bits in letters in the input. It would define a reward for "C" but "C" doesn't show up explicitly, because the reward had universal reach and "C" was among its members as a result.