In the article `p` really does take a list, not a function! Haskell lists may be "backed by" functions, but those functions must be pure -- no back-channels. I'm not sure whether this fact illuminates or obscures the point.
Anyways you can't pass a closure to the predicate, and the Haskell version has no "back channels" other than termination. The idea is to arrange the recursive search so that the predicate (not the search) demands the bits. This is what terminates the recursion, and it's hard to arrange without laziness (except by introducing more functions as you say).
> type Cantor = Natural -> Bit
> (#) :: Bit -> Cantor -> Cantor
> x # a = \i -> if i == 0 then x else a (i - 1)
(You could represent bit streams with something like data Stream = Cons Bit Stream in Haskell, of course, but that's not what the article does. The code in the article works in e.g. Python almost as-is.)
And you can certainly pass closures to the predicate. Not closing over a mutable variable, of course -- I just mean that the bit stream function itself has to have access to p, which is the trick that makes this possible. If you couldn't do that -- if you could only query it with bit sequences that were defined independently of p -- this wouldn't be possible.
A lot of this theory is gamified "you pick YOUR function, THEN I'll pick MINE..." and the order is very important. Here representing sequences as functions only obscures. The important idea is that the sequence is lazy, BUT predetermined. It can't know what the predicate will be. That's where the black magic comes in.
> You could represent bit streams with something like data Stream = Cons Bit Stream in Haskell
I wish it did, it would be much clearer. The article makes that point itself:
"For the type of infinite sequences, we could use the built-in type of lazy lists..."
which is exactly the point you made (and which the article dismisses as insufficiently mathematical).
> If you couldn't do that -- if you could only query it with bit sequences that were defined independently of p -- this wouldn't be possible.
No, there's the rub! The bit sequence and `p` are independent. You choose your bit sequence, I choose `p`, and somehow execution finishes. That's exactly what makes this "seemingly impossible."
There is a back-door dependence where `p` forces a lazy sequence to materialize bits, but nothing more. The sequence cannot introspect `p`, and vice-versa.
I think this is my third time encountering this post, and it took me a few tries to get it!
On laziness, I only mean that the language feature isn't important. You can certainly represent infinite data using laziness (and you must represent it as some form of codata, such as a function), but it's not really the crucial thing here.
Here's a simpler version of seemingly impossible programs, which is better for explaining what I mean: Instead of bit streams, use conatural numbers, i.e. the natural numbers + infinity. You can define these in various ways (e.g. monotonic bit streams), but we'll use a regular Haskell type:
> data Conat = Z | S Conat
Ignoring bottoms (which we won't permit), this has all the natural numbers as inhabitants, but it also has infinity:
> inf :: Conat
> inf = S inf
Note that, if I give you a conatural, all you can do with it is peel off successors; you can't distinguish infinity from "a really big number", extensionally.
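To make that concrete, here's a small helper of my own (not from the article): `peek k n` peels at most `k` successors, which is the only kind of observation a conatural supports. Past any finite budget, infinity and a sufficiently large finite number look the same:

```haskell
data Conat = Z | S Conat  -- repeated so the snippet stands alone

inf :: Conat
inf = S inf

-- Peel at most k successors; Nothing means "still going after k steps",
-- which is all you can ever learn about a sufficiently large conatural.
peek :: Int -> Conat -> Maybe Int
peek _ Z     = Just 0
peek 0 _     = Nothing
peek k (S n) = fmap (+ 1) (peek (k - 1) n)
```

With a budget of five observations, `peek 5 inf` and `peek 5` of the number seven both come back `Nothing` -- the two are indistinguishable.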
A predicate on conaturals is a total function that takes a conatural and returns a boolean. As in the bit stream case, totality is a very strong requirement. For example, "is n even?" isn't a valid predicate, because the way to check evenness is to peel off pairs of successors until you reach 0 or 1, and that never halts on infinity. In particular, every total predicate must give up eventually: after seeing k successors (for some unknown k), it has to commit and return true or false. If there were no such k, the predicate wouldn't terminate on infinity.
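To sketch the difference (these example predicates are mine, not the article's): a bounded check like "is n >= 3?" is total, while the evenness check is not:

```haskell
data Conat = Z | S Conat  -- repeated so the snippet stands alone

inf :: Conat
inf = S inf

-- Total: inspects at most three successors, so it halts on
-- every conatural, including infinity (where it answers True).
atLeastThree :: Conat -> Bool
atLeastThree (S (S (S _))) = True
atLeastThree _             = False

-- NOT total, hence not a valid predicate: on infinity it
-- peels pairs of successors forever and never returns.
isEven :: Conat -> Bool
isEven Z         = True
isEven (S Z)     = False
isEven (S (S n)) = isEven n
```

`atLeastThree inf` returns True after three observations; `isEven inf` diverges.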
Now you can ask: Given a predicate, how can I decide whether there's any input that it holds for? And this is "seemingly impossible": You can test the predicate on infinity, and, say, on every finite number up to some N, but that doesn't give you any guarantee about the numbers you haven't tested. If all you can do is query p with any finite set of Conats -- even interactively, deciding what to ask it based on its previous responses! -- you don't get enough information to decide if the predicate is satisfiable.
The trick is to construct a special conatural which is defined in terms of p itself. If you think of this more imperatively, you can imagine a program that queries the predicate about increasingly larger natural numbers -- 0, 1, 2, 3, ... -- and prints out an "S" each time the predicate returns false, and a "Z" when the predicate returns true. If the predicate is false for all inputs, it prints out "S"s forever -- but that's still a valid (productive) conatural number.
This conatural is -- by construction -- the smallest input that satisfies p, if there is one, and infinity if there isn't. Now you can apply p to this number, and, if there's any input that satisfies p, p will return true.
The point I was making above is that it's crucial for this conatural to be able to call p. It doesn't introspect p, in the sense of being able to access its source code, but it's not independent either. And, although laziness expresses this elegantly, I wouldn't really say it's the core of the trick here. You could express the same thing using Turing machines that write out "S"s to their output tape, for example (as long as you don't require them to terminate overall, only to be productive). You do need to guarantee that the "conatural" Turing machine that you give to p has access to p (either by embedding its source code or by being able to call it as an oracle).
Here's the implementation in Haskell (which is simpler than the bit stream case given in the article, and doesn't need mutual recursion):
> -- find p returns a conatural that satisfies p, if one exists.
> -- (If one doesn't exist, it returns infinity.)
> find :: (Conat -> Bool) -> Conat
> find p = if p Z then Z else S (find (p . S))
> -- Does any conatural satisfy the predicate?
> exists :: (Conat -> Bool) -> Bool
> exists p = p (find p)
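Just to check the behavior (the example predicate `atLeastThree` is my own, not from the article): `exists` answers true when a finite witness exists, and -- thanks to laziness -- answers false without ever forcing the infinite conatural that `find` builds when the predicate is nowhere true.

```haskell
data Conat = Z | S Conat

-- The search from above, repeated so this snippet stands alone.
find :: (Conat -> Bool) -> Conat
find p = if p Z then Z else S (find (p . S))

exists :: (Conat -> Bool) -> Bool
exists p = p (find p)

-- A total example predicate ("is n >= 3?"); it inspects at most
-- three successors, so it halts on every conatural.
atLeastThree :: Conat -> Bool
atLeastThree (S (S (S _))) = True
atLeastThree _             = False
```

Here `exists atLeastThree` is True (and `find atLeastThree` is the witness 3), while `exists (\_ -> False)` is False: the predicate never inspects its argument, so the infinite result of `find` is never forced.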