But compiled code loses a lot of the "extra" data, like comments and most identifier names. Also, these are "language" models, so I would be surprised if training them on binaries were much more efficient than having them write in some kind of programming language.
Besides, how do you even check the result without running untrusted code? Do you have to reverse-engineer the binary after every run of the model?