I've built a custom thumbnail/metadata extraction toolkit based on libavcodec/libavformat that runs the decoding in seccomp's strict mode and communicates the results through a linear RGB stdout stream. It works pretty well and has low overhead and complexity.
Full transcoding would be a bit more complex, but assuming decoding is done in software, I think that should also be possible.
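For anyone wondering what the strict-mode part looks like in practice, here is a minimal sketch. It is not the toolkit's actual code. `fake_decode` and `sandboxed_decode` are hypothetical names, the "decoder" just fills a buffer with a byte pattern instead of calling libavcodec, and a pipe stands in for stdout. The two details that matter are that every buffer has to exist before entering strict mode (no allocation afterwards), and that the child has to leave through the raw `exit` syscall, because glibc's `_exit()` uses `exit_group`, which strict mode kills.

```c
#include <linux/seccomp.h>
#include <stdint.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the real decoder: pure computation, no syscalls. */
static void fake_decode(uint8_t *rgb, size_t len) {
    for (size_t i = 0; i < len; i++)
        rgb[i] = (uint8_t)(i % 251);
}

/* Terminate via the raw exit syscall, which strict mode still permits. */
static void raw_exit(int code) {
    syscall(SYS_exit, code);
    for (;;) { }
}

/* Runs fake_decode in a child locked into seccomp strict mode and copies the
 * result into out. Returns 0 on success, -1 on any failure (including a
 * kernel or container that refuses strict mode). */
int sandboxed_decode(uint8_t *out, size_t len) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        /* From here on only read, write, exit and sigreturn are allowed. */
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            raw_exit(2);
        fake_decode(out, len);          /* writes to the child's own copy */
        size_t sent = 0;
        while (sent < len) {
            ssize_t n = write(fds[1], out + sent, len - sent);
            if (n <= 0)
                raw_exit(1);
            sent += (size_t)n;
        }
        raw_exit(0);
    }

    /* Parent: collect the bytes, then check how the child ended. */
    close(fds[1]);
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fds[0], out + got, len - got);
        if (n <= 0)
            break;
        got += (size_t)n;
    }
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0 || got != len)
        return -1;
    return 0;
}
```

Calling `sandboxed_decode(buf, sizeof buf)` from an unsandboxed parent fills `buf` with whatever the child produced. Note that the `prctl` call can fail with `EINVAL` when the process is already running under a seccomp filter (some container runtimes do this), in which case this sketch reports failure rather than decoding unsandboxed.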