Not just that, but also all the bytes have to go through an extra pipe. Presumably they're copied an extra time because of this.
When you run "cmd < file", the command reads from stdin, which pulls directly from the file. When you do "cat file | cmd", "cat" opens the file, reads from there, and writes to a pipe. Then "cmd" reads from its stdin, which is a pipe.
GNU cat will use the copy_file_range syscall when possible!
copy_file_range allows a user land program to copy data between two files without doing any user space work. Instead of reading data into a buffer and writing it back out to the destination, the kernel will somehow manage to move the data for you.
I think this will prevent any extra copies from occurring in situations where it can be used.
When you run "cmd < file", the command reads from stdin, which pulls directly from the file. When you do "cat file | cmd", "cat" opens the file, reads from there, and writes to a pipe. Then "cmd" reads from its stdin, which is a pipe.