The PDP-11 had an MMU, but it was an extremely simple one, even simpler than what was available in the Intel 8088/8086.
When an MMU is mentioned in a UNIX context, an MMU like the one in the DEC VAX is understood, which provided features like access protection, dynamic relocation, and virtual memory.
The main purpose of the PDP-11 MMU was to extend the maximum memory size from 64 kB to 4 MB, but that was done in a very inconvenient way.
A single process was limited to 64 kB of program code + 64 kB of data. For larger processes one had to use memory overlays, i.e. swap out parts of the program while other parts were swapped in over the same addresses. This feature was not used in UNIX, but it was used in other operating systems for the PDP-11, e.g. DEC RSX-11M.
In UNIX, the larger memory was exploited by keeping many concurrent processes in memory. It was easier to split a task into subtasks done by cooperating processes communicating through pipes than to attempt to remap some of the 8 kB pages of the 64 kB address space while the program was running, in order to exceed the process size limit.
The PDP-11 MMU was more powerful than what existed in the 8088/8086. It provided a supervisor/user distinction and didn't allow user code to go to town on the MMU directly. Yes, being a true 16-bit system, it was limited to 64KB total in each address space, but it provided both:
* True process isolation, since user code couldn't load whatever it wanted into the base registers like on an 8086, and
* True pages, so that the entire 64KB didn't have to be contiguous like on an 8086, but was instead broken into 8KB pages that could be mapped, shared between processes, and mixed together in different places in the virtual address space.
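The page mapping described above can be sketched in a few lines. This is a simplified model, not real kernel code: it assumes the common description of the PDP-11 MMU (top 3 bits of a 16-bit virtual address select one of 8 Page Address Registers, each holding a physical base in 64-byte units) and omits the access-control bits, the separate I/D spaces, and the kernel/supervisor/user register sets.

```c
#include <stdint.h>

/* Simplified PDP-11 address translation: the top 3 bits of a 16-bit
 * virtual address pick one of 8 Page Address Registers (PARs); each
 * PAR holds a physical base measured in 64-byte units.  Access-control
 * checks and the separate I/D spaces are omitted. */
static uint32_t pdp11_translate(const uint16_t par[8], uint16_t vaddr)
{
    unsigned page = vaddr >> 13;              /* which 8 KB page (0..7)   */
    uint32_t disp = vaddr & 0x1FFF;           /* 13-bit displacement      */
    return ((uint32_t)par[page] << 6) + disp; /* base * 64 + displacement */
}
```

Because each of the 8 pages can point anywhere in physical memory, a 64 kB address space need not be contiguous, and a page can appear in more than one process's map; on the 8086 a segment is instead one contiguous run starting at segment * 16.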
Everything you're saying applies equally to the 32-bit virtual / 36-bit physical PAE world that we had for a while, and I wouldn't call that a simple MMU.
It is true that the PDP-11 MMU provided memory protection, unlike the 8088/8086.
However, that mattered only for the operating system kernel, as it allowed process isolation.
For user programs, the 8088/8086 MMU was a thousand times more convenient: segments could start on any 16-byte boundary instead of 8 kB boundaries, there were 4 segments instead of 2, and you could load a pointer to a new segment with a single instruction.
For an IBM PC, it was very easy to write a 512 kB program: you just had to select an appropriate memory-model option for the compiler. For a PDP-11, even with 4 MB of memory, writing a large program was extremely difficult.
You had to partition the code and data of the program into pieces not too small and not too large, to fit in one or a few 8 kB pages. You had to partition in such a way that the parts would not need to be active simultaneously and you would not need to swap them frequently. Swapping parts was slow, as it had to be done by an operating system call.
The linker had to be instructed to allocate the overlapping parts to appropriate addresses, so that it would be possible to swap them.
If you wanted to eschew protection for ease of use, you could just map the UNIBUS page into the user process and let them fiddle with the attributes directly.
Additionally, each 8 kB page had a base and a length measured in 64-byte blocks, allowing smaller granularity.
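That finer granularity comes from the per-page length field. A sketch of the check, assuming the documented unit of 32 words (64 bytes) per block and an upward-expanding page (downward expansion, used for stacks, is omitted):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sketch, not real kernel code: the Page Descriptor
 * Register's page-length field (PLF) names the highest legal 64-byte
 * block of the page, so a page can be shorter than the full 8 KB.
 * Only upward-expanding pages are modeled. */
static bool pdp11_access_ok(uint8_t plf, uint16_t disp)
{
    /* disp is the 13-bit displacement within the page; its 64-byte
     * block number must not exceed the PLF. */
    return (disp >> 6) <= plf;
}
```

An access past the limit aborts to the kernel, which is what makes a partially filled page enforceable rather than just a convention.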
The feature set that the 8086 gave you was more or less still available, there was just another option that was deemed more useful in most cases.
I have to disagree - I ported v6 and v7 (and System III/V/etc.) - at the time we called them 'MMUs'. We distinguished between the PDP-11 "base and bounds" MMU, a "SUN style" SRAM-based MMU (with fixed pages loaded at context-switch time), the 68451 (which had power-of-2 region mappings loaded at context-switch time), and full paging MMUs (the VAX, PMMUs, and the RISC chips with software replacement when they came out).
I believe this was actually used (to a limited degree) in later releases of 2BSD, as features from 4BSD were backported to the PDP-11 and the kernel ballooned in size such that overlays were necessary. The authors of said overlays seemed rather exasperated by the whole thing, and I vaguely remember watching a talk on YouTube where someone recounted the (apocryphal?) tale of them pushing their PDP-11 out the window and cheering.
The 8088/8086 had an MMU that provided only address-space extension, from 16-bit to 20-bit.
It did not provide memory protection.
The PDP-11 MMU provided memory protection, so that process isolation was possible, but its main function was also address-space extension: from 16-bit to 18-bit in the first version and to 22-bit in the later version. That function was much less convenient to use than on the 8088/8086.
The 8088/8086 segment architecture is not what is generally considered an MMU. Yes, it provides address translation. That's part of what an MMU does. That doesn't make it an MMU, though.
The 80286 is generally considered to be when an MMU was added to the x86 line.
Address translation, whether with segments or with pages, is the essential function of an MMU and by far its most complex feature.
Adding flags for additional features piggy-backed on top of the address translation, e.g. memory protection, is the easy part of an MMU.
The 80286 added memory protection, which is essential for a multi-user operating system like UNIX. That is why it was the first Intel CPU to which UNIX was ported, but any computer with address-space extension must have an MMU.
The 8088/8086 was indeed unusual in having an MMU without memory protection, because it was intended only for personal computers, while the previous computers expensive enough to need address-space extension were all intended for multi-user applications, so they all included memory protection in their MMUs.
> The 8088/8086 had an MMU that provided only address-space extension, from 16-bit to 20-bit.
In hardware this is literally an add. Calling a single add (with aliasing) an MMU is quite a reach. The 80286 is not merely an extension of this, as its segment registers actually do indirection. Also, in terms of silicon real estate, I think you greatly underestimate the complexity of what you call the “easy” part - anything that can fault is no longer trivial, for one - and the 8088 cannot do this.
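For comparison, the entire 8086 "translation" fits in one line. This is a sketch of the well-known segment arithmetic; the final mask models the 20 external address pins, which is what produces the famous wraparound at 1 MB:

```c
#include <stdint.h>

/* The whole 8086 address computation: shift the 16-bit segment left
 * 4 bits, add the 16-bit offset, and truncate to 20 address lines. */
static uint32_t phys20(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFF;
}
/* Aliasing: 0x1234:0x0005 and 0x1000:0x2345 both name physical 0x12345. */
```

There is no indirection, no per-segment limit, and nothing that can fault, which is the crux of the argument above.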
I believe the 8086, as originally designed and implemented, had a segmented address space with base and bounds registers in the style of the CDC 6600. This rudimentary but effective memory management approach was scrapped by Intel to make the 8086 part commensurate with the stepper reticle and semiconductor process of the time.
An email exchange with one of the 8086 architects informs me that no version of the 8086 design had base and bound hardware. Vivid and detailed as my recollection is, I appear to be remembering something that never happened.