Operations on a Mutex are orders of magnitude slower than operations on a CRITICAL_SECTION; that's the main reason. (I did a performance comparison once and there was a huge difference; unfortunately I don't remember the exact numbers.) The reason is that a Mutex is a kernel object, so every operation has to go through the kernel, which seems to be very slow on Windows. A CRITICAL_SECTION handles the uncontended case in user space, and its implementation is very efficient (inlining, etc.).
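If you want to reproduce the comparison yourself, something along these lines should do. This is only a rough sketch, not the test I ran back then: `timeLockUnlock` and `compareCsAndMutex` are names made up for this post, the measurement is single-threaded (uncontended), and the Windows-specific part is guarded so the timing helper still compiles on other platforms.

```cpp
#include <chrono>
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#include <cstdio>
#endif

// Hypothetical helper: times `iterations` uncontended lock/unlock pairs
// and returns the elapsed time in nanoseconds.
template <typename Lock, typename Unlock>
std::int64_t timeLockUnlock(Lock lock, Unlock unlock, int iterations)
{
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
    {
        lock();
        unlock();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

#ifdef _WIN32
// Windows-only: compares a CRITICAL_SECTION with a kernel Mutex object.
void compareCsAndMutex(int iterations)
{
    CRITICAL_SECTION cs;
    InitializeCriticalSection(&cs);
    HANDLE mtx = CreateMutexW(NULL, FALSE, NULL);
    if (mtx == NULL)
    {
        DeleteCriticalSection(&cs);
        return;
    }

    std::int64_t csNs = timeLockUnlock(
        [&] { EnterCriticalSection(&cs); },
        [&] { LeaveCriticalSection(&cs); },
        iterations);
    std::int64_t mtxNs = timeLockUnlock(
        [&] { WaitForSingleObject(mtx, INFINITE); },
        [&] { ReleaseMutex(mtx); },
        iterations);

    std::printf("CRITICAL_SECTION: %lld ns, Mutex: %lld ns\n",
                static_cast<long long>(csNs), static_cast<long long>(mtxNs));

    CloseHandle(mtx);
    DeleteCriticalSection(&cs);
}
#endif
```

The numbers will depend on the Windows version and hardware, so treat whatever you measure as indicative only.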
The timed wait operation is a recent addition, so we did not have this problem in the original design. However, the performance benefits of a CS, and the fact that a timed wait is not used very often, still make CS the preferred mechanism for implementing the Mutex.
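To illustrate the trade-off: a CRITICAL_SECTION has no native timed wait, so one way to get one is to poll `TryEnterCriticalSection` until a deadline passes. The sketch below shows that idea only; it is not necessarily how the library does it. `CsMutex` and `tryLockFor` are hypothetical names, the 1 ms sleep interval is an arbitrary choice, and on non-Windows platforms a `std::recursive_mutex` stands in for the CRITICAL_SECTION (which is also recursive) so the code still compiles.

```cpp
#include <chrono>
#include <thread>
#ifdef _WIN32
#include <windows.h>
#else
#include <mutex>
#endif

// Hypothetical wrapper, for illustration only.
class CsMutex
{
public:
    CsMutex()
    {
#ifdef _WIN32
        InitializeCriticalSection(&_cs);
#endif
    }

    ~CsMutex()
    {
#ifdef _WIN32
        DeleteCriticalSection(&_cs);
#endif
    }

    CsMutex(const CsMutex&) = delete;
    CsMutex& operator=(const CsMutex&) = delete;

    void lock()
    {
#ifdef _WIN32
        EnterCriticalSection(&_cs);
#else
        _cs.lock();
#endif
    }

    // Non-blocking attempt; returns true if the lock was acquired.
    bool tryLock()
    {
#ifdef _WIN32
        return TryEnterCriticalSection(&_cs) != 0;
#else
        return _cs.try_lock();
#endif
    }

    // Timed wait emulated by polling: try, check the deadline, sleep briefly.
    bool tryLockFor(long milliseconds)
    {
        auto deadline = std::chrono::steady_clock::now()
                      + std::chrono::milliseconds(milliseconds);
        for (;;)
        {
            if (tryLock()) return true;
            if (std::chrono::steady_clock::now() >= deadline) return false;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    void unlock()
    {
#ifdef _WIN32
        LeaveCriticalSection(&_cs);
#else
        _cs.unlock();
#endif
    }

private:
#ifdef _WIN32
    CRITICAL_SECTION _cs;
#else
    std::recursive_mutex _cs; // stand-in so the sketch builds off Windows
#endif
};
```

The polling loop makes the timed wait less precise and a bit wasteful under contention, but the common `lock()`/`unlock()` path stays as cheap as a plain CRITICAL_SECTION, which is the point of the design choice described above.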