Windows Timer Resolution
First things first: most of my knowledge on this is based on the work of Bruce Dawson and his 2013 article "Windows Timer Resolution: Megawatts Wasted". Bruce is one of my favorite writers on IT topics for a number of reasons. Partly it's because he has a very deep knowledge of the Windows kernel and doesn't just take it on faith that it works perfectly because lots of programmers work on it, and partly because he wrote his own tool that, among other things, improves Crysis frame rates by 30% by increasing the frequency of the check that decides whether a new frame should start rendering.
That said, this article is a bit out of date even as it comes out, since recent changes to Windows 10 scheduling mean that as of the 2020 Windows 10 update the timer works completely differently. Windows 7 and 8.1 still behave the old way, and may well be faster for games at this point. One thing worth questioning: given that Windows for Workstations ships with a low-latency performance mode, does that suggest the Windows for Workstations release keeps the old behavior? I don't have a copy of Windows for Workstations, but I'm very keen to find out which version of the behavior its "Low Latency" mode goes with.
The short version of the Megawatts Wasted article is that Windows runs a periodic timer interrupt; on each tick, the scheduler checks whether any sleeping thread is due to wake up and run again. By default that tick fires every 15.625 ms, which works out to 64 times a second. (The default is different on single-core systems.) A process can request a finer timer resolution, down to 0.5 ms, or roughly 2000 ticks a second. However, in versions of Windows before 2020 this value is system-wide: if your media player requests the maximum resolution, that applies to every application on your system. This matters because games and responsive apps benefit a lot from a lower value, but don't always request the lowest value themselves, since the API for doing so is fairly obscure and the benefits of requesting the highest possible resolution may not be apparent on lower-powered systems.
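The request itself goes through the winmm library's timeBeginPeriod/timeEndPeriod calls. Here's a minimal sketch using Python's ctypes; note that timeBeginPeriod only goes down to 1 ms, and the 0.5 ms floor mentioned above requires the undocumented NtSetTimerResolution call instead. The function names below are my own; on non-Windows platforms the sketch is a no-op:

```python
import ctypes
import sys

def request_timer_resolution(ms: int) -> bool:
    """Ask Windows for a finer global timer interval via winmm's
    timeBeginPeriod. Returns True if the request was accepted;
    on non-Windows platforms this is a no-op that returns False."""
    if sys.platform != "win32":
        return False
    # A return of 0 (TIMERR_NOERROR) means the request took effect.
    return ctypes.WinDLL("winmm").timeBeginPeriod(ms) == 0

def release_timer_resolution(ms: int) -> None:
    """Undo a prior request. Windows expects every timeBeginPeriod
    call to be paired with a matching timeEndPeriod call."""
    if sys.platform == "win32":
        ctypes.WinDLL("winmm").timeEndPeriod(ms)
```

Because the setting is (pre-2020) system-wide, Windows honors the finest resolution requested by any running process, which is exactly why one greedy media player can speed up or slow down everything else on the machine.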
It's worth understanding how Windows actually schedules a thread to run. Windows hands out processor time in units called "quanta": each thread runs for its allotted quanta before the scheduler checks whether another thread should run instead. Even on multi-processor systems, this yield-and-check still happens. A thread can also yield early, if it specifically tells Windows it's done using processor time. If you've seen the "Programs" versus "Background services" choice under processor scheduling in the system performance settings, that's what tells Windows how many quanta foreground and background processes should each get by default. The quanta themselves don't have any fixed wall-clock length: they're counted in timer ticks, so their length depends entirely on the currently running timer resolution. A higher timer resolution therefore makes the check for waiting threads trigger more often, and pushes aside threads that fail to yield correctly more often as well. Of course, in a world of perfectly written code this check would be unnecessary, as applications would just negotiate processor time among themselves, but that doesn't happen even with applications written by the large studios.
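The quantum-follows-the-tick relationship is easy to make concrete with a little arithmetic. The numbers below are illustrative assumptions rather than values read from any particular Windows build: a commonly cited client default is a quantum of two clock ticks, with a multiplier for the foreground process when "Programs" is selected.

```python
def quantum_length_ms(tick_interval_ms: float,
                      ticks_per_quantum: int = 2,
                      foreground_boost: int = 1) -> float:
    """Wall-clock length of one scheduling quantum, given the timer
    tick interval. ticks_per_quantum=2 is a commonly cited client
    default; foreground_boost=3 models the 'Programs' setting."""
    return tick_interval_ms * ticks_per_quantum * foreground_boost

# At the default 15.625 ms tick, a 2-tick quantum lasts 31.25 ms of
# wall-clock time...
default_quantum = quantum_length_ms(15.625)
# ...but with the timer raised to 1 ms, the same 2-tick quantum
# shrinks to 2 ms, so misbehaving threads get preempted far sooner.
raised_quantum = quantum_length_ms(1.0)
```

This is why raising the timer resolution changes scheduling behavior rather than just wake-up precision: the same "number of ticks" budget translates into much less wall-clock time per turn.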
If you're interested, you can grab Bruce's tool from his article. It's worth doing, as the performance benefits are very real, especially as CPUs become more multi-threaded and more powerful. The Windows XP version of the tool works on all Windows editions; the only thing you lose is the ability to run it from the command line.
For reference, Linux has its timer frequency fixed when the kernel is compiled (the CONFIG_HZ build option). Usually it's the equivalent of 10 ms in Windows terms, but it can go as low as 1 ms. In older versions of the Linux KDE desktop you could tell what the kernel setting was through how certain animations displayed, as they depended on timers shorter than one full cycle of the scheduler that didn't return correctly. This meant the folder-opening animation on older KDE looked noticeably different depending on the kernel's resolution.
So if this makes games go faster, what's the downside? For one thing, the power circuitry in some convertible laptops gives out a soft, high-pitched whine when a high timer resolution is active, one you may or may not notice. It's not in your head, it's really there; I've noticed it myself in the past, though not on any modern machines. Supposedly it comes from the power delivery to the CPU switching from high to low and back 2000 times per second instead of doing so more slowly. There's also a performance penalty on processing tasks: each run of the scheduler evicts data from the CPU's first level of memory, which has to be read back in once the next task has been picked, and doing this more often than usual slows applications down overall, by something like 3% on older gear, though this hits things like physics simulations and pure math workloads harder than responsive tasks like gaming. Finally, at low workloads there's higher power consumption. That might not matter on desktops, but it does matter on laptops, and it's something the developers of the Chrome browser noticed enough to switch Chrome's timer resolution from 0.5 ms to 1 ms to try and fix.
All of that being said, this reaffirms my suspicion that most modern processors are heavily underutilized.