Image taken from VideoCardz

Oh shit, I guess I’m doing a speculation article!

I find this Alder Lake-S leak/rumor, which shows 8 big cores and 8 little cores, really interesting. I can think of a number of uses for such a processor, but let’s not jump into those right away and get all excited.

A couple of things keep being brought up on Twitter: whether Windows 10 can support such a core configuration, and whether little cores make sense on desktop when Skylake itself is already so efficient.

I’ll start with the Windows 10 thing. Windows 10 has support for ARM big.LITTLE in the kernel. Windows 10 also has support for a thing called CPU sets.

This writeup over here describes things pretty well:

Windows 10 brought us another innovation in the scheduler space with CPU Sets. CPU Sets allow a process to partition the system such that its process can take over a group of processors and not allow any other process or system to run their threads on those processors. Windows Kernel even steers Interrupts from devices away from the processors that are part of your CPU set. This ensures that even devices cannot target their code on the processors which have been partitioned off by CPU sets for your app or process. Think of this as a low-tech Virtual Machine. As you can imagine this is a powerful capability and hence there are a lot of safeguards built-in to prevent an app developer from making the wrong choice within the API. CPU sets functionality are used by the customer when they use Game Mode to run their games.

Finally, this brings us to ARM64 support with Windows 10 on ARM. The ARM architecture supports a big.LITTLE architecture, big.LITTLE is a heterogenous architecture where the “big” core runs fast, consuming more power and the “LITTLE” core runs slow consuming less power. The idea here is that you run unimportant tasks on the little core saving battery. To support big.LITTLE architecture and provide great battery life on Windows 10 on ARM, the Windows scheduler added support for heterogenous scheduling which took into account the app intent for scheduling on big.LITTLE architectures.

By app intent, I mean Windows tries to provide a quality of service for apps by tracking threads which are running in the foreground (or starved of CPU) and ensuring those threads always run on the big core. Whereas the background tasks, services, and other ancillary threads in the system run on the little cores. (As an aside, you can also programmatically mark your thread as unimportant which will make it run on the LITTLE core.)

Work on Behalf: In Windows, a lot of work for the foreground is done by other services running in the background. E.g. In Outlook, when you search for a mail, the search is conducted by a background service (Indexer). If we simply, run all the services on the little core, then the experience and performance of the foreground app will be affected. To ensure, that these scenarios are not slow on big.LITTLE architectures, Windows actually tracks when an app calls into another process to do work on its behalf. When this happens, we donate the foreground priority to the service thread and force run the thread in the service on the big core.

There are already things in place for designating programs as background or foreground. I don’t see any issue with Windows 10 being able to support an x86 version of big.LITTLE.
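To make the quoted behavior a bit more concrete, here is a toy Python sketch of the placement rules as I read them. This is not how the Windows scheduler is actually implemented. The names `Thread`, `pick_core`, and `on_behalf_of` are made up for illustration, and the rule that an "unimportant" mark beats everything else is my own assumption.

```python
from dataclasses import dataclass
from enum import Enum


class CoreType(Enum):
    BIG = "big"
    LITTLE = "little"


@dataclass
class Thread:
    # Hypothetical model of a thread; the real scheduler tracks far more state.
    name: str
    foreground: bool = False   # belongs to the app the user is interacting with
    starved: bool = False      # has been waiting too long for CPU time
    unimportant: bool = False  # the app explicitly marked it as unimportant
    donated: bool = False      # currently doing work on behalf of a foreground app


def pick_core(thread: Thread) -> CoreType:
    """Toy placement rule: important work goes big, everything else goes little."""
    if thread.unimportant:
        # Assumption: an explicit "unimportant" mark overrides the other signals.
        return CoreType.LITTLE
    if thread.foreground or thread.starved or thread.donated:
        return CoreType.BIG
    return CoreType.LITTLE


class on_behalf_of:
    """Models 'work on behalf': a service borrows its caller's importance."""

    def __init__(self, caller: Thread, service: Thread):
        self.caller = caller
        self.service = service
        self.previous = False

    def __enter__(self) -> Thread:
        self.previous = self.service.donated
        caller_important = self.caller.foreground or self.caller.donated
        self.service.donated = self.previous or caller_important
        return self.service

    def __exit__(self, *exc) -> bool:
        # The donation ends when the call returns.
        self.service.donated = self.previous
        return False


outlook = Thread("outlook-ui", foreground=True)
indexer = Thread("search-indexer")

print(pick_core(indexer).value)      # little
with on_behalf_of(outlook, indexer):
    print(pick_core(indexer).value)  # big
print(pick_core(indexer).value)      # little
```

The point is only that foreground/background plus a temporary donation is enough information to decide which kind of core a thread lands on, and nothing about that decision is specific to ARM.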

The other objection, that little cores aren’t necessary because the efficiency is already there, is well founded, but it falls apart if single-core speed and GHz are the path forward. Do you need or want some background Windows 10 service bringing your cores up to 5.3GHz? Extra points if those Atom cores have only a base clock, no turbo, and a separate power domain from the big cores. If Intel knows that it doesn’t want to compete on core count in client desktop, it needs to try exotic things to keep up this clock speed race. You can partition so many things off to those LITTLE Atom cores. Yes, it is true that more big cores can, in theory, get more things done in a timely fashion. But what if the goal isn’t completion of a task? What if the goal is maintaining background tasks and keeping the big cores free to boost properly and compute what is going on in the foreground?

Anyway, these are some of my thoughts about this whole thing.