This is a follow-up ticket created after a good discussion on bug #211746; a particularly good comment that sums up the essence of it is quoted below.
In our particular case this causes an inability to use more than 64MB of memory to load the kernel and images. The solution could also be some kind of scatter-gather mechanism for blobs of memory: if a blob is too big to fit into a single chunk, the loader splits it over multiple regions and lets the kernel do its VM magic to stitch the pages back together in the KVA. This would still leave us with 64MB of data for the kernel itself, but at least we would be able to pre-load much larger images.
Marcel Moolenaar freebsd_committer 2017-02-17 18:34:37 UTC
I think the complexity of having the kernel at any other physical address is what has us do the staging/copying. It was a quick-n-dirty mechanism that avoided a lot of work and complexity -- which is ok if you don't know whether it's worth or needed to go through all that hassle. And it looks like we have now hit a case that warrants us to start looking at a real solution.
As an example (for inspiration):
For Itanium I had the kernel link against a fixed virtual address. The loader built the VA-to-PA mapping based on where EFI allocated blobs of memory. The mapping was loaded/activated prior to booting the kernel and the loader gave the kernel all the information it needed to work with the mapping. This makes it possible to allocate memory before the VM system is up and running. Ultimately the mapping needs to be incorporated into the VM system and this is where different CPU architectures have different challenges and solutions.
Note that there are advantages to having the kernel link against a virtual address. In general it makes it easier to load or relocate the kernel anywhere and this enables a few capabilities that other OSes already have and then some.
There are also downsides. You may need to support a large VA range if you want to support pre-loading CD-ROM images or running entirely from a preloaded memory disk. A few GB of address space would be good to have.
Anyway: it's probably time to restate this bug as an architectural (x86-specific for now) problem and have a discussion on the arch@ mailing list.
We need more people involved to bring this to a closure.
Our memory map (Windows 10, built-in Hyper-V) is here:
The EFI model of handing control from the loader to the OS is dictated by the Windows loader: there, the loader constructs the kernel virtual address space and establishes the initial mappings. The most telling demonstration of the approach is the runtime services abomination, the requirement that the loader provide the future mapping of the runtime segments to the firmware while the kernel has not even started.
Changing the amd64 loader/kernel interaction to adopt this model is possible, but I am not sure that it is worth the effort. At least, I do not consider the use case of a large preloaded md sufficient justification for all the work required, or for causing a flag day where a new kernel will absolutely require a new loader.