Anatomy of the thread suspension mechanism in Windows (Windows Internals)

 

Introduction


Process suspension is a well-known technique, and it is used for a variety of reasons (sometimes even by malicious software). The term “suspension” means “stopping” something, and in case you did not guess it yet, “process suspension” is a technique to temporarily “stop” a running process. If you are suspended from school then you won’t be attending for the duration of the suspension, and when the term is used with a process, the process won’t be carrying out any operations whilst it’s suspended.

When you suspend a process, the threads of the process are put into a suspended state. The threads of a process are what actually run the code belonging to the process: each thread is a stream of instructions for the CPU to execute. Usually, a process will have more than one thread, which allows the process to carry out several operations at the same time. If we were to suspend one of the running threads, then that thread would not carry out any operations until it has been resumed. If we were to suspend every thread belonging to the process, then we would have successfully suspended the process! To cut it short, process suspension relies on suspending the threads of a process. This cuts off code execution until we resume the process, which consists of resuming each thread that we put into a suspended state.

When a process is being spawned, there will be a main thread for the process and it will be in a suspended state until initialisation of the newly starting process has been completed. Even if the requester of the process spawn operation does not specify the CREATE_SUSPENDED flag, the process will still be started in a suspended state by the Windows loader until initialisation has been successfully completed. When the process is ready to start running its own code, the main thread, which has been kept suspended until this point, is resumed (unless the CREATE_SUSPENDED flag was specified by the requester of the process spawn). The resume operation for the main thread leads to a routine known as NtResumeThread (NTDLL) being invoked. The NtResumeThread stub present in NTDLL performs a system call to get the real NtResumeThread routine invoked, which resides in kernel-mode memory (in NTOSKRNL, the Windows Kernel, to be precise).

This article is broken into two separate sections. The first section discusses user-mode, and the second section discusses kernel-mode. In both sections, suspending and resuming a process’ threads will be discussed.

 

Section 1 – User-Mode


When it comes down to suspending a process from user-mode, you have a few options.

  • Invoke NtSuspendProcess (NTDLL) which is undocumented. [1]
  • Enumerate the threads of the targeted process and invoke NtSuspendThread (NTDLL). [2]
  • Enumerate the threads of the targeted process and invoke SuspendThread (KERNEL32). [3]

[1] – The first method noted, via NtSuspendProcess, is the most minimal solution. At the same time however, it is also one of the most unreliable. NtSuspendProcess is not officially documented by Microsoft (despite being documented by third-parties for several years), which can only make you wonder why they are yet to document it themselves considering it is so widely exposed to the public already – it would take them barely any time to document it over at the Microsoft Developer Network (MSDN).

The function takes one parameter only, which is the handle to the process being targeted for suspension, and thus the data-type of this parameter is HANDLE (VOID*).

I’ve left a type-definition for the NtSuspendProcess routine below.

typedef NTSTATUS(NTAPI *pNtSuspendProcess)(
    HANDLE ProcessHandle
);

 

When you invoke NtSuspendProcess (NTDLL), the system performs a transition operation via a system call to cause NtSuspendProcess (NTOSKRNL) to become invoked. We can verify these findings by taking a look at NTDLL.DLL for the NtSuspendProcess exported routine.

 

NtSuspendProcess function prologue under NTDLL.

 

If we think back to what I said at the start of this article, process suspension works by suspending the threads of the targeted process. How does NtSuspendProcess work then? NtSuspendProcess leads down a path which has one end-result only: the threads of the targeted process are enumerated, and each thread found during the enumeration is suspended.

I’ve created a very straight-forward user-mode snippet in C which invokes NtSuspendProcess (NTDLL) for demonstration purposes. You’ll need to include the <stdio.h> and <windows.h> headers and add a main entry-point routine to compile and test it out.

#define STATUS_INSUFFICIENT_RESOURCES ((NTSTATUS)0xC000009AL)

typedef _Return_type_success_(return >= 0) LONG NTSTATUS;
typedef NTSTATUS *PNTSTATUS;

#define NT_SUCCESS(Status) (((NTSTATUS)(Status)) >= 0)

typedef NTSTATUS(NTAPI *pNtSuspendProcess)(
    HANDLE ProcessHandle
);

pNtSuspendProcess fNtSuspendProcess;

BOOLEAN InitializeExports()
{
    // The A variant is used so the narrow string compiles with or without UNICODE defined.
    HMODULE hNtdll = GetModuleHandleA("NTDLL");

    if (!hNtdll)
    {
        return FALSE;
    }

    fNtSuspendProcess = (pNtSuspendProcess)GetProcAddress(hNtdll,
       "NtSuspendProcess");

    if (!fNtSuspendProcess)
    {
        return FALSE;
    }

    return TRUE;
}

NTSTATUS NTAPI NtSuspendProcess(
    HANDLE ProcessHandle
)
{
    if (!fNtSuspendProcess)
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    return fNtSuspendProcess(ProcessHandle);
}

BOOLEAN WINAPI SuspendProcess(
    HANDLE ProcessHandle
)
{
    if (!ProcessHandle)
    {
        return FALSE;
    }
    
    return (NT_SUCCESS(NtSuspendProcess(ProcessHandle)))
        ? TRUE : FALSE;
}

Here is a break-down of how the snippet is supposed to be used and how it works.

  1. The InitializeExports routine (return-type of BOOLEAN) sets up NtSuspendProcess for usage via a dynamic import. pNtSuspendProcess is a type-definition for the function signature, and it is used for a global variable (fNtSuspendProcess) which gets pointed to the address of NtSuspendProcess. This means we can call fNtSuspendProcess like a normal function, and the call will land in NtSuspendProcess (NTDLL) since that is the address it holds.
  2. After opening a handle to a process (with at least the PROCESS_SUSPEND_RESUME access right), the handle can be passed as the parameter for the SuspendProcess routine. This routine is marked with the WINAPI macro, however this simply expands to __stdcall, and so does the NTAPI macro; __stdcall is a calling convention. Just to be clear and avoid confusion, SuspendProcess is not a Win32 API routine. I simply used the WINAPI macro to represent __stdcall because I prefer to do so.
  3. The SuspendProcess routine calls the NtSuspendProcess routine. The manually declared NtSuspendProcess routine then calls fNtSuspendProcess, which points to the address of NtSuspendProcess (NTDLL) as previously noted.
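The SuspendProcess wrapper decides between TRUE and FALSE with NT_SUCCESS, which treats any non-negative NTSTATUS as success. The sketch below is a portable model of that check rather than part of the snippet above: it uses int32_t in place of LONG so it behaves the same outside of Windows, the status values are the ones from ntstatus.h, and the helper names StatusSeverity and DescribeStatus are hypothetical.

```c
#include <stdint.h>

/* Portable stand-in for the Windows typedef (LONG is 32 bits on Windows). */
typedef int32_t NTSTATUS;

#define NT_SUCCESS(Status) (((NTSTATUS)(Status)) >= 0)

/* Values as defined in ntstatus.h. */
#define STATUS_SUCCESS                ((NTSTATUS)0x00000000)
#define STATUS_OBJECT_NAME_EXISTS     ((NTSTATUS)0x40000000)
#define STATUS_BUFFER_OVERFLOW        ((NTSTATUS)0x80000005)
#define STATUS_INSUFFICIENT_RESOURCES ((NTSTATUS)0xC000009A)

/* The top two bits of an NTSTATUS hold the severity:
   0 = success, 1 = informational, 2 = warning, 3 = error. */
unsigned StatusSeverity(NTSTATUS Status)
{
    return ((uint32_t)Status) >> 30;
}

/* Hypothetical helper: turns a status into a short severity label. */
const char *DescribeStatus(NTSTATUS Status)
{
    static const char *const Labels[] = {
        "success", "informational", "warning", "error"
    };
    return Labels[StatusSeverity(Status)];
}
```

Because warnings and errors both have the top bit set, they are negative when viewed as a signed 32-bit value, so NT_SUCCESS rejects both. This is why the wrapper returns FALSE when the proxy routine hands back STATUS_INSUFFICIENT_RESOURCES.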

 

[2] – The second method noted, via enumeration of the targeted process’ threads and then calling NtSuspendThread on each one, is also potentially unstable due to it being undocumented like the NtSuspendProcess method noted under [1], however it does the job.

NtSuspendThread is not an “officially” documented routine, and neither is NtSuspendProcess. However, NtSuspendThread does not have a complicated signature. The routine takes two parameters: a HANDLE and a ULONG* (PULONG). The former is the handle of the thread being targeted for suspension. We must first acquire a handle to the thread, in the same sense that we must have a handle to the target process before we can use NtSuspendProcess, and the handle must have at least the THREAD_SUSPEND_RESUME access right. The latter receives the thread’s previous suspend count. It is entirely optional, and whenever I’ve needed to use this routine I’ve never had to make use of it.

I’ve left a type-definition for the NtSuspendThread routine below.

typedef NTSTATUS(NTAPI *pNtSuspendThread)(
    HANDLE ThreadHandle,
    PULONG PreviousSuspendCount OPTIONAL
);

 

Just like with NtSuspendProcess (NTDLL), when we invoke NtSuspendThread (NTDLL), a system call is performed to transition from user-mode to kernel-mode; the end-result is NtSuspendThread (NTOSKRNL) being invoked. The handy thing about NtSuspendThread though is that we can suspend only the threads we choose and leave the others running, a perk which does not come with NtSuspendProcess of course.

A quick snippet of how you would go about using NtSuspendThread is left below for you to take a quick peek at.

typedef NTSTATUS(NTAPI *pNtSuspendThread)(
    HANDLE ThreadHandle,
    PULONG PreviousSuspendCount OPTIONAL
);

pNtSuspendThread fNtSuspendThread;

BOOLEAN InitializeExports()
{
    // The A variant is used so the narrow string compiles with or without UNICODE defined.
    HMODULE hNtdll = GetModuleHandleA("NTDLL");

    if (!hNtdll)
    {
        return FALSE;
    }

    fNtSuspendThread = (pNtSuspendThread)GetProcAddress(hNtdll,
        "NtSuspendThread");

    // Only NtSuspendThread is resolved here, so it is the only pointer to validate.
    if (!fNtSuspendThread)
    {
        return FALSE;
    }

    return TRUE;
}

NTSTATUS NTAPI NtSuspendThread(
    HANDLE ThreadHandle,
    PULONG PreviousSuspendCount
)
{
    if (!fNtSuspendThread)
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    return fNtSuspendThread(ThreadHandle,
        PreviousSuspendCount);
}

BOOLEAN WINAPI SuspendThreadWrapper(
    HANDLE ThreadHandle
)
{
    if (!ThreadHandle)
    {
        return FALSE;
    }

    return (NT_SUCCESS(NtSuspendThread(
        ThreadHandle,
        NULL))) ? TRUE : FALSE;
}

 

In case you’re wondering why I named the routine “SuspendThreadWrapper” instead of “SuspendThread”, it’s because there’s already a Win32 API routine called “SuspendThread”. The SuspendThread routine is exported by KERNEL32.DLL and it does the same thing we are doing in our wrapper routine (more or less, at least): it calls NtSuspendThread (NTDLL).

Bear in mind that you will need to have at least the THREAD_SUSPEND_RESUME access right on the handle you attempt to use with NtSuspendThread. To acquire the handle to the thread, you could use NtOpenThread (NTDLL), or preferably its Win32 API equivalent, which is OpenThread (KERNEL32).

Unless you need to suspend only certain threads of a process, it is likely going to be more convenient to use NtSuspendProcess. The reason is that NtSuspendProcess (NTOSKRNL, invoked after the NTDLL system call) will call a kernel-mode routine to handle the process suspension, and this routine will automatically enumerate all the threads of the targeted process and call another kernel-mode routine to suspend each one. If you enumerate the threads and suspend them yourself, you’re doing more work to replicate the same functionality. The one exception would be if a routine like NtOpenProcess/NtSuspendProcess has been hooked, you don’t fancy bypassing the hooks, and NtOpenThread/NtSuspendThread were forgotten about. Otherwise, you may as well use NtSuspendProcess if you need to suspend a process.

 

[3] – The third method noted, via enumeration of the targeted process’ threads and then calling SuspendThread on each one, is without a doubt the most stable technique. At the same time though, it’s a bit more “obvious”. One down-fall is that, since it isn’t an Nt* routine, you can’t perform the transition from user-mode to kernel-mode manually to bypass any hooks. It is still the best documented mechanism for accomplishing process suspension though, and for this reason, it is recommended that you use this technique unless you have specific requirements which prevent you from doing so.

Despite SuspendThread being the best documented mechanism for accomplishing suspension functionality, NtSuspendProcess/NtSuspendThread have been around for an extremely long time, since Windows 2000 I believe. The chances of these routines being deprecated are extremely small; it would be like making NtOpenProcess obsolete, which I am sure is not going to happen any time soon. They are core routines in the Windows Kernel, so whether you go down the undocumented and less-stable route for this or not, as long as you know how to use the routines properly you likely won’t have any issues from patch updates any time soon, to say the least.

For the record, SuspendThread (KERNEL32) will call NtSuspendThread (NTDLL). As expected, NtSuspendThread (NTDLL) will perform a system call and then NtSuspendThread (NTOSKRNL) will be invoked. NtSuspendThread is not exported by the Windows Kernel, however it can still be reached if you can find its address via pattern scanning or the System Service Descriptor Table (SSDT).

Since SuspendThread is documented by Microsoft over at the Microsoft Developer Network (MSDN), I’ve left the type definition for the routine below along with the link to the official documentation.

typedef DWORD(WINAPI *pSuspendThread)(
    HANDLE ThreadHandle
);

https://msdn.microsoft.com/en-us/library/windows/desktop/ms686345(v=vs.85).aspx

The routine can be used the same way as the SuspendThreadWrapper routine we saw earlier, which called the NtSuspendThread proxy routine. On success, the return value of SuspendThread (KERNEL32) is the value written to the PreviousSuspendCount parameter, which is the parameter we previously ignored with the NtSuspendThread call. On failure, it is (DWORD)-1.

DWORD PreviousSuspendCount = SuspendThread(GetCurrentThread());

// SuspendThread returns (DWORD)-1 on failure. A successful call on a running
// thread returns 0, so the result cannot simply be tested for being non-zero.
if (PreviousSuspendCount != (DWORD)-1)
{
    printf("Thread was suspended and has been resumed\n");
}

If the suspension operation is successful, the conditional statement won’t be reached straight away, because the thread which is supposed to be processing those instructions will be in a suspended state. It won’t be reached until the suspended thread has been resumed (by another thread which is running or by another process).
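The “previous suspend count” return value implies that each thread carries a counter rather than a simple on/off flag. Here is a minimal portable model of that counter, assuming the documented behaviour of SuspendThread and ResumeThread (both return the previous count, or (DWORD)-1 on failure) and the MAXIMUM_SUSPEND_COUNT limit from winnt.h. The ModelThread type and the Model* functions are hypothetical names used only for illustration; nothing here touches a real thread.

```c
#include <stdint.h>

#define MAXIMUM_SUSPEND_COUNT 127u        /* MAXCHAR in winnt.h */
#define SUSPEND_FAILED ((uint32_t)-1)     /* failure value of SuspendThread/ResumeThread */

/* Hypothetical stand-in for a thread: only the suspend counter is modelled. */
typedef struct ModelThread {
    uint32_t SuspendCount;
} ModelThread;

/* Like SuspendThread: returns the previous count, then increments it. */
uint32_t ModelSuspend(ModelThread *Thread)
{
    if (Thread->SuspendCount >= MAXIMUM_SUSPEND_COUNT)
        return SUSPEND_FAILED;
    return Thread->SuspendCount++;
}

/* Like ResumeThread: returns the previous count and decrements it if non-zero. */
uint32_t ModelResume(ModelThread *Thread)
{
    uint32_t Previous = Thread->SuspendCount;
    if (Previous > 0)
        Thread->SuspendCount--;
    return Previous;
}

/* A thread only runs again once its count is back at zero. */
int ModelIsRunnable(const ModelThread *Thread)
{
    return Thread->SuspendCount == 0;
}
```

Suspending a fresh model thread twice returns 0 and then 1, and it takes two resumes before ModelIsRunnable reports it as runnable again. The first successful suspension returning 0 is also the reason a SuspendThread result has to be compared against (DWORD)-1 instead of being tested for non-zero.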

 

We’ve talked a bit about how we can suspend a process (and how this relies on targeting the threads) or suspend specific threads, but what about resuming them? We’ll move on to this now before progressing to the kernel-mode section of this article.

When it comes down to resuming a process from user-mode, you have a few options.

  • Invoke NtResumeProcess (NTDLL) which is undocumented. [1]
  • Enumerate the threads of the targeted process and invoke NtResumeThread (NTDLL). [2]
  • Enumerate the threads of the targeted process and invoke ResumeThread (KERNEL32). [3]

In case you’re yet to notice, we are doing the same as when we are performing suspension, but in reverse. For example, our first option for process suspension would be to invoke NtSuspendProcess (NTDLL), and our first option for resuming a process would be to invoke NtResumeProcess (NTDLL). NtSuspendProcess, NtSuspendThread and SuspendThread all have a “Resume” variant; simply replace the “Suspend” key-word in the routine names with “Resume” and Bob’s your uncle!

As noted with NtSuspendProcess and NtSuspendThread about stability, NtResumeProcess and NtResumeThread are in the same boat; they aren’t officially documented by Microsoft and there must be a reason for this. However, worrying about it isn’t really something you should do, considering they are core routines used in Windows and the likelihood of them being made obsolete is very low.

I’ve left an example below for NtResumeProcess and NtResumeThread. It is more-or-less the same as the process suspension examples, except for resume instead. You can use ResumeThread the exact same way you use SuspendThread.

typedef NTSTATUS(NTAPI *pNtResumeProcess)(
    HANDLE ProcessHandle
);

typedef NTSTATUS(NTAPI *pNtResumeThread)(
    HANDLE ThreadHandle,
    PULONG PreviousSuspendCount OPTIONAL
);

pNtResumeProcess fNtResumeProcess;
pNtResumeThread fNtResumeThread;

BOOLEAN InitializeExports()
{
    // The A variant is used so the narrow string compiles with or without UNICODE defined.
    HMODULE hNtdll = GetModuleHandleA("NTDLL");

    if (!hNtdll)
    {
        return FALSE;
    }

    fNtResumeProcess = (pNtResumeProcess)GetProcAddress(hNtdll,
        "NtResumeProcess");

    fNtResumeThread = (pNtResumeThread)GetProcAddress(hNtdll,
        "NtResumeThread");

    if (!fNtResumeProcess ||
        !fNtResumeThread)
    {
        return FALSE;
    }

    return TRUE;
}

NTSTATUS NTAPI NtResumeProcess(
    HANDLE ProcessHandle
)
{
    if (!fNtResumeProcess)
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    return fNtResumeProcess(ProcessHandle);
}

NTSTATUS NTAPI NtResumeThread(
    HANDLE ThreadHandle,
    PULONG PreviousSuspendCount
)
{
    if (!fNtResumeThread)
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    return fNtResumeThread(ThreadHandle,
        PreviousSuspendCount);
}

BOOLEAN WINAPI ResumeProcess(
    HANDLE ProcessHandle
)
{
    if (!ProcessHandle)
    {
        return FALSE;
    }

    return (NT_SUCCESS(NtResumeProcess(ProcessHandle)))
        ? TRUE : FALSE;
}

BOOLEAN WINAPI ResumeThreadWrapper(
    HANDLE ThreadHandle
)
{
    if (!ThreadHandle)
    {
        return FALSE;
    }

    return (NT_SUCCESS(NtResumeThread(
        ThreadHandle,
        NULL))) ? TRUE : FALSE;
}

 

We can check to ensure that the following is true with some simple reverse engineering.

  1. SuspendThread (KERNEL32) -> NtSuspendThread (NTDLL)
  2. ResumeThread (KERNEL32) -> NtResumeThread (NTDLL)

The first thing we are going to do is retrieve the address of SuspendThread (KERNEL32) and we’ll set a break-point. I’m using Visual Studio 2017 and I’ll be using the Visual Studio debugger for this task.

 

Snippet for obtaining the address to SuspendThread (KERNEL32).

 

When we debug and the break-point is hit, we can step into/over and then check the value stored under the SuspendThreadAddress variable.

 

Debugging the snippet from above and checking the value held under SuspendThreadAddress after stepping into with the debugger.

 

If you go to the top menu in Visual Studio and select Debug -> Windows -> Disassembly (Ctrl + Alt + D by default) then you can bring up the Disassembly view. This option isn’t available unless you’re currently debugging the program.

 

An example of what Visual Studio’s Disassembly view looks like.

 

At the top of the Disassembly view there is a text label saying “Address” followed by a text-box control. We can put in the address we got back for SuspendThread (KERNEL32) and it will take us to the disassembly at that location. KERNEL32.DLL is loaded into the address space of our currently-debugged process, and the SuspendThread address we got back from GetProcAddress lies within the memory of the loaded KERNEL32 module, so we can view the disassembly using the Visual Studio debugger.

 

Highlight of the Address bar on the Disassembly view.

 

Disassembly for SuspendThread (KERNEL32).

 

Now we’re viewing the disassembly for the SuspendThread function prologue. Hey! What is going on here? Why are we simply redirecting execution flow to another address via a JMP instruction?

The truth is that the real implementation of SuspendThread has been located in another module, named KernelBase.dll (KERNELBASE), since Windows 8. On previous versions of Windows, such as Windows Vista and 7, it would have been in KERNEL32 as it is known to be. But many changes were made in Windows 8, and those changes still haunt us to this day on the latest version of Windows 10. On 32-bit versions of Windows, a 32-bit compiled copy of KernelBase.dll can be found under SystemDrive:\WINDOWS\System32\, and on 64-bit versions of Windows, a 32-bit compiled copy can be found under SystemDrive:\WINDOWS\SysWOW64\ and a 64-bit compiled copy can be found under SystemDrive:\WINDOWS\System32\.

 

System32 folder showing that KernelBase.dll is present.

 

We’re going to take a quick peek at KernelBase.dll in the Interactive Disassembler (IDA) for static disassembly. We’ll start by making sure that there’s an export available for the SuspendThread routine.

 

IDA Exports tab showing that SuspendThread is an export of KernelBase.dll.

 

Well would you look at that! There’s an export for SuspendThread under KernelBase.dll. We’ll take a look at it now.

 

Disassembly for SuspendThread (KERNELBASE).

 

Looking at the above disassembly, you may already have noticed the pink text which contains the key-word “NtSuspendThread”. Yes, SuspendThread (KERNELBASE) does call NtSuspendThread (NTDLL); after all, NtSuspendThread (NTDLL) performs the system call so that the real functionality, which resides in kernel-mode memory, can be invoked. The SuspendThread routine does a few other things aside from calling NtSuspendThread, and we’ll note these down now for educational purposes.

 

Pseudo-code of SuspendThread (KERNELBASE).

 

  1. Sets up two local variables.
    • result is returned at the end of the routine as the return value for the SuspendThread routine. The caller uses this value to determine whether the operation was successful.
    • v2 receives the ULONG written back through the second parameter of the NtSuspendThread call. If we remember back to NtSuspendThread, there was the PreviousSuspendCount parameter which we were passing as NULL; SuspendThread (KERNELBASE) returns this value to the caller if the operation is successful.
  2. Invokes NtSuspendThread (NTDLL), passing the address of the v2 local variable as the PreviousSuspendCount target (so the previous suspend count is written to v2).
  3. If the NtSuspendThread (NTDLL) call does not return STATUS_SUCCESS (which is the same as 0 – 0x00000000), invokes BaseSetLastNTError (Win32 API) and sets the value of result to -1 (which will indicate failure to the caller).
  4. If the NtSuspendThread (NTDLL) call does return STATUS_SUCCESS (successful), sets the value of result to the value held in the v2 variable.
  5. Returns the value of result.

NOTE: result and v2 aren’t the real variable names in the KernelBase.dll source-code. These names are generated automatically by IDA.
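The five steps above can be sketched in portable C. This is a model of the control flow as described in the list, not the KernelBase.dll source: StubNtSuspendThread stands in for NtSuspendThread (NTDLL) and simply pretends that any non-zero handle is a thread with a previous suspend count of 2, and ModelSetLastNtError stands in for BaseSetLastNTError but stores the raw status without converting it. All Model* and Stub* names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

typedef int32_t NTSTATUS;
#define STATUS_SUCCESS        ((NTSTATUS)0x00000000)
#define STATUS_INVALID_HANDLE ((NTSTATUS)0xC0000008)

/* Stand-in for the "last error" slot that GetLastError would read. */
uint32_t ModelLastError;

/* Stand-in for BaseSetLastNTError. The real routine converts the NTSTATUS
   to a Win32 error code first; this model just stores the raw status. */
void ModelSetLastNtError(NTSTATUS Status)
{
    ModelLastError = (uint32_t)Status;
}

/* Stand-in for NtSuspendThread (NTDLL): handle 0 is treated as invalid,
   anything else as a thread whose previous suspend count was 2. */
NTSTATUS StubNtSuspendThread(uintptr_t ThreadHandle, uint32_t *PreviousSuspendCount)
{
    if (ThreadHandle == 0)
        return STATUS_INVALID_HANDLE;
    if (PreviousSuspendCount != NULL)
        *PreviousSuspendCount = 2;
    return STATUS_SUCCESS;
}

/* Follows the five steps from the list above. */
uint32_t ModelSuspendThread(uintptr_t ThreadHandle)
{
    uint32_t result;                                            /* step 1 */
    uint32_t v2 = 0;                                            /* step 1 */
    NTSTATUS status = StubNtSuspendThread(ThreadHandle, &v2);   /* step 2 */

    if (status != STATUS_SUCCESS) {                             /* step 3 */
        ModelSetLastNtError(status);
        result = (uint32_t)-1;
    } else {                                                    /* step 4 */
        result = v2;
    }

    return result;                                              /* step 5 */
}
```

Calling ModelSuspendThread with a non-zero handle returns the stub’s previous count, while calling it with 0 returns (uint32_t)-1 and leaves the failing status in ModelLastError.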

 

It’s pretty straight-forward, and ResumeThread (KERNELBASE) follows the same routine, but we’ll look at it as well anyway.

 

ResumeThread (KERNELBASE).

 

We can see that the same process is being repeated except NtResumeThread (NTDLL) is being invoked instead of NtSuspendThread (NTDLL).

For the record, the BaseSetLastNTError routine will invoke RtlSetLastWin32Error. The end-result is the correct error code being set as the “last error”, so the caller can invoke GetLastError if the operation fails and acquire additional information regarding what went wrong. Along the way, the NTSTATUS error code returned by NtSuspendThread/NtResumeThread is converted to a DOS error code via RtlNtStatusToDosError.

 

We are going to end this section of the article here and progress on to discussing kernel-mode. The next section will show examples of how to suspend/resume a process from kernel-mode, along with explanations about how NtSuspendProcess/NtSuspendThread/NtResumeProcess/NtResumeThread actually work in the kernel.

 

Section 2 – Kernel-Mode


In kernel-mode, things are a lot different than in user-mode. For starters, you don’t have access to the Win32 API in kernel-mode. You can communicate with a user-mode process via Inter-Process Communication (IPC), or perform kernel-mode code injection targeting a user-mode process, to get Win32 API calls invoked, but this is not the same as directly using the Win32 API from kernel-mode, which you cannot do.

When working in kernel-mode, you have access to the Native API (NTAPI). The Native API includes the routines with an Nt* prefix which are exported by NTDLL (those NTDLL exports perform a system call, whereas in kernel-mode you don’t need to perform a system call), however you won’t have access to all of them by default. There are also kernel-mode only routines which don’t follow the Nt* style.

When a user-mode process directly or indirectly invokes an NTAPI routine and a system call is performed for the user-mode to kernel-mode transition, a kernel-mode routine such as KiSystemCall32/KiSystemCall64 will be invoked (on the latest versions of Windows 10, after the recent patch updates regarding Meltdown, there is now KiSystemCall32Shadow/KiSystemCall64Shadow/KiSystemCall32AmdShadow/KiSystemCall64AmdShadow). These routines will call other routines, and a kernel-mode routine named KiSystemServiceRepeat will eventually be invoked. The KiSystemServiceRepeat routine will access the System Service Descriptor Table (KeServiceDescriptorTable), which is nothing more than a dispatch table (another way of saying “array”) of addresses, where each entry represents the address of a Native API routine within the address space of NTOSKRNL (the Windows Kernel). There are routines which can be invoked from user-mode via a system call and there are routines which are “kernel-mode only” (and thus cannot be invoked from user-mode via a system call). The routines which can be invoked via a system call have an entry in the KeServiceDescriptorTable, and there’s also a “Shadow” version, KeServiceDescriptorTableShadow, which additionally gives access to the addresses of win32k.sys routines.
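To picture the “dispatch table is just an array” idea, here is a small portable model: an array of function pointers indexed by a service number, plus a dispatcher which bounds-checks the number and calls through the table. It is only a sketch of the concept. The service numbers 0 and 1 are made up (real numbers differ between Windows builds), every Model* name is hypothetical, and the real table’s entry encoding (on 64-bit Windows the entries are packed offsets rather than full pointers) is deliberately left out.

```c
#include <stdint.h>
#include <stddef.h>

typedef int32_t NTSTATUS;
#define STATUS_SUCCESS                ((NTSTATUS)0x00000000)
#define STATUS_INVALID_SYSTEM_SERVICE ((NTSTATUS)0xC000001C)

/* Every modelled service takes one argument to keep the table uniform. */
typedef NTSTATUS (*ServiceRoutine)(uintptr_t Argument);

/* Records which routine ran last, purely so the model can be observed. */
int ModelLastService = -1;

NTSTATUS ModelNtSuspendProcess(uintptr_t ProcessHandle)
{
    (void)ProcessHandle;
    ModelLastService = 0;
    return STATUS_SUCCESS;
}

NTSTATUS ModelNtResumeProcess(uintptr_t ProcessHandle)
{
    (void)ProcessHandle;
    ModelLastService = 1;
    return STATUS_SUCCESS;
}

/* The "dispatch table": the array index is the system call number. */
ServiceRoutine const ModelServiceTable[] = {
    ModelNtSuspendProcess,   /* hypothetical service number 0 */
    ModelNtResumeProcess     /* hypothetical service number 1 */
};

/* Roughly what a dispatcher does once it has the service number. */
NTSTATUS ModelDispatch(uint32_t ServiceNumber, uintptr_t Argument)
{
    size_t Count = sizeof(ModelServiceTable) / sizeof(ModelServiceTable[0]);

    if (ServiceNumber >= Count)
        return STATUS_INVALID_SYSTEM_SERVICE;

    return ModelServiceTable[ServiceNumber](Argument);
}
```

A routine that has no slot in the table simply cannot be reached through ModelDispatch, which mirrors the distinction between routines callable via a system call and “kernel-mode only” routines.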

This article isn’t about going through the process of how a system call works and how the kernel handles it; it’s about suspension of processes and how this mechanism works. However, this is all related to the topic because neither NtSuspendProcess nor NtSuspendThread is exported by NTOSKRNL. What does this mean? It means that by default, you do not have access to either of the routines when developing a kernel-mode device driver. This can be irritating if you have a genuine reason to suspend a process from kernel-mode, and you may not wish to communicate with a user-mode component to have it invoke NtSuspendProcess (NTDLL)/NtSuspendThread (NTDLL) for you.

There are two options in this scenario if you don’t wish to work with a user-mode component (which would be the most stable option available at this moment in time – and I suggest if you do have a user-mode component such as a Windows Service, you should make use of such properly).

  1. Access the KeServiceDescriptorTable manually and use this as an entry-point to locating the address of NtSuspendProcess/NtSuspendThread.
  2. Find out if there’s an exported kernel-mode routine which can be used for process/thread suspension.

The first method is very straight-forward on 32-bit systems because KeServiceDescriptorTable is actually exported by the Windows Kernel for 32-bit systems. This means you can find the address of KeServiceDescriptorTable effortlessly. Sadly, this isn’t the case for 64-bit systems. Microsoft implemented a feature called PatchGuard for 64-bit versions of Windows only, which first shipped with the x64 editions of Windows XP and Windows Server 2003 SP1. PatchGuard covers a wide range of functionality, and on those 64-bit systems the KeServiceDescriptorTable is also no longer exported. Please do not be confused: accessing the System Service Descriptor Table on a 64-bit system will not cause a BSOD as long as the calculations are correct, however it’s more hassle to make use of it due to the fact that it is no longer exported for 64-bit systems.

If you want to go down the route of accessing the System Service Descriptor Table on a 64-bit system, it isn’t all that complicated. You can find the address of KiSystemCall64/KiSystemCall64Shadow/KiSystemCall64AmdShadow via the IA32_LSTAR Model Specific Register (MSR), and then you can locate the non-exported kernel-mode routine KiSystemServiceRepeat not far off from the address held in the IA32_LSTAR MSR. As we’ve already established earlier on, the KiSystemServiceRepeat routine accesses the KeServiceDescriptorTable, so the address of the table is exposed in its instructions. People have been using this technique to locate the SSDT on 64-bit systems for years now, but you should avoid doing this unless you really need to, because Microsoft can change something at any time and prevent your code from working on the updated systems.

Thankfully, the idea noted in our second option is a valid option for us when it comes to process suspension from kernel-mode. As it turns out, there’s a routine exported by NTOSKRNL which is called by NtSuspendProcess. The sad part is that there is no exported routine for suspending a single thread, but I suspect most people trying to suspend from kernel-mode will be targeting a whole process rather than only certain threads of a process. The exported routine is called PsSuspendProcess, and it isn’t officially documented by Microsoft.

 

Exports of NTOSKRNL – highlighting PsSuspendProcess.

 

We’re going to take a look at how PsSuspendProcess works, but we’re going to take a look at NtSuspendProcess beforehand. As previously noted, NtSuspendProcess is not exported by NTOSKRNL, however it does have an entry in the System Service Descriptor Table, which holds the address of its routine within the address space of NTOSKRNL. If this wasn’t the case then the Windows Kernel wouldn’t know the location in memory of the routine when a system call was performed for it to be invoked.

 

Disassembly for NtSuspendProcess (NTOSKRNL).

 

In the NtSuspendProcess routine, there are two important things which will happen.

  1. The handle is used to retrieve an object. The handle is passed in to the routine as the parameter which we learnt about during the user-mode section, however the routine doesn’t actually send the handle anywhere itself. Instead, it uses the handle to retrieve a pointer to the process object via an undocumented, non-exported kernel-mode only routine named ObpReferenceObjectByHandleWithTag. ObpReferenceObjectByHandleWithTag works by accessing an undocumented, non-exported table stored in the kernel; the routine will use other routines such as ExpLookupHandleTableEntry. An object for a process in kernel-mode is a pointer (PEPROCESS) to the EPROCESS kernel-mode structure for that process. The EPROCESS structure contains many fields which store data about that process. There’s also the KPROCESS structure, which is the first field of the EPROCESS structure and contains a lot of data about the process in question.
  2. Invoke PsSuspendProcess. NtSuspendProcess doesn’t really do anything in itself; it just forwards execution control to PsSuspendProcess. The only reason it calls ObpReferenceObjectByHandleWithTag in advance to retrieve the process object (a PEPROCESS, which is a pointer to EPROCESS, to be precise) is because PsSuspendProcess does not accept a handle as the parameter, but accepts an object instead.

If you go snooping around NtSuspendProcess with static disassembly and try to generate pseudo-code for it, you’ll likely get back a messy view which will appear ugly at first. I cleaned up the data-types and variable names a bit so it is a bit more pleasant and understandable when looking at it.

 

Pseudo-code for NtSuspendProcess (NTOSKRNL).

 

The following is done in-order.

  1. Two local variables are set up. One has a data-type of NTSTATUS and the other has a data-type of PEPROCESS (pointer to EPROCESS). Of course, the variable names in the screen-shot are not the same ones used in the Windows source code; I re-named them. The variable names in the earlier screen-shots, like “v2”, aren’t really from the source code either. IDA just sets them to these by default and you can change them while reversing.
  2. ObpReferenceObjectByHandleWithTag is invoked, and its return value, which is an NTSTATUS code, is assigned to the local variable which has a data-type of NTSTATUS.
  3. The invocation of ObpReferenceObjectByHandleWithTag will assign a value to the local variable which has a data-type of PEPROCESS (which I named Process). If you look at the 6th parameter for the ObpReferenceObjectByHandleWithTag call, we’re passing a pointer to our PEPROCESS variable; the called routine uses this pointer to set the value of our variable.
  4. If the NtStatus value represents success (NT_SUCCESS -> >= 0) then the routine progresses to call PsSuspendProcess, and afterwards it performs a clean-up operation by releasing the reference to the process object which was previously retrieved, via ObfDereferenceObjectWithTag.
  5. The NtStatus value is returned by NtSuspendProcess back to the caller.

I think it’s about time we take a look at the famous PsSuspendProcess routine, don’t you?

 

15
Pseudo-code for PsSuspendProcess (NTOSKRNL).

 

Here’s a break-down of how the routine works, I’ll stick to the core parts.

  1. The threads of the targeted process are enumerated via PsGetNextProcessThread and a while loop. PsGetNextProcessThread is first called to return a PETHREAD (pointer to the ETHREAD structure) for the first thread found within the targeted process; an operation is performed using the returned PETHREAD, then PsGetNextProcessThread is called again and the loop re-starts. PsGetNextProcessThread is a non-exported, kernel-mode only routine. When there are no more threads to be found, the returned PETHREAD will be NULL. This is caught by the conditional statement which checks the variable holding the next PETHREAD, and at this point the break instruction is used to exit the while loop since there are no threads left to enumerate.
  2. For each enumerated thread for which a PETHREAD can be acquired, the undocumented and non-exported kernel-mode only routine named PsSuspendThread will be called, passing the PETHREAD as a parameter.
  3. When PsGetNextProcessThread returns NULL, the loop is exited and the routine returns an NTSTATUS code (which could represent success or failure). The return status for PsSuspendProcess will always be STATUS_SUCCESS unless the targeted process is in a state preparing for termination.
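The enumeration loop described in the steps above can be modelled in plain user-mode C. This is only a sketch under stated assumptions: MOCK_THREAD, MOCK_PROCESS, MockGetNextProcessThread, MockSuspendThread and MockSuspendProcess are hypothetical stand-ins written for illustration. They are not the kernel's real structures or signatures (the real routines are undocumented), and the Terminating flag is a simplification of the check that makes the real routine fail for a process that is preparing for termination.

```c
#include <stddef.h>

/* Hypothetical stand-ins for ETHREAD/EPROCESS; the real layouts are opaque. */
typedef struct MOCK_THREAD {
    int SuspendCount;
    struct MOCK_THREAD *Next;
} MOCK_THREAD;

typedef struct MOCK_PROCESS {
    MOCK_THREAD *FirstThread;
    int Terminating;              /* models "preparing for termination" */
} MOCK_PROCESS;

#define MOCK_STATUS_SUCCESS                0x00000000u
#define MOCK_STATUS_PROCESS_IS_TERMINATING 0xC000010Au

/* Pass NULL to get the first thread; returns NULL once the list is exhausted. */
static MOCK_THREAD *MockGetNextProcessThread(MOCK_PROCESS *Process,
                                             MOCK_THREAD *Previous)
{
    return Previous ? Previous->Next : Process->FirstThread;
}

/* Stand-in for the per-thread suspend call. */
static void MockSuspendThread(MOCK_THREAD *Thread)
{
    Thread->SuspendCount++;
}

unsigned int MockSuspendProcess(MOCK_PROCESS *Process)
{
    MOCK_THREAD *Thread;

    if (Process->Terminating)
        return MOCK_STATUS_PROCESS_IS_TERMINATING;

    /* Walk every thread; a NULL result ends the loop. */
    Thread = MockGetNextProcessThread(Process, NULL);
    while (Thread != NULL) {
        MockSuspendThread(Thread);
        Thread = MockGetNextProcessThread(Process, Thread);
    }

    return MOCK_STATUS_SUCCESS;
}
```

Note that, as in the description above, the per-thread result is ignored: the model only ever reports success or "process is terminating".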

We should see what routine PsSuspendThread will call.

 

16
Pseudo-code for PsSuspendThread (NTOSKRNL).

 

PsSuspendThread will call KeSuspendThread, and the value returned by KeSuspendThread is written through the second parameter of PsSuspendThread (the previous suspension count). However, PsSuspendProcess doesn’t care about the second parameter and thus it doesn’t check the value returned by KeSuspendThread. PsSuspendProcess will only return STATUS_SUCCESS or STATUS_PROCESS_IS_TERMINATING.

The fuss regarding the acquisition of “run-down protection” is about preventing the thread from being “terminated” during the operation. Termination mid-operation would cause system instability and would likely bug-check the system, because the kernel would then be operating on an object which is no longer valid.

 

17
Pseudo-code for KeSuspendThread.

 

Now here is a more interesting part, but it should get a bit more interesting when we move to KiSuspendThread.

KeSuspendThread invokes a routine called RtlRaiseStatus, but only when a conditional statement is met. The conditional statement determines whether a thread is being suspended after its maximum suspension count has already been reached. If we remember back to the NtSuspendThread invocation, we had a parameter for the previous suspension count which could be returned to us – SuspendThread (KERNEL32/KERNELBASE) was happily making use of it and returning it as the return value of the routine. Well, it turns out that the maximum suspension count is 127. If the conditional statement is met, a status error code is raised with RtlRaiseStatus. The error code being raised, which is displayed as 0xC000004Ai64, actually translates to 0xC000004A, which is STATUS_SUSPEND_COUNT_EXCEEDED.
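To make the counting rule concrete, here is a small user-mode model of it. Everything named Mock* or MOCK_* is hypothetical and exists only for this sketch. The real KeSuspendThread raises the status via RtlRaiseStatus rather than returning it, and the resume side is assumed to simply decrement the count, which matches the documented behaviour of SuspendThread/ResumeThread but is not taken from the disassembly.

```c
/* Values taken from the text: the cap is 127 and the raised status
   is 0xC000004A (STATUS_SUSPEND_COUNT_EXCEEDED). */
#define MOCK_MAXIMUM_SUSPEND_COUNT         127
#define MOCK_STATUS_SUCCESS                0x00000000u
#define MOCK_STATUS_SUSPEND_COUNT_EXCEEDED 0xC000004Au

/* Hypothetical stand-in for the suspension data kept in KTHREAD. */
typedef struct MOCK_KTHREAD {
    int SuspendCount;
} MOCK_KTHREAD;

/* On success, *PreviousCount receives the count from before the call. */
unsigned int MockKeSuspendThread(MOCK_KTHREAD *Thread, int *PreviousCount)
{
    if (Thread->SuspendCount == MOCK_MAXIMUM_SUSPEND_COUNT) {
        /* The real routine raises this status instead of returning it. */
        return MOCK_STATUS_SUSPEND_COUNT_EXCEEDED;
    }

    *PreviousCount = Thread->SuspendCount;
    Thread->SuspendCount++;
    return MOCK_STATUS_SUCCESS;
}

/* Returns the previous count; the count never drops below zero. */
int MockKeResumeThread(MOCK_KTHREAD *Thread)
{
    int Previous = Thread->SuspendCount;

    if (Previous > 0)
        Thread->SuspendCount--;

    return Previous;
}

/* A thread only runs again once every suspend has been matched by a resume. */
int MockIsRunnable(const MOCK_KTHREAD *Thread)
{
    return Thread->SuspendCount == 0;
}
```

In this model a thread suspended twice needs two resumes before MockIsRunnable reports it as runnable, and the 128th consecutive suspend fails without changing the count.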

The KiSuspendThread routine is called towards the end of the KeSuspendThread routine. As expected, KiSuspendThread is another undocumented and non-exported kernel-mode only routine. KiSuspendThread is actually a bit more interesting though, because it exposes how the whole suspension mechanism in Windows actually works – you may be very surprised at how minimal it truly is, and it relies on a documented programming technique which most developers will have used both in kernel-mode and user-mode at-least once in their development time.

 

18
Pseudo-code for KiSuspendThread (NTOSKRNL) 1/2.

 

19
Pseudo-code for KiSuspendThread (NTOSKRNL) 2/2.

 

Well, would you look at that!

We have a call to KiInsertQueueApc, which is a step involved in dispatching an Asynchronous Procedure Call (APC). Asynchronous Procedure Calls are used for communication all the time. The way they work is that you target a thread for the APC and force that thread to execute the callback routine for the APC event, which in turn forces the targeted thread to execute the code you wish it to execute. They have also been abused for thread hijacking as an entry-point for code injection for a number of years now, and there’s a technique named Atom Bombing which relies on APC injection as well. If we wanted to inject code into a user-mode process from kernel-mode, we could rely on KeInitializeApc and KeInsertQueueApc – of course, the routine we are looking at uses the non-exported variants though.

The KiInsertQueueApc does exactly what the routine name implies. It inserts an APC event into a queue. It pretty much sets up the PKAPC structure which is used for the APC dispatch operation – the PKAPC structure (pointer to KAPC structure) holds data such as the environment for the APC event (e.g. KernelMode or UserMode), the targeted thread, among other data.

The KiSignalThreadForApc also does what the routine name implies. It will signal the thread for the APC event as far as I am aware, and this relies on KiSignalThread (another non-exported, undocumented kernel-mode routine). However, a flag named KiDisableLightWeightSuspend must be set to TRUE for the KiSignalThreadForApc operation to occur, among several other conditional statements.

The KiSuspendThread routine will use the ETHREAD structure (PETHREAD because it’s a pointer to the ETHREAD structure) for the current thread being put into a “suspended” state.

Under the KTHREAD structure there is data regarding thread suspension, for example, a SuspendCount field. The kernel will update such data regarding thread suspension and then the thread will be held up waiting for the data to be updated again via KeWaitForSingleObject. An Asynchronous Procedure Call is performed so the KeWaitForSingleObject call can be made on the targeted thread and this wait ends when the data is updated again to remove the suspend state.

If you were expecting some sort of super-human mechanism for “thread suspension” then I am sorry for letting you down. The suspension mechanism revolves around waiting for the semaphore data to be updated to indicate the thread should be resumed – the wait is executed in the address space of the targeted process, via the APC which was dispatched, to hold the targeted thread. It is possible that the semaphore part is not exactly correct for the latest versions of Windows: I first heard about this many years ago, changes may have been made since, and I have been unable to completely verify that the operation still works like this – but it makes complete sense and I doubt it is inaccurate.

We have all this talk about thread suspension, but we’ve forgotten all about thread resuming. It isn’t fair for us to give all the attention to PsSuspendProcess, KeSuspendThread and KiSuspendThread… We need to give some love to PsResumeProcess! That’s right, there’s an exported kernel-mode routine named PsResumeProcess!

 

23
Exports for NTOSKRNL, show-casing that PsResumeProcess is an export.

 

Well this is exciting… I’m going to take a leap with my confidence and estimate what the PsResumeProcess routine will do.

  1. Enumerate through all the threads of the targeted process
  2. Invoke a routine named KeResumeThread
  3. Return an NTSTATUS error code (either STATUS_SUCCESS or STATUS_PROCESS_IS_TERMINATING).

 

24
Pseudo-code for PsResumeProcess (NTOSKRNL).

 

Looks like my estimation was right. Some may call me a cheater but that will just be the jealousy talking!

 

The PsResumeProcess routine will be called by NtResumeProcess, and we can verify these findings by checking the routine disassembly.

 

25
Disassembly of NtResumeProcess (NTOSKRNL).

 

I highlighted in red the call instruction being used for ObpReferenceObjectByHandleWithTag and PsResumeProcess. The only difference between NtSuspendProcess and NtResumeProcess is that the former will be calling PsSuspendProcess and the latter will be calling PsResumeProcess.

 

Since we’re looking at those Nt* stubs, we may as well take a brief look at NtSuspendThread and NtResumeThread to see what they will call.

 

26
Pseudo-code for NtSuspendThread (NTOSKRNL).

 

NtSuspendThread will just lead down a path of PsSuspendThread, which is also called in PsSuspendProcess for each found thread during the enumeration operation. Nothing we’ve not seen done before.

What about NtResumeThread though? We really need to stop giving all the attention and love to thread suspension and treat thread resuming the same or it might turn rogue out of anger! 😉

 

27
Pseudo-code for NtResumeThread (NTOSKRNL).

 

NtResumeThread will simply call PsResumeThread, which is also called by PsResumeProcess during the thread enumeration operation. Nothing too interesting here either sadly because we’ve already seen it all.

We’ve done a lot of discussing and not a lot of programming so it’s time we brought back some C code. This time, the difference is that the example C source-code will be for kernel-mode software and not for user-mode software; there are some pre-requisites before you can start developing kernel-mode software (e.g. kernel-mode device drivers), however I shouldn’t have to lay these out because you shouldn’t be attempting such a task without having some background in Windows kernel-mode software engineering in the first place.

For tutorials sake, I will note the following.

  1. Download the latest version of Visual Studio (Visual Studio 2017 at the time of writing this) and install it.
  2. Download Windows 10 SDK (latest version) and install it.
  3. Download Windows Driver Kit (WDK – latest version) and install it.

You can check the following official resource from Microsoft to help get into kernel-mode development: https://docs.microsoft.com/en-us/windows-hardware/drivers/develop/

 

The example source code is going to more-or-less replicate NtSuspendProcess/NtResumeProcess; no access to the System Service Descriptor Table required.

  1. We do not have access to ObpReferenceObjectByHandleWithTag, but we do have access to ObReferenceObjectByHandle. ObReferenceObjectByHandle leads down the same path and it will allow us to obtain a pointer to the process object by providing a valid handle to it.
  2. We do have access to PsSuspendProcess/PsResumeProcess, since both are exported by NTOSKRNL.

Let’s get on with it!

 

First things first, we need to setup our type-definitions. I’m using a header file specifically for the C file which is going to contain this.

// Guarded in case the WDK headers in use already define it
#ifndef PROCESS_SUSPEND_RESUME
#define PROCESS_SUSPEND_RESUME 0x0800
#endif

typedef NTSTATUS(NTAPI *pPsSuspendProcess)(
    PEPROCESS Process
);

typedef NTSTATUS(NTAPI *pPsResumeProcess)(
    PEPROCESS Process
);

The reason I’ve also defined PROCESS_SUSPEND_RESUME is because these “specific” access rights aren’t available in the WDK libraries by default, and we need that access right for process suspension/resume operations. We don’t need to have more privileges therefore we shouldn’t try to gain such.

I got the PROCESS_SUSPEND_RESUME definition from MSDN: https://msdn.microsoft.com/en-gb/library/windows/desktop/ms684880

We already know that PsSuspendProcess and PsResumeProcess take in only one parameter of data-type PEPROCESS because of earlier when we took a look at the routines with static reverse engineering and saw what the NtSuspendProcess/NtResumeProcess routines were doing before invoking the Ps* routines. Because of ObpReferenceObjectByHandleWithTag, we know that the parameter is of type PEPROCESS (EPROCESS*).

The next thing we need to do is retrieve the address to PsSuspendProcess and PsResumeProcess. Since PsSuspendProcess and PsResumeProcess are both exported by NTOSKRNL, this will be very simple for us to do. There’s a routine named MmGetSystemRoutineAddress.

PVOID MmGetSystemRoutineAddress(
    _In_ PUNICODE_STRING SystemRoutineName
);

According to the Microsoft documentation, this routine takes in one parameter which is of PUNICODE_STRING data-type.

We’ll continue by doing the following.

  1. Setup a routine to return us the address via MmGetSystemRoutineAddress
  2. Setup the addresses for PsSuspendProcess and PsResumeProcess
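// Resolved at run-time by InitializeExports; NULL until then
pPsSuspendProcess fPsSuspendProcess;
pPsResumeProcess fPsResumeProcess;

// Returns the address of an exported kernel routine, or NULL if not found.
// MmGetSystemRoutineAddress must be called at IRQL = PASSIVE_LEVEL.
PVOID ReturnSystemRoutineAddress(
    WCHAR *RoutineName
)
{
    UNICODE_STRING RoutineNameUs = { 0 };

    if (!RoutineName)
    {
        return NULL;
    }

    RtlInitUnicodeString(&RoutineNameUs,
        RoutineName);

    return MmGetSystemRoutineAddress(&RoutineNameUs);
}

// Call once (e.g. from DriverEntry) before using the wrappers
NTSTATUS InitializeExports(VOID)
{
    fPsSuspendProcess = (pPsSuspendProcess)ReturnSystemRoutineAddress(
        L"PsSuspendProcess");

    fPsResumeProcess = (pPsResumeProcess)ReturnSystemRoutineAddress(
        L"PsResumeProcess");

    // Fail if either export could not be resolved
    if (!fPsSuspendProcess ||
        !fPsResumeProcess)
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    return STATUS_SUCCESS;
}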

That should do the trick. Remember, you will need to make sure the InitializeExports routine is called before you attempt to make use of the fPsSuspendProcess/fPsResumeProcess variables being setup, otherwise they will point to NULL and you’ll cause a bug-check crash due to dereferencing a NULL pointer.

The next thing we need to do is setup our wrapper routines to invoke PsSuspendProcess and PsResumeProcess. This isn’t a necessity of course, you could just call fPsSuspendProcess/fPsResumeProcess or whatever you named those global variables to point to the correct address with the function type-definition set to it, but I find it easier to manage when you have a wrapper routine for each one – this means if there is ever an issue with the call itself, you can patch the issue by changing one routine instead of many. It also looks more appealing to me to have wrapper routines, so this is what I’ll be doing in this article.

All we need is two routines which return an NTSTATUS code (STATUS_SUCCESS or an error; this will be the value returned by PsSuspendProcess/PsResumeProcess). Within each routine we need to check whether the fPsSuspendProcess or fPsResumeProcess variable is NULL. Otherwise, if the routine is accidentally called while the address is NULL, we’ll be calling through a NULL pointer and end up bug-checking the system, which would be really silly. If the address is NULL then we can return STATUS_INSUFFICIENT_RESOURCES since this implies we do not have the required resources to complete the operation, which would be truthful because the addresses are a “resource” needed to perform the suspension/resume operation.

NTSTATUS NTAPI PsSuspendProcess(
    PEPROCESS Process
)
{
    if (!fPsSuspendProcess)
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    return fPsSuspendProcess(Process);
}

NTSTATUS NTAPI PsResumeProcess(
    PEPROCESS Process
)
{
    if (!fPsResumeProcess)
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    return fPsResumeProcess(Process);
}

The last thing we have to do to implement the process suspension/resume functionality in kernel-mode is make our wrapper routines for NtSuspendProcess and NtResumeProcess. Why think about locating the address to NtSuspendProcess (NTOSKRNL) or NtResumeProcess (NTOSKRNL) at all when you can have your own wrapper routine which does the same thing as the official one? Well, I can think of a few reasons why regarding stability… But don’t ruin the moment!

We have a few approaches for the NtSuspendProcess/NtResumeProcess wrapper.

  1. We can use the documented, kernel-mode only routine named PsLookupProcessByProcessId to obtain an object to the targeted process and we can then pass this process object (PEPROCESS) to the PsSuspendProcess/PsResumeProcess wrapper routines.
  2. We can accept a handle parameter to the NtSuspendProcess/NtResumeProcess wrapper routines the same way the official NtSuspendProcess/NtResumeProcess routines under NTOSKRNL do, and then we can use the handle to obtain an object to the process.

We’re going to go with the second option to keep it closer to the real NtSuspendProcess/NtResumeProcess, but either is fine. Personally, I’d go for using the process object instead of a handle from the start if I was working in kernel-mode, but that’s just me.

Thankfully, we have the ability to make use of the ObReferenceObjectByHandle routine. It’s also officially documented. It isn’t the same one used in the NtSuspendProcess/NtResumeProcess routines in the Windows Kernel but it leads down the same path so it’s perfectly fine.

NTSTATUS NTAPI NtSuspendProcess(
    HANDLE ProcessHandle
)
{
    NTSTATUS NtStatus = STATUS_SUCCESS;
    PEPROCESS Process = NULL;

    if (!ProcessHandle)
    {
        return STATUS_INVALID_HANDLE;
    }

    // Turn the handle into a referenced process object (PEPROCESS)
    NtStatus = ObReferenceObjectByHandle(ProcessHandle,
        PROCESS_SUSPEND_RESUME,
        *PsProcessType,
        KernelMode,
        (PVOID *)&Process,
        NULL);

    if (!NT_SUCCESS(NtStatus))
    {
        // Propagate the real failure reason to the caller
        return NtStatus;
    }

    NtStatus = PsSuspendProcess(Process);

    // Release the reference taken by ObReferenceObjectByHandle
    ObDereferenceObject(Process);

    return NtStatus;
}

NTSTATUS NTAPI NtResumeProcess(
    HANDLE ProcessHandle
)
{
    NTSTATUS NtStatus = STATUS_SUCCESS;
    PEPROCESS Process = NULL;

    if (!ProcessHandle)
    {
        return STATUS_INVALID_HANDLE;
    }

    // Turn the handle into a referenced process object (PEPROCESS)
    NtStatus = ObReferenceObjectByHandle(ProcessHandle,
        PROCESS_SUSPEND_RESUME,
        *PsProcessType,
        KernelMode,
        (PVOID *)&Process,
        NULL);

    if (!NT_SUCCESS(NtStatus))
    {
        // Propagate the real failure reason to the caller
        return NtStatus;
    }

    NtStatus = PsResumeProcess(Process);

    // Release the reference taken by ObReferenceObjectByHandle
    ObDereferenceObject(Process);

    return NtStatus;
}

 

If you’re conscious about sticking to as little code as possible for whichever reason, you could make another wrapper routine which takes in the process handle and a BOOLEAN flag to determine if the process should be suspended or resumed. Then, depending on the flag, use the returned process object with the PsSuspendProcess/PsResumeProcess wrapper routines. This would save you from checking the ProcessHandle parameter, calling ObReferenceObjectByHandle and checking the status codes in both routines, which would cut down a few lines of code – but the end-result would be identical.
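As a rough illustration of that single-routine idea, here is a sketch which builds outside the WDK. The name SuspendOrResumeProcess is hypothetical, and so is everything in the stand-in section: the Mock* routines only imitate ObReferenceObjectByHandle, ObDereferenceObject and the PsSuspendProcess/PsResumeProcess wrappers so that the control flow can be exercised in user-mode. In a real driver you would delete the stand-ins and call the real routines instead.

```c
#include <stddef.h>
#include <stdint.h>

/* ---- Stand-ins so the sketch builds outside the WDK (all hypothetical) ---- */
typedef int32_t NTSTATUS;
typedef unsigned char BOOLEAN;
typedef void *HANDLE;
typedef struct MOCK_EPROCESS {
    int SuspendCount;   /* models how many times the process was suspended */
    int References;     /* models the object reference count */
} MOCK_EPROCESS, *PEPROCESS;

#define STATUS_SUCCESS        ((NTSTATUS)0x00000000)
#define STATUS_INVALID_HANDLE ((NTSTATUS)0xC0000008)
#define NT_SUCCESS(Status)    (((NTSTATUS)(Status)) >= 0)

/* Imitates ObReferenceObjectByHandle: here a "handle" is just the object. */
static NTSTATUS MockReferenceProcess(HANDLE Handle, PEPROCESS *Process)
{
    *Process = (PEPROCESS)Handle;
    (*Process)->References++;
    return STATUS_SUCCESS;
}

/* Imitates ObDereferenceObject. */
static void MockDereferenceProcess(PEPROCESS Process)
{
    Process->References--;
}

/* Imitate the PsSuspendProcess/PsResumeProcess wrappers. */
static NTSTATUS MockSuspendProcess(PEPROCESS Process)
{
    Process->SuspendCount++;
    return STATUS_SUCCESS;
}

static NTSTATUS MockResumeProcess(PEPROCESS Process)
{
    if (Process->SuspendCount > 0)
        Process->SuspendCount--;
    return STATUS_SUCCESS;
}

/* ---- The idea itself: one routine, one flag ---- */
NTSTATUS SuspendOrResumeProcess(HANDLE ProcessHandle, BOOLEAN Suspend)
{
    PEPROCESS Process = NULL;
    NTSTATUS NtStatus;

    if (!ProcessHandle)
        return STATUS_INVALID_HANDLE;

    /* The handle check, reference and status check now exist only once. */
    NtStatus = MockReferenceProcess(ProcessHandle, &Process);
    if (!NT_SUCCESS(NtStatus))
        return NtStatus;

    NtStatus = Suspend ? MockSuspendProcess(Process)
                       : MockResumeProcess(Process);

    MockDereferenceProcess(Process);
    return NtStatus;
}
```

The two public routines could then shrink to one-line calls passing TRUE or FALSE, at the cost of one extra parameter at every call site.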

 

I tested it out by opening a handle to notepad.exe with PROCESS_SUSPEND_RESUME access rights and then passing the handle to the NtSuspendProcess routine. I set my DriverUnload routine to call NtResumeProcess on the process so it would be resumed after the driver was unloaded (experimental purposes).

 

28
Testing PsSuspendProcess with a remote kernel-debugger attached under an analysis environment 1/2.

 

At this moment in time, notepad.exe had been suspended by the experimental kernel-mode device driver.

 

29
Testing PsSuspendProcess with a remote kernel-debugger attached under an analysis environment 2/2.

 

This happened after unloading the driver, because the NtResumeProcess wrapper was called, targeting the notepad.exe suspended process. The process was successfully resumed.

If you look at the screen-shot in which notepad.exe was in a suspended state, you’ll notice the threads were suspended. If those threads weren’t suspended, then the process would not be “suspended”. After the process has been resumed in the above screen-shot, the threads are “resumed” (aka. “running”/”active”).

You can even see in the screen-shot where the process is suspended that the “State” details in the Process Hacker pop-up Properties window (-> Threads tab) say “Wait:Suspended (1)” – the thread is being held up the same way a normal developer may call WaitForSingleObject (KeWaitForSingleObject is the kernel-mode equivalent) on a thread of their own in user-mode.

 

That’s all for this post, I’d like to thank you for reading up to this point and hopefully this post was found to be useful!

– Opcode

 

Twitter: https://twitter.com/NtOpcode

Anatomy of the Process Environment Block (PEB) (Windows Internals)

The Process Environment Block (PEB) is a wonderful thing, and I’d be lying if I told you that I didn’t love it. It has been present in Windows since the introduction of Win2k (Windows 2000) and it has been improved through newer versions of Windows ever since. On earlier versions of Windows, it could be abused to do some nasty things like hiding loaded modules present within a process (to prevent them from being found – obviously this is not a beautiful thing though).

What is this magic so-called “Process Environment Block (PEB)”? The PEB is a structure which holds data about the current process in its fields – some fields being structures themselves to hold even more data. Every process has its own PEB, and the Windows Kernel also has access to the PEB of every user-mode process so it can keep track of certain data stored within it.

Where does this sorcery come from? The PEB structure comes from the Windows Kernel (although it is accessible in user-mode as well). The PEB is reached through the Thread Environment Block (TEB), which also happens to be commonly referred to as the Thread Information Block (TIB): the TEB holds a pointer to the PEB. The TEB is responsible for holding data about the current thread – every thread has its own TEB structure.

Can the Thread Environment Block or the Process Environment Block be abused for malicious purposes? Of course they can! In fact, they have been abused for malicious purposes in the past but Microsoft has made many changes over the recent years to help prevent this. An example would be in the past where rootkits would inject a DLL into another running process, and then access the PEB structure of the current process they had injected into (the PPEB structure is a pointer to the PEB structure) so they could locate the list of loaded modules and remove their own module from the list… Thus hiding their injected module from view when someone enumerates the loaded modules of the affected process. This is known as memory patching because you would be modifying memory by patching the PEB. Microsoft’s mitigation for this behavior was to prevent the manual altering of the list which represents the loaded modules in user-mode – you can still access it for reading the data in user-mode though and you can still patch the memory from kernel-mode.

This article will be split up into two different sections: theory and user-mode practical.

Theoretical


We’re going to take a look at the Thread Environment Block (TEB) structure using WinDbg. Since the TEB structure is available in user-mode, and used by user-mode Windows components such as NTDLL and KERNEL32, we won’t require kernel-debugging to query about the structure.

Bear in mind that you will need to have your symbols correctly setup otherwise you will fail with the next upcoming steps, please see the following URL: https://msdn.microsoft.com/en-us/library/windows/desktop/ee416588(v=vs.85).aspx

We’ll start by opening up WinDbg – I’ll be opening up the 64-bit version.

1
WinDbg default view.

Now we’ll open up notepad.exe. Once it is open, we can attach to notepad.exe in WinDbg by going to File -> Attach to a Process -> notepad.exe. Alternatively, you can use the default hot-key which should be F6.

2
Attaching to a process via WinDbg. 1/2
3
Attaching to a process via WinDbg. 2/2

After doing this, the WinDbg command window will be displayed. The command window is the work-space we will have to enter commands at our own discretion to get back various desired results. For example, if we wish to manipulate something, or query information about something, we can do this with a command. WinDbg has a whole wide-range of commands available and you can learn more about that here: http://windbg.info/doc/1-common-cmds.html

We’ll be using the dt command. “dt” stands for “Display Type” and can be used to display information about a specific data-type, including structures. In our case, it is more than appropriate because it supports structures and we need to find out information about the TEB structure.

We can use the following command to query information about the TEB structure.

dt ntdll!_TEB
4
WinDbg command (dt) for the _TEB structure.

We can see already that there are many fields in the structure, so many that they don’t all fit in a single image. However, if we look towards the very top of the structure, we’ll find the Process Environment Block’s field.

5
Highlighting the ProcessEnvironmentBlock field of the _TEB structure.

We can see that WinDbg is labelling the data-type for the field as “Ptr64 _PEB”. This simply means that the data-type is a pointer to the PEB structure (PPEB). Since we are debugging a 64-bit compiled program (notepad.exe since our OS architecture is 64-bit), the addresses are 8 bytes instead of 4 bytes like on a 32-bit environment, which is why “64” is appended to the “Ptr”.

We can view the fields of the PEB structure with the following WinDbg command.

dt ntdll!_PEB

6

7
WinDbg command (dt) for the _PEB structure.

The WinDbg output is below.

0:007> dt ntdll!_PEB
 +0x000 InheritedAddressSpace : UChar
 +0x001 ReadImageFileExecOptions : UChar
 +0x002 BeingDebugged : UChar
 +0x003 BitField : UChar
 +0x003 ImageUsesLargePages : Pos 0, 1 Bit
 +0x003 IsProtectedProcess : Pos 1, 1 Bit
 +0x003 IsImageDynamicallyRelocated : Pos 2, 1 Bit
 +0x003 SkipPatchingUser32Forwarders : Pos 3, 1 Bit
 +0x003 IsPackagedProcess : Pos 4, 1 Bit
 +0x003 IsAppContainer : Pos 5, 1 Bit
 +0x003 IsProtectedProcessLight : Pos 6, 1 Bit
 +0x003 IsLongPathAwareProcess : Pos 7, 1 Bit
 +0x004 Padding0 : [4] UChar
 +0x008 Mutant : Ptr64 Void
 +0x010 ImageBaseAddress : Ptr64 Void
 +0x018 Ldr : Ptr64 _PEB_LDR_DATA
 +0x020 ProcessParameters : Ptr64 _RTL_USER_PROCESS_PARAMETERS
 +0x028 SubSystemData : Ptr64 Void
 +0x030 ProcessHeap : Ptr64 Void
 +0x038 FastPebLock : Ptr64 _RTL_CRITICAL_SECTION
 +0x040 AtlThunkSListPtr : Ptr64 _SLIST_HEADER
 +0x048 IFEOKey : Ptr64 Void
 +0x050 CrossProcessFlags : Uint4B
 +0x050 ProcessInJob : Pos 0, 1 Bit
 +0x050 ProcessInitializing : Pos 1, 1 Bit
 +0x050 ProcessUsingVEH : Pos 2, 1 Bit
 +0x050 ProcessUsingVCH : Pos 3, 1 Bit
 +0x050 ProcessUsingFTH : Pos 4, 1 Bit
 +0x050 ProcessPreviouslyThrottled : Pos 5, 1 Bit
 +0x050 ProcessCurrentlyThrottled : Pos 6, 1 Bit
 +0x050 ReservedBits0 : Pos 7, 25 Bits
 +0x054 Padding1 : [4] UChar
 +0x058 KernelCallbackTable : Ptr64 Void
 +0x058 UserSharedInfoPtr : Ptr64 Void
 +0x060 SystemReserved : Uint4B
 +0x064 AtlThunkSListPtr32 : Uint4B
 +0x068 ApiSetMap : Ptr64 Void
 +0x070 TlsExpansionCounter : Uint4B
 +0x074 Padding2 : [4] UChar
 +0x078 TlsBitmap : Ptr64 Void
 +0x080 TlsBitmapBits : [2] Uint4B
 +0x088 ReadOnlySharedMemoryBase : Ptr64 Void
 +0x090 SharedData : Ptr64 Void
 +0x098 ReadOnlyStaticServerData : Ptr64 Ptr64 Void
 +0x0a0 AnsiCodePageData : Ptr64 Void
 +0x0a8 OemCodePageData : Ptr64 Void
 +0x0b0 UnicodeCaseTableData : Ptr64 Void
 +0x0b8 NumberOfProcessors : Uint4B
 +0x0bc NtGlobalFlag : Uint4B
 +0x0c0 CriticalSectionTimeout : _LARGE_INTEGER
 +0x0c8 HeapSegmentReserve : Uint8B
 +0x0d0 HeapSegmentCommit : Uint8B
 +0x0d8 HeapDeCommitTotalFreeThreshold : Uint8B
 +0x0e0 HeapDeCommitFreeBlockThreshold : Uint8B
 +0x0e8 NumberOfHeaps : Uint4B
 +0x0ec MaximumNumberOfHeaps : Uint4B
 +0x0f0 ProcessHeaps : Ptr64 Ptr64 Void
 +0x0f8 GdiSharedHandleTable : Ptr64 Void
 +0x100 ProcessStarterHelper : Ptr64 Void
 +0x108 GdiDCAttributeList : Uint4B
 +0x10c Padding3 : [4] UChar
 +0x110 LoaderLock : Ptr64 _RTL_CRITICAL_SECTION
 +0x118 OSMajorVersion : Uint4B
 +0x11c OSMinorVersion : Uint4B
 +0x120 OSBuildNumber : Uint2B
 +0x122 OSCSDVersion : Uint2B
 +0x124 OSPlatformId : Uint4B
 +0x128 ImageSubsystem : Uint4B
 +0x12c ImageSubsystemMajorVersion : Uint4B
 +0x130 ImageSubsystemMinorVersion : Uint4B
 +0x134 Padding4 : [4] UChar
 +0x138 ActiveProcessAffinityMask : Uint8B
 +0x140 GdiHandleBuffer : [60] Uint4B
 +0x230 PostProcessInitRoutine : Ptr64 void 
 +0x238 TlsExpansionBitmap : Ptr64 Void
 +0x240 TlsExpansionBitmapBits : [32] Uint4B
 +0x2c0 SessionId : Uint4B
 +0x2c4 Padding5 : [4] UChar
 +0x2c8 AppCompatFlags : _ULARGE_INTEGER
 +0x2d0 AppCompatFlagsUser : _ULARGE_INTEGER
 +0x2d8 pShimData : Ptr64 Void
 +0x2e0 AppCompatInfo : Ptr64 Void
 +0x2e8 CSDVersion : _UNICODE_STRING
 +0x2f8 ActivationContextData : Ptr64 _ACTIVATION_CONTEXT_DATA
 +0x300 ProcessAssemblyStorageMap : Ptr64 _ASSEMBLY_STORAGE_MAP
 +0x308 SystemDefaultActivationContextData : Ptr64 _ACTIVATION_CONTEXT_DATA
 +0x310 SystemAssemblyStorageMap : Ptr64 _ASSEMBLY_STORAGE_MAP
 +0x318 MinimumStackCommit : Uint8B
 +0x320 FlsCallback : Ptr64 _FLS_CALLBACK_INFO
 +0x328 FlsListHead : _LIST_ENTRY
 +0x338 FlsBitmap : Ptr64 Void
 +0x340 FlsBitmapBits : [4] Uint4B
 +0x350 FlsHighIndex : Uint4B
 +0x358 WerRegistrationData : Ptr64 Void
 +0x360 WerShipAssertPtr : Ptr64 Void
 +0x368 pUnused : Ptr64 Void
 +0x370 pImageHeaderHash : Ptr64 Void
 +0x378 TracingFlags : Uint4B
 +0x378 HeapTracingEnabled : Pos 0, 1 Bit
 +0x378 CritSecTracingEnabled : Pos 1, 1 Bit
 +0x378 LibLoaderTracingEnabled : Pos 2, 1 Bit
 +0x378 SpareTracingBits : Pos 3, 29 Bits
 +0x37c Padding6 : [4] UChar
 +0x380 CsrServerReadOnlySharedMemoryBase : Uint8B
 +0x388 TppWorkerpListLock : Uint8B
 +0x390 TppWorkerpList : _LIST_ENTRY
 +0x3a0 WaitOnAddressHashTable : [128] Ptr64 Void
 +0x7a0 TelemetryCoverageHeader : Ptr64 Void
 +0x7a8 CloudFileFlags : Uint4B

As we can see, there are a lot of fields in the PEB structure. We’ll only be focusing on a select few of them during the practical sections though.

Before we can continue, we need to briefly talk about how the Process Environment Block is actually found. It’s located at FS:[0x30] in the Thread Environment Block/Thread Information Block for 32-bit processes, and it’s located at GS:[0x60] for 64-bit processes.
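The two offsets fall out of the layout of the first few TEB fields. The sketch below rebuilds that prefix with fixed-width integers so the offsets come out the same on any build host. The MOCK_* types and the MockPebOffset* helpers are hypothetical names for this illustration; the field names follow the public ntdll symbols as I understand them and should be checked against your own dt ntdll!_TEB output.

```c
#include <stddef.h>
#include <stdint.h>

/* Leading fields of the 64-bit TEB; every pointer is modelled as uint64_t. */
typedef struct MOCK_TEB64 {
    uint64_t NtTib[7];                  /* +0x000: NT_TIB, seven pointers */
    uint64_t EnvironmentPointer;        /* +0x038 */
    uint64_t ClientId[2];               /* +0x040: process id, thread id */
    uint64_t ActiveRpcHandle;           /* +0x050 */
    uint64_t ThreadLocalStoragePointer; /* +0x058 */
    uint64_t ProcessEnvironmentBlock;   /* +0x060 */
} MOCK_TEB64;

/* The same prefix for a 32-bit process; every pointer is uint32_t. */
typedef struct MOCK_TEB32 {
    uint32_t NtTib[7];                  /* +0x000 */
    uint32_t EnvironmentPointer;        /* +0x01c */
    uint32_t ClientId[2];               /* +0x020 */
    uint32_t ActiveRpcHandle;           /* +0x028 */
    uint32_t ThreadLocalStoragePointer; /* +0x02c */
    uint32_t ProcessEnvironmentBlock;   /* +0x030 */
} MOCK_TEB32;

/* Offset of the PEB pointer relative to the TEB base (GS on x64). */
size_t MockPebOffset64(void)
{
    return offsetof(MOCK_TEB64, ProcessEnvironmentBlock);
}

/* Offset of the PEB pointer relative to the TEB base (FS on x86). */
size_t MockPebOffset32(void)
{
    return offsetof(MOCK_TEB32, ProcessEnvironmentBlock);
}
```

On a real system the FS (32-bit) or GS (64-bit) segment base points at the current thread's TEB, which is why reading FS:[0x30] or GS:[0x60] yields the PEB pointer directly.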

To start off, the third field of the PEB structure (“BeingDebugged”) can be read to determine if the current process has a debugger attached – this is one vector which is commonly closed by analysts who are debugging malicious software, because malicious software tends to keep a close eye out for debuggers and other analysis tools to make things more difficult for malware analysts. There’s a routine from the Win32 API called IsDebuggerPresent (KERNEL32) and the routine works by checking the BeingDebugged field of the PEB structure. We can validate this by reverse-engineering kernel32.dll ourselves.

8
IDA pseudo-code for IsDebuggerPresentStub (KERNEL32 – Windows 8+).

As we can see, kernel32.dll has a routine named IsDebuggerPresentStub which calls IsDebuggerPresent. This is because the environment I’m getting these images from is Windows 10 64-bit, and Microsoft moved the implementation to KernelBase.dll (a module which first shipped with Windows 7). However, for backwards-compatibility, kernel32.dll is still the module referenced by their documentation – had they dropped it, they would have had to move far more across to the new module, and there’d have been a lot of incompatible software for Windows 8+ at the time.

Therefore, we need to take a look at KernelBase.dll.

Figure 9: Disassembly for IsDebuggerPresent (KERNEL32 / KERNELBASE).

Perfect! KernelBase.dll has an exported routine named IsDebuggerPresent. We’re going to break down what the above disassembly is telling us.

  1. The address of the Process Environment Block is being moved into the RAX register. Since we’re looking at the 64-bit compiled version of KernelBase.dll, 64-bit registers are being used. For 64-bit processes, the pointer to the Process Environment Block is read from GS:[0x60], which is offset 0x60 into the Thread Environment Block.
  2. The value from the BeingDebugged field under the Process Environment Block is being extracted and put into the EAX register. The data-type for the BeingDebugged field is UCHAR (which is one byte), and its offset is 0x002. The first field of the PEB structure is located at 0x000, which means the third field (the BeingDebugged field) is located +2 bytes from this address. Since the RAX register is holding the address of the Process Environment Block, (RAX + 2) is used to reach the address of the BeingDebugged field.
  3. Returning with the RETN instruction. Since the value of the BeingDebugged field of the PEB structure is held within the EAX register, the caller of the routine receives the value stored within the BeingDebugged field as the return value.

A call to a routine like IsDebuggerPresent (KERNEL32 / KERNELBASE) can be an obvious sign to a malware analyst who is taking a look at the API calls being made by a sample. Therefore, some malware samples will manually access the PEB structure to perform the check; doing this is stealthier and usually less expected.
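The manual check can be sketched in a few lines of C. This is only a minimal illustration, not how any particular sample does it: `read_being_debugged`, `manual_is_debugger_present` and `BEING_DEBUGGED_OFFSET` are names made up for this example, and the live path assumes the Microsoft compiler targeting x64 (the `__readgsqword` intrinsic from `intrin.h`). The helper takes the PEB address as a parameter so that it can also be tried against a mock buffer on any platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#endif

/* Offset of BeingDebugged inside the PEB (third field, one byte wide). */
#define BEING_DEBUGGED_OFFSET 0x2

/* Hypothetical helper: performs the (PEB + 2) byte read for a given PEB address. */
static unsigned int read_being_debugged(const void *peb)
{
    assert(peb != NULL);
    return ((const uint8_t *)peb)[BEING_DEBUGGED_OFFSET];
}

#if defined(_MSC_VER) && defined(_M_X64)
/* Live variant (x64 MSVC only): GS:[0x60] holds the address of the PEB. */
static unsigned int manual_is_debugger_present(void)
{
    const void *peb = (const void *)__readgsqword(0x60);
    return read_being_debugged(peb);
}
#endif
```

Because no import for IsDebuggerPresent is needed, nothing about this check shows up in the import table or in an API call trace.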

The next fields we’re going to briefly talk about are the IsProtectedProcess and IsProtectedProcessLight fields of the Process Environment Block.

These fields can be used to determine if the current process is “protected” or not, hence the “ProtectedProcess” key-word in the field names. Windows has more than one process protection mechanism, and the former (non-Light variant) has been around a lot longer than the Protected Process Light (PPL) variant. The standard process protection mechanism has been around since Windows Vista, whereas PPL came into play starting with Windows 8.1. Microsoft uses these mechanisms to protect its own system processes from being abused by malicious software or forcefully shut down by a third-party source (because for some Windows processes this can cause the system to bug-check or function improperly). If we can access these fields within the Process Environment Block, then we can check if the current process is protected or not by Windows. All of this is enforced from kernel-mode by the Windows Kernel using the undocumented and opaque EPROCESS structure. Writing to these fields in the PEB structure has no effect on the protection itself, because doing so does not update the EPROCESS structure for the current process.

The standard process protection mechanism is used by Windows system processes. This mechanism is enforced from within the Windows Kernel and it’s not supposed to be used by third-parties. It helps prevent system processes from being exploited by attackers or forcefully shut down (the Operating System cannot function properly without its critical user-mode components). On top of this, Windows will set the state of various system processes to “critical”. This is flag-based and will cause the system to be forcefully crashed (via a bug-check) if a “critical” process is terminated. There are two different implementations for the “critical” state: critical processes and critical threads. Setting a process as critical will cause the bug-check once the process has been terminated, and setting a thread as critical will cause the bug-check once the thread has been terminated. Usually, the former is more appropriate because threads come and go regularly (e.g. a new thread is spawned to handle an operation concurrently and is terminated once it has returned its status from the operation). Windows does not set threads as critical as far as I am aware, although it will set specific processes as critical (processes like csrss.exe).

We’re going to take a look at how the process protection mechanism which is built-into Windows actually works very briefly using Interactive Disassembler and WinDbg. 

From kernel-mode, we can easily check whether a process is protected using the following routines.

  1. PsIsProtectedProcess (NTOSKRNL)
  2. PsIsProtectedProcessLight (NTOSKRNL)

Both of the above routines are undocumented but they are still exported by the Windows Kernel.

Figure 11: Disassembly for PsIsProtectedProcess (NTOSKRNL).

Looking at the disassembly of PsIsProtectedProcess, we can see that the TEST instruction is being used. TEST performs a bitwise AND of its two operands and sets the CPU flags according to the result without storing it, which makes it a cheap way to check whether particular bits are set. We can also see that [RCX+6CAh] is the memory operand. The PsIsProtectedProcess routine takes one parameter only and returns a BOOLEAN (UCHAR); the parameter is a pointer to the EPROCESS structure of the process being checked. Under the x64 calling convention the first parameter is passed in RCX, so the RCX register holds the PEPROCESS (EPROCESS*) for the target process, and the routine is reading the value of a field (unknown to us so far) which indicates whether the process is or is not protected. The field is located at offset 6CAh within the EPROCESS structure. This means that if you add 0x6CA to the base address of a process’s EPROCESS, you will land on the value being checked in this routine. This holds for this environment only, because the offsets regularly shift around between patch updates and separate OS versions.
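The addressing and the TEST instruction can be pictured in C. The sketch below is only a model: it is meant to run against a mock buffer rather than a real EPROCESS, `test_byte_at` is a hypothetical helper, and the mask is left as a parameter because the exact immediate operand is only visible in the disassembly image. The helper returns non-zero when the zero flag would be clear after the TEST, that is, when at least one masked bit is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset taken from the disassembly discussed above; valid for that build only. */
#define EPROCESS_PROTECTION_OFFSET 0x6CA

/*
 * Hypothetical helper modelling "test byte ptr [base+offset], mask".
 * Returns 1 if any masked bit is set (ZF clear), 0 otherwise (ZF set).
 */
static int test_byte_at(const void *base, size_t offset, uint8_t mask)
{
    uint8_t value;

    assert(base != NULL);
    value = *((const uint8_t *)base + offset);
    return (value & mask) != 0;
}
```

For example, given a mock buffer `buf` of at least 0x6CB bytes, `test_byte_at(buf, EPROCESS_PROTECTION_OFFSET, 0x7)` reports whether any of the low three bits of the byte at that offset are set; the 0x7 here is an assumed mask chosen for illustration.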

We can check with WinDbg which field is located at the 0x6CA offset.

Figure 12: WinDbg command (dt) for the _EPROCESS structure, showing the Protection field.

Nice! The field in the EPROCESS structure which holds data regarding process protection is named Protection and has a data-type of _PS_PROTECTION (which is a structure). That is the case for the standard process protection mechanism at least; we are yet to check on the Light variant. We can take a look at the _PS_PROTECTION structure with the dt command.

Figure 13: WinDbg command (dt) for the _PS_PROTECTION structure.

Now if we check the disassembly of the PsIsProtectedProcessLight routine, we can see if it uses the same mechanism to query the status.

Figure 14: Disassembly for PsIsProtectedProcessLight (NTOSKRNL).

It’s targeting the very same Protection field of the EPROCESS structure. The only difference is that PsIsProtectedProcess and PsIsProtectedProcessLight perform different checks on the value stored there.
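To make the difference between the two checks concrete, here is a sketch of how the single Protection byte can be decoded. The bit layout (Type in bits 0-2, Audit in bit 3, Signer in bits 4-7) and the type values follow the public symbols for _PS_PROTECTION and _PS_PROTECTED_TYPE. The two predicate functions are an assumption about what the kernel routines boil down to, not a transcription of them, and every name prefixed with `mock_` or `MOCK_` is invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Values of _PS_PROTECTED_TYPE as published in the public symbols. */
enum mock_protected_type {
    MOCK_TYPE_NONE = 0,
    MOCK_TYPE_PROTECTED_LIGHT = 1,
    MOCK_TYPE_PROTECTED = 2
};

/* Decoded view of the one-byte _PS_PROTECTION.Level value. */
struct mock_protection {
    unsigned type;   /* bits 0-2 */
    unsigned audit;  /* bit 3    */
    unsigned signer; /* bits 4-7 */
};

static struct mock_protection mock_decode_protection(uint8_t level)
{
    struct mock_protection p;
    p.type = level & 0x7u;
    p.audit = (level >> 3) & 0x1u;
    p.signer = (level >> 4) & 0xFu;
    return p;
}

/* Assumed equivalent of PsIsProtectedProcess: any protection type at all. */
static int mock_is_protected(uint8_t level)
{
    return mock_decode_protection(level).type != MOCK_TYPE_NONE;
}

/* Assumed equivalent of PsIsProtectedProcessLight: the Light type only. */
static int mock_is_protected_light(uint8_t level)
{
    return mock_decode_protection(level).type == MOCK_TYPE_PROTECTED_LIGHT;
}

/* Usage example: 0x61 encodes Signer = 6, Audit = 0, Type = 1 (Light). */
static void mock_protection_example(void)
{
    assert(mock_is_protected(0x61));
    assert(mock_is_protected_light(0x61));
    assert(!mock_is_protected_light(0x72)); /* Type = 2: full protection */
}
```

Under these assumptions a PPL process satisfies both predicates, while a fully protected process satisfies only the first one.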

In the PEB structure, there’s an entry named Ldr which is a pointer to a _PEB_LDR_DATA structure. Within this structure, we have a field named InMemoryOrderModuleList which has a data-type of _LIST_ENTRY. Doubly linked lists are very common in Windows components such as the Windows Kernel or lower-level user-mode components.

There’s an extension command in WinDbg named !peb which can be used to display data from the PEB of the process currently being debugged. Below is an image of what the output will look like; focus only on the non-highlighted parts.

Figure 15: WinDbg command (!peb) output.

If we go through the InMemoryOrderModuleList, we can take each entry and convert it to a pointer to the LDR_DATA_TABLE_ENTRY structure using the CONTAINING_RECORD macro. Then we can view details about the module currently being enumerated through the linked list… We will do this during the practical code section which is right about now.
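Before touching the real loader data, the CONTAINING_RECORD idea can be tried out on a toy list. The sketch below does not use any Windows headers: `mock_list_entry`, `mock_module` and `MOCK_CONTAINING_RECORD` are stand-ins invented for this example, and the macro simply reproduces the arithmetic (field address minus field offset) that CONTAINING_RECORD performs.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for LIST_ENTRY so the sketch builds without windows.h. */
typedef struct mock_list_entry {
    struct mock_list_entry *Flink;
    struct mock_list_entry *Blink;
} mock_list_entry;

/* Same arithmetic as CONTAINING_RECORD: subtract the field offset from the field address. */
#define MOCK_CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

/* Hypothetical record with the link embedded part-way through, as in LDR_DATA_TABLE_ENTRY. */
typedef struct mock_module {
    void *reserved[2];
    mock_list_entry links;
    const char *name;
} mock_module;

/* Appends an entry to the tail of a circular list whose head is a bare mock_list_entry. */
static void mock_insert_tail(mock_list_entry *head, mock_list_entry *entry)
{
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

/* Walks from head->Flink until the list wraps back to head, returning the n-th record. */
static mock_module *mock_nth_module(mock_list_entry *head, size_t n)
{
    mock_list_entry *cur;

    assert(head != NULL);
    for (cur = head->Flink; cur != head; cur = cur->Flink) {
        if (n-- == 0)
            return MOCK_CONTAINING_RECORD(cur, mock_module, links);
    }
    return NULL;
}
```

The important detail is that the list head is not itself a record: the walk stops when the forward link points back at the head, and only the entries in between are converted with the macro.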

We’re going to put the PEB to practical use in the next section.

User-Mode


In this section we’re going to be re-writing a few Win32 API routines in user-mode which rely on the Process Environment Block.

  1. GetModuleHandle – using the Ldr field of the PEB structure
  2. GetModuleFileName – using the ProcessParameters field of the PEB structure

We need to make sure we’ve declared some structures. Depending on the header files you’re using, you may not need them. However if you do need them…

typedef struct _UNICODE_STRING {
    USHORT Length;
    USHORT MaximumLength;
    WCHAR *Buffer;
} UNICODE_STRING, *PUNICODE_STRING;

typedef const UNICODE_STRING
              *PCUNICODE_STRING;

typedef struct _CLIENT_ID {
    PVOID UniqueProcess;
    PVOID UniqueThread;
} CLIENT_ID, *PCLIENT_ID;

typedef struct _RTL_USER_PROCESS_PARAMETERS {
    BYTE Reserved1[16];
    PVOID Reserved2[10];
    UNICODE_STRING ImagePathName;
    UNICODE_STRING CommandLine;
} RTL_USER_PROCESS_PARAMETERS, *PRTL_USER_PROCESS_PARAMETERS;

typedef struct _PEB_LDR_DATA {
    BYTE Reserved1[8];
    PVOID Reserved2[3];
    LIST_ENTRY InMemoryOrderModuleList;
} PEB_LDR_DATA, *PPEB_LDR_DATA;

typedef struct _LDR_DATA_TABLE_ENTRY {
    PVOID Reserved1[2];
    LIST_ENTRY InMemoryOrderLinks;
    PVOID Reserved2[2];
    PVOID BaseAddress;
    PVOID Reserved3[2];
    UNICODE_STRING FullDllName;
    UNICODE_STRING BaseDllName;
    BYTE Reserved4[8];
    PVOID Reserved5[3];
#pragma warning(push)
#pragma warning(disable: 4201) // we'll always use the Microsoft compiler
    union {
        ULONG CheckSum;
        PVOID Reserved6;
    } DUMMYUNIONNAME;
#pragma warning(pop)
    ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;

typedef struct _PEB {
    BYTE Reserved1[2];
    BYTE BeingDebugged;
    BYTE Reserved2[1];
    PVOID Reserved3[2];
    PPEB_LDR_DATA Ldr;
    PRTL_USER_PROCESS_PARAMETERS ProcessParameters;
    PVOID Reserved4[3];
    PVOID AtlThunkSListPtr;
    PVOID Reserved5;
    ULONG Reserved6;
    PVOID Reserved7;
    ULONG Reserved8;
} PEB, *PPEB;

typedef struct _TEB {
    NT_TIB NtTib;
    PVOID EnvironmentPointer;
    CLIENT_ID ClientId;
    PVOID ActiveRpcHandle;
    PVOID ThreadLocalStoragePointer;
    PPEB ProcessEnvironmentBlock;
} TEB, *PTEB;

The next thing you might want is a global definition for NtCurrentPeb(). This isn’t mandatory but it can be a bit helpful if you’d prefer to type NtCurrentPeb() instead of NtCurrentTeb()->ProcessEnvironmentBlock every time you need to gain access to the PEB. I always preferred to type NtCurrentPeb() but that’s just me.

#define NtCurrentPeb() \
        (NtCurrentTeb()->ProcessEnvironmentBlock)

What is NtCurrentTeb()?

NtCurrentTeb() is an inline function defined in winnt.h, and it returns a pointer to the TEB structure of the calling thread.

Its implementation depends on the target architecture. For a 32-bit compilation, it locates the TEB by using the __readfsdword compiler intrinsic with 0x18 as the offset, which means the value is read from FS:[0x18]. For a 64-bit compilation, the __readgsqword intrinsic is used instead and the offset is different: the value is read from GS:[0x30].
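Where the 0x18 comes from can be shown with a little arithmetic. The TEB starts with an NT_TIB, whose Self field holds the TEB’s own linear address, and that field sits after six pointer-sized members. Its offset is therefore 6 × 4 = 0x18 in a 32-bit build and 6 × 8 = 0x30 in a 64-bit build. The sketch below uses a mock copy of the NT_TIB layout (the `mock_` names are invented here; the real definition lives in winnt.h) so that it compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Mock of NT_TIB: every member up to Self is pointer-sized. */
typedef struct mock_nt_tib {
    void *ExceptionList;
    void *StackBase;
    void *StackLimit;
    void *SubSystemTib;
    union {
        void *FiberData;
        unsigned int Version;
    } u;
    void *ArbitraryUserPointer;
    struct mock_nt_tib *Self; /* linear address of the TEB itself */
} mock_nt_tib;

/* Offset of Self for the platform this code is compiled for. */
static size_t mock_tib_self_offset(void)
{
    return offsetof(mock_nt_tib, Self);
}

/* Expected offset for a given pointer width in bytes (4 or 8). */
static size_t mock_expected_self_offset(size_t pointer_size)
{
    assert(pointer_size == 4 || pointer_size == 8);
    return 6 * pointer_size; /* 0x18 or 0x30 */
}
```

Reading the Self pointer through the segment register is what turns a segment-relative location into an ordinary pointer that the rest of the program can dereference.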

GetModuleHandle replacement

HMODULE GetModuleHandleWrapper(
    WCHAR *ModuleName
)
{
    PPEB ProcessEnvironmentBlock = NtCurrentPeb();
    PPEB_LDR_DATA PebLdrData = NULL;
    PLDR_DATA_TABLE_ENTRY LdrDataTableEntry = NULL;
    PLIST_ENTRY ModuleList = NULL,
                ForwardLink = NULL;

    // Unlike GetModuleHandle, a NULL name is rejected here because
    // _wcsicmp must not be handed a NULL pointer
    if (!ModuleName)
    {
        return NULL;
    }

    if (ProcessEnvironmentBlock)
    {
        PebLdrData = ProcessEnvironmentBlock->Ldr;

        if (PebLdrData)
        {
            // The list is circular and its head lives inside PEB_LDR_DATA
            ModuleList = &PebLdrData->InMemoryOrderModuleList;
            ForwardLink = ModuleList->Flink;

            while (ModuleList != ForwardLink)
            {
                // Step back from the embedded link to the start of the entry
                LdrDataTableEntry = CONTAINING_RECORD(ForwardLink,
                    LDR_DATA_TABLE_ENTRY,
                    InMemoryOrderLinks);

                if (LdrDataTableEntry)
                {
                    if (LdrDataTableEntry->BaseDllName.Buffer)
                    {
                        // Case-insensitive comparison of the module name
                        if (!_wcsicmp(LdrDataTableEntry->BaseDllName.Buffer,
                            ModuleName))
                        {
                            return (HMODULE)LdrDataTableEntry->BaseAddress;
                        }
                    }
                }

                ForwardLink = ForwardLink->Flink;
            }
        }
    }

    return NULL;
}

The above routine does the following.

  1. Retrieves the PPEB
  2. Checks if the PPEB could be acquired or not
  3. Enumerates the InMemoryOrderModuleList
  4. Retrieves a pointer to the LDR_DATA_TABLE_ENTRY structure for each entry
  5. Returns the BaseAddress of the module if it’s a match, based on a case-insensitive comparison of the module name buffer with the parameter passed in

GetModuleFileName wrapper

WCHAR *GetModuleFileNameWrapper()
{
    PPEB ProcessEnvironmentBlock = NtCurrentPeb();

    if (ProcessEnvironmentBlock)
    {
        if (ProcessEnvironmentBlock->ProcessParameters)
        {
            // Buffer may be NULL, so check it before handing it back
            if (ProcessEnvironmentBlock->ProcessParameters->ImagePathName.Buffer)
            {
                return ProcessEnvironmentBlock->ProcessParameters->ImagePathName.Buffer;
            }
        }
    }

    return NULL;
}

The above routine does the following.

  1. Retrieves the PPEB (pointer to the PEB)
  2. Checks if the PPEB could be acquired or not
  3. Checks if it can access the ProcessParameters field
  4. Returns the ImagePathName buffer (it’s a UNICODE_STRING so the Buffer field is a wchar_t*)
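One caveat with returning the raw Buffer: a UNICODE_STRING is a counted string, its Length is expressed in bytes rather than characters, and the buffer is not guaranteed to be null-terminated. A caller that needs a conventional C string can copy it out first. The sketch below shows one way to do that; `mock_unicode_string` and `mock_copy_counted` are invented names, and 16-bit code units are modelled with `uint16_t` so the example also builds on platforms where `wchar_t` is wider than it is on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of UNICODE_STRING: Length and MaximumLength are byte counts. */
typedef struct mock_unicode_string {
    uint16_t Length;
    uint16_t MaximumLength;
    const uint16_t *Buffer;
} mock_unicode_string;

/*
 * Copies at most capacity - 1 code units from src into dst and terminates dst.
 * Returns the number of code units copied (excluding the terminator).
 */
static size_t mock_copy_counted(const mock_unicode_string *src,
                                uint16_t *dst, size_t capacity)
{
    size_t units, i;

    assert(src != NULL && dst != NULL && capacity > 0);

    /* Length is in bytes, so halve it to get 16-bit code units. */
    units = src->Buffer ? (size_t)src->Length / sizeof(uint16_t) : 0;
    if (units > capacity - 1)
        units = capacity - 1;

    for (i = 0; i < units; i++)
        dst[i] = src->Buffer[i];
    dst[units] = 0;

    return units;
}
```

With the real structures the same pattern applies: divide ImagePathName.Length by sizeof(WCHAR) to get the character count, and never read past it.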

All of this has been known for an extremely long time now, but for those of you who have only just got into Windows Internals and started studying areas like the Process Environment Block, this could help clear things up for you quickly and put an end to some confusion.

As always, thanks for reading.

NtOpcode