• 设为首页
  • 点击收藏
  • 手机版
    手机扫一扫访问
    迪恩网络手机版
  • 关注官方公众号
    微信扫一扫关注
    公众号

PEHeaderandExportTableforDelphi

原作者: [db:作者] 来自: [db:来源] 收藏 邀请

Malware Analysis Tutorial 8: PE Header and Export Table

2. Background Information of PE Header
Any binary executable file (no matter on Unix or Windows) has to include a header to describe its structure:

e.g., the base address of its code section, data section, and the list of functions that can be exported from the executable, etc.

When the file is executed by the operating system, the OS simply reads this header information first,

and then loads the binary data from the file to populate the contents of

the code/data segments of the address space for the corresponding process.

When the file is dynamically linked (i.e., the system calls it relies on are not statically linked in the executable),

the OS has to rely on its import table to determine where to find the entry addresses of these system functions.

Most binary executable files on Windows follows the following structure: 

DOS Header (64 bytes), 

PE Header

sections (code and data).

For a complete survey and introduction of the executable file format,

we recommend Goppit's "Portable Executable File Format - A Reverse Engineering View" [1].

DOS Header starts with magic number 4D 5A 50 00,

and the last 4 bytes is the location of PE header in the binary executable file.

Other fields are not that interesting.

The PE header contains significantly more information and more interesting.

In Figure 1, please find the structure of PE Header.

We only list the information that are interesting to us in this tutorial.

For a complete walk-through, please refer to Goppit's work [1].

Figure 1. Structure of PE Header

As shown in Figure 1, PE header consists of three parts:

(1) a 4-byte magic code,

(2) a 20-byte file header and its data type is IMAGE_FILE_HEADER, and

(3) a 224-byte optional header (type: IMAGE_OPTIONAL_HEADER32).

The optional header itself has two parts:

the first 96 bytes contain information such as major operating systems, entry point, etc.

The second part is a data directory of 128 bytes.

It consists of 16 entries, and each entry has 8 bytes (address, size).

4 + 20 + [ 224 = ( 96 + 8 * 16 ) ] 

We are interested in the first two entries: one has the pointer to the beginning of the export table, and the other points to the import table.


2.1 Debugging Tool Support (Small Lab Experiments)

Modern binary debuggers have provided sufficient support for examining PE headers.

We discuss the use of WinDbg and Immunity Debugger.

(1) WinDbg.

Assume that we know that the PE structure of ntdll.dll is located at memory address 0x7C9000E0.

We can display the second part: file header using the following.

dt nt!_IMAGE_FILE_HEADER 0x7c9000e4
   +0x000 Machine          : 0x14c
   +0x002 NumberOfSections : 4
   +0x004 TimeDateStamp    : 0x4802a12c
   +0x008 PointerToSymbolTable : 0
   +0x00c NumberOfSymbols  : 0
   +0x010 SizeOfOptionalHeader : 0xe0
   +0x012 Characteristics  : 0x210e


Then we can calculate the starting address of the optional header:

0x7C9000E4 + 0x14 (20 bytes) = 0x7C9000F8. < IMAGE_FILE_HEADER 20 bytes >

The attributes of optional header is displayed as below.

For example, the major linker version is 7 and the the address of entry point is 0x12c28

(relative of the base address 0x7c900000).

kd> dt _IMAGE_OPTIONAL_HEADER 0x7c9000F8
nt!_IMAGE_OPTIONAL_HEADER
   +0x000 Magic            : 0x10b
   +0x002 MajorLinkerVersion : 0x7 ''
   +0x003 MinorLinkerVersion : 0xa ''
   +0x004 SizeOfCode       : 0x7a000
   +0x008 SizeOfInitializedData : 0x33a00
   +0x00c SizeOfUninitializedData : 0
   +0x010 AddressOfEntryPoint : 0x12c28
   +0x014 BaseOfCode       : 0x1000
   +0x018 BaseOfData       : 0x76000
   +0x01c ImageBase        : 0x7c900000
   +0x020 SectionAlignment : 0x1000
   +0x024 FileAlignment    : 0x200
   +0x028 MajorOperatingSystemVersion : 5
   +0x02a MinorOperatingSystemVersion : 1
   +0x02c MajorImageVersion : 5
   +0x02e MinorImageVersion : 1
   +0x030 MajorSubsystemVersion : 4
   +0x032 MinorSubsystemVersion : 0xa
   +0x034 Win32VersionValue : 0
   +0x038 SizeOfImage      : 0xaf000
   +0x03c SizeOfHeaders    : 0x400
   +0x040 CheckSum         : 0xb62bc
   +0x044 Subsystem        : 3
   +0x046 DllCharacteristics : 0
   +0x048 SizeOfStackReserve : 0x40000
   +0x04c SizeOfStackCommit : 0x1000
   +0x050 SizeOfHeapReserve : 0x100000
   +0x054 SizeOfHeapCommit : 0x1000
   +0x058 LoaderFlags      : 0
   +0x05c NumberOfRvaAndSizes : 0x10
   +0x060 DataDirectory    : [16] _IMAGE_DATA_DIRECTORY


As shown by Goppit [1], OllyDbg can display PE structure nicely.

Since the Immunity Debugger is based on OllyDbg, we can achieve the same effect.

In IMM View -> Memory, we can easily locate the starting address of each module (e.g., see Figure 2).

Then in the memory dump window, jump to the starting address of the PE of ntdll.dll.

Then right click in the dump pane, and select special -> PE, we can have all information nicely presented by IMM.

3. Export Table

Recall that the first entry of IMAGE_DATA_DIRECTORY of the optional header field contains information about the export table. By Figure 1, you can soon infer that the 4 bytes located at PE + 0x78 (i.e., offset 120 bytes) is the relative address (relative to DLL base address) of the export table, and the next byte (at offset 0x7C) is the size of the export table.

The data type for the export table is  IMAGE_EXPORT_DIRECTORY. Unfortunately, the WinDbg symbol set does not include the definition of this data structure, but you can easily find it in winnt.h through a google search (e.g., from [2]). The following is the definition of IMAGE_EXPORT_DIRECTORY from [2].

typedef struct _IMAGE_EXPORT_DIRECTORY {
  DWORD Characteristics; //offset 0x0
  DWORD TimeDateStamp; //offset 0x4
  WORD MajorVersion;  //offset 0x8
  WORD MinorVersion; //offset 0xa
  DWORD Name; //offset 0xc
  DWORD Base; //offset 0x10
  DWORD NumberOfFunctions;  //offset 0x14
  DWORD NumberOfNames;  //offset 0x18
  DWORD AddressOfFunctions; //offset 0x1c
  DWORD AddressOfNames; //offset 0x20
  DWORD AddressOfNameOrdinals; //offset 0x24
 }

Here, we need some manual calculation of addresses for each attribute for our later analysis. In the above definition, WORD is a computer word of 16 bites (2bytes), and DWORD is 4 bytes. We can easily infer that, MajorVersion is located at offset 0x8, and AddressOfFunctions is located at offset 0x1c.

Now assume that IMAGE_EXPORT_DIRECTORY is located at 0x7C903400, the following is the dump from WinDbg (here "dd" is to display memory):

kd> dd 7c903400
7c903400  00000000 48025c72 00000000 00006786
7c903410  00000001 00000523 00000523 00003428
7c903420  000048b4 00005d40 00057efb 00057e63
7c903430  00057dc5 00002ad0 00002b30 00002b40
7c903440  00002b20 0001eb58 0001ebb9 0001e3af
7c903450  0002062d 000206ee 0004fe3a 00012d71
7c903460  000211e7 0001eaff 0004fe2f 0004fdaa
7c903470  0001b08a 0004febb 0004fe6d 0004fde6

We can soon infer that there are 0x523 functions exposed in the export table, and there are 0x523 names exposed. Why? Because the NumberOfFunctions is located at offset 0x14 (thus its address is 0x7c903400+0x14 = 0x7c903414)  For another example, look at the attribute "Name" which is located at offset 0xc (i.e., its address: 0x7c90340c), we have number 0x00006787. This is the address relative to the base DLL address (assume it is 0x7c900000). Then we have the name of the module located at 0x7c906786. We can verify using the "db" command in WinDbg (display memory contents as bytes): you can verify that the module name is indeed ntdll.dll.


kd> db 7c906786
7c906786  6e 74 64 6c 6c 2e 64 6c-6c 00 43 73 72 41 6c 6c  ntdll.dll.CsrAll
7c906796  6f 63 61 74 65 43 61 70-74 75 72 65 42 75 66 66  ocateCaptureBuff

Read page 26 of [1], you will find that the "AddressOfFunctions", "AddressOfNames", and "AddressOfNameOdinals" are the most important attributes. There are three arrays (shown as below), and each of the above attributes contains one corresponding starting address of an array.

PVOID Functions[523]; //each element is a function pointer
char * Names[523]; //each element is a char * pointer
short int Ordinal[523]; //each element is an 16 bit integer


For example, by manual calculation we know that the Names array starts at 7C9048B4 (given the 0x48B4 located at offset 0x20, for attribute AddressOfNames; and assuming the base address is 0x7C900000). We know that each element of the Names array  is 4 bytes. here is the dump of the first 8 elements:
kd> dd 7c9048b4
7c9048b4  00006790 000067a9 000067c3 000067db
7c9048c4  00006807 0000681f 00006831 00006845

We can verify the first name (00006790): It's CsrAllocateCaptureBuffer. Note that a "0" byte is used to terminate a string.
kd> db 7c906790
7c906790  43 73 72 41 6c 6c 6f 63-61 74 65 43 61 70 74 75  CsrAllocateCaptu
7c9067a0  72 65 42 75 66 66 65 72-00 43 73 72 41 6c 6c 6f  reBuffer.CsrAllo

We can also verify the second name (000067a9): It's CsrAllocateMessagePointer.
kd> db 7c9067a9
7c9067a9  43 73 72 41 6c 6c 6f 63-61 74 65 4d 65 73 73 61  CsrAllocateMessa
7c9067b9  67 65 50 6f 69 6e 74 65-72 00 43 73 72 43 61 70  gePointer.CsrCap

Now, given a Function name, how do we find its entry address? The following is the formula:
Note that array index starts from 0. 
Assume Names[x].equals(FunctionName)
Function address is Functions[Ordinal[x]]

4. Challenge1 of the Day
The first sixteen elements of the Ordinal is shown below:
kd> dd 7c905d40
7c905d40  00080007 000a0009 000c000b 000e000d
7c905d50  0010000f 00120011 00140013 00160015

The first eight elements of the Functions array is shown below:
kd> dd 7c903428
7c903428  00057efb 00057e63 00057dc5 00002ad0
7c903438  00002b30 00002b40 00002b20 0001eb58


What is the entry address of function CsrAllocateCaptureBuffer? The answer is: it's 7C91EB58.

Think about why? (Pay special attention to the byte order of integers).

 

Windows PE Header

Hi Folks, during the past days a couple of students asked me a question about Windows PE header.
 
Well, I supposed the PE was a "kind of" well known structure, instead it seems to be pretty much obscured for most of my people.
 
So I am going to resume very very briefly what PE is giving some useful pictures harvested out here.
 
Each executable file has a Common Object File Format COFF which is used from the OS loader to run the program.
 
Windows Portable Executable (PE) is one of the COFF available in todays OS.
 
For example the Executable Linking File (ELF) is the main Linux COFF.
Microsoft migrated to the PE format with the introduction of the Windows NT 3.1 operating system. 
All later versions of Windows, including Windows 95/98/ME, support the file structure.
The format has retained limited legacy support to bridge the gap between DOS-based and NT systems.
For example, PE/COFF headers still include an MS-DOS executable program,
which is by default a stub that displays the simple message "This program cannot be run in DOS mode" (or similar).

PE also continues to serve the changing Windows platform.
Some extensions include the .NET PE format (see below), a 64-bit version called PE32+ (sometimes PE+),
and a specification for Windows CE.

Nowadays the Windows PE header has the following structure . 
 

MZ are the first 2 bytes you will see in any PE file opened in a hex editor.
The DOS header occupies the first 64 bytes of the file - ie the first 4 rows seen in the hexeditor in the picture below.
The last DWORD before the DOS stub begins contains 00h 01h 00h 00h, which is the offset where the PE header begins.
 
The DOS stub is the piece of software that runs if the executable is run from DOS environment (for example DOS shell).
For retro-compatibility it often executes a printf("This program must be run under Win32");.
 
The PE header begins with its signature 50h, 45h, 00h, 00h (the letters "PE" followed by two terminating zeroes).
 
If in the Signature field of the PE header, you find an NE signature here rather than a PE,
you're working with a 16-bit Windows New Executable file.
 
Likewise, an LE in the signature field would indicate a Windows 3.x virtual device driver (VxD).
 
 
An LX here would be the mark of a file for OS/2 2.0.
 
FileHeader is the next 20 bytes of the PE file and contains info about the physical layout & properties of the file e.g. number of sections.
 
OptionalHeader is always present and forms the next 224 bytes.
 
It contains info about the logical layout inside the PE file e.g. AddressOfEntryPoint.
Its size is given by a member of FileHeader.
The structures of these members are also defined in windows.inc.
The PE header is defined as follows:
Not all these section must be used, but you need to modify the NumberOfSectionsto add or delete sections from a PE file.
The best way to analyze those section is by using PEExplorer or PEID. The following image shows the PEID in use.
 
EntryPoint is The Relative Virtual Addresses (RVA) of the first instruction that will be executed when the PE loader is ready to run the PE file.
If you want to divert the flow of execution right from the start, you need to change the value in this field
to a new RVA and the instruction at the new RVA will be executed first.
 
Executable packers usually redirect this value to their decompression stub,
after which execution jumps back to the original entry point of the app the OEP.
 
Of further note is the Starforce protection in which the CODE section is not present in the file on disk but is written into virtual memory on execution.
 
ImageBase is the preferred load address for the PE file.
For example, if the value in this field is 400000h, the PE loader will try to load the file into the virtual address space starting at 400000h.
The word "preferred" means that the PE loader may not load the file at that address if some other module already occupied that address range.
In 99% of cases it is 400000h.
 
SectionAlignment is the granularity of the alignment of the sections in memory.
For example, if the value in this field is 4096 (1000h), each section must start at multiples of 4096 bytes.
If the first section is at 401000h and its size is 10 bytes, the next section must be at 402000h
even if the address space between 401000h and 402000h will be mostly unused.
 
FileAlignment is the granularity of the alignment of the sections in the file.
For example, if the value in this field is 512 (200h), each section must start at multiples of 512 bytes.
If the first section is at file offset 200h and the size is 10 bytes, the next section must be located at file offset 400h:
the space between file offsets 522 and 1024 is unused/undefined.
 
SizeOfImage is the overall size of the PE image in memory. It's the sum of all headers and sections aligned to SectionAlignment.
 
SizeOfHeaders is the size of all headers + section table.
In short, this value is equal to the file size minus the combined size of all sections in the file.
You can also use this value as the file offset of the first section in the PE file.
 
DataDirectory It is the final 128 bytes of OptionalHeader, which in turn is the final member of the PE header IMAGE_NT_HEADERS.
DataDirectory is an array of 16 IMAGE_DATA_DIRECTORY structures, 8 bytes apiece,
each relating to an important data structure in the PE file.
 
Each array refers to a predefined item, such as the import table.
The structure has 2 members which contain the location and size of the data structure in question: 
 
VirtualAddress is the relative virtual address (RVA) of the data structure , and isize contains the size in bytes of the data structure.
 
Summing up the whole PE Header structure in nutshell:
Alright this was a short description of the much more complex Windows PE header.
 
I believe this is what everybody (of course I am not talking about grandma, but security skilled guys)
should know about Windows PE.
 
After that when you need to deal with PE header obviously these information aren't enough to attack or to reverse engineer a PE header,
so I suggest to look into the most authoritative guides: thisthis and this.
 

Randy Kath
Microsoft Developer Network Technology Group

Download the EXEVIEW sample. (Afterwards, use PKUNZIP.EXE -d to recreate directory structure.)

Download the PEFILE sample. (Afterwards, use PKUNZIP.EXE -d to recreate directory structure.)

 

Abstract

The Windows NT™ version 3.1 operating system introduces a new executable file format called the Portable Executable (PE) file format. The Portable Executable File Format specification, though rather vague, has been made available to the public and is included on the Microsoft Developer Network CD (Specs and Strategy, Specifications, Windows NT File Format Specifications).

Yet this specification alone does not provide enough information to make it easy, or even reasonable, for developers to understand the PE file format. This article is meant to address that problem. In it you'll find a thorough explanation of the entire PE file format, along with descriptions of all the necessary structures and source code examples that demonstrate how to use this information.

All of the source code examples that appear in this article are taken from a dynamic-link library (DLL) called PEFILE.DLL. I wrote this DLL simply for the purpose of getting at the important information contained within a PE file. The DLL and its source code are also included on this CD as part of the PEFile sample application; feel free to use the DLL in your own applications. Also, feel free to take the source code and build on it for any specific purpose you may have. At the end of this article, you'll find a brief list of the functions exported from the PEFILE.DLL and an explanation of how to use them. I think you'll find these functions make understanding the PE file format easier to cope with.

 

Introduction

The recent addition of the Microsoft® Windows NT™ operating system to the family of Windows™ operating systems brought many changes to the development environment and more than a few changes to applications themselves. One of the more significant changes is the introduction of the Portable Executable (PE) file format. The new PE file format draws primarily from the COFF (Common Object File Format) specification that is common to UNIX® operating systems. Yet, to remain compatible with previous versions of the MS-DOS® and Windows operating systems, the PE file format also retains the old familiar MZ header from MS-DOS.

In this article, the PE file format is explained using a top-down approach. This article discusses each of the components of the file as they occur when you traverse the file's contents, starting at the top and working your way down through the file.

Much of the definition of individual file components comes from the file WINNT.H, a file included in the Microsoft Win32™ Software Development Kit (SDK) for Windows NT. In it you will find structure type definitions for each of the file headers and data directories used to represent various components in the file. In other places in the file, WINNT.H lacks sufficient definition of the file structure. In these places, I chose to define my own structures that can be used to access the data from the file. You will find these structures defined in PEFILE.H, a file used to create the PEFILE.DLL. The entire suite of PEFILE.H development files is included in the PEFile sample application.

In addition to the PEFILE.DLL sample code, a separate Win32-based sample application called EXEVIEW.EXE accompanies this article. This sample was created for two purposes: First, I needed a way to be able to test the PEFILE.DLL functions, which in some cases required multiple file views simultaneously--hence the multiple view support. Second, much of the work of figuring out PE file format involved being able to see the data interactively. For example, to understand how the import address name table is structured, I had to view the .idata section header, the import image data directory, the optional header, and the actual .idata section body, all simultaneously. EXEVIEW.EXE is the perfect sample for viewing that information.

Without further ado, let's begin.

 

Structure of PE Files

The PE file format is organized as a linear stream of data. It begins with an MS-DOS header, a real-mode program stub, and a PE file signature. Immediately following is a PE file header and optional header. Beyond that, all the section headers appear, followed by all of the section bodies. Closing out the file are a few other regions of miscellaneous information, including relocation information, symbol table information, line number information, and string table data. All of this is more easily absorbed by looking at it graphically, as shown in Figure 1.

Figure 1. Structure of a Portable Executable file image

Starting with the MS-DOS file header structure, each of the components in the PE file format is discussed below in the order in which it occurs in the file. Much of this discussion is based on sample code that demonstrates how to get to the information in the file. All of the sample code is taken from the file PEFILE.C, the source module for PEFILE.DLL. Each of these examples takes advantage of one of the coolest features of Windows NT, memory-mapped files. Memory-mapped files permit the use of simple pointer dereferencing to access the data contained within the file. Each of the examples uses memory-mapped files for accessing data in PE files.

Note Refer to the section at the end of this article for a discussion on how to use PEFILE.DLL.

 

MS-DOS/Real-Mode Header

As mentioned above, the first component in the PE file format is the MS-DOS header. The MS-DOS header is not new for the PE file format. It is the same MS-DOS header that has been around since version 2 of the MS-DOS operating system. The main reason for keeping the same structure intact at the beginning of the PE file format is so that, when you attempt to load a file created under Windows version 3.1 or earlier, or MS DOS version 2.0 or later, the operating system can read the file and understand that it is not compatible. In other words, when you attempt to run a Windows NT executable on MS-DOS version 6.0, you get this message: "This program cannot be run in DOS mode." If the MS-DOS header was not included as the first part of the PE file format, the operating system would simply fail the attempt to load the file and offer something completely useless, such as: "The name specified is not recognized as an internal or external command, operable program or batch file."

The MS-DOS header occupies the first 64 bytes of the PE file. A structure representing its content is described below:

WINNT.H

 

typedef struct _IMAGE_DOS_HEADER {  // DOS .EXE header
    USHORT e_magic;         // Magic number
    USHORT e_cblp;          // Bytes on last page of file
    USHORT e_cp;            // Pages in file
    USHORT e_crlc;          // Relocations
    USHORT e_cparhdr;       // Size of header in paragraphs
    USHORT e_minalloc;      // Minimum extra paragraphs needed
    USHORT e_maxalloc;      // Maximum extra paragraphs needed
    USHORT e_ss;            // Initial (relative) SS value
    USHORT e_sp;            // Initial SP value
    USHORT e_csum;          // Checksum
    USHORT e_ip;            // Initial IP value
    USHORT e_cs;            // Initial (relative) CS value
    USHORT e_lfarlc;        // File address of relocation table
    USHORT e_ovno;          // Overlay number
    USHORT e_res[4];        // Reserved words
    USHORT e_oemid;         // OEM identifier (for e_oeminfo)
    USHORT e_oeminfo;       // OEM information; e_oemid specific
    USHORT e_res2[10];      // Reserved words
    LONG   e_lfanew;        // File address of new exe header
  } IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

The first field, e_magic, is the so-called magic number. This field is used to identify an MS-DOS-compatible file type. All MS-DOS-compatible executable files set this value to 0x54AD, which represents the ASCII characters MZ. MS-DOS headers are sometimes referred to as MZ headers for this reason. Many other fields are important to MS-DOS operating systems, but for Windows NT, there is really one more important field in this structure. The final field, e_lfanew, is a 4-byte offset into the file where the PE file header is located. It is necessary to use this offset to locate the PE header in the file. For PE files in Windows NT, the PE file header occurs soon after the MS-DOS header with only the real-mode stub program between them.

 

Real-Mode Stub Program

The real-mode stub program is an actual program run by MS-DOS when the executable is loaded. For an actual MS-DOS executable image file, the application begins executing here. For successive operating systems, including Windows, OS/2®, and Windows NT, an MS-DOS stub program is placed here that runs instead of the actual application. The programs typically do no more than output a line of text, such as: "This program requires Microsoft Windows v3.1 or greater." Of course, whoever creates the application is able to place any stub they like here, meaning you may often see such things as: "You can't run a Windows NT application on OS/2, it's simply not possible."

When building an application for Windows version 3.1, the linker links a default stub program called WINSTUB.EXE into your executable. You can override the default linker behavior by substituting your own valid MS-DOS-based program in place of WINSTUB and indicating this to the linker with the STUB module definition statement. Applications developed for Windows NT can do the same thing by using the -STUB: linker option when linking the executable file.

 

PE File Header and Signature

The PE file header is located by indexing the e_lfanew field of the MS-DOS header. The e_lfanew field simply gives the offset in the file, so add the file's memory-mapped base address to determine the actual memory-mapped address. For example, the following macro is included in the PEFILE.H source file:

PEFILE.H

 

#define NTSIGNATURE(a) ((LPVOID)((BYTE *)a +    \
                        ((PIMAGE_DOS_HEADER)a)->e_lfanew))

When manipulating PE file information, I found that there were several locations in the file that I needed to refer to often. Since these locations are merely offsets into the file, it is easier to implement these locations as macros because they provide much better performance than functions do.

Notice that instead of retrieving the offset of the PE file header, this macro retrieves the location of the PE file signature. Starting with Windows and OS/2 executables, .EXE files were given file signatures to specify the intended target operating system. For the PE file format in Windows NT, this signature occurs immediately before the PE file header structure. In versions of Windows and OS/2, the signature is the first word of the file header. Also, for the PE file format, Windows NT uses a DWORD for the signature.

The macro presented above returns the offset of where the file signature appears, regardless of which type of executable file it is. So depending on whether it's a Windows NT file signature or not, the file header exists either after the signature DWORD or at the signature WORD. To resolve this confusion, I wrote the ImageFileType function (following), which returns the type of image file:

PEFILE.C

 

DWORD  WINAPI ImageFileType (
    LPVOID    lpFile)
{
    /* DOS file signature comes first. */
    if (*(USHORT *)lpFile == IMAGE_DOS_SIGNATURE)
        {
        /* Determine location of PE File header from 
           DOS header. */
        if (LOWORD (*(DWORD *)NTSIGNATURE (lpFile)) ==
                                IMAGE_OS2_SIGNATURE ||
            LOWORD (*(DWORD *)NTSIGNATURE (lpFile)) ==
                             IMAGE_OS2_SIGNATURE_LE)
            return (DWORD)LOWORD(*(DWORD *)NTSIGNATURE (lpFile));

        else if (*(DWORD *)NTSIGNATURE (lpFile) ==
                            IMAGE_NT_SIGNATURE)
            return IMAGE_NT_SIGNATURE;

        else
            return IMAGE_DOS_SIGNATURE;
        }

    else
        /* unknown file type */
        return 0;
}

The code listed above quickly shows how useful the NTSIGNATURE macro becomes. The macro makes it easy to compare the different file types and return the appropriate one for a given type of file. The four different file types defined in WINNT.H are:

WINNT.H

 

#define IMAGE_DOS_SIGNATURE             0x5A4D      // MZ
#define IMAGE_OS2_SIGNATURE             0x454E      // NE
#define IMAGE_OS2_SIGNATURE_LE          0x454C      // LE
#define IMAGE_NT_SIGNATURE              0x00004550  // PE00

At first it seems curious that Windows executable file types do not appear on this list. But then, after a little investigation, the reason becomes clear: There really is no difference between Windows executables and OS/2 executables other than the operating system version specification. Both operating systems share the same executable file structure.

Turning our attention back to the Windows NT PE file format, we find that once we have the location of the file signature, the PE file follows four bytes later. The next macro identifies the PE file header:

PEFILE.C

 

#define PEFHDROFFSET(a) ((LPVOID)((BYTE *)a +  \
    ((PIMAGE_DOS_HEADER)a)->e_lfanew + SIZE_OF_NT_SIGNATURE))

The only difference between this and the previous macro is that this one adds in the constant SIZE_OF_NT_SIGNATURE. Sad to say, this constant is not defined in WINNT.H, but is instead one I defined in PEFILE.H as the size of a DWORD.

Now that we know the location of the PE file header, we can examine the data in the header simply by assigning this location to a structure, as in the following example:

 

PIMAGE_FILE_HEADER   pfh;

pfh = (PIMAGE_FILE_HEADER)PEFHDROFFSET (lpFile);

In this example, lpFile represents a pointer to the base of the memory-mapped executable file, and therein lies the convenience of memory-mapped files. No file I/O needs to be performed; simply dereference the pointer pfh to access information in the file. The PE file header structure is defined as:

WINNT.H

 

typedef struct _IMAGE_FILE_HEADER {
    USHORT  Machine;
    USHORT  NumberOfSections;
    ULONG   TimeDateStamp;
    ULONG   PointerToSymbolTable;
    ULONG   NumberOfSymbols;
    USHORT  SizeOfOptionalHeader;
    USHORT  Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;

#define IMAGE_SIZEOF_FILE_HEADER             20

Notice that the size of the file header structure is conveniently defined in the include file. This makes it easy to get the size of the structure, but I found it easier to use the sizeof function on the structure itself because it does not require me to remember the name of the constant IMAGE_SIZEOF_FILE_HEADER in addition to the IMAGE_FILE_HEADER structure name itself. On the other hand, remembering the name of all the structures proved challenging enough, especially since none of these structures is documented anywhere except in the WINNT.H include file.

The information in the PE file is basically high-level information that is used by the system or applications to determine how to treat the file. The first field is used to indicate what type of machine the executable was built for, such as the DEC® Alpha, MIPS R4000, Intel® x86, or some other processor. The system uses this information to quickly determine how to treat the file before going any further into the rest of the file data.

The Characteristics field identifies specific characteristics about the file. For example, consider how separate debug files are managed for an executable. It is possible to strip debug information from a PE file and store it in a debug file (.DBG) for use by debuggers. To do this, a debugger needs to know whether to find the debug information in a separate file or not and whether the information has been stripped from the file or not. A debugger could find out by drilling down into the executable file looking for debug information. To save the debugger from having to search the file, a file characteristic that indicates that the file has been stripped (IMAGE_FILE_DEBUG_STRIPPED) was invented. Debuggers can look in the PE file header to quickly determine whether the debug information is present in the file or not.

WINNT.H defines several other flags that indicate file header information much the way the example described above does. I'll leave it as an exercise for the reader to look up the flags to see if any of them are interesting or not. They are located in WINNT.H immediately after the IMAGE_FILE_HEADER structure described above.

One other useful entry in the PE file header structure is the NumberOfSections field. It turns out that you need to know how many sections--more specifically, how many section headers and section bodies--are in the file in order to extract the information easily. Each section header and section body is laid out sequentially in the file, so the number of sections is necessary to determine where the section headers and bodies end. The following function extracts the number of sections from the PE file header:

PEFILE.C

 

int   WINAPI NumOfSections (
    LPVOID    lpFile)
{
    /* Number of sections is indicated in file header. */
    return (int)((PIMAGE_FILE_HEADER)
                  PEFHDROFFSET (lpFile))->NumberOfSections);
}

As you can see, the PEFHDROFFSET and the other macros are pretty handy to have around.

 

PE Optional Header

The next 224 bytes in the executable file make up the PE optional header. Though its name is "optional header," rest assured that this is not an optional entry in PE executable files. A pointer to the optional header is obtained with the OPTHDROFFSET macro:

PEFILE.H

 

#define OPTHDROFFSET(a) ((LPVOID)((BYTE *)a                 + \
    ((PIMAGE_DOS_HEADER)a)->e_lfanew + SIZE_OF_NT_SIGNATURE + \
    sizeof (IMAGE_FILE_HEADER)))

The optional header contains most of the meaningful information about the executable image, such as initial stack size, program entry point location, preferred base address, operating system version, section alignment information, and so forth. The IMAGE_OPTIONAL_HEADER structure represents the optional header as follows:

WINNT.H

 

typedef struct _IMAGE_OPTIONAL_HEADER {
    //
    // Standard fields.
    //
    USHORT  Magic;
    UCHAR   MajorLinkerVersion;
    UCHAR   MinorLinkerVersion;
    ULONG   SizeOfCode;
    ULONG   SizeOfInitializedData;
    ULONG   SizeOfUninitializedData;
    ULONG   AddressOfEntryPoint;
    ULONG   BaseOfCode;
    ULONG   BaseOfData;
    //
    // NT additional fields.
    //
    ULONG   ImageBase;
    ULONG   SectionAlignment;
    ULONG   FileAlignment;
    USHORT  MajorOperatingSystemVersion;
    USHORT  MinorOperatingSystemVersion;
    USHORT  MajorImageVersion;
    USHORT  MinorImageVersion;
    USHORT  MajorSubsystemVersion;
    USHORT  MinorSubsystemVersion;
    ULONG   Reserved1;
    ULONG   SizeOfImage;
    ULONG   SizeOfHeaders;
    ULONG   CheckSum;
    USHORT  Subsystem;
    USHORT  DllCharacteristics;
    ULONG   SizeOfStackReserve;
    ULONG   SizeOfStackCommit;
    ULONG   SizeOfHeapReserve;
    ULONG   SizeOfHeapCommit;
    ULONG   LoaderFlags;
    ULONG   NumberOfRvaAndSizes;
    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER, *PIMAGE_OPTIONAL_HEADER;

As you can see, the list of fields in this structure is rather lengthy. Rather than bore you with descriptions of all of these fields, I'll simply discuss the useful ones--that is, useful in the context of exploring the PE file format.

 

Standard Fields

First, note that the structure is divided into "Standard fields" and "NT additional fields." The standard fields are those common to the Common Object File Format (COFF), which most UNIX executable files use. Though the standard fields retain the names defined in COFF, Windows NT actually uses some of them for different purposes that would be better described with other names.

 

  • Magic. I was unable to track down what this field is used for. For the EXEVIEW.EXE sample application, the value is 0x010B or 267.
  • MajorLinkerVersionMinorLinkerVersion. Indicates version of the linker that linked this image. The preliminary Windows NT Software Development Kit (SDK), which shipped with build 438 of Windows NT, includes linker version 2.39 (2.27 hex).
  • SizeOfCode. Size of executable code.
  • SizeOfInitializedData. Size of initialized data.
  • SizeOfUninitializedData. Size of uninitialized data.
  • AddressOfEntryPoint. Of the standard fields, the AddressOfEntryPoint field is the most interesting for the PE file format. This field indicates the location of the entry point for the application and, perhaps more importantly to system hackers, the location of the end of the Import Address Table (IAT). The following function demonstrates how to retrieve the entry point of a Windows NT executable image from the optional header.

    PEFILE.C

     

    LPVOID  WINAPI GetModuleEntryPoint (
        LPVOID    lpFile)
    {
        PIMAGE_OPTIONAL_HEADER   poh;
    
        poh = (PIMAGE_OPTIONAL_HEADER)OPTHDROFFSET (lpFile);
    
        if (poh != NULL)
            return (LPVOID)poh->AddressOfEntryPoint;
        else
            return NULL;
    }
    
  • BaseOfCode. Relative offset of code (".text" section) in loaded image.
  • BaseOfData. Relative offset of uninitialized data (".bss" section) in loaded image.

Windows NT Additional Fields

The additional fields added to the Windows NT PE file format provide loader support for much of the Windows NT-specific process behavior. Following is a summary of these fields.

 

  • ImageBase. Preferred base address in the address space of a process to map the executable image to. The linker that comes with the Microsoft Win32 SDK for Windows NT defaults to 0x00400000, but you can override the default with the -BASE: linker switch.
  • SectionAlignment. Each section is loaded into the address space of a process sequentially, beginning at ImageBaseSectionAlignment dictates the minimum amount of space a section can occupy when loaded--that is, sections are aligned on SectionAlignment boundaries.

    Section alignment can be no less than the page size (currently 4096 bytes on the x86 platform) and must be a multiple of the page size as dictated by the behavior of Windows NT's virtual memory manager. 4096 bytes is the x86 linker default, but this can be set using the -ALIGN: linker switch.

     

  • FileAlignment. Minimum granularity of chunks of information within the image file prior to loading. For example, the linker zero-pads a section body (raw data for a section) up to the nearest FileAlignment boundary in the file. Version 2.39 of the linker mentioned earlier aligns image files on a 0x200-byte granularity. This value is constrained to be a power of 2 between 512 and 65,535.
  • MajorOperatingSystemVersion. Indicates the major version of the Windows NT operating system, currently set to 1 for Windows NT version 1.0.
  • MinorOperatingSystemVersion. Indicates the minor version of the Windows NT operating system, currently set to 0 for Windows NT version 1.0
  • MajorImageVersion. Used to indicate the major version number of the application; in Microsoft Excel version 4.0, it would be 4.
  • MinorImageVersion. Used to indicate the minor version number of the application; in Microsoft Excel version 4.0, it would be 0.
  • MajorSubsystemVersion. Indicates the Windows NT Win32 subsystem major version number, currently set to 3 for Windows NT version 3.10.
  • MinorSubsystemVersion. Indicates the Windows NT Win32 subsystem minor version number, currently set to 10 for Windows NT version 3.10.
  • Reserved1. Unknown purpose, currently not used by the system and set to zero by the linker.
  • SizeOfImage. Indicates the amount of address space to reserve in the address space for the loaded executable image. This number is influenced greatly by SectionAlignment. For example, consider a system having a fixed page size of 4096 bytes. If you have an executable with 11 sections, each less than 4096 bytes, aligned on a 65,536-byte boundary, theSizeOfImage field would be set to 11 * 65,536 = 720,896 (176 pages). The same file linked with 4096-byte alignment would result in 11 * 4096 = 45,056 (11 pages) for theSizeOfImage field. This is a simple example in which each section requires less than a page of memory. In reality, the linker determines the exact SizeOfImage by figuring each section individually. It first determines how many bytes the section requires, then it rounds up to the nearest page boundary, and finally it rounds page count to the nearest SectionAlignmentboundary. The total is then the sum of each section's individual requirement.
  • SizeOfHeaders. This field indicates how much space in the file is used for representing all the file headers, including the MS-DOS header, PE file header, PE optional header, and PE section headers. The section bodies begin at this location in the file.
  • CheckSum. A checksum value is used to validate the executable file at load time. The value is set and verified by the linker. The algorithm used for creating these checksum values is proprietary information and will not be published.
  • Subsystem. Field used to identify the target subsystem for this executable. Each of the possible subsystem values are listed in the WINNT.H file immediately after theIMAGE_OPTIONAL_HEADER structure.
  • DllCharacteristics. Flags used to indicate if a DLL image includes entry points for process and thread initialization and termination.
  • SizeOfStackReserveSizeOfStackCommitSizeOfHeapReserveSizeOfHeapCommit. These fields control the amount of address space to reserve and commit for the stack and default heap. Both the stack and heap have default values of 1 page committed and 16 pages reserved. These values are set with the linker switches -STACKSIZE: and -HEAPSIZE:.
  • LoaderFlags. Tells the loader whether to break on load, debug on load, or the default, which is to let things run normally.
  • NumberOfRvaAndSizes. This field identifies the length of the DataDirectory array that follows. It is important to note that this field is used to identify the size of the array, not the number of valid entries in the array.
  • DataDirectory. The data directory indicates where to find other important components of executable information in the file. It is really nothing more than an array ofIMAGE_DATA_DIRECTORY structures that are located at the end of the optional header structure. The current PE file format defines 16 possible data directories, 11 of which are now being used.

Data Directories

As defined in WINNT.H, the data directories are:

WINNT.H

 

// Directory Entries

// Export Directory
#define IMAGE_DIRECTORY_ENTRY_EXPORT         0
// Import Directory
#define IMAGE_DIRECTORY_ENTRY_IMPORT         1
// Resource Directory
#define IMAGE_DIRECTORY_ENTRY_RESOURCE       2
// Exception Directory
#define IMAGE_DIRECTORY_ENTRY_EXCEPTION      3
// Security Directory
#define IMAGE_DIRECTORY_ENTRY_SECURITY       4
// Base Relocation Table
#define IMAGE_DIRECTORY_ENTRY_BASERELOC      5
// Debug Directory
#define IMAGE_DIRECTORY_ENTRY_DEBUG          6
// Description String
#define IMAGE_DIRECTORY_ENTRY_COPYRIGHT      7
// Machine Value (MIPS GP)
#define IMAGE_DIRECTORY_ENTRY_GLOBALPTR      8
// TLS Directory
#define IMAGE_DIRECTORY_ENTRY_TLS            9
// Load Configuration Directory
#define IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG    10

Each data directory is basically a structure defined as an IMAGE_DATA_DIRECTORY. And although data directory entries themselves are the same, each specific directory type is entirely unique. The definition of each defined data directory is described in "Predefined Sections" later in this article.

WINNT.H

 

typedef struct _IMAGE_DATA_DIRECTORY {
    ULONG   VirtualAddress;
    ULONG   Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

Each data directory entry specifies the size and relative virtual address of the directory. To locate a particular directory, you determine the relative address from the data directory array in the optional header. Then use the virtual address to determine which section the directory is in. Once you determine which section contains the directory, the section header for that section is then used to find the exact file offset location of the data directory.

So to get a data directory, you first need to know about sections, which are described next. An example of how to locate data directories immediately follows this discussion.

 

PE File Sections

The PE file specification consists of the headers defined so far and a generic object called a section. Sections contain the content of the file, including code, data, resources, and other executable information. Each section has a header and a body (the raw data). Section headers are described below, but section bodies lack a rigid file structure. They can be organized in almost any way a linker wishes to organize them, as long as the header is filled with enough information to be able to decipher the data.

 

Section Headers

Section headers are located sequentially right after the optional header in the PE file format. Each section header is 40 bytes with no padding between them. Section headers are defined as in the following structure:

WINNT.H

 

#define IMAGE_SIZEOF_SHORT_NAME              8

typedef struct _IMAGE_SECTION_HEADER {
    UCHAR   Name[IMAGE_SIZEOF_SHORT_NAME];
    union {
            ULONG   PhysicalAddress;
            ULONG   VirtualSize;
    } Misc;
    ULONG   VirtualAddress;
    ULONG   SizeOfRawData;
    ULONG   PointerToRawData;
    ULONG   PointerToRelocations;
    ULONG   PointerToLinenumbers;
    USHORT  NumberOfRelocations;
    USHORT  NumberOfLinenumbers;
    ULONG   Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;

How do you go about getting section header information for a particular section? Since section headers are organized sequentially in no specific order, section headers must be located by name. The following function shows how to retrieve a section header from a PE image file given the name of the section:

PEFILE.C

 

BOOL    WINAPI GetSectionHdrByName (
    LPVOID                   lpFile,
    IMAGE_SECTION_HEADER     *sh,
    char                     *szSection)
{
    PIMAGE_SECTION_HEADER    psh;
    int                      nSections = NumOfSections (lpFile);
    int                      i;


    if ((psh = (PIMAGE_SECTION_HEADER)SECHDROFFSET (lpFile)) !=
         NULL)
        {
        /* find the section by name */
        for (i=0; i<nSections; i++)
            {
            if (!strcmp (psh->Name, szSection))
                {
                /* copy data to header */
                CopyMemory ((LPVOID)sh,
                            (LPVOID)psh,
                            sizeof (IMAGE_SECTION_HEADER));
                return TRUE;
                }
            else
                psh++;
            }
        }

    return FALSE;
}

The function simply locates the first section header via the SECHDROFFSET macro. Then the function loops through each section, comparing each section's name with the name of the section it's looking for, until it finds the right one. When the section is found, the function copies the data from the memory-mapped file to the structure passed in to the function. The fields of theIMAGE_SECTION_HEADER structure can then be accessed directly from the structure.

 

Section Header Fields

 

  • Name. Each section header has a name field up to eight characters long, for which the first character must be a period.
  • PhysicalAddress or VirtualSize. The second field is a union field that is not currently used.
  • VirtualAddress. This field identifies the virtual address in the process's address space to which to load the section. The actual address is created by taking the value of this field and adding it to the ImageBase virtual address in the optional header structure. Keep in mind, though, that if this image file represents a DLL, there is no guarantee that the DLL will be loaded to theImageBase location requested. So once the file is loaded into a process, the actual ImageBase value should be verified programmatically using GetModuleHandle.
  • SizeOfRawData. This field indicates the FileAlignment-relative size of the section body. The actual size of the section body will be less than or equal to a multiple of FileAlignment in the file. Once the image is loaded into a process's address space, the size of the section body becomes less than or equal to a multiple of SectionAlignment.
  • PointerToRawData. This is an offset to the location of the section body in the file.
  • PointerToRelocationsPointerToLinenumbersNumberOfRelocationsNumberOfLinenumbers. None of these fields are used in the PE file format.
  • Characteristics. Defines the section characteristics. These values are found both in WINNT.H and in the Portable Executable Format specification located on this CD.

 

 

 

 

 

 

 

 

 

 

Value Definition
0x00000020 Code section
0x00000040 Initialized data section
0x00000080 Uninitialized data section
0x04000000 Section cannot be cached
0x08000000 Section is not pageable
0x10000000 Section is shared
0x20000000 Executable section
0x40000000 Readable section
0x80000000 Writable section

 

 

Locating Data Directories

Data directories exist within the body of their corresponding data section. Typically, data directories are the first structure within the section body, but not out of necessity. For that reason, you need to retrieve information from both the section header and optional header to locate a specific data directory.

To make this process easier, the following function was written to locate the data directory for any of the directories defined in WINNT.H:

PEFILE.C

 

LPVOID  WINAPI ImageDirectoryOffset (
        LPVOID    lpFile,
        DWORD     dwIMAGE_DIRECTORY)
{
    PIMAGE_OPTIONAL_HEADER   poh;
    PIMAGE_SECTION_HEADER    psh;
    int                      nSections = NumOfSections (lpFile);
    int                      i = 0;
    LPVOID                   VAImageDir;

    /* Must be 0 thru (NumberOfRvaAndSizes-1). */
    if (dwIMAGE_DIRECTORY >= poh->NumberOfRvaAndSizes)
        return NULL;

    /* Retrieve offsets to optional and section headers. */
    poh = (PIMAGE_OPTIONAL_HEADER)OPTHDROFFSET (lpFile);
    psh = (PIMAGE_SECTION_HEADER)SECHDROFFSET (lpFile);

    /* Locate image directory's relative virtual address. */
    VAImageDir = (LPVOID)poh->DataDirectory
                       [dwIMAGE_DIRECTORY].VirtualAddress;

    /* Locate section containing image directory. */
    while (i++<nSections)
        {
        if (psh->VirtualAddress <= (DWORD)VAImageDir &&
            psh->VirtualAddress + 
                 psh->SizeOfRawData > (DWORD)VAImageDir)
            break;
        psh++;
        }

    if (i > nSections)
        return NULL;

    /* Return image import directory offset. */
    return (LPVOID)(((int)lpFile + 
                     (int)VAImageDir. psh->VirtualAddress) +
                    (int)psh->PointerToRawData);
}

The function begins by validating the requested data directory entry number. Then it retrieves pointers to the optional header and first section header. From the optional header, the function determines the data directory's virtual address, and it uses this value to determine within which section body the data directory is located. Once the appropriate section body has been identified, the specific location of the data directory is found by translating the relative virtual address of the data directory to a specific address into the file.

 

Predefined Sections

An application for Windows NT typically has the nine predefined sections named .text, .bss, .rdata, .data, .rsrc, .edata, .idata, .pdata, and .debug. Some applications do not need all of these sections, while others may define still more sections to suit their specific needs. This behavior is similar to code and data segments in MS-DOS and Windows version 3.1. In fact, the way an application defines a unique section is by using the standard compiler directives for naming code and data segments or by using the name segment compiler option -NT--exactly the same way in which applications defined unique code and data segments in Windows version 3.1.

The following is a discussion of some of the more interesting sections common to typical Windows NT PE files.

 

Executable code section, .text

One difference between Windows version 3.1 and Windows NT is that the default behavior combines all code segments (as they are referred to in Windows version 3.1) into a single section called ".text" in Windows NT. Since Windows NT uses a page-based virtual memory management system, there is no advantage to separating code into distinct code segments. Consequently, having one large code section is easier to manage for both the operating system an


鲜花

握手

雷人

路过

鸡蛋
该文章已有0人参与评论

请发表评论

全部评论

专题导读
热门推荐
阅读排行榜

扫描微信二维码

查看手机版网站

随时了解更新最新资讯

139-2527-9053

在线客服(服务时间 9:00~18:00)

在线QQ客服
地址:深圳市南山区西丽大学城创智工业园
电邮:jeky_zhao#qq.com
移动电话:139-2527-9053

Powered by 互联科技 X3.4© 2001-2213 极客世界.|Sitemap