Charles Hooper

Thoughts and projects from an infrastructure engineer

Dropping Privileges in Python for Tornado Apps

Today, I’m going to show you how to drop from the root user to an unprivileged user in Python for the purpose of running a Tornado app.

First make a system user for your project to run as. In my example, I’ll be using projectuser as the username. Creating this user can be done like so:

sudo useradd --system --user-group projectuser

Now, in the script that is responsible for starting your Tornado app, you probably have something that looks like the following:

Snippet in web.py
if __name__ == "__main__": 
    http_server = tornado.httpserver.HTTPServer(application) 
    http_server.listen(port) 
    tornado.ioloop.IOLoop.instance().start()

What we need to do now is define a user to run as and then drop privileges using a call to setuid. We can do this by replacing the above with:

Modification to web.py
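if __name__ == "__main__":
    import os
    import pwd
    # define user to run as
    run_as_user = "projectuser"
    # look up the account; getpwnam raises KeyError if the user does not exist
    pw = pwd.getpwnam(run_as_user)
    # drop privileges: supplementary groups and group first, user last,
    # because setgroups/setgid are no longer permitted once root is given up
    os.setgroups([])
    os.setgid(pw.pw_gid)
    os.setuid(pw.pw_uid)
    # start tornado app
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(port)
    tornado.ioloop.IOLoop.instance().start()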

And voila, your app should now run as the user you defined! Do note that only the root user can use setuid to switch to a different user. As a result, your script now needs to be started as root, for example using sudo or from an upstart startup script.

One caveat is that you won’t be able to use port numbers below 1024 since you are dropping to an unprivileged user before binding to the port. I think there’s a way to get around this by replacing http_server.listen() with http_server.bind(), http_server.start(), and dropping privileges between those calls, but this remains untested for now. Alternatively, you could use the respective proxy modules for Lighttpd or nginx to listen on privileged ports.
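The bind-first idea above can be sketched with the standard library alone. This sketch is not Tornado-specific and has not been tested against Tornado: it binds a listening socket while still root and only then drops to the unprivileged user. The helper names lookup_ids, drop_privileges, and bind_then_drop are made up for this illustration, and handing the resulting socket to Tornado (for example through HTTPServer.add_sockets) is an assumption.

```python
import os
import pwd
import socket


def lookup_ids(username):
    """Return the (uid, gid) pair for a system account."""
    entry = pwd.getpwnam(username)  # raises KeyError for unknown users
    return entry.pw_uid, entry.pw_gid


def drop_privileges(username):
    """Switch from root to `username`; return False if we are not root."""
    if os.getuid() != 0:
        return False
    uid, gid = lookup_ids(username)
    os.setgroups([])  # clear root's supplementary groups
    os.setgid(gid)    # group first: this is not permitted after setuid
    os.setuid(uid)
    return True


def bind_then_drop(port, username):
    """Bind a listening TCP socket (as root), then drop privileges."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))  # ports below 1024 need root at this point
    sock.listen(128)
    drop_privileges(username)
    return sock
```

Called as root, bind_then_drop(80, "projectuser") would return a socket that is already listening on port 80 while the process itself continues as projectuser.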

Twitter BookSuggest

Twitter BookSuggest is a web app that attempts to make book recommendations based on a person’s last 20 tweets. Clicking on a book cover will present you with a description of the book, as well as a clickable link to Amazon.com where you can purchase the book or add it to your wishlist. I’ve been working on this project for the past couple of weeks, so please check it out. If you don’t use Twitter then just hang tight: I’ll be releasing a Facebook version shortly.

Twitter-repeater Source Released

Last night I finally released the source to my twitter-repeater bot.

> twitter-repeater is a bot that automatically retweets any tweets in which its name is “mentioned”. In order for a tweet to be retweeted, the bot account must be following the original user who tweeted it, that user must not be on the ignore list, and the tweet must pass some basic quality tests.
>
> The idea was originally inspired by the @SanMo bot and was created so I could use something similar for New London, CT (@NLCT)

The bot is released under the MIT license and makes use of the tweepy library.
 

Minesweeper Hacking – Viewing Process Memory in Windows

I wrote a very simple program to read Minesweeper’s memory and display a grid showing where the bombs are. I used OllyDbg for disassembly and reversing and CheatEngine for quickly finding known values in memory. During this process, I found that Minesweeper will sometimes assist you and move bombs away from where you are clicking. Originally, I thought that Minesweeper was only “spawning” about half of the bombs, but as it turns out I misunderstood the way the minefield is represented in memory: all bombs are generated at the beginning of the game, not on the first click or any later clicks.

My error was in thinking that the minefield was stored in a 2-dimensional array (i.e., minefield[x][y] = FLAGS) where max(x) and max(y) are the size of the grid (e.g., 9×9 on Beginner), but as xumiiz on Reddit pointed out:

His program is buggy. It’s not reading the grid in correctly – it’s a constant width of 32 bytes, but a window from the top left is taken for the actual size of the playing field. So, first bugfix to his source:

for(DWORD grid_loc = 0; grid_loc < grid_height * grid_width; grid_loc++) {

should be:

for(DWORD grid_loc = 0; grid_loc < grid_height * 32; grid_loc += ((grid_loc%32)==(grid_width-1)) ? (32-grid_width+1):1) {

And:

if((grid_loc % grid_width) == (grid_width - 1))

should be changed to:

if((grid_loc % 32) == (grid_width - 1))

With these fixes, it reads all the bombs properly.

And also an anonymous comment:

Sorry but your program is reading the grid incorrectly. Minesweeper uses a grid with a fixed width of 32 bytes and the playing field is taken as a window of that grid from the top left. e.g. beginner mode uses bytes 0 to 8 and skips bytes 9 to 31 per every 32 byte row.* Fixing the program to read based on that pattern shows that Minesweeper only moves the mine if it happens to be the first square you click on. Apart from that, all mines are randomly placed at the start of the game. (* Actually it would use bytes 0 to 10, where bytes 0 and 10 are 0x10 which is to indicate the border of the mine field, and bytes 1 to 9 are the actual squares. but that’s not really relevant to the analysis if you’re just &ing with 0x80 to find bombs.)
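The layout both commenters describe (rows that are always 32 bytes wide, with only a window from the left actually used) can be sketched in a few lines of Python. This is only an illustration over a plain byte buffer, not the original program: find_mines and its first_col parameter are hypothetical names, and first_col would be set to 1 if the buffer includes the border byte mentioned in the footnote.

```python
ROW_STRIDE = 32   # every row occupies 32 bytes in memory
MINE_FLAG = 0x80  # bit that marks a square as holding a mine


def find_mines(memory, width, height, stride=ROW_STRIDE, first_col=0):
    """Return (row, col) pairs of mined squares inside the visible window."""
    mines = []
    for row in range(height):
        row_start = row * stride + first_col
        for col in range(width):
            # Only the first `width` bytes of each 32-byte row are squares.
            if memory[row_start + col] & MINE_FLAG:
                mines.append((row, col))
    return mines
```

Bytes beyond the window, such as offset 9 onward on a Beginner row, are never inspected, which is exactly what the flat grid_height * grid_width loop got wrong.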

The program is available here:

I have released a binary, and the software has been placed under the BSD License. Many thanks to the people and sites who linked me, people who posted comments, most definitely the contributors, but most importantly: the readers.

Link Aggregation on Cisco Catalysts & Foundry Switches

LACP, or Link Aggregation Control Protocol, allows you to configure multiple Ethernet ports to act as a single logical link. This is sometimes referred to as channel bonding or trunking. Link aggregation provides several benefits: increased bandwidth, load balancing, and redundant Ethernet links. If a link in your Ethernet channel goes down, the switches, routers, or servers you have configured to use LACP will automatically fail over to the links that are still up and remain connected. With the right hardware and the right firmware, setting this up is very simple. On a Cisco Catalyst switch running IOS, once logged in, you will need to perform the following steps:

cisco>enable  
Password: *enter password*  
cisco#config term  
cisco(config)#int Gi0/1  
cisco(config-if)#channel-group 1 mode active  
cisco(config-if)#channel-protocol lacp  
cisco(config-if)#int Gi0/2  
cisco(config-if)#channel-group 1 mode active  
cisco(config-if)#channel-protocol lacp  
cisco(config-if)#end  
cisco#write mem

It also helps to label your ports and to be sure that your bonded ports are configured in the right VLAN or set up for VLAN trunking. If your other device is another Cisco Catalyst, then just repeat the steps above, connect your newly bonded ports, and disconnect any non-bonded ports connecting the two devices. However, if you are configuring a Foundry switch, your configuration is just as easy, but slightly different.

BR-telnet@foundry>enable  
Password: *enter password*  
BR-telnet@foundry#config term  
BR-telnet@foundry(config)#interface ethernet 1  
BR-telnet@foundry(config-if-e1000-1)#link-aggregate active  
BR-telnet@foundry(config-if-e1000-1)#interface ethernet 2  
BR-telnet@foundry(config-if-e1000-2)#link-aggregate active  
BR-telnet@foundry(config-if-e1000-2)#end  
BR-telnet@foundry#write mem

Provided you don’t have a complicated VLAN setup, you’re all set! One thing you should note about the Foundry devices is that you can only start your trunked ports on the first port of each group of 4. For example, on a 12-port switch, you would only be able to start port groups on ports 1, 5, and 9. This does not mean that you can’t bond ports 1 & 2, but it does mean that you can’t bond ports 3 & 4.
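The group-of-4 rule is easy to check before touching the switch. The following is a small sketch of the arithmetic only; the function names are invented for this example and nothing here talks to a Foundry device.

```python
GROUP_SIZE = 4  # trunks may only start on the first port of each group of 4


def is_valid_trunk_start(port):
    """True if a trunk group may begin on this 1-based port number."""
    return port >= 1 and (port - 1) % GROUP_SIZE == 0


def valid_trunk_starts(port_count):
    """List every port on the switch where a trunk group may begin."""
    return [p for p in range(1, port_count + 1) if is_valid_trunk_start(p)]
```

For a 12-port switch this yields ports 1, 5, and 9, so a trunk of ports 1 & 2 is acceptable while one beginning on port 3 is not.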

Recommended reading: the Link aggregation article on Wikipedia

Link Aggregation on a RedHat (CentOS) Server and a Cisco Catalyst Switch

IEEE 802.3ad, more commonly known as Link Aggregation, allows you to configure multiple Ethernet ports to act as a single logical link. This is sometimes referred to as channel bonding, Ethernet bonding, or trunking. Link aggregation provides several benefits: increased bandwidth, load balancing, and redundant Ethernet links. If a link in your Ethernet channel goes down, the switches, routers, or servers you have configured to use LACP will automatically fail over to the links that are still up and remain connected. On a RedHat-based Linux distribution, such as CentOS or Fedora, the configuration may look a little complex, but in fact it is very straightforward.

To enable EtherChannel bonding on your RedHat-based server, follow these four easy steps:

1. First, create/replace /etc/sysconfig/network-scripts/ifcfg-bond0 with:

DEVICE=bond0
ONBOOT=yes
USERCTL=no

This file is also where you will configure your interface options, such as your IP address or if you will be using DHCP to obtain that information automatically.

2. Next, for each interface you want bonded, create the file /etc/sysconfig/network-scripts/ifcfg-ethX (backing up any existing one first), where X is the number of the interface, for example eth0 and eth1. Check the output of dmesg or ifconfig if you are unsure which interfaces you have.

In each of these files, put the following lines:

DEVICE=ethX
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=no

Once again, replace the X with the number of the interface.

3. Add the following text to /etc/modprobe.conf on a line by itself: alias bond0 bonding. Now would also be a good time to review your firewall rules and configuration files, and to change any interface-specific directives to refer to “bond0” instead of “eth0”, or whatever it may have been before.

4. Restart your networking scripts. As root, issue the following command: service network restart

At this point you may lose connectivity. Do not panic! This is because we have configured the server to use EtherChannel bonding, but we have not told the switch we were going to do so.
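If you have more than a couple of interfaces, the files from steps 1 and 2 can be generated rather than typed by hand. The sketch below only builds the file contents shown above as strings; render_bond_configs is a hypothetical helper, and writing the results into /etc/sysconfig/network-scripts is left to you.

```python
BOND_TEMPLATE = "DEVICE={bond}\nONBOOT=yes\nUSERCTL=no\n"
SLAVE_TEMPLATE = (
    "DEVICE={iface}\nONBOOT=yes\nMASTER={bond}\nSLAVE=yes\nUSERCTL=no\n"
)


def render_bond_configs(interfaces, bond="bond0"):
    """Map each ifcfg file name to the text it should contain."""
    configs = {"ifcfg-" + bond: BOND_TEMPLATE.format(bond=bond)}
    for iface in interfaces:
        configs["ifcfg-" + iface] = SLAVE_TEMPLATE.format(
            iface=iface, bond=bond
        )
    return configs
```

Calling render_bond_configs(["eth0", "eth1"]) returns three entries: one for ifcfg-bond0 and one for each bonded interface.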

On a Cisco Catalyst switch running IOS, once logged in, you enable EtherChannel bonding by performing the following steps. These steps assume that both interfaces are configured in the same (and correct) VLAN, and have the same speed and duplex settings.

cisco>enable
Password: *enter password*
cisco#config term
cisco(config)#int Fa0/1
cisco(config-if)#channel-group 1 mode on
cisco(config-if)#int Fa0/2
cisco(config-if)#channel-group 1 mode on
cisco(config-if)#end
cisco#write mem

It also helps to label your ports. It is important to understand that your bonded interfaces do not need to be consecutive ports on the switch, but consolidating them to one location on the switch is good for organizational purposes.

Now you are free to plug in your additional cables and enjoy your new redundant Ethernet links!

Hacking the Actiontec GT701 Wireless Gateway

Introduction

This paper is our attempt to deobfuscate the Actiontec GT701 wireless gateway. There are a couple of other websites out there with the same goal in mind; however, our intent was to provide accurate information based on various sources, including both official and unofficial documentation, kernel source, configuration files, and just plain hacking.

Hardware

The hardware making up this unit revolves around the AR7, Texas Instruments’ “system on a chip” solution for DSL routers. The hardware of the GT701 (or any other AR7-based device, for that matter) consists of a power supply, the 160MHz MIPS 4KEc V4.8 processor, 16MB of SDRAM, and 4MB of flash. For input/output, there’s the RJ-11 jack for your DSL line, an Ethernet jack (TI Avalanche CPMAC), a USB port, and an ACX-11x based (chip # TNETW130) wireless setup, as well as 6 status LEDs. On the board, there are also two separate sets of 5 pins each. These are mostly believed to be serial (JTAG is also possible) due to Texas Instruments displaying a serial/UART interface on the AR7 diagrams, several pins being attached to the board, and the following ADAM2 variables:

modetty0        38400,n,8,1,hw  
modetty1        38400,n,8,1,hw  
bootserport     tty0  

ADAM2

To be perfectly honest, we’re still not entirely sure what ADAM2 really is. We know that it’s stored on block 2 of the MTD device. We also know that it appears to be some sort of system for storing environment variables in flash, used during both boot-time and run-time, as well as a boot-loader of some sort. We also know that it’s responsible for storing the MAC addresses, as found in our mtd dump:

Error: environment variable "maca" not set.  
Setting default mac address : 00:e0:a0:a6:66:70  

The following is a dump of /proc/ticfg/env, which is the /proc interface to ADAM2.

# cat /proc/ticfg/env  
memsize 0x01000000  
flashsize       0x00400000  
modetty0        38400,n,8,1,hw  
modetty1        38400,n,8,1,hw  
bootserport     tty0  
cpufrequency    150000000  
sysfrequency    125000000  
bootloaderVersion       0.22.02  
ProductID       GT701-WG  
HWRevision      2A  
SerialNumber    none  
AEIBootVersion  0.2i  
my_ipaddress    192.168.0.1  
prompt  Adam2_AR7DB  
firstfreeaddress        0x9401d328  
req_fullrate_freq       125000000  
maca    00:20:E0:1D:95:F4  
mtd2    0x90000000,0x90010000  
mtd1    0x90010000,0x900d0000  
mtd0    0x900d0000,0x903e0000  
mtd3    0x903f0000,0x90400000  
macb    00:20:E0:1D:95:F5  
macc    00:20:E0:1D:95:F6  
usb_board_mac   00:20:E0:1D:95:F8  
usb_rndis_mac   00:20:E0:1D:95:F9  
mac_ap  00:20:E0:1D:95:F7  
autoload        1  
mtd4    0x903e0000,0x903f0000  
usb_pid 0x6010  
usb_vid 0x1668  
man     Actiontec Electronics, Inc.  
prod    Actiontec USB/Ethernet Home DSL Modem  

When you hold down the Reset button during boot, an FTP server is spawned on the default port (TCP/21) typically allowing you to flash new firmware, as well as set and unset different ADAM2 environment variables.

The following is a list of commands that the ADAM2 FTP server supports.

REBOOT          UNSETENV        SETENV          GETENV  
MEDIA           RETR            TYPE            STOR  
P@SW            PASV            SYST            PASS  
USER            PORT            QUIT            ABOR  

When Actiontec’s recovery app is run, it also sends a UDP packet to port 5035, and then initiates a connection to the FTP port. The following is the output of a sniffed connection of a typical firmware upgrade.

UDP broadcast port 5035: (16 bytes)  
0x00 0x00 0x16 0x02 0x01 0x00 0x00 0x00  
0xc0 0xa8 0x00 0x01 0x00 0x00 0x00 0x00  

UDP response from modem to port 5035: (16 bytes)  
0x00 0x00 0x16 0x02 0x02 0x00 0x00 0x00  
0x01 0x00 0xa8 0xc0 0x00 0x00 0x00 0x00  
  
220 ADAM2 FTP Server ready.  
USER adam2  
331 Password required for adam2.  
PASS adam2  
230 User adam2 successfully logged in.  
TYPE I  
200 Type set to I.  
MEDIA FLSH  
200 Media set to FLSH.  
PORT 192,168,0,102,130,11  
200 Port command successful.  
STOR nsp.ar7wrd.squashfs.img mtd0  
150 Opening BINARY mode data connection for file transfer.  
226 Transfer complete.  
TYPE I  
200 Type set to I.  
MEDIA FLSH  
200 Media set to FLSH.  
PORT 192,168,0,102,130,12  
200 Port command successful.  
STOR ram_zimage_pad.ar7wrd.nsp.squashfs.bin mtd1  
150 Opening BINARY mode data connection for file transfer.  
226 Transfer complete.  
TYPE I  
200 Type set to I.  
MEDIA FLSH  
200 Media set to FLSH.  
PORT 192,168,0,102,130,13  
200 Port command successful.  
STOR config.xml mtd3  
150 Opening BINARY mode data connection for file transfer.  
226 Transfer complete.  
REBOOT  
221-Thank you for using the FTP service on ADAM2.  
221 Goodbye.  
QUIT  
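Two details of the capture above can be decoded mechanically. The PORT arguments follow the standard FTP encoding (four address octets, then the port as high byte and low byte). Reading bytes 8 to 11 of each UDP packet as an IPv4 address is our own interpretation, not something documented: those bytes spell 192.168.0.1 in the request and the same octets in reverse order in the response. The function names in this Python sketch are made up for illustration.

```python
import socket

# The two 16-byte UDP payloads from the capture above.
REQUEST = bytes([0x00, 0x00, 0x16, 0x02, 0x01, 0x00, 0x00, 0x00,
                 0xc0, 0xa8, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00])
RESPONSE = bytes([0x00, 0x00, 0x16, 0x02, 0x02, 0x00, 0x00, 0x00,
                  0x01, 0x00, 0xa8, 0xc0, 0x00, 0x00, 0x00, 0x00])


def parse_port_argument(argument):
    """Decode an FTP PORT argument into (host, port)."""
    fields = [int(field) for field in argument.split(",")]
    host = ".".join(str(octet) for octet in fields[:4])
    port = fields[4] * 256 + fields[5]  # high byte, then low byte
    return host, port


def address_field(packet, reverse=False):
    """Read bytes 8..11 as a dotted IPv4 address (assumed meaning)."""
    raw = packet[8:12]
    if reverse:
        raw = raw[::-1]  # the response carries the octets reversed
    return socket.inet_ntoa(raw)
```

With these helpers, the first PORT command in the capture decodes to host 192.168.0.102 and port 33291, and both UDP packets yield 192.168.0.1 once the response's byte order is flipped.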

The Actiontec GT701’s MTD blocks are set up as follows:

mtd0            3,136K     Root (SquashFS - compressed filesystem)  
mtd1            768K       Kernel  
mtd2            64K        ADAM2  
mtd3            64K        config.xml  
mtd4            64K        unknown/unused  
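The sizes in this table can be cross-checked against the mtd0 through mtd4 address ranges in the /proc/ticfg/env dump, since each entry is a start,end pair. The Python sketch below does that subtraction; mtd_sizes_kb is a name invented for this example, and the input is simply the relevant lines copied from the dump.

```python
ENV_LINES = """\
mtd2    0x90000000,0x90010000
mtd1    0x90010000,0x900d0000
mtd0    0x900d0000,0x903e0000
mtd3    0x903f0000,0x90400000
mtd4    0x903e0000,0x903f0000
"""


def mtd_sizes_kb(env_text):
    """Return {block name: size in KB} from 'mtdN start,end' lines."""
    sizes = {}
    for line in env_text.splitlines():
        fields = line.split()
        if len(fields) != 2 or not fields[0].startswith("mtd"):
            continue  # skip anything that is not an mtd range
        start, end = (int(value, 16) for value in fields[1].split(","))
        sizes[fields[0]] = (end - start) // 1024
    return sizes
```

The result agrees with the table: 3,136K for mtd0, 768K for mtd1, and 64K each for mtd2, mtd3, and mtd4.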

We’re not too sure what else it is capable of, but there are some hints of it being able to boot off the network (DHCP) and/or boot specified images. Here are some ADAM2 commands, though we haven’t actually been able to test these yet:

fixenv         Defragment for Env. space  
unsetenv       Unsets the Env. variable   
setenv         Sets Env. variable  with a value   
printenv       Displays Env. Variables  
erase          Erase Flash except Adam2 Kernel and Env space  
setmfreq       Configures/dumps the system and cpu frequencies  
memop          Memory Optimization  
info           Displays board information  
h/help         Displays the commands supported  

There are others, but some of the command names didn’t show up, only the descriptions, and we don’t have a console hooked up to see them for ourselves yet.

Software

The Actiontec GT701 runs off of Linux kernel 2.4.17 patched for MIPS, ATM, SquashFS, and pre-empt (not enabled.) The kernel is provided by MontaVista and is believed to be the MontaVista Carrier Grade Linux kernel version 2.1.

Linux version 2.4.17_mvl21-malta-mips_fp_le () (gcc version 2.95.3 20010315 (release/MontaVista)) #24 Fri Jul 16 13:22:25 PDT 2004

Along with the kernel, the GT701 also runs BusyBox 0.61.pre with uClibc libraries (version 0.9.19). The root filesystem uses SquashFS 1.x, which is a compressed, read-only filesystem stored on the MTD block. One should note that SquashFS 2.x is not backwards-compatible with 1.x. A ramdisk is mounted at /var, and any files that require write access are either stored there or symlinked to that tree.

In order to retrieve and edit the file system one would first have to download SquashFS and compile it into their kernel, as well as build the user-land tools. Once this is complete your first step would be to either extract nsp.ar7wrd.squashfs.img from the recovery tool, or do something similar to the following (while running a tftp server):

# dd if=/dev/mtdblock/0 of=/var/mtd0  
6272+0 records in
6272+0 records out
# tftp -p -l /var/mtd0 -p mtd0.img

This will give you a mountable SquashFS image wherever you placed your tftp root. In order to write to it, though, you will need to copy a mounted SquashFS directory to a non-SquashFS directory as follows:

# mkdir temp fs  
# mount -o loop -t squashfs mtd0.img temp/  
# cp -R temp/ fs/  

And you now have a write-able directory to edit/delete or whatever else may please you. Re-creating the image is just as easy:

# mksquashfs target.old/ target.img -noappend -check_data  
Creating little endian filesystem on target.img, block size 32768.  
  
Little endian filesystem, data block size 32768, compressed data, compressed metadata  
Filesystem size 1897.99 Kbytes (1.85 Mbytes)  
33.35% of uncompressed filesystem size (5691.04 Kbytes)  
--- Output cut ---  

There are two things to keep in mind while building filesystem images. The first is that the GT701 can only store 3,136KB (compressed) on the flash chip. The second is that the filesystem is decompressed and stored in RAM when mounted, and you only have 16MB of RAM to begin with, so either way, it’s a tight fit.
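The flash limit is worth checking before you try to upload a rebuilt image. Here is a minimal Python sketch of that check, assuming the 3,136KB figure for mtd0 given above; image_fits and headroom_kb are hypothetical names, and in practice the size would come from something like os.path.getsize("target.img").

```python
MTD0_LIMIT_BYTES = 3136 * 1024  # capacity of the root filesystem block


def image_fits(size_bytes, limit_bytes=MTD0_LIMIT_BYTES):
    """True if an image of this size fits in the flash block."""
    return size_bytes <= limit_bytes


def headroom_kb(size_bytes, limit_bytes=MTD0_LIMIT_BYTES):
    """Remaining space in KB (negative if the image is too large)."""
    return (limit_bytes - size_bytes) / 1024
```

The 1897.99 Kbyte image from the mksquashfs output above would pass this check with a little over 1.2MB to spare.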

Actiontec uses a set of utilities to manage your configuration files. They manage the XML file stored on mtd3 as well as handle your web-based configuration changes. There is also supposed to be a CLI client for it; however, I haven’t quite figured out how that works yet. These utilities can usually be identified by having cm_ as a prefix, although the CGI program for the web-based configuration is called webcm, and of course, we can’t forget libcm.so. The XML file contains all of your configuration, including IP addresses, authentication, networking settings, and probably just about everything else. You can extract a current version of the file the same way we demonstrated dumping the filesystem above, but by replacing mtd0 with mtd3. You will also need to strip all of the excess garbage at the end of the file. I should also note that mtd3 is monitored regularly for corruption, and if mtd3 happens to become corrupted, it will be repopulated with /etc/config.xml.

The list of configuration programs is as follows:

cm_pc             Started at boot, stdout is /dev/tts/0, starts cm_logic and cm_monitor
cm_logic          Monitors and re-populates mtd3
cm_monitor        ? ... Not exactly sure.
cm_cli            Used to perform the actual updating of the config files.
webcm             Handles web-based configuration changes, sends them off to cm_cli

Webcm is used in conjunction with thttpd to provide a small, yet working, web-based interface to allow you to make changes to your gateway’s configuration.

As far as networking is concerned, the GT701 uses pppd with a PPPoA plugin for your connection to your ISP. For telnet and DHCP, the gateway uses utelnetd and udhcpd, respectively. The Actiontec GT701 also supports UPnP through the use of upnpd on interfaces ppp0 and br0. br0 consists of the USB device, the Ethernet device, and the wireless device.

The wireless drivers are not compiled into the kernel or as a kernel module; rather, they are handled by a userland driver called user_drv. On the original firmware, the user_drv_cli utility provided a very capable command line interface that allowed you to change many settings pertaining to the wireless network device. One of these settings was the regulatory domain: for instance, one could take their access point out of the FCC domain and place it under the French domain or, better yet, a custom domain, and change power levels as well as usable channels. In the newer firmware, it seems this software has been crippled, and will not allow you to access the CLI.

Conclusion

The Actiontec GT701-wg is a powerful embedded Linux device running on a MIPS platform based off of Texas Instruments’ AR7 “one-chip” solution. It is relatively easy to hack the GT701. The firmware images are squashFS 1.x images and the base Linux system is run on BusyBox with the uClibc libraries. If one were to set up a cross-compile environment and use the squashFS tools they could generate new firmware images with great ease.

Finding the Linux System Call Table in 2.6 Series Kernels

I have been modifying Sebek to get it to work in more recent 2.6 series (~2.6.18) kernels and ran into some snags. Most notably, I could not intercept/redirect/wrap any system calls. As it turns out, Sebek couldn’t find the system call table. The code Sebek was using to find the system call table is 100% identical to the code found in an article on KernelTrap.

Unfortunately, that code is outdated as either loops_per_jiffy, boot_cpu_data, or sys_call_table appear to have been moved. I found that I could find the system call table between unlock_kernel and loops_per_jiffy and have modified the code as follows.

// -----------------------------------------------------------------------------
// Sys Call Table Address
// -----------------------------------------------------------------------------
unsigned long **find_sys_call_table(void)
{
    unsigned long **sctable;
    unsigned long ptr;
    extern int loops_per_jiffy;

    sctable = NULL;
    // Walk memory one pointer-sized word at a time between the two symbols.
    for (ptr = (unsigned long)&unlock_kernel;
         ptr < (unsigned long)&loops_per_jiffy;
         ptr += sizeof(void *)) {
        unsigned long *p;

        p = (unsigned long *)ptr;
        // The table is found when its __NR_close slot points at sys_close.
        if (p[__NR_close] == (unsigned long)sys_close) {
            sctable = (unsigned long **)p;
            return &sctable[0];
        }
    }
    return NULL;
}
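The search strategy itself is independent of the kernel: slide over a range of words and stop at the first position whose entry at a known index holds a known address. The Python sketch below models only that idea over an ordinary list of integers. It is not kernel code, find_table is a made-up name, and the value 6 for NR_CLOSE is the 32-bit x86 number for __NR_close, assumed here for illustration.

```python
NR_CLOSE = 6  # __NR_close on 32-bit x86 (assumption for this sketch)


def find_table(words, close_address, index=NR_CLOSE):
    """Return the offset where a table with words[offset + index] ==
    close_address begins, or None if no such offset exists."""
    for start in range(len(words) - index):
        if words[start + index] == close_address:
            return start
    return None
```

As in the C version, the first match wins, so the approach relies on the address of sys_close not appearing at the wrong offset earlier in the scanned range.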

 

EtherChannel Trunking Between a Foundry Switch and Cisco Catalyst

In my other article, LACP on Cisco Catalysts & Foundry switches, I describe how to configure aggregated links using LACP (802.3ad) on a Cisco Catalyst and Foundry switch. In this howto, I will be describing how to configure trunked ports using EtherChannel — LACP’s predecessor.

Before starting, one thing that you should be aware of is that Foundry switches configure EtherChannel trunks as dot1q-encapsulated VLAN trunks by default. Cisco Catalysts (or IOS, rather) configure all ports as access ports by default.

On a Cisco Catalyst switch running IOS, once logged in, you will need to perform the steps below. It is important that you perform these steps without the other device connected.

cisco>enable
Password: *enter password*
cisco#config term
cisco(config)#int Gi0/1
cisco(config-if)#channel-group 1 mode on
cisco(config-if)#int Gi0/2
cisco(config-if)#channel-group 1 mode on
cisco(config-if)#end
cisco#write mem

If your other device is another Cisco Catalyst, then just repeat the steps above, connect your newly bonded ports, and disconnect any non-bonded ports connecting the two devices.

If you are configuring a Foundry switch, there are several rules you must follow. Among these rules is that you can only start your trunked ports on the first port of each group of 4.

BR-telnet@foundry>enable
Password: *enter password*
BR-telnet@foundry#config term
BR-telnet@foundry(config)#trunk ethe 1 to 2
BR-telnet@foundry(config)#trunk deploy
BR-telnet@foundry(config)#exit
BR-telnet@foundry#write mem

Provided you correctly configured your VLANs prior to setting up EtherChannel, you may connect your cables starting with the primary port. The primary port is always the lowest-numbered port in the trunk; in our example, this is port 1. Once connected, you can verify operation of your EtherChannel trunk with show etherchannel summary on the Cisco or show trunk on the Foundry in enable mode.

Auto Type Juggling and Unsanitized Input

I would like to think that one day we will all be past the point where we are constantly finding silly bugs and vulnerabilities caused by unvalidated or unsanitized user input. Unfortunately for us all, we’re not. In this article, I will attempt to describe a rather overlooked result of PHP’s type juggling.

If you aren’t already familiar, type juggling is PHP’s way of automatically converting variables to different types based on the context they are being used in. For example, if you have the strings “2” and “3” and you add them, PHP will convert them to integers and return the integer 5. This is useful in PHP web applications, since any variable that PHP retrieves from either GPC (Get, Post, Cookie) variables or the database is considered a string. This is regardless of its actual value or column type in the database, and the string will need to be converted to a numeric type before any math operations or numeric comparisons can be performed on it.

On a lower level, PHP is actually using libc’s strtod() function. If we look at the man page, we learn: “If no conversion is performed, zero is returned.” Without looking at PHP’s source, it would seem to me that PHP is blindly handing off string pointers to strtod() and if the type conversion fails it will retain the type of whatever the variable was last. This isn’t PHP’s fault; it’s the developer’s responsibility to make sure that the data they are using is properly sanitized!

The best real-world example I can give about this bug was a series of exploits in the web-based RPG Bootleggers last round. One of the most well known exploits in this series was known to players as “The Arm Wrestling bug.” This arm wrestling bug was the result of a flaw or limitation in PHP’s magic type switching that allowed players to bypass certain logic checks to send extremely large strings of numbers to the database backend and essentially create money out of thin air, one MySQL signed integer (2,147,483,647) at a time.

For an example on how we would go about exploiting this, we look at the following code:

$var1 = '1';
$var2 = '21';
var_dump((bool) ($var1 < $var2));
# Output: bool(true)

All this code is doing is assigning strings to two variables and then doing a numeric comparison of them. In our case, this script will output “true” since, when evaluated as integers, 1 is in fact less than 21. However, if there is any non-numeric data in either variable, then strtod() will return 0 and PHP will evaluate these variables as strings, comparing them character by character to determine if one variable is “less than” another. Proof?

$var1 = '111 '; // there is a space before the closing quote
$var2 = '21';
var_dump((bool) ($var1 < $var2));
# Output: bool(true)

This block of code will also output true because the character ‘1’ sorts before the character ‘2’ when the strings are compared character by character. If a user is able to bypass the logic checks with this exploit, then typically the string will be sent straight to the database as-is (OK, maybe escaped for apostrophes ;)). Depending on the DBMS there may be a variance in behavior. Usually, exploiters will use this attack with a very large string so that the column being updated with that value will actually update to the maximum value for that column’s data type.
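To make the failure mode concrete outside of PHP, here is a simplified Python model of the comparison behavior described above, together with the kind of strict check that closes the hole. It is only a sketch: looks_numeric approximates which strings are treated as numbers (trailing whitespace is deliberately rejected, matching the example above), and loose_less_than and is_valid_amount are invented names, not PHP or library functions.

```python
import re

# Rough model of a "numeric string": optional leading whitespace and sign,
# digits with an optional fraction or exponent, and nothing trailing.
_NUMERIC = re.compile(r"\s*[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def looks_numeric(text):
    return _NUMERIC.fullmatch(text) is not None


def loose_less_than(a, b):
    """Compare as numbers when both look numeric, else as plain strings."""
    if looks_numeric(a) and looks_numeric(b):
        return float(a) < float(b)
    return a < b  # character-by-character string comparison


def is_valid_amount(text):
    """Strict check: 1 to 9 ASCII digits, so the value fits a signed INT."""
    return re.fullmatch(r"[0-9]{1,9}", text) is not None
```

In this model loose_less_than("111 ", "21") is true, just like the second PHP snippet, while is_valid_amount("111 ") rejects the input before any comparison or database update takes place.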

While exploits on web-based RPGs aren’t the end of the world, imagine if your bank used a PHP-based web-application. Wouldn’t you feel better if they all knew about this caveat and properly sanitized and validated user input? :-)