SquTUN
IP over secure UDP tunnel
Using universal TUN/TAP interface

Kyle Rose
2005-Mar-09

INTRO

I wrote SquTUN (pronounced "skew-ton") over the course of two days in
order to solve a problem that had been plaguing me for two years: CIPE.
CIPE is bad in two ways: evidently, it has some security problems that
are covered on numerous web pages; and it has implementation problems
that are really, really irritating.

The primary implementation issue is that CIPE requires a complex custom
kernel module that has very annoying logging which can't be shut off
without modifications to the source. Furthermore, since it's in the
kernel and isn't used by that many people, it is possible that it has a
heretofore unknown vulnerability that could allow an attacker to DoS the
machine with a packet-of-death, or grant an attacker root access to a
machine on which it is running.

A secondary implementation issue is the complexity of its configuration:
it has lots of options that I simply won't use in the course of setting
up a VPN.

Both of these beg for a simpler VPN solution.

DESIGN PRINCIPLES

(1) Run in user space, and exchange packets with the kernel using an
    existing well-tested interface.
(2) Do no key exchange: use symmetric keys that are pre-exchanged
    between endpoints. KISS.
(3) Use a believed-secure crypto implementation to encrypt each packet.
(4) Use a believed-secure MAC implementation to authenticate each
    packet.

IMPLEMENTATION

SquTUN uses the universal TUN/TAP interface in TUN mode to receive, over
a character device, packets that are sent by the kernel to that virtual
interface. It generates a MAC with SHA-1 using the H(K,H(K,M)) method
outlined in Schneier (2nd ed, p.458) and then encrypts an inner SquTUN
header (containing the MAC, the send time, and some other information
required to decode the packet) and the encapsulated IP packet with AES
in CBC mode.
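
As a rough illustration of the H(K,H(K,M)) construction named above,
here is a minimal Python sketch. It is not taken from the SquTUN source:
the function names nested_mac and verify_mac are hypothetical, the
all-zero key is a placeholder, and joining K and M by plain
concatenation is an assumption, so the real byte layout may differ.

```python
import hashlib
import hmac


def nested_mac(key: bytes, message: bytes) -> bytes:
    """Return SHA-1(key || SHA-1(key || message)), i.e. H(K, H(K, M)).

    Assumption: key and data are joined by plain concatenation.
    """
    inner = hashlib.sha1(key + message).digest()
    return hashlib.sha1(key + inner).digest()


def verify_mac(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the MAC and compare it to the received tag in constant time."""
    return hmac.compare_digest(nested_mac(key, message), tag)


if __name__ == "__main__":
    key = bytes(32)  # placeholder 32-byte key (64 hex digits), all zeros
    packet = b"example IP packet bytes"
    tag = nested_mac(key, packet)
    print(len(tag))                             # SHA-1 digests are 20 bytes
    print(verify_mac(key, packet, tag))         # True
    print(verify_mac(key, packet + b"x", tag))  # False
```

The receiver side would run the same computation over the decrypted
packet and reject it on any mismatch.
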
The IV is included in the outer SquTUN header of each packet, and is
based on the feedback block remaining from the encryption call in the
last packet send, or on random data gathered from OpenSSL's RAND_bytes
call for the first packet.

The receiver first verifies the sender (if specified on the command
line) and then decrypts the packet, checks the time skew, and
authenticates the MAC. If all succeed, the packet is sent over the
receiver's TUN interface to the kernel.

The entire .c file is only approximately 700 lines long because it makes
use of the time-tested TUN/TAP kernel module. Contrast this with all of
CIPE, which amounts to nearly 8,000 lines. Of course, since I'm the only
one using this at the moment, it may have some bugs; but since the
daemon can be run as a non-privileged user in a chroot jail, there are
fewer potential security issues even in the presence of a remote shell
exploit. Furthermore, since the code is much simpler, it will probably
achieve something close to 100% correctness (modulo bugs in the TUN/TAP
code or the OpenSSL AES and SHA-1 implementations) shortly after some
number of people start using it, an assertion that will probably never
be true of CIPE.

BASIC USE

First, compile TUN/TAP support into your kernel or as a kernel module.
Reboot and load the module, if necessary.

Install the tunctl utility, which is oddly distributed by Debian in the
uml-utilities package.

Create a group called "tun" and modify your configuration (classic,
devfs, udev) such that /dev/net/tun is owned by group tun with rw
access. Add the user who will be running SquTUN to group tun.

Make sure you have OpenSSL development files installed, and call "make
install" in the SquTUN source directory as root. This will install
squtun into /usr/local/sbin and create a directory /var/run/squtun owned
by root:tun with permissions 01775.

As root, create a tun interface by calling "tunctl -u UID -t tun0",
where UID is the uid or username of the user who will be running squtun.
Configure the interface with some private IP space that will act as your
mini network, and bring the interface up. I use a small 2-bit
not-really-CIDR block for the endpoints of my VPNs. You'll need to set
up routing (and learn about iptables's TCPMSS --clamp-mss-to-pmtu
option) if you want things to work correctly.

Create a key file, readable only by the appropriate user, containing 64
hex digits. A good way to generate such a key is to execute the
following:

    head -c 32 /dev/random | hd -e '32/1 "%02x"' | tail -1 | head -c 64 ; echo

Finally, create one end of the VPN tunnel by calling
/usr/local/sbin/squtun with appropriate arguments. If you want to listen
on port 7778 and your peer is 1.2.3.4 and will be waiting for packets on
port 7777, then you could use

    /usr/local/sbin/squtun -l 7778 -p 1.2.3.4 -r 7777 -i tun0 -k ~/key.txt

where ~/key.txt is your key file.

To create the other end of the VPN tunnel, do all of the above except:

(a) Copy the key file over securely (e.g., through ssh or sneakernet)
    instead of generating a new one.
(b) Start squtun with the ports switched and with the first machine's IP
    as the argument to option -p.

Now, ping one end from the other and use tcpdump and dmesg as necessary
to figure out why packets aren't arriving, if you are unlucky enough to
set it up incorrectly. I don't do network debugging housecalls. :)

NAT

If one end of the connection is NAT'ed and cannot know its public IP or
port, the other end can omit the -p and -r arguments, which will cause
squtun to discover (and rediscover) its peer's return address. With this
configuration, you will absolutely want the firewalled machine to send
squtun ping packets with the -a option so the firewall does not
deprovision the NAT mapping for that UDP session.
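
One way to picture the discover-and-rediscover behaviour is the Python
sketch below. It is a guess at the logic, not SquTUN's code: the
PeerTracker class and its method names are hypothetical, and it assumes
the return address is only updated from packets that have already passed
the decryption, time-skew, and MAC checks (the text above does not say
when the update happens).

```python
from typing import Optional, Tuple

Address = Tuple[str, int]  # (IP address, UDP port)


class PeerTracker:
    """Track where to send packets for one tunnel (hypothetical helper)."""

    def __init__(self, fixed_peer: Optional[Address] = None) -> None:
        # fixed_peer models -p/-r being given; None models omitting them.
        self.fixed_peer = fixed_peer
        self.return_address = fixed_peer

    def sender_allowed(self, source: Address) -> bool:
        """Pre-decryption check: only a configured peer restricts senders."""
        return self.fixed_peer is None or source == self.fixed_peer

    def note_authenticated(self, source: Address) -> None:
        """Call after a packet from `source` has passed every check."""
        if self.fixed_peer is None:
            self.return_address = source  # (re)discover the NAT mapping


if __name__ == "__main__":
    tracker = PeerTracker()  # -p and -r omitted
    tracker.note_authenticated(("203.0.113.7", 40001))
    tracker.note_authenticated(("203.0.113.7", 40955))  # NAT remapped the port
    print(tracker.return_address)
```

Under that assumption, the keepalive pings sent with -a would also give
the un-NAT'ed end a steady supply of packets from which to relearn the
mapping.
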
(It actually might not be a bad idea to have each end of the connection
ping, though if the firewalled machine disappears, the peer will
continue sending packets to the old port indefinitely, which might raise
the hackles of your firewall administrator.)

Obviously, at least one end of the connection must not be NAT'ed: both
ends cannot possibly discover the other's IP.

MTU

Optimally, you want the MTU on your TUN interface to be small enough
that encapsulated packets will travel through the real interface without
being fragmented: fragmentation raises the probability of a dropped
packet, because only one fragment need be lost for the packet to be
un-reconstructible. You can compute the optimal MTU of your TUN
interface by calling squtun -m 1500, replacing 1500 by whatever the MTU
of your real interface is. This essentially subtracts the encapsulation
overhead and rounds down to a multiple of the AES block size.

OTHER OPTIONS

Call squtun -h to get a full list of options. One notable option is
time-skew checking, which minimizes exposure to UDP replay attacks.

DEBIAN INTERFACES EXAMPLE

If you're using Debian, you can add an interfaces stanza similar to the
following to start/stop a tunnel with ifup/ifdown tun0:

    iface tun0 inet static
        pre-up tunctl -t tun0 -u squtun
        address 1.2.3.4
        network 1.2.3.0
        netmask 255.255.255.0
        broadcast 1.2.3.255
        up ifconfig tun0 mtu `/usr/local/sbin/squtun -m 1500`
        up su squtun -c '/usr/local/sbin/squtun -l 3456 -p 18.2.3.4 -r 3457 -i tun0 -k /usr/local/etc/squtun/mytunnel.key -a 59 -c 30'
        down ([ -e /var/run/squtun/squtun-tun0.pid ] && kill `cat /var/run/squtun/squtun-tun0.pid` && sleep 1) || /bin/true
        post-down tunctl -d tun0

PERFORMANCE

Who knows? Someone who cares, let me know.

FIN

Let me know if there are any problems. I'd love for people to analyze
this for vulnerabilities. The list of open issues is in this source
distribution as ISSUES. Feel free to address these.
But whatever you do, don't bitch: if you do, then it means I've got less
free time than you do and you should simply appreciate the effort I've
put into making something you find useful.

Let's work together to make this simple protocol as secure as it can be
while maintaining its incredible simplicity.