pfSense and IPsec 2
Practical Troubleshooting
I love pfSense. So far it's superior to every Linux-based routing appliance. No product is perfect, but the 2.0 release is very promising. I have been troubleshooting tunnels which inexplicably do not work, and I have been receiving the following error during phase1 connection:
racoon: ERROR: couldn't find configuration
This usually means a significant mismatch exists in phase1 negotiations. Despite my meticulous efforts the tunnels would not start. I watched the IPsec logs hopelessly, trying many different things. What finally worked was connecting to the console, killing racoon and starting it manually as follows:
racoon -d -v -F -f /var/etc/racoon.conf
By monitoring the debug output, I discovered that the packets were coming from the wrong source IP address. One of my sites has multiple WAN links, and racoon was using the wrong source address for IPsec negotiation. The phase1 arrival was clearly logged and rejected, because it didn't match any existing configuration.
Once that was done, I was quickly able to determine what to do. However, if you don't have access to a host behind the pfSense firewall then you may have problems bringing up IPsec tunnels, because nothing sends traffic that matches the tunnel policy. I forced a packet from the firewall itself.
My firewall's LAN address, which is part of the IPsec local subnet scope, is 192.168.0.1. The remote network is 10.1.1.0/24. I need to create a single packet from 192.168.0.1 to something in 10.1.1.x:
ping -S 192.168.0.1 -c 1 10.1.1.3
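The one-off ping above can be wrapped in a small helper when there are several tunnels to poke. This is only a sketch: the function name is made up for illustration, the /24 mask on the local subnet is an assumption (only the address 192.168.0.1 is given above), and it picks the first usable host in the remote subnet as an arbitrary target. It builds the command without running it.

```python
import ipaddress


def build_trigger_ping(source_ip, local_net, remote_net, count=1):
    """Return the argv for a ping that should match the IPsec policy.

    The source must sit inside the local subnet and the target inside the
    remote subnet, otherwise the packet will not match the tunnel policy.
    """
    src = ipaddress.ip_address(source_ip)
    local = ipaddress.ip_network(local_net)
    remote = ipaddress.ip_network(remote_net)
    if src not in local:
        raise ValueError(f"{src} is not inside local subnet {local}")
    # Arbitrary choice: the first usable host in the remote subnet.
    target = next(iter(remote.hosts()))
    return ["ping", "-S", str(src), "-c", str(count), str(target)]


if __name__ == "__main__":
    # 192.168.0.0/24 is an assumed mask for the local subnet.
    argv = build_trigger_ping("192.168.0.1", "192.168.0.0/24", "10.1.1.0/24")
    print(" ".join(argv))
```

The returned list can be handed to subprocess.run on the firewall itself, or simply printed and pasted into the console.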
What Could Be Better
Feedback for improvement would begin with one admonition: don't trust the log output of racoon. I should have used tcpdump on both ends, watching for packets setting up sessions.
[2.0-RC1][root@gateway.site-a.com]/root(1): tcpdump -ni re0 port 500
16:39:07.697695 IP 192.168.81.126.500 > 10.1.101.217.500: isakmp: phase 2/others I inf[E]
16:40:26.980944 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
16:40:36.982740 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
16:40:46.983927 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
16:40:56.985122 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
16:41:06.986307 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
Had I done this at both ends, I would have clearly seen that the wrong interface was emitting packets. Both ends. That was my mistake. I trusted logs at Site-A, and I never verified my problem at Site-B. Hours of painstaking troubleshooting for no good reason.
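A capture like the one above is easy to misread when it scrolls past, so here is a rough sketch that tallies the ISAKMP packets and flags phase 1 attempts that never got a phase 1 reply. The regular expression only targets the line shape shown in that capture; other tcpdump versions or options may print something different, and the function names are my own.

```python
import re
from collections import Counter

# Matches lines shaped like the capture above, for example:
# 16:40:26.980944 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
ISAKMP_LINE = re.compile(
    r"^\S+ IP (?P<src>\d+\.\d+\.\d+\.\d+)\.\d+ > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.\d+: isakmp: phase (?P<phase>\S+)"
)

SAMPLE = """\
16:39:07.697695 IP 192.168.81.126.500 > 10.1.101.217.500: isakmp: phase 2/others I inf[E]
16:40:26.980944 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
16:40:36.982740 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
16:40:46.983927 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
16:40:56.985122 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
16:41:06.986307 IP 10.1.101.217.500 > 192.168.81.126.500: isakmp: phase 1 I agg
"""


def count_isakmp(lines):
    """Count ISAKMP packets per (source, destination, phase)."""
    counts = Counter()
    for line in lines:
        match = ISAKMP_LINE.match(line.strip())
        if match:
            counts[(match["src"], match["dst"], match["phase"])] += 1
    return counts


def unanswered_phase1(counts):
    """Return (source, destination) pairs whose phase 1 packets saw no phase 1 reply."""
    return sorted(
        (src, dst)
        for (src, dst, phase) in counts
        if phase == "1" and (dst, src, "1") not in counts
    )


if __name__ == "__main__":
    counts = count_isakmp(SAMPLE.splitlines())
    for src, dst in unanswered_phase1(counts):
        total = counts[(src, dst, "1")]
        print(f"{src} -> {dst}: {total} phase 1 packets, no phase 1 reply seen")
```

Run it against captures from both ends. If the source address it reports is not the one configured as the peer on the other side, the negotiation is leaving the wrong interface.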
Work Around
My current imperfect workaround is to add the following line to each of my remote sites' crontab:
cat crontab|grep newsys
* * * * * root /sbin/ping -S 192.168.0.1 -c 1 10.1.1.1
Obviously I covered this above; I just thought I would share it with everyone. This has the net effect of a 60-second tunnel keepalive. It may not be appropriate for some environments. Good luck.
IPsec - The Evil Cisco Concentrator
Cisco VPN concentrators are a regular occurrence in the field. They can be the bane of your life. However, there is one simple change to enable these to consistently work with multiple policy routed subnets.
In your /etc/ipsec.conf, set the policy level to 'unique' instead of 'require'.
The entries in /etc/ipsec.conf are fully covered in the ipsec.conf man pages, and online at various locations. Google and you will find them. My focus is the policy level, the last value in the spdadd string. I have only ever seen it set to 'require', but recently I discovered 'unique' as well as 'unique:<1-32767>'. This allows for negotiating Phase2 crypto per-policy, or per-group. Consider this policy file:
/etc/ipsec.conf
#### Tunnel: CheeseSteak Club
spdadd 88.88.30.231 192.168.1.240/28 any -P in ipsec esp/tunnel/88.88.30.231-66.66.177.102/require;
spdadd 192.168.1.240/28 88.88.30.231 any -P out ipsec esp/tunnel/66.66.177.102-88.88.30.231/require;
spdadd 99.99.0.0/16 192.168.1.240/28 any -P in ipsec esp/tunnel/88.88.30.231-66.66.177.102/require;
spdadd 192.168.1.240/28 99.99.0.0/16 any -P out ipsec esp/tunnel/66.66.177.102-88.88.30.231/require;
spdadd 99.99.0.0/16 66.66.177.102 any -P in ipsec esp/tunnel/88.88.30.231-66.66.177.102/require;
spdadd 66.66.177.102 99.99.0.0/16 any -P out ipsec esp/tunnel/66.66.177.102-88.88.30.231/require;
#### Tunnel: Guinness Brewery Concentrator
spdadd 44.44.82.31 192.168.1.0/24 any -P in ipsec esp/tunnel/44.44.82.31-66.66.177.102/unique;
spdadd 192.168.1.0/24 44.44.82.31 any -P out ipsec esp/tunnel/66.66.177.102-44.44.82.31/unique;
## Main Net (ireland)
spdadd 10.1.30.205 192.168.1.0/24 any -P in ipsec esp/tunnel/44.44.82.31-66.66.177.102/unique;
spdadd 192.168.1.0/24 10.1.30.205 any -P out ipsec esp/tunnel/66.66.177.102-44.44.82.31/unique;
spdadd 10.1.30.205 66.66.177.102 any -P in ipsec esp/tunnel/44.44.82.31-66.66.177.102/unique;
spdadd 66.66.177.102 10.1.30.205 any -P out ipsec esp/tunnel/66.66.177.102-44.44.82.31/unique;
## Mainland Dist. Net (America: New York)
spdadd 10.1.30.210 192.168.1.0/24 any -P in ipsec esp/tunnel/44.44.82.31-66.66.177.102/unique;
spdadd 192.168.1.0/24 10.1.30.210 any -P out ipsec esp/tunnel/66.66.177.102-44.44.82.31/unique;
spdadd 10.1.30.210 66.66.177.102 any -P in ipsec esp/tunnel/44.44.82.31-66.66.177.102/unique;
spdadd 66.66.177.102 10.1.30.210 any -P out ipsec esp/tunnel/66.66.177.102-44.44.82.31/unique;
## Western Region Sales (America: Seattle, Wa)
spdadd 10.2.30.200 192.168.1.0/24 any -P in ipsec esp/tunnel/44.44.82.31-66.66.177.102/unique;
spdadd 192.168.1.0/24 10.2.30.200 any -P out ipsec esp/tunnel/66.66.177.102-44.44.82.31/unique;
spdadd 10.2.30.200 66.66.177.102 any -P in ipsec esp/tunnel/44.44.82.31-66.66.177.102/unique;
spdadd 66.66.177.102 10.2.30.200 any -P out ipsec esp/tunnel/66.66.177.102-44.44.82.31/unique;
## Backup Network (America: Cheyenne, WY)
spdadd 172.16.106.10 192.168.1.0/24 any -P in ipsec esp/tunnel/44.44.82.31-66.66.177.102/unique;
spdadd 192.168.1.0/24 172.16.106.10 any -P out ipsec esp/tunnel/66.66.177.102-44.44.82.31/unique;
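Typing those in/out pairs by hand is where a stray 'require' sneaks back in. As a sketch only, the following generates a matching pair for one remote network, following the line layout of the policy file above. The function name and the level check are my own; the list of accepted levels (default, use, require, unique, unique:N) is taken from setkey(8) as I understand it.

```python
import re

# Policy levels accepted here: default, use, require, unique, unique:N
VALID_LEVEL = re.compile(r"^(default|use|require|unique(:\d+)?)$")


def spdadd_pair(remote_net, local_net, remote_gw, local_gw, level="unique"):
    """Return the inbound and outbound spdadd lines for one subnet pair."""
    if not VALID_LEVEL.match(level):
        raise ValueError(f"unexpected policy level: {level!r}")
    inbound = (
        f"spdadd {remote_net} {local_net} any -P in ipsec "
        f"esp/tunnel/{remote_gw}-{local_gw}/{level};"
    )
    outbound = (
        f"spdadd {local_net} {remote_net} any -P out ipsec "
        f"esp/tunnel/{local_gw}-{remote_gw}/{level};"
    )
    return [inbound, outbound]


if __name__ == "__main__":
    # Reproduces the first two "Mainland Dist. Net" entries from the file above.
    for line in spdadd_pair("10.1.30.210", "192.168.1.0/24",
                            "44.44.82.31", "66.66.177.102"):
        print(line)
```

Defaulting the level to 'unique' means a concentrator tunnel only ends up with 'require' if somebody asks for it explicitly.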
IPsec: Off the Map with Key Expiration
Along the way what did we see?
- Random Packet Loss
- TCP Connection Difficulty (Read: w/o the Tunnel here.)
- Tunnel Lock up
- Racoon (IPsec-tools) Lockup
- Cisco Hangs
- Cisco Mysteriofscking IOMEM boot-back-to-previous-IOS problems
- Interactive, interspersed tunnel-based TCP connection resets.
- MTU related problems.
- Cisco config magic witchcraft.
- The Cisco admin going on vacation.
- Cisco config butchery.
- KByte based key-expiration.
- Key-logger password compromise and subsequent SSH hackery by a script-kiddie - resulting in the reinstall of a terminal server, my mail server and my jabber server. (*sniff* Jabber is still down.)
Did I forget anything? I think I did, but to be honest, I can't imagine bitching about this too much more. The bottom line is OMGWTFBBQ.
Remove the Ciscos: Remove the latency, and the non-tunnel TCP resets.{WTF}
Remove the key expiration after 8 megs: remove the (tunnel-based) TCP disconnects, tunnel crashes, and other hangs.
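To put a number on why an 8 meg key lifetime hurts, here is a back-of-the-envelope sketch. It assumes "8 megs" means 8 MiB of traffic per key and that the link is a saturated T1 at 1.544 Mbit/s; neither figure is a measurement from these tunnels, and the function name is made up.

```python
def seconds_until_rekey(lifetime_bytes, link_bits_per_second):
    """Time to burn through a byte-based key lifetime on a saturated link."""
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    return lifetime_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    lifetime = 8 * 1024 * 1024   # assumed: "8 megs" read as 8 MiB
    t1 = 1_544_000               # T1 line rate in bits per second
    print(f"rekey roughly every {seconds_until_rekey(lifetime, t1):.0f} seconds")
```

Under those assumptions a busy tunnel renegotiates its keys well under once a minute, which gives every flaky rekey implementation plenty of chances to drop a session.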
No, I still am not happy with Cisco or the CCNA "Cisco to Cisco doesn't have these problems, Cisco does that, Cisco is dipped in gold and ready to make your life better, just pay at the coffer.... {insert foul language here}".
Netopias: Cheap, simple, and if you just need to handle traffic for a T1 w/o inspection or intelligence: perfect. (read: I hate them, but I can't fault them for being Cisco^H^H^H^H^Hbroken.)
The Watchguards have been very nice this trip around. Apart from the expense and the limits of their lesser OS versions, inability to shape traffic, complete lack of diagnostic tools, etc. Perfect, perfect indeed. Oh well.
Linux was Linux. Killer, functional, and totally lacking in kernel-based IPsec policy matching for Netfilter (read: no good firewall support for IPsec), no way to tell if the tunnel is up or down, etc, etc, etc.
Firewall: Shorewall 3.0
IPsec: Quest of the ever elusive TCPMSS
When dealing with encrypted sessions you can either clamp the TCP MSS or lower the MTU. Oftentimes lowering the MTU can lead to session lockups and other problems.
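The arithmetic behind an MSS clamp is short enough to sketch. The 20-byte IPv4 and TCP header sizes are the standard option-free values; the 60-byte tunnel overhead in the example is purely an illustrative assumption, since real ESP overhead depends on the cipher, the hash and whether NAT-T is in use.

```python
IP_HEADER = 20   # IPv4 header without options
TCP_HEADER = 20  # TCP header without options


def clamped_mss(link_mtu, tunnel_overhead):
    """Largest TCP segment that fits inside the tunnel without fragmenting."""
    mss = link_mtu - tunnel_overhead - IP_HEADER - TCP_HEADER
    if mss <= 0:
        raise ValueError("tunnel overhead leaves no room for TCP payload")
    return mss


if __name__ == "__main__":
    # 60 bytes of ESP tunnel overhead is an assumed figure, not a measurement.
    print(clamped_mss(1500, 60))
```

Clamping the MSS only affects TCP, which is why it tends to be gentler than dropping the interface MTU for everything.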