BorderManager 3.8 Site-Site VPN Problem - Aug. 7, 2005

Slave VPN Server Won't Start VPN Services

Update: Aug. 7, 2005: Novell has a new Java file (scm.jar) that fixes the issue of the VPN services not starting unless the server can contact a replica of Root. I have personally tested it and found that it works, although there is still a one- or two-minute delay while the VPN tries to contact Root before it gives up. This file is available in the beta 3.8 SP4 patch, which can be found at www.novell.com/beta. If you do not want to apply the beta SP4, you can just use the scm.jar file from it on your BM38SP3 server.

There is another issue that can cause a similar symptom (on master or slave server) - if there is a problem with the SCMServiceObject in eDirectory, the VPN services will not launch. The object must be present (in the same OU as the server), and it also needs to have certain rights. I am trying to get those rights documented, as well as a procedure for recreating the object.

Update: Aug. 24, 2004: There is a design bug in BorderManager 3.8 Site-to-Site VPN that requires the slave server to contact a replica of the Root partition in order to launch its VPN services. This means (for now) that you need to put a replica of Root on the VPN slave server. This matches what I have seen with my workaround, which simply allows the slave server to contact a Root replica through a backup link. It is, of course, exceedingly poor NDS design in many cases to have to put a Root replica on a VPN slave server, and I assume that Novell will start taking steps to address this. (Also, I have not yet confirmed this in more than one instance.) As noted below, if your slave VPN is already down and you can't bring it up in order to get NDS synchronized, I have a workaround, and I can do it for you if you need help.

August, 2004

I don't generally post many tips on VPN setup, partly because it is complex enough that it doesn't lend itself well to tips, and I try to cover it in my BorderManager 3.x book. However, there is a problem that I have now seen often enough that I think it should be posted, so that others have some chance of recovering from it. The problem has a simple symptom: a VPN slave server in a BorderManager 3.8 Site-Site VPN just won't start its VPN services. Most of the times I have seen it, the problem appeared after a simple reboot. It only happens when the VPN slave is in the same eDirectory tree as the master, and it happens even if you have a good NDS design. I don't know the exact cause, but I am working with Novell to try to find out.

On a normal 3.8 VPN slave server, when you start BorderManager services (startbrd.ncf), there is a command in that file to start VPN services as well (startvpn). The startvpn command launches a number of modules, including a VPN monitoring module and a Java app that is supposed to look at the NDS configuration for the server and launch VPN services if the server is configured as a VPN server. Normally this results in the VPSLAVE and IKE modules loading. When the problem occurs, the VPSLAVE and IKE modules do not load. If you load them manually, they still will not make a connection. It is as if the server does not think that it is configured as a VPN server.
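If you want to confirm the symptom from a saved module listing rather than by eye, the check can be sketched in a few lines of Python. This is only an illustration run on a workstation, not something Novell ships: it assumes you have captured the server's loaded-module list to a text file and that modules appear in it as file names ending in .NLM. The function name missing_vpn_modules is made up for this example; VPSLAVE and IKE are the two module names mentioned above.

```python
import re

# Module names taken from the text above; a healthy slave should show both.
REQUIRED_MODULES = ("VPSLAVE", "IKE")


def missing_vpn_modules(listing, required=REQUIRED_MODULES):
    """Return the required module names that do not appear in the listing.

    Assumption: loaded modules appear in the captured text as NAME.NLM
    (upper or lower case). Adjust the pattern if your capture differs.
    """
    # Collect the base name of every token that looks like NAME.NLM.
    found = re.findall(r"([A-Za-z0-9_]+)\.NLM\b", listing, flags=re.IGNORECASE)
    loaded = {name.upper() for name in found}
    return [name for name in required if name.upper() not in loaded]


if __name__ == "__main__":
    # Hypothetical capture from a server showing the problem.
    sample = "TCPIP.NLM\n  TCP/IP stack\nFILTSRV.NLM\n  Filter services\n"
    print("Missing VPN modules:", missing_vpn_modules(sample))
```

An empty result means both modules are loaded, which still does not prove the tunnel is up; it only rules out the "modules never loaded" symptom described here.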

I have found a workaround, which (to me) is fairly simple. Make a backup link that lets the slave server synchronize eDirectory with the master server (or at least with the root of the tree, which in my experience is normally located on the master server's subnet). Once NDS synchronizes, *something* happens to the VPN slave that results in it 'rediscovering' itself to be a VPN server, and VPSLAVE and IKE load right up, connect to the master, and work fine. I don't know if there is some sort of tree walking or time stamping going on, but in case after case, this workaround has brought the VPN link back up for me.
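Before waiting for NDS to synchronize over a backup link, it can help to confirm that traffic actually gets through. The sketch below is an assumption-laden example, not part of the workaround itself: it is meant to run on a workstation at the slave site whose traffic is routed over the backup link, and it only checks that a TCP connection to port 524 (the NCP port that eDirectory traffic uses over IP) can be opened on the server holding the Root replica. The function name can_reach and the address in the comment are hypothetical.

```python
import socket

NCP_PORT = 524  # NCP over IP; eDirectory synchronization runs over NCP


def can_reach(host, port=NCP_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # Hypothetical address of the master (Root replica) server.
    master = "192.168.10.1"
    print("NCP port reachable:", can_reach(master))
```

A True result only shows that the route and port are open from that workstation; the slave server still needs its own static routes pointed at the backup link before eDirectory can synchronize.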

I normally make this backup link with a 3rd-party VPN link. Specifically, I usually set up a cheap Linksys BEFSX41 VPN router and adjust static routes just enough to get the slave server able to sync eDirectory to the master site. I use the Linksys because I can get them for under $100 new, and I needed a cheap router to play with when I was adding a 3rd-party VPN example to my BorderManager 3.x book. There are certainly limitations to the Linksys, which I won't go into, but it is sufficient to get this job done. If you need to do this and want to do it on your own, my BorderManager 3.x book shows how to configure a Linksys router as a VPN link to a 3.8 server. If you are not comfortable doing this, I can talk you through the process or do it for you (as a paid consultant) - see my contact information if you need help.

I have seen this problem with BorderManager 3.8 (unpatched), BM38SP1, and BM38SP2. I'm sure Novell will identify the problem and fix it in a patch at some point, but I don't know when. If you have seen this issue and have another workaround, please email me.


