Recently, a support ticket was opened for an issue where a couple of websites were slow to load after outbound internet access was blocked from a VM. The IP addresses of these websites were whitelisted, but the browser would hang for about 30 seconds before they'd load. In the end, we found that the certificate revocation list (CRL) check for the sites' SSL certificates was causing the hang. Let's take a look at the diagnosis process and how we discovered the root cause.
To recreate the issue for this post, I am using an Azure VM with a network security group that is blocking all traffic to the internet. Our test website will be www.concurrency.com which has an IP address of 18.104.22.168. Here is the NSG I'm applying to the VM:
On the inbound side, I have a rule to allow RDP in from my IP address. For outbound traffic, I'm allowing HTTP and HTTPS traffic to the IP of www.concurrency.com and blocking all other internet traffic.
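To make the effect of these outbound rules concrete, here is a minimal Python sketch of priority-ordered rule matching. It is a simplified model and not how Azure implements NSGs: the rule names and priority numbers are made up for illustration, and the "Internet" destination is approximated as 0.0.0.0/0.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str         # hypothetical rule name
    priority: int     # lower number is evaluated first
    destination: str  # CIDR prefix
    ports: tuple      # destination ports; empty tuple means any port
    allow: bool


# Outbound rules mirroring the description above (names/priorities are assumptions).
OUTBOUND = [
    Rule("Allow-Web-To-Site", 100, "18.104.22.168/32", (80, 443), True),
    Rule("Deny-Internet", 200, "0.0.0.0/0", (), False),
]


def evaluate(rules, dest_ip, port):
    """Return (allowed, rule_name) for the first matching rule, by priority."""
    addr = ipaddress.ip_address(dest_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        in_prefix = addr in ipaddress.ip_network(rule.destination)
        port_ok = not rule.ports or port in rule.ports
        if in_prefix and port_ok:
            return rule.allow, rule.name
    # Simplification: this model denies anything that no rule matches.
    return False, "no-match"
```

With this model, `evaluate(OUTBOUND, "18.104.22.168", 443)` is allowed by the first rule, while a connection to any other internet address, on any port, falls through to the deny rule.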
With the above in place, IE is hanging on this screen for about 30 seconds before the actual site loads:
The first thing we did was test the same URL from another machine that didn't have internet access blocked, to confirm that the block was the cause. On that machine, the site loads normally, so we know that the deny rule is causing the issue somehow.
Next, we used the Network tab in IE's developer tools. This shows the network calls that IE is triggering, and I was hoping to see a stalled call that was causing the hang. However, the Network tab was blank while the browser was still loading:
At this point, it appeared that IE was waiting on something else before it could begin loading the page. If IE had called another website and was waiting on it, we'd see it in the network tab. However, since it's blank, we know it's something else on the server, probably at a lower level, that's causing the delay.
Next up was running a netstat to look for any connections that were in a broken state. For example, SYN_SENT would indicate that the outbound connection was started, but no packets were received back. That's what we'd expect to see if the server was trying to call something that was blocked by the network. Here, we run netstat and use findstr to filter for the 2 PIDs IE is using at the time:
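For anyone who wants to script the same filter, here is a rough Python sketch. It assumes the five-column TCP layout of `netstat -ano` on Windows (protocol, local address, foreign address, state, PID). The sample text, addresses, and PIDs are invented for illustration and are not the output from this VM.

```python
import subprocess

# Made-up sample in the `netstat -ano` layout (documentation-range addresses).
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    10.0.0.4:49701         192.0.2.10:443         ESTABLISHED     4321
  TCP    10.0.0.4:49702         192.0.2.20:80          SYN_SENT        4321
  TCP    10.0.0.4:49703         192.0.2.30:80          SYN_SENT        5678
  TCP    10.0.0.4:3389          198.51.100.7:50000     ESTABLISHED     1100
"""


def parse_netstat(output, pids):
    """Return (remote, state, pid) tuples for TCP rows owned by the given PIDs."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # TCP rows have exactly: Proto, Local, Foreign, State, PID
        if len(fields) == 5 and fields[0] == "TCP" and fields[4].isdigit():
            pid = int(fields[4])
            if pid in pids:
                rows.append((fields[2], fields[3], pid))
    return rows


def stalled(rows):
    """Keep only connections that sent a SYN and never got an answer."""
    return [row for row in rows if row[1] == "SYN_SENT"]


def run_netstat():
    """Capture live output; only meaningful on a Windows machine."""
    result = subprocess.run(["netstat", "-ano"], capture_output=True,
                            text=True, check=True)
    return result.stdout
```

On the affected machine, something like `stalled(parse_netstat(run_netstat(), {pid1, pid2}))` with the two IE PIDs would list only the connections stuck waiting for a reply.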
Looking at this, there are 4 IPs that IE is trying to connect to. The first is the IP of the website itself where ESTABLISHED indicates it connected successfully. The next 3 aren't as clear though. To try to get more info about them, we ran each through nslookup:
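The same reverse lookup can be scripted. This is a small sketch using Python's standard `socket.gethostbyaddr`; the `resolver` parameter is a hypothetical hook added here so the function can be exercised without network access. Note that a machine with DNS blocked may return nothing at all.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the reverse-DNS name for ip, or None when no record is found."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        # No PTR record, or the lookup itself failed.
        return None
```

Calling `reverse_lookup(ip)` for each remote address from the netstat output gives roughly the information that nslookup prints for an IP.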
It's not clear what the first 2 are, but the 3rd one appears to be related to certificates in some way (based on "trust" being in the name). Running a Google search for that IP brings us to https://otx.alienvault.com/indicator/ip/22.214.171.124. On that page, there is a list of DNS names that map to that IP, including "crl.identrust.com". Based on that, it looks like the IP address is being used for a CRL check for the certificate protecting www.concurrency.com.
For those who aren't familiar, when a client connects to an SSL-protected website, it checks whether the certificate protecting the website has been revoked by the issuer. Certificates are revoked for various reasons, and the list of revoked certificates is called the CRL. The issuer publishes updated CRLs as certificates are revoked, and clients are supposed to ensure the server's SSL certificate isn't on the CRL before allowing the connection.
To test whether the CRL check is the issue, we temporarily disable the "Check for server certificate revocation" option in Internet Options:
After closing and re-opening IE, the site loaded immediately. This confirms that the revocation check is causing the slowdown. Since this server will remain in this state, where outbound internet access is completely blocked except for the one website, we opted to simply leave revocation checking disabled. There are other options, such as changing the CRL timeout or also whitelisting the IP address of the CRL URL. However, those have their own pros and cons.
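If you do go the whitelisting route, a quick way to verify it from the VM is to time a TCP connection to the CRL host. The sketch below is only an illustration: the hostname comes from the lookup earlier in this post, and port 80 is an assumption based on CRLs typically being served over plain HTTP.

```python
import socket
import time


def probe(host, port=80, timeout=5.0):
    """Attempt a TCP connect and return (outcome, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except socket.timeout:
        # Typical when a firewall silently drops the outbound SYN.
        outcome = "timed out"
    except OSError as exc:
        # DNS failure, connection refused, network unreachable, and so on.
        outcome = f"failed: {exc}"
    return outcome, time.monotonic() - start
```

For example, `probe("crl.identrust.com")` should report a timeout or a failure while the deny rule is in effect, and "connected" once the CRL endpoint is allowed through.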