Saturday, October 26, 2013

Coding a Fast Browser Proxy Script

Well, I checked Google and wasn’t satisfied that this information was generally available…

So, I’m reviving some research I did in 2003 that led to a significant speed increase in browser proxy determination. The enterprise environment where you would apply this tactic is one with multiple routes to internet and corporate resources, multiple on-site subnets (including non-routed subnets), and varying levels of authentication and HTML tag filtering that you want to enforce.
The problem with most example proxy scripts that address these issues is that they use a cookie-cutter approach to determining where to forward a user-entered request.  Often they use the most basic approach which is to do a DNS lookup on the hostname and see if it belongs to a specific subnet.

Eventually, most non-local addresses need to be determined by a DNS lookup, but DNS lookups are costly in time and network resources. It is best to reduce the number of DNS lookups required, or to eliminate them where possible.

One clear example where a DNS lookup is not required is where the host name ends with your corporate domain name; for example, apps.org.com ends with the domain name org.com. For that host, the proxy script should return immediately with the directive “DIRECT”, which means don’t go through the proxy.

But in a large enterprise environment, there may be hosts on the open internet (on a demilitarized zone, or DMZ, network) that have the same domain name, like www.org.com. Enterprise workstations using the proxy script will need to reach such a host through a proxy server (and through a router off the enterprise LAN). This test needs to be done before the check described above, where hosts with our enterprise domain name are sent “DIRECT”.

Here is the basic script (e.g., proxyscript.pac) to get us this far:

function FindProxyForURL(url, host)
{
    var mHost = host.toLowerCase();

    if( (mHost == "public1"     ) ||
        (mHost == "public2"     ) ||
        (mHost == "public1.org.com") ||
        (mHost == "public2.org.com")
    )
        return "PROXY clearproxy.org.com:8080";

    // dnsDomainIs() is a plain string comparison (no DNS lookup) and may be
    // case-sensitive, so test the lower-cased host name
    if( dnsDomainIs(mHost, ".org.com") )
        return "DIRECT";

This script, so far, does not do any DNS lookups. Notice that the “host” parameter passed to the FindProxyForURL() function is the host name taken from the requested URL, for example whatever the user enters in the browser address line. For local hosts, the domain name (“org.com”) does not need to be included, so we list each host name in our script both with and without it. Since these hosts are ours, they might be reached through a proxy (e.g., clearproxy) that does not require authentication and perhaps does not do HTML tag filtering. Tag filtering is a commonly enforced security measure: everything within a set of <applet></applet>, <embed></embed>, and/or <object></object> tags is removed by the proxy before the page is delivered to the client browser.
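One way to confirm that this name-only stage really issues no DNS queries is to run it outside the browser with stand-in helper functions that count lookups. The following is a minimal sketch for Node.js, not part of the proxy script itself. The stub dnsDomainIs() assumes a simple, case-sensitive suffix comparison, and the counting dnsResolve() stub and the wrapper name nameOnlyStage() are hypothetical. The host names are the same placeholders used above.

```javascript
// Stand-ins for the browser's PAC helpers (assumptions for offline testing only).
var dnsLookups = 0;

function dnsDomainIs(host, domain) {
    // Plain, case-sensitive suffix comparison; no DNS traffic involved.
    return host.length >= domain.length &&
           host.substring(host.length - domain.length) === domain;
}

function dnsResolve(host) {
    // Counts calls so we can see whether a code path touched DNS.
    dnsLookups++;
    return null;
}

// Hypothetical wrapper around the name-only tests shown above.
// Returns a directive, or null when a DNS lookup would be needed next.
function nameOnlyStage(host) {
    var mHost = host.toLowerCase();

    if (mHost === "public1" || mHost === "public2" ||
        mHost === "public1.org.com" || mHost === "public2.org.com")
        return "PROXY clearproxy.org.com:8080";

    if (dnsDomainIs(mHost, ".org.com"))
        return "DIRECT";

    return null;
}

console.log(nameOnlyStage("Public1.Org.Com")); // PROXY clearproxy.org.com:8080
console.log(nameOnlyStage("apps.org.com"));    // DIRECT
console.log(nameOnlyStage("www.example.net")); // null
console.log(dnsLookups);                       // 0
```

If a later change to this stage accidentally calls a resolving helper, the counter will no longer read zero.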

There may be some additional hosts on the open internet that belong to our parent or sister companies, for which we also don’t want to do authentication or tag filtering. We can direct the browser to the correct proxy, again without doing a DNS lookup for those specific hosts:

    if( (mHost == "www.parent.com"        ) ||
        (mHost == "www.sister.com")
    )
        return "PROXY clearproxy.org.com:8080";

At this point, we are going to have to do a DNS lookup to get the address of the host. We will then use the host IP address to determine the proxy directive required. If the lookup fails or the host IP address is bogus, we simply return the “DIRECT” directive, whether it works for the browser or not.

    var HostIP = "999.999.999.999";

    // First 1 or 2 DNS queries here
    if( isPlainHostName(host) || isResolvable(host) )
        HostIP = dnsResolve(host);

    // On bogus HostIP, or localhost, we are done!
    if( (HostIP == null             ) ||
        (HostIP == "999.999.999.999") ||
        (HostIP == "127.0.0.1"      ) ||
        (HostIP == ""               )
    )
        return "DIRECT";
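The comment above notes that this step can cost one or two DNS queries, because isResolvable() and dnsResolve() may each trigger a lookup. A possible variation, sketched below, calls dnsResolve() once and treats a null or empty result as unresolvable. The helper name resolveOnce() and the stub resolver with its sample address are hypothetical. The sketch also assumes that dnsResolve() returns null (or an empty value) for a name it cannot resolve, which is worth verifying in the browsers you support.

```javascript
// Stub standing in for the browser's dnsResolve() so the sketch runs under Node.js.
// The table contents are made up for illustration.
var dnsLookups = 0;
var fakeDns = { "intranet": "111.11.5.20" };

function dnsResolve(host) {
    dnsLookups++;
    return fakeDns.hasOwnProperty(host) ? fakeDns[host] : null;
}

// Hypothetical helper: one lookup per call, with the same sentinel for failures.
function resolveOnce(host) {
    var ip = dnsResolve(host);
    return ip ? ip : "999.999.999.999";
}

console.log(resolveOnce("intranet"));      // 111.11.5.20
console.log(resolveOnce("no-such-host"));  // 999.999.999.999
console.log(dnsLookups);                   // 2 (one per host)
```

In the proxy script itself, only the resolveOnce() body would be needed, since the browser supplies dnsResolve().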

Next we want to identify our local enterprise subnets by IP address and return the directive “DIRECT”, as in “no proxy required”. We pass the already-resolved HostIP rather than the host name, so that isInNet() has an address to compare and does not have to resolve the name again. We should still order this list by the likelihood that the browser will be addressing hosts in each subnet, because every test costs time and, in some implementations, may trigger a separate DNS query. For example, if our data center is in subnet 111.11.0.0, and our desktop workstations are in subnet 122.22.22.0, then we would list 111.11.0.0 first, since browsers would most likely be addressing hosts in our data center. We also include non-routed subnets, if needed.

    if( isInNet(HostIP, "111.11.0.0"  ,"255.255.0.0"    ) ||
        isInNet(HostIP, "122.22.22.0" ,"255.255.255.0"  ) ||
        isInNet(HostIP, "192.168.0.0" ,"255.255.0.0"    ) ||
        isInNet(HostIP, "172.16.0.0"  ,"255.240.0.0"    ) ||
        isInNet(HostIP, "10.0.0.0"    ,"255.0.0.0"      )
    )
        return "DIRECT";
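To make the mask arithmetic behind these tests concrete, here is a minimal sketch of the kind of comparison isInNet() performs when it is handed an IP address: both addresses are converted to 32-bit integers and compared after the mask is applied. The function names toInt() and inNet() are hypothetical stand-ins, not the browser's own code, and the sample host addresses are invented.

```javascript
// Convert a dotted-quad string to an unsigned 32-bit integer.
function toInt(ip) {
    var p = ip.split(".");
    return (((+p[0]) << 24) | ((+p[1]) << 16) | ((+p[2]) << 8) | (+p[3])) >>> 0;
}

// True when ip and net agree on every bit selected by mask.
function inNet(ip, net, mask) {
    var m = toInt(mask);
    return ((toInt(ip) & m) >>> 0) === ((toInt(net) & m) >>> 0);
}

console.log(inNet("172.20.1.1",   "172.16.0.0", "255.240.0.0")); // true
console.log(inNet("172.32.1.1",   "172.16.0.0", "255.240.0.0")); // false
console.log(inNet("111.11.200.7", "111.11.0.0", "255.255.0.0")); // true
```

The second call shows why the 172.16.0.0 entry uses the mask 255.240.0.0: it covers 172.16.x.x through 172.31.x.x and nothing beyond.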

If you deal with users, you will find that some have a unique need to reach a server on the internet that cannot handle your usual tag filtering. Or perhaps you have some users who need to reach a site (for example a benefits site, like Blue Cross Blue Shield) but who don’t have credentials to authenticate to the proxy. Those kinds of exceptions will require a separate block in your proxy script, like this:

    if( isInNet(HostIP, "33.33.33.0", "255.255.255.0" ) )
        return "PROXY alternateproxy.org.com:8080";

Finally, you may have several dedicated network routes to corporate subnets (parent or sister companies) that are not local, for which unique proxy provisions apply (probably no authentication and no tag filtering for resources on those subnets). For those subnets, you will go through an appropriate proxy, and for EVERYTHING ELSE, you will send the browser through your standard proxy.

    if( isInNet(HostIP, "144.44.0.0"  ,"255.255.0.0"    ) ||
        isInNet(HostIP, "155.55.0.0"  ,"255.255.0.0"    ) ||
        isInNet(HostIP, "166.66.66.0" ,"255.255.255.0"  ) ||
        // 177.77.77.76 is the base address of the 4-address block that holds 177.77.77.77
        isInNet(HostIP, "177.77.77.76","255.255.255.252")
    )
        return "PROXY parentproxy.org.com:8080";
    else
        return "PROXY externalproxy.org.com:8080";
}
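
The IP-based stages above all share one shape: an ordered list of subnets, each mapped to a directive, with the first match winning. As an alternative layout, that ordering can be kept in a table and walked in a loop. The following is a rough sketch of the idea, runnable under Node.js. The helper name routeByIP() and the rules table are hypothetical, only a few of the subnets from above are included, and the isInNet() stub is a stand-in for the browser's function that handles dotted-quad input only.

```javascript
// Stand-in for the browser's isInNet(), valid for dotted-quad input only.
function toInt(ip) {
    var p = ip.split(".");
    return (((+p[0]) << 24) | ((+p[1]) << 16) | ((+p[2]) << 8) | (+p[3])) >>> 0;
}
function isInNet(ip, net, mask) {
    var m = toInt(mask);
    return ((toInt(ip) & m) >>> 0) === ((toInt(net) & m) >>> 0);
}

// Rules are tested top to bottom; the first match wins.
var rules = [
    ["111.11.0.0",  "255.255.0.0",   "DIRECT"],
    ["122.22.22.0", "255.255.255.0", "DIRECT"],
    ["33.33.33.0",  "255.255.255.0", "PROXY alternateproxy.org.com:8080"],
    ["144.44.0.0",  "255.255.0.0",   "PROXY parentproxy.org.com:8080"]
];

// Hypothetical helper: map an already-resolved IP address to a directive.
function routeByIP(hostIP) {
    for (var i = 0; i < rules.length; i++) {
        if (isInNet(hostIP, rules[i][0], rules[i][1]))
            return rules[i][2];
    }
    return "PROXY externalproxy.org.com:8080"; // everything else
}

console.log(routeByIP("111.11.3.4"));  // DIRECT
console.log(routeByIP("33.33.33.9"));  // PROXY alternateproxy.org.com:8080
console.log(routeByIP("8.8.8.8"));     // PROXY externalproxy.org.com:8080
```

Keeping the most frequently hit subnets at the top of the table preserves the ordering advice given earlier.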
