In the spirit of “all of us are in this together”, I have asked fellow Apps.com developer, Brandon Zehm, to write a guest post about handling subdomains in your SaaS app. I hope you find it useful.
-david
The Problem
Let’s say you’re building a web service and have decided that each customer should access the service using a personalized, easy-to-remember URL. Several existing SaaS products use this approach — for example, when customers sign up for the TSheets Time Tracking service they are given a unique dashboard URL based on their company name. A customer with a company named “ABC Painting” can select a dashboard URL of https://abcpainting.tsheets.com/ when signing up for TSheets’ SaaS services, and their URL will be ready to use instantly, as soon as they click the final button to create their account. The ability to rapidly and dynamically create customer-specific subdomains is known as dynamic subdomain provisioning.
Implementing dynamic subdomain provisioning presents several challenges, such as:
- How is the new subdomain configured and published in DNS?
- How does one get an SSL certificate signed for every new subdomain? Don’t these normally take hours/days to issue? What about cost per certificate?
- How is the new subdomain configured and published on a web server?
- How does the SaaS web-tier software know about the new subdomain and display the proper information when a customer loads the page?
I will address each of these challenges and their solutions. My observation is that handling each of these individually is significantly simpler than it at first appears. I hope to convey that simplicity here and take the black magic out of the process.
Provisioning new subdomains in DNS
In our example above, the user has selected abcpainting.tsheets.com as their new dashboard page URL. We need to configure DNS to return a valid and correct IP address when looking up the new subdomain before the user tries loading it in their browser. If we’re not fast enough they’ll get a “Domain not found” or “Invalid Hostname” error, which doesn’t really inspire confidence in our ability to deliver a web service! The problem with DNS is that the user will expect to be able to use their new subdomain instantly (things should “just work”, right?), but domain name entries can take time to replicate throughout our DNS infrastructure, not to mention the global DNS infrastructure.
What do we do then? The answer: be more clever with our DNS by using “wildcard” DNS records. A wildcard DNS entry is one that matches any subdomain of a domain name.
Here’s an example mapping of a wildcard DNS entry for all subdomains of “mydomain.com” to an IP address:
*.mydomain.com -> 100.200.30.40
What this wildcard mapping means is that a browser that tries to resolve random-url.mydomain.com will receive an answer from DNS directing it to the 100.200.30.40 IP address. The browser will then connect to the server at this IP address to load the random-url.mydomain.com web site. Easy! And this scales to a practically-infinite number of subdomain names.
I already hear your first question, “What if I want the company’s ‘www.mydomain.com’ main page URL to resolve to a different IP address?”
That’s not a problem either. DNS software such as BIND or TinyDNS can be configured to handle wildcard records alongside explicit records for other subdomains. If we want the www.mydomain.com site to be hosted by one server, and all the wildcard *.mydomain.com sites hosted by another, then we would configure our DNS server with both a www.mydomain.com record and a *.mydomain.com record, like this:
www.mydomain.com -> 100.200.10.10
*.mydomain.com -> 100.200.30.40
Using this approach, “www.mydomain.com” matches its own explicit record, and requesters will get the IP of 100.200.10.10 for that subdomain. In DNS, a wildcard record only answers queries for names that have no record of their own, so the order in which the two entries are listed does not matter. Every other “mydomain.com” subdomain has no explicit record, so it falls through to the wildcard and those requests will still get 100.200.30.40 as the IP they looked up.
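To make the lookup behavior concrete, here is a minimal PHP sketch of how a wildcard-aware resolver picks an answer. The `$zone` array and the `resolveHost()` function are hypothetical names invented for illustration; they are not part of BIND, TinyDNS, or any DNS library, and the sketch only handles single-label subdomains.

```php
<?php
// Hypothetical in-memory zone: record name => IP address.
// This is an illustration only, not a real DNS server API.
$zone = [
    '*.mydomain.com'   => '100.200.30.40',
    'www.mydomain.com' => '100.200.10.10',
];

/**
 * Simplified wildcard lookup: an explicit record for the name is used
 * if one exists; the wildcard on the parent domain is only a fallback.
 */
function resolveHost(array $zone, string $host): ?string
{
    $host = strtolower(rtrim($host, '.'));

    // 1. An explicit record for this exact name is tried first.
    if (isset($zone[$host])) {
        return $zone[$host];
    }

    // 2. Otherwise replace the first label with "*" and try again.
    $dot = strpos($host, '.');
    if ($dot !== false) {
        $wildcard = '*' . substr($host, $dot);
        if (isset($zone[$wildcard])) {
            return $zone[$wildcard];
        }
    }

    return null; // a real DNS server would answer "no such name"
}
```

With this zone, `resolveHost($zone, 'www.mydomain.com')` gives the www server's address, while any other single-label subdomain such as `random-url.mydomain.com` gets the wildcard address.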
To summarize, there are too many complications to managing thousands of subdomains with individual records in DNS. Instead of doing that, just create a single wildcard “*.mydomain.com” record that handles all subdomains at once, and only add a handful of individual records for the exceptions that you don’t want hosted on the wildcard IP address(es). I heartily recommend running your own DNS servers like we do at TSheets, but if that’s not an option there are plenty of hosted DNS providers that support wildcard DNS records.
Dealing with SSL certificates for thousands of subdomains
Similar to DNS, SSL certificates also support wildcards, and we need them: we cannot purchase one SSL certificate for “www.mydomain.com” and expect it to work for “abccompany.mydomain.com”. The connection would still be encrypted, but the user will get an error in their browser saying the certificate can’t be trusted since the domain names don’t match.
Instead, we should purchase a wildcard SSL certificate for *.mydomain.com and use it for both our “www.mydomain.com” website and every other website on that domain. These certificates are more expensive than normal SSL certificates, but well worth their price.
One caveat: a certificate for *.mydomain.com will trigger SSL warnings in browsers if a user tries to browse to our root domain. For example, if a user browses to “https://mydomain.com/”, their browser will give them a warning since technically “mydomain.com” doesn’t match the certificate’s name pattern of “*.mydomain.com”. If it’s a requirement that users access the root domain, we’ll either have to purchase a multi-domain SSL certificate signed for both www.mydomain.com and mydomain.com, or just have two different normal SSL certificates, one for each site. This is a rare case — usually there is no requirement that users be able to access the root domain.
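The name-matching rule behind this caveat can be sketched in a few lines of PHP. This is a simplified illustration, assuming the usual browser behavior that the “*” stands in for exactly one label; `certificateCovers()` is a hypothetical helper, not a function from PHP’s OpenSSL extension, and real certificate validation involves far more than name matching.

```php
<?php
/**
 * Simplified check of whether a certificate name pattern covers a
 * hostname. A leading "*." is treated as standing in for exactly
 * one DNS label.
 */
function certificateCovers(string $pattern, string $host): bool
{
    $pattern = strtolower($pattern);
    $host    = strtolower($host);

    // A normal (non-wildcard) certificate must match exactly.
    if (strncmp($pattern, '*.', 2) !== 0) {
        return $pattern === $host;
    }

    $suffix    = substr($pattern, 1);          // e.g. ".mydomain.com"
    $suffixLen = strlen($suffix);

    // The host must be longer than the suffix and end with it.
    if (strlen($host) <= $suffixLen || substr($host, -$suffixLen) !== $suffix) {
        return false;
    }

    // Whatever the "*" replaced must be a single label (no dots).
    $label = substr($host, 0, -$suffixLen);
    return strpos($label, '.') === false;
}
```

Under these assumptions, `certificateCovers('*.mydomain.com', 'abcpainting.mydomain.com')` is true, while the same pattern checked against the root `mydomain.com` is false, which is exactly the warning case described above.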
In fact, most sites simply redirect users coming to the root domain elsewhere. For example, we could redirect all visitors from http://mydomain.com to https://www.mydomain.com using tools within our web server (or our language, if it supports sending HTTP redirects). This has a secondary benefit of allowing us to capture and redirect all non-secure HTTP traffic to an SSL-secured HTTPS URL. Although a detailed description of how to do this is beyond the scope of this article (and the process is well-documented elsewhere), here is a quick PHP code snippet showing one way to conditionally redirect a user’s browser to another URL based on whether they are using an SSL-secured connection or not:
// If we are not on HTTPS (port 443), redirect browser to the
// same URI, but using HTTPS. This works with Apache+PHP.
if ($_SERVER['SERVER_PORT'] != 443) {
    $https = 'https://' . strtolower($_SERVER['HTTP_HOST']);
    header("Location: {$https}{$_SERVER['REQUEST_URI']}");
    exit;
}
Configuring the new subdomains in your web-server
Once again we’re faced with a choice of either configuring every single subdomain in our web-server software, or configuring a catch-all wildcard site that handles every subdomain. I strongly recommend the latter unless you want your system engineers to go insane managing individual configurations for thousands of subdomains. At TSheets, we’re using Apache for our web application server, and it’s configured to route requests for www.tsheets.com to one directory on the server and requests for *.tsheets.com to another. This allows us to completely separate our application from our www.tsheets.com site but still use the same set of servers to serve both.
An example configuration for an Apache web server hosting a “www.mydomain.com” web site and a wildcard “*.mydomain.com” web site would look something like this:
NameVirtualHost *:80
NameVirtualHost *:443
...
<VirtualHost *:80>
    ServerName www.mydomain.com
    ServerAlias mydomain.com
    ServerAdmin webmaster@mydomain.com
    DocumentRoot /var/www/www-site
    ...
</VirtualHost>
<VirtualHost *:80>
    ServerName anything.mydomain.com
    ServerAlias *.mydomain.com
    ServerAdmin webmaster@mydomain.com
    DocumentRoot /var/www/my-application
    ...
</VirtualHost>
Note the “ServerAlias *.mydomain.com” line — this is the directive which tells Apache to match the wildcard pattern. It’s also important that this VirtualHost section come AFTER any other VirtualHost sections defining a specific subdomain of mydomain.com. Otherwise, if you have a wildcard VirtualHost (“*.mydomain.com”) followed by a VirtualHost for “www.mydomain.com”, the wildcard pattern will match first, and the www.mydomain.com site will serve content from the wildcard VirtualHost. This is generally not desired behavior.
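The effect of ordering can be illustrated with a small PHP model of first-match selection. This is only a sketch of the behavior described above, not Apache’s actual code: the `$virtualHosts` array and `selectDocumentRoot()` are hypothetical, and PHP’s `fnmatch()` is used as a stand-in for Apache’s wildcard matching.

```php
<?php
// Simplified model of name-based virtual host selection: the first
// entry whose ServerName or ServerAlias matches the Host header wins.
// The array mirrors the example configuration; it is not an Apache API.
$virtualHosts = [
    [
        'names' => ['www.mydomain.com', 'mydomain.com'],
        'root'  => '/var/www/www-site',
    ],
    [
        'names' => ['anything.mydomain.com', '*.mydomain.com'],
        'root'  => '/var/www/my-application',
    ],
];

function selectDocumentRoot(array $virtualHosts, string $host): string
{
    $host = strtolower($host);

    foreach ($virtualHosts as $vhost) {
        foreach ($vhost['names'] as $name) {
            // fnmatch() treats "*" as "any sequence of characters".
            if (fnmatch($name, $host)) {
                return $vhost['root'];
            }
        }
    }

    // Nothing matched: fall back to the first virtual host listed.
    return $virtualHosts[0]['root'];
}
```

In this model, passing the hosts in the order shown sends `www.mydomain.com` to `/var/www/www-site`. Reversing the array (for example with `array_reverse($virtualHosts)`) sends the same request to `/var/www/my-application`, because the wildcard alias is reached first.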
Handling the new subdomain in your website codebase
Fortunately, major web application environments expose the URL the user is requesting. The ability to access this information at runtime is critical to handling dynamic subdomains. For example, in PHP running on Apache, reading the $_SERVER['HTTP_HOST'] variable allows us to easily determine which of the thousands of subdomains a remote user is trying to access. Here’s a PHP code snippet demonstrating a way to extract the subdomain from the Apache environment:
// Use the HTTP_HOST to determine what client subdomain is
// being accessed. If the URL is "abcpainting.mydomain.com",
// then $subdomain will contain the string "abcpainting"
// after this code executes. The dots are escaped so that they
// match a literal "." rather than any character.
$subdomain = preg_replace('/\.mydomain\.com.*$/', '',
                          $_SERVER['HTTP_HOST']);
Once we can determine which subdomain is being accessed, we can perform specific actions and customizations based on that information.
As a simple example, at TSheets, we store a record for each subdomain in our application database. This record associates other data with the subdomain, such as the name of the company that signed up to access that subdomain URL. So, if a user browses to “https://abcpainting.tsheets.com”, their request is directed to the code that handles the *.tsheets.com wildcard subdomain. This code determines that the user is trying to access the “abcpainting” subdomain, and retrieves a database record based on that information. Within the retrieved record data is an associated “Company Name” field, which the code uses to add a page header saying something like “Welcome to ABC Painting Company”.
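The flow just described might look roughly like the following PHP sketch. It is not the actual TSheets code: the `$accounts` array is a hypothetical stand-in for the application database, and `subdomainFromHost()`, `welcomeHeader()`, and the `company_name` key are names invented for this example.

```php
<?php
// Hypothetical stand-in for the database table that maps each
// subdomain to its account data. A real application would query
// its database here instead.
$accounts = [
    'abcpainting' => ['company_name' => 'ABC Painting Company'],
];

/** Extract the customer subdomain from a Host header, or null if none. */
function subdomainFromHost(string $host, string $baseDomain): ?string
{
    // One label of letters, digits, or hyphens, then the base domain,
    // optionally followed by a port number.
    $pattern = '/^([a-z0-9-]+)\.' . preg_quote($baseDomain, '/') . '(:\d+)?$/i';
    return preg_match($pattern, $host, $m) ? strtolower($m[1]) : null;
}

/** Build the page header for a request, or null for an unknown subdomain. */
function welcomeHeader(array $accounts, string $host): ?string
{
    $subdomain = subdomainFromHost($host, 'tsheets.com');
    if ($subdomain === null || !isset($accounts[$subdomain])) {
        return null; // e.g. show a "not found" page or a sign-up link
    }

    // Escape the stored name before placing it in HTML.
    $name = htmlspecialchars($accounts[$subdomain]['company_name'], ENT_QUOTES, 'UTF-8');
    return "<h1>Welcome to {$name}</h1>";
}
```

In a live request we would call something like `welcomeHeader($accounts, $_SERVER['HTTP_HOST'])`. Returning null for unknown subdomains matters with a wildcard setup, since every possible name resolves to our servers whether or not a customer has claimed it.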
This is a very simple example of customization based on subdomain, but it serves to prove the concept. With this final tool, it is possible to take actions specifically based on which subdomain a user is accessing.
Summary
What at first appears to be a very complicated problem turns out to be fairly simple once we break it down into parts. Solving each part is in turn simple — each problem has been solved long ago by engineers facing similar challenges, and thanks to the ubiquity of the web, has been documented in various mailing lists and blog posts such as this one.
While I may not be an expert in the DNS/web-server/language you’re using, you’re welcome to reach out to me if you have questions. I’m @zehm on Twitter.
Brandon Zehm is the Co-Founder and CTO of TSheets. He is a loyal Linux and PHP developer. When not writing code he enjoys cave exploring, skydiving and mountain climbing.