How We're Different
Most hosting horror stories come from people using budget hosts, and they all seem to revolve around three main problems:
- There's no-one there to fix the problem when something unexpected happens (like a spammer using your e-mail address as his own, filling your mailbox with bounces until you're over your quota).
- Your host loses your data (your account is deleted by mistake and not backed up first, or the server fails and your data was never backed up).
- Your server has problems (it's overloaded or it goes down completely) and you're helpless until the server is fixed because everything is on that server.
We've done everything we can to ensure those problems don't happen here:
Availability:
We offer telephone-based support to all of our clients. Instead of making you use e-mail or a trouble ticket system, we encourage you to call us when something goes wrong. We prefer to handle normal requests during business hours, but will happily fix critical problems at all hours.
Backups:
All of our servers are backed up 6 times per day using rsnapshot, to an off-site backup server. This means that we can go back to any 4-hour period and recover your files as they existed at that point.
This isn't to say that you shouldn't make your own backups, just that most failures are covered by our default system. If your needs exceed this, please let us know and we'll help you set things up to make your own backups.
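To make the 4-hour window concrete, here is a minimal sketch of how the six daily snapshots map to restore points. It assumes the snapshots fall on the 4-hour marks starting at midnight; the actual schedule isn't stated here, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

SNAPSHOTS_PER_DAY = 6                                  # from the text: 6 backups per day
INTERVAL = timedelta(hours=24 / SNAPSHOTS_PER_DAY)     # works out to 4 hours


def latest_snapshot_before(moment: datetime) -> datetime:
    """Return the most recent snapshot time at or before `moment`.

    Assumption: snapshots run at 00:00, 04:00, 08:00, ... (not confirmed above).
    """
    midnight = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    completed = (moment - midnight) // INTERVAL        # whole intervals since midnight
    return midnight + completed * INTERVAL
```

Under that assumed schedule, a file overwritten at 14:30 would be recoverable as it stood at 12:00, so at most four hours of changes are at risk.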
Clustered Hosting Solution:
We use a clustered hosting solution, so a failure of one component (web, database, or mail) shouldn't affect other components. This increases complexity a bit, but the additional complexity works to distribute load and minimize interruptions, rather than introducing more failure points.
Here's an overview:
- When you sign up with us, you're running a control program on the CP server -- the brains of our system. As part of the automated sign-up process you are assigned logical servers for mail, ftp/web, and databases. While it's possible you'll find yourself in a situation where all the services end up on the same machine, odds are mail.yoursite.com, www.yoursite.com, and mysql.yoursite.com will be on different machines -- the physical machine name of your server is never used.
- While signup is occurring, a second mail server on a different machine is assigned backup duty for your domain. If your primary mail server is unavailable for some reason, the secondary will grab incoming mail and hold on to it until your primary mail server becomes available so it can be delivered.
- We have multiple servers for each service we offer. When one approaches our predefined limit, we tell our system not to use that logical server for new signups and no-one else is assigned to use that service/machine combination.
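The assignment rules above can be sketched roughly as follows. This is an illustration only, not the control panel's actual logic: the class, the field names, the 0.8 cutoff, and the "least used wins" rule are all assumptions made for the example.

```python
from dataclasses import dataclass

SIGNUP_CUTOFF = 0.8  # hypothetical: fraction of our predefined limit that closes signups


@dataclass
class LogicalServer:
    name: str                     # e.g. "mail1" (hypothetical naming)
    service: str                  # "mail", "web", or "mysql"
    machine: str                  # physical machine hosting this logical server
    usage: float                  # fraction of the predefined limit in use (0.0 to 1.0)
    open_for_signups: bool = True


def close_full_servers(servers):
    """Stop assigning new signups to servers that approach the limit."""
    for server in servers:
        if server.usage >= SIGNUP_CUTOFF:
            server.open_for_signups = False


def assign(servers, service):
    """Pick the least-used open logical server for a service."""
    candidates = [s for s in servers if s.service == service and s.open_for_signups]
    if not candidates:
        raise LookupError(f"no open {service} server")
    return min(candidates, key=lambda s: s.usage)


def assign_backup_mail(servers, primary):
    """Pick a secondary mail server on a different physical machine."""
    candidates = [s for s in servers
                  if s.service == "mail" and s.machine != primary.machine]
    if not candidates:
        raise LookupError("no mail server on another machine")
    return min(candidates, key=lambda s: s.usage)
```

The point of the sketch is the separation: each service is chosen independently, and the backup mail server is constrained only by being on a different machine than the primary.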
We may still run into a situation where a server that had a comfortable amount of free resources starts to struggle, in which case users need to be migrated. Depending on the resource in question, this can go one of three ways:
- If your web/ftp data needs to be moved to a new server, this is done without any service interruption. First, your data is copied to the new server, then the DNS records are pointed to the new server. Once DNS has propagated the change, the data in your original location is deleted. Users never notice a thing and are transferred over gradually, as their local DNS servers pick up the change. If changes are made during the move, rsync keeps both sites up-to-date automatically.
- Should your mail server become overwhelmed (not likely, but possible given the growing need for spam filtering and the heavy CPU demands of some modern filters), moving your mail involves some downtime (less than an hour), but it's much less than that involved in DNS change propagation. We create the new mail server machine, turn off your mail service on your primary mail machine (your mail relay will still receive new mail -- you just can't retrieve it during the transition), bulk-copy your mail data over, change DNS information on the servers so the new server assumes the role of the old one, and bring the mail service back up. That's it -- you're moved.
- MySQL moves can be done similarly, but we're more likely to just move your account. Moving your account to a new machine (because your server runs a forum that's grown too resource-intensive for the original SQL server, for instance) means stopping your database service so it can be moved, moving the data to the new server, and changing DNS information for mysql.yourdomain.com to reflect the change. Since your server accesses our DNS servers directly, you get the updated information immediately rather than waiting for the change to propagate. Note that this will be coordinated ahead of time to find the best way to stop access to your database without affecting other users, and the move will involve some downtime, but this is usually limited to less than 10 minutes (copying the data takes the longest).
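For the web/ftp move in the first item above, the one judgment call is when the old copy is safe to delete. A rough sketch, assuming propagation is treated as complete once the DNS record's TTL has elapsed with some safety margin; the function name, the margin, and the TTL-based rule itself are assumptions for illustration, not a description of our tooling.

```python
from datetime import datetime, timedelta


def old_copy_removable(dns_changed_at: datetime, ttl_seconds: int,
                       now: datetime, safety_factor: float = 2.0) -> bool:
    """Return True once the old location can reasonably be deleted.

    Assumption: resolvers honor the record's TTL, so after
    ttl_seconds * safety_factor nobody should still be using the old address.
    """
    wait = timedelta(seconds=ttl_seconds * safety_factor)
    return now - dns_changed_at >= wait
```

With a one-hour TTL and the default margin, the old data would be kept for two hours after the DNS switch. Some resolvers cache longer than the TTL allows, which is why a margin is worth having at all.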
What this means is that you need to pay more attention when configuring things. Instead of assuming your MySQL server runs on the same machine as your web server and is accessible as localhost, you'll need to check which MySQL server to connect to. When configuring your mail client you'll need to specify your username and domain name, rather than just your username. And so on.
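In practice this mostly comes down to using the service hostnames instead of localhost. A minimal sketch of the settings an application might build, using the mail/www/mysql naming mentioned earlier; the helper names, the dictionary keys, and the user@domain login format are assumptions for illustration, so check your account details for the real values.

```python
def service_hosts(domain: str) -> dict:
    """Hostnames to configure for each service, following the naming used above."""
    return {
        "web": f"www.{domain}",
        "mail": f"mail.{domain}",
        "mysql": f"mysql.{domain}",
    }


def database_settings(domain: str, user: str, password: str, database: str) -> dict:
    """Connection settings that point at the MySQL host rather than localhost."""
    return {
        "host": service_hosts(domain)["mysql"],   # not "localhost"
        "user": user,
        "password": password,
        "database": database,
    }


def mail_login(username: str, domain: str) -> str:
    """Mail login combining username and domain (assumed user@domain format)."""
    return f"{username}@{domain}"
```

Because the hostname is looked up each time, a later move of your database to another machine needs no change on your side.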
The thinking has been done in advance so that responses to problems are much faster and less troublesome than with non-clustered hosting solutions. We've also eliminated what many see as the biggest problems with most other web hosts -- we minimize the likelihood of our clients running out of resources, and when this does happen (as it inevitably will to some fraction of our users) we can correct the problem with zero downtime and minimal inconvenience in most cases.
In the near future H-Sphere will support network storage solutions that will shorten our already minimal downtime during changes, if not eliminate it entirely. It will also allow us to offer truly redundant, load-balanced hosting for all of our services (right now web service for a domain can't be spread to more than one machine, so web serving is still a single point of failure).
Overselling
"Overselling" is a difficult term to really define, but it can be summed up by referencing what the phone companies do: build infrastructure that's enough to handle 98% of the loads you'll ever see. This results in a system that's reliable most of the time, but that fails when it's most needed (think 9/11 or Katrina and the sort of overloads we saw in cell phone systems).
This can be a difficult thing to judge with web hosts -- disk space and network bandwidth can be expanded almost infinitely, which leads many hosts to put thousands of sites on the same server. Not us.
Our measure is our load statistics. We strive to maintain optimal load on our servers, which pretty much means an average load of under two on our two-processor machines. This means that there's no more than one process waiting per CPU, which translates in Unix-land as "maximum efficiency" -- that point below which there's a lot of idle time, and above which performance starts to suffer.
When our loads get close to this magic number, we turn that server off for new signups. If it starts to exceed that number, then we move our busiest users transparently to another machine before the server starts to slow down measurably.
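The rule of thumb above can be written down as a small check. This is a sketch only: the function name, the returned labels, and the 0.9 "getting close" fraction are assumptions, while the one-process-per-CPU ceiling comes from the text.

```python
MAX_LOAD_PER_CPU = 1.0   # from the text: a load of two on a two-processor machine


def load_status(load_avg: float, cpus: int, warn_fraction: float = 0.9) -> str:
    """Classify a server by its average load per CPU.

    warn_fraction is a hypothetical definition of "close to the magic number".
    """
    per_cpu = load_avg / cpus
    if per_cpu > MAX_LOAD_PER_CPU:
        return "migrate-busiest-users"   # over the limit: move users elsewhere
    if per_cpu >= MAX_LOAD_PER_CPU * warn_fraction:
        return "close-signups"           # near the limit: stop new signups
    return "ok"
```

On a Unix machine the inputs could come from Python's os.getloadavg() and os.cpu_count(); a two-processor server with an average load of 1.9 would be closed to new signups under these assumed thresholds.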
content on this site © 2002-2006 WELLBUILTNETWORKS LLC
