Choosing a Webserver

So many times, people choose to run WordPress on a traditional LAMP stack...STOP IT! Just because something is an industry standard doesn't mean it's good. Yes, LAMP is a good stack sometimes, but not always. That's the important thing to know: when to use it, and when to use another stack.

For the time being, I'm going to focus on "A" and leave L, M, and P out of this.

In LAMP, A is Apache, one of the oldest and most popular web server applications on the Internet. Apache is great and serves up a lot of websites all over the web, but you won't see any of the top 10 websites using it. WordPress.com doesn't even use Apache! I used Apache for a long time, but have since moved away due to performance issues. No hard feelings though! I still use it in some places!

What most people don't know is that Apache is relatively bloated these days and doesn't use RAM/CPU in the most efficient manner possible. It's not Apache's fault; it's just the result of being in existence since 1995 (yes, I'm implying that's old).

I won't go over the installation/setup (in this post) as each one is very different from the other, but here are a few alternatives to consider before going with Apache:

NGINX:

NGINX™ is an advanced Internet infrastructure software. It is a high performance web server with the lowest memory footprint and it provides complete combination of the most essential features required to build modern and efficient web infrastructure. Today NGINX is the 2nd most popular open source web server on the Internet. NGINX functionality includes HTTP web server, HTTP and SMTP/IMAP/POP3 reverse proxy, content caching, load balancing, compression, bandwidth policing, connection multiplexing and reuse, SSL offload and media streaming.

LIGHTTPD:

Security, speed, compliance, and flexibility -- all of these describe lighttpd (pron. lighty) which is rapidly redefining efficiency of a webserver; as it is designed and optimized for high performance environments. With a small memory footprint compared to other web-servers, effective management of the cpu-load, and advanced feature set (FastCGI, SCGI, Auth, Output-Compression, URL-Rewriting and many more) lighttpd is the perfect solution for every server that is suffering load problems. And best of all it's Open Source licensed under the revised BSD license.

Litespeed:

LiteSpeed Web Server is the leading high-performance, high-scalability web server. It is completely Apache interchangeable so LiteSpeed Web Server can quickly replace a major bottleneck in your existing web delivery platform. With its comprehensive range of features and easy-to-use web administration console, LiteSpeed Web Server can help you conquer the challenges of deploying an effective web serving architecture.

See? There are other options out there. Ironically, each one claims to be the best.

Go ahead, step out of your comfort zone and use something other than Apache this time. You can do it!

I'll talk about what I use and why in an upcoming post...I might even go over how to install it...in fact, I will! I'll even cover installation and configuration of Apache, NGINX, and Lighttpd...with juicy sample config files to make your hearts melt. Stay tuned for those.

Testing with Blitz

Since I'm always seeking to increase speed, I always need new tools to test everything. I use Pingdom Tools and Google Page Speed (GPS) on an almost daily basis. Though those sites are great for testing load speed, I needed something else tested: capacity. Pingdom Tools and GPS only simulate one user at a time, and while it's nice, it's not always practical. There's always a chance that more than one person will be accessing your page at a time.

Meet Blitz.

Blitz doesn't just simulate one user. Blitz uses two tests to help you determine how your server or application will do under any type of load.

The first step is called Sprinting. Sprinting runs a simple test from any of Blitz's regions around the world and shows you response time.

If the Sprint completes successfully (and it should under most circumstances) you move on to Rushing. Rushing is the real load test.

You can define the number of users and time using Blitz's command line syntax.
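The idea behind a Rush is easy to sketch: hit the same page from many simulated users at once and record how long each request takes. Here's a minimal Python sketch of that idea. To be clear, this is not Blitz and it doesn't use Blitz's syntax; `rush`, `fake_fetch`, and the user counts are hypothetical names and numbers chosen for illustration, and `fake_fetch` just sleeps so the example runs without touching a real site.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def rush(fetch, users, requests_per_user=1):
    """Call fetch() from `users` concurrent workers and collect timings.

    `fetch` is any zero-argument callable that performs one request
    (hypothetical here; swap in a real HTTP call against your own site).
    Returns a flat list of per-request durations in seconds.
    """
    def one_user(_user_id):
        timings = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            fetch()
            timings.append(time.perf_counter() - start)
        return timings

    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(one_user, range(users)))
    return [t for user_timings in results for t in user_timings]


def fake_fetch():
    # Stand-in for a real page request so the sketch runs offline.
    time.sleep(0.01)


if __name__ == "__main__":
    durations = rush(fake_fetch, users=10, requests_per_user=3)
    print(f"{len(durations)} requests, slowest {max(durations):.3f}s")
```

If you try something like this for real, only point it at a site you own, and start with a small number of users.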

After you Rush, Blitz will show how you did. It's a great way to see where you need to improve. Here's a look at my Sprint and Rush.

[Image: Sprint Success!]

[Image: Rush Success!]

[Image: Rush Results!]

It looks like I can withstand a lot of traffic before having to worry!

Blitz is an awesome tool! I've been using it to monitor An Altered Reality's performance over the past week. So far, it's been incredibly helpful in tweaking my setup to get the best possible results...and it was better than my prior approach (that actually got some attention!) Give it a shot!

What do your results look like?

Caching: Memory or Disk

[Image: Caching Removes Weight!]

Choices, Choices.

A small disclaimer: I'm going to assume you have a caching plugin installed at this point and you're wanting to know if you should choose disk based caching or memory based caching. I'm also going to assume that you don't have rockstar command line skills and that servers really aren't your thing...so I won't be giving you anything to put in a command line nor will I dive deep into hardware configuration. Nothing too deep.

Caching is a huge deal. All the popular sites do it, and the reasoning behind it is simple: it cuts page load time and decreases server load. The less load on each server, the more content it can serve, ultimately leading to fewer physical machines and lower operating costs (sometimes). It's a good thing, really.

If you're using shared hosting, caching is a must.

Disk Caching

At first, I was clueless. Disk (Advanced) appeared good the first time I installed W3 Total Cache (W3TC). I didn't think about the difference in hardware at the moment nor was memory caching an option. It just didn't seem relevant. I don't know why...hardware is half of my real job.

Disk caching has its pros and cons:

Pros:

  • Works on all types of hosting, shared and private
  • Increases page speed enough for some
  • Lightens the load on the server application (Apache and the like)

Cons:

  • Disk caching is caching files on a hard drive, the slowest part of a computer. Server drives aren't solid state either; those drives spin.
  • Increased drive I/O isn't the best thing in the world (Drives fail. Not if, when.)
  • Isn't incredibly fast

Disk caching is the easiest caching and for the inexperienced user, probably the best option. It requires almost zero configuration. It's a set and forget method, and it's fast enough for most people. The biggest downside is that it's caching on the disk. Hard drives, even the 15K (that's fancy talk for 15,000 RPM) drives in my NAS servers, are the slowest part of a computer.

Memory Caching

Memory caching (mem caching) works slightly differently. Programs like APC and Memcached cache in memory rather than on the disk. Memory is fast. Very fast. When you remove moving parts from the equation, everything speeds up. Your data never has to leave the motherboard, travel down some cable, hit a controller, wait to be written...you get it? Though mem caching is better, it's not perfect either.

Pros:

  • Faster than disk caching
  • Significant page speed increase if configured properly
  • More efficient on hardware (no moving parts; RAM lasts longer than hard drives)

Cons:

  • Not easy to configure
  • Requires some maintenance
  • Can consume large quantities of memory causing your site to time out

Though the biggest plus is page speed increase, the biggest con is that successful mem caching requires a good bit of RAM, or some seriously modified configuration files. If you don't have the RAM, and you attempt to use something like APC or Memcached, you might find your server unresponsive. It's not fun, and not easy to fix.
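If you're curious what "keeping memory use under control" looks like in principle, here's a tiny Python sketch of a memory cache with a hard cap on the number of entries. It is only an illustration of the idea, not how APC or Memcached are implemented or configured; `BoundedCache` and `max_entries` are hypothetical names, and a real cache would cap bytes rather than entry counts.

```python
from collections import OrderedDict


class BoundedCache:
    """In-memory cache that evicts the least recently used entry when full."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.max_entries:
            self._items.popitem(last=False)  # drop the least recently used


cache = BoundedCache(max_entries=2)
cache.set("/home", "<html>home</html>")
cache.set("/about", "<html>about</html>")
cache.get("/home")                        # touch /home so /about is now oldest
cache.set("/contact", "<html>contact</html>")
print(cache.get("/about"))                # None: evicted to stay within the cap
```

The trade-off is the one described above: a smaller cap keeps the server responsive but means more cache misses.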

Conclusion

Caching, memory or disk, is crucial to site performance. If you don't cache, you risk slow page load times which ultimately affect SEO as well as your pageviews. People aren't patient, and I promise, our content is no exception.

Good luck! If you have any questions, hit the comments.

Epic WordPress Setup

Towards the end of last month, John from TentBlogger unveiled his "Ultimate WordPress Blog Hosting Setup". I liked the idea, and accepted it as a challenge to be faster, so I figured I'd share mine. My setup usually stays constant, but I'm constantly trying to make things faster so this could all change at any moment. I have a test environment in place, but I usually test on the live site.

The majority of my properties run on Amazon EC2. I do have a DreamHost account though, and I'd recommend them for anyone looking for a hosting company. I've been nothing but satisfied since signing up two years ago.

My EC2 instances are all Micro instances running a mix of Ubuntu 10.10 and 11.04 with Lighttpd, Percona/MySQL, PHP, APC, and Memcached with PHP-Memcache (LLPP instead of LAMP or anything else).

Content Delivery is done with CloudFront using an S3 bucket as the origin. In the previous post, I talk about why I chose to do this versus using a custom origin.

DNS is managed through Amazon Route 53 for the EC2 sites, and DreamHost for the DreamHost sites. DNS is a big deal to me. I'm super impatient, so I hate waiting for changes to propagate. With Route 53, my propagation time is usually no more than 5 minutes...and if it's more, I just edit my hosts file to keep working.

Lighttpd, my web server of choice, is a super lightweight web server. It's used on some of the web's largest websites such as YouTube, Wikipedia, and at one time, Twitter. It's simple to configure and super easy to maintain. I chose it based on its efficient use of CPU/RAM as well as its built-in modules. It's open source, too!

Percona is the most carefully maintained, independent fork of MySQL, and its benchmark tests show it to be significantly faster (those here). It's a drop-in replacement, meaning it appears to be MySQL (you access it the same way, log in the same way, and yes, tools like phpMyAdmin work with it). You have to uninstall MySQL before using it, then install Percona after. You can compare Percona vs. MySQL feature sets here (HINT: Percona is better!). I'm currently running version 5.1. As soon as 5.5 comes out of RC stage, I'll be upgrading as it's even faster. Database speed is important - WordPress relies heavily on it.

PHP, APC, and Memcached aren't special. PHP is running as CGI. APC is standard. I keep Memcached installed in case I want to use it (usually for DB/Object caching). When I do, it uses a second EC2 instance for load balance/fault tolerance. I prefer APC as I've seen better performance over time (I think most people have).

I do have some level of redundancy built in. I have a second EC2 instance in a different region. The databases are set up as Master/Slave with near instant replication when the master makes a change. In case I log into the secondary server (because of Round-Robin DNS), I'm leveraging HyperDB to tell WordPress to read from the local copy (slave), but to write to the master. That keeps load times down on the front end.
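The read-local, write-to-master idea can be sketched in a few lines of Python. This is not HyperDB's code or its configuration format, just an illustration of the routing decision; `SplitRouter`, the host names, and the list of write verbs are all assumptions made for the example.

```python
class SplitRouter:
    """Send writes to the master and everything else to the local replica."""

    # Simplified list of statements treated as writes (an assumption).
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE",
                   "CREATE", "ALTER", "DROP")

    def __init__(self, master, replica):
        self.master = master
        self.replica = replica

    def pick(self, sql):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        return self.master if verb in self.WRITE_VERBS else self.replica


# Hypothetical host names, for illustration only.
router = SplitRouter(master="db-master.example.internal", replica="localhost")
print(router.pick("SELECT * FROM wp_posts"))                  # localhost
print(router.pick("UPDATE wp_options SET option_value = 1"))  # db-master.example.internal
```

Reads stay on the nearby copy, which is where the front-end speed-up comes from; only changes have to travel to the master.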

[Image: Global Site Load Balancing = EPIC]

I'd love to use a global load balancer to route traffic to the closest server. Right now, it doesn't make sense since Amazon only has an east and west region with no central region to balance it out.

Eventually I'd love to see Amazon allow you to use their load balancer across multiple regions (they currently allow cross-Availability-Zone load balancing), assign it an IP address to use with Route 53, and provide the ability to load balance based on location and load rather than just load.

In all, I think it's a pretty awesome setup. It's definitely not typical, but it's a heck of a lot faster than LAMP. My load times are between 2 - 3.5 seconds (3.5 being almost too slow in my opinion). You can see a recent load test here.

To recap:

Servers: EC2

Hosting Company: DreamHost (only for some projects - An Altered Reality is powered by EC2)

OS: Ubuntu Linux 10.10/11.04 Server (CLI Only)

DNS: Route 53/DreamHost

Software: Lighttpd, Percona/MySQL, PHP, APC, and Memcached with PHP-Memcache

Content Delivery: Amazon CloudFront via Amazon S3

All of this used to be free...but now it's costing around $5-$10 a month.

If you have any questions or comments, let me know below!

Amazon CloudFront and GZIP

If you look at my posts here, you can tell that I have a bit of a love affair with Amazon Web Services. I won't deny it either, because I do. But, just because I love AWS doesn't mean that it's perfect. Like any relationship, it's far from it.

Recently I moved from Amazon S3 to CloudFront with a custom origin. CloudFront's custom origin option essentially makes CF a pull CDN. After looking at the front page I noticed that the CSS wasn't loading - at all! In fact, it wasn't even there!

Upon further research, I realized CF doesn't handle GZipped content when using custom origins.

Obviously, that's an issue since gzip is one of the ways I make this whole thing run fast.

In order to serve GZipped content via CF, you have to use both S3 and CF together. Your S3 bucket becomes your CF distribution origin.

Though it's roundabout, it works very well.
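For the curious, the S3 route generally means compressing each file yourself before it goes into the bucket and storing it with a header that says it's gzipped. Here's a rough Python sketch of that preparation step, assuming that's how your files get uploaded; `prepare_for_s3` is a hypothetical helper name, and the actual upload is left out on purpose.

```python
import gzip


def prepare_for_s3(body: bytes, content_type: str):
    """Gzip a static asset and build the headers it should be stored with."""
    compressed = gzip.compress(body)
    headers = {
        "Content-Type": content_type,
        "Content-Encoding": "gzip",  # tells the browser to decompress the body
    }
    return compressed, headers


# A repetitive stylesheet stands in for a real CSS file.
css = b"body { margin: 0; padding: 0; }\n" * 200
compressed, headers = prepare_for_s3(css, "text/css")
print(len(css), "->", len(compressed), "bytes", headers)
```

The catch with this approach is that every visitor gets the gzipped copy, so it assumes their browser can handle gzip.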

Hope that's helpful!

Making the Move to a CDN: What You Need to Know

Speed is important. If a visitor comes to your site and everything takes forever to load, do you think they're going to want to come back? Your content had better be good because if it's not, no one is going to wait on your slow blog. It just doesn't work that way. Humans are naturally impatient. Waiting doesn't work.

Over the years, I've tried my hardest to learn the best ways to make my sites faster. I want darn near instant load times. I don't want to wait on my own stuff to load. Before I dive into talking about Content Delivery Networks (CDNs), here are some things that I've learned:

  • Images are often the heaviest things on a blog
  • CSS/JS files can be heavy and are great candidates for caching
  • White space = byte space. If you can fit it on one line - do it.
  • RAM is crucial - get more or get gone.
  • Databases generate overhead. Take care of it.
  • Revision history = bloat.

That's some of what I've learned over the years. I can't go through everything. I need some consulting jobs every now and then. I'm sure you understand.
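To put a number on the whitespace point from the list above, here's a deliberately naive Python sketch that squeezes the whitespace out of a CSS snippet. `squeeze_css` is a hypothetical helper, not a real minifier: it will mangle strings and some selectors that depend on spaces, so treat it as a demonstration and use a proper minifier or plugin on a real site.

```python
import re


def squeeze_css(css: str) -> str:
    """Naive whitespace removal; not a full CSS minifier."""
    css = re.sub(r"\s+", " ", css)                # collapse runs of whitespace
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)  # trim around punctuation
    return css.strip()


original = """
body {
    margin: 0;
    padding: 0;
}
"""
small = squeeze_css(original)
print(small)                                      # body{margin:0;padding:0;}
print(len(original), "->", len(small), "characters")
```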

Content Delivery Networks are like giant caches of static files. CDNs aren't designed for dynamic files, so don't try and use them for that! You'd just be disappointed.

Here's how it works:

[Image: MaxCDN's Edge Locations]

  1. You upload a file to your CDN
  2. That file is then distributed to edge servers around the country or maybe even the world!
  3. When that file is accessed, the edge server closest to the requester responds and delivers the content.

Simple, right?

Distance matters, and someone finally realized that when CDNs were developed. The closer the file to the requester, the faster it's going to load. That's why CDNs speed things up.
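Here's a small Python sketch of the "closest edge wins" idea, using straight-line distance on a globe. The edge locations are made up for illustration, and `distance_km` and `nearest_edge` are hypothetical names; real CDNs pick an edge based on network routing and load, not just geography, so this is only the simplified picture.

```python
import math


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


# Hypothetical edge locations, for illustration only.
EDGES = {
    "new-york": (40.71, -74.01),
    "los-angeles": (34.05, -118.24),
    "london": (51.51, -0.13),
}


def nearest_edge(visitor):
    """Return the name of the edge location closest to the visitor."""
    return min(EDGES, key=lambda name: distance_km(visitor, EDGES[name]))


print(nearest_edge((41.88, -87.63)))  # visitor in Chicago -> new-york
print(nearest_edge((48.86, 2.35)))    # visitor in Paris -> london
```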

Who Needs a CDN?

Do you have slow load times or a large number of visitors? Does your hosting plan offer little to no storage? Are you willing to spend a few extra bucks a month?

If you answered yes to any of those questions, you're a candidate and should seriously consider moving to a CDN.

The average cost is $0.15/GB. That's cheap! You do pay for bandwidth, but it's a negligible amount of money. You'd only feel it if you have a massive amount of traffic.
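To see what that rate means for a monthly bill, here's a quick back-of-the-envelope calculation in Python. It assumes a flat per-GB rate equal to the average quoted above and ignores request fees and tiered pricing; the traffic figures are made-up examples and `monthly_cdn_cost` is a hypothetical helper.

```python
def monthly_cdn_cost(gb_transferred, rate_per_gb=0.15):
    """Estimated monthly bill in dollars at a flat per-GB rate."""
    return round(gb_transferred * rate_per_gb, 2)


# Example traffic levels (assumptions, not measurements).
for gb in (10, 50, 500):
    print(f"{gb} GB -> ${monthly_cdn_cost(gb):.2f}")
```

At 10 GB a month that's $1.50, at 50 GB it's $7.50, and it takes 500 GB to reach $75.00.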

Now comes the hard part: who to use.

Pick a CDN:

Here's the big 4 in the game:

  1. Amazon S3/CloudFront
  2. Rackspace Cloud Files
  3. MaxCDN
  4. Akamai

Each has their own list of pros and cons. Amazon and Akamai, for example, own their networks. Rackspace and Max partner with Limelight Networks and NetDNA respectively.

There's more than those 4, of course, but those are 4 of the biggest CDN providers I can think of.

I won't recommend one over the other, but I can tell you, no matter who you pick, you'll be pleased with your load times!

You've Picked One, Now What?

There's a few ways to do this. You can change the code in your blog to point to your CDN or you can find a plugin to do the dirty work for you.

I won't go into the details of either, but I'd suggest finding a plugin...unless modifying WordPress code is your thing...

I use W3 Total Cache. I give it the access info to my CDN and then tell it what to upload and then use. It's really simple.

--

Those are the basics. CDNs are awesome, hands down. They're cheap, they're fast, and they help!

I'd encourage any blogger to look into it. I did, and I haven't looked back.

If you have any questions, need some help, have something to add...the comments are open!