Virtualization and the ISP (part 1)

With things changing all over the marketplace, virtualization has, again, come to the forefront as the savior of the data center.

And wouldn’t you know, I’d like to save my data center, or at least cut some of its power and cooling needs.

I have started to review how we use our servers and where we could consolidate to save power, cooling, and rack space.  In this installment, I’ll discuss the usage, and the combining, of 3 parts of our network:

  • POP/IMAP servers
  • Apache based web servers
  • SMTP (inbound, delivery, outbound) servers

When looking at the POP/IMAP servers, I find that their CPU and memory utilization is quite low, and most of the I/O that occurs is on the wire rather than on the local storage system.  (Some of this will change as I formalize our changeover from Courier-IMAP to Dovecot for POP and IMAP service to customers, as Dovecot keeps an index of messages on the volume for faster searches and retrievals.)

Now onto the web servers…

These systems can use vast amounts of CPU and memory, but the only I/O done on the local storage system is the transient logging from Apache.

But the SMTP servers are the tough part.  CPU, memory, and disk I/O utilization are all different:

  • Inbound servers take connections from the Internet, upwards of 2-3 million per day.  Initial spam filtering happens at this level, though it is not very CPU intensive, as the goal is to toss as many connections as fast as possible.  Disk I/O utilization is high (30-40 I/O per second normal, peaking at 300+).
  • Outbound servers take connections from our customers, upwards of 100 thousand per day.  Spam filtering, virus scanning, and anti-phishing techniques come into play.  Today these systems show mostly idle CPU and medium memory usage.  Disk I/O utilization is low.
  • Delivery servers have the bulk of the work.  These servers take connections from the antispam appliances (black boxes) and deliver 500-600 thousand messages per day when busy.  They have the highest CPU and memory usage of the 3 sets of SMTP servers in operation – they do one last anti-virus and anti-phish scan, do per-mailbox routing of messages, and handle any ‘I am out of the office’ type processing.  The CPU utilization in this area of the SMTP train is greater than that of the other 2 positions combined.  Disk I/O utilization is high (30-40 I/O per second normal, peaking at 300+).
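
To put those daily volumes next to the per-second disk I/O figures, here is a rough Python sketch that converts them to average per-second rates. It assumes traffic is spread evenly over 24 hours, which real mail traffic is not, so treat the results as a floor rather than a peak. The dictionary and function names are mine, for illustration only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily volumes quoted above as (low, high); a single figure is repeated.
daily_volumes = {
    "inbound connections": (2_000_000, 3_000_000),
    "outbound connections": (100_000, 100_000),
    "delivered messages": (500_000, 600_000),
}

def per_second(count_per_day):
    """Average events per second, assuming an even spread over the day."""
    return count_per_day / SECONDS_PER_DAY

average_rates = {
    name: (per_second(low), per_second(high))
    for name, (low, high) in daily_volumes.items()
}

for name, (low, high) in average_rates.items():
    print(f"{name}: {low:.1f} to {high:.1f} per second on average")
```

Under that even-spread assumption, the inbound servers average roughly 23 to 35 connections per second and the delivery servers roughly 6 to 7 messages per second.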

So, how to combine these systems?

That has been the pending question on my mind the past 2 months and I think I know what I want to do…

Combine the POP/IMAP servers with the web servers as 6 machines using the following (basic) hardware configuration:

  • Intel L5410 (2.33GHz) quad core processor (single)
  • 8GB RAM (4x2GB sticks)
  • 3 80GB disk drives in a RAID5 configuration
  • 3 1Gbps ethernet ports

This gives each of the virtualized servers enough of each resource, divvied up as follows…

  • Host operating system
    • ~1GB RAM
    • 9GB disk
  • POP/IMAP virtual server (allow max CPU of ~1 core)
    • 2GB RAM
    • 90GB disk
  • Web virtual server (allow max CPU of ~3 cores)
    • 5GB RAM
    • 40GB disk

That takes care of 2 of the 3 areas so far, now onto the SMTP server (basic) hardware configuration:

  • Intel L5420 (2.5GHz) quad core processor (single)
  • 8GB RAM (4x2GB sticks)
  • 4 (or more depending on measured I/O requirements) 73GB 15K RPM SAS disks in a RAID5 configuration
  • 3 1Gbps ethernet ports

This would allow the following virtualized server layout…

  • Host operating system
    • ~1GB RAM
    • 9GB disk
  • Inbound SMTP server (allow max CPU of ~2 cores)
    • 2GB RAM
    • ~60GB disk
  • Outbound SMTP server (allow max CPU of ~1 core)
    • 2GB RAM
    • ~60GB disk
  • Delivery SMTP server (allow max CPU of ~2 cores)
    • 3GB RAM
    • ~60GB disk
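
As a quick sanity check on both layouts, the following Python sketch adds up the RAM and disk allocations and compares them with what each box provides. It is only a back-of-the-envelope check under a few assumptions: the SMTP host has exactly 4 disks, the "~" figures are taken as exact, and RAID5 usable space is the nominal (disks - 1) x size, which ignores formatting and controller overhead. The names `hosts`, `raid5_usable_gb`, and `budget` are made up for this example.

```python
def raid5_usable_gb(disk_count, disk_size_gb):
    """RAID5 spends one disk's worth of space on parity."""
    if disk_count < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (disk_count - 1) * disk_size_gb

# Guests are (RAM GB, disk GB), taken from the two layouts above.
hosts = {
    "POP/IMAP + web": {
        "ram_gb": 8,
        "disks": (3, 80),  # 3 x 80GB
        "guests": {"host OS": (1, 9), "POP/IMAP": (2, 90), "web": (5, 40)},
    },
    "SMTP": {
        "ram_gb": 8,
        "disks": (4, 73),  # assumes the minimum of 4 x 73GB
        "guests": {"host OS": (1, 9), "inbound": (2, 60),
                   "outbound": (2, 60), "delivery": (3, 60)},
    },
}

def budget(host):
    """Return (spare RAM GB, spare disk GB) after all guests are allocated."""
    ram_used = sum(ram for ram, _ in host["guests"].values())
    disk_used = sum(disk for _, disk in host["guests"].values())
    usable = raid5_usable_gb(*host["disks"])
    return host["ram_gb"] - ram_used, usable - disk_used

for name, host in hosts.items():
    spare_ram, spare_disk = budget(host)
    print(f"{name}: {spare_ram} GB RAM spare, {spare_disk} GB disk spare")
```

With these numbers, both hosts allocate all 8GB of RAM, while the nominal disk budgets leave about 21GB spare on the POP/IMAP and web host and about 30GB spare on the SMTP host.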

Using the 15K RPM disk drives would allow for very high I/O loads – one of the items that the SMTP servers require that the other 2 types of servers do not.  The slightly faster processor will allow for further longevity as mail resource usage (SMTP type) does not grow at the same rate as the web or POP/IMAP servers.  As usage of the POP/IMAP servers grows, more servers can be deployed.

You’ll also notice that a maximum CPU limit is placed on each of the different types of virtual servers.  If a runaway process or a heavy spike in resource usage occurs, the limit protects the other virtual servers on the same host from being starved of resources.
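
To illustrate what the caps buy, here is a small Python sketch of the worst case, where one guest is pinned at its cap. It treats the "~" core caps as whole numbers and assumes 4 physical cores per host, as in both hardware lists. The function names are hypothetical and have nothing to do with any particular hypervisor's settings.

```python
PHYSICAL_CORES = 4  # both proposed hosts use a single quad-core CPU

# Per-guest CPU caps, in cores, from the two layouts above.
cpu_caps = {
    "POP/IMAP + web": {"POP/IMAP": 1, "web": 3},
    "SMTP": {"inbound": 2, "outbound": 1, "delivery": 2},
}

def cores_left_if_runaway(caps, runaway_guest, physical_cores=PHYSICAL_CORES):
    """Cores left for everything else when one guest sits at its cap."""
    return physical_cores - caps[runaway_guest]

def oversubscription(caps, physical_cores=PHYSICAL_CORES):
    """Sum of caps minus physical cores; above zero means the caps
    cannot all be reached at the same time."""
    return sum(caps.values()) - physical_cores

for host, caps in cpu_caps.items():
    for guest in caps:
        left = cores_left_if_runaway(caps, guest)
        print(f"{host}: {guest} at its cap leaves {left} core(s) for the rest")
    print(f"{host}: caps exceed physical cores by {oversubscription(caps)}")
```

Even with the delivery server stuck at its 2-core cap, 2 cores remain for the inbound and outbound servers and the host. The SMTP caps add up to 5 cores on a 4-core box, so they bound any single guest but do not guarantee that every guest can reach its maximum at once.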

Whew, that was a lot of data!

Part 2 will start to address the different host operating systems I have been using to facilitate this combining of servers, which one I am likely to be using, and how I set it up on the ipHouse network.