Martin Walshaw (Global) - Virtual Servers Still Face Real World Challenges

VMware provides brilliant facilities for data stored in its clusters, but it can't do anything about the data flowing inward. Martin Walshaw, senior systems engineer at F5, looks at the four key areas where VMware can improve: capacity on demand, moving servers, site recovery, and VM density.


We've all seen TV news images of stampedes at football stadiums, where huge, multi-million dollar facilities with state-of-the-art fields, restaurants, bars and seating are simply swamped by a tide of people pushing and shoving to get what they want.

Replace "football stadium" with "datacentre", and "screaming football fans" with "connection requests", and you start to see where this is going. There are two aspects to managing large-scale networked information systems - processing the data, and moving it in and out of the datacentre quickly and smoothly.

When it comes to processing the data, cloud computing is obviously a giant leap forward - more efficient resource utilisation, processing distributed to where it's needed most, more resilience - we all know the story. And when you talk about cloud, you have to talk about VMware, which came from almost nowhere, starting as an academically abstract piece of computer science and going on to build the software bedrock of virtualisation.

But as powerful as VMware is, it is the last element in a long communications chain that starts somewhere - anywhere - in the world, finds its way to a datacentre, then needs to find the correct server, and then the correct virtual machine within that server. VMware provides brilliant facilities for the data once it gets into the cluster, but it can't do anything to the data on the way in, when the connections are being set up. Datacentre meltdowns happen when connection requests don't get what they want quickly enough, and start pushing, shoving and being dropped.

Let's look at four areas in particular: capacity on demand, moving servers, site recovery, and VM density.

Capacity on demand is one of the biggest drawcards of cloud computing and server virtualisation. There is a powerful tool in the VMware suite called vCloud Director that lets you spin up or drop servers as they are needed. Trouble is, to be really useful, it needs to be automated so that, depending on traffic loads and incoming connections, servers come up or down as needed without manual intervention. You also need to redirect and manage connection requests on the network before they get to the server clusters. As servers come up, a high-bandwidth, low-latency gatekeeper ensures that traffic ends up where it needs to be as quickly as possible, but without overwhelming the cloud controllers while they're re-allocating resources.
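To make the automation point concrete, here is a minimal sketch in Python of the kind of decision such a system has to make: given the current connection count, should a server come up or go down? Everything in it is an assumption for illustration. The ScalingPolicy fields, the thresholds and the desired_servers function are hypothetical names, not part of vCloud Director or any F5 product, and a real deployment would act on the result through whatever provisioning interface it actually has.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # All values are illustrative assumptions, not vendor defaults.
    connections_per_server: int = 1000  # load one server handles comfortably
    scale_up_ratio: float = 0.8         # add a server above 80% of capacity
    scale_down_ratio: float = 0.4       # drop a server below 40% of capacity
    min_servers: int = 1
    max_servers: int = 20


def desired_servers(active_connections: int, running: int,
                    policy: ScalingPolicy) -> int:
    """Return how many servers should be running for the current load.

    Moves one server at a time, and leaves a gap between the two
    thresholds so the pool does not flap up and down around one value.
    """
    capacity = running * policy.connections_per_server
    # With no servers running, treat the pool as infinitely overloaded.
    load = active_connections / capacity if capacity else float("inf")

    if load > policy.scale_up_ratio:
        running += 1
    elif load < policy.scale_down_ratio:
        running -= 1

    # Never go outside the configured pool size.
    return max(policy.min_servers, min(policy.max_servers, running))


if __name__ == "__main__":
    policy = ScalingPolicy()
    for connections, running in [(900, 1), (500, 1), (300, 2)]:
        print(connections, running, desired_servers(connections, running, policy))
```

The gap between the two thresholds is the design choice that matters: without it, a pool sitting right at the limit would bring servers up and down on every measurement, which is exactly the churn that overwhelms the controllers doing the re-allocation.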

This is where F5's application controllers step in, acting as the police in day-glo vests, keeping the traffic moving smoothly and preventing bottlenecks.

In the case of VMware's Site Recovery Manager, the application controller also ensures that bringing sites back up is transparent to live connections, particularly where traffic is trying to reach unique IP addresses that may no longer exist because of a server failure (even though the server may have a failover replica already up).

It does a similar job in the case of what they call in the VMware world "long distance vMotion". Virtualisation technology has moved to the point where you can move a server from one location to another while it is still servicing connections. Tricky to do within a single datacentre, very difficult if the datacentres are on opposite sides of the country. Basically, you need a huge amount of bandwidth and extremely low latency between the two sides.

The requirement to do long distance vMotion comes up more and more nowadays - possibly there's a catastrophic datacentre problem, or, more likely, it's just that you're wasting money carrying traffic across the country, and want to move a server closer to the clients.

This is where an application controller on the network is invaluable - it can manage the connections within the virtual server, off-loading the system while the server is handling the VM transfer. It also ensures that data is moving on the right paths - if connections jump from one firewall to another, for example, sessions will die. The application controller ensures that traffic from clients to the VM, and from the VM to the different storage pools, gets where it needs to be - but without adding unacceptable latency.

Finally, while VMware is an amazing technology, both the server hardware and VMware licences have capital expenditure implications. By offloading encryption, compression and application acceleration duties to the hardware-based F5 application controller, more can be done with fewer hardware and software resources, and with lower running costs.

The bottom line is to use hardware-based traffic and session management to enhance the user experience, preventing a live session from crashing, or ensuring that a session can be initiated without making the user wait. If you can do this automatically, so that no-one has to sit watching your virtual servers like a hawk to move resources around (or manually restore connections), you win.

By Martin Walshaw, senior systems engineer at F5