Virtualizing Citrix XenApp on vSphere, part I

October 26, 2009

Today many organizations that use XenApp are still bound to x86 platforms because of legacy applications that don’t run on an x64 platform. Sometimes an application does run on x64 but is not supported there by the vendor. By sticking to the x86 platform, modern server hardware can’t be fully utilized because of the 4 GB memory limit of the x86 platform.

Virtualization offers a way around this limitation. Virtualizing a XenApp server has always been a challenge. However, with the maturity of vSphere and current CPUs, there is no longer a reason not to virtualize a XenApp server. In this article I will share my experience of implementing a virtualized XenApp production environment for a customer and give you my recommendations for successfully virtualizing XenApp.

This article is split into three parts. In the first part I focus on the configuration of the XenApp VM. In the second part I will look at the application landscape and the underlying ESX host, and in the last part I will look at the performance results.

Building a XenApp VM

The configuration of a XenApp VM is crucial to the performance of a virtualized XenApp server and is directly linked to the hardware configuration of the underlying ESX host. As with every VM, there are four main performance aspects that need to be optimized for the workload: CPU, memory, disk I/O and network I/O.

CPU
The most important resource for the XenApp VM is CPU. XenApp VMs tend to use a lot of CPU resources, and this is most likely to be the first bottleneck. When creating your XenApp VMs, there are two scenarios: scale-out or scale-up. In the scale-out scenario, many light XenApp VMs are created with one vCPU each. In the scale-up scenario, fewer VMs are created with two vCPUs each. The main objective is to not overcommit your physical CPUs. Let’s say you have an ESX host with two quad-core CPUs, for a total of 8 cores. If you create eight 1 vCPU VMs, each VM can be scheduled on a dedicated core. The same applies to 2 vCPU VMs: if you create four 2 vCPU VMs, each VM can be scheduled on a dedicated pair of cores.
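The sizing arithmetic in this example can be written down as a small Python sketch. The function name is my own, and the calculation deliberately ignores any CPU time the hypervisor itself needs, so treat the result as an upper bound rather than a sizing rule.

```python
def max_vms_without_overcommit(physical_cores: int, vcpus_per_vm: int) -> int:
    """Largest number of VMs whose vCPUs together do not exceed the physical cores."""
    if physical_cores < 1 or vcpus_per_vm < 1:
        raise ValueError("core and vCPU counts must be positive")
    return physical_cores // vcpus_per_vm


cores = 2 * 4  # two quad-core CPUs, as in the example above

for vcpus in (1, 2):
    count = max_vms_without_overcommit(cores, vcpus)
    print(f"{vcpus} vCPU per VM: up to {count} VMs")
```

For the host in the example this gives eight 1 vCPU VMs or four 2 vCPU VMs.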

Depending on the workload on your XenApp VM, one of these scenarios fits best. If you have light workloads, the scale-out scenario might be best, but in most situations the scale-up scenario does the best job. In most circumstances, using 2 vCPUs allows for more users and a more consistent user experience. With the improvements made to the CPU scheduler in vSphere, such as further relaxed co-scheduling, SMP XenApp VMs are no longer a problem. If you are using a host with Intel Nehalem CPUs, enable CPU and memory hardware assistance (more information on this in part II).

Disk I/O
The second important resource is disk I/O. This will be explained further in the next part of this article, but for now I recommend using two virtual disks for a XenApp VM: one for the operating system and one for the applications. For optimal disk I/O performance, make sure you align the file system in the guest OS.
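To make the alignment advice concrete, here is a minimal Python sketch that checks whether a partition's starting sector falls on a given boundary. The 64 KB boundary and 512-byte sector size are assumptions for illustration; the right boundary depends on your storage array. The 63-sector start used below is the typical default offset of a partition created by Windows Server 2003.

```python
SECTOR_BYTES = 512  # assumed sector size


def is_aligned(start_sector: int, boundary_bytes: int = 64 * 1024) -> bool:
    """Return True if the partition's starting byte offset is a multiple of the boundary."""
    return (start_sector * SECTOR_BYTES) % boundary_bytes == 0


print(is_aligned(63))   # 63 * 512 = 32256 bytes, not a multiple of 65536
print(is_aligned(128))  # 128 * 512 = 65536 bytes, exactly one boundary
```

An unaligned partition means a single guest I/O can straddle two blocks on the array, which is the extra work alignment avoids.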

Memory
The next resource for the XenApp VM is memory. With memory there is one simple rule: don’t overcommit memory for your XenApp VMs. Depending on the workload of the XenApp server, configure the XenApp VM with the corresponding amount of memory. In most situations this will be 4096 MB of RAM (assuming you are using a 32-bit OS). Make sure you also set a memory reservation of the same size. This way the XenApp VM always has all of its configured RAM available, and the VMware balloon driver cannot degrade the performance of the XenApp VM.
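As a rough check on the "don't overcommit" rule, the sketch below adds up the full reservations of all XenApp VMs on a host and compares the total with the host's physical memory. The function name and the `host_overhead_mb` parameter are hypothetical; the memory the hypervisor needs for itself varies per host, so the values in the example are placeholders.

```python
from typing import Iterable


def memory_fits(host_mb: int, vm_reservations_mb: Iterable[int],
                host_overhead_mb: int = 0) -> bool:
    """Return True if all VM reservations plus host overhead fit in physical RAM."""
    return sum(vm_reservations_mb) + host_overhead_mb <= host_mb


four_vms = [4096] * 4  # four XenApp VMs, each fully reserved at 4096 MB

print(memory_fits(32768, four_vms))                         # 16384 MB needed on a 32 GB host
print(memory_fits(16384, four_vms, host_overhead_mb=1024))  # no room left for the host itself
```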

Network I/O
The last resource for the XenApp VM is network. I haven’t seen any XenApp VM implementation where network I/O was the bottleneck, but for best results use the new VMXNET 3 vNIC. VMXNET 3 has less impact on the CPU, which is always useful.

Other considerations
I recommend using Windows Server 2003 R2 x86 for building the XenApp VM. Windows Server 2008 uses a lot more resources. This will probably be a lot better with Windows Server 2008 R2, but at the time of writing XenApp is not certified for use with Windows Server 2008 R2. Furthermore, I recommend removing the CD-ROM drive and the floppy drive. The floppy drive can be completely disabled in the BIOS of the VM. Always install the latest VMware Tools to get the optimized drivers for the virtual SCSI and network devices and to install the balloon driver.

So let’s summarize the preferred XenApp VM configuration:

2 vCPU
4096 MB of RAM with 4096 MB reserved
1 LSI Logic Parallel SCSI controller
2 virtual disks, one for the OS and one for applications
CD-ROM drive removed
Floppy drive removed
CPU and memory hardware assistance enabled if using Intel Nehalem processors

Again, depending on your environment, another configuration could be more desirable. A consistent server configuration is very important in a XenApp farm, so I recommend building a dedicated template for deploying XenApp VMs.
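To illustrate why a template helps with consistency, here is a minimal Python sketch that records the summarized configuration as data and reports where a given VM deviates from it. The `VmSpec` class, its field names and the `deviations` helper are all hypothetical and not part of any VMware tooling; the values come from the summary above.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class VmSpec:
    vcpus: int
    memory_mb: int
    memory_reservation_mb: int
    virtual_disks: int
    has_cdrom: bool
    has_floppy: bool


# The configuration summarized in this article.
TEMPLATE = VmSpec(vcpus=2, memory_mb=4096, memory_reservation_mb=4096,
                  virtual_disks=2, has_cdrom=False, has_floppy=False)


def deviations(spec: VmSpec, template: VmSpec = TEMPLATE) -> List[str]:
    """List every setting where the VM differs from the template."""
    return [
        f"{f.name}: {getattr(spec, f.name)!r} (template: {getattr(template, f.name)!r})"
        for f in fields(spec)
        if getattr(spec, f.name) != getattr(template, f.name)
    ]


drifted = VmSpec(vcpus=1, memory_mb=4096, memory_reservation_mb=2048,
                 virtual_disks=2, has_cdrom=False, has_floppy=False)

for line in deviations(drifted):
    print(line)
```

A VM deployed from the template produces an empty list; the drifted VM above is reported for its vCPU count and its memory reservation.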

In the next part of this article I will look at the application landscape and the underlying ESX host for building your virtualized XenApp farm.

Continue to part II.

VMware vCenter CapacityIQ has been released

October 21, 2009

VMware has released vCenter CapacityIQ. vCenter CapacityIQ is a vCenter plugin for proactively monitoring the capacity of your VMware Infrastructure. It monitors CPU, memory and disk I/O usage and predicts future demand for resources.

vCenter CapacityIQ also lets you see the impact on capacity, and the resulting forecasted capacity, of changes you expect to happen in the environment.

From the VMware site:

‘VMware vCenter CapacityIQ is a value-add component of the vCenter family of management solutions, providing capacity management capabilities for virtualized datacenter or desktop environments. CapacityIQ integrates seamlessly with VMware vCenter Server, ensuring that your virtualized infrastructure capacity is always predictable and efficiently used.

  • Eliminate waste by identifying any unused or over-allocated capacity
  • Reduce operational overhead by automating routine capacity monitoring and management tasks
  • Minimize business risk of outages or failures resulting from capacity bottlenecks and shortfalls’

More information can be found on:

vSphere, HP EVA and path policies

October 11, 2009

The HP EVA comes in two types of storage arrays: an active/passive array and an active/active array. The HP EVA active/active storage arrays can handle I/O requests on both storage processors. With ESX 3.5 and an active/active EVA array you had two options for the path selection policy: Fixed and MRU (Most Recently Used). With an active/active EVA array most people used the Fixed path policy to manually load balance the I/O over both storage processors. This process is very labor intensive (it has to be done on all ESX hosts) and is prone to errors (e.g. a fixed path configured to the wrong storage processor, or path thrashing).

With the release of vSphere and the vStorage APIs for Multipathing, the round robin path selection policy is now fully supported. The HP EVA active/active storage arrays support ALUA, which stands for Asymmetric Logical Unit Access. ALUA applies when the access characteristics of one port may differ from those of another port. ALUA allows a LUN to be accessed via its primary path (via the owning storage processor) and via an asymmetrical path (via the non-owning storage processor). I/O to the non-owning storage processor, the non-optimized path, comes with a performance penalty because the I/O has to be forwarded over the internal connection between the storage processors, which has limited bandwidth.

With ALUA the ESX host is aware of the non-optimized paths. So when you use the round robin path policy, ESX will load balance the I/O over the optimized paths (the paths to the owning storage processor of the LUN) and use the non-optimized paths only if the optimized paths fail. In my opinion this has two advantages: both paths are used, and manual load balancing isn’t necessary anymore, which saves a lot of configuration work.
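The behaviour described here can be modelled in a few lines of Python. This is a simplified illustration, not the actual ESX multipathing code: the `Path` class and the path names are made up, and the real policy switches paths after a configurable amount of I/O rather than on every request.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List


@dataclass(frozen=True)
class Path:
    name: str
    optimized: bool      # True if the path leads to the LUN's owning storage processor
    alive: bool = True


def usable_paths(paths: List[Path]) -> List[Path]:
    """Prefer live optimized paths; fall back to live non-optimized paths."""
    alive = [p for p in paths if p.alive]
    optimized = [p for p in alive if p.optimized]
    return optimized or alive


def round_robin(paths: List[Path], io_count: int) -> List[str]:
    """Return the path chosen for each of io_count consecutive I/O requests."""
    candidates = usable_paths(paths)
    if not candidates:
        raise RuntimeError("no live path to the LUN")
    rotation = cycle(candidates)
    return [next(rotation).name for _ in range(io_count)]


healthy = [Path("A1", True), Path("A2", True), Path("B1", False), Path("B2", False)]
print(round_robin(healthy, 4))  # only the optimized paths A1 and A2 are used

failed = [Path("A1", True, alive=False), Path("A2", True, alive=False),
          Path("B1", False), Path("B2", False)]
print(round_robin(failed, 3))   # falls back to the non-optimized paths B1 and B2
```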

So there you have it! I recommend using the round robin path policy with an active/active EVA array. If you don’t want to use the round robin path policy, I would recommend the MRU policy. The MRU policy is ALUA aware and will also use the optimized paths.