Friday, May 9, 2014

Virtualize with Confidence - Use VMware: Storage DRS and DrmDisk

Storage DRS and DrmDisk

Storage DRS leverages a special kind of disk construct that gives it more granular control over initial placement and migration recommendations. The same level of detail also plays a major role in I/O load balancing.

Let us read about it in detail:

DrmDisk


vSphere Storage DRS uses the DrmDisk construct as the smallest entity it can migrate. A DrmDisk represents a consumer of datastore resources. This means that vSphere Storage DRS creates a DrmDisk for each VMDK file belonging to the virtual machine. A soft DrmDisk is created for the working directory containing the configuration files such as the .VMX file and the swap file.

  • A separate DrmDisk for each VMDK file
  • A soft DrmDisk for system files (VMX, swap, logs, and so on)
  • If a snapshot is created, both the VMDK file and the snapshot are contained in a single DrmDisk.



VMDK Anti-affinity Rule

When the datastore cluster or the virtual machine is configured with a VMDK-level anti-affinity rule, vSphere Storage DRS must keep the DrmDisks containing the virtual machine's disk files on separate datastores.

Impact of VMDK Anti-Affinity Rule on Initial Placement

Initial placement benefits immensely from this increased granularity. Instead of searching for a suitable datastore that can fit the virtual machine as a whole, vSphere Storage DRS can seek an appropriate datastore for each DrmDisk separately. Due to the increased granularity, datastore cluster fragmentation (described in the "Initial Placement" section) is less likely to occur; if prerequisite migrations are required, far fewer are expected. A small placement sketch follows.
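To make the per-DrmDisk behavior concrete, here is a small, purely illustrative Python sketch of initial placement with a VMDK anti-affinity rule. The class names and the greedy most-free-space heuristic are my own simplification; the real Storage DRS algorithm weighs space and I/O metrics together.

```python
from dataclasses import dataclass

@dataclass
class Datastore:
    name: str
    free_gb: float

@dataclass
class DrmDisk:
    name: str          # e.g. "vm1.vmdk", or the soft DrmDisk for VMX/swap/log files
    size_gb: float

def place_drmdisks(drmdisks, datastores, vmdk_anti_affinity=True):
    """Greedy initial placement: each DrmDisk goes to the datastore with the
    most free space that can hold it; with the anti-affinity rule enabled,
    no two DrmDisks of the same VM may share a datastore."""
    placement, used = {}, set()
    for disk in sorted(drmdisks, key=lambda d: d.size_gb, reverse=True):
        candidates = [ds for ds in datastores
                      if ds.free_gb >= disk.size_gb
                      and (not vmdk_anti_affinity or ds.name not in used)]
        if not candidates:
            raise RuntimeError(f"No datastore can host {disk.name}")
        target = max(candidates, key=lambda ds: ds.free_gb)
        target.free_gb -= disk.size_gb
        used.add(target.name)
        placement[disk.name] = target.name
    return placement

# A VM as Storage DRS sees it: one DrmDisk per VMDK plus a soft DrmDisk.
vm_disks = [DrmDisk("vm1.vmdk", 60), DrmDisk("vm1_1.vmdk", 200),
            DrmDisk("vm1 soft DrmDisk (vmx/swap/logs)", 8)]
cluster = [Datastore("ds01", 150), Datastore("ds02", 300), Datastore("ds03", 90)]
print(place_drmdisks(vm_disks, cluster))
```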

Impact of VMDK Anti-Affinity Rule on Load Balancing

Similar to initial placement, I/O load balancing also benefits from the deeper level of detail. vSphere Storage DRS can find a better fit for each workload generated by each VMDK file. vSphere Storage DRS analyzes the workload and generates a workload model for each DrmDisk. It then determines in which datastore it must place the DrmDisk to keep the load balanced within the datastore cluster while offering sufficient performance for each DrmDisk. This becomes considerably more difficult when vSphere Storage DRS must keep all the VMDK files together. Usually in that scenario, the datastore chosen is the one that provides the best performance for the most demanding workload and is able to store all the VMDK files and system files.

By enabling vSphere Storage DRS to load-balance on a more granular level, each DrmDisk of a virtual machine is placed to suit its I/O latency needs as well as its space requirements.

Virtual Machine–to–Virtual Machine Anti-Affinity Rule

An inter–virtual machine (virtual machine–to–virtual machine) anti-affinity rule forces vSphere Storage DRS to keep the virtual machines on separate datastores. This rule effectively extends the availability requirements from hosts to datastores.

For example, in vSphere DRS an anti-affinity rule is created to force two virtual machines, such as Microsoft Active Directory servers, to run on separate hosts; in vSphere Storage DRS, a virtual machine–to–virtual machine anti-affinity rule guarantees that the Active Directory virtual machines are not stored on the same datastore. Note that both virtual machines participating in a virtual machine–to–virtual machine anti-affinity rule must be configured with an intra–virtual machine VMDK affinity rule.

Thoughts ?

vSphere Storage DRS is one cool piece of code and keeps improving how we use traditional storage systems for the better.

It is always good to dig into and understand VMware vSphere features in detail. Until next time :)

Saturday, May 3, 2014

VMware Horizon View Design Best Practices

This blog post talks about designing a basic VMware View deployment that will cover up to 500 desktops. 

For an enterprise setup, I plan to do a more elaborate extension to this post, where I will talk about areas such as Storage, Networking, Load Balancing, POD architecture, and best practices around them.




Here are the basic components and the strategy to take around each of them to avoid a single point of failure.

2 vSphere Clusters 
It's not wise to run one vSphere cluster for both your server infrastructure and your desktop infrastructure. Don't be cheap; separate these clusters.

2 View Connection Servers
The View Connection Server is the brokering server; it establishes the connection to the View Agent. Redundancy is built into the product: all you have to do is install a second "replica server". Also, keep in mind that if you want to allow external PCoIP connections you will need four View Connection Servers. Two of these servers will be used for internal redundancy, and two will be used for external connections.

2 View Transfer and Security Servers
Along the same lines as the Connection Servers, based on their functionality: deploy these in pairs as well.

2 vCenter Servers
You'll need to use vCenter Heartbeat, as it's the only way I know of to make vCenter Server redundant. It is a little expensive, and VMware doesn't say that you need to make the vCenter Server redundant; however, the vCenter service being down is a catastrophic event. For the most part you don't need vCenter, except when you need to boot a linked-clone VM.

2 SQL Servers
Use SQL Server 2008 with Microsoft Failover Clustering and make this layer redundant. The vCenter database is on it, the Events database is on it, and the View Composer database is on it. Basically, every component of a View setup has a database, so it's advisable to have these databases on a redundant back end.

SplitRXMode in VMware vSphere 5

My previous post was on how multicasting is handled in the VMware vSwitch context. You can read about it here.

Now is a good time to mention an advanced setting related to it: SplitRxMode.

While this worked fine for some multicast applications, it still wasn't sufficient for the more demanding multicast applications and hence stalled their virtualization.

The reason is that the VMs' packet replication was processed in a single shared context, which ultimately became a constraint: with a high VM-to-ESXi-host ratio, the consequently high packet rate often caused large packet losses and bottlenecks. VMware vSphere 5 provides the new splitRxMode to not only compensate for this problem but also enable the virtualization of demanding multicast applications.

SplitRx mode is an ESXi feature that uses multiple physical CPUs to process network packets received in a single network queue. This feature provides a scalable and efficient platform for multicast receivers. SplitRx mode typically improves throughput and CPU efficiency for multicast traffic workloads.

VMware recommends enabling splitRx Mode in situations where multiple virtual machines share a single physical NIC and receive a lot of multicast or broadcast packets.

NOTE: 

  • SplitRx mode is supported only on vmxnet3 network adapters.
  • This feature is disabled by default.
  • SplitRx mode is individually configured for each virtual NIC.



To enable SplitRx mode, do the following:


This feature, which is supported only for VMXNET3 virtual network adapters, is individually configured for each virtual NIC using the ethernetX.emuRxMode variable in each virtual machine’s .vmx file (where X is replaced with the network adapter’s ID).

The possible values for this variable are:

ethernetX.emuRxMode = "0"

The above value disables splitRx mode for ethernetX.

ethernetX.emuRxMode = "1"

The above value enables splitRx mode for ethernetX.

To change this variable through the vSphere Client:


  1. Select the virtual machine you wish to change, then click Edit virtual machine settings.
  2. Under the Options tab, select General, then click Configuration Parameters.
  3. Look for ethernetX.emuRxMode (where X is the number of the desired NIC). If the variable isn’t present, click Add Row and enter it as a new variable.
  4. Click on the value to be changed and configure it as you wish.



The change will not take effect until the virtual machine has been restarted.
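If you prefer to script the change instead of clicking through the vSphere Client, a rough pyVmomi sketch such as the one below sets the same ethernetX.emuRxMode option through the vSphere API. The vCenter address, credentials, and VM name are placeholders for your own environment, and error handling is omitted; treat it as an outline, not production code.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholder connection details, replace with your own environment (lab use only).
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password",
                  sslContext=ctx)
content = si.RetrieveContent()

# Find the VM by name with a simple container view walk.
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "my-multicast-vm")
view.DestroyView()

# Set ethernet0.emuRxMode = "1" to enable splitRx mode on the first vmxnet3 vNIC.
spec = vim.vm.ConfigSpec(
    extraConfig=[vim.option.OptionValue(key="ethernet0.emuRxMode", value="1")])
vm.ReconfigVM_Task(spec=spec)
# Wait for the task to finish, then power-cycle the VM for the change to take effect.

Disconnect(si)
```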


References and Credits: Chris Hendryx (it.toolbox.com).

Multicast and VMware

This post deals with Multicasting in VMware.

Let's see what multicast is and how it is deployed in VMware.

What is Multicasting ?


Multicasting is an alternate way of content delivery, where an IP packet is sent to multiple destinations identified by a multicast IP address. It is used for content delivery by stock exchanges, video conferencing, and similar applications that must reach multiple destinations at once.
Multicast sends only one copy of the information along the network; any duplication happens at a point close to the recipients, consequently minimizing network bandwidth requirements.

For multicast, the Internet Group Management Protocol (IGMP) is used to establish and coordinate membership of the multicast group. This allows single copies of the information to be sent from the multicast sources over the network, with the network itself taking responsibility for replicating and forwarding the information to multiple recipients.
IGMP operates between the client and a local multicast router; layer 2 switches with IGMP snooping listen in on these IGMP transactions to learn which ports need multicast traffic. Between the local and remote multicast routers, multicast routing protocols such as PIM are used to direct the traffic from the multicast server to the many multicast clients.
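To see the receiver side of IGMP in practice, the short Python sketch below joins a multicast group from inside a guest; the IP_ADD_MEMBERSHIP socket option is what causes the guest OS to emit the IGMP membership report described above. The group address and port are arbitrary examples.

```python
import socket
import struct

GROUP = "239.1.1.1"   # example administratively scoped multicast group
PORT = 5000           # example port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# Joining the group makes the guest OS send an IGMP membership report;
# multicast routers (and IGMP-snooping switches) learn about the receiver from it.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# Receive loop: runs until interrupted.
while True:
    data, sender = sock.recvfrom(65535)
    print(f"received {len(data)} bytes from {sender}")
```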

Multicasting and VMware

In the context of VMware and virtual switches there’s NO need for the vSwitches to perform IGMP snooping in order to recognize which VMs have IP multicast enabled. This is due to the ESX server having authoritative knowledge of the vNICs, so whenever a VM’s vNIC is configured for multicast the vSwitch automatically learns the multicast Ethernet group addresses associated with the VM. With the VMs using IGMP to join and leave multicast groups, the multicast routers send periodic membership queries while the ESX server allows these to pass through to the VMs. The VMs that have multicast subscriptions will in turn respond to the multicast router with their subscribed groups via IGMP membership reports.

 NOTE: IGMP snooping in this case is done by the usual physical Layer 2 switches in the network so that they can learn which interfaces require forwarding of multicast group traffic. 

So when the vSwitch receives multicast traffic, it forwards copies of the traffic to the subscribed VMs in a similar way to Unicast i.e. based on destination MAC addresses. With the responsibility of tracking which vNIC is associated with which multicast group lying with the vSwitch, packets are only delivered to the relevant VMs.
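Conceptually, the vSwitch keeps nothing more exotic than a table mapping multicast group MAC addresses to the subscribed vNIC ports and fans frames out accordingly. The toy Python model below (the class and method names are made up, not any VMware interface) illustrates that bookkeeping.

```python
from collections import defaultdict

class ToyVSwitch:
    """Toy model: track which vNIC ports subscribed to which multicast MAC."""
    def __init__(self):
        self.mcast_table = defaultdict(set)   # group MAC -> set of vNIC ports

    def join(self, port, group_mac):          # learned when the VM's vNIC subscribes
        self.mcast_table[group_mac].add(port)

    def leave(self, port, group_mac):
        self.mcast_table[group_mac].discard(port)

    def deliver(self, dst_mac, payload):
        # Copies go only to subscribed ports, much like unicast forwarding by MAC.
        for port in self.mcast_table.get(dst_mac, ()):
            print(f"forwarding {len(payload)} bytes to {port}")

vswitch = ToyVSwitch()
vswitch.join("vm1.vnic0", "01:00:5e:01:01:01")
vswitch.join("vm2.vnic0", "01:00:5e:01:01:01")
vswitch.deliver("01:00:5e:01:01:01", b"market data tick")
```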



References and Credits: Chris Hendryx (it.toolbox.com).

Thursday, May 1, 2014

SIOC - VMware vSphere Storage I/O Control - Enhancement

Today I plan to write about the enhancements to Storage I/O Control in the vSphere 5.1 platform. Those of you who need a heads-up can read about SIOC in my previous blog post here.

The following are the new enhancements to Storage I/O Control in vSphere 5.1.

1. Stats Only Mode

SIOC is now turned on in stats-only mode automatically. It doesn't enforce throttling but gathers statistics to assist Storage DRS. Storage DRS now has statistics in advance for new datastores being added to the datastore cluster and can get up to speed on each datastore's profile and capabilities much quicker than before.

2. Automatic Threshold Computation

The default latency threshold for SIOC is 30 ms. Not all storage devices are created equal, so this default is set to a middle-of-the-road value. There are certain devices which will hit their natural contention point earlier than others, e.g. SSDs, in which case the threshold should be lowered by the user. However, manually determining the correct latency threshold can be difficult, which motivates having the threshold determined automatically at the correct level for each device.

To figure out the best threshold, the new automatic threshold detection uses the I/O injector modelling functionality of SIOC to determine what the peak throughput of a datastore is.

When peak throughput is measured, latency is also measured.

The latency threshold value at which Storage I/O Control will kick in is then set to 90% of this peak value (by default).

vSphere administrators can change this 90% to another percentage value or they can still input a millisecond value if they so wish.
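My reading of the 90% rule is that the threshold becomes the latency observed at the point where the datastore delivers roughly 90% of its peak throughput. Assuming that interpretation, the small Python sketch below derives a threshold from hypothetical injector samples; it is only an illustration of the idea, not VMware's actual implementation.

```python
def auto_latency_threshold(samples, peak_fraction=0.90):
    """samples: list of (throughput_iops, latency_ms) points collected by an
    injector-style probe at increasing load. Returns the latency observed at
    the point where throughput first reaches peak_fraction of the peak."""
    peak_iops = max(iops for iops, _ in samples)
    target = peak_fraction * peak_iops
    # Walk the samples in order of increasing load and return the latency of
    # the first point at or beyond the target throughput.
    for iops, latency_ms in sorted(samples, key=lambda s: s[0]):
        if iops >= target:
            return latency_ms
    return samples[-1][1]

# Hypothetical injector measurements for one datastore: (IOPS, latency in ms).
probe = [(2000, 2.1), (6000, 3.4), (9000, 5.8), (10500, 9.7), (11000, 22.5)]
print(auto_latency_threshold(probe))   # -> 9.7 ms with this made-up data
```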




3. VmObservedLatency

VmObservedLatency is a new metric. It replaces the datastore latency metric which was used in previous versions of SIOC. This new metric measures the time between VMkernel receiving the I/O from the VM, and the response coming back from the datastore. Previously we only measured the latency once the I/O had left the ESXi host, so now we are also measuring latency in the VMkernel as well. This new metric will be visible in the vSphere UI Performance Charts.



PS: All the disk operations on a VM, like backup, Storage vMotion, etc., are billed to SIOC.

SIOC - VMware vSphere Storage I/O Control

What is SIOC ?

A cool feature in VMware vSphere. Well, all the features in the VMware vSphere platform are way cooler than they look. Trust me, other virtualization vendors are nowhere even close when it comes to useful features.

Enough blabbing.. back to SIOC or Storage I/O control.

"SIOC or Storage I/O control is a mechanism to prioritize I/O for virtual machines running on group of vSphere hosts and sharing a common pool of storage."

Storage problems are most easily identified by high device latency. When storage takes a long time to service I/Os (over 20 ms by my definition), application owners will soon start complaining. The goal of SIOC is to identify this trend at the VMFS volume level and take corrective action to protect high-priority virtual machines.

Let's see how it works, along with some prerequisites.
  • SIOC throttles (decreases, in most cases) a VM's throughput by limiting its access to the host device queue.
  • SIOC is enabled per datastore.
  • SIOC only applies disk shares when a certain threshold (device latency, 30 ms by default but configurable) has been reached; a small sketch of this throttling logic follows this list.
  • SIOC modifies the array queue depth on each host (based on the average DAVG per host on a datastore). Please refer to Frank Denneman's excellent post.
  • SIOC will enforce limits in terms of IOPS when specified at the VM level.
  • SIOC requires an Enterprise Plus license.
  • SIOC also supports NFS datastores in vSphere 5.0 and above.
  • SIOC is NOT the same as traditional VM disk shares but a level up.
  • SIOC is NOT compute-cluster based.
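As promised above, here is a minimal sketch of the shares-based throttling idea: once datastore-wide latency crosses the threshold, each host's device queue depth is sized in proportion to the disk shares of its VMs. The numbers and the function are made up for illustration; the real mechanism, described in Frank Denneman's post, is an adaptive control loop rather than a one-shot calculation.

```python
def throttle_queue_depths(hosts, datastore_latency_ms, threshold_ms=30,
                          total_queue_slots=128, min_depth=4):
    """hosts: dict of host name -> sum of disk shares of its VMs on the datastore.
    Below the latency threshold nothing is throttled; above it, the available
    device queue slots are divided proportionally to shares."""
    if datastore_latency_ms <= threshold_ms:
        return {h: total_queue_slots for h in hosts}   # no throttling
    total_shares = sum(hosts.values())
    return {h: max(min_depth, int(total_queue_slots * s / total_shares))
            for h, s in hosts.items()}

hosts = {"esx01": 2000, "esx02": 1000, "esx03": 1000}   # shares per host (made up)
print(throttle_queue_depths(hosts, datastore_latency_ms=42))
# -> {'esx01': 64, 'esx02': 32, 'esx03': 32}
```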



You can see the working of SIOC in a YouTube video: SIOC in action.

Thoughts

Storage I/O Control (SIOC) is an Enterprise Plus feature that is used to control the I/O usage of a virtual machine and to gradually enforce the predefined I/O share levels according to business needs. Even if equal shares are desired, fairness cannot be guaranteed between VMs on different hosts without SIOC.

It is supported on Fibre Channel, iSCSI, and NFS storage, and can automatically set the best latency threshold to achieve maximum throughput. This enables you to make the most of your shared storage and helps you manage your virtual environment more easily.

This feature is a must-have for organizations that want to achieve higher consolidation ratios or host VMs for multiple tenants (like public clouds), as it helps reduce the effect of a noisy neighbor trying to hog storage resources on the rest of the well-behaved VMs.


I plan to write another blog on enhanced features in SIOC in vSphere 5.1 and implications on NFS datastores. 

Until next time :)


Friday, April 25, 2014

vMotion network design considerations - Multi-NIC vMotion and Link Aggregation in vSphere 5.5 - Part2


In my previous post, I tried to find out how support for LACP in vSphere 5.5 has changed the design considerations for the vMotion network. One has to decide between Multi-NIC vMotion and Link Aggregation.

After some searching on the Internet and talking to my friends in the virtualization domain, I was referred to an excellent post from Chris Wahl (VCDX #104) on his blog wahlnetwork.com.
Not only does this post cover the topic in depth, but the comments posted there also helped me understand the concept and its pros and cons even better.

Multi-NIC vMotion, when used with LBT (Load Based Teaming), is still a clear winner compared to Link Aggregation for vMotion networks.

A few reasons for this are summarized here:

  • A LAG can only perform traffic distribution; it keeps sending a flow to the same uplink even if that NIC is saturated (see the sketch after this list).
  • LBT, on the other hand, actively examines traffic on the vSwitch. It is aware of the load on each NIC, which helps it avoid sending traffic to a saturated NIC.
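To illustrate the first point, the sketch below contrasts a hash-based uplink choice with a load-aware one. The XOR-of-addresses hash is a simplification (the exact hash a LAG or the vDS uses differs), but it shows the key property: the same flow always lands on the same uplink, regardless of how busy that uplink is.

```python
import ipaddress

def ip_hash_uplink(src_ip, dst_ip, num_uplinks):
    """Simplified IP-hash style selection: purely a function of the headers,
    so a flow is pinned to one uplink regardless of current load."""
    s = int(ipaddress.ip_address(src_ip))
    d = int(ipaddress.ip_address(dst_ip))
    return (s ^ d) % num_uplinks

def lbt_uplink(uplink_load_pct):
    """LBT-style selection: pick the least-loaded uplink (the real feature
    only moves flows when an uplink stays above roughly 75% utilization)."""
    return min(range(len(uplink_load_pct)), key=lambda i: uplink_load_pct[i])

# The same vMotion flow always hashes to the same uplink ...
flow_uplink = ip_hash_uplink("10.0.0.11", "10.0.0.12", 2)
print(flow_uplink)             # -> 1, independent of load
# ... even when uplink 1 is already at 95% utilization; LBT would pick uplink 0.
print(lbt_uplink([20, 95]))    # -> 0 (the least-loaded uplink)
```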

Thoughts ?


So LBT is much better at load balancing traffic in this case.

That doesn't mean that LACP is bad and should not be used in vSphere designs; it simply has different use cases that suit its functionality, for example NFS-based storage.
I will post the pros and cons and the use cases for link aggregation in a different post.

Till then keep networking. Cheers !


vMotion network design considerations - Multi-NIC vMotion and Link Aggregation in vSphere 5.5 - Part1

vSphere 5.5 has now been out for some time. As always with a new release, there were some major enhancements (good for us), especially on the networking and vDS side.

I was impressed by Multi-NIC vMotion support when it was introduced in vSphere 5.0. With the advent of 10G NICs and the ever-increasing size of workloads that run in a single VM, it was important to leverage this technology to revamp existing vMotion.
And I must say, VMware is always two steps ahead in innovation and improving its product features.

Today I am looking into the design considerations for the vMotion network when using Multi-NIC vMotion and the LACP support in vDS in vSphere 5.5.

LACP support up to vSphere 5.1 was rather limited and exclusively dependent on IP hash load balancing. With vSphere 5.5 a lot has changed in the vDS, especially around LACP, with support for all LACP load balancing types.

LACP now supports dynamic link aggregation and multiple LAGs, and is no longer exclusively dependent on IP hash load balancing.

Up to vSphere 5.1, Multi-NIC vMotion was the correct choice compared to LAG-backed NICs (which in turn rely on LACP), because LACP support was rather limited and exclusively dependent on IP hash load balancing.

All of you who know how IP hash load balancing works and how vMotion selects its preferred NIC can relate to this observation.

The rest of you can dig a little and it will become very clear. Please refer to the excellent article by Frank Denneman (here).

I am still contemplating how this changes the LACP vs. Multi-NIC vMotion debate. I am trying to discuss this with my friends at VMware; let's see what their thoughts are on this.

Keep watching this space to see how this has changed with the new vSphere 5.5.

  

Back to VMware basics - vMotion Deepdive

Ever thought about how vMotion works?
I plan to write a detailed post, divided into various sub-posts, to help you understand how the process works and what happens in the background.

Architecture

vSphere 5.5 vMotion transfers the entire execution state of a running virtual machine from the source VMware vSphere ESXi host to the destination ESXi host over a high speed network. The execution state primarily consists of the following components:

  • The virtual machine’s virtual disks
  • The virtual machine’s physical memory
  • The virtual device state, including the state of the CPU, network and disk adapters, SVGA , and so on
  •  External network connections

Let's see how vSphere 5.5 vMotion handles the challenges associated with the transfer of these different states of a virtual machine.

Migration of Virtual Machine’s Storage

vSphere 5.5 vMotion builds on Storage vMotion technology for the transfer of the virtual machine's virtual disks. We need to understand the Storage vMotion architecture briefly to provide the necessary context.

Storage vMotion uses a synchronous mirroring approach to migrate a virtual disk from one datastore to another datastore on the same physical host. This is implemented by using two concurrent processes. First, a bulk copy (also known as a clone) process proceeds linearly across the virtual disk in a single pass and performs a bulk copy of the disk contents from the source datastore to the destination datastore.

Concurrently, an I/O mirroring process transports any additional changes that occur to the virtual disk, because of the guest’s ongoing modifications. The I/O mirroring process accomplishes that by mirroring the ongoing modifications to the virtual disk on both the source and the destination datastores. Storage vMotion mirrors I/O only to the disk region that has already been copied by the bulk copy process. Guest writes to a disk region that the bulk copy process has not yet copied are not mirrored because changes to this disk region will be copied by the bulk copy process eventually.
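A toy model may help picture the two concurrent processes. In the Python sketch below the virtual disk is just an array of blocks: the bulk copy advances a "copied up to here" boundary, and guest writes are mirrored to the destination only when they fall below that boundary. This is an illustration of the behavior described above, not actual hypervisor code.

```python
class ToyStorageVMotion:
    def __init__(self, num_blocks):
        self.src = [0] * num_blocks        # source virtual disk
        self.dst = [None] * num_blocks     # destination virtual disk
        self.copied_up_to = 0              # bulk-copy boundary (exclusive)

    def bulk_copy_step(self, blocks=1):
        """Single-pass clone process: copy the next chunk of blocks."""
        end = min(self.copied_up_to + blocks, len(self.src))
        for i in range(self.copied_up_to, end):
            self.dst[i] = self.src[i]
        self.copied_up_to = end

    def guest_write(self, block, value):
        """I/O mirroring process: writes below the boundary go to both disks;
        writes above it only touch the source (the bulk copy picks them up later)."""
        self.src[block] = value
        if block < self.copied_up_to:
            self.dst[block] = value

svm = ToyStorageVMotion(num_blocks=8)
svm.bulk_copy_step(blocks=4)      # blocks 0-3 cloned
svm.guest_write(2, "new")         # mirrored: already-copied region
svm.guest_write(6, "new")         # not mirrored yet: bulk copy will catch it
svm.bulk_copy_step(blocks=4)      # finish the pass; disks now converge
print(svm.dst == svm.src)         # -> True
```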

Migration of Virtual Machine’s Memory

vSphere 5.5 vMotion builds on existing vMotion technology for transfer of the virtual machine’s memory. Both vSphere 5.5 vMotion and vMotion use essentially the same pre-copy iterative approach to transfer the memory contents. The approach is as follows:


  • [Phase 1] Guest trace phase: The guest memory is staged for migration during this phase. Traces are placed on the guest memory pages to track any modifications by the guest during the migration.
  • [Phase 2] Pre-copy phase: Because the virtual machine continues to run and actively modify its memory state on the source host during this phase, the memory contents of the virtual machine are copied from the source ESXi host to the destination ESXi host in an iterative process. Each iteration copies only the memory pages that were modified during the previous iteration (a small sketch of this loop follows the list).
  • [Phase 3] Switch-over phase: During this final phase, the virtual machine is momentarily quiesced on the source ESXi host, the last set of memory changes are copied to the target ESXi host, and the virtual machine is resumed on the target ESXi host.
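The pre-copy loop referenced in Phase 2 can be sketched in a few lines of Python. The dirty-page tracking, the random dirtying rate, and the switch-over limit are simplified placeholders; the real implementation also considers convergence rate, available bandwidth, and downtime targets.

```python
import random

def precopy_migrate(num_pages, max_iterations=10, switchover_limit=16):
    """Toy pre-copy: copy every page once, then keep re-copying pages dirtied
    in the previous iteration until the remaining set is small enough to send
    during the brief switch-over pause."""
    dirty = set(range(num_pages))               # phase 1: all guest pages are traced
    for iteration in range(max_iterations):     # phase 2: iterative pre-copy
        to_copy, dirty = dirty, set()
        for page in to_copy:
            # Copy the page to the destination host over the vMotion network (no-op here).
            if random.random() < 0.05:          # guest keeps dirtying some pages
                dirty.add(page)
        print(f"iteration {iteration}: copied {len(to_copy)}, {len(dirty)} dirtied")
        if len(dirty) <= switchover_limit:
            break
    # Phase 3: quiesce the VM, copy the last dirty pages, resume on the target host.
    print(f"switch-over: transferring final {len(dirty)} pages")

precopy_migrate(num_pages=4096)
```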


In contrast to vMotion prior to vSphere 5.1, vSphere 5.5 vMotion must also transport any additional changes that occur to the virtual machine's virtual disks due to the guest's ongoing operations during the memory migration. In addition, vSphere 5.5 vMotion must coordinate the several copy processes, including the bulk copy process, the I/O mirroring process, and the memory copy process.

To allow a virtual machine to continue to run during the entire migration process, and to achieve the desired amount of transparency, vSphere 5.5 vMotion begins the memory copy process only after the bulk copy process completes the copy of the disk contents. The memory copy process runs concurrently with the I/O mirroring process, so the modifications to the memory and virtual disks, due to the guest's ongoing operations, are reflected to the destination host.

Because both the memory copy process and I/O mirroring process contend for the same network bandwidth, the memory copy duration could be slightly higher in vSphere 5.5 vMotion compared to the memory copy duration during vMotion. Generally, this is not an issue because the memory dirtying rate is typically high compared to the rate at which disk blocks change.

vSphere 5.5 vMotion guarantees atomic switch-over between source and destination hosts by ensuring both memory and disk state of the virtual machine are in lock-step before switch-over, and fails back to source host and source disks in the event of any unexpected failure during disk or memory copy.

Further details on vMotion, Storage vMotion and some related concepts in my next post.