Over the course of the past week I was asked a couple of storage-related questions regarding VMFS volumes and LUN partitions. The questions were about something that folks with experience and knowledge of VMware virtualization platforms are already aware of: how VMFS volumes work, and why it is not good practice to create multiple VMFS volumes on a LUN that has been partitioned. I wanted to take a moment to explain it as simply as I can, so here is my take on why we don’t want to use multiple VMFS volumes per LUN. I hope it’s something that can be of help.
VMware’s vStorage VMFS is a clustered file system that is shared among many servers, all or any of which could be writing to a shared LUN. So access to the LUN (which, by the way, is a single partition) needs to be controlled for certain important functions.
One important function is an ESX server powering on a VM. There can’t be confusion about which ESX server is running a VM, because only that ESX server can write to the VM’s files (or else the VM files might get corrupted). So we lock the entire LUN (the datastore, which is usually one partition) when we start up a VM. VMFS makes a note in the VMFS metadata to indicate which ESX server has that VM running (locked) or, as we say, the VM is “registered” on that ESX server.
Another important function is allocating space. When we allocate a block on VMFS we can’t have confusion about whether the block is allocated or not, so we lock the entire LUN.
How does VMware lock an entire LUN? To lock the LUN, VMware uses a feature of iSCSI or FC arrays called a SCSI-2 reserve. It’s VMware’s “distributed locking mechanism”: the ESX server requests, and the array grants, a SCSI-2 reserve to the ESX server, allowing that server exclusive access to the LUN. A SCSI-2 reserve is held until the ESX server releases it (I believe). In VMFS3 VMware tries to hold the reserve for the shortest possible time, just 1 or 2 I/Os; otherwise other ESX servers cannot access the LUN and performance could suffer.
So here’s what I think about multiple VMFS partitions: since it’s not possible to grant a SCSI-2 reserve on a single partition (if a LUN had multiple partitions), it doesn’t make sense to have separate VMFS file systems (i.e., multiple partitions) on one LUN. A SCSI-2 reserve works only for the whole LUN, so a lock taken on behalf of one VMFS volume would also block every other volume on that LUN. And indeed most VMFS volumes are single partition.
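To make that concrete, here is a small Python sketch of the idea. It is only a toy model I made up for illustration, not how ESX or any array actually implements reservations: the `Lun` and `VmfsVolume` classes, their method names, and the host names `esx1` and `esx2` are all hypothetical. The only thing it assumes from the discussion above is that the reservation covers the whole LUN.

```python
class Lun:
    """Toy model of a LUN whose only lock is a whole-LUN reservation."""

    def __init__(self, name):
        self.name = name
        self.holder = None  # host currently holding the reservation, if any

    def reserve(self, host):
        """Grant the reservation if the LUN is free or already held by host."""
        if self.holder in (None, host):
            self.holder = host
            return True
        return False

    def release(self, host):
        """Release the reservation, but only if this host holds it."""
        if self.holder == host:
            self.holder = None


class VmfsVolume:
    """Hypothetical VMFS volume living in one partition of a LUN."""

    def __init__(self, name, lun):
        self.name = name
        self.lun = lun

    def begin_metadata_update(self, host):
        """A metadata update needs the reservation on the underlying LUN."""
        return self.lun.reserve(host)

    def end_metadata_update(self, host):
        self.lun.release(host)


# Two VMFS volumes carved out of the same (partitioned) LUN.
shared = Lun("lun0")
vol_a = VmfsVolume("vmfs_a", shared)
vol_b = VmfsVolume("vmfs_b", shared)

a_locked = vol_a.begin_metadata_update("esx1")   # esx1 gets the reservation
b_blocked = vol_b.begin_metadata_update("esx2")  # refused: same LUN is reserved
vol_a.end_metadata_update("esx1")
b_after = vol_b.begin_metadata_update("esx2")    # succeeds once esx1 releases
```

In this model `esx2` is working on a completely different volume, yet it still has to wait, because the lock lives at the LUN level and not at the partition level.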
As for thin provisioning, these SCSI reserves apply there too, because if the VM needs to grow then space needs to be allocated for the VM. In order to allocate space, the ESX server has to get a SCSI-2 reserve granting that ESX host exclusive access to the LUN that the thin-provisioned VM is on, so VMFS can allocate more space to the VM. This could be disastrous for performance if the VM grows a lot, because each time the VM fills up a block and needs a new one we have to 1) lock the VMFS LUN with a SCSI-2 reserve, 2) allocate space to the VM, 3) release the SCSI-2 reserve, and 4) then do the actual I/O to the VM.
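A rough way to see how this adds up is to count the lock cycles. The Python sketch below is a back-of-the-envelope model under my own assumptions: the function names are made up, it assumes exactly one reserve/allocate/release cycle per newly allocated block as described above, and the block size is just a parameter (1 MB is used as an example default).

```python
import math


def reservations_for_growth(growth_mb, block_size_mb=1):
    """Number of reserve/release cycles needed to grow a thin disk.

    Assumes one cycle per newly allocated block (a simplification).
    """
    if growth_mb <= 0:
        return 0
    return math.ceil(growth_mb / block_size_mb)


def grow_thin_disk(growth_mb, block_size_mb=1):
    """Return the ordered list of steps taken while the disk grows."""
    steps = []
    for _ in range(reservations_for_growth(growth_mb, block_size_mb)):
        steps.extend(["reserve", "allocate", "release", "io"])
    return steps


small_blocks = reservations_for_growth(10, block_size_mb=1)  # 10 MB in 1 MB blocks
large_blocks = reservations_for_growth(10, block_size_mb=8)  # 10 MB in 8 MB blocks
```

Under these assumptions, growing by the same amount with a larger block size needs fewer reservations, which is one way to think about why a fast-growing thin disk hurts more when blocks are small.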
I hope this makes better sense to the people that asked. I want to thank Connie Economou for taking the time to help me simplify the topic. Hope this helps.



March 1st, 2010 at 10:36 am
That makes sense as it relates to multiple datastores, but is this really relevant if the multiple VMFS partitions are different extents of the same VMFS datastore?
An example I have had: a LUN is created on an iSCSI enclosure, with a VMFS file system and partition on the LUN. Later we needed more space, so we extended the LUN, then added a VMFS extent to the LUN. I could have added another LUN with the free space and then added an extent there, but I have had issues on occasion with other storage admins deleting LUNs they think are not in use, leaving me with a damaged VMFS file system. That is another story. In any case, is there any issue you could think of with 2 extents of the same VMFS on one extended LUN? The fact that both extents are locked at the same time shouldn’t be any more of an issue than if it was one extent. Either way all the space would be locked.
Just curious about your opinion.
March 8th, 2010 at 9:05 am
This doc doesn’t explicitly talk about the locking mechanism (where is that doc???), but it does talk about the performance of thin vs. thick provisioned disks.
http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf
March 8th, 2010 at 9:31 am
Derek, why not just do a volume grow? Then you don’t have to worry about metadata, locks, multiple LUNs, etc.