VMFS Deep Dive
- ESX storage stack and VMFS
- VMFS vs. RDM
- SCSI reservation conflicts
- Multipathing
- Snapshot LUNs and resignaturing

The Storage Stack in VI3
Built-In VMFS Cluster File System

- Simplifies VM provisioning
- Enables independent VMotion and HA restart of VMs on a common LUN
- File-level locking protects virtual disks
- Separates VM and storage administration
- Use RDMs for access to SAN features
Raw Device Mappings (RDMs)

- Mapping files in a VMFS volume, presented as virtual SCSI devices
- Key contents of the metadata include the location and locking of the mapped device
- Used when a virtual machine must interact with a real disk on the SAN
VMFS vs. RDM

RAW (RDM):
- May give better performance
- Microsoft Cluster Services (MSCS) storage
- More LUNs, so more provisioning time

VMFS:
- Leverage templates and quick provisioning
- Fewer LUNs means you don't have to watch the heap
- Advanced features still work
- Scales better with Consolidated Backup
- Preferred method
Skeleton of a VMFS
A VMFS holds files and has its own metadata. Metadata gets updated when:
- Creating a file
- Changing a file's attributes
- Powering on a VM
- Powering off a VM
- Growing a file
When metadata is updated, the VMkernel places a non-persistent SCSI reservation on the entire VMFS volume:
- The lock is held on the volume for the duration of the operation
- Other VMkernels are prevented from doing metadata updates
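The locking model above can be pictured as a coarse, volume-wide mutex that only metadata operations take, while ordinary data I/O proceeds without it. A minimal Python sketch of the idea (class and method names are illustrative, not VMkernel APIs):

```python
import threading

class VMFSVolume:
    # Toy model: metadata updates serialize on a volume-wide lock
    # (standing in for the non-persistent SCSI reservation), while
    # ordinary data I/O from any host proceeds without taking it.
    def __init__(self):
        self._volume_lock = threading.Lock()
        self.files = {}

    def data_io(self, host, path):
        # Data reads/writes do not reserve the volume.
        return self.files.get(path, b"")

    def create_file(self, host, path):
        # Metadata update: reserve the whole volume for the duration.
        with self._volume_lock:
            self.files[path] = b""

vol = VMFSVolume()
vol.create_file("esx1", "/vm1/vm1.vmdk")     # takes the volume-wide lock
print(vol.data_io("esx2", "/vm1/vm1.vmdk"))  # data access needs no lock
```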
Concurrent-access filesystem:
- Most I/O happens simultaneously from all hosts
- Filesystem metadata updates are atomic and performed by the requesting host, e.g.:
  - Locking a file for read/write (e.g. a vmdk when powering on a VM)
  - Creating a new directory or file
  - Growing a file, etc.
For the time needed by the locking operation (NOT the whole metadata update), the LUN is reserved (= locked for access) by a single host.

SCSI Reservation Conflict: What it is
What happens if we try to perform I/O to a LUN that's already reserved?
- A retry counter is decreased and the I/O operation is retried
- The retry is scheduled with a pseudo-random algorithm
- If the counter reaches 0, we have a SCSI reservation conflict

SCSI: 6630: Partition table read from device vmhba1:0:6 failed: SCSI reservation conflict (0xbad0022)
SCSI: vm 1033: 5531: Sync CR at 64
SCSI: vm 1033: 5531: Sync CR at 48
SCSI: vm 1033: 5531: Sync CR at 32
SCSI: vm 1033: 5531: Sync CR at 16
SCSI: vm 1033: 5531: Sync CR at 0
WARNING: SCSI: 5541: Failing I/O due to too many reservation conflicts
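The retry behavior described above can be sketched as a counter with pseudo-random backoff. This is a toy model; the retry budget, delays, and function names are illustrative, not the VMkernel's:

```python
import random
import time

class ReservationConflict(Exception):
    pass

def issue_io(lun_reserved):
    # Stand-in for a SCSI command; fails while the LUN is reserved.
    if lun_reserved():
        raise ReservationConflict()

def io_with_retries(lun_reserved, retries=80):
    # Decrement a retry counter, backing off a pseudo-random interval
    # between attempts.
    while retries > 0:
        try:
            issue_io(lun_reserved)
            return True
        except ReservationConflict:
            retries -= 1
            time.sleep(random.uniform(0.0, 0.005))  # pseudo-random delay
    # Counter hit 0: surface the conflict, as in the vmkernel log above.
    return False

print(io_with_retries(lambda: False))             # → True (LUN free)
print(io_with_retries(lambda: True, retries=3))   # → False (conflict)
```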
WARNING: SCSI: 5637: status SCSI reservation conflict, r status 0xc0de01 for vmhba1:0:6. residual R 919, CR 0, ER 3

Who's Holding a SCSI Reservation?
One ESX host:
- vmkfstools -L reserve: this should NEVER EVER be done
- Interaction with installed third-party management agents

Multiple ESX hosts, alternatively:
- High latency / slow SAN
  - Critical lock-passing between ESX hosts during VMotion
- SAN firmware slow in honoring SCSI reserve/release
  - Synchronously mirrored LUNs

One non-ESX host:
- LUN erroneously mapped to e.g. a Windows host

No host:
- Persistent reservation held by the SAN
- Needs investigation by the SAN vendor
ESX Server Multipathing
Path names: vmhbaN:T:L:P

Determined at boot, install, or rescan:
- N = adapter number
- T = target number (generally 1 SP = 1 target)

Determined by the SAN:
- L = LUN ID
- SCSI identifier of the LUN (not shown here)

Determined at datastore or extent creation:
- P = partition number (if 0 or absent = whole disk)
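For illustration, the vmhbaN:T:L:P convention can be parsed mechanically. `parse_vmhba` is a hypothetical helper, not a VMware tool:

```python
from typing import NamedTuple

class VmhbaPath(NamedTuple):
    adapter: int     # N: adapter number
    target: int      # T: target number (generally one per SP)
    lun: int         # L: LUN ID, assigned by the SAN
    partition: int   # P: 0 or absent means the whole disk

def parse_vmhba(name: str) -> VmhbaPath:
    if not name.startswith("vmhba"):
        raise ValueError(f"not a vmhba path: {name}")
    parts = name[len("vmhba"):].split(":")
    if len(parts) == 3:          # partition omitted -> whole disk
        parts.append("0")
    n, t, l, p = (int(x) for x in parts)
    return VmhbaPath(n, t, l, p)

print(parse_vmhba("vmhba1:0:6"))     # whole disk: partition defaults to 0
print(parse_vmhba("vmhba1:0:1:1"))   # partition 1
```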
Per-LUN Multipathing Failover Policy
VMware supports using only one path at a time:
- MRU = Most Recently Used
- Fixed = choose a preferred path and fail back to it
- With multiple ESX hosts or multiple LUNs, this allows for manual load balancing between SPs
Never set up the Fixed policy with an active/passive SAN! Why?

Path Thrashing
- Only possible on active/passive SANs
- Host 1 needs access to the LUN through SP1
- Host 2 needs access to the LUN through SP2
- The LUN keeps being trespassed between SPs and is never available for I/O

Multipathing
Active/Active arrays:
- LUNs presented on multiple Storage Processors
- Fixed path policy: 1st active path discovered or user configured
  - Failover on NO_CONNECT
  - Preferred path policy: failback to the preferred path if it recovers

Active/Passive arrays:
- LUNs presented on a single Storage Processor
- MRU (Most Recently Used) path policy
  - Failover on NOT_READY, ILLEGAL_REQUEST or NO_CONNECT
  - No preferred path policy, no failback to the preferred path

Policy summary:
- Fixed: Active/Active arrays only
- MRU: Active/Active arrays and Active/Passive arrays
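The failover rules above can be condensed into a small sketch. The sense codes and class below are simplified illustrations of the behavior, not the actual VMkernel implementation:

```python
# Sketch of Fixed vs. MRU path selection under failover, assuming a
# simplified two-path model.
FAILOVER_ERRORS_MRU = {"NOT_READY", "ILLEGAL_REQUEST", "NO_CONNECT"}
FAILOVER_ERRORS_FIXED = {"NO_CONNECT"}

class PathSelector:
    def __init__(self, paths, policy, preferred=None):
        self.paths = list(paths)
        self.policy = policy                      # "fixed" or "mru"
        self.preferred = preferred or self.paths[0]
        self.current = self.preferred

    def on_error(self, sense):
        errors = (FAILOVER_ERRORS_FIXED if self.policy == "fixed"
                  else FAILOVER_ERRORS_MRU)
        if sense in errors:
            alternates = [p for p in self.paths if p != self.current]
            self.current = alternates[0]          # fail over to another path

    def on_path_restored(self, path):
        # Fixed fails back to the preferred path; MRU stays where it is.
        if self.policy == "fixed" and path == self.preferred:
            self.current = self.preferred

fixed = PathSelector(["SP1", "SP2"], "fixed", preferred="SP1")
fixed.on_error("NO_CONNECT");   print(fixed.current)   # SP2
fixed.on_path_restored("SP1");  print(fixed.current)   # back to SP1

mru = PathSelector(["SP1", "SP2"], "mru")
mru.on_error("NOT_READY");      print(mru.current)     # SP2
mru.on_path_restored("SP1");    print(mru.current)     # stays on SP2
```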
Snapshot LUNs and Resignaturing

How VMware ESX Identifies Disks
- Each LUN has a SCSI identifier string provided by the SAN vendor
- The SCSI ID stays the same across different paths
- The VMkernel identifies disks with a combination of LUN ID, SCSI ID and part of the model string

# ls -l /vmfs/devices/disks/
total 179129968
-rwxrwxrwx 1 root root 72833679360 Nov 13 12:16 vmhba0:0:0:0
lrwxrwxrwx 1 root root 58 Nov 13 12:16 vmhba1:0:0:0 -> vml.020000000060060160432017002a547c3e7893dc11524149442035
lrwxrwxrwx 1 root root 58 Nov 13 12:16 vmhba1:0:1:0 -> vml.02000100006006016043201700a99d1c3bb9c5dc11524149442035
lrwxrwxrwx 1 root root 58 Nov 13 12:16 vmhba1:0:10:0 -> vml.02000a000060060160432017000db2f61d17d3dc11524149442035
(...)

Snapshot LUNs & Resignaturing: Key Facts
- ESX identifies objects in a VMFS datastore by path, e.g. /vmfs/volumes/<UUID>/
- The VMFS UUID (aka signature) is generated at VMFS creation
- The VMFS header includes hashed information about the disk where it's been created
- VMFS relies on SCSI reservations to acquire on-disk locks, which in turn enforce atomicity of filesystem metadata updates
- SCSI reservations don't work across mirrored LUNs
- To avoid corruption, we need to prevent mounting a datastore and a copy of it at the same time
On rescan, the information about the disk in the VMFS header metadata (m/d) is checked against the actual values. If any of the fields doesn't match, the VMFS is not mounted and ESX complains that it is a snapshot LUN:

LVM: 5739: Device vmhba1:0:1:1 is a snapshot:
LVM: 5745: disk ID:
LVM: 5747: m/d disk ID:
ALERT: LVM: 4903: vmhba1:0:1:1 may be snapshot: disabling access. See resignaturing section in SAN config guide.
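The rescan check can be modeled as a straight comparison between the identity recorded in the VMFS header and the identity the host actually observes. A toy Python model with illustrative field names and made-up SCSI IDs:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DiskIdentity:
    lun_id: int
    scsi_id: str   # SAN-provided identifier (made-up values below)
    model: str     # part of the SCSI model string

def is_snapshot(header_identity: DiskIdentity,
                observed: DiskIdentity) -> bool:
    # Mimics the rescan check: any mismatch between the identity stored
    # in the VMFS header (m/d) and the observed disk marks it a snapshot.
    return header_identity != observed

original = DiskIdentity(1, "60060160432017a99d01", "RAID 5")
mirror   = DiskIdentity(1, "6006016043201799ff02", "RAID 5")  # copy: new SCSI ID

print(is_snapshot(original, original))  # False: mounted normally
print(is_snapshot(original, mirror))    # True: access disabled
```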
Are they mirrored/snapshot LUNs?
- If yes: will the ESX host(s) ever see both original and copy at the same time?
  - Yes: resignature
  - No: either allow snapshots or resignature
- If no: do multiple ESX hosts see the same LUN with different IDs?
  - Yes: fix the SAN config; if not possible, allow snapshots
  - No (IDs permanently changed): either allow snapshots or resignature
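The same decision tree, encoded as a hypothetical helper function for illustration:

```python
def snapshot_lun_action(mirrored: bool,
                        both_visible: bool = False,
                        ids_differ_across_hosts: bool = False) -> str:
    # Encodes the decision tree above; returns the recommended action.
    if mirrored:
        if both_visible:
            return "resignature"
        return "allow snapshots or resignature"
    if ids_differ_across_hosts:
        return "fix the SAN config; if not possible, allow snapshots"
    # LUN IDs permanently changed
    return "allow snapshots or resignature"

print(snapshot_lun_action(mirrored=True, both_visible=True))
print(snapshot_lun_action(mirrored=False, ids_differ_across_hosts=True))
```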
Resignaturing Issues

Never ever resignature:
- Resignaturing implies changing the UUID and the datastore name
- All paths to filesystem objects (vmdks, VMs) will become invalid!
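To illustrate why those paths break: object paths embed the datastore UUID (or a name derived from it), and resignaturing assigns a new one, so every stored reference must be re-registered or edited. A hypothetical sketch with made-up UUIDs:

```python
# Hypothetical illustration: a reference to a vmdk by datastore path
# goes stale once the datastore UUID changes on resignature.
OLD_UUID = "46f1d1a2-9e1b2c3d-7f00-0017a4490072"  # made-up
NEW_UUID = "4a2b3c4d-1e2f3a4b-9c00-0017a4490072"  # made-up, assigned at resignature

def rewrite_reference(path: str, old_uuid: str, new_uuid: str) -> str:
    # Re-point a stale datastore path at the resignatured volume.
    return path.replace(old_uuid, new_uuid)

stale = f"/vmfs/volumes/{OLD_UUID}/vm1/vm1.vmdk"
print(rewrite_reference(stale, OLD_UUID, NEW_UUID))
```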