Monday, July 25, 2011

Using tape libraries under ESX 4.1 - Take 2

After ruling out SCSI pass-through and iSCSI, I found a link referencing PCI pass-through on ESX servers. This requires a system that supports Intel VT-d and essentially ties a PCI card to a running virtual machine. The VMware feature is called VMDirectPath I/O, and information on configuring it can be found at http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1010789

Once I found a host that supported VT-d, I was surprised how easy the setup process was. Just configure the VMDirectPath device, reboot, and add it to a VM. Using a QLogic HBA, my Backup Exec server easily discovered my tape library and drives.
Performance for this setup was great! I actually edged out my prior configuration, but that was likely due to the hardware I was running on.
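
For bulk work, the assignment can also be scripted. Here's a rough PowerCLI sketch - the host and VM names are made up, it assumes a PowerCLI build that includes the pass-through cmdlets, and the device still needs VMDirectPath enabled (and the host rebooted) first:

# List the PCI devices available for pass-through on the host
Get-PassthroughDevice -VMHost "esx01.example.com" -Type Pci

# Grab the QLogic HBA and attach it to the (powered-off) backup VM
$hba = Get-PassthroughDevice -VMHost "esx01.example.com" -Type Pci | Where-Object { $_.Name -match "QLogic" }
Add-PassthroughDevice -VM (Get-VM "BackupExec01") -PassthroughDevice $hba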

Friday, July 22, 2011

Using tape libraries under ESX 4.1

In the old days of ESX 3.5, there was an unsupported feature that allowed SCSI pass-through to VMs. This allowed us to build a VM and connect our FC tape library to perform backups and restores using various backup software.
This isn't the highest-performance backup/recovery solution, but it was definitely the most flexible. It allowed us to swap backup solutions (Backup Exec, NetBackup, NetWorker, CommVault, etc.) by building out separate VMs for each configuration. By keeping only one VM powered on at any time, you could also share the same library across all the VMs.

This unsupported feature broke entirely with ESX 4.0. SCSI pass-through would allow the VM to see the device, but it couldn't use the library. VMware had some experimental DirectPath features, but all our research suggested our configuration wouldn't work. To deal with the shortcomings, we kept 2 ESX hosts running 3.5 to act as our Tape-Cluster while we upgraded the rest to 4.0.

This extremely inefficient setup finally called for some changes, so I started looking for iSCSI solutions. In theory, the tape library can be presented via iSCSI to the OS running in the VM. Then, regardless of hardware (physical or virtual), the OS would be able to use the tape library.
There were a few hacked solutions out there, but only one came in an enterprise-ready package. StarWind Software allows publishing of physical and virtual devices (hard drive, optical drive, tape device) via iSCSI. Out of the box, the eval key allowed me to present my library and drives via iSCSI without any customization. Once the Microsoft iSCSI Initiator was installed and configured, the tape library was discovered without issue, and Backup Exec readily accepted the changer and drives.

Configuration
A physical server (4 procs, 32GB RAM) was built running Windows 2008 R2 to operate as the iSCSI gateway. A single FC card connected to the tape library via a Brocade switch, and a single 10Gb Ethernet connection was used for iSCSI. Overall, an over-sized system for this test, but it was available at the time.
Our existing ESX 3.5 Backup Exec VM (the one configured for SCSI pass-through) was cloned and moved to a pristine ESX 4.1 environment. The hardware version, NICs, and VMware Tools were upgraded to the latest versions. Microsoft's iSCSI initiator was installed and connected to the gateway server.
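
For reference, pointing the Microsoft iSCSI Initiator at the gateway from the command line looks roughly like this (the portal IP and target IQN below are placeholders for whatever the StarWind gateway actually publishes):

# Register the gateway as a target portal
iscsicli QAddTargetPortal 192.168.1.50
# List the targets it publishes, then log in to the tape target
iscsicli ListTargets
iscsicli QLoginTarget iqn.2008-08.com.starwindsoftware:tape-gateway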

Functionality
So far I have found no deficiency in functionality. Once the iSCSI initiator connected, the library and drives were discovered without intervention, and the existing drivers supported the new devices without issue. Backup Exec was reconfigured with the new devices (the same process as if a new library had been connected), but otherwise there was minimal setup. The library inventories, catalogs, and restores the same as an FC-connected system.
One issue I ran into: a coworker inserted tapes into the library while I was in between tests, and somehow the library reported that it had been rebooted. Backup Exec then reported the drive was offline, and nothing I did on the Backup Exec server resolved the issue. Ultimately I had to restart the StarWind service on my gateway system to get things working again. Backup Exec then allowed me to set the drive as online and continue my testing.
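
For the record, the fix on the gateway was just a service bounce, something along these lines (the service lookup below is a guess - the exact name may differ by StarWind version):

# Bounce the StarWind target service on the gateway
Get-Service | Where-Object { $_.DisplayName -like "*StarWind*" } | Restart-Service -Force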

Performance
I performed several tests using the iSCSI gateway and timed the results. I then performed the same tests on our existing ESX3.5 VM to compare the run times. 
Initial performance testing proved encouraging. Scanning, inventorying, and cataloging the library all took about the same amount of time between the two configurations, and any variation in times could easily be attributed to the library doing other tasks at the same time. My verdict: basic library/tape functions were comparable between the two setups.
Restore testing is where the solution would really be put to the test. I performed multiple restores ranging from a few MB up to a hundred GB, making sure to use the same tape set and source data on both systems to ensure an appropriate comparison.


Verdict
The performance difference for larger restores was astounding. Small restores and library functions were comparable, but when transferring large amounts of data, iSCSI fell well short of the goal.
Additionally, I had a few failures on the larger restores after the 50GB mark using iSCSI. The failures were intermittent, but additional research suggests this may be a limitation of the technology.

Tuesday, July 12, 2011

sVmotion VMware templates

I am doing some VMware datastore shuffling and needed to move several templates to another datastore. For normal VMs, I have a script named "Move-VMThin.ps1" that moves the VMs and thins out the disks at the same time. I looked for something similar for templates, something that would let me migrate a template to other storage, but there was no native support.

A little googling turned up http://ict-freak.nl/2010/01/21/powercli-move-template/. It appears that you have to convert the template to a VM, move the VM, and convert it back to a template. This is fine, but a pain if you have dozens of templates, so I put together the script below.

$vmName = $args[0]
$dsName = $args[1]


function Move-VMTemplate{
    param( [string] $template, [string] $datastore)

    if($template -eq ""){Write-Host "Enter a Template name"; return}
    if($datastore -eq ""){Write-Host "Enter a Datastore name"; return}

    Write-Host "Converting $template to VM"
    $vm = Set-Template -Template (Get-Template $template) -ToVM 

    Write-Host "Migrate $template to $datastore"
    # Move-VM -VM (Get-VM $vm) -Destination (Get-VMHost $esx) -Datastore (Get-Datastore $datastore) -Confirm:$false
    Move-VMThin -VM $vm.Name -Datastore $datastore
    
    Write-Host "Converting $template to template"
    (Get-VM $vm | Get-View).MarkAsTemplate() | Out-Null
}

function Move-VMThin {
    PARAM(
         [Parameter(Mandatory=$true,ValueFromPipeline=$true,HelpMessage="Virtual Machine Objects to Migrate")]
         [ValidateNotNullOrEmpty()]
            [System.String]$VM
        ,[Parameter(Mandatory=$true,HelpMessage="Destination Datastore")]
         [ValidateNotNullOrEmpty()]
            [System.String]$Datastore
    )
    
 Begin {
        #Nothing Necessary to process
 } #Begin
    
    Process {        
        #Prepare Migration info, uses .NET API to specify a transformation to thin disk
        $vmView = Get-View -ViewType VirtualMachine -Filter @{"Name" = "$VM"}
        $dsView = Get-View -ViewType Datastore -Filter @{"Name" = "$Datastore"}
        
        #Abort Migration if free space on destination datastore is less than 50GB
        if (($dsView.info.freespace / 1GB) -lt 50) {throw "Move-ThinVM ERROR: Destination Datastore $Datastore has less than 50GB of free space. This script requires at least 50GB of free space for safety. Please free up space or use the VMWare Client to perform this Migration"}

        #Prepare VM Relocation Specification
        $spec = New-Object VMware.Vim.VirtualMachineRelocateSpec
        $spec.datastore =  $dsView.MoRef
        $spec.transform = "sparse"
        
        #Perform Migration
        $vmView.RelocateVM($spec, $null)
    } #Process
}


Move-VMTemplate $vmName $dsName
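
Saved as something like Move-Template.ps1 (the file name is up to you), it runs from an existing PowerCLI session; the template and datastore names below are just examples:

# Connect to vCenter first, then pass the template name and destination datastore
Connect-VIServer "vcenter.example.com"
.\Move-Template.ps1 "W2K8R2-Template" "NewDatastore01"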

Friday, July 01, 2011

Firefox 5 with Hardware Acceleration

Firefox 5 was just recently released and I love it! The hardest part is finding the new features and possibilities in the new versions.

One item I found that was a carry-over from the Firefox 4 days was hardware acceleration. In the beta versions of FF4, you could enable acceleration, but it was unproven. In theory, this was rolled into the new versions, but I found some improvements were still possible.
First, go to http://demos.hacks.mozilla.org/openweb/HWACCEL/ and run the acceleration demo. This will give you a baseline of your current configuration.

Then, enable accelerated graphics (from http://www.addictivetips.com/internet-tips/enable-firefox-hardware-accelerated-graphics/; a scripted alternative follows the steps):
  1. Enter ‘about:config’
  2. Click through the warning, if necessary
  3. Enter gfx.font in the ‘Filter’ box
  4. Double-click on ‘gfx.font_rendering.directwrite.enabled’ to set it to true
  5. Below this, right click and select New > Integer to add a pref setting
  6. Enter ‘mozilla.widget.render-mode’ for the preference name, 6 for the value
  7. Restart
(To disable, set gfx.font_rendering.directwrite.enabled to false, delete mozilla.widget.render-mode, then restart.)
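
If you'd rather script it than click through about:config, the same two prefs can be dropped into a user.js in the Firefox profile - a minimal sketch, untested on my side, and the profile detection below is just a guess:

# Append the two prefs to user.js in the (assumed) default profile; they apply at next Firefox start
$ffProfile = Get-ChildItem "$env:APPDATA\Mozilla\Firefox\Profiles" |
    Where-Object { $_.PSIsContainer -and $_.Name -like "*.default" } |
    Select-Object -First 1
Add-Content -Path (Join-Path $ffProfile.FullName "user.js") -Value @'
user_pref("gfx.font_rendering.directwrite.enabled", true);
user_pref("mozilla.widget.render-mode", 6);
'@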

Finally, go back to the acceleration demo and see the differences.

Reclaiming thin-provisioned disk space in VMware

My coworker copied a bunch of VMDK files to a VM so he could import them into VMware. This process worked fine, but left his VM disks using almost their full capacity, even after the files were deleted. When attempting to SVMotion the VM (and re-thin the disks), it failed because there wasn't enough room at the destination datastore - for some reason VMware was not re-thinning the disks properly.

A little googling came up with a post at http://www.virtualizationteam.com/virtualization-vmware/vsphere-virtualization-vmware/vmware-esx-4-reclaiming-thin-provisioned-disk-unused-space.html. This talks about using a tool called SDELETE (http://technet.microsoft.com/en-us/sysinternals/bb897443) to zero out the unused blocks on the disk. Once zeroed, the SVMotion should thin the disks properly.

Steps to reclaim the disk space (a quick example follows the list):
  1. Download SDelete from Microsoft
  2. Run sdelete -c e: (or the appropriate drive letter)
  3. Use SVMotion to move the disk to another datastore
 NOTE: This can take a long time depending on the disk, and will churn your back-end storage. Don't do this during production hours, or you may have customers complain about performance issues.
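
Putting steps 2 and 3 together, it looks something like this (the drive letter, VM, and datastore names are just examples; Move-VMThin is the function from the template post above):

# Inside the guest: zero out the free space so the blocks can be reclaimed
sdelete.exe -c e:

# From a PowerCLI session: relocate the VM with the sparse/thin transform
Move-VMThin -VM "FileServer01" -Datastore "NewDatastore01"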

Recovering from BSOD with crash dump

My system just crashed and performed the traditional Blue Screen of Death (BSOD). The bad news is that I had open files that I have now lost. The good news is that it created a dump file to review and see what caused it.

90% of all crashes are due to a driver of some sort, and most of those are easily identified. The first step is to download WinDbg. For my Windows 7 system, this was part of the Windows 7 SDK at http://www.microsoft.com/download/en/confirmation.aspx?id=8279; I ran the setup and cleared out everything but Common Utilities | Debugging Tools for Windows.
Once installed, open WinDbg from the Start menu, go to File | Symbol File Path, enter SRV*c:\local cache*http://msdl.microsoft.com/download/symbols, and click OK. Go to File | Open Crash Dump and browse to the memory.dmp or mini.dmp file. On opening the file, the debugger will do some initial analysis and return a line:
Probably caused by : XXXXXX
To get more detailed information, enter (or click) !analyze -v. Tons more information will come spewing out, detailing the type of error, more specific error codes, and the drivers and modules affected.
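
The same analysis can be run from the command line with kd.exe, which installs alongside WinDbg (the install path and dump location below are assumptions for a 64-bit Windows 7 box):

# Point the command-line debugger at the dump, run !analyze -v, then quit
& "C:\Program Files\Debugging Tools for Windows (x64)\kd.exe" `
    -y "SRV*c:\localcache*http://msdl.microsoft.com/download/symbols" `
    -z "C:\Windows\MEMORY.DMP" `
    -c "!analyze -v; q"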

As stated earlier, 90% of these errors are due to drivers (printers, video, etc.). This debug information should have identified the driver name; just google that driver and find an updated version.

More information on using the windows debugger to deal with crash dumps:
http://www.networkworld.com/news/2005/041105-windows-crash.html
http://support.microsoft.com/kb/315263