Wednesday, April 22, 2009

VMware/NFS performance with the Celerra NS-G8

Further testing with the Celerra raised the question of performance - can things be tweaked to improve performance, and how does IP storage compare with FC? The scenario I am looking at is how VMware uses the NFS export as a datastore (something that isn't officially supported, yet a wizard exists to configure it).

Our Celerra will eventually be configured with a 10 Gb Ethernet link into a Xsigo I/O Director - if you're not familiar with Xsigo, check them out, as there is some impressive technology there. We are waiting on parts to show up, so the current performance testing will use two 1 Gb Ethernet links in an EtherChannel configuration.

The test plan is:
  1. Create VMDKs using FC, Celerra with FC disks, and Celerra with SATA disks
  2. Use IOMETER to test the various "baseline" configurations
  3. Tweak the VMware/Celerra configuration according to EMC best practices and benchmark again
  4. Once the 10G modules arrive, repeat the tests
Storage Configuration:
Our CLARiiON is a CX4-240 with 1 tray of 1TB SATA disks and 6 trays of 300GB FC disks. The SATA disks are configured into (2) 6+1 RAID5 groups, each presenting (4) 1.6TB LUNs. The FC disks are configured into (3) 8+1 RAID5 groups, each presenting (2) 1.2TB LUNs. The remaining FC disks are unused or configured for the Celerra root file system. A temporary FC MetaLUN was also created with the same configuration as the existing FC exports, for baseline testing.

The SATA LUNs are combined on the Celerra to create one storage pool, and the FC LUNs are combined to create another. These pools are then exported via NFS to VMware over the 2 Gb EtherChannel links.

Once the baseline performance was captured using the default profiles in IOMETER, a whitepaper from EMC called "VMware ESX Server Optimization with EMC Celerra Performance Study Technical Note" (you may need a Powerlink account to access it) was used to tweak NFS performance settings. According to their performance tests:
Based on the results of this study, the following recommendations should be considered when using VMware in a Celerra environment:
  • Recommendation #1: Use the uncached option on Celerra file systems provisioned to the ESX Server as NAS datastore. This can improve the overall performance of the virtual machines. This should be considered particularly with random and mixed workloads that have write I/O.
  • Recommendation #2: When using random and mixed workloads without replication, consider disabling prefetch on the Celerra file system. This can improve the overall performance of virtual machines provisioned with Celerra NFS storage.
  • Recommendation #3: Align virtual machines that are provisioned with Celerra iSCSI storage as it can improve the overall performance of the virtual machines.
  • Recommendation #4: Do not align virtual machines that are provisioned with Celerra NFS storage because this may degrade the overall performance of the virtual machines.
  • Recommendation #5: Consider using an iSCSI HBA on the ESX Server with sequential workloads with Celerra iSCSI storage. This can improve the overall performance of virtual machines with such workloads.

This added two more scenarios to the testing mix: setting the uncached write option, and disabling prefetch on reads. The uncached write option is configured with the following command:
server_mount ALL -option rw,uncached
Disabling prefetch is done with the following command:
server_mount ALL -option rw,noprefetch
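
Since ALL applies the options to every file system mounted on the Data Mover, a single datastore file system can also be remounted on its own. A minimal sketch, assuming the usual server_mount syntax of mover, options, file system, and mount point - the names below are placeholders:

# "vmware_fs" and "/vmware_fs" are placeholders - substitute your file system and mount point.
server_mount server_2 -option rw,uncached,noprefetch vmware_fs /vmware_fs
# List the current mounts to confirm the options took effect:
server_mount server_2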



[Chart: MB/second results]
[Chart: IOPS results]

Conclusion:
While 2 Gb Ethernet does not match the throughput and performance of 4 Gb FC, it is close. I fully expect that if this testing were done with 4 Gb of Ethernet bandwidth, the Ethernet/FC differences would be minimal, and once our 10 Gb modules are installed, Ethernet will outperform FC.
Additionally, at 2 Gb Ethernet speeds there is little difference between SATA disks and FC disks. This suggests that the bottleneck is the transport mechanism, and that more differences will be identified once the bandwidth is increased.
Lastly, there is little visible difference between the baseline configuration and the EMC best-practice configuration. Some of this may be due to the workload profiles being mostly sequential instead of random, but it does suggest that the out-of-the-box configuration is fairly well optimized.

Tuesday, April 21, 2009

Celerra NFS and VMware testing

We just received an EMC Celerra NS-G8, and it is my job to implement NFS serving VMware. Beyond the standard "get away from VMFS" benefits, there were a few features that piqued my interest: thin provisioning and deduplication.

Thin provisioning was a big letdown. If you are using NFS, you get some degree of thin provisioning by default. Additionally, almost any VMware function that touches the disks (Storage VMotion, cloning, deploying from template) will bloat the VMDK to full size. I did find a way to thin out the VMs (I call it a treadmill process), but it's not seamless and requires hours of downtime. I still have this feature enabled, but don't expect great things from it.

Deduplication was a bit of a surprise to me, since I didn't think the feature was available until I got the system. My previous experience with deduplication was with EMC Avamar, which does block-level deduplication and can achieve over 90% deduplication rates. Celerra deduplication, however, is file-level, meaning only fully duplicate files are freed up.
I have worked with Exchange and Windows Single Instance Storage before, so this is a great feature for file servers, where the same file may exist dozens or hundreds of times, but no two VMDKs are ever going to be alike.

Celerra deduplication, however, also does compression, something that could be very useful if it can compress the zero blocks in a VMDK. To test this, I created a "fat" VMDK and copied it to an NFS datastore, then initiated the dedupe process and compared the size differences.


Step 1: Create the bloated VMDK
The first thing needed is to create a bloated/fat/inflated disk to test against:
  1. SSH into the VMware host
  2. cd to /vmfs/volumes/
  3. Create the disk: vmkfstools -c 50G -d eagerzeroedthick foo.vmdk

The size of the disk can be confirmed by running ls -l and by viewing it in the Datastore Browser; make sure both locations list it as a full 50G in size (to ensure that thin provisioning isn't affecting us).
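
A quick way to double-check from the service console is to compare ls against du, since ls shows the apparent file size while du shows the blocks actually allocated on the NFS export. A minimal sketch, assuming the default naming vmkfstools uses (a foo.vmdk descriptor plus a foo-flat.vmdk extent holding the data):

# Run from the datastore directory on the ESX host.
ls -lh foo-flat.vmdk   # apparent size - should report 50G
du -sh foo-flat.vmdk   # allocated size - should also be ~50G for an eagerzeroedthick disk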


Step 2: Change the dedupe parameters
By default, deduplication is limited to files that meet the following requirements:
  • Haven't been accessed in 30 days
  • Haven't been modified in 60 days
  • Are larger than 24KB
  • Are smaller than 200MB

To test the dedupe process, we need to change these using the server_param command. To see the current settings, SSH into the Celerra and run server_param server_2 -facility dedupe -list. This will list all the deduplication settings; a setting can then be changed by running server_param server_2 -facility dedupe -modify <attribute> -value <value>. In my case, I need to reset the access and modified times to 0, and the maximum size to 1000.
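
For illustration, the modify commands end up looking like the sketch below. The attribute names are assumptions on my part - confirm them against the -list output on your Celerra before running anything:

# Attribute names (accessTime, modificationTime, maximumSize) are assumed -
# verify them with: server_param server_2 -facility dedupe -list
server_param server_2 -facility dedupe -modify accessTime -value 0
server_param server_2 -facility dedupe -modify modificationTime -value 0
server_param server_2 -facility dedupe -modify maximumSize -value 1000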


Step 3: Initiate the dedupe process
Every time a file system is configured for deduplication, the dedupe process is triggered - meaning we can start a dedupe job manually by telling the file system to enable deduplication (if that makes sense). There are two ways to do this: via the web console, or via the command line.

To kick off a dedupe job via the web console, browse to the File Systems node and open the properties for the target file system. In the File System Properties page, set Deduplication = Suspended and click Apply. Then set Deduplication = On and click Apply. As soon as dedupe is set to On, a dedupe job will be initiated.

To kick off a dedupe job via the command line, SSH into the Celerra and run fs_dedupe -modify <fs_name> -state on. This will automatically start a deduplication job. To view the status of the job, run fs_dedupe -info.

Step 4: Compare the results
Initiating a dedupe on a file system with only VMDKs ultimately results in zero gain. Even with the disks being completely blank, compression doesn't seem to come into play - meaning testing it was a big waste of time.


Additional testing of dedupe with other files (ISOs, install files, home folders, etc.) shows that dedupe works properly at the file level, but not for VMDKs.

Monday, April 20, 2009

Synchronizing ZenPacks across multiple collectors

I have set up multiple Zenoss collectors in my environment, and one of the issues I ran into was keeping the customizations and ZenPacks in sync. Unfortunately, there is no "out of the box" way to do this (maybe in Zenoss Enterprise, but not in Zenoss Core).

Initially I used SCP to copy the files from my primary server to the backup servers. This was problematic, since the Zenoss user didn't have a password (that I knew, at least), and running SCP as root changed the ACLs.

I then remembered my old friend rsync, and a quick bit of googling suggested it might be the answer. A little more searching and I found a simple command line to copy the ZenPack files, customizations, and ACLs without any concern. A little tweaking and I can throw this right into my crontab to synchronize on an hourly basis:

rsync -avz /usr/local/zenoss/zenoss/ZenPacks/ root@serverB:/usr/local/zenoss/zenoss/ZenPacks
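
For reference, the hourly crontab entry looks something like the line below. This assumes it runs from root's crontab on the primary server and that SSH key authentication to serverB is in place, since cron cannot answer a password prompt:

# Runs at 15 past every hour; requires passwordless SSH from root to serverB.
15 * * * * rsync -avz /usr/local/zenoss/zenoss/ZenPacks/ root@serverB:/usr/local/zenoss/zenoss/ZenPacks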

Wednesday, April 08, 2009

Configuring multiple Zenoss collectors

This post discusses how to configure multiple Zenoss collectors for centralized monitoring and alerting. These instructions are inspired by http://www.zenoss.com/Members/fdeckert/how-to-install-distributed-collectors/. Many of the tasks below reference $ZENHOME, which can be found by logging in as the zenoss user; note that some of the tasks may need to be run as root.

Install Zenoss on ServerB, but do not start it

Ensure DNS is set up with FQDNs for both servers

Task: Install snmpd (both servers)
apt-get install snmp snmpd

Task: Configure snmpd (both servers)
Run snmpconf
Select none
Select to create snmpd.conf
Select Access Control Setup
Choose SNMPv1/SNMPv2c read-only access community name
Enter the read-only community name, enter, enter
Finished, Finished, Quit
mv snmpd.conf /etc/snmp/snmpd.conf

Task: Enable remote snmp access (both servers)
Edit /etc/default/snmpd
Change the line: SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -I -smux -p /var/run/snmpd.pid 127.0.0.1'
To: SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -I -smux -p /var/run/snmpd.pid'
Restart snmpd

Task: Verify snmp is working
On ServerA: snmpwalk -v 2c -c public ServerB .1.3
On ServerB: snmpwalk -v 2c -c public ServerA .1.3

Task: Configure Zenoss services (both servers)
Create two files in $ZENHOME/etc named DAEMONS_TXT_ONLY and daemons.txt.
On ServerA, enter the below into daemons.txt and save:
zeoctl
zopectl
zenhub
zenping
zensyslog
zenstatus
zenactions
zentrap
zenmodeler
zenrender
zenperfsnmp
zencommand
zenprocess
zenwin
zeneventlog
On ServerB, enter the below into daemons.txt and save (note: no zeoctl, zopectl, or zenhub - those run only on ServerA):
zenping
zensyslog
zenstatus
zenactions
zentrap
zenmodeler
zenrender
zenperfsnmp
zencommand
zenprocess
zenwin
zeneventlog

Task: Configure ServerB to use local monitors, but use the hub on ServerA
On ServerB, in $ZENHOME/etc, edit the following files: zenactions.conf, zencommand.conf, zendisc.conf, zeneventlog.conf, zenmodeler.conf, zenperfsnmp.conf, zenping.conf, zenprocess.conf, zenrender.conf, zenstatus.conf, zensyslog.conf, zentrap.conf, zenwin.conf, zenwinmodeler.conf
Enter the following 2 lines in all of these files:
monitor ServerB
hubhost ServerA

Task: Configure ServerB to use the Zope engine on ServerA
On ServerB, in $ZENHOME/etc, edit zope.conf
Find the zeoclient section
Change the line: server localhost:8100
To: server ServerA:8100

Task: Add the remote collector
On ServerA, in the Web Interface, browse to Management | Collectors | Add Monitor
Enter the name ServerB
Change Render URL from: /zport/RenderServer
To: http://ServerA:8090/ServerB

Task: Copy ZenPacks and plugins
On ServerA: scp -r $ZENHOME/ZenPacks ServerB:$ZENHOME/ZenPacks
Make sure any other alterations (symlinks, packages, etc.) are duplicated on ServerB

Task: Ensure files are owned by zenoss (both servers)
chown -R zenoss:zenoss $ZENHOME/ZenPacks

Task: Start Zenoss
On ServerA: /etc/init.d/zenoss-stack restart
On ServerB: /etc/init.d/zenoss-stack start

Task: Begin moving devices
On ServerA, in the Web Interface, browse to Management | Collectors | localhost
Select several devices and click Devices | Set Perf Monitor
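
Once everything is started, it is worth confirming that each server is running only the daemons listed in its daemons.txt. A quick check, assuming the stock zenoss control script in $ZENHOME/bin:

# Run on each server and compare the output against that server's daemons.txt.
su - zenoss -c "zenoss status"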

Monitoring and automatically restarting services in Zenoss

Now we need to monitor a service - say, the Print Spooler - and we want to know when it fails. Actually, since we are lazy and overworked, we want to automatically restart the service and only be alerted if it doesn't restart.

Monitor a Windows Service
Enabling monitoring of Windows services is quite intuitive; below are the steps needed to set up monitoring.
  1. In the action pane, select Services -- various classes of services will be listed
  2. Select WinService
  3. Find the service you are interested in monitoring (spooler) by paging through the list, or type in the name in the search box to the right
  4. Click the Spooler service and select the Edit tab
  5. Change Monitor to True, click Save

Enable Automatic Restart
This was inspired by http://blog.zenoss.com/2008/03/21/restarting-windows-services-with-zenoss; more help may be available there if my words don't make sense.

Create a Transform to recognize the event
  1. In the action pane, select Events
  2. Select Status, WinService
  3. Select WinService | More | Transform
  4. Enter the following in the Transform and click Save
# Get the service name from the event message.
msg = getattr(evt, "message", None)
# The message looks like: Windows Service 'W32Time' is down
if msg:
    service = msg.split("'")[1]
    # Replace the message with just the service name; nothing is lost,
    # since the summary field still carries the original text.
    evt.message = service

Create an Event Manager Command
  1. In the Action pane, under Management select Event Manager and click the Commands tab
  2. Enter the name Start Windows Service and click Add
  3. Click the command just created and change Enabled to True
  4. For Command, enter the following: winexe -U "${dev/zWinUser}%${dev/zWinPassword}" //${dev/manageIp} "net start ${evt/message}"
  5. For the Where clause, enter Event Class | begins with | /Status/WinService
  6. Click Save

To test this, simply stop the Spooler service on a monitored system - it should automatically restart. If you stop and disable the service, you should receive an alert.
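
The restart command itself can also be sanity-checked by hand from the Zenoss server before trusting the event command to fire it. The credentials and address below are placeholders - substitute a real monitored host:

# DOMAIN\zenuser, password, and 10.0.0.5 are placeholders only.
winexe -U "DOMAIN\zenuser%password" //10.0.0.5 "net start Spooler"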


Changing the graphs shown under the Perf tab in Zenoss

The default Windows class in Zenoss includes several SNMP-based performance graphs. Since I (and most Windows admins) don't use SNMP, I want to replace these with custom graphs built on WMI. I have already created my Performance Templates, and now I need to select them as the default performance graphs.

  1. Browse to /Devices/Server/Windows and click the Templates tab
  2. Select Available Performance Templates | Bind Templates
  3. Select the template or templates to include and click OK

Now when you browse to a server and click the Perf tab, the graphs included in the template you chose will appear.

There is actually much more to what is happening here than defining what graphs appear where, but since this was my first question about the graphs, that's how I am stating it. More information about binding templates can be found at http://www.zenoss.com/community/docs/zenoss-guide/2.2.4/ch13s03.html

Configuring email in Zenoss

Where would we be without email.... The world would slow to a crawl. So how do we configure Zenoss to send us emails?

Enable Emails
  1. In the web console, select Management | Settings
  2. On the Settings tab, enter the SMTP Host and FROM: addresses, click Save

Setting a user for email
  1. Click the Users tab and select the appropriate user
  2. Enter the email address and click Save
  3. On the Alerting Rules tab, click Alerting Rules | Add Alerting Rule
  4. Enter a name for the rule and click OK
  5. Click the rule to open its settings
  6. Change the Delay from 0 seconds to 600 -- this will force the alerts to age for 10 minutes before being sent
  7. Enable the rule and click Save

To test that email is working
  1. Select Management | Settings, then the Users tab
  2. Next to the user's email address, click TEST
  3. Validate that the email is received

Windows Performance Monitoring in Zenoss using WMI

Now that we have our ZenPack, we need to begin making it do something for us. The goal here is to monitor Windows server performance using WMI and alert on overage. To do this we are going to start by importing a Perl script to access WMI.

Importing the Perl script
  1. There is a ZenPack available from Zenoss called Perfmon that includes a script called perfmon.pl. We could simply import this ZenPack and then reference the file, but I am going to import the file itself to keep things simple in the future.
  2. Download and extract the Perfmon ZenPack and copy perfmon.pl to $ZENHOME/ZenPacks/ZenPacks.<company>.<name>/ZenPacks/<company>/<name>/lib
  3. Install the wmi-client tools -- apt-get install wmi-client
  4. Make a symbolic link for winexe -- ln -s /usr/bin/winexe /usr/local/zenoss/zenoss/bin/
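
Before wiring the script into a data source, it can be tested by hand from the Zenoss server. A sketch with placeholder host and credentials, using the same arguments as the Command Template in the next section:

# 10.0.0.5, DOMAIN\zenuser, and password are placeholders only.
$ZENHOME/ZenPacks/ZenPacks.<company>.<name>/ZenPacks/<company>/<name>/lib/perfmon.pl 1 "10.0.0.5" "DOMAIN\zenuser" "password" "\Processor(_Total)\% Processor Time" "cpu_ProcessorTime"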

Create the performance template - Data Sources
  1. Browse to Classes | Devices | Server | Windows and click the Templates tab
  2. Create a new template by selecting Available Performance Templates | Add Template
  3. Enter the name "Basic Windows Performance" and click OK
  4. Create a new Data Source by clicking Data Sources | Add Data Source
  5. For the name, enter dsCpuPercentage, with a Type of Command
  6. For Command Template, enter: $$ZENHOME/ZenPacks/ZenPacks.<company>.<name>/ZenPacks/<company>/<name>/lib/perfmon.pl 1 "${dev/manageIp}" "${dev/zWinUser}" "${dev/zWinPassword}" "\Processor(_Total)\% Processor Time" "cpu_ProcessorTime"
  7. Select DataPoints | Add Data Point
  8. Enter cpu_ProcessorTime and click Add

Create the performance template - Thresholds
  1. Browse to Classes | Devices | Server | Windows and click the Templates tab
  2. Click the Basic Windows Performance template
  3. Select Thresholds | Add Threshold
  4. Enter a name of thCpuPercentage, type MinMaxThreshold, click OK
  5. On the threshold page, select the appropriate Data Points
  6. For Max Value, type 80
  7. Set Event Class to /Perf/CPU
  8. Click Save


Create the performance template - Graph Definitions
  1. Browse to Classes | Devices | Server | Windows and click the Templates tab
  2. Click the Basic Windows Performance template
  3. Select Graph Definitions | Add Graph
  4. For name, enter CPU Utilization
  5. Select Graph Points | Add DataPoint, select the data points to include
  6. Click OK, Save

Include the template in the ZenPack

  1. Browse to Classes | Devices | Server | Windows and click the Templates tab
  2. Click the Basic Windows Performance template
  3. Select Performance Template | Add to ZenPack

Creating a ZenPack in Zenoss

So we have Zenoss installed and want to begin customizing it. The first thing we want to do is create a ZenPack to hold all customizations; that way we can back up and export the customizations and import them elsewhere as needed.

Create the ZenPack
  1. In the task pane, select Management | Settings
  2. Select the ZenPacks tab
  3. Click the drop-down and select Create a ZenPack
  4. Enter the name for the ZenPack - best practice suggests using the format ZenPacks.<company>.<name>
To add objects into the ZenPack, you can copy files into the ZenPack directory on the Linux system.
Most Zenoss configurations are stored under the $ZENHOME directory; on Ubuntu this is /usr/local/zenoss/zenoss, but it can be different for each installation. ZenPack files are stored in $ZENHOME/ZenPacks/ZenPacks.<company>.<name>/ZenPacks/<company>/<name>/, so if your ZenPack is named ZenPacks.CompanyXYZ.BasicWindows, the files would be stored in $ZENHOME/ZenPacks/ZenPacks.CompanyXYZ.BasicWindows/ZenPacks/CompanyXYZ/BasicWindows/
To include a custom script in the ZenPack, you would simply copy the file to the $ZENHOME/ZenPacks/ZenPacks.<company>.<name>/ZenPacks/<company>/<name>/lib directory, and then reference the script accordingly.
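
For example, bundling a custom script into the ZenPacks.CompanyXYZ.BasicWindows pack from above would look like this (myscript.pl is a hypothetical file name):

# myscript.pl is a placeholder for whatever script you want to bundle.
cp myscript.pl $ZENHOME/ZenPacks/ZenPacks.CompanyXYZ.BasicWindows/ZenPacks/CompanyXYZ/BasicWindows/lib/
chown zenoss:zenoss $ZENHOME/ZenPacks/ZenPacks.CompanyXYZ.BasicWindows/ZenPacks/CompanyXYZ/BasicWindows/lib/myscript.pl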

Installing Zenoss on Ubuntu

While looking at new system monitoring solutions, I came to the realization that whatever tool was chosen, it would still require a large outlay of man-hours to get it set up, configured, and working properly. Every tool is different, with its own configuration methods, alerting functions, graphing capabilities, and so on... So when I looked at open source solutions, I was quite pleased to see that Zenoss could be implemented easily and was simple to customize and expand.

Here is the process to get Zenoss Core installed on an Ubuntu server and begin monitoring a Windows server.

Download and install Ubuntu Server 8.10
  1. Download the ISO and burn it to a CD
  2. Insert the CD into a server and install it as you normally would
  3. Select to include the LAMP and OpenSSH modules
  4. You could use Workstation, but best practice dictates installing only the features you need

Download and install the DEB package file
  1. wget http://downloads.sourceforge.net/zenoss/zenoss-stack_2.3.3_x64.deb
  2. Install the DEB package file: sudo dpkg -i zenoss-stack_2.3.3_x64.deb
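
Before opening a browser, you can confirm the stack came up by checking that something is listening on port 8080 (netstat ships with a stock Ubuntu install):

# An entry on port 8080 means the Zenoss web interface is up.
sudo netstat -tlnp | grep 8080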

Log into the web interface
  1. From another system, open a web browser
  2. http://<ipaddress>:8080
  3. Username: admin, Password: zenoss

Set up the Windows admin account
  1. In the task pane, select Classes | Devices
  2. Select the zProperties tab
  3. Find the zWinPassword and zWinUser attributes and input the username and password
  4. NOTE: The zWinUser should be entered as domain\username

Add your first device
  1. In the task pane, select Management | Add Device
  2. Enter the Device Name, and set the Device Class Path to /Server/Windows
  3. Click Add Device


Devices can also be added by discovering networks, but that will be left for another day. The Zenoss installation completes in less than 30 minutes and can begin monitoring right away. Future to-dos include configuring email, overriding the Windows template to use WMI instead of SNMP, configuring service monitoring, creating automated responses to alerts (such as restarting services), creating a tiered monitoring environment with multiple collectors, and including additional system information in the server discovery.

Tuesday, April 07, 2009

Monitoring Brocade switches with Zenoss

There is a really great ZenPack for Zenoss that allows for monitoring and graphing of traffic on fiber switches. The ZenPack can be downloaded at http://www.zenoss.com/community/projects/zenpacks/brocade-switches, but the install isn't as flawless as you would hope. To install the ZenPack, do the following:
1. Within ZenOss, browse to Devices
2. Next to Sub-Devices, click the drop-down and select Add New Organizer
3. Enter "Blade Switches" for the title and click OK
4. Import the ZenPack as normal
5. Discover the FC devices as Blade Switches

Once discovered, all the ports appear under the OS tab and graphs can be viewed by clicking on each port.
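
If ports fail to appear after discovery, it may be worth confirming that the switch answers SNMP at all, in the spirit of the snmpwalk tests from the collector post. The address and community string below are placeholders:

# Substitute your switch's IP address and community string.
snmpwalk -v 1 -c public 10.0.0.20 .1.3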