Friday, December 26, 2008

Joining Linux to Active Directory

In the computer world there are always pros and cons to every technology. Windows is very good for general-purpose use and works great in an everyday workplace, but doesn't perform ideally in some high-performance environments. Linux can be tuned like crazy for high performance, but it is convoluted and confusing, and some basic Windows features are unavailable.

Recently I have been looking at Linux to run a high-performance Oracle database, but installing and setting it up is only a small part of the job; long-term management is the big issue. The first hurdle is user accounts: I already have a Windows domain and I don't want to make my admins manage different credentials on every box we run. In the past there has been NIS, some LDAP integrations, and even MS Services for Unix, but I have always been disappointed by what was available.
I stumbled across some software called likewise-open that makes that first hurdle a piece of cake. I installed it on an Oracle Enterprise Linux system in a few seconds without issue - no editing of text files, hacking of Kerberos packages, or hunting down RPMs - and before I knew it I was logging into my Oracle Linux system with my domain credentials...

For those who use Ubuntu (my personal choice), check out this post. Just execute the following steps and you're good to go:
  1. sudo apt-get update
  2. sudo apt-get install likewise-open
  3. sudo domainjoin-cli join fqdn.of.your.domain Administrator
  4. sudo /etc/init.d/likewise-open start
So an entire infrastructure can now be built on Linux, but still use AD authentication like users expect. Just add these steps into your build process and all things are shiny.

Tuesday, December 23, 2008

Networking Functions and Devices

Now that we have the basics of networking theory, what does this mean for network functions and devices? Let's start at the bottom of the OSI model and work up.
The physical layer is exactly that - the physical (or wireless) connectivity between devices. The physical layer includes the network interface cards (NIC) and cabling between the devices. The most common cabling in use today is twisted pair, which is defined in various categories based on the number of twists per inch in the cable; the more twists, the less interference from external electronics. The most common ratings are category (or CAT for short) 3, 5 and 6, with some "enhanced" versions available for higher performance.
  • CAT 3 is traditionally used for phone lines and is capable of transmitting up to 10Mb/sec
  • CAT 5 is the most commonly used networking cable and is capable of transmitting up to 100Mb/sec
  • CAT 6 is fairly new and capable of transmitting up to 1Gb/sec
Other physical mediums include coaxial (not commonly used in a datacenter), fiber (used for high-speed and long distances), and wireless.
Network devices at the physical layer include: hub/repeater - an unintelligent device that rebroadcasts/repeats signals to all available ports. No addressing or routing is performed, and collisions between packets from multiple devices can occur frequently.

The datalink layer handles the physical addressing of the devices on a network. Since several devices can share the same physical connection, the datalink layer utilizes the MAC address of each device to direct traffic. This is done by hardware built into the NIC, which only picks up traffic targeted at its address, thereby minimizing processor overhead on the receiving systems.
While the MAC address is normally hard-coded in a NIC, most devices today allow for a different MAC address to be configured via software. This is useful when two NICs on the same network have the same MAC address (supposedly shouldn't happen, but I have seen it occur), or when you need to pretend you're a different device (sometimes referred to as spoofing).
The MAC address follows a standardized format - a 12-digit hexadecimal number split into 2 parts. The first 6 digits identify the manufacturer of the network card (use to lookup the vendor), and the remaining 6 digits are assigned by the manufacturer to the individual card. More information on MAC addresses can be found at
Network devices at the datalink layer include: switch - a somewhat intelligent device that learns the addresses of devices around it, then inspects traffic to direct it only to the appropriate target. Once a connection between two devices is set up within a switch, unhindered communication can occur between the devices.
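As a rough illustration of the two-part MAC format described above, the Python sketch below splits an address into its vendor prefix and its device-specific half. The function name split_mac and the sample address are made up for the example; looking the prefix up against a vendor list is left out.

```python
import re
from typing import Tuple


def split_mac(mac: str) -> Tuple[str, str]:
    """Split a MAC address into (vendor prefix, device part), 6 hex digits each."""
    # Drop separators such as ':' or '-' and normalise to upper case.
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    if len(digits) != 12:
        raise ValueError(f"expected 12 hex digits, got {len(digits)}: {mac!r}")
    return digits[:6], digits[6:]


# Hypothetical address, used only to show the call.
vendor, device = split_mac("00:1A:2B:3C:4D:5E")
print("vendor prefix:", vendor)
print("device part:  ", device)
```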

The network layer is where logical addressing occurs, on top of the physical addressing of the datalink layer. This is where the IP address comes into play, allowing a human-defined separation and routing of traffic between departments, offices, or cities.
IP addressing follows a strict format defined by the IETF (Internet Engineering Task Force). The IP address is composed of 4 octets (or 32 binary digits), separated into 2 sub addresses: the network address and the machine address. The network address can be likened to the city/state/zip your mail is delivered to - a message can be sent from anywhere in the world and this unique address will deliver it to your local post office. The machine address can then be likened to the physical address of your home: once the local post office has the mail, it can deliver it to your location.
The sizes of the network and machine addresses, however, are not static; they can vary from network to network based on the administrator's design. This is where the subnet mask comes into play: it tells the devices how much of the IP address is the network address and how much is the machine address. The subnet mask can define as little as 1 binary digit for the network address and as many as 31, and the same goes for the machine address. More information about the history and design of IP addressing can be
Network devices at the network layer include: router - a more intelligent device that has a set of rules to direct traffic between networks. This device is not aware of the individual devices that exist on each network, but instead only about the topology of the network addresses from its point of view.
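To make the network/machine split concrete, here is a small Python sketch using the standard ipaddress module. The helper name split_address and the sample addresses are only illustrative; the point is that the same IP address yields a different network and machine part depending on the subnet mask.

```python
import ipaddress


def split_address(cidr: str):
    """Return (network address, machine part as an integer, network bits).

    Accepts either prefix notation ("192.168.10.37/24") or an explicit
    mask ("192.168.10.37/255.255.255.0").
    """
    iface = ipaddress.ip_interface(cidr)
    network = iface.network
    # The machine part is whatever the subnet mask does not cover.
    machine_part = int(iface.ip) & int(network.hostmask)
    return str(network.network_address), machine_part, network.prefixlen


# Same IP address, two different masks (sample values only).
for example in ("192.168.10.37/255.255.255.0", "192.168.10.37/20"):
    net, machine, bits = split_address(example)
    print(f"{example}: network={net} machine={machine} network bits={bits}")
```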

Up next - Routing...

Sunday, December 14, 2008

Networking 101

I recently was involved in several technical interviews for some technician and administrative positions, and was surprised by how little most of the people understood networking basics. Most of the answers I got in the interviews focused on:
  1. Hubs are for home offices
  2. Switches are smart hubs, and used in businesses
  3. Routers are used to link multiple offices

These definitions show a very limited understanding of network basics, and the fact that most of the interviewees thought that this was the whole answer really made me wonder.

So I am going to post several articles detailing the basics of networking from the ground up - i.e. from the cable up, not what acronyms are in use today.

Networking Basics -- The OSI Model

The first thing to know is that there is a logical sequence to how networks function. Way back in 1977 when networks began to emerge, a common model was developed to allow different systems to work together. The OSI model (Open Systems Interconnection) was created to define 7 hierarchical layers that work together - these layers are often easiest to understand from the bottom up.

Application
The application layer identifies communication partners, determines availability of resources, and synchronizes communication.

Presentation
The presentation layer is responsible for communicating between the application and the rest of the network stack. One main feature of this layer is to convert serialized data (i.e. a long stream of numbers and characters) into structured data such as XML or documents and other files. This layer is also responsible for controlling encryption and compression of data transmitted between computers.

Session
The session layer is responsible for controlling the sessions or connections between devices. Whenever network devices communicate, they normally start with a hello, contain several checkpoints, and end with a goodbye. In theory, if a connection is lost in an ungraceful manner (such as a power loss or broken cable), the session layer could reconnect and restart communications where it left off.

Transport
The transport layer is used to control reliability, recoverability, and additional network features. Features such as checksums, packet sequencing, and retransmissions are all functions to increase reliability and recoverability.

Features such as tunneling, VPN, and IPSec are additional network features provided by the transport layer.

Network
The network layer is responsible for directing or routing traffic from the source to the correct destination. The components on the network layer are not concerned with the ultimate delivery of the information, but simply with passing it on to the next logical step.

Similar to the post office example used for the datalink layer, the network layer can be viewed as a central routing office that takes mail from all over the country. The office then directs the mail to the correct state, county, or city - these smaller offices then direct the mail to the individual homes.

Datalink
This layer controls the communication that occurs on the physical layer. Each physical device has a corresponding physical address - sometimes referred to as a MAC address - which is like a house address. Since traffic from multiple systems can travel on the same network cable, the address is used to determine what information belongs at which location - similar to how the post office delivers mail to a house based on its address.

Physical
The physical layer refers to the physical connectivity (for wired networks) or frequency range (for wireless networks) used to connect machines together. This connectivity can be phone lines, coaxial, Cat5, or wireless.

This also includes the frequency or speed of data transmission. This is where the speed of the network is determined.

The OSI model can be likened to a pizza with multiple layers of toppings - each layer supporting the layer above it. For instance, a pizza may have a layer of cheese on the very top, followed by a layer of sausage, then the sauce, and finally the dough. Each of these layers can be substituted for something else (Canadian bacon and pineapple?), and layers can be added or removed (Pizza Pie? Cheesy Crust?). Some of these layers, however, are necessary (like the dough), and while their form can change (thin crust vs. thick crust), they still exist.

The OSI model is commonly remembered by using mnemonics - i.e. an easily remembered phrase that stands for something else. One popular mnemonic is Please Do Not Throw Sausage Pizza Away; note the capitalized letters PDNTSPA, which reference Physical, Datalink, Network, and so on...
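For anyone who prefers to see the phrase spelled out, the following Python sketch pairs each word of the phrase above with its layer, bottom up, and checks that the initials line up. The names OSI_LAYERS and mnemonic_pairs are just illustrative.

```python
# Layers listed from the bottom (layer 1) to the top (layer 7).
OSI_LAYERS = [
    "Physical", "Datalink", "Network", "Transport",
    "Session", "Presentation", "Application",
]
MNEMONIC = "Please Do Not Throw Sausage Pizza Away"


def mnemonic_pairs(phrase=MNEMONIC, layers=OSI_LAYERS):
    """Return (layer number, mnemonic word, layer name) tuples."""
    words = phrase.split()
    if len(words) != len(layers):
        raise ValueError("the phrase needs exactly one word per layer")
    pairs = []
    for number, (word, layer) in enumerate(zip(words, layers), start=1):
        # Each word must start with the same letter as its layer.
        if word[0].upper() != layer[0].upper():
            raise ValueError(f"{word!r} does not match layer {layer!r}")
        pairs.append((number, word, layer))
    return pairs


for number, word, layer in mnemonic_pairs():
    print(f"Layer {number}: {word:<8} -> {layer}")
```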

Now that we have some of the OSI basics covered, I will be following this post up with some tangible information about how the OSI model works today. If you're interested, more detailed information about the OSI model can be found at

Tuesday, September 09, 2008

Citrix “There are no items to display” when viewing Users

When attempting to view the active users on my Citrix servers I receive the message “There are no items to display”. This is my new Citrix 4.5 farm; the feature works in our old 3.0 farm and is relied upon quite a bit for locating users and such. A bit of googling turned up nothing useful: other people have seen the problem, but nobody knows the resolution.
A bit of testing and I realized that my test system (which has lots of non-standard stuff on it and configured) does list the active users, so the UI can show active users, but the other systems aren’t working. So what's different on my test box:
· 32 vs. 64 bit – my test system is 32bit, and some of the problem systems are also
· hotfix level – same on all systems
· system used in discovery – executing discovery locally on a problematic server still fails to list active users
· installed components – I updated a system to have all the same components
· license level – I changed the license levels to Enterprise on both, same issue
· zone name – both are in the same zone
· OS version – both are W2k3 SP2
· Connectivity to the data collector – I can successfully telnet to port 2512 on the DC from all systems
None of these differences appears to explain the issue. I contacted our vendor, who put me in contact with a Citrix rep, and we tried a few more things: various qfarm commands, some cleaning of the datastore, moving a server into its own farm, etc… Nothing seemed to resolve the issue.
Finally we deleted and recreated the ICA listener and the connections started showing. Something in the ICA config was broken, and since most of these servers were built from an image they all had the same problem. I started researching the differences between the working ICA listeners and the broken ones and noticed that the security was different. On the working listeners, the Local Service and Network Service accounts had Query and Message rights, and the broken listeners didn’t have any rights.
I added the two accounts with those rights to the ICA listener on a broken system and logged off. In the AMC I looked for Sessions and the ICA-Tcp session appeared; I logged back onto the server and my name appeared. A little more research showed that the RDP listener was failing due to the same issue; resetting the security on it allows the sessions and users to be displayed.

[Screenshots of the listener permissions: before the change, the changes made to both services on both listeners, and after the change]

Monday, September 08, 2008

Restricting Citrix users on the CSG

I have a need to restrict users who can access the Citrix Web Interface (v 4.6) from outside the company, yet allow everyone to access it from inside. Some searching turned up several references to people restricting access based on username or group membership (such as, but nobody looked at how the site was being accessed. My first thought was to setup 2 web interfaces (one internal and one external) and use one of these functions, but I needed other customizations and I didn't like the idea of having to keep them in sync.

A little searching in the out-of-the-box code and I realized there is an existing function called isUsingSecureGateway() that reports if the connection is being proxied via the Citrix Secure Gateway - exactly what I needed. Combine this function with examples from others and I have it all.

Step 1: Setup the code

I created a new file named /App_Data/wiCustomizations/CSGRestrict.aspxf to house the interface customizations. This file will contain the functions to check group membership and to search for membership within the group. The key here is to configure the variable groupsPermitted with the names of the security groups to allow access. The groupsPermitted value is comma delimited (i.e. add multiple groups with commas in between) and each entry is compared via regular expressions; this means that wildcards can be used to permit users. This file is stored as text on the web server and can easily be edited at any time to add/remove groups.

The second item to notice is that the LDAP call needs a username and password to access Active Directory. Unless you have configured IIS to run under a domain account (not suggested), you will need to add in a username here. Make sure this account has limited rights, since the password will be stored in clear text and anyone with access to the server will have access to the password.
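Before the customization code itself, the matching rule can be tried out in isolation. The Python sketch below is not part of the Web Interface; it only mimics how a comma-delimited groupsPermitted value is turned into case-insensitive patterns and tested against the distinguished names found in a user's memberOf attribute. The function name member_of and the sample group DNs are invented for the illustration, and no directory lookup is performed.

```python
import re


def member_of(member_of_dns, groups_permitted):
    """Return True if any group DN matches any permitted group pattern.

    member_of_dns    -- list of group distinguished names (sample data here)
    groups_permitted -- comma-delimited group names, each treated as a regex
    """
    patterns = [
        # Same shape as the customization: "cn=" + group + ".*,"
        re.compile("cn=" + group + ".*,", re.IGNORECASE)
        for group in groups_permitted.split(",")
    ]
    # search() matches anywhere in the DN, like an unanchored regex test.
    return any(p.search(dn) for dn in member_of_dns for p in patterns)


groups = "Domain Admins,.*csgaccess"
# Invented DNs for illustration only.
print(member_of(["CN=Sales-CSGAccess,OU=Groups,DC=example,DC=com"], groups))
print(member_of(["CN=Domain Users,CN=Users,DC=example,DC=com"], groups))
```

Because each entry is a regular expression, an entry such as .*csgaccess permits any group whose name ends in csgaccess, which is worth keeping in mind when choosing group names.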

bool CSGAllow(string username, string domain)
{
    // Comma-delimited list of permitted groups; each entry is treated as a regular expression
    string groupsPermitted = "Domain Admins,.*csgaccess";
    bool bAllowed = MemberOf(username, domain, groupsPermitted);
    return bAllowed;
}

bool MemberOf(string userName, string domainName, string groupNames)
{
    bool retVal = false;

    // Replace "username" and "password" with a limited-rights account that can read Active Directory
    System.DirectoryServices.DirectoryEntry dirEntry = new System.DirectoryServices.DirectoryEntry("LDAP://" + domainName, "username", "password");
    System.DirectoryServices.DirectorySearcher dirSearcher = new System.DirectoryServices.DirectorySearcher(dirEntry);
    dirSearcher.Filter = string.Format("(sAMAccountName={0})", userName);

    System.DirectoryServices.SearchResult dirSearchResult = dirSearcher.FindOne();
    if (dirSearchResult == null)
    {
        // The account was not found, so it cannot be in a permitted group
        return false;
    }

    // Check every group the user is a member of
    int propCount = dirSearchResult.Properties["memberOf"].Count;
    for (int i = 0; i < propCount; i++)
    {
        string member = dirSearchResult.Properties["memberOf"][i].ToString();
        foreach (string strGroup in groupNames.Split(",".ToCharArray()))
        {
            System.Text.RegularExpressions.Regex myPattern = new System.Text.RegularExpressions.Regex("cn=" + strGroup + ".*,", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
            if (myPattern.IsMatch(member))
            {
                retVal = true;
            }
        }
    }
    return retVal;
}

Step 2: Link into the login process

Two files need to be edited here – the first is /auth/login.aspx to reference our customized code

<!--#include file="~/app_data/wiCustomizations/serverscripts/csgRestrict.aspxf"-->

The second file to edit is /App_Data/auth/serverscripts/login.aspxf to initiate the security group check. The inserted code is highlighted below.

// Make sure none of the credential fields contain control characters
if( Strings.hasControlChars( user )
    || Strings.hasControlChars( password )
    || Strings.hasControlChars( domain )
    || Strings.hasControlChars( context )
    || Strings.hasControlChars( passcode ) ) {
    messageCenterControl.setType( MessageCenterControl.MSGTYPE_ERROR );
    messageCenterControl.setKey( "InvalidCredentials" );
} else
if (expAuth is ExplicitNDSAuth) {
    // Defer to special NDS processing code that handles context lookup
    result = loginAuthenticateNDS((ExplicitNDSAuth) expAuth, user, password, passcode, context);
} else
if (isUsingSecureGateway() && !CSGAllow(user, domain))
{
    // Inserted check: refuse logins that arrive through the CSG unless the user is in a permitted group.
    // Assumed body: it mirrors the error handling above and uses the NoCSGAccess key defined in Step 3.
    messageCenterControl.setType( MessageCenterControl.MSGTYPE_ERROR );
    messageCenterControl.setKey( "NoCSGAccess" );
}
else
{
    // Not NDS - parse the fields and go do any Two-factor
    ExplicitUDPAuth udpAuth = (ExplicitUDPAuth)expAuth;

Step 3: Define a custom error message

The final step is to set up a custom error message alerting users they don't have access via the CSG. Using the Citrix article as guidance I copied the file to the languages folder in the web interface. I then edited the file and added the line below, ensuring the key NoCSGAccess matches the string that is called from login.aspxf above.

NoCSGAccess=This account has not been approved for accessing the Web Interface via the CSG

Wednesday, September 03, 2008

Configuring Citrix on Ubuntu

Installing the Citrix client is easy, just download and execute it

Opening a connection however fails with the error
You have not chosen to trust "/C=US/ST=/L=/O=Equifax Secure Certificate Authority/CN=", the issuer of the server's security certificate

To fix this, we need to download the root CA certs.
Download all Equifax certificates, rename them from .CER to .CRT, and then as root copy them to /usr/lib/ICAClient/keystore/cacerts
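If there are more than a couple of certificates, the rename-and-copy step can be scripted. The Python sketch below is one way to do it, assuming the downloaded .CER files sit together in a single directory; the function name install_ca_certs is made up, and the default destination is the keystore path given above. It has to run as root to write there.

```python
import shutil
from pathlib import Path


def install_ca_certs(download_dir, keystore_dir="/usr/lib/ICAClient/keystore/cacerts"):
    """Copy every .cer file in download_dir into keystore_dir as .crt.

    Returns the list of files written. The source files are left in place.
    """
    destination = Path(keystore_dir)
    copied = []
    for cert in sorted(Path(download_dir).iterdir()):
        # Match .cer regardless of case (.CER, .cer, ...).
        if cert.is_file() and cert.suffix.lower() == ".cer":
            target = destination / (cert.stem + ".crt")
            shutil.copyfile(cert, target)
            copied.append(target)
    return copied
```

A call would look like install_ca_certs("/home/me/Downloads/equifax"), where the download directory is whatever location the certificates were saved to.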

Voilà! The published apps now open.

Monday, August 11, 2008

“Windows did not load your roaming profile”

I recently set up a new terminal server for a customer and it was reported that they were receiving errors when logging on. I logged on as a test account and received the following errors:

Windows did not load your roaming profile and is attempting to log you on with your local profile. Changes to the profile will not be copied to the server when you log on. Windows did not load your profile because a server copy of the profile already exists that does not have the correct security.

Either the current user or the administrator's group must be the owner of the folder. 


Windows cannot find the local profile and is logging you on with a temporary profile. Changes you make to this profile will be lost when you log off.


A bit of googling reports that the security on the user's profile path is incorrect – specifically the ownership of the path is wrong. MS Technet ( says that the profile should be owned by the user or the administrators group. I checked the folder ownership and indeed the owner was misconfigured; however, the user could still successfully log onto other servers without error – so the error is somewhat random.

According to MS, checking the owner of the profile started in W2k SP4 which is the minimum baseline of our environment, so that doesn't look like the culprit. There is a profile setting that can be configured to ignore this checking, but that downloads the network profiles into a TEMP folder instead of downloading them locally (which isn't how our other systems are working). It is possible that our imaging process has some unknown hack that turns this checking off, but that just seems stupid.


Thursday, July 24, 2008

Automating Citrix Presentation Server installation

We recently began upgrading our Presentation Server environment to CPS 4.5, and with quite a few servers to install we need a way to automate it. Currently we use some imaging software to clone the hard disk of a "gold" image; while this works for CPS 3.0, I have been warned against using the same process with 4.5. That, and I really don't like the idea of imaging - it's a hack that isn't necessary.

So I began looking for how to automate the installation - once the farm is set up it should be a simple matter of passing some parameters on a command line to add a new server. Some quick googling turns up a blog by stealthpuppy that details everything I need (
I am in the process of testing this now, and I expect that I can merge this script with other MSIExec options ( to automate the install of Presentation Server and all necessary patches.

Tuesday, July 22, 2008

Variant types in SCOM

One of the items I have tried to have SCOM do is monitor network traffic on some of our routers/switches. The basics of this are fairly simple – create an SNMP monitor to poll on a set frequency and then store the result as performance data. However, I ran into issues with SCOM reporting that the variable type was wrong. The web site has a nice breakdown of what the variant types are when referenced by the MP.

  • Empty = 0
  • Null = 1
  • Short = 2
  • Integer = 3
  • Single = 4
  • Double = 5
  • Currency = 6
  • Date = 7
  • String = 8
  • Object = 9
  • Error = 10
  • Boolean = 11
  • Variant = 12
  • DataObject = 13
  • Decimal = 14
  • Byte = 15
  • Char = 16
  • Long = 17

Wednesday, July 16, 2008

My Printing Nightmare

Often times printing is a simple thing, especially in Windows. The print server hosts all the drivers and the client needs only to point to the server to download the necessary drivers and map to the printer. However some situations aren't quite so simple.

Take for instance printing in a Citrix or other Terminal Server environment. The multi-user environment means that a particular print driver can be loaded dozens of times concurrently, and if it isn't designed to do this then you can be in big trouble. Our environment is a little different from other TS environments – we host the Terminal Server and Print Server at our location and we have no knowledge of the client's network. The printers are mapped via a logon script from the TS to the PS using model-specific drivers, and because of our technology model we can't use the Citrix Universal Print Driver. This sets the stage for the issues we are about to see.

Some of the issues we are seeing:

  • Printers aren't mapping correctly
  • Some printers fail to work intermittently
  • Printing works on some servers but not others, and the printers having issues change depending on who is logged in where
  • Adobe Acrobat freezes when opening docs
  • The issue comes and goes intermittently

So to begin troubleshooting I do my favorite item – I google the issue. Turns out this is a common thing in Citrix and TS environments where some print drivers cause havoc with the spooler, which is tied into just about everything on the computer. So I find a doc by HP that details all their printers that are supported in a Citrix environment ( and I remove all that aren't supported. I look at other vendor sites to see about their drivers and find some that are supported in Citrix, some that aren't, and a lot that just don't say. First step down and the environment looks better.

I also notice that our logon script is using a file "prnadmin.dll" to map printers. I have never seen anyone use this before – google does turn up some results – but I wonder if that may be part of the problem. I rewrite the script to use the WScript.Network object's oNetwork.AddWindowsPrinterConnection method from VBScript instead, and add in lots of error checking and reporting to ensure things are working properly.

So far so good, the issue appears to be stabilizing – until we added another printer. This specific HP printer doesn't have a PS driver included, instead they state that we should use the HP Universal Print Driver for it. So we install the driver on the Print Server and almost immediately begin having issues, different issues but more common issues.

So I call Microsoft for help, and the first-tier support I talk to wastes most of my day without making any progress toward a solution. After I get off the phone with them I realize I can delete the print drivers from all the TSs and restart the spooler to make the issue go away, so I do that on all the servers to get the users up and running again. It's not a solution, but at least it works.

MS calls back and gives me a process to clear out all the 3rd party print processors and monitors from the servers. I begin hacking the registry as instructed with little success – the issues keep coming up several times a day.

I spend several more days battling the nightmare before we finally decide to call a Citrix expert. He comes in and almost immediately identifies the HP Universal Print Driver as having issues – that's odd since HP specifically states that it can (and should) be used in a Citrix environment; we were even hoping that this universal driver would be our salvation for this issue. We call a couple of other Citrix gurus and they both say the same – the HP UPD has "weird" issues in a Citrix environment. So we uninstall the driver and bump the affected printers to use the HP 4000 drivers that come with Windows.

The Citrix guys do give a glimmer of hope, however: the additional steps taken to limit the drivers, processors, and monitors may have already resolved the initial issue, and the most recent symptoms may have been caused by the HP UPD. Hmmm… I am skeptical but hopeful. I flush all the drivers from all the Terminal Servers to ensure they are reloaded from the print server and cross my fingers.

A few days later some issues are still occurring. It's the weekend and one of the IT admins is checking out the TSs to ensure they are ready for Monday, and she is having lots of problems trying to print. Long story short – the admin had nearly 400 printers mapped in her profile and it was taking several minutes for the spooler to recognize them all. When we logged her out and logged in as a normal user with 3-4 printers mapped, things worked fine. I had her remove the other printers from her profile and it sounds like things are working.

It's been a couple more days and only 1 issue has been reported – much better than the 4-5 issues per day. Below are the items done to get things working again.

  • Use only print drivers certified to work in a Citrix environment
  • Don't use any Version 2 print drivers, only Version 3
  • Don't use the HP UPD
  • Only use Microsoft default Print Processors and Print Monitors
  • Limit the number of printer drivers in use in the environment
  • Use delprof.exe (downloadable from MS) to delete old cached profiles

There is probably more, and I doubt this is the final step to resolution, but at least it's better.

Monday, July 14, 2008

Monitoring services that are not set to Automatic start

The default service monitor employed in SCOM intelligently identifies if the service is set to Automatic startup and only monitors the service if it is. This is a helpful default so that if you set the service to Manual startup or Disabled for some administrative reason, then you don't get alerted on the service stopping. But what if there are services that you want to ensure are running regardless of the startup type? Maybe you want to ensure the AntiVirus is running everywhere and you want to be alerted if someone tries to disable it. Maybe you have a key service like SQL or Exchange that you manually start after any reboots and you still need to know when it goes down or fails to be started. This post describes how to monitor a service state regardless of the startup type.

Step 1: Create the service monitor

In the SCOM Authoring Console, select the Health Model tab and the Monitors node

Select New | Windows Services | Basic Service Monitor

Enter in the applicable information in the General and Service Name tabs and click finish

Step 2: Edit the monitor to ignore startup type

Locate the monitor just created and open its properties

On the Configuration tab you will see 3 attributes

  • ComputerName
  • ServiceName
  • CheckStartupType

The last attribute is the important one and we need to set it to false, however simply typing false and clicking OK will not work for us.

On the Configuration tab click the Edit… button at the bottom of the window. There you will see something similar to this:

<Configuration p1:noNamespaceSchemaLocation="C:\ Temp\1\Client Telnet Monitor.xsd" xmlns:p1="">



<CheckStartupType />


The line "<CheckStartupType />" needs to be changed to "<CheckStartupType>false</CheckStartupType>" for the value to take effect.

Save the XML file, save the Monitor properties, and save the MP. The service monitor should now ignore the startup type of the service you specified to monitor.
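The same edit can be illustrated outside the Authoring Console. The Python sketch below takes a simplified copy of the monitor configuration and fills in the CheckStartupType element; the function name disable_startup_check and the sample computer and service values are assumptions for the example, and the namespace attributes of the real file are left out.

```python
import xml.etree.ElementTree as ET


def disable_startup_check(config_xml: str) -> str:
    """Return the configuration XML with CheckStartupType set to false."""
    root = ET.fromstring(config_xml)
    node = root.find("CheckStartupType")
    if node is None:
        # Add the element if the configuration does not have one yet.
        node = ET.SubElement(root, "CheckStartupType")
    node.text = "false"
    return ET.tostring(root, encoding="unicode")


# Simplified sample; element values are placeholders, not real SCOM output.
sample = (
    "<Configuration>"
    "<ComputerName>SERVER01</ComputerName>"
    "<ServiceName>TlntSvr</ServiceName>"
    "<CheckStartupType />"
    "</Configuration>"
)
print(disable_startup_check(sample))
```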

Monitoring Servers based on Server Name in SCOM

Most organizations have a standard naming scheme that they use based on server type, location, or department. For instance, the letters "DC" suggests a domain controller, "Mail" or "EX" suggests exchange, "SQL" suggests SQL servers, and "DR" could suggest disaster recovery (for location, not role). Based on these name standards we may want to manage the servers differently: do we really care about excessive network utilization on backup devices, or minor alerts on development systems?

Creating a group based on name is easy – simply create a group with dynamic criteria based on the name – but then applying overrides and monitors to it contains many issues. If you apply a monitor to the group, then when one of the systems in the group errors, the group will report the error without any detail on which machine caused it. To resolve this, I decided to create classes of servers based on their names.

Step 1: Create a Local Application class

In the SCOM Authoring Console, select the Service Model tab and the Classes node

Select New | Windows Local Application

Enter an appropriate ID and Name; there are no Key Properties, so that can be left blank

Step 2: Create a Discovery for the class

Select the Health Model tab and choose the Discoveries node

Select New | Registry (Filtered)

Enter the ID and Name; for Target select Windows Computer or some similarly generic class

Leave Schedule and Computer as they are for now

On the Registry Probe Configuration step, click Add and select the following:

  • Object Type: Value
  • Key: ComputerName
  • Path: SYSTEM\CurrentControlSet\Control\ComputerName\ActiveComputerName\ComputerName
  • Attribute Type: String

On the Build Event Expression step, click Insert and select the following:

  • Parameter Name: Values/ComputerName
  • Operator: Matches regular expression
  • Value: (?i)dc.*

On the Discovery Mapper step, select the Class ID created in step 1, and select the Principal Name attribute.

Step 3: Export to SCOM and test

A few notes:

  • The path on the Registry Probe Configuration includes the path and attribute. My first round assumed that the Key property was appended to the Path property and it took me a while to identify my failure
  • The Value in the expression uses (?i), which is a regex flag stating to ignore case. This may or may not be needed in all environments
  • The dc.* in the event expression means that all computers whose names start with the letters "DC" will be included. This should change based on your naming schemes
  • Class types other than Local Application may work for this as well, this is just the most familiar to me for now
  • A WMI discovery can be done instead of a registry discovery if your systems are all W2k3 or greater. The reason for this is because prior OS's cannot handle the LIKE clause in the WMI query you need to run.
  • The Schedule defaults to 1/hour. This is helpful for the initial deployment, but you may want to decrease it to several hours or a day to limit overall performance
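Before exporting the MP it can help to try the expression against a few sample names. The sketch below uses Python's re module purely as a stand-in for the SCOM regex engine, and the server names are made up. It assumes the engine searches anywhere in the value unless the pattern is anchored; if that holds in your environment, adding ^ keeps the match to names that actually start with "DC".

```python
import re

# Hypothetical server names, for illustration only.
names = ["DC01", "dc-east-02", "ADC01", "SQLDC1", "MAIL01"]

def matching(pattern, candidates):
    """Return the candidates in which the pattern is found anywhere (unanchored search)."""
    rx = re.compile(pattern)
    return [n for n in candidates if rx.search(n)]

# Pattern as used in the discovery above.
unanchored = matching(r"(?i)dc.*", names)
# Same pattern anchored to the start of the name.
anchored = matching(r"(?i)^dc.*", names)

print(unanchored)
print(anchored)
```

If the two lists differ for your own naming scheme, the anchored form is probably the one you want in the Value field.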

Adding custom data to SCOM alerts

Gratuitously stolen from

Adding custom information to alert descriptions and notifications

This is just a dump of some alert description variables I pulled from several other bloggers:

Custom Properties for Alert Description and Notification:

Alert Description Variables:

For event Rules:

EventDisplayNumber (Event ID): $Data/EventDisplayNumber$
EventDescription (Description): $Data/EventDescription$
Publisher Name (Event Source): $Data/PublisherName$
EventCategory: $Data/EventCategory$
LoggingComputer: $Data/LoggingComputer$
EventLevel: $Data/EventLevel$
Channel: $Data/Channel$
UserName: $Data/UserName$
EventNumber: $Data/EventNumber$

For event Monitors:

EventDisplayNumber (Event ID): $Data/Context/EventDisplayNumber$
EventDescription (Description): $Data/Context/EventDescription$
Publisher Name (Event Source): $Data/Context/PublisherName$
EventCategory: $Data/Context/EventCategory$
LoggingComputer: $Data/Context/LoggingComputer$
EventLevel: $Data/Context/EventLevel$
Channel: $Data/Context/Channel$
UserName: $Data/Context/UserName$
EventNumber: $Data/Context/EventNumber$

For Repeating Event Monitors:

EventDisplayNumber (Event ID): $Data/Context/Context/DataItem/EventDisplayNumber$
EventDescription (Description): $Data/Context/Context/DataItem/EventDescription$
Publisher Name (Event Source): $Data/Context/Context/DataItem/PublisherName$
EventCategory: $Data/Context/Context/DataItem/EventCategory$
LoggingComputer: $Data/Context/Context/DataItem/LoggingComputer$
EventLevel: $Data/Context/Context/DataItem/EventLevel$
Channel: $Data/Context/Context/DataItem/Channel$
UserName: $Data/Context/Context/DataItem/UserName$
EventNumber: $Data/Context/Context/DataItem/EventNumber$
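The three event lists differ only in the path prefix in front of the field name. The Python sketch below makes that pattern explicit; the prefixes come straight from the lists above, while the dictionary keys and the function name are just labels invented for this illustration.

```python
# Path prefixes as they appear in the three event lists above.
PREFIXES = {
    "rule": "Data",
    "monitor": "Data/Context",
    "repeating_monitor": "Data/Context/Context/DataItem",
}

def event_variable(workflow, field):
    """Build the $...$ placeholder for an event field in the given workflow type."""
    return "$" + PREFIXES[workflow] + "/" + field + "$"

for workflow in PREFIXES:
    print(workflow, event_variable(workflow, "EventDescription"))
```

This only restates the naming pattern; whether a given field is actually populated still depends on the data source behind the rule or monitor.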

Performance Threshold Monitors:

Object (Perf Object Name): $Data/Context/ObjectName$
Counter (Perf Counter Name): $Data/Context/CounterName$
Instance (Perf Instance Name): $Data/Context/InstanceName$
Value (Perf Counter Value): $Data/Context/Value$

Service Monitors:

Service Name: $Data/Context/Property[@Name='Name']$
Service Dependencies: $Data/Context/Property[@Name='Dependencies']$
Service Binary Path: $Data/Context/Property[@Name='BinaryPathName']$
Service Display Name: $Data/Context/Property[@Name='DisplayName']$
Service Description: $Data/Context/Property[@Name='Description']$

Logfile Monitors:

Logfile Directory : $Data/Context/LogFileDirectory$
Logfile name: $Data/Context/LogFileName$
String: $Data/Context/Params/Param[1]$

Logfile rules:

Logfile Directory : $Data/EventData/DataItem/LogFileDirectory$
Logfile name: $Data/EventData/DataItem/LogFileName$
String: $Data/EventData/DataItem/Params/Param[1]$


$Data/Context/DataItem/AlertId$ The AlertID GUID

$Data/Context/DataItem/AlertName$ The Alert Name

$Data/Context/DataItem/Category$ The Alert category




$Data/Context/DataItem/CreatedByMonitor$ True/False

$Data/Context/DataItem/Custom1$ CustomField1

$Data/Context/DataItem/Custom2$ CustomField2

$Data/Context/DataItem/Custom3$ CustomField3

$Data/Context/DataItem/Custom4$ CustomField4

$Data/Context/DataItem/Custom5$ CustomField5

$Data/Context/DataItem/Custom6$ CustomField6

$Data/Context/DataItem/Custom7$ CustomField7

$Data/Context/DataItem/Custom8$ CustomField8

$Data/Context/DataItem/Custom9$ CustomField9

$Data/Context/DataItem/Custom10$ CustomField10

$Data/Context/DataItem/DataItemCreateTime$ UTC Date/Time of Dataitem created

$Data/Context/DataItem/DataItemCreateTimeLocal$ LocalTime Date/Time of Dataitem created

$Data/Context/DataItem/LastModified$ UTC Date/Time DataItem was modified

$Data/Context/DataItem/LastModifiedLocal$ Local Date/Time DataItem was modified

$Data/Context/DataItem/ManagedEntity$ ManagedEntity GUID

$Data/Context/DataItem/ManagedEntityDisplayName$ ManagedEntity Display name

$Data/Context/DataItem/ManagedEntityFullName$ ManagedEntity Full name

$Data/Context/DataItem/ManagedEntityPath$ Managed Entity Path

$Data/Context/DataItem/Priority$ The Alert Priority Number (High=1,Medium=2,Low=3)

$Data/Context/DataItem/Owner$ The Alert Owner

$Data/Context/DataItem/RepeatCount$ The Alert Repeat Count

$Data/Context/DataItem/ResolutionState$ Resolution state ID (0=New, 255= Closed)

$Data/Context/DataItem/ResolutionStateLastModified$ UTC Date/Time ResolutionState was last modified

$Data/Context/DataItem/ResolutionStateLastModifiedLocal$ Local Date/Time ResolutionState was last modified

$Data/Context/DataItem/ResolutionStateName$ The Resolution State Name (New, Closed)

$Data/Context/DataItem/ResolvedBy$ Person resolving the alert

$Data/Context/DataItem/Severity$ The Alert Severity ID

$Data/Context/DataItem/TicketId$ The TicketID

$Data/Context/DataItem/TimeAdded$ UTC Time Added

$Data/Context/DataItem/TimeAddedLocal$ Local Time Added

$Data/Context/DataItem/TimeRaised$ UTC Time Raised

$Data/Context/DataItem/TimeRaisedLocal$ Local Time Raised

$Data/Context/DataItem/TimeResolved$ UTC Date/Time the Alert was resolved

$Data/Context/DataItem/WorkflowId$ The Workflow ID (GUID)


The Web Console URL

$Target/Property[Type="Notification!Microsoft.SystemCenter.AlertNotificationSubscriptionServer"]/PrincipalName$ The principal name of the management server

$Data/Recipients/To/Address/Address$ The name of the recipient (e.g. the Email alias to which the notification is addressed)

Monday, June 30, 2008

Monitoring Logon Scripts via SCOM

In a large environment you need to know when something goes wrong. This is not necessarily because you want to know every issue that can occur, but because a single small issue can quickly grow into many small issues, which together become one large issue.

Say for instance that you have a logon script that maps the users' home folders and some shared printers in a terminal server / Citrix environment. If a user's printer fails to map correctly, that could be the first symptom of a larger environmental issue that is affecting all users – whether they report it or not. So if we monitor this activity in SCOM then we will know what is failing, where it is failing, who it is failing for, and when it is failing – all of which could hopefully direct us to why it is failing.

The solution is fairly simple: edit the logon script to identify and report failures, create a SCOM MP to catalog those failures.

Our logon script uses VBScript, so a simple subroutine can be added to check for an error code and then write the error to the NT Event Log. Then just call this sub wherever something may fail.

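' Assumes oShell (WScript.Shell), userName, WriteToLog and EVENT_ERROR (= 1) are defined elsewhere in the logon script
SUB CheckError(strSource)
    IF Err.Number <> 0 THEN
        ' Record the failure in the script's own log
        WriteToLog strSource & " -- " & Err.Description & " " & Err.number & " (" & Hex(Err.number) & ")"
        ' Write the same details to the NT Event Log so SCOM can pick them up
        oShell.LogEvent EVENT_ERROR, "Printer.vbs" & vbCrLf & userName & " - " & strSource & " -- " & Err.Description & " " & Err.number & " (" & Hex(Err.number) & ")"
    END IF
END SUB
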
The SCOM MP simply uses the NT Event Rule to look for these events and report them.

Friday, June 27, 2008

Changing the PrintProcessor on a Windows 2000 Print Server

One of our customers has a W2k print server with several hundred (nearly 600) printers. Half are used for a program named CreateForm and therefore have a print processor named CfPrint, but the rest have a mixture of WinPrint and some other HP processors. They recently began having printer problems that we couldn't diagnose, but the idea came up to limit the number of variables in play: drivers, configurations, processors, etc. So it's up to me to make nearly 300 print queues use WinPrint as their processor.

Under W2k3 this would be easy: simply write a WMI script to query the printers and then set the processor to WinPrint. W2k, however, exposes these properties as read-only, so back to the drawing board.

It looks possible to set the processor using printui.dll, but it requires a little knowledge to make it work. First, it doesn't seem to set the processor when run locally; instead I had to set it remotely from a W2k3 system. Second, it has a complex command line that can be difficult to handle. I came up with the following:

rundll32 printui.dll,PrintUIEntry /Xs /n "\\server\printer" PrintProcessor WinPrint

Now that we have a command to set the processor, we need a list of printers to change. As I stated earlier, half of these are used in CreateForm and therefore can't be changed, so we need a list of all printers that are not using the CfPrint processor. The command line utility WMIC works well for this; the following command gives a listing of all printers and their processors.

wmic /node:server path win32_printer get DeviceID, PrintProcessor

Pipe the output into Excel or some other tool to filter out anything you don't want to change, and save the list of printer names to a text file named printers.txt. Now simply iterate through the file and execute the printui command for each entry. The "delims=" option makes for /F read each line whole, so printer names that contain spaces are not cut off at the first space.

for /F "delims=" %i in (printers.txt) do rundll32 printui.dll,PrintUIEntry /Xs /n "\\server\%i" PrintProcessor WinPrint

That's it! You can re-run the WMIC command to ensure you got everything. Of course, if you're doing as many printers as I am, it may take a few minutes for everything to settle properly.
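If Excel is not handy, the filtering step can also be scripted. The following Python sketch is only an illustration: it assumes the WMIC listing has been saved as text in two fixed-width columns (DeviceID, then PrintProcessor), and the printer names and the HPPRN05 processor are invented sample data. Real WMIC output may need decoding or cleanup before it parses this neatly.

```python
# Rows standing in for the output of:
#   wmic /node:server path win32_printer get DeviceID, PrintProcessor
# Printer names and the fixed-width layout are assumptions for this sketch.
ROWS = [
    ("DeviceID", "PrintProcessor"),
    ("Accounting 1", "CfPrint"),
    ("Lobby LaserJet", "HPPRN05"),
    ("Warehouse", "WinPrint"),
]
SAMPLE = "\n".join(name.ljust(18) + proc for name, proc in ROWS)

def printers_to_change(text, keep=("cfprint", "winprint")):
    """Return printer names whose processor is not in `keep` (case-insensitive)."""
    lines = [l.rstrip() for l in text.splitlines() if l.strip()]
    split_at = lines[0].index("PrintProcessor")  # column offset of the second field
    result = []
    for line in lines[1:]:
        name = line[:split_at].strip()
        processor = line[split_at:].strip()
        if name and processor.lower() not in keep:
            result.append(name)
    return result

def printui_command(server, printer):
    """Build the printui command line shown above for one printer."""
    return ('rundll32 printui.dll,PrintUIEntry /Xs /n "\\\\{}\\{}" '
            'PrintProcessor WinPrint').format(server, printer)

for p in printers_to_change(SAMPLE):
    print(printui_command("server", p))
```

Printers already on WinPrint are skipped along with the CfPrint ones, since neither group needs a change.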

Wednesday, June 18, 2008

Using SCOM to monitor SQL Error Logs

Our SQL DBA reported that occasionally he will see errors in the SQL Error Logs similar to:

SQL Server has encountered 45 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [T:\MSSQL\Data\tempdev6_Data.NDF] in database [tempdb] (2). The OS file handle is 0x000005B4. The offset of the latest long IO is: 0x0000000c7b2000

Monitoring for this alert at first sounded simple, but as I dug into it I realized it could be much more difficult than I expected. First off, this error is reported in the SQL Error Logs and only in the SQL Error Logs, so we can't simply use the NT Event log to alert us. Secondly, depending on how you install SQL, these error logs could be anywhere on the system. And lastly, this needs to be monitored on both SQL 2000 and SQL 2005 systems, so xp_ReadErrorLog won't work for us.

So the breakdown seems fairly simple: find out where the logs are, use the text log monitor, and search for the string.

1. Find out where the SQL Error logs are stored. This first step turns out to be fairly simple: when the SQL DB Engine is discovered, one of the attributes it discovers is "$MPElement[Name='SQL!Microsoft.SQLServer.DBEngine']/ErrorLogLocation$". This should tell us where the logs are stored on each server regardless of how they were built or configured.

2. Use the text log monitor to search the SQL Error logs. Using the "Matches Regex" function, I should be able to simply search for the string "of IO requests taking longer than 15 seconds to complete on file" and raise an alert on it.

Sounds simple; reality, however, rarely is. The ErrorLogLocation attribute is stored as a single string such as "d:\MSSQL\log\ERRORLOG", but the text log monitor requires the path and file name to be separate components and there is no method within SCOM (that I am aware of) to split these components from a single string.
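For comparison, the split itself is trivial in most scripting languages, which is part of why a script ends up looking attractive. A minimal Python sketch, using the example path from above; the function name is hypothetical and ntpath is used so the Windows path rules apply on any platform:

```python
import ntpath  # Windows path semantics, available on every platform

def split_error_log(location):
    """Split a full error log path into (directory, file name)."""
    directory, file_name = ntpath.split(location)
    return directory, file_name

print(split_error_log(r"d:\MSSQL\log\ERRORLOG"))
```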

Other options:

  • Hard code the error log path and name
    • This differs on each box
    • It could be done with an override, but that would be a nightmare
  • Create an extended SQL DB Engine class that includes this setting as 2 separate attributes
    • That's a lot more work than I want to tackle, plus it's a maintenance nightmare
  • Create a new text log monitor that takes only 1 attribute for the path and name
    • Turns out this is included in SCOM as a binary (i.e. in a DLL) and I would have no idea where to start with that
  • Create a script that searches the log file for a RegEx
    • This is possible, but potentially filled with problems

Using a script ultimately seems to be the best option. With it I can pass whatever options are needed (log path, how far back to look, the string to search for, and anything else needed) and it can be extended in the future. In this case I decided to write an NT Event message for the error and then use a separate rule to pick up this event and alert on it. I also added another parameter, DBVersion, because SQL 2000 and SQL 2005 store their log files in different formats.

' SQLErrorLog.vbs
'
' param 0 - path to errorlog           $Target/Property[Type="SQLServer!Microsoft.SQLServer.DBEngine"]/ErrorLogLocation$
' param 1 - time in minutes to include 30
' param 2 - string to match            "of IO requests taking longer than"
' param 3 - db version                 $Target/Property[Type="SQLServer!Microsoft.SQLServer.DBEngine"]/Version$

Const ForReading = 1, ForWriting = 2, ForAppending = 8
Const TristateUseDefault = -2, TristateTrue = -1, TristateFalse = 0
' Event types for WshShell.LogEvent (left undefined they evaluate to 0, i.e. SUCCESS)
Const EVENT_SUCCESS = 0, EVENT_ERROR = 1

SET oArgs = WScript.Arguments
SET oShell = CreateObject("Wscript.Shell")
SET fso = CreateObject("Scripting.FileSystemObject")

errorLog = oArgs(0)
iTime = CInt(oArgs(1))
sMatch = oArgs(2)
dbVer = oArgs(3)

'oShell.LogEvent EVENT_SUCCESS, "Beginning check, errorLog: " & errorLog & ", iTime: " & iTime & ", sMatch: " & sMatch & ", dbVer: " & dbVer

' read text file: version 9.x (SQL 2005) is opened as Unicode, anything else as ANSI
IF LEFT(dbVer,1) = "9" THEN
    SET f = fso.OpenTextFile(errorLog, ForReading,,TristateTrue)
ELSE
    SET f = fso.OpenTextFile(errorLog, ForReading,,TristateFalse)
END IF

arLines = Split(f.ReadAll,vbCrLf)
f.Close

' walk backwards from the newest line and stop at the first line older than iTime minutes
FOR i = UBound(arLines)-1 TO 0 STEP -1
    line = arLines(i)
    ' skip continuation lines that do not start with a timestamp
    IF IsDate(Left(line,19)) THEN
        lineTime = CDate(Left(line,19))
        IF lineTime < DateAdd("n",- iTime, Now) THEN
            EXIT FOR
        END IF
        IF RegExTest(sMatch, line) THEN
            ' generate alert
            oShell.LogEvent EVENT_ERROR, "SQL I/O Error" & vbCrLf & line
        END IF
    END IF
NEXT

Function RegExTest(sPattern, sString)
    SET regEx = New RegExp
    regEx.Pattern = sPattern
    regEx.IgnoreCase = True
    regEx.Global = True
    SET Matches = regEx.Execute(sString)
    IF Matches.Count > 0 THEN
        RegExTest = True
    ELSE
        RegExTest = False
    END IF
End Function
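The core of the script is the backwards scan: start at the newest line, stop as soon as a line is older than the window, and test everything in between against the pattern. The Python sketch below restates that logic so it can be tried against sample text without a SQL server. The log lines are made up, and the timestamp layout (date and time in the first 19 characters) is the same assumption the VBScript makes.

```python
import re
from datetime import datetime, timedelta

def recent_matches(lines, pattern, minutes, now):
    """Walk the log from newest to oldest and collect matching lines
    that are no more than `minutes` old."""
    cutoff = now - timedelta(minutes=minutes)
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for line in reversed(lines):
        try:
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # continuation line without a timestamp
        if stamp < cutoff:
            break  # everything earlier in the file is older still
        if rx.search(line):
            hits.append(line)
    return hits

# Made-up log lines in the assumed "YYYY-MM-DD HH:MM:SS.ff" layout.
log = [
    "2008-06-18 09:00:01.10 spid5s      SQL Server has encountered 3 occurrence(s) of IO requests taking longer than 15 seconds",
    "2008-06-18 10:10:00.00 spid52      Starting up database 'tempdb'.",
    "2008-06-18 10:20:30.45 spid5s      SQL Server has encountered 45 occurrence(s) of IO requests taking longer than 15 seconds",
]
now = datetime(2008, 6, 18, 10, 30, 0)
hits = recent_matches(log, "of IO requests taking longer than", 30, now)
print(len(hits))
```

Passing the current time in as a parameter is a choice made for this sketch so the example is repeatable; the VBScript simply uses Now.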



Monday, May 19, 2008

Creating a SCOM MP using the Authoring Console

When setting up monitors, I have found that the toolset in the SCOM Admin console doesn't provide all the functions I need.

Specifically, if I want to monitor a service, I have the option of using the service monitoring wizard – this doesn't provide much flexibility for future configuration, doesn't give the nicer UI functionality for the operators, and is just ugly if I make more than 2 or 3 of them. Alternatively, I can configure a service monitor to look for a service on all servers in a group – this assumes a group is already created, it doesn't perform any discovery, and still doesn't provide the UI functionality the operators will expect.

This leaves me with the requirement of using the MP Authoring Console and configuring all these options there. This means I need to create a class, create a discovery for the class, create groups and group populators, create the UI elements, and lastly create the monitors.

Configuring the console

Copy all MPs locally

The easiest way to do this is to perform a wildcard search for *.MP and *.XML against the root management server, the SQL server, and the installation CD. Then copy all these files to a local directory.

In the Authoring Console, select Tools | Options and select the References tab. Click Add and select the path you copied the MP and XML files to.

Configure the namespace

In the Authoring Console, select Tools | Options and on the General tab enter your default namespace. This should follow a standard naming scheme used in your company; you can look at how Microsoft names their MPs for examples.

Add in additional references

By default, the Authoring Console doesn't reference all the MPs needed for the steps that follow.

In the Authoring Console, select File | Properties and click the References tab. There you will see 4 MPs referenced; we need to add 2 more.

  • Click Add Reference
  • Browse to your local copy of the MPs
  • Select, click OPEN
  • Repeat for
  • Click OK

Creating a class

Since we are monitoring an application, we first need to create a class for the application. We want to do this so we can limit our monitoring to only those systems we are interested in, as well as limiting the tasks and other functions.

  • Click the Service Model tab in the Authoring Console and click Classes in the tree view.
  • Select New | Windows Local Application
  • Enter the ID following your naming scheme
  • Enter a friendly display name and click Next
  • Most likely there will be no key properties, so click Finish

Creating a discovery

Now that we have our application class, we need to discover the application. The easiest way that I have found to do this is via WMI.

Click the Health Model tab in the Authoring Console and click the Discoveries node

Select New | WMI

Enter various information on the General tab

  • NOTE: the element ID needs to be different every time it is used. So if you created a class with the ID CompanyX.Application, the discovery ID could be CompanyX.Application.Discovery.
  • In the Target section, make sure to select the class you previously created

The WMI query can be simple or complex depending on what you are discovering. If you are not 100% sure what you are typing (and who is?) you can use WBEMTEST to confirm that the query returns what you expect.

  • WMI Namespace: root\cimv2
  • Query: Select * from Win32_Service where name= 'servicename'
  • Frequency: 86400 (once per day)

Alternatively, you can discover multiple services by using different queries

  • Select * from Win32_Service where name like 'servicename%' (note the % as the wildcard character; WQL's LIKE operator uses % rather than *, and it only works in W2k3 and above)
  • Select * from Win32_Service where name='servicename' or name='servicename2' or name='servicename3'
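When the list of services grows, typing the "or" form by hand gets error-prone. The Python sketch below builds the same query text from a list of names so it can be pasted into WBEMTEST or the discovery wizard. The function name is hypothetical and the service names are the placeholders used above.

```python
def service_query(names):
    """Build a WQL query that matches any of the given service names exactly."""
    if not names:
        raise ValueError("at least one service name is required")
    clauses = []
    for name in names:
        if "'" in name:
            # Keep the sketch simple: refuse names that would break the quoting.
            raise ValueError("unexpected quote in service name: %r" % name)
        clauses.append("name='%s'" % name)
    return "Select * from Win32_Service where " + " or ".join(clauses)

print(service_query(["servicename", "servicename2", "servicename3"]))
```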

On the Discovery Mapper page

  • Select the Class ID you are discovering
  • Under Key Properties, select Host | Principal Name

Creating groups and populators

Now that we have our class, we need to create a SCOM group for use with the various UI components. The goal here is to create a group of computers that contain the service we are monitoring. This is different from creating a group of the services. This is where it gets a little complicated and some text editing is needed.

Click the Service Model tab of the Authoring Console and select the Classes node

  • Select New | Computer Group
  • Enter the group ID and display name (standard warning on naming scheme)

Once the group is created, we need to create the discovery to populate the group

Click the Health Model tab and select the Discoveries node

  • Select New | Custom Discovery
  • Create a unique identifier and click OK
  • Enter a name and for the target select the computer group you just created (NOTE: its icon should be a gray diamond with an underline, not a blue diamond)
  • On the Discovered Types tab select the bottom Add button (next to Discovered relationships and their attributes)
    • Select Microsoft.SystemCenter.InstanceGroupContainsEntities
  • On the Configuration tab, click Browse for a type
    • Select Group Populator
    • Under Module ID, I don't know what it expects, so I think you can put anything there; click OK
    • Note: You will likely receive an error; I believe this can be ignored
  • Back on the Configuration tab, select Edit and replace the Configuration contents with the following:














  • Replace "CompanyX.Application" with the correct value for your application
    • NOTE: This section is about the only place that can throw errors. Make sure the MonitoringClass attribute is correct
  • Back in the Configuration tab, click OK

Creating UI components

Now that we have a computer group, we can create views based on that group and the health of the services in it.

Click the Presentation tab in the Authoring Console and select the Views node

  • Click New | Folder
  • Enter in an appropriate ID and click OK
  • Enter a name, click OK

Now that we have a folder to place objects in, we can create a state view

  • Click New | State View
  • Fill out the general tab, selecting the computer group previously created as the target
  • Click OK and once the wizard is finished, open the object again
  • On the Folder tab, uncheck Monitoring and check the subfolder you previously created
    • NOTE: This stops the view from appearing at the root of the management console and keeps things more organized

Creating Monitors

Lastly we need to create a monitor. To be simple, we will only monitor a service state:

Click the Health Model tab in the Authoring Console and select the Monitors node

Select New | Windows services | Basic Service Monitor

Fill out the general information screen

  • Again, note that the Element ID must be something unique. One example could be CompanyX.Application.ServiceY
  • Target is the target class you initially created
  • For Parent Monitor, select Availability (or another appropriate category)
  • On the Service Name page, enter the name of the service and click Finish

Once the Wizard is completed, we need to perform more editing to configure alerting

Select the monitor you just created and click Properties

  • On the Configuration tab you can see the service configuration.
    • NOTE: the CheckStartupType option allows you to specify if you want to monitor only services configured to Auto start or all startup types. If you wish to change this you will have to click Edit… to open the text editor and change this option to true
  • On the Health tab you select the states for a running or not running service.
  • The Alerting tab allows you to select if you want to generate alerts or not, and according to what state
  • The Diagnostic and Recovery tab allows for automatic recovery tasks to take effect
  • Lastly the Product Knowledge tab allows for custom knowledge to be imported

Monday, May 05, 2008

Report Server log file growing nonstop

Last Friday I was alerted to the log drive filling up on a SQL 2005 server. I looked into it and found that my SQL Reporting Services log file (ReportServer_Log.LDF) was increasing in size and filling up the drive.
My first thought was to simply shrink the file via SQL Server Management Studio - select the DB, select Tasks, Shrink, Files, and select the appropriate log file. This however failed to do anything and the drive continued to fill. So before it got any worse, I capped the log file size to keep the drive from filling up and created another temporary log file on a drive that had more room.
I tried shrinking the file various ways in the UI and via dbcc shrinkfile, all without success. Some googling reminded me that log files won't truncate until they are backed up, so I tried backing up the log file:
backup log reportserver to DISK='f:\test.bak'

That also failed, but it finally gave me an interesting error message:
The log was not truncated because records at the beginning of the log are pending replication. Ensure the Log Reader Agent is running or use sp_repldone to mark transactions as distributed.

Some more googling and I found an article that also discussed this error. For some reason my database thought it was configured for replication even though I am not using replication. In a replication scenario the logs are only truncated after the transactions have been replicated to the second server. So to resolve this we need to make it think replication is working and then turn it off.

1. Following the article, I began by setting replication to true
sp_replicationdboption 'ReportServer','publish','true'
This however failed because there was no distributor set up:
The Distributor has not been installed correctly. Could not enable database for publishing.
2. Configure Distribution on the server
Using the wizard, I configured distribution, pointing the files to paths on drives with plenty of space
3. Now that replication is set up, set replication to true
sp_replicationdboption 'ReportServer','publish','true'
4. Using sp_repldone, clear the logs
EXEC sp_repldone @xactid = NULL, @xact_segno = NULL, @numtrans = 0, @time = 0, @reset = 1
5. Now, unpublish the DB
sp_replicationdboption 'ReportServer','publish','false'
6. Now that the DB no longer thinks it's published, we can back up the log file
backup log reportserver to DISK='f:\test.bak'
7. Finally we can shrink the file
dbcc shrinkfile('reportserver_log',10)
8. Lastly, clean up the mess you just made by disabling replication and deleting your backup file

Now all that's left is to delete the temporary log file that was created to keep the initial drive from filling.
The normal shrink process doesn't seem to be working, and the log files are not emptying to allow for their deletion. I ran into this once before and remember that it had to do with how SQL places data in the logical and virtual logs. More googling to find the answer...

Friday, May 02, 2008

Setting up DOS in a Windows VPC 2007

Microsoft has "officially" discontinued support for MSDOS under virtual PC, which makes it difficult for us to enjoy our old DOS games (like MechWarrior). However, most of the support files for DOS are still included on the VPC additions ISO, and we just need a little tweaking to make it work. I reviewed the "Dos Virtual Machine Additions.vfd" from VPC2004 and came up with the following steps for VPC2007.

1. Create a directory in the root of C:\ named vmadd
2. Copy cdrom.sys, fshare.exe,, and from the VMAdditions.iso to C:\vmadd
3. Edit Autoexec.bat and add the following lines
mscdex.exe /D:IDECD001 /L:E
4. Edit Config.sys by adding the following lines

Reboot and you should be good to go: the CDROM will be drive letter R, the mouse will work as it always did under DOS (which is poorly), and the shared folders will allow you to map your PC files.

Remember to use the right ALT key to release the mouse.

Tuesday, February 12, 2008

Wireless internet using my BlackBerry

I have to use a BlackBerry Pearl for work and have generally enjoyed it - I can easily review emails without dialing into work, I can browse the net anywhere I want, and the phone quality is fairly stable. I started wondering though, can I connect it to my computer?

A little googling and I found a page that discussed just that. Below are the steps I performed to get this to work with Cingular/ATT:
  1. Install the BlackBerry Desktop software; supposedly this is necessary
  2. Connect the phone via USB cable - not a problem since I normally charge the phone this way
  3. Verify the phone is recognized as a modem
  4. In Device Manager, browse to Modems and select Standard Modem
  5. Click Diagnostics | Query Modem and ensure the results detail a BlackBerry IP Modem
  6. On the Advanced tab of the modem, enter +cgdcont=1,"IP","wap.cingular"
  7. NOTE: The name was changed from the original article to match what was listed in the TCP properties of my phone
  8. Create a new connection using this modem and a phone number of *99#
  9. Initiate the connection and leave the username and password blank