Wednesday, December 15, 2010

OpsMgr Custom Alert: Alert on process using too much memory (without a monitor)

I needed to create a rule in SCOM that could alert if a process consumed more than x memory. To my surprise there is nothing out of the box in the Authoring Console that allows for this – I find this a bit strange as this is something that would be used quiet often. There are ways to create the rule I am looking for through a custom application monitor, or a VBScript, but those are time consuming and I feel are more effort than reward. Right clicking in the Authoring Console under the rules section proves my point.


For this example I am going to create an alert that will fire if notepad.exe is using more than 1.2 Mb of memory. Please Note: if you are testing this pack you will need to open notepad before importing this rule – if OpsMgr cannot resolve the performance counter at the time of the workflow initialization the workflow will be unloaded. The reason for choosing notepad is simple, it ships with every version of windows and we can easily change the amount of memory that it uses by opening a big text file.

Let’s Begin.

Open up the Authoring Console and create \ open a management pack. Navigate to Health Model, Rules, right click, and select New -> Custom Rule. Give the rule an ID and click OK. For this example my ID will be “CustomAlerts.AlertOnNotepadMemoryUsage”.


Under general give the rule a name, description and target it at the Microsoft.Windows.Server.Computerlass (or any other class that you want).


Click on the Modules Tab and create a new data source.


Select the System.Performance.DataProvider module, give it an ID and click OK.



Under the Data Source Module section select the module you just added and click Edit, on the screen that appears click Configure in the bottom left corner. The performance counter selection wizard appears. Select your counter (in this case its Process \ Working Set \ notepad). Use the picker below to choose the frequency to run this alert, finally click OK twice to return to the Modules Tab.



Now that we have our counter, the next thing we need to do is decide if an alert should be generated. Since the counter could potentially be below our threshold we will need to compare the value to decide if an alert should be created. In order to do this we will need to add a Condition Detection module to our alert, under the Condition Detection section click Create.



Select the System.Performance.SimpleThresholdCondition module and click OK.

Note: you can use the any other module you like for the condition detection in the list that appears (i.e. average threshold), but for now we are going to keep it simple by using the System.Performance.SimpleThresholdCondition module.



Once added, click the Edit button for the new module. There is no configuring wizard so you will have to either edit the module through notepad or use the dialog box on screen. As you can see there are 2 options that can be changed, they are Threshold and Operator these are pretty self-explanatory so I am going to go ahead and enter in 1258291 and Greater respectively.



Click OK twice to close the current window and return to the Modules Tab.

The last thing we will need to do is create an alert if the condition detection returns true. To do this, simply add the “System.Health.GenerateAlert” module to the Actions section. Click Edit and then Configure to bring up the configuration wizard for the alert. Enter in a name for the alert and any message that you want to appear, you can make use of the $Data$ variables to pull information back about the process. You can make use of the fly-outs on the message editor to get all values available.



Once complete your alerts screen should look something like this.



Click OK twice to get back to the Modules Tab.

At this point you are pretty much complete with the creating of your alert, I would recommend disabling the rule by default and then targeting it at the servers (or class) that you want to monitor. Additionally take the time to add some KB entry into the alert so support staff knows what to do with it appears in the console. I normally like to change the category of custom rules to its correct type (Alert in this case); you can do this on the Options Tab. You should now have a rule looking something like this.



Right, let’s test this rule.

Fire up your lab and import the pack, add any overrides to get the rule targeted at the correct computer.

I made a bit of a stuff up with the amount of memory to alert on so straight away I get an alert :/.



This means that the rule is working. I close notepad and wait for 5 min to see if I get any more alerts coming through (to ensure that the logic is working correctly). As mentioned earlier if I the performance counter cannot be found the workflow will be unloaded from the OpsMgr agent. Like clockwork 1 min later I get the following alert in the agents event log.

Log Name: Operations Manager
Source: Health Service Modules
Date: 12/15/2010 11:24:38 AM
Event ID: 10103
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: SQL.home.net
Description:
In PerfDataSource, could not find counter Process, Working Set, notepad in Snapshot. Unable to submit Performance
value. Module will not be unloaded.
One or more workflows were affected by this.
Workflow name: CustomAlerts.AlertOnNotepadMemoryUsage
Instance name: SQL.home.net
Instance ID: {1A5EB665-A107-EEC5-7394-60D9B5EF8882}
Management group: HOME
Event Xml:



10103
3
0
0x80000000000000

9440
Operations Manager
SQL.home.net



HOME
CustomAlerts.AlertOnNotepadMemoryUsage
SQL.home.net
{1A5EB665-A107-EEC5-7394-60D9B5EF8882}
Process
Working Set
notepad

I expected that. Something interesting to note here – although the performance counter cannot be resolved the workflow is not uninitialized, I suspect this has something to do with the fact that we have already submitted data back to SCOM from this workflow. After leaving notepad closed on my target computer for a while I decide to re-open it and open a 500kb text file (“hello world” +- 5000 times), and after a minuet a new alert appears in the SCOM console.



Comparing the 2 alerts side by side I see that the memory usage reflects that the 500kb file has been opened.



That’s all there is to creating a custom alert in OpsMgr based on the memory usage of an application. Although this particular example will not be useful in production, this example should enable you to create an alert of any performance data easily.

Feel free to leave questions / comments / requests.

SCCM - migrating from WSUS to SCCM Updates

Ok - so we wanted to start using SCCM to implement updates instead of our existing WSUS hierarchy. The problem was we had a couple of years worth of approved updates in WSUS and needed this list in SCCM. Started scratching around on the interwebs and couldn't find any kind of end to end solution - I'm lazy - if someone has done it before why re-invent the wheel.

So I broke down the problem into steps
  • Connect to WSUS DB and get all the updates in approved updates
  • Create an Update list in SCCM and import the approved updates from previous step
  • Tidy up the list to get rid of unwanted/unneeded updates etc
Pretty straight forward on the surface until you start trying to tie up article ID's and knowledgebase ID's etc. I ended up using one SQL script, one Powershell script and one VBScript. I could have tidied these all up into one powershell script but we only needed this as a one off and didn't need to redo this on a daily basis.

OK - lets get the approved updates out of WSUS. Our DB is running on the WSUS default, the Windows Internal Database. As it turns out the interwebs were going to help me after all - every step of the way - so big thanks to these guys for their articles. http://www.mssqltips.com/tip.asp?tip=1577 You can connect to your instance using SQL Server Management Studio on the local machine with this connection string :
\\.\pipe\MSSQL$MICROSOFT##SSEE\sql\query
Once you are connected - this will get you a list of ArticleID's from your approved updates in WSUS:
select distinct KnowledgebaseArticle from PUBLIC_VIEWS.vUpdate UPD
join PUBLIC_VIEWS.vUpdateApproval APP on upd.UpdateId = app.UpdateId
order by KnowledgebaseArticle
You will get NULL and 000000 - These need to be removed from the ArticleID list and added manually to the update list.

Get these Updates by doing :
select distinct * from PUBLIC_VIEWS.vUpdate UPD
join PUBLIC_VIEWS.vUpdateApproval APP on upd.UpdateId = app.UpdateId
where KnowledgebaseArticle = 000000
--And to get NULL
select distinct * from PUBLIC_VIEWS.vUpdate UPD
join PUBLIC_VIEWS.vUpdateApproval APP on upd.UpdateId = app.UpdateId
where KnowledgebaseArticle is NULL
OK - so now we have a list of ArticleID's. We now need to determine the CI_ID value for each ArticleID and the script below (http://msdn.microsoft.com/en-us/library/cc143352.aspx) needs to be modified - the line starting with Query1 = "Select.... Insert your ArticleID values in between the brackets as shown - Run this on your SCCM server and just redirect the output to a text file.
Connect ".","",""
Set connection = Connect(computer,userName,userPassword)
'On Error Resume next
' This is the line to change.
Query1 = "Select * from SMS_SoftwareUpdate where Articleid in ('940060','940357')"
' Run query.
Set ListOfResources1 = connection.ExecQuery(Query1, , wbemFlagForwardOnly Or wbemFlagReturnImmediately)
' The query returns a collection that needs to be enumerated.
Wscript.Echo " "
Wscript.Echo "Query: " & Query1
Wscript.Echo "--------------------------------------------------------"
For Each Resource1 In ListOfResources1
Wscript.Echo Resource1.CI_ID
'Wscript.Echo "Name: " & Resource1.LocalizedDisplayName
'Wscript.Echo "ArticleID: " & Resource1.ArticleID
Next

Function Connect(server, userName, userPassword)
On Error Resume Next
Dim net
Dim localConnection
Dim swbemLocator
Dim swbemServices
Dim providerLoc
Dim location
Set swbemLocator = CreateObject("WbemScripting.SWbemLocator")
swbemLocator.Security_.AuthenticationLevel = 6 'Packet Privacy.
' If the server is local, don't supply credentials.
Set net = CreateObject("WScript.NetWork")
If UCase(net.ComputerName) = UCase(server) Then
localConnection = true
userName = ""
userPassword = ""
server = "."
End If
' Connect to the server.
Set swbemServices= swbemLocator.ConnectServer _
(server, "root\sms",userName,userPassword)
If Err.Number<>0 Then
Wscript.Echo "Couldn't connect: " + Err.Description
Connect = null
Exit Function
End If
' Determine where the provider is and connect.
Set providerLoc = swbemServices.InstancesOf("SMS_ProviderLocation")
For Each location In providerLoc
If location.ProviderForLocalSite = True Then
Set swbemServices = swbemLocator.ConnectServer _
(location.Machine, "root\sms\site_" + _
location.SiteCode,userName,userPassword)
If Err.Number<>0 Then
Wscript.Echo "Couldn't connect:" + Err.Description
Connect = Null
Exit Function
End If
Set Connect = swbemServices
Exit Function
End If
Next
Set Connect = null ' Failed to connect.
End Function
Ok so now we need to create an update list in SCCM for that purpose I used a Powershell script written by Joachim Meyer . It has to be run on the SCCM server with a bunch of parameters and needs to point to a reference machine(which actually isn't used for our purposes) The list of CI_ID's that are returned get added to an array in the powershell script - Edit the array - lines starting with [VOID] and replace the values of the CI_ID's there.

param(
$ReferenceClient,
$UpdateListName,
$SiteServer,
[switch] $force,
[switch] $verbose
)
$AppName = "Create-UpdateList"
$manpage = @'
NAME
Create-UpdateList
SYNOPSIS
Creates a Configuration Manager update list.
SYNTAX
Create-UpdateList -ReferenceClient -UpdateListName [-SiteServer ] [-Force] [-Verbose]
DETAILED DESCRIPTION
This script creates an update list based on the inventory data of a reference client. The reference client should
represent the baseline for a specific operating system used within your organization. This client needs to be
present in the Configuration Manager database with valid inventory data. This script then creates an update list
which includes all the software updates reported as missing from the reference client.
PARAMETERS
-ReferenceClient
Specifies the name of the reference client.
-UpdateListName
Specifies the name of the update list to be created.
-SiteServer
Optional: Specifies the Configuration Manager site server. If not specified, the local computer is assumed
to be the site server.
-Force
Optional: Creates an update list even if an update list with the same display name already exists.
-Verbose
Optional: Generates detailed information about the script's operations.
'@
if (!$ReferenceClient -or !$UpdateListName) {
Write-Host $manpage
exit
}
if ($args.count -eq 1) {
$siteserver = $args[0]
}
elseif ($args.count -gt 1) {
Write-Host $manpage
exit
}

if (!$siteserver) { $siteserver = $env:computername }
$namespace = "root\sms"
if ($verbose) {
Write-Host "`nReferenceClient: $ReferenceClient"
Write-Host "UpdateListName : $UpdateListName"
Write-Host "SiteServer : $SiteServer`n"
}
$smsContext = New-Object System.Management.ManagementNamedValueCollection
$smsContext.Add("ApplicationName", $AppName)
$smsContext.Add("MachineName", $env:computername)
$smsContext.Add("LocaleID", 1033)
$connOptions = New-Object System.Management.ConnectionOptions
$connOptions.Context = $smsContext
$path = New-Object System.Management.ManagementPath
$path.NamespacePath = "\\$siteserver\" + $namespace
$scope = New-Object System.Management.ManagementScope($path, $connOptions)
$ErrorActionPreference = “silentlycontinue”
$scope.Connect()
if (!$?) {
$ErrorActionPreference = “continue”
$cred = Get-Credential
if (!$cred) {
Write-Host "No credentials supplied." -foregroundcolor Red -backgroundcolor Black
exit
}
# Property "SecurePassword" requires .NET Framework 2.0 SP1 or higher!
$connOptions.Username = $cred.Username
$connOptions.SecurePassword = $cred.Password
$scope.Options = $connOptions
$ErrorActionPreference = “silentlycontinue”
$scope.Connect()
if (!$?) {
Write-Host "Could not connect to site server $siteserver." -foregroundcolor Red -backgroundcolor Black
Write-Host $error[0] -foregroundcolor Red -backgroundcolor Black
exit
}
}
elseif ($verbose) {
Write-Host "Successfully connected to \\$siteserver\$namespace."
}
$ErrorActionPreference = “continue”
$wqlquery = "SELECT * FROM SMS_ProviderLocation"
$query = New-Object System.Management.ObjectQuery($wqlquery)
$searcher = New-Object System.Management.ManagementObjectSearcher($scope, $query)
$providerLoc = $searcher.Get()
if (!$providerLoc) {
Write-Host "Could not get instances from the SMS_ProviderLocation class." -foregroundcolor Red -backgroundcolor Black
exit
}
foreach ($providerInst in $providerLoc) {
if (!$providerInst.ProviderForLocalSite) {
Write-Host "SMS Provider $providerInst.SiteCode not set as local site server." -foregroundcolor Red -backgroundcolor Black
exit
}
else {
$sitecode = $providerInst.SiteCode
}
}
$namespace = "root\sms\site_$sitecode"
$ErrorActionPreference = “silentlycontinue”
$scope.Path = "\\$siteserver\$namespace"
$scope.Connect()
if (!$?) {
Write-Host "Could not connect to site server $siteserver." -foregroundcolor Red -backgroundcolor Black
exit
}
elseif ($verbose) {
Write-Host "Successfully connected to \\$siteserver\$namespace."
}
$ErrorActionPreference = “continue”
# Check if the specified reference client already exists within the ConfigMgr database
$wqlquery = 'SELECT ResourceID FROM SMS_R_System WHERE Name = ' + '"' + "$ReferenceClient" + '"' + ' AND Active = 1'
if ($verbose) {
Write-Host "Verifying if the reference client $ReferenceClient actually exists in the ConfigMgr database."
Write-Host "Running WQL query: $wqlquery."
}
$query = New-Object System.Management.ObjectQuery($wqlquery)
$searcher = New-Object System.Management.ManagementObjectSearcher($scope, $query)
$searcher.Get() | Foreach-Object { $rscID = $_.ResourceID }
if (!$rscID) {
Write-Host "Could not find the reference client $ReferenceClient in the Configuration Manager database." -foregroundcolor Red -backgroundcolor Black
exit
}
if ($verbose) {
Write-Host "Found $ReferenceClient in the database with Resource ID $rscID."
}
# Check if the specified name for the update list is already in use
$wqlquery = "SELECT * FROM SMS_AuthorizationList WHERE LocalizedDisplayName = " + "'"
$wqlquery += $UpdateListName + "'"
if ($verbose) {
Write-Host "Check if the specified name for the update list is already in use."
Write-Host "Running WQL query: $wqlquery."
}
$query = New-Object System.Management.ObjectQuery($wqlquery)
$searcher = New-Object System.Management.ManagementObjectSearcher($scope, $query)
$searcher.Get() | Foreach-Object { $ListID = $_.CI_ID }
if ($ListID) {
if (!$force) {
$msg = "`nAn update list with the name $UpdateListName already exists. If you want the update list to be created, "
$msg += "please specify the -force switch."
Write-Host $msg -foregroundcolor Yellow -backgroundcolor Black
exit
}
}
elseif ($verbose) {
Write-Host "An update list with the name $UpdateListName does not exist."
}
# Get the missing software updates reported for the reference client
$wqlquery = "SELECT css.CI_ID FROM SMS_UpdateComplianceStatus css "
$wqlquery += "JOIN SMS_SoftwareUpdate ui ON css.CI_ID = ui.CI_ID "
$wqlquery += "WHERE css.MachineID = $rscID AND css.Status = 2"
if ($verbose) {
Write-Host "Getting required software updates for $ReferenceClient from the ConfigMgr database."
Write-Host "Running WQL query: $wqlquery."
}
$swupdates = New-Object System.Collections.ArrayList
[void] $swupdates.Add(41200)
[void] $swupdates.Add(41200)
[void] $swupdates.Add(41202)
[void] $swupdates.Add(41227)
[void] $swupdates.Add(41230)
[void] $swupdates.Add(41252)
[void] $swupdates.Add(41277)
[void] $swupdates.Add(41282)

# Get the LocaleID of the site server installation
$wqlquery = 'SELECT * FROM SMS_Identification'
$query = New-Object System.Management.ObjectQuery($wqlquery)
$searcher = New-Object System.Management.ManagementObjectSearcher($scope, $query)
$searcher.Get() | Foreach-Object { $LocaleID = $_.LocaleID }
if (!$LocaleID) { $LocaleID = 1033 }
if ($verbose) {
Write-Host "Using LocaleID $LocaleID."
}
$searcher
$options = New-Object System.Management.ObjectGetOptions
$options.Context = $smsContext
$path = New-Object System.Management.ManagementPath("\\$siteserver\$namespace" + ":SMS_CI_LocalizedProperties")
$smsCiLoc = (New-Object System.Management.ManagementClass($scope, $path, $options)).CreateInstance()

# Workaround a PowerShell V1 issue
[void] $smsCiLoc.psbase.Properties
$swupdates
$smsCiLoc.DisplayName = $UpdateListName
$smsCiLoc.LocaleID = 1033
[System.Management.ManagementObject[]] $newDescriptionInfo += $smsCiLoc
$options = New-Object System.Management.ObjectGetOptions
$options.Context = $smsContext
$path = New-Object System.Management.ManagementPath("\\$siteserver\$namespace" + ":SMS_AuthorizationList")
$newUpdateList = (New-Object System.Management.ManagementClass($scope, $path, $options)).CreateInstance()
# Workaround a PowerShell V1 issue
[void] $newUpdateList.psbase.Properties
$newUpdateList.Updates = $swupdates
$newUpdateList.LocalizedInformation = $newDescriptionInfo
$putOptions = New-Object System.Management.PutOptions($smsContext)
$ErrorActionPreference = “silentlycontinue”
[void] $newUpdateList.Put($putOptions)
if (!$?) {
Write-Host "Could not create update list $UpdateListName."
Write-Host $error[0]
}
else {
Write-Host "Successfully created update list $UpdateListName."
}
Run the Powershell script on the Central Site Server, In a powershell console execute the following command:

 \reftest.ps1 -ReferenceClient  -updatelistname  -siteserver  -force -verbose 
Remember to update the variables between <> for your site. If you look at your SCCM console now under Update lists you will see your new update list. Check to see if there are any updates you can delete from here - odd languages etc before creating your update deployments from this list.

Good luck - DT