Monday, February 8, 2010

Changing which Gateway server an agent reports to - eh?!

Picture the scenario: you've got 500 machines and 2 gateway servers. Half of them report to Gateway 1, the other half to Gateway 2. You've done a bit of research and even enabled failover with a bit of nifty PowerShell.

Then one day Gateway 1 dies. All of its agents fail over and start reporting to Gateway 2. Yep, that's all good, but you need a new Gateway server ASAP because you now have no redundancy.

You quickly find a new server, Gateway 3, and are now left with the task of reassigning all the agents that had Gateway 1 as their primary onto Gateway 3, whilst keeping Gateway 2 as the failover.

Never fear, there is a solution.

There are two parts to this: server and client side. I will discuss each individually.

Client:

OpsMgr stores its connection details in two registry values: NetworkName and AuthenticationName. Both need to be changed to the new Gateway's name. This, however, won't work unless the server portion has also been carried out. An optional step here is deleting the Health Service State. In reality we only need to delete the Connector Configuration Cache, but you will be absolutely amazed at how many problems can be solved with a simple deletion of Health Service State data. This is a step you can choose to take: as far as I'm aware, the Connector Configuration will be updated if the registry values have changed. If you want to delete the State data too, uncomment the lines in the PS script.


Server:

Here we need to tell the RMS (and DB) that the agent has changed its PrimaryManagementServer (Gateway). This cannot be done with the GUI, so PowerShell will be used.

------

Assumptions: This solution assumes that the CLIENT portion will be executed on a machine with PowerShell installed, and with remote registry and WMI (RPC/DCOM) access to each machine needing to be changed. x86/x64 isn't a problem, as the script reads the install directory from the registry. The script changes the registry values over a remote registry connection, then makes WMI calls to the remote machines to stop the service, optionally delete State data, and start the service again. If you don't have this access it WON'T work! Also, this assumes you're running as an admin and have admin rights on each machine.
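Since everything hinges on that access, it may be worth checking it before touching anything. The following is only a rough sketch, not part of the original procedure: it assumes PowerShell 2.0 or later (for try/catch) and the c:\agents.csv file described in the Client Portion below, and it only reads from each machine.

```powershell
# Pre-flight sketch: report which machines in c:\agents.csv answer over both
# WMI and the remote registry. Read-only; assumes PowerShell 2.0+ (try/catch).
$MachineList = Import-Csv c:\agents.csv

foreach ($Machine in $MachineList) {
    $name = $Machine.NetworkName

    # WMI check: the same query the client script uses to find the service.
    # True only if WMI is reachable AND the HealthService service was found.
    $service = Get-WmiObject -Query "SELECT * FROM Win32_Service WHERE Name = 'HealthService'" -ComputerName $name -ErrorAction SilentlyContinue
    $wmiOk = ($service -ne $null)

    # Remote registry check: the same call the client script uses.
    $regOk = $false
    try {
        $reg = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey('LocalMachine', $name)
        $regOk = $true
        $reg.Close()
    } catch {
        # Leave $regOk as $false: the machine is unreachable or access was denied.
    }

    Write-Host ("{0}: WMI={1} Registry={2}" -f $name, $wmiOk, $regOk)
}
```

Any machine that shows False in either column will most likely fail in the client script as well, so it is easier to sort those out first.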

-----

Client Portion:

The first thing we need is a list of all machines that report to Gateway1. We need to export this to a CSV because we'll be using it to address each machine. We're going to export this to a one-column CSV, with NetworkName as the column name (remember to remove the "#TYPE" line that Export-Csv writes at the top of the file, if there is one, but keep the NetworkName header row).

[This will obviously be run through an OpsMgr shell on a server -- not a client]:

$oldGatewayName = "Gateway1"
# -NoTypeInformation keeps the "#TYPE" line out of the CSV
Get-Agent | ?{ $_.PrimaryManagementServerName -eq $oldGatewayName } | Select NetworkName | Export-Csv c:\agents.csv -NoTypeInformation


[Now, the client script - ensure you've copied the CSV from the script above to C:\ on the machine you run this from]:

$MachineList = Import-Csv c:\agents.csv
$NewGatewayName = "Gateway3"

Foreach($MachineName in $MachineList) {

Write-Host "Attempting " $MachineName.NetworkName
#Change Registry Keys
$reg = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey('LocalMachine', $MachineName.NetworkName)
# "DERSCOM" is the management group name here; substitute your own.
$regKey = $reg.OpenSubKey("SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Agent Management Groups\DERSCOM\Parent Health Services\0",$true)
$regKey.SetValue("NetworkName",$NewGatewayName,"String")
$regKey.SetValue("AuthenticationName",$NewGatewayName,"String")

#Get Install Path for Health Service State folder
$path = $reg.OpenSubKey("SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup")
$HealthServiceState = ($path.GetValue("InstallDirectory")).Replace("\","\\") + "Health Service State"


#Stop HealthService
$service = Get-WmiObject -query "SELECT * FROM Win32_Service WHERE Name = 'HealthService'" -computer $MachineName.NetworkName
$service.StopService()

#Give the service time to stop
Start-Sleep -Seconds 10

#Delete Health Service State Folder
#$directory = Get-WmiObject -query "SELECT * FROM Win32_Directory WHERE Name='$HealthServiceState'" -computer $MachineName.NetworkName
#foreach($item in $directory) {
#$item.Delete()
#}

#Start Service
$service.StartService()

}

Server Portion:

#Unfortunately there is no way to pass Criteria Syntax to Get-Agent, so the agents are filtered with Where-Object below. You may use the SDK directly if you so desire.


$oldGatewayName = "Gateway1"
$gateway2 = "Gateway2"
$gateway3 = "Gateway3"

$PrimaryMS = Get-ManagementServer | Where { $_.DisplayName -match $gateway3 }
$FailoverMS = Get-ManagementServer | Where { $_.DisplayName -match $gateway2 }

$agents = Get-Agent | ?{ $_.PrimaryManagementServerName -eq $oldGatewayName }

foreach($agent in $agents)
{
Write-Host "Setting MS/Failover for:" $agent.DisplayName
Set-ManagementServer -AgentManagedComputer: $Agent -PrimaryManagementServer: $PrimaryMS -FailoverServer: $FailoverMS
}


And that's that! Once you've run those parts your agents will take a few minutes and start talking to the new Gateway server. Simple as pie.
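If you'd like to confirm the move rather than take it on faith, something along these lines should do it. It's a minimal sketch: Select-MovedAgentStatus is a made-up helper name, not an OpsMgr cmdlet, and it relies only on the NetworkName and PrimaryManagementServerName properties already used in the scripts above.

```powershell
# Hypothetical helper (not an OpsMgr cmdlet): from a set of agent objects,
# keep the ones named in $MovedNames and show which server each reports to.
function Select-MovedAgentStatus {
    param(
        [object[]] $Agents,
        [string[]] $MovedNames
    )
    $Agents |
        Where-Object { $MovedNames -contains $_.NetworkName } |
        Select-Object NetworkName, PrimaryManagementServerName
}

# Usage, in the OpsMgr shell on a server, once both portions have run:
# $moved = Import-Csv c:\agents.csv | ForEach-Object { $_.NetworkName }
# Select-MovedAgentStatus -Agents (Get-Agent) -MovedNames $moved
```

Every machine from the CSV should now list Gateway3 as its PrimaryManagementServerName; anything still showing Gateway1 is worth a second look.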
