Wednesday, May 24, 2017

SCOM 2016 Update Rollup 3 Released!

Microsoft has released Update Rollup 3 for SCOM 2016. It includes a number of fixes, most notably a fix for the SCOM agent crashing IIS .NET 2.0 legacy application pools.

The detailed list of updates is here:
https://support.microsoft.com/en-za/help/4016126/update-rollup-3-for-system-center-2016-operations-manager

Downloads for the Update are available here:
http://www.catalog.update.microsoft.com/Search.aspx?q=4016126

Issues that are fixed
  • When you run System Center 2016 Operations Manager in an all-French locale (FRA) environment, the Date column in the Custom Event report appears blank.
  • The Enable deep monitoring using HTTP task in the System Center Operations Manager console doesn't enable WebSphere deep monitoring on Linux systems.
  • When overriding multiple properties on rules that are created by the Azure Management Pack, duplicate override names are created. This causes overrides to be lost.
  • When the heartbeat failure monitor is triggered, a "Computer Not Reachable" message is displayed even when the computer is not down.
  • The Get-SCOMOverrideResult PowerShell cmdlet doesn't return the correct list of effective overrides.
  • When creating a management pack (MP) on a client that contains a Service Level (SLA) dashboard and Service Level Objects (SLO), the localized names of objects aren't displayed properly if the client's CurrentCulture settings don't match the CurrentUICulture settings. In cases where the localized settings are English English (ENG) or Australian English (ENA), there's an issue when the objects are renamed.
  • The Event ID: 26373 error, which may cause high memory consumption and affect server performance, has been changed from a “Critical” message to an “Informational” message.
  • The Application Performance Monitoring (APM) feature in System Center 2016 Operations Manager Agent causes a crash for the IIS Application Pool that's running under the .NET Framework 2.0 runtime. Microsoft Monitoring Agent should be updated on all servers that use .NET 2.0 application pools for APM binaries update to take effect. Restart of the server might be required if APM libraries were being used at the time of the update.
  • The UseMIAPI registry subkey prevents collection of processor performance data for Red Hat Linux systems. Custom performance collection rules are also impacted by the UseMIAPI setting.
  • Organizational Unit (OU) properties for Active Directory systems are not being discovered or populated.
  • The Microsoft.SystemCenter.Agent.RestartHealthService.HealthServicePerfCounterThreshold recovery task fails to restart the agent, and you receive the following error message:
LaunchRestartHealthService.ps1 cannot be loaded because the execution of scripts is disabled on this system.
This issue has been resolved so that the recovery task works whenever the agent is consuming too many resources.
  • The DiscoverAgentPatches.ps1 script in Microsoft.SystemCenter.Internal.xml fails and you experience the following exception:
Exception:
Method invocation failed because [System.Object[]] does not contain a method named 'op_Subtraction'.
At C:\bin\scripts\patch.ps1:37 char:35
+                   for($count = 0; ($productList.Count-1); $count++)
+                                   ~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : InvalidOperation: (op_Subtraction:String) [], RuntimeException
+ FullyQualifiedErrorId : MethodNotFound

  • The execution policy for PowerShell scripts in Inbox management packs has been set to Unrestricted.
  • SQL Agent jobs for maintenance schedule use the default database. If the database name is not the default, the job fails.
  • This update adds support for OpenSSL 1.0.x on AIX computers. With this change, System Center Operations Manager uses OpenSSL 1.0.x as the default minimum version supported on AIX, and OpenSSL 0.9.x is no longer supported.

Wednesday, March 29, 2017

EventID 34103 - Web TestConfig Error, Web Availability Test Not Running.

I ran into an issue the other day while researching an error in SCOM on one of my servers that runs as a watcher node for Web Application Availability Monitoring. In the Operations Manager event log I noticed several event ID 34103 entries reporting TestConfig failures.


The Web Test Module has some invalid configuration.
Config Context: TestConfig
Error: 0x80004005
One or more workflows were affected by this.
Workflow name: Microsoft.SystemCenter.WebApplicationTest.WebTestProbe.Performance.CollectContentTime
Instance name: Demo-Visant [Monitoring Pool - Demo]
Instance ID: {ECFB7B7D-B242-C2EB-7470-6181DED0CFCA}
Management group: Jostens_ENT_Prod

As the event states, this turned out to be a configuration error in the content matching. We use the "Does not match regular expression" criterion for a large number of sites, typically with an enclosed failure message or content such as (FAIL) or something along those lines. When this test was created, one of the enclosing parentheses was left out, so the expression was FAIL) instead.
SCOM still allowed the test to be saved and acted as if it were monitoring the site without issues. No errors were shown in the console saying the test was failing, and the URL health state showed green.
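
A content-match expression like this can be checked before it is pasted into the test. The following is a minimal sketch in Python, and it rests on an assumption: Python's `re` module is not the regular expression engine SCOM uses, so it only illustrates catching the kind of unbalanced parenthesis described above. The function name is made up for this example.

```python
import re
from typing import Optional


def check_content_match_pattern(pattern: str) -> Optional[str]:
    """Return None if the pattern compiles, otherwise a short error description.

    Assumption: Python's regex syntax stands in for SCOM's here, which is
    only reasonable for simple structural mistakes such as a missing "(".
    """
    try:
        re.compile(pattern)
    except re.error as exc:
        return f"invalid pattern {pattern!r}: {exc}"
    return None


# The intended expression compiles, so no error is reported.
print(check_content_match_pattern("(FAIL)"))

# The expression with the missing opening parenthesis is rejected.
print(check_content_match_pattern("FAIL)"))
```

A check like this would have flagged the broken expression at creation time instead of leaving it to surface as event 34103 on the watcher node.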

To help overcome this issue I built a new Manual Reset Rule that checks the Operations Manager log for the following conditions:

EventID -eq 34103
Source -eq Health Service Modules
Event Data- $Data/Params/Param[5]$ -eq TestConfig

The rule creates a critical alert with a custom error message stating that the test is not configured properly and needs to be updated, which informs whoever is watching the console of the issue. I went a step further and also created a notification that goes to the SCOM admins so they can follow up on the issue.
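
The three rule conditions listed above can be sketched as a small predicate. This is only an illustration in Python: the event is modeled as a plain dictionary whose keys (`event_id`, `source`, `params`) are hypothetical, and the placeholder parameter values are made up. The one detail taken from the rule itself is that `Param[5]` is a 1-based XPath index, so it corresponds to index 4 in a Python list.

```python
from typing import Any, Dict


def is_testconfig_failure(event: Dict[str, Any]) -> bool:
    """Mirror the rule's three conditions against a dict-shaped event.

    The dict layout is an assumption made for this sketch; it is not how
    SCOM represents events internally.
    """
    params = event.get("params", [])
    return (
        event.get("event_id") == 34103
        and event.get("source") == "Health Service Modules"
        # Param[5] is 1-based in the rule expression, so it is index 4 here.
        and len(params) >= 5
        and params[4] == "TestConfig"
    )


# Hypothetical event: only the fifth parameter matters for the rule.
sample = {
    "event_id": 34103,
    "source": "Health Service Modules",
    "params": ["p1", "p2", "p3", "p4", "TestConfig"],
}
print(is_testconfig_failure(sample))
```

Filtering on the fifth parameter keeps the rule from alerting on other 34103 events whose config context is something other than TestConfig.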

Tuesday, March 28, 2017

SCOM Web Availability Monitoring with www/http basic authentication

SCOM offers multiple ways to monitor websites. I use the Web Application Availability Monitoring template as much as possible, as opposed to Web Application Transaction Monitoring, mainly because it is faster to configure and makes it easier to see a breakdown of what is experiencing an issue.


However, one downside is that the Web Application Availability Monitoring template does not have a built-in way to handle websites that require basic authentication logins. The way to overcome this is to use HTTP headers and add an Authorization header with an encoded username and password.


HTTP Headers - Basic Authentication
To access a site that is using basic authentication, you will need to encode your username and password as a base64 username:password combination.
*It is suggested to use an account with the least permissions needed to run the site test, preferably one with limited read-only access, since the base64-encoded credentials can be decoded fairly easily.


You can do this from a site such as https://www.base64encode.org/
  1. Under the Encode section, type in your username and password in the following format: username:password
  2. Click Encode and voilà!
  3. Copy and save the result.
Now you need to add an HTTP header to your URL's test so that SCOM can log in to the site and get a 200 response instead of a 401.
  1. Follow the normal steps to create a new web application availability monitor site test, and open the "Change Configuration" settings under the View and Validate Tests tab.
  2. Scroll down until you see HTTP Headers.
  3. Click on Add and a new window will open.
  4. In the HTTP header name: box type in: Authorization
  5. In the HTTP header value: box type in: Basic, followed by a space and your base64-encoded username:password string (for example, Basic dXNlcjpwYXNz).
  6. Click OK to close the header window, then click OK to save.
  7. Click on Run Test to verify the site is passing and not giving a 401.
All Done.
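
The header value can also be produced locally, which avoids typing credentials into a website. The sketch below uses Python's standard `base64` module; the function name is made up for this example, and the credentials are the well-known sample pair from the HTTP Basic authentication RFC, not a real account.

```python
import base64


def basic_auth_header_value(username: str, password: str) -> str:
    """Build the value for the Authorization header from a username and password."""
    # Basic authentication encodes "username:password" as base64.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


# Sample credentials from the Basic authentication RFC.
print(basic_auth_header_value("Aladdin", "open sesame"))
# prints: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

The printed string is what goes into the HTTP header value box, with Authorization as the header name.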






Monday, March 27, 2017

SCOM 2016 agent causes issues with IIS .Net 2.0 Application

As many other blogs and users have posted, the SCOM 2016 agent currently has an issue with servers that are running IIS with .NET 2.0 apps.
One thing we have noticed in our environment recently is that you don't always see errors showing the agent causing the service to crash.
Sometimes the app just doesn't work as it is supposed to. When troubleshooting an issue with an IIS site that uses .NET 2.0, try uninstalling the SCOM agent and see if that resolves the issue.

If it does, follow the recommended resolution steps below. We typically install using the NOAPM=1 setting.

Summary
APM feature in SCOM 2016 Agent may cause a crash for the IIS Application Pool running under .NET 2.0 runtime.
Cause
Several callbacks within APM code of SCOM 2016 Agent utilize memory allocation that’s incompatible with .NET 2.0 runtime and may cause an issue if this memory is later accessed in a certain way. Those particular modifications were added in SCOM 2016 Agent and are not present in SCOM 2012 R2 Agent.

Resolution
The fix for this issue is to be provided with SC 2016 OM Update Rollup 3. The aforementioned code paths will not be executed if the Application Pool is running under .NET 2.0 Runtime. We are also evaluating whether to release a hotfix for this issue.
Workaround
There are several workarounds for this issue:

  • The application pool can be migrated to the .NET 4.0 runtime.
  • The SCOM 2016 Agent can be replaced with the SCOM 2012 R2 Agent. It is forward-compatible with SCOM 2016 Server, and the APM feature will continue to work with the older bits.
  • The SCOM 2016 Agent can be reinstalled with the NOAPM=1 switch in the msiexec.exe setup command line, which excludes the APM feature from setup.
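
For the third workaround, the command line can be sketched as follows. This is a minimal illustration in Python that only builds the argument list; the function name and the MSI path are assumptions for this example, so point the path at wherever your agent installer actually lives. Any other properties your deployment needs would be added alongside NOAPM=1.

```python
from typing import List


def build_agent_install_command(msi_path: str) -> List[str]:
    """Return the msiexec.exe arguments for installing the agent without APM."""
    # /i installs the package; NOAPM=1 is the switch named in the workaround.
    return ["msiexec.exe", "/i", msi_path, "NOAPM=1"]


# Hypothetical installer location, used only for illustration.
command = build_agent_install_command(r"C:\Temp\MOMAgent.msi")
print(command)
```

On a Windows server the list could be passed to `subprocess.run(command, check=True)` to start the installation.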

Friday, March 24, 2017

SQL Server 2012 or newer - SCOM Monitoring using SID's

When SCOM agents are installed on servers, the agent typically runs under the LocalSystem account to access resources and run scripts. Starting with SQL Server 2012, the NT AUTHORITY\SYSTEM account is no longer part of the sysadmin role, which prevents scripts and discoveries from running properly and causes alerts for failed discoveries and monitoring.


There are a few ways to overcome this issue: either assign LocalSystem sysadmin rights in SQL Server, or modify the SCOM service to use a service SID and grant access to that SID.


To create the appropriate service SID for the SCOM service, run the following command:


sc sidtype HealthService unrestricted


To add the service SID "NT SERVICE\HealthService", either run the following SQL query on the SQL server to create a new login for the service, or create the security login manually.
USE [master]
GO
/****** Add a login in SQL Server for the service SID of System Center Advisor HealthService ******/
CREATE LOGIN [NT SERVICE\HealthService] FROM WINDOWS WITH DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english]
GO
/****** Add the HealthService Service SID login to the sysadmin server role ******/
ALTER SERVER ROLE [sysadmin] ADD MEMBER [NT SERVICE\HealthService]
GO
To add the login manually, open SQL Server Management Studio, go to Security > Logins, right-click, and select New Login.
  1. Under Login Name type in "NT Service\HealthService" without the quotes.
  2. Make sure Default Database is set to "master".
  3. Click on the Server Roles page on the left hand list.
  4. Checkmark the "sysadmin" role and then click OK.
To finish up, restart the SCOM service. You can monitor the Operations Manager event log to verify that the agent can log in properly and run discoveries and scripts against SQL Server.




Further details can be found in these posts from Kevin Holman's blog and TechNet:


How to configure SQL Server 2012 to allow for System Center Advisor monitoring


SQL MP Run As Accounts – NO LONGER REQUIRED




SCOM UNIX/LINUX - Separate action accounts for resource pools

Hello all,

I have been working on implementing a new SCOM 2016 environment with Unix/Linux systems. One of the topics that is clearly lacking in documentation is how to utilize separate action accounts and passwords when deploying and managing systems behind firewalls and gateways.

The post below was originally written by Silvana Deac and describes a practical way to solve this problem.

Hello all,
I observed that this topic is lacking some explanation of how to configure different Run As accounts for each DMZ zone when using UNIX/Linux monitoring. If the targeting is wrong, you will get an error like:

Log Name:      Operations Manager
Source:        Cross Platform Modules
Event ID:      4113
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      ComputerName
Description:
The account for the UNIX/Linux Action Run As profile associated with the workflow "Microsoft.Linux.Universal.Computer.Discovery", running for instance "computer.FQDN" with ID {random GUID} is not defined. The workflow has been unloaded. Please associate an account with the profile.
This condition may have occurred because no UNIX/Linux Accounts have been configured for the Run As profile. The UNIX/Linux Run As profile used by this workflow must be configured to associate a Run As account with the target.

The situation: You have multiple UNIX/Linux Run As accounts, each of which should be used with a separate gateway, resource pool, or management server (MS). For example, you want to monitor DMZ Zone1 using GW1 and account User1. You define User1 as a UNIX/Linux Run As account with More Secure distribution, targeting the resource pool that holds GW1 or GW1 directly as an object.
After this, you configure the UNIX/Linux profiles (all three) and add User1, targeting the same resource pool.
However, this will give you error 4113 on GW1.
Looking at the discoveries in the UNIX/Linux Core Libraries, there is one that targets Microsoft.Unix.ComputerGroup. Targeting objects of type UNIX/Linux is therefore not enough, since this discovery will fail.

How to solve:
Configure custom UNIX/Linux groups, which can be dynamic or not, and add the DMZ servers to each of them: group 1, x, x+1, etc.
For the Run As accounts, keep the distribution for User1 targeted at the resource pool of GW1, but under Run As profiles select the corresponding custom group (Group1) as the target for User1 in each of the three UNIX/Linux profiles.
This way you'll get rid of the 4113 events and monitoring will work.

[SCOM] UNIX/Linux Run As Account settings for multiple DMZ, different resource pools
