Nexus Service Health

Concept

All (or some) web services in a business platform can take part in health checking by implementing the Nexus Service Health contract. The Nexus Health Checker service periodically makes requests to these web services and sends a notification if something is wrong.

Dependencies

Definition of dependency

Resource dependency: A web service uses resources such as databases, storage queues, and file systems.
Service dependency: A web service is dependent on another web service.

Service dependencies

In a business platform, services are dependent on each other. This is especially true of the business api, which calls adapters, and of the adapters, which call both the business api and their own systems. Services taking part in health checking implement the /ServiceMetas endpoint:

Contract

Service information

GET api/v1/ServiceMetas

{
  TechnicalName: "business-api",
  FriendlyName: "Acme Business API",
  InstanceId: "(optional) when running on multiple instances",
  Dependencies: [
    { ServiceMetasUrl: "https://prdsim-fulcrum-fundamentals.azurewebsites.net/api/v1/acme/prod/ServiceMetas" },
    { ServiceMetasUrl: "https://prdsim-fulcrum-businessevents-facade.azurewebsites.net/api/v1/acme/prod/ServiceMetas" },
    { ServiceMetasUrl: "https://adapter-1.acme.com/api/v1/ServiceMetas" },
    { ServiceMetasUrl: "https://adapter-2.acme.com/api/v1/ServiceMetas" },
    { ServiceMetasUrl: "https://it-capability-a.acme.com/api/v1/ServiceMetas" }
  ]
}
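
Since services can point at each other (the business api calls adapters, and adapters call the business api), anything that follows the Dependencies links needs to guard against cycles. The following is a minimal sketch of such a traversal, not the Nexus Health Checker's actual implementation. The function name, the injected `fetch_service_metas` callable, and the `adapter-1.example.org` URL are assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Dict, List


def resolve_dependency_tree(
    root_urls: List[str],
    fetch_service_metas: Callable[[str], dict],
) -> Dict[str, dict]:
    """Walk ServiceMetas documents breadth-first, visiting each URL once."""
    seen: Dict[str, dict] = {}
    queue = deque(root_urls)
    while queue:
        url = queue.popleft()
        if url in seen:
            continue  # already visited; this is the guard against cycles
        metas = fetch_service_metas(url)  # hypothetical: performs GET <url>
        seen[url] = metas
        for dependency in metas.get("Dependencies", []):
            queue.append(dependency["ServiceMetasUrl"])
    return seen


# In-memory stand-in for HTTP: two services that depend on each other.
fake_platform = {
    "https://example.org/api/v1/ServiceMetas": {
        "TechnicalName": "business-api",
        "Dependencies": [
            {"ServiceMetasUrl": "https://adapter-1.example.org/api/v1/ServiceMetas"}
        ],
    },
    "https://adapter-1.example.org/api/v1/ServiceMetas": {
        "TechnicalName": "adapter-1",
        "Dependencies": [
            {"ServiceMetasUrl": "https://example.org/api/v1/ServiceMetas"}
        ],
    },
}

tree = resolve_dependency_tree(
    ["https://example.org/api/v1/ServiceMetas"], fake_platform.__getitem__
)
print(sorted(tree))
```

Passing the fetch function in as a parameter keeps the traversal independent of any particular HTTP client and makes it easy to exercise without a network.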

Resource dependencies

Each service is typically dependent on some resource, e.g. a database, a storage queue, a file system, etc. An adapter should also check connectivity to its system. The health of a service takes these resources into account, and the service reports it through the /ServiceMetas/Health endpoint:

Contract

Health status

GET api/v1/ServiceMetas/Health

{
  Name: "it-capability-a",
  Timestamp: "2020-01-26T09:40:05.8318442+00:00",
  Status: 0,
  Message: "Ok",
  InstanceId: "(optional) when running on multiple instances",
  Resources: [
    {
      Resource: "Database",
      Status: 0,
      Message: "Ok"
    }
  ]
}
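
One plausible way to derive the top-level Status from the Resources list is to report the worst resource status. The sketch below illustrates that idea only; it is not the behaviour of ResourceHealthAggregator2. The sample above shows that 0 means Ok, but the values 1 for Warning and 2 for Error, the function name, and the "Queue" resource are assumptions.

```python
from datetime import datetime, timezone
from typing import List

# Assumption: 0 = Ok (as in the sample response). 1 = Warning and 2 = Error
# are guesses for this sketch, chosen so that a larger number is worse.
OK, WARNING, ERROR = 0, 1, 2


def aggregate_health(name: str, resources: List[dict]) -> dict:
    """Build a health document whose status is the worst resource status."""
    worst = max(resources, key=lambda r: r["Status"], default=None)
    status = worst["Status"] if worst else OK
    message = "Ok" if status == OK else f'{worst["Resource"]}: {worst["Message"]}'
    return {
        "Name": name,
        "Timestamp": datetime.now(timezone.utc).isoformat(),
        "Status": status,
        "Message": message,
        "Resources": resources,
    }


health = aggregate_health(
    "it-capability-a",
    [
        {"Resource": "Database", "Status": OK, "Message": "Ok"},
        {"Resource": "Queue", "Status": ERROR, "Message": "Timeout"},
    ],
)
print(health["Status"], health["Message"])
```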

Code

In the Nexus.Link.Libraries.Core library, there are supporting classes. Use

  • Health
  • HealthInfo
  • IResourceHealth2
  • ResourceHealthAggregator2

Health checking

One implementation of this pattern for checking health is the Nexus Health Checker service.

It periodically (every minute) resolves the dependency tree between services (caching it for 30 minutes) and performs a health check on each service. It saves the results to a database in the customer's cloud. If the result is a warning or an error, a notification is raised by email and/or as an event.

The settings for Nexus Health Checker are set up in the configuration database, along with the settings for other Nexus services. You only need to provide the Mandatory values.

To activate the health checking for a tenant, contact Nexus support.

Setting | Default | Comment
TenantConnectionString | Mandatory | The connection to the database. The database is multi-tenant, meaning the same database can be used for different environments.
ServiceMetasUrls | Mandatory | An array of URLs where service discovery begins to create the dependency tree. Typically the /ServiceMetas endpoint of the business api, which in turn points to all the dependencies in the platform. E.g. [ "https://example.org/api/v1/ServiceMetas" ]
HealthRequest.MaxDelaySeconds | 30 | The number of seconds after which a health request is cancelled and the health status is considered Error
HealthRequest.WarningDelaySeconds | 5 | If a health request takes longer than this, the health status is considered Warning
ServiceInformationRequest.MaxDelaySeconds | 5 | The number of seconds after which a service information request (to /ServiceMetas) is cancelled and the health status is considered Error
RecentHealthMaxAgeInSeconds | 15 | Within this timespan, a health result is re-used (no new request)
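
The two HealthRequest delay settings can be read as thresholds on the response time of a health request. The sketch below shows one way to apply them, using the defaults from the table. The function name and the numeric values for Warning and Error are assumptions, as is the choice to never downgrade a status the service itself reported.

```python
# Assumption: 0 = Ok as in the sample health response; 1 and 2 are guesses.
OK, WARNING, ERROR = 0, 1, 2


def classify_by_duration(
    elapsed_seconds: float,
    reported_status: int,
    warning_delay_seconds: float = 5.0,   # HealthRequest.WarningDelaySeconds
    max_delay_seconds: float = 30.0,      # HealthRequest.MaxDelaySeconds
) -> int:
    """Combine the status a service reported with how long it took to answer."""
    if elapsed_seconds >= max_delay_seconds:
        # The request would have been cancelled, so no response is trusted.
        return ERROR
    if elapsed_seconds > warning_delay_seconds:
        # Slow but answered: at least Warning, and Error stays Error.
        return max(reported_status, WARNING)
    return reported_status


print(classify_by_duration(1.0, OK))
print(classify_by_duration(6.0, OK))
print(classify_by_duration(31.0, OK))
```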
Notification setting | Default | Comment
Notifications.FromEmail | noreply@devnull.net | The "From" email address in the email alerts
NoRepeatedMailsForMinutes | 15 | The number of minutes to wait until notifying again that the health status is still not ok
Notifications.Event.PublicationId | null | The Nexus Business Events publication id, if using events as notification
ServiceAliveIntervalInMinutes | null | The interval at which to send "Service is alive" notifications (typically set to once or twice a day)
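
The notification text settings are grouped by state transition (FromOkToError, FromErrorToOk, StillError), and NoRepeatedMailsForMinutes throttles the repeated "still not ok" alerts. The sketch below shows one way such a decision could be made. It is an illustration under assumptions, not the service's actual logic: the function name and parameters are hypothetical, "not ok" is taken to cover both Warning and Error, and "Service is alive" notifications are left out.

```python
from datetime import datetime, timedelta
from typing import Optional


def choose_notification(
    previous_ok: bool,
    current_ok: bool,
    last_notified: Optional[datetime],
    now: datetime,
    no_repeated_mails_for_minutes: int = 15,  # NoRepeatedMailsForMinutes
) -> Optional[str]:
    """Return the name of the template group to send, or None to stay quiet."""
    if previous_ok and not current_ok:
        return "FromOkToError"
    if not previous_ok and current_ok:
        return "FromErrorToOk"
    if not current_ok:
        quiet_period = timedelta(minutes=no_repeated_mails_for_minutes)
        if last_notified is None or now - last_notified >= quiet_period:
            return "StillError"
    return None  # still ok, or still failing but notified recently


noon = datetime(2020, 1, 26, 12, 0)
print(choose_notification(True, False, None, noon))
print(choose_notification(False, False, noon - timedelta(minutes=5), noon))
print(choose_notification(False, False, noon - timedelta(minutes=20), noon))
```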

For the notification text settings below, variables are available:

  • {tenant}: The organization and environment of the Nexus tenant, e.g. "acme (prod)"
  • {top-status}: The HealthInfo.StatusEnum value of the whole health check
  • {top-message}: The message of the whole health check
  • {results}: A JSON serialized Dictionary of ServiceMetas url to Health results
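
To illustrate how the variables listed above could be expanded in a subject or message template, here is a minimal sketch. The function name and its parameters are assumptions, and the real service may substitute differently. A single regular-expression pass is used so that text inserted for one variable is never re-scanned for another, and `str.format` is avoided because names such as `top-status` are not valid format fields.

```python
import json
import re

# Matches only the four documented variables; other braces are left alone.
_PLACEHOLDER = re.compile(r"\{(tenant|top-status|top-message|results)\}")


def render_template(
    template: str,
    tenant: str,
    top_status: str,
    top_message: str,
    results: dict,
) -> str:
    """Replace the documented variables in a notification template."""
    values = {
        "tenant": tenant,
        "top-status": top_status,
        "top-message": top_message,
        "results": json.dumps(results, indent=2),
    }
    return _PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


subject = render_template(
    "Health check {top-status}: {tenant}",
    tenant="acme (prod)",
    top_status="Error",
    top_message="Database: Timeout",
    results={},
)
print(subject)  # Health check Error: acme (prod)
```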

Notifications.FromOkToError.Subject
(Defaults to something like) "Health check {top-status}: {tenant}"

Notifications.FromErrorToOk.Message
(Defaults to something like)

<h3>Health Checks for <em>{tenant}</em> are now Ok</h3>\r\n
\r\n
<hr />
<pre>
  Top status: {top-status}\r\n
 Top message: {top-message}\r\n
</pre>
<br /><hr />

Notifications.FromErrorToOk.Subject
(Defaults to something like) "Health check {top-status}: {tenant}"

Notifications.FromOkToError.Message
(Defaults to something like)

<h3>Health Checks for {tenant} are NOT Ok</h3>\r\n
\r\n
<hr />
<pre>
 Top status: {top-status}\r\n
Top message: {top-message}\r\n
</pre>
<br /><hr />\r\n
<p>Full results:</p>
<pre>{results}</pre>

Notifications.StillError.Subject
(Defaults to something like) "Health check STILL NOT OK: {tenant}"

Notifications.StillError.Message
(Defaults to something like)

<h3>Health Checks for {tenant} are STILL NOT Ok</h3>\r\n
\r\n
<hr />
<pre>
 Top status: {top-status}\r\n
Top message: {top-message}\r\n
</pre>
<br /><hr />\r\n
<p>Full results:</p>
<pre>{results}</pre>

Notifications.ServiceAlive.Subject
(Defaults to something like) "Health check is alive: {tenant}"

Notifications.ServiceAlive.Message
(Defaults to something like)

<p>Just letting you know that Health Check service is alive and checking tenant {tenant}.</p>\r\n
\r\n
<hr />
<pre>
 Top status: {top-status}\r\n
Top message: {top-message}\r\n
</pre>
<br /><hr />\r\n
<p>Results at {urlToThisHealthCheck}</p>

Notifications.Signature
(Defaults to something like)

\r\n
<hr />\r\n
<p><em>Nexus Health Checker</em></p>