Concept
All (or some) web services in a business platform can be part of health checking by implementing the Nexus Service Health contract. The Nexus Health Checker service will periodically make requests to the web services and notify if something is wrong.
Dependencies
Resource dependency: A web service uses resources such as databases, storage queues and file systems.
Service dependency: A web service is dependent on another web service.
Service dependencies
In a business platform, services depend on each other. This is especially true of the business api, which calls adapters, and of the adapters, which call the business api and their own systems. Services taking part in health checking implement the /ServiceMetas endpoint:
Service information
GET api/v1/ServiceMetas
{
TechnicalName: "business-api",
FriendlyName: "Acme Business API",
InstanceId: "(optional) when running on multiple instances",
Dependencies: [
{ ServiceMetasUrl: "https://prdsim-fulcrum-fundamentals.azurewebsites.net/api/v1/acme/prod/ServiceMetas" },
{ ServiceMetasUrl: "https://prdsim-fulcrum-businessevents-facade.azurewebsites.net/api/v1/acme/prod/ServiceMetas" },
{ ServiceMetasUrl: "https://adapter-1.acme.com/api/v1/ServiceMetas" },
{ ServiceMetasUrl: "https://adapter-2.acme.com/api/v1/ServiceMetas" },
{ ServiceMetasUrl: "https://it-capability-a.acme.com/api/v1/ServiceMetas" }
]
}
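Discovery over these responses can be sketched as a breadth-first walk that follows each ServiceMetasUrl. This is a minimal illustration, not the Nexus Health Checker implementation: the function name `resolve_dependency_tree`, the injected `fetch_service_metas` callable and the example.org URLs are assumptions made for the example. Because the business api and the adapters call each other, the walk must tolerate cycles.

```python
from typing import Callable, Dict, List


def resolve_dependency_tree(
    root_urls: List[str],
    fetch_service_metas: Callable[[str], dict],
) -> Dict[str, List[str]]:
    """Walk /ServiceMetas responses breadth-first; return url -> dependency urls."""
    tree: Dict[str, List[str]] = {}
    queue = list(root_urls)
    while queue:
        url = queue.pop(0)
        if url in tree:  # already visited; this also guards against cycles
            continue
        metas = fetch_service_metas(url)
        deps = [d["ServiceMetasUrl"] for d in metas.get("Dependencies", [])]
        tree[url] = deps
        queue.extend(deps)
    return tree


# Hypothetical in-memory stand-in for the HTTP GET requests.
FAKE_RESPONSES = {
    "https://example.org/api/v1/ServiceMetas": {
        "TechnicalName": "business-api",
        "Dependencies": [
            {"ServiceMetasUrl": "https://adapter-1.example.org/api/v1/ServiceMetas"},
        ],
    },
    "https://adapter-1.example.org/api/v1/ServiceMetas": {
        "TechnicalName": "adapter-1",
        # The adapter calls back into the business api, forming a cycle.
        "Dependencies": [
            {"ServiceMetasUrl": "https://example.org/api/v1/ServiceMetas"},
        ],
    },
}

tree = resolve_dependency_tree(
    ["https://example.org/api/v1/ServiceMetas"], FAKE_RESPONSES.__getitem__
)
```

Passing the fetch function in as a parameter keeps the walk testable without network access; a real caller would supply a function that performs the GET request and parses the JSON body.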
Resource dependencies
Each service typically depends on some resource, e.g. a database, a storage queue or a file system. An adapter should also check connectivity to its system. The health of a service takes this into account, and the service implements the /ServiceMetas/Health endpoint:
Health status
GET api/v1/ServiceMetas/Health
{
Name: "it-capability-a",
Timestamp: "2020-01-26T09:40:05.8318442+00:00",
Status: 0,
Message: "Ok",
InstanceId: "(optional) when running on multiple instances",
Resources: [
{
Resource: "Database",
Status: 0,
Message: "Ok"
}
]
}
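One way to derive the top-level Status from the Resources list is to take the worst resource status. The sketch below is illustrative only and is not the API of Nexus.Link.Libraries.Core: the function `aggregate_health` is hypothetical, and only the value 0 = Ok is taken from the example above, while the numeric values for Warning and Error are assumptions.

```python
from enum import IntEnum
from typing import List


class Status(IntEnum):
    # Only 0 = Ok appears in the example response; the other values are assumed.
    OK = 0
    WARNING = 1
    ERROR = 2


def aggregate_health(name: str, resources: List[dict]) -> dict:
    """Build a health response whose status is the worst status among its resources."""
    worst = max((Status(r["Status"]) for r in resources), default=Status.OK)
    failing = [r["Resource"] for r in resources if r["Status"] != Status.OK]
    message = "Ok" if worst == Status.OK else "Problems with: " + ", ".join(failing)
    return {
        "Name": name,
        "Status": int(worst),
        "Message": message,
        "Resources": resources,
    }


health = aggregate_health(
    "it-capability-a",
    [
        {"Resource": "Database", "Status": 0, "Message": "Ok"},
        {"Resource": "Storage queue", "Status": 2, "Message": "Connection refused"},
    ],
)
```

Here the failing storage queue makes the whole service report status 2, even though the database is fine.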
Code
The Nexus.Link.Libraries.Core library contains supporting classes. Use Health, HealthInfo, IResourceHealth2 and ResourceHealthAggregator2.
Health checking
One implementation of this pattern for checking health is the Nexus Health Checker service.
It will periodically (every minute) resolve the dependency tree between services (caching it for 30 minutes) and perform health checks on each service. It saves the result to a database (in the customer's cloud). If the result is a warning or error, a notification is raised by email and/or an event.
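The combination of a one-minute check interval and a 30-minute cache for the dependency tree can be pictured with a small time-to-live cache. This is a sketch under assumptions, not the service's actual code: the class `TtlCache`, the loader `load_tree` and the fake clock are all hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class TtlCache(Generic[T]):
    """Caches one loaded value and reloads it once it is ttl_seconds old."""

    def __init__(self, loader: Callable[[], T], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._has_value = False
        self._value = None
        self._loaded_at = 0.0

    def get(self) -> T:
        now = self._clock()
        if not self._has_value or now - self._loaded_at >= self._ttl:
            self._value = self._loader()
            self._loaded_at = now
            self._has_value = True
        return self._value


# Demonstration with a fake clock so that no real waiting is needed.
fake_now = [0.0]
load_count = [0]


def load_tree() -> dict:
    """Hypothetical loader; a real one would walk the /ServiceMetas endpoints."""
    load_count[0] += 1
    return {"https://example.org/api/v1/ServiceMetas": []}


tree_cache = TtlCache(load_tree, ttl_seconds=30 * 60, clock=lambda: fake_now[0])

for minute in range(31):  # one check per minute, minutes 0 to 30
    fake_now[0] = minute * 60.0
    tree_cache.get()
```

Over these 31 simulated checks the tree is loaded twice: once at minute 0 and again when the cached copy reaches 30 minutes of age.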
The settings for Nexus Health Checker are set up in the configuration database, along with the settings for other Nexus services. You only need to provide the Mandatory values.
To activate the health checking for a tenant, contact Nexus support.
Setting | Default | Comment |
---|---|---|
TenantConnectionString | Mandatory | The connection to the database. The database is multi-tenant, meaning the same database can be used for different environments. |
ServiceMetasUrls | Mandatory | An array of urls where service discovery begins to create the dependency tree. Typically the /ServiceMetas endpoint of the business api, which in turn points to all the dependencies in the platform. E.g. [ "https://example.org/api/v1/ServiceMetas" ] |
HealthRequest.MaxDelaySeconds | 30 | The number of seconds after which a health request is cancelled and health status is considered Error |
HealthRequest.WarningDelaySeconds | 5 | If a health request is taking longer than this, the health status is considered Warning |
ServiceInformationRequest.MaxDelaySeconds | 5 | The number of seconds after which a service information request (to /ServiceMetas) is cancelled and health status is considered Error |
RecentHealthMaxAgeInSeconds | 15 | Within this timespan, a health result will be re-used (no new request) |
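The two HealthRequest delay settings can be read as thresholds on the response time of a health request. The sketch below shows one plausible interpretation using the default values from the table. The names `HealthRequestSettings` and `classify_by_duration` are hypothetical, and how the real service combines timing with the status a service reports about itself is an assumption here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthRequestSettings:
    max_delay_seconds: float = 30.0     # HealthRequest.MaxDelaySeconds
    warning_delay_seconds: float = 5.0  # HealthRequest.WarningDelaySeconds


def classify_by_duration(
    elapsed_seconds: float,
    reported_status: str,
    settings: HealthRequestSettings = HealthRequestSettings(),
) -> str:
    """Combine the status a service reports with how long the request took."""
    if elapsed_seconds >= settings.max_delay_seconds:
        return "Error"  # the request would have been cancelled
    if reported_status == "Ok" and elapsed_seconds > settings.warning_delay_seconds:
        return "Warning"  # answered, but slower than the warning threshold
    return reported_status  # assumption: a reported problem is never downgraded


fast = classify_by_duration(2.0, "Ok")
slow = classify_by_duration(6.0, "Ok")
timed_out = classify_by_duration(30.0, "Ok")
```

With the defaults, a response after 2 seconds keeps its reported status, a response after 6 seconds is downgraded to Warning, and a request that reaches 30 seconds counts as Error.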
Notification setting | Default | Comment |
---|---|---|
Notifications.FromEmail | noreply@devnull.net | The "From" email address in the email alerts |
NoRepeatedMailsForMinutes | 15 | The number of minutes to wait until notifying again that the health status is still not ok |
Notifications.Event.PublicationId | null | The Nexus Business Events publication id, if using events as notification |
ServiceAliveIntervalInMinutes | null | The interval at which to send "Service is alive" notifications (typically set to once or twice a day) |
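The interplay between status transitions and NoRepeatedMailsForMinutes can be sketched as a small decision function. This is an illustration under assumptions rather than the service's actual logic: the function `choose_notification` and its parameters are hypothetical, and the returned names simply mirror the prefixes of the notification text settings (FromOkToError, FromErrorToOk, StillError). A warning is treated as "not ok" here.

```python
from typing import Optional


def choose_notification(
    previous_ok: bool,
    current_ok: bool,
    minutes_since_last_mail: Optional[float],
    no_repeated_mails_for_minutes: float = 15,
) -> Optional[str]:
    """Return the kind of notification to send, or None to stay silent."""
    if previous_ok and not current_ok:
        return "FromOkToError"
    if not previous_ok and current_ok:
        return "FromErrorToOk"
    if not current_ok:
        # Still failing: repeat the alert only after the quiet period has passed.
        if (minutes_since_last_mail is None
                or minutes_since_last_mail >= no_repeated_mails_for_minutes):
            return "StillError"
    return None


first_failure = choose_notification(True, False, None)
too_soon = choose_notification(False, False, 5)
reminder = choose_notification(False, False, 15)
recovered = choose_notification(False, True, 3)
```

The quiet period only throttles the repeated "still not ok" alerts; in this sketch a change of state in either direction is reported immediately.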
For the notification text settings below, variables are available:
{tenant}: The organization and environment of the Nexus tenant, e.g. "acme (prod)"
{top-status}: The HealthInfo.StatusEnum value of the whole health check
{top-message}: The message of the whole health check
{results}: A JSON serialized Dictionary of ServiceMetas url to Health results
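To make the variable substitution concrete, here is a minimal sketch of how such a template could be rendered. The function `render_template` and the sample values are assumptions for illustration; the actual rendering in Nexus Health Checker may differ.

```python
import json
import re


def render_template(template: str, variables: dict) -> str:
    """Replace {name} placeholders in a single pass; unknown names are left as-is."""
    def substitute(match: "re.Match") -> str:
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return re.sub(r"\{([A-Za-z-]+)\}", substitute, template)


variables = {
    "tenant": "acme (prod)",
    "top-status": "Error",
    "top-message": "Database is unreachable",
    # {results}: ServiceMetas url -> health result, serialized as JSON
    "results": json.dumps(
        {"https://example.org/api/v1/ServiceMetas": {"Status": 2, "Message": "Database is unreachable"}}
    ),
}

subject = render_template("Health check {top-status}: {tenant}", variables)
body = render_template("<pre>{results}</pre>", variables)
```

With these sample values, `subject` becomes "Health check Error: acme (prod)". Substituting in a single pass means that braces inside the JSON in {results} are not themselves treated as placeholders.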
Notifications.FromOkToError.Subject
(Defaults to something like) "Health check {top-status}: {tenant}"
Notifications.FromErrorToOk.Message
(Defaults to something like)
<h3>Health Checks for <em>{tenant}</em> are now Ok</h3>\r\n
\r\n
<hr />
<pre>
Top status: {top-status}\r\n
Top message: {top-message}\r\n
</pre>
<br /><hr />
Notifications.FromErrorToOk.Subject
(Defaults to something like) "Health check {top-status}: {tenant}"
Notifications.FromOkToError.Message
(Defaults to something like)
<h3>Health Checks for {tenant} are NOT Ok</h3>\r\n
\r\n
<hr />
<pre>
Top status: {top-status}\r\n
Top message: {top-message}\r\n
</pre>
<br /><hr />\r\n
<p>Full results:</p>
<pre>{results}</pre>
Notifications.StillError.Subject
(Defaults to something like) "Health check STILL NOT OK: {tenant}"
Notifications.StillError.Message
(Defaults to something like)
<h3>Health Checks for {tenant} are STILL NOT Ok</h3>\r\n
\r\n
<hr />
<pre>
Top status: {top-status}\r\n
Top message: {top-message}\r\n
</pre>
<br /><hr />\r\n
<p>Full results:</p>
<pre>{results}</pre>
Notifications.ServiceAlive.Subject
(Defaults to something like) "Health check is alive: {tenant}"
Notifications.ServiceAlive.Message
(Defaults to something like)
<p>Just letting you know that Health Check service is alive and checking tenant {tenant}.</p>\r\n
\r\n
<hr />
<pre>
Top status: {top-status}\r\n
Top message: {top-message}\r\n
</pre>
<br /><hr />\r\n
<p>Results at {urlToThisHealthCheck}</p>
Notifications.Signature
(Defaults to something like)
\r\n
<hr />\r\n
<p><em>Nexus Health Checker</em></p>