Monitoring

Artifakt lets you monitor your application and your infrastructure as soon as your environment is ready, with no configuration required.

Health Check

Artifakt automatically monitors the health of your critical environments. We send HTTP requests to your applications at regular intervals and alert you by email in the event of an incident.

Alerting Rules

After 2 failed attempts, we consider an incident to be in progress on your environment and send you an email alert.

These are the rules:

  • An HTTP request is sent to each environment every 30 seconds

  • HTTP response code must be 200 OK

  • Environments must respond in less than 5 seconds

  • HTTP requests are sent to the main domain of the environments

Later, when your environment returns to a normal state and your application is back online, we send another email to inform you.
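The rules above can be illustrated with a minimal sketch. This is not Artifakt's implementation: the names `check_passes` and `IncidentTracker` are hypothetical, and the sketch assumes that the 2 failed attempts must be consecutive, which the rules do not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RESPONSE_SECONDS = 5.0  # rule: respond in less than 5 seconds
FAILURES_BEFORE_ALERT = 2   # rule: incident after 2 failed attempts


def check_passes(status_code: int, response_seconds: float) -> bool:
    """A check passes only on a 200 response received in under 5 seconds."""
    return status_code == 200 and response_seconds < MAX_RESPONSE_SECONDS


@dataclass
class IncidentTracker:
    """Hypothetical helper: decides when an email would be sent."""
    consecutive_failures: int = 0
    incident_open: bool = False

    def record(self, passed: bool) -> Optional[str]:
        """Return "incident", "recovery", or None for one check result."""
        if passed:
            self.consecutive_failures = 0
            if self.incident_open:
                self.incident_open = False
                return "recovery"  # environment is back to normal
            return None
        # Assumption: failures are counted consecutively.
        self.consecutive_failures += 1
        if not self.incident_open and self.consecutive_failures >= FAILURES_BEFORE_ALERT:
            self.incident_open = True
            return "incident"  # second failure in a row opens an incident
        return None


# Four checks, 30 seconds apart: OK, HTTP 500, too slow, OK again.
tracker = IncidentTracker()
results = [(200, 0.4), (500, 0.2), (200, 6.1), (200, 0.3)]
events = [tracker.record(check_passes(code, secs)) for code, secs in results]
print(events)  # [None, None, 'incident', 'recovery']
```

Note that the third check fails even though it returned 200, because the response took longer than 5 seconds.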

Emails are sent to the address defined in your profile.

Health checks and associated alerts are only available for critical environments.

When you create or are given access to a critical environment, alerts are automatically enabled for that environment. If you wish, you can disable alerts from your account settings.

Dashboard

Go to Environment → Monitoring → Health Check to view the history of the health checks we have performed for a specific environment.

You can see the global availability of your application as well as the average response time. Further down the page, you can view the status (Healthy or Unhealthy), the HTTP response code, and the response time for each request we made.
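As a rough illustration of how such figures can be derived from a check history, here is a short sketch. It assumes that availability is the share of checks reported as Healthy and that the average covers every check; the dashboard's exact formulas are not documented here, and `summarize` is a hypothetical name.

```python
from statistics import mean


def summarize(checks):
    """checks: list of (healthy, response_seconds) tuples.

    Returns (availability_percent, average_response_seconds),
    or None when there is no history yet.
    """
    if not checks:
        return None
    healthy_count = sum(1 for healthy, _ in checks if healthy)
    availability = 100.0 * healthy_count / len(checks)
    average_response = mean(secs for _, secs in checks)
    return availability, average_response


# Hypothetical history: three healthy checks and one unhealthy check.
history = [(True, 0.25), (True, 0.5), (False, 5.0), (True, 0.25)]
availability, average_response = summarize(history)
print(availability, average_response)  # 75.0 1.5
```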

[Figure: Health Check Dashboard]

Monitoring an Environment

Artifakt offers a simple way to see detailed information about your environments. It helps you understand how they behave over time and make good decisions in the event of an incident.

[Figure: Monitoring Diagram]

Go to Environment → Monitoring to see the available information. You will then be able to access different sections depending on your platform level (Starter or Scalable).

| Availability | Starter | Scalable |
|--------------|---------|----------|
| Requests     | Yes     | Yes      |
| Compute      | Yes     | Yes      |
| Storage      | Yes     | Yes      |
| Database     | No      | Yes      |

Each diagram can be displayed using 4 different time scales:

| Time scale | Data points                             |
|------------|-----------------------------------------|
| Last hour  | Data points available every minute.     |
| Last day   | Data points available every 10 minutes. |
| Last week  | Data points available every hour.       |
| Last month | Data points available every 6 hours.    |
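To give a sense of the resolution of each time scale, the following sketch computes how many data points a diagram would contain. It is simple arithmetic on the intervals listed above; the 30-day length for "Last month" is an assumption, and `points_per_diagram` is a hypothetical name.

```python
from datetime import timedelta

# (window length, interval between data points) for each time scale
TIME_SCALES = {
    "Last hour": (timedelta(hours=1), timedelta(minutes=1)),
    "Last day": (timedelta(days=1), timedelta(minutes=10)),
    "Last week": (timedelta(weeks=1), timedelta(hours=1)),
    "Last month": (timedelta(days=30), timedelta(hours=6)),  # assumption: 30 days
}


def points_per_diagram(window: timedelta, interval: timedelta) -> int:
    """Number of whole intervals that fit in the window."""
    return window // interval


for name, (window, interval) in TIME_SCALES.items():
    print(name, points_per_diagram(window, interval))
# Last hour 60
# Last day 144
# Last week 168
# Last month 120
```

Under these assumptions every time scale yields between 60 and 168 points, so longer windows trade detail for range.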

The data displayed in the monitoring section is not available in real time. There is a delay of a few seconds between what happens on an environment and the availability of data in the diagrams.

Requests

All data related to the web requests processed by the platform.

| Metrics | Description |
|---------|-------------|
| Web Traffic | This diagram displays the volume of HTTP requests processed by your environment. To be more specific, it is the volume of HTTP requests processed by the Web servers of the platform hosting your environment. |
| HTTP Response Codes | This diagram displays the HTTP response codes (2xx, 3xx, 4xx, 5xx) of the Web requests processed by the Web servers. |
| Web Response Time | This diagram displays the average elapsed time between the sending of HTTP requests from the load balancer to the Web servers, and the response from these Web servers. |

Compute

All data related to web server usage.

| Metrics | Description |
|---------|-------------|
| CPU Usage | This diagram shows the amount of CPU used by the web servers (percentage). |
| Memory Usage | This diagram shows the amount of memory (RAM) used by the web servers. |
| Load | This diagram shows the average CPU load of the web servers over a 5-minute period. |

The load must remain below the total number of CPUs available. Otherwise, your environment is demanding more computing capacity than is available. This can indicate:

  • That your application needs to be optimized to consume fewer resources

  • That the platform is not correctly sized in relation to the real needs of your application
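The same rule can be checked by hand on a Unix machine. The sketch below is only an illustration: `load_exceeds_capacity` is a hypothetical helper, and the check uses Python's standard `os.getloadavg()` and `os.cpu_count()` rather than anything provided by Artifakt.

```python
import os


def load_exceeds_capacity(load_average: float, cpu_count: int) -> bool:
    """True when the load is not strictly below the number of CPUs."""
    return load_average >= cpu_count


# os.getloadavg() is only available on Unix-like systems.
if hasattr(os, "getloadavg"):
    _, load_5min, _ = os.getloadavg()  # 1-, 5- and 15-minute averages
    cpus = os.cpu_count() or 1         # cpu_count() may return None
    if load_exceeds_capacity(load_5min, cpus):
        print(f"Load {load_5min:.2f} is at or above {cpus} CPUs")
    else:
        print(f"Load {load_5min:.2f} is below {cpus} CPUs")
```

For example, a 5-minute load of 4.2 on a 4-CPU server breaks the rule, while a load of 1.5 on the same server does not.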

Storage

All data related to the persistent volumes.

| Metrics | Description |
|---------|-------------|
| Disk Usage | This diagram indicates the amount of storage used by each of the Web servers. |
| Read/Write Volume | This diagram indicates the volume of read/write operations over a given period of time (total number of bytes transferred during this period). |
| Read/Write Operations | This diagram shows the total number of read/write operations over a given period of time. |
| Read/Write Time | This diagram indicates the total number of seconds elapsed for all read/write operations to complete over a given period of time. |

For Scalable platforms, only the Read/Write Volume diagram is available. The Scalable platform uses an elastic file system, so the other diagrams are of little interest in this context.

Database

All data related to the database usage.

| Metrics | Description |
|---------|-------------|
| Database Connections | This diagram indicates the number of connections to the database at a given time. |
| Free Memory | This diagram indicates the memory (RAM) still available for the database server. |
| CPU Usage | This diagram shows the amount of CPU used by the database server (percentage). |
| Free Storage Space | This diagram indicates the amount of storage still available for the database server. |
| Read/Write Latency | This diagram indicates the average time required per read/write operation on the database server disk. |
| Read/Write Operations (Ops/s) | This diagram indicates the average number of read/write operations per second on the database server disk. |