Healthchecks created via Terraform do not work #187
Comments
Hi @vector623, thanks for opening this issue. It seems that the health check may not be the problem, as the issue still appears when I try without it. I will investigate and let you know what I find.
Hi @vector623, we've looked and I think you are trying to register a service on a node where a Consul agent is running (an internal service). The
The documentation of the provider (https://www.terraform.io/docs/providers/consul/r/service.html) mentions this briefly:
This is not related to the health check; you should see the same behaviour when registering the service without health checks. You mentioned that the same service created using cURL works; I think you are creating it using The
@remilapeyre wouldn't it be possible to combine the two and abstract that complexity away from users? Healthchecks ftw!!
Still cannot get TCP health checks working, let alone HTTP health checks. Let's take two services as an example: Prometheus, which has to be configured using TCP checks on port

Tested on Consul v1.15.3

Curl Checks

I have Prometheus running on IP

$ curl -i 192.168.55.120:9090
HTTP/1.1 302 Found
Content-Type: text/html; charset=utf-8
Location: /graph
Date: Tue, 11 Jul 2023 18:15:55 GMT
Content-Length: 29
<a href="/graph">Found</a>.

If Prometheus answers HTTP requests, then it is certainly reachable over TCP on that port.

I have Grafana running on IP

$ curl -i 192.168.55.121:3000/api/health
HTTP/1.1 200 OK
Cache-Control: no-store
Content-Type: application/json; charset=UTF-8
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-Xss-Protection: 1; mode=block
Date: Tue, 11 Jul 2023 18:17:30 GMT
Content-Length: 71
{
  "commit": "5a30620b85",
  "database": "ok",
  "version": "10.0.1"
}

Grafana is working as well.

Configuring healthchecks with terraform-provider-consul

Now let's create the necessary service healthcheck resources.

Prometheus configuration

Configuring health checks for Prometheus:

resource "consul_node" "node" {
  count = 1
  datacenter = "dc1"
  address = "192.168.55.120"
  name = "prometheus01"
}
resource "consul_service" "svc" {
  count = 1
  name = "prometheus01"
  node = "prometheus01"
  address = "192.168.55.120"
  datacenter = "dc1"
  port = 9090
  check {
    check_id = "service:prometheus01"
    name = "Prometheus Health Check"
    notes = "Checks for a TCP connection on port 9090"
    tcp = "192.168.55.120:9090"
    interval = "10s"
    timeout = "2s"
    deregister_critical_service_after = "60s"
  }
}

Prometheus results

Grafana configuration

Configuring health checks for Grafana:

resource "consul_node" "node" {
  datacenter = "dc1"
  address = "192.168.55.121"
  name = "grafana01"
}
resource "consul_service" "svc" {
  name = "grafana01"
  node = "grafana01"
  address = "192.168.55.121"
  datacenter = "dc1"
  port = 3000
  check {
    check_id = "service:grafana01"
    name = "Grafana Health Check"
    http = "/api/health"
    notes = "Checks for a GET /api/health request on port 3000"
    tls_skip_verify = true
    method = "GET"
    interval = "10s"
    timeout = "2s"
    deregister_critical_service_after = "30s"
    header {
      name = "Accept"
      value = ["application/json"]
    }
  }
}

Grafana results

Conclusion

With what has been demonstrated above, I have three questions:
Relevant issues: #124
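For anyone reading the curl output above against Consul's HTTP check semantics: as documented by Consul, an HTTP check counts any 2xx response as passing, a 429 as a warning, and anything else as a failure. Below is a minimal Python sketch of that mapping. The function name `consul_http_status` is hypothetical and not part of Consul or the provider, and the sketch does not model whether a redirect such as the 302 from Prometheus is followed before the status is evaluated.

```python
def consul_http_status(status_code: int) -> str:
    """Map an HTTP response code to a Consul-style check state.

    Mirrors the documented rule for Consul HTTP checks:
    2xx -> passing, 429 -> warning, everything else -> critical.
    The function name is hypothetical (illustration only).
    """
    if 200 <= status_code <= 299:
        return "passing"
    if status_code == 429:
        return "warning"
    return "critical"


# Example calls, using the status codes from the curl output above:
#   consul_http_status(200)  # Grafana /api/health
#   consul_http_status(302)  # Prometheus / (redirect to /graph, if not followed)
```

This only describes how a response would be classified once a check actually runs; it says nothing about who runs the check.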
Hi @mbrav. Not sure if I'm doing archaeology here, but I just struggled through this myself. This looks like a non-issue to me, although it didn't at first. It's a non-issue because, although the service and the service health check are declared, there is no external service monitor to actually perform the health checks. I run So, registered services start off critical, but are updated to healthy as they are discovered by
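To illustrate the point about something having to actually perform the check: a declared TCP check is only a definition until some process probes the address and reports the result. The following is a rough Python sketch of such a probe, under the assumption that a successful TCP connect means "passing" and any connection error means "critical". The function name `tcp_check` is hypothetical, and this is not how any particular monitor is implemented.

```python
import socket


def tcp_check(address: str, timeout: float = 2.0) -> str:
    """Probe a "host:port" address the way a TCP health check would.

    Returns "passing" if a TCP connection can be opened within the
    timeout, otherwise "critical". Hypothetical helper for illustration.
    """
    host, _, port = address.rpartition(":")
    try:
        # create_connection performs the TCP handshake; success is enough
        # for a TCP-level check, no data needs to be exchanged.
        with socket.create_connection((host, int(port)), timeout=timeout):
            return "passing"
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return "critical"


# Example call, using the address from the Prometheus check above:
#   tcp_check("192.168.55.120:9090", timeout=2.0)
```

Until a monitor runs something like this and writes the result back, a check registered for an external node simply keeps its initial state.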
Terraform Version
Affected Resource(s)
Terraform Configuration Files
Debug Output
https://gist.github.com/vector623/d193f3292790bf7f1119c57bafd4e561
Expected Behavior
Health check should execute successfully. If it fails, it should not deregister for 90 minutes.
Actual Behavior
Health check fails and deregisters within a minute.
Steps to Reproduce
Please list the steps required to reproduce the issue, for example:
terraform init
terraform apply -auto-approve
Important Factoids
References
Are there any other GitHub issues (open or closed) or Pull Requests that should be linked here? For example: