Spring Boot · 2023-11-07

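Using Spring Boot Actuator with Docker and docker-compose Health Checks

Preface

The Spring Boot Actuator module provides production-grade features such as health checks, auditing, metrics collection, and HTTP tracing, which help us monitor and manage Spring Boot applications.

Because it exposes information about the application's internals, Actuator can also be integrated with external monitoring systems (Prometheus, Graphite, DataDog, Influx, Wavefront, New Relic, and others).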

I. Maven

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.7.2</version>
    <relativePath/>
</parent>

<properties>
    <maven.compiler.source>8</maven.compiler.source>
    <maven.compiler.target>8</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-jdbc</artifactId>
    </dependency>

    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>8.0.32</version>
        <scope>runtime</scope>
    </dependency>
</dependencies>

<build>
    <plugins>
        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
        </plugin>
    </plugins>
</build>

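II. Common Actuator endpoints

HTTP method   Endpoint               Description
GET           /actuator              Lists the Actuator endpoints that are currently exposed
GET           /actuator/beans        Lists all beans in the running application and their dependencies
GET           /actuator/conditions   Shows the auto-configuration report: which conditions matched and which did not
GET           /actuator/env          Shows all environment properties: which property sources Spring Boot loaded and their values
GET           /actuator/health       Shows the application's health status, as reported by HealthIndicator implementations
GET           /actuator/mappings     Lists all request mappings (including Actuator's own) and the controllers that handle them
GET           /actuator/metrics      Lists the names of the available metrics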

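III. Configuration

1. Default exposure

  • management.endpoints.web.exposure.include

By default, only /actuator/health is exposed over HTTP. (Spring Boot versions before 2.5 also exposed /actuator/info; the 2.7.2 version used here does not.)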

2. Exposure settings

# Expose every enabled endpoint (shutdown is disabled by default, so it is not included)
management.endpoints.web.exposure.include=*

# Expose only the listed endpoints; here just /actuator/beans and /actuator/mappings
management.endpoints.web.exposure.include=beans,mappings

# exclude hides specific endpoints and takes precedence over include
# It is usually combined with include: include everything, then exclude a few
management.endpoints.web.exposure.exclude=beans
management.endpoints.web.exposure.include=*
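
The interaction between include and exclude can be sketched in plain Java. This is only a simplified model of the rule described above, not Spring's actual implementation; the class ExposureRules and its methods are hypothetical names introduced for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Simplified, hypothetical model of web endpoint exposure; not Spring's own code. */
class ExposureRules {

    /** Splits a comma-separated property value such as "beans,mappings" into a set of ids. */
    static Set<String> parse(String value) {
        Set<String> ids = new HashSet<>();
        for (String part : value.split(",")) {
            String id = part.trim();
            if (!id.isEmpty()) {
                ids.add(id);
            }
        }
        return ids;
    }

    /** An endpoint is exposed when include matches it and exclude does not. */
    static boolean isExposed(String id, String include, String exclude) {
        Set<String> included = parse(include);
        Set<String> excluded = parse(exclude);
        if (excluded.contains("*") || excluded.contains(id)) {
            return false; // exclude takes precedence over include
        }
        return included.contains("*") || included.contains(id);
    }

    public static void main(String[] args) {
        System.out.println(isExposed("beans", "*", "beans"));        // false
        System.out.println(isExposed("health", "*", "beans"));       // true
        System.out.println(isExposed("env", "beans,mappings", ""));  // false
    }
}
```

The model ignores whether an endpoint is enabled at all, which in a real application is a separate precondition for exposure.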

3. Path mapping

# Every /actuator/xxx path becomes /manage/xxx
management.endpoints.web.base-path=/manage

# The health endpoint can also be renamed to healthcheck
management.endpoints.web.path-mapping.health=healthcheck

4. Management port

# Serve the endpoints on a separate port; defaults to the same port as server.port
management.server.port=8081
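
To see how the three settings above combine, here is a small sketch that assembles the resulting URL. The class ActuatorUrls and the method endpointUrl are hypothetical helpers, and the host name localhost is an assumption; the code only illustrates the string composition and does not read any Spring configuration.

```java
/** Hypothetical helper that shows how base-path, path-mapping and management port combine. */
class ActuatorUrls {

    /**
     * @param host           host name of the application (assumed, e.g. "localhost")
     * @param managementPort value of management.server.port
     * @param basePath       value of management.endpoints.web.base-path, e.g. "/manage"
     * @param mappedPath     endpoint id or its path-mapping override, e.g. "healthcheck"
     */
    static String endpointUrl(String host, int managementPort, String basePath, String mappedPath) {
        // Normalise slashes so "/manage/" + "/health" does not produce "//"
        String base = basePath.endsWith("/") ? basePath.substring(0, basePath.length() - 1) : basePath;
        String path = mappedPath.startsWith("/") ? mappedPath.substring(1) : mappedPath;
        return "http://" + host + ":" + managementPort + base + "/" + path;
    }

    public static void main(String[] args) {
        // With base-path=/manage, path-mapping.health=healthcheck and port 8081:
        System.out.println(endpointUrl("localhost", 8081, "/manage", "healthcheck"));
        // prints http://localhost:8081/manage/healthcheck
    }
}
```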

IV. Example

1. application.properties

spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://127.0.0.1:3306/test
spring.datasource.username=root
spring.datasource.password=123456

# Management port; defaults to the same port as server.port
management.server.port=8081

# Expose only the health endpoint and change the base path
management.endpoints.web.exposure.include=health
management.endpoints.web.base-path=/manage

# Required to see the detailed health information
management.endpoint.health.show-details=always

# Once the application connects to MySQL, Redis, etc., the matching health indicators are registered automatically
# diskspace corresponds to DiskSpaceHealthIndicator, db to DataSourceHealthIndicator
# If any component is unhealthy, the overall status is DOWN; an individual component's check can be disabled via configuration
management.health.defaults.enabled=true
management.health.diskspace.enabled=true
management.health.db.enabled=true

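2. Custom health indicator

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class MyHealthIndicator implements HealthIndicator {

    // Call counter used by the simulated check
    private static int num = 0;

    @Override
    public Health health() {
        // Run an application-specific check here. Uncomment the block below to
        // simulate a check that alternates between failure and success on each call.
        // int errorCode = check();
        // if (errorCode != 0) {
            // return Health.down().withDetail("Error Code", errorCode).build();
        // }
        return Health.up().build();
    }

    // Returns 1 on odd-numbered calls (failure) and 0 on even-numbered calls (success)
    private int check() {
        num++;
        return num % 2;
    }
}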
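
The indicator above shows up in the health response under the key "my". As a rough illustration of where that key comes from, the sketch below strips the HealthIndicator or HealthContributor suffix from the bean name. It is a simplified model written for this article, under the assumption that the bean name is the default one (myHealthIndicator); the class HealthNames is hypothetical and is not Spring's own code.

```java
import java.util.Locale;

/** Hypothetical model of how a health component key is derived from a bean name. */
class HealthNames {

    // Suffixes removed from the bean name, compared case-insensitively
    static final String[] SUFFIXES = {"healthindicator", "healthcontributor"};

    static String componentName(String beanName) {
        String lower = beanName.toLowerCase(Locale.ENGLISH);
        for (String suffix : SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return beanName.substring(0, beanName.length() - suffix.length());
            }
        }
        return beanName; // no known suffix: the bean name is used as-is
    }

    public static void main(String[] args) {
        System.out.println(componentName("myHealthIndicator"));        // my
        System.out.println(componentName("diskSpaceHealthIndicator")); // diskSpace
    }
}
```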

3. Result

A request to http://localhost:8081/manage returns:

{
  "_links": {
    "self": {
      "href": "http://localhost:8081/manage",
      "templated": false
    },
    "health": {
      "href": "http://localhost:8081/manage/health",
      "templated": false
    },
    "health-path": {
      "href": "http://localhost:8081/manage/health/{*path}",
      "templated": true
    }
  }
}

A request to http://localhost:8081/manage/health returns the following.

When the custom indicator reports UP:

{
  "status": "UP",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "MySQL",
        "validationQuery": "isValid()"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 490577010688,
        "free": 183808139264,
        "threshold": 10485760,
        "exists": true
      }
    },
    "my": {
      "status": "UP"
    },
    "ping": {
      "status": "UP"
    }
  }
}

When the custom indicator reports DOWN:

{
  "status": "DOWN",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "MySQL",
        "validationQuery": "isValid()"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 490577010688,
        "free": 183936602112,
        "threshold": 10485760,
        "exists": true
      }
    },
    "my": {
      "status": "DOWN",
      "details": {
        "Error Code": 1
      }
    },
    "ping": {
      "status": "UP"
    }
  }
}
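
The two responses above show that a single DOWN component makes the overall status DOWN. A minimal sketch of that aggregation follows. It assumes the default severity order DOWN, OUT_OF_SERVICE, UP, UNKNOWN and the default mapping of DOWN and OUT_OF_SERVICE to HTTP 503; the class StatusAggregation is a hypothetical stand-in, not the Actuator implementation.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical model of health status aggregation; assumes Actuator's default ordering. */
class StatusAggregation {

    // Most severe status first
    static final List<String> ORDER = Arrays.asList("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    /** Returns the most severe status among the components, or UNKNOWN if there are none. */
    static String aggregate(Collection<String> statuses) {
        String worst = "UNKNOWN";
        for (String status : statuses) {
            if (ORDER.contains(status) && ORDER.indexOf(status) < ORDER.indexOf(worst)) {
                worst = status;
            }
        }
        return worst;
    }

    /** HTTP status returned by the health endpoint for an aggregated status (assumed defaults). */
    static int httpStatus(String status) {
        return ("DOWN".equals(status) || "OUT_OF_SERVICE".equals(status)) ? 503 : 200;
    }

    public static void main(String[] args) {
        // Component statuses in the order db, diskSpace, my, ping
        String overall = aggregate(Arrays.asList("UP", "UP", "DOWN", "UP"));
        System.out.println(overall + " -> HTTP " + httpStatus(overall)); // DOWN -> HTTP 503
    }
}
```

The HTTP status matters for the container health checks later on: a probe such as curl -f treats the 503 as a failure.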

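V. Docker health check

Since version 1.12, Docker has provided the HEALTHCHECK instruction. It specifies a command that Docker runs to determine whether the service in the container's main process is still working, so the reported state reflects the container's actual condition more faithfully.

When an image declares a HEALTHCHECK, a container started from it begins in the starting state. It becomes healthy once a check succeeds, and becomes unhealthy after a given number of consecutive failures.

HEALTHCHECK options include:

  • --interval=<duration>: time between two health checks; defaults to 30 seconds
  • --timeout=<duration>: maximum run time of the check command; a check that exceeds it counts as failed; defaults to 30 seconds
  • --retries=<count>: number of consecutive failures after which the container is considered unhealthy; defaults to 3

Like CMD and ENTRYPOINT, HEALTHCHECK may appear only once; if there are several, only the last one takes effect.

The command after HEALTHCHECK [options] CMD uses the same syntax as ENTRYPOINT, in either shell form or exec form. Its exit code determines the result of the check: 0 means success, 1 means unhealthy, and 2 is reserved and should not be used.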
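
The state transitions described above can be modelled in a few lines. This is a simplified sketch written for this article, not Docker's code: it ignores the interval, the timeout and any start period, and the class HealthState is a hypothetical name.

```java
/** Hypothetical model of a container's health state as check results arrive. */
class HealthState {

    private final int retries;       // value of --retries
    private int failingStreak = 0;   // consecutive failed checks so far
    private String status = "starting";

    HealthState(int retries) {
        this.retries = retries;
    }

    /** Records the exit code of one health check and returns the resulting state. */
    String record(int exitCode) {
        if (exitCode == 0) {
            failingStreak = 0;
            status = "healthy";
        } else {
            failingStreak++;
            if (failingStreak >= retries) {
                status = "unhealthy";
            }
        }
        return status;
    }

    public static void main(String[] args) {
        HealthState state = new HealthState(3);
        System.out.println(state.record(1)); // starting
        System.out.println(state.record(1)); // starting
        System.out.println(state.record(1)); // unhealthy
        System.out.println(state.record(0)); // healthy
    }
}
```

With --interval=5s and the default of 3 retries, a service that never answers would therefore need roughly three check intervals before it is reported as unhealthy.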

Dockerfile

FROM openjdk:8u191-jre-alpine3.9

# MAINTAINER is deprecated; a LABEL carries the same information
LABEL maintainer="zxm <zxm-2018@qq.com>"
ENV APPLICATION_NAME=spring-boot-demo

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories
RUN apk add --update curl && rm -rf /var/cache/apk/*

RUN mkdir /app
COPY target/*.jar /app/app.jar
WORKDIR /app
EXPOSE 8080

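# Health check: every 5s, with a 3s timeout
# curl -f makes curl exit with a non-zero code when the HTTP request fails,
# i.e. the server is unreachable or responds with an HTTP error status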
HEALTHCHECK --interval=5s --timeout=3s CMD curl -f http://localhost:8081/manage/health || exit 1

ENTRYPOINT ["java", "-jar", "/app/app.jar"]
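
The image above installs curl only for the health check. As an alternative sketch, the same probe could be written in Java, since a JRE is already present in the image. The class HealthProbe below is hypothetical and only approximates curl -f: it returns 1 for HTTP status 400 and above or for any connection error, and 0 otherwise. Unlike curl without -L, HttpURLConnection follows redirects by default.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical health probe that approximates `curl -f <url> || exit 1`. */
class HealthProbe {

    /** Maps an HTTP status to a health check exit code: 0 = success, 1 = unhealthy. */
    static int exitCodeFor(int httpStatus) {
        return httpStatus >= 400 ? 1 : 0;
    }

    /** Sends a GET request and returns the exit code; any I/O error counts as unhealthy. */
    static int probe(String url, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            int status = conn.getResponseCode();
            conn.disconnect();
            return exitCodeFor(status);
        } catch (IOException e) {
            return 1; // malformed URL, connection refused, timeout, ...
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java HealthProbe <url>");
            return;
        }
        System.exit(probe(args[0], 3000));
    }
}
```

Whether this is preferable to curl depends on the image: it avoids an extra package, but each check starts a JVM, which is noticeably heavier than a curl process.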

Build the image:

docker build -f Dockerfile -t demo:1.0 .

Result:

zxm@zxm-pc:~$ docker run -itd --name demo_1 demo:1.0
c51b33672c4aa37622fe35b5f827e13a6fb7423c4a813038fb41c7b95747897a
zxm@zxm-pc:~$ docker ps
CONTAINER ID   IMAGE       COMMAND                   CREATED         STATUS                           PORTS                                                  NAMES
c51b33672c4a   demo:1.0    "java -jar /app/app.…"   2 seconds ago   Up 1 second (health: starting)   8080/tcp                                               demo_1
zxm@zxm-pc:~$ docker ps
CONTAINER ID   IMAGE       COMMAND                   CREATED          STATUS                      PORTS                                                  NAMES
c51b33672c4a   demo:1.0    "java -jar /app/app.…"   59 seconds ago   Up 58 seconds (unhealthy)   8080/tcp                                               demo_1

VI. docker-compose health check

Compose file format 2.1 and later support healthcheck.

https://docs.docker.com/compose/compose-file/compose-file-v2/#healthcheck

Dockerfile

FROM openjdk:8u191-jre-alpine3.9

# MAINTAINER is deprecated; a LABEL carries the same information
LABEL maintainer="zxm <zxm-2018@qq.com>"
ENV APPLICATION_NAME=spring-boot-demo

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories
RUN apk add --update curl && rm -rf /var/cache/apk/*

RUN mkdir /app
COPY target/*.jar /app/app.jar
WORKDIR /app
EXPOSE 8080

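# Health check: every 5s, with a 3s timeout
# curl -f makes curl exit with a non-zero code when the HTTP request fails,
# i.e. the server is unreachable or responds with an HTTP error status
# Disabled here; the health check is defined in docker-compose.yml instead
# HEALTHCHECK --interval=5s --timeout=3s CMD curl -f http://localhost:8081/manage/health || exit 1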

ENTRYPOINT ["java", "-jar", "/app/app.jar"]

docker-compose.yml

version: "3"
services:
  demo:
    image: demo:2.0
    container_name: demo_2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8081/manage/health"]
      interval: 5s
      timeout: 3s

Build the image:

docker build -f Dockerfile -t demo:2.0 .

Result:

zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker-compose -f docker-compose.yml up -d
Creating network "spring-boot-actuator-demo_default" with the default driver
Creating demo_2 ... done

zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker ps
CONTAINER ID   IMAGE      COMMAND                   CREATED         STATUS                            PORTS      NAMES
ddcf0d7d230d   demo:2.0   "java -jar /app/app.…"   5 seconds ago   Up 4 seconds (health: starting)   8080/tcp   demo_2

zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker-compose ps
 Name           Command                   State            Ports  
------------------------------------------------------------------
demo_2   java -jar /app/app.jar   Up (health: starting)   8080/tcp

zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker ps
CONTAINER ID   IMAGE      COMMAND                   CREATED          STATUS                    PORTS      NAMES
ddcf0d7d230d   demo:2.0   "java -jar /app/app.…"   12 seconds ago   Up 11 seconds (healthy)   8080/tcp   demo_2

zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker-compose ps
 Name           Command              State        Ports  
---------------------------------------------------------
demo_2   java -jar /app/app.jar   Up (healthy)   8080/tcp