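Preface
The Spring Boot Actuator module provides production-ready features such as health checks, auditing, metrics collection, and HTTP tracing, which help us monitor and manage Spring Boot applications.
Because it exposes internal application information, Actuator can also be integrated with external monitoring systems (Prometheus, Graphite, DataDog, Influx, Wavefront, New Relic, and others).
I. Maven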
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.7.2</version>
    <relativePath/>
</parent>
<properties>
    <maven.compiler.source>8</maven.compiler.source>
    <maven.compiler.target>8</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-jdbc</artifactId>
    </dependency>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>8.0.32</version>
        <scope>runtime</scope>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
        </plugin>
    </plugins>
</build>
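II. Endpoints provided by Actuator
HTTP method | Endpoint | Description |
---|---|---|
GET | /actuator | Lists the Actuator endpoints that are currently exposed |
GET | /actuator/beans | Lists all beans in the running application context and their dependencies |
GET | /actuator/conditions | Shows the auto-configuration report: which auto-configuration conditions matched and which did not |
GET | /actuator/env | Shows all environment properties, i.e. which properties Spring Boot loaded and their values |
GET | /actuator/health | Shows the application's current health; the values come from HealthIndicator implementations |
GET | /actuator/mappings | Lists all request mappings (including Actuator's own) and the controller methods that handle them |
GET | /actuator/metrics | Lists the names of the available metrics |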
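III. Configuration
1. Default exposure
- management.endpoints.web.exposure.include
Since Spring Boot 2.5 (including the 2.7.2 used here), only /actuator/health is exposed over HTTP by default; earlier 2.x releases also exposed /actuator/info.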
2. Exposure configuration
# Expose all enabled endpoints (shutdown is disabled by default, so it is not included)
management.endpoints.web.exposure.include=*
# Expose only the listed endpoints; here only /actuator/beans and /actuator/mappings are exposed
management.endpoints.web.exposure.include=beans,mappings
# exclude turns off specific endpoints
# exclude is usually combined with include: include everything first, then exclude a few
management.endpoints.web.exposure.exclude=beans
management.endpoints.web.exposure.include=*
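How include and exclude combine can be pictured with a small plain-Java sketch. This is not Spring Boot's own filter; the class name ExposureFilterSketch and its methods are hypothetical and only model the documented rule that exclude takes precedence over include.

```java
import java.util.HashSet;
import java.util.Set;

public class ExposureFilterSketch {

    private final Set<String> include;
    private final Set<String> exclude;

    // Both arguments are comma-separated lists, as in the properties file
    ExposureFilterSketch(String include, String exclude) {
        this.include = parse(include);
        this.exclude = parse(exclude);
    }

    // Split a comma-separated property value into a set of endpoint ids
    private static Set<String> parse(String value) {
        Set<String> ids = new HashSet<>();
        for (String id : value.split(",")) {
            if (!id.trim().isEmpty()) {
                ids.add(id.trim());
            }
        }
        return ids;
    }

    // An endpoint is exposed when it is included and not excluded; exclude wins
    boolean isExposed(String endpointId) {
        if (exclude.contains("*") || exclude.contains(endpointId)) {
            return false;
        }
        return include.contains("*") || include.contains(endpointId);
    }

    public static void main(String[] args) {
        // Mirrors include=* together with exclude=beans
        ExposureFilterSketch filter = new ExposureFilterSketch("*", "beans");
        System.out.println(filter.isExposed("beans"));    // false
        System.out.println(filter.isExposed("mappings")); // true
    }
}
```

The sketch ignores whether an endpoint is enabled at all, which in a real application is a separate setting from whether it is exposed.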
3. Path mapping
# Every /actuator/xxx path becomes /manage/xxx
management.endpoints.web.base-path=/manage
# The health endpoint can also be renamed to healthcheck
management.endpoints.web.path-mapping.health=healthcheck
4. Management port
# Port for the management endpoints; defaults to the same port as server.port
management.server.port=8081
IV. Example
1. application.properties configuration
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://127.0.0.1:3306/test
spring.datasource.username=root
spring.datasource.password=123456
# Management port; defaults to the same port as server.port
management.server.port=8081
# Expose only the listed endpoint and change the base path
management.endpoints.web.exposure.include=health
management.endpoints.web.base-path=/manage
# Required to see detailed health information
management.endpoint.health.show-details=always
# When the application uses MySQL, Redis, and so on, Spring Boot registers health checks for them automatically
# diskspace corresponds to DiskSpaceHealthIndicator, db corresponds to DataSourceHealthIndicator
# If any component is unhealthy, the overall application status is DOWN; the health check of an individual component can be disabled through configuration
management.health.defaults.enabled=true
management.health.diskspace.enabled=true
management.health.db.enabled=true
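The rule that one failing component turns the whole application DOWN amounts to picking the most severe status among all components. Below is a minimal plain-Java sketch of that idea, not Spring Boot's actual aggregation code. The class name HealthAggregationSketch and its enum are hypothetical, and the severity order DOWN, OUT_OF_SERVICE, UP, UNKNOWN is an assumption meant to mirror Spring Boot's default ordering.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

public class HealthAggregationSketch {

    // Hypothetical status enum, declared from most to least severe (assumed order)
    enum Status { DOWN, OUT_OF_SERVICE, UP, UNKNOWN }

    // The overall status is the most severe one reported by any component
    static Status aggregate(Collection<Status> components) {
        if (components.isEmpty()) {
            return Status.UNKNOWN; // assumption: nothing to report means UNKNOWN
        }
        // Enum constants compare by declaration order, so min() picks the most severe
        return Collections.min(components);
    }

    public static void main(String[] args) {
        // db, diskSpace and ping are UP, the custom "my" check is DOWN
        Status overall = aggregate(Arrays.asList(Status.UP, Status.UP, Status.DOWN, Status.UP));
        System.out.println(overall); // prints DOWN
    }
}
```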
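2. Custom health check
import java.util.concurrent.atomic.AtomicInteger;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// Reported as "my" in the health response: the bean name minus the HealthIndicator suffix
@Component
public class MyHealthIndicator implements HealthIndicator {
    // Call counter; atomic because health() may be invoked concurrently
    private static final AtomicInteger num = new AtomicInteger();

    @Override
    public Health health() {
        // Run an application-specific check; this demo simulates one that alternates between failure and success
        int errorCode = check();
        if (errorCode != 0) {
            return Health.down().withDetail("Error Code", errorCode).build();
        }
        return Health.up().build();
    }

    // Simulated check: returns 1 (failure) on odd calls and 0 (success) on even calls
    private int check() {
        return num.incrementAndGet() % 2;
    }
}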
3. Result
A request to http://localhost:8081/manage returns:
{
  "_links": {
    "self": {
      "href": "http://localhost:8081/manage",
      "templated": false
    },
    "health": {
      "href": "http://localhost:8081/manage/health",
      "templated": false
    },
    "health-path": {
      "href": "http://localhost:8081/manage/health/{*path}",
      "templated": true
    }
  }
}
A request to http://localhost:8081/manage/health returns the following.
When the custom health check is UP:
{
  "status": "UP",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "MySQL",
        "validationQuery": "isValid()"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 490577010688,
        "free": 183808139264,
        "threshold": 10485760,
        "exists": true
      }
    },
    "my": {
      "status": "UP"
    },
    "ping": {
      "status": "UP"
    }
  }
}
When the custom health check is DOWN:
{
  "status": "DOWN",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "MySQL",
        "validationQuery": "isValid()"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 490577010688,
        "free": 183936602112,
        "threshold": 10485760,
        "exists": true
      }
    },
    "my": {
      "status": "DOWN",
      "details": {
        "Error Code": 1
      }
    },
    "ping": {
      "status": "UP"
    }
  }
}
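V. Docker health check
Since version 1.12, Docker provides the HEALTHCHECK instruction. It specifies a command that Docker runs to determine whether the service in the container's main process is still working, so the reported container status reflects the real state more accurately.
When a container is started from an image that declares HEALTHCHECK, its initial status is starting. It becomes healthy once the HEALTHCHECK command succeeds, and unhealthy after a certain number of consecutive failures.
HEALTHCHECK supports the following options:
- --interval=<duration>: the interval between two health checks; default 30 seconds
- --timeout=<duration>: the maximum run time of the health check command; a check that exceeds it counts as failed; default 30 seconds
- --retries=<count>: the number of consecutive failures after which the container is considered unhealthy; default 3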
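The transitions between starting, healthy, and unhealthy described above can be modeled as a small state machine. The following plain-Java sketch is only an illustration of that description, not Docker's implementation; the class name ContainerHealthSketch and its method are hypothetical, and intervals and timeouts are left out.

```java
public class ContainerHealthSketch {

    enum State { STARTING, HEALTHY, UNHEALTHY }

    private final int retries;
    private int consecutiveFailures = 0;
    private State state = State.STARTING;

    ContainerHealthSketch(int retries) {
        this.retries = retries;
    }

    // Feed the result of one health check run and return the resulting state
    State record(boolean success) {
        if (success) {
            // Any success resets the failure counter and marks the container healthy
            consecutiveFailures = 0;
            state = State.HEALTHY;
        } else if (++consecutiveFailures >= retries) {
            // Only an unbroken run of failures reaches the retries threshold
            state = State.UNHEALTHY;
        }
        return state;
    }

    public static void main(String[] args) {
        ContainerHealthSketch container = new ContainerHealthSketch(3); // default --retries
        System.out.println(container.record(false)); // STARTING
        System.out.println(container.record(false)); // STARTING
        System.out.println(container.record(false)); // UNHEALTHY
        System.out.println(container.record(true));  // HEALTHY
    }
}
```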
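Like CMD and ENTRYPOINT, HEALTHCHECK may appear only once; if several are written, only the last one takes effect.
The command after HEALTHCHECK [options] CMD uses the same format as ENTRYPOINT, in either shell form or exec form. Its exit code determines the result of that check: 0 means success, 1 means failure, and 2 is reserved.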
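Any program that follows this exit-code contract can serve as the health check command. As a rough sketch of the contract, the hypothetical Java class HealthProbe below exits with 0 when the endpoint answers with an HTTP status below 400 and with 1 otherwise. The default URL is the health endpoint configured in this article; the 3000 ms timeout is an assumption chosen for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class HealthProbe {

    // Returns 0 when the endpoint answers with a status below 400, otherwise 1
    static int probe(String url, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            int code = conn.getResponseCode();
            conn.disconnect();
            return code < 400 ? 0 : 1;
        } catch (IOException e) {
            // Malformed URL, connection refused, timeout and similar failures
            return 1;
        }
    }

    public static void main(String[] args) {
        // Assumed default: the health endpoint configured earlier in this article
        String url = args.length > 0 ? args[0] : "http://localhost:8081/manage/health";
        System.exit(probe(url, 3000));
    }
}
```

Starting a JVM for every check is considerably heavier than calling curl, so a probe like this mainly makes sense for images that do not ship curl.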
Dockerfile
FROM openjdk:8u191-jre-alpine3.9
LABEL maintainer="zxm <zxm-2018@qq.com>"
ENV APPLICATION_NAME=spring-boot-demo
RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories
RUN apk add --update curl && rm -rf /var/cache/apk/*
RUN mkdir /app
COPY target/*.jar /app/app.jar
WORKDIR /app
EXPOSE 8080
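# Health check: run every 5s with a 3s timeout
# curl -f exits with a non-zero status when the HTTP request fails, i.e. the server is unreachable or responds with an error status; Actuator answers 503 when the overall status is DOWN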
HEALTHCHECK --interval=5s --timeout=3s CMD curl -f http://localhost:8081/manage/health || exit 1
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
Build the image:
docker build -f Dockerfile -t demo:1.0 .
Result:
zxm@zxm-pc:~$ docker run -itd --name demo_1 demo:1.0
c51b33672c4aa37622fe35b5f827e13a6fb7423c4a813038fb41c7b95747897a
zxm@zxm-pc:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c51b33672c4a demo:1.0 "java -jar /app/app.…" 2 seconds ago Up 1 second (health: starting) 8080/tcp demo_1
zxm@zxm-pc:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c51b33672c4a demo:1.0 "java -jar /app/app.…" 59 seconds ago Up 58 seconds (unhealthy) 8080/tcp demo_1
VI. docker-compose health check
healthcheck is supported since Compose file format 2.1:
https://docs.docker.com/compose/compose-file/compose-file-v2/#healthcheck
Dockerfile
FROM openjdk:8u191-jre-alpine3.9
LABEL maintainer="zxm <zxm-2018@qq.com>"
ENV APPLICATION_NAME=spring-boot-demo
RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories
RUN apk add --update curl && rm -rf /var/cache/apk/*
RUN mkdir /app
COPY target/*.jar /app/app.jar
WORKDIR /app
EXPOSE 8080
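# Health check: run every 5s with a 3s timeout
# curl -f exits with a non-zero status when the HTTP request fails, i.e. the server is unreachable or responds with an error status
# The HEALTHCHECK instruction is commented out in this image; the check is declared in docker-compose.yml instead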
# HEALTHCHECK --interval=5s --timeout=3s CMD curl -f http://localhost:8081/manage/health || exit 1
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
docker-compose.yml
version: "3"
services:
  demo:
    image: demo:2.0
    container_name: demo_2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8081/manage/health"]
      interval: 5s
      timeout: 3s
Build the image:
docker build -f Dockerfile -t demo:2.0 .
Result:
zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker-compose -f docker-compose.yml up -d
Creating network "spring-boot-actuator-demo_default" with the default driver
Creating demo_2 ... done
zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ddcf0d7d230d demo:2.0 "java -jar /app/app.…" 5 seconds ago Up 4 seconds (health: starting) 8080/tcp demo_2
zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker-compose ps
Name Command State Ports
------------------------------------------------------------------
demo_2 java -jar /app/app.jar Up (health: starting) 8080/tcp
zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ddcf0d7d230d demo:2.0 "java -jar /app/app.…" 12 seconds ago Up 11 seconds (healthy) 8080/tcp demo_2
zxm@zxm-pc:~/IdeaProjects/spring-boot-actuator-demo$ docker-compose ps
Name Command State Ports
---------------------------------------------------------
demo_2 java -jar /app/app.jar Up (healthy) 8080/tcp