A K8s cluster runs many Pods, and those Pods produce a large volume of logs. Searching them with ad-hoc commands is cumbersome, so when centralized log collection is needed, we can build it on ELK, which gathers the logs and exposes a search interface over them.

1. Installing ES (Elasticsearch)

1.1 Creating the StatefulSet controller for ES

ES must be installed with a StatefulSet (the controller for stateful services) rather than a Deployment. The Pods of an ES cluster need to know each other's stable host names to communicate, so the node names must be fixed when the controller is created, and only a StatefulSet (together with a headless Service) gives each Pod a stable network identity of the form <pod-name>.<service-name>.

Below we create a three-node ES cluster made up of elasticsearch-0.elasticsearch, elasticsearch-1.elasticsearch, and elasticsearch-2.elasticsearch, with elasticsearch-0 designated as the initial master node.

We define the manifest (es.yaml) as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: elk
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      volumes:
        - name: host-time
          hostPath:
            path: /etc/localtime
            type: ''
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:8.10.2
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
            limits:
              memory: "2Gi"
              cpu: "1"
          volumeMounts:
            - name: host-time
              readOnly: true
              mountPath: /etc/localtime
            - name: elasticsearch-data
              mountPath: /usr/share/elasticsearch/data  # persistent data path
          env:
            - name: discovery.seed_hosts
              value: "elasticsearch-0.elasticsearch,elasticsearch-1.elasticsearch,elasticsearch-2.elasticsearch"
            - name: cluster.initial_master_nodes
              value: "elasticsearch-0"
            - name: xpack.security.enabled
              value: "false"
            - name: xpack.security.transport.ssl.enabled
              value: "false"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
          ports:
            - containerPort: 9200
              name: http
            - containerPort: 9300
              name: transport
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 50Gi
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: elk
spec:
  ports:
    - port: 9200
      name: http
    - port: 9300
      name: transport
  clusterIP: None   # headless
  selector:
    app: elasticsearch

Note that an ES cluster normally declares its storage through volumeClaimTemplates: each Pod requests its own PVC, so the replica count of the StatefulSet determines how many PVCs (and matching PVs) will be requested. With 3 Pods, 3 PVCs are created.

  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 50Gi

When mounting the volume, we reference the claim template by its name, elasticsearch-data, and mount it at /usr/share/elasticsearch/data, which is where ES stores its data.

            - name: elasticsearch-data
              mountPath: /usr/share/elasticsearch/data  # persistent data path

Create the ES cluster with kubectl apply -f es.yaml.
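
A minimal apply-and-verify sequence (this assumes the elk namespace; create it first if it does not exist yet):

kubectl create namespace elk   # skip if it already exists
kubectl apply -f es.yaml

# All three Pods should reach Running, each with its own bound PVC
kubectl get pods -n elk -l app=elasticsearch
kubectl get pvc -n elk

# Check cluster health from inside a Pod (security is still disabled at this stage)
kubectl exec -it elasticsearch-0 -n elk -- curl -s http://localhost:9200/_cluster/health?pretty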

1.2 First attempt at initializing the ES cluster's passwords

The password-initialization flow is as follows:

First, we exec into the Pod:

kubectl exec -it elasticsearch-0 -n elk -- bash

Then we run the following command to initialize the passwords of the ES cluster's built-in accounts:

bin/elasticsearch-setup-passwords interactive

Running the command fails with:

Unexpected response code [405] from calling GET http://10.88.116.6:9200/_security/_authenticate?pretty
It doesn't look like the X-Pack security feature is enabled on this Elasticsearch node.
Please check if you have enabled X-Pack security in your elasticsearch.yml configuration file.


ERROR: X-Pack Security is disabled by configuration., with exit code 78

It tells us that X-Pack security is not enabled; X-Pack security must be configured before passwords can be set.

1.3 Adding X-Pack settings and configuring certificates

Installed from the YAML above, the ES cluster is insecure and can be accessed by anyone, because we explicitly disabled security (enabling it requires certificate and key material, without which the ES cluster may fail to start).

The settings involved are:

            - name: xpack.security.enabled
              value: "false"
            - name: xpack.security.transport.ssl.enabled
              value: "false"

If we simply flip the two settings above to true, ES complains that certificate information must be provided:

            - name: xpack.security.enabled
              value: "true"
            - name: xpack.security.transport.ssl.enabled
              value: "true"

At this point we use the tooling bundled with ES to generate a CA certificate and sign a node certificate with it. There is a catch: if xpack.security.enabled is set to true before the certificates are in place, the ES cluster will not start. So we first generate the certificate files on the still-insecure cluster (or on any other ES machine) and download them locally, then mount them into the ES Pods (via a Secret, ConfigMap, or persistent volume) before enabling security; only then will the cluster start normally.

# Generate the CA certificate elastic-stack-ca.p12
./bin/elasticsearch-certutil ca --out elastic-stack-ca.p12 --pass ""
# Use the CA to sign the node certificate elastic-certificates.p12
./bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12 --ca-pass "" --out elastic-certificates.p12 --pass ""
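
One way to pull the generated files down to a workstation is kubectl cp (this assumes certutil wrote them under the ES home directory /usr/share/elasticsearch inside elasticsearch-0; adjust the paths if they landed elsewhere):

kubectl cp elk/elasticsearch-0:/usr/share/elasticsearch/elastic-certificates.p12 ./elastic-certificates.p12
kubectl cp elk/elasticsearch-0:/usr/share/elasticsearch/elastic-stack-ca.p12 ./elastic-stack-ca.p12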

After generating the CA certificate and the signed node certificate, they need to be made available to the ES cluster. We create a Secret holding the certificate files with the following command:

kubectl create secret generic elastic-certificates   --from-file=elastic-certificates.p12=elastic-certificates.p12   --from-file=elastic-stack-ca.p12=elastic-stack-ca.p12   -n elk
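
To confirm that both certificate files made it into the Secret:

kubectl describe secret elastic-certificates -n elk
# the Data section should list elastic-certificates.p12 and elastic-stack-ca.p12 with their sizes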

Then add the following settings to the StatefulSet:

            - name: xpack.security.enabled
              value: "true"
            - name: xpack.security.transport.ssl.enabled
              value: "true"
            - name: xpack.security.transport.ssl.verification_mode
              value: "certificate"
            - name: xpack.security.transport.ssl.keystore.path
              value: "/usr/share/elasticsearch/config/certs/elastic-certificates.p12"
            - name: xpack.security.transport.ssl.truststore.path
              value: "/usr/share/elasticsearch/config/certs/elastic-certificates.p12"

Note that both xpack.security.transport.ssl.keystore.path and xpack.security.transport.ssl.truststore.path must be set to the path where elastic-certificates.p12 is mounted.

The updated StatefulSet manifest looks like this:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: elk
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      volumes:
        - name: host-time
          hostPath:
            path: /etc/localtime
            type: ''
        - name: elastic-certificates
          secret:
            secretName: elastic-certificates
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:8.10.2  # keep the version consistent with the rest of the stack
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
            limits:
              memory: "2Gi"
              cpu: "1"
          volumeMounts:
            - name: host-time
              readOnly: true
              mountPath: /etc/localtime
            - name: elastic-certificates
              readOnly: true
              mountPath: /usr/share/elasticsearch/config/certs
            - name: elasticsearch-data-pvc
              mountPath: /usr/share/elasticsearch/data  # persistent data path
          env:
            - name: discovery.seed_hosts
              value: "elasticsearch-0.elasticsearch,elasticsearch-1.elasticsearch,elasticsearch-2.elasticsearch"
            - name: cluster.initial_master_nodes
              value: "elasticsearch-0"
            - name: xpack.security.enabled
              value: "true"
            - name: xpack.security.transport.ssl.enabled
              value: "true"
            - name: xpack.security.transport.ssl.verification_mode
              value: "certificate"
            - name: xpack.security.transport.ssl.keystore.path
              value: "/usr/share/elasticsearch/config/certs/elastic-certificates.p12"
            - name: xpack.security.transport.ssl.truststore.path
              value: "/usr/share/elasticsearch/config/certs/elastic-certificates.p12"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
          ports:
            - containerPort: 9200
              name: http
            - containerPort: 9300
              name: transport
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data-pvc
    spec:
      accessModes: ["ReadWriteMany"]
      resources:
        requests:
          storage: 100Gi
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: elk
spec:
  ports:
    - port: 9200
      targetPort: 9200
      name: http
    - port: 9300
      name: transport
  clusterIP: None   # headless, required for the per-Pod DNS names used in discovery.seed_hosts
  selector:
    app: elasticsearch

The Secret we just created is mounted as a volume, referenced through secretName: elastic-certificates:

      volumes:
        - name: elastic-certificates
          secret:
            secretName: elastic-certificates

In the main container's volumeMounts, we declare that the elastic-certificates volume is mounted at /usr/share/elasticsearch/config/certs, so that directory will contain the elastic-certificates.p12 and elastic-stack-ca.p12 files.

          volumeMounts:
            - name: elastic-certificates
              readOnly: true
              mountPath: /usr/share/elasticsearch/config/certs

Now the certificate path can be set to /usr/share/elasticsearch/config/certs/elastic-certificates.p12:

            - name: xpack.security.transport.ssl.keystore.path
              value: "/usr/share/elasticsearch/config/certs/elastic-certificates.p12"
            - name: xpack.security.transport.ssl.truststore.path
              value: "/usr/share/elasticsearch/config/certs/elastic-certificates.p12"

1.4 Re-initializing the passwords after X-Pack is configured

Re-run the bin/elasticsearch-setup-passwords interactive command to initialize the passwords of all of the ES cluster's built-in accounts.

The full session looks like this:

sh-5.0$ bin/elasticsearch-setup-passwords interactive
******************************************************************************
Note: The 'elasticsearch-setup-passwords' tool has been deprecated. This command will be removed in a future release.
******************************************************************************

Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
You will be prompted to enter passwords as the process progresses.
Please confirm that you would like to continue [y/N]y


Enter password for [elastic]: 
Reenter password for [elastic]: 
Enter password for [apm_system]: 
Reenter password for [apm_system]: 
Enter password for [kibana_system]: 
Reenter password for [kibana_system]: 
Enter password for [logstash_system]: 
Reenter password for [logstash_system]: 
Enter password for [beats_system]: 
Reenter password for [beats_system]: 
Enter password for [remote_monitoring_user]: 
Reenter password for [remote_monitoring_user]: 
Changed password for user [apm_system]
Changed password for user [kibana_system]
Changed password for user [kibana]
Changed password for user [logstash_system]
Changed password for user [beats_system]
Changed password for user [remote_monitoring_user]
Changed password for user [elastic]

This initializes the passwords of the elastic, apm_system, kibana, kibana_system, logstash_system, beats_system, and remote_monitoring_user accounts.

The elastic user's password can later be changed with:

curl -u elastic:{current-password} -X POST "http://127.0.0.1:9200/_security/user/elastic/_password" -H "Content-Type: application/json" -d'
{
  "password": "{new-password}"
}'

1.5 Creating new users in the ES cluster

We take creating a Kibana user as an example; other users can be created with the same flow.

The request below asks the ES cluster to create an account dedicated to Kibana and grants it the corresponding roles (it must have kibana_system or kibana_user to work with Kibana properly):

curl -X POST "http://{nodePort}/_security/user/{new-user-name}" -u elastic:{elastic-password} -H "Content-Type: application/json" -d '
{
  "password" : "{new-password}",
  "roles" : [ "kibana_user", "kibana_system", "superuser" ]
}'

# example
curl -X POST "http://127.0.0.1:9200/_security/user/{new-username}" -u elastic:{password} -H "Content-Type: application/json" -d '
{
  "password" : "{new-password}",
  "roles" : [ "superuser", "kibana_user", "kibana_system" ]
}'

The following command shows which roles each user currently holds. It must be run as a superuser; superuser-username is that user's name and superuser-password its password.

curl -u {superuser-username}:{superuser-password} -X GET "http://{nodePort}/_security/user?pretty"

# example
curl -u {superuser-username}:{superuser-password} -X GET "http://127.0.0.1:9200/_security/user?pretty"

The ES cluster's users can be listed with the following command (replace xxxxxx with the elastic password):

curl -u elastic:xxxxxx -X GET "http://127.0.0.1:9200/_security/user?pretty"

2. Installing Kibana

2.1 Creating the Secret with Kibana's credentials for the ES cluster

Installing Kibana first requires a Secret that stores the username and password Kibana uses to access the ES cluster.

apiVersion: v1
kind: Secret
metadata:
  name: elasticsearch-secrets
  namespace: elk
type: Opaque
stringData:
  elastic-username: kibana_system
  elastic-password: xxxxxx
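
Save the Secret to a file and apply it (the file name kibana-secret.yaml is just an example; xxxxxx must be replaced with the kibana_system password set in section 1.4):

kubectl apply -f kibana-secret.yaml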

2.2 Deploying Kibana

Kibana can be installed from the following manifest, kibana.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: elk
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      volumes:
        - name: host-time
          hostPath:
            path: /etc/localtime
            type: ''
        - name: elastic-certificates
          secret:
            secretName: elastic-certificates
      containers:
        - name: kibana
          volumeMounts:
            - name: host-time
              readOnly: true
              mountPath: /etc/localtime
            - name: elastic-certificates
              mountPath: /usr/share/elasticsearch/config/certs
              readOnly: true
          image: docker.elastic.co/kibana/kibana:8.10.2  # adjust the Kibana version to match your ES version
          env:
            - name: ELASTICSEARCH_HOSTS
              value: "http://elasticsearch.elk.svc.cluster.local:9200"  # 配置 Elasticsearch 服务地址
            - name: ELASTICSEARCH_USERNAME
              valueFrom:
                secretKeyRef:
                  name: elasticsearch-secrets
                  key: elastic-username
            - name: ELASTICSEARCH_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: elasticsearch-secrets
                  key: elastic-password
            - name: XPACK_ENCRYPTEDSAVEDOBJECTS_ENCRYPTIONKEY  # env form of xpack.encryptedSavedObjects.encryptionKey
              value: "min-32-byte-long-strong-encryption-key"
          ports:
            - containerPort: 5601  # Kibana's default port
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: elk
spec:
  type: NodePort
  selector:
    app: kibana
  ports:
    - port: 5601
      targetPort: 5601
      nodePort: 31234

The environment variable ELASTICSEARCH_HOSTS configures the address of the ES service, using the cluster-internal DNS name. Here it is elasticsearch.elk.svc.cluster.local, where elasticsearch is the Service name and elk the Namespace; substitute your own Service and Namespace.

            - name: ELASTICSEARCH_HOSTS
              value: "http://elasticsearch.elk.svc.cluster.local:9200"  # 配置 Elasticsearch 服务地址

Install Kibana with kubectl apply -f kibana.yaml.
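
A quick verification after applying (the NodePort 31234 comes from the Service definition above):

kubectl rollout status deployment/kibana -n elk
# Kibana is then reachable on any node at http://<node-ip>:31234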

3. Installing Logstash and Filebeat

Background (why we install both Logstash and Filebeat): in a traditional setup, Logstash is installed in each data center whose logs need collecting; it gathers that data center's logs and ships them to the ES cluster used for centralized logging. The resources a single Logstash consumes are a drop in the bucket compared with a whole data center.

With cloud native infrastructure, however, Logstash would otherwise have to run on every node of the K8s cluster, or even in every Pod, and its footprint is no longer negligible. Instead, a small Logstash deployment can run inside the cluster (not one per node), while a lightweight log shipper, Filebeat, is installed on each node; Filebeat forwards the node's logs to the Logstash deployment, which cleans and transforms them and sends them on to ES.

If there are too many logs and Logstash cannot keep up with what Filebeat sends, a buffer such as Kafka can be inserted between Filebeat and Logstash, letting Logstash consume logs at its own pace and push them to the ES cluster; a sketch follows below.
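
As a sketch of that buffering layer (the Kafka address my-kafka.elk.svc.cluster.local and topic k8s-logs are assumptions, not something provisioned in this article):

# filebeat.yml -- ship to Kafka instead of Logstash
output.kafka:
  hosts: ["my-kafka.elk.svc.cluster.local:9092"]   # assumed Kafka Service
  topic: "k8s-logs"                                # assumed topic name

# logstash.conf -- consume from Kafka at Logstash's own pace
input {
  kafka {
    bootstrap_servers => "my-kafka.elk.svc.cluster.local:9092"
    topics => ["k8s-logs"]
    group_id => "logstash"
  }
}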

3.1 Creating the user for Logstash

Create the logstash_writer user:

curl -X POST "http://127.0.0.1:9200/_security/user/logstash_writer" -u elastic:xxxxxx -H "Content-Type: application/json" -d '
{
  "password" : "xxxxxx",
  "roles" : [ "superuser", "kibana_user", "kibana_system" ],
  "full_name" : "logstash_writer"
}'
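
The roles granted above are far broader than Logstash actually needs. A more scoped alternative (a sketch using the standard role API; the role name logstash_writer_role is illustrative) creates a dedicated role first and assigns only that:

curl -u elastic:xxxxxx -X POST "http://127.0.0.1:9200/_security/role/logstash_writer_role" -H "Content-Type: application/json" -d '
{
  "cluster": ["manage_index_templates", "monitor"],
  "indices": [
    {
      "names": ["logstash-*"],
      "privileges": ["write", "create", "create_index"]
    }
  ]
}'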

3.2 Installing Logstash

The manifest for installing Logstash does three things:

  • 1. Define a ConfigMap (logstash-config) holding the Logstash configuration: the rules for processing logs, and the ES cluster the logs should be pushed to.
  • 2. Define a Deployment (logstash) that runs the Logstash service, mounts the pipeline ConfigMap as a volume, and listens on port 5044.
  • 3. Create a Service (logstash) that exposes the Pods; through Service-based discovery, Filebeat can find Logstash and push logs to it.

apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
  namespace: elk
data:
  logstash.conf: |
    input {
      beats {
        port => 5044
      }
    }
    filter {
      # filters can be added here
    }
    output {
      elasticsearch {
        hosts => ["http://elasticsearch.elk.svc.cluster.local:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
        user => "logstash_system"
        password => "xxxxxx"
      }
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: logstash
  namespace: elk
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      containers:
      - name: logstash
        image: docker.elastic.co/logstash/logstash:8.10.2
        ports:
        - containerPort: 5044
        volumeMounts:
        - name: config-volume
          mountPath: /usr/share/logstash/pipeline/logstash.conf
          subPath: logstash.conf
      volumes:
      - name: config-volume
        configMap:
          name: logstash-config
---
apiVersion: v1
kind: Service
metadata:
  name: logstash
  namespace: elk
spec:
  ports:
  - port: 5044
    targetPort: 5044
  selector:
    app: logstash
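
Apply and verify (the file name logstash.yaml is an assumption):

kubectl apply -f logstash.yaml
kubectl rollout status deployment/logstash -n elk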

3.3 Installing Filebeat on each Node to collect logs

Every Pod's logs in a K8s cluster land under /var/log/containers/ on the host, so installing Filebeat on each Node is enough to collect the logs of all Pods in the cluster.

  • 1. Create a ConfigMap with Filebeat's configuration: the log paths to collect from and the address of the target Logstash service, given here as the Service's internal DNS name.
  • 2. Because every Node needs a Filebeat for log collection, we deploy it with a DaemonSet controller, mount the Filebeat ConfigMap (defining the log input and output), and mount each Node's /var/log into the Pod's /var/log so Filebeat can read the host's Pod logs.

The manifest for installing Filebeat is as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: elk
data:
  filebeat.yml: |
    filebeat.inputs:
      - type: container
        paths:
          - /var/log/containers/*.log
    output.logstash:
      hosts: ["logstash.elk.svc.cluster.local:5044"]
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: elk
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:8.10.2
        volumeMounts:
        - name: config-volume
          mountPath: /usr/share/filebeat/filebeat.yml
          subPath: filebeat.yml
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: config-volume
        configMap:
          name: filebeat-config
      - name: varlog
        hostPath:
          path: /var/log
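
Apply and verify; the DaemonSet should schedule exactly one Filebeat Pod per node (the file name filebeat.yaml is an assumption):

kubectl apply -f filebeat.yaml
kubectl get pods -n elk -l app=filebeat -o wide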

4. Viewing logs in Kibana

Open the Kibana home page and log in with your username and password.
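
Before searching in the UI, you can confirm that logs are arriving by listing the daily indices Logstash writes (replace xxxxxx with the elastic password):

curl -u elastic:xxxxxx -X GET "http://127.0.0.1:9200/_cat/indices/logstash-*?v"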

(Screenshot: the Kibana login page)