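A common situation: you have several applications, one of which has its peak load during the day and is barely used (or not used at all) the rest of the time, while other applications could use the cluster's capacity at night. Web services and some data-processing jobs are typical examples of such applications.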

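As usual, there are never enough cluster resources. You have to find a way to optimize how they are used, and Kubernetes is well suited to this. It has the Horizontal Pod Autoscaler, which lets you scale applications based on metrics.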

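Metrics are usually provided by metrics-server. Below I describe how to replace metrics-server with Prometheus (Prometheus covers the data that metrics-server provides, so we get rid of one extra component) and how to scale applications in Kubernetes based on Prometheus metrics.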

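First, install the Prometheus operator. I personally use the ready-made manifests. You can also use the Helm chart (I have not checked whether it works). If metrics-server is installed, remove it as well. After that, check that everything works.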

 # kubectl get --raw "/apis/metrics.k8s.io/v1beta1/" | jq
 {
   "kind": "APIResourceList",
   "apiVersion": "v1",
   "groupVersion": "metrics.k8s.io/v1beta1",
   "resources": [
     {
       "name": "nodes",
       "singularName": "",
       "namespaced": false,
       "kind": "NodeMetrics",
       "verbs": [
         "get",
         "list"
       ]
     },
     {
       "name": "pods",
       "singularName": "",
       "namespaced": true,
       "kind": "PodMetrics",
       "verbs": [
         "get",
         "list"
       ]
     }
   ]
 }

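Then apply the manifests from this directory. This installs Prometheus-adapter. I found a chart that contains these manifests, but I have not checked it. After that, the following command should run successfully: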

 kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq

(The output is a very long list, so I do not include it here.)
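Because that list is long, it can help to filter it for the metrics you care about. The following is a minimal Python sketch, assuming the command's JSON output has already been parsed into a dict shaped like an APIResourceList. The helper `metric_names` and the sample entries are hypothetical and only illustrate the shape of the data; they are not taken from a real cluster.

```python
import json


def metric_names(resource_list, needle):
    """Return the sorted resource names that contain `needle`."""
    return sorted(
        resource["name"]
        for resource in resource_list.get("resources", [])
        if needle in resource["name"]
    )


# Hypothetical sample, shaped like the response of the custom metrics API.
sample = json.loads("""
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {"name": "pods/example_requests", "namespaced": true},
    {"name": "services/example_queue_size", "namespaced": true},
    {"name": "pods/example_queue_size", "namespaced": true}
  ]
}
""")

print(metric_names(sample, "queue_size"))
```

In practice the dict would come from something like `json.load(sys.stdin)` with the `kubectl get --raw` output piped in.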

To understand what the metrics.k8s.io and custom.metrics.k8s.io URLs mean, the documentation can help.

If something does not work, check the logs as usual. You can also search for an existing solution to the problem.

Now let's set up autoscaling.

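I have an application that consumes a lot of CPU and processes a queue. As soon as the queue size exceeds a certain threshold, I want to increase the number of pods in the replica set so the queue is processed faster. As soon as the size drops below the threshold, the cluster resources should be released.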

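To understand how to write rules for Prometheus-adapter, you need to read this document and its related pages carefully. Here is what it looks like in my case.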

The query to Prometheus:

 wqueue_tube_total_size{tube="dmload-legacy"}

It returns:

 wqueue_tube_total_size{endpoint="pprof-port",instance="10.116.2.237:8542",job="wqueue-pprof",namespace="default",pod="wqueue-b9fdd9455-66mwm",service="wqueue-pprof",tube="dmload-legacy"} 32

I write the following rule for Prometheus-adapter:

 - seriesQuery: wqueue_tube_total_size{tube="dmload-legacy"}
   resources:
     overrides:
       namespace:
         resource: namespace
       tube:
         resource: service
   name: {as: "wqueue_tube_total_size_dmload_legacy"}
   metricsQuery: wqueue_tube_total_size{tube="dmload-legacy"}

Note that I have to map the tube label to the service resource so that I can then refer to it in the HPA description.
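To see why this mapping matters, it helps to look at how the label values turn into an object in the custom metrics API. The sketch below is a simplified illustration in Python, not the adapter's actual implementation: the function `custom_metric_path` is hypothetical, it assumes exactly one label maps to `namespace` and one to another resource, and it forms the plural by appending "s".

```python
def custom_metric_path(labels, overrides, metric_name,
                       api="/apis/custom.metrics.k8s.io/v1beta1"):
    """Build the custom metrics API path for a namespaced object.

    `labels` are the labels of the Prometheus series; `overrides` maps a
    label name to a Kubernetes resource, as in `resources.overrides`.
    """
    namespace = resource = object_name = None
    for label, kube_resource in overrides.items():
        if kube_resource == "namespace":
            namespace = labels[label]
        else:
            # Naive pluralisation: "service" -> "services" (assumption).
            resource = kube_resource + "s"
            object_name = labels[label]
    return f"{api}/namespaces/{namespace}/{resource}/{object_name}/{metric_name}"


# Label values taken from the query result shown earlier.
labels = {"namespace": "default", "tube": "dmload-legacy"}
overrides = {"namespace": "namespace", "tube": "service"}

print(custom_metric_path(labels, overrides,
                         "wqueue_tube_total_size_dmload_legacy"))
```

The printed path can be passed to `kubectl get --raw` to check that the metric is served. The value of the tube label becomes the name of a Service object, which is why the HPA refers to a Service named dmload-legacy.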

The HPA configuration:

 ---
 kind: HorizontalPodAutoscaler
 apiVersion: autoscaling/v2beta1
 metadata:
   name: dmload-v3-legacy
   namespace: default
 spec:
   scaleTargetRef:
     apiVersion: apps/v1
     kind: Deployment
     name: dmload-v3-legacy
   minReplicas: 2
   maxReplicas: 20
   metrics:
   - type: Object
     object:
       metricName: wqueue_tube_total_size_dmload_legacy
       target:
         apiVersion: v1
         kind: Service
         name: dmload-legacy
       targetValue: 30

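Here I specify that as soon as the number of jobs in the queue, exposed as wqueue_tube_total_size_dmload_legacy, exceeds 30, pods are added up to a maximum of 20, and when the metric falls below the targetValue of 30, the number of pods is reduced down to a minimum of 2.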
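The replica count behind this behavior can be estimated with the formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The Python sketch below applies it with the minReplicas and maxReplicas limits from the configuration above. It is an approximation under stated assumptions: the function `desired_replicas` and the input numbers are made up for illustration, and the real controller also applies a tolerance and delays scale-down, which are omitted here.

```python
import math


def desired_replicas(current_replicas, metric_value, target_value,
                     min_replicas, max_replicas):
    """Apply ceil(current * metric / target), clamped to [min, max]."""
    raw = math.ceil(current_replicas * metric_value / target_value)
    return max(min_replicas, min(max_replicas, raw))


# Illustrative inputs: targetValue 30, minReplicas 2, maxReplicas 20.
print(desired_replicas(4, 60, 30, 2, 20))   # queue at twice the target
print(desired_replicas(15, 90, 30, 2, 20))  # result capped at maxReplicas
print(desired_replicas(4, 3, 30, 2, 20))    # result raised to minReplicas
```

With these inputs the three calls give 8, 20 and 2 replicas respectively.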

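We apply it and see what happens. My system has been running for several days, and at the moment it is just reducing the number of pods: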

 # kubectl describe hpa dmload-v3-legacy
 Name:                dmload-v3-legacy
 Namespace:           default
 Labels:              <none>
 Annotations:         kubectl.kubernetes.io/last-applied-configuration:
                        {"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"dmload-v3-legacy","namespace":"d...
 CreationTimestamp:   Thu, 24 Jan 2019 16:16:43 +0300
 Reference:           Deployment/dmload-v3-legacy
 Metrics:             ( current / target )
   "wqueue_tube_total_size_dmload_legacy" on Service/dmload-legacy:  14 / 30
 Min replicas:        2
 Max replicas:        20
 Deployment pods:     15 current / 14 desired
 Conditions:
   Type            Status  Reason              Message
   ----            ------  ------              -------
   AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 14
   ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from Service metric wqueue_tube_total_size_dmload_legacy
   ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
 Events:
   Type    Reason             Age                    From                       Message
   ----    ------             ----                   ----                       -------
   Normal  SuccessfulRescale  59m (x14 over 40h)     horizontal-pod-autoscaler  New size: 13; reason: All metrics below target
   Normal  SuccessfulRescale  59m (x13 over 40h)     horizontal-pod-autoscaler  New size: 12; reason: All metrics below target
   Normal  SuccessfulRescale  57m (x14 over 40h)     horizontal-pod-autoscaler  New size: 11; reason: All metrics below target
   Normal  SuccessfulRescale  56m (x14 over 40h)     horizontal-pod-autoscaler  New size: 10; reason: All metrics below target
   Normal  SuccessfulRescale  56m (x11 over 38h)     horizontal-pod-autoscaler  New size: 8; reason: All metrics below target
   Normal  SuccessfulRescale  55m (x6 over 36h)      horizontal-pod-autoscaler  New size: 7; reason: All metrics below target
   Normal  SuccessfulRescale  47m (x103 over 40h)    horizontal-pod-autoscaler  (combined from similar events): New size: 20; reason: Service metric wqueue_tube_total_size_dmload_legacy above target
   Normal  SuccessfulRescale  3m38s (x19 over 41h)   horizontal-pod-autoscaler  New size: 17; reason: All metrics below target
   Normal  SuccessfulRescale  2m8s (x23 over 41h)    horizontal-pod-autoscaler  New size: 16; reason: All metrics below target
   Normal  SuccessfulRescale  98s (x20 over 40h)     horizontal-pod-autoscaler  New size: 15; reason: All metrics below target
   Normal  SuccessfulRescale  7s (x18 over 40h)      horizontal-pod-autoscaler  New size: 14; reason: All metrics below target

Everything described here was done on Kubernetes 1.13.2.

Conclusion

In this short article I showed how to automatically scale applications in a Kubernetes cluster using metrics from Prometheus.

We configured the Prometheus-operator components and created the necessary manifests.

As a result, the number of pods processing the queue increases or decreases based on the queue-size metric from Prometheus.

(The graph shows how the number of pods changes depending on the queue size.)

Thank you for your attention!
