# Kubernetes in Production: The Complete Guide to Microservice Deployment and Elastic Scaling

Hi everyone, I'm Di Ge. We've covered plenty of architecture design topics before; today it's time for something hardcore: how to deploy microservices on Kubernetes in a production environment, and how to make elastic scaling actually work. These are lessons my team distilled after stepping into more pits than I can count.

## Why Kubernetes?

In the cloud-native era, Kubernetes has become the de facto standard for container orchestration. Yet many teams find that operational complexity goes up sharply after moving to K8s. Usually that's because a few core questions were never thought through:

- **How do services come up and go down gracefully?** Traffic must not hit a Pod while it is still starting, and a terminating Pod should finish its in-flight requests before exiting.
- **How should resources be sized?** Too small and you get OOM kills; too large and you burn money.
- **How do you make autoscaling actually take effect?** HPA is not "write one config and you're done."

Below I'll walk through solutions to each of these problems.

## Core Configuration: Running Services Gracefully

### 1. Graceful startup and shutdown

Many teams see brief 503 errors whenever a service restarts; that is the classic symptom of missing graceful startup/shutdown configuration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      terminationGracePeriodSeconds: 60  # wait up to 60s for in-flight requests to finish
      containers:
        - name: order-service
          image: registry.example.com/order-service:v1.2.3
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]  # give kube-proxy time to remove this Pod from endpoints
```

### 2. Resource quotas with LimitRange

```yaml
# Namespace-level resource constraints
apiVersion: v1
kind: LimitRange
metadata:
  name: production-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 200m
        memory: 256Mi
      max:
        cpu: "4"
        memory: 4Gi
      min:
        cpu: 100m
        memory: 128Mi
```
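To make the defaulting behavior concrete, here is a rough Python sketch of what the LimitRange above does to a container spec at admission time. This is not the real admission controller, just an illustration; the `admit` function and the millicore/MiB units are mine:

```python
# Simplified model of LimitRange admission for one container.
# CPU is in millicores, memory in MiB, so the arithmetic stays simple.

LIMIT_RANGE = {
    "default":        {"cpu": 500,  "memory": 512},   # applied when limits are missing
    "defaultRequest": {"cpu": 200,  "memory": 256},   # applied when requests are missing
    "max":            {"cpu": 4000, "memory": 4096},
    "min":            {"cpu": 100,  "memory": 128},
}

def admit(container):
    """Fill in namespace defaults, then reject limits outside [min, max]."""
    requests = {**LIMIT_RANGE["defaultRequest"], **container.get("requests", {})}
    limits = {**LIMIT_RANGE["default"], **container.get("limits", {})}
    for resource in ("cpu", "memory"):
        if not LIMIT_RANGE["min"][resource] <= limits[resource] <= LIMIT_RANGE["max"][resource]:
            raise ValueError(f"{resource} limit {limits[resource]} is outside the allowed range")
    return {"requests": requests, "limits": limits}

# A container that declares nothing gets the namespace defaults:
print(admit({}))
# {'requests': {'cpu': 200, 'memory': 256}, 'limits': {'cpu': 500, 'memory': 512}}
```

The takeaway: with a LimitRange in place, a Deployment that forgets to set resources no longer lands unbounded on a node, and anything asking for more than `max` is rejected outright.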
### 3. Anti-affinity for high availability

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: order-service
                topologyKey: kubernetes.io/hostname
```

## Elastic Scaling: HPA, VPA, and CronHPA

### HorizontalPodAutoscaler (HPA)

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second  # requires a custom metrics adapter
        target:
          type: AverageValue
          averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # 5-minute cooldown to prevent flapping
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
```

### VerticalPodAutoscaler (VPA)

HPA manages the number of Pods; VPA manages their size. For stateful services, or services that are hard to scale horizontally, VPA can adjust CPU and memory automatically:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: order-service-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
      - containerName: order-service
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2
          memory: 2Gi
```

### CronHPA: scheduled scaling

Scale up ahead of a big promotion, then scale back down automatically once the event is over:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cronhpa-config
  namespace: production
data:
  config.yaml: |
    - name: order-service
      crons:
        - schedule: "0 2 * * 6"   # scale up every Saturday at 2 AM
          minReplicas: 10
          maxReplicas: 30
        - schedule: "0 22 * * 7"  # scale back down Sunday at 10 PM
          minReplicas: 3
          maxReplicas: 20
```
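Before moving on, it helps to see the arithmetic the HPA controller actually runs for a config like the one above. This is a simplified Python sketch of the proportional formula from the Kubernetes docs, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, ignoring the tolerance band and stabilization windows; the sample numbers are made up:

```python
import math

def desired_replicas(current, metrics, min_replicas=3, max_replicas=20):
    """Simplified HPA math: compute one proposal per metric, take the
    largest, then clamp to the configured [minReplicas, maxReplicas] range.
    `metrics` is a list of (current_value, target_value) pairs."""
    proposals = [math.ceil(current * value / target) for value, target in metrics]
    return max(min_replicas, min(max_replicas, max(proposals)))

# 3 replicas running at 90% CPU (target 70%), 60% memory (target 80%),
# and 1500 req/s per pod (target 1000). The req/s metric proposes the
# most replicas, so it wins:
print(desired_replicas(3, [(90, 70), (60, 80), (1500, 1000)]))  # 5
```

Two things the real controller adds on top of this: a ~10% tolerance band so tiny deviations don't trigger scaling, and the `behavior` policies, which cap how many replicas can be added or removed per period. That is why `behavior` matters: without it, one traffic spike can double your fleet in seconds, and a brief lull can shrink it just as fast.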
## Canary Releases: Rolling Out New Versions Smoothly

### Weight-based canary

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - route:
        - destination:
            host: order-service
            subset: v1
          weight: 90
        - destination:
            host: order-service
            subset: v2
          weight: 10
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
```

### Content-based canary

Requests whose `x-user-id` header matches the pattern go to the new version; everyone else stays on v1:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - match:
        - headers:
            x-user-id:
              regex: ".*test.*"
      route:
        - destination:
            host: order-service
            subset: v2
    - route:
        - destination:
            host: order-service
            subset: v1
          weight: 100
```

## Monitoring and Alerting: Nip Problems in the Bud

Core Prometheus + Grafana alerting rules:

```yaml
# Key alerting rules
# 1. Pods restarting too often
- alert: PodRestartingTooMuch
  expr: rate(kube_pod_container_status_restarts_total[5m]) > 0.1
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Pod is restarting too frequently

# 2. CPU usage above 80% of the limit
- alert: HighResourceUsage
  expr: |
    sum(rate(container_cpu_usage_seconds_total{namespace="production"}[5m])) by (pod)
      / sum(kube_pod_container_resource_limits{namespace="production", resource="cpu"}) by (pod) > 0.8
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: Resource usage above 80%

# 3. HPA unable to scale
- alert: HPACannotScale
  expr: kube_hpa_status_condition{condition="AbleToScale", status="false"} == 1
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: HPA cannot scale the workload
```

## Takeaways

- **Graceful startup/shutdown is the foundation.** The trio of `terminationGracePeriodSeconds`, `preStop`, and `readinessProbe` must all be configured.
- **Set resource limits sensibly.** Use VPA to observe actual usage first, then pick reasonable thresholds.
- **Never run HPA bare.** Always set `behavior` to cap the scaling speed and prevent an avalanche.
- **Canary releases keep you safe.** Send 5% of traffic to the new version, watch it, and only then ramp up.
- **Monitoring must cover the full chain.** K8s layer, application layer, business layer; none can be missing.

My Docker at home keeps trying to run outdoors lately. Looks like he needs elastic scaling too: the containers in the house just aren't enough. I'm Di Ge, see you next time!

Recommended reading: "From Monolith to Microservices: Architecture Splitting in Practice", "Redis High Availability in Practice"