Introduction
Affinity and anti-affinity rules let you spread the Pods of one workload across nodes, improving service availability. They can also carve the cluster up at the architecture level, dedicating groups of nodes to different parts of the business.
[root@k8s-master ~]# kubectl explain deploy.spec.template.spec.affinity
KIND: Deployment
VERSION: apps/v1
RESOURCE: affinity <Object>
DESCRIPTION:
If specified, the pod's scheduling constraints
Affinity is a group of affinity scheduling rules.
FIELDS:
nodeAffinity <Object>
Describes node affinity scheduling rules for the pod.
podAffinity <Object>
Describes pod affinity scheduling rules (e.g. co-locate this pod in the
same node, zone, etc. as some other pod(s)).
podAntiAffinity <Object>
Describes pod anti-affinity scheduling rules (e.g. avoid putting this pod
in the same node, zone, etc. as some other pod(s)).
nodeAffinity
[root@k8s-master ~]# kubectl explain deploy.spec.template.spec.affinity.nodeAffinity
KIND: Deployment
VERSION: apps/v1
RESOURCE: nodeAffinity <Object>
DESCRIPTION:
Describes node affinity scheduling rules for the pod.
Node affinity is a group of node affinity scheduling rules.
FIELDS:
preferredDuringSchedulingIgnoredDuringExecution <[]Object>
The scheduler will prefer to schedule pods to nodes that satisfy the
affinity expressions specified by this field, but it may choose a node that
violates one or more of the expressions. The node that is most preferred is
the one with the greatest sum of weights, i.e. for each node that meets all
of the scheduling requirements (resource request, requiredDuringScheduling
affinity expressions, etc.), compute a sum by iterating through the
elements of this field and adding "weight" to the sum if the node matches
the corresponding matchExpressions; the node(s) with the highest sum are
the most preferred.
requiredDuringSchedulingIgnoredDuringExecution <Object>
If the affinity requirements specified by this field are not met at
scheduling time, the pod will not be scheduled onto the node. If the
affinity requirements specified by this field cease to be met at some point
during pod execution (e.g. due to an update), the system may or may not try
to eventually evict the pod from its node.
podAffinity
[root@k8s-master ~]# kubectl explain deploy.spec.template.spec.affinity.podAffinity
KIND: Deployment
VERSION: apps/v1
RESOURCE: podAffinity <Object>
DESCRIPTION:
Describes pod affinity scheduling rules (e.g. co-locate this pod in the
same node, zone, etc. as some other pod(s)).
Pod affinity is a group of inter pod affinity scheduling rules.
FIELDS:
preferredDuringSchedulingIgnoredDuringExecution <[]Object>
The scheduler will prefer to schedule pods to nodes that satisfy the
affinity expressions specified by this field, but it may choose a node that
violates one or more of the expressions. The node that is most preferred is
the one with the greatest sum of weights, i.e. for each node that meets all
of the scheduling requirements (resource request, requiredDuringScheduling
affinity expressions, etc.), compute a sum by iterating through the
elements of this field and adding "weight" to the sum if the node has pods
which matches the corresponding podAffinityTerm; the node(s) with the
highest sum are the most preferred.
requiredDuringSchedulingIgnoredDuringExecution <[]Object>
If the affinity requirements specified by this field are not met at
scheduling time, the pod will not be scheduled onto the node. If the
affinity requirements specified by this field cease to be met at some point
during pod execution (e.g. due to a pod label update), the system may or
may not try to eventually evict the pod from its node. When there are
multiple elements, the lists of nodes corresponding to each podAffinityTerm
are intersected, i.e. all terms must be satisfied.
podAntiAffinity
[root@k8s-master ~]# kubectl explain deploy.spec.template.spec.affinity.podAntiAffinity
KIND: Deployment
VERSION: apps/v1
RESOURCE: podAntiAffinity <Object>
DESCRIPTION:
Describes pod anti-affinity scheduling rules (e.g. avoid putting this pod
in the same node, zone, etc. as some other pod(s)).
Pod anti affinity is a group of inter pod anti affinity scheduling rules.
FIELDS:
preferredDuringSchedulingIgnoredDuringExecution <[]Object>
The scheduler will prefer to schedule pods to nodes that satisfy the
anti-affinity expressions specified by this field, but it may choose a node
that violates one or more of the expressions. The node that is most
preferred is the one with the greatest sum of weights, i.e. for each node
that meets all of the scheduling requirements (resource request,
requiredDuringScheduling anti-affinity expressions, etc.), compute a sum by
iterating through the elements of this field and adding "weight" to the sum
if the node has pods which matches the corresponding podAffinityTerm; the
node(s) with the highest sum are the most preferred.
requiredDuringSchedulingIgnoredDuringExecution <[]Object>
If the anti-affinity requirements specified by this field are not met at
scheduling time, the pod will not be scheduled onto the node. If the
anti-affinity requirements specified by this field cease to be met at some
point during pod execution (e.g. due to a pod label update), the system may
or may not try to eventually evict the pod from its node. When there are
multiple elements, the lists of nodes corresponding to each podAffinityTerm
are intersected, i.e. all terms must be satisfied.
As the output above shows, all three rule types share the same pair of fields:
- preferredDuringSchedulingIgnoredDuringExecution: soft (preferred) affinity
- requiredDuringSchedulingIgnoredDuringExecution: hard (required) affinity
Soft affinity: combined with "operator: NotIn", it means the scheduler tries to keep the Pod off matching nodes, but if no non-matching node is available the Pod may still be placed on a matching one.
Hard affinity: combined with "operator: In", the Pod must be scheduled onto a node satisfying the condition; otherwise it sits in Pending.
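For orientation, a minimal skeleton of where these fields sit in a Deployment manifest (the ... values are elided; the field paths are the ones shown by kubectl explain above):
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution: ...  # hard: must match, or the Pod stays Pending
          preferredDuringSchedulingIgnoredDuringExecution: ... # soft: best effort, weighted
        podAffinity: ...
        podAntiAffinity: ...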
| Scheduling policy | Matches labels of | Operators | Topology domain support | Scheduling target |
| --- | --- | --- | --- | --- |
| nodeAffinity | node | In, NotIn, Exists, DoesNotExist, Gt, Lt | No | a specific node |
| podAffinity | pod | In, NotIn, Exists, DoesNotExist | Yes | same topology domain as the specified pods |
| podAntiAffinity | pod | In, NotIn, Exists, DoesNotExist | Yes | a different topology domain from the specified pods |
Operators (illustrated in the sketch after this list)
- In: the label's value is in the given list
- NotIn: the label's value is not in the given list
- Gt: the label's value is greater than the given value
- Lt: the label's value is less than the given value
- Exists: the label exists
- DoesNotExist: the label does not exist
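A sketch of how the less common operators look inside a nodeAffinity matchExpressions list; the label keys gpu and cpu-count are hypothetical:
- matchExpressions:
  - key: gpu        # Exists: the node carries a gpu label, value irrelevant (no values list allowed)
    operator: Exists
  - key: cpu-count  # Gt: the label value, interpreted as an integer, is greater than 8
    operator: Gt
    values:
    - "8"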
Note: every kind of affinity depends on labels, so check what is available first.
[root@k8s-master ~]# kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-master Ready compute,master 109d v1.18.9 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master,kubernetes.io/os=linux,node-role.kubernetes.io/compute=dedicated-middleware,node-role.kubernetes.io/master=
k8s-node-01 Ready <none> 4h18m v1.18.9 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node-01,kubernetes.io/os=linux
k8s-node-02 Ready <none> 4h16m v1.18.9 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node-02,kubernetes.io/os=linux
[root@k8s-master ~]# kubectl get pods -n kube-system --show-labels
NAME READY STATUS RESTARTS AGE LABELS
calico-kube-controllers-5b8b769fcd-8hlzn 1/1 Running 23 109d k8s-app=calico-kube-controllers,pod-template-hash=5b8b769fcd
calico-node-fwcss 1/1 Running 23 109d controller-revision-hash=b9dd4bd9f,k8s-app=calico-node,pod-template-generation=1
calico-node-m84rz 1/1 Running 0 4h17m controller-revision-hash=b9dd4bd9f,k8s-app=calico-node,pod-template-generation=1
calico-node-tvs89 1/1 Running 0 4h19m controller-revision-hash=b9dd4bd9f,k8s-app=calico-node,pod-template-generation=1
coredns-65556b4c97-dhkz4 1/1 Running 5 24d k8s-app=kube-dns,pod-template-hash=65556b4c97
etcd-k8s-master 1/1 Running 23 109d component=etcd,tier=control-plane
kube-apiserver-k8s-master 1/1 Running 23 109d component=kube-apiserver,tier=control-plane
kube-controller-manager-k8s-master 1/1 Running 24 109d component=kube-controller-manager,tier=control-plane
kube-proxy-9b84w 1/1 Running 0 4h19m controller-revision-hash=949786769,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-hftdw 1/1 Running 23 109d controller-revision-hash=949786769,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-x4lnq 1/1 Running 0 4h17m controller-revision-hash=949786769,k8s-app=kube-proxy,pod-template-generation=1
kube-scheduler-k8s-master 1/1 Running 23 109d component=kube-scheduler,tier=control-plane
metrics-server-86499f7fd8-pdw6d 1/1 Running 3 9d k8s-app=metrics-server,pod-template-hash=86499f7fd8
nfs-client-provisioner-df46b8d64-jwgd4 1/1 Running 23 109d app=nfs-client-provisioner,pod-template-hash=df46b8d64
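The soft-affinity example later in this post assumes custom node labels such as departement=dep-a and ssd=true, which do not exist above. To reproduce it, they can be added and removed with kubectl label:
[root@k8s-master ~]# kubectl label nodes k8s-node-01 departement=dep-a
[root@k8s-master ~]# kubectl label nodes k8s-node-02 ssd=true
# a trailing dash removes the label again
[root@k8s-master ~]# kubectl label nodes k8s-node-02 ssd-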
Affinity
Hard affinity
Hard affinity with nodeAffinity
[root@k8s-master nginx]# cat nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx1
  labels:
    app: nginx1
spec:
  replicas: 10
  selector:
    matchLabels:
      app: nginx1
  template:
    metadata:
      labels:
        app: nginx1
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname # the node label to match
                operator: NotIn # keep the Pod off nodes whose kubernetes.io/hostname value is in the values list
                values:
                - k8s-node-01
      initContainers:
      - name: init-container
        image: busybox:latest
        imagePullPolicy: IfNotPresent
        command: ["sh"]
        env:
        # - name: MY_POD_NAME
        #   valueFrom:
        #     fieldRef:
        #       fieldPath: metadata.name
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        args:
          [
            "-c",
            "echo ${HOSTNAME} ${MY_POD_IP} > /wwwroot/index.html",
          ]
        volumeMounts:
        - name: wwwroot
          mountPath: "/wwwroot"
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - name: wwwroot
          mountPath: /usr/share/nginx/html/index.html
          subPath: index.html
      volumes:
      - name: wwwroot
        emptyDir: {}
With this rule in place, the scheduler keeps every Pod off that node and picks other suitable nodes instead.
# as expected, every pod is scheduled onto a node other than k8s-node-01
[root@k8s-master nginx]# kubectl get pods -o wide | grep nginx1
nginx1-664f458845-2bbcw 1/1 Running 0 2m38s 10.100.44.197 k8s-node-02 <none> <none>
nginx1-664f458845-64rg2 1/1 Running 0 2m39s 10.100.44.198 k8s-node-02 <none> <none>
nginx1-664f458845-8tbxt 1/1 Running 0 2m39s 10.100.44.200 k8s-node-02 <none> <none>
nginx1-664f458845-bz672 1/1 Running 0 2m38s 10.100.44.195 k8s-node-02 <none> <none>
nginx1-664f458845-ft5c9 1/1 Running 0 2m38s 10.100.44.201 k8s-node-02 <none> <none>
nginx1-664f458845-jp8tz 1/1 Running 0 2m39s 10.100.44.196 k8s-node-02 <none> <none>
nginx1-664f458845-lf5k7 1/1 Running 0 2m38s 10.100.44.202 k8s-node-02 <none> <none>
nginx1-664f458845-sn9c5 1/1 Running 0 2m39s 10.100.44.199 k8s-node-02 <none> <none>
nginx1-664f458845-t8nrb 1/1 Running 0 2m39s 10.100.44.194 k8s-node-02 <none> <none>
nginx1-664f458845-vbwxn 1/1 Running 0 2m39s 10.100.44.193 k8s-node-02 <none> <none>
I had made the master schedulable, yet with this policy applied not a single pod was placed on it; only after excluding node-02 as well did pods land on the master. Apparently the master is chosen only as a last resort, when no other node is available.
[root@k8s-master nginx]# cat nginx-deployment.yaml
####### snipped
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname # the node label to match
                operator: NotIn # keep the Pod off nodes whose kubernetes.io/hostname value is in the values list
                values:
                - k8s-node-01
                - k8s-node-02
####### snipped
[root@k8s-master nginx]# kubectl get pods -o wide | grep nginx1
nginx1-6cb8f796f7-25f5l 1/1 Running 0 15m 10.100.235.219 k8s-master <none> <none>
nginx1-6cb8f796f7-2qfcv 1/1 Running 0 16m 10.100.235.194 k8s-master <none> <none>
nginx1-6cb8f796f7-78t2r 1/1 Running 0 15m 10.100.235.242 k8s-master <none> <none>
nginx1-6cb8f796f7-b9c5n 1/1 Running 0 16m 10.100.235.236 k8s-master <none> <none>
nginx1-6cb8f796f7-ccpd8 1/1 Running 0 16m 10.100.235.254 k8s-master <none> <none>
nginx1-6cb8f796f7-jhp2k 1/1 Running 0 15m 10.100.235.213 k8s-master <none> <none>
nginx1-6cb8f796f7-l6s9h 1/1 Running 0 16m 10.100.235.231 k8s-master <none> <none>
nginx1-6cb8f796f7-qnjvg 1/1 Running 0 15m 10.100.235.199 k8s-master <none> <none>
nginx1-6cb8f796f7-tgx6l 1/1 Running 0 16m 10.100.235.202 k8s-master <none> <none>
nginx1-6cb8f796f7-twhxq 1/1 Running 0 15m 10.100.235.226 k8s-master <none> <none>
Now add the master to the exclusion list too, and the new pods of the rolling update stay stuck in Pending:
[root@k8s-master nginx]# kubectl get pods -o wide | grep nginx1
nginx1-6cb8f796f7-2qfcv 1/1 Running 0 20m 10.100.235.194 k8s-master <none> <none>
nginx1-6cb8f796f7-b9c5n 1/1 Running 0 20m 10.100.235.236 k8s-master <none> <none>
nginx1-6cb8f796f7-ccpd8 1/1 Running 0 20m 10.100.235.254 k8s-master <none> <none>
nginx1-6cb8f796f7-jhp2k 1/1 Running 0 19m 10.100.235.213 k8s-master <none> <none>
nginx1-6cb8f796f7-l6s9h 1/1 Running 0 20m 10.100.235.231 k8s-master <none> <none>
nginx1-6cb8f796f7-qnjvg 1/1 Running 0 19m 10.100.235.199 k8s-master <none> <none>
nginx1-6cb8f796f7-tgx6l 1/1 Running 0 20m 10.100.235.202 k8s-master <none> <none>
nginx1-6cb8f796f7-twhxq 1/1 Running 0 19m 10.100.235.226 k8s-master <none> <none>
nginx1-9f7cb6d58-mvhwr 0/1 Pending 0 2m54s <none> <none> <none> <none>
nginx1-9f7cb6d58-pz6vn 0/1 Pending 0 2m53s <none> <none> <none> <none>
nginx1-9f7cb6d58-rkbm5 0/1 Pending 0 2m53s <none> <none> <none> <none>
nginx1-9f7cb6d58-tsc7z 0/1 Pending 0 2m54s <none> <none> <none> <none>
nginx1-9f7cb6d58-x4c8w 0/1 Pending 0 2m54s <none> <none> <none> <none>
[root@k8s-master nginx]# kubectl describe pods nginx1-9f7cb6d58-mvhwr
Name: nginx1-9f7cb6d58-mvhwr
Namespace: default
Priority: 0
Node: <none>
Labels: app=nginx1
pod-template-hash=9f7cb6d58
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/nginx1-9f7cb6d58
Init Containers:
init-container:
Image: busybox:latest
Port: <none>
Host Port: <none>
Command:
sh
Args:
-c
echo ${HOSTNAME} ${MY_POD_IP} > /wwwroot/index.html
Environment:
MY_POD_IP: (v1:status.podIP)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-scngg (ro)
/wwwroot from wwwroot (rw)
Containers:
nginx:
Image: nginx:latest
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts:
/usr/share/nginx/html/index.html from wwwroot (rw,path="index.html")
/var/run/secrets/kubernetes.io/serviceaccount from default-token-scngg (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
wwwroot:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
default-token-scngg:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-scngg
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m42s default-scheduler 0/3 nodes are available: 3 node(s) didn't match node selector.
Warning FailedScheduling 3m42s default-scheduler 0/3 nodes are available: 3 node(s) didn't match node selector.
Hard affinity with podAffinity
Create two Deployments:
- Deployment A uses nodeAffinity to pin its pods to k8s-node-01 and labels them group=a
- Deployment B uses podAffinity to land on whichever nodes run pods labeled group=a
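topologyKey defines what counts as "the same place": with kubernetes.io/hostname every node is its own topology domain. As a sketch (not used below), zone-wide co-location would instead use the standard zone label:
podAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchLabels:
        group: a
    topologyKey: topology.kubernetes.io/zone # any node in the same zone as a group=a pod qualifies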
Deployment A
[root@k8s-master nginx]# cat nginx-a-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-a
  labels:
    app: nginx-a
spec:
  replicas: 10
  selector:
    matchLabels:
      app: nginx-a
  template:
    metadata:
      labels:
        app: nginx-a
        group: a # the label Deployment B's podAffinity selects on (visible in the pod listing below)
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - k8s-node-01
      initContainers:
      - name: init-container
        image: busybox:latest
        imagePullPolicy: IfNotPresent
        command: ["sh"]
        env:
        # - name: MY_POD_NAME
        #   valueFrom:
        #     fieldRef:
        #       fieldPath: metadata.name
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        args:
          [
            "-c",
            "echo ${HOSTNAME} ${MY_POD_IP} > /wwwroot/index.html",
          ]
        volumeMounts:
        - name: wwwroot
          mountPath: "/wwwroot"
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - name: wwwroot
          mountPath: /usr/share/nginx/html/index.html
          subPath: index.html
      volumes:
      - name: wwwroot
        emptyDir: {}
[root@k8s-master nginx]# kubectl get pods -o wide --show-labels | grep nginx-a
nginx-a-55c8c877d5-29smq 1/1 Running 0 2m35s 10.100.154.212 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
nginx-a-55c8c877d5-5s92q 1/1 Running 0 2m35s 10.100.154.206 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
nginx-a-55c8c877d5-5tbf8 1/1 Running 0 2m35s 10.100.154.203 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
nginx-a-55c8c877d5-6qzdp 1/1 Running 0 2m35s 10.100.154.210 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
nginx-a-55c8c877d5-7zr2b 1/1 Running 0 2m35s 10.100.154.208 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
nginx-a-55c8c877d5-bqnvw 1/1 Running 0 2m35s 10.100.154.207 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
nginx-a-55c8c877d5-s7fjn 1/1 Running 0 2m35s 10.100.154.209 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
nginx-a-55c8c877d5-w7nsq 1/1 Running 0 2m35s 10.100.154.211 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
nginx-a-55c8c877d5-wkss5 1/1 Running 0 2m35s 10.100.154.204 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
nginx-a-55c8c877d5-z4q2w 1/1 Running 0 2m35s 10.100.154.205 k8s-node-01 <none> <none> app=nginx-a,group=a,pod-template-hash=55c8c877d5
Deployment B
[root@k8s-master nginx]# cat nginx-b-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-b
  labels:
    app: nginx-b
spec:
  replicas: 10
  selector:
    matchLabels:
      app: nginx-b
  template:
    metadata:
      labels:
        app: nginx-b
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: group
                operator: In
                values:
                - a
            topologyKey: kubernetes.io/hostname
      initContainers:
      - name: init-container
        image: busybox:latest
        imagePullPolicy: IfNotPresent
        command: ["sh"]
        env:
        # - name: MY_POD_NAME
        #   valueFrom:
        #     fieldRef:
        #       fieldPath: metadata.name
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        args:
          [
            "-c",
            "echo ${HOSTNAME} ${MY_POD_IP} > /wwwroot/index.html",
          ]
        volumeMounts:
        - name: wwwroot
          mountPath: "/wwwroot"
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - name: wwwroot
          mountPath: /usr/share/nginx/html/index.html
          subPath: index.html
      volumes:
      - name: wwwroot
        emptyDir: {}
[root@k8s-master nginx]# kubectl get pods -o wide | grep nginx
nginx-a-55c8c877d5-29smq 1/1 Running 0 21m 10.100.154.212 k8s-node-01 <none> <none>
nginx-a-55c8c877d5-5s92q 1/1 Running 0 21m 10.100.154.206 k8s-node-01 <none> <none>
nginx-a-55c8c877d5-5tbf8 1/1 Running 0 21m 10.100.154.203 k8s-node-01 <none> <none>
nginx-a-55c8c877d5-6qzdp 1/1 Running 0 21m 10.100.154.210 k8s-node-01 <none> <none>
nginx-a-55c8c877d5-7zr2b 1/1 Running 0 21m 10.100.154.208 k8s-node-01 <none> <none>
nginx-a-55c8c877d5-bqnvw 1/1 Running 0 21m 10.100.154.207 k8s-node-01 <none> <none>
nginx-a-55c8c877d5-s7fjn 1/1 Running 0 21m 10.100.154.209 k8s-node-01 <none> <none>
nginx-a-55c8c877d5-w7nsq 1/1 Running 0 21m 10.100.154.211 k8s-node-01 <none> <none>
nginx-a-55c8c877d5-wkss5 1/1 Running 0 21m 10.100.154.204 k8s-node-01 <none> <none>
nginx-a-55c8c877d5-z4q2w 1/1 Running 0 21m 10.100.154.205 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-4ds8b 0/1 PodInitializing 0 44s 10.100.154.228 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-5w6w2 1/1 Running 0 44s 10.100.154.223 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-fs6qk 1/1 Running 0 44s 10.100.154.232 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-jwb5d 0/1 PodInitializing 0 44s 10.100.154.229 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-pgt9l 0/1 PodInitializing 0 44s 10.100.154.226 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-q5fmc 0/1 PodInitializing 0 44s 10.100.154.231 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-rnd55 1/1 Running 0 44s 10.100.154.224 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-sgljk 1/1 Running 0 44s 10.100.154.225 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-vz7js 1/1 Running 0 44s 10.100.154.227 k8s-node-01 <none> <none>
nginx-b-7bfbc47b99-wj68d 1/1 Running 0 44s 10.100.154.230 k8s-node-01 <none> <none>
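All ten nginx-b pods land on k8s-node-01 next to the group=a pods, exactly as the required podAffinity demands.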
Soft affinity
Soft affinity is best-effort control: the Pod should preferably be placed on a node satisfying the affinity condition, but when no node does, it still accepts being scheduled onto a node that does not. When several soft conditions coexist, each can carry a weight attribute (range 1-100) to rank its priority; the higher the number, the more strongly the scheduler prefers nodes matching that condition.
Soft affinity with nodeAffinity
The spec.affinity.nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution field of a Pod spec defines preferred Pod-to-node affinity. Each entry nests the preference and weight fields:
- weight: the priority of this soft condition, range 1-100; higher numbers win
- preference: a node selector object supporting the matchExpressions and matchFields mechanisms, used exactly as in required affinity
Adding a soft-affinity configuration such as:
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 60
      preference:
        matchExpressions:
        - key: departement
          operator: In
          values:
          - dep-a
    - weight: 30
      preference:
        matchExpressions:
        - key: ssd
          operator: In
          values:
          - "true" # matchExpressions values are strings, so the boolean-looking value must be quoted
This example defines two soft conditions: the first selects nodes labeled departement=dep-a with weight 60, the second selects nodes labeled ssd=true with weight 30. The cluster's nodes then fall into four classes:
- nodes with both departement=dep-a and ssd=true: highest score, 60 + 30 = 90
- nodes with only departement=dep-a: score 60
- nodes with only ssd=true: score 30
- nodes with neither label: score 0
The Pod prefers the first class and works its way down the list; the last class still scores 0 yet remains eligible, so unlike hard affinity the Pod never ends up Pending for lack of a match.
Soft affinity with podAffinity
Preferred inter-pod affinity is expressed in the spec.affinity.podAffinity.preferredDuringSchedulingIgnoredDuringExecution field, whose value is a list of objects nesting the weight and podAffinityTerm fields:
- weight: a number in the range 1-100 defining the weight of this soft condition
- podAffinityTerm: the pod selector, nesting the labelSelector, namespaces, and topologyKey fields
Adding a policy such as:
affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 80
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: group
            operator: In
            values:
            - a
        topologyKey: kubernetes.io/hostname
    - weight: 40
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: group
            operator: NotIn
            values:
            - b
        topologyKey: kubernetes.io/hostname
In this example the most preferred nodes are those already running pods labeled group=a while running none labeled group=b; the worst choice is the opposite: a node without group=a pods that does host group=b pods.
Anti-affinity
This mechanism lets the pods of one workload spread evenly across nodes, reducing the chance of them piling up on a few nodes.
Soft anti-affinity with podAntiAffinity
The scheduler tries not to put mutually exclusive Pods in the same topology domain, but when the constraint cannot be satisfied it still co-locates them instead of leaving the Pod in Pending. In the run below, the nine replicas end up spread across all three nodes, though not perfectly evenly.
[root@k8s-master nginx]# cat nginx-d-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-d
  labels:
    app: nginx-d
spec:
  replicas: 9
  selector:
    matchLabels:
      app: nginx-d
      tier: backend
  template:
    metadata:
      labels:
        app: nginx-d
        tier: backend
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 60
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: tier
                  operator: In
                  values:
                  - backend
              topologyKey: kubernetes.io/hostname
      initContainers:
      - name: init-container
        image: busybox:latest
        imagePullPolicy: IfNotPresent
        command: ["sh"]
        env:
        # - name: MY_POD_NAME
        #   valueFrom:
        #     fieldRef:
        #       fieldPath: metadata.name
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        args:
          [
            "-c",
            "echo ${HOSTNAME} ${MY_POD_IP} > /wwwroot/index.html",
          ]
        volumeMounts:
        - name: wwwroot
          mountPath: "/wwwroot"
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - name: wwwroot
          mountPath: /usr/share/nginx/html/index.html
          subPath: index.html
      volumes:
      - name: wwwroot
        emptyDir: {}
[root@k8s-master nginx]# kubectl get pods -o wide --show-labels | grep nginx-d
nginx-d-6985c677dd-26pl9 1/1 Running 0 56s 10.100.235.222 k8s-master <none> <none> app=nginx-d,pod-template-hash=6985c677dd,tier=backend
nginx-d-6985c677dd-b2x5z 1/1 Running 0 56s 10.100.154.235 k8s-node-01 <none> <none> app=nginx-d,pod-template-hash=6985c677dd,tier=backend
nginx-d-6985c677dd-cv74q 1/1 Running 0 56s 10.100.235.221 k8s-master <none> <none> app=nginx-d,pod-template-hash=6985c677dd,tier=backend
nginx-d-6985c677dd-hf5k5 1/1 Running 0 56s 10.100.44.205 k8s-node-02 <none> <none> app=nginx-d,pod-template-hash=6985c677dd,tier=backend
nginx-d-6985c677dd-hjmm2 1/1 Running 0 56s 10.100.154.234 k8s-node-01 <none> <none> app=nginx-d,pod-template-hash=6985c677dd,tier=backend
nginx-d-6985c677dd-hs6pw 1/1 Running 0 56s 10.100.44.207 k8s-node-02 <none> <none> app=nginx-d,pod-template-hash=6985c677dd,tier=backend
nginx-d-6985c677dd-nfmfm 1/1 Running 0 56s 10.100.44.204 k8s-node-02 <none> <none> app=nginx-d,pod-template-hash=6985c677dd,tier=backend
nginx-d-6985c677dd-tcfxr 1/1 Running 0 56s 10.100.44.206 k8s-node-02 <none> <none> app=nginx-d,pod-template-hash=6985c677dd,tier=backend
nginx-d-6985c677dd-thxlt 1/1 Running 0 56s 10.100.44.208 k8s-node-02 <none> <none> app=nginx-d,pod-template-hash=6985c677dd,tier=backend
Hard anti-affinity with podAntiAffinity
The scheduler never puts mutually exclusive Pods in the same topology domain; when the constraint cannot be satisfied, the Pod stays Pending. With three schedulable nodes and topologyKey: kubernetes.io/hostname, at most one tier=frontend Pod fits per node, so only 3 of the 9 replicas below ever run.
[root@k8s-master nginx]# cat nginx-c-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-c
  labels:
    app: nginx-c
spec:
  replicas: 9
  selector:
    matchLabels:
      app: nginx-c
      tier: frontend
  template:
    metadata:
      labels:
        app: nginx-c
        tier: frontend
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: tier
                operator: In
                values:
                - frontend
            topologyKey: kubernetes.io/hostname
      initContainers:
      - name: init-container
        image: busybox:latest
        imagePullPolicy: IfNotPresent
        command: ["sh"]
        env:
        # - name: MY_POD_NAME
        #   valueFrom:
        #     fieldRef:
        #       fieldPath: metadata.name
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        args:
          [
            "-c",
            "echo ${HOSTNAME} ${MY_POD_IP} > /wwwroot/index.html",
          ]
        volumeMounts:
        - name: wwwroot
          mountPath: "/wwwroot"
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - name: wwwroot
          mountPath: /usr/share/nginx/html/index.html
          subPath: index.html
      volumes:
      - name: wwwroot
        emptyDir: {}
[root@k8s-master nginx]# kubectl get pod -o wide --show-labels | grep nginx-c
nginx-c-5c57748846-5xf25 0/1 Pending 0 2m13s <none> <none> <none> <none> app=nginx-c,pod-template-hash=5c57748846,tier=frontend
nginx-c-5c57748846-68sr2 0/1 Pending 0 2m13s <none> <none> <none> <none> app=nginx-c,pod-template-hash=5c57748846,tier=frontend
nginx-c-5c57748846-7z24h 0/1 Pending 0 2m13s <none> <none> <none> <none> app=nginx-c,pod-template-hash=5c57748846,tier=frontend
nginx-c-5c57748846-85fcw 1/1 Running 0 3m17s 10.100.235.244 k8s-master <none> <none> app=nginx-c,pod-template-hash=5c57748846,tier=frontend
nginx-c-5c57748846-pdr4r 1/1 Running 0 3m17s 10.100.154.233 k8s-node-01 <none> <none> app=nginx-c,pod-template-hash=5c57748846,tier=frontend
nginx-c-5c57748846-pmrn9 0/1 Pending 0 2m13s <none> <none> <none> <none> app=nginx-c,pod-template-hash=5c57748846,tier=frontend
nginx-c-5c57748846-qms5r 0/1 Pending 0 2m13s <none> <none> <none> <none> app=nginx-c,pod-template-hash=5c57748846,tier=frontend
nginx-c-5c57748846-sxfkw 0/1 Pending 0 2m13s <none> <none> <none> <none> app=nginx-c,pod-template-hash=5c57748846,tier=frontend
nginx-c-5c57748846-tbs2g 1/1 Running 0 3m17s 10.100.44.203 k8s-node-02 <none> <none> app=nginx-c,pod-template-hash=5c57748846,tier=frontend
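To clean up the demo Deployments created throughout this post:
[root@k8s-master nginx]# kubectl delete deployment nginx1 nginx-a nginx-b nginx-c nginx-d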