Problem
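Symptom: a KubeSphere user cannot view the information of one member cluster. As soon as the user clicks that cluster, they are logged out automatically. The network between the member cluster and the host cluster is normal, and the admin user can view the cluster without problems. Deleting the member cluster and joining it again does not help. On the member cluster, the cluster-agent pod logs show no errors. The ks-controller-manager pod reports the following errors: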
E0818 00:15:28.710840 1 controller.go:304] controller/user-controller "msg"="Reconciler error" "error"="Internal error occurred: failed calling webhook \"users.iam.kubesphere.io\": Post \"https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2?timeout=30s\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" "name"="admin" "namespace"="" "reconciler group"="iam.kubesphere.io" "reconciler kind"="User"
E0818 00:15:58.745125 1 user_controller.go:127] controllers/user-controller "msg"="failed to update user" "error"="Internal error occurred: failed calling webhook \"users.iam.kubesphere.io\": Post \"https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2?timeout=30s\": context deadline exceeded" "user"={"Namespace":"","Name":"admin"}
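To spot this kind of failure quickly in a long log, the webhook names can be pulled out of the error lines. The following is a minimal sketch, not part of the original troubleshooting; the function name failed_webhooks is hypothetical, and it assumes the log format shown above.

```shell
# Print the distinct webhook names found in "failed calling webhook" errors.
# Reads log text on stdin; assumes the log format shown above.
failed_webhooks() {
  grep -o 'failed calling webhook [^:]*' \
    | sed 's/failed calling webhook //' \
    | tr -d '\\"' \
    | sort -u
}

# Example usage (requires access to the cluster):
#   kubectl -n kubesphere-system logs deployment/ks-controller-manager | failed_webhooks
```

If the output lists users.iam.kubesphere.io, the user controller is being blocked by its own validating webhook, as in the logs above.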
Solution
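Deleting the webhook fixes the symptom; back it up before deleting it. The root cause, however, is that the apiserver cannot reach the ks-controller-manager service, i.e. a cluster network problem.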
kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io users.iam.kubesphere.io -o yaml > validatingwebhook.yaml
kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io
kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io users.iam.kubesphere.io
kubectl get user
To see the effect quickly, restart the ks-controller-manager pod:
kubectl rollout restart deployment -n kubesphere-system ks-controller-manager
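The restart can be followed by a quick check that the rollout finished and that the webhook errors stopped. This is only a sketch under assumptions: the function name verify_restart, the 120s timeout, and the 5m log window are arbitrary choices, not values from the original notes.

```shell
# Wait for the restarted deployment, then count webhook errors in recent logs.
# Prints the number of matching log lines (0 means no webhook errors were seen).
verify_restart() {
  ns="kubesphere-system"
  deploy="deployment/ks-controller-manager"
  # Block until the new pod is ready, or give up after the timeout.
  kubectl -n "$ns" rollout status "$deploy" --timeout=120s || return 1
  # grep -c exits non-zero when the count is 0, so ignore its exit status.
  kubectl -n "$ns" logs "$deploy" --since=5m \
    | grep -c 'failed calling webhook' || true
}

# Example usage (requires access to the cluster):
#   verify_restart
```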
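The logs no longer show errors, and checking the users again shows that they have been synced automatically.
KubeSphere can now display the member cluster information as well.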
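Since deleting the webhook only works around the problem, the underlying connectivity between the apiserver and the ks-controller-manager service still deserves a look. A possible starting point is sketched below; the namespace and service name are taken from the webhook URL in the error log, while the function name check_webhook_backend is hypothetical and the checks themselves are a suggestion rather than the steps actually used here.

```shell
# Inspect the Service behind the webhook URL from the error log.
check_webhook_backend() {
  ns="kubesphere-system"
  svc="ks-controller-manager"
  # The Service should exist and expose port 443.
  kubectl -n "$ns" get service "$svc" -o wide
  # Empty endpoints mean no ready pod is backing the Service.
  kubectl -n "$ns" get endpoints "$svc"
  # Pod IPs and nodes, to compare against the endpoints above.
  kubectl -n "$ns" get pods -o wide
}

# Example usage (requires access to the cluster):
#   check_webhook_backend
```

If the Service and endpoints look healthy, the problem is more likely on the network path from the control-plane node to the pod network (for example the CNI plugin or a firewall), which has to be checked on the nodes themselves.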