I figured out what was the issue. The kube-dns service was being scheduled to nodes suffering from high memory pressure, causing kube-dns to be evicted and restarted. During the time it was out some requests would not be resolved. In order to fix the issue I created a nodepool exclusive to the kube-system services, then edited the kube-system deployments and set a nodeSelector so they always get scheduled to safe Nodes. After that, the issue has ceased.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…