Calico node error - iptables-legacy-save command failed #8831
Please share your thoughts on this. We are currently blocked from upgrading to EKS 1.29 due to this issue.
What Linux distro/version do you use? Does it have (proper) support for iptables?
We create the cluster on the amazon-linux-2-arm64 AMI.
The actual AMIs in question here are the EKS Optimized ones. All versions of these AMIs (even the x86/AMD64 ones) ship the same version of iptables (v1.8.4).
So I don't think this is related to the version of iptables. The same commands work on much older 1.26 ARM instances (which run earlier versions of Calico).
@jonathan-hurley regarding the function in question: would it be possible to upgrade iptables to v1.8.8 on your instances? Alternatively, Calico pre-v3.27.2 should be using iptables v1.8.4; could you try that and see if the issue is resolved? (Not ideal, but this would at least help diagnose the problem.)
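As a quick way to compare the host's iptables version against the one a given Calico release bundles, something like the sketch below can be used. The helper name and the hard-coded version values are illustrative, not from the thread:

```shell
# Sketch: compare an iptables version string against the v1.8.8 that
# newer Calico releases bundle. ver_lt is a hypothetical helper.
ver_lt() {
  # success (exit 0) iff $1 sorts strictly before $2 in version order
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

host_ver="1.8.4"     # what the EKS Optimized AMIs ship
bundled_ver="1.8.8"  # what Calico v3.27.2+ bundles

if ver_lt "$host_ver" "$bundled_ver"; then
  echo "host iptables $host_ver predates bundled $bundled_ver"
fi
```

In practice one would feed this the output of `iptables --version` from the host and from inside the container.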
Amazon EKS Optimized images have always used iptables 1.8.4; we do not have the option to change this. We must use the latest versions of Calico in order to resolve CVEs.
Hi @coutinhop, I got a seg fault when using Calico 3.26.3 on Photon OS with iptables 1.8.9. Do you mean the iptables on the OS should also be version 1.8.4 to avoid the issue? I wish Calico printed more than just the seg fault. It looks like hashes does contain stdout; maybe it could be printed in debug mode. Just found out it already is, in debug mode.
@coutinhop This change breaks Calico 3.27.2+ on every version of Amazon Linux 2 and every Optimized EKS AMI based on it. The latest versions of Amazon Linux 2 still only support iptables 1.8.4 and are not EOL for another entire year. What is the possibility that the Felix change which caused this can be reverted?
@coutinhop I don't think this has anything to do with the AMI / version of iptables shipped on the Amazon VM instances. Calico packages the necessary libraries for the binaries it uses inside the container, so it sounds like we have published ARM images that are missing a necessary lib (I'm not able to reproduce this on amd64). If I had to guess, we updated the version of iptables to v1.8.8, which introduced a new library dependency.
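One way to check which shared libraries the bundled iptables binary actually links against is `ldd`. A sketch follows; it parses a sample (made-up) `ldd` listing rather than a live container, since the real command would be run inside the calico-node pod:

```shell
# Inside a calico-node container one would run something like:
#   ldd /usr/sbin/iptables-legacy-save
# Here we grep a sample (illustrative) ldd listing for the xtables lib.
ldd_out='	libxtables.so.12 => /usr/lib64/libxtables.so.12 (0x00007f00)
	libc.so.6 => /lib64/libc.so.6 (0x00007f01)'

if printf '%s\n' "$ldd_out" | grep -q 'libxtables\.so\.12'; then
  echo "links against libxtables.so.12"
fi
```

If the lib shows as "not found" in real `ldd` output, that would point at exactly the kind of missing-library problem described above.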
This commit is likely the one that introduced the problem: c053d1c
@caseydavenport - I can reproduce this on AMD64 as well.
Well, that certainly changes things. Perhaps there is some interaction with the host packages that I wasn't expecting.
Here are the details of the VM I tried. When I invoke iptables from the Calico pod on this machine, I get the same failure.
I managed to push a PR with a tentative fix, see #9022 for more details. @jonathan-hurley or @Farhanec07, would either of you be able to test this with an image I built locally with the fix and let us know if that gets rid of the problem? amd64: Thanks!
@coutinhop - that fixed it!
@jonathan-hurley that is great news! I'm still having a bit of trouble reproducing the issue myself; could you share some more details about your setup? For example, inside a calico-node pod WITH the problem, could you share the output of the failing command?
Yes, the problem happened consistently on all pods, on both AMD64/x86 and ARM64 architectures. We use the Optimized EKS AMI images (which are based on Amazon Linux 2).
Thank you @jonathan-hurley! Just to clarify, is this in the calico-node pod? What image are you using? This looks very weird, as 'libxtables.so.12.3.0' is the "outdated" version of the lib, which won't contain the newer symbols. Would you be willing to hit me up in the Calico Users Slack? We could make this conversation a lot more real-time if so: https://slack.projectcalico.org/
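To confirm whether a given libxtables build exports a particular symbol, `nm -D` on the library works. The sketch below greps a sample (illustrative) symbol listing, since the real command would be run inside the pod; the symbol name shown is a stand-in, not the specific symbol from this bug:

```shell
# In the pod this would be something like:
#   nm -D /usr/lib64/libxtables.so.12 | grep <symbol>
# The listing and symbol below are illustrative stand-ins.
nm_out='0000000000004a10 T xtables_init
0000000000005b20 T xtables_find_target'

if printf '%s\n' "$nm_out" | grep -q ' T xtables_find_target$'; then
  echo "symbol present"
else
  echo "symbol missing"
fi
```

A "missing" result against the shipped lib would match the theory that the outdated libxtables lacks what the newer iptables binary needs.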
Yes, these commands were being run from the calico-node pod before it died due to Felix crashing. The image for this run was ami-0be1daad79c89dd0a. Sure, let me hop on Slack ...
The resolution was that the user was using an incorrectly built external image from Iron Bank, which had incorrect versions of the libraries.
Expected Behavior
Current Behavior
A panic was observed: Felix fails to save iptables rules, causing pods to crash.
calico-node pod log -
Checked cni.log; could only see the error below.
When exec'ing into the pod, the iptables command does not execute.
Possible Solution
Steps to Reproduce (for bugs)
Context
Your Environment