Felix should configure iptables rules such that VXLAN UDP Flows are not tracked in conntrack, when in VXLAN mode #8934
Comments
I noticed this issue before, but I also found that it doesn't affect communication. Thanks for the report, @adkafka! I'd like to help solve this. This feature doesn't need a flag, right? |
I cannot think of a use case where we'd want these flows tracked in |
Yes, I agree that we don't need the feature flag; this should be the default behavior. We also need an ack from the maintainers. |
I think this is likely to be a good idea, but we need to check for interactions with Calico host endpoint policy, where Calico is policing traffic on the host's own interface. That feature already has an auto-allow for VXLAN, but we'd need to check that it all works correctly with NOTRACK (and that there is no performance regression, for example). Were there any established flows for VXLAN, or does it always go into this state? |
There are 0 established (
This matches my understanding of VXLAN. It is a one-way tunnel between hosts. If the host responds, it will be over a different UDP flow because that other host will use the destination port of 4789 to respond. |
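To make the point above concrete, here is a minimal sketch of why conntrack never sees a reply for a VXLAN flow. It assumes the sender picks its UDP source port from a hash of the inner flow, which is the usual behavior; the addresses, port values, and the `FlowTuple` and `expected_reply` names are hypothetical and only for illustration.

```python
from typing import NamedTuple

VXLAN_PORT = 4789  # UDP destination port mentioned in the comment above


class FlowTuple(NamedTuple):
    src: str
    dst: str
    sport: int
    dport: int


def expected_reply(original: FlowTuple) -> FlowTuple:
    """Return the tuple conntrack would treat as a reply: the original, reversed."""
    return FlowTuple(original.dst, original.src, original.dport, original.sport)


# Node A sends an encapsulated packet to node B (hypothetical addresses/ports).
a_to_b = FlowTuple("10.0.0.1", "10.0.0.2", 51123, VXLAN_PORT)

# Node B's return traffic is its own tunnel flow: a fresh source port,
# and again destination port 4789.
b_to_a = FlowTuple("10.0.0.2", "10.0.0.1", 40877, VXLAN_PORT)

# The return packet does not match the reversed tuple, so the A -> B entry
# stays UNREPLIED and a second, separate entry is created for B -> A.
is_reply = b_to_a == expected_reply(a_to_b)
print(is_reply)
```

Under these assumptions each direction of the tunnel produces its own one-way entry, which is consistent with every VXLAN entry being reported as UNREPLIED.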
Yeah, that makes sense; we're likely getting no benefit from conntrack then. |
Hi @fasaxc, I'm trying to fix this issue. Can you tell me where to change the code? I found a few places, but I'm not sure whether they're appropriate. Thanks a lot. |
I think the best place is probably in https://github.com/projectcalico/calico/blob/master/felix/rules/static.go#L1261
I think there are quite a few unit tests that check these rules, they'll need to be updated. I think you can run them with |
We have a workload that manages many (10,000s) TCP connections per node. Traffic is sent between nodes in a Kubernetes cluster. We use Calico in a VXLAN configuration as our CNI. Felix manages the `iptables` entries as expected. Additionally, we have `kube-proxy` running on our cluster in a standard configuration (using `iptables`, not `ipvs`).
The issue we are noticing is that our Conntrack tables are unexpectedly full. Some of the entries in the Conntrack table are expected (the TCP connections responsible for our application traffic), but to my surprise, almost half of the entries in our Conntrack table are UDP "connections" responsible for the VXLAN tunnels between nodes. All of these connections are in the "UNREPLIED" state.
Here are some commands to illustrate this:
Of the 129,605 flows tracked in Conntrack, 59,258 (~46%) are UDP flows corresponding to VXLAN. This significantly limits how many connections each node in our cluster can handle. Luckily, when a node's Conntrack table does fill up, after dropping a couple of packets it enters "early_drop" mode and removes many `UNREPLIED` connections from the Conntrack table (which, in our case, are the UDP VXLAN flows). This prevents significant application impact, but it does make monitoring our Conntrack usage much more difficult.
After some discussion in the Calico Slack (#networking, https://calicousers.slack.com/archives/CPEPF833L/p1718663125404499), we decided to experiment with adding `iptables` rules so that these VXLAN UDP flows are not tracked in Conntrack. We found that this had the desired effect and caused no impact to our application traffic. Therefore, we are proposing that Calico automatically add these rules when in VXLAN mode. It may be worth putting this behind a configuration flag that defaults to "off" to prevent accidentally breaking any workloads.
The `iptables` rules I added to each of these nodes so that VXLAN UDP traffic is not tracked were:
This was based on a tool I found online that did something very similar (https://review.opendev.org/c/openstack/tripleo-heat-templates/+/831444/1/deployment/neutron/neutron-ovs-agent-container-puppet.yaml).
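For readers who want a starting point, the following is a sketch of what rules of this kind could look like, not necessarily the exact rules used here. It assumes the default VXLAN port 4789 and that both the `PREROUTING` and `OUTPUT` chains of the `raw` table need a rule; the `notrack_rules` helper is hypothetical. The script only prints the commands and does not run them.

```python
import shlex
from typing import List

VXLAN_PORT = 4789  # assumption: default VXLAN UDP port


def notrack_rules(port: int = VXLAN_PORT) -> List[List[str]]:
    """Build iptables commands that exempt VXLAN UDP traffic from conntrack."""
    rules = []
    # PREROUTING covers packets arriving from other nodes,
    # OUTPUT covers packets generated by this node.
    for chain in ("PREROUTING", "OUTPUT"):
        rules.append([
            "iptables", "-t", "raw", "-A", chain,
            "-p", "udp", "--dport", str(port),
            "-j", "NOTRACK",
        ])
    return rules


# Print the commands for review instead of executing them.
for rule in notrack_rules():
    print(shlex.join(rule))
```

Printing rather than executing keeps the sketch safe to run anywhere; applying the rules would require root on each node and should be tested against any host endpoint policy first.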
After we apply these rules on our nodes, we see 0 entries in Conntrack matching the UDP port:
My understanding is that tracking these UDP flows in Conntrack has no advantage. These flows remain in the `UNREPLIED` state because the traffic only flows one way. Therefore, stateful connection tracking has no positive impact.
Expected Behavior
Conntrack table does not fill up with VXLAN UDP flows.
Current Behavior
Conntrack table contains a non-trivial amount of VXLAN UDP flows, resulting in these tables filling prematurely.
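One way to quantify this on a node is to count the VXLAN entries in a dump of the table. The sketch below assumes input lines in the style printed by `conntrack -L`; the sample lines are made up for illustration and are not taken from this report, and the `is_vxlan_flow` and `vxlan_share` helpers are hypothetical. The last lines redo the arithmetic for the figures reported above.

```python
import re

VXLAN_PORT = 4789  # assumption: default VXLAN UDP port

# Hypothetical sample in the style of `conntrack -L` output.
SAMPLE = """\
udp      17 25 src=10.0.0.1 dst=10.0.0.2 sport=51123 dport=4789 [UNREPLIED] src=10.0.0.2 dst=10.0.0.1 sport=4789 dport=51123 mark=0 use=1
tcp      6 431999 ESTABLISHED src=10.0.0.1 dst=10.0.0.3 sport=40000 dport=8080 src=10.0.0.3 dst=10.0.0.1 sport=8080 dport=40000 [ASSURED] mark=0 use=1
udp      17 12 src=10.0.0.1 dst=10.0.0.4 sport=40877 dport=4789 [UNREPLIED] src=10.0.0.4 dst=10.0.0.1 sport=4789 dport=40877 mark=0 use=1
"""


def is_vxlan_flow(line, port=VXLAN_PORT):
    """True for UDP entries whose original-direction dport is the VXLAN port."""
    # The first dport= field on a line belongs to the original direction.
    match = re.search(r"dport=(\d+)", line)
    return line.startswith("udp") and match is not None and int(match.group(1)) == port


def vxlan_share(lines):
    """Return (number of VXLAN entries, total number of entries)."""
    entries = [line for line in lines if line.strip()]
    return sum(is_vxlan_flow(line) for line in entries), len(entries)


vxlan, total = vxlan_share(SAMPLE.splitlines())
print(vxlan, total)

# Share for the figures reported above: 59,258 of 129,605 entries.
reported_share = 59_258 / 129_605
print(f"{reported_share:.1%}")
```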
Possible Solution
Configure Felix to add `NOTRACK` rules to the `raw` table in `iptables` when used in VXLAN mode. Controlling this with a configuration parameter seems ideal, in case there are some unique workloads where this change does have an impact (though I can't think of one).
Your Environment
v3.28.0
5.10.217-205.860.amzn2.x86_64