
Make trex run on x86 systems with more than 240 threads and sub-NUMA clustering #1120

Open · wants to merge 1 commit into base: master

Conversation

@Civil (Contributor) commented Mar 18, 2024


  • Fix dpdk_setup_ports NUMA detection code - it now supports sub-NUMA clustering
  • Remove the artificial limitation of 63 max cores in dpdk_setup_ports
  • Change RTE_MAX_LCORE in build options for DPDK
  • Update trex_defs to unify ppc and x86 and also make the arg string bigger
  • Change BP_MAX_CORES to 256 (not sure why it is not dynamically determined...)

Fixes #1119

@Civil (Contributor, Author) commented Mar 18, 2024

There is an alternative approach: for my setup it would be enough to keep 128 cores, since DPDK supports custom mapping (you can map your real thread IDs to virtual ones and that way keep the count below 128).

But that would require more work on refactoring.

I can try to do that, and also move the static allocation for BP_MAX_CORES to a dynamic one, but later.

@trex-bot

Checked SHA: eb463ae
Status: SUCCESS ✔️
Link to job: http://81.218.86.50:8080/job/trex_build/829/

@il-alexk commented Jul 10, 2024

Does it really work with more than 31 cores?

I can see the latest 3.05 still doesn't include the quick patch for this issue (#1072); it doesn't look like it's in your patch either.

@Civil (Contributor, Author) commented Jul 10, 2024

Does it really work with more than 31 cores?

Based on the issue you mentioned, it shouldn't work with exactly 32 cores, because the validation function is wrong. As far as I can see the problem there is just in the validation function, so if you have more than 32 cores it would probably work just fine.

However, as I needed considerably more cores (my test system has 240 threads), I haven't encountered that problem.

Honestly, I don't know what the devs ( @hhaim ? ) think of this PR, as it has been open for a while... So I'm not sure whether I should just apply the patch mentioned there, rewrite the check function as part of this PR, or abandon the effort of upstreaming my patches and just keep my own fork instead.

@il-alexk commented Jul 10, 2024

My patch in that PR is quite lame; I expect it to throw a runtime error exactly at 64 cores. I considered rewriting the code to use __int128 instead to support up to 128 cores (still fewer than the 240 that you need), but couldn't justify the effort for my tasks. My TRex setup has optimal performance between 16 and 32 cores, hence the quick and dirty fix.

Anyway, since the core mask is only 64 bits, I am not sure how TRex can correctly allocate 240 cores.

TrexDPCoreMask(uint8_t dp_core_count, uint64_t dp_core_mask = MASK_ALL) {

It should also throw an error for any core count above 64:
if ( (dp_core_count < 1) || (dp_core_count > 64) ) { return false; }

I could be wrong of course, would be good to learn and understand how it works for you.

As for what we can do: you can see Haim thanked me for my patch, so there's no reason for you not to include it in your fork, unless you want to rewrite it and do a proper fix there. I would personally prefer that you add either this fix or your own to your fork. I'm sure it will be merged to the main branch after that.

@Civil (Contributor, Author) commented Jul 10, 2024

I could be wrong of course, would be good to learn and understand how it works for you.

Honestly, I'll need a closer look at what they use this mask for. As far as I remember, they used to have a core mask a few versions ago and pass it to DPDK, but that has since been replaced with a core list. And as far as I remember, I had a pretty even core load on my side.

On the other hand, it could be that this limits it to 32 threads per NIC or something like that...

I'll probably have a look at the behavior of the code in the next few weeks (mostly because we've had a few days of proper summer with +30 C outside, and I don't want to turn on a homelab that dissipates an extra 800 W of heat until my apartment cools down :D )
