Publication | Open Access
Optimizing MPI communication within large multicore nodes with kernel assistance
Year: 2010 | Citations: 24 | References: 13
Keywords: Cluster Computing, Heterogeneous Computing, Engineering, Computer Architecture, Communication Architecture, Complex Architectures, Shared Memory, High-performance Architecture, Parallel Computing, Network Optimization, Manycore Processor, Hybrid Programming, NPB Execution Time, Computer Engineering, Computer Science, Intra-node Communication Efficiency, Edge Computing, Parallel Performance Evaluation, Cloud Computing, Many-core Architecture, Parallel Programming, Kernel Assistance, System Software
As the number of cores per node increases in modern clusters, intra-node communication efficiency becomes critical to application performance. We present a study of the traditional double-copy model in MPICH2 and a kernel-assisted single-copy strategy with KNEM on different shared-memory hosts with up to 96 cores. We show that KNEM is less sensitive to process placement on these complex architectures. It improves throughput by up to a factor of 2 for large messages in both point-to-point and collective operations, and it significantly reduces NPB execution time. We detail when to switch from one strategy to the other depending on the communication pattern, and we show that I/OAT copy offload appears to be an interesting solution only for older architectures.
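The difference between the two copy models named in the abstract can be illustrated with a toy byte-counting model. This is only a sketch under stated assumptions, not the MPICH2 or KNEM implementation: the function names `double_copy` and `single_copy`, the 4096-byte chunk size, and the use of a `bytearray` as a stand-in for a shared-memory segment are all hypothetical choices made for illustration. It counts how many bytes are moved in each model and does not measure performance.

```python
def double_copy(src: bytes, chunk_size: int = 4096):
    """Toy model of a double-copy transfer.

    The sender copies each chunk into an intermediate buffer (a stand-in
    for a shared-memory segment), and the receiver copies it out again.
    Returns the received data and the number of bytes moved.
    """
    shared = bytearray(chunk_size)      # hypothetical shared buffer
    dst = bytearray(len(src))
    copied = 0
    for off in range(0, len(src), chunk_size):
        n = min(chunk_size, len(src) - off)
        shared[:n] = src[off:off + n]   # copy 1: sender -> shared buffer
        dst[off:off + n] = shared[:n]   # copy 2: shared buffer -> receiver
        copied += 2 * n
    return bytes(dst), copied


def single_copy(src: bytes):
    """Toy model of a single-copy transfer.

    The data moves directly from the source buffer to the destination
    buffer, as a kernel-assisted copy would do across address spaces.
    Returns the received data and the number of bytes moved.
    """
    dst = bytearray(len(src))
    dst[:] = src                        # the only copy
    return bytes(dst), len(src)


if __name__ == "__main__":
    message = b"x" * (1 << 20)          # 1 MiB message, arbitrary size
    _, moved_double = double_copy(message)
    _, moved_single = single_copy(message)
    print("double-copy bytes moved:", moved_double)
    print("single-copy bytes moved:", moved_single)
```

In this model the double-copy path always moves twice as many bytes as the single-copy path. The model ignores the per-message cost of entering the kernel, which is one reason a real implementation might still prefer the double-copy path in some cases.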