diff --git a/data/talks.yaml b/data/talks.yaml index 32eb05fbecf2c461743e65600e19d63e0e1d15d0..d2122350c4bab4863bab62509223749c37cf282c 100644 --- a/data/talks.yaml +++ b/data/talks.yaml @@ -4,6 +4,20 @@ # references. Nominally we will use /talks. cards: + - meeting: Computational Galaxy Formation 2018 + location: Ringberg Castle, Tegernsee, Germany + date: March 2018 + title: "First steps towards cosmological simulations with full EAGLE physics" + author: Matthieu Schaller + abstract: "The SWIFT simulation code has now matured enough that we can start targeting large-scale simulations using the EAGLE physics model. In this talk + I will discuss the status of the code and present some ideas related to the domain decomposition that we implemented in order to tackle the challenge of + deep time-step hierarchies that develop in cosmological simulations. I will also present some weak-scaling plots demonstrating the ability of the code to scale + up to the largest systems in realistic scenarios." + links: + - href: "Ringberg_2018.pdf" + name: Slides + + - meeting: Supercomputing Frontiers Europe 2018 location: Warsaw, Poland date: March 2018 @@ -24,7 +38,7 @@ cards: - href: "SuperComputingFrontiers_2018.pdf" name: Slides - - meeting: SIAM Conference on Parallel Procecssing for Scientific Computing 2018 + - meeting: SIAM Conference on Parallel Processing for Scientific Computing 2018 location: Tokyo, Japan date: March 2018 title: "Using Task-Based Parallelism, Asynchronous MPI and Dynamic Workload-Based Domain Decomposition to Achieve Near-Perfect Load-Balancing for Particle-Based Hydrodynamics and Gravity" @@ -67,7 +81,7 @@ cards: date: June 2017 title: "SWIFT: Using Task-Based Parallelism, Fully Asynchronous Communication and Vectorization to achieve maximal HPC performance" author: James S. Willis - abstract: "We present a new open-source cosmological code, called swift, designed to solve the equations of hydrodynamics using a particle-based approach (Smooth Particle Hydrodynamics) on hybrid shared / distributed-memory architectures. Swift was designed from the bottom up to provide excellent strong scaling on both commodity clusters (Tier-2 systems) and Top100-supercomputers (Tier-0 systems), without relying on architecture-specific features or specialized accelerator hardware. This performance is due to three main computational approaches: - Task-based parallelism for shared-memory parallelism, which provides fine-grained load balancing and thus strong scaling on large numbers of cores. - Graph-based and genetic algorithm-based domain decomposition, which uses the task graph to decompose the simulation domain such that the work, as opposed to just the data, as is the case with most partitioning schemes, is equally distributed across all nodes. - Fully dynamic and asynchronous communication, in which communication is modeled as just another task in the task-based scheme, sending data whenever it is ready and deferring on tasks that rely on data from other nodes until it arrives, - Explicit vectorization of the core kernel to exploit all the available FLOPS on architectures such as Xeon Phi. In order to use these approaches, the code had to be rewritten from scratch, and the algorithms therein adapted to the task-based paradigm. As a result, we can show upwards of 60% parallel efficiency for moderate-sized problems when increasing the number of cores 512-fold on x86 architecture making SWIFT more than an order of magnitude faster than current alternative software." + abstract: "We present a new open-source cosmological code, called swift, designed to solve the equations of hydrodynamics using a particle-based approach (Smooth Particle Hydrodynamics) on hybrid shared / distributed-memory architectures. Swift was designed from the bottom up to provide excellent strong scaling on both commodity clusters (Tier-2 systems) and Top100-supercomputers (Tier-0 systems), without relying on architecture-specific features or specialised accelerator hardware. This performance is due to three main computational approaches: - Task-based parallelism for shared-memory parallelism, which provides fine-grained load balancing and thus strong scaling on large numbers of cores. - Graph-based and genetic algorithm-based domain decomposition, which uses the task graph to decompose the simulation domain such that the work, as opposed to just the data, as is the case with most partitioning schemes, is equally distributed across all nodes. - Fully dynamic and asynchronous communication, in which communication is modelled as just another task in the task-based scheme, sending data whenever it is ready and deferring on tasks that rely on data from other nodes until it arrives, - Explicit vectorization of the core kernel to exploit all the available FLOPS on architectures such as Xeon Phi. In order to use these approaches, the code had to be rewritten from scratch, and the algorithms therein adapted to the task-based paradigm. As a result, we can show upwards of 60% parallel efficiency for moderate-sized problems when increasing the number of cores 512-fold on x86 architecture making SWIFT more than an order of magnitude faster than current alternative software." links: - href: "ISC_Intel_Booth_Talk_2017.pdf" name: Slides @@ -87,7 +101,7 @@ cards: date: November 2016 title: "SWIFT: Using Task-Based Parallelism, Fully Asynchronous Communication and Vectorization to achieve maximal HPC performance" author: Matthieu Schaller - abstract: "We present a new open-source cosmological code, called swift, designed to solve the equations of hydrodynamics using a particle-based approach (Smooth Particle Hydrodynamics) on hybrid shared / distributed-memory architectures. Swift was designed from the bottom up to provide excellent strong scaling on both commodity clusters (Tier-2 systems) and Top100-supercomputers (Tier-0 systems), without relying on architecture-specific features or specialized accelerator hardware. This performance is due to three main computational approaches: - Task-based parallelism for shared-memory parallelism, which provides fine-grained load balancing and thus strong scaling on large numbers of cores. - Graph-based and genetic algorithm-based domain decomposition, which uses the task graph to decompose the simulation domain such that the work, as opposed to just the data, as is the case with most partitioning schemes, is equally distributed across all nodes. - Fully dynamic and asynchronous communication, in which communication is modeled as just another task in the task-based scheme, sending data whenever it is ready and deferring on tasks that rely on data from other nodes until it arrives, - Explicit vectorization of the core kernel to exploit all the available FLOPS on architectures such as Xeon Phi. In order to use these approaches, the code had to be rewritten from scratch, and the algorithms therein adapted to the task-based paradigm. As a result, we can show upwards of 60% parallel efficiency for moderate-sized problems when increasing the number of cores 512-fold on x86 architecture making SWIFT more than an order of magnitude faster than current alternative software." + abstract: "We present a new open-source cosmological code, called swift, designed to solve the equations of hydrodynamics using a particle-based approach (Smooth Particle Hydrodynamics) on hybrid shared / distributed-memory architectures. Swift was designed from the bottom up to provide excellent strong scaling on both commodity clusters (Tier-2 systems) and Top100-supercomputers (Tier-0 systems), without relying on architecture-specific features or specialised accelerator hardware. This performance is due to three main computational approaches: - Task-based parallelism for shared-memory parallelism, which provides fine-grained load balancing and thus strong scaling on large numbers of cores. - Graph-based and genetic algorithm-based domain decomposition, which uses the task graph to decompose the simulation domain such that the work, as opposed to just the data, as is the case with most partitioning schemes, is equally distributed across all nodes. - Fully dynamic and asynchronous communication, in which communication is modelled as just another task in the task-based scheme, sending data whenever it is ready and deferring on tasks that rely on data from other nodes until it arrives, - Explicit vectorization of the core kernel to exploit all the available FLOPS on architectures such as Xeon Phi. In order to use these approaches, the code had to be rewritten from scratch, and the algorithms therein adapted to the task-based paradigm. As a result, we can show upwards of 60% parallel efficiency for moderate-sized problems when increasing the number of cores 512-fold on x86 architecture making SWIFT more than an order of magnitude faster than current alternative software." links: - href: "HPC_DevCon_Talk_2016.pdf" name: Slides @@ -109,7 +123,7 @@ cards: date: June 2016 title: "SWIFT: Strong scaling for particle-based simulations on more than 100'000 cores" author: Matthieu Schaller - abstract: "We present a new open-source cosmological code, called SWIFT, designed to solve the equations of hydrodynamics using a particle-based approach (Smooth Particle Hydrodynamics) on hybrid shared/distributed-memory architectures. SWIFT was designed from the bottom up to provide excellent strong scaling on both commodity clusters (Tier-2 systems) and Top100-supercomputers (Tier-0 systems), without relying on architecture-specific features or specialized accelerator hardware. This performance is due to three main computational approaches: (1) Task-based parallelism for shared-memory parallelism, which provides fine-grained load balancing and thus strong scaling on large numbers of cores. (2) Graph-based domain decomposition, which uses the task graph to decompose the simulation domain such that the work, as opposed to just the data, as is the case with most partitioning schemes, is equally distributed across all nodes. (3) Fully dynamic and asynchronous communication, in which communication is modelled as just another task in the task-based scheme, sending data whenever it is ready and deferring on tasks that rely on data from other nodes until it arrives. In order to use these approaches, the code had to be re-written from scratch, and the algorithms therein adapted to the task-based paradigm. As a result, we can show upwards of 60% parallel efficiency for moderate-sized problems when increasing the number of cores 512-fold, on both x86-based and Power8-based architectures." + abstract: "We present a new open-source cosmological code, called SWIFT, designed to solve the equations of hydrodynamics using a particle-based approach (Smooth Particle Hydrodynamics) on hybrid shared/distributed-memory architectures. SWIFT was designed from the bottom up to provide excellent strong scaling on both commodity clusters (Tier-2 systems) and Top100-supercomputers (Tier-0 systems), without relying on architecture-specific features or specialised accelerator hardware. This performance is due to three main computational approaches: (1) Task-based parallelism for shared-memory parallelism, which provides fine-grained load balancing and thus strong scaling on large numbers of cores. (2) Graph-based domain decomposition, which uses the task graph to decompose the simulation domain such that the work, as opposed to just the data, as is the case with most partitioning schemes, is equally distributed across all nodes. (3) Fully dynamic and asynchronous communication, in which communication is modelled as just another task in the task-based scheme, sending data whenever it is ready and deferring on tasks that rely on data from other nodes until it arrives. In order to use these approaches, the code had to be re-written from scratch, and the algorithms therein adapted to the task-based paradigm. As a result, we can show upwards of 60% parallel efficiency for moderate-sized problems when increasing the number of cores 512-fold, on both x86-based and Power8-based architectures." links: - href: "PASC_2016.pdf" name: Slides diff --git a/talks/Ringberg_2018.pdf b/talks/Ringberg_2018.pdf new file mode 100644 index 0000000000000000000000000000000000000000..ea704e353c727f32610e7267fd6d9d9d9239500f Binary files /dev/null and b/talks/Ringberg_2018.pdf differ