Computational Performance Enhancements via Parallel Execution of Competing Implementations, 10-R8537
Inclusive Dates: 03/10/15 – Current
Background — The lattice-Boltzmann method (LBM) is used to analyze fluid flows. It represents physical systems in an idealized way in which space and time are discrete, and it provides less computationally demanding descriptions of macroscopic hydrodynamic problems. Many of our clients have problems that are amenable to solution using the LBM. However, ongoing work on these problems suffers from overly long computation times for the simulation runs. This research takes advantage of the availability of large numbers of processors to evaluate a speculative space of optimization techniques during the execution of simulation runs. A genetic algorithm (GA) approach is being evaluated for dynamically selecting an implementation that executes the LBM algorithm efficiently and adapts to temporal phase changes in the simulation.
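To make the discrete-space, discrete-time idea concrete, the following is a minimal sketch of a one-dimensional, three-velocity (D1Q3) lattice-Boltzmann update with a single-relaxation-time (BGK) collision. It is not the project's code: the lattice size, the relaxation time `tau`, the initial density bump, and all function names are assumptions chosen for illustration.

```python
# Minimal D1Q3 lattice-Boltzmann sketch (illustrative assumptions throughout).
# Each lattice node holds three populations: rest, right-moving, left-moving.
VELOCITIES = (0, 1, -1)
WEIGHTS = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)


def equilibrium(rho, u):
    """Second-order equilibrium populations for density rho and velocity u."""
    return [
        w * rho * (1.0 + 3.0 * c * u + 4.5 * (c * u) ** 2 - 1.5 * u * u)
        for c, w in zip(VELOCITIES, WEIGHTS)
    ]


def step(f, tau=0.8):
    """Advance the lattice one time step: local collision, then streaming."""
    n = len(f)
    post = []
    for node in f:
        rho = sum(node)
        u = sum(c * fi for c, fi in zip(VELOCITIES, node)) / rho
        feq = equilibrium(rho, u)
        # BGK collision: relax each population toward its equilibrium value.
        post.append([fi - (fi - fe) / tau for fi, fe in zip(node, feq)])
    # Streaming: each population hops one node along its velocity (periodic).
    streamed = [[0.0] * len(VELOCITIES) for _ in range(n)]
    for x in range(n):
        for i, c in enumerate(VELOCITIES):
            streamed[(x + c) % n][i] = post[x][i]
    return streamed


def density(f):
    """Macroscopic density at every node."""
    return [sum(node) for node in f]


# Start at rest with a small density bump in the middle of the lattice.
n_nodes = 32
f = [equilibrium(1.1 if 12 <= x < 20 else 1.0, 0.0) for x in range(n_nodes)]
initial_mass = sum(density(f))
for _ in range(100):
    f = step(f)
final_mass = sum(density(f))
```

The collision step is purely local and the streaming step touches only nearest neighbors, which is why the method lends itself to many alternative loop orderings and data layouts.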
Approach — The objective of the project is to evaluate the applicability of a GA approach to optimizing the performance of computationally intensive simulation code, using the LBM problem first and then the symplectic N-body algorithms with close approaches (SyMBA) problem. The original code provided by the problem presenters is first analyzed and optimized using a base set of transformations to produce a baseline suitable for the specialized GA-focused research. The code is then marked up to allow the PARADYN framework to experiment with variations in the construction of the code, the organization of the data, and the flags used to compile the code. A set of genes (ways to vary the code, data, and flags) is defined, and the set of alleles to be used (specific variations on the genes) is specified. The framework then executes multiple versions of portions of the simulation in parallel and evaluates their performance. It creates new variations based on the best-performing versions. The continual evolution of the versions being run allows the simulation to eventually perform better than the baseline solution (Figure 4) and to adapt to changes in the state of the simulation (Figure 3).
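The gene/allele loop described above can be sketched as follows. This is a hedged illustration, not the PARADYN implementation: the gene names, the allele values, the invented cost model in `measured_cost` (a stand-in for timing a portion of the simulation), and the population settings are all assumptions. A thread pool stands in for running the competing versions on separate processors.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical genes (ways to vary code, data, flags) and their alleles.
GENES = {
    "loop_order": ["xyz", "zyx", "xzy"],
    "data_layout": ["array_of_structs", "struct_of_arrays"],
    "opt_flag": ["-O2", "-O3"],
}


def random_variant(rng):
    """Pick one allele for every gene."""
    return {gene: rng.choice(alleles) for gene, alleles in GENES.items()}


def crossover(a, b, rng):
    """Child takes each gene's allele from one of the two parents."""
    return {gene: rng.choice((a[gene], b[gene])) for gene in GENES}


def mutate(variant, rng, rate=0.2):
    """Occasionally replace an allele with a random one for that gene."""
    child = dict(variant)
    for gene, alleles in GENES.items():
        if rng.random() < rate:
            child[gene] = rng.choice(alleles)
    return child


def measured_cost(variant, phase):
    """Invented cost model standing in for a timed run; lower is better.

    The preferred data layout changes with the simulation phase, so the
    best variant in one phase is not the best in the next.
    """
    cost = 10.0
    if variant["opt_flag"] == "-O3":
        cost -= 2.0
    if variant["loop_order"] == "zyx":
        cost -= 1.0
    preferred = "struct_of_arrays" if phase == 0 else "array_of_structs"
    if variant["data_layout"] == preferred:
        cost -= 4.0
    return cost


def evolve(population, phase, rng, keep=2):
    """Evaluate all variants concurrently, keep the best, breed the rest."""
    with ThreadPoolExecutor() as pool:
        costs = list(pool.map(lambda v: measured_cost(v, phase), population))
    ranked = [v for _, v in sorted(zip(costs, population), key=lambda p: p[0])]
    parents = ranked[:keep]
    children = [
        mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
        for _ in range(len(population) - keep)
    ]
    return parents + children


def run(generations_per_phase=30, pop_size=8, seed=0):
    """Evolve through two simulation phases; return best cost in each."""
    rng = random.Random(seed)
    population = [random_variant(rng) for _ in range(pop_size)]
    best_costs = []
    for phase in (0, 1):
        for _ in range(generations_per_phase):
            population = evolve(population, phase, rng)
        best_costs.append(min(measured_cost(v, phase) for v in population))
    return best_costs
```

Because the best variants are carried over unchanged each generation, the best measured cost within a phase never gets worse, while mutation keeps enough diversity for the population to recover when the phase changes.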
Accomplishments — Baseline optimized versions of both the LBM code and the SyMBA code have been created. Figures 1 and 2 show the improvements that were made during the base optimization phase of work on the LBM problem. As Figure 1 shows, a speedup of up to ~2x was achieved for problems that do not include thermal fluctuations. Figure 2 shows an example improvement of more than 800x made during the baseline phase for problems that include thermal fluctuations. Genes and alleles have been defined for the LBM code, and the necessary plugins to the PARADYN framework have been implemented. The LBM code has been marked up for use by PARADYN. An example optimization run of the LBM code with PARADYN is shown in Figure 3. This figure demonstrates PARADYN's ability to dynamically adapt to changes in the simulation, as can be seen when the particles are released at t=10,000. Figure 4 illustrates PARADYN's ability to select increasingly optimized solutions as the simulation progresses, eventually reaching better performance than the baseline optimizations, as the trend line indicates. Figure 5 compares the original code performance with the current PARADYN performance. A 5-8x speedup has been achieved to date for problems without thermal fluctuations. The PARADYN framework was designed and implemented in a modular and reusable way that facilitates its use on the SyMBA project as well. Genes and alleles are being defined for the SyMBA code, and the code is being marked up for use within the framework.