Metallic Foam Structures, Dendrites and Implementation Optimizations ...

Cite this paper as: Vondrous A., Nestler B., August A., Wesner E., Choudhury A., Hötzer J. (2012) Metallic Foam Structures, Dendrites and Implementation ...
Metallic Foam Structures, Dendrites and Implementation Optimizations for Phase-Field Modeling

A. Vondrous, B. Nestler, A. August, E. Wesner, A. Choudhury, and J. Hötzer

Abstract We present our current work in computational materials science using the phase-field method on the high performance cluster XC 4000 of the KIT (Karlsruhe Institute of Technology). Our investigations cover heat conduction in open cell metal foams, dendritic growth and optimizations of concurrent processing with the Message Passing Interface (MPI) standard. Large scale simulations are applied to identify relevant parameters of heat conduction and dendrite growth. Our overall goal is to continuously develop our models, numerical solution techniques and software implementations. The basic model and the parallelization scheme are described. Disadvantages of 1D domain decomposition compared to 3D domain decomposition for large 3D simulation domains are explained; a detailed analysis of the new 3D decomposition still needs to be performed. The data throughput of parallel file IO operations is measured, and system specific differences have been found that need further investigation.

A. Vondrous · B. Nestler · A. August · E. Wesner · A. Choudhury · J. Hötzer
Institute of Materials and Processes, Karlsruhe University of Applied Sciences, Moltkestr. 30, Karlsruhe, Germany
e-mail: [email protected], Anastasia.August@hs-karlsruhe.de, [email protected], [email protected], johannes.hoetzer@gmx.de, [email protected]

W.E. Nagel et al. (eds.), High Performance Computing in Science and Engineering ’11, DOI 10.1007/978-3-642-23869-7_43, © Springer-Verlag Berlin Heidelberg 2012

1 Introduction

Science grows with every phenomenon of our environment that is discovered and understood. To help understand our environment, we try to gain insight into processing mechanisms and interactions by means of computer simulations. In this report, we present our contribution based on applications using the phase-field method. The results represent current work carried out on the XC 4000 cluster of the KIT. Our overall goal is to reveal the whole strength of the phase-field method and to exploit its capabilities. Phase-field is an easily extendable and powerful method
to describe physical phenomena of microstructure evolution and to computationally design materials with tailored properties. One of its strengths is the ability to model surface dynamics and surface interactions [1]. The effect of different physical fields on the phase transformations and on the motion of boundaries can be introduced by formulating appropriate free energies in the underlying functional. The derived set of evolution equations then contains terms representing the driving forces associated with the physical fields. Together with its ability to simulate structures with complexly shaped surfaces, the phase-field method is evolving into a powerful tool to optimize production and manufacturing processes as well as to investigate mechanical properties in various material systems. The method is expected to have great potential within a broad range of research fields. The field of applications is constantly growing, from solidification studies of multicomponent systems [3], through fluid dynamics studies [4], to studies of mechanical properties [5, 8] and computer graphics [6, 7]. Ongoing improvements of the models, numerical methods and implementation are intended to yield efficient and applicable tools for gaining more insight into the physics of structure and pattern formation. One application of the phase-field method is the investigation of new materials such as metal foams. Open cell metal foams have a large surface area and promising thermal conductivity properties, as required of a material for heat exchangers. Development, analysis and improvement of new materials can be carried out in a more targeted manner by applying simulation methods such as those based on the phase-field concept. Modified structures can be tested and analyzed quickly, without any material input or the need to set up an extensive experimental measurement apparatus, to obtain a first rough impression of the trend.
The roots of the phase-field method lie in solidification studies, which still form the most common research subject [9]. The metal casting industry and research facilities make use of thermodynamic databases to determine material properties and process parameters. The phase-field method is capable of describing the effects of certain process steps on the alloy properties, on the typical morphological shapes and on microstructure patterns. Investigations in metal technology can thus be supported and shorter development cycles can be achieved. The phase-field method bridges the gap between atomistic and macroscopic methods in a consistent way by operating on a mesoscopic (micrometer) scale. From a technical viewpoint, further optimization of the numerical solution algorithms is required not only to improve efficiency or usability, but also to establish continuity in software design and to help develop systems for the next generation. The use of modern hardware and cutting edge software reduces unnecessary power consumption and increases the simulation speed. In the past, significant speedup was obtained without dedicated programming concepts, simply through faster hardware with higher clock speeds. Nowadays, performance gains come from a growing number of CPUs, which makes the implementation of concurrent calculations necessary. This opens the way for new applications, which have to be kept in mind when developing high performance software. The section “Optimizations” describes our current work on performance optimizations of the software package Pace3D for materials simulation.

Metallic Foam Structures and Dendrites with Phase-Field Modeling


2 Method

In this section we give an overview of the physical modeling, the numerics and our implementation. Pace3D (Parallel Algorithms for Crystal Evolution in 3D) is a library which contains parallel solvers and tools for data manipulation, analysis and visualization. The core of Pace3D is the implemented phase-field model for multiphase and multicomponent systems proposed by B. Nestler, H. Garcke and B. Stinner in [2] and [1]. The formulation of the model is based on a general energy functional

$$F = \int_{\Omega} \left( \varepsilon\, a(\phi, \nabla\phi) + \frac{1}{\varepsilon}\, w(\phi) + f_{\mathrm{bulk}}(\phi, c, T, \ldots) \right) d\Omega , \qquad (1)$$

from which a set of nonlinear dynamic equations for the different physical field variables is derived. Interfaces between distinct phases, e.g. between solid and liquid material, are modeled by a smooth transition from one phase state to the other. $\phi$ is the phase-field vector with $N$ phases, where each vector component describes a physical state or a particular property of the material. The vector describes the volume fractions of each phase with the constraint $\sum_{\alpha=1}^{N} \phi_\alpha = 1$. $\varepsilon$ is a length scale parameter that influences the interface thickness. The gradient energy density is described by the term $a(\phi, \nabla\phi)$ and the surface energy density by $w(\phi)$. One or several bulk energies $f_{\mathrm{bulk}}$ can be incorporated to model processes such as solidification and to account for influences such as pressure, elasticity, plasticity or magnetism. Evolution equations are obtained by variational derivation of the energy functional. The area of interest is discretized by finite differences and solved with an explicit Euler method. In the case of many phases (N > 1000), the required memory rapidly exceeds the available capacity, so that optimization strategies to manage large data structures are required. The software package offers the activation of locally reduced order parameter sets according to the techniques in [10], which allows systems with very large numbers of phases (N on the order of 100000) to be computed.
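The memory argument above can be made concrete with a rough estimate. The following sketch compares storing all N phase-field values in every cell with storing only a small number of locally present phases per cell. The grid size of 420³ cells, the 8-byte values, the 4-byte phase indices and the limit of 8 local phases are illustrative assumptions and are not taken from the Pace3D implementation; the function names are hypothetical.

```python
def full_storage_bytes(cells, n_phases, bytes_per_value=8):
    """Memory if every cell stores one value for each of the N phases."""
    return cells * n_phases * bytes_per_value


def reduced_storage_bytes(cells, max_local_phases,
                          bytes_per_value=8, bytes_per_index=4):
    """Memory if every cell stores only a few (value, phase index) pairs.

    max_local_phases is an assumed upper bound on the number of phases
    that are non-zero in a single cell.
    """
    return cells * max_local_phases * (bytes_per_value + bytes_per_index)


cells = 420 ** 3                                   # illustrative grid size
full = full_storage_bytes(cells, n_phases=1000)    # all phases everywhere
reduced = reduced_storage_bytes(cells, max_local_phases=8)

print(f"full:    {full / 1e9:.1f} GB")
print(f"reduced: {reduced / 1e9:.1f} GB")
```

Under these assumptions the full storage scales linearly with N, whereas the reduced storage is independent of N, which is why a locally reduced set of order parameters makes very large numbers of phases tractable.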
Concerning software design, the implementation supports parallel execution with MPI and provides a clear structure for users, modelers and implementers. All parallel tasks follow a manager-worker pattern to keep complexity under control. An advantage of the explicit Euler method is its short-range calculation stencil: only direct neighbors are addressed to calculate the values of a cell for the next time step. Parallelization is applied by dividing the simulation domain into subregions along one dimension and adding boundary layers, which have to be updated after each time step (see Fig. 1). The user controls all physical data and the choice of modules of the package, such as fluid flow, magnetism, heat transfer etc., by setting the appropriate keys in the parameter file.
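The decomposition along one axis with boundary layers can be illustrated without MPI. The sketch below is not the Pace3D code: it emulates the subregions as NumPy arrays inside a single process, uses a simple 1D diffusion equation as a stand-in for the phase-field equations, and omits the manager-worker pattern. All function names are hypothetical. It shows that updating one ghost cell per side before every explicit Euler step reproduces the serial result.

```python
import numpy as np


def euler_step(u, r):
    """One explicit Euler step of u_t = D u_xx with r = D*dt/dx**2.

    Only direct neighbors enter the update. The first and last entries
    are kept fixed (physical boundary or ghost cell).
    """
    new = u.copy()
    new[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return new


def run_serial(u0, steps, r):
    u = u0.copy()
    for _ in range(steps):
        u = euler_step(u, r)
    return u


def exchange_ghosts(parts, left_bc, right_bc):
    """Fill the ghost cells of each subregion from its neighbors."""
    last = len(parts) - 1
    for i, p in enumerate(parts):
        p[0] = left_bc if i == 0 else parts[i - 1][-2]
        p[-1] = right_bc if i == last else parts[i + 1][1]


def run_decomposed(u0, steps, r, n_parts):
    # Split the interior cells and pad every subregion with two ghost cells.
    chunks = np.array_split(u0[1:-1], n_parts)
    parts = [np.concatenate(([0.0], c, [0.0])) for c in chunks]
    for _ in range(steps):
        exchange_ghosts(parts, u0[0], u0[-1])   # boundary layer update
        parts = [euler_step(p, r) for p in parts]
    interior = np.concatenate([p[1:-1] for p in parts])
    return np.concatenate(([u0[0]], interior, [u0[-1]]))


u0 = np.zeros(41)
u0[0] = 1.0      # fixed value at the left boundary
u0[20] = 5.0     # initial peak in the interior
serial = run_serial(u0, steps=50, r=0.25)
split = run_decomposed(u0, steps=50, r=0.25, n_parts=4)
print(np.allclose(serial, split))
```

In an MPI implementation, the body of `exchange_ghosts` would be replaced by send and receive operations between neighboring tasks; the update of the owned cells stays the same.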


Fig. 1 Simulation domain decomposition along one axis for parallel execution with MPI

3 Metallic Foam

Metal foams are porous materials with a fine irregular structure. To obtain significant simulation results, we have to
• resolve the ligaments with sufficiently many cells and
• find a representative sample suitable to describe the behavior of bigger ones.

On the one hand, a large number of cells in the domain causes very long simulation times, whereas on the other hand, a small representative volume element impairs the generalization of the results. By increasing the number of processors, we can satisfy both requirements. The first example refers to an experimental foam sample with a length of 1 cm in each spatial direction (a volume of 1 cm³) and 20 ppi (pores per inch). At 20 ppi, this sample contains 339 pores that have to be represented, and this quantity defines the extent of the representative volume element in the simulation. For samples of 50 ppi, a volume of 1 cm³ contains about 4850 pores, so that a smaller representative volume of 0.3 × 0.3 × 0.3 cm³ has to be chosen. This volume encloses about 133 pores, enough to provide convincing simulation results. The physical and simulation parameters for both samples are summarized in Tables 1 and 2. In a phase-field simulation, the energy functional (see Sect. 2) is constructed to minimize the energy in the system under consideration on the basis of classical irreversible thermodynamics. Following a conservation law for the internal energy e, a dynamic field equation, directly related to the evolution of the temperature everywhere in the simulation domain, can be derived by means of variational differentiation of the corresponding entropy functional. The classical Gibbs relation reads

$$e = f(\phi, c, T) + T\, s(e, c, \phi) = f(\phi, c, T) - T\, f_{,T}(\phi, c, T), \qquad (2)$$

where $e$ is the internal energy, $f$ contains bulk free energies, $s$ is the entropy density and the notation $f_{,T}$ denotes the derivative of the bulk free energy density $f(\phi, c, T)$ with respect to the temperature. A balance law is used to form the evolution equation for the internal energy. Terms representing heat flux are derived from the functional equation by a linear relation with respect to the thermodynamic driving forces $\nabla\, \delta S/\delta e$ and $\nabla\, \delta S/\delta c_i$, incorporating Onsager’s mobility coefficients $L_{ij}$. The partial differential equation for the internal energy is given by

$$\partial_t e = -\nabla \cdot \left( L_{00} \nabla \frac{1}{T} + \sum_{j=1}^{K} L_{0j} \nabla \frac{-\mu_j}{T} \right). \qquad (3)$$

Table 1 Structure parameters
ppi   grid cells         pore radius   pore number   solid fraction
20    420 × 420 × 420    26 cells      339           11.18%
50    420 × 420 × 420    36 cells      133           9.54%

Table 2 Simulation parameters
physical size of the domain for 20 ppi               1.0 × 1.0 × 1.0 cm³
physical size of the domain for 50 ppi               0.3 × 0.3 × 0.3 cm³
initial temperature in the interior of the domain    300 K
volumetric heat capacity of aluminum                 2.422 · 10^6 J/(m³ K)
volumetric heat capacity of air                      1.297 · 10^3 J/(m³ K)
boundary condition at the bottom of the domain       600 K

The simulation study considers the heating of foam structures with different porosity from the bottom of the domain. The metal considered is aluminum and the pores are filled with air. Figures 2 and 3 show the generated foam samples with 20 ppi and 50 ppi together with the heat diffusion after 0.79 s and 0.63 s, respectively. The images on the right show snapshots of the temperature field, including isolines, within the foam at an intermediate state of the simulation.
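To give a feeling for the numerical cost of such a heating simulation with an explicit Euler scheme, the following sketch estimates the stable time step for the 20 ppi grid and runs a strongly simplified 1D conduction problem in pure aluminum. It is a plain Fourier heat conduction sketch and not the phase-field energy equation solved in Pace3D. The heat capacities, temperatures and grid size are taken from Tables 1 and 2; the thermal conductivities are textbook values assumed here, the zero-flux condition at the top is an assumption, and all names are hypothetical.

```python
import numpy as np

C_AL = 2.422e6    # volumetric heat capacity of aluminum, J/(m^3 K), Table 2
C_AIR = 1.297e3   # volumetric heat capacity of air, J/(m^3 K), Table 2
K_AL = 237.0      # thermal conductivity of aluminum, W/(m K), ASSUMED
K_AIR = 0.026     # thermal conductivity of air, W/(m K), ASSUMED


def stable_time_step(dx, alpha_max, dim=3):
    """Largest stable explicit Euler step for diffusion in `dim` dimensions."""
    return dx * dx / (2.0 * dim * alpha_max)


def heat_1d(n_cells, dx, alpha, dt, steps, t_init=300.0, t_bottom=600.0):
    """Explicit Euler heat conduction in a homogeneous 1D column."""
    T = np.full(n_cells, t_init)
    T[0] = t_bottom                        # fixed temperature at the bottom
    for _ in range(steps):
        lap = (T[2:] - 2.0 * T[1:-1] + T[:-2]) / dx ** 2
        T[1:-1] += dt * alpha * lap
        T[-1] = T[-2]                      # zero-flux top boundary, ASSUMED
    return T


dx = 0.01 / 420                            # 1 cm resolved by 420 cells
alpha_max = max(K_AL / C_AL, K_AIR / C_AIR)
dt = stable_time_step(dx, alpha_max)
T = heat_1d(n_cells=50, dx=dx, alpha=K_AL / C_AL, dt=dt, steps=200)
print(f"stable dt: {dt:.3e} s, steps for 0.79 s: {0.79 / dt:.0f}")
```

Under these assumed conductivities the stable time step is slightly below one microsecond, so reaching the simulated time of 0.79 s takes on the order of 10^6 time steps on the 420³ grid, which illustrates why parallel execution is needed.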

Fig. 2 Left: Synthetic foam structure used for simulations associated with 1 cm³ foam samples of 20 ppi. Right: Heat distribution in the air-aluminum domain after 0.79 s

Fig. 3 Left: Synthetic foam structure used for simulations associated with 0.3 × 0.3 × 0.3 cm³ foam samples of 50 ppi. Right: Heat distribution in the air-aluminum domain after 0.63 s

4 Dendrites

Dendrites are ubiquitous morphologies occurring during solidification processing. Apart from being very interesting to physicists from the point of view of pattern formation, they hold special significance in industry, because one needs to understand the influence of the processing parameters in order to obtain a finer and more uniformly distributed microstructure, which contributes to better mechanical properties. While such a correlation between the processing parameters and the microstructure is difficult to establish experimentally, simulation techniques such as the phase-field method can contribute immensely. Phase-field simulations have been applied extensively in the past decade to understand the evolution of a number of microstructures, and one of the principal microstructures is indeed the dendrite. Dendrites are microstructures that result from a special instability leading to the selection of a unique dendrite tip radius and growth velocity for given processing conditions. Two theories exist for the selection of the dendrite tip radius, namely the marginal stability criterion and the microsolvability theory. While the marginal stability criterion is empirical, the microsolvability criterion is a solvability condition for the existence of a solution. Both theories perform well in predicting some of the precise details of the fineness of the microstructure. However, details such as the selection of the secondary and primary arm spacings are far too complicated for analytical theories. This is where phase-field simulations become useful: they establish not only the properties of the steady state dendrite tip, but also other geometrical features relating to the fineness of the microstructure.


Dendrites occur both in pure materials and in multi-component alloys. While in pure materials the onset of primary arm formation results from latent heat rejection, in multi-component alloys the physical problem is a combination of solutal and thermal field evolution. The coupled problem is difficult to solve numerically, because the solute and thermal fields evolve on largely different time scales. A useful assumption is that the thermal field has already evolved to a steady state on the time scale on which solute diffusion occurs. This allows one to simulate and recover most relevant properties of solutal dendrites. Phase-field simulations must, however, be used carefully if quantitative numbers are to be recovered from them. In this context, one must perform an analysis of the governing equations to understand their validity. Apart from this, the problem of dendrite evolution is computationally expensive. The reason for this is the need to resolve the long-range far-field diffusion field. At higher velocities or undercoolings, the diffusion field is smaller and computationally more accessible. However, existing asymptotic analyses break down at such velocities, and hence the relevance of such simulations becomes limited. At lower undercoolings the diffusion field is of the order of millimeters, while for quantitative simulations the grid spacing has to remain fine. Hence, the number of grid points multiplies. Therefore, to perform such simulations it becomes important to employ optimizations such as adaptive mesh refinement and efficient parallel computation techniques. In the following discussion we present simulation results of dendritic structures for the Al-Cu 2 at.% alloy. We model the free energies of the solid and the liquid phases with an ideal solution model.
We perform the asymptotic analysis and derive expressions for the simulation parameters such that the resulting time scale of evolution is limited by the evolution of the solute field. This implies that the phase transformation relaxes infinitely fast in response to a change of the concentration field. The simulation temperature is set to T = 895 K, with a small supersaturation and a Gibbs-Thomson coefficient of 2.4e-7 K m. The size of the domain in 2D simulations is 600 µm, while in the 3D simulations it is 100 µm with a grid resolution of 0.2 µm. The strength of anisotropy is 0.06. The secondary arm formation is induced by Langevin noise in the phase-field equations. The simulations are able to retrieve some of the important features such as the secondary arm spacings and the dendrite tip radius, which can be seen in Fig. 4a and b. To obtain these quantities over the whole range of undercoolings and driving forces, highly advanced computational optimizations are required for simulating dendrite growth. This, however, is an outlook, and dendrite growth still remains a serious computational challenge.
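The growth of the grid with the domain size can be checked with a short calculation. The sketch below counts the cells of a uniform grid for the domain sizes quoted above and for a hypothetical 1 mm cube, which corresponds to the millimeter-sized diffusion field at low undercooling. It assumes that the grid spacing of 0.2 µm applies to the 2D case as well, which the text does not state explicitly; the function names are hypothetical.

```python
def cells_per_axis(domain_um, dx_um):
    """Number of grid cells along one axis of a uniform grid."""
    return round(domain_um / dx_um)


def total_cells(domain_um, dx_um, dim):
    """Total number of cells of a square (dim=2) or cubic (dim=3) domain."""
    return cells_per_axis(domain_um, dx_um) ** dim


DX = 0.2  # grid spacing in micrometers

print("2D, 600 um:", total_cells(600, DX, dim=2))
print("3D, 100 um:", total_cells(100, DX, dim=3))
print("3D, 1 mm  :", total_cells(1000, DX, dim=3))  # hypothetical domain
```

Increasing the edge length of the 3D domain by a factor of ten multiplies the number of cells by a thousand, which is the reason adaptive mesh refinement and efficient parallelization are needed at low undercoolings.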

5 Optimizations

In this section we describe the optimization of the Pace3D simulation software by applying a multi dimensional domain decomposition and parallel file IO with MPI. 1D and small 2D simulations can be computed on a regular workstation with 1 CPU.


Fig. 4 Dendrites in isothermal conditions at T = 895 K in (a) two dimensions and (b) three dimensions

To use all available resources, even those of a workstation, concurrent calculations have to be applied. The current development of CPUs points towards more processing units on a single socket, so the efficiency of parallel calculations should be optimized to gain a bigger advantage. 1D domain decomposition enables parallel execution, but may have drawbacks compared to a multi dimensional decomposition, especially for big domains and many parallel tasks. Each subdomain has to transfer more boundary cells in a 1D decomposition than in a 3D decomposition, because the surface of a nearly cubic subdomain is smaller than the surface of a thin slice with the same number of cells (compare Fig. 1 and Fig. 5a). If the interconnect between nodes has a high latency, 1D decomposition should have the advantage of fewer communication partners. Nevertheless, 1D decomposition limits the number of usable CPUs to the size of the largest domain axis. A decomposition along all three axes as in Fig. 5a allows the use of even more CPUs and increases the efficiency due to better cache usage. The 3D decomposition of the software is in its final stage and a large scale performance analysis is in progress. To counter load imbalance, we have developed an adaptive decomposition scheme which divides the domain more often in areas with higher load to increase the overall efficiency. IO operations for writing and reading files are another source of slowdown to be minimized. MPI provides IO operations for 3D decomposition with the chance of optimized writing and reading throughput. Measurements in Table 3 show a clear difference between collective and non collective writing operations as defined in the MPI standard [11]. They also show that the throughput of workstations can be higher than that of a high performance cluster. The measurements have to be repeated and evaluated at different times, since the file IO performance depends on the workload of the system.
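The surface argument for the two decompositions can be quantified with a small calculation. The sketch below counts the boundary cells that an interior subdomain exchanges per time step, assuming a boundary layer of one cell, communication across faces only, and a domain that divides evenly among the tasks. The domain of 512³ cells and the 64 tasks are illustrative values, not measurements from this report, and the function names are hypothetical.

```python
def halo_cells_slab(nx, ny):
    """Cells exchanged by an interior slab of a 1D decomposition along z.

    The slab has two neighbors and sends one full xy-plane to each.
    """
    return 2 * nx * ny


def halo_cells_block(nx, ny, nz, px, py, pz):
    """Cells exchanged by an interior block of a px x py x pz decomposition."""
    bx, by, bz = nx // px, ny // py, nz // pz
    return 2 * (bx * by + by * bz + bx * bz)


def max_tasks_1d(nx, ny, nz):
    """Upper limit on tasks for a 1D decomposition: one layer per task."""
    return max(nx, ny, nz)


n = 512                                         # illustrative domain edge
slab = halo_cells_slab(n, n)                    # 64 slabs of 512 x 512 x 8
block = halo_cells_block(n, n, n, 4, 4, 4)      # 64 blocks of 128^3
print(slab, block, slab / block)
```

For this example both subdomain shapes contain the same number of cells, but the slab exchanges several times more boundary cells than the cube, in exchange for having two instead of six face neighbors.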
Fig. 5 Decomposition of a 3D simulation domain by (a) “regular” 3D decomposition and (b) adaptive decomposition

Table 3 Writing throughput of an MPI task on different systems in MB/s
Communication type   XC 4000   ‘ProStudium’ cluster   Office workstations
collective           7,933     2,564                  20,235
non collective       6,355     1,012                  7,012

Workload and hardware configuration are not the only performance issues. An appropriate configuration of the system has an additional significant influence. The measurements are performed without the usage of MPI hints and provide us with basic performance numbers. The so-called ‘ProStudium’ cluster is a high performance computing facility containing 40 nodes, each equipped with 8 CPUs. InfiniBand with DDR connections is the high performance interconnect between the nodes. Data are stored at the login node, which contains a software RAID 5 system with a handful of disks. This system is provided exclusively for student projects and teaching purposes. In contrast, all office workstations have 2 CPUs, are connected through 1 Gbit Ethernet and store their data on a hardware RAID 5 system. The high throughput of the office workstations compared to the clusters has to be investigated, and a broader performance analysis of the new 3D decomposition for the Pace3D simulation application has to be carried out.

6 Outlook

The current state of our studies, developments and implementations allows different phenomena of material development to be investigated in detail. The model and the implementation grow hand in hand. The investigation of metallic foams combined with fluid flow and heat convection is one of the next steps, aiming to predict the heat transfer for different flow conditions and for different metals and alloys. The application of topology optimization to metallic foam structures and other porous synthetic and symmetric configurations with respect to specific properties is another study within the current range of feasibility. The investigation of metallic foams is a particular topic where simulations based on the phase-field method provide valuable information for microstructure design, but it is by far not the only application. There are many open questions in the field of solidification and melting. For the study of dendritic structures, their properties and behavior, suitable simulation domains and efficient algorithms have to be applied to draw reliable conclusions. The addition of thermodynamic databases and more accurate calculations emphasizes the need for high performance computing resources. From a technical point of view, progress is directed towards the continuous development of fast, portable and maintainable algorithms aiming at the most efficient use of computing resources. 3D domain decomposition is an important step to meet current and future needs. Advantages and disadvantages have to be identified to provide solid programs and user guidance. Besides the studies presented in this report, almost all microstructure simulations nowadays concern systems with many phases, grains or particles and require the use of high performance computations. The property changing process during recovery of heavily plastically deformed metal sheets involves the growth of many grains in a many grain microstructure. Large-scale simulations are capable of representing the process of recrystallization. To describe the Lotus effect of leaves, a high resolution of the numerical grid is necessary to resolve the rough and complexly structured surface of the leaf. Each presented topic benefits from using high performance computational resources. Applications of the phase-field model exploit these capabilities and give insight into physical processes and mechanisms. The technical progress of software development allows current and prospective systems to be used efficiently.

Acknowledgments. The presented simulations were performed on the Landeshöchstleistungsrechner XC 4000 of the Karlsruhe Institute of Technology (KIT). The authors gratefully acknowledge the access to the system.

References

1. B. Nestler, H. Garcke, and B. Stinner: Multicomponent alloy solidification: Phase-field modelling and simulations. Physical Review E, 71:041609, 2005
2. H. Garcke, B. Nestler, and B. Stinner: A diffuse interface model for alloys with multiple components and phases. SIAM J. Appl. Math., 64:775, 2004
3. B. Nestler and A. A. Wheeler: A multi-phase-field model of eutectic and peritectic alloys: Numerical simulation of growth structures. Physica D, 138, 2000
4. B. Nestler, A. Aksi, and M. Selzer: Combined Lattice Boltzmann and phase-field simulations for incompressible fluid flow in porous media. Mathematics and Computers in Simulation, 80:1458–1468, 2010
5. R. Spatschek, C. Müller-Gugenberger, E. Brener, and B. Nestler: Phase field modeling of fracture and stress induced phase transitions. Physical Review B, 75:066111, 2007
6. T. Kim and M. C. Lin: Visual Simulation of Ice Crystal Growth. Department of Computer Science, Eurographics/SIGGRAPH Symposium on Computer Animation, 2003
7. H. Garcke, T. Preußer, M. Rumpf, A. C. Telea, U. Weikard, and J. J. van Wijk: A Phase Field Model for Continuous Clustering on Vector Fields. IEEE Transactions on Visualization and Computer Graphics Archive, 7(3), 2001
8. R. Spatschek, M. Hartmann, E. Brener, H. Müller-Krumbhaar, and K. Kassner: Phase Field modelling of Fast Crack Propagation. arXiv, 2005
9. B. Nestler and A. Choudhury: Phase-field modeling of multi-component systems. Current Opinion in Solid State and Materials Science, DOI: 10.1016/j.cossms.2011.01.003, 2011
10. S. G. Kim, D. I. Kim, W. T. Kim, and Y. B. Park: Computer simulations of two-dimensional and three-dimensional ideal grain growth. Physical Review E, 74:061605, 2006
11. Message Passing Interface Forum: A Message-Passing Interface Standard, Version 2.2