by Yumeng Chen, May 2024

Data assimilation (DA) has many applications in weather and climate prediction. Previous DARC blogs introduced its use in numerical weather prediction, air pollution, marine ecosystems, and land surface modelling. Depending on the complexity of the modelled system, implementing a DA system can be complicated and time-consuming. Operational numerical weather prediction centres require sophisticated software to deliver accurate weather forecasts in a timely manner. This is a tremendous engineering challenge: the global atmosphere model in the IFS (Integrated Forecasting System) used by ECMWF (the European Centre for Medium-Range Weather Forecasts) has over 100 million grid points.

Parallel Data Assimilation Framework (PDAF) 

My role in the National Centre for Earth Observation (NCEO) and DARC is to apply and develop a piece of data assimilation software called PDAF (Parallel Data Assimilation Framework). The software offers efficient implementations of DA algorithms and makes it possible to build a parallel DA system, that is, a DA system that can use multiple computer processors at the same time. This design is particularly useful for ensemble DA, where each possible outcome of the model forecast is represented by an ensemble member. As ensemble members do not exchange information during the forecast, they can be run independently on different processors. A parallel DA system can also exploit a property of physical systems through a technique known as localisation. When the state at one location is only influenced by adjacent regions, the DA problem can be broken into small local problems that are solved in parallel.
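To make these two ideas concrete, here is a minimal Python sketch. It is not PDAF or pyPDAF code: the toy model (a periodic shift), the function names, and the choice of a simple localised Kalman update of the ensemble mean are all assumptions made for illustration. The first part advances each ensemble member independently in a pool of workers; the second updates each grid point using only the observations within a given radius, so every grid point is a small, separate problem.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def forecast_member(state, n_steps):
    """Toy model (an assumption): shift the state one grid point per step."""
    for _ in range(n_steps):
        state = np.roll(state, 1)
    return state


def run_ensemble(ensemble, n_steps=10, max_workers=4):
    """Advance every member on its own; members never exchange information."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        members = list(pool.map(lambda x: forecast_member(x, n_steps), ensemble))
    return np.stack(members)


def local_mean_update(ensemble, obs, obs_idx, obs_var, radius):
    """Update the ensemble mean at each grid point from nearby observations only.

    ensemble: array (n_members, n_grid); obs: observed values at grid
    indices obs_idx; obs_var: observation error variance; radius: cut-off
    distance in grid points on a periodic domain.
    """
    n_ens, n_grid = ensemble.shape
    mean = ensemble.mean(axis=0)
    anomalies = ensemble - mean
    analysis_mean = mean.copy()
    # Each iteration is independent, so the loop could be spread over processors.
    for i in range(n_grid):
        dist = np.abs(obs_idx - i)
        dist = np.minimum(dist, n_grid - dist)  # periodic distance
        near = dist <= radius
        if not near.any():
            continue  # no observation in range: the forecast is kept
        j = obs_idx[near]
        # Ensemble estimates of the covariances needed for the Kalman gain.
        cov_xy = anomalies[:, i] @ anomalies[:, j] / (n_ens - 1)
        cov_yy = anomalies[:, j].T @ anomalies[:, j] / (n_ens - 1)
        innovation = obs[near] - mean[j]
        gain = np.linalg.solve(cov_yy + obs_var * np.eye(j.size), cov_xy)
        analysis_mean[i] = mean[i] + gain @ innovation
    return analysis_mean


rng = np.random.default_rng(0)
prior = rng.normal(size=(5, 20))            # 5 members, 20 grid points
forecast = run_ensemble(prior, n_steps=3)
analysis = local_mean_update(
    forecast, obs=np.array([1.0]), obs_idx=np.array([2]), obs_var=0.1, radius=2
)
```

In this example only the grid points within two points of the single observation are changed; everywhere else the analysis mean equals the forecast mean. A real system would update the whole ensemble rather than just the mean, and would distribute the members and local domains across processors with MPI rather than threads.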

PDAF can support a wide range of NCEO research by providing a suite of DA algorithms that can be used flexibly with any numerical model and any observations. For example, a global marine biogeochemistry DA system is under development with PDAF in NCEO. PDAF has also been used with some notable climate models including AWI-CM, CICE, FESOM, MITgcm, MPI-ESM, TerrSysMP and NEMO.

Figure 1: The design principle of PDAF where the model and observation feed information into the DA algorithm denoted by Filter/Core of PDAF (source). 

A new Python interface to PDAF

However, the efficiency and flexibility of PDAF come at a price. The framework is written in Fortran, an efficient and popular programming language for weather and climate models, but writing code in Fortran can be laborious. Recently, we released a Python interface to PDAF, pyPDAF. This new Python package makes it easier to develop a DA system for the Python models used in NCEO. It can be particularly useful for machine learning models, which are usually programmed in Python.

Of course, PDAF is just one example of DA software for complex weather and climate problems. Many other DA tools have been developed for different purposes and applications. For instance, JCSDA (Joint Center for Satellite Data Assimilation) is collaborating with many operational centres, including the UK Met Office, to develop JEDI (Joint Effort for Data assimilation Integration); NCAR (the US National Center for Atmospheric Research) develops and maintains DART (the Data Assimilation Research Testbed); and LAVENDAR (Land Variational Ensemble Data Assimilation fRamework) has been developed for land surface models. DAPPER, developed at the Norwegian Research Centre, is a great tool for DA methodology research.