# Harmonisation

By Ralf Quast and Sam Hunt

## Getting started on the FIDUCEO method

An explicit methodology to conduct such a harmonised calibration of a series of sensors has not existed before FIDUCEO and was explored and developed by following the errors-in-variables (EIV) idea, which takes into account the uncertainty and error correlation in all sensor telemetry measurements and yields optimised calibration coefficients and an associated calibration error covariance matrix for all sensors (e.g. Newey 2001).

Figure 4 broadly illustrates the methodology which was developed by the FIDUCEO project to conduct harmonisation within four steps. Step one is to collect dual-sensor match-ups where two sensors observe approximately the same Earth target at approximately the same time (e.g. simultaneous nadir overpasses). Ideally, these match-up data include all possible reference-to-sensor and sensor-to-sensor pairs. Step two is an assessment of the error covariance of all measurements included with the match-up data and an assessment of the expected match-up radiance differences and their associated uncertainties. Step three consists in a joint optimisation of sensor calibration models. The last step is the evaluation of the calibration coefficient error covariance matrix. The complete methodology is described in detail by Giering *et al.* (2019) and explained at a higher level in the following sections.

Figure 4: Methodology to conduct harmonisation

Newey, W.K. Flexible simulated moment estimation of nonlinear errors-in-variables models. Rev. Econ. Stat. 2001, 83, 616–627. [CrossRef]Giering, R.; Quast, R.; Mittaz, J.P.D.; Hunt, S.E.; Harris, P.M.; Woolliams, E.R.; Merchant, C.J. A Novel Framework to Harmonise Satellite Data Series for Climate Applications. Remote Sens. 2019, 11, 1002. [CrossRef]Harris, P.M.; Hunt, S.E.; Quast, R.; Giering, R.; Mittaz, J.P.D.; Woolliams, E.R.; Dilo, A.; Cox, M. Solving large structured non-linear least-squares problems with an application in Earth observation. In preparation. |

## Step one: how to collect match-up data?

In FIDUCEO, we call a match-up a “point” measurement that is matched by another “point” measurement sufficiently close in space and time – excluding the trivial case of neighbouring measurements from the same acquisition. In other words, we require match-up pixels that cover the same place on Earth acquired at almost the same time and with similar viewing geometries (Figure 5).

Figure 5: Concept of a dual-sensor match-up.

In most cases, we extend this definition to also include the neighbouring pixels of a sensor acquisition, covering a symmetrical area of several pixels around the match-up point, which allows us to check homogeneity and detect cloud shadows or other problematic characteristics. Match-up data will include calibrated measurements for the reference sensor and raw telemetry data for as many as possible other (i.e. non-reference) sensors. It is important to have a set of match-up data which represents the relevant aspects of Earth sufficiently well.

You will want to compare acquisitions of different spaceborne sensors which are spatially and temporally close so that you may assume a similar atmospheric condition. You will also want to compare satellite sensor data with *in situ* measurements, like AERONET data, in order to have measurements close to the measured entity. In some cases, we have also considered indirect match-ups between two sensors, with a geo-stationary sensor or *in situ* measurements used to identify stable Earth targets.

FIDUCEO has developed a variable multi-sensor match-up software (MMS) that allows us to run various match-up tasks using many different satellite- and ground-based instruments. The framework comprises an open design with various plugin points to allow for easy algorithmic updates or adding new sensor data.

To browse and download the code for the multi-sensor match-up software take a look at FIDUCEO’s MMS repositor on GitHub. T. Block, S. Embacher, C. J. Merchant, and C. Donlon, “High Performance Software Framework for the Calculation of Satellite-to-Satellite Data Matchups (MMS version 1.2),” Geosci. Model Dev. Discuss., vol. 11, no. 6, pp. 1–15, Jun. 2017. |

## Step two: how to assess the match-up error covariance structure?

After having collected a representative set of match-up data, you will have to assess the radiance difference \textcolor{orange}{L^i} – \textcolor{red}{L^j} = \textcolor{purple}{K^{ij}} expected due to different sensor spectral response functions and the uncertainty of this expected difference. You will also have to assess the uncertainty of the \textcolor{orange}{L^i} – \textcolor{red}{L^j} = \textcolor{purple}{K^{ij}} due to matchup and representativeness errors. Doing this requires expert knowledge on the satellite instruments involved and on radiative transfer modelling.

To explain how to assess the error covariance structure of the satellite measurements, either calibrated radiance or sensor telemetry, let us have a closer look at the mathematical representation of a matchup dataset. We arrange each dataset in form of a matrix, where the columns correspond to different variables and the rows correspond to different times of measurement of these variables.

For reference-to-sensor match-up datasets, the first column contains the calibrated radiance of your reference sensor (yellow column in Figure 6) while the other columns contain the telemetry data of a non-reference sensor (red columns). Each column represents a different variable.

Figure 6: Mathematical representation of a reference-to-sensor match-up dataset

For sensor-to-sensor match-up datasets you have several columns of sensor telemetry data instead of a single column of calibrated radiance (Figure 7).

Figure 7: Mathematical representation of a sensor-to-sensor match-up dataset

Writing matchup data in form of a matrix is helpful when we consider how the errors of the measurements are correlated. This is essential, because considering the error covariance of all measurements is the main difference when comparing FIDUCEO harmonisation to less rigorous calibration methods.

In FIDUCEO, we have minimised the error correlation between *different* variables (i.e. inter-column error correlation) by design of our calibration models thus we need to consider error correlation among the measurements of the *same* variable over time only (i.e. intra-column error correlation). Then the error covariance matrix \boldsymbol{S}(\textcolor{orange}{\boldsymbol{q}}) of any column \textcolor{orange}{\boldsymbol{q}} in a matchup dataset is of one of three genuine types or a combination thereof (Figure 8).

- Independent random – the errors of the measurements are not correlated because the measurements themselves are all independent. We have error variances only.
- Common random – the errors of the measurements are fully correlated because all measurements are susceptible to the same error.
- Structured random – the measurements in a column are the result of some averaging procedure, where the original measurements (which are averaged) are all independent.

Figure 8: Schematic illustration of error covariance matrices for genuine error correlation types.

FIDUCEO has specified a match-up dataset format to represent these error covariance structures and has developed software tools to aid users in preparing correctly formatted data. The tools are included with the harmonisation software on GitHub.

To browse and download the code for the harmonisation software take a look at FIDUCEO’s Harmonisation repository on GitHub. |

## Step three: how to optimise the sensor calibration models?

Knowing the error covariances of all your matchup data, the mathematical formulation of the harmonisation problem is relatively straightforward. Let us assume we optimise the calibration model we have used for the AVHRR series of sensors, which is

\textcolor{blue}{L}(\textcolor{red}{\boldsymbol{x}},\textcolor{blue}{\boldsymbol{\alpha}}) = \textcolor{blue}{\alpha_1} + (\epsilon + \textcolor{blue}{\alpha_2}) \textcolor{red}{L_\mathrm{ICT}} \frac{\textcolor{red}{C_\mathrm{E}} – \textcolor{red}{\bar{C}_\mathrm{S}}}{\textcolor{red}{\bar{C}_\mathrm{ICT}} – \textcolor{red}{\bar{C}_\mathrm{S}}} + \textcolor{blue}{\alpha_3} (\textcolor{red}{C_\mathrm{E}} – \textcolor{red}{\bar{C}_\mathrm{S}}) (\textcolor{red}{\bar{C}_\mathrm{ICT}} – \textcolor{red}{\bar{C}_\mathrm{S}}) + \textcolor{blue}{\alpha_4} \textcolor{red}{T_\mathrm{env}} \tag{Eq.4}

where

- \textcolor{red}{\boldsymbol{x}}=\textcolor{red}{ (\bar{C}_\mathrm{S}}, \textcolor{red} {\bar{C}_\mathrm{ICT}}, \textcolor{red}{C_\mathrm{E}}, \textcolor{red}{L_\mathrm{ICT}}, \textcolor{red}{T_\mathrm{env}}) are a single record of sensor telemetry data
- \textcolor{blue}{\boldsymbol{\alpha}}=( \textcolor{blue}{x_1}, \textcolor{blue}{x_2}, \textcolor{blue}{x_3}, \textcolor{blue} {x_4}) are the sensor calibration coefficients
- \textcolor{red}{ \bar{C}_\mathrm{S}} denotes the space counts averaged over 51 subsequent scans
- \textcolor{red}{ \bar{C}_\mathrm{ICT}} denotes the counts obtained from the internal calibration target (ICT) averaged over 51 subsequent scans
- \textcolor{red} C_\mathrm{E} denotes the Earth count
- \textcolor{red} L_\mathrm{ICT} is the radiance obtained from the internal calibration target (this number has been precomputed from the temperature of the ICT as measured by four Platinum Resistance Thermometers and an error correction model)
- \textcolor{red}{ T_\mathrm{env}} is the mean environmental temperature within the duration of a satellite orbit (this number has been precomputed from all ICT temperature measurements)

Harmonisation then computes the K-residuals, which are the differences between the radiance differences observed and the radiance differences expected

\tag{Eq. 5} \textcolor{green}{\boldsymbol{r}^{ij}} = \textcolor{blue}{\boldsymbol{L}^i}(\textcolor{orange}{\boldsymbol{X}^{ij}},\textcolor{blue}{\boldsymbol{\alpha}^i}) – \textcolor{blue}{\boldsymbol{L}^j}(\textcolor{red}{\boldsymbol{X}^{ji}},\textcolor{blue}{\boldsymbol{\alpha}^j}) – \textcolor{purple}{\boldsymbol{K}^{ij}}

where

- \textcolor{gren}{\boldsymbol{r}^{ij}} are the K -residuals for sensor i and sensor j
- \textcolor{blue}{\boldsymbol{L}^i} and \textcolor{blue}{ \boldsymbol{L}^j} are the calibration models for sensor i and sensor j . For the reference sensor this is the identity function, i.e. \textcolor{blue}{ \boldsymbol{L}^1} \equiv \operatorname{id}
- \textcolor{blue}{\boldsymbol{\alpha}^i} and \textcolor{blue} {\boldsymbol{\alpha}^j} are the calibration coefficients for sensor i and sensor j . For the reference sensor this is an empty tuple, i.e. \textcolor{blue}{ \boldsymbol{\alpha}^1} = ()
- \textcolor{orange}{ \boldsymbol{X}^{ij}} are the telemetry data from sensor i when matched with sensor j . For the reference sensor this is the calibrated radiance, i.e. \textcolor{orange}{ \boldsymbol{X}^{ij}} = \textcolor{orange}{ \boldsymbol{L}^{1j}}
- \textcolor{red}{ \boldsymbol{X}^{ji}} are the telemetry data from sensor j when matched with sensor i

FIDUCEO has implemented two slightly different but eventually equivalent variants to achieve an optimal calibration, each of which penalises K -residuals that are large in comparison to their expected uncertainty. Figure 9 illustrates the cost functions associated with these two variants.

Figure 9: Cost functions for errors-in-variables and marginalised errors-in-variables

The errors-in-variables (EIV) variant uses a cost function, which is a function of (all) sensor calibration coefficients \boldsymbol{\alpha} and corrections \boldsymbol{\xi} to (all) sensor telemetry measurements \boldsymbol{X}.

Sensor telemetry data may include several 10 to several 100 million individual measurements and optimising a corresponding number of corrections results in an enormous and highly covariant optimisation problem (Harris *et al.*, in preparation). To reduce the complexity of the optimisation from 100 million to a few 10 parameters, FIDUCEO has developed a marginalised errors-in-variables (MEIV) variant, which transforms the error covariance of sensor telemetry measurements into radiance-equivalent form (Figure 10) and optimises sensor calibration parameters only, without compromising the rigour treatment of measurement uncertainties (Giering *et al.* 2019).

Figure 10: Marginalisation – the error covariance of sensor telemetry measurements is transformed into radiance equivalent form

Note that for each sensor pair ij the radiance-equivalent error covariance matrix is a sum of three components: the error variance matrix for K plus the radiance error covariance matrix of the ‘first’ sensor plus the radiance error covariance matrix of the ‘second’ sensor. To start with something simple, you may skip the telemetry variables that do have correlated errors or omit the two radiance error covariance matrices entirely. The marginalised errors-invariables variant will then be the same as weighted least squares. With that you will minimize jumps and trends, but your uncertainty budget will not be correct, and you will want to improve metrological soundness as your calibration project evolves.

FIDUCEO has prototyped the errors-in-variables (EIV) variant of harmonisation and made it publicly available on the projects’ Harmonisation repository on GitHub. The marginalised errors-in-variables (MEIV) variant is proprietary software. FIDUCEO’s Harmonisation repository on GitHub contains some free code to demonstrate the logic but does not include computational parts. Please contact FastOpt GmbH for further information. The MEIV variant uses Transformation of Algorithms in Fortran (TAF) to generate code for algorithmic evaluation of partial derivatives. Algorithmic differentiation solves the problem of accurate and fast computation of partial derivatives for model optimisation and propagation of uncertainties. |

## Step four: how to compute the calibration error covariance matrix?

Having optimised your sensor calibration models, you will want to compute the error covariance matrix associated with your optimised calibration parameters. You may compute this matrix by means of Monte Carlo methods like the EIV variant (Harris *et al.*, in preparation) or analytically like the MEIV variant (Giering et al., 2019).

Figure 11 illustrates an example how the calibration error correlation matrix (the normalised form of your error covariance matrix) may look like. Each matrix element represents how the error of one calibration coefficient is correlated with the error of another calibration coefficient. Here we have nine different sensors and four calibration coefficients for each sensor, which makes the matrix appear like a 9\times 9 chequerboard. Each square on this chequerboard comprises 4\times 4 elements. Squares along the diagonal of the chequerboard represent intra-sensor calibration error correlations while off-diagonal squares represent inter-sensor calibration error correlations.

Figure 11: Calibration error correlation matrix for the harmonisation of the AVHRR series of sensors

You will need the calibration error covariance matrix to propagate the uncertainty of the harmonisation to the calibrated radiance of your FCDR, where it contributes to the common random part of the uncertainty budget.

## Harmonisation Tools

As has been mentioned throughout the above, during the development of the harmonisation framework a number of software tools have been developed by the project. The following are now available for you to download and use.

### Multisensor Matchup System

This software generates satellite-to-satellite or satellite-to-in-situ match up datasets. This may be extended to work for currently unsupported sensors of your interest. Find this software on the MMS repository on GitHub.

### Harmonisation Python Library

This python library was developed to perform FCDR harmonisation using the EIV approach. It has useful functionality, such as, harmonisation input and output file readers and writers and plotting routines. This package also has a prototype version of harmonisation EIV software, that is currently in state of development. Find this in the Harmonisation repository on GitHub.

### MEIV Demonstrator Code

Harmonisation repository on GitHub also contains some free code to demonstrate the logic of the harmonisation MEIV approach, however, it does not include computational parts. Please contact FastOpt GmbH for further information.

### Harmonisation Tools

In the Harmonisation repository on GitHub there are also a set of software utilities developed during the project to assist with harmonisation. Most of these are quite esoteric to the practicalities of the project, but some may be of general use. For example, there is a harmonisation input file checker that checks the file you create meets the specification, or, there are functions to help you generate typical W matrices.