A super-ensemble methodology is proposed to improve the quality of short-term ocean analyses for sea surface temperature (SST) in the Mediterranean Sea. The methodology consists of a multiple linear regression technique applied to a multi-physics multi-model super-ensemble (MMSE) data set: a collection of different operational forecasting analyses together with ad hoc simulations created by modifying selected numerical model parameterizations. A new linear regression algorithm based on empirical orthogonal function filtering is shown to be efficient in preventing overfitting, although the best performance is achieved when a simple spatial filter is applied after the linear regression. Our results show that the MMSE methodology improves the ocean analysis SST estimates with respect to the best ensemble member (BEM) and that the performance depends on the selection of an unbiased operator and on the length of the training period. The quality of the MMSE data set has the largest impact on the MMSE analysis root mean square error (RMSE) evaluated against observed satellite SST. The MMSE analysis estimates are also affected by the training period length, with the longest period leading to the smoothest estimates. Finally, the lowest RMSE analysis estimates result from a 15-day training period, an overconfident MMSE data set (a subset containing the higher-quality ensemble members) and a least-squares algorithm filtered a posteriori.

The limiting factors for short-term ocean forecasting predictability are the uncertainties in ocean initial conditions, atmospheric forcing

The basic idea discussed in Krishnamurti's work is that each model carries a somewhat different representation of the processes being forecast, so an appropriate combination can reduce biases in space and time. In his work, an unbiased linear combination of the available models, optimal (in the least-squares sense) with respect to observations during a training period of a priori chosen length, reduces the prediction RMSE for the south–north component of winds at 850 hPa (averaged over boundaries between 50 and 120
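This training-period regression can be sketched with synthetic data; the following is a minimal illustration, not the operational implementation (the array shapes, member count and variable names are assumptions made here):

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_members, n_points = 15, 5, 100  # training days, models, grid points

# Synthetic "truth", biased/noisy member forecasts F and observations O
truth = rng.normal(20.0, 1.0, (n_train, n_points))
F = truth[:, None, :] + rng.normal(0.3, 0.5, (n_train, n_members, n_points))
O = truth + rng.normal(0.0, 0.1, (n_train, n_points))

# Remove training-period means so the combination is unbiased
F_mean = F.mean(axis=0)  # (n_members, n_points)
O_mean = O.mean(axis=0)  # (n_points,)

# Per-grid-point least-squares weights: solve (F - F_mean) w = (O - O_mean)
w = np.empty((n_members, n_points))
for p in range(n_points):
    w[:, p], *_ = np.linalg.lstsq(F[:, :, p] - F_mean[:, p],
                                  O[:, p] - O_mean[p], rcond=None)

# SE estimate for a new day: observed mean plus weighted member anomalies
F_new = truth[-1][None, :] + rng.normal(0.3, 0.5, (n_members, n_points))
se = O_mean + np.einsum("mp,mp->p", w, F_new - F_mean)
```

Removing the training-period means before the regression is what makes the combination unbiased: any constant member bias is absorbed by the observed mean rather than by the weights.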

Multi-physics multi-model SE members: model and data assimilation characteristics. The columns list the most significant differences between the models in terms of code and model physical parameterizations.

The MMSE data set includes the collection of daily mean outputs from five operational analysis systems in the Mediterranean Sea and four outputs from the same operational forecasting model run with different physical parameterization choices. The study period spans 1 January to 31 December 2008. The differences between the MMSE members are mainly due to the different numerical schemes used, the data assimilation schemes and the model physical parameterizations. Optimally interpolated satellite SST
observations (OI-SST)

Data set statistics, from left to right: standard deviation (SD) (with monthly mean seasonal signal removed) normalized by SD of OI-SST, centred root mean square error (RMSE) between members and OI-SST and anomaly correlation coefficient (ACC). All the values were evaluated over the year 2008.
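The statistics in this table can be computed as in the following sketch (a simplifying assumption made here: anomalies are taken relative to each field's own mean, rather than relative to the monthly-mean seasonal signal used in the paper):

```python
import numpy as np

def centred_rmse(model, obs):
    """Centred (bias-removed) RMSE between a model field and observations."""
    ma, oa = model - model.mean(), obs - obs.mean()
    return float(np.sqrt(np.mean((ma - oa) ** 2)))

def anomaly_correlation(model, obs):
    """Anomaly correlation coefficient (ACC) between two fields."""
    ma, oa = model - model.mean(), obs - obs.mean()
    return float(np.sum(ma * oa) / np.sqrt(np.sum(ma**2) * np.sum(oa**2)))

def normalized_sd(model, obs):
    """Model standard deviation normalized by the observed one."""
    return float(model.std() / obs.std())
```

These three quantities together form the usual Taylor-diagram triplet: a perfect member has normalized SD of 1, centred RMSE of 0 and ACC of 1.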

So far, as discussed in Sect.

Our SE methodology is based on

Nomenclature and characteristics of the four MMSE algorithms used.

Flow chart of methodologies developed in the paper.

In this section we describe the MMSE experiments performed to test the four
regression algorithms. For all our regression algorithms, we selected a test
analysis period from 25 April to 4 May 2008, while the related training
period was chosen as a number

MMSE estimates for the first day of the test period (25 April 2008) using a training period of 15 days, S1-SST (top panel, left) and the corresponding estimate for S2-SST (top panel, right), SST from satellite (bottom panel, left) and best ensemble member SST (bottom panel, right).

In order to find the minimum possible training period length, a simple experiment was performed using the observations as one of the ensemble members in the training period. This test can be regarded as the maximum skill achievable with an MMSE approach, and it also provides a way to check the coefficient estimates. For a training period of 15 days, all the regression coefficients are 0 except for the weight of the observational member, which is retrieved as 1. Trimming the data set (removing members), we noticed that when the training period days (
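This coefficient check can be reproduced with a toy least-squares regression (synthetic fields; the member construction and array sizes here are assumptions): the observational member should receive a weight of 1 and all other members a weight of 0.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_points = 15, 50

obs = rng.normal(20.0, 1.0, (n_train, n_points))
# Two imperfect model members plus the observations themselves as a member
members = np.stack([
    obs + rng.normal(0.5, 0.4, obs.shape),
    obs + rng.normal(-0.3, 0.4, obs.shape),
    obs,                                   # observational member
], axis=1)                                 # shape (n_train, 3, n_points)

# Per-point least squares: the observational member gets all the weight
weights = np.empty((3, n_points))
for p in range(n_points):
    weights[:, p], *_ = np.linalg.lstsq(members[:, :, p], obs[:, p],
                                        rcond=None)

print(weights.mean(axis=1))  # weights of the model members ~0, obs member ~1
```

Because the observational member reproduces the target exactly, the zero-residual solution is unique (the design matrix has full column rank), so the regression must recover it.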

MMSE estimates for the last day of test period (4 May 2008), S1-SST (top panel, left) and the corresponding estimate for S2-SST (top panel, right), SST from satellite (bottom panel, left) and best ensemble member SST (bottom panel, right).

The two estimates at the end of the test analysis period are shown in
Fig.

MMSE estimates for the first day of the test period (25 April 2008) using a training period of 35 days, S1-SST (left panel) and the corresponding estimate for S2-SST (right panel).

MMSE estimates with a 35-day training period and for the last day of the test period (4 May 2008), S1-SST (left panel) and the corresponding estimate for S2-SST (right panel).

data set A has the smallest bias because the

for data set B, which is constructed from well-dispersed model ensemble members, the two peaks become of equivalent amplitude;

for data set C, which is constructed from badly dispersed model ensemble members, the estimate enhances the positive bias.
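The contrast between the well-dispersed and badly dispersed data sets can be illustrated with synthetic members (a simplified equal-weight combination rather than the paper's regression; the bias and noise values are assumptions): when member biases bracket the observations they can cancel, whereas a one-sided ensemble retains its bias.

```python
import numpy as np

rng = np.random.default_rng(2)
n_points = 200
obs = rng.normal(20.0, 1.0, n_points)

# Well dispersed: member biases bracket the observations (+0.8 and -0.8)
well = np.stack([obs + 0.8 + rng.normal(0, 0.2, n_points),
                 obs - 0.8 + rng.normal(0, 0.2, n_points)], axis=1)
# Badly dispersed: both members biased on the same side (+0.8 and +1.2)
bad = np.stack([obs + 0.8 + rng.normal(0, 0.2, n_points),
                obs + 1.2 + rng.normal(0, 0.2, n_points)], axis=1)

def combination_bias(members, obs):
    """Bias of an equal-weight combination against the observations."""
    return float((members.mean(axis=1) - obs).mean())

print(combination_bias(well, obs))  # close to 0: opposite biases cancel
print(combination_bias(bad, obs))   # close to +1: one-sided bias remains
```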

MMSE data sets: members are detailed in
Table

RMSE mean value throughout the analysis
period for the full data set (see Table

Distributions of

The effect of multi-model composition
on the distributions of

The effect of multi-model composition
on the distributions of

Domain average (over the Mediterranean) and time mean over the year 2008 of the RMSE for a 15-day training period for the overconfident data set A.

Domain average (over the Mediterranean) and time mean over the year 2008 of the RMSE of S3 estimates trained on the overconfident data set A for 15 days, for different circular filter radii.

Histogram of the number of retained EOFs; the ordinate shows the length of the training period, and the colour bar is proportional to the day of the experiment.
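EOF filtering by a retained-variance criterion can be sketched as follows (the anomaly matrix layout and the 99 % threshold here are purely illustrative assumptions, not the paper's settings):

```python
import numpy as np

def eof_filter(anomalies, variance_threshold=0.99):
    """Project anomalies onto the leading EOFs retaining a variance fraction.

    anomalies: (n_time, n_points) array with the temporal mean removed.
    Returns the filtered anomalies and the number of retained EOFs.
    """
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    frac = np.cumsum(s**2) / np.sum(s**2)          # cumulative variance
    k = int(np.searchsorted(frac, variance_threshold)) + 1
    filtered = u[:, :k] @ np.diag(s[:k]) @ vt[:k]  # rank-k reconstruction
    return filtered, k
```

Truncating the EOF spectrum before the regression removes the small-scale, poorly sampled modes that would otherwise be fitted to noise, which is the overfitting-prevention role the EOF-based algorithm plays here.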

S3 and S4 estimates for the first day of
the test period (25 April 2008) using a training period of 15 days, S3-SST
(left panel,

S3 and S4 estimates valid for the last day
of the test period (4 May 2008) using a training period of 15 days, S3-SST
(left panel,

Domain average (over the Mediterranean) and time mean during the year 2008 of the SE prediction RMSE for the overconfident data set A. SE predictions trained for 15 days. Error bars stand for the standard deviation of the RMSE during the year.

Spatial average over the Mediterranean Sea and time mean during 2008 of the SE prediction ACC for the overconfident data set A. SE predictions trained for 15 days.

Spatial average over the Mediterranean Sea and time mean during 2008 of the SE prediction bias for the overconfident data set A. SE predictions trained for 15 days. Error bars stand for the standard deviation of the bias during the year.

In order to reduce the overfitting of the SE estimate, here we show the
results of the S3 and S4 algorithms. Both methodologies are used
with the overconfident data set (data set A) and a 15-day training period. In
S3, the 15 km radius was found by means of sensitivity studies performed by applying a
circular filter at each point of the domain. Figure
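The a posteriori circular filter can be sketched as a simple circular-mean smoother (a naive implementation; the grid spacing, the NaN-for-land convention and the radius values are illustrative assumptions):

```python
import numpy as np

def circular_mean_filter(field, radius_km, dx_km):
    """Average each grid point over a circular neighbourhood of given radius.

    field: 2-D array (NaN where there is land); dx_km: grid spacing in km.
    """
    r = int(np.ceil(radius_km / dx_km))
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    mask = (yy**2 + xx**2) * dx_km**2 <= radius_km**2  # circular stencil

    padded = np.pad(field, r, constant_values=np.nan)
    out = np.full_like(field, np.nan, dtype=float)
    ny, nx = field.shape
    for j in range(ny):
        for i in range(nx):
            vals = padded[j:j + 2*r + 1, i:i + 2*r + 1][mask]
            if np.isfinite(vals).any():       # skip all-land neighbourhoods
                out[j, i] = np.nanmean(vals)
    return out
```

A vectorized convolution would be used in practice; the explicit loop is kept here only to make the circular-neighbourhood averaging explicit.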

ACC mean value throughout the analysis period for data set A (see Table 1) as a function of the training period length for the proposed SE methodologies.

We developed a multi-model multi-physics
super-ensemble methodology to estimate the best SST from different
oceanic analysis systems. Several regression algorithms were analysed for a
test period and for the whole of 2008. We examined the conditions under which the
MMSE estimate outperforms the BEM of the generating ensemble. The target was
to obtain 10-day posterior analyses using a training period in the past for
the regression algorithm and to generate the lowest bias and RMSE for the
MMSE estimates. The results show that the ensemble size, the quality and type of
members, and the training period length are all important elements of the
MMSE methodology and require careful calibration. Almost 2000 posterior
analyses were produced for 2008 with different training periods. The
classical SE approach, as proposed by

Future developments could involve the addition of physical constraints during the regression, considering for example cross correlations with the atmospheric forcing. The MMSE methodology should also be applied to the ocean forecasting problem instead of the analysis problem. The difference for MMSE forecast estimates is that atmospheric forecast uncertainties are not contained in the training period analyses, so the number of ensemble members required could increase considerably, as could the complexity of the estimation problem.

The full data set can be found at

The analysis systems that generated the ensemble members of the experiments
used in this paper are briefly described below:

SYS3a2: a system based on the OPA 8.2 numerical code implemented in the
Mediterranean Sea

SYS4a3 uses NEMO 2.3

Mercator-V0 (PSY2V3R1): the numerical code is based on NEMO 1.09, and it is implemented
in the North Atlantic and Mediterranean Sea with a horizontal resolution of

Mercator-V1 (PSY2V4R1): the numerical code is based on NEMO 3.1, and it
is implemented in the North Atlantic and Mediterranean Sea with a horizontal resolution
of

HCMR: Hellenic Centre for Marine Research

NEMO multi-physics: this is the same as SYS4a3 NEMO 2.3 code without assimilation but with different model physical parameterizations.

Here we show how the observed and model fields are decomposed into different
temporal signals. Let us consider

The last term on the right of Eq. (

Jenny Pistoia, with the supervision of Paolo Oddo and Nadia Pinardi, designed and performed all the INGV multi-physics simulations and collected all the INGV analyses. Matthew Collins supervised all the activity connected with the SE based on EOFs. Gerasimos Korres and Yann Drillet provided the HCMR and Mercator analysis members, respectively, on the same grid as the INGV members in order to build the MMSE data set. Jenny Pistoia prepared the manuscript with contributions from all co-authors.

This work was supported by the University of Bologna as part of the graduate programme in geophysics and by the MyOcean2 Project. The CMCC Gemina project funded J. Pistoia's studies at the University of Exeter. Publication was supported by the Italian Ministry of Education, University and Research under the project RITMARE. Edited by: R. Archetti. Reviewed by: M. Bostenaru-Dan and one anonymous referee.