comp_spectrum_ratio_3d

Authors

Pascal Terray (LOCEAN/IPSL)

Latest revision

30/05/2024

Purpose

Compute the ratio of two multi-channel Power Spectral Density (PSD) estimates (computed from two real multi-channel time series with the help of comp_cross_spectrum_3d or comp_cross_spectrum2_3d) and a pointwise tolerance interval for this spectral ratio under the assumption that the two “true” underlying multi-channel power spectra are the same. The two multi-channel PSD estimates are readed from two tridimensional variables (as specified by the -nv= and -dv= arguments) readed from one or two estimates NetCDF datasets (as specified by the -nf= and -df= arguments).

How to compare quantitatively two estimated power spectra is an important question solved by comp_spectrum_ratio_3d. Suppose that f1(w) and f2(w) are ( multi-channel) PSD estimates (where w is a Fourier frequency) with df1 and df2 Degrees Of Freedom (DOF), respectively. To answer this question, spectral analysis theory [Diggle] suggests to examine the spectral ratio

R(w) = f1(w) / f2(w)

, which may be assumed to follow an F-distribution with numerator and denominator degrees of freedom df1 and df2, respectively. This result can be used to calculate pointwise confidence (or tolerance) intervals for the spectral ratios at the Fourier frequencies [Diggle] [Masson_etal] [Terray_etalb].

It is assumed that the two estimated multi-channel power spectra have been computed by the Welch’s method [Welch] by calls to comp_cross_spectrum_3d or comp_cross_spectrum2_3d with the -conflim argument (this argument is required for computing the DOFs of the PSD estimates, which are required by comp_spectrum_ratio_3d).

The pointwise tolerance interval and the spectral ratio of the two estimated multi-channel power spectra are stored in an output NetCDF dataset.

If you want to test if two estimated (multi-channel) power spectra have the same underlying population spectrum or if the underlying population spectra have the same shape for a given set of frequencies, use comp_spectrum_diff_3d instead of comp_spectrum_ratio_3d.

For additional details on these spectral computations, see [Welch] [Bloomfield] [Diggle] .

Further Details

Usage

$ comp_spectrum_ratio_3d \
       -nf=numerator_netcdf_file \
       -nv=numerator_netcdf_variable \
       -dv=denominator_netcdf_variable \
       -df=denominator_netcdf_file       (optional) \
       -m=input_mesh_mask_netcdf_file    (optional) \
       -g=grid_type                      (optional : n, t, u, v, w, f) \
       -x=lon1,lon2                      (optional) \
       -y=lat1,lat2                      (optional) \
       -np=numerator_period              (optional) \
       -dp=denominator_period            (optional) \
       -t=freq1,freq2                    (optional) \
       -o=output_netcdf_file             (optional) \
       -pro=probability_interval         (optional : 0.0 > 1.) \
       -mi=missing_value                 (optional) \
       -logratio                         (optional) \
       -double                           (optional) \
       -bigfile                          (optional) \
       -hdf5                             (optional) \
       -tlimited                         (optional)

By default

-df=
the denominator_netcdf_file is the same as the numerator_netcdf_file
-m=
an input_mesh_mask_netcdf_file is not used
-g=
the grid_type is set to n, which means that the 2-D grid-mesh associated with the input numerator_netcdf_variable and denominator_netcdf_variable is assumed to be regular or Gaussian
-x=
the whole longitude domain associated with the numerator_netcdf_variable and denominator_netcdf_variable
-y=
the whole latitude domain associated with the numerator_netcdf_variable and denominator_netcdf_variable
-np=
the numerator_period argument is not used
-dp=
the denominator_period argument is not used
-t=
the whole set of frequencies included in the two input NetCDF files
-o=
output_netcdf_file name is set to spectrum_ratio_numerator_netcdf_variable.denominator_netcdf_variable.nc where numerator_netcdf_variable and denominator_netcdf_variable are specified with the -nv= and -dv= arguments.
-pro=
the probability_interval is set to 0.90, which means that a 90% tolerance interval for the spectral ratio of the estimated multi-channel power spectra is computed
-mi=
the missing_value for the output NetCDF variables is set to 1.e+20
-logratio
the spectral ratios and their associated upper and lower critical spectral ratios are not Log transformed. However, if -logratio is specified, the spectral ratios, upper and lower critical spectral ratios are Log transformed
-double
the results are stored as single-precision floating point numbers in the output NetCDF file. If -double is activated, the results are stored as double-precision floating point numbers
-bigfile
a NetCDF classical format file is created. However, if -bigfile is activated, the output NetCDF file is a 64-bit offset format file
-hdf5
a NetCDF classical format file is created . However, if -hdf5 is activated, the output NetCDF file is a NetCDF-4/HDF5 format file
-tlimited
the time dimension (e.g., the frequency axis) is defined as unlimited in the output NetCDF file. However, if -tlimited is activated, the frequency dimension is defined as limited in the output NetCDF file

Remarks

  1. The -nf=numerator_netcdf_file argument specifies the NetCDF file containing the multi-channel power spectrum estimates and their associated degrees of freedom for the numerator of the spectral ratio.

    It is assumed that this NetCDF file has been created by NCSTAT procedures performing spectral computations like comp_cross_spectrum_3d or comp_cross_spectrum2_3d with the -conflim argument activated (e.g., this NetCDF file must contain two NetCDF variables named edof and freq).

  2. The -df=denominator_netcdf_file argument specifies the NetCDF file containing the multi-channel power spectrum estimates and their associated degrees of freedom for the denominator of the spectral ratio. If this argument is not specified, it is assumed that the denominator_netcdf_file is the same as the numerator_netcdf_file.

    It is assumed that this NetCDF file has been created by NCSTAT procedures performing spectral computations like comp_cross_spectrum_3d or comp_cross_spectrum2_3d with the -conflim argument activated (e.g., this NetCDF file must contain two NetCDF variables named edof and freq).

  3. The -nv=numerator_netcdf_variable and -dv=denominator_netcdf_variable arguments specify the NetCDF variables containing the multi-channel PSD estimates for the numerator and denominator of the spectral ratio, respectively.

    The geographical shapes of the numerator_netcdf_variable and denominator_netcdf_variable must agree.

  4. The optional argument -m=input_mesh_mask_netcdf_file specifies the land-sea mask to apply to the numerator_netcdf_variable and denominator_netcdf_variable for transforming these tridimensional NetCDF variables as rectangular matrices before computing the spectral ratio of the multi-channel PSD estimates.

    By default, it is assumed that each cell in the 2-D grid-mesh associated with the input tridimensional NetCDF variables is a valid power spectrum (e.g., missing values are not present).

    The geographical shapes of the numerator_netcdf_variable (in the numerator_netcdf_file), the denominator_netcdf_variable (in the denominator_netcdf_file) and the mask (in the input_mesh_mask_netcdf_file) must agree if an input_mesh_mask_netcdf_file is used.

    Refer to comp_clim_3d or comp_mask_3d for creating a valid input_mesh_mask_netcdf_file NetCDF file for regular or gaussian grids before using comp_spectrum_ratio_3d.

  5. If -g= is set to t, u, v, w or f it is assumed that the input NetCDF variables are from an experiment with the NEMO model (ORCA configuration and R2, R4 or R05 resolutions). This argument is also used to determine the name of the mesh_mask variable if an input_mesh_mask_netcdf_file is used.

  6. If the -x=lon1,lon2 and -y=lat1,lat2 arguments are missing the whole geographical domain associated with the the numerator_netcdf_variable and denominator_netcdf_variable is used.

    The longitude or latitude range must be a vector of two integers specifying the first and last selected indices along each dimension. The indices are relative to 1. Negative values are allowed for lon1. In this case the longitude domain is from nlon+lon1+1 to lon2 where nlon is the number of longitude points in the grid associated with the NetCDF variable and it is assumed that the grid is periodic.

    Refer to comp_mask_3d for transforming geographical coordinates as indices or generating an appropriate mesh-mask before using comp_spectrum_ratio_3d.

  7. If the -np=numerator_period argument is specified, the multi-channel PSD estimates in the numerator_netcdf_variable have been computed for different periods separately (as determined by the value of the -p=period_length argument in comp_cross_spectrum_3d). In that case, the -np= argument selects the period in the numerator_netcdf_variable which must be used for computing the spectral ratio of the two multi-channel PSD estimates.

  8. If the -dp=denominator_period argument is specified, the multi-channel PSD estimates in the denominator_netcdf_variable have been computed for different periods separately (as determined by the value of the -p=period_length argument in comp_cross_spectrum_3d). In that case, the -dp= argument selects the period in the denominator_netcdf_variable which must be used for computing the spectral ratio of the two multi-channel PSD estimates.

  9. If the -t=freq1,freq2 argument is missing, the spectral ratios are computed for the whole set of frequencies stored in the numerator_netcdf_variable and denominator_netcdf_variable (e.g., the multi-channel power spectra must have been estimated for the same set of frequencies).

    The selected frequency interval is a vector of two integers specifying the first and last frequencies for which the spectral ratios and the associated tolerance intervals must be computed. The indices are relative to 1.

  10. The -pro=probability_interval argument specifies that a ( probability_interval * 100 )% tolerance interval must be computed. This argument is used to determine the upper and lower critical spectral ratios of the computed tolerance interval. probability_interval must be greater than 0. and less than 1..

    The default value is 0.90, e.g., a 90 )% tolerance interval is estimated.

  11. The -mi=missing_value argument specifies the missing value indicator for the output NetCDF variables in the output_netcdf_file.

    If the -mi= argument is not specified, the missing_value in the output NetCDF variables is set to 1.e+20.

  12. The -logratio argument specifies that the logarithms of the spectral ratios must be computed and stored in the output NetCDF file.

  13. The -double argument specifies that the results are stored as double-precision floating point numbers in the output NetCDF file.

    By default, the results are stored as single-precision floating point numbers in the output NetCDF file.

  14. The -bigfile argument is allowed only if the NCSTAT software has been compiled with the _USE_NETCDF36 or _USE_NETCDF4 CPP macros (e.g., -D_USE_NETCDF36 or -D_USE_NETCDF4) and linked to the NetCDF 3.6 library or higher.

    If this argument is specified, the output_netcdf_file will be a 64-bit offset format file instead of a NetCDF classic format file. However, this argument is recognized in the procedure only if the NCSTAT software has been built with the _USE_NETCDF36 or _USE_NETCDF4 CPP macros.

  15. The -hdf5 argument is allowed only if the NCSTAT software has been compiled with the _USE_NETCDF4 CPP macro (e.g., -D_USE_NETCDF4) and linked to the NetCDF 4 library or higher.

    If this argument is specified, the output_netcdf_file will be a NetCDF-4/HDF5 format file instead of a NetCDF classic format file. However, this argument is recognized in the procedure only if the NCSTAT software has been built with the _USE_NETCDF4 CPP macro.

  16. Duplicate parameters are allowed, but this is always the last occurrence of a parameter which will be used for the computations. Moreover, the number of specified parameters must not be greater than the total number of allowed parameters.

  17. For more details on power spectrum analysis, theory for computing a pointwise tolerance interval for the ratio of two estimated power spectra and examples in the climate literature, see

    • “The use of Fast Fourier Transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms”, by P.D. Welch, IEEE trans. on audio and electroacoustics, 15:2, 70-73, 1967. DOI: https://doi.org/10.1109/TAU.1967.1161901
    • “Fourier analysis of time series- An introduction”, by Bloomfield, P., John Wiley and Sons, New York, Chapter 6, 1976. http://as.wiley.com/WileyCDA/WileyTitle/productCd-0471889482.html
    • “Time series: a biostatistical introduction”, by P.J. Diggle, Clarendon Press, Oxford, Chap. 4, 1990. http://www.oupcanada.com/catalog/9780198522263.html
    • “Impact of intra-daily SST variability on ENSO characteristics in a coupled model” by S. Masson et al., 2012, Climate Dynamics, Vol. 39:681-707, doi: 10.1007/s00382-011-1247-2
    • “The role of the intra-daily SST variability in the Indian monsoon variability in a global coupled model” by P. Terray et al., 2012, Climate Dynamics, Vol. 39:729-754, doi: 10.1007/s00382-011-1240-9
    • “Southern Hemisphere extra-tropical forcing: A new paradigm for El Nino-Southern Oscillation” by P. Terray, 2011, Climate Dynamics, Vol. 36:2171-2199, doi: 10.1007/s00382-010-0825-z

Outputs

comp_spectrum_ratio_3d creates an output NetCDF file that contains the upper and lower critical spectral ratios associated with the specified pointwise tolerance interval and the multi-channel spectral ratios of the two input estimated multi-channel power spectra.

The output NetCDF dataset contains the following NetCDF variables (in the description below, nlat and nlon are the lengths of the spatial dimensions of the input NetCDF variables numerator_netcdf_variable and denominator_netcdf_variable, nfreq is the number of frequencies for which the multi-channel spectral ratios and, eventually, the upper and lower critical spectral ratios are computed as determined by the -t= and -p= arguments) :

  1. frequency(nfreq) : the frequencies at which the spectral ratios are computed.

  2. periodicity(nfreq) : the corresponding periods in number of time observations.

  3. lower_critical_ratio(nfreq) : the lower critical spectral ratios associated with the specified pointwise tolerance interval.

    This variable is stored only if the -logratio argument has not been specified when calling comp_spectrum_ratio_3d.

  4. upper_critical_ratio(nfreq) : the upper critical spectral ratios associated with the specified pointwise tolerance interval.

    This variable is stored only if the -logratio argument has not been specified when calling comp_spectrum_ratio_3d.

  5. numerator_netcdf_variable_denominator_netcdf_variable_ratio(nfreq,nlat,nlon) : the multi-channel spectral ratios computed from the two multi-channel power spectra stored in the two input NetCDF variables numerator_netcdf_variable and denominator_netcdf_variable.

    This variable is stored only if the -logratio argument has not been specified when calling comp_spectrum_ratio_3d.

  6. lower_critical_logratio(nfreq) : the lower critical spectral ratios associated with the specified pointwise tolerance interval.

    This variable is stored only if the -logratio argument has been specified when calling comp_spectrum_ratio_3d.

  7. upper_critical_logratio(nfreq) : the upper critical spectral ratios associated with the specified pointwise tolerance interval.

    This variable is stored only if the -logratio argument has been specified when calling comp_spectrum_ratio_3d.

  8. numerator_netcdf_variable_denominator_netcdf_variable_logratio(nfreq) : the multi-channel spectral ratios computed from the two multi-channel power spectra stored the two input NetCDF variables numerator_netcdf_variable and denominator_netcdf_variable.

    This variable is stored only if the -logratio argument has been specified when calling comp_spectrum_ratio_3d.

All the multi-channel spectral ratios involving the two input NetCDF variables numerator_netcdf_variable and denominator_netcdf_variable are packed in tridimensional variables whose first and second dimensions are exactly the same as those associated with these input NetCDF variables even if you restrict the geographical domain with the -x= and -y= arguments. However, outside the selected domain, the output NetCDF variables are filled with missing values.

Examples

  1. For computing the multi-channel spectral ratios and the associated 90% confidence interval for two multi-channel power spectra stored in a NetCDF variable called sst extracted, respectively, from the files spectrum_sst.monthly.1979_2005.nc and spectrum_sst.monthly.1950_1978.nc, use the following command :

    $ comp_spectrum_ratio_3d  -fn=spectrum_sst.monthly.1979_2005.nc \
                              -fd=spectrum_sst.monthly.1950_1978.nc \
                              -nv=sst \
                              -dv=sst \
                              -o=spectrum_ratio_sst_1979_2005.1950_1978.nc
    
Flag Counter