comp_fftfilter_4d
Authors
Pascal Terray (LOCEAN/IPSL)
Latest revision
28/05/2024
Purpose
Filter a real multi-channel time series in a selected frequency band by windowed filtering [Iacobucci_Noullez]. The multi-channel time series is extracted from a four-dimensional variable read from a NetCDF dataset and can optionally be detrended before windowed filtering.
The windowed filter is applied to the time series in the frequency domain [Iacobucci_Noullez]. The filter is obtained by convolving a raised-cosine window with the ideal rectangular filter response function. This filter is stationary and symmetric; it therefore induces no phase shift and is a good candidate for extracting frequency-defined series components from short-length time series.
Applying the filter in the frequency domain implicitly assumes that the multi-channel time series is part of an infinite periodic series whose period is exactly equal to the length of the analyzed time series. The selected ideal filter gain vector (in the frequency domain) is convolved with the Fourier transform of a raised-cosine window to get the final filter frequency response. This frequency response is then multiplied with the Fourier transforms of the multi-channel time series, and the resulting sequences are transformed back to the time domain to obtain the final filtered multi-channel time series. The specific form of the raised-cosine window can be controlled with the -win= argument, described below.
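The frequency-domain procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NCSTAT implementation: the function names are hypothetical, a naive discrete Fourier transform replaces a real FFT, the raised-cosine window is assumed to be centered at lag 0 (so that its transform reduces to the three coefficients (1-win)/2, win, (1-win)/2), and the treatment of harmonics lying exactly on the band edges is a guess.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse transform; only the real part is kept (real input series)."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def ideal_gain(n, pl=0, ph=0):
    """Rectangular 0/1 gain keeping periods between pl and ph (0 = no bound)."""
    f_high = 1.0 / pl if pl else 0.5   # shortest period kept -> highest frequency
    f_low = 1.0 / ph if ph else 0.0    # longest period kept -> lowest frequency
    gain = []
    for k in range(n):
        f = min(k, n - k) / n          # symmetric in +/- frequency: no phase shift
        gain.append(1.0 if f_low <= f <= f_high else 0.0)
    return gain

def windowed_gain(gain, win=0.54):
    """Circular convolution of the ideal gain with the window transform.

    Assumption: window centered at lag 0, win=0.54 Hamming, 0.5 Hanning,
    1.0 rectangular (ideal gain returned unchanged).
    """
    n = len(gain)
    side = (1.0 - win) / 2.0
    return [win * gain[k] + side * (gain[k - 1] + gain[(k + 1) % n])
            for k in range(n)]

def fft_filter(x, pl=0, ph=0, win=0.54):
    """Multiply the series transform by the windowed gain and invert."""
    h = windowed_gain(ideal_gain(len(x), pl, ph), win)
    X = dft(x)
    return idft([h[k] * X[k] for k in range(len(x))])
```

With pl=0 and ph=0 the gain is one at every harmonic, so this sketch returns the input series unchanged, which matches the "no filtering" behaviour described for these values.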
Additionally, the filtering can be done separately for different segments of equal length of the selected multi-channel time series if the multi-channel time series are not continuous in time (see the -p= argument).
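As a rough illustration of the segment-wise processing, the sketch below splits a selected time series into consecutive segments of equal length. The helper name split_segments is hypothetical; the only rule taken from this page is that the selected length must be a multiple of the period_length.

```python
def split_segments(series, period_length=None):
    """Split a series into consecutive segments of period_length observations.

    With period_length=None the whole series forms a single segment,
    mirroring the default period_length = time2 - time1 + 1.
    """
    n = len(series)
    if period_length is None:
        period_length = n
    if period_length <= 0 or n % period_length != 0:
        raise ValueError("the selected length must be a multiple of period_length")
    return [series[i:i + period_length] for i in range(0, n, period_length)]
```

Each returned segment would then be detrended and filtered on its own, independently of the others.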
This procedure returns the filtered multi-channel time series in a NetCDF dataset. If the NetCDF variable is one-dimensional or three-dimensional, use comp_fftfilter_1d or comp_fftfilter_3d, respectively, instead of comp_fftfilter_4d. If the time series has a seasonal (or diurnal) cycle, use comp_stl_4d to estimate and remove the harmonic components of the time series before using comp_fftfilter_4d.
Finally, use comp_spectvar_4d to compute variance estimates in the selected frequency band for the real multi-channel time series, before or after filtering.
Further Details
Usage
$ comp_fftfilter_4d \
-f=input_netcdf_file \
-v=netcdf_variable \
-m=input_mesh_mask_netcdf_file (optional) \
-g=grid_type (optional : n, t, u, v, w, f) \
-x=lon1,lon2 (optional) \
-y=lat1,lat2 (optional) \
-z=level1,level2 (optional) \
-t=time1,time2 (optional) \
-o=output_netcdf_file (optional) \
-p=period_length (optional) \
-pl=minimum_period (optional) \
-ph=maximum_period (optional) \
-tr=trend_removal (optional : 0, 1, 2, 3, -1, -2, -3) \
-win=window_choice (optional : 0.5 > 1.) \
-mi=missing_value (optional) \
-ngp=number_of_grid_points (optional) \
-double (optional) \
-bigfile (optional) \
-hdf5 (optional) \
-tlimited (optional)
By default
- -m=
- an input_mesh_mask_netcdf_file is not used
- -g=
- the grid_type is set to n, which means that the 2-D grid-mesh associated with the input NetCDF variable is assumed to be regular or Gaussian
- -x=
- the whole longitude domain associated with the netcdf_variable
- -y=
- the whole latitude domain associated with the netcdf_variable
- -z=
- the whole vertical resolution associated with the netcdf_variable
- -t=
- the whole time period associated with the netcdf_variable
- -o=
- the output_netcdf_file is named filt_netcdf_variable.nc
- -p=
- the period_length is set to time2 - time1 + 1, which means that the time series is considered as continuous with only one time segment
- -pl=
- the minimum_period is set to 0, which means that no filtering is done for the shorter periods
- -ph=
- the maximum_period is set to 0, which means that no filtering is done for the longer periods
- -tr=0
- trend_removal is set to 0, which means that no detrending is done before filtering
- -win=0.54
- the Hamming window is convolved with the ideal filter response
- -mi=
- the missing_value for the output variable is equal to 1.e+20
- -ngp=
- the number_of_grid_points is set to the number of points in the selected domain
- -double
- the results are stored as single-precision floating point numbers in the output NetCDF file. If -double is activated, the results are stored as double-precision floating point numbers
- -bigfile
- a NetCDF classical format file is created. If -bigfile is activated, the output NetCDF file is a 64-bit offset format file
- -hdf5
- a NetCDF classical format file is created. If -hdf5 is activated, the output NetCDF file is a NetCDF-4/HDF5 format file
- -tlimited
- the time dimension is defined as unlimited in the output NetCDF file. However, if -tlimited is activated, the time dimension is defined as limited in the output NetCDF file
Remarks
The -v=netcdf_variable argument specifies the NetCDF variable for which a windowed filtering operation must be computed and the -f=input_netcdf_file argument specifies that this NetCDF variable must be extracted from the NetCDF file input_netcdf_file.
The geographical shapes of the netcdf_variable (in the input_netcdf_file) and the NetCDF mesh_mask variable (in the input_mesh_mask_netcdf_file) must agree if the -m= argument is used.
If -g= is set to t, u, v, w or f, it is assumed that the NetCDF variable is from an experiment with the NEMO model (ORCA configuration and R2, R4 or R05 resolutions).
If -g= is set to n, it is assumed that the 3-D grid-mesh is regular or Gaussian.
This argument is also used to determine the name of the NetCDF mesh_mask variable if an input_mesh_mask_netcdf_file is used as specified with the -m= argument.
If the -x=lon1,lon2, -y=lat1,lat2 and -z=level1,level2 arguments are missing, the whole geographical domain and vertical resolution associated with the netcdf_variable are used to select the multi-channel time series.
The longitude, latitude or level range must be a vector of two integers specifying the first and last selected indices along each dimension. The indices are relative to 1. Negative values are allowed for lon1. In this case, the longitude domain is from nlon+lon1+1 to lon2, where nlon is the number of longitude points in the grid associated with the NetCDF variable, and it is assumed that the grid is periodic.
Refer to comp_mask_4d for converting geographical coordinates into indices before using comp_fftfilter_4d.
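The index rule for a negative lon1 can be illustrated as follows. The helper longitude_indices is hypothetical, and the way the selection wraps around the periodic grid (up to nlon, then restarting at 1) is an assumption drawn from the statement that the grid is periodic.

```python
def longitude_indices(lon1, lon2, nlon):
    """Return the 1-based longitude indices selected by -x=lon1,lon2.

    Assumption: for a negative lon1 the selection starts at nlon + lon1 + 1,
    runs to nlon, then wraps around to cover 1..lon2.
    """
    if lon1 < 0:
        start = nlon + lon1 + 1
        return list(range(start, nlon + 1)) + list(range(1, lon2 + 1))
    return list(range(lon1, lon2 + 1))
```

For instance, on a grid with nlon=10, the pair lon1=-2, lon2=3 would select the indices 9, 10, 1, 2, 3 under this assumption.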
If the -t=time1,time2 argument is missing, the whole time period associated with the netcdf_variable is used for filtering the time series.
The selected time period is a vector of two integers specifying the first and last time observations. The indices are relative to 1. Note that the output NetCDF file will have ntime = time2 - time1 + 1 time observations.
If the -p= argument is specified, the filtering is applied separately to each time segment of length period_length (as determined by the value of the -p= argument). The length of the whole selected time period (i.e., time2 - time1 + 1) must also be a multiple of the period_length.
The -pl= argument specifies the minimum period of oscillation of the filtered time series. The minimum_period is expressed in number of time observations.
Do not use the -pl= argument, or use -pl=0, for high-pass filtering frequencies corresponding to periods shorter than -ph=PH.
The -pl= argument is a non-negative integer, either 0 or greater than 2.
The -ph= argument specifies the maximum period of oscillation of the filtered time series. The maximum_period is expressed in number of time observations. Do not use the -ph= argument, or use -ph=0, for low-pass filtering frequencies corresponding to periods longer than -pl=PL. For example, -pl=6 (or 18) and -ph=32 (or 96) select periods between 1.5 and 8 years for quarterly (monthly) time series.
The -ph= argument is a non-negative integer, either 0 or greater than 2, and less than the length of the time series, or less than the period_length if the -p= argument is used.
The -ph= argument must also be greater than or equal to the -pl= argument if both are specified.
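Because -pl= and -ph= are given in number of observations, it can help to convert them to physical units and to Fourier harmonics before running the operator. The sketch below is a hypothetical helper, not part of NCSTAT; the rounding at the band edges is an assumption, since this page does not say how harmonics lying exactly on an edge are treated.

```python
import math

def period_in_years(period, obs_per_year):
    """Convert a period in observations to years (12 monthly, 4 quarterly)."""
    return period / obs_per_year

def passband_harmonics(n, pl=0, ph=0):
    """Range (k_min, k_max) of Fourier harmonics k/n inside the pass band.

    n is the series (or segment) length; 0 means no bound on that side.
    Assumption: harmonics exactly on a band edge are counted as inside.
    """
    k_min = math.ceil(n / ph) if ph else 0
    k_max = math.floor(n / pl) if pl else n // 2
    return k_min, k_max
```

For a 480-month series, -pl=18 and -ph=96 would correspond to harmonics 5 through 26 in this sketch, that is, periods from 8 years down to roughly 1.5 years.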
Setting -pl= and -ph= to the same value P is allowed. In this case, an "ideal" band-pass filter with peak response near one at the single period P is computed and applied to the time series.
Setting -pl=PL, -ph=PH and PH < PL is also allowed and performs band rejection of periods between PH and PL. In that case, the meaning of the -pl= and -ph= arguments is reversed.
Setting both -pl=0 and -ph=0 is also allowed. In that case, no frequency filtering is done, but the data may be detrended if the -tr= argument is used with a value of 1, 2 or 3.
The -tr= argument specifies pre- and post-filtering processing of the multi-channel time series. If:
- -tr=+/-1, the mean of the time series is removed before time filtering
- -tr=+/-2, the drift from the time series is removed before time filtering. The drift for the time series is estimated using the formula: drift = ( tseries(ntime) - tseries(1) )/( ntime - 1 )
- -tr=+/-3, the least-squares line from the time series is removed before time filtering.
If -tr=-1, -2 or -3, the mean, drift or least-squares line is reintroduced post-filtering, respectively.
For other values of the -tr= argument, nothing is done before or after filtering.
If the -p= argument is present, the pre-filtering and post-filtering processing is applied to each time segment, separately.
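The three detrending options and their optional reintroduction can be sketched as below. The function names are hypothetical and this is not the NCSTAT code. In particular, for -tr=+/-2 this page only gives the drift formula; subtracting drift*(t-1) from the observation at time t is an assumption.

```python
def detrend(series, tr):
    """Return (residual, trend) for the pre-filtering step selected by tr.

    Modes 2 and 3 need at least two observations.
    """
    n = len(series)
    mode = abs(tr)
    if mode == 1:
        # Remove the mean.
        mean = sum(series) / n
        trend = [mean] * n
    elif mode == 2:
        # drift = (tseries(ntime) - tseries(1)) / (ntime - 1)
        drift = (series[-1] - series[0]) / (n - 1)
        trend = [drift * t for t in range(n)]   # assumed form of the removed drift
    elif mode == 3:
        # Ordinary least-squares line fitted against the time index.
        t_mean = (n - 1) / 2.0
        x_mean = sum(series) / n
        slope = (sum((t - t_mean) * (x - x_mean) for t, x in enumerate(series))
                 / sum((t - t_mean) ** 2 for t in range(n)))
        trend = [x_mean + slope * (t - t_mean) for t in range(n)]
    else:
        # Any other value: nothing is removed.
        trend = [0.0] * n
    residual = [x - g for x, g in zip(series, trend)]
    return residual, trend

def restore(filtered, trend, tr):
    """Add the removed component back only for tr = -1, -2 or -3."""
    if tr in (-1, -2, -3):
        return [y + g for y, g in zip(filtered, trend)]
    return list(filtered)
```

When -p= is used, the same two steps would simply be applied to each time segment in turn.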
The -tr= argument must be an integer and the default value for the -tr= argument is 0.
The -win= argument controls the form of the window, which will be convolved with the ideal filter response. By default, a Hamming window is used (i.e., -win=0.54).
Set -win=0.5 for using a Hanning window or -win=1. for a rectangular window (i.e., the "ideal" filter).
The -win= argument is a real number greater than or equal to 0.5 and less than or equal to 1.
The -mi=missing_value argument specifies the missing value indicator for the output variable in the output_netcdf_file.
If the -mi= argument is not specified, the missing_value is set to 1.e+20.
The -ngp= argument can be used if you have memory problems when running comp_fftfilter_4d on very large datasets. By default, the number_of_grid_points is set to the number of cells in the selected domain. In case of memory problems, you can use the -ngp= argument with a lower value. This will reduce the memory used by the operator.
The -double argument specifies that the results are stored as double-precision floating point numbers in the output NetCDF file.
By default, the results are stored as single-precision floating point numbers in the output NetCDF file.
The -bigfile argument is allowed only if the NCSTAT software has been compiled with the _USE_NETCDF36 or _USE_NETCDF4 CPP macros (e.g., -D_USE_NETCDF36 or -D_USE_NETCDF4) and linked to the NetCDF 3.6 library or higher.
If this argument is specified, the output_netcdf_file will be a 64-bit offset format file instead of a NetCDF classic format file. However, this argument is recognized in the procedure only if the NCSTAT software has been built with the _USE_NETCDF36 or _USE_NETCDF4 CPP macros.
The -hdf5 argument is allowed only if the NCSTAT software has been compiled with the _USE_NETCDF4 CPP macro (e.g., -D_USE_NETCDF4) and linked to the NetCDF 4 library or higher.
If this argument is specified, the output_netcdf_file will be a NetCDF-4/HDF5 format file instead of a NetCDF classic format file. However, this argument is recognized in the procedure only if the NCSTAT software has been built with the _USE_NETCDF4 CPP macro.
If the multi-channel time series has a seasonal or diurnal cycle, use comp_stl_4d or comp_clim_4d to remove the pure harmonic components from the time series before filtering.
It is assumed that the data has no missing values.
Duplicate parameters are allowed, but only the last occurrence of a parameter is used for the computations. Moreover, the number of specified parameters must not be greater than the total number of allowed parameters.
For more details on windowed filtering, see
- “A Frequency Selective Filter for Short-Length Time Series”, by Iacobucci, A., and Noullez, A., Computational Economics, Vol. 25, 75-102, 2005. https://link.springer.com/article/10.1007/s10614-005-6276-7
Outputs
comp_fftfilter_4d creates an output NetCDF file that contains the filtered time series estimated from the multi-channel time series associated with the input NetCDF variable. The output NetCDF dataset contains the following NetCDF variable (in the description below, nlev, nlat and nlon are the lengths of the vertical and spatial dimensions of the input NetCDF variable and ntime is the selected number of time observations):
- netcdf_variable_filt(ntime,nlev,nlat,nlon): the filtered time series for each of the time series of the 3-D grid-mesh associated with the input NetCDF variable.
The filtered multi-channel time series is packed in a four-dimensional variable whose first, second and third dimensions are exactly the same as those associated with the input NetCDF variable netcdf_variable, even if you restrict the geographical domain with the -x=, -y= and -z= arguments. However, outside the selected domain, the output NetCDF variable is filled with missing values.
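The layout of the output variable can be pictured with a small sketch for a single horizontal slice (one time step, one level). The helper pack_field and the nested-list layout are purely illustrative; only the default missing value 1.e+20 and the rule "full grid, missing values outside the selected domain" come from this page.

```python
MISSING = 1.0e20  # default missing_value when -mi= is not given

def pack_field(nlat, nlon, lat1, lat2, lon1, lon2, filtered):
    """Place filtered values on the full (nlat, nlon) grid.

    Indices are 1-based and inclusive, as for -y=lat1,lat2 and -x=lon1,lon2
    (positive lon1 only). Points outside the selected domain stay missing.
    """
    out = [[MISSING] * nlon for _ in range(nlat)]
    for j in range(lat1, lat2 + 1):
        for i in range(lon1, lon2 + 1):
            out[j - 1][i - 1] = filtered[j - lat1][i - lon1]
    return out
```

A reader of the output file therefore needs to mask values equal to the missing value before computing statistics over the grid.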
Examples
To apply windowed filtering to a real multi-channel monthly time series, retaining periods between 18 and 30 months (i.e., the biennial time scale), from a four-dimensional NetCDF variable called uwnd extracted from the file uwnd.monthly.mean.ncep2.nc, which includes monthly time series, and to store the results in the NetCDF file qbo_uwnd_ncep2.nc, use the following command:
$ comp_fftfilter_4d \
-f=uwnd.monthly.mean.ncep2.nc \
-v=uwnd \
-m=mesh_mask_uwnd_ncep2.nc \
-pl=18 \
-ph=30 \
-o=qbo_uwnd_ncep2.nc