comp_lanczos_filter_3d¶
Authors¶
Pascal Terray (LOCEAN/IPSL)
Latest revision¶
29/05/2024
Purpose¶
Filter a real multi-channel time series in a selected frequency band by Lanczos filtering [Bloomfield] [Duchon]. The multi-channel time series is extracted from a tridimensional variable read from a NetCDF dataset and can optionally be detrended before Lanczos filtering.
The number of coefficients used to build the Lanczos filter can be selected by the user, and the filter can be applied to the multi-channel time series in either the time or the frequency domain. This gives the user some control over the end effects of the filter: applying the filter in the frequency domain implicitly assumes that the multi-channel time series is part of an infinite periodic series whose period is exactly equal to the length of the analyzed time series, whereas applying the filter in the time domain implies some loss of data, or some distortion of the desired response function of the filter, at both ends of the filtered time series.
Additionally, the filtering can be done separately for different segments of equal length of the selected multi-channel time series if this time series is not continuous in time.
The frequency response function (i.e., the transfer function) of the selected Lanczos filter can be computed by comp_freq_func_1d. See the references cited below for more details on Lanczos filtering [Bloomfield] [Duchon].
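As a rough illustration of what such a response function looks like, the sketch below builds low-pass Lanczos weights in the textbook form given by [Duchon] and evaluates their frequency response directly. The function names are hypothetical, and the coefficients actually used by comp_lanczos_filter_3d (for example their normalization, or how band-pass weights are derived) may differ.

```python
import math

def lanczos_lowpass_weights(k_total, f_cut):
    """Textbook low-pass Lanczos weights (Duchon, 1979); not NCSTAT's own code.

    k_total : odd number of weights, K = 2n + 1
    f_cut   : cutoff frequency in cycles per time observation (0 < f_cut < 0.5)
    """
    if k_total < 3 or k_total % 2 == 0:
        raise ValueError("K must be odd and >= 3")
    n = (k_total - 1) // 2
    weights = []
    for k in range(-n, n + 1):
        if k == 0:
            weights.append(2.0 * f_cut)
        else:
            ideal = math.sin(2.0 * math.pi * f_cut * k) / (math.pi * k)
            sigma = math.sin(math.pi * k / n) / (math.pi * k / n)  # Lanczos sigma factor
            weights.append(ideal * sigma)
    return weights

def response(weights, freq):
    """Frequency response of a symmetric filter at freq (cycles per observation)."""
    n = (len(weights) - 1) // 2
    return sum(w * math.cos(2.0 * math.pi * freq * (i - n))
               for i, w in enumerate(weights))

# Low-pass filter with a cutoff period of 18 observations and K = 73 weights.
w = lanczos_lowpass_weights(73, 1.0 / 18.0)
for period in (96, 36, 18, 9, 4):
    print(period, round(response(w, 1.0 / period), 3))
```

The response should be close to one well inside the pass band, close to zero well inside the stop band, and pass through roughly one half at the cutoff. Increasing K narrows the transition band around the cutoff.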
This procedure returns the filtered real multi-channel time series in a NetCDF dataset. If the NetCDF variable is one- or four-dimensional, use comp_lanczos_filter_1d or comp_lanczos_filter_4d, respectively, instead of comp_lanczos_filter_3d. If the multi-channel time series has a seasonal (or diurnal) cycle, use comp_clim_3d and comp_norm_3d or comp_stl_3d in order to estimate and remove the harmonic components of the time series before using comp_lanczos_filter_3d.
If you need more control over the filtering parameters, including other windows, use comp_symlin_filter_3d instead of comp_lanczos_filter_3d.
This procedure is parallelized if OpenMP is used.
Further Details¶
Usage¶
$ comp_lanczos_filter_3d \
-f=input_netcdf_file \
-v=netcdf_variable \
-m=input_mesh_mask_netcdf_file (optional) \
-g=grid_type (optional : n, t, u, v, w, f) \
-x=lon1,lon2 (optional) \
-y=lat1,lat2 (optional) \
-t=time1,time2 (optional) \
-o=output_netcdf_file (optional) \
-p=periodicity (optional) \
-pl=minimum_period (optional) \
-ph=maximum_period (optional) \
-tr=trend_removal (optional : 0, 1, 2, 3, -1, -2, -3) \
-nfc=number_of_filter_coefficients (optional) \
-mi=missing_value (optional) \
-ngp=number_of_grid_points (optional) \
-notestf (optional) \
-usefft (optional) \
-double (optional) \
-bigfile (optional) \
-hdf5 (optional) \
-tlimited (optional)
By default¶
- -m=
- an input_mesh_mask_netcdf_file is not used
- -g=
- the grid_type is set to n, which means that the 2-D grid-mesh associated with the input NetCDF variable is assumed to be regular or Gaussian
- -x=
- the whole longitude domain associated with the netcdf_variable
- -y=
- the whole latitude domain associated with the netcdf_variable
- -t=
- the whole time period associated with the netcdf_variable
- -o=
- the output_netcdf_file is named filt_netcdf_variable.nc
- -p=
- the periodicity is set to time2 - time1 + 1, which means that the time series is considered as continuous with only one time segment
- -pl=
- the minimum_period is set to 0, which means that no filtering is done for the shorter periods
- -ph=
- the maximum_period is set to 0, which means that no filtering is done for the longer periods
- -tr=
- the trend_removal is set to 0, which means that no detrending is done before filtering
- -nfc=
- the number_of_filter_coefficients is determined in order to optimize the frequency response function of the selected filter
- -mi=
- the missing_value for the output variable is equal to 1.e+20
- -ngp=
- the number_of_grid_points is set to the number of grid points in the selected domain
- -notestf
- normally, the value of the -nfc= argument must be chosen such that the full transition bands about the cutoff frequencies 1/PH and 1/PL (where PH and PL are the values of the -ph= and -pl= arguments, respectively) of the selected filter are inside the [0., 0.5] frequency interval. By using the -notestf argument you can get rid of this limitation
- -usefft
- the Lanczos filter is applied in the time domain. When you specify the -usefft argument, the filter is applied in the frequency domain, using an FFT, a multiplication and an inverse FFT instead of a convolution in the time domain
- -double
- the results are stored as single-precision floating point numbers in the output NetCDF file. If -double is activated, the results are stored as double-precision floating point numbers
- -bigfile
- a NetCDF classical format file is created. If -bigfile is activated, the output NetCDF file is a 64-bit offset format file
- -hdf5
- a NetCDF classical format file is created. If -hdf5 is activated, the output NetCDF file is a NetCDF-4/HDF5 format file
- -tlimited
- the time dimension is defined as unlimited in the output NetCDF file. However, if -tlimited is activated, the time dimension is defined as limited in the output NetCDF file
Remarks¶
The -v=netcdf_variable argument specifies the NetCDF variable for which a Lanczos filtering operation must be computed and the -f=input_netcdf_file argument specifies that this NetCDF variable must be extracted from the NetCDF file input_netcdf_file.
The geographical shapes of the netcdf_variable (in the input_netcdf_file) and the NetCDF mesh_mask variable (in the input_mesh_mask_netcdf_file) must agree if the -m= argument is used.
If -g= is set to t, u, v, w or f, it is assumed that the NetCDF variable is from an experiment with the NEMO model (ORCA configuration and R2, R4 or R05 resolutions).
If -g= is set to n, it is assumed that the 2-D grid-mesh is regular or Gaussian.
This argument is also used to determine the name of the NetCDF mesh_mask variable if an input_mesh_mask_netcdf_file is used as specified with the -m= argument.
If the -x=lon1,lon2 and -y=lat1,lat2 arguments are missing, the whole geographical domain associated with the netcdf_variable is used to select the multi-channel time series.
The longitude or latitude range must be a vector of two integers specifying the first and last selected indices along each dimension. The indices are relative to 1. Negative values are allowed for lon1. In this case the longitude domain is from nlon+lon1+1 to lon2, where nlon is the number of longitude points in the grid associated with the NetCDF variable, and it is assumed that the grid is periodic.
Refer to comp_mask_3d for transforming geographical coordinates into indices before using comp_lanczos_filter_3d.
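The negative lon1 convention can be sketched as follows. This is one possible reading of the rule above (the domain starts at nlon+lon1+1 and wraps around the periodic grid up to lon2); the helper name is hypothetical and is not part of NCSTAT.

```python
def selected_longitudes(lon1, lon2, nlon):
    """Return the 1-based longitude indices selected by -x=lon1,lon2.

    Assumed reading of the documented rule: a negative lon1 starts the domain
    at nlon + lon1 + 1 and wraps around the periodic grid up to lon2.
    """
    if lon1 < 0:
        start = nlon + lon1 + 1
        return list(range(start, nlon + 1)) + list(range(1, lon2 + 1))
    return list(range(lon1, lon2 + 1))

# Grid with 144 longitude points, -x=-10,20
idx = selected_longitudes(-10, 20, 144)
print(idx[0], idx[-1], len(idx))
```

With these values the selection starts at index 135, crosses the periodic boundary and ends at index 20.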
If the -t=time1,time2 argument is missing, the whole time period associated with the netcdf_variable is used to filter the time series.
The selected time period is a vector of two integers specifying the first and last time observations. The indices are relative to 1. Note that the output NetCDF file will have ntime = time2 - time1 + 1 time observations.
If the -p= argument is specified, the filtering is applied separately for each time segment of length periodicity (as determined by the value of the -p= argument). The whole selected time period (i.e., time2 - time1 + 1) must also be a multiple of the periodicity.
The -pl= argument specifies the minimum period of oscillation of the filtered time series. The minimum_period is expressed in number of time observations.
Do not use the -pl= argument, or use -pl=0, to high-pass filter the series, i.e., to retain the frequencies corresponding to periods shorter than PH (the value given with -ph=PH).
The -pl= argument is a non-negative integer, equal to 0 or greater than 2.
The -ph= argument specifies the maximum period of oscillation of the filtered time series. The maximum_period is expressed in number of time observations. Do not use the -ph= argument, or use -ph=0, to low-pass filter the series, i.e., to retain the frequencies corresponding to periods longer than PL (the value given with -pl=PL). For example, -pl=6 (or 18) and -ph=32 (or 96) select periods between 1.5 and 8 years for quarterly (monthly) time series.
The -ph= argument is a non-negative integer, equal to 0 or greater than 2, and less than the length of the multi-channel time series or the periodicity if the -p= argument is used.
The -ph= argument must also be greater than or equal to the -pl= argument if both are specified.
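The constraints on -pl= and -ph= and the conversion between periods and cutoff frequencies can be summarized in a short sketch. The helper below is hypothetical (it is not an NCSTAT routine) and only encodes the rules stated above.

```python
def check_periods(pl, ph, segment_length):
    """Check -pl=/-ph= values against the documented rules (hypothetical helper).

    segment_length is the length of the time series, or the periodicity if
    the -p= argument is used. Returns the cutoff frequencies (1/PH, 1/PL) in
    cycles per time observation, with None for a cutoff that is disabled (0).
    """
    for name, value in (("pl", pl), ("ph", ph)):
        if value != 0 and value <= 2:
            raise ValueError(f"-{name}= must be 0 or greater than 2")
    if ph != 0 and ph >= segment_length:
        raise ValueError("-ph= must be less than the series length (or periodicity)")
    if pl != 0 and ph != 0 and ph < pl:
        raise ValueError("-ph= must be greater than or equal to -pl=")
    return (1.0 / ph if ph else None, 1.0 / pl if pl else None)

# Monthly data: -pl=18 and -ph=96 keep periods between 1.5 and 8 years.
f_low, f_high = check_periods(18, 96, 480)
print(18 / 12, 96 / 12)   # retained periods, in years
print(f_low, f_high)      # cutoff frequencies, in cycles per month
```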
Setting -pl= and -ph= to the same value P is allowed. In this case, an -ideal- band-pass filter with peak response near one at the single period P is computed and applied to the multi-channel time series.
Setting both -pl=0 and -ph=0 is also allowed. In that case, no frequency filtering is done, but the data may be detrended if the -tr= argument is used with a value of 1, 2 or 3.
The -tr= argument specifies pre- and post-filtering processing of the multi-channel time series. If:
- -tr=+/-1, the means of the time series are removed before time filtering
- -tr=+/-2, the drifts from the time series are removed before time filtering. The drift for each time series is estimated using the formula : drift = ( tseries(ntime) - tseries(1) )/( ntime - 1 )
- -tr=+/-3, the least-squares lines from the multi-channel time series are removed before time filtering.
If -tr=-1, -2 or -3, the means, drifts or least-squares lines are reintroduced post-filtering, respectively.
For other values of the -tr= argument, nothing is done before or after filtering.
If the -p= argument is present, the pre-filtering and post-filtering processing is applied to each time segment, separately.
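A minimal sketch of the quantities involved, computed segment by segment, is given below. The function names are hypothetical, the time index is assumed to start at 0 for the least-squares fit, and the way NCSTAT actually subtracts and reintroduces these terms is not reproduced here.

```python
def split_segments(series, periodicity):
    """Split a series into consecutive segments of equal length (-p= argument)."""
    if len(series) % periodicity != 0:
        raise ValueError("series length must be a multiple of the periodicity")
    return [series[i:i + periodicity] for i in range(0, len(series), periodicity)]

def segment_mean(seg):
    """Mean of the segment, the term removed when -tr=+/-1."""
    return sum(seg) / len(seg)

def segment_drift(seg):
    """Drift estimate for -tr=+/-2: (last - first) / (ntime - 1)."""
    return (seg[-1] - seg[0]) / (len(seg) - 1)

def segment_lsq_line(seg):
    """Least-squares intercept and slope (-tr=+/-3), time index 0..ntime-1 (assumed)."""
    n = len(seg)
    t_mean = (n - 1) / 2.0
    y_mean = sum(seg) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope

# Eight observations processed as two segments of length 4 (as with -p=4).
series = [1.0, 2.0, 4.0, 7.0, 10.0, 12.0, 13.0, 16.0]
for seg in split_segments(series, 4):
    print(segment_mean(seg), segment_drift(seg), segment_lsq_line(seg))
```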
The -tr= argument must be an integer and the default value for the -tr= argument is 0.
The -nfc= argument specifies the desired number of symmetric linear filter coefficients for the filtering of the multi-channel time series. If -nfc= is not specified, an optimal value is chosen in order to obtain a good frequency response function for the selected filter.
However, if -nfc= is set to K, the first and last (K-1)/2 time observations in the output NetCDF file will be affected by end effects. Thus, the user must choose the number of filter terms, K, as a compromise between:
- A sharp cutoff, that is, 1/K small; and
- Minimizing the number of data points lost or affected by the filtering operation (since (K-1)/2 data points will be lost or affected from each end of the series).
Finally, the -nfc= argument must be odd and greater than or equal to 3.
The -mi=missing_value argument specifies the missing value indicator for the output variable in the output_netcdf_file.
If the -mi= argument is not specified, the missing_value is set to 1.e+20.
The -ngp= argument can be used if you have memory problems when running comp_lanczos_filter_3d on very large datasets. By default, the number_of_grid_points is set to the number of cells in the selected domain. In case of memory problems, you can use the -ngp= argument with a lower value. This will reduce the memory used by the operator.
The -notestf argument allows you to bypass some of the restrictions on the number of filter coefficients as specified with the -nfc= argument.
By default, the value of the -nfc= argument must be chosen such that the full transition bands about the cutoff frequencies (1/PH and 1/PL) of the selected filter are inside the [0., 0.5] frequency interval.
When the -notestf argument is specified, only the cutoff frequencies (i.e., 1/PH and 1/PL) of the selected filter must lie in the [0., 0.5] frequency interval and not the full transition bands around them.
This makes it possible to reduce the number of filter coefficients and, thus, to minimize the number of data points lost by the filtering operation (if -nfc= is set to K, (K-1)/2 data points will be “lost” or affected by end effects from each end of the series).
The -usefft argument specifies that the Lanczos filter must be applied in the frequency domain by using a Fast Fourier Transform and the convolution theorem. Moreover, if the -usefft argument is specified, the values at the ends of the output filtered series are computed implicitly by assuming that the input series is part of a periodic sequence.
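The difference between the two ways of applying the filter can be illustrated with a toy example. By the convolution theorem, multiplying discrete Fourier transforms is equivalent to a circular convolution, so the sketch below uses a direct circular convolution in place of an FFT. The function names are hypothetical; flagging the affected end points as missing in the time-domain version is an assumption made for the illustration, not a description of what comp_lanczos_filter_3d writes at those points.

```python
def filter_time_domain(series, weights, missing=1.0e20):
    """Convolve with symmetric weights; the (K-1)/2 points at each end are
    flagged as missing here (an assumption made for this illustration)."""
    n = (len(weights) - 1) // 2
    out = [missing] * len(series)
    for t in range(n, len(series) - n):
        out[t] = sum(w * series[t + j - n] for j, w in enumerate(weights))
    return out

def filter_periodic(series, weights):
    """Circular convolution: the series is treated as one period of a periodic
    sequence, as implied by filtering in the frequency domain."""
    n = (len(weights) - 1) // 2
    size = len(series)
    return [sum(w * series[(t + j - n) % size] for j, w in enumerate(weights))
            for t in range(size)]

series = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
weights = [0.25, 0.5, 0.25]   # simple 3-term smoother (K = 3), not a Lanczos filter

a = filter_time_domain(series, weights)
b = filter_periodic(series, weights)
print(a[1:-1] == b[1:-1])     # interior points are identical
print(b[0], b[-1])            # end points mix values from both ends of the series
```

The interior values agree in both cases; only the (K-1)/2 points at each end differ, because the periodic version wraps around and combines the first and last observations.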
The -double argument specifies that the results are stored as double-precision floating point numbers in the output NetCDF file.
By default, the results are stored as single-precision floating point numbers in the output NetCDF file.
The -bigfile argument is allowed only if the NCSTAT software has been compiled with the _USE_NETCDF36 or _USE_NETCDF4 macros (e.g., -D_USE_NETCDF36 or -D_USE_NETCDF4) and linked to the NetCDF 3.6 library or higher.
If this argument is specified, the output_netcdf_file will be a 64-bit offset format file instead of a NetCDF classic format file. However, this argument is recognized in the procedure only if the NCSTAT software has been built with the _USE_NETCDF36 or _USE_NETCDF4 CPP macros.
The -hdf5 argument is allowed only if the NCSTAT software has been compiled with the _USE_NETCDF4 macro (e.g., -D_USE_NETCDF4) and linked to the NetCDF 4 library or higher.
If this argument is specified, the output_netcdf_file will be a NetCDF-4/HDF5 format file instead of a NetCDF classic format file. However, this argument is recognized in the procedure only if the NCSTAT software has been built with the _USE_NETCDF4 CPP macro.
If the multi-channel time series has a seasonal or diurnal cycle, use comp_stl_3d or comp_clim_3d to remove the pure harmonic components from the time series before filtering.
It is assumed that the data has no missing values except those associated with a constant land-sea mask.
Duplicate parameters are allowed, but only the last occurrence of a parameter is used for the computations. Moreover, the number of specified parameters must not be greater than the total number of allowed parameters.
For more details on Lanczos filtering and examples of use in the climate literature, see
- [Bloomfield] “Fourier analysis of time series: An introduction”, by Bloomfield, P., John Wiley and Sons, New York, Chapter 6, 1976. http://as.wiley.com/WileyCDA/WileyTitle/productCd-0471889482.html
- [Duchon] “Lanczos filtering in one and two dimensions”, by Duchon, C., Journal of Applied Meteorology, vol. 18, 1016-1022, 1979. doi: 10.1175/1520-0450(1979)018<1016:LFIOAT>2.0.CO;2
Outputs¶
comp_lanczos_filter_3d creates an output NetCDF file that contains the filtered time series estimated from the multi-channel time series associated with the input NetCDF variable. The output NetCDF data set contains the following NetCDF variable (in the description below, nlat and nlon are the lengths of the spatial dimensions of the input NetCDF variable and ntime is the selected number of time observations) :
- netcdf_variable_filt(ntime,nlat,nlon) : the filtered time series at each point of the 2-D grid-mesh associated with the input NetCDF variable.
The filtered multi-channel time series is packed in a tridimensional variable whose first and second dimensions are exactly the same as those associated with the input NetCDF variable netcdf_variable, even if you restrict the geographical domain with the -x= and -y= arguments. However, outside the selected domain, the output NetCDF variable is filled with missing values.
Examples¶
To filter a real multi-channel monthly time series between 18 and 30 months (e.g., biennial time scale) from a tridimensional NetCDF variable called mslp extracted from the file mslp.monthly.mean_ncep2.nc, which includes monthly time series, and store the results in the NetCDF file tbo_mslp_ncep2.nc, use the following command :
$ comp_lanczos_filter_3d \
-f=mslp.monthly.mean_ncep2.nc \
-v=mslp \
-m=mesh_mask_mslp_ncep2.nc \
-pl=18 \
-ph=30 \
-o=tbo_mslp_ncep2.nc