comp_lanczos_filter_1d

Authors

Pascal Terray (LOCEAN/IPSL)

Latest revision

29/05/2024

Purpose

Filter a real time series in a selected frequency band by Lanczos filtering [Bloomfield] [Duchon]. The time series is extracted from a uni- or bidimensional variable read from a NetCDF dataset and can optionally be detrended before Lanczos filtering.

The user can select the number of coefficients used to build the Lanczos filter and can choose whether the filter is applied to the time series in the time domain or in the frequency domain. This gives the user some control over the end effects of the filter: applying the filter in the frequency domain implicitly assumes that the time series is part of an infinite periodic series whose period is exactly equal to the length of the analyzed time series, whereas applying the filter in the time domain implies some loss of data, or some distortion of the desired response function of the filter, at both ends of the filtered time series.
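The difference between the two modes can be sketched in Python with NumPy. This is only an illustration, not the NCSTAT implementation: the function names are hypothetical, and a 5-term running mean stands in for the Lanczos weights. The time-domain version keeps only the points where the full window fits, while the FFT version performs a circular convolution, so the values at the ends are computed from the opposite end of the series.

```python
import numpy as np

def filter_time_domain(x, w):
    """Apply symmetric weights w (odd length K) by direct convolution.

    Only points where the full window fits are returned, so (K-1)/2
    values are lost at each end of the series.
    """
    return np.convolve(x, w, mode="valid")

def filter_frequency_domain(x, w):
    """Apply the same weights by FFT multiplication (circular convolution).

    This implicitly treats x as one period of an infinite periodic series.
    """
    n, k = len(x), len(w)
    half = (k - 1) // 2
    # Wrap the centred weights onto a length-n circle: w[half] goes to index 0.
    kernel = np.zeros(n)
    kernel[:half + 1] = w[half:]
    kernel[n - half:] = w[:half]
    return np.fft.irfft(np.fft.rfft(x) * np.fft.rfft(kernel), n)

# Hypothetical test series: an annual cycle plus a linear trend, 48 time steps.
x = np.sin(2 * np.pi * np.arange(48) / 12.0) + 0.1 * np.arange(48)
w = np.ones(5) / 5.0  # stand-in symmetric weights, not Lanczos weights

y_time = filter_time_domain(x, w)       # 44 values: 2 lost at each end
y_freq = filter_frequency_domain(x, w)  # 48 values: ends wrap around
```

Away from the ends the two results agree; they differ only in how the first and last (K-1)/2 points are handled.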

Additionally, the filtering can be done separately for different segments of equal length of the selected time series if this time series is not continuous in time.
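Segment-wise filtering can be pictured with the following Python sketch. It assumes NumPy; the function name is hypothetical, and marking the end points of each segment as NaN is a choice made for this illustration only, not a description of what comp_lanczos_filter_1d writes to the output file. The point is that no weight ever straddles the boundary between two segments.

```python
import numpy as np

def filter_by_segments(x, weights, periodicity):
    """Filter each consecutive segment of length `periodicity` on its own."""
    x = np.asarray(x, dtype=float)
    if len(x) % periodicity != 0:
        raise ValueError("series length must be a multiple of the periodicity")
    if len(weights) > periodicity:
        raise ValueError("the filter must not be longer than one segment")
    half = (len(weights) - 1) // 2
    out = np.full(x.shape, np.nan)
    for start in range(0, len(x), periodicity):
        seg = x[start:start + periodicity]
        # Interior points only; the (K-1)/2 points at each segment end stay NaN.
        out[start + half:start + periodicity - half] = np.convolve(
            seg, weights, mode="valid"
        )
    return out

# Two segments of 10 observations with very different levels.
x = np.concatenate([np.zeros(10), np.full(10, 10.0)])
y = filter_by_segments(x, np.ones(3) / 3.0, periodicity=10)
```

With a single continuous filter the jump between the two segments would be smeared across the boundary; here each segment keeps its own level.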

The frequency response function (i.e., the transfer function) of the selected Lanczos filter can be computed by comp_freq_func_1d. See the references cited below for more details on Lanczos filtering [Bloomfield] [Duchon].
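For orientation, the weights and the frequency response can be sketched in Python with NumPy. The weights below use the usual textbook form of the Lanczos low-pass filter (ideal low-pass weights multiplied by a sigma factor), and the band-pass filter is taken as the difference of two low-pass filters. This is an assumption made for illustration; the function names are hypothetical, and the exact coefficients computed by NCSTAT may differ.

```python
import numpy as np

def lanczos_lowpass_weights(fc, nfc):
    """Low-pass Lanczos weights for cutoff fc (cycles per time step), nfc odd >= 3."""
    if nfc < 3 or nfc % 2 == 0:
        raise ValueError("nfc must be odd and >= 3")
    n = (nfc - 1) // 2
    k = np.arange(-n, n + 1)
    # np.sinc(x) = sin(pi x) / (pi x): ideal low-pass weights times the sigma factor.
    return 2.0 * fc * np.sinc(2.0 * fc * k) * np.sinc(k / n)

def lanczos_bandpass_weights(pl, ph, nfc):
    """Band-pass weights keeping periods between pl and ph (in time steps)."""
    return lanczos_lowpass_weights(1.0 / pl, nfc) - lanczos_lowpass_weights(1.0 / ph, nfc)

def frequency_response(weights, freqs):
    """Response of a symmetric filter: H(f) = w_0 + 2 * sum_k w_k cos(2 pi f k)."""
    n = (len(weights) - 1) // 2
    k = np.arange(1, n + 1)
    w0, wk = weights[n], weights[n + 1:]
    return w0 + 2.0 * np.cos(2.0 * np.pi * np.outer(freqs, k)) @ wk

# Band 18-30 time steps with 121 coefficients (the value 121 is arbitrary).
w = lanczos_bandpass_weights(pl=18, ph=30, nfc=121)
freqs = np.array([0.0, 1.0 / 24.0, 0.25])  # below, inside and above the band
print(frequency_response(w, freqs))
```

The response should be close to one inside the band and close to zero well outside it. Reducing nfc widens the transition zones around the two cutoff frequencies.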

This procedure returns the filtered real time series in a NetCDF dataset. If the NetCDF variable is tri- or four-dimensional, use comp_lanczos_filter_3d or comp_lanczos_filter_4d, respectively, instead of comp_lanczos_filter_1d. If the time series has a seasonal (or diurnal) cycle, use comp_stl_1d to estimate and remove the harmonic components of the time series before using comp_lanczos_filter_1d.

If you need more control over the filtering parameters, including other windows, use comp_symlin_filter_1d instead of comp_lanczos_filter_1d.

Further Details

Usage

$ comp_lanczos_filter_1d \
  -f=input_netcdf_file \
  -v=netcdf_variable \
  -t=time1,time2                           (optional) \
  -o=output_netcdf_file                    (optional) \
  -ni=index_for_2d_netcdf_variable         (optional) \
  -p=periodicity                           (optional) \
  -pl=minimum_period                       (optional) \
  -ph=maximum_period                       (optional) \
  -tr=trend_removal                        (optional : 0, 1, 2, 3, -1, -2, -3) \
  -nfc=number_of_filter_coefficients       (optional) \
  -mi=missing_value                        (optional) \
  -notestf                                 (optional) \
  -usefft                                  (optional) \
  -double                                  (optional) \
  -hdf5                                    (optional) \
  -tlimited                                (optional)

By default

-t=
the whole time period associated with the netcdf_variable
-o=
the output_netcdf_file is named filt_netcdf_variable.nc
-ni=
if the netcdf_variable is bidimensional, the first time series is used
-p=
the periodicity is set to time2 - time1 + 1 , which means that the time series is considered as continuous with only one time segment
-pl=
the minimum_period is set to 0, which means that no filtering is done for the shorter periods
-ph=
the maximum_period is set to 0, which means that no filtering is done for the longer periods
-tr=0
the trend_removal is set to 0, which means that no detrending is done before filtering
-nfc=
the number_of_filter_coefficients is determined in order to optimize the frequency response function of the selected filter
-mi=
the missing_value for the output variable is equal to 1.e+20
-notestf
normally, the value of the -nfc= argument must be chosen such that the full transition bands about the cutoff frequencies 1/PH and 1/PL (where PH and PL are the values of the -ph= and -pl= arguments, respectively) of the selected filter are inside the [0., 0.5] frequency interval. By using the -notestf argument you can get rid of this limitation
-usefft
the Lanczos filter is applied in the time domain. When you specify the -usefft argument the filter will be applied in the frequency domain, using an FFT algorithm and multiplication, instead of a convolution in the time domain
-double
the results are stored as single-precision floating point numbers in the output NetCDF file. If -double is activated, the results are stored as double-precision floating point numbers
-hdf5
a NetCDF classical format file is created. If -hdf5 is activated, the output NetCDF file is a NetCDF-4/HDF5 format file
-tlimited
the time dimension is defined as unlimited in the output NetCDF file. However, if -tlimited is activated, the time dimension is defined as limited in the output NetCDF file

Remarks

  1. The -v=netcdf_variable argument specifies the NetCDF variable for which a Lanczos filtering operation must be computed and the -f=input_netcdf_file argument specifies that this NetCDF variable must be extracted from the NetCDF file input_netcdf_file.

  2. If the -t=time1,time2 argument is missing, the whole time period associated with the netcdf_variable is used for the filtering.

    The selected time period is a vector of two integers specifying the first and last time observations. The indices are relative to 1. Note that the output NetCDF file will have ntime = time2 - time1 + 1 time observations.

  3. The -ni=index_for_2d_netcdf_variable argument specifies the index for selecting the time series if the netcdf_variable is a 2D NetCDF variable. By default, the first time series is used, which is equivalent to setting index_for_2d_netcdf_variable to 1.

  4. If the -p= argument is specified, the filtering is applied separately to each time segment of length periodicity (as determined by the value of the -p= argument). The length of the whole selected time period (i.e., time2 - time1 + 1) must also be a multiple of the periodicity.

  5. The -pl= argument specifies the minimum period of oscillation of the filtered time series. The minimum_period is expressed in number of time observations.

    To high-pass filter the time series, that is, to keep only the periods shorter than PH (the value of the -ph= argument), omit the -pl= argument or use -pl=0.

    The -pl= argument must be an integer, either equal to 0 or greater than 2.

  6. The -ph= argument specifies the maximum period of oscillation of the filtered time series. The maximum_period is expressed in number of time observations. To low-pass filter the time series, that is, to keep only the periods longer than PL (the value of the -pl= argument), omit the -ph= argument or use -ph=0. For example, -pl=6 (or 18) and -ph=32 (or 96) select periods between 1.5 and 8 years for quarterly (monthly) time series.

    The -ph= argument must be an integer, either equal to 0 or greater than 2, and less than the length of the time series (or less than the periodicity if the -p= argument is used).

    The -ph= argument must also be greater than or equal to the -pl= argument if both are specified.

  7. Setting -pl= and -ph= to the same value P is allowed. In this case, an -ideal- band-pass filter with peak response near one at the single period P is computed and applied to the time series.

  8. Setting both -pl=0 and -ph=0 is also allowed. In that case, no frequency filtering is done, but the data may be detrended if the -tr= argument is used with a value of 1, 2 or 3.

  9. The -tr= argument specifies pre- and post-filtering processing of the time series. If:

    • -tr=+/-1, the mean of the time series is removed before time filtering
    • -tr=+/-2, the drift from the time series is removed before time filtering. The drift for the time series is estimated using the formula : drift = ( tseries(ntime) - tseries(1) )/( ntime - 1 )
    • -tr=+/-3, the least-squares line from the time series is removed before time filtering.

    If -tr=-1, -2 or -3, the mean, drift or least-squares line are reintroduced post-filtering, respectively.

    For other values of the -tr= argument, nothing is done before or after filtering.

    If the -p= argument is present, the pre-filtering and post-filtering processing is applied to each time segment, separately.

    The -tr= argument must be an integer and the default value for the -tr= argument is 0.

  10. The -nfc= argument specifies the desired number of symmetric linear filter coefficients for the filtering of the time series. If -nfc= is not specified, an optimal value is chosen in order to obtain a good frequency response function for the selected filter.

    However, if -nfc= is set to K, the first and last (K-1)/2 time observations in the output NetCDF file will be affected by end effects. Thus, the user must choose the number of filter terms, K, as a compromise between:

    1. A sharp cutoff, that is, 1/K small; and
    2. Minimizing the number of data points lost or affected by the filtering operation (since (K-1)/2 data points will be lost or affected from each end of the series).

    Finally, the -nfc= argument must be odd and greater than or equal to 3.

  11. The -mi=missing_value argument specifies the missing value indicator for the output variable in the output_netcdf_file.

    If the -mi= argument is not specified, the missing_value is set to 1.e+20.

  12. The -notestf argument bypasses some of the restrictions on the number of filter coefficients specified with the -nfc= argument.

    By default, the value of the -nfc= argument must be chosen such that the full transition bands about the cutoff frequencies (1/PH and 1/PL) of the selected filter are inside the [0., 0.5] frequency interval.

    When the -notestf argument is specified, only the cutoff frequencies (i.e., 1/PH and 1/PL) of the selected filter must lie in the [0., 0.5] frequency interval and not the full transition bands around them.

    This makes it possible to reduce the number of filter coefficients and, thus, to minimize the number of data points lost by the filtering operation (if -nfc= is set to K, (K-1)/2 data points will be “lost” or affected by end effects from each end of the series).

  13. The -usefft argument specifies that the filter must be applied in the frequency domain by using a Fast Fourier Transform and the convolution theorem. Moreover, if the -usefft argument is specified, the values at the ends of the output filtered series are computed implicitly by assuming that the input series is part of a periodic sequence.

  14. The -double argument specifies that the results are stored as double-precision floating point numbers in the output NetCDF file.

    By default, the results are stored as single-precision floating point numbers in the output NetCDF file.

  15. The -hdf5 argument is allowed only if the NCSTAT software has been compiled with the _USE_NETCDF4 CPP macro (e.g., -D_USE_NETCDF4) and linked to the NetCDF 4 library or higher.

    If this argument is specified, the output_netcdf_file will be a NetCDF-4/HDF5 format file instead of a NetCDF classic format file. However, this argument is recognized in the procedure only if the NCSTAT software has been built with the _USE_NETCDF4 CPP macro.

  16. If the time series has a seasonal or diurnal cycle, use comp_stl_1d to remove the pure harmonic components from the time series before filtering.

  17. It is assumed that the data has no missing values.

  18. Duplicate parameters are allowed, but only the last occurrence of a parameter is used for the computations. Moreover, the number of specified parameters must not be greater than the total number of allowed parameters.

  19. For more details on Lanczos filtering and examples of use in the climate literature, see the references cited above [Bloomfield] [Duchon].
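The pre- and post-filtering processing selected by the -tr= argument (remark 9) can be sketched in Python with NumPy. The function names are hypothetical, and the sketch assumes that the drift option subtracts drift * t with t counted from the first observation, which the text above does not state explicitly. The filter itself is passed in as a function so that the sketch stays independent of how the filtering is done.

```python
import numpy as np

def remove_trend(x, tr):
    """Return (residual, removed) for |tr| in {1, 2, 3}; otherwise remove nothing."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    mode = abs(tr)
    if mode == 1:
        # Mean of the series.
        removed = np.full(len(x), x.mean())
    elif mode == 2:
        # drift = ( tseries(ntime) - tseries(1) ) / ( ntime - 1 )
        drift = (x[-1] - x[0]) / (len(x) - 1)
        removed = drift * t  # assumption: drift counted from the first observation
    elif mode == 3:
        # Least-squares line.
        slope, intercept = np.polyfit(t, x, 1)
        removed = slope * t + intercept
    else:
        removed = np.zeros(len(x))
    return x - removed, removed

def filter_with_trend_option(x, tr, apply_filter):
    """Detrend, filter, then add the removed part back when tr is negative."""
    residual, removed = remove_trend(x, tr)
    y = apply_filter(residual)
    return y + removed if tr < 0 else y

# A pure linear series: the least-squares line accounts for all of it.
x = 3.0 + 0.5 * np.arange(10)
residual, removed = remove_trend(x, 3)
```

When the -p= argument is used, the same steps would be applied to each time segment separately.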

Outputs

comp_lanczos_filter_1d creates an output NetCDF file that contains the filtered time series estimated from the time series associated with the input NetCDF variable. The output NetCDF dataset contains the following NetCDF variable (in the description below, ntime is the selected number of time observations) :

  1. netcdf_variable_filt(ntime) : the filtered time series for the time series associated with the input NetCDF variable.

Examples

  1. To band-pass filter a real monthly time series between 18 and 30 months (i.e., the biennial time scale), taken from the NetCDF variable sst in the file sst.monthly.nino34.nc, and store the results in the NetCDF file qbo_sst_nino34.nc, use the following command:

    $ comp_lanczos_filter_1d \
      -f=sst.monthly.nino34.nc \
      -v=sst \
      -pl=18 \
      -ph=30 \
      -o=qbo_sst_nino34.nc
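
The -pl=18 and -ph=30 values used above correspond to cutoff frequencies of 1/30 and 1/18 cycles per month, both inside the [0., 0.5] frequency interval. A small Python helper (hypothetical, not part of NCSTAT) makes the conversion explicit, treating 0 as "no limit on that side" as for the -pl= and -ph= defaults:

```python
def cutoff_frequencies(pl, ph):
    """Convert periods (in time steps) to cutoff frequencies in cycles per time step.

    A value of 0 means that no limit is set on that side of the band.
    """
    f_low = 1.0 / ph if ph > 0 else 0.0   # lowest frequency kept
    f_high = 1.0 / pl if pl > 0 else 0.5  # highest frequency kept (0.5 = Nyquist)
    return f_low, f_high

f_low, f_high = cutoff_frequencies(pl=18, ph=30)
print(f"pass band: {f_low:.4f} to {f_high:.4f} cycles per month")
# prints "pass band: 0.0333 to 0.0556 cycles per month"
```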
    