Installation

Basic installation

This section provides a step-by-step procedure for installing the NCSTAT software.

Before compiling the NCSTAT distribution, you must create the NetCDF and STATPACK library (archive) files since NCSTAT depends on NetCDF and STATPACK. Beware that default command-line flags may not be sufficient for compiling the source code of these two libraries, especially the NetCDF library. In order to be compatible with NCSTAT, you may have to use the same (or at least compatible) command-line flags for compiling these two libraries as you will use later for compiling NCSTAT.

Once this is done, follow the steps below on Linux/UNIX systems:

  1. Download the latest NCSTAT distribution from the NCSTAT website.

    For example, let us call this package NCSTAT2.tar.gz.

  2. Put the file in a directory of your choice, such as $HOME or, if you have root privileges, /opt.

  3. Execute the UNIX command:

    $ tar -xzvf NCSTAT2.tar.gz
    

    to unpack the archive. In what follows, <NCSTAT directory> denotes the package’s top directory after extraction. For example, it could be $HOME/NCSTAT2 or /opt/NCSTAT2.

    Setting the NCSTATDIR shell environment variable to the path of the NCSTAT top directory is optional but recommended:

    Defining the Shell environment variable NCSTATDIR

    Shell      Command line
    ---------  ------------------------------------
    csh/tcsh   setenv NCSTATDIR <NCSTAT directory>
    sh/bash    export NCSTATDIR=<NCSTAT directory>

    One of these commands can be placed in the appropriate shell startup file in $HOME (e.g. .bashrc or .cshrc).

    This directory, <NCSTAT directory>, contains the following subdirectories and associated files:

    Main NCSTAT directory

    File/subdirectory  Content
    -----------------  -------------------------------------------------------
    makefile           Generic Makefile
    make.inc           User specification options for makefile
    LICENSE            NCSTAT License file
    README             README file
    Changelog.org      Change log file
    doc                html, pdf and man NCSTAT documentation
    makeincs           Template make.inc files for various compilers/platforms
    sources            NCSTAT source code

  4. To prepare the compilation, go to the $NCSTATDIR directory:

    $ cd $NCSTATDIR
    

    and edit the make.inc file in this directory, following the directions it contains to set:

    • the absolute path of the $NCSTATDIR directory (TOPDIR);
    • the name/path of the Fortran90/95 compiler and the associated compilation/loader options (FC, FLAGS and LDFLAGS);
    • the name/path of your NetCDF, STATPACK and, if needed, BLAS libraries (NETCDF, STATPACK and LBLAS);
    • the name/path of the directory for the NCSTAT executables (EXECDIR).

    Alternatively, you can look at the make.inc examples in the subdirectory $NCSTATDIR/makeincs and if one of them matches your compiler/platform, use this file as a template make.inc to build your own make.inc.

    This can be done:

    • manually, by overwriting the make.inc file in $NCSTATDIR with the file you selected in $NCSTATDIR/makeincs;

    • by executing the command:

      $ make
      

      in the $NCSTATDIR directory, selecting the name for your architecture/compiler in the list printed on the screen and, then, executing the command:

      $ make <arch>
      

      where <arch> is the selected name for your architecture/compiler. These steps will also overwrite the make.inc file in $NCSTATDIR with the file you selected in $NCSTATDIR/makeincs.

    After these steps, you still need to customize this new make.inc file at least to provide:

    • the name/path of your NetCDF, STATPACK and, if needed, BLAS libraries (NETCDF, STATPACK and LBLAS);
    • the path of the directory for the NCSTAT executables (EXECDIR).

    Two loader options are typically used for linking the object code of the NetCDF, STATPACK and, if needed, BLAS libraries with NCSTAT executables:

    • -lname causes the compiler to look for a library file named libname.a and to link the NCSTAT executables to this library. To find this library file, the compiler searches sequentially through any directories named with the -L option explained below;
    • -Ldir adds the directory dir to the list of directories searched for libraries named with the -l option, ahead of the standard library directories /lib and /usr/lib.

    Note that your compiler may have other options for specifying libraries, particularly if your UNIX system supports shared libraries and you want to use shared versions of the NetCDF, STATPACK and, if needed, BLAS libraries when creating NCSTAT executables.

    Remember also that UNIX linkers search libraries in the order in which they appear on the command line and resolve only the references that are outstanding at the time each library is searched. The order of libraries and source/object files specified in LDFLAGS can therefore be critical. It is almost always a good idea to list the STATPACK library first, followed by the NetCDF and BLAS libraries (or other libraries), in LDFLAGS when compiling and linking NCSTAT executables, in order to avoid “Undefined” symbol messages during the loading or execution of an NCSTAT executable.

    Moreover, if NCSTAT is built with OpenMP support, many NCSTAT operators will be multi-threaded, and the NetCDF, STATPACK and, if used, BLAS libraries linked to NCSTAT must be compiled thread-safe, as far as possible, to avoid unexpected errors when the NCSTAT operators are executed. A simple way to achieve this is often to compile these libraries with OpenMP support. See the section OpenMP compilation for more details.

  5. To compile and create the NCSTAT executables, execute the make command:

    $ make all
    

    in the $NCSTATDIR directory.

    If no errors are generated during this last step, NCSTAT is installed on your computer and the NCSTAT executables are in the directory that you specified with the EXECDIR variable in your $NCSTATDIR/make.inc file.
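
The link-order advice given in step 4 can be illustrated with a small shell sketch. It is not taken from the NCSTAT distribution: the function name build_ldflags, the library names (libstatpack.a, libnetcdff.a, libnetcdf.a and libblas.a) and the directories are assumptions chosen for illustration, so adapt them to the way the libraries are actually installed on your system.

```shell
#!/bin/sh
# Sketch only: assemble -L/-l loader options in the recommended order,
# STATPACK first, then NetCDF, then (optionally) BLAS.
# Library names and directories are illustrative assumptions.

# Usage: build_ldflags STATPACK_DIR NETCDF_DIR [BLAS_DIR]
build_ldflags() {
    bl_statpack_dir=$1
    bl_netcdf_dir=$2
    bl_blas_dir=${3:-}

    # STATPACK comes first so that the symbols it leaves unresolved are
    # still outstanding when the linker reaches the libraries listed after it.
    bl_flags="-L${bl_statpack_dir} -lstatpack"

    # NetCDF next; the Fortran interface library is assumed to be
    # libnetcdff and is listed before the C library it relies on.
    bl_flags="${bl_flags} -L${bl_netcdf_dir} -lnetcdff -lnetcdf"

    # BLAS last, and only when a directory was given.
    if [ -n "${bl_blas_dir}" ]; then
        bl_flags="${bl_flags} -L${bl_blas_dir} -lblas"
    fi

    printf '%s\n' "${bl_flags}"
}

build_ldflags /opt/statpack/lib /opt/netcdf/lib /opt/blas/lib
# prints: -L/opt/statpack/lib -lstatpack -L/opt/netcdf/lib -lnetcdff -lnetcdf -L/opt/blas/lib -lblas
```

The printed string only shows the ordering. Where each part belongs in make.inc (LDFLAGS, NETCDF, STATPACK or LBLAS) depends on the directions given in the make.inc template you started from.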

More details on the available commands for compiling and managing NCSTAT code can be found in the headers of the makefiles $NCSTATDIR/makefile and $NCSTATDIR/sources/makefile.

Note, finally, that if you want to change the precision (i.e. single, double or extended precision) of the computations performed in the NCSTAT operators, you must first recompile your STATPACK library with the desired precision. This is because NCSTAT directly uses the standard kind type for real numbers defined in STATPACK (i.e. the parameterized stnd type) for all definitions/allocations of real variables and arrays in the NCSTAT code. Once the STATPACK library has been recompiled, you must then recompile and link NCSTAT with this new version of the STATPACK library as described above.

The following subsections provide more details on how to activate OpenMP support when compiling NCSTAT, and on the UNIX preprocessor cpp macros, which can be used to compile/optimize NCSTAT or solve some compilation problems.

OpenMP compilation

In order to activate OpenMP parallelism in the NCSTAT operators, all compilers require you to use an appropriate compiler flag to turn on OpenMP compilation.

The table below shows what to use for several well-known Fortran compilers:

OpenMP compilation flags

Compiler  Compiler commands            OpenMP flag
--------  ---------------------------  -------------------
Intel     ifort                        -openmp or -qopenmp
GNU       gfortran                     -fopenmp
PGI       pgfortran, pgf95, pgf90      -mp
NAG       nagfor                       -openmp
IBM XL    xlf90_r, xlf95_r, xlf2003_r  -qsmp=omp

Additional information on OpenMP support provided by a large range of current Fortran compilers can be found at https://www.openmp.org/resources/openmp-compilers-tools/ . You will also find several examples of how to activate OpenMP compilation for various compilers/platforms in the template make.inc files under the subdirectory $NCSTATDIR/makeincs.
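
The table above can also be expressed as a small lookup helper, for instance in a script that prepares compiler options before make.inc is edited. This is only a sketch: the function name openmp_flag is hypothetical, and for ifort it returns -qopenmp, the spelling used by recent Intel releases, whereas older releases expect -openmp.

```shell
#!/bin/sh
# Sketch only: map a Fortran compiler command to its OpenMP flag,
# following the table "OpenMP compilation flags".

# Usage: openmp_flag COMPILER_COMMAND   (a full path is accepted)
openmp_flag() {
    # Keep only the command name if a path such as /usr/bin/gfortran is given.
    of_compiler=${1##*/}

    case "${of_compiler}" in
        ifort)                      echo "-qopenmp" ;;   # older releases: -openmp
        gfortran)                   echo "-fopenmp" ;;
        pgfortran|pgf95|pgf90)      echo "-mp" ;;
        nagfor)                     echo "-openmp" ;;
        xlf90_r|xlf95_r|xlf2003_r)  echo "-qsmp=omp" ;;
        *)
            echo "openmp_flag: no entry for '${of_compiler}'" >&2
            return 1
            ;;
    esac
}

openmp_flag gfortran
# prints: -fopenmp
```

Compilers that are not listed in the table make the helper fail with a non-zero status rather than guess a flag.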

How to activate parallelism when executing the NCSTAT operators compiled with OpenMP support is described below in the section Parallel execution.

The following NCSTAT operators are parallelized if NCSTAT has been built with OpenMP support and if the preprocessor cpp macro _PARALLEL_READ has been defined during compilation:

  • comp_clim_3d, comp_clim_4d, comp_clim_miss_3d, comp_clim_miss_4d, comp_stat_3d, comp_stat_4d, comp_stat_miss_3d, which compute univariate statistics from a NetCDF variable;
  • comp_serie_3d, comp_serie_4d, comp_serie_miss_3d, comp_section_3d, comp_section_4d, comp_section_miss_3d, which compute time series and cross-sections from a NetCDF variable;
  • comp_lanczos_filter_3d, comp_lanczos_filter_4d, comp_symlin_filter_3d, comp_symlin_filter_4d, which filter time series from a NetCDF variable;
  • comp_stl_3d, comp_stl_4d, comp_trend_3d, comp_trend_4d, which decompose time series from a NetCDF variable;
  • comp_composite_3d, comp_composite_4d, which perform composite analysis on a NetCDF variable;
  • comp_cor_1d, comp_cor_3d, comp_cor_4d, comp_cor_miss_3d, comp_reg_1d, comp_reg_3d, comp_reg_4d, which compute correlation and regression from two NetCDF variables;
  • comp_eof_3d, comp_eof_4d, comp_eof_miss_3d, comp_svd_3d, comp_project_eof_3d, comp_project_eof_4d, which compute multivariate statistics from one or two NetCDF variables.

Note that some NCSTAT operators are still not parallelized, because they are I/O-bound (meaning that most of the time is spent reading and writing the data) and shared-memory parallel writing of NetCDF files with OpenMP is not (yet) implemented in the NCSTAT source code.

Preprocessor cpp macros

The NCSTAT software uses the standard UNIX preprocessor, cpp, in order to allow some flexibility in the compilation of the NCSTAT operators. The cpp preprocessor is only used for conditional compilation of some parts of the NCSTAT source code at the user's option. This is typically done by defining some cpp macros (i.e. variables governing conditional compilation in the NCSTAT source files) at the compilation step of NCSTAT, usually by specifying -Dname as a compilation option, where name is a cpp macro. Note that there is no space between -D and name. Each occurrence of -D defines a single macro, and the -D option can appear many times on a command line. Please note that your compiler may have other options for specifying cpp macros (this is, for example, the case for the IBM XL Fortran compiler on IBM UNIX-like systems).
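
As an illustration of the -Dname convention, the sketch below turns a list of macro names into the corresponding compilation options, one -D per macro and with no space after -D. The function name cpp_defines is hypothetical, and the macro names used in the example call are taken from the list given later in this section.

```shell
#!/bin/sh
# Sketch only: build one -Dname option per macro name given as argument.

# Usage: cpp_defines NAME [NAME ...]
cpp_defines() {
    cd_defs=""
    for cd_name in "$@"; do
        # No space between -D and the macro name; options are
        # separated from each other by a single space.
        cd_defs="${cd_defs:+${cd_defs} }-D${cd_name}"
    done
    printf '%s\n' "${cd_defs}"
}

cpp_defines _USE_NETCDF4 _PARALLEL_READ _BLAS
# prints: -D_USE_NETCDF4 -D_PARALLEL_READ -D_BLAS
```

The resulting string would typically be added to the compilation options in make.inc. Compilers that use a different syntax for preprocessor macros, such as the IBM XL Fortran compiler mentioned above, need their own form.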

The following preprocessor cpp macros are currently used in the NCSTAT source code and can be defined at compilation of NCSTAT software:

  • _USE_NETCDF36 lets you create 64-bit offset format files instead of NetCDF classic format files on output of the NCSTAT operators, if the NCSTAT software has been linked to version 3.6 or higher of the NetCDF library. When the cpp macro _USE_NETCDF36 is defined at compilation and your NetCDF library is version 3.6 or higher, many NCSTAT operators recognize the command line option -bigfile, which tells these operators to produce NetCDF 64-bit offset format files instead of NetCDF classic format files. The use of the cpp macro _USE_NETCDF36 is recommended whenever your NetCDF library is version 3.6 or higher, since this allows the processing of huge NetCDF files, commonly produced as climate model outputs.
  • _USE_NETCDF4 lets you create NetCDF-4/HDF5 format files instead of NetCDF classic format files on output of the NCSTAT operators, if the NCSTAT software has been linked to version 4 or higher of the NetCDF library. When the cpp macro _USE_NETCDF4 is defined at compilation and your NetCDF library is version 4 or higher, many NCSTAT operators recognize the command line option -hdf5, which tells these operators to produce NetCDF-4/HDF5 format files instead of NetCDF classic format files. The use of the cpp macro _USE_NETCDF4 is recommended whenever your NetCDF library is version 4 or higher, since this allows the processing of huge NetCDF files, commonly produced as climate model outputs. Note that the cpp macro _USE_NETCDF36 is automatically defined when the cpp macro _USE_NETCDF4 is defined.
  • _USE_NAGWARE lets you compile the NCSTAT software with the NAG Fortran compiler. The UNIX system subroutine and function, getarg() and iargc(), which are currently used by NCSTAT operators to process their command line arguments, are normally external procedures without any explicit Fortran90 interfaces. However, these two procedures are part of the f90_unix_env Fortran90 module when the NAG Fortran compiler is used. The cpp macro _USE_NAGWARE takes care of this difference. Do not use the cpp macro _USE_NAGWARE with other Fortran compilers, since this will generate compilation errors.
  • _WHERE replaces some Fortran90 where constructs by do loops in the source code when OpenMP is used. This cpp macro is useful for activating OpenMP with some Fortran compilers, like the Intel ifort compiler, which restrict the Fortran90 statements that can be used inside an OpenMP construct.
  • _BLAS lets you activate the use of an optimized and multithreaded BLAS library [blas] inside NCSTAT as described in the section Parallelism. Note that the name and path of this BLAS library must also be specified with the help of compiler/loader options on the command line as described in the section Basic installation.
  • _TRANSPOSE replaces each instance of the Fortran90 intrinsic function, transpose(), in the source code by the corresponding STATPACK function, transpose2(), which is multithreaded when OpenMP is used. Use the cpp macro _TRANSPOSE if you suspect that the intrinsic Fortran90 functions of your Fortran compiler are not optimized or efficient.
  • _MATMUL replaces each instance of the Fortran90 intrinsic function, matmul(), in the source code by the corresponding STATPACK function, matmul2(), which is multithreaded when OpenMP is used. Use the cpp macro _MATMUL if you suspect that the intrinsic Fortran90 functions of your Fortran compiler are not optimized or efficient. If the cpp macro _BLAS is also defined, the BLAS subroutine Xgemm() will be used instead of an OpenMP multithreaded version of matmul().
  • _PARALLEL_READ lets you activate parallel reading of NetCDF files based on the OpenMP standard [openmp] as described in the section Parallelism, if the NCSTAT source code has been compiled with OpenMP support as described in the section OpenMP compilation. If OpenMP compilation has not been activated, this cpp macro has no effect. Finally, if multithreaded versions of the NCSTAT operators are not working properly on your machine, leaving the cpp macro _PARALLEL_READ undefined at compilation is a good choice, since you will still benefit from the parallelism of the STATPACK library.

Examples of use of these preprocessor cpp macros for the compilation of NCSTAT can be found in the template make.inc files under the subdirectory $NCSTATDIR/makeincs.
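
The version conditions attached to _USE_NETCDF36 and _USE_NETCDF4 can be summarized by the following sketch, which picks the macro to define from a NetCDF version string. The function name netcdf_macro is hypothetical, and the version string is assumed to have the form major.minor[.patch]. How you obtain it (for example from the output of nc-config --version, when that utility is installed) is left to you.

```shell
#!/bin/sh
# Sketch only: choose the NetCDF-related cpp macro from a version string
# of the assumed form major.minor[.patch].

# Usage: netcdf_macro VERSION
netcdf_macro() {
    nm_version=$1
    nm_major=${nm_version%%.*}      # text before the first dot
    nm_rest=${nm_version#*.}        # text after the first dot
    nm_minor=${nm_rest%%.*}         # text before the next dot

    # Reject anything that is not made of plain digits.
    case "${nm_major}" in ''|*[!0-9]*) echo "netcdf_macro: bad version '${nm_version}'" >&2; return 1 ;; esac
    case "${nm_minor}" in ''|*[!0-9]*) echo "netcdf_macro: bad version '${nm_version}'" >&2; return 1 ;; esac

    if [ "${nm_major}" -ge 4 ]; then
        # _USE_NETCDF4 also defines _USE_NETCDF36 automatically.
        echo "_USE_NETCDF4"
    elif [ "${nm_major}" -eq 3 ] && [ "${nm_minor}" -ge 6 ]; then
        echo "_USE_NETCDF36"
    fi
    # Older libraries: print nothing, neither macro applies.
}

netcdf_macro 3.6.3
# prints: _USE_NETCDF36
```

For a library older than 3.6 the function prints nothing, which corresponds to compiling NCSTAT without either macro.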