Module_Num_Constants

Copyright 2024 IRD

This file is part of statpack.

statpack is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

statpack is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You can find a copy of the GNU Lesser General Public License in the statpack/doc directory.

Authors: Pascal Terray (LOCEAN/IPSL, Paris, France)


THIS MODULE PROVIDES SIMPLE NAMES AND ROUTINES FOR THE VARIOUS MACHINE DEPENDENT CONSTANTS. ALL ARE FOR PRECISION ‘stnd’.

LATEST REVISION : 25/02/2024


function lamch ( cmach )

Purpose

LAMCH determines machine parameters for precision STND.

Arguments

CMACH (INPUT) character*1

Specifies the value to be returned by LAMCH. If:

  • CMACH = ‘S’ or ‘s’, LAMCH := sfmin
  • CMACH = ‘T’ or ‘t’, LAMCH := t
  • CMACH = ‘R’ or ‘r’, LAMCH := rnd
  • CMACH = ‘G’ or ‘g’, LAMCH := grd
  • CMACH = ‘U’ or ‘u’, LAMCH := unitrnd
  • CMACH = ‘P’ or ‘p’, LAMCH := prec

where:

  • sfmin = safe minimum, such that 1/sfmin does not overflow.

  • t = number of (base) digits in the floating-point significand.

  • rnd = 0.0 when floating-point addition rounds upward, downward or toward zero;
    = 1.0 when floating-point addition rounds to nearest, but not in the IEEE style;
    = 2.0 when floating-point addition rounds in the IEEE style.

  • grd = 1.0 if floating-point arithmetic chops (rnd = 0.0) and more than t digits participate in the post-normalization shift of the floating-point significand in multiplication;
    = 0.0 otherwise.

  • unitrnd = unit roundoff of the machine, i.e., the maximum relative representation error of a real number in the range of the floating point numbers of kind STND.

  • prec = unitrnd*machbase, i.e., the relative spacing between consecutive floating point numbers in the range of the floating point numbers of kind STND (machbase being the base of the floating-point arithmetic).

Further Details

The routine is based on the routine DLAMCH in LAPACK77 (version 3). Note that the interface of DLAMCH has changed in more recent versions of LAPACK, which now use Fortran 90 intrinsic functions to obtain the values of the machine parameters.

For any other character, LAMCH returns the bit pattern corresponding to a quiet NaN.
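
As a rough illustration of how these quantities relate, the following Python sketch derives t, unitrnd, prec and sfmin for the interpreter's native float from the parameters in sys.float_info. It is not part of statpack and does not call LAMCH: the function name machine_parameters is hypothetical, the formula for unitrnd assumes round-to-nearest arithmetic, and the sfmin rule follows the one used in LAPACK's DLAMCH.

```python
import sys

def machine_parameters():
    """Derive a few LAMCH-style parameters for Python's native float (illustrative)."""
    info = sys.float_info
    base = info.radix              # plays the role of machbase
    t = info.mant_dig              # number of base digits in the significand
    # Unit roundoff, assuming round-to-nearest: half the spacing of numbers at 1.0.
    unitrnd = 0.5 * float(base) ** (1 - t)
    # Relative spacing between consecutive floating point numbers.
    prec = unitrnd * base
    # Safe minimum: start from the smallest normalized number and enlarge it
    # only if its reciprocal would overflow.
    sfmin = info.min
    small = 1.0 / info.max
    if small >= sfmin:
        sfmin = small * (1.0 + prec)
    return {"t": t, "unitrnd": unitrnd, "prec": prec, "sfmin": sfmin}

params = machine_parameters()
```

On an IEEE binary64 platform, prec computed this way should coincide with sys.float_info.epsilon, and 1/sfmin should remain finite.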

subroutine mach ( basedigits, irnd, iuflow, igrd, iexp, ifloat, expepspos, expepsneg, minexpbase, maxexpbase, epspos, epsneg, epsilpos, epsilneg, rndunit )

Purpose

This subroutine is intended to determine the parameters of the floating-point arithmetic system specified below.

Arguments

BASEDIGITS (OUTPUT, OPTIONAL) integer(i4b)
The number of base digits in the floating-point significand.
IRND (OUTPUT, OPTIONAL) integer(i4b)

A parameter indicating whether proper rounding or chopping (rounding upward, downward, toward zero) occurs in addition:

  • IRND = 0 if floating-point addition rounds upward, downward or toward zero;
  • IRND = 1 if floating-point addition rounds to nearest, but not in the IEEE style;
  • IRND = 2 if floating-point addition rounds in the IEEE style.
IUFLOW (OUTPUT, OPTIONAL) integer(i4b)

A parameter indicating whether underflow is full or partial:

  • IUFLOW = 0 if there is full underflow (flush to zero, etc);
  • IUFLOW = 1 if there is partial underflow.
IGRD (OUTPUT, OPTIONAL) integer(i4b)

The number of guard digits for multiplication with chopping arithmetic (IRND = 0):

  • IGRD = 0 if floating-point arithmetic rounds, or if it chops and only BASEDIGITS digits participate in the post-normalization shift of the floating-point significand in multiplication;
  • IGRD = 1 if floating-point arithmetic chops and more than BASEDIGITS digits participate in the post-normalization shift of the floating-point significand in multiplication.
IEXP (OUTPUT, OPTIONAL) integer(i4b)
A guess for the number of bits dedicated to the representation of the exponent of a floating point number if BASE is a power of two and -1 otherwise.
IFLOAT (OUTPUT, OPTIONAL) integer(i4b)
A guess for the number of bits dedicated to the representation of a floating point number if BASE is a power of two and -1 otherwise.
EXPEPSPOS (OUTPUT, OPTIONAL) integer(i4b)

The largest in magnitude negative integer such that

1.0 + float(base)**(expepspos) /= 1.
EXPEPSNEG (OUTPUT, OPTIONAL) integer(i4b)

The largest in magnitude negative integer such that

1.0 - float(base)**(expepsneg) /= 1.
MINEXPBASE (OUTPUT, OPTIONAL) integer(i4b)
The largest in magnitude negative integer such that float(base)**minexpbase is positive and normalized.
MAXEXPBASE (OUTPUT, OPTIONAL) integer(i4b)
The largest in magnitude positive integer such that float(base)**(maxexpbase) is positive and normalized.
EPSPOS (OUTPUT, OPTIONAL) real(stnd)
The smallest power of BASE whose sum with 1. is greater than 1. That is, float(base)**(expepspos).
EPSNEG (OUTPUT, OPTIONAL) real(stnd)
The smallest power of BASE whose difference with 1. is less than 1. That is, float(base)**(expepsneg).
EPSILPOS (OUTPUT, OPTIONAL) real(stnd)
The smallest positive floating point number whose sum with 1. is greater than 1.
EPSILNEG (OUTPUT, OPTIONAL) real(stnd)
The smallest positive floating point number whose difference with 1. is less than 1.
RNDUNIT (OUTPUT, OPTIONAL) real(stnd)
The unit roundoff of the machine, i.e., its machine epsilon.

Further Details

This subroutine is based on the routines MACHAR by Cody and DLAMCH in LAPACK77 (version 3). For further details, see:

  1. Malcolm M.A., 1972:
    Algorithms to reveal properties of floating-point arithmetic. Comms. of the ACM, 15, 949-951.
  2. Gentleman, W.M., and Marovich, S.B., 1974:
    More on algorithms that reveal properties of floating point arithmetic units. Comms. of the ACM, 17, 276-277.
  3. Cody, W.J., 1988:
    MACHAR: A subroutine to dynamically determine machine parameters. TOMS 14, No. 4, 303-311.
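
The probing idea behind these references can be sketched in a few lines. The Python functions below are illustrative only (their names are hypothetical and they are not the statpack implementation): the first recovers the base and the number of significand digits in the manner of Malcolm's algorithm, and the second searches for the exponents described under EXPEPSPOS and EXPEPSNEG. The sketch assumes that every intermediate result is rounded to the working precision, which may not hold on hardware that keeps extended-precision registers.

```python
def probe_base_and_digits():
    # Double a until adding 1.0 is no longer exact; a has then reached base**digits.
    a = 1.0
    while (a + 1.0) - a == 1.0:
        a *= 2.0
    # The first b that changes a exposes the spacing of numbers near a, i.e. the base.
    b = 1.0
    while (a + b) - a == 0.0:
        b *= 2.0
    base = int((a + b) - a)
    # Count the powers of the base for which a unit increment is still exact.
    digits, p = 0, 1.0
    while (p + 1.0) - p == 1.0:
        digits += 1
        p *= base
    return base, digits

def largest_negative_exponent(base, sign):
    # Most negative e for which 1.0 + sign * base**e still differs from 1.0.
    e = 0
    while 1.0 + sign * float(base) ** (e - 1) != 1.0:
        e -= 1
    return e

base, digits = probe_base_and_digits()
expepspos = largest_negative_exponent(base, +1.0)
expepsneg = largest_negative_exponent(base, -1.0)
epspos = float(base) ** expepspos   # analogue of EPSPOS
epsneg = float(base) ** expepsneg   # analogue of EPSNEG
```

On an IEEE binary64 machine with round-to-nearest, this would be expected to give base = 2, digits = 53, expepspos = -52 and expepsneg = -53.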

function test_ieee ( )

Purpose

TEST_IEEE tries to determine if the computer follows the IEEE standard 754 for binary floating-point arithmetic.

Arguments

None

Further Details

If the compiler follows the Fortran 2003 standard, the facilities provided by the IEEE_ARITHMETIC module are used to determine if the computer follows the IEEE standard 754 for binary floating-point arithmetic.

function test_nan ( )

Purpose

TEST_NAN returns TRUE if NaNs exist, and FALSE otherwise.

Arguments

None

Further Details

If the compiler follows the Fortran 2003 standard, the facilities provided by the IEEE_ARITHMETIC module are used to determine if NaNs exist as defined in the IEEE standard 754 for binary floating-point arithmetic.

Otherwise, the routine exploits the IEEE requirement that NaNs compare as unequal to all values, including themselves.

For further details, see:

  1. Cody, W.J., and Coonen, J.T., 1993:
    Algorithm 722, TOMS 19, No. 4, 443-451.
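
The self-comparison fallback can be illustrated with a short Python sketch. It is only an analogy for the approach described above, not the statpack code: the function name nans_exist is hypothetical, and producing the candidate value as infinity minus infinity is an assumption about how such a value might be generated.

```python
import sys

def nans_exist():
    # An overflowing multiplication yields +inf under IEEE arithmetic.
    inf = sys.float_info.max * 2.0
    # inf - inf is an invalid operation; its IEEE result is a quiet NaN.
    candidate = inf - inf
    # Only a NaN compares unequal to itself.
    return candidate != candidate
```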

function is_nan ( x )

Purpose

This function returns TRUE if the scalar X is a NaN, and FALSE otherwise.

Arguments

X (INPUT) real(stnd)
The floating point number to be tested.

Further Details

If the compiler follows the Fortran 2003 standard, the facilities provided by the IEEE_ARITHMETIC module are used to detect NaNs as defined in the IEEE standard 754 for binary floating-point arithmetic.

If the IEEE_ARITHMETIC module is not available, but the compiler supports the intrinsic function isnan(), that intrinsic is used to detect NaNs.

Otherwise, the routine exploits the IEEE requirement that NaNs compare as unequal to all values, including themselves.

Finally, if the computer does not follow the IEEE standard 754 for binary floating-point arithmetic, this function returns TRUE if the scalar X is equal to huge(X).

For further details, see:

  1. Cody, W.J., and Coonen, J.T., 1993:
    Algorithm 722, TOMS 19, No. 4, 443-451.
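
The last two fallbacks in this chain can be sketched as follows. This Python fragment is illustrative rather than the statpack implementation: the function name and the ieee flag are hypothetical stand-ins for whatever mechanism selects the branch, and sys.float_info.max plays the role of huge(X).

```python
import sys

def is_nan_sketch(x, ieee=True):
    if ieee:
        # IEEE arithmetic: a NaN is the only value that is unequal to itself.
        return x != x
    # Non-IEEE fallback: the largest finite number serves as the NaN sentinel.
    return x == sys.float_info.max
```

With the sentinel convention, a genuine result equal to the largest finite number cannot be told apart from a missing value, which is the price of supporting non-IEEE machines.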

function is_nan ( x )

Purpose

This function returns the value TRUE if any of the elements of the vector X is a NaN, and FALSE otherwise.

Arguments

X (INPUT) real(stnd), dimension(:)
The floating point vector to be tested.

Further Details

If the compiler follows the Fortran 2003 standard, the facilities provided by the IEEE_ARITHMETIC module are used to detect NaNs as defined in the IEEE standard 754 for binary floating-point arithmetic.

If the IEEE_ARITHMETIC module is not available, but the compiler supports the intrinsic function isnan(), that intrinsic is used to detect NaNs.

Otherwise, the routine exploits the IEEE requirement that NaNs compare as unequal to all values, including themselves.

If the computer does not follow the IEEE standard 754 for binary floating-point arithmetic, this function returns TRUE if any of the elements of the vector X is equal to huge(X).

For further details, see:

  1. Cody, W.J., and Coonen, J.T., 1993:
    Algorithm 722, TOMS 19, No. 4, 443-451.

function is_nan ( x )

Purpose

This function returns the value TRUE if any of the elements of the matrix X is a NaN, and FALSE otherwise.

Arguments

X (INPUT) real(stnd), dimension(:,:)
The floating point matrix to be tested.

Further Details

If the compiler follows the Fortran 2003 standard, the facilities provided by the IEEE_ARITHMETIC module are used to detect NaNs as defined in the IEEE standard 754 for binary floating-point arithmetic.

If the IEEE_ARITHMETIC module is not available, but the compiler supports the intrinsic function isnan(), that intrinsic is used to detect NaNs.

Otherwise, the routine exploits the IEEE requirement that NaNs compare as unequal to all values, including themselves.

If the computer does not follow the IEEE standard 754 for binary floating-point arithmetic, this function returns TRUE if any of the elements of the matrix X is equal to huge(X).

For further details, see:

  1. Cody, W.J., and Coonen, J.T., 1993:
    Algorithm 722, TOMS 19, No. 4, 443-451.

subroutine replace_nan ( x, missing )

Purpose

This subroutine replaces the scalar X with the scalar MISSING, if X is a NaN on input.

Arguments

X (INPUT/OUTPUT) real(stnd)
The floating point number to be tested.
MISSING (INPUT) real(stnd)
The floating point number used to replace NaNs.

Further Details

If the compiler follows the Fortran 2003 standard, the facilities provided by the IEEE_ARITHMETIC module are used to detect NaNs as defined in the IEEE standard 754 for binary floating-point arithmetic.

If the IEEE_ARITHMETIC module is not available, but the compiler supports the intrinsic function isnan(), that intrinsic is used to detect NaNs.

Otherwise, the routine exploits the IEEE requirement that NaNs compare as unequal to all values, including themselves.

If the computer does not follow the IEEE standard 754 for binary floating-point arithmetic, this subroutine replaces X with the scalar MISSING, if X is equal to huge(X).

For further details, see:

  1. Cody, W.J., and Coonen, J.T., 1993:
    Algorithm 722, TOMS 19, No. 4, 443-451.

subroutine replace_nan ( x, missing )

Purpose

This subroutine replaces the elements of the vector X which are NaNs by the scalar MISSING.

Arguments

X (INPUT/OUTPUT) real(stnd), dimension(:)
The floating point vector to be tested.
MISSING (INPUT) real(stnd)
The floating point number used to replace the NaNs.

Further Details

If the compiler follows the Fortran 2003 standard, the facilities provided by the IEEE_ARITHMETIC module are used to detect NaNs as defined in the IEEE standard 754 for binary floating-point arithmetic.

If the IEEE_ARITHMETIC module is not available, but the compiler supports the intrinsic function isnan(), that intrinsic is used to detect NaNs.

Otherwise, the routine exploits the IEEE requirement that NaNs compare as unequal to all values, including themselves.

Finally, if the computer does not follow the IEEE standard 754 for binary floating-point arithmetic, this subroutine replaces the elements of the vector X which are equal to huge(X) with the scalar MISSING.

For further details, see:

  1. Cody, W.J., and Coonen, J.T., 1993:
    Algorithm 722, TOMS 19, No. 4, 443-451.

subroutine replace_nan ( x, missing )

Purpose

This subroutine replaces the elements of the matrix X which are NaNs by the scalar MISSING.

Arguments

X (INPUT/OUTPUT) real(stnd), dimension(:,:)
The floating point matrix to be tested.
MISSING (INPUT) real(stnd)
The floating point number used to replace the NaNs.

Further Details

If the compiler follows the Fortran 2003 standard, the facilities provided by the IEEE_ARITHMETIC module are used to detect NaNs as defined in the IEEE standard 754 for binary floating-point arithmetic.

If the IEEE_ARITHMETIC module is not available, but the compiler supports the intrinsic function isnan(), that intrinsic is used to detect NaNs.

Otherwise, the routine exploits the IEEE requirement that NaNs compare as unequal to all values, including themselves.

Finally, if the computer does not follow the IEEE standard 754 for binary floating-point arithmetic, this subroutine replaces the elements of the matrix X which are equal to huge(X) with the scalar MISSING.

For further details, see:

  1. Cody, W.J., and Coonen, J.T., 1993:
    Algorithm 722, TOMS 19, No. 4, 443-451.
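
The element-wise replacement can be pictured with the following Python sketch, which operates on a list of rows in place. It is an analogy only, not the statpack routine: the function name is hypothetical, the value -999.0 is an arbitrary choice of missing-value code, and NaNs are detected with the self-comparison test.

```python
def replace_nan_sketch(x, missing):
    # x is a list of rows; entries are overwritten in place,
    # mirroring the INPUT/OUTPUT role of X.
    for row in x:
        for j, value in enumerate(row):
            if value != value:      # true only when value is a NaN
                row[j] = missing

grid = [[1.0, float("nan")], [float("nan"), 4.0]]
replace_nan_sketch(grid, -999.0)
```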

function nan ( )

Purpose

NAN is a scalar function that returns the bit pattern corresponding to a quiet NaN in the IEEE standard 754 for binary floating-point arithmetic if the machine recognizes NaNs, and the maximum floating point number of kind STND otherwise.

Arguments

None

Further Details

If the compiler follows the Fortran 2003 standard, the facilities provided by the IEEE_ARITHMETIC module are used to create a quiet NaN as defined in the IEEE standard 754 for binary floating-point arithmetic.

Otherwise, the routine exploits the IEEE requirement that NaNs compare as unequal to all values, including themselves.

Finally, NAN returns the maximum floating point number of kind STND, if the computer does not follow the IEEE standard 754 for binary floating-point arithmetic.
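
To make the notion of a quiet-NaN bit pattern concrete, here is a small Python sketch. It assumes the working precision is IEEE binary64, which need not match kind STND, and it is not how statpack builds the value. The names nan_sketch and machine_has_nans are hypothetical.

```python
import struct
import sys

# binary64 quiet NaN: exponent bits all ones and the leading fraction bit set.
QUIET_NAN_BITS = 0x7FF8000000000000

def nan_sketch(machine_has_nans=True):
    if machine_has_nans:
        # Reinterpret the 64-bit integer pattern as a double.
        return struct.unpack(">d", struct.pack(">Q", QUIET_NAN_BITS))[0]
    # Fallback: the largest finite number of the working precision.
    return sys.float_info.max
```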

function true_nan ( )

Purpose

TRUE_NAN is a scalar function that returns the bit pattern corresponding to a quiet NaN in the IEEE standard 754 for binary floating-point arithmetic.

Arguments

None