OASIS Open Document Format for Office Applications (OpenDocument) TC

 View Only

Fwd: ODF 1.4 recalculated formula spec suggestions, batch 6

  • 1.  Fwd: ODF 1.4 recalculated formula spec suggestions, batch 6

    Posted 11 days ago
    Apologies!

    Latest batch of comments from Arne.

    Patrick

    -------- Original Message --------
    Subject: ODF 1.4 recalculated formula spec suggestions, batch 6
    Date: 2026-05-22 14:54
    From: "arne@arne-thomassen.de" <arne@arne-thomassen.de>
    To: "patrick@durusau.net" <patrick@durusau.net>

    Hi,

    the OpenFormula specification OpenDocument-v1.4-os-part4-formula.pdf
    appears to have some flaws. I'd like to suggest the following changes to
    individual functions and topics in addition to the batches from my
    e-mails on 19 January, 15 February, 9 March, 31 March and 25 April 2026:

    - 4.11.7 Basis: several sentences state that date values must be
    truncated, e.g. in 4.11.7.2 Procedural Notation and in 4.11.7.3
    Procedure A. But all of the involved serial dates come from function
    parameters whose formal data type is DateParam. All DateParam values
    must already be converted to integers automatically (because the
    fractional portions indicate a _time_ within a day and must be removed;
    cf. 4.3.3 Date as opposed to 4.3.4 DateTime). Truncating an integer
    value does nothing. Please clarify the mentioned sections of the
    specification.

    - 4.11.7.7 Procedure E, case "If F is true then return the average of
    the number of days in each year between date1 and date2, inclusive." It
    seems that the returned numbers of all year-length procedures must
    always be integers (also because it is always integer in everyday use).
    If that is correct, then the average value in the quoted case must be
    rounded to an integer and can only be 365 or 366. The only case
    (involving more than one year) in which the result can be 366 is when
    the date range consists of exactly two years and one of them is a
    leap-year: the average (365 + 366) / 2 is 365.5, which _could_ be
    rounded to 366. (In all other cases, the average is below 365.5 and is
    usually rounded down to 365.) Please specify whether all procedures from
    D to F must or should always return integer numbers and which rounding
    mechanism should be applied to the quoted case in procedure E. - One
    reason why it is difficult to discern is that the respective function
    specifications provide too few specifics about the desired calculations,
    e.g. 6.12.3 ACCRINTM, 6.12.39 PRICEDISC. - For comparison, 6.10.25
    YEARFRAC explicitly says "number of years (including fractional part)".

    - 5.8 References: the grammar rule Reference unconditionally requires
    brackets at the beginning and end of any given reference. But typical,
    simple references like A2 don't have brackets. The best match seems to
    be the rule CellAddress, but it unconditionally requires a dot character
    '.' even if the SheetLocatorOrEmpty part is empty (RangeAddress has the
    same problem); and this rule does not appear anywhere else in the
    grammar or prose of OpenFormula, so it isn't even usable. The
    specification seems to break compatibility with widespread programs
    which allow simple references like A2 without a bracket or dot. If that
    isn't intentional, the rule CellAddress should be usable as an
    alternative in the rule Reference, and the required dot character '.'
    should become optional when its SheetLocatorOrEmpty part is empty.

    - 6.10.15 NETWORKDAYS: the given return type Number should be Integer,
    e.g. because the paragraph "Summary" says that "the _whole_ number of
    work days" is returned. - Please specify what should happen if Date2 <
    Date1, e.g. whether the returned integer should be negative or the dates
    should be swapped early or an error should be returned.

    - 6.10.15 NETWORKDAYS, 6.10.23 WORKDAY: the formal parameter Workdays is
    of type LogicalSequence, but the given examples "{1;0;0;0;0;0;1}" and
    "{0;0;0;0;0;1;1}" contain numbers instead of Logical values. Section
    6.3.13 "Conversion to LogicalSequence" states that a single Number or
    Logical value should be handled as "Conversion to Logical", but that the
    sequence contained in a Reference should only include "the values of
    type Logical and Error" and therefore not Number. Please clarify when a
    sequence of numbers may or should be converted to a sequence of Logicals
    if the formal parameter type is LogicalSequence. Maybe the quoted
    examples should be written as sequences of Logical values. - Quote from
    the paragraph which begins "The optional 4th parameter Workdays": "To
    define the work week as excluding Friday and Saturday, the third
    parameter would be [...]" The word "third" should be "fourth" or "4th"
    for both functions.

    - 6.10.23 WORKDAY: the type of the formal parameter Offset should be
    changed from Number to Integer because all calendar serial numbers are
    integers. Similarly, the return data type should be Date instead of
    DateTime.

    - 6.12: some functions use the parameter identifier CompoundingPeriods
    instead of Nper. Please replace CompoundingPeriods with Nper for
    uniformity and clarity.

    - 6.12.3 ACCRINTM: the paragraph "Summary" says "pay at maturity", but
    the parameter list doesn't contain a parameter named Maturity. Such a
    parameter is provided for many other specified finance functions. The
    paragraph "Semantics" says: "Settlement: The security's maturity date".
    This looks as if the parameter Settlement should be renamed to Maturity.
    (Otherwise, please specify for non-finance readers how to calculate the
    maturity date from the given parameters Issue and Settlement.) - A
    constraint like "Issue < Maturity" is missing.

    - 6.12.24 IRR, 6.18.3 AVERAGE: these functions use the parameter type
    NumberSequence, but many other functions which involve summing up or
    counting use the more generous type NumberSequenceList instead. Please
    use NumberSequenceList for these two functions too. (Or would this break
    compatibility? - My 25 April 2026 e-mail suggested this change for IRR
    for an additional reason.)

    - 6.12.32 ODDFYIELD, 6.12.34 ODDLYIELD: the paragraph "Summary" uses the
    unspecific notion "Compute the yield". Please specify whether the word
    "yield" refers to a "current yield" or "yield to maturity" or which
    other kind. - The return type should probably be Percentage instead of
    Number because the word "yield" seems to mean "rate". (Such a hint, if
    correct, would also help readers who aren't finance experts or native
    English speakers.)

    - 6.12.47 TBILLEQ: the given mathematical formula contains the undefined
    variable "rate" twice, which should probably be the parameter Discount.
    For clarity, maybe name it DiscountRate throughout 6.12.47.

    - 6.14.11 OFFSET: the texts imply that the contents of the parameter R
    itself gets changed. Please clarify that the function modifies and
    returns a _copy_ of R (if that is correct).

    - 6.18 Statistical Functions: for all inverse functions of cumulative
    distribution functions, please add the notion "quantile function" as a
    hint for users and implementors. (The alternative hint "percentile
    function" could be misunderstood as referring to 6.18.57 PERCENTILE.)

    - 6.18.40 LARGE, 6.18.70 SMALL: if the parameter N is an array with more
    than one element, please specify how the elements in the returned new
    array should be sorted. For example, should the smallest of the returned
    numbers always be in the top-left position of the new array? Or should
    the order depend on whether the function is LARGE resp. SMALL? - The
    paragraph "Semantics" states: "If N is an array of numbers, [...]" Shall
    implementations really check the entire array N to find out whether
    every element is a number, or would you remove the words "of numbers"
    for simplicity? (Usually, only the width and height of the array N are
    important, not its contents.) Should implementations only accept plain
    floating-point numbers in N or also Text values which represent
    floating-point or complex numbers?

    - 6.18.42 LOGEST: "if any of the entries in KnownY is negative, LOGEST
    returns an error". The condition should probably be "less than or equal
    to 0" because the logarithm of 0 would be infinite.

    - 6.18.58 PERCENTRANK: "if X lies between Y and Z = Y + 1 (Y < X < Z)
    with Y being the largest number smaller than X and Z the smallest number
    larger than X". Both occurrences of the word "number" seem to be wrong
    and should be replaced with "integer". - The given mathematical formula
    for rx seems to be bogus. For example, the denominator Z-Y is always 1
    and therefore superfluous (because of the definition Z = Y + 1). The
    numerator X-Y can e.g. be 0.9; adding that to ry can lead to a value
    near 2.0, which means that rx can be above the allowed maximum 1.0.

    - 6.18.67 SKEW, 6.18.68 SKEWP: the specification of 6.18.39 KURT
    requires STDEV(X) <> 0. The specification of SKEW and SKEWP lacks the
    necessary STDEV constraint (to avoid division by 0). The parameter type
    for SKEWP is only NumberSequence, but the type for SKEW and KURT is
    NumberSequenceList. Please clear up these inconsistencies.

    Please consider these additional suggestions.

    Thank you,
    Arne Thomaßen</patrick@durusau.net></arne@arne-thomassen.de>