Doug Mahugh wrote:
"I have one question about this matter, which perhaps David Wheeler or
somebody involved early on can answer: how were the existing
implementation-defined items determined? I'm assuming that some thought
has already been given to what should be in the standard and what should
be implementation-defined, and that is reflected in the current content
of the OpenFormula draft. Is that a fair assumption? And if so, it would
be useful to hear some of the rationale for how these decisions were
originally made."
Sure. Quite a lot of thought and time went into determining what is
implementation-defined; a little history might help.
The OpenFormula specification was developed by examining a large number
of actual spreadsheet applications, including Excel, OpenOffice.org,
Lotus 1-2-3, WordPerfect/Quattro Pro, Gnumeric, KOffice, Palm DocumentsToGo,
and many others. This reflects a belief of mine: I believe standards
should reflect _actual_ _practice_, instead of some academic notion of
what such applications MIGHT do. I think this is a belief that most
others in this group share; it's certainly not unique to me. We created
a number of tables comparing functions and operators in each
application, for example, as well as many test cases to look for "edge
cases" or undocumented functionality.
The specification is in some sense a "union" of the applications above,
because we want to be able to represent the data generated by any of
those applications. It's not quite a union; if the capability was
considered extremely exotic and unlikely to be useful to more than a few
users, it was omitted. For example, Gnumeric and Quattro Pro have an
enormous number of predefined functions, and not all of them are in
OpenFormula. Nevertheless, OpenFormula can represent even very exotic
Excel or OpenOffice.org spreadsheets without difficulty, and since
extensions are easily supported, it can represent the omitted functions
well too.
Syntactically, OpenFormula looks like the XML format used by
OpenOffice.org, though it's not identical; we added syntactic extensions
as necessary to support other implementations.
When existing applications _differed_ in what they produced, we (the
group) tried to gain agreement on what they "should do". In some cases,
we could agree that an application was simply buggy, and the spec should
say something else. In other cases, we agreed that there were actually
different functions, that happened to have colliding names. In those
cases, we defined both functions with different names. In general, Excel,
OpenOffice.org, and Gnumeric all agree on function names, so we followed
their names where they existed.
But in some cases, applications do things differently, and there were
good arguments for each option. In those cases, we tried to determine
what it "should" be. But if we could not agree on a single result, we
labelled it as "implementation-defined". Even when it's labelled as
implementation-defined, we tried to limit the possibilities. A good
example is 0^0: The only plausible values are 0, 1, or Error, so we can
at least spec that it must be one of those 3.
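To make the 0^0 case concrete, here is a minimal sketch of how a test
suite might check such a constrained implementation-defined result. It is
only an illustration: the names ERROR, ALLOWED_ZERO_POW_ZERO, and
zero_pow_zero_conforms are hypothetical and do not come from the
OpenFormula draft, and the string "Error" is just a stand-in for however
an application reports an error value.

```python
# Hypothetical conformance check; none of these names come from the
# OpenFormula draft.
ERROR = "Error"  # assumed stand-in for an application's error value

# 0^0 is implementation-defined, but only these three results are allowed.
ALLOWED_ZERO_POW_ZERO = (0, 1, ERROR)


def zero_pow_zero_conforms(result):
    """Return True if an application's result for 0^0 is 0, 1, or Error."""
    # Python treats True == 1 and False == 0, so reject booleans explicitly.
    if isinstance(result, bool):
        return False
    return result in ALLOWED_ZERO_POW_ZERO
```

For instance, Python itself evaluates 0 ** 0 to 1, so
zero_pow_zero_conforms(0 ** 0) returns True, while a result such as 2
would be rejected. The point is that an implementation-defined item can
still be tested, because the set of permitted answers is closed.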
As Rob noted, the number of "implementation-defined" items is really
very, very small. The C and C++ specifications are _already_
international standards, and they have FAR more implementation-defined
behavior.
Even the Ada specification has many implementation-defined items, and
that is one of the most rigorous specifications for a language with the
ability to do numerical calculations. I don't see the few
implementation-defined items as a real problem. You can certainly
exchange a vast number of spreadsheets, even given these
implementation-defined areas.
That's the history in a nutshell. Does that help?
--- David A. Wheeler