OASIS Open Document Format for Office Applications (OpenDocument) TC

 View Only

Storing sheet-local named expressions in ODF

  • 1.  Storing sheet-local named expressions in ODF

    Posted 08-07-2006 05:24
    Some spreadsheet applications (Excel and Gnumeric at least)
    allow named expressions (which are basically spreadsheet variables)
    to be locally scoped to a specific sheet instead of being globally scoped.
    While a vast number of spreadsheet documents don't
    require local sheet names, there are documents that depend on this.
    
    There is a workaround solution in ODF as-is, but it has several weaknesses,
    so I'd like to standardize a better way to address sheet-local names.
    In the current draft formula syntax, the formula SC has a syntax for directly
    representing references to sheet-local names.  But for this
    to be useful, there needs to be a standard way to _store_ sheet-local
    named expressions in ODF documents.
    
    Can we discuss how to handle them?  As a starting point,
    I propose that in the rule , we add this:
      
    This would allow a table:table to directly contain
    table:named-expressions, which would
    then be interpreted as sheet-local names.
    
    Currently, element "table:named-expressions" is only defined
    as being part of the epilogue of the entire spreadsheet document's
    "office:spreadsheet", so spreadsheets look like this:
      
    
    This would still be fine; table:named-expressions that are
    children of office:spreadsheet would continue to be
    considered global names.  The table:named-expressions that
    are children of table:table would be sheet-local names.
    
    Currently, since named expressions are only allowed in
    the spreadsheet epilogue, the "usual" way to represent
    quasi-sheet-local names in ODF is by creating
    names that are a concatenation of the sheetname
    and the user's original name.  That works in terms
    of getting calculations correct, so no one needs to
    worry that ODF "isn't ready for use". ODF _does_ work, as-is.
    
    But this IS a little awkward.
    Users have to use long names instead of the usual short names in a Sheet, e.g.,
    in Sheet1 I have to refer to Sheet1_myname instead of simply "myname".
    Worse, sheet copying does NOT work as expected with local names; if I copy the
    sheet and call it "Sheet5", all the named references will still refer
    to the variables named "Sheet1..." and will NOT automatically switch to Sheet5.
    So copying such sheets is error-prone, and requires excess work
    (lots of renaming). A similar problem happens when you use XML tools to extract a
    particular sheet - you'd like the sheet-local named expressions to be
    directly accessible.  The problem is worse if you allow subtables,
    something that Excel doesn't permit but IS specifically mentioned
    in the OpenDocument spec.  The workaround is fine for now, and future
    applications will be able to perpetually read backwards to documents
    that use the workaround, but there's no reason we can't
    add improved support for the future.
    
    A separate issue that MIGHT be relevant is a technique for
    improving the performance of loading external values.
    A spreadsheet formula can refer to external
    spreadsheet document's names (and in ODF, cells).
    An application _CAN_ load the entire external document
    to get those values, but this loading process could become quite lengthy.
    It would be very nice if the packaged format defined an
    optional set of files that directly stored the values of
    named expressions.  E.G., perhaps there's a  file named
    "global-named-expressions.xml", and a subdirectory
    "named-expressions" that contain files for each sheet.
    Each of these would list the names of named expressions and
    their values.  If the files don't exist, and a named expression is
    requested, the application just accesses them the usual way;
    if an application wants such external access to be faster, it just
    puts them in the zip archive.  Now reading them involves unzipping
    only those files. Since in my conception there is one file per sheet,
    and a global file, I don't think that these issues conflict with
    each other.
    
    Allowing table:named-expressions as an optional child of table:table
    seems like a reasonable way to handle sheet-local variables to me.
    But I'd like to hear others' comments. This is an interface issue between
    the formula subcommittee and the overall OpenDocument TC,
    so I'm cross-posting to both lists.
    
    Below are relevant extracts from the ODF spec, including the
    "Spreadsheet Documents" and "Named Expressions" sections.
    
    --- David A. Wheeler 
    
    
    ====================================
    
    
    (From section "Spreadsheet Documents":)
    
    The content of spreadsheet documents mainly consists of a sequence of tables. Additionally, a spreadsheet document may contain forms, change tracking information and various kinds of declarations that simplify the usage of spreadsheet tables and their analysis. Each of these are contained in either the document prelude, or the document epilogue.
    
    Spreadsheet Document Content Model
    The spreadsheet document prelude contains the document's form data, change tracking information, calculation setting for formulas, validation rules for cell content and declarations for label ranges.
    
    
    
    The main document is a list of tables.
    
    The epilogue of spreadsheet documents contains declarations for named expressions, database ranges, data pilot tables, consolidation operations and DDE links.
    
    
    
    
    
    
    
    
    (section "Table Element"):
    
    The table element is the root element for tables.
    
    
    
    
    
    
    
    
    
    
    (section "Named Expressions"):
    
    The named expressions element