OASIS Open Document Format for Office Applications (OpenDocument) TC

  • 1.  Re: [office] Encryption and data leakage

    Posted 05-12-2010 09:07
    I agree (except for file names, at least OOo doesn't keep file names,
    but in the end the spec doesn't guide what to do anyway).
    
    I would like to skip ways 1+2 and go directly with way 3.
    
    a) It's IMHO the better approach
    b) Don't introduce interim changes. This breaks compatibility twice, and
    I am not even sure if somebody would implement them at all.
    
    Another not yet mentioned approach would be to use/allow standard zip
    encryption including directory encryption (instead of way 1+2, but not
    as a replacement for 3).
    But I don't know if this would allow for different algorithms, nor do I
    know if the standard zip encryption is considered to be strong.
    I guess there are reasons that it hasn't been considered for ODF
    encryption from the beginning...
    
    Malte.
    
    robert_weir@us.ibm.com wrote, On 05/11/10 19:42:
    > The approach we inherited from ODF 1.1 encrypts each file in the ZIP 
    > independently.  Although the contents of the files are not viewable due to 
    > the encryption, there are bits of information that potentially "leak", such 
    > as:
    > 
    > 1) The file size
    > 2) The file date
    > 3) The file name
    > 4) The file mime type
    > 5) The hash of the first 1024 bytes of the file
    > 
    > For example, even in an encrypted document I could see a file name called 
    > "big-secret-takeover-june-3.jpg" and know some information that the person 
    > who wrote the encrypted document might be rather surprised to see in the 
    > open.
    > 
    > Although not required by ODF, an implementation, if it is clever, can 
    > avoid some of these leakages.  For example, the timestamp of the file can 
    > be turned into the time of encryption rather than the original time stamp. 
    >  And the file name can be randomized rather than indicate the original 
    > file name.  This might be fine for ODF, since these time stamps and file 
    > names need not be preserved.  So long as we preserve 
    > referential integrity of the package, the names of images are not 
    > significant.
    > 
    > However we still should be concerned here.  First, the reason we split 
    > Part 3 into its own part was the belief that it could be useful for 
    > purposes other than just ODF 1.2.  Many of us hoped that it would have other 
    > uses.  But I don't think we can assume that all uses can ignore the 
    > original file names and time stamps.  These might be significant for some 
    > uses. 
    > 
    > Second, even within ODF, especially if we allow package extensions,  we 
    > might see items added to packages where the names of files (which may 
    > ultimately be end-user-defined) cannot safely be renamed to random names. For 
    > example, there may be referential integrity constraints that a generic ODF 
    > processor is not aware of.  Maybe there is RDF that points to a contained 
    > image or other package resource.  In any case, the approach is very 
    > fragile.
    > 
    > Finally, even without extensions, and with the use of randomized names, we 
    > still leak information, based on knowing the size and hash of the first 
    > 1024 bytes of the file.  For example, if I have a copy of "
    > big-secret-takeover-june-3.jpg" I can easily check to see what encrypted 
    > documents also contain that same image.  I can similarly probe for any 
    > other resource where I know in advance its size and/or contents. 
    > 
    > There are three ways of getting around this problem.  (Or at least two 
    > that come to mind).  One is to keep a "shadow directory" for the ZIP, that 
    > contains the original names, time stamps, and sizes of the files.  Encrypt 
    > this "shadow directory" when the document is encrypted.  For each 
    > encrypted file, prepend it with some random bytes (not sure what is 
    > optimal) in order to prevent data leakage of original size and hash of 
    > first 1024 bytes.
    > 
    > Another approach is to encrypt the original full path of the file, appended 
    > with its timestamp, using the original derived key, base64 encode that, 
    > and then write that out as the full path for the ZIP entry. That way you 
    > do not need another file in the ZIP. 
    > 
    > The other way is to move to a whole-package encryption method, rather than 
    > trying to do this file-by-file. 
    > 
    > -Rob
    > 
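
The probing attack Rob describes (matching a known file against the size and the hash of the first 1024 bytes) can be sketched as follows. This is only an illustration: the use of SHA-1, the entry names, and the sample data are assumptions, and whether the hash covers raw or compressed bytes depends on the package specification in use.

```python
import hashlib


def fingerprint(data: bytes) -> tuple[int, str]:
    """Return (size, hex digest of the first 1024 bytes) for one file."""
    return len(data), hashlib.sha1(data[:1024]).hexdigest()


def find_matches(known: bytes, exposed: dict[str, tuple[int, str]]) -> list[str]:
    """List the entries whose exposed metadata matches a file we already hold."""
    target = fingerprint(known)
    return [name for name, fp in exposed.items() if fp == target]


# Hypothetical data: two images of identical size with different content.
secret_image = b"\xff\xd8\xff\xe0" + b"takeover plan" * 200
other_image = b"\xff\xd8\xff\xe0" + b"holiday photo" * 200

# What an observer could read from an encrypted package whose entry names
# were randomized but whose size and partial hash remain visible.
exposed = {
    "Pictures/1a2b3c.jpg": fingerprint(secret_image),
    "Pictures/9f8e7d.jpg": fingerprint(other_image),
}

matches = find_matches(secret_image, exposed)
print(matches)
```

Randomizing the entry name does not help here, because the match is made on the exposed size and hash alone.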
    


  • 2.  Re: [office] Encryption and data leakage

    Posted 05-12-2010 09:22
    On 12 May 2010 10:06, Malte Timmermann 


  • 3.  Re: [office] Encryption and data leakage

    Posted 05-12-2010 12:58
    I'd also be concerned about library support for that form of encryption. 
    Specifically, if you have a ZIP library in your platform or language 
    libraries, but it does not support ZIP encryption, it will be far easier 
    to code a pre-ZIP encryption method than to modify the way your system 
    library handles ZIPs.
    
    -Rob
    
    
    Bob Jolliffe 


  • 4.  RE: [office] Encryption and data leakage

    Posted 05-12-2010 16:20
    Rob, I'm not sure I understand which approach these last concerns are about.
    (I see there are now later notes in this exchange while I was writing this,
    so I may be even more out of sync than I think already.)
    
     1. It strikes me that it is easier to make a Zip, encrypt the whole thing,
    and then include it in a Zip along with a manifest and a file that provides
    the encryption parameters than almost anything else.  This should work with
    standard Zip-handling libraries.  In the OpenDocument case, since whatever
    producer is doing this already knows how to make an ODF Package, it
    certainly can make the outer wrapper using the same machinery.  One can also
    avoid compression of the inner, encrypted payload Zip, so I suspect this can
    be done in a single pass on top of the unencrypted save of the payload Zip.
    
     2. I don't think the act of full-document encryption is technically
    difficult.  With regard to the encryption-descriptor, using a profile of XML
    Encryption strikes me as the ideal case, it being extremely valuable to rely
    on existing work in this situation.  That is probably the biggest impact on
    implementers, but the benefit is also quite high.  And since products that
    do XML DSig must already deal with certificate stores, that aspect is
    already in hand and gives us even more consistency and reliance on existing
    technology.  There are likely some known libraries and more-widely available
    material on threat models against the encrypting application as well.
    
     3. But there may be use-case issues and only implementers of OpenDocument
    products can say what those specific issues are.  For example, ODFDOM would
    perhaps need to adjust its model to support this and have it work at the
    right point in time.  More seriously, tying encryption to Save As ... (as
    now done in OO.o) would be problematic, and it would appear that signing and
    encryption would both need to be operations on [being-]saved documents (just
    like, for e-mail, the cases are selected in advance, even as defaults, but
    it is the send[-to-outbox] that carries out the actions and requests any
    Pass Phrases that are required for private keys to be accessed and
    certificates to be applied, etc.).
    
    Or were you addressing a different one of the options under discussion?
    Bob was commenting about possible IP issues using PKware's proprietary
    encryption and DSig arrangements.  I would avoid those simply to use a
    widely-recognized standard mechanism (e.g., XML Encryption and XML DSig plus
    XAdES, etc.) as much as possible.
    
     - Dennis
    
    (Side note: The e-mail software I use allows signing and/or encryption, with
    no indication as to sequence.  I am confident that signing applies to the
    unencrypted content, including attachments.)
    
      - - - - - - - - - - - - -
       Standards are arbitrary solutions to recurring problems (R. W. Bemer)
       Although not by becoming the recurring problem (orcmid).
      When you find yourself in a hole, stop digging.
    
    
    
    


  • 5.  RE: [office] Encryption and data leakage

    Posted 05-12-2010 16:29
    I think you are agreeing with me.  I certainly am agreeing with you.
    
    Let me know if you disagree,
    
    -Rob
    
    "Dennis E. Hamilton" 


  • 6.  RE: [office] Encryption and data leakage

    Posted 05-12-2010 17:11
    I don't disagree.
    
     1. I would not use the ODF manifest-carried encryption in the outer Zip at
    all, though.  Aren't there W3C XML formats for that specific task that we
    can use so we are freed from defining and maintaining a custom mapping to
    ODF-defined attributes and elements?
    
     2. I think this encryption should be decoupled completely from content
    signings and whatever those signings attest to and are non-repudiable
    concerning.
    
     3. I am not sure what it means to sign the outer package.  If there is some
    sort of authentication against damage or substitution of the package, that
    is usually done by providing an external, possibly-signed message
    authenticator.  I'm not sure what in-wrapper signing can be done that can't
    simply be removed or replaced, and it won't be a signature of what the user
    sees in the decrypted document in any case.  The decryption should have its
    own checks against damage, and the Zip will already have a CRC32 of the
    encrypted payload.  
    
     4. I suppose the encryption information could carry a stronger
    message-authentication check on the unencrypted/encrypted payload, but
    that/those would be encrypted along with the decryption key.  I think David
    LeBlanc has already made suggestions on this aspect.  It would be useful to
    explore what XML Encryption already provides for this.  OOPS, now there are
    comments from him I have not read yet ...
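
The difference between the Zip CRC32 and a keyed message-authentication check (points 3 and 4) can be sketched like this. HMAC-SHA256, the key, and the payload bytes are assumptions chosen for illustration; nothing here is taken from XML Encryption or from the ODF specification.

```python
import hashlib
import hmac
import zlib


def crc_ok(payload: bytes, stored_crc: int) -> bool:
    """Unkeyed check: anyone who can rewrite the payload can rewrite the CRC."""
    return zlib.crc32(payload) == stored_crc


def mac_ok(key: bytes, payload: bytes, stored_mac: bytes) -> bool:
    """Keyed check: producing a valid tag requires knowledge of the key."""
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, stored_mac)


key = b"hypothetical-key-derived-from-passphrase"
payload = b"encrypted payload bytes"
stored_mac = hmac.new(key, payload, hashlib.sha256).digest()

# An attacker without the key substitutes the payload and recomputes the CRC.
forged = b"substituted payload bytes"
forged_crc = zlib.crc32(forged)

crc_accepts_forgery = crc_ok(forged, forged_crc)
mac_accepts_forgery = mac_ok(key, forged, stored_mac)
print(crc_accepts_forgery, mac_accepts_forgery)
```

The CRC32 therefore guards against accidental damage only; detecting substitution needs a keyed check of some kind.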
    
     - Dennis
    
    


  • 7.  RE: [office] Encryption and data leakage

    Posted 05-12-2010 17:04
    Bob said:
    
    >I think there may also be IP concerns about using "standard" zip
    encryption.  From the application note:
    
    Yes, I do seem to recall that being a concern. It would be good to check with legal before implementing such an approach.


  • 8.  Re: [office] Encryption and data leakage

    Posted 05-12-2010 13:14
    So what are our options for #3?
    
    Option 1) ZIP in a ZIP.  So create document as if it is unencrypted, then 
    encrypt that as one file and STORE it in a container ZIP file that has 
    manifest, mimetype and nothing else.  That manifest lists the encryption 
    parameters for the "inner" zip.
    
    PRO: 
    
    Data leakage concerns go away. 
    
    Better interaction with digital signatures.
    
    Simplifies the specification.  We don't need to talk about pre-compressing 
    before encrypting.  That happens automatically.
    
    CON: 
    
    Will this be slower because of the double ZIP?  I'm not quite sure.  I 
    think it might actually be faster because encrypting one big stream should 
    be faster than encrypting many smaller streams.  This is worth testing.
    
    There is no opportunity for selective encryption.  For example, cannot 
    decide to expose metadata but not content.  But this is not typical.  And 
    if really needed we could allow metadata to be shadowed in the outer 
    container.
    
    Option 2) Don't have two-levels of ZIP, but maintain a shadow directory 
    that is encrypted along with the concatenation of the files in the stream, 
    maybe using the Unix tar method.
    
    PRO:  Not sure it has advantages over 1)
    
    CON: Requires us to specify more, specifically our own conventions for a 
    pre-compression, pre-encryption compound file.
    
    Option 3)  Is there an option 3?
    
    -Rob
    
    Malte.Timmermann@Sun.COM wrote on 05/12/2010 05:06:09 AM:
    
    > From:
    > 
    > Malte Timmermann 


  • 9.  Re: [office] Encryption and data leakage

    Posted 05-12-2010 13:48
    On 12 May 2010 14:16,  


  • 10.  Re: [office] Encryption and data leakage

    Posted 05-12-2010 16:11
    OK. 
    
    So it sounds like this:
    
    1) Create ODF document as normal, with compressed (DEFLATE) streams in a 
    ZIP
    
    2) Encrypt the ZIP using Blowfish (legacy), AES (default) or another 
    specified algorithm
    
    3) The encrypted ZIP is then STORED in another ZIP container.  We would 
    also have at least a manifest (with the encryption parameters), a 
    mimetype, and optionally a duplicate copy of any additional resources 
    (thumbnail, metadata) that an implementation intentionally desired to be 
    unencrypted.
    
    If we do this we need a convention for locating the contained ZIP file. We 
    could either have a well-known name like "encrypted.bin" or allow an 
    arbitrary name but then have a content type that we use to locate it from 
    the manifest.
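
The three steps above can be sketched with a standard ZIP library. This is a minimal sketch, not a proposed implementation: `toy_cipher` is an insecure stand-in for AES (Python's standard library has no AES), the manifest body is a placeholder, and the function names are hypothetical. Only the "encrypted.bin" name comes from the discussion.

```python
import hashlib
import io
import zipfile


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """NOT secure: XOR with a SHA-256 counter keystream, standing in for AES.
    Applying it twice with the same key restores the input."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def build_inner(files: dict[str, bytes]) -> bytes:
    """Step 1: an ordinary package with DEFLATE-compressed streams."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as inner:
        for name, data in files.items():
            inner.writestr(name, data)
    return buf.getvalue()


def wrap(inner_zip: bytes, key: bytes, mimetype: str) -> bytes:
    """Steps 2 and 3: encrypt the whole inner ZIP, then STORE it in an outer ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as outer:
        outer.writestr("mimetype", mimetype)
        # Placeholder: the encryption parameters would be recorded here.
        outer.writestr("META-INF/manifest.xml", "<!-- encryption parameters -->")
        outer.writestr("encrypted.bin", toy_cipher(key, inner_zip))
    return buf.getvalue()


def unwrap(outer_zip: bytes, key: bytes) -> dict[str, bytes]:
    """Locate the payload by its well-known name, decrypt it, read the inner ZIP."""
    with zipfile.ZipFile(io.BytesIO(outer_zip)) as outer:
        inner_zip = toy_cipher(key, outer.read("encrypted.bin"))
    with zipfile.ZipFile(io.BytesIO(inner_zip)) as inner:
        return {name: inner.read(name) for name in inner.namelist()}


key = b"hypothetical derived key"
files = {
    "content.xml": b"<office:document-content/>",
    "Pictures/big-secret-takeover-june-3.jpg": b"\xff\xd8 image bytes",
}
package = wrap(build_inner(files), key, "application/vnd.oasis.opendocument.text")
with zipfile.ZipFile(io.BytesIO(package)) as outer:
    visible_names = outer.namelist()
restored = unwrap(package, key)
print(visible_names)
```

In this sketch the outer directory exposes only the wrapper entries; the inner file names, sizes, and time stamps sit inside the encrypted payload.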
    
    Given the above, what does signing look like? 
    
    -Rob
    
    Bob Jolliffe 


  • 11.  RE: [office] Encryption and data leakage

    Posted 05-12-2010 16:56
    I know this is not the last in the thread, but in a bit of a rush at the moment -
    
    Robert asked:
    
    >Will this be slower because of the double ZIP?  I'm not quite sure.  
    
    I don't think so - can't we just tell the zip not to compress that stream?
    
    >I think it might actually be faster because encrypting one big stream should
    be faster than encrypting many smaller streams.  This is worth testing.
    
    Certainly so - no need to generate and store a new random salt or to generate an independent HMAC for integrity checking for each stream.
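
A rough sketch of the bookkeeping difference, under stated assumptions: PBKDF2 with HMAC-SHA1, 1024 iterations, a 16-byte salt, and an HMAC-SHA256 tag are illustrative choices here, and the stream names are hypothetical. The encryption step itself is omitted.

```python
import hashlib
import hmac
import os

ITERATIONS = 1024  # assumed iteration count, for illustration only


def derive_key(passphrase: bytes, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha1", passphrase, salt, ITERATIONS, dklen=16)


def per_stream_records(passphrase: bytes, streams: dict[str, bytes]) -> list[tuple]:
    """File-by-file: one salt, one key derivation, and one tag per stream."""
    records = []
    for name, data in streams.items():
        salt = os.urandom(16)
        key = derive_key(passphrase, salt)
        tag = hmac.new(key, data, hashlib.sha256).digest()
        records.append((name, salt, tag))
    return records


def whole_package_records(passphrase: bytes, package: bytes) -> list[tuple]:
    """Whole-package: a single salt, key derivation, and tag."""
    salt = os.urandom(16)
    key = derive_key(passphrase, salt)
    tag = hmac.new(key, package, hashlib.sha256).digest()
    return [("encrypted.bin", salt, tag)]


streams = {"content.xml": b"a" * 100, "styles.xml": b"b" * 100, "meta.xml": b"c" * 100}
per_stream = per_stream_records(b"passphrase", streams)
whole = whole_package_records(b"passphrase", b"".join(streams.values()))
print(len(per_stream), len(whole))
```

Whether this saving matters in practice would still need the testing Rob suggests.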
    
    >There is no opportunity for selective encryption.  For example, cannot
    decide to expose metadata but not content.  But this is not typical.  And
    if really needed we could allow metadata to be shadowed in the outer
    container.
    
    Exactly.
    
    > Option 2) Don't have two-levels of ZIP, but maintain a shadow directory
    that is encrypted along with the concatenation of the files in the stream,
    maybe using the Unix tar method.
    
    I'm not sure how this would work. Seems like something not standard to a zip, might be interesting (in a not good way) to implement.