OASIS Open Data Protocol (OData) TC


RE: [EXTERNAL] RE: action item 3913 - Concept for binary data format

  • 1.  RE: [EXTERNAL] RE: action item 3913 - Concept for binary data format

    Posted 08-05-2020 18:06




    Hi Hubert,
     
    Thanks for forwarding that document.
    I guess I made some wrong assumptions about the direction in which compact JSON is going.

    This idea about removing the repeating property names to reduce the size is really interesting.
     
    I would like to suggest that we actually measure the effect of the different approaches. We could decide on some representative responses to run through traditional JSON, compact JSON, and binary serialization, all with and without compression (gzip). Besides comparing size, we should also look into the effort of writing client code that deals with each format.
     
    Christof
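
    The measurement suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample rows and property names are made up, and the "names once" layout is a hypothetical stand-in for the idea of not repeating property names, not the layout defined in the Compact JSON proposal. A binary encoding would be added as a third function in the same way.

```python
import gzip
import json


def make_rows(n):
    # Hypothetical sample data; the property names are illustrative only.
    return [
        {"ProductID": i, "ProductName": f"Product {i}", "UnitPrice": i * 1.25}
        for i in range(n)
    ]


def traditional(rows):
    # Every object repeats every property name.
    return json.dumps({"value": rows}, separators=(",", ":")).encode("utf-8")


def names_once(rows):
    # Assumed layout (not the Compact JSON proposal): names once, rows as arrays.
    names = list(rows[0])
    data = [[row[name] for name in names] for row in rows]
    return json.dumps({"names": names, "rows": data}, separators=(",", ":")).encode("utf-8")


def expand(payload):
    # Rebuild the list of objects from the names-once layout.
    doc = json.loads(payload)
    return [dict(zip(doc["names"], row)) for row in doc["rows"]]


def sizes(payload):
    # Size in bytes, uncompressed and gzip-compressed.
    return {"raw": len(payload), "gzip": len(gzip.compress(payload))}


rows = make_rows(1000)
report = {
    "traditional": sizes(traditional(rows)),
    "names_once": sizes(names_once(rows)),
}
for name, size in report.items():
    print(name, size)
```

    Running this against a handful of representative real responses, rather than synthetic rows, would be needed before drawing any conclusion about the formats.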
     


    From: Hubert Heijkers <Hubert.Heijkers@nl.ibm.com>
    Sent: Tuesday, August 4, 2020 00:59
    To: Christof Sprenger <chrispre@microsoft.com>
    Cc: odata@lists.oasis-open.org
    Subject: [EXTERNAL] RE: action item 3913 - Concept for binary data format


     
    Hi Christof,

    Yeah, that doc is supposed to be the start of the formal normative specification for Compact JSON. If you haven't seen my contribution from back then, you can find it here:
    https://www.oasis-open.org/apps/org/workgroup/odata/download.php/62898/odata-compact-json-format-v4.0.pdf

    This was enthusiastically received, and I had a follow-up meeting with the SAP guys, at their offices, a week or two later, which led to a number of great ideas/improvements. I'm still eager to take that next step and then move to a binary 'equivalent'.

    Let me know what you think, happy to discuss further.

    Cheers,

    -H


    Hubert Heijkers STSM, Chief Architect TM1 Server and OData Evangelist IBM Business Analytics +31-621-394123
    "The crisis of today is the joke of tomorrow" - H.G. Wells


    From: Christof Sprenger <chrispre@microsoft.com>
    To: Hubert Heijkers <Hubert.Heijkers@nl.ibm.com>
    Cc: "odata@lists.oasis-open.org" <odata@lists.oasis-open.org>
    Date: 03-08-2020 05:23 PM
    Subject: RE: [EXTERNAL] Re: action item 3913 - Concept for binary data format






    Thanks Hubert for the reply.

    Would you please share the current stage of the discussion on compact JSON? I was only able to find this document, which is relatively empty.

    There is a fundamental difference between MessagePack and ProtoBuf. In ProtoBuf, the message format is defined up front and needs to be known and carefully versioned for compatibility. MessagePack has a fixed but very flexible and simple model (see its type system; and yes, arrays are length-prefixed, like all variable-length structures such as maps and strings). In that sense it is similar to JSON and XML, where there is typically a second step in memory that transforms the message into programming-language types.
    In summary, my impression is that MessagePack is so much closer to JSON that I would expect that it should be easier for any OData implementation to add such a format.
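
    To make the "length-prefixed" remark concrete, here is a minimal hand-written encoder for a small subset of the MessagePack format (small non-negative integers, short strings, arrays, and small maps). It is a sketch for illustration only, not a replacement for a real MessagePack library; the function name `pack` is hypothetical. Note that the element count is written before the elements, so it must be known when encoding starts.

```python
import struct


def pack(value):
    """Encode a tiny subset of MessagePack; raises ValueError for anything else."""
    if isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 0x7F:
        return bytes([value])  # positive fixint: the value is the byte itself
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) <= 31:
            return bytes([0xA0 | len(raw)]) + raw  # fixstr: length in the low 5 bits
    if isinstance(value, list):
        body = b"".join(pack(item) for item in value)
        if len(value) <= 15:
            return bytes([0x90 | len(value)]) + body  # fixarray: count in the low 4 bits
        if len(value) <= 0xFFFF:
            return b"\xdc" + struct.pack(">H", len(value)) + body  # array 16
    if isinstance(value, dict) and len(value) <= 15:
        body = b"".join(pack(k) + pack(v) for k, v in value.items())
        return bytes([0x80 | len(value)]) + body  # fixmap: entry count in the low 4 bits
    raise ValueError(f"value not supported by this sketch: {value!r}")


print(pack([1, 2, 3]).hex())             # count byte first, then the elements
print(pack({"id": 7, "name": "x"}).hex())
```

    Like JSON, the encoded map carries its own property names, so no schema is needed up front to decode it.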

    As mentioned above, I do not have any insights into the discussion on compact JSON but it might also be worth investigating compression on top of a binary format.

    Happy to help with any investigations.

    Christof

    From: Hubert Heijkers <Hubert.Heijkers@nl.ibm.com>
    Sent: Sunday, August 2, 2020 09:19
    To: Christof Sprenger <chrispre@microsoft.com>
    Cc: odata@lists.oasis-open.org
    Subject: [EXTERNAL] Re: action item 3913 - Concept for binary data format
    I still have a note here to reach back out to Mike as well, because he also mentioned ProtoBuf in this same realm a couple of months ago.

    For me this has always been an action item that was closely related to the Compact JSON work: the hope was to first iron out how we'd specify very concisely what's to be expected in the payload in Compact JSON, and then come up with a representation that uses a more binary (read: less human readable) format to optimize the communication further. Back then I'd looked at a bunch of binary formats that would potentially make sense, including MessagePack and ProtoBuf (the latter one even more so since we build on gRPC), but at the time I couldn't find any that I liked because they either 1) didn't have a separation between metadata and a data file, 2) didn't easily support collections without needing to know the size up front (a definite no-go for us), or 3) didn't feel like they could easily be embedded in our serialization abstraction logic.
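
    The second objection above, about collections whose size is not known up front, can be illustrated with a small sketch. A JSON array can be emitted chunk by chunk from a generator, whereas a count-prefixed layout has to see every item before it can write its first chunk. The count-prefixed function below is a deliberately simplified toy layout, not MessagePack or any other real format, and all names are hypothetical.

```python
import json


def stream_json_array(items):
    """Yield the chunks of a JSON array without knowing how many items follow."""
    yield "["
    first = True
    for item in items:
        if not first:
            yield ","
        yield json.dumps(item)
        first = False
    yield "]"


def count_prefixed(items):
    """Toy count-prefixed layout: everything is buffered to learn the count first."""
    buffered = [json.dumps(item) for item in items]
    yield str(len(buffered))  # the count has to be written before any element
    yield from buffered


def rows():
    # Hypothetical data source that produces rows one at a time.
    for i in range(3):
        yield {"id": i}


streamed = "".join(stream_json_array(rows()))
print(streamed)
print(list(count_prefixed(rows())))
```

    With the streaming variant, the first chunk can go on the wire before the last row has been computed; with the count-prefixed variant, memory use grows with the size of the collection.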

    So to answer your question: other than early investigations, I've not gotten to this again, and wasn't planning to until we'd taken the next step on Compact JSON (which is effectively taking my proposal, which we have in production, merging in all the ideas/thoughts we've had since, and coming up with a normative proposal for it).

    Cheers,

    -H




    From: Christof Sprenger <chrispre@microsoft.com>
    To: Hubert Heijkers <hubert.heijkers@nl.ibm.com>
    Cc: "odata@lists.oasis-open.org" <odata@lists.oasis-open.org>
    Date: 31-07-2020 08:29 PM
    Subject: [EXTERNAL] action item 3913 - Concept for binary data format







    Hubert,

    We recently stumbled upon MessagePack and I remembered action item 3913, which seems to be assigned to you.
    May I ask what stage this action item is in? I would love to explore if/how MessagePack fits into this, since at first sight it appears to be so similar to JSON that it should give a nice balance between efficiency and compatibility.

    Christof





    Tenzij hierboven anders aangegeven: / Unless stated otherwise above:
    IBM Nederland B.V.
    Gevestigd te Amsterdam
    Inschrijving Handelsregister Amsterdam Nr. 33054214







  • 2.  RE: action item 3913 - Concept for binary data format

    Posted 08-06-2020 00:53
    Hi Christof,

    Sounds like you had different expectations/requirements for Compact JSON; mind sharing them?

    What drove me to write up this spec, based on what I prototyped at the time and which by now is in TM1, was overall performance, most notably processing responses client side (as it turns out, the time it takes to generate them server side, the bare-bones way we do it, didn't really matter much). Personally I think that most of the efficiency comes from the fact that people are forced to spend time on writing efficient client code in their implementation, as opposed to using a generic JSON deserializer and then converting the JSON structure into whatever works for them. And I don't have to explain to you that not having to convert numeric values, especially floating point and decimal values, into strings and parse those again (this also applies to other types), but passing them in binary form instead, would take another chunk out of the time it takes to process these responses.

    BTW Ralf already did the comparison between gzipped JSON and Compact JSON in 2018 and it's a wash: virtually the same size remains.

    I never got to try a binary format in a prototype; I'd prefer to reuse some format that would map nicely (read: I can refer to metadata, describe, using context, what's in this payload, and stream the result). But if such a binary format doesn't exist, it wouldn't be too hard to come up with one based on Compact JSON either. It gets a bit hairy where we have data that is nicely described and then include more free-form pieces, presuming we'd want to be able to use it for all payloads and not have restrictions like my proposal had. In Compact JSON we can fall back on 'regular' JSON (not worked into the proposal as such yet); I suppose we'd have to come up with a way to represent such data in a binary format as well.

    Are you aware of an existing binary format that would map neatly?

    TTYT,

    -H
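
    The point in post 2 about not converting floating point values into strings and parsing them again can be illustrated with a small sketch. It compares a JSON text encoding with a fixed-width binary encoding (IEEE 754 doubles, big-endian) using only the Python standard library. The sample values are arbitrary, and the snippet shows the two representations only; it makes no claim about actual processing time, which would have to be measured.

```python
import json
import struct

# Arbitrary sample values, chosen only for illustration.
values = [0.1, 1 / 3, 1e-7, 12345.678]

# Text form: each number is formatted as decimal digits and must be parsed back.
text = json.dumps(values).encode("utf-8")
decoded_text = json.loads(text)

# Binary form: each number is copied as 8 bytes, with no digit formatting or parsing.
layout = f">{len(values)}d"
binary = struct.pack(layout, *values)
decoded_binary = list(struct.unpack(layout, binary))

print(len(text), "bytes as JSON text")
print(len(binary), "bytes as packed doubles")
```

    Both forms round-trip these values exactly; the difference is in the work done per value and in the fixed, predictable width of the binary form.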