I would like to reiterate my position.

Requirement:
===========

The ability to Retry from the From Party MSH to the To Party MSH on lack of a Delivery Receipt is a fundamental functional requirement. This does not exclude the use of hop-to-hop reliable messaging.

Rationale:
=========

The assertion that the summation of hop-wise reliable messaging equals end-to-end reliable messaging is an unproven assumption which can easily be shown to be false for ebXML.

There have been many references to MQSeries as an example of how to accomplish Reliable Messaging, and to the lack of end-to-end acknowledgements in MQSeries. However, there are significant differences between MQSeries and ebXML. If I may quote Marty:

    IBM MQSeries is a stack of middleware that includes much more function that just getting messages on and off the wire. Since for MQSeries reliable messaging purposes, each node on the path has the whole stack, it is only necessary to postulate that all the nodes are MQSeries, which implies the full set of function. With ebXML, we cannot make silent assumptions about what is above the MSH, especially with an intermediary. We must make explicit statements about what must be there for reliable messaging to work. . .

We simply do not have the infrastructure or the proprietary requirements to implement an MQSeries-like system -- nor can we. Any comparison between ebXML Reliable Messaging and MQSeries is simply not applicable. Even if we could make such comparisons, it will be some time before there is an ebXML infrastructure sufficient to make any assumptions about the reliability of an entire path. OTOH, end-to-end reliable messaging obviates the need for hop-wise reliable messaging (but does not preclude it).

Chris has argued that end-to-end Retries are a new requirement and not part of our original requirement list. A look at the Requirements document says otherwise.
There is no mention of multi-hop and there is actually no mention of how Retries are to be accomplished at all and only one mention of Retry in the entire document (Retry Count as a parameter in the TSLA/CPA). However, one could easily make the assumption that Reliable Messaging tacitly assumes the function of Retries.

Very well, what does the Requirements document say about Reliable Messaging? A scan of the document shows the implicit assumption of point-to-point messaging only -- no middle hops. Figures 1, 2, 3, 5 show:

    Sender                                             Receiver
    Application <==> MSH <==> Internet <==> MSH <==> Application

This implicit assumption of Sender to Receiver only is carried throughout the document. Nowhere does it show anything akin to an Intermediary Hop. The Requirements document does state that the receiving Party might be an intermediary agent but it makes no provision for multi-hop (an intermediary does not automatically mean multi-hop).

The entire Requirements document makes the assumption that Messaging is from the Sending Party to the Receiving Party through some nebulous cloud (labeled as Internet). Reliable Messaging is from the Sender to the Recipient regardless of what might be in the intervening cloud, or black box as we frequently called it. Reliable Messaging never terminates inside the cloud/black box. What happens in that black box is of no concern to us nor do we even need to acknowledge its existence. This was the original axiom of multi-hop (although it seems some have forgotten).

The addition of multi-hop is actually the new requirement Chris speaks of. How then shall we reconcile multi-hop to the Requirements document? There is only one satisfactory answer -- add multi-hop in such a way that there is no change in any other requirement in the document. This means Reliable Messaging, while it may ALSO be applied to Intermediary hops, cannot change the fundamental functionality implicit in Reliable Messaging from end-to-end.
This requires the ability to perform end-to-end RM functions (retries, duplicate detection, acknowledgements, failure notification) regardless of what may be going on in the black box. Everything that is added for multi-hop must be orthogonal to the basic end-to-end functionality prescribed by the Requirements document.

What about the Technical reasons for end-to-end Retries? There may be a variety of temporary reasons for a failure. Ostensibly, Reliable Messaging guarantees either a successful send or a Delivery Failure Notification back to the original sender. Such a guarantee is easily shown to be impossible. A simple network failure can outlast the HTTP retry timeouts, in which case neither the forward send nor a failure notification will get through. The quick answer in this case would be that the local administrator will be notified and a phone call would be made. However, even if such information (phone number or eMail address) were available (it isn't), intermediary (VAN) operators with thousands of messages per hour flowing through their systems from almost as many sources will simply not perform such tasks. Intermediary Reliable Messaging is easily shown to be unreliable just from common network outages.

As has been said numerous times, lack of notification cannot be relied upon as either a success or a failure. Failure can never be proved. Even a Failure message may not be construed as a failure since it itself is based upon the lack of an Acknowledgement and thus an unknown status. There is never any proof of failure. Conversely, lack of failure is not a proof of success. Only success may be proven. The only proof of success is an end-to-end acknowledgement (which we erroneously call a Delivery Receipt).

Must we then conclude that Intermediary Reliable Messaging is inherently fallacious and thus only end-to-end Reliable Messaging is needed? Yes! -- from the point of view of the ends.
However, it can easily be shown that Intermediary nodes need their own end-to-end acknowledgements just as much for their own operations -- which from the point of view of the ends looks like Intermediary Reliable Messaging.

What happens if we don't provide end-to-end Retries? If the Sender requests a Delivery Receipt (end-to-end Acknowledgement) but does not receive it then, regardless of the receipt of an intermediate acknowledgement, the Sender must assume some form of failure. The sending MSH must immediately report to the application or to a human element that there was a failure. The initial human response would be to resend!!! The second normal response would be a phone call to the receiving administrator (who will tell the sender to try again). If there is no verification that the message was received, then the next response would be to try to determine/correct the problem and resend. Even if the message arrived and is verified by the receiver, a resend may still be required to get the needed Delivery Receipt (NRR). In *all* cases, the answer is to resend. This resend MUST use the original MessageId, in case the wayward message is eventually delivered or the receiving administrator is incorrect (especially if this is a financial transaction). This means that under ALL circumstances, the ability to resend from end-to-end is REQUIRED and the ability for intermediaries to pass such a resend is necessary functionality.

What about the CPA parameters of Retries and RetryInterval? In the case of multi-hop, there will actually be two CPAs -- one with the end (CPA1) and one with the next hop (CPA2). The Retries to the first hop will be governed by Retries & RetryInterval in CPA2. What do Retries & RetryInterval in CPA1 mean? Of course, these govern Retries with the end! All the mechanisms needed for end-to-end retries already exist -- no change is required in the specification.
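
To make the two levels of Retries concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: RetryPolicy, Receiver, Intermediary and send_reliably are hypothetical names that appear nowhere in the specification, the intermediary always returns its hop acknowledgement, and the RetryInterval wait is recorded but never slept on.

```python
from dataclasses import dataclass, field


@dataclass
class RetryPolicy:
    """Hypothetical stand-in for the Retries / RetryInterval pair of one CPA."""
    retries: int
    retry_interval: float  # seconds; kept for illustration, never slept on


@dataclass
class Receiver:
    """To Party MSH: delivers each MessageId once, but receipts every copy."""
    seen: set = field(default_factory=set)
    delivered: list = field(default_factory=list)

    def receive(self, message_id, payload):
        if message_id not in self.seen:  # duplicate detection by MessageId
            self.seen.add(message_id)
            self.delivered.append(payload)
        return ("DeliveryReceipt", message_id)


@dataclass
class Intermediary:
    """The black box between the ends; loses some receipts on the way back."""
    receiver: Receiver
    lose_receipts: int = 0  # how many Delivery Receipts to drop (assumption)

    def forward(self, message_id, payload):
        receipt = self.receiver.receive(message_id, payload)
        hop_ack = ("HopAck", message_id)
        if self.lose_receipts > 0:
            self.lose_receipts -= 1
            return hop_ack, None  # hop acknowledged, end-to-end receipt lost
        return hop_ack, receipt


def send_reliably(hop, message_id, payload, cpa1, cpa2):
    """Return the number of end-to-end attempts used, or None on failure."""
    for attempt in range(1, cpa1.retries + 2):  # first send plus CPA1 Retries
        receipt = None
        for _ in range(cpa2.retries + 1):  # first send plus CPA2 Retries
            hop_ack, receipt = hop.forward(message_id, payload)
            if hop_ack is not None:
                break  # the next hop has the message
        # A hop acknowledgement alone proves nothing about the far end.
        if receipt == ("DeliveryReceipt", message_id):
            return attempt
        # No Delivery Receipt: resend with the ORIGINAL MessageId.
    return None


receiver = Receiver()
hop = Intermediary(receiver, lose_receipts=2)
attempts = send_reliably(hop, "msg-1", "order",
                         cpa1=RetryPolicy(retries=3, retry_interval=60.0),
                         cpa2=RetryPolicy(retries=2, retry_interval=5.0))
```

In this run the first two Delivery Receipts are lost even though every hop acknowledgement arrives, so the Sender needs a third end-to-end attempt. Because each resend carries the original MessageId, the Receiver hands the payload to the application only once.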
The only problem is that the addition of multi-hop interferes with end-to-end retries (duplicates) which, as we have seen, is a fundamental functional requirement under all circumstances when a Delivery Receipt is requested but not received.

The question is then not whether to allow end-to-end Retries but rather how to accomplish this since it MUST be allowed. The automation of those retries (or not) is not really important and can actually be left as an implementation detail.

Regards,

David Fischer
Drummond Group.