I have to admit there is a CPA DeliveryChannel manageability
issue with the transparent Intermediary solution.
Now, two potentially interesting directions to follow from
this conversation:
1. Another "multi-hop topology requirement" should be
considered in addition to:
(1) simple hub-and-spoke model (with a single
intermediary)
(2) networked-gateways model, where each endpoint MSH is
never more than 1 hop away from a "public I-cloud" made of intermediaries
connected to each other, that know each other's URLs (equivalent to
"federated" hub-and-spoke topologies, apparently like the CDC
model).
We must consider as Pim suggests:
(3) connected private MSH subnets, where there may still be
a public I-cloud, but each one of its intermediaries serves as a bridge to a
private network, the URLs of which are unknown and inaccessible from
outside.
Here, we could look at what assumptions we could make about
routing functions, for minimal management overhead.
For example it seems like there would be 3 parts in
every end-to-end multihop path across networks, that could be routed based on
different functions: (a) from MSH sender to MSH bridge 1, (b) from bridge 1 to
bridge 2, (c) from bridge 2 to MSH receiver.
This kind of topology assumption could make the "reverse
routing" (e.g. of responses) easier.
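As a rough illustration of the three-part path in (3), the sketch below splits an end-to-end route into segments (a), (b) and (c), each of which could be resolved by a different routing function. All party IDs, subnet names and bridge names are hypothetical placeholders, not taken from any spec or implementation.

```python
# Hypothetical sketch: party IDs, subnet names and bridge names are
# illustrative only.

# Which private subnet each endpoint MSH lives in.
SUBNET_OF_PARTY = {"partyA": "net1", "partyB": "net2"}
# The intermediary that bridges each private subnet to the public I-cloud.
BRIDGE_OF_SUBNET = {"net1": "bridge1", "net2": "bridge2"}


def route_segments(from_party, to_party):
    """Return the (a), (b), (c) segments of a path between two endpoints."""
    bridge1 = BRIDGE_OF_SUBNET[SUBNET_OF_PARTY[from_party]]
    bridge2 = BRIDGE_OF_SUBNET[SUBNET_OF_PARTY[to_party]]
    return [
        ("a", from_party, bridge1),  # sender MSH to its own bridge
        ("b", bridge1, bridge2),     # bridge to bridge, across the I-cloud
        ("c", bridge2, to_party),    # bridge to receiver, inside its subnet
    ]


def reverse_segments(segments):
    """Reverse routing: walk the same bridges in the opposite direction."""
    hops = [(dst, src) for _, src, dst in reversed(segments)]
    return [(label, src, dst) for label, (src, dst) in zip("abc", hops)]
```

With this assumption a response simply reuses the two bridges of the request path, which is what could make reverse routing easier.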
2. Looking into hybrid solutions between relayed Acks model
and transparent intermediaries model:
Advantage: No change in behavior for endpoint MSHs at
all, while avoiding the Ack mapping and RM header replacement at each
intermediary. Here is a preview (needs further validation):
- the "RM sequence handshake" is done in a way fairly
similar to the Relayed Ack approach: Sender MSH sends a CreateSequence to its
immediate intermediary, gets a Seq ID from it, then sends the first User Message
on this sequence, which only runs from endpoint to intermediary.
- the immediate intermediary will in turn issue a new
CreateSequence, and will piggyback the (first) User Message header on it so that
it is routed to the same ultimate destination as the User Message. This CS goes
all the way to the ultimate receiver. The last intermediary removes the User
Message header, gets the CreateSequenceResponse from the ultimate destination,
and sends back the CSR after putting back the UM header for reverse routing.
- the first intermediary now has established an RM sequence
up to the ultimate receiver.
- Sender MSH keeps using the initial sequence ID. In all
subsequent exchanges, the first intermediary just passes back and forth every RM
message/header, and only substitutes sequence IDs (so still RM header
replacement in first intermediary).
- this solution would look very similar to Sander's in the
hub-and-spoke case, but would still deal with two sequences at most for each path
regardless of how many intermediaries are added (these would be
"transparent").
- A variant could even avoid the sequence ID mapping and
avoid all RM header replacement (needs more thought).
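A minimal sketch of the sequence ID substitution done by the first intermediary in this hybrid model, assuming RM headers are represented as plain dictionaries. The class, method and field names are hypothetical; no WS-ReliableMessaging library is used or implied.

```python
import uuid


class FirstIntermediary:
    """Hypothetical first intermediary bridging two RM sequences."""

    def __init__(self):
        self.local_to_remote = {}
        self.remote_to_local = {}

    def on_create_sequence(self):
        """Answer the sender's CreateSequence with a locally issued ID."""
        return "seq-local-" + uuid.uuid4().hex

    def on_end_to_end_sequence(self, local_id, remote_id):
        """Record the sequence established with the ultimate receiver."""
        self.local_to_remote[local_id] = remote_id
        self.remote_to_local[remote_id] = local_id

    def forward(self, rm_header):
        """Sender to receiver: only the sequence ID is substituted."""
        out = dict(rm_header)
        out["sequence_id"] = self.local_to_remote[rm_header["sequence_id"]]
        return out

    def relay_back(self, rm_header):
        """Receiver to sender (e.g. an Ack): substitute the ID back."""
        out = dict(rm_header)
        out["sequence_id"] = self.remote_to_local[rm_header["sequence_id"]]
        return out
```

Only two sequence IDs are ever tracked per path here, however many transparent intermediaries sit between the first intermediary and the receiver.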
Jacques
I am hoping that my
comments can be picked out in green.
From: Pim van der Eijk [mailto:pvde@sonnenglanz.net]
Sent: Tuesday, April 08, 2008 1:10 PM
To: 'Durand, Jacques R.'; ebxml-msg@lists.oasis-open.org
Subject: RE: [ebxml-msg] Groups - Transparent MEP routing proposal V0.1 (ebMS-transparent-Multihop-MEPs-Routng.doc) uploaded
Some additional comments, see below in red. I would be very interested in other
people's fresh opinions on this; I am beginning to think I have spent too much
time in multi-hop projects and may be brainwashed
...
From: Durand, Jacques R. [mailto:JDurand@us.fujitsu.com]
Sent: 03 April 2008 07:49
To: Pim van der Eijk; ebxml-msg@lists.oasis-open.org
Subject: RE: [ebxml-msg] Groups - Transparent MEP routing proposal V0.1 (ebMS-transparent-Multihop-MEPs-Routng.doc) uploaded
Pim:
inline
From: Pim van der Eijk [mailto:pvde@sonnenglanz.net]
Sent: Wednesday, April 02, 2008 1:35 PM
To: Durand, Jacques R.; ebxml-msg@lists.oasis-open.org
Subject: RE: [ebxml-msg] Groups - Transparent MEP routing proposal V0.1 (ebMS-transparent-Multihop-MEPs-Routng.doc) uploaded
Hello
Jacques,
Here are some quick comments
on this proposal (and some earlier
documents):
1) What does
"no additional capability besides
Core ebMS V3 is required from the endpoint MSHs involved at the boundaries of
the I-Cloud" mean? The proposal
describes a requirement that "The PMode parameter
PMode.Protocol.Address contains the Hub URL, the value of which must be extended
with an HTTP Query of the form: ?pmode=<ID>". In this proposal,
any support for multihop would mean the ebMS 3.0 processor must append this
query to the hub URL. I don't know if any of the few existing ebMS 3.0
processors support this ability to update the hub URL today, and it may not be
easy to add this behaviour to an existing SOAP stack. Or are we assuming this is
configured statically by users, e.g. in the CPA? In a CPA with a
dozen CanSends, each would be connected with a DeliveryChannel that references a
Transport element that contains a URL that includes a substring naming that same
DeliveryChannel. We would be missing a generalization, which users and
implementers will object to.
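To make the "?pmode=<ID>" extension concrete, here is a small sketch using only the Python standard library. The hub URL and the P-Mode ID are made-up values, and the helper names are hypothetical; the proposal itself only specifies the form of the query.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def address_with_pmode(hub_url, pmode_id):
    """Return the hub URL extended with a ?pmode=<ID> query."""
    parts = urlsplit(hub_url)
    query = parse_qs(parts.query)  # keep any query already present
    query["pmode"] = [pmode_id]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def pmode_from_address(url):
    """Hub side: recover the P-Mode ID from the request URL, if any."""
    values = parse_qs(urlsplit(url).query).get("pmode", [])
    return values[0] if values else None


# One sender with several P-Modes ends up with one URL per P-Mode.
urls = [address_with_pmode("http://hub.example.com/msh", f"pm-{n}")
        for n in range(1, 4)]
```

The last lines show the manageability concern raised above: every P-Mode yields its own address, so a configuration format that ties the URL to a reusable Transport element has to repeat that element for each one.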
<JD> Our
assumption is that the URL of the Hub (or Intermediary) is
configuration data: e.g. it has to be represented in the PMode that
describes an end-to-end exchange. We do not see any problem with our
own implementation on top of Axis 2, in adding an HTTP query. Now we
know that the CPA is a possible representation for the PMode. The CPA has
already been profiled / extended for V3.0. It may have to be updated again
for multi-hop, and with all due respect for the CPA, I believe a
multihop solution(s) should be designed independently from the
current CPA (ebms specs are supposed to work independently), then we
can adapt/extend the CPA as needed.
This being said, I
don't believe that a different DeliveryChannel has to be defined for each party
or each CanSend when the same Intermediary is used as
TransportReceiver/Endpoint. An identifier must exist in the CPA
(or PMode) that defines a set of messages sent to the same destination with the
same level of reliability. (Question: how do you know which message must be
associated with which RM sequence, in your solution?) This identifier is what
needs to be communicated to the first intermediary (or Hub) in our solution, along
with CreateSequence, whether it is already appended to the destination URL or
intended to be appended when the CPA is interpreted. I see this as a CPA binding
question that probably needs to be defined in a conformance
profile.
OK, but the
point I tried to make is that if an MSH can send messages with 20 different
p-modes, it has to reference 20 different URLs, which would need to be
enumerated in a CPA or equivalent configuration format. Not very
elegant?
CPAs and CPPs are
organized with “reusability” in mind, and therefore make use of certain
presumptions about what things are likely to be reused (that is, tend to be
shared for different actions within a service). Reusable “modules” are then
referenced (using IDREFs) so that they do not have to be repeated. The initial
TC members and ebXML participants believed that Endpoints and Transport details
would form a reusable module because actions within a service would tend to use
the same Endpoint (URL). Maintaining a distinct URL for each Action seemed to be
a management and governance mess to many participants and TC members.
De-multiplexing multiple kinds of Actions going to one URL is clearly possible
because of the variety of metadata (such as the values for Action and PartyId).
I think Pim shares a very widely held sentiment that using distinct URLs for
distinct Endpoints for every PMode would be quite an imposition on the
management of the community. Would a large community of 50,000 community
partners then have 50,000 distinct URLs issued? In addition to their distinct
PartyIds? Clearly this fixation on one part of the configuration information to
vary is possible. It is clearly not necessary, and I believe many customers will
find it undesirable and an imposition on their environment.
Technically speaking,
Pim is correct in saying that a different Delivery Channel will be needed for
every partner because a DeliveryChannel incorporates by reference a DocExchange
module and a Transport Module. Since the distinct Endpoints cause the Transport
Module to be distinct for each partner, the ID value must differ and so must the
DeliveryChannel element that has an attribute whose value (an IDREF) refers to
the Transport Module. Thus handling configurations for a Community becomes much
more awkward because they cannot be conveniently assembled from basic reusable
parts. It would still be possible in a CPA Template to substitute for the
Endpoint value. But for different Actions, a different Endpoint value would be
needed. So the template would need to make multiple distinct substitutions into
Transport elements, and repeat those Transport elements in building up the
CPA.
A regular v3.0
Core conformant implementation does not have to do this, and the few ebMS 3.0
implementations that have been developed so far no doubt
don't. And if they don't, they don't support your proposal.
Also, if this solution requires WS-Addressing, WS-Addressing is never
mentioned in v3.0 Core and the Core spec includes example messages that
don't have WS-Addressing anywhere. If it requires an eb:Routing header,
that is an element that does not even exist in the v3.0 Core XSD. So I
don't see how you can say that it does not require more than v3.0
Conformance.
By the way, I am
assuming the p-mode ID is used as a key to retrieve information that is used in
routing rules, e.g. to find out Service/Action or To/PartyId, i.e. the routing
rules would not reference p-mode IDs but derived information. Appendix D of v3.0
Core mentions several dozen p-mode parameters. Are these parameter values all
available for routing, or do you intend to propose a subset of "routable"
parameters. If the latter, would you agree that those would be parameters like
To/PartyId, Service/Action that are available in the ebMS business document
header in ebMS user messages?
My concern is why
these metadata values need to be repeated as parts of the URL.
If routing is
based on p-mode parameters, are you saying that routing cannot reference dynamic
properties (like ConversationId or Message
Properties)?
That
limitation would be an unfortunate restriction. I think Pim is correct about this
limitation.
2) This proposal is
based on features of HTTP as an underlying transport protocol. Even
if most ebMS implementations will use HTTP, the idea that an ebMS
multihop functionality ("level 3" in my earlier terminology) has
dependencies with transport protocols (two levels lower) does not seem
right architecturally. If someone defined another transport protocol binding
than HTTP for ebMS 3.0 (e.g. SMTP or JMS), he should be able to use our
multihop solution with those other lower level transport
protocols.
<JD> then let us
just say that the handshake part of our solution (initiating a new RM sequence)
is transport-binding specific (at least,
specific to the transport used between Sending endpoint and first Intermediary).
That wouldn't be the first ebMS feature the usability of which is tied to a
particular transport layer (e.g. Pull mode so far is not usable with an
underlying protocol that is not request-response, e.g. SMTP). It could be
that we don't have a solution for all transports.
Pull mode should be
possible using SMTP message stores (mailbox) as found in protocols such as IMAP
or POP3, but of course it would work quite differently and POP3 and IMAP lower
level could be used instead of fiddling with the HTTP request entity (body
part). The ebMS 3 Pull mode is a kind of message store solution for a
protocol that does not have an already defined message store extension. FTP
would have worked for a FTP client out of the box because the client could have
both sent and received data using FTP methods STOR and RETR using the FTP
directory structure as the store.
3) Page 5
"Step
4: The Hub receives the message, and
closes the HTTP connection (asynchronous case). The Hub determines where this
User Message must be sent, using routing function based on ebMS header
data". It would be better
if the Hub does a routability check (do I have a configuration rule that tells
me where to forward this message to?) before closing the connection. If it
does not know how to forward the message, the Hub could then return an ebMS
error directly. This means that a sender in a multihop context could
receive errors from both the ultimate recipient (as in the peer to peer
case) and from intermediaries.
<JD> good
point. A new type of error needs to be created for Intermediaries (e.g.
"RoutingFailure").
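A minimal sketch of the routability check discussed in point 3, assuming the Hub keeps a routing table keyed on To/PartyId and Service. The table contents, field names and return shape are hypothetical; only the "RoutingFailure" error name comes from the discussion above.

```python
# Hypothetical routing table: (To/PartyId, Service) -> next hop URL.
ROUTES = {("partyB", "OrderService"): "http://next-hop.example/msh"}


def check_routability(header):
    """Run before closing the HTTP connection to the sender.

    Returns an 'accepted' result with the next hop, or an error result
    that the Hub could return directly on the open connection.
    """
    key = (header.get("to_party"), header.get("service"))
    next_hop = ROUTES.get(key)
    if next_hop is None:
        return {
            "status": "error",
            "error": "RoutingFailure",
            "ref_to_message_id": header.get("message_id"),
        }
    return {"status": "accepted", "next_hop": next_hop}
```

A sender in a multihop context would then have to be prepared for errors from intermediaries as well as from the ultimate recipient.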
4) I have been
thinking of an alternative way to do what you want using this appended
query. An alternative that to me seems more in the spirit of the
SOAP processing model, would be to allow messages to contain more than one
eb:Messaging SOAP header element. One eb:Messaging block would be
targeted at ebMS intermediary nodes, identified using a separate "target"
attribute. The other would be targeted at the true recipient, as
in v3.0 Core. The idea is analogous to having multiple WS-Security
header blocks, targeted at multiple SOAP nodes. Given our goal to converge with
WS-* specifications, it seems best to leverage their type of solutions to
similar problems.
The only required
(compatible) update to the ebMS v3.0 schema is to add an optional "target"
attribute to eb:Messaging.
The eb:Messaging block
targeted at the intermediary could be used in various situations:
- When sending a
wsrm:CreateSequence message, it would serve to provide rich business document
routing header content (To/PartyId, Service) to enable the sequence to be
established with the right recipient MSH. Like the appended
"?pmode=<ID>", this second header is not
explicitly mentioned in v3.0 Core. But v3.0 Core is assumed to be
composable with other WS-* specifications not discussed in v3.0 Core (say,
WS-Addressing or WS-SecureConversation) like any well-designed WS spec without
this having to be described in the v3.0 spec. It may be an acceptable
price to pay, even for Endpoints. In a Web Browser you have to configure
something in your client to use an HTTP proxy too, after
all.
- But the Sending endpoint
could use the appended "?pmode=<ID>" trick to pass information
to the first intermediary that would allow to create the second ebMS header
block as in your proposal, if it somehow cannot be modified to create this
second block itself.
<JD> When HTTP is
used, the HTTP query trick has the advantage of not requiring more than Core V3
Conformance - i.e. no ebms-level additional capability, like this
piggybacking of an eb:Messaging header on a CreateSequence message. And
even if we do so, we could probably reuse regular eb:Messaging headers (same
Core V3 schema) without creating a "target" attribute, if we rely instead on
SOAP processing features - i.e. use instead the standard "role" attribute
that SOAP 1.2 headers support.
- Intermediaries could
have the same default logic as ebMS 2.0 to reverse route any ebMS (user or signal)
message by reversing eb:From and eb:To, copying eb:Service, eb:Action,
eb:ConversationId, eb:AgreementRef and setting eb:RefToMessageId based on
incoming eb:MessageId. We only need to think of a value for
eb:Action. This model could be used to return a
standalone wsrm:CreateSequenceResponse to the sending
MSH.
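The default reverse-routing rule just described can be sketched as follows, with ebMS header elements represented as dictionary keys. The function name and the dictionary representation are assumptions for illustration; the open question of which eb:Action value to use is left as an optional override.

```python
def reverse_route_header(incoming, new_message_id, action=None):
    """Build the header of a message routed back along the incoming path.

    From/To are swapped, Service, ConversationId and AgreementRef are
    copied, and RefToMessageId points at the incoming MessageId. Action is
    copied unless the caller supplies a different value.
    """
    return {
        "From": incoming["To"],
        "To": incoming["From"],
        "Service": incoming["Service"],
        "Action": action if action is not None else incoming["Action"],
        "ConversationId": incoming["ConversationId"],
        "AgreementRef": incoming["AgreementRef"],
        "MessageId": new_message_id,
        "RefToMessageId": incoming["MessageId"],
    }
```

For example, an intermediary could call this with the header of the message that carried a CreateSequence in order to address the standalone CreateSequenceResponse back to the sending MSH.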
<JD> we thought
of that. But that seems contrived. We believe again that within an I-Cloud,
routing should always be possible based on the URL of the destination node, in
case this URL is known (in many cases, a single hop directly to this URL will be
possible!). And that is usually the case for a response: we know where it should
go, or more generally, we know the URL of the Intermediary to which the response
destination MSH is connected (the last hop could be a Pull). Header data that
tells where the Response should go then needs to be added to the Request message
by the first intermediary. That could be a wsa header (ReplyTo) or other
ebMS-level header data.
Here I think
you have very different deployment scenarios and user communities in
mind than I. The existing ebMS 2.0 multihop users I know (and
whose interest I have in mind) operate in environments where
intermediaries are used to bridge private networks. Message handlers are
not supposed to directly connect beyond one intermediary. They're not supposed
to know the URL of intermediaries beyond the one they're connected to
immediately, but if they knew, that URL wouldn't be in their DNS. If
they knew the IP address, it would be an address in a different VPN
that they can't even ping to.
- When sending an
eb:Messaging/eb:SignalMessage, adding an eb:Messaging[@target='intermediary']
structure could provide rich header data to allow the signal to be
routed across hops, even though the Signal itself lacks business
semantics. This would even work for PullRequest which unlike
eb:Receipt and errors is not a response message, where the trick of rerouting
based on retrieving data from the preceding UserMessage using
MessageId/RefToMessageId would not work.
<JD> right that
PullRequest was never considered for "routing" so far. But we believe that is a
low priority: the case for pulling across the entire I-Cloud has not been made
yet... So yes some "routable" header could be added to a response Signal.
But if doing so, it is better if the first intermediary on its way does this,
rather than the endpoint MSH because that would require more than core V3
conformance.
- When sending an
eb:Messaging/eb:UserMessage, adding an eb:Messaging[@target='intermediary']
structure would not be necessary, if the ebMS header is not encrypted,
so that its header elements can be used for
routing.
- When sending an
eb:Messaging/eb:UserMessage, adding an eb:Messaging[@target='intermediary']
structure would allow the end-to-end ebMS header to be encrypted by the first
intermediary and decrypted by the last one or by the
recipient.
<JD> Possibly.
Although instead of adding a second eb:Messaging header, we could consider
instead an "eb:Routing" header that contains the subset of header values that
are set as "constants" in the PMode, if we admit that all messages related to
same PMode (or same CPA CanSend?) must be routed the same way. The first
Intermediary to which the endpoint MSH is connected would add this header, based
on PMode info (assuming here that an endpoint must "register" in some way to an
Intermediary that will serve as its gateway to the I-Cloud).
This has some
advantages in cases where (some of the) header data is sensitive too. The
eb:Messaging structure targeted at the intermediary could be copied from the
end-to-end headers, or it could have some derived generalized content (e.g.
mapping many From/PartyIds to a more general PartyId, e.g. the one of the first
intermediary). This would greatly simplify the maintenance of routing
tables at the intermediary, which would just have to know how to get from one
intermediary to another, not from any endpoint to any other endpoint. In
realistic use cases (e.g. hubs linking geographies, or various sectors) there
could be in the order of dozens of intermediaries serving thousands of
endpoints.
- The last intermediary
could remove this header block before forwarding the message to the Endpoint, as
in your proposal.
- The second ebMS header
would never be needed in peer-to-peer ebMS
messaging.
Note: the above attempts to
use routing based on ebMS header data for all ebMS traffic and WSRM lifecycle
messages. This does not preclude the use of optimized routing information,
as in your proposal. If the endpoint somehow pre-computes a way to express
the HTTP URL of the ultimate recipient in a WS-Addressing header, as a custom
header, as an HTTP URL or as an IP address, this could be encoded in the
SOAP structure too. But before we look at optimizations, I wanted to make
sure the ebMS 2.0 style of header-based routing can be used to cover end-to-end
routing of user messages, signals and reliability messages.
<JD> But the
main problem with ebms header-based routing of signals is that Core V3
does not support adding such headers... Note that even if you relax
this restriction, a "reverse-routing" based on exactly the same headers with
just a swap between FromParty and ToParty, assumes that the routing always works
with only ToParty. What if in a first phase the routing uses "Service", then
only for the last Intermediary, uses "ToParty"? How can an intermediary tell if
it must do reverse-routing or just regular routing? I am not convinced this
reverse routing system is viable - need to see it described more
completely.
Yes, perhaps at
an upcoming call or F2F
Dale,
who had to leave for another meeting before completing
comments.