OASIS Virtual I/O Device (VIRTIO) TC


[PATCH requirements 0/7] virtio net new features requirements

  • 1.  [PATCH requirements 0/7] virtio net new features requirements

    Posted 06-01-2023 22:04
    Hi All,

    This document captures the virtio net device requirements for the upcoming
    release 1.4 that some of us are currently working on. I would like to capture
    any other requirements linked to the features listed here, so that they can
    be considered in the interface design.

    The objectives are:
    1. To consider these requirements when introducing the new features listed in
       the document, and to work towards the interface design followed by drafting
       the specification changes.
    2. To define a practical list of requirements that can be achieved
       incrementally in the 1.4 timeframe and that can realistically be implemented.

    This is not a specification document. It is a pre-work document that helps to
    understand the draft specification addressing these requirements. Please review.

    Parav Pandit (7):
      net-features: Add requirements document for release 1.4
      net-features: Add low latency transmit queue requirements
      net-features: Add low latency receive queue requirements
      net-features: Add notification coalescing requirements
      net-features: Add n-tuple receive flow steering requirements
      net-features: Add packet timestamp requirements
      net-features: Add header data split requirements

     net-workstream/features-1.4.md | 224 +++++++++++++++++++++++++++++++++
     1 file changed, 224 insertions(+)
     create mode 100644 net-workstream/features-1.4.md

    --
    2.26.2


  • 2.  [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-01-2023 22:04
    Add requirements document template for the virtio net features.

    Add virtio net device counters visible to driver.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
     net-workstream/features-1.4.md | 36 ++++++++++++++++++++++++++++++++++
     1 file changed, 36 insertions(+)
     create mode 100644 net-workstream/features-1.4.md

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    new file mode 100644
    index 0000000..03b4eb3
    --- /dev/null
    +++ b/net-workstream/features-1.4.md
    @@ -0,0 +1,36 @@
    +# 1. Introduction
    +
    +This document describes the overall requirements for virtio net device
    +improvements for upcoming release 1.4. Some of these requirements are
    +interrelated and influence the interface design, hence reviewing them
    +together is desired while updating the virtio net interface.
    +
    +# 2. Summary
    +1. Device counters visible to the driver
    +
    +# 3. Requirements
    +## 3.1 Device counters
    +1. The driver should be able to query the device and/or per vq counters for
    +   debugging purpose using a control vq command.
    +2. The driver should be able to query which counters are supported using a
    +   control vq command.
    +3. If this device is migrated between two hosts, the driver should be able
    +   get the counter values in the destination host from where it was left
    +   off in the source host.
    +4. If a virtio device is group member device, a group owner should be able
    +   to query all the counter attributes using the admin queue command which
    +   a virtio device will expose via a control vq to the driver.
    +
    +### 3.1.1 Per receive queue counters
    +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
    +   oversize than the buffer size
    +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
    +   buffer in the receive queue
    +3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the device
    +4. le64 rx_pkts: Total packets received by the device
    +
    +### 3.1.2 Per transmit queue counters
    +1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
    +   descriptors
    +2. le64 tx_gso_pkts: Packets send as host GSO sequence
    +3. le64 tx_pkts: Total packets send by the device
    --
    2.26.2
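
    As an illustration only: expressed as C structures in the style the virtio
    spec uses, the per-queue counters listed in sections 3.1.1 and 3.1.2 above
    might look roughly as below. The struct names and the le64 typedef are
    readability assumptions, not part of the patch.

        #include <stdint.h>

        typedef uint64_t le64; /* little-endian 64-bit, as used in the virtio spec */

        /* Per receive queue counters (section 3.1.1 above). */
        struct vnet_rx_queue_counters {
                le64 rx_oversize_pkt_errors;  /* dropped: packet larger than the buffer */
                le64 rx_no_buffer_pkt_errors; /* dropped: no receive buffer available */
                le64 rx_gro_pkts;             /* packets treated as a receive GSO sequence */
                le64 rx_pkts;                 /* total packets received */
        };

        /* Per transmit queue counters (section 3.1.2 above). */
        struct vnet_tx_queue_counters {
                le64 tx_bad_desc_errors;      /* descriptors with errors */
                le64 tx_gso_pkts;             /* packets sent as a GSO sequence */
                le64 tx_pkts;                 /* total packets sent */
        };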


  • 3.  Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-06-2023 22:15
    On Fri, Jun 02, 2023 at 01:02:59AM +0300, Parav Pandit wrote:
    > Add requirements document template for the virtio net features.
    >
    > Add virtio net device counters visible to driver.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 36 ++++++++++++++++++++++++++++++++++
    > 1 file changed, 36 insertions(+)
    > create mode 100644 net-workstream/features-1.4.md
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > new file mode 100644
    > index 0000000..03b4eb3
    > --- /dev/null
    > +++ b/net-workstream/features-1.4.md
    > @@ -0,0 +1,36 @@
    > +# 1. Introduction
    > +
    > +This document describes the overall requirements for virtio net device
    > +improvements for upcoming release 1.4. Some of these requirements are
    > +interrelated and influence the interface design, hence reviewing them
    > +together is desired while updating the virtio net interface.
    > +
    > +# 2. Summary
    > +1. Device counters visible to the driver
    > +
    > +# 3. Requirements
    > +## 3.1 Device counters
    > +1. The driver should be able to query the device and/or per vq counters for
    > + debugging purpose using a control vq command.
    > +2. The driver should be able to query which counters are supported using a
    > + control vq command.

    why does it matter for requirements whether it's a control vq?


    what matters is whether they have to be synchronized
    with a given queue - I get it they don't have to.

    > +3. If this device is migrated between two hosts, the driver should be able
    > + get the counter values in the destination host from where it was left
    > + off in the source host.

    we don't cover migration currently, so I don't see how this is a spec
    requirement. unless maybe it's justification for 4?
    so maybe it means there needs to be a way to set counters?
    so there's no need to mention migration - just that it
    should be possible to move counters between devices.

    > +4. If a virtio device is group member device, a group owner should be able
    > + to query all the counter attributes using the admin queue command which
    > + a virtio device will expose via a control vq to the driver.


    this seems weirdly specific.
    what is the actual requirement?


    > +
    > +### 3.1.1 Per receive queue counters
    > +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
    > + oversize than the buffer size

    with mergeable buffers how does this differ from 2?

    > +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
    > + buffer in the receive queue
    > +3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the device

    what does this mean exactly? packets before or after they are combined?

    pls stick to device/driver terminology not guest/host

    > +4. le64 rx_pkts: Total packets received by the device

    including dropped ones or not?

    > +
    > +### 3.1.2 Per transmit queue counters
    > +1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
    > + descriptors

    since when do we drop packets on error in descriptor?
    we just as likely stall ...

    > +2. le64 tx_gso_pkts: Packets send as host GSO sequence

    same questions as gro

    > +3. le64 tx_pkts: Total packets send by the device

    sent

    > --
    > 2.26.2
    >
    >
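
    As an aside, one way to make the "move counters between devices" suggestion
    above concrete is a get/set pair along the following lines; the command
    names, fields, and layout are assumptions for illustration, not an existing
    or proposed interface.

        #include <stdint.h>

        typedef uint16_t le16; /* little-endian 16-bit, as used in the virtio spec */
        typedef uint64_t le64; /* little-endian 64-bit, as used in the virtio spec */

        /* Hypothetical pair of commands: read all counters of a queue on the
         * source device, then seed the same values into the destination device,
         * so the driver keeps observing monotonic counters after the move.
         */
        struct vnet_queue_counters_get {
                le16 vq_index;   /* queue whose counters are read */
                /* reply: an array of le64 counter values follows */
        };

        struct vnet_queue_counters_set {
                le16 vq_index;   /* queue whose counters are seeded */
                le64 values[];   /* counter values captured on the source device */
        };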





  • 5.  RE: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-06-2023 22:29


    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Tuesday, June 6, 2023 6:15 PM

    > > +# 3. Requirements
    > > +## 3.1 Device counters
    > > +1. The driver should be able to query the device and/or per vq counters for
    > > + debugging purpose using a control vq command.
    > > +2. The driver should be able to query which counters are supported using a
    > > + control vq command.
    >
    > why does it matter for requirements whether it's a control vq?
    >
    It matters for requirements, so we produce a design that addresses it.
    We don't want to add to config space an ever-growing bitmap which may be different between different devices.

    >
    > what matters is whether they have to be synchronized with a given queue - I get
    > it they don't have to.
    They don't have to be.
    I don't see how it could ever be synchronized.

    >
    > > +3. If this device is migrated between two hosts, the driver should be able
    > > + get the counter values in the destination host from where it was left
    > > + off in the source host.
    >
    > we don't cover migration currently don't see how this is a spec rquirement.
    > unless maybe it's justification for 4?
    True, but the design needs to keep this in mind if it has touch points like #4, so that when migration arrives the building blocks are in place.

    > so maybe it means there needs to be a way to set counters?
    > so there's no need to mention migration - just that it should be possible to
    > move counters between devices.
    >
    > > +4. If a virtio device is group member device, a group owner should be able
    > > + to query all the counter attributes using the admin queue command which
    > > + a virtio device will expose via a control vq to the driver.
    >
    >
    > this seems weirdly specific.
    > what is the actual requirement?
    >
    I don't follow the question.
    When a device migrates from src to dst, the group owner needs to know whether the underlying member devices on both sides have the same counter attributes or not.

    >
    > > +
    > > +### 3.1.1 Per receive queue counters
    > > +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
    > > + oversize than the buffer size
    >
    > with mergeable buffers how does this differ from 2?
    >
    > > +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
    > > + buffer in the receive queue
    > > +3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the
    > > +device
    >
    > what does this mean exactly? packets before or after they are combined?
    >
    Before.

    > pls stick to device/driver terminology not guest/host
    >
    Yes, will change the GUEST wording that surfaced from the current F_GUEST terminology of the net device.

    > > +4. le64 rx_pkts: Total packets received by the device
    >
    > including dropped ones or not?
    >
    Not including. Will add this clarification further in v1.

    > > +
    > > +### 3.1.2 Per transmit queue counters 1. le64 tx_bad_desc_errors:
    > > +Descriptors dropped by the device due to errors in
    > > + descriptors
    >
    > since when do we drop packets on error in descriptor?
    > we just as likely stall ...
    >
    It is left to the device implementation to decide what to do on a bad descriptor.

    > > +2. le64 tx_gso_pkts: Packets send as host GSO sequence
    >
    > same questions as gro
    >
    > > +3. le64 tx_pkts: Total packets send by the device
    >
    > sent
    >
    Ack.
    > >





  • 7.  Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-06-2023 22:57
    On Tue, Jun 06, 2023 at 10:28:46PM +0000, Parav Pandit wrote:
    >
    >
    > > From: Michael S. Tsirkin <mst@redhat.com>
    > > Sent: Tuesday, June 6, 2023 6:15 PM
    >
    > > > +# 3. Requirements
    > > > +## 3.1 Device counters
    > > > +1. The driver should be able to query the device and/or per vq counters for
    > > > + debugging purpose using a control vq command.
    > > > +2. The driver should be able to query which counters are supported using a
    > > > + control vq command.
    > >
    > > why does it matter for requirements whether it's a control vq?
    > >
    > It matters for requirements, so we produce design that addresses it.
    > We don't want to add config space every growing bit map which may be different between different devices.

    then say that you want to conserve config space, that is
    the requirement. not cvq specifically.

    > >
    > > what matters is whether they have to be synchronized with a given queue - I get
    > > it they don't have to.
    > They don't have to be.

    Then say that.

    > I don't see how it can be every synchronized.

    you could program them through the vq itself.


    > >
    > > > +3. If this device is migrated between two hosts, the driver should be able
    > > > + get the counter values in the destination host from where it was left
    > > > + off in the source host.
    > >
    > > we don't cover migration currently don't see how this is a spec rquirement.
    > > unless maybe it's justification for 4?
    > True, but design need to keep this in mind if it has some touch points like of #4 so when migration arrives, it has the building block in place.

    so again, let's make sure we capture the actual spec requirement.


    > > so maybe it means there needs to be a way to set counters?
    > > so there's no need to mention migration - just that it should be possible to
    > > move counters between devices.
    > >
    > > > +4. If a virtio device is group member device, a group owner should be able
    > > > + to query all the counter attributes using the admin queue command which
    > > > + a virtio device will expose via a control vq to the driver.
    > >
    > >
    > > this seems weirdly specific.
    > > what is the actual requirement?
    > >
    > I don't follow the question.
    > When a device migrates from src to dst, group owner needs to know if both side underlying member device has same counter attributes or not.

    whether it's through a command or not is not a requirement
    and I still do not know what the requirement is.
    what does "same counter attributes" mean? you never mentioned
    attributes before.


    > >
    > > > +
    > > > +### 3.1.1 Per receive queue counters
    > > > +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
    > > > + oversize than the buffer size
    > >
    > > with mergeable buffers how does this differ from 2?
    > >
    > > > +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
    > > > + buffer in the receive queue
    > > > +3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the
    > > > +device
    > >
    > > what does this mean exactly? packets before or after they are combined?
    > >
    > Before.
    >
    > > pls stick to device/driver terminology not guest/host
    > >
    > Yes, will change the GUEST surfaced from the current F_GUEST terminology of the net device.

    So? this predates the 1.x spec, we never bothered changing them.

    > > > +4. le64 rx_pkts: Total packets received by the device
    > >
    > > including dropped ones or not?
    > >
    > Not including. Will add this clarification further in v1.
    >
    > > > +
    > > > +### 3.1.2 Per transmit queue counters 1. le64 tx_bad_desc_errors:
    > > > +Descriptors dropped by the device due to errors in
    > > > + descriptors
    > >
    > > since when do we drop packets on error in descriptor?
    > > we just as likely stall ...
    > >
    > It is left to the device to implement on what to do on bad desc.

    then how can you count drops? why does this even matter?
    and why on tx specifically?
    I feel addressing descriptor errors is a completely separate project.


    > > > +2. le64 tx_gso_pkts: Packets send as host GSO sequence
    > >
    > > same questions as gro
    > >
    > > > +3. le64 tx_pkts: Total packets send by the device
    > >
    > > sent
    > >
    > Ack.
    > > >





  • 9.  RE: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-06-2023 23:08


    > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
    > open.org> On Behalf Of Michael S. Tsirkin
    > Sent: Tuesday, June 6, 2023 6:57 PM

    > > It matters for requirements, so we produce design that addresses it.
    > > We don't want to add config space every growing bit map which may be
    > different between different devices.
    >
    > then say that you want to conserve config space, that is the requirement. not
    > cvq specifically.
    >
    Well, in one meeting you specifically said that requirements and design should be combined together, so it is drafted this way,
    instead of something very abstract like "conserve config space".

    We can debate and change from cvq to new cfgvq as part of the journey.
    It is fine.

    > > >
    > > > what matters is whether they have to be synchronized with a given
    > > > queue - I get it they don't have to.
    > > They don't have to be.
    >
    > Then say that.
    >
    I thought this was fairly obvious, as querying a counter is a hundred times slower than packet processing.

    > > I don't see how it can be every synchronized.
    >
    > you could program them through the vq itself.
    >
    Do you mean in the packet transmit and receive completions themselves?
    It would be too heavy to mix control fields into the data path.
    > > > we don't cover migration currently don't see how this is a spec rquirement.
    > > > unless maybe it's justification for 4?
    > > True, but design need to keep this in mind if it has some touch points like of
    > #4 so when migration arrives, it has the building block in place.
    >
    > so again, let's make sure we capture the actual spec requirement.
    >
    It is an actual spec requirement to be drafted in the spec.
    We may not do everything in the first phase; these are the broad requirements.
    And in the design, we will say requirement #4 is phase 2.

    >
    > > > so maybe it means there needs to be a way to set counters?
    > > > so there's no need to mention migration - just that it should be
    > > > possible to move counters between devices.
    > > >
    > > > > +4. If a virtio device is group member device, a group owner should be
    > able
    > > > > + to query all the counter attributes using the admin queue command
    > which
    > > > > + a virtio device will expose via a control vq to the driver.
    > > >
    > > >
    > > > this seems weirdly specific.
    > > > what is the actual requirement?
    > > >
    > > I don't follow the question.
    > > When a device migrates from src to dst, group owner needs to know if both
    > side underlying member device has same counter attributes or not.
    >
    > whether it's through a command or not is not a requirement and I still do not
    > know what the requirement is.
    > what does "same counter attributes" mean? you never mentioned attributes
    > before.
    >
    I will refine this further and drop the word "attribute".
    If the device supports X counters, the group owner should be able to know this bitmap.

    > > Yes, will change the GUEST surfaced from the current F_GUEST terminology of
    > the net device.
    >
    > So? this predates 1.x spec we never bothered changing them.
    >
    Will remove the guest wording and will change to transmit and receive.


    > > > > +4. le64 rx_pkts: Total packets received by the device
    > > >
    > > > including dropped ones or not?
    > > >
    > > Not including. Will add this clarification further in v1.
    > >
    > > > > +
    > > > > +### 3.1.2 Per transmit queue counters 1. le64 tx_bad_desc_errors:
    > > > > +Descriptors dropped by the device due to errors in
    > > > > + descriptors
    > > >
    > > > since when do we drop packets on error in descriptor?
    > > > we just as likely stall ...
    > > >
    > > It is left to the device to implement on what to do on bad desc.
    >
    > then how can you count drops?
    A device may count drops based on the errors it got.

    > why does this even matter?
    To debug.

    > and why on tx specifically?
    Missed the rx, will add.

    > I feel addressing descriptor errors is a completely separate project.

    Not sure it is that big a project.
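
    As an illustration of the "bitmap of supported counters" mentioned above, the
    query reply could be shaped roughly as follows, whether it travels over the
    control vq or over an admin command. The macro names, bit assignments, and
    struct are assumptions, not the proposed interface.

        #include <stdint.h>

        typedef uint64_t le64; /* little-endian 64-bit, as used in the virtio spec */

        /* Hypothetical bit assignments, one per counter in the requirements list. */
        #define VNET_CTR_RX_OVERSIZE_PKT_ERRORS  (1ULL << 0)
        #define VNET_CTR_RX_NO_BUFFER_PKT_ERRORS (1ULL << 1)
        #define VNET_CTR_RX_GRO_PKTS             (1ULL << 2)
        #define VNET_CTR_RX_PKTS                 (1ULL << 3)
        #define VNET_CTR_TX_BAD_DESC_ERRORS      (1ULL << 4)
        #define VNET_CTR_TX_GSO_PKTS             (1ULL << 5)
        #define VNET_CTR_TX_PKTS                 (1ULL << 6)

        /* Reply to a "which counters are supported" query: a set bit means the
         * device implements that counter.
         */
        struct vnet_supported_counters_reply {
                le64 supported_bitmap;
        };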




  • 11.  Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-06-2023 23:18
    On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
    >
    >
    > > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
    > > open.org> On Behalf Of Michael S. Tsirkin
    > > Sent: Tuesday, June 6, 2023 6:57 PM
    >
    > > > It matters for requirements, so we produce design that addresses it.
    > > > We don't want to add config space every growing bit map which may be
    > > different between different devices.
    > >
    > > then say that you want to conserve config space, that is the requirement. not
    > > cvq specifically.
    > >
    > Well in one meeting you specifically told that requirements and design to be combined together, so it is drafted this way.
    > Instead of very abstract like "conserve config space".
    >
    > We can debate and change from cvq to new cfgvq as part of the journey.
    > It is fine.

    just don't keep listing this in each feature. my plan is to work
    on cfgvq so we can keep using config space without these
    limitations.

    > > > >
    > > > > what matters is whether they have to be synchronized with a given
    > > > > queue - I get it they don't have to.
    > > > They don't have to be.
    > >
    > > Then say that.
    > >
    > I thought this is very obvious as querying counter is hundred-time slower operation than packet processing.
    >
    > > > I don't see how it can be every synchronized.
    > >
    > > you could program them through the vq itself.
    > >
    > Do you mean in the packet transmit and receive completions itself?
    > It would be too heavy to do so to mix the control fields in the data path.

    maybe. but notice how you have a specific design in mind, so you are jumping ahead.
    you asked how we could make these synchronous, i told you how.
    the actual point is you want to say "we don't need this synchronous";
    design idea: use cvq


    > > > > we don't cover migration currently don't see how this is a spec rquirement.
    > > > > unless maybe it's justification for 4?
    > > > True, but design need to keep this in mind if it has some touch points like of
    > > #4 so when migration arrives, it has the building block in place.
    > >
    > > so again, let's make sure we capture the actual spec requirement.
    > >
    > It is an actual spec requirement to be done and drafted in the spec.
    > We may not do everything in the first phase, this are the broad requirements.
    > And in design, we will say, requirement #4 is phase 2.

    yes but this one I don't know what it means, spec-wise.
    some kind of building block? what for?

    > >
    > > > > so maybe it means there needs to be a way to set counters?
    > > > > so there's no need to mention migration - just that it should be
    > > > > possible to move counters between devices.
    > > > >
    > > > > > +4. If a virtio device is group member device, a group owner should be
    > > able
    > > > > > + to query all the counter attributes using the admin queue command
    > > which
    > > > > > + a virtio device will expose via a control vq to the driver.
    > > > >
    > > > >
    > > > > this seems weirdly specific.
    > > > > what is the actual requirement?
    > > > >
    > > > I don't follow the question.
    > > > When a device migrates from src to dst, group owner needs to know if both
    > > side underlying member device has same counter attributes or not.
    > >
    > > whether it's through a command or not is not a requirement and I still do not
    > > know what the requirement is.
    > > what does "same counter attributes" mean? you never mentioned attributes
    > > before.
    > >
    > I will refine this further and drop the word "attribute".
    > If device support X counters, group owner should be able to know this bitmap.

    i guess a device might only support part of the counters then?

    > > > Yes, will change the GUEST surfaced from the current F_GUEST terminology of
    > > the net device.
    > >
    > > So? this predates 1.x spec we never bothered changing them.
    > >
    > Will remove the guest wording and will change to transmit and receive.
    >
    >
    > > > > > +4. le64 rx_pkts: Total packets received by the device
    > > > >
    > > > > including dropped ones or not?
    > > > >
    > > > Not including. Will add this clarification further in v1.
    > > >
    > > > > > +
    > > > > > +### 3.1.2 Per transmit queue counters 1. le64 tx_bad_desc_errors:
    > > > > > +Descriptors dropped by the device due to errors in
    > > > > > + descriptors
    > > > >
    > > > > since when do we drop packets on error in descriptor?
    > > > > we just as likely stall ...
    > > > >
    > > > It is left to the device to implement on what to do on bad desc.
    > >
    > > then how can you count drops?
    > A device may count the drops based on errors.
    > It is counting drops based on the error it got.
    >
    > > why does this even matter?
    > To debug.

    for debugging it's easiest if you just stop the vq, instead of drops.

    > > and why on tx specifically?
    > Missed the rx, will add.
    >
    > > I feel addressing descriptor errors is a completely separate project.
    >
    > Not sure if it is that big project.

    I would have a separate debug section.

    --
    MST





  • 13.  Re: [virtio] Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-07-2023 09:03
    On Tue, 6 Jun 2023 19:18:06 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
    > On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
    > >
    > >
    > > > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
    > > > open.org> On Behalf Of Michael S. Tsirkin
    > > > Sent: Tuesday, June 6, 2023 6:57 PM
    > >
    > > > > It matters for requirements, so we produce design that addresses it.
    > > > > We don't want to add config space every growing bit map which may be
    > > > different between different devices.
    > > >
    > > > then say that you want to conserve config space, that is the requirement. not
    > > > cvq specifically.
    > > >
    > > Well in one meeting you specifically told that requirements and design to be combined together, so it is drafted this way.
    > > Instead of very abstract like "conserve config space".
    > >
    > > We can debate and change from cvq to new cfgvq as part of the journey.
    > > It is fine.
    >
    > just don't keep listing this in each feature. my plan is to work
    > on cfgvq so we can keep using config space without these
    > limitations.


    cfgvq?

    Did I miss something?


    Thanks.

    >
    > > > > >
    > > > > > what matters is whether they have to be synchronized with a given
    > > > > > queue - I get it they don't have to.
    > > > > They don't have to be.
    > > >
    > > > Then say that.
    > > >
    > > I thought this is very obvious as querying counter is hundred-time slower operation than packet processing.
    > >
    > > > > I don't see how it can be every synchronized.
    > > >
    > > > you could program them through the vq itself.
    > > >
    > > Do you mean in the packet transmit and receive completions itself?
    > > It would be too heavy to do so to mix the control fields in the data path.
    >
    > maybe. but notice how you have a specific design in mind so you are jumping ahead.
    > you asked how we could make these synch, i told you how.
    > the actual point is you want to say "we don't need this synchronous",
    > design idea: use cvq
    >
    >
    > > > > > we don't cover migration currently don't see how this is a spec rquirement.
    > > > > > unless maybe it's justification for 4?
    > > > > True, but design need to keep this in mind if it has some touch points like of
    > > > #4 so when migration arrives, it has the building block in place.
    > > >
    > > > so again, let's make sure we capture the actual spec requirement.
    > > >
    > > It is an actual spec requirement to be done and drafted in the spec.
    > > We may not do everything in the first phase, this are the broad requirements.
    > > And in design, we will say, requirement #4 is phase 2.
    >
    > yes but this one I don't know what it means, spec wise.
    > some kind of block? what for?
    >
    > > >
    > > > > > so maybe it means there needs to be a way to set counters?
    > > > > > so there's no need to mention migration - just that it should be
    > > > > > possible to move counters between devices.
    > > > > >
    > > > > > > +4. If a virtio device is group member device, a group owner should be
    > > > able
    > > > > > > + to query all the counter attributes using the admin queue command
    > > > which
    > > > > > > + a virtio device will expose via a control vq to the driver.
    > > > > >
    > > > > >
    > > > > > this seems weirdly specific.
    > > > > > what is the actual requirement?
    > > > > >
    > > > > I don't follow the question.
    > > > > When a device migrates from src to dst, group owner needs to know if both
    > > > side underlying member device has same counter attributes or not.
    > > >
    > > > whether it's through a command or not is not a requirement and I still do not
    > > > know what the requirement is.
    > > > what does "same counter attributes" mean? you never mentioned attributes
    > > > before.
    > > >
    > > I will refine this further and drop the word "attribute".
    > > If device support X counters, group owner should be able to know this bitmap.
    >
    > i guess device might only support part of counters then?
    >
    > > > > Yes, will change the GUEST surfaced from the current F_GUEST terminology of
    > > > the net device.
    > > >
    > > > So? this predates 1.x spec we never bothered changing them.
    > > >
    > > Will remove the guest wording and will change to transmit and receive.
    > >
    > >
    > > > > > > +4. le64 rx_pkts: Total packets received by the device
    > > > > >
    > > > > > including dropped ones or not?
    > > > > >
    > > > > Not including. Will add this clarification further in v1.
    > > > >
    > > > > > > +
    > > > > > > +### 3.1.2 Per transmit queue counters 1. le64 tx_bad_desc_errors:
    > > > > > > +Descriptors dropped by the device due to errors in
    > > > > > > + descriptors
    > > > > >
    > > > > > since when do we drop packets on error in descriptor?
    > > > > > we just as likely stall ...
    > > > > >
    > > > > It is left to the device to implement on what to do on bad desc.
    > > >
    > > > then how can you count drops?
    > > A device may count the drops based on errors.
    > > It is counting drops based on the error it got.
    > >
    > > > why does this even matter?
    > > To debug.
    >
    > for debugging it's easiest if you just stop the vq, instead of drops.
    >
    > > > and why on tx specifically?
    > > Missed the rx, will add.
    > >
    > > > I feel addressing descriptor errors is a completely separate project.
    > >
    > > Not sure if it is that big project.
    >
    > I would have a separate debug section.
    >
    > --
    > MST




  • 15.  Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-07-2023 20:36
    On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
    > We can debate and change from cvq to new cfgvq as part of the journey.
    > It is fine.

    Just to stress, cvq has a bunch of drawbacks. For example it is
    hard to access from the hypervisor since it's DMA from VF.
    This is why I think our direction should be to add a vq
    transport that does all config accesses over admin commands.
    Or maybe I'm wrong.

    But, let's distinguish between requirements and design.
    Getting supported counters is a requirement.
    Doing this over cvq is a design. Reducing MMIO registers is
    a requirement. OK?


    --
    MST
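
    As a purely illustrative sketch of the direction discussed here (supported counters queried by the group owner through an admin command instead of MMIO or the member's cvq), the structures below show one possible shape. The command and all field names are hypothetical and not part of any draft.

    ```
    /* Hypothetical admin command payload for a group owner to read which
     * counters a member device supports; names and layout are illustrative
     * only, not a proposed format.
     */
    struct vnet_admin_counter_caps_get {
        le64 member_id;   /* group member device being queried */
        le16 vq_index;    /* queue whose counters are of interest */
        u8 reserved[6];
    };

    struct vnet_admin_counter_caps_result {
        le64 supported_counters; /* bitmap, one bit per counter defined by the spec */
    };
    ```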





  • 17.  RE: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-07-2023 20:39


    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Wednesday, June 7, 2023 4:36 PM
    >
    > On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
    > > We can debate and change from cvq to new cfgvq as part of the journey.
    > > It is fine.
    >
    > Just to stress, cvq has a bunch of drawbacks. For example it is hard to access
    > from the hypervisor since it's DMA from VF.
    > This is why I think our direction should be to add a vq transport that does all
    > config accesses over admin commands.
    We debated this before, so I will write it below.
    > Or maybe I'm wrong.
    >

    > But, let's distinguish between requirements, and between requirements and
    > design. Getting supported counters is a requirement.
    > Doing this over cvq is a design. Reducing MMIO registers is a requirement. OK?

    Yes, I would write it as: reducing MMIO and accessing it over a vq of the device itself is a requirement.





  • 19.  Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-07-2023 20:50
    On Wed, Jun 07, 2023 at 08:39:11PM +0000, Parav Pandit wrote:
    >
    >
    > > From: Michael S. Tsirkin <mst@redhat.com>
    > > Sent: Wednesday, June 7, 2023 4:36 PM
    > >
    > > On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
    > > > We can debate and change from cvq to new cfgvq as part of the journey.
    > > > It is fine.
    > >
    > > Just to stress, cvq has a bunch of drawbacks. For example it is hard to access
    > > from the hypervisor since it's DMA from VF.
    > > This is why I think our direction should be to add a vq transport that does all
    > > config accesses over admin commands.
    > We debate this before, so will write is below.
    > > Or maybe I'm wrong.
    > >
    >
    > > But, let's distinguish between requirements, and between requirements and
    > > design. Getting supported counters is a requirement.
    > > Doing this over cvq is a design. Reducing MMIO registers is a requirement. OK?
    >
    > Yes, I would write it as reducing MMIO and accessing it over vq of the own device is a requirement.

    reducing mmio is a requirement (unrelated to counters btw, it's
    general). access over vq is a design point I think.

    --
    MST





  • 21.  RE: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-07-2023 20:53

    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Wednesday, June 7, 2023 4:50 PM
    >
    > On Wed, Jun 07, 2023 at 08:39:11PM +0000, Parav Pandit wrote:
    > >
    > >
    > > > From: Michael S. Tsirkin <mst@redhat.com>
    > > > Sent: Wednesday, June 7, 2023 4:36 PM
    > > >
    > > > On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
    > > > > We can debate and change from cvq to new cfgvq as part of the journey.
    > > > > It is fine.
    > > >
    > > > Just to stress, cvq has a bunch of drawbacks. For example it is hard
    > > > to access from the hypervisor since it's DMA from VF.
    > > > This is why I think our direction should be to add a vq transport
    > > > that does all config accesses over admin commands.
    > > We debate this before, so will write is below.
    > > > Or maybe I'm wrong.
    > > >
    > >
    > > > But, let's distinguish between requirements, and between
    > > > requirements and design. Getting supported counters is a requirement.
    > > > Doing this over cvq is a design. Reducing MMIO registers is a requirement.
    > OK?
    > >
    > > Yes, I would write it as reducing MMIO and accessing it over vq of the own
    > device is a requirement.
    >
    > reducing mmio is a requirement (unrelated to counters btw, it's general). access
    > over vq is a design point I think.

    Didn't you ask to combine requirements and high-level design together in the first draft? What changed?

    Whether we should create a new queue, reuse the cvq, or generalize it over a cfg q is a detailed design discussion.




  • 23.  Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4

    Posted 06-07-2023 09:31
    On Fri, 2 Jun 2023 01:02:59 +0300, Parav Pandit <parav@nvidia.com> wrote:
    > Add requirements document template for the virtio net features.
    >
    > Add virtio net device counters visible to driver.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 36 ++++++++++++++++++++++++++++++++++
    > 1 file changed, 36 insertions(+)
    > create mode 100644 net-workstream/features-1.4.md
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > new file mode 100644
    > index 0000000..03b4eb3
    > --- /dev/null
    > +++ b/net-workstream/features-1.4.md
    > @@ -0,0 +1,36 @@
    > +# 1. Introduction
    > +
    > +This document describes the overall requirements for virtio net device
    > +improvements for upcoming release 1.4. Some of these requirements are
    > +interrelated and influence the interface design, hence reviewing them
    > +together is desired while updating the virtio net interface.
    > +
    > +# 2. Summary
    > +1. Device counters visible to the driver
    > +
    > +# 3. Requirements
    > +## 3.1 Device counters
    > +1. The driver should be able to query the device and/or per vq counters for
    > + debugging purpose using a control vq command.
    > +2. The driver should be able to query which counters are supported using a
    > + control vq command.
    > +3. If this device is migrated between two hosts, the driver should be able
    > + get the counter values in the destination host from where it was left
    > + off in the source host.
    > +4. If a virtio device is group member device, a group owner should be able
    > + to query all the counter attributes using the admin queue command which
    > + a virtio device will expose via a control vq to the driver.
    > +
    > +### 3.1.1 Per receive queue counters
    > +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
    > + oversize than the buffer size
    > +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
    > + buffer in the receive queue
    > +3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the device
    > +4. le64 rx_pkts: Total packets received by the device
    > +
    > +### 3.1.2 Per transmit queue counters
    > +1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
    > + descriptors
    > +2. le64 tx_gso_pkts: Packets send as host GSO sequence
    > +3. le64 tx_pkts: Total packets send by the device


    We hope to support device custom counters. That is, the virtio spec provides a
    channel between the driver and the device, and both the key and the value are provided by the device.

    We discussed this issue earlier, and after some internal practice, I think it is
    still necessary to discuss this again.

    This is very important: each cloud vendor will always have some special counters,
    and these counters may not exist for another vendor. At the same time, if we have
    to discuss it in the spec every time we add a counter or add a feature, I
    think that is very inconvenient. Manufacturers may add new counters at any
    time based on new requirements, and some counters may also be removed at any
    time.

    Of course I know that doing this might hurt migration. But what I want to say is:
    why does it affect live migration? We plan to expose these counters to users
    through ethtool, so the set of ethtool counters a user sees may change. Does
    this have any practical impact? Alternatively, with some other form of output,
    we can clearly tell the user that these counters may change during the
    migration process. For example, the driver may be migrated to a new device
    that supports some new counters. I think users should be able to see these
    new counters; these new counters may even be the purpose of the migration.

    We don't need to support live migration at this point.

    Thanks.


    > --
    > 2.26.2
    >
    >
    > This publicly archived list offers a means to provide input to the
    > OASIS Virtual I/O Device (VIRTIO) TC.
    >
    > In order to verify user consent to the Feedback License terms and
    > to minimize spam in the list archive, subscription is required
    > before posting.
    >
    > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
    > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
    > List help: virtio-comment-help@lists.oasis-open.org
    > List archive: https://lists.oasis-open.org/archives/virtio-comment/
    > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
    > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
    > Committee: https://www.oasis-open.org/committees/virtio/
    > Join OASIS: https://www.oasis-open.org/join/
    >
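
    A rough sketch of the kind of key/value counter channel suggested above, where the device rather than the spec names the counters. The structure and field names are assumptions for illustration only, not an agreed design.

    ```
    /* Hypothetical control vq reply carrying device-defined counters as
     * key/value pairs; layout and names are illustrative only.
     */
    struct vnet_custom_counter_entry {
        u8 name[32];      /* NUL-terminated counter name chosen by the device */
        le64 value;       /* current value of the counter */
    };

    struct vnet_custom_counter_reply {
        le16 num_entries; /* number of entries that follow */
        u8 reserved[6];
        struct vnet_custom_counter_entry entries[];
    };
    ```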




  • 25.  [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements

    Posted 06-01-2023 22:04
    Add requirements for the low latency transmit queue.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    net-workstream/features-1.4.md | 73 ++++++++++++++++++++++++++++++++++
    1 file changed, 73 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index 03b4eb3..55f1b1f 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -7,6 +7,7 @@ together is desired while updating the virtio net interface.

    # 2. Summary
    1. Device counters visible to the driver
    +2. Low latency tx virtqueue for PCI transport

    # 3. Requirements
    ## 3.1 Device counters
    @@ -34,3 +35,75 @@ together is desired while updating the virtio net interface.
    descriptors
    2. le64 tx_gso_pkts: Packets send as host GSO sequence
    3. le64 tx_pkts: Total packets send by the device
    +
    +## 3.2 Low PCI latency virtqueues
    +### 3.2.1 Low PCI latency tx virtqueue
    +1. Packet transmit descriptor should contain data descriptors count without any
    + indirection and without any O(N) search to find the end of a packet stream.
    + For example, a packet transmit descriptor (called vnet_tx_hdr_desc
    + subsequently) to contain a field num_next_desc for the packet stream
    + indicating that a packet is located N data descriptors.
    +
    +2. Packet transmit descriptor should contain segmentation offload-related fields
    + without any indirection. For example, packet transmit descriptor to contain
    + gso_type, gso_size/mss, header length, csum placement byte offset, and
    + csum start.
    +
    +3. Packet transmit descriptor should be able to place a small size packet that
    + does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue memory.
    + For example a TCP ack only packet can fit in a descriptor memory which
    + otherwise consume more than 25% of metadata to describe the packet.
    +
    +4. Packet transmit descriptor should be able to place a full GSO header (L2 to
    + L4) after header descriptor and before data descriptors. For example, the
    + GSO header is placed after struct vnet_tx_hdr_desc in the virtqueue memory.
    + When such a GSO header is positioned adjacent to the packet transmit
    + descriptor, and when the GSO header is not aligned to 16B, the following
    + data descriptor to start on the 8B aligned boundary.
    +
    +5. An example of the above requirements at high level is:
    +
    +```
    +struct vitio_packed_q_desc {
    + /* current desc for reference */
    + u64 address;
    + u32 len;
    + u16 id;
    + u16 flags;
    +};
    +
    +/* Constant size header descriptor for tx packets */
    +struct vnet_tx_hdr_desc {
    + u16 flags; /* indicate how to parse next fields */
    + u16 id; /* desc id to come back in completion */
    + u8 num_next_desc; /* indicates the number of the next 16B data desc for this
    +                    * buffer.
    +                    */
    + u8 gso_type;
    + le16 gso_hdr_len;
    + le16 gso_size;
    + le16 csum_start;
    + le16 csum_offset;
    + u8 inline_pkt_len; /* indicates the length of the inline packet after this
    +                     * desc
    +                     */
    + u8 reserved;
    + u8 padding[];
    +};
    +
    +/* Example of a short packet or GSO header placed in the desc section of the vq
    + */
    +struct vnet_tx_small_pkt_desc {
    + u8 raw_pkt[128];
    +};
    +
    +/* Example of header followed by data descriptor */
    +struct vnet_tx_hdr_desc hdr_desc;
    +struct vnet_data_desc desc[2];
    +
    +```
    +6. Ability to zero pad the transmit completion when the transmit completion is
    + shorter than the CPU cache line size.
    +
    +7. Ability to place all transmit completion together with it per packet stream
    + transmit timestamp using single PCIe transcation.
    --
    2.26.2


  • 26.  Re: [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements

    Posted 06-06-2023 22:26
    On Fri, Jun 02, 2023 at 01:03:00AM +0300, Parav Pandit wrote:
    > Add requirements for the low latency transmit queue.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 73 ++++++++++++++++++++++++++++++++++
    > 1 file changed, 73 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index 03b4eb3..55f1b1f 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -7,6 +7,7 @@ together is desired while updating the virtio net interface.
    >
    > # 2. Summary
    > 1. Device counters visible to the driver
    > +2. Low latency tx virtqueue for PCI transport
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -34,3 +35,75 @@ together is desired while updating the virtio net interface.
    > descriptors
    > 2. le64 tx_gso_pkts: Packets send as host GSO sequence
    > 3. le64 tx_pkts: Total packets send by the device
    > +
    > +## 3.2 Low PCI latency virtqueues
    > +### 3.2.1 Low PCI latency tx virtqueue
    > +1. Packet transmit descriptor should contain data descriptors count without any
    > + indirection and without any O(N) search to find the end of a packet stream.
    > + For example, a packet transmit descriptor (called vnet_tx_hdr_desc
    > + subsequently) to contain a field num_next_desc for the packet stream
    > + indicating that a packet is located N data descriptors.
    > +
    > +2. Packet transmit descriptor should contain segmentation offload-related fields
    > + without any indirection. For example, packet transmit descriptor to contain
    > + gso_type, gso_size/mss, header length, csum placement byte offset, and
    > + csum start.
    > +
    > +3. Packet transmit descriptor should be able to place a small size packet that
    > + does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue memory.
    > + For example a TCP ack only packet can fit in a descriptor memory which
    > + otherwise consume more than 25% of metadata to describe the packet.

    For IPv6? With all the crazy options? Are you sure?

    > +
    > +4. Packet transmit descriptor should be able to place a full GSO header (L2 to
    > + L4) after header descriptor and before data descriptors. For example, the
    > + GSO header is placed after struct vnet_tx_hdr_desc in the virtqueue memory.
    > + When such a GSO header is positioned adjacent to the packet transmit
    > + descriptor, and when the GSO header is not aligned to 16B, the following
    > + data descriptor to start on the 8B aligned boundary.
    > +

    These alignments are weirdly specific.


    Are these achievable with existing VQ designs? If not this specific part
    is not going to be a network specific thing. which is ok but maybe
    mention this.

    > +5. An example of the above requirements at high level is:
    > +
    > +```
    > +struct vitio_packed_q_desc {
    > + /* current desc for reference */
    > + u64 address;
    > + u32 len;
    > + u16 id;
    > + u16 flags;
    > +};
    > +
    > +/* Constant size header descriptor for tx packets */
    > +struct vnet_tx_hdr_desc {
    > + u16 flags; /* indicate how to parse next fields */
    > + u16 id; /* desc id to come back in completion */
    > + u8 num_next_desc; /* indicates the number of the next 16B data desc for this
    > + * buffer.
    > + */
    > + u8 gso_type;
    > + le16 gso_hdr_len;
    > + le16 gso_size;
    > + le16 csum_start;
    > + le16 csum_offset;
    > + u8 inline_pkt_len; /* indicates the length of the inline packet after this
    > + * desc
    > + */
    > + u8 reserved;
    > + u8 padding[];
    > +};
    > +
    > +/* Example of a short packet or GSO header placed in the desc section of the vq
    > + */
    > +struct vnet_tx_small_pkt_desc {
    > + u8 raw_pkt[128];
    > +};
    > +
    > +/* Example of header followed by data descriptor */
    > +struct vnet_tx_hdr_desc hdr_desc;
    > +struct vnet_data_desc desc[2];
    > +
    > +```

    Seems to assume huge descriptors, apparently of variable size.
    Are fixed size descriptors valuable?



    > +6. Ability to zero pad the transmit completion when the transmit completion is
    > + shorter than the CPU cache line size.
    > +
    > +7. Ability to place all transmit completion together with it per packet stream
    > + transmit timestamp using single PCIe transcation.

    I can't decipher the last two ones. Part of it is using non virtio
    terminology, part referring to PCI/CPU unnecessarily, part
    weird grammar.

    > --
    > 2.26.2
    >
    >
    > This publicly archived list offers a means to provide input to the
    > OASIS Virtual I/O Device (VIRTIO) TC.
    >
    > In order to verify user consent to the Feedback License terms and
    > to minimize spam in the list archive, subscription is required
    > before posting.
    >
    > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
    > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
    > List help: virtio-comment-help@lists.oasis-open.org
    > List archive: https://lists.oasis-open.org/archives/virtio-comment/
    > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
    > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
    > Committee: https://www.oasis-open.org/committees/virtio/
    > Join OASIS: https://www.oasis-open.org/join/
    >





  • 28.  RE: [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements

    Posted 06-06-2023 22:36


    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Tuesday, June 6, 2023 6:26 PM

    > > +## 3.2 Low PCI latency virtqueues
    > > +### 3.2.1 Low PCI latency tx virtqueue 1. Packet transmit descriptor
    > > +should contain data descriptors count without any
    > > + indirection and without any O(N) search to find the end of a packet stream.
    > > + For example, a packet transmit descriptor (called vnet_tx_hdr_desc
    > > + subsequently) to contain a field num_next_desc for the packet stream
    > > + indicating that a packet is located N data descriptors.
    > > +
    > > +2. Packet transmit descriptor should contain segmentation offload-related
    > fields
    > > + without any indirection. For example, packet transmit descriptor to contain
    > > + gso_type, gso_size/mss, header length, csum placement byte offset, and
    > > + csum start.
    > > +
    > > +3. Packet transmit descriptor should be able to place a small size packet that
    > > + does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue
    > memory.
    > > + For example a TCP ack only packet can fit in a descriptor memory which
    > > + otherwise consume more than 25% of metadata to describe the packet.
    >
    > For IPv6? With all the crazy options? Are you sure?
    >
    IPv6 may not use all the options; for example, #3 is not useful for IPv6.

    > > +
    > > +4. Packet transmit descriptor should be able to place a full GSO header (L2 to
    > > + L4) after header descriptor and before data descriptors. For example, the
    > > + GSO header is placed after struct vnet_tx_hdr_desc in the virtqueue
    > memory.
    > > + When such a GSO header is positioned adjacent to the packet transmit
    > > + descriptor, and when the GSO header is not aligned to 16B, the following
    > > + data descriptor to start on the 8B aligned boundary.
    > > +
    >
    > These alignments are weirdly specific.
    >
    Not sure what that means; these alignments mainly exist so that the host side
    performs aligned writes.

    >
    > Are these achievable with existing VQ designs? If not this specific part is not
    > going to be a network specific thing. which is ok but maybe mention this.
    >
    It will probably be a VQ with a type/attributes.

    > > +5. An example of the above requirements at high level is:
    > > +
    > > +```
    > > +struct vitio_packed_q_desc {
    > > + /* current desc for reference */
    > > + u64 address;
    > > + u32 len;
    > > + u16 id;
    > > + u16 flags;
    > > +};
    > > +
    > > +/* Constant size header descriptor for tx packets */ struct
    > > +vnet_tx_hdr_desc {
    > > + u16 flags; /* indicate how to parse next fields */
    > > + u16 id; /* desc id to come back in completion */
    > > + u8 num_next_desc; /* indicates the number of the next 16B data desc for
    > this
    > > + * buffer.
    > > + */
    > > + u8 gso_type;
    > > + le16 gso_hdr_len;
    > > + le16 gso_size;
    > > + le16 csum_start;
    > > + le16 csum_offset;
    > > + u8 inline_pkt_len; /* indicates the length of the inline packet after this
    > > + * desc
    > > + */
    > > + u8 reserved;
    > > + u8 padding[];
    > > +};
    > > +
    > > +/* Example of a short packet or GSO header placed in the desc section
    > > +of the vq */ struct vnet_tx_small_pkt_desc {
    > > + u8 raw_pkt[128];
    > > +};
    > > +
    > > +/* Example of header followed by data descriptor */ struct
    > > +vnet_tx_hdr_desc hdr_desc; struct vnet_data_desc desc[2];
    > > +
    > > +```
    >
    > Seems to assume huge descriptors, apparently of variable size.
    > Are fixed size descriptors valuable?
    >
    It's variable size, but the size is known upfront, so in a way it is fixed size.

    >
    >
    > > +6. Ability to zero pad the transmit completion when the transmit
    > > +completion is shorter than the CPU cache line size.
    > > +
    > > +7. Ability to place all transmit completion together with it per packet stream
    > > + transmit timestamp using single PCIe transcation.
    >
    > I can't decipher the last two ones. Part of it is using non virtio terminology, part
    > referring to PCI/CPU unnecessarily, part weird grammar.
    >
    Yes, I will rewrite this part. It got messed up.
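
    To make requirements 6 and 7 more concrete ahead of that rewrite, one possible shape of a zero-padded transmit completion carrying a per-packet timestamp is sketched below. This is a discussion aid with made-up field names, not the proposed format.

    ```
    /* Illustrative transmit completion, zero padded to a 64B cache line so
     * that several completions, each with its timestamp, can be written
     * back in a single PCIe write; names are placeholders.
     */
    struct vnet_tx_completion {
        le16 flags;
        le16 id;          /* id taken from vnet_tx_hdr_desc */
        u8 reserved[4];
        le64 timestamp;   /* per packet stream transmit timestamp */
        u8 padding[48];   /* zero pad up to 64B */
    };
    ```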




  • 30.  [PATCH requirements 3/7] net-features: Add low latency receive queue requirements

    Posted 06-01-2023 22:04
    Add requirements for the low latency receive queue.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    net-workstream/features-1.4.md | 38 +++++++++++++++++++++++++++++++++-
    1 file changed, 37 insertions(+), 1 deletion(-)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index 55f1b1f..054f951 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.

    # 2. Summary
    1. Device counters visible to the driver
    -2. Low latency tx virtqueue for PCI transport
    +2. Low latency tx and rx virtqueues for PCI transport

    # 3. Requirements
    ## 3.1 Device counters
    @@ -107,3 +107,39 @@ struct vnet_data_desc desc[2];

    7. Ability to place all transmit completion together with it per packet stream
     transmit timestamp using single PCIe transcation.
    +
    +### 3.2.2 Low latency rx virtqueue
    +1. The device should be able to write a packet receive completion that consists
    + of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
    + PCIe TLP.
    +2. The device should be able to perform DMA writes of multiple packets
    + completions in a single DMA transaction up to the PCIe maximum write limit
    + in a transaction.
    +3. The device should be able to zero pad packet write completion to align it to
    + 64B or CPU cache line size whenever possible.
    +4. An example of the above DMA completion structure:
    +
    +```
    +/* Constant size receive packet completion */
    +struct vnet_rx_completion {
    + u16 flags;
    + u16 id; /* buffer id */
    + u8 gso_type;
    + u8 reserved[3];
    + le16 gso_hdr_len;
    + le16 gso_size;
    + le16 csum_start;
    + le16 csum_offset;
    + u16 reserved2;
    + u64 timestamp; /* explained later */
    + u8 padding[];
    +};
    +```
    +5. The driver should be able to post constant-size buffer pages on a receive
    + queue which can be consumed by the device for an incoming packet of any size
    + from 64B to 9K bytes.
    +6. The device should be able to know the constant buffer size at receive
    + virtqueue level instead of per buffer level.
    +7. The device should be able to indicate when a full page buffer is consumed,
    + which can be recycled by the driver when the packets from the completed
    + page is fully consumed.
    --
    2.26.2


  • 31.  Re: [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive queue requirements

    Posted 06-06-2023 22:33
    On Fri, Jun 02, 2023 at 01:03:01AM +0300, Parav Pandit wrote:
    > Add requirements for the low latency receive queue.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 38 +++++++++++++++++++++++++++++++++-
    > 1 file changed, 37 insertions(+), 1 deletion(-)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index 55f1b1f..054f951 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
    >
    > # 2. Summary
    > 1. Device counters visible to the driver
    > -2. Low latency tx virtqueue for PCI transport
    > +2. Low latency tx and rx virtqueues for PCI transport
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -107,3 +107,39 @@ struct vnet_data_desc desc[2];
    >
    > 7. Ability to place all transmit completion together with it per packet stream
    > transmit timestamp using single PCIe transcation.
    > +
    > +### 3.2.2 Low latency rx virtqueue
    > +1. The device should be able to write a packet receive completion that consists
    > + of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
    > + PCIe TLP.

    why? what is wrong with it being linear with packet instead?

    > +2. The device should be able to perform DMA writes of multiple packets
    > + completions in a single DMA transaction up to the PCIe maximum write limit
    > + in a transaction.
    > +3. The device should be able to zero pad packet write completion to align it to
    > + 64B or CPU cache line size whenever possible.

    assuming completion is used buffer, these are exactly 64 bytes with
    packed vq, and they are linear so can be written in one transaction.
    if so why list requirements which are already met?
    if you want them for completeness mention this.

    > +4. An example of the above DMA completion structure:
    > +
    > +```
    > +/* Constant size receive packet completion */
    > +struct vnet_rx_completion {
    > + u16 flags;
    > + u16 id; /* buffer id */
    > + u8 gso_type;
    > + u8 reserved[3];
    > + le16 gso_hdr_len;
    > + le16 gso_size;
    > + le16 csum_start;
    > + le16 csum_offset;
    > + u16 reserved2;
    > + u64 timestamp; /* explained later */
    > + u8 padding[];
    > +};
    > +```
    > +5. The driver should be able to post constant-size buffer pages on a receive
    > + queue which can be consumed by the device for an incoming packet of any size
    > + from 64B to 9K bytes.

    possible with mrg buffers

    > +6. The device should be able to know the constant buffer size at receive
    > + virtqueue level instead of per buffer level.

    the bigger question is not communicating to device. that is trivial.
    the bigger question is that linux IP stack seems to benefit from variable sized
    packets because buffers waste precious kernel memory.
    is this for non IP stack such as xdp? non-linux guests? dpdk perhaps?

    > +7. The device should be able to indicate when a full page buffer is consumed,
    > + which can be recycled by the driver when the packets from the completed
    > + page is fully consumed.

    no idea what this means.

    > --
    > 2.26.2
    >
    >
    > This publicly archived list offers a means to provide input to the
    > OASIS Virtual I/O Device (VIRTIO) TC.
    >
    > In order to verify user consent to the Feedback License terms and
    > to minimize spam in the list archive, subscription is required
    > before posting.
    >
    > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
    > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
    > List help: virtio-comment-help@lists.oasis-open.org
    > List archive: https://lists.oasis-open.org/archives/virtio-comment/
    > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
    > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
    > Committee: https://www.oasis-open.org/committees/virtio/
    > Join OASIS: https://www.oasis-open.org/join/
    >





  • 33.  RE: [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive queue requirements

    Posted 06-06-2023 22:44


    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Tuesday, June 6, 2023 6:33 PM
    >
    > On Fri, Jun 02, 2023 at 01:03:01AM +0300, Parav Pandit wrote:
    > > Add requirements for the low latency receive queue.
    > >
    > > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > > ---
    > > net-workstream/features-1.4.md | 38
    > > +++++++++++++++++++++++++++++++++-
    > > 1 file changed, 37 insertions(+), 1 deletion(-)
    > >
    > > diff --git a/net-workstream/features-1.4.md
    > > b/net-workstream/features-1.4.md index 55f1b1f..054f951 100644
    > > --- a/net-workstream/features-1.4.md
    > > +++ b/net-workstream/features-1.4.md
    > > @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
    > >
    > > # 2. Summary
    > > 1. Device counters visible to the driver -2. Low latency tx virtqueue
    > > for PCI transport
    > > +2. Low latency tx and rx virtqueues for PCI transport
    > >
    > > # 3. Requirements
    > > ## 3.1 Device counters
    > > @@ -107,3 +107,39 @@ struct vnet_data_desc desc[2];
    > >
    > > 7. Ability to place all transmit completion together with it per packet stream
    > > transmit timestamp using single PCIe transcation.
    > > +
    > > +### 3.2.2 Low latency rx virtqueue
    > > +1. The device should be able to write a packet receive completion that
    > consists
    > > + of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
    > > + PCIe TLP.
    >
    > why? what is wrong with it being linear with packet instead?
    >
    It prohibits header data split, and it requires multiple DMAs for metadata that is
    consumed by a single driver layer. Data processed by one layer is placed in two
    different locations, which hurts performance.

    > > +2. The device should be able to perform DMA writes of multiple packets
    > > + completions in a single DMA transaction up to the PCIe maximum write
    > limit
    > > + in a transaction.
    > > +3. The device should be able to zero pad packet write completion to align it
    > to
    > > + 64B or CPU cache line size whenever possible.
    >
    > assuming completion is used buffer, these are eactly 64 bytes with packed vq,
    > and they are linear so can be written in one transaction.
    When a packet spans multiple buffers, they cannot be written in a contiguous manner.

    > if so why list requirements which are already met?
    > if you want them for completeness mention this.
    >
    > > +4. An example of the above DMA completion structure:
    > > +
    > > +```
    > > +/* Constant size receive packet completion */ struct
    > > +vnet_rx_completion {
    > > + u16 flags;
    > > + u16 id; /* buffer id */
    > > + u8 gso_type;
    > > + u8 reserved[3];
    > > + le16 gso_hdr_len;
    > > + le16 gso_size;
    > > + le16 csum_start;
    > > + le16 csum_offset;
    > > + u16 reserved2;
    > > + u64 timestamp; /* explained later */
    > > + u8 padding[];
    > > +};
    > > +```
    > > +5. The driver should be able to post constant-size buffer pages on a receive
    > > + queue which can be consumed by the device for an incoming packet of any
    > size
    > > + from 64B to 9K bytes.
    >
    > possible with mrg buffers
    >
    It doesn't scale to post constant-size 64B buffers.

    > > +6. The device should be able to know the constant buffer size at receive
    > > + virtqueue level instead of per buffer level.
    >
    > the bigger question is not communicating to device. that is trivial.
    > the bigger question is that linux IP stack seems to benefit from variable sized
    > packets because buffers waste precious kernel memory.
    We want to avoid the wastage and still achieve constant-size posting.

    > is this for non IP stack such as xdp? non-linux guests? dpdk perhaps?
    >
    For Linux kernel guest.

    > > +7. The device should be able to indicate when a full page buffer is consumed,
    > > + which can be recycled by the driver when the packets from the completed
    > > + page is fully consumed.
    >
    > no idea what this means.
    >
    Instead of the driver allocating a page and splitting it into buffers of nearly the same size, the approach is to post the whole page and let the device consume it based on the packet size.
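    A minimal sketch of that idea (hypothetical driver-side bookkeeping, not part
    of the proposal): the driver posts whole pages, the device packs received
    packets into a page and indicates when it is done with it, and the driver
    recycles the page once every packet carved out of it has been freed.

    ```
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical driver-side state for one posted receive page. */
    struct rx_page {
        void     *addr;     /* page-sized buffer posted to the device */
        uint32_t  used;     /* bytes consumed by the device so far */
        uint32_t  inflight; /* packets from this page still held by the stack */
        bool      closed;   /* device indicated it will not use this page again */
    };

    /* Device reported one packet carved out of the posted page. */
    static void rx_page_packet_received(struct rx_page *pg, uint32_t pkt_len,
                                        bool page_consumed)
    {
        pg->used += pkt_len;
        pg->inflight++;
        if (page_consumed)      /* requirement 7: full page buffer consumed */
            pg->closed = true;
    }

    /* Stack freed one packet; recycle the page once it is closed and drained. */
    static bool rx_page_packet_freed(struct rx_page *pg)
    {
        pg->inflight--;
        if (pg->closed && pg->inflight == 0) {
            pg->used = 0;
            pg->closed = false;
            return true;        /* caller may repost the page to the device */
        }
        return false;
    }
    ```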






  • 35.  Re: [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive queue requirements

    Posted 06-06-2023 23:03
    On Tue, Jun 06, 2023 at 10:44:04PM +0000, Parav Pandit wrote:
    >
    >
    > > From: Michael S. Tsirkin <mst@redhat.com>
    > > Sent: Tuesday, June 6, 2023 6:33 PM
    > >
    > > On Fri, Jun 02, 2023 at 01:03:01AM +0300, Parav Pandit wrote:
    > > > Add requirements for the low latency receive queue.
    > > >
    > > > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > > > ---
    > > > net-workstream/features-1.4.md | 38
    > > > +++++++++++++++++++++++++++++++++-
    > > > 1 file changed, 37 insertions(+), 1 deletion(-)
    > > >
    > > > diff --git a/net-workstream/features-1.4.md
    > > > b/net-workstream/features-1.4.md index 55f1b1f..054f951 100644
    > > > --- a/net-workstream/features-1.4.md
    > > > +++ b/net-workstream/features-1.4.md
    > > > @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
    > > >
    > > > # 2. Summary
    > > > 1. Device counters visible to the driver -2. Low latency tx virtqueue
    > > > for PCI transport
    > > > +2. Low latency tx and rx virtqueues for PCI transport
    > > >
    > > > # 3. Requirements
    > > > ## 3.1 Device counters
    > > > @@ -107,3 +107,39 @@ struct vnet_data_desc desc[2];
    > > >
    > > > 7. Ability to place all transmit completion together with it per packet stream
    > > > transmit timestamp using single PCIe transcation.
    > > > +
    > > > +### 3.2.2 Low latency rx virtqueue
    > > > +1. The device should be able to write a packet receive completion that
    > > consists
    > > > + of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
    > > > + PCIe TLP.
    > >
    > > why? what is wrong with it being linear with packet instead?
    > >
    > It prohibits header data split,
    > it requires multiple DMAs for the metadata consumed by single driver layer.
    > Data processed by one layer is placed in two different locations.
    > This hurts performance.

    only with split yes? refer to that then.

    maybe combine with the header split part?
    these two seem intertwined.


    >
    > > > +2. The device should be able to perform DMA writes of multiple packets
    > > > + completions in a single DMA transaction up to the PCIe maximum write
    > > limit
    > > > + in a transaction.
    > > > +3. The device should be able to zero pad packet write completion to align it
    > > to
    > > > + 64B or CPU cache line size whenever possible.
    > >
    > > assuming completion is used buffer, these are exactly 64 bytes with packed vq,
    > > and they are linear so can be written in one transaction.
    > When a packet spans multiple buffers, they cannot be written in contiguous manner.

    oh you mean when you manage to stick the packet itself in the queue?
    this idea is only mentioned down the road.


    > > if so why list requirements which are already met?
    > > if you want them for completeness mention this.
    > >
    > > > +4. An example of the above DMA completion structure:
    > > > +
    > > > +```
    > > > +/* Constant size receive packet completion */ struct
    > > > +vnet_rx_completion {
    > > > + u16 flags;
    > > > + u16 id; /* buffer id */
    > > > + u8 gso_type;
    > > > + u8 reserved[3];
    > > > + le16 gso_hdr_len;
    > > > + le16 gso_size;
    > > > + le16 csum_start;
    > > > + le16 csum_offset;
    > > > + u16 reserved2;
    > > > + u64 timestamp; /* explained later */
    > > > + u8 padding[];
    > > > +};
    > > > +```
    > > > +5. The driver should be able to post constant-size buffer pages on a receive
    > > > + queue which can be consumed by the device for an incoming packet of any
    > > size
    > > > + from 64B to 9K bytes.
    > >
    > > possible with mrg buffers
    > >
    > It doesn't scale to post constant size 64B buffer.

    maybe - figure out the actual requirement then.


    > > > +6. The device should be able to know the constant buffer size at receive
    > > > + virtqueue level instead of per buffer level.
    > >
    > > the bigger question is not communicating to device. that is trivial.
    > > the bigger question is that linux IP stack seems to benefit from variable sized
    > > packets because buffers waste precious kernel memory.
    > Want to avoid the wastage and still achieve constant size posting.

    ok mention this maybe.

    > > is this for non IP stack such as xdp? non-linux guests? dpdk perhaps?
    > >
    > For Linux kernel guest.

    constant size buffers will inevitably waste memory; I don't see
    a way around that.

    > > > +7. The device should be able to indicate when a full page buffer is consumed,
    > > > + which can be recycled by the driver when the packets from the completed
    > > > + page is fully consumed.
    > >
    > > no idea what this means.
    > >
    > Instead of driver nearly allocating a page and splitting into buffer nearly of same size, an approach to post the page and let device to consume it based on packet size.

    oh I see. interesting. not easy as this does not fit the
    buffer abstraction.

    --
    MST






  • 37.  [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-01-2023 22:04
    Add virtio net device requirements for receive flow steering.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    net-workstream/features-1.4.md | 33 +++++++++++++++++++++++++++++++++
    1 file changed, 33 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index fc36f31..41242b4 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
    1. Device counters visible to the driver
    2. Low latency tx and rx virtqueues for PCI transport
    3. Virtqueue notification coalescing re-arming support
    +4. Receive virtqueue n-tuple steering

    # 3. Requirements
    ## 3.1 Device counters
    @@ -151,3 +152,35 @@ struct vnet_rx_completion {
    coalescing after processing the packets per VQ. This ensures that when
    networking stack decides to poll, no new notifications are generated when
    per VQ notification coalescing is used.
    +
    +## 3.4 Receive virtqueue n-tuple flow steering (RFS)
    +1. The driver should be able to define a receive packet match criteria, an
    +   action and a destination for a packet. For example, an ipv4 packet with a
    +   multicast address to be steered to the receive vq 0. The second example is
    +   ipv4, tcp packet matching a specified IP address and tcp port tuple to
    +   steered to receive vq 10.
    +2. The match criteria should include exact tuple fields well-defined such as mac
    +   address, IP addresses, tcp/udp ports, etc.
    +3. The match criteria should also include the field mask.
    +4. The match criteria may optionally also include specific packet data offset,
    +   length, and matching pattern, which may not be defined in the standard RFC.
    +5. Action includes (a) dropping or (b) forwarding the packet.
    +6. Destination location is a receive virtqueue index or rss context.
    +7. The device should process RFS rules before RSS rules, i.e., when there is a
    +   miss on the RFS rule RSS rule applies.
    +8. The device should be able to add/remove these rules at a rate of 100K
    +   rules/sec.
    +9. If multiple rules are programmed which has overlapping attributes for a
    +   received packet, the driver to define the location/priority of the rule.
    +10. The driver should be able to query flow steering table entries programmed in
    +    the device by the flow id.
    +11. The driver should be able to add the entry for attributes (a) match
    +    criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    +12. The driver should be able to delete the steering rule entry via a unique id.
    +13. The driver should be able to add and delete the steering rules in out of
    +    order manner without depending on previous commands.
    +14. A group member device should be able to query the attributes of the flow
    +    steering that device supports.
    +15. The driver and group owner driver should be able to query supported device
    +    limits for the steering entries.
    +
    --
    2.26.2
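    To make requirements 1-3, 5, 6, 9 and 11 above a bit more concrete, the
    structures below sketch what a flow filter "add" command could carry: a match
    key, a mask reusing the same layout as the key, an action, a destination
    (receive vq index or rss context) and a driver-assigned 32-bit id. All names
    and layouts are hypothetical illustrations, not part of the proposal.

    ```
    #include <stdint.h>

    /* Hypothetical n-tuple match key; the mask reuses the same layout, with
     * set bits marking which key bits must match (requirement 3). */
    struct flow_key {
        uint8_t  mac_dst[6];
        uint8_t  ip_proto;      /* e.g. TCP or UDP */
        uint32_t ipv4_src;
        uint32_t ipv4_dst;
        uint16_t sport;
        uint16_t dport;
    };

    enum flow_action {
        FLOW_ACTION_DROP    = 0, /* requirement 5(a) */
        FLOW_ACTION_FORWARD = 1, /* requirement 5(b) */
    };

    enum flow_dest_type {
        FLOW_DEST_RX_VQ       = 0, /* destination is a receive virtqueue index */
        FLOW_DEST_RSS_CONTEXT = 1, /* destination is an rss context */
    };

    /* Hypothetical "add flow filter entry" command payload (requirement 11). */
    struct flow_filter_add {
        uint32_t        id;        /* unique 32-bit id assigned by the driver */
        struct flow_key key;
        struct flow_key mask;
        uint16_t        action;    /* enum flow_action */
        uint16_t        dest_type; /* enum flow_dest_type */
        uint32_t        dest;      /* rx vq index or rss context id */
        uint32_t        priority;  /* orders overlapping rules, requirement 9 */
    };
    ```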


  • 38.  Re: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-02-2023 03:35
    Hi, Parav.

    I also did some work on this feature, and I think it would be very
    useful if we could work on this interface design and VirtIO Spec together!

    On 2023/6/2 6:03, Parav Pandit wrote:
    > Add virtio net device requirements for receive flow steering.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 33 +++++++++++++++++++++++++++++++++
    > 1 file changed, 33 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index fc36f31..41242b4 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
    > 1. Device counters visible to the driver
    > 2. Low latency tx and rx virtqueues for PCI transport
    > 3. Virtqueue notification coalescing re-arming support
    > +4. Receive virtqueue n-tuple steering
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -151,3 +152,35 @@ struct vnet_rx_completion {
    > coalescing after processing the packets per VQ. This ensures that when
    > networking stack decides to poll, no new notifications are generated when
    > per VQ notification coalescing is used.
    > +
    > +## 3.4 Receive virtqueue n-tuple flow steering (RFS)

    I think this abbreviation can be replaced with, e.g., Receive virtqueue
    n-tuple flow director, RFD? Or something else?
    This allows us to distinguish more clearly from (Accelerated) Receive
    Flow Steering, (A)RFS.

    ARFS can *automatically* issue RFS rules based on n-tuple rules, and we
    can still issue n-tuple rules *manually*.
    So RFD seems more reasonable than the RFS abbreviation; what do you think? :)

    > +1. The driver should be able to define a receive packet match criteria, an
    > + action and a destination for a packet. For example, an ipv4 packet with a
    > + multicast address to be steered to the receive vq 0. The second example is
    > + ipv4, tcp packet matching a specified IP address and tcp port tuple to
    > + steered to receive vq 10.
    > +2. The match criteria should include exact tuple fields well-defined such as mac
    > + address, IP addresses, tcp/udp ports, etc.
    > +3. The match criteria should also include the field mask.

    The mask structure could reuse the same layout as the key structure.

    > +4. The match criteria may optionally also include specific packet data offset,
    > + length, and matching pattern, which may not be defined in the standard RFC.
    > +5. Action includes (a) dropping or (b) forwarding the packet.
    > +6. Destination location is a receive virtqueue index or rss context.

    We may have an additional internal requirement, such as specifying the
    queue id of a certain VF, but we are still aligning on this requirement
    internally, and it may be more reasonable to implement the work involving
    device groups on top of the admin vq.

    > +7. The device should process RFS rules before RSS rules, i.e., when there is a
    > + miss on the RFS rule RSS rule applies.
    > +8. The device should be able to add/remove these rules at a rate of 100K
    > + rules/sec.
    > +9. If multiple rules are programmed which has overlapping attributes for a
    > + received packet, the driver to define the location/priority of the rule.

    Considering capabilities similar to current tools such as ethtool, a
    simple implementation could apply the first-match principle.

    > +10. The driver should be able to query flow steering table entries programmed in
    > + the device by the flow id.
    > +11. The driver should be able to add the entry for attributes (a) match
    > + criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    > +12. The driver should be able to delete the steering rule entry via a unique id.
    > +13. The driver should be able to add and delete the steering rules in out of
    > + order manner without depending on previous commands.
    > +14. A group member device should be able to query the attributes of the flow
    > + steering that device supports.
    > +15. The driver and group owner driver should be able to query supported device
    > + limits for the steering entries.

    Do we want to support some user-defined fields, so that manufacturers
    can customize the interpretation of those fields?

    Thanks!





  • 40.  RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-02-2023 03:52
    Hi Heng,

    > From: Heng Qi <hengqi@linux.alibaba.com>
    > Sent: Thursday, June 1, 2023 11:35 PM
    >
    > Hi, Parav.
    >
    > I also did some work on this feature, and I think it would be very useful if we
    > could work on this interface design and VirtIO Spec together!
    >
    Sure, happy to bring this alive together.
    More below.

    > > +
    > > +## 3.4 Receive virtqueue n-tuple flow steering (RFS)
    >
    > I think this abbreviation can be replaced with eg Receive virtqueue n-tuple flow
    > director, RFD? or something else?
    One hw vendor's product name has "director" in it, so we will probably want to stay away from picking such a TM.

    Maybe we can just say flow filters.
    A filter may have an action and a destination,
    so it can be generic enough.


    > This allows us to distinguish more clearly from (Accelerated) Recieve Flow
    > Steering, (A)RFS.
    >
    The idea was for ARFS and RFS to both reuse the same underlying infra; only the OS hooks are different (stack vs. user ethtool, respectively).

    > ARFS can *automatically* issue RFS rules based on n-tuple rules, and we can
    > still issue n-tuple rules *manually*.
    > So RFD seems more reasonable than RFS abbreviation, what do you think?:)
    >
    "Receive flow filters" is probably more neutral name. :)

    > > +1. The driver should be able to define a receive packet match criteria, an
    > > + action and a destination for a packet. For example, an ipv4 packet with a
    > > + multicast address to be steered to the receive vq 0. The second example is
    > > + ipv4, tcp packet matching a specified IP address and tcp port tuple to
    > > + steered to receive vq 10.
    > > +2. The match criteria should include exact tuple fields well-defined such as
    > mac
    > > + address, IP addresses, tcp/udp ports, etc.
    > > +3. The match criteria should also include the field mask.
    >
    > The mask structure can be considered to reuse the same structure as the key
    > structure.
    Yes, the mask will probably have the bit mask definition.

    >
    > > +4. The match criteria may optionally also include specific packet data offset,
    > > + length, and matching pattern, which may not be defined in the standard
    > RFC.
    > > +5. Action includes (a) dropping or (b) forwarding the packet.
    > > +6. Destination location is a receive virtqueue index or rss context.
    >
    > We may have an additional requirement internally, such as specifying the queue
    > id of a certain VF, but we are still aligning this requirement internally, and it may
    > be more reasonable to implement the work involving device group based on
    > admin vq.
    >
    The switch side APIs are a bit more complex.
    We were first considering using the flow filters/steering for the guest driver and extending it further for the switching side.

    > > +7. The device should process RFS rules before RSS rules, i.e., when there is a
    > > + miss on the RFS rule RSS rule applies.
    > > +8. The device should be able to add/remove these rules at a rate of 100K
    > > + rules/sec.
    > > +9. If multiple rules are programmed which has overlapping attributes for a
    > > + received packet, the driver to define the location/priority of the rule.
    >
    > Considering the capabilities similar to some current tools such as ethtool, if it is
    > a simple implementation, the first match principle can be applied.
    >
    Yes, this likely needs refinement; I just wanted to plant the seed for priority when I wrote it.
    I am thinking of defining the priority at the interface level, so the driver can choose first match in the programming model based on the use case.
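    A rough sketch of how the two models could coexist (hypothetical, for
    illustration only): the lookup walks the rules in priority order, so a driver
    that wants plain first-match behaviour just assigns priorities in insertion
    order, and a miss falls back to RSS as per requirement 7.

    ```
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical rule with a single-field key; lower priority value wins. */
    struct flow_rule {
        uint32_t priority;
        uint32_t ipv4_dst;      /* stored pre-masked */
        uint32_t ipv4_dst_mask;
        uint16_t dest_vq;
    };

    /* Returns the best matching rule, or NULL so the caller applies RSS. */
    static const struct flow_rule *
    flow_lookup(const struct flow_rule *rules, size_t n, uint32_t pkt_ipv4_dst)
    {
        const struct flow_rule *best = NULL;

        for (size_t i = 0; i < n; i++) {
            if ((pkt_ipv4_dst & rules[i].ipv4_dst_mask) != rules[i].ipv4_dst)
                continue;
            if (!best || rules[i].priority < best->priority)
                best = &rules[i];
        }
        return best;
    }
    ```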

    > > +10. The driver should be able to query flow steering table entries
    > programmed in
    > > + the device by the flow id.
    > > +11. The driver should be able to add the entry for attributes (a) match
    > > + criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    > > +12. The driver should be able to delete the steering rule entry via a unique
    > id.
    > > +13. The driver should be able to add and delete the steering rules in out of
    > > + order manner without depending on previous commands.
    > > +14. A group member device should be able to query the attributes of the
    > flow
    > > + steering that device supports.
    > > +15. The driver and group owner driver should be able to query supported
    > device
    > > + limits for the steering entries.
    >
    > Do we want to support some user-defined fields, so that manufacturers can
    > customize the interpretation of some fields.
    >
    Yes, we can phase this: the common case of ipv{4,6} and {udp,tcp} as the first phase and more custom fields as a second phase.
    Those user-defined fields can be useful in custom filters.





  • 42.  Re: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-02-2023 04:40


    On 2023/6/2 11:51, Parav Pandit wrote:
    > Hi Heng,
    >
    >> From: Heng Qi <hengqi@linux.alibaba.com>
    >> Sent: Thursday, June 1, 2023 11:35 PM
    >>
    >> Hi, Parav.
    >>
    >> I also did some work on this feature, and I think it would be very useful if we
    >> could work on this interface design and VirtIO Spec together!
    >>
    > Sure, happy to bring this alive together.
    > More below.
    >
    >>> +
    >>> +## 3.4 Receive virtqueue n-tuple flow steering (RFS)
    >> I think this abbreviation can be replaced with eg Receive virtqueue n-tuple flow
    >> director, RFD? or something else?
    > One hw vendor product name has director in it, so will probably want to stay away picking such TM.
    >
    > May be we can just say flow filters.
    > A filter may have an action and a destination.
    > So it can be generic enough.

    Yes, it's a good generic name.

    >
    >
    >> This allows us to distinguish more clearly from (Accelerated) Recieve Flow
    >> Steering, (A)RFS.
    >>
    > The idea was for ARFS and RFS both to reuse the same underlying infra, just the OS hooks are different. (stack vs user ethtool respectively).
    >

    Yes, so I want it to have a name distinct from RFS; 'receive flow
    filters' below as the name of the underlying infra is enough. :)

    >> ARFS can *automatically* issue RFS rules based on n-tuple rules, and we can
    >> still issue n-tuple rules *manually*.
    >> So RFD seems more reasonable than RFS abbreviation, what do you think?:)
    >>
    > "Receive flow filters" is probably more neutral name. :)
    >
    >>> +1. The driver should be able to define a receive packet match criteria, an
    >>> + action and a destination for a packet. For example, an ipv4 packet with a
    >>> + multicast address to be steered to the receive vq 0. The second example is
    >>> + ipv4, tcp packet matching a specified IP address and tcp port tuple to
    >>> + steered to receive vq 10.
    >>> +2. The match criteria should include exact tuple fields well-defined such as
    >> mac
    >>> + address, IP addresses, tcp/udp ports, etc.
    >>> +3. The match criteria should also include the field mask.
    >> The mask structure can be considered to reuse the same structure as the key
    >> structure.
    > Yes, mask will probably have the bit mask definition.
    >
    >>> +4. The match criteria may optionally also include specific packet data offset,
    >>> + length, and matching pattern, which may not be defined in the standard
    >> RFC.
    >>> +5. Action includes (a) dropping or (b) forwarding the packet.
    >>> +6. Destination location is a receive virtqueue index or rss context.
    >> We may have an additional requirement internally, such as specifying the queue
    >> id of a certain VF, but we are still aligning this requirement internally, and it may
    >> be more reasonable to implement the work involving device group based on
    >> admin vq.
    >>
    > The switch side APIs bit more complex.
    > We were first considering using the flow filters/steering for the guest driver and extend further for switching side.

    Yes, but it's not that hard to just extend it with a VF id. In general, we will
    get back to you next week, after aligning internally, on whether we
    need this. :)

    >
    >>> +7. The device should process RFS rules before RSS rules, i.e., when there is a
    >>> + miss on the RFS rule RSS rule applies.
    >>> +8. The device should be able to add/remove these rules at a rate of 100K
    >>> + rules/sec.
    >>> +9. If multiple rules are programmed which has overlapping attributes for a
    >>> + received packet, the driver to define the location/priority of the rule.
    >> Considering the capabilities similar to some current tools such as ethtool, if it is
    >> a simple implementation, the first match principle can be applied.
    >>
    > Yes, this likely needs refinement, I just wanted to get the seed for priority to be afloat when I wrote.
    > I am thinking of defining the priority at interface level, and driver can choose first match in the programming model based on the use.
    >

    Yes, in general we really need priorities, and then maybe a matching
    mode field to let the driver choose which matching principle to use.

    >>> +10. The driver should be able to query flow steering table entries
    >> programmed in
    >>> + the device by the flow id.
    >>> +11. The driver should be able to add the entry for attributes (a) match
    >>> + criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    >>> +12. The driver should be able to delete the steering rule entry via a unique
    >> id.
    >>> +13. The driver should be able to add and delete the steering rules in out of
    >>> + order manner without depending on previous commands.
    >>> +14. A group member device should be able to query the attributes of the
    >> flow
    >>> + steering that device supports.
    >>> +15. The driver and group owner driver should be able to query supported
    >> device
    >>> + limits for the steering entries.
    >> Do we want to support some user-defined fields, so that manufacturers can
    >> customize the interpretation of some fields.
    >>
    > Yes, we can phase out the common case of ipv{4,6} and {udp,tcp} as first phase and more customs field as second phase.
    > Those use define fields can be useful in custom filters.

    Yes, for example some simple tunnel matching is possible, but the full,
    exact tunnel matching is probably not the first phase of this new feature.


    Thanks!







  • 44.  Re: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-06-2023 12:09
    On Fri, Jun 02, 2023 at 12:39:45PM +0800, Heng Qi wrote:
    >
    >
    > ? 2023/6/2 ??11:51, Parav Pandit ??:
    > >Hi Heng,
    > >
    > >>From: Heng Qi <hengqi@linux.alibaba.com>
    > >>Sent: Thursday, June 1, 2023 11:35 PM
    > >>
    > >>Hi, Parav.
    > >>
    > >>I also did some work on this feature, and I think it would be very useful if we
    > >>could work on this interface design and VirtIO Spec together!
    > >>
    > >Sure, happy to bring this alive together.
    > >More below.
    > >
    > >>>+
    > >>>+## 3.4 Receive virtqueue n-tuple flow steering (RFS)
    > >>I think this abbreviation can be replaced with eg Receive virtqueue n-tuple flow
    > >>director, RFD? or something else?
    > >One hw vendor product name has director in it, so will probably want to stay away picking such TM.
    > >
    > >May be we can just say flow filters.
    > >A filter may have an action and a destination.
    > >So it can be generic enough.
    >
    > Yes, it's a good generic name.
    >
    > >
    > >
    > >>This allows us to distinguish more clearly from (Accelerated) Recieve Flow
    > >>Steering, (A)RFS.
    > >>
    > >The idea was for ARFS and RFS both to reuse the same underlying infra, just the OS hooks are different. (stack vs user ethtool respectively).
    >
    > Yes, so I want it to have a distinguishing name from RFS, 'recieve
    > flow filters' below as the name of the underlying infra is enough.:)
    >
    > >>ARFS can *automatically* issue RFS rules based on n-tuple rules, and we can
    > >>still issue n-tuple rules *manually*.
    > >>So RFD seems more reasonable than RFS abbreviation, what do you think?:)
    > >>
    > >"Receive flow filters" is probably more neutral name. :)
    > >
    > >>>+1. The driver should be able to define a receive packet match criteria, an
    > >>>+ action and a destination for a packet. For example, an ipv4 packet with a
    > >>>+ multicast address to be steered to the receive vq 0. The second example is
    > >>>+ ipv4, tcp packet matching a specified IP address and tcp port tuple to
    > >>>+ steered to receive vq 10.
    > >>>+2. The match criteria should include exact tuple fields well-defined such as
    > >>mac
    > >>>+ address, IP addresses, tcp/udp ports, etc.
    > >>>+3. The match criteria should also include the field mask.
    > >>The mask structure can be considered to reuse the same structure as the key
    > >>structure.
    > >Yes, mask will probably have the bit mask definition.
    > >
    > >>>+4. The match criteria may optionally also include specific packet data offset,
    > >>>+ length, and matching pattern, which may not be defined in the standard
    > >>RFC.
    > >>>+5. Action includes (a) dropping or (b) forwarding the packet.
    > >>>+6. Destination location is a receive virtqueue index or rss context.
    > >>We may have an additional requirement internally, such as specifying the queue
    > >>id of a certain VF, but we are still aligning this requirement internally, and it may
    > >>be more reasonable to implement the work involving device group based on
    > >>admin vq.
    > >>
    > >The switch side APIs bit more complex.
    > >We were first considering using the flow filters/steering for the guest driver and extend further for switching side.
    >
    > Yes, but it's not that hard to just extend a VF id. In general, we
    > will get back to you next week after aligning because of any
    > difficulties we need this.:)

    Hi, all.

    We have already aligned the requirements internally. There is a common
    requirement in the cloud scenario: a virtual NIC with SR-IOV capability.
    The user of a VM splits it into one PF + multiple VFs, and each VF
    takes some of the queues from the PF. Each VF can get its own IP address
    and MAC from the PF.

    However, the forwarding component (here it can be understood as an eswitch)
    connected to the virtual NIC cannot see the VFs; it can only see the PF (when
    creating the SR-IOV virtual NIC, the PF is bound to a forwarding component).
    This behavior does not affect users (such as users in containers) using the VF,
    that is, the user sees a real virtual NIC. However, the data entering and
    leaving this VF NIC needs to be distinguished in the forwarding component:

    1. When a packet comes out of the VF NIC (from the perspective of the
    forwarding component, it comes out of a queue of the PF), the source IP of
    the packet is still the IP of the VF, and the forwarding component forwards
    it directly;

    2. When packets arrive at the forwarding component, the IPs it sees
    (the IPs of all the VFs) all appear to belong to the PF.
    If the forwarding component wants to direct packets to the corresponding VF,
    it should use the flow director rules with the VF id issued by the PF.

    Perhaps the admin queue can provide this capability, but a flow
    director rule carrying a VF id is already standard in ethtool and the kernel,
    so we can try to include it in the flow director's abilities first. This
    means that the flow director itself can have this behavior, especially
    for those PFs that intend to limit the capabilities of the VF, and
    this behavior can effectively support our scenario.

    Thanks.
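    As a purely hypothetical sketch of that extension (names are illustrative,
    not an agreed interface), the rule destination would simply grow a group
    member (VF) index next to the receive virtqueue index:

    ```
    #include <stdint.h>

    /* Hypothetical destination: a rule issued through the PF (or by the group
     * owner via the admin queue) can name a member VF and one of its receive
     * virtqueues, so the eswitch-facing PF can steer traffic per VF. */
    struct flow_filter_dest {
        uint16_t flags;  /* e.g. bit 0: vf_id field is valid */
        uint16_t vf_id;  /* group member (VF) index, 0 when unused */
        uint32_t rx_vq;  /* receive virtqueue index within that VF */
    };
    ```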

    >
    > >>>+7. The device should process RFS rules before RSS rules, i.e., when there is a
    > >>>+ miss on the RFS rule RSS rule applies.
    > >>>+8. The device should be able to add/remove these rules at a rate of 100K
    > >>>+ rules/sec.
    > >>>+9. If multiple rules are programmed which has overlapping attributes for a
    > >>>+ received packet, the driver to define the location/priority of the rule.
    > >>Considering the capabilities similar to some current tools such as ethtool, if it is
    > >>a simple implementation, the first match principle can be applied.
    > >>
    > >Yes, this likely needs refinement, I just wanted to get the seed for priority to be afloat when I wrote.
    > >I am thinking of defining the priority at interface level, and driver can choose first match in the programming model based on the use.
    > >
    >
    > Yes, in general we really need priorities, and then maybe a matching
    > mode field to let the driver choose which matching principle to use.
    >
    > >>>+10. The driver should be able to query flow steering table entries
    > >>programmed in
    > >>>+ the device by the flow id.
    > >>>+11. The driver should be able to add the entry for attributes (a) match
    > >>>+ criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    > >>>+12. The driver should be able to delete the steering rule entry via a unique
    > >>id.
    > >>>+13. The driver should be able to add and delete the steering rules in out of
    > >>>+ order manner without depending on previous commands.
    > >>>+14. A group member device should be able to query the attributes of the
    > >>flow
    > >>>+ steering that device supports.
    > >>>+15. The driver and group owner driver should be able to query supported
    > >>device
    > >>>+ limits for the steering entries.
    > >>Do we want to support some user-defined fields, so that manufacturers can
    > >>customize the interpretation of some fields.
    > >>
    > >Yes, we can phase out the common case of ipv{4,6} and {udp,tcp} as first phase and more customs field as second phase.
    > >Those use define fields can be useful in custom filters.
    >
    > Yes, for example some simple tunnel matching is possible, but the
    > full, exact tunnel matching is probably not the first phase of this
    > new feature.
    >
    >
    > Thanks!
    >
    >
    >



  • 45.  Re: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-06-2023 12:09
    On Fri, Jun 02, 2023 at 12:39:45PM +0800, Heng Qi wrote:
    > On 2023/6/2 11:51 AM, Parav Pandit wrote:
    > > Hi Heng,
    > >
    > >> From: Heng Qi <hengqi@linux.alibaba.com>
    > >> Sent: Thursday, June 1, 2023 11:35 PM
    > >>
    > >> Hi, Parav.
    > >>
    > >> I also did some work on this feature, and I think it would be very useful if we
    > >> could work on this interface design and VirtIO Spec together!
    > >>
    > > Sure, happy to bring this alive together.
    > > More below.
    > >
    > >>> +
    > >>> +## 3.4 Receive virtqueue n-tuple flow steering (RFS)
    > >> I think this abbreviation can be replaced with eg Receive virtqueue n-tuple flow
    > >> director, RFD? or something else?
    > > One hw vendor product name has director in it, so will probably want to stay away picking such TM.
    > >
    > > May be we can just say flow filters.
    > > A filter may have an action and a destination.
    > > So it can be generic enough.
    >
    > Yes, it's a good generic name.
    >
    > >> This allows us to distinguish more clearly from (Accelerated) Recieve Flow
    > >> Steering, (A)RFS.
    > >>
    > > The idea was for ARFS and RFS both to reuse the same underlying infra, just the OS hooks are different. (stack vs user ethtool respectively).
    >
    > Yes, so I want it to have a distinguishing name from RFS, 'recieve
    > flow filters' below as the name of the underlying infra is enough.:)
    >
    > >> ARFS can *automatically* issue RFS rules based on n-tuple rules, and we can
    > >> still issue n-tuple rules *manually*.
    > >> So RFD seems more reasonable than RFS abbreviation, what do you think?:)
    > >>
    > > "Receive flow filters" is probably more neutral name. :)
    > >
    > >>> +1. The driver should be able to define a receive packet match criteria, an
    > >>> +   action and a destination for a packet. For example, an ipv4 packet with a
    > >>> +   multicast address to be steered to the receive vq 0. The second example is
    > >>> +   ipv4, tcp packet matching a specified IP address and tcp port tuple to
    > >>> +   steered to receive vq 10.
    > >>> +2. The match criteria should include exact tuple fields well-defined such as mac
    > >>> +   address, IP addresses, tcp/udp ports, etc.
    > >>> +3. The match criteria should also include the field mask.
    > >> The mask structure can be considered to reuse the same structure as the key
    > >> structure.
    > > Yes, mask will probably have the bit mask definition.
    > >
    > >>> +4. The match criteria may optionally also include specific packet data offset,
    > >>> +   length, and matching pattern, which may not be defined in the standard RFC.
    > >>> +5. Action includes (a) dropping or (b) forwarding the packet.
    > >>> +6. Destination location is a receive virtqueue index or rss context.
    > >> We may have an additional requirement internally, such as specifying the queue
    > >> id of a certain VF, but we are still aligning this requirement internally, and it may
    > >> be more reasonable to implement the work involving device group based on
    > >> admin vq.
    > >>
    > > The switch side APIs bit more complex.
    > > We were first considering using the flow filters/steering for the guest driver and extend further for switching side.
    >
    > Yes, but it's not that hard to just extend a VF id. In general, we
    > will get back to you next week after aligning because of any
    > difficulties we need this.:)

    Hi, all.

    We have already aligned the requirements internally. There is a common
    requirement in the cloud scenario: a virtual nic with SR-IOV capability.
    The user of a vm splits it into one PF + multiple VFs, and each VF takes some of
    the queues from the PF. Each VF can get its own IP address and mac from the PF.

    However, the forwarding component (here it can be understood as an eswitch)
    connected to the virtual nic cannot see VFs, but can only see the PF (when
    creating the SR-IOV virtual nic, the PF is bound to a forwarding component).
    This behavior does not affect users (such as users in containers) using the VF,
    that is, the user sees a real virtual nic. However, the data incoming and outgoing
    this VF nic needs to be distinguished in the forwarding component:

    1. When the packet comes out of the VF nic (from the perspective of the
    forwarding component, it comes out of a queue of the PF), the source ip of the
    packet is still the ip of the VF, and the forwarding component forwards it
    directly;

    2. When packets arrive at the forwarding component, the IPs seen by the
    forwarding component (the IPs of all VFs) are the IPs of the PF. If the
    forwarding component wants to direct packets to the corresponding VF, it
    should use the flow director rules with the VF id issued by the PF.

    Perhaps the admin queue can achieve this capability, but the flow director rule
    carrying the VF id has become the ethtool and kernel standard, so we can try to
    include it in the flow director's ability first (this means that the flow director
    itself can have this behavior, especially for those PFs that intend to limit the
    capabilities of the VF, and this behavior can effectively support our scenario).

    Thanks.

    > >>> +7. The device should process RFS rules before RSS rules, i.e., when there is a
    > >>> +   miss on the RFS rule RSS rule applies.
    > >>> +8. The device should be able to add/remove these rules at a rate of 100K
    > >>> +   rules/sec.
    > >>> +9. If multiple rules are programmed which has overlapping attributes for a
    > >>> +   received packet, the driver to define the location/priority of the rule.
    > >> Considering the capabilities similar to some current tools such as ethtool, if it is
    > >> a simple implementation, the first match principle can be applied.
    > >>
    > > Yes, this likely needs refinement, I just wanted to get the seed for priority to be afloat when I wrote.
    > > I am thinking of defining the priority at interface level, and driver can choose first match in the programming model based on the use.
    >
    > Yes, in general we really need priorities, and then maybe a matching
    > mode field to let the driver choose which matching principle to use.
    >
    > >>> +10. The driver should be able to query flow steering table entries programmed in
    > >>> +    the device by the flow id.
    > >>> +11. The driver should be able to add the entry for attributes (a) match
    > >>> +    criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    > >>> +12. The driver should be able to delete the steering rule entry via a unique id.
    > >>> +13. The driver should be able to add and delete the steering rules in out of
    > >>> +    order manner without depending on previous commands.
    > >>> +14. A group member device should be able to query the attributes of the flow
    > >>> +    steering that device supports.
    > >>> +15. The driver and group owner driver should be able to query supported device
    > >>> +    limits for the steering entries.
    > >> Do we want to support some user-defined fields, so that manufacturers can
    > >> customize the interpretation of some fields.
    > >>
    > > Yes, we can phase out the common case of ipv{4,6} and {udp,tcp} as first phase and more customs field as second phase.
    > > Those use define fields can be useful in custom filters.
    >
    > Yes, for example some simple tunnel matching is possible, but the
    > full, exact tunnel matching is probably not the first phase of this
    > new feature.
    >
    > Thanks!


  • 46.  RE: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-06-2023 21:49


    > From: Heng Qi <hengqi@linux.alibaba.com>
    > Sent: Tuesday, June 6, 2023 8:09 AM

    >
    > We have already aligned the requirements internally. There is a common
    > requirement in the cloud scenario: a virtual nic with SR-IOV capability.
    > The user of a vm splits it into one PF + multiple VFs, and each VF takes some of
    > the queues from the PF. Each VF can get its own IP address and mac from the PF.
    >
    > However, the forwarding component (here it can be understood as an eswitch)
    > connected to the virtual nic cannot see VFs, but can only see the PF (when
    > creating the SR-IOV virtual nic, the PF is bound to a forwarding component).
    > This behavior does not affect users (such as users in containers) using the VF,
    > that is, the user sees a real virtual nic. However, the data incoming and outgoing
    > this VF nic needs to be distinguished in the forwarding component:
    >
    > 1. When the packet comes out of the VF nic (from the perspective of the
    > forwarding component, it comes out of a queue of the PF), the source ip of the
    > packet is still the ip of the VF, and the forwarding component forwards it
    > directly;
    >
    > 2. When packets arrive at the forwarding component, the IPs seen by the
    > forwarding component (the IPs of all VFs) are the IPs of the PF.
    > If the forwarding component wants to direct packets to the corresponding VF, it
    > should use the flow director rules with the VF id issued by the PF.
    >
    > Perhaps the admin queue can achieve this capability, but the flow director rule
    > carrying the VF id has become the ethtool and kernel standard,
    To my knowledge the eswitch model is not part of the ethtool.
    It is part of the tc flower offloads and it covers far more than just _forwarding_ rules.

    So we should be able to steer to a vf_id with an abstraction of, say, switch_port_id as far as the spec is concerned.

    The main difference is that the types of rules an eswitch adds versus those the guest driver adds are very different.
    But yes, both should be able to use some of the common infrastructure.

    We should keep this requirement in mind and craft the "destination" to cover it.

    Since the flow insertion/removal queue should be present on the PF in one case and on the VFs in the other, we likely need a different queue than the AQ, or to expand its scope.

    We should cover these in the design/requirements further.

    I will update these details in the v1.

    > so we can try to
    > include it in the flow director's ability first (this means that the flow director
    > itself can have this behavior, especially for those PFs that intend to limit the
    > capabilities of the VF, and this behavior can effectively support our scenario).
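
    To make the "destination" discussion above a bit more concrete, below is a
    minimal sketch of how a flow-filter destination covering a receive virtqueue,
    an RSS context, or an eswitch/member port could be expressed. The names
    (virtio_net_ff_destination, VIRTIO_NET_FF_DEST_*) are hypothetical
    illustrations for this thread, not spec definitions.

        /* Hypothetical sketch only; names are illustrative, not spec terms. */
        #include <stdint.h>

        enum virtio_net_ff_dest_type {
            VIRTIO_NET_FF_DEST_RQ          = 1, /* a receive virtqueue index */
            VIRTIO_NET_FF_DEST_RSS_CTX     = 2, /* an RSS context */
            VIRTIO_NET_FF_DEST_SWITCH_PORT = 3, /* eswitch port, e.g. a VF */
        };

        struct virtio_net_ff_destination {
            uint8_t type;          /* one of VIRTIO_NET_FF_DEST_* */
            uint8_t reserved[3];
            union {
                uint32_t rq_index;       /* when type == RQ */
                uint32_t rss_context;    /* when type == RSS_CTX */
                uint32_t switch_port_id; /* when type == SWITCH_PORT */
            };
        };

    Modelling the eswitch case as just another destination type would keep the
    guest driver rules and the owner/eswitch rules on the same common
    infrastructure, as discussed above.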




  • 48.  Re: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-07-2023 02:40
    On Tue, Jun 06, 2023 at 09:49:28PM +0000, Parav Pandit wrote:
    > > From: Heng Qi <hengqi@linux.alibaba.com>
    > > Sent: Tuesday, June 6, 2023 8:09 AM
    > >
    > > We have already aligned the requirements internally. There is a common
    > > requirement in the cloud scenario: a virtual nic with SR-IOV capability.
    > > The user of a vm splits it into one PF + multiple VFs, and each VF takes some of
    > > the queues from the PF. Each VF can get its own IP address and mac from the PF.
    > >
    > > However, the forwarding component (here it can be understood as an eswitch)
    > > connected to the virtual nic cannot see VFs, but can only see the PF (when
    > > creating the SR-IOV virtual nic, the PF is bound to a forwarding component).
    > > This behavior does not affect users (such as users in containers) using the VF,
    > > that is, the user sees a real virtual nic. However, the data incoming and outgoing
    > > this VF nic needs to be distinguished in the forwarding component:
    > >
    > > 1. When the packet comes out of the VF nic (from the perspective of the
    > > forwarding component, it comes out of a queue of the PF), the source ip of the
    > > packet is still the ip of the VF, and the forwarding component forwards it
    > > directly;
    > >
    > > 2. When packets arrive at the forwarding component, the IPs seen by the
    > > forwarding component (the IPs of all VFs) are the IPs of the PF.
    > > If the forwarding component wants to direct packets to the corresponding VF, it
    > > should use the flow director rules with the VF id issued by the PF.
    > >
    > > Perhaps the admin queue can achieve this capability, but the flow director rule
    > > carrying the VF id has become the ethtool and kernel standard,
    > To my knowledge the eswitch model is not part of the ethtool.
    > It is part of the tc flower offloads and it covers far more than just _forwarding_ rules.

    Please refer to the kernel function ethtool_get_flow_spec_ring_vf() and
    "ethtool -h | grep -E 'vf|queue'" of the Ethtool tool (I'm not saying we have to
    rely on ethtool, but it's at least one of the useful and standard tools).

    On the PF side, the user can see the VFs. For example, there is VF1 with IP
    1.2.3.4, we just want PF to issue something like

    "ethtool -N PF flow-type udp dst-ip 1.2.3.4 VF1",

    and then the device receives such a command/rule (the device does not see the
    VF, but it can convert the VF id to the corresponding queues, and then send the
    packets to these queues or one queue). This is what we need, just a VF id,
    nothing else.

    Of course, we could use a more generic field.

    > so we should be able to steer to a vf_id with a abstraction of say switch_port_id as far as spec is concerned.
    >
    > The main difference is the type of rules eswitch vs the guest driver adds are very different.
    > But yes, both should be able to use some of the common infrastructure.

    Yes.

    > We should consider this requirement in mind and craft the "destination" to cover this.

    Yes, we can make this included in the "destination".

    > Since the flow insertion/removal queue should be present in one case PF and other case on VFs, we likely need a different queue than the AQ or expand the scope.
    >
    > We should cover these in the design/requirements further.
    >
    > I will update this details in the v1.

    Ok, thanks a lot!

    > > so we can try to
    > > include it in the flow director's ability first (this means that the flow director
    > > itself can have this behavior, especially for those PFs that intend to limit the
    > > capabilities of the VF, and this behavior can effectively support our scenario).
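
    For readers unfamiliar with the ethtool interface referenced above: an
    n-tuple rule carries a 64-bit ring_cookie that can encode both a queue and a
    VF id, which the kernel helper ethtool_get_flow_spec_ring_vf() extracts. The
    sketch below only illustrates that encoding idea; the mask values and helper
    names here are placeholders, not the kernel's actual definitions.

        /* Illustrative only; modelled loosely on ethtool's ring_cookie encoding.
         * The constants below are placeholders, not the kernel's definitions. */
        #include <stdint.h>

        #define DEST_RING_MASK 0x00000000FFFFFFFFull /* low bits: queue index */
        #define DEST_VF_MASK   0x000000FF00000000ull /* next byte: VF id      */
        #define DEST_VF_SHIFT  32

        static inline uint64_t dest_make(uint32_t vf_id, uint32_t ring)
        {
            return ((uint64_t)vf_id << DEST_VF_SHIFT) | (ring & DEST_RING_MASK);
        }

        static inline uint32_t dest_get_vf(uint64_t cookie)
        {
            return (uint32_t)((cookie & DEST_VF_MASK) >> DEST_VF_SHIFT);
        }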


  • 49.  RE: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-07-2023 03:27
    > From: Heng Qi <hengqi@linux.alibaba.com>
    > Sent: Tuesday, June 6, 2023 10:40 PM
    >
    > Please refer to the kernel function ethtool_get_flow_spec_ring_vf() and
    > "ethtool -h | grep -E 'vf|queue'" of the Ethtool tool (I'm not saying we have to
    > rely on ethtool, but it's at least one of the useful and standard tools).
    >
    > On the PF side, the user can see the VFs. For example, there is VF1 with IP
    > 1.2.3.4, we just want PF to issue something like
    >
    > "ethtool -N PF flow-type udp dst-ip 1.2.3.4 VF1",
    >
    This is probably ancient and probably for a single vendor who could not implement full eswitch.
    I may be incomplete/wrong here.
    But for 100% sure the Linux kernel eswitch model is through the representors.
    I have built and extended the representors for VF, PF, SF, with and without DPU modes and a lot of plumbing of it with devlink.
    And all uniformly connected to netdev representors.

    So I don't think an ethtool VF1 patch will be accepted upstream.
    May be I am wrong.
    But for sure, if we build an eswitch for virtio net for Linux, it should be through the tc manner.

    > and then the device receives such a command/rule (the device does not see the
    > VF, but it can convert the VF id to the corresponding queues, and then send the
    > packets to these queues or one queue). This is what we need, just a VF id,
    > nothing else.
    >
    > Of course, we could use a more generic field.
    >
    > > so we should be able to steer to a vf_id with a abstraction of say
    > > switch_port_id as far as spec is concerned.
    > >
    > > The main difference is the type of rules eswitch vs the guest driver adds are very different.
    > > But yes, both should be able to use some of the common infrastructure.
    >
    > Yes.
    >
    > > We should consider this requirement in mind and craft the "destination" to cover this.
    >
    > Yes, we can make this included in the "destination".
    >
    > > Since the flow insertion/removal queue should be present in one case PF and
    > > other case on VFs, we likely need a different queue than the AQ or expand the scope.
    > >
    > > We should cover these in the design/requirements further.
    > >
    > > I will update this details in the v1.
    >
    > Ok, thanks a lot!
    >
    > > > so we can try to
    > > > include it in the flow director's ability first (this means that the
    > > > flow director itself can have this behavior, especially for those
    > > > PFs that intend to limit the capabilities of the VF, and this behavior can
    > > > effectively support our scenario).


  • 50.  Re: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-12-2023 14:35


    On 2023/6/7 5:49 AM, Parav Pandit wrote:
    >
    >> From: Heng Qi <hengqi@linux.alibaba.com>
    >> Sent: Tuesday, June 6, 2023 8:09 AM
    >> We have already aligned the requirements internally. There is a common
    >> requirement in the cloud scenario: a virtual nic with SR-IOV capability.
    >> The user of a vm splits it into one PF + multiple VFs, and each VF takes some of
    >> the queues from the PF. Each VF can get its own IP address and mac from the PF.
    >>
    >> However, the forwarding component (here it can be understood as an eswitch)
    >> connected to the virtual nic cannot see VFs, but can only see the PF (when
    >> creating the SR-IOV virtual nic, the PF is bound to a forwarding component).
    >> This behavior does not affect users (such as users in containers) using the VF,
    >> that is, the user sees a real virtual nic. However, the data incoming and outgoing
    >> this VF nic needs to be distinguished in the forwarding component:
    >>
    >> 1. When the packet comes out of the VF nic (from the perspective of the
    >> forwarding component, it comes out of a queue of the PF), the source ip of the
    >> packet is still the ip of the VF, and the forwarding component forwards it
    >> directly;
    >>
    >> 2. When packets arrive at the forwarding component, the IPs seen by the
    >> forwarding component (the IPs of all VFs) are the IPs of the PF.
    >> If the forwarding component wants to direct packets to the corresponding VF, it
    >> should use the flow director rules with the VF id issued by the PF.
    >>
    >> Perhaps the admin queue can achieve this capability, but the flow director rule
    >> carrying the VF id has become the ethtool and kernel standard,
    > To my knowledge the eswitch model is not part of the ethtool.
    > It is part of the tc flower offloads and it covers far more than just _forwarding_ rules.
    >
    > so we should be able to steer to a vf_id with a abstraction of say switch_port_id as far as spec is concerned.
    >
    > The main difference is the type of rules eswitch vs the guest driver adds are very different.
    > But yes, both should be able to use some of the common infrastructure.
    >
    > We should consider this requirement in mind and craft the "destination" to cover this.
    >
    > Since the flow insertion/removal queue should be present in one case PF and other case on VFs, we likely need a different queue than the AQ or expand the scope.
    >
    > We should cover these in the design/requirements further.
    >
    > I will update this details in the v1.

    Hi, Parav.

    Do we have a work schedule for the 1.4 features?
    For example, the flow director we are very concerned about recently,
    because different scenarios have requirements for it.

    We'd like to work out with you a rough timeline so we can discuss and
    collaborate better!

    I hope I'm not disrupting your schedule :), just for us to better
    promote the work of virtio!

    Grateful!

    >
    >> so we can try to
    >> include it in the flow director's ability first (this means that the flow director
    >> itself can have this behavior, especially for those PFs that intend to limit the
    >> capabilities of the VF, and this behavior can effectively support our scenario).




  • 52.  RE: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-12-2023 17:26

    > From: Heng Qi <hengqi@linux.alibaba.com>
    > Sent: Monday, June 12, 2023 10:35 AM

    > > I will update this details in the v1.
    >
    > Hi, Parav.
    >
    > Do we have a work schedule for the 1.4 features?

    Since the n-tuple requirements are straightforward, below is the timeline I have in mind:
    1. Requirements agreed and drafted by June-30
    2. Design and its review finished by July-30
    3. Spec review by Aug-30

    > For example, the flow director we are very concerned about recently, because
    > different scenarios have requirements for it.
    >
    > We'd like to work out with you a rough timeline so we can discuss and
    > collaborate better!
    >
    Will the above timeline work for you to work together on it?
    We should work on a finite set of requirements that we can converge on and deliver.
    The first draft was this v0 one.

    Otherwise, it can become a never-ending list that doesn't reach anywhere, as steering technology keeps improving.

    Slightly unrelated to the question: let's call this feature "receive flow filters" instead of "flow director".




  • 54.  Re: [virtio-comment] RE: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-13-2023 02:28


    On 2023/6/13 1:26 AM, Parav Pandit wrote:
    >> From: Heng Qi <hengqi@linux.alibaba.com>
    >> Sent: Monday, June 12, 2023 10:35 AM
    >>> I will update this details in the v1.
    >> Hi, Parav.
    >>
    >> Do we have a work schedule for the 1.4 features?
    > Since n-tuple requirements are straightforward, below is timeline in mind,
    > 1. Requirements agree and drafted by June-30
    > 2. Design and its review to finish by July-30
    > 3. Spec to review by Aug-30

    OK, is this timeline for all 1.4 features or individual high priority
    1.4 features?
    Do we have a prioritization of 1.4 features? This also facilitates time
    investment for community maintainers.

    >> For example, the flow director we are very concerned about recently, because
    >> different scenarios have requirements for it.
    >>
    >> We'd like to work out with you a rough timeline so we can discuss and
    >> collaborate better!
    >>
    > Will above timeline workable for you to work together on it?

    This timeline is ok for us. In addition, if it is difficult for the
    community to carry the VF id, or there is an unclear
    demarcation problem with the design of the group owner management,
    maybe we can skip this first (in any
    case, it depends on the opinion of the virtio community).

    > We should work on finite requirements that we can converge and deliver.

    Yes. Agree.

    > First draft was this v0 one.
    >
    > Otherwise, it can be never ending list that doesn't reach anywhere as steering technology keeps improving.
    >
    > Slightly unrelated to question, let's call this feature as: "receive flow filters", instead of "flow director".

    OK, here's our previous agreement.

    Thanks!





  • 56.  Re: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-13-2023 08:58
    On Mon, Jun 12, 2023 at 05:26:04PM +0000, Parav Pandit wrote:
    >
    > > From: Heng Qi <hengqi@linux.alibaba.com>
    > > Sent: Monday, June 12, 2023 10:35 AM
    >
    > > > I will update this details in the v1.
    > >
    > > Hi, Parav.
    > >
    > > Do we have a work schedule for the 1.4 features?
    >
    > Since n-tuple requirements are straightforward, below is timeline in mind,
    > 1. Requirements agree and drafted by June-30

    Pls note a bunch of people are traveling to kvm forum,
    I'd add a couple of weeks.

    > 2. Design and its review to finish by July-30
    > 3. Spec to review by Aug-30

    Wait a second are we talking about 1.4 or 1.3? For 1.3 indeed we need
    to wrap development up by end of summer - our charter is to produce
    releases every 12 to 16 months, previous one was July 1st, 2022.

    > > For example, the flow director we are very concerned about recently, because
    > > different scenarios have requirements for it.
    > >
    > > We'd like to work out with you a rough timeline so we can discuss and
    > > collaborate better!
    > >
    > Will above timeline workable for you to work together on it?
    > We should work on finite requirements that we can converge and deliver.
    > First draft was this v0 one.
    >
    > Otherwise, it can be never ending list that doesn't reach anywhere as steering technology keeps improving.
    >
    > Slightly unrelated to question, let's call this feature as: "receive flow filters", instead of "flow director".
    >




  • 58.  Re: [virtio-comment] Re: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-13-2023 09:17
    On Tue, Jun 13 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:

    > On Mon, Jun 12, 2023 at 05:26:04PM +0000, Parav Pandit wrote:
    >>
    >> > From: Heng Qi <hengqi@linux.alibaba.com>
    >> > Sent: Monday, June 12, 2023 10:35 AM
    >>
    >> > > I will update this details in the v1.
    >> >
    >> > Hi, Parav.
    >> >
    >> > Do we have a work schedule for the 1.4 features?
    >>
    >> Since n-tuple requirements are straightforward, below is timeline in mind,
    >> 1. Requirements agree and drafted by June-30
    >
    > Pls note a bunch of people are traveling to kvm forum,
    > I'd add a couple of weeks.
    >
    >> 2. Design and its review to finish by July-30
    >> 3. Spec to review by Aug-30
    >
    > Wait a second are we talking about 1.4 or 1.3? For 1.3 indeed we need
    > to wrap development up by end of summer - our charter is to produce
    > releases every 12 to 16 months, previous one was July 1st, 2022.

    For the 1.3 schedule, please also see
    https://lore.kernel.org/virtio-comment/87edmp4z04.fsf@redhat.com/




  • 60.  RE: [virtio] RE: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-13-2023 11:33

    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Tuesday, June 13, 2023 4:58 AM

    > > 2. Design and its review to finish by July-30 3. Spec to review by
    > > Aug-30
    >
    > Wait a second are we talking about 1.4 or 1.3? For 1.3 indeed we need to wrap
    > development up by end of summer - our charter is to produce releases every 12
    > to 16 months, previous one was July 1st, 2022.
    >
    1.4.



  • 62.  Re: [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-07-2023 02:48
    On Fri, Jun 2, 2023 at 6:03 AM Parav Pandit <parav@nvidia.com> wrote:
    >
    > Add virtio net device requirements for receive flow steering.

    While at this, I would go with a well defined RX frame processing
    pipeline like a modern NIC instead of adding individual features like:

    - filter
    - hashing
    - queue selection

    etc.

    Thanks
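
    For illustration of the staged pipeline idea above (filter, then hashing,
    then queue selection), a minimal self-contained sketch follows; all names
    are hypothetical and come neither from the spec nor from the patch.

        #include <stdbool.h>
        #include <stdint.h>

        struct pkt {
            uint32_t flow_hash;     /* RSS hash over the selected fields */
            bool     has_flow_rule; /* stands in for an n-tuple filter hit */
            uint16_t rule_rq;       /* destination rq of the matching rule */
        };

        /* Stage 1: exact-match flow filter; a hit fully decides the queue. */
        static bool stage_flow_filter(const struct pkt *p, uint16_t *rq)
        {
            if (p->has_flow_rule) {
                *rq = p->rule_rq;
                return true;
            }
            return false;
        }

        /* Stage 2: RSS hashing and indirection, used only on a filter miss. */
        static uint16_t stage_rss(const struct pkt *p, uint16_t num_rq)
        {
            return (uint16_t)(p->flow_hash % num_rq);
        }

        static uint16_t rx_select_queue(const struct pkt *p, uint16_t num_rq)
        {
            uint16_t rq;

            if (stage_flow_filter(p, &rq)) /* filters run before RSS */
                return rq;
            return stage_rss(p, num_rq);   /* fall through on a miss */
        }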

    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 33 +++++++++++++++++++++++++++++++++
    > 1 file changed, 33 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index fc36f31..41242b4 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
    > 1. Device counters visible to the driver
    > 2. Low latency tx and rx virtqueues for PCI transport
    > 3. Virtqueue notification coalescing re-arming support
    > +4. Receive virtqueue n-tuple steering
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -151,3 +152,35 @@ struct vnet_rx_completion {
    > coalescing after processing the packets per VQ. This ensures that when
    > networking stack decides to poll, no new notifications are generated when
    > per VQ notification coalescing is used.
    > +
    > +## 3.4 Receive virtqueue n-tuple flow steering (RFS)
    > +1. The driver should be able to define a receive packet match criteria, an
    > + action and a destination for a packet. For example, an ipv4 packet with a
    > + multicast address to be steered to the receive vq 0. The second example is
    > + ipv4, tcp packet matching a specified IP address and tcp port tuple to
    > + steered to receive vq 10.
    > +2. The match criteria should include exact tuple fields well-defined such as mac
    > + address, IP addresses, tcp/udp ports, etc.
    > +3. The match criteria should also include the field mask.
    > +4. The match criteria may optionally also include specific packet data offset,
    > + length, and matching pattern, which may not be defined in the standard RFC.
    > +5. Action includes (a) dropping or (b) forwarding the packet.
    > +6. Destination location is a receive virtqueue index or rss context.
    > +7. The device should process RFS rules before RSS rules, i.e., when there is a
    > + miss on the RFS rule RSS rule applies.
    > +8. The device should be able to add/remove these rules at a rate of 100K
    > + rules/sec.
    > +9. If multiple rules are programmed which has overlapping attributes for a
    > + received packet, the driver to define the location/priority of the rule.
    > +10. The driver should be able to query flow steering table entries programmed in
    > + the device by the flow id.
    > +11. The driver should be able to add the entry for attributes (a) match
    > + criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    > +12. The driver should be able to delete the steering rule entry via a unique id.
    > +13. The driver should be able to add and delete the steering rules in out of
    > + order manner without depending on previous commands.
    > +14. A group member device should be able to query the attributes of the flow
    > + steering that device supports.
    > +15. The driver and group owner driver should be able to query supported device
    > + limits for the steering entries.
    > +
    > --
    > 2.26.2
    >




  • 64.  Re: [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-13-2023 02:57


    On 2023/6/2 6:03 AM, Parav Pandit wrote:
    > Add virtio net device requirements for receive flow steering.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 33 +++++++++++++++++++++++++++++++++
    > 1 file changed, 33 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index fc36f31..41242b4 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
    > 1. Device counters visible to the driver
    > 2. Low latency tx and rx virtqueues for PCI transport
    > 3. Virtqueue notification coalescing re-arming support
    > +4. Receive virtqueue n-tuple steering
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -151,3 +152,35 @@ struct vnet_rx_completion {
    > coalescing after processing the packets per VQ. This ensures that when
    > networking stack decides to poll, no new notifications are generated when
    > per VQ notification coalescing is used.
    > +
    > +## 3.4 Receive virtqueue n-tuple flow steering (RFS)
    > +1. The driver should be able to define a receive packet match criteria, an
    > + action and a destination for a packet. For example, an ipv4 packet with a
    > + multicast address to be steered to the receive vq 0. The second example is
    > + ipv4, tcp packet matching a specified IP address and tcp port tuple to
    > + steered to receive vq 10.
    > +2. The match criteria should include exact tuple fields well-defined such as mac
    > + address, IP addresses, tcp/udp ports, etc.
    > +3. The match criteria should also include the field mask.
    > +4. The match criteria may optionally also include specific packet data offset,

    Is this the offset of the packet payload (not including the packet
    header)? If yes, what might be the usage scenarios?

    > + length, and matching pattern, which may not be defined in the standard RFC.
    > +5. Action includes (a) dropping or (b) forwarding the packet.
    > +6. Destination location is a receive virtqueue index or rss context.
    > +7. The device should process RFS rules before RSS rules, i.e., when there is a
    > + miss on the RFS rule RSS rule applies.
    > +8. The device should be able to add/remove these rules at a rate of 100K
    > + rules/sec.

    Why do we need to add this limit for the device, please? I think that
    different devices may handle the ctrlq at different speeds.

    > +9. If multiple rules are programmed which has overlapping attributes for a
    > + received packet, the driver to define the location/priority of the rule.
    > +10. The driver should be able to query flow steering table entries programmed in
    > + the device by the flow id.

    Combining points 9 and 10, can I deduce that the attributes of a flow
    rule include identifier (id), position (index of the rule entry) and
    priority?
    According to point 11, the driver needs to provide a unique id; if the
    location and priority are not provided, they should have predefined
    default values.

    > +11. The driver should be able to add the entry for attributes (a) match
    > + criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    > +12. The driver should be able to delete the steering rule entry via a unique id.
    > +13. The driver should be able to add and delete the steering rules in out of
    > + order manner without depending on previous commands.
    > +14. A group member device should be able to query the attributes of the flow
    > + steering that device supports.

    Does a group member driver support inserting rules? The group owner
    driver and a group member driver should each have the ability to add/remove
    rules for themselves.

    > +15. The driver and group owner driver should be able to query supported device
    > + limits for the steering entries.
    > +

    Does this include the number of rules supported by the device? Are there
    other limits?

    Thanks!







  • 66.  RE: [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-13-2023 04:16


    > From: Heng Qi <hengqi@linux.alibaba.com>
    > Sent: Monday, June 12, 2023 10:57 PM

    [..]
    > > +4. The match criteria may optionally also include specific packet
    > > +data offset,
    >
    > Is this the offset of the packet payload (not including the packet header)? If yes,
    > what might be the usage scenarios?
    >
    Some packet header fields may not be defined initially by us.
    For example, some protocol FOO which has a header after the UDP header,
    and the driver wants to filter on it and steer to a specific RQ.

    Virtio may not have defined such a protocol at the beginning, so a generic extension allows steering such packets.
    At that point the line between header and data is blurry. :)
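
    A minimal sketch of such a generic offset/length/pattern match follows,
    assuming a hypothetical ff_raw_match layout; the FOO tag value and the
    offsets are made-up examples.

        #include <stdint.h>

        struct ff_raw_match {
            uint16_t offset;      /* byte offset from the start of the packet */
            uint16_t length;      /* number of bytes to compare */
            uint8_t  pattern[16]; /* value expected at [offset, offset+length) */
            uint8_t  mask[16];    /* bitmask applied to packet bytes and pattern */
        };

        /* Example: match a hypothetical 4-byte FOO tag right after the UDP
         * header; Ethernet + IPv4 (no options) + UDP gives 14 + 20 + 8 = 42. */
        static const struct ff_raw_match foo_match = {
            .offset  = 42,
            .length  = 4,
            .pattern = { 0xf0, 0x0d, 0x00, 0x01 },
            .mask    = { 0xff, 0xff, 0xff, 0xff },
        };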

    > > + length, and matching pattern, which may not be defined in the standard
    > RFC.
    > > +5. Action includes (a) dropping or (b) forwarding the packet.
    > > +6. Destination location is a receive virtqueue index or rss context.
    > > +7. The device should process RFS rules before RSS rules, i.e., when there is a
    > > + miss on the RFS rule RSS rule applies.
    > > +8. The device should be able to add/remove these rules at a rate of 100K
    > > + rules/sec.
    >
    > Why do we need to add this limit for the device please? I think that different
    > devices may handle ctrq at different speeds.
    >
    ARFS rule insertion is fast, and tcp streams should be able to converge quicker on the desired cpu.
    So rule insertion should be fast enough.
    Many devices can/may be able to do this as a semi-fast path operation, unlike CVQ, which is typically a device level control operation, often done as a slow process.

    So the driver-device interface should be able to scale up to low latency and high throughput flow insert/delete ops.
    It is ok if one device chooses not to support such a high rate, but interface definition wise I imagine it to be on its own dedicated queue.

    > > +9. If multiple rules are programmed which has overlapping attributes for a
    > > + received packet, the driver to define the location/priority of the rule.
    > > +10. The driver should be able to query flow steering table entries
    > programmed in
    > > + the device by the flow id.
    >
    > Combining points 9 and 10, can I deduce that the attributes of a flow rule
    > include identifier (id), position (index of the rule entry) and priority?

    Yes, I also think that way.
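
    Pulling together the attributes agreed so far (requirements 9-12), a rule
    entry might conceptually look like the sketch below; the struct and field
    names are hypothetical and the match fields are only an illustrative subset.

        #include <stdint.h>

        enum ff_action {
            FF_ACTION_DROP    = 0,
            FF_ACTION_FORWARD = 1,
        };

        struct ff_match {                    /* subset of exact tuple fields */
            uint8_t  dmac[6],  dmac_mask[6];
            uint32_t dst_ip,   dst_ip_mask;  /* IPv4 shown for brevity */
            uint16_t dst_port, dst_port_mask;
        };

        struct ff_rule {
            uint32_t id;          /* unique, driver-assigned 32-bit id */
            uint16_t priority;    /* resolves overlapping rules */
            uint16_t action;      /* enum ff_action */
            uint32_t destination; /* receive vq index or rss context */
            struct ff_match match;
        };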

    > According to point 11, the driver needs to provide a unique id, if the location
    > and priority are not provided, they should have predefined default values.
    >
    Yes.
    I am not fully sure of it, as it brings some uncertainty in behavior.
    So I am having second thoughts; the driver should always provide it.

    > > +11. The driver should be able to add the entry for attributes (a) match
    > > + criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    > > +12. The driver should be able to delete the steering rule entry via a unique
    > id.
    > > +13. The driver should be able to add and delete the steering rules in out of
    > > + order manner without depending on previous commands.
    > > +14. A group member device should be able to query the attributes of the
    > flow
    > > + steering that device supports.
    >
    > Does a group member driver support insertion rules?  The group owner driver
    > and a group member driver should have the ability to add/remove rules for
    > themselves.
    >
    Yes, the group owner adds rules for itself, not on behalf of others.
    Same as a group member.
    There is no difference between them, as they operate on their own device.

    In the future, a group owner may want to steer packets towards a group member, like an eswitch.

    > > +15. The driver and group owner driver should be able to query supported
    > device
    > > + limits for the steering entries.
    > > +
    >
    > Does this include the number of rules supported by the device? Are there other
    > limits?

    Let's list them if you have some in mind.



  • 68.  Re: [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-13-2023 05:04


    On 2023/6/13 12:16 PM, Parav Pandit wrote:
    >
    >> From: Heng Qi <hengqi@linux.alibaba.com>
    >> Sent: Monday, June 12, 2023 10:57 PM
    > [..]
    >>> +4. The match criteria may optionally also include specific packet
    >>> +data offset,
    >> Is this the offset of the packet payload (not including the packet header)? If yes,
    >> what might be the usage scenarios?
    >>
    > Some packet header fields may not be defined initially by us.
    > For example some protocol FOO type which has some header after UDP header.
    > And wants to filter and steer to a specific RQ.

    Ok, I got it.

    >
    > A virtio may not have defined such protocol at the beginning. So a generic extension allows to steer packet.
    > At that point line between header and data is blurry. :)

    Then I think it's just an *offset* from the start of the packet so we
    can use this capability for the packet header as well.

    >
    >>> + length, and matching pattern, which may not be defined in the standard
    >> RFC.
    >>> +5. Action includes (a) dropping or (b) forwarding the packet.
    >>> +6. Destination location is a receive virtqueue index or rss context.
    >>> +7. The device should process RFS rules before RSS rules, i.e., when there is a
    >>> + miss on the RFS rule RSS rule applies.
    >>> +8. The device should be able to add/remove these rules at a rate of 100K
    >>> + rules/sec.
    >> Why do we need to add this limit for the device please? I think that different
    >> devices may handle ctrq at different speeds.
    >>
    > ARFS rules insertion is fast and tcp streams should be able to converge quicker on the desired cpu.
    > So rule insertion should be fast enough.
    > Many of the device can/may be able to do this as some semi-fast path operation unlike CVQ which is typically a device level control operation, often done as slow process.

    We need to consider ARFS for receive flow filters, but receive flow
    filters are more infrastructure,
    do we need to offer a sub-feature to distinguish whether a device
    supports this dedicated queue or not.
    This becomes easier for users who only use receive flow filters without
    ARFS enabled.
    At this point, ARFS and receive flow filters become orthogonal.

    >
    > So driver-device interface to be able to scale up to low latency and high throughput flow insert/delete ops.

    Ok.

    > It is ok one device may choose to not support such high rate, but interface definition wise I imagine to be on its own dedicated queue.
    >
    >>> +9. If multiple rules are programmed which has overlapping attributes for a
    >>> + received packet, the driver to define the location/priority of the rule.
    >>> +10. The driver should be able to query flow steering table entries
    >> programmed in
    >>> + the device by the flow id.
    >> Combining points 9 and 10, can I deduce that the attributes of a flow rule
    >> include identifier (id), position (index of the rule entry) and priority?
    > Yes, I also think that way.
    >
    >> According to point 11, the driver needs to provide a unique id, if the location
    >> and priority are not provided, they should have predefined default values.
    >>
    > Yes.
    > I am not fully sure of it as it brings some uncertainty in behavior.
    > So having second thought as driver to always provide it.

    Yes, and this point is specific, then we can continue to confirm it in
    the future spec version.

    >
    >>> +11. The driver should be able to add the entry for attributes (a) match
    >>> + criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    >>> +12. The driver should be able to delete the steering rule entry via a unique
    >> id.
    >>> +13. The driver should be able to add and delete the steering rules in out of
    >>> + order manner without depending on previous commands.
    >>> +14. A group member device should be able to query the attributes of the
    >> flow
    >>> + steering that device supports.
    >> Does a group member driver support insertion rules?  The group owner driver
    >> and a group member driver should have the ability to add/remove rules for
    >> themselves.
    >>
    > Yes, Group owner adds rule for the self, not on behalf.
    > Same as group member.
    > No difference between them as they operate on self device.

    Ok.

    >
    > In future, a group owner may want to steer the packet towards a group member, like eswitch.

    Yes.
    And what I want to say is that our cloud architecture is similar to
    two-layer virtualization, but we only have one layer of eswitch,
    so we need VF id to let eswitch provide simple flow steering for the
    second layer of virtualization, and do not need complicated eswitch.
    This is the purpose of carrying VF id, but now we can not consider this
    (VF id) in the feature of receive flow filters, so as not to hinder the
    progress of this feature.

    >
    >>> +15. The driver and group owner driver should be able to query supported
    >> device
    >>> + limits for the steering entries.
    >>> +
    >> Does this include the number of rules supported by the device? Are there other
    >> limits?
    > Lets list down if you have some in mind.
    Not yet.

    Thanks!
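    To make the offset-based matching discussed above concrete, below is a minimal sketch of what such a generic match criterion and rule entry could look like, following the numbered points quoted from the patch. Every name, size, and layout here is an assumption made for illustration; none of it is defined by the virtio spec.

    #include <stdint.h>

    typedef uint16_t le16;   /* little-endian on the wire; plain typedefs for the sketch */
    typedef uint32_t le32;

    #define VNET_FLOW_MATCH_MAX_LEN 64   /* assumed maximum pattern length */

    /* Generic match on bytes at an offset from the start of the packet, so
     * fields of protocols virtio has not (yet) defined can still be matched. */
    struct vnet_flow_match {
        le16 offset;                              /* bytes from the start of the packet */
        le16 length;                              /* number of bytes to compare */
        uint8_t pattern[VNET_FLOW_MATCH_MAX_LEN]; /* expected value */
        uint8_t mask[VNET_FLOW_MATCH_MAX_LEN];    /* which bits of pattern must match */
    };

    enum vnet_flow_action {
        VNET_FLOW_ACTION_DROP    = 0,   /* point 5(a) */
        VNET_FLOW_ACTION_FORWARD = 1,   /* point 5(b) */
    };

    struct vnet_flow_rule {
        struct vnet_flow_match match;   /* points 1-4: match criteria */
        le16 action;                    /* enum vnet_flow_action */
        le16 dest;                      /* point 6: rq index or rss context */
        le32 flow_id;                   /* point 11(d): driver-assigned unique 32-bit id */
    };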






  • 70.  RE: [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-13-2023 12:24


    > From: Heng Qi <hengqi@linux.alibaba.com>
    > Sent: Tuesday, June 13, 2023 1:04 AM
    >
    >
    > On 2023/6/13 12:16 PM, Parav Pandit wrote:
    > >
    > >> From: Heng Qi <hengqi@linux.alibaba.com>
    > >> Sent: Monday, June 12, 2023 10:57 PM
    > > [..]
    > >>> +4. The match criteria may optionally also include specific packet
    > >>> +data offset,
    > >> Is this the offset of the packet payload (not including the packet
    > >> header)? If yes, what might be the usage scenarios?
    > >>
    > > Some packet header fields may not be defined initially by us.
    > > For example some protocol FOO type which has some header after UDP
    > header.
    > > And wants to filter and steer to a specific RQ.
    >
    > Ok, I got it.
    >
    > >
    > > A virtio may not have defined such protocol at the beginning. So a generic
    > extension allows to steer packet.
    > > At that point line between header and data is blurry. :)
    >
    > Then I think it's just an *offset* from the start of the packet so we can use this
    > capability for the packet header as well.
    >
    Yes, will change it. "packet data" was misleading. It is just the packet offset.

    > >
    > >>> + length, and matching pattern, which may not be defined in the
    > >>> + standard
    > >> RFC.
    > >>> +5. Action includes (a) dropping or (b) forwarding the packet.
    > >>> +6. Destination location is a receive virtqueue index or rss context.
    > >>> +7. The device should process RFS rules before RSS rules, i.e., when there is
    > a
    > >>> + miss on the RFS rule RSS rule applies.
    > >>> +8. The device should be able to add/remove these rules at a rate of 100K
    > >>> + rules/sec.
    > >> Why do we need to add this limit for the device please? I think that
    > >> different devices may handle ctrq at different speeds.
    > >>
    > > ARFS rules insertion is fast and tcp streams should be able to converge quicker
    > on the desired cpu.
    > > So rule insertion should be fast enough.
    > > Many of the device can/may be able to do this as some semi-fast path
    > operation unlike CVQ which is typically a device level control operation, often
    > done as slow process.
    >
    > We need to consider ARFS for receive flow filters, but receive flow filters are
    > more infrastructure, do we need to offer a sub-feature to distinguish whether a
    > device supports this dedicated queue or not.
    > This becomes easier for users who only use receive flow filters without ARFS
    > enabled.
    > At this point, ARFS and receive flow filters become orthogonal.
    >
    If it's a "flowvq" dedicated queue device, both the requirements are addressed.
    The device can choose to implement it in its preferred way.

    > >
    > > So driver-device interface to be able to scale up to low latency and high
    > throughput flow insert/delete ops.
    >
    > Ok.
    >
    > > It is ok one device may choose to not support such high rate, but interface
    > definition wise I imagine to be on its own dedicated queue.
    > >
    > >>> +9. If multiple rules are programmed which has overlapping attributes for a
    > >>> + received packet, the driver to define the location/priority of the rule.
    > >>> +10. The driver should be able to query flow steering table entries
    > >> programmed in
    > >>> + the device by the flow id.
    > >> Combining points 9 and 10, can I deduce that the attributes of a flow
    > >> rule include identifier (id), position (index of the rule entry) and priority?
    > > Yes, I also think that way.
    > >
    > >> According to point 11, the driver needs to provide a unique id, if
    > >> the location and priority are not provided, they should have predefined
    > default values.
    > >>
    > > Yes.
    > > I am not fully sure of it as it brings some uncertainty in behavior.
    > > So having second thought as driver to always provide it.
    >
    > Yes, and this point is specific, then we can continue to confirm it in the future
    > spec version.
    >
    Ok. I will revise it in v1.

    > >
    > >>> +11. The driver should be able to add the entry for attributes (a) match
    > >>> + criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    > >>> +12. The driver should be able to delete the steering rule entry via
    > >>> +a unique
    > >> id.
    > >>> +13. The driver should be able to add and delete the steering rules in out of
    > >>> + order manner without depending on previous commands.
    > >>> +14. A group member device should be able to query the attributes of
    > >>> +the
    > >> flow
    > >>> + steering that device supports.
    > >> Does a group member driver support insertion rules?  The group owner
    > >> driver and a group member driver should have the ability to
    > >> add/remove rules for themselves.
    > >>
    > > Yes, Group owner adds rule for the self, not on behalf.
    > > Same as group member.
    > > No difference between them as they operate on self device.
    >
    > Ok.
    >
    > >
    > > In future, a group owner may want to steer the packet towards a group
    > member, like eswitch.
    >
    > Yes.
    > And what I want to say is that our cloud architecture is similar to two-layer
    > virtualization, but we only have one layer of eswitch, so we need VF id to let
    > eswitch provide simple flow steering for the second layer of virtualization, and
    > do not need complicated eswitch.
    > This is the purpose of carrying VF id, but now we can not consider this (VF id) in
    > the feature of receive flow filters, so as not to hinder the progress of this
    > feature.
    >
    Ok. Let's keep this in notes and in mind to have the abstraction, and not jump to baking the VF id into the spec in the initial version.
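    As an illustration of the dedicated "flowvq" idea above, here is a rough sketch of a per-rule command on such a queue. All names and the layout are assumptions for discussion only; whether a device then services these commands at 5 rules/sec or 100K rules/sec remains its own implementation choice.

    #include <stdint.h>

    typedef uint16_t le16;
    typedef uint32_t le32;

    enum vnet_flowvq_op {
        VNET_FLOWVQ_OP_ADD = 1,   /* carries match criteria, action, destination, id */
        VNET_FLOWVQ_OP_DEL = 2,   /* deletes a rule by its unique id */
    };

    /* One command placed on the dedicated flowvq. Because every command carries
     * the driver-assigned 32-bit flow_id, the device can complete adds and
     * deletes out of order without depending on previous commands. */
    struct vnet_flowvq_cmd {
        le16 op;          /* enum vnet_flowvq_op */
        le16 reserved;
        le32 flow_id;     /* unique per rule, assigned by the driver */
        /* For VNET_FLOWVQ_OP_ADD, the match criteria, action and destination
         * (rq index or rss context) would follow here. */
    };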





  • 72.  Re: [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-14-2023 03:43


    On 2023/6/13 8:24 PM, Parav Pandit wrote:
    >
    >> From: Heng Qi <hengqi@linux.alibaba.com>
    >> Sent: Tuesday, June 13, 2023 1:04 AM
    >>
    >>
    >> On 2023/6/13 12:16 PM, Parav Pandit wrote:
    >>>> From: Heng Qi <hengqi@linux.alibaba.com>
    >>>> Sent: Monday, June 12, 2023 10:57 PM
    >>> [..]
    >>>>> +4. The match criteria may optionally also include specific packet
    >>>>> +data offset,
    >>>> Is this the offset of the packet payload (not including the packet
    >>>> header)? If yes, what might be the usage scenarios?
    >>>>
    >>> Some packet header fields may not be defined initially by us.
    >>> For example some protocol FOO type which has some header after UDP
    >> header.
    >>> And wants to filter and steer to a specific RQ.
    >> Ok, I got it.
    >>
    >>> A virtio may not have defined such protocol at the beginning. So a generic
    >> extension allows to steer packet.
    >>> At that point line between header and data is blurry. :)
    >> Then I think it's just an *offset* from the start of the packet so we can use this
    >> capability for the packet header as well.
    >>
    > Yes, will change it. "packet data" was misleading. It is just the packet offset.
    >
    >>>>> + length, and matching pattern, which may not be defined in the
    >>>>> + standard
    >>>> RFC.
    >>>>> +5. Action includes (a) dropping or (b) forwarding the packet.
    >>>>> +6. Destination location is a receive virtqueue index or rss context.
    >>>>> +7. The device should process RFS rules before RSS rules, i.e., when there is
    >> a
    >>>>> + miss on the RFS rule RSS rule applies.
    >>>>> +8. The device should be able to add/remove these rules at a rate of 100K
    >>>>> + rules/sec.
    >>>> Why do we need to add this limit for the device please? I think that
    >>>> different devices may handle ctrq at different speeds.
    >>>>
    >>> ARFS rules insertion is fast and tcp streams should be able to converge quicker
    >> on the desired cpu.
    >>> So rule insertion should be fast enough.
    >>> Many of the device can/may be able to do this as some semi-fast path
    >> operation unlike CVQ which is typically a device level control operation, often
    >> done as slow process.
    >>
    >> We need to consider ARFS for receive flow filters, but receive flow filters are
    >> more infrastructure, do we need to offer a sub-feature to distinguish whether a
    >> device supports this dedicated queue or not.
    >> This becomes easier for users who only use receive flow filters without ARFS
    >> enabled.
    >> At this point, ARFS and receive flow filters become orthogonal.
    >>
    > If it's a "flowvq" dedicated queue device both the requirements are addressed.
    > Device can choose to implement in its preferred way.

    I think this is a migration issue:
    Devices A and B each offer the VIRTIO_NET_F_RECEIVE_FLOW_FILTER (this
    is a temporary name) feature.
    Device A implements flowvq while device B does not; they
    are both using receive flow filters but ARFS is disabled.
    The migration should fail at this time. Does this meet user expectations?

    Or do we warn in the spec that using flow vq might cause the migration
    issue?

    Thanks!

    >
    >>> So driver-device interface to be able to scale up to low latency and high
    >> throughput flow insert/delete ops.
    >>
    >> Ok.
    >>
    >>> It is ok one device may choose to not support such high rate, but interface
    >> definition wise I imagine to be on its own dedicated queue.
    >>>>> +9. If multiple rules are programmed which has overlapping attributes for a
    >>>>> + received packet, the driver to define the location/priority of the rule.
    >>>>> +10. The driver should be able to query flow steering table entries
    >>>> programmed in
    >>>>> + the device by the flow id.
    >>>> Combining points 9 and 10, can I deduce that the attributes of a flow
    >>>> rule include identifier (id), position (index of the rule entry) and priority?
    >>> Yes, I also think that way.
    >>>
    >>>> According to point 11, the driver needs to provide a unique id, if
    >>>> the location and priority are not provided, they should have predefined
    >> default values.
    >>> Yes.
    >>> I am not fully sure of it as it brings some uncertainty in behavior.
    >>> So having second thought as driver to always provide it.
    >> Yes, and this point is specific, then we can continue to confirm it in the future
    >> spec version.
    >>
    > Ok. I will revise it in v1.
    >
    >>>>> +11. The driver should be able to add the entry for attributes (a) match
    >>>>> + criteria, (b) action, (c) destination and (d) assign a unique id of 32 bits.
    >>>>> +12. The driver should be able to delete the steering rule entry via
    >>>>> +a unique
    >>>> id.
    >>>>> +13. The driver should be able to add and delete the steering rules in out of
    >>>>> + order manner without depending on previous commands.
    >>>>> +14. A group member device should be able to query the attributes of
    >>>>> +the
    >>>> flow
    >>>>> + steering that device supports.
    >>>> Does a group member driver support insertion rules?  The group owner
    >>>> driver and a group member driver should have the ability to
    >>>> add/remove rules for themselves.
    >>>>
    >>> Yes, Group owner adds rule for the self, not on behalf.
    >>> Same as group member.
    >>> No difference between them as they operate on self device.
    >> Ok.
    >>
    >>> In future, a group owner may want to steer the packet towards a group
    >> member, like eswitch.
    >>
    >> Yes.
    >> And what I want to say is that our cloud architecture is similar to two-layer
    >> virtualization, but we only have one layer of eswitch, so we need VF id to let
    >> eswitch provide simple flow steering for the second layer of virtualization, and
    >> do not need complicated eswitch.
    >> This is the purpose of carrying VF id, but now we can not consider this (VF id) in
    >> the feature of receive flow filters, so as not to hinder the progress of this
    >> feature.
    >>
    > Ok. Lets keep this in notes and mind to have the abstraction and don't jump to bake vf id in the spec in the initial version.






  • 74.  RE: [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-14-2023 03:48

    > From: Heng Qi <hengqi@linux.alibaba.com>
    > Sent: Tuesday, June 13, 2023 11:43 PM

    > > If it's a "flowvq" dedicated queue device both the requirements are
    > addressed.
    > > Device can choose to implement in its preferred way.
    >
    > I think this is a migration issue:
    > Devices A and B each offser the VIRTIO_NET_F_RECEIVE_FLOW_FILTER (this is a
    > temporary name) feature.
    > And device A implements flowvq, device B does not implement flowvq, they are
    > both using receive flow filters but ARFS is disabled.
    > The migration should fail at this time. Does this meet user expectations?
    >
    If F_RECEIVE_FLOW_FILTER is offered, it means the device supports flowvq on both src and dst.
    So migration won't fail.

    It is no different from any other net _F feature,
    for example notification coalescing, _MQ or _RSS.

    What I meant above by "preferred way" is that a device can choose to operate on flow table entries at 5 flows/sec,
    or it can choose to do 100K flows/sec.

    The basic requirement is that flow rule entries should be a low latency interface.

    > Or do we warn in the spec that using flow vq might cause the migration issue?
    >
    It should not cause a migration failure once the device features and associated attributes can be queried by the group owner.
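    A small sketch of the driver-side gating implied here: migration compatibility keys on the negotiated feature bit, not on how fast a particular device services the rules. The feature name is the temporary one used in this thread and the bit number is a placeholder.

    #include <stdbool.h>
    #include <stdint.h>

    #define VIRTIO_NET_F_RECEIVE_FLOW_FILTER 60   /* placeholder bit number */

    static bool vnet_has_feature(uint64_t features, unsigned int bit)
    {
        return (features >> bit) & 1;
    }

    /* ARFS (or any other flow-filter user) is enabled only when the feature is
     * negotiated; the device's rule-insertion rate is an implementation detail
     * and does not affect src/dst migration compatibility. */
    static bool vnet_can_use_flow_filters(uint64_t negotiated_features)
    {
        return vnet_has_feature(negotiated_features,
                                VIRTIO_NET_F_RECEIVE_FLOW_FILTER);
    }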






  • 76.  Re: [virtio-comment] RE: [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow steering requirements

    Posted 06-14-2023 03:53


    On 2023/6/14 11:48 AM, Parav Pandit wrote:
    >> From: Heng Qi <hengqi@linux.alibaba.com>
    >> Sent: Tuesday, June 13, 2023 11:43 PM
    >>> If it's a "flowvq" dedicated queue device both the requirements are
    >> addressed.
    >>> Device can choose to implement in its preferred way.
    >> I think this is a migration issue:
    >> Devices A and B each offser the VIRTIO_NET_F_RECEIVE_FLOW_FILTER (this is a
    >> temporary name) feature.
    >> And device A implements flowvq, device B does not implement flowvq, they are
    >> both using receive flow filters but ARFS is disabled.
    >> The migration should fail at this time. Does this meet user expectations?
    >>
    > If F_RECEIVE_FLOW_FILTER is offered, it means a device supports flowvq on src and dst.
    > So migration wont fail.
    >
    > It is no different than any other net _F feature.
    > For example notification coalescing or _MQ or _RSS.
    >
    > What I meant in above "preferred way" is, device can choose to operate flow table entries as 5 flow/sec.
    > Or it can choose to do 100K flows/sec.

    Ok, I see. I misunderstood the "preferred way" before :)

    >
    > The basic requirement is that flow rule entries should be a low latency interface.
    >
    >> Or do we warn in the spec that using flow vq might cause the migration issue?
    >>
    > It should not cause migration once the device features and associated attributes can be queries by the group owner.
    >






  • 78.  [PATCH requirements 6/7] net-features: Add packet timestamp requirements

    Posted 06-01-2023 22:04
    Add tx and rx packet timestamp requirements.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
     net-workstream/features-1.4.md | 25 +++++++++++++++++++++++++
     1 file changed, 25 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index 41242b4..f94848e 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -10,6 +10,7 @@ together is desired while updating the virtio net interface.
     2. Low latency tx and rx virtqueues for PCI transport
     3. Virtqueue notification coalescing re-arming support
     4. Receive virtqueue n-tuple steering
    +5. Device timestamp for tx and rx packets

     # 3. Requirements
     ## 3.1 Device counters
    @@ -184,3 +185,27 @@ struct vnet_rx_completion {
     15. The driver and group owner driver should be able to query supported device
         limits for the steering entries.

    +## 3.5 Packet timestamp
    +1. Device should provide transmit timestamp and receive timestamp of the packets
    +   at per packet level when the device is enabled.
    +2. Device should provide the current free running clock in the least latency
    +   possible using an MMIO register read of 64-bit to have the least jitter.
    +3. Device should provide the current frequency and the frequency unit for the
    +   software to synchronize the reference point of software and the device using
    +   a control vq command.
    +
    +### 3.5.1 Transmit timestamp
    +1. Transmit completion must contain a packet transmission timestamp when the
    +   device is enabled for it.
    +2. The device should record the packet transmit timestamp in the completion at
    +   the farthest egress point towards the network.
    +3. The device must provide a transmit packet timestamp in a single DMA
    +   transaction along with the rest of the transmit completion fields.
    +
    +### 3.5.2 Receive timestamp
    +1. Receive completion must contain a packet reception timestamp when the device
    +   is enabled for it.
    +2. The device should record the received packet timestamp at the closet ingress
    +   point of reception from the network.
    +3. The device should provide a receive packet timestamp in a single DMA
    +   transaction along with the rest of the receive completion fields.
    --
    2.26.2
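    For readers of the patch above, a minimal sketch of what requirements 3.5.1 and 3.5.2 imply for the completion layout: the 64-bit device timestamp travels in the same DMA write as the rest of the completion fields. The structures and field names are illustrative assumptions, not spec definitions.

    #include <stdint.h>

    typedef uint16_t le16;
    typedef uint32_t le32;
    typedef uint64_t le64;

    /* Transmit completion with the timestamp recorded at the farthest egress
     * point towards the network (3.5.1), delivered in a single DMA transaction. */
    struct vnet_tx_completion_ts {
        le16 vq_index;
        le16 flags;
        le32 reserved;
        le64 timestamp;   /* device clock ticks at packet transmission */
    };

    /* Receive completion with the timestamp recorded at the closest ingress
     * point from the network (3.5.2), delivered in a single DMA transaction. */
    struct vnet_rx_completion_ts {
        le16 vq_index;
        le16 flags;
        le32 pkt_len;
        le64 timestamp;   /* device clock ticks at packet reception */
    };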


  • 79.  Re: [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements

    Posted 06-06-2023 22:40
    On Fri, Jun 02, 2023 at 01:03:04AM +0300, Parav Pandit wrote:
    > Add tx and rx packet timestamp requirements.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 25 +++++++++++++++++++++++++
    > 1 file changed, 25 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index 41242b4..f94848e 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -10,6 +10,7 @@ together is desired while updating the virtio net interface.
    > 2. Low latency tx and rx virtqueues for PCI transport
    > 3. Virtqueue notification coalescing re-arming support
    > 4. Receive virtqueue n-tuple steering
    > +5. Device timestamp for tx and rx packets
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -184,3 +185,27 @@ struct vnet_rx_completion {
    > 15. The driver and group owner driver should be able to query supported device
    > limits for the steering entries.
    >
    > +## 3.5 Packet timestamp
    > +1. Device should provide transmit timestamp and receive timestamp of the packets
    > + at per packet level when the device is enabled.
    > +2. Device should provide the current free running clock in the least latency
    > + possible using an MMIO register read of 64-bit to have the least jitter.

    wow these reads are expensive. what is the actual requirement?

    > +3. Device should provide the current frequency and the frequency unit for the
    > + software to synchronize the reference point of software and the device using
    > + a control vq command.

    let's leave mechanism out of it. I think you are trying to say
    this is async to data vqs? and cvq is fine.

    > +
    > +### 3.5.1 Transmit timestamp
    > +1. Transmit completion must contain a packet transmission timestamp when the
    > + device is enabled for it.
    > +2. The device should record the packet transmit timestamp in the completion at
    > + the farthest egress point towards the network.
    > +3. The device must provide a transmit packet timestamp in a single DMA
    > + transaction along with the rest of the transmit completion fields.
    > +
    > +### 3.5.2 Receive timestamp
    > +1. Receive completion must contain a packet reception timestamp when the device
    > + is enabled for it.
    > +2. The device should record the received packet timestamp at the closet ingress
    > + point of reception from the network.
    > +3. The device should provide a receive packet timestamp in a single DMA
    > + transaction along with the rest of the receive completion fields.

    how large do these need to be? ok for timer to overflow?


    why do you mention migration in counters but not in timers?

    is it ok for time to go back?



    > --
    > 2.26.2
    >






  • 81.  RE: [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements

    Posted 06-06-2023 22:51


    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Tuesday, June 6, 2023 6:40 PM

    > > +## 3.5 Packet timestamp
    > > +1. Device should provide transmit timestamp and receive timestamp of the
    > packets
    > > + at per packet level when the device is enabled.
    > > +2. Device should provide the current free running clock in the least latency
    > > + possible using an MMIO register read of 64-bit to have the least jitter.
    >
    > wow these reads are expensive. what is the actual requirement?
    >
    I don't understand the question.

    > > +3. Device should provide the current frequency and the frequency unit for
    > the
    > > + software to synchronize the reference point of software and the device
    > using
    > > + a control vq command.
    >
    > let's leave mechanism out of it. I think you are trying to say this is async to data
    > vqs? and cvq is fine.
    >
    There are two clocks running: one on the CPU where the driver is running, the other in the device.
    And the timestamps are recorded by the device based on its clock.

    So the driver can convert the device timestamp to the driver CPU timestamp.
    This is like a one-time query to learn the conversion mechanics.

    > > +### 3.5.2 Receive timestamp
    > > +1. Receive completion must contain a packet reception timestamp when the
    > device
    > > + is enabled for it.
    > > +2. The device should record the received packet timestamp at the closet
    > ingress
    > > + point of reception from the network.
    > > +3. The device should provide a receive packet timestamp in a single DMA
    > > + transaction along with the rest of the receive completion fields.
    >
    > how large do these need to be? ok for timer to overflow?
    >
    Will add it. Usually 8 bytes,
    so overflow only happens after 30+ years or so, which is the common norm.


    >
    > why do you mention migration in counters but not in timers?
    >
    > is it ok for time to go back?

    Should add it for the timer too.
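    To illustrate the clock-conversion point above: the driver learns the device clock frequency once and then converts per-packet device timestamps to host time. The helper below is a hypothetical sketch, not an API from the spec; a one-time offset between the two clocks would additionally be needed to map the result onto the driver CPU's timeline.

    #include <stdint.h>

    /* Convert a device timestamp (free-running counter ticks) to nanoseconds,
     * given the device clock frequency in Hz reported once by the device. */
    static uint64_t vnet_dev_ts_to_ns(uint64_t dev_ticks, uint64_t dev_freq_hz)
    {
        /* Split the division to avoid overflowing 64 bits for large tick values. */
        uint64_t secs = dev_ticks / dev_freq_hz;
        uint64_t rem  = dev_ticks % dev_freq_hz;

        return secs * 1000000000ULL + (rem * 1000000000ULL) / dev_freq_hz;
    }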





  • 83.  Re: [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements

    Posted 06-06-2023 23:08
    On Tue, Jun 06, 2023 at 10:51:25PM +0000, Parav Pandit wrote:
    >
    >
    > > From: Michael S. Tsirkin <mst@redhat.com>
    > > Sent: Tuesday, June 6, 2023 6:40 PM
    >
    > > > +## 3.5 Packet timestamp
    > > > +1. Device should provide transmit timestamp and receive timestamp of the
    > > packets
    > > > + at per packet level when the device is enabled.
    > > > +2. Device should provide the current free running clock in the least latency
    > > > + possible using an MMIO register read of 64-bit to have the least jitter.
    > >
    > > wow these reads are expensive. what is the actual requirement?
    > >
    > I don't understand the question.

    placing something in a register is a solution. what is the
    requirement?

    > > > +3. Device should provide the current frequency and the frequency unit for
    > > the
    > > > + software to synchronize the reference point of software and the device
    > > using
    > > > + a control vq command.
    > >
    > > let's leave mechanism out of it. I think you are trying to say this is async to data
    > > vqs? and cvq is fine.
    > >
    > There are two clocks running, one of cpu where driver is running, other one is in the device.
    > And timestamps are done by the device based on its clock.
    >
    > So driver can convert the device timestamp to driver cpu timestamp.
    > This is like one time query to know conversion mechanics.
    >
    > > > +### 3.5.2 Receive timestamp
    > > > +1. Receive completion must contain a packet reception timestamp when the
    > > device
    > > > + is enabled for it.
    > > > +2. The device should record the received packet timestamp at the closet
    > > ingress
    > > > + point of reception from the network.
    > > > +3. The device should provide a receive packet timestamp in a single DMA
    > > > + transaction along with the rest of the receive completion fields.
    > >
    > > how large do these need to be? ok for timer to overflow?
    > >
    > Will add it. Usually 8B.
    > But overflow after 30+ years or so is common norm.
    >
    >
    > >
    > > why do you mention migration in counters but not in timers?
    > >
    > > is it ok for time to go back?
    >
    > Should add it for timer too.






  • 85.  [PATCH requirements 4/7] net-features: Add notification coalescing requirements

    Posted 06-01-2023 22:04
    Add virtio net device notification coalescing improvements requirements.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
     net-workstream/features-1.4.md | 8 ++++++++
     1 file changed, 8 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index 054f951..fc36f31 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -8,6 +8,7 @@ together is desired while updating the virtio net interface.
     # 2. Summary
     1. Device counters visible to the driver
     2. Low latency tx and rx virtqueues for PCI transport
    +3. Virtqueue notification coalescing re-arming support

     # 3. Requirements
     ## 3.1 Device counters
    @@ -143,3 +144,10 @@ struct vnet_rx_completion {
     7. The device should be able to indicate when a full page buffer is consumed,
        which can be recycled by the driver when the packets from the completed
        page is fully consumed.
    +
    +## 3.3 Virtqueue notification coalescing re-enable support
    +1. Tx and Rx virtqueue notification coalescing should auto-disable on
    +   notification reporting to the driver. The driver should be able to enable
    +   coalescing after processing the packets per VQ. This ensures that when
    +   networking stack decides to poll, no new notifications are generated when
    +   per VQ notification coalescing is used.
    --
    2.26.2


  • 86.  Re: [virtio-comment] [PATCH requirements 4/7] net-features: Add notification coalescing requirements

    Posted 06-06-2023 22:37
    On Fri, Jun 02, 2023 at 01:03:02AM +0300, Parav Pandit wrote:
    > Add virtio net device notification coalescing improvements requirements.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 8 ++++++++
    > 1 file changed, 8 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index 054f951..fc36f31 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -8,6 +8,7 @@ together is desired while updating the virtio net interface.
    > # 2. Summary
    > 1. Device counters visible to the driver
    > 2. Low latency tx and rx virtqueues for PCI transport
    > +3. Virtqueue notification coalescing re-arming support
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -143,3 +144,10 @@ struct vnet_rx_completion {
    > 7. The device should be able to indicate when a full page buffer is consumed,
    > which can be recycled by the driver when the packets from the completed
    > page is fully consumed.
    > +
    > +## 3.3 Virtqueue notification coalescing re-enable support
    > +1. Tx and Rx virtqueue notification coalescing should auto-disable on
    > + notification reporting to the driver. The driver should be able to enable
    > + coalescing after processing the packets per VQ. This ensures that when
    > + networking stack decides to poll, no new notifications are generated when
    > + per VQ notification coalescing is used.

    Weirdly specific.
    I don't know what auto-disable and enable coalescing mean.
    Does this refer to VIRTIO_NET_F_NOTF_COAL?
    what is the problem this is trying to solve?

    > --
    > 2.26.2
    >
    >






  • 88.  RE: [virtio-comment] [PATCH requirements 4/7] net-features: Add notification coalescing requirements

    Posted 06-06-2023 22:47


    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Tuesday, June 6, 2023 6:37 PM

    > > # 3. Requirements
    > > ## 3.1 Device counters
    > > @@ -143,3 +144,10 @@ struct vnet_rx_completion { 7. The device should
    > > be able to indicate when a full page buffer is consumed,
    > > which can be recycled by the driver when the packets from the completed
    > > page is fully consumed.
    > > +
    > > +## 3.3 Virtqueue notification coalescing re-enable support 1. Tx and
    > > +Rx virtqueue notification coalescing should auto-disable on
    > > + notification reporting to the driver. The driver should be able to enable
    > > + coalescing after processing the packets per VQ. This ensures that when
    > > + networking stack decides to poll, no new notifications are generated when
    > > + per VQ notification coalescing is used.
    >
    > I don't know what does auto-disable and enable coalescing mean.
    > Does this refer to VIRTIO_NET_F_NOTF_COAL?

    > what is the problem this is trying to solve?
    >
    Currently, when the driver uses F_VQ_NOTF_COAL and is polling, it still receives the event (interrupt) from the device, because the timer is always running
    (the event is not auto-disabled on generation of the event).
    So the number of interrupts is not yet fully under the driver's control.
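
    To illustrate the polling flow being described (and not the proposed interface), here is a minimal C sketch; the helper names are hypothetical and only show that notifications stay off for the whole polling episode and are re-armed once work is exhausted.

        struct my_vq;                                    /* opaque, illustrative */
        void vq_notifications_disable(struct my_vq *vq); /* hypothetical helpers */
        void vq_notifications_enable(struct my_vq *vq);
        int vq_process_used_buffers(struct my_vq *vq, int budget);

        /* NAPI-style poll: while the stack polls, the device should not raise
         * notifications; they are re-armed only once the work is exhausted. */
        static void vq_poll(struct my_vq *vq, int budget)
        {
                int done;

                vq_notifications_disable(vq);
                done = vq_process_used_buffers(vq, budget);
                if (done < budget)
                        vq_notifications_enable(vq); /* re-arm, incl. any coalescing timer */
                /* if done == budget, the stack polls again; keep notifications off */
        }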





  • 90.  Re: [virtio-comment] [PATCH requirements 4/7] net-features: Add notification coalescing requirements

    Posted 06-06-2023 23:06
    On Tue, Jun 06, 2023 at 10:46:53PM +0000, Parav Pandit wrote:
    >
    >
    > > From: Michael S. Tsirkin <mst@redhat.com>
    > > Sent: Tuesday, June 6, 2023 6:37 PM
    >
    > > > # 3. Requirements
    > > > ## 3.1 Device counters
    > > > @@ -143,3 +144,10 @@ struct vnet_rx_completion { 7. The device should
    > > > be able to indicate when a full page buffer is consumed,
    > > > which can be recycled by the driver when the packets from the completed
    > > > page is fully consumed.
    > > > +
    > > > +## 3.3 Virtqueue notification coalescing re-enable support 1. Tx and
    > > > +Rx virtqueue notification coalescing should auto-disable on
    > > > + notification reporting to the driver. The driver should be able to enable
    > > > + coalescing after processing the packets per VQ. This ensures that when
    > > > + networking stack decides to poll, no new notifications are generated when
    > > > + per VQ notification coalescing is used.
    > >
    > > I don't know what does auto-disable and enable coalescing mean.
    > > Does this refer to VIRTIO_NET_F_NOTF_COAL?
    >
    > > what is the problem this is trying to solve?
    > >
    > Current driver when uses F_VQ_NOTF_COAL, and polling, it still receives the event (interrupt) from the device, because the time is always running.
    > (event is not auto disabled on generation of the event).
    > So number of interrupts are not fully controlled yet by the driver.

    could we please speak in virtio terms? you are not new in this area..
    so this is asking for a way to disable used buffer notifications?
    we actually have this with the packed vq always, and with split
    without EVENT_IDX. what is missing?

    --
    MST
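
    For reference, the existing split-ring mechanism alluded to here looks roughly as follows; this is a simplified C sketch, not the Linux implementation. Without EVENT_IDX, the driver toggles the available ring's no-interrupt flag, which is advisory only.

        #include <stdint.h>

        #define VIRTQ_AVAIL_F_NO_INTERRUPT 1 /* split-ring flag from the virtio spec */

        struct virtq_avail {
                uint16_t flags;
                uint16_t idx;
                uint16_t ring[];             /* queue-size entries follow */
        };

        /* Ask the device to skip (or resume) used buffer notifications.
         * This is a hint, so a notification may still race in; a real driver
         * also needs the appropriate memory barriers around this write. */
        static void used_notifications_set(struct virtq_avail *avail, int enable)
        {
                if (enable)
                        avail->flags &= ~VIRTQ_AVAIL_F_NO_INTERRUPT;
                else
                        avail->flags |= VIRTQ_AVAIL_F_NO_INTERRUPT;
        }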






  • 92.  [PATCH requirements 7/7] net-features: Add header data split requirements

    Posted 06-01-2023 22:04
    Add header data split requirements for the receive packets.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
     net-workstream/features-1.4.md | 13 +++++++++++++
     1 file changed, 13 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index f94848e..31680b7 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -11,6 +11,7 @@ together is desired while updating the virtio net interface.
     3. Virtqueue notification coalescing re-arming support
     4. Receive virtqueue n-tuple steering
     5. Device timestamp for tx and rx packets
    +6. Header data split for the receive virtqueue

     # 3. Requirements
     ## 3.1 Device counters
    @@ -209,3 +210,15 @@ struct vnet_rx_completion {
        point of reception from the network.
     3. The device should provide a receive packet timestamp in a single DMA
        transaction along with the rest of the receive completion fields.
    +
    +## 3.6 Header data split for the receive virtqueue
    +1. The device should be able to DMA the packet header and data to two different
    +   memory locations, this enables driver and networking stack to perform zero
    +   copy to application buffer(s).
    +2. The driver should be able to configure maximum header buffer size per
    +   virtqueue.
    +3. The header buffer to be in a physically contiguous memory per virtqueue
    +4. The device should be able to indicate header data split in the receive
    +   completion.
    +5. The device should be able to zero pad the header buffer when the received
    +   header is shorter than cpu cache line size.
    --
    2.26.2


  • 93.  Re: [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements

    Posted 06-06-2023 22:42
    On Fri, Jun 02, 2023 at 01:03:05AM +0300, Parav Pandit wrote:
    > Add header data split requirements for the receive packets.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > net-workstream/features-1.4.md | 13 +++++++++++++
    > 1 file changed, 13 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index f94848e..31680b7 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -11,6 +11,7 @@ together is desired while updating the virtio net interface.
    > 3. Virtqueue notification coalescing re-arming support
    > 4. Receive virtqueue n-tuple steering
    > 5. Device timestamp for tx and rx packets
    > +6. Header data split for the receive virtqueue
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -209,3 +210,15 @@ struct vnet_rx_completion {
    > point of reception from the network.
    > 3. The device should provide a receive packet timestamp in a single DMA
    > transaction along with the rest of the receive completion fields.
    > +
    > +## 3.6 Header data split for the receive virtqueue
    > +1. The device should be able to DMA the packet header and data to two different
    > + memory locations, this enables driver and networking stack to perform zero
    > + copy to application buffer(s).

    what is header here? which layer?

    > +2. The driver should be able to configure maximum header buffer size per
    > + virtqueue.
    > +3. The header buffer to be in a physically contiguous memory per virtqueue

    this is a requirement why?

    > +4. The device should be able to indicate header data split in the receive
    > + completion.
    > +5. The device should be able to zero pad the header buffer when the received
    > + header is shorter than cpu cache line size.

    weirdly specific.
    how does device even know the cache line?

    > --
    > 2.26.2
    >
    >






  • 95.  RE: [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements

    Posted 06-08-2023 14:57

    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Tuesday, June 6, 2023 6:42 PM


    > > +## 3.6 Header data split for the receive virtqueue 1. The device
    > > +should be able to DMA the packet header and data to two different
    > > + memory locations, this enables driver and networking stack to perform
    > zero
    > > + copy to application buffer(s).
    >
    > what is header here? which layer?
    >
    Usually L2 to L4. To be defined further.

    > > +2. The driver should be able to configure maximum header buffer size per
    > > + virtqueue.
    > > +3. The header buffer to be in a physically contiguous memory per
    > > +virtqueue
    >
    > this is a requirement why?
    >
    Because fetching individual buffer memory pointers is sub-optimal.

    > > +4. The device should be able to indicate header data split in the receive
    > > + completion.
    > > +5. The device should be able to zero pad the header buffer when the
    > received
    > > + header is shorter than cpu cache line size.
    >
    > weirdly specific.
    > how does device even know the cache line?
    >
    The driver can tell the device how much padding to do; the cache-line-specific details stay in the driver, which communicates only the padding size.
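
    To make the contiguity and padding points concrete, here is one possible shape of the per-VQ header-buffer arrangement and completion fields in C; these structures and names are purely illustrative and are not part of the proposal.

        #include <stdint.h>

        /* Hypothetical per-receive-VQ header data split configuration. A single
         * physically contiguous header area lets the device address slot i as
         * hdr_base + i * hdr_slot_size instead of fetching a per-buffer pointer,
         * which is the rationale given above for requiring contiguity. */
        struct rxq_hds_cfg {
                uint64_t hdr_base;      /* physically contiguous header area */
                uint32_t hdr_slot_size; /* maximum header bytes per packet */
                uint32_t pad_to;        /* driver-chosen pad size, e.g. its cache line */
        };

        /* Hypothetical completion fields: the device indicates whether the split
         * happened and how many header bytes (before zero padding) were written. */
        struct rx_completion_hds {
                uint16_t hdr_len;       /* valid header bytes written to the slot */
                uint16_t flags;
        #define RX_COMPLETION_F_HDS (1u << 0)
        };

        /* Driver-side address of the header slot the device wrote for packet i. */
        static inline void *hds_slot(void *hdr_area, const struct rxq_hds_cfg *cfg,
                                     uint32_t i)
        {
                return (char *)hdr_area + (uint64_t)i * cfg->hdr_slot_size;
        }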





  • 97.  Re: [virtio] [PATCH requirements 0/7] virtio net new features requirements

    Posted 06-02-2023 03:07


    On 2023/6/2 6:02, Parav Pandit wrote:
    > Hi All,
    >
    > This document captures the virtio net device requirements for the upcoming
    > release 1.4 that some of us are currently working on.
    >
    > I would like to capture if there are any other requirements linked to the
    > features captured in this document to consider in the interface design.
    >
    > The objectives are:
    > 1. to consider these requirements in introducing new features
    > listed in the document and work towards the interface design followed
    > by drafting the specification changes.
    >
    > 2. Define practical list of requirements that can be achieved in 1.4
    > timeframe incrementally and also have the ability to implement them.
    >
    > This is not a specification document. It is a pre-work document
    > that helps to understand the draft specification addressing these
    > requirements.
    >
    > Please review.
    >
    > Parav Pandit (7):
    > net-features: Add requirements document for release 1.4
    > net-features: Add low latency transmit queue requirements
    > net-features: Add low latency receive queue requirements
    > net-features: Add notification coalescing requirements
    > net-features: Add n-tuple receive flow steering requirements
    > net-features: Add packet timestamp requirements
    > net-features: Add header data split requirements

    Hi, Parav.

    This 1.4 work list is pretty cool! And I believe that these work plans
    overlap a lot with our/your needs;
    it would be interesting if we could work together to push these new
    features and make them actually available in VirtIO!

    Thanks!

    >
    > net-workstream/features-1.4.md | 224 +++++++++++++++++++++++++++++++++
    > 1 file changed, 224 insertions(+)
    > create mode 100644 net-workstream/features-1.4.md
    >






  • 99.  Re: [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements

    Posted 06-06-2023 22:50
    On Fri, Jun 02, 2023 at 01:02:58AM +0300, Parav Pandit wrote:
    > Hi All,
    >
    > This document captures the virtio net device requirements for the upcoming
    > release 1.4 that some of us are currently working on.
    >
    > I would like to capture if there are any other requirements linked to the
    > features captured in this document to consider in the interface design.
    >
    > The objectives are:
    > 1. to consider these requirements in introducing new features
    > listed in the document and work towards the interface design followed
    > by drafting the specification changes.
    >
    > 2. Define practical list of requirements that can be achieved in 1.4
    > timeframe incrementally and also have the ability to implement them.
    >
    > This is not a specification document. It is a pre-work document
    > that helps to understand the draft specification addressing these
    > requirements.


    thanks for write up! great start.

    some things look a bit more like a specific design that
    you have in mind. we need to take a step back and include
    the actual requirement too. it's ok to list a design idea
    too (you call them examples here).

    splitting different areas in separate files might be a good idea.

    some things will require virtio core extensions.

    I have a generic comment about limitations such as alignment,
    physically contiguous memory, etc. It's natural for hardware vendors
    to try and push complexity to software, but in the end this
    might just shift overhead to the driver/TCP stack.
    Please try to add motivation with each requirement.
    I pointed out some such cases as "weirdly specific".




    > Please review.
    >
    > Parav Pandit (7):
    > net-features: Add requirements document for release 1.4
    > net-features: Add low latency transmit queue requirements
    > net-features: Add low latency receive queue requirements
    > net-features: Add notification coalescing requirements
    > net-features: Add n-tuple receive flow steering requirements
    > net-features: Add packet timestamp requirements
    > net-features: Add header data split requirements
    >
    > net-workstream/features-1.4.md | 224 +++++++++++++++++++++++++++++++++
    > 1 file changed, 224 insertions(+)
    > create mode 100644 net-workstream/features-1.4.md
    >
    > --
    > 2.26.2
    >
    >






  • 101.  RE: [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements

    Posted 06-06-2023 22:57


    > From: Michael S. Tsirkin <mst@redhat.com>
    > Sent: Tuesday, June 6, 2023 6:50 PM
    >
    > thanks for write up! great start.
    >
    > some things look a bit more like a specific design that you have in mind. we
    > need to take a step back and include the actual requirement too. it's ok to list a
    > design idea too (you call them examples here).
    >
    > splitting different areas in separate files might be a good idea.
    >
    Yes, as it grows, I will likely do that, once the design part takes shape and gets extended into its own section.
    At present, splitting 10 to 20 lines into a separate file is overhead.

    > some things will require virtio core extensions.
    >
    > I have a generic comment about limitations such as alignment physically
    > contiguous etc. It's natural for hardware vendors to try and push complexity to
    > software, but in the end this might just shift overhead to driver/TCP stack.
    Almost certainly not to the TCP stack.
    Some part may shift to software.
    The overall software cost might be the same or slightly higher, but it can yield an overall system improvement.

    > Please try to add motivation with each requirement.

    > I pointed out some of such cases as "weirdly specific".
    >
    Yes. I had a rationale section, but I dropped it from the initial version, planning to add it back on a need basis and cut out the obvious.
    Will insert some of it in v1.





  • 103.  Re: [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements

    Posted 06-06-2023 23:11
    On Tue, Jun 06, 2023 at 10:56:51PM +0000, Parav Pandit wrote:
    >
    >
    > > From: Michael S. Tsirkin <mst@redhat.com>
    > > Sent: Tuesday, June 6, 2023 6:50 PM
    > >
    > > thanks for write up! great start.
    > >
    > > some things look a bit more like a specific design that you have in mind. we
    > > need to take a step back and include the actual requirement too. it's ok to list a
    > > design idea too (you call them examples here).
    > >
    > > splitting different areas in separate files might be a good idea.
    > >
    > Yes, as it grows, will likely do it, when the design part takes shape and gets extended as section.
    > At present with 10 to 20 lines in a file is overhead.
    >
    > > some things will require virtio core extensions.
    > >
    > > I have a generic comment about limitations such as alignment physically
    > > contiguous etc. It's natural for hardware vendors to try and push complexity to
    > > software, but in the end this might just shift overhead to driver/TCP stack.
    > Mostly for sure not the TCP stack.
    > Some part may be to software.
    > Overall software cost might be same or probably slightly higher but can have overall system improvement.

    Then please spend some time documenting the drawbacks for each requirement too.
    Almost anything has tradeoffs; listing them makes you actually think about
    whether things are worth it and shows others you are being objective.

    > > Please try to add motivation with each requirement.
    >
    > > I pointed out some of such cases as "weirdly specific".
    > >
    > Yes. I had the rationale section, but I dropped it in the initial version to add on need basis to cut out the obvious.
    > Will insert some of it in v1.






  • 105.  Re: [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements

    Posted 06-07-2023 02:50
    On Fri, Jun 2, 2023 at 6:03?AM Parav Pandit <parav@nvidia.com> wrote:
    >
    > Hi All,
    >
    > This document captures the virtio net device requirements for the upcoming
    > release 1.4 that some of us are currently working on.
    >
    > I would like to capture if there are any other requirements linked to the
    > features captured in this document to consider in the interface design.
    >
    > The objectives are:
    > 1. to consider these requirements in introducing new features
    > listed in the document and work towards the interface design followed
    > by drafting the specification changes.
    >
    > 2. Define practical list of requirements that can be achieved in 1.4
    > timeframe incrementally and also have the ability to implement them.
    >
    > This is not a specification document. It is a pre-work document
    > that helps to understand the draft specification addressing these
    > requirements.
    >
    > Please review.
    >
    > Parav Pandit (7):
    > net-features: Add requirements document for release 1.4
    > net-features: Add low latency transmit queue requirements
    > net-features: Add low latency receive queue requirements
    > net-features: Add notification coalescing requirements
    > net-features: Add n-tuple receive flow steering requirements
    > net-features: Add packet timestamp requirements
    > net-features: Add header data split requirements
    >
    > net-workstream/features-1.4.md | 224 +++++++++++++++++++++++++++++++++

    I feel that some patches go into too many details of the proposed
    solution, which makes this more like an RFC than a requirements document.

    It would be easier to start by describing why a feature is needed
    and why it can't work well with the current spec.

    Thanks

    > 1 file changed, 224 insertions(+)
    > create mode 100644 net-workstream/features-1.4.md
    >
    > --
    > 2.26.2
    >
    >



