OASIS Virtual I/O Device (VIRTIO) TC


[PATCH REQUIREMENTS v2 0/7] virtio net new features requirements

  • 1.  [PATCH REQUIREMENTS v2 0/7] virtio net new features requirements

    Posted 07-02-2023 23:44
    Hi All,

    This document captures the virtio net device requirements for the upcoming
    release 1.4 that some of us are currently working on. It is a living
    document, to be updated over time and worked towards for its design, which
    can result in a draft specification.

    The objectives are:
    1. To consider these requirements when introducing the new features listed
       in the document (and otherwise), and to work towards the interface
       design, followed by drafting the specification changes.
    2. To define a practical list of requirements that can be achieved
       incrementally in the 1.4 timeframe and that we have the ability to
       implement.

    Please review.

    TODO:
    1. Some more refinement needed for the rx low latency and header data split
       requirements.

    ---
    changelog:
    v1->v2:
    - major update of the receive flow filter requirements based on the last
      two design discussions in the community and offline research
    - examples added
    - link to use case and design goal added
    - control and operation side requirements split
    - more verbose
    v0->v1:
    - addressed comments from Heng Li
    - addressed a few (not all) comments from Michael
    - per patch changelog

    Parav Pandit (7):
      net-features: Add requirements document for release 1.4
      net-features: Add low latency transmit queue requirements
      net-features: Add low latency receive queue requirements
      net-features: Add notification coalescing requirements
      net-features: Add n-tuple receive flow filters requirements
      net-features: Add packet timestamp requirements
      net-features: Add header data split requirements

     net-workstream/features-1.4.md | 308 +++++++++++++++++++++++++++++++++
     1 file changed, 308 insertions(+)
     create mode 100644 net-workstream/features-1.4.md

    --
    2.26.2


  • 2.  [PATCH REQUIREMENTS v2 1/7] net-features: Add requirements document for release 1.4

    Posted 07-02-2023 23:44
    Add requirements document template for the virtio net features.

    Add virtio net device counters visible to the driver.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    changelog:
    v0->v1:
    - removed tx dropped counter
    - updated requirements to mention the virtqueue interface for counters query
    ---
     net-workstream/features-1.4.md | 35 ++++++++++++++++++++++++++++++++++
     1 file changed, 35 insertions(+)
     create mode 100644 net-workstream/features-1.4.md

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    new file mode 100644
    index 0000000..4c3797b
    --- /dev/null
    +++ b/net-workstream/features-1.4.md
    @@ -0,0 +1,35 @@
    +# 1. Introduction
    +
    +This document describes the overall requirements for virtio net device
    +improvements for upcoming release 1.4. Some of these requirements are
    +interrelated and influence the interface design, hence reviewing them
    +together is desired while updating the virtio net interface.
    +
    +# 2. Summary
    +1. Device counters visible to the driver
    +
    +# 3. Requirements
    +## 3.1 Device counters
    +1. The driver should be able to query the device and/or per-vq counters for
    +   debugging purposes using a virtqueue directly from driver to device, for
    +   example using a control vq.
    +2. The driver should be able to query which counters are supported using a
    +   virtqueue command, for example using an existing control vq.
    +3. If this device is migrated between two hosts, the driver should be able
    +   to get the counter values in the destination host from where they were
    +   left off in the source host.
    +4. If a virtio device is a group member device, a group owner should be able
    +   to query all the counter attributes using the administration command which
    +   a virtio member device will expose via a virtqueue to the driver.
    +
    +### 3.1.1 Per receive queue counters
    +1. le64 rx_oversize_pkt_errors: Packets dropped due to the received packet
    +   being larger than the buffer size
    +2. le64 rx_no_buffer_pkt_errors: Packets dropped due to unavailability of a
    +   buffer in the receive queue
    +3. le64 rx_gro_pkts: Packets treated as a receive GSO sequence by the device
    +4. le64 rx_pkts: Total packets received by the device
    +
    +### 3.1.2 Per transmit queue counters
    +1. le64 tx_gso_pkts: Packets sent as a transmit GSO sequence
    +2. le64 tx_pkts: Total packets sent by the device
    --
    2.26.2


  • 3.  Re: [virtio] [PATCH REQUIREMENTS v2 1/7] net-features: Add requirements document for release 1.4

    Posted 07-20-2023 02:11
    On Mon, 3 Jul 2023 02:44:04 +0300, Parav Pandit <parav@nvidia.com> wrote:
    > Add requirements document template for the virtio net features.
    >
    > Add virtio net device counters visible to driver.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > changelog:
    > v0->v1:
    > - removed tx dropped counter
    > - updated requirements to mention about virtqueue interface for counters
    > query
    > ---
    > net-workstream/features-1.4.md | 35 ++++++++++++++++++++++++++++++++++
    > 1 file changed, 35 insertions(+)
    > create mode 100644 net-workstream/features-1.4.md
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > new file mode 100644
    > index 0000000..4c3797b
    > --- /dev/null
    > +++ b/net-workstream/features-1.4.md
    > @@ -0,0 +1,35 @@
    > +# 1. Introduction
    > +
    > +This document describes the overall requirements for virtio net device
    > +improvements for upcoming release 1.4. Some of these requirements are
    > +interrelated and influence the interface design, hence reviewing them
    > +together is desired while updating the virtio net interface.
    > +
    > +# 2. Summary
    > +1. Device counters visible to the driver
    > +
    > +# 3. Requirements
    > +## 3.1 Device counters
    > +1. The driver should be able to query the device and/or per vq counters for
    > + debugging purpose using a virtqueue directly from driver to device for
    > + example using a control vq.
    > +2. The driver should be able to query which counters are supported using a
    > + virtqueue command, for example using an existing control vq.
    > +3. If this device is migrated between two hosts, the driver should be able
    > + get the counter values in the destination host from where it was left
    > + off in the source host.
    > +4. If a virtio device is group member device, a group owner should be able
    > + to query all the counter attributes using the administration command which
    > + a virtio member device will expose via a virtqueue to the driver.
    > +
    > +### 3.1.1 Per receive queue counters
    > +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
    > + oversize than the buffer size
    > +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
    > + buffer in the receive queue
    > +3. le64 rx_gro_pkts: Packets treated as receive GSO sequence by the device
    > +4. le64 rx_pkts: Total packets received by the device
    > +
    > +### 3.1.2 Per transmit queue counters
    > +1. le64 tx_gso_pkts: Packets send as transmit GSO sequence
    > +2. le64 tx_pkts: Total packets send by the device


    Hi

    About the device counter, I will continue to push this patch, can you review it.

    http://lore.kernel.org/all/20220315032402.6088-1-xuanzhuo@linux.alibaba.com

    Thanks


    > --
    > 2.26.2
    >
    >
    > ---------------------------------------------------------------------
    > To unsubscribe from this mail list, you must leave the OASIS TC that
    > generates this mail. Follow this link to all your TCs in OASIS at:
    > https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
    >




  • 5.  RE: [virtio] [PATCH REQUIREMENTS v2 1/7] net-features: Add requirements document for release 1.4

    Posted 07-20-2023 02:16


    > From: virtio@lists.oasis-open.org <virtio@lists.oasis-open.org> On Behalf Of
    > Xuan Zhuo
    > Sent: Wednesday, July 19, 2023 10:11 PM
    >
    > On Mon, 3 Jul 2023 02:44:04 +0300, Parav Pandit <parav@nvidia.com> wrote:
    > > Add requirements document template for the virtio net features.
    > >
    > > Add virtio net device counters visible to driver.
    > >
    > > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > > ---
    > > changelog:
    > > v0->v1:
    > > - removed tx dropped counter
    > > - updated requirements to mention about virtqueue interface for counters
    > > query
    > > ---
    > > net-workstream/features-1.4.md | 35
    > > ++++++++++++++++++++++++++++++++++
    > > 1 file changed, 35 insertions(+)
    > > create mode 100644 net-workstream/features-1.4.md
    > >
    > > diff --git a/net-workstream/features-1.4.md
    > > b/net-workstream/features-1.4.md new file mode 100644 index
    > > 0000000..4c3797b
    > > --- /dev/null
    > > +++ b/net-workstream/features-1.4.md
    > > @@ -0,0 +1,35 @@
    > > +# 1. Introduction
    > > +
    > > +This document describes the overall requirements for virtio net
    > > +device improvements for upcoming release 1.4. Some of these
    > > +requirements are interrelated and influence the interface design,
    > > +hence reviewing them together is desired while updating the virtio net
    > interface.
    > > +
    > > +# 2. Summary
    > > +1. Device counters visible to the driver
    > > +
    > > +# 3. Requirements
    > > +## 3.1 Device counters
    > > +1. The driver should be able to query the device and/or per vq counters for
    > > + debugging purpose using a virtqueue directly from driver to device for
    > > + example using a control vq.
    > > +2. The driver should be able to query which counters are supported using a
    > > + virtqueue command, for example using an existing control vq.
    > > +3. If this device is migrated between two hosts, the driver should be able
    > > + get the counter values in the destination host from where it was left
    > > + off in the source host.
    > > +4. If a virtio device is group member device, a group owner should be able
    > > + to query all the counter attributes using the administration command
    > which
    > > + a virtio member device will expose via a virtqueue to the driver.
    > > +
    > > +### 3.1.1 Per receive queue counters
    > > +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
    > > + oversize than the buffer size
    > > +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
    > > + buffer in the receive queue
    > > +3. le64 rx_gro_pkts: Packets treated as receive GSO sequence by the
    > > +device 4. le64 rx_pkts: Total packets received by the device
    > > +
    > > +### 3.1.2 Per transmit queue counters 1. le64 tx_gso_pkts: Packets
    > > +send as transmit GSO sequence 2. le64 tx_pkts: Total packets send by
    > > +the device
    >
    >
    > Hi
    >
    > About the device counter, I will continue to push this patch, can you review it.
    >
    > http://lore.kernel.org/all/20220315032402.6088-1-
    > xuanzhuo@linux.alibaba.com
    Yes. I had trouble finding time to work on it this week,
    but I want to review it later in the week.

    From a quick glance at the patch:
    we need a bitmap to indicate which counters are supported by the device.
    Reporting zero from the driver for an unsupported counter is misleading to
    the user, so we want to avoid showing counters in ethtool that the device
    does not support.





  • 7.  [PATCH REQUIREMENTS v2 3/7] net-features: Add low latency receive queue requirements

    Posted 07-02-2023 23:45
    Add requirements for the low latency receive queue.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    changelog:
    v0->v1:
    - clarified the requirements further
    - added line for the gro case
    - added design goals as the motivation for the requirements
    ---
     net-workstream/features-1.4.md | 45 +++++++++++++++++++++++++++++++++-
     1 file changed, 44 insertions(+), 1 deletion(-)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index 0c3202c..3e8b5a4 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.

     # 2. Summary
     1. Device counters visible to the driver
    -2. Low latency tx virtqueue for PCI transport
    +2. Low latency tx and rx virtqueues for PCI transport

     # 3. Requirements
     ## 3.1 Device counters
    @@ -114,3 +114,46 @@ struct vnet_data_desc desc[2];

     7. Ability to place all transmit completions together with the per packet
        stream transmit timestamp using a single PCIe transaction.
    +
    +### 3.2.2 Low latency rx virtqueue
    +0. Design goal:
    +   a. Keep packet metadata and buffer data, which are consumed by the driver
    +      layer, together and make them available in a single CPU cache line
    +   b. Instead of having per-packet descriptors, which are complex for the
    +      device to scale, supply the page directly to the device to consume
    +      based on packet size
    +1. The device should be able to write a packet receive completion that
    +   consists of struct virtio_net_hdr (or similar) and a buffer id using a
    +   single DMA write PCIe TLP.
    +2. The device should be able to perform DMA writes of multiple packet
    +   completions in a single DMA transaction, up to the PCIe maximum write
    +   limit in a transaction.
    +3. The device should be able to zero pad the packet completion write to
    +   align it to 64B or the CPU cache line size whenever possible.
    +4. An example of the above DMA completion structure:
    +
    +```
    +/* Constant size receive packet completion */
    +struct vnet_rx_completion {
    +   u16 flags;
    +   u16 id; /* buffer id */
    +   u8 gso_type;
    +   u8 reserved[3];
    +   le16 gso_hdr_len;
    +   le16 gso_size;
    +   le16 csum_start;
    +   le16 csum_offset;
    +   u16 reserved2;
    +   u64 timestamp; /* explained later */
    +   u8 padding[];
    +};
    +```
    +5. The driver should be able to post constant-size buffer pages on a receive
    +   queue which can be consumed by the device for an incoming packet of any
    +   size from 64B to 9KB.
    +6. The device should be able to know the constant buffer size at the receive
    +   virtqueue level instead of per buffer.
    +7. The device should be able to indicate when a full page buffer is consumed,
    +   which can be recycled by the driver once the packets from the completed
    +   page are fully consumed.
    +8. The device should be able to consume multiple pages for a receive GSO
    +   stream.
    --
    2.26.2


  • 8.  [PATCH REQUIREMENTS v2 2/7] net-features: Add low latency transmit queue requirements

    Posted 07-02-2023 23:45
    Add requirements for the low latency transmit queue.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    changelog:
    v0->v1:
    - added design goals for which requirements are added
    ---
     net-workstream/features-1.4.md | 81 ++++++++++++++++++++++++++++++++++
     1 file changed, 81 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index 4c3797b..0c3202c 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -7,6 +7,7 @@ together is desired while updating the virtio net interface.

     # 2. Summary
     1. Device counters visible to the driver
    +2. Low latency tx virtqueue for PCI transport

     # 3. Requirements
     ## 3.1 Device counters
    @@ -33,3 +34,83 @@ together is desired while updating the virtio net interface.
     ### 3.1.2 Per transmit queue counters
     1. le64 tx_gso_pkts: Packets sent as a transmit GSO sequence
     2. le64 tx_pkts: Total packets sent by the device
    +
    +## 3.2 Low PCI latency virtqueues
    +### 3.2.1 Low PCI latency tx virtqueue
    +0. Design goal:
    +   a. Reduce PCI access latency in the packet transmit flow
    +   b. Avoid an O(N) descriptor parse to detect a packet stream, to simplify
    +      device logic
    +   c. Reduce the number of PCI transmit completion transactions and have a
    +      unified completion flow with/without transmit timestamping
    +   d. Avoid partial cache line writes on transmit completions
    +
    +1. The packet transmit descriptor should contain the data descriptor count
    +   without any indirection and without any O(N) search to find the end of a
    +   packet stream. For example, a packet transmit descriptor (called
    +   vnet_tx_hdr_desc subsequently) contains a field num_next_desc for the
    +   packet stream, indicating that a packet is located in N data descriptors.
    +
    +2. The packet transmit descriptor should contain segmentation offload-related
    +   fields without any indirection. For example, the packet transmit
    +   descriptor contains gso_type, gso_size/mss, header length, csum placement
    +   byte offset, and csum start.
    +
    +3. The packet transmit descriptor should be able to place a small packet
    +   that does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue
    +   memory. For example, a TCP ack-only packet can fit in a descriptor's
    +   memory, which otherwise consumes more than 25% of its size in metadata to
    +   describe the packet.
    +
    +4. The packet transmit descriptor should be able to place a full GSO header
    +   (L2 to L4) after the header descriptor and before the data descriptors.
    +   For example, the GSO header is placed after struct vnet_tx_hdr_desc in
    +   the virtqueue memory. When such a GSO header is positioned adjacent to
    +   the packet transmit descriptor, and when the GSO header is not aligned to
    +   16B, the following data descriptor starts on the next 8B-aligned boundary.
    +
    +5. An example of the above requirements at a high level:
    +
    +```
    +struct virtio_packed_q_desc {
    +   /* current desc for reference */
    +   u64 address;
    +   u32 len;
    +   u16 id;
    +   u16 flags;
    +};
    +
    +/* Constant size header descriptor for tx packets */
    +struct vnet_tx_hdr_desc {
    +   u16 flags; /* indicate how to parse next fields */
    +   u16 id; /* desc id to come back in completion */
    +   u8 num_next_desc; /* number of 16B data descs that follow for this
    +                      * buffer
    +                      */
    +   u8 gso_type;
    +   le16 gso_hdr_len;
    +   le16 gso_size;
    +   le16 csum_start;
    +   le16 csum_offset;
    +   u8 inline_pkt_len; /* length of the inline packet after this desc */
    +   u8 reserved;
    +   u8 padding[];
    +};
    +
    +/* Example of a short packet or GSO header placed in the desc section of
    + * the vq
    + */
    +struct vnet_tx_small_pkt_desc {
    +   u8 raw_pkt[128];
    +};
    +
    +/* Example of header followed by data descriptors */
    +struct vnet_tx_hdr_desc hdr_desc;
    +struct vnet_data_desc desc[2];
    +
    +```
    +6. Ability to zero pad the transmit completion when it is shorter than the
    +   CPU cache line size.
    +
    +7. Ability to place all transmit completions together with the per packet
    +   stream transmit timestamp using a single PCIe transaction.
    --
    2.26.2


  • 9.  Re: [virtio] [PATCH REQUIREMENTS v2 2/7] net-features: Add low latency transmit queue requirements

    Posted 07-20-2023 15:28
    On Mon, Jul 03, 2023 at 02:44:05AM +0300, Parav Pandit wrote:
    > Add requirements for the low latency transmit queue.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > chagelog:
    > v0->v1:
    > - added design goals for which requirements are added
    > ---
    > net-workstream/features-1.4.md | 81 ++++++++++++++++++++++++++++++++++
    > 1 file changed, 81 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index 4c3797b..0c3202c 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -7,6 +7,7 @@ together is desired while updating the virtio net interface.
    >
    > # 2. Summary
    > 1. Device counters visible to the driver
    > +2. Low latency tx virtqueue for PCI transport
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -33,3 +34,83 @@ together is desired while updating the virtio net interface.
    > ### 3.1.2 Per transmit queue counters
    > 1. le64 tx_gso_pkts: Packets send as transmit GSO sequence
    > 2. le64 tx_pkts: Total packets send by the device
    > +
    > +## 3.2 Low PCI latency virtqueues
    > +### 3.2.1 Low PCI latency tx virtqueue
    > +0. Design goal
    > + a. Reduce PCI access latency in packet transmit flow
    > + b. Avoid O(N) descriptor parser to detect a packet stream to simplify device
    > + logic
    > + c. Reduce number of PCI transmit completion transactions and have unified
    > + completion flow with/without transmit timestamping
    > + d. Avoid partial cache line writes on transmit completions
    > +
    > +1. Packet transmit descriptor should contain data descriptors count without any
    > + indirection and without any O(N) search to find the end of a packet stream.
    > + For example, a packet transmit descriptor (called vnet_tx_hdr_desc
    > + subsequently) to contain a field num_next_desc for the packet stream
    > + indicating that a packet is located N data descriptors.
    > +
    > +2. Packet transmit descriptor should contain segmentation offload-related fields
    > + without any indirection. For example, packet transmit descriptor to contain
    > + gso_type, gso_size/mss, header length, csum placement byte offset, and
    > + csum start.
    > +
    > +3. Packet transmit descriptor should be able to place a small size packet that
    > + does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue memory.

    Please make this a generic virtqueue-level feature. It sounds like the
    idea is to vary the vring descriptor length per device type and per
    virtqueue so that headers and small payloads can be embedded directly
    into the vring.

    > + For example a TCP ack only packet can fit in a descriptor memory which
    > + otherwise consume more than 25% of metadata to describe the packet.
    > +
    > +4. Packet transmit descriptor should be able to place a full GSO header (L2 to
    > + L4) after header descriptor and before data descriptors. For example, the
    > + GSO header is placed after struct vnet_tx_hdr_desc in the virtqueue memory.
    > + When such a GSO header is positioned adjacent to the packet transmit
    > + descriptor, and when the GSO header is not aligned to 16B, the following
    > + data descriptor to start on the 8B aligned boundary.
    > +
    > +5. An example of the above requirements at high level is:
    > +
    > +```
    > +struct vitio_packed_q_desc {
    > + /* current desc for reference */
    > + u64 address;
    > + u32 len;
    > + u16 id;
    > + u16 flags;
    > +};
    > +
    > +/* Constant size header descriptor for tx packets */
    > +struct vnet_tx_hdr_desc {
    > + u16 flags; /* indicate how to parse next fields */
    > + u16 id; /* desc id to come back in completion */
    > + u8 num_next_desc; /* indicates the number of the next 16B data desc for this
    > + * buffer.
    > + */
    > + u8 gso_type;
    > + le16 gso_hdr_len;
    > + le16 gso_size;
    > + le16 csum_start;
    > + le16 csum_offset;
    > + u8 inline_pkt_len; /* indicates the length of the inline packet after this
    > + * desc
    > + */
    > + u8 reserved;
    > + u8 padding[];
    > +};
    > +
    > +/* Example of a short packet or GSO header placed in the desc section of the vq
    > + */
    > +struct vnet_tx_small_pkt_desc {
    > + u8 raw_pkt[128];
    > +};
    > +
    > +/* Example of header followed by data descriptor */
    > +struct vnet_tx_hdr_desc hdr_desc;
    > +struct vnet_data_desc desc[2];
    > +
    > +```
    > +6. Ability to zero pad the transmit completion when the transmit completion is
    > + shorter than the CPU cache line size.
    > +
    > +7. Ability to place all transmit completion together with it per packet stream
    > + transmit timestamp using single PCIe transcation.
    > --
    > 2.26.2
    >
    >



  • 10.  Re: [virtio] [PATCH REQUIREMENTS v2 2/7] net-features: Add low latency transmit queue requirements

    Posted 07-20-2023 15:29
    On Mon, Jul 03, 2023 at 02:44:05AM +0300, Parav Pandit wrote:
    > Add requirements for the low latency transmit queue.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > changelog:
    > v0->v1:
    > - added design goals for which requirements are added
    > ---
    > net-workstream/features-1.4.md | 81 ++++++++++++++++++++++++++++++++++
    > 1 file changed, 81 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index 4c3797b..0c3202c 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -7,6 +7,7 @@ together is desired while updating the virtio net interface.
    >
    >  # 2. Summary
    >  1. Device counters visible to the driver
    > +2. Low latency tx virtqueue for PCI transport
    >
    >  # 3. Requirements
    >  ## 3.1 Device counters
    > @@ -33,3 +34,83 @@ together is desired while updating the virtio net interface.
    >  ### 3.1.2 Per transmit queue counters
    >  1. le64 tx_gso_pkts: Packets sent as a transmit GSO sequence
    >  2. le64 tx_pkts: Total packets sent by the device
    > +
    > +## 3.2 Low PCI latency virtqueues
    > +### 3.2.1 Low PCI latency tx virtqueue
    > +0. Design goal
    > +   a. Reduce PCI access latency in the packet transmit flow
    > +   b. Avoid an O(N) descriptor parse to detect a packet stream, to simplify
    > +      device logic
    > +   c. Reduce the number of PCI transmit completion transactions and have a
    > +      unified completion flow with/without transmit timestamping
    > +   d. Avoid partial cache line writes on transmit completions
    > +
    > +1. The packet transmit descriptor should contain the data descriptor count
    > +   without any indirection and without any O(N) search to find the end of a
    > +   packet stream. For example, a packet transmit descriptor (called
    > +   vnet_tx_hdr_desc subsequently) contains a field num_next_desc indicating
    > +   that the packet spans N data descriptors.
    > +
    > +2. The packet transmit descriptor should contain segmentation offload-related
    > +   fields without any indirection. For example, the packet transmit
    > +   descriptor contains gso_type, gso_size/mss, header length, csum placement
    > +   byte offset, and csum start.
    > +
    > +3. The packet transmit descriptor should be able to place a small packet that
    > +   does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue
    > +   memory.

    Please make this a generic virtqueue-level feature. It sounds like the idea
    is to vary the vring descriptor length per device type and per virtqueue so
    that headers and small payloads can be embedded directly into the vring.

    > +   For example, a TCP ack-only packet can fit in the descriptor memory,
    > +   which otherwise consumes more than 25% of metadata to describe the packet.
    > +
    > +4. The packet transmit descriptor should be able to place a full GSO header
    > +   (L2 to L4) after the header descriptor and before the data descriptors.
    > +   For example, the GSO header is placed after struct vnet_tx_hdr_desc in
    > +   the virtqueue memory. When such a GSO header is positioned adjacent to
    > +   the packet transmit descriptor, and when the GSO header is not aligned to
    > +   16B, the following data descriptor starts on an 8B-aligned boundary.
    > +
    > +5. An example of the above requirements at a high level is:
    > +
    > +```
    > +struct virtio_packed_q_desc {
    > +    /* current desc for reference */
    > +    u64 address;
    > +    u32 len;
    > +    u16 id;
    > +    u16 flags;
    > +};
    > +
    > +/* Constant size header descriptor for tx packets */
    > +struct vnet_tx_hdr_desc {
    > +    u16 flags;         /* indicate how to parse next fields */
    > +    u16 id;            /* desc id to come back in completion */
    > +    u8 num_next_desc;  /* indicates the number of the next 16B data descs
    > +                        * for this buffer
    > +                        */
    > +    u8 gso_type;
    > +    le16 gso_hdr_len;
    > +    le16 gso_size;
    > +    le16 csum_start;
    > +    le16 csum_offset;
    > +    u8 inline_pkt_len; /* indicates the length of the inline packet after
    > +                        * this desc
    > +                        */
    > +    u8 reserved;
    > +    u8 padding[];
    > +};
    > +
    > +/* Example of a short packet or GSO header placed in the desc section of
    > + * the vq
    > + */
    > +struct vnet_tx_small_pkt_desc {
    > +    u8 raw_pkt[128];
    > +};
    > +
    > +/* Example of a header followed by data descriptors */
    > +struct vnet_tx_hdr_desc hdr_desc;
    > +struct vnet_data_desc desc[2];
    > +
    > +```
    > +6. Ability to zero pad the transmit completion when the transmit completion
    > +   is shorter than the CPU cache line size.
    > +
    > +7. Ability to place the entire transmit completion together with its
    > +   per-packet-stream transmit timestamp using a single PCIe transaction.
    > --
    > 2.26.2
    >
    > ---------------------------------------------------------------------
    > To unsubscribe from this mail list, you must leave the OASIS TC that
    > generates this mail. Follow this link to all your TCs in OASIS at:
    > https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php


  • 11.  RE: [virtio] [PATCH REQUIREMENTS v2 2/7] net-features: Add low latency transmit queue requirements

    Posted 07-23-2023 21:32


    > From: Stefan Hajnoczi <stefanha@redhat.com>
    > Sent: Thursday, July 20, 2023 11:28 AM

    > > +3. Packet transmit descriptor should be able to place a small size packet that
    > > + does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue
    > memory.
    >
    > Please make this a generic virtqueue-level feature. It sounds like the idea is to
    > vary the vring descriptor length per device type and per virtqueue so that
    > headers and small payloads can be embedded directly into the vring.
    >
    First sorry for the delayed response. I am still recovering from illness and unable to work consistently.

    Yes, it should be possible by inlining the header.
    Even within the same device type this is needed, for example for flow filters.

    It largely applies to requester-side queues like the net txq, net flow filter q, cvq, and blk/scsi req q.

    Thanks for the input. Will update this requirement in v3.




  • 13.  [PATCH REQUIREMENTS v2 6/7] net-features: Add packet timestamp requirements

    Posted 07-02-2023 23:45
    Add tx and rx packet timestamp requirements.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    net-workstream/features-1.4.md | 26 ++++++++++++++++++++++++++
    1 file changed, 26 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index ae40ee8..466e4db 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -10,6 +10,7 @@ together is desired while updating the virtio net interface.
     2. Low latency tx and rx virtqueues for PCI transport
     3. Virtqueue notification coalescing re-arming support
     4. Virtqueue receive flow filters (RFF)
    +5. Device timestamp for tx and rx packets

     # 3. Requirements
     ## 3.1 Device counters
    @@ -267,3 +268,28 @@ struct virtio_net_rff_delete {
         u8 padding[2];
         le32 flow_id;
     };
    +
    +## 3.5 Packet timestamp
    +1. The device should provide the transmit and receive timestamps of packets
    +   at a per-packet level when the device is enabled for it.
    +2. The device should provide the current free-running clock with the least
    +   latency possible, using a 64-bit MMIO register read, to have the least
    +   jitter.
    +3. The device should provide the current frequency and the frequency unit so
    +   that software can synchronize the reference points of the software and
    +   the device using a control vq command.
    +
    +### 3.5.1 Transmit timestamp
    +1. The transmit completion must contain a packet transmission timestamp when
    +   the device is enabled for it.
    +2. The device should record the packet transmit timestamp in the completion
    +   at the farthest egress point towards the network.
    +3. The device must provide the transmit packet timestamp in a single DMA
    +   transaction along with the rest of the transmit completion fields.
    +
    +### 3.5.2 Receive timestamp
    +1. The receive completion must contain a packet reception timestamp when the
    +   device is enabled for it.
    +2. The device should record the received packet timestamp at the closest
    +   ingress point of reception from the network.
    +3. The device should provide the receive packet timestamp in a single DMA
    +   transaction along with the rest of the receive completion fields.
    --
    2.26.2


  • 14.  [PATCH REQUIREMENTS v2 4/7] net-features: Add notification coalescing requirements

    Posted 07-02-2023 23:45
    Add virtio net device notification coalescing improvements requirements.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    changelog:
    v0->v1:
    - updated the description
    ---
    net-workstream/features-1.4.md | 12 ++++++++++++
    1 file changed, 12 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index 3e8b5a4..a34556c 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -8,6 +8,7 @@ together is desired while updating the virtio net interface.
     # 2. Summary
     1. Device counters visible to the driver
     2. Low latency tx and rx virtqueues for PCI transport
    +3. Virtqueue notification coalescing re-arming support

     # 3. Requirements
     ## 3.1 Device counters
    @@ -157,3 +158,14 @@ struct vnet_rx_completion {
        which can be recycled by the driver when the packets from the completed
        page is fully consumed.
     8. The device should be able to consume multiple pages for a receive GSO stream.
    +
    +## 3.3 Virtqueue notification coalescing re-enable support
    +0. Design goal:
    +   a. Avoid constant notification from the device even in conditions when
    +      the driver may not have acted on the previous pending notification.
    +1. When Tx and Rx virtqueue notification coalescing is enabled, and when such
    +   notification is reported by the device, device should be able to disable
    +   further notifications until the driver finish reacting to the already
    +   generated notification.
    +2. When the driver enables the notification coalescing reporting, the device
    +   to notify again if notification coalescing conditions are met.
    --
    2.26.2


  • 15.  Re: [virtio] [PATCH REQUIREMENTS v2 4/7] net-features: Add notification coalescing requirements

    Posted 07-20-2023 15:19
    On Mon, Jul 03, 2023 at 02:44:07AM +0300, Parav Pandit wrote:
    > Add virtio net device notification coalescing improvements requirements.
    >
    > Signed-off-by: Parav Pandit <parav@nvidia.com>
    > ---
    > changelog:
    > v0->v1:
    > - updated the description
    > ---
    > net-workstream/features-1.4.md | 12 ++++++++++++
    > 1 file changed, 12 insertions(+)
    >
    > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    > index 3e8b5a4..a34556c 100644
    > --- a/net-workstream/features-1.4.md
    > +++ b/net-workstream/features-1.4.md
    > @@ -8,6 +8,7 @@ together is desired while updating the virtio net interface.
    > # 2. Summary
    > 1. Device counters visible to the driver
    > 2. Low latency tx and rx virtqueues for PCI transport
    > +3. Virtqueue notification coalescing re-arming support
    >
    > # 3. Requirements
    > ## 3.1 Device counters
    > @@ -157,3 +158,14 @@ struct vnet_rx_completion {
    > which can be recycled by the driver when the packets from the completed
    > page is fully consumed.
    > 8. The device should be able to consume multiple pages for a receive GSO stream.
    > +
    > +## 3.3 Virtqueue notification coalescing re-enable support

    It's called "re-arming" above but "re-enable" here. Please choose one
    term and use it consistently.

    > +0. Design goal:
    > + a. Avoid constant notification from the device even in conditions when
    > + the driver may not have acted on the previous pending notification.
    > +1. When Tx and Rx virtqueue notification coalescing is enabled, and when such
    > + notification is reported by the device, device should be able to disable

    "notification" -> "a notification"

    ", device" -> ", the device"

    > + further notifications until the driver finish reacting to the already

    s/finish/finishes/

    > + generated notification.
    > +2. When the driver enables the notification coalescing reporting, the device

    "enables the notification coalescing reporting" ->
    "enables notification coalescing reporting"

    > + to notify again if notification coalescing conditions are met.

    I can't parse this sentence. Maybe "the device _has_ to notify again
    ..."?

    I find this text hard to understand. Is this a mechanism where the
    device does not send further notifications on a virtqueue until the
    driver has re-armed them?

    How does this relate to EVENT_IDX, which can be used to achieve a
    similar effect? I guess the downside to EVENT_IDX is that the device
    must DMA repeatedly in order to detect changes from driver, whereas this
    new re-arming mechanism involves a hardware register write?

    Can this new mechanism be generic for any kind of virtqueue, not just
    virtio-net rx/tx?

    > --
    > 2.26.2




  • 17.  RE: [virtio] [PATCH REQUIREMENTS v2 4/7] net-features: Add notification coalescing requirements

    Posted 07-23-2023 21:40


    > From: Stefan Hajnoczi <stefanha@redhat.com>
    > Sent: Thursday, July 20, 2023 11:19 AM
    > > +## 3.3 Virtqueue notification coalescing re-enable support
    >
    > It's called "re-arming" above but "re-enable" here. Please choose one term and
    > use it consistently.
    >
    Will change to re-arming, as "enablement" can easily be confused with the queue_enable and reset registers of PCI.


    > > +0. Design goal:
    > > + a. Avoid constant notification from the device even in conditions when
    > > + the driver may not have acted on the previous pending notification.
    > > +1. When Tx and Rx virtqueue notification coalescing is enabled, and when
    > such
    > > + notification is reported by the device, device should be able to
    > > +disable
    >
    > "notification" -> "a notification"
    >
    > ", device" -> ", the device"
    >
    > > + further notifications until the driver finish reacting to the
    > > + already
    >
    > s/finish/finishes/
    >
    > > + generated notification.
    > > +2. When the driver enables the notification coalescing reporting, the
    > > +device
    >
    > "enables the notification coalescing reporting" -> "enables notification
    > coalescing reporting"
    >
    > > + to notify again if notification coalescing conditions are met.
    >
    > I can't parse this sentence. Maybe "the device _has_ to notify again ..."?
    >
    > I find this text hard to understand. Is this a mechanism where the device does
    > not send further notifications on a virtqueue until the driver has re-armed
    > them?
    >
    Right.
    I will rephrase it.

    > How does this relate to EVENT_IDX, which can be used to achieve a similar
    > effect? I guess the downside to EVENT_IDX is that the device must DMA
    > repeatedly in order to detect changes from driver, whereas this new re-arming
    > mechanism involves a hardware register write?
    >
    Correct. It is similar to EVENT_IDX, but without the device polling a large number of unique addresses over PCI.

    > Can this new mechanism be generic for any kind of virtqueue, not just virtio-net
    > rx/tx?
    Maybe we can. I haven't found wider usage for it yet.
    One needs to create a generic interface first to configure such things.
    It was an implicit requirement, so I didn't write it as specific to this feature.

    Config space was one way, but we discovered, debated, and agreed that instead of config space we should have a DMA interface to exchange such info between the driver and the device.
    It is now due for many features.
    Since receive flow filters have such a requirement, I suggest we start with a generic interface there and expand it to more features like this one.

    I will revise v3 to capture your inputs further.





  • 19.  [PATCH REQUIREMENTS v2 5/7] net-features: Add n-tuple receive flow filters requirements

    Posted 07-02-2023 23:45
    Add virtio net device requirements for receive flow filters.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    changelog:
    v1->v2:
    - split setup and operations requirements
    - added design goal
    - worded requirements more precisely
    v0->v1:
    - fixed comments from Heng Li
    - renamed receive flow steering to receive flow filters
    - clarified byte offset in match criteria
    ---
    net-workstream/features-1.4.md | 98 ++++++++++++++++++++++++++++++++++
    1 file changed, 98 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index a34556c..ae40ee8 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
     1. Device counters visible to the driver
     2. Low latency tx and rx virtqueues for PCI transport
     3. Virtqueue notification coalescing re-arming support
    +4. Virtqueue receive flow filters (RFF)

     # 3. Requirements
     ## 3.1 Device counters
    @@ -169,3 +170,100 @@ struct vnet_rx_completion {
        generated notification.
     2. When the driver enables the notification coalescing reporting, the device
        to notify again if notification coalescing conditions are met.
    +
    +## 3.4 Virtqueue receive flow filters (RFF)
    +0. Design goal:
    +   To filter and/or steer packets based on a specific pattern match to a
    +   specific context, to support application/networking-stack-driven receive
    +   processing.
    +1. Two use cases are: to support the Linux netdev set_rxnfc() for
    +   ETHTOOL_SRXCLSRLINS, and to support the netdev feature NETIF_F_NTUPLE,
    +   aka ARFS.
    +
    +### 3.4.1 control path
    +1. The number of flow filter operations/sec can range from 100k/sec to
    +   1M/sec or even more. Hence flow filter operations must be done over a
    +   queueing interface using one or more queues.
    +2. The device should be able to expose the supported flow filter queue count
    +   and the start vq index to the driver.
    +3. High rates of flow filter operations are expected, which the device can
    +   implement in a hardware-accelerated manner. Hence, transporting them over
    +   a flow vq distinct from the control vq is needed.
    +4. The driver should be able to query such capabilities over a DMA interface
    +   to save device memory in the device config space region.
    +5. Since flow filters are enabled much later in the driver life cycle, the
    +   driver will likely create these queues when flow filters are enabled.
    +6. The device should be able to expose whether it supports filter masks.
    +7. The driver may want to have priority among groups of flow rules; to
    +   facilitate this, the device supports grouping flow filter rules by a
    +   notion of a group. Each group defines the priority in flow processing.
    +
    +### 3.4.2 flow operations path
    +1. The driver should be able to define a receive packet match criteria, an
    +   action, and a destination for a packet. For example, an ipv4 packet with
    +   a multicast address is steered to receive vq 0. A second example is an
    +   ipv4, tcp packet matching a specified IP address and tcp port tuple,
    +   steered to receive vq 10.
    +2. The match criteria should include well-defined exact tuple fields such as
    +   mac address, IP addresses, tcp/udp ports, etc.
    +3. The match criteria should also optionally include a field mask.
    +4. The match criteria may optionally also include a specific packet byte
    +   offset, match length, mask, and matching pattern instead of RFC-defined
    +   fields.
    +5. Actions include (a) dropping or (b) forwarding the packet.
    +6. The destination is a receive virtqueue index or an rss context.
    +7. The device should process RFF rules before RSS rules, i.e., when there is
    +   a miss on an RFF rule, the RSS rule applies if an RSS configuration
    +   exists.
    +8. If multiple programmed rules have overlapping attributes for a received
    +   packet, the driver defines the location/priority of the rules.
    +9. The filter rule add/delete entries are usually short, a few tens of bytes
    +   (for example, an IPv6 + TCP tuple would be 36 bytes), and the ops/sec
    +   rate is high; hence supplying the fields inside the queue descriptor is
    +   preferred up to a certain fixed size, say 56 bytes.
    +10. The driver should be able to query flow filter table entries programmed
    +    in the device by the flow id.
    +11. The driver should be able to add an entry with attributes (a) match
    +    criteria, (b) action, (c) destination, and (d) a driver-assigned unique
    +    id of 32 bits.
    +12. The driver should be able to delete a flow filter rule entry via its
    +    unique id.
    +13. A group member device should be able to query the attributes of the flow
    +    filter rules that the device supports.
    +14. The driver and the group owner driver should be able to query the
    +    supported device limits for the flow rule entries.
    +15. Filter rule operation completions are reported via a completion queue
    +    interface.
    +
    +### 3.4.3 interface example
    +
    +1. Flow filter capabilities to query using a DMA interface:
    +
    +```
    +struct flow_filter_capabilities {
    +    u8 flow_groups;
    +    u16 num_flow_filter_vqs;
    +    u16 start_vq_index;
    +    u32 max_flow_filters_per_group;
    +    u32 max_flow_filters;
    +    u64 supported_packet_field_mask_bmap[4];
    +};
    +```
    +
    +2. Flow filter rule add/modify, delete:
    +
    +```
    +struct virtio_net_rff_add_modify {
    +    u8 flow_op;
    +    u8 group_id;
    +    u8 padding[2];
    +    le32 flow_id;
    +    struct match_criteria mc;
    +    struct destination dest;
    +    struct action action;
    +
    +    struct match_criteria mask; /* optional */
    +};
    +
    +struct virtio_net_rff_delete {
    +    u8 flow_op;
    +    u8 group_id;
    +    u8 padding[2];
    +    le32 flow_id;
    +};
    +```
    --
    2.26.2


  • 20.  [PATCH REQUIREMENTS v2 7/7] net-features: Add header data split requirements

    Posted 07-02-2023 23:45
    Add header data split requirements for the receive packets.

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    ---
    net-workstream/features-1.4.md | 13 +++++++++++++
    1 file changed, 13 insertions(+)

    diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
    index 466e4db..1bbf430 100644
    --- a/net-workstream/features-1.4.md
    +++ b/net-workstream/features-1.4.md
    @@ -11,6 +11,7 @@ together is desired while updating the virtio net interface.
     3. Virtqueue notification coalescing re-arming support
     4. Virtqueue receive flow filters (RFF)
     5. Device timestamp for tx and rx packets
    +6. Header data split for the receive virtqueue

     # 3. Requirements
     ## 3.1 Device counters
    @@ -293,3 +294,15 @@ struct virtio_net_rff_delete {
        point of reception from the network.
     3. The device should provide a receive packet timestamp in a single DMA
        transaction along with the rest of the receive completion fields.
    +
    +## 3.6 Header data split for the receive virtqueue
    +1. The device should be able to DMA the packet header and data to two
    +   different memory locations; this enables the driver and networking stack
    +   to perform zero copy to the application buffer(s).
    +2. The driver should be able to configure the maximum header buffer size per
    +   virtqueue.
    +3. The header buffer is in physically contiguous memory per virtqueue.
    +4. The device should be able to indicate header data split in the receive
    +   completion.
    +5. The device should be able to zero pad the header buffer when the received
    +   header is shorter than the CPU cache line size.
    --
    2.26.2


  • 21.  Re: [virtio-comment] [PATCH REQUIREMENTS v2 0/7] virtio net new features requirements

    Posted 07-20-2023 02:21
    On Mon, 3 Jul 2023 02:44:03 +0300, Parav Pandit <parav@nvidia.com> wrote:
    > Hi All,
    >
    > This document captures the virtio net device requirements for the upcoming
    > release 1.4 that some of us are currently working on.
    >
    > This is live document to be updated in coming time and work towards it for
    > its design which can result in a draft specification.
    >
    > The objectives are:
    > 1. to consider these requirements in introducing new features
    > listed in the document and otherwise and work towards the interface
    > design followed by drafting the specification changes.
    >
    > 2. Define practical list of requirements that can be achieved in 1.4
    > timeframe incrementally and also have the ability to implement them.
    >
    > Please review.

    First of all, I want to pay tribute to you.

    With your drive, these tasks have advanced relatively fast. But I'm
    wondering when we expect these things to be done.

    I also want to know whether you plan to release these specs one by one or
    together; I would like to know about the related plans.

    We very much hope that these tasks, especially features such as the device
    counters, the n-tuple receive flow filter requirements, and the packet
    timestamps, can be completed within this year. The sooner the better.

    Thanks.


    >
    > TODO:
    > 1. Some more refinement needed for rx low latency and header data split
    > requirements.
    >
    > ---
    > changelog:
    > v1->v2:
    > - major update of receive flow filter requirements updated based on last
    > two design discussions in community and offline research
    > - examples added
    > - link to use case and design goal added
    > - control and operation side requirements split
    > - more verbose
    > v0->v1:
    > - addressed comments from Heng Li
    > - addressed few (not all) comments from Michael
    > - per patch changelog
    >
    >
    > Parav Pandit (7):
    > net-features: Add requirements document for release 1.4
    > net-features: Add low latency transmit queue requirements
    > net-features: Add low latency receive queue requirements
    > net-features: Add notification coalescing requirements
    > net-features: Add n-tuple receive flow filters requirements
    > net-features: Add packet timestamp requirements
    > net-features: Add header data split requirements
    >
    > net-workstream/features-1.4.md | 308 +++++++++++++++++++++++++++++++++
    > 1 file changed, 308 insertions(+)
    > create mode 100644 net-workstream/features-1.4.md
    >
    > --
    > 2.26.2
    >
    >
    > This publicly archived list offers a means to provide input to the
    > OASIS Virtual I/O Device (VIRTIO) TC.
    >
    > In order to verify user consent to the Feedback License terms and
    > to minimize spam in the list archive, subscription is required
    > before posting.
    >
    > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
    > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
    > List help: virtio-comment-help@lists.oasis-open.org
    > List archive: https://lists.oasis-open.org/archives/virtio-comment/
    > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
    > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
    > Committee: https://www.oasis-open.org/committees/virtio/
    > Join OASIS: https://www.oasis-open.org/join/
    >



  • 22.  Re: [virtio-comment] [PATCH REQUIREMENTS v2 0/7] virtio net new features requirements

    Posted 07-20-2023 02:22
    On Mon, 3 Jul 2023 02:44:03 +0300, Parav Pandit <parav@nvidia.com> wrote: > Hi All, > > This document captures the virtio net device requirements for the upcoming > release 1.4 that some of us are currently working on. > > This is live document to be updated in coming time and work towards it for > its design which can result in a draft specification. > > The objectives are: > 1. to consider these requirements in introducing new features > listed in the document and otherwise and work towards the interface > design followed by drafting the specification changes. > > 2. Define practical list of requirements that can be achieved in 1.4 > timeframe incrementally and also have the ability to implement them. > > Please review. First of all, I want to pay tribute to you. Under your promotion, I think the advancement of these tasks has been relatively fast. But I'm wondering when we expect these things to be done. I also want to know, do you plan to release specs one by one, or these related specs together. I would like to know about related plans. We very much hope that these tasks, especially features such as device counter, n-tuple receive flow filters requirements, and packet timestamp can be completed within this year. The sooner the better. Thanks. > > TODO: > 1. Some more refinement needed for rx low latency and header data split > requirements. 


  • 23.  RE: [virtio-comment] [PATCH REQUIREMENTS v2 0/7] virtio net new features requirements

    Posted 07-20-2023 02:28


    > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    > Sent: Wednesday, July 19, 2023 10:21 PM

    > Under your promotion, I think the advancement of these tasks has been
    > relatively fast. But I'm wondering when we expect these things to be done.
    >
    There has been a delay in the last 10 days due to some unplanned tasks.

    Device counters and ntuple are very clear multi-vendor and multi-user requirements.
    So let's finish both requirements by the 26th of July.
    We can then complete the design by mid-August and the draft for both by the first week of September.

    > I also want to know, do you plan to release specs one by one, or these related
    > specs together. I would like to know about related plans.
    >
    Yes, we don't need to put them all in one patchset.
    In the last few meetings we already identified requirements that can be split out individually to make progress.
    Counters and ntuple are at the top.

    > We very much hope that these tasks, especially features such as device counter,
    > n-tuple receive flow filters requirements, and packet timestamp can be
    > completed within this year. The sooner the better.

    Packet timestamp requires a new VQ descriptor format because it has a direct impact on how many DMAs are done and how we pack them.
    So how about we get to this in mid-September, after we finish the ntuple and counters?
    We can design the new VQ format for latency + timestamp in one go from September onwards, before December.
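    To illustrate why a timestamp forces a descriptor format change: appending a device timestamp grows each completion record the device DMAs back to the driver, which changes both the DMA size and how many records fit per burst. The sketch below is purely hypothetical; the struct names and field layout (`vq_used_elem`, `vq_used_elem_ts`, `timestamp_ns`) are assumptions for illustration and are not taken from the virtio specification.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical completion record without a timestamp:
     * 16 bytes per completed buffer, DMAed device -> driver. */
    struct vq_used_elem {
        uint64_t id;    /* token identifying the completed buffer */
        uint32_t len;   /* bytes written by the device */
        uint32_t flags;
    };

    /* Hypothetical extended record carrying a device timestamp:
     * 8 extra bytes per completion, so either a larger DMA per
     * element or fewer elements packed per DMA burst. */
    struct vq_used_elem_ts {
        uint64_t id;
        uint32_t len;
        uint32_t flags;
        uint64_t timestamp_ns; /* device clock at completion */
    };

    int main(void) {
        /* 16 -> 24 bytes: 50% growth per completion record in this sketch. */
        printf("plain: %zu bytes, with timestamp: %zu bytes\n",
               sizeof(struct vq_used_elem), sizeof(struct vq_used_elem_ts));
        return 0;
    }
    ```

    The point of the sketch is only that the timestamp cannot be bolted on transparently: any layout carrying it changes the per-completion DMA footprint, which is why it is grouped with the low-latency VQ format redesign above.
    
    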






  • 25.  Re: RE: [virtio-comment] [PATCH REQUIREMENTS v2 0/7] virtio net new features requirements

    Posted 07-20-2023 02:36
    On Thu, 20 Jul 2023 02:28:26 +0000, Parav Pandit <parav@nvidia.com> wrote:
    >
    >
    > > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    > > Sent: Wednesday, July 19, 2023 10:21 PM
    >
    > > Under your promotion, I think the advancement of these tasks has been
    > > relatively fast. But I'm wondering when we expect these things to be done.
    > >
    > There has been a delay in the last 10 days due to some unplanned tasks.
    >
    > Device counters and ntuple are very clear multi-vendor and multi-user requirements.
    > So let's finish both requirements by the 26th of July.
    > We can then complete the design by mid-August and the draft for both by the first week of September.

    OK, we will do these first.

    >
    > > I also want to know, do you plan to release specs one by one, or these related
    > > specs together. I would like to know about related plans.
    > >
    > Yes, we don't need to put them all in one patchset.
    > In the last few meetings we already identified requirements that can be split out individually to make progress.
    > Counters and ntuple are at the top.

    OK.


    >
    > > We very much hope that these tasks, especially features such as device counter,
    > > n-tuple receive flow filters requirements, and packet timestamp can be
    > > completed within this year. The sooner the better.
    >
    > Packet timestamp requires a new VQ descriptor format because it has a direct impact on how many DMAs are done and how we pack them.
    > So how about we get to this in mid-September, after we finish the ntuple and counters?


    Great.

    > We can design the new VQ format for latency + timestamp in one go from September onwards, before December.
    >

    We are more than happy to help.

    Thanks.


