OASIS Virtual I/O Device (VIRTIO) TC

RE: [virtio-comment] proposal: use admin command (and aq) of the device to query config space

  • 1.  RE: [virtio-comment] proposal: use admin command (and aq) of the device to query config space

    Posted 08-02-2023 09:07
    > From: Jason Wang <jasowang@redhat.com>
    > Sent: Wednesday, August 2, 2023 2:23 PM
    >
    > On Tue, Aug 1, 2023 at 3:09 PM Parav Pandit <parav@nvidia.com> wrote:
    > >
    > > One line proposal:
    > > Let's use new admin command and admin q for all device types to query
    > > device config space for new fields. (always).
    >
    > Before we mandate anything to admin command, we need to first invent an
    > admin command over MMIO interface otherwise it would always be an issue
    > for the nesting.
    >
    Nesting can be independent requirement in itself.

    > >
    > > Details below.
    > >
    > > Query of device capabilities and configuration using DMA interface.
    > > Need:
    > > Currently device configuration space is exposed as read only registers.
    >
    > This is wrong:
    >
    > 1) device configuration space is transport independent, some transport already
    > use DMA to access the device configuration space
    You can say ccw instead of "some". :)

    > 2) device configuration space is not read only, we've already had several
    > examples of using it as write
    >
    It is even worse to have writable.

    > > It is growing rapidly.
    > > Some devices may be even multi-functionality device in coming future such as
    > > net + clock + rdma device.
    > > For a PCI transport implementing such ever-growing capabilities,
    > > configuration is burdensome as plain registers.
    >
    > We've already fixed size VIRTIO_PCI_CAP_PCI_CFG. What's wrong with that?
    >
    The wrong part is: it is still and indirect and slow, sub-optimal register interface.

    > And we have a lot of device specific virtqueues that could be used for
    > configuration.
    >
    Sure, this is the option_3 listed in here.

    > > Hence, it is required for the driver to query capabilities and configuration
    > > using a DMA interface.
    > >
    > > Interface requirements:
    > > 1. Maintain backward compatibility for existing defined configuration fields to
    > > stay as registers.
    > > 2. Any new field added must be accessed via DMA interface, regardless of
    > > device implementation (hw/sw etc).
    > > Results in single driver code regardless of device implementation.
    >
    > Virtio is flexible as it decouples transport from the device model.
    > This breaks this flexibility, and this prevents non-DMA transport from being
    > developed.
    >
    VQ is decoupled from transport already.
    So, there is no flexibility broken.
    And yet you suggested transport dependent VIRTIO_PCI_CAP_PCI_CFG above making it further wrong. :)

    > > 3. A device must be able to choose, starting from which field driver
    > > must query such configuration via DMA interface. This field offset must be
    > > greater than currently defined configuration field.
    > > 4. Any driver to device query operation must not be mandated to be
    > > mediated by the owner device for PCI VFs or SIOV or SF devices. Driver
    > > must be able to communicate query capabilities and configuration
    > > fields directly to the device regardless of device type being PCI PF, VF,
    > > SF/SIOV device uniformly.
    > > 5. When having multi-functionality device in future, it is desired to
    > > not always query all the configuration but may be able to query
    > > per-functionality configurations.
    > > For example, query only steering capabilities, query only rdma capabilities or
    > > query only clock capabilities.
    > > 6. The driver should be able to query config/capabilities without
    > > polling for the DMA completion, in other words, the driver should be
    > > able to get notification from the device when DMA command completes.
    > > 7. The driver should be able to utilize existing interrupt vector
    > > and/or virtqueue for query and set operation without demanding
    > > additional interrupt vector whenever possible.
    > >
    > > There are multiple options for DMA interface.
    > > Some of these options are listed below that we would like to consider fulfilling
    > > above requirements.
    > >
    > > Option_1:
    > > New DMA interface registers.
    > > New registers which allows single outstanding DMA command per device.
    > > Such as,
    > > struct pci_dma_cmd_mmio_registers {
    > >         le64 cmd_addr; /* rw */
    > >         le32 cmd_len; /* rw: ordered write to it after cmd_addr, this triggers cmd to device */
    > >         le32 start_offset_cfg_space; /* fields below this offset are not available as register, dma is must */
    > > };
    > >
    > > struct pci_config_access_cmd {
    > >         u8 opcode; /* 0 = get config, 1 to N set dev specific config */
    > >         u8 reserved;
    > >         le16 msix_vector_index;
    > >         le64 rsp_addr; /* points to struct pci_config_access_rsp */
    > >         le32 rsp_len;
    > > };
    > > struct pci_config_access_rsp {
    > >         u8 status;
    > >         u8 debug_field;
    > >         u8 cmd_specific_data[];
    > > };
    > >
    > > Cons:
    > > 1. Duplication of a VQ interface which can do same work and vq is purposed
    > > for "bulk data transfer" in spec already like this use case.
    > > 2. Some devices already have CVQ that can easily fulfil this role, which is not
    > > utilized.
    > > 3. Requires per device additional 16 bytes of register space.
    > > 4. Requires an extra msix interrupt vector to differentiate from config change
    > > interrupts.
    > >
    > > Option_2:
    > > Use admin vq for all the device types regardless of its transport such as PCI PF,
    > > PCI VF, PCI SIOV.
    > > In this method a new admin command is issued on the admin vq of the device
    > > itself.
    > > Pros:
    > > 1. Requirements 1 to 7 are fulfilled.
    > > 2. Driver can reuse the same vector with CVQ that addresses requirement #7.
    > > Cons:
    > > 1. Requires per device unique admin queue number and count.
    > > Still better than dedicated dma interface of #1, as it requires only 4 additional
    > > bytes as opposed to 12 bytes.
    > >
    > > Option_3:
    > > Use control VQ for the devices that already has CVQ.
    > > In this method an existing CVQ of the device is used to query device config
    > > space.
    > > Pros:
    > > 1. All 7 requirements are fulfilled.
    > > 2. Does not need 4 bytes of admin queue registers.
    > > 3. Save MSI-X vectors of the AQ.
    > > 4. Superior to option_2 as it doesn't require extra AQ.
    > > 5. Most efficient of all 3 options.
    > > Cons:
    > > 1. Those devices which does not have CVQ, may need to add it if at all they
    > > are growing.
    > >
    > > d. Any other option?
    >
    > Transport virtqueue on top of admin virtqueue will address this seamlessly.
    >
    :)

    Don't see why one would create few more objects on top of aq when aq or cvq itself can fulfil the need.
    Can you please elaborate?
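For readers following the Option_1 layout in the message above, here is a minimal C sketch of those structures with concrete fixed-width types. It is an illustration only, not text from the proposal: `le64`/`le32`/`le16` are modelled as plain `uintN_t` (assuming a little-endian host), the `packed` attribute is a GCC/Clang extension chosen so the command fields sit at the offsets implied by the field list, and the helpers `cfg_field_needs_dma` and `build_get_config_cmd` are hypothetical names. The sketch also assumes that `start_offset_cfg_space` marks the first config offset that must be fetched by DMA, which is one reading of requirement 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Option_1 register block, fields as listed in the proposal. */
struct pci_dma_cmd_mmio_registers {
    uint64_t cmd_addr;               /* rw */
    uint32_t cmd_len;                /* rw: written after cmd_addr, triggers the command */
    uint32_t start_offset_cfg_space; /* assumed: first config offset that requires DMA */
};

/* Command descriptor, packed so no padding is inserted before rsp_addr. */
struct pci_config_access_cmd {
    uint8_t  opcode;            /* 0 = get config, 1..N = set device-specific config */
    uint8_t  reserved;
    uint16_t msix_vector_index;
    uint64_t rsp_addr;          /* points to the response buffer */
    uint32_t rsp_len;
} __attribute__((packed));

/* The register block is 8 + 4 + 4 bytes, matching "16 bytes" in Con 3. */
_Static_assert(sizeof(struct pci_dma_cmd_mmio_registers) == 16,
               "register block should be 16 bytes");
_Static_assert(sizeof(struct pci_config_access_cmd) == 16,
               "packed command should be 16 bytes");

/* Hypothetical helper: nonzero if the field at `offset` must be read by DMA. */
int cfg_field_needs_dma(const struct pci_dma_cmd_mmio_registers *regs,
                        uint32_t offset)
{
    return offset >= regs->start_offset_cfg_space;
}

/* Hypothetical helper: fill in a "get config" command (opcode 0). */
void build_get_config_cmd(struct pci_config_access_cmd *cmd,
                          uint64_t rsp_addr, uint32_t rsp_len,
                          uint16_t msix_vector_index)
{
    memset(cmd, 0, sizeof(*cmd));
    cmd->opcode = 0;
    cmd->msix_vector_index = msix_vector_index;
    cmd->rsp_addr = rsp_addr;
    cmd->rsp_len = rsp_len;
}
```

A driver following this sketch would keep reading fields below the start offset as registers and build a command only for fields at or past it.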


  • 2.  Re: [virtio] RE: [virtio-comment] proposal: use admin command (and aq) of the device to query config space

    Posted 08-02-2023 09:32
    On Wed, Aug 2, 2023 at 5:07 PM Parav Pandit <parav@nvidia.com> wrote:
    >
    >
    > > From: Jason Wang <jasowang@redhat.com>
    > > Sent: Wednesday, August 2, 2023 2:23 PM
    > >
    > > On Tue, Aug 1, 2023 at 3:09 PM Parav Pandit <parav@nvidia.com> wrote:
    > > >
    > > > One line proposal:
    > > > Let's use new admin command and admin q for all device types to query
    > > device config space for new fields. (always).
    > >
    > > Before we mandate anything to admin command, we need to first invent an
    > > admin command over MMIO interface otherwise it would always be an issue
    > > for the nesting.
    > >
    > Nesting can be independent requirement in itself.

    I don't understand here. If you tie new fields to the DMA interface,
    it basically means nesting won't get any new features unless:

    1) it's a PCI VF
    2) SR-IOV emulation is done
    3) admin virtqueue emulation is done

    If you want to treat nesting devices differently from others, it would
    be a nightmare to maintain.

    > > >
    > > > Details below.
    > > >
    > > > Query of device capabilities and configuration using DMA interface.
    > > > Need:
    > > > Currently device configuration space is exposed as read only registers.
    > >
    > > This is wrong:
    > >
    > > 1) device configuration space is transport independent, some transport already
    > > use DMA to access the device configuration space
    > You can say ccw instead of "some". :)

    Kind of, but the transport vq proposal goes the same way.

    >
    > > 2) device configuration space is not read only, we've already had several
    > > examples of using it as write
    > >
    > It is even worse to have writable.

    Well, what I meant is that, it's not necessarily read only and not
    necessarily a register interface.

    >
    > > > It is growing rapidly.
    > > > Some devices may be even multi-functionality device in coming future such as
    > > net + clock + rdma device.
    > > > For a PCI transport implementing such ever-growing capabilities,
    > > configuration is burdensome as plain registers.
    > >
    > > We've already fixed size VIRTIO_PCI_CAP_PCI_CFG. What's wrong with that?
    > >
    > The wrong part is: it is still and indirect and slow, sub-optimal register interface.

    Do we really care about the performance here? And if it is one of your
    major concerns, it's better to explain it along with the
    "ever-growing" concern.

    >
    > > And we have a lot of device specific virtqueues that could be used for
    > > configuration.
    > >
    > Sure, this is the option_3 listed in here.
    >
    > > > Hence, it is required for the driver to query capabilities and configuration
    > > using a DMA interface.
    > > >
    > > > Interface requirements:
    > > > 1. Maintain backward compatibility for existing defined configuration fields to
    > > stay as registers.
    > > > 2. Any new field added must be accessed via DMA interface, regardless of
    > > device implementation (hw/sw etc).
    > > > Results in single driver code regardless of device implementation.
    > >
    > > Virtio is flexible as it decouples transport from the device model.
    > > This breaks this flexibility, and this prevents non-DMA transport from being
    > > developed.
    > >
    > VQ is decoupled from transport already.
    > So, there is no flexibility broken.
    > And yet you suggested transport dependent VIRTIO_PCI_CAP_PCI_CFG above making it further wrong. :)
    >

    The context here is that you want to mandate any new fields to be DMA.
    DMA is obviously transport specific. There are transports that don't
    use DMA at all (e.g. shared memory).
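One way to picture the decoupling argued for here is a config accessor that the device-model code calls through a transport-provided hook. The C sketch below is illustrative only and comes from neither the spec nor this thread: `cfg_transport_ops`, `mapped_read_cfg`, `queue_read_cfg` and `read_cfg_le16` are hypothetical names, and the "queue" backend is a toy stand-in that only counts requests instead of performing a real virtqueue round trip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical transport hook: copy `len` bytes of device config starting
 * at `offset` into `buf`. Returns 0 on success, -1 on a bad range. */
struct cfg_transport_ops {
    int (*read_cfg)(void *ctx, uint32_t offset, void *buf, uint32_t len);
};

/* Backend 1: config is a directly readable window (registers or shared
 * memory); no DMA is involved. */
struct mapped_window {
    const uint8_t *base;
    uint32_t size;
};

int mapped_read_cfg(void *ctx, uint32_t offset, void *buf, uint32_t len)
{
    const struct mapped_window *w = ctx;
    if (offset > w->size || len > w->size - offset)
        return -1;
    memcpy(buf, w->base + offset, len);
    return 0;
}

/* Backend 2: each read is modelled as one command sent to the device.
 * A real transport would post a buffer on a queue and wait for completion. */
struct queue_ctx {
    const uint8_t *device_copy;
    uint32_t size;
    unsigned requests; /* number of commands issued so far */
};

int queue_read_cfg(void *ctx, uint32_t offset, void *buf, uint32_t len)
{
    struct queue_ctx *q = ctx;
    if (offset > q->size || len > q->size - offset)
        return -1;
    q->requests++;
    memcpy(buf, q->device_copy + offset, len);
    return 0;
}

/* Device-model code: reads a little-endian 16-bit field without knowing
 * which backend is underneath. */
int read_cfg_le16(const struct cfg_transport_ops *ops, void *ctx,
                  uint32_t offset, uint16_t *out)
{
    uint8_t raw[2];
    int rc = ops->read_cfg(ctx, offset, raw, sizeof(raw));
    if (rc)
        return rc;
    *out = (uint16_t)(raw[0] | (raw[1] << 8));
    return 0;
}
```

In this sketch the choice between a mapped window and a queue is made once, when the ops table is selected, so a new config field does not by itself force a particular access mechanism on every transport.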

    > > > 3. A device must be able to choose, starting from which field driver
    > > > must query such configuration via DMA interface. This field offset must be
    > > greater than currently defined configuration field.

    [...]

    > > >
    > > > d. Any other option?
    > >
    > > Transport virtqueue on top of admin virtqueue will address this seamlessly.
    > >
    > :)
    >
    > Don’t see why one would create few more objects on top of aq when aq or cvq itself can fulfil the need.
    > Can you please elaborate?

    If cvq can work, there's no need for any other methods. If you care
    about admin virtqueue then device configuration space is not the only
    thing that can be "ever growing", common_cfg is another one. Allowing
    them to be accessed via a virtqueue (thus DMA in PCI transport) is
    basically the idea of the transport virtqueue. Lingshan is rebasing
    the transport virtqueue proposals on top of admin commands; we can
    see then.

    Thanks



