virtio-comment

 View Only
  • 1.  [RFC PATCH] virtio-pci: introduce device state capability

    Posted 12-04-2020 05:46
    This patch tries to introduce device state capability in order to be
    self virtualized. One possible user is the support of live migration.

    The minimal requirements for achieving are:

    - The ability to pause and resume the the datapath. This patch choose
    to do this per virtqueue via a dedicated rw field queue_status to do
    this. Driver may choose to write 0 to stop a queue and device can
    only advertise the stop when it completes all the inflight
    descriptors.
    - The ability to set and get the device internal ring state. As an RFC
    this patch only introduce the last_avail_idx field for split
    ring. We could extend it to support packed ring (e.g with the wrap
    counters).

    We probably need a new feature bit for advertising this new feature
    but this draft should be sufficient for starting the discussion.

    Several other choices:

    1) allow writing 0 to queue_enable when the new feature is negotiated
    2) reuse the common structure for queue index save and restore
    3) introduce new virtio device status to stop/resume the whole device

    Please share your thoughts.

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    ---
    content.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
    1 file changed, 51 insertions(+)

    diff --git a/content.tex b/content.tex
    index 61eab41..7cdb66a 100644
    --- a/content.tex
    +++ b/content.tex
    @@ -705,6 +705,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
    #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
    /* Vendor-specific data */
    #define VIRTIO_PCI_CAP_VENDOR_CFG 9
    +/* Device state */
    +#define VIRTIO_PCI_CAP_DEVICE_STATE_CFG 10
    \end{lstlisting}

    Any other value is reserved for future use.
    @@ -1247,6 +1249,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
    specified by some other Virtio Structure PCI Capability
    of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.

    +\subsubsection{Device State capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Device state capability}
    +
    +The device state structure is found at the \field{bar} and \field{offset} within the VIRTIO_PCI_CAP_DEVICE_STATE_CFG capability; its layout is below.
    +
    +\begin{lstlisting}
    +struct virtio_pci_device_state {
    + /* About a specific virtqueue. */
    + le16 queue_select; /* read-write */
    + le16 last_avail_idx; /* read-write */
    + u8 queue_status; /* read-write */
    +};
    +\end{lstlisting}
    +
    +\begin{description}
    +\item[\field{queue_select}]
    + Queue Select. The driver selects which virtqueue the following
    + fields refer to.
    +
    +\item[\field{last_avail_idx}]
    + Last avail index for this virtqueue that is used by the device.
    +
    +\item[\field{queue_status}]
    + The driver uses this to stop the executing request for this virtqueue.
    + 1 - running; 0 - stopped.
    +\end{description}
    +
    +\devicenormative{\paragraph}{PCI Device State capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI Device State capability}
    +
    +The device MUST present a 0 in \field{queue_status} on reset.
    +
    +The device MUST present a 1 in \field{queue_status} upon writing 1
    +to \field{queue_enable} from common configuration structure by driver.
    +
    +The device MUST complete all inflight requests before presenting a 0
    +in \field{queue_status}.
    +
    +The device MUST ignore the write to \field{queue_status} if the
    +virtqueue is unavailable when the value \field{queue_enable} is zero
    +from common configuration structure.
    +
    +\drivernormative{\paragraph}{PCI Device State capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI Device State capability}
    +
    +To stop using the queue the driver MUST write zero (0x0) to this
    +\field{queue_status} and MUST read the value back to ensure
    +synchronization.
    +
    +The driver MUST NOT access \field{last_avail_idx} when
    +\field{queue_status} is not zero.
    +
    \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}

    Transitional devices MUST present part of configuration
    --
    2.25.1




  • 2.  Re: [virtio-comment] [RFC PATCH] virtio-pci: introduce device state capability

    Posted 12-17-2020 15:28
    On Fri, Dec 04, 2020 at 01:46:15PM +0800, Jason Wang wrote:
    > @@ -1247,6 +1249,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
    > specified by some other Virtio Structure PCI Capability
    > of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
    >
    > +\subsubsection{Device State capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Device state capability}
    > +
    > +The device state structure is found at the \field{bar} and \field{offset} within the VIRTIO_PCI_CAP_DEVICE_STATE_CFG capability; its layout is below.
    > +
    > +\begin{lstlisting}
    > +struct virtio_pci_device_state {
    > + /* About a specific virtqueue. */
    > + le16 queue_select; /* read-write */
    > + le16 last_avail_idx; /* read-write */
    > + u8 queue_status; /* read-write */

    Starting/stopping queues is useful for managing multiqueue across CPU
    hotplug. Perhaps the queue_status functionality should be generic and
    not related to device state save/load?

    > +};
    > +\end{lstlisting}
    > +
    > +\begin{description}
    > +\item[\field{queue_select}]
    > + Queue Select. The driver selects which virtqueue the following
    > + fields refer to.
    > +
    > +\item[\field{last_avail_idx}]
    > + Last avail index for this virtqueue that is used by the device.

    This is only sufficient to save/load devices where all device state is
    visible to the driver (I would call them "stateless"). virtio-gpu,
    virtio-crypto, and virtio-fs are "stateful" devices in the sense that
    devices contain internal state that is not directly visible to the
    driver but is needed to save/load the device.

    If we want to save/load stateful devices then an additional interface is
    needed for this data. It can be large in size (e.g. imagine graphics
    buffers or many thousands of inodes) and should therefore probably
    support iterative saving.

    > +\item[\field{queue_status}]
    > + The driver uses this to stop the executing request for this virtqueue.
    > + 1 - running; 0 - stopped.
    > +\end{description}

    A per-virtqueue stopped state is not enough to stop a device because
    some devices have global behavior like config change events. It must be
    possible to completely stop the device to prevent further activity
    (VIRTIO config space changes, interrupts, etc).

    Maybe a Device Status register bit can be added for that?

    > +
    > +\devicenormative{\paragraph}{PCI Device State capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI Device State capability}
    > +
    > +The device MUST present a 0 in \field{queue_status} on reset.
    > +
    > +The device MUST present a 1 in \field{queue_status} upon writing 1
    > +to \field{queue_enable} from common configuration structure by driver.
    > +
    > +The device MUST complete all inflight requests before presenting a 0
    > +in \field{queue_status}.

    Is there a definition of "inflight requests"? I think the precise
    meaning here is avail requests that the device is processing. It does
    not include avail requests that the device has not yet started
    processing.

    Is there any notification when the queue_status bit transitions from 1
    to 0? If not, then drivers must poll the register to detect completion.

    Is it possible to resume a virtqueue that is in the process of stopping
    (for example, if the guest driver decides a timeout has been reached and
    it no longer wishes to stop the virtqueue)?



  • 3.  Re: [virtio-comment] [RFC PATCH] virtio-pci: introduce device state capability

    Posted 12-18-2020 03:17

    On 2020/12/17 ??11:28, Stefan Hajnoczi wrote:
    > On Fri, Dec 04, 2020 at 01:46:15PM +0800, Jason Wang wrote:
    >> @@ -1247,6 +1249,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
    >> specified by some other Virtio Structure PCI Capability
    >> of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
    >>
    >> +\subsubsection{Device State capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Device state capability}
    >> +
    >> +The device state structure is found at the \field{bar} and \field{offset} within the VIRTIO_PCI_CAP_DEVICE_STATE_CFG capability; its layout is below.
    >> +
    >> +\begin{lstlisting}
    >> +struct virtio_pci_device_state {
    >> + /* About a specific virtqueue. */
    >> + le16 queue_select; /* read-write */
    >> + le16 last_avail_idx; /* read-write */
    >> + u8 queue_status; /* read-write */
    > Starting/stopping queues is useful for managing multiqueue across CPU
    > hotplug. Perhaps the queue_status functionality should be generic and
    > not related to device state save/load?


    That's fine, actually, I was considering reusing queue_enable which
    forbids to be write with 0 now. My idea is to allow zero to be wrote if
    some features is negotiated.


    >
    >> +};
    >> +\end{lstlisting}
    >> +
    >> +\begin{description}
    >> +\item[\field{queue_select}]
    >> + Queue Select. The driver selects which virtqueue the following
    >> + fields refer to.
    >> +
    >> +\item[\field{last_avail_idx}]
    >> + Last avail index for this virtqueue that is used by the device.
    > This is only sufficient to save/load devices where all device state is
    > visible to the driver (I would call them "stateless"). virtio-gpu,
    > virtio-crypto, and virtio-fs are "stateful" devices in the sense that
    > devices contain internal state that is not directly visible to the
    > driver but is needed to save/load the device.


    Right, this is the support at generic device level.

    For stateful device, it needs device specific extension which I think is
    hard to be generalized.


    >
    > If we want to save/load stateful devices then an additional interface is
    > needed for this data. It can be large in size (e.g. imagine graphics
    > buffers or many thousands of inodes) and should therefore probably
    > support iterative saving.


    Yes, this needs more consideration. But we can solve the generic
    virtqueue level first.


    >
    >> +\item[\field{queue_status}]
    >> + The driver uses this to stop the executing request for this virtqueue.
    >> + 1 - running; 0 - stopped.
    >> +\end{description}
    > A per-virtqueue stopped state is not enough to stop a device because
    > some devices have global behavior like config change events. It must be
    > possible to completely stop the device to prevent further activity
    > (VIRTIO config space changes, interrupts, etc).
    >
    > Maybe a Device Status register bit can be added for that?


    I've considered such case, one of the idea is to introduce a new device
    status to achieve which seems more generic. Then we need to define a
    state machine carefully.


    >
    >> +
    >> +\devicenormative{\paragraph}{PCI Device State capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI Device State capability}
    >> +
    >> +The device MUST present a 0 in \field{queue_status} on reset.
    >> +
    >> +The device MUST present a 1 in \field{queue_status} upon writing 1
    >> +to \field{queue_enable} from common configuration structure by driver.
    >> +
    >> +The device MUST complete all inflight requests before presenting a 0
    >> +in \field{queue_status}.
    > Is there a definition of "inflight requests"? I think the precise
    > meaning here is avail requests that the device is processing. It does
    > not include avail requests that the device has not yet started
    > processing.


    Right.


    >
    > Is there any notification when the queue_status bit transitions from 1
    > to 0? If not, then drivers must poll the register to detect completion.


    To be simple, this draft requires the driver to poll for the status.


    >
    > Is it possible to resume a virtqueue that is in the process of stopping
    > (for example, if the guest driver decides a timeout has been reached and
    > it no longer wishes to stop the virtqueue)?


    I think the answer is yes, we can clarify this in the spec.

    Thanks





  • 4.  Re: [virtio-comment] [RFC PATCH] virtio-pci: introduce device state capability

    Posted 12-18-2020 04:58

    On 2020/12/18 ??11:16, Jason Wang wrote:
    >>>
    >> A per-virtqueue stopped state is not enough to stop a device because
    >> some devices have global behavior like config change events. It must be
    >> possible to completely stop the device to prevent further activity
    >> (VIRTIO config space changes, interrupts, etc).
    >>
    >> Maybe a Device Status register bit can be added for that?
    >
    >
    > I've considered such case, one of the idea is to introduce a new
    > device status to achieve which seems more generic. Then we need to
    > define a state machine carefully.


    So I post another RFC to introduce a new device status bit. Please review.

    Thanks




  • 5.  Re: [virtio-comment] [RFC PATCH] virtio-pci: introduce device state capability

    Posted 12-18-2020 17:29
    On Fri, Dec 18, 2020 at 12:58:21PM +0800, Jason Wang wrote:
    >
    > On 2020/12/18 ??11:16, Jason Wang wrote:
    > > > >
    > > > A per-virtqueue stopped state is not enough to stop a device because
    > > > some devices have global behavior like config change events. It must be
    > > > possible to completely stop the device to prevent further activity
    > > > (VIRTIO config space changes, interrupts, etc).
    > > >
    > > > Maybe a Device Status register bit can be added for that?
    > >
    > >
    > > I've considered such case, one of the idea is to introduce a new device
    > > status to achieve which seems more generic. Then we need to define a
    > > state machine carefully.
    >
    >
    > So I post another RFC to introduce a new device status bit. Please review.

    Thanks, that sounds good. I will be offline until January 4th and will
    try to catch up then.

    Stefan



  • 6.  Re: [virtio-comment] [RFC PATCH] virtio-pci: introduce device state capability

    Posted 12-21-2020 03:43

    On 2020/12/19 ??1:28, Stefan Hajnoczi wrote:
    > On Fri, Dec 18, 2020 at 12:58:21PM +0800, Jason Wang wrote:
    >> On 2020/12/18 ??11:16, Jason Wang wrote:
    >>>> A per-virtqueue stopped state is not enough to stop a device because
    >>>> some devices have global behavior like config change events. It must be
    >>>> possible to completely stop the device to prevent further activity
    >>>> (VIRTIO config space changes, interrupts, etc).
    >>>>
    >>>> Maybe a Device Status register bit can be added for that?
    >>>
    >>> I've considered such case, one of the idea is to introduce a new device
    >>> status to achieve which seems more generic. Then we need to define a
    >>> state machine carefully.
    >>
    >> So I post another RFC to introduce a new device status bit. Please review.
    > Thanks, that sounds good. I will be offline until January 4th and will
    > try to catch up then.
    >
    > Stefan


    Great.

    Thanks