virtio-comment


[PATCH 2/5] virtio-blk spec: writeback cache enable improvements

  • 1.  [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 09:55
    This patch introduces two improvements to writeback cache handling
    in the virtio-blk spec.

    1) The VIRTIO_BLK_F_FLUSH feature is renamed to VIRTIO_BLK_F_WCE, and
    QEMU's behavior is documented explicitly as part of the spec: the host
    negotiates the feature only if its cache is writeback. The obvious dual
    requirement is imposed on the guest: it should negotiate the feature
    only if it is able to send flushes. And in order to protect against
    data loss, the spec now mandates that the host operates in writethrough
    mode if the guest does not negotiate VIRTIO_BLK_F_WCE (this behavior
    was already _allowed_ by the spec so far). This can change with every
    reset of course; typically the BIOS will run as writethrough, while the
    "main" OS will run in writeback mode. This is a backwards-compatible
    refinement geared towards old or limited guests, so there is no need
    for a new feature bit.

    2) a second feature is added, VIRTIO_BLK_F_CONFIG_WCE, that provides
    the same information in the configuration. This will enable the driver
    to modify the write-cache setting at runtime (via sysfs for Linux, via
    MODE SELECT for Windows).

    Patches for QEMU and Linux will come soonish.

    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    ---
    virtio-spec.lyx | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
    1 file changed, 56 insertions(+), 1 deletion(-)

    diff --git a/virtio-spec.lyx b/virtio-spec.lyx
    index 025b6ad..440af3e 100644
    --- a/virtio-spec.lyx
    +++ b/virtio-spec.lyx
    @@ -5108,7 +5108,19 @@ VIRTIO_BLK_F_SCSI (7) Device supports scsi packet commands.
    \end_layout

    \begin_layout Description
    -VIRTIO_BLK_F_FLUSH (9) Cache flush command support.
    +VIRTIO_BLK_F_
    +\change_deleted 1531152142 1341302299
    +FLUSH
    +\change_inserted 1531152142 1341302304
    +WCE
    +\change_unchanged
    + (9)
    +\change_deleted 1531152142 1341302317
    +Cache flush command support.
    +\change_inserted 1531152142 1341305427
    +Device cache starts in writeback mode after reset.
    + Guests should not negotiate this feature unless they are capable of sending
    + VIRTIO_BLK_T_FLUSH commands.
    \end_layout

    \begin_layout Description
    @@ -5117,6 +5129,13 @@ VIRTIO_BLK_F_FLUSH (9) Cache flush command support.
    VIRTIO_BLK_F_TOPOLOGY (10) Device exports information on optimal I/O alignment.
    \end_layout

    +\begin_layout Description
    +
    +\change_inserted 1531152142 1341302349
    +VIRTIO_BLK_F_CONFIG_WCE (11) Device can toggle its cache between writeback
    + and writethrough modes.
    +\end_layout
    +
    \end_deeper
    \begin_layout Description
    Device
    @@ -5234,6 +5253,15 @@ struct virtio_blk_config {

    \begin_layout Plain Layout

    +\change_inserted 1531152142 1341301918
    +
    + u8 writeback;
    +\change_unchanged
    +
    +\end_layout
    +
    +\begin_layout Plain Layout
    +
    };
    \end_layout

    @@ -5294,6 +5322,33 @@ If the VIRTIO_BLK_F_TOPOLOGY feature is negotiated, the fields in the topology
    This also does not affect the units in the protocol, only performance.
    \end_layout

    +\begin_layout Enumerate
    +
    +\change_inserted 1531152142 1341305949
    +The cache mode should be read from the writeback field of the configuration
    + if the VIRTIO_BLK_F_CONFIG_WCE feature is available; the driver can also
    + write to the field in order to toggle the cache between writethrough (0)
    + and writeback (1) mode.
    + If the feature is not available, the driver can instead look at the result
    + of negotiating VIRTIO_BLK_F_WCE: the cache will be in writeback mode after
    + reset if and only if VIRTIO_BLK_F_WCE is negotiated
    +\begin_inset Foot
    +status open
    +
    +\begin_layout Plain Layout
    +
    +\change_inserted 1531152142 1341306004
    +Until version 1.1, QEMU remained in writeback mode even after a guest announced
    + lack of support for VIRTIO_BLK_F_FLUSH.
    +\change_unchanged
    +
    +\end_layout
    +
    +\end_inset
    +
    +.
    +\end_layout
    +
    \begin_layout Section*
    Device Operation
    \end_layout
    --
    1.8.3.1
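
    As a rough illustration of the rule the patch adds to the spec, a
    driver could decide the cache mode along the following lines. This is
    a minimal sketch in C, not code from QEMU or Linux: the feature bit
    numbers (9 and 11) and the writeback config field come from the patch
    above, while has_feature() and cache_is_writeback() are hypothetical
    names invented for the example, and the negotiated features are
    assumed to be available as a plain 64-bit mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as given in the patch. */
#define VIRTIO_BLK_F_WCE        9
#define VIRTIO_BLK_F_CONFIG_WCE 11

/* Hypothetical helper: test one bit in the negotiated feature mask. */
static bool has_feature(uint64_t negotiated, unsigned bit)
{
    assert(bit < 64);
    return (negotiated >> bit) & 1;
}

/*
 * Hypothetical helper: report whether the device cache is in writeback
 * mode. cfg_writeback is the value read from the "writeback" config
 * field (0 = writethrough, 1 = writeback); it is only consulted when
 * VIRTIO_BLK_F_CONFIG_WCE was negotiated.
 */
static bool cache_is_writeback(uint64_t negotiated, uint8_t cfg_writeback)
{
    if (has_feature(negotiated, VIRTIO_BLK_F_CONFIG_WCE))
        return cfg_writeback != 0;

    /* Otherwise: writeback after reset if and only if WCE was negotiated. */
    return has_feature(negotiated, VIRTIO_BLK_F_WCE);
}
```

    A guest that cannot send VIRTIO_BLK_T_FLUSH would leave bit 9 out of
    the mask it negotiates, so the second branch reports writethrough.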




  • 2.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 15:04
    On Mon, 2013-08-19 at 11:55 +0200, Paolo Bonzini wrote:
    > This patch introduces two improvements to writeback cache handling
    > in the virtio-blk spec.
    >
    > 1) The VIRTIO_BLK_F_FLUSH feature is renamed to VIRTIO_BLK_F_WCE, and
    > QEMU's behavior is documented explicitly as part of the spec: the host
    > negotiates the feature only if its cache is writeback. The obvious dual
    > requirement is imposed on the guest: it should negotiate the feature
    > only if it is able to send flushes. And in order to protect against
    > data loss, the spec now mandates that the host operates in writethrough
    > mode if the guest does not negotiate VIRTIO_BLK_F_WCE (this behavior
    > was already _allowed_ by the spec so far). This can change with every
    > reset of course; typically the BIOS will run as writethrough, while the
    > "main" OS will run in writeback mode. This is a backwards-compatible
    > refinement geared towards old or limited guests, so there is no need
    > for a new feature bit.
    >
    > 2) a second feature is added, VIRTIO_BLK_F_CONFIG_WCE, that provides
    > the same information in the configuration. This will enable the driver
    > to modify the write-cache setting at runtime (via sysfs for Linux, via
    > MODE SELECT for Windows).

    I've got to say this looks cockeyed; you're effectively translating SCSI
    commands into your own command set: the way you're setting this up you
    have to modify the driver for every spec feature you support and have
    some sort of elaborate protocol to identify the supported feature set.
    What does this buy you? SCSI is already a packet transport protocol.
    Just transport the SCSI commands over the virtio-interface and let the
    server reply (with I don't know what you're talking about if the command
    isn't supported). This is the way SCSI probes real devices, so if you do
    it this way there's no need for the elaborate protocol identification
    and SCSI will work out what the supported feature set is.

    James
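
    The probing style described above can be sketched roughly as follows.
    This is a toy model in C, not any real initiator or target code: the
    opcodes, the ILLEGAL REQUEST sense key (0x05) and the "invalid command
    operation code" additional sense code (0x20) are the standard SCSI
    values, but toy_target_execute(), initiator_probe(), struct
    scsi_result and the set of supported commands are all assumptions
    made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard SCSI values. */
#define SENSE_KEY_ILLEGAL_REQUEST 0x05
#define ASC_INVALID_OPCODE        0x20

#define OP_MODE_SELECT_6 0x15
#define OP_MODE_SENSE_6  0x1a
#define OP_SYNC_CACHE_10 0x35
#define OP_UNMAP         0x42

/* Hypothetical, heavily simplified command result. */
struct scsi_result {
    bool    ok;
    uint8_t sense_key;
    uint8_t asc;
};

/* Hypothetical stand-in target that only knows two opcodes. */
static struct scsi_result toy_target_execute(const uint8_t *cdb, size_t len)
{
    static const uint8_t supported[] = { OP_MODE_SENSE_6, OP_SYNC_CACHE_10 };
    /* Default answer: "I don't know what you're talking about". */
    struct scsi_result r = { false, SENSE_KEY_ILLEGAL_REQUEST, ASC_INVALID_OPCODE };

    assert(cdb != NULL && len > 0);
    for (size_t i = 0; i < sizeof(supported) / sizeof(supported[0]); i++) {
        if (cdb[0] == supported[i]) {
            r.ok = true;
            r.sense_key = 0;
            r.asc = 0;
        }
    }
    return r;
}

/*
 * Initiator side: send the command and treat it as unsupported only if
 * the target rejects the opcode itself. No feature bits are involved.
 */
static bool initiator_probe(uint8_t opcode)
{
    uint8_t cdb[10] = { opcode };
    struct scsi_result r = toy_target_execute(cdb, sizeof(cdb));

    if (r.ok)
        return true;
    return !(r.sense_key == SENSE_KEY_ILLEGAL_REQUEST &&
             r.asc == ASC_INVALID_OPCODE);
}
```

    The point of the design is that the transport never has to enumerate
    features: a new command only needs a target that answers it.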




  • 3.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 15:10
    On 19/08/2013 17:03, James Bottomley wrote:
    > On Mon, 2013-08-19 at 11:55 +0200, Paolo Bonzini wrote:
    >> This patch introduces two improvements to writeback cache handling
    >> in the virtio-blk spec.
    >>
    >> 1) The VIRTIO_BLK_F_FLUSH feature is renamed to VIRTIO_BLK_F_WCE, and
    >> QEMU's behavior is documented explicitly as part of the spec: the host
    >> negotiates the feature only if its cache is writeback. The obvious dual
    >> requirement is imposed on the guest: it should negotiate the feature
    >> only if it is able to send flushes. And in order to protect against
    >> data loss, the spec now mandates that the host operates in writethrough
    >> mode if the guest does not negotiate VIRTIO_BLK_F_WCE (this behavior
    >> was already _allowed_ by the spec so far). This can change with every
    >> reset of course; typically the BIOS will run as writethrough, while the
    >> "main" OS will run in writeback mode. This is a backwards-compatible
    >> refinement geared towards old or limited guests, so there is no need
    >> for a new feature bit.
    >>
    >> 2) a second feature is added, VIRTIO_BLK_F_CONFIG_WCE, that provides
    >> the same information in the configuration. This will enable the driver
    >> to modify the write-cache setting at runtime (via sysfs for Linux, via
    >> MODE SELECT for Windows).
    >
    > I've got to say this looks cockeyed; you're effectively translating SCSI
    > commands into your own command set:

    Note that this is only for Windows, because all block drivers in Windows
    must speak SCSI. Microsoft provides an ATA->SCSI translation layer,
    while for everything else you're on your own. For what it's worth, Xen
    paravirtualized drivers for Windows have to do the same dance.

    On Linux, there's no MODE SELECT involved, as the commit message says.

    > the way you're setting this up you
    > have to modify the driver for every spec feature you support

    Yes, that's known and that's one of the reasons why virtio-scsi was born.

    > and have
    > some sort of elaborate protocol to identify the supported feature set.

    It's not elaborate (it's just a feature bit), but that's indeed another
    of the reasons why virtio-scsi was born.

    > What does this buy you?

    Nothing, which is why virtio-scsi does it...

    > SCSI is already a packet transport protocol.
    > Just transport the SCSI commands over the virtio-interface and let the
    > server reply (with I don't know what you're talking about if the command
    > isn't supported). This is the way SCSI probes real devices, so if you do
    > it this way there's no need for the elaborate protocol identification
    > and SCSI will work out what the supported feature set is.

    ... exactly this way.

    Paolo



  • 4.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 15:25
    On Mon, Aug 19, 2013 at 05:09:37PM +0200, Paolo Bonzini wrote:
    > On 19/08/2013 17:03, James Bottomley wrote:
    > > On Mon, 2013-08-19 at 11:55 +0200, Paolo Bonzini wrote:
    > >> This patch introduces two improvements to writeback cache handling
    > >> in the virtio-blk spec.
    > >>
    > >> 1) The VIRTIO_BLK_F_FLUSH feature is renamed to VIRTIO_BLK_F_WCE, and
    > >> QEMU's behavior is documented explicitly as part of the spec: the host
    > >> negotiates the feature only if its cache is writeback. The obvious dual
    > >> requirement is imposed on the guest: it should negotiate the feature
    > >> only if it is able to send flushes. And in order to protect against
    > >> data loss, the spec now mandates that the host operates in writethrough
    > >> mode if the guest does not negotiate VIRTIO_BLK_F_WCE (this behavior
    > >> was already _allowed_ by the spec so far). This can change with every
    > >> reset of course; typically the BIOS will run as writethrough, while the
    > >> "main" OS will run in writeback mode. This is a backwards-compatible
    > >> refinement geared towards old or limited guests, so there is no need
    > >> for a new feature bit.
    > >>
    > >> 2) a second feature is added, VIRTIO_BLK_F_CONFIG_WCE, that provides
    > >> the same information in the configuration. This will enable the driver
    > >> to modify the write-cache setting at runtime (via sysfs for Linux, via
    > >> MODE SELECT for Windows).
    > >
    > > I've got to say this looks cockeyed; you're effectively translating SCSI
    > > commands into your own command set:
    >
    > Note that this is only for Windows, because all block drivers in Windows
    > must speak SCSI. Microsoft provides an ATA->SCSI translation layer,
    > while for everything else you're on your own. For what it's worth, Xen
    > paravirtualized drivers for Windows have to do the same dance.
    >
    > On Linux, there's no MODE SELECT involved, as the commit message says.
    >
    > > the way you're setting this up you
    > > have to modify the driver for every spec feature you support
    >
    > Yes, that's known and that's one of the reasons why virtio-scsi was born.
    >
    > > and have
    > > some sort of elaborate protocol to identify the supported feature set.
    >
    > It's not elaborate (it's just a feature bit), but that's indeed another
    > of the reasons why virtio-scsi was born.
    >
    > > What does this buy you?
    >
    > Nothing, which is why virtio-scsi does it...
    >
    > > SCSI is already a packet transport protocol.
    > > Just transport the SCSI commands over the virtio-interface and let the
    > > server reply (with I don't know what you're talking about if the command
    > > isn't supported). This is the way SCSI probes real devices, so if you do
    > > it this way there's no need for the elaborate protocol identification
    > > and SCSI will work out what the supported feature set is.
    >
    > ... exactly this way.
    >
    > Paolo

    Hmm can VIRTIO_BLK_F_SCSI work for this somehow?


    > This publicly archived list offers a means to provide input to the
    > OASIS Virtual I/O Device (VIRTIO) TC.
    >
    > In order to verify user consent to the Feedback License terms and
    > to minimize spam in the list archive, subscription is required
    > before posting.
    >
    > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
    > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
    > List help: virtio-comment-help@lists.oasis-open.org
    > List archive: https://lists.oasis-open.org/archives/virtio-comment/
    > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
    > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
    > Committee: https://www.oasis-open.org/committees/virtio/
    > Join OASIS: https://www.oasis-open.org/join/



  • 5.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 15:31
    On Mon, 2013-08-19 at 18:25 +0300, Michael S. Tsirkin wrote:
    > On Mon, Aug 19, 2013 at 05:09:37PM +0200, Paolo Bonzini wrote:
    > > On 19/08/2013 17:03, James Bottomley wrote:
    > > > On Mon, 2013-08-19 at 11:55 +0200, Paolo Bonzini wrote:
    > > >> This patch introduces two improvements to writeback cache handling
    > > >> in the virtio-blk spec.
    > > >>
    > > >> 1) The VIRTIO_BLK_F_FLUSH feature is renamed to VIRTIO_BLK_F_WCE, and
    > > >> QEMU's behavior is documented explicitly as part of the spec: the host
    > > >> negotiates the feature only if its cache is writeback. The obvious dual
    > > >> requirement is imposed on the guest: it should negotiate the feature
    > > >> only if it is able to send flushes. And in order to protect against
    > > >> data loss, the spec now mandates that the host operates in writethrough
    > > >> mode if the guest does not negotiate VIRTIO_BLK_F_WCE (this behavior
    > > >> was already _allowed_ by the spec so far). This can change with every
    > > >> reset of course; typically the BIOS will run as writethrough, while the
    > > >> "main" OS will run in writeback mode. This is a backwards-compatible
    > > >> refinement geared towards old or limited guests, so there is no need
    > > >> for a new feature bit.
    > > >>
    > > >> 2) a second feature is added, VIRTIO_BLK_F_CONFIG_WCE, that provides
    > > >> the same information in the configuration. This will enable the driver
    > > >> to modify the write-cache setting at runtime (via sysfs for Linux, via
    > > >> MODE SELECT for Windows).
    > > >
    > > > I've got to say this looks cockeyed; you're effectively translating SCSI
    > > > commands into your own command set:
    > >
    > > Note that this is only for Windows, because all block drivers in Windows
    > > must speak SCSI. Microsoft provides an ATA->SCSI translation layer,
    > > while for everything else you're on your own. For what it's worth, Xen
    > > paravirtualized drivers for Windows have to do the same dance.
    > >
    > > On Linux, there's no MODE SELECT involved, as the commit message says.
    > >
    > > > the way you're setting this up you
    > > > have to modify the driver for every spec feature you support
    > >
    > > Yes, that's known and that's one of the reasons why virtio-scsi was born.
    > >
    > > > and have
    > > > some sort of elaborate protocol to identify the supported feature set.
    > >
    > > It's not elaborate (it's just a feature bit), but that's indeed another
    > > of the reasons why virtio-scsi was born.
    > >
    > > > What does this buy you?
    > >
    > > Nothing, which is why virtio-scsi does it...
    > >
    > > > SCSI is already a packet transport protocol.
    > > > Just transport the SCSI commands over the virtio-interface and let the
    > > > server reply (with I don't know what you're talking about if the command
    > > > isn't supported). This is the way SCSI probes real devices, so if you do
    > > > it this way there's no need for the elaborate protocol identification
    > > > and SCSI will work out what the supported feature set is.
    > >
    > > ... exactly this way.
    > >
    > > Paolo
    >
    > Hmm can VIRTIO_BLK_F_SCSI work for this somehow?

    Well, yes, effectively. You only have two implemented VIRTIO drivers:
    virtio-scsi and virtio-blk, both of which emulate packet transport
    protocols. Block is much more problematic because it's not really
    designed to be a wire protocol unlike SCSI. However, what you should be
    doing is mirroring the REQ_ protocol and have a side handshake to set
    the capability flags. That will ensure that all block features just
    work (tm) in the same way. For standards purposes, I'm not really sure
    how to proceed on this but I believe linux will be the only implementor
    of virtio-blk, right, so it probably doesn't matter.

    James




  • 6.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 15:47
    On 19/08/2013 17:31, James Bottomley wrote:
    > Well, yes, effectively. You only have two implemented VIRTIO drivers:
    > virtio-scsi and virtio-blk, both of which emulate packet transport
    > protocols. Block is much more problematic because it's not really
    > designed to be a wire protocol unlike SCSI.

    SCSI is an RPC protocol, but the wire protocol is obviously left to the
    transport. So in this respect block and SCSI aren't really like night
    and day. The difference is that the SCSI RPC is very small, and the
    complicated/evolving parts are at a different level, so the host and
    guest code that implements the virtio-scsi spec is very small.

    Those complicated/evolving parts are _still_ there, so the code needed
    in the host to implement virtio-scsi is indeed much larger than
    virtio-blk. But it's a non-issue for a non-toy VMM because that code is
    needed anyway, for example to support USB storage.

    And piggybacking on SCSI means that whenever a new feature comes along,
    someone else is already doing the work to integrate into the Linux block
    layer, to use it, to export it to userspace, etc. This typically
    happens before hosts implement the feature, so us virt people get a
    large part of the work for free; we only need to implement the feature
    in the host, because these features rarely if ever affect the SCSI RPC
    interface. Instead, adding the smallest feature to virtio-blk always
    necessitates updates to both the spec and the guest drivers, in addition
    to the VMM (typically QEMU).

    This is not hypothetical, it has already happened for discard and it
    will happen sometime for write-zeroes.

    > However, what you should be
    > doing is mirroring the REQ_ protocol and have a side handshake to set
    > the capability flags. That will ensure that all block features just
    > work (tm) in the same way.

    That's roughly what virtio-blk does (though it was not kept up-to-date
    with discard and write-zeroes).

    > For standards purposes, I'm not really sure
    > how to proceed on this but I believe linux will be the only implementor
    > of virtio-blk, right, so it probably doesn't matter.

    There are Windows drivers for virtio-blk.

    Paolo



  • 7.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 16:44
    On Mon, 2013-08-19 at 17:47 +0200, Paolo Bonzini wrote:
    > On 19/08/2013 17:31, James Bottomley wrote:
    > > Well, yes, effectively. You only have two implemented VIRTIO drivers:
    > > virtio-scsi and virtio-blk, both of which emulate packet transport
    > > protocols. Block is much more problematic because it's not really
    > > designed to be a wire protocol unlike SCSI.
    >
    > SCSI is an RPC protocol, but the wire protocol is obviously left to the
    > transport.

    I don't understand your point here: If you think the transports modify
    the commands, that's not correct. The job of the transport is to get
    the command from initiator to target ... it's just that, a transport
    layer; exactly like RPC or TCP.

    > So in this respect block and SCSI aren't really like night
    > and day.

    This is what I don't get. They could both be implemented as transport
    encapsulations of an existing command set.

    > The difference is that the SCSI RPC is very small, and the
    > complicated/evolving parts are at a different level, so the host and
    > guest code that implements the virtio-scsi spec is very small.

    > Those complicated/evolving parts are _still_ there, so the code needed
    > in the host to implement virtio-scsi is indeed much larger than
    > virtio-blk. But it's a non-issue for a non-toy VMM because that code is
    > needed anyway, for example to support USB storage.

    It might be now, but once you've done the feature creep in the
    virtio-blk spec to support all of our user visible features (DIF,
    discard, etc.) you'll probably find that the need to translate the
    commands causes virtio-blk to grow faster than virtio-scsi.

    > And piggybacking on SCSI means that whenever a new feature comes along,
    > someone else is already doing the work to integrate into the Linux block
    > layer, to use it, to export it to userspace, etc. This typically
    > happens before hosts implement the feature, so us virt people get a
    > large part of the work for free; we only need to implement the feature
    > in the host, because these features rarely if ever affect the SCSI RPC
    > interface. Instead, adding the smallest feature to virtio-blk always
    > necessitates updates to both the spec and the guest drivers, in addition
    > to the VMM (typically QEMU).

    Spec, possibly, since someone needs to be told about commands added to
    block, but it shouldn't need modification of correctly implemented
    drivers. If the encapsulation were done right, the client would just
    send the correct commands, so no code growth. Implementing them as a
    pass through is a nop as well, so the only problem case is non-native
    and there, you'll find the code growth is just slightly less than what
    you have abstracting all the commands.

    > This is not hypothetical, it has already happened for discard and it
    > will happen sometime for write-zeroes.
    >
    > > However, what you should be
    > > doing is mirroring the REQ_ protocol and have a side handshake to set
    > > the capability flags. That will ensure that all block features just
    > > work (tm) in the same way.
    >
    > That's roughly what virtio-blk does (though it was not kept up-to-date
    > with discard and write-zeroes).

    But this is illustrative of the point. When the block layer is patched
    to add more stuff (which has been happening quite a lot lately), you
    have to rev your spec and update all the guest and host drivers. This
    is one of the reasons for the current SCSI architecture model: not to
    have to do this.

    > > For standards purposes, I'm not really sure
    > > how to proceed on this but I believe linux will be the only implementor
    > > of virtio-blk, right, so it probably doesn't matter.
    >
    > There are Windows drivers for virtio-blk.

    Which are thus effectively emulator drivers.

    James




  • 8.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 18:00
    On 19/08/2013 18:44, James Bottomley wrote:
    >> The difference is that the SCSI RPC is very small, and the
    >> complicated/evolving parts are at a different level, so the host and
    >> guest code that implements the virtio-scsi spec is very small.
    >
    >> Those complicated/evolving parts are _still_ there, so the code needed
    >> in the host to implement virtio-scsi is indeed much larger than
    >> virtio-blk. But it's a non-issue for a non-toy VMM because that code is
    >> needed anyway, for example to support USB storage.
    >
    > It might be now, but once you've done the feature creep in the
    > virtio-blk spec to support all of our user visible features (DIF,
    > discard, etc.) you'll probably find that the need to translate the
    > commands causes virtio-blk to grow faster than virtio-scsi.

    Oh, definitely! In fact, we've decided not to do much feature-creep in
    virtio-blk.

    There was none after this patch. Rusty said discard would be borderline
    but acceptable, but he already wasn't sure about write-zeroes.

    >> And piggybacking on SCSI means that whenever a new feature comes along,
    >> someone else is already doing the work to integrate into the Linux block
    >> layer, to use it, to export it to userspace, etc. This typically
    >> happens before hosts implement the feature, so us virt people get a
    >> large part of the work for free; we only need to implement the feature
    >> in the host, because these features rarely if ever affect the SCSI RPC
    >> interface. Instead, adding the smallest feature to virtio-blk always
    >> necessitates updates to both the spec and the guest drivers, in addition
    >> to the VMM (typically QEMU).
    >
    > Spec, possibly, since someone needs to be told about commands added to
    > block, but it shouldn't need modification of correctly implemented
    > drivers. If the encapsulation were done right, the client would just
    > send the correct commands, so no code growth.

    Ah, I see what you mean now. It's a different channel to describe
    features. That would be a virtio-blk v2 basically. It would have made
    sense if done in the beginning, but not now.

    > But this is illustrative of the point. When the block layer is patched
    > to add more stuff (which has been happening quite a lot lately), you
    > have to rev your spec and update all the guest and host drivers. This
    > is one of the reasons for the current SCSI architecture model: not to
    > have to do this.
    >
    >>> For standards purposes, I'm not really sure
    >>> how to proceed on this but I believe linux will be the only implementor
    >>> of virtio-blk, right, so it probably doesn't matter.
    >>
    >> There are Windows drivers for virtio-blk.
    >
    > Which are thus effectively emulator drivers.

    Yes.

    But honestly I'm not sure what your point is... I'm submitting this
    patch at Rusty's request in order to clean up IP issues. It's already
    part of the spec, of QEMU, of the guest drivers. I don't plan on any
    other feature creep in virtio-blk, nor, fwiw, does anyone else.

    Paolo



  • 9.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 19:10
    On Mon, Aug 19, 2013 at 08:00:27PM +0200, Paolo Bonzini wrote:
    > On 19/08/2013 18:44, James Bottomley wrote:
    > >> The difference is that the SCSI RPC is very small, and the
    > >> complicated/evolving parts are at a different level, so the host and
    > >> guest code that implements the virtio-scsi spec is very small.
    > >
    > >> Those complicated/evolving parts are _still_ there, so the code needed
    > >> in the host to implement virtio-scsi is indeed much larger than
    > >> virtio-blk. But it's a non-issue for a non-toy VMM because that code is
    > >> needed anyway, for example to support USB storage.
    > >
    > > It might be now, but once you've done the feature creep in the
    > > virtio-blk spec to support all of our user visible features (DIF,
    > > discard, etc.) you'll probably find that the need to translate the
    > > commands causes virtio-blk to grow faster than virtio-scsi.
    >
    > Oh, definitely! In fact, we've decided not to do much feature-creep in
    > virtio-blk.
    >
    > There was none after this patch. Rusty said discard would be borderline
    > but acceptable, but he already wasn't sure about write-zeroes.
    >
    > >> And piggybacking on SCSI means that whenever a new feature comes along,
    > >> someone else is already doing the work to integrate into the Linux block
    > >> layer, to use it, to export it to userspace, etc. This typically
    > >> happens before hosts implement the feature, so us virt people get a
    > >> large part of the work for free; we only need to implement the feature
    > >> in the host, because these features rarely if ever affect the SCSI RPC
    > >> interface. Instead, adding the smallest feature to virtio-blk always
    > >> necessitates updates to both the spec and the guest drivers, in addition
    > >> to the VMM (typically QEMU).
    > >
    > > Spec, possibly, since someone needs to be told about commands added to
    > > block, but it shouldn't need modification of correctly implemented
    > > drivers. If the encapsulation were done right, the client would just
    > > send the correct commands, so no code growth.
    >
    > Ah, I see what you mean now. It's a different channel to describe
    > features. That would be a virtio-blk v2 basically. It would have made
    > sense if done in the beginning, but not now.

    It's worth considering. We can deprecate the WCE bit if we want to.
    I don't think that's virtio-blk v2 really.

    > > But this is illustrative of the point. When the block layer is patched
    > > to add more stuff (which has been happening quite a lot lately), you
    > > have to rev your spec and update all the guest and host drivers. This
    > > is one of the reasons for the current SCSI architecture model: not to
    > > have to do this.
    > >
    > >>> For standards purposes, I'm not really sure
    > >>> how to proceed on this but I believe linux will be the only implementor
    > >>> of virtio-blk, right, so it probably doesn't matter.
    > >>
    > >> There are Windows drivers for virtio-blk.
    > >
    > > Which are thus effectively emulator drivers.
    >
    > Yes.
    >
    > But honestly I'm not sure what your point is... I'm submitting this
    > patch at Rusty's request in order to clean up IP issues. It's already
    > part of the spec, of QEMU, of the guest drivers. I don't plan on any
    > other feature creep in virtio-blk, nor, fwiw, does anyone else.
    >
    > Paolo

    virtio-blk is widely used. I wouldn't write it off just yet.
    So I think these comments are valuable.

    For example, this made me realize that even when scsi command
    pass-through is enabled, we still take the cue on WCE support from the
    host feature bit. This is wrong, isn't it? We really should send scsi
    commands to the device to figure out and set/clear it.

    I also note that virtio-blk simply has custom sysfs attributes
    for WCE control and status. Is this normal?
    James, are there tools that send scsi commands to block devices
    to control WCE?

    --
    MST



  • 10.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-20-2013 08:23
    On 19/08/2013 21:10, Michael S. Tsirkin wrote:
    > virtio-blk is widely used. I wouldn't write it off just yet.
    > So I think these comments are valuable.
    >
    > For example, this made me realize that even when scsi command
    > pass-through is enabled, we either take the clue on WCE support from
    > host feature bit. This is wrong, isn't it?

    No, it isn't. Very much the contrary, actually.

    > We really should send scsi
    > commands to the device to figure out and set/clear it.

    Absolutely not, because this would bypass the host kernel. Flush
    requests would not work properly.

    There is _no_ reason why a guest kernel should _ever_ use
    VIRTIO_BLK_F_SCSI, except if userspace tells it to (with the assumption
    that userspace knows what it's doing).

    In fact, deprecating VIRTIO_BLK_F_SCSI would be a very good idea. It is
    half-assed, as Rusty put it, confusing, and mostly useless. It's also
    one of the few cases where we rely on buffers to determine field sizes.

    > I also note that virtio-blk simply has custom sysfs attributes
    > for WCE control and status. Is this normal?

    Yes, the sysfs attribute is modeled after the one for SCSI disks.

    > James, are there tools that send scsi commands to block devices
    > to control WCE?

    You can use sg_raw, but it is not safe. Not coincidentally, the
    sg3_utils suite has a tool to send MODE SENSE (i.e. grab the raw state
    from the block device), but no tool to send MODE SELECT.

    Paolo



  • 11.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-20-2013 14:13
    On Mon, 2013-08-19 at 22:10 +0300, Michael S. Tsirkin wrote:
    > On Mon, Aug 19, 2013 at 08:00:27PM +0200, Paolo Bonzini wrote:
    > > Il 19/08/2013 18:44, James Bottomley ha scritto:
    > > >> And piggybacking on SCSI means that whenever a new feature comes along,
    > > >> someone else is already doing the work to integrate into the Linux block
    > > >> layer, to use it, to export it to userspace, etc. This typically
    > > >> happens before hosts implement the feature, so us virt people get a
    > > >> large part of the work for free; we only need to implement the feature
    > > >> in the host, because these features rarely if ever affect the SCSI RPC
    > > >> interface. Instead, adding the smallest feature to virtio-blk always
    > > >> necessitates updates to both the spec and the guest drivers, in addition
    > > >> to the VMM (typically QEMU).
    > > >
    > > > Spec, possibly, since someone needs to be told about commands added to
    > > > block, but it shouldn't need modification of correctly implemented
    > > > drivers. If the encapsulation were done right, the client would just
    > > > send the correct commands, so no code growth.
    > >
    > > Ah, I see what you mean now. It's a different channel to describe
    > > features. That would be a virtio-blk v2 basically. It would have made
    > > sense if done in the beginning, but not now.
    >
    > It's worth considering. We can deprecate the WCE bit if we want to.
    > I don't think that's virtio-blk v2 really.

    It doesn't even need to be v2. You can keep what you've already done,
    just add future features compatibly with the underlying protocol.

    > > > But this is illustrative of the point. When the block layer is patched
    > > > to add more stuff (which has been happening quite a lot lately), you
    > > > have to rev your spec and update all the guest and host drivers. This
    > > > is one of the reasons for the current SCSI architecture model: not to
    > > > have to do this.
    > > >
    > > >>> For standards purposes, I'm not really sure
    > > >>> how to proceed on this but I believe linux will be the only implementor
    > > >>> of virtio-blk, right, so it probably doesn't matter.
    > > >>
    > > >> There are Windows drivers for virtio-blk.
    > > >
    > > > Which are thus effectively emulator drivers.
    > >
    > > Yes.
    > >
    > > But honestly I'm not sure what your point is... I'm submitting this
    > > patch at Rusty's request in order to clean up IP issues. It's already
    > > part of the spec, of QEMU, of the guest drivers. I don't plan on any
    > > other feature creep in virtio-blk, and fwiw neither does anyone else.
    > >
    > > Paolo
    >
    > virtio-blk is widely used. I wouldn't write it off just yet.
    > So I think these comments are valuable.
    >
    > For example, this made me realize that even when scsi command
    > pass-through is enabled, we either take the clue on WCE support from
    > host feature bit. This is wrong, isn't it? We really should send scsi
    > commands to the device to figure out and set/clear it.

    Right: this illustrates how the incremental addition can be done: do
    pass through on the mode sense SCSI uses to decide this and discovery
    can proceed normally. Older virtio says I don't know what you're
    talking about and SCSI assumes write through ... the same will work for
    all the other features.

    > I also note that virtio-blk simply has custom sysfs attributes
    > for WCE control and status. Is this normal?
    > James, are there tools that send scsi commands to block devices
    > to control WCE?

    Yes, it's actually very simple

    echo "cache type" > /sys/class/scsi_disk/<dev>/cache_type

    Assuming the cache type can be changed (not every disk allows this),
    then the sd driver will send the mode select to change it. Note that
    this change is volatile: the disk will revert to default cache type
    once the power is removed. To make it permanent, you need to update the
    "saved" mode page using sg_wr_mode --save

    James





  • 12.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-20-2013 14:35
    Il 20/08/2013 16:13, James Bottomley ha scritto:
    > > For example, this made me realize that even when scsi command
    > > pass-through is enabled, we either take the clue on WCE support from
    > > host feature bit. This is wrong, isn't it? We really should send scsi
    > > commands to the device to figure out and set/clear it.
    >
    > Right: this illustrates how the incremental addition can be done: do
    > pass through on the mode sense SCSI uses to decide this and discovery
    > can proceed normally. Older virtio says I don't know what you're
    > talking about and SCSI assumes write through ... the same will work for
    > all the other features.

    No, please don't!!!

    The host is _not_ supposed to have a SCSI target implementation for
    VIRTIO_BLK_F_SCSI's sake, no matter how simple!

    And the guest is not supposed to build CDBs if you are using virtio-blk
    (if you are writing a Windows driver, you have to interpret CDBs, but
    that's a different story).

    If you want CDBs, that's perfect---but then your device model should
    follow the virtio-scsi spec and reuse _all_ the SCSI code in the guest
    (virtio-scsi.c and sd.c).

    Do not try to resurrect the utterly broken VIRTIO_BLK_F_SCSI experiment
    that unluckily made it to the spec. And keep virtio-blk damn simple.
    If that means not adding a feature, do not add it.

    >> > I also note that virtio-blk simply has custom sysfs attributes
    >> > for WCE control and status. Is this normal?
    >> > James, are there tools that send scsi commands to block devices
    >> > to control WCE?
    > Yes, it's actually very simple
    >
    > echo "cache type" > /sys/class/scsi_disk/<dev>/cache_type

    Same for virtio-blk, though the change is always volatile.
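
    As a rough sketch of what that echo amounts to from a C program: the
    helper below writes a cache type string to a sysfs attribute. The
    function name set_cache_type and the choice to accept only
    "write through" and "write back" are assumptions made for this
    illustration (those are the two strings that, to my knowledge, both the
    SCSI disk and the virtio-blk attribute understand); the attribute path
    is supplied by the caller.

```c
#include <stdio.h>
#include <string.h>

/* Cache type strings accepted by this sketch (assumed common subset). */
static const char *const cache_types[] = {
    "write through",
    "write back",
};

/*
 * Write `type` to the sysfs attribute at `attr_path`.
 * Returns 0 on success, -1 if the string is not one of the accepted
 * values or the attribute cannot be written.
 */
int set_cache_type(const char *attr_path, const char *type)
{
    size_t i, n = sizeof(cache_types) / sizeof(cache_types[0]);
    FILE *f;
    int ok;

    /* Reject unknown strings before touching the attribute. */
    for (i = 0; i < n; i++)
        if (strcmp(type, cache_types[i]) == 0)
            break;
    if (i == n)
        return -1;

    f = fopen(attr_path, "w");
    if (!f)
        return -1;

    ok = fputs(type, f) >= 0;
    /* sysfs reports store errors when the buffered write is flushed. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

    A caller would pass something like /sys/class/scsi_disk/<dev>/cache_type
    as attr_path. Whether the new mode survives a power cycle is up to the
    device, as discussed above.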

    Paolo




  • 13.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 15:35
    Il 19/08/2013 17:25, Michael S. Tsirkin ha scritto:
    > On Mon, Aug 19, 2013 at 05:09:37PM +0200, Paolo Bonzini wrote:
    >> Il 19/08/2013 17:03, James Bottomley ha scritto:
    >>> On Mon, 2013-08-19 at 11:55 +0200, Paolo Bonzini wrote:
    >>>> This patch introduces two improvements to writeback cache handling
    >>>> in the virtio-blk spec.
    >>>>
    >>>> 1) The VIRTIO_BLK_F_FLUSH feature is renamed to VIRTIO_BLK_F_WCE, and
    >>>> QEMU's behavior is documented explicitly as part of the spec: the host
    >>>> negotiates the feature only if its cache is writeback. The obvious dual
    >>>> requirement is imposed on the guest: it should negotiate the feature
    >>>> only if it is able to send flushes. And in order to protect against
    >>>> data loss, the spec now mandates that the host operates in writethrough
    >>>> mode if the guest does not negotiate VIRTIO_BLK_F_WCE (this behavior
    >>>> was already _allowed_ by the spec so far). This can change with every
    >>>> reset of course; typically the BIOS will run as writethrough, while the
    >>>> "main" OS will run in writeback mode. This is a backwards-compatible
    >>>> refinement geared towards old or limited guests, so there is no need
    >>>> for a new feature bit.
    >>>>
    >>>> 2) a second feature is added, VIRTIO_BLK_F_CONFIG_WCE, that provides
    >>>> the same information in the configuration. This will enable the driver
    >>>> to modify the write-cache setting at runtime (via sysfs for Linux, via
    >>>> MODE SELECT for Windows).
    >>>
    >>> I've got to say this looks cockeyed; you're effectively translating SCSI
    >>> commands into your own command set:
    >>
    >> Note that this is only for Windows, because all block drivers in Windows
    >> must speak SCSI. Microsoft provides an ATA->SCSI translation layer,
    >> while for everything else you're on your own. For what it's worth, Xen
    >> paravirtualized drivers for Windows have to do the same dance.
    >>
    >> On Linux, there's no MODE SELECT involved, as the commit message says.
    >>
    >>> the way you're setting this up you
    >>> have to modify the driver for every spec feature you support
    >>
    >> Yes, that's known and that's one of the reasons why virtio-scsi was born.
    >>
    >>> and have
    >>> some sort of elaborate protocol to identify the supported feature set.
    >>
    >> It's not elaborate (it's just a feature bit), but that's indeed another
    >> of the reasons why virtio-scsi was born.
    >>
    >>> What does this buy you?
    >>
    >> Nothing, which is why virtio-scsi does it...
    >>
    >>> SCSI is already a packet transport protocol.
    >>> Just transport the SCSI commands over the virtio-interface and let the
    >>> server reply (with I don't know what you're talking about if the command
    >>> isn't supported). This is the way SCSI probes real device, so if you do
    >>> it this way there's no need for the elaborate protocol identification
    >>> and SCSI will work out what the supported feature set is.
    >>
    >> ... exactly this way.
    >
    > Hmm can VIRTIO_BLK_F_SCSI work for this somehow?

    No, it cannot. First of all, it only works for virtio-blk devices
    backed by a physical disk (not a logical volume, not a partition, not a
    file). That's a feature, not a bug, since we do not want virtio-blk
    devices to carry a SCSI target implementation (QEMU's is 2.5 kSLOC).

    Second, any change you make through SCSI requests is not visible to the
    host. For example, if the guest kernel used SCSI requests to enable
    writeback caching, the host would not know about it and will not send
    flushes (thus a power loss will eat data).

    These reasons are why Windows drivers make no attempt to use
    VIRTIO_BLK_F_SCSI, they just ignore it and emulate the handful of
    commands that Windows actually uses (basically INQUIRY, READ CAPACITY,
    READ, WRITE, MODE SENSE, TEST UNIT READY, SYNCHRONIZE CACHE).
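
    To make the "emulate a handful of commands" approach concrete, here is
    a minimal sketch of how such a driver might classify incoming CDBs. It
    is not taken from any existing Windows driver: classify_cdb and the
    cdb_action enum are hypothetical names, and only the 6- and 10-byte
    opcode variants are shown. The opcode values are the standard SCSI ones
    and the request types are those of the virtio-blk spec.

```c
#include <stdint.h>

/* Request types from the virtio-blk spec. */
#define VIRTIO_BLK_T_IN    0  /* read */
#define VIRTIO_BLK_T_OUT   1  /* write */
#define VIRTIO_BLK_T_FLUSH 4  /* cache flush */

/* Standard SCSI opcodes (first byte of the CDB). */
#define SCSI_TEST_UNIT_READY   0x00
#define SCSI_INQUIRY           0x12
#define SCSI_MODE_SENSE_6      0x1a
#define SCSI_READ_CAPACITY_10  0x25
#define SCSI_READ_10           0x28
#define SCSI_WRITE_10          0x2a
#define SCSI_SYNCHRONIZE_CACHE 0x35

/* Hypothetical classification used only in this sketch. */
enum cdb_action {
    CDB_FORWARD,     /* becomes a virtio-blk request; *blk_type is set */
    CDB_EMULATE,     /* answered by the driver from config/feature state */
    CDB_UNSUPPORTED  /* rejected, e.g. with ILLEGAL REQUEST sense data */
};

/* Decide what to do with a CDB based on its opcode alone. */
enum cdb_action classify_cdb(const uint8_t *cdb, uint32_t *blk_type)
{
    switch (cdb[0]) {
    case SCSI_READ_10:
        *blk_type = VIRTIO_BLK_T_IN;
        return CDB_FORWARD;
    case SCSI_WRITE_10:
        *blk_type = VIRTIO_BLK_T_OUT;
        return CDB_FORWARD;
    case SCSI_SYNCHRONIZE_CACHE:
        /* Only meaningful if the flush feature was negotiated. */
        *blk_type = VIRTIO_BLK_T_FLUSH;
        return CDB_FORWARD;
    case SCSI_TEST_UNIT_READY:
    case SCSI_INQUIRY:
    case SCSI_MODE_SENSE_6:
    case SCSI_READ_CAPACITY_10:
        /* No SCSI target on the host: the driver builds the reply. */
        return CDB_EMULATE;
    default:
        return CDB_UNSUPPORTED;
    }
}
```

    The point of the sketch is that the CDB never leaves the guest: the host
    only ever sees plain virtio-blk reads, writes and flushes.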

    VIRTIO_BLK_F_SCSI is a huge hack. There are some cases where it's
    useful if you're stuck with virtio-blk, but if you need it it's probably
    a better idea to migrate to virtio-scsi.

    Paolo



  • 14.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 16:52
    On Mon, 2013-08-19 at 17:34 +0200, Paolo Bonzini wrote:
    > Il 19/08/2013 17:25, Michael S. Tsirkin ha scritto:
    > > On Mon, Aug 19, 2013 at 05:09:37PM +0200, Paolo Bonzini wrote:
    > >> Il 19/08/2013 17:03, James Bottomley ha scritto:
    > >>> On Mon, 2013-08-19 at 11:55 +0200, Paolo Bonzini wrote:
    > >>>> This patch introduces two improvements to writeback cache handling
    > >>>> in the virtio-blk spec.
    > >>>>
    > >>>> 1) The VIRTIO_BLK_F_FLUSH feature is renamed to VIRTIO_BLK_F_WCE, and
    > >>>> QEMU's behavior is documented explicitly as part of the spec: the host
    > >>>> negotiates the feature only if its cache is writeback. The obvious dual
    > >>>> requirement is imposed on the guest: it should negotiate the feature
    > >>>> only if it is able to send flushes. And in order to protect against
    > >>>> data loss, the spec now mandates that the host operates in writethrough
    > >>>> mode if the guest does not negotiate VIRTIO_BLK_F_WCE (this behavior
    > >>>> was already _allowed_ by the spec so far). This can change with every
    > >>>> reset of course; typically the BIOS will run as writethrough, while the
    > >>>> "main" OS will run in writeback mode. This is a backwards-compatible
    > >>>> refinement geared towards old or limited guests, so there is no need
    > >>>> for a new feature bit.
    > >>>>
    > >>>> 2) a second feature is added, VIRTIO_BLK_F_CONFIG_WCE, that provides
    > >>>> the same information in the configuration. This will enable the driver
    > >>>> to modify the write-cache setting at runtime (via sysfs for Linux, via
    > >>>> MODE SELECT for Windows).
    > >>>
    > >>> I've got to say this looks cockeyed; you're effectively translating SCSI
    > >>> commands into your own command set:
    > >>
    > >> Note that this is only for Windows, because all block drivers in Windows
    > >> must speak SCSI. Microsoft provides an ATA->SCSI translation layer,
    > >> while for everything else you're on your own. For what it's worth, Xen
    > >> paravirtualized drivers for Windows have to do the same dance.
    > >>
    > >> On Linux, there's no MODE SELECT involved, as the commit message says.
    > >>
    > >>> the way you're setting this up you
    > >>> have to modify the driver for every spec feature you support
    > >>
    > >> Yes, that's known and that's one of the reasons why virtio-scsi was born.
    > >>
    > >>> and have
    > >>> some sort of elaborate protocol to identify the supported feature set.
    > >>
    > >> It's not elaborate (it's just a feature bit), but that's indeed another
    > >> of the reasons why virtio-scsi was born.
    > >>
    > >>> What does this buy you?
    > >>
    > >> Nothing, which is why virtio-scsi does it...
    > >>
    > >>> SCSI is already a packet transport protocol.
    > >>> Just transport the SCSI commands over the virtio-interface and let the
    > >>> server reply (with I don't know what you're talking about if the command
    > >>> isn't supported). This is the way SCSI probes real device, so if you do
    > >>> it this way there's no need for the elaborate protocol identification
    > >>> and SCSI will work out what the supported feature set is.
    > >>
    > >> ... exactly this way.
    > >
    > > Hmm can VIRTIO_BLK_F_SCSI work for this somehow?
    >
    > No, it cannot. First of all, it only works for virtio-blk devices
    > backed by a physical disk (not a logical volume, not a partition, not a
    > file). That's a feature, not a bug, since we do not want virtio-blk
    > devices to carry a SCSI target implementation (QEMU's is 2.5 kSLOC).

    The unstated assumption above is that SCSI emulators are huge. This is
    bogus. The tiny USB key you probably have in your pocket at the moment
    that you bought for a tenner has a minimal SCSI emulator inside its rom.

    You don't need to emulate every command. Just those that provide the
    features you want.

    > Second, any change you make through SCSI requests is not visible to the
    > host. For example, if the guest kernel used SCSI requests to enable
    > writeback caching, the host would not know about it and will not send
    > flushes (thus a power loss will eat data).

    I'm a bit lost here. If you mean the native case, so host passes
    commands directly to underlying SCSI, then, of course, you're correct,
    the host driver doesn't snoop. But the command chain is unbroken from
    guest to physical device, so there's no data loss on power failure. If
    you mean the emulated case, then the host has to intercept the command.

    James

    > These reasons are why Windows drivers make no attempt to use
    > VIRTIO_BLK_F_SCSI, they just ignore it and emulate the handful of
    > commands that Windows actually uses (basically INQUIRY, READ CAPACITY,
    > READ, WRITE, MODE SENSE, TEST UNIT READY, SYNCHRONIZE CACHE).
    >
    > VIRTIO_BLK_F_SCSI is a huge hack. There are some cases where it's
    > useful if you're stuck with virtio-blk, but if you need it it's probably
    > a better idea to migrate to virtio-scsi.
    >
    > Paolo





  • 15.  Re: [virtio-comment] [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-19-2013 18:03
    Il 19/08/2013 18:51, James Bottomley ha scritto:
    >> No, it cannot. First of all, it only works for virtio-blk devices
    >> backed by a physical disk (not a logical volume, not a partition, not a
    >> file). That's a feature, not a bug, since we do not want virtio-blk
    >> devices to carry a SCSI target implementation (QEMU's is 2.5 kSLOC).
    >
    > The unstated assumption above is that SCSI emulators are huge. This is
    > bogus. The tiny USB key you probably have in your pocket at the moment
    > that you bought for a tenner has a minimal SCSI emulator inside its rom.

    And you know how many off-by-one errors or "liberal" implementations of
    the standards they contain, and how much code is in sd.c (or in USB
    quirks) to handle those problems.

    A simple virtio-scsi implementation could use a small SCSI emulator.
    But it could still be twice the size of virtio-blk, perhaps.

    > You don't need to emulate every command. Just those that provide the
    > features you want.
    >
    >> Second, any change you make through SCSI requests is not visible to the
    >> host. For example, if the guest kernel used SCSI requests to enable
    >> writeback caching, the host would not know about it and will not send
    >> flushes (thus a power loss will eat data).
    >
    > I'm a bit lost here. If you mean the native case, so host passes
    > commands directly to underlying SCSI, then, of course, you're correct,
    > the host driver doesn't snoop. But the command chain is unbroken from
    > guest to physical device, so there's no data loss on power failure. If
    > you mean the emulated case, then the host has to intercept the command.

    VIRTIO_BLK_F_SCSI is not meant to do emulation, since you can perfectly
    use virtio-scsi for that usecase.

    Paolo



  • 16.  Re: [PATCH 2/5] virtio-blk spec: writeback cache enable improvements

    Posted 08-21-2013 06:13
    Paolo Bonzini <pbonzini@redhat.com> writes:
    > This patch introduces two improvements to writeback cache handling
    > in the virtio-blk spec.
    >
    > 1) The VIRTIO_BLK_F_FLUSH feature is renamed to VIRTIO_BLK_F_WCE, and
    > QEMU's behavior is documented explicitly as part of the spec: the host
    > negotiates the feature only if its cache is writeback. The obvious dual
    > requirement is imposed on the guest: it should negotiate the feature
    > only if it is able to send flushes. And in order to protect against
    > data loss, the spec now mandates that the host operates in writethrough
    > mode if the guest does not negotiate VIRTIO_BLK_F_WCE (this behavior
    > was already _allowed_ by the spec so far). This can change with every
    > reset of course; typically the BIOS will run as writethrough, while the
    > "main" OS will run in writeback mode. This is a backwards-compatible
    > refinement geared towards old or limited guests, so there is no need
    > for a new feature bit.
    >
    > 2) a second feature is added, VIRTIO_BLK_F_CONFIG_WCE, that provides
    > the same information in the configuration. This will enable the driver
    > to modify the write-cache setting at runtime (via sysfs for Linux, via
    > MODE SELECT for Windows).
    >
    > Patches for QEMU and Linux will come soonish.
    >
    > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

    Here's what it looks like against the text version:

    commit 3ee1f0b769c1d9e8ff5edcda11ff52cb41ba0139
    Author: Rusty Russell <rusty@au1.ibm.com>
    Date: Wed Aug 21 15:37:33 2013 +0930

    2.5.2: writeback cache enable improvements

    From: Paolo Bonzini <pbonzini@redhat.com>
    This patch introduces two improvements to writeback cache handling
    in the virtio-blk spec.

    1) The VIRTIO_BLK_F_FLUSH feature is renamed to VIRTIO_BLK_F_WCE, and
    QEMU's behavior is documented explicitly as part of the spec: the host
    negotiates the feature only if its cache is writeback. The obvious dual
    requirement is imposed on the guest: it should negotiate the feature
    only if it is able to send flushes. And in order to protect against
    data loss, the spec now mandates that the host operates in writethrough
    mode if the guest does not negotiate VIRTIO_BLK_F_WCE (this behavior
    was already _allowed_ by the spec so far). This can change with every
    reset of course; typically the BIOS will run as writethrough, while the
    "main" OS will run in writeback mode. This is a backwards-compatible
    refinement geared towards old or limited guests, so there is no need
    for a new feature bit.

    2) a second feature is added, VIRTIO_BLK_F_CONFIG_WCE, that provides
    the same information in the configuration. This will enable the driver
    to modify the write-cache setting at runtime (via sysfs for Linux, via
    MODE SELECT for Windows).

    Patches for QEMU and Linux will come soonish.

    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

    See: https://lists.oasis-open.org/archives/virtio-comment/201308/msg00029.html

    Signed-off-by: Rusty Russell <rusty@au1.ibm.com>

    diff --git a/virtio-v1.0-wd01-part1-specification.txt b/virtio-v1.0-wd01-part1-specification.txt
    index f989630..495dd98 100644
    --- a/virtio-v1.0-wd01-part1-specification.txt
    +++ b/virtio-v1.0-wd01-part1-specification.txt
    @@ -1539,11 +1539,16 @@ device except where noted.

    VIRTIO_BLK_F_SCSI (7) Device supports scsi packet commands.

    - VIRTIO_BLK_F_FLUSH (9) Cache flush command support.
    + VIRTIO_BLK_F_WCE (9) Device cache starts in writeback mode after reset.
    + Guests should not negotiate this feature unless they are capable of sending
    + VIRTIO_BLK_T_FLUSH commands.

    VIRTIO_BLK_F_TOPOLOGY (10) Device exports information on optimal I/O
    alignment.

    + VIRTIO_BLK_F_CONFIG_WCE (11) Device can toggle its cache between writeback
    + and writethrough modes.
    +
    Device configuration layout The capacity of the device
    (expressed in 512-byte sectors) is always present. The
    availability of the others all depend on various feature bits
    @@ -1565,6 +1570,7 @@ device except where noted.
    u16 min_io_size;
    u32 opt_io_size;
    } topology;
    + u8 writeback;
    };

    2.5.2.4 Device Initialization
    @@ -1588,6 +1594,14 @@ device except where noted.
    I/O lengths for the driver to use. This also does not affect the units
    in the protocol, only performance.

    +5. The cache mode should be read from the writeback field of the configuration
    + if the VIRTIO_BLK_F_CONFIG_WCE feature is available; the driver can also
    + write to the field in order to toggle the cache between writethrough (0)
    + and writeback (1) mode.
    + If the feature is not available, the driver can instead look at the result
    + of negotiating VIRTIO_BLK_F_WCE: the cache will be in writeback mode after
    + reset if and only if VIRTIO_BLK_F_WCE is negotiated[30]
    +
    2.5.2.5 Device Operation
    ------------------------

    @@ -2885,3 +2899,6 @@ when a port is closed or hot-unplugged.

    [29] As updates to configuration space are not atomic, this field
    isn't particularly reliable, but can be used to diagnose buggy guests.
    +
    +[30] Until version 1.1, QEMU remained in writeback mode even after a guest announced
    + lack of support for VIRTIO_BLK_F_FLUSH.
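
    As an illustration of initialization step 5 above, a driver could work
    out the current cache mode roughly as follows. This is a sketch under
    assumptions, not code from any existing driver: has_feature and
    blk_cache_is_writeback are made-up names, the negotiated features are
    assumed to fit in a 32-bit mask, and any non-zero value of the writeback
    field is treated as writeback. The bit numbers are the ones given in the
    patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as listed in the patch. */
#define VIRTIO_BLK_F_WCE        9
#define VIRTIO_BLK_F_CONFIG_WCE 11

/* True if feature `bit` is set in the negotiated feature mask. */
bool has_feature(uint32_t features, unsigned int bit)
{
    return ((features >> bit) & 1u) != 0;
}

/*
 * Returns true if the device cache is in writeback mode, false if it is
 * in writethrough mode.
 *
 * negotiated:       features accepted by both device and driver
 * config_writeback: value read from the `writeback` config field; only
 *                   consulted when VIRTIO_BLK_F_CONFIG_WCE is negotiated
 */
bool blk_cache_is_writeback(uint32_t negotiated, uint8_t config_writeback)
{
    /* With CONFIG_WCE the config field is authoritative: 0 = writethrough. */
    if (has_feature(negotiated, VIRTIO_BLK_F_CONFIG_WCE))
        return config_writeback != 0;

    /* Otherwise the mode after reset follows the WCE negotiation result. */
    return has_feature(negotiated, VIRTIO_BLK_F_WCE);
}
```

    A guest that cannot send VIRTIO_BLK_T_FLUSH simply never sets bit 9, so
    the second branch reports writethrough, which matches the behavior the
    patch requires of the host.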