OASIS Virtual I/O Device (VIRTIO) TC

 View Only
Expand all | Collapse all

[PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

  • 1.  [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-01-2015 22:41
    VIRTIO_BLK_F_CONFIG_WCE is important in order to achieve good performance (up to 2x, though more realistically +30-40%) in latency-bound workloads. However, it was removed by mistake together with VIRTIO_BLK_F_FLUSH. Restore it, clarifying the definition of writeback and writethrough cache according to what is commonly done by high-end storage. In addition, fix two occurrences where the driver was mentioned rather than the device, and clarify the influence of the legacy VIRTIO_BLK_F_FLUSH feature on caching. There is a difference between the legacy and virtio1 behavior of VIRTIO_BLK_F_CONFIG_WCE. In legacy devices, the writeback field was only available if VIRTIO_BLK_F_CONFIG_WCE was negotiated, while now it is always available. However, the VIRTIO_BLK_F_CONFIG_WCE feature remains and describes whether the field is writable. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- content.tex 45 ++++++++++++++++++++++++++++++++------------- 1 file changed, 32 insertions(+), 13 deletions(-) diff --git a/content.tex b/content.tex index 1efdcc8..7c00fd7 100644 --- a/content.tex +++ b/content.tex @@ -3725,6 +3725,9 @@ device except where noted. item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O alignment. + +item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback + and writethrough modes. end{description} subsubsection{Legacy Interface: Feature bits}label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits} @@ -3735,9 +3738,6 @@ device except where noted. item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands. item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support. - -item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback - and writethrough modes. end{description} VIRTIO_BLK_F_FLUSH was also called VIRTIO_BLK_F_WCE: Legacy drivers @@ -3746,9 +3746,9 @@ VIRTIO_BLK_T_FLUSH commands. subsection{Device configuration layout}label{sec:Device Types / Block Device / Device configuration layout} -The field{capacity} of the device (expressed in 512-byte sectors) is always -present. The availability of the others all depend on various feature -bits as indicated above. +The field{capacity} of the device (expressed in 512-byte sectors) and +caching mode (field{writeback}) are always present. The availability +of the others all depend on various feature bits as indicated above. egin{lstlisting} struct virtio_blk_config { @@ -3771,7 +3771,7 @@ struct virtio_blk_config { // optimal (suggested maximum) I/O size in blocks le32 opt_io_size; } topology; - u8 reserved; + u8 writeback; }; end{lstlisting} @@ -3801,15 +3801,20 @@ according to the native endian of the guest rather than field{topology} struct can be read to determine the physical block size and optimal I/O lengths for the driver to use. This also does not affect the units in the protocol, only performance. + +item The cache mode can be read from field{writeback}. 0 corresponds + to a writethrough cache, 1 to a writeback cachefootnote{Consistent + with
    ef{sec:Device Types / Block Device / Device Operation}, + a writethrough cache can be defined broadly as a cache that commits + writes to non-volatile storage before reporting their completion. + For example, a battery-backed writeback cache actually counts as + writethrough according to this definition.}. end{enumerate} subsubsection{Legacy Interface: Device Initialization}label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization} -The field{reserved} field used to be called field{writeback}. If the -VIRTIO_BLK_F_CONFIG_WCE feature is offered, the cache mode can be -read from field{writeback}; the -driver can also write to the field in order to toggle the cache -between writethrough (0) and writeback (1) mode. If the feature is +For legacy devices, the field{writeback} field is only available if +the VIRTIO_BLK_F_CONFIG_WCE feature is offered. If the feature is not available, the driver can instead look at the result of negotiating VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode after reset if and only if VIRTIO_BLK_F_FLUSH is negotiated. @@ -3871,9 +3876,19 @@ A device MUST set the field{status} byte to VIRTIO_BLK_S_IOERR for a write request if the VIRTIO_BLK_F_RO feature if offered, and MUST NOT write any data. -Upon receipt of a VIRTIO_BLK_T_FLUSH request, the driver SHOULD ensure +Upon receipt of a VIRTIO_BLK_T_FLUSH request, the device SHOULD ensure that any writes which were completed are committed to non-volatile storage. +If the field{writeback} field in configuration space is 0, the device +SHOULD ensure that all writes are committed to non-volatile storage before +reporting completion. + +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY +switch to writethrough or writeback mode by writing respectively 0 and +1 to the field{writeback} field. After writing a 0 to field{writeback}, +the driver MUST NOT assume that writes submitted while the field was 1 have +been committed to non-volatile storage. + subsubsection{Legacy Interface: Device Operation}label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation} When using the legacy interface, transitional devices and drivers MUST format the fields in struct virtio_blk_req @@ -3884,6 +3899,10 @@ The field{reserved} field was previously called field{ioprio}. field{ioprio} is a hint about the relative priorities of requests to the device: higher numbers indicate more important requests. +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the device +SHOULD ensure that all writes are committed to non-volatile storage before +reporting completion. + egin{lstlisting} #define VIRTIO_BLK_T_FLUSH_OUT 5 end{lstlisting} -- 2.4.3


  • 2.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 07:02
    On Thu, Jul 02, 2015 at 12:41:04AM +0200, Paolo Bonzini wrote: > VIRTIO_BLK_F_CONFIG_WCE is important in order to achieve good performance > (up to 2x, though more realistically +30-40%) in latency-bound workloads. > However, it was removed by mistake together with VIRTIO_BLK_F_FLUSH. > > Restore it, clarifying the definition of writeback and writethrough cache > according to what is commonly done by high-end storage. In addition, fix > two occurrences where the driver was mentioned rather than the device, and > clarify the influence of the legacy VIRTIO_BLK_F_FLUSH feature on caching. > > There is a difference between the legacy and virtio1 behavior of > VIRTIO_BLK_F_CONFIG_WCE. In legacy devices, the writeback field was > only available if VIRTIO_BLK_F_CONFIG_WCE was negotiated, while now > it is always available. However, the VIRTIO_BLK_F_CONFIG_WCE feature > remains and describes whether the field is writable. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Please mention the issue ID in the commit log in the future. E.g. on a line by itself: VIRTIO-144 > --- > content.tex 45 ++++++++++++++++++++++++++++++++------------- > 1 file changed, 32 insertions(+), 13 deletions(-) > > diff --git a/content.tex b/content.tex > index 1efdcc8..7c00fd7 100644 > --- a/content.tex > +++ b/content.tex > @@ -3725,6 +3725,9 @@ device except where noted. > > item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O > alignment. > + > +item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback > + and writethrough modes. > end{description} > > subsubsection{Legacy Interface: Feature bits}label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits} > @@ -3735,9 +3738,6 @@ device except where noted. > item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands. > > item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support. > - > -item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback > - and writethrough modes. > end{description} > > VIRTIO_BLK_F_FLUSH was also called VIRTIO_BLK_F_WCE: Legacy drivers > @@ -3746,9 +3746,9 @@ VIRTIO_BLK_T_FLUSH commands. > > subsection{Device configuration layout}label{sec:Device Types / Block Device / Device configuration layout} > > -The field{capacity} of the device (expressed in 512-byte sectors) is always > -present. The availability of the others all depend on various feature > -bits as indicated above. > +The field{capacity} of the device (expressed in 512-byte sectors) and > +caching mode (field{writeback}) are always present. The availability > +of the others all depend on various feature bits as indicated above. > > egin{lstlisting} > struct virtio_blk_config { I'm afraid we can't make this change at this point - it's clearly incompatible with devices that implement cs01 unless they also implement the WCE bit. If you see value in having a RO writeback field, just add a feature bit for that. > @@ -3771,7 +3771,7 @@ struct virtio_blk_config { > // optimal (suggested maximum) I/O size in blocks > le32 opt_io_size; > } topology; > - u8 reserved; > + u8 writeback; > }; > end{lstlisting} > > @@ -3801,15 +3801,20 @@ according to the native endian of the guest rather than > field{topology} struct can be read to determine the physical block size and optimal > I/O lengths for the driver to use. This also does not affect the units > in the protocol, only performance. > + > +item The cache mode can be read from field{writeback}. 0 corresponds > + to a writethrough cache, 1 to a writeback cachefootnote{Consistent > + with
    ef{sec:Device Types / Block Device / Device Operation}, > + a writethrough cache can be defined broadly as a cache that commits > + writes to non-volatile storage before reporting their completion. > + For example, a battery-backed writeback cache actually counts as > + writethrough according to this definition.}. > end{enumerate} Could you add some conformance statements for this as well please? > > subsubsection{Legacy Interface: Device Initialization}label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization} > > -The field{reserved} field used to be called field{writeback}. If the > -VIRTIO_BLK_F_CONFIG_WCE feature is offered, the cache mode can be > -read from field{writeback}; the > -driver can also write to the field in order to toggle the cache > -between writethrough (0) and writeback (1) mode. If the feature is > +For legacy devices, the field{writeback} field is only available if > +the VIRTIO_BLK_F_CONFIG_WCE feature is offered. If the feature is > not available, the driver can instead look at the result of > negotiating VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode > after reset if and only if VIRTIO_BLK_F_FLUSH is negotiated. > @@ -3871,9 +3876,19 @@ A device MUST set the field{status} byte to VIRTIO_BLK_S_IOERR > for a write request if the VIRTIO_BLK_F_RO feature if offered, and MUST NOT > write any data. > > -Upon receipt of a VIRTIO_BLK_T_FLUSH request, the driver SHOULD ensure > +Upon receipt of a VIRTIO_BLK_T_FLUSH request, the device SHOULD ensure > that any writes which were completed are committed to non-volatile storage. > > +If the field{writeback} field in configuration space is 0, the device > +SHOULD ensure that all writes are committed to non-volatile storage before > +reporting completion. > + > +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY > +switch to writethrough or writeback mode by writing respectively 0 and > +1 to the field{writeback} field. After writing a 0 to field{writeback}, > +the driver MUST NOT assume that writes submitted while the field was 1 have > +been committed to non-volatile storage. > + > subsubsection{Legacy Interface: Device Operation}label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation} > When using the legacy interface, transitional devices and drivers > MUST format the fields in struct virtio_blk_req > @@ -3884,6 +3899,10 @@ The field{reserved} field was previously called field{ioprio}. field{ioprio} > is a hint about the relative priorities of requests to the device: > higher numbers indicate more important requests. > > +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the device > +SHOULD ensure that all writes are committed to non-volatile storage before > +reporting completion. > + Maybe clarify what happens if it doesn't "Failure to do so might lead to data loss"? > egin{lstlisting} > #define VIRTIO_BLK_T_FLUSH_OUT 5 > end{lstlisting} > -- > 2.4.3 > > > --------------------------------------------------------------------- > To unsubscribe from this mail list, you must leave the OASIS TC that > generates this mail. Follow this link to all your TCs in OASIS at: > https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php


  • 3.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 07:16
    On 02/07/2015 09:01, Michael S. Tsirkin wrote: > On Thu, Jul 02, 2015 at 12:41:04AM +0200, Paolo Bonzini wrote: >> VIRTIO_BLK_F_CONFIG_WCE is important in order to achieve good performance >> (up to 2x, though more realistically +30-40%) in latency-bound workloads. >> However, it was removed by mistake together with VIRTIO_BLK_F_FLUSH. >> >> Restore it, clarifying the definition of writeback and writethrough cache >> according to what is commonly done by high-end storage. In addition, fix >> two occurrences where the driver was mentioned rather than the device, and >> clarify the influence of the legacy VIRTIO_BLK_F_FLUSH feature on caching. >> >> There is a difference between the legacy and virtio1 behavior of >> VIRTIO_BLK_F_CONFIG_WCE. In legacy devices, the writeback field was >> only available if VIRTIO_BLK_F_CONFIG_WCE was negotiated, while now >> it is always available. However, the VIRTIO_BLK_F_CONFIG_WCE feature >> remains and describes whether the field is writable. >> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > > Please mention the issue ID in the commit log in the future. > E.g. on a line by itself: Ok. It's in the subject though, which technically is the first line of the commit log. :) > VIRTIO-144 > >> --- >> content.tex 45 ++++++++++++++++++++++++++++++++------------- >> 1 file changed, 32 insertions(+), 13 deletions(-) >> >> diff --git a/content.tex b/content.tex >> index 1efdcc8..7c00fd7 100644 >> --- a/content.tex >> +++ b/content.tex >> @@ -3725,6 +3725,9 @@ device except where noted. >> >> item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O >> alignment. >> + >> +item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback >> + and writethrough modes. >> end{description} >> >> subsubsection{Legacy Interface: Feature bits}label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits} >> @@ -3735,9 +3738,6 @@ device except where noted. >> item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands. >> >> item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support. >> - >> -item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback >> - and writethrough modes. >> end{description} >> >> VIRTIO_BLK_F_FLUSH was also called VIRTIO_BLK_F_WCE: Legacy drivers >> @@ -3746,9 +3746,9 @@ VIRTIO_BLK_T_FLUSH commands. >> >> subsection{Device configuration layout}label{sec:Device Types / Block Device / Device configuration layout} >> >> -The field{capacity} of the device (expressed in 512-byte sectors) is always >> -present. The availability of the others all depend on various feature >> -bits as indicated above. >> +The field{capacity} of the device (expressed in 512-byte sectors) and >> +caching mode (field{writeback}) are always present. The availability >> +of the others all depend on various feature bits as indicated above. >> >> egin{lstlisting} >> struct virtio_blk_config { > > I'm afraid we can't make this change at this point - it's clearly > incompatible with devices that implement cs01 unless they > also implement the WCE bit. > > If you see value in having a RO writeback field, > just add a feature bit for that. No, it's okay to remove the RO capability. I'll send v2. > >> @@ -3771,7 +3771,7 @@ struct virtio_blk_config { >> // optimal (suggested maximum) I/O size in blocks >> le32 opt_io_size; >> } topology; >> - u8 reserved; >> + u8 writeback; >> }; >> end{lstlisting} >> >> @@ -3801,15 +3801,20 @@ according to the native endian of the guest rather than >> field{topology} struct can be read to determine the physical block size and optimal >> I/O lengths for the driver to use. This also does not affect the units >> in the protocol, only performance. >> + >> +item The cache mode can be read from field{writeback}. 0 corresponds >> + to a writethrough cache, 1 to a writeback cachefootnote{Consistent >> + with
    ef{sec:Device Types / Block Device / Device Operation}, >> + a writethrough cache can be defined broadly as a cache that commits >> + writes to non-volatile storage before reporting their completion. >> + For example, a battery-backed writeback cache actually counts as >> + writethrough according to this definition.}. >> end{enumerate} > > > Could you add some conformance statements for this as well > please? The above footnote is a consequence of the conformance statements in the "Device Operation" section. (Hence putting it in a footnote only). >> subsubsection{Legacy Interface: Device Initialization}label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization} >> >> -The field{reserved} field used to be called field{writeback}. If the >> -VIRTIO_BLK_F_CONFIG_WCE feature is offered, the cache mode can be >> -read from field{writeback}; the >> -driver can also write to the field in order to toggle the cache >> -between writethrough (0) and writeback (1) mode. If the feature is >> +For legacy devices, the field{writeback} field is only available if >> +the VIRTIO_BLK_F_CONFIG_WCE feature is offered. If the feature is >> not available, the driver can instead look at the result of >> negotiating VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode >> after reset if and only if VIRTIO_BLK_F_FLUSH is negotiated. >> @@ -3871,9 +3876,19 @@ A device MUST set the field{status} byte to VIRTIO_BLK_S_IOERR >> for a write request if the VIRTIO_BLK_F_RO feature if offered, and MUST NOT >> write any data. >> >> -Upon receipt of a VIRTIO_BLK_T_FLUSH request, the driver SHOULD ensure >> +Upon receipt of a VIRTIO_BLK_T_FLUSH request, the device SHOULD ensure >> that any writes which were completed are committed to non-volatile storage. >> >> +If the field{writeback} field in configuration space is 0, the device >> +SHOULD ensure that all writes are committed to non-volatile storage before >> +reporting completion. >> + >> +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY >> +switch to writethrough or writeback mode by writing respectively 0 and >> +1 to the field{writeback} field. After writing a 0 to field{writeback}, >> +the driver MUST NOT assume that writes submitted while the field was 1 have >> +been committed to non-volatile storage. >> + >> subsubsection{Legacy Interface: Device Operation}label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation} >> When using the legacy interface, transitional devices and drivers >> MUST format the fields in struct virtio_blk_req >> @@ -3884,6 +3899,10 @@ The field{reserved} field was previously called field{ioprio}. field{ioprio} >> is a hint about the relative priorities of requests to the device: >> higher numbers indicate more important requests. >> >> +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the device >> +SHOULD ensure that all writes are committed to non-volatile storage before >> +reporting completion. >> + > > Maybe clarify what happens if it doesn't "Failure to do so might > lead to data loss"? Okay. Paolo


  • 4.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 08:48
    On Thu, Jul 02, 2015 at 09:15:21AM +0200, Paolo Bonzini wrote: > > > On 02/07/2015 09:01, Michael S. Tsirkin wrote: > > On Thu, Jul 02, 2015 at 12:41:04AM +0200, Paolo Bonzini wrote: > >> VIRTIO_BLK_F_CONFIG_WCE is important in order to achieve good performance > >> (up to 2x, though more realistically +30-40%) in latency-bound workloads. > >> However, it was removed by mistake together with VIRTIO_BLK_F_FLUSH. > >> > >> Restore it, clarifying the definition of writeback and writethrough cache > >> according to what is commonly done by high-end storage. In addition, fix > >> two occurrences where the driver was mentioned rather than the device, and > >> clarify the influence of the legacy VIRTIO_BLK_F_FLUSH feature on caching. > >> > >> There is a difference between the legacy and virtio1 behavior of > >> VIRTIO_BLK_F_CONFIG_WCE. In legacy devices, the writeback field was > >> only available if VIRTIO_BLK_F_CONFIG_WCE was negotiated, while now > >> it is always available. However, the VIRTIO_BLK_F_CONFIG_WCE feature > >> remains and describes whether the field is writable. > >> > >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > > > > Please mention the issue ID in the commit log in the future. > > E.g. on a line by itself: > > Ok. It's in the subject though, which technically is the first line of > the commit log. :) > > > VIRTIO-144 > > > >> --- > >> content.tex 45 ++++++++++++++++++++++++++++++++------------- > >> 1 file changed, 32 insertions(+), 13 deletions(-) > >> > >> diff --git a/content.tex b/content.tex > >> index 1efdcc8..7c00fd7 100644 > >> --- a/content.tex > >> +++ b/content.tex > >> @@ -3725,6 +3725,9 @@ device except where noted. > >> > >> item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O > >> alignment. > >> + > >> +item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback > >> + and writethrough modes. > >> end{description} > >> > >> subsubsection{Legacy Interface: Feature bits}label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits} > >> @@ -3735,9 +3738,6 @@ device except where noted. > >> item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands. > >> > >> item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support. > >> - > >> -item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback > >> - and writethrough modes. > >> end{description} > >> > >> VIRTIO_BLK_F_FLUSH was also called VIRTIO_BLK_F_WCE: Legacy drivers > >> @@ -3746,9 +3746,9 @@ VIRTIO_BLK_T_FLUSH commands. > >> > >> subsection{Device configuration layout}label{sec:Device Types / Block Device / Device configuration layout} > >> > >> -The field{capacity} of the device (expressed in 512-byte sectors) is always > >> -present. The availability of the others all depend on various feature > >> -bits as indicated above. > >> +The field{capacity} of the device (expressed in 512-byte sectors) and > >> +caching mode (field{writeback}) are always present. The availability > >> +of the others all depend on various feature bits as indicated above. > >> > >> egin{lstlisting} > >> struct virtio_blk_config { > > > > I'm afraid we can't make this change at this point - it's clearly > > incompatible with devices that implement cs01 unless they > > also implement the WCE bit. > > > > If you see value in having a RO writeback field, > > just add a feature bit for that. > > No, it's okay to remove the RO capability. I'll send v2. > > > > >> @@ -3771,7 +3771,7 @@ struct virtio_blk_config { > >> // optimal (suggested maximum) I/O size in blocks > >> le32 opt_io_size; > >> } topology; > >> - u8 reserved; > >> + u8 writeback; > >> }; > >> end{lstlisting} > >> > >> @@ -3801,15 +3801,20 @@ according to the native endian of the guest rather than > >> field{topology} struct can be read to determine the physical block size and optimal > >> I/O lengths for the driver to use. This also does not affect the units > >> in the protocol, only performance. > >> + > >> +item The cache mode can be read from field{writeback}. 0 corresponds > >> + to a writethrough cache, 1 to a writeback cachefootnote{Consistent > >> + with
    ef{sec:Device Types / Block Device / Device Operation}, > >> + a writethrough cache can be defined broadly as a cache that commits > >> + writes to non-volatile storage before reporting their completion. > >> + For example, a battery-backed writeback cache actually counts as > >> + writethrough according to this definition.}. > >> end{enumerate} > > > > > > Could you add some conformance statements for this as well > > please? > > The above footnote is a consequence of the conformance statements in the > "Device Operation" section. (Hence putting it in a footnote only). Hmm I don't get it. The only non-legacy text that mention writethrough is this: VIRTIO_BLK_F_CONFIG_WCE (11) Device can toggle its cache between writeback and writethrough modes. so this is the only definition of these terms. Also, isn't the flush command special in the writeback mode? maybe mention this... Also Some older legacy devices did not operate in writethrough mode even after a driver announced lack of support for VIRTIO_BLK_F_FLUSH. What does this mean? let's just say in which mode they did operate. Also what should driver do in this case? > >> subsubsection{Legacy Interface: Device Initialization}label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization} > >> > >> -The field{reserved} field used to be called field{writeback}. If the > >> -VIRTIO_BLK_F_CONFIG_WCE feature is offered, the cache mode can be > >> -read from field{writeback}; the > >> -driver can also write to the field in order to toggle the cache > >> -between writethrough (0) and writeback (1) mode. If the feature is > >> +For legacy devices, the field{writeback} field is only available if > >> +the VIRTIO_BLK_F_CONFIG_WCE feature is offered. If the feature is > >> not available, the driver can instead look at the result of > >> negotiating VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode > >> after reset if and only if VIRTIO_BLK_F_FLUSH is negotiated. > >> @@ -3871,9 +3876,19 @@ A device MUST set the field{status} byte to VIRTIO_BLK_S_IOERR > >> for a write request if the VIRTIO_BLK_F_RO feature if offered, and MUST NOT > >> write any data. > >> > >> -Upon receipt of a VIRTIO_BLK_T_FLUSH request, the driver SHOULD ensure > >> +Upon receipt of a VIRTIO_BLK_T_FLUSH request, the device SHOULD ensure > >> that any writes which were completed are committed to non-volatile storage. > >> > >> +If the field{writeback} field in configuration space is 0, the device > >> +SHOULD ensure that all writes are committed to non-volatile storage before > >> +reporting completion. > >> + > >> +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY > >> +switch to writethrough or writeback mode by writing respectively 0 and > >> +1 to the field{writeback} field. After writing a 0 to field{writeback}, > >> +the driver MUST NOT assume that writes submitted while the field was 1 have > >> +been committed to non-volatile storage. > >> + > >> subsubsection{Legacy Interface: Device Operation}label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation} > >> When using the legacy interface, transitional devices and drivers > >> MUST format the fields in struct virtio_blk_req > >> @@ -3884,6 +3899,10 @@ The field{reserved} field was previously called field{ioprio}. field{ioprio} > >> is a hint about the relative priorities of requests to the device: > >> higher numbers indicate more important requests. > >> > >> +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the device > >> +SHOULD ensure that all writes are committed to non-volatile storage before > >> +reporting completion. > >> + > > > > Maybe clarify what happens if it doesn't "Failure to do so might > > lead to data loss"? > > Okay. > > Paolo


  • 5.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 08:51
    On 02/07/2015 10:48, Michael S. Tsirkin wrote: >> > >> > The above footnote is a consequence of the conformance statements in the >> > "Device Operation" section. (Hence putting it in a footnote only). > Hmm I don't get it. > > The only non-legacy text that mention writethrough is this: > VIRTIO_BLK_F_CONFIG_WCE (11) Device can toggle its cache between > writeback and writethrough modes. > > so this is the only definition of these terms. After the patch (quoting from v2): +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated and the +field{writeback} field in configuration space is 0, the device MUST +ensure that all writes are committed to non-volatile storage before +reporting completion. + +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY +switch to writethrough or writeback mode by writing respectively 0 and +1 to the field{writeback} field. After writing a 0 to field{writeback}, +the driver MUST NOT assume that writes submitted while the field was 1 have +been committed to non-volatile storage. > Also, isn't the flush command special in the writeback mode? > maybe mention this... Again quoting from v2: +If the field{VIRTIO_BLK_F_FLUSH} feature is negotiated, upon receipt +of a VIRTIO_BLK_T_FLUSH request, the device MUST ensure that any writes +which were completed are committed to non-volatile storage. > > Also > Some older legacy devices did not operate in writethrough mode > even after a driver announced lack of > support for VIRTIO_BLK_F_FLUSH. > > What does this mean? let's just say in which mode they did operate. > Also what should driver do in this case? In v2: +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the +device SHOULD ensure that all writes are committed to non-volatile +storage before reporting completion. It MUST do so if it proposed +field{VIRTIO_BLK_F_FLUSH}. Failure to do so can cause data loss. (No humor intended, I know this is not your field of expertise). Paolo


  • 6.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 11:21
    On Thu, 2015-07-02 at 10:51 +0200, Paolo Bonzini wrote: > > On 02/07/2015 10:48, Michael S. Tsirkin wrote: > >> > > >> > The above footnote is a consequence of the conformance statements in the > >> > "Device Operation" section. (Hence putting it in a footnote only). > > Hmm I don't get it. > > > > The only non-legacy text that mention writethrough is this: > > VIRTIO_BLK_F_CONFIG_WCE (11) Device can toggle its cache between > > writeback and writethrough modes. > > > > so this is the only definition of these terms. > > After the patch (quoting from v2): > > +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated and the > +field{writeback} field in configuration space is 0, the device MUST > +ensure that all writes are committed to non-volatile storage before > +reporting completion. > + > +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY > +switch to writethrough or writeback mode by writing respectively 0 and > +1 to the field{writeback} field. After writing a 0 to field{writeback}, > +the driver MUST NOT assume that writes submitted while the field was 1 have > +been committed to non-volatile storage. > > > Also, isn't the flush command special in the writeback mode? > > maybe mention this... > > Again quoting from v2: > > +If the field{VIRTIO_BLK_F_FLUSH} feature is negotiated, upon receipt > +of a VIRTIO_BLK_T_FLUSH request, the device MUST ensure that any writes > +which were completed are committed to non-volatile storage. > > > > > Also > > Some older legacy devices did not operate in writethrough mode > > even after a driver announced lack of > > support for VIRTIO_BLK_F_FLUSH. > > > > What does this mean? let's just say in which mode they did operate. > > Also what should driver do in this case? > > In v2: > > +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the > +device SHOULD ensure that all writes are committed to non-volatile > +storage before reporting completion. It MUST do so if it proposed > +field{VIRTIO_BLK_F_FLUSH}. Failure to do so can cause data loss. Not SHOULD, MUST. Please don't leave spec based doors open to bad implementations. An implementation that allows data corruption MUST NOT have a spec interpretation that supports it. James


  • 7.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 11:40
    On Thu, Jul 02, 2015 at 11:21:05AM +0000, James Bottomley wrote: > On Thu, 2015-07-02 at 10:51 +0200, Paolo Bonzini wrote: > > > > On 02/07/2015 10:48, Michael S. Tsirkin wrote: > > >> > > > >> > The above footnote is a consequence of the conformance statements in the > > >> > "Device Operation" section. (Hence putting it in a footnote only). > > > Hmm I don't get it. > > > > > > The only non-legacy text that mention writethrough is this: > > > VIRTIO_BLK_F_CONFIG_WCE (11) Device can toggle its cache between > > > writeback and writethrough modes. > > > > > > so this is the only definition of these terms. > > > > After the patch (quoting from v2): > > > > +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated and the > > +field{writeback} field in configuration space is 0, the device MUST > > +ensure that all writes are committed to non-volatile storage before > > +reporting completion. > > + > > +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY > > +switch to writethrough or writeback mode by writing respectively 0 and > > +1 to the field{writeback} field. After writing a 0 to field{writeback}, > > +the driver MUST NOT assume that writes submitted while the field was 1 have > > +been committed to non-volatile storage. > > > > > Also, isn't the flush command special in the writeback mode? > > > maybe mention this... > > > > Again quoting from v2: > > > > +If the field{VIRTIO_BLK_F_FLUSH} feature is negotiated, upon receipt > > +of a VIRTIO_BLK_T_FLUSH request, the device MUST ensure that any writes > > +which were completed are committed to non-volatile storage. > > > > > > > > Also > > > Some older legacy devices did not operate in writethrough mode > > > even after a driver announced lack of > > > support for VIRTIO_BLK_F_FLUSH. > > > > > > What does this mean? let's just say in which mode they did operate. > > > Also what should driver do in this case? > > > > In v2: > > > > +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the > > +device SHOULD ensure that all writes are committed to non-volatile > > +storage before reporting completion. It MUST do so if it proposed > > +field{VIRTIO_BLK_F_FLUSH}. Failure to do so can cause data loss. > > Not SHOULD, MUST. Please don't leave spec based doors open to bad > implementations. An implementation that allows data corruption MUST NOT > have a spec interpretation that supports it. > > James > > I'm not sure I agree. Personally I create disks that discard data all the time. I *don't* want these writes committed anywhere. -- MST


  • 8.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 11:46
    On Thu, 2015-07-02 at 13:40 +0200, Michael S. Tsirkin wrote: > On Thu, Jul 02, 2015 at 11:21:05AM +0000, James Bottomley wrote: > > On Thu, 2015-07-02 at 10:51 +0200, Paolo Bonzini wrote: > > > > > > On 02/07/2015 10:48, Michael S. Tsirkin wrote: > > > >> > > > > >> > The above footnote is a consequence of the conformance statements in the > > > >> > "Device Operation" section. (Hence putting it in a footnote only). > > > > Hmm I don't get it. > > > > > > > > The only non-legacy text that mention writethrough is this: > > > > VIRTIO_BLK_F_CONFIG_WCE (11) Device can toggle its cache between > > > > writeback and writethrough modes. > > > > > > > > so this is the only definition of these terms. > > > > > > After the patch (quoting from v2): > > > > > > +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated and the > > > +field{writeback} field in configuration space is 0, the device MUST > > > +ensure that all writes are committed to non-volatile storage before > > > +reporting completion. > > > + > > > +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY > > > +switch to writethrough or writeback mode by writing respectively 0 and > > > +1 to the field{writeback} field. After writing a 0 to field{writeback}, > > > +the driver MUST NOT assume that writes submitted while the field was 1 have > > > +been committed to non-volatile storage. > > > > > > > Also, isn't the flush command special in the writeback mode? > > > > maybe mention this... > > > > > > Again quoting from v2: > > > > > > +If the field{VIRTIO_BLK_F_FLUSH} feature is negotiated, upon receipt > > > +of a VIRTIO_BLK_T_FLUSH request, the device MUST ensure that any writes > > > +which were completed are committed to non-volatile storage. > > > > > > > > > > > Also > > > > Some older legacy devices did not operate in writethrough mode > > > > even after a driver announced lack of > > > > support for VIRTIO_BLK_F_FLUSH. > > > > > > > > What does this mean? let's just say in which mode they did operate. > > > > Also what should driver do in this case? > > > > > > In v2: > > > > > > +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the > > > +device SHOULD ensure that all writes are committed to non-volatile > > > +storage before reporting completion. It MUST do so if it proposed > > > +field{VIRTIO_BLK_F_FLUSH}. Failure to do so can cause data loss. > > > > Not SHOULD, MUST. Please don't leave spec based doors open to bad > > implementations. An implementation that allows data corruption MUST NOT > > have a spec interpretation that supports it. > > > > James > > > > > > I'm not sure I agree. Personally I create disks that discard data all > the time. I *don't* want these writes committed anywhere. There's a difference between you actively configuring a data destroying device and a spec supporting a device implementation that can never be operated with guaranteed data integrity. If you want to damage integrity for speed, don't flush in the guest. Perhaps I'm partly on this hobby horse because I'm getting demands over in SCSI land that we clean up behind a range of USB devices with a writeback cache and no flush command. Personally I don't believe the WCE feature is separable from the FLUSH one in a well run implementation. James


  • 9.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 12:10
    On Thu, Jul 02, 2015 at 11:46:48AM +0000, James Bottomley wrote: > On Thu, 2015-07-02 at 13:40 +0200, Michael S. Tsirkin wrote: > > On Thu, Jul 02, 2015 at 11:21:05AM +0000, James Bottomley wrote: > > > On Thu, 2015-07-02 at 10:51 +0200, Paolo Bonzini wrote: > > > > > > > > On 02/07/2015 10:48, Michael S. Tsirkin wrote: > > > > >> > > > > > >> > The above footnote is a consequence of the conformance statements in the > > > > >> > "Device Operation" section. (Hence putting it in a footnote only). > > > > > Hmm I don't get it. > > > > > > > > > > The only non-legacy text that mention writethrough is this: > > > > > VIRTIO_BLK_F_CONFIG_WCE (11) Device can toggle its cache between > > > > > writeback and writethrough modes. > > > > > > > > > > so this is the only definition of these terms. > > > > > > > > After the patch (quoting from v2): > > > > > > > > +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated and the > > > > +field{writeback} field in configuration space is 0, the device MUST > > > > +ensure that all writes are committed to non-volatile storage before > > > > +reporting completion. > > > > + > > > > +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY > > > > +switch to writethrough or writeback mode by writing respectively 0 and > > > > +1 to the field{writeback} field. After writing a 0 to field{writeback}, > > > > +the driver MUST NOT assume that writes submitted while the field was 1 have > > > > +been committed to non-volatile storage. > > > > > > > > > Also, isn't the flush command special in the writeback mode? > > > > > maybe mention this... > > > > > > > > Again quoting from v2: > > > > > > > > +If the field{VIRTIO_BLK_F_FLUSH} feature is negotiated, upon receipt > > > > +of a VIRTIO_BLK_T_FLUSH request, the device MUST ensure that any writes > > > > +which were completed are committed to non-volatile storage. > > > > > > > > > > > > > > Also > > > > > Some older legacy devices did not operate in writethrough mode > > > > > even after a driver announced lack of > > > > > support for VIRTIO_BLK_F_FLUSH. > > > > > > > > > > What does this mean? let's just say in which mode they did operate. > > > > > Also what should driver do in this case? > > > > > > > > In v2: > > > > > > > > +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the > > > > +device SHOULD ensure that all writes are committed to non-volatile > > > > +storage before reporting completion. It MUST do so if it proposed > > > > +field{VIRTIO_BLK_F_FLUSH}. Failure to do so can cause data loss. > > > > > > Not SHOULD, MUST. Please don't leave spec based doors open to bad > > > implementations. An implementation that allows data corruption MUST NOT > > > have a spec interpretation that supports it. > > > > > > James > > > > > > > > > > I'm not sure I agree. Personally I create disks that discard data all > > the time. I *don't* want these writes committed anywhere. > > There's a difference between you actively configuring a data destroying > device and a spec supporting a device implementation that can never be > operated with guaranteed data integrity. > > If you want to damage integrity for speed, don't flush in the guest. Users want to configure this on the host, not on the guest. > Perhaps I'm partly on this hobby horse because I'm getting demands over > in SCSI land that we clean up behind a range of USB devices with a > writeback cache and no flush command. Personally I don't believe the > WCE feature is separable from the FLUSH one in a well run > implementation. > > James Well this is not the case here. We do say "Failure to do so can cause data loss." Maybe "MUST unless data loss in case of a crash is acceptable"? -- MST


  • 10.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 13:11
    On 02/07/2015 13:21, James Bottomley wrote: > > +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the > > +device SHOULD ensure that all writes are committed to non-volatile > > +storage before reporting completion. It MUST do so if it proposed > > +field{VIRTIO_BLK_F_FLUSH}. Failure to do so can cause data loss. > > Not SHOULD, MUST. Please don't leave spec based doors open to bad > implementations. An implementation that allows data corruption MUST NOT > have a spec interpretation that supports it. What I'm trying to do is to make it possible for the guest to detect an implementation that is intentionally unsafe. Such an implementation MUST NOT propose the VIRTIO_BLK_F_FLUSH feature, unlike the USB drives you mention. I totally agree that WCE and FLUSH are not separable. Paolo


  • 11.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 17:17
    On Thu, Jul 02, 2015 at 03:10:48PM +0200, Paolo Bonzini wrote: > > > On 02/07/2015 13:21, James Bottomley wrote: > > > +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the > > > +device SHOULD ensure that all writes are committed to non-volatile > > > +storage before reporting completion. It MUST do so if it proposed > > > +field{VIRTIO_BLK_F_FLUSH}. Failure to do so can cause data loss. > > > > Not SHOULD, MUST. Please don't leave spec based doors open to bad > > implementations. An implementation that allows data corruption MUST NOT > > have a spec interpretation that supports it. > > What I'm trying to do is to make it possible for the guest to detect an > implementation that is intentionally unsafe. Such an implementation > MUST NOT propose the VIRTIO_BLK_F_FLUSH feature, unlike the USB drives > you mention. I totally agree that WCE and FLUSH are not separable. > > Paolo Hmm. Very simple drivers don't negotiate any features. Making such drivers lose data isn't a good idea I think. -- MST


  • 12.  Re: [virtio] [PATCH] virtio-blk: restore VIRTIO_BLK_F_CONFIG_WCE (VIRTIO-144)

    Posted 07-02-2015 18:14
    On 02/07/2015 19:16, Michael S. Tsirkin wrote: > On Thu, Jul 02, 2015 at 03:10:48PM +0200, Paolo Bonzini wrote: >> >> >> On 02/07/2015 13:21, James Bottomley wrote: >>>> +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the >>>> +device SHOULD ensure that all writes are committed to non-volatile >>>> +storage before reporting completion. It MUST do so if it proposed >>>> +field{VIRTIO_BLK_F_FLUSH}. Failure to do so can cause data loss. >>> >>> Not SHOULD, MUST. Please don't leave spec based doors open to bad >>> implementations. An implementation that allows data corruption MUST NOT >>> have a spec interpretation that supports it. >> >> What I'm trying to do is to make it possible for the guest to detect an >> implementation that is intentionally unsafe. Such an implementation >> MUST NOT propose the VIRTIO_BLK_F_FLUSH feature, unlike the USB drives >> you mention. I totally agree that WCE and FLUSH are not separable. > > Hmm. Very simple drivers don't negotiate any features. > Making such drivers lose data isn't a good idea I think. But it won't! Unless it talks to a very simple device that also doesn't negotiate any feature (including VIRTIO_BLK_F_FLUSH). Bringing the "cheap USB pen drive" parallel forward, such a simple device would probably disregard MUSTs left and right anyway. At least what I wrote makes it possible for a not-so-simple driver to know it's talking to a very simple device. Paolo