On Thu, Jul 02, 2015 at 09:15:21AM +0200, Paolo Bonzini wrote: > > > On 02/07/2015 09:01, Michael S. Tsirkin wrote: > > On Thu, Jul 02, 2015 at 12:41:04AM +0200, Paolo Bonzini wrote: > >> VIRTIO_BLK_F_CONFIG_WCE is important in order to achieve good performance > >> (up to 2x, though more realistically +30-40%) in latency-bound workloads. > >> However, it was removed by mistake together with VIRTIO_BLK_F_FLUSH. > >> > >> Restore it, clarifying the definition of writeback and writethrough cache > >> according to what is commonly done by high-end storage. In addition, fix > >> two occurrences where the driver was mentioned rather than the device, and > >> clarify the influence of the legacy VIRTIO_BLK_F_FLUSH feature on caching. > >> > >> There is a difference between the legacy and virtio1 behavior of > >> VIRTIO_BLK_F_CONFIG_WCE. In legacy devices, the writeback field was > >> only available if VIRTIO_BLK_F_CONFIG_WCE was negotiated, while now > >> it is always available. However, the VIRTIO_BLK_F_CONFIG_WCE feature > >> remains and describes whether the field is writable. > >> > >> Signed-off-by: Paolo Bonzini <
pbonzini@redhat.com> > > > > Please mention the issue ID in the commit log in the future. > > E.g. on a line by itself: > > Ok. It's in the subject though, which technically is the first line of > the commit log. :) > > > VIRTIO-144 > > > >> --- > >> content.tex 45 ++++++++++++++++++++++++++++++++------------- > >> 1 file changed, 32 insertions(+), 13 deletions(-) > >> > >> diff --git a/content.tex b/content.tex > >> index 1efdcc8..7c00fd7 100644 > >> --- a/content.tex > >> +++ b/content.tex > >> @@ -3725,6 +3725,9 @@ device except where noted. > >> > >> item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O > >> alignment. > >> + > >> +item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback > >> + and writethrough modes. > >> end{description} > >> > >> subsubsection{Legacy Interface: Feature bits}label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits} > >> @@ -3735,9 +3738,6 @@ device except where noted. > >> item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands. > >> > >> item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support. > >> - > >> -item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback > >> - and writethrough modes. > >> end{description} > >> > >> VIRTIO_BLK_F_FLUSH was also called VIRTIO_BLK_F_WCE: Legacy drivers > >> @@ -3746,9 +3746,9 @@ VIRTIO_BLK_T_FLUSH commands. > >> > >> subsection{Device configuration layout}label{sec:Device Types / Block Device / Device configuration layout} > >> > >> -The field{capacity} of the device (expressed in 512-byte sectors) is always > >> -present. The availability of the others all depend on various feature > >> -bits as indicated above. > >> +The field{capacity} of the device (expressed in 512-byte sectors) and > >> +caching mode (field{writeback}) are always present. The availability > >> +of the others all depend on various feature bits as indicated above. > >> > >> egin{lstlisting} > >> struct virtio_blk_config { > > > > I'm afraid we can't make this change at this point - it's clearly > > incompatible with devices that implement cs01 unless they > > also implement the WCE bit. > > > > If you see value in having a RO writeback field, > > just add a feature bit for that. > > No, it's okay to remove the RO capability. I'll send v2. > > > > >> @@ -3771,7 +3771,7 @@ struct virtio_blk_config { > >> // optimal (suggested maximum) I/O size in blocks > >> le32 opt_io_size; > >> } topology; > >> - u8 reserved; > >> + u8 writeback; > >> }; > >> end{lstlisting} > >> > >> @@ -3801,15 +3801,20 @@ according to the native endian of the guest rather than > >> field{topology} struct can be read to determine the physical block size and optimal > >> I/O lengths for the driver to use. This also does not affect the units > >> in the protocol, only performance. > >> + > >> +item The cache mode can be read from field{writeback}. 0 corresponds > >> + to a writethrough cache, 1 to a writeback cachefootnote{Consistent > >> + with
ef{sec:Device Types / Block Device / Device Operation}, > >> + a writethrough cache can be defined broadly as a cache that commits > >> + writes to non-volatile storage before reporting their completion. > >> + For example, a battery-backed writeback cache actually counts as > >> + writethrough according to this definition.}. > >> end{enumerate} > > > > > > Could you add some conformance statements for this as well > > please? > > The above footnote is a consequence of the conformance statements in the > "Device Operation" section. (Hence putting it in a footnote only). Hmm I don't get it. The only non-legacy text that mention writethrough is this: VIRTIO_BLK_F_CONFIG_WCE (11) Device can toggle its cache between writeback and writethrough modes. so this is the only definition of these terms. Also, isn't the flush command special in the writeback mode? maybe mention this... Also Some older legacy devices did not operate in writethrough mode even after a driver announced lack of support for VIRTIO_BLK_F_FLUSH. What does this mean? let's just say in which mode they did operate. Also what should driver do in this case? > >> subsubsection{Legacy Interface: Device Initialization}label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization} > >> > >> -The field{reserved} field used to be called field{writeback}. If the > >> -VIRTIO_BLK_F_CONFIG_WCE feature is offered, the cache mode can be > >> -read from field{writeback}; the > >> -driver can also write to the field in order to toggle the cache > >> -between writethrough (0) and writeback (1) mode. If the feature is > >> +For legacy devices, the field{writeback} field is only available if > >> +the VIRTIO_BLK_F_CONFIG_WCE feature is offered. If the feature is > >> not available, the driver can instead look at the result of > >> negotiating VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode > >> after reset if and only if VIRTIO_BLK_F_FLUSH is negotiated. > >> @@ -3871,9 +3876,19 @@ A device MUST set the field{status} byte to VIRTIO_BLK_S_IOERR > >> for a write request if the VIRTIO_BLK_F_RO feature if offered, and MUST NOT > >> write any data. > >> > >> -Upon receipt of a VIRTIO_BLK_T_FLUSH request, the driver SHOULD ensure > >> +Upon receipt of a VIRTIO_BLK_T_FLUSH request, the device SHOULD ensure > >> that any writes which were completed are committed to non-volatile storage. > >> > >> +If the field{writeback} field in configuration space is 0, the device > >> +SHOULD ensure that all writes are committed to non-volatile storage before > >> +reporting completion. > >> + > >> +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY > >> +switch to writethrough or writeback mode by writing respectively 0 and > >> +1 to the field{writeback} field. After writing a 0 to field{writeback}, > >> +the driver MUST NOT assume that writes submitted while the field was 1 have > >> +been committed to non-volatile storage. > >> + > >> subsubsection{Legacy Interface: Device Operation}label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation} > >> When using the legacy interface, transitional devices and drivers > >> MUST format the fields in struct virtio_blk_req > >> @@ -3884,6 +3899,10 @@ The field{reserved} field was previously called field{ioprio}. field{ioprio} > >> is a hint about the relative priorities of requests to the device: > >> higher numbers indicate more important requests. > >> > >> +If the field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the device > >> +SHOULD ensure that all writes are committed to non-volatile storage before > >> +reporting completion. > >> + > > > > Maybe clarify what happens if it doesn't "Failure to do so might > > lead to data loss"? > > Okay. > > Paolo