NT-store semantics in PM

Henrique Dominguez da Silva Nunes Fernandes

Jun 5, 2024, 8:59:18 AM

to pmem

Hi,

I have a question regarding the semantics and ordering guarantees of NT-stores in the context of PM. Specifically, what behaviour is expected when an NT-store targets an address that is already present in a cache line, given that NT-stores reside in separate WC buffers until they are propagated to PM?

Intel's manual states the following in section 10.4.6.2, "Caching of Temporal vs. Non-Temporal Data":

"If a program specifies a non-temporal store with one of these instructions and the memory type of the destination region is write back (WB), write through (WT), or write combining (WC), the processor will do the following:
• If the memory location being written to is present in the cache hierarchy, the data in the caches is evicted.
• The non-temporal data is written to memory with WC semantics."

However, regarding WC semantics, I could only find a manual for the P6 family processors from 1998 that states the following in the chapter "WRITE COMBINING":

"Ordering is not maintained between the successive allocation/deallocation of WC buffers e.g. writes to WC buffer 1 followed by writes to WC buffer 2 may appear as buffer 2 followed by buffer 1 on the system bus. When a WC buffer is propagated to memory as partial writes there is no guaranteed ordering between successive partial writes e.g. partial write for chunk 2 may appear on the bus before partial write for chunk 1 or visa-versa."

This, combined with the fact that each processor has its own WC region size, makes it hard to understand what guarantees NT-stores provide. Thus, my main questions are:

1. What ordering guarantees do NT-stores provide among themselves and relative to other "normal" stores?
2. Are these guarantees tied to the processor implementation? If so, how exactly do processors that support PM handle NT-stores?
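
To make the first question concrete, here is a minimal sketch of the pattern in question, assuming x86-64 and the SSE2 intrinsics `_mm_stream_si64` and `_mm_sfence`. The `record` layout and the `publish` helper are hypothetical names used only for illustration. NT-stores are weakly ordered, so without the `SFENCE` the payload writes could become globally visible after the later regular store to `valid`; with it, every store before the fence is globally visible before any store after it. The sketch covers ordering only, not durability of the flag.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>   /* _mm_stream_si64, _mm_sfence */
#define HAVE_NT 1
#else
#include <stdatomic.h>   /* fallback so the sketch still compiles elsewhere */
#define HAVE_NT 0
#endif

/* Hypothetical record: payload words followed by a "valid" flag. */
struct record {
    long long payload[7];
    long long valid;
};

/* Write one 64-bit word with a non-temporal store (MOVNTI) where available. */
static void nt_store64(long long *dst, long long v)
{
#if HAVE_NT
    _mm_stream_si64(dst, v);
#else
    *dst = v;            /* plain store on non-x86 targets */
#endif
}

/* Order all earlier stores, including NT-stores, before later stores. */
static void store_fence(void)
{
#if HAVE_NT
    _mm_sfence();
#else
    atomic_thread_fence(memory_order_release);
#endif
}

/* Fill the payload with NT-stores, fence, then set the flag with a
 * regular (temporal) store. */
void publish(struct record *r, const long long *src)
{
    assert(r != NULL && src != NULL);

    for (size_t i = 0; i < 7; i++)
        nt_store64(&r->payload[i], src[i]);

    /* Without this fence, the NT-stores above may become globally
     * visible after the store to r->valid below. */
    store_fence();

    r->valid = 1;
    /* Making the flag itself durable on PM would additionally need a
     * cache-line flush (e.g. CLWB) followed by a fence; omitted here. */
}
```

Calling `publish(&rec, src)` on a zero-initialised `rec` leaves the payload equal to `src` and `rec.valid == 1` as seen from the same thread; the open question is what other observers, and PM itself, may see in between.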