As far as I understand it, a volatile variable always behaves as if the cache were flushed after every write, and as if every read came straight from main memory. The effect is that a thread will always see the latest write made by another thread and (according to the Java Memory Model) never a stale cached value. The actual implementation and CPU instructions will vary from one architecture to another, however.
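As a rough illustration of that visibility guarantee, here is a minimal sketch of a stop flag. The class and method names (StopFlagSketch, runWorkerBriefly) are made up for this example. Because the flag is volatile, the worker is guaranteed to see the write and leave its loop; without volatile the JIT would be allowed to hoist the read out of the loop, and the worker might spin forever.

```java
class StopFlagSketch {
    // volatile: a write by one thread is visible to later reads in other threads
    private volatile boolean stopRequested = false;

    // Starts a spinning worker, asks it to stop after `millis`, and
    // returns true once the worker has actually terminated.
    boolean runWorkerBriefly(long millis) {
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                Thread.onSpinWait(); // hint to the CPU that we are busy-waiting (Java 9+)
            }
        });
        worker.start();
        try {
            Thread.sleep(millis);
            stopRequested = true;   // volatile write, seen by the worker's next read
            worker.join();          // returns only because the worker saw the flag
        } catch (InterruptedException e) {
            stopRequested = true;
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + new StopFlagSketch().runWorkerBriefly(50));
    }
}
```

This is the single-writer, many-readers pattern where volatile on its own is enough.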
It doesn't guarantee correctness if you increment the variable from more than one thread, or check its value and then take some action, since volatile provides no mutual exclusion and those compound operations are not atomic. You can generally only guarantee correct execution if exactly one thread writes to the variable and all the others only read it.
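To make the increment problem concrete, here is a small sketch (CounterSketch and its method are hypothetical names for this example). Several threads bump both a volatile int and an AtomicInteger. The volatile counter can lose updates because ++ is a separate read, add and write; the AtomicInteger cannot.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterSketch {
    private volatile int volatileCount = 0;                 // visible, but ++ is not atomic
    private final AtomicInteger atomicCount = new AtomicInteger();

    // Runs `threads` workers that each increment both counters `perThread` times.
    // Returns {volatile result, atomic result}.
    int[] run(int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    volatileCount++;                // read-modify-write: updates can be lost
                    atomicCount.incrementAndGet();  // a single atomic operation
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { volatileCount, atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] r = new CounterSketch().run(4, 100_000);
        System.out.println("volatile counter: " + r[0] + " (can be anything up to 400000)");
        System.out.println("atomic counter:   " + r[1]);
    }
}
```

If you need check-then-act or read-modify-write from several threads, the usual options are the java.util.concurrent.atomic classes or a synchronized block.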
Also note that a 64-bit NON-volatile variable (a long or double) may be read or written as two 32-bit halves, so 32-bit variables are atomic on write but non-volatile 64-bit ones are not guaranteed to be. One half can be written before the other - so the value read could be neither the old nor the new value. Declaring the variable volatile makes its reads and writes atomic.
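A quick sketch of the 64-bit point, with made-up names (VolatileLongSketch, countTornReads). The writer flips a long between 0 and -1 (all bits clear, all bits set), so a torn read would show up as any other value. With the field declared volatile, the language spec rules that out. If you remove volatile, a torn read becomes legal, though most 64-bit JVMs will not produce one in practice.

```java
class VolatileLongSketch {
    // volatile long: reads and writes are atomic, never split into two 32-bit halves
    private volatile long value = 0L;

    // Counts reads that observed something other than the two values ever written.
    int countTornReads(int writes) {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < writes; i++) {
                value = (i % 2 == 0) ? 0L : -1L;   // 0x0000... or 0xFFFF...
            }
        });
        writer.start();
        int torn = 0;
        while (writer.isAlive()) {
            long seen = value;
            if (seen != 0L && seen != -1L) {
                torn++;                            // half old value, half new value
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return torn;
    }

    public static void main(String[] args) {
        System.out.println("torn reads: " + new VolatileLongSketch().countTornReads(1_000_000));
    }
}
```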
This is quite a helpful page from my bookmarks:
http://www.cs.umd.edu/~pugh/java/memoryModel/