Let's start with an example. First, the obvious one, I think:
List<String> wordList = Arrays.asList("just", "a", "test");
Set<String> wordSet = new HashSet<>(wordList);
System.out.println(wordSet);
for (int i = 0; i < 100; i++) {
    wordSet.add("" + i);
}
for (int i = 0; i < 100; i++) {
    wordSet.remove("" + i);
}
System.out.println(wordSet);
The output may show a different "order", because adding the strings "0" through "99" made the capacity bigger and the entries have moved. The same 3 entries are still there, but in a different order (if it can be called an order at all).
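To see why a bigger capacity can move entries, here is a minimal sketch of how a bucket index can be derived from a hash code. It follows the bit-spreading and masking that OpenJDK 8+ `HashMap` uses as far as I know, but that is an implementation detail and not a contract. `BucketIndexDemo`, `spread` and `bucketIndex` are names made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;

final class BucketIndexDemo {

    // Mixes the high bits of the hash code into the low bits.
    // Assumption: same idea as OpenJDK's HashMap, other JDKs may differ.
    static int spread(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    // capacity must be a power of two, so (capacity - 1) works as a bit mask.
    static int bucketIndex(Object key, int capacity) {
        return spread(key) & (capacity - 1);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("just", "a", "test");
        for (String word : words) {
            System.out.println(word
                    + " -> bucket " + bucketIndex(word, 16) + " of 16"
                    + ", bucket " + bucketIndex(word, 256) + " of 256");
        }
    }
}
```

Iteration walks the buckets in index order. With a larger table more bits of the hash take part in the index, so two entries can end up in a different relative position than before.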
So, yes, once you modify your Set between stream operations, the "order" could change.
Since you say that the Set will not be modified after creation, the order is preserved at the moment, under the current implementation (whatever that is). Or, more accurately, it is not internally randomized once the entries are laid into the Set.
But this is absolutely something not to rely on - ever. Things can change without notice, since the contract allows it: the docs make no guarantees about any order whatsoever. Sets are about uniqueness, after all.
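If the code really does depend on iteration order, one option is a Set whose contract documents it. The sketch below repeats the add/remove experiment with `LinkedHashSet`, which is specified to iterate in insertion order. `InsertionOrderDemo` and `orderAfterChurn` are hypothetical names for this illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class InsertionOrderDemo {

    // Adds and removes 100 temporary entries, then reports the iteration order.
    static List<String> orderAfterChurn() {
        Set<String> wordSet = new LinkedHashSet<>(Arrays.asList("just", "a", "test"));
        for (int i = 0; i < 100; i++) {
            wordSet.add("" + i);
        }
        for (int i = 0; i < 100; i++) {
            wordSet.remove("" + i);
        }
        return new ArrayList<>(wordSet);
    }

    public static void main(String[] args) {
        System.out.println(orderAfterChurn()); // [just, a, test]
    }
}
```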
To give you an example, the jdk-9 immutable Set and Map do have an internal randomization, and the "order" will change from run to run:
Set<String> set = Set.of("just", "a", "test");
System.out.println(set);
This is allowed to print:
[a, test, just] or [a, just, test]
EDIT
Here is what the randomization looks like:
/**
 * A "salt" value used for randomizing iteration order. This is initialized once
 * and stays constant for the lifetime of the JVM. It need not be truly random, but
 * it needs to vary sufficiently from one run to the next so that iteration order
 * will vary between JVM runs.
 */
static final int SALT;
static {
    long nt = System.nanoTime();
    SALT = (int)((nt >>> 32) ^ nt);
}
What this does:
take a long, XOR its upper 32 bits with its lower 32 bits, and keep the lower 32 bits of the result (by casting to int). XOR is used because it produces zeroes and ones with a 50% distribution for evenly distributed input bits, so it does not skew the result toward either (unlike AND or OR).
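A small sketch of that folding step on a fixed value, so the arithmetic can be checked by hand. `SaltFoldDemo` and `fold` are names invented for this example; only the expression inside `fold` comes from the snippet above. As a side note, `Long.hashCode(long)` is documented to compute the same XOR fold.

```java
final class SaltFoldDemo {

    // XORs the upper 32 bits into the lower 32 bits; the cast keeps the lower half.
    static int fold(long value) {
        return (int) ((value >>> 32) ^ value);
    }

    public static void main(String[] args) {
        long sample = 0x0000_0001_0000_0002L; // upper half = 1, lower half = 2

        System.out.println(fold(sample));                          // 1 ^ 2 = 3
        System.out.println(fold(sample) == Long.hashCode(sample)); // true

        // With nanoTime as input the value differs from one JVM run to the next.
        System.out.println(fold(System.nanoTime()));
    }
}
```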
How that is used (for a Set of two elements, for example):
// based on SALT set the elements in a particular iteration "order"
if (SALT >= 0) {
    this.e0 = e0;
    this.e1 = e1;
} else {
    this.e0 = e1;
    this.e1 = e0;
}
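To make the effect of the sign check visible, here is a self-contained sketch of a two-element holder that takes the salt as a constructor argument instead of a static field, so both branches can be exercised in one run. `SaltedPair` and `iterationOrder` are hypothetical names and this is not the JDK class, just the same idea in isolation.

```java
import java.util.Arrays;
import java.util.List;

final class SaltedPair<E> {
    private final E e0;
    private final E e1;

    // salt is passed in here only to make the demo deterministic;
    // the JDK snippet above reads a static SALT instead.
    SaltedPair(E first, E second, int salt) {
        if (salt >= 0) {
            this.e0 = first;
            this.e1 = second;
        } else {
            this.e0 = second;
            this.e1 = first;
        }
    }

    // The order in which an iterator over this pair would visit the elements.
    List<E> iterationOrder() {
        return Arrays.asList(e0, e1);
    }

    public static void main(String[] args) {
        System.out.println(new SaltedPair<>("a", "b", 1).iterationOrder());  // [a, b]
        System.out.println(new SaltedPair<>("a", "b", -1).iterationOrder()); // [b, a]
    }
}
```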
My guess as to why jdk-9 does the internal randomization, initially taken from here; the relevant part:
The final safety feature is the randomized iteration order of the immutable Set elements and Map keys. HashSet and HashMap iteration order has always been unspecified, but fairly stable, leading to code having inadvertent dependencies on that order. This causes things to break when the iteration order changes, which occasionally happens. The new Set/Map collections change their iteration order from run to run, hopefully flushing out order dependencies earlier in test or development
So it's basically meant to break all the code that relies on order for a Set/Map. The same thing happened when people moved from java-7 to java-8 while relying on HashMap's order (LinkedNodes), which became different due to the introduction of TreeNodes. If you leave a feature like that in place and people rely on it for years, it's hard to remove it and perform optimizations like HashMap's move to TreeNodes, because you are now forced to preserve that order even if you don't want to. But that is just a guess, obviously; please treat it as such.