When a player triggers a chunk load via walking around or teleporting there
is no need to stop everything and get this chunk on the main thread. The
client is used to having to wait some time for this chunk and the server
doesn't immediately do anything with it except send it to the player. At
the same time chunk loading is the last major source of file IO that still
runs on the main thread.
These two facts make it possible to offload chunks loaded for this reason
to another thread. However, not all parts of chunk loading can happen off
the main thread. For this we use the new AsynchronousExecutor system to
split chunk loading in to three pieces. The first is loading data from
disk, decompressing it, and parsing it in to an NBT structure. The second
piece is creating entities and tile entities in the chunk and adding them
to the world, this is still done on the main thread. The third piece is
informing everyone who requested a chunk load that the load is finished.
For this we register callbacks and then run them on the main thread once
the previous two stages are finished.
There are still cases where a chunk is needed immediately and these will
still trigger chunk loading entirely on the main thread. The most obvious
case is plugins using the API to request a chunk load. We also must load
the chunk immediately when something in the world tries to access it. In
these cases we ignore any possibly pending or in progress chunk loading
that is happening asynchronously as we will have the chunk loaded by the
time they are finished.
The hope is that overall this system will result in less CPU time and
pauses due to blocking file IO on the main thread thus giving more
consistent performance. Testing so far has shown that this also speeds up
chunk loading client side although some of this is likely to be because
we are sending less chunks at once for the client to process.
Thanks for @ammaraskar for help with the implementation of this feature.
After further testing it appears that while the original LongHashtable
has issues with object creation churn and is severly slower than even
java.util.HashMap in general case benchmarks it is in fact very efficient
for our use case.
With this in mind I wrote a replacement LongObjectHashMap modeled after
LongHashtable. Unlike the original implementation this one does not use
Entry objects for storage so does not have the same object creation churn.
It also uses a 2D array instead of a 3D one and does not use a cache as
benchmarking shows this is more efficient. The "bucket size" was chosen
based on benchmarking performance of the HashMap with contents that would
be plausible for a 200+ player server. This means it uses a little extra
memory for smaller servers but almost always uses less than the normal
java.util.HashMap.
To make up for the original LongHashtable being a poor choice for generic
datasets I added a mixer to the new implementation based on code from
MurmurHash. While this has no noticable effect positive or negative with
our normal use of chunk coordinates it makes the HashMap perform just as
well with nearly any kind of dataset.
After these changes ChunkProviderServer.isChunkLoaded() goes from using
20% CPU time while sampling to not even showing up after 45 minutes of
sampling due to the CPU usage being too low to be noticed.
This ArrayList duplicates part of the functionality of the much more
efficient chunk map so can be removed as the map can be used in the few
places this was needed.
Replace uses of LongHashtable and LongHashset with new implementations.
Remove EntryBase, LongBaseHashtable, LongHashset, and LongHashtable as they
are no longer used.
LongObjectHashMap does not use Entry or EntryBase classes internally for
storage so has much lower object churn and greater performance. LongHashSet
is not as much of performance win for our use case but for general use is
up to seventeen times faster than the old implementation and is in fact
faster than alternatives from "high performance" java libraries. This is
being added so that if someone tries to use it in the future in a place
unrelated to its current use they don't accidentally end up with something
slower than the Java collections HashSet implementation.
Added newlines at the end of files
Fixed improper line endings on some files
Matched start - end comments
Added some missing comments for diffs
Fixed syntax on some spots
Minimized some diff
Removed some no longer used files
Added comment on some required files with no changes
Fixed imports of items used once
Added imports for items used more than once