Appending to the tail of the chunk tasks leaves a
window for the chunk to be moved to a
non-ticking status.
Additionally, use CB's callback executor so we
can ensure that we are not incorrectly
scheduling.
See: https://gist.github.com/aikar/dd22bbd2a3d78a2fd3d92e95e9f28dc6
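The scheduling side of this looks roughly like the sketch below; the executor and queue names are illustrative, not the actual CraftBukkit fields.

```java
// Sketch only: instead of appending the runnable to the chunk's own task list
// (which may never run once the chunk drops to a non-ticking status), hand it to
// an executor whose queue is drained on the main thread regardless of chunk status.
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;

class CallbackExecutorSketch {
    private final Queue<Runnable> callbackQueue = new ConcurrentLinkedQueue<>();
    final Executor callbackExecutor = callbackQueue::add;

    // Called from chunk load / post-processing code.
    void scheduleCallback(Runnable task) {
        callbackExecutor.execute(task); // not tied to the chunk's ticking status
    }

    // Called from the main thread each tick.
    void drainCallbacks() {
        Runnable task;
        while ((task = callbackQueue.poll()) != null) {
            task.run();
        }
    }
}
```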
As part of post-processing a chunk, we can call ChunkConverter.
ChunkConverter then kicks off major physics updates, and when blocks
that have connections across chunk boundaries are involved, a recursive risk
can occur where A updates a block that triggers a physics request.
That physics request may trigger a chunk request, which then enqueues
a task into the Mailbox ChunkTaskQueueSorter.
If anything requests that same chunk while it is in the middle of conversion,
its mailbox queue is going to be held up, so the subsequent chunk request
will be unable to proceed.
We delay post processing of Chunk.A() by one "pass" by re-stuffing it back into
the executor, so that the mailbox ChunkQueue is then considered empty.
This successfully fixed a recurring and highly reproducible crash
for heightmaps.
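The actual mechanism is just re-submitting the post-processing step instead of running it inline, so the current mailbox pass finishes first. A minimal sketch (illustrative names):

```java
import java.util.concurrent.Executor;

class PostProcessDelaySketch {
    // Instead of running postProcess right now (while the mailbox still holds the
    // task that scheduled us), re-stuff it onto the executor so it runs one "pass"
    // later, once the mailbox queue is considered empty again.
    void schedulePostProcess(Runnable postProcess, Executor mailboxExecutor) {
        mailboxExecutor.execute(postProcess);
    }
}
```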
If the request to shut down the server is received while we are in
a watchdog hang, immediately treat it as a crash and begin the shutdown
process. The shutdown process is also improved to shut down cleanly when
not using restart scripts.
If a server is deadlocked, a server owner can send SIGHUP (or any other signal
the JVM understands to shut down as it currently does) and the watchdog
will no longer need to wait until the full timeout, allowing you to trigger
a close process and try to shut the server down gracefully, saving player and
world data.
Previously there was no way to trigger this outside of waiting for a full watchdog
timeout, which may be set to a really long time...
Additionally, fix everything to do with shutting the server down asynchronously.
Previously, nearly everything about the process was fragile and unsafe. Main might
not have actually been frozen, and might still be manipulating state.
Or, some request might ask main to do something in the shutdown but main is dead.
Or worse, other things might start closing down items such as the Console or Thread Pool
before we are fully shut down.
This change tries to resolve all of these issues by moving everything into the stop
method and guaranteeing only one thread is stopping the server.
We then issue Thread Death to the main thread if another thread initiates the stop process.
We have to ensure Thread Death propagates correctly, though, to stop main completely.
This is to ensure that if main isn't truly stuck, it's not manipulating state we are trying to save.
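The shape of that guarantee, heavily simplified (sketch with illustrative names; the real logic lives in the server's stop path):

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SafeShutdownSketch {
    // Only one thread may run the stop sequence.
    private final AtomicBoolean hasStopped = new AtomicBoolean(false);

    void safeShutdown(Thread mainThread) {
        if (!hasStopped.compareAndSet(false, true)) {
            return; // another thread is already stopping the server
        }
        if (Thread.currentThread() != mainThread) {
            // Deliver ThreadDeath to main so it cannot keep manipulating the state
            // we are about to save. Deprecated, but this is how ThreadDeath is raised.
            mainThread.stop();
        }
        // ... then run the actual stop sequence: save players, save worlds, close pools ...
    }
}
```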
Also, check the class loader cache before locking to speed up cache hits and avoid the lock.
Wasn't going to make a unique build just for that, but it can be lumped in here.
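That class loader tweak is just a check-before-lock pattern, something like this sketch (Class.forName stands in for the real slow path):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ClassCacheSketch {
    private final Map<String, Class<?>> classCache = new ConcurrentHashMap<>();

    Class<?> findClassCached(String name) throws ClassNotFoundException {
        Class<?> cached = classCache.get(name);
        if (cached != null) {
            return cached; // fast path: cached hit, no lock taken
        }
        synchronized (this) {
            cached = classCache.get(name); // re-check under the lock
            if (cached == null) {
                cached = Class.forName(name); // stand-in for the real lookup
                classCache.put(name, cached);
            }
            return cached;
        }
    }
}
```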
Very few entities actually hard collide, so store them in their own
entity slices and provide a special getEntities type call just for them.
This reduces entity collision checking impact (in my testing) by 25%
for crammed entities (shove 130 cows into an 8x6 area in one chunk).
Less crammed entities are likely to show significantly less benefit.
Effectively, this patch optimises crammed entity situations.
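Conceptually (a sketch, not the actual entity slice code), each slice keeps a second, much smaller list and the collision code queries only that:

```java
import java.util.ArrayList;
import java.util.List;

class HardCollidingSliceSketch {
    private final List<Object> allEntities = new ArrayList<>();
    private final List<Object> hardCollidingEntities = new ArrayList<>();

    void add(Object entity, boolean hardCollides) {
        allEntities.add(entity);
        if (hardCollides) {
            hardCollidingEntities.add(entity); // tiny list; most entities never land here
        }
    }

    // Special-purpose lookup used only by collision checks.
    List<Object> getHardCollidingEntities() {
        return hardCollidingEntities;
    }
}
```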
A player's previous block break location is held onto permanently, and if
an interact event is cancelled, the client sends a stop-breaking-block packet,
which then tries to update the client about that old location.
That old location might be in a now-unloaded chunk, causing that chunk to load.
We now also clear the reference to it once abort destroy block is run, so we
stop trying to send updates about the old block anyway.
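The gist of that last part, as a sketch (field name is hypothetical):

```java
class BlockBreakTrackerSketch {
    private Object lastBreakPosition; // hypothetical stand-in for the stored block position

    void onAbortDestroyBlock() {
        // ... existing abort handling ...
        this.lastBreakPosition = null; // stop sending updates about the old block
    }
}
```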
I had done a few of the operations myself, which would have prevented chunkCheck
from doing them itself, leaving some state behind in the original chunk,
and that's not good...
Leaf informed me this could cause ordering issues.
So, the risk of this occurring is lowered now anyway, but if an
entity causes a sync chunk load, it could process an unload...
We will tackle the problem better in a future commit
Also fixed another async-chunks=false issue
This will help prevent many cases of unregistering entities during entity ticking
Currently delays Chunk Unloads and Async Chunk load callbacks
Also dropped mid-tick chunk tasks during entity ticking to reduce this risk
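In spirit the guard looks like this (sketch, illustrative names):

```java
import java.util.ArrayDeque;

class UnloadGuardSketch {
    private boolean tickingEntities;
    private final ArrayDeque<Runnable> deferredUnloads = new ArrayDeque<>();

    void requestUnload(Runnable unloadTask) {
        if (tickingEntities) {
            deferredUnloads.add(unloadTask); // don't unregister entities mid-tick
        } else {
            unloadTask.run();
        }
    }

    void tickEntities(Runnable entityTick) {
        tickingEntities = true;
        try {
            entityTick.run();
        } finally {
            tickingEntities = false;
            Runnable task;
            while ((task = deferredUnloads.poll()) != null) {
                task.run(); // run the unloads we delayed
            }
        }
    }
}
```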
This is how it used to behave on Paper, and it was totally destroying
the ability to try to shut the server down gracefully during the
shutdown process, as events firing on the watchdog thread were throwing
errors.
This isn't an issue on Spigot
This has caused me so many rollbacks on watchdog already :(
The previous method only worked for a normal shutdown, and didn't cover
when the server enters a closing state due to watchdog crashes.
This is the correct variable for detecting that the server is in the middle of the shutdown process.
The streams hurt performance and allocate tons of garbage, so
replace them with the standard iterator.
Also optimise the stream.anyMatch statement to move to a bitset
where we can replace the call with a single bitwise operation.
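The anyMatch change amounts to swapping a per-call stream for a precomputed bit mask, roughly like this sketch (names are illustrative, not the actual class):

```java
class BitMaskSketch {
    private long typeMask; // bit i set => at least one entry of type i is present

    // Maintained where entries are added (removal would need a recount per type).
    void onEntryAdded(int typeId) {
        typeMask |= (1L << typeId);
    }

    // Replaces entries.stream().anyMatch(e -> matchesOneOf(e, types)).
    boolean hasAnyOfTypes(long queryMask) {
        return (typeMask & queryMask) != 0L; // single bitwise operation
    }
}
```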
This fix is for the few people who are using such low-end systems that
asynchronous chunk loading hurts them rather than helping.
The previous build made Paper crash if you turned off async chunks, and
this fixes that issue.
Mark chunks that are blocking main thread for world generation as urgent
Implements a general priority system so that the generator queues
can prioritize certain chunks over others.
Urgent chunks will jump to the front of the line, ensuring that a
sync chunk load on an ungenerated chunk does not lag the server for
a long period of time if the server's generator queues are filled with
lots of chunks already.
This massively reduces the lag spikes from sync chunk gens.
This is also a precursor to my next improvement to prioritize chunks
in front of the player (Frustum Prioritization).
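A sketch of the priority idea (illustrative names; the real sorting happens in the chunk task queues):

```java
// Tasks carry a priority and urgent ones sort to the front of the queue.
class PrioritizedChunkTask implements Comparable<PrioritizedChunkTask> {
    static final int URGENT = 0; // blocking the main thread (sync load/gen)
    static final int NORMAL = 5;

    final int priority;
    final Runnable work;

    PrioritizedChunkTask(int priority, Runnable work) {
        this.priority = priority;
        this.work = work;
    }

    @Override
    public int compareTo(PrioritizedChunkTask other) {
        return Integer.compare(this.priority, other.priority); // lower runs sooner
    }
}

// Usage: urgent tasks jump to the front of the line, e.g. via a
// java.util.concurrent.PriorityBlockingQueue<PrioritizedChunkTask>.
```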
In most cases, this change won't benefit much. However, there
exists the possibility that your Chunk Task threads are all busy
doing super slow work such as converting chunks.
If this occurs, main thread blocking tasks, even at the highest priority,
have to wait for some thread to become available.
This change gives us a waiting thread used only for main thread blocking
tasks, as well as an increased thread priority level, so that the OS
will give priority to this thread over the other threads.
This is more about guarantees, and won't be any real performance boost
to anyone who has low or fast activity on their chunk tasks anyway.
But not all of us force upgrade our worlds, and this can be a life saver.
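What "a waiting thread plus raised priority" means in practice, as a sketch (thread name and priority delta are assumptions):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingChunkExecutorSketch {
    // One executor reserved for main-thread-blocking chunk work, backed by a single
    // thread running at elevated priority so the OS schedules it ahead of the pool.
    static ExecutorService createMainThreadBlockingExecutor() {
        return Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "Chunk Task - Main Thread Blocking");
            thread.setPriority(Thread.NORM_PRIORITY + 2);
            thread.setDaemon(true);
            return thread;
        });
    }
}
```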
Also reordered some patches because multiple PRs were merged.
Forgot to flip the pending boolean back to false, causing it to copy
empty data on the next tick if nothing else triggered a load.
Haven't managed to actually reproduce the crash others got, but did
verify that the bad copy was occurring and erasing the data.
Also fixed a bug with the chunk load callback not executing before
another one was scheduled.
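The pending-flag fix is a one-liner of this shape (sketch, hypothetical names and data):

```java
class PendingCopySketch {
    private boolean pending;
    private long[] pendingData = new long[0];

    void onDataReady(long[] data) {
        this.pendingData = data;
        this.pending = true;
    }

    void tick(long[] destination) {
        if (this.pending) {
            this.pending = false; // the missing reset; without it the next tick copies stale/empty data
            System.arraycopy(this.pendingData, 0, destination, 0,
                    Math.min(this.pendingData.length, destination.length));
        }
    }
}
```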
Upstream has released updates that appear to apply and compile correctly.
This update has not been tested by PaperMC and as with ANY update, please do your own testing
CraftBukkit Changes:
183139d4 SPIGOT-5665: Improve loading spawn egg NBT
dec5df26 SPIGOT-5667: Can't add recipe without (vanilla) datapack
Spigot Changes:
ae72bf43 SPIGOT-5666: Customizable End City Seed
This can cause a nasty server lag if the spawn chunks are not kept loaded,
or they aren't finished loading yet, or if the world spawn radius is
larger than the keep-loaded range.
By skipping this, we avoid the potential for a large spike on server start.
Credit to Spotted for the idea
A lot of the new chunk system requires constant back and forth with the main thread
to handle priority scheduling and ensure conflicting tasks do not run at the
same time.
The issue is, these queues are only checked at either:
A) Sync Chunk Loads
B) End of Tick while sleeping
This results in generating chunks sitting and waiting for a full tick to
complete before the next unit of work even starts.
Additionally, this also delays loading of chunks until this same timing.
We will now periodically poll the chunk task queues throughout the tick,
looking for work to do.
We do this in a fair method that considers all worlds, not just the one being
ticked, so that each world can get 1 task processed before the next pass.
We also cap the throughput of these task processes to 1 per world per 0.1ms, or
200 max per tick, to ensure that a high volume of tasks does not overload the
current tick time.
In a view distance of 15, chunk loading performance was visually faster on the client.
Flying at high speed in spectator mode was able to keep up with chunk loading (as long as they are already generated)
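A rough sketch of the polling loop (names and structure are illustrative; the real version hooks into the server tick):

```java
import java.util.List;
import java.util.function.BooleanSupplier;

class MidTickChunkTasksSketch {
    private static final long PER_WORLD_RATE_NANOS = 100_000L; // 0.1 ms per world
    private static final int MAX_TASKS_PER_TICK = 200;

    private final List<BooleanSupplier> worldQueues; // each runs one task, returns true if it did
    private final long[] lastExecutePerWorld;
    private int executedThisTick;

    MidTickChunkTasksSketch(List<BooleanSupplier> worldQueues) {
        this.worldQueues = worldQueues;
        this.lastExecutePerWorld = new long[worldQueues.size()];
    }

    void startTick() {
        executedThisTick = 0;
    }

    // Called periodically throughout the tick, not only at sync loads or end-of-tick sleep.
    void midTickPoll() {
        long now = System.nanoTime();
        for (int i = 0; i < worldQueues.size(); i++) {
            if (executedThisTick >= MAX_TASKS_PER_TICK) {
                return; // global cap to protect the current tick
            }
            if (now - lastExecutePerWorld[i] < PER_WORLD_RATE_NANOS) {
                continue; // this world ran a task too recently
            }
            if (worldQueues.get(i).getAsBoolean()) { // one task per world per pass (fair)
                lastExecutePerWorld[i] = now;
                executedThisTick++;
            }
        }
    }
}
```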
Wiz mentioned that large WorldEdit operations cause light to run on the
main thread. The queue was small, set to 5; this bumps it to 20
but makes it configurable per-world.
The main risk of increasing this higher is during shutdown: some
queued light updates may be lost because Mojang did not flush the
light engine on shutdown...
The queue size only puts a cap on max loss; it doesn't solve that problem.
Don't touch this unless you know you have a problem and are ok with the risk.
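The per-world wiring is essentially this (sketch; the key name and default here are assumptions, not the actual config entry):

```java
import java.util.Map;

class WorldLightConfigSketch {
    int lightQueueSize = 20; // new default; was effectively 5 before

    void load(Map<String, Object> worldSection) {
        Object value = worldSection.get("light-queue-size"); // hypothetical key
        if (value instanceof Number) {
            this.lightQueueSize = ((Number) value).intValue();
        }
    }
}
```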