This is currently happening because some linters try to evaluate parts of
packages for cross-building to aarch64-linux-gnu, but not all packages support
that, and some crash when it's attempted.
I'm not quite sure what the correct behaviour should be, but maybe the data
service needs to try to handle these crashes rather than giving up on the
entire revision.
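As a sketch of what handling this could look like (run-checker-safely is a
hypothetical helper, not an existing data service procedure): wrap each
checker in a guard so a crash becomes a recordable result rather than
aborting the revision.

  (use-modules (ice-9 exceptions))

  ;; Hypothetical helper: run one lint checker over a package, turning
  ;; a crash into a recordable result instead of losing the whole
  ;; revision.
  (define (run-checker-safely checker package)
    (guard (exn (#t (list 'checker-crashed
                          (and (exception-with-message? exn)
                               (exception-message exn)))))
      (checker package)))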
This took a while to find, as process-job would just get stuck. It wasn't
directly related to any particular change; using more fibers just increased
the chance of hitting it.
This commit includes many of the things I changed while debugging.
If there's lots of contention for the resource pool, there will be lots of
waiters, so telling all of them to retry whenever a resource is returned seems
wasteful. This commit adds a new option (assume-reliable-waiters?) which has
the resource pool try to give a returned resource to the oldest waiter; if
that fails, it falls back to the old behaviour of telling all waiters to
retry.
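Roughly, the return path behaves like this sketch (illustrative names only,
not the actual resource pool internals; waiters is a list of reply channels,
oldest first, and resource bookkeeping is omitted):

  (use-modules (fibers channels)
               (fibers operations)
               (fibers timers))

  (define (offer-to-oldest-waiter resource waiters)
    ;; Try to hand RESOURCE to the oldest waiter, giving up quickly
    ;; if that waiter isn't there to take it.
    (perform-operation
     (choice-operation
      (wrap-operation (put-operation (car waiters) resource)
                      (const #t))
      (wrap-operation (sleep-operation 0.1)
                      (const #f)))))

  (define (handle-returned-resource resource waiters)
    (if (and (not (null? waiters))
             (offer-to-oldest-waiter resource waiters))
        (cdr waiters)   ; the oldest waiter took the resource
        (begin
          ;; Fall back to the old behaviour: tell every waiter to
          ;; retry its checkout.
          (for-each (lambda (waiter)
                      (put-message waiter 'resource-pool-retry-checkout))
                    waiters)
          '())))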
I think this was happening when a fiber missed the resource-pool-retry-checkout
reply from the resource pool. Handle this case by periodically retrying the
checkout, with a configurable timeout.
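On the checkout side, the retry can be built from fibers' operations API; a
sketch with illustrative names, assuming the pool receives checkout requests
on a channel:

  (use-modules (ice-9 match)
               (fibers channels)
               (fibers operations)
               (fibers timers))

  ;; Wait for a reply, but if nothing arrives within TIMEOUT seconds
  ;; (perhaps the retry message was missed), send the checkout request
  ;; again.
  (define (checkout-with-retry request-channel reply-channel timeout)
    (let loop ()
      (put-message request-channel `(checkout ,reply-channel))
      (match (perform-operation
              (choice-operation
               (get-operation reply-channel)
               (wrap-operation (sleep-operation timeout)
                               (const 'timeout))))
        ('timeout (loop))
        ('resource-pool-retry-checkout (loop))
        (resource resource))))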
The new fibers-map uses the same batching approach as fibers-for-each, and
fibers-map-with-progress additionally allows progress to be tracked while the
map is happening.
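The batching approach amounts to something like this simplified sketch
(fibers-map* and chunk are illustrative names, not the real implementation,
and progress reporting is omitted):

  (use-modules (fibers)
               (fibers channels)
               (srfi srfi-1))

  (define (chunk lst n)
    ;; Split LST into sublists of at most N elements.
    (if (> (length lst) n)
        (call-with-values (lambda () (split-at lst n))
          (lambda (head tail)
            (cons head (chunk tail n))))
        (list lst)))

  ;; Apply PROC to each item, running at most BATCH-SIZE fibers at a
  ;; time, and return the results in order.  Must run inside
  ;; run-fibers.
  (define (fibers-map* proc batch-size lst)
    (append-map
     (lambda (batch)
       (let ((channels
              (map (lambda (item)
                     (let ((reply (make-channel)))
                       (spawn-fiber
                        (lambda ()
                          (put-message reply (proc item))))
                       reply))
                   batch)))
         ;; Wait for the whole batch before starting the next one.
         (map get-message channels)))
     (chunk lst batch-size)))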
To allow calling derivation-file-names->derivation-ids in parallel across
multiple fibers, use the PostgreSQL connection fiber to perform the database
operations atomically.
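The pattern here is that one fiber owns the connection and serialises work
sent to it, so concurrent callers can't interleave their statements. A
minimal sketch with illustrative names (error handling omitted):

  (use-modules (ice-9 match)
               (fibers)
               (fibers channels))

  ;; One fiber owns CONN; other fibers send it procedures to run, so
  ;; each runs atomically with respect to the others.
  (define (spawn-connection-fiber conn)
    (let ((requests (make-channel)))
      (spawn-fiber
       (lambda ()
         (let loop ()
           (match-let (((proc . reply) (get-message requests)))
             (put-message reply (proc conn))
             (loop)))))
      requests))

  (define (with-connection requests proc)
    ;; Run (PROC conn) inside the connection fiber, returning its
    ;; result to the calling fiber.
    (let ((reply (make-channel)))
      (put-message requests (cons proc reply))
      (get-message reply)))

Each call to with-connection then has exclusive use of the connection for the
duration of proc, which is what makes it safe to call
derivation-file-names->derivation-ids from many fibers at once.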
So that the WAL can grow larger when there's sufficient space. When the
inferiors are closed, it takes time to restart them, so doing this less often
should speed up processing revisions.