These cached store connections have caches associated with them that take up
a lot of memory, leading to the inferior crashing. This change seems to help.
To the end of the main revision processing transaction.
Currently, I think there are issues when this query updates some builds, as
those rows in the build table remain locked until the end of the transaction,
which then causes build event submission to hang. Moving this part of the
revision loading process to the end of the transaction should help mitigate
this.
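Roughly, the intent is just to run the build-updating query as the last step
before COMMIT (a sketch with hypothetical query and helper names, assuming a
squee-style exec-query procedure):

  (define (process-revision conn commit)
    (exec-query conn "BEGIN")
    ;; The bulk of the revision loading work happens first.
    (insert-packages-and-derivations conn commit)   ; hypothetical helper
    ;; Last step: this UPDATE takes row locks in the build table that are
    ;; only released at COMMIT, so keep the window in which build event
    ;; submission can be blocked as short as possible.
    (exec-query conn "UPDATE builds SET ... WHERE ...")
    (exec-query conn "COMMIT"))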
Previously, duplicates could creep through if the duplicate package wasn't
exported and was only found as a replacement. Now they're filtered out.
This isn't ideal, as duplicates aren't always mistakes and it would still be
useful to capture these packages, but having multiple entries for the same
name+version breaks the comparison functionality.
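A rough sketch of the filtering (the accessor names here are just
placeholders): keep the first entry seen for each name+version pair and drop
the rest.

  (define (remove-duplicate-packages packages package-name package-version)
    ;; PACKAGES is a list; PACKAGE-NAME and PACKAGE-VERSION are accessors.
    (let ((seen (make-hash-table)))
      (filter
       (lambda (package)
         (let ((key (cons (package-name package)
                          (package-version package))))
           (if (hash-ref seen key)
               #f                        ; duplicate name+version, drop it
               (begin
                 (hash-set! seen key #t) ; first occurrence, keep it
                 #t))))
       packages)))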
Switch from using a recursive query to doing a breadth-first search through
the graph of derivations, as I think PostgreSQL wasn't doing a great job of
planning the recursive queries (it would overestimate the rows involved and
prefer sequential scans on the derivation_outputs table).
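Roughly, the breadth-first approach looks like this (a sketch assuming a
squee-style exec-query that returns rows as lists of strings, and the
derivation_inputs/derivation_outputs tables linking derivations to the
outputs they take as inputs):

  (define (transitive-input-derivation-ids conn root-ids)
    ;; Walk the derivation graph breadth first, one level per query,
    ;; rather than issuing a single recursive query.
    (let loop ((frontier root-ids)
               (seen (let ((table (make-hash-table)))
                       (for-each (lambda (id) (hash-set! table id #t))
                                 root-ids)
                       table)))
      (if (null? frontier)
          (hash-map->list (lambda (id value) id) seen)
          (let* ((rows (exec-query
                        conn
                        (string-append "
  SELECT DISTINCT derivation_outputs.derivation_id
  FROM derivation_inputs
  INNER JOIN derivation_outputs
    ON derivation_outputs.id = derivation_inputs.derivation_output_id
  WHERE derivation_inputs.derivation_id IN ("
                          (string-join (map number->string frontier) ", ")
                          ")")))
                 (next (filter (lambda (id) (not (hash-ref seen id)))
                               (map (lambda (row) (string->number (car row)))
                                    rows))))
            (for-each (lambda (id) (hash-set! seen id #t)) next)
            (loop next seen)))))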
I think some operations (like the database backup) can block the DROP
SEQUENCE part, so at least with this approach the main transaction should
commit, and the sequence is eventually dropped afterwards.
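Sketching the shape of this (hypothetical sequence name, assuming a
squee-style exec-query):

  (define (load-new-guix-revision conn)
    (exec-query conn "BEGIN")
    (exec-query conn "CREATE SEQUENCE loader_temp_ids")
    ;; ... the main revision loading work, which uses the sequence ...
    (exec-query conn "COMMIT")
    ;; Dropping the sequence needs an exclusive lock, which something like
    ;; a running backup can block, so do it outside the main transaction:
    ;; the revision data is already committed, and the sequence still gets
    ;; dropped eventually.
    (exec-query conn "DROP SEQUENCE IF EXISTS loader_temp_ids"))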
This code is a bit tricky, since it needs to be compatible with both old and
new Guix revisions. I think these changes stop computing package derivations
for invalid systems, while hopefully not breaking anything.
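A rough sketch of the intent (probing the supported systems through
inferior-eval is an assumption about how this could be done, not necessarily
what the code does):

  (use-modules (guix inferior))

  (define (systems-to-process inferior requested-systems)
    ;; Ask the inferior Guix which systems it supports; for old revisions
    ;; where this fails, fall back to trying every requested system as
    ;; before.
    (let ((supported
           (false-if-exception
            (inferior-eval '(@ (guix packages) %supported-systems)
                           inferior))))
      (if (list? supported)
          (filter (lambda (system) (member system supported))
                  requested-systems)
          requested-systems)))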
Previously it would compute a long list of strings, potentially more than
100,000 elements long, then split this list up and insert it in chunks. Only
then could the memory be freed.
This new approach builds the strings in batches for the insertion query, then
moves on to the next batch. This should mean that more memory can be freed and
reused along the way.
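A sketch of the batching shape (hypothetical helper, assuming a squee-style
exec-query):

  (use-modules (srfi srfi-1))   ; take and drop

  (define (insert-in-batches conn table-name rows row->values-string
                             batch-size)
    ;; Build the VALUES strings for one batch, insert it, and only then
    ;; move on, so each batch of strings can be garbage collected.
    (let loop ((remaining rows))
      (unless (null? remaining)
        (let* ((count (length remaining))
               (batch (if (> count batch-size)
                          (take remaining batch-size)
                          remaining))
               (rest (if (> count batch-size)
                         (drop remaining batch-size)
                         '())))
          (exec-query
           conn
           (string-append "INSERT INTO " table-name " VALUES "
                          (string-join (map row->values-string batch)
                                       ", ")))
          (loop rest)))))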
Rather than creating vhashes, just update the hash table that is used as a
cache, and query that. This should speed things up and reduce memory usage
when loading derivations.
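A sketch of the difference (hypothetical cache shape: a hash table mapping
derivation file names to database ids, with fetch-missing-ids standing in
for the database lookup):

  (use-modules (ice-9 match))

  (define (lookup-derivation-ids! cache conn file-names fetch-missing-ids)
    ;; Previously the fetched data was also copied into a fresh vhash for
    ;; querying; here new entries go straight into the shared cache.
    (let ((missing (filter (lambda (file-name)
                             (not (hash-ref cache file-name)))
                           file-names)))
      (for-each (match-lambda
                  ((file-name . id)
                   (hash-set! cache file-name id)))
                (fetch-missing-ids conn missing))
      ;; Answer every lookup from the cache itself.
      (map (lambda (file-name)
             (hash-ref cache file-name))
           file-names)))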
Split the recursive part of the query from the non-recursive part, since
PostgreSQL doesn't do a great job of estimating the number of rows which will
come back from the recursive part, and thus generates a bad plan.
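Sketching the split (assuming a squee-style exec-query and the
derivation_inputs/derivation_outputs tables): run the non-recursive query on
its own, then start the recursive query from the concrete ids it returned.

  (define (related-derivation-ids conn non-recursive-query)
    ;; Run the non-recursive part first ...
    (let* ((initial-ids (map car (exec-query conn non-recursive-query)))
           (id-list (string-join initial-ids ", ")))
      ;; ... then seed the recursive part with the ids it returned, so the
      ;; planner no longer has to guess how many rows the recursive part
      ;; produces when planning the first part.
      (map car
           (exec-query
            conn
            (string-append "
  WITH RECURSIVE related(id) AS (
    SELECT unnest(ARRAY[" id-list "]::integer[])
  UNION
    SELECT derivation_outputs.derivation_id
    FROM related
    INNER JOIN derivation_inputs
      ON derivation_inputs.derivation_id = related.id
    INNER JOIN derivation_outputs
      ON derivation_outputs.id = derivation_inputs.derivation_output_id
  )
  SELECT id FROM related")))))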