Commit graph

133 commits

Author SHA1 Message Date
02947cc9c4 Replace more instances of system->system-id with lookup-system-id
To avoid systems being inserted from queries.
2025-07-04 14:16:03 +01:00
de476a8b40 Reapply "Optimise inserting derivation inputs"
Reverting this change entirely was too slow, so change the joins in the query
from inner joins to left joins, as this should mean that NULL values get
inserted if there are missing derivations or derivation outputs, which should
cause an error rather than silently skipping inserting the derivation inputs.

This reverts commit edeb89e0cf.
2025-03-19 14:36:01 +00:00
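
The reasoning in de476a8b40 can be illustrated with a small sketch. It uses Python's sqlite3 module and a hypothetical, much simplified schema (the table and column names, including the `wanted` table, are assumptions for illustration, not the Guix Data Service schema). With an inner join, a row whose output is missing simply disappears from the insert; with a left join it becomes a NULL, which a NOT NULL column rejects.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with one known output."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE derivation_outputs (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE derivation_inputs (
            derivation_id INTEGER NOT NULL,
            derivation_output_id INTEGER NOT NULL
        );
        -- Hypothetical staging table: the inputs we want to record.
        CREATE TABLE wanted (derivation_id INTEGER NOT NULL, output_name TEXT NOT NULL);
        INSERT INTO derivation_outputs (id, name) VALUES (1, 'out-a');
        INSERT INTO wanted VALUES (10, 'out-a'), (10, 'out-missing');
        """
    )
    return conn


def insert_inputs(conn, join_kind):
    """Insert all wanted inputs with one INSERT ... SELECT; return the row count."""
    conn.execute(
        f"""
        INSERT INTO derivation_inputs (derivation_id, derivation_output_id)
        SELECT wanted.derivation_id, derivation_outputs.id
        FROM wanted
        {join_kind} JOIN derivation_outputs
          ON derivation_outputs.name = wanted.output_name
        """
    )
    return conn.execute("SELECT COUNT(*) FROM derivation_inputs").fetchone()[0]
```

Calling `insert_inputs(make_db(), "INNER")` inserts only the row that has a matching output, while `insert_inputs(make_db(), "LEFT")` raises `sqlite3.IntegrityError` because the unmatched row would carry a NULL output id.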
7fe042498f Tweak using vectors in insert-derivation-outputs 2025-03-17 14:59:42 +00:00
b904fdb161 Try to address the issue of missing derivation outputs 2025-03-17 10:26:21 +00:00
8635a5561b Add mechanism to fix derivation inputs
Some derivations are missing inputs, I don't know why, but this should allow
for manually fixing the affected derivations.
2025-03-10 08:20:43 +00:00
61d49cedb3 Remove compatibility with old guix derivation-inputs 2025-03-10 06:30:43 +00:00
edeb89e0cf Revert "Optimise inserting derivation inputs"
I'm concerned that this approach is more error prone and won't error if there
are issues with the data in the database.

This reverts commit 3081887b90.
2025-03-10 06:30:43 +00:00
9e3cfabe77 Fix some nulls 2025-02-06 17:14:47 +00:00
5ed98343d7 Rework loading revision data
These changes were motivated by switching to a mechanism of loading data that
isn't dependent on the big advisory lock that prevents more than one revision
from being processed at a time.

Since INSERT ... RETURNING id; is used, this can block if another transaction
inserts the same data, and then cause an error when that transaction
commits. The solution is to use ON CONFLICT DO NOTHING, but you have to handle
the case when the INSERT doesn't return an id since the other transaction has
inserted it.

This commit rewrites insert-missing-data-and-return-all-ids to do as described
above, as well as being more efficient in how existing data is detected and to
use more vectors. Other utilities for inserting data are added as well.
2024-12-09 10:53:06 +00:00
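
A rough sketch of the pattern described in 5ed98343d7, not the actual insert-missing-data-and-return-all-ids code. It assumes a hypothetical `systems` table and uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL. As a simplification it does not use RETURNING at all: it inserts with ON CONFLICT DO NOTHING and then looks up the ids for every value, which covers the case where the INSERT yields no id because the row already exists.

```python
import sqlite3


def make_db():
    """In-memory database with a unique constraint on the inserted value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE systems (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
    )
    return conn


def insert_missing_and_return_ids(conn, names):
    """Insert any names not yet present, then return {name: id} for all of them."""
    names = list(names)
    if not names:
        return {}
    # A conflicting row (already inserted, possibly by another transaction)
    # is skipped instead of raising a unique-constraint error.
    conn.executemany(
        "INSERT INTO systems (name) VALUES (?) ON CONFLICT (name) DO NOTHING",
        [(name,) for name in names],
    )
    # Skipped rows produce no new id, so fetch the ids in a separate query.
    placeholders = ", ".join("?" for _ in names)
    rows = conn.execute(
        f"SELECT name, id FROM systems WHERE name IN ({placeholders})", names
    ).fetchall()
    return dict(rows)
```

Calling the function twice with the same names returns the same ids both times, since the second call inserts nothing and only performs the lookup.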
38d5501233 Add placeholder derivation source file nar procedures 2024-10-27 14:39:52 +00:00
77962f7c2c Move inserting derivations into the load-new-guix-revision module
And start to more closely integrate it. This makes it possible to start making
it faster by doing more in parallel.
2024-08-07 17:21:28 +01:00
bbbcea8ff6 Add more time logging into insert-missing-derivations 2024-07-16 16:13:17 +01:00
1754d1a321 Stop inserting missing source file nars
This was more of an issue several years ago, so this code is not really needed
now.
2024-07-16 16:06:46 +01:00
e6205e988a Speed up querying for revision package derivations
By splitting it up by system.
2024-06-21 15:29:34 +01:00
39f626aa45 Remove even more time logging 2024-01-28 08:18:13 +00:00
e51d2f8932 Remove some time logging
As this is a bit noisy.
2024-01-27 18:39:41 +00:00
15b6dad5a5 Have delete-duplicates/sort! take an equality procedure
And change the default, as eq? doesn't always work.
2024-01-18 14:41:32 +00:00
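
As an illustration of the idea in 15b6dad5a5, here is a minimal Python analogue of removing duplicates by sorting and then comparing neighbours with a caller-supplied equality procedure. The function name and signature are hypothetical and do not mirror the Guile procedure. The default is value equality; identity comparison (Python's `is`, roughly comparable to `eq?`) can miss duplicates that are equal in value but are distinct objects.

```python
import operator


def delete_duplicates_sorted(items, key=None, equal=operator.eq):
    """Sort items, then drop each element that equals its predecessor."""
    result = []
    for item in sorted(items, key=key):
        # After sorting, duplicates are adjacent, so one comparison
        # against the last kept element is enough.
        if not result or not equal(result[-1], item):
            result.append(item)
    return result
```

For example, `delete_duplicates_sorted([3, 1, 3, 2, 1])` keeps one copy of each number in sorted order.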
4f1ae74d2f Handle derivations with no sources 2023-11-05 18:49:23 +00:00
03327c0cc3 Include output information in the package page response
As this will be useful for QA to say whether the package builds reproducibly
or not.
2023-11-05 13:46:20 +00:00
f5acc60288 Make some sweeping changes to loading new revisions
Move in the direction of being able to run multiple inferior REPLs, and use
some vectors rather than lists in places (maybe this is more efficient).
2023-11-02 12:16:17 +00:00
54c7a1a880 Fix ignoring canceled builds
The previous changes only affected searching for package derivations, and they
also didn't work.
2023-05-18 12:31:58 +01:00
4208b5f148 Ignore canceled builds when querying package derivations
This will help when using this to submit builds, since you won't end up
ignoring derivations with canceled builds.
2023-05-18 11:25:14 +01:00
1a0eaeb672 Improve performance of select-fixed-output-package-derivations-in-revision 2023-03-11 18:19:19 +00:00
6ada1cb845 Guard against divide by 0 in update-derivation-outputs-statistics 2022-11-28 13:17:20 +00:00
1a0c5599eb Do derivation inputs and outputs housekeeping at the end of each job
This should help with query performance, as the recursive queries using
derivation_inputs and derivation_outputs are particularly sensitive to the
n_distinct values for these tables.
2022-11-28 11:36:12 +00:00
78a5abee21 Improve chunking when inserting derivation inputs
Chunk the values inserted in the query, rather than the derivations involved,
as this is more consistent.
2022-09-17 08:53:23 +02:00
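
The difference described in 78a5abee21 can be sketched as follows. The data layout (pairs of a derivation id and a list of output ids) and the helper names are assumptions for illustration. Chunking by derivation gives queries of very uneven size when some derivations have many inputs and others none, whereas flattening to rows first and chunking those gives each query the same number of values.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def input_rows(derivations):
    """Flatten (derivation_id, [output_id, ...]) pairs into one row per input."""
    for derivation_id, output_ids in derivations:
        for output_id in output_ids:
            yield (derivation_id, output_id)


def rows_per_chunk_by_derivation(derivations, size):
    """Rows each query would carry if chunking by derivation."""
    return [sum(len(outputs) for _, outputs in chunk)
            for chunk in chunks(derivations, size)]


def rows_per_chunk_by_value(derivations, size):
    """Rows each query would carry if chunking the flattened rows."""
    return [len(chunk) for chunk in chunks(input_rows(derivations), size)]
```

With three derivations that have five, zero and one inputs and a chunk size of two, chunking by derivation produces queries with five rows and one row, while chunking by value produces three queries of two rows each.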
7050ea749f Reduce some chunk sizes 2022-09-17 00:40:51 +02:00
8ef896b103 Further reduce some chunk sizes 2022-09-15 16:25:31 +02:00
f41bfcf8b6 Reduce some chunk sizes
As these queries are still slow enough to be logged.
2022-09-14 15:42:00 +01:00
12af30c039 Reduce chunk size for inserting derivation inputs
As this query can take some time.
2022-09-14 09:48:59 +01:00
77c4e1cb63 Reduce the chunk size for querying related derivation ids
And include the chunk size in the log message.
2022-09-13 21:00:04 +01:00
6da5e8e67b Sort derivation output details ids
To ensure that direct array comparison can be used in the query.
2022-07-08 13:47:52 +01:00
db37d9f6a8 Split out inserting derivation output details sets
So that this can be used when inserting builds.
2022-07-08 13:47:52 +01:00
811256a920 Split out inserting into derivation_output_details
So that this can be done when inserting builds.
2022-07-08 13:47:52 +01:00
6d403cbc8d Allow filtering package derivations based on build server builds
This means you can query for derivations where builds exist or don't exist on
a given build server.

I think this will come in useful when submitting builds from a Guix Data
Service instance.
2022-05-23 22:39:32 +01:00
df4e0a7a61 Add to the hardcoded list of valid targets
Since the hardcoded list in the load-new-guix-revision code has been updated.
2022-03-11 11:50:10 +00:00
f86657915e Try to further speed up inserting missing derivation source files
Switch from using a recursive query to doing a breadth-first search through the
graph of derivations, as I think PostgreSQL wasn't doing a great job of
planning the recursive queries (it would overestimate the rows involved, and
prefer sequential scans for the derivation_outputs table).
2022-03-02 18:00:36 +00:00
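
A minimal sketch of the breadth-first approach mentioned in f86657915e, assuming a hypothetical `fetch_inputs` callable that stands for one non-recursive query per level: given a set of derivation ids, it returns the ids of the derivations they take as inputs. This is an illustration of the technique, not the project's code.

```python
def reachable_derivations(start_ids, fetch_inputs):
    """Return every derivation id reachable from start_ids, level by level."""
    seen = set(start_ids)
    frontier = set(start_ids)
    while frontier:
        # One call (one query) per level of the graph, covering the
        # whole frontier at once rather than one derivation at a time.
        found = set(fetch_inputs(frontier))
        # Only ids not seen before need to be expanded further, which
        # also keeps the loop from revisiting shared inputs.
        frontier = found - seen
        seen |= frontier
    return seen
```

Because each level only contains ids that have not been seen before, shared inputs are expanded once and the number of queries is bounded by the depth of the graph.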
c5b504e94a Speed up the finding of missing sources
Use larger batches and more efficient duplicate deletion.
2022-03-01 20:57:26 +00:00
f1418c4e88 Support querying package derivation outputs without the nars
Since this speeds up the response if you don't need the nar information.
2022-01-31 20:24:27 +00:00
a7c9daab6a Process derivations in chunks
Which should reduce the peak memory usage.
2022-01-14 15:25:53 +00:00
5ae8b796a7 Rename chunk-map! to chunk-for-each!
As that better reflects what it does.
2022-01-14 15:25:13 +00:00
21cb33a859 Re-write insert-derivation-inputs in a more memory efficient manner
Previously it would compute a long list of strings, potentially more than
100,000 elements long, then split this list up and insert it in chunks. Only
then could memory be freed.

This new approach builds the strings in batches for the insertion query, then
moves on to the next batch. This should mean that more memory can be freed and
reused along the way.
2022-01-12 18:18:15 +00:00
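
The batch-at-a-time approach from 21cb33a859 might look roughly like this in Python. The function name is hypothetical, and the sketch uses placeholders with a parameter list rather than building value strings, which is a swapped-in technique. The point it illustrates is that a generator only materialises one batch of the query at a time, so earlier batches can be freed once they have been executed.

```python
from itertools import islice


def batched_insert_statements(rows, batch_size):
    """Lazily yield (sql, params) for one batch of rows at a time."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        # Build the statement for this batch only; nothing from earlier
        # batches is kept alive by the generator.
        placeholders = ", ".join("(?, ?)" for _ in batch)
        sql = (
            "INSERT INTO derivation_inputs (derivation_id, derivation_output_id) "
            f"VALUES {placeholders}"
        )
        params = [value for row in batch for value in row]
        yield sql, params
```

A caller would loop over the generator and execute each statement before asking for the next one, for example `for sql, params in batched_insert_statements(rows, 1000): conn.execute(sql, params)`.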
ba9bcbf735 Use a bigger start size for the hash table
This might help when there's lots of derivations to insert.
2021-10-03 15:28:40 +01:00
b28d338de7 Insert derivations in chunks
To avoid making a very large query when inserting lots of derivations.
2021-10-03 14:54:43 +01:00
af0a06d147 Log the time to read missing derivations from the store 2021-10-03 12:59:26 +01:00
3627d36d77 Select existing derivations in chunks
To avoid one massive query.
2021-10-03 12:59:02 +01:00
857b4e32d5 Insert derivation inputs in chunks
To avoid one massive query.
2021-10-03 12:56:23 +01:00
211da6868f Handle the case where there are no missing file names
In update-derivation-ids-hash-table!.
2021-09-25 00:09:08 +01:00
3081887b90 Optimise inserting derivation inputs
Rather than querying for the output ids one by one and then running an insert
query for each derivation, perform the task with a single insert query.
2021-09-24 18:22:28 +01:00
abff41f9ae Neaten up formatting in select-derivation-output-id 2021-09-24 17:26:48 +01:00