guix-data-service

Author	SHA1	Message	Date
Christopher Baines	02947cc9c4	Replace more instances of system->system-id with lookup-system-id To avoid systems being inserted from queries.	2025-07-04 14:16:03 +01:00
Christopher Baines	de476a8b40	Reapply "Optimise inserting derivation inputs" Reverting this change entirely was too slow, so change the joins in the query from inner joins to left joins, as this should mean that NULL values get inserted if there are missing derivations or derivation outputs, which should cause an error rather than silently skipping inserting the derivation inputs. This reverts commit `edeb89e0cf`.	2025-03-19 14:36:01 +00:00
Christopher Baines	7fe042498f	Tweak using vectors in insert-derivation-outputs	2025-03-17 14:59:42 +00:00
Christopher Baines	b904fdb161	Try to address the issue of missing derivation outputs	2025-03-17 10:26:21 +00:00
Christopher Baines	8635a5561b	Add mechanism to fix derivation inputs Some derivations are missing inputs, I don't know why, but this should allow for manually fixing the affected derivations.	2025-03-10 08:20:43 +00:00
Christopher Baines	61d49cedb3	Remove compatibility with old guix derivation-inputs	2025-03-10 06:30:43 +00:00
Christopher Baines	edeb89e0cf	Revert "Optimise inserting derivation inputs" I'm concerned that this approach is more error prone and won't error if there are issues with the data in the database. This reverts commit `3081887b90`.	2025-03-10 06:30:43 +00:00
Christopher Baines	9e3cfabe77	Fix some nulls	2025-02-06 17:14:47 +00:00
Christopher Baines	5ed98343d7	Rework loading revision data These changes were motivated by switching to a mechanism of loading data that isn't dependent on the big advisory lock that prevents more than one revision from being processed at a time. Since INSERT ... RETURNING id; is used, this can block if another transaction inserts the same data, and then cause an error when that transaction commits. The solution is to use ON CONFLICT DO NOTHING, but you have to handle the case when the INSERT doesn't return an id since the other transaction has inserted it. This commit rewrites insert-missing-data-and-return-all-ids to do as described above, as well as being more efficient in how existing data is detected and to use more vectors. Other utilities for inserting data are added as well.	2024-12-09 10:53:06 +00:00
Christopher Baines	38d5501233	Add placeholder derivation source file nar procedures	2024-10-27 14:39:52 +00:00
Christopher Baines	77962f7c2c	Move inserting derivations in to the load-new-guix-revision module And start to more closely integrate it. This makes it possible to start making it faster by doing more in parallel.	2024-08-07 17:21:28 +01:00
Christopher Baines	bbbcea8ff6	Add more time logging in to insert-missing-derivations	2024-07-16 16:13:17 +01:00
Christopher Baines	1754d1a321	Stop inserting missing source file nars This was more an issue several years ago, so this code is not really needed now.	2024-07-16 16:06:46 +01:00
Christopher Baines	e6205e988a	Speed up querying for revision package derivations By splitting it up by system.	2024-06-21 15:29:34 +01:00
Christopher Baines	39f626aa45	Remove even more time logging	2024-01-28 08:18:13 +00:00
Christopher Baines	e51d2f8932	Remove some time logging As this is a bit noisy.	2024-01-27 18:39:41 +00:00
Christopher Baines	15b6dad5a5	Have delete-duplicates/sort! take an equality procedure And change the default, as eq? doesn't always work.	2024-01-18 14:41:32 +00:00
Christopher Baines	4f1ae74d2f	Handle derivations with no sources	2023-11-05 18:49:23 +00:00
Christopher Baines	03327c0cc3	Include output information in the package page response As this will be useful for QA to say whether the package builds reproducibly or not.	2023-11-05 13:46:20 +00:00
Christopher Baines	f5acc60288	Make some sweeping changes to loading new revisions Move in the direction of being able to run multiple inferior REPLs, and use some vectors rather than lists in places (maybe this is more efficient).	2023-11-02 12:16:17 +00:00
Christopher Baines	54c7a1a880	Fix ignoring canceled builds The previous changes only affected searching for package derivations, and they also didn't work.	2023-05-18 12:31:58 +01:00
Christopher Baines	4208b5f148	Ignore canceled builds when querying package derivations This will help when using this to submit builds, since you won't end up ignoring derivations with canceled builds.	2023-05-18 11:25:14 +01:00
Christopher Baines	1a0eaeb672	Improve performance of select-fixed-output-package-derivations-in-revision	2023-03-11 18:19:19 +00:00
Christopher Baines	6ada1cb845	Guard against divide by 0 in update-derivation-outputs-statistics	2022-11-28 13:17:20 +00:00
Christopher Baines	1a0c5599eb	Do derivation inputs and outputs housekeeping at the end of each job This should help with query performance, as the recursive queries using derivation_inputs and derivation_outputs are particularly sensitive to the n_distinct values for these tables.	2022-11-28 11:36:12 +00:00
Christopher Baines	78a5abee21	Improve chunking when inserting derivation inputs Chunk the values inserted in the query, rather than the derivations involved, as this is more consistent.	2022-09-17 08:53:23 +02:00
Christopher Baines	7050ea749f	Reduce some chunk sizes	2022-09-17 00:40:51 +02:00
Christopher Baines	8ef896b103	Further reduce some chunk sizes	2022-09-15 16:25:31 +02:00
Christopher Baines	f41bfcf8b6	Reduce some chunk sizes As these queries are still slow enough to be logged.	2022-09-14 15:42:00 +01:00
Christopher Baines	12af30c039	Reduce chunk size for inserting derivation inputs As this query can take some time.	2022-09-14 09:48:59 +01:00
Christopher Baines	77c4e1cb63	Reduce the chunk size for querying related derivation ids And include the chunk size in the log message.	2022-09-13 21:00:04 +01:00
Christopher Baines	6da5e8e67b	Sort derivation output details ids To ensure that direct array comparison can be used in the query.	2022-07-08 13:47:52 +01:00
Christopher Baines	db37d9f6a8	Split out inserting derivation output details sets So that this can be used when inserting builds.	2022-07-08 13:47:52 +01:00
Christopher Baines	811256a920	Split out inserting into derivation_output_details So that this can be done when inserting builds.	2022-07-08 13:47:52 +01:00
Christopher Baines	6d403cbc8d	Allow filtering package derivations based on build server builds This means you can query for derivations where builds exist or don't exist on a given build server. I think this will come in useful when submitting builds from a Guix Data Service instance.	2022-05-23 22:39:32 +01:00
Christopher Baines	df4e0a7a61	Add to the hardcoded list of valid targets Since the hardcoded list in the load-new-guix-revision code has been updated.	2022-03-11 11:50:10 +00:00
Christopher Baines	f86657915e	Try to further speed up inserting missing derivation source files Switch from using a recursive query to doing a breadth-first search through the graph of derivations, as I think PostgreSQL wasn't doing a great job of planning the recursive queries (it would overestimate the rows involved, and prefer sequential scans for the derivation_outputs table).	2022-03-02 18:00:36 +00:00
Christopher Baines	c5b504e94a	Speed up the finding of missing sources Use larger batches and more efficient duplicate deletion.	2022-03-01 20:57:26 +00:00
Christopher Baines	f1418c4e88	Support querying package derivation outputs without the nars Since this speeds up the response if you don't need the nar information.	2022-01-31 20:24:27 +00:00
Christopher Baines	a7c9daab6a	Process derivations in chunks Which should reduce the peak memory usage.	2022-01-14 15:25:53 +00:00
Christopher Baines	5ae8b796a7	Rename chunk-map! to chunk-for-each! As that better reflects what it does.	2022-01-14 15:25:13 +00:00
Christopher Baines	21cb33a859	Re-write insert-derivation-inputs in a more memory efficient manner Previously it would compute a long list of strings, potentially more than 100,000 elements long, then split this list up and insert it in chunks. Only then could memory be freed. This new approach builds the strings in batches for the insertion query, then moves on to the next batch. This should mean that more memory can be freed and reused along the way.	2022-01-12 18:18:15 +00:00
Christopher Baines	ba9bcbf735	Use a bigger start size for the hash table This might help when there's lots of derivations to insert.	2021-10-03 15:28:40 +01:00
Christopher Baines	b28d338de7	Insert derivations in chunks To avoid making a very large query when inserting lots of derivations.	2021-10-03 14:54:43 +01:00
Christopher Baines	af0a06d147	Log the time to read missing derivations from the store	2021-10-03 12:59:26 +01:00
Christopher Baines	3627d36d77	Select existing derivations in chunks To avoid one massive query.	2021-10-03 12:59:02 +01:00
Christopher Baines	857b4e32d5	Insert derivation inputs in chunks To avoid one massive query.	2021-10-03 12:56:23 +01:00
Christopher Baines	211da6868f	Handle the case where there are no missing file names In update-derivation-ids-hash-table!.	2021-09-25 00:09:08 +01:00
Christopher Baines	3081887b90	Optimise inserting derivation inputs Rather than querying for the output ids one by one and then running an insert query for each derivation, perform the task with a single insert query.	2021-09-24 18:22:28 +01:00
Christopher Baines	abff41f9ae	Neaten up formatting in select-derivation-output-id	2021-09-24 17:26:48 +01:00


133 commits