84a2ad5b25
Speed up deleting derivation sources
2024-06-20 17:07:41 +01:00
530f58b59c
Cache the derivations that weren't deleted
When deleting derivations, as I think this might reduce the number of queries.
2024-06-20 15:47:21 +01:00
f7ada4bf1f
Guard against trying to delete an empty list of commits
2024-05-22 11:46:18 +01:00
9f102dbd39
Add code to delete nars entries
2023-08-01 14:13:10 +01:00
7495085f63
Delete unreferenced derivations in batches
To avoid a long blocking query.
2023-08-01 10:16:31 +01:00
bbc53deb1f
Rewrite deleting unreferenced derivations
Use fibers more, leaning in on the non-blocking use of Squee for parallelism.
2023-07-25 17:57:00 +01:00
7251c7d653
Stop using a pool of threads for database operations
Now that squee cooperates with suspendable ports, this is unnecessary. Use a
connection pool to still support running queries in parallel using multiple
connections.
2023-07-10 18:56:31 +01:00
742949cc97
Improve data deletion
2023-07-01 12:01:13 +01:00
47c482bdcc
Set lock_timeout for some data deletion transactions
As these can cause deadlocks. This will probably cause errors, so some
retrying will need to be added.
2023-05-09 08:55:09 +01:00
4fa7a3601e
Include distribution counts table in data deletion
2023-04-07 11:21:28 +01:00
2d96fbff48
Speed up deleting blocked_builds entries
2023-02-27 22:52:43 +00:00
1bce38a69d
Move the delete-unreferenced-derivations advisory lock
To better prevent two processes running at the same time.
2023-02-27 22:48:54 +00:00
1266d3d336
Remove redundant postgresql connection when deleting derivations
2023-02-14 20:59:21 +00:00
ebbcf36dc4
Delete blocked_builds entries when deleting derivations
2023-02-14 20:10:44 +00:00
5874c4ee37
Delete git_branches entries
When deleting data for a branch.
2023-02-14 19:57:30 +00:00
9872367c01
Avoid errors dropping partition tables if they don't exist
2023-02-13 20:10:23 +00:00
078516e0ab
Improve dropping package_derivations_by_guix_revision_range partitions
2023-02-13 19:26:44 +00:00
38b3657013
Use advisory locks to avoid deadlocks during data deletion
In the case where multiple data deleting processes end up running at the same
time.
2022-11-28 10:26:46 +00:00
39487cd7e6
Improve deleting derivations
Drop the batch size to get rid of warnings about memory usage and improve the
logging by adding duration information.
2022-07-08 20:55:58 +01:00
22c2ed2fa7
Fix ambiguous id column in delete-guix-revisions query
2022-06-16 12:46:32 +01:00
754f64718f
Fix DELETE query in delete-revisions-from-branch
2022-06-16 12:38:51 +01:00
be45e4251e
Fix ambiguous id column in delete-from-git-commits
2022-06-16 12:30:08 +01:00
71aaf1016b
Remove duplicate AND from delete-from-git-commits query
2022-06-16 12:25:47 +01:00
64be52844e
Partition the package_derivations_by_guix_revision_range table
And create a proper git_branches table in the process.
I'm hoping this will help with slow deletions from the
package_derivations_by_guix_revision_range table in the case where there are
lots of branches, since it'll separate the data for one branch from another.
These migrations will remove the existing data, so
rebuild-package-derivations-table will currently need manually running to
regenerate it.
2022-05-23 19:10:25 +01:00
971a474f65
Update delete-unreferenced-derivations
To delete from latest_build_status as well.
2020-10-13 20:33:07 +01:00
f02c245652
Add another guard clause to the data deletion code
I've seen this error [1], which may relate to the derivation-output-details-id
not being a number, so this check should confirm whether there is an issue.
1: Throw to key `psql-query-error' with args `(fatal-error "PGRES_FATAL_ERROR" "ERROR: invalid input syntax for integer: \"\"\n")'.
2020-10-10 13:34:54 +01:00
2c463fcdab
Guard against derivation IDs that aren't numbers
I saw an error suggesting that something came back that wasn't a number, and
this should give a more informative error.
2020-10-09 19:27:04 +01:00
062397e82b
Just use map rather than par-map& for deleting derivations
As I think par-map& is probably no faster.
2020-10-08 08:20:03 +01:00
936fda57c5
Make the derivation deletion batch size configurable
2020-10-08 07:52:03 +01:00
b540abaeba
Reduce the derivation deletion batch size
2020-10-08 07:49:28 +01:00
f68166514f
Actually delete more of the data for a revision
Previously the package_derivations table wasn't considered, which would mean
derivations would still be referenced. This commit fixes that, along with also
deleting unreferenced entries in some linter related tables.
2020-10-04 15:11:21 +01:00
48673b32cb
Fix delete-unreferenced-derivations
2020-10-04 13:23:15 +01:00
a24d3e934d
Extract out the ability to delete a range of commits
Some revisions have got disassociated from branches, probably because they
were associated with multiple branches in the first place. This should allow
deleting them.
2020-10-04 12:18:57 +01:00
e2e55c69de
Rework the short-lived PostgreSQL-specific connection channel
Into a generic thing more like (ice-9 futures). Including copying some bits
from the (ice-9 threads) module and adapting them to work with this fibers
approach, rather than futures. The advantage being that using fibers channels
doesn't block the threads being used by fibers, whereas futures would.
2020-10-03 21:32:46 +01:00
470573b318
Delete derivation_source_files that are unreferenced
This will also delete unreferenced derivation_source_file_nars.
2020-10-02 20:15:23 +01:00
54654417a3
Delete derivations in parallel
In an attempt to make this faster.
2020-10-01 19:15:32 +01:00
16600b1a43
Remove the deleting derivations progress output
As this is harder to do when deleting derivations in parallel.
2020-10-01 19:14:56 +01:00
fb4c7ecd4c
Delete derivations through a channel
Not much different from before, but this will allow parallelising things.
2020-10-01 19:14:11 +01:00
3330f034a4
Remove a now redundant part of the maybe-delete-derivation query
As this is covered by the big query selecting the derivation ids.
2020-09-30 20:34:33 +01:00
d844b325e2
Stop recursing now that derivation deletion selection is smarter
As this probably won't help with performance.
2020-09-30 20:07:41 +01:00
47af6c9661
Attempt to speed up derivation deletion
Stop querying for the file-name, as it's unused. Rather than fetching all ids,
then looking at each to see if it can be deleted, do some imperfect but not
too slow checks in the initial query.
2020-09-30 19:38:56 +01:00
02681d7e7a
Fix delete builds for derivation output details set
2020-09-27 16:21:51 +01:00
5b13ee2251
Delete builds for unreferenced derivations
2020-09-27 11:11:02 +01:00
52a23a5333
Further data deletion improvements
2020-09-27 11:10:47 +01:00
65e8bf3f8d
Add delete-revisions-from-branch-except-most-recent-n
2020-09-26 19:38:56 +01:00
992a0af63e
Split off delete-revisions-from-branch from delete-data-for-branch
To support not deleting all of the revisions.
2020-09-26 18:23:21 +01:00
f11421824d
Add a helper procedure to delete data for deleted branches
2020-05-23 21:05:44 +01:00
ca0d3ee754
Stop using package_versions_by_guix_revision_range
It's been replaced by the package_derivations_by_guix_revision_range table.
2020-03-24 20:44:57 +00:00
9178bd51a9
Add a function to delete unreferenced derivations
2020-02-16 22:29:25 +00:00
b087cfca67
Define the code to delete data from non-master branches properly
2020-02-16 10:59:38 +00:00