guix-data-service

Author	SHA1	Message	Date
Christopher Baines	f68166514f	Actually delete more of the data for a revision Previously the package_derivations table wasn't considered, which would mean derivations would still be referenced. This commit fixes that, along with also deleting unreferenced entries in some linter related tables.	2020-10-04 15:11:21 +01:00
Christopher Baines	48673b32cb	Fix delete-unreferenced-derivations	2020-10-04 13:23:15 +01:00
Christopher Baines	93c9813546	Fix the implementation of par-map& It was pretty wrong...	2020-10-04 13:22:35 +01:00
Christopher Baines	d2646e7110	Remove some non-existent imports	2020-10-04 13:22:24 +01:00
Christopher Baines	a24d3e934d	Extract out the ability to delete a range of commits Some revisions have got disassociated from branches, probably because they were associated with multiple branches in the first place. This should allow deleting them.	2020-10-04 12:18:57 +01:00
Christopher Baines	fe7da1ba57	Remove some unnecessary parallel-via-thread-pool-channel calls As these were causing errors because they were nested in letpar&.	2020-10-04 11:29:51 +01:00
Christopher Baines	96b65f16fb	Avoid fiber deadlocks Channels don't represent some channel on which messages travel, at least not a very long one because it can't accommodate any messages. They simply represent a direct exchange of the message between a sender and receiver. Because of this, put-message blocks the fiber, and if all the threads on the other end are waiting for replies to be received, then you have a deadlock. To avoid this situation, spawn new fibers to send the messages. I think this works at least, although I'm unsure how sensible it is.	2020-10-04 10:18:53 +01:00
Christopher Baines	55eaaaeeac	Bump the copyright date in the footer Later is better than never...	2020-10-03 21:42:18 +01:00
Christopher Baines	c3c9c07f9a	Completely rework the way db connections are handled during requests Previously, a connection was passed through the code handling the request. When queries were performed, this could block the thread though, potentially leaving the server unable to serve other requests. Instead, this now runs queries in a pool of threads. This should remove the possibility of blocking the threads used by the web server, and in doing so, some of the queries have been parallelised. I'm still not sure about the naming and syntax, but I think the functionality is a sort of step forward.	2020-10-03 21:35:31 +01:00
Christopher Baines	e2e55c69de	Rework the short-lived PostgreSQL specific connection channel Into a generic thing more like (ice-9 futures). Including copying some bits from the (ice-9 threads) module and adapting them to work with this fibers approach, rather than futures. The advantage being that using fibers channels doesn't block the threads being used by fibers, whereas futures would.	2020-10-03 21:32:46 +01:00
Christopher Baines	18b6dd9e6d	Stop opening a PostgreSQL connection per request This was good in that it avoided having to deal with long running connections, but it probably takes some time to open the connection, and these changes are a step towards offloading the PostgreSQL queries to other threads, so they don't block the threads for fibers.	2020-10-03 09:22:29 +01:00
Christopher Baines	9723a18df4	Add some utilities to work with PostgreSQL connections in threads	2020-10-03 09:20:39 +01:00
Christopher Baines	1bdc8855ba	Extract out opening PostgreSQL connections So this can be reused.	2020-10-03 08:55:56 +01:00
Christopher Baines	470573b318	Delete derivation_source_files that are unreferenced This will also delete unreferenced derivation_source_file_nars.	2020-10-02 20:15:23 +01:00
Christopher Baines	e2a7705d3d	Add an index for derivation_sources.derivation_source_file_id As this speeds up deleting derivation_source_files.	2020-10-02 20:15:23 +01:00
Christopher Baines	71afa93981	Make with-postgresql-connection work with multiple values	2020-10-02 20:15:23 +01:00
Christopher Baines	841f5fb186	Change a constraint to add ON DELETE CASCADE I've not used these in many places, to try and avoid hiding the deletion of data, but in this case, this will allow more easily deleting the derivation source file nars, by just deleting the derivation_source_files table entry.	2020-10-02 20:15:10 +01:00
Christopher Baines	125a35fce5	Reformat lint warning related query	2020-10-02 17:52:07 +01:00
Christopher Baines	af40c1ac13	Speed up a query for derivation builds This change removes a sequential scan from the query plan, making it much faster.	2020-10-02 17:51:55 +01:00
Christopher Baines	6e0e33addf	Change the autovacuum config for some tables Looking at data for the patches deployment of the Guix Data Service, these tables look like they might benefit from vacuuming/analyzing more often, so adjust the configuration so this will hopefully happen.	2020-10-01 22:30:39 +01:00
Christopher Baines	c05a8e4e9f	COALESCE a couple more pg_stat fields As apparently they can be NULL.	2020-10-01 22:00:52 +01:00
Christopher Baines	7f49756bac	Track some pg_stat metrics Hopefully this'll help track database things better.	2020-10-01 21:43:41 +01:00
Christopher Baines	404f39a9ee	Drop default thread count for make-postgresql-connection-channel At least for data deletion, 4 seems unnecessary.	2020-10-01 19:41:13 +01:00
Christopher Baines	54654417a3	Delete derivations in parallel In an attempt to make this faster.	2020-10-01 19:15:32 +01:00
Christopher Baines	16600b1a43	Remove the deleting derivations progress output As this is harder to do when deleting derivations in parallel.	2020-10-01 19:14:56 +01:00
Christopher Baines	fb4c7ecd4c	Delete derivations through a channel Not much different from before, but this will allow parallelising things.	2020-10-01 19:14:11 +01:00
Christopher Baines	614f9888a5	Add some utilities to use PostgreSQL/Squee through a channel To allow for some concurrency.	2020-10-01 19:13:30 +01:00
Christopher Baines	3330f034a4	Remove a now redundant part of the maybe-delete-derivation query As this is covered by the big query selecting the derivation ids.	2020-09-30 20:34:33 +01:00
Christopher Baines	d844b325e2	Stop recursing now that derivation deletion selection is smarter As this probably won't help with performance.	2020-09-30 20:07:41 +01:00
Christopher Baines	47af6c9661	Attempt to speed up derivation deletion Stop querying for the file-name, as it's unused. Rather than fetching all ids, then looking at each to see if it can be deleted, do some imperfect but not too slow checks in the initial query.	2020-09-30 19:38:56 +01:00
Christopher Baines	39b5df04eb	Remove development code from the process job script	2020-09-28 08:29:20 +01:00
Christopher Baines	033858410b	Add a JSON page for repository branches	2020-09-27 16:32:56 +01:00
Christopher Baines	f7933807ac	Add a JSON representation for repositories	2020-09-27 16:26:45 +01:00
Christopher Baines	02681d7e7a	Fix delete builds for derivation output details set	2020-09-27 16:21:51 +01:00
Christopher Baines	84907fe040	Implement the JSON representation for system tests	2020-09-27 12:06:18 +01:00
Christopher Baines	5b13ee2251	Delete builds for unreferenced derivations	2020-09-27 11:11:02 +01:00
Christopher Baines	52a23a5333	Further data deletion improvements	2020-09-27 11:10:47 +01:00
Christopher Baines	65e8bf3f8d	Add delete-revisions-from-branch-except-most-recent-n	2020-09-26 19:38:56 +01:00
Christopher Baines	992a0af63e	Split off delete-revisions-from-branch from delete-data-for-branch To support not deleting all of the revisions.	2020-09-26 18:23:21 +01:00
Christopher Baines	fb180e1ebd	Replace debug-set! with setenv COLUMNS As that actually seems to work.	2020-09-26 16:42:18 +01:00
Christopher Baines	6bc1da014f	Better handle loading the (guix i18n) module in the inferior Previously it would only be loaded if the (guix lint) module exists.	2020-09-26 16:07:53 +01:00
Christopher Baines	faf46565ce	Fix some package search issues Previously, the name wasn't taken into account when filtering results, so a search like "git-annex" wouldn't find the git-annex package, since its synopsis or description doesn't include the name. Filtering on the name made the queries much slower, so to address that, the filtering by revision is moved to a separate part of the CTE, which means PostgreSQL filters down the rows by quite a lot before it begins filtering by name. Also, add in a variant of the query without dashes (-) because that helps with searches like ruby-engine.	2020-09-26 16:05:06 +01:00
Christopher Baines	53341c70fc	Change the locale codeset representation From the normalized one, to the one actually contained within glibc. Recent versions of glibc also contain symlinks linking the normalized codeset to the locales with the .UTF-8 ending, but older ones do not. Maybe handling codeset normalisation for queries would be good, but the locale values ending in .UTF-8 are more compatible and allow the code to be simplified. For querying, maybe there should be a locales table which handles different representations.	2020-09-26 11:45:57 +01:00
Christopher Baines	af2e12a9ef	Add some new metrics about load new revision jobs	2020-09-20 19:13:23 +01:00
Christopher Baines	fd3ba489d9	Add a metric for the number of revisions	2020-09-20 18:39:46 +01:00
Christopher Baines	857ac36711	Return a number from count-guix-revisions	2020-09-20 18:38:39 +01:00
Christopher Baines	e38db9eed9	Set the locale at the start of the process jobs script This might help with the odd [1] errors regarding PostgreSQL queries. 1: invalid byte sequence for encoding "UTF8":	2020-09-20 11:11:03 +01:00
Christopher Baines	a0e098a6ce	Increase the stack trace width when processing jobs As this might result in more useful error messages.	2020-09-20 10:59:22 +01:00
Christopher Baines	c596a1c6a9	Add a Prometheus metrics page, with some database metrics The database size is growing, but it's hard to know what parts are growing the fastest. These metrics will hopefully help with understanding that.	2020-09-06 13:14:31 +01:00
Christopher Baines	a0cd1097f9	Add guile-prometheus to guix-dev.scm	2020-09-06 13:13:30 +01:00

965 commits