restic

Commit Graph

Author	SHA1	Message	Date
Gilbert Gilb's	536ebefff4	feat(backends/s3): add warmup support before repacks and restores (#5173) * feat(backends/s3): add warmup support before repacks and restores This commit introduces basic support for transitioning pack files stored in cold storage to hot storage on S3 and S3-compatible providers. To prevent unexpected behavior for existing users, the feature is gated behind new flags: - `s3.enable-restore`: opt-in flag (defaults to false) - `s3.restore-days`: number of days for the restored objects to remain in hot storage (defaults to `7`) - `s3.restore-timeout`: maximum time to wait for a single restoration (defaults to `1 day`) - `s3.restore-tier`: retrieval tier at which the restore will be processed (defaults to `Standard`) As restoration times can be lengthy, this implementation preemptively restores selected packs to prevent incessant restore-delays during downloads. This is slightly sub-optimal as we could process packs out-of-order (as soon as they're transitioned), but this would really add too much complexity for a marginal gain in speed. To maintain simplicity and prevent resource exhaustion with lots of packs, no new concurrency mechanisms or goroutines were added. This just hooks gracefully into the existing routines. Limitations: - Tests against the backend were not written due to the lack of cold storage class support in MinIO. Testing was done manually on Scaleway's S3-compatible object storage. If necessary, we could explore testing with LocalStack or mocks, though this requires further discussion. - Currently, this feature only warms up before restores and repacks (prune/copy), as those are the two main use-cases I came across. Support for other commands may be added in future iterations, as long as affected packs can be calculated in advance. - The feature is gated behind a new alpha `s3-restore` feature flag to make it explicit that the feature is still wet behind the ears. - There is no explicit user notification for ongoing pack restorations. While I think it is not necessary because of the opt-in flag, showing some notice may improve usability (but would probably require major refactoring in the progress bar which I didn't want to start). Another possibility would be to add a flag to send restore requests and fail early. See https://github.com/restic/restic/issues/3202 * ui: warn user when files are warming up from cold storage * refactor: remove the PacksWarmer struct It's easier to handle multiple handles in the backend directly, and it may open the door to reducing the number of requests made to the backend in the future.	2025-02-01 18:26:27 +00:00
Michael Eischer	4df887406f	repository: inline MasterIndex interface into Repository interface	2024-05-24 21:33:17 +02:00
Michael Eischer	ffe5439149	Merge pull request #4605 from MichaelEischer/better-restorer-error-handling Rework repository.StreamPacks & better restorer error handling	2024-05-01 16:37:41 +02:00
Michael Eischer	31624aeffd	Improve command shutdown on context cancellation	2024-04-22 22:31:38 +02:00
Michael Eischer	621012dac0	repository: Add blob loading fallback to LoadBlobsFromPack Try to retrieve individual blobs via LoadBlob if streaming did not work.	2024-04-21 21:35:55 +02:00
Michael Eischer	2c310a526e	repository: Replace StreamPack function with LoadBlobsFromPack method LoadBlobsFromPack is now part of the repository struct. This ensures that users of that method don't have to deal with internals of the repository implementation. The filerestorer tests now also contain far fewer pack file implementation details.	2024-01-19 21:40:43 +01:00
Michael Eischer	5773b86d02	repository: Push all usage of errors.Fatal out of the package As the `Fatal` error type only includes a string, it becomes impossible to inspect the contained error. This is, for example, a problem for the fuse implementation, which must be able to detect context.Canceled errors. Co-authored-by: greatroar <61184462+greatroar@users.noreply.github.com>	2023-05-18 17:27:41 +02:00
Michael Eischer	c4fc5c97f9	prune: Use a single CountedBlobSet to track blobs The set covers necessary, existing and duplicate blobs. This removes the duplicate sets used to track whether all necessary blobs also exist. This reduces the memory usage of prune by about 20-30%.	2022-10-22 18:45:12 +02:00
Michael Eischer	7682149c9d	repository: cleanup copy connection count check	2022-08-28 11:40:56 +02:00
Michael Eischer	b03277ead5	repository: don't hang when copying using a single connection	2022-08-28 11:40:31 +02:00
Michael Eischer	623770eebb	repository: try to recover from invalid blob while repacking If a blob that should be kept is invalid, Repack will now try to request the blob using LoadBlob. Only return an error if that fails.	2022-07-30 17:37:07 +02:00
Michael Eischer	6f53ecc1ae	adapt workers based on whether an operation is CPU or IO-bound Use runtime.GOMAXPROCS(0) as worker count for CPU-bound tasks, repo.Connections() for IO-bound tasks and a combination if a task can be both. Streaming packs is treated as IO-bound as adding more workers cannot provide a speedup. Typical IO-bound tasks are downloading / uploading / deleting files. Decoding / Encoding / Verifying are usually CPU-bound. Several tasks are a combination of both, e.g. for combined download and decode functions. In the latter case add both limits together. As the backends have their own concurrency limits restic still won't download more than repo.Connections() files in parallel, but the additional workers can decode already downloaded data in parallel.	2022-07-03 12:19:26 +02:00
Michael Eischer	120ccc8754	repository: Rework blob saving to use an async pack uploader Previously, SaveAndEncrypt would assemble blobs into packs and either return immediately if the pack is not yet full or upload the pack file otherwise. The upload will block the current goroutine until it finishes. Now, the upload is done using separate goroutines. This requires changes to the error handling. As uploads are no longer tied to a SaveAndEncrypt call, failed uploads are signaled using an errgroup. To count the uploaded amount of data, the pack header overhead is no longer returned by `packer.Finalize` but rather by `packer.HeaderOverhead`. This helper method is necessary to continue returning the pack header overhead directly to the responsible call to `repository.SaveBlob`. Without the method this would not be possible, as packs are finalized asynchronously.	2022-07-02 22:42:34 +02:00
Alexander Neumann	99634c0936	Return real size from SaveBlob	2022-07-02 18:55:12 +02:00
Michael Eischer	e597b99b55	repository: Reduce repack workers to prevent deadlock As repack streams packs these occupy one backend connection. Uploading a new pack also requires a backend connection. To prevent a deadlock during repack when reaching the backend connections limit, simply limit the repackWorker count to always leave one connection for uploading.	2022-04-23 11:28:18 +02:00
Michael Eischer	537b4c310a	copy: Implement by reusing repack The repack operation copies all selected blobs from a set of pack files into new pack files. For prune the source and destination repositories are identical. To implement copy, just use a different source and destination repository.	2022-03-26 20:47:15 +01:00
Michael Eischer	f00f690658	repository: stream packs during repacking	2022-02-12 20:18:25 +01:00
Alexander Weiss	c3ddde9e7d	Return hdrSize in ListPack	2020-11-21 22:13:54 +01:00
greatroar	21b787a4d1	Stop Counters where they're constructed and started	2020-11-09 13:03:31 +01:00
greatroar	ddca699cd2	Replace restic.Progress with new progress.Counter This fixes two race conditions while cleaning up the code.	2020-11-09 12:12:35 +01:00
Alexander Weiss	aaf1c44362	Fix #3062	2020-11-05 17:05:42 +01:00
Alexander Neumann	ae5302c7a8	Add comment that keepBlobs is modified	2020-11-05 10:33:38 +01:00
Alexander Neumann	866a52ad4e	Remove unneeded seek The file returned from DownloadAndHash() is already seeked to the start of the file.	2020-11-05 10:31:49 +01:00
Michael Eischer	b373f164fe	prune: Parallelize repack command	2020-11-05 10:31:49 +01:00
Michael Eischer	367449dede	prune: Reduce memory allocations while repacking The slicing operator `slice[low:high]` defaults to 0 for the lower bound and len(slice) for the upper bound when either or both are not specified. Fix the code to use `cap(slice)` to check for the slice capacity.	2020-08-16 11:34:01 +02:00
Michael Eischer	7042bafea5	prune: Abort repacking when a pack contains a wrong blob If a blob in a pack file can be decrypted successfully but contains data that results in a different hash than stated in the pack header, then abort repacking. As both the pack header and the blob are cryptographically verified this either means that a malicious entity tampered with the backup or indicates hardware problems on the client. prune should fail with an error in both cases.	2020-08-16 11:34:01 +02:00
aawsome	0fed6a8dfc	Use "pack file" instead of "data file" (#2885) - changed variable names, especially changed DataFile into PackFile - changed in some comments - always use "pack file" in docu	2020-08-16 11:16:38 +02:00
Michael Eischer	05116e4787	prune: Cleanup progress bar handling while repacking	2020-08-03 19:32:46 +02:00
Alexander Weiss	91906911b0	Fix non-intuitive repository behavior - The SaveBlob method now checks for duplicates. - Moves handling of pending blobs to MasterIndex. -> also cleans up pending index entries when they are saved in the index -> when using SaveBlob no need to care about index any longer - Always check for full index and save it when storing packs. -> removes the need of an index uploader -> also removes the verbose "uploaded intermediate index" messages - The Flush method now also saves the index - Fix race condition when checking and saving full/non-finalized indexes	2020-06-11 13:05:23 +02:00
Alexander Neumann	bfa18ee8ec	DownloadAndHash: Check error returned by Load()	2018-10-28 21:28:56 +01:00
Igor Fedorenko	ab040d8811	Introduced repository.DownloadAndHash helper Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>	2018-02-16 21:13:11 -05:00
Igor Fedorenko	d58ae43317	Reworked Backend.Load API to retry errors during ongoing download Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>	2018-02-16 21:12:14 -05:00
Alexander Neumann	cccb2fc7e7	Merge pull request #1583 from restic/close-open-backend-files Close backend files in case of errors	2018-01-26 21:57:28 +01:00
Alexander Neumann	909d9273cc	Close backend files in case of errors	2018-01-25 21:05:57 +01:00
Alexander Neumann	663c57ab4d	debug: Remove manual Str() call Log()	2018-01-25 20:49:41 +01:00
George Armhold	d886cb5c27	replace ad-hoc context.TODO() with gopts.ctx, so that cancellation can properly trickle down from cmd_*. gh-1434	2017-12-03 07:22:14 -05:00
Alexander Neumann	931e6ed2ac	Use Seal/Open everywhere	2017-11-01 10:30:40 +01:00
Alexander Neumann	f26492fc2d	prune: Warn about wrong plaintext blob ID	2017-10-02 16:27:08 +02:00
Alexander Neumann	23c903074c	Move restic package to internal/restic	2017-07-24 17:43:32 +02:00
Alexander Neumann	6caeff2408	Run goimports	2017-07-23 14:21:03 +02:00
Alexander Neumann	83d1a46526	Moves files	2017-07-23 14:19:13 +02:00

41 Commits
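
The message of commit 367449dede points out that `slice[low:high]` defaults its upper bound to `len(slice)`, so a capacity check has to use `cap(slice)`. The following is a minimal sketch of that buffer-reuse pattern, not the code from the commit; `resizeBuf` is a hypothetical helper name introduced only for illustration.

```go
package main

import "fmt"

// resizeBuf is a hypothetical helper (not restic's actual code). It returns
// a slice of length size and reuses buf's backing array whenever possible.
func resizeBuf(buf []byte, size int) []byte {
	// Compare against cap(buf), not len(buf) or len(buf[:]): buf[:] has the
	// same length as buf, so it reveals nothing about the spare capacity.
	if cap(buf) < size {
		return make([]byte, size) // backing array too small, allocate
	}
	return buf[:size] // re-slicing up to cap(buf) is valid, no allocation
}

func main() {
	buf := make([]byte, 0, 8)

	buf = resizeBuf(buf, 4) // fits: reuses the 8-byte backing array
	fmt.Println(len(buf), cap(buf))

	buf = resizeBuf(buf, 16) // does not fit: a new array is allocated
	fmt.Println(len(buf), cap(buf))
}
```

Checking `len` instead of `cap` here would still produce correct results, but it would allocate a fresh buffer every time a larger blob follows a smaller one, even when the existing backing array is big enough.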