Commit Graph

51 Commits

Author SHA1 Message Date
Gilbert Gilb's 536ebefff4
feat(backends/s3): add warmup support before repacks and restores (#5173)
* feat(backends/s3): add warmup support before repacks and restores

This commit introduces basic support for transitioning pack files stored
in cold storage to hot storage on S3 and S3-compatible providers.

To prevent unexpected behavior for existing users, the feature is gated
behind new flags:

- `s3.enable-restore`: opt-in flag (defaults to false)
- `s3.restore-days`: number of days for the restored objects to remain
  in hot storage (defaults to `7`)
- `s3.restore-timeout`: maximum time to wait for a single restoration
  (default to `1 day`)
- `s3.restore-tier`: retrieval tier at which the restore will be
  processed. (default to `Standard`)

As restoration times can be lengthy, this implementation preemptively
restores selected packs to prevent incessant restore-delays during
downloads. This is slightly sub-optimal as we could process packs
out-of-order (as soon as they're transitioned), but this would really
add too much complexity for a marginal gain in speed.

To maintain simplicity and prevent resources exhautions with lots of
packs, no new concurrency mechanisms or goroutines were added. This just
hooks gracefully into the existing routines.

**Limitations:**

- Tests against the backend were not written due to the lack of cold
  storage class support in MinIO. Testing was done manually on
  Scaleway's S3-compatible object storage. If necessary, we could
  explore testing with LocalStack or mocks, though this requires further
  discussion.
- Currently, this feature only warms up before restores and repacks
  (prune/copy), as those are the two main use-cases I came across.
  Support for other commands may be added in future iterations, as long
  as affected packs can be calculated in advance.
- The feature is gated behind a new alpha `s3-restore` feature flag to
  make it explicit that the feature is still wet behind the ears.
- There is no explicit user notification for ongoing pack restorations.
  While I think it is not necessary because of the opt-in flag, showing
  some notice may improve usability (but would probably require major
  refactoring in the progress bar which I didn't want to start). Another
  possibility would be to add a flag to send restores requests and fail
  early.

See https://github.com/restic/restic/issues/3202

* ui: warn user when files are warming up from cold storage

* refactor: remove the PacksWarmer struct

It's easier to handle multiple handles in the backend directly, and it
may open the door to reducing the number of requests made to the backend
in the future.
2025-02-01 18:26:27 +00:00
Michael Eischer 99e105eeb6 repository: restrict SaveUnpacked and RemoveUnpacked
Those methods now only allow modifying snapshots. Internal data types
used by the repository are now read-only. The repository-internal code
can bypass the restrictions by wrapping the repository in an
`internalRepository` type.

The restriction itself is implemented by using a new datatype
WriteableFileType in the SaveUnpacked and RemoveUnpacked methods. This
statically ensures that code cannot bypass the access restrictions.

The test changes are somewhat noisy as some of them modify repository
internals and therefore require some way to bypass the access
restrictions. This works by capturing an `internalRepository` or
`Backend` when creating the Repository using a test helper function.
2025-01-13 22:39:57 +01:00
Michael Eischer 400ae55940 replace deprecated usages of math/rand 2024-08-10 19:34:49 +02:00
Michael Eischer 550d1eeac3 repository: remove SaveIndex from interface
The method is now only indirectly accessible via Prune or RepairIndex.
2024-05-24 21:33:17 +02:00
Michael Eischer 864995271e repository: unwrap BlobHandle parameters of LookupBlob
The method now uses the same parameters as LookupBlobSize.
2024-05-24 21:33:17 +02:00
Michael Eischer 1266a4932f repository: fix parameter order of LookupBlobSize
All methods should use blobType followed by ID.
2024-05-24 21:33:17 +02:00
Michael Eischer 0bb0720348 test cleanups 2024-05-24 21:33:17 +02:00
Michael Eischer 4df887406f repository: inline MasterIndex interface into Repository interface 2024-05-24 21:33:17 +02:00
Michael Eischer 8a425c2f0a remove usages of repo.Backend() from tests 2024-05-18 21:42:51 +02:00
Michael Eischer ab9077bc13 replace usages of backend.Remove() with repository.RemoveUnpacked()
RemoveUnpacked will eventually block removal of all filetypes other than
snapshots. However, getting there requires a major refactor to provide
some components with privileged access.
2024-05-18 21:38:31 +02:00
Michael Eischer 8d507c1372 repository: add basic test for RepairIndex 2024-04-14 13:45:15 +02:00
Michael Eischer b25fc2c89d repository: remove redundant flushes from tests 2024-04-14 13:45:10 +02:00
Michael Eischer c65459cd8a repository: speed up tests 2024-04-14 13:45:10 +02:00
Michael Eischer 86b38a0b17 rename `--no-verify-pack` to `--no-extra-verify` 2024-02-04 17:01:05 +01:00
Michael Eischer 2dbb18128c repository: Allow skipping verification for tests
Some tests have to explicitly create pack files with blobs that don't
match their ID. For those blobs the builtin verification of the
repository must be disabled.
2024-02-03 18:22:47 +01:00
Michael Eischer bfb56b78e1 replace some usages of restic.Repository with more specific interface
This should eventually make it easier to test the code.
2024-01-27 13:02:02 +01:00
Michael Eischer 42c9318b9c repair pack: add tests 2024-01-27 12:51:45 +01:00
Michael Eischer cb50832d50 index: let MasterIndex.Save also delete obsolete indexes 2024-01-27 12:51:08 +01:00
Andrea Gelmini 241916d55b
Fix typos 2023-12-06 13:11:55 +01:00
Michael Eischer 1b8a67fe76 move Backend interface to backend package 2023-10-25 23:00:18 +02:00
arjunajesh ed65a7dbca implement progress bar for index loading 2023-10-01 19:52:59 +02:00
greatroar c0b5ec55ab repository: Remove empty cleanup functions in tests
TestRepository and its variants always returned no-op cleanup functions.
If they ever do need to do cleanup, using testing.T.Cleanup is easier
than passing these functions around.
2022-12-11 11:06:25 +01:00
Michael Eischer 2e3f1c08c5 repository: split index into a separate package 2022-10-08 21:15:34 +02:00
Michael Eischer b03277ead5 repository: don't hang when copying using a single connection 2022-08-28 11:40:31 +02:00
Michael Eischer 73053674d9 repository: Test fallback to existing blobs 2022-07-30 17:37:07 +02:00
Michael Eischer 120ccc8754 repository: Rework blob saving to use an async pack uploader
Previously, SaveAndEncrypt would assemble blobs into packs and either
return immediately if the pack is not yet full or upload the pack file
otherwise. The upload will block the current goroutine until it
finishes.

Now, the upload is done using separate goroutines. This requires changes
to the error handling. As uploads are no longer tied to a SaveAndEncrypt
call, failed uploads are signaled using an errgroup.

To count the uploaded amount of data, the pack header overhead is no
longer returned by `packer.Finalize` but rather by
`packer.HeaderOverhead`. This helper method is necessary to continue
returning the pack header overhead directly to the responsible call to
`repository.SaveBlob`. Without the method this would not be possible,
as packs are finalized asynchronously.
2022-07-02 22:42:34 +02:00
Alexander Neumann 99634c0936 Return real size from SaveBlob 2022-07-02 18:55:12 +02:00
Michael Eischer ed8aa15376 repository: add Save method to MasterIndex interface 2022-07-02 18:38:56 +02:00
Michael Eischer a77d5c4d11 repository: index saving belongs into the MasterIndex 2022-07-02 18:38:56 +02:00
Michael Eischer 9ffb8920f1 repository: run blackbox tests using old and new repo version 2022-04-30 11:34:10 +02:00
Michael Eischer 537b4c310a copy: Implement by reusing repack
The repack operation copies all selected blobs from a set of pack files
into new pack files. For prune the source and destination repositories
are identical. To implement copy, just use a different source and
destination repository.
2022-03-26 20:47:15 +01:00
Alexander Neumann 75f53955ee errcheck: Add error checks
Most added checks are straight forward.
2021-01-30 20:02:37 +01:00
Alexander Weiss aa7a5f19c2 Use BlobHandle in index methods 2020-11-22 20:41:12 +01:00
Alexander Weiss 1ec628ddf5 Add extraObsolete to MasterIndex.Save 2020-11-15 07:05:09 +01:00
Alexander Weiss 5898cb341f Use CreateIndexFromPacks() in test 2020-11-15 07:05:05 +01:00
Alexander Neumann 7def2d8ea7 Use a non-constant seed 2020-11-05 10:31:49 +01:00
Alexander Neumann ee0112ab3b Clarify message about expected error 2020-11-05 10:31:49 +01:00
Michael Eischer f4b9544ab2 prune: Add test that repack aborts on wrong blob 2020-08-16 11:34:01 +02:00
aawsome 0fed6a8dfc
Use "pack file" instead of "data file" (#2885)
- changed variable names, especially changed DataFile into PackFile
- changed in some comments
- always use "pack file" in docu
2020-08-16 11:16:38 +02:00
Alexander Weiss 9d1fb94c6c make Lookup() return all blobs
+ simplify syntax
2020-07-25 21:18:34 +02:00
Alexander Weiss 91906911b0 Fix non-intuitive repository behavior
- The SaveBlob method now checks for duplicates.
- Moves handling of pending blobs to MasterIndex.
  -> also cleans up pending index entries when they are saved in the index
  -> when using SaveBlob no need to care about index any longer
- Always check for full index and save it when storing packs.
  -> removes the need of an index uploader
  -> also removes the verbose "uploaded intermediate index" messages
- The Flush method now also saves the index
- Fix race condition when checking and saving full/non-finalized indexes
2020-06-11 13:05:23 +02:00
greatroar 4de12bf593 Remove restic.RandReader
math/rand.Rand has implemented Reader since Go 1.6. The repacking tests
are not deterministic, but they weren't before, either.
2020-03-09 10:00:28 +01:00
Alexander Neumann 9c55e8d69c Merge pull request #1549 from MJDSys/more_index_lookup_avoids
More optimizations to avoid calling Index.Lookup()
2018-01-24 20:53:30 +01:00
Igor Fedorenko 0084e42cb6 Optimize Repository.ListPack()
Use pack file size returned by Backend.List() to avoid extra per-pack
Backend.Stat() requests

Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
2018-01-23 22:39:51 -05:00
Matthew Dawson df2c03a6a4
repository/master_index: Optimize Index.Lookup()
When looking up a blob in the master index, with several
indexes present in the master index, a significant amount of time
is spent generating errors for each failed lookup.  However, these
errors are often used to check if a blob is present, but the contents
are not inspected making the overhead of the error not useful.

Instead, change Index.Lookup (and Index.LookupSize) to instead return
a boolean denoting if the blob was found instead of an error.  Also change
all the calls to these functions to handle the new function signature.

benchmark                                            old ns/op     new ns/op     delta
BenchmarkMasterIndexLookupSingleIndex-6              820           897           +9.39%
BenchmarkMasterIndexLookupMultipleIndex-6            12821         2001          -84.39%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       5378          492           -90.85%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     17026         1649          -90.31%

benchmark                                            old allocs     new allocs     delta
BenchmarkMasterIndexLookupSingleIndex-6              9              9              +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6            59             19             -67.80%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       22             6              -72.73%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     72             16             -77.78%

benchmark                                            old bytes     new bytes     delta
BenchmarkMasterIndexLookupSingleIndex-6              160           160           +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6            3200          240           -92.50%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       1232          48            -96.10%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     4272          128           -97.00%
2018-01-23 22:25:56 -05:00
Alexander Neumann 59782e347c repository: Fix repack test
This reduces the chance of duplicate blobs, otherwise the tests fail
(make the contents of a blob depend on a pseudo-random number instead of
the size, sizes may be duplicate).
2018-01-23 23:14:05 +01:00
Alexander Neumann b0c6e53241 Fix calls to repo/backend.List() everywhere 2018-01-21 21:15:09 +01:00
George Armhold d886cb5c27 replace ad-hoc context.TODO() with gopts.ctx, so that cancellation
can properly trickle down from cmd_*.

gh-1434
2017-12-03 07:22:14 -05:00
Alexander Neumann 23c903074c Move restic package to internal/restic 2017-07-24 17:43:32 +02:00
Alexander Neumann 6caeff2408 Run goimports 2017-07-23 14:21:03 +02:00