Commit Graph

548 Commits

Author SHA1 Message Date
Gilbert Gilb's 536ebefff4
feat(backends/s3): add warmup support before repacks and restores (#5173)
* feat(backends/s3): add warmup support before repacks and restores

This commit introduces basic support for transitioning pack files stored
in cold storage to hot storage on S3 and S3-compatible providers.

To prevent unexpected behavior for existing users, the feature is gated
behind new flags:

- `s3.enable-restore`: opt-in flag (defaults to false)
- `s3.restore-days`: number of days for the restored objects to remain
  in hot storage (defaults to `7`)
- `s3.restore-timeout`: maximum time to wait for a single restoration
  (default to `1 day`)
- `s3.restore-tier`: retrieval tier at which the restore will be
  processed. (default to `Standard`)

As restoration times can be lengthy, this implementation preemptively
restores selected packs to prevent incessant restore-delays during
downloads. This is slightly sub-optimal as we could process packs
out-of-order (as soon as they're transitioned), but this would really
add too much complexity for a marginal gain in speed.

To maintain simplicity and prevent resources exhautions with lots of
packs, no new concurrency mechanisms or goroutines were added. This just
hooks gracefully into the existing routines.

**Limitations:**

- Tests against the backend were not written due to the lack of cold
  storage class support in MinIO. Testing was done manually on
  Scaleway's S3-compatible object storage. If necessary, we could
  explore testing with LocalStack or mocks, though this requires further
  discussion.
- Currently, this feature only warms up before restores and repacks
  (prune/copy), as those are the two main use-cases I came across.
  Support for other commands may be added in future iterations, as long
  as affected packs can be calculated in advance.
- The feature is gated behind a new alpha `s3-restore` feature flag to
  make it explicit that the feature is still wet behind the ears.
- There is no explicit user notification for ongoing pack restorations.
  While I think it is not necessary because of the opt-in flag, showing
  some notice may improve usability (but would probably require major
  refactoring in the progress bar which I didn't want to start). Another
  possibility would be to add a flag to send restores requests and fail
  early.

See https://github.com/restic/restic/issues/3202

* ui: warn user when files are warming up from cold storage

* refactor: remove the PacksWarmer struct

It's easier to handle multiple handles in the backend directly, and it
may open the door to reducing the number of requests made to the backend
in the future.
2025-02-01 18:26:27 +00:00
Michael Eischer 14d02df8bb
Merge pull request #5162 from MichaelEischer/promote-feature-gates
Stabilize `explicit-s3-anonymous-auth` and `safe-forget-keep-tags` feature flags
2025-01-13 22:03:06 +01:00
Michael Eischer 098db935f7 Stabilize `explicit-s3-anonymous-auth` and `safe-forget-keep-tags` flags
The features can no longer be disabled.
2024-11-30 21:22:51 +01:00
Richard Grover dfbd4fb983 Error if AZURE_ACCOUNT_NAME not set 2024-11-13 08:02:22 -07:00
Michael Eischer 1133498ef8
Merge pull request #5046 from konidev20/fix-gh-4521-azure-blob-storage-add-support-for-access-tiers
azure: add support for access tiers hot, cool and cold
2024-11-11 22:01:52 +01:00
Michael Eischer 569a117a1d improve fprintf related error handling 2024-11-01 17:07:43 +01:00
Michael Eischer ded9fc7690
Merge pull request #5101 from MichaelEischer/sftp-load-error
sftp: check for broken connection in Load/List operation
2024-11-01 16:05:29 +01:00
Michael Eischer 71c185313e sftp: check for broken connection in Load/List operation 2024-11-01 15:33:27 +01:00
Michael Eischer 58dc4a6892 backend/retry: hide final log for `stat()` method
stat is only used to check the config file's existence. We don't want
log output in this case.
2024-11-01 15:17:54 +01:00
Srigovind Nayak db686592a1
debug: azure add debug log to show access-tier 2024-10-20 20:24:49 +05:30
Srigovind Nayak bff3341d10
azure: add support for hot, cool, or cool access tiers 2024-10-20 15:27:21 +05:30
Connor Findlay b434f560cc backend/azure: Add tests for both token types
Add two new test cases, TestBackendAzureAccountToken and
TestBackendAzureContainerToken, that ensure that the authorization using
both types of token works.

This introduces two new environment variables,
RESTIC_TEST_AZURE_ACCOUNT_SAS and RESTIC_TEST_AZURE_CONTAINER_SAS, that
contain the tokens to use when testing restic. If an environment
variable is missing, the related test is skipped.
2024-10-17 20:38:03 +02:00
Connor Findlay 2e704c69ac backend/azure: Handle Container SAS/SAT
Ignore AuthorizationFailure caused by using a container level SAS/SAT
token when calling GetProperties during the Create() call. This is because the
GetProperties call expects an Account Level token, and the container
level token simply lacks the appropriate permissions. Supressing the
Authorization Failure is OK, because if the token is actually invalid,
this is caught elsewhere when we try to actually use the token to do
work.
2024-10-17 20:38:03 +02:00
Damien Clark 4795143d6d cache: fix race condition in cache cleanup
Fix multiple restic processes executing concurrently and racing to remove obsolete snapshots.

Co-authored-by: Michael Eischer <michael.eischer@fau.de>
2024-09-14 18:07:46 +02:00
Michael Eischer 24f4e780f1 backend: consistently use os package for filesystem access
The go std library should be good enough to manage the files in the
backend and cache folders.
2024-08-31 18:20:40 +02:00
Michael Eischer 97f696b937 backend: remove dead code 2024-08-31 17:25:24 +02:00
Michael Eischer af989aab4e backend/layout: unexport fields and simplify rest layout 2024-08-31 17:25:24 +02:00
Michael Eischer 6024597028 drop support for s3legacy layout 2024-08-31 17:25:24 +02:00
Michael Eischer ddf35a60ad
Merge pull request #5026 from MichaelEischer/fix-handling-invalid-filenames
cache: Fix handling of invalid filenames
2024-08-31 16:42:13 +02:00
Michael Eischer 0aadfe32bb
Merge pull request #5018 from MichaelEischer/rest-retry-http2-goaway
rest: improve handling of HTTP2 goaway
2024-08-29 16:58:04 +02:00
Michael Eischer 7bbf75237d
Merge pull request #5014 from MichaelEischer/configurable-slow-request-timeout
Make timeout for slow requests configurable
2024-08-29 16:52:24 +02:00
Michael Eischer 8eff4e0e5c cache: correctly ignore files whose filename is no ID
this can for example be the case for temporary files created by the
backend implementation.
2024-08-29 16:32:15 +02:00
Michael Eischer e24dd5a162 backend/retry: don't trip circuit breaker if context is canceled
When the context used for a load operation is canceled, then the result
is always an error independent of whether the file could be retrieved
from the backend. Do not false positively trip the circuit breaker in
this case.

The old behavior was problematic when trying to lock a repository. When
`Lock.checkForOtherLocks` listed multiple lock files in parallel and one
of them fails to load, then all other loads were canceled. This
cancelation was remembered by the circuit breaker, such that locking
retries would fail.
2024-08-26 16:22:21 +02:00
Michael Eischer 36c4475ad9 rest: improve handling of HTTP2 goaway
The HTTP client can only retry HTTP2 requests after receiving a GOAWAY
response if it can rewind the body. As we use a custom data type,
explicitly provide an implementation of `GetBody`.
2024-08-26 15:44:17 +02:00
Michael Eischer b8f409723d make timeout for slow requests configurable 2024-08-26 14:14:43 +02:00
Michael Eischer 5cca6e66be
Merge pull request #4981 from konidev20/fix-gh-4934-cleanup-removed-snaphots-from-cache
cache: clear snapshot files from cache during load index
2024-08-16 19:04:59 +00:00
Michael Eischer 74d3f92cc7
Merge pull request #4993 from MichaelEischer/fix-timeout-error
backend: return correct error on upload/request timeout
2024-08-15 22:07:37 +02:00
Srigovind Nayak 5fd984ba6f
cache: add test for the automated cache clear to cache backend 2024-08-11 23:41:07 +05:30
Srigovind Nayak a23e7bfb82
cache: check for context cancellation before clearing cache 2024-08-11 23:41:07 +05:30
Srigovind Nayak f66624f5bf
cache: backend add List method and a cache clear functionality
* removes files which are no longer in the repository, including index files, snapshot files and pack files from the cache.

cache: fix ids set initialisation with NewIDSet()
2024-08-11 23:40:52 +05:30
Michael Eischer 3f5e2160de
Merge pull request #4938 from MichaelEischer/bump-go-version
Bump go version to 1.21
2024-08-10 19:57:59 +02:00
Michael Eischer 400ae55940 replace deprecated usages of math/rand 2024-08-10 19:34:49 +02:00
Michael Eischer 0b19f6cf5a Switch back to sha256 from the std library
The std library now also supports the sha assembly instructions on
ARM64. Thus, sha256-simd no longer provides a performance benefit.
2024-08-10 19:16:10 +02:00
Michael Eischer ad48751adb bump required go version to 1.21 2024-08-10 19:16:10 +02:00
Michael Eischer fa35e72214 backend: tweak timeouts to make watchdog timeout test less flaky 2024-08-10 19:08:03 +02:00
Michael Eischer 853a686994 backend: return correct error on upload/request timeout 2024-08-10 18:06:24 +02:00
Michael Eischer 4266dca1b6 cache: fix confusing debug log 2024-08-03 18:51:38 +02:00
Michael Eischer ae1cb889dd Add more checks for canceled contexts 2024-07-31 19:30:47 +02:00
Michael Eischer 94fdca08c4 return exit code 10 if repository does not exist 2024-07-10 21:46:26 +02:00
Michael Eischer f74e70cc36 s3: forbid anonymous authentication unless explicitly requested 2024-07-10 20:10:27 +02:00
Michael Eischer 4b364940aa s3: use http client with configured timeouts for s3 IAM communication
The default client has no timeouts configured opening network
connections. Thus, if 169.254.169.254 is inaccessible, then the client
would wait for until the operating system gives up, which will take
several minutes.
2024-07-07 11:32:40 +02:00
Michael Eischer a2a2401a68 s3: prevent repeated credential queries with anonymous authentication 2024-07-07 11:31:04 +02:00
Viktor Szépe ac00229386 Fix typos 2024-07-03 20:02:06 +02:00
Michael Eischer 3f878aa8e7
Merge pull request #4845 from greatroar/errors
Fix error handling bug + clean up error messages
2024-06-07 17:07:07 +00:00
Michael Eischer 0a70bbcea5
Merge pull request #4844 from MichaelEischer/improve-timeout-error
backend: Improve timeout error message
2024-06-07 19:05:39 +02:00
Leo R. Lundgren b2bbbe805f azure: Improve error message in azure.Create() 2024-06-03 23:37:17 +02:00
Michael Eischer 660679c2f6
Merge pull request #4835 from MichaelEischer/fix-list-cancel
backend/retry: do not log final error if context was canceled
2024-06-01 19:17:30 +02:00
Michael Eischer db2398f35b backend: increase request progress timeout to 5 minutes
Apparently, 2 minutes are too short in some cases and can result in
canceled List requests.
2024-06-01 19:01:51 +02:00
Michael Eischer 6bf3d4859f backend: improve error on http request timeout
Now yields a "request timeout" error instead of "context canceled".
2024-06-01 18:52:39 +02:00
greatroar 4a874000b7 gs: Replace some errors.Wrap calls
The first one in Create is already a WithStack error. The rest were
referencing code that hasn't existed for quite some time. Note that
errors from Google SDKs tends to start with "google:" or "googleapi:".

Also, use restic/internal/errors.
2024-06-01 15:11:06 +02:00