* feat(backends/s3): add warmup support before repacks and restores
This commit introduces basic support for transitioning pack files stored
in cold storage to hot storage on S3 and S3-compatible providers.
To prevent unexpected behavior for existing users, the feature is gated
behind new flags:
- `s3.enable-restore`: opt-in flag (defaults to false)
- `s3.restore-days`: number of days for the restored objects to remain
in hot storage (defaults to `7`)
- `s3.restore-timeout`: maximum time to wait for a single restoration
(default to `1 day`)
- `s3.restore-tier`: retrieval tier at which the restore will be
processed. (default to `Standard`)
As restoration times can be lengthy, this implementation preemptively
restores selected packs to prevent incessant restore-delays during
downloads. This is slightly sub-optimal as we could process packs
out-of-order (as soon as they're transitioned), but this would really
add too much complexity for a marginal gain in speed.
To maintain simplicity and prevent resources exhautions with lots of
packs, no new concurrency mechanisms or goroutines were added. This just
hooks gracefully into the existing routines.
**Limitations:**
- Tests against the backend were not written due to the lack of cold
storage class support in MinIO. Testing was done manually on
Scaleway's S3-compatible object storage. If necessary, we could
explore testing with LocalStack or mocks, though this requires further
discussion.
- Currently, this feature only warms up before restores and repacks
(prune/copy), as those are the two main use-cases I came across.
Support for other commands may be added in future iterations, as long
as affected packs can be calculated in advance.
- The feature is gated behind a new alpha `s3-restore` feature flag to
make it explicit that the feature is still wet behind the ears.
- There is no explicit user notification for ongoing pack restorations.
While I think it is not necessary because of the opt-in flag, showing
some notice may improve usability (but would probably require major
refactoring in the progress bar which I didn't want to start). Another
possibility would be to add a flag to send restores requests and fail
early.
See https://github.com/restic/restic/issues/3202
* ui: warn user when files are warming up from cold storage
* refactor: remove the PacksWarmer struct
It's easier to handle multiple handles in the backend directly, and it
may open the door to reducing the number of requests made to the backend
in the future.
Add two new test cases, TestBackendAzureAccountToken and
TestBackendAzureContainerToken, that ensure that the authorization using
both types of token works.
This introduces two new environment variables,
RESTIC_TEST_AZURE_ACCOUNT_SAS and RESTIC_TEST_AZURE_CONTAINER_SAS, that
contain the tokens to use when testing restic. If an environment
variable is missing, the related test is skipped.
Ignore AuthorizationFailure caused by using a container level SAS/SAT
token when calling GetProperties during the Create() call. This is because the
GetProperties call expects an Account Level token, and the container
level token simply lacks the appropriate permissions. Supressing the
Authorization Failure is OK, because if the token is actually invalid,
this is caught elsewhere when we try to actually use the token to do
work.
When the context used for a load operation is canceled, then the result
is always an error independent of whether the file could be retrieved
from the backend. Do not false positively trip the circuit breaker in
this case.
The old behavior was problematic when trying to lock a repository. When
`Lock.checkForOtherLocks` listed multiple lock files in parallel and one
of them fails to load, then all other loads were canceled. This
cancelation was remembered by the circuit breaker, such that locking
retries would fail.
The HTTP client can only retry HTTP2 requests after receiving a GOAWAY
response if it can rewind the body. As we use a custom data type,
explicitly provide an implementation of `GetBody`.
* removes files which are no longer in the repository, including index files, snapshot files and pack files from the cache.
cache: fix ids set initialisation with NewIDSet()
The default client has no timeouts configured opening network
connections. Thus, if 169.254.169.254 is inaccessible, then the client
would wait for until the operating system gives up, which will take
several minutes.
The first one in Create is already a WithStack error. The rest were
referencing code that hasn't existed for quite some time. Note that
errors from Google SDKs tends to start with "google:" or "googleapi:".
Also, use restic/internal/errors.