* feat(backends/s3): add warmup support before repacks and restores
This commit introduces basic support for transitioning pack files stored
in cold storage to hot storage on S3 and S3-compatible providers.
To prevent unexpected behavior for existing users, the feature is gated
behind new flags:
- `s3.enable-restore`: opt-in flag (defaults to false)
- `s3.restore-days`: number of days for the restored objects to remain
in hot storage (defaults to `7`)
- `s3.restore-timeout`: maximum time to wait for a single restoration
(default to `1 day`)
- `s3.restore-tier`: retrieval tier at which the restore will be
processed. (default to `Standard`)
As restoration times can be lengthy, this implementation preemptively
restores selected packs to prevent incessant restore-delays during
downloads. This is slightly sub-optimal as we could process packs
out-of-order (as soon as they're transitioned), but this would really
add too much complexity for a marginal gain in speed.
To maintain simplicity and prevent resources exhautions with lots of
packs, no new concurrency mechanisms or goroutines were added. This just
hooks gracefully into the existing routines.
**Limitations:**
- Tests against the backend were not written due to the lack of cold
storage class support in MinIO. Testing was done manually on
Scaleway's S3-compatible object storage. If necessary, we could
explore testing with LocalStack or mocks, though this requires further
discussion.
- Currently, this feature only warms up before restores and repacks
(prune/copy), as those are the two main use-cases I came across.
Support for other commands may be added in future iterations, as long
as affected packs can be calculated in advance.
- The feature is gated behind a new alpha `s3-restore` feature flag to
make it explicit that the feature is still wet behind the ears.
- There is no explicit user notification for ongoing pack restorations.
While I think it is not necessary because of the opt-in flag, showing
some notice may improve usability (but would probably require major
refactoring in the progress bar which I didn't want to start). Another
possibility would be to add a flag to send restores requests and fail
early.
See https://github.com/restic/restic/issues/3202
* ui: warn user when files are warming up from cold storage
* refactor: remove the PacksWarmer struct
It's easier to handle multiple handles in the backend directly, and it
may open the door to reducing the number of requests made to the backend
in the future.
This removes code that is only used within a backend implementation from
the backend package. The latter now only contains code that also has
external users.
In order to change the backend initialization in `global.go` to be able
to generically call cfg.ApplyEnvironment() for supported backends, the
`interface{}` returned by `ParseConfig` must contain a pointer to the
configuration.
An alternative would be to use reflection to convert the type from
`interface{}(Config)` to `interface{}(*Config)` (from value to pointer
type). However, this would just complicate the type mess further.
The SemaphoreBackend now uniformly enforces the limit of concurrent
backend operations. In addition, it unifies the parameter validation.
The List() methods no longer uses a semaphore. Restic already never runs
multiple list operations in parallel.
By managing the semaphore in a wrapper backend, the sections that hold a
semaphore token grow slightly. However, the main bottleneck is IO, so
this shouldn't make much of a difference.
The key insight that enables the SemaphoreBackend is that all of the
complex semaphore handling in `openReader()` still happens within the
original call to `Load()`. Thus, getting and releasing the semaphore
tokens can be refactored to happen directly in `Load()`. This eliminates
the need for wrapping the reader in `openReader()` to release the token.
The Test method was only used in exactly one place, namely when trying
to create a new repository it was used to check whether a config file
already exists.
Use a combination of Stat() and IsNotExist() instead.
The ioutil functions are deprecated since Go 1.17 and only wrap another
library function. Thus directly call the underlying function.
This commit only mechanically replaces the function calls.
... called backend/sema. I resisted the temptation to call the main
type sema.Phore. Also, semaphores are now passed by value to skip a
level of indirection when using them.
github.com/pkg/errors is no longer getting updates, because Go 1.13
went with the more flexible errors.{As,Is} function. Use those instead:
errors from pkg/errors already support the Unwrap interface used by 1.13
error handling. Also:
* check for io.EOF with a straight ==. That value should not be wrapped,
and the chunker (whose error is checked in the cases changed) does not
wrap it.
* Give custom Error methods pointer receivers, so there's no ambiguity
when type-switching since the value type will no longer implement error.
* Make restic.ErrAlreadyLocked private, and rename it to
alreadyLockedError to match the stdlib convention that error type
names end in Error.
* Same with rest.ErrIsNotExist => rest.notExistError.
* Make s3.Backend.IsAccessDenied a private function.
This enables the backends to request the calculation of a
backend-specific hash. For the currently supported backends this will
always be MD5. The hash calculation happens as early as possible, for
pack files this is during assembly of the pack file. That way the hash
would even capture corruptions of the temporary pack file on disk.
Bugs in the error handling while uploading a file to the backend could
cause incomplete files, e.g. https://github.com/golang/go/issues/42400
which could affect the local backend.
Proactively add sanity checks which will treat an upload as failed if
the reported upload size does not match the actual file size.
As mentioned in issue [#1560](https://github.com/restic/restic/pull/1560#issuecomment-364689346)
this changes the signature for `backend.Save()`. It now takes a
parameter of interface type `RewindReader`, so that the backend
implementations or our `RetryBackend` middleware can reset the reader to
the beginning and then retry an upload operation.
The `RewindReader` interface also provides a `Length()` method, which is
used in the backend to get the size of the data to be saved. This
removes several ugly hacks we had to do to pull the size back out of the
`io.Reader` passed to `Save()` before. In the `s3` and `rest` backend
this is actively used.