package restorer

import (
	"bytes"
	"context"
	"fmt"
	"os"
	"sort"
	"testing"

	"github.com/restic/restic/internal/errors"
"github.com/restic/restic/internal/feature"
|
2018-04-08 14:02:30 +02:00
|
|
|
"github.com/restic/restic/internal/restic"
|
|
|
|
rtest "github.com/restic/restic/internal/test"
|
|
|
|
)
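// TestBlob and TestFile describe the file contents used to build the test
// repository; blobs that name the same pack end up in the same pack file.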
type TestBlob struct {
	data string
	pack string
}

type TestFile struct {
	name  string
	blobs []TestBlob
}

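// TestWarmupJob records a warmup request issued through StartWarmup so that
// tests can check that packs were warmed up and that Wait was called.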
type TestWarmupJob struct {
	handlesCount int
	waitCalled   bool
}

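// TestRepo is an in-memory stand-in for a repository: it indexes blobs by ID,
// serves pack contents through loader, and records the warmup jobs it starts.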
type TestRepo struct {
	packsIDToData map[restic.ID][]byte

	// blobs and files
	blobs              map[restic.ID][]restic.PackedBlob
	files              []*fileInfo
	filesPathToContent map[string]string

	warmupJobs []*TestWarmupJob

	// loader serves blob data from the in-memory packs
	loader blobsLoaderFn
}

func (i *TestRepo) Lookup(tpe restic.BlobType, id restic.ID) []restic.PackedBlob {
	packs := i.blobs[id]
	return packs
}

func (i *TestRepo) fileContent(file *fileInfo) string {
	return i.filesPathToContent[file.location]
}

func (i *TestRepo) StartWarmup(ctx context.Context, packs restic.IDSet) (restic.WarmupJob, error) {
	job := TestWarmupJob{handlesCount: len(packs)}
	i.warmupJobs = append(i.warmupJobs, &job)
	return &job, nil
}

func (job *TestWarmupJob) HandleCount() int {
	return job.handlesCount
}

func (job *TestWarmupJob) Wait(_ context.Context) error {
	job.waitCalled = true
	return nil
}

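// newTestRepo builds a TestRepo from the given file descriptions: blobs are
// appended to their named packs, indexed by blob ID, and a loader is installed
// that serves the packed data back to the restorer.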
func newTestRepo(content []TestFile) *TestRepo {
	type Pack struct {
		name  string
		data  []byte
		blobs map[restic.ID]restic.Blob
	}
	packs := make(map[string]Pack)

	filesPathToContent := make(map[string]string)

	for _, file := range content {
		var content string
		for _, blob := range file.blobs {
			content += blob.data

			// get the pack, create as necessary
			var pack Pack
			var found bool
			if pack, found = packs[blob.pack]; !found {
				pack = Pack{name: blob.pack, blobs: make(map[restic.ID]restic.Blob)}
			}

			// calculate blob id and add to the pack as necessary
			blobID := restic.Hash([]byte(blob.data))
			if _, found := pack.blobs[blobID]; !found {
				blobData := []byte(blob.data)
				pack.blobs[blobID] = restic.Blob{
					BlobHandle: restic.BlobHandle{
						Type: restic.DataBlob,
						ID:   blobID,
					},
					Length:             uint(len(blobData)),
					UncompressedLength: uint(len(blobData)),
					Offset:             uint(len(pack.data)),
				}
				pack.data = append(pack.data, blobData...)
			}

			packs[blob.pack] = pack
		}
		filesPathToContent[file.name] = content
	}

	blobs := make(map[restic.ID][]restic.PackedBlob)
	packsIDToData := make(map[restic.ID][]byte)

	for _, pack := range packs {
		packID := restic.Hash(pack.data)
		packsIDToData[packID] = pack.data
		for blobID, blob := range pack.blobs {
			blobs[blobID] = append(blobs[blobID], restic.PackedBlob{Blob: blob, PackID: packID})
		}
	}

	var files []*fileInfo
	for _, file := range content {
		content := restic.IDs{}
		for _, blob := range file.blobs {
			content = append(content, restic.Hash([]byte(blob.data)))
		}
		files = append(files, &fileInfo{location: file.name, blobs: content})
	}

	repo := &TestRepo{
		packsIDToData:      packsIDToData,
		blobs:              blobs,
		files:              files,
		filesPathToContent: filesPathToContent,
		warmupJobs:         []*TestWarmupJob{},
	}
	repo.loader = func(ctx context.Context, packID restic.ID, blobs []restic.Blob, handleBlobFn func(blob restic.BlobHandle, buf []byte, err error) error) error {
		blobs = append([]restic.Blob{}, blobs...)
		sort.Slice(blobs, func(i, j int) bool {
			return blobs[i].Offset < blobs[j].Offset
		})

		for _, blob := range blobs {
			found := false
			for _, e := range repo.blobs[blob.ID] {
				if packID == e.PackID {
					found = true
					buf := repo.packsIDToData[packID][e.Offset : e.Offset+e.Length]
					err := handleBlobFn(e.BlobHandle, buf, nil)
					if err != nil {
						return err
					}
				}
			}
			if !found {
				return fmt.Errorf("missing blob: %v", blob)
			}
		}
		return nil
	}

	return repo
}

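// restoreAndVerify restores the given files into tempdir with a fileRestorer
// backed by the test repository, then checks the restored file contents and
// the recorded warmup jobs.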
func restoreAndVerify(t *testing.T, tempdir string, content []TestFile, files map[string]bool, sparse bool) {
	defer feature.TestSetFlag(t, feature.Flag, feature.S3Restore, true)()

	t.Helper()
	repo := newTestRepo(content)

	r := newFileRestorer(tempdir, repo.loader, repo.Lookup, 2, sparse, false, repo.StartWarmup, nil)

	if files == nil {
		r.files = repo.files
	} else {
		for _, file := range repo.files {
			if files[file.location] {
				r.files = append(r.files, file)
			}
		}
	}

	err := r.restoreFiles(context.TODO())
	rtest.OK(t, err)

	verifyRestore(t, r, repo)
}

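// verifyRestore compares each restored file on disk with its expected content
// and asserts that warmup jobs were started and waited for.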
func verifyRestore(t *testing.T, r *fileRestorer, repo *TestRepo) {
	t.Helper()
	for _, file := range r.files {
		target := r.targetPath(file.location)
		data, err := os.ReadFile(target)
		if err != nil {
			t.Errorf("unable to read file %v: %v", file.location, err)
			continue
		}

		content := repo.fileContent(file)
		if !bytes.Equal(data, []byte(content)) {
			t.Errorf("file %v has wrong content: want %q, got %q", file.location, content, data)
		}
	}

	if len(repo.warmupJobs) == 0 {
		t.Errorf("warmup did not occur")
	}
	for i, warmupJob := range repo.warmupJobs {
		if !warmupJob.waitCalled {
			t.Errorf("warmup job %d was not waited for", i)
		}
	}
}

func TestFileRestorerBasic(t *testing.T) {
	tempdir := rtest.TempDir(t)

	for _, sparse := range []bool{false, true} {
		restoreAndVerify(t, tempdir, []TestFile{
			{
				name: "file1",
				blobs: []TestBlob{
					{"data1-1", "pack1-1"},
					{"data1-2", "pack1-2"},
				},
			},
			{
				name: "file2",
				blobs: []TestBlob{
					{"data2-1", "pack2-1"},
					{"data2-2", "pack2-2"},
				},
			},
			{
				name: "file3",
				blobs: []TestBlob{
					// same blob multiple times
					{"data3-1", "pack3-1"},
					{"data3-1", "pack3-1"},
				},
			},
			{
				name:  "empty",
				blobs: []TestBlob{},
			},
		}, nil, sparse)
	}
}

func TestFileRestorerPackSkip(t *testing.T) {
	tempdir := rtest.TempDir(t)

	files := make(map[string]bool)
	files["file2"] = true

	for _, sparse := range []bool{false, true} {
		restoreAndVerify(t, tempdir, []TestFile{
			{
				name: "file1",
				blobs: []TestBlob{
					{"data1-1", "pack1"},
					{"data1-2", "pack1"},
					{"data1-3", "pack1"},
					{"data1-4", "pack1"},
					{"data1-5", "pack1"},
					{"data1-6", "pack1"},
				},
			},
			{
				name: "file2",
				blobs: []TestBlob{
					// file is contained in pack1 but needs pack parts to be skipped
					{"data1-2", "pack1"},
					{"data1-4", "pack1"},
					{"data1-6", "pack1"},
				},
			},
		}, files, sparse)
	}
}

func TestFileRestorerFrequentBlob(t *testing.T) {
	tempdir := rtest.TempDir(t)

	for _, sparse := range []bool{false, true} {
		blobs := []TestBlob{
			{"data1-1", "pack1-1"},
		}
		for i := 0; i < 10000; i++ {
			blobs = append(blobs, TestBlob{"a", "pack1-1"})
		}
		blobs = append(blobs, TestBlob{"end", "pack1-1"})

		restoreAndVerify(t, tempdir, []TestFile{
			{
				name:  "file1",
				blobs: blobs,
			},
		}, nil, sparse)
	}
}

func TestErrorRestoreFiles(t *testing.T) {
	tempdir := rtest.TempDir(t)
	content := []TestFile{
		{
			name: "file1",
			blobs: []TestBlob{
				{"data1-1", "pack1-1"},
			},
		}}

	repo := newTestRepo(content)

	loadError := errors.New("load error")
	// loader always returns an error
	repo.loader = func(ctx context.Context, packID restic.ID, blobs []restic.Blob, handleBlobFn func(blob restic.BlobHandle, buf []byte, err error) error) error {
		return loadError
	}

	r := newFileRestorer(tempdir, repo.loader, repo.Lookup, 2, false, false, repo.StartWarmup, nil)
	r.files = repo.files

	err := r.restoreFiles(context.TODO())
	rtest.Assert(t, errors.Is(err, loadError), "got %v, expected contained error %v", err, loadError)
}

func TestFatalDownloadError(t *testing.T) {
	tempdir := rtest.TempDir(t)
	content := []TestFile{
		{
			name: "file1",
			blobs: []TestBlob{
				{"data1-1", "pack1"},
				{"data1-2", "pack1"},
			},
		},
		{
			name: "file2",
			blobs: []TestBlob{
				{"data2-1", "pack1"},
				{"data2-2", "pack1"},
				{"data2-3", "pack1"},
			},
		}}

	repo := newTestRepo(content)

	loader := repo.loader
	repo.loader = func(ctx context.Context, packID restic.ID, blobs []restic.Blob, handleBlobFn func(blob restic.BlobHandle, buf []byte, err error) error) error {
		ctr := 0
		return loader(ctx, packID, blobs, func(blob restic.BlobHandle, buf []byte, err error) error {
			if ctr < 2 {
				ctr++
				return handleBlobFn(blob, buf, err)
			}
			// break file2
			return errors.New("failed to load blob")
		})
	}

	r := newFileRestorer(tempdir, repo.loader, repo.Lookup, 2, false, false, repo.StartWarmup, nil)
	r.files = repo.files

	var errors []string
	r.Error = func(s string, e error) error {
		// ignore errors as in the `restore` command
		errors = append(errors, s)
		return nil
	}

	err := r.restoreFiles(context.TODO())
	rtest.OK(t, err)

	rtest.Assert(t, len(errors) == 1, "unexpected number of restore errors, expected: 1, got: %v", len(errors))
	rtest.Assert(t, errors[0] == "file2", "expected error for file2, got: %v", errors[0])
}
|