stash/pkg/sqlite/custom_migrations.go

package sqlite

// Add indexes for path and checksum to images (#1740)
//
// * Add indexes for path and checksum to images
//
//   The scenes table has unique indexes/constraints on the path and checksum
//   columns. The images table does not, which makes little sense: scanning
//   uses these columns extensively, which warrants an index, and both should
//   be unique as well. Adding these indexes therefore greatly improves scan
//   performance. On a database containing 4700 images, a rescan of those 4700
//   files (which should do nothing) took 1.2 seconds; with the indexes added
//   it takes 0.4 seconds. The same test on a generated database containing 4M
//   images plus the actual 4700 images took 26 minutes for a rescan; with the
//   index in place it also takes only 0.4 seconds.
//
// * Add images.checksum unique constraint in code with fallback
//
//   Work around the issue where duplicate images (duplicate checksums on
//   images) might exist in some cases. As discussed in #1740, the index is
//   created on startup and, if that fails, the duplicates are logged. Users
//   in this situation can then correct the database (by searching for the
//   logged checksums and removing the duplicates), and after a restart the
//   unique index/constraint will still be created. If creating the unique
//   index fails, a normal (non-unique) index is created as a surrogate, so
//   the user still gets the performance benefit (for example during scanning)
//   without first having to remove the duplicates and restart. The surrogate
//   is automatically cleaned up once the unique index has been successfully
//   created.
//
// 2021-09-21 01:48:52 +00:00

import (
	"context"

	// File storage rewrite (#2676)
	//
	// * Restructure data layer part 2 (#2599)
	// * Refactor and separate image model
	// * Refactor image query builder
	// * Handle relationships in image query builder
	// * Remove relationship management methods
	// * Refactor gallery model/query builder
	// * Add scenes to gallery model
	// * Convert scene model
	// * Refactor scene models
	// * Remove unused methods
	// * Add unit tests for gallery
	// * Add image tests
	// * Add scene tests
	// * Convert unnecessary scene value pointers to values
	// * Convert unnecessary pointer values to values
	// * Refactor scene partial
	// * Add scene partial tests
	// * Refactor ImagePartial
	// * Add image partial tests
	// * Refactor gallery partial update
	// * Add partial gallery update tests
	// * Use zero/null package for null values
	// * Add files and scan system
	// * Add sqlite implementation for files/folders
	// * Add unit tests for files/folders
	// * Image refactors
	// * Update image data layer
	// * Refactor gallery model and creation
	// * Refactor scene model
	// * Refactor scenes
	// * Don't set title from filename
	// * Allow galleries to freely add/remove images
	// * Add multiple scene file support to graphql and UI
	// * Add multiple file support for images in graphql/UI
	// * Add multiple file for galleries in graphql/UI
	// * Remove use of some deprecated fields
	// * Remove scene path usage
	// * Remove gallery path usage
	// * Remove path from image
	// * Move funscript to video file
	// * Refactor caption detection
	// * Migrate existing data
	// * Add post commit/rollback hook system
	// * Lint. Comment out import/export tests
	// * Add WithDatabase read only wrapper
	// * Prepend tasks to list
	// * Add 32 pre-migration
	// * Add warnings in release and migration notes
	//
	// 2022-07-13 06:30:54 +00:00

	"github.com/jmoiron/sqlx"
)
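
// customMigrationFunc is the signature of a migration hook written in Go.
// It receives the database handle and returns an error on failure.
type customMigrationFunc func(ctx context.Context, db *sqlx.DB) error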
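
// RegisterPostMigration registers fn as a post-migration hook for
// schemaVersion. Hooks registered for the same version are kept in
// registration order. The underlying map is not locked, so this function is
// not safe for concurrent use.
func RegisterPostMigration(schemaVersion uint, fn customMigrationFunc) {
	v := postMigrations[schemaVersion]
	v = append(v, fn)
	postMigrations[schemaVersion] = v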
}
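
// RegisterPreMigration registers fn as a pre-migration hook for
// schemaVersion. Hooks registered for the same version are kept in
// registration order. The underlying map is not locked, so this function is
// not safe for concurrent use.
func RegisterPreMigration(schemaVersion uint, fn customMigrationFunc) {
	v := preMigrations[schemaVersion]
	v = append(v, fn)
	preMigrations[schemaVersion] = v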
}
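
// postMigrations and preMigrations map a schema version to the hooks
// registered for it, in registration order.
var postMigrations = make(map[uint][]customMigrationFunc)
var preMigrations = make(map[uint][]customMigrationFunc)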