
multi: add migrate-db command #21


Merged
merged 6 commits into from
Apr 15, 2025

Conversation

guggero
Member

@guggero guggero commented Jul 4, 2022

This migration tool will allow LND users to migrate boltdb databases to SQL databases. LND supports two types of SQL databases: SQLite and Postgres.

So only the directions bbolt => SQLite and bbolt => Postgres are supported.

Important:

What is NOT supported (for now):

etcd => SQL (SQLite/Postgres), bbolt => etcd, and any SQL-to-SQL variation such as SQLite => Postgres.

So be aware that if you choose to migrate to SQLite, there is currently no way to switch to Postgres later.
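
The supported and unsupported directions listed above can be written down as a small validation check. This is only an illustrative sketch: the `Backend` type, its constants, and `checkDirection` are hypothetical names chosen for this example and are not taken from lndinit.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical name for a database backend; it is not an
// lndinit type.
type Backend string

const (
	Bolt     Backend = "bolt"
	Etcd     Backend = "etcd"
	Sqlite   Backend = "sqlite"
	Postgres Backend = "postgres"
)

var errUnsupported = errors.New("unsupported migration direction")

// checkDirection accepts only bbolt => sqlite and bbolt => postgres, the two
// directions described as supported in this PR. Everything else is rejected.
func checkDirection(src, dst Backend) error {
	if src == Bolt && (dst == Sqlite || dst == Postgres) {
		return nil
	}
	return fmt.Errorf("%w: %s => %s", errUnsupported, src, dst)
}

func main() {
	fmt.Println(checkDirection(Bolt, Sqlite))
	fmt.Println(checkDirection(Sqlite, Postgres))
}
```

Rejecting everything except an explicit allow-list keeps the check correct if further backends are added later.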

@kiwiidb

kiwiidb commented Jul 14, 2022

Using this code I was getting an error:
Runtime error: db connection set not initialized
I started looking around and found that you were probably missing this: https://github.com/lightningnetwork/lnd/blob/master/lncfg/db.go#L140

So I added a quick patch: getAlby@0ffcae3
(should be a config param probably).

And then I was able to do the migration. The last logs were:

2022-07-14 10:41:42.534 LNDINIT: Opened destination DB
2022-07-14 10:41:42.535 LNDINIT: Checking tombstone marker on source DB
2022-07-14 10:41:42.536 LNDINIT: Checking DB version of source DB
2022-07-14 10:41:42.537 LNDINIT: Checking if migration was already applied to target DB
2022-07-14 10:41:42.540 LNDINIT: Starting the migration to the target backend
2022-07-14 10:41:42.540 LNDINIT: Copying top-level bucket 'waddrmgr'
2022-07-14 10:41:48.981 LNDINIT: Committing bucket 'waddrmgr'
2022-07-14 10:41:48.983 LNDINIT: Copying top-level bucket 'wtxmgr'
2022-07-14 10:41:49.016 LNDINIT: Committing bucket 'wtxmgr'
2022-07-14 10:41:49.017 LNDINIT: Creating 'wallet created' marker
2022-07-14 10:41:49.024 LNDINIT: Committing 'wallet created' marker

It seems everything has migrated successfully, so I would say that a log line indicating this would also be useful.

@twofaktor

twofaktor commented Apr 15, 2024

Hello, what happened with this PR? Has it been abandoned? I think it would be very useful for migrating an existing bbolt to postgres while following this guide

Thanks!

@ziggie1984
Contributor

This is an important PR we need to focus on to allow nodes to finally migrate from kv to native sql. cc @saubyk

@Roasbeef
Member

Roasbeef commented Oct 2, 2024

sqlite should be added here: https://github.com/lightninglabs/lndinit/pull/21/files#diff-00eb92ba2060dddcdbce2dba1dc551065557b5ebcf98e47e70946e1b50ffd243R453

The top-level bucket structure might also have changed somewhat, so we should update that aspect.

One other thing is we'll need to figure out a way to ensure that all the data has truly been migrated (eg: we don't forget some other top level bucket recently added).


@saubyk saubyk left a comment


Still testing, but wanted to note some comments

@ziggie1984 ziggie1984 self-assigned this Nov 5, 2024
@ziggie1984
Contributor

ziggie1984 commented Nov 7, 2024

One other thing is we'll need to figure out a way to ensure that all the data has truly been migrated (eg: we don't forget some other top level bucket recently added).

So you mean that keeping the db files constant here (channeldb, towerdb etc.) is not enough, and we also need to make sure every new bucket ends up in the new db? I'm not sure I understand this one: as long as we iterate through each bucket in the db file, we shouldn't be able to miss a bucket, right?
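
One way to make the "did we miss a top-level bucket?" question checkable is to compare the set of top-level bucket names found in the source with the set found in the destination after the copy. The sketch below works on plain string slices so that it stays self-contained; `missingBuckets` is a hypothetical helper, and a real check would read the names from the actual source and destination backends.

```go
package main

import (
	"fmt"
	"sort"
)

// missingBuckets returns the top-level bucket names that are present in src
// but absent from dst, sorted for stable output.
func missingBuckets(src, dst []string) []string {
	seen := make(map[string]struct{}, len(dst))
	for _, name := range dst {
		seen[name] = struct{}{}
	}

	var missing []string
	for _, name := range src {
		if _, ok := seen[name]; !ok {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)

	return missing
}

func main() {
	// Bucket names as they appear in the wallet migration log earlier in
	// this thread.
	src := []string{"waddrmgr", "wtxmgr"}
	dst := []string{"waddrmgr"}

	fmt.Println("missing in destination:", missingBuckets(src, dst))
}
```

Because the source list is enumerated from the file itself rather than from a hard-coded list, a bucket added in a newer LND version would show up as missing instead of being silently skipped.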

@ziggie1984
Contributor

ziggie1984 commented Nov 19, 2024

Migration now supports sqlite as well. Keep in mind this is still an experimental feature because we need to implement some consistency checks.

This PR still needs some work:

  1. Need to do the migration in batches, because memory usage can become a problem.
  2. Add consistency checks. Base level: compare the two dbs and make sure every key-value pair was successfully copied.
    Also do some structural data tests, for example compare the channels from the old db and the newly migrated db.
  3. This only works if you run LND 18.3; otherwise we do not allow the migration.
  4. Add a wtclient.db check so that we only migrate if the specific version is met.
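
Items 1 and 2 of the list above (batched copying and a base-level key-value comparison) can be sketched as follows. This is a minimal in-memory illustration, not the migration code of this PR: the `kv` type, `copyInChunks`, and `verify` are hypothetical, the source is a slice instead of a bbolt cursor, and the destination is a map instead of a SQL backend.

```go
package main

import (
	"bytes"
	"fmt"
)

// kv is a single key-value pair. A real migration would read these from the
// source database; a slice keeps the sketch self-contained.
type kv struct {
	key, value []byte
}

// copyInChunks copies src into dst at most chunkSize pairs at a time and
// returns the number of chunks processed. Each outer loop iteration stands in
// for one destination transaction, so the work held in flight is bounded by
// chunkSize rather than by the size of the whole database.
func copyInChunks(src []kv, dst map[string][]byte, chunkSize int) int {
	if chunkSize < 1 {
		chunkSize = 1
	}

	chunks := 0
	for start := 0; start < len(src); start += chunkSize {
		end := start + chunkSize
		if end > len(src) {
			end = len(src)
		}
		for _, p := range src[start:end] {
			// Copy the value so dst does not alias source memory.
			dst[string(p.key)] = append([]byte(nil), p.value...)
		}
		chunks++
	}

	return chunks
}

// verify checks that every source pair exists in dst with an identical value
// and that dst holds no extra keys. It assumes the keys in src are unique.
func verify(src []kv, dst map[string][]byte) error {
	if len(src) != len(dst) {
		return fmt.Errorf("key count mismatch: source %d, destination %d",
			len(src), len(dst))
	}
	for _, p := range src {
		got, ok := dst[string(p.key)]
		if !ok {
			return fmt.Errorf("key %x missing in destination", p.key)
		}
		if !bytes.Equal(got, p.value) {
			return fmt.Errorf("value mismatch for key %x", p.key)
		}
	}

	return nil
}

func main() {
	src := []kv{
		{[]byte("k1"), []byte("v1")},
		{[]byte("k2"), []byte("v2")},
		{[]byte("k3"), []byte("v3")},
		{[]byte("k4"), []byte("v4")},
		{[]byte("k5"), []byte("v5")},
	}
	dst := make(map[string][]byte)

	fmt.Println("chunks committed:", copyInChunks(src, dst, 2))
	fmt.Println("verification error:", verify(src, dst))
}
```

The structural tests mentioned in item 2 (for example comparing decoded channels) would sit on top of this byte-level comparison and are not covered by the sketch.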

Example for the sqlite migration:

 lndinit -v migrate-db  --network regtest --source.bolt.data-dir /LND-DIR/data   --dest.backend sqlite  --dest.sqlite.data-dir /LND-DIR/data

Make sure the data-dir for sqlite is the same as the boltdb one; otherwise you need to copy the new files into the other directory before starting lnd.

For postgres:

 lndinit -v migrate-db  --network regtest --source.bolt.data-dir /LND-DIR/data  --dest.backend postgres  --dest.postgres.dsn="postgres://lnd_migration:lnd_migration@localhost:5432/lnd_migration_db?sslmode=disable"

Make sure you create a db in postgres beforehand; otherwise opening the db will fail.

Also make sure you choose the correct network; otherwise the bolt.db will not be found in the LND dir.
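
To illustrate why the network flag matters, the following sketch builds the locations at which the bbolt files would be expected for a given data dir and network. The layout mirrors paths quoted elsewhere in this thread (`data/graph/<network>/channel.db`, `data/chain/bitcoin/<network>/wallet.db`, `data/watchtower/bitcoin/<network>/watchtower.db`); the `boltPaths` helper itself is hypothetical and is not how lndinit resolves paths.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// boltPaths returns where this sketch expects the main bbolt files for a
// given lnd data dir and network. The function is an illustration only and
// is not part of lndinit.
func boltPaths(dataDir, network string) map[string]string {
	return map[string]string{
		"channel.db": filepath.Join(
			dataDir, "graph", network, "channel.db",
		),
		"wallet.db": filepath.Join(
			dataDir, "chain", "bitcoin", network, "wallet.db",
		),
		"watchtower.db": filepath.Join(
			dataDir, "watchtower", "bitcoin", network, "watchtower.db",
		),
	}
}

func main() {
	for name, p := range boltPaths("/LND-DIR/data", "regtest") {
		fmt.Printf("%-14s %s\n", name, p)
	}
}
```

Passing `mainnet` instead of `regtest` changes every path, which is why a wrong `--network` value leads to the source files not being found.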

@bhandras bhandras self-requested a review November 20, 2024 19:46
@ziggie1984 ziggie1984 force-pushed the migrate-db branch 2 times, most recently from db62172 to 9a15a14 Compare November 22, 2024 20:42
@feelancer21

feelancer21 commented Dec 15, 2024

@ziggie1984
I am testing the sqlite migration for my watchtower node. It seems that the migration routine is looking for the watchtower.db in $HOME/.lnd and not in the specified source.bolt.data-dir <- solved by using the tower-dir parameters

Edit:
Moreover, by default the routine looks for $HOME/.lnd/data/bitcoin/mainnet/watchtower.db and not for $HOME/.lnd/data/watchtower/bitcoin/mainnet/watchtower.db

Edit2:
A runtime error occurred. I think the machine ran out of memory: watchtower.db is ~60G, the machine has 8G of memory, and the memory was almost completely used up at the end.

2024-12-15 12:44:39.625 LNDINIT: Migrating DB with prefix towerserverdb
2024-12-15 12:44:39.625 LNDINIT: Opening bbolt backend at /home/feelancer21/.lnd/data/bitcoin/mainnet/watchtower.db for prefix 'towerserverdb'
2024-12-15 14:05:53.255 LNDINIT: Opening sqlite backend with prefix 'towerserverdb'
2024-12-15 14:05:53.321 LNDINIT: Opened destination DB
2024-12-15 14:05:53.321 LNDINIT: Checking tombstone marker on source DB
2024-12-15 14:05:53.322 LNDINIT: Checking DB version of source DB
2024-12-15 14:05:53.322 LNDINIT: Checking if migration was already applied to target DB
2024-12-15 14:05:53.330 LNDINIT: Starting the migration to the target backend
2024-12-15 14:05:53.331 LNDINIT: Copying top-level bucket 'lookout-tip-bucket'
2024-12-15 14:05:53.331 LNDINIT: Committing bucket 'lookout-tip-bucket'
2024-12-15 14:05:53.335 LNDINIT: Copying top-level bucket 'metadata-bucket'
2024-12-15 14:05:53.336 LNDINIT: Committing bucket 'metadata-bucket'
2024-12-15 14:05:53.339 LNDINIT: Copying top-level bucket 'sessions-bucket'
2024-12-15 14:05:59.461 LNDINIT: Committing bucket 'sessions-bucket'
2024-12-15 14:05:59.580 LNDINIT: Copying top-level bucket 'update-index-bucket'
2024-12-15 15:36:29.131 LNDINIT: Committing bucket 'update-index-bucket'
2024-12-15 15:41:59.768 LNDINIT: Copying top-level bucket 'updates-bucket'
2024-12-15 17:38:34.307 LNDINIT: Runtime error: error enumerating top level buckets: error copying bucket 'updates-bucket': error creating bucket '1b12e646e05eb7698a7db9ddbdfd9def': context deadline exceeded

@jkuchar

jkuchar commented Jan 2, 2025

Any plans for releasing this feature? It could be released in beta and used at one's own risk. What is the level of risk of losing money because of a broken migration? Is there any way to "dry run" the migration and ensure that lnd is in a correct state afterwards?

@ziggie1984
Contributor

Any plans for releasing this feature? It could be released in beta and used at one's own risk. What is the level of risk of losing money because of a broken migration? Is there any way to "dry run" the migration and ensure that lnd is in a correct state afterwards?

This feature will be released alongside LND 19. Currently this feature is not ready for big databases. There are data-consistency checks in the making which will make sure the db was migrated correctly.

@ziggie1984
Contributor

ziggie1984 commented Mar 6, 2025

The main code change will reside in ziggie1984/migrateboltdb#1

This will make it very easy to move the code to LND in case we decide to do so.

@ziggie1984
Contributor

ziggie1984 commented Mar 6, 2025

Missing:

  • Add an itest/unit test which migrates real lnd dbs (small dbs, just to verify the migration works as expected)
  • Maybe some docs on how the migration works
  • Planning to remove ETCD support here completely, because it might not work with the assumptions made during the chunked migration design

@ziggie1984
Contributor

After discussing this change we decided to just add a new package to lndinit which has the new migration/verification code

@ZZiigguurraatt

OK, so adding my comments as separate issues so we can more easily track. Here's a new one: #47 .

Member

@bhandras bhandras left a comment


Awesome work @ziggie1984 and @guggero 🎉
I have one comment that I think is worth considering re: sequence keys.

tACK (also migrated my production node with the last version)

LGTM 🥇

@ziggie1984 ziggie1984 force-pushed the migrate-db branch 3 times, most recently from c1b4d8c to 0293f53 Compare April 10, 2025 15:40
@kornpow

kornpow commented Apr 10, 2025

Just tested out the latest code and did a migration to sqlite. The information output is really useful, and it definitely felt like verification was a bit faster. 5 minutes to migrate my smaller test node. 👍

@ZZiigguurraatt

I finished growing my database to 32GB this morning

$ du -hs bob/.lnd/
32G	bob/.lnd/
$

I then fetched the latest code updates, recompiled, and started a new migration test with this tool. The migration tool took 356 minutes when migrating to postgres.

My CPU is an 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz and I have an NVMe disk.

The resulting postgres DB is only 20GB.

$ sudo du -hs pgdata/
20G	pgdata/
$

I'm now going to re-try with SQLite.

@ZZiigguurraatt

I completed the migration of the same database to SQLite on the same machine. The time it took was 324 minutes.
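
For a rough comparison of the two runs, the reported figures (a 32G source directory, 356 minutes for postgres, 324 minutes for SQLite) can be turned into an average throughput. The helper below is a hypothetical convenience function; it treats the `du -h` figure as GiB and ignores that the directory also contains files that are not migrated, so the result is only an approximation (roughly 1.5 MiB/s and 1.7 MiB/s respectively).

```go
package main

import "fmt"

// throughputMiBps converts a database size in GiB and a duration in minutes
// into an average rate in MiB per second.
func throughputMiBps(sizeGiB, minutes float64) float64 {
	return sizeGiB * 1024 / (minutes * 60)
}

func main() {
	fmt.Printf("postgres: %.2f MiB/s\n", throughputMiBps(32, 356))
	fmt.Printf("sqlite:   %.2f MiB/s\n", throughputMiBps(32, 324))
}
```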

Here are the resulting files:

$ ll -Rh bob/.lnd
bob/.lnd:
total 28K
drwx------ 5 b b 4.0K Apr 11 17:39 ./
drwxrwxr-x 6 b b 4.0K Apr 11 17:39 ../
drwx------ 5 b b 4.0K Apr 11 23:23 data/
drwx------ 2 b b 4.0K Apr 11 17:39 letsencrypt/
drwx------ 3 b b 4.0K Apr 11 17:39 logs/
-rw-r--r-- 1 b b  802 Apr 11 17:39 tls.cert
-rw------- 1 b b  227 Apr 11 17:39 tls.key

bob/.lnd/data:
total 20K
drwx------ 5 b b 4.0K Apr 11 23:23 ./
drwx------ 5 b b 4.0K Apr 11 17:39 ../
drwx------ 3 b b 4.0K Apr 11 17:38 chain/
drwx------ 3 b b 4.0K Apr 11 17:38 graph/
drwx------ 3 b b 4.0K Apr 11 17:38 watchtower/

bob/.lnd/data/chain:
total 12K
drwx------ 3 b b 4.0K Apr 11 17:38 ./
drwx------ 5 b b 4.0K Apr 11 23:23 ../
drwx------ 3 b b 4.0K Apr 11 17:38 bitcoin/

bob/.lnd/data/chain/bitcoin:
total 12K
drwx------ 3 b b 4.0K Apr 11 17:38 ./
drwx------ 3 b b 4.0K Apr 11 17:38 ../
drwx------ 2 b b 4.0K Apr 11 23:23 regtest/

bob/.lnd/data/chain/bitcoin/regtest:
total 1.9M
drwx------ 2 b b 4.0K Apr 11 23:23 ./
drwx------ 3 b b 4.0K Apr 11 17:38 ../
-rw-r----- 1 b b  293 Apr 11 17:38 admin.macaroon
-rw-r--r-- 1 b b   83 Apr 11 17:38 chainnotifier.macaroon
-rw-r--r-- 1 b b 772K Apr 11 23:23 chain.sqlite
-rw-r--r-- 1 b b 1.9K Apr 11 17:38 channel.backup
-rw-r--r-- 1 b b  132 Apr 11 17:38 invoice.macaroon
-rw-r--r-- 1 b b   91 Apr 11 17:38 invoices.macaroon
-rw------- 1 b b  32K Apr 11 23:07 macaroons.db
-rw-rw-r-- 1 b b    0 Apr 11 23:07 macaroons.db.migrated-to-sqlite-2025-04-11-23-07
-rw-r--r-- 1 b b  217 Apr 11 17:38 readonly.macaroon
-rw-r--r-- 1 b b   91 Apr 11 17:38 router.macaroon
-rw-r--r-- 1 b b   92 Apr 11 17:38 signer.macaroon
-rw------- 1 b b 2.0M Apr 11 23:23 wallet.db
-rw-rw-r-- 1 b b    0 Apr 11 23:23 wallet.db.migrated-to-sqlite-2025-04-11-23-23
-rw-r--r-- 1 b b  114 Apr 11 17:38 walletkit.macaroon

bob/.lnd/data/graph:
total 12K
drwx------ 3 b b 4.0K Apr 11 17:38 ./
drwx------ 5 b b 4.0K Apr 11 23:23 ../
drwx------ 2 b b 4.0K Apr 11 23:23 regtest/

bob/.lnd/data/graph/regtest:
total 45G
drwx------ 2 b b 4.0K Apr 11 23:23 ./
drwx------ 3 b b 4.0K Apr 11 17:38 ../
-rw------- 1 b b  32G Apr 11 23:07 channel.db
-rw-rw-r-- 1 b b    0 Apr 11 23:07 channel.db.migrated-to-sqlite-2025-04-11-23-07
-rw-r--r-- 1 b b  14G Apr 11 23:23 channel.sqlite
-rw------- 1 b b 415M Apr 11 23:23 sphinxreplay.db
-rw-rw-r-- 1 b b    0 Apr 11 23:23 sphinxreplay.db.migrated-to-sqlite-2025-04-11-23-23

bob/.lnd/data/watchtower:
total 12K
drwx------ 3 b b 4.0K Apr 11 17:38 ./
drwx------ 5 b b 4.0K Apr 11 23:23 ../
drwx------ 3 b b 4.0K Apr 11 17:38 bitcoin/

bob/.lnd/data/watchtower/bitcoin:
total 12K
drwx------ 3 b b 4.0K Apr 11 17:38 ./
drwx------ 3 b b 4.0K Apr 11 17:38 ../
drwx------ 2 b b 4.0K Apr 11 17:38 regtest/

bob/.lnd/data/watchtower/bitcoin/regtest:
total 8.0K
drwx------ 2 b b 4.0K Apr 11 17:38 ./
drwx------ 3 b b 4.0K Apr 11 17:38 ../

bob/.lnd/letsencrypt:
total 8.0K
drwx------ 2 b b 4.0K Apr 11 17:39 ./
drwx------ 5 b b 4.0K Apr 11 17:39 ../

bob/.lnd/logs:
total 12K
drwx------ 3 b b 4.0K Apr 11 17:39 ./
drwx------ 5 b b 4.0K Apr 11 17:39 ../
drwx------ 3 b b 4.0K Apr 11 17:39 bitcoin/

bob/.lnd/logs/bitcoin:
total 12K
drwx------ 3 b b 4.0K Apr 11 17:39 ./
drwx------ 3 b b 4.0K Apr 11 17:39 ../
drwx------ 2 b b 4.0K Apr 11 17:39 regtest/

bob/.lnd/logs/bitcoin/regtest:
total 13M
drwx------ 2 b b 4.0K Apr 11 17:39 ./
drwx------ 3 b b 4.0K Apr 11 17:39 ../
-rw-r--r-- 1 b b 6.6M Apr 11 17:58 lnd.log
-rw-r--r-- 1 b b 2.0M Apr 11 17:39 lnd.log.29033.gz
-rw-r--r-- 1 b b 2.0M Apr 11 17:39 lnd.log.29034.gz
-rw-r--r-- 1 b b 1.9M Apr 11 17:39 lnd.log.29035.gz
$
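
The listing above shows empty marker files next to each migrated bbolt file, for example `channel.db.migrated-to-sqlite-2025-04-11-23-07`. A script that wants to detect which databases were already migrated could parse these names as sketched below. The name format is inferred only from the sqlite listing in this comment; whether other backends use the same pattern is an assumption, and `parseMarker` is a hypothetical helper, not an lndinit function.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// markerLayout matches the timestamp suffix seen in the listing, e.g.
// "2025-04-11-23-07" (year-month-day-hour-minute).
const markerLayout = "2006-01-02-15-04"

// parseMarker splits a marker file name such as
// "channel.db.migrated-to-sqlite-2025-04-11-23-07" into the original db file
// name, the destination backend, and the timestamp.
func parseMarker(name string) (string, string, time.Time, error) {
	const sep = ".migrated-to-"

	i := strings.Index(name, sep)
	if i < 0 {
		return "", "", time.Time{},
			fmt.Errorf("%q is not a migration marker", name)
	}
	db := name[:i]

	// rest looks like "sqlite-2025-04-11-23-07".
	rest := name[i+len(sep):]
	j := strings.Index(rest, "-")
	if j < 0 {
		return "", "", time.Time{},
			fmt.Errorf("%q has no timestamp", name)
	}
	backend := rest[:j]

	at, err := time.Parse(markerLayout, rest[j+1:])
	if err != nil {
		return "", "", time.Time{}, err
	}

	return db, backend, at, nil
}

func main() {
	db, backend, at, err := parseMarker(
		"channel.db.migrated-to-sqlite-2025-04-11-23-07",
	)
	fmt.Println(db, backend, at, err)
}
```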

@ziggie1984 ziggie1984 force-pushed the migrate-db branch 3 times, most recently from 42e4513 to cc522fb Compare April 13, 2025 07:08
@ziggie1984
Contributor

@ZZiigguurraatt addressed all your issues PTAL

@ziggie1984
Contributor

Also migrated my node to SQLITE successfully:

channeldb:    3 GB, took 1:30 min
decayedlogdb: 1.4 GB, took 4 hours

The others were negligible.

@ZZiigguurraatt

I tried starting LND up with --lnd.db.backend=sqlite --lnd.db.use-native-sql. That looks like it took about 62 minutes to migrate the invoice DB from KV to SQL using SQLite.

2025-04-14 05:32:01.283 [INF] INVC: Starting migration of invoices from KV to SQL
2025-04-14 05:41:01.168 [DBG] INVC: Created SQL invoice hash index in 8m59.885808016s

<snip because of https://github.com/lightningnetwork/lnd/issues/9661>

2025-04-14 06:33:13.677 [DBG] INVC: Migrated 4689000 KV invoices to SQL in 821.470938ms
2025-04-14 06:33:14.495 [DBG] INVC: Migrated 4690000 KV invoices to SQL in 818.067068ms
2025-04-14 06:33:15.261 [DBG] INVC: Migrated 4691000 KV invoices to SQL in 765.216883ms
2025-04-14 06:33:16.046 [DBG] INVC: Migrated 4692000 KV invoices to SQL in 785.032097ms
2025-04-14 06:33:16.861 [DBG] INVC: Migrated 4693000 KV invoices to SQL in 815.156901ms
2025-04-14 06:33:17.662 [DBG] INVC: Migrated 4694000 KV invoices to SQL in 800.830851ms
2025-04-14 06:33:18.529 [DBG] INVC: Migrated 4695000 KV invoices to SQL in 866.687796ms
2025-04-14 06:33:19.320 [DBG] INVC: Migrated 4696000 KV invoices to SQL in 790.951899ms
2025-04-14 06:33:20.110 [DBG] INVC: Migrated 4697000 KV invoices to SQL in 790.124272ms
2025-04-14 06:33:20.897 [DBG] INVC: Migrated 4698000 KV invoices to SQL in 787.642721ms
2025-04-14 06:33:21.703 [DBG] INVC: Migrated 4699000 KV invoices to SQL in 805.502951ms
2025-04-14 06:33:22.547 [DBG] INVC: Migrated 4700000 KV invoices to SQL in 843.782202ms
2025-04-14 06:33:23.259 [DBG] INVC: Migrated 4700862 KV invoices to SQL in 712.062361ms

2025-04-14 06:33:23.259 [INF] INVC: All invoices migrated
2025-04-14 06:33:29.774 [INF] INVC: Migration of 4700862 invoices from KV to SQL completed
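
In the progress lines above the invoice count is cumulative while the duration covers only the latest batch (consecutive timestamps are about one batch duration apart). Under that reading, a per-batch rate can be derived from two consecutive lines. The sketch below parses lines in the format shown in this comment; `parseProgress` and `batchRate` are hypothetical helpers written for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"time"
)

// migratedRe captures the running invoice count and the batch duration from
// lines like "... INVC: Migrated 4689000 KV invoices to SQL in 821.470938ms".
var migratedRe = regexp.MustCompile(`Migrated (\d+) KV invoices to SQL in (\S+)`)

// parseProgress returns the cumulative count and the batch duration from one
// log line. The boolean is false if the line is not a progress line.
func parseProgress(line string) (int, time.Duration, bool) {
	m := migratedRe.FindStringSubmatch(line)
	if m == nil {
		return 0, 0, false
	}
	count, err := strconv.Atoi(m[1])
	if err != nil {
		return 0, 0, false
	}
	d, err := time.ParseDuration(m[2])
	if err != nil {
		return 0, 0, false
	}

	return count, d, true
}

// batchRate computes invoices per second for the batch between two
// cumulative counts.
func batchRate(prev, cur int, d time.Duration) float64 {
	return float64(cur-prev) / d.Seconds()
}

func main() {
	lines := []string{
		"2025-04-14 06:33:13.677 [DBG] INVC: Migrated 4689000 KV invoices to SQL in 821.470938ms",
		"2025-04-14 06:33:14.495 [DBG] INVC: Migrated 4690000 KV invoices to SQL in 818.067068ms",
	}

	prev, _, _ := parseProgress(lines[0])
	cur, d, _ := parseProgress(lines[1])
	fmt.Printf("%.0f invoices/s\n", batchRate(prev, cur, d))
}
```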

@guggero guggero merged commit 724a26b into main Apr 15, 2025
2 checks passed
@guggero guggero deleted the migrate-db branch April 15, 2025 16:43