Implement #6413 : Data pages of newly gbak restored databases should marked as "swept" [CORE6164] #8549

Open, wants to merge 4 commits into base: master

hvlad
Member

@hvlad hvlad commented May 3, 2025

No description provided.

@hvlad hvlad self-assigned this May 3, 2025
@hvlad hvlad requested a review from dyemanov May 3, 2025 18:09
@aafemt
Contributor

aafemt commented May 3, 2025

Why the exception for system relations?

@hvlad
Member Author

hvlad commented May 4, 2025

Why the exception for system relations?

There is no guarantee that system tables contain no backversions after restore.

@omachtandras

Vlad, thank you!!!

@aafemt
Contributor

aafemt commented May 4, 2025

There is no guarantee that system tables contain no backversions after restore.

Does this flag prevent garbage collection as well? System tables are read extremely intensively, so hardly any garbage in them can be left uncollected.

@hvlad
Member Author

hvlad commented May 5, 2025

There is no guarantee that system tables contain no backversions after restore.

Does this flag prevent garbage collection as well?

Read the code.

System tables are read extremely intensively, so hardly any garbage in them can be left uncollected.

Can you guarantee that? Run gbak -r and then gstat -r -s.

@dyemanov dyemanov requested a review from ilya071294 May 7, 2025 13:52
src/jrd/dpm.epp Outdated
if (type == DPM_primary)
{
// When restoring, mark slot as swept, data page will be marked by our caller
if (tdbb->getAttachment()->isGbak() && !relation->isSystem())
Contributor

  1. I would put this check in a separate inline function like inline bool sweptOnRestore(thread_db* tdbb, jrd_rel* rel).
  2. isGbak() is not 100% safe to use. I remember testing a case where I created a DB trigger that inserts/updates data, and when I started gbak -b, the normal swept flag logic didn't work while the trigger was executing. So we need to distinguish between backup and restore. isRWGbak() is more suitable here, but it won't work right away: ATT_creator should also be passed to parallel worker attachments. To do that, I created isc_dpb_gbak_restore_rel_attach so that JProvider::internalAttach can check it along with isc_dpb_gbak_attach and set ATT_creator.

If it's needed, I can provide a patch with these corrections.
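
For readers following the thread, here is a minimal sketch of the helper proposed in point 1, combined with the backup-versus-restore distinction from point 2. The `Attachment` and `Relation` structs are hypothetical stand-ins rather than the engine's real types, the flag bit values are assumptions, and the real function would take `thread_db*` and `jrd_rel*` as the comment suggests.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the engine types; the real classes are far richer.
enum AttFlag : std::uint32_t
{
    ATT_gbak_attachment = 1u << 0, // opened by gbak, for backup or restore (bit value assumed)
    ATT_creator         = 1u << 1  // attachment that creates the database (bit value assumed)
};

struct Attachment
{
    std::uint32_t flags = 0;

    // True for any gbak attachment, including "gbak -b" (backup).
    bool isGbak() const { return (flags & ATT_gbak_attachment) != 0; }

    // Assumed meaning: a gbak attachment that is also the database creator, i.e. a restore.
    bool isRWGbak() const { return isGbak() && (flags & ATT_creator) != 0; }
};

struct Relation
{
    bool system = false;
    bool isSystem() const { return system; }
};

// Pages may be marked swept only when restoring, and only for user relations.
inline bool sweptOnRestore(const Attachment& att, const Relation& rel)
{
    return att.isRWGbak() && !rel.isSystem();
}
```

With this predicate, a backup attachment that fires a DB trigger no longer qualifies, because it carries the gbak flag without the creator flag.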

Member Author

  1. I'll consider it.
  2. Good advice, thanks. I think we can handle the "backup vs restore" issue without a new DPB tag: let the creator attachment mark dbb with a new DBB_restore flag (DBB_creating already exists and serves another purpose) and clear this flag on disconnect. The only risk here is if a user creates an attachment while the database is being restored and changes some data, but this is not supported anyway.
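
The database-level flag described in point 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: `Database` and `CreatorAttachment` are hypothetical types, the bit value of `DBB_restore` is invented, and the real engine would clear the flag in its disconnect path rather than in a destructor.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical bit value; only the flag name comes from the discussion.
constexpr std::uint32_t DBB_restore = 1u << 0;

// Hypothetical stand-in for the shared database block (dbb).
struct Database
{
    std::atomic<std::uint32_t> flags{0};

    bool isRestoring() const { return (flags.load() & DBB_restore) != 0; }
};

// Models the creator attachment: sets the flag on attach, clears it on disconnect.
class CreatorAttachment
{
public:
    explicit CreatorAttachment(Database& dbb) : m_dbb(dbb)
    {
        m_dbb.flags.fetch_or(DBB_restore);
    }

    ~CreatorAttachment()
    {
        m_dbb.flags.fetch_and(~DBB_restore);
    }

    CreatorAttachment(const CreatorAttachment&) = delete;
    CreatorAttachment& operator=(const CreatorAttachment&) = delete;

private:
    Database& m_dbb;
};
```

Because the state lives on the database object rather than on one attachment, every attachment in the same process sees it, which is the property the comment relies on.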

Contributor

User applications using isc_create_database() can do many things in the attachment, not only restore.

Contributor

2. The only risk here is if a user creates an attachment while the database is being restored

This is possible only when restore is parallel due to DB multi shutdown mode, right? Maybe, when DBB_restore is set, only attachments with isc_dpb_gbak_attach should be allowed.

Member Author

  2. The only risk here is if a user creates an attachment while the database is being restored

This is possible only when restore is parallel due to DB multi shutdown mode, right?

Yes

Maybe, when DBB_restore is set, only attachments with isc_dpb_gbak_attach should be allowed.

Done. But it is a bit limited in CS\SC case - another process could bypass this check. Anyway, I think it is better than nothing. Later we can implement more robust protection, if required.
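
The admission rule agreed on here can be pictured with a small sketch. All names below are hypothetical: `restoring` stands for the DBB_restore state and `gbakAttach` for the presence of isc_dpb_gbak_attach in the DPB; this is not the code from the PR.

```cpp
// Hypothetical stand-in for the per-process database block.
struct Database
{
    bool restoring = false; // models DBB_restore being set
};

// Hypothetical stand-in for a parsed DPB.
struct AttachRequest
{
    bool gbakAttach = false; // models isc_dpb_gbak_attach being present
};

// While a restore is in progress, only gbak attachments are admitted.
inline bool mayAttach(const Database& dbb, const AttachRequest& req)
{
    return !dbb.restoring || req.gbakAttach;
}
```

The limitation mentioned for CS\SC follows from the same picture: if the state is kept per process, a check like this only sees restores started in its own process.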

Contributor

But it is a bit limited in CS\SC case - another process could bypass this check

BTW, does the parallel restore in CS still set the swept flag properly now that you have added and used m_restoring?

Member Author

Now it should work in all server modes.
BTW, I'm not sure we should prevent all kinds of user attachments during restore.
For example, it makes it impossible to correctly stop the restore process in some cases.
Without this restriction, one could stop a restore by a full shutdown of the database being restored, or by killing the gbak attachment using monitoring tables.

Contributor

BTW, I'm not sure we should prevent all kinds of user attachments during restore.
For example, it makes it impossible to correctly stop the restore process in some cases.

From this point of view, did you consider the solution I proposed in the first comment ? Also, I need to mention another difference between our implementations which can be important. In my variant extend_relation is responsible for setting the swept flag on DPs during restore instead of DPM_store/store_big_record. Due to this, even changing data in user attachments becomes safe, if I understand correctly. And when restore ends, changed DPs will just end up without the swept flag which is totally fine for this unusual case, I guess.

Member Author

BTW, I'm not sure we should prevent all kinds of user attachments during restore.
For example, it makes it impossible to correctly stop the restore process in some cases.

From this point of view, did you consider the solution I proposed in the first comment ?

Yes, I did. I prefer not to change gbak and not to introduce new DPB items just for this task. Note that with your solution, when an old gbak is used with a new server, the state of the restored database will depend on the command line (whether services were used or not). I don't think that is consistent, and I would rather not have to explain to users why the restored database was swept in one case and not swept in another.

Also, I need to mention another difference between our implementations which can be important. In my variant extend_relation is responsible for setting the swept flag on DPs during restore instead of DPM_store/store_big_record.

I suppose you set the swept flag on a newly allocated DP immediately after formatting and make DPM_store and store_big_record not clear it during restore, correct?

I see no fundamental difference here, though I agree your way needs slightly fewer code changes.

Due to this, even changing data in user attachments becomes safe, if I understand correctly.

Only if they don't treat the database as "restoring". And the same is true for my way too.

The really important difference is that I made "restoring" a global database state, while you made it a local per-connection attribute.

And when restore ends, changed DPs will just end up without the swept flag which is totally fine for this unusual case, I guess.

Yes. But in any case, I consider the ability of a user to change data during restore a problem that is better avoided.

Contributor

I suppose you set the swept flag on a newly allocated DP immediately after formatting and make DPM_store and store_big_record not clear it during restore, correct?

Correct. The difference is that when DPM_store is called several times for the same DP, it has no chance to set the swept flag back if it was recently cleared by a concurrent user attachment. But if we don't allow data to be changed, it doesn't matter, and both approaches are fine.
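
The difference between the two variants can be made concrete with a toy model. This is a simplified reading of the thread, not code from either patch: `DataPage` is a hypothetical struct, and the function names only echo where each variant sets the flag (on every restoring store versus once at page allocation).

```cpp
// Hypothetical stand-in for a data page header.
struct DataPage
{
    bool swept = false;
};

// Variant A (flag handled in the store path): a store from the restoring
// attachment marks the page swept, any other store clears the mark.
inline void storeVariantA(DataPage& dp, bool restoringAttachment)
{
    dp.swept = restoringAttachment;
}

// Variant B (flag handled at allocation): the page is marked once when it is
// allocated during restore.
inline DataPage allocateVariantB(bool restoringAttachment)
{
    DataPage dp;
    dp.swept = restoringAttachment;
    return dp;
}

// Variant B store: restoring stores leave the flag alone, other stores clear it,
// so a cleared flag is never set back.
inline void storeVariantB(DataPage& dp, bool restoringAttachment)
{
    if (!restoringAttachment)
        dp.swept = false;
}
```

Interleaving a restoring store, a user store and another restoring store on one page leaves the page swept in variant A and not swept in variant B, which only matters if user attachments are allowed to change data during restore.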

… the restoring database, extract commonly used function swept_at_restore() as Ilya Eremin (@ilya071294) suggested.
Successfully merging this pull request may close these issues.

Data pages of newly gbak restored databases should marked as "swept" [CORE6164]
6 participants