[Research Project] Add AnyText: Multilingual Visual Text Generation And Editing #8998
Merged
Changes from 124 commits (of 130 total)
Commits
6e8088f Add initial template (tolgacangoz)
98c2d6e Second template (tolgacangoz)
867bbbf feat: Add TextEmbeddingModule to AnyTextPipeline (tolgacangoz)
8818372 feat: Add AuxiliaryLatentModule template to AnyTextPipeline (tolgacangoz)
37c46d8 Merge branch 'main' into Add-AnyText (tolgacangoz)
64c63eb Add bert tokenizer from the anytext repo for now (tolgacangoz)
92f8b79 feat: Update AnyTextPipeline's modify_prompt method (tolgacangoz)
e9c688c Fill in the `forward` pass of `AuxiliaryLatentModule` (tolgacangoz)
42a41d0 `make style && make quality` (tolgacangoz)
9d50f80 `chore: Update bert_tokenizer.py with a TODO comment suggesting the u… (tolgacangoz)
22aa69a Merge branch 'main' into Add-AnyText (tolgacangoz)
5e1e515 Update error handling to raise and logging (tolgacangoz)
2d10f0c Add `create_glyph_lines` function into `TextEmbeddingModule` (tolgacangoz)
bc197a9 make style (tolgacangoz)
e52d8cc Up (tolgacangoz)
8c69d83 Up (tolgacangoz)
4a413aa Up (tolgacangoz)
571608b Up (tolgacangoz)
2b1b50d Merge branch 'main' into Add-AnyText (tolgacangoz)
a7d025f Remove several comments (tolgacangoz)
d2c5a65 refactor: Remove ControlNetConditioningEmbedding and update code acco… (tolgacangoz)
c1f538c Merge branch 'main' into Add-AnyText (tolgacangoz)
2607b6b Up (tolgacangoz)
a9fe4a0 Up (tolgacangoz)
567f553 up (tolgacangoz)
a9991d0 refactor: Update AnyTextPipeline to include new optional parameters (tolgacangoz)
e69c51e Merge branch 'main' into Add-AnyText (tolgacangoz)
91252e0 up (tolgacangoz)
e54f876 Merge branch 'main' into Add-AnyText (tolgacangoz)
b9164e3 feat: Add OCR model and its components (tolgacangoz)
cd4c9c2 chore: Update `TextEmbeddingModule` to include OCR model components a… (tolgacangoz)
0918cbd chore: Update `AuxiliaryLatentModule` to include VAE model and its de… (tolgacangoz)
37ae99f `make style` (tolgacangoz)
15fd4df Merge branch 'main' into Add-AnyText (tolgacangoz)
2e40224 Merge branch 'main' into Add-AnyText (tolgacangoz)
b475a3b refactor: Update `AnyTextPipeline`'s docstring (tolgacangoz)
ea957f0 Update `AuxiliaryLatentModule` to include info dictionary so that tex… (tolgacangoz)
cc0c6e5 simplify (tolgacangoz)
52fb0b4 `make style` (tolgacangoz)
187473d Merge branch 'main' into Add-AnyText (tolgacangoz)
9dd4ee9 Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function (tolgacangoz)
7dbd4bc Simplify for now (tolgacangoz)
f422423 `make style` (tolgacangoz)
62bb2a0 Merge branch 'main' into Add-AnyText (tolgacangoz)
8466009 Up (tolgacangoz)
2b4be7a feat: Add scripts to convert AnyText controlnet to diffusers (tolgacangoz)
d4718fd Merge branch 'main' into Add-AnyText (tolgacangoz)
1cdbb55 `make style` (tolgacangoz)
da67ff7 Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLat… (tolgacangoz)
af30f0f make style (tolgacangoz)
a8dbbe2 Up (tolgacangoz)
12fca1c Merge branch 'main' of github.com:huggingface/diffusers into Add-AnyText (tolgacangoz)
bbfe8f2 Merge branch 'main' into Add-AnyText (tolgacangoz)
936c2ff Simplify (tolgacangoz)
73d8144 Merge branch 'main' into Add-AnyText (tolgacangoz)
cffa036 Up (tolgacangoz)
8b43bc3 feat: Add safetensors module for loading model file (tolgacangoz)
f60a72b Fix device issues (tolgacangoz)
be4a319 Up (tolgacangoz)
18d3f60 Merge branch 'main' into Add-AnyText (tolgacangoz)
f713171 Up (tolgacangoz)
da9adbb merge (tolgacangoz)
fdf0275 Merge branch 'main' into Add-AnyText (tolgacangoz)
f347ff2 refactor: Simplify (tolgacangoz)
d52e973 refactor: Simplify code for loading models and handling data types (tolgacangoz)
a3b493f `make style` (tolgacangoz)
020074a Merge branch 'main' into Add-AnyText (tolgacangoz)
4267c84 refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddin… (tolgacangoz)
ab51226 refactor: Update dtype in embedding_manager.py to match proj.weight (tolgacangoz)
c961a96 Merge branch 'main' of github.com:huggingface/diffusers (tolgacangoz)
3873d02 Merge branch 'main' into Add-AnyText (tolgacangoz)
5041d40 Merge branch 'main' of github.com:huggingface/diffusers (tolgacangoz)
1521e8f Up (tolgacangoz)
1aa17bb Merge branch 'main' into Add-AnyText (tolgacangoz)
c13c61d Merge branch 'main' into Add-AnyText (tolgacangoz)
1d18f1d Merge branch 'main' into Add-AnyText (tolgacangoz)
44a3a70 Merge branch 'main' into Add-AnyText (tolgacangoz)
56992d1 Add attribution and adaptation information to pipeline_anytext.py (tolgacangoz)
7ad6865 Update usage example (tolgacangoz)
a5edca5 Will refactor `controlnet_cond_embedding` initialization (tolgacangoz)
48e88eb Merge branch 'main' into Add-AnyText (tolgacangoz)
2f42e40 Add `AnyTextControlNetConditioningEmbedding` template (tolgacangoz)
670fef5 Refactor organization (tolgacangoz)
930c37a style (tolgacangoz)
923da7b Merge branch 'main' into Add-AnyText (tolgacangoz)
21c0c35 style (tolgacangoz)
c4db96a Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNet… (tolgacangoz)
e2e7160 Merge branch 'main' into Add-AnyText (tolgacangoz)
4335ebd Merge branch 'main' into Add-AnyText (tolgacangoz)
6bd0b4c Follow one-file policy (tolgacangoz)
b3f98a7 style (tolgacangoz)
cccf0f4 Merge branch 'main' into Add-AnyText (tolgacangoz)
b5856a6 Merge branch 'main' into Add-AnyText (tolgacangoz)
b04d015 Merge branch 'main' into Add-AnyText (tolgacangoz)
67f8839 Merge branch 'main' into Add-AnyText (tolgacangoz)
d75508e Merge branch 'main' into Add-AnyText (tolgacangoz)
75a0f1f [Docs] Update README and pipeline_anytext.py to use AnyTextControlNet… (tolgacangoz)
d3dcf57 [Docs] Update import statement for AnyTextControlNetModel in pipeline… (tolgacangoz)
963fac0 [Fix] Update import path for ControlNetModel, ControlNetOutput in any… (tolgacangoz)
0c94143 Merge branch 'main' of github.com:huggingface/diffusers into Add-AnyText (tolgacangoz)
2b6f08b Refactor AnyTextControlNet to use configurable conditioning embedding… (tolgacangoz)
971d6ad Merge branch 'main' into Add-AnyText (tolgacangoz)
9c43a65 Complete control net conditioning embedding in AnyTextControlNetModel (tolgacangoz)
d46ac3e Merge branch 'main' into Add-AnyText (tolgacangoz)
2ffb80b Merge branch 'main' into Add-AnyText (tolgacangoz)
b8ca0d6 up (tolgacangoz)
9657980 [FIX] Ensure embeddings use correct device in AnyTextControlNetModel (tolgacangoz)
25ea8be up (tolgacangoz)
2be7bca up (tolgacangoz)
0fc4aab style (tolgacangoz)
5345702 [UPDATE] Revise README and example code for AnyTextPipeline integrati… (tolgacangoz)
5b73a1d [UPDATE] Update example code in anytext.py to use correct font file a… (tolgacangoz)
7f87755 down (tolgacangoz)
61693a5 [UPDATE] Refactor BasicTokenizer usage to a new Checker class for tex… (tolgacangoz)
3b2435f update pillow (tolgacangoz)
3ea49c1 [UPDATE] Remove commented-out code and unnecessary docstring in anyte… (tolgacangoz)
299a646 [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py… (tolgacangoz)
0d44b5b [UPDATE] Replace edict with dict for configuration in anytext.py and … (tolgacangoz)
13b7ecf 🆙 (tolgacangoz)
d5a6e5f style (tolgacangoz)
09fdd22 [UPDATE] Revise README.md for clarity, remove unused imports in anyte… (tolgacangoz)
f8f5edd Merge branch 'main' into Add-AnyText (tolgacangoz)
9495ddb style (tolgacangoz)
13ab248 Merge branch 'main' into Add-AnyText (tolgacangoz)
f4abaf2 Update examples/research_projects/anytext/README.md (tolgacangoz)
8d313bc Remove commented-out image preparation code in AnyTextPipeline (tolgacangoz)
d02615f Remove unnecessary blank line in README.md (tolgacangoz)
52aa924 Merge branch 'main' into Add-AnyText (tolgacangoz)
78447b2 Merge branch 'main' into Add-AnyText (tolgacangoz)
3e3f972 Merge branch 'main' into Add-AnyText (tolgacangoz)
@@ -0,0 +1,33 @@
# AnyTextPipeline Pipeline

Project page: https://aigcdesigngroup.github.io/homepage_anytext

"AnyText comprises a diffusion pipeline with two primary elements: an auxiliary latent module and a text embedding module. The former uses inputs like text glyph, position, and masked image to generate latent features for text generation or editing. The latter employs an OCR model for encoding stroke data as embeddings, which blend with image caption embeddings from the tokenizer to generate texts that seamlessly integrate with the background. We employed text-control diffusion loss and text perceptual loss for training to further enhance writing accuracy."

Each text line that needs to be generated should be enclosed in double quotes. For any usage questions, please refer to the [paper](https://arxiv.org/abs/2311.03054).

```py
import torch
from diffusers import DiffusionPipeline
from anytext_controlnet import AnyTextControlNetModel
from diffusers.utils import load_image

# I chose a font file shared by an HF staff member.
# Download it first (run in a notebook cell, or drop the `!` and run in a shell):
# !wget https://huggingface.co/spaces/ysharma/TranslateQuotesInImageForwards/resolve/main/arial-unicode-ms.ttf

anytext_controlnet = AnyTextControlNetModel.from_pretrained(
    "tolgacangoz/anytext-controlnet", torch_dtype=torch.float16, variant="fp16"
)
pipe = DiffusionPipeline.from_pretrained(
    "tolgacangoz/anytext",
    font_path="arial-unicode-ms.ttf",
    controlnet=anytext_controlnet,
    torch_dtype=torch.float16,
    trust_remote_code=False,  # One needs to give permission (set this to True) to run this pipeline's code
).to("cuda")

# generate image
prompt = 'photo of caramel macchiato coffee on the table, top-down perspective, with "Any" "Text" written on it using cream'
draw_pos = load_image("https://raw.githubusercontent.com/tyxsspa/AnyText/refs/heads/main/example_images/gen9.png")
image = pipe(prompt, num_inference_steps=20, mode="generate", draw_pos=draw_pos).images[0]
image  # displays the result when run in a notebook
```
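The README asks that each text line to be generated be enclosed in double quotes. As a rough sanity check before calling the pipeline, the quoted lines can be listed with a small helper. This is only a sketch: `quoted_text_lines` is a hypothetical name introduced here for illustration, and the regular expression is an assumption about how to read the prompt, not the parsing that `AnyTextPipeline` itself performs.

```python
import re
from typing import List


def quoted_text_lines(prompt: str) -> List[str]:
    """Return the substrings of `prompt` enclosed in straight double quotes, in order.

    Hypothetical helper for checking a prompt by eye; the pipeline's own
    prompt handling may differ (for example, in which quote characters it accepts).
    """
    return re.findall(r'"([^"]*)"', prompt)


prompt = (
    "photo of caramel macchiato coffee on the table, top-down perspective, "
    'with "Any" "Text" written on it using cream'
)
lines = quoted_text_lines(prompt)
print(lines)  # ['Any', 'Text']
```

If the printed list does not match the text lines you intended to render, the prompt is probably missing a quote or uses a different quote character.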