Description
Is your feature request related to a problem? Please describe.
Calls to load_textual_inversion() take about 0.4 seconds each on my test machine, and I load many embeddings from a directory. This would be more acceptable if it only happened at app startup, but as described in #3147, I frequently need to recreate the pipeline based on application state (e.g. enabling/disabling ControlNet).
The majority of the time is spent in resize_token_embeddings(), which could be done once for a batched load. I did a quick hack test of this that still loaded and processed each embedding twice, and even so it got a 6x speedup.
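For reference, a rough sketch of the batched approach I hack-tested. It assumes the simple diffusers learned_embeds.bin layout of `{token: tensor}` (no multi-vector / A1111 format handling), and the function name is just for illustration:

```python
import torch

def load_textual_inversions_batched(pipe, paths):
    # Collect every token and embedding first so the encoder is resized only once.
    tokens, embeddings = [], []
    for path in paths:
        state_dict = torch.load(path, map_location="cpu")
        # Assumes the simple learned_embeds.bin layout: {token: tensor}.
        for token, embedding in state_dict.items():
            tokens.append(token)
            embeddings.append(embedding)

    added = pipe.tokenizer.add_tokens(tokens)
    if added != len(tokens):
        raise ValueError("Some tokens already exist in the tokenizer.")

    # Single resize instead of one per embedding -- this is where the time goes.
    pipe.text_encoder.resize_token_embeddings(len(pipe.tokenizer))

    embedding_matrix = pipe.text_encoder.get_input_embeddings().weight.data
    for token, embedding in zip(tokens, embeddings):
        token_id = pipe.tokenizer.convert_tokens_to_ids(token)
        embedding_matrix[token_id] = embedding
```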
Describe the solution you'd like
Add a load_textual_inversions() API that processes a list of embeddings and resizes the text encoder's token embeddings only once.
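From the caller's side this could look something like the following (load_textual_inversions is the proposed name, not an existing method; the model id and directory layout are just examples):

```python
from pathlib import Path

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# One call and one resize_token_embeddings(), instead of N calls and N resizes.
pipe.load_textual_inversions(list(Path("embeddings").glob("*.bin")))
```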
Describe alternatives you've considered
I could load embeddings on demand by parsing the prompt for their tokens.
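A minimal sketch of that alternative, assuming each embedding lives in a file named after its trigger token and that prompts use the `<token>` placeholder convention:

```python
import re
from pathlib import Path

EMBEDDING_DIR = Path("embeddings")  # hypothetical: one <token>.bin file per embedding
loaded_tokens = set()

def load_embeddings_for_prompt(pipe, prompt):
    # Only load the embeddings whose <token> placeholders actually appear in the prompt.
    for token in re.findall(r"<[^>]+>", prompt):
        path = EMBEDDING_DIR / f"{token.strip('<>')}.bin"
        if token not in loaded_tokens and path.exists():
            pipe.load_textual_inversion(str(path))
            loaded_tokens.add(token)
```

This avoids the batched resize problem entirely, but it pushes prompt parsing into application code and still pays the per-call cost for each embedding the first time it is referenced.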
Additional context
None.