Encoding Error on Windows with WandB #143

Closed
@kpister


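When running openai wandb sync, I get a character encoding error on one of the run files. This happens specifically on Windows, where a file opened without an explicit encoding, as in open(filename), uses the system code page (cp1252 in the traceback below) rather than UTF-8, so any character outside that code page cannot be written.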

One fix is to call artifact.new_file(filename, "w", encoding="utf-8") on line 279 of wandb_logger.py, which solves the problem for me locally. Alternatively, making "utf-8" the default encoding inside the artifact.new_file function should also work, but it might have unintended side effects for other callers.
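
For reference, here is a minimal sketch that reproduces the failure without wandb. It assumes the built-in open() behaves like artifact.new_file with respect to encoding, and it passes "cp1252" explicitly to imitate the Windows default seen in the traceback, so it should fail the same way on any platform. The sample string, the try_write helper, and the file name are made up for illustration.

```python
import os
import tempfile

# Hypothetical sample content containing U+03BC (Greek small letter mu),
# the character named in the error message.
file_content = "latency: 5 \u03bcs\n"


def try_write(path, encoding):
    """Write file_content to path; return None on success, or the encoding error."""
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(file_content)
        return None
    except UnicodeEncodeError as exc:
        return exc


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train.jsonl")

    # Imitates the Windows default code page: U+03BC has no cp1252 mapping.
    cp1252_error = try_write(path, "cp1252")

    # The proposed fix: pass the encoding explicitly.
    utf8_error = try_write(path, "utf-8")

    # Read the file back with the same encoding to confirm the round trip.
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

print("cp1252:", cp1252_error)
print("utf-8:", utf8_error)
```

The cp1252 write should raise the same 'charmap' UnicodeEncodeError as in the output below, while the utf-8 write succeeds. Passing the encoding at the call site keeps the change local, which is why it seems safer than changing a library-wide default.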

Here is the output:

wandb: ERROR Failed to open the provided file (UnicodeEncodeError: 'charmap' codec can't encode character '\u03bc' in position 205764: character maps to <undefined>). Please provide the proper encoding.
Traceback (most recent call last):
  File "C:\Users\miniconda3\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\kaiser\miniconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\venv\Scripts\openai.exe\__main__.py", line 7, in <module>
    sys.exit(main())
  File "C:\venv\lib\site-packages\openai\_openai_scripts.py", line 63, in main
    args.func(args)
  File "C:\venv\lib\site-packages\openai\cli.py", line 586, in sync
    resp = openai.wandb_logger.WandbLogger.sync(
  File "C:\venv\lib\site-packages\openai\wandb_logger.py", line 74, in sync
    fine_tune_logged = [
  File "C:\venv\lib\site-packages\openai\wandb_logger.py", line 75, in <listcomp>
    cls._log_fine_tune(
  File "C:\venv\lib\site-packages\openai\wandb_logger.py", line 172, in _log_fine_tune
    cls._log_artifacts(fine_tune, project, entity)
  File "C:\venv\lib\site-packages\openai\wandb_logger.py", line 236, in _log_artifacts
    cls._log_artifact_inputs(file, prefix, artifact_type, project, entity)
  File "C:\venv\lib\site-packages\openai\wandb_logger.py", line 280, in _log_artifact_inputs
    f.write(file_content)
  File "C:\Users\miniconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u03bc' in position 205764: character maps to <undefined>
