docs: add documentation for workspaces + general cleanup #402

Closed
wants to merge 1 commit into from
11 changes: 11 additions & 0 deletions .github/workflows/doc-build.yaml
@@ -42,6 +42,17 @@ jobs:
         run: |
           cd docs
           make papermill
+      - name: Coverage
+        run: |
+          set -ex
+          cd docs
+          make coverage
+          if [ "$(cat build/*/coverage/python.txt | wc -l)" -ne 2 ]
+          then
+            cat build/*/coverage/python.txt
+            echo "missing documentation coverage!"
+            exit 1
+          fi

   docpush:
     runs-on: ubuntu-18.04
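The Coverage step boils down to a line-count check on the Sphinx coverage report. A minimal Python sketch of the same idea follows, assuming (as the threshold of 2 in the step implies) that a report with no undocumented objects consists of exactly two header lines. `report_is_clean` is a hypothetical helper written for illustration; it is not part of the workflow or of TorchX.

```python
from pathlib import Path


def report_is_clean(report: Path, expected_lines: int = 2) -> bool:
    """Return True when the coverage report holds only its header lines.

    Assumption: a report with nothing missing is exactly
    ``expected_lines`` lines long (a title plus its underline).
    """
    return len(report.read_text().splitlines()) == expected_lines
```

Under that assumption, any extra line in the report names an undocumented object, so the check fails as soon as one appears.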
2 changes: 1 addition & 1 deletion docs/source/app_best_practices.rst
@@ -111,7 +111,7 @@ model definition from a python file and then you'll load the weights and state
 dict from a ``.ckpt`` or ``.pt`` file.

 This is how Pytorch Lightning's
-`ModelCheckpoint <https://pytorch-lightning.readthedocs.io/en/latest/extensions/generated/pytorch_lightning.callbacks.ModelCheckpoint.html>`__ hook works.
+`ModelCheckpoint <https://pytorch-lightning.readthedocs.io/en/latest/api/pytorch_lightning.callbacks.ModelCheckpoint.html>`__ hook works.

 This is the most common but makes it harder to make a reusable app since your
 trainer app needs to include the model definition code.
11 changes: 6 additions & 5 deletions docs/source/basics.rst
@@ -10,11 +10,12 @@ The top level modules in TorchX are:

 1. :mod:`torchx.specs`: application spec (job definition) APIs
 2. :mod:`torchx.components`: predefined (builtin) app specs
-3. :mod:`torchx.runner`: given an app spec, submits the app as a job on a scheduler
-4. :mod:`torchx.schedulers`: backend job schedulers that the runner supports
-5. :mod:`torchx.pipelines`: adapters that convert the given app spec to a "stage" in an ML pipeline platform
-6. :mod:`torchx.runtime`: util and abstraction libraries you can use in authoring apps (not app spec)
-7. :mod:`torchx.cli`: CLI tool
+3. :mod:`torchx.workspace`: handles patching images for remote execution
+4. :mod:`torchx.cli`: CLI tool
+5. :mod:`torchx.runner`: given an app spec, submits the app as a job on a scheduler
+6. :mod:`torchx.schedulers`: backend job schedulers that the runner supports
+7. :mod:`torchx.pipelines`: adapters that convert the given app spec to a "stage" in an ML pipeline platform
+8. :mod:`torchx.runtime`: util and abstraction libraries you can use in authoring apps (not app spec)

 Below is a UML diagram

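As a quick orientation aid for the module list in `basics.rst`, the sketch below reports which of the listed top-level modules can be located in the current environment. The module names are taken from the list; `importable` is a hypothetical helper, and the result depends on which TorchX version (if any) is installed.

```python
from importlib.util import find_spec

# Module names as they appear in the updated list in basics.rst.
TOP_LEVEL_MODULES = [
    "torchx.specs",
    "torchx.components",
    "torchx.workspace",
    "torchx.cli",
    "torchx.runner",
    "torchx.schedulers",
    "torchx.pipelines",
    "torchx.runtime",
]


def importable(modules):
    """Map each module name to True if it can be located, else False."""
    found = {}
    for name in modules:
        try:
            found[name] = find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. ``torchx``) is missing.
            found[name] = False
    return found


if __name__ == "__main__":
    for name, ok in importable(TOP_LEVEL_MODULES).items():
        print(f"{name}: {'found' if ok else 'missing'}")
```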
13 changes: 7 additions & 6 deletions docs/source/components/utils.rst
@@ -4,9 +4,10 @@ Utils
 .. automodule:: torchx.components.utils
 .. currentmodule:: torchx.components.utils

-.. autofunction:: torchx.components.utils.echo
-.. autofunction:: torchx.components.utils.touch
-.. autofunction:: torchx.components.utils.sh
-.. autofunction:: torchx.components.utils.copy
-.. autofunction:: torchx.components.utils.python
-.. autofunction:: torchx.components.utils.booth
+.. autofunction:: echo
+.. autofunction:: touch
+.. autofunction:: sh
+.. autofunction:: copy
+.. autofunction:: python
+.. autofunction:: booth
+.. autofunction:: binary
6 changes: 6 additions & 0 deletions docs/source/conf.py
@@ -66,6 +66,12 @@
     "IPython.sphinxext.ipython_console_highlighting",
 ]

+# coverage options
+
+coverage_ignore_modules = [
+    "torchx.components.component_test_base",
+]
+
 # katex options
 #
 #
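The `coverage_ignore_modules` option added in `conf.py` excludes modules from the coverage report. The sketch below illustrates how such a list can be applied, assuming each entry is treated as a regular expression matched against the start of the module name, which is how `sphinx.ext.coverage` appears to handle it (check the Sphinx documentation for the version in use). `is_ignored` is a hypothetical helper, not a Sphinx API.

```python
import re

# Same value as the conf.py addition in this change.
coverage_ignore_modules = [
    "torchx.components.component_test_base",
]


def is_ignored(module_name, patterns=coverage_ignore_modules):
    """Return True if any pattern matches the start of ``module_name``.

    Assumption: entries are regular expressions applied with re.match.
    """
    return any(re.match(pattern, module_name) for pattern in patterns)
```

If that assumption holds, an unescaped `.` in an entry matches any character, so entries behave as loose prefixes rather than exact names.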
67 changes: 32 additions & 35 deletions docs/source/index.rst
@@ -16,7 +16,7 @@ most unique applications can be serviced without customizing the whole vertical


 **GETTING STARTED?** First learn the :ref:`basic concepts<basics:Basic Concepts>` and
-follow the :ref:`quickstart guide<quickstart:Quickstart>`.
+follow the :ref:`quickstart guide<quickstart:Quickstart - Custom Components>`.

 .. image:: torchx_index_diag.png

@@ -47,8 +47,37 @@ Documentation
    quickstart.md
    cli

+   runner.config
+
    advanced

+
+Works With
+---------------
+
+.. _Schedulers:
+.. toctree::
+   :maxdepth: 1
+   :caption: Schedulers
+
+   schedulers/local
+   schedulers/docker
+   schedulers/kubernetes
+   schedulers/slurm
+   schedulers/ray
+   schedulers/aws_batch
+
+.. _Pipelines:
+.. toctree::
+   :maxdepth: 1
+   :caption: Pipelines
+
+   pipelines/kfp
+
+
+Examples
+------------
+
 .. toctree::
    :maxdepth: 1
    :caption: Examples
@@ -58,6 +87,7 @@ Documentation
    examples_pipelines/index


+
 Components Library
 ---------------------
 .. _Components:
@@ -85,28 +115,6 @@ Runtime Library
    runtime/tracking


-Works With
----------------
-
-.. _Schedulers:
-.. toctree::
-   :maxdepth: 1
-   :caption: Schedulers
-
-   schedulers/local
-   schedulers/kubernetes
-   schedulers/slurm
-   schedulers/ray
-   schedulers/aws_batch
-
-.. _Pipelines:
-.. toctree::
-   :maxdepth: 1
-   :caption: Pipelines
-
-   pipelines/kfp
-
-
 Reference
 -----------

@@ -118,6 +126,7 @@ Reference
    specs
    runner
    schedulers
+   workspace
    pipelines

 .. toctree::
@@ -126,15 +135,3 @@ Reference

    app_best_practices
    component_best_practices
-
-
-Experimental
----------------
-.. toctree::
-   :maxdepth: 1
-   :caption: Experimental Features
-
-   experimental/runner.config
-
-
-
2 changes: 1 addition & 1 deletion docs/source/quickstart.md
@@ -12,7 +12,7 @@ jupyter:
     name: python3
 ---

-# Quickstart
+# Quickstart - Custom Components

 This is a self contained guide on how to build a simple app and component spec
 and launch it via two different schedulers.
@@ -1,4 +1,4 @@
-(beta) .torchxconfig file
+.torchxconfig
 -----------------------------

 .. automodule:: torchx.runner.config
@@ -10,3 +10,7 @@ Config API Functions
 .. autofunction:: apply
 .. autofunction:: load
 .. autofunction:: dump
+.. autofunction:: find_configs
+.. autofunction:: get_configs
+.. autofunction:: get_config
+.. autofunction:: load_sections
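For orientation, the sketch below parses a small INI-style file with Python's standard `configparser`. It assumes that `.torchxconfig` uses INI syntax with one section per scheduler; the section and option names in `SAMPLE` are illustrative only, and `parse_sections` is a hypothetical helper, not one of the `torchx.runner.config` functions documented in this file.

```python
import configparser

# Illustrative content only; real section and option names may differ.
SAMPLE = """\
[kubernetes]
queue = default
namespace = torchx-dev
"""


def parse_sections(text):
    """Parse INI text into a {section: {option: value}} dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


if __name__ == "__main__":
    print(parse_sections(SAMPLE))
```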
3 changes: 3 additions & 0 deletions docs/source/runtime/hpo.rst
@@ -14,4 +14,7 @@ Ax (Adaptive Experimentation)
 .. currentmodule:: torchx.runtime.hpo.ax

 .. autoclass:: TorchXRunner
    :members:
+
+.. autoclass:: AppMetric
+   :members:
10 changes: 10 additions & 0 deletions docs/source/schedulers/aws_batch.rst
@@ -2,7 +2,17 @@ AWS Batch
 =================

 .. automodule:: torchx.schedulers.aws_batch_scheduler
+
 .. currentmodule:: torchx.schedulers.aws_batch_scheduler

 .. autoclass:: AWSBatchScheduler
    :members:
+   :show-inheritance:
+
+.. autoclass:: BatchJob
+   :members:
+
+Reference
+~~~~~~~~~~~~
+
+.. autofunction:: create_scheduler
23 changes: 23 additions & 0 deletions docs/source/schedulers/docker.rst
@@ -0,0 +1,23 @@
+Docker
+=================
+
+.. automodule:: torchx.schedulers.docker_scheduler
+
+.. currentmodule:: torchx.schedulers.docker_scheduler
+
+.. autoclass:: DockerScheduler
+   :members:
+   :show-inheritance:
+
+.. autoclass:: DockerJob
+   :members:
+
+Reference
+~~~~~~~~~~~~
+
+.. autofunction:: create_scheduler
+
+.. autoclass:: DockerContainer
+   :members:
+
+.. autofunction:: has_docker
14 changes: 14 additions & 0 deletions docs/source/schedulers/kubernetes.rst
@@ -2,8 +2,22 @@ Kubernetes
=================

.. automodule:: torchx.schedulers.kubernetes_scheduler

.. currentmodule:: torchx.schedulers.kubernetes_scheduler

.. autoclass:: KubernetesScheduler
   :members:
   :show-inheritance:

.. autoclass:: KubernetesJob
   :members:

Reference
~~~~~~~~~~~~

.. autofunction:: create_scheduler
.. autofunction:: app_to_resource
.. autofunction:: cleanup_str
.. autofunction:: pod_labels
.. autofunction:: role_to_pod
.. autofunction:: sanitize_for_serialization
30 changes: 22 additions & 8 deletions docs/source/schedulers/local.rst
@@ -2,24 +2,38 @@ Local
=================

.. automodule:: torchx.schedulers.local_scheduler

.. currentmodule:: torchx.schedulers.local_scheduler

.. autoclass:: LocalScheduler
   :members:

.. automodule:: torchx.schedulers.docker_scheduler
.. currentmodule:: torchx.schedulers.docker_scheduler

.. autoclass:: DockerScheduler
   :members:
   :show-inheritance:

Image Providers
~~~~~~~~~~~~~~~~~

.. currentmodule:: torchx.schedulers.local_scheduler

.. autoclass:: ImageProvider
   :members:

.. autoclass:: CWDImageProvider
   :members:

.. autoclass:: LocalDirectoryImageProvider
   :members:

Reference
~~~~~~~~~~~~

.. autofunction:: create_cwd_scheduler

.. autoclass:: LogIterator
   :members:

.. autoclass:: PopenRequest
   :members:

.. autoclass:: ReplicaParam
   :members:

.. autoclass:: SignalException
   :members:
9 changes: 9 additions & 0 deletions docs/source/schedulers/ray.rst
@@ -2,7 +2,16 @@ Ray
 =================

 .. automodule:: torchx.schedulers.ray_scheduler
+
 .. currentmodule:: torchx.schedulers.ray_scheduler

 .. autoclass:: RayScheduler
    :members:
+   :show-inheritance:
+
+.. autofunction:: create_scheduler
+.. autofunction:: has_ray
+.. autofunction:: serialize
+
+.. autoclass:: RayJob
+   :members:
10 changes: 10 additions & 0 deletions docs/source/schedulers/slurm.rst
@@ -2,7 +2,17 @@ Slurm
 =================

 .. automodule:: torchx.schedulers.slurm_scheduler
+
 .. currentmodule:: torchx.schedulers.slurm_scheduler

 .. autoclass:: SlurmScheduler
    :members:
+   :show-inheritance:
+
+.. autofunction:: create_scheduler
+
+.. autoclass:: SlurmBatchRequest
+   :members:
+
+.. autoclass:: SlurmReplicaRequest
+   :members: