Commit a1030d3

Pushing the docs to _pst_preview/ for branch: new_web_theme, commit c1adbac2b1cb4ef76c2d4937dc6c5588b67b0e27
1 parent b5e6e46 commit a1030d3

1,569 files changed, with 15,712 additions and 17,125 deletions.


_pst_preview/.buildinfo

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 4aa0097b2b97714bf707e4da06a20a2d
+config: 5f8529899ed1da684e9b1bb6c892cca4
 tags: 645f666f9bcd5a90fca523b33c5a78b7
Binary file not shown.

_pst_preview/_downloads/1b8827af01c9a70017a4739bcf2e21a8/plot_gpr_co2.py

Lines changed: 5 additions & 6 deletions
@@ -4,20 +4,19 @@
 ====================================================================================
 
 This example is based on Section 5.4.3 of "Gaussian Processes for Machine
-Learning" [RW2006]_. It illustrates an example of complex kernel engineering
+Learning" [1]_. It illustrates an example of complex kernel engineering
 and hyperparameter optimization using gradient ascent on the
 log-marginal-likelihood. The data consists of the monthly average atmospheric
 CO2 concentrations (in parts per million by volume (ppm)) collected at the
 Mauna Loa Observatory in Hawaii, between 1958 and 2001. The objective is to
 model the CO2 concentration as a function of the time :math:`t` and extrapolate
 for years after 2001.
 
-.. topic: References
+.. rubric:: References
 
-.. [RW2006] `Rasmussen, Carl Edward.
-    "Gaussian processes in machine learning."
-    Summer school on machine learning. Springer, Berlin, Heidelberg, 2003
-    <http://www.gaussianprocess.org/gpml/chapters/RW.pdf>`_.
+.. [1] `Rasmussen, Carl Edward. "Gaussian processes in machine learning."
+   Summer school on machine learning. Springer, Berlin, Heidelberg, 2003
+   <http://www.gaussianprocess.org/gpml/chapters/RW.pdf>`_.
 """
 
 print(__doc__)
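
For orientation, the composite kernel this example engineers can be sketched with the standard scikit-learn kernel API. A minimal sketch, assuming the structure the docstring describes (long-term trend, yearly seasonality, medium-term irregularities, noise); the hyperparameter values here are illustrative, not the tuned ones:

# Hedged sketch of a CO2-style composite kernel; values are illustrative.
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (
    RBF,
    ExpSineSquared,
    RationalQuadratic,
    WhiteKernel,
)

kernel = (
    50.0**2 * RBF(length_scale=50.0)  # long-term rising trend
    + 2.0**2 * RBF(length_scale=100.0)
    * ExpSineSquared(length_scale=1.0, periodicity=1.0)  # yearly seasonality
    + 0.5**2 * RationalQuadratic(alpha=1.0, length_scale=1.0)  # irregularities
    + WhiteKernel(noise_level=0.1)  # observation noise
)

# fit() maximizes the log-marginal-likelihood over the kernel
# hyperparameters by gradient ascent (L-BFGS-B by default).
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
# gpr.fit(X_train, y_train)
# y_mean, y_std = gpr.predict(X_future, return_std=True)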

_pst_preview/_downloads/23614d75e8327ef369659da7d2ed62db/plot_nested_cross_validation_iris.py

Lines changed: 6 additions & 6 deletions
@@ -30,17 +30,17 @@
 performance of non-nested and nested CV strategies by taking the difference
 between their scores.
 
-.. topic:: See Also:
+.. seealso::
 
    - :ref:`cross_validation`
    - :ref:`grid_search`
 
-.. topic:: References:
+.. rubric:: References
 
-    .. [1] `Cawley, G.C.; Talbot, N.L.C. On over-fitting in model selection and
-       subsequent selection bias in performance evaluation.
-       J. Mach. Learn. Res 2010,11, 2079-2107.
-       <http://jmlr.csail.mit.edu/papers/volume11/cawley10a/cawley10a.pdf>`_
+.. [1] `Cawley, G.C.; Talbot, N.L.C. On over-fitting in model selection and
+   subsequent selection bias in performance evaluation.
+   J. Mach. Learn. Res 2010,11, 2079-2107.
+   <http://jmlr.csail.mit.edu/papers/volume11/cawley10a/cawley10a.pdf>`_
 
 """
 
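
In code, the inner/outer loop this docstring describes reduces to composing the two utilities it names. A minimal sketch, assuming the iris data and an illustrative SVC parameter grid:

# Minimal nested-CV sketch: GridSearchCV is the inner loop,
# cross_val_score the outer loop. Grid values are illustrative.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}

inner_cv = KFold(n_splits=4, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=0)

# Inner loop: (hyper)parameter search on each outer training split.
clf = GridSearchCV(SVC(kernel="rbf"), param_grid=param_grid, cv=inner_cv)

# Outer loop: generalization error of the whole search procedure.
nested_scores = cross_val_score(clf, X=X, y=y, cv=outer_cv)
print(nested_scores.mean(), nested_scores.std())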

_pst_preview/_downloads/2402de18d671ce5087e3760b2540184f/plot_grid_search_stats.ipynb

Lines changed: 1 addition & 1 deletion
@@ -330,7 +330,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- ".. topic:: References\n\n    .. [1] Dietterich, T. G. (1998). [Approximate statistical tests for\n   comparing supervised classification learning algorithms](http://web.cs.iastate.edu/~jtian/cs573/Papers/Dietterich-98.pdf).\n   Neural computation, 10(7).\n .. [2] Nadeau, C., & Bengio, Y. (2000). [Inference for the generalization\n   error](https://papers.nips.cc/paper/1661-inference-for-the-generalization-error.pdf).\n   In Advances in neural information processing systems.\n .. [3] Bouckaert, R. R., & Frank, E. (2004). [Evaluating the replicability\n   of significance tests for comparing learning algorithms](https://www.cms.waikato.ac.nz/~ml/publications/2004/bouckaert-frank.pdf).\n   In Pacific-Asia Conference on Knowledge Discovery and Data Mining.\n .. [4] Benavoli, A., Corani, G., Dem\u0161ar, J., & Zaffalon, M. (2017). [Time\n   for a change: a tutorial for comparing multiple classifiers through\n   Bayesian analysis](http://www.jmlr.org/papers/volume18/16-305/16-305.pdf).\n   The Journal of Machine Learning Research, 18(1). See the Python\n   library that accompanies this paper [here](https://github.com/janezd/baycomp).\n .. [5] Diebold, F.X. & Mariano R.S. (1995). [Comparing predictive accuracy](http://www.est.uc3m.es/esp/nueva_docencia/comp_col_get/lade/tecnicas_prediccion/Practicas0708/Comparing%20Predictive%20Accuracy%20(Dielbold).pdf)\n   Journal of Business & economic statistics, 20(1), 134-144.\n\n"
+ ".. rubric:: References\n\n.. [1] Dietterich, T. G. (1998). [Approximate statistical tests for\n   comparing supervised classification learning algorithms](http://web.cs.iastate.edu/~jtian/cs573/Papers/Dietterich-98.pdf).\n   Neural computation, 10(7).\n.. [2] Nadeau, C., & Bengio, Y. (2000). [Inference for the generalization\n   error](https://papers.nips.cc/paper/1661-inference-for-the-generalization-error.pdf).\n   In Advances in neural information processing systems.\n.. [3] Bouckaert, R. R., & Frank, E. (2004). [Evaluating the replicability\n   of significance tests for comparing learning algorithms](https://www.cms.waikato.ac.nz/~ml/publications/2004/bouckaert-frank.pdf).\n   In Pacific-Asia Conference on Knowledge Discovery and Data Mining.\n.. [4] Benavoli, A., Corani, G., Dem\u0161ar, J., & Zaffalon, M. (2017). [Time\n   for a change: a tutorial for comparing multiple classifiers through\n   Bayesian analysis](http://www.jmlr.org/papers/volume18/16-305/16-305.pdf).\n   The Journal of Machine Learning Research, 18(1). See the Python\n   library that accompanies this paper [here](https://github.com/janezd/baycomp).\n.. [5] Diebold, F.X. & Mariano R.S. (1995). [Comparing predictive accuracy](http://www.est.uc3m.es/esp/nueva_docencia/comp_col_get/lade/tecnicas_prediccion/Practicas0708/Comparing%20Predictive%20Accuracy%20(Dielbold).pdf)\n   Journal of Business & economic statistics, 20(1), 134-144.\n\n"
  ]
 }
 ],
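
The notebook these references support compares cross-validated models with the corrected resampled t-test of Nadeau & Bengio ([2] above). A hedged sketch of that correction; the function name and inputs are assumptions for illustration:

# Sketch of the Nadeau & Bengio (2000) corrected paired t-test over
# per-split score differences; names and inputs are illustrative.
import numpy as np
from scipy import stats

def corrected_paired_ttest(score_diffs, n_train, n_test):
    """Two-sided corrected test; score_diffs has one entry per CV split."""
    k = len(score_diffs)
    mean_diff = np.mean(score_diffs)
    # Variance inflated to account for overlapping training sets across splits.
    corrected_var = (1.0 / k + n_test / n_train) * np.var(score_diffs, ddof=1)
    t_stat = mean_diff / np.sqrt(corrected_var)
    p_value = 2.0 * stats.t.sf(np.abs(t_stat), df=k - 1)
    return t_stat, p_value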

_pst_preview/_downloads/32173eb704d697c23dffbbf3fd74942a/plot_digits_denoising.py

Lines changed: 5 additions & 5 deletions
@@ -12,12 +12,12 @@
 
 We will use USPS digits dataset to reproduce presented in Sect. 4 of [1]_.
 
-.. topic:: References
+.. rubric:: References
 
-    .. [1] `Bakır, Gökhan H., Jason Weston, and Bernhard Schölkopf.
-       "Learning to find pre-images."
-       Advances in neural information processing systems 16 (2004): 449-456.
-       <https://papers.nips.cc/paper/2003/file/ac1ad983e08ad3304a97e147f522747e-Paper.pdf>`_
+.. [1] `Bakır, Gökhan H., Jason Weston, and Bernhard Schölkopf.
+   "Learning to find pre-images."
+   Advances in neural information processing systems 16 (2004): 449-456.
+   <https://papers.nips.cc/paper/2003/file/ac1ad983e08ad3304a97e147f522747e-Paper.pdf>`_
 
 """
 
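
The pre-image learning of [1] is what this example exercises through kernel PCA. A minimal sketch, where the RBF parameters and the noisy digit arrays are illustrative assumptions:

# Sketch of kernel-PCA denoising via learned pre-images; parameters
# and the X_*_noisy arrays (flattened digit images) are assumptions.
from sklearn.decomposition import KernelPCA

kpca = KernelPCA(
    n_components=32,
    kernel="rbf",
    gamma=1e-3,
    fit_inverse_transform=True,  # learn the pre-image (inverse) map
    alpha=5e-3,                  # ridge penalty of that inverse map
)
kpca.fit(X_train_noisy)

# Project onto the leading components, then map back to pixel space:
X_denoised = kpca.inverse_transform(kpca.transform(X_test_noisy))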

_pst_preview/_downloads/3c3c738275484acc54821615bf72894a/plot_permutation_importance.py

Lines changed: 3 additions & 3 deletions
@@ -18,10 +18,10 @@
 This example shows how to use Permutation Importances as an alternative that
 can mitigate those limitations.
 
-.. topic:: References:
+.. rubric:: References
 
-    * :doi:`L. Breiman, "Random Forests", Machine Learning, 45(1), 5-32,
-      2001. <10.1023/A:1010933404324>`
+* :doi:`L. Breiman, "Random Forests", Machine Learning, 45(1), 5-32,
+  2001. <10.1023/A:1010933404324>`
 
 """
 
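
The alternative this docstring points to ships as sklearn.inspection.permutation_importance. A minimal sketch, assuming a fitted model and a held-out validation split:

# Sketch of permutation importances on held-out data; `model`,
# X_val and y_val are assumed to come from an earlier fit/split.
from sklearn.inspection import permutation_importance

result = permutation_importance(
    model, X_val, y_val, n_repeats=10, random_state=42
)

# Mean score drop (and spread) when each feature is shuffled:
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f}"
          f" +/- {result.importances_std[i]:.3f}")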

_pst_preview/_downloads/45916745bb89ca49be3a50aa80e65e3f/plot_nested_cross_validation_iris.ipynb

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "\n# Nested versus non-nested cross-validation\n\nThis example compares non-nested and nested cross-validation strategies on a\nclassifier of the iris data set. Nested cross-validation (CV) is often used to\ntrain a model in which hyperparameters also need to be optimized. Nested CV\nestimates the generalization error of the underlying model and its\n(hyper)parameter search. Choosing the parameters that maximize non-nested CV\nbiases the model to the dataset, yielding an overly-optimistic score.\n\nModel selection without nested CV uses the same data to tune model parameters\nand evaluate model performance. Information may thus \"leak\" into the model\nand overfit the data. The magnitude of this effect is primarily dependent on\nthe size of the dataset and the stability of the model. See Cawley and Talbot\n[1]_ for an analysis of these issues.\n\nTo avoid this problem, nested CV effectively uses a series of\ntrain/validation/test set splits. In the inner loop (here executed by\n:class:`GridSearchCV <sklearn.model_selection.GridSearchCV>`), the score is\napproximately maximized by fitting a model to each training set, and then\ndirectly maximized in selecting (hyper)parameters over the validation set. In\nthe outer loop (here in :func:`cross_val_score\n<sklearn.model_selection.cross_val_score>`), generalization error is estimated\nby averaging test set scores over several dataset splits.\n\nThe example below uses a support vector classifier with a non-linear kernel to\nbuild a model with optimized hyperparameters by grid search. We compare the\nperformance of non-nested and nested CV strategies by taking the difference\nbetween their scores.\n\n.. topic:: See Also:\n\n - `cross_validation`\n - `grid_search`\n\n.. topic:: References:\n\n .. [1] [Cawley, G.C.; Talbot, N.L.C. On over-fitting in model selection and\n subsequent selection bias in performance evaluation.\n J. Mach. Learn. Res 2010,11, 2079-2107.](http://jmlr.csail.mit.edu/papers/volume11/cawley10a/cawley10a.pdf)\n"
+ "\n# Nested versus non-nested cross-validation\n\nThis example compares non-nested and nested cross-validation strategies on a\nclassifier of the iris data set. Nested cross-validation (CV) is often used to\ntrain a model in which hyperparameters also need to be optimized. Nested CV\nestimates the generalization error of the underlying model and its\n(hyper)parameter search. Choosing the parameters that maximize non-nested CV\nbiases the model to the dataset, yielding an overly-optimistic score.\n\nModel selection without nested CV uses the same data to tune model parameters\nand evaluate model performance. Information may thus \"leak\" into the model\nand overfit the data. The magnitude of this effect is primarily dependent on\nthe size of the dataset and the stability of the model. See Cawley and Talbot\n[1]_ for an analysis of these issues.\n\nTo avoid this problem, nested CV effectively uses a series of\ntrain/validation/test set splits. In the inner loop (here executed by\n:class:`GridSearchCV <sklearn.model_selection.GridSearchCV>`), the score is\napproximately maximized by fitting a model to each training set, and then\ndirectly maximized in selecting (hyper)parameters over the validation set. In\nthe outer loop (here in :func:`cross_val_score\n<sklearn.model_selection.cross_val_score>`), generalization error is estimated\nby averaging test set scores over several dataset splits.\n\nThe example below uses a support vector classifier with a non-linear kernel to\nbuild a model with optimized hyperparameters by grid search. We compare the\nperformance of non-nested and nested CV strategies by taking the difference\nbetween their scores.\n\n.. seealso::\n\n - `cross_validation`\n - `grid_search`\n\n.. rubric:: References\n\n.. [1] [Cawley, G.C.; Talbot, N.L.C. On over-fitting in model selection and\n subsequent selection bias in performance evaluation.\n J. Mach. Learn. Res 2010,11, 2079-2107.](http://jmlr.csail.mit.edu/papers/volume11/cawley10a/cawley10a.pdf)\n"
  ]
 },
 {

_pst_preview/_downloads/4e46f015ab8300f262e6e8775bcdcf8a/plot_adaboost_multiclass.py

Lines changed: 11 additions & 11 deletions
@@ -17,11 +17,11 @@
 be selected. This ensures that subsequent iterations of the algorithm focus on
 the difficult-to-classify samples.
 
-.. topic:: References:
+.. rubric:: References
 
-    .. [1] :doi:`J. Zhu, H. Zou, S. Rosset, T. Hastie, "Multi-class adaboost."
-       Statistics and its Interface 2.3 (2009): 349-360.
-       <10.4310/SII.2009.v2.n3.a8>`
+.. [1] :doi:`J. Zhu, H. Zou, S. Rosset, T. Hastie, "Multi-class adaboost."
+   Statistics and its Interface 2.3 (2009): 349-360.
+   <10.4310/SII.2009.v2.n3.a8>`
 
 """
 

@@ -231,16 +231,16 @@ def misclassification_error(y_true, y_pred):
 # decision. Indeed, this exactly is the formulation of updating the base
 # estimators' weights after each iteration in AdaBoost.
 #
-# |details-start| Mathematical details |details-split|
+# .. dropdown:: Mathematical details
 #
-# The weight associated with a weak learner trained at the stage :math:`m` is
-# inversely associated with its misclassification error such that:
+#   The weight associated with a weak learner trained at the stage :math:`m` is
+#   inversely associated with its misclassification error such that:
 #
-# .. math:: \alpha^{(m)} = \log \frac{1 - err^{(m)}}{err^{(m)}} + \log (K - 1),
+#   .. math:: \alpha^{(m)} = \log \frac{1 - err^{(m)}}{err^{(m)}} + \log (K - 1),
 #
-# where :math:`\alpha^{(m)}` and :math:`err^{(m)}` are the weight and the error
-# of the :math:`m` th weak learner, respectively, and :math:`K` is the number of
-# classes in our classification problem. |details-end|
+#   where :math:`\alpha^{(m)}` and :math:`err^{(m)}` are the weight and the error
+#   of the :math:`m` th weak learner, respectively, and :math:`K` is the number of
+#   classes in our classification problem.
 #
 # Another interesting observation boils down to the fact that the first weak
 # learners of the model make fewer errors than later weak learners of the
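
The dropdown's weight formula transcribes directly to code. A minimal sketch; the function name is illustrative, not part of the example:

# alpha^(m) = log((1 - err^(m)) / err^(m)) + log(K - 1), as in the
# dropdown above; the function name is illustrative.
import numpy as np

def samme_learner_weight(err_m, n_classes):
    return np.log((1.0 - err_m) / err_m) + np.log(n_classes - 1)

# A weak learner with 40% error on a 3-class problem:
# samme_learner_weight(0.4, 3) ~= 0.405 + 0.693 = 1.099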

_pst_preview/_downloads/51833337bfc73d152b44902e5baa50ff/plot_lasso_lars_ic.ipynb

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "\n# Lasso model selection via information criteria\n\nThis example reproduces the example of Fig. 2 of [ZHT2007]_. A\n:class:`~sklearn.linear_model.LassoLarsIC` estimator is fit on a\ndiabetes dataset and the AIC and the BIC criteria are used to select\nthe best model.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>It is important to note that the optimization to find `alpha` with\n :class:`~sklearn.linear_model.LassoLarsIC` relies on the AIC or BIC\n criteria that are computed in-sample, thus on the training set directly.\n This approach differs from the cross-validation procedure. For a comparison\n of the two approaches, you can refer to the following example:\n `sphx_glr_auto_examples_linear_model_plot_lasso_model_selection.py`.</p></div>\n\n.. topic:: References\n\n .. [ZHT2007] :arxiv:`Zou, Hui, Trevor Hastie, and Robert Tibshirani.\n \"On the degrees of freedom of the lasso.\"\n The Annals of Statistics 35.5 (2007): 2173-2192.\n <0712.0881>`\n"
+ "\n# Lasso model selection via information criteria\n\nThis example reproduces the example of Fig. 2 of [ZHT2007]_. A\n:class:`~sklearn.linear_model.LassoLarsIC` estimator is fit on a\ndiabetes dataset and the AIC and the BIC criteria are used to select\nthe best model.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>It is important to note that the optimization to find `alpha` with\n :class:`~sklearn.linear_model.LassoLarsIC` relies on the AIC or BIC\n criteria that are computed in-sample, thus on the training set directly.\n This approach differs from the cross-validation procedure. For a comparison\n of the two approaches, you can refer to the following example:\n `sphx_glr_auto_examples_linear_model_plot_lasso_model_selection.py`.</p></div>\n\n.. rubric:: References\n\n.. [ZHT2007] :arxiv:`Zou, Hui, Trevor Hastie, and Robert Tibshirani.\n \"On the degrees of freedom of the lasso.\"\n The Annals of Statistics 35.5 (2007): 2173-2192.\n <0712.0881>`\n"
  ]
 },
 {
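
The in-sample criterion selection this notebook describes comes down to a short fit with LassoLarsIC. A minimal sketch on the same diabetes data; any preprocessing the full example applies is omitted here:

# Sketch of alpha selection by BIC, computed in-sample on the
# training data (no cross-validation), as the note above stresses.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoLarsIC

X, y = load_diabetes(return_X_y=True)
lasso_bic = LassoLarsIC(criterion="bic").fit(X, y)
print(lasso_bic.alpha_)  # regularization strength minimizing BIC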
