Tutorial

Introduction

The Conformal Predictions add-on extends the Orange library with implementations of algorithms from the theoretical framework of conformal prediction (CP), which provides calibrated error rates in both classification and regression settings.

In contrast with standard supervised machine learning, which for a given new data instance typically produces a point prediction \(\hat{y}\), here we are interested in making a region prediction. For example, with conformal prediction we could produce a 95% prediction region, that is, a set \(\Gamma^{0.05}\) that contains the true label \(y\) with probability at least 95%. In general, for a chosen significance level \(\epsilon\), a valid conformal predictor outputs a region \(\Gamma^{\epsilon}\) such that \(P(y \notin \Gamma^{\epsilon}) \le \epsilon\). In the case of regression, where \(y\) is a number, \(\Gamma^{0.05}\) is typically an interval around \(\hat{y}\). In the case of classification, where \(y\) takes one of a limited number of possible values, \(\Gamma^{0.05}\) may consist of a few of these values or, in the ideal case, just one. For a more detailed explanation of conformal prediction theory, refer to the paper [Vovk08] or the book [Shafer05].

In this library, the final conformal prediction method is obtained by combining pre-prepared components. Starting with the learning method (either classification or regression) used to fit predictive models, we link it with a suitable nonconformity measure and use them together in a selected conformal prediction procedure: transductive, inductive or cross. These CP procedures differ in how the data is split and used for training the predictive model and for calibration, which computes the distribution of nonconformity scores used to evaluate possible new predictions. Inductive CP requires two disjoint data sets: one for training, the other for calibration. Cross CP uses a single training data set and automatically prepares k different splits into training and calibration sets, in the same manner as k-fold cross-validation. Transductive CP, on the other hand, needs no separate calibration set at all; instead, for each possible label of a new test instance, it retrains the model with that instance included and compares its nonconformity to those of the labelled instances. This allows it to use the complete training set, but makes it computationally more expensive.
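To make this concrete, the sketch below builds all three kinds of conformal classifiers from the same learner and nonconformity measure. It is a minimal illustration only: the InductiveClassifier and TransductiveClassifier names and their constructor signatures are assumptions to be checked against the Library reference (CrossClassifier is demonstrated in the Classification section below).

>>> import Orange
>>> import orangecontrib.conformal as cp
>>> iris = Orange.data.Table('iris')
>>> lr = Orange.classification.LogisticRegressionLearner()
>>> ip = cp.nonconformity.InverseProbability(lr)
>>> # Inductive CP: two disjoint sets, one for training and one for calibration.
>>> icp = cp.classification.InductiveClassifier(ip, iris[:100], iris[100:])
>>> # Cross CP: a single training set, split internally as in 5-fold cross-validation.
>>> ccp = cp.classification.CrossClassifier(ip, 5, iris)
>>> # Transductive CP: uses the full training set, retraining for each test instance.
>>> tcp = cp.classification.TransductiveClassifier(ip, iris)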

The sections below explain how to use the implemented methods through practical examples and use cases. For detailed documentation of the implemented methods and classes, along with their parameters, consult the Library reference. For more code examples, take a look at the tests module.

References

[Vovk08] Glenn Shafer, Vladimir Vovk. A Tutorial on Conformal Prediction. Journal of Machine Learning Research 9 (2008) 371–421.
[Shafer05] Vladimir Vovk, Alex Gammerman, and Glenn Shafer. Algorithmic Learning in a Random World. Springer, New York, 2005.

Classification

All three types of conformal prediction (transductive, inductive and cross) are implemented for classification, with several nonconformity measures to choose from.

We will show how to train and use a conformal prediction model in the following simple but fully functional example.

Let’s load the iris data set and try to make a prediction for the last instance using the rest for learning.

>>> import Orange
>>> import orangecontrib.conformal as cp
>>> iris = Orange.data.Table('iris')
>>> train = iris[:-1]
>>> test_instance = iris[-1]

We will use a LogisticRegressionLearner from Orange and the inverse probability nonconformity score in a 5-fold cross conformal prediction classifier.

>>> lr = Orange.classification.LogisticRegressionLearner()
>>> ip = cp.nonconformity.InverseProbability(lr)
>>> ccp = cp.classification.CrossClassifier(ip, 5, train)

Computing the 90% (significance level 0.1) and 99% (significance level 0.01) prediction regions gives the following results.

>>> print('Actual class:', test_instance.get_class())
Actual class: Iris-virginica
>>> print(ccp(test_instance, 0.1))
['Iris-virginica']
>>> print(ccp(test_instance, 0.01))
['Iris-versicolor', 'Iris-virginica']

We can see that in the first case only the correct class of ‘Iris-virginica’ was predicted. In the second case, with a much lower tolerance for errors, the model claims only that the instance belongs to one of two possible classes ‘Iris-versicolor’ or ‘Iris-virginica’, but not the third ‘Iris-setosa’.
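The significance level is simply an argument of the call, so we can sweep over several levels to see how the prediction region grows as the tolerance for errors decreases. Continuing the session above (expected output omitted, since the internal cross folds may vary between runs):

>>> for eps in (0.2, 0.1, 0.05, 0.01):
...     print(eps, ccp(test_instance, eps))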

Regression

For regression, the inductive and cross conformal prediction procedures are implemented, along with several nonconformity measures.

Similarly to the classification example, let’s combine some standard components to show how to train and use a conformal prediction model for regression.

Let’s load the housing data set and try to make a prediction for the last instance using the rest for learning.

>>> import Orange
>>> import orangecontrib.conformal as cp
>>> housing = Orange.data.Table('housing')
>>> train = housing[:-1]
>>> test_instance = housing[-1]

We will use a LinearRegressionLearner from Orange and the absolute error nonconformity score in a 5-fold cross conformal regressor.

>>> lr = Orange.regression.LinearRegressionLearner()
>>> abs_err = cp.nonconformity.AbsError(lr)
>>> ccr = cp.regression.CrossRegressor(abs_err, 5, train)

Computing the 90% (significance level 0.1) and 99% (significance level 0.01) prediction regions gives the following results.

>>> print('Actual target value:', test_instance.get_class())
Actual target value: 11.900
>>> print(ccr(test_instance, 0.1))
(13.708550425853684, 31.417230194137165)
>>> print(ccr(test_instance, 0.01))
(-0.98542733224618217, 46.111207952237031)

We can see that in the first case the predicted interval was smaller, but did not contain the correct value (this should not happen more than 10% of the time). In the second case, with a much lower tolerance for errors, the model predicted a larger interval, which did contain the correct value.
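The inductive procedure mentioned at the start of this section works analogously, except that an explicit calibration set is provided instead of a fold count. A minimal sketch, continuing the session above and assuming the InductiveRegressor name with a (nonconformity, train, calibrate) constructor as in the Library reference (output omitted, as it depends on the chosen split):

>>> train_part, calibrate = housing[:300], housing[300:-1]
>>> # Inductive CP: the model is fit on train_part and calibrated on calibrate.
>>> icr = cp.regression.InductiveRegressor(abs_err, train_part, calibrate)
>>> print(icr(test_instance, 0.1))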

Evaluation

The evaluation module provides many useful classes and functions for evaluating the performance and validity of conformal predictors. The two main classes, representing the results of a conformal classifier and a conformal regressor, are conformal.evaluation.ResultsClass and conformal.evaluation.ResultsRegr.

For ease of use, the evaluation results can be obtained using utility functions that evaluate the selected conformal predictor on data defined by the provided sampler (conformal.evaluation.run()) or explicitly provided by the user (conformal.evaluation.run_train_test()).
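For instance, a sampler-driven evaluation with run() might look like the sketch below; the RandomSampler name and its (data, a, b) split-ratio arguments are assumptions to verify against the Library reference. The next example then shows the variant with an explicitly provided train/test split.

>>> import Orange
>>> import orangecontrib.conformal as cp
>>> iris = Orange.data.Table('iris')
>>> lr = Orange.classification.LogisticRegressionLearner()
>>> ip = cp.nonconformity.InverseProbability(lr)
>>> ccp = cp.classification.CrossClassifier(ip, 5)
>>> # Evaluate at significance level 0.1 on a random 2:1 train:test split.
>>> res = cp.evaluation.run(ccp, 0.1, cp.evaluation.RandomSampler(iris, 2, 1))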

As an example, let’s take a look at how to quickly evaluate a conformal classifier on a test data set and compute some of the performance metrics:

>>> import Orange
>>> import orangecontrib.conformal as cp
>>> iris = Orange.data.Table('iris')
>>> train, test = iris[::2], iris[1::2]
>>> lr = Orange.classification.LogisticRegressionLearner()
>>> ip = cp.nonconformity.InverseProbability(lr)
>>> ccp = cp.classification.CrossClassifier(ip, 5)
>>> res = cp.evaluation.run_train_test(ccp, 0.1, train, test)

The results are an instance of conformal.evaluation.ResultsClass mentioned above, and can be used to compute the accuracy of predictions (the fraction of prediction regions that include the actual class). For a valid predictor, the error rate (1 - accuracy) must not exceed the specified significance level. In addition to validity, we are often interested in the efficiency of a predictor. For classification, this is commonly measured as the fraction of cases with a single predicted class (conformal.evaluation.ResultsClass.singleton_criterion()). For regression, one might measure the widths of the predicted intervals and, e.g., report their average (conformal.evaluation.ResultsRegr.mean_range()).

>>> print('Accuracy:', res.accuracy())
Accuracy: 0.946666666667
>>> print('Singletons:', res.singleton_criterion())
Singletons: 0.96
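
Continuing the session, validity can be confirmed directly: at significance level 0.1, the observed error rate should not exceed 10%.

>>> err = 1 - res.accuracy()
>>> err <= 0.1  # validity check at the 0.1 significance level
True

A conformal regressor can be evaluated in the same way. By analogy with the classification example above (output omitted, since it depends on the random folds), the average interval width could be obtained as:

>>> housing = Orange.data.Table('housing')
>>> abs_err = cp.nonconformity.AbsError(Orange.regression.LinearRegressionLearner())
>>> ccr = cp.regression.CrossRegressor(abs_err, 5)
>>> res_r = cp.evaluation.run_train_test(ccr, 0.1, housing[::2], housing[1::2])
>>> print('Mean width:', res_r.mean_range())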