Model evaluation
cbar.cross_validation.cv(dataset, codebook_size, multi_word_queries=False, threshold=1, n_folds=3, method='loreta', **kwargs)

Perform cross-validation.

This function performs cross-validation of the retrieval methods on different datasets.
Parameters: - dataset (str, 'cal500', 'cal10k', or 'freesound') – The dataset on which the retrieval method should be evaluated.
- codebook_size (int) – The codebook size the dataset should be encoded with. The data loading utility cbar.datasets.fetch_cal500() contains more information about the codebook representation of sounds.
- multi_word_queries (bool, default: False) – Whether the retrieval method should be evaluated with multi-word queries. Only relevant when dataset == 'freesound'.
- threshold (int, default: 1) – Only queries with at least threshold relevant examples in X_train and X_test are evaluated.
- n_folds (int, default: 3) – The number of folds used. Only applies to the CAL500 and Freesound datasets. The CAL10k dataset has 5 pre-defined folds.
- method (str, 'loreta', 'pamir', or 'random-forest', default: 'loreta') – The retrieval method to be evaluated.
- kwargs (key-value pairs) – Additional keyword arguments are passed to the retrieval method.
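A minimal usage sketch, assuming the package is installed; the keyword argument max_iter is hypothetical and only illustrates an argument forwarded to the retrieval method via **kwargs:

    from cbar.cross_validation import cv

    # Evaluate LORETA on CAL500 encoded with a 512-word codebook, using 3 folds.
    # max_iter is a hypothetical retrieval-method kwarg, not a documented option.
    cv('cal500', 512, threshold=1, n_folds=3, method='loreta', max_iter=100)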
cbar.cross_validation.dataset_for_train_test_split(X_train, X_test, Y_train, Y_test, threshold=1, multi_word_queries=False, scaler='standard')

Make a dataset from a train-test split.

This function scales the input data and generates queries and query weights from the training set vocabulary.
Parameters: - X_train (array-like, shape = [n_train_samples, n_features]) – Training set data.
- X_test (array-like, shape = [n_test_samples, n_features]) – Test set data.
- Y_train (array-like, shape = [n_train_samples, n_classes]) – Training set labels.
- Y_test (array-like, shape = [n_test_samples, n_classes]) – Test set labels.
- threshold (int, default: 1) – Only queries with at least threshold relevant examples in X_train and X_test are evaluated.
- multi_word_queries (bool, default: False) – Generate multi-word queries from real-world user queries for the Freesound dataset if set to True. Ultimately calls cbar.preprocess.get_relevant_queries().
- scaler (str, 'standard' or 'robust', or None) – Use either sklearn.preprocessing.StandardScaler() or sklearn.preprocessing.RobustScaler() to scale the input data; if None, no scaling is applied.
Returns: - X_train (array-like, shape = [n_train_samples, n_features]) – The scaled training data.
- X_test (array-like, shape = [n_test_samples, n_features]) – The scaled test data.
- Y_train_bin (array-like, shape = [n_train_samples, n_classes]) – The training labels in binary indicator format.
- Y_test_bin (array-like, shape = [n_test_samples, n_classes]) – The test labels in binary indicator format.
- Q_vec (array-like, shape = [n_queries, n_classes]) – The query vectors to evaluate.
- weights (array-like, shape = [n_queries]) – The weights used to weight the queries during evaluation. For one-word queries the weight for each query is the same. For multi-word queries the counts from the aggregated query log of user queries are used to weight the queries accordingly.
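A hedged sketch of calling this function, with random arrays standing in for codebook-encoded sounds and tags; the unpacking assumes the return order follows the Returns list above:

    import numpy as np
    from cbar.cross_validation import dataset_for_train_test_split

    rng = np.random.RandomState(0)
    # Toy stand-ins: 80 training and 20 test sounds, 512-word codebook, 10 tags.
    X_train, X_test = rng.rand(80, 512), rng.rand(20, 512)
    Y_train = rng.randint(2, size=(80, 10))
    Y_test = rng.randint(2, size=(20, 10))

    (X_train, X_test, Y_train_bin, Y_test_bin,
     Q_vec, weights) = dataset_for_train_test_split(
        X_train, X_test, Y_train, Y_test, threshold=1, scaler='standard')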
cbar.cross_validation.validate_fold(X_train, X_test, Y_train, Y_test, Q_vec, weights, evaluator, retrieval_method, **kwargs)

Perform validation on one fold of the data.

This function evaluates a retrieval method on one split of a dataset.
Parameters: - X_train (pd.DataFrame, shape = [n_train_samples, codebook_size]) – Training data.
- X_test (pd.DataFrame, shape = [n_test_samples, codebook_size]) – Test data.
- Y_train (pd.DataFrame, shape = [n_train_samples, n_classes]) – Training tags.
- Y_test (pd.DataFrame, shape = [n_test_samples, n_classes]) – Test tags.
- Q_vec (array-like, shape = [n_queries, n_classes]) – The queries to evaluate.
- weights (array-like, shape = [n_queries]) – Query weights. Multi-word queries can be weighted to reflect importance to users.
- evaluator (object) – An instance of cbar.evaluation.Evaluator.
- retrieval_method (str, 'loreta', 'pamir', or 'random-forest') – The retrieval method to be evaluated.
- kwargs (key-value pairs) – Additional keyword arguments are passed to the retrieval method.
Returns: params – The retrieval_method's parameters used for the evaluation.
Return type: dict
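Continuing the sketch above, one fold could be validated like this (reusing the arrays returned by dataset_for_train_test_split):

    from cbar.cross_validation import validate_fold
    from cbar.evaluation import Evaluator

    evaluator = Evaluator()
    # Evaluate LORETA on this split; the returned params record the
    # retrieval method's configuration for this run.
    params = validate_fold(X_train, X_test, Y_train_bin, Y_test_bin,
                           Q_vec, weights, evaluator, 'loreta')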
class cbar.evaluation.Evaluator

The Evaluator evaluates a retrieval method, collects the performance measures, and keeps the values of multiple runs (for example in k-fold cross-validation).

eval(queries, weights, Y_score, Y_test, n_relevant)

Parameters: - queries (array-like, shape = [n_queries, n_classes]) – The queries to evaluate.
- weights (int, default: 1) – Query weights. Multi-word queries can be weighted to reflect importance to users.
- Y_score (array-like, shape = [n_queries, n_classes]) – Scores of queries and sounds.
- Y_test (array-like, shape = [n_samples, n_classes]) – Test set tags associated with each test set song in binary indicator format.
- n_relevant (array-like, shape = [n_queries]) – The number of relevant sounds in X_train for each query.
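A toy sketch of calling eval directly; all arrays are random stand-ins shaped per the parameter list above, and n_relevant would normally be derived from the training tags rather than drawn at random:

    import numpy as np
    from cbar.evaluation import Evaluator

    rng = np.random.RandomState(0)
    n_queries, n_classes, n_samples = 4, 10, 20

    queries = np.eye(n_classes)[:n_queries]         # one-word queries, one class each
    Y_score = rng.rand(n_queries, n_classes)        # toy query-sound scores
    Y_test = rng.randint(2, size=(n_samples, n_classes))
    n_relevant = rng.randint(1, 5, size=n_queries)  # toy relevant-sound counts

    # weights=1 applies the documented uniform weight for one-word queries.
    Evaluator().eval(queries, 1, Y_score, Y_test, n_relevant)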
to_json(dataset, method, codebook_size, params)

Write the retrieval performance results to a file.

Parameters: - dataset (str) – The dataset the method was evaluated on.
- method (str) – The evaluated retrieval method.
- codebook_size (int) – The codebook size the dataset was encoded with.
- params (dict) – The retrieval method's parameters used for the evaluation.
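Following the fold example above, the collected measures could be persisted with a call like this (the argument values mirror the earlier cv sketch and are illustrative only):

    # Write the measures gathered by the evaluator to a results file.
    evaluator.to_json('cal500', 'loreta', 512, params)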
cbar.evaluation.ranking_precision_score(y_true, y_score, k=10)

Precision at rank k.
Parameters: - y_true (array-like, shape = [n_samples]) – Ground truth (true relevance labels).
- y_score (array-like, shape = [n_samples]) – Predicted scores.
- k (int) – Rank.
Returns: precision@k – Precision at rank k.
Return type: float
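To make the metric concrete, here is a minimal sketch of precision at rank k (an illustrative re-implementation, not the library's exact code):

    import numpy as np

    def precision_at_k(y_true, y_score, k=10):
        # Rank items by descending predicted score, then measure the
        # fraction of the top k that are truly relevant.
        order = np.argsort(y_score)[::-1]
        top_k = np.asarray(y_true)[order[:k]]
        return np.mean(top_k > 0)

    # 3 of the 5 highest-scored items are relevant -> precision@5 = 0.6
    y_true = [1, 0, 1, 1, 0, 0, 1]
    y_score = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
    print(precision_at_k(y_true, y_score, k=5))  # 0.6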