Quickstart guide
This example demonstrates how to build a simple content-based audio retrieval model and evaluate its retrieval accuracy on a small song dataset, CAL500. The dataset consists of 502 Western pop songs, performed by 499 unique artists. Each song is tagged by at least three people using a standard survey and a fixed tag vocabulary of 174 musical concepts.
This package includes a loading utility that downloads and processes the dataset for you.
In [1]:
from cbar.datasets import fetch_cal500
X, Y = fetch_cal500()
Calling fetch_cal500() initially downloads the CAL500 dataset to a subfolder of your home directory. You can specify a different location using the data_home parameter (fetch_cal500(data_home='path')). Subsequent calls simply load the cached dataset.
The raw dataset consists of about 10,000 39-dimensional feature vectors per minute of audio content, which were created by
- Sliding a half-overlapping short-time window of 12 milliseconds over each song’s waveform data.
- Extracting the 13 mel-frequency cepstral coefficients.
- Appending the instantaneous first-order and second-order derivatives.
Each song is then represented as a bag-of-frames of exactly 10,000 randomly subsampled, real-valued feature vectors. These bag-of-frames features are further compressed into a single k-dimensional feature vector by encoding the frame vectors with a codebook and pooling the codes into one compact vector.
Specifically, k-means clusters all frame vectors into k clusters, and the resulting cluster centers serve as the codewords of the codebook. Each frame vector is assigned to its closest cluster center, and each song is represented by the counts of frames assigned to each of the k cluster centers.
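The codebook encoding step described above can be sketched as follows with scikit-learn's KMeans on synthetic frame data. The array shapes, the tiny codebook size, and the helper function are illustrative only, not the package's actual implementation:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)

# Synthetic bag-of-frames: 3 songs, each with 100 frames of 39 dimensions
songs = [rng.randn(100, 39) for _ in range(3)]

# Learn a codebook by clustering all frames from all songs
k = 8  # codebook size (fetch_cal500 uses 512 by default)
all_frames = np.vstack(songs)
codebook = KMeans(n_clusters=k, n_init=10, random_state=0).fit(all_frames)

def encode(frames, codebook, k):
    """Count how many frames fall into each of the k codewords."""
    assignments = codebook.predict(frames)
    return np.bincount(assignments, minlength=k)

# One k-dimensional count vector per song
X = np.array([encode(s, codebook, k) for s in songs])
print(X.shape)         # (3, 8)
print(X.sum(axis=1))   # each row sums to the song's frame count, 100
```

Each row of X is the compact song representation used for retrieval; rows sum to the number of frames per song.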
By default, fetch_cal500() uses a codebook size of 512, but this is easily changed with the codebook_size parameter (fetch_cal500(codebook_size=1024)).
In [2]:
X.shape, Y.shape
Out[2]:
((502, 512), (502, 174))
Let’s split the data into training and test sets, fit the model on the training data, and evaluate it on the test data. First, import and instantiate the model.
In [3]:
from cbar.loreta import LoretaWARP
model = LoretaWARP(n0=0.1, valid_interval=1000)
Then split the data and fit the model using the training data.
In [4]:
from cbar.cross_validation import train_test_split_plus
(X_train, X_test,
Y_train, Y_test,
Q_vec, weights) = train_test_split_plus(X, Y)
%time model.fit(X_train, Y_train, Q_vec, X_test, Y_test)
iter: 0, stepsize: 0.100, P10: 0.168, AP: 0.517, loss: 0.000
iter: 1000, stepsize: 0.100, P10: 0.212, AP: 0.176, loss: 0.999
iter: 2000, stepsize: 0.100, P10: 0.197, AP: 0.177, loss: 0.993
iter: 3000, stepsize: 0.100, P10: 0.205, AP: 0.179, loss: 0.975
iter: 4000, stepsize: 0.100, P10: 0.215, AP: 0.181, loss: 0.966
iter: 5000, stepsize: 0.100, P10: 0.191, AP: 0.178, loss: 0.962
iter: 6000, stepsize: 0.100, P10: 0.189, AP: 0.176, loss: 0.941
iter: 7000, stepsize: 0.100, P10: 0.185, AP: 0.175, loss: 0.937
iter: 8000, stepsize: 0.100, P10: 0.186, AP: 0.175, loss: 0.919
iter: 9000, stepsize: 0.100, P10: 0.191, AP: 0.175, loss: 0.919
iter: 10000, stepsize: 0.100, P10: 0.184, AP: 0.173, loss: 0.904
iter: 11000, stepsize: 0.100, P10: 0.188, AP: 0.171, loss: 0.895
iter: 12000, stepsize: 0.100, P10: 0.180, AP: 0.171, loss: 0.883
iter: 13000, stepsize: 0.100, P10: 0.198, AP: 0.172, loss: 0.881
2016-12-29 17:36:30,046 [MainThread ] [WARNI] max_dips reached, stopped at 14000 iterations.
CPU times: user 11.5 s, sys: 375 ms, total: 11.9 s
Wall time: 12 s
Out[4]:
cbar.loreta.LoretaWARP(max_iter=100000, k=30, n0=0.1, n1=0.0, rank_thresh=0.1, lambda_=0.1, loss='warp', max_dips=10, valid_interval=1000)
Now predict a score for every query–song pair. Ordering the songs from highest to lowest score yields the ranking for each query.
In [5]:
Y_score = model.predict(Q_vec, X_test)
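Turning a row of scores into a ranking is just a descending argsort. A minimal sketch with made-up scores (not the package's API):

```python
import numpy as np

# Hypothetical scores for one query over five songs
y_score = np.array([0.2, 0.9, 0.1, 0.7, 0.4])

# Song indices ordered from highest to lowest score
ranking = np.argsort(y_score)[::-1]
print(ranking)  # [1 3 4 0 2]
```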
Evaluate the predictions.
In [6]:
from cbar.evaluation import Evaluator
from cbar.utils import make_relevance_matrix
n_relevant = make_relevance_matrix(Q_vec, Y_train).sum(axis=1)
evaluator = Evaluator()
evaluator.eval(Q_vec, weights, Y_score, Y_test, n_relevant)
evaluator.prec_at
Out[6]:
defaultdict(list,
{1: [0.17682926829268295],
2: [0.14634146341463417],
3: [0.15040650406504064],
4: [0.1443089430894309],
5: [0.14339430894308947],
6: [0.14705284552845529],
7: [0.15779616724738676],
8: [0.17079703832752613],
9: [0.17630178087495163],
10: [0.17976432442895857],
11: [0.18487488121634463],
12: [0.19447699996480486],
13: [0.20046400076887883],
14: [0.20617928819148332],
15: [0.21475927527756794],
16: [0.22658865524719185],
17: [0.23965035741398727],
18: [0.24535472896305033],
19: [0.25569593647729272],
20: [0.25993391964659807]})
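The values above are precision-at-k for k from 1 to 20: the fraction of the top k ranked songs that are relevant to the query, averaged over queries. A minimal sketch of the metric for a single query, independent of the package's Evaluator (the ranking and relevance list are hypothetical):

```python
import numpy as np

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return np.isin(ranking[:k], relevant).mean()

# Hypothetical ranking of 6 songs and the songs relevant to the query
ranking = np.array([4, 1, 5, 0, 3, 2])
relevant = [1, 3, 5]

print(precision_at_k(ranking, relevant, 3))  # 2 of the top 3 (songs 1, 5) are relevant
```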
Cross-validation
The cv function in the cross_validation module offers an easy way to evaluate a retrieval method on multiple splits of the data. Let’s run the same experiment on three folds.
In [7]:
from cbar.cross_validation import cv
In [8]:
cv('cal500', 512, n_folds=3, method='loreta', n0=0.1, valid_interval=1000)
2016-12-29 17:36:30,273 [MainThread ] [INFO ] Running CV with 3 folds ...
2016-12-29 17:36:58,463 [MainThread ] [INFO ] Validating fold 0 ...
iter: 0, stepsize: 0.100, P10: 0.160, AP: 0.515, loss: 0.000
iter: 1000, stepsize: 0.100, P10: 0.175, AP: 0.167, loss: 0.999
iter: 2000, stepsize: 0.100, P10: 0.193, AP: 0.171, loss: 0.989
iter: 3000, stepsize: 0.100, P10: 0.185, AP: 0.168, loss: 0.980
iter: 4000, stepsize: 0.100, P10: 0.182, AP: 0.167, loss: 0.957
iter: 5000, stepsize: 0.100, P10: 0.183, AP: 0.169, loss: 0.953
iter: 6000, stepsize: 0.100, P10: 0.191, AP: 0.168, loss: 0.923
iter: 7000, stepsize: 0.100, P10: 0.177, AP: 0.168, loss: 0.916
iter: 8000, stepsize: 0.100, P10: 0.164, AP: 0.166, loss: 0.902
iter: 9000, stepsize: 0.100, P10: 0.166, AP: 0.163, loss: 0.892
iter: 10000, stepsize: 0.100, P10: 0.165, AP: 0.164, loss: 0.886
iter: 11000, stepsize: 0.100, P10: 0.162, AP: 0.163, loss: 0.864
2016-12-29 17:37:09,004 [MainThread ] [WARNI] max_dips reached, stopped at 12000 iterations.
2016-12-29 17:37:09,165 [MainThread ] [INFO ] Validating fold 1 ...
iter: 0, stepsize: 0.100, P10: 0.182, AP: 0.512, loss: 0.000
iter: 1000, stepsize: 0.100, P10: 0.177, AP: 0.161, loss: 1.000
iter: 2000, stepsize: 0.100, P10: 0.174, AP: 0.165, loss: 0.994
iter: 3000, stepsize: 0.100, P10: 0.165, AP: 0.165, loss: 0.975
iter: 4000, stepsize: 0.100, P10: 0.169, AP: 0.162, loss: 0.954
iter: 5000, stepsize: 0.100, P10: 0.172, AP: 0.164, loss: 0.931
iter: 6000, stepsize: 0.100, P10: 0.167, AP: 0.160, loss: 0.922
iter: 7000, stepsize: 0.100, P10: 0.170, AP: 0.164, loss: 0.903
iter: 8000, stepsize: 0.100, P10: 0.179, AP: 0.163, loss: 0.897
iter: 9000, stepsize: 0.100, P10: 0.167, AP: 0.162, loss: 0.888
iter: 10000, stepsize: 0.100, P10: 0.164, AP: 0.161, loss: 0.869
iter: 11000, stepsize: 0.100, P10: 0.162, AP: 0.162, loss: 0.868
2016-12-29 17:37:19,624 [MainThread ] [WARNI] max_dips reached, stopped at 12000 iterations.
2016-12-29 17:37:19,781 [MainThread ] [INFO ] Validating fold 2 ...
iter: 0, stepsize: 0.100, P10: 0.152, AP: 0.517, loss: 0.000
iter: 1000, stepsize: 0.100, P10: 0.175, AP: 0.166, loss: 1.000
iter: 2000, stepsize: 0.100, P10: 0.178, AP: 0.168, loss: 0.990
iter: 3000, stepsize: 0.100, P10: 0.178, AP: 0.169, loss: 0.970
iter: 4000, stepsize: 0.100, P10: 0.165, AP: 0.169, loss: 0.961
iter: 5000, stepsize: 0.100, P10: 0.161, AP: 0.162, loss: 0.953
iter: 6000, stepsize: 0.100, P10: 0.153, AP: 0.160, loss: 0.913
iter: 7000, stepsize: 0.100, P10: 0.158, AP: 0.160, loss: 0.909
iter: 8000, stepsize: 0.100, P10: 0.159, AP: 0.160, loss: 0.901
iter: 9000, stepsize: 0.100, P10: 0.156, AP: 0.162, loss: 0.886
iter: 10000, stepsize: 0.100, P10: 0.162, AP: 0.160, loss: 0.885
iter: 11000, stepsize: 0.100, P10: 0.157, AP: 0.158, loss: 0.863
iter: 12000, stepsize: 0.100, P10: 0.159, AP: 0.161, loss: 0.864
iter: 13000, stepsize: 0.100, P10: 0.168, AP: 0.160, loss: 0.852
2016-12-29 17:37:32,020 [MainThread ] [WARNI] max_dips reached, stopped at 14000 iterations.
The cross-validation results, including the retrieval method's parameters, are written to a JSON file. For each dataset, three separate result files are written to disk: mean average precision (MAP), precision-at-k, and precision-at-10 as a function of the number of relevant training examples. Here are the average precision values of the last cross-validation run.
In [9]:
import json
import os
from cbar.settings import RESULTS_DIR
results = json.load(open(os.path.join(RESULTS_DIR, 'cal500_ap.json')))
results[results.keys()[-1]]['precision']
Out[9]:
[0.15721287889679258, 0.1559216267824481, 0.16173633269857096]
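The three values are the per-fold average precisions; averaging them gives the MAP for the run. For example, with the fold values copied from the output above:

```python
# Per-fold average precision from the last cross-validation run
fold_ap = [0.15721287889679258, 0.1559216267824481, 0.16173633269857096]

# Mean average precision across the three folds
mean_ap = sum(fold_ap) / len(fold_ap)
print(round(mean_ap, 4))  # 0.1583
```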
Start cross-validation with the CLI
This package comes with a simple CLI which makes it easy to start cross-validation experiments from the command line. The CLI enables you to specify a dataset and a retrieval method as well as additional options in one line.
To start an experiment on the CAL500 dataset with the LORETA retrieval method, use the following command.
$ cbar crossval --dataset cal500 loreta
This simple command uses all the default parameters for LORETA, but you can specify all parameters as arguments to the loreta command. To see the available options for the loreta command, ask for help like this.
$ cbar crossval loreta --help
Usage: cbar crossval loreta [OPTIONS]
Options:
-n, --max-iter INTEGER Maximum number of iterations
-i, --valid-interval INTEGER Iterations between validation steps
-k INTEGER Rank of parameter matrix W
--n0 FLOAT Step size parameter 1
--n1 FLOAT Step size parameter 2
-t, --rank-thresh FLOAT Threshold for early stopping
-l, --lambda FLOAT Regularization constant
--loss [warp|auc] Loss function
-d, --max-dips INTEGER Maximum number of dips
-v, --verbose Verbosity
--help Show this message and exit.