LoretaWARP
class cbar.loreta.LoretaWARP(max_iter=100000, k=30, n0=1.0, n1=0.0, rank_thresh=0.1, lambda_=0.1, loss='warp', valid_interval=10000, max_dips=10, verbose=True)
Low Rank Retraction Algorithm with Weighted Approximate-Rank Pairwise loss (WARP loss).
Parameters: - max_iter (int) – The maximum number of iterations. One iteration corresponds to sampling one triplet \((q, x_q^+, x_q^-)\).
- k (int) – The rank of the parameter matrix \(W = AB^T \in \mathbb{R}^{n \times m}\) with \(A \in \mathbb{R}^{n \times k}\) and \(B \in \mathbb{R}^{m \times k}\).
- n0 (float) – The first step size parameter.
- n1 (float) – The second step size parameter.
- rank_thresh (float) – The threshold for early-stopping the sampling procedure. Sampling stops once the number of draws reaches rank_thresh times the number of irrelevant examples \(x_q^-\) for the query.
- lambda_ (float) – The regularization constant \(\lambda\).
- loss (str, 'warp' or 'auc') – The loss function.
- max_dips (int) – The maximum number of dips the algorithm is allowed to make.
- valid_interval (int) – The interval at which a validation of the current model state is performed.
- verbose (bool, optional) – Turns validation output at each validation interval on or off.
A
array, shape = [n_classes, k] – Factor \(A\) of the factored parameter matrix \(W\) such that \(W = AB^T\) with \(A \in \mathbb{R}^{n \times k}\).
B
array, shape = [n_features, k] – Factor \(B\) of the factored parameter matrix \(W\) such that \(W = AB^T\) with \(B \in \mathbb{R}^{m \times k}\).
A_pinv
array, shape = [k, n_classes] – The pseudo-inverse of A.
B_pinv
array, shape = [k, n_features] – The pseudo-inverse of B.
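The shapes of these four attributes can be checked with a short NumPy sketch. It is not taken from the library: the sizes, the random initialization and the use of `np.linalg.pinv` are assumptions chosen only to illustrate the factored form \(W = AB^T\) and its pseudo-inverses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not library defaults).
n_classes, n_features, k = 6, 8, 3

# Factors of the rank-k parameter matrix W = A B^T.
A = rng.standard_normal((n_classes, k))   # shape [n_classes, k]
B = rng.standard_normal((n_features, k))  # shape [n_features, k]
W = A @ B.T                               # shape [n_classes, n_features]

# Pseudo-inverses computed from scratch; the class instead keeps them
# up to date iteratively, as described in the Notes.
A_pinv = np.linalg.pinv(A)                # shape [k, n_classes]
B_pinv = np.linalg.pinv(B)                # shape [k, n_features]

# For a factor with full column rank, the pseudo-inverse is a left inverse.
left_inverse_ok = np.allclose(A_pinv @ A, np.eye(k))
rank_W = np.linalg.matrix_rank(W)

# Storage for the factors versus the dense matrix.
n_params_factored = A.size + B.size       # k * (n_classes + n_features)
n_params_dense = W.size                   # n_classes * n_features
```

With these sizes the factored form stores 42 numbers against 48 for the dense matrix; the saving grows as `k` becomes small relative to `n_classes` and `n_features`.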
Notes
This class implements the Low Rank Retraction Algorithm with rank-one gradients as described in [1]. The rank-one gradients allow for an efficient, iterative update of the pseudo-inverses based on [2], instead of two expensive \(\mathcal{O}(n^3)\) spectral decompositions in every iteration to compute the pseudo-inverses.
Moreover, the algorithm features the Weighted Approximate-Rank Pairwise loss (WARP loss) [3], which employs a sampling scheme that speeds up training.
A similar algorithm with the additional constraint that \(W\) is positive semidefinite (PSD) was developed in [4].
References
[1] Shalit, U., Weinshall, D. and Chechik, G., 2012. Online learning in the embedded manifold of low-rank matrices. Journal of Machine Learning Research, 13(Feb), pp.429-458.
[2] Meyer, Jr, C.D., 1973. Generalized inversion of modified matrices. SIAM Journal on Applied Mathematics, 24(3), pp.315-323.
[3] Weston, J., Bengio, S. and Usunier, N., 2010. Large scale image annotation: learning to rank with joint word-image embeddings. Machine learning, 81(1), pp.21-35.
[4] Lim, D. and Lanckriet, G., 2014. Efficient Learning of Mahalanobis Metrics for Ranking. In Proceedings of The 31st International Conference on Machine Learning (pp. 1980-1988).
_initialize_W(n, m)
Initialize the parameter matrix in factored form \(W = AB^T\) as self.A and self.B, and initialize the respective pseudo-inverses as self.A_pinv and self.B_pinv.
_sample_violator(X, Q, q_id, rel_id)
Sample a violator according to the WARP sampling scheme. See [4] for details.
Returns: - n_drawn (int) – The number of samples drawn until a violator was found.
- n_irrelevant (int) – The number of irrelevant training examples for the query q_id.
- rel_minus_irr (array, shape = [n_features]) – The vector difference of the relevant and the irrelevant training example, \(d = x_q^+ - x_q^-\).
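The general idea of WARP sampling from [3] can be sketched as follows. This is a generic illustration, not the code of `_sample_violator`: the function names, the score-based interface, the unit margin and the way `rank_thresh` caps the number of draws are all assumptions made for the example.

```python
import numpy as np


def sample_violator(scores_irrelevant, score_relevant, rank_thresh=0.1,
                    margin=1.0, rng=None):
    """Draw irrelevant items uniformly until one violates the margin.

    Returns (n_drawn, index of the violator or None if the budget ran out).
    Hypothetical helper; the draw budget is assumed to be
    rank_thresh * n_irrelevant.
    """
    rng = np.random.default_rng() if rng is None else rng
    scores_irrelevant = np.asarray(scores_irrelevant, dtype=float)
    n_irrelevant = scores_irrelevant.shape[0]
    max_draws = max(1, int(rank_thresh * n_irrelevant))
    for n_drawn in range(1, max_draws + 1):
        j = int(rng.integers(n_irrelevant))
        # A violator scores within `margin` of (or above) the relevant item.
        if scores_irrelevant[j] + margin > score_relevant:
            return n_drawn, j
    return max_draws, None


def warp_weight(n_drawn, n_irrelevant):
    """WARP loss weight: harmonic sum up to the estimated rank.

    The rank of the relevant item is estimated as
    floor(n_irrelevant / n_drawn), as in Weston et al. [3].
    """
    rank_estimate = n_irrelevant // n_drawn
    return sum((1.0 / i for i in range(1, rank_estimate + 1)), 0.0)
```

A violator found on the first draw suggests the relevant item is ranked poorly and yields a large weight; one found only after many draws yields a small weight. Once a violator is found, the update direction is built from the difference \(d = x_q^+ - x_q^-\) returned as `rel_minus_irr`.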
fit(X, Y, Q, X_val=None, Y_val=None, weights=None)
Fit the model.
Parameters: - X (array-like, shape = [n_samples, n_features]) – Training data
- Y (array-like, shape = [n_samples, n_classes]) – Training labels
- Q (array-like, shape = [n_queries, n_classes]) – Training queries
- X_val (array-like, shape = [n_samples, n_features], optional) – Validation data
- Y_val (array-like, shape = [n_samples, n_classes], optional) – Validation labels
- weights (array-like, shape = [n_queries], optional) – Query weights
Returns: self – The fitted model.
Return type: LoretaWARP instance
fit_partial(X, Y, Q, X_val=None, Y_val=None, weights=None)
Fit the model. Repeated calls to this method will cause training to resume from the current model state.
Parameters: - X (array-like, shape = [n_samples, n_features]) – Training data
- Y (array-like, shape = [n_samples, n_classes]) – Training labels
- Q (array-like, shape = [n_queries, n_classes]) – Training queries
- X_val (array-like, shape = [n_samples, n_features], optional) – Validation data
- Y_val (array-like, shape = [n_samples, n_classes], optional) – Validation labels
- weights (array-like, shape = [n_queries], optional) – Query weights
Returns: self – The fitted model.
Return type: LoretaWARP instance
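A minimal sketch of how `fit` and `fit_partial` might be combined, using arrays with the documented shapes. The synthetic data, the binary encoding of `Y` and `Q`, and the helper `fit_then_resume` are assumptions for illustration; the helper only relies on the two documented method signatures and works with any object that provides them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions).
n_samples, n_features, n_classes, n_queries = 50, 8, 4, 5

X = rng.standard_normal((n_samples, n_features))               # [n_samples, n_features]
# Assumed encoding: binary indicator matrices for labels and queries.
Y = (rng.random((n_samples, n_classes)) < 0.3).astype(float)   # [n_samples, n_classes]
Q = np.eye(n_classes)[rng.integers(n_classes, size=n_queries)]  # [n_queries, n_classes]


def fit_then_resume(model, X, Y, Q, n_rounds=2):
    """Hypothetical helper: one initial fit, then n_rounds resumed fits.

    Uses only fit(X, Y, Q) and fit_partial(X, Y, Q) as documented above.
    """
    model.fit(X, Y, Q)
    for _ in range(n_rounds):
        # Each call continues from the current model state.
        model.fit_partial(X, Y, Q)
    return model


# With the library installed, usage would look roughly like this
# (untested here, parameter values are arbitrary):
#   from cbar.loreta import LoretaWARP
#   model = fit_then_resume(LoretaWARP(max_iter=1000, k=3, verbose=False), X, Y, Q)
```

Splitting training into rounds like this makes it possible to inspect `A` and `B` between calls without restarting from a fresh initialization.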