General Contrastive Loss: The authors first define a general form for the gradient of any contrastive loss function.
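$$\nabla_\theta L = -\mathbb{E}_{x,\,y^+,\,y^-}\left[\, a(x, y^+, D^-)\,\nabla_\theta \pi_\theta(y^+ \mid x) \;-\; b(x, y^-, D^+)\,\nabla_\theta \pi_\theta(y^- \mid x) \,\right]$$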
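- Explanation: This equation states that a contrastive loss gradient has two parts. The first term, $a\,\nabla_\theta \pi_\theta(y^+ \mid x)$, pushes up the probability of the positive sample $y^+$; the second, $-b\,\nabla_\theta \pi_\theta(y^- \mid x)$, pushes down the probability of the negative sample $y^-$. Because of the leading minus sign, a gradient descent step on $L$ moves the parameters in the direction that raises $\pi_\theta(y^+ \mid x)$ and lowers $\pi_\theta(y^- \mid x)$. The functions $a(x, y^+, D^-)$ and $b(x, y^-, D^+)$ are weighting coefficients that set how strongly each term contributes; different contrastive losses correspond to different choices of $a$ and $b$.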
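
To make the push-up / push-down behavior concrete, here is a minimal sketch for a single (x, y+, y−) triple rather than the full expectation. It is not the authors' implementation: it assumes a toy policy that is a softmax over three candidate responses with the logits themselves as parameters, and it assumes constant weights a = b = 1. The function names (`grad_prob`, `contrastive_grad`, `descent_step`) are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def grad_prob(logits, i):
    # Gradient of pi(i) with respect to the logits: pi_i * (onehot_i - pi).
    pi = softmax(logits)
    return [pi[i] * ((1.0 if j == i else 0.0) - pi[j])
            for j in range(len(logits))]

def contrastive_grad(logits, pos, neg, a=1.0, b=1.0):
    # Single-sample version of the general form:
    #   grad L = -( a * grad pi(y+) - b * grad pi(y-) )
    # a and b are constants here (an assumption); in the general form
    # they are functions of the sample and the opposing set.
    g_pos = grad_prob(logits, pos)
    g_neg = grad_prob(logits, neg)
    return [-(a * gp - b * gn) for gp, gn in zip(g_pos, g_neg)]

def descent_step(logits, pos, neg, lr=1.0, a=1.0, b=1.0):
    # One gradient descent step: theta <- theta - lr * grad L.
    g = contrastive_grad(logits, pos, neg, a, b)
    return [t - lr * gi for t, gi in zip(logits, g)]

theta = [0.0, 0.0, 0.0]            # uniform toy policy over 3 responses
new_theta = descent_step(theta, pos=0, neg=1)
before, after = softmax(theta), softmax(new_theta)
print("pi(y+):", before[0], "->", after[0])
print("pi(y-):", before[1], "->", after[1])
```

After one step the probability of the positive response (index 0) rises and that of the negative response (index 1) falls. Raising `a` relative to `b` would make the update favor the push-up term, and vice versa.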