Many properties of divergences can be derived if we restrict S to be a statistical manifold, meaning that it can be parametrized with a finite-dimensional coordinate system θ, so that for a distribution p ∈ S we can write p = p(θ).
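For a pair of points p, q ∈ S with coordinates θp and θq, denote the partial derivatives of D(p || q) as
$$D((\partial_i)_p \,\|\, q) = \frac{\partial}{\partial\theta_p^i} D(p \,\|\, q), \qquad D((\partial_i \partial_j)_p \,\|\, (\partial_k)_q) = \frac{\partial}{\partial\theta_p^i} \frac{\partial}{\partial\theta_p^j} \frac{\partial}{\partial\theta_q^k} D(p \,\|\, q), \quad \text{etc.}$$
Now we restrict these functions to the diagonal p = q, and denote
$$D[\partial_i \,\|\, \cdot] : p \mapsto D((\partial_i)_p \,\|\, p), \qquad D[\partial_i \,\|\, \partial_j] : p \mapsto D((\partial_i)_p \,\|\, (\partial_j)_p), \quad \text{etc.}$$
By definition, the function D(p || q) is minimized at p = q, and therefore
$$D[\partial_i \,\|\, \cdot] = D[\cdot \,\|\, \partial_i] = 0, \qquad D[\partial_i \partial_j \,\|\, \cdot] = D[\cdot \,\|\, \partial_i \partial_j] = -D[\partial_i \,\|\, \partial_j] \equiv g_{ij}^{(D)},$$
where the matrix g(D) is positive semi-definite and defines a unique Riemannian metric on the manifold S.
The divergence D(· || ·) also defines a unique torsion-free affine connection ∇(D) with coefficients
$$\Gamma_{ij,k}^{(D)} = -D[\partial_i \partial_j \,\|\, \partial_k],$$
and the dual to this connection ∇* is generated by the dual divergence D*.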
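As a numerical illustration of the metric induced by a divergence, the following sketch takes D to be the Kullback–Leibler divergence on the one-parameter Bernoulli family, with θ the success probability. It estimates the second partial derivatives at the diagonal by central finite differences and compares them with the Fisher information 1/(θ(1 − θ)). The function names, the step size h and the evaluation point θ = 0.3 are illustrative assumptions, not part of the theory.

```python
import math

def kl_bernoulli(a, b):
    """KL divergence D(p || q) between Bernoulli(a) and Bernoulli(b)."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

def second_partials_on_diagonal(D, theta, h=1e-4):
    """Central finite differences of D(theta_p || theta_q) at theta_p = theta_q = theta.

    Returns the pair (d^2 D / d theta_p^2, d^2 D / d theta_p d theta_q).
    """
    d_pp = (D(theta + h, theta) - 2 * D(theta, theta) + D(theta - h, theta)) / h**2
    d_pq = (D(theta + h, theta + h) - D(theta + h, theta - h)
            - D(theta - h, theta + h) + D(theta - h, theta - h)) / (4 * h**2)
    return d_pp, d_pq

theta = 0.3                                  # arbitrary point of the family
fisher = 1.0 / (theta * (1.0 - theta))       # Fisher information of Bernoulli(theta)
d_pp, d_pq = second_partials_on_diagonal(kl_bernoulli, theta)

# The pure second derivative should match the Fisher information,
# and the mixed derivative should match its negative.
print(d_pp, d_pq, fisher)
```

Up to discretization error, the pure second derivative agrees with the Fisher information and the mixed derivative with its negative, which is the one-dimensional case of the metric induced by the Kullback–Leibler divergence.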
Thus, a divergence D(· || ·) generates on a statistical manifold a unique dualistic structure (g(D), ∇(D), ∇(D*)). The converse is also true: every torsion-free dualistic structure on a statistical manifold is induced from some globally defined divergence function (which however need not be unique).
For example, when D is an f-divergence for some function ƒ(·), then it generates the metric g(Df) = c·g and the connection ∇(Df) = ∇(α), where g is the canonical Fisher information metric, ∇(α) is the α-connection, c = ƒ′′(1), and α = 3 + 2ƒ′′′(1)/ƒ′′(1).
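The constants c and α can be checked on small cases. The sketch below assumes the convention D_f(p || q) = Σ q_i ƒ(p_i / q_i) and uses three generators whose derivatives at 1 were worked out by hand. For each generator it evaluates α from the formula above and compares a finite-difference second derivative of D_f on the Bernoulli family with c times the Fisher information. The helper names, the step size and the evaluation point are illustrative assumptions.

```python
import math

# name -> (f, f''(1), f'''(1)); the derivative values are computed by hand.
generators = {
    "KL, f(t) = t log t":            (lambda t: t * math.log(t), 1.0, -1.0),
    "reverse KL, f(t) = -log t":     (lambda t: -math.log(t),    1.0, -2.0),
    "chi-squared, f(t) = (t - 1)^2": (lambda t: (t - 1) ** 2,    2.0,  0.0),
}

def f_divergence_bernoulli(f, a, b):
    """D_f(p || q) = sum_i q_i f(p_i / q_i) for p = Bernoulli(a), q = Bernoulli(b)."""
    p, q = (a, 1 - a), (b, 1 - b)
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

theta, h = 0.3, 1e-4
fisher = 1.0 / (theta * (1.0 - theta))   # Fisher information of Bernoulli(theta)

results = {}
for name, (f, f2, f3) in generators.items():
    # Second derivative in the first argument, evaluated at the diagonal.
    hessian = (f_divergence_bernoulli(f, theta + h, theta)
               - 2 * f_divergence_bernoulli(f, theta, theta)
               + f_divergence_bernoulli(f, theta - h, theta)) / h**2
    alpha = 3 + 2 * f3 / f2
    results[name] = (hessian, f2 * fisher, alpha)
    print(f"{name}: hessian={hessian:.4f}, c*fisher={f2 * fisher:.4f}, alpha={alpha}")
```

Under these assumptions the three generators give α = 1, −1 and 3 respectively, and in each case the numerical second derivative agrees with c times the Fisher information up to discretization error.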
The two most important classes of divergences are the f-divergences and Bregman divergences; however, other types of divergence functions are also encountered in the literature. The only divergence that is both an f-divergence and a Bregman divergence is the Kullback–Leibler divergence; the squared Euclidean divergence is a Bregman divergence (corresponding to the function F(x) = ‖x‖²), but not an f-divergence.
Bregman divergences correspond to convex functions on convex sets. Given a strictly convex, continuously differentiable function F on a convex set, known as the Bregman generator, the Bregman divergence measures the convexity of F: it is the error of the linear approximation of F taken at q when used to approximate the value of F at p:
$$D_F(p \,\|\, q) = F(p) - F(q) - \langle \nabla F(q),\, p - q \rangle.$$
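A minimal Python sketch of this definition, D_F(p || q) = F(p) − F(q) − ⟨∇F(q), p − q⟩, is given below. It evaluates the Bregman divergence for two generators: the squared norm, which yields the squared Euclidean distance, and the negative entropy, which yields the Kullback–Leibler divergence when both arguments are probability vectors. The function names and the sample points p and q are illustrative assumptions.

```python
import math

def bregman(F, grad_F, p, q):
    """Bregman divergence D_F(p || q) = F(p) - F(q) - <grad F(q), p - q>."""
    g = grad_F(q)
    return F(p) - F(q) - sum(gi * (pi - qi) for gi, pi, qi in zip(g, p, q))

# Generator of the squared Euclidean distance: F(x) = ||x||^2.
def sq_norm(x):
    return sum(xi * xi for xi in x)

def sq_norm_grad(x):
    return [2 * xi for xi in x]

# Generator of the relative entropy: negative entropy F(x) = sum_i x_i log x_i.
def neg_entropy(x):
    return sum(xi * math.log(xi) for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) + 1 for xi in x]

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]

d_euclid = bregman(sq_norm, sq_norm_grad, p, q)
d_entropy = bregman(neg_entropy, neg_entropy_grad, p, q)

# Direct formulas for comparison.
sq_dist = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

print(d_euclid, sq_dist)
print(d_entropy, kl)
```

The agreement of the second pair relies on p and q both summing to 1; for general positive vectors the negative-entropy generator gives the generalized Kullback–Leibler divergence, which carries the extra term Σ (q_i − p_i).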
The dual divergence to a Bregman divergence is the divergence generated by the convex conjugate F* of the Bregman generator of the original divergence. For example, for the squared Euclidean distance, the generator is F(x) = ‖x‖², while for the relative entropy the generator is the negative entropy.
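The duality can be checked numerically. The sketch below assumes the negative-entropy generator F(x) = Σ x_i log x_i on the positive orthant, whose convex conjugate is F*(y) = Σ exp(y_i − 1), and uses the dual coordinates p* = ∇F(p). Under these assumptions the divergence generated by F*, evaluated in dual coordinates, should coincide with the original divergence with its arguments swapped. All function names and the sample points are illustrative.

```python
import math

def bregman(F, grad_F, p, q):
    """Bregman divergence D_F(p || q) = F(p) - F(q) - <grad F(q), p - q>."""
    g = grad_F(q)
    return F(p) - F(q) - sum(gi * (pi - qi) for gi, pi, qi in zip(g, p, q))

# Negative entropy and its gradient.
def F(x):
    return sum(xi * math.log(xi) for xi in x)

def grad_F(x):
    return [math.log(xi) + 1 for xi in x]

# Convex conjugate of F on the positive orthant and its gradient.
def F_star(y):
    return sum(math.exp(yi - 1) for yi in y)

def grad_F_star(y):
    return [math.exp(yi - 1) for yi in y]

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]

# Dual coordinates: the gradient map of F sends each point to its dual.
p_star, q_star = grad_F(p), grad_F(q)

primal = bregman(F, grad_F, p, q)                        # D_F(p || q)
primal_swapped = bregman(F, grad_F, q, p)                # D_F(q || p), the dual divergence
dual = bregman(F_star, grad_F_star, p_star, q_star)      # D_{F*}(p* || q*)

print(primal, primal_swapped, dual)
```

Here the last two printed values agree up to rounding, while the first differs from them, reflecting that this divergence is not symmetric.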
Matumoto, Takao (1993). "Any statistical manifold has a contrast function — on the C³-functions taking the minimum at the diagonal of the product manifold". Hiroshima Mathematical Journal. 23 (2): 327–332. doi:10.32917/hmj/1206128255.