- Boxes: definitions
- Ellipses: theorems and lemmas
- Blue border: the statement of this result is ready to be formalized; all prerequisites are done
- Orange border: the statement of this result is not ready to be formalized; the blueprint needs more work
- Blue background: the proof of this result is ready to be formalized; all prerequisites are done
- Green border: the statement of this result is formalized
- Green background: the proof of this result is formalized
- Dark green background: the proof of this result and all its ancestors are formalized
Let \(\operatorname{LZSet}(A)\) be the Lempel–Ziv dictionary for a sequence \(A\). We define LZ–Jaccard distance \(\operatorname{LZJD}\) by
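The Szymkiewicz–Simpson coefficient is defined by
\[ \operatorname{overlap}(X,Y)=\frac{\lvert X\cap Y\rvert }{\min (\lvert X\rvert ,\lvert Y\rvert )}. \]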
The Tversky dissimilarity \(d^T_{\alpha ,\beta }\) is a metric iff \(\alpha =\beta \ge 1\).
Let \(\mathcal X\) be a set. A metric on \(\mathcal X\) is a function \(d : \mathcal X \times \mathcal X \to \mathbb R\) such that
\(d(x, y) \ge 0\),
\(d(x, y) = 0\) if and only if \(x = y\),
\(d(x, y) = d(y, x)\) (symmetry),
\(d(x,y)\le d(x,z)+d(z,y)\) (the triangle inequality)
for all \(x,y,z\in \mathcal X\). If \(d\) satisfies the first three conditions (nonnegativity, identity of indiscernibles, and symmetry) but not necessarily the triangle inequality, then \(d\) is called a semimetric.
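On a finite set the four conditions can be checked by brute force. The following is a minimal sketch, not part of the blueprint; the helper names `is_semimetric` and `is_metric` and the two sample distance functions are chosen here purely for illustration.

```python
from itertools import product

def is_semimetric(points, d, tol=1e-12):
    """Check nonnegativity, d(x, y) = 0 iff x = y, and symmetry."""
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:
            return False
        if (abs(d(x, y)) <= tol) != (x == y):
            return False
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    return True

def is_metric(points, d, tol=1e-12):
    """A semimetric that also satisfies the triangle inequality."""
    if not is_semimetric(points, d, tol):
        return False
    return all(d(x, y) <= d(x, z) + d(z, y) + tol
               for x, y, z in product(points, repeat=3))

points = [0, 1, 2, 3]
absolute = lambda x, y: abs(x - y)      # the usual distance on integers
squared = lambda x, y: (x - y) ** 2     # symmetric, but breaks the triangle inequality

print(is_metric(points, absolute))
print(is_semimetric(points, squared), is_metric(points, squared))
```

The squared difference is a semimetric but not a metric on these points, since \((0-2)^2 = 4\) exceeds \((0-1)^2 + (1-2)^2 = 2\).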
The Sørensen–Dice coefficient is defined by
\[ \frac{2\lvert X\cap Y\rvert }{\lvert X\rvert +\lvert Y\rvert }. \]
Let \(\delta :=\alpha \tilde m +\overline{\alpha }\tilde M\). Let \(X=\{ 0\} , Y=\{ 1\} , Z=\{ 0,1\} \). Then \(\delta (X,Y)=1\), \(\delta (X,Z)=\delta (Y,Z)=\overline{\alpha }\).
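These values can be reproduced numerically. The sketch below assumes that \(\tilde m\) and \(\tilde M\) denote the minimum and maximum of \(\lvert X\setminus Y\rvert\) and \(\lvert Y\setminus X\rvert\), and that \(\overline{\alpha}=1-\alpha\); this reading is consistent with the values stated above but is not spelled out in the statement itself. The helper names are illustrative.

```python
from fractions import Fraction

def delta(X, Y, alpha):
    """alpha * min + (1 - alpha) * max of the two set-difference sizes."""
    a, b = len(X - Y), len(Y - X)
    return alpha * min(a, b) + (1 - alpha) * max(a, b)

X, Y, Z = frozenset({0}), frozenset({1}), frozenset({0, 1})

def triangle_holds(alpha):
    """Does delta(X, Y) <= delta(X, Z) + delta(Z, Y) hold for this alpha?"""
    return delta(X, Y, alpha) <= delta(X, Z, alpha) + delta(Z, Y, alpha)

for alpha in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(alpha, delta(X, Y, alpha), delta(X, Z, alpha), triangle_holds(alpha))
```

For these three sets the triangle inequality reads \(1\le 2\overline{\alpha}\), so under the stated assumptions it fails exactly when \(\alpha > 1/2\).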
Let \(f(A,B)=\lvert A\setminus B\rvert +\lvert B\setminus A\rvert \). Then \(f\) is a metric.
If \(d_1\) and \(d_2\) are metrics and \(a,b\) are nonnegative constants, not both zero, then \(ad_1+bd_2\) is a metric.
Let \(0\le \alpha \le 1\) and \(\beta {\gt} 0\). Then \(D_{\alpha ,\beta }\) is a metric if and only if \(0\le \alpha \le 1/2\) and \(\beta \ge 1/(1-\alpha )\).
Let \(d(x,y)\) be a metric and let \(a(x,y)\) be a nonnegative symmetric function. If \(a(x,z)\le a(x,y)+d(y,z)\) for all \(x,y,z\), then \(d'(x,y)=\frac{d(x,y)}{a(x,y)+d(x,y)}\), with \(d'(x,y)=0\) if \(d(x,y)=0\), is a metric.
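One instance, chosen here for illustration rather than taken from the statement: take \(d(A,B)=\lvert A\,\triangle\, B\rvert\) and \(a(A,B)=\lvert A\cap B\rvert\) on finite sets. Then \(a+d=\lvert A\cup B\rvert\), so \(d'\) is the Jaccard distance. The sketch below checks the hypothesis and the triangle inequality for \(d'\) by brute force on the power set of a three-element set; it is a sanity check under these assumptions, not a proof.

```python
from fractions import Fraction
from itertools import combinations, product

def powerset(universe):
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def d(A, B):
    """Size of the symmetric difference."""
    return len(A ^ B)

def a(A, B):
    """Size of the intersection (nonnegative and symmetric)."""
    return len(A & B)

def d_prime(A, B):
    """d / (a + d), with value 0 when d = 0."""
    if d(A, B) == 0:
        return Fraction(0)
    return Fraction(d(A, B), a(A, B) + d(A, B))

sets = powerset({0, 1, 2})

# Hypothesis of the lemma: a(x, z) <= a(x, y) + d(y, z).
hypothesis = all(a(x, z) <= a(x, y) + d(y, z)
                 for x, y, z in product(sets, repeat=3))
# Conclusion (triangle inequality part) for d'.
triangle = all(d_prime(x, y) <= d_prime(x, z) + d_prime(z, y)
               for x, y, z in product(sets, repeat=3))
print(hypothesis, triangle)
```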
Let \(f(A,B)\) be a metric such that
for all \(A,B\). Then the function \(d\) given by
is a metric.
\(\delta _{\alpha }=\alpha \tilde m +\overline{\alpha }\tilde M\) satisfies the triangle inequality if and only if \(0\le \alpha \le 1/2\).
Let \(f(A,B)=\max \{ \lvert A\setminus B\rvert ,\lvert B\setminus A\rvert \} \). Then \(f\) is a metric.
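A brute-force check on a small power set illustrates the claim and shows why the maximum cannot be replaced by the minimum. This is a sketch for intuition only; the function names are hypothetical, and the sum \(\lvert A\setminus B\rvert+\lvert B\setminus A\rvert\) is included for comparison.

```python
from itertools import combinations, product

def powerset(universe):
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def f_sum(A, B):
    return len(A - B) + len(B - A)

def f_max(A, B):
    return max(len(A - B), len(B - A))

def f_min(A, B):
    return min(len(A - B), len(B - A))

def is_metric(points, d):
    """Check all four metric conditions exhaustively."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y) or d(x, y) != d(y, x):
            return False
    return all(d(x, y) <= d(x, z) + d(z, y)
               for x, y, z in product(points, repeat=3))

sets = powerset(range(4))
print(is_metric(sets, f_sum), is_metric(sets, f_max), is_metric(sets, f_min))
```

The minimum fails already at the second condition: for \(A=\emptyset\) and \(B=\{0\}\) it gives \(0\) although \(A\ne B\).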
The function \(D_{\alpha ,\beta }\) is a metric on all finite power sets only if \(\alpha \le 1/2\).
Let \(\mathcal X\) be a collection of finite sets. We define \(S:\mathcal X\times \mathcal X\to \mathbb R\) as follows. The symmetric Tversky ratio model is defined by
The unbiased symmetric TRM (\(\mathbf{ustrm}\)) is the case \(\mathrm{bias}=0\), which we assume for the rest of this paper. The Tversky semimetric \(D_{\alpha ,\beta }\) is defined by \(D_{\alpha ,\beta }(X,Y)=1-\mathbf{ustrm}(X,Y)\), or more precisely
Suppose \(d\) is a metric on a collection of nonempty sets \(\mathcal X\), with \(d(X,Y)\le 2\) for all \(X,Y\in \mathcal X\). Let \(\hat{\mathcal X}=\mathcal X\cup \{ \emptyset \} \) and define \(\hat d:\hat{\mathcal X}\times \hat{\mathcal X}\to \mathbb R\) by stipulating that for \(X,Y\in \mathcal X\),
Then \(\hat d\) is a metric on \(\hat{\mathcal X}\).
Let \(f(A,B)=m\min \{ \lvert A\setminus B\rvert ,\lvert B\setminus A\rvert \} +M\max \{ \lvert A\setminus B\rvert ,\lvert B\setminus A\rvert \} \) with \(0{\lt}m\le M\) and \(1\le M\). Then the function \(d\) given by
is a metric.
For sets \(X\) and \(Y\) the Tversky index with parameters \(\alpha ,\beta \ge 0\) is a number between 0 and 1 given by
We also define the corresponding Tversky dissimilarity \(d^T_{\alpha ,\beta }\) by
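The displayed formulas are not reproduced above, so the following sketch assumes the standard form of the Tversky index, \(\lvert X\cap Y\rvert / (\lvert X\cap Y\rvert +\alpha \lvert X\setminus Y\rvert +\beta \lvert Y\setminus X\rvert )\), and takes the dissimilarity to be one minus the index. The function names and the treatment of a zero denominator (index 1, dissimilarity 0) are assumptions made here for illustration.

```python
from fractions import Fraction

def tversky_index(X, Y, alpha, beta):
    """Standard Tversky index (assumed form) for finite sets X and Y."""
    common = len(X & Y)
    denom = common + alpha * len(X - Y) + beta * len(Y - X)
    if denom == 0:
        return Fraction(1)  # assumed convention, e.g. for X = Y = empty set
    return Fraction(common) / denom

def tversky_dissimilarity(X, Y, alpha, beta):
    """One minus the Tversky index (assumed form)."""
    return 1 - tversky_index(X, Y, alpha, beta)

X, Y = frozenset({1, 2, 3}), frozenset({2, 3, 4})
half = Fraction(1, 2)

print(tversky_index(X, Y, 1, 1))        # alpha = beta = 1: Jaccard index
print(tversky_index(X, Y, half, half))  # alpha = beta = 1/2: Sørensen–Dice coefficient
print(tversky_dissimilarity(X, Y, 1, 1))
```

With \(\alpha=\beta=1\) the index reduces to the Jaccard index, and with \(\alpha=\beta=1/2\) to the Sørensen–Dice coefficient.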