Precision matrix estimation requires selecting an appropriate regularization parameter λ, which balances sparsity (number of edges) against model fit (likelihood), and a mixing parameter α, which trades off element-wise (individual-level) against block-wise (group-level) penalties.
In a Gaussian graphical model (GGM), the data matrix $X \in \mathbb{R}^{n \times d}$ consists of $n$ independent and identically distributed observations $X_1, \ldots, X_n$ drawn from $N_d(\mu, \Sigma)$. Let $\Omega = \Sigma^{-1}$ denote the precision matrix, and define the empirical covariance matrix as $S = n^{-1} \sum_{i=1}^n (X_i-\bar{X})(X_i-\bar{X})^\top$. Up to an additive constant, the negative log-likelihood (nll) for $\Omega$ simplifies to $$ \mathrm{nll}(\Omega) = \frac{n}{2}[-\log\det(\Omega) + \mathrm{tr}(S\Omega)]. $$ The edge set $E(\Omega)$ is determined by the non-zero off-diagonal entries: an edge $(i, j)$ is included if and only if $\omega_{ij} \neq 0$ for $i < j$. The number of edges is therefore given by $|E(\Omega)|$.
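The two quantities above, the negative log-likelihood and the edge count, can be computed directly; a minimal NumPy sketch follows (the function names `neg_log_likelihood` and `num_edges` and the tolerance `tol` are illustrative choices, not part of the original formulation):

```python
import numpy as np

def neg_log_likelihood(omega, S, n):
    """nll(Omega) = (n/2) * (-log det(Omega) + tr(S @ Omega)),
    up to the additive constant dropped in the text."""
    sign, logdet = np.linalg.slogdet(omega)
    if sign <= 0:
        return np.inf  # Omega must be positive definite
    return 0.5 * n * (-logdet + np.trace(S @ omega))

def num_edges(omega, tol=1e-10):
    """|E(Omega)|: non-zero off-diagonal entries omega_ij with i < j."""
    upper = np.triu(omega, k=1)  # strict upper triangle
    return int(np.sum(np.abs(upper) > tol))
```

Because estimated entries are rarely exactly zero in floating point, a small tolerance stands in for the exact zero test when counting edges.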
$$\hat{\Omega}_{\mathrm{AIC}} = \arg\min_{\Omega}\left\{2\,\mathrm{nll}(\Omega) + 2\,|E(\Omega)|\right\}.$$
$$\hat{\Omega}_{\mathrm{BIC}} = \arg\min_{\Omega}\left\{2\,\mathrm{nll}(\Omega) + \log(n)\,|E(\Omega)|\right\}.$$
$$\hat{\Omega}_{\mathrm{EBIC}} = \arg\min_{\Omega}\left\{2\,\mathrm{nll}(\Omega) + \log(n)\,|E(\Omega)| + 4\xi\log(d)\,|E(\Omega)|\right\},$$
where $\xi \in [0, 1]$ is a tuning parameter. Setting $\xi = 0$ reduces EBIC to the classic BIC.
$$\hat{\Omega}_{\mathrm{HBIC}} = \arg\min_{\Omega}\left\{2\,\mathrm{nll}(\Omega) + \log[\log(n)]\,\log(d)\,|E(\Omega)|\right\}.$$
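Given a candidate estimate, all four criteria differ only in how they penalize the edge count, so they can be evaluated together; a hedged sketch (the function name `criteria` and the default `xi=0.5` are illustrative):

```python
import numpy as np

def criteria(omega, S, n, d, xi=0.5, tol=1e-10):
    """Evaluate AIC, BIC, EBIC, and HBIC for a candidate precision
    matrix Omega; the estimate minimizing a criterion is selected."""
    _, logdet = np.linalg.slogdet(omega)
    nll = 0.5 * n * (-logdet + np.trace(S @ omega))
    e = int(np.sum(np.abs(np.triu(omega, k=1)) > tol))  # |E(Omega)|
    return {
        "AIC":  2 * nll + 2 * e,
        "BIC":  2 * nll + np.log(n) * e,
        "EBIC": 2 * nll + np.log(n) * e + 4 * xi * np.log(d) * e,
        "HBIC": 2 * nll + np.log(np.log(n)) * np.log(d) * e,
    }
```

As a sanity check, `xi = 0` makes the EBIC score coincide with the BIC score, and for an empty graph ($|E(\Omega)| = 0$) every criterion reduces to $2\,\mathrm{nll}(\Omega)$.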
Figure 1 illustrates the K-fold cross-validation procedure used to tune the parameters λ and α. The notation #λ and #α denotes the number of candidate values considered for λ and α, respectively, forming a grid of #λ × #α parameter combinations. In each of the K iterations, the negative log-likelihood loss is evaluated for every combination, yielding K performance values per combination. The optimal parameter pair is the one achieving the lowest average loss across the K iterations.
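The procedure can be sketched as follows. For brevity the grid below covers only λ; the α dimension extends the inner loop identically over the full #λ × #α grid. As a stand-in for the penalized estimator (which is not specified here), the ridge-type inverse $(S_{\text{train}} + \lambda I)^{-1}$ is used, so the function `cv_select_lambda` and that estimator choice are assumptions for illustration only:

```python
import numpy as np

def cv_select_lambda(X, lambdas, K=5, seed=0):
    """K-fold CV over a lambda grid. Held-out loss per fold is the
    (constant-free) nll: -log det(Omega) + tr(S_test @ Omega)."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % K  # random, near-equal fold sizes
    losses = np.zeros((len(lambdas), K))
    for k in range(K):
        train, test = X[folds != k], X[folds == k]
        S_tr = np.cov(train, rowvar=False, bias=True)  # 1/n scaling, as S
        S_te = np.cov(test, rowvar=False, bias=True)
        for i, lam in enumerate(lambdas):
            # Stand-in fit; in practice this is the penalized
            # estimate for the (lambda, alpha) grid point.
            omega = np.linalg.inv(S_tr + lam * np.eye(d))
            _, logdet = np.linalg.slogdet(omega)
            losses[i, k] = -logdet + np.trace(S_te @ omega)
    mean_loss = losses.mean(axis=1)  # average over the K folds
    return lambdas[int(np.argmin(mean_loss))], mean_loss
```

Each grid point thus receives K loss values, and the selected parameter is the minimizer of their average, exactly as described for Figure 1.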