Gaussian mixture model based reconstruction of undirected networks

He Ruihui (何瑞辉); Zhang Haifeng (张海峰); Wang Huan (王欢); Ma Chuang (马闯)

doi:10.7498/aps.73.20240552

1. School of Mathematical Science, Anhui University, Hefei 230601, China
2. School of Big Data and Statistics, Anhui University, Hefei 230601, China
3. School of Internet, Anhui University, Hefei 230039, China

Corresponding author: E-mail: chuang_m@126.com

PACS: 89.75.Hc, 89.75.Fb, 05.10.-a, 05.10.Ln

Abstract: Inferring network structure from data is an important scientific problem in the study of complex networks and has attracted wide attention. Most existing network reconstruction methods transform the problem into solving a series of linear equation systems and then truncate the solution of each system to determine the local structure of each node. However, existing truncation methods are often insufficiently accurate, and few methods measure the truncatability of the solution of each system, that is, the reconstructability of each node. To address these issues, this paper proposes an undirected network reconstruction method based on a Gaussian mixture model. The method first uses a Gaussian mixture model to cluster the solutions obtained from the linear equation systems and uses the clustering probabilities to describe how likely a connection is between nodes. A reconstructability index based on information entropy is then defined from these probabilities, so that the reconstructability of each node can be measured even when the true network structure is unknown. Applied to undirected networks, the method uses nodes with high reconstructability as a training set to guide structural inference for nodes with low reconstructability, exploiting the symmetry of undirected networks to infer the remaining connections. Experiments on synthetic and real data, with several ways of constructing the linear equations and several dynamical models, show that the proposed method outperforms existing truncation-based reconstruction methods, which supports its generality and effectiveness.

Keywords:
- network reconstruction
- Gaussian mixture model
- reconstructability
- undirected networks

Figure 1. Network reconstruction results. (a) and (b) show the solutions of two different series of linear equation systems; red dots are solved values for node pairs with a connection, and blue dots are solved values for node pairs without one. (c) Histogram of the solved values for the node in the red box in panel (a). (d) Histogram of the solved values for the node in the red box in panel (b). In panels (a) and (b) the horizontal axis is the node index and the vertical axis is the solved value; in panels (c) and (d) the horizontal axis is the solved value and the vertical axis is the count.

Figure 2. Relationship between $ {H(i)} $ and node reconstruction performance. The horizontal axis is the node index, the left vertical axis is the negative reconstructability index $ {-H(i)} $, and the right vertical axis is the reconstruction performance F₁.

Figure 3. Comparison of the reconstruction performance of UNRGMM and TTM on synthetic networks: (a) and (d) ER networks; (b) and (e) WS networks; (c) and (f) BA networks. Error bars show the standard deviation over ten independent trials.

Figure 4. Reconstruction performance of UNRGMM and TTM under noise: (a) and (d) ER networks; (b) and (e) WS networks; (c) and (f) BA networks. Error bars show the standard deviation over ten independent trials.

Figure 5. Reconstruction performance of UNRGMM and TTM on synthetic networks with different average degrees: (a) and (d) ER networks; (b) and (e) WS networks; (c) and (f) BA networks. Error bars show the standard deviation over ten independent trials.

Figure 6. Reconstruction performance of UNRGMM and TTM under different dynamics: (a) and (d) Ising dynamics; (b) and (e) Game dynamics; (c) and (f) Majority dynamics. Error bars show the standard deviation over ten independent trials.

Table 1. Structural characteristics of the real networks. N and E are the numbers of nodes and edges, respectively; ${\langle k\rangle}$ is the average degree; C and r are the clustering coefficient and the assortativity coefficient, respectively; H is the degree heterogeneity, defined as ${H= {\langle k^2\rangle}/{{\langle k\rangle}^{2}}}$.

Network	N	E	$ {\langle k\rangle} $	C	r	H
Karate	34	78	4.588	0.59	-0.476	1.693
Dolphins	62	159	5.129	0.29	-0.071	1.326
Football	115	613	10.661	0.4	0.162	1.007
Polbooks	105	441	8.40	0.49	-0.128	1.421
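The average degrees in Table 1 follow from N and E through ⟨k⟩ = 2E/N, which holds for any undirected network. The short Python sketch below checks this and shows how the heterogeneity H = ⟨k²⟩/⟨k⟩² would be computed from a degree sequence. The function names are illustrative, and the degree lists passed to `heterogeneity` are hypothetical because the table does not give the degree sequences of the real networks.

```python
def average_degree(n_nodes, n_edges):
    """Average degree of an undirected network: each edge adds 2 to the degree sum."""
    return 2.0 * n_edges / n_nodes


def heterogeneity(degrees):
    """Degree heterogeneity <k^2> / <k>^2 of a degree sequence."""
    n = len(degrees)
    k1 = sum(degrees) / n
    k2 = sum(k * k for k in degrees) / n
    return k2 / (k1 * k1)


# (N, E) pairs copied from Table 1.
table_1 = {
    "Karate": (34, 78),
    "Dolphins": (62, 159),
    "Football": (115, 613),
    "Polbooks": (105, 441),
}
avg = {name: round(average_degree(n, e), 3) for name, (n, e) in table_1.items()}

# Hypothetical degree sequences: a regular graph and a 4-node star.
h_regular = heterogeneity([3, 3, 3, 3])
h_star = heterogeneity([3, 1, 1, 1])
```

The rounded values in `avg` reproduce the ⟨k⟩ column of the table. A regular degree sequence gives H = 1, and a more uneven sequence such as the star gives a larger value.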

Table 2. Comparison of reconstruction results between UNRGMM and TTM on real networks.

Network	TTM F₁	TTM Accuracy	UNRGMM F₁	UNRGMM Accuracy
Karate	0.723	0.889	0.989	0.997
Dolphins	0.776	0.951	0.970	0.995
Football	0.484	0.874	0.774	0.963
Polbooks	0.571	0.896	0.838	0.978
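For readers who want to reproduce scores of the kind reported in Table 2, the sketch below computes F₁ and accuracy from a true and a reconstructed adjacency matrix using the standard definitions. It is an illustration under assumptions: the function name and the toy matrices are made up, and counting every ordered off-diagonal pair (i, j) is a choice made here, since this text does not state how the paper counts pairs.

```python
def f1_and_accuracy(true_adj, pred_adj):
    """Return (F1, accuracy) over all ordered pairs i != j (an assumed convention)."""
    tp = fp = fn = tn = 0
    n = len(true_adj)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-loops are excluded
            t, p = true_adj[i][j], pred_adj[i][j]
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
            else:
                tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return f1, accuracy


# Toy example: the true network is the path 0-1-2; the prediction gets one edge wrong.
true_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
pred_adj = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
f1, acc = f1_and_accuracy(true_adj, pred_adj)
perfect = f1_and_accuracy(true_adj, true_adj)
```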

References

[1]	Li X, Sun L, Ling M J, Peng Y 2023 Neurocomputing 549 126441 doi: 10.1016/j.neucom.2023.126441
[2]	Zhang Y C, Liu Y, Zhang H F, Cheng H, Xiong F 2011 Acta Phys. Sin. 60 050501 (in Chinese) doi: 10.7498/aps.60.050501
[3]	Gardner T S, Di Bernardo D, Lorenz D, Collins J J 2003 Science 301 102 doi: 10.1126/science.1081900
[4]	Geier F, Timmer J, Fleck C 2007 BMC Syst. Biol. 1 1 doi: 10.1186/1752-0509-1-1
[5]	Gao C, Fan Y, Jiang S H, Deng Y, Liu J M, Li X H 2021 IEEE Trans. Intell. Transp. Syst. 23 6509 doi: 10.1109/TITS.2021.3058185
[6]	Zhou Y M, Li S P, Kundu T, Bai X W, Qin W 2021 IEEE Trans. Network Sci. Eng. 8 2249 doi: 10.1109/TNSE.2021.3085818
[7]	Zhang H F, Wang W X 2020 Acta Phys. Sin. 69 088906 (in Chinese) doi: 10.7498/aps.69.20200001
[8]	Wang J Y, Zhang Y J, Xu C, Li J Z, Sun J C, Xie J R, Feng L, Zhou T S, Hu Y Q 2024 Nat. Commun. 15 2849 doi: 10.1038/s41467-024-47248-x
[9]	Kang L, Xiang B B, Zhai S L, Bao Z K, Zhang H F 2018 Acta Phys. Sin. 67 198901 (in Chinese) doi: 10.7498/aps.67.20181000
[10]	Xiang B B, Bao Z K, Ma C, Zhang X Y, Chen H S, Zhang H F 2018 Chaos: An Interdisciplinary Journal of Nonlinear Science 28 013122 doi: 10.1063/1.4990734
[11]	Zhao J, Cheong K H 2024 IEEE Trans. Syst. Man Cybern. Part A Syst. Humans 54 6 doi: 10.1109/TSMC.2024.3349537
[12]	Guo Q T, Jiang X, Lei Y J, Li M, Ma Y F, Zheng Z M 2015 Phys. Rev. E 91 012822 doi: 10.1103/PhysRevE.91.012822
[13]	Li D D, Qian W Q, Sun X X, Han D, Sun M 2023 Appl. Math. Comput. 458 128233
[14]	Lv X J, Fan D M, Li Q, Wang J L, Zhou L 2023 Physica A 627 129131 doi: 10.1016/j.physa.2023.129131
[15]	Xu X, Zhu C, Zhu X Q 2021 Acta Phys. Sin. 70 088901 (in Chinese) doi: 10.7498/aps.70.20201756
[16]	Wang H, Ma C, Chen H S, Lai Y C, Zhang H F 2022 Nat. Commun. 13 3043 doi: 10.1038/s41467-022-30706-9
[17]	Ma C, Wang H, Zhang H F 2023 Europhys. Lett. 144 21002 doi: 10.1209/0295-5075/ad07b2
[18]	Yang P, Zheng Z G 2012 Acta Phys. Sin. 61 120508 (in Chinese) doi: 10.7498/aps.61.120508
[19]	Ma C, Chen H S, Li X, Lai Y C, Zhang H F 2020 SIAM J. Appl. Dyn. Syst. 19 124 doi: 10.1137/19M1254040
[20]	Shen Z S, Wang W X, Fan Y, Di Z R, Lai Y C 2014 Nat. Commun. 5 4323 doi: 10.1038/ncomms5323
[21]	Liu Q M, Ma C, Xiang B B, Chen H S, Zhang H F 2019 IEEE Trans. Syst. Man Cybern. Part A Syst. Humans 51 4639 doi: 10.1109/TSMC.2019.2945363
[22]	Zhang A B, Fan Y, Di Z R, Zeng A 2023 Chaos, Solitons Fractals 173 113712 doi: 10.1016/j.chaos.2023.113712
[23]	Wang W X, Lai Y C, Grebogi C, Ye J P 2011 Phys. Rev. X 1 021021
[24]	Li G J, Li N, Liu S H, Wu X Q 2019 Chaos: An Interdisciplinary Journal of Nonlinear Science 29 053117 doi: 10.1063/1.5093270
[25]	Mei G F, Wu X Q, Wang Y F, Hu M, Lu J A, Chen G R 2017 IEEE Trans. Cybern. 48 754 doi: 10.1109/TCYB.2017.2655511
[26]	Pandey P K, Adhikari B 2017 IEEE Trans. Knowl. Data Eng. 29 2072 doi: 10.1109/TKDE.2017.2725264
[27]	Pandey P K, Adhikari B, Mazumdar M, Ganguly N 2020 IEEE Trans. Knowl. Data Eng. 34 3377 doi: 10.1109/TKDE.2020.3024779
[28]	Ma C, Chen H S, Lai Y C, Zhang H F 2018 Phys. Rev. E 97 022301
[29]	Zhang Z, Zhao Y, Liu J, Wang S, Tao R, Xin R, Zhang J 2019 Appl. Network Sci. 4 1 doi: 10.1007/s41109-018-0108-x
[30]	Xu X, Zhu X Q, Zhu C 2023 Complex Intell. Syst. 9 3131 doi: 10.1007/s40747-022-00893-5
[31]	Mignone P, Pio G, D’Elia D, Ceci M 2020 Bioinformatics 36 1553 doi: 10.1093/bioinformatics/btz781
[32]	Reynolds D A 2009 Encyclopedia of Biometrics 741 659
[33]	Wang Y, Chakrabarti D, Wang C X, Faloutsos C 2003 In 22nd International Symposium on Reliable Distributed Systems Florence, Italy, October 6–8, 2003 pp25–34
[34]	Perotti J I, Tessone C J, Clauset A, Caldarelli G 2018 arXiv: 1806.07005 (Physics and Society)
[35]	Erdős P, Rényi A 1960 Publ. Math. Inst. Hungar. Acad. Sci. 5 17
[36]	Watts D J, Strogatz S H 1998 Nature 393 440 doi: 10.1038/30918
[37]	Barabási A L, Albert R 1999 Science 286 509 doi: 10.1126/science.286.5439.509
[38]	Li J W, Shen Z S, Wang W X, Grebogi C, Lai Y C 2017 Phys. Rev. E 95 032303


1. Introduction

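Complex networks provide a powerful tool for modeling the intricate interactions among individuals in real systems, such as social networks^[1,2], gene networks^[3,4], and transportation networks^[5,6]. Analyzing the topology of a network can reveal its function and help infer its evolution^[7–10]. These studies, however, all assume that the topology is known. In practice it is usually difficult to obtain the complete topology of a network, so reconstruction methods are needed to infer the structure from collected data. Network reconstruction is therefore a prerequisite for studying network topology.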

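When the dynamical model on a network is known, dynamics on complex networks can be used to model and characterize the interactions and influence among individuals, for example the spread of epidemics in a population^[11,12] and the diffusion of rumors on networks^[13,14]. Because the dynamical relations among individuals also reflect the network topology, time series of a dynamical process observed at different times can be used to reconstruct the network, and many results have been obtained in this direction^[15–19]. Existing methods that reconstruct network structure from a known dynamical model usually combine the time series with the dynamical mechanism to construct a series of linear equation systems^[7,17,19–25]. The solution of each system corresponds to the local connections of one node, so solving the systems reconstructs the network. When the dynamical model is unknown, Bayesian inference can be applied^[26–28]: prior knowledge and observed data are used to build a posterior distribution over network structures, and the structure is reconstructed by maximum likelihood estimation. Deep learning can also be applied through automatic differentiation^[29,30]: the network structure is parameterized, a loss function relating the observed data to the structure is built, and the loss is optimized iteratively on the observed data to obtain the optimal structural parameters. Transfer learning^[31] can likewise be used, drawing on knowledge of network structure in a source domain to reconstruct the network in a target domain.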

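Among these methods, however, those based on linearization face a threshold truncation problem. In the solution of a linear equation system, the values for node pairs that are truly connected usually differ clearly from the values for pairs that are not. After a single system is solved, the connections between one node and the others can therefore be read directly from the histogram of the solution: the histogram typically shows a pronounced gap, and a threshold chosen inside this gap is used to decide whether a connection exists. This approach is called the threshold truncation method (TTM) and is widely used in network structure inference^[20–23]. It has two shortcomings: 1) if the values for connected and unconnected pairs are mixed together, the gap in the histogram no longer reflects the true edges, and the reconstructed network can differ greatly from the true structure; 2) when the true network structure is unknown, truncation can still produce a local structure for each node, but it gives no measure of how reconstructable each node is.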
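The truncation idea can be illustrated with a minimal Python sketch. It is not the implementation used in the cited works: it assumes that unconnected pairs give solved values near 0, that the threshold is placed at the midpoint of the widest gap between sorted magnitudes, and that the function name `threshold_truncate` and the sample values are made up for illustration.

```python
def threshold_truncate(values):
    """Split solved values at the midpoint of the widest gap in their magnitudes.

    Assumption: values for unconnected pairs lie near 0, so a larger magnitude
    is read as a connection.
    """
    magnitudes = sorted(abs(x) for x in values)
    # (gap width, index of the lower end of the gap) for each consecutive pair.
    gaps = [(magnitudes[k + 1] - magnitudes[k], k) for k in range(len(magnitudes) - 1)]
    _, k = max(gaps)
    threshold = 0.5 * (magnitudes[k] + magnitudes[k + 1])
    labels = [1 if abs(x) > threshold else 0 for x in values]
    return threshold, labels


# Made-up solution vector: three values near 0, two near -0.36.
solved = [0.02, -0.01, -0.35, 0.0, -0.37]
threshold, labels = threshold_truncate(solved)
```

When the two groups overlap, the widest gap no longer separates them, which is the first shortcoming described above; the rule also returns only hard labels, with no indication of how trustworthy they are.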

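To solve these problems, this paper proposes an undirected network reconstruction method based on a Gaussian mixture model. First, the solution of each linear equation system is clustered with a Gaussian mixture model^[32], and the clustering probabilities are used to describe how likely a connection is between nodes. Then, from the probabilities that each node is connected to the other nodes, a reconstructability index based on information entropy is defined to measure the reconstructability of each node. Finally, the method is applied to undirected networks: nodes with high reconstructability are selected, their connection probabilities are used as a training set, and the symmetry of undirected networks is used to infer the connection probabilities of the remaining nodes. Comparisons with existing reconstruction methods on synthetic and real data sets show that the proposed method reconstructs network structure more effectively.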

3. Undirected network reconstruction based on a Gaussian mixture model

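Once a linear equation system has been constructed for each node, the solution of each system is related to the connections between one node and the others. The solution alone, however, does not directly reveal these connections, so reconstructing the network still requires an effective way to decide whether each solved value indicates a connection or not. This paper therefore proposes an undirected network reconstruction method based on a Gaussian mixture model (UNRGMM), which turns this decision into a clustering problem and solves it with a Gaussian mixture model. The steps are described in detail below.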

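Solving the constructed linear equation system (12) yields the solution that describes the connections between node i and the other nodes. It is written as the vector $ {{\boldsymbol{{B}}^{i}}=\left(B_{j}^{i}\right)\in \mathbb{R}^{\left(N-1\right) \times 1}} $, where $ B_{j}^{i}= A_{j}^{i}\ln\left(1-{\lambda}^i\right) $. Because the true value of $ {A_{j}^{i}} $ is 0 or 1, the values in $ {{\boldsymbol{{B}}^{i}}} $ concentrate near 0 and near $ {\ln(1-\lambda^i)} $. It can therefore be assumed that $ {B_{j}^{i}} $ follows a Gaussian mixture distribution, that is,

$$ P\left(B_{j}^{i}|{\boldsymbol{\theta}}\right)=\sum_{k=1}^{2}\alpha_{k}\,\phi\left(B_{j}^{i}|\mu_{k}, \sigma^{2}_{k}\right), \tag{13} $$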

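where $ {\phi\left(x|\mu_{k}, \sigma^{2}_{k}\right)} $ is the Gaussian density with mean $ {\mu_{k}} $ and variance $ {\sigma^{2}_{k}} $, $ {{\boldsymbol{\theta}}=\left[\mu_{1}, \mu_{2}, \sigma^{2}_{1}, \sigma^{2}_{2}, \alpha_{1}, \alpha_{2}\right]^{{\mathrm{T}}}} $, and $ {\alpha_{1}+\alpha_{2}=1} $. Assuming that the values in $ {{\boldsymbol{{B}}^{i}}} $ are independent, the log-likelihood is

$$ \ln L\left({\boldsymbol{\theta}}\right)=\sum_{j \neq i}\ln\left[\sum_{k=1}^{2}\alpha_{k}\,\phi\left(B_{j}^{i}|\mu_{k}, \sigma^{2}_{k}\right)\right]. \tag{14} $$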
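To make the mixture assumption concrete, the following minimal Python sketch evaluates a two-component Gaussian mixture density and the log-likelihood of a solution vector under the independence assumption. The helper names (`normal_pdf`, `mixture_pdf`, `log_likelihood`), the value `lam = 0.3`, and the sample vector `B` are illustrative assumptions and are not taken from the paper.

```python
import math


def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, mus, variances, alphas):
    """Gaussian mixture density: weighted sum of the component densities."""
    return sum(a * normal_pdf(x, m, v) for a, m, v in zip(alphas, mus, variances))


def log_likelihood(values, mus, variances, alphas):
    """Log-likelihood of independent values under the mixture."""
    return sum(math.log(mixture_pdf(x, mus, variances, alphas)) for x in values)


lam = 0.3                          # hypothetical value of lambda^i
edge_level = math.log(1.0 - lam)   # where values for connected pairs concentrate
B = [0.01, -0.02, 0.0, edge_level + 0.03, edge_level - 0.01]  # made-up solution

# Parameters centred on 0 and ln(1 - lambda) explain the data far better
# than parameters whose second component sits in the wrong place.
good = log_likelihood(B, [0.0, edge_level], [0.01, 0.01], [0.6, 0.4])
bad = log_likelihood(B, [0.0, 1.0], [0.01, 0.01], [0.6, 0.4])
```

Because `edge_level` is negative for any 0 < λ < 1, the connected pairs form the cluster on the negative side of 0 in this setting.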

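For a Gaussian mixture model, the parameters can be optimized iteratively with the EM algorithm. The iteration formulas are

$$ \gamma_{jk}^{i}=\frac{\alpha_{k}\,\phi\left(B_{j}^{i}|\mu_{k}, \sigma^{2}_{k}\right)}{\sum_{l=1}^{2}\alpha_{l}\,\phi\left(B_{j}^{i}|\mu_{l}, \sigma^{2}_{l}\right)}, \tag{15} $$

$$ \mu_{k}=\frac{\sum_{j \neq i}\gamma_{jk}^{i}B_{j}^{i}}{\sum_{j \neq i}\gamma_{jk}^{i}}, \tag{16} $$

$$ \sigma^{2}_{k}=\frac{\sum_{j \neq i}\gamma_{jk}^{i}\left(B_{j}^{i}-\mu_{k}\right)^{2}}{\sum_{j \neq i}\gamma_{jk}^{i}}, \tag{17} $$

$$ \alpha_{k}=\frac{1}{N-1}\sum_{j \neq i}\gamma_{jk}^{i}, \tag{18} $$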

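where $ {\gamma_{jk}^{i}} $ represents the probability that i and j are not connected $ (k= 1) $ or are connected $ {\left(k=2\right)} $. Equations (15) to (18) are iterated until convergence, which gives the probability $ {\gamma_{j2}^{i}} $ that node i and node j are connected. For initialization, $ {\mu_{1}} $ is set to 0, and $ {\mu_{2}} $ is obtained from a histogram of $ {{\boldsymbol{{B}}^{i}}} $ as the mean of the position of the histogram minimum and the minimum value in $ {{\boldsymbol{{B}}^{i}}} $. The standard deviations $ {\sigma_{1}} $ and $ {\sigma_{2}} $ are both set to half the standard deviation of the values in $ {{\boldsymbol{{B}}^{i}}} $, and $ {\alpha_{1}} $ and $ {\alpha_{2}} $ are set to 0.8 and 0.2, respectively.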
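The EM procedure and the initialization described above can be sketched in plain Python as follows. This is a minimal illustration using the standard EM updates for a two-component one-dimensional Gaussian mixture, not the authors' code. Several details are assumptions: the "histogram minimum" is read as the centre of the least-populated bin, the bin count (10), the variance floor, the stopping rule, and the sample data are all chosen here for illustration.

```python
import math


def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def initial_mu2(values, bins=10):
    """Mean of the least-populated histogram bin centre and min(values).

    Assumes the values are not all identical; the bin count is an assumption.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in values:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    emptiest = counts.index(min(counts))
    centre = lo + (emptiest + 0.5) * width
    return 0.5 * (centre + lo)


def fit_gmm(values, iterations=200, tol=1e-10, var_floor=1e-8):
    """EM for a two-component 1-D mixture; returns (mus, variances, alphas, gammas)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    mus = [0.0, initial_mu2(values)]          # mu_1 = 0, mu_2 from the histogram
    variances = [(0.5 * std) ** 2] * 2        # sigma_k = half the overall std
    alphas = [0.8, 0.2]
    gammas = []
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        gammas = []
        for x in values:
            w = [a * normal_pdf(x, m, v) for a, m, v in zip(alphas, mus, variances)]
            total = sum(w)
            gammas.append([wk / total for wk in w])
        # M-step: re-estimate mean, variance and weight of each component.
        old_mus = list(mus)
        for k in range(2):
            nk = sum(g[k] for g in gammas)
            mus[k] = sum(g[k] * x for g, x in zip(gammas, values)) / nk
            var = sum(g[k] * (x - mus[k]) ** 2 for g, x in zip(gammas, values)) / nk
            variances[k] = max(var, var_floor)  # floor avoids a collapsed component
            alphas[k] = nk / n
        if max(abs(a - b) for a, b in zip(mus, old_mus)) < tol:
            break
    return mus, variances, alphas, gammas


# Made-up solution vector: eight values near 0 and three near -0.36.
B = [0.02, -0.01, 0.00, 0.03, -0.02, 0.01, -0.03, 0.02, -0.35, -0.37, -0.36]
mus, variances, alphas, gammas = fit_gmm(B)
edge_probs = [g[1] for g in gammas]  # probability of a connection for each entry
```

With these made-up values, the three entries near -0.36 receive connection probabilities close to 1 and the remaining entries receive probabilities close to 0.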

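Once the probability that each value in $ {{\boldsymbol{{B}}^{i}}} $ represents a connection has been obtained, these probabilities can be used to measure the reconstructability of node i. For example, if all the probabilities are close to 0 or 1, the reconstructability is considered high; conversely, if all the probabilities are near $ {0.5} $, it is hard to judge whether the values represent connections or not. Because $ {\gamma_{j1}^{i}+\gamma_{j2}^{i}=1} $, a reconstructability index $ {H(i)} $ based on information entropy is defined to measure the reconstructability of a node; the index of node i is given by Eq. (19).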

The smaller $ {H(i)} $ is, the higher the reconstructability of node i. When $ {H(i)=0} $, every node j satisfies $ \gamma_{j2}^{i}=0 $ or $ \gamma_{j2}^{i}=1 $.
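The explicit form of Eq. (19) is not reproduced in this text. The Python sketch below therefore assumes one form that is consistent with the description: the binary entropy of each connection probability, averaged over the other nodes, $ H(i)=-\frac{1}{N-1}\sum_{j \neq i}\sum_{k=1}^{2}\gamma_{jk}^{i}\ln\gamma_{jk}^{i} $. The natural logarithm, the normalization by N-1, and the function names are assumptions made here; only the qualitative behaviour (zero when every probability is 0 or 1, largest when probabilities are near 0.5) is taken from the text.

```python
import math


def binary_entropy(p):
    """Entropy of a two-outcome distribution (p, 1 - p), with 0*ln(0) taken as 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def reconstructability_index(edge_probs):
    """Assumed form of H(i): mean binary entropy of the connection probabilities."""
    return sum(binary_entropy(p) for p in edge_probs) / len(edge_probs)


# Hypothetical connection probabilities for two nodes.
confident = reconstructability_index([0.0, 1.0, 0.999, 0.001])
ambiguous = reconstructability_index([0.5, 0.45, 0.6, 0.5])
```

A node whose probabilities are all close to 0 or 1 gets an index near 0, while a node whose probabilities hover around 0.5 gets an index near ln 2 under this assumed normalization.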

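Equation (19) identifies which nodes have high reconstructability and which have low reconstructability. The method is now applied to undirected networks, using node reconstructability together with the symmetry of undirected networks to reconstruct the network structure.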

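Step 1: First, the probabilities $ {\gamma_{j2}^{i}} $ that each node is connected to the other nodes are used to compute its reconstructability index $ {H(i)} $. A threshold α (not to be confused with the mixture weights $ {\alpha_{k}} $) then divides the nodes into a set with high reconstructability and a set with low reconstructability: $ {\boldsymbol{V_{<\alpha}}}=\{v_{1}, v_{2}, \cdots, v_{k}\} $ and $ {{\boldsymbol{V_{>\alpha}}} = \left\{v_{k+1}, v_{k+2}, \cdots, v_{N}\right\}} $. Finally, for the nodes with high reconstructability, their connection probabilities $ {\gamma_{j2}^{i}\left(i \in {{\boldsymbol{V_{<\alpha}}}}, j \in 1, \cdots, N \right) } $ are taken as the final result, and $ {\gamma_{j2}^{i}\left(i \in {{\boldsymbol{V_{<\alpha}}}}\right)} $ is used to recompute and update the connection probabilities $ {\gamma_{j2}^{i}\left(i \in {{\boldsymbol{V_{>\alpha}}}}\right)} $ of the nodes with low reconstructability.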

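Step 2: Thresholding gives the connection probabilities of the nodes with high reconstructability. Next, the local structure of these nodes is reconstructed, and the symmetry of the undirected network is used to infer part of the connections of the nodes with low reconstructability. To this end, define a new adjacency matrix $ {\boldsymbol{\tilde{A}}} $. The network is assumed to have no self-loops, so $ {\tilde{A}_{j}^{i}} $ is 0 when $ {i=j} $. When $ {i \not= j} $, the value of $ {\tilde{A}_{j}^{i}} $ is determined by the reconstructability of nodes i and j and by the connection probabilities $ {\gamma_{j2}^{i}} $ and $ {\gamma_{i2}^{j}} $. When node i has high reconstructability and is more reconstructable than node j, $ {\tilde{A}_{j}^{i}} $ is determined by $ {\gamma_{j2}^{i}} $; when node j has high reconstructability and is more reconstructable than node i, $ {\tilde{A}_{j}^{i}} $ is determined by $ {\gamma_{i2}^{j}} $; when both nodes have low reconstructability, $ {\tilde{A}_{j}^{i}} $ is -1. Formally,

$$ \tilde{A}_{j}^{i}=\begin{cases} 0, & i=j, \\ \left[\gamma_{j2}^{i}\right], & i \in {\boldsymbol{V_{<\alpha}}},\ H(i) \leqslant H(j), \\ \left[\gamma_{i2}^{j}\right], & j \in {\boldsymbol{V_{<\alpha}}},\ H(j) < H(i), \\ -1, & i, j \in {\boldsymbol{V_{>\alpha}}}, \end{cases} \tag{20} $$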

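where $ [\cdot] $ denotes rounding to the nearest integer. The entries of $ {\boldsymbol{\tilde{A}}} $ other than -1 represent the part of the network structure that has been reconstructed.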
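Steps 1 and 2 can be sketched together in Python as below. The sketch follows the verbal rules above; the function name `partial_adjacency`, the tie-breaking by node index when two nodes have equal H, the treatment of a probability of exactly 0.5 as a connection, and all numerical values are assumptions made for illustration.

```python
def partial_adjacency(gamma2, H, alpha):
    """Build the partially reconstructed adjacency matrix.

    gamma2[i][j]: probability of an edge i-j obtained from node i's own solution.
    H[i]:         reconstructability index of node i (smaller is better).
    alpha:        threshold separating high (H < alpha) from low reconstructability.
    Entries between two low-reconstructability nodes are left as -1.
    """
    n = len(H)
    high = {i for i in range(n) if H[i] < alpha}
    A = [[0] * n for _ in range(n)]  # diagonal stays 0: no self-loops
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if i not in high and j not in high:
                A[i][j] = -1  # undecided, handled in Step 3
                continue
            # Trust the more reconstructable endpoint; ties go to the smaller
            # index (an assumption) so that the matrix stays symmetric.
            owner, other = (i, j) if (H[i], i) <= (H[j], j) else (j, i)
            A[i][j] = 1 if gamma2[owner][other] >= 0.5 else 0
    return A


# Hypothetical four-node example: nodes 0 and 1 are highly reconstructable.
H = [0.05, 0.10, 0.60, 0.70]
gamma2 = [
    [0.00, 0.98, 0.97, 0.02],
    [0.90, 0.00, 0.03, 0.95],
    [0.55, 0.45, 0.00, 0.50],
    [0.40, 0.52, 0.50, 0.00],
]
A_partial = partial_adjacency(gamma2, H, alpha=0.3)
```

In this example the uncertain estimates of nodes 2 and 3 are overridden by the estimates of nodes 0 and 1 through symmetry, and only the pair (2, 3) remains undecided.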

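Step 3: It remains to reconstruct the connections between any two nodes with low reconstructability, that is, the entries of $ {\boldsymbol{\tilde{A}}} $ equal to -1. The connections already reconstructed for the nodes with low reconstructability can serve as a training set for inferring the connections not yet reconstructed. For any node with low reconstructability $ {i \in {\boldsymbol{V_{>\alpha}}}} $, Eq. (20) shows which entries of the vector $ {\boldsymbol{\tilde{A}}^{i}}=[\tilde{A}_{1}^{i}, \cdots, \tilde{A}_{i-1}^{i}, \tilde{A}_{i+1}^{i}, \cdots, \tilde{A}_{N}^{i}]^{{\mathrm{T}}} $ are 0, 1, and -1. Let $ {\boldsymbol{\beta}}_{{1}} $ denote the set of $ {B_{j}^{i}} $ for which $ {\tilde{A}_{j}^{i}} $ is 0, and let $ {\boldsymbol{\beta}}_{{2}} $ denote the set of $ {B_{j}^{i}} $ for which $ {\tilde{A}_{j}^{i}} $ is 1. Using a Bayes classifier with the maximum a posteriori criterion, the probability that $ {B_{j}^{i}} $ with $ {i \not= j} $ belongs to the class without a connection or the class with a connection is

$$ P\left(w_{k}|B_{j}^{i}\right)=\frac{P\left(w_{k}\right)\phi\left(B_{j}^{i}|\mu_{k}, \sigma^{2}_{k}\right)}{\sum_{l=1}^{2}P\left(w_{l}\right)\phi\left(B_{j}^{i}|\mu_{l}, \sigma^{2}_{l}\right)}, \quad k=1, 2, \tag{21} $$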

where $ {w_{1}} $ and $ {w_{2}} $ denote $ {\tilde{A}_{j}^{i}=0} $ and $ {\tilde{A}_{j}^{i}=1} $, respectively. $ {P\left(w_{k}\right)} $, $ {\mu_{k}} $, and $ {\sigma^{2}_{k}} $ are trained from the known results: $ P\left(w_{k}\right)\;= {\left|{\boldsymbol{\beta}_{k}}\right|}/({\left|{\boldsymbol{\beta}}_{{1}}\right|+\left|{\boldsymbol{\beta}}_{{2}}\right|}) $, and $ {\mu_{k}} $ and $ {\sigma^{2}_{k}} $ are the mean and variance of the elements of $ {{\boldsymbol{\beta}_{k}}} $. If $ {P\left(w_{1}|B_{j}^{i}\right)<P\left(w_{2}|B_{j}^{i}\right)} $, node j is a neighbor of node $ {{i}} $ and $ {\tilde{A}_{j}^{i}=1} $; otherwise $ {\tilde{A}_{j}^{i}=0} $. The resulting adjacency matrix $ {\boldsymbol{\tilde{A}}} $ is taken as the reconstructed network.
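The Bayes classification in Step 3 can be illustrated with the following Python sketch for a single node. It is a simplified reading of the step, not the authors' implementation: the function names, the use of the population variance (division by the set size), the variance floor, and the sample sets `beta1` and `beta2` are assumptions made here.

```python
import math


def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_class(values, var_floor=1e-8):
    """Mean and (population) variance of one training set; the floor is an assumption."""
    mu = sum(values) / len(values)
    var = sum((x - mu) ** 2 for x in values) / len(values)
    return mu, max(var, var_floor)


def posterior_edge(b, beta1, beta2):
    """Posterior probability that the solved value b belongs to the edge class w_2."""
    n1, n2 = len(beta1), len(beta2)
    priors = [n1 / (n1 + n2), n2 / (n1 + n2)]          # P(w_k) = |beta_k| / total
    params = [fit_class(beta1), fit_class(beta2)]      # (mu_k, sigma_k^2)
    joint = [p * normal_pdf(b, mu, var) for p, (mu, var) in zip(priors, params)]
    return joint[1] / (joint[0] + joint[1])


# Hypothetical training sets for one low-reconstructability node:
beta1 = [0.02, -0.01, 0.03, -0.03, 0.0]   # values already labelled "no edge"
beta2 = [-0.35, -0.38, -0.33]             # values already labelled "edge"

p_far = posterior_edge(-0.30, beta1, beta2)   # undecided value close to the edge cluster
p_near = posterior_edge(0.01, beta1, beta2)   # undecided value close to 0
decisions = [1 if p > 0.5 else 0 for p in (p_far, p_near)]
```

Comparing the two posteriors is equivalent to checking whether the edge posterior exceeds 0.5, which is how `decisions` is formed here.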

5. Conclusion

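Network reconstruction has long been an important scientific problem in complex network research. Previous work on reconstruction has mostly focused on how to construct the linear equation systems effectively and infer the network structure by solving them, while neglecting the need for an effective and principled relation between the solutions and the network structure. This paper proposes an undirected network reconstruction method based on a Gaussian mixture model, which establishes a mapping from the solutions to the network structure. The Gaussian mixture model turns the inference of connections between nodes into a clustering problem on the solutions, and it provides an index for measuring the reconstructability of each node when the true network structure is unknown. When the method is applied to undirected networks, the symmetry of the network can further improve the reconstruction. Experiments on synthetic and real data, with different dynamical models and different ways of constructing the linear equation systems, show the advantage of the proposed method over previous truncation-based reconstruction. However, the method applies mainly to the truncation of linearized reconstruction results. Other approaches, such as statistical inference, also face a truncation problem after predicting edge probabilities, and how to address it will be the subject of our future work.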
