Cyber Security Event Analysis: Have You Learned It?

Overview

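Cyber security events, collected as a form of cyber threat intelligence (CTI), can be used to counter cyber attacks. A cyber security event analysis model that predicts likely threats can give an organisation guidance for its decisions. A cyber security event is a complete semantic unit containing all participating objects, and those objects carry rich attributes (such as the outcome and category of an attack). Analysing such events therefore helps predict the threats an organisation may face.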

1. Introduction

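Because sophisticated zero-day attacks keep increasing, keeping an organisation's systems secure is very difficult [1]. To counter these attacks, organisations rely on external public reports to collect and share security information [2]. As one kind of cyber threat intelligence (CTI), cyber security events drawn from external reports are evidence-based knowledge about existing or emerging threats to assets. Many projects, such as VCDB [3], Hackmageddon [4], and the Web Hacking Incident Database [5], are currently used to share security event information. Figure 1 gives an example of a cyber security event.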

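Developing a cyber security event analysis model that predicts the threats an organisation may face is of great value for tracking attack trends and guiding decisions [6]. Organisations need to make full use of such a model to better understand the current threat situation, for example "which assets in the organisation are more likely to be compromised", "who are the organisation's potential attackers", "what types of attack might they launch against the organisation", and "how might the threats be discovered".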

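Figure 1. An example of a cyber attack event: a victim organisation suffers a malware backdoor attack from attacker a1, who steals the organisation's sensitive files.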

2. Cyber Security Event Modeling and Applications

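Security information sharing has become a new weapon for mitigating cyber attacks. Security event sharing projects such as VCDB [3], Hackmageddon [4], and the Web Hacking Incident Database [5] are used to collect cyber security incident reports. However, these report-based projects are designed to collect event information and cannot analyse threat information.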

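Machine learning methods play an important role in analysing organisations' reports of cyber security events [10]. Liu et al. [11] collect externally observable network features of organisations and use a random forest classifier to predict breach incidents. Sarabi et al. [12] use a random forest approach with publicly available business details to predict data breach risk. Fang et al. [13] propose a statistical framework for modeling and forecasting multivariate time series. These methods predict the risk of security events through statistical analysis of measurable features.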

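Many graph-based methods have been proposed to analyse the heterogeneous objects in cyber security events and the relationships among them. Zhao et al. [7] build an attributed heterogeneous information network from attack events, modeling heterogeneous objects such as attackers, vulnerabilities, exploited scripts, compromised devices, and compromised platforms, and use that network to predict cyber attack preferences. HinCTI [8] models cyber threat intelligence and identifies threat types in order to reduce the heavy analysis workload of security analysts. Zhao et al. [9] propose a framework that models the interdependencies among heterogeneous IOCs in order to quantify their relevance.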

3. Network Representation Learning

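Cyber security events contain a large number of objects of multiple types, which together form a heterogeneous information network. Network representation learning embeds the nodes of a network into a low-dimensional space so that machine learning methods can be applied to them.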
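To make the idea of a heterogeneous information network concrete, the following minimal sketch stores typed nodes and typed edges for the Figure 1 example. The node ids, type names, relation names, and helper functions are all hypothetical and chosen only for illustration; the article does not prescribe a data structure.

```python
from collections import defaultdict

# Typed nodes: node id -> node type. Ids and type names are hypothetical,
# loosely following the Figure 1 example.
node_types = {
    "org_1": "victim_organisation",
    "a1": "attacker",
    "malware_backdoor": "attack_type",
    "sensitive_files": "asset",
}

# Typed edges: (source, relation, target). Relation names are illustrative.
edges = [
    ("a1", "uses", "malware_backdoor"),
    ("malware_backdoor", "targets", "org_1"),
    ("a1", "steals", "sensitive_files"),
    ("org_1", "owns", "sensitive_files"),
]


def build_adjacency(edge_list):
    """Build an undirected adjacency map from typed edges."""
    adjacency = defaultdict(set)
    for source, _relation, target in edge_list:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def neighbours_of_type(adjacency, types, node, wanted_type):
    """Return the sorted neighbours of `node` whose type is `wanted_type`."""
    return sorted(n for n in adjacency[node] if types[n] == wanted_type)


adjacency = build_adjacency(edges)
# Which assets are directly linked to the victim organisation?
print(neighbours_of_type(adjacency, node_types, "org_1", "asset"))
```

Keeping the node type next to each node is what distinguishes this from a homogeneous graph: queries and embeddings can treat attackers, assets, and attack types differently.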

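One branch of structural node embedding methods is inspired by the Skip-gram model (originally used for word embeddings) [14]. DeepWalk [15] was the first to use random walks [16] to sample paths from a network and learn object embeddings from them. LINE [17] optimises an objective that preserves both the first-order and second-order proximity of the network. Node2vec [18] extends DeepWalk with biased random walks that explore diverse neighbourhoods. Struc2vec [19] builds a multi-layer graph to encode nodes that have similar structure but are not necessarily adjacent. All of these works model relationships between pairs of objects.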
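The walk-based sampling that DeepWalk relies on can be sketched in a few lines. This is a minimal illustration, not the implementation from [15]: the toy graph, the function names `random_walk` and `skipgram_pairs`, and the parameter values are assumptions made for the example. It samples uniform random walks and turns each walk into (centre, context) pairs of the kind a Skip-gram model would be trained on; the training step itself is omitted.

```python
import random


def random_walk(adjacency, start, walk_length, rng):
    """Sample one uniform random walk containing at most `walk_length` nodes."""
    walk = [start]
    while len(walk) < walk_length:
        neighbours = adjacency[walk[-1]]
        if not neighbours:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbours))
    return walk


def skipgram_pairs(walk, window):
    """Return (centre, context) pairs for nodes within `window` steps of each other."""
    pairs = []
    for i, centre in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((centre, walk[j]))
    return pairs


# Hypothetical toy graph given as adjacency lists.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

rng = random.Random(0)  # fixed seed so that repeated runs sample the same walks
walks = [random_walk(adjacency, node, 5, rng) for node in adjacency for _ in range(2)]
pairs = [pair for walk in walks for pair in skipgram_pairs(walk, window=2)]
print(len(walks), len(pairs))
```

Node2vec would replace the uniform `rng.choice` with a biased choice over neighbours; that variant is not shown here.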

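To capture multiple interactions as a whole, an event [20] is defined as a complete semantic unit. HEBE [20] preserves object proximity by learning the relationships between objects and events in a heterogeneous information network. Event2vec [21] takes into account the number and nature of the relations within an event and preserves event-driven first-order and second-order proximity in the embedding space. Event-based modeling encapsulates more information, which is especially important for security event analysis.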
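A small example shows what is lost when events are broken into pairwise relations. In the sketch below, the three events and all object names are hypothetical, and the helper functions are illustrative rather than taken from HEBE or Event2vec. After the events are expanded into pairwise edges, the objects `a1`, `backdoor`, and `org_2` form a triangle, even though no single event contains all three.

```python
from itertools import combinations

# Each event is a set of participating objects (hypothetical data).
events = {
    "e1": {"a1", "backdoor", "org_1"},
    "e2": {"a1", "phishing", "org_2"},
    "e3": {"a2", "backdoor", "org_2"},
}


def pairwise_edges(event_map):
    """Expand every event into the set of undirected pairs of its members."""
    edges = set()
    for members in event_map.values():
        for u, v in combinations(sorted(members), 2):
            edges.add((u, v))
    return edges


def forms_clique(edges, objects):
    """True if every pair among `objects` is linked in the pairwise graph."""
    return all((u, v) in edges for u, v in combinations(sorted(objects), 2))


def shared_events(event_map, objects):
    """Return the ids of events that contain all of `objects`."""
    wanted = set(objects)
    return sorted(eid for eid, members in event_map.items() if wanted <= members)


edges = pairwise_edges(events)
triple = {"a1", "backdoor", "org_2"}
print(forms_clique(edges, triple))   # the pairwise view suggests a joint interaction
print(shared_events(events, triple)) # the event view shows there is none
```

Keeping the event as the unit of modeling avoids this kind of spurious joint interaction, which is the motivation the paragraph above gives for event-based embeddings.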

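Attributed network embedding handles node attributes effectively to learn better representations. A typical example is SNE [22], which preserves both structural and attribute proximity for social actors with rich attributes. The BANE model [23] aggregates node attribute and link information from neighbouring nodes to learn binary node representations.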

4. CyEvent2vec, a Cyber Security Event Analysis Framework

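The architecture of the cyber security event modeling framework CyEvent2vec [24] is shown in Figure 2. The framework's pipeline consists of four main components: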

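Data processing and feature extraction: attributed objects, their relationships, and labels are extracted from cyber security events, including victim organisation, asset, attack type, and attacker nodes.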

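Organisation event and matrix generation: the organisation event generation algorithm takes each enterprise that suffered cyber incidents as the target and gathers the related security objects together. An attributed heterogeneous information network can then be constructed from the generated organisation events. The organisation events are processed into an event matrix that represents the relationships between attack events and attributed objects.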

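Cyber security event modeling: to explore the complex relationships among objects, the event matrix is fed into an autoencoder model to obtain event embeddings that preserve the proximity of events in a low-dimensional space. Object embeddings can then be computed from the event embeddings.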

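Applications of security event analysis: the object embeddings are applied to organisation threat prediction and threat object classification. Organisation threat prediction helps analysts predict the threats a victim organisation may face and can be treated as a link prediction task. Threat object classification predicts the methods by which a threat might be discovered and can be treated as a multi-label classification task.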
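The organisation event and matrix generation step can be illustrated as follows. This is only a sketch under stated assumptions, not the algorithm from [24]: the incident records, the field names, and the functions `build_organisation_events` and `build_event_matrix` are hypothetical. Incidents are grouped by victim organisation, and each organisation event becomes one binary row recording which objects took part in it.

```python
# Hypothetical incident records; the field names are assumptions for this example.
incidents = [
    {"victim": "org_1", "attacker": "a1", "attack_type": "malware", "asset": "file_server"},
    {"victim": "org_1", "attacker": "a2", "attack_type": "phishing", "asset": "mail_server"},
    {"victim": "org_2", "attacker": "a1", "attack_type": "malware", "asset": "database"},
]

# Fields whose values are treated as security objects.
OBJECT_FIELDS = ("attacker", "attack_type", "asset")


def build_organisation_events(records):
    """Group the (type, value) objects of all incidents by victim organisation."""
    events = {}
    for record in records:
        members = events.setdefault(record["victim"], set())
        for field in OBJECT_FIELDS:
            members.add((field, record[field]))
    return events


def build_event_matrix(events):
    """Return row labels, column labels, and a binary event-by-object matrix."""
    rows = sorted(events)
    columns = sorted({obj for members in events.values() for obj in members})
    matrix = [[1 if obj in events[row] else 0 for obj in columns] for row in rows]
    return rows, columns, matrix


organisation_events = build_organisation_events(incidents)
rows, columns, matrix = build_event_matrix(organisation_events)
for label, row in zip(rows, matrix):
    print(label, row)
```

Prefixing each object with its type keeps, for example, an attacker and an asset with the same name in separate columns.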

Figure 2. The cyber security event analysis framework
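As a rough illustration of the modeling and application steps, the sketch below trains a tiny autoencoder on a toy event matrix and then derives object embeddings from the event embeddings. Several simplifications are assumptions of this example and not details from [24]: the autoencoder is linear and has no bias terms, an object embedding is taken to be the mean of the embeddings of the events that contain the object, and candidate objects are ranked for an event by cosine similarity as a stand-in for link prediction scoring. The matrix values and all function names are hypothetical.

```python
import math
import random


def encode(x, w_enc):
    """Project one event row into the embedding space."""
    return [sum(x[j] * w_enc[j][a] for j in range(len(x))) for a in range(len(w_enc[0]))]


def decode(h, w_dec):
    """Reconstruct an event row from its embedding."""
    return [sum(h[a] * w_dec[a][j] for a in range(len(h))) for j in range(len(w_dec[0]))]


def reconstruction_loss(matrix, w_enc, w_dec):
    """Mean over events of half the squared reconstruction error."""
    total = 0.0
    for x in matrix:
        y = decode(encode(x, w_enc), w_dec)
        total += 0.5 * sum((yj - xj) ** 2 for yj, xj in zip(y, x))
    return total / len(matrix)


def train_autoencoder(matrix, dim, epochs=1000, lr=0.05, seed=0):
    """Full-batch gradient descent on a linear autoencoder (a simplification)."""
    rng = random.Random(seed)
    n, d = len(matrix), len(matrix[0])
    w_enc = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(d)]
    w_dec = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(dim)]
    for _ in range(epochs):
        g_enc = [[0.0] * dim for _ in range(d)]
        g_dec = [[0.0] * d for _ in range(dim)]
        for x in matrix:
            h = encode(x, w_enc)
            err = [yj - xj for yj, xj in zip(decode(h, w_dec), x)]
            dh = [sum(err[j] * w_dec[a][j] for j in range(d)) for a in range(dim)]
            for a in range(dim):
                for j in range(d):
                    g_dec[a][j] += h[a] * err[j] / n
                    g_enc[j][a] += x[j] * dh[a] / n
        for a in range(dim):
            for j in range(d):
                w_dec[a][j] -= lr * g_dec[a][j]
                w_enc[j][a] -= lr * g_enc[j][a]
    return w_enc, w_dec


def object_embeddings(matrix, event_emb):
    """Assumed aggregation: average the embeddings of the events containing each object."""
    dim = len(event_emb[0])
    result = []
    for j in range(len(matrix[0])):
        hits = [event_emb[i] for i in range(len(matrix)) if matrix[i][j]]
        result.append([sum(e[a] for e in hits) / len(hits) for a in range(dim)] if hits else [0.0] * dim)
    return result


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


# Toy event matrix: rows are organisation events, columns are objects (hypothetical).
event_matrix = [
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 1],
]
# With epochs=0 the function returns the untrained weights for the same seed.
initial_loss = reconstruction_loss(event_matrix, *train_autoencoder(event_matrix, 2, epochs=0))
w_enc, w_dec = train_autoencoder(event_matrix, 2)
final_loss = reconstruction_loss(event_matrix, w_enc, w_dec)
event_emb = [encode(row, w_enc) for row in event_matrix]
obj_emb = object_embeddings(event_matrix, event_emb)
# Rank all object columns for the first event, most similar first.
ranking = sorted(range(len(obj_emb)), key=lambda j: cosine(event_emb[0], obj_emb[j]), reverse=True)
print(initial_loss, final_loss, ranking)
```

A practical model would normally use nonlinear layers and a held-out set of links for evaluation; the linear version is used here only because it fits in a few lines of dependency-free code.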

5. Conclusion

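In this article we focused on cyber security event analysis, with the aim of predicting the threats an organisation may face. Cyber security events contain a large number of interacting objects of multiple types, which together form a heterogeneous information network. Network representation learning embeds the nodes of that network into a low-dimensional space, so that machine learning techniques can be used to analyse cyber security events.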

References

[1] N. Sun, J. Zhang, P. Rimba, S. Gao, L. Y. Zhang, and Y. Xiang, “Data-driven cybersecurity incident prediction: A survey,” IEEE communications surveys & tutorials, vol. 21, no. 2, pp. 1744–1772, 2018.

[2] I. Sarhan and M. Spruit, “Open-cykg: An open cyber threat intelligence knowledge graph,” Knowledge-Based Systems, vol. 233, p. 107524, 2021.

[3] VERIS, “Veris community database (vcdb),” http://veriscommunity.net/index.html.

[4] Hackmageddon, https://www.hackmageddon.com.

[5] WASC, “Web-hacking-incident-database,” http://projects.webappsec.org/w/page/13246995/Web-Hacking-Incident-Database.

[6] K. Shu, A. Sliva, J. Sampson, and H. Liu, “Understanding cyber attack behaviors with sentiment information on social media,” in International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation. Springer, 2018, pp. 377–388.

[7] J. Zhao, X. Liu, Q. Yan, B. Li, M. Shao, H. Peng, and L. Sun, “Automatically predicting cyber attack preference with attributed heterogeneous attention networks and transductive learning,” computers & security, vol. 102, p. 102152, 2021.

[8] Y. Gao, L. Xiaoyong, P. Hao, B. Fang, and P. Yu, “Hincti: A cyber threat intelligence modeling and identification system based on heterogeneous information network,” IEEE Transactions on Knowledge and Data Engineering, 2020.

[9] J. Zhao, Q. Yan, X. Liu, B. Li, and G. Zuo, “Cyber threat intelligence modeling based on heterogeneous graph convolutional network,” in 23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020), 2020, pp. 241–256.

[10] D. Sun, Z. Wu, Y. Wang, Q. Lv, and B. Hu, “Cyber profiles based risk prediction of application systems for effective access control,” in 2019 IEEE Symposium on Computers and Communications (ISCC). IEEE, 2019, pp. 1–7.

[11] Y. Liu, A. Sarabi, J. Zhang, P. Naghizadeh, M. Karir, M. Bailey, and M. Liu, “Cloudy with a chance of breach: Forecasting cyber security incidents,” in 24th USENIX Security Symposium (USENIX Security 15), 2015, pp. 1009–1024.

[12] A. Sarabi, P. Naghizadeh, Y. Liu, and M. Liu, “Risky business: Fine-grained data breach prediction using business profiles,” Journal of Cybersecurity, vol. 2, no. 1, pp. 15–28, 2016.

[13] Z. Fang, M. Xu, S. Xu, and T. Hu, “A framework for predicting data breach risk: Leveraging dependence to cope with sparsity,” IEEE Transactions on Information Forensics and Security, vol. 16, pp. 2186–2201, 2021.

[14] W. Cheng, C. Greaves, and M. Warren, “From n-gram to skipgram to concgram,” International journal of corpus linguistics, vol. 11, no. 4, pp. 411–433, 2006.

[15] B. Perozzi, R. Al-Rfou, and S. Skiena, “Deepwalk: Online learning of social representations,” in Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 701–710.

[16] F. Göbel and A. Jagers, “Random walks on graphs,” Stochastic processes and their applications, vol. 2, no. 4, pp. 311–336, 1974.

[17] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, “Line: Large-scale information network embedding,” in Proceedings of the 24th international conference on world wide web, 2015, pp. 1067–1

[18] A. Grover and J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 2016, pp. 855–864.

[19] L. F. Ribeiro, P. H. Saverese, and D. R. Figueiredo, “struc2vec: Learning node representations from structural identity,” in Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, 2017, pp. 385–394.

[20] H. Gui, J. Liu, F. Tao, M. Jiang, B. Norick, L. Kaplan, and J. Han, “Embedding learning with events in heterogeneous information networks,” IEEE transactions on knowledge and data engineering, vol. 29, no. 11, pp. 2428–2441, 2017.

[21] G. Fu, B. Yuan, Q. Duan, and X. Yao, “Representation learning for heterogeneous information networks via embedding events,” in International Conference on Neural Information Processing. Springer, 2019, pp. 327–339.

[22] L. Liao, X. He, H. Zhang, and T.-S. Chua, “Attributed social network embedding,” IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 12, pp. 2257–2270, 2018.

[23] H. Yang, S. Pan, P. Zhang, L. Chen, D. Lian, and C. Zhang, “Binarized attributed network embedding,” in 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 2018, pp. 1476–1481.

[24] X. Ma, L. Q. Wang, et al., “CyEvent2vec: Attributed Heterogeneous Information Network based Event Embedding Framework for Cyber Security Events Analysis,” IJCNN, 2022.
