Semantic Summarization of Reconstructed Abstract Meaning Representation Graph Structure Based on Integer Linear Programming

CHEN Hongchang; MING Tuosiyu; LIU Shuxin; GAO Chao

doi:10.11999/JEIT180720

cstr: 32379.14.JEIT180720

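National Digital Switching System Engineering Technological Research Center, Zhengzhou 450002, China

Funds: National Natural Science Foundation of China (61521003); National Natural Science Foundation of China Youth Science Fund (61601513)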

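Author biographies:
CHEN Hongchang: male, born in 1964, professor and doctoral supervisor. Research interests: communication and information engineering, network big data.

MING Tuosiyu: male, born in 1994, master's student. Research interests: network big data, text summarization.

LIU Shuxin: male, born in 1987, assistant researcher. Research interests: network big data, complex networks.

GAO Chao: male, born in 1982, assistant researcher. Research interests: network big data, computer vision.

Corresponding author:
MING Tuosiyu, 1139446336@qq.com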

Chinese Library Classification (CLC) number: TP391.1
Publication history
- Received: 2018-07-18
- Revised: 2018-10-26
- Published online: 2018-11-19
- Issue date: 2019-07-01

Abstract

Abstract: To address the incomplete semantic structure that arises when a summary subgraph is predicted from an Abstract Meaning Representation (AMR) graph, this paper proposes a semantic summarization algorithm that reconstructs the AMR graph structure with Integer Linear Programming (ILP). First, the text data are preprocessed to generate an AMR total graph. Then, the important nodes of the summary subgraph are extracted from the AMR total graph based on statistical features. Finally, ILP is applied to reconstruct the relationships between the nodes of the summary subgraph, and the completed summary subgraph is used to generate the semantic summary. Experimental results show that, compared with other semantic summarization methods, the ROUGE and Smatch scores of the proposed method improve significantly, by up to 9% and 14% respectively, which indicates that the method improves the quality of semantic summaries.

Keywords:
- Abstract Meaning Representation (AMR) graph
- Semantic summarization
- Summary subgraph
- Semantic structure
- Integer Linear Programming (ILP)
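The last step of the abstract, reconstructing the relations among the selected nodes, can be illustrated with a small sketch. This is not the paper's formulation: the objective (maximize the total edge score), the constraints (both endpoints selected, at most `max_edges` edges, a connected result), the scores, and all names below are assumptions made for illustration. The 0/1 program is solved by exhaustive enumeration rather than by an ILP solver, which is only practical for toy graphs.

```python
from itertools import product


def is_connected(nodes, edges):
    """Return True if `edges` join all `nodes` into one undirected component.

    Every edge endpoint is expected to be a member of `nodes`.
    """
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for head, tail in edges:
        adjacency[head].add(tail)
        adjacency[tail].add(head)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen == nodes


def reconstruct_edges(selected_nodes, scored_edges, max_edges):
    """Choose the 0/1 edge assignment with the highest total score.

    Assumed constraints: both endpoints are selected nodes, at most
    `max_edges` edges are kept, and the kept edges connect all selected nodes.
    Returns (None, -inf) when no assignment is feasible.
    """
    candidates = [
        (h, t) for (h, t) in scored_edges
        if h in selected_nodes and t in selected_nodes
    ]
    best_edges, best_score = None, float("-inf")
    # Enumerate every 0/1 vector over the candidate edges (toy sizes only).
    for assignment in product((0, 1), repeat=len(candidates)):
        chosen = [e for e, x in zip(candidates, assignment) if x]
        if len(chosen) > max_edges or not is_connected(selected_nodes, chosen):
            continue
        score = sum(scored_edges[e] for e in chosen)
        if score > best_score:
            best_edges, best_score = chosen, score
    return best_edges, best_score


# Hypothetical concepts and scores, not taken from the paper.
selected = {"chase-01", "dog", "cat", "garden"}
scores = {
    ("chase-01", "dog"): 0.9,
    ("chase-01", "cat"): 0.8,
    ("chase-01", "garden"): 0.4,
    ("dog", "garden"): -0.2,
    ("see-01", "dog"): 0.7,  # ignored: "see-01" is not a selected node
}
print(reconstruct_edges(selected, scores, max_edges=3))
```

In this toy case the three positively scored edges around `chase-01` are kept. A practical implementation would hand the same variables and constraints to an ILP solver instead of enumerating assignments.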

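Figure 1 Framework of the algorithm

Figure 2 AMR graph representation of the English sentence "I saw Joe's dog, which was running in the garden"

Figure 3 Merging AMR graphs into the AMR total graph

Figure 4 Comparison between the AMR graph of the experimental result and the AMR graph of the reference summary

Figure 5 Effect of the value of L on the summary quality metrics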

Table 1 Prediction accuracy of summary subgraph nodes and edges (%)

Element	Precision (P)	Recall (R)	F1
Nodes	71.4	82.5	76.5
Edges	45.6	60.1	51.9
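The F1 values in Table 1 agree with the usual harmonic mean of precision and recall. The page does not state the formula, so the definition below is an assumption, and `f1_score` is a name introduced here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# P and R values from Table 1, in percent.
for element, p, r in [("nodes", 71.4, 82.5), ("edges", 45.6, 60.1)]:
    print(element, round(f1_score(p, r), 1))
```

Rounded to one decimal place, the two results match the F1 column of the table.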

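Table 2 Performance comparison of different semantic summarization algorithms

Algorithm	ROUGE-1	ROUGE-2	ROUGE-W	Smatch
External semantic resources	20.4	5.6	14.3	17.8
Semantic clustering	21.2	6.0	15.2	19.1
Latent semantic analysis	22.8	6.8	14.9	20.5
TextRank	25.7	8.1	16.8	24.6
PAS semantic graph	26.5	9.6	18.6	28.9
Proposed method	29.3	10.4	19.6	32.1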

Table 3 Summary quality with and without ILP

Setting	ROUGE-1	ROUGE-2	ROUGE-W	Smatch
Without ILP	29.1	9.8	18.7	29.7
With ILP	29.3	10.4	19.6	32.1
Improvement	0.2	0.6	0.9	2.4

Table 4 Performance comparison with a deep learning algorithm

Method	ROUGE-1	ROUGE-2	ROUGE-W	Smatch
Proposed method	29.3	10.4	19.6	32.1
Deep learning	33.4	13.6	24.8	26.7
