6/16/2023

Similarity meter

In the current Internet era, patents, as a carrier for recording human achievement, contain a large amount of scientific and technological results and innovative technology. The rapid development of science and technology has sharply increased the number of patent applications filed each year. Traditional retrieval matches on search terms and usually takes the number of term occurrences as the measure of a patent's relevance, without considering the semantic information contained in the patent description itself. The essence of patent examination is finding related patents that are highly similar to the patent under examination, and the most important part of that task is calculating patent text similarity.

The usual approach to text similarity represents each text in a vector space model and then calculates vector similarity in that space directly as the text similarity. In recent years, ontology, as a new form of knowledge representation and description, has been widely applied in areas such as the semantic web and information retrieval, and more and more researchers have begun to pay attention to semantic analysis using ontologies.

The present invention relates to a method for calculating the similarity of Chinese patent texts, comprising: segmenting the text into words; calculating TF-IDF values for the segmentation result and extracting the words with the highest TF-IDF values as keywords; locating the sentences that contain keywords and taking them as key sentences, which yields a key sentence set for each text, with the maximum keyword weight in a key sentence used as the weight of that sentence; calculating the weight of each key sentence with respect to its text; and selecting key sentences from the text to be compared and from the comparison text in turn, then calculating the text similarity from the sentence similarity of the key sentences.

The invention uses an existing patent-domain ontology to analyze the semantic relations in patent text, and combines the vector space model with the domain ontology to calculate patent text similarity. The results have higher precision and recall, describe the degree of similarity between patents more accurately, can speed up patent examination, and meet the needs of practical application well.
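The steps listed in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: all function names are hypothetical, a whitespace split stands in for a real Chinese word segmenter, plain bag-of-words cosine stands in for the ontology-based sentence similarity, and the rule for combining sentence similarities into a text similarity (weighted best match, averaged over both directions) is an assumption, since the source does not state the formula.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Split on common Chinese and English sentence terminators.
    return [s.strip() for s in re.split(r"[。！？.!?]", text) if s.strip()]


def tokenize(sentence):
    # ASSUMPTION: whitespace split stands in for a Chinese word segmenter.
    return sentence.lower().split()


def tfidf(docs_tokens):
    # One {term: weight} dict per document, using a smoothed IDF.
    n = len(docs_tokens)
    df = Counter(t for toks in docs_tokens for t in set(toks))
    result = []
    for toks in docs_tokens:
        tf = Counter(toks)
        total = max(len(toks), 1)
        result.append({
            t: (c / total) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return result


def key_sentences(text, weights, top_k):
    # Keywords are the top_k terms by TF-IDF; a key sentence is any sentence
    # containing a keyword, weighted by its strongest keyword.
    keywords = set(sorted(weights, key=weights.get, reverse=True)[:top_k])
    keys = []
    for sentence in split_sentences(text):
        toks = tokenize(sentence)
        hits = [weights[t] for t in toks if t in keywords]
        if hits:
            keys.append((toks, max(hits)))
    total = sum(w for _, w in keys)
    # Normalise so the key sentence weights of one text sum to 1.
    return [(toks, w / total) for toks, w in keys] if total else []


def cosine(a_tokens, b_tokens):
    # ASSUMPTION: bag-of-words cosine replaces the ontology-based measure.
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def directed_similarity(keys_a, keys_b):
    # Each key sentence of A is matched to its most similar key sentence of B.
    return sum(w * max(cosine(s, t) for t, _ in keys_b) for s, w in keys_a)


def patent_similarity(text_a, text_b, background=(), top_k=5):
    # background: optional extra texts used only for document frequencies.
    texts = [text_a, text_b, *background]
    docs = [[t for s in split_sentences(x) for t in tokenize(s)] for x in texts]
    weights = tfidf(docs)
    keys_a = key_sentences(text_a, weights[0], top_k)
    keys_b = key_sentences(text_b, weights[1], top_k)
    if not keys_a or not keys_b:
        return 0.0
    # ASSUMPTION: average both directions to make the score symmetric.
    return 0.5 * (directed_similarity(keys_a, keys_b)
                  + directed_similarity(keys_b, keys_a))
```

Calling `patent_similarity(text_a, text_b)` returns a score between 0 and 1. For real Chinese patent text, `tokenize` would need to be replaced by a proper segmenter, and `cosine` by a measure that consults the domain ontology, which is the part the abstract credits for the improved precision and recall.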