WO2015003143A3 - Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus - Google Patents

Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus

Info

Publication number
WO2015003143A3
Authority
WO
WIPO (PCT)
Prior art keywords
rhetorical
corpus
relations
discourse
annotated corpus
Prior art date
Application number
PCT/US2014/045432
Other languages
French (fr)
Other versions
WO2015003143A2 (en)
Inventor
Blake HOWALD
Andrew NYSTROM
Original Assignee
Thomson Reuters Global Resources
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Reuters Global Resources
Priority to AU2014285073A (patent AU2014285073B9)
Priority to CA2917153A (patent CA2917153C)
Publication of WO2015003143A2
Publication of WO2015003143A3

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31: Indexing; Data structures therefor; Storage structures

Abstract

The present invention provides a method and system for predicting implicit rhetorical relations between two spans of text, e.g., in a large annotated corpus such as the Penn Discourse Treebank ("PDTB"), the Rhetorical Structure Theory corpus, or the Discourse Graph Bank, and is particularly directed to determining a rhetorical relation in the absence of an explicit discourse marker. Surface-level features may be used to capture the pragmatic information that the absent marker would otherwise encode. In one approach, a simplified feature set based only on raw text and semantic dependencies is used to improve performance for all relations. By using surface-level features to predict implicit rhetorical relations for the large annotated corpus, the invention approaches a theoretical maximum performance, suggesting that more data will not necessarily improve performance based on these and similarly situated features.
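As a rough illustration of what predicting a relation from surface features of two text spans can look like, the following Python sketch trains a small Naive Bayes classifier on span-tagged words and cross-span word pairs. It is not the claimed method: the feature design, the classifier choice, the names `surface_features`, `NaiveBayesRelationClassifier`, and `TOY_EXAMPLES`, and the four training pairs are all assumptions made up for this example, and semantic dependencies are left out entirely. Only the two labels are borrowed from the PDTB top-level sense inventory.

```python
import math
from collections import Counter, defaultdict


def surface_features(span1, span2):
    """Hypothetical surface features: words tagged by span, plus cross-span word pairs."""
    words1 = span1.lower().split()
    words2 = span2.lower().split()
    feats = [f"s1={w}" for w in words1] + [f"s2={w}" for w in words2]
    feats += [f"pair={a}|{b}" for a in words1 for b in words2]
    return feats


class NaiveBayesRelationClassifier:
    """Multinomial Naive Bayes with add-one smoothing (an illustrative stand-in)."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()

    def train(self, examples):
        # Each example is (first span, second span, relation label).
        for span1, span2, label in examples:
            self.label_counts[label] += 1
            for feat in surface_features(span1, span2):
                self.feature_counts[label][feat] += 1
                self.vocabulary.add(feat)

    def predict(self, span1, span2):
        total = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        feats = [f for f in surface_features(span1, span2) if f in self.vocabulary]
        best_label, best_score = None, float("-inf")
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each known feature.
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + vocab_size
            for feat in feats:
                score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy pairs with no explicit discourse marker between the spans.
TOY_EXAMPLES = [
    ("the roads were icy", "the school closed early", "Contingency"),
    ("it rained all night", "the river flooded", "Contingency"),
    ("sales rose in europe", "sales fell in asia", "Comparison"),
    ("the first model was fast", "the second model was slow", "Comparison"),
]

if __name__ == "__main__":
    classifier = NaiveBayesRelationClassifier()
    classifier.train(TOY_EXAMPLES)
    print(classifier.predict("profits rose in spring", "profits fell in winter"))
```

On this toy data the unseen pair shares features such as `pair=rose|fell` only with the Comparison examples, so the sketch labels it Comparison. A real system would train on an annotated corpus and a richer feature set.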

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2014285073A AU2014285073B9 (en) 2013-07-03 2014-07-03 Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus
CA2917153A CA2917153C (en) 2013-07-03 2014-07-03 Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201361842635P 2013-07-03 2013-07-03
US61/842,635 2013-07-03

Publications (2)

Publication Number Publication Date
WO2015003143A2 (en) 2015-01-08
WO2015003143A3 (en) 2015-05-14

Family

ID=52144292

Family Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2014/045432 WO2015003143A2 (en) 2013-07-03 2014-07-03 Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus

Country Status (3)

Country Link
AU (1) AU2014285073B9 (en)
CA (1) CA2917153C (en)
WO (1) WO2015003143A2 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3688609A1 (en) * 2017-09-28 2020-08-05 Oracle International Corporation Determining cross-document rhetorical relationships based on parsing and identification of named entities
US11809825B2 (en) 2017-09-28 2023-11-07 Oracle International Corporation Management of a focused information sharing dialogue based on discourse trees
US11328016B2 (en) 2018-05-09 2022-05-10 Oracle International Corporation Constructing imaginary discourse trees to improve answering convergent questions
CN111209366B (en) * 2019-10-10 2023-04-21 天津大学 Implicit chapter relation recognition method of mutual excitation neural network based on TransS driving
US11580298B2 (en) 2019-11-14 2023-02-14 Oracle International Corporation Detecting hypocrisy in text
CN112257460B (en) * 2020-09-25 2022-06-21 昆明理工大学 Pivot-based Hanyue combined training neural machine translation method
CN113407713B (en) * 2020-10-22 2024-04-05 腾讯科技(深圳)有限公司 Corpus mining method and device based on active learning and electronic equipment
CN113535973B (en) * 2021-06-07 2023-06-23 中国科学院软件研究所 Event relation extraction and language-to-language relation analysis method and device based on knowledge mapping
CN113377915B (en) * 2021-06-22 2022-07-19 厦门大学 Dialogue chapter analysis method
CN113553830B (en) * 2021-08-11 2023-01-03 桂林电子科技大学 Graph-based English text sentence language piece coherent analysis method

Citations (4)

Publication number Priority date Publication date Assignee Title
US20020046018A1 (en) * 2000-05-11 2002-04-18 Daniel Marcu Discourse parsing and summarization
US20040044519A1 (en) * 2002-08-30 2004-03-04 Livia Polanyi System and method for summarization combining natural language generation with structural analysis
US20090119286A1 (en) * 2000-05-23 2009-05-07 Richard Reisman Method and Apparatus for Utilizing User Feedback to Improve Signifier Mapping
US20100285434A1 (en) * 2002-01-23 2010-11-11 Jill Burstein Automated Annotation

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US5659766A (en) * 1994-09-16 1997-08-19 Xerox Corporation Method and apparatus for inferring the topical content of a document based upon its lexical content without supervision

Also Published As

Publication number Publication date
AU2014285073A1 (en) 2016-02-04
CA2917153A1 (en) 2015-01-08
AU2014285073B9 (en) 2017-04-06
CA2917153C (en) 2022-05-17
WO2015003143A2 (en) 2015-01-08
AU2014285073B2 (en) 2016-11-03

Similar Documents

Publication Publication Date Title
WO2015003143A3 (en) Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus
UA114314C2 (en) Restriction of prediction units in b slices to uni-directional inter prediction
WO2012068544A3 (en) Performing actions on a computing device using a contextual keyboard
GB0906700D0 (en) Automatically extracting data from semi-structured documents
BR112015022493A2 (en) demographic context determination system
BR112014017364A2 (en) yield enhancement for cabac coefficient level coding
BR112013007710A2 (en) content prediction
WO2013181588A3 (en) Defining and mapping application interface semantics
BR102013031320A8 (en) non-transient computer readable system and media
WO2014043366A3 (en) Optimal data representation and auxiliary structures for in-memory database query processing
MX2010002350A (en) Identification of semantic relationships within reported speech.
ECSP15029651A (en) DEVICE AND METHOD FOR SCALABLE ENCODING OF VIDEO INFORMATION BASED ON HIGH EFFICIENCY VIDEO ENCODING
TR201904508T4 (en) Method for signaling the progressive temporal substrate access pattern.
BR112014010751A2 (en) video prediction coding device, video prediction coding method, video prediction coding program, video prediction decoding device, video prediction decoding method and video prediction decoding program
BR112013009616A2 (en) computer-implemented method for initiating an action on a mobile computing device responsive to receiving text data, computer-implemented method for generating alternative search terms, computer-implemented method for modifying a search database and storage medium read by computer
WO2013025624A3 (en) Searching encrypted electronic books
WO2011088521A3 (en) Improved searching using semantic keys
WO2011103326A3 (en) Apparatus and methods to reduce duplicate line fills in a victim cache
Sindbæk All in the same boat: The Vikings as European and global heritage
BR112013019266A2 (en) inventory data access layer
Fennell Examining structural racism in the Jim Crow era of Illinois
CA2844486C (en) Enhanced grid reliability through predictive analysis and dynamic action for stable power distribution
Bouya et al. Total electron content forecast model over Australia
Kristensen Using adapted budget cost variance techniques to measure the impact of Lean–based on empirical findings in Lean case studies
Ghaemi et al. Prediction of vapor-liquid equilibrium for aqueous solutions of electrolytes using artificial neural networks

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2917153

Country of ref document: CA

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2014285073

Country of ref document: AU

Date of ref document: 20140703

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14820158

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 14820158

Country of ref document: EP

Kind code of ref document: A2