WO2016156995A1 - Methods, systems and computer program products for machine based processing of natural language input - Google Patents

Methods, systems and computer program products for machine based processing of natural language input

Info

Publication number
WO2016156995A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
knowledge base
operator input
operator
received
Prior art date
Application number
PCT/IB2016/050593
Other languages
French (fr)
Inventor
Vijaya Rama Raju PENUMATCHA
Original Assignee
Yokogawa Electric Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yokogawa Electric Corporation
Publication of WO2016156995A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1815 Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context

Definitions

  • the present invention relates to generation of responses to natural language queries.
  • the invention includes methods, apparatuses and computer program products for receiving natural language inputs or queries in the form of speech or text input, from a user or operator within an industrial plant, and for processing received natural language inputs or queries using one or both of databases that provide logical storage for structured data, and knowledge bases that are based on ontological models and that provide logical storage for unstructured data.
  • Information related to events and operations within an industrial plant is stored or embedded across a number of different data sources within the plant.
  • data may include (i) structured data - e.g. data stored in plant historian databases, incident management databases, work permit databases, batch databases etc. or (ii) unstructured data - e.g. remarks annexed by a user or operator in a report or shift record.
  • structured data refers to information with a high degree of organization, such that logical storage in a relational database is seamless, and readily searchable using straightforward keyword based search queries or operations.
  • Unstructured data refers to information where the lack of structure or organization makes compilation, comprehension and logical storage a time and energy consuming task, and which does not permit for straightforward keyword based search and retrieval operations.
  • the invention provides methods, systems and computer program products for machine based processing of natural language input received from an operator within an industrial plant.
  • the invention comprises a method for machine based processing of natural language input received from an operator within an industrial plant.
  • the method comprises receiving an operator input from the operator and classifying the received operator input as one of a query and a statement. Responsive to classification of the received operator input as a statement, the method proceeds to extract data from the received operator input, and to update a knowledge base by storing the extracted data as a set of interrelated concept instances within said knowledge base. Responsive to classification of the received operator input as a query, the method proceeds to search at least one of a structured database and the knowledge base for data that matches one or more query parameters extracted from the received operator input. A query response based on a returned search result may thereafter be presented to the operator.
  • the step of searching at least one of the structured database and the knowledge base may include generating a first set of query parameters based on the received operator input, and performing search of the structured database based on the generated first set of query parameters. Responsive to receiving a null result from search of the structured database, the method may proceed to generate a second set of query parameters based on the received operator input and thereafter perform search of the knowledge base based on the generated second set of query parameters.
  • a query string may be generated based on the first set of query parameters. Further, the step of performing search of the structured database may comprise ascertaining whether the query string matches any predefined query pattern within a set of predefined query patterns. Responsive to identification of a matching predefined query pattern, the method may proceed to execute a set of query language instructions associated with the identified matching predefined query pattern.
  • the set of predefined query patterns may comprise AIML patterns. Further, each AIML pattern may have a corresponding AIML template associated therewith. In an embodiment, at least one of said AIML templates comprises a set of query language instructions encapsulated therewithin.
  • the structured database may be selected from among a plurality of structured databases. The selection may be based on at least one of the extracted one or more query parameters, or on the query language instructions associated with the identified matching predefined query pattern.
  • an ontological query model may be generated based on the second set of query parameters. Thereafter, searching the knowledge base may include ascertaining whether the ontological query model matches any part of the knowledge base. Responsive to identification of a match within the knowledge base, the method may include extracting data from the identified matched portion of the knowledge base, and returning a search result comprising data extracted from the knowledge base. A match between the ontological query model and the knowledge base may in an embodiment be identified based on graph pattern matching.
  • the knowledge base may be implemented based on an ontological model having interrelated concept classes corresponding to one or more (and in an embodiment, all) of (i) one or more objects, equipment or components within an industrial plant (ii) observed events within said industrial plant (iii) causative factors related to observed events (iv) operator inputs or operator requests in response to either of an observed event or a determined causative factor (v) any action executed in response to an operator input or operator request and (vi) temporal data.
  • the method of the present invention may additionally include receiving operator input in the form of speech signals.
  • the received operator input may be subjected to speech to text processing and natural language processing prior to classification as one of a query and a statement.
  • the invention may additionally comprise a system configured for machine based processing of natural language input received from an operator within an industrial plant.
  • the system may include at least one processor, a data repository system, an operator input device, a natural language engine, a knowledge base updater, and at least one comparator.
  • the data repository system of the present invention may comprise at least one structured database and at least one knowledge base.
  • the operator input device may be configured for receiving an operator input from the operator.
  • the natural language engine may be configured for processing the received operator input and classifying said received operator input as one of a query and a statement.
  • the knowledge base updater of the system may be configured to respond to classification of the received operator input as a statement by extracting data from the received operator input, and updating a knowledge base by storing the extracted data as a set of interrelated concept instances within said knowledge base.
  • the at least one comparator may be configured to respond to classification of the received operator input as a query by searching at least one of the at least one structured database and the knowledge base for data matching one or more query parameters extracted from the received operator input. The comparator may thereafter present to the operator a query response based on a returned search result.
  • the system may be configured such that the at least one comparator comprises a structured database comparator configured to generate a first set of query parameters based on the received operator input.
  • the structured database comparator may perform search of at least one structured database based on the generated first set of query parameters.
  • the at least one comparator may also comprise a knowledge base comparator configured to respond to a null result arising from search of the structured database by generating a second set of query parameters based on the received operator input, and performing search of the knowledge base, based on the generated second set of query parameters.
  • the structured database comparator of the system may be configured to generate a query string based on the first set of query parameters, and to perform search of the structured database by ascertaining whether the query string matches any predefined query pattern within a set of predefined query patterns. Responsive to identification of a matching predefined query pattern, the structured database comparator may execute a set of query language instructions associated with the matching predefined query pattern.
  • the set of predefined query patterns may comprise AIML patterns.
  • Each of said AIML patterns may include a corresponding AIML template associated therewith, and at least one of said AIML templates may have a set of query language instructions encapsulated therewithin.
  • the knowledge base comparator of the system may be configured to generate an ontological query model based on the second set of query parameters.
  • the knowledge base comparator may perform search of the knowledge base by ascertaining whether the ontological query model matches any part of the knowledge base. Responsive to identification of a match within the knowledge base, the knowledge base comparator may extract data from the matched portion of the knowledge base, and return a search result comprising data extracted from the knowledge base.
  • the knowledge base may be implemented based on an ontological model having interrelated concept classes corresponding to one or more (and in an embodiment, all) of (i) one or more objects, equipment or components within an industrial plant (ii) observed events or effects or events within said industrial plant (iii) causative factors related to observed events (iv) operator inputs or operator requests in response to an observed event or a determined causative factor (v) any action executed in response to an operator input or operator request and (vi) temporal data.
  • the knowledge base comparator of the system may be configured to identify matches between the ontological query model and the knowledge base, based on graph pattern matching.
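  • As a rough illustration of graph pattern matching, the sketch below treats the knowledge base as a list of (subject, relation, object) links and the ontological query model as a list of link patterns in which names beginning with "?" are unknowns. The triple layout, the function names and the pump data are assumptions made for this example and are not taken from the specification.

```python
def _unify(term, value, binding):
    """Bind or compare one pattern term against one stored value."""
    if term.startswith("?"):
        if term in binding:
            return binding[term] == value
        binding[term] = value
        return True
    return term == value


def match_pattern(patterns, triples, binding=None):
    """Yield each variable binding under which every pattern is found in triples."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)  # copy so a failed attempt leaves no trace
        if all(_unify(t, v, candidate) for t, v in zip(first, triple)):
            yield from match_pattern(rest, triples, candidate)


# Hypothetical knowledge base content: equipment, observed events, causes.
knowledge_base = [
    ("Pump P-101", "exhibited", "high vibration"),
    ("high vibration", "caused by", "bearing wear"),
    ("Pump P-102", "exhibited", "seal leak"),
]

# Hypothetical query model: "what caused the event observed on Pump P-101?"
query_model = [
    ("Pump P-101", "exhibited", "?event"),
    ("?event", "caused by", "?cause"),
]

matches = list(match_pattern(query_model, knowledge_base))
```

  • An empty `matches` list plays the role of a null result; each returned binding identifies the matched portion of the knowledge base from which a response could be built.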
  • the system embodiments described above may additionally include a speech processor configured to receive speech signals from the operator input device, perform speech to text conversion on the received speech signals, and communicate the converted text to the natural language engine.
  • the invention may additionally include a computer program product for machine based processing of natural language input received from an operator within an industrial plant, which computer program product comprises computer readable instructions for implementing method embodiments of the present invention.
  • Figure 1 illustrates steps typically encountered during natural language processing.
  • Figure 2 illustrates a data repository system.
  • Figure 3 illustrates an exemplary ontology.
  • Figure 4A illustrates an ontological model.
  • Figure 4B illustrates an instance of a knowledge base generated based on the ontological model of Figure 4A.
  • Figures 4C and 4D respectively illustrate data structures that may be used to implement the ontological model and knowledge base of Figures 4A and 4B.
  • Figures 5 to 9 are flowcharts illustrating methods in accordance with the present invention.
  • Figures 10A and 10B illustrate an exemplary ontological query model.
  • Figures 11 and 12 depict systems configured in accordance with the teachings of the present invention.
  • the term "operator input" may be understood to include one or both of (i) speech inputs received from a user or operator or a supervisory operator, at a speech input device (such as a microphone or other acoustic sensor) or (ii) text inputs received from a user or operator at a user interface device (such as a computer terminal, mobile phone, tablet, personal digital assistant etc) or (iii) inputs entered into reports by a user or operator or a supervisory operator.
  • text inputs received directly from a user or an operator, or text received as an output of the speech-to-text conversion comprise natural language inputs, and may be subjected to natural language processing (NLP) steps, which are briefly discussed in connection with Figure 1.
  • Steps 102 to 118 of Figure 1 illustrate steps involved in natural language processing of speech inputs.
  • steps 102 and 104 may be omitted and the method may commence directly from step 106.
  • step 102 comprises extracting a set of acoustic vectors from the input speech.
  • step 104 uses speech-to-text conversion, including for example Hidden Markov Models (HMMs), and appropriate grammars and dictionaries, for converting the speech representing the operator input to a set of human readable text characters.
  • Step 106 comprises sentence segmentation and tokenization.
  • Sentence segmentation comprises delimiting the human readable text characters (that is either received from step 104 or directly received as text input from an operator) into one or more sentences.
  • Tokenization comprises segmenting each sentence into a list of tokens.
  • a token may be understood as a sequence of characters that are grouped together as a useful semantic unit for processing. Within a sentence, tokens typically correspond to discrete words or terms. In certain situations, in addition to grouping characters together, tokenization may also include removing or omitting certain characters (such as for example, certain punctuation characters or delimiters).
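  • The sentence segmentation and tokenization of step 106 can be sketched with regular expressions as below. This is a simplified illustration, not the method of the specification: the function names and the sample text are assumptions, and the naive sentence rule would wrongly split at abbreviations such as "No. 3".

```python
import re


def segment_sentences(text: str) -> list[str]:
    """Split text wherever '.', '!' or '?' is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence: str) -> list[str]:
    """Return word tokens, dropping punctuation but keeping internal '-' and "'"."""
    return re.findall(r"[A-Za-z0-9]+(?:[-'][A-Za-z0-9]+)*", sentence)


# Hypothetical operator input containing one statement and one query.
text = "Pump P-101 tripped at noon. Was the standby pump started?"
sentences = segment_sentences(text)
tokens = [tokenize(s) for s in sentences]
```

  • Keeping the hyphen inside a token preserves equipment tags such as "P-101" as a single unit, which is one reasonable reading of "a useful semantic unit" in a plant setting.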
  • Step 108 comprises lexical (or morphological) analysis of sentences - where each word is tagged with its corresponding part of speech.
  • Step 110 comprises syntactical analysis or parsing, which comprises assigning a syntactic structure or a parse tree, to a given natural language sentence.
  • Semantic analysis at step 112 comprises translating a syntactic structure of a sentence into a semantic representation that is a precise and unambiguous representation of the meaning expressed by the sentence. Semantic analysis is achieved based on the knowledge about the structure of words and sentences - with a view to precisely stipulate the meaning of words, phrases, sentences and texts, and subsequently also their purpose and consequences.
  • Step 114 comprises topic identification - which ascertains the topic or subject to which a sentence or set of sentences relates. Topic identification is based on one or more topic models that are used to analyze text and classify statements or queries within the text to corresponding topics.
  • Step 116 comprises intent identification - which seeks to ascertain the purpose or intention behind a statement or query.
  • Step 118 comprises Named Entity Recognition which tags or labels names of things (e.g. names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages) that are found within a sentence.
  • a disambiguated logically structured and tagged representation of human readable text may be generated, which can be used for machine processing of natural language statements or queries received from a user or operator.
  • the present invention also relies on storage and retrieval of structured data and unstructured data for the purposes of processing natural language statements or queries received from a user or operator.
  • "unstructured data" refers to data that is received in the form of free text or natural language text, interpretation whereof relies on interpretation of individual terms as well as on context associated with individual terms.
  • structured data means data that is stored or received in a predefined format which may have one or more input rules and interpretation whereof is independent of context.
  • Structured data may be obtained in any number of ways that are well understood in the art - for example, data acquisition through sensors (for example in the case of a historian database), data that is stored while transactions are executed (for example, in a database consisting of order details corresponding to a product) or data that is entered manually during execution of work flows (for example, information recordal in the course of lab analysis of a product).
  • structured data may comprise data that is already stored in a system database, and which is not a product of operator input.
  • FIG. 2 illustrates a data repository system comprising one or more repositories 202a, 202b, 202c and 202d configured for storing structured data, as well as a repository 204 configured for storing unstructured data.
  • Repositories 202a to 202d may comprise databases that are configured to logically store structured data. Said repositories are based on database schemas configured to store data that is received in predefined formats. Common examples of structured databases that are likely to be encountered within industrial plants include archival databases (e.g. historian databases), enterprise resource planning databases (ERP databases), laboratory information management system databases (LIMS databases), databases for storing data relevant to monitoring and resolution of service disruptions (Incident Management databases), databases for storing data relevant to access control, permissions (Work Permit System databases) and databases for storing data relevant to plant management information systems (Plant MIS databases).
  • repository 204 for storing unstructured data may be implemented by way of a knowledge base that is generated based on one or more ontological models.
  • An ontology comprises a set of interconnected concepts and relationships within a particular domain.
  • Concepts may be understood as classes of instances / things, while relationships signify the relation or linkages between such classes.
  • Ontologies are used and adopted as a way to understand, process and apply domain specific knowledge and information.
  • an ontology for traffic may include concepts such as vehicles, street lights, geographical locations, time of day, and traffic volume.
  • the traffic ontology also includes relationships between concepts. For example, a specific vehicle may be "halted" at a particular street light that is in turn "located within" a specified geographical location.
  • Each instance of a concept may have different attributes - for example, vehicles may be classified as cars, buses, two-wheeled vehicles etc.
  • Attributes of concept instances may be defined in terms of corresponding attribute information - for example, in the case of vehicles, each concept instance may be defined in terms of attribute information such as size, make, model, number of wheels, public or private transport etc.
  • FIG. 3 illustrates an exemplary ontology comprising concepts 1 to 5 that are interconnected by virtue of relationships 1 to 5.
  • An ontology is represented by an ontology model - which is a structure for representing an ontology.
  • Ontology models may inter alia be represented using a graph structure such as a tree structure, where concepts are represented by nodes and relationships between concepts are represented by edges.
  • node attributes may be used to define properties of each concept instance.
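  • The graph representation described above (concepts as nodes, relationships as edges, properties as node attributes) might be sketched as follows, using the traffic ontology as sample data. The class names, method names and the specific instances ("bus-12", "light-7", "Downtown") are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptInstance:
    concept: str                                     # concept class, e.g. "Vehicle"
    name: str                                        # identifier of this instance
    attributes: dict = field(default_factory=dict)   # node attributes


@dataclass
class OntologyGraph:
    nodes: dict = field(default_factory=dict)        # name -> ConceptInstance
    edges: list = field(default_factory=list)        # (source, relation, target)

    def add_instance(self, instance: ConceptInstance) -> None:
        self.nodes[instance.name] = instance

    def relate(self, source: str, relation: str, target: str) -> None:
        """Add a labelled edge between two instances that already exist."""
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before linking")
        self.edges.append((source, relation, target))

    def related(self, source: str, relation: str) -> list:
        """Return the targets reachable from source over the given relation."""
        return [t for s, r, t in self.edges if s == source and r == relation]


graph = OntologyGraph()
graph.add_instance(ConceptInstance("Vehicle", "bus-12", {"type": "bus", "wheels": 6}))
graph.add_instance(ConceptInstance("Street light", "light-7"))
graph.add_instance(ConceptInstance("Geographical location", "Downtown"))
graph.relate("bus-12", "halted at", "light-7")
graph.relate("light-7", "located within", "Downtown")
```

  • Storing edges as labelled triples keeps the structure general: it covers tree-shaped ontologies as well as graphs in which a node has several incoming relations.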
  • the implemented data structure comprises a knowledge base (i.e. an ontological model based knowledge base).
  • repository 204 (which is configured for logical storage and retrieval of unstructured data) comprises a knowledge base of the kind described above.
  • each concept from the ontological model is implemented as a concept class, and specific concept instances corresponding to a concept class may be stored by storing attribute data specific to said concept instance. Relationships between concept instances may be defined by linking said concept instances. In an embodiment, said relationships between concept instances may be established by linking using primary keys, secondary keys, pointers or other links within the data structure being used to implement the ontological model.
  • the ontological model which forms the basis for implementation of the knowledge base may comprise a model specifically designed or configured to accommodate or store data corresponding to concepts and relationships that are likely to be encountered within the intended domain of implementation.
  • Figure 4A provides an exemplary embodiment of an ontological model configured to accommodate concepts and relationships within data obtained within an industrial plant. Nodes within the ontological model are configured to accommodate data corresponding to the "Equipment Name", "Rated Output", "Equipment Class", "Manufacturer", "Operator ID" and "Date" concept classes.
  • Edges within the ontological model establish (i) a "generates" relation between "Equipment Name" and "Rated Output" nodes (ii) a "classified as" relation between "Equipment Name" and "Equipment Class" nodes (iii) a "preferred vendor of spare parts" relation between "Equipment Class" and "Manufacturer" nodes (iv) a "manufactured by" relation between "Equipment Name" and "Manufacturer" nodes, (v) an "operated by" relation between "Equipment Name" and "Operator ID" nodes and (vi) a "last serviced on" relation between "Equipment Name" and "Date" nodes.
  • Figure 4C illustrates an exemplary table data structure which may be used to implement the ontological model of Figure 4A.
  • Figure 4B provides an exemplary illustration of the manner in which a knowledge base that is implemented based on the ontological model of Figure 4A may be used to logically store unstructured data from the industrial plant.
  • Figure 4B illustrates logical storage of the following set of unstructured data relating to a particular item of equipment i.e. Steam Turbine No. 3:
  • "Steam Turbine No. 3 has a maximum output rating of 40 KW, and is classified as heavy rotating machinery.
  • The steam turbine is presently operated by operator 432, and was last serviced on March 3, 2015.
  • The machine has been manufactured by Heavy Machines Inc., who are also designated preferred vendors for obtaining spares for heavy rotating machinery."
  • Figure 4D illustrates the exemplary table data structure of Figure 4C which has been populated with the specific interrelated concept instances of Figure 4B.
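  • Since the figures are not reproduced here, the following is only one plausible table layout, assumed for illustration: each row links a source concept instance to a target concept instance through one of the relations of Figure 4A. The table name, column names and the use of SQLite are assumptions; the data values are those given for Steam Turbine No. 3 above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE links ("
    " source_class TEXT, source_instance TEXT, relation TEXT,"
    " target_class TEXT, target_instance TEXT)"
)

# One row per edge of the ontological model, populated for Steam Turbine No. 3.
rows = [
    ("Equipment Name", "Steam Turbine No. 3", "generates",
     "Rated Output", "40 KW"),
    ("Equipment Name", "Steam Turbine No. 3", "classified as",
     "Equipment Class", "Heavy rotating machinery"),
    ("Equipment Class", "Heavy rotating machinery", "preferred vendor of spare parts",
     "Manufacturer", "Heavy Machines Inc."),
    ("Equipment Name", "Steam Turbine No. 3", "manufactured by",
     "Manufacturer", "Heavy Machines Inc."),
    ("Equipment Name", "Steam Turbine No. 3", "operated by",
     "Operator ID", "432"),
    ("Equipment Name", "Steam Turbine No. 3", "last serviced on",
     "Date", "March 3, 2015"),
]
conn.executemany("INSERT INTO links VALUES (?, ?, ?, ?, ?)", rows)

# Follow two linked relations: equipment -> class -> preferred spares vendor.
vendor = conn.execute(
    "SELECT v.target_instance FROM links AS c"
    " JOIN links AS v ON v.source_instance = c.target_instance"
    " WHERE c.source_instance = ? AND c.relation = 'classified as'"
    " AND v.relation = 'preferred vendor of spare parts'",
    ("Steam Turbine No. 3",),
).fetchall()

serviced = conn.execute(
    "SELECT target_instance FROM links"
    " WHERE source_instance = ? AND relation = 'last serviced on'",
    ("Steam Turbine No. 3",),
).fetchall()
```

  • The self-join shows why storing free-text remarks as linked concept instances is useful: the preferred vendor is never stated directly for the turbine, yet it can be retrieved by following the "classified as" link.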
  • Figure 5 illustrates a method embodiment of the present invention.
  • the method relies on a data repository system of the kind illustrated in Figure 2, which comprises at least one repository configured for storing logical data, and at least one knowledge base that is implemented based on an ontological model, and configured for enabling storage and logical retrieval of unstructured data.
  • Step 502 comprises receiving an operator input, which operator input may comprise speech input or text input.
  • Where the operator input comprises speech input, the method would include the additional steps (not illustrated) of extracting acoustic vectors from the input speech, and applying speech to text conversion for generating a set of human readable text characters, before proceeding to step 504. If on the other hand the operator input comprises text inputs, the additional steps may be omitted and the method proceeds directly to step 504.
  • Step 504 comprises application of natural language processing techniques to the generated or received set of text characters, which natural language processing techniques may include some or all of the steps described in connection with Figure 1.
  • Step 506 comprises determining, based on the output of the natural language processing techniques, whether the operator input comprises a statement or a query.
  • Where the operator input comprises a statement, the method proceeds to step 508, wherein concepts and relationships extracted from the operator input are used to update the knowledge base within the data repository system.
  • Where the operator input comprises a query, the method instead proceeds to step 510, at which at least one of a structured database(s) and a knowledge base are searched for data that matches a query string or query parameters extracted from said query. At step 512, one or more results returned by the search operations are used to generate a response to the query - which response may be presented to the operator either in the form of a displayed text response, or in the form of speech output generated by text-to-speech operations.
  • the method of Figure 5 permits for an operator input to be classified as a statement and dealt with in accordance with step 508, while another operator input may be classified as a query and dealt with in accordance with steps 510 and 512. The method of Figure 5 may therefore alternate between steps 508 and steps 510 and 512, depending on the contents of received operator input.
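  • The alternation between the statement branch and the query branch can be sketched as a small dispatcher. The classification rule below (a trailing question mark or a leading interrogative word) is a deliberately crude stand-in for the natural language processing output of step 504, and all names are hypothetical.

```python
INTERROGATIVES = {
    "who", "what", "when", "where", "why", "which", "how",
    "is", "are", "was", "were", "did", "does", "do",
}


def classify(operator_input: str) -> str:
    """Return 'query' or 'statement' (stand-in for the decision of step 506)."""
    text = operator_input.strip()
    words = text.split()
    first = words[0].lower() if words else ""
    if text.endswith("?") or first in INTERROGATIVES:
        return "query"
    return "statement"


def handle(operator_input: str, knowledge_base: list, search):
    """Route the input to the statement branch or the query branch."""
    if classify(operator_input) == "statement":
        knowledge_base.append(operator_input)   # stands in for step 508
        return None
    return search(operator_input)               # stands in for steps 510 and 512


kb = []
first = handle("Steam Turbine No. 3 was last serviced on March 3, 2015.", kb,
               lambda q: "found")
second = handle("When was Steam Turbine No. 3 last serviced?", kb,
                lambda q: "found")
```

  • In this sketch the knowledge base is just a list of raw statements; in the described method step 508 would instead store extracted concept instances and their links.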
  • Figure 6 illustrates an exemplary embodiment of the method more generally discussed in connection with method step 508 of Figure 5, wherein concepts and relationships extracted from the operator input may be used to update the knowledge base within the data repository system.
  • data corresponding to concepts and relationships are extracted from the received operator input (i.e. from the set of text characters received at or generated based on step 502 of Figure 5).
  • the extracted concept and relationship data is used to populate the knowledge base by storing concept data as concept instances within the knowledge base. Extracted relationships between specific concept instances may simultaneously be reflected within the knowledge base by linking interrelated concept instances using one or more keys (such as primary keys or secondary keys), pointers or other links.
  • Steps 602 and 604 accordingly enable logical storage of unstructured data that may be received by way of natural language statements made by a system operator - which data may subsequently be subjected to logical search and retrieval operations executed on the knowledge base.
  • Figure 7 illustrates an exemplary embodiment of the method more generally discussed in connection with method step 510 of Figure 5, wherein at least one of a structured database(s) and a knowledge base are searched by seeking a match for a query string or query parameters extracted from said query.
  • the method generates a first set of query parameters (or a first query string) based on output from natural language processing steps that have been applied (at step 504 of Figure 5) to the received operator input.
  • Step 704 checks to ascertain whether the first set of query parameters (or the first query string) returns a valid search result from one or more structured databases that form a part of the data repository system (see Figure 2). Responsive to the first set of query parameters (or first query string) returning valid search results from the one or more structured databases, the method proceeds at step 710 to generate a response to the operator input query based on returned one or more search results.
  • Responsive to a null result from the one or more structured databases, step 706 comprises generating a second set of query parameters (or second query string) based on output from natural language processing steps that have been applied (at step 504 of Figure 5) to the received operator input.
  • Step 708 thereafter checks to ascertain whether the second set of query parameters (or second query string) returns a valid search result from the knowledge base that forms a part of the data repository system (see Figure 2). Responsive to the second set of query parameters (or second query string) returning valid search results from the knowledge base, the method proceeds at step 710 to generate a response to the operator input query based on the returned search results.
  • the knowledge base on which the searches of step 708 are executed may be implemented based on an ontological model having interrelated concept classes corresponding to (i) one or more objects, equipment or components within an industrial plant (ii) observed events or effects or events within said industrial plant (iii) causative factors related to observed events (iv) operator inputs or operator requests in response to an observed event or a determined causative factor (v) any action executed in response to the operator input or operator request and (vi) temporal data corresponding to any or all of the above.
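  • The two-stage search of Figure 7 (structured databases first, knowledge base only on a null result) can be sketched as below. The text does not say how the first and second sets of query parameters differ, so the fields chosen here (topic, intent, entities) are assumptions, as are the function names and the stub search callables.

```python
def answer_query(nlp_output, search_structured, search_knowledge_base):
    """Try the structured databases first, then fall back to the knowledge base."""
    # First set of query parameters (field choice is an assumption).
    first_params = {
        "topic": nlp_output.get("topic"),
        "entities": nlp_output.get("entities", []),
    }
    results = search_structured(first_params)                         # step 704
    if results:
        return {"source": "structured database", "results": results}  # step 710

    # Step 706: second set of parameters, generated only after a null result.
    second_params = {
        "intent": nlp_output.get("intent"),
        "entities": nlp_output.get("entities", []),
    }
    results = search_knowledge_base(second_params)                    # step 708
    if results:
        return {"source": "knowledge base", "results": results}       # step 710
    return None  # assumption: no response when both searches are empty


# Hypothetical NLP output and stub search functions for demonstration.
nlp_output = {
    "topic": "maintenance",
    "intent": "last_service_date",
    "entities": ["Steam Turbine No. 3"],
}


def empty_search(params):
    return []


def kb_search(params):
    return ["March 3, 2015"] if "Steam Turbine No. 3" in params["entities"] else []


response = answer_query(nlp_output, empty_search, kb_search)
```

  • Passing the two searches in as callables keeps the fallback logic independent of how each repository is actually queried.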
  • Figure 8 describes the specific method steps involved in ascertaining whether a first set of query parameters return valid search results from the one or more structured databases (step 704 of Figure 7) .
  • Figure 9 further describes specific method steps involved in ascertaining whether a second set of query parameters return valid search results from the knowledge base (step 708 of Figure 7).
  • Step 802 comprises generating a query string based on a first set of query parameters extracted from output of the natural language processing steps that have been applied (at step 504 of Figure 5) to the received operator input.
  • the query string may be generated based on one or more predefined rules for generation of a query string.
  • the query string may comprise the entire string of text characters extracted as output of the natural language processing steps.
  • the query string may comprise one or more of topics, intents and/or named entities identified in the course of the executed natural language processing steps.
  • Step 804 ascertains whether the query string matches any predefined query pattern within a predefined set of query patterns.
  • the predefined set of query patterns comprises a plurality of query patterns generated in Artificial Intelligence Markup Language (AIML) syntax, and step 804 ascertains whether the query string matches any of the predefined AIML query patterns within said predefined set of query patterns.
  • AIML Artificial Intelligence Markup Language
  • step 806 comprises executing a set of database query instructions which correspond to the matching predefined query pattern.
  • the predefined query pattern is an AIML pattern generated using AIML syntax
  • step 806 comprises executing a set of database query instructions encapsulated within an AIML template corresponding to said AIML pattern.
  • the execution of the set of database query instructions encapsulated within the AIML template may rely on one or more query parameters extracted from the query string. For example, selection of a structured database (from among a plurality of structured databases within a data repository system) may be based on one or more topics, intents or named entities extracted from the query string.
  • the set of database query instructions encapsulated within an AIML template may comprise a single database query instruction or a plurality of database query instructions.
  • the database query instructions may comprise database query instructions in any query language capable of querying one or more structured databases within the data repository system.
  • exemplary query languages that may be used to implement the present invention include SQL, .QL, CQL, CQLF, COFL, DMF, Datalog, F-logic, FQL, OQL, OCX etc.
  • the method generates a response to the query string, based on results returned by the executed set of database query instructions, which response may be presented (e.g. in text form or speech form) to the operator.
  • An operator may, by way of speech or text input, ask "Have any recent incidents been observed in connection with the distillation column?".
  • the invention would identify the operator input as a query, and based on identified topics, intents and/or named entities, may extract the following information for search through structured databases:
  • step 804 of Figure 8 searches for a matching AIML query pattern. Given that the topic of the query string involves incidents, in an embodiment, step 804 may search for a matching AIML query pattern only among query patterns that are associated with the Incident Database. In the example, the following matching AIML pattern (and corresponding AIML template) is found:
  • the query language instructions encapsulated within the AIML template which corresponds to the matched AIML pattern are executed once within the Incident database, including as a search parameter, equipment name as "the distillation column". If no result is obtained, a wider set of search instructions is executed at try 2 by eliminating the equipment name as a search parameter. Assuming a result is obtained, the values at positions 0, 1 and 2 of the returned search results may be copied into placeholder locations value0, value1 and value2 of the defined response text, which response text is thereafter further processed for presentation to the querying operator.
  • the predefined set of AIML patterns and corresponding templates may additionally encapsulate a response format into which search results that are retrieved from a structured database may be incorporated for presentation to the operator.
  • Step 902 of Figure 9 comprises generating an ontological query model based on a second set of query parameters extracted from output of the natural language processing steps that have been applied (at step 504 of Figure 5) to operator input.
  • the ontological query model is an ontology model comprising concept instances and relations extracted from a query string or set of query parameters.
  • an ontological query model may include a set of data instances corresponding to (i) at least two interrelated concept instances or (ii) at least one concept instance and one corresponding relationship.
  • the ontological query model may additionally include a concept instance placeholder which identifies a concept class (for which a corresponding data value is presently unknown or is sought to be found within the knowledge base) that is related to the set of data instances that have been used to construct the ontological query model.
  • At step 904, the method ascertains whether the ontology query model matches or maps on to any set of concept instances and relationships that are already stored within the knowledge base.
  • the step of matching the ontological query model with data within the knowledge base can be achieved in any number of ways including using pattern matching, graph pattern matching or query languages that implement graph pattern matching (e.g. SPARQL).
  • In response to an identified match between the ontology query model and a set of concept instances and relationships stored within the knowledge base, step 906 generates a response based on data extracted from the matched set of concept instances and relationships that has been identified within the knowledge base.
  • responsive to an ontology query model matching a set of concept instances and relationships stored within the knowledge base, step 906 extracts data corresponding to a concept instance within the knowledge base, which concept instance corresponds in terms of both concept class and relationships to a concept class placeholder within the ontological query model, and uses the extracted data to generate a response for presentation to an operator.
  • Figure 10 illustrates an exemplary ontology query model that may be generated in accordance with the method described above in connection with Figure 9.
  • Figure 10 specifically illustrates an ontology query model that has been generated in response to a query relating to the industrial plant specific knowledge base that has been previously discussed in connection with Figure 4B.
  • Figures 10A and 10B illustrate an exemplary ontology query model generated based on the query "What is the rated output of Steam Turbine No. 3".
  • the ontology query model of Figure 10A includes a concept instance 1002 having data value "Steam Turbine No. 3", which falls within concept class 1004 of the type "Equipment Name", and has a "generates" relationship corresponding to said concept instance 1002.
  • the ontology query model additionally includes a concept class placeholder 1008a which falls within concept class 1006 of the type "Rated Output".
  • the ontology query model of Figure 10A may be pattern matched against the knowledge base illustrated in Figure 4B.
  • the concept class 1006 of type "Rated Output" within the ontological query model is matched against the corresponding "Rated Output" concept class of the knowledge base.
  • the pattern matching further establishes that the relevant concept instance of this "Rated Output" concept class is a concept instance of the class "Equipment Name" and which has the specific data value "Steam Turbine No. 3".
  • the invention accordingly determines (as illustrated in Figure 10B) that the relevant concept instance from the knowledge base (i.e.
  • This data value may be extracted from the knowledge base and may be used to generate and present a response to the operator query for which the ontological query model of Figure 10A was generated.
  • FIG. 11 illustrates a system 1100 in accordance with the present invention, which may be configured to implement one or more of the above discussed methods for generating responses to natural language queries received by way of operator input.
  • the system 1100 comprises a speech input 1102, a speech processing engine 1104, natural language engine 1106, structured database comparator 1108, knowledge base comparator 1110, response generator 1112, operator interface 1114, structured database(s) 1116, knowledge base 1118 and knowledge base updater 1120.
  • Operator input device 1102 may comprise any device configured to receive speech or text inputs from an operator.
  • operator input device 1102 may comprise any input device capable of receiving speech or sound based input signals, including by way of example a microphone, acoustic sensor or other device capable of receiving sound waves and generating electromagnetic signals representing received sound waves.
  • operator input device 1102 may comprise any text input device, including by way of example a keyboard, keypad, touch screen interface or other peripheral device configured to enable an operator to input text characters.
  • Speech processing engine 1104 comprises a processor implemented engine configured for extracting acoustic vectors from speech signals and to implement speech-to-text conversion based on the extracted acoustic vectors - for converting received speech input to a set of readable text characters.
  • Natural language engine 1106 comprises a processor implemented engine for implementing one or more of the natural language processing steps described in connection with Figure 1.
  • natural language engine 1106 may be configured to implement one or more of sentence segmentation, tokenization, lexical analysis, syntactical analysis, semantic analysis, topic identification, intent identification and named entity recognition.
  • natural language engine 1106 may additionally be configured to determine whether an operator input comprises a statement or a query.
  • Structured database comparator 1108 comprises a processor implemented comparator configured for extracting a first set of query parameters or query string from output received from natural language engine 1106, and for determining whether the query parameters or query string return any valid search results based on data within structured database 1116.
  • structured database comparator 1108 may be configured to implement method steps discussed above in connection with Figures 7 and 8.
  • Knowledge base comparator 1110 comprises a processor implemented comparator configured for extracting a second set of query parameters or an ontological query model from output received from natural language engine 1106, and for determining whether a search based on the second set of query parameters or ontological query model return any valid search results from within knowledge base 1118.
  • knowledge base comparator 1110 may be configured to implement method steps discussed above in connection with Figures 7 and 9.
  • knowledge base comparator 1110 may be configured to find and return valid search results from a knowledge base, by generating an ontological query model representative of an operator query and searching the knowledge base for a matching graph structure, using pattern matching or graph pattern matching.
  • Response generator 1112 comprises a processor implemented generator for generating a response to an operator query based on search results received from either of structured database comparator 1108 or knowledge base comparator 1110. Response generator 1112 communicates the generated response to output device 1114.
  • output device may comprise a display for presenting a text based response to an operator.
  • output device may comprise a text-to-speech converter for converting a text based response to a speech signal and implementing playback of said speech signal to an operator.
  • Knowledge base updater 1120 comprises a processor implemented updater configured to respond to a determination (by natural language engine 1106) that a received operator input comprises a statement, by updating knowledge base 1118 to include data corresponding to concept instances and relationships identified within or extracted from said statement.
  • knowledge base updater 1120 may be configured to implement method steps from the methods discussed above in connection with Figures 5 or 6.
  • knowledge base 1118 may be implemented based on an ontological model having interrelated concept classes corresponding to (i) one or more objects, equipment or components within an industrial plant (ii) observed events or effects within said industrial plant (iii) causative factors related to observed events (iv) operator inputs or operator requests in response to an observed event or a determined causative factor (v) any action executed in response to the operator input or operator request and (vi) temporal data corresponding to any or all of the above.
  • Figure 12 illustrates an exemplary computing system in which various embodiments of the invention may be implemented.
  • the system 1202 comprises at least one processor 1204 and at least one memory 1206.
  • the processor 1204 executes program instructions and may be a real processor.
  • the processor 1204 may also be a virtual processor.
  • the computer system 1202 is not intended to suggest any limitation as to scope of use or functionality of described embodiments.
  • the computer system 1202 may include, but not be limited to, one or more of a general-purpose computer, a programmed microprocessor, a micro-controller, an integrated circuit, and other devices or arrangements of devices that are capable of implementing the steps that constitute the method of the present invention.
  • Exemplary embodiments of a system 1202 in accordance with the present invention may include one or more servers, desktops, laptops, tablets, smart phones, mobile phones, mobile communication devices, tablets, phablets and personal digital assistants.
  • the memory 1206 may store software for implementing various embodiments of the present invention.
  • the computer system 1202 may have additional components.
  • the computer system 1202 includes one or more communication channels 1208, one or more input devices 1210, one or more output devices 1212, and storage 1214.
  • An interconnection mechanism such as a bus, controller, or network, interconnects the components of the computer system 1202.
  • operating system software provides an operating environment for the various software executing in the computer system 1202 using a processor 1204, and manages different functionalities of the components of the computer system 1202.
  • the communication channel (s) 1208 allow communication over a communication medium to various other computing entities.
  • the communication medium provides information such as program instructions, or other data in a communication media.
  • the communication media includes, but is not limited to, wired or wireless methodologies implemented with an electrical, optical, RF, infrared, acoustic, microwave, Bluetooth or other transmission media.
  • the input device(s) 1210 may include, but is not limited to, a touch screen, a keyboard, mouse, pen, joystick, trackball, a voice device, a scanning device, or any other device that is capable of providing input to the computer system 1202.
  • the input device(s) 1210 may be a sound card or similar device that accepts audio input in analog or digital form.
  • the output device(s) 1212 may include, but is not limited to, a user interface on CRT, LCD, LED display, or any other display associated with any of servers, desktops, laptops, tablets, smart phones, mobile phones, mobile communication devices, phablets and personal digital assistants, printer, speaker, CD/DVD writer, or any other device that provides output from the computer system 1202.
  • the storage 1214 may include, but is not limited to, magnetic disks, magnetic tapes, CD-ROMs, CD-RWs, DVDs, any types of computer memory, magnetic stripes, smart cards, printed barcodes or any other transitory or non-transitory medium which can be used to store information and can be accessed by the computer system 1202.
  • the storage 1214 contains program instructions for implementing the described embodiments.
  • the computer system 1202 is part of a distributed network.
  • the present invention may be implemented in numerous ways including as a system, a method, or a computer program product such as a computer readable storage medium or a computer network wherein programming instructions are communicated from a remote location.
  • the present invention may suitably be embodied as a computer program product for use with the computer system 1202.
  • the method described herein is typically implemented as a computer program product, comprising a set of program instructions which is executed by the computer system 1202 or any other similar device.
  • the set of program instructions may be a series of computer readable codes stored on a tangible medium, such as a computer readable storage medium (storage 1214), for example, diskette, CD-ROM, ROM, flash drives or hard disk, or transmittable to the computer system 1202, via a modem or other interface device, over either a tangible medium, including but not limited to optical or analogue communications channel(s) 1208.
  • the implementation of the invention as a computer program product may be in an intangible form using wireless techniques, including but not limited to microwave, infrared, Bluetooth or other transmission techniques. These instructions can be preloaded into a system or recorded on a storage medium such as a CD-ROM, or made available for downloading over a network such as the Internet or a mobile telephone network.
  • the series of computer readable instructions may embody all or part of the functionality previously described herein.

Abstract

The invention provides methods, systems and computer program products for machine based processing of natural language input received from an operator within an industrial plant. The invention comprises receiving an operator input from the operator and classifying the received operator input as one of a query and a statement. Responsive to classification of the received operator input as a statement, the method proceeds to extract data from the received operator input, and to update a knowledge base by storing the extracted data as a set of interrelated concept instances within said knowledge base. Responsive to classification of the received operator input as a query, the method proceeds to search at least one of a structured database and the knowledge base for data that matches one or more query parameters extracted from the received operator input. A query response based on a returned search result may thereafter be presented to the operator.

Description

Methods, Systems and Computer Program Products
for Machine Based Processing of Natural Language Input
Field of the Invention
[001] The present invention relates to generation of responses to natural language queries. The invention includes methods, apparatuses and computer program products for receiving natural language inputs or queries in the form of speech or text input, from a user or operator within an industrial plant, and for processing received natural language inputs or queries using one or both of databases that provide logical storage for structured data, and knowledge bases that are based on ontological models and that provide logical storage for unstructured data.
Background
[002] Information related to events and operations within an industrial plant is stored or embedded across a number of different data sources within the plant. Such data may include (i) structured data - e.g. data stored in plant historian databases, incident management databases, work permit databases, batch databases etc. or (ii) unstructured data - e.g. remarks annexed by a user or operator in a report or shift record.
[003] Briefly, structured data refers to information with a high degree of organization, such that logical storage in a relational database is seamless, and readily searchable using straightforward keyword based search queries or operations. Unstructured data on the other hand refers to information where the lack of structure or organization makes compilation, comprehension and logical storage a time and energy consuming task, and which does not permit for straightforward keyword based search and retrieval operations.
[004] Known solutions for retrieval of information from structured data repositories rely on one or more applications or interfaces for effecting logical storage and retrieval of the structured data. The processes for storing data and developing applications or interfaces for retrieval require knowledge of query languages, and are correspondingly time and cost intensive. As a consequence, development of applications for storage and retrieval of structured data is typically restricted to cases involving data that is accessed regularly. Further, a user or operator seeking to retrieve information from structured databases is required to input data queries in an input format (depending on requirements of the applicable query language) prescribed by the storage and retrieval application or interface - which impacts ease of access and flexibility of the search or retrieval process.
[005] Use of unstructured data presents additional barriers to data search and retrieval. A primary reason is that unstructured data suffers from context related ambiguities - wherein data input or search terms may be interpreted in multiple different ways, depending on context that is implied by adjacent terms. As a consequence, keyword searches that may otherwise be effective in the case of structured data sources are often of limited effectiveness in the case of unstructured data.
[006] There is accordingly a need for enabling data storage and retrieval based on natural language inputs or queries raised by a user or operator, which permit (i) logical storage of natural language inputs or other unstructured data received from a user or operator, in a knowledge base and (ii) generation of meaningful responses to natural language queries received from a user or operator, based on information retrieved either from one or more structured databases or from a knowledge base that has been used to store natural language inputs or other unstructured data.
Summary
[007] The invention provides methods, systems and computer program products for machine based processing of natural language input received from an operator within an industrial plant.
[008] The invention comprises a method for machine based processing of natural language input received from an operator within an industrial plant. The method comprises receiving an operator input from the operator and classifying the received operator input as one of a query and a statement. Responsive to classification of the received operator input as a statement, the method proceeds to extract data from the received operator input, and to update a knowledge base by storing the extracted data as a set of interrelated concept instances within said knowledge base. Responsive to classification of the received operator input as a query, the method proceeds to search at least one of a structured database and the knowledge base for data that matches one or more query parameters extracted from the received operator input. A query response based on a returned search result may thereafter be presented to the operator.
[009] The step of searching at least one of the structured database and the knowledge base may include generating a first set of query parameters based on the received operator input, and performing search of the structured database based on the generated first set of query parameters. Responsive to receiving a null result from search of the structured database, the method may proceed to generate a second set of query parameters based on the received operator input and thereafter perform search of the knowledge base based on the generated second set of query parameters.
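The control flow summarised in the two paragraphs above can be sketched as follows. This is a minimal illustration only: the function name and the four callables passed to it are hypothetical stand-ins, not components defined by the specification.

```python
from typing import Callable, Optional


def handle_operator_input(
    text: str,
    is_query: Callable[[str], bool],
    search_structured: Callable[[str], Optional[str]],
    search_knowledge_base: Callable[[str], Optional[str]],
    update_knowledge_base: Callable[[str], None],
) -> Optional[str]:
    """Route one operator input; return a query response, or None."""
    if not is_query(text):
        # Statement: store it in the knowledge base; there is nothing to answer.
        update_knowledge_base(text)
        return None
    # Query: the structured database is searched first.
    result = search_structured(text)
    if result is None:
        # Null result: fall back to a search of the knowledge base.
        result = search_knowledge_base(text)
    return result


# Usage with stub components (all assumed for illustration).
stored_statements = []
reply = handle_operator_input(
    "What is the rated output of Steam Turbine No. 3?",
    is_query=lambda t: t.endswith("?"),
    search_structured=lambda t: None,
    search_knowledge_base=lambda t: "answer from knowledge base",
    update_knowledge_base=stored_statements.append,
)
```

Passing the components in as callables keeps the routing logic separate from any particular database or knowledge base implementation.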
[0010] A query string may be generated based on the first set of query parameters. Further, the step of performing search of the structured database may comprise ascertaining whether the query string matches any predefined query pattern within a set of predefined query patterns. Responsive to identification of a matching predefined query pattern, the method may proceed to execute a set of query language instructions associated with the identified matching predefined query pattern.
[0011] The set of predefined query patterns may comprise AIML patterns. Further, each AIML pattern may have a corresponding AIML template associated therewith. In an embodiment, at least one of said AIML templates comprises a set of query language instructions encapsulated therewithin.
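As a rough sketch of how a query string might be matched against AIML-style patterns, the snippet below treats `*` as a wildcard for one or more words and compares against a normalised, upper-cased query string. It is a simplification and not an AIML interpreter; the pattern, the SQL text and the function names are assumptions made for illustration.

```python
import re
from typing import Optional


def normalize(text: str) -> str:
    """Upper-case the text and replace punctuation with spaces."""
    return " ".join(re.sub(r"[^A-Za-z0-9 ]+", " ", text).upper().split())


def match_pattern(pattern: str, query: str) -> Optional[list]:
    """Return the text captured by each '*' wildcard, or None if no match."""
    parts = [r"(.+)" if word == "*" else re.escape(word) for word in pattern.split()]
    found = re.fullmatch(" ".join(parts), normalize(query))
    return list(found.groups()) if found else None


# Hypothetical pattern-to-template table; the SQL text is illustrative only.
TEMPLATES = {
    "* RECENT INCIDENTS * THE *": "SELECT * FROM incidents WHERE equipment = ?",
}


def find_template(query: str):
    """Return (query instructions, wildcard captures) for the first matching pattern."""
    for pattern, sql in TEMPLATES.items():
        captured = match_pattern(pattern, query)
        if captured is not None:
            return sql, captured
    return None


hit = find_template(
    "Have any recent incidents been observed in connection with the distillation column?"
)
```

Here the last wildcard captures the equipment name, which could then be supplied as a parameter to the query language instructions held in the template.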
[0012] In an embodiment, the structured database may be selected from among a plurality of structured databases. The selection may be based on at least one of the extracted one or more query parameters, or on the query language instructions associated with the identified matching predefined query pattern.
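One simple way to picture this selection is a lookup from an extracted topic to a database. The sketch below assumes the query parameters arrive as a dictionary with a `topic` key; the routing table and database names are hypothetical, loosely following the kinds of plant databases mentioned in the Background.

```python
from typing import Optional

# Hypothetical topic-to-database routing table; all names are illustrative.
DATABASE_BY_TOPIC = {
    "incident": "incident_db",
    "permit": "work_permit_db",
    "batch": "batch_db",
    "trend": "plant_historian_db",
}


def select_database(query_parameters: dict) -> Optional[str]:
    """Pick a structured database from the topic carried in the query parameters."""
    topic = str(query_parameters.get("topic", "")).lower()
    return DATABASE_BY_TOPIC.get(topic)


selected = select_database({"topic": "Incident", "entity": "distillation column"})
```

A query whose topic has no entry yields `None`, which a caller could treat as a null result from the structured databases.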
[0013] In an embodiment of the method, an ontological query model may be generated based on the second set of query parameters. Thereafter, searching the knowledge base may include ascertaining whether the ontological query model matches any part of the knowledge base. Responsive to identification of a match within the knowledge base, the method may include extracting data from the identified matched portion of the knowledge base, and returning a search result comprising data extracted from the knowledge base. Identifying a match between the ontological query model and the knowledge base may in an embodiment be determined based on graph pattern matching.
[0014] In a method embodiment, the knowledge base may be implemented based on an ontological model having interrelated concept classes corresponding to one or more (and in an embodiment, all) of (i) one or more objects, equipment or components within an industrial plant (ii) observed events within said industrial plant (iii) causative factors related to observed events (iv) operator inputs or operator requests in response to either of an observed event or a determined causative factor (v) any action executed in response to an operator input or operator request and (vi) temporal data.
[0015] The method of the present invention may additionally include receiving operator input in the form of speech signals. The received operator input may be subjected to speech to text processing and natural language processing prior to classification as one of a query and a statement.
[0016] The invention may additionally comprise a system configured for machine based processing of natural language input received from an operator within an industrial plant. The system may include at least one processor, a data repository system, an operator input device, a natural language engine, a knowledge base updater, and at least one comparator.
[0017] The data repository system of the present invention may comprise at least one structured database and at least one knowledge base. The operator input device may be configured for receiving an operator input from the operator. The natural language engine may be configured for processing the received operator input and classifying said received operator input as one of a query and a statement.
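The specification does not state here how the natural language engine decides between a query and a statement. Purely as an assumed heuristic, the sketch below treats an input as a query if it ends with a question mark or opens with a typical interrogative or auxiliary word; a real engine would more likely rely on the syntactic and semantic analysis output.

```python
import re

# Assumed cue words; not taken from the specification.
QUESTION_OPENERS = {
    "what", "when", "where", "which", "who", "why", "how",
    "is", "are", "was", "were", "do", "does", "did", "has", "have", "can",
}


def classify_operator_input(text: str) -> str:
    """Return 'query' or 'statement' for one operator input."""
    stripped = text.strip()
    words = re.findall(r"[A-Za-z']+", stripped.lower())
    if stripped.endswith("?"):
        return "query"
    if words and words[0] in QUESTION_OPENERS:
        return "query"
    return "statement"


kind_a = classify_operator_input(
    "Have any recent incidents been observed in connection with the distillation column"
)
kind_b = classify_operator_input("Valve V-101 was replaced during the night shift.")
```

The opener check matters for speech input, where the transcribed text may carry no question mark.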
[0018] The knowledge base updater of the system may be configured to respond to classification of the received operator input as a statement by extracting data from the received operator input, and updating a knowledge base by storing the extracted data as a set of interrelated concept instances within said knowledge base. The at least one comparator may be configured to respond to classification of the received operator input as a query by searching at least one of the at least one structured database and the knowledge base for data matching one or more query parameters extracted from the received operator input. The comparator may thereafter present to the operator, a query response based on a returned search result.
[0019] The system may be configured such that the at least one comparator comprises a structured database comparator configured to generate a first set of query parameters based on the received operator input. The structured database comparator may perform search of at least one structured database based on the generated first set of query parameters. The at least one comparator may also comprise a knowledge base comparator configured to respond to a null result arising from search of the structured database by generating a second set of query parameters based on the received operator input, and performing search of the knowledge base, based on the generated second set of query parameters.
[0020] The structured database comparator of the system may be configured to generate a query string based on the first set of query parameters, and to perform search of the structured database by ascertaining whether the query string matches any predefined query pattern within a set of predefined query patterns. Responsive to identification of a matching predefined query pattern, the structured database comparator may execute a set of query language instructions associated with the matching predefined query pattern.
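To illustrate what executing a set of query language instructions might look like, the sketch below assumes the set consists of a narrow query followed by a wider one, in the spirit of the distillation column example, and fills the first three returned values into a response text. The table schema, the SQL, the response wording and the sample row are all hypothetical; Python's standard `sqlite3` module stands in for the structured database.

```python
import sqlite3

# Hypothetical schema, queries and response text, for illustration only.
TRY_1 = ("SELECT occurred_on, equipment, summary FROM incidents "
         "WHERE equipment = ? ORDER BY occurred_on DESC LIMIT 1")
TRY_2 = ("SELECT occurred_on, equipment, summary FROM incidents "
         "ORDER BY occurred_on DESC LIMIT 1")
RESPONSE = "On {value0}, an incident was recorded for {value1}: {value2}"


def answer_incident_query(conn, equipment):
    """Run the narrow query first, then the wider one; None means a null result."""
    row = conn.execute(TRY_1, (equipment,)).fetchone()
    if row is None:
        # Widen the search by dropping the equipment name as a parameter.
        row = conn.execute(TRY_2).fetchone()
    if row is None:
        return None
    # Copy the values at positions 0, 1 and 2 into the response placeholders.
    return RESPONSE.format(value0=row[0], value1=row[1], value2=row[2])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incidents (occurred_on TEXT, equipment TEXT, summary TEXT)")
conn.execute("INSERT INTO incidents VALUES ('2015-03-02', 'boiler feed pump', 'seal leak')")
response = answer_incident_query(conn, "distillation column")
```

Because the sample table holds no row for the distillation column, the narrow query returns nothing and the response is built from the wider query.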
[0021] In an embodiment of the system, the set of predefined query patterns may comprise AIML patterns. Each of said AIML patterns may include a corresponding AIML template associated therewith, and at least one of said AIML templates may have a set of query language instructions encapsulated therewithin.
[0022] The knowledge base comparator of the system may be configured to generate an ontological query model based on the second set of query parameters. The knowledge base comparator may perform search of the knowledge base by ascertaining whether the ontological query model matches any part of the knowledge base. Responsive to identification of a match within the knowledge base, the knowledge base comparator may extract data from the matched portion of the knowledge base, and return a search result comprising data extracted from the knowledge base.
[0023] In an embodiment of the system, the knowledge base may be implemented based on an ontological model having interrelated concept classes corresponding to one or more (and in an embodiment, all) of (i) one or more objects, equipment or components within an industrial plant (ii) observed events or effects within said industrial plant (iii) causative factors related to observed events (iv) operator inputs or operator requests in response to an observed event or a determined causative factor (v) any action executed in response to an operator input or operator request and (vi) temporal data.
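A minimal data-structure sketch of such a knowledge base is given below, assuming that relationships are held as (subject, relation name, object) triples. The class names, relation names and sample instances are hypothetical and chosen only to mirror the six concept classes listed above.

```python
from dataclasses import dataclass, field
from enum import Enum


class ConceptClass(Enum):
    EQUIPMENT = "object, equipment or component"
    EVENT = "observed event or effect"
    CAUSE = "causative factor"
    OPERATOR_INPUT = "operator input or request"
    ACTION = "action executed"
    TIME = "temporal data"


@dataclass(frozen=True)
class ConceptInstance:
    concept_class: ConceptClass
    value: str


@dataclass
class KnowledgeBase:
    # Each relation is a (subject, relation name, object) triple.
    relations: set = field(default_factory=set)

    def add_statement(self, triples):
        """Store extracted data as a set of interrelated concept instances."""
        self.relations.update(triples)


# Sample instances, invented for illustration.
pump = ConceptInstance(ConceptClass.EQUIPMENT, "Pump P-101")
event = ConceptInstance(ConceptClass.EVENT, "high vibration")
cause = ConceptInstance(ConceptClass.CAUSE, "worn bearing")

kb = KnowledgeBase()
kb.add_statement([(pump, "exhibits", event), (event, "caused_by", cause)])
```

Making `ConceptInstance` frozen keeps instances hashable, so repeated statements about the same fact collapse into a single stored triple.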
[0024] The knowledge base comparator of the system may be configured to identify matches between the ontological query model and the knowledge base, based on graph pattern matching.
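Graph pattern matching of this kind can be sketched with plain triples, where a term beginning with `?` plays the role of the concept instance placeholder. The example below uses the "Steam Turbine No. 3" query from the specification; the relation name `instance_of` and the output values ("25 MW", "10 MW") are invented for illustration, since no actual value appears in the text.

```python
from typing import Optional

Triple = tuple[str, str, str]


def _unify(term: str, value: str, binding: dict) -> bool:
    if term.startswith("?"):
        # Placeholder: bind it, or check it against an earlier binding.
        return binding.setdefault(term, value) == value
    return term == value


def match(pattern: list, graph: set, binding: Optional[dict] = None) -> Optional[dict]:
    """Return a binding that maps every pattern triple onto the graph, or None."""
    binding = binding or {}
    if not pattern:
        return binding
    head, rest = pattern[0], pattern[1:]
    for fact in graph:
        trial = dict(binding)
        if all(_unify(p, f, trial) for p, f in zip(head, fact)):
            result = match(rest, graph, trial)
            if result is not None:
                return result
    return None


# Hypothetical knowledge base content.
KNOWLEDGE_BASE = {
    ("Steam Turbine No. 3", "instance_of", "Equipment Name"),
    ("Steam Turbine No. 3", "generates", "25 MW"),
    ("25 MW", "instance_of", "Rated Output"),
    ("Steam Turbine No. 1", "instance_of", "Equipment Name"),
    ("Steam Turbine No. 1", "generates", "10 MW"),
    ("10 MW", "instance_of", "Rated Output"),
}

# Query model for "What is the rated output of Steam Turbine No. 3".
QUERY_MODEL = [
    ("Steam Turbine No. 3", "instance_of", "Equipment Name"),
    ("Steam Turbine No. 3", "generates", "?rated_output"),
    ("?rated_output", "instance_of", "Rated Output"),
]

answer = match(QUERY_MODEL, KNOWLEDGE_BASE)
```

The value bound to `?rated_output` is the data that would be extracted from the knowledge base to build the response. A query language such as SPARQL expresses the same idea declaratively.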
[0025] The system embodiments described above may additionally include a speech processor configured to receive speech signals from the operator input device, perform speech to text conversion on the received speech signals, and communicate the converted test to the natural language engine.
[0026] The invention may additionally include a computer program product for machine based processing of natural language input received from an operator within an industrial plant, which computes: program product comprises computer readable instaictions for implementing method embodiments of the present in vention,
[0027] Figure 1 illustrates steps typically encountered during natural language processing.
[0028] Figure 2 illustrates a data repository system.
[0029] Figure 3 illustrates an exemplary ontology.
[0030] Figure 4A illustrates an ontological model.
[0031] Figure 4B illustrates an instance of a knowledge base generated based on the ontological model of Figure 4A.
[0032] Figures 4C and 4D respectively illustrate data structures that may be used to implement the ontological model and knowledge base of Figures 4A and 4B.
[0033] Figures 5 to 9 are flowcharts illustrating methods in accordance with the present invention.
[0034] Figures 10A and 10B illustrate an exemplary ontological query model.
[0035] Figures 11 and 12 depict systems configured in accordance with the teachings of the present invention.
Detailed Description
[0036] For the purposes of the present invention, the term "operator input" may be understood to include one or more of (i) speech inputs received from a user or operator or a supervisory operator, at a speech input device (such as a microphone or other acoustic sensor), (ii) text inputs received from a user or operator at a user interface device (such as a computer terminal, mobile phone, tablet, personal digital assistant etc.) or (iii) inputs entered into reports by a user or operator or a supervisory operator.
[0037] For the purposes of the present invention, it will be understood that text inputs received directly from a user or an operator, or text received as an output of speech-to-text conversion, comprise natural language inputs, and may be subjected to natural language processing (NLP) steps, which are briefly discussed in connection with Figure 1.
[0038] Steps 102 to 118 of Figure 1 illustrate steps involved in natural language processing of speech inputs. In the event an operator input comprises text inputs, steps 102 and 104 may be omitted and the method may commence directly from step 106.
[0039] In the event an operator input comprises speech input(s), step 102 comprises extracting a set of acoustic vectors from the input speech. Thereafter step 104 uses speech-to-text conversion, including for example Hidden Markov Models (HMMs), and appropriate grammars and dictionaries, for converting the speech representing the operator input to a set of human readable text characters.
[0040] Step 106 comprises sentence segmentation and tokenization. Sentence segmentation comprises delimiting the human readable text characters (that are either received from step 104 or directly received as text input from an operator) into one or more sentences. Tokenization comprises segmenting each sentence into a list of tokens. For the purposes of the present invention, a token may be understood as a sequence of characters that are grouped together as a useful semantic unit for processing. Within a sentence, tokens typically correspond to discrete words or terms. In certain situations, in addition to grouping characters together, tokenization may also include removing or omitting certain characters (such as for example, certain punctuation characters or delimiters).
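The segmentation and tokenization of step 106 may be sketched as follows. This is a minimal illustration only, assuming a simple regular-expression approach; the function names `segment_sentences` and `tokenize` and the sample text are hypothetical and are not taken from the specification. A naive splitter of this kind would mis-segment abbreviations such as "No. 3", so a practical implementation would likely rely on a more robust segmenter.

```python
import re

def segment_sentences(text):
    """Delimit text into sentences at '.', '?' or '!' followed by whitespace."""
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Segment a sentence into word tokens, omitting punctuation delimiters.

    Hyphens and apostrophes inside a word are kept as part of the token.
    """
    return re.findall(r"[A-Za-z0-9]+(?:[-'][A-Za-z0-9]+)*", sentence)

text = "The distillation column pressure is high. Have any incidents been observed?"
for sentence in segment_sentences(text):
    print(tokenize(sentence))
```

Each printed list corresponds to one sentence, with the trailing punctuation characters omitted from the tokens.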
[0041] Step 108 comprises lexical (or morphological) analysis of sentences, where each word is tagged with its corresponding part of speech. Step 110 comprises syntactical analysis or parsing, which comprises assigning a syntactic structure, or a parse tree, to a given natural language sentence. Semantic analysis at step 112 comprises translating a syntactic structure of a sentence into a semantic representation that is a precise and unambiguous representation of the meaning expressed by the sentence. Semantic analysis is achieved based on knowledge about the structure of words and sentences, with a view to precisely stipulating the meaning of words, phrases, sentences and texts, and subsequently also their purpose and consequences.
[0042] Step 114 comprises topic identification, which ascertains the topic or subject to which a sentence or set of sentences relates. Topic identification is based on one or more topic models that are used to analyze text and classify statements or queries within the text to corresponding topics. Step 116 comprises intent identification, which seeks to ascertain the purpose or intention behind a statement or query. Step 118 comprises Named Entity Recognition, which tags or labels names of things (e.g. names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages) that are found within a sentence.
[0043] By implementing some or all of natural language processing steps 102 to 118 of Figure 1, a disambiguated, logically structured and tagged representation of human readable text may be generated, which can be used for machine processing of natural language statements or queries received from a user or operator.
[0044] In addition to natural language processing, the present invention also relies on storage and retrieval of structured data and unstructured data for the purposes of processing natural language statements or queries received from a user or operator. For the purposes of the invention, "unstructured data" refers to data that is received in the form of free text or natural language text, interpretation whereof relies on interpretation of individual terms as well as on context associated with individual terms. For the purposes of the invention, "structured data" means data that is stored or received in a predefined format which may have one or more input rules, and interpretation whereof is independent of context. Structured data may be obtained in any number of ways that are well understood in the art - for example, data acquisition through sensors (for example in the case of a historian database), data that is stored while transactions are executed (for example, in a database consisting of order details corresponding to a product) or data that is entered manually during execution of work flows (for example, information recordal in the course of lab analysis of a product). In an embodiment of the invention, structured data may comprise data that is already stored in a system database, and which is not a product of operator input.
[0045] The invention achieves logical storage and retrieval of both structured and unstructured data using a data repository system of the type described in Figure 2. Figure 2 illustrates a data repository system comprising one or more repositories 202a, 202b, 202c and 202d configured for storing structured data, as well as a repository 204 configured for storing unstructured data.
[0046] Repositories 202a to 202d may comprise databases that are configured to logically store structured data. Said repositories are based on database schemas configured to store data that is received in predefined formats. Common examples of structured databases that are likely to be encountered within industrial plants include archival databases (e.g. historian databases), enterprise resource planning databases (ERP databases), laboratory information management system databases (LIMS databases), databases for storing data relevant to monitoring and resolution of service disruptions (Incident Management databases), databases for storing data relevant to access control and permissions (Work Permit System databases) and databases for storing data relevant to plant management information systems (Plant MIS databases).
[0047] For the purposes of the present invention, repository 204 for storing unstructured data may be implemented by way of a knowledge base that is generated based on one or more ontological models.
[0048] An ontology comprises a set of interconnected concepts and relationships within a particular domain. Concepts may be understood as classes of instances / things, while relationships signify the relation or linkages between such classes. Ontologies are used and adopted as a way to understand, process and apply domain specific knowledge and information. By way of example, an ontology for traffic may include concepts such as vehicles, street lights, geographical locations, time of day, and traffic volume. The traffic ontology also includes relationships between concepts. For example, a specific vehicle may be "halted" at a particular street light that is in turn "located within" a specified geographical location. Each instance of a concept may have different attributes - for example, vehicles may be classified as cars, buses, two-wheeled vehicles etc. Attributes of concept instances may be defined in terms of corresponding attribute information - for example, in the case of vehicles, each concept instance may be defined in terms of attribute information such as size, make, model, number of wheels, public or private transport etc.
[0049] Figure 3 illustrates an exemplary ontology comprising concepts 1 to 5 that are interconnected by virtue of relationships 1 to 5.
[0050] An ontology is represented by an ontology model - which is a structure for representing an ontology. Ontology models may inter alia be represented using a graph structure such as a tree structure, where concepts are represented by nodes and relationships between concepts are represented by edges. In implementing an ontology, node attributes may be used to define properties of each concept instance.
[0051] When an ontology is implemented within an actual data structure, and used to store instances of specific ontology elements, the implemented data structure comprises a knowledge base (i.e. an ontological model based knowledge base). For the purposes of the present invention, repository 204 (which is configured for logical storage and retrieval of unstructured data) comprises a knowledge base of the kind described above.
[0052] For the purposes of the invention, when referring to a knowledge base that has been implemented based on an ontological model, each concept from the ontological model is implemented as a concept class, and specific concept instances corresponding to a concept class may be stored by storing attribute data specific to said concept instance. Relationships between concept instances may be defined by linking said concept instances. In an embodiment, said relationships between concept instances may be established by linking using primary keys, secondary keys, pointers or other links within the data structure being used to implement the ontological model.
[0053] In an embodiment of the invention, the ontological model which forms the basis for implementation of the knowledge base may comprise a model specifically designed or configured to accommodate or store data corresponding to concepts and relationships that are likely to be encountered within the intended domain of implementation.
[0054] Figure 4A provides an exemplary embodiment of an ontological model configured to accommodate concepts and relationships within data obtained within an industrial plant. Nodes within the ontological model are configured to accommodate data corresponding to the "Equipment Name", "Rated Output", "Equipment Class", "Manufacturer", "Operator ID" and "Date" concept classes. Edges within the ontological model establish (i) a "generates" relation between "Equipment Name" and "Rated Output" nodes, (ii) a "classified as" relation between "Equipment Name" and "Equipment Class" nodes, (iii) a "preferred vendor of spare parts" relation between "Equipment Class" and "Manufacturer" nodes, (iv) a "manufactured by" relation between "Equipment Name" and "Manufacturer" nodes, (v) an "operated by" relation between "Equipment Name" and "Operator ID" nodes and (vi) a "last serviced on" relation between "Equipment Name" and "Date" nodes. Figure 4C illustrates an exemplary table data structure which may be used to implement the ontological model of Figure 4A.
[0055] Figure 4B provides an exemplary illustration of the manner in which a knowledge base that is implemented based on the ontological model of Figure 4A may be used to logically store unstructured data from the industrial plant. Figure 4B illustrates logical storage of the following set of unstructured data relating to a particular item of equipment i.e. Steam Turbine No. 3:
"Steam Turbine No. 3 has a maximum output rating of 40 KW, and is classified as heavy rotating machinery. The steam turbine is presently operated by operator 432, and was last serviced on March 3, 2015. The machine has been manufactured by Heavy Machines Inc. - who are also designated preferred vendors for obtaining spares for heavy rotating machinery."
[0056] By parsing the above unstructured text, and using natural language processing techniques for identifying concepts and relationships therebetween, it is possible to logically store data describing said unstructured text within the knowledge base illustrated in Figure 4B. Figure 4D illustrates the exemplary table data structure of Figure 4C which has been populated with the specific interrelated concept instances of Figure 4B.
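By way of illustration only, the ontological model of Figure 4A and the populated knowledge base of Figure 4B may be sketched in Python as sets of linked (concept class, value) pairs. The class name `KnowledgeBase`, the method `add_link` and the tuple layout are assumptions made for this sketch; they are not the table data structures of Figures 4C and 4D, which are not reproduced in the text.

```python
# Ontological model of Figure 4A as (subject class, relation, object class).
ONTOLOGY_MODEL = {
    ("Equipment Name", "generates", "Rated Output"),
    ("Equipment Name", "classified as", "Equipment Class"),
    ("Equipment Class", "preferred vendor of spare parts", "Manufacturer"),
    ("Equipment Name", "manufactured by", "Manufacturer"),
    ("Equipment Name", "operated by", "Operator ID"),
    ("Equipment Name", "last serviced on", "Date"),
}

class KnowledgeBase:
    """Hypothetical container for concept instances and their links."""

    def __init__(self, model):
        self.model = model
        self.links = []  # each link: ((class, value), relation, (class, value))

    def add_link(self, subject, relation, obj):
        """Store a relationship only if the ontological model permits it."""
        if (subject[0], relation, obj[0]) not in self.model:
            raise ValueError(f"not in model: {subject[0]} {relation} {obj[0]}")
        self.links.append((subject, relation, obj))

# Concept instances taken from the Steam Turbine No. 3 text.
kb = KnowledgeBase(ONTOLOGY_MODEL)
turbine = ("Equipment Name", "Steam Turbine No. 3")
machinery = ("Equipment Class", "heavy rotating machinery")
maker = ("Manufacturer", "Heavy Machines Inc.")
kb.add_link(turbine, "generates", ("Rated Output", "40 KW"))
kb.add_link(turbine, "classified as", machinery)
kb.add_link(machinery, "preferred vendor of spare parts", maker)
kb.add_link(turbine, "manufactured by", maker)
kb.add_link(turbine, "operated by", ("Operator ID", "432"))
kb.add_link(turbine, "last serviced on", ("Date", "March 3, 2015"))
print(len(kb.links))
```

Validating each link against the model is one possible design choice; it keeps the stored instances consistent with the concept classes and relations that the model defines.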
[0057] Figure 5 illustrates a method embodiment of the present invention. The method relies on a data repository system of the kind illustrated in Figure 2, which comprises at least one repository configured for storing structured data, and at least one knowledge base that is implemented based on an ontological model, and configured for enabling storage and logical retrieval of unstructured data.
[0058] Step 502 comprises receiving an operator input, which operator input may comprise speech input or text input. In the event the operator input comprises speech input, the method would include the additional steps (not illustrated) of extracting acoustic vectors from the input speech, and applying speech-to-text conversion for generating a set of human readable text characters, before proceeding to step 504. If on the other hand the operator input comprises text inputs, the additional steps may be omitted and the method proceeds directly to step 504.
[0059] Step 504 comprises application of natural language processing techniques to the generated or received set of text characters, which natural language processing techniques may include some or all of the steps described in connection with Figure 1.
[0060] Step 506 comprises determining, based on the output of the natural language processing techniques, whether the operator input comprises a statement or a query.
[0061] In the event the operator input is determined to comprise a statement, the method proceeds to step 508, wherein concepts and relationships extracted from the operator input are used to update the knowledge base within the data repository system.
[0062] Alternatively, in the event the operator input is determined to comprise a query, the method instead proceeds to step 510, at which at least one of a structured database(s) and a knowledge base are searched for data that matches a query string or query parameters extracted from said query. At step 512, one or more results returned by the search operations are used to generate a response to the query, which response may be presented to the operator either in the form of a displayed text response, or in the form of speech output generated by text-to-speech operations.
[0063] It would be understood that the method of Figure 5 permits an operator input to be classified as a statement and dealt with in accordance with step 508, while another operator input may be classified as a query and dealt with in accordance with steps 510 and 512. The method of Figure 5 may therefore alternate between step 508 and steps 510 and 512, depending on the contents of received operator input.
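The branching between statements and queries described above may be sketched as follows. The specification does not state how step 506 distinguishes a statement from a query, so the heuristic in `is_query` (a trailing question mark or a leading question word) is purely an assumption for illustration, as are the function names and the use of a plain list as a stand-in for the knowledge base.

```python
QUESTION_WORDS = {
    "what", "when", "where", "who", "which", "why", "how",
    "is", "are", "was", "were", "has", "have", "did", "does",
}

def is_query(text):
    """Crude stand-in for step 506: trailing '?' or a leading question word."""
    stripped = text.strip()
    words = stripped.lower().split()
    return stripped.endswith("?") or (bool(words) and words[0] in QUESTION_WORDS)

def handle_operator_input(text, knowledge_base, search):
    """Route an operator input to the statement branch or the query branch."""
    if is_query(text):
        results = search(text)        # step 510: search the repositories
        return ("response", results)  # step 512: basis for the response
    knowledge_base.append(text)       # step 508: update the knowledge base
    return ("updated", None)

kb = []
print(handle_operator_input("Pump 7 was serviced today.", kb, lambda q: []))
print(handle_operator_input("What is the rated output?", kb, lambda q: ["40 KW"]))
```

In this sketch the statement is stored as raw text; in the method described, step 508 would instead store extracted concepts and relationships.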
[0064] Figure 6 illustrates an exemplary embodiment of the method more generally discussed in connection with method step 508 of Figure 5, wherein concepts and relationships extracted from the operator input may be used to update the knowledge base within the data repository system. At step 602, data corresponding to concepts and relationships are extracted from the received operator input (i.e. from the set of text characters received at or generated based on step 502 of Figure 5). At step 604, the extracted concept and relationship data is used to populate the knowledge base by storing concept data as concept instances within the knowledge base. Extracted relationships between specific concept instances may simultaneously be reflected within the knowledge base by linking interrelated concept instances using one or more keys (such as primary keys or secondary keys), pointers or other links. Steps 602 and 604 accordingly enable logical storage of unstructured data that may be received by way of natural language statements made by a system operator - which data may subsequently be subjected to logical search and retrieval operations executed on the knowledge base.
[0065] Figure 7 illustrates an exemplary embodiment of the method more generally discussed in connection with method step 510 of Figure 5, wherein at least one of a structured database(s) and a knowledge base are searched by seeking a match for a query string or query parameters extracted from said query.
[0066] At step 702, the method generates a first set of query parameters (or a first query string) based on output from natural language processing steps that have been applied (at step 504 of Figure 5) to the received operator input. Step 704 checks to ascertain whether the first set of query parameters (or the first query string) returns a valid search result from one or more structured databases that form a part of the data repository system (see Figure 2). Responsive to the first set of query parameters (or first query string) returning valid search results from the one or more structured databases, the method proceeds at step 710 to generate a response to the operator input query based on the returned one or more search results.
[0067] Alternatively, responsive to the first set of query parameters (or first query string) not returning valid search results (hereinafter a "null result") from the one or more structured databases, the method proceeds to step 706, which comprises generating a second set of query parameters (or second query string) based on output from natural language processing steps that have been applied (at step 504 of Figure 5) to the received operator input.
[0068] Step 708 thereafter checks to ascertain whether the second set of query parameters (or second query string) returns a valid search result from the knowledge base that forms a part of the data repository system (see Figure 2). Responsive to the second set of query parameters (or second query string) returning valid search results from the knowledge base, the method proceeds at step 710 to generate a response to the operator input query based on the returned search results.
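The ordering of steps 702 to 710, in which the knowledge base is consulted only after the structured databases return a null result, may be sketched as below. The function `answer_query`, the dictionary keys and the two search callables are hypothetical names introduced for this sketch; the specification does not prescribe how the first and second sets of query parameters differ.

```python
def answer_query(nlp_output, search_structured, search_knowledge_base):
    """Return (source, results), trying structured databases first."""
    # Step 702: first set of query parameters (assumed composition).
    first_params = {
        "intent": nlp_output.get("intent"),
        "entities": nlp_output.get("entities", []),
    }
    results = search_structured(first_params)       # step 704
    if results:
        return "structured database", results       # basis for step 710

    # Step 706: reached only on a null result from the structured databases.
    second_params = {
        "entities": nlp_output.get("entities", []),
        "relation": nlp_output.get("relation"),
    }
    results = search_knowledge_base(second_params)  # step 708
    if results:
        return "knowledge base", results            # basis for step 710
    return None, []

nlp_output = {
    "intent": "lookup",
    "entities": ["Steam Turbine No. 3"],
    "relation": "generates",
}
source, results = answer_query(
    nlp_output,
    search_structured=lambda params: [],            # simulated null result
    search_knowledge_base=lambda params: ["40 KW"],
)
print(source, results)
```

Passing the two searches in as callables keeps the fallback logic independent of any particular database or knowledge base implementation.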
[0069] In an embodiment of the method illustrated in Figure 7, the knowledge base on which the searches of step 708 are executed may be implemented based on an ontological model having interrelated concept classes corresponding to (i) one or more objects, equipment or components within an industrial plant, (ii) observed events or effects within said industrial plant, (iii) causative factors related to observed events, (iv) operator inputs or operator requests in response to an observed event or a determined causative factor, (v) any action executed in response to the operator input or operator request and (vi) temporal data corresponding to any or all of the above.
[0070] Figure 8 describes the specific method steps involved in ascertaining whether a first set of query parameters return valid search results from the one or more structured databases (step 704 of Figure 7) . Likewise, Figure 9 further describes specific method steps involved in ascertaining whether a second set of query parameters return valid search results from the knowledge base (step 708 of Figure 7).
[0071] In Figure 8, step 802 comprises generating a query string based on a first set of query parameters extracted from output of the natural language processing steps that have been applied (at step 504 of Figure 5) to the received operator input. The query string may be generated based on one or more predefined rules for generation of a query string. In an embodiment of the method, the query string may comprise the entire string of text characters extracted as output of the natural language processing steps. In another embodiment of the method, the query string may comprise one or more of topics, intents and/or named entities identified in the course of the executed natural language processing steps.
[0072] Step 804 ascertains whether the query string matches any predefined query pattern within a predefined set of query patterns. In an embodiment of the invention, the predefined set of query patterns comprises a plurality of query patterns generated in Artificial Intelligence Markup Language (AIML) syntax, and step 804 ascertains whether the query string matches any of the predefined AIML query patterns within said predefined set of query patterns.
[0073] Responsive to the query string matching a predefined query pattern, step 806 comprises executing a set of database query instructions which correspond to the matching predefined query pattern. In the embodiment where the predefined query pattern is an AIML pattern generated using AIML syntax, step 806 comprises executing a set of database query instructions encapsulated within an AIML template corresponding to said AIML pattern. In a yet more particular embodiment of step 804, the execution of the set of database query instructions encapsulated within the AIML template may rely on one or more query parameters extracted from the query string. For example, selection of a structured database (from among a plurality of structured databases within a data repository system) may be based on one or more topics, intents or named entities extracted from the query string.
[0074] It would be understood that the set of database query instructions encapsulated within an AIML template may comprise a single database query instruction or a plurality of database query instructions. Further, the database query instructions may comprise database query instructions in any query language capable of querying one or more structured databases within the data repository system. Without intending to limit the scope of the present invention, exemplary query languages that may be used to implement the present invention include SQL, .QL, CQL, CQLF, COFL, DMF, Datalog, F-logic, FQL, OQL, OCX etc.
[0075] At step 808, the method generates a response to the query string, based on results returned by the executed set of database query instructions, which response may be presented (e.g. in text form or speech form) to the operator.
[0076] To better understand the method steps involved in Figure 8, an example involving implementation of AIML syntax may be considered. An operator may, by way of speech or text input, ask "Have any recent incidents been observed in connection with the distillation column?". Applying natural language processing steps, the invention would identify the operator input as a query, and based on identified topics, intents and/or named entities, may extract the following information for search through structured databases:
• Query String: Last Similar Incident
• Equipment: The distillation column
• Equipment type: Distillation column
• Time stamp: Unknown
[0077] Responsive to generation of the query string "Last Similar Incident", step 804 of Figure 8 searches for a matching AIML query pattern. Given that the topic of the query string involves incidents, in an embodiment, step 804 may search for a matching AIML query pattern only among query patterns that are associated with the Incident Database. In the example, the following matching AIML pattern (and corresponding AIML template) is found:
<category>
  <pattern> Last Similar Incident </pattern>
  <template>
    <extract>
      <executedatabase type = "Incident Database">
        <command try = "1">
          Similar Incident
          <context>
            <name datatype = "text"/>
            <equipmentclass datatype = "text"/>
            <equipmentname datatype = "text"/>
            <equipmenttype datatype = "text"/>
          </context>
        </command>
        <command try = "2">
          Similar Incident
          <context>
            <name datatype = "text"/>
            <incidenttype datatype = "text"/>
            <equipmentname datatype = "text"/>
          </context>
        </command>
      </executedatabase>
    </extract>
    <responseText>
      The Similar Incident was found by <that> <ctxt> value0 </ctxt></that> at <that><ctxt> value1 </ctxt></that><response> value2 </response> </that>
    </responseText>
  </template>
</category>
[0078] The query language instructions encapsulated within the AIML template which corresponds to the matched AIML pattern (i.e. <pattern> Last Similar Incident </pattern>) are executed once within the Incident database, including as a search parameter, equipment name as "the distillation column". If no result is obtained, a wider set of search instructions is executed at try 2 by eliminating the equipment name as a search parameter. Assuming a result is obtained, the values at positions 0, 1 and 2 of the returned search results may be copied into placeholder locations value0, value1 and value2 of the defined response text, which response text is thereafter further processed for presentation to the querying operator. As will be understood from the above example, in certain embodiments, the predefined set of AIML patterns and corresponding templates may additionally encapsulate a response format into which search results that are retrieved from a structured database may be incorporated for presentation to the operator.
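The two-stage search described above, in which a narrow query is tried first and a wider query is executed only if the first returns nothing, may be sketched in Python using an in-memory SQLite database. The table name, column names and sample row are invented for this sketch and are not taken from the specification; SQL is used here simply because it is one of the query languages the specification lists.

```python
import sqlite3

# Hypothetical schema for an Incident database; the sample row is invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (reported_by TEXT, occurred_at TEXT, "
    "equipment_name TEXT, equipment_type TEXT, description TEXT)"
)
conn.execute(
    "INSERT INTO incidents VALUES (?, ?, ?, ?, ?)",
    ("operator 432", "2015-03-03", "distillation column B",
     "distillation column", "pressure excursion"),
)

SELECT = "SELECT reported_by, occurred_at, description FROM incidents WHERE "
LATEST = " ORDER BY occurred_at DESC LIMIT 1"

def last_similar_incident(conn, equipment_name, equipment_type):
    """Run try 1, then the wider try 2 only if try 1 returns nothing."""
    tries = [
        # Try 1: equipment name is included as a search parameter.
        ("equipment_name = ? AND equipment_type = ?",
         (equipment_name, equipment_type)),
        # Try 2: equipment name is eliminated to widen the search.
        ("equipment_type = ?", (equipment_type,)),
    ]
    for where, params in tries:
        row = conn.execute(SELECT + where + LATEST, params).fetchone()
        if row is not None:
            return row
    return None

row = last_similar_incident(conn, "the distillation column", "distillation column")
if row is not None:
    # Positions 0, 1 and 2 of the result fill the response placeholders.
    print("The Similar Incident was found by {0} at {1} {2}".format(*row))
```

With the sample row above, try 1 finds no incident for the exact equipment name, so the result is obtained from try 2.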
[0079] Figure 9 provides further detail regarding step 708 of Figure 7, which ascertains whether a second set of query parameters returns valid search results from within a knowledge base.
[0080] Step 902 of Figure 9 comprises generating an ontological query model based on a second set of query parameters extracted from output of the natural language processing steps that have been applied (at step 504 of Figure 5) to operator input. The ontological query model is an ontology model comprising concept instances and relations extracted from a query string or set of query parameters. In an embodiment of the invention, an ontological query model may include a set of data instances corresponding to (i) at least two interrelated concept instances or (ii) at least one concept instance and one corresponding relationship. The ontological query model may additionally include a concept instance placeholder which identifies a concept class (for which a corresponding data value is presently unknown or is sought to be found within the knowledge base) that is related to the set of data instances that have been used to construct the ontological query model.
[0081] At step 904, the method ascertains whether the ontology query model matches or maps on to any set of concept instances and relationships that are already stored within the knowledge base. The step of matching the ontological query model with data within the knowledge base can be achieved in any number of ways including using pattern matching, graph pattern matching or query languages that implement graph pattern matching (e.g. SPARQL).
[0082] In response to an identified match between the ontology query model and a set of concept instances and relationships stored within the knowledge base, step 906 generates a response based on data extracted from the matched set of concept instances and relationships that has been identified within the knowledge base. In an embodiment of the invention, responsive to an ontology query model matching a set of concept instances and relationships stored within the knowledge base, step 906 extracts data corresponding to a concept instance within the knowledge base, which concept instance corresponds in terms of both concept class and relationships to a concept class placeholder within the ontological query model, and uses the extracted data to generate a response for presentation to an operator.
[0083] Figure 10 illustrates an exemplary ontology query model that may be generated in accordance with the method described above in connection with Figure 9. Figure 10 specifically illustrates an ontology query model that has been generated in response to a query relating to the industrial plant specific knowledge base that has been previously discussed in connection with Figure 4B.
[0084] Figures 10A and 10B illustrate an exemplary ontology query model generated based on the query "What is the rated output of Steam Turbine No. 3". The ontology query model of Figure 10A includes a concept instance 1002 having data value "Steam Turbine No. 3", which falls within concept class 1004 of the type "Equipment Name", and has a "generates" relationship corresponding to said concept instance 1002. The ontology query model additionally includes a concept class placeholder 1008a which falls within concept class 1006 of the type "Rated Output". In executing the method of Figure 9, the ontology query model of Figure 10A may be pattern matched against the knowledge base illustrated in Figure 4B. By virtue of pattern matching, the concept class 1006 of type "Rated Output" within the ontological query model is matched against the corresponding "Rated Output" concept class of the knowledge base. The pattern matching further establishes that the relevant concept instance of this "Rated Output" concept class is the one related to a concept instance of the class "Equipment Name" which has the specific data value "Steam Turbine No. 3". The invention accordingly determines (as illustrated in Figure 10B) that the relevant concept instance from the knowledge base (i.e. which most closely provides a match for the concept class placeholder within the ontological query model) is the concept instance 1008b having concept class "Rated Output" and having the data value "40 KW". This data value may be extracted from the knowledge base and may be used to generate and present a response to the operator query for which the ontological query model of Figure 10A was generated.
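The matching of a single-edge ontological query model against the knowledge base may be sketched as follows, using the Steam Turbine No. 3 data from the earlier example. This is a deliberately small stand-in for graph pattern matching: the tuple layout, the `PLACEHOLDER` marker and the function `match` are assumptions for illustration, and a query language that implements graph pattern matching (such as SPARQL, which the specification mentions) could serve the same purpose.

```python
# Knowledge base links: ((class, value), relation, (class, value)).
KNOWLEDGE_BASE = [
    (("Equipment Name", "Steam Turbine No. 3"), "generates",
     ("Rated Output", "40 KW")),
    (("Equipment Name", "Steam Turbine No. 3"), "operated by",
     ("Operator ID", "432")),
]

PLACEHOLDER = None  # marks the concept instance whose value is sought

def match(query, knowledge_base):
    """Return the values that fill the placeholder of a single-edge query."""
    (q_s_class, q_s_value), q_relation, (q_o_class, q_o_value) = query
    answers = []
    for (s_class, s_value), relation, (o_class, o_value) in knowledge_base:
        # Concept classes and the relation must agree with the query model.
        if (s_class, relation, o_class) != (q_s_class, q_relation, q_o_class):
            continue
        # A known value in the query must equal the stored value.
        if q_s_value not in (PLACEHOLDER, s_value):
            continue
        if q_o_value not in (PLACEHOLDER, o_value):
            continue
        # Report the stored value on whichever side holds the placeholder.
        answers.append(o_value if q_o_value is PLACEHOLDER else s_value)
    return answers

# "What is the rated output of Steam Turbine No. 3"
query = (("Equipment Name", "Steam Turbine No. 3"), "generates",
         ("Rated Output", PLACEHOLDER))
print(match(query, KNOWLEDGE_BASE))
```

The same function answers the reverse question when the placeholder is put on the subject side, for example to find which equipment is operated by a given operator.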
[0085] Figure 11 illustrates a system 1100 in accordance with the present invention, which may be configured to implement one or more of the above discussed methods for generating responses to natural language queries received by way of operator input. The system 1100 comprises a speech input 1102, a speech processing engine 1104, natural language engine 1106, structured database comparator 1108, knowledge base comparator 1110, response generator 1112, operator interface 1114, structured database(s) 1116, knowledge base 1118 and knowledge base updater 1120.
[0086] Operator input device 1102 may comprise any device configured to receive speech or text inputs from an operator. In an embodiment, operator input device 1102 may comprise any input device capable of receiving speech or sound based input signals, including by way of example a microphone, acoustic sensor or other device capable of receiving sound waves and generating electromagnetic signals representing received sound waves. In another embodiment, operator input device 1102 may comprise any text input device, including by way of example a keyboard, keypad, touch screen interface or other peripheral device configured to enable an operator to input text characters.
[0087] Speech processing engine 1104 comprises a processor implemented engine configured for extracting acoustic vectors from speech signals and to implement speech-to-text conversion based on the extracted acoustic vectors - for converting received speech input to a set of readable text characters.
[0088] Natural language engine 1106 comprises a processor implemented engine for implementing one or more of the natural language processing steps described in connection with Figure 1. In an embodiment, natural language engine 1106 may be configured to implement one or more of sentence segmentation, tokenization, lexical analysis, syntactical analysis, semantic analysis, topic identification, intent identification and named entity recognition. In an embodiment, natural language engine 1106 may additionally be configured to determine whether an operator input comprises a statement or a query.
[0089] Structured database comparator 1108 comprises a processor implemented comparator configured for extracting a first set of query parameters or query string from output received from natural language engine 1106, and for determining whether the query parameters or query string return any valid search results based on data within structured database 1116. In an embodiment, structured database comparator 1108 may be configured to implement method steps discussed above in connection with Figures 7 and 8.
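One way a comparator of this kind could decide whether a query string returns valid results is sketched below. It is an illustration under stated assumptions: regular expressions stand in for the predefined query patterns, SQLite stands in for structured database 1116, and the `equipment` table, its contents and the function name `search_structured_db` are all hypothetical.

```python
import re
import sqlite3

# Hypothetical predefined query patterns, each paired with a parameterised
# SQL statement. Regular expressions are used here purely as a stand-in.
QUERY_PATTERNS = [
    (
        re.compile(r"^WHAT IS THE STATUS OF (?P<tag>.+)$"),
        "SELECT status FROM equipment WHERE tag = :tag",
    ),
]


def search_structured_db(conn, query_string):
    """Return matching rows, or an empty list (a null result) if none are found."""
    normalized = query_string.strip().rstrip("?").upper()
    for pattern, sql in QUERY_PATTERNS:
        match = pattern.match(normalized)
        if match:
            # Named groups from the pattern become the SQL bind parameters.
            return conn.execute(sql, match.groupdict()).fetchall()
    return []


# Hypothetical in-memory database used only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE equipment (tag TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO equipment VALUES ('PUMP 7', 'RUNNING')")

hit = search_structured_db(conn, "What is the status of Pump 7?")
miss = search_structured_db(conn, "What is the rated output of Steam Turbine No. 3")
```

Binding the extracted parameters rather than concatenating them into the SQL text keeps operator input from altering the statement that is executed.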
[0090] Knowledge base comparator 1110 comprises a processor implemented comparator configured for extracting a second set of query parameters or an ontological query model from output received from natural language engine 1106, and for determining whether a search based on the second set of query parameters or ontological query model returns any valid search results from within knowledge base 1118. In an embodiment, knowledge base comparator 1110 may be configured to implement method steps discussed above in connection with Figures 7 and 9. In a specific embodiment of the invention, knowledge base comparator 1110 may be configured to find and return valid search results from a knowledge base, by generating an ontological query model representative of an operator query and searching the knowledge base for a matching graph structure, using pattern matching or graph pattern matching.
[0091] Response generator 1112 comprises a processor implemented generator for generating a response to an operator query based on search results received from either of structured database comparator 1108 or knowledge base comparator 1110. Response generator 1112 communicates the generated response to output device 1114. In an embodiment of the invention, output device may comprise a display for presenting a text based response to an operator. In another embodiment of the invention, output device may comprise a text-to-speech converter for converting a text based response to a speech signal and implementing playback of said speech signal to an operator.
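The hand-off between the two comparators and the response generator can be sketched as below, following the arrangement recited in the claims in which the knowledge base is searched when the structured database returns a null result. The function names, the stub searchers and the response wording are hypothetical and serve only to make the control flow concrete.

```python
from typing import Callable, Sequence

# A searcher takes the operator query and returns zero or more result strings.
Searcher = Callable[[str], Sequence[str]]


def answer_query(query: str, structured_search: Searcher, kb_search: Searcher) -> str:
    """Try the structured database first, then fall back to the knowledge base."""
    results = structured_search(query)
    source = "structured database"
    if not results:  # null result: fall back to the knowledge base
        results = kb_search(query)
        source = "knowledge base"
    if not results:
        return "No matching data was found."
    return f"{', '.join(results)} (source: {source})"


# Hypothetical stand-ins for comparators 1108 and 1110.
def structured_stub(query: str) -> Sequence[str]:
    return ["RUNNING"] if "status" in query else []


def kb_stub(query: str) -> Sequence[str]:
    return ["40 W"] if "rated output" in query else []


from_db = answer_query("status of pump 7", structured_stub, kb_stub)
from_kb = answer_query("rated output of Steam Turbine No. 3", structured_stub, kb_stub)
nothing = answer_query("unrelated question", structured_stub, kb_stub)
```

Passing the searchers in as callables keeps the response generator independent of how either data store is implemented.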
[0092] Knowledge base updater 1120 comprises a processor implemented updater configured to respond to a determination (by natural language engine 1106) that a received operator input comprises a statement, by updating knowledge base 1118 to include data corresponding to concept instances and relationships identified within or extracted from said statement. In an embodiment of the invention, knowledge base updater 1120 may be configured to implement method steps from the methods discussed above in connection with Figures 5 or 6.
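A minimal sketch of this update path is given below, assuming concept instances and relationships have already been extracted upstream. The `ConceptInstance` and `KnowledgeBase` types, the crude `is_query` heuristic (a stand-in for the classification performed by natural language engine 1106) and the example triples are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConceptInstance:
    concept_class: str
    value: str


@dataclass
class KnowledgeBase:
    triples: set = field(default_factory=set)

    def add(self, subject, relationship, obj):
        self.triples.add((subject, relationship, obj))


QUESTION_WORDS = {"what", "when", "where", "which", "who", "why", "how"}


def is_query(operator_input: str) -> bool:
    """Crude stand-in for statement/query classification."""
    text = operator_input.strip().lower()
    return text.endswith("?") or text.split(" ", 1)[0] in QUESTION_WORDS


def handle_input(kb, operator_input, extracted_triples) -> bool:
    """Store the extracted triples only when the input is a statement."""
    if is_query(operator_input):
        return False
    for subject, relationship, obj in extracted_triples:
        kb.add(subject, relationship, obj)
    return True


kb = KnowledgeBase()
turbine = ConceptInstance("Equipment Name", "Steam Turbine No. 3")
trip = ConceptInstance("Observed Event", "trip")        # hypothetical instance
cause = ConceptInstance("Causative Factor", "high vibration")  # hypothetical instance
extracted = [(turbine, "experienced", trip), (trip, "caused by", cause)]

stored = handle_input(kb, "Steam Turbine No. 3 tripped due to high vibration", extracted)
ignored = handle_input(kb, "What is the rated output of Steam Turbine No. 3", extracted)
```

Storing triples in a set makes repeated statements idempotent, which is one simple way to avoid duplicate concept instances.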
[0093] In an embodiment of the system illustrated in Figure 11, knowledge base 1118 may be implemented based on an ontological model having interrelated concept classes corresponding to (i) one or more objects, equipment or components within an industrial plant (ii) observed events or effects within said industrial plant (iii) causative factors related to observed events (iv) operator inputs or operator requests in response to an observed event or a determined causative factor (v) any action executed in response to the operator input or operator request and (vi) temporal data corresponding to any or all of the above.
[0094] Figure 12 illustrates an exemplary computing system in which various embodiments of the invention may be implemented.
[0095] The system 1202 comprises at least one processor 1204 and at least one memory 1206. The processor 1204 executes program instructions and may be a real processor. The processor 1204 may also be a virtual processor. The computer system 1202 is not intended to suggest any limitation as to scope of use or functionality of described embodiments. For example, the computer system 1202 may include, but not be limited to, one or more of a general-purpose computer, a programmed microprocessor, a micro-controller, an integrated circuit, and other devices or arrangements of devices that are capable of implementing the steps that constitute the method of the present invention. Exemplary embodiments of a system 1202 in accordance with the present invention may include one or more servers, desktops, laptops, tablets, smart phones, mobile phones, mobile communication devices, phablets and personal digital assistants. In an embodiment of the present invention, the memory 1206 may store software for implementing various embodiments of the present invention. The computer system 1202 may have additional components. For example, the computer system 1202 includes one or more communication channels 1208, one or more input devices 1210, one or more output devices 1212, and storage 1214. An interconnection mechanism (not shown) such as a bus, controller, or network, interconnects the components of the computer system 1202. In various embodiments of the present invention, operating system software (not shown) provides an operating environment for various software executing in the computer system 1202 using a processor 1204, and manages different functionalities of the components of the computer system 1202.
[0096] The communication channel(s) 1208 allow communication over a communication medium to various other computing entities. The communication medium provides information such as program instructions, or other data in a communication media. The communication media include, but are not limited to, wired or wireless methodologies implemented with an electrical, optical, RF, infrared, acoustic, microwave, Bluetooth or other transmission media.
[0097] The input device(s) 1210 may include, but are not limited to, a touch screen, a keyboard, mouse, pen, joystick, trackball, a voice device, a scanning device, or any other device that is capable of providing input to the computer system 1202. In an embodiment of the present invention, the input device(s) 1210 may be a sound card or similar device that accepts audio input in analog or digital form. The output device(s) 1212 may include, but are not limited to, a user interface on CRT, LCD, LED display, or any other display associated with any of servers, desktops, laptops, tablets, smart phones, mobile phones, mobile communication devices, phablets and personal digital assistants, printer, speaker, CD/DVD writer, or any other device that provides output from the computer system 1202.
[0098] The storage 1214 may include, but is not limited to, magnetic disks, magnetic tapes, CD-ROMs, CD-RWs, DVDs, any types of computer memory, magnetic stripes, smart cards, printed barcodes or any other transitory or non-transitory medium which can be used to store information and can be accessed by the computer system 1202. In various embodiments of the present invention, the storage 1214 contains program instructions for implementing the described embodiments.
[0099] In an embodiment of the present invention, the computer system 1202 is part of a distributed network.
[00100] The present invention may be implemented in numerous ways including as a system, a method, or a computer program product such as a computer readable storage medium or a computer network wherein programming instructions are communicated from a remote location.
[00101] The present invention may suitably be embodied as a computer program product for use with the computer system 1202. The method described herein is typically implemented as a computer program product, comprising a set of program instructions which is executed by the computer system 1202 or any other similar device. The set of program instructions may be a series of computer readable codes stored on a tangible medium, such as a computer readable storage medium (storage 1214), for example, diskette, CD-ROM, ROM, flash drives or hard disk, or transmittable to the computer system 1202, via a modem or other interface device, over either a tangible medium, including but not limited to optical or analogue communications channel(s) 1208. The implementation of the invention as a computer program product may be in an intangible form using wireless techniques, including but not limited to microwave, infrared, Bluetooth or other transmission techniques. These instructions can be preloaded into a system or recorded on a storage medium such as a CD-ROM, or made available for downloading over a network such as the Internet or a mobile telephone network. The series of computer readable instructions may embody all or part of the functionality previously described herein.
[00102] It will be understood that methods, systems and computer program products in accordance with the present invention provide a reliable and effective solution for natural language processing of operator inputs, and for logically storing and querying structured as well as unstructured data that are encountered within a defined knowledge domain.
[00103] While the exemplary embodiments of the present invention are described and illustrated herein, it will be appreciated that they are merely illustrative. It will be understood by those skilled in the art that various modifications in form and detail may be made therein without departing from or offending the spirit and scope of the invention as defined by the appended claims.

Claims

We claim:
1. A method for machine based processing of natural language input received from an operator within an industrial plant, the method comprising:
receiving an operator input from the operator;
classifying the received operator input as one of a query and a statement;
responsive to classification of the received operator input as a statement:
extracting data from the received operator input; and
updating a knowledge base by storing the extracted data as a set of interrelated concept instances within said knowledge base;
and
responsive to classification of the received operator input as a query:
searching at least one of a structured database and the knowledge base for data that matches one or more query parameters extracted from the received operator input; and
presenting to the operator, a query response based on a returned search result.
2. The method as claimed in claim 1, wherein the step of searching at least one of the structured database and the knowledge base comprises:
generating a first set of query parameters based on the received operator input;
performing search of the structured database based on the generated first set of query parameters; and
responsive to receiving a null result from search of the structured database:
generating a second set of query parameters based on the received operator input; and
performing search of the knowledge base, based on the generated second set of query parameters.
3. The method as claimed in claim 2, wherein:
a query string is generated based on the first set of query parameters; and
the step of performing search of the structured database comprises:
ascertaining whether the query string matches any predefined query pattern within a set of predefined query patterns; and
responsive to identification of a matching predefined query pattern, executing a set of query language instructions associated with the identified matching predefined query pattern.
4. The method as claimed in claim 3, wherein:
the set of predefined query patterns comprise AIML patterns, each of said AIML patterns having a corresponding AIML template associated therewith; and
at least one of said AIML templates comprises a set of query language instructions encapsulated therewithin.
5. The method as claimed in claim 3, wherein the structured database is selected from among a plurality of structured databases, and wherein said selection is based on at least one of the extracted one or more query parameters, or on the query language instructions associated with the identified matching predefined query pattern.
6. The method as claimed in claim 2, wherein:
an ontological query model is generated based on the second set of query parameters;
and wherein performing search of the knowledge base comprises:
ascertaining whether the ontological query model matches any part of the knowledge base;
responsive to identification of a match within the knowledge base, extracting data from the identified matched portion of the knowledge base; and
returning a search result comprising data extracted from the knowledge base.
7. The method as claimed in claim 1, wherein the knowledge base is implemented based on an ontological model having interrelated concept classes corresponding to all of (i) one or more objects, equipment or components within an industrial plant (ii) observed events within said industrial plant (iii) causative factors related to observed events (iv) operator inputs or operator requests in response to either of an observed event or a determined causative factor (v) any action executed in response to an operator input or operator request and (vi) temporal data.
8. The method as claimed in claim 6, wherein identifying a match between the ontological query model and the knowledge base is determined based on graph pattern matching.
9. The method as claimed in claim 1, wherein the received operator input comprises speech signals, and said received operator input is subjected to speech to text processing and natural language processing prior to classification as one of a query and a statement.
10. A system for machine based processing of natural language input received from an operator within an industrial plant, the system comprising:
at least one processor;
a data repository system comprising at least one structured database and at least one knowledge base;
an operator input device configured for receiving an operator input from the operator;
a natural language engine configured for processing the received operator input and classifying said received operator input as one of a query and a statement;
a knowledge base updater configured to respond to classification of the received operator input as a statement by:
extracting data from the received operator input; and
updating a knowledge base by storing the extracted data as a set of interrelated concept instances within said knowledge base;
and
at least one comparator configured to respond to classification of the received operator input as a query by:
searching at least one of the at least one structured database and the knowledge base for data matching one or more query parameters extracted from the received operator input; and
presenting to the operator, a query response based on a returned search result.
11. The system as claimed in claim 10, wherein the at least one comparator comprises:
a structured database comparator configured to:
generate a first set of query parameters based on the received operator input; and
perform search of at least one structured database based on the generated first set of query parameters;
and
a knowledge base comparator configured to respond to a null result arising from search of the structured database, by:
generating a second set of query parameters based on the received operator input; and
performing search of the knowledge base, based on the generated second set of query parameters.
12. The system as claimed in claim 11, wherein the structured database comparator is configured to:
generate a query string based on the first set of query parameters; and
perform search of the structured database including by:
ascertaining whether the query string matches any predefined query pattern within a set of predefined query patterns; and
responsive to identification of a matching predefined query pattern, executing a set of query language instructions associated with the matching predefined query pattern.
13. The system as claimed in claim 12, wherein:
the set of predefined query patterns comprise AIML patterns, each of said AIML patterns having a corresponding AIML template associated therewith; and
at least one of said AIML templates comprises a set of query language instructions encapsulated therewithin.
14. The system as claimed in claim 12, wherein the structured database comparator is configured to select the structured database from among a plurality of structured databases, and wherein said selection is based on at least one of the extracted one or more query parameters or on at least one query language instruction associated with the matching predefined query pattern.
15. The system as claimed in claim 11, wherein the knowledge base comparator is configured to:
generate an ontological query model based on the second set of query parameters; and
perform search of the knowledge base including by:
ascertaining whether the ontological query model matches any part of the knowledge base;
responsive to identification of a match within the knowledge base, extracting data from the matched portion of the knowledge base; and
returning a search result comprising data extracted from the knowledge base.
16. The system as claimed in claim 10, wherein the knowledge base is implemented based on an ontological model having interrelated concept classes corresponding to all of (i) one or more objects, equipment or components within an industrial plant (ii) observed events or effects within said industrial plant (iii) causative factors related to observed events (iv) operator inputs or operator requests in response to an observed event or a determined causative factor (v) any action executed in response to an operator input or operator request and (vi) temporal data.
17. The system as claimed in claim 15, wherein the knowledge base comparator is configured to identify matches between the ontological query model and the knowledge base, based on graph pattern matching.
18. The system as claimed in claim 10, further comprising:
a speech processor configured to:
receive speech signals from the operator input device;
perform speech to text conversion on the received speech signals; and
communicate the converted text to the natural language engine.
19. A computer program product for machine based processing of natural language input received from an operator within an industrial plant, the computer program product embodied on a non-transitory computer readable medium, and comprising computer readable instructions for:
receiving an operator input from the operator;
classifying the received operator input as one of a query and a statement;
responsive to classification of the received operator input as a statement:
extracting data from the received operator input; and
updating a knowledge base by storing the extracted data as a set of interrelated concept instances within said knowledge base;
and
responsive to classification of the received operator input as a query:
searching at least one of a structured database and the knowledge base for data matching one or more query parameters extracted from the received operator input; and
presenting to the operator, a query response based on a returned search result.
PCT/IB2016/050593 2015-03-30 2016-02-05 Methods, systems and computer program products for machine based processing of natural language input WO2016156995A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN1643/CHE/2015 2015-03-30
IN1643CH2015 2015-03-30

Publications (1)

Publication Number Publication Date
WO2016156995A1 (en) 2016-10-06

Family

ID=57003967

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2016/050593 WO2016156995A1 (en) 2015-03-30 2016-02-05 Methods, systems and computer program products for machine based processing of natural language input

Country Status (1)

Country Link
WO (1) WO2016156995A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111651990A (en) * 2020-04-14 2020-09-11 车智互联(北京)科技有限公司 Entity identification method, computing equipment and readable storage medium
CN112613312A (en) * 2020-12-18 2021-04-06 平安科技(深圳)有限公司 Method, device and equipment for training entity naming recognition model and storage medium
CN112818005A (en) * 2021-02-03 2021-05-18 北京清科慧盈科技有限公司 Structured data searching method, device, equipment and storage medium
CN114637766A (en) * 2022-05-18 2022-06-17 山东师范大学 Intelligent question-answering method and system based on natural resource industrial chain knowledge graph

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5991726A (en) * 1997-05-09 1999-11-23 Immarco; Peter Speech recognition devices
US20010053969A1 (en) * 2000-03-22 2001-12-20 Wide Roeland Hogenhout Natural language machine interface
US20020010574A1 (en) * 2000-04-20 2002-01-24 Valery Tsourikov Natural language processing and query driven information retrieval
US20040039734A1 (en) * 2002-05-14 2004-02-26 Judd Douglass Russell Apparatus and method for region sensitive dynamically configurable document relevance ranking
WO2005062202A2 (en) * 2003-12-23 2005-07-07 Thomas Eskebaek Knowledge management system with ontology based methods for knowledge extraction and knowledge search
WO2007125108A1 (en) * 2006-04-27 2007-11-08 Abb Research Ltd A method and system for controlling an industrial process including automatically displaying information generated in response to a query in an industrial installation
US20080059432A1 (en) * 2006-09-01 2008-03-06 Yokogawa Electric Corporation System and method for database indexing, searching and data retrieval
US20080263006A1 (en) * 2007-04-20 2008-10-23 Sap Ag Concurrent searching of structured and unstructured data
US20120254143A1 (en) * 2011-03-31 2012-10-04 Infosys Technologies Ltd. Natural language querying with cascaded conditional random fields
US20140006012A1 (en) * 2012-07-02 2014-01-02 Microsoft Corporation Learning-Based Processing of Natural Language Questions

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SUN.: "XML-based Agent Scripts and Inference Mechanisms.", DISS., August 2003 (2003-08-01), pages 1 - 55, XP055315209, Retrieved from the Internet <URL:http://digital.library.unt.edu/ark:/67531/metadc4288/m2/1/high_res_d/thesis.pdf> *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111651990A (en) * 2020-04-14 2020-09-11 车智互联(北京)科技有限公司 Entity identification method, computing equipment and readable storage medium
CN111651990B (en) * 2020-04-14 2024-03-15 车智互联(北京)科技有限公司 Entity identification method, computing device and readable storage medium
CN112613312A (en) * 2020-12-18 2021-04-06 平安科技(深圳)有限公司 Method, device and equipment for training entity naming recognition model and storage medium
CN112613312B (en) * 2020-12-18 2022-03-18 平安科技(深圳)有限公司 Method, device and equipment for training entity naming recognition model and storage medium
CN112818005A (en) * 2021-02-03 2021-05-18 北京清科慧盈科技有限公司 Structured data searching method, device, equipment and storage medium
CN112818005B (en) * 2021-02-03 2024-02-02 北京清科慧盈科技有限公司 Structured data searching method, device, equipment and storage medium
CN114637766A (en) * 2022-05-18 2022-06-17 山东师范大学 Intelligent question-answering method and system based on natural resource industrial chain knowledge graph
CN114637766B (en) * 2022-05-18 2022-08-26 山东师范大学 Intelligent question-answering method and system based on natural resource industrial chain knowledge graph

Similar Documents

Publication Publication Date Title
US10262062B2 (en) Natural language system question classifier, semantic representations, and logical form templates
KR102288249B1 (en) Information processing method, terminal, and computer storage medium
JP5936698B2 (en) Word semantic relation extraction device
CN111291161A (en) Legal case knowledge graph query method, device, equipment and storage medium
CN110727779A (en) Question-answering method and system based on multi-model fusion
CN110263248B (en) Information pushing method, device, storage medium and server
KR102491172B1 (en) Natural language question-answering system and learning method
JP2002288201A (en) Question-answer processing method, question-answer processing program, recording medium for the question- answer processing program, and question-answer processor
CN109947952B (en) Retrieval method, device, equipment and storage medium based on English knowledge graph
CN102279894A (en) Method for searching, integrating and providing comment information based on semantics and searching system
US20140180728A1 (en) Natural Language Processing
WO2016156995A1 (en) Methods, systems and computer program products for machine based processing of natural language input
KR101709055B1 (en) Apparatus and Method for Question Analysis for Open web Question-Answering
WO2002089004A2 (en) Search data management
CN109446313B (en) Sequencing system and method based on natural language analysis
KR20210063874A (en) A method and an apparatus for analyzing marketing information based on knowledge graphs
CN111753522A (en) Event extraction method, device, equipment and computer readable storage medium
CN112380848B (en) Text generation method, device, equipment and storage medium
CN113297251A (en) Multi-source data retrieval method, device, equipment and storage medium
RU2718978C1 (en) Automated legal advice system control method
KR20210063878A (en) A method and an apparatus for providing chatbot services of analyzing marketing information
CN110688559A (en) Retrieval method and device
CN115525750A (en) Robot phonetics detection visualization method and device, electronic equipment and storage medium
CN113806492A (en) Record generation method, device and equipment based on semantic recognition and storage medium
KR20220074572A (en) A method and an apparatus for extracting new words based on deep learning to generate marketing knowledge graphs

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16771470

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16771470

Country of ref document: EP

Kind code of ref document: A1