Concept for a Natural Language Processing (NLP) Application: Artificial Intelligence (AI) Technology for Text and Language Search (ATTLS)

M. Niv, N. Kumar, E. Henry
T&T Consulting Services, Inc,
United States

Keywords: artificial intelligence, AI, natural language processing, NLP, knowledge representation language, query, languages


We propose a concept for a Natural Language Processing (NLP) application that probes big data sets to answer content questions: a centralized query capability that lets specialists in various domains use their own language and domain knowledge to query the results of complex systems and gain insight into outcomes and characteristics. Crucially, the query capability must not require specialists to learn the syntax, conventions, and vocabulary of a query language (e.g., Structured Query Language, SQL). Instead, specialists should be able to interact with the system naturally, as they do with modern technologies (e.g., Siri). The final model should be able to grow to cover all relevant domain-elements of the system. These “domain-elements” are vocabulary, jargon, typographical conventions, and acronyms, along with graphical conventions and visualization techniques. Domain-elements also include inferential or calculation mini-algorithms that promote user understanding. For example, to answer a question about an “inventory exhausted” event, one must find the earliest record in which the resource quantity is zero and which occurs later in time than a prior record of a positive quantity. Natural language queries can also be ambiguous, and we propose sub-components that resolve such ambiguity using domain knowledge. The proposed syntactic sub-system design is flexible, adaptable, and intelligent enough to represent the variety of natural language structures. To this end, we are developing an Artificial Intelligence (AI) Technology for Text and Language Search (ATTLS) to perform knowledge elicitation, design a core NLP platform, engineer knowledge into a lexicon, and demonstrate the platform. We start with active learning of expert knowledge in the elicitation phase, followed by design of the core NLP platform around a meaning representation system for English words, the Knowledge Representation Language (KRL).
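As an illustration of the “inventory exhausted” mini-algorithm described above, the following is a minimal Python sketch. It is not the ATTLS implementation: the `InventoryRecord` type, its field names, and the function name are hypothetical, and the sketch assumes each record carries a sortable timestamp and a numeric quantity.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class InventoryRecord:
    """Hypothetical record shape: one observation of a resource quantity."""
    timestamp: int  # assumption: any sortable time value would do
    quantity: int


def find_inventory_exhausted(
    records: Iterable[InventoryRecord],
) -> Optional[InventoryRecord]:
    """Return the earliest zero-quantity record that follows a positive one.

    Zero-quantity records seen before any positive quantity are ignored,
    because nothing had been in stock yet to exhaust.
    """
    seen_positive = False
    for rec in sorted(records, key=lambda r: r.timestamp):
        if rec.quantity > 0:
            seen_positive = True
        elif rec.quantity == 0 and seen_positive:
            return rec
    return None


# Illustrative data only (not taken from any real data set).
history = [
    InventoryRecord(timestamp=1, quantity=0),
    InventoryRecord(timestamp=2, quantity=5),
    InventoryRecord(timestamp=3, quantity=2),
    InventoryRecord(timestamp=4, quantity=0),
    InventoryRecord(timestamp=5, quantity=0),
]
exhausted = find_inventory_exhausted(history)
```

With this data the function skips the zero at timestamp 1, since no positive quantity precedes it, and returns the record at timestamp 4.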
We design and build a software module that composes the meaning representation of a query from the meanings of its individual words (the NLP module). We encapsulate this model in a second software module, the Query Interpreter (QI), which interprets the composed meaning representation of a query and maps it to a standard database query language (e.g., SQL). We will conduct Knowledge Engineering (KE), using available information to develop a set of analytic questions. We use KE to enrich the Target Questions (TQ), the questions this prototype is intended to handle. We also use KE to create meaning representations for the English words and terms that appear in the TQ; these representations serve as the Lexicon. The Demonstration Platform is a software system that integrates the software and knowledge elements into a user testbed for gaining a practical understanding of how the system handles the TQ. Visualization modalities will present results as tabular lists, summary statistics, single-value answers, time series line graphs, and bar graphs. The final software system will integrate all software and knowledge elements necessary to query unstructured information under a selected data schema. This system will prove the concept of a robust query capability: “field operators” need not learn a query language. Instead, they can interact with the system naturally, as they would with modern technologies. Furthermore, the system will demonstrate the capability to adapt to other domains.
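To make the Query Interpreter step concrete, here is a minimal Python sketch of mapping a composed meaning representation to a parameterized SQL query. The abstract does not specify the form of the KRL, so the `QueryMeaning` and `Condition` structures, the table and column names, and the `to_sql` function are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple

# Assumption: a small whitelist of operators and aggregates is enough
# for a sketch; a real interpreter would cover far more.
ALLOWED_OPS = {"=", "<", ">", "<=", ">="}
ALLOWED_AGGREGATES = {"MIN", "MAX", "COUNT", "SUM", "AVG"}


@dataclass
class Condition:
    column: str
    op: str
    value: Any


@dataclass
class QueryMeaning:
    """Hypothetical composed meaning of a user question."""
    table: str
    columns: List[str]
    conditions: List[Condition] = field(default_factory=list)
    aggregate: Optional[str] = None


def _ident(name: str) -> str:
    """Accept only plain identifiers, since names cannot be bound as parameters."""
    if not name.isidentifier():
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def to_sql(meaning: QueryMeaning) -> Tuple[str, List[Any]]:
    """Translate a meaning representation into SQL text plus bound values."""
    cols = [_ident(c) for c in meaning.columns]
    if meaning.aggregate is not None:
        if meaning.aggregate not in ALLOWED_AGGREGATES:
            raise ValueError(f"unsupported aggregate: {meaning.aggregate!r}")
        select = ", ".join(f"{meaning.aggregate}({c})" for c in cols)
    else:
        select = ", ".join(cols)
    sql = f"SELECT {select} FROM {_ident(meaning.table)}"
    clauses: List[str] = []
    params: List[Any] = []
    for cond in meaning.conditions:
        if cond.op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {cond.op!r}")
        clauses.append(f"{_ident(cond.column)} {cond.op} ?")
        params.append(cond.value)  # values travel separately from the SQL text
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


# A possible meaning for "When was fuel first recorded at zero?"
meaning = QueryMeaning(
    table="inventory",
    columns=["recorded_at"],
    conditions=[Condition("resource", "=", "fuel"), Condition("quantity", "=", 0)],
    aggregate="MIN",
)
sql, params = to_sql(meaning)
```

Keeping values as bound parameters rather than splicing them into the SQL text is a design choice of this sketch: user-supplied terms from the natural language question never become part of the query string itself.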