Dataconomy

An elegant solution to text-data processing problems

by David O’Donoghue
October 13, 2017
in Machine Learning, Technology & IT

A breakthrough in the field of natural language processing (NLP) has been achieved with the Cortical.io Retina engine, which elegantly solves numerous text-data processing problems faced by big businesses. The Retina engine surmounts the obstacles in dealing with terabytes of unstructured text data, thanks to a unique algorithm that is based on neuroscientific research into how the human brain processes information. The technology can analyze the meaning of not just keywords, but of whole sentences, paragraphs, and long texts. It can also be applied to documents in different languages. By focusing on meaning, the challenges of language ambiguity and vocabulary mismatch are overcome. For example, the phrases “we closed the deal” and “the contract was signed” have similar meanings but use completely different words; the Cortical.io Retina engine recognizes that similarity.

AI-based contract intelligence

Major global companies are leveraging a wealth of data from contracts and other legal documents by deploying Cortical.io Contract Intelligence, a highly accurate data-extraction solution that combines the Retina engine with various NLP techniques. These companies use Cortical.io technology to automate the precise extraction of key information from thousands of complex contracts that contain disparate and diverse language. Automation has freed up significant manual resources, greatly reduced costs, and corrected human error in the extraction of information. Firms are able to quickly generate consistent and comparable summary abstracts and spreadsheets, manage contract lifecycles in real time, and gain valuable insights into the financial situation of potential clients. Financial institutions are lowering credit risk by identifying clauses associated with performing and non-performing contracts, and large firms are meeting new legal requirements by being able to quickly search agreements for the correct figures to be added to the balance sheet.

A mix of unsupervised machine learning and expert feedback

Cortical.io Contract Intelligence is a standalone system that takes contracts and other documents from a company's data sources as input. The engine processes this unstructured or semi-structured information and returns structured key information that can be used directly in further business analysis.

Data can be extracted from new contract types based on how the requested information is written in only three to ten sample contracts. The technology amplifies company intelligence and increases accuracy through a unique combination of unsupervised machine learning and an iterative fine-tuning process involving the companies’ subject matter experts (SMEs).


SMEs typically go through five to ten simple iterations, interacting with the system to produce the best results.

  1. SMEs define the type of information that they want to extract.
  2. In an unsupervised learning phase, the system learns to recognize contract vocabulary and concepts (for example, facilities, loans, variations of dates, and parties to an agreement) and forms relationships among the concepts.
  3. Information is extracted from the contracts.
  4. Based on the results of the extraction, SMEs fine-tune the system by adding to or modifying the type of information requested.

Using the brain as a model for artificial intelligence

Cortical.io Contract Intelligence makes use of the Retina engine, which was developed from the Semantic Folding theory of how the human brain works. The brain’s neocortex is a 2D sheet of neuron assemblies that process information such as text, images and sounds. A mathematical model called Sparse Distributed Representation simulates how the neocortex stores this information. In this model, each piece of information is represented by a long binary vector that has many “zeros” (inactive bits) and relatively few “ones” (active bits). Each active bit contains some part of the information’s meaning. If the same bit is active in two vectors, the two pieces of information that the vectors represent are, in at least one aspect, similar in meaning. The greater the number of active bits in common, the more similar the two pieces of information.
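The overlap idea described above can be illustrated with a minimal sketch. The vector length, the bit positions, and the helper names `make_sdr` and `overlap` are assumptions chosen for illustration; they are not taken from the Retina engine, whose real vectors are much longer.

```python
def make_sdr(active_bits, size=64):
    """Build a binary vector of `size` bits with the given positions set to 1."""
    vector = [0] * size
    for position in active_bits:
        vector[position] = 1
    return vector


def overlap(a, b):
    """Count the positions that are active (1) in both vectors."""
    return sum(x & y for x, y in zip(a, b))


# Toy vectors: mostly zeros, only a few active bits (illustrative values).
sdr_a = make_sdr({3, 10, 17, 42})
sdr_b = make_sdr({3, 10, 25, 50})
sdr_c = make_sdr({5, 30, 51, 60})

print(overlap(sdr_a, sdr_b))  # 2 shared active bits: some similarity
print(overlap(sdr_a, sdr_c))  # 0 shared active bits: no similarity
```

Because similarity is just a count of shared active bits, comparing two pieces of information reduces to a cheap bitwise operation.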

Collecting text to form a semantic space

To create an efficient, elegant system for natural language processing, Cortical.io technology gathers text from selected reference literature. The text is cut into meaning-based slices, called snippets, which are then distributed over a 2D grid. Snippets that have similar meaning are placed close to one another. The 2D grid is called a semantic space, and, in this space, each snippet has a pair of coordinates.

Representing the meaning of words numerically

To represent the meaning of any word numerically, the Cortical.io Retina engine activates the grid positions of all snippets that contain the word. The resulting grid representation of the meaning of the word is known as the word’s semantic fingerprint.
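As a rough sketch of this step, the toy example below places four hand-written snippets on a 4x4 grid and builds a word's fingerprint from the positions of the snippets that contain it. The snippets, their coordinates, and the function names are hypothetical; in the real system the placement of snippets is learned from the reference text, not assigned by hand.

```python
# Hypothetical toy semantic space: (row, col) grid position -> snippet text.
# Snippets with related meaning have been placed near each other by hand.
SEMANTIC_SPACE = {
    (0, 0): "the bank approved the loan",
    (0, 1): "the borrower repaid the loan early",
    (1, 0): "the river bank flooded in spring",
    (3, 3): "the team won the match",
}


def word_fingerprint(word, space):
    """Return the set of grid positions whose snippet contains `word`."""
    return {pos for pos, snippet in space.items() if word in snippet.split()}


def render(fingerprint, width=4, height=4):
    """Draw the grid as text: '#' for an active position, '.' otherwise."""
    return "\n".join(
        "".join("#" if (row, col) in fingerprint else "." for col in range(width))
        for row in range(height)
    )


loan = word_fingerprint("loan", SEMANTIC_SPACE)
bank = word_fingerprint("bank", SEMANTIC_SPACE)
print(render(loan))
print(sorted(loan & bank))  # positions shared by "loan" and "bank"
```

In this toy space, "loan" and "bank" share the position of the snippet that mentions both, which is how shared context shows up as shared active positions.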

Example of a semantic fingerprint of a word

The grid can be unfurled to form a long binary vector, where each active position on the grid corresponds to an active bit in the vector. The binary vector is then the numerical representation of the word’s meaning and can be used for comparison and computation—operations that are crucial to the efficient application of NLP. To convert a longer piece of text to a semantic fingerprint, the system first converts each word of the text to an individual fingerprint and then combines the fingerprints.
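The sketch below shows one way the unfurling and combining could look. The article does not say how individual fingerprints are combined, so the union of active positions used here is an assumption, as are the row-major flattening order, the 4x4 grid size, and the hand-made word fingerprints.

```python
GRID_WIDTH = 4
GRID_HEIGHT = 4

# Hypothetical word fingerprints: sets of active (row, col) grid positions.
WORD_FINGERPRINTS = {
    "deal": {(0, 0), (0, 1), (1, 1)},
    "closed": {(0, 1), (2, 2)},
    "contract": {(0, 0), (1, 1), (1, 2)},
    "signed": {(0, 1), (1, 2)},
    "river": {(3, 0), (3, 3)},
}


def unfurl(fingerprint, width=GRID_WIDTH, height=GRID_HEIGHT):
    """Flatten a set of (row, col) positions into a row-major binary vector."""
    vector = [0] * (width * height)
    for row, col in fingerprint:
        vector[row * width + col] = 1
    return vector


def text_fingerprint(text):
    """Combine the fingerprints of all known words in `text`.

    Assumption: fingerprints are combined by taking the union of positions.
    """
    combined = set()
    for word in text.lower().split():
        combined |= WORD_FINGERPRINTS.get(word, set())
    return combined


def overlap(a, b):
    """Count the bits that are active in both binary vectors."""
    return sum(x & y for x, y in zip(a, b))


deal = unfurl(text_fingerprint("we closed the deal"))
signed = unfurl(text_fingerprint("the contract was signed"))
river = unfurl(text_fingerprint("the river"))

print(overlap(deal, signed))  # 3: different words, overlapping meaning
print(overlap(deal, river))   # 0: no shared active bits
```

With these made-up fingerprints, the two contract-related phrases share active bits although they have no content words in common, which mirrors the "we closed the deal" and "the contract was signed" example earlier in the article.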

Find out more at the Data Natives workshop and presentation

At the Data Natives conference in Berlin, 15-17 November 2017, Cortical.io co-founder and General Manager Francisco Webber will explain how Cortical.io Contract Intelligence technology works and hold a workshop on extracting information from sample contracts.

For more information about Cortical.io and Semantic Folding theory, see the video about Cortical.io Retina technology.

Comments 1

  1. Kulvinder Panesar says:
    5 years ago

    Interesting creative and elegant thinking to address the challenging text based unstructured data problem. Particularly interesting is the convergence of AI into the lower level processing and representation. I have looked at something similar in my research. Regards Dr Kulvinder Panesar e-mail: kulvinderpanesar666@gmail.com

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
