New Indico Open Source Project Delivers Dramatic Improvements to Machine Learning Model Training

September 05, 2018, 12:46

Solving big problems with small data — Indico Finetune leverages OpenAI research to deliver breakthrough improvement in natural language processing tasks

BOSTON, Sept. 06, 2018 (GLOBE NEWSWIRE) — Indico, a provider of Enterprise AI solutions for intelligent process automation, today announced the launch of a new open source project focused on enhancing the performance of machine learning for natural language processing. Named Finetune, the project offers users a single, general-purpose language model which can be easily tuned to solve a variety of different tasks involved in text and document-based workflows.

‘Finetuning’ is a specific type of transfer learning designed to take a model trained on one task and adapt it to solve a different but related task. Users can make small modifications to repurpose an existing model for a new, related problem, saving substantial time and effort while also improving accuracy.

“Most organizations have natural language processing problems, but few have the labeled data they need to solve them with machine learning,” said Madison May, Indico machine learning architect and cofounder. “Finetune lets them do more with less labeled training data. And it only requires a base level of IT experience.”

The Finetune project extends original research and development work completed by OpenAI to address a wider range of problems. OpenAI’s base project provides an illustrative model for increasing the accuracy and performance of machine learning models on natural language content, and includes general capabilities for document classification, comparison, and multiple-choice question answering. The Finetune library packages that capability for easier use and supports additional tasks such as document annotation, regression, and multi-label classification.

Indico delivers Finetune with an interface modeled on scikit-learn, a popular open source machine learning library, and documents it so that users can write as few as five lines of code (vs. 200) to try out OpenAI’s research on their own data problems. Compared with traditional natural language processing approaches, the models in Finetune have also demonstrated statistically higher performance as users add more labeled training data. Finetune outperforms these methods with only a hundred labels, and that gap continues to widen as the available training data increases.
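
For readers unfamiliar with the scikit-learn convention referenced above, the following is a minimal sketch of the fit/predict pattern that keeps such workflows to a handful of lines. The class `ToyTextClassifier` and the sample data are hypothetical and written only for illustration; this is not Finetune's model or API, and it scores texts with simple word counts rather than a pretrained language model.

```python
from collections import Counter, defaultdict


class ToyTextClassifier:
    """Hypothetical stand-in that follows the scikit-learn fit/predict convention."""

    def fit(self, texts, labels):
        # Count how often each lowercase word appears under each label.
        self.word_counts_ = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts_[label].update(text.lower().split())
        return self  # returning self allows call chaining

    def predict(self, texts):
        # For each text, pick the label whose training words overlap most.
        predictions = []
        for text in texts:
            words = text.lower().split()
            scores = {
                label: sum(counts[word] for word in words)
                for label, counts in self.word_counts_.items()
            }
            predictions.append(max(scores, key=scores.get))
        return predictions


# Illustrative data (assumed, not from the release).
train_texts = [
    "invoice payment due",
    "payment overdue invoice",
    "meeting agenda notes",
    "notes from the meeting",
]
train_labels = ["finance", "finance", "admin", "admin"]

# The short workflow the convention makes possible: construct, fit, predict.
model = ToyTextClassifier()
model.fit(train_texts, train_labels)
predictions = model.predict(["overdue invoice", "agenda for the meeting"])
```

The point of the convention is that the calling code stays the same regardless of what happens inside `fit`; swapping the toy class for a far more capable model would, in principle, leave the last three lines unchanged.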

The Indico team is conducting empirical research to evaluate how the models behave on different datasets and machine learning tasks. The company also plans to incorporate Finetune into its commercial product to address specific customer use cases.

In June, Indico launched a related open source project named Enso, a library of standard interfaces and tools to streamline the benchmarking of embedding and transfer learning methods for a wide variety of natural language processing tasks. Enso was used to benchmark the improved performance achieved with Finetune before the project was launched.

“We have a vested interest in promoting the advantages of transfer learning and giving back to the open source community is a really productive way for us to do that,” said Slater Victoroff, co-founder and CTO of Indico. “I also want to acknowledge the important research and development work done by the team at OpenAI and Alec Radford. They are driving huge innovations in machine learning that really help accelerate the progress of companies like Indico.”

About Indico
Indico is a provider of Enterprise AI solutions for intelligent process automation. Our focus is on helping to automate tedious back-office tasks, improving the efficiency of labor-intensive document-based workflows, and extracting valuable insights from unstructured content, including text and images. Our breakthrough in solving these challenges is an approach known as transfer learning, which allows us to train machine learning models with orders of magnitude less data than required by traditional content analysis techniques. With Indico, enterprises are now able to benefit from the dramatic advantages of machine learning in a fraction of the time. For more information, visit https://indico.io/.

Media Contact:
Tim Walsh
for Indico
617.512.1641
[email protected]

About Article Author

Jonathan Yaniv

Jonathan is the founder and editor-in-chief of TrustedNerd.com. Covering major tech shows such as CES, Jonathan is always there for the latest tech news. Want your gadget to be reviewed or have a release you'd like to be considered for publishing? Send Jonathan an email, jonathan [at] trustednerd.com
