Smart Artificial Intelligence Needs An Open (Source) Classroom

  • Hannah
  • June 27, 2020

As schoolchildren and students of all ages will widely confirm after the Covid-19 (Coronavirus) pandemic and the home-schooling it imposed on many, it’s harder to learn in a vacuum. It’s not impossible, but it’s generally agreed that we humans learn better in groups through mutual discovery, intercommunication on problem-solving and through the general process and pursuit of team-based challenges and goals. This, after all, is why we have schools.

Could the same need for interconnected cross-fertilization also help computers to ‘learn’ as they build their data-powered Artificial Intelligence (AI) knowledge bases and software-driven analytics engines?

Openness in machine learning

More examples of open AI are surfacing all the time. This June, Databricks announced that its open source machine learning project, MLflow, would join the Linux Foundation’s ranks of open technologies. Databricks chose the name to reflect the Machine Learning (ML) abilities the tool is engineered to deliver and, with the ‘flow’ part, the end-to-end nature of its functions, i.e. it can flow throughout a complete development lifecycle.

The term end-to-end is often used as a fluffy marketing-speak name-tag. What it is supposed to mean in purist technology circles is a software tool’s ability to work from one end of the software development lifecycle to the other.

What that means in Machine Learning terms is the journey data has to go on: from preparation (including parsing and deduplication), through an experimentation phase, onward into packaging code into ‘reproducible runs’ (intelligence blocks that can be componentized and used in a more off-the-shelf way) and then finally into AI models that can be shared and collaborated on.
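To make that journey concrete, here is a minimal sketch of the four stages in plain Python. It is an illustration only and does not use MLflow’s API: the function names (`prepare`, `run_experiment`, `package`), the toy data and the single-weight model are all assumptions made for this example.

```python
import hashlib
import json
import random


def prepare(raw_lines):
    """Preparation: parse 'x,y' text rows into floats and drop exact duplicates."""
    seen, rows = set(), []
    for line in raw_lines:
        x, y = (float(v) for v in line.strip().split(","))
        if (x, y) not in seen:
            seen.add((x, y))
            rows.append((x, y))
    return rows


def run_experiment(rows, learning_rate, epochs, seed):
    """Experimentation: fit y = w * x by gradient descent and record the run."""
    rng = random.Random(seed)  # a fixed seed is what makes the run reproducible
    w = rng.uniform(-1.0, 1.0)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in rows) / len(rows)
        w -= learning_rate * grad
    mse = sum((w * x - y) ** 2 for x, y in rows) / len(rows)
    return {
        "params": {"learning_rate": learning_rate, "epochs": epochs, "seed": seed},
        "metrics": {"mse": mse},
        "model": {"w": w},
    }


def package(run):
    """Packaging: serialize the run and fingerprint it so others can verify it."""
    payload = json.dumps(run, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return payload, digest


raw = ["1,2", "2,4", "2,4", "3,6"]  # toy data containing one duplicate row
rows = prepare(raw)
runs = [run_experiment(rows, lr, epochs=200, seed=0) for lr in (0.01, 0.05)]
best = min(runs, key=lambda r: r["metrics"]["mse"])
payload, digest = package(best)
shared = json.loads(payload)  # sharing: what a collaborator would load
```

Because each run carries its own parameters, seed and metrics, a collaborator who receives `payload` can check it against `digest` and rerun `run_experiment` with the recorded settings to confirm the result. That is the gist of a ‘reproducible run’, however a real platform chooses to implement it.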

It is that ‘collaboration’ part right at the end that’s really important. The ability to share AI and ML datasets, processing engines and the other working paraphernalia of Deep Learning (DL) through open (and indeed open source) platforms, channels and communities is argued to be a more productive way for the machines themselves to learn more naturally.

Databricks’ MLflow project is now two years old and has seen engagement from somewhere over 200 contributors. Moving it to the Linux Foundation gives it a vendor-neutral home with an open governance model, which is hoped to broaden adoption and contributions.

Databricks explains that it first created MLflow to address what it calls the ‘inherently complicated process’ of Machine Learning model development, a process known to be highly complex as a result of the need to build, train, tune, deploy and manage Machine Learning models.

“The steady increase in community engagement shows the commitment data teams have to building the machine learning platforms of the future. The rate of adoption demonstrates the need for an open source approach to standardizing the machine learning lifecycle,” said Michael Dolan, VP of strategic programs at the Linux Foundation. “Our experience in working with the largest open source projects in the world shows that an open governance model allows for faster innovation and adoption through broad industry contribution and consensus building.”

The proliferation of digital intelligence

In a further sign that the proliferation of digital intelligence for AI may be best served through open source development, June 2020 also saw Abbyy launch its NeoML open source library for building, training and deploying Machine Learning models. The Silicon Valley, Russia, Europe and Far East headquartered company has foundations in document capture & management, but has now extended its scope to deliver its Digital Intelligence technologies brand for the enterprise.

Available on the GitHub open software code repository, NeoML supports both Deep Learning and traditional Machine Learning algorithms. It is a cross-platform framework that is optimized for applications that run in cloud environments, on desktop and on mobile devices.

Where Databricks’ (our first example) open intelligence technology works on big data processing and cloud computing ‘cluster’ management, in Abbyy’s case the Machine Learning framework is optimized for image processing tasks and offers fast performance for pre-trained models running on any device. It’s different kinds of smartness for different technology use cases, but both brain functions have been exposed to what is said to be the ‘goodness’ of open community computing contributions.

The company says that as open source becomes a staple in the development of mission-critical software, with 95% of IT leaders asserting that it is strategically important, Abbyy aims to support advancements in AI by open sourcing its machine learning framework. Software developers can use NeoML to build, train and deploy models for object identification, classification, semantic segmentation, verification and predictive modeling for various different business goals.

Providing some example use cases, Ivan Yamshchikov, AI evangelist at Abbyy, says that this technology could be used, for instance, by banks to develop models to manage credit risk and predict customer churn; by telecom companies to analyze the performance of marketing campaigns; and by retail and fast-moving consumer goods (FMCG) companies to build remote client identification with face recognition and data verification.

“Sharing our framework in the open source arena allows developers to leverage its inference speed, cross-platform capabilities and especially its potential on mobile devices, while their feedback and contribution will grow and improve the library. We are thrilled to promote advancements in AI and support machine learning being applied to increasingly high-value and impactful use cases,” said Yamshchikov.

Open your neural pathways

But is open source the only way to the best AI intelligence? It would probably be unfair and unwise (if not reckless) to suggest that a good proportion of the Machine Learning going on inside closed proprietary circles isn’t going to be useful. If there were some higher level of exchanging learning patterns (if not exact learning models) then that might provide an additional level of AI democratization.

In the case of Abbyy’s NeoML, the technology supports the Open Neural Network Exchange (ONNX), a global open ecosystem for interoperable ML models, which is hoped to improve compatibility of tools, making it easier for software developers to use the right combinations to achieve their goals. The ONNX standard is supported jointly by Microsoft, Facebook and other partners as an open source project. So open AI intelligence is becoming comparatively ubiquitous.

As they say, open your mind, right?