Hazelcast Jet Simplifies Enterprise Python and Java AI/ML Deployments
By John K. Waters
03/11/2020
In-memory computing platform maker Hazelcast on Wednesday announced new support in its Hazelcast Jet event stream processing engine for artificial intelligence (AI) and machine learning (ML) deployments of mission-critical applications. The latest release (4.0) reduces time-to-deployment through “inference runners” for any native Python- and Java-based models, the company says. The new Jet release also includes expanded database support and other updates focused on data integrity.
“With machine learning inferencing in Hazelcast Jet, customers can take models from their data scientists unchanged and deploy within a streaming pipeline,” said Greg Luck, CTO of Hazelcast, in a statement. “This approach completely eliminates the impedance mismatch between the data scientist and data engineer since Hazelcast Jet can handle the data ingestion, transformation, scoring and post-processing.”
The Hazelcast inference runner allows models to be plugged natively into the stream processing pipeline. Jet now lets developers deploy Python models in a stream processing architecture, so enterprises can feed real-time streaming data directly into the model. This approach eliminates the need to call out to external services via REST, which adds round-trip network latency and requires administrative overhead to maintain those external services, the company said. In Jet, the Python models run local to the processing jobs, which removes that network round trip and leverages Jet's built-in resilience to support mission-critical deployments. These ML inference jobs can be scaled up to the number of cores per Jet node and then scaled linearly by adding more Jet nodes to the job.
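To make the idea of a locally run Python model concrete, here is a minimal sketch of what a handler module might look like. It assumes the runner hands the module a batch of items as a list of strings and expects a list of strings of the same length back; the function name `transform_list`, the file name, and the toy logistic "model" with its weight and bias are illustrative assumptions, not details taken from the announcement.

```python
# Hypothetical handler module, e.g. saved as score_handler.py (name is an assumption).
import math

# Toy stand-in for a trained model: a one-feature logistic scorer.
# WEIGHT and BIAS are made-up values, not a real model.
WEIGHT = 0.8
BIAS = -2.0


def score(x: float) -> float:
    """Return a probability-like score in (0, 1) for a single numeric feature."""
    return 1.0 / (1.0 + math.exp(-(WEIGHT * x + BIAS)))


def transform_list(input_list):
    """Score a batch: one output string per input string, in the same order."""
    return [f"{score(float(item)):.4f}" for item in input_list]


if __name__ == "__main__":
    # Local smoke test; in a pipeline the stream engine would supply the batches.
    print(transform_list(["0.0", "2.5", "10"]))
```

Because the function works on whole batches and keeps no state between calls, several copies can run side by side, which is the property that lets this kind of job scale with cores and nodes.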
Also, this release of Hazelcast Jet incorporates new logic that runs a two-phase commit to ensure consistency across a broader set of data sources and sinks. This new logic expands upon the “exactly once” guarantee by tracking reads and writes at the source and sink levels, and ensures no data is lost or “duplicately processed” when a failure or outage occurs. Customers can, for example, read data from a Java Message Service (JMS) topic, process the data and write it to an Apache Kafka topic with an “exactly once” guarantee. This guarantee is critical in systems where lost or duplicate data can be costly, such as in payment processing or e-commerce transaction systems.
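The following toy sketch illustrates the general two-phase commit idea behind an "exactly once" guarantee; it is not Hazelcast Jet's implementation. All names (`TransactionalSink`, `run`, `fail_in_batch`) are hypothetical, the source is an in-memory list standing in for a message queue, and the crash is simulated. Writes are staged per transaction, become visible only on commit, and the source offset advances only after the commit succeeds, so a replay after a failure neither loses nor duplicates items.

```python
class TransactionalSink:
    """Toy sink: writes are staged per transaction and visible only after commit."""

    def __init__(self):
        self.committed = []  # output visible to downstream readers
        self.staged = {}     # txn_id -> items written but not yet committed
        self.done = set()    # committed txn ids, so commit is idempotent

    def write(self, txn_id, item):
        self.staged.setdefault(txn_id, []).append(item)

    def prepare(self, txn_id):
        # Phase 1: a real sink would make the staged writes durable here.
        self.staged.setdefault(txn_id, [])
        return True

    def commit(self, txn_id):
        # Phase 2: publish staged writes; a repeated commit is a no-op.
        if txn_id in self.done:
            return
        self.committed.extend(self.staged.pop(txn_id, []))
        self.done.add(txn_id)

    def rollback(self, txn_id):
        self.staged.pop(txn_id, None)


def run(source, sink, batch_size, fail_in_batch=None):
    """Process `source` in batches; optionally crash once inside one batch."""
    offset = 0  # position of the first source item not yet committed
    txn = 0
    while offset < len(source):
        batch = source[offset:offset + batch_size]
        try:
            for i, item in enumerate(batch):
                if fail_in_batch == txn and i == 1:
                    fail_in_batch = None  # fail only once
                    raise RuntimeError("simulated crash")
                sink.write(txn, item.upper())
            if sink.prepare(txn):
                sink.commit(txn)
                offset += len(batch)  # advance only after a successful commit
                txn += 1
        except RuntimeError:
            sink.rollback(txn)  # drop partial writes, then replay from `offset`
    return offset


if __name__ == "__main__":
    sink = TransactionalSink()
    run(["a", "b", "c", "d", "e"], sink, batch_size=2, fail_in_batch=1)
    print(sink.committed)
```

In a production system the offset and the transaction state would have to be stored durably and recovered together; the sketch keeps them in memory only to show the ordering of the steps.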
Hazelcast Jet 4.0 also includes a change data capture (CDC) integration with the open source project Debezium, which allows databases to act as streaming sources. The CDC integration adds support for a number of popular databases, including MySQL, PostgreSQL, MongoDB, and SQL Server. Because CDC effectively creates a stream out of database updates, Hazelcast Jet can efficiently process the updates at high speed for the applications that depend on the latest data.
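As a rough illustration of how a stream of database changes can be consumed, the sketch below applies simplified change events to an in-memory view of a table. The event shape loosely follows the Debezium envelope (an `op` code of `c`, `u`, `d`, or `r` for create, update, delete, and snapshot read, plus `before` and `after` row images); the `customers` data, the `id` key, and the function name `apply_change` are assumptions made for the example, and real events carry considerably more metadata.

```python
def apply_change(view, event, key="id"):
    """Apply one simplified change event to `view`, a dict keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # Create, snapshot read, and update all carry the new row image in `after`.
        row = event["after"]
        view[row[key]] = row
    elif op == "d":
        # Delete carries the old row image in `before`.
        view.pop(event["before"][key], None)
    else:
        raise ValueError(f"unknown op code: {op!r}")
    return view


if __name__ == "__main__":
    # Hypothetical change stream for a `customers` table.
    events = [
        {"op": "c", "before": None, "after": {"id": 1, "name": "Ada"}},
        {"op": "c", "before": None, "after": {"id": 2, "name": "Bob"}},
        {"op": "u", "before": {"id": 1, "name": "Ada"},
         "after": {"id": 1, "name": "Ada L."}},
        {"op": "d", "before": {"id": 2, "name": "Bob"}, "after": None},
    ]
    customers = {}
    for ev in events:
        apply_change(customers, ev)
    print(customers)
```

Replaying the events in order leaves the view matching the current state of the source table, which is what lets downstream applications work from the latest data without polling the database.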
About the Author
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He’s been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he’s written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].