Supply Chain Council of European Union | Scceu.org

Apache Hop data orchestration hits open source milestone

The open source Apache Hop data orchestration platform has achieved a big milestone, becoming a Top Level Project at the Apache Software Foundation.

Hop, a recursive acronym for the Hop Orchestration Platform, first came to the Apache Incubator in September 2020.

The Apache Incubator is often the entry point for technologies coming into the ASF. After a project demonstrates community and technology growth over a period of time, it can be elevated to Top Level Project status, which signifies a milestone for project maturity.

Hop’s roots go back much further than 2020. It was originally based on the Kettle data orchestration project, which former data integration and analytics vendor Pentaho made open source in 2012. In 2019, the Hop project was started as a fork of Kettle.

Moving from Kettle to Hop for data orchestration

Among the users of Kettle that migrated to Hop is Belgian car tire wholesaler Deli Tyres. Jan Lievens, managing director of Deli Tyres, said the company had been using Kettle for more than a decade and recently upgraded its entire system from Kettle to Apache Hop.

“Deli Tyres processes data from a variety of sources to feed the web shop’s stock systems, receive and place orders, feed the data warehouse and more,” Lievens said. “Hop is used as the main data processing engine in a combination of real-time streaming and batch processes.”

One reason Lievens and his team chose to move to Hop is its visual development environment, which enables faster development and easier maintenance. Lievens said that Hop also has a smaller resource footprint and handles metadata more efficiently.

“After the upgrade, Hop’s smaller footprint and improved metadata management resulted in a system that runs smoother, more transparent and more reliable than was possible before,” Lievens said.

Apache Hop data orchestration continuing to mature

The graduation of Apache Hop to the Top Level Project status at the ASF, made public Jan. 18, means a number of things to Bart Maertens, vice president, Apache Hop, and managing partner at business intelligence consulting firm know.bi.

Maertens said that the new status means Hop has been able to build an active and engaged community.

“We expect the graduation as an Apache Top-Level Project to increase adoption of Hop and grow its community,” Maertens said. “As a consequence we expect more organizations to help out with Hop development and increase the user base which is expected to lead to an increase in contributions and functionality.”

While Hop got its start as a fork of the Kettle project that was led by Pentaho, Maertens emphasized that the project never had the intention to be compatible with Kettle, and it isn’t. 

He explained that Hop's technical design differs from Kettle's in that Hop now has a kernel and plug-ins architecture: the engine is intended to be as robust and stable as possible, while plug-ins provide added functionality.

“In addition to the revamped architecture, Hop gained a lot of functionality to support data teams in the entire project lifecycle,” Maertens said.
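To illustrate the general idea of a kernel and plug-ins design, here is a minimal sketch in Python. It is not Hop's code or API (Hop itself is a Java project); the `Kernel` class, the `register` and `run` methods and the two sample plug-ins are hypothetical names chosen for this example. The point it shows is that the engine only knows how to register and chain steps, while all data-specific behavior lives in plug-ins.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Transform = Callable[[Iterable[Row]], Iterable[Row]]


class Kernel:
    """Hypothetical minimal engine: registers plug-ins and chains them."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Transform] = {}

    def register(self, name: str, transform: Transform) -> None:
        # The kernel stays small: it only stores plug-ins by name.
        if name in self._plugins:
            raise ValueError(f"plug-in already registered: {name}")
        self._plugins[name] = transform

    def run(self, steps: List[str], rows: Iterable[Row]) -> List[Row]:
        # Feed the output of each named plug-in into the next one.
        for name in steps:
            rows = self._plugins[name](rows)
        return list(rows)


# Two illustrative plug-ins; functionality is added without touching Kernel.
def drop_out_of_stock(rows: Iterable[Row]) -> Iterable[Row]:
    return (r for r in rows if r["stock"] > 0)


def add_label(rows: Iterable[Row]) -> Iterable[Row]:
    return ({**r, "label": f'{r["sku"]} ({r["stock"]})'} for r in rows)


kernel = Kernel()
kernel.register("drop_out_of_stock", drop_out_of_stock)
kernel.register("add_label", add_label)

result = kernel.run(
    ["drop_out_of_stock", "add_label"],
    [{"sku": "A1", "stock": 3}, {"sku": "B2", "stock": 0}],
)
```

In this sketch, adding a new capability means registering another function, which leaves the engine unchanged.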

The intersection of Hop data orchestration and DataOps

At the core of both Kettle and Hop are ETL (extract, transform, load) capabilities, though Hop can handle more than ETL.

“The Hop platform, implemented according to our best practices, can be used to build and run projects that meet the criteria specified by the DataOps manifesto,” a set of DataOps principles, Maertens said.

Maertens emphasized that how organizations use and run Hop depends on their perspective.

Hop also focuses on areas outside the purview of DataOps. Those areas include version control, unit and integration testing, and integration with CI/CD (continuous integration/continuous delivery) platforms, which apply DevOps and GitOps principles rather than what is commonly thought of as DataOps.
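As a rough illustration of what unit testing a data transformation can look like, the sketch below uses Python's standard `unittest` module. It does not use Hop's own testing features; the `normalize_order` function and its field names are assumptions made up for this example. A CI/CD platform could run a suite like this on every commit.

```python
import unittest


def normalize_order(row):
    """Hypothetical transform: tidy the SKU and cast quantity to an integer."""
    return {
        "sku": row["sku"].strip().upper(),
        "quantity": int(row["quantity"]),
    }


class NormalizeOrderTest(unittest.TestCase):
    def test_cleans_sku_and_casts_quantity(self):
        self.assertEqual(
            normalize_order({"sku": " ab-1 ", "quantity": "4"}),
            {"sku": "AB-1", "quantity": 4},
        )

    def test_rejects_non_numeric_quantity(self):
        # int("four") raises ValueError, so bad input fails loudly.
        with self.assertRaises(ValueError):
            normalize_order({"sku": "ab-1", "quantity": "four"})


def run_suite():
    """Run the tests programmatically, as a CI job might."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeOrderTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` returns a result object whose `wasSuccessful()` method a build script could use to pass or fail the pipeline.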

“More than anything else, Hop intends to be a data platform that not only supports data teams in the development phase but also provides tools and guidance throughout the entire project lifecycle,” Maertens said.

 

 

 
