Hadoop provider MapR and datacenter resource management firm Mesosphere have lifted the lid on a new framework designed to allow big-data workloads to run alongside other apps, rather than on dedicated clusters.
The Myriad framework, conceived early last year by eBay developer Mohit Soni and Renan DelValle, makes use of the Apache YARN management layer and the Apache Mesos distributed systems kernel to consolidate big-data jobs with other tasks in enterprise and cloud datacenters.
Mesosphere CEO and co-founder Florian Leibert said in a statement that the arrival of Myriad will mean developers no longer have to choose between YARN and Mesos for managing clusters.
"Myriad allows you to run both, and to run all your big-data workloads and distributed applications and systems on a single pool of resources," he said.
"Big-data developers get the best of YARN's power for Hadoop-driven workloads and Mesos' ability to run any other kind of workload, including non-Hadoop applications like web applications and other long-running services."
Big-data workloads will also be able to run alongside streaming applications such as Storm, build systems, continuous-integration tools like Jenkins, high-performance computing jobs, Docker containers, and custom scripts, Mesosphere said.
Now that Myriad has passed beyond proof of concept, Mesosphere and MapR are keen for the community to participate in the project more broadly. Plans exist for Myriad to be submitted as an Apache incubator project this quarter.
Mesosphere and MapR are expected to demo Myriad at next week's Strata and Hadoop World conference in San Jose, California.
The GitHub readme on Myriad describes it as a "framework designed for scaling a YARN cluster on Mesos. Myriad can expand or shrink the resources managed by a YARN cluster in response to events as per configured rules and policies".
The readme continues: "It allows one to expand overall resources managed by Mesos, even when the cluster under Mesos management runs other cluster managers like YARN".
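As a rough illustration of what expanding or shrinking a YARN cluster "as per configured rules and policies" could look like, here is a minimal Python sketch of a threshold-based scaling rule. It is not taken from Myriad: the `ScalingPolicy` class, the `desired_node_managers` function, and every threshold value are hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # All values below are hypothetical, not Myriad configuration keys.
    scale_up_above: float = 0.80    # grow when utilisation exceeds this
    scale_down_below: float = 0.30  # shrink when utilisation falls below this
    step: int = 1                   # nodes added or removed per decision
    min_nodes: int = 1              # never shrink below this
    max_nodes: int = 10             # never grow beyond this


def desired_node_managers(current: int, utilization: float,
                          policy: ScalingPolicy) -> int:
    """Return the target number of YARN worker nodes for one decision."""
    if utilization > policy.scale_up_above:
        target = current + policy.step
    elif utilization < policy.scale_down_below:
        target = current - policy.step
    else:
        target = current
    # Clamp the result to the configured bounds.
    return max(policy.min_nodes, min(policy.max_nodes, target))


if __name__ == "__main__":
    policy = ScalingPolicy()
    nodes = 2
    # Feed in a made-up series of utilisation readings.
    for util in (0.90, 0.95, 0.50, 0.10):
        nodes = desired_node_managers(nodes, util, policy)
        print(f"utilisation={util:.2f} -> nodes={nodes}")
```

In a setup like the one the readme describes, the capacity released by a shrink decision would go back to the shared pool for other workloads. The sketch models only the decision itself, not how resources are requested from or returned to Mesos.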
MapR Technologies director of enterprise strategy and architecture Jim Scott said that, just as Hadoop has removed the walls between data silos, Myriad will remove the walls between isolated clusters.
In a blogpost, Mesosphere said the conventional practice for companies running YARN has been for operations teams to create a statically partitioned cluster dedicated to YARN workloads.
"In this siloed model, the YARN cluster would only run Hadoop workloads and nothing else. It would have its own hardware or cloud instances, its own operations team, and could not share resources with other workloads in the datacenter," Mesosphere said.
"Project Myriad combines the best of YARN and Mesos, allowing modern Hadoop workloads to run elastically with other datacenter and cloud workloads, thereby sharing resources with all the organisation's Linux applications - for example, web servers, Java apps - as well as their datacenter services like Cassandra, Kafka, Elasticsearch, and Kubernetes."
In December, alongside $36m in new funding, Mesosphere unveiled its Datacenter Operating System platform, or DCOS, which is designed to pool resources across a datacenter as if it were a single machine.
MapR CEO John Schroeder said recently that an initial public offering for the Hadoop company is definitely in the cards.