Google has quickly moved its Cloud Dataproc service from beta into general availability, setting up a fresh option in 2016 for companies interested in making more effective use of Hadoop and Spark for big data projects. 

Cloud Dataproc had only been announced in September at Google I/O; that today's announcement comes just a few months later is evidence that Google has put ample development resources into the service. [Go here for Constellation VP and principal analyst Holger Mueller's take on Cloud Dataproc from the time of the beta launch.]

In a blog post, Google laid out what it considers the many benefits of Cloud Dataproc's approach:

Often, popular tools to process data, such as Apache Hadoop and Apache Spark, require a careful balancing act between cost, complexity, scale, and utilization. Unfortunately, this means you focus less on what is important — your data — and more on what should require little or no attention — the cluster processing it.

Cloud Dataproc minimizes two common and major distractions in data processing, cost and complexity, by providing:

Low-cost. We believe two things — using Spark and Hadoop should not break the bank and that you should pay for what you actually use. As a result, Cloud Dataproc is priced at only 1 cent per virtual CPU in your cluster per hour, on top of the other Cloud Platform resources you use. 

Speed. With Cloud Dataproc, clusters do not take 10, 15, or more minutes to start or stop. On average, Cloud Dataproc start and stop operations take 90 seconds or less. This can be a 2-10x improvement over other on-premises and IaaS solutions. 

Management. Cloud Dataproc clusters don't require specialized administrators or software products. Cloud Dataproc clusters are built on proven Cloud Platform services, such as Google Compute Engine, Google Cloud Networking, and Google Cloud Logging to increase availability while eliminating the need for complicated hands-on cluster administration. Moreover, Cloud Dataproc supports cluster versioning, giving you access to modern, tested, and stable versions of Spark and Hadoop.

There's a lot more detail in the blog, so be sure to check it out.

Analysis: Google Throws Another 10-Gallon Hat into the Big Data Ring

Cloud Dataproc already has a major reference customer in the form of Spotify, which Google announced is moving its entire back end onto Google Cloud. The streaming music service has more than 75 million users.

Meanwhile, "it’s always nice to have options and competition, so Dataproc is a good thing for the big data analysis market overall," says Constellation Research VP and principal analyst Doug Henschen. "It will be particularly attractive to Google customers amassing big data from their use of Google AdWords and Analytics services."

While Cloud Dataproc's branding may be a little wonky, and it comes to market after some competing products, its arrival does mean Google can finally go toe-to-toe with the likes of Amazon (Elastic MapReduce with Spark services), Microsoft (Azure HDInsight with Spark services) and IBM (BigInsights and Spark Services) on big-data processing and analysis in the cloud, Henschen adds.

A key differentiator for Google on Dataproc could end up being pricing, given the base cost of 1 cent per virtual CPU per hour along with granular billing. Right now at least, that's less expensive than AWS's competing entry.
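
To put the quoted rate in perspective, here is a rough back-of-the-envelope sketch in Python. It covers only the Cloud Dataproc premium of 1 cent per virtual CPU per hour cited in Google's post; the cluster size and runtime are hypothetical, the function name is made up for illustration, and the cost of the underlying Cloud Platform resources (which Google says is charged on top) is not included. It also assumes whole hours and does not model the granular billing.

```python
# Premium rate quoted in the article: 1 cent per virtual CPU per hour.
DATAPROC_RATE_CENTS_PER_VCPU_HOUR = 1


def dataproc_premium_cents(nodes: int, vcpus_per_node: int, hours: int) -> int:
    """Return the Dataproc premium in cents (hypothetical helper).

    Excludes the underlying Cloud Platform resources, which are billed separately.
    """
    total_vcpus = nodes * vcpus_per_node
    return total_vcpus * hours * DATAPROC_RATE_CENTS_PER_VCPU_HOUR


# Hypothetical cluster: 10 nodes with 4 vCPUs each, running for 8 hours.
premium = dataproc_premium_cents(nodes=10, vcpus_per_node=4, hours=8)
print(f"Dataproc premium: ${premium / 100:.2f}")  # prints "Dataproc premium: $3.20"
```

Under those assumptions, the Dataproc-specific charge for a 40-vCPU cluster over a working day comes to a few dollars; the bulk of the bill would come from the compute and storage resources underneath.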
