Sameer Nori is currently a Sr. Product Marketing Manager at MapR Technologies with responsibility for solutions and industry marketing. He has 10+ years of experience in the technology industry in marketing, pre-sales and consulting, with domain experience in the business intelligence, analytics and big data markets. He has worked at companies including SAP Business Objects, MicroStrategy and Jaspersoft.
Apache Spark is becoming very popular and widely used in the big data community. Spark has gained traction rapidly for several reasons: its in-memory processing capabilities, its support for a wide range of engines for use cases such as streaming, machine learning, and SQL, and the ability to develop in multiple languages such as Python and Scala.
One customer question centered on how to determine the degree of parallelism being used for various operators in queries. We’ll address this question, and the best practice that came out of it, in the rest of this blog post.
Customers are flocking to Spark as their primary compute engine for big data use cases, and we received further proof of this last week when we ran an “Ask Us Anything about Spark” forum in the Converge Community. There were some great discussions that took place, where our Spark experts answered questions from customers and partners.
There have been a lot of buzz and high expectations in the big data community around Apache Spark 2.0 and how it will impact the development of data pipelines, streaming applications, machine learning algorithms and all of the other use cases that Apache Spark is enabling.
Spark 1.6 is now in Developer Preview on the MapR Converged Data Platform. In this blog post, I’ll share a few details on what Spark 1.6 brings to the table and what you should care about.
We are excited to announce that Spark 1.5.2 is here and is part of the MapR Converged Data Platform. In this blog post, I’ll share a few details on some of the latest capabilities in Spark. If you’re a data engineer, data scientist or in application development, Spark 1.5.2 has new capabilities that you should take advantage of.
In this week's Whiteboard Walkthrough, Sameer Nori, Business Intelligence Expert at MapR, explains how BI has evolved over the last three decades from being IT-driven to analyst-driven with self-service tools.
We thought the Kickstart song by Mötley Crüe was appropriate, since everyone is excited about kickstarting their Spark-based applications these days. That’s our theme for the Quick Start Solutions we’re announcing today at Spark Summit West—you can kickstart your Spark efforts into high gear with our Spark Quick Starts. You’ll be able to develop at high speeds, use streaming data, and build applications faster.
Reducing operating costs and increasing efficiency are, and will always be, priorities for any business, but they become imperative when an industry is facing cyclical challenges. Given the current volatility in the oil market, the oil and gas industry is looking for solutions that can proactively address inefficiencies through better asset tracking and predictive maintenance.
This is a tremendously exciting time for those who work in clinical genomics. The demand for cutting-edge technologies that deliver fast and accurate genome information has exploded. In 2013, close to 2,000 genome sequencers were in operation. These genome sequencers produced a whopping 15 petabytes of sequence data, which included the sequencing of 300,000 human genomes.