Featured Author

Ellen Friedman
Apache Drill and Apache Mahout Committer, Big Data Consultant at MapR

Ellen Friedman is a consultant and commentator on big data topics. Active in open source, Ellen is a committer for the Apache Drill and Apache Mahout projects and co-author of many books on working with data in the Hadoop ecosystem. She has a PhD in biochemistry and years of experience as a research scientist, and has written about a wide range of technical topics including biology, oceanography and the genetics of learning and memory.

Ellen thinks rabbits are funny, so she helped design magic-themed cartoons in the book "A Rabbit Under the Hat."

Author's Posts

Posted on February 17, 2017 by Ellen Friedman

Does this sound disturbing? You try to reach a particular website only to find the site is down. But it’s not that simple. You try another site – also not reachable. And another and another… You look to social media for in-the-moment reports about what’s happening and while you are reading about a huge swath of the country under cyber attack, that social media site goes out, too.

Posted on September 7, 2016 by Ellen Friedman

In this week's Whiteboard Walkthrough, Ellen Friedman, Solutions Consultant at MapR, describes what happens when certain fundamental big data capabilities are engineered together as part of the same technology. This brief overview compares using a converged data platform as a foundation for big data projects with building solutions on a base of separate pieces.

Posted on August 16, 2016 by Ellen Friedman

It’s not just a concern when ordering coffee. Something similar can happen as we investigate new and innovative big data technologies and techniques. I used the cappuccino example in a talk I presented recently at the Strata + Hadoop World Conference in London. The talk, titled “Building Better Cross Team Communication,” highlighted the importance of identifying and addressing the difference in how each side thinks the world works when two groups that have different experience and skills come together.

Posted on July 19, 2016 by Ellen Friedman

In January, I made predictions about six big data trends for 2016 (“What Will You Do in 2016?”). Now we’ve reached the mid-and-a-bit-more year, so it’s a good time to check them out and see how well these predictions match what has happened so far in 2016, what is surprising about that, and what’s likely to come in the second half of the year.

Posted on June 21, 2016 by Ellen Friedman

Streaming data can be used as a long-term auditable history when you choose a messaging system with persistence, but is this approach practical in terms of the cost of storing years of data at scale? The answer is “yes”, particularly because of the way topic partitions are handled in MapR Streams. Here’s how it works.

Posted on June 13, 2016 by Ellen Friedman

The power of SQL for business analytics is a given, but the challenge in big data settings is that SQL is normally a static language that assumes a pre-defined, fixed and well-known schema. SQL also needs flat data structures. It has been assumed that you need a fixed schema for performance.

Posted on June 8, 2016 by Ellen Friedman

In this week's Whiteboard Walkthrough, Ellen Friedman, a consultant at MapR, talks about how to design a system to handle real-time applications, and also how to take advantage of streaming data beyond those in-the-moment insights.

Posted on May 16, 2016 by Ellen Friedman

Streaming data is now a big focus for many big data projects, including real-time applications, so there’s a lot of interest in excellent messaging technologies such as Apache Kafka or MapR Streams, which uses the Kafka 0.9 API.

Posted on May 11, 2016 by Ellen Friedman

What capabilities should you look for in a messaging system when you design the architecture for a streaming data project? Let’s start with a hypothetical IoT data aggregation example to illustrate specific business goals and the requirements they place on messaging technology and data architecture needed to meet those goals...

Posted on April 25, 2016 by Ellen Friedman

Organizations embracing big data are ready to put data to work, including looking for ways to effectively analyze data from a variety of sources in real time or near real time.

