Technical Tips

Using Spark GraphFrames to Analyze Facebook Connections

Sooner or later, if you eyeball enough data sets, you will encounter some that look like a graph, or are best represented as a graph. Whether it's social media, computer networks, or interactions between machines, a graph is often a straightforward way to represent relationships among entities.

The Ultimate 3-Minute Guide to Time Series Data and OpenTSDB

What is a time series? A time series is a sequence of data points ordered in time. Time series data comes in many shapes and appears in many facets of everyday life, such as measurements of rainfall, earthquake activity, or even stock prices. With the growth of the Internet of Things, the volume of time series data you can collect is staggering, reaching 100 million data points per second.

Four Examples of Characterizations for Discovery from Big Data

We previously discussed the “Top 8 Reasons that Characterization is Right for Your Data.” Here we move the discussion of characterization from the theoretical to the practical, by providing four simple examples of characterizations of data. In each of these cases, the set of characterizations that are generated can then be fed into different types of analytics algorithms for discovery from your data: predictive patterns, clusters (segments), associations, correlations, trends, and anomalies (outliers, surprises).

Let Spark Fly: Advantages and Use Cases for Spark on Hadoop - Webinar Follow Up

Apache Spark is currently one of the most active projects in the Hadoop ecosystem, and there’s been plenty of hype about it in the past several months. In the latest webinar from the Data Science Central webinar series, titled “Let Spark Fly: Advantages and Use Cases for Spark on Hadoop,” we cut through the noise to uncover practical advantages for having the full set of Spark technologies at your disposal.

Sample Code, Best Practices and Technical Resources Now Available on Developer Central

The newly launched Developer Central is a place just for the developer community. Full of code samples and best practices, Developer Central will help you get started on Hadoop and manage your clusters efficiently. The three core content areas of Developer Central are Code, Architecture and Resources.
