Today was the day – Northwestern University conferred the degree of Master of Science in Predictive Analytics upon Andre Obereigner ;O)

A view of the server room at The National Archives

Access Unlimited Computation Resources: Amazon’s EC2

Even the best personal laptop quickly reaches its limits when faced with analytics tasks. While contestants in various Kaggle competitions report that they often do well with 4 cores and 8 – 16 GB of RAM, my own experience tells me that building many models on even moderately sized data sets, as well as tuning their parameters, requires a different sort of machine. Amazon and its Elastic Compute Cloud (Amazon EC2) come to our rescue.

Read More

A Text Prediction App in Collaboration with Coursera & SwiftKey

The text prediction app is the result of the Coursera Data Science Capstone project in collaboration with SwiftKey.

The objective of the capstone project was to (1) build a model that predicts the next term in a sequence of words, and (2) encapsulate the result in an appropriate user interface using Shiny. You can try out the Text Prediction App on the Shiny server.

Read More

Workforce Analytics Awards 2016 | The US Shortlist

From plenty of unstructured information to valuable insights using text mining and analytics!

The text analytics project which I led at Groupon aimed at the successful analysis of survey comments and the accurate prediction of topic labels for each of the comments. I am happy to read that our project was short-listed again – this time for the US Workforce Analytics Award 2016.

Read More

Workforce Analytics Awards 2016 | The European Shortlist

From plenty of unstructured information to valuable insights using text mining and analytics!

The text analytics project which I led at Groupon aimed at the successful analysis of survey comments and the accurate prediction of topic labels for each of the comments. I am happy to read that our project was short-listed for the European Workforce Analytics Award 2016.

Read More

“In Data We Trust” at the Workforce Analytics Summit in NYC

“In Data We Trust: Improving Data Quality To Add Credibility To People Analytics”. That was the topic of the presentation which I gave at the Workforce Analytics Summit sponsored by IBM in New York City in June 2015.

Data quality was an important topic at the conference because it represents the very foundation for the analysis of data. While concerns about the completeness and precision of available data should not hold companies back from launching their people analytics initiatives, a continuous and systematic improvement of data quality is essential for building trust in the insights derived from organizational information. My aim was to raise awareness of data quality as a prerequisite for any analytical endeavor in general and workforce analytics in particular.

Read More

Statistical Graphics, Exploratory Data Analysis and Data Visualization

I was recently told that statistical graphics are commonly used throughout the modeling process and that the term “data visualization” appears frequently in conjunction with the term “analytics”. The question that followed was: How are statistical graphics used in exploratory data analysis, and is there a difference between the terms “statistical graphics” and “data visualization”? So I did some reading and thinking, with the following outcomes …

Read More

Statistical graphs are central to effective data analysis, both in the early stages of an investigation and in statistical modeling.

John Fox

Predictive Analytics versus Predictive Modeling

My initial thought was that Predictive Analytics refers to an overall field of expertise, while Predictive Modeling refers to an activity in which individuals apply potentially relevant mathematical algorithms to data sets in order to learn their structure, which can then be applied to new observations to make predictions.

Read More

Strategy of Predicting Repeat Restaurant Bookings

This post responds to a request to share my strategy for approaching the Kaggle in-class competition “Predict Repeat Restaurant Bookings” (https://inclass.kaggle.com/c/predict-repeat-restaurant-bookings). I am still new to Kaggle competitions, and yes, this was my first one. I see Kaggle as a terrific playground for applying the theory that I am currently learning during my Master of Science studies in Predictive Analytics.

Read More