H2O Platform Workshop

by Hassan Namarvar, Principal Data Scientist

The engineering team at ShareThis met on Wednesday for a hands-on H2O workshop. During the workshop, I introduced H2O, the world’s fastest open source in-memory platform for machine learning and predictive analytics. By the end of the session, the team was able to:

1) Get familiar with important features of the H2O platform versus other open source machine learning tools.

2) Download the bleeding-edge version of the platform, install it on their own local machines, and use the platform Web API to upload a big dataset and explore the data.

3) Build a CPA (cost per action) model using the GLM (generalized linear model) on a ShareThis campaign’s real dataset.

4) Validate the model on a test set and interpret the results.

5) Build more advanced models, such as GBMs (gradient boosting machines) and Big Data Random Forests, and compare the performance of these models using the multi-modeling scores module.

6) Discuss the superior results of this same GLM model, which was deployed to production and A/B tested on an actual campaign over the past two months.
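
For readers who want a feel for what steps 3 and 4 involve, here is a minimal sketch in plain Python. It is not H2O code and not the model we built in the workshop: it fits a binomial GLM (logistic regression) by simple batch gradient descent and scores it on a held-out test set using AUC. The dataset is synthetic, and the two features, the coefficients used to generate it, the learning rate and the helper names (fit_logistic_glm, make_synthetic and so on) are all assumptions made up for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function (inverse of the logit link).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, x):
    # weights[0] is the intercept; the rest pair up with the features in x.
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logistic_glm(rows, labels, lr=0.5, epochs=300):
    # Batch gradient descent on the mean negative log-likelihood.
    # lr and epochs are illustrative values, not tuned settings.
    n = len(rows)
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def auc(scores, labels):
    # Probability that a random positive outscores a random negative
    # (ties count as half a win).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n, rng):
    # Hypothetical data: two features and a made-up "true" model that
    # decides whether an impression converts (label 1) or not (label 0).
    rows, labels = [], []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        p = sigmoid(-1.0 + 2.0 * x[0] - 1.5 * x[1])
        rows.append(x)
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


rng = random.Random(0)
train_x, train_y = make_synthetic(800, rng)
test_x, test_y = make_synthetic(200, rng)

weights = fit_logistic_glm(train_x, train_y)
test_auc = auc([predict(weights, x) for x in test_x], test_y)
print(f"intercept and coefficients: {weights}")
print(f"test AUC: {test_auc:.3f}")
```

The sign of each fitted coefficient tells you whether a feature pushes the predicted conversion probability up or down, which is the kind of interpretation step 4 refers to. A platform like H2O handles the same fit and validation on far larger datasets without any of this hand-written code.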

Overall, the team was able to reproduce highly advanced online advertisement optimization models in less than an hour! Without the H2O platform, the whole end-to-end process could have taken months, even for a savvy data scientist.