Personal Information
Organization / Workplace
United States
Occupation
Data Scientist
Industry
Technology / Software / Internet
About
I lead the development and deployment of scalable models, with expertise in both real-time and big data architecture.
= Apache: Spark, Hadoop, Pig, Hive, and Oozie.
= Python: scikit-learn, pandas, NumPy, and Luigi.
= R: PivotalR, MADlib, and time series analysis with X-12-ARIMA.
= Modeling: MLlib, H2O, yhat, and Sense.
= Machine Learning: Random Forests, Clustering, Association Rules, and Logistic Regression.
= Software Development: Streaming, Distributed Systems, REST APIs.
= Visualization: Matplotlib, ggplot2, Seaborn, and D3.
= Database: Hive, Postgres, and SQL.
I build data science pipelines and frameworks (see my presentations below).
Tags
model
classification
machine learning
kaggle
predictive analytics
analytics
data science
software
scikit-learn
logistic regression
xgboost
tensorflow
pipeline
pandas
python
gradient boosting
random forest
framework
stock market
regression
market analysis
change point
nfl
fantasy
sports
Presentations (3)
Likes (4)
AlphaPy
Robert Scott • 7 years ago
kaggle_meet_up
Marios Michailidis • 7 years ago
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Vivian S. Zhang • 8 years ago
General Tips for participating Kaggle Competitions
Mark Peng • 8 years ago