Personal Information
Organization/Workplace
San Francisco Bay Area, United States
Position
Senior Research Engineer at Netflix
Industry
Education
Site
www.dbtsai.com
About
Big Data Machine Learning Engineer with a strong background in computer science, theoretical physics, and mathematics. I have a deep understanding of implementing data mining algorithms in scalable ways, not just using them as a consumer.
I'm a big fan of Scala and have been using it to develop scalable, distributed data mining algorithms with Apache Spark. I've been involved with open-source Apache Spark development as a contributor. Apache Spark is a fast and general engine for large-scale data processing, and it fits into the Hadoop open-source ecosystem.
Specialties:
• Machine Learning and Data Mining.
• Distributed/Parallel Computing and Big Data Processing.
• Expert in Apache Hadoop.
Tags
machine learning
spark
mapreduce
hadoop
mllib
alpine data labs
big data
logistic regression
netflix
data mining
apache spark
multinomial
l-bfgs
recommendation
pipeline
kernel methods
linear models
polynomial mapping
feature engineering
linear regression
ml
spark summit
elastic-net
batch layer
serving layer
speed layer
spark streaming
pig
lambda architecture
real time
storm
stream
large scale
iot
internet of things
svd
k-means
unsupervised learning
See more
Presentations (9)
Liked (4)
Distributed Time Travel for Feature Generation at Netflix
sfbiganalytics • 8 years ago
Introducing Windowing Functions (pgCon 2009)
PostgreSQL Experts, Inc. • 11 years ago
Multinomial Logistic Regression with Apache Spark
DB Tsai • 10 years ago