This webinar discusses cloud-based machine learning platforms in detail and identifies suitable business use cases for each of them: Microsoft Azure ML, Amazon Machine Learning, and Databricks Cloud
4. Comparison Approach:
What do we look for in an online Machine Learning Platform?
Data Preparation
• Data Ingestion (out of the box support of data sources) & Data Export
• Data Cleaning, Transformation, Visualization
Data Selection
• Feature selection/engineering
Algorithms
• Which algorithms are supported out of the box? Can they be modified, or new ones created?
• Saving/comparing results
Optimize
• E.g. Identify the optimal parameter settings for algorithms
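As an illustration of the "Optimize" criterion, here is a minimal sketch of an exhaustive grid search over parameter settings. It is not tied to any of the three platforms: `grid_search`, `toy_score`, and the parameter names `learning_rate` and `max_depth` are hypothetical, and `toy_score` merely stands in for "train a model with these settings and return a validation score".

```python
import itertools
from typing import Any, Callable, Dict, Sequence, Tuple


def grid_search(
    score_fn: Callable[[Dict[str, Any]], float],
    grid: Dict[str, Sequence[Any]],
) -> Tuple[Dict[str, Any], float]:
    """Try every combination in `grid` and return the best (params, score).

    Higher scores are treated as better.
    """
    if not grid or any(len(values) == 0 for values in grid.values()):
        raise ValueError("grid must contain at least one value per parameter")

    names = list(grid)
    best_params: Dict[str, Any] = {}
    best_score = float("-inf")

    # itertools.product enumerates the Cartesian product of all value lists.
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score

    return best_params, best_score


def toy_score(params: Dict[str, Any]) -> float:
    """Hypothetical stand-in for a validation score; peaks at (0.1, 4)."""
    return -((params["learning_rate"] - 0.1) ** 2) - ((params["max_depth"] - 4) ** 2)


if __name__ == "__main__":
    candidate_grid = {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [2, 4, 8]}
    print(grid_search(toy_score, candidate_grid))
```

In practice the scoring function would train and evaluate a real model, so the cost grows with the number of combinations; this is one reason built-in support for parameter tuning is worth comparing across platforms.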
Knowledge
9. Amazon ML
Available Performance Metrics
• BinaryAUC: The binary MLModel uses the Area Under the Curve (AUC)
technique to measure performance.
• RegressionRMSE: The regression MLModel uses the Root Mean Square
Error (RMSE) technique to measure performance. RMSE measures the
difference between predicted and actual values for a single variable.
• MulticlassAvgFScore: The multiclass MLModel uses the F1 score technique
to measure performance.
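To make the three metrics concrete, the following is a minimal plain-Python sketch of how each can be computed. It is not Amazon ML's implementation: the function names are hypothetical, AUC is computed by direct pairwise comparison (practical only for small samples), and the multiclass score is assumed to be the unweighted (macro) average of per-class F1 scores.

```python
import math
from typing import Hashable, Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Labels are 1 (positive) or 0 (negative); tied scores count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def macro_f1(actual: Sequence[Hashable], predicted: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(actual, predicted))
    f1_scores = []
    for cls in set(actual) | set(predicted):
        tp = sum(1 for a, p in pairs if a == cls and p == cls)
        fp = sum(1 for a, p in pairs if a != cls and p == cls)
        fn = sum(1 for a, p in pairs if a == cls and p != cls)
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    print(rmse([0.0, 0.0], [3.0, 4.0]))
    print(binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    print(macro_f1(["a", "a", "b", "c"], ["a", "b", "b", "c"]))
```

For AUC and F1, higher is better (1.0 is a perfect score); for RMSE, lower is better (0 means predictions match the actual values exactly).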
27. Databricks Cloud
• Spark has joined Hadoop as a de facto industry standard for distributed
computing
• Rapidly approaching the popularity of Hadoop
– And supplanting it if/when organizations can make the switch
• Databricks is a spin-off of the Berkeley AMPLab, the original creators of Spark
• Databricks staff include a large fraction of the Spark core committers
• And an even larger proportion of the key decision makers / "shepherds"
– Including the spark.ml/mllib shepherds
• Cloud-based availability of Spark, including Spark SQL and spark.ml
• Access to the capabilities of Spark MLlib, Spark DataFrames/SQL, Streaming, and
Resilient Distributed Datasets
• Notebooks approach: Scala, Python, Java, and R
34. Wrap-up/Summary
Amazon ML may be sufficient for:
- customers that already have data residing in those providers
- cases where simpler/fewer options are acceptable
Azure ML has a strong usability and workflow approach and provides a reasonable
cross section of algorithms for casual & intermediate users
Databricks Cloud has the most comprehensive offering
– Variety, performance, and configurability of algorithms
– Richness of the notebooks' capabilities
– Options/configurability of the hosting clusters/environment