Feature Engineering
Dmitry Larko, Sr. Data Scientist @ H2O.ai
dmitry@h2o.ai
About me
6x
"Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering." ~ Andrew Ng

Feature Engineering

"… some machine learning projects succeed and some fail. What makes the difference? Easily the most important factor is the features used." ~ Pedro Domingos

"Good data preparation and feature engineering is integral to better prediction" ~ Marios Michailidis (KazAnova), Kaggle Grandmaster, Kaggle #3, former #1

"You have to turn your inputs into things the algorithm can understand" ~ Shayne Miel, answer to "What is the intuitive explanation of feature engineering in machine learning?"
What is feature engineering
Not possible to separate the classes using a linear classifier.
What if we use polar coordinates instead?
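The concentric-circles example above can be checked numerically. A minimal sketch with synthetic data (the data and function names are my own, not from the talk): in (x, y) coordinates no line separates the classes, but after switching to polar coordinates the radius alone does.

```python
import numpy as np

rng = np.random.default_rng(0)

# two classes: points on a small circle (r = 1) and a large circle (r = 3)
theta = rng.uniform(0, 2 * np.pi, 200)
inner = np.c_[np.cos(theta), np.sin(theta)]
outer = np.c_[3 * np.cos(theta), 3 * np.sin(theta)]

def to_polar(xy):
    """Convert (x, y) pairs to (radius, angle) pairs."""
    r = np.hypot(xy[:, 0], xy[:, 1])
    phi = np.arctan2(xy[:, 1], xy[:, 0])
    return np.c_[r, phi]

# in polar coordinates a single threshold on the radius separates the classes
assert to_polar(inner)[:, 0].max() < to_polar(outer)[:, 0].min()
```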
Feature Engineering
That's NOT Feature Engineering
• Initial data collection
• Creating the target variable
• Removing duplicates, handling missing values, or fixing mislabeled classes — that's data cleaning
• Scaling or normalization
• Feature selection, although I'm going to mention it in this talk :)
Feature Engineering cycle
Dataset → Hypothesis set → Apply hypothesis → Validate hypothesis → back to new hypotheses

How to build the hypothesis set?
• Domain knowledge
• Prior experience
• EDA
• ML model feedback

How to validate a hypothesis?
• Cross-validation
• Measurement of desired metrics
• Avoid leakage
Why feature engineering is hard
• Powerful feature transformations (like target encoding) can introduce leakage when applied wrong
• Usually requires domain knowledge about how features interact with each other
• Time-consuming: you may need to run thousands of experiments
Feature Engineering
• Extract new "gold" features, remove irrelevant or noisy features
• Simpler models with better results
• Key elements:
  • Target transformation
  • Feature encoding
  • Feature extraction
Target Transformation
• Predictor/response variable transformation
• Use it when a variable shows a skewed distribution; transforming makes the residuals closer to a normal distribution (bell curve)
• Can improve model fit
• log(x), log(x+1), sqrt(x), sqrt(x+1), etc.
Target Transformation
In Liberty Mutual Group: Property Inspection Prediction
Different transformations might lead to different results
Feature Encoding
• Turn categorical features into numeric features to provide more fine-grained information
• Helps explicitly capture non-linear relationships and interactions between the values of features
• Most machine learning tools only accept numbers as their input
  • xgboost, gbm, glmnet, libsvm, liblinear, etc.
Feature Encoding
• Label Encoding
  • Interprets the categories as ordered integers (mostly wrong)
  • Python scikit-learn: LabelEncoder
  • OK for tree-based methods
• One-Hot Encoding
  • Transforms categories into individual binary (0 or 1) features
  • Python scikit-learn: DictVectorizer, OneHotEncoder
  • OK for K-means, linear models, NNs, etc.
Feature Encoding
• Label Encoding

Feature   Encoded Feature
A         0
A         0
A         0
A         0
B         1
B         1
B         1
C         2
C         2

Mapping: A → 0, B → 1, C → 2
Feature Encoding
• One-Hot Encoding

Feature   Feature=A   Feature=B   Feature=C
A         1           0           0
A         1           0           0
A         1           0           0
A         1           0           0
B         0           1           0
B         0           1           0
B         0           1           0
C         0           0           1
C         0           0           1

Mapping: A → (1, 0, 0), B → (0, 1, 0), C → (0, 0, 1)
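Both tables above can be reproduced in a few lines. The talk names scikit-learn's LabelEncoder and OneHotEncoder; this sketch uses the pandas equivalents (`cat.codes`, `get_dummies`) for brevity — the column names are just `get_dummies` defaults.

```python
import pandas as pd

df = pd.DataFrame({"feature": ["A", "A", "A", "A", "B", "B", "B", "C", "C"]})

# label encoding: map each category to an integer code (A -> 0, B -> 1, C -> 2)
df["label"] = df["feature"].astype("category").cat.codes

# one-hot encoding: one binary column per category level
onehot = pd.get_dummies(df["feature"], prefix="feature").astype(int)
```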
Feature Encoding
• Frequency Encoding
  • Encode the categorical levels of a feature to values between 0 and 1 based on their relative frequency

Feature   Encoded Feature
A         0.44
A         0.44
A         0.44
A         0.44
B         0.33
B         0.33
B         0.33
C         0.22
C         0.22

Mapping: A → 0.44 (4 out of 9), B → 0.33 (3 out of 9), C → 0.22 (2 out of 9)
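The frequency-encoding table above is a one-liner in pandas — a sketch on the same nine-row toy data:

```python
import pandas as pd

s = pd.Series(["A", "A", "A", "A", "B", "B", "B", "C", "C"])

# relative frequency of each level, mapped back onto every row
freq = s.map(s.value_counts(normalize=True))  # A -> 4/9, B -> 3/9, C -> 2/9
```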
Feature Encoding - Target mean encoding
• Instead of dummy-encoding categorical variables and increasing the number of features, we can encode each level as the mean of the response.

Feature   Outcome   MeanEncode
A         1         0.75
A         0         0.75
A         1         0.75
A         1         0.75
B         1         0.66
B         1         0.66
B         0         0.66
C         1         1.00
C         1         1.00

Mapping: A → 0.75 (3 out of 4), B → 0.66 (2 out of 3), C → 1.00 (2 out of 2)
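The mean-encoding table above, reproduced with a groupby — a minimal sketch on the same toy data:

```python
import pandas as pd

df = pd.DataFrame({"feature": list("AAAABBBCC"),
                   "outcome": [1, 0, 1, 1, 1, 1, 0, 1, 1]})

# encode each level as the mean of the response over that level
df["mean_enc"] = df.groupby("feature")["outcome"].transform("mean")
```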
Feature Encoding - Target mean encoding
• It is also better to calculate a weighted average of the overall mean of the training set and the mean of the level:

  λ(n) · mean(level) + (1 − λ(n)) · mean(dataset)

• The weights are based on the frequency of the levels, i.e. if a category only appears a few times in the dataset then its encoded value will be close to the overall mean instead of the mean of that level.
Feature Encoding - Target mean encoding: λ(n) example

λ(x) = 1 / (1 + exp(−(x − k) / f))

x = frequency, k = inflection point, f = steepness
https://www.desmos.com/calculator
Feature Encoding - Target mean encoding - Smoothing

Feature   Outcome
A         1
A         0
A         1
A         1
B         1
B         1
B         0
C         1
C         1

With λ = 1 / (1 + exp(−(x − 2) / 0.25)):

Level   x   mean(level)   mean(dataset)   λ      Smoothed encoding
A       4   0.75          0.77            0.99   0.99·0.75 + 0.01·0.77 = 0.7502
B       3   0.66          0.77            0.98   0.98·0.66 + 0.02·0.77 = 0.6622
C       2   1.00          0.77            0.5    0.5·1.00 + 0.5·0.77 = 0.885

With λ = 1 / (1 + exp(−(x − 3) / 0.25)):

Level   x   mean(level)   mean(dataset)   λ       Smoothed encoding
A       4   0.75          0.77            0.98    0.98·0.75 + 0.02·0.77 = 0.7504
B       3   0.66          0.77            0.5     0.5·0.66 + 0.5·0.77 = 0.715
C       2   1.00          0.77            0.018   0.018·1.00 + 0.982·0.77 = 0.774
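The smoothing scheme above can be packaged as a small helper — a sketch on the toy data with k = 2, f = 0.25 (the function name `smoothed_target_encode` is my own, not from the talk):

```python
import numpy as np
import pandas as pd

def smoothed_target_encode(df, feature, target, k=2.0, f=0.25):
    """Blend each level's mean with the global mean, weighted by level frequency
    via lambda(x) = 1 / (1 + exp(-(x - k) / f))."""
    global_mean = df[target].mean()
    stats = df.groupby(feature)[target].agg(["mean", "count"])
    lam = 1.0 / (1.0 + np.exp(-(stats["count"] - k) / f))
    encoding = lam * stats["mean"] + (1.0 - lam) * global_mean
    return df[feature].map(encoding)

df = pd.DataFrame({"feature": list("AAAABBBCC"),
                   "outcome": [1, 0, 1, 1, 1, 1, 0, 1, 1]})
enc = smoothed_target_encode(df, "feature", "outcome", k=2, f=0.25)
```

With k = 2 the rare level C (2 occurrences) sits at λ = 0.5, halfway between its own mean and the dataset mean, matching the first table above.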
Feature Encoding - Target mean encoding
• To avoid overfitting we could use a leave-one-out schema: encode each row with the mean of the response over all other rows of the same level.

Feature   Outcome   LOOencode
A         1         0.66
A         0         1.00
A         1         0.66
A         1         0.66
B         1         0.50
B         1         0.50
B         0         1.00
C         1         1.00
C         1         1.00
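Leave-one-out encoding has a closed form — subtract the row's own outcome from the level's sum before averaging. A sketch on the same toy data:

```python
import pandas as pd

df = pd.DataFrame({"feature": list("AAAABBBCC"),
                   "outcome": [1, 0, 1, 1, 1, 1, 0, 1, 1]})

g = df.groupby("feature")["outcome"]
# leave-one-out mean: (group sum - own outcome) / (group size - 1)
df["loo"] = (g.transform("sum") - df["outcome"]) / (g.transform("count") - 1)
```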
Feature Encoding - Weight of Evidence

WoE = ln( % non-events / % events )

To avoid division by zero:

WoE_adj = ln( ((number of non-events in a group + 0.5) / number of non-events) / ((number of events in a group + 0.5) / number of events) )
Feature Encoding - Weight of Evidence

Feature   Outcome   WoE
A         1         0.4
A         0         0.4
A         1         0.4
A         1         0.4
B         1         0.74
B         1         0.74
B         0         0.74
C         1         -0.35
C         1         -0.35

Level   Non-events   Events   % of non-events   % of events   WoE
A       1            3        50                42            ln( ((1 + 0.5)/2) / ((3 + 0.5)/7) ) = 0.4
B       1            2        50                29            ln( ((1 + 0.5)/2) / ((2 + 0.5)/7) ) = 0.74
C       0            2        0                 29            ln( ((0 + 0.5)/2) / ((2 + 0.5)/7) ) = −0.35
Feature Encoding - Weight of Evidence and Information Value

IV = Σ ( % non-events − % events ) · WoE

Level   Non-events   Events   % of non-events   % of events   WoE     IV
A       1            3        50                42            0.4     (0.50 − 0.42) · 0.4 = 0.032
B       1            2        50                29            0.74    (0.50 − 0.29) · 0.74 = 0.155
C       0            2        0                 29            −0.35   (0 − 0.29) · (−0.35) = 0.102

Total IV ≈ 0.29
Feature Encoding - Weight of Evidence and Information Value

Information Value   Variable Predictiveness
Less than 0.02      Not useful for prediction
0.02 to 0.1         Weak predictive power
0.1 to 0.3          Medium predictive power
0.3 to 0.5          Strong predictive power
> 0.5               Suspicious predictive power
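The WoE and IV computation above, end to end on the toy data — a sketch using the +0.5 adjustment to avoid division by zero (variable names are my own):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"feature": list("AAAABBBCC"),
                   "outcome": [1, 0, 1, 1, 1, 1, 0, 1, 1]})

grp = df.groupby("feature")["outcome"].agg(events="sum", total="count")
grp["non_events"] = grp["total"] - grp["events"]

tot_events = grp["events"].sum()        # 7
tot_non = grp["non_events"].sum()       # 2

# adjusted WoE: +0.5 in the numerators avoids ln(0) / division by zero
woe = np.log(((grp["non_events"] + 0.5) / tot_non) /
             ((grp["events"] + 0.5) / tot_events))

# IV = sum over levels of (% non-events - % events) * WoE
iv = ((grp["non_events"] / tot_non - grp["events"] / tot_events) * woe).sum()
```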
Feature Encoding - Numerical features
• Binning using quantiles (equal population in each bin) or histograms (bins of equal width)
• Replace with the bin's mean or median
• Treat the bin id as a category level and use any categorical encoding schema
• Dimensionality reduction techniques - SVD and PCA
• Clustering: use cluster IDs and/or distances to cluster centers as new features
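The two binning schemes map directly onto pandas — a sketch on synthetic skewed data (the data is my own, for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = pd.Series(rng.exponential(scale=10.0, size=1000))

# quantile binning: equal population in each bin
q_bins = pd.qcut(x, q=4, labels=False)

# histogram binning: equal-width bins
w_bins = pd.cut(x, bins=4, labels=False)

# replace each value with its bin's mean
bin_mean = x.groupby(q_bins).transform("mean")
```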
Feature Interaction
• y = x₁² − x₂² + x₂ − 1
• Add x₁² and x₂² as new features
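A sketch of why this works: a linear model cannot fit the quadratic target above from the raw features, but once x₁² and x₂² are added as columns the relationship becomes exactly linear (synthetic data, my own setup):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 2))
y = X[:, 0] ** 2 - X[:, 1] ** 2 + X[:, 1] - 1

# raw features: the quadratic terms are invisible to a linear model
r2_raw = LinearRegression().fit(X, y).score(X, y)

# add x1^2 and x2^2 as new features: the target is now linear in the features
X_aug = np.c_[X, X[:, 0] ** 2, X[:, 1] ** 2]
r2_aug = LinearRegression().fit(X_aug, y).score(X_aug, y)
```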
Feature Interaction - how to find?
• Domain knowledge
• ML algorithm behavior (for example, analyzing GBM splits or linear regressor weights)
Feature Interaction - how to model?
• Clustering and kNN for numerical features
• Target encoding for pairs (or even triplets, etc.) of categorical features
• Encode categorical features by stats of numerical features
Feature Extraction
• There are usually meaningful features hidden inside existing features; you need to extract them manually
• Some examples:
  • Location
    • Address, city, state and zip code .... (categorical or numeric)
  • Time
    • Year, month, day, hour, minute, time ranges, .... (numeric)
    • Weekday or weekend (binary)
    • Morning, noon, afternoon, evening, ... (categorical)
  • Numbers
    • Turn age numbers into ranges (ordinal or categorical)
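The time examples above map onto the pandas `.dt` accessor — a sketch with two made-up timestamps:

```python
import pandas as pd

ts = pd.to_datetime(pd.Series(["2017-05-19 08:30:00",    # a Friday morning
                               "2017-05-21 19:05:00"]))  # a Sunday evening

feats = pd.DataFrame({
    "year": ts.dt.year,
    "month": ts.dt.month,
    "hour": ts.dt.hour,
    "weekday": ts.dt.dayofweek,                      # Monday = 0 ... Sunday = 6
    "is_weekend": (ts.dt.dayofweek >= 5).astype(int),
})
```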
Feature Extraction: Textual Data
• Bag-of-words: extract tokens from text and use their occurrences (or TF-IDF weights) as features
• Requires some NLP techniques to aggregate token counts more precisely:
  • Split tokens into sub-tokens by delimiters or case changes
  • N-grams at word (often 2-5 grams) or character level
  • Stemming for English words
  • Remove stop words (not always necessary)
  • Convert all words to lower case
• Bag-of-words tools
  • Python: CountVectorizer, TfidfTransformer in the scikit-learn package
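A minimal sketch of the scikit-learn tools named above, on two made-up documents (CountVectorizer lower-cases and tokenizes by default; TfidfVectorizer combines counting and TF-IDF weighting in one step):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["feature engineering is key",
        "feature encoding and feature extraction"]

cv = CountVectorizer()
counts = cv.fit_transform(docs)            # sparse matrix of token occurrences

tv = TfidfVectorizer(ngram_range=(1, 2))   # word unigrams and bigrams, TF-IDF weighted
tfidf = tv.fit_transform(docs)
```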
Feature Extraction: Textual Data
• Deep learning for textual data
  • Turn each token into a vector of predefined size
  • Helps compute "semantic distance" between tokens/words
    • For example, the semantic distance between a user query and product titles in search results (how relevant?)
  • Greatly reduces the number of text features used for training
    • Use the average vector of all words in a given text
    • Vector size: 100-300 is often enough
• Tools
  • Word2vec, Doc2vec, GloVe
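The average-vector idea can be sketched without any pretrained model — the toy 3-dimensional "embeddings" below are entirely made up and stand in for real word2vec/GloVe vectors, which would be 100-300 dimensional:

```python
import numpy as np

# toy lookup table standing in for pretrained word2vec / GloVe vectors
embeddings = {
    "good": np.array([0.9, 0.1, 0.0]),
    "great": np.array([0.8, 0.2, 0.1]),
    "terrible": np.array([-0.7, 0.1, 0.3]),
}

def text_vector(text):
    """Average the vectors of all known words in the text."""
    vecs = [embeddings[w] for w in text.lower().split() if w in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# semantically close texts end up with a higher cosine similarity
assert cosine(text_vector("good"), text_vector("great")) > \
       cosine(text_vector("good"), text_vector("terrible"))
```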
End
