ARTIFICIAL NEURAL NETWORKS PROBLEM
YAKUP GÖRÜR
13 DECEMBER 2016
TOPICS
• TalkingData
  • Introduction
  • Datasets
  • Sample submission
  • Some analysis
  • Basic solution
  • To Do
• Predicting a Biological Response
  • Introduction
  • Datasets
  • Solution
  • Model
  • Code
  • To Do
PROJECT
TALKINGDATA MOBILE USER DEMOGRAPHICS
A KAGGLE COMPETITION
TALKINGDATA MOBILE USER DEMOGRAPHICS
INTRODUCTION
• TalkingData, China’s largest third-party mobile data platform, is
seeking to leverage behavioral data from more than 70% of the
500 million mobile devices active daily in China to help its clients
better understand and interact with their audiences.
• In this competition, participants are challenged to build a model
that predicts users’ demographic characteristics based on:
  • Their app usage
  • Geolocation
  • Mobile device properties
KAGGLE WEB-SITE
• The data was obtained from kaggle.com as .csv files.
• Test data:
  • gender_age_test.csv
• Training data:
  • gender_age_train.csv
  • events.csv
  • phone_brand_device_model.csv
  • app_events.csv
  • app_labels.csv
  • label_categories.csv
gender_age_train.csv and gender_age_test.csv
• The two files are our training and test data.
• Our target variable, the one we are going to predict, is group.
• Training data: 74,645 records (40%)
• Test data: 112,071 records (60%)
events.csv and app_events.csv
• When a user uses the TalkingData SDK, the event is logged in the events data.
• Each event corresponds to a list of apps in app_events.
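The relationship between the two files can be sketched with toy data: each row of app_events refers to an event_id in events, so per-event app counts can be aggregated and joined back onto the event log. The column names follow those used in the code later in this deck; the values are invented for illustration.

```python
import pandas as pd

# Toy stand-ins for events.csv and app_events.csv (values are invented).
events = pd.DataFrame({
    "event_id": [1, 2, 3],
    "device_id": ["a", "a", "b"],
})
app_events = pd.DataFrame({
    "event_id": [1, 1, 2, 3, 3, 3],
    "app_id": [10, 11, 10, 12, 13, 14],
    "is_installed": [1, 1, 1, 1, 1, 1],
    "is_active": [1, 0, 0, 1, 1, 0],
})

# One row per event: how many apps were installed / active when it was logged.
per_event = (
    app_events.groupby("event_id")[["is_installed", "is_active"]]
    .sum()
    .rename(columns={"is_installed": "installed", "is_active": "active"})
    .reset_index()
)

# Attach the per-event counts to the event log.
events_with_apps = events.merge(per_event, on="event_id", how="left")
print(events_with_apps)
```

Aggregating before the merge keeps one row per event, so the event log does not grow when the app information is attached.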
app_labels.csv and label_categories.csv
• Apps and their labels
phone_brand_device_model.csv and english_phone_brand_device_model.csv
SAMPLE SUBMISSION
GEOLOCATION VISUALISATIONS
INVESTIGATING TIME AND DAY AND GENDER
USER PORTRAITS
• The 10 largest positive (red) and negative (blue) coefficients
import pandas as pd

# map_column() is a helper from the credited script; it is not shown on the slide.

def read_train_test():
    # App events: total installed / active apps per event
    print('Read app events...')
    ape = pd.read_csv("/Users/yakup/Downloads/TalkingData/app_events.csv")
    ape['installed'] = ape.groupby(['event_id'])['is_installed'].transform('sum')
    ape['active'] = ape.groupby(['event_id'])['is_active'].transform('sum')
    ape.drop(['is_installed', 'is_active'], axis=1, inplace=True)
    ape.drop_duplicates('event_id', keep='first', inplace=True)
    ape.drop(['app_id'], axis=1, inplace=True)

    # Events: number of events logged per device
    print('Read events...')
    events = pd.read_csv("/Users/yakup/Downloads/TalkingData/events.csv", dtype={'device_id': str})
    events['counts'] = events.groupby(['device_id'])['event_id'].transform('count')

    # The idea here is to count the number of installed apps using the data
    # from app_events.csv above. Also to count the number of active apps.
    events = pd.merge(events, ape, how='left', on='event_id')

    # Below is the original events_small table
    # events_small = events[['device_id', 'counts']].drop_duplicates('device_id', keep='first')
    # And this is the new events_small table with two extra features
    events_small = events[['device_id', 'counts', 'installed', 'active']].drop_duplicates('device_id', keep='first')

    # Phone brand: integer-encode brand and model
    print('Read brands...')
    pbd = pd.read_csv("/Users/yakup/Downloads/TalkingData/phone_brand_device_model.csv", dtype={'device_id': str})
    pbd.drop_duplicates('device_id', keep='first', inplace=True)
    pbd = map_column(pbd, 'phone_brand')
    pbd = map_column(pbd, 'device_model')

    # Train: keep only the encoded target, then join the device and event features
    print('Read train...')
    train = pd.read_csv("/Users/yakup/Downloads/TalkingData/gender_age_train.csv", dtype={'device_id': str})
    train = map_column(train, 'group')
    train = train.drop(['age', 'gender'], axis=1)
    train = pd.merge(train, pbd, how='left', on='device_id')
    train = pd.merge(train, events_small, how='left', on='device_id')
    train.fillna(-1, inplace=True)

    # Test: same joins as for the training data
    print('Read test...')
    test = pd.read_csv("/Users/yakup/Downloads/TalkingData/gender_age_test.csv", dtype={'device_id': str})
    test = pd.merge(test, pbd, how='left', on='device_id')
    test = pd.merge(test, events_small, how='left', on='device_id')
    test.fillna(-1, inplace=True)

    # Features: every test column except the device identifier
    features = list(test.columns.values)
    features.remove('device_id')
    return train, test, features

Thanks to @ZFTurbo
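`map_column` is called in the listing above but is not defined on the slide. A plausible version, assuming it replaces each distinct value of a categorical column with an integer code, might look like the sketch below; the actual helper in the credited script may differ.

```python
import pandas as pd


def map_column(table, column):
    """Hypothetical helper: integer-encode the distinct values of `column`."""
    labels = sorted(table[column].unique())
    mapping = {label: code for code, label in enumerate(labels)}
    result = table.copy()  # leave the caller's frame untouched
    result[column] = result[column].map(mapping)
    return result


demo = pd.DataFrame({"phone_brand": ["B", "A", "B", "C"]})
encoded = map_column(demo, "phone_brand")
print(encoded["phone_brand"].tolist())  # → [1, 0, 1, 2]
```

Sorting the labels first makes the codes reproducible between runs, which matters when the same encoding has to be applied to separate tables.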
XGBOOST SUBMISSION SAMPLE
• Uses only the users’ phone model, their apps, and the app labels
TO DO
• Also use latitude/longitude
• Also use the hours at which female and male users’ events occur
• Retrain the model and retest
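The second item could start from an hour-of-day feature per device. The sketch below is a minimal illustration with invented values; it assumes events.csv carries a `timestamp` column of date-time strings, which is not shown in this deck.

```python
import pandas as pd

# Toy event log; the `timestamp` column name is an assumption.
events = pd.DataFrame({
    "device_id": ["a", "a", "b"],
    "timestamp": ["2016-05-01 00:55:25", "2016-05-01 13:10:00", "2016-05-02 23:59:59"],
})
events["hour"] = pd.to_datetime(events["timestamp"]).dt.hour

# Share of each device's events per hour:
# one row per device, one column per observed hour.
hour_profile = pd.crosstab(events["device_id"], events["hour"], normalize="index")
print(hour_profile)
```

The resulting table can be merged onto the training data by device_id in the same way as the other feature tables.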
PROJECT
PREDICTING A BIOLOGICAL RESPONSE
A KAGGLE COMPETITION
PREDICTING A BIOLOGICAL RESPONSE
INTRODUCTION
• The development of a new drug largely depends on trial and
error.
• It typically involves synthesizing thousands of compounds before
one finally becomes a drug.
• As a result, this process is extremely expensive and slow.
• Therefore, the ability to accurately predict the biological activity
of molecules, and to understand the rationale behind those
predictions, is of great value.
COMPETITION AND DATA
• The objective of the competition is to build as good a model as
possible so that we can relate molecular information to an actual
biological response as well as this data allows.
• Purpose: predict the biological response of molecules from their
chemical properties.
• The data comes from the Kaggle.com competition
“Predicting a Biological Response”, held between March 16, 2012
and June 15, 2012 and re-enabled with new data in 2013.
KAGGLE WEB-SITE
• The data was obtained from kaggle.com as .csv files:
  • train.csv
  • test.csv
  • svm_benchmark.csv
TRAIN DATA
• The first column contains experimental data describing an actual
biological response (Active/Inactive).
• The remaining columns represent molecular descriptors
(D1 through D1776), e.g. size, shape, etc.
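Splitting the training table into targets and inputs could look like the sketch below. It reads a tiny in-memory CSV with three descriptors instead of 1776; the target column name `Activity` and its 1/0 encoding are assumptions. The names `traindatai` and `traindatao` mirror those used in the training code later in the deck.

```python
import io

import pandas as pd

# Tiny stand-in for train.csv (values are invented).
csv_text = """Activity,D1,D2,D3
1,0.0,0.5,0.1
0,0.3,0.2,0.9
1,0.7,0.1,0.4
"""
train_data = pd.read_csv(io.StringIO(csv_text))

# First column: biological response (assumed encoded as 1 = active, 0 = inactive).
# Selecting with a list keeps it two-dimensional: shape (n_samples, 1).
traindatao = train_data.iloc[:, [0]].values
# Remaining columns: molecular descriptors.
traindatai = train_data.iloc[:, 1:]

print(traindatao.shape, traindatai.shape)
```

Keeping the target as a column vector lets it be subtracted directly from a network output of shape (n_samples, 1).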
TEST DATA AND DESIRED OUTPUT
LEARNING
TRAINING ERROR IN EVERY ITERATION
Iteration  alpha=0.00001  alpha=0.0001    alpha=0.001      alpha=0.01       alpha=1     alpha=10    alpha=100
999        0.4970816592   0.496532726679  0.488427223939   0.13242254511    0.45774460  0.45774460  0.45774460
1999       0.49675409750  0.496383899255  0.253194405576   0.0957865492231  0.45774460  0.45774460  0.45774460
3999       0.49667002752  0.496179916176  0.106717625619   0.0611895628345  0.45774460  0.45774460  0.45774460
4999       0.49664307167  0.496062878454  0.0820233558756  0.0511976563565  0.45774460  0.45774460  0.45774460
5999       0.49661803106  0.495889077621  0.070172349899   0.0626155080386  0.45774460  0.45774460  0.45774460
6999       0.49659461636  0.495579874905  0.062604915027   0.050769160817   0.45774460  0.45774460  0.45774460
11999      0.49649684869  0.435855005691  0.0438280987103  0.0350799920535  0.45774460  0.45774460  0.45774460
12999      0.49648036118  0.397982602659  0.04226643999    0.0347661317427  0.45774460  0.45774460  0.45774460
18999      0.49639600726  0.268281553859  0.0378180439635  0.0347022410019  0.45774460  0.45774460  0.45774460
19999      0.49638381501  0.252608236007  0.0373610189016  0.0358672248921  0.45774460  0.45774460  0.45774460
TESTING
Alpha    Iterations  Error           Standard error
0.00001  20000       0.247995007025  0.147995070209
0.0001   20000       0.155827287349  0.085592708521
0.001    20000       0.302561724981  0.366689370858
0.01     20000       0.299910036583  0.367160158942
1        20000       0.453635315074  0.530927445033
10       20000       0.453635315074  0.530927445033
100      20000       0.453635315074  0.530927445033
MY OUTPUT’S ERROR
[Figure: error of the network output (y-axis, 0 to 0.5) for each alpha (x-axis: 0.00001, 0.0001, 0.001, 0.01, 1, 10, 100)]
MODEL
Input → Hidden Layer 1 → Hidden Layer 2 → Hidden Layer 3 → Output
(1776 → 11 → 4 → 2 → 1 units, per the weight shapes in the code)
import numpy as np
import pandas as pd

# Learning rates compared in the results tables
alphas = [0.00001, 0.0001, 0.001, 0.01, 1, 10, 100]

# logistic activation; called below but not shown on the slide (standard form assumed)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# convert output of sigmoid function to its derivative
def sigmoid_output_to_derivative(output):
    return output * (1 - output)

# traindatai (descriptor columns) and traindatao (response column) are loaded
# earlier in the script; y must have shape (n_samples, 1) to match layer_4
X = traindatai.values
y = traindatao
f = open('Error.txt', 'w')

for alpha in alphas:
    print("\nTraining With Alpha:" + str(alpha))
    np.random.seed(1)

    # randomly initialize our weights with mean 0
    synapse_0 = 2 * np.random.random((1776, 11)) - 1
    synapse_1 = 2 * np.random.random((11, 4)) - 1
    synapse_2 = 2 * np.random.random((4, 2)) - 1
    synapse_3 = 2 * np.random.random((2, 1)) - 1

    for j in range(20000):
        # Feed forward through layers 0,1,2,3,4
        layer_0 = X
        layer_1 = sigmoid(np.dot(layer_0, synapse_0))
        layer_2 = sigmoid(np.dot(layer_1, synapse_1))
        layer_3 = sigmoid(np.dot(layer_2, synapse_2))
        layer_4 = sigmoid(np.dot(layer_3, synapse_3))

        # how much did we miss the target value?
        layer_4_error = layer_4 - y
        if (j % 1000) == 999:
            print("Error After:" + str(j) + " iterations:" + str(np.mean(np.abs(layer_4_error))))

        # in what direction is the target value?
        # were we really sure? if so, don't change too much
        layer_4_delta = layer_4_error * sigmoid_output_to_derivative(layer_4)

        # how much did each l3 value contribute to the l4 error (according to the weights)?
        layer_3_error = layer_4_delta.dot(synapse_3.T)
        # in what direction is the target l3?
        # were we really sure? if so, don't change too much
        layer_3_delta = layer_3_error * sigmoid_output_to_derivative(layer_3)

        # how much did each l2 value contribute to the l3 error (according to the weights)?
        layer_2_error = layer_3_delta.dot(synapse_2.T)
        # in what direction is the target l2?
        # were we really sure? if so, don't change too much
        layer_2_delta = layer_2_error * sigmoid_output_to_derivative(layer_2)

        # how much did each l1 value contribute to the l2 error (according to the weights)?
        layer_1_error = layer_2_delta.dot(synapse_1.T)
        # in what direction is the target l1?
        # were we really sure? if so, don't change too much
        layer_1_delta = layer_1_error * sigmoid_output_to_derivative(layer_1)

        # gradient-descent update of every weight matrix
        synapse_3 -= alpha * (layer_3.T.dot(layer_4_delta))
        synapse_2 -= alpha * (layer_2.T.dot(layer_3_delta))
        synapse_1 -= alpha * (layer_1.T.dot(layer_2_delta))
        synapse_0 -= alpha * (layer_0.T.dot(layer_1_delta))

    # Evaluate on the test data (the slide's listing ends here)
    test_data = pd.read_csv('/Users/yakup/Downloads/Predicting a Biological Response/test.csv')  # Open file
    x = test_data.values
    layer_0 = x
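The listing stops after the test inputs are assigned to `layer_0`. A forward pass with trained weights might be sketched as follows. The function name `predict`, the random weights, and the toy layer sizes are illustrative assumptions, not the author's code.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def predict(inputs, synapses):
    """Propagate `inputs` through each weight matrix in turn."""
    layer = inputs
    for weights in synapses:
        layer = sigmoid(np.dot(layer, weights))
    return layer


rng = np.random.default_rng(1)
# A much smaller network than the one on the slide: 6 -> 3 -> 2 -> 1.
synapses = [
    2 * rng.random((6, 3)) - 1,
    2 * rng.random((3, 2)) - 1,
    2 * rng.random((2, 1)) - 1,
]
x = rng.random((5, 6))                      # 5 toy samples, 6 descriptors
desired = rng.integers(0, 2, size=(5, 1))   # toy 0/1 targets

output = predict(x, synapses)
mean_abs_error = np.mean(np.abs(output - desired))
print(output.shape, mean_abs_error)
```

Passing the weight matrices as a list keeps the same function usable if the number of hidden layers changes.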
https://github.com/ykpgrr/Artificial_Neural_Network
TO DO
• Use adaptive gradient methods
• Clean the dataset
• Try different ANN models
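For the first item, one option is an Adam-style update, which scales each weight's step by running averages of its gradient and squared gradient. The sketch below applies it to a toy quadratic rather than to the network. The helper name `adam_step` and the hyperparameter values are illustrative assumptions, and Adam is only one of several adaptive methods that could be tried.

```python
import numpy as np


def adam_step(weights, grad, state, alpha=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return updated weights after one Adam step; `state` holds m, v and t."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    # Bias correction compensates for m and v starting at zero.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return weights - alpha * m_hat / (np.sqrt(v_hat) + eps)


# Toy problem: minimise f(w) = sum((w - 3)^2), whose gradient is 2 * (w - 3).
w = np.zeros(4)
state = {"m": np.zeros_like(w), "v": np.zeros_like(w), "t": 0}
for _ in range(2000):
    w = adam_step(w, 2 * (w - 3.0), state)

print(w)
```

In the training loop, each `synapse_k -= alpha * gradient` line would be replaced by a call of this kind, with one `state` dictionary per weight matrix.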
THANK YOU

Artificial Neural Networks Workshop

  • 1. ARTIFICIAL NEURAL NETWORKS PROBLEM YAKUP GÖRÜR DATE NAME 13 DECEMBER 2016
  • 2. • Talking Data • Introduction • Datasets • Sample submission • Some analysis • Basic solution • To Do • Predicting A Biological Response • Introduction • Datasets • Solution • Model • Code • To Do TOPICS
  • 3. ARTIFICIAL NEURAL NETWORKS PROBLEM PROJECT TALKINGDATA MOBILE USER DEMOGRAPHICS A KAGGLE COMPETITION YAKUP GÖRÜRDATE NAME 13 DECEMBER 2016
  • 4. TALKINGDATA MOBILE USER DEMOGRAPHICS INTRODUCTION • TalkingData, China’s largest third-party mobile data platform, • TalkingData is seeking to leverage behavioral data from more than 70% of the 500 million mobile devices active daily in China to help its clients better understand and interact with their audiences • In this competition, challenged to build a model predicting users’ demographic characteristics based on • Their app usage • Geolocations • Mobile device properties
  • 6. • The data was obtained from the kaggle.com as a .csv file. • Test Data: • gender_age_test.csv • Training Datas: • gender_age_train.csv • events.csv • phone_brand_device.csv • app_events.csv • app_labels.csv • label_categories.csv TALKINGDATA MOBILE USER DEMOGRAPHICS
  • 7. TALKINGDATA MOBILE USER DEMOGRAPHICS
  • 8. Gender_age_train.csv Gender_age_test.csv •The two files are our training and test data. • Our target variable is group that we are going to predict 74645 Train Data(%40) 112071 Test Data (%60)
  • 9. Events.csv App_events.csv When a user uses TalkingData SDK, the event gets logged in the events data. The event corresponds to a list of apps in app_events.
  • 14. INVESTIGATING TIME AND DAY AND GENDER
  • 15. USER PORTRAITS 10 largest positive (red) negative (blue) coefficients
  • 16. def read_train_test(): # App events print('Read app events...') ape = pd.read_csv("/Users/yakup/Downloads/TalkingData/app_events.csv") ape['installed'] = ape.groupby(['event_id'])['is_installed'].transform('sum') ape['active'] = ape.groupby(['event_id'])['is_active'].transform('sum') ape.drop(['is_installed', 'is_active'], axis=1, inplace=True) ape.drop_duplicates('event_id', keep='first', inplace=True) ape.drop(['app_id'], axis=1, inplace=True) # Events print('Read events...') events = pd.read_csv("/Users/yakup/Downloads/TalkingData/events.csv", dtype={'device_id': np.str}) events['counts'] = events.groupby(['device_id'])['event_id'].transform('count') # The idea here is to count the number of installed apps using the data # from app_events.csv above. Also to count the number of active apps. events = pd.merge(events, ape, how='left', on='event_id', left_index=True) # Below is the original events_small table # events_small = events[['device_id', 'counts']].drop_duplicates('device_id', keep='first') # And this is the new events_small table with two extra features events_small = events[['device_id', 'counts', 'installed', 'active']].drop_duplicates('device_id', keep='first') # Phone brand print('Read brands...') pbd = pd.read_csv("/Users/yakup/Downloads/TalkingData/phone_brand_device_model.csv", dtype={'device_id': np.str}) pbd.drop_duplicates('device_id', keep='first', inplace=True) pbd = map_column(pbd, 'phone_brand') pbd = map_column(pbd, 'device_model') # Train print('Read train...') train = pd.read_csv("/Users/yakup/Downloads/TalkingData/gender_age_train.csv", dtype={'device_id': np.str}) train = map_column(train, 'group') train = train.drop(['age'], axis=1) train = train.drop(['gender'], axis=1) train = pd.merge(train, pbd, how='left', on='device_id', left_index=True) train = pd.merge(train, events_small, how='left', on='device_id', left_index=True) train.fillna(-1, inplace=True) # Test print('Read test...') test = 
pd.read_csv("/Users/yakup/Downloads/TalkingData/gender_age_test.csv", dtype={'device_id': np.str}) test = pd.merge(test, pbd, how='left', on='device_id', left_index=True) test = pd.merge(test, events_small, how='left', on='device_id', left_index=True) test.fillna(-1, inplace=True) # Features features = list(test.columns.values) features.remove('device_id') return train, test, features Thanks to @ZFTurbo XGBOOST SUBMISSION SAMPLE just using users’ telephone model and their application and labels
  • 17. XGBOOST SUBMISSION SAMPLE just using users’ telephone model and their application and labels
  • 18. TO DO • Use also latitude/longitude • Use also Female/Male events hours • Re-train model and re-test
  • 19. ARTIFICIAL NEURAL NETWORKS PROBLEM PROJECT PREDICTING A BIOLOGICAL RESPONSE A KAGGLE COMPETITION YAKUP GÖRÜRDATE NAME 13 DECEMBER 2016
  • 20. PREDICTING A BIOLOGICAL RESPONSE INTRODUCTION • The development of a new drug largely depends on trial and error. • It typically involves synthesizing thousands of compounds that finally becomes a drug. • As a result, this process is extremely expensive and slow. • Therefore, the ability to accurately predict the biological activity of molecules, and understand the rationale behind those predictions are of great value.
  • 21. PREDICTING A BIOLOGICAL RESPONSE COMPETITION AND DATA • The objective of the competition is to help us build as good a model as possible so that we can, as optimally as this data allows, relate molecular information, to an actual biological response. • Purpose: Predict a biological response of molecules from their chemical properties • The competition was from the Kaggle.com’s competition: “Predicting a Biological Response” held between March16, 2012 and June 15, 2012 and re-enabled with new data 2013.
  • 23. PREDICTING A BIOLOGICAL RESPONSE
      • The data was obtained from kaggle.com as .csv files:
          • train.csv
          • test.csv
          • svm_benchmark.csv
  • 24. PREDICTING A BIOLOGICAL RESPONSE: TRAIN DATA
      • The first column contains experimental data describing an actual biological response (Active/Inactive).
      • The remaining columns represent molecular descriptors (D1 through D1776), e.g. size, shape, etc.
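The column layout described above can be split into a response vector and a descriptor matrix as follows. This is a minimal sketch on a tiny inline CSV with three descriptors instead of 1776; the header name `Activity`, the 1/0 encoding of Active/Inactive, and all values are assumptions made for illustration, and the real train.csv would be read from disk instead.

```python
import io
import numpy as np

# Inline stand-in for train.csv (invented values; the real file has D1..D1776)
csv_text = """Activity,D1,D2,D3
1,0.0,0.50,0.12
0,0.3,0.25,0.00
1,0.1,0.75,0.40
"""

# Skip the header row and parse everything as floats
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

y = data[:, :1]   # first column: biological response, shape (n_samples, 1)
X = data[:, 1:]   # remaining columns: molecular descriptors

print(X.shape, y.shape)
```

Keeping `y` as a column of shape `(n_samples, 1)` lets it line up with the single output unit of the network when the error is computed.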
  • 25. PREDICTING A BIOLOGICAL RESPONSE TEST DATA DESIRED_OUTPUT
  • 26. LEARNING: TRAINING ERROR IN EVERY ITERATION

      Iteration \ Alpha   0.00001         0.0001           0.001             0.01              1            10           100
      999                 0.4970816592    0.496532726679   0.488427223939    0.13242254511     0.45774460   0.45774460   0.45774460
      1999                0.49675409750   0.496383899255   0.253194405576    0.0957865492231   0.45774460   0.45774460   0.45774460
      3999                0.49667002752   0.496179916176   0.106717625619    0.0611895628345   0.45774460   0.45774460   0.45774460
      4999                0.49664307167   0.496062878454   0.0820233558756   0.0511976563565   0.45774460   0.45774460   0.45774460
      5999                0.49661803106   0.495889077621   0.070172349899    0.0626155080386   0.45774460   0.45774460   0.45774460
      6999                0.49659461636   0.495579874905   0.062604915027    0.050769160817    0.45774460   0.45774460   0.45774460
      11999               0.49649684869   0.435855005691   0.0438280987103   0.0350799920535   0.45774460   0.45774460   0.45774460
      12999               0.49648036118   0.397982602659   0.04226643999     0.0347661317427   0.45774460   0.45774460   0.45774460
      18999               0.49639600726   0.268281553859   0.0378180439635   0.0347022410019   0.45774460   0.45774460   0.45774460
      19999               0.49638381501   0.252608236007   0.0373610189016   0.0358672248921   0.45774460   0.45774460   0.45774460
  • 27. TESTING: MY OUTPUT'S ERROR

      Alpha     Iterations   Error            Standard error
      0.00001   20000        0.247995007025   0.147995070209
      0.0001    20000        0.155827287349   0.085592708521
      0.001     20000        0.302561724981   0.366689370858
      0.01      20000        0.299910036583   0.367160158942
      1         20000        0.453635315074   0.530927445033
      10        20000        0.453635315074   0.530927445033
      100       20000        0.453635315074   0.530927445033
  • 28. PREDICTING A BIOLOGICAL RESPONSE: MY OUTPUT'S ERROR [Chart: test error (axis from 0 to 0.5) for each alpha: 0.00001, 0.0001, 0.001, 0.01, 1, 10, 100]
  • 29. PREDICTING A BIOLOGICAL RESPONSE: MODEL [Diagram: Input → Hidden Layer 1 → Hidden Layer 2 → Hidden Layer 3 → Output]
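The training code on the next slide uses weight matrices of shape 1776×11, 11×4, 4×2 and 2×1, that is, 1776 inputs, hidden layers of 11, 4 and 2 units, and one output. The forward pass through such a network can be sketched as below. This assumes the standard logistic sigmoid and uses random weights and random inputs; `forward` is a hypothetical helper name that does not appear in the slides.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, weights):
    """Return the activations of every layer, input first (hypothetical helper)."""
    activations = [X]
    for W in weights:
        activations.append(sigmoid(activations[-1] @ W))
    return activations

rng = np.random.default_rng(1)
sizes = [1776, 11, 4, 2, 1]  # input, three hidden layers, output

# Weights drawn uniformly from [-1, 1), so they have mean 0
weights = [2 * rng.random((n_in, n_out)) - 1
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]

X = rng.random((5, 1776))  # 5 made-up molecules
layers = forward(X, weights)
print([a.shape for a in layers])
```

Each row of the last activation is the predicted probability-like score for one molecule.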
  • 30.
      import numpy as np
      import pandas as pd

      # sigmoid, alphas, traindatai and traindatao are defined earlier in the
      # script (not shown on this slide).

      # Convert the output of the sigmoid function to its derivative
      def sigmoid_output_to_derivative(output):
          return output * (1 - output)

      X = traindatai.values
      y = traindatao
      f = open('Error.txt', 'w')

      for alpha in alphas:
          print("\nTraining With Alpha:" + str(alpha))
          np.random.seed(1)

          # Randomly initialize the weights with mean 0
          synapse_0 = 2 * np.random.random((1776, 11)) - 1
          synapse_1 = 2 * np.random.random((11, 4)) - 1
          synapse_2 = 2 * np.random.random((4, 2)) - 1
          synapse_3 = 2 * np.random.random((2, 1)) - 1

          for j in range(20000):
              # Feed forward through layers 0, 1, 2, 3, 4
              layer_0 = X
              layer_1 = sigmoid(np.dot(layer_0, synapse_0))
              layer_2 = sigmoid(np.dot(layer_1, synapse_1))
              layer_3 = sigmoid(np.dot(layer_2, synapse_2))
              layer_4 = sigmoid(np.dot(layer_3, synapse_3))

              # How much did we miss the target value?
              layer_4_error = layer_4 - y

              if (j % 1000) == 999:
                  print("Error After:" + str(j) + " iterations:"
                        + str(np.mean(np.abs(layer_4_error))))

              # In what direction is the target value?
              # Were we really sure? If so, don't change too much.
              layer_4_delta = layer_4_error * sigmoid_output_to_derivative(layer_4)

              # How much did each l3 value contribute to the l4 error (according to the weights)?
              layer_3_error = layer_4_delta.dot(synapse_3.T)

              # In what direction is the target l3?
              # Were we really sure? If so, don't change too much.
              layer_3_delta = layer_3_error * sigmoid_output_to_derivative(layer_3)

              # How much did each l2 value contribute to the l3 error (according to the weights)?
              layer_2_error = layer_3_delta.dot(synapse_2.T)

              # In what direction is the target l2?
              # Were we really sure? If so, don't change too much.
              layer_2_delta = layer_2_error * sigmoid_output_to_derivative(layer_2)

              # How much did each l1 value contribute to the l2 error (according to the weights)?
              layer_1_error = layer_2_delta.dot(synapse_1.T)

              # In what direction is the target l1?
              # Were we really sure? If so, don't change too much.
              layer_1_delta = layer_1_error * sigmoid_output_to_derivative(layer_1)

              # Gradient-descent update of every weight matrix
              synapse_3 -= alpha * (layer_3.T.dot(layer_4_delta))
              synapse_2 -= alpha * (layer_2.T.dot(layer_3_delta))
              synapse_1 -= alpha * (layer_1.T.dot(layer_2_delta))
              synapse_0 -= alpha * (layer_0.T.dot(layer_1_delta))

          # Open the test file
          test_data = pd.read_csv('/Users/yakup/Downloads/Predicting a Biological Response/test.csv')
          x = test_data.values
          layer_0 = x

      https://github.com/ykpgrr/Artificial_Neural_Network
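In matrix form, the training loop on this slide can be restated as the equations below, where $a_l$ stands for `layer_l`, $W_l$ for `synapse_l`, $\alpha$ for `alpha`, $\odot$ for element-wise multiplication, and $\sigma$ for the sigmoid (assumed here to be the logistic function, since its definition is not shown on the slide).

```latex
\begin{aligned}
a_0 &= X, \qquad a_{l+1} = \sigma(a_l W_l), \quad l = 0, \dots, 3 \\
\delta_4 &= (a_4 - y) \odot a_4 \odot (1 - a_4) \\
\delta_l &= \left(\delta_{l+1} W_l^{\top}\right) \odot a_l \odot (1 - a_l), \quad l = 3, 2, 1 \\
W_l &\leftarrow W_l - \alpha \, a_l^{\top} \delta_{l+1}, \quad l = 0, \dots, 3
\end{aligned}
```

These are the gradient-descent updates for the squared error $\tfrac{1}{2}\lVert a_4 - y \rVert^2$; the factor $a_l \odot (1 - a_l)$ is what `sigmoid_output_to_derivative` computes.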
  • 31. TO DO
      • Use adaptive gradient methods
      • Clean the dataset
      • Try different ANN models
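One example of an adaptive gradient method is AdaGrad, which divides each weight's step by the square root of its accumulated squared gradients. The sketch below illustrates it on a toy quadratic. It is not the author's planned method, and `adagrad_step` with its parameters is a hypothetical name chosen for this illustration.

```python
import numpy as np

def adagrad_step(w, grad, cache, alpha=0.5, eps=1e-8):
    """One AdaGrad update: per-weight step of alpha / sqrt(sum of squared grads)."""
    cache = cache + grad ** 2
    w = w - alpha * grad / (np.sqrt(cache) + eps)
    return w, cache

# Toy objective f(w) = sum(scale * w**2) with very different curvature per weight
scale = np.array([100.0, 0.01])
w = np.array([1.0, 1.0])
cache = np.zeros_like(w)

for _ in range(500):
    grad = 2 * scale * w          # gradient of the toy objective
    w, cache = adagrad_step(w, grad, cache)

print(w)
```

Because each step is normalized by that weight's own gradient history, both weights shrink toward zero at nearly the same rate despite the 10,000-fold difference in curvature, whereas a single fixed `alpha` would have to be tuned to the steeper direction.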