SlideShare uma empresa Scribd logo
1 de 7
· Make sure you include the major 7 sections graded mentioned
below in your paper
· The deliverable should contain the following components:
(
1) Overall Goals/Research Hypothesis (10 %)
1-3 research questions to navigate/direct all your project.
· You may delay this section until (1) you study all previous
work and (2) you do some analysis and understand the
dataset/project
(2) (Previous/Related Contributions) (15 %)
As most of the selected projects use public datasets, no doubt
there are different attempts/projects to analyze those datasets.
30 % of this deliverable is in your overall assessment of
previous data analysis efforts. This effort should include:
· Evaluating existing source codes that they have (e.g. in
Kernels and discussion sections) or any other refence. Make
sure you try those codes and show their results
· In addition to the code, summarize most relevant literature or
efforts to analyze the same dataset you have picked.
· For the few who picked their own datasets, you are still
expecting to do your literature survey in this section on what is
most relevant to your data/idea/area and summarize those most
relevant contributions.
(3) A comparison study (15 %)
Compare results in your own work/project with results from
previous or other contributions (data and analysis comparison
not literature review)
The difference between section 3 and section 2 is that section 2
focuses on code/data analysis found in sources such as Kaggle,
github, etc. while section 3 focuses on research papers that not
necessary studied the same dataset, but the same focus area
(4) Preprocessing activities, Features Selection / Engineering
(10 %)
(See this link for content of the next section)
https://www.kaggle.com/WinningModelDocumentationGuidelin
es
· What were the most important features?
· We suggest you provide:
· a variable importance plot (
an example here
about halfway down the page), showing the 10-20 most
important features and
· partial plots for the 3-5 most important features
· If this is not possible, you should provide a list of the most
important features.
· How did you select features?
· Did you make any important feature transformations?
· Did you find any interesting interactions between features?
· Did you use external data? (if permitted)
(5)
Training Method(s)
10 %
· What training methods did you use?
· Did you ensemble the models?
· If you did ensemble, how did you weight the different models?
A6. Interesting findings
· What was the most important trick you used?
· What do you think set you apart from others in the
competition?
· Did you find any interesting relationships in the data that don't
fit in the sections above?
Many customers are happy to trade off model performance for
simplicity. With this in mind:
· Is there a subset of features that would get 90-95% of your
final performance? Which features? *
· What model that was most important? *
· What would the simplified model score?
· * Try and restrict your simple model to fewer than 10 features
and one training method.
(6) Accuracy metrics reporting, charts, Model Execution Time
(10 %)
Many customers care about how long the winning models take
to train and generate predictions:
· How long does it take to train your model?
· How long does it take to generate predictions using your
model?
· How long does it take to train the simplified model
(referenced in section A6)?
· How long does it take to generate predictions from the
simplified model?
(7) Use of ensemble methods
(15 %)
Per the last chapter we have, make sure you employ at least two
different ensemble models in your code and show the model
details and results
References
Citations to references, websites, blog posts, and external
sources of information where appropriate.
Summary
Summarize the most important aspects of your model and
analysis, such as:
The training method(s) you used (Convolutional Neural
Network, XGBoost)
The most important features
The tool(s) you used
How long it takes to train your model
------------------------------------------------
----------------------------------------------------------------
Quality Criteria (10-20% of overall project):
1.
Thorough performance analysis
: Results in data analysis can be misleading. Without detail
analysis of different performance metrics (e.g. accuracy, recall,
ROC, AUC, etc.) one-side view of results can present
incomplete and inaccurate findings. Presenting a thorough
analysis for overall performance of your models will show that
you did not ignore any factor in your model.
2.
Following standard project templates
: You can find through the Internet several standard templates
for data science projects (How to structure your code, data,
etc.). While following standard templates is not a must or
required but will be considered as part of quality criteria. Here
are examples of code templates for different programming
environments:
a. R and RStudio:
http://projecttemplate.net/getting_started.html
https://nicercode.github.io/blog/2013-04-05-projects/
https://community.rstudio.com/t/data-science-project-template-
for-r/3230/10
b. Python:
https://towardsdatascience.com/manage-your-data-science-
project-structure-in-early-stage-95f91d4d0600
https://drivendata.github.io/cookiecutter-data-science/#example
https://github.com/equinor/data-science-template
c. MS Azure
https://github.com/Azure/Azure-TDSP-ProjectTemplate
https://buckwoody.wordpress.com/2017/08/17/a-data-science-
microsoft-project-template-you-can-use-in-your-solutions/
https://docs.microsoft.com/en-us/azure/machine-learning/team-
data-science-process/team-data-science-process-project-
templates
3.
Better documentation
Save the data + code that generated the output, rather than the
output itself. Intermediate files are okay as long as there is clear
documentation of how they were created
4.
Use Version Control
e.g. using some websites such as Gitlab, GitHub / BitBucket
4.
Document and keep track of your analysis environment
: If you work on a complex project involving many tools /
datasets, the software and computing environment can be
critical for reproducing your analysis Computer architecture:
CPU (Intel, AMD, ARM), GPUs, Operating system: Windows,
Mac OS, Linux / Unix Software toolchain: Compilers,
interpreters, command shell, programming languages (C, Perl,
Python, etc.), database backends, data analysis software
Supporting software / infrastructure: Libraries, R packages,
dependencies External dependencies: Web sites, data
repositories, remote databases, software repositories

Mais conteúdo relacionado

Semelhante a · Make sure you include the major 7 sections graded mentioned be.docx

Case Study Research paper- report Spring 20201) Total points.docx
Case Study Research paper- report Spring 20201) Total points.docxCase Study Research paper- report Spring 20201) Total points.docx
Case Study Research paper- report Spring 20201) Total points.docx
zebadiahsummers
 
Report
ReportReport
Report
butest
 
CSCI-UA 102 sec 3,5, Fall 2015Programming Project 6Joann.docx
CSCI-UA 102 sec 3,5, Fall 2015Programming Project 6Joann.docxCSCI-UA 102 sec 3,5, Fall 2015Programming Project 6Joann.docx
CSCI-UA 102 sec 3,5, Fall 2015Programming Project 6Joann.docx
faithxdunce63732
 
IT 510 Final Project Guidelines and Rubric Overview .docx
IT 510 Final Project Guidelines and Rubric  Overview .docxIT 510 Final Project Guidelines and Rubric  Overview .docx
IT 510 Final Project Guidelines and Rubric Overview .docx
priestmanmable
 
LearningOutcomesassessedin
LearningOutcomesassessedinLearningOutcomesassessedin
LearningOutcomesassessedin
JospehStull43
 
IT 500 Final Project Guidelines and Grading GuideOverview and.docx
IT 500 Final Project Guidelines and Grading GuideOverview and.docxIT 500 Final Project Guidelines and Grading GuideOverview and.docx
IT 500 Final Project Guidelines and Grading GuideOverview and.docx
priestmanmable
 
Case Study Analysis 2The Cholesterol.xls records cholesterol lev.docx
Case Study Analysis 2The Cholesterol.xls records cholesterol lev.docxCase Study Analysis 2The Cholesterol.xls records cholesterol lev.docx
Case Study Analysis 2The Cholesterol.xls records cholesterol lev.docx
wendolynhalbert
 
Criteria for Research AssignmentPSCI 1010· The paper is due on.docx
Criteria for Research AssignmentPSCI 1010· The paper is due on.docxCriteria for Research AssignmentPSCI 1010· The paper is due on.docx
Criteria for Research AssignmentPSCI 1010· The paper is due on.docx
willcoxjanay
 
IT 500 Final Project Guidelines and Rubric Overview .docx
IT 500 Final Project Guidelines and Rubric  Overview .docxIT 500 Final Project Guidelines and Rubric  Overview .docx
IT 500 Final Project Guidelines and Rubric Overview .docx
priestmanmable
 
Running Head TOWN GUIDE ANDROID APPLICATION5TOWN GUIDE ANDROI.docx
Running Head TOWN GUIDE ANDROID APPLICATION5TOWN GUIDE ANDROI.docxRunning Head TOWN GUIDE ANDROID APPLICATION5TOWN GUIDE ANDROI.docx
Running Head TOWN GUIDE ANDROID APPLICATION5TOWN GUIDE ANDROI.docx
toltonkendal
 

Semelhante a · Make sure you include the major 7 sections graded mentioned be.docx (20)

Case Study Research paper- report Spring 20201) Total points.docx
Case Study Research paper- report Spring 20201) Total points.docxCase Study Research paper- report Spring 20201) Total points.docx
Case Study Research paper- report Spring 20201) Total points.docx
 
Imrad structure
Imrad structureImrad structure
Imrad structure
 
BSOP 326 Expect Success/newtonhelp.com
BSOP 326 Expect Success/newtonhelp.comBSOP 326 Expect Success/newtonhelp.com
BSOP 326 Expect Success/newtonhelp.com
 
BSOP 326 Effective Communication/tutorialrank.com
 BSOP 326 Effective Communication/tutorialrank.com BSOP 326 Effective Communication/tutorialrank.com
BSOP 326 Effective Communication/tutorialrank.com
 
Bsop 326 Enhance teaching - tutorialrank.com
Bsop 326  Enhance teaching - tutorialrank.comBsop 326  Enhance teaching - tutorialrank.com
Bsop 326 Enhance teaching - tutorialrank.com
 
Software Process Models
 Software Process Models  Software Process Models
Software Process Models
 
Report
ReportReport
Report
 
CSCI-UA 102 sec 3,5, Fall 2015Programming Project 6Joann.docx
CSCI-UA 102 sec 3,5, Fall 2015Programming Project 6Joann.docxCSCI-UA 102 sec 3,5, Fall 2015Programming Project 6Joann.docx
CSCI-UA 102 sec 3,5, Fall 2015Programming Project 6Joann.docx
 
IT 510 Final Project Guidelines and Rubric Overview .docx
IT 510 Final Project Guidelines and Rubric  Overview .docxIT 510 Final Project Guidelines and Rubric  Overview .docx
IT 510 Final Project Guidelines and Rubric Overview .docx
 
LearningOutcomesassessedin
LearningOutcomesassessedinLearningOutcomesassessedin
LearningOutcomesassessedin
 
Msr2021 tutorial-di penta
Msr2021 tutorial-di pentaMsr2021 tutorial-di penta
Msr2021 tutorial-di penta
 
IT 500 Final Project Guidelines and Grading GuideOverview and.docx
IT 500 Final Project Guidelines and Grading GuideOverview and.docxIT 500 Final Project Guidelines and Grading GuideOverview and.docx
IT 500 Final Project Guidelines and Grading GuideOverview and.docx
 
Case Study Analysis 2The Cholesterol.xls records cholesterol lev.docx
Case Study Analysis 2The Cholesterol.xls records cholesterol lev.docxCase Study Analysis 2The Cholesterol.xls records cholesterol lev.docx
Case Study Analysis 2The Cholesterol.xls records cholesterol lev.docx
 
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
 
Ooad lab manual(original)
Ooad lab manual(original)Ooad lab manual(original)
Ooad lab manual(original)
 
CMSC 335 FINAL PROJECT
CMSC 335 FINAL PROJECTCMSC 335 FINAL PROJECT
CMSC 335 FINAL PROJECT
 
Criteria for Research AssignmentPSCI 1010· The paper is due on.docx
Criteria for Research AssignmentPSCI 1010· The paper is due on.docxCriteria for Research AssignmentPSCI 1010· The paper is due on.docx
Criteria for Research AssignmentPSCI 1010· The paper is due on.docx
 
Guidelines for Final Year Engineering & Technology Project.ppt
Guidelines for Final Year Engineering & Technology  Project.pptGuidelines for Final Year Engineering & Technology  Project.ppt
Guidelines for Final Year Engineering & Technology Project.ppt
 
IT 500 Final Project Guidelines and Rubric Overview .docx
IT 500 Final Project Guidelines and Rubric  Overview .docxIT 500 Final Project Guidelines and Rubric  Overview .docx
IT 500 Final Project Guidelines and Rubric Overview .docx
 
Running Head TOWN GUIDE ANDROID APPLICATION5TOWN GUIDE ANDROI.docx
Running Head TOWN GUIDE ANDROID APPLICATION5TOWN GUIDE ANDROI.docxRunning Head TOWN GUIDE ANDROID APPLICATION5TOWN GUIDE ANDROI.docx
Running Head TOWN GUIDE ANDROID APPLICATION5TOWN GUIDE ANDROI.docx
 

Mais de alinainglis

· Previous professional experiences that have had a profound.docx
· Previous professional experiences that have had a profound.docx· Previous professional experiences that have had a profound.docx
· Previous professional experiences that have had a profound.docx
alinainglis
 
· Please select ONE of the following questions and write a 200-wor.docx
· Please select ONE of the following questions and write a 200-wor.docx· Please select ONE of the following questions and write a 200-wor.docx
· Please select ONE of the following questions and write a 200-wor.docx
alinainglis
 
· If we accept the fact that we may need to focus more on teaching.docx
· If we accept the fact that we may need to focus more on teaching.docx· If we accept the fact that we may need to focus more on teaching.docx
· If we accept the fact that we may need to focus more on teaching.docx
alinainglis
 
· How many employees are working for youtotal of 5 employees .docx
· How many employees are working for youtotal of 5 employees  .docx· How many employees are working for youtotal of 5 employees  .docx
· How many employees are working for youtotal of 5 employees .docx
alinainglis
 
· How should the risks be prioritized· Who should do the priori.docx
· How should the risks be prioritized· Who should do the priori.docx· How should the risks be prioritized· Who should do the priori.docx
· How should the risks be prioritized· Who should do the priori.docx
alinainglis
 
· Helen Petrakis Identifying Data Helen Petrakis is a 5.docx
· Helen Petrakis Identifying Data Helen Petrakis is a 5.docx· Helen Petrakis Identifying Data Helen Petrakis is a 5.docx
· Helen Petrakis Identifying Data Helen Petrakis is a 5.docx
alinainglis
 
· Global O365 Tenant Settings relevant to SPO, and recommended.docx
· Global O365 Tenant Settings relevant to SPO, and recommended.docx· Global O365 Tenant Settings relevant to SPO, and recommended.docx
· Global O365 Tenant Settings relevant to SPO, and recommended.docx
alinainglis
 
· Focus on the identified client within your chosen case.· Analy.docx
· Focus on the identified client within your chosen case.· Analy.docx· Focus on the identified client within your chosen case.· Analy.docx
· Focus on the identified client within your chosen case.· Analy.docx
alinainglis
 
· FASB ASC & GARS Login credentials LinkUser ID AAA51628Pas.docx
· FASB ASC & GARS Login credentials LinkUser ID AAA51628Pas.docx· FASB ASC & GARS Login credentials LinkUser ID AAA51628Pas.docx
· FASB ASC & GARS Login credentials LinkUser ID AAA51628Pas.docx
alinainglis
 
· Due Sat. Sep. · Format Typed, double-spaced, sub.docx
· Due Sat. Sep. · Format Typed, double-spaced, sub.docx· Due Sat. Sep. · Format Typed, double-spaced, sub.docx
· Due Sat. Sep. · Format Typed, double-spaced, sub.docx
alinainglis
 
· Expectations for Power Point Presentations in Units IV and V I.docx
· Expectations for Power Point Presentations in Units IV and V I.docx· Expectations for Power Point Presentations in Units IV and V I.docx
· Expectations for Power Point Presentations in Units IV and V I.docx
alinainglis
 
· Due Friday by 1159pmResearch Paper--IssueTopic Ce.docx
· Due Friday by 1159pmResearch Paper--IssueTopic Ce.docx· Due Friday by 1159pmResearch Paper--IssueTopic Ce.docx
· Due Friday by 1159pmResearch Paper--IssueTopic Ce.docx
alinainglis
 

Mais de alinainglis (20)

· Present a discussion of what team is. What type(s) of team do .docx
· Present a discussion of what team is. What type(s) of team do .docx· Present a discussion of what team is. What type(s) of team do .docx
· Present a discussion of what team is. What type(s) of team do .docx
 
· Presentation of your project. Prepare a PowerPoint with 8 slid.docx
· Presentation of your project. Prepare a PowerPoint with 8 slid.docx· Presentation of your project. Prepare a PowerPoint with 8 slid.docx
· Presentation of your project. Prepare a PowerPoint with 8 slid.docx
 
· Prepare a research proposal, mentioning a specific researchabl.docx
· Prepare a research proposal, mentioning a specific researchabl.docx· Prepare a research proposal, mentioning a specific researchabl.docx
· Prepare a research proposal, mentioning a specific researchabl.docx
 
· Previous professional experiences that have had a profound.docx
· Previous professional experiences that have had a profound.docx· Previous professional experiences that have had a profound.docx
· Previous professional experiences that have had a profound.docx
 
· Please select ONE of the following questions and write a 200-wor.docx
· Please select ONE of the following questions and write a 200-wor.docx· Please select ONE of the following questions and write a 200-wor.docx
· Please select ONE of the following questions and write a 200-wor.docx
 
· Please use Firefox for access to cronometer.com16 ye.docx
· Please use Firefox for access to cronometer.com16 ye.docx· Please use Firefox for access to cronometer.com16 ye.docx
· Please use Firefox for access to cronometer.com16 ye.docx
 
· Please share theoretical explanations based on social, cultural an.docx
· Please share theoretical explanations based on social, cultural an.docx· Please share theoretical explanations based on social, cultural an.docx
· Please share theoretical explanations based on social, cultural an.docx
 
· If we accept the fact that we may need to focus more on teaching.docx
· If we accept the fact that we may need to focus more on teaching.docx· If we accept the fact that we may need to focus more on teaching.docx
· If we accept the fact that we may need to focus more on teaching.docx
 
· How many employees are working for youtotal of 5 employees .docx
· How many employees are working for youtotal of 5 employees  .docx· How many employees are working for youtotal of 5 employees  .docx
· How many employees are working for youtotal of 5 employees .docx
 
· How should the risks be prioritized· Who should do the priori.docx
· How should the risks be prioritized· Who should do the priori.docx· How should the risks be prioritized· Who should do the priori.docx
· How should the risks be prioritized· Who should do the priori.docx
 
· How does the distribution mechanism control the issues address.docx
· How does the distribution mechanism control the issues address.docx· How does the distribution mechanism control the issues address.docx
· How does the distribution mechanism control the issues address.docx
 
· Helen Petrakis Identifying Data Helen Petrakis is a 5.docx
· Helen Petrakis Identifying Data Helen Petrakis is a 5.docx· Helen Petrakis Identifying Data Helen Petrakis is a 5.docx
· Helen Petrakis Identifying Data Helen Petrakis is a 5.docx
 
· Global O365 Tenant Settings relevant to SPO, and recommended.docx
· Global O365 Tenant Settings relevant to SPO, and recommended.docx· Global O365 Tenant Settings relevant to SPO, and recommended.docx
· Global O365 Tenant Settings relevant to SPO, and recommended.docx
 
· Focus on the identified client within your chosen case.· Analy.docx
· Focus on the identified client within your chosen case.· Analy.docx· Focus on the identified client within your chosen case.· Analy.docx
· Focus on the identified client within your chosen case.· Analy.docx
 
· Find current events regarding any issues in public health .docx
· Find current events regarding any issues in public health .docx· Find current events regarding any issues in public health .docx
· Find current events regarding any issues in public health .docx
 
· Explore and assess different remote access solutions.Assig.docx
· Explore and assess different remote access solutions.Assig.docx· Explore and assess different remote access solutions.Assig.docx
· Explore and assess different remote access solutions.Assig.docx
 
· FASB ASC & GARS Login credentials LinkUser ID AAA51628Pas.docx
· FASB ASC & GARS Login credentials LinkUser ID AAA51628Pas.docx· FASB ASC & GARS Login credentials LinkUser ID AAA51628Pas.docx
· FASB ASC & GARS Login credentials LinkUser ID AAA51628Pas.docx
 
· Due Sat. Sep. · Format Typed, double-spaced, sub.docx
· Due Sat. Sep. · Format Typed, double-spaced, sub.docx· Due Sat. Sep. · Format Typed, double-spaced, sub.docx
· Due Sat. Sep. · Format Typed, double-spaced, sub.docx
 
· Expectations for Power Point Presentations in Units IV and V I.docx
· Expectations for Power Point Presentations in Units IV and V I.docx· Expectations for Power Point Presentations in Units IV and V I.docx
· Expectations for Power Point Presentations in Units IV and V I.docx
 
· Due Friday by 1159pmResearch Paper--IssueTopic Ce.docx
· Due Friday by 1159pmResearch Paper--IssueTopic Ce.docx· Due Friday by 1159pmResearch Paper--IssueTopic Ce.docx
· Due Friday by 1159pmResearch Paper--IssueTopic Ce.docx
 

Último

The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
KarakKing
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
AnaAcapella
 

Último (20)

Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 

· Make sure you include the major 7 sections graded mentioned be.docx

  • 1. · Make sure you include the major 7 sections graded mentioned below in your paper · The deliverable should contain the following components: ( 1) Overall Goals/Research Hypothesis (10 %) 1-3 research questions to navigate/direct all your project. · You may delay this section until (1) you study all previous work and (2) you do some analysis and understand the dataset/project (2) (Previous/Related Contributions) (15 %) As most of the selected projects use public datasets, no doubt there are different attempts/projects to analyze those datasets. 30 % of this deliverable is in your overall assessment of previous data analysis efforts. This effort should include: · Evaluating existing source codes that they have (e.g. in Kernels and discussion sections) or any other refence. Make sure you try those codes and show their results · In addition to the code, summarize most relevant literature or efforts to analyze the same dataset you have picked. · For the few who picked their own datasets, you are still expecting to do your literature survey in this section on what is most relevant to your data/idea/area and summarize those most relevant contributions.
  • 2. (3) A comparison study (15 %) Compare results in your own work/project with results from previous or other contributions (data and analysis comparison not literature review) The difference between section 3 and section 2 is that section 2 focuses on code/data analysis found in sources such as Kaggle, github, etc. while section 3 focuses on research papers that not necessary studied the same dataset, but the same focus area (4) Preprocessing activities, Features Selection / Engineering (10 %) (See this link for content of the next section) https://www.kaggle.com/WinningModelDocumentationGuidelin es · What were the most important features? · We suggest you provide: · a variable importance plot ( an example here about halfway down the page), showing the 10-20 most important features and · partial plots for the 3-5 most important features · If this is not possible, you should provide a list of the most important features. · How did you select features? · Did you make any important feature transformations?
  • 3. · Did you find any interesting interactions between features? · Did you use external data? (if permitted) (5) Training Method(s) 10 % · What training methods did you use? · Did you ensemble the models? · If you did ensemble, how did you weight the different models? A6. Interesting findings · What was the most important trick you used? · What do you think set you apart from others in the competition? · Did you find any interesting relationships in the data that don't fit in the sections above? Many customers are happy to trade off model performance for simplicity. With this in mind: · Is there a subset of features that would get 90-95% of your final performance? Which features? * · What model that was most important? * · What would the simplified model score?
  • 4. · * Try and restrict your simple model to fewer than 10 features and one training method. (6) Accuracy metrics reporting, charts, Model Execution Time (10 %) Many customers care about how long the winning models take to train and generate predictions: · How long does it take to train your model? · How long does it take to generate predictions using your model? · How long does it take to train the simplified model (referenced in section A6)? · How long does it take to generate predictions from the simplified model? (7) Use of ensemble methods (15 %) Per the last chapter we have, make sure you employ at least two different ensemble models in your code and show the model details and results References Citations to references, websites, blog posts, and external sources of information where appropriate.
  • 5. Summary Summarize the most important aspects of your model and analysis, such as: The training method(s) you used (Convolutional Neural Network, XGBoost) The most important features The tool(s) you used How long it takes to train your model ------------------------------------------------ ---------------------------------------------------------------- Quality Criteria (10-20% of overall project): 1. Thorough performance analysis : Results in data analysis can be misleading. Without detail analysis of different performance metrics (e.g. accuracy, recall, ROC, AUC, etc.) one-side view of results can present incomplete and inaccurate findings. Presenting a thorough analysis for overall performance of your models will show that you did not ignore any factor in your model. 2. Following standard project templates : You can find through the Internet several standard templates for data science projects (How to structure your code, data, etc.). While following standard templates is not a must or required but will be considered as part of quality criteria. Here are examples of code templates for different programming
  • 6. environments: a. R and RStudio: http://projecttemplate.net/getting_started.html https://nicercode.github.io/blog/2013-04-05-projects/ https://community.rstudio.com/t/data-science-project-template- for-r/3230/10 b. Python: https://towardsdatascience.com/manage-your-data-science- project-structure-in-early-stage-95f91d4d0600 https://drivendata.github.io/cookiecutter-data-science/#example https://github.com/equinor/data-science-template c. MS Azure https://github.com/Azure/Azure-TDSP-ProjectTemplate https://buckwoody.wordpress.com/2017/08/17/a-data-science- microsoft-project-template-you-can-use-in-your-solutions/ https://docs.microsoft.com/en-us/azure/machine-learning/team- data-science-process/team-data-science-process-project- templates 3. Better documentation Save the data + code that generated the output, rather than the
  • 7. output itself. Intermediate files are okay as long as there is clear documentation of how they were created 4. Use Version Control e.g. using some websites such as Gitlab, GitHub / BitBucket 4. Document and keep track of your analysis environment : If you work on a complex project involving many tools / datasets, the software and computing environment can be critical for reproducing your analysis Computer architecture: CPU (Intel, AMD, ARM), GPUs, Operating system: Windows, Mac OS, Linux / Unix Software toolchain: Compilers, interpreters, command shell, programming languages (C, Perl, Python, etc.), database backends, data analysis software Supporting software / infrastructure: Libraries, R packages, dependencies External dependencies: Web sites, data repositories, remote databases, software repositories