2. Data Science Lifecycle
• A data science lifecycle gives a project a systematic structure.
• Common mistakes in data science are collecting unnecessary data and
analyzing it without understanding the structure or framing of the business
problem.
• To avoid unnecessary work and to save time, one should follow a defined set of
phases or stages.
• Those stages make up the data science lifecycle, which is described here.
1. Concept Study
• When starting a new project, you need to
study and understand the project and its
potential outputs.
• The aim is to understand the problem and
the possible ways to proceed.
• It is essential to understand the purpose,
the source of the data, the project budget, etc.
• Here, one should check the data, algorithms,
time, and the client's requirements for the
project.
2. Data Preparation
• The data we receive is usually raw and not in a
usable format, so preparing it is one of the
most important tasks of the data science
lifecycle.
• You need to clean and prepare the data for
further use. You can use R or
Python for cleaning, transformation, and
visualization.
• Data can be prepared by
transformation, extraction, and removal of
unwanted data, and Exploratory
Data Analysis (EDA) should be applied.
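The preparation steps above can be sketched in a few lines of Python. This is only a minimal illustration: the sample data, the column names, and the helper functions `clean_rows` and `summarize` are all hypothetical, and it uses only the standard library so that it runs anywhere. In a real project a library such as Pandas would typically do this work.

```python
import csv
import io
import statistics

# Hypothetical raw data: stray spaces, one missing value, one duplicate row.
RAW_CSV = """customer,age,spend
alice, 34 ,120.5
bob,,80.0
carol,29,200.0
alice, 34 ,120.5
"""

def clean_rows(raw_text):
    """Strip whitespace, drop incomplete and duplicate rows, convert types."""
    cleaned, seen = [], set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        row = {key: value.strip() for key, value in row.items()}
        if not all(row.values()):
            continue  # removing unwanted data: a field is missing
        record = (row["customer"], int(row["age"]), float(row["spend"]))
        if record in seen:
            continue  # removing unwanted data: exact duplicate
        seen.add(record)
        cleaned.append(
            {"customer": record[0], "age": record[1], "spend": record[2]}
        )
    return cleaned

def summarize(rows, column):
    """A very small EDA step: summary statistics for one numeric column."""
    values = [row[column] for row in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }

rows = clean_rows(RAW_CSV)
summary = summarize(rows, "spend")
print(rows)
print(summary)
```

Dropping incomplete rows is only one possible policy. Depending on the project, filling in missing values may be a better choice.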
3. Modelling
• After the data is prepared, it’s time to choose a suitable method, technique,
and model.
• Some Exploratory Data Analysis (EDA) is also done in this step to analyze the
data and to understand the relationships between variables. After EDA, the
data is split into training and test sets: the model is trained on the
training set and checked on the test set.
• R is a widely used tool, but Python is also a good choice because its libraries are
convenient for analysis. Commonly used Python libraries include Pandas,
NumPy, and Matplotlib.
• You can validate the results by comparing different models. If your
model works correctly, you can proceed; otherwise, you need to retrain the
model.
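The split, train, and validate loop described above might look like the following sketch. Everything here is an assumption made for illustration: the data is synthetic, the model is a simple straight-line fit, and the names `train_test_split`, `fit_line`, `mean_absolute_error`, and the threshold `ACCEPTABLE_ERROR` are hypothetical. Only the standard library is used.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_absolute_error(model, points):
    """Average absolute gap between predictions and the true values."""
    slope, intercept = model
    return sum(abs(y - (slope * x + intercept)) for x, y in points) / len(points)

# Synthetic data (an assumption): y is roughly 2x + 1 plus a little noise.
rng = random.Random(42)
data = [(x, 2.0 * x + 1.0 + rng.uniform(-0.5, 0.5)) for x in range(40)]

train, test = train_test_split(data)
model = fit_line(train)                          # train on the training set only
test_error = mean_absolute_error(model, test)    # validate on unseen data

ACCEPTABLE_ERROR = 1.0  # hypothetical threshold agreed on for the project
needs_retraining = test_error > ACCEPTABLE_ERROR
print(model, test_error, needs_retraining)
```

The test set is kept out of training so that the error measured on it reflects how the model behaves on data it has not seen.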
4. Model Deployment
• This is one of the most important steps in the lifecycle. The trained model can be
deployed with the help of an API (Application Programming Interface).
• This can be a difficult task for a data scientist, but an API can make it easier.
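As a rough sketch of what "deploying with an API" can mean, the example below wraps a model in a small HTTP endpoint and calls it once. The `/predict` path, the JSON format, the `MODEL` parameters, and the helper names are all hypothetical. It uses Python's built-in `http.server` only to stay self-contained; a real deployment would normally use a proper web framework and a production server.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical trained parameters; a real project would load a saved model.
MODEL = {"slope": 2.0, "intercept": 1.0}

def predict(x):
    """Apply the (hypothetical) trained model to one input value."""
    return MODEL["slope"] * x + MODEL["intercept"]

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "Unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result, status = {"prediction": predict(float(payload["x"]))}, 200
        except (ValueError, KeyError, TypeError):
            result, status = {"error": "expected JSON with a numeric 'x'"}, 400
        body = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet

def call_api(port, x):
    """Client side: send one value to the API and return the parsed reply."""
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request(
            "POST", "/predict",
            body=json.dumps({"x": x}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(connection.getresponse().read())
    finally:
        connection.close()

# Port 0 lets the operating system pick any free port for this demo.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    response = call_api(server.server_address[1], 3.0)
finally:
    server.shutdown()
    server.server_close()
print(response)
```

The client only needs to know the endpoint and the JSON format, not how the model works internally, which is what makes an API a convenient way to hand a model over to other systems.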
5. Communicate Results
• This is the last step, in which you convey the results to the client.
• The considerations you decided on in the first step can be explained
and communicated.
• Here, you include details about the lifecycle and determine how successful
the project was.
• Once the results are accepted, you can share the report, code, and other
required documents.