PLANT DISEASE IDENTIFIER
Abstract
Nowadays, leaf diseases are identified and classified by observation and experience. The system
described here uses image-processing techniques to identify and classify diseases in leaves. It starts
by capturing the leaf’s image with a regular digital camera. The image is then passed to the processing
stage, where feature extraction, identification and classification are done in MATLAB. Each leaf is first
identified as healthy or diseased; a diseased leaf is then classified based on color and texture. Both
identification and classification are realized with a Support Vector Machine. The results obtained are
very promising.
PROBLEM DEFINITION
To develop an automatic disease identification system that takes a pomegranate leaf image as input
and enhances it by applying various image processing techniques. A variety of image features are then
extracted from the enhanced image. Based on these features, the leaf image is classified as either
healthy or diseased using Fuzzy Logic. The disease grade is also determined from the fuzzy rule set.
Finally, the project aims to provide a disease treatment advisory module to help agronomists and
farmers.
Introduction
Agriculture is the largest economic sector in India and plays a major role in the country’s economic
development. The manual techniques currently used to distinguish between different types of leaf
diseases rely on human labor. Because these techniques depend on human judgment, they are subject
to error. People tire, and labor is in short supply, so an automated system is needed to reduce both
the workload and the time taken by manual techniques.
Many new agricultural automation technologies are being developed by university researchers that
pose questions about the efficiency and effectiveness with which we carry out current agricultural
practices. This has given rise to many new opportunities to service the agronomic requirements
albeit in radically different ways to those currently used.
Leaves are a delicate part of the plant, so they should be tested with non-destructive techniques.
Classification is vital for the evaluation of agricultural produce, and a leaf’s texture and color are its
most important visual properties. Hence, classification of leaf disease is necessary for evaluating
agricultural produce, meeting quality standards and increasing market value. It also helps in identifying
diseases and taking measures to prevent their further spread. Manual identification and classification
is slow, depends on experts who are not readily available, and is sometimes error prone. Laborers
classify based on color, size, etc.; if these quality measures are mapped into an automated system
using a suitable programming language, the work will be faster and less error prone.
The proposed automated identification and classification system combines four processes:
segmentation, feature extraction, identification and classification. Software development is highly
important in this classification system. The entire system is built in Matlab11 to inspect the color and
texture of the leaf. Color and texture are very important in classification, but because some fruits have
similar colors, size also helps to resolve such cases. Color- and size-based classification involves
extracting the useful information from the leaf surface and assigning the leaf to the respective category.
This automated system is designed to overcome the problems of manual techniques. The image
could be captured using a regular digital camera or high resolution mobile phone camera. This image is
given as an input to the system for obtaining the leaf features. The system consists of several steps like
segmentation, feature extraction, identification and classification.
Fig. 1: Set of leaves, panels (a), (b) and (c)
The main purpose of this application is to identify and classify leaf diseases. As shown in Fig. 1, two
diseases that occur in the pomegranate plant are used for identification and classification. Each leaf is
assigned to its class based on its color and texture features using a Support Vector Machine.
System design
The architectural design is concerned with establishing a basic structure of a system. It involves
identifying the major components of the system and communications between these components. 1-tier
architecture is best suited for our proposed system as it is based on simulation. As our project is executed
in one system where everything resides in a single program, one tier is best applicable. The overall process
is shown here.
Fig. 2.1: Flow chart of identification and classification process
The proposed methodology aims to model disease detection/classification and a promising disease
grading system for pomegranate plant leaves. The system makes use of various image processing
techniques. The proposed work is divided into five main steps: (1) image acquisition, (2) image
pre-processing, (3) feature extraction, (4) disease classification and (5) disease grading. Finally, a
treatment advisory module is built so that the system can serve as a kiosk for farmers for proper
control and management of a pomegranate farm. The methodology is presented graphically in Fig. 3.
Fig. 3: A fuzzy-based Plant Disease Identifier
III. Feature Extraction
Feature extraction is the process of measuring or calculating features from the image samples that
are sufficient to distinguish one type of image from another. Certain leaf diseases can be easily
identified by color and texture features. Feature extraction is done using the MATLAB Image
Processing Toolbox. The process begins by resizing the original image, followed by K-means clustering,
as shown in Fig. 2.2.
Fig. 2.2: Feature extraction of the image (panels: original image, resized image, L*a*b image)
The extracted part of the leaf is shown in Fig. 2.2. Features are extracted from the segmented image
and used to identify and classify the diseases that occur in pomegranate leaves. Both texture and color
features are extracted. The texture features are Energy, Entropy and Contrast. The color features are
Hue, Saturation, Value, Intensity, Cyan, Magenta, Yellow, Luminance, Blue Diff. and Red Diff. The
features taken into consideration are Entropy and Energy from the texture features and Value and
Intensity from the color features; Hue is also used to differentiate between diseased and healthy
leaves. The feature values are computed by functions written in MATLAB. The values are then taken to
an expert, who identifies the properties they exhibit, and after the expert’s opinion the values are
separated using MS-Office software. The following features show the difference between healthy and
diseased leaves.
Feature      Healthy leaf   Diseased leaf
Energy       0-0.5          0.5-1
Entropy      4-6            0.4-4
Value        0.3-0.45       0.05-0.3
Intensity    0.25-0.4       0.05-0.25
Magenta      70-120         10-70
Luminance    60-90          10-60
Blue Diff.   20-35          5-20
Features that show the difference between Anthracnose and Bacterial Blight.
Feature      Anthracnose leaf   Bacterial Blight leaf
Energy       0.8-1              0.5-0.8
Value        0.05-0.15          0.2-0.35
Intensity    0.05-0.15          0.15-0.3
Magenta      10-45              45-80
Luminance    10-35              35-65
Blue Diff.   2-10               10-25
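As an illustration of the texture features named in this section, the sketch below computes Energy,
Contrast and Entropy from a gray-level co-occurrence table in plain MATLAB. The 4-level test image, the
horizontal-neighbor offset and all variable names are assumptions made for the example; the source
does not show how its feature functions were written, and its Entropy may have been computed
differently. The Image Processing Toolbox functions graycomatrix and graycoprops offer a ready-made
route for Energy and Contrast.

```matlab
% Hypothetical 4-level image (gray levels 1..4), used only for illustration.
I = [1 1 2 2;
     1 1 2 2;
     3 3 4 4;
     3 3 4 4];
nLevels = 4;

% Count how often level a has level b as its right-hand neighbor.
left  = I(:, 1:end-1);
right = I(:, 2:end);
glcm  = accumarray([left(:) right(:)], 1, [nLevels nLevels]);

% Normalize the counts to a joint probability table.
p = glcm / sum(glcm(:));

% Row and column index of every cell in the table.
[r, c] = ndgrid(1:nLevels, 1:nLevels);

energy   = sum(p(:) .^ 2);                       % sum of squared probabilities
contrast = sum(((r(:) - c(:)) .^ 2) .* p(:));    % weighted squared level difference
nz       = p(p > 0);                             % skip empty cells to avoid log2(0)
entropyG = -sum(nz .* log2(nz));                 % entropy of the co-occurrence table
```

A uniform image gives an Energy of 1, and the value falls as the gray levels become more mixed, which
is why a single texture number can help separate leaf classes.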
Identification and Classification
It is desirable to have a system that can recognize the disease affecting a leaf and assign the leaf to
its class. Such a system can be used to separate diseased from healthy leaves. A technique for this is
proposed here: a Support Vector Machine is used to sort the leaves into their respective classes.
The support vector machine is defined as a system which uses hypothesis space of a linear function in a
high dimensional feature space, trained with a learning algorithm from optimization theory that
implements a learning bias derived from statistical learning theory.
This section describes the multiclass methods considered. For a given multiclass problem, M denotes
the number of classes and ωi, i = 1, 2, 3, ..., M, denotes the M classes. For binary classification we refer
to the two classes as positive and negative; a binary classifier is assumed to produce an output
function that gives relatively large values for examples from the positive class and relatively small
values for examples from the negative class.
MCSVM: Crammer and Singer (2001) propose a multiclass formulation that we call partial ranking. The
dual cost is a function of an n × k matrix of Lagrange coefficients, where n is the number of examples
and k the number of classes. Each iteration of the MCSVM algorithm maximizes the restriction of the
dual cost to a single row of the coefficient matrix. Successive rows are selected using the gradient of
the cost function. Unlike the coefficient matrix, the gradient is not sparse. This approach is not feasible
when the number of classes k grows exponentially, because the gradient becomes too large.
An SVM is not designed as a multiclass classifier: its basic principle supports only two classes.
Training process: by using different sets of features, Support Vector Machines can be applied to more
than two classes. The feature values obtained during feature extraction are loaded, and each sample
is sorted as healthy or assigned to the disease it belongs to.
Testing: A query leaf image is given to the system, which processes it, extracts the feature values,
and then decides whether the leaf is healthy or which disease it has.
Fig. 2.2: SVM to classify Healthy and diseased
Fig. 2.3: SVM to classify Anthracnose or Bacterial Blight
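The two figures suggest a cascade of two binary classifiers: the first separates healthy from diseased
leaves, the second separates Anthracnose from Bacterial Blight. A minimal sketch of that cascade
follows. The weights and biases are hypothetical stand-ins chosen to match the Energy ranges listed
earlier (0.5 and 0.8); they are not trained values, and the feature order [Energy, Value, Intensity] is
an assumption. In current MATLAB releases a trained binary SVM could be obtained with fitcsvm from
the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical linear decision functions f(x) = w*x' + b.
% A trained SVM would supply w and b; these are made up for illustration.
stage1 = struct('w', [1 0 0], 'b', -0.5);   % f > 0 means diseased
stage2 = struct('w', [1 0 0], 'b', -0.8);   % f > 0 means Anthracnose

% Hypothetical query feature vectors, one leaf per row.
queries = [0.20 0.40 0.35;
           0.65 0.25 0.20;
           0.90 0.10 0.08];
labels = cell(size(queries, 1), 1);

for i = 1:size(queries, 1)
    x = queries(i, :);
    if stage1.w * x' + stage1.b <= 0
        labels{i} = 'Healthy';              % first classifier stops the cascade
    elseif stage2.w * x' + stage2.b > 0
        labels{i} = 'Anthracnose';          % second classifier, positive side
    else
        labels{i} = 'Bacterial Blight';     % second classifier, negative side
    end
end
```

Chaining binary decisions this way is one common way to use a two-class method for three classes; the
second classifier is only consulted for leaves the first one marks as diseased.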
V. Implementation
The implementation phase begins with a leaf sample being photographed against a black background
using a regular digital camera mounted on a stand. The image is loaded into MATLAB for processing.
Texture and color features are extracted from the sample image to identify it as healthy or diseased
and to classify it. Different modules perform different operations on the loaded image. The modules
are:
· Image capture
· Image resize
· Filtering image
· Segmentation
· Color and texture feature extraction
· Identify the sample
· Classify the sample
A. Image capture
An image of the leaf is captured using a digital camera or any mobile phone camera. This image is
loaded into MATLAB using the function ‘imread’, which reads the image from the specified path. The
image is stored as a matrix of rows and columns. If it is a grayscale image, it is stored as an M-by-N
array. If the file contains a true color (RGB) image, it is stored as an M-by-N-by-3 array.
The syntax for reading an image is I=imread(filename, fmt)
which reads a grayscale or color image from the file specified by the string filename. If the file is not in
the current directory or in a directory on the MATLAB path, specify the full pathname. The text string
fmt specifies the format of the file by its standard file extension, for example 'jpeg', 'jpg' or 'bmp'.
B. Image Resize
The captured image is large and difficult for ordinary systems to process, so the image is reduced to
either [512,512] pixels or [256,256] pixels.
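The array shapes and the size reduction described above can be sketched in plain MATLAB as follows.
The synthetic image stands in for the result of imread, and the nearest-neighbor index selection is an
assumption made to keep the example free of toolbox calls; with the Image Processing Toolbox,
imresize(I, [256 256]) does the same job with interpolation.

```matlab
% Stand-in for I = imread(...): a hypothetical 768-by-1024 RGB image.
I = uint8(zeros(768, 1024, 3));
I(:, 513:end, 2) = 200;            % right half green, so the result can be checked

% A grayscale image is M-by-N; a true color image is M-by-N-by-3.
isColor = (ndims(I) == 3) && (size(I, 3) == 3);

% Nearest-neighbor resize: pick evenly spaced rows and columns.
target = [256 256];
rowIdx = min(size(I, 1), ceil((1:target(1)) * size(I, 1) / target(1)));
colIdx = min(size(I, 2), ceil((1:target(2)) * size(I, 2) / target(2)));
small  = I(rowIdx, colIdx, :);     % 256-by-256-by-3 result
```

Resizing to a square target changes the aspect ratio of a non-square photograph, which is worth
keeping in mind if shape features are measured later.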
C. Filtering Image
Filtering removes noise introduced during image acquisition. The filter used here is the Gaussian
filter.
In electronics and signal processing a Gaussian filter is a filter whose impulse response is a Gaussian
function. Gaussian filters are designed to give no overshoot to a step function input while minimizing the
rise and fall time. This behavior is closely connected to the fact that the Gaussian filter has the minimum
possible group delay. Mathematically, a Gaussian filter modifies the input signal by convolution with a
Gaussian function; this transformation is also known as the Weierstrass transform.
h = fspecial('gaussian');
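Note that fspecial only builds the filter kernel; the kernel still has to be applied to the image. The
sketch below builds the same kind of kernel by hand and applies it with conv2, so it runs without the
Image Processing Toolbox. The 3-by-3 size and sigma of 0.5 match the defaults of fspecial('gaussian');
the impulse test image and the variable names are assumptions for the example.

```matlab
hsize = 3;                         % kernel width and height
sigma = 0.5;                       % standard deviation of the Gaussian
half  = (hsize - 1) / 2;

% Sample the 2-D Gaussian on a grid centered at zero.
[x, y] = meshgrid(-half:half, -half:half);
h = exp(-(x.^2 + y.^2) / (2 * sigma^2));
h = h / sum(h(:));                 % normalize so overall brightness is preserved

% Hypothetical test image: one bright pixel on a dark background.
noisy = zeros(5);
noisy(3, 3) = 1;

% Convolve; 'same' keeps the output the size of the input.
smoothed = conv2(noisy, h, 'same');
```

The single bright pixel is spread over its neighbors while the total intensity stays the same, which is
the smoothing effect used here to suppress acquisition noise. With the toolbox, imfilter(I, h) applies
the kernel to an image I.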
D. Segmentation
Segmentation is the search for homogeneous regions in an image, followed by the classification of
these regions. For segment-based classification, the image is first divided into many homogeneous
areas with similar spectral information, and the resulting segments are then classified. K-means color
clustering is used to segment the diseased part from the leaf image.
Fig. 3.4: leaf Segmentation using K-means clustering algorithm
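The clustering step can be illustrated with a small K-means loop in plain MATLAB. The six two-channel
pixel values, the choice of k = 2 and the deterministic starting centers are assumptions for the
example; the source does not give its parameters. The kmeans function in the Statistics and Machine
Learning Toolbox is the usual ready-made alternative.

```matlab
% Hypothetical pixel colors: one row per pixel, two color channels per row.
pixels = [0.10 0.12; 0.11 0.10; 0.12 0.11;    % leaf-like colors
          0.80 0.82; 0.79 0.85; 0.83 0.80];   % lesion-like colors
k = 2;
centers = pixels([1 end], :);      % deterministic start: first and last pixel

for iter = 1:20
    % Squared distance from every pixel to every center (n-by-k).
    d = zeros(size(pixels, 1), k);
    for j = 1:k
        delta = pixels - repmat(centers(j, :), size(pixels, 1), 1);
        d(:, j) = sum(delta .^ 2, 2);
    end

    % Assign each pixel to its nearest center.
    [~, labels] = min(d, [], 2);

    % Move each center to the mean of the pixels assigned to it.
    newCenters = centers;
    for j = 1:k
        if any(labels == j)
            newCenters(j, :) = mean(pixels(labels == j, :), 1);
        end
    end

    % Stop once the centers no longer move.
    if isequal(newCenters, centers)
        break;
    end
    centers = newCenters;
end
```

For a real image, the label vector would be reshaped back to the image dimensions so that the cluster
containing the diseased region can be used as a mask.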
Start:
Step 1: Separate the RGB components from the original 24-bit input image.
Step 2: Convert the image into the HSV and YCbCr color spaces.
Step 3: Extract the color features Hue (H), Saturation (S), Intensity value (V), Luminance (Y), Cyan,
Magenta, Yellow, the difference between the blue component and the reference value, and the
difference between the red component and the reference value, (1) through (3).
Step 4: Compute the mean, standard deviation and range for each of the RGB, HSV and YCbCr components.
Stop.
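The steps above can be sketched in MATLAB as follows. The 2-by-2 patch, the [0,1] value scale, the
BT.601 luminance weights and the use of simple complements for Cyan, Magenta and Yellow are all
assumptions for the example. The blue and red difference features are left out because the source
does not state the reference value they are measured against.

```matlab
% Hypothetical 2-by-2 RGB patch scaled to [0,1]; a real run would use the segmented leaf.
rgb = cat(3, [0.2 0.4; 0.6 0.8], ...   % red channel
             [0.6 0.6; 0.2 0.8], ...   % green channel
             [0.2 0.0; 0.2 0.8]);      % blue channel

% Step 1: separate the RGB components.
R = rgb(:, :, 1);
G = rgb(:, :, 2);
B = rgb(:, :, 3);

% Step 2: convert to HSV and compute a luminance channel.
hsv = rgb2hsv(rgb);
H = hsv(:, :, 1);
S = hsv(:, :, 2);
V = hsv(:, :, 3);
Y = 0.299 * R + 0.587 * G + 0.114 * B;   % assumed BT.601 weights

% Step 3: complements of R, G and B as Cyan, Magenta and Yellow.
C   = 1 - R;
M   = 1 - G;
Yel = 1 - B;

% Step 4: mean, standard deviation and range of a component.
stats = @(x) [mean(x(:)), std(x(:)), max(x(:)) - min(x(:))];
hueStats = stats(H);
valStats = stats(V);
lumStats = stats(Y);
```

Each stats call returns a three-element row, so stacking the rows for all components gives one
feature vector per leaf image.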
Verification and Validation
Verification and Validation (V & V) is the name given to the checking and analysis process that ensures that
software conforms to its specification and meets the needs of the customers who are paying for that
software.
· ‘Validation: Are we building the right product?’
· ‘Verification: Are we building the product right?’
Verification involves checking that the software conforms to its specification. We should check that the
system meets its specified functional and non-functional requirements.
Validation is a more general process. We should ensure that the software meets the expectations of the
customer.
Within the V & V process, two techniques of system checking and analysis may be used:
1. Software inspections analyze and check system representations such as the
requirements document, design diagrams and the program source code.
2. Software testing involves executing an implementation of the software with test data and examining the
outputs of the software and its operational behavior to check that it is performing as required. Testing is a
dynamic technique of verification and validation because it works with an executable representation of the
system. The testing phase of our project includes the following tests.
Defect testing
The goal of defect testing is to expose latent defects in a software system before the system is
delivered. This contrasts with validation testing which is intended to demonstrate that system meets its
specification. Validation testing requires the system to perform correctly using given acceptance test cases.
A successful defect test is a test which causes the system to perform incorrectly and hence exposes a
defect. This emphasizes an important fact about testing: it demonstrates the presence, not the
absence, of program faults. This software was tested extensively for defects, and the defects found
were corrected.
Black-box testing
Functional or black-box testing is an approach to testing where the tests are derived from the program or
component specification. The system is a ‘black-box’ whose behavior can only be determined by studying
its inputs and the related outputs. Another name for this is functional testing because the tester is only
concerned with the functionality and not the implementation of the software.
This software was tested repeatedly by supplying many inputs and observing the outputs. In each case
it performed as expected.
Structural testing
Structural testing is an approach to testing where the tests are derived from knowledge of the
software’s structure and implementation. This approach is sometimes called ‘white-box’ testing or
‘clear-box’ testing to distinguish it from black-box testing. Structural testing is usually applied to
relatively small program units such as sub-routines or the operations associated with an object. As the
name implies, the tester can analyze the code and use knowledge about the structure of a component
to derive test data. The analysis of the code can be used to find how many test cases are needed to
guarantee that all the statements in the program or component are executed at least once during the
testing process.
Each of the small modules in the software was tested independently, and satisfactory results were
obtained. The modules tested are image resize, segmentation, feature extraction, classification with
fuzzy logic and grading with fuzzy logic.
Interface testing
Interface testing takes place when modules or sub-systems are integrated to create larger systems. Each
module or sub-system has a defined interface which is called by other program components. The objective
of interface testing is to detect faults which may have been introduced into the system because of
interface errors or invalid assumptions about the interfaces. In the present software no errors occurred
because of interfacing the different modules.
Conclusion
This work presents a new technique for the identification and classification of leaves. The technique
begins by capturing the leaf’s image using a regular digital camera on a stand. Features are extracted
from the query image, and the color of the leaf determines its class. The Support Vector Machine
fuzzy logic technique is used for both identification and classification of leaves. The proposed
technique identifies whether a leaf is healthy or diseased and classifies the disease. The results are
good for pomegranate leaves; this kind of system can be employed in pomegranate fields and also on
Android-enabled mobile phones.
REFERENCES