The document discusses using machine learning techniques like artificial neural networks, linear regression, and support vector regression to forecast daily oil production from an oil field. It analyzes production data from the Volve oil field in Norway using these three methods. All methods showed potential for production forecasting, though artificial neural networks performed best for one well. The performance of algorithms depends on the specific case, so each must be evaluated individually to select the best technique.
The document describes research developing an artificial neural network model to predict construction costs for expressway projects in Iraq. Data on past expressway projects was collected from the State Commission for Roads and Bridges in Iraq. A neural network model was built and trained on this data. The model predicted total construction costs with 90% accuracy based on correlation and an average accuracy of 89% compared to actual costs. Model performance was found to be relatively insensitive to the number of hidden layers, the momentum term, and the learning rate.
This paper reviews load forecasting using a neuro-fuzzy system. It discusses how neural networks and fuzzy logic can be combined in a neuro-fuzzy system to improve load forecasting accuracy. The paper first provides background on load forecasting and different techniques used. It then proposes using a neuro-fuzzy approach where load data is classified with fuzzy sets and a neural network is trained on each classification to forecast loads. Combining the learning ability of neural networks with the symbolic reasoning of fuzzy logic in a neuro-fuzzy system can potentially provide more accurate short-term load forecasts. The paper concludes that neuro-fuzzy systems show advantages over other statistical and AI methods for load forecasting.
Short-Term Forecasting of Electricity Consumption in Palestine Using Artifici...ijaia
Nowadays, planning for electricity consumption demand is one of the key success factors in the development of countries. Because of the importance of electricity, countries pay close attention to predicting electricity consumption. Consumption prediction is a major problem for the power sector; an efficient prediction helps electrical companies make the right decisions and optimize their supply strategies. In this paper, we propose a model that predicts future electricity consumption from previous consumption. The model lets companies and authorities know future electricity consumption in advance, so they can organize their distribution and make suitable plans to maintain stability in the delivery and distribution of electricity. We aim to create a model that studies previous electricity consumption patterns and uses this data to predict future consumption. The system analyzes the collected electricity consumption data of previous years, then uses the mean value for each day and a Multilayer Feed-Forward Neural Network with Backpropagation (MFFNNBP) to predict future electricity consumption in Palestine. The data used in this paper was collected over months and years. The proposed model follows a systematic process aimed at determining future electricity consumption in Palestine. The proposed application and the results in this paper were developed to contribute to the improvement of the current energy planning tools in Palestine. The experimental results show that the model predicts well, with a low Mean Square Error (MSE).
Techniques to Apply Artificial Intelligence in Power Plantsijtsrd
In today's world, we are experiencing tremendous growth in the research and application of artificial intelligence. Power plants are a vast sector with scope for using AI to rectify faults and optimize overall plant operation. The use of AI will help reduce human dependence and, during a breakdown, will assist in rectifying the problem by determining the cause quickly. This paper focuses on proposing numerous methods to implement AI in various power plants and on how doing so will help. Anshika Gupta "Techniques to Apply Artificial Intelligence in Power Plants" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-5, August 2020, URL: https://www.ijtsrd.com/papers/ijtsrd33050.pdf Paper URL: https://www.ijtsrd.com/engineering/information-technology/33050/techniques-to-apply-artificial-intelligence-in-power-plants/anshika-gupta
IRJET - House Price Prediction using Machine Learning and RPAIRJET Journal
This document discusses using machine learning and robotic process automation (RPA) to predict house prices. Specifically, it proposes using the CatBoost algorithm and RPA to extract real-time data for house price prediction. RPA involves using software robots to automate data extraction, while CatBoost will be used to predict prices based on the extracted dataset. The system aims to reduce problems faced by customers by providing more accurate price predictions compared to relying solely on real estate agents. It will extract data using RPA, clean the data, then apply machine learning algorithms like CatBoost to predict house prices based on various attributes.
The document proposes the UP-Growth+ algorithm to efficiently mine high utility itemsets from transactional databases. It first constructs a UP-Tree to store transaction information using two database scans while removing unpromising items. The UP-Tree aims to reduce overestimated utilities. Potential high utility itemsets are then generated from the UP-Tree using the UP-Growth+ algorithm through two strategies to further decrease overestimations. Finally, actual high utility itemsets are identified from the potential set by considering real utilities in the database.
This document summarizes a study that uses a genetic algorithm to optimize imputing missing cost data for fans used in road tunnels in Sweden. The genetic algorithm is used to impute the missing cost data by optimizing the valid data period used. The results show highly correlated data (R-squared of 0.99) after imputing the missing values, indicating the genetic algorithm provides an effective way to optimize imputing and create complete data that can then be used for forecasting and life cycle cost analysis. The document also reviews other methods for data imputation and discusses experimental results comparing the proposed two-stage approach using K-means clustering and multilayer perceptron on several datasets.
Data mining-for-prediction-of-aircraft-component-replacementSaurabh Gawande
The document presents an approach to using data mining techniques to build predictive models for predicting the need to replace aircraft components using data collected from aircraft sensors. It addresses four key challenges: selecting relevant data from the multiple datasets generated by aircraft, automatically labeling examples with classification values, evaluating models while accounting for dependencies between examples, and combining results from models built on different datasets. The approach was applied to predict problems for various aircraft components using over 3 years of data from 34 aircraft.
IRJET- Error Reduction in Data Prediction using Least Square Regression MethodIRJET Journal
This document proposes a modification to the least squares regression method to reduce errors in data prediction. It divides the original data set into three parts, uses the first part to make predictions with least squares regression and fits those predictions to the second part of the data to minimize errors. It then validates the model on the third part of data and compares errors to the original least squares method. The proposed method shows reduced errors in prediction based on mean absolute error, mean relative error and root mean square error metrics in most test ranges of the validation data.
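One possible reading of the three-part procedure above can be sketched in plain Python. The function names (`fit_line`, `errors`, `three_part_fit`), the equal three-way split, and the choice of a second straight-line fit as the correction step are all assumptions made for illustration; the summary does not specify how the predictions are fitted to the second part of the data.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def errors(actual, predicted):
    """Mean absolute error, mean relative error, and root mean square error."""
    n = len(actual)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    mre = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return {"mae": mae, "mre": mre, "rmse": rmse}

def three_part_fit(xs, ys):
    """Fit on part 1, correct on part 2, validate on part 3 (hypothetical scheme)."""
    k = len(xs) // 3
    x1, y1 = xs[:k], ys[:k]
    x2, y2 = xs[k:2 * k], ys[k:2 * k]
    x3, y3 = xs[2 * k:], ys[2 * k:]

    a, b = fit_line(x1, y1)                    # plain least squares on part 1
    pred2 = [a + b * x for x in x2]            # its predictions for part 2
    c, d = fit_line(pred2, y2)                 # assumed correction: prediction -> actual

    baseline = [a + b * x for x in x3]         # uncorrected predictions on part 3
    corrected = [c + d * p for p in baseline]  # corrected predictions on part 3
    return errors(y3, baseline), errors(y3, corrected)
```

Calling `three_part_fit(xs, ys)` returns the validation errors of the plain fit and of the corrected fit, so the two can be compared on the same metrics the summary names.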
PROFILE DOSE ANALYSIS OF 6MV LINEAR ACCELERATOR WITH CCD ELECTRONIC PORTAL IM...AM Publications
The dose profile of a 6 MV linear accelerator was investigated using a CCD electronic portal imaging device (EPID). The aim of the research was to analyze the dose profile curve of the CCD EPID, including calculation of the linearity, symmetry, and penumbra values. The equipment used was an Elekta Compact linear accelerator and a CCD EPID. The CCD EPID was irradiated with a 10 x 10 cm field at five monitor unit (MU) settings ranging from 20 MU to 100 MU. The CCD EPID images were converted to grey scale, and the grey-scale values were used to construct dose profile curves in the cross-line and in-line directions. The symmetry and penumbra values were below 2%, but the linearity value exceeded 3% by 0.2%. The symmetry and penumbra therefore agree with AAPM TG no. 47, while the linearity requires further investigation to bring the value down to 3%.
Real-time PMU Data Recovery Application Based on Singular Value DecompositionPower System Operation
Phasor measurement units (PMUs) allow for the enhancement of power system monitoring and control applications and they will prove even more crucial in the future, as the grid becomes more decentralized and subject to higher uncertainty. Tools that improve PMU data quality and facilitate data analytics workflows are thus needed. In this work, we leverage a previously described algorithm to develop a python application for PMU data recovery. Because of its intrinsic nature, PMU data can be dimensionally reduced using singular value decomposition (SVD). Moreover, the high spatio-temporal correlation can be leveraged to estimate the value of measurements that are missing due to drop-outs. These observations are at the base of the data recovery application described in this work. Extensive testing is performed to study the performance under different data drop-out scenarios, and the results show very high recovery accuracy. Additionally, the application is designed to take advantage of a high performance PMU data platform called PredictiveGrid™, developed by PingThings.
Preprocessing and secure computations for privacy preservation data miningIAEME Publication
This document discusses privacy-preserving data mining techniques. It covers the following key points:
1. Data preprocessing is an essential step before performing secure computations and data mining on distributed datasets. Tasks like handling irrelevant data, using taxonomy trees, eliminating redundancy, and handling missing values are discussed.
2. Secure computation protocols like secure sum and secure union allow sites to jointly perform computations without disclosing private inputs. Secure sum protocols using randomization, k-anonymization, and homomorphic encryption are compared.
3. A secure union protocol using commutative encryption and one using public-key cryptography are presented. The public-key approach reduces computation time compared to the commutative encryption method.
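The randomization-based secure sum mentioned in point 2 can be illustrated with a single-process simulation. This is a minimal sketch of the commonly described ring protocol, not necessarily the variant the document compares: the function name `secure_sum`, the modulus, and the assumption that all values are non-negative integers whose true sum is below the modulus are illustrative choices.

```python
import secrets

def secure_sum(private_values, modulus=2**32):
    """Simulate a ring-based secure sum with a random mask.

    The first site adds a secret random mask to its value before passing the
    running total on, so each later site only ever sees a masked partial sum.
    Assumes non-negative integers whose true sum is below `modulus`.
    """
    if not private_values:
        raise ValueError("at least one site is required")

    mask = secrets.randbelow(modulus)               # known only to the first site
    running = (mask + private_values[0]) % modulus  # first site masks its input

    for value in private_values[1:]:                # each site adds its own value
        running = (running + value) % modulus

    return (running - mask) % modulus               # first site removes the mask
```

For example, `secure_sum([10, 20, 30])` returns 60 regardless of the mask drawn. In a real deployment each addition would happen at a different site, and the basic scheme is known to be vulnerable if neighbouring sites collude.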
IRJET- Different Data Mining Techniques for Weather PredictionIRJET Journal
This document discusses different data mining techniques that can be used for weather prediction, including back propagation, decision trees, k-means clustering, expectation maximization, and numerical and statistical methods. It provides an overview of each technique, explaining the basic process or algorithm involved. For example, it explains that back propagation is a deep learning algorithm that trains multilayer neural networks in two phases - propagation and weight updating. It also discusses how decision trees use rules to classify weather data based on input parameters, and how k-means clustering groups similar weather observations into clusters. The document aims to compare these techniques for applying data mining to weather forecasting.
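The k-means step described above, grouping similar observations around cluster centres, can be sketched for one-dimensional readings such as daily temperatures. The function names, the use of caller-supplied starting centroids, and the sample values in the usage note are assumptions for illustration only.

```python
def assign(values, centroids):
    """Group each value with its nearest centroid (by absolute distance)."""
    clusters = [[] for _ in centroids]
    for v in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
        clusters[nearest].append(v)
    return clusters

def kmeans_1d(values, initial_centroids, max_iterations=100):
    """Plain k-means on scalar observations; returns (centroids, clusters)."""
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        clusters = assign(values, centroids)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # converged: assignments can no longer change
            break
        centroids = updated
    return centroids, assign(values, centroids)
```

With hypothetical readings `[10, 11, 12, 30, 31, 32]` and starting centroids `[0, 40]`, the centroids settle at the means of the cool and warm groups after one update.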
IRJET - Breast Cancer Risk and Diagnostics using Artificial Neural Network(ANN)IRJET Journal
This document describes research using an artificial neural network (ANN) to classify breast cancer as benign or malignant based on the Wisconsin Breast Cancer dataset. The ANN model was trained and tested on 683 instances from the dataset. The model achieved 97.8% accuracy on the training set and 97.5% accuracy on the test set. Various performance metrics including mean absolute error, root mean square error, and kappa statistics were used to evaluate the model, demonstrating low error rates. The ANN model outperformed other classification algorithms in related work and efficiently classified breast cancer with high accuracy and precision.
This document discusses developing a theory of data analysis systems that integrates statistical methodology with the design of distributed data systems. It aims to balance tradeoffs between computational, transmission, and statistical costs when performing large-scale, distributed data analysis. As a proof of concept, it presents a toy example involving maximum likelihood estimation of parameters for a Gaussian process model using distributed spatial data. The example quantifies various costs associated with data access, transmission, and computation to jointly optimize the statistical analysis approach and data system design. Challenges include developing objective functions that can optimize both aspects simultaneously and approximating statistical costs like uncertainty.
Support Vector Machine–Based Prediction System for a Football Match Resultiosrjce
This document describes a study that used a support vector machine (SVM) to develop a football match result prediction system. The SVM model was trained on 16 datasets from the 2014-2015 English Premier League season and tested on 15 additional matches. The SVM used a Gaussian combination kernel and was optimized with various parameters. The system achieved a prediction accuracy of 53.3%, which the study concluded was a relatively low performance. The document discusses related work on prediction systems and provides details on SVM algorithm implementation and parameters used in an effort to predict football match results.
This document discusses the importance of mathematics in various fields of engineering. It outlines several key areas of mathematics and provides examples of their applications in electrical, civil, mechanical, biomedical and other engineering domains. The areas described include complex numbers, matrices and determinants, Laplace transforms, statistics and probability, vectors and trigonometry, differentiation, integration, and functions, polynomials and linear equations. Across these areas, mathematics is essential for modeling and solving real-world engineering problems involving areas like circuit analysis, structural design, fluid mechanics, biomedical device development, and more.
This document discusses optimization in power systems. Optimization aims to find the maximum and minimum values of functions, and can be used to optimize various systems including power networks, transportation, and more. In power systems specifically, optimization becomes important as systems grow larger and more complex over time due to increasing demand. Optimization techniques can save power utilities hundreds of millions annually through reduced fuel costs and improved reliability while meeting environmental standards. The key goals of optimization in power systems are minimizing generation costs and maximizing social welfare while satisfying operational constraints.
This document discusses forecasting daily runoff using artificial neural networks (ANN). It presents research applying ANN models to the Gunjwani watershed in India. The document describes developing ANN and multiple linear regression models using rainfall, runoff, evaporation, humidity and temperature data from the watershed. It evaluates the models based on statistical performance criteria like mean square error, mean absolute error and correlation coefficient. The results show that the multi-layer perceptron ANN model provided a better forecast of runoff compared to the multiple linear regression models.
A crucial ingredient of a successful weather prediction system is its ability to combine observational data with the output of numerical weather prediction models to estimate the state of the atmosphere and the oceans. This problem of estimating the state of a high-dimensional chaotic system such as the atmosphere, given noisy and partial observations of it, is known as data assimilation in the context of earth sciences. The main object of interest in these problems is the conditional distribution, called the posterior, of the state conditioned on the observations. Monte Carlo methods are the most commonly used techniques to study this posterior and to use it efficiently for prediction. I will give a general introduction to data assimilation problems and to Monte Carlo techniques, followed by a discussion of some commonly used Monte Carlo algorithms for data assimilation.
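A toy scalar example may help show what studying the posterior by Monte Carlo means. The sketch below uses self-normalized importance sampling, which is one simple Monte Carlo technique and not necessarily one covered in the talk. The standard normal prior, the Gaussian observation noise, and the function name are assumptions chosen so the answer can be checked analytically.

```python
import math
import random

def posterior_mean_importance(observation, obs_std, n_samples=20000, seed=0):
    """Estimate E[state | observation] by self-normalized importance sampling.

    Assumed toy model: state ~ N(0, 1), observation = state + N(0, obs_std^2).
    Samples are drawn from the prior and weighted by the observation likelihood.
    """
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Unnormalized Gaussian likelihood of the observation given each sample.
    weights = [
        math.exp(-0.5 * ((observation - x) / obs_std) ** 2) for x in samples
    ]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, samples)) / total
```

For this model the exact posterior mean is `observation / (1 + obs_std**2)`, so an observation of 1.0 with unit noise should give an estimate close to 0.5. In high-dimensional systems such as the atmosphere, weighting prior samples this way degenerates quickly, which is one reason specialized algorithms are needed.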
Turnover Prediction of Shares Using Data Mining Techniques : A Case Study csandit
Predicting the total turnover of a company in the ever-fluctuating stock market has always been a precarious and difficult task. Data mining is a well-known area of computer science that aims to extract meaningful information from large databases. However, despite the existence of many algorithms for predicting future trends, their efficiency is questionable because their predictions suffer from a high error rate. The objective of this paper is to investigate various existing classification algorithms for predicting the turnover of different companies based on the stock price. The authorized dataset for predicting the turnover was taken from www.bsc.com and included the stock market values of various companies over the past 10 years. The algorithms were investigated using the ‘R’ tool. The feature selection algorithm Boruta was run on this dataset to extract the important and influential features for classification. With these extracted features, the total turnover of the company was predicted using algorithms such as Random Forest, Decision Tree, SVM, and Multinomial Regression. This prediction mechanism was implemented to predict the turnover of a company on a daily basis and hence could help navigate dubious stock market trades. An accuracy rate of 95% was achieved by the above prediction process. Moreover, the importance of the stock market attributes was established as well.
Stock Ranking - A Neural Networks Approachriturajvasant
The document discusses using neural networks to perform stock ranking. It begins with defining stock ranking and neural networks. It then discusses why neural networks are well-suited for stock ranking due to their ability to model nonlinear relationships. The methodology section outlines the steps: variable selection, data collection and preprocessing, neural network selection including training and validation. Results show neural networks outperform traditional linear models for stock ranking.
This document describes a final year project by four students at Himalaya College of Engineering in Nepal to analyze and predict stock market prices using artificial neural networks. The project aims to develop a neural network model to forecast stock prices on the Nepal Stock Exchange. Various technical, fundamental, and statistical analysis methods are currently used to predict stock prices but with limited success due to the complex nature of financial markets. The project outlines the design of the neural network, selection of input parameters, data collection, model training and testing. The goal is to apply neural networks to help forecast stock prices in Nepal's stock market.
This document summarizes a study on using data mining techniques like multiple linear regression and density-based clustering to estimate crop production in East Godavari district of India. Multiple linear regression and density-based clustering were used to model the relationship between crop production and factors like rainfall, area sown, fertilizer use. The estimated values from both techniques were found to have a percentage difference ranging from -14% to 13% when compared to actual production values, indicating the techniques can adequately estimate crop production. Tables of actual versus estimated values using both techniques are provided for comparison.
Survey on deep learning applied to predictive maintenance IJECEIAES
Prognosis health monitoring (PHM) plays an increasingly important role in the management of machines and manufactured products in today’s industry, and deep learning plays an important part by establishing the optimal predictive maintenance policy. However, traditional learning methods such as unsupervised and supervised learning with standard architectures face numerous problems when exploiting existing data. Therefore, in this essay, we review the significant improvements in deep learning made by researchers over the last 3 years in solving these difficulties. We note that researchers are striving to achieve optimal performance in estimating the remaining useful life (RUL) of machine health by optimizing each step from data to predictive diagnostics. Specifically, we outline the challenges at each level with the type of improvement that has been made, and we feel that this is an opportunity to try to select a state-of-the-art architecture that incorporates these changes so each researcher can compare with his or her model. In addition, post-RUL reasoning and the use of distributed computing with cloud technology is presented, which will potentially improve the classification accuracy in maintenance activities. Deep learning will undoubtedly prove to have a major impact in upgrading companies at the lowest cost in the new industrial revolution, Industry 4.0.
Statistical modeling for computer-aided design of analog MOS integrated circuits
Introduction to Cadence & MOS Device Characterization
ANALOG & MIXED SIGNAL CENTER (amsc.tamu.edu)
AMS Laurea Institutional Theses Repository
Emerging Yield and Reliability Challenges in Nanometer CMOS Technologies
Machine Learning-Based Approach for Hardware Faults Prediction
Different Classification Technique for Data mining in Insurance Industry usin...IOSRjournaljce
This paper addresses the issues and techniques involved when Property/Casualty actuaries apply data mining methods. Data mining means the effective discovery of unknown patterns in a large database. It is an interactive knowledge discovery procedure that includes data acquisition, data integration, data exploration, model building, and model validation. The paper provides an overview of the data discovery method and introduces some important data mining methods for application to insurance, including cluster discovery approaches.
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...ijaia
In the era of the fourth industrial revolution, measuring and ensuring the reliability, efficiency, and safety of industrial systems and components is one of the foremost concerns. In addition, predicting the performance degradation or remaining useful life (RUL) of equipment over time from its historical sensor data enables companies to greatly reduce their maintenance costs. In this way, companies can prevent costly unexpected breakdowns and become more profitable and competitive in the marketplace. This paper introduces a deep learning-based method that combines CNN (Convolutional Neural Network) and LSTM (Long Short-Term Memory) neural networks to predict RUL for industrial equipment. The proposed method does not depend on any degradation trend assumptions, and it can learn complex temporal, representative, and distinguishing patterns in the sensor data. To evaluate the efficiency and effectiveness of the proposed method, we tested it in two different experiments: RUL estimation and predicting the status of IoT devices over a 2-week period. Experiments were conducted on NASA's publicly available turbofan engine dataset. Based on the experimental results, the deep learning-based approach achieved high prediction accuracy. Moreover, the results show that the method outperforms standard, well-accepted machine learning algorithms and achieves competitive performance compared with state-of-the-art methods.
EFFICIENT USE OF HYBRID ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM COMBINED WITH N... (csandit)
This research study proposes a novel method for automatic fault prediction from foundry data, introducing the so-called Meta Prediction Function (MPF). Kernel Principal Component Analysis (KPCA) is used for dimension reduction. Different algorithms are used for building the MPF, such as Multiple Linear Regression (MLR), Adaptive Neuro Fuzzy Inference System (ANFIS), Support Vector Machine (SVM) and Neural Network (NN). We used classical machine learning methods such as ANFIS, SVM and NN for comparison with our proposed MPF. Our empirical results show that the MPF consistently outperforms the classical methods.
IRJET- Error Reduction in Data Prediction using Least Square Regression Method (IRJET Journal)
This document proposes a modification to the least squares regression method to reduce errors in data prediction. It divides the original data set into three parts, uses the first part to make predictions with least squares regression and fits those predictions to the second part of the data to minimize errors. It then validates the model on the third part of data and compares errors to the original least squares method. The proposed method shows reduced errors in prediction based on mean absolute error, mean relative error and root mean square error metrics in most test ranges of the validation data.
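The three-way split described above can be sketched roughly as follows. This is only one plausible reading of the summary, not the paper's actual procedure: the second stage here regresses the actual values of part two on the stage-one predictions, and the helper names (`fit_line`, `three_part_fit`) are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mre(y_true, y_pred):
    """Mean relative error (assumes no true value is zero)."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def three_part_fit(xs, ys):
    """Fit on part 1, correct against part 2, hold out part 3 for validation."""
    n = len(xs)
    i, j = n // 3, 2 * n // 3
    a1, b1 = fit_line(xs[:i], ys[:i])

    def base(x):
        # Plain least squares model trained on part 1 only.
        return a1 + b1 * x

    # Assumed correction step: fit the part-2 targets to the stage-1 predictions.
    a2, b2 = fit_line([base(x) for x in xs[i:j]], ys[i:j])

    def corrected(x):
        return a2 + b2 * base(x)

    return base, corrected, (xs[j:], ys[j:])


if __name__ == "__main__":
    random.seed(0)
    xs = [random.uniform(0.0, 10.0) for _ in range(90)]
    ys = [2.0 + 3.0 * x + random.gauss(0.0, 1.0) for x in xs]  # synthetic data
    base, corrected, (vx, vy) = three_part_fit(xs, ys)
    for name, model in (("baseline", base), ("corrected", corrected)):
        preds = [model(x) for x in vx]
        print(name, mae(vy, preds), mre(vy, preds), rmse(vy, preds))
```

Whether the corrected model beats the baseline depends on the data; the sketch only shows how the three error metrics named in the summary would be compared on the held-out third part.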
PROFILE DOSE ANALYSIS OF 6MV LINEAR ACCELERATOR WITH CCD ELECTRONIC PORTAL IM... (AM Publications)
Profile dose analysis of a 6 MV linear accelerator using a CCD electronic portal imaging device (EPID) has been investigated. The aim of the research is to analyze the dose profile curve of the CCD EPID. The analysis includes calculating the linearity, symmetry and penumbra values. The materials of the research are an Elekta Compact linear accelerator and a CCD EPID. The CCD EPID was irradiated with a 10 x 10 cm field at five MU settings, from 20 MU to 100 MU. The CCD EPID images were converted to grey-scale, and the grey-scale values were then used to compute dose profile curves in the cross-line and in-line directions. The results show symmetry and penumbra values of less than 2%, but a linearity value 0.2% above 3%. This means that the symmetry and penumbra agree with AAPM TG no. 47, but the linearity needs further investigation to reduce the value to 3%.
Real-time PMU Data Recovery Application Based on Singular Value Decomposition (Power System Operation)
Phasor measurement units (PMUs) allow for the enhancement of power system monitoring and control applications and they will prove even more crucial in the future, as the grid becomes more decentralized and subject to higher uncertainty. Tools that improve PMU data quality and facilitate data analytics workflows are thus needed. In this work, we leverage a previously described algorithm to develop a python application for PMU data recovery. Because of its intrinsic nature, PMU data can be dimensionally reduced using singular value decomposition (SVD). Moreover, the high spatio-temporal correlation can be leveraged to estimate the value of measurements that are missing due to drop-outs. These observations are at the base of the data recovery application described in this work. Extensive testing is performed to study the performance under different data drop-out scenarios, and the results show very high recovery accuracy. Additionally, the application is designed to take advantage of a high performance PMU data platform called PredictiveGrid™, developed by PingThings.
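The idea that highly correlated measurements are approximately low-rank, so that dropped samples can be estimated from a truncated SVD, can be illustrated with a generic iterative imputation sketch. This is not the algorithm or the PredictiveGrid API from the work described above; the function name `svd_impute`, the fixed rank and the iteration count are all assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def svd_impute(X, mask, rank, n_iter=200):
    """Fill unobserved entries of X using repeated rank-`rank` SVD truncation.

    X    : 2-D array (time samples x measurement channels)
    mask : boolean array, True where a sample was actually received
    """
    # Start by filling each gap with the mean of the observed samples in its column.
    observed = np.where(mask, X, 0.0)
    col_means = observed.sum(axis=0) / np.maximum(mask.sum(axis=0), 1)
    filled = np.where(mask, X, col_means)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep received samples untouched; overwrite only the gaps.
        filled = np.where(mask, X, low_rank)
    return filled


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 200)
    basis = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)], axis=1)
    X = basis @ rng.normal(size=(2, 10))      # synthetic rank-2 "measurements"
    mask = rng.random(X.shape) > 0.1          # roughly 10% random drop-outs
    recovered = svd_impute(X, mask, rank=2)
    print("max error on dropped samples:", np.abs(recovered - X)[~mask].max())
```

In practice the rank would have to be chosen from the data (for example from the decay of the singular values), and a real-time application would work on sliding windows rather than on one fixed matrix.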
Preprocessing and secure computations for privacy preservation data mining (IAEME Publication)
This document discusses privacy-preserving data mining techniques. It covers the following key points:
1. Data preprocessing is an essential step before performing secure computations and data mining on distributed datasets. Tasks like handling irrelevant data, using taxonomy trees, eliminating redundancy, and handling missing values are discussed.
2. Secure computation protocols like secure sum and secure union allow sites to jointly perform computations without disclosing private inputs. Secure sum protocols using randomization, k-anonymization, and homomorphic encryption are compared.
3. A secure union protocol using commutative encryption and one using public-key cryptography are presented. The public-key approach reduces computation time compared to the commutative encryption method.
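As an illustration of the randomization-based secure sum mentioned in point 2, the textbook ring protocol can be sketched as below. It is a single-process simulation under stated assumptions (non-negative integer inputs whose total is smaller than the modulus), not the specific protocol compared in the document, and the function name `secure_sum` is hypothetical.

```python
import secrets


def secure_sum(private_values, modulus=2**64):
    """Simulate a ring-based secure sum with a random mask.

    The initiating site adds a secret random mask to its own value and passes
    the running total around the ring; every other site adds its value to the
    masked total it receives. Only the initiator can remove the mask at the end.
    """
    mask = secrets.randbelow(modulus)                  # known only to the initiator
    running = (mask + private_values[0]) % modulus
    for value in private_values[1:]:
        running = (running + value) % modulus          # what each site forwards
    return (running - mask) % modulus


if __name__ == "__main__":
    print(secure_sum([12, 7, 30]))
```

Each intermediate total looks uniformly random to the site that receives it, but the basic scheme is not collusion-resistant: the two neighbours of a site can subtract the totals they saw to recover that site's value.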
IRJET- Different Data Mining Techniques for Weather Prediction (IRJET Journal)
This document discusses different data mining techniques that can be used for weather prediction, including back propagation, decision trees, k-means clustering, expectation maximization, and numerical and statistical methods. It provides an overview of each technique, explaining the basic process or algorithm involved. For example, it explains that back propagation is a deep learning algorithm that trains multilayer neural networks in two phases - propagation and weight updating. It also discusses how decision trees use rules to classify weather data based on input parameters, and how k-means clustering groups similar weather observations into clusters. The document aims to compare these techniques for applying data mining to weather forecasting.
IRJET - Breast Cancer Risk and Diagnostics using Artificial Neural Network (ANN) (IRJET Journal)
This document describes research using an artificial neural network (ANN) to classify breast cancer as benign or malignant based on the Wisconsin Breast Cancer dataset. The ANN model was trained and tested on 683 instances from the dataset. The model achieved 97.8% accuracy on the training set and 97.5% accuracy on the test set. Various performance metrics including mean absolute error, root mean square error, and kappa statistics were used to evaluate the model, demonstrating low error rates. The ANN model outperformed other classification algorithms in related work and efficiently classified breast cancer with high accuracy and precision.
This document discusses developing a theory of data analysis systems that integrates statistical methodology with the design of distributed data systems. It aims to balance tradeoffs between computational, transmission, and statistical costs when performing large-scale, distributed data analysis. As a proof of concept, it presents a toy example involving maximum likelihood estimation of parameters for a Gaussian process model using distributed spatial data. The example quantifies various costs associated with data access, transmission, and computation to jointly optimize the statistical analysis approach and data system design. Challenges include developing objective functions that can optimize both aspects simultaneously and approximating statistical costs like uncertainty.
Support Vector Machine–Based Prediction System for a Football Match Result (iosrjce)
This document describes a study that used a support vector machine (SVM) to develop a football match result prediction system. The SVM model was trained on 16 datasets from the 2014-2015 English Premier League season and tested on 15 additional matches. The SVM used a Gaussian combination kernel and was optimized with various parameters. The system achieved a prediction accuracy of 53.3%, which the study concluded was a relatively low performance. The document discusses related work on prediction systems and provides details on SVM algorithm implementation and parameters used in an effort to predict football match results.
This document discusses the importance of mathematics in various fields of engineering. It outlines several key areas of mathematics and provides examples of their applications in electrical, civil, mechanical, biomedical and other engineering domains. The areas described include complex numbers, matrices and determinants, Laplace transforms, statistics and probability, vectors and trigonometry, differentiation, integration, functions, polynomials and linear equations. Across these areas, mathematics is essential for modeling and solving real-world engineering problems such as circuit analysis, structural design, fluid mechanics and biomedical device development.
This document discusses optimization in power systems. Optimization aims to find the maximum and minimum values of functions, and can be used to optimize various systems including power networks, transportation, and more. In power systems specifically, optimization becomes important as systems grow larger and more complex over time due to increasing demand. Optimization techniques can save power utilities hundreds of millions annually through reduced fuel costs and improved reliability while meeting environmental standards. The key goals of optimization in power systems are minimizing generation costs and maximizing social welfare while satisfying operational constraints.
This document discusses forecasting daily runoff using artificial neural networks (ANN). It presents research applying ANN models to the Gunjwani watershed in India. The document describes developing ANN and multiple linear regression models using rainfall, runoff, evaporation, humidity and temperature data from the watershed. It evaluates the models based on statistical performance criteria like mean square error, mean absolute error and correlation coefficient. The results show that the multi-layer perceptron ANN model provided a better forecast of runoff compared to the multiple linear regression models.
A crucial ingredient of a successful weather prediction system is its ability to combine observational data with the output of numerical weather prediction models to estimate the state of the atmosphere and the oceans. This problem of estimating the state of a high-dimensional chaotic system such as the atmosphere, given noisy and partial observations of it, is known as data assimilation in the context of earth sciences. The main object of interest in these problems is the conditional distribution, called the posterior, of the state conditioned on the observations. Monte Carlo methods are the most commonly used techniques to study this posterior and also to use it efficiently for prediction. I will give a general introduction to the data assimilation problems and also to Monte Carlo techniques, followed by a discussion of some commonly used Monte Carlo algorithms for data assimilation.
Turnover Prediction of Shares Using Data Mining Techniques: A Case Study (csandit)
Predicting the total turnover of a company in the ever-fluctuating stock market has always proved to be a precarious and difficult task. Data mining is a well-known sphere of computer science that aims at extracting meaningful information from large databases. However, despite the existence of many algorithms for predicting future trends, their efficiency is questionable, as their predictions suffer from a high error rate. The objective of this paper is to investigate various existing classification algorithms to predict the turnover of different companies based on the stock price. The authorized dataset for predicting the turnover was taken from www.bsc.com and included the stock market values of various companies over the past 10 years. The algorithms were investigated using the 'R' tool. The feature selection algorithm Boruta was run on this dataset to extract the important and influential features for classification. With these extracted features, the total turnover of the company was predicted using various algorithms such as Random Forest, Decision Tree, SVM and Multinomial Regression. This prediction mechanism was implemented to predict the turnover of a company on an everyday basis and hence could help navigate dubious stock market trades. An accuracy rate of 95% was achieved by the above prediction process. Moreover, the importance of the stock market attributes was established as well.
Stock Ranking - A Neural Networks Approach (riturajvasant)
The document discusses using neural networks to perform stock ranking. It begins with defining stock ranking and neural networks. It then discusses why neural networks are well-suited for stock ranking due to their ability to model nonlinear relationships. The methodology section outlines the steps: variable selection, data collection and preprocessing, neural network selection including training and validation. Results show neural networks outperform traditional linear models for stock ranking.
This document describes a final year project by four students at Himalaya College of Engineering in Nepal to analyze and predict stock market prices using artificial neural networks. The project aims to develop a neural network model to forecast stock prices on the Nepal Stock Exchange. Various technical, fundamental, and statistical analysis methods are currently used to predict stock prices but with limited success due to the complex nature of financial markets. The project outlines the design of the neural network, selection of input parameters, data collection, model training and testing. The goal is to apply neural networks to help forecast stock prices in Nepal's stock market.
This document summarizes a study on using data mining techniques, namely multiple linear regression and density-based clustering, to estimate crop production in the East Godavari district of India. Both techniques were used to model the relationship between crop production and factors such as rainfall, area sown and fertilizer use. The estimates from both techniques differed from actual production values by -14% to 13%, indicating that the techniques can adequately estimate crop production. Tables of actual versus estimated values for both techniques are provided for comparison.
Survey on deep learning applied to predictive maintenance (IJECEIAES)
Prognosis health monitoring (PHM) plays an increasingly important role in the management of machines and manufactured products in today’s industry, and deep learning plays an important part by establishing the optimal predictive maintenance policy. However, traditional learning methods such as unsupervised and supervised learning with standard architectures face numerous problems when exploiting existing data. Therefore, in this essay, we review the significant improvements in deep learning made by researchers over the last 3 years in solving these difficulties. We note that researchers are striving to achieve optimal performance in estimating the remaining useful life (RUL) of machine health by optimizing each step from data to predictive diagnostics. Specifically, we outline the challenges at each level with the type of improvement that has been made, and we feel that this is an opportunity to try to select a state-of-the-art architecture that incorporates these changes so each researcher can compare with his or her model. In addition, post-RUL reasoning and the use of distributed computing with cloud technology is presented, which will potentially improve the classification accuracy in maintenance activities. Deep learning will undoubtedly prove to have a major impact in upgrading companies at the lowest cost in the new industrial revolution, Industry 4.0.
Statistical modeling for computer-aided design of analog MOS integrated circuits
Introduction to Cadence & MOS Device Characterization
ANALOG & MIXED SIGNAL CENTER (amsc.tamu.edu)
AMS Laurea Institutional Theses Repository
Emerging Yield and Reliability Challenges in Nanometer CMOS Technologies
Machine Learning-Based Approach for Hardware Faults Prediction
Different Classification Technique for Data mining in Insurance Industry usin... (IOSRjournaljce)
This paper addresses the issues and techniques for Property/Casualty actuaries applying data mining methods. Data mining means the effective discovery of unknown patterns in a large database. It is an interactive knowledge discovery procedure which includes data acquisition, data integration, data exploration, model building, and model validation. The paper provides an overview of the data discovery method and introduces some important data mining methods for application to insurance, including cluster discovery approaches.
IRJET- Agricultural Crop Yield Prediction using Deep Learning Approach (IRJET Journal)
This document discusses using artificial neural networks to predict agricultural crop yields. It begins with an abstract that outlines using ANNs to predict crop yield given various input parameters like soil pH, nitrogen levels, temperature, rainfall, etc. It then provides an introduction on the importance of accurate crop yield prediction. The next sections discuss literature on previous ANN crop yield prediction models, the proposed ANN approach including network architecture and activation functions, the design process, and conclusions. The key points are that ANNs can accurately predict crop yields given various climatic and soil inputs, and providing farmers with these predictions could help maximize profits and minimize losses.
An intrusion detection algorithm for ami (IJCI JOURNAL)
Smart metering devices are now widely used to manage large numbers of energy subscribers, covering meter reading, billing, and the connection and disconnection of subscribers, which makes their management an important issue. The performance of these intelligent systems is based on information transfer in the context of information technology, so data reported from the network must be managed to prevent malicious activity that could affect the system's quality of service. In this paper, to control the reported data and ensure the veracity of the obtained information, an intrusion detection system based on the support vector machine (SVM) and principal component analysis (PCA) is proposed to recognize and identify intrusions and attacks in the smart grid. The operation of the intrusion detection system is studied for different SVM kernels when SVM and PCA are used simultaneously. To evaluate the algorithm, numerical simulation based on the KDD99 data is carried out on five different kernels for an intrusion detection system using SVM with PCA. A comparative analysis of the presented intrusion detection algorithm is also carried out in terms of response time, the rate of increase in network efficiency, the increase in system error, and the differences between using and not using PCA. The results indicate that the correct detection rate and the attack error detection rate have their best values when PCA is used with a radial kernel, and that using PCA with the SVM algorithm reduces data analysis time and enhances the performance of intrusion detection.
A Predictive Stock Data Analysis with SVM-PCA Model
Divya Joseph and Vinai George Biju
HOV-kNN: A New Algorithm to Nearest Neighbor Search in Dynamic Space
Mohammad Reza Abbasifard, Hassan Naderi and Mohadese Mirjalili
A Survey on Mobile Malware: A War without End
Sonal Mohite and Prof. R. S. Sonar
An Efficient Design Tool to Detect Inconsistencies in UML Design Models
Mythili Thirugnanam and Sumathy Subramaniam
An Integrated Procedure for Resolving Portfolio Optimization Problems using Data Envelopment Analysis, Ant Colony Optimization and Gene Expression Programming
Chih-Ming Hsu
Emerging Technologies: LTE vs. WiMAX
Mohammad Arifin Rahman Khan and Md. Sadiq Iqbal
Introducing E-Maintenance 2.0
Abdessamad Mouzoune and Saoudi Taibi
Detection of Clones in Digital Images
Minati Mishra and Flt. Lt. Dr. M. C. Adhikary
The Significance of Genetic Algorithms in Search, Evolution, Optimization and Hybridization: A Short Review
IRJET - Intelligent Weather Forecasting using Machine Learning Techniques (IRJET Journal)
This document discusses using machine learning techniques to forecast weather intelligently. It proposes using multi-target regression and recurrent neural network (RNN) models trained on historical weather data from Bangalore to predict future weather conditions like temperature, humidity, and precipitation. The data is first preprocessed before being fed to the models. The models are evaluated to accurately predict weather in the short term to help people like farmers and commuters without relying on expensive equipment.
Overview of soft intelligent computing technique for supercritical fluid extr... (IJAAS Team)
Optimization of the supercritical fluid extraction process with mathematical modeling is essential for industrial applications. The response surface methodology (RSM) has been proven to be a useful and effective statistical method for studying the relationships between measured responses and independent factors. Recently, there has been growing interest in applying smart systems or artificial intelligence techniques to model and simulate chemical processes, and also to predict, compute, classify and optimize, as well as for process control. Such a system works by generalizing the experimental results and the process behavior, and then predicting and estimating the problem. This smart system is a major aid in scaling up a process from the laboratory to pilot or industrial scale. The main advantage of intelligent systems is that predictions can be performed easily, quickly and accurately, which physical models are unable to do. This paper shares several works that have used intelligent systems for modeling and simulating the supercritical fluid extraction process.
This document summarizes a research paper that uses an artificial neural network approach to forecast stock market prices in India. The paper trains a feedforward neural network using a backpropagation algorithm on data from 5 Indian companies between 2004 and 2013. The network is tested in MATLAB to predict stock prices and calculate an error rate for accuracy. The neural network model is found to provide a computational method for predicting stock market movements based on historical price and volume data.
Multimode system condition monitoring using sparsity reconstruction for quali... (IJECEIAES)
In this paper, we introduce an improved multivariate statistical monitoring method based on the stacked sparse autoencoder (SSAE). Our contribution focuses on the choice of the SSAE model based on neural networks to solve diagnostic problems of complex systems. In order to monitor the process performance, the squared prediction error (SPE) chart is linked with nonparametric adaptive confidence bounds which arise from the kernel density estimation to minimize erroneous alerts. Then, faults are localized using two methods: contribution plots and sensor validity index (SVI). The results are obtained from experiments and real data from a drinkable water processing plant, demonstrating how the applied technique is performed. The simulation results of the SSAE model show a better ability to detect and identify sensor failures.
A Hierarchical Feature Set optimization for effective code change based Defec... (IOSR Journals)
This document summarizes research on using support vector machines (SVMs) for software defect prediction. It analyzes 11 datasets from NASA projects containing code metrics and defect information for modules. The researchers preprocessed the data by removing duplicate/inconsistent instances, constant attributes, and balancing the datasets. They used SVMs with 5-fold cross validation to classify modules as defective or non-defective, achieving an average accuracy of 70% across the datasets. The researchers conclude SVMs can effectively predict defects but note earlier studies using the NASA data may have overstated capabilities due to insufficient data preprocessing.
TRAFFIC FORECAST FOR INTELLECTUAL TRANSPORTATION SYSTEM USING MACHINE LEARNING (IRJET Journal)
1. The document discusses using machine learning techniques like random forests and support vector machines to predict traffic patterns using large datasets from intelligent transportation systems.
2. It proposes predicting traffic using an SVM algorithm with Euclidean distance metrics on traffic data derived from online sources, aiming to improve accuracy and reduce errors compared to existing systems.
3. The system would take in historical vehicle movement data to be trained via machine learning, allowing it to process large amounts of real-time sensor data and better predict traffic conditions, which could help minimize congestion and carbon emissions from transportation.
This document discusses using artificial neural networks and MATLAB 7.10 to develop an efficient system for sorting mechanical spare parts. It involves using wavelet transforms to extract features from images of parts, which are then used to train an artificial neural network. The neural network can accurately recognize parts based on their wavelet features with high efficiency. Simulation results show the system can successfully identify the name of a selected spare part from its image with a graphical output.
Artificial Neural Network (ANN) is a fast-growing method which has been used in different industries in recent years. The main idea behind creating ANNs, which are a subset of artificial intelligence, is to provide a simple model of the human brain in order to solve complex scientific and industrial problems. ANNs are high-value and low-cost tools in modelling, simulation, control, condition monitoring, sensor validation and fault diagnosis of different systems. They have high flexibility and robustness in modeling, simulating and diagnosing the behavior of rotating machines, even in the presence of inaccurate input data. They can provide high computational speed for complicated tasks that require rapid response, such as real-time processing of several simultaneous signals. ANNs can also be used to improve the efficiency and productivity of energy use in rotating equipment.
Accident Prediction System Using Machine Learning (IRJET Journal)
This document describes a machine learning model to predict road accident hotspots in Bangalore, India. The researchers collected accident data from government websites and other sources. They used K-means clustering to group similar data points and label them as high or low risk zones. The dataset was preprocessed and split into training and testing sets. A K-means clustering algorithm was trained on the larger training set to create clusters of accident-prone areas based on factors like weather, road conditions, etc. The model can then predict whether new locations belong to a high or low risk cluster. The user interface allows emergency responders and city planners to input a location and get a prediction to help prevent future accidents.
A tutorial on secure outsourcing of large scale computation for big data (redpel dot com)
Performance analysis of binary and multiclass models using azure machine lear... (IJECEIAES)
Network data is expanding at an alarming rate. In addition, the sophisticated attack tools used by hackers lead to a capricious cyber threat landscape. Traditional models proposed in the field of network intrusion detection using machine learning algorithms place more emphasis on improving the attack detection rate and reducing false alarms, while time efficiency is often overlooked. To address this limitation, a modern solution is presented using a Machine Learning-as-a-Service platform. The proposed work analyses the performance of eight two-class and three multiclass algorithms using UNSW NB-15, a modern intrusion detection dataset. 82,332 testing samples were used to evaluate the performance of the algorithms. The proposed two-class decision forest model exhibited 99.2% accuracy and took 6 seconds to learn 175,341 network instances. A multiclass classification task was also undertaken, in which attack types such as generic, exploits, shellcode and worms were classified with recall of 99%, 94.49%, 91.79% and 90.9% respectively by the multiclass decision forest model, which also outperformed the others in terms of training and execution time.
Departure Delay Prediction using Machine Learning.IRJET Journal
This document discusses using machine learning techniques to predict flight delays. It proposes building a machine learning model using airline dataset and applying three machine learning algorithms: Naive Bayes, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). The document describes preprocessing steps like data cleaning, normalization, and feature extraction. It also provides details on the objectives, methodology, system architecture, and scope of the study for developing a machine learning model to forecast flight delays.
This document discusses applying a perceptron neural network model to predict failures in the subsystems of a thermal power plant. It begins by providing background on predictive maintenance and the use of neural networks for failure prediction. It then describes the case study of a 500 MW thermal power plant in East Iran. The document examines the main subsystems of the power plant and factors that influence failures in each. It proposes using a perceptron multilayer neural network trained on 2 years of daily operational data to predict failure times and aid predictive maintenance planning. The goal is to help schedule maintenance visits, procure parts in a timely manner, and reduce storage costs.
A Transfer Learning Approach to Traffic Sign RecognitionIRJET Journal
This document presents a study on traffic sign recognition using transfer learning with three pre-trained convolutional neural network models: InceptionV3, Xception, and ResNet50. The models were trained on the German Traffic Sign Recognition Benchmark dataset containing 43 classes of traffic signs. InceptionV3 achieved the highest test accuracy of 97.15% for traffic sign classification, followed by Xception at 96.79%, while ResNet50 performed poorly with only 60.69% accuracy. Transfer learning with InceptionV3 is shown to be an effective approach for traffic sign recognition tasks.
Similar to Data-Driven Hydrocarbon Production Forecasting Using Machine Learning Techniques (20)
Data-Driven Hydrocarbon Production Forecasting Using Machine Learning Techniques
Abstract— Data analytics utilizes advanced statistical and
machine learning methods to find the concealed information and
trends present in different types of datasets. These methods have
recently shown great potential to solve the problems in the oil and
gas industry. The ability to find insights from large datasets can
make an oil company more profitable and successful.
The development of sophisticated artificial intelligence methods,
together with advances in high-speed computing resources, has
made machine learning techniques more powerful than ever. The
oil and gas industry has benefited from these algorithms, and
machine learning techniques have been applied to many petroleum
engineering challenges.
Artificial Neural Network (ANN), Linear Regression (LR),
and Support Vector Regression (SVR) were employed in this
research to forecast the daily oil production using the Volve oil
field production dataset. All three methods show great potential
for hydrocarbon production forecasting. Results for well NO159-
F-1C, however, indicate that ANN had the best performance
compared to the other two methods. This doesn’t mean that ANN
is the superior method compared to LR and SVR in every
situation. The performance of an algorithm must be examined for
each specific case in order to select the best technique.
Index Terms—Machine Learning, Hydrocarbon production
forecast, ANN, LR, SVR.
I. INTRODUCTION
Data analytics is an evolving area that uses advanced
statistical and machine learning methods to uncover the
concealed information and trends present in different
types of datasets. The ability to find insights from large datasets
can make an oil company more profitable and successful.
Deriving meaningful information from available exploration and
production data can help companies lower costs and raise
efficiency. Machine learning is a tool that helps data
scientists derive such insights from raw data.
Demand for new mathematical and computational methods in
the oil and gas industry is greater than ever. In today’s highly
competitive environment, companies are in a never-ending race
to cut costs and increase production efficiency. The oil and gas
industry, like many other industries, has taken advantage of
recent advances in artificial intelligence. Developments in
modern computers have made established mathematical theories
more powerful in practice than before. These theories provide
engineers and scientists with tools for developing intelligent
machines. Machine learning techniques, as an application of
artificial intelligence, have been widely employed in the
exploration, production, and management of hydrocarbons.
Machine learning techniques have recently attracted
interest in many areas, including mathematics, healthcare,
economics, and engineering. This is due to the development of
sophisticated artificial intelligence methods as well as advances
in high-speed computing resources. The oil and gas industry has
benefited greatly from these improvements. Machine learning techniques have
been applied to many petroleum engineering challenges such as
well logs analysis [1], prediction of bubble point pressure of
crude oils [2], hydrocarbon production forecast [3],
characterization of reservoir heterogeneity [4], prediction of
thermodynamic properties of reservoir fluids [5], forecasting
crude oil viscosity and solution GOR [6], and ultimate recovery
estimation [7]. Artificial intelligence applications connect
input data to output in unconventional ways, which has
attracted the interest of engineers and scientists. Machine
learning techniques help analyze and forecast hydrocarbon
production in highly complex systems where the physical
mechanisms are difficult to understand [3].
In the world of data science, data visualization is essential
to analyze massive amounts of data and make data-driven
decisions. Data visualization is the graphical representation of
datasets. This representation could be in the form of charts,
graphs, and maps. Data visualization helps us to see and find
trends, outliers, and patterns in data. Correlation pairs-plots and
heat-maps are used in this study to visualize data and
understand the trends and correlations among data features.
In this research, production data from Volve field, an oil
field on the Norwegian continental shelf (NCS) [8], was
analyzed using machine learning methods. The operator
Equinor together with the Volve license partners, ExxonMobil
and Bayerngas, has disclosed all subsurface and operating data
from this oil field [9]. Artificial Neural Networks (ANN),
Linear Regression (LR), and Epsilon-Support Vector
Regression (ε-SVR) are used to predict daily oil production
from well-head pressure and well-head temperature as input
features.
Python code was developed to apply the ANN, LR, and SVR
methods to the production data of four different wells from
the Volve field. The prediction results were then compared to
the real data values. All three methods show great potential for
hydrocarbon production forecasting. Results for well NO159-F-1C,
however, indicate that ANN had the best performance
compared to the other two methods. This doesn’t mean that ANN is
the superior method compared to LR and SVR in every situation.
Masoud Safari Zanjani, Mohammad Abdus Salam, and Osman Kandara
Department of Computer Science, Southern University, Baton Rouge, Louisiana, USA
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 18, No. 6, June 2020
65 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
The performance of an algorithm must be examined for each
specific case in order to determine the best technique.
II. METHODOLOGY
Machine learning algorithms find patterns in data and
generate insight to support better decisions [10]. Machine
learning gives computers the ability to learn from examples and
improve their performance, so that they can make decisions
using advanced mathematical algorithms. It is a branch of
Artificial Intelligence (AI), the field concerned with giving
machines human-like behavior.
Many problems in the oil and gas industry involve predicting
a continuous value; in statistical terms these are “regression”
problems. Regression analysis estimates the effect of one or
more independent variables on a dependent variable (or a set
of dependent variables).
A. Linear Regression
Linear Regression (LR) estimates the relationship between
one or more independent variables and a dependent variable by
fitting a linear equation that minimizes the sum of squared
differences between the observed and predicted values [11].
Linear Regression fits a linear model with coefficients
$w = (w_1, \ldots, w_p)$ that minimize the residual sum of squares
between the observed responses in the dataset and the responses
predicted by the linear approximation [12]. Mathematically it
solves a problem of the form:
$$\min_{w} \lVert Xw - y \rVert_2^2$$
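As a small illustration of the least-squares objective, the sketch below fits a one-feature linear model with an intercept using the closed-form solution. The function names `fit_line` and `residual_sum_of_squares` and the sample numbers are illustrative assumptions, not the authors' code or Volve data; the study itself uses two input features.

```python
def fit_line(x, y):
    """Return (slope, intercept) minimizing the residual sum of squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form ordinary least squares for a single feature.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residual_sum_of_squares(x, y, slope, intercept):
    """Sum of squared differences between observed and predicted values."""
    return sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data lying exactly on y = 2x + 1 (not field data).
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(x, y)
rss = residual_sum_of_squares(x, y, slope, intercept)
```

With several features, the same objective is usually solved with a linear algebra routine rather than by hand.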
B. Support Vector Regression (SVR)
In ε-SVR, the objective is to find a function f(x) that deviates
by at most ε from the actually obtained targets y_i for all
the training data and at the same time is as flat as possible [13].
This method relies on a loss function that ignores errors lying
within a certain distance of the true value. This distance is ε,
and this type of function is often called the
epsilon-insensitive loss function [14].
The free parameters in SVR model are C and epsilon. C is
the penalty parameter of the error term and epsilon is the
acceptable deviation from real values. For predicted points
within a distance epsilon from the actual values, no penalty will
be associated in the training loss function [15].
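The role of epsilon can be sketched with a few lines of code. The function below is a minimal illustration of an epsilon-insensitive loss, averaged over the samples; the name `epsilon_insensitive_loss` and the numbers are assumptions for illustration, and the penalty parameter C (which weights this loss against the flatness of the function) is left out.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Mean epsilon-insensitive loss: errors within epsilon cost nothing."""
    losses = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


# Hypothetical targets and predictions (not field data).
y_true = [10.0, 12.0, 15.0]
y_pred = [10.5, 14.0, 15.0]
# Only the second point lies outside the epsilon tube and is penalized.
loss = epsilon_insensitive_loss(y_true, y_pred, epsilon=1.0)
```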
C. Artificial Neural Network (ANN)
An Artificial Neural Network (ANN) is a mathematical
algorithm that tries to simulate the functionality and structure
of biological brains, where neurons are highly connected and
data is processed by learning from the repetition of events [16].
These systems are generally capable of learning to perform a
task from examples, without being programmed with
task-specific rules. An ANN consists of many connected
processing units that work together to process data and
generate meaningful information from it. ANNs can be used
for various data science problems such as classification,
prediction, and pattern recognition.
The basic building block of an artificial neural network is
an artificial neuron, which is a simple mathematical function.
This function has three stages: multiplication, summation,
and activation [17]. At the entrance of the artificial neuron
the inputs are weighted, meaning that every input value is
multiplied by a weight. In the middle section, a summation
function adds all weighted inputs and a bias. Finally, the
result of the summation passes through an activation
function, also called a transfer function (Fig. 1) [17].
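The three stages of a neuron can be written out directly. This is a minimal sketch; the sigmoid activation is an assumption chosen for illustration, since the text does not state which transfer function was used, and the weights are hypothetical.

```python
import math


def sigmoid(z):
    """A common activation (transfer) function; assumed here for illustration."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    # Multiplication: weight each input. Summation: add them up with the bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation: map the weighted sum to the neuron's output.
    return activation(z)


# Hypothetical inputs and weights: 2.0 * 0.5 + 1.0 * (-1.0) + 0.0 = 0.0
output = neuron([0.5, -1.0], [2.0, 1.0], bias=0.0)
```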
The feedforward neural network is the simplest architecture
of artificial neural network (Fig. 2). In this architecture,
neurons (nodes) are arranged in layers, and nodes in adjacent
layers have connections (or edges) between them. Each
connection has an associated weight.
A feedforward neural network has three types of layers:
1. Input layer: no computation happens in this layer and input
nodes just pass the data to the hidden nodes.
2. Hidden layer: nodes in this layer perform computations and
transfer information from the input nodes to the output
nodes. A network can have multiple hidden layers.
Fig. 1. Working principle of an artificial neuron [17]
Fig. 2. An example of feedforward neural network [18].
3. Output layer: nodes in this layer are responsible for
computations and transferring information to outside of the
network.
In a feedforward network, information moves in one direction
only, from the input layer toward the output layer, and there
are no cycles or loops in the network.
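The layer structure described above can be sketched as a single forward pass. The layer sizes, weights, tanh hidden activation, and identity output activation below are all illustrative assumptions; the network configuration used in the study is not given in the text.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: weights[j] holds the weights of node j."""
    return [
        activation(sum(w * x for w, x in zip(node_w, inputs)) + b)
        for node_w, b in zip(weights, biases)
    ]


def feedforward(x, hidden_w, hidden_b, output_w, output_b):
    # The input layer only passes x on; no computation happens there.
    hidden = dense(x, hidden_w, hidden_b, math.tanh)
    # Identity activation on the output node, a common choice for regression.
    return dense(hidden, output_w, output_b, lambda z: z)


# Hypothetical weights: 2 inputs -> 2 hidden nodes -> 1 output.
prediction = feedforward(
    [0.0, 0.0],
    hidden_w=[[0.1, -0.2], [0.4, 0.3]],
    hidden_b=[0.0, 0.0],
    output_w=[[1.0, -1.0]],
    output_b=[0.25],
)
```

Information flows strictly from `x` through `hidden` to the output, with no value fed back to an earlier layer.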
III. RESULTS AND ANALYSIS
Accurate forecasting of hydrocarbon production is a
significant step in the management and planning of a petroleum
reservoir. In this research, three machine learning methods are
employed to forecast oil production. The results and discussion
are presented in this section.
A. Studied Petroleum Field
Building a successful model requires a suitable dataset for
training and testing. Finding suitable data in a usable format is
usually a challenge for machine learning methods. Suitable data
correlates with the target that is to be predicted [19]. The
problem must also be precisely understood before a model is
built.
Data from the Volve field on the Norwegian continental shelf
(NCS) was used for this study [9]. The datasets were released by
the operator, Equinor, in May 2018 [8]. One of the specific
goals of the data release was to allow students to train on real
datasets. This dataset is the most comprehensive and complete
dataset ever gathered on the NCS. It covers production data,
well design, completion string design, seismic data, well logs
(petrophysical and drilling), geological and stratigraphical
data, static and dynamic models, and surface and grid data [8].
The production dataset was used for this study. The
studied wells are NO159-F-1C, NO159-F-5AH, NO159-F-14H,
and NO159-F-15D. For each well, the dataset records the
recording date, average downhole pressure, average downhole
temperature, average drill pipe tubing, average annulus
pressure, average choke size, average well-head pressure,
average well-head temperature, oil volume, gas volume, water
volume, type of flow (production or injection), and well type
(oil production or water injection). The production data was
recorded on a daily basis.
To measure the production data, sensors are usually
installed downhole and at the well-head. The recorded data are
then transferred to monitoring stations to be stored and
analyzed. The process is shown schematically in Fig. 3.
B. Data Visualization
Data visualization is essential to analyze massive amounts
of data and make data-driven decisions. Data visualization
helps us to see and find trends, outliers, and patterns in data.
Each parameter was plotted versus time (day). Fig. 4 shows
the average well-head pressure, average well-head temperature,
and oil production volume for well NO159-F-1C. There are 427
readings, one data reading per day, for this well.
C. Feature Selection
Feature selection is an important concept in machine
learning problems. The data features are used to train the
machine learning model and have a significant influence on
its performance. Feature selection and data cleansing
should be the first steps in building the model.
Feature selection is the process of choosing, either
automatically or manually, the features that contribute to the
predicted value and are correlated with the desired output.
Employing irrelevant features may decrease the accuracy of
the model by making it learn from irrelevant parameters.
Highly correlated parameters, on the other hand, have
the shortcoming of adding little new information to the
process of training the model [20]. Selecting highly correlated
parameters as features may reduce model accuracy
because of the lack of variation in the input data.
Correlation heat-maps were generated in this work to
identify the highly correlated parameters. The Pearson correlation
coefficient is a measure of the linear correlation between two
variables X and Y. This coefficient has a value between +1
and -1, where +1 means total positive linear correlation, 0 means no
linear correlation, and -1 means total negative linear correlation.
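The Pearson coefficient is the covariance of the two variables divided by the product of their standard deviations. A minimal sketch follows; the function name `pearson` and the sample lists are illustrative assumptions, not field data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical series: one rises with x, the other falls with x.
r_pos = pearson([1, 2, 3, 4], [2, 4, 6, 8])
r_neg = pearson([1, 2, 3, 4], [8, 6, 4, 2])
```

Computing this value for every pair of columns gives the matrix that a correlation heat-map displays.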
To make the correlation results easier to read, they are
presented in the form of a heat-map. A heat-map is a
graphical form of data representation that uses
color-coding to represent different values. The results for well
NO159-F-1C are shown in Fig. 5. In this map a positive
correlation is shown in red and a negative correlation is shown
in blue.
As an example, the results in Fig. 5 show
that BORE_OIL_VOL and BORE_GAS_VOL are highly
correlated, with a Pearson correlation coefficient of 0.99. This
means that including both of these parameters would not add new
information to the model, because their data are essentially
linearly related.
The objective of this study is to predict the values of daily
oil production. Well-head pressure and well-head temperature
are selected as training features. These parameters are not
highly correlated (correlation coefficient of -0.42), which means
they do not add redundant information to the model.
Fig. 3. Data recording process.
D. Feature Scaling
Data values usually vary in range, magnitude, and units.
This can cause problems for machine learning algorithms that
rely on the Euclidean distance between data points in their
computations. Feature scaling normalizes the range of the
independent variables (features) to solve this problem.
Min-max scaling is used here to scale each feature to the range
[0, 1]. The general formula for min-max scaling to [0, 1] is:
$$x' = \frac{x - \min(x)}{\max(x) - \min(x)}$$
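Min-max scaling maps each value x to (x - min) / (max - min), so the smallest value becomes 0 and the largest becomes 1. The sketch below shows this for one feature; the function name `min_max_scale`, the handling of a constant feature, and the sample values are illustrative assumptions.

```python
def min_max_scale(values):
    """Scale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no range; map it to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical well-head temperature readings (not field data).
scaled = min_max_scale([20.0, 30.0, 40.0, 60.0])
```

A common practice, not stated in the paper, is to take the minimum and maximum from the training data only and reuse them for the testing data.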
E. Data Partitioning
To build a machine learning model, data is usually
divided into two partitions: a training dataset and a testing
dataset. The training dataset is the part of the data used to
train and fit the model. The testing dataset is used to evaluate
how well the fitted model performs on data it has not seen. Data
points in the training set are excluded from the testing dataset.
In this research, 70 percent of the data was used for training and
the remaining 30 percent for testing. The training and testing
subsets for well NO159-F-1C both have distributions very similar
to that of the total dataset (Fig. 6), which indicates that each
subset is representative of the whole and supports the
development of the predictive model in the next phase.
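A 70/30 partition can be sketched as follows. Shuffling before the split is an assumption made here for illustration; the text does not say how rows were assigned to each subset. The function name `train_test_split` and the seed are likewise hypothetical.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle row indices and split them into disjoint train/test sets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(rows)))
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test


# Stand-in for the 427 daily readings of well NO159-F-1C.
rows = list(range(427))
train, test = train_test_split(rows)
```

For 427 readings this gives 299 training rows and 128 testing rows, and no row appears in both subsets.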
F. Artificial Neural Network (ANN)
The results of the ANN prediction model are presented in this
section. Artificial Neural Networks (ANN) are computational
algorithms intended to simulate the behavior of
biological brains. ANN prediction cross-plots of the total
dataset, training dataset, and testing dataset are given in Fig. 7
for well NO159-F-1C. These graphs show predicted values of
daily oil production versus the real values. The closer the points
are to the 45º line (x = y), the better the forecasting
performance of the model. Fig. 7 shows a high number of data
points falling along the 45º line, indicating good agreement
between the predicted values and the real data values. The
training data show better agreement than the testing data, as
the testing data is not presented to the model at the
model-building step.
(a)
(b)
(c)
Fig. 4. Data plot versus time (day) for well NO159-F-1C; (a) average well-head pressure, (b) average well-head temperature, (c) daily oil production volume.
Fig. 5. Heat-map representation of correlation coefficient values between all
data attributes for well NO159-F-1C.
A point-by-point comparison between the ANN-predicted
values and the real data is shown in Fig. 8 for well NO159-F-1C.
A good match can be observed between the forecasted
values and the real values.
G. Linear Regression (LR)
In Linear Regression the relationship between a dependent
variable and one or more independent variables is modeled by
fitting a linear equation. LR prediction cross-plots of the total
dataset, training dataset, and testing dataset are given in Fig. 9
for well NO159-F-1C. These graphs show predicted values
of daily oil production versus the real values. Fig. 9 shows
a high number of data points falling along the 45º line,
indicating good agreement between the predicted values and the
real data values. As with ANN, the training data show better
agreement than the testing data, as the testing data is
not presented to the model at the model-building step.
A point-by-point comparison between the LR-predicted
values and the real data is shown in Fig. 10 for well NO159-F-1C.
A good match can be observed between the forecasted
values and the real values.
H. Support Vector Regression (SVR)
In ε-SVR, the objective is to find a function f(x) that deviates
by at most ε from the actually obtained targets y_i for all
the training data and at the same time is as flat as possible [13].
SVR prediction cross-plots of the total dataset, training
dataset, and testing dataset are given in Fig. 11 for well
NO159-F-1C. These graphs show predicted values of daily oil
production versus the real values. Comparing Fig. 11 with
Fig. 9 and Fig. 7 shows that data points fall closer to the
45º line for the ANN and LR methods, which means ANN and LR
performed better than SVR in predicting daily
oil production for well NO159-F-1C.
A point-by-point comparison between the SVR-predicted
values and the real data is shown in Fig. 12 for well NO159-F-1C.
Comparing this graph with Fig. 10 and Fig. 8 shows
that ANN and LR predicted the daily oil production more
accurately than SVR.
I. Models Comparison
Picking the right machine learning algorithm for a specific
problem is usually a challenge. There are several statistical
and practical ways to compare different methods. Comparing
overall accuracy alone is probably not sufficient, as
other indicators may need to be investigated depending
on the specific application.
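Two statistical indicators often used for regression models are the root-mean-square error (RMSE) and the coefficient of determination (R²). The paper does not report either value, so the sketch below is only an illustration of how such indicators could be computed; the function names and numbers are assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values (not field data).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
error = rmse(y_true, y_pred)
score = r_squared(y_true, y_pred)
```

Points lying exactly on the 45º line of a cross-plot correspond to an RMSE of 0 and an R² of 1.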
To compare the results of this work visually, the predicted
values from all three methods are drawn in one graph (Fig. 13).
In this graph the real values of daily oil production over the
total dataset are compared to the corresponding values predicted
by ANN, LR, and SVR. As can be seen, ANN performed better
than the other two methods, followed by LR.
(a)
(b)
(c)
Fig. 6. Data distributions for well NO159-F-1C; (a) total dataset, (b) training
dataset, and (c) testing dataset.
(a)
(b)
(c)
Fig. 7. ANN prediction cross-plots of (a) total datasets, (b) training dataset,
and (c) testing dataset for well NO159-F-1C.
Fig. 8. Point-by-point comparison between the ANN predicted values versus the real data for well NO159-F-1C.
(a) (b) (c)
Fig. 9. LR prediction cross-plots of (a) total datasets, (b) training dataset, and (c) testing dataset for well NO159-F-1C.
Fig. 10. Point-by-point comparison between the LR predicted values versus the real data for well NO159-F-1C.
(a) (b) (c)
Fig. 11. SVR prediction cross-plots of (a) total datasets, (b) training dataset, and (c) testing dataset for well NO159-F-1C.
Fig. 12. Point-by-point comparison between the SVR predicted values versus the real data for well NO159-F-1C.
As mentioned before, the objective of this project was to
forecast the daily oil production of a petroleum field using
machine learning techniques. To compare the performance of
the methods in practical terms, the predicted oil production
was also calculated cumulatively. Cumulative produced oil is
an important reservoir management parameter that gives the
total amount of oil produced from a specific well over a given
time period. Only the testing dataset was used to calculate
this parameter, because it is new to the model and was not
used for training. The real cumulative oil production was
calculated from the testing portion of the real data and is
compared to the predicted values in Fig. 14. Here again, ANN
shows the best performance, while LR and SVR take second and
third place, respectively.
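The cumulative comparison described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the daily rates are made-up numbers, and the function and variable names are hypothetical.

```python
from itertools import accumulate


def cumulative_production(daily_rates):
    """Return the running total of daily oil production (e.g. in bbl)."""
    return list(accumulate(daily_rates))


# Hypothetical daily oil rates (bbl/day) over a short testing period.
real_daily = [500.0, 480.0, 450.0, 430.0]

# Hypothetical model predictions for the same days (not results from the paper).
predicted_daily = {
    "ANN": [495.0, 485.0, 445.0, 432.0],
    "LR": [520.0, 470.0, 470.0, 410.0],
}

real_cum = cumulative_production(real_daily)

# Relative error of the final cumulative value for each method.
final_errors = {
    name: abs(cumulative_production(pred)[-1] - real_cum[-1]) / real_cum[-1]
    for name, pred in predicted_daily.items()
}
```

Because daily over- and under-predictions can cancel in the running total, the cumulative curve is best read alongside the point-by-point comparison rather than instead of it.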
IV. CONCLUSION
Oil production prediction is an important input for decision
making in an oil company. It can be used for estimating
reserves, optimizing production operations, business planning,
and evaluating investment scenarios. Production forecasting is
conventionally done with empirical equations. In recent
unconventional resources, however, production prediction is
more challenging because of the extremely low permeability of
the bed rock. Several data-driven techniques were studied here
as potential solutions to the oil production forecasting
problem.
Artificial Neural Network (ANN), Linear Regression (LR),
and Support Vector Regression (SVR) were the machine learning
techniques employed to forecast the daily oil production.
Prediction cross-plots and point-by-point comparisons were
presented for each method. To compare the three methods in
practical terms, the cumulative oil production predicted by
each method was calculated and compared to the real cumulative
oil production for the testing dataset.
All three methods show great potential for hydrocarbon
production forecasting. Results for well NO159-F-1C, however,
indicate that ANN had the best performance of the three
methods, and that LR predicted the production values more
successfully than SVR. Although ANN performed better in this
case study, this does not mean that it is superior to LR and
SVR in general. The performance of machine learning methods
depends greatly on the studied dataset and the characteristics
of the problem; therefore, the performance of each algorithm
must be examined for each specific dataset and problem in
order to select the best technique.
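The per-dataset evaluation recommended above can be sketched as follows. This is an illustrative setup only: the data are synthetic, the hyperparameters are arbitrary, and scikit-learn's MLPRegressor stands in for the ANN as an assumption, since the network implementation used in the study is not stated here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): three input features, one target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 3))
y = 100.0 * X[:, 0] + 50.0 * np.sin(3.0 * X[:, 1]) + rng.normal(0.0, 2.0, size=300)

# Hold out a testing portion that is never used for fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Candidate models; ANN and SVR are scaled because both are scale-sensitive.
models = {
    "ANN": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
    "LR": LinearRegression(),
    "SVR": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0)),
}

# Fit each model on the training data and score it on the held-out data.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# The ranking is specific to this dataset and these settings.
best = max(scores, key=scores.get)
```

Rerunning the same loop on another well or with different inputs may change the ranking, which is why the selection has to be repeated for each case.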
REFERENCES
[1] C. Zhou, X.-L. Wu and J.-A. Cheng, "Determining
reservoir properties in reservoir studies using a fuzzy
neural network," in SPE Annual Technical Conference,
Houston, Texas, 1993.
[2] A. Hashemi-Fath, A. Pouranfard and P. Foroughizadeh,
"Development of an artificial neural network model for
prediction of bubble point pressure of crude oils,"
Petroleum, vol. 4, pp. 281-291, 2018.
[3] P. Panja, R. Velasco, M. Pathak and M. Deo,
"Application of artificial intelligence to forecast
hydrocarbon production from shales," Petroleum, vol. 4,
pp. 75-89, 2018.
[4] S. Mohaghegh, R. Arefi and S. Ameri, "A
Methodological Approach For Reservoir Heterogeneity
Characterization Using Artificial Neural Networks," in
SPE Annual Technical Conference & Exhibition, New
Orleans, 1994.
[5] J. Nagi, T. S. Kiong and S. K. Ahmed, "Prediction of PVT
Properties In Crude Oil Systems Using Support Vector
Machines," in 3rd International Conference on Energy
and Environment, Malacca, Malaysia, 2009.
Fig. 13. Point-by-point comparison between the real data and predicted values by three methods of ANN, LR, and SVR for well NO159-F-1C.
Fig. 14. Comparison of real cumulative oil production to the corresponding
predicted values by three methods of ANN, LR, and SVR for well NO159-F-1C.
[Plot: cumulative oil production (bbl, 0 to 60000) versus time (days, 0 to 140); series: real values, ANN, LR, SVR.]
[6] M. Oloso, A. Khoukhi, A. Abdulazeez and M. Elshafei,
"Prediction of Crude Oil Viscosity and Gas/Oil Ratio
Curves Using Recent Advances to Neural Networks," in
SPE/EAGE Reservoir Characterization and Simulation
Conference, Abu Dhabi, UAE, 2009.
[7] L. Jin, "Machine Learning Aided Production Data
Analysis For Estimated Ultimate Recovery Forecasting,"
M.S. thesis, Texas A&M University, 2018.
[8] "Volve Data Village," Equinor, 18 10 2018. [Online].
Available: https://data.equinor.com/dataset/Volve.
[Accessed 2019].
[9] "Disclosing all Volve data," Equinor, 14 6 2018. [Online].
Available: https://www.equinor.com/en/news/14jun2018-disclosing-volve-data.html.
[Accessed 2019].
[10] "Introducing Machine Learning," MathWorks, 2019.
[11] "Ordinary Least Squares Regression," 04 10 2019. [Online]. Available:
https://www.encyclopedia.com/social-sciences/applied-and-social-sciences-magazines/ordinary-least-squares-regression.
[Accessed 2019].
[12] scikit-learn, "Generalized Linear Models," scikit-learn
developers, 2007. [Online]. Available:
https://scikit-learn.org/stable/modules/linear_model.html.
[Accessed 2019].
[13] A. J. Smola and B. Scholkopf, "A Tutorial on Support
Vector Regression," ESPRIT Working Group in Neural
and Computational Learning, 1998.
[14] "Support Vector Machine Regression," [Online].
Available: http://kernelsvm.tripod.com/. [Accessed
2019].
[15] scikit-learn, "sklearn.svm.SVR," scikit-learn developers,
2007. [Online]. Available:
https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html.
[Accessed 2019].
[16] S. e. Z. Lashari, A. Takbiri-Borujeni, E. Fathi, T. Sun, R.
Rahmani and M. Khazaeli, "Drilling performance
monitoring and optimization: a data-driven approach,"
Journal of Petroleum Exploration and Production
Technology, 2019.
[17] A. Krenker, J. Bešter and A. Kos, "Introduction to the
Artificial Neural Networks," in Artificial Neural
Networks - Methodological Advances and Biomedical
Applications, InTech, 2011, p. 362.
[18] ujjwalkarn, "A Quick Introduction to Neural Networks,"
9 8 2016. [Online]. Available:
https://ujjwalkarn.me/2016/08/09/quick-intro-neural-networks/.
[Accessed 2019].
[19] C. Nicholson, "A.I. Wiki," Skymind, [Online]. Available:
https://skymind.ai/wiki/datasets-ml. [Accessed 2019].
[20] M. Kuhn and K. Johnson, Applied Predictive Modeling,
New York: Springer Science+Business Media, 2013.