LANGLEY RESEARCH CENTER
NASA Langley Core Technical Areas
Aerosciences
Atmospheric Characterization
Entry, Descent & Landing
Intelligent Flight Systems
Measurement Systems
Systems Analysis & Concepts
Advanced Materials & Structural Systems
Comprehensive Digital Transformation
Modeling & Simulation
• Physics-based understanding and simulation – Improved
discipline tools
• Integrated analysis and design of complex systems
• Optimally combine testing and M&S
• Open ideation and innovation
• Focused, relevant research
• Intelligent and rapid system designs
• Agile response to emerging missions
Advanced IT
• Open, secure collaboration with NASA & partners
• Networks handle burgeoning data
• Data governance, architecture, and management
High Performance Computing
• Next generation code development
• Rapid Compute power for M&S and BDA&MI
• Architecture for real-time analysis and design
• Rapid synthesis and digestion of global scientific information for
knowledge extraction, insights and answers
• Mining of diverse computational and experimental data sets for
new correlations, discoveries and advanced designs
• Virtual/Digital Experts – Human Machine symbiosis
External collaboration is of paramount importance
Enable Innovative Solutions to Complex NASA Challenges in
Aeronautics, Exploration, and Science
Big Data Analytics and Machine Intelligence
Vision: Virtual Research and Design Partner
Projects and Pilots
Towards
Virtual Partners
Two Key Areas for Virtual Partner – Data Intensive
Scientific Discovery
Deriving new insights, correlations, and discoveries not otherwise
possible from our diverse experimental and computational data sets
The Fourth Paradigm
Projects & Pilots (cut across Technical Areas)
• Anomaly Detection in the Nondestructive Evaluation images of Materials
• Predicting Flutter from Aeroelasticity Data
• Cognitive Assessment of Crew/Pilot State
• Knowledge Bot for Complex Simulation Software Optimization
• Rapid Exploration of Aerospace Designs
• Entry, Descent, and Landing Trajectory Data Analysis
The variety of techniques used in these projects and pilots represents a
cross-cutting approach to solving complex, physics-based problems in multiple
aerospace domains with diverse datasets.
Anomaly Detection in the Non-Destructive
Evaluation Images of Materials
Predicting Flutter from
Aeroelasticity Data
Develop techniques and algorithms to automatically detect anomalies during
the nondestructive evaluation of materials
Goals
• Significantly reduce SME analysis time and help experts discover
additional anomalies
• Help to design better material compositions and structures
Techniques
• Two-Dimensional Regression designed to detect anomalous pixels
• Convolutional Neural Networks to classify the image data
Accomplishments & Next Steps
• Algorithms are validated with real data sets and being enhanced
• Deliver a tool with a good UI for SMEs to use as an ‘Assistant’ for anomaly
detection of composite materials analysis in March
Develop methods to automatically detect the onset of flutter during wind
tunnel testing
Goals
• Find new ways of predicting flutter in the time domain
• Identify non-traditional predictor variables and unseen patterns
• Better understand precursors to flutter and improve configurations
Techniques
• Piecewise Regression to locate and track structural modes & coalescence
• Time Series Motifs to identify signatures in the data that could represent
precursors to flutter
Accomplishments & Next Steps
• Peak detection tested with multiple datasets
• Several significant time series motifs detected
• Generating synthetic data for validation of algorithms
Data Intensive Scientific Discovery Projects - 1
Pilot Cognitive State Assessment
Rapid Exploration of Aerospace Designs
Build classification models for predicting cognitive state using
physiological data collected during flight simulations
Goals
• Identify unsafe cognitive states in aircrew in real time
• Apply results for more effective pilot training
Techniques
• Ensemble of machine learning tools (deep neural network, gradient
boosting, random forest, support vector machine, decision tree)
• Data pre-processing using detrending and power spectral density
Accomplishments & Next Steps
• Initial data mapping, statistical analysis, and signals processing
• Support classification efforts on single modalities
• Explore combining multiple signal models using ensembling
Develop a generalized machine learning platform to be used for
analyzing mod-sim data for design optimization
Goals
• Provide surrogate modeling to explore the trade space of aerospace
vehicle designs with an easy-to-use web interface
• Use fast machine learning models instead of computationally
intensive code for rapid exploration and optimization
Techniques
• Supervised machine learning algorithms, SVM and Neural Networks,
will be trained on labeled data
Accomplishments & Next Steps
• Python 2.7 with scikit-learn algorithms is being used
• Windows Server 2012 with PHP set up for web interface
Data Intensive Scientific Discovery Projects - 2
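The detrending and power spectral density pre-processing listed for the cognitive state project can be sketched as follows. This is only an illustration under stated assumptions, not the team's pipeline: the function names, the 64 Hz sampling rate, and the synthetic 10 Hz test signal are hypothetical, and a direct DFT periodogram is used purely to keep the example free of dependencies.

```python
import cmath
import math

def detrend_linear(x):
    """Subtract the least-squares straight line from a sequence of samples."""
    n = len(x)
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    slope = sxy / sxx
    return [v - (x_mean + slope * (t - t_mean)) for t, v in enumerate(x)]

def periodogram(x, fs):
    """One-sided periodogram estimate of power spectral density (direct DFT)."""
    n = len(x)
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(x))
        power = abs(coeff) ** 2 / (fs * n)
        if 0 < k < n / 2:
            power *= 2.0  # fold in the matching negative-frequency bin
        freqs.append(k * fs / n)
        psd.append(power)
    return freqs, psd

# Hypothetical signal: a slow linear drift plus a 10 Hz oscillation.
fs = 64.0
n = 128
signal = [0.05 * t + math.sin(2 * math.pi * 10.0 * t / fs) for t in range(n)]

freqs, psd = periodogram(detrend_linear(signal), fs)
# Strongest non-DC frequency bin.
peak_freq = freqs[max(range(1, len(psd)), key=psd.__getitem__)]
```

In practice the band powers taken from such a spectrum would become features for the classifiers; which bands are used is not stated on the slide.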
Current State
SME pre-selects data to be analyzed and analyzes relying on traditional
methods; Requires expertise and is time-consuming
Being Developed
Long term Vision
Algorithms that mimic SME knowledge:
• Validate the algorithm
• Save SME time
Application of algorithms to entire
dataset and to other legacy datasets
Yields New Insights
Virtual Expert
Autonomous Assistant to SME that analyzes all
possible data and augments decision making
Aerospace Data Analytics
Challenge of Physics-Based Algorithms
All Data
SME-defined subset
of data for analysis
Being Developed
Data Mining techniques to detect
patterns and correlations which will be
validated by SMEs
Data Analytics Team and SMEs working together
Two Key Areas for Virtual Partner - Knowledge Analytics
Obtaining insights, identifying trends, aiding in discovery, and finding
answers to specific questions by mining knowledge from scholarly,
web, and multimedia content
Cognitive Computing
Knowledge Assistants
Using Watson Content Analytics
Aerospace Innovation
Advisor POC
Using Watson Discovery Advisor
• Carbon Nanotubes Research
• Autonomous Flight Research
• Space Radiation Research
Example Topics:
• Hybrid Electric Propulsion
• On Demand Mobility
Cognitive-based systems are able to build knowledge and learn, through understanding natural language, and to reason and interact more
naturally with human beings than traditional systems. They are also able to put content into context with confidence-weighted responses and
supporting evidence. They use natural language processing, machine learning, and speech recognition technologies.
• Digest and analyze thousands of articles without reading them, with the
ability to dissect the content interactively
• Automatically identify a subset of documents from a large corpus and
provide brief summaries
• Provide a means to rapidly identify trends and connections
• Identify experts and connections among them at all levels and
affiliations
• Explore technology gaps that could be leveraged
• Replace traditional methods of SMEs manually reviewing and tracking
research
• Help to identify cross-domain leverages and research
Autonomous Flight
Carbon Nanotubes Research
Space Radiation Research
Knowledge Assistants
Using Watson Content Analytics
Metadata of 130,000 articles from ~20 years of literature analyzed
Identified experts, trends, insights, and connections
Buying scholarly content is a challenge
Metadata and full text of 4,000 articles analyzed
Integrate analysis of scholarly and informal web
content to identify experts and new partnerships
Metadata and full text of 1,000 articles analyzed.
Identified possible duplications, connections and
technology gaps. In the process of analyzing all six
elements of the Human Research Program
Key Capabilities
Successfully demonstrated value and developed robust expertise; Buying licensed content is a challenge
Being expanded to analyze corpus topics such as Systems Design, Uncertainty Quantification, and Entry, Descent & Landing
Cognitive Computing : Systematic and repeatable approach to learning
• Understand scientific and domain language
• Adapt and learn quickly from inquiries, results, selections and iteration
• Compose and visualize information at large
• Generate new hypotheses and discoveries
…built on a massively parallel Big data scalable architecture
Domain content; Extraction at scale; Visualize at scale; Discover at scale; Learn with speed
Watson Discovery Advisor
Accelerate the discovery of new insights by
synthesizing information in seconds
• Take advantage of massive sources of data
• Find answers to questions that have not been asked
yet or answered before
• Find insights into hidden relationships and dig deeper
• Generate leads to hard questions and provide
evidence to substantiate new claims
• Being used in medicine both as diagnostics/treatment
and research advisors
Cognitive Technologies for
Aerospace Engineering
Apply cognitive computing technologies that ‘understand’ massive
amounts of information and enhance experts’ abilities
Starting to explore application to our domains: key challenges are
adaptation to engineering and licensing the scholarly content
Aerospace Innovation Advisor
Proof of Concept (March – June 16)
Example Topics: Hybrid Electric Propulsion;
On Demand Mobility
Ames Research Center:
Aircraft Dispatcher Assistant: Feasibility Study
Armstrong Flight Research Center:
Pilot Assistant: Investigation
Johnson Space Center:
Astronaut Health Investigations
LaRC is connected with all of these efforts
Algorithms and Software
Linear Regression
Application 1: Non-Destructive Evaluation (NDE)
Image Analysis
Goal: Automate delamination detection
Method: Fit data with linear regression and detect
outlier regions. Regression performed on 1D and
2D signals; Using C++ code and R
Application 2: Aeroelastic Flutter Data Analytics
Goal: Detect precursors and onset of aeroelastic
flutter
Method: Fit best quadratics between structural
modes to detect mode coalescence; Using MATLAB
Top: Linear regression of 1D-signals for anomaly detection in
carbon fiber; Bottom: Mode identification in flutter time-series
data using linear regression
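The "fit with linear regression and detect outlier regions" idea for 1D signals can be sketched as below. This is a minimal illustration, not the team's C++/R implementation: the robust threshold based on the median absolute deviation, the function names, and the synthetic signal are assumptions introduced here.

```python
import math
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def outlier_indices(xs, ys, threshold=3.5):
    """Indices whose residual from the fitted line is unusually large.

    Residuals are scaled by the median absolute deviation (MAD), so a few
    anomalous samples do not inflate the scale they are judged against.
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    center = statistics.median(residuals)
    mad = statistics.median([abs(r - center) for r in residuals])
    scale = 1.4826 * mad  # MAD-to-standard-deviation factor for Gaussian noise
    if scale == 0.0:
        return []
    return [i for i, r in enumerate(residuals)
            if abs(r - center) / scale > threshold]

# Hypothetical 1D scan line: linear trend, small ripple, and a short anomaly.
xs = list(range(50))
ys = [0.5 * x + 2.0 + 0.1 * math.sin(x) for x in xs]
for i in (24, 25, 26):
    ys[i] += 1.0  # injected anomalous region

flagged = outlier_indices(xs, ys)
```

Adjacent flagged indices can then be merged into regions; for 2D data the same residual test would be applied to a fitted surface instead of a line.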
Gaussian Process
Application: Knowledge Bot for
Optimizing Complex Simulation Software
Goal: Emulate simulation to predict
convergence/divergence
Method: Gaussian Process for emulator
and to find next best point to maximize
knowledge; Using Python
Justification: Approach followed in
literature for weather simulation
emulation
Methodology:
1. Create Initial Points
2. Evaluate Point(s) on Simulator (FUN3D)
3. Create/Update Emulator
4. Find Next Best Point, then return to step 2
Finding boundary of two circles; ~99% accuracy
FUN3D; ~83% accuracy
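The emulator loop (create initial points, evaluate on the simulator, update the emulator, find the next best point) can be sketched in one dimension as follows. Everything specific here is an assumption for illustration: the toy simulator stands in for an expensive FUN3D run, the kernel and its length scale are arbitrary, the emulator is a GP regression on +1/-1 labels rather than a true GP classifier, and "next best point" is taken to mean the candidate with the largest predictive variance, which is one common choice and not necessarily the one the team used.

```python
import math

def rbf(a, b, length=0.15):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def gp_predict(train_x, train_y, query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at one query point."""
    gram = [[rbf(a, b) + (noise if i == j else 0.0)
             for j, b in enumerate(train_x)]
            for i, a in enumerate(train_x)]
    k_star = [rbf(query, a) for a in train_x]
    alpha = solve(gram, train_y)
    weights = solve(gram, k_star)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(query, query) - sum(k * w for k, w in zip(k_star, weights))
    return mean, max(var, 0.0)

def toy_simulator(x):
    """Hypothetical stand-in for an expensive run: +1 converged, -1 diverged."""
    return 1.0 if x < 0.6 else -1.0

def next_best_point(train_x, train_y, candidates):
    """Candidate where the emulator is least certain."""
    return max(candidates, key=lambda c: gp_predict(train_x, train_y, c)[1])

candidates = [i / 50 for i in range(51)]
train_x = [0.0, 1.0]                              # create initial points
train_y = [toy_simulator(x) for x in train_x]     # evaluate on simulator
for _ in range(4):
    x_new = next_best_point(train_x, train_y, candidates)  # find next best point
    train_x.append(x_new)
    train_y.append(toy_simulator(x_new))          # emulator is rebuilt on next call
```

The sign of the posterior mean then serves as the emulator's convergence prediction at unevaluated settings.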
Time Series Motifs
Application:
Pattern Mining of Time Domain Aeroelastic Flutter Data
Goal:
Identify flutter precursors to:
• Create a dictionary of motifs for a given configuration
• Classify data for use with machine learning
algorithms that will support a real-time ‘Flutter
Assistant’
Method: Application of the open source Motif Enumeration
(MOEN) algorithm created by Dr. Abdullah Mueen, using
MATLAB
Justification: MOEN has been successfully applied to
research problems in other scientific domains including
robotics, biology, and seismology
In order to detect motifs across the various sensor signals, a given sensor’s
output (Signal A) is compared to another sensor (Signal B) by creating a
composite signal (Signal A/B).
The algorithm is then applied to the composite signal to detect the motifs
(above right) common to both sensors. Significant motifs are identified by a
physics-based selection process and then validated by SMEs.
Scott, Robert C., et al. "Aeroservoelastic Wind-Tunnel Test of the SUGAR Truss Braced Wing
Wind-Tunnel Model." 56th AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials
Conference. 2015.
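The composite-signal idea described above (join Signal A and Signal B, then look for a pattern that occurs in both) can be illustrated with a brute-force search. This is explicitly not MOEN: it simply finds the closest pair of z-normalized windows with one window in each sensor's half of the composite. The function names, the fixed window width, and the synthetic signals are assumptions.

```python
import math

def z_normalize(window):
    """Scale a window to zero mean and unit variance; None if it is flat."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std == 0.0:
        return None
    return [(v - mean) / std for v in window]

def best_cross_motif(signal_a, signal_b, width):
    """Closest pair of windows, one from each half of the composite signal.

    Returns (distance, start_in_a, start_in_b).
    """
    composite = list(signal_a) + list(signal_b)
    boundary = len(signal_a)
    starts_a = range(0, boundary - width + 1)
    starts_b = range(boundary, len(composite) - width + 1)
    normalized = {s: z_normalize(composite[s:s + width])
                  for s in list(starts_a) + list(starts_b)}
    best = (math.inf, None, None)
    for i in starts_a:
        for j in starts_b:
            if normalized[i] is None or normalized[j] is None:
                continue  # flat windows would match each other trivially
            distance = math.sqrt(sum((p - q) ** 2
                                     for p, q in zip(normalized[i], normalized[j])))
            if distance < best[0]:
                best = (distance, i, j - boundary)
    return best

# Hypothetical sensors with different backgrounds and one shared burst.
burst = [0.0, 2.0, -2.0, 3.0, -3.0, 2.0, -2.0, 0.0]
signal_a = [0.3 * math.sin(0.7 * t) for t in range(40)]
signal_b = [0.3 * math.cos(1.3 * t) for t in range(50)]
signal_a[10:18] = burst
signal_b[25:33] = burst

distance, start_a, start_b = best_cross_motif(signal_a, signal_b, len(burst))
```

The quadratic search is only practical for short records; algorithms such as MOEN exist to make this kind of enumeration feasible on long time series and across a range of motif lengths.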
Deep Learning: Convolutional
Neural Network (CNN)
Application: NDE Image Analysis to Segment
Delaminations
Method: Convolutional encoder/decoder neural
network; end-to-end training to map raw data to
segmentation; Using Caffe and Lua/Torch
Justification: Very successful in medical image
analysis such as wound segmentation (top right)
Results on Simulated Data
Results on Experimental Data
From Wang, Changhan, et al. "A unified framework for automatic wound segmentation and analysis with deep convolutional neural
networks." Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE, 2015.
Artificial neural networks (ANN)
Application 1: Crew State Monitoring
Goal: Build classification models capable of
accurate, real-time prediction of aircrew
cognitive state using physio data collected during
flight simulations
Method: ANN trained to classify cognitive state
Application 2: Rapid Exploration of Aerospace
Designs (READ)
Goal: Build classification / regression models on
user-uploaded simulated data
Method: train ANNs on labeled data, use trained
models for prediction and visualization
Inputs: EEG, ECG, Galvanic Skin Response, Respiration Rate, Eye Tracking
Feature Generation
Input Layer, Hidden Layer, Output Layer / Classification
Classes: “Normal” State, Channelized Attention, Diverted Attention, Startle / Surprise
Ensemble of Machine Learning Techniques
Application 1: Non-Destructive Evaluation (NDE)
Image Analysis
Goal: Automate delamination detection
Method: Combine several machine learning models
into overall prediction using regression to determine
if sample contains a delamination; Using
Python/Scikit-Learn
Application 2: Pilot Cognitive State Assessment
Goal: Build classification models capable of real-
time prediction of pilot cognitive state using
physiological data collected during flight simulations
Method: Utilize 2-level Meta Model combining
multiple classification algorithms to improve
classification accuracy; Using Theano and Python
• Random forests: Fully grow k independent classification trees and combine predictions
• Extremely random forests: Similar to random forests, but the split per node is also randomized
• Ada Boost: Fit consecutive weak learners based on classification tree stumps and combine predictions
• Gradient boosting: Similar to Ada Boost with a different loss function
• k-nearest neighbors: Identify the k closest points to the test sample based on a distance metric
Inputs: EEG, ECG, Galvanic Skin Response, Respiration Rate, Eye Tracking
Feature Generation
Level 1 Models: Artificial Neural Network, Gradient Boost Classifier, Random Forest
Level 2 Meta Model: Artificial Neural Network
Outputs: “Normal” State, Channelized Attention, Diverted Attention, Startle / Surprise
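The two-level arrangement, where Level 1 models feed a Level 2 meta model, can be sketched with deliberately simple stand-ins. None of the models below are the ones named on the slide: each Level 1 model is a one-feature threshold scorer and the Level 2 model is a small logistic regression, chosen only so the example runs without external libraries. The two synthetic features, the binary labels, and all names are assumptions.

```python
import math
import random

def fit_threshold_model(values, labels):
    """Level 1 stand-in: signed distance from the midpoint of the class means."""
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    mean0 = sum(class0) / len(class0)
    mean1 = sum(class1) / len(class1)
    midpoint = (mean0 + mean1) / 2.0
    direction = 1.0 if mean1 >= mean0 else -1.0
    return lambda v: direction * (v - midpoint)

def fit_logistic(rows, labels, steps=300, rate=0.1):
    """Level 2 stand-in: logistic regression fitted by batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)  # bias followed by one weight per input
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for row, y in zip(rows, labels):
            z = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
            error = 1.0 / (1.0 + math.exp(-z)) - y
            grads[0] += error
            for i, v in enumerate(row):
                grads[i + 1] += error * v
        weights = [w - rate * g / len(rows) for w, g in zip(weights, grads)]
    return weights

def make_samples(rng, count):
    """Hypothetical two-feature data; class means sit at -2 and +2."""
    samples, labels = [], []
    for i in range(count):
        y = i % 2
        center = 2.0 if y == 1 else -2.0
        samples.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        labels.append(y)
    return samples, labels

rng = random.Random(0)
level1_x, level1_y = make_samples(rng, 100)   # trains the Level 1 models
level2_x, level2_y = make_samples(rng, 100)   # held out to train the meta model
test_x, test_y = make_samples(rng, 100)

level1_models = [fit_threshold_model([s[f] for s in level1_x], level1_y)
                 for f in range(2)]

def level1_scores(sample):
    return [model(sample[f]) for f, model in enumerate(level1_models)]

meta_weights = fit_logistic([level1_scores(s) for s in level2_x], level2_y)

def stacked_predict(sample):
    scores = level1_scores(sample)
    z = meta_weights[0] + sum(w * v for w, v in zip(meta_weights[1:], scores))
    return 1 if z > 0 else 0

accuracy = sum(stacked_predict(s) == y for s, y in zip(test_x, test_y)) / len(test_x)
```

Training the meta model on data the Level 1 models have not seen is the design point worth keeping: it stops the second level from simply learning how well the first level memorized its training set.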
Clustering: K-Means
Application: Knowledge Analytics
Goal: Automatically group thousands of
documents into useful clusters
Method: Utilizing IBM Watson Content
Analytics; K-means is a scalable means of
clustering large datasets
More than 130,000 documents are grouped into
20 clusters; accuracy validated by SMEs
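For readers unfamiliar with K-means, the grouping step can be sketched as below. The project itself used IBM Watson Content Analytics, so this is only a generic illustration of the algorithm: the two-dimensional points stand in for document feature vectors, and the deterministic farthest-point initialization is an assumption made to keep the example reproducible.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_point_init(points, k):
    """Start from the first point, then repeatedly add the point that is
    farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids)))
    return centroids

def k_means(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical feature vectors forming three well-separated groups.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (0, 10), (1, 10), (0, 11)]
labels, centroids = k_means(points, 3)
```

Each pass costs time proportional to the number of points times the number of clusters, which is why the method scales to large document collections.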
Key Insights,
Challenges and Next Steps
• Focus on ‘Big Analytics’; Big Data is not just about volume
• Data Science is a team effort - Computer Science; Statistics; Mathematics; Subject Matter
Expertise/SME
• Strong collaboration with SMEs is essential and critical; algorithm development is iterative
• Machine learning techniques have to be adapted to our data; thoroughly researching how
similar problems are being addressed and adapting those approaches is important
• Understanding the problem domain and associated data is essential, difficult & time consuming
• Use Pilots to learn to apply techniques to science/Eng. domains; Use a Phased Approach
• Leverage extensive work and research happening, and the open source platforms
• NASA, Federal, Universities, Industry
• Application of data science/analytics to aerospace research domains is in early stages
• Be ready for challenges; it is a journey with huge potential
Key Insights
• Treading a new path to develop the capability for very specialized disciplines
• Using a mix of research, experimentation, innovation, and persistence
• Application of techniques to aerospace domains needs maturity and resources
• Difficulty in establishing ground truth and labeled data for ML models; data sets are diverse, complex
and unique
• Takes multiple years to achieve big goals; balance that with short term milestones to
demonstrate value and generate buy-in & resources
• Provide a suite of tools and techniques for broad use: Machine Learning/Analytics
• Expand the big data movement to get broad buy-in and propagate the expertise
Challenges and Next Steps
Acknowledgements – Big Data Analytics Team
Data Analytics and Machine Learning Expertise :
Manjula Ambur, Lin Chen, Christina Heinich, Charles Liles, Robert Milletich,
Daniel Sammons, Ted Sidehamer, and Jeremy Yagle
Subject Matter Expertise:
Damodar Ambur, Danette Allen, Eric Burke, Kyle Ellis, Christie Funk, Dana
Hammond, Angela Harrivel, Jeff Herath, Patty Howell, Constantine Lukashin,
Alan Pope, Brandi Quam, Cheryl Rose, Jamshid Samareh, Mark Sanetrik, Rob
Scott, Lisa Scott-Carnell, Steve Scotti, Walt Silva, Mia Siochi, Chad Stephens,
Scott Striepe, Marty Waszak, Bill Winfree, and Kristopher Wise

AIAA Conference - Big Data Session_ Final - Jan 2016

  • 1. L A N G L E Y R E S E A R C H C E N T E R
  • 2. NASA Langley Core Technical Areas Aerosciences Atmospheric Characterization Entry, Descent & Landing Intelligent Flight SystemsMeasurement Systems Systems Analysis & Concepts Advanced Materials & Structural Systems
  • 3. Comprehensive Digital Transformation Modeling & Simulation • Physics-based understanding and simulation – Improved discipline tools • Integrated analysis and design of complex systems • Optimally combine testing and M&S  Open ideation and innovation  Focused, relevant research  Intelligent and rapid system designs  Agile response to emerging missions Advanced IT • Open, secure collaboration with NASA & partners • Networks handle burgeoning data • Data governance, architecture, and management High Performance Computing • Next generation code development • Rapid Compute power for M&S and BDA&MI • Architecture for real-time analysis and design • Rapid synthesis and digestion of global scientific information for knowledge extraction, insights and answers • Mining of diverse computational and experimental data sets for new correlations, discoveries and advanced designs • Virtual/Digital Experts – Human Machine symbiosis External collaboration is of paramount importance Enable Innovative Solutions to Complex NASA Challenges in Aeronautics, Exploration, and Science
  • 4. Big Data Analytics and Machine Intelligence Vision: Virtual Research and Design Partner
  • 6. Two Key Areas for Virtual Partner – Data Intensive Scientific Discovery Deriving new insights, correlations, and discoveries not otherwise possible from our diverse experimental and computational data sets The Fourth Paradigm Projects & Pilots ( cuts across Technical Areas) • Anomaly Detection in the Nondestructive Evaluation images of Materials • Predicting Flutter from Aeroelasticity Data • Cognitive Assessment of Crew/Pilot State • Knowledge Bot for Complex Simulation Software Optimization • Rapid Exploration of Aerospace Designs • Entry, Descent, and Landing Trajectory Data Analysis The variety of techniques used in these projects and pilots represents a cross- cutting approach to solving complex, physics-based problems in multiple aerospace domains with diverse datasets -
  • 7. Anomaly Detection in the Non-Destructive Evaluation Images of Materials Predicting Flutter from Aeroelasticity Data Develop techniques and algorithms to automatically detect anomalies during the nondestructive evaluation of materials Goals • Significantly reduce SME analysis time and help experts discover additional anomalies • Help to design better material compositions and structures Techniques • Two-Dimensional Regression designed to detect anomalous pixels • Convolutional Neural Networks to classify the image data Accomplishments & Next Steps • Algorithms are validated with real data sets and being enhanced • Deliver a tool with a good UI for SMEs to use as an ‘Assistant’ for anomaly detection of composite materials analysis in March Develop methods to automatically detect the onset of flutter during wind tunnel testing Goals • Find new ways of predicting flutter in the time domain • Identify non-traditional predictor variables and unseen patterns • Better understand precursors to flutter and improve configurations Techniques • Piecewise Regression to locate and track structural modes & coalescence • Time Series Motifs to identify signatures in the data that could represent precursors to flutter Accomplishments & Next Steps • Peak detection tested with multiple datasets • Several significant time series motifs detected • Generating synthetic data for validation of algorithms Data Intensive Scientific Discovery Projects - 1
  • 8. Pilot Cognitive State Assessment Rapid Exploration of Aerospace Designs Build classification models for predicting cognitive state using physiological data collected during flight simulations Goals • Identify unsafe cognitive states in aircrew real-time • Apply results for more effective pilot training Techniques • Ensemble of machine learning tools (deep neural network, gradient boosting, random forest, support vector machine, decision tree) • Data pre-processing using detrending and power spectral density Accomplishments & Next Steps • Initial data mapping, statistical analysis, and signals processing • Support classification efforts on single modalities • Explore combining multiple signal models using ensembling Develop a generalized machine learning platform to be used for analyzing mod-sim data for design optimization Goals • Provide surrogate modeling to explore the trade space of aerospace vehicle designs with easy to use web interface • Use fast machine learning models instead of computationally- intensive code for rapid exploration and optimization Techniques • Supervised machine learning algorithms, SVM and Neural Networks, will be trained on labeled data Accomplishments & Next Steps • Python 2.7 with SKLearn algorithms are being used • Windows Server 2012 with PHP set up for web interface Data Intensive Scientific Discovery Projects - 2
  • 9. Current State SME pre-selects data to be analyzed and analyzes relying on traditional methods; Requires expertise and is time-consuming Being Developed Long term Vision Algorithms that mimic SME knowledge: • Validate the algorithm • Save SME time Application of algorithms to entire dataset and to other legacy datasets Yields New Insights Virtual Expert Autonomous Assistant to SME that analyzes all possible data and augments decision making Aerospace Data Analytics Challenge of Physics-Based Algorithms All Data SME-defined subset of data for analysis Being Developed Data Mining techniques to detect patterns and correlations which will be validated by SMEs Data Analytics Team and SMEs working together
  • 10. Two Key Areas for Virtual Partner - Knowledge Analytics Obtaining insights, identifying trends, aiding in discovery, and finding answers to specific questions by mining knowledge from scholarly, web, and multimedia content Cognitive Computing Knowledge Assistants Using Watson Content Analytics Aerospace Innovation Advisor POC Using Watson Discovery Advisor • Carbon Nanotubes Research • Autonomous Flight Research • Space Radiation Research Example Topics: • Hybrid Electric Propulsion • On Demand Mobility Cognitive-based systems are able to build knowledge and learn, through understanding natural language, to reason and interact more naturally with human beings than traditional systems. They are also able to put content into context with confidence-weighted responses and supporting evidence. Uses Natural language processing, machine learning and speech recognition technologies.
  • 11. Knowledge Assistants Using Watson Content Analytics. Key Capabilities: • Digest and analyze thousands of articles without reading them, with the ability to dissect the content interactively • Automatically identify a subset of documents from a large corpus and provide brief summaries • Provide a means to rapidly identify trends and connections • Identify experts and connections among them at all levels and affiliations • Explore technology gaps that could be leveraged • Replace traditional methods of SMEs manually reviewing and tracking research • Help to identify cross-domain leverages and research. Topics: Autonomous Flight; Carbon Nanotubes Research; Space Radiation Research. • Metadata of 130,000 articles covering ~20 years of literature analyzed; identified experts, trends, insights, and connections; buying scholarly content is a challenge • 4000 metadata and full-text articles analyzed; integrating analysis of scholarly and informal web content to identify experts and new partnerships • 1000 metadata and full-text articles analyzed; used to identify possible duplications, connections and technology gaps; in the process of analyzing all six elements of the Human Research Program. Successfully demonstrated value and developed robust expertise; buying licensed content is a challenge. Being expanded to analyze corpus topics such as Systems Design, Uncertainty Quantification, and Entry, Descent and Landing
  • 12. Cognitive Computing: Systematic and repeatable approach to learning. Understand scientific and domain language. Adapt and learn quickly from inquiries, results, selections and iteration. Compose and visualize information at large … built on a massively parallel, scalable Big Data architecture. Domain content: Extraction at scale; Visualize at scale; Discover at scale; Learn with speed; Generate new hypotheses and discoveries
  • 13. Watson Discovery Advisor: Accelerate the discovery of new insights by synthesizing information in seconds • Take advantage of massive sources of data • Find answers to questions that have not been asked or answered before • Find insights into hidden relationships and dig deeper • Generate leads to hard questions and provide evidence to substantiate new claims • Being used in medicine both as diagnostics/treatment and research advisors. Cognitive Technologies for Aerospace Engineering: Apply cognitive computing technologies that ‘understand’ massive amounts of information and enhance experts' abilities. Starting to explore application to our domains: key challenges are adaptation to engineering and licensing the scholarly content. Aerospace Innovation Advisor Proof of Concept (March – June 16): Example Topics: Hybrid Electric Propulsion; On Demand Mobility. Ames Research Center: Aircraft Dispatcher Assistant: Feasibility Study. Armstrong Flight Research Center: Pilot Assistant: Investigation. Johnson Space Center: Astronaut Health Investigations. LaRC is connected with all of these efforts
  • 15. Linear Regression Application 1: Non-Destructive Evaluation (NDE) Image Analysis Goal: Automate delamination detection Method: Fit data with linear regression and detect outlier regions. Regression performed on 1D and 2D signals; Using C++ code and R Application 2: Aeroelastic Flutter Data Analytics Goal: Detect precursors and onset of aeroelastic flutter Method: Fit best quadratics between structural modes to detect mode coalescence; Using MATLAB Top: Linear regression of 1D-signals for anomaly detection in carbon fiber; Bottom: Mode identification in flutter time-series data using linear regression
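The NDE method on slide 15 (fit the data with linear regression, then detect outlier regions) might look roughly like the sketch below for a 1D signal. The original work used C++ and R; this Python version, the function name `find_outlier_regions`, the robust MAD-based threshold, and the synthetic signal are assumptions chosen purely for illustration.

```python
import numpy as np


def find_outlier_regions(signal, threshold=5.0):
    """Fit a straight line to a 1D signal and return (start, end) index
    pairs of contiguous runs whose residuals are unusually large."""
    x = np.arange(len(signal))
    slope, intercept = np.polyfit(x, signal, deg=1)
    residuals = signal - (slope * x + intercept)

    # Robust spread estimate (median absolute deviation), so the
    # anomaly itself does not inflate the threshold too much.
    centered = residuals - np.median(residuals)
    scale = 1.4826 * np.median(np.abs(centered))
    flagged = np.abs(centered) > threshold * scale

    # Merge consecutive flagged samples into regions.
    regions, start = [], None
    for i, is_outlier in enumerate(flagged):
        if is_outlier and start is None:
            start = i
        elif not is_outlier and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(signal) - 1))
    return regions


# Synthetic scan line (assumed): a linear background with noise and a
# step-like anomaly standing in for a delamination between samples 80-89.
rng = np.random.default_rng(0)
x = np.arange(200)
scan = 0.02 * x + 1.0 + rng.normal(0, 0.05, x.size)
scan[80:90] += 1.0

regions = find_outlier_regions(scan)
print(regions)
```

For 2D data, the same idea could be applied row by row or with a plane fit; which variant the team used is not stated on the slide.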
  • 16. Gaussian Process. Application: Knowledge Bot for Optimizing Complex Simulation Software. Goal: Emulate the simulation to predict convergence/divergence. Method: Gaussian Process for the emulator and to find the next best point to maximize knowledge; Using Python. Justification: Approach followed in literature for weather simulation emulation. Methodology: (1) Create Initial Points; (2) Evaluate Point(s) on Simulator (FUN3D); (3) Create/Update Emulator; (4) Find Next Best Point. Results: Finding boundary of two circles: ~99% accuracy; FUN3D: ~83% accuracy
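The four-step loop on slide 16 can be sketched on the "two circles" toy problem the slide mentions. Everything below is an assumption for illustration: `toy_simulator` is a cheap stand-in for FUN3D, the circle geometry and iteration count are invented, and choosing the candidate whose predicted probability is closest to 0.5 is just one simple way to pick a "next best point"; the slide does not say which acquisition rule the team used.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF


def toy_simulator(points):
    """Stand-in for an expensive simulation: 1 = converges, 0 = diverges.
    'Convergence' here means the input lies inside one of two circles."""
    in_a = np.sum((points - [0.3, 0.3]) ** 2, axis=1) < 0.25 ** 2
    in_b = np.sum((points - [0.7, 0.7]) ** 2, axis=1) < 0.25 ** 2
    return (in_a | in_b).astype(int)


def fit_emulator(X, y):
    kernel = 1.0 * RBF(length_scale=0.2)
    return GaussianProcessClassifier(kernel=kernel, random_state=0).fit(X, y)


rng = np.random.default_rng(0)

# 1. Create initial points (a coarse grid that covers both outcomes).
axis = np.linspace(0.1, 0.9, 5)
X = np.array([[a, b] for a in axis for b in axis])
# 2. Evaluate them on the simulator.
y = toy_simulator(X)

pool = rng.uniform(0, 1, size=(2000, 2))  # candidate inputs
n_iterations = 30
for _ in range(n_iterations):
    # 3. Create/update the emulator.
    emulator = fit_emulator(X, y)
    # 4. Find the next best point: the most uncertain candidate.
    prob = emulator.predict_proba(pool)[:, 1]
    idx = int(np.argmin(np.abs(prob - 0.5)))
    x_next = pool[idx:idx + 1]
    X = np.vstack([X, x_next])
    y = np.append(y, toy_simulator(x_next))
    pool = np.delete(pool, idx, axis=0)

emulator = fit_emulator(X, y)

# Check the emulator against the simulator on a dense grid.
g = np.linspace(0, 1, 50)
grid = np.array([[a, b] for a in g for b in g])
accuracy = float(np.mean(emulator.predict(grid) == toy_simulator(grid)))
print(f"emulator accuracy on grid: {accuracy:.3f}")
```

The point of the loop is that each new simulator run is spent near the convergence boundary, where the emulator is least sure, rather than on inputs whose outcome is already predictable.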
  • 17. Time Series Motifs. Application: Pattern Mining of Time Domain Aeroelastic Flutter Data. Goal: Identify flutter precursors to: • Create a dictionary of motifs for a given configuration • Classify data for use with machine learning algorithms that will support a real-time ‘Flutter Assistant’. Method: Application of the open source Motif Enumeration (MOEN) algorithm created by Dr. Abdullah Mueen, using MATLAB. Justification: MOEN has been successfully applied to research problems in other scientific domains including robotics, biology, and seismology. In order to detect motifs across the various sensor signals, a given sensor’s output (Signal A) is compared to another sensor (Signal B) by creating a composite signal (Signal A/B). The algorithm is then applied to the composite signal to detect the motifs (above right) common to both sensors. Significant motifs are identified by a physics-based selection process and then validated by SMEs. Scott, Robert C., et al. "Aeroservoelastic Wind-Tunnel Test of the SUGAR Truss Braced Wing Wind-Tunnel Model." 56th AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference. 2015.
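To make the idea of a time series motif concrete, here is a brute-force sketch that finds the single most similar pair of non-overlapping, z-normalized subsequences of one fixed length. It is not MOEN (which enumerates motifs across a range of lengths far more efficiently) and it does not reproduce the composite-signal step; the function names and the synthetic data are assumptions for illustration only.

```python
import numpy as np


def znorm(window):
    """Scale a window to zero mean and unit variance (flat windows -> zeros)."""
    std = window.std()
    if std == 0:
        return np.zeros_like(window)
    return (window - window.mean()) / std


def find_motif_pair(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences, compared after z-normalization."""
    n = len(series) - m + 1
    windows = np.array([znorm(series[i:i + m]) for i in range(n)])
    best = (np.inf, -1, -1)
    for i in range(n):
        first = i + m  # skip windows that overlap window i
        if first >= n:
            break
        dists = np.linalg.norm(windows[first:] - windows[i], axis=1)
        k = int(np.argmin(dists))
        if dists[k] < best[0]:
            best = (float(dists[k]), i, first + k)
    return best


# Synthetic sensor trace (assumed): noise with the same 40-sample
# oscillation burst planted at samples 50 and 250.
rng = np.random.default_rng(0)
series = rng.normal(0, 1, 400)
burst = 3 * np.sin(2 * np.pi * np.arange(40) / 40)
series[50:90] = burst + rng.normal(0, 0.05, 40)
series[250:290] = burst + rng.normal(0, 0.05, 40)

distance, i, j = find_motif_pair(series, m=40)
print(f"motif pair at {i} and {j}, distance {distance:.3f}")
```

A dictionary of motifs, as described in the slide's goal, would collect many such recurring shapes per configuration; this sketch only shows how one pair is scored.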
  • 18. Deep Learning: Convolutional Neural Network (CNN) Application: NDE Image Analysis to Segment Delaminations Method: Convolutional encoder/decoder neural network; end-to-end training to map raw data to segmentation; Using Caffe and Lua/Torch Justification: Very successful in medical image analysis such as wound segmentation (top right) Results on Simulated Data Results on Experimental Data From Wang, Changhan, et al. "A unified framework for automatic wound segmentation and analysis with deep convolutional neural networks." Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE, 2015.
  • 19. Artificial Neural Networks (ANN). Application 1: Crew State Monitoring. Goal: Build classification models capable of accurate, real-time prediction of aircrew cognitive state using physiological data collected during flight simulations. Method: ANN trained to classify cognitive state. Application 2: Rapid Exploration of Aerospace Designs (READ). Goal: Build classification / regression models on user-uploaded simulated data. Method: Train ANNs on labeled data; use trained models for prediction and visualization. Diagram: Inputs (EEG, ECG, Galvanic Skin Response, Respiration Rate, Eye Tracking) → Feature Generation → Input Layer → Hidden Layer → Output Layer / Classification (“Normal” State, Channelized Attention, Diverted Attention, Startle / Surprise)
  • 20. Ensemble of Machine Learning Techniques. Application 1: Non-Destructive Evaluation (NDE) Image Analysis. Goal: Automate delamination detection. Method: Combine several machine learning models into an overall prediction using regression to determine if a sample contains a delamination; Using Python/Scikit-Learn. Application 2: Pilot Cognitive State Assessment. Goal: Build classification models capable of real-time prediction of pilot cognitive state using physiological data collected during flight simulations. Method: Utilize a 2-level Meta Model combining multiple classification algorithms to improve classification accuracy; Using Theano and Python. Techniques: • Random forests: Fully grow k independent classification trees and combine predictions • Extremely random forests: Similar to random forests but the split per node is also randomized • AdaBoost: Fit consecutive weak learners based on classification tree stumps and combine predictions • Gradient boosting: Similar to AdaBoost with a different loss function • k-nearest neighbors: Identify the k closest points to the test sample based on a distance metric. Diagram: Inputs (EEG, ECG, Galvanic Skin Response, Respiration Rate, Eye Tracking) → Feature Generation → Level 1 Models (Artificial Neural Network, Gradient Boost Classifier, Random Forest) → Level 2 Meta Model (Artificial Neural Network) → “Normal” State, Channelized Attention, Diverted Attention, Startle / Surprise
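The 2-level meta model in the diagram on slide 20 (three level-1 classifiers feeding a neural-network level-2 model) resembles what scikit-learn calls stacking. The sketch below uses `StackingClassifier` as a stand-in; the slide says the team used Theano and Python, so this is not their implementation. The synthetic four-class data, layer sizes, and estimator counts are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for generated physiological features with four
# cognitive-state labels (assumed data, not from the flight simulations).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Level 1: the three model families named in the slide's diagram.
level1_models = [
    ("ann", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    )),
    ("gbc", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Level 2: a small neural network trained on the level-1 class
# probabilities, which are produced out-of-fold via 5-fold CV.
meta_model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)

stack = StackingClassifier(estimators=level1_models, final_estimator=meta_model, cv=5)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
print(f"stacked accuracy: {accuracy:.3f}")
```

Training the level-2 model on out-of-fold predictions matters: if it saw level-1 predictions made on data those models were fitted to, it would learn to trust them more than they deserve.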
  • 21. Clustering: K-Means. Application: Knowledge Analytics. Goal: Automatically group thousands of documents into useful clusters. Method: Utilizing IBM Watson Content Analytics; K-means is a scalable means of clustering large datasets. Result: More than 130,000 documents are grouped into 20 clusters; accuracy validated by SMEs
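As a small-scale illustration of K-means document clustering, the sketch below groups six invented abstracts into two clusters using TF-IDF vectors. The slide's work was done in IBM Watson Content Analytics on more than 130,000 documents; scikit-learn, the toy documents, and the choice of two clusters are stand-ins assumed for this example.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented mini-corpus touching two of the deck's example topics.
documents = [
    "carbon nanotube composite strength and carbon nanotube yarn",
    "growth of carbon nanotube fibers for composite structures",
    "carbon nanotube sheets improve composite tensile strength",
    "space radiation dose for astronaut crew on deep space missions",
    "shielding astronaut crew from space radiation exposure",
    "space radiation risk models for astronaut health",
]

# Represent each document as a TF-IDF weighted bag of words.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(documents)

# Group the documents; n_init restarts guard against a poor random start.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(tfidf)

for label, text in zip(labels, documents):
    print(label, text)
```

At corpus scale the same two steps apply, but choosing the number of clusters and checking that they are meaningful is where the SME validation mentioned on the slide comes in.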
  • 23. Key Insights • Focus on ‘Big Analytics’; Big Data is not just about volume • Data Science is a team effort: Computer Science, Statistics, Mathematics, and Subject Matter Expertise (SME) • Strong collaboration with SMEs is essential; algorithm development is iterative • Machine learning techniques have to be adapted to our data; thoroughly researching how similar problems are being addressed and adapting those approaches is important • Understanding the problem domain and associated data is essential, difficult, and time-consuming • Use pilots to learn to apply techniques to science and engineering domains; use a phased approach • Leverage the extensive work and research happening (NASA, Federal, Universities, Industry) and the open source platforms • Application of data science/analytics to aerospace research domains is in its early stages • Be ready for challenges; it is a journey with huge potential
  • 24. Challenges and Next Steps • Treading a new path to develop the capability for very specialized disciplines • Using a mix of research, experimentation, innovation, and persistence • Application of techniques to aerospace domains needs maturity and resources • Difficulty in establishing ground truth and labeled data for ML models; data sets are diverse, complex and unique • Takes multiple years to achieve big goals; balancing that with short-term milestones to demonstrate value and generate buy-in and resources • Provide a suite of machine learning and analytics tools and techniques for broad use • Expand the big data movement to get broad buy-in and propagate the expertise
  • 25. Acknowledgements – Big Data Analytics Team Data Analytics and Machine Learning Expertise : Manjula Ambur, Lin Chen, Christina Heinich, Charles Liles, Robert Milletich, Daniel Sammons, Ted Sidehamer, and Jeremy Yagle Subject Matter Expertise: Damodar Ambur, Danette Allen, Eric Burke, Kyle Ellis, Christie Funk, Dana Hammond, Angela Harrivel, Jeff Herath, Patty Howell, Constantine Lukashin, Alan Pope, Brandi Quam, Cheryl Rose, Jamshid Samareh, Mark Sanetrik, Rob Scott, Lisa Scott-Carnell, Steve Scotti, Walt Silva, Mia Siochi, Chad Stephens, Scott Striepe, Marty Waszak, Bill Winfree, and Kristopher Wise

Editor's Notes

  1. Welcome. Introductions. Review Agenda
  2. GPU = graphics processing unit. OGA = other government agencies. R. Lightfoot: “Partnerships beyond just getting coffee.” In other words, accomplishing real, complex work via partnerships. Bold rectangles are the CDT foundation. Explicit, intentional, and robust integration with the dashed ovals (Experimentation, Test, NASA Centers, external partners). Explanation of ties between Vision benefits and CDT: NASA missions propelled by digital advances: the Virtual Capabilities are the focal point of this. Digital grand challenges that solve significant challenges for the missions, such as a virtual flight test capability. Robust, mission-focused partnerships: Advanced IT knowledge systems and collaboration directly support this. Also, standardized interfaces among our and partners’ models, sims, etc. Agile response to emerging missions: Virtual capabilities and the overall digital transformation architecture enable agile, flexible, rapid response. Easier to rearrange electrons than people or buildings. Streamline ideation & invention: Big data & machine intelligence can help automate & analyze selected parts of this process. In addition, the entire CDT framework is intended to help people ideate, conceptualize, design, test, and produce results faster & more efficiently than ever before. Faster, better research & design cycles: M&S and big data techniques can help find issues in designs earlier in the process. Less costly to rearrange electronic designs than to re-cast / test parts. Also enabled by automated manufacturing. Reduce excess margins: M&S and analytic techniques can contribute significantly to reducing uncertainties, resulting in right-sizing rule-of-thumb margins instead of simply stacking them. Maximize global contributors: Advanced IT collaboration techniques enable NASA to leverage the best brains worldwide. Solve entirely new NASA problems: The overall framework is aimed at accomplishing existing & emerging missions far better. In addition, the right tools & brainpower can enable NASA to take on missions which seem impossible today.
  3. Main point: At the core of what makes Watson different are three powerful technologies: natural language, hypothesis generation, and evidence-based learning. But Watson is more than the sum of its individual parts. Watson is about bringing these capabilities together in a way that’s never been done before, resulting in a fundamental change in the way businesses look at quickly solving problems. Further speaking points: Looking at these one by one, understanding natural language and the way we speak breaks down the communication barrier that has stood in the way between people and their machines for so long. Hypothesis generation bypasses the historic deterministic way that computers function and recognizes that there are various probabilities of various outcomes rather than a single definitive ‘right’ response. And adaptation and learning helps Watson continuously improve in the same way that humans learn: it keeps track of which of its selections were selected by users and which responses got positive feedback, thus improving future response generation. Additional information: The result is a machine that functions alongside us as an assistant rather than something we wrestle with to get an adequate outcome.