2. NASA Langley Core Technical Areas
Aerosciences
Atmospheric Characterization
Entry, Descent & Landing
Intelligent Flight Systems
Measurement Systems
Systems Analysis & Concepts
Advanced Materials & Structural Systems
3. Comprehensive Digital Transformation
Modeling & Simulation
• Physics-based understanding and simulation – Improved
discipline tools
• Integrated analysis and design of complex systems
• Optimally combine testing and M&S
Open ideation and innovation
Focused, relevant research
Intelligent and rapid system designs
Agile response to emerging missions
Advanced IT
• Open, secure collaboration with NASA & partners
• Networks handle burgeoning data
• Data governance, architecture, and management
High Performance Computing
• Next generation code development
• Rapid Compute power for M&S and BDA&MI
• Architecture for real-time analysis and design
• Rapid synthesis and digestion of global scientific information for
knowledge extraction, insights and answers
• Mining of diverse computational and experimental data sets for
new correlations, discoveries and advanced designs
• Virtual/Digital Experts – Human Machine symbiosis
External collaboration is of paramount importance
Enable Innovative Solutions to Complex NASA Challenges in
Aeronautics, Exploration, and Science
4. Big Data Analytics and Machine Intelligence
Vision: Virtual Research and Design Partner
6. Two Key Areas for Virtual Partner – Data Intensive
Scientific Discovery
Deriving new insights, correlations, and discoveries not otherwise
possible from our diverse experimental and computational data sets
The Fourth Paradigm
Projects & Pilots (cutting across Technical Areas)
• Anomaly Detection in the Nondestructive Evaluation images of Materials
• Predicting Flutter from Aeroelasticity Data
• Cognitive Assessment of Crew/Pilot State
• Knowledge Bot for Complex Simulation Software Optimization
• Rapid Exploration of Aerospace Designs
• Entry, Descent, and Landing Trajectory Data Analysis
The variety of techniques used in these projects and pilots represents a
cross-cutting approach to solving complex, physics-based problems in multiple
aerospace domains with diverse datasets.
7. Anomaly Detection in the Non-Destructive Evaluation Images of Materials
Predicting Flutter from Aeroelasticity Data
Develop techniques and algorithms to automatically detect anomalies during
the nondestructive evaluation of materials
Goals
• Significantly reduce SME analysis time and help experts discover
additional anomalies
• Help to design better material compositions and structures
Techniques
• Two-Dimensional Regression designed to detect anomalous pixels
• Convolutional Neural Networks to classify the image data
Accomplishments & Next Steps
• Algorithms have been validated with real data sets and are being enhanced
• Deliver a tool with a good UI in March for SMEs to use as an ‘Assistant’ for
anomaly detection in composite materials analysis
Develop methods to automatically detect the onset of flutter during wind
tunnel testing
Goals
• Find new ways of predicting flutter in the time domain
• Identify non-traditional predictor variables and unseen patterns
• Better understand precursors to flutter and improve configurations
Techniques
• Piecewise Regression to locate and track structural modes & coalescence
• Time Series Motifs to identify signatures in the data that could represent
precursors to flutter
Accomplishments & Next Steps
• Peak detection tested with multiple datasets
• Several significant time series motifs detected
• Generating synthetic data for validation of algorithms
Data Intensive Scientific Discovery Projects - 1
8. Pilot Cognitive State Assessment
Rapid Exploration of Aerospace Designs
Build classification models for predicting cognitive state using
physiological data collected during flight simulations
Goals
• Identify unsafe cognitive states in aircrew in real time
• Apply results for more effective pilot training
Techniques
• Ensemble of machine learning tools (deep neural network, gradient
boosting, random forest, support vector machine, decision tree)
• Data pre-processing using detrending and power spectral density
Accomplishments & Next Steps
• Initial data mapping, statistical analysis, and signal processing
• Support classification efforts on single modalities
• Explore combining multiple signal models using ensembling
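The pre-processing step named above (detrending followed by power spectral density) can be sketched as follows. This is a minimal illustration on a synthetic channel, not the project's pipeline: the function names, band edges, sampling rate, and the signal itself are all assumptions made for the example, and the periodogram is left unnormalized because only relative band power is used.

```python
import numpy as np

def detrend_linear(signal):
    """Subtract the least-squares straight line from a 1D signal."""
    t = np.arange(signal.size)
    slope, intercept = np.polyfit(t, signal, deg=1)
    return signal - (slope * t + intercept)

def relative_band_power(signal, fs, bands):
    """Return the fraction of periodogram power that falls in each band."""
    detrended = detrend_linear(signal)
    freqs = np.fft.rfftfreq(detrended.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(detrended)) ** 2  # unnormalized periodogram
    total = power[freqs > 0].sum()
    return {
        name: power[(freqs >= lo) & (freqs < hi)].sum() / total
        for name, (lo, hi) in bands.items()
    }

# Synthetic stand-in for one physiological channel (assumed, not real data):
# a 10 Hz rhythm, a slow linear drift, and measurement noise.
fs = 128.0
t = np.arange(0, 8.0, 1.0 / fs)
rng = np.random.default_rng(0)
channel = np.sin(2 * np.pi * 10.0 * t) + 0.5 * t + 0.2 * rng.standard_normal(t.size)

# Band edges in Hz are illustrative choices.
bands = {"theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}
features = relative_band_power(channel, fs, bands)
print(features)
```

Band-power fractions like these are one plausible form of feature vector to pass to the classifiers listed under Techniques.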
Develop a generalized machine learning platform to be used for
analyzing mod-sim data for design optimization
Goals
• Provide surrogate modeling to explore the trade space of aerospace
vehicle designs with an easy-to-use web interface
• Use fast machine learning models instead of computationally
intensive code for rapid exploration and optimization
Techniques
• Supervised machine learning algorithms, SVM and Neural Networks,
will be trained on labeled data
Accomplishments & Next Steps
• Python 2.7 with scikit-learn (SKLearn) algorithms is being used
• Windows Server 2012 with PHP set up for web interface
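The surrogate-modeling idea above can be sketched with scikit-learn. The snippet below is an illustration under stated assumptions: `expensive_simulation` is a hypothetical cheap stand-in for a costly analysis code, the two design variables and all hyperparameters are invented for the example, and it targets a current Python 3 scikit-learn rather than the Python 2.7 setup mentioned on the slide.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def expensive_simulation(x):
    """Hypothetical stand-in for a computationally intensive analysis code."""
    return x[:, 0] ** 2 + 0.5 * np.sin(3.0 * x[:, 1])

# Labeled data: sampled designs and their simulated responses.
rng = np.random.default_rng(0)
designs = rng.uniform(-1.0, 1.0, size=(300, 2))
responses = expensive_simulation(designs)
X_train, X_test, y_train, y_test = train_test_split(
    designs, responses, test_size=0.25, random_state=0
)

# Train the surrogate (support vector regression with an RBF kernel).
surrogate = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
surrogate.fit(X_train, y_train)
r2 = surrogate.score(X_test, y_test)  # coefficient of determination on held-out designs

# Explore the trade space with the fast surrogate instead of the simulator.
g = np.linspace(-1.0, 1.0, 41)
grid = np.array([[a, b] for a in g for b in g])
best_design = grid[np.argmin(surrogate.predict(grid))]
print(r2, best_design)
```

Once trained, the surrogate evaluates the whole grid in a single call, which is what makes rapid exploration feasible when each true simulation run is expensive.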
Data Intensive Scientific Discovery Projects - 2
9. Aerospace Data Analytics: Challenge of Physics-Based Algorithms
Current State
• SME pre-selects the data to be analyzed (an SME-defined subset of all data) and analyzes it with traditional methods; this requires expertise and is time-consuming
Being Developed
• Algorithms that mimic SME knowledge: validate the algorithm; save SME time
• Application of algorithms to the entire dataset and to other legacy datasets; yields new insights
• Data mining techniques to detect patterns and correlations, which will be validated by SMEs
Long-Term Vision
• Virtual Expert: autonomous assistant to the SME that analyzes all possible data and augments decision making
Data Analytics Team and SMEs working together
10. Two Key Areas for Virtual Partner - Knowledge Analytics
Obtaining insights, identifying trends, aiding in discovery, and finding
answers to specific questions by mining knowledge from scholarly,
web, and multimedia content
Cognitive Computing
Knowledge Assistants, using Watson Content Analytics
• Carbon Nanotubes Research
• Autonomous Flight Research
• Space Radiation Research
Aerospace Innovation Advisor POC, using Watson Discovery Advisor
Example Topics:
• Hybrid Electric Propulsion
• On Demand Mobility
Cognitive systems build knowledge and learn by understanding natural language, so they can reason and interact with people more
naturally than traditional systems. They can also put content into context, giving confidence-weighted responses with
supporting evidence. They use natural language processing, machine learning, and speech recognition technologies.
11. Knowledge Assistants Using Watson Content Analytics
Key Capabilities
• Digest and analyze thousands of articles without reading them, with the ability to dissect the content interactively
• Automatically identify a subset of documents from a large corpus and provide brief summaries
• Provide a means to rapidly identify trends and connections
• Identify experts and the connections among them at all levels and affiliations
• Explore technology gaps that could be leveraged
• Replace traditional methods of SMEs manually reviewing and tracking research
• Help to identify cross-domain leverage and research
Topics: Autonomous Flight; Carbon Nanotubes Research; Space Radiation Research
• Metadata for 130,000 articles covering ~20 years of literature analyzed; identified experts, trends, insights, and connections; buying scholarly content is a challenge
• 4,000 metadata and full-text articles analyzed; integrating analysis of scholarly and informal web content to identify experts and new partnerships
• 1,000 metadata and full-text articles analyzed; identified possible duplications, connections, and technology gaps; in the process of analyzing all six elements of the Human Research Program
Successfully demonstrated value and developed robust expertise; buying licensed content is a challenge
Being expanded to analyze such corpus topics as Systems Design, Uncertainty Quantification, and Entry, Descent, and Landing
12. Cognitive Computing: Systematic and repeatable approach to learning
• Understand scientific and domain language
• Adapt and learn quickly from inquiries, results, selections, and iteration
• Compose and visualize information at large
• Generate new hypotheses and discoveries
…built on a massively parallel, big-data scalable architecture
Domain content; Extraction at scale; Visualize at scale; Discover at scale; Learn with speed
13. Watson Discovery Advisor
Accelerate the discovery of new insights by
synthesizing information in seconds
• Take advantage of massive sources of data
• Find answers to questions that have not been asked
yet or answered before
• Find insights into hidden relationships and dig deeper
• Generate leads to hard questions and provide
evidence to substantiate new claims
• Being used in medicine both as diagnostics/treatment
and research advisors
Cognitive Technologies for
Aerospace Engineering
Apply cognitive computing technologies that ‘understand’ massive
amounts of information and enhance experts' abilities
Starting to explore application to our domains: key challenges are
adoption in engineering and licensing the scholarly content
Aerospace Innovation Advisor
Proof of Concept (March – June 16)
Example Topics: Hybrid Electric Propulsion;
On Demand Mobility
Ames Research Center:
Aircraft Dispatcher Assistant: Feasibility Study
Armstrong Flight Research Center:
Pilot Assistant: Investigation
Johnson Space Center:
Astronaut Health Investigations
LaRC is connected with all of these efforts
15. Linear Regression
Application 1: Non-Destructive Evaluation (NDE)
Image Analysis
Goal: Automate delamination detection
Method: Fit data with linear regression and detect
outlier regions. Regression performed on 1D and
2D signals; Using C++ code and R
Application 2: Aeroelastic Flutter Data Analytics
Goal: Detect precursors and onset of aeroelastic
flutter
Method: Fit best quadratics between structural
modes to detect mode coalescence; Using MATLAB
Top: Linear regression of 1D-signals for anomaly detection in
carbon fiber; Bottom: Mode identification in flutter time-series
data using linear regression
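The regression-based outlier idea in Application 1 can be illustrated on a 1D signal. The deck's implementation used C++ and R; the Python sketch below is only an analogous example. The synthetic scan line, the planted defect, and the robust threshold (median absolute deviation with a factor of 5) are assumptions chosen for illustration.

```python
import numpy as np

def regression_outliers(signal, threshold=5.0):
    """Fit a straight line and flag samples whose residual is unusually large."""
    x = np.arange(signal.size)
    slope, intercept = np.polyfit(x, signal, deg=1)
    residuals = signal - (slope * x + intercept)
    center = np.median(residuals)
    # Robust spread estimate: scaled median absolute deviation (MAD).
    scale = 1.4826 * np.median(np.abs(residuals - center))
    return np.flatnonzero(np.abs(residuals - center) > threshold * scale)

# Synthetic 1D scan line (assumed data): linear trend plus noise,
# with a localized jump standing in for a defect signature.
rng = np.random.default_rng(1)
n = 200
scan = 0.01 * np.arange(n) + 0.05 * rng.standard_normal(n)
scan[120:130] += 1.0

flagged = regression_outliers(scan)
print(flagged)
```

A robust spread estimate is used so that the anomalous region itself does not inflate the threshold; a 2D variant would fit a plane to image patches and flag pixels in the same way.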
16. Gaussian Process
Application: Knowledge Bot for
Optimizing Complex Simulation Software
Goal: Emulate simulation to predict
convergence/divergence
Method: Gaussian Process for emulator
and to find the next best point to maximize
knowledge; Using Python
Justification: Approach followed in
literature for weather simulation
emulation
Methodology: Create Initial Points → Evaluate Point(s) on Simulator (FUN3D) → Create/Update Emulator → Find Next Best Point
Results: Finding boundary of two circles, ~99% accuracy; FUN3D, ~83% accuracy
17. Time Series Motifs
Application:
Pattern Mining of Time Domain Aeroelastic Flutter Data
Goal:
Identify flutter precursors to:
• Create a dictionary of motifs for a given configuration
• Classify data for use with machine learning
algorithms that will support a real-time ‘Flutter
Assistant’
Method: Application of the open-source Motif Enumeration
(MOEN) algorithm created by Dr. Abdullah Mueen, using
MATLAB
Justification: MOEN has been successfully applied to
research problems in other scientific domains including
robotics, biology, and seismology
In order to detect motifs across the various sensor signals, a given sensor’s
output (Signal A) is compared to another sensor (Signal B) by creating a
composite signal (Signal A/B).
The algorithm is then applied to the composite signal to detect the motifs
(above right) common to both sensors. Significant motifs are identified by a
physics-based selection process and then validated by SMEs.
Scott, Robert C., et al. "Aeroservoelastic Wind-Tunnel Test of the SUGAR Truss Braced Wing
Wind-Tunnel Model." 56th AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials
Conference. 2015.
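The cross-sensor motif search described above can be illustrated with a brute-force closest-pair search. This is not MOEN, which enumerates motifs efficiently over a range of lengths; it is a fixed-length stand-in. It also assumes the composite signal is a simple concatenation of Signal A and Signal B, which the slide does not specify, and it uses synthetic signals with a planted pattern.

```python
import numpy as np

def znorm(window):
    """Z-normalize a window so that matching is based on shape, not offset or scale."""
    std = window.std()
    return (window - window.mean()) / std if std > 0 else np.zeros_like(window)

def cross_sensor_motif(signal_a, signal_b, m):
    """Return (distance, start_in_a, start_in_b) of the closest pair of length-m windows."""
    composite = np.concatenate([signal_a, signal_b])  # assumed form of "Signal A/B"
    split = signal_a.size
    windows = np.array(
        [znorm(composite[s:s + m]) for s in range(composite.size - m + 1)]
    )
    best = (np.inf, -1, -1)
    for i in range(split - m + 1):  # windows lying fully inside Signal A
        # Compare against windows lying fully inside Signal B.
        dists = np.linalg.norm(windows[split:] - windows[i], axis=1)
        j = int(np.argmin(dists))
        if dists[j] < best[0]:
            best = (float(dists[j]), i, j)
    return best

# Synthetic sensors (assumed data): noise with the same chirp planted in both.
rng = np.random.default_rng(2)
m = 40
k = np.arange(m)
pattern = 3.0 * np.sin(2 * np.pi * 4.0 * (k / m) ** 2)
sensor_a = rng.standard_normal(200)
sensor_b = rng.standard_normal(200)
sensor_a[50:90] = pattern + 0.05 * rng.standard_normal(m)
sensor_b[120:160] = pattern + 0.05 * rng.standard_normal(m)

distance, start_a, start_b = cross_sensor_motif(sensor_a, sensor_b, m)
print(distance, start_a, start_b)
```

Restricting the pairs to one window per sensor keeps the search focused on patterns common to both signals rather than repeats within a single signal.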
18. Deep Learning: Convolutional
Neural Network (CNN)
Application: NDE Image Analysis to Segment
Delaminations
Method: Convolutional encoder/decoder neural
network; end-to-end training to map raw data to
segmentation; Using Caffe and Lua/Torch
Justification: Very successful in medical image
analysis such as wound segmentation (top right)
Results on Simulated Data
Results on Experimental Data
From Wang, Changhan, et al. "A unified framework for automatic wound segmentation and analysis with deep convolutional neural
networks." Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE, 2015.
19. Artificial neural networks (ANN)
Application 1: Crew State Monitoring
Goal: Build classification models capable of
accurate, real-time prediction of aircrew
cognitive state using physiological data collected during
flight simulations
Method: ANN trained to classify cognitive state
Application 2: Rapid Exploration of Aerospace
Designs (READ)
Goal: Build classification / regression models on
user-uploaded simulated data
Method: Train ANNs on labeled data, then use the trained
models for prediction and visualization
Inputs: EEG; ECG; Galvanic Skin Response; Respiration Rate; Eye Tracking
Feature Generation → Input Layer → Hidden Layer → Output Layer / Classification
Output classes: “Normal” State; Channelized Attention; Diverted Attention; Startle / Surprise
20. Ensemble of Machine Learning Techniques
Application 1: Non-Destructive Evaluation (NDE)
Image Analysis
Goal: Automate delamination detection
Method: Combine several machine learning models
into overall prediction using regression to determine
if sample contains a delamination; Using
Python/Scikit-Learn
Application 2: Pilot Cognitive State Assessment
Goal: Build classification models capable of real-
time prediction of pilot cognitive state using
physiological data collected during flight simulations
Method: Utilize 2-level Meta Model combining
multiple classification algorithms to improve
classification accuracy; Using Theano and Python
Random forests: Fully grow k independent classification trees and combine predictions
Extremely random forests: Similar to random forests, but the split per node is also randomized
AdaBoost: Fit consecutive weak learners based on classification tree stumps and combine predictions
Gradient boosting: Similar to AdaBoost with a different loss function
k-nearest neighbors: Identify the k closest points to the test sample based on a distance metric
Inputs: EEG; ECG; Galvanic Skin Response; Respiration Rate; Eye Tracking
Feature Generation
Level 1 Models: Artificial Neural Network; Gradient Boost Classifier; Random Forest
Level 2 Meta Model: Artificial Neural Network
Output classes: “Normal” State; Channelized Attention; Diverted Attention; Startle / Surprise
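The two-level meta model for Application 2 can be sketched with scikit-learn's `StackingClassifier`. The deck's implementation used Theano and Python, so this is an analogous illustration rather than the team's code. The feature matrix is synthetic (generated with `make_classification`), and the layer sizes and other hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

CLASS_NAMES = ["Normal", "Channelized Attention", "Diverted Attention", "Startle/Surprise"]

# Synthetic stand-in for generated physiological features (assumed data).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, class_sep=1.5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Level 1 models: neural network, gradient boosting, random forest.
level1 = [
    ("ann", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    )),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Level 2 meta model: a small neural network trained on the level 1
# class probabilities, which are produced by internal cross-validation.
meta = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
model = StackingClassifier(estimators=level1, final_estimator=meta, cv=3)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
predicted_state = CLASS_NAMES[int(model.predict(X_test[:1])[0])]
print(accuracy, predicted_state)
```

Training the meta model on cross-validated level 1 predictions, rather than on predictions for data the level 1 models have already seen, limits leakage between the two levels.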
21. Clustering: K-Means
Application: Knowledge Analytics
Goal: Automatically group thousands of
documents into useful clusters
Method: Utilizing IBM Watson Content
Analytics; K-means is a scalable means of
clustering large datasets
More than 130,000 documents are grouped into
20 clusters; accuracy confirmed by SME validation
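K-means document clustering can be illustrated at toy scale with scikit-learn. The deck's work used IBM Watson Content Analytics, so this is a stand-in showing the general technique only. The six short "documents" are invented for the example, and TF-IDF is an assumed choice of text representation.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented miniature corpus with two obvious topics (assumed data).
documents = [
    "carbon nanotube composite tensile strength",
    "carbon nanotube yarn composite",
    "nanotube composite fiber strength",
    "space radiation shielding astronaut dose",
    "radiation dose astronaut health",
    "space radiation shielding material",
]

# Represent each document as a TF-IDF vector, then group the vectors.
vectors = TfidfVectorizer().fit_transform(documents)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(vectors)

for label, text in zip(labels, documents):
    print(label, text)
```

At corpus scale the same pattern applies with more clusters; the cluster labels are arbitrary integers, so SME review is what assigns meaning to each group.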
23. Key Insights
• Focus on ‘Big Analytics’; Big Data is not just about volume
• Data Science is a team effort: Computer Science, Statistics, Mathematics, and Subject Matter Expertise (SME)
• Strong collaboration with SMEs is essential; algorithm development is iterative
• Machine learning techniques have to be adapted to our data; thoroughly researching how similar problems are being addressed, and adapting those approaches, is important
• Understanding the problem domain and associated data is essential, difficult, and time-consuming
• Use pilots to learn to apply techniques to science and engineering domains; use a phased approach
• Leverage the extensive work and research happening, and the open source platforms
• NASA, Federal, Universities, Industry
• Application of data science/analytics to aerospace research domains is in its early stages
• Be ready for challenges; it is a journey with huge potential
24. Challenges and Next Steps
• Treading a new path to develop the capability for very specialized disciplines
• Using a mix of research, experimentation, innovation, and persistence
• Application of techniques to aerospace domains needs maturity and resources
• Difficulty in establishing ground truth and labeled data for ML models; data sets are diverse, complex, and unique
• Big goals take multiple years to achieve; balance that with short-term milestones to demonstrate value and generate buy-in and resources
• Provide a suite of machine learning and analytics tools and techniques for broad use
• Expand the big data movement to get broad buy-in and propagate the expertise
25. Acknowledgements – Big Data Analytics Team
Data Analytics and Machine Learning Expertise:
Manjula Ambur, Lin Chen, Christina Heinich, Charles Liles, Robert Milletich,
Daniel Sammons, Ted Sidehamer, and Jeremy Yagle
Subject Matter Expertise:
Damodar Ambur, Danette Allen, Eric Burke, Kyle Ellis, Christie Funk, Dana
Hammond, Angela Harrivel, Jeff Herath, Patty Howell, Constantine Lukashin,
Alan Pope, Brandi Quam, Cheryl Rose, Jamshid Samareh, Mark Sanetrik, Rob
Scott, Lisa Scott-Carnell, Steve Scotti, Walt Silva, Mia Siochi, Chad Stephens,
Scott Striepe, Marty Waszak, Bill Winfree, and Kristopher Wise
Editor's Notes
Welcome.
Introductions.
Review Agenda
GPU = graphics processing unit.
OGA = Other government agencies
R. Lightfoot: “Partnerships beyond just getting coffee.” In other words, accomplishing real, complex work via partnerships
Bold rectangles are the CDT foundation
Explicit, intentional, and robust integration with the dashed ovals (Experimentation, Test, NASA Centers, external partners)
Explanation of Ties between Vision benefits and CDT:
NASA Missions propelled by digital advances: the Virtual Capabilities are the focal point to this. Digital grand challenges that solve significant challenges for the missions, such as a virtual flight test capability.
Robust, mission-focused partnerships. Advanced IT knowledge systems and collaboration directly support this. Also, standardized interfaces among our and partners’ models, sims, etc.
Agile response to emerging missions. Virtual capabilities and the overall digital transformation architecture enable agile, flexible, rapid response. Easier to rearrange electrons than people or buildings.
Streamline ideation & invention. Big data & machine intelligence can help automate & analyze selected parts of this process. In addition, the entire CDT framework is intended to help people ideate, conceptualize, design, test, and produce results faster & more efficiently than ever before.
Faster, better research & design cycles. M&S and big data techniques can help find issues in designs earlier in the process. Less costly to rearrange electronic designs than re-cast / test parts. Also enabled by automated manufacturing.
Reduce excess margins. M&S and analytic techniques can contribute significantly to reducing uncertainties, resulting in right-sizing rule-of-thumb margins instead of simply stacking them.
Maximize global contributors. Advanced IT collaboration techniques enable NASA to leverage the best brains worldwide.
Solve entirely new NASA problems. The overall framework is aimed both at accomplishing existing & emerging missions far better. In addition, the right tools & brainpower can enable NASA to take on missions which seem impossible today.
Main point: At the core of what makes Watson different are three powerful technologies: natural language, hypothesis generation, and evidence-based learning. But Watson is more than the sum of its individual parts. Watson is about bringing these capabilities together in a way that has never been done before, resulting in a fundamental change in the way businesses look at quickly solving problems.
Further speaking points: Looking at these one by one, understanding natural language and the way we speak breaks down the communication barrier that has stood between people and their machines for so long. Hypothesis generation bypasses the historically deterministic way that computers function and recognizes that there are various probabilities of various outcomes rather than a single definitive ‘right’ response. And adaptation and learning help Watson continuously improve in the same way that humans learn: it keeps track of which of its selections were chosen by users and which responses got positive feedback, thus improving future response generation.
Additional information: The result is a machine that functions alongside us as an assistant rather than something we wrestle with to get an adequate outcome.