MSR 2022 Foundational Contribution Award Talk
Software Analytics:
Reflection and Path Forward
Dr. Dongmei Zhang
Data, Knowledge, and Intelligence
(DKI) Group
Microsoft Research Asia
Prof. Tao Xie
School of Computer Science
Peking University
Outline
• Origin and early research
• Community building
• New research topics
• Reflections
05/20/2022 MSR 2022 2
Origin and Early Research
Software Analytics Group at MSRA, founded in May 2009
Software Analytics Research
Utilize a data-driven approach to help create high-quality, user-friendly, and efficiently developed and operated software and services
Information Visualization
Analysis Algorithms
Large-scale Computing
Vertical
Horizontal
https://www.microsoft.com/en-us/research/group/software-analytics/
http://research.microsoft.com/en-us/news/features/softwareanalytics-052013.aspx
Prof. Tao Xie’s
Visit at MSRA SA
Defining Software Analytics
Software analytics is to enable software practitioners to perform data
exploration and analysis in order to obtain insightful and actionable
information for data-driven tasks around software and services.
D. Zhang, Y. Dang, J. Lou, S. Han, H. Zhang, and T. Xie. Software Analytics as a Learning Case in Practice: Approaches and Experiences. In MALETS 2011.
Six dimensions
• Research Topics
• Technology Pillars
• Target Audience
• Connection to Practice
• Output
• Input
Research topics – the trinity view
• Covering major areas of software domain
• Throughout entire development cycle
• Enabling practitioners to obtain insights
Software Users
Software Development Process
Software System
Input - data sources
Runtime traces
Program logs
System events
Perf counters
…
Usage log
User surveys
Online forum posts
Blog & Twitter
…
Source code
Bug history
Check-in history
Test cases
…
Output – insightful information
• Conveys meaningful and useful understanding or knowledge towards
completing the target task
• Not easily attainable via directly investigating raw data without aid of
analytics technologies
• Examples
• It is easy to count the number of re-opened bugs, but how to find out the
primary reasons for these re-opened bugs?
• When the availability of an online service drops below a threshold, how to
localize the problem?
Output – actionable information
• Enables software practitioners to come up with concrete solutions
towards completing the target task
• Examples
• Why were bugs re-opened?
  • A list of bug groups, each sharing the same reason for re-opening
• Why did the availability of online services drop?
  • A list of problematic areas with associated confidence values
• Which part of my code should be refactored?
  • A list of cloned code snippets easily explored from different perspectives
Technology pillars
Software Users
Software Development Process
Software System
Information Visualization
Analysis Algorithms
Large-scale Computing
Vertical
Horizontal
Target audience – software practitioners
Developer
Tester
Program Manager
Usability engineer
Designer
Support engineer
Management personnel
Operation engineer
Connection to practice
• Software Analytics is naturally tied to software development practice
• Getting real
  • Real Data
  • Real Problems
  • Real Users
  • Real Tools
Early projects
StackMine – Performance debugging in the large via mining millions of stack traces
Scalable code clone analysis
Data exploration for Customer Experience Improvement Program (CEIP)
Performance Debugging in the Large via Mining Millions of Stack Traces
S. Han, Y. Dang, D. Zhang, and T. Xie, ICSE 2012
Comprehending Performance from Real-World Execution Traces: A Device-Driver Case
X. Yu, S. Han, D. Zhang, and T. Xie, ASPLOS 2014
as representative paper in 2012, 1 of 20 representative papers (one paper a year)
Community Building
Building Upon Rich Work by the Communities
FSE/SDP Workshop on the Future of Software Engineering Research (FoSER 2010)
...
MSR 2012 Keynote
SoftMine 2013 Keynote
CCCF/IEEE Software 2013 Articles
Shonan Meeting 2013
Tutorials/Tech Briefings at ICSE/FSE/ASE...
• [ASE 11 Tutorial] Zhang & Xie. xSA: eXtreme Software Analytics - Marriage of eXtreme Computing and Software Analytics
• [CSEE&T 12 Tutorial] Zhang, Dang, Han & Xie. Teaching and Training for Software Analytics
• [ICSE 12 SEIP Mini Tutorial] Zhang & Xie. Software Analytics in Practice: Mini Tutorial
• [ICSE 13 Tutorial] Zhang & Xie. Software Analytics: Achievements and Challenges
• [FSE 14 Tutorial] Zhang & Xie. Software Analytics: Achievements and Challenges
Community Building by Others
• IEEE Software 2013 Special Issue
• Dagstuhl Seminar 2014
• International Workshop on Software Analytics (SWAN) 2015, 2016, 2017, 2018
...
Expanding Community
...
Beyond SE Communities: ASPLOS 2021 Keynote
ASPLOS is the premier forum for interdisciplinary systems research, intersecting computer architecture, hardware
and emerging technologies, programming languages and compilers, operating systems, and networking.
New Research Topic (1)
Cloud Intelligence
Cloud Services
• Shift to cloud becoming mainstream
• Critical role of cloud computing platforms fortified by COVID-19
Cloud shift proportion by category
                              2018   2019   2020   2021   2022
System infrastructure          11%    13%    16%    19%    22%
Infrastructure software        13%    15%    17%    18%    20%
Application software           34%    36%    38%    39%    40%
Business process outsourcing   27%    28%    29%    29%    30%
Total                          19%    21%    24%    26%    28%
Source: Gartner (August 2018)

Worldwide public cloud services end-user spending forecast (Millions of USD)
                  2019      2020      2021      2022
BPaaS           45,212    44,741    47,521    50,336
PaaS            37,512    43,823    55,486    68,964
SaaS           102,064   101,480   117,773   138,261
IaaS            44,457    51,421    65,264    82,225
DaaS               616     1,204     1,945     2,542
Total Market   242,696   257,549   304,990   362,263
Source: Gartner (November 2020)
Note: Totals may not add up due to rounding.
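To compare how fast each segment in the spending forecast grows, the figures above can be turned into compound annual growth rates. This is a minimal sketch that uses only the numbers from the Gartner (November 2020) table; the names `spending`, `cagr`, and `growth` are illustrative and not from the talk.

```python
# Worldwide public cloud end-user spending forecast (millions of USD),
# copied from the Gartner (November 2020) table in the slides.
YEARS = [2019, 2020, 2021, 2022]
spending = {
    "BPaaS": [45212, 44741, 47521, 50336],
    "PaaS": [37512, 43823, 55486, 68964],
    "SaaS": [102064, 101480, 117773, 138261],
    "IaaS": [44457, 51421, 65264, 82225],
    "DaaS": [616, 1204, 1945, 2542],
    "Total Market": [242696, 257549, 304990, 362263],
}


def cagr(first, last, periods):
    """Compound annual growth rate over `periods` yearly steps."""
    return (last / first) ** (1 / periods) - 1


# One growth rate per segment, measured from the first to the last forecast year.
growth = {
    name: cagr(values[0], values[-1], len(YEARS) - 1)
    for name, values in spending.items()
}

# Print segments from fastest to slowest growing.
for name, rate in sorted(growth.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<13}{rate:7.1%}")
```

Under these figures, DaaS grows fastest from the smallest base, and IaaS and PaaS grow faster than SaaS.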
Focusing on Cloud Computing
• Huge space for improvement for cloud computing platforms
• Software Analytics is the digital transformation of the software industry
• Cloud intelligence
• Software Analytics focusing on cloud computing
• Re-emergence of AI
• Making impact is key
Cloud Intelligence
Using AI/ML technologies to effectively and efficiently design, build and
operate complex cloud services at scale
Customers
Engineering
Services
• AI for System
  Designing and building high-quality services with better reliability, performance, and efficiency
• AI for Customers
  Improving customer satisfaction with intelligence and better user experiences
• AI for DevOps
  Achieving high productivity in DevOps via empowering engineers with intelligent tooling
• Cloud Intelligence Workshop
• @ AAAI 2020
• @ ICSE 2021
• @ SysML 2022
• Program Chair
Jian Zhang, Microsoft Azure
• Steering Committee
Rama Akkiraju, IBM
Ricardo Bianchini, Microsoft Research
Mike Dahlin, Google
Marcus Fontoura, Microsoft Azure
Ahmed E. Hassan, Queen’s University
Michael Lyu, Chinese University of Hong Kong
Erik Meijer, Facebook
Tao Xie, Peking University
Dongmei Zhang, Microsoft Research
Yuanyuan Zhou, UCSD
Related Efforts
• AIOps by Gartner
“Put simply, AIOps is the application of machine learning
(ML) and data science to IT operations problems. AIOps
platforms combine big data and ML functionality to
enhance and partially replace all primary IT operations
functions, including availability and performance
monitoring, event correlation and analysis, and IT service
management and automation.”
• AIOps extended
AIOps: Real-world Challenges and Research Innovations
Yingnong Dang, Qingwei Lin, Peng Huang
Technical Briefing, ICSE 2019
Scenarios
Service:
Service health measuring (KPI)
• Availability / reliability
• Performance
• Security
Anomalous behavior detection
• KPI (Overall, component)
• Resource (overhead / leak)
Health prediction
• Infrastructure (e.g., power, cooling)
• HW, SW Failure
• Workload
• System capacity
Auto-recovery/adjustment/healing
• Recovery option optimization
• Auto healing
Engineering:
Programming
• API/code suggestion
• Code defect, smell, code review
• Test coverage, test selection
CI/CD
• Integration testing and strategy
• Rollout risk assessment and strategy
Auto-triage & diagnosis
• Auto-triage (investigation owner)
• Diagnosis intelligence
Repair/mitigation decision
• Solution recommendation
• Decision support
Customer:
Customer behavior understanding
• Usage experience
• Customer churn
Proactive customer engagement
• Service auto-scale (up/down)
• Engaging before reporting
Intelligent customer support
• Self-serve
• Efficient communication
• Intelligent suggestion/hints
Problems and Challenges
Detection
PROBLEMS
• Time-series anomaly detection
• Log-based anomaly detection
• Multi-dimensional change detection
• …
CHALLENGES
Diverse requirements, noisy data, high dimensions, lack of labeled data, …
Diagnosis
PROBLEMS
• Log pattern mining
• Correlation analysis
• Dependency graph diagnosis
• …
CHALLENGES
Diverse causes, complex service dependency, scattered knowledge, …
Optimization
PROBLEMS
• Multi-constraint/objective optimization
• DL-based combinatorial search
• Optimization under prediction uncertainty
• …
CHALLENGES
Huge problem space, large scale data, complex constraints and tradeoffs, …
Prediction
PROBLEMS
• Context/dependency-aware prediction
• Automated feature engineering
• Extremely-imbalanced data prediction
• …
CHALLENGES
Highly imbalanced class, fast system evolution, unpredictable behavior changes, …
Disk Failure Prediction in Cloud Computing Platform
Improving Service Availability of Cloud Systems by Predicting Disk Error, Y. Xu, K. Sui, R. Yao, H. Zhang, Q. Lin, Y. Dang, P. Li, K. Jiang, W. Zhang, J. Lou, M. Chintalapati, D. Zhang, USENIX ATC 2018.
NTAM: Neighborhood-Temporal Attention Model for Disk Failure Prediction in Cloud Platforms, C. Luo, P. Zhao, B. Qiao, Y. Wu, H. Zhang, W. Wu, W. Lu, Y. Dang, S. Rajmohan, Q. Lin, D. Zhang, the Web Conference 2021.
Virtual Machine (VM) Availability and Disk Failures
• Hardware issues are among the top reasons for VMs going down and rebooting
• Disk failures contribute most to the hardware issues
Source: https://www.backblaze.com/blog/hard-drive-stats-for-2018/
Source: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/08/a7-narayanan.pdf
SSD Annualized Failure Rates
Binary Classification Problem
The training set is a collection of $N$ training samples, denoted as
$D = \{(X_1, y_1), (X_2, y_2), \ldots, (X_N, y_N)\}$
$X_i$ represents the corresponding disk $d_i$'s own status data and neighborhood information, i.e., $X_i = A_i \cup B_i$, where $A_i \in \mathbb{R}^{h \times n}$ represents $d_i$'s own status data, and $B_i$ is a subset of the union of all $A_i$.
$y_i \in \{0, 1\}$ is the label
$y_i = 1$ means that the corresponding disk will fail in the near future
$y_i = 0$ means ‘healthy’
Loss function
$L = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \cdot \log \hat{y}_i + (1 - y_i) \cdot \log(1 - \hat{y}_i) \right]$
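The loss above is the standard binary cross-entropy. As a minimal sketch in plain Python, `y_true` holds the labels and `y_pred` holds the predicted failure probabilities. These names, the clipping constant `eps` (added here for numerical safety), and the sample values are assumptions for illustration, not part of the slides.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over N samples.

    y_true: labels in {0, 1} (1 = disk will fail, 0 = healthy)
    y_pred: predicted failure probabilities in [0, 1]
    eps:    clipping bound that keeps log() away from 0 (an added assumption)
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# Made-up labels and predictions, for illustration only.
labels = [1, 0, 0, 1]
probs = [0.9, 0.1, 0.2, 0.6]
loss = binary_cross_entropy(labels, probs)
print(f"loss = {loss:.4f}")
```

Confident, correct predictions drive the loss toward 0, while a confident wrong prediction is penalized heavily.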
Related Work
• Traditional machine learning based approaches
• Support Vector Machine (SVM) [MSST 2013]
• Decision Tree (DT) [DSN 2014]
• Random Forest (RF) [DSN 2018]
• Gradient Boosting Decision Tree (GBDT) [Ph.D. Dissertation, UCLA 2017]
• Regularized Greedy Forest (RGF) [KDD 2016]
• Cloud Disk Error Forecasting (CDEF) [USENIX ATC 2018]
• Deep Learning based approaches
• Recurrent Neural Network (RNN) [IEEE Transactions on Computers 2016]
• Long Short-Term Memory (LSTM) [ICDM 2018]
• Temporal Convolution Neural Network (TCNN) [DAC 2019]
• Convolution Neural Network with Long Short-Term Memory (CNN+LSTM) [FAST 2020]
• Neighborhood-Temporal Attention Model (NTAM) [Web Conference 2021]
Observations (1)
• VMs can be impacted before disks completely fail
• Disk errors occur before a disk completely fails
• Disk errors are often reflected by system-level signals such as OS events
Name                       Description
Timestamp                  The timestamp $t$ at which the feature vector was recorded.
Disk ID                    The unique ID of disk $d_i$.
Node ID                    The unique ID of the computing server (i.e., node) that $d_i$ is associated with.
SMART attributes           The SMART attributes of $d_i$ recorded at $t$, providing information such as the Current Pending Sector Count, Seek Error Rate, Soft Read Error Rate, etc.
System-related attributes  OS events such as paging error, file system error, device reset, telemetry loss, etc.
Driver-related attributes  Gathered from the disk driver, with information on Flush Count, IO Latency, Controller Reset, etc.
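One way to picture a row of this feature table is as a single record per disk and timestamp. The sketch below is an assumption about layout only: the class name `DiskFeatureVector`, its field names, the dictionary-per-group design, and all sample values are hypothetical and not taken from the papers.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict


@dataclass
class DiskFeatureVector:
    """Hypothetical record mirroring the six rows of the feature table."""
    timestamp: datetime   # when the feature vector was recorded
    disk_id: str          # unique ID of the disk
    node_id: str          # ID of the server (node) hosting the disk
    smart: Dict[str, float] = field(default_factory=dict)    # SMART attributes
    system: Dict[str, float] = field(default_factory=dict)   # OS-level event counts
    driver: Dict[str, float] = field(default_factory=dict)   # disk-driver signals

    def as_row(self) -> Dict[str, float]:
        """Flatten the three attribute groups into one prefixed feature row."""
        row: Dict[str, float] = {}
        groups = (("smart", self.smart), ("system", self.system), ("driver", self.driver))
        for prefix, group in groups:
            for name, value in group.items():
                row[f"{prefix}.{name}"] = float(value)
        return row


# Made-up values, for illustration only.
sample = DiskFeatureVector(
    timestamp=datetime(2022, 5, 20, 12, 0),
    disk_id="disk-0001",
    node_id="node-42",
    smart={"current_pending_sector_count": 3, "seek_error_rate": 0.0},
    system={"file_system_error": 1},
    driver={"io_latency_ms": 12.5, "controller_reset": 0},
)
row = sample.as_row()
print(sorted(row))
```

Keeping the node ID on every record is what would let a model look up the neighboring disks on the same server.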
Observation (2)
• A disk’s health status may be impacted by its neighboring disks
• Incorporating individual disk’s status and its neighborhood info
Figure 2: The architecture of the neighborhood-aware component underlying NTAM.
Observation (3)
• Extremely imbalanced disk population
• Data enhancement via Temporal Progressive Sampling (TPS)
Figure 4: The design of the Temporal Progressive Sampling (TPS) method.
Neighborhood-Temporal Attention Model (NTAM)
• Neighborhood-aware component
  To effectively incorporate neighborhood information
• Temporal component
  To better capture temporal information
• Decision component
  To decide whether the corresponding disk will fail in the near future
Failure probability
Temporal-encoded vector
Neighbor-encoded vectors
Disk $A_i$ & Neighbors $B_i$
Figure 1: Overview of the Neighborhood-Temporal Attention Model (NTAM).
New Research Topic (2)
AI & Software Engineering
RAISE 2013 Keynote & Vision Statement
SIGSOFT Webinar 2019
IEEE Software 2020 Special Issue
Making IntelliTest More Intelligent
Pex journey [ASE 2014]
Pex shipped as IntelliTest in Visual Studio Enterprise Edition since 2015
Self-learning (data driven)
Thummalapenta, Xie, Tillmann, de Halleux, and Schulte. MSeqGen: Object-Oriented Unit-Test Generation via Mining Source Code. ESEC/FSE 2009.
ICSE 2020 Technical Briefing
Programming is not easy, even for an easy task
A question: write a SQL statement for “top 2 selling brands in each year”, given a table of three columns “sales”, “Brand”, and “year”.
-- sales_table is a placeholder name; the slide does not name the table.
SELECT e1.brand AS brand, e1.year AS year
FROM (SELECT brand, year, SUM(sales) AS salesum
      FROM sales_table GROUP BY year, brand) AS e1
LEFT OUTER JOIN
     (SELECT brand, year, SUM(sales) AS salesum
      FROM sales_table GROUP BY year, brand) AS e2
  -- pair each brand with the brands that sold at least as much in the same year
  ON (e1.year = e2.year AND e1.salesum <= e2.salesum)
GROUP BY e1.brand, e1.year
HAVING COUNT(*) <= 2
ORDER BY e1.year;
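The “top 2 selling brands in each year” question can be checked end to end on a tiny in-memory table. The sketch below uses Python's standard `sqlite3` module. The table name `sales_table`, the sample rows, and the `totals` CTE are assumptions made for illustration. The query counts, for each brand, how many brands in the same year sold at least as much, and keeps the brands whose count is at most 2.

```python
import sqlite3

# Hypothetical table and rows; only the three column names come from the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_table (brand TEXT, year INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?, ?)",
    [
        ("A", 2020, 10), ("A", 2020, 5), ("B", 2020, 20), ("C", 2020, 5),
        ("A", 2021, 1), ("B", 2021, 2), ("C", 2021, 3),
    ],
)

QUERY = """
WITH totals AS (              -- total sales per brand per year
    SELECT brand, year, SUM(sales) AS salesum
    FROM sales_table
    GROUP BY year, brand
)
SELECT e1.brand, e1.year
FROM totals AS e1
JOIN totals AS e2             -- same-year brands that sold at least as much
  ON e1.year = e2.year AND e1.salesum <= e2.salesum
GROUP BY e1.brand, e1.year
HAVING COUNT(*) <= 2          -- the brand's rank within its year is 1 or 2
ORDER BY e1.year, e1.brand
"""

top2 = conn.execute(QUERY).fetchall()
for brand, year in top2:
    print(year, brand)
```

Ties are a known weak spot of this counting approach: if several brands share the second-highest total, each is paired with more than two brands and all of them are dropped.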
NL2Regex, NL2SQL, ...
Zhong, Guo, Yang, Peng, Xie, Lou, Liu and Zhang. SemRegex: A Semantics-Based Approach for Generating Regular Expressions from Natural Language Specifications. EMNLP 2018.
Guo, Liu, Lou, Li, Liu, Xie, and Liu. Benchmarking Meaning Representations in Neural Semantic Parsing. EMNLP 2020.
Dong, Sun, Liu, Lou, and Zhang. Data-Anonymous Encoding for Text-to-SQL Generation. EMNLP 2019.
Conversational Interface for
NL to Data Analysis/Visualization in Excel
aiXcoder
Within 1 month of aiXcoder 2.0 going online (currently 4.0), downloads exceeded 130K!
So far, 2C: 300K users; 2B: major banks/IT companies
https://aixcoder.com/en/
aiXcoder L and Next
Billion-scale model parameters NL2Code
New Trend: Big Pre-trained Model + Task Adaptation
GPT-3 can program?
Reflections
Data Driven vs. Problem Driven
AI + Human Intelligence
Making Impact in Practice
• Finding the critical scenario
• Closing the loop
• End-to-end and fast iteration
Technology readiness framework
Perspective               Potential Impact
Problem                   Applicability
Assumption                Problem validity
Constraint / Requirement  Formulation and solution
Evaluation                Usefulness in practice
Takeaways
• Software Analytics: digital transformation of the software industry
• Thriving community
• New research topics
• Cloud Intelligence
• AI and Software Engineering
• Reflections
• Data driven vs. problem driven
• AI + human intelligence
• Making impact in practice
• WE ARE HIRING!
Acknowledgement
Sincere thank-you to all the academic collaborators, colleagues and
partners in Microsoft, and our talented intern students for the
collaboration and partnership over the years!
Thanks!

More Related Content

Similar to MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection and Path Forward

Build Answer-generating Apps that Users Love: Development best practices for ...
Build Answer-generating Apps that Users Love: Development best practices for ...Build Answer-generating Apps that Users Love: Development best practices for ...
Build Answer-generating Apps that Users Love: Development best practices for ...TIBCO Jaspersoft
 
Data Mining & Predictive Analytics - Lesson 14 - Concepts Recapitulation and ...
Data Mining & Predictive Analytics - Lesson 14 - Concepts Recapitulation and ...Data Mining & Predictive Analytics - Lesson 14 - Concepts Recapitulation and ...
Data Mining & Predictive Analytics - Lesson 14 - Concepts Recapitulation and ...Michael Lew
 
What is the future of data strategy?
What is the future of data strategy?What is the future of data strategy?
What is the future of data strategy?Denodo
 
Data Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data ScienceData Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data SciencePouria Amirian
 
Data Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data ScienceData Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data SciencePouria Amirian
 
Agile Mumbai 2022 - Ashwinee Singh | Agile in AI or AI in Agile?
Agile Mumbai 2022 - Ashwinee Singh | Agile in AI or AI in Agile?Agile Mumbai 2022 - Ashwinee Singh | Agile in AI or AI in Agile?
Agile Mumbai 2022 - Ashwinee Singh | Agile in AI or AI in Agile?AgileNetwork
 
InterConnect 2017 : Cognitive DevOps: Get Rid of the Guesswork to Improve Sof...
InterConnect 2017 : Cognitive DevOps: Get Rid of the Guesswork to Improve Sof...InterConnect 2017 : Cognitive DevOps: Get Rid of the Guesswork to Improve Sof...
InterConnect 2017 : Cognitive DevOps: Get Rid of the Guesswork to Improve Sof...DevOps for Enterprise Systems
 
Modern Business Intelligence - Design and Implementations
Modern Business Intelligence - Design and ImplementationsModern Business Intelligence - Design and Implementations
Modern Business Intelligence - Design and ImplementationsDavid J Rosenthal
 
How to add security in dataops and devops
How to add security in dataops and devopsHow to add security in dataops and devops
How to add security in dataops and devopsUlf Mattsson
 
A Machine learning based framework for Verification and Validation of Massive...
A Machine learning based framework for Verification and Validation of Massive...A Machine learning based framework for Verification and Validation of Massive...
A Machine learning based framework for Verification and Validation of Massive...IRJET Journal
 
FinishedProject
FinishedProjectFinishedProject
FinishedProjectHappyjuice
 
Project FMEA for Recognizing Difficulties in Machine Learning Application Sys...
Project FMEA for Recognizing Difficulties in Machine Learning Application Sys...Project FMEA for Recognizing Difficulties in Machine Learning Application Sys...
Project FMEA for Recognizing Difficulties in Machine Learning Application Sys...Naoshi Uchihira
 
Cloud-Based IoT Analytics and Machine Learning
Cloud-Based IoT Analytics and Machine LearningCloud-Based IoT Analytics and Machine Learning
Cloud-Based IoT Analytics and Machine LearningSatyaKVivek
 
New Business Development Proposal - Adding Project Portfolio Management (PPM)...
New Business Development Proposal - Adding Project Portfolio Management (PPM)...New Business Development Proposal - Adding Project Portfolio Management (PPM)...
New Business Development Proposal - Adding Project Portfolio Management (PPM)...Rolly Perreaux, PMP
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Denodo
 
Application Delivery Infrastructure for Multi-Cloud Enterprises
 Application Delivery Infrastructure for Multi-Cloud Enterprises Application Delivery Infrastructure for Multi-Cloud Enterprises
Application Delivery Infrastructure for Multi-Cloud EnterprisesEnterprise Management Associates
 
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big DataInfochimps, a CSC Big Data Business
 
Bhadale group of companies data science project methodologies catalogue
Bhadale group of companies data science project methodologies catalogueBhadale group of companies data science project methodologies catalogue
Bhadale group of companies data science project methodologies catalogueVijayananda Mohire
 
AI improves software testing by Kari Kakkonen at TQS
AI improves software testing by Kari Kakkonen at TQSAI improves software testing by Kari Kakkonen at TQS
AI improves software testing by Kari Kakkonen at TQSKari Kakkonen
 

Similar to MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection and Path Forward (20)

Build Answer-generating Apps that Users Love: Development best practices for ...
Build Answer-generating Apps that Users Love: Development best practices for ...Build Answer-generating Apps that Users Love: Development best practices for ...
Build Answer-generating Apps that Users Love: Development best practices for ...
 
Data Mining & Predictive Analytics - Lesson 14 - Concepts Recapitulation and ...
Data Mining & Predictive Analytics - Lesson 14 - Concepts Recapitulation and ...Data Mining & Predictive Analytics - Lesson 14 - Concepts Recapitulation and ...
Data Mining & Predictive Analytics - Lesson 14 - Concepts Recapitulation and ...
 
What is the future of data strategy?
What is the future of data strategy?What is the future of data strategy?
What is the future of data strategy?
 
Data Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data ScienceData Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data Science
 
Data Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data ScienceData Science as a Service: Intersection of Cloud Computing and Data Science
Data Science as a Service: Intersection of Cloud Computing and Data Science
 
Agile Mumbai 2022 - Ashwinee Singh | Agile in AI or AI in Agile?
Agile Mumbai 2022 - Ashwinee Singh | Agile in AI or AI in Agile?Agile Mumbai 2022 - Ashwinee Singh | Agile in AI or AI in Agile?
Agile Mumbai 2022 - Ashwinee Singh | Agile in AI or AI in Agile?
 
InterConnect 2017 : Cognitive DevOps: Get Rid of the Guesswork to Improve Sof...
InterConnect 2017 : Cognitive DevOps: Get Rid of the Guesswork to Improve Sof...InterConnect 2017 : Cognitive DevOps: Get Rid of the Guesswork to Improve Sof...
InterConnect 2017 : Cognitive DevOps: Get Rid of the Guesswork to Improve Sof...
 
Modern Business Intelligence - Design and Implementations
Modern Business Intelligence - Design and ImplementationsModern Business Intelligence - Design and Implementations
Modern Business Intelligence - Design and Implementations
 
How to add security in dataops and devops
How to add security in dataops and devopsHow to add security in dataops and devops
How to add security in dataops and devops
 
A Machine learning based framework for Verification and Validation of Massive...
A Machine learning based framework for Verification and Validation of Massive...A Machine learning based framework for Verification and Validation of Massive...
A Machine learning based framework for Verification and Validation of Massive...
 
FinishedProject
FinishedProjectFinishedProject
FinishedProject
 
Project FMEA for Recognizing Difficulties in Machine Learning Application Sys...
Project FMEA for Recognizing Difficulties in Machine Learning Application Sys...Project FMEA for Recognizing Difficulties in Machine Learning Application Sys...
Project FMEA for Recognizing Difficulties in Machine Learning Application Sys...
 
Cloud-Based IoT Analytics and Machine Learning
Cloud-Based IoT Analytics and Machine LearningCloud-Based IoT Analytics and Machine Learning
Cloud-Based IoT Analytics and Machine Learning
 
MTech- Viva_Voce
MTech- Viva_VoceMTech- Viva_Voce
MTech- Viva_Voce
 
New Business Development Proposal - Adding Project Portfolio Management (PPM)...
New Business Development Proposal - Adding Project Portfolio Management (PPM)...New Business Development Proposal - Adding Project Portfolio Management (PPM)...
New Business Development Proposal - Adding Project Portfolio Management (PPM)...
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
 
Application Delivery Infrastructure for Multi-Cloud Enterprises
 Application Delivery Infrastructure for Multi-Cloud Enterprises Application Delivery Infrastructure for Multi-Cloud Enterprises
Application Delivery Infrastructure for Multi-Cloud Enterprises
 
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
 
Bhadale group of companies data science project methodologies catalogue
Bhadale group of companies data science project methodologies catalogueBhadale group of companies data science project methodologies catalogue
Bhadale group of companies data science project methodologies catalogue
 
AI improves software testing by Kari Kakkonen at TQS
AI improves software testing by Kari Kakkonen at TQSAI improves software testing by Kari Kakkonen at TQS
AI improves software testing by Kari Kakkonen at TQS
 

More from Tao Xie

DSML 2021 Keynote: Intelligent Software Engineering: Working at the Intersect...
DSML 2021 Keynote: Intelligent Software Engineering: Working at the Intersect...DSML 2021 Keynote: Intelligent Software Engineering: Working at the Intersect...
DSML 2021 Keynote: Intelligent Software Engineering: Working at the Intersect...Tao Xie
 
Intelligent Software Engineering: Synergy between AI and Software Engineering
Intelligent Software Engineering: Synergy between AI and Software EngineeringIntelligent Software Engineering: Synergy between AI and Software Engineering
Intelligent Software Engineering: Synergy between AI and Software EngineeringTao Xie
 
Diversity and Computing/Engineering: Perspectives from Allies
Diversity and Computing/Engineering: Perspectives from AlliesDiversity and Computing/Engineering: Perspectives from Allies
Diversity and Computing/Engineering: Perspectives from AlliesTao Xie
 
Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...Tao Xie
 
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...Tao Xie
 
SETTA'18 Keynote: Intelligent Software Engineering: Synergy between AI and So...
SETTA'18 Keynote: Intelligent Software Engineering: Synergy between AI and So...SETTA'18 Keynote: Intelligent Software Engineering: Synergy between AI and So...
SETTA'18 Keynote: Intelligent Software Engineering: Synergy between AI and So...Tao Xie
 
ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven Research
ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven ResearchISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven Research
ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven ResearchTao Xie
 
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...Tao Xie
 
Intelligent Software Engineering: Synergy between AI and Software Engineering
MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection and Path Forward

  • 1. Software Analytics: Reflection and Path Forward Dr. Dongmei Zhang Data, Knowledge, and Intelligence (DKI) Group Microsoft Research Asia Prof. Tao Xie School of Computer Science Peking University
  • 2. Outline • Origin and early research • Community building • New research topics • Reflections 05/20/2022 MSR 2022 2
  • 3. Origin and Early Research 05/20/2022 MSR 2022 3
  • 4. 05/20/2022 MSR 2022 4 Software Analytics Group at MSRA, founded in May 2009
  • 5. Software Analytics Research Utilize data-driven approach to help create high quality, user friendly, and efficiently developed and operated software and services 05/20/2022 MSR 2022 5 Information Visualization Analysis Algorithms Large-scale Computing Vertical Horizontal https://www.microsoft.com/en-us/research/group/software-analytics/ http://research.microsoft.com/en-us/news/features/softwareanalytics-052013.aspx
  • 6. Prof. Tao Xie’s Visit at MSRA SA 05/20/2022 MSR 2022 6
  • 7. Defining Software Analytics Software analytics is to enable software practitioners to perform data exploration and analysis in order to obtain insightful and actionable information for data-driven tasks around software and services. 05/20/2022 MSR 2022 7 D. Zhang, Y. Dang, J. Lou, S. Han, H. Zhang, and Tao Xie. Software Analytics as a Learning Case in Practice: Approaches and Experiences. In MALETS 2011.
  • 8. Six dimensions 05/20/2022 MSR 2022 8 Research Topics Technology Pillars Target Audience Connection to Practice Output Input
  • 9. Research topics – the trinity view 05/20/2022 MSR 2022 9 • Covering major areas of software domain • Throughout entire development cycle • Enabling practitioners to obtain insights Software Users Software Development Process Software System
  • 10. Input - data sources 05/20/2022 MSR 2022 10 Runtime traces Program logs System events Perf counters … Usage log User surveys Online forum posts Blog & Twitter … Source code Bug history Check-in history Test cases …
  • 11. Output – insightful information • Conveys meaningful and useful understanding or knowledge towards completing the target task • Not easily attainable via directly investigating raw data without aid of analytics technologies • Examples • It is easy to count the number of re-opened bugs, but how to find out the primary reasons for these re-opened bugs? • When the availability of an online service drops below a threshold, how to localize the problem? 05/20/2022 MSR 2022 11
  • 12. Output – actionable information • Enables software practitioners to come up with concrete solutions towards completing the target task • Examples • Why were bugs re-opened? • A list of bug groups, each with the same reason for re-opening • Why did the availability of online services drop? • A list of problematic areas with associated confidence values • Which part of my code should be refactored? • A list of cloned code snippets easily explored from different perspectives
  • 13. Technology pillars 05/20/2022 MSR 2022 13 Software Users Software Development Process Software System Information Visualization Analysis Algorithms Large-scale Computing Vertical Horizontal Technology pillars
  • 14. Target audience – software practitioners 05/20/2022 MSR 2022 14 Developer Tester Program Manager Usability engineer Designer Support engineer Management personnel Operation engineer
  • 15. Connection to practice • Software Analytics is naturally tied with software development practice • Getting real 05/20/2022 MSR 2022 15 Real Data Real Problems Real Users Real Tools
  • 16. Early projects 05/20/2022 MSR 2022 16 StackMine – Performance debugging in the large via mining millions of stack traces Scalable code clone analysis Data exploration for Customer Experience Improvement Program (CEIP)
  • 17. 05/20/2022 MSR 2022 17 Performance Debugging in the Large via Mining Millions of Stack Traces S. Han, Y. Dong, D. Zhang, and T. Xie, ICSE 2012 Comprehending Performance from Real-World Execution Traces: A Device-Driver Case X. Yu, S. Han, D. Zhang, and T. Xie, ASPLOS 2014
  • 18. 05/20/2022 MSR 2022 18 Performance Debugging in the Large via Mining Millions of Stack Traces S. Han, Y. Dong, D. Zhang, and T. Xie, ICSE 2012 Comprehending Performance from Real-World Execution Traces: A Device-Driver Case X. Yu, S. Han, D. Zhang, and T. Xie, ASPLOS 2014 as representative paper in 2012, 1 of 20 representative papers (one paper a year)
  • 20. Building Upon Rich Work by the Communities 05/20/2022 MSR 2022 20 FSE/SDP Workshop on the Future of Software Engineering Research (FoSER 2010) ...
  • 23. CCCF/IEEE Software 2013 Articles 05/20/2022 MSR 2022 23
  • 25. Tutorials/Tech Briefings at ICSE/FSE/ASE... • [ASE 11 Tutorial] Zhang & Xie. xSA: eXtreme Software Analytics - Marriage of eXtreme Computing and Software Analytics • [CSEE&T 12 Tutorial] Zhang, Dang, Han & Xie. Teaching and Training for Software Analytics • [ICSE 12 SEIP Mini Tutorial] Zhang & Xie. Software Analytics in Practice: Mini Tutorial • [ICSE 13 Tutorial] Zhang & Tao Xie. Software Analytics: Achievements and Challenges • [FSE 14 Tutorial] Zhang & Tao Xie. Software Analytics: Achievements and Challenges 05/20/2022 MSR 2022 25
  • 26. Community Building by Others 05/20/2022 MSR 2022 26 IEEE Software 2013 Special Issue Dagstuhl Seminar 2014 International Workshop on Software Analytics (SWAN) 2015, 2016, 2017, 2018 ...
  • 28. Beyond SE Communities: ASPLOS 2021 Keynote 05/20/2022 MSR 2022 28 ASPLOS is the premier forum for interdisciplinary systems research, intersecting computer architecture, hardware and emerging technologies, programming languages and compilers, operating systems, and networking.
  • 29. New Research Topic (1) Cloud Intelligence 05/20/2022 MSR 2022 29
  • 30. Cloud Services • Shift to cloud becoming mainstream • Critical role of cloud computing platforms fortified by COVID-19
    Cloud shift proportion by category (Source: Gartner, August 2018)
    Category                        2018   2019   2020   2021   2022
    System infrastructure            11%    13%    16%    19%    22%
    Infrastructure software          13%    15%    17%    18%    20%
    Application software             34%    36%    38%    39%    40%
    Business process outsourcing     27%    28%    29%    29%    30%
    Total                            19%    21%    24%    26%    28%
    Worldwide public cloud services end-user spending forecast, millions of USD (Source: Gartner, November 2020)
    Segment            2019       2020       2021       2022
    BPaaS            45,212     44,741     47,521     50,336
    PaaS             37,512     43,823     55,486     68,964
    SaaS            102,064    101,480    117,773    138,261
    IaaS             44,457     51,421     65,264     82,225
    DaaS                616      1,204      1,945      2,542
    Total Market    242,696    257,549    304,990    362,263
    Note: Totals may not add up due to rounding.
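    To make the spending trend on slide 30 concrete, the sketch below computes year-over-year growth from the Total Market row. The figures are copied from the slide; the function name `yoy_growth` is only an illustrative choice.

```python
# Total Market figures from slide 30 (Gartner, November 2020), millions of USD.
total_market = {2019: 242_696, 2020: 257_549, 2021: 304_990, 2022: 362_263}


def yoy_growth(series):
    """Return {year: fractional growth over the previous year}."""
    years = sorted(series)
    return {y: series[y] / series[p] - 1 for p, y in zip(years, years[1:])}


growth = yoy_growth(total_market)
for year, g in growth.items():
    # Print each year's growth as a percentage with one decimal place.
    print(f"{year}: {g:.1%}")
```

    The same helper can be applied to any single segment row (for example IaaS) by passing that row as the dictionary.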
  • 31. Focusing on Cloud Computing • Huge space for improvement for cloud computing platforms • Software Analytics is the digital transformation of software industry • Cloud intelligence • Software Analytics focusing on cloud computing • Re-emergence of AI • Making impact is key 05/20/2022 MSR 2022 31
  • 32. Cloud Intelligence Using AI/ML technologies to effectively and efficiently design, build and operate complex cloud services at scale MSR 2022 32 Customers Engineering Services • AI for System Designing and building high-quality services with better reliability, performance, and efficiency • AI for Customers Improving customer satisfaction with intelligence and better user experiences • AI for DevOps Achieving high productivity in DevOps via empowering engineers with intelligent tooling 05/20/2022
  • 33. • Cloud Intelligence Workshop • @ AAAI 2020 • @ ICSE 2021 • @ SysML 2022 • Program Chair Jian Zhang, Microsoft Azure • Steering Committee Rama Akkiraju, IBM Ricardo Bianchini, Microsoft Research Mike Dahlin, Google Marcus Fontoura, Microsoft Azure Ahmed E. Hassan, Queen’s University Michael Lyu, Chinese University of Hong Kong Erik Meijer, Facebook Tao Xie, Peking University Dongmei Zhang, Microsoft Research Yuanyuan Zhou, UCSD Related Efforts 05/20/2022 MSR 2022 33 • AIOps by Gartner “Put simply, AIOps is the application of machine learning (ML) and data science to IT operations problems. AIOps platforms combine big data and ML functionality to enhance and partially replace all primary IT operations functions, including availability and performance monitoring, event correlation and analysis, and IT service management and automation.” • AIOps extended AIOps: Real-world Challenges and Research Innovations Yingnong Dang, Qingwei Lin, Peng Huang Technical Briefing, ICSE 2019
  • 34. Scenarios 05/20/2022 MSR 2022 34 Service health measuring (KPI) • Availability / reliability • Performance • Security Anomalous behavior detection • KPI (Overall, component) • Resource (overhead / leak) Health prediction • Infrastructure (e.g., power, cooling) • HW, SW Failure • Workload • System capacity Auto-recovery/adjustment/healing • Recovery option optimization • Auto healing Programming • API/code suggestion • Code defect, smell, code review • Test coverage, test selection CI/CD • Integration testing and strategy • Rollout risk assessment and strategy Auto-triage & diagnosis • Auto-triage (investigation owner) • Diagnosis intelligence Repair/mitigation decision • Solution recommendation • Decision support Customer behavior understanding • Usage experience • Customer churn Proactive customer engagement • Service auto-scale (up/down) • Engaging before reporting Intelligent customer support • Self-serve • Efficient communication • Intelligent suggestion/hints Service Engineering Customer
  • 35. Problems and Challenges MSR 2022 35 Detection Diagnosis Optimization Prediction • Time-series anomaly detection • Log-based anomaly detection • Multi-dimensional change detection • … • Log pattern mining • Correlation analysis • Dependency graph diagnosis • … • Context/dependency-aware prediction • Automated feature engineering • Extremely-imbalanced data prediction • … Diverse requirements, noisy data, high dimensions, lack of labeled data … Diverse causes, complex service dependency, scattered knowledge… Huge problem space, large scale data, complex constraints and tradeoffs, … Highly imbalanced class, fast system evolution, unpredictable behavior changes, … • Multi-constraint/objective optimization • DL-based combinatorial search • Optimization under prediction uncertainty • … PROBLEMS CHALLENGES 05/20/2022
  • 36. Disk Failure Prediction in Cloud Computing Platform Improving Service Availability of Cloud Systems by Predicting Disk Error, Y. Xu, K. Sui, R. Yao, H. Zhang, Q. Lin, Y. Dang, P. Li, K. Jiang, W. Zhang, J. Lou, M. Chintalapati, D. Zhang, USENIX ATC 2018. NTAM: Neighborhood-Temporal Attention Model for Disk Failure Prediction in Cloud Platforms, C. Luo, P. Zhao, B. Qiao, Y. Wu, H. Zhang, W. Wu, W. Lu, Y. Dang, S. Rajmohan, Q. Lin, D. Zhang, the Web Conference 2021.
  • 37. Virtual Machine (VM) Availability and Disk Failures • Hardware issues are one of the top reasons for VMs going down and rebooting • Disk failures contribute most to the hardware issues Source: https://www.backblaze.com/blog/hard-drive-stats-for-2018/ Source: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/08/a7-narayanan.pdf SSD Annualized Failure Rates
  • 38. Binary Classification Problem The training set is a collection of $N$ training samples, denoted as $D = \{(X_1, y_1), (X_2, y_2), \ldots, (X_N, y_N)\}$. $X_i$ represents the corresponding disk $d_i$'s own status data and neighborhood information, i.e., $X_i = A_i \cup B_i$, where $A_i \in \mathbb{R}^{h \times n}$ represents $d_i$'s own status data and $B_i$ is a subset of the union of all $A_i$. $y_i \in \{0, 1\}$ is the label: $y_i = 1$ means that the corresponding disk will fail in the near future, and $y_i = 0$ means 'healthy'. Loss function: $L = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \cdot \log \hat{y}_i + (1 - y_i) \cdot \log(1 - \hat{y}_i) \right]$
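    The loss on slide 38 is the standard binary cross-entropy averaged over the $N$ samples. A minimal sketch in plain Python follows; the function name and the clipping constant `eps` are assumptions added for numerical safety and are not part of the slide.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average binary cross-entropy over N samples.

    y_true: labels in {0, 1} (1 = disk will fail in the near future).
    y_pred: predicted failure probabilities in [0, 1].
    eps:    hypothetical clipping constant that keeps log() finite.
    """
    n = len(y_true)
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / n


# One failing disk and one healthy disk, both predicted with probability 0.5.
uninformed = binary_cross_entropy([1, 0], [0.5, 0.5])
# The same two disks with confident, correct predictions.
confident = binary_cross_entropy([1, 0], [0.9, 0.1])
```

    A model that predicts 0.5 for every disk incurs a loss of $\log 2$ per sample; confident correct predictions drive the loss towards zero.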
  • 39. Related Work • Traditional machine learning based approaches • Support Vector Machine (SVM) [MSST 2013] • Decision Tree (DT) [DSN 2014] • Random Forest (RF) [DSN 2018] • Gradient Boosting Decision Tree (GBDT) [Ph.D. Dissertation, UCLA 2017] • Regularized Greedy Forest (RGF) [KDD 2016] • Cloud Disk Error Forecasting (CDEF) [USENIX ATC 2018] • Deep Learning based approaches • Recurrent Neural Network (RNN) [IEEE Transactions on Computers 2016] • Long Short-Term Memory (LSTM) [ICDM 2018] • Temporal Convolution Neural Network (TCNN) [DAC 2019] • Convolution Neural Network with Long Short-Term Memory (CNN+LSTM) [FAST 2020] • Neighborhood-Temporal Attention Model (NTAM) [Web Conference 2021] 05/20/2022 MSR 2022 39
  • 40. Observations (1) • VMs can be impacted before disks completely fail • Disk errors occur before a disk completely fails • Disk errors are often reflected by system-level signals such as OS events
    Name: Description
    Timestamp: The timestamp $t$ at which the feature vector was recorded.
    Disk ID: The unique ID of disk $d_i$.
    Node ID: The unique ID of each computing server (i.e., node) that $d_i$ is associated with.
    SMART Attributes: The SMART attributes of $d_i$ recorded at $t$, providing information such as the Current Pending Sector Count, Seek Error Rate, Soft Read Error Rate, etc.
    System-related attributes: OS events such as paging error, file system error, device reset, telemetry loss, etc.
    Driver-related attributes: Gathered from the disk driver, with information on Flush Count, IO Latency, Controller Reset, etc.
  • 41. Observation (2) • A disk’s health status may be impacted by its neighboring disks • Incorporating individual disk’s status and its neighborhood info 05/20/2022 MSR 2022 41 Figure 2: The architecture of the neighborhood-aware component underlying NTAM.
  • 42. Observation (3) • Extremely imbalanced disk population • Data enhancement via Temporal Progressive Sampling (TPS) 05/20/2022 MSR 2022 42 Figure 4: The design of the Temporal Progressive Sampling (TPS) method.
  • 43. Neighborhood-Temporal Attention Model (NTAM) • Neighborhood-aware component To effectively incorporate neighborhood information • Temporal component To better capture temporal information • Decision component Decide whether the corresponding disk will fail in near future or not 05/20/2022 MSR 2022 43 Failure probability Temporal-encoded vector Neighbor-encoded vectors Disk Ai & Neighbors Bi Figure 1: Overview of Neighborhood-aware Attention Model (NTAM).
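    As a rough illustration of how an attention mechanism can fold neighbor information into one vector, the sketch below applies generic scaled dot-product attention to a disk's encoded vector and its neighbors' vectors. This is not the NTAM architecture from the paper; the function names, the toy vectors, and the pooling step are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attention_pool(query, neighbors):
    """Scaled dot-product attention of one query over neighbor vectors.

    query:     encoded vector of the target disk (length d).
    neighbors: list of encoded neighbor vectors, each of length d.
    Returns (weights, pooled), where pooled is the attention-weighted
    sum of the neighbor vectors.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, nb)) / math.sqrt(d)
        for nb in neighbors
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * nb[j] for w, nb in zip(weights, neighbors))
        for j in range(d)
    ]
    return weights, pooled


# Toy example: the first neighbor is most similar to the query.
weights, pooled = attention_pool([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
```

    Neighbors whose vectors align with the target disk's vector receive larger weights, so the pooled vector is dominated by the most similar neighbors.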
  • 44. AI & Software Engineering 05/20/2022 MSR 2022 44 New Research Topic (2)
  • 45. RAISE 2013 Keynote & Vision Statement 05/20/2022 MSR 2022 45
  • 47. IEEE Software 2020 Special Issue 05/20/2022 MSR 2022 47
  • 48. Making IntelliTest More Intelligent Pex journey [ASE 2014] Pex shipped as IntelliTest in Visual Studio Enterprise Edition since 2015 Self-learning (data driven) Thummalapenta, Xie, Tillmann, de Halleux, and Schulte. MSeqGen: Object-Oriented Unit-Test Generation via Mining Source Code. ESEC/FSE 2009.
  • 49. ICSE 2020 Technical Briefing 05/20/2022 MSR 2022 49
  • 50. Programming is not easy, even for an easy task. A question: writing a SQL statement for “top 2 selling brands in each year” given a table of three columns “sales”, “Brand”, and “year”. SELECT e1.brand AS brand, e1.year AS year FROM table e1=(select sum(sale) as salesum, year, brand group by year, brand) LEFT OUTER JOIN table e2=(select sum(sale) as salesum, year, brand group by year, brand) ON (e1.year = e2.year AND e1.salesum >= e2.salesum) GROUP BY e1.brand, e1.year HAVING COUNT(*) <= 2 ORDER BY year;
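    The question on slide 50 can be tried out with Python's built-in `sqlite3` module. The sketch below is one possible runnable formulation, not the statement shown on the slide: the table name `sales_data` and the sample rows are made up, and the self-join counts, for each brand, how many brands in the same year sold at least as much.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with the three columns named in the question.
conn.execute("CREATE TABLE sales_data (sales INTEGER, brand TEXT, year INTEGER)")
rows = [
    (100, "A", 2020), (50, "A", 2020), (120, "B", 2020), (30, "C", 2020),
    (10, "A", 2021), (70, "B", 2021), (90, "C", 2021),
]
conn.executemany("INSERT INTO sales_data VALUES (?, ?, ?)", rows)

QUERY = """
WITH totals AS (
    -- Total sales per brand per year.
    SELECT year, brand, SUM(sales) AS salesum
    FROM sales_data
    GROUP BY year, brand
)
SELECT e1.year, e1.brand, e1.salesum
FROM totals AS e1
JOIN totals AS e2
  ON e1.year = e2.year AND e2.salesum >= e1.salesum
GROUP BY e1.year, e1.brand, e1.salesum
-- COUNT(*) is the brand's rank within its year (1 = best seller).
HAVING COUNT(*) <= 2
ORDER BY e1.year, e1.salesum DESC
"""

top2 = conn.execute(QUERY).fetchall()
for year, brand, salesum in top2:
    print(year, brand, salesum)
```

    With this counting approach, brands tied for second place are both excluded, because each counts the other as selling at least as much; a ranking function would be needed if ties should be kept.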
  • 51. NL2Regex, NL2SQL, ... 05/20/2022 MSR 2022 51 Zhong, Guo, Yang, Peng, Xie, Lou, Liu and Zhang. SemRegex: A Semantics-Based Approach for Generating Regular Expressions from Natural Language Specifications. EMNLP 2018. Guo, Liu, Lou, Li, Liu, Xie, and Liu. Benchmarking Meaning Representations in Neural Semantic Parsing. EMNLP 2020. Dong, Sun, Liu, Lou, and Zhang. Data-Anonymous Encoding for Text-to-SQL Generation. EMNLP 2019. Conversational Interface for
  • 52. NL to Data Analysis/Visualization in Excel
  • 53. aiXcoder Within 1 month of aiXcoder 2.0 going online (the current version is 4.0), downloads exceeded 130K. So far: 2C: 300K users; 2B: major banks/IT companies. https://aixcoder.com/en/
  • 54. aiXcoder L and Next 05/20/2022 MSR 2022 54 Billion-scale model parameters NL2Code
  • 55. New Trend: Big Pre-trained Model + Task Adaptation GPT-3 can program?
  • 57. Data Driven vs. Problem Driven 05/20/2022 MSR 2022 57
  • 58. AI + Human Intelligence 05/20/2022 MSR 2022 58
  • 59. Making Impact in Practice • Finding the critical scenario • Closing the loop • End-to-end and fast iteration 05/20/2022 MSR 2022 59 Perspective Potential Impact Problem Applicability Assumption Problem validity Constraint Formulation and solution Requirement Evaluation Usefulness in practice Technology readiness framework
  • 60. Takeaways • Software Analytics is the digital transformation of software industry • Thriving community • New research topics • Cloud Intelligence • AI and Software Engineering • Reflections • Data driven vs. problem driven • AI + human intelligence • Making impact in practice • WE ARE HIRING!
  • 61. Acknowledgement Sincere thank-you to all the academic collaborators, colleagues and partners in Microsoft, and our talented intern students for the collaboration and partnership over the years! 05/20/2022 MSR 2022 61