ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven Research
1. Research Methodology on
Pursuing Impact-Driven Research
Tao Xie
Department of Computer Science
University of Illinois at Urbana-Champaign
taoxie@illinois.edu
http://taoxie.cs.illinois.edu/
Innovations in Software Engineering Conference (ISEC 2018)
Feb 9-11 2018, Hyderabad, India
2. Evolution of Research Assessment
• #Papers →
• #International Venue Papers →
• #SCI/EI Papers →
• #CCF A (B/C) Category Papers →
• ???
CRA 2015 Report:
"Hiring Recommendation. Evaluate candidates on the basis of the contributions in their top one or two publications, ..."
"Tenure and Promotion Recommendation. Evaluate candidates for tenure and promotion on the basis of the contributions in their most important three to five publications (where systems and other artifacts may be included)."
http://cra.org/resources/best-practice-memos/incentivizing-quality-and-impact-evaluating-scholarship-in-hiring-tenure-and-promotion/
3. Societal Impact
ACM Richard Tapia Celebration of Diversity in Computing
Join us at the next Tapia Conference in Orlando, FL on September 19-22, 2018!
http://tapiaconference.org/
Margaret Burnett: "Womenomics & Gender-Inclusive Software"
"Because anybody who thinks that we're just here because we're smart forgets that we're also privileged, and we have to extend that farther. So we've got to educate and help every generation and we all have to keep it up in lots of ways."
- David Notkin, 1955-2013
Andy Ko: "Why the Software Industry Needs Computing Education Research"
4. Impact on Research Communities Beyond SE
Roy Fielding, Richard Taylor: Representational State Transfer (REST) as a key architectural principle of WWW (2000)
Andreas Zeller: Delta debugging (1999)
Lori Clarke: Symbolic execution (1976), also by James King, William Howden, Karl Levitt, et al.
...
Related to funding/head-count allocation, student recruitment, ... → community growth
http://asegrp.blogspot.in/2016/07/outward-thinking-for-our-research.html
5. Practice Impact
• Diverse/balanced research styles shall/can be embraced
• Our community already well appreciates impact on other researchers, e.g., SIGSOFT Impact Awards, ICSE MIP, paper citations
• But often insufficient effort for the last mile or focus on real problems
• Strong need for an ecosystem to incentivize practice impact pursued by researchers
• Top down:
• Bottom up:
• Conference PC for reviewing papers
• Impact counterpart of "highly novel ideas"?
• Impact counterpart of "artifact evaluation"?
• Promote and recognize practice impact
• Counterpart of ACM Software System Award?
http://www.cs.umd.edu/hcil/newabcs/
http://cra.org/resources/best-practice-memos/incentivizing-quality-and-impact-evaluating-scholarship-in-hiring-tenure-and-promotion/
6. Practice-Impact Levels of Research
• Study/involve industrial data/subjects
• Indeed, insights sometimes may benefit practitioners
• Hit (with a tool) and run
• Authors hit and run (upon industrial data/subjects)
• Practitioners hit and run
• Continuous adoption by practitioners
• Importance of benefited domain/system (which can be just a single one)
• Ex. WeChat test generation tool → WeChat with > 900 million users
• Ex. MSRA SA on SAS → MS Online Service with hundreds of millions of users
• Ex. Beihang U. on CarStream → Shenzhou Rental with > 30,000 vehicles over 60 cities
• Scale of practitioner users
• Ex. MSR Pex → Visual Studio 2015+ IntelliTest
• Ex. MSR Code Hunt with close to 6 million registered/anonymous/API accounts
• Ex. MSRA SA XIAO → Visual Studio 2012+ Clone Analysis
Consider that >90% of startups fail! It is challenging to start from research and then single-handedly bring it to continuous adoption by target users; academia-industry collaborations are often desirable.
7. Practice-Impact Levels of Research
• If there are practice impacts but no underlying research (e.g., published research), then there is no practice-impactful research
• More like a startup's or a big company's product with business secrets
• Some industry-academia collaborations treat university researchers (students) like cheap(er) engineering labor → no or little research
8. Desirable Problems for Academia-Industry Collaborations
• Not all industrial problems are worth effort investment from university groups
• High business/industry value
• Allow research publications (not business secrets) to advance knowledge
• Challenging problem (does it need highly intellectual university researchers?)
• Desirably, real manpower investment from both sides
• My recent examples
• Tencent WeChat [FSE'16 Industry], [ICSE'17 SEIP]: Android app testing/analysis
• Exploring collaborations with Baidu, Alibaba, Huawei, etc.
• Exploring new collaborations with MSRA SA
9. Sustained Productive Academia-Industry Collaborations
• Careful selection of target problems/projects
• Desirable to start with money-free collaborations(?)
• If the industry (lab) side is also curiosity-driven, watch out.
• Each collaboration party needs to bring in something important and unique: a win-win situation
• High demand for abstraction/generalization skills from the academic collaborators to pursue research upon high-practice-impact work.
• Think more about the interest/benefit of the collaborating party
• (Long-term) relationship/trust building
• Mutual understanding of expected contributions to the collaborations
• Balancing research and "engineering"
• Focus, commitment, deliverables, funding, ...
10. Optimizing "Research Return": Pick a Problem Best for You
Your Passion (Interest/Passion)
High Impact (Societal Needs/Purpose)
Your Strength (Gifts/Potential)
Best problems for you
Find your passion: If you don't have to work/study for money, what would you do?
Test of impact: If you are given $1M to fund a research project, what would you fund?
Find your strength/Avoid your weakness: What are you (not) good at?
Find what interests you, that you can do well, and that is needed by the people
Adapted from slides by ChengXiang Zhai, YY Zhou
11. Brief Desirable Characteristics of Your Paper/Project
• Two main elements
• Interesting idea(s) accompanying interesting claim(s)
• Claim(s) well validated with evidence
• Then how to define "interesting"?
• Really depends on the readers' taste, but there may be a general taste for a community
• Ex: being the first in X, being non-trivial, contradicting conventional wisdom, ...
• Can be along the problem or solution space; in SE, being the first to point out a refreshing and practical problem would be much valued
• Uniqueness, elegance, significance?
D. Notkin: Software, Software Engineering and Software Engineering Research: Some Unconventional Thoughts. J. Comput. Sci. Technol. 24(2): 189-197 (2009) https://link.springer.com/article/10.1007/s11390-009-9217-4
D. Notkin's ICSM 2006 keynote talk.
12. Factors Affecting Choosing a Problem/Project
• What factors affect you (not) to choose a problem/project?
• Besides your supervisor/mentor asking you (not) to choose it
http://www.weizmann.ac.il/mcb/UriAlon/nurturing/HowToChooseGoodProblem.pdf
13. Big Picture and Vision
• Step back and think about what research problems will be most important and most influential/significant to solve in the long term
• Long term could be the whole career
• People tend not to think about important/long term problems
Richard Hamming "you and your research"
http://www.cs.virginia.edu/~robins/YouAndYourResearch.html
Ivan Sutherland "technology and courage"
http://labs.oracle.com/techrep/Perspectives/smli_ps-1.pdf
[Figure: grid with axes "Less important / More important" and "Shorter term / Longer term"]
This slide was made based on
discussion with David Notkin
14. Research Space
Talk: The Pipeline from Computing Research to Surprising Inventions by Peter Lee
http://www.youtube.com/watch?v=_kpjw9Is14Q
http://blogs.technet.com/b/inside_microsoft_research/archive/2011/12/31/microsoft-research-redmond-year-in-review.aspx
a blog post by Peter Lee
© Peter Lee
15. Big Picture and Vision - cont.
• If you are given 1 (4) million dollars to lead a team of 5 (10) team members for 5 (10) years, what would you invest it in?
16. Factors Affecting Choosing a Problem/Project
• Impact/significance: Is the problem/solution important? Are there any significant challenges?
• Industrial impact, research impact, ...
• DON'T work on a problem that you imagine but that is not a real problem
• E.g., determined based on your own experience, observation of practice, feedback from others (e.g., colleagues, industrial collaborators)
• Novelty: is the problem novel? is the solution novel?
• If a well explored or crowded space, watch out (how much space/depth? how many people in that space?)
17. Factors Affecting Choosing a Problem/Project II
• Risk: how likely is the research to fail?
• Reduced with significant feasibility studies and risk management in the research development process
• E.g., manual "mining" of bugs
• Cost: how much effort investment would be needed?
• Sometimes can be reduced by using tools and infrastructures available to us
• Need to consider evaluation cost (solutions to some problems may be difficult to evaluate)
• But don't shut down a direction simply due to cost
18. Factors Affecting Choosing a Problem/Project III
• Better than existing approaches (in important ways) besides new: engineering vs. science
• Competitive advantage
• "Secret weapon"
• Why are you/your group the best one to pursue it?
• Ex. a specific tool/infrastructure, access to specific data, collaborators, an insight, ...
• Need to know your own strengths/weaknesses
• Underlying assumptions and principles: how do you (systematically) choose what to pursue?
• Core values that drive your research agenda in some broad way
This slide was made based on discussion with David Notkin
19. Example Principles - Problem Space
• Question core assumptions or conventional wisdom about SE
• Play around with industrial tools to address their limitations
• Collaborate with industrial collaborators to decide on problems of relevance to practice
• Investigate SE mining requirements and adapt or develop mining algorithms to address them (e.g., Suresh Thummalapenta [ICSE 09, ASE 09])
D. Notkin: Software, Software Engineering and Software Engineering Research: Some Unconventional Thoughts. J. Comput. Sci. Technol. 24(2): 189-197 (2009) https://link.springer.com/article/10.1007/s11390-009-9217-4
D. Notkin's ICSM 2006 keynote talk.
20. Example Principles - Solution Space
• Integration of static and dynamic analysis
• Using dynamic analysis to realize tasks originally realized by static analysis
• Or the other way around
• Using compilers to realize tasks originally realized by architectures
• Or the other way around
• ...
21. Factors Affecting Choosing a Problem/Project IV
• Intellectual curiosity
• Other benefits (including option value)
• Emerging trends or space
• Funding opportunities, e.g., security
• Infrastructure used by later research
• ...
• What you are interested in, enjoy, are passionate about, and believe in
• AND a personal taste
• Tradeoff among different factors
22. Dijkstra's Three Golden Rules for Successful Scientific Research
1. "Internal": Raise your quality standards as high as you can live with, avoid wasting your time on routine problems, and always try to work as closely as possible at the boundary of your abilities. Do this, because it is the only way of discovering how that boundary should be moved forward.
2. "External": We all like our work to be socially relevant and scientifically sound. If we can find a topic satisfying both desires, we are lucky; if the two targets are in conflict with each other, let the requirement of scientific soundness prevail.
http://www.cs.utexas.edu/~EWD/ewd06xx/EWD637.PDF
23. Dijkstra's Three Golden Rules for Successful Scientific Research cont.
3. "Internal/External": Never tackle a problem of which you can be pretty sure that (now or in the near future) it will be tackled by others who are, in relation to that problem, at least as competent and well-equipped as you.
http://www.cs.utexas.edu/~EWD/ewd06xx/EWD637.PDF
24. Jim Gray's Five Key Properties for a Long-Range Research Goal
• Understandable: simple to state.
• Challenging: not obvious how to do it.
• Useful: clear benefit.
• Testable: progress and solution are testable.
• Incremental: can be broken into smaller steps
• So that you can see intermediate progress
http://arxiv.org/ftp/cs/papers/9911/9911005.pdf
http://research.microsoft.com/pubs/68743/gray_turing_fcrc.pdf
25. Tony Hoare's Criteria for a Grand Challenge
• Fundamental
• Astonishing
• Testable
• Inspiring
• Understandable
• Useful
• Historical
http://vimeo.com/39256698
http://www.cs.yale.edu/homes/dachuan/Grand/HoareCC.pdf
The Verifying Compiler: A Grand Challenge for
Computing Research by Hoare, CACM 2003
26. Tony Hoare's Criteria for a Grand Challenge cont.
• International
• Revolutionary
• Research-directed
• Challenging
• Feasible
• Incremental
• Co-operative
27. Tony Hoare's Criteria for a Grand Challenge cont.
• Competitive
• Effective
• Risk-managed
28. Heilmeier's Catechism
Anyone proposing a research project or product development effort should be able to answer:
• What are you trying to do? Articulate your objectives using absolutely no jargon.
• How is it done today, and what are the limits of current practice?
• What's new in your approach and why do you think it will be successful?
• Who cares?
• If you're successful, what difference will it make?
• What are the risks and the payoffs?
• How much will it cost?
• How long will it take?
• What are the midterm and final "exams" to check for success?
http://www9.georgetown.edu/faculty/yyt/bolts&nuts/TheHeilmeierCatechism.pdf
29. Ways of Coming Up with a Problem/Project
• Know and investigate the literature and the area
• Investigate assumptions, limitations, generality, practicality, validation of existing work
• Address issues in your own development experiences or from other developers'
• Explore what is "hot" (pros and cons)
• See where your "hammers" could hit or be extended
• Ask "why not" on your own work or others' work
• Understand existing patterns of thinking
• http://people.engr.ncsu.edu/txie/adviceonresearch.html
• Think more and hard, and interact with others
• Brainstorming sessions, reading groups
• ...
Some points were extracted from Barbara Ryder's slides: http://cse.unl.edu/~grother/nsefs/05/research.pdf
30. Example Techniques on Producing Research Ideas
• Research Matrix (Charles Ling and Qiang Yang)
• Shallow/Deep Paper Categorization (Tao Xie)
• Paper Recommendation (Tao Xie)
• Students recommend/describe a paper (not read by the advisor before) to the advisor and start brainstorming from there
• Research Generalization (Tao Xie)
• "Balloon"/"donut" technique
31. Technique: Research Matrix
© Charles Ling and Qiang Yang
See Book Chapter 4.3: Crafting Your Research Future: A Guide to Successful Master's and
Ph.D. Degrees in Science & Engineering by Charles Ling and Qiang Yang
http://www.amazon.com/Crafting-Your-Research-Future-Engineering/dp/1608458105
32. Technique: Shallow Paper Categorization
© Tao Xie
• See the shallow paper categorization by Tao Xie's research group:
• https://sites.google.com/site/asergrp/bibli
• Categorize papers on the research topic being focused on
• Both the resulting categories and the process of collecting and categorizing papers are valuable
33. Technique: Deep Paper Categorization
© Tao Xie
• Adopted by Tao Xie's research group and collaborators
• Categorize papers on the research topic being focused on (in a deep way)
• Draw a table (rows: papers; columns: characterization dimensions of papers)
• Compare and find gaps/correlations across papers
Example Table on Symbolic Analysis:
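The table-and-gaps idea can be sketched in a few lines of Python. This is only an illustration, not the group's actual tooling: the paper names, the two dimensions ("analysis", "target"), and the helper names `dimension_values` and `find_gaps` are all hypothetical. It builds a table with papers as rows and dimensions as columns, then lists the combinations of observed values that no paper covers.

```python
from itertools import product

# Hypothetical papers and characterization dimensions, for illustration only.
papers = {
    "Paper A": {"analysis": "static", "target": "unit tests"},
    "Paper B": {"analysis": "dynamic", "target": "unit tests"},
    "Paper C": {"analysis": "static", "target": "system tests"},
}


def dimension_values(table):
    """Collect the set of values observed along each column (dimension)."""
    values = {}
    for row in table.values():
        for dim, val in row.items():
            values.setdefault(dim, set()).add(val)
    return values


def find_gaps(table):
    """Return the value combinations that no paper in the table covers."""
    values = dimension_values(table)
    dims = sorted(values)
    covered = {tuple(row.get(d) for d in dims) for row in table.values()}
    all_combos = product(*(sorted(values[d]) for d in dims))
    return [dict(zip(dims, combo)) for combo in all_combos if combo not in covered]


gaps = find_gaps(papers)
print(gaps)
```

An uncovered combination is only a candidate gap: it may be unexplored, or it may simply not make sense, which is where the "compare" step still needs human judgment.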
34. Technique: "Balloon"/"Donut"
© Tao Xie
• Adopted by Tao Xie's research group and collaborators
• Balloon: the process is like blowing air into a balloon
• Donut: the final outcome is like a donut shape (with the actual realized problem/tool as the inner circle and the applicable generalized problem/solution boundary addressed by the approach as the outer circle)
• Process: do the following for the problem/solution space separately
• Step 1. Describe the exact concrete problem/solution that your tool addresses/implements (assume it is X)
• Step 2. Ask questions like "Why X? But not an expanded scope of X?"
• Step 3. Expand/generalize the description by answering the questions (sometimes you need to shrink if you overgeneralize)
• Go to Step 1
35. Example Application of "Balloon"/"Donut"
© Tao Xie
• Final Product: Xusheng Xiao, Tao Xie, Nikolai Tillmann, and Jonathan de Halleux. Precise Identification of Problems for Structural Test Generation. ICSE 2011
• Problem Space
• Step 1. (Inner circle) Address too many false-warning issues reported by Pex
• Step 2. Why Pex? But not dynamic symbolic execution (DSE)?
• Step 3. Hmmm... the ideas would work for the same problem faced by DSE too
• Step 1. Address too many false-warning issues reported by DSE
• Step 2. Why DSE? But not symbolic execution?
• Step 3. Hmmm... the ideas would work for the same problem faced by symbolic execution too
• ...
• Outer circle: Address too many false-warning issues reported by test-generation tools that focus on structural coverage and analyze code for test generation (some techniques work for random test generation too)
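The problem-space walk above is essentially a loop, which the following Python sketch makes explicit. It is a toy illustration under stated assumptions: the function `balloon` and the `still_works` callback are hypothetical names, and the callback stands in for the human judgment in Step 3 (does the idea still hold for the wider scope?). The scope chain is taken from the slide's own example.

```python
# Generalization chain taken from the slide's problem-space example.
SCOPES = ["Pex", "dynamic symbolic execution (DSE)", "symbolic execution"]


def balloon(scopes, still_works):
    """Expand the problem statement one scope at a time.

    `still_works(scope)` is a hypothetical stand-in for Step 3: a judgment
    call on whether the idea still holds for the wider scope.
    Returns the problem statements from the inner circle outward.
    """
    statements = []
    for scope in scopes:
        if not still_works(scope):
            break  # stop expanding: going wider would overgeneralize
        # Step 1: state the concrete problem for the current scope.
        statements.append(
            f"Address too many false-warning issues reported by {scope}"
        )
        # Step 2 is the implicit question driving the next iteration:
        # "Why this scope? But not the next wider one?"
    return statements


steps = balloon(SCOPES, still_works=lambda scope: True)
for statement in steps:
    print(statement)
```

The first statement returned is the inner circle and the last one is the widest boundary the ideas were judged to support, i.e., the outer circle of the donut.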
36. Example Application of "Balloon"/"Donut"
© Tao Xie
• Final Product: Xusheng Xiao, Tao Xie, Nikolai Tillmann, and Jonathan de Halleux. Precise Identification of Problems for Structural Test Generation. ICSE 2011
• Solution Space
• Step 1. (Inner circle) Realize issue pruning based on symbolic analysis implemented with Pex
• Step 2. Why Pex? But not dynamic symbolic execution (DSE)?
• Step 3. Hmmm... the ideas can be realized with general DSE
• Step 1. Realize issue pruning based on symbolic analysis implemented with DSE
• Step 2. Why DSE? But not symbolic execution?
• Step 3. Hmmm... the ideas can be realized with general symbolic execution
• ...
• Outer circle: Realize issue pruning based on dynamic data dependence (which can be realized with many different techniques!); potentially the approach can use static data dependence, but with tradeoffs between dynamic and static
37. More Advice Resources
• Advice on Writing Research Papers: https://www.slideshare.net/taoxiease/how-to-write-research-papers-24172046
• Common Technical Writing Issues: https://www.slideshare.net/taoxiease/common-technical-writing-issues-61264106
• More advice at http://taoxie.cs.illinois.edu/advice/
38. More Reading
• "On Impact in Software Engineering Research" by Andreas Zeller
• "Doing Research in Software Analysis Lessons and Tips" by Zhendong Su
• "Some Research Paper Writing Recommendations" by Arie van Deursen
• "Does Being Exceptional Require an Exceptional Amount of Work?" by Cal Newport
• Book: Crafting Your Research Future: A Guide to Successful Master's and Ph.D. Degrees in Science & Engineering by Charles Ling and Qiang Yang
39. Experience Reports on Successful Tool Transfer
• Yingnong Dang, Dongmei Zhang, Song Ge, Ray Huang, Chengyun Chu, and Tao Xie. Transferring Code-Clone Detection and Analysis to Practice. In Proceedings of ICSE 2017, SEIP.
http://taoxie.cs.illinois.edu/publications/icse17seip-xiao.pdf
• Nikolai Tillmann, Jonathan de Halleux, and Tao Xie. Transferring an Automated Test Generation Tool to Practice: From Pex to Fakes and Code Digger. In Proceedings of ASE 2014, Experience Papers.
http://taoxie.cs.illinois.edu/publications/ase14-pexexperiences.pdf
• Jian-Guang Lou, Qingwei Lin, Rui Ding, Qiang Fu, Dongmei Zhang, and Tao Xie. Software Analytics for Incident Management of Online Services: An Experience Report. In Proceedings of ASE 2013, Experience Paper.
http://taoxie.cs.illinois.edu/publications/ase13-sas.pdf
• Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, Haidong Zhang, and Tao Xie. Software Analytics in Practice. IEEE Software, Special Issue on the Many Faces of Software Analytics, 2013.
http://taoxie.cs.illinois.edu/publications/ieeesoft13-softanalytics.pdf
• Yingnong Dang, Dongmei Zhang, Song Ge, Chengyun Chu, Yingjun Qiu, and Tao Xie. XIAO: Tuning Code Clones at Hands of Engineers in Practice. In Proceedings of ACSAC 2012.
http://taoxie.cs.illinois.edu/publications/acsac12-xiao.pdf