How ChatGPT and AI-assisted coding changes software engineering profoundly

Pekka Abrahamsson / Tampere University
March 30, 2023

  1. How ChatGPT & AI-assisted Coding Changes Software Engineering Profoundly • Professor Pekka Abrahamsson, Tampere University, Finland • Keynote Address, The 38th ACM/SIGAPP Symposium On Applied Computing, March 30, 2023
  2. Pekka Abrahamsson • Dr. Pekka Abrahamsson works as a full professor of software engineering at Tampere University in Finland. He received his PhD in Software Engineering in 2002 from the University of Oulu. His research is in the area of emerging software technologies, empirical software engineering, and the ethics of artificial intelligence. • Before his current position, he served as a full professor at the University of Jyväskylä (Finland), the University of Helsinki (Finland), the Free University of Bolzano (Italy), and the Norwegian University of Science and Technology (Norway). He also worked at VTT Technical Research Centre of Finland as a research professor of software technologies. • He is widely recognized for his academic achievements. He is a pioneer in the field of research on agile software engineering methods and processes. Abrahamsson is the most cited researcher in his field in Finland. He is the first Professor of Software Engineering to be invited to the Finnish Academy of Science and Letters. • He has published broadly in his areas of expertise and received many awards and recognitions. He was recently ranked in the all-time top 1% of software engineering scientists globally. Arnetminer named him among the 100 most influential software engineering scientists in the world in 2016. Abrahamsson was awarded the Nokia Foundation Award 2007. He is co-founder of the Software Startup Research Network (SSRN) and a seasoned expert in leading large research projects. • His h-index is 62 and he has more than 15,600 citations (March 2023)
  3. Shocking news! • “There is a general agreement that the state of art in practice [in software industry] is unsatisfactory. • This state is often described by the term “software crisis” referring to the poor quality of systems, excessive costs, schedule and budget overruns. • It is suggested that the problems lie not in the lack of methods, techniques or tools. • We agree and suggest that the fundamental problem is the limited understanding of system design and its basic principles.”
  4. Shocking news... 35 years ago... • “There is a general agreement that the state of art in practice [in software industry] is unsatisfactory. • This state is often described by the term “software crisis” referring to the poor quality of systems, excessive costs, schedule and budget overruns. • It is suggested that the problems lie not in the lack of methods, techniques or tools. • We agree and suggest that the fundamental problem is the limited understanding of system design and its basic principles.” Source: Iivari, J. & Koskela, E. (1987): “The PIOCO Model for Information Systems Design”, MIS Quarterly, 11(3), pp. 401-419
  5. Universal Solution Fallacy • We should have known this? • Figure label: New method/technology • Source: Malouin, J. L. and M. Landry (1983). "The mirage of universal methods in systems design." Journal of Applied Systems Analysis 10: 47-62.
  6. (Ongoing) Misconceptions in the field • Dependable large systems can only be attained through rigorous application of the engineering design process • The key design objective is an architecture that meets specifications derived from knowable and collectable requirements • Individuals of sufficient talent and experience can achieve an intellectual grasp of the system • The implementation can be completed before the environment changes very much Source: Denning, P.J., Gunderson, C. and Hayes-Roth, R., 2008. The profession of IT: Evolutionary system development. Communications of the ACM, 51(12), pp. 29-31.
  7. A state-of-the-art process: Preparing Ditalini with flageolet & pesto • Unreliable source • Unfamiliar terminology • Ambiguous instructions • Confusing measures • Incomplete instructions • Unclear goal
  8. Manipulability, Safety, Vulnerability, Volatility, Robustness, Sustainability, Dependability, Friendliness, Shameability, Pleasurability, Substitution of human contact, Normative recognition, Data quality, Moral de/re/upskilling, Alienation, Dignity, Virtuousness, Trustability, Benevolence, Care concerns, Abusability, Responsibility, Value sensitivity, Malevolence, Lethality, Maleficence, Fairness, Unpredictability, Social sorting, Social solidarity, Universal service, Respect for autonomy, Legality, Consent, Access to data, Data collection limitation, Privacy, Foreseeability, Predictability, Deceptability, Liability, Transparency, Righteousness, Blamability, Biasness. Source: Vakkuri, V. and Abrahamsson, P., 2018. The key concepts of ethics of artificial intelligence. In 2018 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC) (pp. 1-6). IEEE.
  9. Summary: What makes software engineering so hard? • We are falling short in all the key areas of software engineering: • Requirements gathering and management • Technical debt • Integration and interoperability • Security and privacy • Scalability and performance • Testing and quality assurance • Talent shortage • We rely too much on human effort in software development. More than 80% of the code today is still manually entered.
  10. 211 companies were surveyed. It is a jungle out there… For Ethically Aligned AI Development Source: Vakkuri, V., Kemell, K.K., Jantunen, M., Halme, E. and Abrahamsson, P., 2021. ECCOLA—A method for implementing ethically aligned AI systems. Journal of Systems and Software, 182, p.111067. Download your copy at bit.ly/eccola-method
  11. Early personal experimentation in January 2023
  12. Code completion tools • Microsoft’s GitHub Copilot uses a large language model called Codex, developed by OpenAI and based on GPT-3 • Trained on GitHub code • Works as a developer’s assistant (pair programmer) • Focused only on code • May introduce errors • 55% increase in productivity (one study) Source: Pudari, R. and Ernst, N.A., 2023. From Copilot to Pilot: Towards AI Supported Software Development. arXiv preprint arXiv:2303.04142.
  13. Source: DALL-E generated images
  14. Maybe ChatGPT (and language models) are just hype?
  15. https://futureoflife.org/open-letter/pause-giant-ai-experiments/ • Therefore, we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4. This pause should be public and verifiable, and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.
  16. ChatGPT factsheet • A chatbot developed by the US-based company OpenAI, with operations funded to a significant degree by Microsoft • Built on top of the large language models (LLMs) GPT-3.5 and GPT-4 • 100 million+ users, 25 million daily • GPT-3.5 has roughly 175 billion parameters (the figure published for GPT-3; not officially confirmed for GPT-3.5), GPT-4 has something between 400 and 1,000 billion (not confirmed) • It is now estimated to produce a volume of text every 14 days that is equivalent to all the printed works of humanity. • Source: Dr Thompson, Feb 2023, cited in a report by the National Bureau of Economic Research (Scholes, Bernanke, MIT)
  17. GPT-4 promiseware • GPT-4 accepts both image and text inputs (note: output is text only today) • Some demoed applications: • GPT-4 can convert your hand-drawn website mockups into actual website code. • See your refrigerator contents and tell you recipes you can make. • Read the tax code and calculate your taxes while citing sources. • GPT-4 outperforms ChatGPT (GPT-3.5) on most academic and professional exams taken by humans, such as the SAT, GRE, and bar exams. • GPT-4 scored in the 90th percentile on the Uniform Bar Exam, compared to GPT-3.5, which scored in the 10th percentile. • OpenAI reports that GPT-4 is 82% less likely than GPT-3.5 to respond to requests for disallowed content, and it has better guardrails. • ChatGPT plugins will be a game-changer for GPT, allowing it to talk to external apps like Zapier, Wolfram, code interpreters, etc. OpenAI may have ushered in a new era of AI app stores.
  18. 15 ways to benefit from ChatGPT • Natural Language Understanding • Multilingual Conversations • Knowledge Base • Creative Writing • Problem Solving • Simulating Conversations • Personalized Recommendations • Summarization and Simplification • Debates and Perspectives • Code and Technical Help • Role-playing and Gaming • Learning and Education • Emotional Support • Language Translation • Grammar and Writing Assistance
  19. How is ChatGPT argued to help software engineers? 1. Providing answers to technical questions: Software engineers often encounter complex technical problems that require research and analysis. ChatGPT can provide quick and accurate answers to these questions, drawing on a vast repository of knowledge. 2. Generating code snippets: ChatGPT can also generate code snippets for specific tasks, which can save software engineers time and effort. This can be particularly useful for common tasks or for code that follows a specific pattern. 3. Assisting with debugging: ChatGPT can help software engineers identify and troubleshoot issues in their code by analyzing error messages and providing suggestions for fixes. 4. Offering insights on emerging technologies: ChatGPT can keep software engineers up-to-date with the latest trends and advancements in their field, such as new programming languages, frameworks, or tools. 5. Supporting collaboration: ChatGPT can help facilitate collaboration among software engineers by providing a platform for real-time communication and sharing of ideas and resources.
  20. Known issues / challenges • There are several problems with the use of ChatGPT, Copilot and others, which need to be solved before wider adoption: • Code ownership, IPR issues • Limited applicability scope (limited due to training data) • False instructions, advice, information • Code defects • Known and unknown security threats • Security and privacy concerns • Working in a client development environment • Difficulty in integrating with an existing workflow and tools • Costs of large language models can be very high
  21. ChatGPT’s own advice on IPR issues
  22. Common Use Cases • Use Case 1: AI-assisted learning / project onboarding / training / personal assistant • Use Case 2: AI-assisted software engineering / development • Use Case 3: AI-assisted decision making based on your own data
  23. What do the scholars say now? • ~1000 papers on large language models on arXiv (as of March 28) • 52 papers on LLMs and software engineering • General themes covered: Program Synthesis, AI Evaluation, Bug Detection, Error Handling, Learning Materials Generation, Code Analysis, Code Completion Systems, Reverse Engineering, Spreadsheet Models and Code Poisoning • 170 articles about or employing ChatGPT on arXiv • 90 articles with ChatGPT in the title • Only three studies related to software engineering: • ChatGPT and Software Testing Education: Promises & Perils (experiment) • Towards Human-Bot Collaborative Software Architecting with ChatGPT (case study) • ChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Design (experience-based)
  24. Example Prompt engineering patterns for SW development Source: White, J., Hays, S., Fu, Q., Spencer-Smith, J. and Schmidt, D.C., 2023. ChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Design. arXiv preprint arXiv:2303.07839.
  25. Example Prompt engineering patterns for SW development Source: White, J., Hays, S., Fu, Q., Spencer-Smith, J. and Schmidt, D.C., 2023. ChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Design. arXiv preprint arXiv:2303.07839.
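A minimal sketch of how a prompt pattern could be kept as a reusable template in a development workflow. The pattern names, the template wording, and the `build_prompt` helper are all hypothetical and are not quoted from White et al. (2023); the sketch only builds the prompt text and does not call any model.

```python
from string import Template

# Hypothetical pattern templates: the wording is illustrative only and is
# NOT quoted from White et al. (2023).
PATTERNS = {
    "refactor": Template(
        "Within this conversation, refactor the code below to improve "
        "$quality. Keep its external behavior unchanged and list every "
        "assumption you make.\n\n$code"
    ),
    "requirements": Template(
        "Act as the system described by these requirements. I will state "
        "actions; tell me whether the requirements cover them and point out "
        "any gaps.\n\n$requirements"
    ),
}


def build_prompt(pattern: str, **fields: str) -> str:
    """Fill the named template.

    Raises KeyError for an unknown pattern or a missing field, so an
    incomplete prompt is never produced silently.
    """
    return PATTERNS[pattern].substitute(**fields)


prompt = build_prompt(
    "refactor",
    quality="readability",
    code="def f(a,b): return a+b",
)
print(prompt)
```

Keeping the pattern text separate from the filled-in fields makes it possible to review and version prompts like any other project artifact.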
  26. Studied themes varied greatly • Virtual Reality and Metaverse • Translation Evaluation • Machine Translation • Ethics and Regulation • Academic Publishing • Plagiarism Detection • AI Generated Content • Bug Fixing • Bioinformatics • Sentiment Analysis • Medical Advice • Construction Project Scheduling • Software Testing Education • Large Language Model Failures • Statistical Process Control • Designer AI • Ordered Importance Communications • Learning Gain Comparison • Zero-Shot Information Extraction • Causal-Discovery Performance • AI Ethics
  27. Some empirical findings • ChatGPT was able to respond correctly to 56% of software testing exam questions, Jalil et al., 2023 • ChatGPT narrowly passed a computer science exam (24/40, student average 24), Bordt and von Luxburg, 2023 • ChatGPT closely resembles human patterns in language use, Cai et al., 2023 (10/12 experiments passed) • ChatGPT's ranking preferences are quite consistent with those of humans, Ji et al., 2023 (can be used to categorize data, zero-shot ranking capability good) • ChatGPT beats Grammarly in fixing grammatical errors, Wu et al., 2023 • ChatGPT’s zero-shot Text-to-SQL capabilities are impressively good, Liu et al., 2023 • ChatGPT is an excellent keyphrase generator, Song et al., 2023 • ChatGPT lacks moral authority and is not consistent in its advice, Krügel et al., 2023 • ChatGPT is already at commercial product level in language translation, Jiao et al., 2023 • ChatGPT is 20x less costly than MTurk for text annotation tasks and more accurate, Gilardi et al., 2023
  28. Conducting Systematic Literature Reviews with ChatGPT: A Proposal Source: Waseem, M., Ahmad, A., Liang, P., Fahmideh, M., Abrahamsson, P. and Mikkonen, T., 2023. Conducting Systematic Literature Reviews with ChatGPT. ResearchGate.
  29. Final thought: a new must-have skill for you all is the art of prompt engineering
  30. Key messages • Despite advances, software engineering continues to be in crisis • Adoption of AI-assisted tools is still in its infancy • Introduction of LLMs may be a game changer in the field of SE and in other fields as well. • ChatGPT offered the missing user interface for the use of AI in various contexts. While scientific studies are still coming, early results indicate positive influences across many sectors. • It may be hot air as well • An assistant that delivers 50% false results and provides a different answer to every question would get fired in real life • Ethics issues are real, training material is biased • Yet I believe that we should explore the new AI tools such as ChatGPT with full force • The question remains: how will ChatGPT help your research?
  31. Reach me at pekka.abrahamsson@tuni.fi