Basics of Computation and Modeling - Lecture 2 in Introduction to Computational Social Science

Second lecture of the course CSS01: Introduction to Computational Social Science at the University of Helsinki, Spring 2015 (http://blogs.helsinki.fi/computationalsocialscience/).

Lecturer: Lauri Eloranta
Questions & Comments: https://twitter.com/laurieloranta

Published in: Data & Analytics

  2. LECTURES SCHEDULE
     • LECTURE 1: Introduction to Computational Social Science [DONE]
         • Tuesday 01.09. 16:00 – 18:00, U35, Seminar room 114
     • LECTURE 2: Basics of Computation and Modeling [TODAY]
         • Wednesday 02.09. 16:00 – 18:00, U35, Seminar room 113
     • LECTURE 3: Big Data and Information Extraction
         • Monday 07.09. 16:00 – 18:00, U35, Seminar room 114
     • LECTURE 4: Network Analysis
         • Monday 14.09. 16:00 – 18:00, U35, Seminar room 114
     • LECTURE 5: Complex Systems
         • Tuesday 15.09. 16:00 – 18:00, U35, Seminar room 114
     • LECTURE 6: Simulation in Social Science
         • Wednesday 16.09. 16:00 – 18:00, U35, Seminar room 113
     • LECTURE 7: Ethical and Legal Issues in CSS
         • Monday 21.09. 16:00 – 18:00, U35, Seminar room 114
     • LECTURE 8: Summary
         • Tuesday 22.09. 17:00 – 19:00, U35, Seminar room 114
  3. LECTURE 2 OVERVIEW
     • PART 1: COMPUTATION
         • Role of Computation
         • How Computers Work
         • What is Programming
     • PART 2: MODELING
         • What is Modeling
         • Unified Modeling Language (UML)
  4. MOTIVATION: WHY UNDERSTANDING COMPUTERS MATTERS (Cioffi-Revilla 2014)
     • Understanding computers as information processing systems helps us understand complex systems as information processing systems as well
     • Knowing how computers, programs and programming languages work, and what you can do with them, helps us grasp how to approach a research problem on a practical level (e.g. which tools to choose)
     • The choice of computers, programs and programming languages has practical consequences for the research problem we are trying to solve (e.g. the selection of tools affects the answers we are able to get)
  5. ROLE OF COMPUTATION IN CSS (Cioffi-Revilla 2014)
     • Computation is used in CSS as a language to formalize (1) theory and (2) empirical research to research social complexity.
     • Computation is in most cases applied computation: computation itself is rarely researched (as is done in computer science).
     • Information processing paradigm
         1. Computing as a fundamental part of complex social systems
         2. Computing as tools for research
  7. HARDWARE & SOFTWARE (Hennessy & Patterson 2013)
     • Computers are formed of hardware and software
     • Hardware: the physical parts of the computer that enable computing
         • Microprocessor, physical memory (= electronic physical machines)
         • Hardware provides the physical means for information processing
     • Software: the (textual) instructions that tell the hardware what to do
         • The non-physical parts: e.g. MS Word is a software program
     • Example: your physical iPhone is hardware, the apps you run on it are software.
  9. COMPUTER ARCHITECTURE (Hennessy & Patterson 2013)
     • CPU = Central Processing Unit, does all the computing work
         • Processes the instructions of a program
         • Controller, Registers, Arithmetic & Logic Unit
     • Main memory (RAM, Random Access Memory)
         • Fastest memory, close to the CPU (so that the CPU and memory can work well together)
         • Instructions for computing are loaded into main memory and executed from there by the CPU
     • Secondary memory (hard disk)
         • Slower memory with a large capacity
         • Able to store large amounts of data, but access is much slower
         • Programs are first fetched from secondary memory into main memory before they are run by the CPU
     • Input and Output, I/O devices
         • Screen, mouse, keyboard, network connections, …
  10. INSTRUCTION CYCLE (Hennessy & Patterson 2013)
     • Computers run programs according to the instruction cycle, also called the fetch-execute cycle (or fetch-decode-execute cycle)
     • Basically it cycles through two steps
         1. Fetch the next instruction of the program from main memory to the CPU
         2. Execute that instruction in the CPU
     • Repeat steps 1 & 2
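The two-step cycle above can be sketched as a small loop. This is only a toy model, not how a real CPU is built: the instruction names (`LOAD`, `ADD`, `HALT`) and the `run` function are invented for illustration.

```python
# Toy model of the fetch-execute cycle. The three-instruction "machine
# language" used here is hypothetical and exists only for this sketch.

def run(program):
    """Run a list of (opcode, operand) pairs and return the accumulator."""
    accumulator = 0        # a single CPU register
    program_counter = 0    # position of the next instruction in "main memory"

    while True:
        # Step 1: fetch the next instruction from main memory.
        opcode, operand = program[program_counter]
        program_counter += 1

        # Step 2: execute that instruction.
        if opcode == "LOAD":
            accumulator = operand
        elif opcode == "ADD":
            accumulator += operand
        elif opcode == "HALT":
            return accumulator
        else:
            raise ValueError(f"unknown instruction: {opcode}")


program = [("LOAD", 5), ("ADD", 2), ("HALT", None)]
result = run(program)
print(result)  # 7
```

The `while` loop is the "repeat steps 1 & 2" part; the program counter is what makes "the next instruction" well defined.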
  11. AN ANALOGY BETWEEN COMPUTERS AND COMPLEX SYSTEMS (Hennessy & Patterson 2013)
     • In relation to the information processing paradigm, computers can be seen as quite similar to complex adaptive social systems
     • Computers are formed of an information processing system (CPU, memory) and its environment (via I/O devices)
     • Can a social system be seen as an information processing “computer”?
  13. PROGRAMMING (Hennessy & Patterson 2013)
     • Programming is the act of writing the instructions for the computer/CPU to execute
     • A program is a set of those instructions
         • An iPhone app is a program
     • The textual form of those instructions is called CODE, and it is separate from the DATA, which is the information the CODE processes on the CPU
     • Programs are written in special languages called programming languages
  14. CPU HAS ITS OWN LANGUAGE (Hennessy & Patterson 2013)
     • The Central Processing Unit (CPU) can only understand instructions that are written in its “native” language
     • This CPU language is called Machine Code, and it varies from CPU to CPU, based on make and model
         • For example ARM <-> Intel X86 machine codes
     • Machine language is not (or is hardly) human readable. Its closest human-readable counterpart is low-level Assembly Language
     • Machine code or machine language is a set of instructions executed directly by a computer's central processing unit (CPU). Each instruction performs a very specific task, such as a load, a jump, or an ALU operation on a unit of data in a CPU register or memory. Every program directly executed by a CPU is made up of a series of such instructions. (Wikipedia 2015, Machine code)
  15. PROGRAMMING IS DONE IN HUMAN READABLE LANGUAGES (Hennessy & Patterson 2013)
     • People write code in “human readable programming languages” (or semi-human-readable ones, such as assembly)
         • One is able to see what the program does from the code
     • The CPU does not understand human readable languages & code, as it only understands Machine Code
     • Human readable programming languages need to be translated to machine code so that the CPU is able to execute the code
     • There are two ways to do this:
         1. Compiling
         2. Interpreting
  16. HELPING MACHINES READ CODE: COMPILING AND INTERPRETING (Hennessy & Patterson 2013)
     • Compiling code: the human readable code is transformed (= compiled) once to machine code. After this the machine code program can be run many times.
         • -> This is equivalent to translating a book into a foreign language (machine code). After the translation, the book can be read many times.
     • Interpreting code: the human readable code is interpreted to machine code at the same time as it is executed by the CPU. This means that the interpretation/translation happens while the instructions are being executed.
         • -> This is equivalent to having a real-life conversation via a human interpreter.
     • Whether a language is compiled or interpreted has practical effects
         • Speed, how variables are resolved, etc.
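The "translate once, run many times" idea can be illustrated with Python's built-in `compile()` and `exec()`. This is an analogy under a stated assumption: Python translates source text into bytecode for its own virtual machine, not into CPU machine code, so the sketch shows the difference in workflow rather than a real native compiler. The variable names are made up for the example.

```python
source = "total = price * quantity"

# "Compile once": translate the source text a single time ...
code_object = compile(source, "<example>", "exec")

# ... then run the already translated form many times with different data.
totals = []
for price, quantity in [(2, 3), (10, 4)]:
    namespace = {"price": price, "quantity": quantity}
    exec(code_object, namespace)
    totals.append(namespace["total"])

# "Translate while running": handing the raw text to exec() makes Python
# translate it again each time it is called.
namespace = {"price": 5, "quantity": 5}
exec(source, namespace)
totals.append(namespace["total"])

print(totals)  # [6, 40, 25]
```

Both routes give the same answers; what differs is when the translation work is done, which is one reason compiled programs tend to start and run faster.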
  17. ABSTRACTION LEVEL OF THE LANGUAGE
     • The abstraction level of a programming language depends on how “far” it is from Machine Code & dealing with hardware related specifics (such as memory management)
     • Languages can be compiled/interpreted to other languages
     • [Figure: languages placed on a scale from low level language to high level language: Machine Code, Assembly, C, Java, R, Visual Programming; C++ and Scala also appear on the scale]
  18. PROGRAMMING LANGUAGES
     • There are hundreds of programming languages
         • http://en.wikipedia.org/wiki/List_of_programming_languages
     • Languages differ in
         • Syntax = how they are written, rules of writing instructions
         • Semantics = what different words and concepts mean
         • Pragmatics = what the language is used for
     • Languages also differ in whether they are compiled or interpreted to machine code
  19. SYNTAX & SEMANTICS: HELLO WORLD EXAMPLE
     • In C:
         #include <stdio.h>

         int main(void) {
             printf("Hello World");
             return 0;
         }
     • In Java:
         public class HelloWorld {
             public static void main(String[] args) {
                 System.out.println("Hello, World");
             }
         }
     • In Scheme:
         ; define the procedure, then call it once
         (define hello-world
           (lambda ()
             (begin
               (write 'Hello-World)
               (newline))))
         (hello-world)
     • In Python:
         print("Hello, World!")
  20. PROGRAMMING LANGUAGES INCLUDE (Cioffi-Revilla 2014)
     • Data types:
         • Most basic type of information in the language
         • integer, real, boolean…
     • Data structures:
         • More complex structures of data
         • list, stack, array, tree
     • Variables: places to store functions and data
     • Assignments: a way to tie a certain value to a certain variable
         • X = 5 + 2;
     • Functions:
         • A command that performs a certain functionality
         • Takes arguments and returns a value
         • Print(“Hello World”) → “Hello World”
     • Control structures:
         • Control the flow of the program
         • Loop, skip, iterate, do something while certain conditions hold
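The building blocks listed above could look like this in Python. It is a minimal sketch; all names and values (`grades`, `average`, and so on) are made up for the example.

```python
# Data types: the most basic kinds of values.
age = 42            # integer
height = 1.75       # real (floating point) number
is_student = True   # boolean

# Data structure: a list groups several values.
grades = [3, 5, 4]

# Variable and assignment: tie a value to a name.
x = 5 + 2

# Function: takes arguments and returns a value.
def average(values):
    return sum(values) / len(values)

# Control structures: loop over the list and branch on a condition.
passed = []
for grade in grades:
    if grade >= 4:
        passed.append(grade)

print(x, average(grades), passed)  # 7 4.0 [5, 4]
```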
  21. PARADIGMS OF PROGRAMMING
     • There are many different paradigms for writing programs; below are the three most common:
     • Procedural / Imperative Programming
         • Telling the program line by line what it should do:
             1. Do this
             2. Do that
             3. Do those things
     • Object-Oriented Programming (OOP)
         • Based on objects that contain functions and data
         • Objects preserve state
     • Functional Programming (FP)
         • Functions as first class citizens
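To compare the three paradigms, here is the same small task (counting the votes for candidate "A") written in each style. This is an illustrative sketch; the data and the `Tally` class are hypothetical.

```python
from functools import reduce

votes = ["A", "B", "A", "A"]

# Procedural / imperative: step-by-step instructions that change a variable.
count_procedural = 0
for vote in votes:
    if vote == "A":
        count_procedural += 1

# Object-oriented: an object bundles data (state) with functions (methods).
class Tally:
    def __init__(self):
        self.count = 0  # state preserved between method calls

    def add(self, vote):
        if vote == "A":
            self.count += 1

tally = Tally()
for vote in votes:
    tally.add(vote)
count_oop = tally.count

# Functional: a function (the lambda) is passed as a value to another
# function (reduce); no variable is modified along the way.
count_functional = reduce(lambda total, vote: total + (vote == "A"), votes, 0)

print(count_procedural, count_oop, count_functional)  # 3 3 3
```

Python allows all three styles, which makes it convenient for a side-by-side comparison; other languages lean more strongly towards one paradigm.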
  22. ALGORITHMS (Cioffi-Revilla 2014)
     • An algorithm is a self-contained set of step-by-step operations to achieve a desired result.
     • There are different algorithms for different purposes
         • Search algorithms
         • Sort algorithms
         • Image processing algorithms
         • Etc.
     • A real-life algorithm might be: how people get study credits
         • Sign up for a course
         • Participate in lectures
         • Do lecture assignments and final work
         • Return lecture assignments and final work
         • If your work passes the grading, you get study credits
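As a concrete instance of a search algorithm, here is a linear search: check the items one by one until the target is found. It is a minimal sketch; the function name and the course codes in the list are only example data.

```python
def linear_search(items, target):
    """Return the index of the first occurrence of target, or -1 if absent."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1


courses = ["CSS01", "CSS02", "STAT101"]
print(linear_search(courses, "CSS02"))  # 1
print(linear_search(courses, "MATH1"))  # -1
```

Like the study credits example, it is a fixed sequence of steps with a clear stopping condition and a defined result.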
  23. CODING STYLE
     • The way one writes code matters, because you or someone else needs to be able to easily understand & modify the code
         • This may happen after long periods of time (after one has forgotten how the program works)
     • Good coding style produces code that is simple, readable, understandable, concise and well structured
     • Code is also a way to communicate how the program works
     • Documenting your code is a crucial part of programming!
     • General principles according to Cioffi-Revilla 2014
         • Readability
         • Commenting
         • Modularity
         • Defensive coding
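A short sketch of what these principles might look like in practice: a descriptive name (readability), a docstring and a comment (commenting), one small function with a single job (modularity), and input checks (defensive coding). The function and its parameters are hypothetical examples, not taken from the course material.

```python
def turnout_rate(votes_cast, eligible_voters):
    """Return voter turnout as a fraction between 0 and 1."""
    # Defensive coding: reject inputs that would give a meaningless result.
    if eligible_voters <= 0:
        raise ValueError("eligible_voters must be positive")
    if not 0 <= votes_cast <= eligible_voters:
        raise ValueError("votes_cast must be between 0 and eligible_voters")
    return votes_cast / eligible_voters


print(turnout_rate(700, 1000))  # 0.7
```

Failing early with a clear error message is usually easier to debug than a silently wrong number that surfaces later in the analysis.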
  24. MANAGING AND REFACTORING CODE & DATA
     • A good summary on how to write, refactor and manage code and data:
         • Gentzkow, Matthew and Jesse M. Shapiro. 2014. Code and Data for the Social Sciences: A Practitioner’s Guide. University of Chicago mimeo, http://web.stanford.edu/~gentzkow/research/CodeAndData.pdf
     • Handles matters such as:
         • Automation
         • Version Control
         • Directories
         • Data Keys
         • Abstraction
         • Documentation
         • Management
  25. WHERE TO LEARN PROGRAMMING
     • You learn programming by doing!
     • Start with something small
     • University of Helsinki: many Computer Science courses
         • CSS02 – Introduction to Programming in Social Sciences (II period, 2015)
     • MOOC courses online:
         • Coursera
             • Data Science Specialization (highly recommended): https://www.coursera.org/specialization/jhudatascience/1?utm_medium=catalog
         • Codecademy
             • http://www.codecademy.com/learn
         • MIT OpenCourseWare
             • http://ocw.mit.edu/courses/intro-programming/
         • Udemy
             • https://www.udemy.com/courses/Development/
  27. MODEL
     • A model is a formal and purposeful representation and abstraction of reality
     • Scientific Modeling is a scientific activity, the aim of which is to make a particular part or feature of the world easier to understand, define, quantify, visualize, or simulate by referencing it to existing and usually commonly accepted knowledge. It requires selecting and identifying relevant aspects of a situation in the real world and then using different types of models for different aims, such as conceptual models to better understand, operational models to operationalize, mathematical models to quantify, and graphical models to visualize the subject. (Wikipedia 2015, Scientific Modeling)
     • Reality → Abstraction → Model of the Phenomena
  28. MODELS AS REPRESENTATIONS (Stanford Encyclopedia 2015)
     1. Models of Phenomena: model based on real world phenomena (e.g. how ants collect food)
     2. Models of Data: modeling based on raw data (e.g. plotting)
     3. Models of Theory: model is the structural and formal presentation of a textual theory
     • Different Modeling Perspectives (Ontological)
         • Physical models (e.g. miniature buildings)
         • Fictional models (e.g. Bohr model of the atom)
         • Mathematical models: set-theory models, equations…
         • Descriptions
         • Mixed models
     • A good summary on scientific modeling:
         • http://plato.stanford.edu/entries/models-science/
  29. ONTOLOGY
     • Ontology is the philosophical study of the nature of being, becoming, existence, or reality, as well as the basic categories of being and their relations. Traditionally listed as a part of the major branch of philosophy known as metaphysics, ontology deals with questions concerning what entities exist or can be said to exist, and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences. (Wikipedia 2015, Ontology)
     • In computer science and information science, an ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse. It is thus a practical application of philosophical ontology, with a taxonomy. (Wikipedia 2015, Ontology information science)
  30. ONTOLOGY & SOCIAL SYSTEMS (Cioffi-Revilla 2014)
     • The entire social world consists of social systems and their environments
     • These systems consist of
         • Classes
         • Objects (of a certain class, called instances)
         • Associations between classes and objects (e.g. relationships between entities)
     • Real World (Referent Social System) <-> Model (Abstracted Social System)
  33. MODELING IS PROBLEMATIC
     • Modeling raises deep epistemological and philosophy of science related questions, which are not unproblematic
         • What is the true relationship between the model and reality?
         • What can actually be researched with models?
         • What questions are the models actually able to answer?
     • Modeling also takes a certain stance on the philosophy of science, leaning towards empiricism & positivism, or at least critical realism.
  34. MODEL THINKING
     • A really good primer on model thinking is the course given by Scott E. Page at the University of Michigan. The course can be taken for free on Coursera: https://www.coursera.org/course/modelthinking
     • Why Model?
         • To be an intelligent citizen of the world
         • To be a clearer thinker
         • To understand and use data
         • To better decide, strategize, and design
     • Course videos are also freely available on YouTube:
         • https://www.youtube.com/watch?v=K-gxhxGwJ38&index=2&list=PLGqc26s6O0E2P2BnK73JWXk4YYTgl3dmb
  36. UNIFIED MODELING LANGUAGE (UML)
     • The Unified Modeling Language (UML) is a general-purpose modeling language in the field of software engineering, which is designed to provide a standard way to visualize the design of a system. (Wikipedia 2015, UML)
     • UML is a standardized notational system for graphically representing complex systems consisting of classes, objects, associations among them, dynamic interactions and other scientifically important features. (Cioffi-Revilla 2014)
     • Developed during the 1990s
     • Is published as an ISO standard
     • Static Modeling: models the static structure of the system
     • Dynamic Modeling: models the dynamic behavior of the system
  37. MAIN TYPES OF UML MODELS (Bell 2003)
     • Use Case Diagrams
     • Class Diagrams
     • Sequence Diagrams
     • State Diagrams
     • Component Diagrams
     • Deployment Diagrams
     • The most useful for social science modeling might be the Class, State and Sequence diagrams
  38. CLASS DIAGRAM (Cioffi-Revilla 2014)
     • A class diagram represents the static structure of a complex system
     • A class diagram consists of
         • Rectangles representing classes and objects (name on top)
         • Classes and objects can have
             • Attributes (e.g. age, sex)
             • Methods = a certain function the class or object is able to perform (e.g. getMarried())
         • Links between rectangles representing associations between classes and objects
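  39. CLASSES
     • [Figure: a class is drawn as a rectangle with three compartments: nameOfClass, Attributes (optional), Methods (optional)]
     • [Figure: example classes Family and Person; attributes shown: -age, -weight, -height]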
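A class rectangle maps quite directly onto a class definition in an object-oriented language. The sketch below assumes that the attributes in the figure (age, weight, height) belong to Person; the `married` attribute and the Python-style method name `get_married` (for the slide's getMarried()) are additions made for the example.

```python
class Person:
    """A class: the rectangle named 'Person' in a class diagram."""

    def __init__(self, age, weight, height):
        # Attributes: the second compartment of the rectangle.
        self.age = age
        self.weight = weight
        self.height = height
        self.married = False  # hypothetical attribute, not in the figure

    def get_married(self):
        # Method: the third compartment of the rectangle.
        self.married = True


# An object is an instance of a class.
alice = Person(age=30, weight=65, height=170)
alice.get_married()
print(alice.married)  # True
```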
  40. CLASS DIAGRAM & ASSOCIATIONS (Cioffi-Revilla 2014)
     • Four types of associations represented by different arrowhead-links:
         • Inheritance/generalization (empty arrowhead)
         • Aggregation (empty diamond)
         • Composition (black diamond)
         • Generic association (plain link / directional arrow symbol)
     • (Image from: http://www.javacodegeeks.com/2013/01/quick-summary-object-associations.html)
  41. ASSOCIATIONS
     • [Figure: classes Family and Person (attributes shown: -age, -weight, -height) joined by a link labeled "belongs to"]
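The four association types can be approximated in code. This is a rough sketch under stated assumptions: only Person, Family and "belongs to" come from the figure, while `Student`, `House` and `Room` are hypothetical classes added to show inheritance and composition.

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.family = None  # generic association: Person "belongs to" Family


class Student(Person):
    """Inheritance / generalization: a Student is a kind of Person."""


class Family:
    """Aggregation: a Family groups Persons that also exist on their own."""

    def __init__(self):
        self.members = []

    def add(self, person):
        self.members.append(person)
        person.family = self


class Room:
    def __init__(self, label):
        self.label = label


class House:
    """Composition: a House creates and owns its Rooms."""

    def __init__(self, room_labels):
        self.rooms = [Room(label) for label in room_labels]


anna = Student("Anna")
family = Family()
family.add(anna)
house = House(["kitchen", "bedroom"])
print(isinstance(anna, Person), anna.family is family, len(house.rooms))
# True True 2
```

The difference between aggregation and composition is mostly about ownership: the persons are created outside the family and passed in, whereas the rooms are created by the house itself.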
  42. CLASS DIAGRAM & MULTIPLES (Cioffi-Revilla 2014)
     • Multiples represent the quantities on each side of an association
         • E.g. how many children a parent has in the particular model
     • There are many different range options
         • 0..1 = between 0 and 1
         • 1 = exactly 1
         • 0..* or * = between 0 and unspecified many
         • 1..* = between 1 and unspecified many
         • 0..N or N = between 0 and unspecified many
         • 1..N = between 1 and unspecified many
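  43. MULTIPLES
     • [Figure: classes Family and Person (attributes shown: -age, -weight, -height) joined by a link labeled "belongs to", with multiples 1..* and 0..1 at its ends]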
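Multiples become constraints that the code has to enforce. The sketch below assumes one reading of the figure, which the flattened transcript does not make explicit: a Family has 1..* members and a Person belongs to 0..1 families. The class and method names are illustrative.

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.family = None  # 0..1: no family, or exactly one


class Family:
    def __init__(self, first_member, *other_members):
        # 1..*: requiring first_member guarantees at least one person.
        self.members = []
        for person in (first_member, *other_members):
            self.add(person)

    def add(self, person):
        # Enforce the 0..1 side: a person cannot join a second family.
        if person.family is not None:
            raise ValueError(f"{person.name} already belongs to a family")
        self.members.append(person)
        person.family = self


anna = Person("Anna")
ben = Person("Ben")
family = Family(anna, ben)
print(len(family.members))  # 2
```

With the opposite reading of the figure the checks would move to the other class, which is why stating the multiples explicitly in the diagram matters.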
  45. ASSIGNMENT
     • Sketch a UML Class Diagram model that represents elections
     • What are the main classes, objects and relationships between the classes?
     • Do you find the model useful?
  46. LECTURE 2 READING
     • Gentzkow, M.; Shapiro, J. M. 2014. Code and Data for the Social Sciences: A Practitioner’s Guide. University of Chicago mimeo, http://faculty.chicagobooth.edu/matthew.gentzkow/research/CodeAndData.pdf
     • Granger, C. 2015. Coding is not the new literacy. http://www.chris-granger.com/2015/01/26/coding-is-not-the-new-literacy/
     • Epstein, J. M. 2008. Why Model? Keynote address to the Second World Congress on Social Simulation. George Mason University.
     • Page, S. E. 2012. The Model Thinker: Prologue, Introduction and Chapter 1. Link provided by University of Michigan & Coursera: http://vserver1.cscs.lsa.umich.edu/~spage/ONLINECOURSE/R1Page.pdf
     • Stanford Encyclopedia of Philosophy, 2012. Models in Science. http://plato.stanford.edu/entries/models-science/
     • Bell, D. 2003. UML basics: An introduction to the Unified Modeling Language. The Rational Edge. https://www.ibm.com/developerworks/rational/library/content/RationalEdge/sep03/f_umlbasics_db.pdf
  47. REFERENCES
     • Cioffi-Revilla, C. 2014. Introduction to Computational Social Science. Springer-Verlag, London.
     • Gentzkow, M.; Shapiro, J. M. 2014. Code and Data for the Social Sciences: A Practitioner’s Guide. University of Chicago mimeo, http://faculty.chicagobooth.edu/matthew.gentzkow/research/CodeAndData.pdf
     • Hennessy, J. L.; Patterson, D. A. 2013. Computer Organization and Design. Elsevier, Waltham.
     • Stanford Encyclopedia of Philosophy, 2012. Models in Science. http://plato.stanford.edu/entries/models-science/
     • Bell, D. 2003. UML basics: An introduction to the Unified Modeling Language. The Rational Edge. https://www.ibm.com/developerworks/rational/library/content/RationalEdge/sep03/f_umlbasics_db.pdf
     • Wikipedia 2015, Scientific Modeling. http://en.wikipedia.org/wiki/Scientific_modelling
     • Wikipedia 2015, Ontology. http://en.wikipedia.org/wiki/Ontology
     • Wikipedia 2015, Ontology (information science). http://en.wikipedia.org/wiki/Ontology_(information_science)
  48. Thank You! Questions and comments? Twitter: @laurieloranta