ITCS332:
    Organization of
Programming Languages


       Chapter 6
       Data Types
                    ISBN 0-321-33025-0
Chapter 6 Topics
       •    Introduction
       •    Primitive Data Types
       •    Character String Types
       •    User-Defined Ordinal Types
       •    Array Types
       •    Associative Arrays
       •    Record Types
       •    Union Types
       •    Pointer and Reference Types


.ITCS332 by Dr. Abdel Fattah Salman                 6-2
Introduction
• A data type defines a collection of data objects and a set of predefined
  operations on those objects
• How well the data types match the real-world problem space affects how easily
  a problem can be expressed, so it is crucial that a language support an
  appropriate variety of data types and structures.
• PL/I included many data types, with the intent of supporting a large range of
  applications.
• A better approach in ALGOL68: provide a few basic types and a few flexible
  structure-defining operators that allow a user to design a data structure for
  each need.
• User-defined types improve readability through the use of meaningful names
  for types.
• User-defined types aid modifiability: a user can change the type of a category
  of variables in a program by changing only a type declaration statement.
• The fundamental idea of an abstract data type is that the use of a type is
  separated from the representation and set of operations on values of that type.
• The two most common structured (nonscalar) data types are arrays and records.

Introduction
• These data types are specified by type operators, or constructors. In
  C-based languages, the type operators are brackets, parentheses, and
  asterisks, which specify arrays, functions, and pointers, respectively.
• A descriptor is the collection of the attributes of a variable. In an
  implementation, a descriptor is a collection of memory cells that store
  variable attributes.
• If all attributes are static, descriptors are needed only at compile time. They
  are built by the compiler and stored in the symbol table.
• For dynamic attributes, part or all of the descriptor must be maintained
  during execution.
• Descriptors are used for type checking and to build the code for allocation
  and deallocation operations.
• The word object is often associated with the value of a variable and the
  space it occupies.
• An object represents an instance of a user-defined (abstract data) type.
• In object-oriented languages, every instance of every class (predefined or
  user-defined) is an object.
• One design issue for all data types: What operations are defined for variables
  of the type, and how are they specified?
Primitive Data Types
• Primitive data types: Those not defined in terms of other
  data types
• Almost all programming languages provide a set of
  primitive data types
     – Some primitive data types are merely reflections of the
       hardware
     – Others require only a little non-hardware support
     – Primitive data types are used along with one or more type
       constructors to build the structured types.
     – Primitive data types include: integer, real, decimal, character,
       boolean


Primitive Data Types: Integer
• Almost always an exact reflection of the hardware so the mapping
  is trivial
• There may be as many as eight different (in size) integer types in
  a language
• Java’s signed integer sizes: byte, short, int, long.
• C# and C++ include unsigned integer types.
• Signed integers are typically stored in two’s complement representation.
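The two's complement encoding mentioned above can be sketched in a few lines of Python. The helper names and the 8-bit default width are illustrative choices, not something the slides specify.

```python
def to_twos_complement(value: int, bits: int = 8) -> int:
    """Return the unsigned bit pattern that stores `value` in `bits` bits."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)   # negative values wrap around 2**bits


def from_twos_complement(pattern: int, bits: int = 8) -> int:
    """Interpret an unsigned `bits`-wide pattern as a signed integer."""
    sign_bit = 1 << (bits - 1)
    # The sign bit carries weight -2**(bits-1); all other bits are positive.
    return (pattern & (sign_bit - 1)) - (pattern & sign_bit)


print(format(to_twos_complement(-5), "08b"))   # bit pattern that stores -5
print(from_twos_complement(0b11111011))        # decodes back to -5
```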




Primitive Data Types: Floating Point
• Floating-point types model real numbers, but only as approximations
  (values such as π and e cannot be represented exactly).
• Problems: approximate representation and loss of accuracy
  through arithmetic operations
• Languages for scientific use support at least two floating-point
  types (e.g., float (4 bytes) and double (8 bytes)),
  sometimes more
• Usually exactly like the hardware, but not always
• IEEE Floating-Point Standard 754
• Precision is the accuracy of the
  fractional part of a value, measured in bits.
• Range is a combination of the
  exponent and fraction ranges.
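The two problems listed above (approximate representation and loss of accuracy) are easy to observe. The following is a minimal sketch; it relies on Python's float being an 8-byte IEEE 754 double and uses the struct module to round a value to the 4-byte single format.

```python
import math
import struct

total = 0.1 + 0.2
exactly_equal = (total == 0.3)          # False: 0.1 and 0.2 are not stored exactly
close_enough = math.isclose(total, 0.3) # True: compare with a tolerance instead

# Rounding 0.1 to the 4-byte single format and widening it back to a double
# shows that the two formats hold different approximations of 0.1.
as_single = struct.unpack("<f", struct.pack("<f", 0.1))[0]
same_after_round_trip = (as_single == 0.1)   # False

print(exactly_equal, close_enough, same_after_round_trip)
```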


Primitive Data Types: Decimal
• Computers designed for business applications have hardware
  support for decimal data types.
• Decimal types store a fixed number of decimal digits, with the
  decimal point at a fixed position in the value.
• For business applications (money)
   – Essential to COBOL
   – C# offers a decimal data type
• Advantage: accuracy
• Disadvantages: limited range, wastes memory.
• Decimal types are stored in binary-coded decimal (BCD): unpacked
  (one digit per byte) or packed (two digits per byte).
• Operations on decimal values are done in hardware or by
  software simulation.
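As an illustration of packed BCD and of the accuracy advantage, here is a small Python sketch. The pack_bcd helper is hypothetical and written only for this example; Decimal comes from Python's standard decimal module and stands in for a language-level decimal type.

```python
from decimal import Decimal


def pack_bcd(digits: str) -> bytes:
    """Pack a string of decimal digits two per byte (packed BCD)."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected decimal digits only")
    if len(digits) % 2:              # pad to an even number of digits
        digits = "0" + digits
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )


print(pack_bcd("1234").hex())        # each hex digit is one decimal digit
print(len(pack_bcd("1234")))         # 4 digits occupy 2 bytes when packed

# Decimal arithmetic keeps sums of money exact, unlike binary floating point.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))
```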

Primitive Data Types: Boolean
• Simplest of all
• Range of values: two elements, one for “true” and one
  for “false”.
• Boolean types are used to represent switches or flags in
  programs
• Could be implemented as a single bit, but is often stored as a byte
  because single bits are hard to address efficiently
• Advantage over using integers as flags: readability




Primitive Data Types: Character
• Stored as numeric codings
• Most commonly used coding: ASCII
• An alternative, 16-bit coding: Unicode
     – Globalization of business
     – Computers need to communicate with other computers
     – Includes characters from most natural languages
     – The first 128 characters of Unicode are identical to those of
       ASCII
     – Originally used in Java
     – C# and JavaScript also support Unicode


Character String Types
•   Values are sequences of characters
•   Design issues:
     – Is it a primitive type or just a special kind of array?
     – Should the length of strings be static or dynamic?
•   C and C++ represent strings as arrays of char and provide string operations as
    functions in the standard library header string.h. Strings are null-terminated
    (ASCIIZ).
•   Problem: the library functions that move string data do not guard against
    overflowing the destination. C++ programmers should use the string class from
    the standard library rather than char arrays.
•   In Java, strings are supported as a primitive type by the String class
    (constant strings) and the StringBuffer class (changeable strings, more like
    arrays of chars); C# provides similar string classes.
•   Typical operations on strings:
     – Assignment, comparison (=, >, etc.), and copying, which are complicated
        when the operands have different lengths
     – Catenation
     – Substring reference: a reference to a substring of a given string
     – Perl, JavaScript, and PHP include built-in pattern matching operations
        based on regular expressions.

Character String Types
• The pattern expression:          /[A-Za-z][A-Za-z\d]+/
  matches typical names in programming languages.
• Brackets enclose character classes.
• The first class specifies all letters; the second specifies all letters and
  digits (\d abbreviates the digit class).
• The plus specifies that there must be one or more of what is in the
  category.
• So, the whole pattern matches strings that begin with a letter followed
  by one or more letters or digits.
• The pattern expression /\d+\.?\d*|\.\d+/ matches numeric
  literals.
• The \. specifies a literal decimal point; the ? quantifies what it follows to
  have zero or one appearance; the | separates two alternatives in the
  whole pattern:
• The first alternative matches strings of one or more digits, possibly
  followed by a decimal point, followed by zero or more digits;
• The second alternative matches strings that begin with a decimal point
  followed by one or more digits.
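The two patterns can be tried out with Python's re module, whose syntax for these constructs matches Perl's. This is only a sketch: the variable names and test strings are invented for the example, and fullmatch is used so that the whole string must fit the pattern.

```python
import re

NAME = re.compile(r"[A-Za-z][A-Za-z\d]+")
NUMBER = re.compile(r"\d+\.?\d*|\.\d+")

# A letter followed by one or more letters or digits.
name_results = {s: bool(NAME.fullmatch(s)) for s in ["sum2", "2sum", "x"]}

# Digits with an optional decimal point and fraction, or a point plus digits.
number_results = {s: bool(NUMBER.fullmatch(s)) for s in ["42", "42.", "3.14", ".5", "."]}

print(name_results)
print(number_results)
```

Note that the name pattern, as written, rejects one-letter names such as x, because the plus requires at least one character after the first letter.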
Character String Type in Certain Languages

• C and C++
     – Not primitive
     – Use char arrays and a library of functions that provide
       operations
• SNOBOL4 (a string manipulation language)
     – Primitive
     – Many operations, including elaborate pattern matching
• Java
     – Primitive via the String class




Character String Length Options
There are several design options regarding the string length:
•   Static length strings: the length is set when the string is created, as in COBOL
    and Java’s String class
•   Limited dynamic length strings: strings of varying length up to a fixed maximum set
    by the variable’s definition, as in C and C++
     – In C-based languages, a special character is used to indicate the end of a string’s
         characters, rather than maintaining the length
•   Dynamic length strings: strings of varying length with no maximum, as in SNOBOL4,
    Perl, and JavaScript
•   Ada supports all three string length options:
     – Type String from package Standard
     – Type Bounded_String from the Ada.Strings.Bounded package
     – Type Unbounded_String from the Ada.Strings.Unbounded package
•   Character String Type Evaluation
     – Aid to writability
     – Dealing with strings as arrays can be more cumbersome than dealing with a primitive
         string type. As a primitive type with static length, strings are inexpensive to
         provide, so why not have them? (Providing strings through a standard library is
         nearly as convenient as providing them as a primitive type.)
     – Dynamic length is nice and flexible, but is it worth the expense?
Character String Implementation
• String types could be supported in hardware but in most cases
  software is used to implement string storage, retrieval, and
  manipulation.
• Static length: compile-time descriptor with 3 fields:
   – Name of the type
   – Length of the string, in characters
   – Address of the first character
• Limited dynamic length: may need a run-time descriptor to store
  both the maximum and the current lengths (C and C++ do not
  require a run-time descriptor because the end of the string is
  marked with the null character).
• Dynamic length: needs a run-time descriptor that stores only the
  current length.
• Compile-time descriptors are stored in the symbol table; run-time
  descriptors must be kept in memory during execution.
Compile- and Run-Time Descriptors
• Allocation and deallocation are the biggest implementation problem for
  dynamic length strings: the storage must grow and shrink as needed. There
  are 2 approaches:
     – Store the string in a linked list: allocation and deallocation are
       simple, but the links occupy extra storage and string operations
       become more complex.
     – Store the complete string in adjacent memory cells: this requires
       less storage and gives faster string operations, but allocation and
       deallocation are slower.




 [Figure: compile-time descriptor for static strings (left) and run-time descriptor for limited dynamic strings (right)]
User-Defined Ordinal Types
• An ordinal type is one in which the range of possible
  values can be easily associated with the set of positive
  integers
• Examples of primitive ordinal types in Java
     – integer
     – char
     – boolean




Enumeration Types
• An enumeration type is one in which all possible values, which
  are named constants, are provided in the definition.
• Enumeration types provide a way of defining and grouping
  collections of named constants.
• Enumeration constants are implicitly assigned integers: 0,1, …
  and can be explicitly assigned any integer in the definition.
• An example in C# :
     enum days {mon, tue, wed, thu, fri, sat, sun};
• Design issues: All are related to type checking
   – Is an enumeration constant allowed to appear in more than one
     type definition, and if so, how is the type of an occurrence of
     that constant checked?
   – Are enumeration values coerced to integer?
   – Is any other type coerced to an enumeration type?
Enumeration Types
• In languages that do not have enumeration types, programmers simulate them
  with integer values, as in FORTRAN 77, where we use 0 to represent red, 1 to
  represent blue, …
  integer red, blue
  data red, blue /0, 1/
• The problem with this approach is that because we have not defined a
  type for our colors, there is no type checking when they are used.
• In C++, we can instead define an enumeration type:
  enum colors {red, blue, green, yellow, black};
  colors mycolor = blue, yourColor = red;
• The enumeration values are coerced to int when they are put in an
  integer context. For example, if the current value of mycolor is blue,
  then mycolor++ would assign green to mycolor.
• C++ enumeration constants can appear in only ONE enumeration type
  in the same referencing environment.
• C# enumeration types are like those of C++, except that they are never
  coerced to integers.
Evaluation of Enumerated Type
• Aid to readability: Named values are easily recognized, whereas coded
  values are not - e.g., no need to code a color as a number
• Aid to reliability, e.g., compiler can check:
   – No arithmetic operations are legal on enumeration types (for example,
     colors cannot be added).
   – No enumeration variable can be assigned a value outside its defined
     range.
   – In C++, numeric values can be assigned to enumeration type
     variables only if they are cast to the type of the assigned variable.
     Numeric values assigned to enumeration type variables are
     checked to determine whether they are in the range of the internal
     values of the enumeration type.
   – Ada, C#, and Java 5.0 provide better support for enumeration than
     C++ because enumeration type variables in these languages are not
     coerced into integer types.
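The difference between coerced and non-coerced enumerations can be sketched with Python's standard enum module. This is an analogy, not the C++ or C# semantics themselves: IntEnum plays the role of an enumeration whose values are coerced to integers, and Enum the role of one that is not. The class and helper names are illustrative.

```python
from enum import Enum, IntEnum


class Color(Enum):        # values are not coerced to integers
    RED = 0
    BLUE = 1
    GREEN = 2


class CoercedColor(IntEnum):   # values behave as integers in arithmetic
    RED = 0
    BLUE = 1
    GREEN = 2


def arithmetic_allowed(member) -> bool:
    """Report whether `member + 1` is accepted for this enumeration."""
    try:
        member + 1
        return True
    except TypeError:
        return False


print(arithmetic_allowed(CoercedColor.BLUE))  # arithmetic is accepted
print(arithmetic_allowed(Color.BLUE))         # arithmetic is rejected

try:
    Color(7)                                  # 7 is outside the defined range
except ValueError as error:
    print("rejected:", error)
```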
Subrange Types
• A Subrange Type is an ordered contiguous subsequence of an
  ordinal type. For example: 12..18 is a subrange of integer type
• Ada’s design:
   – Subranges are part of subtypes. Subtypes are not new types,
     but only new names for restricted versions of existing types.
     type Days is (mon, tue, wed, thu, fri, sat, sun);
     subtype Weekdays is Days range mon..fri;
     subtype Index is Integer range 1..100;
     – The restriction on the existing type is in the range of possible
       values. All operations defined for parent type are also defined
       for the subtype.
           Day1: Days;
           Day2: Weekdays;
           Day2 := Day1;   -- legal unless the value of Day1 is sat or sun

Subrange Types
     – The compiler must generate range-checking code for every
       assignment to a subrange variable.
     – Types are checked for compatibility at compile time and range
       checking is done at run time.
     – Common uses of user-defined ordinal types: indexes of arrays and
       loop variables.
     – Subrange types are different from Ada’s derived types:

     type derived_small_int is new integer range 1..100;
     subtype subrange_small_int is integer range 1..100;

     – Variables of both types inherit the value range and operations of integer.
     – Variables of type derived_small_int are not compatible with any
       integer type.
     – Variables of type subrange_small_int are compatible with
       variables and constants of integer type and any subtype of integer.
Subrange Evaluation
• Aid to readability: makes it clear to readers that variables of a
    subrange type can store only a certain range of values
• Reliability: Assigning a value to a subrange variable that is
  outside the specified range is detected as an error by the compiler
  or by the run-time system.
Implementation of User-Defined Ordinal Types
• Enumeration types are implemented as integers
• Subrange types are implemented like the parent types with
    code inserted (by the compiler) to restrict assignments to
    subrange variables.
     – This step increases code size and execution time but is usually
       considered well worth the cost.
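The range-checking code that a compiler inserts for subrange assignments can be imitated by hand. The class below is a hypothetical Python sketch, not how any particular compiler implements it: the value is stored as a plain integer (like the parent type), and every assignment passes through a check.

```python
class Subrange:
    """An integer variable restricted to low..high, checked on every assignment."""

    def __init__(self, low: int, high: int, value: int):
        self.low, self.high = low, high
        self.value = value            # goes through the checked setter below

    @property
    def value(self) -> int:
        return self._value

    @value.setter
    def value(self, new_value: int) -> None:
        # This test corresponds to the inserted range-checking code.
        if not self.low <= new_value <= self.high:
            raise ValueError(f"{new_value} is outside {self.low}..{self.high}")
        self._value = new_value


index = Subrange(1, 100, 1)   # like: subtype Index is Integer range 1..100
index.value = 100             # accepted
try:
    index.value = 101         # rejected at run time
except ValueError as error:
    print("range check failed:", error)
```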

Array Types
• An array is an aggregate of homogeneous data elements in which an
  individual element is identified by its position in the aggregate,
  relative to the first element.
• A reference to an array element needs one or more subscripts, which
  require a run-time calculation to determine the memory location being
  referenced.
• Array Design Issues:
   – What types are legal for subscripts?
   – Are subscripting expressions in element references range
      checked?
   – When are subscript ranges bound?
   – When does allocation take place?
   – What is the maximum number of subscripts?
   – Can array objects be initialized?
   – Are any kinds of slices allowed?
Array Indexing
• Specific elements are determined by means of a two-level mechanism:
  the first part is the aggregate name, the second part is a dynamic
  selector consisting of one or more items known as subscripts or indexes.
• If all subscripts in a reference are constants – the selector is static,
  otherwise it is dynamic.
• Indexing (or subscripting) is a mapping from indices to elements
    array_name(index_value_list)→ an element
• Index Syntax
   – FORTRAN, PL/I, Ada use parentheses
       • Ada explicitly uses parentheses to show uniformity between
         array references and function calls because both are mappings
   – Most other languages use brackets
   – In languages that provide multidimensional arrays as arrays of arrays,
     each subscript appears in its own pair of brackets.
Arrays Index (Subscript) Types
• Using parentheses to enclose both subscripts and subprogram parameters
  creates a readability problem: List(59) may be a reference to an
  array element or a call to a function named List. Context information
  is used to resolve it.
• Array element references map subscripts to specific array elements.
• Function calls map parameters to function values.
• FORTRAN, C: integer only
• Pascal: any ordinal type (integer, Boolean, char, enumeration)
• Ada: integer or enumeration (includes Boolean and char)
• Java: integer types only
• C, C++, Perl, and Fortran do not specify range checking
• Java, ML, C# specify range checking
Subscript Binding and Array Categories
• There are 5 categories of arrays; the categories are defined by the
  binding of subscript ranges and the binding to storage
• The category names indicate where and when storage is allocated.
• A Static array is one in which the subscript ranges are statically
  bound and storage allocation is static (before run-time)
   – Advantage: efficiency (no dynamic allocation)
• A Fixed stack-dynamic array: is one in which subscript ranges are
  statically bound, but the allocation is done at declaration
  elaboration time during execution.
   – Advantage: space efficiency; A large array in one subprogram
      can use the same space in another subprogram (The 2
      subprograms are not active at the same time).

Subscript Binding and Array Categories (continued)
• A Stack-dynamic array is one in which subscript ranges are
    dynamically bound and the storage allocation is dynamic (done at
    run-time). Once the subscript ranges are bound and storage is
    allocated, they remain fixed during the lifetime of the variable.
     – Advantage: flexibility (the size of an array need not be known
        until the array is to be used).
• A Fixed heap-dynamic array: similar to fixed stack-dynamic:
    storage binding is dynamic but fixed after allocation (i.e., binding
    is done when requested and storage is allocated from heap, not
    stack).
• A Heap-dynamic array is one in which binding of subscript
    ranges and storage allocation is dynamic and can change any
    number of times during the array's lifetime.
     – Advantage: flexibility (arrays can grow or shrink during
       program execution).
Subscript Binding and Array Categories (continued)
• C and C++ arrays that include the static modifier are static
• C and C++ arrays declared in functions without the static modifier are
  fixed stack-dynamic
• C and C++ provide fixed heap-dynamic arrays (C through the library
  functions malloc and free; C++ through the operators new and delete).
• Ada arrays can be stack-dynamic
• In Java, all arrays are fixed heap-dynamic arrays.
• C# provides fixed heap-dynamic arrays and includes a second
  array class, ArrayList, that provides heap-dynamic arrays: objects of this
  class are created without any elements, and elements are added with the
  Add method.
• Perl and JavaScript support heap-dynamic arrays, which can
  grow and shrink. In Perl, we create an array of 5 elements with @list =
  (1,2,3,5,7);
• It can be lengthened with the push function, as in push(@list, 11, 19);
• The array can be emptied with @list = ();

Array Initialization
• Some languages allow initialization at the time of storage allocation
   – C, C++, Java, C# example: int list [] = {4, 5, 7, 83};
   – Character strings in C and C++: char name [] = "freddie";
    – Arrays of strings in C and C++
       char *names [] = {"Bob", "Jake", "Joe"};
    – Java initialization of String objects
      String[] names = {"Bob", "Jake", "Joe"};
    – Ada aggregate values
      List : array (1..5) of Integer := (1, 3, 5, 7, 9);   -- initializes all elements
      Bunch : array (1..5) of Integer := (1 => 17, 3 => 35, others => 0);
    – In Bunch, the first and third elements are initialized using direct
      assignment, and the others clause initializes the remaining elements.

Arrays Operations
• An array operation is one that operates on an array as a unit.
• Ada allows array assignment and also concatenation (&).
  Concatenation is defined between 2 single-dimensional arrays and
  between a single-dimensional array and a scalar.
• Fortran provides operations that are called elemental because they are
  applied between pairs of array elements
   – For example, the + operator between two arrays results in an array
     of the sums of the element pairs of the two arrays
   – Library functions for matrix multiplication, transpose, dot
     product, …
• APL provides the most powerful array processing operations for
  vectors and matrices, as well as unary operators (for example, to
  reverse column elements). See examples of APL array operations
  on page 272.

Rectangular and Jagged Arrays
• A rectangular array is a multi-dimensioned array in
  which all of the rows have the same number of elements
  and all columns have the same number of elements. All
  subscripts are placed in a single pair of brackets.
• A jagged array has rows (or columns) with varying numbers
  of elements. A separate pair of brackets is used for each
  dimension, as in a[6][5].
     – Possible when multi-dimensioned arrays are actually
       arrays of arrays
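Python lists of lists are arrays of arrays, so they can show the difference directly. A minimal sketch with invented data:

```python
# Rectangular: 3 rows, each with exactly 5 elements.
rectangular = [[0] * 5 for _ in range(3)]

# Jagged: each row is a separate array and may have its own length.
jagged = [[1], [2, 3], [4, 5, 6]]

rectangular_lengths = [len(row) for row in rectangular]
jagged_lengths = [len(row) for row in jagged]

print(rectangular_lengths)   # all rows have the same length
print(jagged_lengths)        # row lengths differ
print(jagged[2][1])          # one pair of brackets per dimension
```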




Slices
• A slice is some substructure of an array; nothing more
  than a referencing mechanism
• Slices are only useful in languages that have array
  operations
• Slice Examples:
     – Fortran 95
     Integer, Dimension (10) :: Vector
     Integer, Dimension (3, 3) :: Mat
     Integer, Dimension (3, 3, 4) :: Cube

     Vector (3:6) is a four-element array
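For comparison, the same slices can be written with Python lists. This is only an analogy: Python subscripts start at 0 and exclude the upper bound, and slicing a list copies the elements rather than merely referencing them. The Mat sections shown in the comments use standard Fortran section syntax and are added here for illustration.

```python
vector = list(range(1, 11))        # stands in for Vector, holding 1..10
mat = [[11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]]               # stands in for Mat

middle = vector[2:6]               # like Vector(3:6): four elements
third_row = mat[2]                 # like Mat(3, :)
second_column = [row[1] for row in mat]   # like Mat(:, 2)

print(middle, third_row, second_column)
```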


Slices Examples in Fortran 95




Implementation of Arrays
• Implementing arrays requires more compile-time effort than does
  implementing simple types such as int.
• The code to access array elements must be generated at compile
  time.
• Access function maps subscript expressions to an address in the
  array
• Access function for single-dimensioned arrays:
    address(list[k])= address(list[lower_bound]) +
                          ((k-lower_bound)* element_size)


• The compile-time descriptor for single-dimensioned arrays
  includes information needed to construct access function.
• If all attributes are static and index range checking is not done at
  run time, no run-time descriptor is needed.
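The single-dimensioned access function above translates directly into code. In this sketch the function name, the base address 1000, and the 4-byte element size are made-up values chosen for the example.

```python
def element_address(base: int, k: int, lower_bound: int, element_size: int) -> int:
    """address(list[k]) = address(list[lower_bound]) + (k - lower_bound) * element_size"""
    return base + (k - lower_bound) * element_size


# Array with subscripts 1..10, 4-byte elements, first element at address 1000.
first = element_address(1000, 1, 1, 4)
fifth = element_address(1000, 5, 1, 4)

# The part that does not depend on k can be computed once, before run time.
constant_part = 1000 - 1 * 4
fifth_again = constant_part + 5 * 4

print(first, fifth, fifth_again)
```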
Accessing Multi-dimensioned Arrays

• Multidimensional arrays are more complex to implement than
  single-dimensioned arrays.
• Memory is linear – a simple sequence of bytes.
• Values of data types that have 2 or more dimensions must be mapped
  onto the single-dimensioned memory.
• Two common ways to store a multidimensional array :
   – Row major order (by rows) – used in most languages
   – Column major order (by columns) – used in Fortran
• Sequential access to matrix elements will be faster if they are accessed
  in the order in which they are stored – minimizing paging.
• The access function for a multidimensional array is the mapping of its
  base address and a set of index values to the address in memory of the
  element specified by the index values.
• The access function for a 2-dimensional array stored in row-major order
  is shown below:
Row / Column major ordering

                                      11 12 13 14 15
                                      21 22 23 24 25
                                      31 32 33 34 35

   Row major order (second subscript increases faster)
    11 12 13 14 15 21 22 23 24 25 31 32 33 34 35

    Column major order (first subscript increases faster)
    11 21 31 12 22 32 13 23 33 14 24 34 15 25 35
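The two orderings can be reproduced by flattening the matrix in code. A small Python sketch using the same 3 by 5 matrix; the variable names are illustrative.

```python
matrix = [[11, 12, 13, 14, 15],
          [21, 22, 23, 24, 25],
          [31, 32, 33, 34, 35]]
rows, columns = len(matrix), len(matrix[0])

# Row major: walk each row in turn, so the second subscript varies fastest.
row_major = [matrix[i][j] for i in range(rows) for j in range(columns)]

# Column major: walk each column in turn, so the first subscript varies fastest.
column_major = [matrix[i][j] for j in range(columns) for i in range(rows)]

print(row_major)
print(column_major)
```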


Locating an Element in a Multi-dimensioned Array
• The address of an element is the base address of the array plus the
  element size times the number of elements preceding it in the array.
Loc (a[i, j])= address of a[1,1] + (# of elements preceding it) * el_size
             = address of a[1,1] + ((number of rows above ith row * row_size)
                                 + number of elements left of jth column) * el_size
             = address of a[1,1] + ((i-1)*n + (j-1)) * el_size
             = address of a[1,1] + (i*n-n+j-1) * el_size
             = address of a[1,1] + ((i*n+j)-(n+1)) * el_size
             = address of a[1,1] + (i*n+j)*el_size - (n+1)*el_size
             = address of a[1,1] - (n+1)*el_size + (i*n+j)*el_size
  where n is the number of elements in a row (row_size) and subscripts start at 1.


[Figure: location of a[i,j] in the array, with labels (i-1)*element size and (j-1)*element size]
Locating an Element in a Multi-dimensioned Array

 • The address of an element is the base address of the array plus the
   element size times the number of elements preceding it in the array.
 Location(a[i,j])= address of a[1,1]+
                  ((number of rows above ith row * row_size) +
                  number of elements left of jth column) * element_size
 • General format, with n elements per row:
 Location(a[i,j])=address of a[row_lb , col_lb]-
            (((row_lb * n)+ col_lb)* element_size)+
            (((i * n) + j) * element_size)

• The first 2 terms are the constant part and
  the last is the variable part.
• For each dimension of an array, ONE add
  and ONE multiply instruction are required
  for the access function.
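The general row-major access function can be checked against a direct count of the preceding elements. The Python sketch below is illustrative: the function names, the base address 1000, and the 4-byte element size are assumptions made for the example.

```python
def location(base, i, j, n, element_size, row_lb=1, col_lb=1):
    """Row-major address of a[i, j], split into constant and variable parts."""
    constant_part = base - (row_lb * n + col_lb) * element_size
    variable_part = (i * n + j) * element_size
    return constant_part + variable_part


def location_by_counting(base, i, j, n, element_size, row_lb=1, col_lb=1):
    """Same address, computed from the number of elements preceding a[i, j]."""
    preceding = (i - row_lb) * n + (j - col_lb)
    return base + preceding * element_size


# 3 x 5 matrix of 4-byte elements whose first element is at address 1000.
print(location(1000, 1, 1, n=5, element_size=4))   # the first element
print(location(1000, 2, 3, n=5, element_size=4))   # 7 elements precede a[2, 3]
```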
Compile-Time Descriptors




[Figure: compile-time descriptors for a single-dimensioned array and for a multi-dimensional array]
Associative Arrays
• An associative array is an unordered collection of data elements
  that are indexed by an equal number of values called keys
     – User defined keys must be stored in the structure
     – In nonassociative arrays: the indices never need to be stored
     – Each element of an associative array is a pair of entities: a
         key and a value.
• Design issues: What is the form of references to elements?
• In Perl, associative arrays are called hashes because their elements are
  stored and retrieved with hash functions. Every hash variable name must
  begin with %. Scalar variable names begin with $. To reference an element,
  the key value is placed in braces and the hash name is replaced by a
  scalar variable name that is the same except for the first character.

Associative Arrays in Perl
• Names begin with %; literals are delimited by parentheses
     %hi_temps = ("Mon" => 77, "Tue" => 79, "Wed" => 65, …);
• Subscripting is done using braces and keys
   $hi_temps{"Wed"} = 83;
   – Elements can be removed with delete
     delete $hi_temps{"Tue"};
   – The entire hash can be emptied by assigning an empty literal to
     it: %hi_temps = ();




Perl’s Associative Arrays
•    Perl has a primitive datatype for hash tables aka “associative arrays”.
•    Elements indexed not by consecutive integers but by arbitrary keys
•    %ages refers to an associative array and @people to a regular array
•    Note the use of { } for associative arrays and [ ] for regular arrays

       %ages = ("Bill Clinton" => 53, "Hillary" => 51,
         "Socks" => "27 in cat years");
       $ages{"Hillary"} = 52;
       @people = ("Bill Clinton", "Hillary", "Socks");
       $ages{"Bill Clinton"};      # returns 53
       $people[1];                 # returns "Hillary"
• keys(X), values(X), and each(X)
       foreach $person (keys(%ages)) {print "I know the age
         of $person\n";}
       foreach $age (values(%ages)) {print "Somebody is
         $age\n";}
       while (($person, $age) = each(%ages)) {print "$person
         is $age\n";}
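For readers who do not know Perl, the same operations can be written with a Python dictionary. This is a rough equivalent rather than a translation: unlike a Perl hash, a Python dict keeps its keys in insertion order.

```python
ages = {"Bill Clinton": 53, "Hillary": 51, "Socks": "27 in cat years"}
ages["Hillary"] = 52                       # like $ages{"Hillary"} = 52;
people = ["Bill Clinton", "Hillary", "Socks"]

for person in ages.keys():                 # like keys(%ages)
    print(f"I know the age of {person}")
for age in ages.values():                  # like values(%ages)
    print(f"Somebody is {age}")
for person, age in ages.items():           # like each(%ages)
    print(f"{person} is {age}")

del ages["Socks"]                          # like delete $ages{"Socks"};
print(ages["Bill Clinton"], people[1])
```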
Record Types
  • A record is a possibly heterogeneous aggregate of
    data elements in which the individual elements are
    identified by names
  • Design issues:
        – What is the syntactic form of references to the field?
        – Are elliptical references allowed?




Definition of Records
•   Record Field References
     – COBOL
     field_name OF record_name_1 OF ... OF record_name_n
     – Others (dot notation)
     record_name_1.record_name_2. ... .record_name_n.field_name
•   COBOL uses level numbers to show nested records; others use recursive
    definition
     01 EMP-REC.
          02 EMP-NAME.
               05 FIRST PIC X(20).
               05 MID       PIC X(10).
               05 LAST PIC X(20).
          02 HOURLY-RATE PIC 99V99.
Definition of Records in Ada
  • Record structures are indicated in an orthogonal way
    type Emp_Rec_Type is record
       First: String (1..20);
       Mid: String (1..10);
       Last: String (1..20);
       Hourly_Rate: Float;
    end record;
    Emp_Rec: Emp_Rec_Type;



References to Records
• Most languages use dot notation:      Emp_Rec.Name
• Fully qualified references must include all record names
• Elliptical references allow leaving out record names as long as the
  reference is unambiguous; for example, in COBOL
  FIRST, FIRST OF EMP-NAME, and FIRST OF EMP-REC
  are elliptical references to the employee’s first name.
Operations on Records
•   Assignment is very common if the types are identical
•   Ada allows record comparison
•   Ada records can be initialized with aggregate literals
•   COBOL provides MOVE CORRESPONDING
     – Copies each field of the source record to the field with the
       same name in the target record
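
As a rough illustration of the reference and operation bullets above, the sketch below uses C++ structs as records. The names EmpName, EmpRec, same_employee and copy_of_sample are hypothetical, chosen only to mirror the EMP-REC example; they do not come from the slides.

```cpp
#include <cassert>
#include <string>

// Hypothetical nested record mirroring the EMP-REC example.
struct EmpName {
    std::string first;
    std::string mid;
    std::string last;
};

struct EmpRec {
    EmpName name;        // nested record
    double hourly_rate;
};

// Field-by-field equality. Before C++20 the language does not generate
// record comparison automatically, so it is written out here.
bool same_employee(const EmpRec& a, const EmpRec& b) {
    return a.name.first == b.name.first &&
           a.name.mid == b.name.mid &&
           a.name.last == b.name.last &&
           a.hourly_rate == b.hourly_rate;
}

// Builds a record from an aggregate literal, then copies it with
// whole-record assignment.
EmpRec copy_of_sample() {
    EmpRec emp = {{"Ada", "K", "Byron"}, 12.50};  // aggregate initialization
    EmpRec copy;
    copy = emp;                                   // whole-record assignment
    assert(copy.name.first == emp.name.first);    // fully qualified reference
    return copy;
}
```

C++ has no elliptical references: every field reference must be fully qualified, as in copy.name.first.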
Evaluation and Comparison to Arrays
• Straightforward and safe design
• Records are used when the collection of data values is
  heterogeneous
• Access to array elements is much slower than access to
  record fields, because subscripts are dynamic (field
  names are static)
• Dynamic subscripts could be used with record field
  access, but it would disallow type checking and it would
  be much slower




Implementation of Record Type



• An offset address, relative to the beginning of the record,
  is associated with each field
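
The offset idea can be sketched in C++ with the standard offsetof macro. The EmpRec layout and the helper names below are assumptions made for illustration, and the exact offsets depend on the compiler's padding rules.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Hypothetical fixed-size record, similar in spirit to the employee record.
struct EmpRec {
    char first[20];
    char mid[10];
    char last[20];
    float hourly_rate;
};

// Address of a field = address of the record + the field's constant offset.
const char* field_address(const EmpRec& rec, std::size_t offset) {
    return reinterpret_cast<const char*>(&rec) + offset;
}

// Offsets are fixed at compile time, which is why field access is cheap
// compared with a subscript that must be evaluated at run time.
std::size_t offset_of_last() { return offsetof(EmpRec, last); }
std::size_t offset_of_rate() { return offsetof(EmpRec, hourly_rate); }
```

Because each offset is a constant, the compiler can fold the addition into the generated access instruction.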




Union Types
  • A union is a type whose variables are allowed to store
    values of different types at different times during execution
  • Design issues
     – Should type checking be required?
     – Should unions be embedded in records?
  Discriminated vs. Free Unions
  • Fortran, C, and C++ provide union constructs in which there
    is no language support for type checking; the union in these
    languages is called a free union
  • Type checking of unions requires that each union include a
    type indicator called a discriminant
     – Supported by Ada
Ada Union Types
type Shape is (Circle, Triangle, Rectangle);
type Colors is (Red, Green, Blue);
type Figure (Form: Shape) is record
  Filled: Boolean;
  Color: Colors;
  case Form is
     when Circle => Diameter: Float;
     when Triangle =>
          Leftside, Rightside: Integer;
          Angle: Float;
     when Rectangle => Side1, Side2: Integer;
  end case;
end record;
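
For comparison, a discriminated union can be imitated by hand in C++ by pairing a free union with a tag field. This is only a sketch: the C++ names (Figure, make_circle, make_rectangle, diameter_of) are hypothetical, and the check that Ada performs automatically has to be written explicitly.

```cpp
#include <cassert>
#include <stdexcept>

enum class Shape { Circle, Triangle, Rectangle };

struct TriangleData { int leftside, rightside; float angle; };
struct RectangleData { int side1, side2; };

// Hypothetical C++ counterpart of the Ada Figure record: the tag `form`
// plays the role of the discriminant.
struct Figure {
    Shape form;
    bool filled;
    union {                  // the variant part shares storage
        float diameter;      // used when form == Shape::Circle
        TriangleData tri;    // used when form == Shape::Triangle
        RectangleData rect;  // used when form == Shape::Rectangle
    };
};

Figure make_circle(float diameter) {
    Figure f;
    f.form = Shape::Circle;
    f.filled = false;
    f.diameter = diameter;
    return f;
}

Figure make_rectangle(int side1, int side2) {
    Figure f;
    f.form = Shape::Rectangle;
    f.filled = false;
    f.rect = RectangleData{side1, side2};
    return f;
}

// Checked access: the discriminant is tested before the variant is read.
float diameter_of(const Figure& f) {
    if (f.form != Shape::Circle)
        throw std::logic_error("Figure is not a Circle");
    return f.diameter;
}
```

Nothing in C++ stops code from reading f.diameter directly without the test, which is the sense in which its unions are free.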
Ada Union Type Illustrated




A discriminated union of three shape variables

Evaluation of Unions
• Potentially unsafe construct: free unions do not allow type checking
• Java and C# do not support unions, reflecting growing concerns
  for safety in programming languages
Pointer and Reference Types
• A pointer type variable has a range of values consisting of
  memory addresses and a special value, NULL (nil)
   – Provide the power of indirect addressing
   – Provide a way to manage dynamic memory
   – A pointer can be used to access a location in the area where
     storage is dynamically created (usually called a heap)
• Design Issues of Pointers
     – What are the scope and lifetime of a pointer variable?
     – What is the lifetime of a heap-dynamic variable?
     – Are pointers restricted as to the type of value to which they can
       point?
     – Are pointers used for dynamic storage management, indirect
       addressing, or both?
     – Should the language support pointer types, reference types, or both?

Pointer Operations
• Two fundamental operations: assignment and
  dereferencing
• Assignment is used to set a pointer variable’s value to
  some useful address
• Dereferencing yields the value stored at the location
  represented by the pointer’s value
     – Dereferencing can be explicit or implicit
     – C++ uses an explicit operation via *
       j = *ptr
       sets j to the value located at ptr


Pointer Assignment Illustrated




The assignment operation j = *ptr

Problems with Pointers


•     Dangling pointers (dangerous)
       – A dangling pointer is a pointer that still points to a heap-dynamic
         variable after that variable has been deallocated (deleted)
       – Creating one:
             • Allocate a heap-dynamic variable and set a pointer to point at it
             • Set a second pointer to the value of the first pointer
             • Deallocate the heap-dynamic variable, using the first pointer
       – Example:
           int *myPtr, *urPtr;
           myPtr = new int(10);
           cout << "The value of *myPtr is " << *myPtr << endl;
           urPtr = myPtr;      // urPtr is now an alias of myPtr
           delete myPtr;       // both myPtr and urPtr are now dangling
           *myPtr = 5;         // error: writes to deallocated storage
           cout << "The value of *myPtr is " << *myPtr << endl;  // undefined behavior
•     It is an error to dereference a pointer after its storage has been deleted
      through it or through any of its aliases; such a pointer is a
      “dangling pointer”
Problems with Pointers
• Lost heap-dynamic variable
   – An allocated heap-dynamic variable that is no longer accessible to
     the user program (often called garbage)
   – Creating one:
       • Pointer p1 is set to point to a newly created heap-dynamic
         variable
       • Pointer p1 is later set to point to another newly created heap-
         dynamic variable. This causes losing the first heap-dynamic
         variable, i.e. that variable cannot be accessed or deallocated.
       • Example:
              void *p1;
              p1 = new int(10);
              p1 = new float(7.4); // the int variable (value 10) is lost

• The process of losing heap-dynamic variables is called memory
    leakage

Pointers in Ada
• Some dangling pointers are disallowed because dynamic
  objects can be automatically deallocated at the end of
  the pointer type's scope
• All pointers are initialized to null
• The lost heap-dynamic variable problem is not
  eliminated by Ada




Pointers in C and C++
•   Extremely flexible but must be used with care
•   Pointers can point at any variable regardless of when it was allocated
•   Used for dynamic storage management and addressing
•   Pointer arithmetic is possible
•   Explicit dereferencing and address-of operators
•   Domain type need not be fixed (void *)
       float stuff[100];
       float *p;
       p = stuff;
     *(p+5) is equivalent to stuff[5] and p[5]
     *(p+i) is equivalent to stuff[i] and p[i]

• void * can point to any type and can be type checked (cannot be
  dereferenced)
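
The equivalences above, and the rule that a void * must be converted before use, can be checked with a small sketch. The function names element_via_pointer and read_float are hypothetical.

```cpp
#include <cassert>

// *(p + i) and p[i] name the same element as stuff[i] once p = stuff.
float element_via_pointer(const float* p, int i) {
    return *(p + i);
}

// A void* cannot be dereferenced directly; it must first be cast back
// to a typed pointer. The caller is trusted to pass a pointer to a float.
float read_float(const void* generic) {
    assert(generic != nullptr);
    const float* typed = static_cast<const float*>(generic);
    return *typed;
}
```

The cast in read_float is not verified by the compiler, so passing the address of a non-float object would compile but give undefined behavior.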

Pointers in Fortran 95
• Pointers point to heap and non-heap variables
• Implicit dereferencing
• Pointers can only point to variables that have the
  TARGET attribute
• The TARGET attribute is assigned in the declaration, e.g.
     INTEGER, TARGET :: NODE
• Pointers are declared with the POINTER attribute and set with
  the pointer assignment operator =>
     REAL, POINTER :: ptr (POINTER is an attribute)
     ptr => target (where target is either a pointer or a non-pointer
       with the TARGET attribute)
Reference Types

• C++ includes a special kind of pointer type called a
  reference type that is used primarily for formal
  parameters
     – Advantages of both pass-by-reference and pass-by-value
• Java extends C++’s reference variables and allows them
  to replace pointers entirely
     – References refer to class instances
• C# includes both the references of Java and the pointers
  of C++
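
A short sketch of C++ reference parameters follows; add_bonus and doubled are hypothetical names used only for illustration.

```cpp
#include <cassert>

// A reference parameter is an alias for the caller's variable: no copy is
// made and no explicit dereference is written.
void add_bonus(double& salary, double bonus) {
    salary += bonus;
}

// A const reference avoids the copy while still preventing the callee
// from modifying the caller's variable.
double doubled(const double& value) {
    return value * 2.0;
}
```

Unlike a pointer, a C++ reference cannot be reseated to refer to a different variable after it is bound.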



Evaluation of Pointers
• Dangling pointers and dangling objects are problems, as
  is heap management
• Pointers are like goto's: they widen the range of cells
  that can be accessed by a variable
• Pointers or references are necessary for dynamic data
  structures, so we can't design a language without them.

Representations of Pointers
• Large computers use single values
• Intel microprocessors use segment and offset

Solving Dangling Pointer Problem
• Tombstone: extra heap cell that is a pointer to the heap-dynamic
  variable
   – The actual pointer variable points only at tombstones
   – When a heap-dynamic variable is deallocated, the tombstone
     remains but is set to nil
   – Costly in time and space
• Locks-and-keys: Pointer values are represented as (key, address)
  pairs
   – Heap-dynamic variables are represented as the variable plus a cell
     for an integer lock value.
   – When a heap-dynamic variable is allocated, a lock value is created
     and placed in the lock cell and in the key cell of the pointer.
   – On every dereference, the key is compared with the lock; deallocation
     resets the lock to an illegal value, so stale keys no longer match
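
The locks-and-keys scheme can be sketched as below. This is a toy model under stated assumptions: the heap is simulated by a small fixed array so that a stale pointer can be inspected safely, and all names (Cell, KeyedPtr, allocate, deallocate, dereference, kFreeLock) are hypothetical.

```cpp
#include <cassert>
#include <new>        // std::bad_alloc
#include <stdexcept>  // std::runtime_error

// Hypothetical heap cell: an integer lock plus the stored value.
struct Cell {
    int lock;
    int value;
};

// Hypothetical checked pointer: a (key, address) pair.
struct KeyedPtr {
    int key;
    Cell* address;
};

const int kFreeLock = 0;                  // assumed "illegal" lock value
const int kHeapSize = 8;
Cell simulated_heap[kHeapSize] = {};      // all locks start as kFreeLock
int next_lock = 1;

// Allocation: create a fresh lock and copy it into the pointer's key.
KeyedPtr allocate(int value) {
    for (int i = 0; i < kHeapSize; ++i) {
        if (simulated_heap[i].lock == kFreeLock) {
            simulated_heap[i].lock = next_lock;
            simulated_heap[i].value = value;
            return KeyedPtr{next_lock++, &simulated_heap[i]};
        }
    }
    throw std::bad_alloc();
}

// Every access compares the pointer's key with the cell's lock.
int dereference(KeyedPtr p) {
    if (p.address->lock != p.key)
        throw std::runtime_error("dangling pointer: key does not match lock");
    return p.address->value;
}

// Deallocation clears the lock, so every outstanding copy of the key
// becomes stale at once.
void deallocate(KeyedPtr p) {
    if (p.address->lock != p.key)
        throw std::runtime_error("dangling pointer: key does not match lock");
    p.address->lock = kFreeLock;
}
```

Copying a KeyedPtr copies the key as well, so any number of aliases are invalidated by a single deallocate, even after the cell has been reused with a new lock.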


Heap Management
•   Memory management: identify unused, dynamically allocated memory cells
    and return them to the heap
•   Approaches
     – Manual: explicit allocation and deallocation (C, C++)
     – Automatic:
          • Reference counters (Modula-2, Adobe Photoshop)
          • Garbage collection (Lisp, Java)
•   Problems with manual approach:
     – Requires programmer effort
     – Programmer failures lead to space leaks and dangling references/sharing
     – Proper explicit memory management is difficult and has been estimated to
        account for up to 40% of development time!
•   A very complex run-time process
•   Single-size cells vs. variable-size cells
•   Two approaches to reclaim garbage
     – Reference counters (eager approach): reclamation is gradual
     – Garbage collection (lazy approach): reclamation occurs when the list of
        available space becomes empty
Reference Counter
• Idea: keep track of how many references there are to a cell in memory. If
  this number drops to 0, the cell is garbage.
• Reference counters: maintain a counter in every cell that stores the
  number of pointers currently pointing at the cell
• Store garbage in free list; allocate from this list
• Advantages
   – resources can be freed directly
   – immediate reuse of memory possible
• Disadvantages
   – Can’t handle cyclic data structures (cells connected circularly)
   – Bad locality properties
   – Large execution-time overhead for pointer manipulation
   – Extra space required for the counters
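
A minimal sketch of the reference-counter idea, assuming a hand-written counter stored in each cell. The names Cell, make_cell, share, release and live_cells are hypothetical, and live_cells exists only so the effect can be observed.

```cpp
#include <cassert>

// Hypothetical cell with an embedded reference counter.
struct Cell {
    int value;
    int ref_count;
};

int live_cells = 0;   // bookkeeping for the demonstration only

// A new cell starts with one reference: the pointer returned here.
Cell* make_cell(int value) {
    ++live_cells;
    return new Cell{value, 1};
}

// Creating another pointer to the cell increments its counter.
Cell* share(Cell* cell) {
    assert(cell != nullptr);
    ++cell->ref_count;
    return cell;
}

// Dropping a pointer decrements the counter; the cell is reclaimed
// eagerly as soon as the counter reaches zero. Returns nullptr so the
// caller can overwrite the pointer it just gave up.
Cell* release(Cell* cell) {
    assert(cell != nullptr && cell->ref_count > 0);
    if (--cell->ref_count == 0) {
        delete cell;
        --live_cells;
    }
    return nullptr;
}
```

If two cells held counted pointers to each other, neither counter could reach zero after the program dropped its own pointers, which is the cyclic-structure problem listed above.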



Garbage Collection
•   GC is a process by which dynamically allocated storage is reclaimed during
    the execution of a program.
•   Usually refers to automatic periodic storage reclamation by the garbage
    collector (part of the run-time system), as opposed to explicit code to free
    specific blocks of memory.
•   Usually triggered during memory allocation when available free memory falls
    below a threshold. Normal execution is suspended and GC is run.
•   The run-time system allocates storage cells as requested and disconnects
    pointers from cells as necessary; garbage collection then begins
     – Every heap cell has an extra bit used by the collection algorithm
     – All cells are initially set to garbage
     – All pointers are traced into the heap, and reachable cells are marked as not garbage
     – All garbage cells are returned to the list of available cells
     – Disadvantage: when you need it most, it works worst (it takes the most time
        when the program needs most of the cells in the heap)
•   Major GC algorithms:
     – Mark and sweep
     – Copying
     – Incremental garbage collection algorithms
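
The mark-and-sweep steps listed above can be sketched on a simulated heap. This is an illustrative model, not a real collector: cells live in a std::vector, each cell has a single outgoing pointer stored as an index (-1 for null), and the names Cell, mark, sweep and collect are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical heap cell: one outgoing pointer (an index, -1 for null),
// the extra mark bit used by the collector, and an allocation flag.
struct Cell {
    int next;
    bool marked;
    bool in_use;
};

// Mark phase: follow pointers from each root and mark reachable cells.
// Stopping at an already marked cell keeps cycles from looping forever.
void mark(std::vector<Cell>& heap, const std::vector<int>& roots) {
    for (std::size_t r = 0; r < roots.size(); ++r) {
        int i = roots[r];
        while (i != -1 && !heap[i].marked) {
            heap[i].marked = true;
            i = heap[i].next;
        }
    }
}

// Sweep phase: return every allocated but unmarked cell to the free
// state and clear all marks for the next collection. Returns the number
// of cells reclaimed.
int sweep(std::vector<Cell>& heap) {
    int reclaimed = 0;
    for (std::size_t i = 0; i < heap.size(); ++i) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = false;
            heap[i].next = -1;
            ++reclaimed;
        }
        heap[i].marked = false;
    }
    return reclaimed;
}

int collect(std::vector<Cell>& heap, const std::vector<int>& roots) {
    mark(heap, roots);
    return sweep(heap);
}
```

Because reachability is traced from the roots, an unreachable cycle of cells is reclaimed here, whereas reference counters alone would keep it alive.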
Marking Algorithm




Variable-Size Cells
• All the difficulties of single-size cells plus more
• Required by most programming languages
• If garbage collection is used, additional problems occur
     – The initial setting of the indicators of all cells in the heap is
       difficult
     – The marking process is nontrivial
     – Maintaining the list of available space is another source of
       overhead




Summary
• The data types of a language are a large part of what determines
  that language’s style and usefulness
• The primitive data types of most imperative languages include
  numeric, character, and Boolean types
• The user-defined enumeration and subrange types are convenient
  and add to the readability and reliability of programs
• Arrays and records are included in most languages
• Pointers are used for addressing flexibility and to control dynamic
  storage management





332 ch07

  • 1. ITCS332: Organization of Programming Languages Chapter 6 Data Types ISBN 0-321-33025-0
  • 2. Chapter 6 Topics • Introduction • Primitive Data Types • Character String Types • User-Defined Ordinal Types • Array Types • Associative Arrays • Record Types • Union Types • Pointer and Reference Types .ITCS332 by Dr. Abdel Fattah Salman 6-2
  • 3. Introduction • A data type defines a collection of data objects and a set of predefined operations on those objects. • How well the available data types match the real-world problem space is an important factor in how easily programs can be written, so it is crucial that a language support an appropriate variety of data types and structures. • PL/I included many data types, with the intent of supporting a large range of applications. • A better approach, taken in ALGOL 68: provide a few basic types and a few flexible structure-defining operators that allow a user to design data structures for each need. • User-defined types improve readability through the use of meaningful names for types. • User-defined types aid modifiability: a user can change the type of a category of variables in a program by changing only a type declaration statement. • The fundamental idea of an abstract data type is that the use of a type is separated from the representation and set of operations on values of that type. • The two most common structured (nonscalar) data types are arrays and records.
  • 4. Introduction • These data types are specified by type operators, or constructors. In C-based languages, the type operators are brackets, parentheses, and asterisks, which specify arrays, functions, and pointers, respectively. • A descriptor is the collection of the attributes of a variable. In an implementation, a descriptor is a collection of memory cells that store variable attributes. • If all attributes are static, descriptors are needed only at compile time. They are built by the compiler and stored in the symbol table. • For dynamic attributes, part or all of the descriptor is maintained during execution. • Descriptors are used for type checking and to build the code for allocation and deallocation operations. • The word object is associated with the value of a variable and the space it occupies. • An object represents an instance of a user-defined (abstract data) type. • In object-oriented languages, every instance of every class (predefined or user-defined) is an object. • One design issue for all data types: what operations are defined for variables of the type, and how are they specified?
  • 5. Primitive Data Types • Primitive data types: Those not defined in terms of other data types • Almost all programming languages provide a set of primitive data types – Some primitive data types are merely reflections of the hardware – Others require little non-hardware support – Primitive data types are used along with one or more type constructors to build the structured types. – Primitive data types include: integer, real, decimal, character, boolean .ITCS332 by Dr. Abdel Fattah Salman 6-5
  • 6. Primitive Data Types: Integer • Almost always an exact reflection of the hardware so the mapping is trivial • There may be as many as eight different (in size) integer types in a language • Java’s signed integer sizes: byte, short, int, long. • C# and C++ include unsigned integer types. • Signed integers are stored in 2’s complement representation. .ITCS332 by Dr. Abdel Fattah Salman 6-6
  • 7. Primitive Data Types: Floating Point • Floating-point types model real numbers, but only as approximations (values such as π and e cannot be represented exactly). • Problems: approximate representation and loss of accuracy through arithmetic operations. • Languages for scientific use support at least two floating-point types (e.g., float (4 bytes) and double (8 bytes)); sometimes more. • Usually exactly like the hardware, but not always. • IEEE Floating-Point Standard 754. • Precision is the accuracy of the fractional part of a value, measured as the number of bits. • Range is a combination of the exponent and fraction ranges.
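The approximation problem described on this slide can be observed directly. The following is a minimal sketch in Python, assuming (as is the case on virtually all current platforms) that Python's `float` is an IEEE 754 double; the variable names are illustrative only.

```python
import math
import sys

# Assumption: float is an IEEE 754 double-precision value.
total = 0.1 + 0.2

# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum
# differs from 0.3 by a small rounding error.
exactly_equal = (total == 0.3)

# Comparing with a tolerance is the usual way to cope with the error.
close_enough = math.isclose(total, 0.3)

# Precision: the number of bits in the significand (fraction) of a double.
mantissa_bits = sys.float_info.mant_dig
```

On an IEEE 754 double, the exact comparison fails while the tolerance-based comparison succeeds, which illustrates the loss of accuracy the slide warns about.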
  • 8. Primitive Data Types: Decimal • Computers designed for business applications have hardware support for decimal data types. • Decimal types store a fixed number of decimal digits, with a decimal point at fixed position in the value. • For business applications (money) – Essential to COBOL – C# offers a decimal data type • Advantage: accuracy • Disadvantages: limited range, wastes memory. • Decimal types are stored in BCD: unpacked -one digit per byte, packed -2 digits per byte. • Operations on decimal values are done in hardware or by simulation. .ITCS332 by Dr. Abdel Fattah Salman 6-8
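The packed BCD storage mentioned above (two decimal digits per byte) can be sketched as follows. This is an illustration only: the function names `pack_bcd` and `unpack_bcd` are hypothetical, and real decimal hardware also stores a sign and a decimal-point position, which are omitted here.

```python
def pack_bcd(digits: str) -> bytes:
    """Pack a string of decimal digits two per byte (packed BCD)."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("digits must contain only the characters 0-9")
    if len(digits) % 2:
        digits = "0" + digits  # pad to an even number of digits
    # High nibble holds the first digit of each pair, low nibble the second.
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))


def unpack_bcd(data: bytes) -> str:
    """Recover the digit string from packed BCD bytes."""
    return "".join(f"{b >> 4}{b & 0x0F}" for b in data)
```

For example, `pack_bcd("1234")` yields the two bytes `0x12 0x34`, whereas unpacked BCD would need four bytes for the same digits.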
  • 9. Primitive Data Types: Boolean • Simplest of all • Range of values: two elements, one for “true” and one for “false”. • Boolean types are used to represent switches or flags in programs • Could be implemented as a single bit, but often as a byte – Advantage: readability .ITCS332 by Dr. Abdel Fattah Salman 6-9
  • 10. Primitive Data Types: Character • Stored as numeric codings. • Most commonly used coding: ASCII. • An alternative, 16-bit coding: Unicode – Motivated by the globalization of business. – Computers need to communicate with other computers. – Includes characters from most natural languages. – The first 128 characters of Unicode are identical to those of ASCII. – Originally used in Java. – C# and JavaScript also support Unicode.
  • 11. Character String Types • Values are sequences of characters. • Design issues: – Is it a primitive type or just a special kind of array? – Should the length of strings be static or dynamic? • C and C++ define strings as arrays of char and provide string operations as functions in the standard library header string.h. Strings are null-terminated (ASCIIZ). • Problem: the library functions that move string data do not guard against overflowing the destination. C++ programmers should use the string class from the standard library rather than char arrays. • In Java, strings are supported as a primitive type by the String class (constant strings) and the StringBuffer class (changeable strings, more like arrays of chars); C# provides similar string classes. • Typical operations on strings: – Assignment, comparison (=, >, etc.), and copying, which are complicated if the operands have variable lengths. – Catenation. – Substring reference: a reference to a substring of a given string. – Perl, JavaScript, and PHP include built-in pattern matching operations based on regular expressions.
  • 12. Character String Types • The pattern expression /[A-Za-z][A-Za-z\d]+/ matches typical names in programming languages. • Brackets enclose character classes. • The first class specifies all letters; the second specifies all letters and digits (\d). • The plus specifies that there must be one or more of what is in the category. • So, the whole pattern matches strings that begin with a letter followed by one or more letters or digits. • The pattern expression /\d+\.?\d*|\.\d+/ matches numeric literals. • The \. specifies the decimal point; the ? quantifies what it follows to have zero or one appearance; the | separates two alternatives in the whole pattern: • The first alternative matches strings of one or more digits, possibly followed by a decimal point, followed by zero or more digits; • The second alternative matches strings that begin with a decimal point followed by one or more digits.
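The two patterns discussed on this slide can be tried out with Python's standard `re` module. This is a sketch under the assumption that the patterns use `\d` for digits and `\.` for a literal decimal point; the helper names `is_identifier` and `is_numeric_literal` are hypothetical.

```python
import re

# A letter followed by one or more letters or digits.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z\d]+")

# Digits with an optional decimal point and fraction, OR a decimal point
# followed by one or more digits.
NUMERIC = re.compile(r"\d+\.?\d*|\.\d+")


def is_identifier(text: str) -> bool:
    """True if the whole string matches the name pattern."""
    return IDENTIFIER.fullmatch(text) is not None


def is_numeric_literal(text: str) -> bool:
    """True if the whole string matches the numeric-literal pattern."""
    return NUMERIC.fullmatch(text) is not None
```

Note that, because of the `+`, the name pattern as written requires at least two characters, so a one-letter name such as `x` is not matched.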
  • 13. Character String Type in Certain Languages • C and C++ – Not primitive – Use char arrays and a library of functions that provide operations • SNOBOL4 (a string manipulation language) – Primitive – Many operations, including elaborate pattern matching • Java – Primitive via the String class .ITCS332 by Dr. Abdel Fattah Salman 6-13
  • 14. Character String Length Options There are several design options regarding the string length: • Static length string: the length is set when it is created as in COBOL and Java’s String class • Limited Dynamic Length: strings of varying length up to a fixed maximum defined by var’s definition as in C and C++ – In C-based language, a special character is used to indicate the end of a string’s characters, rather than maintaining the length • Dynamic Length strings: strings of varying length with no maximum as in SNOBOL4, Perl, JavaScript • Ada supports all three string length options: – Type string from the standard package. – Type bounded_string from the Ada.Strings.Bounded package. – Type Unbounded_string from the Ada.Strings.Unbounded package • Character String Type Evaluation – Aid to writeability – Dealing with strings as arrays can be more cumbersome than dealing with primitive string type. As a primitive type with static length, they are inexpensive to provide-- why not have them? (Providing string through standard library is like primitive strings). – Dynamic length is nice and flexible, but is it worth the expense? .ITCS332 by Dr. Abdel Fattah Salman 6-14
  • 15. Character String Implementation • String types could be supported in hardware but in most cases software is used to implement string storage, retrieval, and manipulation. • Static length: compile-time descriptor with 3 fields: – Name of the type – Type’s length in character – Address of the first character • Limited dynamic length: may need a run-time descriptor to store both the maximum and the current lengths (C and C++ do not require limited dynamic descriptor because the string is terminated with null). • Dynamic length: need run-time descriptor to store only the current length; • All descriptors are stored in symbol table.. .ITCS332 by Dr. Abdel Fattah Salman 6-15
  • 16. Compile- and Run-Time Descriptors • Allocation and deallocation are the biggest implementation problem for dynamic length strings: the storage must grow and shrink as needed. There are two approaches: – Strings can be stored in a linked list: allocation and deallocation are simple, but the links occupy extra storage and string operations become more complex. – A complete string can be stored in adjacent memory cells: this requires less storage and gives faster string operations, but allocation and deallocation are slower. • [Figure: compile-time descriptor for static strings; run-time descriptor for limited dynamic strings]
  • 17. User-Defined Ordinal Types • An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers • Examples of primitive ordinal types in Java – integer – char – boolean .ITCS332 by Dr. Abdel Fattah Salman 6-17
  • 18. Enumeration Types • An enumeration type is one in which all possible values, which are named constants, are provided in the definition. • Enumeration types provide a way of defining and grouping collections of named constants. • Enumeration constants are implicitly assigned integers: 0,1, … and can be explicitly assigned any integer in the definition. • An example in C# : enum days {mon, tue, wed, thu, fri, sat, sun}; • Design issues: All are related to type checking – Is an enumeration constant allowed to appear in more than one type definition, and if so, how is the type of an occurrence of that constant checked? – Are enumeration values coerced to integer? – Any other type coerced to an enumeration type? .ITCS332 by Dr. Abdel Fattah Salman 6-18
  • 19. Enumeration Types • In languages that do not have enumeration types, programmers simulate them with integer values. In FORTRAN 77, for example, colors can be given integer codes (here 0 represents red and 1 represents blue): integer red, blue data red, blue /0, 1/ • The problem with this approach is that, because no type has been defined for the colors, there is no type checking when they are used. • In C++: enum colors {red, blue, green, yellow, black}; colors mycolor = blue, yourColor = red; • The enumeration values are coerced to int when they are put in an integer context. For example, if the current value of mycolor is blue, then mycolor++ would assign green to mycolor. • C++ enumeration constants can appear in only ONE enumeration type in the same referencing environment. • C# enumeration types are like those of C++, except that they are never coerced to integers.
  • 20. Evaluation of Enumerated Type • Aid to readability: named values are easily recognized, whereas coded values are not (e.g., no need to code a color as a number). • Aid to reliability, e.g., the compiler can check that: – No arithmetic operations are performed on enumeration types (colors cannot be added). – No enumeration variable is assigned a value outside its defined range. – In C++, numeric values can be assigned to enumeration type variables only if they are cast to the type of the assigned variable. Numeric values assigned to enumeration type variables are checked to determine whether they are in the range of the internal values of the enumeration type. – Ada, C#, and Java 5.0 provide better support for enumeration than C++ because enumeration type variables in these languages are not coerced into integer types.
  • 21. Subrange Types • A subrange type is an ordered contiguous subsequence of an ordinal type. For example, 12..18 is a subrange of the integer type. • Ada’s design: – Subranges are part of subtypes. Subtypes are not new types, but only new names for restricted versions of existing types. type Days is (mon, tue, wed, thu, fri, sat, sun); subtype Weekdays is Days range mon..fri; subtype Index is Integer range 1..100; – The restriction on the existing type is in the range of possible values. All operations defined for the parent type are also defined for the subtype. Day1: Days; Day2: Weekdays; Day2 := Day1;
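The reliability checks listed above (no arithmetic, no out-of-range values, no silent coercion to integers) can be illustrated with Python's standard `enum` module, which behaves more like the C#/Java model than the C++ one. This is only a sketch; the `Color` type and the helpers `try_add_one` and `color_from_int` are hypothetical names introduced for illustration.

```python
from enum import Enum


class Color(Enum):
    RED = 0
    BLUE = 1
    GREEN = 2


def try_add_one(color: Color):
    """Attempt arithmetic on an enumeration value; return None if rejected."""
    try:
        return color + 1  # plain Enum members define no arithmetic
    except TypeError:
        return None


def color_from_int(code: int):
    """Explicit conversion from an integer, with a range check."""
    try:
        return Color(code)
    except ValueError:
        return None  # code is outside the defined range of values
```

Here `try_add_one(Color.BLUE)` is rejected, `color_from_int(1)` yields `Color.BLUE`, and `color_from_int(7)` fails the range check. An enumeration value also never compares equal to its integer code, so `Color.BLUE == 1` is false.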
  • 22. Subrange Types – The compiler must generate range-checking code for every assignment to a subrange variable. – Types are checked for compatibility at compile time and range checking is done at run time. – Common uses of user-defined ordinal types: indexes of arrays and loop vars. – Subrange types are different from Ada’s derived types: Type derived_small_int is new integer range 1..100; Subtype subrange_small_int is integer range 1..100; – Vars of both types inherit the value range and operations of integer. – Variables of derived_small_int are not compatible with any integer type. – Variables of type subrange_small_int are compatible with variables and constants of integer type and any subtype of integer. .ITCS332 by Dr. Abdel Fattah Salman 6-22
  • 23. Subrange Evaluation • Aid to readability: Make it clear to the readers that variables of subrange can store only certain range of values • Reliability: Assigning a value to a subrange variable that is outside the specified range is detected as an error by the compiler or by the run-time system. Implementation of User-Defined Ordinal Types • Enumeration types are implemented as integers • Subrange types are implemented like the parent types with code inserted (by the compiler) to restrict assignments to subrange variables. – This step increases code size and execution time but is usually considered well worth the cost. .ITCS332 by Dr. Abdel Fattah Salman 6-23
  • 24. Array Types • An array is an aggregate of homogeneous data elements in which an individual element is identified by its position in the aggregate, relative to the first element. • A reference to an array element needs one or more subscripts, which require a run-time calculation to determine the memory location being referenced. • Array design issues: – What types are legal for subscripts? – Are subscripting expressions in element references range checked? – When are subscript ranges bound? – When does allocation take place? – What is the maximum number of subscripts? – Can array objects be initialized? – Is any kind of slice allowed?
  • 25. Array Indexing • Specific elements are determined by means of a two-level mechanism: the first part is the aggregate name, the second part is a dynamic selector consisting of one or more items known as subscripts or indexes. • If all subscripts in a reference are constants – the selector is static, otherwise it is dynamic. • Indexing (or subscripting) is a mapping from indices to elements array_name(index_value_list)→ an element • Index Syntax – FORTRAN, PL/I, Ada use parentheses • Ada explicitly uses parentheses to show uniformity between array references and function calls because both are mappings – Most other languages use brackets – In Ls that provide multidimensional arrays as array of arrays, each subscript appears in its own bracket. .ITCS332 by Dr. Abdel Fattah Salman 6-25
  • 26. Arrays Index (Subscript) Types • A problem with using parentheses to enclose subscripts and subprogram parameters. Context information is used to solve it. • Array element references map subscripts to specific array element • Function calls map parameters to functional values. • List(59) may be a reference to array element or a call to a function named list. • FORTRAN, C: integer only • Pascal: any ordinal type (integer, Boolean, char, enumeration) • Ada: integer or enumeration (includes Boolean and char) • Java: integer types only • C, C++, Perl, and Fortran do not specify range checking • Java, ML, C# specify range checking .ITCS332 by Dr. Abdel Fattah Salman 6-26
  • 27. Subscript Binding and Array Categories • There are 5 categories of arrays: category definition is based on range subscript binding and binding to storage • The category name indicate where and when storage is allocated. • A Static array is one in which the subscript ranges are statically bound and storage allocation is static (before run-time) – Advantage: efficiency (no dynamic allocation) • A Fixed stack-dynamic array: is one in which subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution. – Advantage: space efficiency; A large array in one subprogram can use the same space in another subprogram (The 2 subprograms are not active at the same time). .ITCS332 by Dr. Abdel Fattah Salman 6-27
  • 28. Subscript Binding and Array Categories (continued) • A Stack-dynamic array is one in which subscript ranges are dynamically bound and the storage allocation is dynamic (done at run-time). Once the subscript ranges are bound and storage is allocated, they remain fixed during the lifetime of the var. – Advantage: flexibility (the size of an array need not be known until the array is to be used). • A Fixed heap-dynamic array: similar to fixed stack-dynamic: storage binding is dynamic but fixed after allocation (i.e., binding is done when requested and storage is allocated from heap, not stack). • A Heap-dynamic array is one in which binding of subscript ranges and storage allocation is dynamic and can change any number of times during the array's lifetime. – Advantage: flexibility (arrays can grow or shrink during program execution). .ITCS332 by Dr. Abdel Fattah Salman 6-28
  • 29. Subscript Binding and Array Categories (continued) • C and C++ arrays that include the static modifier are static. • C and C++ arrays without the static modifier are fixed stack-dynamic. • C and C++ provide fixed heap-dynamic arrays (using malloc and free in C, and the operators new and delete in C++). • Ada arrays can be stack-dynamic. • In Java, all arrays are fixed heap-dynamic arrays. • C# provides fixed heap-dynamic arrays and includes a second array class, ArrayList, that provides heap-dynamic arrays: objects of this class are created without any elements, and elements are added with the Add method. • Perl and JavaScript support heap-dynamic arrays, which can grow and shrink. In Perl, an array of 5 elements is created with @list = (1,2,3,5,7); • It can be lengthened with the push function, as in push(@list, 11, 19); • The array can be emptied with @list = ();
  • 30. Array Initialization • Some languages allow initialization at the time of storage allocation – C, C++, Java, C# example: int list [] = {4, 5, 7, 83}; – Character strings in C and C++: char name [] = "freddie"; – Arrays of strings in C and C++: char *names [] = {"Bob", "Jake", "Joe"}; – Java initialization of String objects: String[] names = {"Bob", "Jake", "Joe"}; – Ada aggregate values: List : array (1..5) of Integer := (1, 3, 5, 7, 9); initializes all elements. Bunch : array (1..5) of Integer := (1 => 17, 3 => 35, others => 0); – In Bunch, the first and third elements are initialized using direct assignment and the others clause initializes the remaining elements.
  • 31. Arrays Operations • An array operation is one that operates on an array as a unit. • Ada allows array assignment and also concatenation (&). Concatenation is defined between 2 single-dimensional arrays and between a single-dimensional array and a scalar. • Fortran provides elemental operations because they are between pairs of array elements – For example, + operator between two arrays results in an array of the sums of the element pairs of the two arrays – Library functions for matrix multiplication, transpose, dot product,… • APL provides the most powerful array processing operations for vectors and matrixes as well as unary operators (for example, to reverse column elements). See examples of APL array operations on page 272. .ITCS332 by Dr. Abdel Fattah Salman 6-31
  • 32. Rectangular and Jagged Arrays • A rectangular array is a multi-dimensioned array in which all of the rows have the same number of elements and all of the columns have the same number of elements. All subscripts are placed in a single pair of brackets. • A jagged array has rows (or columns) with varying numbers of elements. A separate pair of brackets is used for each dimension, as in a[6][5]. – Possible when multi-dimensioned arrays actually appear as arrays of arrays.
  • 33. Slices • A slice is some substructure of an array; nothing more than a referencing mechanism. • Slices are only useful in languages that have array operations. • Slice examples: – Fortran 95 Integer, Dimension (10) :: Vector Integer, Dimension (3, 3) :: Mat Integer, Dimension (3, 3, 4) :: Cube Vector (3:6) is a four-element array.
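As a rough analogue of the Fortran 95 slice `Vector (3:6)`, the sketch below uses Python list slicing. Two assumptions differ from Fortran and are worth noting: Python subscripts start at 0 and exclude the upper bound, and a list slice is a copy rather than a pure referencing mechanism. The helper `fortran_slice` is a hypothetical name used only to translate between the two conventions.

```python
def fortran_slice(seq, first, last):
    """Elements first..last, 1-based and inclusive, like Fortran's seq(first:last)."""
    return seq[first - 1:last]


# Mirrors Integer, Dimension (10) :: Vector, holding the values 1..10.
vector = list(range(1, 11))
four_elements = fortran_slice(vector, 3, 6)

# Mirrors Integer, Dimension (3, 3) :: Mat as a list of rows.
mat = [[11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]]

# The second column of the matrix, comparable to Mat(:, 2) in Fortran.
second_column = [row[1] for row in mat]
```

`four_elements` holds the third through sixth elements of the vector, and `second_column` picks one element from every row.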
  • 34. Slices Examples in Fortran 95 .ITCS332 by Dr. Abdel Fattah Salman 6-34
  • 35. Implementation of Arrays • Implementing arrays requires more compile-time effort than implementing simple types (such as int). • The code to access an array element must be generated at compile time. • The access function maps subscript expressions to an address in the array. • Access function for single-dimensioned arrays: address(list[k]) = address(list[lower_bound]) + ((k - lower_bound) * element_size) • The compile-time descriptor for single-dimensioned arrays includes the information needed to construct the access function. • If all attributes are static and index range checking is not done at run time, no run-time descriptor is needed.
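The single-dimensioned access function above can be written out as a small calculation. This is a sketch with hypothetical parameter names; addresses are modelled as plain integers, and the second function shows the common rearrangement in which the part that does not depend on k is computed once.

```python
def element_address(base_address, lower_bound, element_size, k):
    """address(list[k]) = address(list[lower_bound]) + (k - lower_bound) * element_size"""
    return base_address + (k - lower_bound) * element_size


def element_address_split(base_address, lower_bound, element_size, k):
    """Same result, with the constant part separated from the variable part."""
    constant_part = base_address - lower_bound * element_size  # computable once
    return constant_part + k * element_size                    # per-access work
```

For an array whose first element (subscript 1) is at address 1000 with 4-byte elements, element 3 is at address 1008 under both forms.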
  • 36. Accessing Multi-dimensioned Arrays • Multidimensional arrays are more complex to implement than single- dimensioned arrays . • Memory is linear – a simple sequence of bytes. • Values of data types that have 2 or more dimensions must be mapped onto the single-dimensioned memory. • Two common ways to store a multidimensional array : – Row major order (by rows) – used in most languages – Column major order (by columns) – used in Fortran • Sequential access to matrix elements will be faster if they are accessed in the order in which they are stored – minimizing paging. • The access function for a multidimensional array is the mapping of its base address and a set of index values to the address in memory of the element specified by the index values. • The access function for a 2-dimensional array stored in row-major order is shown below: .ITCS332 by Dr. Abdel Fattah Salman 6-36
  • 37. Row / Column major ordering
      Example matrix (3 rows, 5 columns):
        11 12 13 14 15
        21 22 23 24 25
        31 32 33 34 35
      Row major order (second subscript increases faster): 11 12 13 14 15 21 22 23 24 25 31 32 33 34 35
      Column major order (first subscript increases faster): 11 21 31 12 22 32 13 23 33 14 24 34 15 25 35
  • 38. Locating an Element in a Multi-dimensioned Array • The address of an element is the base address of the array plus the element size times the number of elements preceding it in the array. With n elements per row, row-major storage, and lower bounds of 1:
      Loc(a[i, j]) = address of a[1,1] + (number of elements preceding a[i, j]) * el_size
                   = address of a[1,1] + ((number of rows above the ith row * row_size) + number of elements left of the jth column) * el_size
                   = address of a[1,1] + ((i-1)*n + (j-1)) * el_size
                   = address of a[1,1] + (i*n - n + j - 1) * el_size
                   = address of a[1,1] + ((i*n + j) - (n+1)) * el_size
                   = address of a[1,1] + (i*n + j) * el_size - (n+1) * el_size
                   = address of a[1,1] - (n+1) * el_size + (i*n + j) * el_size
  • 39. Locating an Element in a Multi-dimensioned Array • The address of an element is the base address of the array plus the element size times the number of elements preceding it in the array. Location(a[i,j]) = address of a[1,1] + ((number of rows above the ith row * row_size) + number of elements left of the jth column) * element_size • General format: Location(a[i,j]) = address of a[row_lb, col_lb] - (((row_lb * n) + col_lb) * element_size) + (((i * n) + j) * element_size) • The first two terms are the constant part and the last is the variable part. • For each dimension of an array, ONE add and ONE multiply instruction are required for the access function.
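The general row-major formula can be checked numerically. The sketch below uses hypothetical names and assumes n is the number of elements per row, and that addresses are plain integers. `location` follows the general format term by term, and `location_direct` counts the preceding elements directly, so the two can be compared.

```python
def location(base, row_lb, col_lb, n, element_size, i, j):
    """General format: constant part plus variable part (row-major order)."""
    constant_part = base - ((row_lb * n) + col_lb) * element_size
    variable_part = ((i * n) + j) * element_size
    return constant_part + variable_part


def location_direct(base, row_lb, col_lb, n, element_size, i, j):
    """Base address plus element size times the number of preceding elements."""
    preceding = (i - row_lb) * n + (j - col_lb)
    return base + preceding * element_size
```

For a matrix with 5 elements per row, lower bounds of 1, base address 1000, and 4-byte elements, a[2,3] has 7 elements before it, so both functions give 1028.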
  • 40. Compile-Time Descriptors Single-dimensioned array Multi-dimensional array .ITCS332 by Dr. Abdel Fattah Salman 6-40
  • 41. Associative Arrays • An associative array is an unordered collection of data elements that are indexed by an equal number of values called keys – User-defined keys must be stored in the structure. – In non-associative arrays, the indices never need to be stored. – Each element of an associative array is a pair of entities: a key and a value. • Design issue: what is the form of references to elements? • In Perl, associative arrays are called hashes, because their elements are stored and retrieved with hash functions. Every hash variable name must begin with %, and scalar variable names begin with $. To reference a single element, the key value is placed in braces and the hash name is replaced by a scalar variable name that is the same except for the first character.
  • 42. Associative Arrays in Perl • Names begin with %; literals are delimited by parentheses: %hi_temps = ("Mon" => 77, "Tue" => 79, "Wed" => 65, …); • Subscripting is done using braces and keys: $hi_temps{"Wed"} = 83; – Elements can be removed with delete: delete $hi_temps{"Tue"}; – The entire hash can be emptied by assigning an empty literal to it: %hi_temps = ();
  • 43. Perl’s Associative Arrays • Perl has a primitive data type for hash tables, also known as “associative arrays”. • Elements are indexed not by consecutive integers but by arbitrary keys. • %ages refers to an associative array and @people to a regular array. • Note the use of { } for associative arrays and [ ] for regular arrays:
      %ages = ("Bill Clinton" => 53, "Hillary" => 51, "Socks" => "27 in cat years");
      $ages{"Hillary"} = 52;
      @people = ("Bill Clinton", "Hillary", "Socks");
      $ages{"Bill Clinton"};   # returns 53
      $people[1];              # returns "Hillary"
    • keys(X), values(X), and each(X):
      foreach $person (keys(%ages)) {print "I know the age of $person\n";}
      foreach $age (values(%ages)) {print "Somebody is $age\n";}
      while (($person, $age) = each(%ages)) {print "$person is $age\n";}
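For comparison, the same operations can be sketched with Python's built-in `dict`, which plays the role of a Perl hash. The data mirrors the slide. One difference to keep in mind is that Python dictionaries preserve insertion order, whereas the slides describe associative arrays as unordered.

```python
# Like %ages = (...); keys are arbitrary strings rather than consecutive integers.
ages = {"Bill Clinton": 53, "Hillary": 51, "Socks": "27 in cat years"}
ages["Hillary"] = 52                      # like $ages{"Hillary"} = 52;

# Like @people = (...); a regular array indexed by position.
people = ["Bill Clinton", "Hillary", "Socks"]

bill_age = ages["Bill Clinton"]           # lookup by key
second_person = people[1]                 # lookup by position

# Counterparts of keys(), values() and each().
known = [f"I know the age of {person}" for person in ages.keys()]
somebody = [f"Somebody is {age}" for age in ages.values()]
pairs = [f"{person} is {age}" for person, age in ages.items()]

# Removing one element and emptying the whole structure.
hi_temps = {"Mon": 77, "Tue": 79, "Wed": 65}
del hi_temps["Tue"]                       # like delete $hi_temps{"Tue"};
remaining_days = sorted(hi_temps)
hi_temps.clear()                          # like %hi_temps = ();
```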
  • 44. Record Types • A record is a possibly heterogeneous aggregate of data elements in which the individual elements are identified by names • Design issues: – What is the syntactic form of references to the field? – Are elliptical references allowed? .ITCS332 by Dr. Abdel Fattah Salman 6-44
  • 45. Definition of Records • COBOL uses level numbers to show nested records; other languages use recursive definition. 01 EMP-REC. 02 EMP-NAME. 05 FIRST PIC X(20). 05 MID PIC X(10). 05 LAST PIC X(20). 02 HOURLY-RATE PIC 99V99. • Record field references – COBOL: field_name OF record_name_1 OF ... OF record_name_n – Others (dot notation): record_name_1.record_name_2. ... record_name_n.field_name
  • 46. Definition of Records in Ada • Record structures are indicated in an orthogonal way type Emp_Rec_Type is record First: String (1..20); Mid: String (1..10); Last: String (1..20); Hourly_Rate: Float; end record; Emp_Rec: Emp_Rec_Type; .ITCS332 by Dr. Abdel Fattah Salman 6-46
  • 47. References to Records • Most languages use dot notation: Emp_Rec.Name • Fully qualified references must include all record names. • Elliptical references allow leaving out record names as long as the reference is unambiguous; for example, in COBOL, FIRST, FIRST OF EMP-NAME, and FIRST OF EMP-REC are elliptical references to the employee’s first name. Operations on Records • Assignment is very common if the types are identical. • Ada allows record comparison. • Ada records can be initialized with aggregate literals. • COBOL provides MOVE CORRESPONDING – Copies a field of the source record to the corresponding field in the target record.
  • 48. Evaluation and Comparison to Arrays • Straight forward and safe design • Records are used when collection of data values is heterogeneous • Access to array elements is much slower than access to record fields, because subscripts are dynamic (field names are static) • Dynamic subscripts could be used with record field access, but it would disallow type checking and it would be much slower .ITCS332 by Dr. Abdel Fattah Salman 6-48
  • 49. Implementation of Record Type Offset address relative to the beginning of the records is associated with each field .ITCS332 by Dr. Abdel Fattah Salman 6-49
  • 50. Union Types • A union is a type whose variables are allowed to store values of different types at different times during execution. • Design issues – Should type checking be required? – Should unions be embedded in records? Discriminated vs. Free Unions • Fortran, C, and C++ provide union constructs in which there is no language support for type checking; the union in these languages is called a free union. • Type checking of unions requires that each union include a type indicator called a discriminant – Supported by Ada.
  • 51. Ada Union Types type Shape is (Circle, Triangle, Rectangle); type Colors is (Red, Green, Blue); type Figure (Form: Shape) is record Filled: Boolean; Color: Colors; case Form is when Circle => Diameter: Float; when Triangle => Leftside, Rightside: Integer; Angle: Float; when Rectangle => Side1, Side2: Integer; end case; end record; .ITCS332 by Dr. Abdel Fattah Salman 6-51
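The role of the discriminant can be sketched outside Ada as well. The Python class below is a loose, hypothetical analogue of the Figure record: the `form` tag decides which variant fields exist, and every access is checked against it. Ada performs this check through its type system, whereas this sketch does it by hand at run time. All names here (`VARIANT_FIELDS`, `field`, `safe_field`) are assumptions made for illustration.

```python
from enum import Enum


class Shape(Enum):
    CIRCLE = "circle"
    TRIANGLE = "triangle"
    RECTANGLE = "rectangle"


# Which variant fields are legal for each value of the discriminant.
VARIANT_FIELDS = {
    Shape.CIRCLE: {"diameter"},
    Shape.TRIANGLE: {"leftside", "rightside", "angle"},
    Shape.RECTANGLE: {"side1", "side2"},
}


class Figure:
    def __init__(self, form, filled, color, **variant):
        expected = VARIANT_FIELDS[form]
        if set(variant) != expected:
            raise TypeError(f"{form.name} requires fields {sorted(expected)}")
        self.form = form          # the discriminant
        self.filled = filled
        self.color = color
        self._variant = dict(variant)

    def field(self, name):
        """Return a variant field, checking it against the discriminant."""
        if name not in VARIANT_FIELDS[self.form]:
            raise AttributeError(f"{name!r} is not a field of a {self.form.name}")
        return self._variant[name]


def safe_field(figure, name):
    """Return the field value, or None if the discriminant forbids the access."""
    try:
        return figure.field(name)
    except AttributeError:
        return None


circle = Figure(Shape.CIRCLE, True, "Red", diameter=2.5)
```

Reading `diameter` from `circle` succeeds, while asking for `side1` is refused because the discriminant says the figure is a circle. A free union, by contrast, would simply reinterpret the stored value.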
  • 52. Ada Union Type Illustrated A discriminated union of three shape variables Evaluation of Unions • Potentially unsafe construct: Do not allow type checking • Java and C# do not support unions: Reflective of growing concerns for safety in programming language .ITCS332 by Dr. Abdel Fattah Salman 6-52
  • 53. Pointer and Reference Types • A pointer type variable has a range of values consisting of memory addresses and a special value, NULL (nil) – Provide the power of indirect addressing – Provide a way to manage dynamic memory – A pointer can be used to access a location in the area where storage is dynamically created (usually called a heap) • Design Issues of Pointers – What are the scope of and lifetime of a pointer variable? – What is the lifetime of a heap-dynamic variable? – Are pointers restricted as to the type of value to which they can point? – Are pointers used for dynamic storage management, indirect addressing, or both? – Should the language support pointer types, reference types, or both? .ITCS332 by Dr. Abdel Fattah Salman 6-53
  • 54. Pointer Operations • Two fundamental operations: assignment and dereferencing • Assignment is used to set a pointer variable’s value to some useful address • Dereferencing yields the value stored at the location represented by the pointer’s value – Dereferencing can be explicit or implicit – C++ uses an explicit operation via * j = *ptr sets j to the value located at ptr .ITCS332 by Dr. Abdel Fattah Salman 6-54
  • 55. Pointer Assignment Illustrated The assignment operation j = *ptr .ITCS332 by Dr. Abdel Fattah Salman 6-55
  • 56. Problems with Pointers • Dangling pointers (dangerous) – A dangling pointer is a pointer that still points to a heap-dynamic variable after that variable has been deallocated (deleted). – Creating one: • Allocate a heap-dynamic variable and set a pointer to point at it. • Set a second pointer to the value of the first pointer. • Deallocate the heap-dynamic variable, using the first pointer. – Example:
      int *myPtr, *urPtr;
      myPtr = new int(10);
      cout << "The value of myPtr is " << *myPtr << endl;
      urPtr = myPtr;
      delete myPtr;   // urPtr is now a "dangling pointer"
      *myPtr = 5;     // error: the storage has been deallocated
      cout << "The value of myPtr is " << *myPtr << endl;
    • It is an error to dereference a pointer after deleting any of its aliases; doing so uses a dangling pointer.
  • 57. Problems with Pointers • Lost heap-dynamic variable – An allocated heap-dynamic variable that is no longer accessible to the user program (often called garbage) – Creating one: • Pointer p1 is set to point to a newly created heap-dynamic variable • Pointer p1 is later set to point to another newly created heap- dynamic variable. This causes losing the first heap-dynamic variable, i.e. that variable cannot be accessed or deallocated. • Example: void *p1,*p2; p1 = new int(10); p1=new float (7.4); //The int var(=10) is lost • The process of losing heap-dynamic variables is called memory leakage .ITCS332 by Dr. Abdel Fattah Salman 6-57
  • 58. Pointers in Ada • Some dangling pointers are disallowed because dynamic objects can be automatically deallocated at the end of the pointer type's scope • All pointers are initialized to null • The lost heap-dynamic variable problem is not eliminated by Ada
  • 59. Pointers in C and C++ • Extremely flexible but must be used with care • Pointers can point at any variable regardless of when it was allocated • Used for dynamic storage management and addressing • Pointer arithmetic is possible • Explicit dereferencing and address-of operators • Domain type need not be fixed (void *) • Pointer arithmetic example:
      float stuff[100];
      float *p;
      p = stuff;
      *(p+5) is equivalent to stuff[5] and p[5]
      *(p+i) is equivalent to stuff[i] and p[i]
  • void * can point to any type and can be type checked (cannot be dereferenced)
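The equivalence between `*(p+i)` and `p[i]`, and the rule that a `void *` must be cast before it is dereferenced, can be sketched as follows. The function names `sum_with_pointer` and `read_through_void` are assumptions made for this example and do not come from the slides.

```cpp
#include <cassert>
#include <cstddef>

// Sums n floats using pointer arithmetic; *(p + i) names the same element as p[i].
float sum_with_pointer(const float* p, std::size_t n) {
    assert(p != nullptr);
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        total += *(p + i);
    }
    return total;
}

// A void* can hold the address of any object, but it has no domain type,
// so it has to be cast back to a typed pointer before it can be dereferenced.
float read_through_void(void* raw) {
    assert(raw != nullptr);
    return *static_cast<float*>(raw);
}
```

For an array such as `float stuff[4] = {1, 2, 3, 4};`, the call `sum_with_pointer(stuff, 4)` works because the array name converts to a pointer to its first element, exactly as in `p = stuff` on the slide.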
  • 60. Pointers in Fortran 95 • Pointers point to heap and non-heap variables • Implicit dereferencing • Pointers can only point to variables that have the TARGET attribute • A pointer is declared with the POINTER attribute:
      REAL, POINTER :: ptr
  • The TARGET attribute is assigned in the declaration, e.g.
      INTEGER, TARGET :: NODE
  • Pointer assignment uses =>, where target is either a pointer or a non-pointer with the TARGET attribute:
      ptr => target
  • 61. Reference Types • C++ includes a special kind of pointer type called a reference type that is used primarily for formal parameters – Advantages of both pass-by-reference and pass-by-value • Java extends C++’s reference variables and allows them to replace pointers entirely – References refer to class instances • C# includes both the references of Java and the pointers of C++
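A short C++ sketch of reference types as formal parameters is given below. All function names are assumptions made for the example. The `const` reference case is one reading of the slide's "advantages of both": the argument is not copied, as in pass-by-reference, yet the callee cannot change it, as in pass-by-value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Reference parameters: a and b are aliases for the caller's variables.
void swap_by_reference(int& a, int& b) {
    int tmp = a;
    a = b;
    b = tmp;
}

// Value parameters: a and b are copies, so the caller sees no change.
void swap_by_value(int a, int b) {
    int tmp = a;
    a = b;
    b = tmp;
}

// const reference: no copy of the string is made, and it cannot be modified here.
std::size_t length_of(const std::string& text) {
    return text.size();
}

// Hypothetical demo: true if only the reference version affects the caller.
bool reference_demo() {
    int x = 1, y = 2;
    swap_by_value(x, y);
    assert(x == 1 && y == 2);   // unchanged: the function swapped its own copies
    swap_by_reference(x, y);
    return x == 2 && y == 1;    // changed: the function swapped x and y themselves
}
```

Note that no explicit dereferencing appears in `swap_by_reference`; a reference is implicitly dereferenced wherever it is used.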
  • 62. Evaluation of Pointers • Dangling pointers and dangling objects are problems, as is heap management • Pointers are like gotos: they widen the range of cells that can be accessed by a variable • Pointers or references are necessary for dynamic data structures, so we can't design a language without them • Representations of Pointers – Large computers use single values – Intel microprocessors use segment and offset
  • 63. Solving Dangling Pointer Problem • Tombstone: extra heap cell that is a pointer to the heap-dynamic variable – The actual pointer variable points only at tombstones – When a heap-dynamic variable is deallocated, the tombstone remains but is set to nil – Costly in time and space • Locks-and-keys: pointer values are represented as (key, address) pairs – Heap-dynamic variables are represented as the variable plus a cell for an integer lock value – When a heap-dynamic variable is allocated, a lock value is created and placed in both the lock cell and the key cell of the pointer – Every dereference compares the pointer's key with the lock; deallocation resets the lock to an illegal value, so the key of a dangling pointer no longer matches
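The locks-and-keys scheme can be sketched in C++ as below. This is a minimal illustration under stated assumptions, not how any particular language implements it: the names `LockedCell`, `KeyedPtr`, `allocate`, `deallocate` and `checked_read` are invented for the example, storage comes from a small fixed pool that is never reused (so the lock cell can still be inspected after deallocation), and 0 is assumed to be the illegal lock value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct LockedCell {            // heap-dynamic variable plus its lock cell
    std::uint32_t lock = 0;    // 0 is the illegal lock value in this sketch
    int value = 0;
};

struct KeyedPtr {              // pointer value represented as a (key, address) pair
    std::uint32_t key;
    LockedCell* addr;
};

constexpr std::size_t kPoolSize = 8;
LockedCell pool[kPoolSize];    // stand-in for the heap
std::size_t next_free = 0;
std::uint32_t next_lock = 1;

// Allocation creates a lock value and copies it into the pointer's key.
KeyedPtr allocate(int v) {
    assert(next_free < kPoolSize);
    LockedCell* cell = &pool[next_free++];
    cell->lock = next_lock++;
    cell->value = v;
    return KeyedPtr{cell->lock, cell};
}

// Deallocation clears the lock, which invalidates every copy of the pointer.
void deallocate(KeyedPtr p) {
    if (p.key == p.addr->lock) p.addr->lock = 0;
}

// Every access compares key and lock; a mismatch means a dangling pointer.
bool checked_read(KeyedPtr p, int& out) {
    if (p.key != p.addr->lock) return false;
    out = p.addr->value;
    return true;
}
```

If `b` is a copy of a pointer `a` returned by `allocate(10)`, then `checked_read(b, v)` succeeds before `deallocate(a)` and fails afterwards, which is the dangling-pointer case from the earlier slide being caught at run time.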
  • 64. Heap Management • Memory management: identify unused, dynamically allocated memory cells and return them to the heap • Approaches – Manual: explicit allocation and deallocation (C, C++) – Automatic: • Reference counters (Modula-2, Adobe Photoshop) • Garbage collection (Lisp, Java) • Problems with the manual approach: – Requires programmer effort – Programmer errors lead to space leaks and dangling references/sharing – Proper explicit memory management is difficult and has been estimated to account for up to 40% of development time • Heap management is a very complex run-time process • Single-size cells vs. variable-size cells • Two approaches to reclaiming garbage – Reference counters (eager approach): reclamation is gradual – Garbage collection (lazy approach): reclamation occurs when the list of available space becomes empty
  • 65. Reference Counter • Idea: keep track of how many references there are to a cell in memory. If this number drops to 0, the cell is garbage. • Reference counters: maintain a counter in every cell that stores the number of pointers currently pointing at the cell • Store garbage in a free list; allocate from this list • Advantages – Resources can be freed directly – Immediate reuse of memory is possible • Disadvantages – Can’t handle cyclic data structures (cells connected circularly) – Bad locality properties – Large execution-time overhead for pointer manipulation – Extra space required for the counter in every cell
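Reference counting and its problem with cycles can be observed with the C++ standard library's `std::shared_ptr`, which is itself a reference-counted pointer. This swaps in a library facility for the per-cell counter described on the slide. The `Node` type and both function names are assumptions made for the example.

```cpp
#include <cassert>
#include <memory>

struct Node {
    std::shared_ptr<Node> next;   // a counted reference to another cell
};

// True if a lone cell is reclaimed as soon as its last reference disappears.
bool lone_cell_is_reclaimed() {
    std::weak_ptr<Node> watcher;          // observes the cell without counting
    {
        auto a = std::make_shared<Node>();
        auto b = a;                       // a second reference: count is 2
        watcher = a;
        assert(a.use_count() == 2);
    }                                     // a and b go away: count drops to 0
    return watcher.expired();
}

// True if two cells in a cycle survive after all outside references are gone.
bool cycle_survives() {
    std::weak_ptr<Node> watcher;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;                      // cycle: each cell keeps the other's count at 1
        watcher = a;
    }
    bool still_alive = !watcher.expired();   // counts never reached 0
    if (auto a = watcher.lock()) {
        a->next.reset();                  // break the cycle so this demo frees its memory
    }
    return still_alive;
}
```

Both functions return true. The second one shows why circularly connected cells are listed as a disadvantage: the cells are unreachable from the program, yet their counters stay above zero until the cycle is broken by hand.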
  • 66. Garbage Collection • GC is a process by which dynamically allocated storage is reclaimed during the execution of a program. • Usually refers to automatic periodic storage reclamation by the garbage collector (part of the run-time system), as opposed to explicit code to free specific blocks of memory. • Usually triggered during memory allocation when available free memory falls below a threshold. Normal execution is suspended and GC is run. • The run-time system allocates storage cells as requested and disconnects pointers from cells as necessary; garbage collection then begins – Every heap cell has an extra bit used by the collection algorithm – All cells are initially set to garbage – All pointers are traced into the heap, and reachable cells are marked as not garbage – All garbage cells are returned to the list of available cells – Disadvantage: when you need it most, it works worst (it takes the most time when the program needs most of the cells in the heap) • Major GC algorithms: – Mark and sweep – Copying – Incremental garbage collection algorithms
  • 67. Marking Algorithm • [Figure: marking algorithm]
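The marking steps listed on the Garbage Collection slide (set every cell to garbage, trace pointers from the program into the heap, return unmarked cells to the free list) can be sketched as follows. This is a simplified model under assumptions made for the example: the heap is a vector of cells, pointers are indices into it, and the names `Cell`, `mark` and `collect` are invented here. It is not the specific algorithm shown in the original figure.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Cell {
    bool marked = false;             // the extra bit used by the collector
    std::vector<std::size_t> refs;   // indices of the cells this cell points to
};

// Marks cell i and everything reachable from it as not garbage.
void mark(std::vector<Cell>& heap, std::size_t i) {
    assert(i < heap.size());
    if (heap[i].marked) return;      // already visited; also stops on cycles
    heap[i].marked = true;
    for (std::size_t j : heap[i].refs) {
        mark(heap, j);
    }
}

// Returns the indices of all garbage cells, i.e. the new list of available cells.
std::vector<std::size_t> collect(std::vector<Cell>& heap,
                                 const std::vector<std::size_t>& roots) {
    for (Cell& c : heap) c.marked = false;       // all cells initially set to garbage
    for (std::size_t r : roots) mark(heap, r);   // trace all pointers into the heap
    std::vector<std::size_t> free_list;
    for (std::size_t i = 0; i < heap.size(); ++i) {
        if (!heap[i].marked) free_list.push_back(i);   // sweep unmarked cells
    }
    return free_list;
}
```

With five cells where 0 points to 1, 1 points to 2, and cells 3 and 4 point only to each other, calling `collect` with root 0 returns cells 3 and 4. Unlike reference counting, tracing from the roots reclaims the unreachable cycle. The recursive `mark` is the simplest form to read; it uses stack space proportional to the depth of the structure being traced.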
  • 68. Variable-Size Cells • All the difficulties of single-size cells plus more • Required by most programming languages • If garbage collection is used, additional problems occur – The initial setting of the indicators of all cells in the heap is difficult – The marking process is nontrivial – Maintaining the list of available space is another source of overhead
  • 69. Summary • The data types of a language are a large part of what determines that language’s style and usefulness • The primitive data types of most imperative languages include numeric, character, and Boolean types • The user-defined enumeration and subrange types are convenient and add to the readability and reliability of programs • Arrays and records are included in most languages • Pointers are used for addressing flexibility and to control dynamic storage management