Basic Semantics: Symbols, Scopes, ... ------------------------------------------ Semantics: describes the meaning of programs -- operational Vs denotational -- formal Vs informal Semantics is typically defined in a bottom-up fashion: -- values -- names -- variables -- constants -- expressions -- statements -- compound statements -- procedures -- program 5.1 Attributes, Binding, and Semantic Functions Meanings of names is captured via attributes associated with the names: -- type -- value -- location Binding: establishing an association between name and an attribute -- binding time -- static -- language definition time -- lang impl time -- compile-time -- link time -- load time -- dynamic Eg: type is statically bound in most langs value is dynamically bound location may be dynamically or statically bound Bindings are held in -- symbol table: name --> static attributes -- environment: name --> locations -- memory: locations --> value Declarations Blocks and Scopes ------------------------------- Scope: region of program over which a declaration is in effect ie bindings established by decl must be maintained global package or module file class procedure block Block-structured, lexically-scoped language Visibility: redefinitions in inner scopes supercede outer definitions qualifiers may be needed to make otherwise invisible names to be visible in a scope Eg: local var superceding global var names in other packages private members in classes Symbol Table --------------- Uses data structures that allow efficient name lookup operations in the presence of scope changes. We can use -- hash tables to lookup attributes for each name -- a scope stack that keeps track of the current scope (in which a symbol to be looked up appears) and its surrounding scopes. The top most element in the scope stack corresponds to the current (ie innermost) scope, while the bottommost element will correspond to the outermost (ie global) scope. Support for scopes: lexical scopes can be supported using a scope stack as follows: a) symbols in a program reside in multiple hash tables: in particular, symbols within each scope are contained in a single hash table for that scope b) at any point in the program, the scope stack keeps track of all the scopes surrounding that program point, as described above. The elements of the stack contain pointers to the corresponding hash table. c) to lookup a name, we start from the hash table pointed to by the top element of the stack. If the symbol is not found, we try hash table pointed by the next lower entry in the stack. This process is repeated until we find the name, or we reach the bottom of the stack (this means that the name has never been declared.) d) scope entry and exit operations modify the scope stack appropriately. Specifically, when a new scope is entered, a corresponding hash table is created. A pointer to this hash table is pushed onto the scope stack. When we exit a scope, the top of the stack is popped off. Example: <------------------ (1) float y = 1.0 <------------------ (2) void f(int x) { <------------------ (3) for (int x = 0; ...) { <------------------ (4) { <------------------ (5) int y = 1; <------------------ (6) } <------------------ (7) { <------------------ (8) float x = 1.0; <------------------ (9) } <------------------ (10) } <------------------ (11) } <------------------ (12) main() { <------------------ (13) float y = 10.0; <------------------ (14) f(1); } <------------------ (15) Illustration: At (1), we have a single hash table, which is the global hash table. The scope stack contains exactly one entry, which points to this global hash table. When the compiler moves from (1) to (2), the name y is added to the hash table for the current scope. Since the top of scope stack points to the global table, this means that "y" is being added to the global table. When compiler moves from (2) to (3), the name "f" is added to the global table, a new hash table for f's scope is created. A pointer to f's table is pushed on the scope stack. Then the parameter "x" added to hashtable for the current scope (ie f) (Students should work out the rest of the moves.) Static or lexical scoping: associations are determined at compile time, using a sequential processing of program. Dynamic scoping: associations are determined at runtime --- processing of program statements follows the execution order of different statements. Example: if we added a new function "g" to the above program as follows: void g() { int y ; f() ; } Consider references to the name "y" at (3). With static scoping, this name always refers to the global variable y defined between (1) and (2). With dynamic scoping, the search for non-local names follows the execution nesting, rather than static nesting. This means that if "f" is called from main, then at (3), "y" will refer to the float variable declared in main. If "f" is invoked from within g, the same name will refer to the integer variable "y" defined in g. Thus, the same name may refer to different entities, possibly of different types, in a dynamically scoped language. Since the type associated with the name "y" at (3) can differ depending upon the point of call, we cannot statically determine the type of y. This suggests that dynamic scoping does not fit well with static typing. Since static typing has now been accepted to be the right approach, almost all current languages (C/C++/Java/SML/LISP) use static scoping