Search All of the Math Forum:
Views expressed in these public forums are not endorsed by
Drexel University or The Math Forum.
|
|
|
|
Motivated by the challenges of languages and their beautiful elaboration.
Posted:
Nov 24, 2009 4:52 PM
|
|
iuuq:// (TM) script: draft - sci.logic | Google Groups Discovery services. BSD sockets. TCP/IP BPEL JINI Web services. XML RPC Group communication DHTs. Multicast .... The design of Http://MeAmI.org.org has been motivated by the challenges as we face languages and their beautiful elaboration. ... http://groups.google.com/group/sci.logic/msg/edcea6be81b182f8 MEDIATED EXECUTION BY SAMEER SUNDRESH + MARTIN MUSATOV B.S., University of Illinois at Urbana-Champaign, 2001 DISSERTATION Submitted in partial fulllment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate College of the University of Illinois at Urbana-Champaign, 2009 Urbana, Illinois Doctoral Committee: Professor Gul Agha, Chair Associate Professor Sam Kamin Associate Professor Grigore Rosu Carolyn Talcott, SRI International Table of Contents Chapter1 Introduction .................................. 1 1.1 Whycustomizealanguage? ................................ 1 1.1.1 Creatingdomain-speciclanguages ........................ 2 1.1.2 Reusingcodeinanewcontext .......................... 3 1.1.3 Full-systemsimulations .............................. 3 1.1.4 Enhancingsecuritythroughsandboxing . . . . . . . . . . . . . . . . . . . . . 4 1.2 Inwhatwaysmightalanguagebecustomized? . . . . . . . . . . . . . . . . . . . . . 4 1.2.1 User-denedfunctionsanddatatypes ...................... 5 1.2.2 Syntax........................................ 5 1.2.3 Compile-timemacros................................ 5 1.2.4 Statictypesystem ................................. 6 1.2.5 Primitiveoperations ................................ 6 1.2.6 Controlstructures ................................. 6 1.2.7 Variablescoping .................................. 7 1.2.8 Concurrencysemantics............................... 7 1.3 Thesisoutlineandreader'sguide ............................. 8 1.3.1 Detailedoutline................................... 8 Chapter2 Amessage-orientedexecutionmodel ................... 10 2.1 Message-orientedcomputation............................... 10 2.1.1 Systemarchitecture ................................ 11 2.1.2 Programobjectsandrequestmessages ...................... 12 2.1.3 Termtraversal ................................... 12 2.1.4 Programs as requestmessages .......................... 15 2.2 Messagehandlersandmediation ............................. 17 2.2.1 Catchingandre-throwingrequestmessages . . . . . . . . . . . . . . . . . . . 17 2.2.2 Handlingrequestmessagesbycases ....................... 18 2.2.3 One-shotandmulti-casehandlers......................... 20 2.3 Abaselanguage ...................................... 23 2.3.1 The Request Propagation rules........................ 27 2.3.2 The Base-Level Operations rules....................... 27 2.3.3 Step-by-stepreductionexamples ......................... 28 Chapter3 Sequentiallanguagefeatures ........................ 32 3.1 Simplecontrolstructures ................................. 32 3.1.1 if statements .................................... 32 3.1.2 while loops ..................................... 34 3.2 Lexicalvariables ...................................... 35 3.2.1 Mutablevariables ................................. 37 3.2.2 Avoiding insertion of a handlers term....................... 38 3.3 Lambdaandapply ..................................... 39 3.3.1 The lambda operation ............................... 40 ii 3.3.2 The apply operation ................................ 42 3.4 Abstractdatastructures .................................. 45 Chapter4 Concurrentlanguagefeatures ........................ 48 4.1 Actors............................................ 48 4.1.1 Informalsemantics ................................. 48 4.1.2 Deningactors ................................... 49 4.2 Localsynchronizationconstraints ............................. 51 4.2.1 Example:asingle-elementbuer ......................... 51 4.2.2 Dening local synchronization constraints . . . . . . . . . . . . . . . . . . . . 52
4.3 Meta-actors......................................... 54 4.3.1 Statically-assignedmeta-actors .......................... 54
4.3.2 Dynamically-reassignablemeta-actors . . . . . . . . . . . . . . . . . . . . . . 57
4.3.3 Usingmeta-actorsforcoordination ........................ 58
4.4 Evaluation.......................................... 60
Chapter5 ImplementationinJavascript ........................ 63
5.1 TheJavascriptprogramminglanguage .......................... 63
5.1.1 Variables ...................................... 64
5.1.2 Functions ...................................... 64
5.1.3 Conditionals .................................... 66
5.1.4 undefined, null andcomparison......................... 67
5.1.5 Otheroperators................................... 68
5.1.6 Objects ....................................... 69
5.1.7 Exceptions ..................................... 73
5.2 Generalimplementationstrategy ............................. 74
5.2.1 Interactiveevaluation ............................... 75
5.2.2 The post operation ................................. 75
5.3 ImplementationsofkeyJavascriptconstructs . . . . . . . . . . . . . . . . . . . . . . 77
5.3.1 Primitivevalues .................................. 77
5.3.2 Conditional statements: while loops ....................... 77
5.3.3 Environments.................................... 78
5.3.4 Objects(includingarrays)............................. 80
5.3.5 Functionsandmethods .............................. 82
5.3.6 Exceptions ..................................... 84
5.3.7 User-denedrequesthandlers ........................... 84
5.3.8 Reiedmessagesandcontinuations ........................ 86
5.4 Performanceconsiderations ................................ 87
Chapter6 Applications .................................. 89
6.1 Sandboxedwidgetsonwebpages ............................. 89
6.1.1 Relationshiptotherestofthisthesis. . . . . . . . . . . . . . . . . . . . . . . 89
6.1.2 Widgetarchitecture ................................ 90
6.2 Resourcelimits ....................................... 91
6.2.1 Executiontimelimits ............................... 91
6.2.2 Discussion...................................... 94
6.2.3 Memoryresourcelimits .............................. 94
6.3 Mediatedcommunication ................................. 95
6.3.1 LimitingDOMtreeaccess............................. 95
6.3.2 Limitingcommunicationwithservers. . . . . . . . . . . . . . . . . . . . . . . 97
iii
Chapter7 Relatedwork.................................. 99
7.1 Virtualizationandsandboxing............................... 99
7.1.1 System-levelvirtualization............................. 100
7.1.2 Languagevirtualmachines ............................ 101
7.1.3 Javascriptsandboxing ............................... 102
7.2 Re ection, metaprogramming and aspect-oriented programming . . . . . . . . . . . . 102
7.2.1 Re ectionandmetacircularevaluators . . . . . . . . . . . . . . . . . . . . . . 102
7.2.2 Metaprogramming ................................. 104
7.2.3 Aspect-orientedprogramming ........................... 106
7.3 Semanticframeworks.................................... 106
7.4 Delimitedcontinuations .................................. 109
7.5 Dynamicbinding ...................................... 110
7.5.1 Theoreticalresults ................................. 110
7.5.2 Implementations .................................. 111
7.6 Actorsystems........................................ 112
7.6.1 Backgroundonactors ............................... 112
7.6.2 Actor coordination and re ective customization . . . . . . . . . . . . . . . . . 113
Chapter8 Conclusions................................... 115
8.1 Contributions ........................................ 116
8.2 Futurework ......................................... 117
8.2.1 Evolution of the request-based execution model . . . . . . . . . . . . . . . . . 117
8.2.2 Implementationimprovements .......................... 118
8.2.3 Furtherapplications ................................ 119
References........................................... 120
Author'sbiography...................................... 126
iv
Chapter 1 Introduction
It is often useful to be able to inspect and modify how part of a program executes, and the environment in which it executes. For example, variables and functions allow us to write programs that are abstract in certain values and function denitions. But we can go much further than that. For example, it can be useful to create a domain-specic language (DSL) customized to the task at hand. Rather than writing this DSL from scratch, it is expedient to reuse relevant features from the base language, creating an embedded DSL. But on the other hand, in some cases we may also want to remove or redene the behavior of certain constructs inherited from the base language. One case of this is when writing a custom sandbox for 3rd party mobile code: while we may not need to introduce new constructs, it is important to regulate how the code interacts with the rest of the system. This can be done by mediating access to certain language constructs (such as direct access to memory or other resources), while directly inheriting others (such as arithmetic, conditionals, and functions). This thesis deals with the problem of prototyping systems consisting of multiple, interacting levels of language dialects. This work focus on dynamically customizing the execution semantics of a programming language within a context. The key novelty of the described approach is it allows any language construct to be arbitrarily and dynamically customized within a bounded scope. By masking or virtualizing lower-level features and introducing higher-level features, a customization may introduce a new level of abstraction, up to and including a completely dierent programming language.
1.1 Why customize a language? Programming languages are a means of communication amongst human software developers and computers. Languages are used to describe solutions to problems; a \good? programming language allows developers to describe solutions in a way that makes sense to themselves, other developers,
1
and computers. The question of what makes the most sense to people is a problem of psychology, and outside the scope of this thesis. In practice, experienced software developers tend to be skilled at coming up with abstractions that make sense to themselves and each other, but end up having to translate these descriptions to a form that is executable on a computer. This may be because the developers make use of informal notation that is not dened in an existing programming language, or because they overload existing notation to mean something slightly dierent. In the latter case, code that looks unsafe or incorrect may actually be reasonable under the appropriate non-standard interpretation. The translation from \developer language? to \computer language? can either be done manually, or by dening a domain-specic language implementation. If the translation is purely manual, then possibly unsafe or interfering code which may at best contain bugs that are only revealed at deployment, or at worst contain intentional exploits. As such, it is useful to provide access to language customization both for developers producing a particular piece of code, and for developers and advanced users of systems that deploy that code. The goal of an extensible programming language is to ease the process of teaching a computer how to understand the composition of the developers? intended behaviors. The following sections describe three cases in which an extensible programming language can be useful.
1.1.1 Creating domain-specic languages The most immediate case of language extension alluded to above is domain-specic languages (DSLs). A domain-specic language introduces special constructs specically designed for a given problem domain. For example, TEX and PostScript are domain-specic languages for control of lower levels. As such, this chapter introduces an alternative linguistic framework for mediating program execution. Succeeding chapters show how the primitives of this framework can be used to dene a wide range of dierent language constructs.
2.1 Message-oriented computation Our approach to dening and extending languages focuses on message-oriented computation. Object-oriented programs are follow a similar form as the denitions of post and with env in Chapter 5. Later, we will consider how to implement it using the handler construct instead. As with earlier denitions (such as the if statement), the denition of time limit is separated into
(a) a phase where we pre-evaluate the arguments num-steps, on-preempt and on-return; andconstruct custom virtual environments with an application.
7.1.2 Language virtual machines Several programming languages use custom virtual machine environments as an intermediate target for code generation. Virtual machine code (often called bytecode) typically consists of simple, low-level operations along with somewhat higher level operations which are used as primitives by higher-level languages. While the Java Virtual Machine [61] and its younger cousin, the .Net Common Language Runtime [27], are the most widely used language virtual machines, they are far from the rst. Some of the earliest language virtual machines were O-code, for BCPL [73]; and p-code for Pascal [40]. Smalltalk [33]|an in uential re ective object-oriented language|was also implemented atop a virtual machine. In the case of Smalltalk, live VM program images can be saved to disk, ported to other machines, and restarted. Language virtual machines come in both general-purpose and language-specic varieties. In addition to Smalltalk, contemporary languages such as OCaml and Haskell maintain custom virtual machines. The Java VM was originally designed for the Java programming language alone; but as Java grew in popularity, languages such as Scala [68] chose to use the JVM as their own target platform. Indeed, upcoming versions of the JVM plan to add support for bytecode instructions useful for dynamically-checked languages other than Java. Observing this pattern, the .Net CLR and the LLVM compiler infrastructure [56] were both designed as targets for multiple dierent languages. The key dierence between the two is subprogram, AOP focuses on enriching functions which match certain patterns. Several studies have been conducted on the implementation and analysis of aspect-oriented programming, including the following. Bockisch, et al. [15] investigated the integration of aspect-oriented features into language virtual machines, in order to provide good performance while enabling join points to be chosen at run time. In contrast, [77] developed a static analysis which enables more ecient compilation of dynamic aspects. A type system for aspect-oriented computation is presented in [39], for the polyadic ABC language. ABC is similar to the -calculus, modeling nonterminating computations of interacting peers. In contrast, language-level virtualization focuses on mediating a subprogram's behavior by way of a meta-level. Another theoretical study of aspects is presented in [76], which denes a semantics of aspects by translation to an ML-like functional language. This contrasts with our approach of building up a language by using virtualization constructs. On the experimental side, [7] explored the use of user-dened analyses to identify pointcuts. Notably, their analysis accounts for the eects of aspects, and can remove dynamic tests which may be resolved statically. Similar techniques could be useful in a static, translational implementation of language-level virtualization. An interesting complement to our work is aspect mining, in which aspects are extracted from existing code bases; see, for example [21]. Both aspect mining and language level virtualization are useful tools for refactoring and reusing existing code bases in controlled environments.
7.3 Semantic frameworks Programming languages can be considered both in terms of implementation strategies and mathematical denitions of the underlying semantics. Many approaches exist to denining
106
executable semantics, wherein a semantic specication can be used to interpret real programs. From this perspective, our request-based model of execution can be viewed as an approach to dynamically creating new, extended language semantics for executing certain program fragments. We compare our approach to various systems of operational semantics surveyed in [75]. Small-step structural operational semantics (SOS) denes a syntax-directed reduction function on the states of a program [70]. A computation is expressed as a derivation tree, where each node corresponds to a single-step rule in the semantics. As pointed out in [75], SOS does not provide facilities for the modular denition of eectful operations: (1) the full state of a program be captured in the syntax of each semantic rule; and (2) conceptually simple operations like halt require changes to many seemingly unrelated rules. Our approach does not suer from these limitations because our basic execution model allows request messages to automatically propagate to the appropriate level. For example, the denition of halt is simple and modular:
handler(halt;args:k:unit(), :::)
Similarly, program state related to certain operations can be abstracted away, and ignored by unrelated operations. Examples include our denitions of lexical variables in Chapters 3 and 5. Big-step operational semantics abstracts SOS's reduction function on program states to an evaluation function from programs to results [43]. The return messages (Ret v) in our execution model borrow this idea. While big-step semantics can make language denitions simpler and higher-level as compared to small-step semantics, it does not provide a way to peer into the individual evaluation steps. In addition to debugging concerns, a program which does not terminate has no result value. Our execution model gets around this problem by exposing intermediate steps as request messages, which may or may not be captured by various parts of a program or language denition, depending on whether they are of interest. Modular SOS (MSOS) extends SOS with the capacity to describe state information that is not part of a program term itself, and which may be used or ignored by rules [65, 66]. Rules which ignore components of the state are assumed not to change those components. One could, theoretically, dene a dynamic language extension mechanism which merged new \language modules? with a copy of the MSOS semantics of the base language in eect, and instantiating a new interpreter for the new, extended semantics. While this is feasible when all modules of the language semantics are equally trusted, it does not account for access permissions to state
107
components, untrusted language extensions, and the coexistence of dierent denitions of the same construct within dierent parts of the same program. Our approach intrinsically accounts for these additional concerns by explicitly scoping language extensions as handler operations within the language. Hence a \language module? in our approach can only access those state components and operations which a normal program dened at the same point can access. Context reduction semantics diers from the previously-described systems in that it splits a program, at each step of execution, into an evaluation context and a redex [30, 97]. This corresponds closely to our request messages: the redex corresponds to the pair of an operation name and its arguments, and the context corresponds to the delimited continuation. But there are several key dierences. Evaluation contexts represent the entire context surrounding a redex, hence they correspond to full continuations, in contrast to our delimited continuations. Context reduction semantics involves simple, unconditional rules based on pattern matching on the context and the redex, whereas in this thesis, handlers may make use of arbitrary operations that are available in their scope, and we may rethrow requests that a handler cannot itself service. This is possible because new handlers can be dened anywhere within our programs, whereas context reduction semantics, as with the other systems above, separates the semantic rules from the program. In context reduction, valid contexts are specied via a grammar, and inferred via a parsing mechanism; whereas our execution model involves an explicit traversal directed by the programmer (or language designer). We made this choice because it is easier to understand how to arbitrarily redene operations if the operation denition has full control over (and responsibility for) the traversal. It would be interesting to understand how to integrate the grammar/parsing aspect of contxt reduction into the request-based execution model. All of the systems discussed above make use of an interleaving model of concurrency, as do we in Chapters 4 and 6. For example, a two-level re ective semantics has been developed for distributed systems in terms of the actor model [94], which has been applied to problems such as distributed garbage collection. In principle, we could also oer a true concurrency model, where the par operation accepts and evaluates a number of concurrent terms, internally servicing certain operations on its own, only emitting requests for operations it does not know how to implement. In hardware terms, this corresponds to executing programs on multiple slave processors, emitting messages on a shared bus to perform communication. However, this possibility is not explored in detail in this thesis.
108
7.4 Delimited continuations Continuations formalize the intuitive notion of \the rest of the program? at any point of execution [82]. A continuation is a function that, when called, executes the rest of the program|and since programs either terminate or diverge, a call to a continuation does not return. This behavior makes continuations useful for describing control
ow constructs. For example, a function may be passed normal return continuation and an exception continuation, and then internally choose which continuation to call, depending on whether an exceptional condition occurs during its execution. In practice, continuations have been used as a compilation mechanism|the CPS transform|as well as presented to the application programmer as a general-purpose construct, such as call-with-current-continuation in Scheme [19]. Delimited continuations [29] represent the rest of the execution of a program up to a certain point. In practice, delimited continuations often represent the rest of the computation of a particular subterm of a program, where the root of that subterm serves as the delimiter. One may alternatively think of a delimited continuation as a continuation prex, which may be composed with another (delimited) continuation to form a larger (delimited) continuation. Historically, delimited continuations have been dened in terms of a pair of operators, most notably either the dynamic delimited continuation operators prompt/control, or the static delimited continuation operators shift/reset. For example, the expression prompt C[control k:e] captures the delimited continuation C, evaluating to prompt e[k := C], as presented in [51]. The corresponding case for shift/reset is similar, but slightly dierent: in the case of prompt/control, if the captured delimited continuation C is returned by the prompt expression and then executed, any control operation it invokes may capture a much larger delimited continuation than was apparent from the original program code. The static operators, shift/reset, x this problem by wrapping the delimited continuation returned by reset in a shift block. Thus shift/reset has a straightforward transformation to prompt/control; but the question of a transformation in the opposite direction was open for quite some time. As it turns out, these and various other approaches to delimited continuations, as well as full continuations, are mutually expressible in terms of each other [17]. The delimited continuations used in this thesis are of the dynamic variety, since it is convenient to be able to take code out of one dynamic context and put it into another. Another key dierence is that we use dierent delimiters for dierent operators (corresponding to their handler scope). Other implementations of delimited continuations make similar generalizations [52]. The continuations within our request messages are in fact delimited continuations [24]. While a
109
standard continuation is a non-returning procedure which represents the rest of a program, a delimited continuation represents the rest of the computation of a subterm, and hence returns (unless the subterm diverges). Kiselyov et al. [53] showed how dynamic binding can be expressed in a delimited control framework. Their notion of delimited dynamic bindings coincides with the behavior of request messages (producers of delimited continuations) and handlers (the scope of dynamic bindings and consumers of delimited continuations) in language-level virtualization.
7.5 Dynamic binding One of the key features enabling request mediation is dynamic binding. The idea is that when a program term is placed within the scope of a handler, that handler's binding takes eect within the program term. If the program term is later placed in the scope of an alternate handler for an operation, the new handler should take eect. This section brie y reviews the literature on dynamic binding. This work can be classied into three general areas: theoretical results, implementations, and applications. Additionally, for an overview of binding constructs, see [12].
7.5.1 Theoretical results Several dierent approaches to dynamic binding have been studied in a theoretical context. The system N provides a model of dynamic binding based on name-indexed variables [23]. Rather than storing a single value in a variable, N allows multiple values to be passed along dierent named channels associated with the same lexical variable. This model is a generalization of both lexical scoping and (lexically-global) dynamic scoping, since each lexical variable can be treated as a dynamic environment or record. Hence records are essentially built-in to N, which allows for the development of a type system with dierent value types at each name index of a variable. However, names are not rst-class values, which means that private names must be managed by convention, and new names cannot be created dynamically. Dynamic binding a la Lisp was formalized in [64]. The system d and its derivatives retain Lisp's distinction between lexical and dynamic variables. The article studied applications of dynamic variables to dening exceptions, which is similar to our use of request messages in this thesis, since request messages can be viewed as restartable exceptions. The work also covered a number of practically-motivated evaluation strategies, including a dynamic-environment passing transformation from dynamic to lexical scoping; and both deep and shallow binding. Deep binding
110
corresponds to searching through a list for a variable's binding, while shallow binding is implemented by saving and restoring the contents of a global variable. These topics were explored from an implementer's perspective in [13]. One of the more recent advances has been the combination of dynamic binding and and delimited control within a single, consistent formalism [53]. The system, called DB+DC, enriches the language from [64] with a type system and delimited control. The authors conclude that delimited dynamic binding must allow for delimited continuations which close over part of the dynamic environment, rather than all or none. Indeed, this is exactly what we do in this thesis with handlers and rec-handlers expressions, since these dynamic binding sites also serve as delimiters for continuations in request messages. The paper also shows that DB+DC can in fact be macro-expressed in terms of delimited control. An approach to implicit parameter passing which is slightly dierent from dynamic variables was presented in [60]. This system allows for Hindley-Milner type inference of implicitly propagated function arguments. However, a comparison of the implementation of implicit parameters versus other dynamic binding systems has revealed a discrepancy [50]|whereas reading a dynamic variable returns the value of the latest binding, implicit parameters may be irrevocably substituted sooner. However, this may arguably be the desired behavior in the situations implicit parameters are designed to solve.
7.5.2 Implementations Lisp implementations have a long history of dynamic variables. Earlier Lisps relied heavily or exclusively on dynamic scoping [13]. Common Lisp [80] brought lexical scoping as the default, while still allowing a separate namespace of (lexically-global) dynamically-scoped special variables. Some other languages, such as Perl, also follow this pattern (Perl uses local for dynamic scoping and my for lexical scoping). TEX [54] and Emacs Lisp [59] also retain dynamic scoping. In many of these cases, however, the choice of dynamic scoping is not an absolutely necessary choice so much as an engineering compromise based on use cases and implementation technology. The Scheme programming language, inspired by the untyped -calculus, is the most popular lexically-scoped Lisp dialect. Owing to the history of the Lisp community, several Scheme implementations include additional features for dynamic scoping|particularly, lexically-scoped dynamic variables. The SRFI 39 standard [28] denes a system of parameter objects which can be dynamically created, passed by reference, and dynamically bound. While parameters are distinct
111
from variables, the fact that they are rst-class objects which can be stored in lexical variables means they are essentially lexically-scoped dynamic variables|much like how operations are not rst-class functions, but operation names are represented as symbols in this thesis. One existing weakness of parameter objects is the fact that dierent Scheme implementations handle parameters dierently in multi-threaded programs. The preferable approach is for parameter bindings in separate threads to be independent, just as sequential parameter bindings would be. The paper [36] proposed a minimal practical implementation of dynamic scoping for imperative languages. This project focused on the structure of dynamic environment frames, lookup methods based on hashing, and a pragmatic approaches to introducing dynamic variables using C++ classes and C preprocessor macros.
7.6 Actor systems The actor model [5] presents a convenient way to describe and implement concurrent object-oriented systems. In relation to this thesis, the concept of meta-actors|supervisors which mediate the visible actions of other actors|represent a restricted form of our request mediation. In Chapter 4, we examined denitions of several actor constructs in terms of request mediation.
7.6.1 Background on actors Actors have been implemented as both libraries and native features in many production languages. Erlang [9], developed for safety-critical distributed systems such as telephone switches, is essentially an industrial-strength implementation of the actor model. Historically, object-oriented languages like Smalltalk share a very similar model|however, Smalltalk's messages are synchronous. Even Scheme was originally inspired by a desire to create an actor system [87], although Scheme ended up taking a rather dierent evolutionary path. More recently, actors have been implemented as library abstractions in many languages, including Java [10], Scala [35], Python [57], and C++ [11, 63]. The primary downside of actor libraries is they cannot always enforce actor semantics. For example, the behaviors of two actors in Java could hold references to a shared variable, because Java's threads allow shared memory. This can happen inadvertently if an actor sends a message which contains an object that is only a shallow copy of an object it retains. Similarly, if actors are con ated with base-language objects, a programmer may call a synchronous object method rather than sending an asynchronous message; this may be
112
done inadvertently or inconsistently. It is possible to dene front-end preprocessors which rule out these violations, but this essentially entails creating a new, derived language [91]. An alternative to creating a new language or living with a leaky system is to catch violations dynamically. This is the approach we have taken with our request handlers for actor operations. While it does not give us strong static guarantees, it does allow us to detect violations when testing a prototype. Another concern which requires run-time support is distributed garbage collection actors [42, 93, 25, 90].
7.6.2 Actor coordination and re ective customization As mentioned in the introduction, there are many approaches to coordination in actor systems. In general, program control structures can be viewed as patterns of message passing amongst actors [38]. One approach we examined in this paper is nite automaton local synchronization constraints. This idea can be extended to protocol automata describing sessions involving multiple actors|for example, a client interacting with a server. A related concept is message receive patterns: for example, in Erlang, an actor's behavior may be dened piecewise, with dierent handlers for dierent message patterns. This idea has been generalized to patterns involving a conjunction of multiple messages [86]. An alternative to applying constraints locally within each actor is to mediate and lter messages as they pass between actors. Meta-actors are hence a useful implementation strategy for message lters [6]. In addition to the two cases examined in this paper (xed and reassignable meta-actors), we can consider other tradeos: one meta-actor per actor vs. a stack of multiple meta-actors per actor vs. multiple actors supervised by a single meta-actor, known as group re ection [96]. Each of these cases can be described using the denitions of meta-actors we presented. Finally, there are several approaches to describing complex coordination patterns amongst multiple actors. One form is protocol description languages, such as DIL, which can be used to customize program behavior, such as failure semantics [11, 84]. Another approach, called synchronizers, allows us to describe synchronization constraints in terms of message patterns and methods of various actors that they enable and disable|a distributed generalization of local synchronization constraints [1, 32]. Synchronizers have been extended to describe realtime behavior [67]. An alternative to enabling and disabling methods is creating and trapping the corresponding messages in
ight. For example, Reo circuits provide a way to specify rules that govern the creation, propagation, and consumption of messages [8]. Protocol customization mechanisms have been used
113
to introduce fault-tolerance and dependability features into existing systems [2, 3].
114
Chapter 8
Conclusions
This thesis has studied an approach to programming languages that allows arbitrary language features to be redened and extended from within the language itself. The key to this approach has been the view that a program executes by dispatching a sequence of request messages to an underlying underlying semantics object, which is itself dened as a lower-level program. In our model, program terms are trivially equivalent to a subset of these messages. Hence uses of all language features|even those of our base language|can be treated as request messages. Our model conveniently captures the standard notion of a stack of multiple levels of abstraction, such as a script interpreted by a user process hosted by an operating system running on a processor. Having multiple levels of semantics objects, we generalized our notion of request message dispatch, to allow messages to either be handled by a particular level, or automatically propagated to the underlying level. As a consequence, we are able to create contexts which trap and service arbitrary request messages|hence dene and redene arbitrary operations; while we transparently pass all other requests to the underlying semantics|hence inherting all the other denitions in scope. To better understand the request-based execution model, we used it to dene a number of language features. Notably: all major features of JavaScript, as well as a simple actor system with meta-actors. The fact that even base language features are treated as messages had pluses and minuses when applied to a real language. On the negative side, it required a base-level language interpreter dened in the request-based style. To get the full re ective power of our model, we had to dene a custom interpreter. But on the plus side, given such a base-level interpreter, we are able to dynamically create derived language interpreters within custom contexts by redening only the relevant operations. An important feature of the request-based model is that it enables supervisory control over the eects a program may have, while allowing the program itself to radically redene the language in which it is written. Moreover, any program may act as a supervisor for subprograms. This is
115
because custom language extensions, by construction, (a) are only as powerful as \normal? code in the same context, and (b) have the ability to fully mediate code within their scope. Unlike models of re ection in which a program \reaches down? and modies its own interpreter, this aspect of the request-based model makes it useful in sandboxed environments, such as web pages. Finally, it is worth noting that the request-based execution model is enabled by the combination of three important preexisting language features:
? dynamic scoping|request messages are serviced by their nearest dynamic enclosing handler. ? explicit call-by-name evaluation|allowing the denition of an operation to customize how arguments are evaluated ? delimited continuations|continuations allow us to replicate, pause, abort, or customize the evaluation of a program following invocation of an operation; while their delimited nature allows us to restrict the scope of these control eects Each of these features is important in our model. For example, dynamic scoping combined with delimited continuations is what allows us to dene imperative variables on top of a functional base language. The correspondence between terms and messages is what unies these features.
8.1 Contributions The key contributions of this thesis can be summarized as follows:
1. Introduced a request-based execution model that allows arbitrary language features to be dynamically redened, which is consistent with (and in fact, useful for) program sandboxing. 2. Showed how to dene a real programming language (JavaScript) in terms of the request-based execution model, allowing dynamic extensions to JavaScript in a browser-based prototype implementation. 3. Related the request-based execution model to prior work, including virtual machines, re ection, and meta-actor systems. 4. Serves as a (we believe, compelling) argument for the inclusion of dynamic scoping, explicit call-by-name/nonstrict evaluation, and delimited continuations in real programming languages. While the request-based execution model as described in this thesis is still 116
immature, and will invariably be superseded by better formulations, these three language
features are well-studied, and yet not as commonly available as they ought to be.
8.2 Future work Possible future work related to this thesis can be divided into three categories:
1. evolution of the request-based execution model 2. implementation improvements 3. further applications 8.2.1 Evolution of the request-based execution model It would be useful to make it easier to correctly and concisely describe implementations of language features via request handlers. One avenue of improvement would be to create a type system which allows us to dene the type signature of an operation, and then check that their denitions and redenitions (via request handlers) and their uses are all consistent with the type signature. Since our goal would be to type any language construct (rather than just, say, functions and variables), this would need to be a fairly rich type system. We would need the ability to introduce new types and new typing rules, show that those typing rules are consistent, and annotate terms with eect types as well as result types. One possible approach is a dependent type system, in which the type environment and typing rules are reied as type-level functions. This work will require a careful examination of the state-of-the-art in type-based reasoning systems, such as Coq, NuPRL, and Twelf. Another, related, avenue for improvement is to examine and refactor our approach to variable binding. This thesis considers variables as names that are accessed via explicit get and set operations. This is consistent with the low-level realization of variables in memory, but it has two potential issues going forward. First of all, is it easier to type (or infer the types of) variables if they are expressed dierently, for example, as operations themselves, rather than as names? Secondly, it would be useful to provide variable binding mechanisms as a component out of which operation denitions can be constructed. For example, it may be more convenient to adopt a higher-order abstract syntax (HOAS) style, perhaps providing a few dierent binding options (e.g., lexical and dynamic variables).
117
Variable binders aren't the only potentially reusable language feature components (or subfeatures). It would be useful to dene a library of subfeatures. To make this useful, we would need to investigate design patterns for intentionally making language features and subfeatures easily composable. For example, as it stands right now, if a new operation is dened via a handler and used within the scope of that handler, and we cannot introduce another mediating handler in between the two, then we have no good way to customize that feature. In solving this problem, we should be careful not to violate the sandboxing model: nested programs should not be able to arbitrarily examine and modifying their containing scope unless they are explicitly trusted to do so.
8.2.2 Implementation improvements A key limiter on the performance of request-based interpreters is the cost of capturing delimited continuations. In most cases, this capture could well be performed lazily, as is the case in production implementations of delimited continuations. However, preliminary experiments with delimited continuations in OCaml [52] has turned up a slight inconsistency between the type signatures control/prompt and friends and the semantics of our handler operation. Traditionally, delimited continuations do not distinguish between \normal? return via the delimited term reducing to a value, and \forced? return by capturing a delimited continuation and inserting a value in its place. Our handler operation does distinguish between these cases; and implementing it in terms of control/prompt together with a discriminated sum type annoyingly requires us to insert an additional layer of injections/projections to the delimited continuation at each step of evaluation. It should be possible to optimize this away in a lower-level implementation. A related concern is an ecient implementation of dynamic scoping, for the use cases typically encountered in the request-based execution model. One possibility is to eagerly inline dynamic handlers (or their addresses) into a program. To make this work, we must keep track of what has been inlined into a term, so that we can undo the inlining if a delimited continuation containing that term is later evaluated in a modied dynamic environment. In this case, it is particularly important that we know whether a given delimited continuation is used in only one place, or in multiple parts of the program simultaenously. In the latter case, we need to either not perform the inlining, or make a copy of the code.
118
8.2.3 Further applications Javascript on web pages is an interesting domain to consider because of the clear need to better mediate foreign code running on a web page. If implementations challenges such as those outlined above can be overcome, it could be useful to create a native Javascript interpreter for a browser that supports request mediation. This would reclassify the techniques we have explored from early prototypes to more plausible solutions. Aside from web pages, other computing environments also benet from the ability to manipulate and safely mediate arbitrary code. It would be interesting to build features like our request mediation constructs into an operating system. For example, we might build upon existing debugging and tracing facilities. What would make this really useful is if we could use a simple, high-level scripting language to easily, safely and completely mediate the execution of any software component or fragment thereof. In essence, this would allow us to perform virtualization at an arbitrary granularity.
119
References
[1] Gul Agha, Svend Frlund, WooYoung Kim, Rajendra Panwar, Anna Patterson, and Daniel Sturman. Abstraction and modularity mechanisms for concurrent computing. In in David Skillicorn and Domenico Talia (editors), Programming Languages for Parallel Processing, pp 146-157, IEEE Computer Society, May, 1995. [2] Gul Agha, Svend Frlund, Rajendra Panwar, and Daniel Sturman. A linguistic framework for dynamic composition of fault-tolerance protocols. In Conference on Dependable Computing for Critical Applications (DCCA-3), pp 197-207, International Federation of Information Processing Societies, Palermo (Sicily), Italy, September, 1992. [3] Gul Agha, Svend Frlund, Rajendra Panwar, and Daniel Sturman. A linguistic framework for dynamic composition of dependability protocols. In in C. E. Landwehr, B. Randell, and L. Simoncini (editors), Dependable Computing and Fault-Tolerant Systems VIII, pp 345-363, IFIP Transactions, Springer-Verlag, 1993. [4] Gul Agha and Daniel C. Sturman. A methodology for adapting to patterns of faults. In G. Koob (ed.), Foundation of Ultradependability, pages 23{60, 1994. [5] Gul A. Agha. ACTORS: A Model of Concurrent Computation in Distributed Systems. MIT Press, 1986. [6] Mehmet Aksit, Ken Wakita, Jan Bosch, Lodewijk Bergmans, and Akinori Yonezawa. Abstracting object interactions using composition lters. In Object-Based Distributed Processing, Lecture Notes in Computer Science 791, pages 152{184. Springer Verlag, 1993. [7] Tomoyuki Aotani and Hidehiko Masuhara. Scope: an aspectj compiler for supporting user-dened analysis-based pointcuts. In AOSD '07: Proceedings of the 6th international conference on Aspect-oriented software development, 2007. [8] Farhad Arbab. Reo: A channel-based coordination model for component composition. In Mathematical Structures in Computer Science, pages 329{366, 2004. [9] Joe Armstrong, Robert Virding, Claes Wikstrom, and Mike Williams. Concurrent Programming in Erlang. Prentice Hall, 1996. [10] Mark Astley, Thomas Clausen, James Waldby, Rajesh Kumar, and Amin Shali. Actor foundry. http://osl.cs.uiuc.edu/af/. [11] Mark Astley, Daniel C. Sturman, and Gul A. Agha. Customizable middleware for modular distributed software. Communications of the ACM, 44:2001, 2001. [12] M.P. Atkinson and R. Morrison. Types, bindings and parameters in a persistent environment. pages 3{20, 1988. [13] Henry G. Baker. Shallow binding in lisp 1.5. Commun. ACM, 21(7):565{569, 1978. 120
[14] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Wareld. Xen and the art of virtualization. In Symposium on Operating System Principles, 2003. [15] Christoph Bockisch, Michael Haupt, Mira Mezini, and Klaus Ostermann. Virtual machine support for dynamic join points. In AOSD '04: Proceedings of the 3rd international conference on Aspect-oriented software development, 2004. [16] Ana Bove and Laura Arbilla. A con uent calculus of macro expansion and evaluation. SIGPLAN Lisp Pointers, V(1):278{287, 1992. [17] Chung chieh Shan. Shift to control. In Proceedings of the 5th workshop on scheme and functional programming, ed. Olin Shivers and Oscar Waddell, 99107, 2004. [18] Manuel Clavel, Francisco Duran, Steven Eker, Patrick Lincoln, Narciso Mart-Oliet, Jose Meseguer, and Jose F. Quesada. Maude: Specication and programming in rewriting logic. Theoretical Computer Science, 2001. [19] Will Clinger, Anne Hartheimer, and Eric Ost. Implementation strategies for continuations. In LFP '88: Proceedings of the 1988 ACM conference on LISP and functional programming, pages 124{131, New York, NY, USA, 1988. ACM. [20] William Clinger. Macros in scheme. SIGPLAN Lisp Pointers, IV(4):17{23, 1991. [21] Grigoreta Soa Cojocar and Gabriela Serban. On some criteria for comparing aspect mining techniques. In LATE '07: Proceedings of the 3rd workshop on Linking aspect technology and evolution, 2007. [22] R.J. Creasy. The origin of the VM/370 time-sharing system. IBM Journal of Research and Development, 25(5), 1981. [23] Laurent Dami. A lambda-calculus for dynamic binding. Theor. Comput. Sci., 192(2):201{231, 1998. [24] Olivier Danvy and Andrzej Filinski. Abstracting control. In LFP '90: Proceedings of the 1990 ACM conference on LISP and functional programming, pages 151{160, New York, NY, USA, 1990. ACM. [25] Peter Dickman. Incremental, distributed orphan detection and actor garbage collection using graph partitioning and euler cycles. In in Proceedings Workshop on Distributed Algorithms 96, O. Babaoglu, K. Marzullo, Eds., Lecture Notes in Computer Science 1151, pages 141{158. Springer-Verlag, 1996. [26] ECMA-262. ECMAScript Language Specication. ECMA, 3rd edition, December 1999. [27] ECMA-335. Common Language Infrastructure. ECMA, 4th edition, June 2006. [28] Marc Feeley. Srfi 39: Parameter objects. 2003. [29] Matthias Felleisen, Daniel P. Friedman, Bruce Duba, and John Marrill. Beyond continuations. Technical Report 216, Computer Science Department, Indiana University, 1987. [30] Matthias Felleisen and Robert Hieb. A revised report on the syntactic theories of sequential control and state. Theoretical Computer Science, page 103(2):235271, 1992. [31] Svend Frolund. Inheritance of synchronization constraints in concurrent object-oriented programming languages. In European Conference on Object-Oriented Programming, in O. L. Madsen (Editor), Lecture Notes in Computer Science 615, pages 185{196. Springer Verlag, 1992. 121
[32] Svend Frlund and Gul Agha. A language framework for multi-object coordination. In Proceedings of the European Conference on Object-Oriented Programming. [33] Adele Goldberg, David Robson, and Michael A. Harrison. Smalltalk-80: The Language and its Implementation. Longman Higher Education, May 1983. [34] Jr. Guy L. Steele. Common LISP: The Language. Digital Press, 2nd edition, 1990. [35] Philipp Haller and Martin Odersky. Event-based programming without inversion of control. In Proc. Joint Modular Languages Conference, Springer LNCS, 2006. [36] David R. Hanson and Todd A. Proebsting. Dynamic variables. SIGPLAN Author's biography Sameer Sundresh received a Bachelor of Science degree in Computer Science from the University of Illinois at Urbana-Champaign in 2001; he has been enrolled in the graduate program in Computer Science since then. From 2001-2005, Sameer worked on sensor networks, including collaborative eldwork on localization based on radio and acoustic signals and other known constraints, and a sensor network simulator of questionable value other than accumulating citations. Being tired of overcalibrating dodgy sensors on bombing ranges and army bases in the middle of nowhere, during 2005-2006, Sameer made a failed attempt to develop a language-parametric module system (with a few unsuccessful revivals as he learned more about programming languages and type systems). Since December of 2006, he has intermittently been working on the request-based execution model reported on in this thesis; for about the rst year, this included an attempt at a type-and-eect system which never quite worked right and obscured the central ideas. From the August 2007 through August 2008, he consulted part-time for Pattern Insight, Inc., which inspired the application to Javascript. Since the Spring 2007 semester, Sameer has worked as a teaching assistant for CS 241 (systems programming), CS 421 (programming languages), and CS 423 (operating systems). Since 2004, Sameer has also been actively involved in the University of Illinois ACM student chapter. His participation includes work on a mobile phone-based payment system with SIGEmbedded, founding of a student SIGPLAN chapter, service as the 2005 Re ections Projections Conference chair, guiding the formation of a corporate relations committee, and stints as secretary and vice chair. 126
|
|
|
|