X10 is a language focused on improving application performance and developer productivity while enabling applications to take advantage of scale-out computational environments. The X10 language design addresses numerous considerations and concerns, which are discussed below and in subsequent areas of our site and documentation.
Larger computational problems require more powerful computers capable of performing a larger number of operations per second. The era of increasing performance by simply increasing clock frequency is now behind us, because it is becoming increasingly difficult to manage chip power and heat. Instead, computer designers are starting to look at scale-out systems, in which the system’s computational capacity is increased by adding nodes of comparable power to the existing nodes and connecting them with a high-speed communication network.
A central problem with scale-out systems is the definition of the memory model, that is, a model of the interaction between shared memory and simultaneous (read, write) operations on that memory by multiple processors. The traditional “one operation at a time, to completion” model that underlies Lamport’s notion of sequential consistency (SC) proves too expensive to implement in hardware at scale. Various models of relaxed consistency, in turn, have proven too difficult for programmers to work with.
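To make the “one operation at a time, to completion” model concrete, here is a minimal Python sketch that enumerates every sequentially consistent execution of a small two-thread program. The program (two threads that each write one variable and then read the other) and all names in the code are illustrative assumptions, not something taken from the X10 documentation.

```python
from itertools import combinations

# Illustrative two-thread program (an assumption for this sketch).
# Each operation is (kind, variable, value-to-write or register-to-fill).
THREAD_A = [("write", "x", 1), ("read", "y", "r1")]
THREAD_B = [("write", "y", 1), ("read", "x", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's program order."""
    n = len(a) + len(b)
    for a_slots in combinations(range(n), len(a)):
        a_iter, b_iter = iter(a), iter(b)
        yield [next(a_iter) if i in a_slots else next(b_iter) for i in range(n)]


def run(schedule):
    """Execute one schedule against a single memory, one operation at a time."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for kind, var, arg in schedule:
        if kind == "write":
            memory[var] = arg
        else:
            registers[arg] = memory[var]
    return registers["r1"], registers["r2"]


def sc_outcomes():
    """Return the set of (r1, r2) results reachable under sequential consistency."""
    return {run(s) for s in interleavings(THREAD_A, THREAD_B)}
```

Under these assumptions, `sc_outcomes()` contains `(0, 1)`, `(1, 0)` and `(1, 1)` but never `(0, 0)`. Relaxed models typically admit that fourth outcome as well, which is one illustration of why they are harder to reason about.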
One response to this problem has been to move to a fragmented memory model. Multiple processors are made to interact via a relatively language-neutral message-passing format such as MPI [9]. This model has enjoyed some success: several high-performance applications have been written in this style. Unfortunately, this model leads to a loss of programmer productivity: the message-passing format is integrated into the host language by means of an application programming interface (API); the programmer must explicitly represent and manage the interaction between multiple processes and choreograph their data exchange; large data structures (such as distributed arrays, graphs and hash tables) that are conceptually unitary must be thought of as fragmented across different nodes; and all processors must generally execute the same code (in an SPMD fashion).
One response to this productivity problem has been the advent of the partitioned global address space (PGAS) model underlying languages such as UPC, Titanium and Co-Array Fortran [3, 10]. These languages permit the programmer to think of a single computation running across multiple processors, sharing a common address space. Every piece of data resides at some processor, which is said to have affinity to the data. Each processor may operate directly on the data it contains but must use some indirect mechanism to access or update data at other processors. Some kind of global barrier is used to ensure that processors remain roughly in lock-step.
X10 is a modern object-oriented programming language in the PGAS family. The fundamental goal of X10 is to enable scalable, high-performance, high-productivity transformational programming for high-end computers, both for traditional numerical computation workloads (such as weather simulation, molecular dynamics and particle transport problems) and for commercial server workloads.
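The PGAS affinity rule can be sketched in a few lines of Python. The class below is a hypothetical toy, not an API of X10, UPC or any other PGAS language: it assumes a block partitioning of one global array and uses a counter as a stand-in for the "indirect mechanism" (for example, a network round trip) needed to reach data held by another processor.

```python
class PartitionedArray:
    """Toy global array, block-partitioned over processors (illustrative only)."""

    def __init__(self, size, num_procs):
        self.size = size
        self.num_procs = num_procs
        self.block = -(-size // num_procs)  # ceiling division
        # Each processor owns one fragment of the single global index space.
        self.fragments = [
            [0] * max(0, min(self.block, size - p * self.block))
            for p in range(num_procs)
        ]
        self.remote_accesses = 0  # counts accesses that would need communication

    def affinity(self, index):
        """Return the processor that owns the element at a global index."""
        return index // self.block

    def _locate(self, proc, index):
        owner = self.affinity(index)
        if owner != proc:
            self.remote_accesses += 1  # stands in for the indirect mechanism
        return owner, index - owner * self.block

    def read(self, proc, index):
        owner, offset = self._locate(proc, index)
        return self.fragments[owner][offset]

    def write(self, proc, index, value):
        owner, offset = self._locate(proc, index)
        self.fragments[owner][offset] = value
```

The caller always uses global indices, so the array stays conceptually unitary, while `remote_accesses` makes visible which operations cross a processor boundary.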
X10 is based on state-of-the-art object-oriented programming ideas, primarily to take advantage of their proven flexibility and ease of use for a wide spectrum of programming problems. X10 takes advantage of several years of research (e.g., in the context of the Java Grande forum [7, 1]) on how to adapt such languages to the context of high-performance numerical computing. Thus X10 provides support for user-defined struct types (such as Int, Float and Complex), supports a very flexible form of multi-dimensional arrays (based on ideas in ZPL [4]) and supports IEEE-standard floating-point arithmetic. Some capabilities for supporting operator overloading are also provided.
X10 introduces a flexible treatment of concurrency, distribution and locality, within an integrated type system. X10 extends the PGAS model with asynchrony (yielding the APGAS programming model). X10 introduces places as an abstraction for a computational context with a locally synchronous view of shared memory.
An X10 computation runs over a large collection of places. Each place hosts some data and runs one or more activities. Activities are extremely lightweight threads of execution. An activity may synchronously (and atomically) use one or more memory locations in the place in which it resides, leveraging current symmetric multiprocessor (SMP) technology. To access or update memory at other places, it must spawn activities asynchronously (either explicitly or implicitly). X10 provides weaker ordering guarantees for inter-place data access, enabling applications to scale. Immutable data needs no consistency management and may be freely copied by the implementation between places. One or more clocks may be used to order activities running in multiple places. Arrays may be distributed across multiple places. Arrays support parallel collective operations. A novel exception flow model ensures that exceptions thrown by asynchronous activities can be caught at a suitable parent activity. The type system tracks which memory accesses are local. The programmer may introduce place casts which verify the access is local at run time. Linking with native code is supported.
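The interplay of places, asynchronously spawned activities and exception flow to a parent can be approximated in Python. This is only a sketch under loud assumptions: `Place`, `Finish`, `async_at` and `ActivityErrors` are hypothetical names invented here, threads stand in for activities, and all places live in one process. It is not X10 syntax and not how the X10 runtime is implemented.

```python
from concurrent.futures import ThreadPoolExecutor, wait


class ActivityErrors(Exception):
    """Hypothetical wrapper for exceptions raised by spawned activities."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} activity error(s)")
        self.errors = errors


class Place:
    """Hypothetical stand-in for a place: private data plus its own workers."""

    def __init__(self, place_id, workers=2):
        self.id = place_id
        self.data = {}
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def shutdown(self):
        self._pool.shutdown()


class Finish:
    """Waits for every activity spawned inside the `with` block."""

    def __init__(self):
        self._futures = []

    def async_at(self, place, fn, *args):
        # Spawn an activity at `place`; the parent does not block here.
        self._futures.append(place._pool.submit(fn, place, *args))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        wait(self._futures)
        errors = [f.exception() for f in self._futures if f.exception() is not None]
        if errors and exc_type is None:
            raise ActivityErrors(errors)  # surface failures at the parent
        return False


def local_sum(place):
    # Runs at `place` and touches only that place's data.
    place.data["partial"] = sum(place.data["chunk"])


def distributed_sum(values, num_places=4):
    places = [Place(i) for i in range(num_places)]
    try:
        for p in places:
            p.data["chunk"] = values[p.id::num_places]
        with Finish() as f:
            for p in places:
                f.async_at(p, local_sum)
        # Simplification: the parent reads each place's result directly.
        return sum(p.data["partial"] for p in places)
    finally:
        for p in places:
            p.shutdown()
```

For example, `distributed_sum(list(range(100)))` returns `4950`. The sketch only lets the parent spawn activities; nested spawning, clocks and distributed arrays are deliberately left out.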
X10 v2.0 builds on v1.7 to support the following features: structs (i.e., “headerless”, inlinable objects); type rules for preventing the escape of this from a constructor; and a global object model that permits user-specified (immutable) fields to be replicated with the object reference. Value classes are no longer supported; their functionality is accomplished by using structs or global fields and methods. Several representative idioms for concurrency and communication have already found pleasant expression in X10. We intend to develop several full-scale applications to gain more experience with the language, and to revisit the design in the light of this experience.