Overview¶
elasticslice is a new GENI-like API endpoint that provides an interface for clients that want to dynamically add and remove nodes from slivers at CMs (component managers) throughout the lifetime of their slice, and that are willing to allow the CM server to remove dynamically-added nodes from their sliver if the server needs those nodes for any reason. This is a different model than the one supported by the core GENI APIs, in which a client submits a ticket to an AM or CM to obtain the resources provided for in the ticket, and is then guaranteed to have those resources until the ticket expires.
If you’re coming from the Emulab model of projects and experiments, the slice and sliver terminology may be confusing. You can read http://groups.geni.net/geni/wiki/GENIConcepts to get started, but the main thing to understand is that a slice is a container for resources. Creating a slice just creates an object at a Slice Authority (a GENI component), to which a set of users is bound (these users can modify the slice and its resources); the Slice Authority gives you the slice and a slice credential. When you present that slice credential to a Component Manager (or an Aggregate Manager), along with a request for resources, the CM may honor that request, and if it does, it gives you resources in a collection of slivers bound to your slice at that CM. In the ProtoGENI world, each sliver is a node, a LAN, or some other resource; however, we often refer to the collection of slivers associated with a slice at a CM as “the sliver” itself. Said another way, a slice is a portion of a federated testbed composed of multiple, mutually-trusting clusters, each providing a set of resources; you request resources at a cluster, bound to your slice, and those resources are called slivers. The ProtoGENI SA will give you a slice and a credential for that slice, and that credential can be used to get you slivers at any of the ProtoGENI CMs (in the ProtoGENI world, a CM arbitrates resource allocation for a whole cluster). At other GENI testbeds, a sliver might be a single node. In the ProtoGENI world, a sliver is a lot like an Emulab experiment; it’s just that in ProtoGENI, you might have multiple slivers, one at each of the ProtoGENI CMs, all associated with one slice!
(Note: in the ProtoGENI world, each XMLRPC server serves one or more endpoints. For instance, if you are invoking a GENI “slice authority” method, you call Resolve(slice_urn) at https://www.emulab.net:12369/protogeni/xmlrpc/sa, and thus you are using the “sa” endpoint, which provides the slice authority GENI interface. Likewise, the “elasticslice” API is available at /protogeni/xmlrpc/dynslice (dynslice is the legacy name). Its methods are described below.)
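As a concrete illustration of how such an endpoint is reached, the sketch below uses Python’s standard xmlrpc.client and ssl modules to connect to the dynslice endpoint with a GENI user certificate. The hostname and certificate path are placeholders for your own setup, and the methods you would then call are those described below, not anything this sketch defines.

    import ssl
    import xmlrpc.client

    # Placeholder values: substitute your CM's hostname and your own GENI
    # certificate (the combined cert+key PEM issued by your slice authority).
    CM_HOST = "www.emulab.net"
    CERT_FILE = "/path/to/your/geni-cert-and-key.pem"

    # ProtoGENI XMLRPC servers authenticate clients by their GENI certificate.
    ctx = ssl.create_default_context()
    ctx.load_cert_chain(certfile=CERT_FILE)
    # If the CM's server certificate is signed by a testbed-local CA, you may
    # also need ctx.load_verify_locations(<that CA's bundle>).

    endpoint = "https://%s:12369/protogeni/xmlrpc/dynslice" % CM_HOST
    dynslice = xmlrpc.client.ServerProxy(endpoint, context=ctx)
    # The elasticslice methods described below can now be invoked on dynslice.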
In this model, the client may call the new CMv2 methods AddNodes and DeleteNodes to add and delete nodes from an existing sliver; the manifest and ticket are adjusted accordingly. If nodes are added, they are allocated to the user for the duration of the ticket.
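The sketch below shows the general shape of such calls, assuming the usual ProtoGENI convention of passing a single dictionary of parameters. The parameter names used here (slice_urn, credentials, rspec, nodes) are assumptions for illustration; take the real names and signatures from the CM’s method documentation.

    # Illustrative only: parameter names are assumptions based on the common
    # ProtoGENI calling convention, not a definitive AddNodes/DeleteNodes
    # signature.  `cm` is an xmlrpc.client.ServerProxy for the CM endpoint,
    # created the same way as the connection sketch above.
    def add_nodes(cm, slice_urn, slice_credential, rspec_fragment):
        # rspec_fragment: an rspec describing the nodes to add to the sliver.
        return cm.AddNodes({
            "slice_urn": slice_urn,
            "credentials": [slice_credential],
            "rspec": rspec_fragment,
        })

    def delete_nodes(cm, slice_urn, slice_credential, node_ids):
        # node_ids: the client-side names of the nodes to remove.
        return cm.DeleteNodes({
            "slice_urn": slice_urn,
            "credentials": [slice_credential],
            "nodes": node_ids,
        })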
However, the ability to add nodes incrementally opens up possibilities of resource abuse by eager clients, even if such abuse is unintended: users do not typically know how the CM is currently utilized, or what its future schedule and load will look like. Thus, the CM must be able to reduce a slice’s eager dynamic resource usage without completely terminating the sliver in question. This is a cooperative, best-effort dynamic usage mechanism. Of course, we still support a model in which users dynamically add nodes to an existing sliver and the server will not reclaim them, but those users may not be able to renew their slivers indefinitely the way cooperative users can. Fundamentally, if a user’s experiment can tolerate the addition of nodes, it may well be able to tolerate the removal of nodes.
Such an interface requires client-server cooperation, however, and it requires that the server be able to notify the client when it plans to delete one of the dynamically-added nodes in the experiment. We have chosen to ask the client to implement an XMLRPC API of its own and to tell the server if it wants to receive such notifications (although the client/user can choose not to participate in the cooperative scheme). We could instead have extended the server-side elasticslice API endpoint with a CheckForNotifications() method, but such polling exposes our XMLRPC server to heavier load, and it forces a client to poll often if it really wants to stay on top of what the server is going to do. We have designed the interface so that polling is not required; indeed, clients need not implement this server-style interface at all. If they do not, the server attempts to warn their nodes directly, by scheduling a shutdown to occur at the end of the timeout period the server waits after notifying a client that a node will be removed from its sliver. See the “elasticslice Client-side API endpoint” section below for details and help on implementing a cooperative server side in your client.
We believe that dynamically-resizeable slices are likely to be scripted in any case, since the size of the experiment presumably corresponds to the need for more compute cycles, more disk space, more network bandwidth, etc. For instance, consider an OpenStack experiment running on CloudLab. Suppose that the user initially allocated 5 compute nodes, but the load on the compute nodes grew; or suppose that certain tests required more compute nodes, but other future testing would not require as many. The experiment can be resized with more or fewer compute nodes depending on load or experiment progress, and such activity could be controlled by a script (a sketch of such a control loop follows).
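As a sketch of the kind of script described above, the loop below polls a load metric and grows or shrinks the sliver accordingly. The three callables it takes are hypothetical: the experimenter would implement them on top of their own monitoring and the AddNodes/DeleteNodes calls, and the thresholds are likewise arbitrary.

    import time

    def resize_loop(get_load, add_node, remove_node,
                    high_water=0.75, low_water=0.25, poll_interval=300):
        # Grow the sliver when measured load is high, shrink it when it is low.
        # get_load() might return mean CPU utilization across compute nodes;
        # add_node()/remove_node() would wrap the CM's AddNodes/DeleteNodes.
        while True:
            load = get_load()
            if load > high_water:
                add_node()
            elif load < low_water:
                remove_node()
            time.sleep(poll_interval)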
This interface is powerful, and it enables resource sharing between the ProtoGENI CM server that manages a ProtoGENI cluster and another entity that can temporarily put underused ProtoGENI resources to other purposes. For instance, at the University of Utah, we partner with our local HPC center, which temporarily uses unused Apt cluster (http://www.aptlab.net) nodes for HPC purposes. Utah’s HPC center uses Slurm as its cluster scheduler, and it has different scheduling policies, constraints, and workloads than the Apt cluster uses to manage its resources. We’ve written an intelligent client that uses this elasticslice interface to request nodes for Slurm when HPC load is high, and to release nodes when HPC load shrinks. Of course, if Apt load grows, the Apt CM elasticslice server may contact the client and notify it that it will reclaim nodes from HPC, even if HPC can still use them, according to server-side policy. The client is given a “vote” in this process: it can tell the elasticslice server which nodes it would prefer the server reclaim, but the server does not promise to honor those preferences (as it has its own constraints). Finally, the client receives pending-delete notifications, and it can then use the time before the server reclaims a node to stop scheduling Slurm jobs on that node, remove the node from Slurm, and sync back any user data on the node.
elasticslice architecture: main actors and components¶
In an elasticslice–ProtoGENI ecosphere, there are:
- a GENI user (with a certificate and SSH keypairs, belonging to a slice authority that is federated with one or more ProtoGENI CMs);
- an elasticslice-enabled ProtoGENI CM XMLRPC server (which implements the elastic AddNodes and DeleteNodes CM methods, as well as the entire dynslice Endpoint API described later);
- an elasticslice server daemon running on each elasticslice-enabled ProtoGENI CM, which is responsible for setting the server’s resource values and for reclaiming/deleting elastic nodes from slices as necessary, given its policies;
- an elasticslice client program run by the user that invokes ProtoGENI XMLRPC methods to create slices, add/delete nodes from them (and potentially, interact with the elasticslice server daemon, for best cooperative results);
- (optionally) an elasticslice client-side server run by the user, presumably in the same program as the client program, that receives notifications from the elasticslice server daemon and responds to them (e.g., by cleaning up a node that the server is going to reclaim/delete, and stopping its use of that node, before the server takes it back); a minimal sketch follows this list.
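As referenced in the last item above, here is a minimal sketch of what that optional client-side endpoint could look like, using Python’s standard SimpleXMLRPCServer. The method name NotifyDeletePending, its argument, and the port are placeholders; the actual method names, signatures, and authentication requirements are those given in the “elasticslice Client-side API endpoint” section.

    from xmlrpc.server import SimpleXMLRPCServer

    def notify_delete_pending(node_ids):
        # Called by the elasticslice server daemon before it reclaims nodes.
        # A Slurm-backed client would drain the nodes here (e.g., via
        # "scontrol update nodename=<id> state=drain"), sync user data off
        # of them, and stop scheduling jobs on them.
        for node in node_ids:
            print("cleaning up %s before the server reclaims it" % node)
        return 0

    # A real deployment would also need TLS and client authentication so
    # that only the elasticslice server daemon can call this endpoint.
    server = SimpleXMLRPCServer(("0.0.0.0", 8001))
    server.register_function(notify_delete_pending, "NotifyDeletePending")
    server.serve_forever()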
elasticslice source code: a brief tour¶
The elasticslice library was designed to let users quickly begin managing existing ProtoGENI slices and CloudLab experiments.
Resource values: how to know what to add or delete¶
Because the server and client may have completely different methods and algorithms for scheduling the use of their resources (e.g., ProtoGENI experiments are typically network-driven, whereas HPC jobs are CPU- and disk-I/O-bound, which necessitates different resource scheduling algorithms), they need a way to cooperate that provides good resource utilization for both sides but does not force either side to understand the other’s scheduling policies and mechanisms. For now, we have designed an interface by which the server and client send each other a simple floating-point value for each node they control; higher values represent higher valuation of a node. Each time the server updates its valuation list, it sends the client a message with the new resource values for all resources the cluster controls. Similarly, when the client updates its valuations of the resources it currently holds in its slice/experiment, it can send an update to the server.
By exchanging values, both sides can make good (hopefully optimal) decisions about the addition and deletion of nodes from slices. For instance, if the client adds a node that is low-valued by the server, the client can hope that this value will continue to hold into the future, and that the server will continue to view this node as having low value, and will thus be less likely to attempt to reclaim it from the client if the server needs more resources. Similarly, when the server does need more resources, it can attempt to choose a low-valued node from the client, modulo its own node priorities.
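As an illustration of the exchange, the sketch below computes a client-side value map. How values are computed is entirely up to each side (here, a Slurm-backed client simply values a node by how many jobs are running on it), and the call used to ship the map to the other side is shown only as a placeholder, not a real method name.

    def client_node_values(jobs_per_node):
        # Map each node to a float; higher means "I would rather keep this one."
        return {node: float(njobs) for node, njobs in jobs_per_node.items()}

    values = client_node_values({"node3": 0, "node7": 2, "node9": 5})
    # values == {"node3": 0.0, "node7": 2.0, "node9": 5.0}
    # The map would then be sent to the elasticslice server (and the server
    # sends the client its own per-node values in the other direction) via
    # the API's value-update method; the name below is purely a placeholder:
    # dynslice.SetResourceValues(values)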
Resource groups: expressing node dependencies¶
Not yet implemented
Values alone are not enough for some kinds of clients to manage their resources. Suppose, for instance, that the elasticslice server needs to reclaim/delete 3 nodes from an elastic slice whose elastic nodes are currently part of an HPC cluster and are running HPC jobs. Suppose further that the currently-running jobs are distributed in such a way that if the server picked 3 nodes to delete based solely on a combination of its own values and the client’s values for those nodes, deleting those 3 nodes would effectively kill all currently-running jobs. It would be much more effective if the client could communicate to the server that a node is in one or more groups, and that if the server selects a node from one group to reclaim/delete, it may as well select the other nodes from that group if it must reclaim more than one node.
We will allow clients to specify these node-group memberships, and the server will attempt to honor them. However, the server must take both its own constraints and the client’s into account in the decision, so this is an optimistic promise, not a hard guarantee.
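Since node groups are not yet implemented, the following is purely an illustration of the proposed idea: the client tags each node with the group(s) it belongs to (for example, the Slurm job it is running), so the server can see that reclaiming one member of a group makes the rest of that group fair game. The names and structure here are hypothetical.

    # Hypothetical group-membership map a client might report to the server.
    node_groups = {
        "node3":  ["mpi-job-42"],             # nodes running the same MPI job
        "node7":  ["mpi-job-42"],
        "node9":  ["mpi-job-42", "scratch"],  # a node may be in several groups
        "node12": [],                         # ungrouped: can be taken alone
    }

    def preferred_victims(node_groups, doomed_node):
        # Other nodes the server may as well reclaim if it reclaims doomed_node.
        doomed = set(node_groups[doomed_node])
        return [n for n, groups in node_groups.items()
                if n != doomed_node and doomed & set(groups)]

    # preferred_victims(node_groups, "node7") -> ["node3", "node9"]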
(Depending on how the first version of this performs, we will enhance or replace this mechanism.)