Managing Data Operating System

Define the device plug-in interface

You must define the plug-in class for implementing the DevicePlugin interface.

DevicePlugin Interface

/**
 * An interface that every vendor plug-in must implement.
 * */
public interface DevicePlugin {
  /**
   * Called first when device plugin framework wants to register.
   * @return DeviceRegisterRequest {@link DeviceRegisterRequest}
   * @throws Exception
   * */
  DeviceRegisterRequest getRegisterRequestInfo()
      throws Exception;

  /**
   * Called when the node resource is updated.
   * @return a set of {@link Device}, {@link java.util.TreeSet} recommended
   * @throws Exception
   * */
  Set<Device> getDevices() throws Exception;

  /**
   * Asks the plugin how these devices should be prepared or used
   * before or at container launch. A plugin can do some tasks on its own or
   * define them in DeviceRuntimeSpec to let the framework do them.
   * For instance, define {@code VolumeSpec} to let the
   * framework create a volume before running the container.
   *
   * @param allocatedDevices A set of allocated {@link Device}.
   * @param yarnRuntime Indicate which runtime YARN will use
   *        Could be {@code RUNTIME_DEFAULT} or {@code RUNTIME_DOCKER}
   *        in {@link DeviceRuntimeSpec} constants. The default means YARN's
   *        non-docker container runtime is used. The docker means YARN's
   *        docker container runtime is used.
   * @return a {@link DeviceRuntimeSpec} description about environment,
   * {@link VolumeSpec}, {@link MountVolumeSpec}, etc.
   * @throws Exception
   * */
  DeviceRuntimeSpec onDevicesAllocated(Set<Device> allocatedDevices,
      YarnRuntimeType yarnRuntime) throws Exception;

  /**
   * Called after device released.
   * @param releasedDevices A set of released devices
   * @throws Exception
   * */
  void onDevicesReleased(Set<Device> releasedDevices)
      throws Exception;
}
Property Description
getRegisterRequestInfo The plug-in uses this method to declare a new resource type name, which is then registered with the ResourceManager. The DeviceRegisterRequest returned by the method consists of a plug-in version and a resource type name, for example, nvidia.com/gpu.
getDevices This method returns the latest list of vendor devices on this Node Manager node. The returned list overrides the resource count pre-defined in node-resources.xml. It is recommended that the vendor plug-in manage, in its own configuration, which devices are allowed to be reported to YARN. YARN itself supports only a blacklist configuration, specified using the devices.denied-numbers parameter in the container-executor.cfg file. In this method, you can invoke a shell command or call a remote service over REST or RPC to get the list of devices whenever required.
Note
The Device object can describe a fake device. If the major device number, minor device number, and device path are blank, the framework does not isolate the device. This lets you define a fake device without real hardware.
onDevicesAllocated This method tells the framework how to use the allocated devices. The Node Manager invokes it so that the plug-in can run preparation tasks, such as creating a volume, before the container launches, and can describe how to expose the devices to the container at launch. The plug-in returns this description as a DeviceRuntimeSpec. For example, DeviceRuntimeSpec can describe container launch requirements such as environment variables, device and volume mounts, Docker runtime type, and so on.
onDevicesReleased The plug-in uses this method to do clean-up work, such as a device reset, after the devices allocated to a container are released.
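
The following is a minimal sketch of how the four methods described above fit together. So that it compiles on its own, it does not use the YARN classes: SketchDevicePlugin, SketchRegisterRequest, SketchDevice, SketchRuntimeSpec, and SketchRuntimeType are simplified, hypothetical stand-ins for DevicePlugin, DeviceRegisterRequest, Device, DeviceRuntimeSpec, and YarnRuntimeType. The plug-in version "v1", the device paths, and the EXAMPLE_VISIBLE_DEVICES environment variable are also assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Example plug-in written against the stand-in types declared further down.
class ExampleGpuPlugin implements SketchDevicePlugin {
  // Records which device ids were cleaned up (illustration only).
  final Set<Integer> releasedIds = new TreeSet<>();

  @Override public SketchRegisterRequest getRegisterRequestInfo() {
    // Resource type name taken from the text; the version string is made up.
    return new SketchRegisterRequest("nvidia.com/gpu", "v1");
  }

  @Override public Set<SketchDevice> getDevices() {
    // A real plug-in might run a shell command or call a remote service here.
    Set<SketchDevice> devices = new TreeSet<>();
    devices.add(new SketchDevice(0, "/dev/example0"));
    devices.add(new SketchDevice(1, "/dev/example1"));
    return devices;
  }

  @Override public SketchRuntimeSpec onDevicesAllocated(
      Set<SketchDevice> allocatedDevices, SketchRuntimeType yarnRuntime) {
    SketchRuntimeSpec spec = new SketchRuntimeSpec();
    if (yarnRuntime == SketchRuntimeType.RUNTIME_DOCKER) {
      // Hypothetical variable: tells the container which device ids it may use.
      String ids = allocatedDevices.stream()
          .map(d -> String.valueOf(d.id))
          .collect(Collectors.joining(","));
      spec.envs.put("EXAMPLE_VISIBLE_DEVICES", ids);
    }
    return spec;
  }

  @Override public void onDevicesReleased(Set<SketchDevice> releasedDevices) {
    // Clean-up work such as a device reset would go here.
    for (SketchDevice d : releasedDevices) {
      releasedIds.add(d.id);
    }
  }

  public static void main(String[] args) {
    ExampleGpuPlugin plugin = new ExampleGpuPlugin();
    Set<SketchDevice> devices = plugin.getDevices();
    SketchRuntimeSpec spec =
        plugin.onDevicesAllocated(devices, SketchRuntimeType.RUNTIME_DOCKER);
    System.out.println(plugin.getRegisterRequestInfo().resourceName + " " + spec.envs);
    plugin.onDevicesReleased(devices);
  }
}

// Stand-in types (hypothetical, NOT the YARN classes) so the sketch compiles alone.
enum SketchRuntimeType { RUNTIME_DEFAULT, RUNTIME_DOCKER }

class SketchRegisterRequest {
  final String resourceName;
  final String pluginVersion;
  SketchRegisterRequest(String resourceName, String pluginVersion) {
    this.resourceName = resourceName;
    this.pluginVersion = pluginVersion;
  }
}

class SketchDevice implements Comparable<SketchDevice> {
  final int id;
  final String devPath;
  SketchDevice(int id, String devPath) { this.id = id; this.devPath = devPath; }
  @Override public int compareTo(SketchDevice other) {
    return Integer.compare(id, other.id);
  }
}

class SketchRuntimeSpec {
  final Map<String, String> envs = new LinkedHashMap<>();
}

// Same four methods as DevicePlugin, expressed with the stand-in types.
interface SketchDevicePlugin {
  SketchRegisterRequest getRegisterRequestInfo() throws Exception;
  Set<SketchDevice> getDevices() throws Exception;
  SketchRuntimeSpec onDevicesAllocated(Set<SketchDevice> allocatedDevices,
      SketchRuntimeType yarnRuntime) throws Exception;
  void onDevicesReleased(Set<SketchDevice> releasedDevices) throws Exception;
}
```

In an actual plug-in, the class would implement DevicePlugin directly and return the framework's own DeviceRegisterRequest, Device, and DeviceRuntimeSpec objects. The sketch returns a TreeSet from getDevices because the interface's Javadoc recommends one.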