Apple Metal Support
David Anderson edited this page 2023-12-06 16:14:49 -08:00

Introduction

Hardware

Apple computers can have several types of GPUs:

  • Integrated GPUs: built into the same chip as the CPU. For example, Apple Silicon chips (such as M1 and M2) have built-in GPUs. This architecture has the advantage that CPU and GPU can share memory efficiently.
  • Discrete GPUs connected by Thunderbolt. These are called 'external GPUs' (eGPUs). They are typically NVIDIA or AMD. According to Apple, only Intel Macs support eGPUs.
  • Discrete GPUs connected by PCIe. There are signs that Apple is phasing these out too.

Software

macOS lets applications access GPUs via two APIs:

  • OpenCL. As of now, this is included with macOS. Apple deprecated it in macOS 10.14 and at some point will no longer include it. It's possible that it will be available from elsewhere.

  • Metal is Apple's replacement for OpenCL and OpenGL. See also Wikipedia.

There's support for accessing NVIDIA GPUs via CUDA, but it looks like this is being phased out. I don't think AMD's CAL API has ever been supported.

BOINC: current

How GPUs (and other coprocessors) are enumerated

BOINC uses struct COPROC to describe either

  • a single GPU (or coprocessor)
  • a set of (nearly) identical GPUs.

There are derived classes COPROC_NVIDIA, COPROC_ATI, and COPROC_INTEL that contain additional vendor-specific info (like CUDA capabilities, CAL version #, etc.).

The class COPROCS describes all the processing resources on the host. It has an array of COPROC objects; the first element represents the CPU, which may be usable via OpenCL. It has separate vendor-specific objects for NVIDIA, ATI, and Intel.
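As a rough picture of the layout described above, here is a minimal sketch. The type and field names (Coproc, Coprocs, peak_flops, add(), and so on) are simplified stand-ins chosen for illustration; they are not copied from the BOINC source, which has many more fields.

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for COPROC: one GPU, or a set of
// (nearly) identical GPUs. Field names are assumptions.
struct Coproc {
    std::string type;          // e.g. "CPU", "NVIDIA", "ATI"
    int count = 0;             // number of instances this object represents
    double peak_flops = 0;
    bool have_opencl = false;
};

// Vendor-specific subclasses carry extra info.
struct CoprocNvidia : Coproc {
    int cuda_version = 0;      // hypothetical field
};
struct CoprocAti : Coproc {
    std::string cal_version;   // hypothetical field
};
struct CoprocIntel : Coproc {};

// Illustrative stand-in for COPROCS: all processing resources on the host.
struct Coprocs {
    std::vector<Coproc> coprocs;   // element 0 represents the CPU
    CoprocNvidia nvidia;
    CoprocAti ati;
    CoprocIntel intel_gpu;

    Coprocs() {
        Coproc cpu;
        cpu.type = "CPU";
        cpu.count = 1;
        coprocs.push_back(cpu);
    }

    // Append a coprocessor to the generic array, skipping absent ones.
    // Only the generic part is copied; vendor-specific fields stay in
    // the vendor-specific member.
    void add(const Coproc& c) {
        if (c.count > 0) coprocs.push_back(c);
    }

    int n_rsc() const { return static_cast<int>(coprocs.size()); }
};
```

In this sketch the generic array holds only what the rest of the client needs to treat all resources uniformly, while vendor details stay in the vendor objects.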

The client enumerates GPUs using three APIs:

  • CAL (AMD)
  • CUDA (NVIDIA)
  • OpenCL

The top-level function, COPROCS::detect_gpus(), first calls functions to enumerate NVIDIA and then ATI GPUs. Each of these fills a global vector of objects: nvidia_gpus and ati_gpus respectively.

The COPROCS::get_opencl() function loops over platforms (i.e. vendors). For each vendor it enumerates CPUs, then GPUs and accelerators. For each NVIDIA or ATI GPU it tries to match the device up with an entry in the nvidia_gpus or ati_gpus vector. For each Intel GPU it adds a record to a global vector intel_gpus.
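The matching step can be sketched roughly as follows. This is only an illustration: the struct names, the vendor strings, and the rule of matching on device name are assumptions made for the example, and the real client may use different criteria.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A GPU already found by a vendor API (CUDA or CAL). Hypothetical fields.
struct VendorGpu {
    std::string name;
    bool have_opencl = false;
    std::size_t opencl_mem = 0;
};

// A device reported by OpenCL. Hypothetical fields.
struct OpenclDevice {
    std::string vendor;
    std::string name;
    std::size_t global_mem;
};

struct GpuLists {
    std::vector<VendorGpu> nvidia, ati, intel;
};

// Attach OpenCL info to the first not-yet-matched entry with the same name.
// Returns true if a match was found.
bool attach_opencl(std::vector<VendorGpu>& gpus, const OpenclDevice& dev) {
    for (auto& g : gpus) {
        if (!g.have_opencl && g.name == dev.name) {
            g.have_opencl = true;
            g.opencl_mem = dev.global_mem;
            return true;
        }
    }
    return false;
}

// NVIDIA and ATI devices are matched against the vendor lists;
// Intel devices have no vendor list, so a new record is added.
void handle_opencl_gpu(GpuLists& lists, const OpenclDevice& dev) {
    if (dev.vendor == "NVIDIA") {
        attach_opencl(lists.nvidia, dev);
    } else if (dev.vendor == "AMD") {
        attach_opencl(lists.ati, dev);
    } else if (dev.vendor == "Intel") {
        VendorGpu g;
        g.name = dev.name;
        g.have_opencl = true;
        g.opencl_mem = dev.global_mem;
        lists.intel.push_back(g);
    }
}
```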

Next, COPROCS::correlate_gpus() reduces each vector (e.g. nvidia_gpus) to a single COPROC object. It calls vendor-specific functions, e.g. COPROC_NVIDIA::correlate(). Each of these identifies the most powerful instance (on the basis of memory, FLOPS, etc.) and then finds the instances that are close to it in all these dimensions. The result is a single COPROC object with a count field indicating the number of such instances.
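Here is a minimal sketch of that reduction, under stated assumptions: only two dimensions (FLOPS and memory) are compared, and "close" means within a fixed fraction of the best instance. The 0.7 tolerance and all names are made up for the example; the actual vendor-specific correlate() functions use their own criteria.

```cpp
#include <algorithm>
#include <vector>

// One detected GPU instance. Hypothetical fields.
struct GpuInstance {
    double peak_flops = 0;
    double mem_bytes = 0;
    bool ignore = false;   // set if the instance is not close to the best
};

// The single summary object: the best instance plus a count.
struct Summary {
    GpuInstance best;
    int count = 0;
};

// Pick the most powerful instance, then count the instances that are
// within `tolerance` of it in every dimension. The rest are marked ignored.
Summary correlate(std::vector<GpuInstance>& gpus, double tolerance = 0.7) {
    Summary s;
    if (gpus.empty()) return s;

    auto best = std::max_element(
        gpus.begin(), gpus.end(),
        [](const GpuInstance& a, const GpuInstance& b) {
            if (a.peak_flops != b.peak_flops) return a.peak_flops < b.peak_flops;
            return a.mem_bytes < b.mem_bytes;
        });
    s.best = *best;

    for (auto& g : gpus) {
        bool close = g.peak_flops >= tolerance * s.best.peak_flops &&
                     g.mem_bytes >= tolerance * s.best.mem_bytes;
        g.ignore = !close;
        if (close) ++s.count;
    }
    return s;
}
```

With this rule, a host with two similar high-end GPUs and one much weaker GPU ends up with a count of 2, and the weak GPU is ignored.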

How GPUs are identified

(to be completed)

How apps know what GPU to use

(to be completed)

BOINC: proposal

To simplify things, let's have these goals:

  • Be able to use integrated GPUs (Apple Silicon and Intel) from either Metal or OpenCL.
  • Use a single name ('Apple_M') for all Apple Silicon GPUs.

For now, let's NOT have the goal of accessing discrete GPUs (NVIDIA, ATI) via Metal. This would add a lot of complexity because of the need to correlate multiple GPU instances. For now, these GPUs can be used via OpenCL (and possibly CUDA in the case of NVIDIA). By the time Apple removes OpenCL, discrete GPUs will probably have little total power compared to Apple Silicon GPUs.

So here's what we need to do:

  • Add Metal-specific info (versions, capabilities) to COPROC.
  • Add a new class COPROC_APPLE.
  • Pick a name for Apple Silicon GPUs, e.g. Apple_M.
  • Call Metal to enumerate GPUs. Ignore all except Apple Silicon and Intel.
  • If OpenCL detects an Apple Silicon or Intel GPU, and it was detected via Metal, copy the OpenCL info to the COPROC_APPLE or COPROC_INTEL object.
  • Adopt a convention for Metal plan class names, and add logic at various places in the server code.
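To make the steps above concrete, here is a rough sketch of the Apple Silicon part. Everything in it is an assumption for illustration: MetalDeviceInfo stands in for data the client would read from Metal (for example the name and isRemovable properties of each MTLDevice returned by MTLCopyAllDevices()), the rule that Apple Silicon device names start with "Apple" is a guess that would need checking, and CoprocApple is only a placeholder for the proposed COPROC_APPLE class.

```cpp
#include <string>
#include <vector>

// Stand-in for what we'd read from a Metal device. Hypothetical struct.
struct MetalDeviceInfo {
    std::string name;
    bool removable;    // true for eGPUs
};

enum class GpuKind { AppleSilicon, Intel, Other };

// Classify a Metal device by name. The name patterns are assumptions.
GpuKind classify(const MetalDeviceInfo& d) {
    if (d.removable) return GpuKind::Other;
    if (d.name.rfind("Apple", 0) == 0) return GpuKind::AppleSilicon;
    if (d.name.find("Intel") != std::string::npos) return GpuKind::Intel;
    return GpuKind::Other;
}

// Placeholder for the proposed COPROC_APPLE. Hypothetical fields.
struct CoprocApple {
    std::string type = "Apple_M";   // single name for all Apple Silicon GPUs
    int count = 0;
    std::string model;              // e.g. "Apple M2"
    bool have_metal = false;
    bool have_opencl = false;
    std::string opencl_device_version;
};

// Build the Apple coprocessor record from the Metal device list,
// ignoring everything except Apple Silicon GPUs.
CoprocApple detect_apple(const std::vector<MetalDeviceInfo>& devices) {
    CoprocApple cp;
    for (const auto& d : devices) {
        if (classify(d) != GpuKind::AppleSilicon) continue;
        if (cp.count == 0) cp.model = d.name;
        ++cp.count;
        cp.have_metal = true;
    }
    return cp;
}

// Copy OpenCL info only if the GPU was already detected via Metal.
bool merge_opencl(CoprocApple& cp, const std::string& opencl_version) {
    if (!cp.have_metal) return false;
    cp.have_opencl = true;
    cp.opencl_device_version = opencl_version;
    return true;
}
```

Intel GPUs would follow the same pattern with COPROC_INTEL in place of CoprocApple.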

Questions

If there are multiple Apple Silicon chips, what does Metal return? How would we distinguish between them? How could the client tell an app to use a particular chip?