drm/i915 Intel GFX Driver

The drm/i915 driver supports all (with the exception of some very early models) integrated GFX chipsets with both Intel display and rendering blocks. This excludes a set of SoC platforms with an SGX rendering unit; those have basic support through the gma500 drm driver.

Core Driver Infrastructure

This section covers core driver infrastructure used by both the display and the GEM parts of the driver.

Runtime Power Management

The i915 driver supports dynamic enabling and disabling of entire hardware blocks at runtime. This is especially important on the display side where software is supposed to control many power gates manually on recent hardware, since on the GT side a lot of the power management is done by the hardware. But even there some manual control at the device level is required.

Since i915 supports a diverse set of platforms with a unified codebase and hardware engineers just love to shuffle functionality around between power domains there’s a sizeable amount of indirection required. This file provides generic functions to the driver for grabbing and releasing references for abstract power domains. It then maps those to the actual power wells present for a given platform.
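The indirection between abstract power domains and physical power wells can be pictured with a small user-space model. This is only a sketch: the domain names, the two-well platform table and every demo_* identifier are invented for illustration and are not part of the driver.

```c
#include <stdint.h>

/* Hypothetical abstract power domains (not the driver's real list). */
enum demo_domain {
	DEMO_DOMAIN_PIPE_A,
	DEMO_DOMAIN_PIPE_B,
	DEMO_DOMAIN_AUDIO,
};

struct demo_well {
	uint32_t domains;	/* bitmask of domains this well serves */
	int refcount;		/* held domain references that need this well */
};

/* Made-up platform table: well 0 serves pipe A, well 1 serves pipe B and audio. */
static struct demo_well demo_wells[] = {
	{ 1u << DEMO_DOMAIN_PIPE_A, 0 },
	{ (1u << DEMO_DOMAIN_PIPE_B) | (1u << DEMO_DOMAIN_AUDIO), 0 },
};
#define DEMO_WELL_COUNT (sizeof(demo_wells) / sizeof(demo_wells[0]))

/* Grab a domain reference: every well backing the domain gains a reference. */
void demo_domain_get(enum demo_domain d)
{
	for (unsigned int i = 0; i < DEMO_WELL_COUNT; i++)
		if (demo_wells[i].domains & (1u << d))
			demo_wells[i].refcount++;	/* 0 -> 1 is where hardware would power up */
}

/* Release a domain reference: the backing wells lose a reference. */
void demo_domain_put(enum demo_domain d)
{
	for (unsigned int i = 0; i < DEMO_WELL_COUNT; i++)
		if (demo_wells[i].domains & (1u << d))
			demo_wells[i].refcount--;	/* 1 -> 0 is where hardware would power down */
}

/* A well is modeled as powered while at least one reference is held. */
int demo_well_is_on(unsigned int well)
{
	return demo_wells[well].refcount > 0;
}
```

In this model a different platform only needs a different demo_wells table; callers keep using the same domain names.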

intel_wakeref_t intel_runtime_pm_get_raw(struct intel_runtime_pm * rpm)

grab a raw runtime pm reference

Parameters

struct intel_runtime_pm * rpm
the intel_runtime_pm structure

Description

This is the unlocked version of intel_display_power_is_enabled() and should only be used from error capture and recovery code where deadlocks are possible. This function grabs a device-level runtime pm reference (mostly used for asynchronous PM management from display code) and ensures that it is powered up. Raw references are not considered during wakelock assert checks.

Any runtime pm reference obtained by this function must have a symmetric call to intel_runtime_pm_put_raw() to release the reference again.

Return

the wakeref cookie to pass to intel_runtime_pm_put_raw(), evaluates as True if the wakeref was acquired, or False otherwise.

intel_wakeref_t intel_runtime_pm_get(struct intel_runtime_pm * rpm)

grab a runtime pm reference

Parameters

struct intel_runtime_pm * rpm
the intel_runtime_pm structure

Description

This function grabs a device-level runtime pm reference (mostly used for GEM code to ensure the GTT or GT is on) and ensures that it is powered up.

Any runtime pm reference obtained by this function must have a symmetric call to intel_runtime_pm_put() to release the reference again.

Return

the wakeref cookie to pass to intel_runtime_pm_put()

intel_wakeref_t __intel_runtime_pm_get_if_active(struct intel_runtime_pm * rpm, bool ignore_usecount)

grab a runtime pm reference if device is active

Parameters

struct intel_runtime_pm * rpm
the intel_runtime_pm structure
bool ignore_usecount
get a ref even if dev->power.usage_count is 0

Description

This function grabs a device-level runtime pm reference if the device is already active and ensures that it is powered up. It is illegal to try and access the HW should intel_runtime_pm_get_if_active() report failure.

If ignore_usecount is true, a reference will be acquired even if there is no user requiring the device to be powered up (dev->power.usage_count == 0). If the function returns false in this case then it’s guaranteed that the device’s runtime suspend hook has been called already or that it will be called (and hence it’s also guaranteed that the device’s runtime resume hook will be called eventually).

Any runtime pm reference obtained by this function must have a symmetric call to intel_runtime_pm_put() to release the reference again.

Return

the wakeref cookie to pass to intel_runtime_pm_put(), evaluates as True if the wakeref was acquired, or False otherwise.

intel_wakeref_t intel_runtime_pm_get_noresume(struct intel_runtime_pm * rpm)

grab a runtime pm reference

Parameters

struct intel_runtime_pm * rpm
the intel_runtime_pm structure

Description

This function grabs a device-level runtime pm reference (mostly used for GEM code to ensure the GTT or GT is on).

It will _not_ power up the device but instead only check that it’s powered on. Therefore it is only valid to call this function from contexts where the device is known to be powered up and where trying to power it up would result in hilarity and deadlocks. That pretty much means only the system suspend/resume code where this is used to grab runtime pm references for delayed setup down in work items.

Any runtime pm reference obtained by this function must have a symmetric call to intel_runtime_pm_put() to release the reference again.

Return

the wakeref cookie to pass to intel_runtime_pm_put()

void intel_runtime_pm_put_raw(struct intel_runtime_pm * rpm, intel_wakeref_t wref)

release a raw runtime pm reference

Parameters

struct intel_runtime_pm * rpm
the intel_runtime_pm structure
intel_wakeref_t wref
wakeref acquired for the reference that is being released

Description

This function drops the device-level runtime pm reference obtained by intel_runtime_pm_get_raw() and might power down the corresponding hardware block right away if this is the last reference.

void intel_runtime_pm_put_unchecked(struct intel_runtime_pm * rpm)

release an unchecked runtime pm reference

Parameters

struct intel_runtime_pm * rpm
the intel_runtime_pm structure

Description

This function drops the device-level runtime pm reference obtained by intel_runtime_pm_get() and might power down the corresponding hardware block right away if this is the last reference.

This function exists only for historical reasons and should be avoided in new code, as the correctness of its use cannot be checked. Always use intel_runtime_pm_put() instead.

void intel_runtime_pm_put(struct intel_runtime_pm * rpm, intel_wakeref_t wref)

release a runtime pm reference

Parameters

struct intel_runtime_pm * rpm
the intel_runtime_pm structure
intel_wakeref_t wref
wakeref acquired for the reference that is being released

Description

This function drops the device-level runtime pm reference obtained by intel_runtime_pm_get() and might power down the corresponding hardware block right away if this is the last reference.
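The symmetric get/put rule and the "evaluates as true if acquired" cookie convention can be illustrated with a user-space model. The demo_* names, the counter-based cookie and the power flag are assumptions made for this sketch; the driver's actual wakeref tracking is not reproduced here.

```c
#include <stdbool.h>

typedef unsigned long demo_wakeref_t;	/* hypothetical stand-in for a wakeref cookie */

struct demo_rpm {
	int usage_count;		/* outstanding references */
	bool powered;			/* whether the modeled device is powered up */
	demo_wakeref_t next_cookie;	/* source of non-zero cookies */
};

/* Take a reference, powering the modeled device up on the first one. */
demo_wakeref_t demo_rpm_get(struct demo_rpm *rpm)
{
	if (rpm->usage_count++ == 0)
		rpm->powered = true;
	return ++rpm->next_cookie;	/* non-zero, so it evaluates as true */
}

/* Take a reference only if the device is already in use; 0 signals failure. */
demo_wakeref_t demo_rpm_get_if_in_use(struct demo_rpm *rpm)
{
	if (rpm->usage_count == 0)
		return 0;
	rpm->usage_count++;
	return ++rpm->next_cookie;
}

/* Drop a reference; the last put powers the modeled device down. */
void demo_rpm_put(struct demo_rpm *rpm, demo_wakeref_t wref)
{
	(void)wref;	/* a real tracker would validate the cookie here */
	if (--rpm->usage_count == 0)
		rpm->powered = false;
}
```

A caller in this model must check the cookie from demo_rpm_get_if_in_use() before touching the device, and must pass every successful cookie back to demo_rpm_put() exactly once.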

void intel_runtime_pm_enable(struct intel_runtime_pm * rpm)

enable runtime pm

Parameters

struct intel_runtime_pm * rpm
the intel_runtime_pm structure

Description

This function enables runtime pm at the end of the driver load sequence.

Note that this function does not currently enable runtime pm for the subordinate display power domains. That is done by intel_power_domains_enable().

void intel_uncore_forcewake_get(struct intel_uncore * uncore, enum forcewake_domains fw_domains)

grab forcewake domain references

Parameters

struct intel_uncore * uncore
the intel_uncore structure
enum forcewake_domains fw_domains
forcewake domains to get reference on

Description

This function can be used to get the GT’s forcewake domain references. Normal register access will handle the forcewake domains automatically. However, if some sequence requires the GT to not power down particular forcewake domains, this function should be called at the beginning of the sequence. The reference should subsequently be dropped by a symmetric call to intel_uncore_forcewake_put(). Usually the caller wants all the domains to be kept awake, so fw_domains would then be FORCEWAKE_ALL.

void intel_uncore_forcewake_user_get(struct intel_uncore * uncore)

claim forcewake on behalf of userspace

Parameters

struct intel_uncore * uncore
the intel_uncore structure

Description

This function is a wrapper around intel_uncore_forcewake_get() to acquire the GT powerwell and in the process disable our debugging for the duration of userspace’s bypass.

void intel_uncore_forcewake_user_put(struct intel_uncore * uncore)

release forcewake on behalf of userspace

Parameters

struct intel_uncore * uncore
the intel_uncore structure

Description

This function complements intel_uncore_forcewake_user_get() and releases the GT powerwell taken on behalf of the userspace bypass.

void intel_uncore_forcewake_get__locked(struct intel_uncore * uncore, enum forcewake_domains fw_domains)

grab forcewake domain references

Parameters

struct intel_uncore * uncore
the intel_uncore structure
enum forcewake_domains fw_domains
forcewake domains to get reference on

Description

See intel_uncore_forcewake_get(). This variant places the onus on the caller to explicitly handle the dev_priv->uncore.lock spinlock.

void intel_uncore_forcewake_put(struct intel_uncore * uncore, enum forcewake_domains fw_domains)

release a forcewake domain reference

Parameters

struct intel_uncore * uncore
the intel_uncore structure
enum forcewake_domains fw_domains
forcewake domains to put references

Description

This function drops the device-level forcewakes for specified domains obtained by intel_uncore_forcewake_get().

void intel_uncore_forcewake_flush(struct intel_uncore * uncore, enum forcewake_domains fw_domains)

flush the delayed release

Parameters

struct intel_uncore * uncore
the intel_uncore structure
enum forcewake_domains fw_domains
forcewake domains to flush

void intel_uncore_forcewake_put__locked(struct intel_uncore * uncore, enum forcewake_domains fw_domains)

release forcewake domain references

Parameters

struct intel_uncore * uncore
the intel_uncore structure
enum forcewake_domains fw_domains
forcewake domains to put references

Description

See intel_uncore_forcewake_put(). This variant places the onus on the caller to explicitly handle the dev_priv->uncore.lock spinlock.

int __intel_wait_for_register_fw(struct intel_uncore * uncore, i915_reg_t reg, u32 mask, u32 value, unsigned int fast_timeout_us, unsigned int slow_timeout_ms, u32 * out_value)

wait until register matches expected state

Parameters

struct intel_uncore * uncore
the struct intel_uncore
i915_reg_t reg
the register to read
u32 mask
mask to apply to register value
u32 value
expected value
unsigned int fast_timeout_us
fast timeout in microseconds for atomic/tight wait
unsigned int slow_timeout_ms
slow timeout in milliseconds
u32 * out_value
optional placeholder to hold register value

Description

This routine waits until the target register reg contains the expected value after applying the mask, i.e. it waits until

(intel_uncore_read_fw(uncore, reg) & mask) == value

Otherwise, the wait will time out after slow_timeout_ms milliseconds. For atomic context, slow_timeout_ms must be zero and fast_timeout_us must not be larger than 20,000 microseconds.

Note that this routine assumes the caller holds forcewake asserted, so it is not suitable for very long waits. See intel_wait_for_register() if you wish to wait without holding forcewake for the duration (i.e. you expect the wait to be slow).

Return

0 if the register matches the desired condition, or -ETIMEDOUT.
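The mask-and-compare wait condition can be sketched in user space as follows. To keep the example deterministic, the microsecond and millisecond timeouts are replaced by a poll budget, and the register is a callback rather than MMIO; demo_wait_for_register(), demo_read_fn and demo_fake_status() are hypothetical names, not driver functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register reader; stands in for an MMIO read. */
typedef uint32_t (*demo_read_fn)(void *ctx);

/*
 * Poll until (read_reg() & mask) == value or max_polls reads have been made.
 * The poll budget is this sketch's replacement for real timeouts.
 */
int demo_wait_for_register(demo_read_fn read_reg, void *ctx, uint32_t mask,
			   uint32_t value, unsigned int max_polls,
			   uint32_t *out_value)
{
	uint32_t reg = 0;
	int ret = -ETIMEDOUT;

	for (unsigned int i = 0; i < max_polls; i++) {
		reg = read_reg(ctx);
		if ((reg & mask) == value) {
			ret = 0;
			break;
		}
	}
	if (out_value)
		*out_value = reg;	/* last value read, even on timeout */
	return ret;
}

/* Fake status register: bit 0 ("ready") becomes set on the third read. */
uint32_t demo_fake_status(void *ctx)
{
	unsigned int *reads = ctx;

	return ++(*reads) >= 3 ? 0xf1u : 0xf0u;
}
```

The mask lets the caller ignore unrelated bits (the upper nibble in the fake register) and wait on the ready bit alone.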

int __intel_wait_for_register(struct intel_uncore * uncore, i915_reg_t reg, u32 mask, u32 value, unsigned int fast_timeout_us, unsigned int slow_timeout_ms, u32 * out_value)

wait until register matches expected state

Parameters

struct intel_uncore * uncore
the struct intel_uncore
i915_reg_t reg
the register to read
u32 mask
mask to apply to register value
u32 value
expected value
unsigned int fast_timeout_us
fast timeout in microseconds for atomic/tight wait
unsigned int slow_timeout_ms
slow timeout in milliseconds
u32 * out_value
optional placeholder to hold register value

Description

This routine waits until the target register reg contains the expected value after applying the mask, i.e. it waits until

(intel_uncore_read(uncore, reg) & mask) == value

Otherwise, the wait will time out after slow_timeout_ms milliseconds.

Return

0 if the register matches the desired condition, or -ETIMEDOUT.

enum forcewake_domains intel_uncore_forcewake_for_reg(struct intel_uncore * uncore, i915_reg_t reg, unsigned int op)

which forcewake domains are needed to access a register

Parameters

struct intel_uncore * uncore
pointer to struct intel_uncore
i915_reg_t reg
register in question
unsigned int op
operation bitmask of FW_REG_READ and/or FW_REG_WRITE

Description

Returns the set of forcewake domains that must be taken, for example with intel_uncore_forcewake_get(), for the specified register to be accessible in the specified mode (read, write or read/write) with raw mmio accessors.

NOTE

On Gen6 and Gen7 write forcewake domain (FORCEWAKE_RENDER) requires the callers to do FIFO management on their own or risk losing writes.
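One way to picture a register-to-domain query is a range table keyed by register offset. The table below is entirely made up (offsets, domains and the DEMO_* flags are illustrative assumptions), and the driver's real lookup structures are not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_FW_RENDER	(1u << 0)
#define DEMO_FW_MEDIA	(1u << 1)
#define DEMO_REG_READ	(1u << 0)
#define DEMO_REG_WRITE	(1u << 1)

struct demo_fw_range {
	uint32_t start, end;		/* inclusive register offset range */
	uint32_t read_domains;		/* domains needed to read */
	uint32_t write_domains;		/* domains needed to write */
};

/* Made-up table; offsets and domain assignments are illustrative only. */
static const struct demo_fw_range demo_ranges[] = {
	{ 0x2000, 0x2fff, DEMO_FW_RENDER, DEMO_FW_RENDER },
	{ 0x12000, 0x12fff, DEMO_FW_MEDIA, DEMO_FW_MEDIA },
	{ 0x40000, 0x4ffff, 0, DEMO_FW_RENDER },	/* needs a wake only for writes */
};

/* Return the domains needed for the requested operation bitmask. */
uint32_t demo_forcewake_for_reg(uint32_t offset, unsigned int op)
{
	uint32_t domains = 0;

	for (size_t i = 0; i < sizeof(demo_ranges) / sizeof(demo_ranges[0]); i++) {
		const struct demo_fw_range *r = &demo_ranges[i];

		if (offset < r->start || offset > r->end)
			continue;
		if (op & DEMO_REG_READ)
			domains |= r->read_domains;
		if (op & DEMO_REG_WRITE)
			domains |= r->write_domains;
	}
	return domains;	/* 0 means no forcewake is needed in this model */
}
```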

Interrupt Handling

These functions provide the basic support for enabling and disabling the interrupt handling support. There’s a lot more functionality in i915_irq.c and related files, but that will be described in separate chapters.

void intel_irq_init(struct drm_i915_private * dev_priv)

initializes irq support

Parameters

struct drm_i915_private * dev_priv
i915 device instance

Description

This function initializes all the irq support including work items, timers and all the vtables. It does not set up the interrupt itself though.

void intel_runtime_pm_disable_interrupts(struct drm_i915_private * dev_priv)

runtime interrupt disabling

Parameters

struct drm_i915_private * dev_priv
i915 device instance

Description

This function is used to disable interrupts at runtime, both in the runtime pm and the system suspend/resume code.

void intel_runtime_pm_enable_interrupts(struct drm_i915_private * dev_priv)

runtime interrupt enabling

Parameters

struct drm_i915_private * dev_priv
i915 device instance

Description

This function is used to enable interrupts at runtime, both in the runtime pm and the system suspend/resume code.

Intel GVT-g Guest Support (vGPU)

Intel GVT-g is a graphics virtualization technology which shares the GPU among multiple virtual machines on a time-sharing basis. Each virtual machine is presented with a virtual GPU (vGPU), which has features equivalent to the underlying physical GPU (pGPU), so the i915 driver can run seamlessly in a virtual machine. This file provides vGPU specific optimizations when running in a virtual machine, to reduce the complexity of vGPU emulation and to improve the overall performance.

A primary function introduced here is the so-called “address space ballooning” technique. Intel GVT-g partitions global graphics memory among multiple VMs, so each VM can directly access a portion of the memory without the hypervisor’s intervention, e.g. filling textures or queuing commands. However, with the partitioning an unmodified i915 driver would assume a smaller graphics memory starting from address ZERO, which would then require the vGPU emulation module to translate graphics addresses between ‘guest view’ and ‘host view’ for all registers and command opcodes which contain a graphics memory address. To reduce the complexity, Intel GVT-g introduces “address space ballooning”: it tells each guest i915 driver the exact partitioning, and the guest driver then reserves the non-allocated portions so that nothing is allocated from them. Thus the vGPU emulation module only needs to scan and validate graphics addresses, without the complexity of address translation.

void intel_vgpu_detect(struct drm_i915_private * dev_priv)

detect virtual GPU

Parameters

struct drm_i915_private * dev_priv
i915 device private

Description

This function is called at the initialization stage, to detect whether running on a vGPU.

void intel_vgt_deballoon(struct i915_ggtt * ggtt)

deballoon reserved graphics address trunks

Parameters

struct i915_ggtt * ggtt
the global GGTT from which we reserved earlier

Description

This function is called to deallocate the ballooned-out graphic memory, when driver is unloaded or when ballooning fails.

int intel_vgt_balloon(struct i915_ggtt * ggtt)

balloon out reserved graphics address trunks

Parameters

struct i915_ggtt * ggtt
the global GGTT from which to reserve

Description

This function is called at the initialization stage, to balloon out the graphic address space allocated to other vGPUs, by marking these spaces as reserved. The ballooning related knowledge (starting address and size of the mappable/unmappable graphic memory) is described in the vgt_if structure in a reserved mmio range.

To give an example, the drawing below depicts one typical scenario after ballooning. Here vGPU1 has 2 pieces of graphic address space ballooned out, each for the mappable and the non-mappable part. From the vGPU1 point of view, the total size is the same as the physical one, with the start address of its graphic space being zero. Yet there are some portions ballooned out (the shaded part, which is marked as reserved by the drm allocator). From the host point of view, the graphic address space is partitioned by multiple vGPUs in different VMs.

                       vGPU1 view         Host view
            0 ------> +-----------+     +-----------+
              ^       |###########|     |   vGPU3   |
              |       |###########|     +-----------+
              |       |###########|     |   vGPU2   |
              |       +-----------+     +-----------+
       mappable GM    | available | ==> |   vGPU1   |
              |       +-----------+     +-----------+
              |       |###########|     |           |
              v       |###########|     |   Host    |
              +=======+===========+     +===========+
              ^       |###########|     |   vGPU3   |
              |       |###########|     +-----------+
              |       |###########|     |   vGPU2   |
              |       +-----------+     +-----------+
     unmappable GM    | available | ==> |   vGPU1   |
              |       +-----------+     +-----------+
              |       |###########|     |           |
              |       |###########|     |   Host    |
              v       |###########|     |           |
total GM size ------> +-----------+     +-----------+

Return

zero on success, non-zero if configuration invalid or ballooning failed
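The arithmetic behind the drawing can be sketched as follows: given the guest's mappable and unmappable windows, everything else in the address space is reserved. The function and type names are hypothetical, and the validation rules are assumptions chosen to match the layout in the drawing rather than the driver's exact checks.

```c
#include <stdint.h>

struct demo_range { uint64_t start, end; };	/* half-open [start, end) */

/*
 * Compute the up-to-four ranges to reserve around a guest's two windows.
 * Returns the number of non-empty ranges written to out[], or -1 if the
 * windows do not fit the layout shown in the drawing.
 */
int demo_balloon(uint64_t mappable_end, uint64_t total,
		 struct demo_range mappable, struct demo_range unmappable,
		 struct demo_range out[4])
{
	const struct demo_range holes[4] = {
		{ 0, mappable.start },			/* below the mappable window */
		{ mappable.end, mappable_end },		/* above it, still mappable */
		{ mappable_end, unmappable.start },	/* below the unmappable window */
		{ unmappable.end, total },		/* above it, up to total GM size */
	};
	int n = 0;

	if (mappable.start > mappable.end || mappable.end > mappable_end ||
	    unmappable.start < mappable_end ||
	    unmappable.start > unmappable.end || unmappable.end > total)
		return -1;

	for (int i = 0; i < 4; i++)
		if (holes[i].start < holes[i].end)
			out[n++] = holes[i];
	return n;
}
```

Each returned range corresponds to one shaded block in the vGPU1 view; a guest whose window touches a boundary simply gets fewer ranges.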

Intel GVT-g Host Support (vGPU device model)

Intel GVT-g is a graphics virtualization technology which shares the GPU among multiple virtual machines on a time-sharing basis. Each virtual machine is presented a virtual GPU (vGPU), which has equivalent features as the underlying physical GPU (pGPU), so i915 driver can run seamlessly in a virtual machine.

To virtualize GPU resources, the GVT-g driver depends on hypervisor technology, e.g. KVM/VFIO/mdev, Xen, etc., to provide resource access trapping capability so that the resources can be virtualized within the GVT-g device module. More architectural design documentation is available at https://01.org/group/2230/documentation-list.

int intel_gvt_init(struct drm_i915_private * dev_priv)

initialize GVT components

Parameters

struct drm_i915_private * dev_priv
drm i915 private data

Description

This function is called at the initialization stage to create a GVT device.

Return

Zero on success, negative error code if failed.

void intel_gvt_driver_remove(struct drm_i915_private * dev_priv)

cleanup GVT components when i915 driver is unbinding

Parameters

struct drm_i915_private * dev_priv
drm i915 private data

Description

This function is called at the i915 driver unloading stage, to shutdown GVT components and release the related resources.

void intel_gvt_resume(struct drm_i915_private * dev_priv)

GVT resume routine wrapper

Parameters

struct drm_i915_private * dev_priv
drm i915 private data

Description

This function is called at the i915 driver resume stage to restore the required HW status for GVT so that the vGPU can continue running after resume.

Workarounds

Error

kernel-doc missing

Display Hardware Handling

This section covers everything related to the display hardware including the mode setting infrastructure, plane, sprite and cursor handling and display, output probing and related topics.

Mode Setting Infrastructure

The i915 driver is thus far the only DRM driver which doesn’t use the common DRM helper code to implement mode setting sequences. Thus it has its own tailor-made infrastructure for executing a display configuration change.

Frontbuffer Tracking

Error

kernel-doc missing

Display FIFO Underrun Reporting

Error

kernel-doc missing

Plane Configuration

This section covers plane configuration and composition with the primary plane, sprites, cursors and overlays. This includes the infrastructure to do atomic vsync’ed updates of all this state and also tightly coupled topics like watermark setup and computation, framebuffer compression and panel self refresh.

Atomic Plane Helpers

Error

kernel-doc missing

Output Probing

This section covers output probing and related infrastructure like the hotplug interrupt storm detection and mitigation code. Note that the i915 driver still uses most of the common DRM helper code for output probing, so those sections fully apply.

Hotplug

Error

kernel-doc missing

High Definition Audio

Error

kernel-doc missing

struct i915_audio_component

Used for direct communication between i915 and hda drivers

Definition

struct i915_audio_component {
  struct drm_audio_component      base;
  int aud_sample_rate[MAX_PORTS];
};

Members

base
the drm_audio_component base class
aud_sample_rate
the array of audio sample rate per port

Intel HDMI LPE Audio Support

Error

kernel-doc missing

Panel Self Refresh (PSR/SRD)

Error

kernel-doc missing

Frame Buffer Compression (FBC)

Error

kernel-doc missing

Display Refresh Rate Switching (DRRS)

Error

kernel-doc missing

DPIO

Error

kernel-doc missing

CSR firmware support for DMC

Error

kernel-doc missing

Video BIOS Table (VBT)

Error

kernel-doc missing

Display clocks

Error

kernel-doc missing

Display PLLs

Error

kernel-doc missing

Memory Management and Command Submission

This section covers all things related to the GEM implementation in the i915 driver.

Intel GPU Basics

An Intel GPU has multiple engines. There are several engine types.

  • RCS engine is for rendering 3D and performing compute, this is named I915_EXEC_RENDER in user space.
  • BCS is a blitting (copy) engine, this is named I915_EXEC_BLT in user space.
  • VCS is a video encode and decode engine, this is named I915_EXEC_BSD in user space.
  • VECS is a video enhancement engine, this is named I915_EXEC_VEBOX in user space.
  • The enumeration I915_EXEC_DEFAULT does not refer to a specific engine; instead it is to be used by user space to specify a default rendering engine (for 3D) that may or may not be the same as RCS.

The Intel GPU family is a family of integrated GPUs using Unified Memory Access. To have the GPU “do work”, user space feeds the GPU batch buffers via one of the ioctls DRM_IOCTL_I915_GEM_EXECBUFFER2 or DRM_IOCTL_I915_GEM_EXECBUFFER2_WR. Most such batchbuffers will instruct the GPU to perform work (for example rendering), and that work needs memory from which to read and memory to which to write. All memory is encapsulated within GEM buffer objects (usually created with the ioctl DRM_IOCTL_I915_GEM_CREATE). An ioctl providing a batchbuffer for the GPU to execute will also list all GEM buffer objects that the batchbuffer reads and/or writes. For implementation details of memory management see GEM BO Management Implementation Details.

The i915 driver allows user space to create a context, identified by a 32-bit integer, via the ioctl DRM_IOCTL_I915_GEM_CONTEXT_CREATE. Such a context should be viewed by user space as loosely analogous to a CPU process of an operating system. The i915 driver guarantees that commands issued to a fixed context are executed so that writes of a previously issued command are seen by reads of following commands. Actions issued between different contexts (even if from the same file descriptor) are NOT given that guarantee, and the only way to synchronize across contexts (even from the same file descriptor) is through the use of fences. At least as far back as Gen4, a context also carries with it a GPU HW context; the HW context is essentially (most of, at least) the state of a GPU. In addition to the ordering guarantees, the kernel will restore GPU state via the HW context when commands are issued to a context; this saves user space the need to restore (most of, at least) the GPU state at the start of each batchbuffer. The non-deprecated ioctls to submit batchbuffer work can pass that ID (in the lower bits of drm_i915_gem_execbuffer2::rsvd1) to identify which context to use with the command.

The GPU has its own memory management and address space. The kernel driver maintains the memory translation table for the GPU. For older GPUs (i.e. those before Gen8), there is a single global such translation table, a global Graphics Translation Table (GTT). For newer generation GPUs each context has its own translation table, called Per-Process Graphics Translation Table (PPGTT). Importantly, although PPGTT is named per-process, it is actually per context. When user space submits a batchbuffer, the kernel walks the list of GEM buffer objects used by the batchbuffer and guarantees that not only is the memory of each such GEM buffer object resident but it is also present in the (PP)GTT. If the GEM buffer object is not yet placed in the (PP)GTT, then it is given an address. Two consequences of this are: the kernel needs to edit the submitted batchbuffer to write the correct GPU address when a GEM BO is assigned a GPU address, and the kernel might evict a different GEM BO from the (PP)GTT to make address room for another GEM BO. Consequently, the ioctls submitting a batchbuffer for execution also include a list of all locations within buffers that refer to GPU addresses so that the kernel can edit the buffer correctly. This process is dubbed relocation.
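The relocation step can be sketched in user space as patching addresses into a byte buffer. The struct layout, the 64-bit address width and all demo_* names are assumptions for illustration; they do not mirror the actual uAPI structures.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation entry: where to patch and what it refers to. */
struct demo_reloc {
	size_t offset;		/* byte offset in the batch to patch */
	unsigned int target;	/* index of the buffer object referred to */
	uint64_t delta;		/* byte offset within that object */
};

/* Write the final GPU address of each target into the batch. */
int demo_apply_relocs(uint8_t *batch, size_t batch_len,
		      const struct demo_reloc *relocs, size_t nrelocs,
		      const uint64_t *bo_gpu_addr, size_t nbos)
{
	for (size_t i = 0; i < nrelocs; i++) {
		uint64_t addr;

		if (relocs[i].target >= nbos ||
		    relocs[i].offset > batch_len ||
		    batch_len - relocs[i].offset < sizeof(addr))
			return -1;	/* reject out-of-bounds entries */
		addr = bo_gpu_addr[relocs[i].target] + relocs[i].delta;
		memcpy(batch + relocs[i].offset, &addr, sizeof(addr));
	}
	return 0;
}
```

In this model bo_gpu_addr[] holds the addresses chosen at bind time, so moving a buffer object only requires rerunning the patch with the updated table.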

GEM BO Management Implementation Details

Buffer Object Eviction

This section documents the interface functions for evicting buffer objects to make space available in the virtual gpu address spaces. Note that this is mostly orthogonal to shrinking buffer object caches, which has the goal to make main memory (shared with the gpu through the unified memory architecture) available.

int i915_gem_evict_something(struct i915_address_space * vm, struct i915_gem_ww_ctx * ww, u64 min_size, u64 alignment, unsigned long color, u64 start, u64 end, unsigned flags)

Evict vmas to make room for binding a new one

Parameters

struct i915_address_space * vm
address space to evict from
struct i915_gem_ww_ctx * ww
An optional struct i915_gem_ww_ctx.
u64 min_size
size of the desired free space
u64 alignment
alignment constraint of the desired free space
unsigned long color
color for the desired space
u64 start
start (inclusive) of the range from which to evict objects
u64 end
end (exclusive) of the range from which to evict objects
unsigned flags
additional flags to control the eviction algorithm

Description

This function will try to evict vmas until a free space satisfying the requirements is found. Callers must check first whether any such hole exists already before calling this function.

This function is used by the object/vma binding code.

Since this function is only used to free up virtual address space it only ignores pinned vmas, and not objects where the backing storage itself is pinned. Hence obj->pages_pin_count does not protect against eviction.

To clarify: This is for freeing up virtual address space, not for freeing memory in e.g. the shrinker.

int i915_gem_evict_for_node(struct i915_address_space * vm, struct i915_gem_ww_ctx * ww, struct drm_mm_node * target, unsigned int flags)

Evict vmas to make room for binding a new one

Parameters

struct i915_address_space * vm
address space to evict from
struct i915_gem_ww_ctx * ww
An optional struct i915_gem_ww_ctx.
struct drm_mm_node * target
range (and color) to evict for
unsigned int flags
additional flags to control the eviction algorithm

Description

This function will try to evict vmas that overlap the target node.

To clarify: This is for freeing up virtual address space, not for freeing memory in e.g. the shrinker.
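Evicting for a target range can be modeled as an overlap test over bound vmas. This sketch uses an invented struct and chooses to fail with -ENOSPC without changing anything when a pinned vma is in the way; both the data structure and that policy are assumptions, not a description of the driver's implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_vma {
	uint64_t start, end;	/* half-open GPU address range [start, end) */
	bool bound;		/* currently occupies address space */
	bool pinned;		/* must not be evicted */
};

/*
 * Unbind every vma overlapping [start, end). Fails with -ENOSPC, leaving
 * all bindings untouched, if any overlapping vma is pinned.
 */
int demo_evict_for_range(struct demo_vma *vmas, size_t n,
			 uint64_t start, uint64_t end)
{
	size_t i;

	/* First pass: check that nothing in the way is pinned. */
	for (i = 0; i < n; i++)
		if (vmas[i].bound && vmas[i].start < end &&
		    vmas[i].end > start && vmas[i].pinned)
			return -ENOSPC;

	/* Second pass: evict the overlapping vmas. */
	for (i = 0; i < n; i++)
		if (vmas[i].bound && vmas[i].start < end && vmas[i].end > start)
			vmas[i].bound = false;
	return 0;
}
```

Only the address range is released in this model; the backing storage of an evicted vma is untouched, which matches the distinction drawn above between eviction and shrinking.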

int i915_gem_evict_vm(struct i915_address_space * vm, struct i915_gem_ww_ctx * ww, struct drm_i915_gem_object ** busy_bo)

Evict all idle vmas from a vm

Parameters

struct i915_address_space * vm
Address space to cleanse
struct i915_gem_ww_ctx * ww
An optional struct i915_gem_ww_ctx. If not NULL, i915_gem_evict_vm will be able to evict vmas locked by the ww as well.
struct drm_i915_gem_object ** busy_bo
Optional pointer to struct drm_i915_gem_object. If not NULL, then in the event i915_gem_evict_vm() is unable to trylock an object for eviction, then busy_bo will point to it. -EBUSY is also returned. The caller must drop the vm->mutex, before trying again to acquire the contended lock. The caller also owns a reference to the object.

Description

This function evicts all vmas from a vm.

This is used by the execbuf code as a last-ditch effort to defragment the address space.

To clarify: This is for freeing up virtual address space, not for freeing memory in e.g. the shrinker.

Buffer Object Memory Shrinking

This section documents the interface function for shrinking memory usage of buffer object caches. Shrinking is used to make main memory available. Note that this is mostly orthogonal to evicting buffer objects, which has the goal to make space in gpu virtual address spaces.

Error

kernel-doc missing

Batchbuffer Parsing

Motivation: Certain OpenGL features (e.g. transform feedback, performance monitoring) require userspace code to submit batches containing commands such as MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some generations of the hardware will noop these commands in “unsecure” batches (which includes all userspace batches submitted via i915) even though the commands may be safe and represent the intended programming model of the device.

The software command parser is similar in operation to the command parsing done in hardware for unsecure batches. However, the software parser allows some operations that would be noop’d by hardware, if the parser determines the operation is safe, and submits the batch as “secure” to prevent hardware parsing.

Threats: At a high level, the hardware (and software) checks attempt to prevent granting userspace undue privileges. There are three categories of privilege.

First, commands which are explicitly defined as privileged or which should only be used by the kernel driver. The parser rejects such commands.

Second, commands which access registers. To support correct/enhanced userspace functionality, particularly certain OpenGL extensions, the parser provides a whitelist of registers which userspace may safely access.

Third, commands which access privileged memory (i.e. GGTT, HWS page, etc). The parser always rejects such commands.

The majority of the problematic commands fall in the MI_* range, with only a few specific commands on each engine (e.g. PIPE_CONTROL and MI_FLUSH_DW).

Implementation: Each engine maintains tables of commands and registers which the parser uses in scanning batch buffers submitted to that engine.

Since the set of commands that the parser must check for is significantly smaller than the number of commands supported, the parser tables contain only those commands required by the parser. This generally works because command opcode ranges have standard command length encodings. So for commands that the parser does not need to check, it can easily skip them. This is implemented via a per-engine length decoding vfunc.

Unfortunately, there are a number of commands that do not follow the standard length encoding for their opcode range, primarily amongst the MI_* commands. To handle this, the parser provides a way to define explicit “skip” entries in the per-engine command tables.

Other command table entries map fairly directly to high level categories mentioned above: rejected, register whitelist. The parser implements a number of checks, including the privileged memory checks, via a general bitmasking mechanism.
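The table-driven scan described above can be sketched as follows. This is a minimal illustration, not the driver's code: struct cmd_desc, the CMD_* flags, default_length() and scan_batch() are all hypothetical names, and the length encoding used by default_length() is a toy stand-in for the per-engine length decoding vfunc.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor flags mirroring the categories in the text. */
enum {
	CMD_SKIP   = 1u << 0, /* non-standard length only; nothing to check */
	CMD_REJECT = 1u << 1, /* privileged; never allowed from userspace */
	CMD_REG    = 1u << 2, /* dword 1 is a register offset to check */
};

struct cmd_desc {
	uint32_t mask;      /* header bits that identify the command */
	uint32_t value;     /* expected value of those bits */
	uint32_t flags;
	uint32_t fixed_len; /* length in dwords, or 0 to use the length decode */
};

/* Toy stand-in for the per-engine length decoding vfunc. */
static uint32_t default_length(uint32_t header)
{
	return (header & 0xff) + 2;
}

static const struct cmd_desc *find_cmd(const struct cmd_desc *tbl, size_t n,
				       uint32_t header)
{
	for (size_t i = 0; i < n; i++)
		if ((header & tbl[i].mask) == tbl[i].value)
			return &tbl[i];
	return NULL; /* not in the table: skip it by decoded length */
}

static int reg_allowed(uint32_t reg, const uint32_t *wl, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (wl[i] == reg)
			return 1;
	return 0;
}

/* Returns 0 if every command is acceptable, -EINVAL otherwise. */
static int scan_batch(const uint32_t *batch, size_t len,
		      const struct cmd_desc *tbl, size_t ntbl,
		      const uint32_t *wl, size_t nwl)
{
	size_t i = 0;

	while (i < len) {
		const struct cmd_desc *d = find_cmd(tbl, ntbl, batch[i]);
		uint32_t n = (d && d->fixed_len) ? d->fixed_len
						 : default_length(batch[i]);

		if (n > len - i)
			return -EINVAL; /* command runs past the batch */
		if (d && (d->flags & CMD_REJECT))
			return -EINVAL;
		if (d && (d->flags & CMD_REG) &&
		    (n < 2 || !reg_allowed(batch[i + 1], wl, nwl)))
			return -EINVAL;
		i += n;
	}
	return 0;
}
```

Commands absent from the table cost only a length decode, which is why the tables can stay much smaller than the full command set.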

int intel_engine_init_cmd_parser(struct intel_engine_cs * engine)

set cmd parser related fields for an engine

Parameters

struct intel_engine_cs * engine
the engine to initialize

Description

Optionally initializes fields related to batch buffer command parsing in the struct intel_engine_cs based on whether the platform requires software command parsing.

void intel_engine_cleanup_cmd_parser(struct intel_engine_cs * engine)

clean up cmd parser related fields

Parameters

struct intel_engine_cs * engine
the engine to clean up

Description

Releases any resources related to command parsing that may have been initialized for the specified engine.

int intel_engine_cmd_parser(struct intel_engine_cs * engine, struct i915_vma * batch, unsigned long batch_offset, unsigned long batch_length, struct i915_vma * shadow, bool trampoline)

parse a batch buffer for privilege violations

Parameters

struct intel_engine_cs * engine
the engine on which the batch is to execute
struct i915_vma * batch
the batch buffer in question
unsigned long batch_offset
byte offset in the batch at which execution starts
unsigned long batch_length
length of the commands in batch
struct i915_vma * shadow
validated copy of the batch buffer in question
bool trampoline
true if we need to trampoline into privileged execution

Description

Parses the specified batch buffer looking for privilege violations as described in the overview.

Return

non-zero if the parser finds violations or otherwise fails; -EACCES if the batch appears legal but should use hardware parsing

int i915_cmd_parser_get_version(struct drm_i915_private * dev_priv)

get the cmd parser version number

Parameters

struct drm_i915_private * dev_priv
i915 device private

Description

The cmd parser maintains a simple increasing integer version number suitable for passing to userspace clients to determine what operations are permitted.

Return

the current version number of the cmd parser

Batchbuffer Pools

User Batchbuffer Execution

Logical Rings, Logical Ring Contexts and Execlists

Global GTT views

int i915_gem_gtt_reserve(struct i915_address_space * vm, struct i915_gem_ww_ctx * ww, struct drm_mm_node * node, u64 size, u64 offset, unsigned long color, unsigned int flags)

reserve a node in an address_space (GTT)

Parameters

struct i915_address_space * vm
the struct i915_address_space
struct i915_gem_ww_ctx * ww
An optional struct i915_gem_ww_ctx.
struct drm_mm_node * node
the struct drm_mm_node (typically i915_vma.node)
u64 size
how much space to allocate inside the GTT, must be #I915_GTT_PAGE_SIZE aligned
u64 offset
where to insert inside the GTT, must be #I915_GTT_MIN_ALIGNMENT aligned, and the node (offset + size) must fit within the address space
unsigned long color
color to apply to node, if this node is not from a VMA, color must be #I915_COLOR_UNEVICTABLE
unsigned int flags
control search and eviction behaviour

Description

i915_gem_gtt_reserve() tries to insert the node at the exact offset inside the address space (using size and color). If the node does not fit, it tries to evict any overlapping nodes from the GTT, including any neighbouring nodes if the colors do not match (to ensure guard pages between differing domains). See i915_gem_evict_for_node() for the gory details on the eviction algorithm. #PIN_NONBLOCK may be used to prevent waiting on evicting active overlapping objects, and any overlapping node that is pinned or marked as unevictable will also result in failure.

Return

0 on success, -ENOSPC if no suitable hole is found, -EINTR if asked to wait for eviction and interrupted.
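The argument constraints listed under Parameters can be sketched as a standalone check. This is only an illustration: reserve_args_valid() is a hypothetical helper, and the 4 KiB values given to GTT_PAGE_SIZE and GTT_MIN_ALIGNMENT are assumptions standing in for I915_GTT_PAGE_SIZE and I915_GTT_MIN_ALIGNMENT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GTT_PAGE_SIZE     4096ull /* assumed value of I915_GTT_PAGE_SIZE */
#define GTT_MIN_ALIGNMENT 4096ull /* assumed value of I915_GTT_MIN_ALIGNMENT */

static bool is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0; /* a must be a power of two */
}

/* Check size/offset against an address space of 'total' bytes. */
static bool reserve_args_valid(uint64_t total, uint64_t size, uint64_t offset)
{
	if (size == 0 || !is_aligned(size, GTT_PAGE_SIZE))
		return false;
	if (!is_aligned(offset, GTT_MIN_ALIGNMENT))
		return false;
	/* Overflow-safe form of: offset + size <= total */
	if (size > total || offset > total - size)
		return false;
	return true;
}
```

The last test is written as a subtraction so that a large offset cannot wrap around and appear to fit.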

int i915_gem_gtt_insert(struct i915_address_space * vm, struct i915_gem_ww_ctx * ww, struct drm_mm_node * node, u64 size, u64 alignment, unsigned long color, u64 start, u64 end, unsigned int flags)

insert a node into an address_space (GTT)

Parameters

struct i915_address_space * vm
the struct i915_address_space
struct i915_gem_ww_ctx * ww
An optional struct i915_gem_ww_ctx.
struct drm_mm_node * node
the struct drm_mm_node (typically i915_vma.node)
u64 size
how much space to allocate inside the GTT, must be #I915_GTT_PAGE_SIZE aligned
u64 alignment
required alignment of starting offset, may be 0 but if specified, this must be a power-of-two and at least #I915_GTT_MIN_ALIGNMENT
unsigned long color
color to apply to node
u64 start
start of any range restriction inside GTT (0 for all), must be #I915_GTT_PAGE_SIZE aligned
u64 end
end of any range restriction inside GTT (U64_MAX for all), must be #I915_GTT_PAGE_SIZE aligned if not U64_MAX
unsigned int flags
control search and eviction behaviour

Description

i915_gem_gtt_insert() first searches for an available hole into which it can insert the node. The hole address is aligned to alignment and its size must then fit entirely within the [start, end] bounds. The nodes on either side of the hole must match color, or else a guard page will be inserted between the two nodes (or the node evicted). If no suitable hole is found, first a victim is randomly selected and tested for eviction; failing that, the LRU list of objects within the GTT is scanned to find the first set of replacement nodes to create the hole. Those old overlapping nodes are evicted from the GTT (and so must be rebound before any future use). Any node that is currently pinned cannot be evicted (see i915_vma_pin()). Similarly, if the node’s VMA is currently active and #PIN_NONBLOCK is specified, that node is also skipped when searching for an eviction candidate. See i915_gem_evict_something() for the gory details on the eviction algorithm.

Return

0 on success, -ENOSPC if no suitable hole is found, -EINTR if asked to wait for eviction and interrupted.
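The "align the hole, then check it still fits within the bounds" step can be sketched as below. fit_in_hole() is a hypothetical helper that models a single candidate hole as a half-open range; it ignores color and eviction, and the 4 KiB default alignment is an assumption standing in for I915_GTT_MIN_ALIGNMENT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GTT_MIN_ALIGNMENT 4096ull /* assumed value of I915_GTT_MIN_ALIGNMENT */

/*
 * Try to place 'size' bytes inside the hole [hole_start, hole_end),
 * restricted to [start, end). On success, store the address in *addr.
 */
static bool fit_in_hole(uint64_t hole_start, uint64_t hole_end,
			uint64_t size, uint64_t alignment,
			uint64_t start, uint64_t end, uint64_t *addr)
{
	uint64_t lo = hole_start > start ? hole_start : start;
	uint64_t hi = hole_end < end ? hole_end : end;

	if (alignment == 0)
		alignment = GTT_MIN_ALIGNMENT;
	if (lo > UINT64_MAX - (alignment - 1))
		return false; /* rounding up would overflow */

	lo = (lo + alignment - 1) & ~(alignment - 1); /* round up */
	if (lo > hi || hi - lo < size)
		return false;

	*addr = lo;
	return true;
}
```

A hole that is large enough before alignment may be too small afterwards, which is one way the search can fall through to eviction.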

GTT Fences and Swizzling

Global GTT Fence Handling

Hardware Tiling and Swizzling Details

Object Tiling IOCTLs

WOPCM

WOPCM Layout

The layout of the WOPCM will be fixed after writing to GuC WOPCM size and offset registers whose values are calculated and determined by HuC/GuC firmware size and set of hardware requirements/restrictions as shown below:

  +=========> +====================+ <== WOPCM Top
  ^           |  HW contexts RSVD  |
  |     +===> +====================+ <== GuC WOPCM Top
  |     ^     |                    |
  |     |     |                    |
  |     |     |                    |
  |    GuC    |                    |
  |   WOPCM   |                    |
  |    Size   +--------------------+
WOPCM   |     |    GuC FW RSVD     |
  |     |     +--------------------+
  |     |     |   GuC Stack RSVD   |
  |     |     +--------------------+
  |     v     |   GuC WOPCM RSVD   |
  |     +===> +====================+ <== GuC WOPCM base
  |           |     WOPCM RSVD     |
  |           +--------------------+ <== HuC Firmware Top
  v           |      HuC FW        |
  +=========> +====================+ <== WOPCM Base

GuC accessible WOPCM starts at GuC WOPCM base and ends at GuC WOPCM top. The top part of the WOPCM is reserved for hardware contexts (e.g. RC6 context).
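The arithmetic implied by the diagram can be sketched as follows. This is an illustration under stated assumptions: wopcm_partition() and struct wopcm_layout are hypothetical names, and the reserved sizes and base alignment are passed in as parameters because the real values are platform-specific and not given here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct wopcm_layout {
	uint32_t guc_base; /* GuC WOPCM base, as an offset from WOPCM base */
	uint32_t guc_size; /* GuC WOPCM top minus GuC WOPCM base */
};

static bool wopcm_partition(uint32_t wopcm_size, uint32_t huc_fw_size,
			    uint32_t wopcm_rsvd, uint32_t ctx_rsvd,
			    uint32_t base_align, struct wopcm_layout *out)
{
	uint32_t top, sum, base;

	/* The top of the WOPCM is reserved for hardware contexts. */
	if (ctx_rsvd >= wopcm_size)
		return false;
	top = wopcm_size - ctx_rsvd;

	/* HuC firmware plus the reserved gap sit below the GuC region. */
	if (huc_fw_size > top || wopcm_rsvd > top - huc_fw_size)
		return false;
	sum = huc_fw_size + wopcm_rsvd;

	/* Round the GuC base up; base_align must be a power of two. */
	base = (sum + base_align - 1) & ~(base_align - 1);
	if (base < sum || base >= top)
		return false;

	out->guc_base = base;
	out->guc_size = top - base;
	return true;
}
```

For example, with a 1 MiB WOPCM, a 200 KiB HuC image, 16 KiB reserved, 24 KiB for hardware contexts and 16 KiB alignment (all illustrative numbers), the GuC base lands at 229376 and the GuC region is 794624 bytes.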

GuC

GuC-specific firmware loader

GuC-based command submission

GuC Firmware Layout

GuC Address Space

Tracing

This section covers all things related to the tracepoints implemented in the i915 driver.

i915_ppgtt_create and i915_ppgtt_release

With full ppgtt enabled each process using drm will allocate at least one translation table. With these traces it is possible to keep track of the allocation and of the lifetime of the tables; this can be used during testing/debug to verify that we are not leaking ppgtts. These traces identify the ppgtt through the vm pointer, which is also printed by the i915_vma_bind and i915_vma_unbind tracepoints.

i915_context_create and i915_context_free

These tracepoints are used to track creation and deletion of contexts. If full ppgtt is enabled, they also print the address of the vm assigned to the context.

switch_mm

Perf

Overview

Gen graphics supports a large number of performance counters that can help driver and application developers understand and optimize their use of the GPU.

This i915 perf interface enables userspace to configure and open a file descriptor representing a stream of GPU metrics which can then be read() as a stream of sample records.

The interface is particularly suited to exposing buffered metrics that are captured by DMA from the GPU, unsynchronized with and unrelated to the CPU.

Streams representing a single context are accessible to applications with a corresponding drm file descriptor, such that OpenGL can use the interface without special privileges. Access to system-wide metrics requires root privileges by default, unless changed via the dev.i915.perf_event_paranoid sysctl option.

Comparison with Core Perf

The interface was initially inspired by the core Perf infrastructure but some notable differences are:

i915 perf file descriptors represent a “stream” instead of an “event”; where a perf event primarily corresponds to a single 64bit value, while a stream might sample sets of tightly-coupled counters, depending on the configuration. For example the Gen OA unit isn’t designed to support orthogonal configurations of individual counters; it’s configured for a set of related counters. Samples for an i915 perf stream capturing OA metrics will include a set of counter values packed in a compact HW specific format. The OA unit supports a number of different packing formats which can be selected by the user opening the stream. Perf has support for grouping events, but each event in the group is configured, validated and authenticated individually with separate system calls.

i915 perf stream configurations are provided as an array of u64 (key,value) pairs, instead of a fixed struct with multiple miscellaneous config members, interleaved with event-type specific members.

i915 perf doesn’t support exposing metrics via an mmap’d circular buffer. The supported metrics are being written to memory by the GPU unsynchronized with the CPU, using HW specific packing formats for counter sets. Sometimes the constraints on HW configuration require reports to be filtered before it would be acceptable to expose them to unprivileged applications - to hide the metrics of other processes/contexts. For these use cases a read() based interface is a good fit, and provides an opportunity to filter data as it gets copied from the GPU mapped buffers to userspace buffers.

Issues hit with first prototype based on Core Perf

The first prototype of this driver was based on the core perf infrastructure, and while we did make that mostly work, with some changes to perf, we found we were breaking or working around too many assumptions baked into perf’s currently cpu centric design.

In the end we didn’t see a clear benefit to making perf’s implementation and interface more complex by changing design assumptions while we knew we still wouldn’t be able to use any existing perf based userspace tools.

Also considering the Gen specific nature of the Observability hardware and how userspace will sometimes need to combine i915 perf OA metrics with side-band OA data captured via MI_REPORT_PERF_COUNT commands; we’re expecting the interface to be used by a platform specific userspace such as OpenGL or tools. This is to say; we aren’t inherently missing out on having a standard vendor/architecture agnostic interface by not using perf.

For posterity, in case we might re-visit trying to adapt core perf to be better suited to exposing i915 metrics these were the main pain points we hit:

  • The perf based OA PMU driver broke some significant design assumptions:

    Existing perf pmus are used for profiling work on a cpu and we were introducing the idea of _IS_DEVICE pmus with different security implications, the need to fake cpu-related data (such as user/kernel registers) to fit with perf’s current design, and adding _DEVICE records as a way to forward device-specific status records.

    The OA unit writes reports of counters into a circular buffer, without involvement from the CPU, making our PMU driver the first of a kind.

    Given the way we were periodically forwarding data from the GPU-mapped OA buffer to perf’s buffer, those bursts of sample writes looked to perf like we were sampling too fast and so we had to subvert its throttling checks.

    Perf supports groups of counters and allows those to be read via transactions internally but transactions currently seem designed to be explicitly initiated from the cpu (say in response to a userspace read()) and while we could pull a report out of the OA buffer we can’t trigger a report from the cpu on demand.

    Related to being report based; the OA counters are configured in HW as a set while perf generally expects counter configurations to be orthogonal. Although counters can be associated with a group leader as they are opened, there’s no clear precedent for being able to provide group-wide configuration attributes (for example we want to let userspace choose the OA unit report format used to capture all counters in a set, or specify a GPU context to filter metrics on). We avoided using perf’s grouping feature and forwarded OA reports to userspace via perf’s ‘raw’ sample field. This suited our userspace well considering how coupled the counters are when dealing with normalizing. It would be inconvenient to split counters up into separate events, only to require userspace to recombine them. For Mesa it’s also convenient to be forwarded raw, periodic reports for combining with the side-band raw reports it captures using MI_REPORT_PERF_COUNT commands.

    • As a side note on perf’s grouping feature; there was also some concern that using PERF_FORMAT_GROUP as a way to pack together counter values would quite drastically inflate our sample sizes, which would likely lower the effective sampling resolutions we could use when the available memory bandwidth is limited.

      With the OA unit’s report formats, counters are packed together as 32 or 40bit values, with the largest report size being 256 bytes.

      PERF_FORMAT_GROUP values are 64bit, but there doesn’t appear to be a documented ordering to the values, implying PERF_FORMAT_ID must also be used to add a 64bit ID before each value; giving 16 bytes per counter.

    Related to counter orthogonality; we can’t time share the OA unit, while event scheduling is a central design idea within perf for allowing userspace to open + enable more events than can be configured in HW at any one time. The OA unit is not designed to allow re-configuration while in use. We can’t reconfigure the OA unit without losing internal OA unit state which we can’t access explicitly to save and restore. Reconfiguring the OA unit is also relatively slow, involving ~100 register writes. From userspace Mesa also depends on a stable OA configuration when emitting MI_REPORT_PERF_COUNT commands and importantly the OA unit can’t be disabled while there are outstanding MI_RPC commands lest we hang the command streamer.

    The contents of sample records aren’t extensible by device drivers (i.e. the sample_type bits). As an example; Sourab Gupta had been looking to attach GPU timestamps to our OA samples. We were shoehorning OA reports into sample records by using the ‘raw’ field, but it’s tricky to pack more than one thing into this field because events/core.c currently only lets a pmu give a single raw data pointer plus len which will be copied into the ring buffer. To include more than the OA report we’d have to copy the report into an intermediate larger buffer. I’d been considering allowing a vector of data+len values to be specified for copying the raw data, but it felt like a kludge to be using the raw field for this purpose.

  • It felt like our perf based PMU was making some technical compromises just for the sake of using perf:

    perf_event_open() requires events to either relate to a pid or a specific cpu core, while our device pmu related to neither. Events opened with a pid will be automatically enabled/disabled according to the scheduling of that process - so not appropriate for us. When an event is related to a cpu id, perf ensures pmu methods will be invoked via an inter-processor interrupt on that core. To avoid invasive changes our userspace opened OA perf events for a specific cpu. This was workable but it meant the majority of the OA driver ran in atomic context, including all OA report forwarding, which wasn’t really necessary in our case and seems to make our locking requirements somewhat complex as we handled the interaction with the rest of the i915 driver.

i915 Driver Entry Points

This section covers the entrypoints exported outside of i915_perf.c to integrate with drm/i915 and to handle the DRM_I915_PERF_OPEN ioctl.

void i915_perf_init(struct drm_i915_private * i915)

initialize i915-perf state on module bind

Parameters

struct drm_i915_private * i915
i915 device instance

Description

Initializes i915-perf state without exposing anything to userspace.

Note

i915-perf initialization is split into an ‘init’ and ‘register’ phase with the i915_perf_register() exposing state to userspace.

void i915_perf_fini(struct drm_i915_private * i915)

Counterpart to i915_perf_init()

Parameters

struct drm_i915_private * i915
i915 device instance

void i915_perf_register(struct drm_i915_private * i915)

exposes i915-perf to userspace

Parameters

struct drm_i915_private * i915
i915 device instance

Description

In particular OA metric sets are advertised under a sysfs metrics/ directory allowing userspace to enumerate valid IDs that can be used to open an i915-perf stream.

void i915_perf_unregister(struct drm_i915_private * i915)

hide i915-perf from userspace

Parameters

struct drm_i915_private * i915
i915 device instance

Description

i915-perf state cleanup is split up into an ‘unregister’ and ‘deinit’ phase where the interface is first hidden from userspace by i915_perf_unregister() before cleaning up remaining state in i915_perf_fini().

int i915_perf_open_ioctl(struct drm_device * dev, void * data, struct drm_file * file)

DRM ioctl() for userspace to open a stream FD

Parameters

struct drm_device * dev
drm device
void * data
ioctl data copied from userspace (unvalidated)
struct drm_file * file
drm file

Description

Validates the stream open parameters given by userspace including flags and an array of u64 key, value pair properties.

Very little is assumed up front about the nature of the stream being opened (for instance we don’t assume it’s for periodic OA unit metrics). An i915-perf stream is expected to be a suitable interface for other forms of buffered data written by the GPU besides periodic OA metrics.

Note we copy the properties from userspace outside of the i915 perf mutex to avoid an awkward lockdep with mmap_lock.

Most of the implementation details are handled by i915_perf_open_ioctl_locked() after taking the perf->lock mutex for serializing with any non-file-operation driver hooks.

Return

A newly opened i915 Perf stream file descriptor or negative error code on failure.

int i915_perf_release(struct inode * inode, struct file * file)

handles userspace close() of a stream file

Parameters

struct inode * inode
anonymous inode associated with file
struct file * file
An i915 perf stream file

Description

Cleans up any resources associated with an open i915 perf stream file.

NB: close() can’t really fail from the userspace point of view.

Return

zero on success or a negative error code.

int i915_perf_add_config_ioctl(struct drm_device * dev, void * data, struct drm_file * file)

DRM ioctl() for userspace to add a new OA config

Parameters

struct drm_device * dev
drm device
void * data
ioctl data (pointer to struct drm_i915_perf_oa_config) copied from userspace (unvalidated)
struct drm_file * file
drm file

Description

Validates the submitted OA register to be saved into a new OA config that can then be used for programming the OA unit and its NOA network.

Return

A newly allocated config number to be used with the perf open ioctl or a negative error code on failure.

int i915_perf_remove_config_ioctl(struct drm_device * dev, void * data, struct drm_file * file)

DRM ioctl() for userspace to remove an OA config

Parameters

struct drm_device * dev
drm device
void * data
ioctl data (pointer to u64 integer) copied from userspace
struct drm_file * file
drm file

Description

Configs can be removed while being used; they will stop appearing in sysfs and their content will be freed when the stream using the config is closed.

Return

0 on success or a negative error code on failure.

i915 Perf Stream

This section covers the stream-semantics-agnostic structures and functions for representing an i915 perf stream FD and associated file operations.

int read_properties_unlocked(struct i915_perf * perf, u64 __user * uprops, u32 n_props, struct perf_open_properties * props)

validate + copy userspace stream open properties

Parameters

struct i915_perf * perf
i915 perf instance
u64 __user * uprops
The array of u64 key value pairs given by userspace
u32 n_props
The number of key value pairs expected in uprops
struct perf_open_properties * props
The stream configuration built up while validating properties

Description

Note this function only validates properties in isolation; it doesn’t validate that the combination of properties makes sense or that all properties necessary for a particular kind of stream have been set.

Note that there currently aren’t any ordering requirements for properties so we shouldn’t validate or assume anything about ordering here. This doesn’t rule out defining new properties with ordering requirements in the future.
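Validating an array of u64 (key, value) pairs in isolation, with no ordering requirement, can be sketched as below. The PROP_* IDs, struct open_props, read_props() and the exponent limit are hypothetical stand-ins for illustration; the real property IDs are the DRM_I915_PERF_PROP_* values and the real function also copies the pairs from userspace.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical property IDs; the real ones are DRM_I915_PERF_PROP_*. */
enum {
	PROP_CTX_HANDLE = 1,
	PROP_SAMPLE_OA,
	PROP_OA_METRICS_SET,
	PROP_OA_EXPONENT,
};

#define OA_EXPONENT_LIMIT 31 /* assumed upper bound, for illustration */

struct open_props {
	uint32_t sample_flags;
	int single_context;
	uint64_t ctx_handle;
	int metrics_set;
	int oa_periodic;
	int oa_period_exponent;
};

static int read_props(const uint64_t *uprops, uint32_t n_props,
		      struct open_props *props)
{
	memset(props, 0, sizeof(*props)); /* starts out zero initialized */

	if (n_props == 0)
		return -EINVAL;

	for (size_t i = 0; i < n_props; i++) {
		uint64_t id = uprops[2 * i];
		uint64_t value = uprops[2 * i + 1];

		/* Each pair is checked on its own, in whatever order given. */
		switch (id) {
		case PROP_CTX_HANDLE:
			props->single_context = 1;
			props->ctx_handle = value;
			break;
		case PROP_SAMPLE_OA:
			if (value)
				props->sample_flags |= 1u;
			break;
		case PROP_OA_METRICS_SET:
			if (value == 0 || value > INT32_MAX)
				return -EINVAL;
			props->metrics_set = (int)value;
			break;
		case PROP_OA_EXPONENT:
			if (value > OA_EXPONENT_LIMIT)
				return -EINVAL;
			props->oa_periodic = 1;
			props->oa_period_exponent = (int)value;
			break;
		default:
			return -EINVAL; /* unknown property ID */
		}
	}
	return 0;
}
```

Whether the resulting combination makes sense (for instance, an exponent without a metric set) is deliberately not checked at this stage.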

int i915_perf_open_ioctl_locked(struct i915_perf * perf, struct drm_i915_perf_open_param * param, struct perf_open_properties * props, struct drm_file * file)

DRM ioctl() for userspace to open a stream FD

Parameters

struct i915_perf * perf
i915 perf instance
struct drm_i915_perf_open_param * param
The open parameters passed to DRM_I915_PERF_OPEN
struct perf_open_properties * props
individually validated u64 property value pairs
struct drm_file * file
drm file

Description

See i915_perf_open_ioctl() for interface details.

Implements further stream config validation and stream initialization on behalf of i915_perf_open_ioctl() with the perf->lock mutex taken to serialize with any non-file-operation driver hooks.

Note

at this point the props have only been validated in isolation and it’s still necessary to validate that the combination of properties makes sense.

In the case where userspace is interested in OA unit metrics then further config validation and stream initialization details will be handled by i915_oa_stream_init(). The code here should only validate config state that will be relevant to all stream types / backends.

Return

zero on success or a negative error code.

void i915_perf_destroy_locked(struct i915_perf_stream * stream)

destroy an i915 perf stream

Parameters

struct i915_perf_stream * stream
An i915 perf stream

Description

Frees all resources associated with the given i915 perf stream, disabling any associated data capture in the process.

Note

The perf->lock mutex has been taken to serialize with any non-file-operation driver hooks.

ssize_t i915_perf_read(struct file * file, char __user * buf, size_t count, loff_t * ppos)

handles read() FOP for i915 perf stream FDs

Parameters

struct file * file
An i915 perf stream file
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
loff_t * ppos
(inout) file seek position (unused)

Description

The entry point for handling a read() on a stream file descriptor from userspace. Most of the work is left to i915_perf_read_locked() and i915_perf_stream_ops->read, but to save stream implementations (of which we might have multiple later) from each having to handle it, blocking reads are handled here.

We can also consistently treat trying to read from a disabled stream as an IO error so implementations can assume the stream is enabled while reading.

Return

The number of bytes copied or a negative error code on failure.
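The division of labour described above (generic code rejects disabled streams and owns the blocking loop, the backend only waits and copies) can be sketched with a toy backend. Everything here is hypothetical: struct toy_stream, toy_wait() and toy_read() stand in for a stream and its ops, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_stream {
	bool enabled;
	bool nonblock;        /* models O_NONBLOCK on the stream FD */
	int spurious_wakeups; /* reads that will find no data yet */
	const char *data;     /* payload delivered once data is ready */
};

/* Backend wait: may return before data is really available. */
static int toy_wait(struct toy_stream *s)
{
	(void)s;
	return 0;
}

/* Backend read: -EAGAIN when nothing is ready, else copy and advance. */
static int toy_read(struct toy_stream *s, char *buf, size_t count,
		    size_t *offset)
{
	size_t n;

	if (s->spurious_wakeups > 0) {
		s->spurious_wakeups--;
		return -EAGAIN;
	}
	n = strlen(s->data);
	if (n > count - *offset)
		n = count - *offset;
	memcpy(buf + *offset, s->data, n);
	*offset += n;
	return 0;
}

static long stream_read(struct toy_stream *s, char *buf, size_t count)
{
	size_t offset = 0;
	int ret;

	if (!s->enabled)
		return -EIO; /* reading a disabled stream is an IO error */

	if (s->nonblock) {
		ret = toy_read(s, buf, count, &offset);
	} else {
		do {
			ret = toy_wait(s);
			if (ret)
				return ret;
			ret = toy_read(s, buf, count, &offset);
		} while (ret == -EAGAIN); /* tolerate false-positive wakeups */
	}
	return offset ? (long)offset : ret;
}
```

Because the enabled check happens before any backend call, the backend can assume the stream is enabled whenever its read hook runs.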

long i915_perf_ioctl(struct file * file, unsigned int cmd, unsigned long arg)

support ioctl() usage with i915 perf stream FDs

Parameters

struct file * file
An i915 perf stream file
unsigned int cmd
the ioctl request
unsigned long arg
the ioctl data

Description

Implementation deferred to i915_perf_ioctl_locked().

Return

zero on success or a negative error code. Returns -EINVAL for an unknown ioctl request.

void i915_perf_enable_locked(struct i915_perf_stream * stream)

handle I915_PERF_IOCTL_ENABLE ioctl

Parameters

struct i915_perf_stream * stream
A disabled i915 perf stream

Description

[Re]enables the associated capture of data for this stream.

If a stream was previously enabled then there’s currently no intention to provide userspace any guarantee about the preservation of previously buffered data.

void i915_perf_disable_locked(struct i915_perf_stream * stream)

handle I915_PERF_IOCTL_DISABLE ioctl

Parameters

struct i915_perf_stream * stream
An enabled i915 perf stream

Description

Disables the associated capture of data for this stream.

The intention is that disabling and re-enabling a stream will ideally be cheaper than destroying and re-opening a stream with the same configuration, though there are no formal guarantees about what state or buffered data must be retained between disabling and re-enabling a stream.

Note

while a stream is disabled it’s considered an error for userspace to attempt to read from the stream (-EIO).

__poll_t i915_perf_poll(struct file * file, poll_table * wait)

call poll_wait() with a suitable wait queue for stream

Parameters

struct file * file
An i915 perf stream file
poll_table * wait
poll() state table

Description

For handling userspace polling on an i915 perf stream, this ensures poll_wait() gets called with a wait queue that will be woken for new stream data.

Note

Implementation deferred to i915_perf_poll_locked()

Return

any poll events that are ready without sleeping

__poll_t i915_perf_poll_locked(struct i915_perf_stream * stream, struct file * file, poll_table * wait)

poll_wait() with a suitable wait queue for stream

Parameters

struct i915_perf_stream * stream
An i915 perf stream
struct file * file
An i915 perf stream file
poll_table * wait
poll() state table

Description

For handling userspace polling on an i915 perf stream, this calls through to i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that will be woken for new stream data.

Note

The perf->lock mutex has been taken to serialize with any non-file-operation driver hooks.

Return

any poll events that are ready without sleeping

i915 Perf Observation Architecture Stream

int i915_oa_stream_init(struct i915_perf_stream * stream, struct drm_i915_perf_open_param * param, struct perf_open_properties * props)

validate combined props for OA stream and init

Parameters

struct i915_perf_stream * stream
An i915 perf stream
struct drm_i915_perf_open_param * param
The open parameters passed to DRM_I915_PERF_OPEN
struct perf_open_properties * props
The property state that configures stream (individually validated)

Description

While read_properties_unlocked() validates properties in isolation it doesn’t ensure that the combination necessarily makes sense.

At this point it has been determined that userspace wants a stream of OA metrics, but still we need to further validate the combined properties are OK.

If the configuration makes sense then we can allocate memory for a circular OA buffer and apply the requested metric set configuration.

Return

zero on success or a negative error code.

int i915_oa_read(struct i915_perf_stream * stream, char __user * buf, size_t count, size_t * offset)

just calls through to i915_oa_ops->read

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
size_t * offset
(inout): the current position for writing into buf

Description

Updates offset according to the number of bytes successfully copied into the userspace buffer.

Return

zero on success or a negative error code

void i915_oa_stream_enable(struct i915_perf_stream * stream)

handle I915_PERF_IOCTL_ENABLE for OA stream

Parameters

struct i915_perf_stream * stream
An i915 perf stream opened for OA metrics

Description

[Re]enables hardware periodic sampling according to the period configured when opening the stream. This also starts a hrtimer that will periodically check for data in the circular OA buffer for notifying userspace (e.g. during a read() or poll()).

void i915_oa_stream_disable(struct i915_perf_stream * stream)

handle I915_PERF_IOCTL_DISABLE for OA stream

Parameters

struct i915_perf_stream * stream
An i915 perf stream opened for OA metrics

Description

Stops the OA unit from periodically writing counter reports into the circular OA buffer. This also stops the hrtimer that periodically checks for data in the circular OA buffer, for notifying userspace.

int i915_oa_wait_unlocked(struct i915_perf_stream * stream)

handles blocking IO until OA data available

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics

Description

Called when userspace tries to read() from a blocking stream FD opened for OA metrics. It waits until the hrtimer callback finds a non-empty OA buffer and wakes us.

Note

it’s acceptable to have this return with some false positives since any subsequent read handling will return -EAGAIN if there isn’t really data ready for userspace yet.

Return

zero on success or a negative error code

void i915_oa_poll_wait(struct i915_perf_stream * stream, struct file * file, poll_table * wait)

call poll_wait() for an OA stream poll()

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
struct file * file
An i915 perf stream file
poll_table * wait
poll() state table

Description

For handling userspace polling on an i915 perf stream opened for OA metrics, this starts a poll_wait with the wait queue that our hrtimer callback wakes when it sees data ready to read in the circular OA buffer.

All i915 Perf Internals

This section simply includes all currently documented i915 perf internals, in no particular order, but may include some more minor utilities or platform specific details than found in the more high-level sections.

struct perf_open_properties

for validated properties given to open a stream

Definition

struct perf_open_properties {
  u32 sample_flags;
  u64 single_context:1;
  u64 hold_preemption:1;
  u64 ctx_handle;
  int metrics_set;
  int oa_format;
  bool oa_periodic;
  int oa_period_exponent;
  struct intel_engine_cs *engine;
  bool has_sseu;
  struct intel_sseu sseu;
  u64 poll_oa_period;
};

Members

sample_flags
DRM_I915_PERF_PROP_SAMPLE_* properties are tracked as flags
single_context
Whether a single or all gpu contexts should be monitored
hold_preemption
Whether the preemption is disabled for the filtered context
ctx_handle
A gem ctx handle for use with single_context
metrics_set
An ID for an OA unit metric set advertised via sysfs
oa_format
An OA unit HW report format
oa_periodic
Whether to enable periodic OA unit sampling
oa_period_exponent
The OA unit sampling period is derived from this
engine
The engine (typically rcs0) being monitored by the OA unit
has_sseu
Whether sseu was specified by userspace
sseu
internal SSEU configuration computed either from the userspace specified configuration in the opening parameters or a default value (see get_default_sseu_config())
poll_oa_period
The period in nanoseconds at which the CPU will check for OA data availability

Description

As read_properties_unlocked() enumerates and validates the properties given to open a stream of metrics the configuration is built up in the structure which starts out zero initialized.

bool oa_buffer_check_unlocked(struct i915_perf_stream * stream)

check for data and update tail ptr state

Parameters

struct i915_perf_stream * stream
i915 stream instance

Description

This is either called via fops (for blocking reads in user ctx) or the poll check hrtimer (atomic ctx) to check the OA buffer tail pointer and check if there is data available for userspace to read.

This function is central to providing a workaround for the OA unit tail pointer having a race with respect to what data is visible to the CPU. It is responsible for reading tail pointers from the hardware and giving the pointers time to ‘age’ before they are made available for reading. (See description of OA_TAIL_MARGIN_NSEC above for further details.)

Besides returning true when there is data available to read() this function also updates the tail, aging_tail and aging_timestamp in the oa_buffer object.

Note

It’s safe to read OA config state here unlocked, assuming that this is only called while the stream is enabled, while the global OA configuration can’t be modified.

Return

true if the OA buffer contains data, else false
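
As a rough illustration of the tail 'aging' idea described above, here is a simplified userspace model. It is a sketch under stated assumptions, not the driver's code: the struct layout, the function name, the margin value and the exact promotion rule are all hypothetical, and the hardware tail and current time are passed in as plain arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TAIL_MARGIN_NSEC	100000ull	/* hypothetical aging margin */

/* Hypothetical, reduced model of the OA buffer pointer state. */
struct sketch_oa_buffer {
	uint32_t head;			/* next report userspace will read */
	uint32_t tail;			/* tail considered safe to read up to */
	uint32_t aging_tail;		/* latest hardware tail, still aging */
	uint64_t aging_timestamp;	/* when aging_tail was sampled, in ns */
};

/*
 * Sample the hardware tail and report whether aged data is available.
 * A freshly sampled tail is not exposed until it has aged for longer
 * than the margin.
 */
static bool sketch_oa_buffer_check(struct sketch_oa_buffer *b,
				   uint32_t hw_tail, uint64_t now)
{
	assert(b);

	/* Promote the aging tail once it has been left alone long enough. */
	if (b->tail != b->aging_tail &&
	    now - b->aging_timestamp > SKETCH_TAIL_MARGIN_NSEC)
		b->tail = b->aging_tail;

	/* The hardware moved on: start aging the new pointer. */
	if (hw_tail != b->aging_tail) {
		b->aging_tail = hw_tail;
		b->aging_timestamp = now;
	}

	return b->tail != b->head;
}
```

In this model a caller polling from a timer sees false until a sampled tail has stayed unchanged for longer than the margin, after which the reports up to that tail become readable.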

int append_oa_status(struct i915_perf_stream * stream, char __user * buf, size_t count, size_t * offset, enum drm_i915_perf_record_type type)

Appends a status record to a userspace read() buffer.

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
size_t * offset
(inout): the current position for writing into buf
enum drm_i915_perf_record_type type
The kind of status to report to userspace

Description

Writes a status record (such as DRM_I915_PERF_RECORD_OA_REPORT_LOST) into the userspace read() buffer.

The buf offset will only be updated on success.

Return

0 on success, negative error code on failure.
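
The 'offset is only updated on success' contract can be sketched in plain userspace C as follows. This is an illustration only: struct record_header and append_status() are hypothetical stand-ins, and memcpy() takes the place of copy_to_user().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the header that precedes each record. */
struct record_header {
	uint32_t type;
	uint16_t pad;
	uint16_t size;	/* total record size in bytes, header included */
};

/*
 * Append a header-only status record at buf + *offset.
 * memcpy() stands in for copy_to_user(); a driver would return -EFAULT
 * if that copy failed.
 */
static int append_status(char *buf, size_t count, size_t *offset,
			 uint32_t type)
{
	struct record_header header = {
		.type = type,
		.pad = 0,
		.size = sizeof(header),
	};

	assert(buf && offset && *offset <= count);

	/* Not enough room left: report it and leave *offset untouched. */
	if (count - *offset < header.size)
		return -ENOSPC;

	memcpy(buf + *offset, &header, sizeof(header));

	/* Advance the write position only after the record is in place. */
	*offset += header.size;
	return 0;
}
```

Because the position only moves after a complete copy, a caller can retry with a larger buffer without ever seeing a partially written record.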

int append_oa_sample(struct i915_perf_stream * stream, char __user * buf, size_t count, size_t * offset, const u8 * report)

Copies single OA report into userspace read() buffer.

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
size_t * offset
(inout): the current position for writing into buf
const u8 * report
A single OA report to (optionally) include as part of the sample

Description

The contents of a sample are configured through DRM_I915_PERF_PROP_SAMPLE_* properties when opening a stream, tracked as stream->sample_flags. This function copies the requested components of a single sample to the given read() buf.

The buf offset will only be updated on success.

Return

0 on success, negative error code on failure.

int gen8_append_oa_reports(struct i915_perf_stream * stream, char __user * buf, size_t count, size_t * offset)

Copies all buffered OA reports into userspace read() buffer.

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
size_t * offset
(inout): the current position for writing into buf

Description

Notably any error condition resulting in a short read (-ENOSPC or -EFAULT) will be returned even though one or more records may have been successfully copied. In this case it’s up to the caller to decide if the error should be squashed before returning to userspace.

Note

reports are consumed from the head, and appended to the tail, so the tail chases the head?… If you think that’s mad and back-to-front you’re not alone, but this follows the Gen PRM naming convention.

Return

0 on success, negative error code on failure.
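
The head/tail convention and the short-read behaviour described above can be modelled with a small circular buffer. This is a sketch, not the driver's implementation: the report size, the buffer size and append_reports() are hypothetical, and memcpy() replaces the copy to userspace.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REPORT_SIZE	64u			/* hypothetical report size */
#define OA_BUF_SIZE	(8u * REPORT_SIZE)	/* hypothetical, power of two */

/*
 * Copy whole reports from [head, tail) of a circular buffer into buf.
 * head and tail are byte offsets into oa_buf, multiples of REPORT_SIZE.
 * Returns 0 once the buffer is drained, or -ENOSPC on a short read; in
 * both cases *offset and *head account for the reports already copied.
 */
static int append_reports(const uint8_t *oa_buf, uint32_t *head,
			  uint32_t tail, char *buf, size_t count,
			  size_t *offset)
{
	const uint32_t mask = OA_BUF_SIZE - 1;

	assert(oa_buf && head && buf && offset && *offset <= count);

	while (((tail - *head) & mask) != 0) {
		if (count - *offset < REPORT_SIZE)
			return -ENOSPC;

		memcpy(buf + *offset, oa_buf + *head, REPORT_SIZE);
		*offset += REPORT_SIZE;

		/* The reader advances head; the hardware advances tail. */
		*head = (*head + REPORT_SIZE) & mask;
	}

	return 0;
}
```

Note that -ENOSPC can be returned after several reports have been copied, which is why the caller has to look at the updated offset before deciding whether to pass the error on.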

int gen8_oa_read(struct i915_perf_stream * stream, char __user * buf, size_t count, size_t * offset)

copy status records then buffered OA reports

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
size_t * offset
(inout): the current position for writing into buf

Description

Checks OA unit status registers and, if necessary, appends corresponding status records for userspace (such as for a buffer full condition) and then initiates appending any buffered OA reports.

Updates offset according to the number of bytes successfully copied into the userspace buffer.

NB: some data may be successfully copied to the userspace buffer even if an error is returned, and this is reflected in the updated offset.

Return

zero on success or a negative error code

int gen7_append_oa_reports(struct i915_perf_stream * stream, char __user * buf, size_t count, size_t * offset)

Copies all buffered OA reports into userspace read() buffer.

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
size_t * offset
(inout): the current position for writing into buf

Description

Notably any error condition resulting in a short read (-ENOSPC or -EFAULT) will be returned even though one or more records may have been successfully copied. In this case it’s up to the caller to decide if the error should be squashed before returning to userspace.

Note

reports are consumed from the head, and appended to the tail, so the tail chases the head?… If you think that’s mad and back-to-front you’re not alone, but this follows the Gen PRM naming convention.

Return

0 on success, negative error code on failure.

int gen7_oa_read(struct i915_perf_stream * stream, char __user * buf, size_t count, size_t * offset)

copy status records then buffered OA reports

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
size_t * offset
(inout): the current position for writing into buf

Description

Checks Gen 7 specific OA unit status registers and, if necessary, appends corresponding status records for userspace (such as for a buffer full condition) and then initiates appending any buffered OA reports.

Updates offset according to the number of bytes successfully copied into the userspace buffer.

Return

zero on success or a negative error code

int i915_oa_wait_unlocked(struct i915_perf_stream * stream)

handles blocking IO until OA data available

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics

Description

Called when userspace tries to read() from a blocking stream FD opened for OA metrics. It waits until the hrtimer callback finds a non-empty OA buffer and wakes us.

Note

it’s acceptable to have this return with some false positives since any subsequent read handling will return -EAGAIN if there isn’t really data ready for userspace yet.

Return

zero on success or a negative error code

void i915_oa_poll_wait(struct i915_perf_stream * stream, struct file * file, poll_table * wait)

call poll_wait() for an OA stream poll()

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
struct file * file
An i915 perf stream file
poll_table * wait
poll() state table

Description

For handling userspace polling on an i915 perf stream opened for OA metrics, this starts a poll_wait with the wait queue that our hrtimer callback wakes when it sees data ready to read in the circular OA buffer.

int i915_oa_read(struct i915_perf_stream * stream, char __user * buf, size_t count, size_t * offset)

just calls through to i915_oa_ops->read

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
size_t * offset
(inout): the current position for writing into buf

Description

Updates offset according to the number of bytes successfully copied into the userspace buffer.

Return

zero on success or a negative error code

int oa_get_render_ctx_id(struct i915_perf_stream * stream)

determine and hold ctx hw id

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics

Description

Determine the render context hw id, and ensure it remains fixed for the lifetime of the stream. This ensures that we don’t have to worry about updating the context ID in OACONTROL on the fly.

Return

zero on success or a negative error code

void oa_put_render_ctx_id(struct i915_perf_stream * stream)

counterpart to oa_get_render_ctx_id releases hold

Parameters

struct i915_perf_stream * stream
An i915-perf stream opened for OA metrics

Description

In case anything needed doing to ensure the context HW ID would remain valid for the lifetime of the stream, then that can be undone here.

void i915_oa_stream_enable(struct i915_perf_stream * stream)

handle I915_PERF_IOCTL_ENABLE for OA stream

Parameters

struct i915_perf_stream * stream
An i915 perf stream opened for OA metrics

Description

[Re]enables hardware periodic sampling according to the period configured when opening the stream. This also starts a hrtimer that will periodically check for data in the circular OA buffer for notifying userspace (e.g. during a read() or poll()).

void i915_oa_stream_disable(struct i915_perf_stream * stream)

handle I915_PERF_IOCTL_DISABLE for OA stream

Parameters

struct i915_perf_stream * stream
An i915 perf stream opened for OA metrics

Description

Stops the OA unit from periodically writing counter reports into the circular OA buffer. This also stops the hrtimer that periodically checks for data in the circular OA buffer, for notifying userspace.

int i915_oa_stream_init(struct i915_perf_stream * stream, struct drm_i915_perf_open_param * param, struct perf_open_properties * props)

validate combined props for OA stream and init

Parameters

struct i915_perf_stream * stream
An i915 perf stream
struct drm_i915_perf_open_param * param
The open parameters passed to DRM_I915_PERF_OPEN
struct perf_open_properties * props
The property state that configures stream (individually validated)

Description

While read_properties_unlocked() validates properties in isolation it doesn’t ensure that the combination necessarily makes sense.

At this point it has been determined that userspace wants a stream of OA metrics, but we still need to validate that the combined properties are OK.

If the configuration makes sense then we can allocate memory for a circular OA buffer and apply the requested metric set configuration.

Return

zero on success or a negative error code.

ssize_t i915_perf_read(struct file * file, char __user * buf, size_t count, loff_t * ppos)

handles read() FOP for i915 perf stream FDs

Parameters

struct file * file
An i915 perf stream file
char __user * buf
destination buffer given by userspace
size_t count
the number of bytes userspace wants to read
loff_t * ppos
(inout) file seek position (unused)

Description

The entry point for handling a read() on a stream file descriptor from userspace. Most of the work is left to i915_perf_read_locked() and i915_perf_stream_ops->read, but to save each stream implementation (of which we might have multiple later) from having to deal with blocking itself, we handle blocking reads here.

We can also consistently treat trying to read from a disabled stream as an IO error so implementations can assume the stream is enabled while reading.

Return

The number of bytes copied or a negative error code on failure.
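
One way to turn a stream implementation's result into a read()-style return value is sketched below. The helper name and its exact policy are assumptions made for illustration; the sketch only shows the idea that an error may be squashed once some bytes have been copied, and that an empty read without an error is reported as -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Fold the result of a stream read into a read()-style return value.
 * ret:    0 or a negative error code from the stream implementation
 * offset: number of bytes already copied into the userspace buffer
 */
static long read_result(int ret, size_t offset)
{
	assert(ret <= 0);

	/* Some data was copied: report it and squash any late error. */
	if (offset)
		return (long)offset;

	/* Nothing copied: pass the error on, or -EAGAIN if there was none. */
	return ret ? ret : -EAGAIN;
}
```

With this policy a short read caused by -ENOSPC after 128 copied bytes is reported to userspace as a successful read of 128 bytes.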

__poll_t i915_perf_poll_locked(struct i915_perf_stream * stream, struct file * file, poll_table * wait)

poll_wait() with a suitable wait queue for stream

Parameters

struct i915_perf_stream * stream
An i915 perf stream
struct file * file
An i915 perf stream file
poll_table * wait
poll() state table

Description

For handling userspace polling on an i915 perf stream, this calls through to i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that will be woken for new stream data.

Note

The perf->lock mutex has been taken to serialize with any non-file-operation driver hooks.

Return

any poll events that are ready without sleeping

__poll_t i915_perf_poll(struct file * file, poll_table * wait)

call poll_wait() with a suitable wait queue for stream

Parameters

struct file * file
An i915 perf stream file
poll_table * wait
poll() state table

Description

For handling userspace polling on an i915 perf stream, this ensures poll_wait() gets called with a wait queue that will be woken for new stream data.

Note

Implementation deferred to i915_perf_poll_locked()

Return

any poll events that are ready without sleeping

void i915_perf_enable_locked(struct i915_perf_stream * stream)

handle I915_PERF_IOCTL_ENABLE ioctl

Parameters

struct i915_perf_stream * stream
A disabled i915 perf stream

Description

[Re]enables the associated capture of data for this stream.

If a stream was previously enabled then there’s currently no intention to provide userspace any guarantee about the preservation of previously buffered data.

void i915_perf_disable_locked(struct i915_perf_stream * stream)

handle I915_PERF_IOCTL_DISABLE ioctl

Parameters

struct i915_perf_stream * stream
An enabled i915 perf stream

Description

Disables the associated capture of data for this stream.

The intention is that disabling and re-enabling a stream will ideally be cheaper than destroying and re-opening a stream with the same configuration, though there are no formal guarantees about what state or buffered data must be retained between disabling and re-enabling a stream.

Note

while a stream is disabled it’s considered an error for userspace to attempt to read from the stream (-EIO).

long i915_perf_ioctl_locked(struct i915_perf_stream * stream, unsigned int cmd, unsigned long arg)

support ioctl() usage with i915 perf stream FDs

Parameters

struct i915_perf_stream * stream
An i915 perf stream
unsigned int cmd
the ioctl request
unsigned long arg
the ioctl data

Note

The perf->lock mutex has been taken to serialize with any non-file-operation driver hooks.

Return

zero on success or a negative error code. Returns -EINVAL for an unknown ioctl request.
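
A minimal sketch of this kind of dispatch, assuming a stream that only tracks an enabled flag. The request codes, struct and function names are hypothetical; the real request codes come from the i915 uAPI header and the real handlers do considerably more work.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical request codes, not the real I915_PERF_IOCTL_* values. */
enum {
	SKETCH_IOCTL_ENABLE = 1,
	SKETCH_IOCTL_DISABLE = 2,
};

struct sketch_stream {
	bool enabled;
};

/* Dispatch a stream ioctl; the caller is assumed to hold the lock. */
static long sketch_ioctl_locked(struct sketch_stream *stream,
				unsigned int cmd)
{
	assert(stream);

	switch (cmd) {
	case SKETCH_IOCTL_ENABLE:
		stream->enabled = true;
		return 0;
	case SKETCH_IOCTL_DISABLE:
		stream->enabled = false;
		return 0;
	}

	/* Unknown request. */
	return -EINVAL;
}

/* Reading from a disabled stream is treated as an I/O error. */
static long sketch_read_check(const struct sketch_stream *stream)
{
	return stream->enabled ? 0 : -EIO;
}
```

Keeping the enabled check in one place means a read implementation can assume the stream is enabled whenever it is called.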

long i915_perf_ioctl(struct file * file, unsigned int cmd, unsigned long arg)

support ioctl() usage with i915 perf stream FDs

Parameters

struct file * file
An i915 perf stream file
unsigned int cmd
the ioctl request
unsigned long arg
the ioctl data

Description

Implementation deferred to i915_perf_ioctl_locked().

Return

zero on success or a negative error code. Returns -EINVAL for an unknown ioctl request.

void i915_perf_destroy_locked(struct i915_perf_stream * stream)

destroy an i915 perf stream

Parameters

struct i915_perf_stream * stream
An i915 perf stream

Description

Frees all resources associated with the given i915 perf stream, disabling any associated data capture in the process.

Note

The perf->lock mutex has been taken to serialize with any non-file-operation driver hooks.

int i915_perf_release(struct inode * inode, struct file * file)

handles userspace close() of a stream file

Parameters

struct inode * inode
anonymous inode associated with file
struct file * file
An i915 perf stream file

Description

Cleans up any resources associated with an open i915 perf stream file.

NB: close() can’t really fail from the userspace point of view.

Return

zero on success or a negative error code.

int i915_perf_open_ioctl_locked(struct i915_perf * perf, struct drm_i915_perf_open_param * param, struct perf_open_properties * props, struct drm_file * file)

DRM ioctl() for userspace to open a stream FD

Parameters

struct i915_perf * perf
i915 perf instance
struct drm_i915_perf_open_param * param
The open parameters passed to DRM_I915_PERF_OPEN
struct perf_open_properties * props
individually validated u64 property value pairs
struct drm_file * file
drm file

Description

See i915_perf_open_ioctl() for interface details.

Implements further stream config validation and stream initialization on behalf of i915_perf_open_ioctl() with the perf->lock mutex taken to serialize with any non-file-operation driver hooks.

Note

at this point the props have only been validated in isolation and it’s still necessary to validate that the combination of properties makes sense.

In the case where userspace is interested in OA unit metrics then further config validation and stream initialization details will be handled by i915_oa_stream_init(). The code here should only validate config state that will be relevant to all stream types / backends.

Return

zero on success or a negative error code.

int read_properties_unlocked(struct i915_perf * perf, u64 __user * uprops, u32 n_props, struct perf_open_properties * props)

validate + copy userspace stream open properties

Parameters

struct i915_perf * perf
i915 perf instance
u64 __user * uprops
The array of u64 key value pairs given by userspace
u32 n_props
The number of key value pairs expected in uprops
struct perf_open_properties * props
The stream configuration built up while validating properties

Description

Note that this function only validates properties in isolation; it doesn't validate that the combination of properties makes sense or that all properties necessary for a particular kind of stream have been set.

Note that there currently aren’t any ordering requirements for properties so we shouldn’t validate or assume anything about ordering here. This doesn’t rule out defining new properties with ordering requirements in the future.
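
The following sketch shows what validating key/value pairs in isolation might look like. Everything named SKETCH_* or sketch_* is hypothetical (the real property IDs are the DRM_I915_PERF_PROP_* values), the bound on the exponent is made up, and the pairs are read from a plain array rather than from userspace memory.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical property IDs, not the real uAPI values. */
enum {
	SKETCH_PROP_CTX_HANDLE = 1,
	SKETCH_PROP_OA_EXPONENT = 2,
};

#define SKETCH_OA_EXPONENT_MAX	31	/* hypothetical upper bound */

struct sketch_props {
	bool single_context;
	uint64_t ctx_handle;
	bool oa_periodic;
	int oa_period_exponent;
};

/*
 * Walk n_props (id, value) pairs and build up props. Each property is
 * checked on its own; no ordering between properties is assumed and no
 * cross-property consistency is checked here.
 */
static int sketch_read_properties(const uint64_t *uprops, uint32_t n_props,
				  struct sketch_props *props)
{
	uint32_t i;

	assert(uprops && props);

	/* The configuration starts out zero initialized. */
	*props = (struct sketch_props){ 0 };

	for (i = 0; i < n_props; i++) {
		uint64_t id = uprops[2 * i];
		uint64_t value = uprops[2 * i + 1];

		switch (id) {
		case SKETCH_PROP_CTX_HANDLE:
			props->single_context = true;
			props->ctx_handle = value;
			break;
		case SKETCH_PROP_OA_EXPONENT:
			if (value > SKETCH_OA_EXPONENT_MAX)
				return -EINVAL;
			props->oa_periodic = true;
			props->oa_period_exponent = (int)value;
			break;
		default:
			return -EINVAL;	/* unknown property */
		}
	}

	return 0;
}
```

Whether a context handle makes sense together with a given sampling period is deliberately left to a later stage, mirroring the split between per-property and combined validation described here.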

int i915_perf_open_ioctl(struct drm_device * dev, void * data, struct drm_file * file)

DRM ioctl() for userspace to open a stream FD

Parameters

struct drm_device * dev
drm device
void * data
ioctl data copied from userspace (unvalidated)
struct drm_file * file
drm file

Description

Validates the stream open parameters given by userspace including flags and an array of u64 key, value pair properties.

Very little is assumed up front about the nature of the stream being opened (for instance we don’t assume it’s for periodic OA unit metrics). An i915-perf stream is expected to be a suitable interface for other forms of buffered data written by the GPU besides periodic OA metrics.

Note we copy the properties from userspace outside of the i915 perf mutex to avoid an awkward lockdep with mmap_lock.

Most of the implementation details are handled by i915_perf_open_ioctl_locked() after taking the perf->lock mutex for serializing with any non-file-operation driver hooks.

Return

A newly opened i915 Perf stream file descriptor or negative error code on failure.

void i915_perf_register(struct drm_i915_private * i915)

exposes i915-perf to userspace

Parameters

struct drm_i915_private * i915
i915 device instance

Description

In particular OA metric sets are advertised under a sysfs metrics/ directory allowing userspace to enumerate valid IDs that can be used to open an i915-perf stream.

void i915_perf_unregister(struct drm_i915_private * i915)

hide i915-perf from userspace

Parameters

struct drm_i915_private * i915
i915 device instance

Description

i915-perf state cleanup is split up into an ‘unregister’ and ‘deinit’ phase where the interface is first hidden from userspace by i915_perf_unregister() before cleaning up remaining state in i915_perf_fini().

int i915_perf_add_config_ioctl(struct drm_device * dev, void * data, struct drm_file * file)

DRM ioctl() for userspace to add a new OA config

Parameters

struct drm_device * dev
drm device
void * data
ioctl data (pointer to struct drm_i915_perf_oa_config) copied from userspace (unvalidated)
struct drm_file * file
drm file

Description

Validates the submitted OA registers to be saved into a new OA config that can then be used for programming the OA unit and its NOA network.

Return

A newly allocated config number to be used with the perf open ioctl or a negative error code on failure.

int i915_perf_remove_config_ioctl(struct drm_device * dev, void * data, struct drm_file * file)

DRM ioctl() for userspace to remove an OA config

Parameters

struct drm_device * dev
drm device
void * data
ioctl data (pointer to u64 integer) copied from userspace
struct drm_file * file
drm file

Description

Configs can be removed while being used; they will stop appearing in sysfs and their content will be freed when the stream using the config is closed.

Return

0 on success or a negative error code on failure.

void i915_perf_init(struct drm_i915_private * i915)

initialize i915-perf state on module bind

Parameters

struct drm_i915_private * i915
i915 device instance

Description

Initializes i915-perf state without exposing anything to userspace.

Note

i915-perf initialization is split into an ‘init’ and ‘register’ phase with the i915_perf_register() exposing state to userspace.

void i915_perf_fini(struct drm_i915_private * i915)

Counterpart to i915_perf_init()

Parameters

struct drm_i915_private * i915
i915 device instance

int i915_perf_ioctl_version(void)

Version of the i915-perf subsystem

Parameters

void
no arguments

Description

This version number is used by userspace to detect available features.

Style

The drm/i915 driver codebase has some style rules in addition to (and, in some cases, deviating from) the kernel coding style.

Register macro definition style

The style guide for i915_reg.h.

Follow the style described here for new macros, and while changing existing macros. Do not mass change existing definitions just to update the style.

File Layout

Keep helper macros near the top. For example, _PIPE() and friends.

Prefix macros that generally should not be used outside of this file with underscore ‘_’. For example, _PIPE() and friends, single instances of registers that are defined solely for the use by function-like macros.

Avoid using the underscore prefixed macros outside of this file. There are exceptions, but keep them to a minimum.

There are two basic types of register definitions: Single registers and register groups. Register groups are registers which have two or more instances, for example one per pipe, port, transcoder, etc. Register groups should be defined using function-like macros.

For single registers, define the register offset first, followed by register contents.

For register groups, define the register instance offsets first, prefixed with underscore, followed by a function-like macro choosing the right instance based on the parameter, followed by register contents.

Define the register contents (i.e. bit and bit field macros) from most significant to least significant bit. Indent the register content macros using two extra spaces between #define and the macro name.

Define bit fields using REG_GENMASK(h, l). Define bit field contents using REG_FIELD_PREP(mask, value). This will define the values already shifted in place, so they can be directly OR’d together. For convenience, function-like macros may be used to define bit fields, but do note that the macros may be needed to read as well as write the register contents.

Define bits using REG_BIT(N). Do not add _BIT suffix to the name.

Group the register and its contents together without blank lines, separate from other registers and their contents with one blank line.

Indent macro values from macro names using TABs. Align values vertically. Use parentheses in macro values as needed to avoid unintended precedence after macro substitution. Use spaces in macro values according to kernel coding style. Use lower case in hexadecimal values.

Naming

Try to name registers according to the specs. If the register name changes in the specs from one platform to another, stick to the original name.

Try to re-use existing register macro definitions. Only add new macros for new register offsets, or when the register contents have changed enough to warrant a full redefinition.

When a register macro changes for a new platform, prefix the new macro using the platform acronym or generation. For example, SKL_ or GEN8_. The prefix signifies the start platform/generation using the register.

When a bit (field) macro changes or gets added for a new platform, while retaining the existing register macro, add a platform acronym or generation suffix to the name. For example, _SKL or _GEN8.

Examples

(Note that the values in the example are indented using spaces instead of TABs to avoid misalignment in generated documentation. Use TABs in the definitions.):

#define _FOO_A                      0xf000
#define _FOO_B                      0xf001
#define FOO(pipe)                   _MMIO_PIPE(pipe, _FOO_A, _FOO_B)
#define   FOO_ENABLE                REG_BIT(31)
#define   FOO_MODE_MASK             REG_GENMASK(19, 16)
#define   FOO_MODE_BAR              REG_FIELD_PREP(FOO_MODE_MASK, 0)
#define   FOO_MODE_BAZ              REG_FIELD_PREP(FOO_MODE_MASK, 1)
#define   FOO_MODE_QUX_SNB          REG_FIELD_PREP(FOO_MODE_MASK, 2)

#define BAR                         _MMIO(0xb000)
#define GEN8_BAR                    _MMIO(0xb888)
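
To show how such definitions compose, here is a small userspace sketch. The SKETCH_REG_* macros are simplified stand-ins written for this example and are not the kernel's REG_BIT(), REG_GENMASK() and REG_FIELD_PREP(), whose implementation differs; foo_set_mode() is likewise hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified 32-bit stand-ins for the register helper macros. */
#define SKETCH_REG_BIT(n)		((uint32_t)1 << (n))
#define SKETCH_REG_GENMASK(h, l) \
	((~(uint32_t)0 >> (31 - (h))) & (~(uint32_t)0 << (l)))
/* (mask & (~mask + 1)) isolates the lowest set bit of the mask. */
#define SKETCH_REG_FIELD_PREP(mask, val) \
	(((uint32_t)(val) * ((mask) & (~(mask) + 1))) & (mask))
#define SKETCH_REG_FIELD_GET(mask, val) \
	(((uint32_t)(val) & (mask)) / ((mask) & (~(mask) + 1)))

#define   FOO_ENABLE		SKETCH_REG_BIT(31)
#define   FOO_MODE_MASK		SKETCH_REG_GENMASK(19, 16)
#define   FOO_MODE_BAR		SKETCH_REG_FIELD_PREP(FOO_MODE_MASK, 0)
#define   FOO_MODE_BAZ		SKETCH_REG_FIELD_PREP(FOO_MODE_MASK, 1)

/*
 * Replace the mode field of a register value. mode_bits must already be
 * shifted in place, as the FOO_MODE_* values are.
 */
static uint32_t foo_set_mode(uint32_t reg, uint32_t mode_bits)
{
	assert((mode_bits & ~FOO_MODE_MASK) == 0);

	return (reg & ~FOO_MODE_MASK) | mode_bits;
}
```

Because the field values are defined already shifted, a read-modify-write only needs the mask and a bitwise OR, and the same mask can be reused to extract the field when reading the register back.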