gpu

zeus.device.gpu

Abstraction layer for GPU devices.

The main function of this module is get_gpus, which returns a GPU Manager object specific to the platform.

Important

In theory, any NVIDIA GPU should be supported. For AMD GPUs, only ROCm 6.2 and later is currently supported.

Getting handles to GPUs

The main API exported from this module is the get_gpus function. It returns either NVIDIAGPUs or AMDGPUs depending on the platform.

from zeus.device import get_gpus
gpus = get_gpus()

Calling GPU management APIs

GPU management library APIs are mapped to methods on GPU.

For example, with NVIDIA GPUs (which use pynvml), you would previously have called:

handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
constraints = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)

With the Zeus GPU abstraction layer, you would now call:

gpus = get_gpus() # returns an NVIDIAGPUs object
constraints = gpus.get_power_management_limit_constraints(gpu_index)
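As a rough picture of this mapping, the sketch below uses two stand-in classes. `FakeGPU` and `FakeGPUs` are hypothetical and the wattage figures are invented; only the snake_case method name and the `gpu_index` argument are taken from this page. A real manager would call into the vendor library instead of returning stored values.

```python
class FakeGPU:
    """Hypothetical per-device object; a real one would call a vendor library such as NVML."""

    def __init__(self, gpu_index: int, min_mw: int, max_mw: int) -> None:
        self.gpu_index = gpu_index
        self._constraints = (min_mw, max_mw)

    def get_power_management_limit_constraints(self) -> tuple:
        # Units: mW, matching the convention documented on this page.
        return self._constraints


class FakeGPUs:
    """Hypothetical manager: each method takes gpu_index and forwards to that GPU."""

    def __init__(self, gpus: list) -> None:
        self._gpus = gpus

    def __len__(self) -> int:
        return len(self._gpus)

    def get_power_management_limit_constraints(self, gpu_index: int) -> tuple:
        return self._gpus[gpu_index].get_power_management_limit_constraints()


gpus = FakeGPUs([FakeGPU(0, 100_000, 300_000), FakeGPU(1, 150_000, 400_000)])
for i in range(len(gpus)):
    min_mw, max_mw = gpus.get_power_management_limit_constraints(i)
    print(f"GPU {i}: {min_mw / 1000:.0f} W to {max_mw / 1000:.0f} W")
```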

Non-blocking calls

Some GPU implementations support non-blocking calls to setters. If non-blocking calls are not supported, the block argument is ignored and the call blocks. Check GPU.supports_nonblocking_setters to see whether non-blocking calls are supported. Note that a non-blocking call does not raise an exception even if it fails.

Currently, only ZeusdNVIDIAGPU supports non-blocking calls to methods that set the GPU's power limit, GPU frequency, memory frequency, and persistence mode. This is possible because the Zeus daemon supports a block: bool parameter in HTTP requests, which can be set to False to make the call return immediately without checking the result.
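A caller can decide up front whether a setter will really return immediately. The sketch below is illustrative only: `StubGPU` and `set_power_limit` are hypothetical names, and the stub records calls instead of touching hardware. The property name `supports_nonblocking_setters` and the `block` parameter follow the `GPU` class documented on this page.

```python
class StubGPU:
    """Hypothetical stand-in for a GPU object; records calls instead of touching hardware."""

    def __init__(self, supports_nonblocking: bool) -> None:
        self._supports_nonblocking = supports_nonblocking
        self.calls = []  # (power_limit_mw, effective_block) pairs

    @property
    def supports_nonblocking_setters(self) -> bool:
        return self._supports_nonblocking

    def set_power_management_limit(self, power_limit_mw: int, block: bool = True) -> None:
        # Per this page, implementations without non-blocking support ignore block=False.
        effective_block = block or not self._supports_nonblocking
        self.calls.append((power_limit_mw, effective_block))


def set_power_limit(gpu: StubGPU, power_limit_mw: int, want_nonblocking: bool) -> bool:
    """Set the limit and return True if the call was blocking."""
    blocking = not (want_nonblocking and gpu.supports_nonblocking_setters)
    gpu.set_power_management_limit(power_limit_mw, block=blocking)
    return blocking


nonblocking_gpu = StubGPU(supports_nonblocking=True)
blocking_gpu = StubGPU(supports_nonblocking=False)
print(set_power_limit(nonblocking_gpu, 200_000, want_nonblocking=True))
print(set_power_limit(blocking_gpu, 200_000, want_nonblocking=True))
```

Since a non-blocking call does not report failure, a caller that needs confirmation could read the value back later, for example with get_power_management_limit.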

Error handling

The following exceptions are defined in this module:

ZeusBaseGPUError

Bases: ZeusBaseError

Zeus base GPU exception class.

Source code in zeus/device/exception.py
class ZeusBaseGPUError(ZeusBaseError):
    """Zeus base GPU exception class."""

    def __init__(self, message: str) -> None:
        """Initialize Base Zeus Exception."""
        super().__init__(message)

__init__

__init__(message)
Source code in zeus/device/exception.py
def __init__(self, message: str) -> None:
    """Initialize Base Zeus Exception."""
    super().__init__(message)
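A shared base class lets callers catch every GPU-related error with one `except` clause. The sketch below is self-contained and uses stand-ins: `ZeusBaseError` is re-declared here on the assumption that it derives from `Exception`, and `HypotheticalGPUNotFoundError` and `query` are invented for illustration.

```python
class ZeusBaseError(Exception):
    """Stand-in for the real base class, assumed here to derive from Exception."""


class ZeusBaseGPUError(ZeusBaseError):
    """Mirrors the class shown above."""

    def __init__(self, message: str) -> None:
        super().__init__(message)


class HypotheticalGPUNotFoundError(ZeusBaseGPUError):
    """Invented subclass, used only to show that catching the base class is enough."""


def query(gpu_index: int) -> None:
    raise HypotheticalGPUNotFoundError(f"GPU {gpu_index} not found")


try:
    query(7)
except ZeusBaseGPUError as err:
    # Any subclass of ZeusBaseGPUError lands here.
    caught = str(err)
    print(caught)
```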

DeprecatedAliasABCMeta

Bases: ABCMeta

Metaclass that combines ABC functionality with automatic deprecated alias creation.

This metaclass looks for methods decorated with @deprecated_alias and automatically creates the old camelCase method names that emit deprecation warnings once and then call the new snake_case methods.

Since this is frequently composed with abc.ABCMeta, this metaclass inherits from it to avoid metaclass conflicts.

Example

class MyClass(abc.ABC, metaclass=DeprecatedAliasABCMeta):
    @deprecated_alias("oldMethod")
    @abc.abstractmethod
    def new_method(self):
        pass

class MyImplementation(MyClass):
    def new_method(self):
        return "implementation"

obj = MyImplementation()
obj.new_method()  # No warning
obj.oldMethod()   # Emits deprecation warning, calls new_method
Source code in zeus/device/common.py
class DeprecatedAliasABCMeta(abc.ABCMeta):
    """Metaclass that combines ABC functionality with automatic deprecated alias creation.

    This metaclass looks for methods decorated with `@deprecated_alias` and automatically
    creates the old camelCase method names that emit deprecation warnings once and then
    call the new snake_case methods.

    Since this is frequently composed with `abc.ABCMeta`, this metaclass inherits from it
    to avoid metaclass conflicts.

    !!! Example
        ```python
        class MyClass(abc.ABC, metaclass=DeprecatedAliasABCMeta):
            @deprecated_alias("oldMethod")
            @abc.abstractmethod
            def new_method(self):
                pass

        class MyImplementation(MyClass):
            def new_method(self):
                return "implementation"

        obj = MyImplementation()
        obj.new_method()  # No warning
        obj.oldMethod()   # Emits deprecation warning, calls new_method
        ```
    """

    def __new__(mcs, name, bases, namespace):
        """Create the class and add deprecated alias methods."""
        cls = super().__new__(mcs, name, bases, namespace)

        # Create deprecated aliases for methods marked with @deprecated_alias
        for attr_name in dir(cls):
            if attr_name.startswith("_"):
                continue

            try:
                attr = getattr(cls, attr_name)
            except AttributeError:
                continue

            # Check if this method has the deprecated alias marker
            if hasattr(attr, "_deprecated_alias"):
                old_name = attr._deprecated_alias
                # Create and attach the deprecated wrapper method
                deprecated_method = _make_deprecated_method(attr, old_name, attr_name)
                setattr(cls, old_name, deprecated_method)

        return cls

__new__

__new__(mcs, name, bases, namespace)

Create the class and add deprecated alias methods.

Source code in zeus/device/common.py
def __new__(mcs, name, bases, namespace):
    """Create the class and add deprecated alias methods."""
    cls = super().__new__(mcs, name, bases, namespace)

    # Create deprecated aliases for methods marked with @deprecated_alias
    for attr_name in dir(cls):
        if attr_name.startswith("_"):
            continue

        try:
            attr = getattr(cls, attr_name)
        except AttributeError:
            continue

        # Check if this method has the deprecated alias marker
        if hasattr(attr, "_deprecated_alias"):
            old_name = attr._deprecated_alias
            # Create and attach the deprecated wrapper method
            deprecated_method = _make_deprecated_method(attr, old_name, attr_name)
            setattr(cls, old_name, deprecated_method)

    return cls
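The helpers `deprecated_alias` and `_make_deprecated_method` are not shown on this page. The following self-contained sketch illustrates one way the mechanism could work; the helper bodies, the two-argument helper signature, and the `AliasMeta`, `Base`, and `Impl` names are assumptions for illustration, not the Zeus source.

```python
import abc
import warnings


def deprecated_alias(old_name):
    """Mark a method so the metaclass creates `old_name` as a deprecated alias."""
    def decorator(func):
        func._deprecated_alias = old_name
        return func
    return decorator


def _make_deprecated_method(old_name, new_name):
    """Build a wrapper that warns once, then dispatches to `new_name` on the instance."""
    warned = False

    def wrapper(self, *args, **kwargs):
        nonlocal warned
        if not warned:
            warnings.warn(
                f"{old_name} is deprecated; use {new_name} instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            warned = True
        # Look the method up on the instance so subclass overrides are honored.
        return getattr(self, new_name)(*args, **kwargs)

    # The wrapper carries no `_deprecated_alias` marker, so subclasses do not re-alias it.
    wrapper.__name__ = old_name
    return wrapper


class AliasMeta(abc.ABCMeta):
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        for attr_name in dir(cls):
            if attr_name.startswith("_"):
                continue
            alias = getattr(getattr(cls, attr_name, None), "_deprecated_alias", None)
            if alias is not None:
                setattr(cls, alias, _make_deprecated_method(alias, attr_name))
        return cls


class Base(abc.ABC, metaclass=AliasMeta):
    @deprecated_alias("oldMethod")
    @abc.abstractmethod
    def new_method(self):
        pass


class Impl(Base):
    def new_method(self):
        return "implementation"


obj = Impl()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    first = obj.oldMethod()   # warns, then calls Impl.new_method
    second = obj.oldMethod()  # no second warning from this alias
```

Dispatching through `getattr(self, new_name)` rather than binding the original function is a design choice in this sketch: it lets the inherited alias reach the subclass override even though the override itself is undecorated.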

GPU

Bases: ABC

Abstract base class for managing one GPU.

For each method, child classes should call into vendor-specific GPU management libraries (e.g., NVML for NVIDIA GPUs).

Source code in zeus/device/gpu/common.py
class GPU(abc.ABC, metaclass=DeprecatedAliasABCMeta):
    """Abstract base class for managing one GPU.

    For each method, child classes should call into vendor-specific
    GPU management libraries (e.g., NVML for NVIDIA GPUs).
    """

    def __init__(self, gpu_index: int) -> None:
        """Initialize the GPU with a specified index."""
        self.gpu_index = gpu_index

    def _warn_sys_admin(self) -> None:
        """Warn the user if the current process doesn't have `SYS_ADMIN` privileges."""
        # Deriving classes can disable this warning by setting this attribute.
        if not getattr(self, "_disable_sys_admin_warning", False) and not has_sys_admin():
            warnings.warn(
                "You are about to call a GPU management API that requires "
                "`SYS_ADMIN` privileges. Some energy optimizers that change the "
                "GPU's power settings need this.\nSee "
                "https://ml.energy/zeus/getting_started/#system-privileges "
                "for more information and how to obtain `SYS_ADMIN`.",
                stacklevel=2,
            )
            # Only warn once.
            self._disable_sys_admin_warning = True

    @property
    @abc.abstractmethod
    def supports_nonblocking_setters(self) -> bool:
        """Return True if the GPU object supports non-blocking configuration setters."""
        return False

    @deprecated_alias("getName")
    @abc.abstractmethod
    def get_name(self) -> str:
        """Return the name of the GPU model."""
        pass

    @deprecated_alias("getPowerManagementLimitConstraints")
    @abc.abstractmethod
    def get_power_management_limit_constraints(self) -> tuple[int, int]:
        """Return the minimum and maximum power management limits. Units: mW."""
        pass

    @abc.abstractmethod
    def get_power_management_limit(self) -> int:
        """Return the current power management limit. Units: mW."""
        pass

    @deprecated_alias("setPowerManagementLimit")
    @abc.abstractmethod
    def set_power_management_limit(self, power_limit_mw: int, block: bool = True) -> None:
        """Set the GPU's power management limit. Unit: mW."""
        pass

    @deprecated_alias("resetPowerManagementLimit")
    @abc.abstractmethod
    def reset_power_management_limit(self, block: bool = True) -> None:
        """Reset the GPU's power management limit to the default value."""
        pass

    @deprecated_alias("setPersistenceMode")
    @abc.abstractmethod
    def set_persistence_mode(self, enabled: bool, block: bool = True) -> None:
        """Set persistence mode."""
        pass

    @abc.abstractmethod
    def get_persistence_mode(self) -> bool:
        """Return whether persistence mode is currently enabled."""
        pass

    @deprecated_alias("getSupportedMemoryClocks")
    @abc.abstractmethod
    def get_supported_memory_clocks(self) -> list[int]:
        """Return a list of supported memory clock frequencies. Units: MHz."""
        pass

    @deprecated_alias("setMemoryLockedClocks")
    @abc.abstractmethod
    def set_memory_locked_clocks(self, min_clock_mhz: int, max_clock_mhz: int, block: bool = True) -> None:
        """Lock the memory clock to a specified range. Units: MHz."""
        pass

    @deprecated_alias("resetMemoryLockedClocks")
    @abc.abstractmethod
    def reset_memory_locked_clocks(self, block: bool = True) -> None:
        """Reset the locked memory clocks to the default."""
        pass

    @deprecated_alias("getSupportedGraphicsClocks")
    @abc.abstractmethod
    def get_supported_graphics_clocks(self, memory_clock_mhz: int | None = None) -> list[int]:
        """Return a list of supported graphics clock frequencies. Units: MHz.

        Args:
            memory_clock_mhz: Memory clock frequency to use. Some GPUs have
                different supported graphics clocks depending on the memory clock.
        """
        pass

    @deprecated_alias("setGpuLockedClocks")
    @abc.abstractmethod
    def set_gpu_locked_clocks(self, min_clock_mhz: int, max_clock_mhz: int, block: bool = True) -> None:
        """Lock the GPU clock to a specified range. Units: MHz."""
        pass

    @deprecated_alias("resetGpuLockedClocks")
    @abc.abstractmethod
    def reset_gpu_locked_clocks(self, block: bool = True) -> None:
        """Reset the locked GPU clocks to the default."""
        pass

    @deprecated_alias("getAveragePowerUsage")
    @abc.abstractmethod
    def get_average_power_usage(self) -> int:
        """Return the average power usage of the GPU. Units: mW."""
        pass

    @deprecated_alias("getInstantPowerUsage")
    @abc.abstractmethod
    def get_instant_power_usage(self) -> int:
        """Return the current power draw of the GPU. Units: mW."""
        pass

    @deprecated_alias("getAverageMemoryPowerUsage")
    @abc.abstractmethod
    def get_average_memory_power_usage(self) -> int:
        """Return the average power usage of the GPU's memory. Units: mW."""
        pass

    @deprecated_alias("supportsGetTotalEnergyConsumption")
    @abc.abstractmethod
    def supports_get_total_energy_consumption(self) -> bool:
        """Check if the GPU supports retrieving total energy consumption."""
        pass

    @deprecated_alias("getTotalEnergyConsumption")
    @abc.abstractmethod
    def get_total_energy_consumption(self) -> int:
        """Return the total energy consumption of the GPU since driver load. Units: mJ."""
        pass

    @deprecated_alias("getGpuTemperature")
    @abc.abstractmethod
    def get_gpu_temperature(self) -> int:
        """Return the current GPU temperature. Units: Celsius."""
        pass

supports_nonblocking_setters abstractmethod property

supports_nonblocking_setters

Return True if the GPU object supports non-blocking configuration setters.

__init__

__init__(gpu_index)
Source code in zeus/device/gpu/common.py
def __init__(self, gpu_index: int) -> None:
    """Initialize the GPU with a specified index."""
    self.gpu_index = gpu_index

_warn_sys_admin

_warn_sys_admin()

Warn the user if the current process doesn't have SYS_ADMIN privileges.

Source code in zeus/device/gpu/common.py
def _warn_sys_admin(self) -> None:
    """Warn the user if the current process doesn't have `SYS_ADMIN` privileges."""
    # Deriving classes can disable this warning by setting this attribute.
    if not getattr(self, "_disable_sys_admin_warning", False) and not has_sys_admin():
        warnings.warn(
            "You are about to call a GPU management API that requires "
            "`SYS_ADMIN` privileges. Some energy optimizers that change the "
            "GPU's power settings need this.\nSee "
            "https://ml.energy/zeus/getting_started/#system-privileges "
            "for more information and how to obtain `SYS_ADMIN`.",
            stacklevel=2,
        )
        # Only warn once.
        self._disable_sys_admin_warning = True

get_name abstractmethod

get_name()

Return the name of the GPU model.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getName")
@abc.abstractmethod
def get_name(self) -> str:
    """Return the name of the GPU model."""
    pass

get_power_management_limit_constraints abstractmethod

get_power_management_limit_constraints()

Return the minimum and maximum power management limits. Units: mW.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getPowerManagementLimitConstraints")
@abc.abstractmethod
def get_power_management_limit_constraints(self) -> tuple[int, int]:
    """Return the minimum and maximum power management limits. Units: mW."""
    pass

get_power_management_limit abstractmethod

get_power_management_limit()

Return the current power management limit. Units: mW.

Source code in zeus/device/gpu/common.py
@abc.abstractmethod
def get_power_management_limit(self) -> int:
    """Return the current power management limit. Units: mW."""
    pass

set_power_management_limit abstractmethod

set_power_management_limit(power_limit_mw, block=True)

Set the GPU's power management limit. Unit: mW.

Source code in zeus/device/gpu/common.py
@deprecated_alias("setPowerManagementLimit")
@abc.abstractmethod
def set_power_management_limit(self, power_limit_mw: int, block: bool = True) -> None:
    """Set the GPU's power management limit. Unit: mW."""
    pass

reset_power_management_limit abstractmethod

reset_power_management_limit(block=True)

Reset the GPU's power management limit to the default value.

Source code in zeus/device/gpu/common.py
@deprecated_alias("resetPowerManagementLimit")
@abc.abstractmethod
def reset_power_management_limit(self, block: bool = True) -> None:
    """Reset the GPU's power management limit to the default value."""
    pass
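The power-limit getters and setters above are often used together: clamp a requested value to the allowed range, apply it, and restore the default afterwards. The sketch below shows that pattern against a stand-in. `StubGPU`, its wattage values, and the `power_limit` context manager are hypothetical; only the four method names and the mW unit come from this page.

```python
from contextlib import contextmanager


class StubGPU:
    """Hypothetical stand-in that keeps the power limit in memory. Units: mW."""

    def __init__(self) -> None:
        self._default_mw = 300_000
        self._limit_mw = self._default_mw

    def get_power_management_limit_constraints(self):
        return (100_000, 300_000)  # (minimum, maximum)

    def get_power_management_limit(self) -> int:
        return self._limit_mw

    def set_power_management_limit(self, power_limit_mw: int, block: bool = True) -> None:
        self._limit_mw = power_limit_mw

    def reset_power_management_limit(self, block: bool = True) -> None:
        self._limit_mw = self._default_mw


@contextmanager
def power_limit(gpu, requested_mw: int):
    """Apply a clamped power limit for the duration of a with-block, then reset it."""
    min_mw, max_mw = gpu.get_power_management_limit_constraints()
    clamped_mw = max(min_mw, min(max_mw, requested_mw))
    gpu.set_power_management_limit(clamped_mw)
    try:
        yield clamped_mw
    finally:
        # Runs even if the body raises, so the GPU is not left at a reduced limit.
        gpu.reset_power_management_limit()


gpu = StubGPU()
with power_limit(gpu, requested_mw=50_000) as applied_mw:
    inside_mw = gpu.get_power_management_limit()
after_mw = gpu.get_power_management_limit()
```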

set_persistence_mode abstractmethod

set_persistence_mode(enabled, block=True)

Set persistence mode.

Source code in zeus/device/gpu/common.py
@deprecated_alias("setPersistenceMode")
@abc.abstractmethod
def set_persistence_mode(self, enabled: bool, block: bool = True) -> None:
    """Set persistence mode."""
    pass

get_persistence_mode abstractmethod

get_persistence_mode()

Return whether persistence mode is currently enabled.

Source code in zeus/device/gpu/common.py
@abc.abstractmethod
def get_persistence_mode(self) -> bool:
    """Return whether persistence mode is currently enabled."""
    pass

get_supported_memory_clocks abstractmethod

get_supported_memory_clocks()

Return a list of supported memory clock frequencies. Units: MHz.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getSupportedMemoryClocks")
@abc.abstractmethod
def get_supported_memory_clocks(self) -> list[int]:
    """Return a list of supported memory clock frequencies. Units: MHz."""
    pass

set_memory_locked_clocks abstractmethod

set_memory_locked_clocks(min_clock_mhz, max_clock_mhz, block=True)

Lock the memory clock to a specified range. Units: MHz.

Source code in zeus/device/gpu/common.py
@deprecated_alias("setMemoryLockedClocks")
@abc.abstractmethod
def set_memory_locked_clocks(self, min_clock_mhz: int, max_clock_mhz: int, block: bool = True) -> None:
    """Lock the memory clock to a specified range. Units: MHz."""
    pass

reset_memory_locked_clocks abstractmethod

reset_memory_locked_clocks(block=True)

Reset the locked memory clocks to the default.

Source code in zeus/device/gpu/common.py
@deprecated_alias("resetMemoryLockedClocks")
@abc.abstractmethod
def reset_memory_locked_clocks(self, block: bool = True) -> None:
    """Reset the locked memory clocks to the default."""
    pass

get_supported_graphics_clocks abstractmethod

get_supported_graphics_clocks(memory_clock_mhz=None)

Return a list of supported graphics clock frequencies. Units: MHz.

Parameters:

memory_clock_mhz (int | None, default: None): Memory clock frequency to use. Some GPUs have different supported graphics clocks depending on the memory clock.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getSupportedGraphicsClocks")
@abc.abstractmethod
def get_supported_graphics_clocks(self, memory_clock_mhz: int | None = None) -> list[int]:
    """Return a list of supported graphics clock frequencies. Units: MHz.

    Args:
        memory_clock_mhz: Memory clock frequency to use. Some GPUs have
            different supported graphics clocks depending on the memory clock.
    """
    pass

set_gpu_locked_clocks abstractmethod

set_gpu_locked_clocks(min_clock_mhz, max_clock_mhz, block=True)

Lock the GPU clock to a specified range. Units: MHz.

Source code in zeus/device/gpu/common.py
@deprecated_alias("setGpuLockedClocks")
@abc.abstractmethod
def set_gpu_locked_clocks(self, min_clock_mhz: int, max_clock_mhz: int, block: bool = True) -> None:
    """Lock the GPU clock to a specified range. Units: MHz."""
    pass

reset_gpu_locked_clocks abstractmethod

reset_gpu_locked_clocks(block=True)

Reset the locked GPU clocks to the default.

Source code in zeus/device/gpu/common.py
@deprecated_alias("resetGpuLockedClocks")
@abc.abstractmethod
def reset_gpu_locked_clocks(self, block: bool = True) -> None:
    """Reset the locked GPU clocks to the default."""
    pass
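A common use of the graphics-clock methods is to pin the GPU to a single supported frequency. The sketch below picks the supported clock nearest to a target and locks the range to it. `StubGPU`, the clock list, and `closest_supported_clock` are hypothetical; the method names and MHz units follow the signatures above.

```python
from typing import Optional


def closest_supported_clock(supported_mhz: list, target_mhz: int) -> int:
    """Pick the supported clock nearest to target_mhz, preferring the lower one on ties."""
    if not supported_mhz:
        raise ValueError("No supported clocks reported.")
    return min(supported_mhz, key=lambda clock: (abs(clock - target_mhz), clock))


class StubGPU:
    """Hypothetical stand-in; the clock list is made up for illustration."""

    def __init__(self) -> None:
        self.locked = None  # (min_clock_mhz, max_clock_mhz) or None

    def get_supported_graphics_clocks(self, memory_clock_mhz: Optional[int] = None) -> list:
        return [1410, 1395, 1380, 210]

    def set_gpu_locked_clocks(self, min_clock_mhz: int, max_clock_mhz: int, block: bool = True) -> None:
        self.locked = (min_clock_mhz, max_clock_mhz)

    def reset_gpu_locked_clocks(self, block: bool = True) -> None:
        self.locked = None


gpu = StubGPU()
clock = closest_supported_clock(gpu.get_supported_graphics_clocks(), target_mhz=1400)
gpu.set_gpu_locked_clocks(clock, clock)  # Pin to one frequency: min == max.
```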

get_average_power_usage abstractmethod

get_average_power_usage()

Return the average power usage of the GPU. Units: mW.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getAveragePowerUsage")
@abc.abstractmethod
def get_average_power_usage(self) -> int:
    """Return the average power usage of the GPU. Units: mW."""
    pass

get_instant_power_usage abstractmethod

get_instant_power_usage()

Return the current power draw of the GPU. Units: mW.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getInstantPowerUsage")
@abc.abstractmethod
def get_instant_power_usage(self) -> int:
    """Return the current power draw of the GPU. Units: mW."""
    pass

get_average_memory_power_usage abstractmethod

get_average_memory_power_usage()

Return the average power usage of the GPU's memory. Units: mW.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getAverageMemoryPowerUsage")
@abc.abstractmethod
def get_average_memory_power_usage(self) -> int:
    """Return the average power usage of the GPU's memory. Units: mW."""
    pass

supports_get_total_energy_consumption abstractmethod

supports_get_total_energy_consumption()

Check if the GPU supports retrieving total energy consumption.

Source code in zeus/device/gpu/common.py
@deprecated_alias("supportsGetTotalEnergyConsumption")
@abc.abstractmethod
def supports_get_total_energy_consumption(self) -> bool:
    """Check if the GPU supports retrieving total energy consumption."""
    pass

get_total_energy_consumption abstractmethod

get_total_energy_consumption()

Return the total energy consumption of the GPU since driver load. Units: mJ.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getTotalEnergyConsumption")
@abc.abstractmethod
def get_total_energy_consumption(self) -> int:
    """Return the total energy consumption of the GPU since driver load. Units: mJ."""
    pass
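Because the counter reports energy since driver load, the energy of a specific piece of work is the difference between two readings. The sketch below shows that subtraction and the mJ to J conversion against a stand-in. `StubGPU` and `measure_energy_joules` are hypothetical, and the fallback branch (one power sample multiplied by elapsed time) is a deliberately crude assumption for illustration, not how Zeus measures energy.

```python
import time


class StubGPU:
    """Hypothetical stand-in with a fake, monotonically increasing energy counter."""

    def __init__(self, supports_counter: bool) -> None:
        self._supports_counter = supports_counter
        self._energy_mj = 5_000_000

    def supports_get_total_energy_consumption(self) -> bool:
        return self._supports_counter

    def get_total_energy_consumption(self) -> int:
        self._energy_mj += 1_500  # Pretend 1.5 J is consumed between reads.
        return self._energy_mj

    def get_instant_power_usage(self) -> int:
        return 150_000  # mW


def measure_energy_joules(gpu, workload) -> float:
    """Return the energy used while workload() runs, in joules."""
    if gpu.supports_get_total_energy_consumption():
        start_mj = gpu.get_total_energy_consumption()
        workload()
        end_mj = gpu.get_total_energy_consumption()
        return (end_mj - start_mj) / 1000.0  # mJ -> J
    # Assumed fallback: a single power sample times the elapsed wall-clock time.
    start_s = time.monotonic()
    workload()
    elapsed_s = time.monotonic() - start_s
    return gpu.get_instant_power_usage() / 1000.0 * elapsed_s  # mW -> W, then W * s = J


counter_joules = measure_energy_joules(StubGPU(supports_counter=True), lambda: None)
fallback_joules = measure_energy_joules(StubGPU(supports_counter=False), lambda: None)
```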

get_gpu_temperature abstractmethod

get_gpu_temperature()

Return the current GPU temperature. Units: Celsius.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getGpuTemperature")
@abc.abstractmethod
def get_gpu_temperature(self) -> int:
    """Return the current GPU temperature. Units: Celsius."""
    pass

EmptyGPUs

Bases: GPUs

A concrete class implementing the GPUs abstract base class, but representing an empty collection of GPUs.

This class is used to represent a scenario where no GPUs are available or detected. Any method call attempting to interact with a GPU will raise a ValueError.
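Because every GPU-facing method raises ValueError, callers may prefer to check the length before querying. A minimal sketch, assuming a stand-in class (`EmptyGPUsStandIn`, hypothetical) that mimics only `__len__` and `get_name`; `gpu_names` is likewise an invented helper.

```python
class EmptyGPUsStandIn:
    """Hypothetical stand-in that mimics two members of EmptyGPUs."""

    def __len__(self) -> int:
        return 0

    def get_name(self, gpu_index: int) -> str:
        raise ValueError("No GPUs available.")


def gpu_names(gpus) -> list:
    """Return the model name of every GPU, or an empty list when none are tracked."""
    # range(0) is empty, so get_name is never called on an empty collection.
    return [gpus.get_name(i) for i in range(len(gpus))]


names = gpu_names(EmptyGPUsStandIn())

try:
    EmptyGPUsStandIn().get_name(0)
    raised = False
except ValueError:
    raised = True
```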

Source code in zeus/device/gpu/common.py
class EmptyGPUs(GPUs):
    """A concrete class implementing the GPUs abstract base class, but representing an empty collection of GPUs.

    This class is used to represent a scenario where no GPUs are available or detected.
    Any method call attempting to interact with a GPU will raise a ValueError.
    """

    def __init__(self, ensure_homogeneous: bool = False) -> None:
        """Initialize the EmptyGPUs class.

        Since this class represents an empty collection of GPUs, no actual initialization of GPU objects is performed.
        """
        pass

    def __del__(self) -> None:
        """Clean up any resources if necessary.

        As this class represents an empty collection of GPUs, no specific cleanup is required.
        """
        pass

    @property
    def gpus(self) -> Sequence["GPU"]:
        """Return an empty list as no GPUs are being tracked."""
        return []

    def __len__(self) -> int:
        """Return 0, indicating no GPUs are being tracked."""
        return 0

    def _ensure_homogeneous(self) -> None:
        """Raise a ValueError as no GPUs are being tracked."""
        raise ValueError("No GPUs available to ensure homogeneity.")

    def _warn_sys_admin(self) -> None:
        """Raise a ValueError as no GPUs are being tracked."""
        raise ValueError("No GPUs available to warn about SYS_ADMIN privileges.")

    @deprecated_alias("getName")
    def get_name(self, gpu_index: int) -> str:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("getPowerManagementLimitConstraints")
    def get_power_management_limit_constraints(self, gpu_index: int) -> tuple[int, int]:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    def get_power_management_limit(self, gpu_index: int) -> int:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("setPowerManagementLimit")
    def set_power_management_limit(self, gpu_index: int, power_limit_mw: int, block: bool = True) -> None:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("resetPowerManagementLimit")
    def reset_power_management_limit(self, gpu_index: int, block: bool = True) -> None:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("setPersistenceMode")
    def set_persistence_mode(self, gpu_index: int, enabled: bool, block: bool = True) -> None:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    def get_persistence_mode(self, gpu_index: int) -> bool:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("getSupportedMemoryClocks")
    def get_supported_memory_clocks(self, gpu_index: int) -> list[int]:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("setMemoryLockedClocks")
    def set_memory_locked_clocks(
        self,
        gpu_index: int,
        min_clock_mhz: int,
        max_clock_mhz: int,
        block: bool = True,
    ) -> None:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("resetMemoryLockedClocks")
    def reset_memory_locked_clocks(self, gpu_index: int, block: bool = True) -> None:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("getSupportedGraphicsClocks")
    def get_supported_graphics_clocks(self, gpu_index: int, memory_clock_mhz: int | None = None) -> list[int]:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("setGpuLockedClocks")
    def set_gpu_locked_clocks(
        self,
        gpu_index: int,
        min_clock_mhz: int,
        max_clock_mhz: int,
        block: bool = True,
    ) -> None:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("resetGpuLockedClocks")
    def reset_gpu_locked_clocks(self, gpu_index: int, block: bool = True) -> None:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("getInstantPowerUsage")
    def get_instant_power_usage(self, gpu_index: int) -> int:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("supportsGetTotalEnergyConsumption")
    def supports_get_total_energy_consumption(self, gpu_index: int) -> bool:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("getTotalEnergyConsumption")
    def get_total_energy_consumption(self, gpu_index: int) -> int:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

    @deprecated_alias("getGpuTemperature")
    def get_gpu_temperature(self, gpu_index: int) -> int:
        """Raise a ValueError as no GPUs are available."""
        raise ValueError("No GPUs available.")

gpus property

gpus

Return an empty list as no GPUs are being tracked.

__init__

__init__(ensure_homogeneous=False)

Since this class represents an empty collection of GPUs, no actual initialization of GPU objects is performed.

Source code in zeus/device/gpu/common.py
def __init__(self, ensure_homogeneous: bool = False) -> None:
    """Initialize the EmptyGPUs class.

    Since this class represents an empty collection of GPUs, no actual initialization of GPU objects is performed.
    """
    pass

__del__

__del__()

Clean up any resources if necessary.

As this class represents an empty collection of GPUs, no specific cleanup is required.

Source code in zeus/device/gpu/common.py
def __del__(self) -> None:
    """Clean up any resources if necessary.

    As this class represents an empty collection of GPUs, no specific cleanup is required.
    """
    pass

__len__

__len__()

Return 0, indicating no GPUs are being tracked.

Source code in zeus/device/gpu/common.py
def __len__(self) -> int:
    """Return 0, indicating no GPUs are being tracked."""
    return 0

_ensure_homogeneous

_ensure_homogeneous()

Raise a ValueError as no GPUs are being tracked.

Source code in zeus/device/gpu/common.py
def _ensure_homogeneous(self) -> None:
    """Raise a ValueError as no GPUs are being tracked."""
    raise ValueError("No GPUs available to ensure homogeneity.")

_warn_sys_admin

_warn_sys_admin()

Raise a ValueError as no GPUs are being tracked.

Source code in zeus/device/gpu/common.py
def _warn_sys_admin(self) -> None:
    """Raise a ValueError as no GPUs are being tracked."""
    raise ValueError("No GPUs available to warn about SYS_ADMIN privileges.")

get_name

get_name(gpu_index)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getName")
def get_name(self, gpu_index: int) -> str:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

get_power_management_limit_constraints

get_power_management_limit_constraints(gpu_index)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getPowerManagementLimitConstraints")
def get_power_management_limit_constraints(self, gpu_index: int) -> tuple[int, int]:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

get_power_management_limit

get_power_management_limit(gpu_index)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
def get_power_management_limit(self, gpu_index: int) -> int:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

set_power_management_limit

set_power_management_limit(gpu_index, power_limit_mw, block=True)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("setPowerManagementLimit")
def set_power_management_limit(self, gpu_index: int, power_limit_mw: int, block: bool = True) -> None:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

reset_power_management_limit

reset_power_management_limit(gpu_index, block=True)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("resetPowerManagementLimit")
def reset_power_management_limit(self, gpu_index: int, block: bool = True) -> None:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

set_persistence_mode

set_persistence_mode(gpu_index, enabled, block=True)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("setPersistenceMode")
def set_persistence_mode(self, gpu_index: int, enabled: bool, block: bool = True) -> None:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

get_persistence_mode

get_persistence_mode(gpu_index)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
def get_persistence_mode(self, gpu_index: int) -> bool:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

get_supported_memory_clocks

get_supported_memory_clocks(gpu_index)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getSupportedMemoryClocks")
def get_supported_memory_clocks(self, gpu_index: int) -> list[int]:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

set_memory_locked_clocks

set_memory_locked_clocks(gpu_index, min_clock_mhz, max_clock_mhz, block=True)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("setMemoryLockedClocks")
def set_memory_locked_clocks(
    self,
    gpu_index: int,
    min_clock_mhz: int,
    max_clock_mhz: int,
    block: bool = True,
) -> None:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

reset_memory_locked_clocks

reset_memory_locked_clocks(gpu_index, block=True)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("resetMemoryLockedClocks")
def reset_memory_locked_clocks(self, gpu_index: int, block: bool = True) -> None:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

get_supported_graphics_clocks

get_supported_graphics_clocks(gpu_index, memory_clock_mhz=None)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getSupportedGraphicsClocks")
def get_supported_graphics_clocks(self, gpu_index: int, memory_clock_mhz: int | None = None) -> list[int]:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

set_gpu_locked_clocks

set_gpu_locked_clocks(gpu_index, min_clock_mhz, max_clock_mhz, block=True)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("setGpuLockedClocks")
def set_gpu_locked_clocks(
    self,
    gpu_index: int,
    min_clock_mhz: int,
    max_clock_mhz: int,
    block: bool = True,
) -> None:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

reset_gpu_locked_clocks

reset_gpu_locked_clocks(gpu_index, block=True)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("resetGpuLockedClocks")
def reset_gpu_locked_clocks(self, gpu_index: int, block: bool = True) -> None:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

get_instant_power_usage

get_instant_power_usage(gpu_index)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getInstantPowerUsage")
def get_instant_power_usage(self, gpu_index: int) -> int:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

supports_get_total_energy_consumption

supports_get_total_energy_consumption(gpu_index)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("supportsGetTotalEnergyConsumption")
def supports_get_total_energy_consumption(self, gpu_index: int) -> bool:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

get_total_energy_consumption

get_total_energy_consumption(gpu_index)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getTotalEnergyConsumption")
def get_total_energy_consumption(self, gpu_index: int) -> int:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

get_gpu_temperature

get_gpu_temperature(gpu_index)

Raise a ValueError as no GPUs are available.

Source code in zeus/device/gpu/common.py
@deprecated_alias("getGpuTemperature")
def get_gpu_temperature(self, gpu_index: int) -> int:
    """Raise a ValueError as no GPUs are available."""
    raise ValueError("No GPUs available.")

ZeusGPUInvalidArgError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Invalid Argument.

Source code in zeus/device/gpu/common.py
class ZeusGPUInvalidArgError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Invalid Argument."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUNotSupportedError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Not Supported Operation on GPU.

Source code in zeus/device/gpu/common.py
class ZeusGPUNotSupportedError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Not Supported Operation on GPU."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUNoPermissionError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps No Permission to perform GPU operation.

Source code in zeus/device/gpu/common.py
class ZeusGPUNoPermissionError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps No Permission to perform GPU operation."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUAlreadyInitializedError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Already Initialized GPU.

Source code in zeus/device/gpu/common.py
class ZeusGPUAlreadyInitializedError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Already Initialized GPU."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUNotFoundError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Not Found GPU.

Source code in zeus/device/gpu/common.py
class ZeusGPUNotFoundError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Not Found GPU."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUInsufficientSizeError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Insufficient Size.

Source code in zeus/device/gpu/common.py
class ZeusGPUInsufficientSizeError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Insufficient Size."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUInsufficientPowerError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Insufficient Power.

Source code in zeus/device/gpu/common.py
class ZeusGPUInsufficientPowerError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Insufficient Power."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUDriverNotLoadedError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Driver Error.

Source code in zeus/device/gpu/common.py
class ZeusGPUDriverNotLoadedError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Driver Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUTimeoutError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Timeout Error.

Source code in zeus/device/gpu/common.py
class ZeusGPUTimeoutError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Timeout Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUIRQError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps IRQ Error.

Source code in zeus/device/gpu/common.py
class ZeusGPUIRQError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps IRQ Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPULibraryNotFoundError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Library Not Found Error.

Source code in zeus/device/gpu/common.py
class ZeusGPULibraryNotFoundError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Library Not Found Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUFunctionNotFoundError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Function Not Found Error.

Source code in zeus/device/gpu/common.py
class ZeusGPUFunctionNotFoundError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Function Not Found Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUCorruptedInfoROMError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Corrupted Info ROM Error.

Source code in zeus/device/gpu/common.py
class ZeusGPUCorruptedInfoROMError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Corrupted Info ROM Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPULostError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Lost GPU Error.

Source code in zeus/device/gpu/common.py
class ZeusGPULostError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Lost GPU Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUResetRequiredError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Reset Required Error.

Source code in zeus/device/gpu/common.py
class ZeusGPUResetRequiredError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Reset Required Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUOperatingSystemError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Operating System Error.

Source code in zeus/device/gpu/common.py
class ZeusGPUOperatingSystemError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Operating System Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPULibRMVersionMismatchError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps LibRM Version Mismatch Error.

Source code in zeus/device/gpu/common.py
class ZeusGPULibRMVersionMismatchError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps LibRM Version Mismatch Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUMemoryError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Insufficient Memory Error.

Source code in zeus/device/gpu/common.py
class ZeusGPUMemoryError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Insufficient Memory Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUUnknownError

Bases: ZeusBaseGPUError

Zeus GPU exception that wraps Unknown Error.

Source code in zeus/device/gpu/common.py
class ZeusGPUUnknownError(ZeusBaseGPUError):
    """Zeus GPU exception that wraps Unknown Error."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)

ZeusGPUHeterogeneousError

Bases: ZeusBaseGPUError

Exception for when GPUs are not homogeneous.

Source code in zeus/device/gpu/common.py
class ZeusGPUHeterogeneousError(ZeusBaseGPUError):
    """Exception for when GPUs are not homogeneous."""

    def __init__(self, message: str) -> None:
        """Initialize the exception object."""
        super().__init__(message)

__init__

__init__(message)

Source code in zeus/device/gpu/common.py
def __init__(self, message: str) -> None:
    """Initialize the exception object."""
    super().__init__(message)
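
Since every exception class listed here derives from ZeusBaseGPUError, a caller can handle specific failures first and fall back to the base class for everything else. The following is a minimal sketch that uses stand-in classes so it runs without Zeus installed. The stand-in `ZeusBaseError` (assumed here to derive from `Exception`) and the `describe_failure` helper are hypothetical and only mirror the class shapes shown above.

```python
class ZeusBaseError(Exception):
    """Stand-in for the Zeus base exception (assumed to derive from Exception)."""


class ZeusBaseGPUError(ZeusBaseError):
    """Stand-in mirroring the base GPU exception."""

    def __init__(self, message: str) -> None:
        super().__init__(message)


class ZeusGPUNoPermissionError(ZeusBaseGPUError):
    """Stand-in for the permission error."""


class ZeusGPUNotSupportedError(ZeusBaseGPUError):
    """Stand-in for the not-supported error."""


def describe_failure(exc_type: type) -> str:
    """Hypothetical helper: raise `exc_type` and report which handler caught it."""
    try:
        raise exc_type("simulated failure")
    except ZeusGPUNoPermissionError:
        # Most specific handler first.
        return "permission"
    except ZeusBaseGPUError:
        # Catch-all for any other GPU error.
        return "gpu"


print(describe_failure(ZeusGPUNoPermissionError))
print(describe_failure(ZeusGPUNotSupportedError))
```

Ordering the `except` clauses from most to least specific matters, because Python picks the first clause that matches.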

has_sys_admin cached

has_sys_admin()

Check if the current process has SYS_ADMIN capabilities.

Source code in zeus/device/common.py
@lru_cache(maxsize=1)
def has_sys_admin() -> bool:
    """Check if the current process has `SYS_ADMIN` capabilities."""
    # First try to read procfs.
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("CapEff"):
                    bitmask = int(line.strip().split()[1], 16)
                    has = bool(bitmask & (1 << 21))
                    logger.info(
                        "Read security capabilities from /proc/self/status -- SYS_ADMIN: %s",
                        has,
                    )
                    return has
    except Exception:
        logger.info("Failed to read capabilities from /proc/self/status", exc_info=True)

    # If that fails, try to use the capget syscall.
    class CapHeader(ctypes.Structure):
        _fields_ = [("version", ctypes.c_uint32), ("pid", ctypes.c_int)]

    class CapData(ctypes.Structure):
        _fields_ = [
            ("effective", ctypes.c_uint32),
            ("permitted", ctypes.c_uint32),
            ("inheritable", ctypes.c_uint32),
        ]

    # Attempt to load libc and set up capget
    try:
        # use_errno=True is required for ctypes.get_errno() below to see the real errno.
        libc = ctypes.CDLL("libc.so.6", use_errno=True)
        capget = libc.capget
        capget.argtypes = [ctypes.POINTER(CapHeader), ctypes.POINTER(CapData)]
        capget.restype = ctypes.c_int
    except Exception:
        logger.info("Failed to load libc.so.6", exc_info=True)
        return False

    # Initialize the header and data structures. Capability version 3 makes the
    # kernel write two CapData structs (64 capability bits), so allocate both.
    header = CapHeader(version=0x20080522, pid=0)  # Use the current process
    data = (CapData * 2)()

    # Call capget and check for errors
    if capget(ctypes.byref(header), data) != 0:
        errno = ctypes.get_errno()
        logger.info("capget failed with error: %s (errno %s)", os.strerror(errno), errno)
        return False

    # CAP_SYS_ADMIN is bit 21, which lives in the first (low) 32-bit word.
    bitmask = data[0].effective
    has = bool(bitmask & (1 << 21))
    logger.info("Read security capabilities from capget -- SYS_ADMIN: %s", has)
    return has
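
The procfs branch above boils down to parsing the hexadecimal `CapEff` mask and testing bit 21. A minimal sketch of just that step, run against sample text instead of the real `/proc/self/status`; the `cap_eff_has_bit` helper and the sample strings are hypothetical and not part of Zeus.

```python
CAP_SYS_ADMIN_BIT = 21  # Bit index tested by has_sys_admin above.


def cap_eff_has_bit(status_text: str, bit: int = CAP_SYS_ADMIN_BIT) -> bool:
    """Hypothetical helper: return True if the CapEff mask in `status_text` has `bit` set."""
    for line in status_text.splitlines():
        if line.startswith("CapEff"):
            # The line looks like "CapEff:\t<hex mask>".
            bitmask = int(line.split()[1], 16)
            return bool(bitmask & (1 << bit))
    # No CapEff line found.
    return False


# Sample inputs (made up for illustration).
all_caps = "Name:\tpython\nCapEff:\t000001ffffffffff\n"
no_caps = "Name:\tpython\nCapEff:\t0000000000000000\n"

print(cap_eff_has_bit(all_caps))
print(cap_eff_has_bit(no_caps))
```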

deprecated_alias

deprecated_alias(old_name)

Decorator that marks a method to have a deprecated camelCase alias.

Apply this decorator to the new snake_case method. When the old camelCase name is called, it will emit a deprecation warning once and then call the new snake_case method.

Example
@deprecated_alias("getName")
def get_name(self):
    return "GPU Name"

The class using this decorator should use DeprecatedAliasABCMeta as its metaclass.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| old_name | str | The old camelCase method name to create as a deprecated alias. | required |

Returns:

| Type | Description |
|---|---|
| Callable[[Callable], Callable] | The decorated function with the _deprecated_alias attribute set. |

Source code in zeus/device/common.py
def deprecated_alias(old_name: str) -> Callable[[Callable], Callable]:
    """Decorator that marks a method to have a deprecated camelCase alias.

    Apply this decorator to the new snake_case method. When the old camelCase
    name is called, it will emit a deprecation warning once and then call the
    new snake_case method.

    Example:
        ```python
        @deprecated_alias("getName")
        def get_name(self):
            return "GPU Name"
        ```

    The class using this decorator should use `DeprecatedAliasABCMeta` as its metaclass.

    Args:
        old_name: The old camelCase method name to create as a deprecated alias.

    Returns:
        The decorated function with the `_deprecated_alias` attribute set.
    """

    def decorator(func):
        func._deprecated_alias = old_name
        return func

    return decorator
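
The decorator only tags the function; a metaclass has to turn the tag into a working alias. The source of `DeprecatedAliasABCMeta` is not shown here, so the sketch below uses a hypothetical `AliasMeta` to illustrate one way this could work. It is an assumption about the mechanism, not the Zeus implementation, and `FakeGPU` is likewise made up.

```python
import warnings
from abc import ABCMeta
from typing import Callable


def deprecated_alias(old_name: str) -> Callable[[Callable], Callable]:
    """Same tagging decorator as shown above."""

    def decorator(func):
        func._deprecated_alias = old_name
        return func

    return decorator


class AliasMeta(ABCMeta):
    """Hypothetical metaclass: adds one warning-emitting alias per tagged method."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        aliases = {}
        for attr, value in namespace.items():
            old_name = getattr(value, "_deprecated_alias", None)
            if old_name is not None:
                aliases[old_name] = mcls._make_alias(old_name, attr)
        # Add aliases only after iteration so the dict is not mutated mid-loop.
        namespace.update(aliases)
        return super().__new__(mcls, name, bases, namespace, **kwargs)

    @staticmethod
    def _make_alias(old_name: str, new_name: str) -> Callable:
        warned = False

        def alias(self, *args, **kwargs):
            nonlocal warned
            if not warned:
                warnings.warn(
                    f"{old_name} is deprecated; use {new_name} instead.",
                    DeprecationWarning,
                    stacklevel=2,
                )
                warned = True
            # Dispatch by name so subclass overrides are respected.
            return getattr(self, new_name)(*args, **kwargs)

        alias.__name__ = old_name
        return alias


class FakeGPU(metaclass=AliasMeta):
    @deprecated_alias("getName")
    def get_name(self):
        return "GPU Name"
```

With this sketch, `FakeGPU().getName()` returns the same value as `get_name()` and emits a `DeprecationWarning` on the first call only.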

get_gpus

get_gpus(ensure_homogeneous=False)

Initialize and return a singleton object for GPU management.

This function returns a GPU management object that aims to abstract the underlying GPU vendor and their specific monitoring library (pynvml for NVIDIA GPUs and amdsmi for AMD GPUs). Management APIs are mapped to methods on the returned GPUs object.

GPU availability is checked in the following order:

  1. NVIDIA GPUs using pynvml
  2. AMD GPUs using amdsmi
  3. If both are unavailable, a ZeusGPUInitError is raised.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ensure_homogeneous | bool | If True, ensures that all tracked GPUs have the same name. | False |

Source code in zeus/device/gpu/__init__.py
def get_gpus(ensure_homogeneous: bool = False) -> GPUs:
    """Initialize and return a singleton object for GPU management.

    This function returns a GPU management object that aims to abstract
    the underlying GPU vendor and their specific monitoring library
    (pynvml for NVIDIA GPUs and amdsmi for AMD GPUs). Management APIs
    are mapped to methods on the returned [`GPUs`][zeus.device.gpu.GPUs] object.

    GPU availability is checked in the following order:

    1. NVIDIA GPUs using `pynvml`
    1. AMD GPUs using `amdsmi`
    1. If both are unavailable, a `ZeusGPUInitError` is raised.

    Args:
        ensure_homogeneous (bool): If True, ensures that all tracked GPUs have the same name.
    """
    global _gpus
    if _gpus is not None:
        return _gpus

    if nvml_is_available():
        _gpus = NVIDIAGPUs(ensure_homogeneous)
        return _gpus
    elif amdsmi_is_available():
        _gpus = AMDGPUs(ensure_homogeneous)
        return _gpus
    else:
        raise ZeusGPUInitError("NVML and AMDSMI unavailable. Failed to initialize GPU management library.")
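
One consequence of the singleton check at the top of `get_gpus` is that arguments passed on later calls are ignored: the first call decides the value of `ensure_homogeneous`. A minimal sketch of that behavior with stand-ins so it runs without GPUs or Zeus installed; `FakeGPUs` and the two stubbed availability checks are hypothetical, and the real function builds `NVIDIAGPUs` or `AMDGPUs` instead.

```python
from typing import Optional


class FakeGPUs:
    """Stand-in for the manager object returned by get_gpus."""

    def __init__(self, ensure_homogeneous: bool) -> None:
        self.ensure_homogeneous = ensure_homogeneous


_gpus: Optional[FakeGPUs] = None


def nvml_is_available() -> bool:
    return False  # Stubbed: pretend NVML is missing.


def amdsmi_is_available() -> bool:
    return True  # Stubbed: pretend AMDSMI is present.


def get_gpus(ensure_homogeneous: bool = False) -> FakeGPUs:
    """Same control flow as above, but building the stand-in manager."""
    global _gpus
    if _gpus is not None:
        # Cached instance wins; the argument is not consulted again.
        return _gpus
    if nvml_is_available() or amdsmi_is_available():
        _gpus = FakeGPUs(ensure_homogeneous)
        return _gpus
    raise RuntimeError("No GPU management library available.")


first = get_gpus()
second = get_gpus(ensure_homogeneous=True)
print(first is second, second.ensure_homogeneous)
```

Code that needs the homogeneity check should therefore pass `ensure_homogeneous=True` on the very first call in the process.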