zeus.profile

Thermally stable energy profiling for GPU workloads.

See our blog post for more details: Thermally Stable Profiling for Accurate GPU Energy Measurement

Public API

Overview

The module provides functions to determine the best measurement and cooldown durations that yield stable (low-variance) energy measurements for a user-provided callable.

Single trial
============

|<--- cooldown_duration --->|            |<--- measurement_duration --->|
|                           |            |                              |
+----------- idle ----------+-- warmup --+--iter--iter-- ... --iter-----+
                                         |                              |
                                         +-- energy_per_iter measured --+

Sweep (num_trials = 3, num_warmup_trials = 1)
=============================================

Warmup trial:  [--iter--iter--...--iter--]
Trial 1:  [--- cooldown ---|-- warmup --|--iter--iter--...--iter--]
Trial 2:  [--- cooldown ---|-- warmup --|--iter--iter--...--iter--]
Trial 3:  [--- cooldown ---|-- warmup --|--iter--iter--...--iter--]
                                         \________________________/
    measurement_duration=5.0 s [VALID]    std of energy_per_iter
                                          < trial_stddev_threshold

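Measurement duration sweep: holds cooldown_duration fixed and sweeps measurement_duration. profile_parameters fixes the cooldown at the maximum of the cooldown search range, while profile_measurement_duration takes it as an argument (default 10.0 s). Each configuration is measured for num_trials trials, and a configuration is considered valid when the sample standard deviation of energy_per_iter across those trials falls below trial_stddev_threshold.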

Cooldown duration sweep: holds measurement_duration fixed (profile_parameters uses the maximum of the measurement search range) and sweeps cooldown_duration with the same validity criterion.
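
The validity criterion shared by both sweeps can be sketched in plain Python. The energies and the threshold below are made-up numbers, and is_valid is a hypothetical helper rather than part of zeus.profile; it only mirrors the documented rule that a configuration is valid when the sample standard deviation of energy_per_iter across trials is strictly below trial_stddev_threshold.

```python
import statistics


def is_valid(energies_per_iter, trial_stddev_threshold):
    """Hypothetical helper: sample std of per-trial energy strictly below the threshold."""
    return statistics.stdev(energies_per_iter) < trial_stddev_threshold


# Made-up energy_per_iter values (Joules) from three trials at two durations.
short_window = [1.20, 1.35, 1.10]    # noisy readings
long_window = [1.251, 1.249, 1.250]  # stable readings

for name, energies in [("short", short_window), ("long", long_window)]:
    tag = "VALID" if is_valid(energies, 0.01) else "INVALID"
    print(f"{name}: std={statistics.stdev(energies):.4f} J [{tag}]")
```

Because the threshold is an absolute value in Joules, it may need to be scaled to the energy a single iteration of the workload consumes.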

Both durations can also be chosen manually and passed directly to measure.

Multi-GPU / distributed setting: each rank should create its own ZeusMonitor with gpu_indices=[local_rank] and pass it to the profiling functions. Every rank executes the workload and measures energy on its local GPU. Results are aggregated across ranks with all_reduce: energy is summed, time is the maximum across ranks, and temperature is averaged. Only rank 0 logs sweep progress and reports.
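
The aggregation rule can be illustrated without any communication library. The sketch below is a single-process stand-in for the all_reduce calls, with made-up per-rank readings; aggregate_across_ranks is a hypothetical name used only for this illustration.

```python
def aggregate_across_ranks(per_rank):
    """Combine per-rank (energy_J, time_s, temp_C) tuples as described above."""
    energies, times, temps = zip(*per_rank)
    return {
        "total_energy": sum(energies),           # energy is summed
        "total_time": max(times),                # the slowest rank sets the time
        "temperature": sum(temps) / len(temps),  # temperature is averaged
    }


# Made-up readings from a hypothetical 4-GPU job, one tuple per rank.
per_rank = [
    (100.0, 5.0, 60.0),
    (110.0, 5.5, 62.0),
    (90.0, 4.5, 58.0),
    (100.0, 5.0, 60.0),
]
result = aggregate_across_ranks(per_rank)
print(result)
```

One consequence of summing is that energy_per_iter in a distributed run covers all participating GPUs, not a single device.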

TrialResult dataclass

Result of a single measurement trial.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `energy_per_iter` | `float` | Energy consumed per iteration (Joules). |
| `time_per_iter` | `float` | Wall-clock time per iteration (seconds). |
| `total_energy` | `float` | Total energy consumed during the measurement window (Joules). |
| `total_time` | `float` | Total wall-clock time of the measurement window (seconds). |
| `iterations` | `int` | Number of iterations executed in the window. |
| `temperature_before` | `float` | GPU temperature (Celsius) read after the cooldown, before the warm-up iterations. |
| `temperature_after` | `float` | GPU temperature (Celsius) read right after the measurement window. |

Source code in zeus/profile.py
@dataclass
class TrialResult:
    """Result of a single measurement trial.

    Attributes:
        energy_per_iter: Energy consumed per iteration (Joules).
        time_per_iter: Wall-clock time per iteration (seconds).
        total_energy: Total energy consumed during the measurement window (Joules).
        total_time: Total wall-clock time of the measurement window (seconds).
        iterations: Number of iterations executed in the window.
        temperature_before: GPU temperature (Celsius) before the measurement.
        temperature_after: GPU temperature (Celsius) after the measurement.
    """

    energy_per_iter: float
    time_per_iter: float
    total_energy: float
    total_time: float
    iterations: int
    temperature_before: float
    temperature_after: float
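
The fields are related by simple arithmetic, which makes derived metrics easy to compute. The sketch below uses a local stand-in dataclass with the same fields (TrialResultLike) so that it runs without Zeus; average_power and temperature_rise are hypothetical helpers, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class TrialResultLike:
    """Local stand-in with the same fields as TrialResult (illustration only)."""

    energy_per_iter: float
    time_per_iter: float
    total_energy: float
    total_time: float
    iterations: int
    temperature_before: float
    temperature_after: float


def average_power(trial):
    """Average power over the measurement window in Watts (Joules / second)."""
    return trial.total_energy / trial.total_time


def temperature_rise(trial):
    """Temperature change (Celsius) between the two readings."""
    return trial.temperature_after - trial.temperature_before


trial = TrialResultLike(
    energy_per_iter=3.0,   # total_energy / iterations
    time_per_iter=0.01,    # total_time / iterations
    total_energy=1500.0,
    total_time=5.0,
    iterations=500,
    temperature_before=45.0,
    temperature_after=52.0,
)
print(average_power(trial))     # 300.0
print(temperature_rise(trial))  # 7.0
```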

SweepResult dataclass

Result of sweeping one parameter value across multiple trials.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `measurement_duration` | `float` | The measurement duration used (seconds). |
| `cooldown_duration` | `float` | The cooldown duration used (seconds). |
| `trials` | `list[TrialResult]` | Per-trial results. |
| `energy_mean` | `float` | Mean `energy_per_iter` across trials (Joules). |
| `energy_std` | `float` | Sample standard deviation of `energy_per_iter` across trials (Joules). |
| `avg_temperature_before` | `float` | Average temperature before the measurement (Celsius). |
| `avg_temperature_after` | `float` | Average temperature after the measurement (Celsius). |
| `avg_total_time` | `float` | Mean total time across trials (seconds). |
| `avg_total_energy` | `float` | Mean total energy across trials (Joules). |
| `is_valid` | `bool` | `True` when `energy_std < trial_stddev_threshold`. |

Source code in zeus/profile.py
@dataclass
class SweepResult:
    """Result of sweeping one parameter value across multiple trials.

    Attributes:
        measurement_duration: The measurement duration used (seconds).
        cooldown_duration: The cooldown duration used (seconds).
        trials: Per-trial results.
        energy_mean: Mean `energy_per_iter` across trials.
        energy_std: Sample standard deviation of `energy_per_iter` across trials.
        avg_temperature_before: Average temperature before the measurement.
        avg_temperature_after: Average temperature after the measurement.
        avg_total_time: Mean total time across trials.
        avg_total_energy: Mean total energy across trials.
        is_valid: `True` when `energy_std < trial_stddev_threshold`.
    """

    measurement_duration: float
    cooldown_duration: float
    trials: list[TrialResult]
    energy_mean: float
    energy_std: float
    avg_temperature_before: float
    avg_temperature_after: float
    avg_total_time: float
    avg_total_energy: float
    is_valid: bool

    def __str__(self) -> str:
        """One-line summary without per-trial details."""
        tag = "VALID" if self.is_valid else "INVALID"
        return (
            f"measurement={self.measurement_duration:.1f} s  "
            f"cooldown={self.cooldown_duration:.1f} s  "
            f"mean={self.energy_mean:.4f} J  "
            f"std={self.energy_std:.4f} J  "
            f"temp_before={self.avg_temperature_before:.1f} \u00b0C  "
            f"temp_after={self.avg_temperature_after:.1f} \u00b0C  "
            f"[{tag}]"
        )

__str__

__str__()

One-line summary without per-trial details.


SweepReport dataclass

Full report from a measurement-duration or cooldown-duration sweep.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `sweep_param` | `tuple[str, list[float]]` | `(parameter_name, swept_values)`. |
| `fixed_param` | `tuple[str, float]` | `(parameter_name, fixed_value)`. |
| `entries` | `list[SweepResult]` | All sweep entries (one per swept value). |

Source code in zeus/profile.py
@dataclass
class SweepReport:
    """Full report from a measurement-duration or cooldown-duration sweep.

    Attributes:
        sweep_param: `(parameter_name, swept_values)`.
        fixed_param: `(parameter_name, fixed_value)`.
        entries: All sweep entries (one per swept value).
    """

    sweep_param: tuple[str, list[float]]
    fixed_param: tuple[str, float]
    entries: list[SweepResult]

    def __str__(self) -> str:
        """Multi-line summary: one line per swept value."""
        sweep_name, _ = self.sweep_param
        fixed_name, fixed_value = self.fixed_param
        prefixes = [f"{sweep_name}={getattr(e, sweep_name):.1f} s" for e in self.entries]
        max_prefix_len = max(len(p) for p in prefixes)
        lines = [f"Sweep {sweep_name} (fixed {fixed_name}={fixed_value:.1f} s):"]
        for prefix, e in zip(prefixes, self.entries):
            tag = "VALID" if e.is_valid else "INVALID"
            lines.append(
                f"  {prefix:<{max_prefix_len}}  "
                f"mean={e.energy_mean:.4f} J  "
                f"std={e.energy_std:.4f} J  "
                f"temp_before={e.avg_temperature_before:.1f} \u00b0C  "
                f"temp_after={e.avg_temperature_after:.1f} \u00b0C  "
                f"[{tag}]"
            )
        return "\n".join(lines)
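
The column alignment in the report comes from a nested format specification: each prefix is left-aligned and padded to the width of the longest one. A minimal standalone sketch of that technique, using made-up swept values and a placeholder in place of the statistics columns:

```python
values = [1.0, 10.0]  # made-up swept values
prefixes = [f"measurement_duration={v:.1f} s" for v in values]
width = max(len(p) for p in prefixes)

# "<" left-aligns; the inner {width} is substituted before the outer field is formatted.
padded = [f"{p:<{width}}" for p in prefixes]
for p in padded:
    print(f"  {p}  mean=...")
```

Both rows start their second column at the same offset, even though the first prefix is one character shorter than the second.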

__str__

__str__()

Multi-line summary: one line per swept value.


_is_rank_zero

_is_rank_zero()

Return True when not distributed or when this is rank 0.

Source code in zeus/profile.py
def _is_rank_zero() -> bool:
    """Return True when not distributed or when this is rank 0."""
    return get_rank() == 0

_calibrate_iteration_duration

_calibrate_iteration_duration(target_function, zeus_monitor, num_warmup_iterations, num_calibration_iterations)

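Warm up target_function, then time num_calibration_iterations back-to-back calls and return the mean time per iteration in seconds. In a distributed run the result is the maximum across ranks.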

Source code in zeus/profile.py
def _calibrate_iteration_duration(
    target_function: Callable[[], Any],
    zeus_monitor: ZeusMonitor,
    num_warmup_iterations: int,
    num_calibration_iterations: int,
) -> float:
    """Warm up *target_function* and measure per-iteration execution time."""
    for _ in range(num_warmup_iterations):
        target_function()

    sync_execution(zeus_monitor.gpu_indices, sync_with=zeus_monitor.sync_with)
    start = time.monotonic()
    for _ in range(num_calibration_iterations):
        target_function()
    sync_execution(zeus_monitor.gpu_indices, sync_with=zeus_monitor.sync_with)
    elapsed = time.monotonic() - start

    iteration_duration = elapsed / num_calibration_iterations
    [iteration_duration] = all_reduce([iteration_duration], "max")
    logger.info("Calibrated iteration duration: %.3f ms", iteration_duration * 1000)
    return iteration_duration
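
The same warm-up-then-time pattern can be tried on a CPU-only callable. The sketch below is a simplified stand-in, not the library function: calibrate and busy_work are hypothetical names, and it omits the two steps that matter on a GPU, namely synchronizing the device before reading the clock (GPU work is launched asynchronously) and reducing the result across ranks.

```python
import time


def calibrate(target_function, num_warmup_iterations, num_calibration_iterations):
    """CPU-only sketch: warm up, then time a batch of calls and divide by its size."""
    for _ in range(num_warmup_iterations):
        target_function()

    start = time.monotonic()
    for _ in range(num_calibration_iterations):
        target_function()
    elapsed = time.monotonic() - start

    return elapsed / num_calibration_iterations


def busy_work():
    """Hypothetical stand-in workload."""
    sum(i * i for i in range(10_000))


iteration_duration = calibrate(busy_work, num_warmup_iterations=3, num_calibration_iterations=20)
print(f"Calibrated iteration duration: {iteration_duration * 1000:.3f} ms")
```

Timing a batch and dividing, rather than timing each call, keeps clock-reading overhead out of the per-iteration estimate.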

_read_avg_gpu_temperature

_read_avg_gpu_temperature(zeus_monitor)

Return the mean GPU temperature (deg C) across all monitored GPUs.

Source code in zeus/profile.py
def _read_avg_gpu_temperature(zeus_monitor: ZeusMonitor) -> float:
    """Return the mean GPU temperature (deg C) across all monitored GPUs."""
    temps = [zeus_monitor.gpus.get_gpu_temperature(idx) for idx in zeus_monitor.gpu_indices]
    assert temps, "ZeusMonitor is monitoring zero GPUs."
    return sum(temps) / len(temps)

_run_trial

_run_trial(target_function, zeus_monitor, cooldown_duration, measurement_duration, num_warmup_iterations, iteration_duration)

Execute one trial: cooldown -> warmup -> measure.

Source code in zeus/profile.py
def _run_trial(
    target_function: Callable[[], Any],
    zeus_monitor: ZeusMonitor,
    cooldown_duration: float,
    measurement_duration: float,
    num_warmup_iterations: int,
    iteration_duration: float,
) -> TrialResult:
    """Execute one trial: cooldown -> warmup -> measure."""
    iterations = max(1, int(measurement_duration / iteration_duration))

    if cooldown_duration > 0:
        time.sleep(cooldown_duration)

    temperature_before = _read_avg_gpu_temperature(zeus_monitor)

    for _ in range(num_warmup_iterations):
        target_function()

    zeus_monitor.begin_window("__zeus_profile_run_trial")
    for _ in range(iterations):
        target_function()
    result = zeus_monitor.end_window("__zeus_profile_run_trial")

    temperature_after = _read_avg_gpu_temperature(zeus_monitor)

    [total_energy] = all_reduce([result.total_energy], "sum")
    [total_time] = all_reduce([result.time], "max")
    [temp_before_sum] = all_reduce([temperature_before], "sum")
    [temp_after_sum] = all_reduce([temperature_after], "sum")
    world_size = get_world_size()

    return TrialResult(
        energy_per_iter=total_energy / iterations,
        time_per_iter=total_time / iterations,
        total_energy=total_energy,
        total_time=total_time,
        iterations=iterations,
        temperature_before=temp_before_sum / world_size,
        temperature_after=temp_after_sum / world_size,
    )
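
The number of iterations in the measurement window follows directly from the first line of the function. A small worked example, with made-up durations and a hypothetical helper name (iterations_for):

```python
def iterations_for(measurement_duration, iteration_duration):
    """Same arithmetic as the first line of _run_trial."""
    return max(1, int(measurement_duration / iteration_duration))


print(iterations_for(5.0, 0.012))  # 416: 5.0 / 0.012 is about 416.67, truncated
print(iterations_for(0.5, 2.0))    # 1: at least one iteration always runs
```

Because the quotient is truncated, the window is usually a little shorter than measurement_duration, and it is longer when a single iteration exceeds the requested duration. The time actually spent is reported in total_time.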

_build_sweep_result

_build_sweep_result(measurement_duration, cooldown_duration, trials, trial_stddev_threshold)

Aggregate trial results into a SweepResult.

Source code in zeus/profile.py
def _build_sweep_result(
    measurement_duration: float,
    cooldown_duration: float,
    trials: list[TrialResult],
    trial_stddev_threshold: float,
) -> SweepResult:
    """Aggregate trial results into a [`SweepResult`][zeus.profile.SweepResult]."""
    energies = [t.energy_per_iter for t in trials]
    n = len(trials)
    e_std = statistics.stdev(energies) if n >= 2 else 0.0
    return SweepResult(
        measurement_duration=measurement_duration,
        cooldown_duration=cooldown_duration,
        trials=trials,
        energy_mean=statistics.mean(energies),
        energy_std=e_std,
        avg_temperature_before=sum(t.temperature_before for t in trials) / n,
        avg_temperature_after=sum(t.temperature_after for t in trials) / n,
        avg_total_time=sum(t.total_time for t in trials) / n,
        avg_total_energy=sum(t.total_energy for t in trials) / n,
        is_valid=e_std < trial_stddev_threshold,
    )
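
The n >= 2 guard exists because the sample standard deviation is undefined for a single data point. The sketch below reproduces just that piece with the standard library; energy_std is a hypothetical helper and the energies are made up.

```python
import statistics


def energy_std(energies):
    """Sample standard deviation, falling back to 0.0 for fewer than two trials."""
    return statistics.stdev(energies) if len(energies) >= 2 else 0.0


# statistics.stdev refuses a single data point.
try:
    statistics.stdev([1.25])
    raised = False
except statistics.StatisticsError:
    raised = True

print(raised)                          # True
print(energy_std([1.25]))              # 0.0
print(energy_std([1.0, 3.0]))          # sample std (divides by n - 1)
print(statistics.pstdev([1.0, 3.0]))   # population std (divides by n), for comparison
```

A practical consequence: with num_trials=1 the standard deviation is 0.0, so every configuration passes any positive threshold. More than one trial is needed for the validity flag to carry information.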

_sweep

_sweep(target_function, zeus_monitor, sweep_values, fixed_value, sweep_type, num_trials, num_warmup_trials, trial_stddev_threshold, num_warmup_iterations, iteration_duration)

Run a parameter sweep and return a SweepReport.

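Warmup trials skip the cooldown and the warmup iterations. They run for the longest measurement duration in play: the maximum swept value in a measurement-duration sweep, or the fixed measurement duration in a cooldown-duration sweep.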

Source code in zeus/profile.py
def _sweep(
    target_function: Callable[[], Any],
    zeus_monitor: ZeusMonitor,
    sweep_values: list[float],
    fixed_value: float,
    sweep_type: Literal["cooldown_duration", "measurement_duration"],
    num_trials: int,
    num_warmup_trials: int,
    trial_stddev_threshold: float,
    num_warmup_iterations: int,
    iteration_duration: float,
) -> SweepReport:
    """Run a parameter sweep and return a [`SweepReport`][zeus.profile.SweepReport].

    Warmup trials skip cooldown and warmup iterations and use the maximum
    measurement duration.
    """
    is_cooldown_sweep = sweep_type == "cooldown_duration"

    if num_warmup_trials > 0:
        warmup_cooldown = 0.0
        warmup_measure = fixed_value if is_cooldown_sweep else max(sweep_values)
        warmup_warmup_iterations = 0
        if _is_rank_zero():
            logger.info("Running %d warmup trial(s)", num_warmup_trials)
        for i in range(num_warmup_trials):
            _run_trial(
                target_function=target_function,
                zeus_monitor=zeus_monitor,
                cooldown_duration=warmup_cooldown,
                measurement_duration=warmup_measure,
                num_warmup_iterations=warmup_warmup_iterations,
                iteration_duration=iteration_duration,
            )
            if _is_rank_zero():
                logger.info("Warmup trial %d/%d done", i + 1, num_warmup_trials)

    entries: list[SweepResult] = []
    for val in sweep_values:
        cooldown_dur = val if is_cooldown_sweep else fixed_value
        measure_dur = fixed_value if is_cooldown_sweep else val

        if _is_rank_zero():
            logger.info("Starting %s=%.1f s  [%d trials]", sweep_type, val, num_trials)

        trials: list[TrialResult] = []
        for trial_idx in range(num_trials):
            trial = _run_trial(
                target_function=target_function,
                zeus_monitor=zeus_monitor,
                cooldown_duration=cooldown_dur,
                measurement_duration=measure_dur,
                num_warmup_iterations=num_warmup_iterations,
                iteration_duration=iteration_duration,
            )
            trials.append(trial)
            if _is_rank_zero():
                logger.info("Trial %d/%d done", trial_idx + 1, num_trials)

        entry = _build_sweep_result(measure_dur, cooldown_dur, trials, trial_stddev_threshold)
        entries.append(entry)

    swept_name = "cooldown_duration" if is_cooldown_sweep else "measurement_duration"
    fixed_name = "measurement_duration" if is_cooldown_sweep else "cooldown_duration"
    return SweepReport(
        sweep_param=(swept_name, sweep_values),
        fixed_param=(fixed_name, fixed_value),
        entries=entries,
    )
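
The order in which trials run can be made explicit by listing the (cooldown, measurement) pair for every trial. The sketch below mirrors the branching in _sweep without running anything; plan_trials is a hypothetical helper and the sweep values are made up.

```python
def plan_trials(sweep_values, fixed_value, sweep_type, num_trials, num_warmup_trials):
    """List (kind, cooldown_s, measurement_s) for every trial a sweep would run."""
    is_cooldown_sweep = sweep_type == "cooldown_duration"
    schedule = []

    # Warmup trials: no cooldown, longest measurement window in play.
    warmup_measure = fixed_value if is_cooldown_sweep else max(sweep_values)
    schedule += [("warmup", 0.0, warmup_measure)] * num_warmup_trials

    for val in sweep_values:
        cooldown = val if is_cooldown_sweep else fixed_value
        measure = fixed_value if is_cooldown_sweep else val
        schedule += [("trial", cooldown, measure)] * num_trials
    return schedule


measurement_plan = plan_trials(
    [1.0, 5.0], fixed_value=10.0, sweep_type="measurement_duration",
    num_trials=2, num_warmup_trials=1,
)
cooldown_plan = plan_trials(
    [1.0, 5.0], fixed_value=10.0, sweep_type="cooldown_duration",
    num_trials=1, num_warmup_trials=1,
)
for row in measurement_plan + cooldown_plan:
    print(row)
```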

profile_measurement_duration

profile_measurement_duration(target_function, zeus_monitor, measurement_duration_search_range=None, cooldown_duration=10.0, num_trials=10, num_warmup_trials=2, trial_stddev_threshold=0.01, num_warmup_iterations=10, num_calibration_iterations=100, iteration_duration=None)

Sweep measurement durations and return a SweepReport.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `target_function` | `Callable[[], Any]` | Callable to profile (invoked with no arguments). | required |
| `zeus_monitor` | `ZeusMonitor` | ZeusMonitor instance. | required |
| `measurement_duration_search_range` | `list[float] \| None` | Durations (seconds) to sweep. Defaults to `[1.0, 2.0, ..., 10.0]`. | `None` |
| `cooldown_duration` | `float` | Cooldown held constant during the sweep. | `10.0` |
| `num_trials` | `int` | Repeated trials per sweep point. | `10` |
| `num_warmup_trials` | `int` | Number of throwaway trials to run before the sweep to warm up the GPU thermal state. | `2` |
| `trial_stddev_threshold` | `float` | Maximum acceptable `energy_std` (Joules) for a duration to be considered valid. | `0.01` |
| `num_warmup_iterations` | `int` | Warm-up iterations before each measurement. | `10` |
| `num_calibration_iterations` | `int` | Iterations used to estimate per-iteration time. | `100` |
| `iteration_duration` | `float \| None` | Pre-calibrated iteration duration (seconds). If `None`, calibration runs automatically. | `None` |

Source code in zeus/profile.py
def profile_measurement_duration(
    target_function: Callable[[], Any],
    zeus_monitor: ZeusMonitor,
    measurement_duration_search_range: list[float] | None = None,
    cooldown_duration: float = 10.0,
    num_trials: int = 10,
    num_warmup_trials: int = 2,
    trial_stddev_threshold: float = 0.01,
    num_warmup_iterations: int = 10,
    num_calibration_iterations: int = 100,
    iteration_duration: float | None = None,
) -> SweepReport:
    """Sweep measurement durations and return a [`SweepReport`][zeus.profile.SweepReport].

    Args:
        target_function: Callable to profile (invoked with no arguments).
        zeus_monitor: [`ZeusMonitor`][zeus.monitor.energy.ZeusMonitor] instance.
        measurement_duration_search_range: Durations (seconds) to sweep.
            Defaults to `[1.0, 2.0, ..., 10.0]`.
        cooldown_duration: Cooldown held constant during the sweep.
        num_trials: Repeated trials per sweep point.
        num_warmup_trials: Number of throwaway trials to run before the sweep
            to warm up the GPU thermal state.
        trial_stddev_threshold: Maximum acceptable `energy_std` (Joules) for a
            duration to be considered valid.
        num_warmup_iterations: Warm-up iterations before each measurement.
        num_calibration_iterations: Iterations used to estimate per-iteration time.
        iteration_duration: Pre-calibrated iteration duration (seconds).  If `None`,
            calibration runs automatically.
    """
    search_range = measurement_duration_search_range or _DEFAULT_SEARCH_RANGE
    if iteration_duration is None:
        iteration_duration = _calibrate_iteration_duration(
            target_function, zeus_monitor, num_warmup_iterations, num_calibration_iterations
        )

    if _is_rank_zero():
        logger.info("Sweeping measurement_duration (cooldown fixed at %.1f s)", cooldown_duration)
    report = _sweep(
        target_function=target_function,
        zeus_monitor=zeus_monitor,
        sweep_values=search_range,
        fixed_value=cooldown_duration,
        sweep_type="measurement_duration",
        num_trials=num_trials,
        trial_stddev_threshold=trial_stddev_threshold,
        num_warmup_iterations=num_warmup_iterations,
        iteration_duration=iteration_duration,
        num_warmup_trials=num_warmup_trials,
    )
    if _is_rank_zero():
        logger.info("%s", report)
    return report
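
A sweep with the default arguments takes a while, since every trial pays for the cooldown plus the measurement window. The sketch below gives a rough estimate of that time under stated simplifications: it ignores calibration, warmup iterations, and any per-trial overhead, and it assumes each window lasts exactly its target duration. sweep_time_estimate is a hypothetical helper.

```python
def sweep_time_estimate(sweep_values, cooldown_duration, num_trials, num_warmup_trials):
    """Rough cooldown + measurement time (seconds) of a measurement-duration sweep."""
    # Warmup trials skip the cooldown and use the longest measurement window.
    warmup = num_warmup_trials * max(sweep_values)
    trials = sum(num_trials * (cooldown_duration + v) for v in sweep_values)
    return warmup + trials


default_range = [float(v) for v in range(1, 11)]  # documented default: [1.0, 2.0, ..., 10.0]
total = sweep_time_estimate(
    default_range, cooldown_duration=10.0, num_trials=10, num_warmup_trials=2
)
print(f"{total:.0f} s (about {total / 60:.0f} min)")  # 1570 s (about 26 min)
```

Narrowing the search range or lowering num_trials shortens the sweep, at the cost of a coarser or noisier picture.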

profile_cooldown_duration

profile_cooldown_duration(target_function, zeus_monitor, cooldown_duration_search_range=None, measurement_duration=10.0, num_trials=10, num_warmup_trials=2, trial_stddev_threshold=0.01, num_warmup_iterations=10, num_calibration_iterations=100, iteration_duration=None)

Sweep cooldown durations and return a SweepReport.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `target_function` | `Callable[[], Any]` | Callable to profile (invoked with no arguments). | required |
| `zeus_monitor` | `ZeusMonitor` | ZeusMonitor instance. | required |
| `cooldown_duration_search_range` | `list[float] \| None` | Durations (seconds) to sweep. Defaults to `[1.0, 2.0, ..., 10.0]`. | `None` |
| `measurement_duration` | `float` | Measurement duration held constant during the sweep. | `10.0` |
| `num_trials` | `int` | Repeated trials per sweep point. | `10` |
| `num_warmup_trials` | `int` | Number of throwaway trials to run before the sweep to warm up the GPU thermal state. | `2` |
| `trial_stddev_threshold` | `float` | Maximum acceptable `energy_std` (Joules) for a duration to be considered valid. | `0.01` |
| `num_warmup_iterations` | `int` | Warm-up iterations before each measurement. | `10` |
| `num_calibration_iterations` | `int` | Iterations used to estimate per-iteration time. | `100` |
| `iteration_duration` | `float \| None` | Pre-calibrated iteration duration (seconds). If `None`, calibration runs automatically. | `None` |

Source code in zeus/profile.py
def profile_cooldown_duration(
    target_function: Callable[[], Any],
    zeus_monitor: ZeusMonitor,
    cooldown_duration_search_range: list[float] | None = None,
    measurement_duration: float = 10.0,
    num_trials: int = 10,
    num_warmup_trials: int = 2,
    trial_stddev_threshold: float = 0.01,
    num_warmup_iterations: int = 10,
    num_calibration_iterations: int = 100,
    iteration_duration: float | None = None,
) -> SweepReport:
    """Sweep cooldown durations and return a [`SweepReport`][zeus.profile.SweepReport].

    Args:
        target_function: Callable to profile (invoked with no arguments).
        zeus_monitor: [`ZeusMonitor`][zeus.monitor.energy.ZeusMonitor] instance.
        cooldown_duration_search_range: Durations (seconds) to sweep.
            Defaults to `[1.0, 2.0, ..., 10.0]`.
        measurement_duration: Measurement duration held constant during
            the sweep.
        num_trials: Repeated trials per sweep point.
        num_warmup_trials: Number of throwaway trials to run before the sweep
            to warm up the GPU thermal state.
        trial_stddev_threshold: Maximum acceptable `energy_std` (Joules) for a
            duration to be considered valid.
        num_warmup_iterations: Warm-up iterations before each measurement.
        num_calibration_iterations: Iterations used to estimate per-iteration time.
        iteration_duration: Pre-calibrated iteration duration (seconds).  If `None`,
            calibration runs automatically.
    """
    search_range = cooldown_duration_search_range or _DEFAULT_SEARCH_RANGE
    if iteration_duration is None:
        iteration_duration = _calibrate_iteration_duration(
            target_function, zeus_monitor, num_warmup_iterations, num_calibration_iterations
        )

    if _is_rank_zero():
        logger.info("Sweeping cooldown_duration (measurement fixed at %.1f s)", measurement_duration)
    report = _sweep(
        target_function=target_function,
        zeus_monitor=zeus_monitor,
        sweep_values=search_range,
        fixed_value=measurement_duration,
        sweep_type="cooldown_duration",
        num_trials=num_trials,
        trial_stddev_threshold=trial_stddev_threshold,
        num_warmup_iterations=num_warmup_iterations,
        iteration_duration=iteration_duration,
        num_warmup_trials=num_warmup_trials,
    )
    if _is_rank_zero():
        logger.info("%s", report)
    return report

profile_parameters

profile_parameters(target_function, zeus_monitor, measurement_duration_search_range=None, cooldown_duration_search_range=None, num_trials=10, num_warmup_trials=2, trial_stddev_threshold=0.01, num_warmup_iterations=10, num_calibration_iterations=100)

Auto-profile both measurement and cooldown durations.

Performs two sequential sweeps:

  1. Measurement duration sweep -- cooldown is fixed at the maximum of cooldown_duration_search_range.
  2. Cooldown duration sweep -- measurement duration is fixed at the maximum of measurement_duration_search_range.

Warmup trials are run once before the first sweep only.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `target_function` | `Callable[[], Any]` | Callable to profile (invoked with no arguments). | required |
| `zeus_monitor` | `ZeusMonitor` | ZeusMonitor instance. | required |
| `measurement_duration_search_range` | `list[float] \| None` | Durations (seconds) to sweep. Defaults to `[1.0, 2.0, ..., 10.0]`. | `None` |
| `cooldown_duration_search_range` | `list[float] \| None` | Durations (seconds) to sweep. Defaults to `[1.0, 2.0, ..., 10.0]`. | `None` |
| `num_trials` | `int` | Repeated trials per sweep point. | `10` |
| `num_warmup_trials` | `int` | Number of throwaway trials to run before the first sweep to warm up the GPU thermal state. | `2` |
| `trial_stddev_threshold` | `float` | Maximum acceptable `energy_std` (Joules) for a duration to be considered valid. | `0.01` |
| `num_warmup_iterations` | `int` | Warm-up iterations before each measurement. | `10` |
| `num_calibration_iterations` | `int` | Iterations used to estimate per-iteration time. | `100` |

Returns:

| Type | Description |
| --- | --- |
| `tuple[SweepReport, SweepReport]` | `(measurement_sweep_report, cooldown_sweep_report)` |

Source code in zeus/profile.py
def profile_parameters(
    target_function: Callable[[], Any],
    zeus_monitor: ZeusMonitor,
    measurement_duration_search_range: list[float] | None = None,
    cooldown_duration_search_range: list[float] | None = None,
    num_trials: int = 10,
    num_warmup_trials: int = 2,
    trial_stddev_threshold: float = 0.01,
    num_warmup_iterations: int = 10,
    num_calibration_iterations: int = 100,
) -> tuple[SweepReport, SweepReport]:
    """Auto-profile both measurement and cooldown durations.

    Performs two sequential sweeps:

    1. **Measurement duration sweep** -- cooldown is fixed at the *maximum*
       of `cooldown_duration_search_range`.
    2. **Cooldown duration sweep** -- measurement duration is fixed at the
       *maximum* of `measurement_duration_search_range`.

    Warmup trials are run once before the first sweep only.

    Args:
        target_function: Callable to profile (invoked with no arguments).
        zeus_monitor: [`ZeusMonitor`][zeus.monitor.energy.ZeusMonitor] instance.
        measurement_duration_search_range: Durations (seconds) to sweep.
            Defaults to `[1.0, 2.0, ..., 10.0]`.
        cooldown_duration_search_range: Durations (seconds) to sweep.
            Defaults to `[1.0, 2.0, ..., 10.0]`.
        num_trials: Repeated trials per sweep point.
        num_warmup_trials: Number of throwaway trials to run before the first
            sweep to warm up the GPU thermal state.
        trial_stddev_threshold: Maximum acceptable `energy_std` (Joules) for a
            duration to be considered valid.
        num_warmup_iterations: Warm-up iterations before each measurement.
        num_calibration_iterations: Iterations used to estimate per-iteration time.

    Returns:
        `(measurement_sweep_report, cooldown_sweep_report)`
    """
    m_range = measurement_duration_search_range or _DEFAULT_SEARCH_RANGE
    c_range = cooldown_duration_search_range or _DEFAULT_SEARCH_RANGE

    iteration_duration = _calibrate_iteration_duration(
        target_function, zeus_monitor, num_warmup_iterations, num_calibration_iterations
    )

    measurement_report = profile_measurement_duration(
        target_function=target_function,
        zeus_monitor=zeus_monitor,
        measurement_duration_search_range=m_range,
        cooldown_duration=max(c_range),
        num_trials=num_trials,
        trial_stddev_threshold=trial_stddev_threshold,
        num_warmup_iterations=num_warmup_iterations,
        iteration_duration=iteration_duration,
        num_warmup_trials=num_warmup_trials,
    )

    cooldown_report = profile_cooldown_duration(
        target_function=target_function,
        zeus_monitor=zeus_monitor,
        cooldown_duration_search_range=c_range,
        measurement_duration=max(m_range),
        num_trials=num_trials,
        trial_stddev_threshold=trial_stddev_threshold,
        num_warmup_iterations=num_warmup_iterations,
        iteration_duration=iteration_duration,
        num_warmup_trials=0,
    )

    return measurement_report, cooldown_report
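
One way to use the two reports is to pick a duration from each and pass the pair to measure. The sketch below assumes a "shortest valid value" policy, which is an assumption of this example: the module only flags entries as valid or invalid and does not prescribe how to choose among them. shortest_valid and profile_then_measure are hypothetical helpers; the latter needs Zeus and a supported GPU, so only the selection helper is exercised here, against stand-in objects.

```python
from types import SimpleNamespace


def shortest_valid(report, attr):
    """Hypothetical helper: smallest swept value whose entry is valid, or None."""
    valid = [getattr(e, attr) for e in report.entries if e.is_valid]
    return min(valid) if valid else None


def profile_then_measure(target_function, zeus_monitor):
    """Sketch of an end-to-end flow; requires Zeus and a supported GPU."""
    # Imported lazily so the helper above can be used without Zeus installed.
    from zeus.profile import measure, profile_parameters

    m_report, c_report = profile_parameters(target_function, zeus_monitor)
    m = shortest_valid(m_report, "measurement_duration")
    c = shortest_valid(c_report, "cooldown_duration")
    if m is None or c is None:
        raise RuntimeError("No valid configuration found in at least one sweep.")
    return measure(target_function, zeus_monitor, measurement_duration=m, cooldown_duration=c)


# Stand-in report with made-up entries, shaped like SweepReport.entries.
fake_report = SimpleNamespace(
    entries=[
        SimpleNamespace(measurement_duration=1.0, is_valid=False),
        SimpleNamespace(measurement_duration=2.0, is_valid=True),
        SimpleNamespace(measurement_duration=3.0, is_valid=True),
    ]
)
print(shortest_valid(fake_report, "measurement_duration"))  # 2.0
```

Validity is not guaranteed to be monotonic in the swept value, so it can be worth reading the full report before settling on a duration.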

measure

measure(target_function, zeus_monitor, measurement_duration, cooldown_duration, num_warmup_iterations=10, num_calibration_iterations=100)

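Run a single energy measurement trial: calibrate the per-iteration time, then run one cooldown, warmup, and measurement cycle and return its TrialResult.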

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `target_function` | `Callable[[], Any]` | Callable to profile (invoked with no arguments). | required |
| `zeus_monitor` | `ZeusMonitor` | ZeusMonitor instance. | required |
| `measurement_duration` | `float` | Target measurement window length (seconds). | required |
| `cooldown_duration` | `float` | Idle time before the measurement (seconds). | required |
| `num_warmup_iterations` | `int` | Warm-up iterations before the measurement. | `10` |
| `num_calibration_iterations` | `int` | Iterations used to estimate per-iteration time. | `100` |

Source code in zeus/profile.py
def measure(
    target_function: Callable[[], Any],
    zeus_monitor: ZeusMonitor,
    measurement_duration: float,
    cooldown_duration: float,
    num_warmup_iterations: int = 10,
    num_calibration_iterations: int = 100,
) -> TrialResult:
    """Run a single energy measurement trial.

    Args:
        target_function: Callable to profile (invoked with no arguments).
        zeus_monitor: [`ZeusMonitor`][zeus.monitor.energy.ZeusMonitor] instance.
        measurement_duration: Target measurement window length (seconds).
        cooldown_duration: Idle time before the measurement (seconds).
        num_warmup_iterations: Warm-up iterations before the measurement.
        num_calibration_iterations: Iterations used to estimate per-iteration time.
    """
    iteration_duration = _calibrate_iteration_duration(
        target_function, zeus_monitor, num_warmup_iterations, num_calibration_iterations
    )

    return _run_trial(
        target_function=target_function,
        zeus_monitor=zeus_monitor,
        cooldown_duration=cooldown_duration,
        measurement_duration=measurement_duration,
        num_warmup_iterations=num_warmup_iterations,
        iteration_duration=iteration_duration,
    )