author | Tvrtko Ursulin <tvrtko.ursulin@intel.com> | 2022-05-27 08:24:52 +0100 |
---|---|---|
committer | Tvrtko Ursulin <tvrtko.ursulin@intel.com> | 2022-06-17 09:03:11 +0100 |
commit | 45c64ecf97ee370bbdbd8eed7aed9c8ff5d1b0dd (patch) | |
tree | 1dd23d2db2a6b933895b9b6eef7acd233d8b1240 /drivers/gpu/drm/i915/gt/intel_context.h | |
parent | 1556c3b4c7ed2c8f17f200d53897251fc68b7377 (diff) |
drm/i915: Improve user experience and driver robustness under SIGINT or similar
We have long-standing customer complaints that pressing Ctrl-C (or something
to that effect) causes engine resets with otherwise well-behaved programs.
Not only is logging engine resets during normal operation undesirable,
since it creates support incidents, but more fundamentally we should avoid
taking the engine reset path when we can, since any engine reset introduces
a chance of harming an innocent context.
The reason for this undesirable behaviour is that the driver currently does
not distinguish between banned contexts and non-persistent contexts which
have been closed.
To fix this we add the distinction between the two reasons for revoking
contexts, which then allows the strict timeout to be applied only to banned
contexts, while innocent (well-behaved) contexts can preempt cleanly and
exit without triggering the engine reset path.
Note that the added context exiting category applies both to closed
non-persistent contexts and to any exiting context when hangcheck has been
disabled by the user.
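The distinction described above can be sketched as a small userspace model. This is an illustration only, not driver code: the `sketch_` names, the bit values and the `engine_timeout_ms` parameter are assumptions made for the example; only the two categories (banned, exiting) and the 1 ms value of INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS come from this patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the CONTEXT_BANNED / CONTEXT_EXITING flag bits. */
enum sketch_ctx_flag {
	SKETCH_CTX_BANNED  = 1u << 0,
	SKETCH_CTX_EXITING = 1u << 1,
};

/* Mirrors INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS from the patch. */
#define SKETCH_BANNED_PREEMPT_TIMEOUT_MS 1u

struct sketch_ctx {
	unsigned int flags;
};

/* A context is schedulable only if it is neither banned nor exiting. */
static inline bool sketch_is_schedulable(const struct sketch_ctx *ctx)
{
	return !(ctx->flags & (SKETCH_CTX_BANNED | SKETCH_CTX_EXITING));
}

/*
 * Pick the preempt timeout used when revoking a context.
 * Assumption: engine_timeout_ms is whatever timeout the engine
 * would normally grant a context to preempt cleanly.
 */
static inline unsigned int
sketch_revoke_timeout_ms(const struct sketch_ctx *ctx,
			 unsigned int engine_timeout_ms)
{
	if (ctx->flags & SKETCH_CTX_BANNED)
		return SKETCH_BANNED_PREEMPT_TIMEOUT_MS; /* strict: banned only */

	return engine_timeout_ms; /* exiting contexts get the normal grace period */
}
```

In this model both categories stop being schedulable, but only the banned one is held to the strict timeout, which is the property that lets a well-behaved exiting context leave without an engine reset.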
At the same time we rename the backend operation from 'ban' to 'revoke'
which more accurately describes the actual semantics. (There is no ban at
the backend level since banning is a concept driven by the scheduling
frontend. Backends are simply able to revoke a running context so that
is the more appropriate name chosen.)
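The split between frontend policy and backend mechanism can be illustrated with a minimal model. Everything named `model_` here is hypothetical, and the callback signature is an assumption for the example rather than the driver's actual one; the sketch only shows a frontend "ban" that records its decision and then asks an optional backend hook to revoke the context.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_ctx;

/* Backend operations: the backend only knows how to revoke, not why. */
struct model_ctx_ops {
	/* Optional hook; signature is an assumption for this sketch. */
	void (*revoke)(struct model_ctx *ctx, unsigned int preempt_timeout_ms);
};

struct model_ctx {
	const struct model_ctx_ops *ops;
	bool banned;                  /* frontend policy state */
	unsigned int last_timeout_ms; /* recorded by the sample backend */
};

/* Sample backend: just records the timeout it was asked to honour. */
static inline void model_backend_revoke(struct model_ctx *ctx,
					unsigned int preempt_timeout_ms)
{
	ctx->last_timeout_ms = preempt_timeout_ms;
}

/*
 * Frontend ban: mark the context banned, then revoke it with the
 * strict 1 ms timeout. Returns whether it was already banned, like
 * the test_and_set_bit() based helper in the diff.
 */
static inline bool model_ctx_ban(struct model_ctx *ctx)
{
	bool was_banned = ctx->banned;

	ctx->banned = true;
	if (ctx->ops && ctx->ops->revoke)
		ctx->ops->revoke(ctx, 1);

	return was_banned;
}
```

Keeping the ban decision in the frontend means the same backend hook could serve both revocation reasons, with only the timeout argument differing.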
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220527072452.2225610-1-tvrtko.ursulin@linux.intel.com
Diffstat (limited to 'drivers/gpu/drm/i915/gt/intel_context.h')
-rw-r--r-- | drivers/gpu/drm/i915/gt/intel_context.h | 25 |
1 files changed, 18 insertions, 7 deletions
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index b7d3214d2cdd..8e2d70630c49 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -25,6 +25,8 @@
 		     ##__VA_ARGS__); \
 } while (0)

+#define INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS (1)
+
 struct i915_gem_ww_ctx;

 void intel_context_init(struct intel_context *ce,
@@ -309,18 +311,27 @@ static inline bool intel_context_set_banned(struct intel_context *ce)
 	return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
 }

-static inline bool intel_context_ban(struct intel_context *ce,
-				     struct i915_request *rq)
+bool intel_context_ban(struct intel_context *ce, struct i915_request *rq);
+
+static inline bool intel_context_is_schedulable(const struct intel_context *ce)
 {
-	bool ret = intel_context_set_banned(ce);
+	return !test_bit(CONTEXT_EXITING, &ce->flags) &&
+	       !test_bit(CONTEXT_BANNED, &ce->flags);
+}

-	trace_intel_context_ban(ce);
-	if (ce->ops->ban)
-		ce->ops->ban(ce, rq);
+static inline bool intel_context_is_exiting(const struct intel_context *ce)
+{
+	return test_bit(CONTEXT_EXITING, &ce->flags);
+}

-	return ret;
+static inline bool intel_context_set_exiting(struct intel_context *ce)
+{
+	return test_and_set_bit(CONTEXT_EXITING, &ce->flags);
 }

+bool intel_context_exit_nonpersistent(struct intel_context *ce,
+				      struct i915_request *rq);
+
 static inline bool
 intel_context_force_single_submission(const struct intel_context *ce)
 {