From 1fc719d13ac07b082ff9daa8f40254680cade111 Mon Sep 17 00:00:00 2001
From: Chris Wilson
Date: Mon, 11 Jun 2018 11:48:08 +0100
Subject: drm/i915/ringbuffer: Brute force context restore

An issue encountered with switching mm on gen7 is that the GPU likes to
hang (with the VS unit busy) when told to force restore the current
context. We can simply work around this by substituting the
MI_FORCE_RESTORE flag with a round-trip through the kernel_context,
forcing the context to be saved and restored; thereby reloading the
PP_DIR registers and updating the modified page directory!

v2: Undo attempted optimisation in caller (Tvrtko)

Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Mika Kuoppala
Cc: Matthew Auld
Cc: Tvrtko Ursulin
Reviewed-by: Joonas Lahtinen
Reviewed-by: Mika Kuoppala
Link: https://patchwork.freedesktop.org/patch/msgid/20180611104808.24295-1-chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

(limited to 'drivers/gpu/drm/i915/intel_ringbuffer.c')

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6496c1d00dbb..5bc53a5f4504 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1456,6 +1456,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		(HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ?
 		INTEL_INFO(i915)->num_rings - 1 :
 		0;
+	bool force_restore = false;
 	int len;
 	u32 *cs;
@@ -1469,6 +1470,12 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 	len = 4;
 	if (IS_GEN7(i915))
 		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
+	if (flags & MI_FORCE_RESTORE) {
+		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
+		flags &= ~MI_FORCE_RESTORE;
+		force_restore = true;
+		len += 2;
+	}
 
 	cs = intel_ring_begin(rq, len);
 	if (IS_ERR(cs))
@@ -1493,6 +1500,26 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		}
 	}
 
+	if (force_restore) {
+		/*
+		 * The HW doesn't handle being told to restore the current
+		 * context very well. Quite often it likes to go off and
+		 * sulk, especially when it is meant to be reloading PP_DIR.
+		 * A very simple fix to force the reload is to simply switch
+		 * away from the current context and back again.
+		 *
+		 * Note that the kernel_context will contain random state
+		 * following the INHIBIT_RESTORE. We accept this since we
+		 * never use the kernel_context state; it is merely a
+		 * placeholder we use to flush other contexts.
+		 */
+		*cs++ = MI_SET_CONTEXT;
+		*cs++ = i915_ggtt_offset(to_intel_context(i915->kernel_context,
+							  engine)->state) |
+			MI_MM_SPACE_GTT |
+			MI_RESTORE_INHIBIT;
+	}
+
 	*cs++ = MI_NOOP;
 	*cs++ = MI_SET_CONTEXT;
 	*cs++ = i915_ggtt_offset(rq->hw_context->state) | flags;
-- cgit

From b3ee09a4de33259a89d30aca6b2ebb0bc26640af Mon Sep 17 00:00:00 2001
From: Chris Wilson
Date: Mon, 11 Jun 2018 12:08:44 +0100
Subject: drm/i915/ringbuffer: Fix context restore upon reset

The discovery while trying to enable full-ppgtt was that we were
completely failing to load both the mm and context following the reset.
Although we were performing mmio to set the PP_DIR (per-process GTT)
and CCID (context), these were taking no effect (the assumption was
that this would trigger reload of the context and restore the page
tables). It was not until we performed the LRI + MI_SET_CONTEXT in a
following context switch that anything would occur.
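(For orientation, the CS sequence that does take effect is roughly the
following; this is an illustrative condensation of the load_pd_dir() and
mi_set_context() hunks below, not the literal patch:)

	/* Load the PP_DIR registers from the ring via LRI... */
	*cs++ = MI_LOAD_REGISTER_IMM(1);
	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_DCLV(engine));
	*cs++ = PP_DIR_DCLV_2G;

	*cs++ = MI_LOAD_REGISTER_IMM(1);
	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_BASE(engine));
	*cs++ = ppgtt->pd.base.ggtt_offset << 10;

	/* ...then switch to the target context, committing the new mm. */
	*cs++ = MI_SET_CONTEXT;
	*cs++ = i915_ggtt_offset(rq->hw_context->state) | flags;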
Since we are then required to reset the context image and PP_DIR using
CS commands, we place those commands into every batch. The hardware
should recognise the no-ops and eliminate the expensive context loads,
but we still have to pay the cost of using cross-powerwell register
writes. In practice, this has no effect on actual context switch times,
and only adds a few hundred nanoseconds to no-op switches. We can
improve the latter by eliminating the w/a around known no-op switches,
but there is an ulterior motive to keeping them.

Always emitting the context switch at the beginning of the request (and
relying on HW to skip unneeded switches) does have one key advantage.
Should we implement request reordering on Haswell, we will not know in
advance what the previous executing context was on the GPU and so we
would not be able to elide the MI_SET_CONTEXT commands ourselves and
would always have to emit them. Having our hand forced now actually
prepares us for later.

Now that the context and mm follow the request, we no longer (and have
not for a long time, since requests took over!) require a trace point to
tell us when we write the switch into the ring, since it is always
there. (This is even more important when you remember that simply
writing into the ring bears no relation to the current mm.)

v2: Sandybridge has to agree to use LRI as well.

Testcase: igt/drv_selftests/live_hangcheck
Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Mika Kuoppala
Cc: Matthew Auld
Cc: Tvrtko Ursulin
Reviewed-by: Joonas Lahtinen
Link: https://patchwork.freedesktop.org/patch/msgid/20180611110845.31890-1-chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 125 ++++++++++++++++----------------
 1 file changed, 61 insertions(+), 64 deletions(-)

(limited to 'drivers/gpu/drm/i915/intel_ringbuffer.c')

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5bc53a5f4504..d72a6a5ff3ac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -541,11 +541,23 @@ static struct i915_request *reset_prepare(struct intel_engine_cs *engine)
 	return i915_gem_find_active_request(engine);
 }
 
-static void reset_ring(struct intel_engine_cs *engine,
-		       struct i915_request *request)
+static void skip_request(struct i915_request *rq)
 {
-	GEM_TRACE("%s seqno=%x\n",
-		  engine->name, request ? request->global_seqno : 0);
+	void *vaddr = rq->ring->vaddr;
+	u32 head;
+
+	head = rq->infix;
+	if (rq->postfix < head) {
+		memset32(vaddr + head, MI_NOOP,
+			 (rq->ring->size - head) / sizeof(u32));
+		head = 0;
+	}
+	memset32(vaddr + head, MI_NOOP, (rq->postfix - head) / sizeof(u32));
+}
+
+static void reset_ring(struct intel_engine_cs *engine, struct i915_request *rq)
+{
+	GEM_TRACE("%s seqno=%x\n", engine->name, rq ? rq->global_seqno : 0);
 
 	/*
 	 * RC6 must be prevented until the reset is complete and the engine
@@ -569,43 +581,11 @@ static void reset_ring(struct intel_engine_cs *engine,
 	 * If the request was innocent, we try to replay the request with
 	 * the restored context.
 	 */
-	if (request) {
-		struct drm_i915_private *dev_priv = request->i915;
-		struct intel_context *ce = request->hw_context;
-		struct i915_hw_ppgtt *ppgtt;
-
-		if (ce->state) {
-			I915_WRITE(CCID,
-				   i915_ggtt_offset(ce->state) |
-				   BIT(8) /* must be set! */ |
-				   CCID_EXTENDED_STATE_SAVE |
-				   CCID_EXTENDED_STATE_RESTORE |
-				   CCID_EN);
-		}
-
-		ppgtt = request->gem_context->ppgtt ?: engine->i915->mm.aliasing_ppgtt;
-		if (ppgtt) {
-			u32 pd_offset = ppgtt->pd.base.ggtt_offset << 10;
-
-			I915_WRITE(RING_PP_DIR_DCLV(engine), PP_DIR_DCLV_2G);
-			I915_WRITE(RING_PP_DIR_BASE(engine), pd_offset);
-
-			/* Wait for the PD reload to complete */
-			if (intel_wait_for_register(dev_priv,
-						    RING_PP_DIR_BASE(engine),
-						    BIT(0), 0,
-						    10))
-				DRM_ERROR("Wait for reload of ppgtt page-directory timed out\n");
-
-			ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
-		}
-
+	if (rq) {
 		/* If the rq hung, jump to its breadcrumb and skip the batch */
-		if (request->fence.error == -EIO)
-			request->ring->head = request->postfix;
-	} else {
-		engine->legacy_active_context = NULL;
-		engine->legacy_active_ppgtt = NULL;
+		rq->ring->head = intel_ring_wrap(rq->ring, rq->head);
+		if (rq->fence.error == -EIO)
+			skip_request(rq);
 	}
 }
@@ -1446,6 +1426,29 @@ void intel_legacy_submission_resume(struct drm_i915_private *dev_priv)
 		intel_ring_reset(engine->buffer, 0);
 }
 
+static int load_pd_dir(struct i915_request *rq,
+		       const struct i915_hw_ppgtt *ppgtt)
+{
+	const struct intel_engine_cs * const engine = rq->engine;
+	u32 *cs;
+
+	cs = intel_ring_begin(rq, 6);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_DCLV(engine));
+	*cs++ = PP_DIR_DCLV_2G;
+
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_BASE(engine));
+	*cs++ = ppgtt->pd.base.ggtt_offset << 10;
+
+	intel_ring_advance(rq, cs);
+
+	return 0;
+}
+
 static inline int mi_set_context(struct i915_request *rq, u32 flags)
 {
 	struct drm_i915_private *i915 = rq->i915;
@@ -1590,31 +1593,28 @@ static int remap_l3(struct i915_request *rq, int slice)
 static int switch_context(struct i915_request *rq)
 {
 	struct intel_engine_cs *engine = rq->engine;
-	struct i915_gem_context *to_ctx = rq->gem_context;
-	struct i915_hw_ppgtt *to_mm =
-		to_ctx->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
-	struct i915_gem_context *from_ctx = engine->legacy_active_context;
-	struct i915_hw_ppgtt *from_mm = engine->legacy_active_ppgtt;
+	struct i915_gem_context *ctx = rq->gem_context;
+	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
+	unsigned int unwind_mm = 0;
 	u32 hw_flags = 0;
 	int ret, i;
 
 	lockdep_assert_held(&rq->i915->drm.struct_mutex);
 	GEM_BUG_ON(HAS_EXECLISTS(rq->i915));
 
-	if (to_mm != from_mm ||
-	    (to_mm && intel_engine_flag(engine) & to_mm->pd_dirty_rings)) {
-		trace_switch_mm(engine, to_ctx);
-		ret = to_mm->switch_mm(to_mm, rq);
+	if (ppgtt) {
+		ret = load_pd_dir(rq, ppgtt);
 		if (ret)
 			goto err;
 
-		to_mm->pd_dirty_rings &= ~intel_engine_flag(engine);
-		engine->legacy_active_ppgtt = to_mm;
-		hw_flags = MI_FORCE_RESTORE;
+		if (intel_engine_flag(engine) & ppgtt->pd_dirty_rings) {
+			unwind_mm = intel_engine_flag(engine);
+			ppgtt->pd_dirty_rings &= ~unwind_mm;
+			hw_flags = MI_FORCE_RESTORE;
+		}
 	}
 
-	if (rq->hw_context->state &&
-	    (to_ctx != from_ctx || hw_flags & MI_FORCE_RESTORE)) {
+	if (rq->hw_context->state) {
 		GEM_BUG_ON(engine->id != RCS);
 
 		/*
@@ -1624,35 +1624,32 @@ static int switch_context(struct i915_request *rq)
 	 * as nothing actually executes using the kernel context; it
 	 * is purely used for flushing user contexts.
 	 */
-		if (i915_gem_context_is_kernel(to_ctx))
+		if (i915_gem_context_is_kernel(ctx))
 			hw_flags = MI_RESTORE_INHIBIT;
 
 		ret = mi_set_context(rq, hw_flags);
 		if (ret)
 			goto err_mm;
-
-		engine->legacy_active_context = to_ctx;
 	}
 
-	if (to_ctx->remap_slice) {
+	if (ctx->remap_slice) {
 		for (i = 0; i < MAX_L3_SLICES; i++) {
-			if (!(to_ctx->remap_slice & BIT(i)))
+			if (!(ctx->remap_slice & BIT(i)))
 				continue;
 
 			ret = remap_l3(rq, i);
 			if (ret)
-				goto err_ctx;
+				goto err_mm;
 		}
 
-		to_ctx->remap_slice = 0;
+		ctx->remap_slice = 0;
 	}
 
 	return 0;
 
-err_ctx:
-	engine->legacy_active_context = from_ctx;
 err_mm:
-	engine->legacy_active_ppgtt = from_mm;
+	if (unwind_mm)
+		ppgtt->pd_dirty_rings |= unwind_mm;
 err:
 	return ret;
 }
-- cgit

From 41d37680ca0b157ad1ed29409bcdc2b0bc21d11f Mon Sep 17 00:00:00 2001
From: Chris Wilson
Date: Mon, 11 Jun 2018 12:08:45 +0100
Subject: drm/i915: Wrap around the tail offset before setting ring->tail

The HW only accepts offsets within ring->size, and fails peculiarly if
the RING_HEAD or RING_TAIL is set to ring->size. Therefore whenever we
set ring->head/ring->tail, we want to make sure it is within range
(using intel_ring_wrap()).

v2: Double check execlists as well
v3: Remove redundancy with assert_ring_tail_valid()
v4: Just assert in intel_ring_reset() rather than be over-defensive.

Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Mika Kuoppala
Cc: Matthew Auld
Cc: Tvrtko Ursulin
Reviewed-by: Joonas Lahtinen #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20180611110845.31890-2-chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 6 ++++++
 1 file changed, 6 insertions(+)

(limited to 'drivers/gpu/drm/i915/intel_ringbuffer.c')

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d72a6a5ff3ac..bb4af8f84a22 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -496,6 +496,10 @@ static int init_ring_common(struct intel_engine_cs *engine)
 		DRM_DEBUG_DRIVER("%s initialization failed [head=%08x], fudging\n",
 				 engine->name, I915_READ_HEAD(engine));
 
+	/* Check that the ring offsets point within the ring! */
+	GEM_BUG_ON(!intel_ring_offset_valid(ring, ring->head));
+	GEM_BUG_ON(!intel_ring_offset_valid(ring, ring->tail));
+
 	intel_ring_update_space(ring);
 	I915_WRITE_HEAD(engine, ring->head);
 	I915_WRITE_TAIL(engine, ring->tail);
@@ -1062,6 +1066,8 @@ err:
 
 void intel_ring_reset(struct intel_ring *ring, u32 tail)
 {
+	GEM_BUG_ON(!intel_ring_offset_valid(ring, tail));
+
 	ring->tail = tail;
 	ring->head = tail;
 	ring->emit = tail;
-- cgit

From d9d117e40d4ffc03438177eeac83d96dfeee76be Mon Sep 17 00:00:00 2001
From: Chris Wilson
Date: Mon, 11 Jun 2018 18:18:25 +0100
Subject: drm/i915/ringbuffer: Serialize load of PD_DIR

After triggering the mm switch with a load of PD_DIR, which may be
deferred until the MI_SET_CONTEXT on rcs, serialise the next commands
with that load by posting a read of PD_DIR (or else those subsequent
commands may access the stale page tables).
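(The resulting ordering in switch_context() is, as a condensed sketch of
the hunks below - ppgtt, hw_flags and the bare returns stand in for the
locals and unwind labels of the real function, and the MI_SET_CONTEXT
step only applies on the render engine:)

	if (ppgtt) {
		/* 1. LRI the new page directory into PP_DIR. */
		ret = load_pd_dir(rq, ppgtt);
		if (ret)
			return ret;
	}

	/* 2. Switch contexts; on rcs the PD load may be deferred
	 * until this point.
	 */
	ret = mi_set_context(rq, hw_flags);
	if (ret)
		return ret;

	if (ppgtt) {
		/* 3. Posting read of PP_DIR_BASE, stalling until the
		 * page table load completes before any commands that
		 * use the new mm can execute.
		 */
		ret = flush_pd_dir(rq);
		if (ret)
			return ret;
	}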
Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Mika Kuoppala
Cc: Matthew Auld
Reviewed-by: Joonas Lahtinen
Link: https://patchwork.freedesktop.org/patch/msgid/20180611171825.13678-2-chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 49 +++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 12 deletions(-)

(limited to 'drivers/gpu/drm/i915/intel_ringbuffer.c')

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index bb4af8f84a22..735622dc73ec 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1361,8 +1361,9 @@ intel_ring_context_pin(struct intel_engine_cs *engine,
 
 static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 {
-	struct intel_ring *ring;
 	struct i915_timeline *timeline;
+	struct intel_ring *ring;
+	unsigned int size;
 	int err;
 
 	intel_engine_setup_common(engine);
@@ -1388,12 +1389,21 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 	GEM_BUG_ON(engine->buffer);
 	engine->buffer = ring;
 
-	err = intel_engine_init_common(engine);
+	size = PAGE_SIZE;
+	if (HAS_BROKEN_CS_TLB(engine->i915))
+		size = I830_WA_SIZE;
+	err = intel_engine_create_scratch(engine, size);
 	if (err)
 		goto err_unpin;
 
+	err = intel_engine_init_common(engine);
+	if (err)
+		goto err_scratch;
+
 	return 0;
 
+err_scratch:
+	intel_engine_cleanup_scratch(engine);
 err_unpin:
 	intel_ring_unpin(ring);
 err_ring:
@@ -1455,6 +1465,25 @@ static int load_pd_dir(struct i915_request *rq,
 	return 0;
 }
 
+static int flush_pd_dir(struct i915_request *rq)
+{
+	const struct intel_engine_cs * const engine = rq->engine;
+	u32 *cs;
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	/* Stall until the page table load is complete */
+	*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
+	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_BASE(engine));
+	*cs++ = i915_ggtt_offset(engine->scratch);
+	*cs++ = MI_NOOP;
+
+	intel_ring_advance(rq, cs);
+	return 0;
+}
+
 static inline int mi_set_context(struct i915_request *rq, u32 flags)
 {
 	struct drm_i915_private *i915 = rq->i915;
@@ -1638,6 +1667,12 @@ static int switch_context(struct i915_request *rq)
 			goto err_mm;
 	}
 
+	if (ppgtt) {
+		ret = flush_pd_dir(rq);
+		if (ret)
+			goto err_mm;
+	}
+
 	if (ctx->remap_slice) {
 		for (i = 0; i < MAX_L3_SLICES; i++) {
 			if (!(ctx->remap_slice & BIT(i)))
@@ -2158,16 +2193,6 @@ int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
-	if (INTEL_GEN(dev_priv) >= 6) {
-		ret = intel_engine_create_scratch(engine, PAGE_SIZE);
-		if (ret)
-			return ret;
-	} else if (HAS_BROKEN_CS_TLB(dev_priv)) {
-		ret = intel_engine_create_scratch(engine, I830_WA_SIZE);
-		if (ret)
-			return ret;
-	}
-
 	return 0;
 }
-- cgit

From a2bbf714834288da0ed4668d4e4d895ffef73de2 Mon Sep 17 00:00:00 2001
From: Chris Wilson
Date: Thu, 14 Jun 2018 10:41:03 +0100
Subject: drm/i915/gtt: Only keep gen6 page directories pinned while active

In order to be able to evict the gen6 ppgtt, we have to unpin it at
some point. We can simply use our context activity tracking to know
when the ppgtt is no longer in use by hardware, and so only keep it
pinned while it is being used by a request.

For the kernel_context (and thus aliasing_ppgtt), it remains pinned at
all times, as the kernel_context itself is pinned at all times.
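(A sketch of which ppgtt a context ends up pinning; the helper name
ctx_ppgtt() is hypothetical, but the ?: fallback mirrors the
__context_pin_ppgtt()/__context_unpin_ppgtt() pair added below:)

	static struct i915_hw_ppgtt *ctx_ppgtt(struct i915_gem_context *ctx)
	{
		/*
		 * Full-ppgtt contexts carry their own page directory;
		 * everyone else shares the aliasing_ppgtt, which stays
		 * pinned for as long as the always-pinned kernel_context
		 * does.
		 */
		return ctx->ppgtt ?: ctx->i915->mm.aliasing_ppgtt;
	}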
Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Mika Kuoppala
Cc: Matthew Auld
Reviewed-by: Joonas Lahtinen
Link: https://patchwork.freedesktop.org/patch/msgid/20180614094103.18025-5-chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

(limited to 'drivers/gpu/drm/i915/intel_ringbuffer.c')

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 735622dc73ec..6a937a896651 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1179,6 +1179,27 @@ static void intel_ring_context_destroy(struct intel_context *ce)
 	__i915_gem_object_release_unless_active(ce->state->obj);
 }
 
+static int __context_pin_ppgtt(struct i915_gem_context *ctx)
+{
+	struct i915_hw_ppgtt *ppgtt;
+	int err = 0;
+
+	ppgtt = ctx->ppgtt ?: ctx->i915->mm.aliasing_ppgtt;
+	if (ppgtt)
+		err = gen6_ppgtt_pin(ppgtt);
+
+	return err;
+}
+
+static void __context_unpin_ppgtt(struct i915_gem_context *ctx)
+{
+	struct i915_hw_ppgtt *ppgtt;
+
+	ppgtt = ctx->ppgtt ?: ctx->i915->mm.aliasing_ppgtt;
+	if (ppgtt)
+		gen6_ppgtt_unpin(ppgtt);
+}
+
 static int __context_pin(struct intel_context *ce)
 {
 	struct i915_vma *vma;
@@ -1227,6 +1248,7 @@ static void __context_unpin(struct intel_context *ce)
 
 static void intel_ring_context_unpin(struct intel_context *ce)
 {
+	__context_unpin_ppgtt(ce->gem_context);
 	__context_unpin(ce);
 
 	i915_gem_context_put(ce->gem_context);
@@ -1324,6 +1346,10 @@ __ring_context_pin(struct intel_engine_cs *engine,
 	if (err)
 		goto err;
 
+	err = __context_pin_ppgtt(ce->gem_context);
+	if (err)
+		goto err_unpin;
+
 	i915_gem_context_get(ctx);
 
 	/* One ringbuffer to rule them all */
@@ -1332,6 +1358,8 @@ __ring_context_pin(struct intel_engine_cs *engine,
 
 	return ce;
 
+err_unpin:
+	__context_unpin(ce);
 err:
 	ce->pin_count = 0;
 	return ERR_PTR(err);
-- cgit

From 4fdd5b4e9aba5fbbc6d3072a5a87fa1d3f3fc030 Mon Sep 17 00:00:00 2001
From: Chris Wilson
Date: Sat, 16 Jun 2018 21:25:34 +0100
Subject: drm/i915: Fix fallout of fake reset along resume

commit b2209e62a450 ("drm/i915/execlists: Reset the CSB head tracking on
reset/sanitization") and commit 1288786b18f7 ("drm/i915: Move GEM
sanitize from resume_early to resume") show the conflicting requirements
on the code. We must reset the GPU before trashing live state on a fast
resume (hibernation debug, or error paths), but we must only reset our
state tracking iff the GPU is reset (or power cycled). This is tricky if
we are disabling GPU reset to simulate broken hardware; we reset our
state tracking but the GPU is left intact and recovers from its stale
state.

v2: Again without the assertion for forcewake, no longer required since
commit b3ee09a4de33 ("drm/i915/ringbuffer: Fix context restore upon
reset") as the contexts are reset from the CS, ensuring everything is
powered up.
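(For reference, reset_ring() as it stands after this series, condensed
from the hunks in the second patch; the engine reset path now only
rewinds the ring, as the context and PD are restored via CS commands in
the next request, so no forcewake needs to be held:)

	static void reset_ring(struct intel_engine_cs *engine,
			       struct i915_request *rq)
	{
		if (rq) {
			/* Rewind the ring to replay (or overwrite) rq. */
			rq->ring->head = intel_ring_wrap(rq->ring, rq->head);
			if (rq->fence.error == -EIO)
				skip_request(rq); /* noop out the guilty batch */
		}
	}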
Fixes: b2209e62a450 ("drm/i915/execlists: Reset the CSB head tracking on reset/sanitization")
Fixes: 1288786b18f7 ("drm/i915: Move GEM sanitize from resume_early to resume")
Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Mika Kuoppala
Cc: Joonas Lahtinen
Reviewed-by: Joonas Lahtinen
Link: https://patchwork.freedesktop.org/patch/msgid/20180616202534.18767-1-chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 8 --------
 1 file changed, 8 deletions(-)

(limited to 'drivers/gpu/drm/i915/intel_ringbuffer.c')

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6a937a896651..4dae23885006 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -563,14 +563,6 @@ static void reset_ring(struct intel_engine_cs *engine, struct i915_request *rq)
 {
 	GEM_TRACE("%s seqno=%x\n", engine->name, rq ? rq->global_seqno : 0);
 
-	/*
-	 * RC6 must be prevented until the reset is complete and the engine
-	 * reinitialised. If it occurs in the middle of this sequence, the
-	 * state written to/loaded from the power context is ill-defined (e.g.
-	 * the PP_BASE_DIR may be lost).
-	 */
-	assert_forcewakes_active(engine->i915, FORCEWAKE_ALL);
-
 	/*
 	 * Try to restore the logical GPU state to match the continuation
 	 * of the request queue. If we skip the context/PD restore, then
-- cgit
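(A closing note on the wrap/validity helpers that the third patch leans
on: the sketch below reconstructs their semantics, assuming ring->size
is a power of two as it is throughout this driver; the real definitions
live in intel_ringbuffer.h:)

	static inline u32 intel_ring_wrap(const struct intel_ring *ring, u32 pos)
	{
		return pos & (ring->size - 1); /* keep offsets inside the ring */
	}

	static inline bool
	intel_ring_offset_valid(const struct intel_ring *ring, unsigned int pos)
	{
		if (pos & -ring->size) /* must be strictly within the ring */
			return false;

		if (!IS_ALIGNED(pos, 8)) /* must be qword aligned */
			return false;

		return true;
	}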