name: gitcode-pr-impact-locator
description: Use when needing to identify which merged PR caused a regression, behavioral change, or anomaly in a GitCode repository. Triggers on phrases like "which PR caused this", "定位引入问题的PR", "排查哪个PR导致的", "find the PR that broke X", or when user describes a symptom with a time window and target branch and wants to trace it to specific code changes.
GitCode PR Impact Locator
Systematically narrow down which merged PR introduced a problem by correlating PR metadata, file changes, and code diffs against the reported symptom.
Core Workflow
User describes symptom + time window + branch
│
▼
1. Clarify scope (missing info? ask)
│
▼
2. Fetch merged PR list (gc pr list)
│
▼
3. Triage: score each PR by relevance
(keywords, files touched, timing, size)
│
▼
4. Deep-dive top suspects (gc pr diff / curl)
│
▼
5. Cross-reference file history (curl fallback)
│
▼
6. Output ranked suspects with evidence
Step 1: Clarify Scope
Before fetching PRs, ensure you have:
| Required | Optional (but helpful) |
|---|---|
| Target repo (owner/repo) | Exact file path affected |
| Target branch | Known "good" vs "bad" commit/tag |
| Time window (e.g., "last 10 days") | Specific test/feature name |
| Symptom description | Expected vs actual counts/values |
Ask only for missing critical items. Don't block on optional ones.
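The checklist above can be encoded as a small pre-flight check. This is a minimal sketch: the key names (`repo`, `branch`, `time_window`, `symptom`) and the helper `missing_required` are hypothetical, chosen only to mirror the Required column.

```python
# Hypothetical keys mirroring the "Required" column of the table above.
REQUIRED_FIELDS = ("repo", "branch", "time_window", "symptom")


def missing_required(scope: dict) -> list:
    """Return the required scope fields that are absent or blank."""
    return [
        name for name in REQUIRED_FIELDS
        if not str(scope.get(name) or "").strip()
    ]


if __name__ == "__main__":
    scope = {"repo": "owner/repo", "branch": "master", "symptom": "test count dropped"}
    print(missing_required(scope))  # ['time_window']
```

An empty result means there is enough information to start fetching PRs; anything else is what to ask the user for.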
Step 2: Fetch Merged PRs
Primary: gc CLI
# First batch (most recent 30)
gc pr list -R <owner/repo> --state merged --base <branch> \
--sort updated --direction desc --limit 30 --format table
# Page 2+ (gc lacks --paginate)
gc pr list -R <owner/repo> --state merged --base <branch> \
--sort updated --direction desc --limit 30 --page 2 --format table
Fallback: curl API (larger batches, more fields)
curl -s -H "Authorization: Bearer $GC_TOKEN" \
"https://gitcode.com/api/v5/repos/<owner>/<repo>/pulls?state=merged&base=<branch>&sort=updated&direction=desc&per_page=100"
Extract with: | python3 -c "import json,sys; [print(f'#{p[\"number\"]} | {p[\"title\"]}') for p in json.load(sys.stdin)]"
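When the time window spans more PRs than one request returns, the pages have to be walked in a loop. The sketch below shows one way to do that. `fetch_all_pages` and the `fetch_page` callable are hypothetical names, and the stopping rule (a page shorter than `per_page` is the last one) is an assumption about the API's paging behavior, not documented fact.

```python
from typing import Callable, List


def fetch_all_pages(
    fetch_page: Callable[[int], List[dict]],
    per_page: int = 100,
    max_pages: int = 10,
) -> List[dict]:
    """Collect results page by page until a short or empty page is returned.

    fetch_page(page) is a caller-supplied function (hypothetical) that returns
    the parsed JSON list for one page, e.g. by running the curl call above
    with &page=<page> appended.
    """
    results: List[dict] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        results.extend(batch)
        if len(batch) < per_page:  # assumed: a short page means no more data
            break
    return results
```

The `max_pages` cap keeps a misbehaving endpoint from causing an endless loop.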
Get merge timestamps
The `gc pr list --format table` output doesn't show merge dates. Use curl to read `merged_at` (the `or` guard handles PRs where the field is null):
curl ... | python3 -c "
import json,sys
for p in json.load(sys.stdin):
    print(f'#{p[\"number\"]} | merged:{(p.get(\"merged_at\") or \"?\")[:10]} | {p[\"title\"]}')
"
Step 3: Triage — Score PRs by Relevance
For each PR in the time window, score against these dimensions:
Dimension 1: Keywords (high weight)
Match each PR title against problem-domain terms. For example, if the symptom is "test_ops.py test count dropped", useful keywords are test_ops, test, skip, case, patch, common_device, common_utils, and testing.
# Quick filter
gc pr list ... --format json | python3 -c "
import json,sys
keywords = ['test_ops','skip','patch','common_device']
for p in json.load(sys.stdin):
    score = sum(1 for kw in keywords if kw.lower() in p['title'].lower())
    if score > 0: print(f'#{p[\"number\"]} [score={score}] {p[\"title\"]}')
"
Dimension 2: Files Changed (high weight)
A PR that touches the affected file or its infrastructure is high-priority.
# Get changed files (curl — gc pr view doesn't show file list)
curl -s -H "Authorization: Bearer $GC_TOKEN" \
"https://gitcode.com/api/v5/repos/<owner>/<repo>/pulls/<number>/files?per_page=100" \
| python3 -c "import json,sys; [print(f['filename']) for f in json.load(sys.stdin)]"
Match against:
- The exact file path mentioned in the symptom
- Parent/sibling directories (e.g., `test_upstream/` if symptom is about test generation)
- Infrastructure files (`common_device_type.py`, `common_utils.py`, `patch_manager.py`)
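The three match rules can be folded into one score per PR. This is a rough sketch: the function name `file_relevance`, the 3/2/1 tiers, and the simplification of "parent/sibling directories" to "anything under the affected file's directory" are all assumptions made for illustration.

```python
import posixpath


def file_relevance(changed_files, affected_path, infra_names):
    """Score a PR's changed files: 3 = exact hit, 2 = infrastructure file,
    1 = under the affected file's directory, 0 = no overlap."""
    affected_dir = posixpath.dirname(affected_path)
    best = 0
    for path in changed_files:
        if path == affected_path:
            return 3
        # startswith() so that "common_utils.py.patch" still matches "common_utils.py"
        if any(posixpath.basename(path).startswith(name) for name in infra_names):
            best = max(best, 2)
        elif affected_dir and path.startswith(affected_dir + "/"):
            best = max(best, 1)
    return best
```

Feed it the filename list printed by the curl call above, one PR at a time.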
Dimension 3: Timing (medium weight)
PR merged shortly before symptom appeared → high relevance. Correlate merged_at with when the problem was first observed.
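One way to turn this correlation into a number is a linear decay over the time window. The sketch below is illustrative only: `timing_score`, the linear shape, and the default of 10 days are assumptions. Both timestamps are assumed to be ISO 8601 strings in the same timezone form, since mixing offset-aware and naive values raises `TypeError`.

```python
from datetime import datetime


def timing_score(merged_at: str, first_observed: str, window_days: int = 10) -> float:
    """Return 1.0 for a merge right before the symptom, decaying linearly to 0.0
    at window_days. A PR merged after the first observation scores 0.0, which
    assumes the observation time is accurate."""
    merged = datetime.fromisoformat(merged_at)
    observed = datetime.fromisoformat(first_observed)
    gap_days = (observed - merged).total_seconds() / 86400
    if gap_days < 0 or gap_days > window_days:
        return 0.0
    return 1.0 - gap_days / window_days
```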
Dimension 4: Change Magnitude (medium weight)
Large PRs (many files changed) that touch infrastructure are more likely to cause side effects. Check the change summary:
gc pr diff --repo <owner/repo> <number> 2>&1 | head -5
# Shows: Changes: +X -Y in Z file(s)
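To compare magnitudes across many PRs, the summary line can be parsed into numbers. This sketch assumes the line looks exactly like the comment above (`Changes: +X -Y in Z file(s)`); `parse_change_summary` is a hypothetical helper, and the pattern would need adjusting if the real output differs.

```python
import re

# Assumed format, taken from the comment above: "Changes: +X -Y in Z file(s)"
SUMMARY_RE = re.compile(r"Changes:\s*\+(\d+)\s+-(\d+)\s+in\s+(\d+)\s+file")


def parse_change_summary(text: str):
    """Return (additions, deletions, files) from the diff summary, or None if absent."""
    match = SUMMARY_RE.search(text)
    if match is None:
        return None
    additions, deletions, files = (int(group) for group in match.groups())
    return additions, deletions, files
```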
Quick Triage Table
Create a markdown table for the top candidates:
| PR | Score | Title | Files | Key Touchpoints |
|----|-------|-------|-------|-----------------|
| #36598 | *** | torch inductor patch add | 3 | common_device_type.py.patch |
| #34013 | ** | fix inductor patch apply bug | 96 | inductor .diff files |
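The table can be generated from per-dimension scores rather than filled in by hand. This is a minimal sketch: the text only says "high" and "medium" weight, so the numeric weights 3 and 2 are placeholders, and `total_score`, `triage_table`, and the candidate dict keys are hypothetical.

```python
# Placeholder weights: "high" -> 3, "medium" -> 2 (not specified by the workflow).
WEIGHTS = {"keywords": 3, "files": 3, "timing": 2, "magnitude": 2}


def total_score(scores: dict) -> float:
    """Weighted sum of per-dimension scores; missing dimensions count as 0."""
    return sum(weight * scores.get(dim, 0) for dim, weight in WEIGHTS.items())


def triage_table(candidates: list) -> str:
    """Render candidates as a markdown table, highest total score first.

    Each candidate is a dict with keys: number, title, files, touchpoints, scores.
    """
    ranked = sorted(candidates, key=lambda c: total_score(c["scores"]), reverse=True)
    lines = [
        "| PR | Score | Title | Files | Key Touchpoints |",
        "|----|-------|-------|-------|-----------------|",
    ]
    for c in ranked:
        lines.append(
            f"| #{c['number']} | {total_score(c['scores']):g} | {c['title']} "
            f"| {c['files']} | {c['touchpoints']} |"
        )
    return "\n".join(lines)
```

A numeric score replaces the star rating here; map it back to stars if a coarser display is preferred.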
Step 4: Deep-Dive Top Suspects
For each high-scoring PR, examine the actual code changes:
gc pr diff --repo <owner/repo> <number>
Focus on:
- Control flow changes — did a function's logic change in a way that alters what gets included/excluded?
- Configuration changes — did a list, mapping, or flag change?
- New additions — was a new file/patch added that modifies behavior?
- Removals — were any guards, skips, or filters removed?
Compare before/after for critical functions. Quote the exact diff lines as evidence.
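For large diffs, a quick scan for suspicious added or removed lines helps decide where to read closely. The sketch below assumes the diff is in unified format; `find_indicator_lines` and the example patterns are hypothetical. It does not replace reading the diff.

```python
import re


def find_indicator_lines(diff_text: str, patterns: list) -> list:
    """Return (sign, line) pairs for added/removed diff lines matching any regex."""
    compiled = [re.compile(p) for p in patterns]
    hits = []
    for line in diff_text.splitlines():
        # Skip file headers ("+++ b/...", "--- a/...") and context lines.
        # Caveat: a removed line whose own text starts with "--" is also skipped.
        if line.startswith(("+++", "---")) or not line.startswith(("+", "-")):
            continue
        body = line[1:]
        if any(rx.search(body) for rx in compiled):
            hits.append((line[0], body.strip()))
    return hits
```

For example, `find_indicator_lines(diff, [r"unittest\.skip", r"device_types?\s*="])` would list skip decorators and device-list assignments that were added (`+`) or removed (`-`).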
Step 5: Cross-Reference File History (curl fallback)
gc CLI lacks file history. Use curl:
# Commit history for a specific file
curl -s -H "Authorization: Bearer $GC_TOKEN" \
"https://gitcode.com/api/v5/repos/<owner>/<repo>/commits?path=<file>&sha=<branch>&per_page=10" \
| python3 -c "
import json,sys
for c in json.load(sys.stdin):
    print(c['commit']['committer']['date'], c['commit']['message'].split('\n')[0])
"
This reveals if the affected file was modified by a PR not obvious from title alone.
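The commits most worth a second look are the ones a keyword filter on titles would have missed. The sketch below picks those out of the commit list. `commits_without_keywords` is a hypothetical helper, and the field layout (`commit.message`, `commit.committer.date`) is taken from the script above.

```python
def commits_without_keywords(commits: list, keywords: list) -> list:
    """Return (date, subject) for commits whose subject line matches no keyword."""
    lowered = [kw.lower() for kw in keywords]
    found = []
    for c in commits:
        subject = c["commit"]["message"].split("\n")[0]
        if not any(kw in subject.lower() for kw in lowered):
            found.append((c["commit"]["committer"]["date"], subject))
    return found
```

Any commit it returns touched the affected file under a title that keyword triage would not have flagged, so its PR should be added to the suspect list.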
Step 6: Output Ranked Findings
## Findings
### Primary suspect: #N "PR title" (merged YYYY-MM-DD)
**Changed files:**
- `path/to/changed/file1`
- `path/to/changed/file2`
**Key evidence:** [specific code change and why it explains the symptom]
**Diff excerpt:**
```diff
- old behavior
+ new behavior (explains symptom)
```
### Secondary suspect: #M "PR title"
...
### Excluded PRs
| PR | Reason for exclusion |
|---|---|
| #X | Change scope does not overlap / timing does not match / ... |
Always include the **Excluded PRs** section; showing your work builds confidence.
## gc CLI Quick Reference
```bash
# Auth check
gc auth status
# List PRs
gc pr list -R owner/repo --state merged --base branch \
  --sort updated --direction desc --limit 30 [--page N]
# View PR
gc pr view --repo owner/repo <number>
gc pr view --repo owner/repo <number> --json # (note: may miss description/merged_at)
# View PR diff
gc pr diff --repo owner/repo <number>
# Search issues
gc issue list -R owner/repo --search "keyword" --state all --limit 10
# List labels
gc label list -R owner/repo
```
curl API Fallback Reference
Use when gc CLI is missing a capability. All calls need $GC_TOKEN from the environment.
# List PRs (supports per_page up to 100, includes merged_at)
curl -s -H "Authorization: Bearer $GC_TOKEN" \
"https://gitcode.com/api/v5/repos/<owner>/<repo>/pulls?state=merged&base=<branch>&sort=updated&direction=desc&per_page=100&page=1"
# PR details (includes body/description, merged_at)
curl -s -H "Authorization: Bearer $GC_TOKEN" \
"https://gitcode.com/api/v5/repos/<owner>/<repo>/pulls/<number>"
# PR files changed
curl -s -H "Authorization: Bearer $GC_TOKEN" \
"https://gitcode.com/api/v5/repos/<owner>/<repo>/pulls/<number>/files?per_page=100"
# File commit history (gc missing)
curl -s -H "Authorization: Bearer $GC_TOKEN" \
"https://gitcode.com/api/v5/repos/<owner>/<repo>/commits?path=<file>&sha=<branch>&per_page=10"
# File content at branch (for checking current state)
curl -s -H "Authorization: Bearer $GC_TOKEN" \
"https://gitcode.com/api/v5/repos/<owner>/<repo>/contents/<path>?ref=<branch>"
# Decode with: base64.b64decode(data['content']).decode()
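The decode step in the comment above can be written out as a small function. This sketch assumes the response is a JSON object with a base64-encoded `content` field, as that comment implies; `decode_contents` is a hypothetical helper name.

```python
import base64
import json


def decode_contents(response_text: str) -> str:
    """Decode the base64 'content' field of a contents-API JSON response into text."""
    data = json.loads(response_text)
    # b64decode discards characters outside the base64 alphabet by default,
    # so line breaks embedded in the encoded payload are harmless.
    return base64.b64decode(data["content"]).decode("utf-8")
```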
Problem-Specific Patterns
Test Case Count Changes (up or down)
Key files to check:
- `test_upstream/torch/testing/_internal/common_device_type.py.patch`: controls device-type test class generation
- `test_upstream/torch/testing/_internal/common_utils.py.patch`: test utility decorators (`@skip`, etc.)
- Test-specific `.patch` files matching the affected test file
Functions to scrutinize:
- `get_all_device_types()`: what device types are test classes generated for?
- `filter_desired_device_types()`: any filtering changes?
- `instantiate_device_type_tests()`: the entry point for test class generation
Key indicators in diffs:
- `@unittest.skip` added/removed
- Device type lists changed (`["cpu", "cuda"]` → `["cpu", "cuda", "npu"]`)
- Filtering/normalization logic added to device type handling
- Patch termination/formatting fixes (can cause patch application to silently change behavior)
Build / CI Failures
Key files: CI configs, build scripts, dependency pin files, submodule commit ID updates.
Performance Regressions
Key files: Hot-path source files, compiler/inductor config, kernel launch code, memory allocator changes.
Common Mistakes
| Mistake | Fix |
|---|---|
| Only checking PR title, not diff | Titles can be misleading. Always run `gc pr diff` on the top suspects. |
| Stopping at first plausible PR | List excluded PRs and why they were ruled out. |
| Ignoring small PRs | A 3-file change to test infrastructure can have more impact than a 96-file mechanical fix. |
| Not checking file commit history | A file might have been changed by a PR with an unrelated-looking title. |
| Forgetting to check common_device_type.py / common_utils.py patches | These test infrastructure patches affect ALL tests, not just the ones in their filename. |