Align cleanup bytecode layout with CPython by youknowone · Pull Request #7935 · RustPython/RustPython

youknowone · 2026-05-20T10:12:48Z

Summary

Align exception, finally, and with cleanup bytecode layout with CPython 3.14 compiler behavior for issue youknowone#32.

This updates shared codegen/CFG assembly behavior rather than editing Lib/, using the local CPython compiler sources as reference:

Python/codegen.c for try/finally, with, while, break/continue, and named-except cleanup layout
Python/flowgraph.c for late CFG cleanup and load-fast borrow behavior
Python/assemble.c for jump offset fixed-point behavior around EXTENDED_ARG

Validation

cargo fmt --check
cargo test -p rustpython-codegen --lib
cargo build --release -p rustpython
cargo clippy -p rustpython-codegen --lib
git diff --check
scripts/compare_bytecode.py with release RustPython for the issue Add proper support for kwargs #32 grouped scope:
- multiprocessing/forkserver.py
- test/test_compile.py
- test/test_faulthandler.py
- test/test_hashlib.py
- test/test_importlib/extension/test_loader.py
- test/test_poplib.py
- test/test_socket.py
- test/test_sys_settrace.py
- test/test_unittest/testmock/testsealable.py

All grouped-scope compare reports are at 0 differences.

Summary by CodeRabbit

Refactor
- Improved compiler control-flow and bytecode layout generation for complex constructs (match OR-patterns, try/finally, infinite-loop patterns, multi-with), with better NOP/location preservation and more consistent jump/epilogue ordering to match reference behavior.
Tests
- Added extensive tests covering bytecode/layout edge cases (NOP preservation, location overrides, jump/return epilogues, and various try/while/match interactions).

coderabbitai · 2026-05-20T10:12:57Z

📝 Walkthrough

Walkthrough

This PR refines RustPython's bytecode codegen to achieve closer CPython-parity by improving control-flow handling in try/finally, while-true loops, match statements, and jump assembly. Changes span both high-level AST compilation and low-level IR finalization, introducing new statement-shape predicates, refined NOP preservation logic, location-override propagation, and conditional jump retargeting.

Changes

CPython-parity bytecode codegen refinements

Layer / File(s)	Summary
Statement-shape predicates for control-flow decisions `crates/codegen/src/compile.rs`	New predicates (`statements_are_single_with`, `statements_end_with_conditional`, `statements_end_with_while_true_without_break`) enable precise detection of statement patterns used to optimize control-flow and NOP preservation decisions in try/finally and loop blocks.
Try/finally optimized-entry and NOP preservation `crates/codegen/src/compile.rs`	Refactored try/finally entry logic to use statement-shape predicates for finalbody checks, compute `force_remove_finally_entry_nop` for with-while-true patterns, remove fast-borrow disabling for bare-except handlers, and adjust finally-exit label conditions.
While-True cleanup and nested-with interactions `crates/codegen/src/compile.rs`	Extended cleanup NOP removal to cover while-true-without-break patterns, reworked nested-multiline-with cleanup-target boolean logic, and added tests asserting with/for/inner-infinite-loop cleanup adjacency shapes.
Match OR-pattern detection and case emission `crates/codegen/src/compile.rs`	Detect Pattern::MatchOr early (`contains_match_or`), compute `has_match_or_case` to disable load-fast borrowing for match-or cases, adjust case-body emission and lineno_override handling to preserve CPython pass/NOP shapes after subject POP_TOP.
Emit edge no-location before JumpNoInterrupt `crates/codegen/src/compile.rs`	Insert `set_no_location()` immediately prior to emitting a `JumpNoInterrupt` on specific cleanup/control-flow edges.
Compile.rs bytecode layout validation tests `crates/codegen/src/compile.rs`	Add and update tests validating match-or load-fast preservation, NEXT_LOCATION NOP retention, nested-finally conditional fallthrough, try/finally boolop-while variants, named-except shared cleanup paths, async match/comprehension jump layout, and multi-with infinite-loop interactions.
IR finalize pipeline restructuring and early cleanups `crates/codegen/src/ir.rs`	Run `remove_redundant_const_pop_top_pairs()` immediately after `insert_superinstructions()`, insert `retarget_conditional_jumps_to_empty_while_exit_epilogue()` before duplication/DCE/unreachable-elimination, and mirror ordering in debug tracing.
Resolve next-location overrides before linearization `crates/codegen/src/ir.rs`	Add `resolve_next_location_overrides()` reverse-CFG pass that rewrites `lineno_override == -2` by copying location/end-location/lineno_override from following instructions and forces `-1` at scope-exit/jump/chain-end boundaries; call it during finalize_code before cache precomputation.
Jump-argument sizing and assembly adjustments `crates/codegen/src/ir.rs`	Adjust jump-argument resolution (mutable new_arg for EndAsyncFor) and force EXTENDED_ARG sizing for certain backward jumps when resolved deltas reach the 0xff boundary with targets already outside one-byte range.
Unreachable-block and CPython-shaped epilogue preservation `crates/codegen/src/ir.rs`	Add `preserves_cpython_unreachable_fallthrough_return_epilogue()` to detect and retain unreachable blocks that match CPython-shaped no-location LOAD_CONST + RETURN_VALUE epilogues instead of clearing them unconditionally.
Jump reordering with while-body anchoring and conditional retargeting `crates/codegen/src/ir.rs`	Add `follows_while_body_anchor_order()` to avoid reordering when anchor order holds, track block-push presence in conditional chains to skip false-path reordering, and add `retarget_conditional_jumps_to_empty_while_exit_epilogue()` to rewrite conditional targets around line-marker implicit-return epilogues.
Return duplication gating and exception cleanup cloning `crates/codegen/src/ir.rs`	Prevent duplicating final implicit-return epilogue when NoInterrupt jumps target the last block or a lineful fallthrough exists; collect conditional sources before cloning named-except cleanup returns; skip inlining when controlling jump is NoInterrupt.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

RustPython/RustPython#7870: Overlapping changes around LOAD_FAST_BORROW, empty-block fallthrough, and while/orelse block reuse in codegen/compile passes.
RustPython/RustPython#7781: Similar CPython-parity control-flow and no-location NOP handling in compile.rs and the retargeting pass in ir.rs.
RustPython/RustPython#7721: Related work on preserving CPython-shaped no-location NOPs and JumpNoInterrupt/set_no_location emission.

Poem

🐰 I hopped through NOPs and linenos bright,

Chased while True loops deep into the night,
I nudged a jump, I copied a line,
Now epilogues fall through, preserved just fine,
A rabbit's bytecode dance — CPython-shaped delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Align cleanup bytecode layout with CPython' directly summarizes the main change—updating RustPython's code generation to match CPython's bytecode layout for exception, finally, and with cleanup patterns.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 8728-8731: The current check only sees root-level OR patterns by
testing has_match_or_case over m.pattern with matches!(...,
ast::Pattern::MatchOr(_)); implement a recursive detector (e.g.,
contains_match_or(pattern: &ast::Pattern) -> bool) that walks all pattern
variants and returns true if any nested ast::Pattern::MatchOr is found, then
replace the .any(...) calls (including the similar check at the later block
around the 8788-8793 area) to use contains_match_or(&m.pattern) so nested forms
like [1 | 2, x] or Foo(bar=1 | 2) are correctly detected.
- Around line 1096-1099: The helper statements_end_with_conditional currently
only checks for a top-level ast::Stmt::If; broaden it to treat other conditional
shapes (e.g., ast::Stmt::Match) and container statements whose inner body ends
with a conditional (e.g., ast::Stmt::With whose body.last() is an If, nested
blocks, etc.) by making statements_end_with_conditional recursively inspect the
last statement: return true for If or Match, and for With (and other block-like
stmts) call the same helper on their inner body(s); update any call sites
(notably the final-body layout logic that uses statements_end_with_conditional)
to rely on this expanded predicate so trailing match/with-nested-if forms are
recognized as conditional finalizers.

In `@crates/codegen/src/ir.rs`:
- Around line 10882-10898: The current booleans
has_nointerrupt_jump_to_last_block and
has_lineful_return_fallthrough_to_last_block are global to the function and
incorrectly suppress duplication for all predecessors of last_block; change them
to be edge-specific by computing these predicates per predecessor (or per (pred,
last_block) edge) instead of once for the whole function. Concretely, replace
the top-level uses of has_nointerrupt_jump_to_last_block and
has_lineful_return_fallthrough_to_last_block with code that for each predecessor
block computes: scan that predecessor's instructions with jump_thread_kind(...)
== Some(JumpThreadKind::NoInterrupt) and target==last_block (using
next_nonempty_block to compare), and check whether the specific predecessor's
fallthrough (block.next) ends with a lineful ReturnValue (use
instruction_lineno(...) and matches!(instr.instr.real(),
Some(Instruction::ReturnValue))). Do the same edge-specific refactor at the
other occurrence referenced (lines ~10924-10948), so suppression decisions are
made per-edge rather than function-wide.
- Around line 10787-10798: The current pattern in
ends_with_line_marker_implicit_return(block: &Block) uses [.., marker, load,
ret] which will match blocks that have real instructions before the suffix;
change the match so we only consider truly empty epilogues by requiring the
block be exactly [marker, load, ret] or, if you want to allow only padding nops,
add a guard that verifies all instructions before marker are non-real padding
(e.g. instr.real() is None or Instruction::Nop and/or have the appropriate
no-location flags) before retargeting; update the match and/or add that extra
guard around marker, load, ret to avoid skipping earlier real instructions.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 3a0c4118-c74c-4b6a-beaa-4402e66577f3

📥 Commits

Reviewing files that changed from the base of the PR and between ae3804f and 8d1b959.

📒 Files selected for processing (2)

crates/codegen/src/compile.rs
crates/codegen/src/ir.rs

coderabbitai

♻️ Duplicate comments (1)

crates/codegen/src/ir.rs (1)

10931-10936: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Check the fallthrough successor for the lineful return.

This predicate still inspects the current block instead of the fallthrough block, so it never becomes true for the jump cases this branch is trying to suppress. That means duplicate_end_returns() can still clone the final epilogue even when the specific fallthrough edge already reaches a lineful return.

Suggested fix

-                let has_lineful_return_fallthrough_to_last_block = block.next != BlockIdx::NULL
-                    && next_nonempty_block(blocks, block.next) == last_block
-                    && block.instructions.last().is_some_and(|instr| {
+                let has_lineful_return_fallthrough_to_last_block = next != BlockIdx::NULL
+                    && next == last_block
+                    && blocks[next.idx()].instructions.last().is_some_and(|instr| {
                         matches!(instr.instr.real(), Some(Instruction::ReturnValue))
                             && instruction_lineno(instr) > 0
                     });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/codegen/src/ir.rs` around lines 10931 - 10936, The predicate
has_lineful_return_fallthrough_to_last_block is checking the current block's
last instruction instead of the fallthrough block; update it to inspect the
fallthrough block returned by next_nonempty_block(blocks, block.next) (e.g., let
fallthrough = next_nonempty_block(blocks, block.next)) and then test
blocks[fallthrough].instructions.last().is_some_and(|instr|
matches!(instr.instr.real(), Some(Instruction::ReturnValue)) &&
instruction_lineno(instr) > 0); this prevents duplicate_end_returns() from
cloning the final epilogue when the fallthrough already ends with a lineful
return.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@crates/codegen/src/ir.rs`:
- Around line 10931-10936: The predicate
has_lineful_return_fallthrough_to_last_block is checking the current block's
last instruction instead of the fallthrough block; update it to inspect the
fallthrough block returned by next_nonempty_block(blocks, block.next) (e.g., let
fallthrough = next_nonempty_block(blocks, block.next)) and then test
blocks[fallthrough].instructions.last().is_some_and(|instr|
matches!(instr.instr.real(), Some(Instruction::ReturnValue)) &&
instruction_lineno(instr) > 0); this prevents duplicate_end_returns() from
cloning the final epilogue when the fallthrough already ends with a lineful
return.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: e7ad65d4-609a-468e-88b2-90f7b2e590d2

📥 Commits

Reviewing files that changed from the base of the PR and between 8d1b959 and aa50896.

📒 Files selected for processing (2)

crates/codegen/src/compile.rs
crates/codegen/src/ir.rs

Align cleanup bytecode layout with CPython

03a9594

youknowone force-pushed the bytecode-parity branch from 032a6cb to 03a9594 Compare May 20, 2026 11:24

youknowone changed the title ~~[codex] Align cleanup bytecode layout with CPython~~ Align cleanup bytecode layout with CPython May 20, 2026

Align bytecode anchors with CPython

8d1b959

youknowone marked this pull request as ready for review May 20, 2026 12:28

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

Comment thread crates/codegen/src/compile.rs

Comment thread crates/codegen/src/compile.rs Outdated

Comment thread crates/codegen/src/ir.rs

Comment thread crates/codegen/src/ir.rs Outdated

Fix bytecode CFG review regressions

aa50896

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Align cleanup bytecode layout with CPython#7935

Align cleanup bytecode layout with CPython#7935
youknowone wants to merge 3 commits into
RustPython:mainfrom
youknowone:bytecode-parity

youknowone commented May 20, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 20, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

youknowone commented May 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

youknowone commented May 20, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 20, 2026 •

edited

Loading