Skip to content

Remove synchronization of JyAttribute.getAttr to improve performance in multithreaded scenarios.#361

Open
Stewori wants to merge 3 commits into
jython:masterfrom
Stewori:desync_JyAttribute_getAttr
Open

Remove synchronization of JyAttribute.getAttr to improve performance in multithreaded scenarios.#361
Stewori wants to merge 3 commits into
jython:masterfrom
Stewori:desync_JyAttribute_getAttr

Conversation

@Stewori

@Stewori Stewori commented Oct 18, 2024

Copy link
Copy Markdown
Member

Fixes #360

jeff5
jeff5 previously requested changes Oct 19, 2024

@jeff5 jeff5 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It all hangs on the correctness of the argument. I find these things hard to convince myself of from just the theory.

An example of a test that arranges intense thread competition is this one I wrote to hammer the (prospective) type system. There was a risk, if I hadn't got it right, of deadlock or publication of half-finished objects.

The time measurements are made to prove the order of events rather than to measure performance. Synchronisation is explicit, so I'm not too worried about code re-ordering in my case.

The claim that the reader sees a coherent state as last written would be easier to accept if setAttr were to make a new list, in which everything could be constructed final, and only the head pointer would need to be volatile (and the nonBuiltin* counters). Assuming a setAttr is much less common than a getAttr, and lists are mostly short, the copy would not be expensive. You are best placed to say whether that's true.

Comment thread src/org/python/core/JyAttribute.java
Comment thread src/org/python/core/JyAttribute.java
Comment thread src/org/python/core/JyAttribute.java
@Stewori

Stewori commented Oct 19, 2024

Copy link
Copy Markdown
Member Author

Before we think about writing a stress test I suggest to await response of @newprogrammer101 . If the PR does not solve the issue or blows something up, we would need to rethink the solution anyway.

@fayazkhan121 fayazkhan121 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (nonBuiltinAttrTypeOffset == 0) { throw new IllegalStateException("No more attr types available."); }
Magic numbers are being used in conditional checks.
Recommendation: Define these values as constants to improve code readability and maintainability

public static byte reserveCustomAttrType() {
public static synchronized byte reserveCustomAttrType() {
The method reserveCustomAttrType has been changed to a synchronised method. This is a good practice for thread safety, but it introduces a potential bottleneck under heavy load.
Recommendation: Assess if finer-grained synchronisation (e.g., using atomic variables) could replace the synchronisation here to improve concurrency without sacrificing thread safety.

Comment thread src/org/python/core/JyAttribute.java
Comment thread src/org/python/core/JyAttribute.java
@newprogrammer101

newprogrammer101 commented Jan 24, 2025

Copy link
Copy Markdown

Before we think about writing a stress test I suggest to await response of @newprogrammer101 . If the PR does not solve the issue or blows something up, we would need to rethink the solution anyway.

Thanks @Stewori for the quick change in this CR. This commit is to resolve concurrency issue on Jython package. But currently when I created multi JythonExecutor instance for testing purpose, I found CPU was at BLOCK state on PrintStream.flush as mentioned in #360 (comment). I would suggest re-evaluate not only synchronize key word on Jython package, but also on its dependencies. @jeff5

@Stewori

Stewori commented May 27, 2026

Copy link
Copy Markdown
Member Author

I finally took the time to stress test the implementation and it works like expected. Generally, the implementation performs nicely. Trouble is, it also performs nicely without using volatile. So, the challenge was to get the stress test to actually provoke a failure if volatile is removed, and then to observe that the failure goes away if volatile is added back. My test now achieves exactly this. However, I don't think it is suitable to be included into the normal test suite, but I will share it here, so whoever is interested (@jeff5 ) can try it.

With that, I'm now even more confident in this PR and think it should be included into the 2.7.5 release.

@Stewori

Stewori commented May 27, 2026

Copy link
Copy Markdown
Member Author

test_JyAttribute.zip

Okay, java DictStressTest runs the test. It is very memory hungry, because for sufficient stress I had to remove all waiting entirely (sorry the code might not be in perfect shape and I probably mixed tabs and spaces...). On my machine, it can consume up to 4GB of ram. In theory, the test should run only one second, but in fact, it takes much longer (the shutdown command is delayed due to all the stress).
At the end, you should see an output like this:

...
rote(8): 511
wrote(1): 251
wrote(2): 918
Number of items of key 1: 9635557
Number of violations of key 1: 0
Number of items of key 2: 8467645
Number of violations of key 2: 0
Number of items of key 3: 8461724
Number of violations of key 3: 0
Number of items of key 4: 8091572
Number of violations of key 4: 0
Number of items of key 5: 6257706
Number of violations of key 5: 0
Number of items of key 6: 5352900
Number of violations of key 6: 0
Number of items of key 7: 4274878
Number of violations of key 7: 0
Number of items of key 8: 4562908
Number of violations of key 8: 0

And no violations means that the test passed. The number of items for each key indicates how many accesses to the dict have been recorded and evaluated for consistency.
Now, in JyAttribute.java, out-comment these lines:

        volatile // <- comment out this line to see failures
        transient JyAttribute next;
        volatile // <- comment out this line to see failures
        transient Object value;

Then recompile (javac *.java) and repeat the test. Often, it will be just fine, but try it ten or more times and occasionally you should see an output like this:

...
wrote(5): 409
wrote(6): 908
wrote(8): 396
wrote(7): 301
wrote(4): 10
Number of items of key 1: 9389605
Number of violations of key 1: 9
Number of items of key 2: 7791174
Number of violations of key 2: 13
Number of items of key 3: 6184843
Number of violations of key 3: 1
Number of items of key 4: 6899296
Number of violations of key 4: 7
Number of items of key 5: 6769969
Number of violations of key 5: 7
Number of items of key 6: 6066541
Number of violations of key 6: 14
Number of items of key 7: 4353699
Number of violations of key 7: 0
Number of items of key 8: 4135389
Number of violations of key 8: 0

which confirms that the test is sharp. In all tries so far (perhaps ~50) I have not encountered a single failure with volatile in place.

@jeff5

jeff5 commented May 28, 2026

Copy link
Copy Markdown
Member

Ok, I believe.

Since we first considered this I have been back to the textbooks on this difficult subject. I see how volatile probably does the trick, but I think the test was worth it, thanks. Although in another context ... I can't seem to get test failures from deliberately incorrect synchronisation.

Do you think this should be a test in ant javatests? It would then be run routinely on a variety of architectures.

@Stewori

Stewori commented May 28, 2026

Copy link
Copy Markdown
Member Author

Although in another context ... I can't seem to get test failures from deliberately incorrect synchronisation.

Please, share more details. What other context?

Do you think this should be a test in ant javatests? It would then be run routinely on a variety of architectures.

No, it is too resource hungry and too non-telling. I think the stress level for a meaningful test is machine dependent and one would have to assert in non-volatile version that the test triggers. To know that the stress is meaningful. I don't see how this check could be done on a regular basis in normal tests. In fact like one in ten times it crashes my JVM with out-of-memory error. Maybe it would work better if the number of calls is limited rather than the time. Because the code that terminates it after the specified time lags behind due to stress. OTOH, if this is relieved, we might stop seeing the failures again. It would be tricky to tune the test to reliably trigger a violation on every machine. Especially without wasting too many resources. Perhaps it could be added as optional test. We should definitely make it available for off-stream testing and investigations. However, maybe it's enough to just link to my previous post from the source code (perhaps even from a doc comment) in JyAttribute.java. And maybe add a hint to the devguide or userguide.

@jeff5 jeff5 dismissed their stale review May 30, 2026 13:30

I think this is resolved by better understanding and practical tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

JyAttribute.getAttr significantly slow down during multi-threading

4 participants