OOM using script-exec
Reported by Michael Gibson | September 8th, 2011 @ 03:45 PM | in Rundeck 1.4 (closed)
I am able to reproduce an OOM error when using the script-exec plugin and multiple threads.
To reproduce, add to each node in the resources.yaml:

node-executor: script-exec
script-exec: echo ${node.hostname} -- ${exec.command}
Dispatch command across 20 nodes at 20 threads.
Note: This only occurs when you have a large number of nodes in your resources.yaml
I have around 2,000 node entries in my resources.yaml.
If I trim that down to 30 or so, I cannot reproduce.
Not sure what the threshold is.
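To probe the threshold, a throwaway generator along these lines can produce a resources.yaml of any size (the node names and attributes here are placeholders, not taken from the attached file):

```python
# Generate an N-node resources.yaml to probe the OOM threshold.
# Node names, hostnames, and the username are hypothetical placeholder
# values, not from the ticket's attached file.
def make_resources_yaml(num_nodes):
    entries = []
    for i in range(num_nodes):
        name = "node-%04d" % i
        entries.append(
            "%s:\n"
            "  hostname: %s.example.com\n"
            "  username: rundeck\n"
            "  node-executor: script-exec\n"
            "  script-exec: echo ${node.hostname} -- ${exec.command}\n"
            % (name, name)
        )
    return "".join(entries)

if __name__ == "__main__":
    with open("resources.yaml", "w") as f:
        f.write(make_resources_yaml(2000))
```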
Comments and changes to this ticket
-
Alex-SF September 8th, 2011 @ 04:39 PM
- Tag set to large scale deployment, bug, performance
-
Greg Schueler September 9th, 2011 @ 11:45 AM
can you paste your JVM memory configuration? I am unable to reproduce this. I created 2000 nodes with script-exec executor as you described, but running a job with 20 threads on 20 nodes doesn't seem to produce the issue you mentioned. I will try increasing some of the parameters
using default memory configuration
-
Michael Gibson September 9th, 2011 @ 11:52 AM
Currently reproducing using 1 Gig of memory. I have given it 16Gb as well and had the same result.
RDECK_JVM="$RDECK_JVM -Xmx1024m -Xms1024m"
-
Greg Schueler September 9th, 2011 @ 12:17 PM
ok i was testing with "-XX:MaxPermSize=256m -Xmx512m -Xms256m -server ", i will try your config
-
Michael Gibson September 9th, 2011 @ 01:08 PM
Is there a setting somewhere that will increase the verbosity of the stack trace? I have found a few log4j settings but they don't seem to have any effect.
-
Michael Gibson September 11th, 2011 @ 05:30 PM
- no changes were found...
-
Michael Gibson September 11th, 2011 @ 05:34 PM
Able to reproduce the OOM on both CentOS 5.6 (x86_64 and i686) and Ubuntu 9.04 (i686), both running Sun JVM 1.6.0_24 with a 1024M heap.
I am attaching the resources.yaml that I used which includes the script-exec commands for each node.
-
Michael Gibson September 11th, 2011 @ 09:11 PM
This seems to be related to the Snakeyaml parser. Upon closer inspection of the heap dump it's clear that the yaml parser is consuming a significant portion of the heap. This would explain the relationship between the resources.yaml size and the performance of the application.
Workaround: I converted the resource model to XML and have been unable to reproduce. Response times are significantly improved.
I can provide the heap dumps if needed but they are just over the 50MB limit for this system. Let me know if you have another means of delivery.
-
Greg Schueler September 12th, 2011 @ 09:34 AM
ah, interesting. perhaps you can send it using some free file share service like dropbox or http://ge.tt/ ?
-
Greg Schueler September 12th, 2011 @ 11:00 AM
I reproduced the heap space error running rundeck 1.3. FYI i had been using development branch (1.4rc) before when I wasn't able to reproduce it.
i think it is due to using snakeyaml to parse all the nodes within the constructor of a class
-
Michael Gibson September 12th, 2011 @ 11:08 AM
Nice!
I am attaching my dumpfile anyway.
I used Eclipse's Memory Analyzer: http://www.eclipse.org/mat/
If you open the file using the Leak suspect option and then open the "Dominator" tree, all signs point to the YAML parser.
-
Greg Schueler September 12th, 2011 @ 11:29 AM
i wonder if just upgrading snakeyaml will fix it, we are using 1.7, but this issue was fixed for 1.8: http://code.google.com/p/snakeyaml/issues/detail?id=101
-
Greg Schueler September 12th, 2011 @ 11:35 AM
also, it seems a partial workaround is to separate the contents of the .yaml file into multiple yaml "documents". Simply insert '---' on a new line at various points between the node definitions. I did this after every 500 nodes in your sample file and it mitigated the memory issue.
I believe this is the snakeyaml issue I linked. I will also test upgrading snakeyaml
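The splitting workaround described above can be scripted; this sketch (assuming each node entry starts at a non-indented line, as in a typical resources.yaml) inserts a '---' document separator after every chunk_size entries:

```python
# Insert a YAML document separator ("---") after every `chunk_size`
# top-level node entries. Assumes each node entry begins at a
# non-indented, non-comment line, as in a typical resources.yaml.
def split_into_documents(yaml_text, chunk_size=500):
    out_lines = []
    node_count = 0
    for line in yaml_text.splitlines():
        at_top_level = line and not line[0].isspace() and not line.startswith("#")
        if at_top_level:
            node_count += 1
            # Start a new YAML document at each chunk boundary
            # (but not before the very first entry).
            if node_count > 1 and (node_count - 1) % chunk_size == 0:
                out_lines.append("---")
        out_lines.append(line)
    return "\n".join(out_lines) + "\n"
```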
-
Greg Schueler September 12th, 2011 @ 11:54 AM
tested: replace snakeyaml-1.7.jar with 1.8 or 1.9 jar
rundeck 1.3 and 1.4 dev
update resources.yaml to 5000 nodes
java -Xmx512m -Xms512m
5000 nodes parse correctly without the memory issue.
we need to upgrade snakeyaml dependency
-
Greg Schueler September 12th, 2011 @ 03:06 PM
- Assigned user set to Greg Schueler
- Milestone set to Rundeck 1.4
- Milestone order changed from 97 to 0
-
Greg Schueler September 12th, 2011 @ 03:07 PM
- State changed from new to needs_verification
(from [d51b53c512cf92bed3d902fdbfae051c77886607]) Upgrade snakeyaml to 1.9 [#431 state:needs_verification] https://github.com/dtolabs/rundeck/commit/d51b53c512cf92bed3d902fdb...
-
Greg Schueler September 12th, 2011 @ 03:22 PM
(from [240023be566dbb4c9f60dc1d8bd6e1ef2ce1685b]) Update hardcoded dependency in rpm spec [#431] https://github.com/dtolabs/rundeck/commit/240023be566dbb4c9f60dc1d8...
-
Michael Gibson September 21st, 2011 @ 09:38 AM
- Assigned user cleared.
Verified. Unable to reproduce using snakeyaml-1.9.jar
-
Greg Schueler September 21st, 2011 @ 09:54 AM
- State changed from needs_verification to resolved
- Assigned user set to Greg Schueler
thanks for verifying that.
adding to 1.3.1 #446