Saturday, January 29, 2011

Java max heap size: how much is too much?

I'm having issues with a JRuby (Rails) app running in Tomcat. Occasionally page requests take up to a minute to return, even though the Rails logs show the request was processed in seconds, so it looks like a Tomcat issue.

I'm wondering what settings are optimal for the java heap size. I know there's no definitive answer, but I thought maybe someone could comment on my setup.

I'm on a small EC2 instance, which has 1.7 GB of RAM. I have the following JAVA_OPTS:

-Xmx1536m -Xms256m -XX:MaxPermSize=256m -XX:+CMSClassUnloadingEnabled

My first thought is that Xmx is too high. If I only have 1.7 GB and I allocate 1.5 GB to Java, I feel like I'll get a lot of paging. Typically my Java process shows (in top) 1.1g resident memory and 2g virtual.

I also read somewhere that setting Xms and Xmx to the same size will help, as it eliminates time spent on memory allocation.

I'm not a Java person, but I've been tasked with figuring out this problem and I'm trying to find out where to start. Any tips are greatly appreciated!
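
As a possible starting point, here is a minimal sketch that prints the heap limits the JVM actually ended up with and lists its memory pools by type. It only uses the standard `java.lang.management` API; the class name `HeapReport` and the helper `toMb` are made up for illustration. On JVMs of this era the permanent generation should show up as a non-heap pool (the exact pool names vary by collector, and Java 8 and later replace it with Metaspace).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class HeapReport {
    // Convert a byte count to whole megabytes; negative values mean "undefined".
    static long toMb(long bytes) {
        return bytes < 0 ? -1 : bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the ceiling the heap may grow to (roughly -Xmx).
        System.out.println("max heap:       " + toMb(rt.maxMemory()) + " MB");
        // totalMemory() is what the JVM has reserved for the heap so far.
        System.out.println("heap reserved:  " + toMb(rt.totalMemory()) + " MB");

        // Each pool reports whether it is heap or non-heap memory.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            System.out.printf("%-30s %-16s used=%d MB max=%d MB%n",
                    pool.getName(), pool.getType(),
                    toMb(usage.getUsed()), toMb(usage.getMax()));
        }
    }
}
```

Run it with the same JAVA_OPTS as the application to see how the flags translate into pool sizes.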

Update
I've started analyzing the garbage collection logs using -XX:+PrintGCDetails

When I notice these occasional long load times, the GC logs go nuts. For the last slow request I captured (which took 25s to complete), I had GC log lines such as:

1720.267: [GC 1720.267: [DefNew: 27712K->16K(31104K), 0.0068020 secs] 281792K->254096K(444112K), 0.0069440 secs]
1720.294: [GC 1720.294: [DefNew: 27728K->0K(31104K), 0.0343340 secs] 281808K->254080K(444112K), 0.0344910 secs]

There were about 300 of them for a single request! I don't fully understand why it keeps collecting from ~28 MB down to 0 over and over.
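
To make lines like these easier to read: in this log format, `DefNew` is the young generation of the serial collector. The first `before->after(capacity)` triple describes the young generation and the second describes the whole heap. The ~28 MB dropping to 0 is therefore the small young generation filling up and being emptied, not the whole heap. Below is a minimal sketch of a parser for this one line format; the class `GcLogLine` and its method names are hypothetical, and the regular expression only covers the `DefNew` lines shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogLine {
    // uptime: [GC uptime: [DefNew: young before->after(capacity), secs] heap before->after(capacity), secs]
    private static final Pattern MINOR_GC = Pattern.compile(
            "^(\\d+\\.\\d+): \\[GC .*?\\[DefNew: (\\d+)K->(\\d+)K\\((\\d+)K\\), [\\d.]+ secs\\] "
            + "(\\d+)K->(\\d+)K\\((\\d+)K\\), ([\\d.]+) secs\\]");

    final double uptimeSecs;
    final long youngBeforeK, youngAfterK, heapBeforeK, heapAfterK;
    final double pauseSecs;

    private GcLogLine(Matcher m) {
        uptimeSecs = Double.parseDouble(m.group(1));
        youngBeforeK = Long.parseLong(m.group(2));
        youngAfterK = Long.parseLong(m.group(3));
        heapBeforeK = Long.parseLong(m.group(5));
        heapAfterK = Long.parseLong(m.group(6));
        pauseSecs = Double.parseDouble(m.group(8));
    }

    // Returns null when the line is not a DefNew minor collection.
    static GcLogLine parse(String line) {
        Matcher m = MINOR_GC.matcher(line);
        return m.find() ? new GcLogLine(m) : null;
    }

    long youngFreedK() { return youngBeforeK - youngAfterK; }
    long heapFreedK()  { return heapBeforeK - heapAfterK; }
    // Whatever left the young generation without leaving the heap was promoted.
    long promotedK()   { return youngFreedK() - heapFreedK(); }

    public static void main(String[] args) {
        String line = "1720.267: [GC 1720.267: [DefNew: 27712K->16K(31104K), 0.0068020 secs] "
                + "281792K->254096K(444112K), 0.0069440 secs]";
        GcLogLine gc = parse(line);
        System.out.println("young freed: " + gc.youngFreedK() + "K");
        System.out.println("heap freed:  " + gc.heapFreedK() + "K");
        System.out.println("promoted:    " + gc.promotedK() + "K");
        System.out.println("pause:       " + gc.pauseSecs + "s");
    }
}
```

Read this way, each of those ~300 entries would mean the request allocated another ~27 MB of short-lived objects, all of which were collected again while the rest of the heap stayed at about 254 MB.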

  • Part of your problem is that you are probably starving all other processes of RAM. My general rule of thumb for -Xms and -Xmx is as follows:

    -Xms : <System_Memory>*.5
    -Xmx : <System_Memory>*.75

    So on a 4 GB system it would be -Xms2048m -Xmx3072m, and in your case I would go with -Xms896m -Xmx1344m

    From Zypher
  • While I haven't run any JRuby apps on Tomcat, I have run ColdFusion apps on varied J2EE app servers, and I also have had similar issues.

    In these FAQs, you'll see that Oracle says that on 32-bit Windows, you'll be limited to a max heap size of 1.4 to 1.6 GB. I never was able to get it stable that high, and I suspect you're running a similar configuration.

    My guess is that your requests are taking a long time to run because, with a heap size that high, the JVM has allocated more physical memory than Windows had to give, and thus Windows spends a lot of time swapping pages between memory and disk so it can provide the required amount of memory to the JVM.

    My recommendation, although counter-intuitive, would be that you actually lower the max heap size to somewhere around 1.2 GB. You can raise the min size as well, if you notice that there are slow-downs in the app's request processing while the JVM has to ask Windows for more memory to increase the size of its heap as it fills with uncollected objects.

    brad : I'm pretty sure you're right. Although we're running Linux (not Windows), I'm pretty sure that the machine is swapping too much and having a hard time garbage collecting.
  • Hi,

    In addition to the previous answers, you should also take PermGen into account. PermGen is not part of the heap space. With your current configuration, your Java process could grow to 1792 MB, which is roughly the total memory of your machine.

    brad : Yeah, I just read about that too. We're definitely starving the system with the combination of the perm and max heap.
    Clint Miller : Oh yeah- that's a good point too. Do you have a reference for PermGen not being part of the heapspace?
    Christian : This post on Stack Overflow explains the memory model and also has a link to a Sun blog with an explanation of the PermGen: http://stackoverflow.com/questions/2129044/java-heap-terminology-young-old-and-permanent-generations You can also see it in the GC log (when enabled): `0.431: [Full GC [PSYoungGen: 352K->0K(101952K)] [PSOldGen: 0K->330K(932096K)] 352K->330K(1034048K) [PSPermGen: 3959K->3959K(16384K)], 0.0187660 secs]` Here the PermGen is reported separately from the young and old generations.
    From Christian
  • I know there's already an answer chosen, but still, here goes my explanation.

    First of all, in the command line you use, you already reserve 1536 megabytes for the Java heap (-Xmx1536m) and 256 megabytes for the PermGen (-XX:MaxPermSize=256m). The PermGen is allocated separately from the Java heap and is used for storing the Java classes loaded in the JVM.

    These two areas together already add up to 1792 megabytes of RAM.

    But in addition to that, there is also RAM needed to load the JVM itself (the native code of the JVM) and RAM to store the code that is generated by the JIT compiler.

    I suspect all of those add up to the 2 gigabytes of virtual memory that you mentioned.

    Finally, you also have to take into account the other things that are running on the server and that need RAM too. You didn't mention how much swap is in use on the server. That would tell you whether the machine is swapping and whether that is causing the application to respond slowly. You should always prevent the JVM from hitting swap. It's much better to trigger the garbage collector frequently than to allocate too much heap and have part of the Java heap swapped out.

    From rubenvdg
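
The arithmetic from the answers above can be put together in a small sketch. It adds up the heap and PermGen ceilings from the question's JAVA_OPTS and applies the 50%/75% rule of thumb from the first answer. The class `JvmMemoryBudget`, its method names, and the figure of 1740 MB for the 1.7 GB instance are assumptions for illustration. The sum is only a lower bound, since it leaves out the JVM's own native code, JIT-compiled code, and everything else running on the machine.

```java
public class JvmMemoryBudget {
    // Heap and PermGen are sized separately, so their ceilings add up.
    static long heapPlusPermMb(long xmxMb, long maxPermMb) {
        return xmxMb + maxPermMb;
    }

    // Rule of thumb from the first answer: -Xms at 50% of system memory.
    static long ruleOfThumbXmsMb(long systemMb) {
        return systemMb / 2;
    }

    // Rule of thumb from the first answer: -Xmx at 75% of system memory.
    static long ruleOfThumbXmxMb(long systemMb) {
        return systemMb * 3 / 4;
    }

    // True only if heap + PermGen alone leave some physical RAM unclaimed.
    static boolean fitsInRam(long xmxMb, long maxPermMb, long systemMb) {
        return heapPlusPermMb(xmxMb, maxPermMb) < systemMb;
    }

    public static void main(String[] args) {
        long systemMb = 1740; // assumption: 1.7 GB expressed in MB
        long xmxMb = 1536;    // from -Xmx1536m
        long permMb = 256;    // from -XX:MaxPermSize=256m

        System.out.println("heap + PermGen ceiling: " + heapPlusPermMb(xmxMb, permMb) + " MB");
        System.out.println("fits in physical RAM:   " + fitsInRam(xmxMb, permMb, systemMb));
        System.out.println("rule of thumb: -Xms" + ruleOfThumbXmsMb(systemMb)
                + "m -Xmx" + ruleOfThumbXmxMb(systemMb) + "m");
    }
}
```

Note that the rule of thumb only sizes the heap, so MaxPermSize still has to fit in whatever remains.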
