Last updated Mar 10, 2006.
One of the benefits that Java introduced over C++ was the concept of automatic memory management. Specifically, a Java Virtual Machine (JVM) provides a garbage collector that reclaims the memory held by objects that are no longer reachable, typically when the heap runs low on free space. The implementation of the garbage collection strategy depends on the JVM vendor, but it suffices to say that the garbage collection process can be expensive. The Sun JVM defines garbage collection in two modes:
- Minor "copy" collections
- Major "Mark-Sweep-Compact" collections
A minor collection runs relatively quickly and involves moving live data around the heap in the presence of running threads. A major collection is a much more intrusive garbage collection that suspends all execution threads while it completes its task. In terms of performance tuning the heap, the primary goal is to reduce the frequency and duration of major garbage collections.
The IBM JVM implements a different architecture and graduates its garbage collection behavior based on the state of the heap. Garbage collections can range from minimally intrusive sweeps (fueled by a mark phase that runs concurrently with application threads) to a worst-case scenario of a full mark-sweep-compact, stop-the-world garbage collection.
Because Java applications and their respective components all run inside the JVM and rely on object instances loaded into the heap, tuning the heap is of paramount importance. But before the heap can be properly tuned, you need insight into the performance of the heap.
Until the release of Java SE 5, monitoring options included:
- Heap size and usage information obtained through calls to the java.lang.Runtime class
- Verbose garbage collection log parsing, enabled by passing startup parameters to the JVM to instruct it to generate verbose logging information every time a garbage collection occurs
- Proprietary JVM APIs, such as Sun's jvmstat interface
- Dangerous proprietary strategies such as injecting a DLL or shared object into the JVM process space to read process memory
Each of these strategies has advantages and disadvantages in terms of overhead and ease of implementation, but the result is that there is no single perfect strategy.
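As a point of comparison, the first option in the list above can be sketched with nothing but java.lang.Runtime. This is a minimal illustration, not part of the article's listing; the class name RuntimeHeapSnapshot and its snapshot() helper are hypothetical.

```java
// Pre-Java SE 5 style heap snapshot using only java.lang.Runtime.
class RuntimeHeapSnapshot {

    /** Returns { used, total, max } in bytes. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory(); // heap currently reserved by the JVM
        long free = rt.freeMemory();   // unused portion of that reserved heap
        long max = rt.maxMemory();     // upper bound the heap may grow to
        return new long[] { total - free, total, max };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("used = " + (s[0] / 1024) + "K, total = "
                + (s[1] / 1024) + "K, max = " + (s[2] / 1024) + "K");
    }
}
```

Runtime reports only whole-heap totals; it says nothing about individual generations or garbage collection activity, which is the gap the management beans fill.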
One benefit when monitoring the performance of application servers is that all major vendors have adopted the Java Management Extensions (JMX) and expose monitoring information in the form of managed beans (MBeans). With the release of Java SE 5, a JMX registry and support for MBeans were added to the JVM itself, so that same luxury is now available to stand-alone applications as well as enterprise applications. In addition, Java SE 5 defines new classes and interfaces in the java.lang.management package that expose JVM runtime statistics. A handful of those interfaces are implemented by managed beans that expose runtime information about JVM memory, both heap and non-heap. In this article we look at three bean interfaces:
- MemoryMXBean
- MemoryPoolMXBean
- GarbageCollectorMXBean
All management MBeans can be loaded through a helper class: java.lang.management.ManagementFactory. The ManagementFactory provides a set of static methods that return the requested MBeans:
MemoryMXBean memorymbean = ManagementFactory.getMemoryMXBean();
List<MemoryPoolMXBean> mempoolsmbeans = ManagementFactory.getMemoryPoolMXBeans();
List<GarbageCollectorMXBean> gcmbeans = ManagementFactory.getGarbageCollectorMXBeans();
The MemoryMXBean provides information about memory usage, including both heap and non-heap memory. Specifically it provides the following two methods:
MemoryUsage getHeapMemoryUsage()
MemoryUsage getNonHeapMemoryUsage()
These methods return an instance of java.lang.management.MemoryUsage, which defines four key attributes:
- init: the initial amount of memory that the JVM requested from the operating system during startup
- used: the amount of memory currently in use
- committed: the amount of memory that is guaranteed to be available for use by the JVM; it can change over time and is always guaranteed to be greater than or equal to the used memory
- max: the maximum amount of memory that can be used by the JVM in the specified area (heap or non-heap)
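The four attributes map directly onto getters of MemoryUsage. The following is a small sketch of reading them; the class name HeapUsageReport and the describe() helper are hypothetical, and the percentage is only computed when max is defined, because getMax() returns -1 when it is not.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapUsageReport {

    /** Formats one MemoryUsage snapshot; init and max may be -1 when undefined. */
    static String describe(String label, MemoryUsage u) {
        StringBuilder sb = new StringBuilder(label);
        sb.append(": init=").append(u.getInit())
          .append(" used=").append(u.getUsed())
          .append(" committed=").append(u.getCommitted())
          .append(" max=").append(u.getMax());
        if (u.getMax() > 0) {
            // Share of the maximum that is currently in use.
            sb.append(" (").append(100 * u.getUsed() / u.getMax()).append("% of max)");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println(describe("Heap", heap));
    }
}
```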
The MemoryPoolMXBean provides information about specific memory pools within the JVM memory spaces, both heap and non-heap. For those familiar with the Sun heap, these pools include the Eden Space, Survivor Space, Tenured Generation, and Permanent Generation, in addition to others that are presented later. The point is that these are the logical partitions into which memory is subdivided. For each MemoryPoolMXBean, you can discover the following information:
- Current Usage
- Peak Usage
- Usage at the last collection
- The type of memory space (heap or non-heap)
- The memory managers that operate on this space, for example "Copy" and "MarkSweepCompact"
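A short sketch of how those properties might be read for heap pools only follows. The class name HeapPools and the heapPoolNames() helper are hypothetical, and the pool names printed depend on the JVM and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeapPools {

    /** Names of the memory pools that the JVM reports as heap memory. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            // getPeakUsage() may return null if the pool is no longer valid.
            MemoryUsage peak = pool.getPeakUsage();
            System.out.println(pool.getName()
                    + " peak used=" + (peak == null ? "n/a" : String.valueOf(peak.getUsed()))
                    + " managers=" + Arrays.toString(pool.getMemoryManagerNames()));
        }
    }
}
```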
Finally, the GarbageCollectorMXBean provides the collection count and cumulative collection time for each garbage collector, along with the names of the memory pools that the collector manages.
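When only an overall figure is needed, the per-collector values can be summed. The sketch below assumes a hypothetical class GcTotals; it skips values of -1, which the getters return when a count or time is undefined for a collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcTotals {

    /** Returns { total collections, total collection time in ms } across all collectors. */
    static long[] totals() {
        long count = 0;
        long timeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Ignore undefined (-1) values rather than subtracting them from the totals.
            if (gc.getCollectionCount() > 0) {
                count += gc.getCollectionCount();
            }
            if (gc.getCollectionTime() > 0) {
                timeMs += gc.getCollectionTime();
            }
        }
        return new long[] { count, timeMs };
    }

    public static void main(String[] args) {
        long[] t = totals();
        System.out.println("Collections: " + t[0] + ", time: " + t[1] + " ms");
    }
}
```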
Listing 1 uses each of the aforementioned MBeans in conjunction with some memory mismanagement to display information about the JVM memory.
Listing 1. Java5ManagementTest.java
package com.javasrc.management;

import java.lang.management.*;
import java.util.*;

public class Java5ManagementTest
{
    public static void dumpMemoryInfo()
    {
        try
        {
            System.out.println( "\nDUMPING MEMORY INFO\n" );
            // Read MemoryMXBean
            MemoryMXBean memorymbean = ManagementFactory.getMemoryMXBean();
            System.out.println( "Heap Memory Usage: " + memorymbean.getHeapMemoryUsage() );
            System.out.println( "Non-Heap Memory Usage: " + memorymbean.getNonHeapMemoryUsage() );
            // Read Garbage Collection information
            List<GarbageCollectorMXBean> gcmbeans = ManagementFactory.getGarbageCollectorMXBeans();
            for( GarbageCollectorMXBean gcmbean : gcmbeans )
            {
                System.out.println( "\nName: " + gcmbean.getName() );
                System.out.println( "Collection count: " + gcmbean.getCollectionCount() );
                System.out.println( "Collection time: " + gcmbean.getCollectionTime() );
                System.out.println( "Memory Pools: " );
                String[] memoryPoolNames = gcmbean.getMemoryPoolNames();
                for( int i=0; i<memoryPoolNames.length; i++ )
                {
                    System.out.println( "\t" + memoryPoolNames[ i ] );
                }
            }
            // Read Memory Pool information
            List<MemoryPoolMXBean> mempoolsmbeans = ManagementFactory.getMemoryPoolMXBeans();
            for( MemoryPoolMXBean mempoolmbean : mempoolsmbeans )
            {
                System.out.println( "\nName: " + mempoolmbean.getName() );
                System.out.println( "Usage: " + mempoolmbean.getUsage() );
                System.out.println( "Collection Usage: " + mempoolmbean.getCollectionUsage() );
                System.out.println( "Peak Usage: " + mempoolmbean.getPeakUsage() );
                System.out.println( "Type: " + mempoolmbean.getType() );
                System.out.println( "Memory Manager Names: " ) ;
                String[] memManagerNames = mempoolmbean.getMemoryManagerNames();
                for( int i=0; i<memManagerNames.length; i++ )
                {
                    System.out.println( "\t" + memManagerNames[ i ] );
                }
            }
        }
        catch( Exception e )
        {
            e.printStackTrace();
        }
    }

    public static void main( String[] args )
    {
        dumpMemoryInfo();
        // The original listing is truncated from this loop onward; the loop body and
        // the second dump are reconstructed from the description that follows.
        for( int i=0; i<1000000; i++ )
        {
            String s = "My String " + i;
        }
        dumpMemoryInfo();
    }
}
Listing 1 presents a class that displays JVM memory management information before and after allocating one million strings. It begins by retrieving and displaying the memory usage information through the MemoryMXBean. In my execution, the following are the states before and after the run:
// Before
Heap Memory Usage: init = 33554432(32768K)
used = 241680(236K)
committed = 33357824(32576K)
max = 33357824(32576K)
Non-Heap Memory Usage: init = 29556736(28864K)
used = 12055504(11772K)
committed = 29851648(29152K)
max = 121634816(118784K)
// After
Heap Memory Usage: init = 33554432(32768K)
used = 218656(213K)
committed = 33357824(32576K)
max = 33357824(32576K)
Non-Heap Memory Usage: init = 29556736(28864K)
used = 12131600(11847K)
committed = 29884416(29184K)
max = 121634816(118784K)
In this case, the minimum and maximum heap sizes are both 32MB because I started the JVM with the following parameters:
-Xms32m -Xmx32m
The used memory dropped from 236K to 213K after the run, which we will discover is the result of several garbage collections.
The garbage collection information is retrieved from the GarbageCollectorMXBean. The Sun JVM with its default configuration implements two garbage collectors: Copy and MarkSweepCompact. In my sample execution, the following are the states of garbage collection before and after the run:
// Before
Name: Copy
Collection count: 0
Collection time: 0
Memory Pools:
Eden Space
Survivor Space
Name: MarkSweepCompact
Collection count: 0
Collection time: 0
Memory Pools:
Eden Space
Survivor Space
Tenured Gen
Perm Gen
Perm Gen [shared-ro]
Perm Gen [shared-rw]
// After
Name: Copy
Collection count: 63
Collection time: 12
Memory Pools:
Eden Space
Survivor Space
Name: MarkSweepCompact
Collection count: 0
Collection time: 0
Memory Pools:
Eden Space
Survivor Space
Tenured Gen
Perm Gen
Perm Gen [shared-ro]
Perm Gen [shared-rw]
From this output you can see that creating and discarding one million Strings resulted in 63 copy collections that took a total of 12 milliseconds. If we change the code so that it does not discard the Strings between iterations, we will see MarkSweepCompact collections occur.
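One way to make that change is to keep every String reachable, for example by adding it to a list. The sketch below is an assumption about how such a variant could look, not the author's code: the class name RetainedStringsTest, the allocateAndRetain() helper, and the count of 200000 are all hypothetical. Whether a major collection actually occurs depends on the heap size and on the collector in use, and collector names other than Copy and MarkSweepCompact may be printed on other JVMs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class RetainedStringsTest {

    /** Allocates count strings and keeps them all reachable through the returned list. */
    static List<String> allocateAndRetain(int count) {
        List<String> retained = new ArrayList<String>(count);
        for (int i = 0; i < count; i++) {
            retained.add("My String " + i);
        }
        return retained;
    }

    /** Prints the collection count of every collector the JVM exposes. */
    static void printGcCounts() {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }

    public static void main(String[] args) {
        printGcCounts();
        // Hypothetical count; raise it or lower -Xmx to put more pressure on the old generation.
        List<String> retained = allocateAndRetain(200000);
        printGcCounts();
        System.out.println("Retained " + retained.size() + " strings");
    }
}
```

Because the list keeps every String alive, objects that survive enough minor collections are promoted out of the young generation instead of being reclaimed there.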
Finally we display information about the various memory pools by accessing the MemoryPoolMXBeans. This returns several memory pools in the Sun JVM:
- Code Cache: contains memory used for compilation and storage of native code
- Eden Space: pool from which memory is initially allocated for most objects
- Survivor Space: pool containing objects that have survived Eden space garbage collection
- Tenured Gen: pool containing long-lived objects
- Perm Gen: contains reflective data of the JVM itself, including class and method objects
- Perm Gen [shared-ro]: read-only reflective data
- Perm Gen [shared-rw]: read-write reflective data
The following displays sample output for the four primary memory pools (Eden, Survivor Space, Tenured Generation, and Permanent Generation) after the test has completed:
Name: Eden Space
Usage: init = 2162688(2112K) used = 90784(88K) committed = 2162688(2112K) max = 2162688(2112K)
Collection Usage: init = 2162688(2112K) used = 0(0K) committed = 2162688(2112K) max = 2162688(2112K)
Peak Usage: init = 2162688(2112K) used = 2162688(2112K) committed = 2162688(2112K) max = 2162688(2112K)
Type: Heap memory
Memory Manager Names:
MarkSweepCompact
Copy
Name: Survivor Space
Usage: init = 196608(192K) used = 16(0K) committed = 196608(192K) max = 196608(192K)
Collection Usage: init = 196608(192K) used = 16(0K) committed = 196608(192K) max = 196608(192K)
Peak Usage: init = 196608(192K) used = 127928(124K) committed = 196608(192K) max = 196608(192K)
Type: Heap memory
Memory Manager Names:
MarkSweepCompact
Copy
Name: Tenured Gen
Usage: init = 30998528(30272K) used = 127856(124K) committed = 30998528(30272K) max = 30998528(30272K)
Collection Usage: init = 30998528(30272K) used = 0(0K) committed = 0(0K) max = 30998528(30272K)
Peak Usage: init = 30998528(30272K) used = 127856(124K) committed = 30998528(30272K) max = 30998528(30272K)
Type: Heap memory
Memory Manager Names:
MarkSweepCompact
Name: Perm Gen
Usage: init = 8388608(8192K) used = 127800(124K) committed = 8388608(8192K) max = 67108864(65536K)
Collection Usage: init = 8388608(8192K) used = 0(0K) committed = 0(0K) max = 67108864(65536K)
Peak Usage: init = 8388608(8192K) used = 127800(124K) committed = 8388608(8192K) max = 67108864(65536K)
Type: Non-heap memory
Memory Manager Names:
MarkSweepCompact
From this output you can surmise that, for a 32MB heap running on Windows with the default configuration, Eden was allocated 2112KB, each of the two survivor spaces received 192KB, and the tenured generation received the remaining 30272KB, all adding up to the 32768KB heap. An interesting observation is that the permanent space is allocated an initial 8MB and can grow up to 64MB. By polling this information you can analyze the behavior of the entire heap. For those of you with a passion for this type of work, the MBeans also define a notification interface that you can find in the Java SE 5 Javadocs.
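As a rough sketch of that notification interface: the MemoryMXBean can be cast to javax.management.NotificationEmitter, and a pool that supports a usage threshold triggers a notification when its usage crosses the threshold. The class name HeapThresholdWatcher, its helper methods, and the 80% fraction are hypothetical choices for illustration; which pools support a threshold varies by JVM and collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

class HeapThresholdWatcher {

    /** Sets a usage threshold on each heap pool that supports one; returns how many were set. */
    static int armThresholds(double fraction) {
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (pool.getType() == MemoryType.HEAP
                    && pool.isUsageThresholdSupported()
                    && usage != null && usage.getMax() > 0) {
                pool.setUsageThreshold((long) (usage.getMax() * fraction));
                armed++;
            }
        }
        return armed;
    }

    /** Registers a listener that reports threshold-exceeded notifications. */
    static void listen() {
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(new NotificationListener() {
            public void handleNotification(Notification n, Object handback) {
                if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(n.getType())) {
                    System.out.println("Usage threshold exceeded: " + n.getMessage());
                }
            }
        }, null, null);
    }

    public static void main(String[] args) {
        listen();
        System.out.println(armThresholds(0.8) + " heap pool(s) armed at 80% of max");
    }
}
```

Compared with polling, a threshold notification costs nothing until the threshold is crossed, which makes it a reasonable fit for low-memory alerts.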
Finally, if you would like to view this information at runtime, the Sun JVM provides the Java Monitoring and Management Console (JConsole) that can connect to a running JVM and present this information. You can read more about it in a Sun article by Mandy Chung: Using JConsole to Monitor Applications.
Summary
Memory monitoring in Java 1.4.x and earlier was something of a black art, but with the introduction of Java SE 5 and the adoption of public interfaces such as JMX, the task has now been made simple. With the technical details of obtaining the information out of the way, you are free to focus on the real business value: analyzing those metrics.