123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728 |
- ---
- title: Scalable Web Applications
- order: 22
- layout: page
- ---
-
- [[scalable-web-applications]]
- = Scalable web applications
-
- [[introduction]]
- Introduction
- ^^^^^^^^^^^^
-
- Whether you are creating a new web application or maintaining an
- existing one, one thing you certainly will consider is the scalability
- of your web application. Scalability is your web application’s ability
- to handle a growing number of concurrent users. How many concurrent
- users can, for instance, your system serve at its peak usage? Will it be
- intended for a small scale intranet usage of tens to hundreds of
- concurrent users, or do you plan to reach for a medium size global web
- application such as 500px.com with about 1000-3000 concurrent users. Or
- is your target even higher? You might wonder how many concurrent users
- there are in Facebook or Linkedin. We wonder about that too, and thus
- made a small study to estimate it. You can see the results in Figure 1
- below. The estimates are derived from monthly visitors and the average
- visit time data found from Alexa.com.
-
- image:img/webusers.png[image]
-
- _Figure 1: Popular web applications with estimated number of concurrent
- users._
-
- The purpose of this article is to show you common pain points which tend
- to decrease the scalability of your web applications and help you find
- out ways to overcome those. We begin by introducing you to our example
- application. We will show you how to test the scalability of this
- example application with Gatling, a tool for stress testing your web
- application. Then, in the next chapters, we will go through some pain
- points of scalability, such as memory, CPU, and database, and see how to
- overcome these.
-
- [[book-store-inventory-application]]
- Book store inventory application
- --------------------------------
-
- Throughout this example we’ll use a book store inventory application
- (see Figure 2) as an example application when deep diving into the world
- of web application scalability. The application contains a login view
- and an inventory view with CRUD operations to a mockup data source. It
- also has the following common web application features: responsive
- layouts, navigation, data listing and master detail form editor. This
- application is publicly available as a Maven archetype
- (`vaadin-archetype-application-example`). We will first test how many
- concurrent users the application can serve on a single server.
-
- image:img/mockapp-ui.png[image]
-
- _Figure 2: Book store inventory application_
-
- The purpose of scalability testing is to verify whether the
- application's server side can survive with a predefined number of
- concurrent users or not. We can utilize a scalability test to find the
- limits, a breaking point or server side bottlenecks of the application.
- There are several options for scalability testing web applications. One
- of the most used free tools is Apache JMeter. JMeter suits well for
- testing web applications, as long as the client-server communication
- does not use websockets. When using asynchronous websocket
- communication, one can use the free Gatling tool or the commercial
- NeoLoad tool.
-
- You can find a lot of step by step tutorials online on how to use
- Gatling and JMeter. There is typically a set of tips and tricks that one
- should take into account when doing load testing on certain web
- frameworks, such as the open source Vaadin Framework. For more
- information on Vaadin specific tutorials, check the wiki pages on
- https://vaadin.com/scalability[vaadin.com/scalability].
-
- Gatling and JMeter can be used to record client to server requests of a
- web application. After recording, the recorded requests can be played
- back by numbers of concurrent threads. The more threads (virtual users)
- you use the higher the simulated load generated on the tested
- application.
-
- Since we want to test our application both in synchronous and
- asynchronous communication modes, we will use Gatling. Another benefit
- of Gatling compared to JMeter is that it is less heavy for a testing
- server, thus more virtual users can be simulated on a single testing
- server. Figure 3 shows the Gatling settings used to record the example
- scenario of the inventory application. Typically all static resources
- are excluded from the recording (see left bottom corner of the figure),
- since these are typically served from a separate http server such as
- Nginx or from a CDN (Content Delivery Network) service. In our first
- test, however, we still recorded these requests to see the worst case
- situation, where all requests are served from a single application
- server.
-
- image:img/figure3s2.png[image]
-
- _Figure 3: Gatling recorder._
-
- Gatling gathers the recorded requests as text files and composes a Scala
- class which is used to playback the recorded test scenario. We planned a
- test scenario for a typical inventory application user: The user logs in
- and performs several updates (in our case 11) to the store in a
- relatively short time span (3 minutes). We also assumed that they leave
- the inventory application open in their browser which will result in the
- HttpSession not closing before a session timeout (30min in our case).
-
- Let’s assume that we have an extremely large bookstore with several
- persons (say 10000) responsible for updating the inventory. If one
- person updates the inventory 5 times a day, and an update session takes
- 3 minutes, then with the same logic as we calculated concurrent users in
- the Introduction, we will get a continuous load of about 100 concurrent
- users. This is of course not a realistic assumption unless the
- application is a global service or it is a round-the-clock used local
- application, such as a patient information system. For testing purposes
- this is, however, a good assumption.
-
- A snippet from the end of our test scenario is shown in Figure 4. This
- test scenario is configured to be run with 100 concurrent users, all
- started within 60 seconds (see the last line of code).
-
- [source,scala]
- ....
- .pause(9)
- .exec(http(>"request_45")
- .post(>"/test/UIDL/?v-uiId=0")
- .headers(headers_9)
- .body(RawFileBody(>"RecordedSimulation_0045_request.txt")))
- .pause(5)
- .exec(http(>"request_46")
- .post(>"/test/UIDL/?v-uiId=0")
- .headers(headers_9)
- .body(RawFileBody(>"RecordedSimulation_0046_request.txt"))
- .resources(http(>"request_47")
- .post(uri1 + >"/UIDL/?v-uiId=0")
- .headers(headers_9)
- .body(RawFileBody(>"RecordedSimulation_0047_request.txt"))))}
- setUp(scn.inject(rampUsers(100) over (60 seconds))).protocols(httpProtocol)
- ....
-
- _Figure 4: Part of the test scenario of inventory application._
-
- To make the test more realistic, we would like to execute it several
- times. Without repeating we do not get a clear picture of how the server
- will tolerate a continuous high load. In a Gatling test script, this is
- achieved by wrapping the test scenario into a repeat loop. We should
- also flush session cookies to ensure that a new session is created for
- each repeat. See the second line of code in Figure 5, for an example of
- how this could be done.
-
- [source,scala]
- ....
- val scn = scenario("RecordedSimulation")
- .repeat(100,"n"){exec(flushSessionCookies).exec(http("request_0")
- .get("/test/")
- .resources(http("request_1")
- .post(uri1 + "/?v-1440411375172")
- .headers(headers_4)
- .formParam("v-browserDetails", "1")
- .formParam("theme", "mytheme")
- .formParam("v-appId", "test-3556498")
- ....
-
- _Figure 5: Repeating 100 times with session cookie flushing (small part
- of whole script)_
-
- We tested how well this simple example application tolerated several
- concurrent users. We deployed our application in Apache Tomcat 8.0.22 on
- a Windows 10 machine with Java 1.7.0 and an older quad core mobile Intel
- i7 processor. With its default settings (using the default heap size of
- 2GB), Tomcat was able to handle about 200 concurrent users for a longer
- time. The CPU usage for that small number of concurrent users was not a
- problem (it was lower than 5%), but the server configurations were a
- bottleneck. Here we stumbled upon the first scalability pain point:
- server configuration issues (see next chapter). It might sound
- surprising that we could only run a such small number of concurrent
- users by default, but do not worry, we are not stuck here. We will see
- the reasons for this and other scalability pain points in the following
- chapter.
-
- [[scalability-pain-points]]
- Scalability pain points
- -----------------------
-
- We will go through typical pain points of a web application developer
- which she (or he) will encounter when developing a web application for
- hundreds of concurrent users. Each pain point is introduced in its own
- subchapter and followed by typical remedies.
-
- [[server-configuration-issues]]
- Server configuration issues
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- One typical problem that appears when there are lots of concurrent
- users, is that the operating system (especially the *nix based ones) run
- out of file descriptors. This happens since most *nix systems have a
- pretty low default limit for the maximum number of open files, such as
- network connections. This is usually easy to fix with the `ulimit`
- command though sometimes it might require configuring the `sysctl` too.
-
- A little bit unexpected issues can also surface with network bandwidth.
- Our test laptop was on a wireless connection and its sending bandwidth
- started choking at about 300 concurrent users. (Please note that we use
- an oldish laptop in this entire test to showcase the real scalability of
- web apps –your own server environment will no doubt be even more
- scalable even out of the box.) One part of this issue was the wifi and
- another part was that we served the static resources, such as javascript
- files, images and stylesheets, from Tomcat. At this point we stripped
- the static resources requests out of our test script to simulate the
- situation where those are served from a separate http server, such as
- nginx. Please read the blog post
- “https://vaadin.com/blog/-/blogs/optimizing-hosting-setup[Optimizing
- hosting setup]” from our website for more information about the topic.
-
- Another quite typical configuration issue is that the application server
- is not configured for a large number of concurrent users. In our
- example, a symptom of this was that the server started rejecting
- (“Request timed out”) new connections after a while, even though there
- were lots of free memory and CPU resources available.
-
- After we configured our Apache Tomcat for high concurrent mode and
- removed static resource requests, and connected the test laptop into a
- wired network, we were able to push the number of concurrent users from
- 200 up to about 500 users. Our configuration changes into the server.xml
- of Tomcat are shown in Figure 6, where we define a maximum thread count
- (10240), an accepted threads count (4096), and a maximum number of
- concurrent connections (4096).
-
- image:img/figure6a.png[image]
-
- _Figure 6: Configuring Tomcat’s default connector to accept a lot of
- concurrent users._
-
- The next pain point that appeared with more than 500 users was that we
- were out of memory. The default heap size of 2GB eventually ran out with
- such high number of concurrent users. On the other hand, there was still
- a lot of CPU capacity available, since the average load was less than
- 5%.
-
- [[out-of-memory]]
- Out of memory
- ~~~~~~~~~~~~~
-
- Insufficient memory is possibly the most common problem that limits the
- scalability of a web application with a state. An http session is used
- typically to store the state of a web application for its user. In
- Vaadin an http session is wrapped into a `VaadinSession`. A
- VaadinSession contains the state (value) of each component (such as
- `Grid`, `TextFields` etc.) of the user interface. Thus,
- straightforwardly the more components and views you have in your Vaadin
- web application, the bigger is the size of your session.
-
- In our inventory application, each session takes about 0.3MB of memory
- which is kept in memory until the session finally closes and the garbage
- collectors free the resources. The session size in our example is a
- little bit high. With constant load of 100 concurrent users, a session
- timeout of 30 minutes and an average 3 minutes usage time, the expected
- memory usage is about 350MB. To see how the session size and the number
- of concurrent users affect the needed memory in our case, we made a
- simple analysis which results are shown in Figure 7. We basically
- calculated how many sessions there can exist at most, by calculating how
- many users there will be within an average usage time plus the session
- timeout.
-
- image:img/figure6s.png[image]
-
- _Figure 7: Memory need for varying size sessions and a different number
- of concurrent users._
-
- [[remedies]]
- Remedies
- ^^^^^^^^
-
- [[use-more-memory]]
- Use more memory
- +++++++++++++++
-
- This might sound simplistic, but many times it might be enough to just
- add as much memory as possible to the server. Modern servers and server
- operating systems have support for hundreds of gigabytes of physical
- memory. For instance, again in our example, if the size of a session
- would be 0.5MB and we had 5000 concurrent users, the memory need would
- be about 28GB.
-
- You also have to take care that your application server is configured to
- reserve enough memory. For example, the default heap size for Java is
- typically 2GB and for example Apache Tomcat will not reserve more memory
- if you do not ask it to do it with **`-Xmx`** JVM argument. You might
- need a special JVM for extremely large heap sizes. We used the following
- Java virtual machine parameters in our tests:
-
- ....
- -Xms5g -Xmx5g -Xss512k -server
- ....
-
- The parameters **`-Xms`** and **`-Xmx`** are for setting the minimum and
- the maximum heap size for the server (5 GB in the example), the `-Xss`
- is used to reduce the stack size of threads to save memory (typically
- the default is 1MB for 64bit Java) and the `-server` option tells JVM
- that the Java process is a server.
-
- [[minimize-the-size-of-a-session]]
- Minimize the size of a session
- ++++++++++++++++++++++++++++++
-
- The biggest culprit for the big session size in the inventory
- application is the container (BeanItemContainer) which is filled with
- all items of the database. Containers, and especially the built in fully
- featured BeanItemContainer, are typically the most memory hungry parts
- of Vaadin applications. One can either reduce the number of items loaded
- in the container at one time or use some lightweight alternatives
- available from Vaadin Directory
- (https://vaadin.com/directory[vaadin.com/directory]) such as Viritin,
- MCont, or GlazedLists Vaadin Container. Another approach is to release
- containers and views to the garbage collection e.g. every time the user
- switches into another view, though that will slightly increase the CPU
- load since the views and containers have to be rebuilt again, if the
- user returns to the view. The feasibility of this option is up to your
- application design and user flow –usually it’s a good choice.
-
- [[use-a-shorter-session-time-out]]
- Use a shorter session time out
- ++++++++++++++++++++++++++++++
-
- Since every session in the memory reserves it for as long as it stays
- there, the shorter the session timeout is, the quicker the memory is
- freed. Assuming that the average usage time is much shorter than the
- session timeout, we can state that halving the session timeout
- approximately halves the memory need, too. Another way to reduce the
- session’s time in the memory could be instructing users to logout after
- they are done.
-
- The session of a Vaadin application is kept alive by requests (such as
- user interactions) made from the client to the server. Besides user
- interaction, the client side of Vaadin application sends a heartbeat
- request into the server side, which should keep the session alive as
- long as the browser window is open. To override this behaviour and to
- allow closing idle sessions, we recommend that the `closeIdleSessions`
- parameter is used in your servlet configuration. For more details, see
- chapter
- https://vaadin.com/book/-/page/application.lifecycle.html[Application
- Lifecycle] in the Book of Vaadin.
-
- [[use-clustering]]
- Use clustering
- ++++++++++++++
-
- If there is not enough memory, for example if there is no way to reduce
- the size of a session and the application needs a very long session
- timeout, then there is only one option left: clustering. We will discuss
- clustering later in the Out of CPU chapter since clustering is more
- often needed for increasing CPU power.
-
- [[out-of-cpu]]
- Out of CPU
- ~~~~~~~~~~
-
- We were able to get past the previous limit of 500 concurrent users by
- increasing the heap size of Tomcat to 5GB and reducing the session
- timeout to 10 minutes. Following the memory calculations above, we
- should theoretically be able to serve almost 3000 concurrent users with
- our single server, if there is enough CPU available.
-
- Although the average CPU load was rather low (about 10%) still with 800
- concurrent users, it jumped up to 40% every now and then for several
- seconds as the garbage collector cleaned up unused sessions etc. That is
- also the reason why one should not plan to use full CPU capacity of a
- server since that will increase the garbage collection time in worst
- case even to tens of seconds, while the server will be completely
- unresponsive for that time. We suggest that if the average load grows to
- over 50% of the server’s capacity, other means have to be taken into use
- to decrease the load of the single server.
-
- We gradually increased the number of concurrent users to find out the
- limits of our test laptop and Tomcat. After trial and error, we found
- that the safe number of concurrent users for our test laptop was about
- 1700. Above that, several request timeout events occurred even though
- the CPU usage was about 40-50% of total capacity. We expect that using a
- more powerful server, we could have reached 2000-3000 concurrent users
- quite easily.
-
- [[remedies-1]]
- Remedies
- ^^^^^^^^
-
- [[analyze-and-optimize-performance-bottlenecks]]
- Analyze and optimize performance bottlenecks
- ++++++++++++++++++++++++++++++++++++++++++++
-
- If you are not absolutely sure about the origin of the high CPU usage,
- it is always good to verify it with a performance profiling tool. There
- are several options for profiling, such as JProfiler, XRebel, and Java
- VisualVM. We will use VisualVM in this case since it comes freely with
- every (Oracle’s) JDK since the version 1.5.
-
- Our typical procedure goes like this: 1. Deploy your webapp and start
- your server, 2. Start VisualVM and double click your server’s process
- (“e.g. Tomcat (pid 1234)”) on the Applications tab (see Figure 8), 3.
- Start your load test script with, for instance, 100 concurrent users, 4.
- Open the Sampler tab to see where the CPU time is spent, 5. Use the
- filter on the bottom to show the CPU usage of your application (e.g.
- “`biz.mydomain.projectx`”) and possible ORM (Object-relational mapping)
- framework (e.g. “`org.hibernate`”) separately.
-
- Typically, only a small part (e.g. 0.1 - 2 %) of CPU time is spent on
- the classes of your webapp, if your application does not contain heavy
- business logic. Also, CPU time spent on the classes of Vaadin should be
- very small (e.g. 1%). You can be relaxed about performance bottlenecks
- of your code if the most time (>90%) is spent on application server’s
- classes (e.g. “`org.apache.tomcat`”).
-
- Unfortunately, quite often database functions and ORM frameworks take a
- pretty big part of CPU time. We will discuss how to tackle heavy
- database operations in the Database chapter below.
-
- image:img/figure7s.png[image]
-
- _Figure 8: Profiling CPU usage of our inventory application with Java
- VisualVM_
-
- [[use-native-application-server-libraries]]
- Use native application server libraries
- +++++++++++++++++++++++++++++++++++++++
-
- Some application servers (at least Tomcat and Wildfly) allow you to use
- native (operating system specific) implementation of certain libraries.
- For example, The Apache Tomcat Native Library gives Tomcat access to
- certain native resources for performance and compatibility. Here we
- didn’t test the effect of using native libraries instead of standard
- ones. With little online research, it seems that the performance benefit
- of native libraries for Tomcat is visible only if using secured https
- connections.
-
- [[fine-tune-java-garbage-collection]]
- Fine tune Java garbage collection
- +++++++++++++++++++++++++++++++++
-
- We recommended above not to strain a server more than 50% of its total
- CPU capacity. The reason was that above that level, a garbage collection
- pause tends to freeze the server for too long a time. That is because it
- typically starts not before almost all of the available heap is already
- spent and then it does the full collection. Fortunately, it is possible
- to tune the Java garbage collector so that it will do its job in short
- periods. With little online study, we found the following set of JVM
- parameters for web server optimized garbage collection
-
- ....
- -XX:+UseCMSInitiatingOccupancyOnly
- -XX:CMSInitiatingOccupancyFraction=70
- ....
-
- The first parameter prevents Java from using its default garbage
- collection strategy and makes it use CMS (concurrent-mark-sweep)
- instead. The second parameter tells at which level of “occupancy” the
- garbage collection should be started. The value 70% for the second
- parameter is typically a good choice but for optimal performance it
- should be chosen carefully for each environment e.g. by trial and error.
-
- The CMS collector should be good for heap sizes up to about 4GB. For
- bigger heaps there is the G1 (Garbage first) collector that was
- introduced in JDK 7 update 4. G1 collector divides the heap into regions
- and uses multiple background threads to first scan regions that contain
- the most of garbage objects. Garbage first collector is enabled with the
- following JVM parameter.
-
- ....
- -XX:+UseG1GC
- ....
-
- If you are using Java 8 Update 20 or later, and G1, you can optimize the
- heap usage of duplicated Strings (i.e. their internal `char[]` arrays)
- with the following parameter.
-
- ....
- -XX:+UseStringDeduplication
- ....
-
- [[use-clustering-1]]
- Use clustering
- ++++++++++++++
-
- We have now arrived at the point where a single server cannot fulfill
- our scalability needs whatever tricks we have tried. If a single server
- is not enough for serving all users, obviously we have to distribute
- them to two or more servers. This is called clustering.
-
- Clustering has more benefits than simply balancing the load between two
- or more servers. An obvious additional benefit is that we do not have to
- trust a single server. If one server dies, the user can continue on the
- other server. In worst case, the user loses her session and has to log
- in again, but at least she is not left without the service. You probably
- have heard the term “session replication” before. It means that the
- user’s session is copied into other servers (at least into one other) of
- the cluster. Then, if the server currently used by the user goes down,
- the load balancer sends subsequent requests to another server and the
- user should not notice anything.
-
- We will not cover session replication in this article since we are
- mostly interested in increasing the ability to serve more and more
- concurrent users with our system. We will show two ways to do clustering
- below, first with Apache WebServer and Tomcats and then with the Wildfly
- Undertow server.
-
- [[clustering-with-apache-web-server-and-tomcat-nodes]]
- Clustering with Apache Web Server and Tomcat nodes
- ++++++++++++++++++++++++++++++++++++++++++++++++++
-
- Traditionally Java web application clustering is implemented with one
- Apache Web Server as a load balancer and 2 or more Apache Tomcat servers
- as nodes. There are a lot of tutorials online, thus we will just give a
- short summary below.
-
- 1. Install Tomcat for each node
- 2. Configure unique node names with jvmRoute parameter to each Tomcat’s
- server.xml
- 3. Install Apache Web Server to load balancer node
- 4. Edit Apache’s httpd.conf file to include mod_proxy, mod_proxy_ajp,
- and mod_proxy_balancer
- 5. Configure balancer members with node addresses and load factors into
- end of httpd.conf file
- 6. Restart servers
-
- There are several other options (free and commercial ones) for the load
- balancer, too. For example, our customers have used at least F5 in
- several projects.
-
- [[clustering-with-wildfly-undertow]]
- Clustering with Wildfly Undertow
- ++++++++++++++++++++++++++++++++
-
- Using Wildfly Undertow as a load balancer has several advantages over
- Apache Web Server. First, as Undertow comes with your WildFly server,
- there is no need to install yet another software for a load balancer.
- Then, you can configure Undertow with Java (see Figure 8) which
- minimizes the error prone conf file or xml configurations. Finally,
- using the same vendor for application servers and for a load balancer
- reduces the risk of intercompatibility issues. The clustering setup for
- Wildfly Undertow is presented below. We are using sticky session
- management to maximize performance.
-
- 1. Install Wildfly 9 to all nodes
- 2. Configure Wildfly’s standalone.xml
- 1. add `“instance-id=”node-id”` parameter undertow subsystem, e.g:
- `<subsystem xmlns="urn:jboss:domain:undertow:2.0" instance-id="node1"> `(this
- is needed for the sticky sessions).
- 2. set http port to something else than 8080 in socket-binding-group,
- e.g: `<socket-binding name="http" port="${jboss.http.port:8081}"/>`
- 3. Start your node servers accepting all ip addresses:
- `./standalone.sh -c standalone.xml -b=0.0.0.0`
- 4. Code your own load balancer (reverse proxy) with Java and Undertow
- libraries (see Figure 9) and start it as a Java application.
-
- [source,java]
- ....
- public static void main(final String[] args) {
- try {
- LoadBalancingProxyClient loadBalancer = new LoadBalancingProxyClient()
- .addHost(new URI("http://192.168.2.86:8081"),"node1")
- .addHost(new URI("http://192.168.2.216:8082"),"node2")
- .setConnectionsPerThread(1000);
- Undertow reverseProxy = Undertow.builder()
- .addHttpListener(8080, "localhost")
- .setIoThreads(8)
- .setHandler(new ProxyHandler(loadBalancer, 30000, ResponseCodeHandler.HANDLE_404))
- .build();
- reverseProxy.start();
- } catch (URISyntaxException e) {
- throw new RuntimeException(e);
- }
- }
- ....
-
- _Figure 9: Simple load balancer with two nodes and sticky sessions._
-
- [[database]]
- Database
- ~~~~~~~~
-
- In most cases, the database is the most common and also the most tricky
- to optimize. Typically you’ll have to think about your database usage
- before you actually need to start optimizing the memory and CPU as shown
- above. We assume here that you use object to relational mapping
- frameworks such as Hibernate or Eclipselink. These frameworks implement
- several optimization techniques within, which are not discussed here,
- although you might need those if you are using plain old JDBC.
-
- Typically profiling tools are needed to investigate how much the
- database is limiting the scalability of your application, but as a rule
- of thumb: the more you can avoid accessing the database, the less it
- limits the scalability. Consequently, you should generally cache static
- (or rarely changing) database content.
-
- [[remedies-2]]
- Remedies
- ^^^^^^^^
-
- [[analyze-and-optimize-performance-bottlenecks-1]]
- Analyze and optimize performance bottlenecks
- ++++++++++++++++++++++++++++++++++++++++++++
-
- We already discussed shortly, how to use Java VisualVM for finding CPU
- bottlenecks. These same instructions also apply for finding out at what
- level the database consumes the performance. Typically you have several
- Repository-classes (e.g. `CustomerRepository`) in your web application,
- used for CRUD (create, read, update, delete) operations (e.g.
- `createCustomer`). Commonly your repository implementations either
- extend Spring’s JPARepository or use `javax.persistence.EntityManager`
- or Spring’s `Datasource` for the database access. Thus, when profiling,
- you will probably see one or more of those database access methods in
- the list of methods that are using most of your CPU’s capacity.
-
- According to our experience, one of the bottlenecks might be that small
- database queries (e.g. `findTaskForTheDay`) are executed repeatedly
- instead of doing more in one query (e.g. `findTasksForTheWeek`). In some
- other cases, it might be vice versa: too much information is fetched and
- only part of it is used (e.g. `findAllTheTasks`). A real life example of
- the latter happened recently in a customer project, where we were able
- to a gain significant performance boost just by using JPA Projections to
- leave out unnecessary attributes of an entity (e.g. finding only Task’s
- name and id) in a query.
-
- [[custom-caching-and-query-optimization]]
- Custom caching and Query optimization
- +++++++++++++++++++++++++++++++++++++
-
- After performance profiling, you have typically identified a few queries
- that are taking a big part of the total CPU time. A part of those
- queries might be the ones that are relatively fast as a single query but
- they are just done hundreds or thousands of times. Another part of
- problematic queries are those that are heavy as is. Moreover, there is
- also the __N__+1 query problem, when, for example, a query for fetching
- a Task entity results __N__ more queries for fetching one-to-many
- members (e.g. assignees, subtasks, etc.) of the Task.
-
- The queries of the first type might benefit from combining to bigger
- queries as discussed in the previous subchapter (use
- `findTasksForTheWeek` instead of `findTaskForTheDay`). I call this
- approach custom caching. This approach typically requires changes in
- your business logic too: you will need to store (cache) yet unneeded
- entities, for example in a `HashMap` or `List` and then handle all these
- entities sequentially.
-
- The queries of the second type are typically harder to optimize.
- Typically slow queries can be optimized by adding a certain index or
- changing the query logic into a little bit different form. The difficult
- part is to figure out what exactly makes the query slow. I recommend
- using a logging setting that shows the actual sql query made in your log
- file or console (e.g. in Hibernate use `show_sql=true`). Then you can
- take the query and run it against your database and try to vary it and
- see how it behaves. You can even use the `EXPLAIN` keyword to ask MySQL
- or PostgreSql (`EXPLAIN PLAN FOR` in Oracle and `SHOWPLAN_XML` in SQL
- Server) to explain how the query is executed, what indexes are used etc.
-
- The __N__+1 queries can be detected by analysing the executed sqls in
- the log file. The first solution for the issue is redesigning the
- problematic query to use appropriate join(s) to make it fetch all the
- members in a single sql query. Sometimes, it might be enough to use
- `FetchType.EAGER` instead of `LAZY` for the problematic cases. Yet
- another possibility could be your own custom caching as discussed above.
-
- [[second-level-cache]]
- Second-level cache
- ++++++++++++++++++
-
- According to Oracle’s Java EE Tutorial: a second-level cache is a local
- store of entities managed by the persistence provider. It is used to
- improve the application performance. A second-level cache helps to avoid
- expensive database queries by keeping frequently used entities in the
- cache. It is especially useful when you update your database only by
- your persistence provider (Hibernate or Eclipselink), you read the
- cached entities much more often than you update them, and you have not
- clustered your database.
-
- There are different second-level cache vendors such as EHCache, OSCache,
- and SwarmCache for Hibernate. You can find several tutorials for these
- online. One thing to keep in mind is that the configuration of, for
- example, EHCache varies whether you use Spring or not. Our experience of
- the benefits of second-level caches this far is that in real world
- applications the benefits might be surprisingly low. The benefit gain
- depends highly on how much your application uses the kind of data from
- the database that is mostly read-only and rarely updated.
-
- [[use-clustering-2]]
- Use clustering
- ++++++++++++++
-
- There are two common options for clustering or replication of the
- database: master-master replication and master-slave replication. In the
- master-master scheme any node in the cluster can update the database,
- whereas in the master-slave scheme only the master is updated and the
- change is distributed to the slave nodes right after that. Most
- relational database management systems support at least the master-slave
- replication. For instance, in MySql and PostgreSQL, you can enable it by
- few configuration changes and by granting the appropriate master rights
- for replication. You can find several step-by-step tutorials online by
- searching with e.g. the keywords “postgresql master slave replication”.
-
- [[nosql]]
- NoSQL
- +++++
-
- When looking back to the first figure (Figure 1) of the article, you
- might wonder what kind of database solutions the world's biggest web
- application’s use? Most of them use some relation database, partly, and
- have a NoSQL database (such as Cassandra, MongoDB, and Memcached) for
- some of the functionality. The big benefit of many NoSQL solutions is
- that they are typically easier to cluster, and thus help one to achieve
- extremely scalable web applications. The whole topic of using NoSQL is
- so big that we do not have the possibility to discuss it in this
- article.
-
- [[summary]]
- Summary
- -------
-
- We started the study by looking at typical applications and estimated
- their average concurrent user number. We then started with a typical
- Vaadin web application and looked at what bottlenecks we hit on the way,
- by using a standard laptop. We discussed different ways of overcoming
- everything from File Descriptors to Session size minimization, all the
- way to Garbage collection tweaking and clustering your entire
- application. At the end of the day, there are several issues that could
- gap you applications scalability, but as shown in this study, with a few
- fairly simple steps we can scale the app from 200 concurrent users to
- 3000 concurrent users. As a standard architectural answer, however: the
- results in your environment might be different, so use tools discussed
- in this paper to find your bottlenecks and iron them out.
|