12 August 2010

Memory on AIX

Memory overview

This section gives an overview of memory as it relates to AIX. I discuss how AIX uses virtual memory to address more memory than is physically on your system. I also explain how the VMM actually works and how it services requests.

Any discussion of memory and AIX must start with a description of the VMM. AIX newbies are sometimes surprised to hear that the VMM services all memory requests from the system, not just virtual memory. When RAM is accessed, the VMM needs to allocate space, even when there is plenty of physical memory left on the system, because it implements early allocation of paging space. Using this method, the VMM plays a vital role in managing real memory, not only virtual memory.

Here is how it works. In AIX, all virtual memory segments are partitioned into pages, and the default page size is 4KB. Allocated pages can reside either in RAM or in paging space (virtual memory stored on disk). The VMM also maintains what is referred to as a free list: unallocated page frames that are used to satisfy page faults. There is usually a small number of unallocated pages (which you configure) that the VMM keeps free and to which it reassigns page frames. The virtual memory pages whose page frames are to be reassigned are selected using the VMM's page replacement algorithm, which determines which virtual memory pages currently in RAM have their page frames returned to the free list. AIX uses all available memory, except the portion configured to remain unallocated and known as the free list.
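
The 4KB page size mentioned above is easy to confirm from user space. Here is a minimal sketch in Python, assuming you run it on the AIX host in question; the resource module is part of the standard library on UNIX-like systems:

import resource

# Ask the running process how big a memory page is. On an AIX host with the
# default configuration this prints 4096 (4KB); AIX can also be configured
# with larger page sizes such as 64KB.
print("Page size: %d bytes" % resource.getpagesize())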

To reiterate, the purpose of the VMM is to manage the allocation of both RAM and virtual pages. From this, it follows that its objectives are to minimize both the response time of page faults and the use of virtual memory where it can. Obviously, given the choice between RAM and paging space, most people would prefer to use physical memory if the RAM is available. The VMM also classifies virtual memory segments into two distinct categories: working segments, which use computational memory, and persistent segments, which use file memory. It is extremely important to understand the distinction between the two, as this helps you tune your systems to their optimum capabilities.

Computational memory

Computational memory is used while your processes are actually computing information. These working segments are temporary (transitory) and exist only until a process terminates or the page is stolen. They have no permanent disk storage location. When a process terminates, both the physical memory and paging space are released in many cases. You can actually see this happening while monitoring your system, as a large spike in available pages. In the VMM world, when free physical memory starts getting low, programs that have not been used recently are moved from RAM to paging space to help release physical memory for more real work.

File memory

File memory (unlike computational memory) uses persistent segments and has a permanent storage location on disk. Data files or executable programs are mapped to persistent segments rather than working segments. The data files can belong to filesystems such as JFS, JFS2, or NFS. They remain in memory until the file is unmounted, a page is stolen, or the file is unlinked. After a data file is copied into RAM, the VMM controls when these pages are overwritten or used to store other data. Given the alternative, most people would much rather have file memory paged to disk than computational memory.

When a process references a page that is on disk, it must be paged in, which could cause other pages to be paged out. The VMM is constantly lurking and working in the background, trying to steal frames that have not been recently referenced, using the page replacement algorithm discussed earlier. It also helps detect thrashing, which can occur when memory is extremely low and pages are constantly being paged in and out to support processing. The VMM has a memory load control algorithm, which can detect whether the system is thrashing and tries to remedy the situation. Unchecked thrashing can literally bring a system to a standstill, as the kernel becomes more concerned with making room for pages than with doing anything productive.


10 August 2010

Performance tuning considerations in your application server environment

Dynamic Web applications enable you to make vast stores of information instantly accessible to your users through a familiar interface. As your applications become more popular, however, you may find that they cannot serve requests as quickly as they once did. An understanding of how Web requests are served, along with a grasp of the do's and don'ts of Web application development, will ensure less trouble later on.

A static Web request, such as the one shown in Figure 1, is easy to grasp. A client connects to the server, usually on TCP port 80, and makes a simple request using the HTTP protocol.


Figure 1. A client requesting a static file over HTTP

The server parses the request, which is mapped to a file on a filesystem. The server then sends the client some headers that describe the payload (such as a Web page or image), and finally sends the file along to the client.

There are few bottlenecks in the above scenario. If the requests are varied enough to make ineffective use of the operating system's disk cache, then the server's disks will become busy and at some point slow the whole process down. If the network pipe feeding the clients becomes saturated, then all clients will suffer. But these conditions aside, the "receive request, send file" process is fairly efficient.

You can get a rough idea of the performance of the static server by making some assumptions. With a request service time of 10ms (largely constrained by having to seek disk heads), you can expect to serve about 100 requests per second before the disks become saturated (1 second / 10 msec per request = 100 requests/second). With a 10KB document, you would be serving about 8 Mbit/sec of Web traffic (100 requests/second * 10 KBytes/request * 8 bits/byte). If files can be served from memory cache, the average service time decreases, and hence the number of connections per second the server can handle increases. Should you have real numbers for your disk service time or average request latency, you can plug them into the calculations above to get better performance estimates.
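
As a quick sanity check, that arithmetic can be written out directly. This is only the back-of-the-envelope estimate from the paragraph above, not a benchmark; the 10ms service time and 10KB document size are assumptions:

# Back-of-the-envelope capacity estimate for the static server.
service_time = 0.010        # seconds per request, dominated by disk seeks
doc_size_kb = 10            # average document size in KBytes

requests_per_second = 1 / service_time                              # 100 requests/second
mbit_per_second = requests_per_second * doc_size_kb * 8 / 1000.0    # ~8 Mbit/sec

print(requests_per_second, mbit_per_second)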

Now that the capacity of a server has been shown to be the inverse of the average request service time, it follows that doubling the service time halves the server's capacity in terms of number of connections served per second. With this in mind, turn your attention to dynamic applications.

The flow of a dynamic application depends on the application, but generally looks like Figure 2.


Figure 2. A client requesting a dynamic page over HTTP

Much like the client from the previous example, the client from Figure 2 starts the process by making a request. Nothing in the request differentiates a static request from a dynamic request (sometimes, extensions like .php or .cgi give you a hint, but they can be misleading). It is up to the Web server to determine what to do with the request.

In Figure 2, the request is destined for an application server, such as a Solaris system running a Java™ application. The application server does some processing, then consults a database for more information. With this information in hand, the application server generates an HTML page, which is relayed to the client by the Web server. The service time for this request is the sum of its components: if the database access costs 7ms, the application server 13ms, and the Web server 5ms, then the service time of the Web page is 25ms. Using the inverse rule learned above, the capacities of the components are 142, 77, and 200 requests per second, respectively. The bottleneck is therefore the application server, which limits this system to 77 connections per second before the Web server is forced to wait and to queue connections.
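
Applying the inverse rule to each component makes the bottleneck easy to spot. A small sketch using the example's 7ms/13ms/5ms figures:

# Per-request service time of each component in the example, in seconds.
components = {"database": 0.007, "application server": 0.013, "web server": 0.005}

total_service_time = sum(components.values())                   # 0.025 s, i.e. 25ms per page
capacities = {name: 1 / t for name, t in components.items()}    # inverse rule

bottleneck = min(capacities, key=capacities.get)                # "application server", ~77/sec
print(total_service_time, capacities, bottleneck)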

It is important to note, though, that just because the system can dispatch 77 connections per second and a single connection takes 25ms to process, it does not follow that each user's request will be served in 25ms. Each component can only process one connection at a time, so at peak load, requests have to wait their turn for the CPU at each step. In the example above, the average request time ends up being over 1.1 seconds once you factor in the queueing time on top of the 25ms processing time. See the Resources for information on solving these queuing problems.
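
As a rough illustration (not the exact model behind the 1.1-second figure), an M/M/1 approximation shows how the wait at a single stage grows as arrivals approach its capacity; the arrival rates below are assumptions chosen for illustration:

mu = 77.0    # application server capacity from the example, requests/second

# M/M/1 approximation: average time spent at a stage is 1 / (mu - lam),
# where lam is the arrival rate into that stage.
for lam in (40, 60, 70, 76):
    w = 1.0 / (mu - lam)
    print("arrivals %2d/sec -> average time at the application server %.3f sec" % (lam, w))

# Near saturation (76 requests/second) the time spent at this one stage
# approaches a full second, even though its bare service time is only 13ms.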

From these examples, you can draw several conclusions:

  • Putting more steps between the user and the final page makes things slower and decreases system capacity.
  • This effect can be noticeable, especially as the page request rate increases.
  • The architecture decisions made at the beginning of a project also affect the ability of a site to handle loads.

The rest of this article will reinforce these points.

N-tier architectures for dynamic sites

The architecture of applications, including dynamic Web applications, is often described in terms of tiers. A static site can be considered to have one tier: the Web server. If you then had a scripting language such as PHP running in-process with the Web server and connecting to a database, that would be considered two tiers. The example from the previous section has three tiers: the front-end Web server, the application server, and the database.

Depending on who you are talking to, a single piece of software might consist of multiple tiers. For example, a PHP script might use a template engine to separate business logic from the presentation, and be considered two separate tiers. A Java application might have a Java servlet performing presentation tasks, which talks with an Enterprise Java Bean (EJB) to perform business logic, which then connects to a database for further information. Therefore one 3-tier architecture may look different from another, especially when different toolsets are involved.

Common architectures

Even though an application architecture varies from application to application, some common trends emerge. Generally, an application needs the functionality of four tiers:

  • A client tier
  • A presentation tier
  • A business logic tier
  • A data tier

In a Web application, the client tier is handled by the Web browser. The browser renders HTML and executes JavaScript (and, optionally, Java applets, ActiveX, or Flash) in order to display information to the user and collect information from the user. The presentation tier is the interface from the servers to the client, and is responsible for formatting the output so that it can be displayed on the client. The business logic tier enforces the rules of the business, such as calculations and workflow, that drive the application. Finally, the data tier is a persistent data store, such as a database or file storage.

Most applications will need functionality from each of these four tiers, even though they may not implement them distinctly and completely.

Another popular architecture is Model-View-Controller (MVC), a pattern for separating the components of an application. In MVC, the model encapsulates the business logic tier and, together with the framework, the data tier. The view handles the presentation of data to the client, and the controller's job is to control the application flow.

Scaling the tiers

To scale a Web application means to grow it so that it can handle more traffic. One aspect of scaling is how hardware is deployed to match demand. Another aspect is how the application responds to this new hardware environment. Conceptually, it is easy to imagine throwing a bigger server at the problem, but the nature of the application may be such that other bottlenecks present themselves first. Breaking down the application into a series of tiers helps to reduce the scope of the problem, and allows for easier scaling.

Application bottlenecks aside for now, scaling the application's hardware is usually described as being done horizontally or vertically. Horizontal scaling means that scaling is done by adding more servers to a tier. The earlier example that had an application server bottleneck of 77 requests per second might be solved by a second application server, with the load shared between the two. This would give a theoretical capacity of 154 requests per second, pushing the bottleneck to the database.

Vertical scaling, on the other hand, means that a bigger machine is used. You could use a larger machine that could run two instances of the application server, or one that could run requests in half the time.

At this point it would be tempting to dismiss vertical scaling entirely, because it is usually cheaper to buy several smaller machines than to continually buy bigger servers. However, there are many cases where vertical scaling is the better option. If you have something like an IBM® Power® server that supports partitioning of hardware through logical partitions (LPARs), you may already have spare capacity to add to your application server tier.

Your application's requirements may also drive you towards vertical scaling. On one server, it is easy to share the user's session state in a shared memory segment. Once you have two servers, you need to find a new way to share state, such as a database. The database will be slower than memory access, so two servers will not be twice as fast as one.

Databases are another example of a place where vertical scaling is often used. To split your dataset across different servers requires a lot of work at the application level, such as joining columns across two databases and making sure data is consistent. It would be far easier to build a bigger database server, and push off the need to rebuild your application to support the sharding of data.

Modeling a Web application in terms of queues

From the previous discussion on application architectures, it becomes apparent that a Web request passes through different phases, each phase taking a certain amount of time to execute. Requests line up to pass through each of these steps, and once the step has completed, the request lines up for the next step. Each of these steps is much like how people line up at a grocery store to go through the check out.

A Web application can be modeled in terms of these steps, called queues. Each component of your application is a queue. A typical WebSphere application is shown in Figure 3, drawn as a series of queues.


Figure 3. A WebSphere® application shown as a queuing network

Figure 3 shows how requests wait to be serviced by the Web server, then wait for the Web container, and so forth. If the rate of requests entering a particular queue exceeds the rate at which the queue can process requests, then the requests back up. When requests back up, the service time becomes unpredictable, and your users will see stalled browser sessions. The queuing in Figure 3 represents the worst-case scenario, because some requests will be satisfied by the Web server alone, or without having to hit the database.

Queues are common in the UNIX® environment. The operating system queues disk requests when the applications issue them faster than the disks can return the data, optionally reordering them to minimize seeking. The run queue, which holds an ordered list of processes waiting to run, is another queue. Applications wait for their turn on a limited resource (the CPU) before being served.

Tuning the queues is therefore a balancing act. Too small, and you'll drop users when you still have capacity; too big, and you'll try to serve more users than your system can handle, causing poor performance for everyone.

To complicate the matter, these queuing slots aren't free. There is a memory overhead to hold the spot and, in the case of an application server, memory for the thread that's handling the request. It is therefore rarely a good idea to allow queuing at the application server. Instead, the recommended approach is to queue users before the application server, such as on the Web server. The Web server keeps the connection open with the Web client and issues the request once the next application server slot is freed. The application server is tuned to handle only as many requests per second as it can dispatch in a timely manner.

IBM has documented their recommended approach for laying out a Web application, along with methods to tune various queues. Of note, though, is their guidance that you should avoid queuing inside WebSphere. What this means is that you want to only have requests that you are able to immediately service directed at the WebSphere application server. Your Web server (or even better, a proxy server in front of the Web server) should be holding on to the excess connections as they wait their turn to be processed. This ensures that the heavier application server queues are able to spend their time servicing a limited number of requests rather than trying to serve everyone at once.

Notes for developers

As a developer, there are general principles that you can follow in your applications to help them scale. These are high-level things that are applicable to most Web applications.

Instrumentation

Your application should have some way of providing measurements back to a collection system (even if the collection system is just a log file). These measurements might track how often a certain function in your application is accessed, or how long a request took to handle. This won't make your application any faster, but it helps you understand when things are running slowly and which parts of your code are taking the most time. Understanding when certain functions are being called can help you correlate events you see on your system (CPU spikes, high disk activity) with something meaningful to your application (image uploading).

Knowing what's going on in your site is the key to being able to scale it. Parts of your code that you suspect are poorly optimized may turn out not to be a problem at all; only proper measurement can tell you what's going on.
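
As a sketch of what such instrumentation can look like, here is a small Python decorator that logs how long each call to a function takes. The handler name and log file are illustrative only:

import functools
import logging
import time

logging.basicConfig(filename="app_metrics.log", level=logging.INFO)

def instrumented(func):
    """Record how long each call to func takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return func(*args, **kwargs)
        finally:
            logging.info("%s took %.4f seconds", func.__name__, time.time() - start)
    return wrapper

@instrumented
def handle_image_upload(data):      # hypothetical request handler
    time.sleep(0.05)                # stand-in for the real work
    return len(data)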

Sessions

The Web is inherently stateless. Each request you make is independent of the last one. Applications, however, are usually stateful: you must log in to an application to identify yourself, you might keep a shopping cart for the life of your visit, or you might fill in a profile for later use. Keeping track of sessions can be an expensive operation, especially as you grow beyond one server.

A Web application running on a single server can keep session information in memory, which can be accessed by any instance of the Web application running on the server. Users are usually given a token that identifies the session in this memory bucket. Consider what happens when a second application server is introduced. If the user's first request is to one server, and the second request is to the second server, then two separate sessions exist, and each is different.

The common solution to this problem is to store the session in a database instead of memory. The problem that this causes is that you've introduced another database read, and potentially a database write, per request. This database is also required by each Web application server.

One solution is to only use sessions where you need them. Instead of having your application load the session on each request, only have it load the session when the session is needed. This reduces the number of requests to the backend database.
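
One way to sketch that lazy behaviour, assuming a hypothetical loader function that queries the session table:

class LazySession:
    """Fetch session data from the database only when it is first used."""

    def __init__(self, session_id, loader):
        self.session_id = session_id
        self._loader = loader       # e.g. a function that queries the session table
        self._data = None

    @property
    def data(self):
        if self._data is None:      # first access on this request: hit the database
            self._data = self._loader(self.session_id)
        return self._data

# A request handler that never touches session.data never costs a database read.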

Another approach is to encrypt the session data and send it back to the client, eliminating the need to store the session locally. You cannot store an infinite amount of data in the user's cookie, but RFC 2109 specifies that clients should be able to store at least 20 cookies per domain name, each a minimum of 4K bytes long.
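
A sketch of the client-side session idea using only the Python standard library is shown below. Note that this example signs the data so the client cannot tamper with it; the encryption described above would need a cryptographic library on top of this, so treat it as an outline rather than a complete solution:

import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"      # known only to the servers, never to the client

def session_to_cookie(session):
    payload = base64.urlsafe_b64encode(json.dumps(session).encode("utf-8"))
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature

def cookie_to_session(value):
    payload, signature = value.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("cookie has been tampered with")
    return json.loads(base64.urlsafe_b64decode(payload))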

If database-backed sessions do prove to be a bottleneck and you can't get rid of them, you should be able to split them onto a separate database, or even multiple databases. For example, you can keep even session IDs on one database and odd session IDs on another.
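
The even/odd split can be as simple as choosing a connection from the session ID, as in this sketch (the connection handles are placeholders):

def pick_session_db(session_id, databases):
    """Route even session IDs to databases[0], odd ones to databases[1]."""
    return databases[session_id % len(databases)]

# databases = [connection_even, connection_odd]   # placeholder connection handles
# conn = pick_session_db(1234, databases)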

Caching

Changes in your data happen more often in some parts of your application than in others. A news site might change its list of top-level categories only once a month, so it would be wasteful to query the database on every request for the latest list of categories. Similarly, a page representing a news story might change only a couple of times in its life, so there is no need to regenerate it on every request.

Caching means to store the results of an expensive request for later use. You could cache the list of categories, or even the whole page.

When considering caching, ask yourself "Does this information have to be fresh?" If not, it might be a candidate for caching. It might be important to be able to change the news story on a moment's notice, but even if you check for changes once a minute and serve from cache the rest of the time, it may be considered fresh enough.

A complementary method is to invalidate the cached entry when the underlying data changes. If the news story is changed, one of the actions performed on saving might be to remove the cached version. The next request would see that there is no cached version, and generate a new entry.

When caching, you must be careful about what happens when a cached entry expires or is removed. If many requests are being made for the cached item, many of them will end up regenerating it at once when the entry expires. To get around this, you can have only the first request regenerate the cache, and let everyone else use the stale item until the new one is available.
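
A small in-process sketch of both ideas follows: entries stay fresh for a fixed time, and when one expires only the first caller regenerates it while the others keep serving the stale copy. A real deployment would use a shared cache rather than a per-process dictionary:

import threading
import time

_cache = {}            # key -> (value, expires_at)
_locks = {}            # key -> lock, so only one caller regenerates at a time

def cached(key, ttl, regenerate):
    now = time.time()
    entry = _cache.get(key)
    if entry and entry[1] > now:                    # still fresh
        return entry[0]
    lock = _locks.setdefault(key, threading.Lock())
    if lock.acquire(blocking=False):                # first caller rebuilds the entry
        try:
            value = regenerate()
            _cache[key] = (value, now + ttl)
            return value
        finally:
            lock.release()
    if entry:                                       # everyone else serves the stale copy
        return entry[0]
    with lock:                                      # nothing cached yet: wait for the rebuild
        entry = _cache.get(key)
        return entry[0] if entry else regenerate()

# Usage: categories = cached("top-categories", 60, load_categories_from_db)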

memcached is a distributed memory caching system popular with applications deployed in UNIX environments. Servers run instances of the memcached daemon, which allocates a block of RAM that's accessible over a simple network protocol. Applications wishing to store or retrieve a piece of data within memcached first hash the key, which tells them which server in the memcached pool to use. The data can then be retrieved or stored by connecting to that server, which is much quicker than a disk or database access.
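
The key-to-server step can be sketched in a few lines. This is a simplified stand-in for what the client libraries do; real clients typically use consistent hashing so that adding a server does not remap every key:

import hashlib

servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]   # memcached pool

def server_for(key):
    """Hash the key and map it onto one server in the pool."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# server_for("user:42:cart") always resolves to the same server, so every
# application instance looks the same key up in the same place.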

When looking for things to cache, you might also ask whether the information is really needed in the first place. Do you need to display the user's shopping cart on every page? What about just the total? What about a simple link to "view the contents of your cart"?

Edge Side Includes (ESI) is a markup language that lets you break your Web page into separate, cacheable entities. Your application is responsible for generating the HTML document that contains the ESI markup, and also for generating the components. The proxy cache in front of your Web application reassembles the final document from the parts, caching some components and making requests for others. Listing 1 shows an example of an ESI document.


Listing 1. ESI example

<html>
<body>
This is static content
<esi:include src="/stories/123" />
The line above just told the proxy to request /stories/123 and insert
it in the middle of the page
</body>
</html>

Though a very simple example, Listing 1 shows how two documents can be spliced together, each with their own caching rules.

Asynchronous processing

Loosely tied to the "Does this information have to be fresh?" question is another: "Does the information have to be updated by the time this request finishes?" There are many cases where you can take the data submitted by the user and delay the processing by a matter of seconds, rather than process the information while the user waits for the page to load. This is called asynchronous processing. One common approach is to have the application send the data to a message queue, such as IBM WebSphere MQ, to be processed when resources are available. This lets you return a page to the user immediately, even though the results of the data processing are not yet known.
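
A minimal sketch of the pattern using Python's standard-library queue is shown below. A production system would hand the message to a real broker such as IBM WebSphere MQ, but the shape is the same: the request handler enqueues the work and returns immediately, while a worker drains the queue in the background:

import queue
import threading

work_queue = queue.Queue()

def process_order(order):                  # placeholder for the slow processing
    print("processed", order)

def worker():
    while True:
        order = work_queue.get()           # blocks until work arrives
        process_order(order)
        work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_order_request(order):
    work_queue.put(order)                  # hand the work off to the worker...
    return "Your order has been received." # ...and answer the user right away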

Consider an e-commerce application where the user submits an order. You may decide that it is important to immediately return the credit card verification, but you may not need to get confirmation from your ordering system that all the components are available. Instead, you could put the order in a queue to be processed, which might happen within seconds. If an error happens, this can be relayed to the user through email, or even inserted into her session should she still be on your site. Another example is reporting. Rather than wait for the report to be generated, you could return a "please check the reports page in a few minutes" message, while the report is farmed out to a separate server asynchronously.

Summary

Applications are usually written in a tiered fashion. Presentation logic is separated from business logic, which is further separated from the persistent storage. This allows for easier maintainability of the code, but introduces overhead. Scaling an application involves understanding the flow of data in this tiered environment, and looking for bottlenecks.

Techniques such as caching and asynchronous processing can reduce the workload on the application by reusing previous work or by farming the bulk of the work to another machine. Providing instrumentation hooks in your application lets you zero in on your hotspots in near real time.

The application server environment operates much like a queuing network, and it is important to manage the size of the queues so that one tier isn't overwhelmed by another. IBM recommends that you queue as far before the application server as possible, such as on the external Web server or proxy cache.

Applications rarely scale only by throwing more hardware at the problem. Usually these techniques must be put into place so that the new hardware can be used effectively.


source : http://www.ibm.com/developerworks/aix/library/au-perf_tuning/index.html

Are You A Bad Boss?

Surely you remember that before you held a position as a boss, you too experienced the ups and downs of being an employee. There were times when, for one reason or another, you were dissatisfied with your boss's leadership and promptly bestowed the title of "the bad boss" on that superior. With burning resentment, you may even have planned some form of payback. More extreme still, some people quietly harbor the intention of venting their dislike of a superior through dangerous physical contact. Even if this remains nothing more than talk, there is truly no guarantee that an employee would never act on it.

Now that you hold the same position your boss once held, leading and managing your own staff, it is quite possible that one of them is going through what you once experienced. Without your realizing it, they may be harboring dissatisfaction with your leadership, or with the rights and obligations they receive.

Leading and managing a team does not have to please everyone. There is no formula that is 100% right and accurate for exercising leadership, because it is a complicated matter that is difficult to carry out perfectly. To minimize the chance of being labeled a "bad boss", one thing you can do is get to know your leadership style and its effectiveness with your subordinates, and keep developing and maturing your character as a leader. Know your weaknesses as well, and work to improve them.

Some traits you should maintain, or work towards, so that you never earn the title of the bad boss include:

Going down. A bad boss does not know the difficulties his team members face. The philosopher Lao-Tzu once said, "If you want to govern the people, you must place yourself below them." Being a boss does not mean you must always remain "above", where your team members find it hard to "reach" you. Invest time and energy in learning the details of the work being done and the day-to-day working conditions your subordinates face. Know the difficulties each of your people encounters and discuss the most suitable solution. Show that you are willing to stay close and take part in your team's work.

Communicate and understand clearly. A bad boss only gives orders, without knowing every detail needed to carry them out. A bad boss wants good results without wanting to know the difficulties his team members face. There is a notion that a boss is free to hand out all kinds of orders or assignments to his subordinates. But are you sure the essence of the assignment is received and understood perfectly by the team? When you run into a problem, before blaming them, do you know what you actually expected when you gave the order? Do you know the big picture and every detail of the task? Make sure you understand every aspect of the task you assign, and communicate it clearly. Confirm that your subordinates clearly understand your instructions.

Be wise when facing problems. A bad boss blames his team members, or other factors, when problems arise. Some people hold the misleading belief that a problem is not a problem as long as no one knows it exists. That is why many people, including leaders, try to run away from an issue and hope it will subside on its own. A leader of character, by contrast, expects problems, so that anticipation and solutions can be prepared.

Be open to criticism. A bad boss does not accept criticism, and is even offended when criticized by team members. A leader who uses one-way communication, in other words one who cannot accept criticism from the team, demands that every word be obeyed, and ignores the team's aspirations and input, will only lose their respect. Encourage every member of your team to voice their opinions or criticism of the policies you make. After all, the decision still rests in your hands.

Remember!
All good leaders have gone through an unpleasant formative phase, so accustom yourself to being shaped by your organization.

Chandra Ming (General Manager JobsDB.com)


source: http://id.jobsdb.com/ID/EN/Resources/JobSeekerArticle/bad%20boss?ID=276