Understanding The Memcached Source Code - Event Driven I

slab allocator (I, II, III) is the core module of the cache system, which largely determines how efficient the bottleneck resource, memory, can be utilized. The other 3 parts, namely, LRU algorithm (I , II , III) for entry expiration; and an event driven model (I - this article) based on libevent; and the consistent harsh (not complete) for data distribution, are built around it. In classic multithreading, large amounts of slow and blocking operations, mostly, I/O, can easily drain out available thread resources, which severely constrains the maximum number of requests a server can handle per unit time. More specific, threads are scheduled out and put into sleep in the middle of procedures that contain blocking I/O, despite piling up requests packets queuing within the network stack. In such situation, server side will show low throughput, low CPU saturation and high latency.

Continue Reading →

Understanding The Memcached Source Code - LRU III

slab allocator (I, II, III) is the core module of the cache system, which largely determines how efficient the bottleneck resource, memory, can be utilized. The other 3 parts, namely, LRU algorithm (I, II, III - this article) for entry expiration; and an event driven model (I) based on libevent; and the consistent harsh (not complete) for data distribution, are built around it. In previous posts, we have discussed different facets of an item, i.e., slab, hash map and LRU list as well as their associated (CRUD) methods, which build up the internal procedures and perform client requests after the corresponding commands are parsed by the drive machine. This time we will go through those procedures by issuing telnet commands to a Memcached instance and see how the discussed modules work together on various item operations. We will also see the whole picture of LRU lists that maintain the property of ‘least recently used’ in accordance to those operations.

Continue Reading →

Understanding The Memcached Source Code - LRU II

slab allocator (I, II, III) is the core module of the cache system, which largely determines how efficient the bottleneck resource, memory, can be utilized. The other 3 parts, namely, LRU algorithm (I , II - this article , III) for entry expiration; and an event driven model (I) based on libevent; and the consistent harsh (not complete) for data distribution, are built around it. More often than not, the LRU algorithm is combined with a hash map, and is referred to as a LRU cache. In a LRU-cache, the hash map enables fast accessing of cached objects; and LRU avoids the cache to grow infinitely by marking expired, or so called, least recently used objects. This time we examine the memcached‘s implementation of hash map.

Continue Reading →

Understanding The Memcached Source Code - LRU I

slab allocator (I, II, III) is the core module of the cache system, which largely determines how efficient the bottleneck resource, memory, can be utilized. The other 3 parts, namely, LRU algorithm (I - this article , II , III) for entry expiration; and an event driven model (I) based on libevent; and the consistent harsh (not complete) for data distribution, are built around it. More often than not, the LRU algorithm is combined with a hash map, and is referred to as a LRU CacheIn a LRU-cache, the hash map enables fast accessing of cached objects; and LRU avoids the cache to grow infinitely by marking expired, or so called, least recently used objects. Next we look at how LRU works from a high level standpoint.

Continue Reading →

Understanding The Memcached Source Code - Slab III

slab allocator (I, II, III - this article) is the core module of the cache system, which largely determines how efficient the bottleneck resource, memory, can be utilized. The other 3 parts, namely, LRU algorithm (I , II , III) for entry expiration; and an event driven model (I) based on libevent; and the consistent harsh (not complete) for data distribution, are built around it. Last time we saw the memory allocating process, which further formulates slabs and the derivative “free lists” (a.k.a., slots). This time we will examine how to take advantage of the established data structures to “slab allocate / release” memory chunks which will be used to store items.

Continue Reading →

Understanding The Memcached Source Code - Slab II

slab allocator (I, II - this article , III) is the core module of the cache system, which largely determines how efficient the bottleneck resource, memory, can be utilized. The other 3 parts, namely, LRU algorithm (I , II , III) for entry expiration; and an event driven model (I) based on libevent; and the consistent harsh (not complete) for data distribution, are built around it. This time we continue examining how slabs memory is allocated. Firstly we look at the two arguments for slabs_init, which were passed over in the previous article. The first one is settings.maxbytes. It limits the overall memory that can be used by the memcached instance. In slabs_init, the value of settings.maxbytes is assigned to the global variable mem_limit which will be used very soon.

Continue Reading →

Understanding The Memcached Source Code - Slab I

slab allocator (I - this article , II , III) is the core module of the cache system, which largely determines how efficient the bottleneck resource, memory, can be utilized. The other 3 parts, namely, LRU algorithm (I , II , III) for entry expiration; and an event driven model (I) based on libevent; and the consistent harsh (not complete) for data distribution, are built around it. Variants of slab allocator is implemented in other systems, such as nginx and Linux kernel, to fight a common problem called memory fragmentation. And this article will, of course, focus on Memcached‘s implementation of the algorithm. memcached version: 1.4.28 Firstly, let’s answer some questions.

Continue Reading →

setsockopt, TCP_NODELAY and Packet Aggregation I

Latency, instead of throughput, is found as the system bottleneck more often than not. However, the TCP socket enables a so-called nagle algorithm by default, which delays an egress packet in order to coalesces it with one that could be sent in the future, into a single TCP segment. This effectively reduces the number of TCP segments and the bandwidth overhead used by the TCP headers, whilst potentially imposes latency for every network request (response) being sent. Lock, and his temperamental brother, Block, are the two notorious villains in the world of programming. In the beginning, they always show up to assist. But sooner or later, they will kick your back-end like really hard. When I consider about nagle algorithem, it seems to me another scenario involving block operations which are meant to be helpful. So I decide to put hands on a keyboard to test if I am wrong. Software setupClient OS: Debian 4.9.88Server OS (LAN & WAN): Unbutu 16.04gcc: 6.3.0 Hardware (or VM) setupServer (LAN): Intel® Core™2 Duo CPU E8400 @ 3.00GHz × 2, 4GBServer (WAN): t2.micro, 1GB

Continue Reading →

setsockopt, SO_KEEPALIVE and Heartbeats

There are two end purposes for sending heartbeats through a persistent connection. For a back-end application, heartbeats are generally used to detect an absent client, so as to drop a connection and release the associated resources; for a client, on the contrary, it is to prevent connection resources stored within intermediate nodes being released (such as a NAT router), SO as to KEEP the connection ALIVE. This article will examine how to configure the four socket options, SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT with setsockopt() to send heartbeats; and discuss the practice of keep-alive heartbeats in general. Experiment setting:OS: Unbutu 16.04gcc: 5.4.0

Continue Reading →

Understanding The React Source Code - UI Updating (DOM Tree) IX

Last time we went through the process from setState() to the updating of a single DOM. We also analyzed the diffing algorithm, which is far from complete as the algorithm is designed for tasks that are much more complex than updating a single DOM node. This time we are going to use two examples to examine the diffing algorithm more in depth. More specific, we look at how the algorithm deals with a mutating DOM tree. N.b., the examples used in this article are derived from the official document which also provides a high level description of the diffing algorithm. You might want to read it first if the topic does not seem very familiar.

Continue Reading →