Linux | Introduction To Memory Management.

Memory Hierarchy:

The memory hierarchy describes the relationship between the speed, size and cost of memory as a function of its distance from the processor.

[Figure: the memory hierarchy pyramid]

In the diagram above, the peak of the pyramid represents the processor. The registers lie within the processor itself, so they are the closest to it and the fastest to access. But the number of registers that can be included in a processor is limited, because more registers would increase the processor's size and its manufacturing cost. Thus register memory is kept to a minimum.

A level below the registers is the L1, or first-level, cache. In today's processors the L1 cache usually lies on the processor chip itself, although it can also sit outside it.
Cache memory works at a very high speed but is also extremely expensive compared with the other memories available. This high cost is one of the major reasons we cannot put large amounts of cache in a computer even though it is faster.
The general rule of thumb is: the more cache memory available, the faster the processor can work.

After the L1, or first-level, cache, modern systems have an L2, or second-level, cache. This exists mainly because only a limited amount of L1 cache can be built into the chip, where it occupies precious die space.
Historically the L2 cache sat off the chip, although on modern processors it usually sits on the die as well. It is the same kind of memory as the L1 cache and hence extremely fast in operation. Together, L1 and L2 help make the system run fast. Some modern computers also have a third-level, or L3, cache. Cost again restricts how much second-level cache can be included.

All this cache does make the system faster, but it is very limited in size, which is not nearly enough for the computer to function on its own.

The next level of memory is the RAM. RAM stores all the data that the computer needs while it is running. RAM is also fast, but slower than cache and less expensive, so we can afford a relatively large amount of it. More RAM generally makes the system faster, but again price is the main limiting factor on how much RAM can be included in the computer.

All the memories up to and including RAM are volatile, i.e. they lose the data stored in them as soon as power is switched off. This is obviously undesirable, as you would lose all your work.
To store data even when there is no power, secondary memory is used. Secondary memory is the slowest level, but its size is far larger than that of RAM or cache, because its price per byte is much lower.

Let us take a real-life example to understand the hierarchy above.
First, verify the laptop model in Linux:

rupin@localhost:~$ sudo dmidecode |grep Version
[sudo] password for rupin:
Version: Intel(R) Core(TM) i5-4210M CPU @ 2.60GHz
Version: ThinkPad L440
Version: No DPK
Version: Not Available 
SBDS Version: 01.08 
Version: J4ET71WW(1.71)

The first two lines tell you the laptop model and the processor it uses.

To check the L1, L2 and L3 cache sizes (and, further below, the RAM) in Linux, use the following commands:

rupin@localhost:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 60
Stepping: 3
CPU MHz: 800.000
BogoMIPS: 5188.05
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0-3
rupin@localhost:~$ 

For a more detailed view, including the RAM modules:

rupin@localhost:~$ sudo lshw -C memory
*-cache:0 
description: L1 cache
physical id: 2
slot: L1-Cache
size: 32KiB
capacity: 32KiB
capabilities: asynchronous internal write-back instruction
*-cache:1
description: L2 cache
physical id: 3
slot: L2-Cache
size: 256KiB
capacity: 256KiB
capabilities: asynchronous internal write-back unified
*-cache:2
description: L3 cache
physical id: 4
slot: L3-Cache
size: 3MiB
capacity: 3MiB
capabilities: asynchronous internal write-back unified
*-cache
description: L1 cache
physical id: 1
slot: L1-Cache
size: 32KiB
capacity: 32KiB
capabilities: asynchronous internal write-back data
*-memory
description: System Memory
physical id: 5
slot: System board or motherboard
size: 8GiB
*-bank:0
description: SODIMM DDR3 Synchronous 1600 MHz (0.6 ns)
product: M471B1G73QH0-YK0
vendor: Samsung
physical id: 0
serial: 196D91D1
slot: ChannelA-DIMM0
size: 8GiB
width: 64 bits
clock: 1600MHz (0.6ns)
*-bank:1
description: DIMM [empty]
physical id: 1
slot: ChannelB-DIMM0
*-firmware
description: BIOS
vendor: LENOVO
physical id: 37
version: J4ET71WW(1.71)
date: 11/13/2014
size: 128KiB
capacity: 8128KiB
rupin@localhost:~$

You can also use the command below:

sudo dmidecode -t cache -t memory
To check the secondary memory (disk) size:
rupin@localhost:~$ sudo fdisk -l

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors

From the details above, my laptop is a Lenovo ThinkPad L440 with the following configuration:
Intel(R) Core(TM) i5-4210M CPU @ 2.60GHz
L1 cache: 32 KB data + 32 KB instruction
L2 cache: 256 KB
L3 cache: 3 MB
RAM: 8 GB
Secondary memory: 500 GB

Characteristics of Memory Systems:

The main memory, also known as primary memory, is the memory the CPU accesses directly, and it is a combination of RAM (random access memory) and ROM (read only memory).

RAM

The random access memory is a read-write memory, i.e. information can be read from as well as written into this type of memory. It is volatile in nature: the information it contains is lost as soon as the system is shut down unless it is 'saved' to secondary storage for further use. It is basically used to hold programs and data during the computer's operation.

ROM

The read only memory, as the name suggests, contains information that can only be read; you cannot write to this type of memory. It is non-volatile, or permanent, in nature. It is basically used to store permanent programs, such as the firmware or monitor program used to start the machine.

The main memory is a fast memory, i.e. it has a small access time, and its limited capacity is part of what keeps it fast. The main memory holds the programs that are currently being worked on and passes this information to the control unit as and when required. When the CPU wants to access data that is present on a secondary storage device, that data is first transferred to the main memory and then processed.

The main memory is much more costly than the secondary storage devices. Although the ROM IC’s of various computers do not vary much in their capacities, the RAM chips are available in wide ranges of storage capacities. In fact, the capacity of the random access memory is an important specification of a computer.

Different memories can be classified on the basis of these concepts:

1.     Access Mode: the way in which storage locations are reached, for example random, sequential or direct access.

2.     Access time: the average time required to reach a storage location and obtain its content is called access time.

3.     Transfer Rate: the number of characters or words that a device can transfer per second once it has been positioned at the beginning of the record (a short worked example follows this list).

4.     Capacity and cost: the capacity and cost may depend upon the requirement and the budget.
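
As a purely illustrative example of how access time and transfer rate combine (the numbers are made up): if a disk has an average access time of 10 ms and a transfer rate of 50 MB/s, reading a 5 MB record takes roughly 10 ms + 5 MB / (50 MB/s) = 10 ms + 100 ms = 110 ms. For small transfers the access time dominates; for large transfers the transfer rate does.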

[Figure: characteristics of memories]

The main memory has a very low access time and a very high transfer rate. It is limited in capacity and costlier than secondary storage devices.

[Video: why the memory unit exists in a computer]

Contiguous & Non-Contiguous Storage Allocation in Memory Management:

The memory allocation can be classified into two methods contiguous memory allocation and non-contiguous memory allocation.

Definition and Explanation:

The earliest computing systems required contiguous storage allocation, in which each program occupied a single contiguous block of memory. In those systems, multiprogramming was not possible.

In non-contiguous storage allocation, a program is divided into several blocks that may be placed in different parts of main memory. It is more difficult for an operating system to control non-contiguous storage allocation. The benefit is that if main memory has many small holes available instead of a single large hole, the operating system can often load and execute a program that would otherwise need to wait.

Key Differences Between Contiguous and Noncontiguous Memory Allocation

  1. The basic difference between contiguous and non-contiguous memory allocation is that contiguous allocation gives the process one single contiguous block of memory, whereas non-contiguous allocation divides the process into several blocks and places them in different parts of the memory address space, i.e. in a non-contiguous manner.
  2. In contiguous memory allocation the process is stored in a contiguous memory region, so there is no address-translation overhead during execution. In non-contiguous memory allocation there is address-translation overhead during execution, because the blocks of the process are spread across the memory space.
  3. A process stored in contiguous memory executes faster than a process stored in non-contiguous memory.
  4. A common approach to contiguous memory allocation is to divide memory into fixed-sized partitions and allocate each partition to a single process only. In non-contiguous memory allocation, on the other hand, a process is divided into several blocks and each block is placed wherever memory is available.
  5. In contiguous memory allocation the operating system maintains a table indicating which partitions are free and which are occupied. In non-contiguous memory allocation a table is maintained for each process, indicating the base address of each of its blocks placed in the memory space.

Conclusion:

Contiguous memory allocation adds no translation overhead and speeds up process execution, but it increases memory wastage. Non-contiguous memory allocation, in turn, introduces address-translation overhead and somewhat reduces execution speed, but it improves memory utilization. Both allocation methods therefore have pros and cons.

Address Types:

The Linux system deals with several types of addresses, each with its own semantics. Unfortunately, the kernel code is not always very clear on exactly which type of address is being used in each situation, so the programmer must be careful.

The following is a list of address types used in Linux.

User virtual addresses: These are the regular addresses seen by user-space programs.

Physical addresses: The addresses used between the processor and the system’s memory.

Bus addresses: The addresses used between peripheral buses and memory.

Kernel logical addresses: These make up the normal address space of the kernel.

Kernel virtual addresses: Kernel virtual addresses are similar to logical addresses in that they are a mapping from a kernel-space address to a physical address.

Below figure shows how these address types relate to physical memory.

[Figure: address types used in Linux and how they relate to physical memory]

Different kernel functions require different types of addresses. It would be nice if different C types were defined for each address type, but in practice most of them end up as unsigned long values or pointers, so the programmer must keep track of which kind of address each variable holds.
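
To make the distinction a little more concrete, here is a rough, hypothetical kernel-module fragment (not from the original article; the module name and buffer size are made up). kmalloc() returns a kernel logical address, and virt_to_phys() converts it to the corresponding physical address; note that this conversion is only valid for logical addresses, not for vmalloc'ed kernel virtual addresses.

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/io.h>

static int __init addr_demo_init(void)
{
    phys_addr_t phys;
    void *buf = kmalloc(1024, GFP_KERNEL);   /* returns a kernel logical address */

    if (!buf)
        return -ENOMEM;

    /* Logical addresses are mapped linearly, so the conversion is simple arithmetic. */
    phys = virt_to_phys(buf);
    pr_info("addr_demo: logical %p -> physical %pa\n", buf, &phys);

    kfree(buf);
    return 0;
}

static void __exit addr_demo_exit(void)
{
}

module_init(addr_demo_init);
module_exit(addr_demo_exit);
MODULE_LICENSE("GPL");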

Physical Addresses and Pages:

One basic concept in the Linux implementation of virtual memory is the concept of a page. A page is the basic unit of memory with which both the kernel and the CPU deal. The page size varies from one architecture to the next, although most systems currently use 4096-byte (4 KB) pages. The constant PAGE_SIZE (defined in <asm/page.h>) gives the page size on any given architecture.
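
PAGE_SIZE is a kernel-side constant; from user space the same value can be queried at run time with sysconf(). A minimal sketch (standard POSIX calls, nothing specific to this article):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Ask the kernel which page size it is using; on most systems this prints 4096. */
    long page_size = sysconf(_SC_PAGESIZE);

    printf("Page size: %ld bytes\n", page_size);
    return 0;
}

From the shell, getconf PAGE_SIZE prints the same value.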

If you are reading a book, you do not need to have all the pages spread out on a table to work effectively; you only need the page you are currently using. I remember many times in college when I had the entire table top covered with open books, including my notebook. As I was studying, I would read a little from one book, take notes on what I read, and, if I needed more details on that subject, I would go either to a different page or to a completely different book.

Virtual memory in Linux is very much like that. Just as I only need to have open the pages I am working with currently, a process needs to have only those pages in memory with which it is working. Like me, if the process needs a page that is not currently available (not in physical memory), it needs to go get it (usually from the hard disk).

If another student came along and wanted to use that table, there might be enough space for him or her to spread out his or her books as well. If not, I would have to close some of my books (maybe putting bookmarks at the pages I was using). If another student came along or the table was fairly small, I might have to put some of the books away. Linux does that as well. Imagine that the text books represent the unchanging text portion of the program and the notebook represents the changing data.

It is the responsibility of both the kernel and the CPU to ensure that I don’t end up reading someone else’s textbook or writing in someone else’s notebook. That is, both the kernel and the CPU ensure that one process does not have access to the memory locations of another process (a discussion of cell replication would look silly in my calculus notebook). The CPU also helps the kernel by recognizing when the process tries to access a page that is not yet in memory. It is the kernel’s job to figure out which process it was, what page it was, and to load the appropriate page.

It is also the kernel’s responsibility to ensure that no one process hogs all available memory, just like the librarian telling me to make some space on the table. If there is only one process running (not very likely), there may be enough memory to keep the entire process loaded as it runs. More likely is the case in which dozens of processes are in memory and each gets a small part of the total memory. (Note: Depending on how much memory you have, it is still possible that the entire program is in memory.)

Processes generally adhere to the principle of locality of reference: they tend to access the same portions of their code and data over and over again, as well as addresses close to those recently used. The kernel could therefore establish a working set of pages for each process, the pages that have been accessed within the last n memory references. If n is small, the processes may not have enough pages in memory to do their job; instead of letting the processes work, the kernel spends all of its time reading in the needed pages. By the time the system has finished reading in the needed pages, it is some other process's turn, which in turn needs more pages, so the kernel has to read those in as well. This is called thrashing. Large values of n, on the other hand, may mean there is not enough memory for all the processes to run.

The solution is to use a portion of a hard disk as a kind of temporary storage for data pages that are not currently needed. This area of the hard disk is called the swap space or swap device and is a separate area used solely for the purpose of holding data pages from programs.

 

Linux provides a mechanism, vmalloc(), that returns a buffer which is contiguous in kernel virtual address space but may be backed by physically non-contiguous memory.

/usr/src/linux-headers-3.13.0-32-generic/include/linux/vmalloc.h
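
As a rough sketch only (a hypothetical module; the name and the 1 MB size are made up), a kernel module could allocate and free such a buffer like this:

#include <linux/module.h>
#include <linux/vmalloc.h>

static void *big_buf;

static int __init vmalloc_demo_init(void)
{
    /* 1 MB that is contiguous in kernel virtual address space,
       but may be scattered across physical page frames. */
    big_buf = vmalloc(1024 * 1024);
    if (!big_buf)
        return -ENOMEM;

    pr_info("vmalloc_demo: allocated 1 MB at %p\n", big_buf);
    return 0;
}

static void __exit vmalloc_demo_exit(void)
{
    vfree(big_buf);
}

module_init(vmalloc_demo_init);
module_exit(vmalloc_demo_exit);
MODULE_LICENSE("GPL");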

High and Low Memory:

The kernel (on the x86 architecture, in the default configuration) splits the 4 GB virtual address space between user space and the kernel; the same set of mappings is used in both contexts. A typical split dedicates 3 GB to user space and 1 GB to kernel space.

[Figure: the split of the virtual address space between user space and the kernel]

The kernel cannot directly manipulate memory that is not mapped into the kernel’s address space. The kernel, in other words, needs its own virtual address for any memory it must touch directly.

The 2.6 kernel (with an added patch) can support a “4G/4G” mode on x86 hardware, which enables larger kernel and user virtual address spaces at a mild performance cost.

Thus, in a large-memory situation, only the bottom part of physical RAM, called low memory, is mapped directly into the kernel's logical address space; the remainder is high memory, which the kernel must explicitly map (for example with kmap) before it can touch it.

Virtual Memory Areas:

The virtual memory area (VMA) is the kernel data structure used to manage distinct regions of a process’s address space. A VMA represents a homogeneous region in the virtual memory of a process: a contiguous range of virtual addresses that have the same permission flags and are backed up by the same object (a file, say, or swap space).

The memory areas of a process can be seen by looking in /proc/<pid>/maps

# cat /proc/1/maps 
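
To watch a VMA appear, you can create a mapping yourself with mmap() and then list the process's memory areas. The following user-space sketch is illustrative only:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Create a new anonymous mapping: it shows up as one more VMA in /proc/self/maps. */
    size_t len = 16 * 4096;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("New mapping at %p in process %d\n", p, (int)getpid());

    /* Dump this process's memory areas; each line is one VMA. */
    char cmd[64];
    snprintf(cmd, sizeof(cmd), "cat /proc/%d/maps", (int)getpid());
    system(cmd);

    munmap(p, len);
    return 0;
}

Each line printed from /proc/<pid>/maps corresponds to one VMA, with its address range, permission flags and backing object.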

What is Virtual Memory?

Virtual memory is a system that uses an address mapping: it maps each process's virtual address space onto physical resources.
– Maps virtual addresses to physical RAM
– Maps virtual addresses to hardware devices (memory-mapped I/O)

Advantages:

 

  • Each process can have a different memory mapping – one process's RAM is inaccessible (and invisible) to other processes (see the short fork() sketch after this list).
  • Built-in memory protection – Kernel RAM is invisible to user space processes
  • Memory can be moved, and memory can be swapped out to disk.
  • Hardware device memory can be mapped into a process’s address space.
    – Requires the kernel to perform the mapping.
  • Physical RAM can be mapped into multiple processes at once.
    – Shared memory
  • Memory regions can have access permissions.
    – Read, write, execute
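
The isolation between per-process mappings can be seen with an ordinary fork(); the sketch below is illustrative only (the variable name is made up). Parent and child use the same virtual address, yet each writes to its own private copy of the page:

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int counter = 42;   /* lives in this process's data segment */

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: same virtual address, but a private (copy-on-write) physical page. */
        counter = 100;
        printf("child : counter=%d at %p\n", counter, (void *)&counter);
        return 0;
    }
    wait(NULL);
    /* Parent still sees 42: the child's write went to its own copy of the page. */
    printf("parent: counter=%d at %p\n", counter, (void *)&counter);
    return 0;
}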

As a first step on your system, you can use the free command to get an initial idea of how your RAM is being utilized.

free -g => in gigabytes.

#free -m

[Figure: output of free -m]

The -/+ buffers/cache line shows how much memory is used and free from the perspective of the applications.

Cached:

The Linux page cache ("Cached:" in meminfo) is the largest single consumer of RAM on most systems. Any time you do a read() from a file on disk, that data is read into memory and goes into the page cache. After the read() completes, the kernel has the option of simply throwing the page away, since it is not being used.
However, if you do a second read of the same area of the file, the data will be read directly out of memory and no trip to the disk is needed. This is an enormous speed-up, and it is the reason Linux uses its page cache so extensively: it is betting that after you access a page on disk once, you will soon access it again.

dentry/inode caches:

Each time you do an 'ls' (or any other operation: open(), stat(), etc.) on a filesystem, the kernel needs data that lives on the disk. The kernel parses this on-disk data and puts it into filesystem-independent structures, so that it can be handled in the same way across all the different filesystems. In the same fashion as the page cache in the examples above, the kernel has the option of throwing these structures away once the 'ls' has completed. However, it makes the same bet as before: if you read it once, you're bound to read it again. The kernel stores this information in several "caches" called the dentry and inode caches. Dentries are common across all filesystems, but each filesystem has its own cache for inodes.

Swapping:(kernel 2.6):

A process normally runs in physical memory, which is divided into sets of pages. A page is a 4 KB area of memory and is the basic unit of memory with which both the kernel and the CPU deal. There may come a point when all the pages in physical memory are full. In that case, inactive pages in physical memory are moved to secondary storage, the swap space, using the paging technique. This frees pages in physical memory, which can then be used by new processes. This whole mechanism is called swapping. Swapping is useful because it gives you additional room for data and programs when physical memory runs out, but accessing a hard disk is orders of magnitude slower than accessing memory. In simple terms, virtual memory is the logical combination of RAM and swap space used by running processes. Virtual memory is a memory management technique, implemented with both hardware and software, that gives an application program the impression that it has a contiguous working memory (an address space).

Swap space:
This is space on the hard disk that the operating system uses to store data pages that are not currently needed. The swap space can be either a dedicated partition or a swap file.

Amount of RAM in the system      Recommended amount of swap space
4 GB of RAM or less              a minimum of 2 GB of swap space
4 GB to 16 GB of RAM             a minimum of 4 GB of swap space
16 GB to 64 GB of RAM            a minimum of 8 GB of swap space
64 GB to 256 GB of RAM           a minimum of 16 GB of swap space
256 GB to 512 GB of RAM          a minimum of 32 GB of swap space

Swappiness:

Swappiness is a kernel parameter that controls the degree to which the kernel prefers to swap when freeing memory. It can be set to values on a scale from 0 to 100. A low value means the kernel will try to avoid swapping as much as possible, unless there is almost no free memory left in RAM for new processes. A higher value, on the other hand, makes the kernel swap pages out of physical memory more aggressively. The default value on Linux machines is 60. Using a very high value can affect the system negatively, because servicing an application's requests from the hard disk (swap space) is far slower than servicing them from physical memory, so aggressively moving active pages to swap space should be avoided.

To check the current swappiness value

$ cat /proc/sys/vm/swappiness
60

To change the value

echo 40 > /proc/sys/vm/swappiness
To make the change permanent, add vm.swappiness = 40 to /etc/sysctl.conf, then reload and verify:
sysctl -p
sysctl -a | grep swappiness
vm.swappiness = 40

OR

 sysctl -w vm.swappiness=30
 echo 30 >/proc/sys/vm/swappiness

Difference between cache and buffer :

Both cache and buffer are temporary storage areas, but they differ in many ways. A buffer is mainly found in RAM and acts as an area where the CPU can store data temporarily, for example data meant for other output devices, mainly when the computer and those devices run at different speeds. This way the computer can move on to other tasks. A cache, on the other hand, is a high-speed storage area that can be part of main memory or of some other separate storage area such as a hard disk. These two methods of caching are referred to as memory caching and disk caching respectively.

1. Cache is a high-speed storage area, while a buffer is a normal storage area in RAM used for temporary storage.
2. Cache is made from fast static RAM, whereas a buffer uses the slower dynamic RAM.
3. A buffer is mostly used for input/output processes, while a cache is used for reading from and writing to the disk.
4. A cache can also be a section of the disk, while a buffer is only a section of RAM.
5. A buffer can be used by a keyboard to allow editing of typing mistakes, while a cache cannot.

Shared Memory:

One of the simplest interprocess communication methods is using shared memory. Shared memory allows two or more processes to access the same memory as if they all called malloc and were returned pointers to the same actual memory. When one process changes the memory, all the other processes see the modification. Shared memory is the fastest form of interprocess communication because all processes share the same piece of memory. Access to this shared memory is as fast as accessing a process's non-shared memory, and it does not require a system call or entry to the kernel. It also avoids copying data unnecessarily.

The ipcs command provides information on interprocess communication facilities, including shared segments. Use the -m flag to obtain information about shared memory. For example, the output below shows that one shared memory segment, numbered 1627649, is in use:

% ipcs -m

------ Shared Memory Segments -------
key shmid owner perms bytes nattch status
0x00000000 1627649 user 640 25600 0

Also, for multiple processes to use a shared segment, they must make arrangements to use the same key.

Shared memory segments permit fast bi-directional communication among any number of processes. Each user can both read and write, but a program must establish and follow some protocol for preventing race conditions such as overwriting information before it is read. Unfortunately, Linux does not strictly guarantee exclusive access even if you create a new shared segment with IPC_PRIVATE.
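
A minimal System V shared memory sketch (illustrative only, with error handling trimmed): it uses IPC_PRIVATE, so the parent and the forked child inherit the same segment instead of agreeing on a key:

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* Create a 4 KB segment; both parent and child will attach to it. */
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    char *mem = shmat(shmid, NULL, 0);

    if (fork() == 0) {
        strcpy(mem, "hello from the child");   /* write through the shared mapping */
        shmdt(mem);
        return 0;
    }
    wait(NULL);
    printf("parent read: %s\n", mem);          /* parent sees the child's write */
    shmdt(mem);
    shmctl(shmid, IPC_RMID, NULL);             /* remove the segment when done */
    return 0;
}

A real application sharing memory between unrelated processes would instead derive a common key, for example with ftok(), and must still add its own synchronization to avoid race conditions.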

Displaying memory and cache:

cat /proc/meminfo

Displaying /proc/meminfo will tell you a lot about the memory on your Linux computer.

 top

The top tool is often used to look at the processes consuming the most CPU, but it also displays memory information on lines four and five (which can be toggled by pressing m).

The difference among VIRT, RES, and SHR in top output:

VIRT stands for the virtual size of a process, which is the sum of the memory it is actually using, memory it has mapped into itself (for instance the video card's RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes. VIRT represents how much memory the program is able to access at the present moment.

RES stands for the resident size, which is an accurate representation of how much actual physical memory a process is consuming. (This also corresponds directly to the %MEM column.) This will virtually always be less than the VIRT size since most programs depend on the C library.

SHR indicates how much of the VIRT size is actually sharable (memory or libraries). In the case of libraries, it does not necessarily mean that the entire library is resident. For example, if a program only uses a few functions in a library, the whole library is mapped and will be counted in VIRT and SHR, but only the parts of the library file containing the functions being used will actually be loaded in and be counted under RES.
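
A quick way to see the difference yourself (an illustrative sketch, not from the original post): map a large anonymous region and never touch it. VIRT grows by the full size, while RES barely changes, because pages that are never accessed are never faulted in:

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Reserve 1 GB of address space without touching it. */
    size_t len = 1024UL * 1024 * 1024;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("Mapped 1 GB at %p; compare VIRT and RES with: top -p %d\n", p, (int)getpid());
    pause();   /* keep the process alive so you can inspect it in top */
    return 0;
}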

How to Clear RAM Memory Cache, Buffer and Swap Space on Linux:

How to Clear Cache in Linux?:

Every Linux System has three options to clear cache without interrupting any processes or services.

1. Clear PageCache only.

 $ sync; echo 1 > /proc/sys/vm/drop_caches

2. Clear dentries and inodes.

$ sync; echo 2 > /proc/sys/vm/drop_caches

3. Clear PageCache, dentries and inodes.

$ sync; echo 3 > /proc/sys/vm/drop_caches

sync flushes dirty file system buffers to disk first, so that the caches can be dropped safely.

How to Clear Swap Space in Linux?:

If you want to clear Swap space, you may like to run the below command.

$ swapoff -a && swapon -a

Creating a swap partition:

Swap File:

dd if=/dev/zero of=/smallswapfile bs=1024 count=4096

4096+0 records in
4096+0 records out

mkswap /smallswapfile

Setting up swapspace version 1, size = 4190 kB

swapon /smallswapfile
cat /proc/swaps | grep -i smallswapfile

/smallswapfile file 4088 0 -3

Mount permanently in /etc/fstab
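
The entry would look roughly like this (using the example swap file created above; adjust the path for your own system):

/smallswapfile    none    swap    sw    0    0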

Create swap partition:

mkswap /dev/sda5
swapon /dev/sda5

Get UUID from blkid command

blkid | grep -i swap

/dev/sda5: UUID="6db15ff8-d1f4-44da-85f3-e59dc97ecaad" TYPE="swap"

Mount permanently in /etc/fstab

UUID=6db15ff8-d1f4-44da-85f3-e59dc97ecaad none swap sw 0 0

 

Video: Troubleshooting of Memory Management in Linux

 


