Learning Linux - Part 3

Master the Linux file system, disk mounting, permissions, and Apache web server setup. A complete hands-on tutorial with Terraform infrastructure as code.


Introduction

In this part of the Linux learning series, we will cover some fundamental concepts and tasks that are essential for working with Linux systems. We will explore the Linux file system, how to mount drives in Linux, and manage Linux permissions. We will be working on a web server as we move along.

Prerequisites

Before we begin, ensure you have the following prerequisites in place:

  1. An Azure Subscription and an account with sufficient permissions to create resources. Contributor role is sufficient for this lab. More granular roles are recommended in production.
  2. Terraform CLI installed on your local machine.
  3. Azure CLI installed on your local machine.
  4. Basic understanding of command line operations.
  5. Some prior knowledge in Terraform and Azure.
  6. Git knowledge is needed for version controlling your Terraform code. It is time to start using version control if you haven't already.
  7. A Storage Account in Azure for Terraform state management.
  8. Git installed on your local machine.

Setting Up the Environment

We are continuing from the previous part of the series. If you haven't completed it, please do so before proceeding. If you haven't cleaned up the environment from Part 2, you can reuse the existing resources. Otherwise, follow steps 1 to 5 from Learning Linux - Part 2.

The Linux File System

Before we start the lab, let's understand the Linux file system structure. Linux uses a hierarchical file system structure, starting from the root directory (/). Here are some key directories:

  • /: The root directory, the top of the file system hierarchy.

  • /bin: Contains essential binary executables.

  • /etc: Contains configuration files for the system.

  • /home: Contains user home directories.

  • /var: Contains variable data files, such as logs and databases.

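  • /usr: Contains most of the programs, libraries, and documentation shipped with the distribution. Software you install yourself typically goes in /usr/local.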

  • /tmp: Contains temporary files.

  • /mnt or /media: Used for mounting filesystems, such as external drives.

  • /dev: Contains device files, which are abstract representations of hardware devices.

  • /lib: Contains shared libraries and kernel modules.

  • /opt: Contains optional software packages.

  • /sbin: Contains system binary executables, typically for administrative tasks.

  • ~: Not a directory under /, but shell shorthand for the home directory of the current user (for example /home/<username>).

    Understanding this structure is crucial for navigating and managing files in a Linux environment.
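To get a feel for the layout, a short loop like the one below can report what each of the directories listed above actually is on your system. This is only a minimal sketch: the helper name describe_path is made up for this walkthrough, and the output depends on your distribution.

```shell
#!/usr/bin/env bash
# describe_path is a hypothetical helper for this walkthrough.
# It reports whether a path is a symlink, a directory, or missing.
describe_path() {
  local path="$1"
  if [ -L "$path" ]; then
    # readlink prints the target the symlink points to
    echo "$path -> symlink to $(readlink "$path")"
  elif [ -d "$path" ]; then
    echo "$path -> directory"
  else
    echo "$path -> not present"
  fi
}

# Walk through the key directories from the list above.
for dir in / /bin /etc /home /var /usr /tmp /mnt /media /dev /lib /opt /sbin; do
  describe_path "$dir"
done
```

On distributions with a merged /usr you will likely see /bin, /sbin and /lib reported as symlinks rather than directories.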

A few remarks on the File System

  • Linux is case-sensitive, meaning File.txt and file.txt are considered different files.
  • /bin is a symlink to /usr/bin on most modern Linux distributions, so both paths lead to the same files.
  • /bin contains essential command line tools that are required for the system to boot and run, as well as tools included in the distribution for all users.
  • Best practice: /usr/local/bin is where software you install yourself should go. For example, shell scripts that you write yourself should go here.
  • Configuration files in /etc are usually plain text files that can be edited with a text editor.
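Two of these remarks are easy to try out safely. The sketch below works in a throwaway temporary directory, so nothing on the system is touched; the file contents and variable names are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory that is removed when the script exits.
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

# Two names that differ only in case are two separate files.
echo "upper" > "$workdir/File.txt"
echo "lower" > "$workdir/file.txt"
file_count="$(find "$workdir" -type f | wc -l)"
echo "Files created: $file_count"

# Check whether /usr/local/bin is on the PATH of the current user.
case ":$PATH:" in
  *:/usr/local/bin:*) on_path="yes" ;;
  *)                  on_path="no" ;;
esac
echo "/usr/local/bin on PATH: $on_path"
```

On a typical Linux file system the script reports two files. If /usr/local/bin is on your PATH, anything you place there can be run by name from any directory, which is what we rely on later in this lab.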

Starting the Lab

We will start by updating package sources and installing a few packages that we will need later on.

  1. Update package sources and upgrade existing packages: sudo apt update && sudo apt upgrade -y
  2. Make sure to restart services when prompted.
  3. Install Apache web server: sudo apt install apache2 -y
  4. Make sure to restart services when prompted by the Apache2 package.
  5. Install curl: sudo apt install curl -y

Verifying the Apache Web Server installation

Now that we have installed the Apache web server, let's verify that it's running correctly.

  1. Check the status of the Apache service: sudo systemctl status apache2
    • You should see that the service is active (running).
  2. Check if the Apache service is enabled to start on boot: sudo systemctl is-enabled apache2
    • You should see enabled as the output.

Where is the Apache Web Server located in the file system? Let's find out.

  1. The configuration files for Apache are located in the /etc/apache2 directory. /etc is where config files for the system live.
    • You can list the contents of this directory using: ls /etc/apache2
  2. The main configuration file is /etc/apache2/apache2.conf.
    • You can view the contents of this file using: cat /etc/apache2/apache2.conf
  3. The config for the default site is located in /etc/apache2/sites-available/000-default.conf.
    • You can view the contents of this file using: cat /etc/apache2/sites-available/000-default.conf
  4. The document root (where the web files are stored) is located in /var/www/html. The default web site is located here. Data like web sites belongs in /var as it is variable data.
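If you would rather not read the whole file, the document root can be pulled out of the site configuration directly. The sketch below assumes the directive is written as DocumentRoot followed by an unquoted path, as in the default Ubuntu configuration; get_document_root is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# get_document_root is a hypothetical helper: it prints the value of every
# active DocumentRoot directive found in the given Apache config file.
get_document_root() {
  local conf_file="$1"
  # Commented lines start with "#", so their first field never equals "DocumentRoot".
  awk '$1 == "DocumentRoot" { print $2 }' "$conf_file"
}

conf="/etc/apache2/sites-available/000-default.conf"
if [ -r "$conf" ]; then
  get_document_root "$conf"
else
  echo "Config file $conf not found; is apache2 installed?"
fi
```

We will come back to this directive later in the lab when we move the web files to a separate disk.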

Connecting to the Web Server

Now that we have verified that the Apache Web Server is running, let's connect to it.

  1. Find the Public IP address of your VM in the Azure Portal.
  2. Open a web browser and navigate to http://<your-vm-public-ip>.

Whoops, something went wrong!

If you see a message like "This site can’t be reached" or "Unable to connect", it means that your web server is not accessible from the internet. We might have missed something.

Troubleshooting Web Server Connectivity Issues

  1. Check the Apache service status: Ensure that the Apache service is running.
    • Run sudo systemctl status apache2 on your VM. If it's not running, start it with sudo systemctl start apache2.
  2. Check the VM's firewall settings: Ensure that the firewall on your VM allows incoming HTTP traffic (port 80). We have not enabled UFW (Uncomplicated Firewall) on the VM, so this should not be an issue.
  3. Check Azure NSG rules: Ensure that the Network Security Group (NSG) associated with your VM allows incoming traffic on port 80. Found the issue: There are no inbound rules allowing HTTP traffic.
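A quick way to separate "Apache is not listening" from "something in between is blocking" is to test the port from the VM itself. The sketch below is one possible approach, assuming bash (it uses bash's built-in /dev/tcp pseudo-device) and the coreutils timeout command; port_state is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# port_state is a hypothetical helper. It tries to open a TCP connection
# and reports the result without sending any data.
port_state() {
  local host="$1" port="$2"
  # Give up after 3 seconds so a filtered port does not hang the check.
  if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "open"
  else
    echo "closed or filtered"
  fi
}

echo "Port 80 on localhost: $(port_state 127.0.0.1 80)"
```

If port 80 is open locally on the VM but the browser on your own machine still cannot connect, the block sits somewhere between the two, which in this lab points to the NSG.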

Fixing the Connectivity Issue with Terraform

We are learning in a DevOps context, so we will fix the issue using Terraform. We are practicing what we preach, and we want to manage our infrastructure as code. Configuration Drift is a real issue in production environments, and we want to avoid it. Even in a lab, we want to follow best practices.

We will use Git to version control our Terraform code. Version control is essential for safe IaC practices.

  1. Before we start implementing code for another NSG rule, let's create a new branch for our changes. This is good practice to avoid breaking the main branch. Run: git checkout -b feature/http-inbound-nsg-rule

  2. Open ./modules/net/main.tf

  3. Modify the existing azurerm_network_security_group resource to include a new security rule for allowing HTTP inbound traffic on port 80. The updated resource should look like this:

    resource "azurerm_network_security_group" "nsg1" {
      name                = "${var.prefix}-${var.project_name}-nsg-${var.environment}"
      resource_group_name = var.resource_group_name
      location            = var.location
     
      security_rule = [
        {
          # Allow SSH from the Bastion subnet only
          name                                       = "SSH"
          priority                                   = 1001
          direction                                  = "Inbound"
          access                                     = "Allow"
          protocol                                   = "Tcp"
          source_port_range                          = "*"
          destination_port_range                     = "22"
          source_address_prefix                      = ""
          destination_address_prefix                 = "*"
          description                                = ""
          destination_address_prefixes               = []
          destination_application_security_group_ids = []
          destination_port_ranges                    = []
          source_address_prefixes                    = azurerm_subnet.bastion_subnet.address_prefixes
          source_application_security_group_ids      = []
          source_port_ranges                         = []
        },
        {
          # Allow HTTP (port 80) from any source
          name                                       = "http-inbound"
          priority                                   = 1000
          direction                                  = "Inbound"
          access                                     = "Allow"
          protocol                                   = "Tcp"
          source_port_range                          = "*"
          destination_port_range                     = "80"
          source_address_prefix                      = "*"
          destination_address_prefix                 = "*"
          description                                = ""
          destination_address_prefixes               = []
          destination_application_security_group_ids = []
          destination_port_ranges                    = []
          source_address_prefixes                    = []
          source_application_security_group_ids      = []
          source_port_ranges                         = []
        }
      ]
    }
     
  4. Save the file.

  5. Now, let's apply the changes using Terraform. Run the following commands:

    terraform init -backend-config="tflab-linux.tfbackend"
    terraform plan
    terraform apply -auto-approve
  6. If everything goes well, you should see that the changes have been applied successfully.

  7. Now, let's verify that the new NSG rule has been created in the Azure Portal.

    • Navigate to the NSG associated with your Virtual Network and check the inbound security rules.
    • You should see a new rule allowing HTTP traffic on port 80.
  8. Commit your changes in Git and push to your remote repo if you have one.

  9. If you are familiar with best practices in Git, you can create a Pull Request and merge the changes to the main branch. If not, you can merge the changes directly to the main branch.

  10. Finally, clean up the feature branch.
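If the Git part of steps 8 to 10 is new to you, the sequence can be rehearsed in a throwaway repository first. The sketch below is only an illustration of a direct merge without a Pull Request: the temporary repository, the placeholder main.tf content, the identity, and the commit messages are all made up. In the lab you would run the commit, merge, and branch commands in your real Terraform repository and skip the setup.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the rehearsal is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the initial branch "main"
git config user.name "Lab User"          # placeholder identity
git config user.email "lab@example.com"

echo '# placeholder Terraform file' > main.tf
git add main.tf
git commit -q -m "Initial commit"

# Work happens on the feature branch, as in step 1.
git checkout -q -b feature/http-inbound-nsg-rule
echo '# http-inbound rule goes here' >> main.tf
git add main.tf
git commit -q -m "Allow inbound HTTP on port 80"

# Merge back to main and clean up the feature branch (steps 9 and 10).
git checkout -q main
git merge -q --no-ff -m "Merge feature/http-inbound-nsg-rule" feature/http-inbound-nsg-rule
git branch -d feature/http-inbound-nsg-rule
git branch --list
```

With a remote repository you would add git push after the commit, and delete the remote branch as well once the merge is done.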

Let's try connecting to the Web Server again

Now you should see the default Apache web page when you navigate to http://<your-vm-public-ip> in your web browser. Congratulations! You have successfully set up a web server on your Linux VM and made it accessible from the internet.

Linux Disk Mounting

We have a web server up and running, but we are not done yet. This server is a high-maintenance pet which needs lots of troubleshooting and manual work if something goes south. We are going to take baby steps in the bovine direction. We will start by mounting a new drive to the VM. This is a common task in Linux administration. The mounted drive is going to host the web files for our web server.

Create a new disk in Azure

We are using Terraform to create a new disk in Azure. Let's keep grinding that IaC axe. Let's see what drives we currently have on our VM:

  • lsblk -o NAME,HCTL,SIZE,MOUNTPOINT | grep -i "sd"


Version Control

We are starting off this task by creating a new branch for our changes. Run: git checkout -b feature/new-data-disk

Adding the Disk Resource

  1. Open ./modules/compute/main.tf. Add the following 2 resources to the file:

    resource "azurerm_managed_disk" "webapp_data_disk" {
      name                 = "${var.prefix}-${var.project_name}-data-disk-${random_string.main.result}-${var.environment}"
      resource_group_name  = var.resource_group_name
      location             = var.location
      storage_account_type = "Standard_LRS"
      create_option        = "Empty"
      disk_size_gb         = 4
     
      tags = var.tags
    }
     
    resource "azurerm_virtual_machine_data_disk_attachment" "example_disk_attachment" {
      managed_disk_id    = azurerm_managed_disk.webapp_data_disk.id
      virtual_machine_id = azurerm_linux_virtual_machine.ubuntu_vm1.id
      lun                = 0 # Must be unique among the data disks attached to this VM
      caching            = "None"
    }
     
  2. Save the file.

  3. Now, let's apply the changes using Terraform. Run the following commands:

    terraform init -backend-config="tflab-linux.tfbackend"
    terraform plan
    terraform apply -auto-approve
  4. If everything goes well, you should see that the changes have been applied successfully.

  5. Now, let's verify that the new disk has been created in the Azure Portal.

    • Navigate to the Resource Group where your VM is located and check for the new managed disk resource. You should see a new disk with the name you specified in the Terraform code.
  6. Git commit, pull request, merge and clean up the feature branch.

Formatting and Partitioning a New Disk in Linux

  1. Run lsblk -o NAME,HCTL,SIZE,MOUNTPOINT | grep -i "sd". Notice the new disk "sdc" with no partitions. The device name may differ on your VM; if it does, adjust the commands below accordingly.

  2. Create a partition using sudo parted /dev/sdc --script mklabel gpt mkpart xfspart xfs 0% 100%

  3. Format the partition using sudo mkfs.xfs /dev/sdc1

  4. Make the kernel re-read the partition table using sudo partprobe /dev/sdc1, then verify the result using lsblk -o NAME,HCTL,SIZE,MOUNTPOINT | grep -i "sd"

  5. Create a mount point using sudo mkdir -p /var/www/mysite

    Notice that we are creating the mount point in /var/www, which is the default location for web files in Apache. We are following the Filesystem Hierarchy Standard (FHS), which is a standard for file and directory placement in Unix-like operating systems.

  6. Mount the partition using sudo mount /dev/sdc1 /var/www/mysite. Run lsblk again and notice that the mount point now appears next to the partition.

Add to fstab for Persistent Mounting

The mount above is temporary and will be lost after a reboot. To make it permanent, we need to add it to the fstab file.

  1. Get the UUID of the partition using sudo blkid /dev/sdc1

  2. Edit the fstab file using sudo nano /etc/fstab and add the following line at the end of the file:

     UUID=your-uuid-here /var/www/mysite xfs defaults,nofail 0 2
    

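    Replace your-uuid-here with the actual UUID obtained from the previous step. The nofail option lets the VM boot even if the disk is missing. The last two fields are 0 and 2: we do not need to dump the partition (0), and 2 schedules a file system check at boot, after the root file system. For XFS the boot-time check does nothing in practice, so 0 would work as well. Fields can be separated by tabs or spaces; tabs are a common convention because they keep the columns aligned.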

  3. Check the file for mistakes using sudo findmnt --verify, then reboot using sudo reboot, reconnect, and verify that the mount is persistent using lsblk -o NAME,HCTL,SIZE,MOUNTPOINT | grep -i "sd"
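Typing an fstab line by hand is error-prone, so it can help to assemble it from its parts. The sketch below only prints the line and does not touch /etc/fstab. The helper name make_fstab_entry and the all-zero UUID are placeholders for illustration.

```shell
#!/usr/bin/env bash
# make_fstab_entry is a hypothetical helper that assembles the six fstab fields:
# device, mount point, file system type, options, dump flag, fsck pass.
make_fstab_entry() {
  local uuid="$1" mount_point="$2" fs_type="$3"
  printf 'UUID=%s\t%s\t%s\tdefaults,nofail\t0\t2\n' "$uuid" "$mount_point" "$fs_type"
}

# Placeholder UUID. On the VM you could take the real one from:
#   sudo blkid -s UUID -o value /dev/sdc1
example_uuid="00000000-0000-0000-0000-000000000000"
entry="$(make_fstab_entry "$example_uuid" /var/www/mysite xfs)"
echo "$entry"

# Sanity check: this entry should consist of six whitespace-separated fields.
echo "$entry" | awk '{ print "field count:", NF }'
```

Once the printed line looks right, you can paste it into /etc/fstab with the editor as described in step 2.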

Preparing the Web Server

Now the web site files will be stored on the new disk. If the VM dies, we can just attach the disk to a new VM and be up and running in no time. This is a step towards the cattle approach, but we are not going moooo all the way yet. We still have many manual steps, like in this lab. In the future, we will automate more and more steps. But for now, we are happy with our progress.

  1. Let's create a simple web site. We will use curl to download a sample web page.

    cd /var/www/mysite
    sudo curl -o index.html https://raw.githubusercontent.com/Jihillestad/tflab-linux-public/refs/heads/main/www/index.html
    sudo curl -o pettocattle1001.png https://raw.githubusercontent.com/Jihillestad/tflab-linux-public/refs/heads/main/www/pettocattle1001.png
  2. Verify that the files are downloaded using ls -la /var/www/mysite

  3. Edit the Apache default site configuration to point to the new document root.

    sudo nano /etc/apache2/sites-available/000-default.conf

    Change the DocumentRoot directive to:

    DocumentRoot /var/www/mysite
    
  4. Save the file and exit the editor.

  5. Run sudo chown -R www-data:www-data /var/www/mysite to change the ownership of the web files to the Apache user.

  6. Restart the Apache service to apply the changes: sudo systemctl restart apache2

  7. Open a web browser and navigate to http://<your-vm-public-ip>. You should see the sample web page you downloaded in step 1.

We are now running an Apache web server with a simple web site. The web files are stored on a separate disk, which can be easily attached to another VM if needed. We are one step closer to the cattle approach.

Putting Something in the "/bin"

In this example, we will create a small shell script to modify a few things on the Web Site. We will learn a few things about permissions, shell scripting and symlinks.

Linux Permissions

Linux permissions are assigned to 3 different classes of users:

  • Owner: The user who owns the file.
  • Group: The members of the group assigned to the file.
  • Others: Everyone else on the system.

There are 3 different types of permissions:

  • Read (r): Permission to read the file.
  • Write (w): Permission to write to the file.
  • Execute (x): Permission to execute the file.

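Each user class gets three permission bits, one each for read, write, and execute. Together they form a 3-bit binary number, which is written as a single octal digit (r = 4, w = 2, x = 1). The table below shows the permissions we are going to use for our shell script: the owner has full rights, and nobody else has any.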

User Type   R   W   X   Binary   Octal
Owner       1   1   1   111      7
Group       0   0   0   000      0
Others      0   0   0   000      0

We use the octal representation for setting absolute permissions. In this case, the octal representation is 700. This gives us chmod 700 <file> to set the permissions.
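The arithmetic behind the octal digits can be sketched in a few lines of bash. The helper name rwx_to_octal is made up for illustration; it takes the nine-character permission string you see in ls -l output (without the leading file type character) and adds up 4, 2 and 1 for each class.

```shell
#!/usr/bin/env bash
# rwx_to_octal is a hypothetical helper: it converts a nine-character
# permission string such as "rwx------" into its octal form.
rwx_to_octal() {
  local perms="$1" octal="" i digit triplet
  for i in 0 3 6; do                 # owner, group, others
    triplet="${perms:$i:3}"
    digit=0
    [ "${triplet:0:1}" = "r" ] && digit=$((digit + 4))
    [ "${triplet:1:1}" = "w" ] && digit=$((digit + 2))
    [ "${triplet:2:1}" = "x" ] && digit=$((digit + 1))
    octal="${octal}${digit}"
  done
  echo "$octal"
}

rwx_to_octal "rwx------"   # prints 700 (owner only, as in the table)
rwx_to_octal "rwxr-xr-x"   # prints 755
rwx_to_octal "rw-r--r--"   # prints 644

# Apply 700 to a temporary file and look at the result.
demo_file="$(mktemp)"
chmod 700 "$demo_file"
ls -l "$demo_file"
rm -f "$demo_file"
```

The ls -l line for the temporary file should start with -rwx------, which is the same permission set as in the table above.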

Creating the Shell Script

  1. Make a new folder in your home directory to store your scripts: mkdir -p ~/scripts

  2. Create a new shell script file: nano ~/scripts/update_website.sh

  3. Add the following content to the file:

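    #!/usr/bin/env bash
    # Replace a phrase on the sample page, then restart Apache.
    # Run with sudo: the web files are owned by www-data and systemctl needs elevated rights.
     
    sed -i 's/good for you/the third worst thing in the universe!/g' '/var/www/mysite/index.html'
    systemctl restart apache2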
  4. Save the file and exit the editor.

  5. Make the script executable: chmod 700 ~/scripts/update_website.sh

  6. Symlink the script to /usr/local/bin for easy access: sudo ln -s ~/scripts/update_website.sh /usr/local/bin/update_website

  7. Verify the symlink: ls -la /usr/local/bin | grep update

  8. Change to another directory, for example cd ~

  9. Run the script: sudo update_website

  10. Open a web browser and navigate to http://<your-vm-public-ip>. You should see that the text on the web page has changed.

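Because sed -i edits the file in place, it can be worth previewing a substitution first. The sketch below rehearses both the substitution and the symlink in a temporary directory. The HTML snippet is a stand-in and not the real content of index.html, and the variable names are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory that is removed when the script exits.
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

# Stand-in for /var/www/mysite/index.html; the real page content will differ.
echo '<p>Linux is good for you</p>' > "$workdir/index.html"

# Without -i, sed writes the result to stdout and leaves the file untouched.
preview="$(sed 's/good for you/the third worst thing in the universe!/g' "$workdir/index.html")"
echo "$preview"

# A symlink is just a pointer; readlink shows where it leads.
ln -s "$workdir/index.html" "$workdir/page"
link_target="$(readlink "$workdir/page")"
echo "page -> $link_target"
```

The same idea applies to the real script: run the sed command without -i against /var/www/mysite/index.html to see the result before committing to the change.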

Conclusion

Congratulations on completing Part 3 of the Linux learning series! This was a big one: we have worked our way through a lot of new concepts and tasks. We have set up a web server, mounted a new disk, and created a simple shell script to modify the web site. We have also learned about the Linux file system, permissions, and symlinks. The train is slowly moving forward, and we are getting more and more proficient in Linux administration.

In this lab, we have also managed our infrastructure like DevOps practitioners should. We have used Terraform IaC to create and manage our resources in Azure, and we have used Git for version control. This is the way to go if we want to avoid configuration drift and ensure consistency in our environments.

In the near future, we will automate more and more tasks, and we will move closer to the cattle approach. Stay tuned for more content to come in the world of Linux and DevOps!