A Brief Introduction to Linux

Introduction

Linux is a Unix-like operating system built under the model of free and open-source development and distribution. Linux distributions often come with a graphical user interface (GUI), but they can also be operated from the command line interface (CLI), also called the terminal. The CLI is a text-based interface: instead of pointing and clicking with your mouse, you type commands. It can be intimidating at first, but once you have mastered the basics, it is really not that different from using your mouse!

It is important to know how to use the terminal, as most servers and most bioinformatics tools do not have a GUI and rely on the terminal.

Warning

macOS and Linux differ significantly from Windows in their command-line syntax. Where we do things locally, we will point out some of the differences, but for the most part, we will provide Windows users with a local Linux platform so that this course can run as smoothly as possible.

For this course, we do not expect you to be masters of Linux, but you will need to know how to find files and how to use some other basic Linux commands.

Filesystem Architecture

Linux uses a hierarchical filesystem, similar to Windows and macOS. The figure below, from TecAdmin, shows a representation of this hierarchy.

Filesystem

We can see here that the root or / folder is at the top of the hierarchy, with all other folders, like home/ and var/ inside of it.

We use / at the end of a folder name to show that it is a folder.
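
As a small sketch of how this looks in a terminal, the two commands below print where you currently are and list what sits directly inside the root folder. The exact folder names you see will depend on your system.

```shell
# Print the current working directory (always an absolute path starting at /).
pwd

# List the contents of the root folder; -p appends / to every directory name,
# matching the convention used in this text.
ls -p /
```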

There are two different ways to describe where a file is within the filesystem: the absolute path and the relative path. The absolute path starts at the root folder and is valid no matter where you currently are. Whether your terminal is open in /usr/bin/something/ or /var/tmp/, the file will always be located at /usr/Documents/sequence.fasta, as that is its true, or absolute, location. The relative path, as the name suggests, is relative to where you currently are in the file tree. If your terminal is open in /usr/Documents/paper/figures/, the file from before, sequence.fasta, is two folders up from where you are (../../sequence.fasta). If you were in /var/bin/something/, it would be three folders up, to the root, and then two folders down, into usr/ and then Documents/ (../../../usr/Documents/sequence.fasta).
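
The example above can be tried out safely with the sketch below. It assumes nothing about your real filesystem: it rebuilds the same folder layout inside a throwaway sandbox folder (the variable name sandbox and the file contents are made up for illustration), so "$sandbox" plays the role of the root folder /.

```shell
# Build a throwaway copy of the example tree so nothing real is touched.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/usr/Documents/paper/figures" "$sandbox/var/bin/something"
printf '>seq1\nACGT\n' > "$sandbox/usr/Documents/sequence.fasta"

# Absolute path: the same path works from any working directory.
cd "$sandbox/var/bin/something"
head -n 1 "$sandbox/usr/Documents/sequence.fasta"

# Relative path from figures/: two folders up.
cd "$sandbox/usr/Documents/paper/figures"
head -n 1 ../../sequence.fasta

# Relative path from var/bin/something/: three folders up, then two down.
cd "$sandbox/var/bin/something"
head -n 1 ../../../usr/Documents/sequence.fasta
```

All three head commands read the same file; only the way of pointing to it differs.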

It is often good practice to use absolute paths when you set up pipelines. This way, your programs know where to look for the files they are supposed to work on, no matter which folder they are started from.
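
One possible way to follow this advice is to convert a relative path into an absolute one before handing it to a pipeline. The sketch below is only an illustration: the variable names input_rel and input_abs are hypothetical, and the current folder (.) stands in for whatever input folder you would really use.

```shell
# Hypothetical relative path to an input folder, as you might type it.
input_rel="."

# Resolve it once: change into the folder in a subshell and print its
# absolute location. The surrounding shell's working directory is unchanged.
input_abs="$(cd "$input_rel" && pwd -P)"

echo "Pipeline input folder: $input_abs"
```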

Connecting to a Server

There are many different ways you can connect to a server with SSH. At SLUBI, we have a strong preference towards Visual Studio Code or its open-source alternative, VSCodium. It provides a graphical user interface where you can create, edit, and view files, and see the file system.

To be able to connect to a server, you need to install an extension called Remote-SSH from the Extensions Marketplace.

Extensions from View

Extensions from the shortcut

Now we need to add the server to our list of SSH hosts. There are several ways of doing this, and all of them lead to the same result: a saved host that we can connect to.

  1. Under the previously shown View option, navigate to the Command Palette. A text box will open at the top, starting with a >. Type Remote-SSH and select the option to add a new SSH host. Here we will input our credentials. This will populate your SSH configuration file, called config, which lives in a hidden folder, .ssh, in your local home directory.
  2. You can also write the host entry manually in the SSH configuration file (.ssh/config). This way, you can also specify SSH keys that may be needed to log on to particular servers. We won't be covering that in this course, though.
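
As a rough sketch of what a manually written host entry can look like, the snippet below writes an example to a scratch file, so your real SSH configuration is left untouched. The nickname course-server, the address server.example.org, and the user name your_username are all placeholders invented for illustration; the real values are the ones shared in the room.

```shell
# Write the example to a scratch file instead of the real ~/.ssh/config.
scratch_config="$(mktemp)"

cat > "$scratch_config" <<'EOF'
# Hypothetical entry: "course-server" is a nickname you choose yourself.
# HostName and User are placeholders for the real address and user name.
# IdentityFile is optional and only needed if the server requires a key.
Host course-server
    HostName server.example.org
    User your_username
    # IdentityFile ~/.ssh/id_ed25519
EOF

# Show what was written.
cat "$scratch_config"
```

With an entry like this in place, the nickname after Host is what appears in the list of hosts you can connect to.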

After a host has been added, you can connect to it. You can do this either through the Command Palette or through the small blue button with the two arrows pointing at each other in the bottom left corner of VSCode. You will be shown a list of saved hosts that you can connect to. If you need to connect with a password, a password prompt will be shown.

Once connected, you can open the file manager on the left.

For privacy reasons, we will share the exact commands locally in the room.

Important

Remember to open only your home directory or the folder for your project with the file manager. If you try to open folders that are too large, you will cause VSCode to crash (and you may cause significant problems for your system administrator!).