
An Introductory Guide to The InterPlanetary File System (IPFS)

I've always found peer-to-peer applications interesting. Central points of failure aren't fun! Protocols like BitTorrent are widely used and well known, but there's something relatively new that uses BitTorrent-like technology and is much more impressive.

What is IPFS?

The InterPlanetary File System (IPFS) is one that caught my eye during research. It's a peer-to-peer, distributed file system with file versioning (similar to git), deduplication, cryptographic hashes instead of file names, and much more. It works very differently from the traditional file systems we've grown to love, and it could even one day replace HTTP.

What's amazing about IPFS is that if you share a file or site on IPFS, the network (anyone else running IPFS) can distribute that file or site globally. Other peers can retrieve the same file or set of files from anyone who has cached it, and they can even fetch it from the closest peer, which works much like a CDN with anycast routing but without any of the complexity.

This has the potential to ensure data on the web can be retrieved faster than ever before and is never lost like it has been in the past. A famous example of that kind of loss is GeoCities: with IPFS, a single entity wouldn't have the ability to shut down thousands of sites the way Yahoo did.

I'm not going to get too far into the complexity of what IPFS can do, though; there is too much to explain in this short blog post. A good breakdown of what IPFS is and can do can be found here.

How to install and begin with IPFS

Starting off, I spun up two VMs from GigeNET Cloud running Debian 9 (Stretch). One in our Chicago datacenter and another in our Los Angeles datacenter.

To get the installation of IPFS rolling we'll go to this page and install ipfs-update, an easy tool for installing IPFS. We're running on 64-bit Linux, so we'll wget the proper tar.gz and extract it. Make sure you always fetch the latest version of ipfs-update!

IPFS distribution download

wget -qO- https://dist.ipfs.io/ipfs-update/v1.5.2/ipfs-update_v1.5.2_linux-amd64.tar.gz | tar xvz

Now let's cd to the extracted directory and run the install script from our cwd (current working directory). Make sure you're running this with sudo or root privileges.

cd ipfs-update/ && ./install.sh

When ipfs-update is installed (it should be very quick), we'll install IPFS itself with:

ipfs-update install latest

The output should look something like this.

ipfs root installation

Now that IPFS is installed, we need to initialize it and generate a keypair, which in turn gives you a unique identity hash. This hash is what identifies your node. Run the following command:

ipfs init

The output should look similar to this.

initializing ipfs node

With this identity hash you can now interact with the IPFS network, but first let's get online. The following command starts the IPFS daemon and, thanks to the trailing &, sends it to the background so you get your shell prompt back. It's probably not advisable to run this as root or with elevated privileges. Keep this in mind!

ipfs daemon &

ipfs daemon

Now that we're connected to the IPFS swarm, we'll try sharing a simple text file. I'll add the file to IPFS, which generates a hash that's unique to that file and becomes its identifier. I'll then pin the file on two servers so that it never disappears from the network as long as those servers are up. Other people running IPFS can also pin your files to help distribute them!

Adding and pinning the file on my Chicago VM.

hello ipfs
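If the screenshot isn't legible, the steps boil down to commands like these (the file name and contents are just examples; the hash printed in your output will differ):

echo "hello IPFS" > hello.txt

ipfs add hello.txt

The ipfs add command prints a line like "added <file-hash> hello.txt". Adding a file also pins it locally by default, so this covers the Chicago node.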

Now that we have the file’s hash from the other VM we can pin it on our VM in Los Angeles to add some resiliency.

ipfs pin add
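The command behind that screenshot is simply the pin command with the hash printed by ipfs add on the first node (shown here as a placeholder):

ipfs pin add <file-hash>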

Now to test this we’ll cat the file from the IPFS network on another node!

ipfs hello cat
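In command form, that test is just (again with the placeholder hash):

ipfs cat <file-hash>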

That was a pretty simple test, but it gives you an idea of what IPFS can do in basic situations. The inner workings of IPFS take some effort to understand, but it's a fairly new technology with a lot of potential.

An Introductory Guide to Traefik with Docker Swarm

As someone interested in following DevOps practices it is my goal to find the best solutions that work with our company’s principles.

At its core, DevOps is a strategy of teaming administrators and developers into a single unit with a common goal: faster deployments and better standards through automation. To make this strategy work, it's essential to continuously explore new tools that can potentially provide better orchestration, deployments, or coding practices.

In my search for a more programmable HTTP load balancer I spotted Traefik. Intrigued by its various features and backend support for Docker, I quickly spun up a VM and jumped straight into learning how to integrate Traefik with Docker.

My first glance at the file-based configuration for Traefik had me a little uneasy. It was the first time I had encountered TOML formatting; it wasn't the YAML I see in most projects, nor the bracket-based formatting I had encountered in the past with Nginx, Apache, or HAProxy.

Traefik Terminology

Before I jump into setting up the basic demonstration of Traefik with Nginx on Docker, I'll go over the new terminology that Traefik introduces:

  • Entrypoints – The network entry points into the Træfik daemon such as the listening address, listening port, SSL certificates, and basic endpoint routes.
  • Frontend – A routing subset that handles the traffic coming in from the Entrypoints and sends it to a specified backend depending on the traffic's Host header and path.
  • Backend – The configuration subset that sends the traffic from the frontend to the actual webserver. The webserver selected is based on the load balancing technique configured.

A basic demonstration of Traefik with Nginx on Docker

To demonstrate a basic Traefik setup, we will focus only on the file-based configuration of Traefik Entrypoints, because Docker dynamically builds the Frontend and Backend configurations through Traefik's native Docker Swarm integration.

Let's start by defining two basic Entrypoints with the "defaultEntryPoints" setting. Under this setting we create two Entrypoints labeled 'http' and 'https'; these labels represent the everyday traffic we see in our browsers. Under the "[entryPoints]" section of the configuration we define the entry point labeled 'http' to listen on all interfaces and assign it port 80 for inbound traffic. Under the same Entrypoint label we instruct Traefik to redirect all traffic entering port 80 to our second Entrypoint, labeled 'https'. The Entrypoint labeled 'https' follows the same syntax as 'http', with a slight deviation in how it handles the actual web traffic.

Instead of a redirect, it instructs Traefik to accept traffic on port 443. We also instruct Traefik to use SSL certificates for web encryption and tell it where to load those certificates from. A full example is shown below and can also be found within our GitHub repository.

defaultEntryPoints = ["http", "https"]

[entryPoints]
  [entryPoints.http]
  address = ":80"
    [entryPoints.http.redirect]
    entryPoint = "https"
  [entryPoints.https]
  address = ":443"
    [entryPoints.https.tls]
      [[entryPoints.https.tls.certificates]]
      certFile = "/certs/website.crt"
      keyFile  = "/certs/website.key"

The TOML configuration we have built is all that is required for a dynamic Docker demonstration. Save this file as traefik.toml on each management node in your Docker Swarm, since that is where Traefik will run.

To keep the focus on Traefik we won't go into the setup and configuration of a Docker Swarm cluster; that will be the topic of a future blog post and will be referenced back here when it's published. We will design a basic Docker Compose file to demonstrate the dynamic loading of Traefik with a default Nginx instance. Within that Compose file we will focus on the Traefik image and the Traefik flags required for basic backend load balancing.

Let’s start by downloading the docker-compose.yaml file that can be found on our github.com page. To download the file, you can go to https://github.com/gigenet-projects/blog-data/tree/master/traefikblog1.

The entire code repository can also be cloned with the following command:

[root@dockermngt ~]#  git clone https://github.com/gigenet-projects/blog-data.git

[root@dockermngt ~]#  cd blog-data/traefikblog1

The docker-compose.yaml:

version: '3.3'

services:
  nginx:
    image: nginx
    ports:
      - target: 80
        protocol: tcp
        mode: host
    networks:
      - distributed
    deploy:
      restart_policy:
        delay: 10s
        max_attempts: 10
        window: 120s
      labels:
        traefik.frontend.rule: "Host:traefik,demo.gigenet.com"
        traefik.port: "80"

  traefik:
    image: traefik
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /opt/traefik/certs/:/certs/
      - /opt/traefik/traefik.toml:/traefik.toml
    deploy:
      restart_policy:
        delay: 120s
        max_attempts: 10
        window: 120s
      placement:
        constraints: [node.role == manager]
    networks:
      - distributed
    ports:
      - 80:80
      - 443:443
      - 8080:8080
    command: --docker --docker.swarmmode --docker.domain=demo --docker.watch --web --loglevel=DEBUG

networks:
  distributed:
    driver: overlay

Under the Traefik service section we will focus specifically on the Docker Compose keys labeled volumes, deploy, placement, networks, ports, and command. These have a direct impact on how the Traefik Docker image will operate and need to be configured properly.

Within the volumes key we pass in the Docker socket and the Traefik TOML configuration file we built previously. The Docker socket is used by Traefik for API purposes such as reading the Traefik label flags assigned to other Docker services; we will go into this in further detail in a few steps. The Traefik configuration we built earlier defined both the http and https Entrypoints to give the Traefik container the base configuration it uses when starting for the first time. These configuration values can be overridden with Docker labels, but we will not be going into such advanced configurations in this blog. Since the configuration file focuses on encryption, we also mount our SSL certificates under the /certs directory inside the Traefik container. The Traefik TOML configuration file and SSL certificates should be installed on every Docker management node.

Under the deploy section we focus on the placement key. Traefik requires the Docker socket to get API data, and only manager nodes can provide this data in a Docker Swarm. We constrain the placement of the Traefik container to nodes with the manager role, which enforces this requirement.

The networks key is a must-have for Traefik to work properly. With Docker clusters we can build overlay networks that are internal to just the nodes assigned to the network, which provides containers with network isolation. For the load balancing to work, the Traefik container must be on the same network as every web host we plan to load balance traffic to. In this case we named our network "distributed" and set it to use the overlay driver.

How inbound traffic is passed to the overlay network

The ports key is simple and straightforward. In the Traefik configuration we assigned ports 80 and 443 to take in and forward traffic; each entry maps a published port on the host to the corresponding port inside the container. We also expose port 8080 in this example so we can demonstrate the web dashboard that Traefik provides. A snippet of the dashboard of a live cluster is shown below:

Lastly, the command key passes any additional flags that we did not configure in the Traefik TOML file directly to the Traefik binary at startup. In this demonstration we tell Traefik to use the Docker backend and to enable the dashboard.

Now that we understand the Traefik section of the Docker compose file we can go into detail on how the other services such as Nginx are dynamically connected to the Traefik load balancer. Within the Nginx service we will focus on the ports, networks, and labels flags.

With this specific Nginx container image our web browser will only see the default "Welcome to nginx!" page. Looking at the ports key, you'll notice we open port 80 so that packets are not firewalled off by the Docker management service. In the Traefik service section we mentioned that every backend has to share a network with Traefik, and within our Nginx service you will notice the required "distributed" network has been assigned.

The labels key is the most interesting section of the Nginx service. Within it we set a few labels for Docker to register on the Docker management API; this is how Traefik knows which backend to assign and whether that backend is alive. To keep this demonstration simple, we tell Traefik that the Nginx service has a single virtual host named 'demo.gigenet.com'. We assign this to the Nginx service with the 'traefik.frontend.rule' label, as follows: 'traefik.frontend.rule: "Host:traefik,demo.gigenet.com"'. Notice how the Traefik Frontend is defined within the Docker Compose file, not in the file-based Traefik configuration. With this label set, Traefik can discover every IP address assigned on the overlay network. Traefik also needs to know which port the Nginx service is listening on, and this is done with the "traefik.port" label. In our example we assigned port 80 to "traefik.port", which is also the port we opened for network traffic.

With this configuration explained and ready, it's now time to launch the Docker stack we built and test out the load balancing. To launch the stack, run the following command on a Docker management node.

[root@dockermngt ~]#  docker stack deploy --compose-file=docker-compose.yaml demo

You should now see the "Welcome to nginx" page in your browser when going to the domain name you specified. You can also review the actual load balancing rules by appending :8080 to this domain, as shown in the previous picture.
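If you'd rather verify from the command line before pointing DNS at the cluster, something along these lines should work; MANAGER_IP is a placeholder for one of your Swarm manager addresses, and -k is needed because the demo certificate is unlikely to be trusted:

curl -I -H "Host: demo.gigenet.com" http://MANAGER_IP/          # expect a redirect to HTTPS

curl -k --resolve demo.gigenet.com:443:MANAGER_IP https://demo.gigenet.com/   # expect the Nginx welcome page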

GlusterFS

Introduction and use cases

GlusterFS is a clustered file system designed to increase the speed, redundancy, and availability of network storage. When configured correctly with several machines, it can greatly decrease downtime due to maintenance and failures.

Gluster has a variety of use cases, with most configurations being small three server clusters. I’ve personally used Gluster for VM storage in Proxmox, and as a highly available SMB file server setup for Windows clients.

Configurations and requirements

For the purposes of this demonstration, I’ll be using Gluster along with Proxmox. The two work very well with each other when set up in a cluster.

Before using Gluster, you'll need at least three physical servers or three virtual machines, which I recommend be on separate physical hosts. This is the minimal configuration for highly available storage. Each server will need a minimum of two drives: one for the OS and one for Gluster.

Gluster operates on a quorum based system in order to maintain consistency across the cluster. In a three server scenario, at least two of the three servers must be online in order to allow writes to the cluster. Two node clusters are possible, but not recommended. With two nodes, the cluster risks a scenario known as split-brain, where the data on the two nodes isn’t the same. This type of inconsistency can cause major issues on production storage.

For demonstration purposes, I’ll be using 3 CentOS 7 virtual machines on a single Proxmox server.

There are two ways we can go about high availability and redundancy, one of which saves more space than the other.

  1. The first way is to set up gluster to simply replicate all the data across the three nodes. This configuration provides the highest availability of the data and maintains a three-node quorum, but also uses the most amount of space.
  2. The second way is similar, but takes up ⅓ less space. This method involves making the third node in the cluster into what's called an arbiter node. The first two nodes will hold and replicate data; the third node will only hold the metadata of the data on the first two nodes. This way a three-node quorum is still maintained, but much less storage space is used. The only downside is that your data only exists on two nodes instead of three. In this demo I'll be using the latter configuration, as there are a few extra steps to configuring it correctly.

Configuration

Start by setting up three physical servers or virtual machines with CentOS 7. In my case, I set up three virtual machines with 2 CPU cores, 1GB RAM, and 20GB OS disks. Through the guide, I’ll specify what should be done on all nodes, or one specific node.

glusterfs

All three machines should be on the same subnet/broadcast domain. After installing CentOS 7 on all three nodes, my IP configurations are as follows:

Gluster1: 10.255.255.21

Gluster2: 10.255.255.22

Gluster3: 10.255.255.23

All Nodes:

The first thing we'll do after the install is edit the /etc/hosts file. We want to add the hostname of each node along with its IP into the file; this prevents Gluster from having issues if a DNS server isn't reachable.

My hosts file on each node is as follows:

glusterfs
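If the screenshot isn't visible, the entries simply map the IPs above to the node hostnames, like so:

10.255.255.21 gluster1
10.255.255.22 gluster2
10.255.255.23 gluster3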

All Nodes:

After configuring the hosts file, add the secondary disks to the hosts. I added an 80GB disk to Gluster1 and Gluster2, and a 10GB disk to Gluster3, which will be the arbiter node. If the Gluster nodes are VMs, the disks can simply be added live without shutting down.

Gluster 1&2  Drive configuration:

Gluster 3 Drive configuration:

After adding the disks, run lsblk to ensure they show up on each node:

sda is the OS disk and sdb is the newly added storage disk. We'll want to format and mount the new storage disk for use; for this demo, I'll be using XFS for the storage drive.

fdisk /dev/sdb
n            # create a new partition
p            # make it a primary partition
             # press Enter at the remaining prompts to accept the defaults
             # (partition number, first and last sectors, i.e. the whole disk)
w            # write the partition table and exit fdisk
mkfs.xfs /dev/sdb1

You should now see sdb1 when you run lsblk:

We’ll now create a mount point and add the drive into /etc/fstab in order for it to mount on boot:

My mountpoint will be named brick1, I’ll explain bricks in more detail after we mount the drive.

mkdir -p /data/brick1

After creating the mountpoint directory, we'll need to pull the UUID of the drive; you can do this with blkid:

blkid /dev/sdb1

Copy down the long UUID string, then go into /etc/fstab and add a similar line:

UUID=<UUID without quotes> /data/brick1 xfs defaults 1 2

Save the file, then run mount -a

Then run df -h

You should now see /dev/sdb1 mounted on /data/brick1

Make sure you format and mount the storage drives on each of the three nodes.

Gluster volumes are made up of what are called bricks. These bricks can be treated almost like the virtual hard drives we'd use in a RAID array.

This depiction gives an idea of what a two server cluster with two replicated Gluster volumes would look like:

This is what the Gluster volume we’re creating will look closer to:

Now it’s time to install and enable GlusterFS, run the following on all three nodes:

yum install centos-release-gluster

yum install glusterfs-server -y

systemctl enable glusterd

Gluster doesn't play well with SELinux and the firewall, so we'll disable the two for now. Since connecting to services such as NFS and Gluster doesn't require authentication, the cluster should be on a secure internal network in the first place.

Open up /etc/selinux/config with a text editor and change the following:

SELINUX=enforcing

to

SELINUX=disabled

Save and exit the file, then disable the firewall service:

systemctl disable firewalld.service

At this point, reboot your nodes in order for the SELinux config change to take effect.

On Node Gluster1:

Now it’s time to link all the Gluster nodes together, from the first node, run the following:

gluster peer probe gluster2

gluster peer probe gluster3

Remember to run the above commands using the hostnames of the other nodes, not the IPs.

Now let’s check and see if the nodes have successfully connected to each other:
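The check is done with gluster peer status; on a healthy cluster the output looks roughly like this (the UUIDs are placeholders):

gluster peer status

Number of Peers: 2

Hostname: gluster2
Uuid: <uuid>
State: Peer in Cluster (Connected)

Hostname: gluster3
Uuid: <uuid>
State: Peer in Cluster (Connected)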

We can see that all of our nodes are communicating without issue.

Finally, we can create the replicated Gluster volume. It's a long command, so make sure there aren't any errors:

gluster volume create stor1 replica 3 arbiter 1 gluster1:/data/brick1/stor1 gluster2:/data/brick1/stor1 gluster3:/data/brick1/stor1

  • “stor1” is the name of the replicated volume we’re creating
  • “replica 3 arbiter 1” specifies that we wish to create a three node cluster with a single arbiter node, the last node specified in the command will become the arbiter
  • “gluster1:/data/brick1/stor1” creates the brick on the mountpoint we created earlier, I’ve named the bricks stor1 in order to reflect the name of the volume, but this isn’t imperative.

After running the command, you'll need to start the volume; do this from node 1:

gluster volume start stor1

Then check the status and information:

gluster volume status stor1

gluster volume info stor1:
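For reference, the info output for this layout looks roughly like the following (trimmed, with the volume ID as a placeholder):

Volume Name: stor1
Type: Replicate
Volume ID: <uuid>
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/data/brick1/stor1
Brick2: gluster2:/data/brick1/stor1
Brick3: gluster3:/data/brick1/stor1 (arbiter)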

As you can see, the brick created on the third node is specified as an arbiter, and the other two nodes hold the actual data. At this point, you are ready to connect to GlusterFS from a client device.
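As an aside, if your client is a plain Linux machine rather than Proxmox, the volume can also be mounted directly with the GlusterFS FUSE client (this sketch assumes the glusterfs-fuse package is installed):

mkdir -p /mnt/gluster
mount -t glusterfs gluster1:/stor1 /mnt/gluster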

I’ll demonstrate this connection from Proxmox, as the two naturally work very well together.

Ensure that the Gluster node hostnames are in your Proxmox /etc/hosts file or available through a DNS server prior to starting.

Start by logging into the Proxmox web gui, then go to Datacenter>Storage>Add>GlusterFS:

glusterfs proxmox

Then input a storage ID (this can be any name), the first two node hostnames, and the name of the Gluster volume to be used. You can also specify which content types you wish to store on the Gluster volume:

proxmox glusterfs

Don’t worry about the fact that you weren’t able to add the third node into the Proxmox menu, Gluster will automatically discover the rest of the nodes after it connects to one of them.

 

Click add, and Proxmox should automatically mount the new Gluster volume:

glusterfs and proxmox

As we can see, we have a total of 80GB of space, which is now redundantly replicated. This newly added storage can now be used to store VM virtual hard disks, ISO images, backups, and more.

In larger application scenarios, Gluster is able to increase the speed and resiliency of your storage network. Physical Gluster nodes combined with enterprise SSDs and 10/40G networking can make for an extremely high-end storage cluster.

Similarly, GigeNET uses enterprise SSDs with a 40GbE private network for our storage offerings.

How To Migrate EasyApache 4 Profiles Between cPanel Servers

EasyApache is a convenient utility on cPanel servers which allows you to manage much of the important software that powers your web server. It manages your installations of Apache, PHP, and many PHP extensions and modules.

A common issue encountered while migrating websites between servers is differences in the environment on the destination server. For example, you may be migrating a website that uses the mbstring PHP extension, but that extension is not installed on the new server. So, after you migrate the website, it breaks due to the missing extension and you’re left finishing off a stressful server migration by digging around and troubleshooting all of these residual issues.

Of course, there is no way to completely eliminate this problem, but if you are running EasyApache 4, it is a simple matter to migrate your profile from one server to another. This will ensure that your target server has the same Apache and PHP packages available.

What If My Old Server Is Still Running EasyApache 3?

Although not covered under the scope of this guide, there is a migration process from EasyApache 3 to EasyApache 4, and it is very well documented.

If you are running EasyApache 3 and plan to migrate to a server with EasyApache 4 installed, your best bet is to upgrade the old server first so that you can iron out any issues with the EasyApache 4 upgrade separate from the migration.

Let’s Migrate!

1. On the old server, convert your existing settings to a Profile.

    1. Login to your WHM panel and navigate to Software > EasyApache 4.
    2. You will see at the top a section labeled “Currently Installed Packages”. In this section, click “Convert to profile” to create a profile from your existing settings.
    3. Enter a name and filename for your profile that will be meaningful to you, and then click the “Convert” button.
    4. Now your profile will be created!

2. Download your profile from the old server.

1. Scroll down within the EasyApache 4 interface, and your new profile is most likely at the bottom. You can identify it based upon what you named it when you created it.

2. Click the “Download” button to download a copy of the profile onto your computer.

3. Upload your profile to the new server.

1. On the new server, login to your WHM panel and navigate to Software > EasyApache 4.
2. Toward the top, you will find a button that says “Upload a profile”. Click that button to begin the upload process.
3. Browse for the JSON file on your computer which you downloaded from the old server. This will be the filename you entered while creating the profile in step 1.3 above.
4. Click the Upload button to upload this profile to your new server.

4. Provision the profile on the new server.

1. Now that you’ve uploaded the profile, scroll down in EasyApache, and you should find your new profile toward the bottom. You can identify it based upon the name that you entered in step 1.3 above.
2. Click the “Provision” button to apply this profile to your new server.
3. EasyApache 4 will go through the provisioning steps and install all of the software and modules which were copied over from the old server.

You’re done!

Now your new server should have the same PHP versions and modules available, which should greatly reduce your likelihood of encountering any migration headaches!

How To Migrate Websites Between cPanel Servers

Migrating your websites from one server to another can be a difficult and time consuming process, but with preparation and consideration, it can go smoothly.  Throughout my time as a systems administrator, migrations have been some of the most time consuming tasks I’ve worked on. The purpose of this guide is to provide you with the benefit of my experience so that you know what type of issues you may run into, and how you can best pre-emptively avoid those issues.

cPanel has fantastic tools for migrations, and this guide is aimed at users migrating from one cPanel environment to another; however, a lot of the general principles here apply to any server migration.

I will break down your migration into a few phases:

  1. Planning – What you do ahead of time to prepare for the migration.
  2. Preparation – What you do immediately before the migration (the day of).
  3. Migration – The actual process of migrating the data.
  4. Wrap Up – Cleaning things up to finalize the transition.

cPanel’s Migration Tools

cPanel has included tools for backing up most of the information owned by each user account including:

  • Home directory (all of the user’s files)
      • The user’s email files are typically in a Maildir folder inside the home directory.
  • MySQL databases
  • Email accounts
  • Email Forwarders & Filters

Unfortunately, not all of the configuration and modules in WHM have an easy migration path.  If you have a lot of customizations on your server, it would be best to plan to audit this manually and determine what you may need to configure on the new server.

If you’ve made relatively few changes in WHM throughout the course of using the old server, most likely there is not much special to be concerned with.

In WHM, the EasyApache tool is used to provision common server software such as Apache or PHP.  If you have made customizations in EasyApache this information won’t be carried over by your account backups.

If you are running EasyApache 3, you should upgrade to EasyApache 4 first before migrating to a new server. This way, you can isolate any issues that stem from that software upgrade first before you migrate.  This upgrade process is well documented on the cPanel KB. Otherwise you can easily transfer the profile to your new server to retain your current configuration options. 

Planning

These steps and considerations should be reviewed while considering your migration process.  Most of this planning can be done before you have even ordered the new server which you’re planning to migrate to.

  • Clean Up Unused Data

    There is little reason to waste time and effort migrating accounts of former customers, defunct projects, or other no longer used websites.  So, before you begin the migration process, it is worthwhile to spend some time looking through your accounts list to see if there is anything you can eliminate.

    Also, it’s a good practice to look through your larger accounts and see why they are so large.  You may find large useless folders containing things like old backups which you may not need to migrate to your new server.

    Anything that you can eliminate will cut down on the time required for your migration, as well as make sure that you are optimally using your new server resources.

  • Consider IP Address Changes

    While migrating to a new server, there are many cases where you may need to change the IP addresses you are currently using. If you have any websites using one such IP address, they will need the IP changed when they are migrated. If you have a nameserver using an IP that needs to change, you will need to update its GLUE records when it is migrated.

    At GigeNET, and many other providers, the main IP address of your server is tied to a slot location, and thus will need to change if you move to new hardware.   We can however assist with rerouting your routed IP allocations in most cases, as long as the new server is in the same location as the old server.

    This can save time: either you avoid changing IPs during the server migration at all, or you change them beforehand to IPs that can be rerouted. If you plan ahead, you could move your websites onto easily movable IPs before carrying out the server migration.

    If you know that you will need to change IP addresses, review where your DNS is hosted for all of your websites.  If you have websites that are using a third party DNS service (some proxy services like CloudFlare will set this up by default as well), any changes to the IP address of these websites will need to be coordinated with whoever has access to those services.  Otherwise, you will want to get access to them beforehand.

    In the case that you are changing IP addresses, it is a good idea to lower the TTL of the DNS records ahead of time.  The TTL, or time to live, is a value which determines how long DNS servers around the Internet will remember (or cache) the information stored in a particular DNS record.  Lowering this value ahead of time can allow for faster DNS propagation.

  • Consider the Software Version Changes

    While migrating to the new server, you will presumably be setting up shop on a server with a freshly installed operating system.  So, most likely that server will be running the latest versions of available software. Your old server may not have been.

  • If you are currently running an Apache version older than 2.4, an update to 2.4 may break some .htaccess rules. This is something that you should be aware of beforehand so that you can make sure you know what needs to be fixed. If this upgrade is carried out while incompatible .htaccess rules exist, you may find yourself chasing down 500 Internal Server Errors later on the affected websites.

You can find details on what specifically may need to be updated in your .htaccess files on the Apache documentation: https://httpd.apache.org/docs/trunk/upgrading.html

  • At the time of writing, EasyApache 4 supports PHP versions 5.4, 5.5, 5.6, 7.0, 7.1, and 7.2.  If you are running PHP 5.3 or older currently, you may run into difficulties migrating your websites to a newer version of PHP. Some websites which worked on PHP 5.3 may continue to work fine on future versions. If you are not a developer, then really, the best way to know for sure is to try it. PHP 5.4 and beyond introduce changes to the way that the code executes, and periodically remove or change the way PHP functions work.  So the further away you get from the version of PHP your site was written for, the less likely it will work.
  • Consider your options:
      • You could upgrade the website’s code to be compatible with the new version of PHP.  Ideally, this is what you would want to do. However, this will likely be time consuming as well as outside the scope of your hosting provider’s support.  You may need to enlist a developer for assistance.
      • If your website is a popular web application such as WordPress, Joomla, or something similar; consider upgrading to the latest version, as typically the latest versions of these applications support new versions of PHP.  This is a good security practice and something that you should ideally be doing anyway.
      • If you need to run older versions of PHP, it is possible to safely do so thanks to CloudLinux’s HardenedPHP feature, which you can read more about here:  https://www.cloudlinux.com/hardenedphp
    • CloudLinux is not free, so consider this as a part of the operating cost for your new server if you are not planning to update the affected websites.

Preparation

These steps should be carried out before starting the migration, but relatively soon before the migration (the day of ideally).

  • Check DNS TTLs
    Review any notes from the planning stage above. If you are changing any IP addresses, did you remember to lower the TTL on the DNS records to enable quicker propagation?
  • Maintenance Mode
    As much as possible, you should put your websites into read-only or maintenance mode. Migrations are not instantaneous; there will be some lag time between when you start copying data from the old server and when the site is live on the new server. During that time, your visitors will still see the website from the old server.

If changes are made during this time window, they may be lost since the transfer has already begun.  To combat this and other issues, many popular web applications can be placed into “Maintenance Mode”.  This prevents your visitors from making changes to dynamic aspects of the site.

If you are not going to go ahead with this, please carefully consider the consequences as they may apply to your website:

  1. Blogs may lose comments posted by visitors during the migration.
  2. Forums may lose posts made during the migration.
  3. eCommerce store fronts may lose information about orders placed during the migration.

  • Prevent Missed Emails
    If emails are received on the old server during the migration, they will be delivered to the mailbox on the old server but may never have a chance to be copied to the new server. A simple way to combat this is to disable the Exim mail server on the old server when you start the migration (a quick way to do this is shown just below). If someone tries to send an email to someone on your server during this time, their mail server will detect the connection failure and will try again later.
    In most cases, this results in the emails being delivered to the new server with some delay after the new server is brought online. If you do not complete the migration by the time configured on the sender’s mail server for retry, the sender will receive a bounceback and the email will not be delivered. Either case is better than the email being silently delivered to the old mailbox but never seen by the user.
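On a typical systemd-based cPanel server, stopping Exim is a one-liner; keep in mind that cPanel's service monitor may restart it unless you also disable monitoring for Exim in WHM's Service Manager, so treat this as a sketch to adapt to your environment:

systemctl stop exim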

Migration

This portion of the process will be the most time consuming, but requires very little interaction from you!

There are two techniques commonly used for migrating cPanel accounts:

  • WHM Transfer Tool
    The WHM Transfer Tool provides a simple graphical interface for migrating accounts between servers.  If you are doing a transfer yourself with relatively little server administration experience, this may be your preferred method. The tool is well documented, so I will not go into great detail here.  You can consult this documentation for details on the tool: https://documentation.cpanel.net/display/68Docs/Transfer+Tool
  • Scripted Migration
    Server administrators who want more control over the migration may choose to migrate accounts manually or with a basic script. cPanel’s included scripts can be used to migrate individual accounts, like so:

/scripts/pkgacct username                      # packages up username's account

/scripts/restorepkg /path/to/package.tar.gz    # restores the account contained in package.tar.gz

Using these scripts, it is relatively simple to create a script to package up all of the accounts.  You can do it with a simple for loop in Bash like this:

for i in $(cat users.txt); do /scripts/pkgacct $i; done

For the above to work, you do need to create a users.txt file containing a list of all of the cPanel usernames you wish to migrate.
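One quick way to generate that list on a standard cPanel server is to list the account files (this assumes the stock /var/cpanel/users layout; review the resulting file and remove any accounts you don't want to migrate):

ls /var/cpanel/users > users.txt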

If you run into difficulty packaging an account, you can consult the cPanel documentation on this script.

The pkgacct script supports a variety of options, such as skipping the databases, skipping the home directory, and toggling compression of the archive.  These can come in handy for special cases, such as accounts with massive MySQL databases or home folders, where you would prefer that cPanel not bundle those files up because it is running into problems, taking a long time, or creating an enormous archive; you then have the option to transfer them yourself by other means.

Once the backup packages are created, you can transfer them to your new server using a transfer tool such as scp, sftp, or rsync.
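For example, with rsync, assuming pkgacct wrote its cpmove archives to /home (its default location) and you have root SSH access to the new server (NEW_SERVER_IP is a placeholder):

rsync -avP /home/cpmove-*.tar.gz root@NEW_SERVER_IP:/home/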

The same methodology can be used to restore the packages.  Something like:

for i in $(ls -1 cpmove*.tar.gz); do /scripts/restorepkg $i; done

In general, the WHM Transfer Tool would be most users’ preferred way to complete a migration.  The scripted migration technique is helpful for cases where you need more control or need to be able to troubleshoot any errors created by the migration scripts.

Wrap Up

Once you’ve completed the migration steps above for all of your accounts, it is a good idea to test all of your websites as soon as you can before switching the live sites to the new server.

If the websites are configured on new IP addresses, you can override the DNS locally on your computer by editing your hosts file.  This way, you can visit the websites in your browser and see what they look like on the new server before making them live for everyone.  If you need help with this step, you may find this guide useful. 
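As an illustration, a hosts file entry like the one below (with a placeholder IP and domain) points a site at the new server; the file is /etc/hosts on Linux and macOS, or C:\Windows\System32\drivers\etc\hosts on Windows:

203.0.113.25 example.com www.example.com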

Once you’ve decided that you are ready to bring the sites live on the new server, go ahead and update any residual DNS entries to point the domains to the new server.  If you have IP addresses that are being rerouted from the old server to the new server, go ahead and have that routing change made now.

It is a good idea to review your DNS zones manually to make sure that everything is cleaned up.  Make sure that no references to old IP addresses still exist in the DNS zone, and update accordingly as needed.  Also, if you lowered TTL times, you could take this time to increase them back to higher values. This isn’t required, but is considered to be a good practice and will reduce query load to your DNS server.

Once you think that you have everything live on the new server, it is a good idea to power down the old server completely.  Do not proceed with cancelling the old server with your hosting provider until you are sure everything is working completely and correctly on the new server.  Many times during migrations I have worked on, we found surprises during this step and customers realized that a few things were not really pointed to the new server.  So, it’s a good idea to do this, since it’s easily reversible (just turn the server back on) and will help you find any oversights quickly so that you can fix them.

When you feel comfortable, you can go ahead and cancel the old server.  

Congratulations, you have successfully completed your migration!

Syncthing Tips for First-Time Users

Syncthing is a fantastic decentralized file synchronization utility, which functions in a manner similar to other cloud storage services like Dropbox or Google Drive, but is self-hosted and does not require a central server to operate.

We recently covered Syncthing in another blog post and wanted to give some further, more condensed tips for first-time users.

Pertaining to Setup

Syncthing is very simple to install. On most platforms you will simply need to download the latest version of Syncthing Core from syncthing.net, quickly extract a compressed archive, and run the program.

When Syncthing is first installed, the Web UI will launch on https://localhost:8384/.  You can change the listen IP later if you need to access the instance remotely.
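As a sketch of that change, the listen address can be passed on the command line; the same setting also lives in the GUI section of Syncthing's config.xml and under Actions > Settings > GUI in the Web UI. Note that 0.0.0.0 exposes the UI on all interfaces, so only do this on a trusted network or behind a firewall:

syncthing -gui-address="0.0.0.0:8384"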

When you’re setting up your first shared folder on Syncthing, there are two identifiers that are important to remember.

Folder IDs

The Folder ID is the identity of a shared folder.  This ID is generated by the first peer who creates a shared folder, and then it should be given out to all other peers that you want to share the folder with.  So, it’s created randomly, then never changes for the life of the shared folder.

To create a new shared folder, you simply need to click the Add Folder button.  If you have an existing shared folder and need to get the ID, you can find that information within the Shared Folders pane in the web UI for existing folders.

Device IDs

The Device ID is the identity of the Syncthing instance itself, so this is the identifier for a computer or other device that you want to share with.  This ID is randomly generated during the installation of Syncthing without any user action required. It does not change for the length of the existence of the Syncthing installation on the device.

This ID is used to initiate sharing with other instances, so you will need to know your peers’ device IDs in order to start sharing folders with them.

This ID can be easily obtained by going to Actions > Show ID (found on the upper right corner of the Web UI).

Pertaining to Global Settings

Automatic Updates

Syncthing is not usually installed through a package manager, so you may be wondering about keeping it up to date.  You can turn on Automatic Upgrades from Actions > Settings > General in the Web UI. Generally, you should choose “Stable releases only” as your update option.  This is on by default.

Bandwidth Limiting

Sometimes, you may be using Syncthing on an Internet connection where there is other traffic which should be taking priority.  If you happen to be on a residential Internet connection like most Syncthing users, chances are that you don’t have a lot of upload bandwidth available, and Syncthing pushing out file changes might not be as high priority to you as making sure that your Netflix stream isn’t buffering.

You can turn on incoming or outgoing rate limiting from Actions > Settings > Connections.  The limits are set in kilobytes per second. A good strategy is to run a speed test on your Internet connection and then set the Syncthing limit to 80% or less of your typical upload speed; for example, a 10 Mbps upload is roughly 1,250 KB/s, so a limit of around 1,000 KB/s fits that guideline. This is usually enough to prevent saturating your connection under most conditions.

Bandwidth limits are not turned on by default when you install Syncthing.

Pertaining to Shared Folders

Rescan Interval

To conserve server resources, you may want to increase your rescan interval time.  A very low rescan interval time can cause disk load issues, especially if you have a large amount of files.  You can find this setting by clicking the Edit button on the shared folder from within the Web UI and expanding Advanced Settings.

If you do configure a high rescan interval, you should know that any files added to the folder will take longer to propagate to peers on the network.  You can manually expedite the process by starting a rescan, which can be done by clicking the Rescan button on the shared folder in the Web UI.

Another reason that I’ve found to set this to a higher value is if a rescan occurs while you are transferring files into the folder (such as downloading a file from the Internet into the folder), you can end up syncing out a partial file version.  This is just generally a waste of bandwidth, as Syncthing will have to sync out the full file once the download finishes. It also creates a buildup of corrupt “file versions” on your other peers if they have versioning turned on. So, if you are going to be putting this folder somewhere that it may be receiving slow file transfers, that is another consideration for changing this setting.

File Versioning

Syncthing supports several different types of file versioning.  You can configure the method that you want to use, as well as the maximum age of file versions to retain.  This works similar to “revision history” features in popular cloud storage platforms like Dropbox. By default, revisions are stored in the .stversions folder.

You can configure the settings by clicking the Edit button on the shared folder and expanding the Advanced Settings section.

I have been using Staggered File Versioning, and this has worked well for me.  To understand the different types of file versioning, you may want to consult the Syncthing documentation.

Accessing and restoring old versions of files is as simple as clicking the Versions button on the shared folder and finding the revision that you want to restore. It can all be done through the web UI, or you can manually retrieve the file yourself from the .stversions folder in your shared folder’s root directory.

Pertaining to Devices

Introducers

An “Introducer” is a distinction of a peer which can be very useful when used correctly.

Say that you have some files that you share with John, and John wants to share the same files with his friend Sally.  You don’t know Sally, but you don’t mind John’s friends connecting to you as a peer to increase the speed and reliability of the file transfers.  Instead of Sally contacting you with her device ID, you can simply configure John to be an “introducer.” That way, any device John sets up as his peer will automatically become your peer as well.

There are some situations where this can be a drawback.  For instance, if you’re trying to remove a dead peer from the network, your introducers might have a nasty habit of introducing it back to you.  It may be necessary for you to temporarily disable this option while you are trying to delete peers, but otherwise, it can be left on for convenience.

Conclusion

Hopefully, these tips will be insightful for you in setting up your first Syncthing network.  Most of the features of Syncthing are very self-explanatory, but having used it for over a year, I compiled these tips based on my experiences and the inconveniences I’ve encountered.

If you have any further questions about Syncthing, I can’t stress enough that their documentation is fantastic, very detailed, and really worth a look. 

 


Backup Storage

This adventure was kicked off by my intent to research an open source supplement or replacement for the enterprise backup solution R1soft ServerBackup. At GigeNET we heavily utilize R1soft and its API to manage backups for our fully managed customers. The backup solutions I chose to evaluate range widely in feature set and complexity.

Here are the 3 best open source backup solutions and how to use them:

R1soft ServerBackup

The R1soft ServerBackup server-side installation was a very simple process. It involved adding the R1soft repository to CentOS, Debian, or Ubuntu by copying the “r1soft.repo” or “source.list” contents from the website into the correct repository location for the distribution. The next stage of the installation was to pull in the binary from the shell prompt.

On CentOS, we utilized only these two commands:

server@backup# yum install serverbackup-enterprise

server@backup# serverbackup-setup --user DESIRED_USERNAME --pass DESIRED_PASSWORD

The final stage of the installation process was to log into the web interface at http://SERVER_IP/. R1soft ServerBackup manages all client machine backup data and machine configurations through this portal. Navigating the portal was very relaxed: I was able to find my way around the entire interface within a few minutes, including the simple web form for adding a new machine to the backup list.

To add a new machine to back up, we simply follow this straightforward form on the web interface:

The client installation process has two methods to be aware of. You can use the built-in client installer through the server-side web interface, which accepts RDP or SSH information; this happens when adding a new machine through the web interface. The second method uses the package manager with the “r1soft.repo” or “source.list” file in the same manner we described in the server installation process. Installing the Linux client can be rigorous if the client setup is custom or the machine is badly out of date. The client requires a kernel module, and the majority of the time this module is pre-built for you. If it is not prebuilt, it has to be compiled against the running kernel, which requires a list of packages that can be difficult to obtain if you're on an older kernel. This can also lead to failed builds of the client if the kernel is hardened or customized beyond what is prepackaged by the operating system.

The install process on CentOS without a module compilation:

client@backup# yum install serverbackup

client@backup# serverbackup-setup --get-module

This is the only block-based backup solution for Linux that we have encountered that doesn't require special configuration. It auto-detects drives and partitions and ignores the filesystem while performing backups, which makes the backups more efficient and provides the ability to do a full bare-metal restore. The restore process can recover individual files, entire partitions, or the whole machine. Overall it's a hybrid that lets you do complex restores at the bare-metal level, such as keeping the old partition layout or changing the partition layout on the fly.

R1soft ServerBackup is an enterprise solution with a very elegant design. However, like other complex block-based solutions, it has its drawbacks when it comes to the client. A major issue we at GigeNET encountered on the server side is that the backup process can lock up on rare occasions. When that happens it leads to failed backups for hundreds of hosts and blocks any new tasks that are queued; the only solution is to restart the service.

Bacula

Bacula is an open source solution that primarily does file-based backups. It follows the client-server model and can back up to various media types, including tape devices (a feature not found in the other solutions I tested). While tape backups might be deemed ancient by some administrators, they are still heavily utilized in larger companies, because tapes can last a decade if stored properly in a salt mine, and there are dedicated companies with salt “vaults” just for this purpose.

Bacula is installed through the operating system's package manager just like R1soft ServerBackup, but it was available without installing any custom repositories. A minor downside of open source software is that you are sometimes at the whims of the repository maintainers to keep it updated. While you can compile the latest stable release yourself, that is generally a hassle for smaller teams to maintain. The Bacula version in the Ubuntu, Debian, and CentOS repositories defaulted to version 5.1, while the latest stable is the 9.x branch, a minor drawback I had to come to terms with when using this solution. The full setup process is too large to document in this quick overview and would need a dedicated post; stay tuned for a future blog or KB article on this topic.

The Bacula server-side backend is overly complex, and it took me the better part of a day to figure out all of the terminology. I will personally insist that the configuration requires advanced administration skills to follow all of the connections between the director, client, and storage daemons. The director manages all of the backup policies through “jobs, job-defs, schedules, and file-sets” parameters that are dedicated to each individual client. While I understand the complexity is due to its modularity, software designs like this generally just lead to bigger headaches when managing a larger set of systems.

The Bacula client-side configuration was a lot more relaxed. The client is installed through the operating system's package manager with a few short commands. You then alter the “fd” (file daemon) configuration on the client to point back to our Bacula backup director: in the “/etc/bacula/bacula-fd.conf” file you specify the director with the following lines, where the Name matches the director defined on the backup server (typically the server's hostname with a -dir suffix).

Below demonstrates this process on a CentOS system:

[client@backup]# yum install bacula-client -y

[client@backup]# systemctl enable bacula-fd

[client@backup]# systemctl restart bacula-fd

The relevant contents of the /etc/bacula/bacula-fd.conf file:

Director {
  Name = BaculaServer-dir
  Password = "PASSWORD"
}

Bacula has a TUI console named “bconsole” that is used to manually manage the backend jobs for each client. The console was straightforward once I learned all of the Bacula component terminology. You are able to view the current backup jobs and how they performed in general.

Bacula's backup and restore speeds were pretty impressive compared to its competitors. On a setup using normal spinning disks I saw an average of 50-60MB/s on the file set backups and restores, while the other solutions mentioned averaged about 40-50MB/s on the same hardware and configuration. I've included a few pictures to show a broad overview of this console below:

Listing clients with the Bconsole:

Listing current jobs that have run:

Listing a more detailed view of the job that ran:
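For reference, the screenshots above correspond to commands run inside bconsole along these lines (the leading * is the bconsole prompt, and the job ID is just an example from my test run):

bconsole
* list clients
* list jobs
* llist jobid=3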

BorgBackup

Borg is a file-based backup utility that doesn't follow a traditional client-server model. Borg does have a server-like component built in, but it's limited to storing the backup content. A minor downside of Borg is that it's not a complete solution on its own: the Borg binary only manages the backups, verifications, and restores. To build Borg into a more complete backup solution we combine the binary with the borgmatic wrapper written by torsion.org, which uses cron to schedule backups, perform rotations, run verifications, and so forth.

Borg's built-in data deduplication is unique, and it shrinks the dataset on the remote server considerably. Not only does the deduplication save storage space, the compression algorithm works alongside it to cut costs even further.

Borg can be installed through python-pip, or you can use the statically compiled binary available on the main website here. The installation process is rather simple when done through pip.

This is the install process on CentOS:

client@backup# pip install borgbackup

client@backup# pip install borgmatic

The backup, file listing, and restore interface can be a little difficult to grasp if you don't have prior experience with tools like Git. The closest I can come to describing how the data is segmented is that it acts like a Git repository. The backups are each a branch off of each other based on the last logical backup, and within each of those branches you can pull data out. The built-in tool allows you to mount each branch with the Linux “FUSE” driver, and this is how I generally browsed the backups on Linux. The overall backup process is similar to pushing a new branch and merging down the branches when the backup rotation comes into play. I've included some live snippets below that show the overall design.

Listing the backups in a Borg repository:

[client@backup]# borg list /backups

borg1.client-2018-03-27T09:55:34.014081 Tue, 2018-03-27 09:55:39

borg1.client-2018-03-28T09:55:34.014081 Tue, 2018-03-28 09:55:39

Extracting a file from a Borg repository:

[client@backup]# borg extract /backups::borg1.client-2018-03-27T09:55:34.014081 /root/helloworld.txt

Mounting a Borg repository to a folder:

[client@backup]# borg mount /backups::borg1.client-2018-03-27T09:55:34.014081 /mnt/backups

[client@backup]# ls /mnt/backups

bin boot dev etc home lib lib64 lost+found media mnt opt proc root run sbin srv sys tmp usr var
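For completeness, creating the repository and taking a backup in the first place looks roughly like this (a sketch; the repository path, archive name, and source directories are only examples):

# Initialize a new repository with repository-key encryption
[client@backup]# borg init --encryption=repokey /backups

# Create a compressed, deduplicated archive of a few directories
[client@backup]# borg create --compression lz4 /backups::borg1.client-2018-03-29T09:55:34 /etc /home /root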

The biggest drawback with Borg is that it's not really an enterprise-ready solution. The lack of a client-server model prevents a more logical approach to machine configuration on a large scale. Borg also does not run as a service and requires a third-party wrapper plus a cron job to run daily backups. While a cron job is not a horrible solution, it is a step back compared to running the binary as a service. Part of not being enterprise ready stems from the lack of a web interface to manage the backups. While the interface felt similar to Git and was easy for me to navigate, I can see how it would be hard to use for an administrator without that experience.

Conclusion

These backup solutions all had their own elegant designs and feature sets that impressed me. The open source Bacula solution would not be able to replace R1soft at any scale due to the complexity of the configuration and the lack of a direct API. The Bacula team does have an enterprise version that is more feature rich, and it will be worth exploring after my recent adventures with Bacula.

The open source BorgBackup solution doesn't have a direct enterprise offering yet. However, the tool is a wonderful backup solution. The Git-like design is very different from traditional backup solutions, and the deduplication is a cost saver when it comes to drive space. Unfortunately, the lack of a client-server model gives us no direct means of managing the solution on a large scale. A new project has been started to resolve these issues, but it doesn't have a release date yet. I'll be watching it very closely.

This subset of backup solutions is not the only set we explored in this adventure. We also tried out Amanda, UrBackup, and Duplicati. These solutions are all very good in their own right when it comes to backups, but we had to cut the blog short. I'll be reviewing these in a future blog post, so stay tuned.

Don’t want to do it yourself? Explore GigeNET’s backup services. 

syncthing

What is Syncthing?

Syncthing is a decentralized file synchronization tool. It shares similarities with commercial cloud storage products you may be familiar with, like Dropbox or Google Drive, but unlike these cloud storage products, it does not require you to upload your data to a public cloud. It also shares similarities with self-hosted cloud storage platforms like ownCloud or NextCloud, but unlike those products, it does not require a central server of any kind.

Syncthing works off of a peer-to-peer architecture rather than a client-server architecture. Computers attached to your Syncthing network each retain copies of the files in your shared folders and push new content and changes to each other through peer-to-peer connections. Unlike other peer-to-peer software you may be familiar with, like file sharing applications, Syncthing uses a private sharing model: only devices specifically authorized with each other can share files. All communication between the peers is encrypted to protect against man-in-the-middle attacks intercepting your private data.

My Use Case

In my case, I have a library of almost 3TB of data consisting of over 250,000 files in over 20,000 directories. Most of these files average between 5MB and 100MB in size. There are currently 4 people working on the project who need access to the files. We each need the ability to add, remove, and edit items in the library with the changes synchronizing out to everyone else.

Faced with the challenge of mirroring this rather considerable amount of data between multiple computers, we have gone through a variety of solutions to decide what will work best for us.

The original setup was a central server where we all pulled backups via rsync. This had the advantage of simplifying change synchronization, since we always trusted everything on the central server to be the latest copy of the data. However, it made updates more difficult, since we would have to log in over FTP to update the data even though we all had a local copy. The real Achilles' heel of this method, for us, was the need for a central server, which raised the cost of our project, especially considering the amount of data we were hosting.

We looked into alternative cloud sync tools such as ownCloud and NextCloud, but again these do require a central server. We could have made one of our home servers the central server, but that would have consumed a lot of bandwidth for one of us. We looked at cloud storage solutions as well, but due to security concerns and the sheer cost of hosting 3TB on the cloud at the time, this didn’t seem practical for us either.

Enter Syncthing – a peer-to-peer file synchronization tool without the need for a central server. This turned out to be the most cost-effective and simplest way for us to manage our collections. Once set up, our computers would propagate changes to each other, still using a bit more bandwidth than we did when we had the central server, but at least that burden was distributed equally. It seemed like a great solution since we all had a copy of the files anyway and wanted it to stay that way, and since it allowed us to begin editing the files locally on our own machines rather than going through a process to get them onto the central server.

With all of these benefits, we decided to give it a try, and started using it day to day. It was impressive how we were each able to connect our existing folders (since they were rsynced with each other up to this point, they had the same contents). So we didn’t even have to go through a painful initial sync process. Once Syncthing was set up on all of our machines, it scanned the files and communicated with the other peers to make sure everyone had the same content. Once that was complete, everything was in sync and we were ready to go.


How Syncthing Works

Syncthing enables the sharing of folders on your computer in a peer-to-peer manner. There is no central server or authority to manage the files, and you authorize peers in your client to allow them to connect and begin sharing the folder with you.

Peers connect directly to each other over the Internet in order to share data. This is the fastest and most secure method offered by Syncthing, since the data goes directly from one peer computer to the other with no central server or middle man handling the data. This method does require a firewall port to be opened on your network in order to communicate with peers that aren’t on the same network. By default, Syncthing uses TCP port 22000 for this purpose.

If you are syncing between servers or other Internet connections having a static IP address, you could easily lock down your firewall to only allow connections to this port from known IP addresses of your other peers, for additional security if that is a concern for you.
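As a rough sketch using plain iptables (203.0.113.10 stands in for a peer's address; adapt this to firewalld or whichever firewall you actually run):

# Accept Syncthing's sync port only from a known peer, drop it for everyone else
iptables -A INPUT -p tcp -s 203.0.113.10 --dport 22000 -j ACCEPT
iptables -A INPUT -p tcp --dport 22000 -j DROP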

In some cases, direct peer connectivity is simply not possible, such as if you are behind a corporate or school network’s firewall or carrier NAT where you do not have access to the router to ask for a port to be forwarded. In these cases, Syncthing still is able to work, but it will adjust its connection strategy.

If connectivity is not possible directly between peers for any reason, Syncthing will fall back to using a relay server. In this case, you are adding a middle man to your connection, which generally does result in reduced performance. However, since Syncthing uses end-to-end encryption, these relay servers should not be able to see what data you are relaying through them.

The public relay servers used by default are operated for free by members of the community, and anyone can run a Syncthing relay. Relay servers do not store any data, they simply act as a proxy between peers that are unable to connect directly. So, you do not need a server with a lot of disk space to run a relay, but they can use a lot of bandwidth.

In some cases, you may need to use the relay functionality but do not want to rely on public relays out of security concerns, or maybe you simply want to have better performance by running your own private relay. Syncthing makes this possible as well through private relay pools. This still does create a centralized point for your Syncthing environment, but it is only used if the peer-to-peer connection is not possible. If you set up your Syncthing relay on a high speed server provider, like GigeNET, you can rest assured that your relay will operate in a fast and secure manner while you continue using Syncthing to enhance your project.

If you are interested in running a relay, be it a public relay for the good of the community or a private relay for your own project using Syncthing, the official documentation on the process can be found here.

How To Install Syncthing

A typical Syncthing installation will simply use the Syncthing core application, which provides Syncthing's command line tool and Web UI. You can download the version of Syncthing core for your operating system; there are pre-built packages for most Linux distributions, Windows, macOS, and other popular operating systems.

The exact procedure for installing may vary from system to system, but for most Linux platforms, you simply need to download and extract a tar.gz archive, then run the Syncthing binary to launch the program.
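On a 64-bit Linux server that might look something like the following (the version number here is only an example; check the downloads page for the current release):

wget https://github.com/syncthing/syncthing/releases/download/v0.14.48/syncthing-linux-amd64-v0.14.48.tar.gz
tar xvzf syncthing-linux-amd64-v0.14.48.tar.gz
cd syncthing-linux-amd64-v0.14.48 && ./syncthing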

By default, the Web UI will be available while Syncthing is running on https://localhost:8384/. You can access the Web UI on the local computer through a web browser, or by setting up an SSH tunnel if it is running on a remote server.
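If Syncthing is running on a remote server, a basic SSH tunnel (replace the user and hostname with your own) forwards the Web UI to your local machine so you can open the address above in your local browser:

ssh -L 8384:localhost:8384 user@your.remote.server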

Additionally, you can configure Syncthing’s Web UI to listen on other IPs besides localhost if the need arises. Further documentation on this process is available here.

Connecting To Your First Peers

Connecting another peer to a shared folder for the first time is a very straight forward process. You will need to know their Device ID, which you can obtain by going to Actions > Show ID on the upper right corner of the web UI. The Device ID is an alphanumeric string that looks similar to a product license key.

To add the peer, click on the “Add Remote Device” button, which you’ll find toward the bottom left corner of the web UI. On this dialog, enter the device ID provided by your peer who you wish to connect.

You can enter anything you want for the Device Name, it is for your reference only so you know who the peer is. Generally, you can leave the address setting as “dynamic”, which will allow Syncthing to autodiscover the remote address for you.

If you would like the new peer to be able to add other devices to your shared folder, you can add them as an “Introducer” by checking that checkbox. This way, if your peer authorizes a new device on the folder, that peer will be introduced to you and you will begin sharing with them directly without any other steps required.

If you would like the peer to be able to create new shared folders and add them to your Syncthing easily, you can check the “Auto Accept” checkbox which will allow them to do just that.

Lastly, you simply need to check any checkboxes next to folders that you want to share with this peer. Once all of these steps are completed, simply click save, and allow Syncthing some time to connect to the peer. You should be on your way to syncing!

Is Syncthing Perfect?

No, of course not. Syncthing is a free open source application, and it’s not without its imperfections, but it works pretty well and development continues on the project every day. I still plan to use it for a long time to come, despite its imperfections.

I've found that with my massive library of files, the default rescan interval is too frequent for me and creates excessive server load. If you are sharing a very large library (say, hundreds of thousands of files), you too may want to increase your scan interval. Keep in mind that this will increase the time between a change being made in Syncthing and that change propagating out to your peers. If you want to change this setting, you can do so by clicking the Edit button attached to the specific shared folder in the web UI and adjusting the “rescan interval” value under advanced settings. I set mine to 36000 seconds (10 hours) to keep my server load down, since I don't add files that often. Even with this scan interval, if I want to push changes out right away, I can simply go to the web UI and click the rescan button to initiate an immediate scan.

Another pet peeve of mine is I’d like to see better support for the propagation of deletion events. I’ve found that if I delete a file while a peer is disconnected from Syncthing, when that peer eventually reconnects, they will sync back my deleted file to me. This can get really annoying, and sometimes causes me to hold off on making changes if one of my peers is offline for some reason. I would like to see some kind of global “deletion event roster” so that these delete events are not ignored by reconnecting peers, but it seems that Syncthing isn’t doing that yet.

I do sometimes have trust issues with Syncthing, because I’ve encountered some glitches in the web UI that make it seem like there could be a problem, but most of these concerns have been unfounded and Syncthing has done a great job managing my data. I’ve had some instances where the web UI will say that I am hundreds of gigabytes out of sync with my peers, and it appears to be actually syncing data, but not really using any bandwidth. Glitches like this reduce my confidence, but after using it safely for some time, I have learned to trust it even when the web UI is acting bizarrely.

Conclusion

Overall, what Syncthing accomplishes is a challenging task to pull off, and it does a pretty good job of it. I would love to see further development on the project, and I’ve seen new functionality and better interface polishing introduced in the timeframe that I’ve been using it. I think it will only continue to improve with the passage of time, and I definitely think it’s worth a serious look for your file synchronization needs.

sysadmin

During day-to-day server administration, there are a variety of important system metrics to analyze in order to assess the performance of the server and diagnose any issues. A few of the most important hardware metrics for a system administrator to monitor are CPU usage, memory usage, and disk I/O. Log data from applications themselves can be equally important when it comes to diagnosing problems with specific programs or websites running on a server.

With that in mind, I am outlining some basic tools I use as a system administrator which are either commonly bundled with Linux distributions, or easily installable from software repositories, and can greatly aid in diagnosing server issues or checking up on the health of the server day to day.

1. atop

If you are a Linux administrator or a system administrator familiar with the command line, you have probably heard of the “top” utility for monitoring system resources and running programs. It is similar to the Task Manager utility on a Windows system.

Atop is a utility similar to top that provides a more detailed look into important server metrics, making it an even more helpful tool for identifying performance issues.

Atop provides a detailed breakdown of system resource usage such as:

  • CPU usage, both overall and by process ID.
  • System load average.
  • Breakdown of memory usage, overall and by process ID.
  • Disk I/O statistics per physical disk, as well as per LVM volume if you use LVM.
  • Network usage statistics, broken down by network interface.
  • A top-style process list breaking down programs running and sortable by resource usage.

Atop Service

Atop can also be run as a service on the machine. When running as a service, atop will record a snapshot of its statistics every few minutes and record the data to a log file. These log files can then be played back later using the atop utility to review the historic data. This can be incredibly useful in cases where the server is going down for an unknown reason. You can then go back to the historic atop logs and see if a program began consuming a lot of resources just before the server went down.

In order to launch Atop as a service and configure it to run at boot time, you can simply use the following commands:

chkconfig atop on
service atop start
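On systemd-based distributions such as CentOS 7 (assuming your atop package ships a systemd unit), the equivalent commands are:

systemctl enable atop
systemctl start atop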

Reviewing Historic Logs

By default, atop logs are stored in daily log files located in /var/log/atop/ with files rotated and renamed based on the date of the log.

To load a historic atop log and view its contents, you can simply open the log with “atop -r”, for example:

atop -r /var/log/atop/atop_20180315

Installing Atop

Atop is not included by default in most Linux distributions. You can check with your specific distribution resources to see where it is available for you.

On CentOS 7, the package is available in EPEL and can be installed in this way:

# Add the EPEL repository to your system
yum install epel-release
# Install the atop package
yum install atop

2. mysqltuner.pl

The “mysqltuner.pl” utility is a third party Perl script which provides fantastic insights into MySQL performance and tuning needs.

I have used this utility on many occasions to optimize poorly performing database servers in order to alleviate high load conditions without requiring hardware upgrades or even any changes to the usage pattern of the databases. Often, MySQL performance can be greatly improved by simply tweaking some basic options in the configuration.

Note: In order to get the best results from this tool, you should wait to run it if you have recently restarted MySQL. Some of the recommendations will not be accurate until MySQL has been running for some time (at least 24 hours) under normal activity.

A few examples of helpful information provided by the mysqltuner.pl utility include:

  • Memory usage information (maximum reached memory usage, maximum possible memory usage given the configuration values)
  • Statistics on the number of slow queries
  • Statistics on server connection usage
  • Statistics on table locks
  • Statistics specific to MyISAM, such as key buffer usage and hit rate.
  • Statistics specific to InnoDB, such as buffer pool use and efficiency.

Additionally, the tool provides recommendations for adjustments to common configuration options in the /etc/my.cnf configuration file. It may recommend adjustments to settings pertaining to things like the query cache size, temporary table size, InnoDB buffer pool size, and other settings.

As with any tool, it is important to exercise your experience as well as common sense when handling its recommendations. Some recommendations made by the tool could result in introducing instability to the MySQL server. For example, if your server is running low on memory already, increasing cache sizes dramatically can cause MySQL to exhaust the rest of your server’s available memory rapidly.

For this reason, I personally always go back and run the tool a second time after applying new settings, paying special attention to the statistic on maximum possible memory usage. That statistic will be accurate even when running MySQL immediately after a restart, although some other statistics provided by the tool may not be. The permissible range here can vary depending on the use case of your server. Safe values could be as high as 90% or higher on a dedicated MySQL server with very little other software running, but on a server with a lot of other programs running such as a cPanel server, allowing MySQL to use this much memory could exhaust the memory needed for other resources.

Obtaining & Running This Tool

The mysqltuner.pl tool is not usually packaged with a Linux distribution or with MySQL. The creator provides it for download on Github. It can be obtained here. The creator also maintains a short link domain to the tool: https://mysqltuner.pl
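For example, you can fetch the script with wget (curl works just as well; the short link should hand back the current version):

wget https://mysqltuner.pl/ -O mysqltuner.pl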

Once you’ve downloaded the tool, you can execute it by running this command:

perl mysqltuner.pl

3. ss

ss is a command line utility which can be used to gain insights into network connections and open sockets on your server. The tool is included in the iproute2 package and is intended as a substitute for netstat; it is also notably faster than netstat.

A common use for ss is to check open TCP or UDP ports on the server. This can be useful for creating firewall rules or checking whether a service is really listening on the port you have configured it to listen on.

The commands to run for these types of uses would include:

# Show listening TCP ports on the server
ss -lt
# Show listening UDP ports on the server
ss -lu

Another common use would be checking open connections to the server, which can be helpful for determining the connection volume or whether a connection is open between your server and another IP address.

The commands to run for these types of uses would include:

# Show open TCP connections
ss -t
# Show open UDP connections
ss -u
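Two more flags worth combining with these: -n shows numeric ports instead of service names, and -p shows the process that owns each socket (run as root to see every process):

# Show listening TCP sockets with numeric ports and owning processes
ss -ltnp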

4. grep

Grep is a very helpful tool for “finding a needle in a haystack.” If you have a lot of text you need to sort through, such as log files or a folder full of configuration files, grep can greatly simplify the task.

A common use for grep is finding log data pertaining to some event, such as sifting through Apache log data to find access attempts matching specific criteria. For example, if your Apache log file is stored in /var/log/httpd/access.log, you could use commands like these to find relevant log lines.

A few examples:
cat /var/log/httpd/access.log | grep "the text you are searching"
cat /var/log/httpd/access.log | grep index.html
cat /var/log/httpd/access.log | grep 127.0.0.1

Grep is also useful for sorting through the output of other commands, such as the “ss” command covered earlier.

For example, if you are looking for established TCP connections, you could run “ss -t” and pass it to grep like so:
ss -t | grep ESTAB

If you are looking for TCP connections to/from a specific IP, you can find that too!
ss -t | grep 127.0.0.1

A more advanced use of Grep is searching through files to find files containing a string of text. This can be useful if you are searching through multiple configuration files for a setting with a known value, but unknown location.

A few examples of searching folders with grep:
grep -r "text you want" /path/to/search/
grep -r “mysql” /home/user/public_html/
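If you only need the names of the files that contain a match, rather than every matching line, add the -l flag:

# List only the file names that contain the string
grep -rl "mysql" /home/user/public_html/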

5. nc

Nc is a command line utility to establish connections to servers and interact with the service running on that port. It is an alternative tool to an older command line utility called Telnet. It is useful for testing connectivity and responses from services on a server.

You can use nc to see if a TCP connection is working, which can help in diagnosing service issues like a firewall blocking a port. The tool can connect to any TCP socket service, including protocols such as HTTP, XMPP, MySQL, or even Memcached.

In order to use the tool to interact with a specific service, beyond testing connectivity, you may need to know some specifics of the protocol so that you know what to “say” to the server in order to get a response.

Test Connectivity to HTTP

It is very simple to use nc to test an HTTP web server; you would run this command:
nc server.address.com 80

After connecting, you would use this command on the prompt to request a URL from the web server:
GET /
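Many servers expect a proper HTTP/1.1 request with a Host header, so a slightly fuller request looks like this (press Enter twice after the Host line to end the headers):

GET / HTTP/1.1
Host: server.address.com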

Test Connectivity to SMTP

Testing connectivity to an SMTP server is a slightly more advanced process, but still very straightforward. Sometimes these steps are recommended by email blacklists (RBLs) to test connectivity to a mail server and check any errors that are encountered.

To connect to an SMTP server, you would use this command:
nc server.address.com 25

Once connected, you can use these SMTP commands to send a test email.

EHLO client.address.com
MAIL FROM:<sender@address.com>
RCPT TO:<recipient@address.com>
DATA
Type your email message, then finish with a line containing only a period
.
QUIT

SSL Alternative

Nc is not designed to connect to services that are SSL enabled. If you are using an SSL service, it is better to use the OpenSSL command line utility. Other than the commands to connect, the process is the same.

The basic command format is: openssl s_client -connect server.address.com:port

So, to connect to an HTTPS server, you could run the following command:

openssl s_client -connect website.com:443
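If the server hosts several HTTPS sites on one IP address, you may also need to pass the hostname for SNI with the -servername option:

openssl s_client -connect website.com:443 -servername website.com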

Once the client is connected, you can run protocol commands in exactly the same manner as with nc. This way you can perform the same tests or commands to the SSL enabled service.

Conclusion

No short blog post can comprehensively cover all of the tools needed in the day-to-day life of a Linux administrator, and in fact many very common, well-known tools are not covered here. Hopefully, though, these insights provoked some new thought, and these simple tools will send you down a path toward discovering more in-depth information about your Linux system.

Sound like a hassle? Let us manage your systems.

What is Ansible?

Ansible is a world-leading automation and configuration management tool. At GigeNET we invest heavily in the use of Ansible as our automation backbone. We use it to manage deployments across a large array of our platforms, from our public cloud's 1-Click applications to the desired configuration of our own internal systems.

Have you asked yourself questions like “Why should we invest our resources in utilizing Ansible?”, “How can Ansible simplify our application deployments?” or “How can Ansible streamline our production environments?” lately? In this blog I will demonstrate the ease of kick-starting Ansible development and how simple it is to start building your desired infrastructure state.

The Basics.

The information technology industry likes to develop new terminology. Ansible is no exception and has coined its own terminology for its toolkit.

Key ansible terms:

Playbooks: A set of instructions for Ansible to follow. This normally includes a target to run these instruction sets on, a collection of variables for the roles you execute, and the roles themselves that will be executed.

Inventory: A group or collection of systems to execute your playbooks against.

Tasks: A task is a YAML statement that tells Ansible what actions to perform. These statements often involve calling modules.

Roles: A series of tasks that work together to execute the desired state you set out to design.

Modules: A prebuilt script that Ansible uses to run actions on a target. A full list of built-in modules is documented on Ansible’s website here.

The Workstation Setup.

The recommended setup for Ansible is to have a centralized server setup for your “Workstation.” The workstation is where you will keep all of your playbooks, your roles, and manage your system inventory. The install process of Ansible is pretty relaxed and only has a single requirement: You must have python installed.

How to set up your workstation on CentOS 7:

The first thing we will need is to ensure we have python and the python pip extension installed.

[ansible@TheWorkstation ~]$ sudo yum install python-pip -y

With python-pip installed we will install the Ansible tools through pip. Most operating systems have a system package for Ansible, but it has too many limitations for my taste: you must wait for the package maintainer to update the Ansible version, and they are often behind what is considered stable. Pip is a package manager as well, but one that only manages Python packages. In this case we will utilize pip to perform the install and then configure the Ansible configuration file manually to suit our needs.

[ansible@TheWorkstation ~]$ sudo pip install ansible==2.4.0.0
[ansible@TheWorkstation ~]$ mkdir Playbooks Inventory roles modules
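A quick sanity check confirms the install worked and shows which version pip gave you:

[ansible@TheWorkstation ~]$ ansible --version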

Place this configuration file into the Playbooks directory. We will utilize these configuration flags to prevent host key problems and to set the paths of our roles and modules directories. In a production environment you will want to keep host key checking enabled due to the security implications. You can read more about the configuration options here.

[ansible@TheWorkstation ~]$ cat <<EOF >> Playbooks/ansible.cfg
> [defaults]
> host_key_checking = False
> library = /home/ansible/modules
> roles_path = /home/ansible/roles
> EOF

The Inquisition.

Let's get our hands dirty and dive into the actual development of an Ansible role. It's best to think of a role as a set of instructions for Ansible to follow. The initial creation of a role consists of building out the recommended directory structure for the role. We will build a small role in the Playbooks directory that updates the system and installs Nginx. Let's get started!

[ansible@TheWorkstation Playbooks]$ mkdir -p roles/MyRole/tasks roles/MyRole/handlers roles/MyRole/files roles/MyRole/templates roles/MyRole/defaults
[ansible@TheWorkstation Playbooks]$ touch roles/MyRole/tasks/main.yaml roles/MyRole/templates/main.yaml roles/MyRole/defaults/main.yaml

Before we start building the Ansible tasks you'll need to have a desired configuration goal in mind. My natural first step is to determine what I want accomplished and what state I want the system to be in. For this example our goal is to build a simple Nginx role that leaves the system with Nginx installed and a simple website displayed. To get to this desired system state I normally spin up a virtual machine on VirtualBox or on a cloud instance provider like GigeNET. Once I have a temporary work environment I tend to document each command used to get to my stated goal.

These are the manual commands required to get a simple Nginx configuration on CentOS:
[ansible@TaskBuilder ~]# sudo yum update -y
[ansible@TaskBuilder ~]# sudo yum install epel-release -y
[ansible@TaskBuilder ~]# sudo yum install nginx -y
[ansible@TaskBuilder ~]# sudo service nginx start
[ansible@TaskBuilder ~]# sudo chkconfig nginx on

You should now be able to view a “Welcome to Nginx” page in your browser on the temporary environment.

Your Ansible Cheat Sheet to Playbooks

Now that I know the tasks required to build the role, I can start translating those commands into Ansible modules. I start by researching the modules in the link listed previously in this blog post. We utilized “yum” in our manual adventure, so I'll look for the matching “yum” module in the listing. Below is a screenshot that documents the module; you can click on it for a more detailed summary.

With the module documentation in hand we can start translating our commands into Ansible tasks. We will utilize two parameters on the yum module: name and state. The yum module's page documents these parameters and how to use them.

Name: Package name, or package specifier with version.
State: Whether to install (present or installed, latest), or remove (absent or removed) a package.

Now that we have the correct module information let’s translate it to something usable on our workstation. Ansible looks for the main.yaml file under the tasks directory to initiate the role.

Here is one of the main files we created earlier:
[ansible@TheWorkstation Playbooks]$ cat <<EOF >> roles/MyRole/tasks/main.yaml
> - name: Upgrade all packages
>   yum:
>     name: '*'
>     state: latest
>
> - name: Install EPEL Repository
>   yum:
>     name: epel-release
>     state: latest
>
> - name: Install Nginx
>   yum:
>     name: nginx
>     state: latest
> EOF

The Inventory File.

The Ansible inventory file is a configuration file where you designate your host groups and list each host under said group. With larger inventories it can get quite complex, but we are only working towards launching a basic role at this time. In our inventory file we create a group named target and set the IP address of the host we want our playbook to run the role against.

[root@TheWorkstation ansible]# cat <<EOF >> Inventory/hosts
> [target]
> 199.168.117.102
> EOF

The Playbook.

Now that we have a very basic role designed, we need a method to call our role. This is where the Ansible playbook comes in. You can view the role as a single play in an NFL coach's arsenal and the inventory as the actual team. The playbook is the coach, and the coach decides which plays the team runs on the field. Previously we built an inventory file with a group named target. In the playbook we designate that our hosts will be every system under the target group, and we then tell the playbook to use our role, MyRole.

[root@TheWorkstation ansible]# cat <<EOF >> Playbooks/MyPlay.yaml
> ---
> - hosts: target
>   roles:
>     - MyRole
> EOF

The Launch.

Now that we have the very basics finalized, it's time to launch our very first Ansible playbook. To launch a playbook, you simply run ansible-playbook with the inventory file and playbook we configured earlier.

[ansible@TheWorkstation Playbooks]$ ansible-playbook -i ../Inventory/hosts MyPlay.yaml -k

If everything worked out you will see the following output:
