Saturday, 13 June 2015

Introduction to Docker


    Developers want to build an app on one platform and make it available on many. While building it, they don't want to worry about the environment it will eventually run in. In the end it's all about the app, and all the app needs is a secure, isolated environment with minimal OS services to run.

    We've been using VMs for a long time to run applications on different platforms. They are indeed a major advancement over physical machines. But each VM comes with a full-blown OS, which takes up a lot of resources. So wouldn't it be better if we could run our apps without that OS overhead?
    Let's find out how to achieve this as we go ahead.


Container


    Containers are an application runtime environment similar to virtual machines. A runtime environment contains the things that an application needs in order to execute. But containers are much more lightweight than virtual machines. As we all know, the operating system (Linux) is installed on top of the physical machine, and the Linux kernel manages the hardware underneath it. Before containers and VMs, every app was installed in the same user space.

    Containers enable us to create multiple isolated instances of user space. These isolated instances of user space are called containers. This type of virtualization is called OS-level virtualization. Containers are lightweight because they share a single Linux kernel on the host instead of each booting its own OS. This makes containers fast to start and easy to move between hosts.




    Each container has an independent and isolated instance of user space. For an isolated user space we need an isolated root file system, process hierarchy and networking stack. So each instance of user space has its own view of the root file system, process hierarchy and networking stack. Thus an app running inside a container can change anything within its own view of the file system.

    This kind of isolation is provided by a Linux kernel feature called namespaces. Namespaces partition global system resources (for example, the process ID space), and each container is assigned its own partition.

    Next, cgroups (control groups) are a Linux kernel feature that limits, accounts for and isolates the resource usage (CPU, memory, disk I/O, etc.) of groups of processes. With containers, cgroups map one-to-one to containers, so we can control how much CPU, memory, etc. each container has access to.
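
    To make namespaces and cgroups a little more concrete, here is a minimal shell sketch that prints both for a process. It assumes a Linux host with /proc mounted, and the function name show_isolation is made up for this illustration; it is not part of Docker.

```shell
# Print the namespaces and cgroups of a process (defaults to this shell).
# Assumes Linux with /proc mounted; show_isolation is a hypothetical helper name.
show_isolation() {
    pid="${1:-$$}"
    if [ -d "/proc/$pid/ns" ]; then
        echo "namespaces of process $pid:"
        # Each entry is a link to a namespace identifier; processes that
        # share a namespace show the same identifier.
        ls -l "/proc/$pid/ns" 2>/dev/null || echo "  (could not list namespaces)"
    else
        echo "no namespace information for process $pid"
    fi
    if [ -r "/proc/$pid/cgroup" ]; then
        echo "cgroups of process $pid:"
        cat "/proc/$pid/cgroup"
    else
        echo "no cgroup information for process $pid"
    fi
}

show_isolation
```

    If you run this once on the host and once inside a container, the namespace identifiers and cgroup paths should differ, which is the isolation described above.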



Docker 


    Docker is an open source platform for developers and system admins to build, ship and run distributed apps. It runs each app in an isolated runtime environment, the container.

    As for me, I had to ship one of my applications from Ubuntu to CentOS, and that's when I started learning Docker. Packaging an application stack together with its dependencies for use elsewhere is always a challenge. This is where Docker helps: it enables apps to be quickly assembled from components. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.

    Docker brings namespaces and cgroups together in its Docker engine. It provides a standard runtime, so developers can build an app in a Docker container, package it, ship it to any other Docker host and run it there. It can be shipped to a data center, AWS, Azure, or anywhere else the destination is running the Docker runtime (daemon). Docker seems to be evolving into a platform rather than just a runtime.

    Remember I said earlier that containers rely on Linux kernel features such as namespaces and cgroups. Docker uses libcontainer as its default execution driver to talk to these kernel features. You can also configure it to use LXC instead.


Installing Docker


    We'll install Docker on Ubuntu 14.04. A Linux kernel of at least version 3.8 is recommended, for namespace support and stability. To check the kernel version, run the following command in a terminal window:

uname -a
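
    If you'd rather have the shell do the comparison for you, here is a small sketch. The function name kernel_at_least is my own invention, and it assumes the release string from uname -r starts with MAJOR.MINOR, as in 3.13.0-24-generic.

```shell
# Succeed if a kernel release string is at least MAJOR.MINOR.
# kernel_at_least is a hypothetical helper; it assumes the release string
# starts with "MAJOR.MINOR", e.g. "3.13.0-24-generic".
kernel_at_least() {
    release="$1"
    want_major="$2"
    want_minor="$3"
    major=$(echo "$release" | cut -d. -f1)
    # Keep only the leading digits of the second field.
    minor=$(echo "$release" | cut -d. -f2 | sed 's/[^0-9].*//')
    minor=${minor:-0}
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

if kernel_at_least "$(uname -r)" 3 8; then
    echo "kernel $(uname -r) is new enough for Docker"
else
    echo "kernel $(uname -r) is older than 3.8"
fi
```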



    On my system the kernel version is 3.13, which is even better. Next, run the following command to update the package lists:

apt-get update



    You don't have to log in as root, but I don't like prefixing every command with "sudo". If you are not root, prefix the commands in this post with sudo.


    Next we will install docker using the following command:

apt-get install -y docker.io




    Once docker is successfully installed, check if the docker.io service is running using the following command:

service docker.io status



    Woah! Docker is installed and the Docker daemon is running. Docker comes with a client as well as a daemon.

    Now let's use the client and get the version of docker installed on our system using the following command:

docker -v



    Then run the following command to get more details about the Docker installation:

docker version



    This tells us that the client and the server are the same version (1.0.1). Good for us!

   
    Next, run the following command to get internal info about the Docker setup:

docker info



    This gives us information about the containers and images on this Docker host. Docker uses the aufs union file system as the storage driver. The native execution driver indicates that it is using libcontainer. Remember I said earlier that Docker uses libcontainer or LXC to talk to the Linux kernel.
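
    The checks from this section can also be rolled into one small function. This is just a sketch: the name check_docker is made up, and it treats a failing docker info as "the daemon is not reachable", which could equally mean it is stopped or that you lack permission to talk to it.

```shell
# Report whether the Docker client is installed and the daemon answers.
# check_docker is a hypothetical helper name, not a Docker command.
check_docker() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker client not found"
        return 1
    fi
    # docker info needs to talk to the daemon, so it fails if the daemon
    # is stopped or we are not allowed to connect to it.
    if docker info >/dev/null 2>&1; then
        echo "docker daemon is reachable"
    else
        echo "docker client found, but the daemon is not reachable"
        return 1
    fi
}

if check_docker; then
    echo "ready to run containers"
fi
```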

    I guess that's it for this post. You should now have an idea of what Docker is, what it does, how to install it and how to run some basic commands.

    Do keep learning. I hope this post will get you started with Docker. If you have any doubts, do comment below and I'll look into it.

    Happy Reading! :)


Saturday, 6 June 2015

Introduction to Maven - Part I

 
    Maven, in simple words, is a powerful build tool. A build tool automates everything related to building a software project, which includes:

        1. Generating source code
        2. Generating documentation from the source code
        3. Compiling source code
        4. Packaging compiled code into JAR files or ZIP files

    The most important reason for using Maven is that it helps us manage dependencies. Dependency management is what makes it such a powerful tool. Besides, Maven can also be used as a project management tool. It maintains the version information and can be used to generate the Javadoc as well as maintain other information about the component.

    Maven is an open source tool managed by the Apache Software Foundation. One reason to use it is that it's platform independent: we can recreate our builds in any environment. Downloading a dependency also pulls in the other artifacts it needs, i.e. Maven supports transitive dependencies. Maven can be integrated with an IDE or used standalone.

    You must have seen people comparing Maven with Ant, but Ant is less a build tool than a scripting tool. You have to do everything explicitly in Ant.

    Ant uses build.xml as the name for a build file. I have listed a simple build.xml file below. It is quite easy to understand what this file does: it has targets to clean the project, compile it, create a jar and finally run the jar. But there is a problem here: nothing in the file enforces that order. What if I call jar before compile, or skip clean? That might introduce errors in my build.

<project>

    <target name="clean">
        <delete dir="build"/>
    </target>

    <target name="compile">
        <mkdir dir="build/classes"/>
        <javac srcdir="src" destdir="build/classes"/>
    </target>

    <target name="jar">
        <mkdir dir="build/jar"/>
        <jar destfile="build/jar/HelloWorld.jar" basedir="build/classes">
            <manifest>
                <attribute name="Main-Class" value="oata.HelloWorld"/>
            </manifest>
        </jar>
    </target>

    <target name="run">
        <java jar="build/jar/HelloWorld.jar" fork="true"/>
    </target>

</project>



On the other hand, Maven contains a lot of implicit functionality. Maven uses a pom.xml for its build.
A simple pom.xml file is listed below. We will discuss the pom file in more detail later.

<project>
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.mycompany.app</groupId>
    <artifactId>my-app</artifactId>
    <version>1</version>
</project>



You can observe that not much information is explicit in Maven, and this can get quite confusing sometimes. Maven uses a convention-over-configuration model, i.e. we are good to go as long as we follow the standard Maven conventions, rather than configuring everything as in the Ant build.xml.

    Now this pom file can be used to do all those things like clean, compile, test and package. None of that setup is visible in the pom file because, as long as I follow Maven's conventions and directory structure, Maven already knows what to do. So it is a little non-descriptive at first, until you understand the semantics of how Maven works.
    Maven is really centered around managing your entire project's lifecycle.
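
    As a rough sketch of what that looks like on the command line, the function below runs a clean build of a project. The name maven_build is made up, and it assumes mvn is installed and on your PATH.

```shell
# Run a clean Maven build in the current directory.
# maven_build is a hypothetical helper; it assumes mvn is on the PATH.
maven_build() {
    if [ ! -f pom.xml ]; then
        echo "no pom.xml in $(pwd)" >&2
        return 1
    fi
    # clean deletes the target directory; package compiles the code,
    # runs the unit tests and builds the artifact into target.
    mvn clean package
}
```

    Call maven_build from the directory that contains your pom.xml. Notice that nothing here says where the sources live or where the output goes; Maven takes that from its conventions.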

Maven directory structure


    Now let's have a look at the Maven folder structure. By default Maven looks for a src/main/java directory underneath our project directory and compiles all the code into a target directory. It works all of this out from our pom.xml file and its conventions.

src/main/java - where we store our Java code, following the standard package layout

For example, for a project with package name com.myproject, Java files will be placed under the src/main/java/com/myproject/ directory.

src/test/java - all your unit test code goes here

target - this is where everything gets compiled to, where tests are run from, and whose contents get packaged into a jar, zip, etc.
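
To try the layout out, here is a minimal sketch that creates the source and test directories for a package. The function name create_maven_layout and the project name hello-world are made up for illustration; the target directory is left out because Maven creates it during the build.

```shell
# Create the conventional Maven source and test directories for a package.
# create_maven_layout is a hypothetical helper name.
create_maven_layout() {
    project_dir="$1"
    # com.myproject becomes com/myproject
    package_path=$(echo "$2" | tr '.' '/')
    mkdir -p "$project_dir/src/main/java/$package_path" \
             "$project_dir/src/test/java/$package_path"
}

# Demo in a throwaway directory so nothing in the current directory changes.
demo_dir=$(mktemp -d)
create_maven_layout "$demo_dir/hello-world" "com.myproject"
find "$demo_dir" -type d
rm -rf "$demo_dir"
```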


Maven pom.xml file


A pom file is an XML representation of project resources such as source code, test code and dependencies (i.e. the external JARs used). The POM contains references to all of these resources.

Now a pom.xml file can be divided into 4 basic parts:

1. Project Information: 

    
    This can include 

    a. groupId - This identifies the group or organization that owns the project. By convention it matches the base package name of our application. For example: com.myproject
    b. artifactId - This is the application name. For example: HelloWorld
    c. version - This is the version number of our application. For example: 1.0.0
    d. packaging - This is how we want to distribute our application. For example: jar, war

2. Dependencies: 


    This includes the direct dependencies, i.e. the artifacts that we want to use in our application. To add a dependency we need to know the following three things:

    1. groupId
    2. artifactId
    3. version

Note: Maven stores everything it downloads in the .m2 directory under your home directory.
For example: C:\Users\<your_username>\.m2\repository

This is how we can add a dependency in our pom.xml file:

<dependencies>
    <dependency>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>2.1</version>
    </dependency>
</dependencies>


Note: Maven uses the above info to decide where to store a dependency. For example, the commons-lang artifact is stored at the following path:
C:\Users\<your_username>\.m2\repository\commons-lang\commons-lang\2.1\commons-lang-2.1.jar
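
The same mapping can be sketched as a small shell function. The name artifact_path is made up, and the sketch assumes jar packaging; dots in the groupId become directory separators.

```shell
# Print where an artifact lives relative to the local repository root.
# artifact_path is a hypothetical helper; it assumes jar packaging.
artifact_path() {
    # Dots in the groupId turn into directories.
    group_path=$(echo "$1" | tr '.' '/')
    echo "$group_path/$2/$3/$2-$3.jar"
}

# On Linux or macOS the local repository is under the home directory.
echo "$HOME/.m2/repository/$(artifact_path commons-lang commons-lang 2.1)"
```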


Note: Maven does not store downloaded artifacts inside each project's directory. Instead it stores them in a local repository, so that an artifact used by multiple projects is downloaded only once.

3. Build: 


    This includes 

    a. Plugins - The plugins that we want to use in our application
    b. Directory Structure - This is where we can override the default java and other directories

    For example: Let's change the final artifact name of the packaged application. This can be done by adding the following code to the pom.xml file:

<build>
    <finalName>test</finalName>
</build>

4. Repositories: 


    This is where we download the artifacts from. By default Maven downloads artifacts from the central Maven repository.



I hope this post has given you something to start with Maven. If you have any doubts, just leave a comment and I will look into it.
Happy Reading :)