Apache Spark Architecture, Design and Overview

Apache Spark is a fast, open-source, general-purpose cluster computing system with an in-memory data processing engine. It is written in Scala and provides high-level APIs in Java, Scala, Python and R, along with an optimized engine that supports general execution graphs. Its architecture is designed so that you can use it for ETL (Spark SQL), analytics, machine learning (MLlib), graph processing, or building streaming applications (Spark Streaming). Spark is often called a cluster computing engine or simply an execution engine. Apache Spark is one of the most…


Hadoop Single Node Cluster Setup on Ubuntu

In this tutorial, I will explain how to set up a Hadoop single-node cluster on Ubuntu 14.04. The single-node cluster will sit on top of the Hadoop Distributed File System (HDFS). Hadoop is a Java framework for running applications on large clusters made up of commodity hardware. The Hadoop framework allows us to run MapReduce programs on data stored in the highly fault-tolerant Hadoop Distributed File System. The main…


7 Best Hadoop Books to Learn Bigdata Hadoop

The Hadoop ecosystem is vast, and it can take a long time to learn big data and start implementing applications, so people new to Hadoop should choose the right book to start with. Here are some of the best Hadoop books you may want to consider. Hadoop skills are in high demand in domains such as finance, insurance, banking, social networking, and other platforms that deal with very large data sets. Hadoop experts are in great demand in industries that need to handle big, complicated data sets. A working knowledge of…
