
Posts tagged: hdfs

Map Reduce and HDFS

Background: The Map Reduce framework and HDFS go hand in hand. The large datasets on which analysis is to be performed are usually stored in an HDFS cluster. Each Data Node in the HDFS cluster is also a compute resource and is capable of executing a Map Reduce job managed by YARN (Yet Another Resource Negotiator)… A short job sketch follows this entry.
Anay Tamhankar Nov 23, 2023
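For readers who want to see what such a YARN-managed job looks like in code, here is a minimal sketch following the standard Hadoop word-count pattern; the class names and the HDFS input/output paths are illustrative placeholders, not taken from the post.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in its input split
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Input and output are HDFS paths passed on the command line
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Packaged as a jar, a job like this would typically be submitted with hadoop jar wordcount.jar WordCount /input /output, with both paths living on HDFS and YARN scheduling the map and reduce tasks onto the Data Nodes.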
HDFS Java API

Starting Point: There are two important Java classes that serve as the starting point for using the Java API. Up to Hadoop version 1.x it is org.apache.hadoop.fs.FileSystem; from Hadoop version 2.x onward it is org.apache.hadoop.fs.FileContext. The FileContext class has better handling of multiple filesystems (a single FileContext… A short read sketch follows this entry.
Anay Tamhankar Nov 15, 2023
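As a rough illustration of the difference between the two entry points, the sketch below opens the same HDFS file once through FileSystem and once through FileContext. The path /user/demo/sample.txt is a hypothetical example, not from the post.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadFromHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path("/user/demo/sample.txt");  // hypothetical HDFS path

    // Hadoop 1.x-style entry point: FileSystem
    FileSystem fs = FileSystem.get(conf);
    try (FSDataInputStream in = fs.open(path);
         BufferedReader reader =
             new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
      System.out.println("FileSystem read: " + reader.readLine());
    }

    // Hadoop 2.x and later entry point: FileContext
    // (one context can address multiple filesystems)
    FileContext fc = FileContext.getFileContext(conf);
    try (FSDataInputStream in = fc.open(path);
         BufferedReader reader =
             new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
      System.out.println("FileContext read: " + reader.readLine());
    }
  }
}
```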
HDFS Architecture

HDFS Architecture Diagram. Data Replication: How does HDFS manage to provide data reliability and high fault tolerance when it uses commodity (and therefore cheap and failure-prone) hardware? HDFS does it using a very simple idea. 1. Break a large file into a series of chunks or blocks (each block… A sketch of inspecting blocks and replicas via the Java API follows this entry.
Anay Tamhankar Nov 14, 2023
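To make the block-and-replica idea concrete, here is a small sketch that uses the HDFS Java API to set a file's replication factor and print which hosts hold each of its blocks; the file path and the replication factor of 3 are assumed for illustration only.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockInfo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path file = new Path("/user/demo/large-file.dat");  // hypothetical path

    // Ask HDFS to keep 3 copies of each block of this file
    fs.setReplication(file, (short) 3);

    // List each block of the file and the Data Nodes that hold a replica of it
    FileStatus status = fs.getFileStatus(file);
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      System.out.printf("offset=%d length=%d hosts=%s%n",
          block.getOffset(), block.getLength(),
          String.join(",", block.getHosts()));
    }
  }
}
```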
Hadoop Distributed File System (HDFS)

What is HDFS? HDFS is a distributed file system designed to run on commodity hardware. HDFS is highly fault-tolerant and provides high-throughput access to application data. It is suitable for applications that have large datasets. Distributed file systems are file systems that manage storage across a network of…
Anay Tamhankar Nov 13, 2023