First we will go through list of configuration files and their properties.
below are commands to check all configurations
$ /usr/lib/hadoop/etc/hadoop
These files have special properties associated that can be configured by hadoop administrators. These prperties are enclosed with <property> </property> tags.
Suppose I want to add abc.txt file of size 50mb inside hdfs then according to standard configuration parameters will occupy 1 data block as each datablock size is 128mb. Standard replication factor is 3. To check all these parameters we will have to look for hdfs-site.xml
hit below command to alter these parameters
$ cat hdfs-site.xml
The replication factor is 3 here as I have hadoop cluster with 11 nodes. Suppose you have 1 node hadoop cluster then there is no point in keeping three different replications on single node.
As you see there are 134217728 bytes of datablocks. (128 mb*1024 *1024) in dfs.blocksize. These special parameters can be changed in these xml
To study more in details refer below links:
https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
Author Profile
- Passionate traveller,Reviewer of restaurants and bars,tech lover,everything about data processing,analyzing,SQL,PLSQL,pig,hive,zookeeper,mahout,kafka,neo4j
Latest Post by this Author
PLSQLApril 26, 2020How effectively we can use temporary tables in Oracle?
Big DataAugust 15, 2019How to analyze hadoop cluster?
Big DataJuly 28, 2019How to setup Hadoop cluster using cloudera vm?
Big DataMay 25, 2019How to configure parameters in Hadoop cluster?