This error occurs in BlockManager::chooseTarget4NewBlock()
(I am referring to the latest code). The specific piece of code that causes it is:
final DatanodeStorageInfo[] targets = blockplacement.chooseTarget(src,
    numOfReplicas, client, excludedNodes, blocksize,
    favoredDatanodeDescriptors, storagePolicy);
if (targets.length < minReplication) {
  throw new IOException("File " + src + " could only be replicated to "
      + targets.length + " nodes instead of minReplication (="
      + minReplication + "). There are "
      + getDatanodeManager().getNetworkTopology().getNumOfLeaves()
      + " datanode(s) running and "
      + (excludedNodes == null? "no": excludedNodes.size())
      + " node(s) are excluded in this operation.");
}
This occurs when the BlockManager tries to choose target hosts for storing a new block of data and cannot find even a single host (targets.length < minReplication). minReplication is set to 1 by default (configuration parameter: dfs.namenode.replication.min) and can be overridden in the hdfs-site.xml file.
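For reference, this is how that parameter would look in hdfs-site.xml (1 is already the default, so this entry is only needed if you want a different value):

<!-- hdfs-site.xml: minimum number of replicas a write must reach to succeed -->
<property>
  <name>dfs.namenode.replication.min</name>
  <value>1</value>
</property>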
This could occur due to one of the following reasons (quick checks for each are sketched after the list):
- Data Node instances are not running
- Data Node instances are unable to contact the Name Node
- Data Nodes have run out of space, hence no new block of data can be allocated to them
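Here is a minimal set of shell checks for those three causes. This assumes a standard Hadoop installation; the log directory and file name pattern below are the usual defaults, so adjust them to your setup:

# On each Data Node host: is the DataNode process running at all?
jps | grep DataNode

# In the Data Node logs: is it failing to reach the Name Node?
grep -i "Retrying connect to server" $HADOOP_HOME/logs/hadoop-*-datanode-*.log

# From any node: how much space does each Data Node report?
hdfs dfsadmin -report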
But in your case, the error message also contains the following information:
There are 4 datanode(s) running and no node(s) are excluded in this operation.
This means there are 4 Data Nodes running, and all 4 of them were considered for data placement in this operation, yet none could accept the block.
So the likely suspect is disk space on the Data Nodes. You can check the disk space on your Data Nodes using the following command:
hdfs dfsadmin -report
It gives a report for each of your live Data Nodes. For example, in my case I got the following:
Live datanodes (1):
Name: 192.168.56.1:50010 (192.168.56.1)
Hostname: 192.168.56.1
Decommission Status : Normal
Configured Capacity: 648690003968 (604.14 GB)
DFS Used: 193849055737 (180.54 GB)
Non DFS Used: 186164975111 (173.38 GB)
DFS Remaining: 268675973120 (250.22 GB)
DFS Used%: 29.88%
DFS Remaining%: 41.42%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun Dec 13 17:17:34 IST 2015
Check the "DFS Remaining" and "DFS Remaining%" values. They should give you an idea of the space remaining on your Data Nodes.
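If you have many Data Nodes, you can filter the report down to just those fields (a convenience one-liner, assuming the field names shown in the output above):

hdfs dfsadmin -report | grep -E "^Name:|DFS Remaining"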
You can also refer to the wiki here: https://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo, which describes the reasons for this error and ways to mitigate it.