6. HDFS 简介
• Streaming data access
• Batch processing rather than ineractive user
access
• Write once, read many times
• Large files
• 适用场景:日志挖掘,大规模机器学习,
爬虫建立索引
10. HDFS 使用接口
• Shell 命令:hadoop fs –cmd
cat chgrp chmod chown copyFromLocal copyToLocal cp du, dus expunge
get getmerge ls, lsr mkdir movefromLocal mv put rm rmr setrep stat tail
test text touchz
20. NameNode 作用
• 元数据包含:
---List of files
---List of Blocks for each file
---List of DataNodes for each block
---File attributes, e.g creation time, replication
factor
---namespace attributes, needed replication
excess replication, under replication
---FSImage, FSEditlog