Well, to get started we need DATA. So what can we do?
Source data can come in many forms. We will discuss two sources: files and databases.
In this post we will cover files.
Steps:
- [S]FTP some file(s) to your environment's filesystem (with FileZilla, for example);
- Load your data into the HDFS filesystem (see the sketch after this list):
$ hadoop fs -copyFromLocal sap.csv .
- Voilà. All done.
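A minimal end-to-end sketch of the loading step, assuming your file is called sap.csv and sits in the current local directory (the file name and the input/ target directory are just examples; adapt them to your environment):
$ hadoop fs -mkdir input                     # create a target directory in HDFS
$ hadoop fs -copyFromLocal sap.csv input/    # copy the local file into HDFS
$ hadoop fs -ls input                        # check that the file arrived
$ hadoop fs -cat input/sap.csv | head        # peek at the first lines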
If in doubt,
$ hadoop fs -help
lists all the available filesystem shell commands.
You are now ready to do some more interesting stuff and move on to the real purpose of Hadoop: running MapReduce jobs (and this is just one of the ways of loading data).
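While you explore, a few more filesystem commands you will reach for often (the paths below are illustrative, following the input/ example above):
$ hadoop fs -get input/sap.csv ./sap-copy.csv   # copy a file back from HDFS to the local filesystem
$ hadoop fs -du input                           # show how much space the directory uses
$ hadoop fs -rm input/sap.csv                   # remove a file from HDFS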
Thank you.
-- ====================
Other Tutorial Links
http://pinelasgarden.blogspot.pt/2012/04/en-big-data-helper-part-1-concepts.html
http://pinelasgarden.blogspot.pt/2012/04/en-big-data-helper-part-2-getting.html
http://pinelasgarden.blogspot.pt/2012/05/en-big-data-helper-part-4-pig.html
http://pinelasgarden.blogspot.pt/2012/05/en-big-data-helper-part-5-mapreduce.html