7. Mahout’s Algorithm
The help output lists a lot of algorithms.
[hdfs@svr001 ~]$ mahout
An example program must be given as the first argument.
Valid program names are:
arff.vector: : Generate Vectors from an ARFF file or directory
baumwelch: : Baum-Welch algorithm for unsupervised HMM training
canopy: : Canopy clustering
cat: : Print a file or resource as the logistic regression models would see it
cleansvd: : Cleanup and verification of SVD output
clusterdump: : Dump cluster output to text
clusterpp: : Groups Clustering Output In Clusters
cmdump: : Dump confusion matrix in HTML or text formats
cvb0_local: : LDA via Collapsed Variation Bayes, in memory locally.
dirichlet: : Dirichlet Clustering
recommendfactorized: : Compute recommendations using the factorization of a rating mat
recommenditembased: : Compute recommendations using item-based collaborative filterin
fkmeans: : Fuzzy K-means clustering
fpg: : Frequent Pattern Growth
:
:
:
8. Recommendation
Recommendation is the one we want to try.
[hdfs@svr001 ~]$ mahout
An example program must be given as the first argument.
Valid program names are:
arff.vector: : Generate Vectors from an ARFF file or directory
baumwelch: : Baum-Welch algorithm for unsupervised HMM training
canopy: : Canopy clustering
cat: : Print a file or resource as the logistic regression models would see it
cleansvd: : Cleanup and verification of SVD output
clusterdump: : Dump cluster output to text
clusterpp: : Groups Clustering Output In Clusters
cmdump: : Dump confusion matrix in HTML or text formats
cvb0_local: : LDA via Collapsed Variation Bayes, in memory locally.
dirichlet: : Dirichlet Clustering
recommendfactorized: : Compute recommendations using the factorization of a rating mat
recommenditembased: : Compute recommendations using item-based collaborative filterin
fkmeans: : Fuzzy K-means clustering
fpg: : Frequent Pattern Growth
:
:
:
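For readers who have not met item-based collaborative filtering before, here is a minimal sketch of the idea behind recommenditembased: items count as similar when the same users rate them in a similar way. It uses the textbook Pearson correlation, the measure selected by SIMILARITY_PEARSON_CORRELATION in the command run later in this deck. This is not Mahout's implementation. The rating values and the helper names parse_ratings and pearson are made up for illustration, and the userID,itemID,preference layout is assumed to match the input file.

```python
from math import sqrt

# Hypothetical ratings, one "userID,itemID,preference" record per line.
# The values are invented for this sketch.
RATINGS_CSV = """\
1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,2.5
2,103,5.0
3,101,4.5
3,102,3.5
3,103,2.0
"""


def parse_ratings(text):
    """Return {item_id: {user_id: preference}} from CSV text."""
    by_item = {}
    for line in text.strip().splitlines():
        user, item, pref = line.split(",")
        by_item.setdefault(int(item), {})[int(user)] = float(pref)
    return by_item


def pearson(a, b):
    """Pearson correlation of two items over the users who rated both."""
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0  # not enough overlap to say anything
    xs = [a[u] for u in common]
    ys = [b[u] for u in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant ratings carry no correlation signal
    return cov / sqrt(var_x * var_y)


items = parse_ratings(RATINGS_CSV)
sim_101_102 = pearson(items[101], items[102])
sim_101_103 = pearson(items[101], items[103])
print("similarity(101, 102) =", round(sim_101_102, 3))
print("similarity(101, 103) =", round(sim_101_103, 3))
```

In this toy data, items 101 and 102 are liked by the same users, so they come out positively correlated, while 101 and 103 come out negatively correlated. A recommender built on this would suggest 102, not 103, to someone who liked 101.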
13. Command Run
[hdfs@svr001 ~]$ mahout recommenditembased --input /mahout/recommend_sample1.csv --output /mahout/recome1 --similarityClassname SIMILARITY_PEARSON_CORRELATION
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH-4.4.01.cdh4.4.0.p0.39/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /opt/cloudera/parcels/CDH-4.4.01.cdh4.4.0.p0.39/lib/mahout/mahout-examples-0.7-cdh4.4.0-job.jar
13/12/12 11:26:23 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/mahout/recommend_sample4.csv], --maxPrefsPerUser=[1000], --minPrefsPerUser=[1], --output=[temp/preparePreferenceMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[temp]}
13/12/12 11:26:24 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/12/12 11:26:25 INFO input.FileInputFormat: Total input paths to process : 1
13/12/12 11:26:25 INFO mapred.JobClient: Running job: job_201312042139_0225
13/12/12 11:26:26 INFO mapred.JobClient: map 0% reduce 0%
13/12/12 11:26:34 INFO mapred.JobClient: map 100% reduce 0%
13/12/12 11:26:39 INFO mapred.JobClient: map 100% reduce 67%