基于Hadoop的个性化图书推荐系统基于hadoop的推荐系统论文

您所在的位置：网站首页 › 基于hadoop的商品推荐引擎 › 基于Hadoop的个性化图书推荐系统基于hadoop的推荐系统论文

基于Hadoop的个性化图书推荐系统基于hadoop的推荐系统论文

2024-07-17 14:08| 来源: 网络整理| 查看: 265

一绪论 1

1.1 编写目的 1

1.2 背景及意义 1

1.3 开发及运行环境 2

二需求分析 3

2.1 系统概述 3

2.3 系统功能需求 4

2.3.1 收集原始数据 4

2.3.2 计算物品相似度矩阵 4

2.3.3 计算用户购买向量 5

2.3.4 计算推荐向量并去重和排序 5

2.3.4 数据入库 5

2.3.5 作业控制 5

2.3.6商品推荐功能 6

2.4 系统非功能需求 6

三概要设计 7

3.1系统架构设计 7

3.2系统层次架构设计 8

3.3系统功能模块设计 9

3.3.1 计算物品相似度矩阵 10

3.3.2推荐矩阵(相似度矩阵*向量) 11

3.3.3对推荐向量进行处理 12

3.3.4数据入库 12

3.4系统数据库设计 12

四详细设计 14

4.1推荐模块程序流程图 14

4.2系统架构图 15

4.3数据预处理层 15

4.4推荐结果生成层 16

4.5推荐系统流程图 17

五系统实现 17

5.1计算用户购买商品的列表 17

5.2计算商品的共现关系 18

5.3计算用户的购买向量 18

5.4推荐结果 19

5.5数据去重 19

5.6推荐结果入库 20

5.7构建作业流对象 22

六系统测试 23

6.1计算用户购买商品的列表 23

6.2计算商品的共现次数(共现矩阵) 23

6.3计算用户的购买向量 23

6.4推荐结果 24

6.5数据去重 25

6.6推荐结果入库 25

6.7 web系统推荐商品实现 26

小结 26

参考文献 27

二需求分析

2.1 系统概述

商品推荐系统是对用户的历史行为进行挖掘，对用户兴趣信息进行建模，并对用户未来行为进行预测，从而建立用户和内容的关系，满足用户对商品的推荐需求的一种智能系统。通过对主要的推荐算法进行比较分析，模拟实现了基于用户行为的智能推荐系统，提高了推荐算法的有效性。商品推荐系统是为了更精准的为用户推荐他们想要的内容，如果一个用户在浏览商品信息的时候，通过对用户数据的记录，和已经存在的其他的用户记录进行分析，从而为用户推荐相应的数据。本次毕业设计是基于Hadoop的商品推荐系统，本此课设通过对用户行为的研究，发现用户购买的偏好波动幅度偏大，如何充分利用这一特征是提高推荐系统精准度的关键。

用户行为数据的处理。商品推荐系统用户、商品行为主要是用户的购买行为。

购买行为包含了丰富的用户购买商品，如何处理这些购买商品是推荐系统实现的关键。

系统必须具有高扩展性。网上购物每时每刻都会有新的数据产生，都会执行新的购物行为，系统的扩展性变得尤为关键。

推荐系统的推荐质量。推荐系统的最终目的是推荐，所以推荐质量是整个系统设计的最终目的。好的推荐系统需要兼顾系统的精准度、覆盖率以及新颖度。

基于Hadoop的个性化图书推荐系统基于hadoop的推荐系统论文_分布式

2.3 系统功能需求

基于Hadoop的商品推荐引擎大致可以分为5部分，分别是：计算用户的购买向量、计算物品的相似度矩阵、计算推荐度及相关处理、数据导入数据库和对于整个项目的全部作业控制。

2.3.1 收集原始数据

推荐系统是基于用户、商品行为数据来进行推荐的，没有用户商品数据的推荐系统是无法进行推荐的。rawdata文件:该文件是收集用户对物品的偏好，形成“用户物品偏好”的数据集。数据格式:用户编号物品编号偏好值。

package com.cxx.project.grms.step1; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.conf.Configured; import org.apache.hadoop.fs.Path; import org.apache.hadoop.io.LongWritable; import org.apache.hadoop.io.Text; import org.apache.hadoop.mapreduce.Job; import org.apache.hadoop.mapreduce.Mapper; import org.apache.hadoop.mapreduce.Reducer; import org.apache.hadoop.mapreduce.lib.input.TextInputFormat; import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat; import org.apache.hadoop.util.Tool; import org.apache.hadoop.util.ToolRunner; import java.io.IOException; /** * @Author: cxx * 用户购买列表 * @Date: 2018/4/9 11:46 * 原始数据**************** * 10001 20001 1 * 10001 20002 1 * 计算结果**************** * 10001 20001,20005,20006,20007,20002 * 10002 20006,20003,20004 */ public class UserByList extends Configured implements Tool{ //mapper public static class UserByListMapper extends Mapper{ @Override protected void map(LongWritable key,Text value,Context context) throws IOException, InterruptedException { String[] strs = value.toString().split("\t"); context.write(new Text(strs[0].trim()),new Text(strs[1].trim())); } } //reduce public static class UserByListReduce extends Reducer { @Override protected void reduce(Text key,Iterable values,Context context) throws IOException, InterruptedException { StringBuilder sb = new StringBuilder(); for (Text value:values){ sb.append(value.toString()).append(","); } String result = sb.substring(0,sb.length()-1); context.write(key,new Text(result)); } } @Override public int run(String[] strings) throws Exception { Configuration conf = getConf(); Path input=new Path("/grms/rawdata/matrix.txt"); Path output=new Path("/grms/rawdata/222"); Job job = Job.getInstance(conf,this.getClass().getSimpleName()); job.setJarByClass(this.getClass()); job.setMapperClass(UserByListMapper.class); job.setMapOutputKeyClass(Text.class); job.setMapOutputValueClass(Text.class); job.setInputFormatClass(TextInputFormat.class); TextInputFormat.addInputPath(job,input); job.setReducerClass(UserByListReduce.class); job.setOutputKeyClass(Text.class); job.setOutputValueClass(Text.class); job.setOutputFormatClass(TextOutputFormat.class); TextOutputFormat.setOutputPath(job,output); return job.waitForCompletion(true)?0:1; } public static void main(String[] args) throws Exception { System.exit(ToolRunner.run(new UserByList(),args)); } }

基于Hadoop的个性化图书推荐系统基于hadoop的推荐系统论文_用户购买行为_02