
Research on Classification Algorithms Based on Logistics Information and Their Application

Published: 2018-07-10 08:24

  Topic: logistics + data mining; Source: Beijing University of Posts and Telecommunications, master's thesis, 2015


[Abstract]: In recent years, advances in information technology have driven the adoption of information systems in enterprise logistics management, and the volume of data that enterprises store has grown explosively. Treating this data as a resource and applying data mining to logistics management, with a focus on mining techniques built on logistics information and their applications, helps enterprises raise operating efficiency, cut costs, and make timely decisions. It has become an effective way to improve competitiveness.

This thesis studies the K-nearest-neighbor (KNN) algorithm, one of the classification algorithms in data mining. After reviewing the core idea of classical KNN and the state of research on it, the thesis identifies two shortcomings: (1) The traditional algorithm assumes that every attribute of a sample matters equally for classification, so irrelevant attributes cause misclassification and lower accuracy. (2) To select the neighbors of a sample to be classified, the traditional algorithm computes its distance to every training sample, which is computationally expensive and leaves the result sensitive to noisy samples, hurting both efficiency and accuracy.

Two improvement strategies address these shortcomings:

(1) An improved algorithm based on attribute reduction, which reflects the differing influence of attributes on the classification result. The algorithm uses information entropy to compute a correlation coefficient between each condition attribute and the decision attribute, distinguishes the condition attributes by their importance to classification, and reduces the attribute set by adjusting a threshold on the correlation coefficient. Numerical analysis shows that the improved algorithm raises classification accuracy to some extent.

(2) An improved algorithm based on cluster-based sample pruning, intended to handle massive data sets and reduce time complexity. The algorithm uses hierarchical clustering to fix the initial cluster centers for K-means, so that random initialization does not affect the clustering result. K-means is then used to refine the hierarchical clustering result, and a representative sample set is selected from the clusters for the classification tests. Simulation experiments show that, with this pruning, the improved algorithm reduces the classifier's computation and improves classification efficiency while improving or maintaining classification accuracy.

Finally, building on this work, the thesis designs an improved KNN collaborative filtering recommendation model. The model is applied to rating data for logistics routes in Beijing to verify its effectiveness and feasibility on a practical problem. Experiments show that the improved algorithm's recommendations are significantly more accurate and that the model helps customers quickly find a suitable logistics company among a large volume of specialized information, which demonstrates its practical applicability.
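To make strategy (1) concrete, the following is a minimal sketch of entropy-based attribute reduction feeding a weighted KNN vote. The abstract does not give the thesis's correlation formula, so symmetric uncertainty, 2·I(X;Y) / (H(X) + H(Y)), is swapped in here as an assumed stand-in. The function names, the mismatch distance, the threshold value, and the toy data are likewise illustrative assumptions, not the author's implementation.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetric_uncertainty(xs, ys):
    """Assumed relevance score in [0, 1]: 2 * I(X;Y) / (H(X) + H(Y))."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(xs, ys)))
    return 2 * mutual_info / (hx + hy)

def reduce_attributes(rows, labels, threshold):
    """Return {attribute index: score} for attributes scoring >= threshold."""
    scores = {j: symmetric_uncertainty([r[j] for r in rows], labels)
              for j in range(len(rows[0]))}
    return {j: s for j, s in scores.items() if s >= threshold}

def knn_predict(rows, labels, query, weights, k=3):
    """Majority vote over the k nearest rows under a weighted mismatch distance."""
    def dist(r):
        # Only retained attributes contribute, each scaled by its score.
        return sum(w for j, w in weights.items() if r[j] != query[j])
    nearest = sorted(range(len(rows)), key=lambda i: dist(rows[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy data: attribute 0 determines the label, attribute 1 is noise.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"),
        ("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["pos", "pos", "neg", "neg", "pos", "pos", "neg", "neg"]

weights = reduce_attributes(rows, labels, threshold=0.1)
prediction = knn_predict(rows, labels, ("b", "y"), weights)
```

On this toy set the noise attribute carries no information about the label, so it falls below the threshold and no longer influences the neighbor search. Raising or lowering `threshold` controls how aggressively attributes are dropped, which mirrors the threshold adjustment the abstract describes.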
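Strategy (2) can be sketched in the same spirit: hierarchical clustering supplies the initial centers, K-means refines them, and only representative samples are kept for KNN. The abstract does not say how representatives are chosen or whether clustering is done per class. The sketch below assumes per-class clustering, centroid linkage, and keeping the real sample nearest to each refined center. All names and data are hypothetical, and the quadratic merge search is only suitable for toy inputs.

```python
import math
from collections import Counter

def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def agglomerative_centers(points, n_clusters):
    """Centroid-linkage agglomerative clustering; returns the cluster centroids."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Merge the two clusters whose centroids are closest.
        _, i, j = min((math.dist(mean(a), mean(b)), i, j)
                      for i, a in enumerate(clusters)
                      for j, b in enumerate(clusters) if i < j)
        clusters[i] += clusters.pop(j)
    return [mean(c) for c in clusters]

def kmeans(points, centers, max_iter=100):
    """Lloyd's K-means started from the given centers (no random initialization)."""
    for _ in range(max_iter):
        assign = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign

def prune(points, labels, clusters_per_class):
    """Keep, per class, the real sample nearest to each refined cluster center."""
    kept_points, kept_labels = [], []
    for label in sorted(set(labels)):
        members = [p for p, l in zip(points, labels) if l == label]
        seeds = agglomerative_centers(members, clusters_per_class)
        centers, _ = kmeans(members, seeds)
        for center in centers:
            kept_points.append(min(members, key=lambda p: math.dist(p, center)))
            kept_labels.append(label)
    return kept_points, kept_labels

def knn_predict(points, labels, query, k=1):
    """Majority vote over the k training points nearest to the query."""
    nearest = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy data: each class forms two well-separated blobs.
blob = [(0, 0), (0, 1), (1, 0), (1, 1)]
points = ([(x, y) for x, y in blob] + [(x + 10, y) for x, y in blob] +
          [(x, y + 10) for x, y in blob] + [(x + 10, y + 10) for x, y in blob])
labels = ["A"] * 8 + ["B"] * 8

kept_points, kept_labels = prune(points, labels, clusters_per_class=2)
```

Here the 16 training samples are pruned to 4, so each query needs 4 distance computations instead of 16. Seeding K-means from the hierarchical result also makes the outcome deterministic, which is the motivation the abstract gives for combining the two clustering methods.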
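[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year conferred]: 2015
[Classification number]: TP311.13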





Article link: https://www.wllwen.com/guanlilunwen/wuliuguanlilunwen/2112814.html

