Using the mRMR feature selection algorithm (feature_selection)

 

mrmr

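Download address of the program. A Java environment must be installed on the local machine; for the installation steps, search Baidu or Google.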

A program (Python) that uses mRMR to select features from a feature set

# import the necessary packages
import csv  # used to write the csv file
import pandas as pd
import numpy as np
import re
import os  # used to call external programs

# change the working directory
os.chdir("XXX")
# input path name
datapath = "XXX"
# output path name
outputpath = "XXX"

"""
    mrmr and svm
"""

# read csv data from path
train_data = pd.read_csv(datapath, header=None, index_col=None)
X = np.array(train_data)
# label the first half of the rows 1 and the second half 0
half = len(train_data) // 2
Y = np.array([1] * half + [0] * half)
Y = Y.reshape(-1, 1)

# concatenate class and data
full_csv_with_class = np.concatenate([Y, X], axis=1)
print(full_csv_with_class)

# print the shapes of the original csv data and the final full data
print("the shape of data:" + str(X.shape))
print("the shape of data and class:" + str(full_csv_with_class.shape))

# generate placeholder headers
columns = ["class"]
columns_numbers = np.arange(full_csv_with_class.shape[1] - 1)
columns.extend(columns_numbers)

# write data into the output file
with open(outputpath, 'w', newline='') as csvFile2:
    writer = csv.writer(csvFile2)
    writer.writerow(columns)
    for row in full_csv_with_class:
        writer.writerow(row)
[[ 1.     1.     1.    ...,  0.     1.     0.075]
 [ 1.     0.     0.    ...,  1.     1.     0.1  ]
 [ 1.     1.     0.    ...,  1.     0.     0.175]
 ...,
 [ 0.     0.     0.    ...,  1.     1.     0.075]
 [ 0.     0.     0.    ...,  0.     1.     0.025]
 [ 0.     0.     0.    ...,  0.     1.     0.05 ]]
the shape of data:(2260, 200)
the shape of data and class:(2260, 201)
os.system("./mRMR/mrmr -i " + outputpath + " -n 200 >mRMR/output.mrmrout")
print("complete ")
complete  
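# read the file
final_set = []
location_mark = 0
with open("mRMR/output.mrmrout", 'r') as fn:
    for line in fn:
        # a blank line ends the current block
        if line.strip() == "":
            location_mark = 0
        # inside the mRMR block, skip the header row and keep the "Fea" column
        if location_mark == 1 and line.split()[1] != "Fea":
            final_set.append(int(line.split()[1]))
        # the mRMR feature block starts after this header line
        if re.findall(r"mRMR", line) and re.findall(r"feature", line):
            location_mark = 1
print(final_set)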
[133, 135, 140, 130, 145, 110, 115, 105, 120, 125, 150, 102, 185, 190, 180, 195, 100, 160, 165, 155, 170, 175, 101, 5, 85, 95, 98, 90, 99, 200, 177, 33, 50, 14, 8, 149, 109, 94, 121, 134, 113, 84, 21, 156, 71, 31, 6, 59, 189, 158, 122, 176, 58, 46, 64, 188, 10, 1, 38, 184, 19, 138, 2, 159, 81, 181, 44, 199, 26, 63, 82, 45, 148, 114, 172, 183, 32, 7, 48, 131, 146, 163, 83, 39, 49, 171, 80, 132, 197, 77, 88, 56, 9, 157, 198, 75, 164, 147, 70, 76, 196, 27, 182, 25, 96, 127, 13, 57, 126, 65, 107, 34, 108, 60, 139, 69, 55, 89, 30, 35, 40, 106, 20, 15, 104, 97, 111, 18, 103, 41, 78, 116, 61, 192, 3, 43, 67, 23, 118, 191, 4, 11, 194, 119, 66, 17, 87, 137, 136, 167, 141, 53, 117, 154, 28, 86, 42, 151, 52, 74, 68, 193, 51, 22, 179, 153, 62, 186, 152, 169, 12, 161, 129, 112, 166, 93, 47, 79, 162, 128, 29, 16, 143, 36, 187, 168, 144, 73, 124, 91, 54, 174, 178, 24, 173, 37, 142, 72, 123, 92] 
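The parsing loop above can also be wrapped in a function so that it can be checked without running mrmr. The following is only a sketch: the name parse_mrmr_ranking is hypothetical, and the sample text is made up to mirror the layout that the loop above assumes (a header line containing "mRMR" and "feature", a column-title row whose second field is "Fea", data rows, then a blank line). It is not real mrmr output.

```python
import re

def parse_mrmr_ranking(text):
    """Collect the second field of every data row in the mRMR feature block."""
    in_block = False
    ranking = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            in_block = False  # a blank line ends the current block
        elif in_block and len(tokens) > 1 and tokens[1] != "Fea":
            ranking.append(int(tokens[1]))
        if re.search(r"mRMR", line) and re.search(r"feature", line):
            in_block = True  # data rows start after this header line
    return ranking

# Made-up sample that only imitates the layout assumed by the parser
sample = """\
*** mRMR features ***
Order Fea Name Score
1 133 133 0.120
2 135 135 0.095

some unrelated trailing text
"""

print(parse_mrmr_ranking(sample))
```

Compared with the inline loop, the extra len(tokens) > 1 check avoids an IndexError on one-word lines inside the block.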
precision_copy = 0
recall_copy = 0
SN_copy = 0
SP_copy = 0
GM_copy = 0
TP_copy = 0
TN_copy = 0
FP_copy = 0
FN_copy = 0
ACC_copy = 0
F1_Score_copy = 0
F_measure_copy = 0
MCC_copy = 0
pos_copy = 0
neg_copy = 0
y_pred_prob_copy = []
y_pred_copy = []

Key statement:

os.system("./mRMR/mrmr -i "+outputpath+" -n 200 >mRMR/output.mrmrout")

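– ./mRMR/mrmr is the executable, i.e. the program downloaded from the GitHub link at the top

– -i outputpath is the path of the CSV file written by the script above, which mrmr reads as its input, i.e. the original feature set (its format is explained below)

– -n 200 selects 200 features, listed in order of their score

– >mRMR/output.mrmrout is the file that receives the output (shown below)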
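As an alternative to building the shell command by string concatenation, the same call can be sketched with subprocess, which passes the arguments as a list and so is not affected by spaces in the path. The helper names build_mrmr_command and run_mrmr and the file name data_with_class.csv are hypothetical; only the -i and -n flags and the paths ./mRMR/mrmr and mRMR/output.mrmrout come from the statement above.

```python
import subprocess

def build_mrmr_command(csv_path, n_features, binary="./mRMR/mrmr"):
    """Return the argument list for the mrmr call described above."""
    return [binary, "-i", csv_path, "-n", str(n_features)]

def run_mrmr(csv_path, n_features, out_path="mRMR/output.mrmrout"):
    """Run mrmr and write its standard output to out_path."""
    with open(out_path, "w") as out_file:
        # check=True raises CalledProcessError if mrmr exits with a non-zero status
        subprocess.run(build_mrmr_command(csv_path, n_features),
                       stdout=out_file, check=True)

# Only the command is printed here; run_mrmr needs the mrmr binary to be present
print(build_mrmr_command("data_with_class.csv", 200))
```

Unlike os.system, this version fails with an exception if mrmr does not exit cleanly, instead of printing "complete" regardless.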

output.mrmrout


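The CSV format needs special mention: the class label must be in the first column, and the file must have a header row of column labels (the class column is mandatory).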
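The layout described here can be illustrated with a few rows. This is a minimal sketch using only the standard library; the function name write_mrmr_csv and the toy values are made up for illustration, and the header names 0, 1, 2, ... follow the placeholder headers used in the script above.

```python
import csv
import io

def write_mrmr_csv(features, labels, out_file):
    """Write a header row, then one row per sample with the class label first."""
    n_features = len(features[0])
    writer = csv.writer(out_file)
    # Header: "class" followed by placeholder names 0..n_features-1
    writer.writerow(["class"] + list(range(n_features)))
    for label, row in zip(labels, features):
        writer.writerow([label] + list(row))

# Toy data (made up): two samples of class 1, two of class 0, three features each
toy_features = [[1, 0, 0.075], [0, 1, 0.1], [0, 0, 0.05], [1, 1, 0.025]]
toy_labels = [1, 1, 0, 0]

buffer = io.StringIO()
write_mrmr_csv(toy_features, toy_labels, buffer)
print(buffer.getvalue())
```

The first printed line is the header class,0,1,2 and every following line starts with the class label of that sample.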


    [133, 135, 140, 130, 145, 110, 115, 105, 120, 125, 150, 102, 185, 190, 180, 195, 100, 160, 165, 155, 170, 175, 101, 5, 85, 95, 98, 90, 99, 200, 177, 33, 50, 14, 8, 149, 109, 94, 121, 134, 113, 84, 21, 156, 71, 31, 6, 59, 189, 158, 122, 176, 58, 46, 64, 188, 10, 1, 38, 184, 19, 138, 2, 159, 81, 181, 44, 199, 26, 63, 82, 45, 148, 114, 172, 183, 32, 7, 48, 131, 146, 163, 83, 39, 49, 171, 80, 132, 197, 77, 88, 56, 9, 157, 198, 75, 164, 147, 70, 76, 196, 27, 182, 25, 96, 127, 13, 57, 126, 65, 107, 34, 108, 60, 139, 69, 55, 89, 30, 35, 40, 106, 20, 15, 104, 97, 111, 18, 103, 41, 78, 116, 61, 192, 3, 43, 67, 23, 118, 191, 4, 11, 194, 119, 66, 17, 87, 137, 136, 167, 141, 53, 117, 154, 28, 86, 42, 151, 52, 74, 68, 193, 51, 22, 179, 153, 62, 186, 152, 169, 12, 161, 129, 112, 166, 93, 47, 79, 162, 128, 29, 16, 143, 36, 187, 168, 144, 73, 124, 91, 54, 174, 178, 24, 173, 37, 142, 72, 123, 92] 

These numbers are the ranking of the feature dimensions extracted from mRMR/output.mrmrout.

Readers can take progressively more features in this ranked order to search for the best-performing feature subset.
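One way to carry out this search is to score the top 1, top 2, ... features of the ranking in turn and keep the best prefix. The sketch below is an assumption about how that could look, not the procedure used in the original program: best_prefix and toy_score are hypothetical names, and in practice the scoring function would be something like the cross-validated accuracy of the SVM on the selected columns.

```python
def best_prefix(ranking, score_fn):
    """Score every prefix of the ranking and return the best one with its score."""
    best_score, best_features = float("-inf"), []
    for k in range(1, len(ranking) + 1):
        candidate = ranking[:k]
        score = score_fn(candidate)
        if score > best_score:  # strict ">" keeps the smallest subset on ties
            best_score, best_features = score, candidate
    return best_features, best_score

# Toy scoring function (made up): pretend only features 133, 135 and 140 help,
# and every selected feature costs a small penalty.
def toy_score(features):
    useful = {133, 135, 140}
    return len(useful & set(features)) - 0.01 * len(features)

ranking = [133, 135, 140, 130, 145]
features, score = best_prefix(ranking, toy_score)
print(features, score)
```

With the toy scoring function the search stops improving after the third feature, so the returned subset is the first three entries of the ranking.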

To reiterate: the addresses of the mrmr program and of the feature extraction program.
