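In an earlier stage I studied flesh-color grading of Atlantic salmon. Analysis of the color features revealed several that are linearly related to grade, so these features are used here for multiple linear regression. The usual ways of estimating regression parameters are least squares and iterative methods; this article optimizes the parameters with least squares and with simulated annealing. The algorithms themselves are covered in any introductory machine-learning reference and are not repeated here. The training set has 101 samples with 3 features each.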
1. Least squares
% Linear regression by least squares
% data = [x1, x2, ..., xn, y]
function theta = mylinearregress(data)
    [m, n] = size(data);
    one = ones(m,1);
    y = data(:,n);                  % last column is the response
    X = [one, data(:,1:n-1)];       % design matrix with an intercept column
    theta = (X'*X)\(X'*y);          % theta holds the linear model coefficients
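For reference, the closed-form solution computed in the last line of mylinearregress follows from minimizing the squared-error cost. The derivation below is a standard sketch rather than something taken from the salmon study: the symbol J is introduced here for illustration, X, y and theta match the variables in the code, and X'X is assumed to be invertible.

```latex
\begin{aligned}
J(\theta) &= \tfrac{1}{2}\,(X\theta - y)^{\top}(X\theta - y) \\
\nabla_{\theta} J(\theta) &= X^{\top}(X\theta - y) = 0 \\
X^{\top}X\,\theta &= X^{\top}y \\
\theta &= (X^{\top}X)^{-1}X^{\top}y
\end{aligned}
```

The backslash operator solves the third line directly, which avoids forming the inverse explicitly.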
2. Simulated annealing
% Linear regression: gradient steps with simulated-annealing acceptance
% Model: h = theta(1) + theta(2)*x1 + theta(3)*x2 + theta(4)*x3
% data = [x1, x2, ..., xn, y]
function theta = LMS_SA(data)
    [m, n] = size(data);
    X = [ones(m,1), data(:,1:n-1)];            % design matrix with an intercept column
    y = data(:,n);
    theta0 = ones(n,1);                        % initial guess, one coefficient per column of X
    H0 = 0.5*(X*theta0 - y)'*(X*theta0 - y);   % half the sum of squared errors
    alpha = 0.0000001;                         % step size
    T = 1;                                     % initial temperature
    T_min = 0.01;                              % stopping temperature
    while T > T_min
        t0 = 0;                                % candidates rejected at this temperature
        t1 = 0;                                % candidates tried at this temperature
        while t1 < 100 && t0 < 1000
            t = X'*(y - X*theta0);             % negative gradient of H at theta0
            theta1 = theta0 + alpha*t;
            H1 = 0.5*(X*theta1 - y)'*(X*theta1 - y);
            if H1 - H0 < 0
                theta0 = theta1;               % always accept an improvement
                H0 = H1;
            elseif exp(-(H1 - H0)/T) > rand    % accept a worse point with probability exp(-(H1-H0)/T)
                theta0 = theta1;
                H0 = H1;
            else
                t0 = t0 + 1;
            end
            t1 = t1 + 1;
        end
        T = 0.99*T;                            % geometric cooling
    end
    theta = theta0;
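In general, simulated annealing finds only an approximately optimal solution, so the regression model it produces predicts less accurately than the least-squares model. Even so, it holds a considerable advantage over using gradient descent alone.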
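The procedure in LMS_SA can be summarized in symbols, which makes the role of the temperature easier to see. This is a sketch of the standard Metropolis acceptance rule for a minimization problem, written with the same names as the code (H for the cost, alpha for the step size, T for the temperature); the primed symbol for the candidate point is notation introduced here.

```latex
\begin{aligned}
H(\theta) &= \tfrac{1}{2}\,(X\theta - y)^{\top}(X\theta - y) \\
\theta' &= \theta + \alpha\, X^{\top}(y - X\theta) \\
P(\text{accept } \theta') &=
\begin{cases}
1, & H(\theta') < H(\theta) \\[4pt]
\exp\!\left(-\dfrac{H(\theta') - H(\theta)}{T}\right), & \text{otherwise}
\end{cases} \\
T &\leftarrow 0.99\,T
\end{aligned}
```

The exponent is negative for a worse candidate, so the acceptance probability lies between 0 and 1 and shrinks as T decreases; at low temperature the search behaves almost like plain descent.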
3. Mixing C# and MATLAB
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Collections;
using matPrj;
using MathWorks.MATLAB.NET.Arrays;
using MathWorks.MATLAB.NET.Utility;

namespace myLinearRegression
{
    class Program
    {
        static void Main(string[] args)
        {
            myMathclass myfun = new myMathclass();
            StreamReader objReader = new StreamReader("test.txt");
            string sLine = "";
            double[] data = new double[404];   // 101 samples x (3 features + 1 response)
            int count = 0;
            // Read the tab-separated values into a flat array
            while ((sLine = objReader.ReadLine()) != null)
            {
                foreach (string str in sLine.Split('\t'))
                {
                    data[count] = Convert.ToDouble(str);
                    count++;
                }
            }
            objReader.Close();

            double[] theta = new double[4];
            MWNumericArray m_data = new MWNumericArray(101, 4, data);
            MWArray[] output = new MWArray[1];
            MWArray[] input = new MWArray[1] { m_data };
            // Call the compiled MATLAB function, requesting one output argument
            myfun.mylinearregress(1, ref output, input);
            MWNumericArray x = output[0] as MWNumericArray;
            theta = (double[])x.ToArray();
        }
    }
}
Compile the .m file into a .DLL in MATLAB, then place that file, together with ManagedCPPAPI.netmodule and MWArray.dll, in the C# project's bin\debug directory.