Machine Learning: Regularized Logistic Regression (MATLAB Implementation)
Published: 2019-05-26


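This post reuses the earlier problem: predicting whether a student is admitted from their scores in two semesters. This time the logistic regression is regularized.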

Main script:

X = load('ex4x.dat');
y = load('ex4y.dat');
plotData(X, y);

[m, n] = size(X);
X = [ones(m, 1), X];   % prepend the intercept column
lambda = 1;            % regularization strength

%[cost,grad] = costFunction(theta,X,y,lambda);
%fprintf('Cost at initial theta (zeros): %f\n', cost);

init_theta = zeros(n + 1, 1);
options = optimset('GradObj', 'on', 'MaxIter', 400);
f = @(t)(costFunction(t, X, y, lambda));
[theta, J, exit_flag] = fminunc(f, init_theta, options);

% Plot Boundary
plotDecisionBoundary(theta, X, y);
hold on;
title(sprintf('lambda = %g', lambda))

% Labels and Legend (the data are semester scores, not microchip tests)
xlabel('Semester 1 score')
ylabel('Semester 2 score')
legend('y = 1', 'y = 0', 'Decision boundary')
hold off;

% Compute accuracy on our training set
p = predict(theta, X);
fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);

Plotting the raw distribution of the two semester scores:

function plotData(X, y)
    figure;
    hold on;
    pos = find(y == 1);
    neg = find(y == 0);
    plot(X(pos, 1), X(pos, 2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);
    plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);
    legend('y == 1', 'y == 0');
    hold off;
end

Cost function:
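The formula that belongs here did not survive in the text, so the following is reconstructed from what costFunction computes further down. The indexing is an assumption for readability: $\theta_0$ is the intercept (theta(1) in the MATLAB code), $n$ is the number of features, and $h_\theta$ is the sigmoid hypothesis.

```latex
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}

J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \Big[ -y^{(i)} \log h_\theta\big(x^{(i)}\big)
            - \big(1 - y^{(i)}\big) \log\big(1 - h_\theta(x^{(i)})\big) \Big]
            + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^{2}
```

The penalty sum starts at $j = 1$, which corresponds to theta(2:end) in the code, so the intercept is not penalized.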

Gradient (regularized; theta0 is excluded from regularization):
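As a sketch of what the loop in costFunction evaluates, the partial derivatives can be written as below. The notation is assumed rather than taken from the post: $x_0^{(i)} = 1$ is the intercept column, and $\theta_0$ corresponds to theta(1) in the code.

```latex
\frac{\partial J}{\partial \theta_0}
  = \frac{1}{m} \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big)\, x_0^{(i)}

\frac{\partial J}{\partial \theta_j}
  = \frac{1}{m} \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big)\, x_j^{(i)}
    + \frac{\lambda}{m}\, \theta_j
  \qquad (j \ge 1)
```

The only difference between the two cases is the extra $\frac{\lambda}{m}\theta_j$ term, which is why the code branches on j == 1.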

function [J, grad] = costFunction(theta, X, y, lambda)
  m = length(y);
  sig = @(z) 1 ./ (1 + exp(-z));   % sigmoid as an anonymous function
  h = sig(X * theta);              % predicted probabilities, m x 1
  grad = zeros(size(theta));
  % Cross-entropy cost plus L2 penalty; theta(1), the intercept, is not penalized
  J = 1/m * sum(-y .* log(h) - (1 - y) .* log(1 - h)) + lambda/(2*m) * sum(theta(2:end).^2);
  for j = 1:numel(theta)
    if j == 1
      grad(j) = 1/m * (h - y)' * X(:, j);
    else
      grad(j) = 1/m * (h - y)' * X(:, j) + lambda/m * theta(j);
    end
  end
end

The plotting function handles several cases (only the simplest one, a straight-line boundary, is used here):

function plotDecisionBoundary(theta, X, y)
%PLOTDECISIONBOUNDARY Plots the data points X and y into a new figure with
%the decision boundary defined by theta
%   PLOTDECISIONBOUNDARY(theta, X,y) plots the data points with + for the
%   positive examples and o for the negative examples. X is assumed to be
%   either
%   1) Mx3 matrix, where the first column is an all-ones column for the
%      intercept.
%   2) MxN, N>3 matrix, where the first column is all-ones

    % Plot Data
    plotData(X(:,2:3), y);
    hold on

    if size(X, 2) <= 3
        % Only need 2 points to define a line, so choose two endpoints
        plot_x = [min(X(:,2))-2,  max(X(:,2))+2];

        % Calculate the decision boundary line
        plot_y = (-1./theta(3)).*(theta(2).*plot_x + theta(1));

        % Plot, and adjust axes for better viewing
        plot(plot_x, plot_y)

        % Legend, specific for the exercise
        legend('Admitted', 'Not admitted', 'Decision Boundary')
        axis([10, 70, 30, 100])
    else
        % Here is the grid range
        u = linspace(-1, 1.5, 50);
        v = linspace(-1, 1.5, 50);

        z = zeros(length(u), length(v));
        % Evaluate z = theta*x over the grid
        % (mapFeature is not listed in this post; this branch is unused here)
        for i = 1:length(u)
            for j = 1:length(v)
                z(i,j) = mapFeature(u(i), v(j))*theta;
            end
        end
        z = z'; % important to transpose z before calling contour

        % Plot z = 0
        % Notice you need to specify the range [0, 0]
        contour(u, v, z, [0, 0], 'LineWidth', 2)
    end
    hold off
end
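For the linear case, the plot_y line in the function above can be read as solving the boundary equation for the second feature. The derivation below is a sketch in 0-based notation, which is an assumption for readability: $\theta_0, \theta_1, \theta_2$ correspond to theta(1), theta(2), theta(3), and it requires $\theta_2 \neq 0$.

```latex
h_\theta(x) \ge 0.5
  \iff \theta^{T} x \ge 0
  \iff \theta_0 + \theta_1 x_1 + \theta_2 x_2 \ge 0

\text{Boundary: } \theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0
  \;\Longrightarrow\;
  x_2 = -\frac{\theta_1 x_1 + \theta_0}{\theta_2}
```

Because the boundary is a straight line, evaluating $x_2$ at two values of $x_1$ is enough to draw it, which is what the two-element plot_x does.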

Prediction:

function p = predict(theta, X)
    sig = @(z) 1 ./ (1 + exp(-z));   % sigmoid as an anonymous function
    p = sig(X * theta) >= 0.5;       % classify as 1 when the probability is at least 0.5
end

Reference blog:

Data source:
