team-learning-program/RLanguage/_book/task-04.html
Yangzhuoran Yang dc97b19112 update book
2021-08-08 09:26:20 +08:00

<!DOCTYPE html>
<html lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<title>Chapter 4 Data Visualization | R Data Analysis Team Learning</title>
<meta name="description" content="Chapter 4 Data Visualization | R Data Analysis Team Learning" />
<meta name="generator" content="bookdown 0.22 and GitBook 2.6.7" />
<meta property="og:title" content="Chapter 4 Data Visualization | R Data Analysis Team Learning" />
<meta property="og:type" content="book" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="Chapter 4 Data Visualization | R Data Analysis Team Learning" />
<meta name="author" content="张晋、杨佳达、牧小熊、杨杨卓然、姚昱君" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<link rel="prev" href="task-03.html"/>
<link rel="next" href="task-05.html"/>
<script src="libs/header-attrs-2.9/header-attrs.js"></script>
<script src="libs/jquery-2.2.3/jquery.min.js"></script>
<link href="libs/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-table.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-clipboard.css" rel="stylesheet" />
<link href="libs/anchor-sections-1.0.1/anchor-sections.css" rel="stylesheet" />
<script src="libs/anchor-sections-1.0.1/anchor-sections.js"></script>
<style type="text/css">
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
</head>
<body>
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<ul class="summary">
<li><a href="./">R Data Analysis Team Learning</a></li>
<li class="divider"></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html"><i class="fa fa-check"></i>Welcome!</a>
<ul>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html#贡献者信息"><i class="fa fa-check"></i>Contributors</a></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html#课程简介"><i class="fa fa-check"></i>Course Overview</a></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html#课程大纲"><i class="fa fa-check"></i>Course Outline</a></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html#关于-datawhale"><i class="fa fa-check"></i>About Datawhale</a></li>
</ul></li>
<li class="part"><span><b>I Getting Ready</b></span></li>
<li class="chapter" data-level="" data-path="task-00.html"><a href="task-00.html"><i class="fa fa-check"></i>Learning the Rules and Getting Started with R</a>
<ul>
<li class="chapter" data-level="0.1" data-path="task-00.html"><a href="task-00.html#安装"><i class="fa fa-check"></i><b>0.1</b> Installation</a>
<ul>
<li class="chapter" data-level="0.1.1" data-path="task-00.html"><a href="task-00.html#r"><i class="fa fa-check"></i><b>0.1.1</b> R</a></li>
<li class="chapter" data-level="0.1.2" data-path="task-00.html"><a href="task-00.html#rstudio"><i class="fa fa-check"></i><b>0.1.2</b> RStudio</a></li>
<li class="chapter" data-level="0.1.3" data-path="task-00.html"><a href="task-00.html#r语言程辑包r-package"><i class="fa fa-check"></i><b>0.1.3</b> R Packages</a></li>
</ul></li>
<li class="chapter" data-level="0.2" data-path="task-00.html"><a href="task-00.html#环境配置"><i class="fa fa-check"></i><b>0.2</b> Environment Setup</a>
<ul>
<li class="chapter" data-level="0.2.1" data-path="task-00.html"><a href="task-00.html#项目project"><i class="fa fa-check"></i><b>0.2.1</b> Projects</a></li>
<li class="chapter" data-level="0.2.2" data-path="task-00.html"><a href="task-00.html#用户界面"><i class="fa fa-check"></i><b>0.2.2</b> User Interface</a></li>
<li class="chapter" data-level="0.2.3" data-path="task-00.html"><a href="task-00.html#r-markdown"><i class="fa fa-check"></i><b>0.2.3</b> R Markdown</a></li>
<li class="chapter" data-level="0.2.4" data-path="task-00.html"><a href="task-00.html#帮助"><i class="fa fa-check"></i><b>0.2.4</b> Help</a></li>
</ul></li>
<li class="chapter" data-level="0.3" data-path="task-00.html"><a href="task-00.html#happy-coding"><i class="fa fa-check"></i><b>0.3</b> Happy Coding!</a></li>
<li class="chapter" data-level="" data-path="task-00.html"><a href="task-00.html#本章作者"><i class="fa fa-check"></i>Chapter Authors</a></li>
<li class="chapter" data-level="" data-path="task-00.html"><a href="task-00.html#关于datawhale"><i class="fa fa-check"></i>About Datawhale</a></li>
</ul></li>
<li class="part"><span><b>II Getting to Work</b></span></li>
<li class="chapter" data-level="1" data-path="task-01.html"><a href="task-01.html"><i class="fa fa-check"></i><b>1</b> Data Structures and Datasets</a>
<ul>
<li class="chapter" data-level="1.1" data-path="task-01.html"><a href="task-01.html#准备工作"><i class="fa fa-check"></i><b>1.1</b> Preparation</a></li>
<li class="chapter" data-level="1.2" data-path="task-01.html"><a href="task-01.html#编码基础"><i class="fa fa-check"></i><b>1.2</b> Coding Basics</a>
<ul>
<li class="chapter" data-level="1.2.1" data-path="task-01.html"><a href="task-01.html#算术"><i class="fa fa-check"></i><b>1.2.1</b> Arithmetic</a></li>
<li class="chapter" data-level="1.2.2" data-path="task-01.html"><a href="task-01.html#赋值"><i class="fa fa-check"></i><b>1.2.2</b> Assignment</a></li>
<li class="chapter" data-level="1.2.3" data-path="task-01.html"><a href="task-01.html#函数"><i class="fa fa-check"></i><b>1.2.3</b> Functions</a></li>
<li class="chapter" data-level="1.2.4" data-path="task-01.html"><a href="task-01.html#循环loop"><i class="fa fa-check"></i><b>1.2.4</b> Loops</a></li>
<li class="chapter" data-level="1.2.5" data-path="task-01.html"><a href="task-01.html#管道pipe"><i class="fa fa-check"></i><b>1.2.5</b> Pipes</a></li>
</ul></li>
<li class="chapter" data-level="1.3" data-path="task-01.html"><a href="task-01.html#数据类型"><i class="fa fa-check"></i><b>1.3</b> Data Types</a>
<ul>
<li class="chapter" data-level="1.3.1" data-path="task-01.html"><a href="task-01.html#基础数据类型"><i class="fa fa-check"></i><b>1.3.1</b> Basic Data Types</a></li>
<li class="chapter" data-level="1.3.2" data-path="task-01.html"><a href="task-01.html#向量vector"><i class="fa fa-check"></i><b>1.3.2</b> Vectors</a></li>
<li class="chapter" data-level="1.3.3" data-path="task-01.html"><a href="task-01.html#特殊数据类型"><i class="fa fa-check"></i><b>1.3.3</b> Special Data Types</a></li>
</ul></li>
<li class="chapter" data-level="1.4" data-path="task-01.html"><a href="task-01.html#多维数据类型"><i class="fa fa-check"></i><b>1.4</b> Multidimensional Data Types</a>
<ul>
<li class="chapter" data-level="1.4.1" data-path="task-01.html"><a href="task-01.html#矩阵matrix"><i class="fa fa-check"></i><b>1.4.1</b> Matrices</a></li>
<li class="chapter" data-level="1.4.2" data-path="task-01.html"><a href="task-01.html#列表list"><i class="fa fa-check"></i><b>1.4.2</b> Lists</a></li>
<li class="chapter" data-level="1.4.3" data-path="task-01.html"><a href="task-01.html#数据表data-frame-与-tibble"><i class="fa fa-check"></i><b>1.4.3</b> Data Frames and Tibbles</a></li>
</ul></li>
<li class="chapter" data-level="1.5" data-path="task-01.html"><a href="task-01.html#读写数据"><i class="fa fa-check"></i><b>1.5</b> Reading and Writing Data</a>
<ul>
<li class="chapter" data-level="1.5.1" data-path="task-01.html"><a href="task-01.html#内置数据集"><i class="fa fa-check"></i><b>1.5.1</b> Built-in Datasets</a></li>
<li class="chapter" data-level="1.5.2" data-path="task-01.html"><a href="task-01.html#表格类型数据csv-excel"><i class="fa fa-check"></i><b>1.5.2</b> Tabular Data (csv, Excel)</a></li>
<li class="chapter" data-level="1.5.3" data-path="task-01.html"><a href="task-01.html#r的专属类型数据rdata-rds"><i class="fa fa-check"></i><b>1.5.3</b> R-Specific Formats (RData, rds)</a></li>
<li class="chapter" data-level="1.5.4" data-path="task-01.html"><a href="task-01.html#其他软件spss-stata-sas"><i class="fa fa-check"></i><b>1.5.4</b> Other Software (SPSS, Stata, SAS)</a></li>
</ul></li>
<li class="chapter" data-level="1.6" data-path="task-01.html"><a href="task-01.html#练习题"><i class="fa fa-check"></i><b>1.6</b> Exercises</a>
<ul>
<li class="chapter" data-level="1.6.1" data-path="task-01.html"><a href="task-01.html#了解数据集"><i class="fa fa-check"></i><b>1.6.1</b> Exploring a Dataset</a></li>
<li class="chapter" data-level="1.6.2" data-path="task-01.html"><a href="task-01.html#创造数据集"><i class="fa fa-check"></i><b>1.6.2</b> Creating a Dataset</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="task-01.html"><a href="task-01.html#本章作者-1"><i class="fa fa-check"></i>Chapter Authors</a></li>
<li class="chapter" data-level="" data-path="task-01.html"><a href="task-01.html#关于datawhale-1"><i class="fa fa-check"></i>About Datawhale</a></li>
</ul></li>
<li class="chapter" data-level="2" data-path="task-02.html"><a href="task-02.html"><i class="fa fa-check"></i><b>2</b> Data Cleaning and Preparation</a>
<ul>
<li class="chapter" data-level="" data-path="task-02.html"><a href="task-02.html#环境配置-1"><i class="fa fa-check"></i>Environment Setup</a></li>
<li class="chapter" data-level="" data-path="task-02.html"><a href="task-02.html#案例数据"><i class="fa fa-check"></i>Case Data</a>
<ul>
<li class="chapter" data-level="" data-path="task-02.html"><a href="task-02.html#数据集1-h1n1流感问卷数据集"><i class="fa fa-check"></i>Dataset 1: H1N1 Flu Survey Dataset</a></li>
<li class="chapter" data-level="" data-path="task-02.html"><a href="task-02.html#数据集2-波士顿房价数据集"><i class="fa fa-check"></i>Dataset 2: Boston Housing Dataset</a></li>
</ul></li>
<li class="chapter" data-level="2.1" data-path="task-02.html"><a href="task-02.html#重复值处理"><i class="fa fa-check"></i><b>2.1</b> Handling Duplicates</a></li>
<li class="chapter" data-level="2.2" data-path="task-02.html"><a href="task-02.html#缺失值识别与处理"><i class="fa fa-check"></i><b>2.2</b> Identifying and Handling Missing Values</a>
<ul>
<li class="chapter" data-level="2.2.1" data-path="task-02.html"><a href="task-02.html#缺失值识别"><i class="fa fa-check"></i><b>2.2.1</b> Identifying Missing Values</a></li>
<li class="chapter" data-level="2.2.2" data-path="task-02.html"><a href="task-02.html#缺失值处理"><i class="fa fa-check"></i><b>2.2.2</b> Handling Missing Values</a></li>
</ul></li>
<li class="chapter" data-level="2.3" data-path="task-02.html"><a href="task-02.html#异常值识别与处理"><i class="fa fa-check"></i><b>2.3</b> Identifying and Handling Outliers</a>
<ul>
<li class="chapter" data-level="2.3.1" data-path="task-02.html"><a href="task-02.html#异常值识别"><i class="fa fa-check"></i><b>2.3.1</b> Identifying Outliers</a></li>
<li class="chapter" data-level="2.3.2" data-path="task-02.html"><a href="task-02.html#可视化图形分布"><i class="fa fa-check"></i><b>2.3.2</b> Visualizing Distributions</a></li>
<li class="chapter" data-level="2.3.3" data-path="task-02.html"><a href="task-02.html#z-score"><i class="fa fa-check"></i><b>2.3.3</b> z-score</a></li>
<li class="chapter" data-level="2.3.4" data-path="task-02.html"><a href="task-02.html#局部异常因子法"><i class="fa fa-check"></i><b>2.3.4</b> Local Outlier Factor</a></li>
<li class="chapter" data-level="2.3.5" data-path="task-02.html"><a href="task-02.html#异常值处理"><i class="fa fa-check"></i><b>2.3.5</b> Handling Outliers</a></li>
</ul></li>
<li class="chapter" data-level="2.4" data-path="task-02.html"><a href="task-02.html#特征编码"><i class="fa fa-check"></i><b>2.4</b> Feature Encoding</a>
<ul>
<li class="chapter" data-level="2.4.1" data-path="task-02.html"><a href="task-02.html#独热编码哑编码"><i class="fa fa-check"></i><b>2.4.1</b> One-Hot Encoding / Dummy Encoding</a></li>
<li class="chapter" data-level="2.4.2" data-path="task-02.html"><a href="task-02.html#标签编码"><i class="fa fa-check"></i><b>2.4.2</b> Label Encoding</a></li>
<li class="chapter" data-level="2.4.3" data-path="task-02.html"><a href="task-02.html#手动编码"><i class="fa fa-check"></i><b>2.4.3</b> Manual Encoding</a></li>
<li class="chapter" data-level="2.4.4" data-path="task-02.html"><a href="task-02.html#日期特征转换"><i class="fa fa-check"></i><b>2.4.4</b> Date Feature Conversion</a></li>
</ul></li>
<li class="chapter" data-level="2.5" data-path="task-02.html"><a href="task-02.html#规范化与偏态数据"><i class="fa fa-check"></i><b>2.5</b> Normalization and Skewed Data</a>
<ul>
<li class="chapter" data-level="2.5.1" data-path="task-02.html"><a href="task-02.html#规范化"><i class="fa fa-check"></i><b>2.5.1</b> 0-1 Normalization</a></li>
<li class="chapter" data-level="2.5.2" data-path="task-02.html"><a href="task-02.html#z-score标准化"><i class="fa fa-check"></i><b>2.5.2</b> Z-score Standardization</a></li>
<li class="chapter" data-level="2.5.3" data-path="task-02.html"><a href="task-02.html#对数转换log-transform"><i class="fa fa-check"></i><b>2.5.3</b> Log Transform</a></li>
<li class="chapter" data-level="2.5.4" data-path="task-02.html"><a href="task-02.html#box-cox"><i class="fa fa-check"></i><b>2.5.4</b> Box-Cox</a></li>
</ul></li>
<li class="chapter" data-level="2.6" data-path="task-02.html"><a href="task-02.html#小拓展"><i class="fa fa-check"></i><b>2.6</b> Further Notes</a></li>
<li class="chapter" data-level="2.7" data-path="task-02.html"><a href="task-02.html#思考与练习"><i class="fa fa-check"></i><b>2.7</b> Questions and Exercises</a></li>
<li class="chapter" data-level="" data-path="task-02.html"><a href="task-02.html#附录参考资料"><i class="fa fa-check"></i>Appendix: References</a>
<ul>
<li class="chapter" data-level="" data-path="task-02.html"><a href="task-02.html#理论资料"><i class="fa fa-check"></i>Theory</a></li>
<li class="chapter" data-level="" data-path="task-02.html"><a href="task-02.html#r语言函数用法示例"><i class="fa fa-check"></i>R Function Usage Examples</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="task-02.html"><a href="task-02.html#本章作者-2"><i class="fa fa-check"></i>Chapter Authors</a></li>
<li class="chapter" data-level="" data-path="task-02.html"><a href="task-02.html#关于datawhale-2"><i class="fa fa-check"></i>About Datawhale</a></li>
</ul></li>
<li class="chapter" data-level="3" data-path="task-03.html"><a href="task-03.html"><i class="fa fa-check"></i><b>3</b> Basic Statistical Analysis</a>
<ul>
<li class="chapter" data-level="" data-path="task-03.html"><a href="task-03.html#准备工作-1"><i class="fa fa-check"></i>Preparation</a></li>
<li class="chapter" data-level="3.1" data-path="task-03.html"><a href="task-03.html#多种方法获取描述性统计量"><i class="fa fa-check"></i><b>3.1</b> Multiple Ways to Obtain Descriptive Statistics</a>
<ul>
<li class="chapter" data-level="3.1.1" data-path="task-03.html"><a href="task-03.html#基础方法"><i class="fa fa-check"></i><b>3.1.1</b> Base Methods</a></li>
<li class="chapter" data-level="3.1.2" data-path="task-03.html"><a href="task-03.html#拓展包方法"><i class="fa fa-check"></i><b>3.1.2</b> Extension Package Methods</a></li>
</ul></li>
<li class="chapter" data-level="3.2" data-path="task-03.html"><a href="task-03.html#分组计算描述性统计"><i class="fa fa-check"></i><b>3.2</b> Descriptive Statistics by Group</a>
<ul>
<li class="chapter" data-level="3.2.1" data-path="task-03.html"><a href="task-03.html#基础方法-1"><i class="fa fa-check"></i><b>3.2.1</b> Base Methods</a></li>
</ul></li>
<li class="chapter" data-level="3.3" data-path="task-03.html"><a href="task-03.html#频数表和列联表"><i class="fa fa-check"></i><b>3.3</b> Frequency Tables and Contingency Tables</a></li>
<li class="chapter" data-level="3.4" data-path="task-03.html"><a href="task-03.html#相关"><i class="fa fa-check"></i><b>3.4</b> Correlation</a>
<ul>
<li class="chapter" data-level="3.4.1" data-path="task-03.html"><a href="task-03.html#相关的类型"><i class="fa fa-check"></i><b>3.4.1</b> Types of Correlation</a></li>
<li class="chapter" data-level="3.4.2" data-path="task-03.html"><a href="task-03.html#相关性的显著性检验"><i class="fa fa-check"></i><b>3.4.2</b> Significance Tests for Correlation</a></li>
</ul></li>
<li class="chapter" data-level="3.5" data-path="task-03.html"><a href="task-03.html#方差分析"><i class="fa fa-check"></i><b>3.5</b> Analysis of Variance</a>
<ul>
<li class="chapter" data-level="3.5.1" data-path="task-03.html"><a href="task-03.html#单因素方差分析"><i class="fa fa-check"></i><b>3.5.1</b> One-Way ANOVA</a></li>
<li class="chapter" data-level="3.5.2" data-path="task-03.html"><a href="task-03.html#多因素方差分析"><i class="fa fa-check"></i><b>3.5.2</b> Multi-Way ANOVA</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="task-03.html"><a href="task-03.html#本章作者-3"><i class="fa fa-check"></i>Chapter Authors</a></li>
<li class="chapter" data-level="" data-path="task-03.html"><a href="task-03.html#关于datawhale-3"><i class="fa fa-check"></i>About Datawhale</a></li>
</ul></li>
<li class="chapter" data-level="4" data-path="task-04.html"><a href="task-04.html"><i class="fa fa-check"></i><b>4</b> Data Visualization</a>
<ul>
<li class="chapter" data-level="" data-path="task-04.html"><a href="task-04.html#ggplot2包介绍"><i class="fa fa-check"></i>Introduction to ggplot2</a></li>
<li class="chapter" data-level="4.1" data-path="task-04.html"><a href="task-04.html#环境配置-2"><i class="fa fa-check"></i><b>4.1</b> Environment Setup</a>
<ul>
<li class="chapter" data-level="" data-path="task-04.html"><a href="task-04.html#案例数据-1"><i class="fa fa-check"></i>Case Data</a></li>
</ul></li>
<li class="chapter" data-level="4.2" data-path="task-04.html"><a href="task-04.html#散点图"><i class="fa fa-check"></i><b>4.2</b> Scatter Plots</a></li>
<li class="chapter" data-level="4.3" data-path="task-04.html"><a href="task-04.html#直方图"><i class="fa fa-check"></i><b>4.3</b> Histograms</a></li>
<li class="chapter" data-level="4.4" data-path="task-04.html"><a href="task-04.html#柱状图"><i class="fa fa-check"></i><b>4.4</b> Bar Charts</a></li>
<li class="chapter" data-level="4.5" data-path="task-04.html"><a href="task-04.html#饼状图"><i class="fa fa-check"></i><b>4.5</b> Pie Charts</a></li>
<li class="chapter" data-level="4.6" data-path="task-04.html"><a href="task-04.html#折线图"><i class="fa fa-check"></i><b>4.6</b> Line Charts</a></li>
<li class="chapter" data-level="4.7" data-path="task-04.html"><a href="task-04.html#ggplot2扩展包主题"><i class="fa fa-check"></i><b>4.7</b> ggplot2 Extension Package Themes</a></li>
<li class="chapter" data-level="" data-path="task-04.html"><a href="task-04.html#本章作者-4"><i class="fa fa-check"></i>Chapter Authors</a></li>
<li class="chapter" data-level="" data-path="task-04.html"><a href="task-04.html#关于datawhale-4"><i class="fa fa-check"></i>About Datawhale</a></li>
</ul></li>
<li class="chapter" data-level="5" data-path="task-05.html"><a href="task-05.html"><i class="fa fa-check"></i><b>5</b> Models</a>
<ul>
<li class="chapter" data-level="5.1" data-path="task-05.html"><a href="task-05.html#前言"><i class="fa fa-check"></i><b>5.1</b> Preface</a>
<ul>
<li class="chapter" data-level="5.1.1" data-path="task-05.html"><a href="task-05.html#linear-regression"><i class="fa fa-check"></i><b>5.1.1</b> Linear Regression</a></li>
<li class="chapter" data-level="5.1.2" data-path="task-05.html"><a href="task-05.html#stepwise-regression"><i class="fa fa-check"></i><b>5.1.2</b> Stepwise Regression</a></li>
</ul></li>
<li class="chapter" data-level="5.2" data-path="task-05.html"><a href="task-05.html#分类模型"><i class="fa fa-check"></i><b>5.2</b> Classification Models</a>
<ul>
<li class="chapter" data-level="5.2.1" data-path="task-05.html"><a href="task-05.html#logistics-regression"><i class="fa fa-check"></i><b>5.2.1</b> Logistic Regression</a></li>
<li class="chapter" data-level="5.2.2" data-path="task-05.html"><a href="task-05.html#knn"><i class="fa fa-check"></i><b>5.2.2</b> KNN</a></li>
<li class="chapter" data-level="5.2.3" data-path="task-05.html"><a href="task-05.html#decision-tree"><i class="fa fa-check"></i><b>5.2.3</b> Decision Tree</a></li>
<li class="chapter" data-level="5.2.4" data-path="task-05.html"><a href="task-05.html#random-forest"><i class="fa fa-check"></i><b>5.2.4</b> Random Forest</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="task-05.html"><a href="task-05.html#思考与练习-1"><i class="fa fa-check"></i>Questions and Exercises</a></li>
<li class="chapter" data-level="" data-path="task-05.html"><a href="task-05.html#本章作者-5"><i class="fa fa-check"></i>Chapter Authors</a></li>
<li class="chapter" data-level="" data-path="task-05.html"><a href="task-05.html#关于datawhale-5"><i class="fa fa-check"></i>About Datawhale</a></li>
</ul></li>
</ul>
</nav>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./">R Data Analysis Team Learning</a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<section class="normal" id="section-">
<div id="task-04" class="section level1" number="4">
<h1><span class="header-section-number">Chapter 4</span> Data Visualization</h1>
<p><img src="image/task04_structure.png" style="width:60.0%" /></p>
<div id="ggplot2包介绍" class="section level2 unnumbered">
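<h2>Introduction to ggplot2</h2>
<p>The ggplot2 package, written by Hadley Wickham, provides a graphics system based on the grammar of graphics described by Wilkinson. Its goal is to offer a comprehensive, grammar-based, and coherent system for producing graphics, one that lets users create novel and innovative data visualizations.</p>
<p>In summary:</p>
<ul>
<li>The core idea of ggplot2 is to separate plotting from data, and to separate data-related plotting from data-independent plotting.</li>
<li>ggplot2 keeps the adjustment functions of imperative plotting, which makes it more flexible.</li>
<li>ggplot2 builds common statistical transformations into plotting.</li>
<li>ggplot2 draws plots layer by layer.</li>
</ul>
<p>A ggplot2 plot has three basic components: data, aesthetic mappings, and geometric objects.</p>
<p>In ggplot2's approach, Plot = data (dataset) + Aesthetics (aesthetic mapping) + Geometry (geometric object).</p>
<p>For example:</p>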
<div class="sourceCode" id="cb312"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb312-1"><a href="task-04.html#cb312-1" aria-hidden="true" tabindex="-1"></a><span class="co"># ggplot(data,aes(x=x,y=y))+geom_point()</span></span></code></pre></div>
<ul>
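<li>Data: the data used to draw the plot.</li>
<li>Mapping: aes() is the mapping function in ggplot2. A mapping is the correspondence that links variables in the dataset to visual properties of the plot; color, shape, grouping, and so on can all be mapped from variables in the dataset.</li>
<li>Geometric objects: the graphical elements we actually see in the plot, such as points, lines, and polygons.</li>
</ul>
<p>ggplot2 plotting code works much like a mathematical formula: fill in the corresponding template to draw a wide variety of plots. The explanations that follow take this approach.</p>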
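<p>The template can be sketched with a dataset that ships with R. This is only an illustration under stated assumptions: <code>mtcars</code> and its columns <code>wt</code> and <code>mpg</code> are not part of this chapter's case data, and the second layer (<code>geom_smooth</code>) is added here simply to show how layers stack.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Assumption: mtcars (built into R) stands in for "data" in the template
p_template &lt;- ggplot(data = mtcars, aes(x = wt, y = mpg)) +  # data + aesthetic mapping
  geom_point() +                                              # geometric object: points
  geom_smooth(method = &quot;lm&quot;, se = FALSE)                      # a second layer: fitted straight line
print(p_template)</code></pre></div>
<p>Each <code>+</code> adds one layer, so swapping <code>geom_point()</code> for another geom changes the plot type while the data and mapping stay the same.</p>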
<p>ggplot2 reference links:</p>
<ul>
<li><a href="https://ggplot2.tidyverse.org/reference/" class="uri">https://ggplot2.tidyverse.org/reference/</a></li>
<li><a href="https://ggplot2-book.org/" class="uri">https://ggplot2-book.org/</a></li>
</ul>
<p>Installing ggplot2:</p>
<div class="sourceCode" id="cb313"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb313-1"><a href="task-04.html#cb313-1" aria-hidden="true" tabindex="-1"></a><span class="co"># install.packages(&quot;ggplot2&quot;)</span></span></code></pre></div>
</div>
<div id="环境配置-2" class="section level2" number="4.1">
<h2><span class="header-section-number">4.1</span> Environment Setup</h2>
<div class="sourceCode" id="cb314"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb314-1"><a href="task-04.html#cb314-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggplot2) <span class="co"># plotting tool ggplot2</span></span>
<span id="cb314-2"><a href="task-04.html#cb314-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggpubr) <span class="co"># combines multiple plots</span></span>
<span id="cb314-3"><a href="task-04.html#cb314-3" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(plyr) <span class="co"># data manipulation package</span></span></code></pre></div>
<p>This chapter uses ggarrange from ggpubr, a tool for combining multiple plots on one page. For detailed usage, see
<a href="http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/81-ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page/" class="uri">http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/81-ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page/</a></p>
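<p>As a minimal sketch of what ggarrange does, the following places two plots side by side. The <code>mtcars</code> data and the names <code>pa</code>, <code>pb</code>, and <code>combined</code> are assumptions made for illustration only.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)
library(ggpubr)

# Two small plots built from mtcars (built into R; used only for illustration)
pa &lt;- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
pb &lt;- ggplot(mtcars, aes(x = hp, y = mpg)) + geom_point()

# Arrange the plots in one row and two columns, labelled &quot;A&quot; and &quot;B&quot;
combined &lt;- ggarrange(pa, pb, ncol = 2, nrow = 1, labels = c(&quot;A&quot;, &quot;B&quot;))
print(combined)</code></pre></div>
<p>Changing <code>ncol</code> and <code>nrow</code> controls the grid, so the same call can stack plots vertically instead.</p>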
<div id="案例数据-1" class="section level3 unnumbered">
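<h3>Case Data</h3>
<p>This chapter uses two datasets.</p>
<p><strong>1. H1N1 flu survey dataset</strong></p>
<p>The H1N1 flu survey dataset comes from a questionnaire survey about H1N1 influenza and is an external dataset.
It contains data on 26,707 respondents, with 32 features plus 1 label (whether the respondent received the H1N1 vaccine).</p>
<p>Read the dataset:</p>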
<div class="sourceCode" id="cb315"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb315-1"><a href="task-04.html#cb315-1" aria-hidden="true" tabindex="-1"></a>h1n1_data <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="st">&quot;./datasets/h1n1_flu.csv&quot;</span>, <span class="at">header =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
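<p><strong>2. Boston housing dataset</strong></p>
<p>The Boston housing dataset is available within R (for example as <code>Boston</code> in the MASS package), and it can also be read from an external file.</p>
<p>Read the dataset:</p>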
<div class="sourceCode" id="cb316"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb316-1"><a href="task-04.html#cb316-1" aria-hidden="true" tabindex="-1"></a>boston_data <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="st">&quot;./datasets/BostonHousing.csv&quot;</span>, <span class="at">header =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
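<p>If the CSV file is not at hand, a copy of the data can be loaded from a package instead. The sketch below assumes the MASS package is installed (it ships with most R distributions); the name <code>boston_builtin</code> is hypothetical, and the package copy may not match the CSV column for column.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Assumption: the MASS package is available; its Boston data frame
# includes the lstat, medv, and rad columns used in this chapter
data(&quot;Boston&quot;, package = &quot;MASS&quot;)
boston_builtin &lt;- Boston

# Inspect the three variables plotted in the following sections
str(boston_builtin[, c(&quot;lstat&quot;, &quot;medv&quot;, &quot;rad&quot;)])</code></pre></div>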
</div>
</div>
<div id="散点图" class="section level2" number="4.2">
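<h2><span class="header-section-number">4.2</span> Scatter Plots</h2>
<p>In regression analysis, a scatter plot shows how data points are distributed on a Cartesian plane. It indicates the general trend of the dependent variable as the independent variable changes; from that trend we can choose a suitable function to fit and so find the functional relationship between the variables.</p>
<p>Advantages of scatter plots:</p>
<ul>
<li>Data shown in a chart are more intuitive. In settings such as work reports, a chart helps the audience accept and understand the data you have processed.</li>
<li>Scatter plots lean toward exploratory analysis: they help us discover hidden relationships between variables, which can guide decisions.</li>
<li>The core value of a scatter plot is revealing relationships between variables, both linear and nonlinear.</li>
</ul>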
<div class="sourceCode" id="cb317"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb317-1"><a href="task-04.html#cb317-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Read the data</span></span>
<span id="cb317-2"><a href="task-04.html#cb317-2" aria-hidden="true" tabindex="-1"></a>boston_data <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="st">&quot;./datasets/BostonHousing.csv&quot;</span>, <span class="at">header =</span> <span class="cn">TRUE</span>)</span>
<span id="cb317-3"><a href="task-04.html#cb317-3" aria-hidden="true" tabindex="-1"></a><span class="co"># Draw a simple scatter plot with lstat on the x-axis and medv on the y-axis</span></span>
<span id="cb317-4"><a href="task-04.html#cb317-4" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv)) <span class="sc">+</span></span>
<span id="cb317-5"><a href="task-04.html#cb317-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot1-1.png" width="672" /></p>
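<p>The plot above uses lstat on the x-axis and medv on the y-axis. The x-axis is the percentage of the population of lower socioeconomic status, and the y-axis is the median home value. The plot shows that as the share of the lower-status population rises, home values tend to fall: the two variables are negatively correlated.</p>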
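<p>The visual impression of a negative relationship can be checked numerically with a correlation coefficient. This is a sketch, assuming the <code>Boston</code> data frame from the MASS package carries the same <code>lstat</code> and <code>medv</code> variables as the CSV file; the name <code>r_lstat_medv</code> is hypothetical.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Assumption: MASS::Boston matches the chapter's CSV for these two columns
data(&quot;Boston&quot;, package = &quot;MASS&quot;)

# Pearson correlation between lstat and medv; a negative value
# agrees with the downward trend seen in the scatter plot
r_lstat_medv &lt;- cor(Boston$lstat, Boston$medv)
r_lstat_medv</code></pre></div>
<p>A correlation coefficient only measures linear association, so the scatter plot is still needed to judge whether the relationship is curved.</p>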
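<p>ggplot2 can change the shape and size of the points in a scatter plot. R provides a set of predefined point shapes:
<img src="image/task04_fig1.png" style="width:50.0%" /></p>
<p>The size argument changes the point size, and the color argument changes the point color.</p>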
<div class="sourceCode" id="cb318"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb318-1"><a href="task-04.html#cb318-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Use shape number 17</span></span>
<span id="cb318-2"><a href="task-04.html#cb318-2" aria-hidden="true" tabindex="-1"></a>p1 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv)) <span class="sc">+</span></span>
<span id="cb318-3"><a href="task-04.html#cb318-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>(<span class="at">shape =</span> <span class="dv">17</span>)</span>
<span id="cb318-4"><a href="task-04.html#cb318-4" aria-hidden="true" tabindex="-1"></a><span class="co"># size changes the point size; color changes the point color</span></span>
<span id="cb318-5"><a href="task-04.html#cb318-5" aria-hidden="true" tabindex="-1"></a>p2 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv)) <span class="sc">+</span></span>
<span id="cb318-6"><a href="task-04.html#cb318-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>(<span class="at">size =</span> <span class="dv">3</span>, <span class="at">color =</span> <span class="st">&quot;red&quot;</span>)</span>
<span id="cb318-7"><a href="task-04.html#cb318-7" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p1, p2, <span class="at">nrow =</span> <span class="dv">1</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot2-1.png" width="672" /></p>
<p>Other variables in the dataset can be mapped to the color of the points:</p>
<div class="sourceCode" id="cb319"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb319-1"><a href="task-04.html#cb319-1" aria-hidden="true" tabindex="-1"></a>p3 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv, <span class="at">colour =</span> <span class="fu">factor</span>(rad))) <span class="sc">+</span></span>
<span id="cb319-2"><a href="task-04.html#cb319-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span>
<span id="cb319-3"><a href="task-04.html#cb319-3" aria-hidden="true" tabindex="-1"></a>p4 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv, <span class="at">colour =</span> rad)) <span class="sc">+</span></span>
<span id="cb319-4"><a href="task-04.html#cb319-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span>
<span id="cb319-5"><a href="task-04.html#cb319-5" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p3, p4, <span class="at">nrow =</span> <span class="dv">1</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot4-1.png" width="672" /></p>
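<p>The two panels differ because ggplot2 chooses the color scale from the type of the mapped variable: a factor gets one distinct color per level, while a numeric variable gets a continuous gradient. The check below is a sketch that assumes <code>rad</code> in the MASS package's <code>Boston</code> data frame is stored the same way as in the CSV file.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Assumption: MASS::Boston stands in for the chapter's boston_data
data(&quot;Boston&quot;, package = &quot;MASS&quot;)

# rad is stored as a number, so colour = rad yields a continuous gradient
rad_is_numeric &lt;- is.numeric(Boston$rad)
rad_is_numeric

# factor(rad) turns each distinct value into a category,
# so colour = factor(rad) yields one legend entry per level
rad_levels &lt;- levels(factor(Boston$rad))
rad_levels</code></pre></div>
<p>Wrapping a variable in <code>factor()</code> is therefore a reasonable choice when its numbers are really category codes rather than measurements.</p>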
<p>The ggplot2 documentation covers scatter plots in detail; see <a href="https://ggplot2.tidyverse.org/reference/geom_point.html" class="uri">https://ggplot2.tidyverse.org/reference/geom_point.html</a></p>
</div>
<div id="直方图" class="section level2" number="4.3">
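<h2><span class="header-section-number">4.3</span> Histogram</h2>
<p>A histogram is a statistical chart that shows the distribution of data with a series of vertical bars of varying height. The horizontal axis usually shows the value intervals (bins) of the variable, and the vertical axis shows how many observations fall into each interval.
Histograms give a good view of how data are distributed and are among the most commonly used visualizations.</p>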
<p>We draw a histogram of the rad variable:</p>
<div class="sourceCode" id="cb320"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb320-1"><a href="task-04.html#cb320-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> rad)) <span class="sc">+</span></span>
<span id="cb320-2"><a href="task-04.html#cb320-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_histogram</span>()</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot6-1.png" width="672" /></p>
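<p>As the figure shows, ggplot2 bins the data and counts the observations in each bin automatically.</p>
<p>Next we add a fill colour and change the line type of the histogram. color sets the outline of the bars, fill sets their fill colour (ggplot2 accepts RGB hex colour codes), and linetype sets the type of the outline.</p>
<p>For an RGB colour table, see: <a href="http://www.mgzxzs.com/sytool/se.htm" class="uri">http://www.mgzxzs.com/sytool/se.htm</a></p>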
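<p>The binning can also be controlled explicitly. geom_histogram() uses 30 bins by default and prints a message suggesting that a better value be chosen. The sketch below sets binwidth = 1 so that each integer value of rad gets its own bar; it assumes that MASS::Boston can stand in for boston_data, and the names boston_demo and p_bins are made up for this illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Assumption: MASS::Boston stands in for boston_data
boston_demo &lt;- MASS::Boston

# binwidth = 1 gives one bar per integer value of rad
p_bins &lt;- ggplot(data = boston_demo, aes(x = rad)) +
  geom_histogram(binwidth = 1, color = &quot;black&quot;, fill = &quot;#69b3a2&quot;)
p_bins</code></pre></div>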
<div class="sourceCode" id="cb321"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb321-1"><a href="task-04.html#cb321-1" aria-hidden="true" tabindex="-1"></a>p5 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> rad)) <span class="sc">+</span></span>
<span id="cb321-2"><a href="task-04.html#cb321-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_histogram</span>(<span class="at">color =</span> <span class="st">&quot;black&quot;</span>, <span class="at">fill =</span> <span class="st">&quot;#69b3a2&quot;</span>)</span>
<span id="cb321-3"><a href="task-04.html#cb321-3" aria-hidden="true" tabindex="-1"></a>p6 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> rad)) <span class="sc">+</span></span>
<span id="cb321-4"><a href="task-04.html#cb321-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_histogram</span>(<span class="at">color =</span> <span class="st">&quot;black&quot;</span>, <span class="at">fill =</span> <span class="st">&quot;#69b3a2&quot;</span>, <span class="at">linetype =</span> <span class="st">&quot;dashed&quot;</span>)</span>
<span id="cb321-5"><a href="task-04.html#cb321-5" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p5, p6, <span class="at">nrow =</span> <span class="dv">1</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot7-1.png" width="672" /></p>
<p>ggplot2 can also add a mean line and a density curve to a histogram:</p>
<div class="sourceCode" id="cb322"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb322-1"><a href="task-04.html#cb322-1" aria-hidden="true" tabindex="-1"></a>p7 <span class="ot">&lt;-</span> p5 <span class="sc">+</span> <span class="fu">geom_vline</span>(<span class="fu">aes</span>(<span class="at">xintercept =</span> <span class="fu">mean</span>(rad)), <span class="at">color =</span> <span class="st">&quot;blue&quot;</span>, <span class="at">linetype =</span> <span class="st">&quot;dashed&quot;</span>, <span class="at">size =</span> <span class="dv">1</span>)</span>
<span id="cb322-2"><a href="task-04.html#cb322-2" aria-hidden="true" tabindex="-1"></a>p8 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> rad)) <span class="sc">+</span></span>
<span id="cb322-3"><a href="task-04.html#cb322-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_histogram</span>(<span class="at">color =</span> <span class="st">&quot;black&quot;</span>, <span class="at">fill =</span> <span class="st">&quot;#69b3a2&quot;</span>, <span class="fu">aes</span>(<span class="at">y =</span> ..density..)) <span class="sc">+</span></span>
<span id="cb322-4"><a href="task-04.html#cb322-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_density</span>(<span class="at">alpha =</span> .<span class="dv">2</span>, <span class="at">fill =</span> <span class="st">&quot;#FF6666&quot;</span>)</span>
<span id="cb322-5"><a href="task-04.html#cb322-5" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p7, p8, <span class="at">nrow =</span> <span class="dv">1</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot8-1.png" width="672" /></p>
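<p>In ggplot2 3.4.0 and later, the ..density.. notation and the size argument for lines are deprecated in favour of after_stat(density) and linewidth. The following sketch draws the same two plots in the newer syntax. It assumes ggplot2 3.4.0 or later and again uses MASS::Boston as a stand-in for boston_data; p_mean and p_dens are illustrative names.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Assumption: MASS::Boston stands in for boston_data
boston_demo &lt;- MASS::Boston

# Histogram with a dashed vertical line at the mean of rad
p_mean &lt;- ggplot(boston_demo, aes(x = rad)) +
  geom_histogram(color = &quot;black&quot;, fill = &quot;#69b3a2&quot;, bins = 30) +
  geom_vline(aes(xintercept = mean(rad)), color = &quot;blue&quot;,
             linetype = &quot;dashed&quot;, linewidth = 1)

# Histogram scaled to density, with a density curve on top
p_dens &lt;- ggplot(boston_demo, aes(x = rad)) +
  geom_histogram(aes(y = after_stat(density)), color = &quot;black&quot;,
                 fill = &quot;#69b3a2&quot;, bins = 30) +
  geom_density(alpha = 0.2, fill = &quot;#FF6666&quot;)</code></pre></div>
<p>Scaling the histogram to density puts both layers on the same vertical axis, which is why the density curve and the bars line up.</p>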
<p>The ggplot2 documentation covers histograms in detail; see: <a href="https://ggplot2.tidyverse.org/reference/geom_histogram.html" class="uri">https://ggplot2.tidyverse.org/reference/geom_histogram.html</a></p>
</div>
<div id="柱状图" class="section level2" number="4.4">
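<h2><span class="header-section-number">4.4</span> Bar chart</h2>
<p>The bar chart is a commonly used visualization, also known as a bar graph or column chart.
A bar chart compares two or more values (for example at different times or under different conditions) of a single variable and is typically used for small datasets. The bars can also be drawn horizontally or presented in a multi-dimensional form. Note that bar charts and histograms are different visualization methods and should not be confused.</p>
<p>We visualize the race of the respondents in the h1n1 dataset, using count from the plyr package to tally race:</p>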
<div class="sourceCode" id="cb323"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb323-1"><a href="task-04.html#cb323-1" aria-hidden="true" tabindex="-1"></a>data <span class="ot">&lt;-</span> <span class="fu">count</span>(h1n1_data[<span class="st">&quot;race&quot;</span>])</span>
<span id="cb323-2"><a href="task-04.html#cb323-2" aria-hidden="true" tabindex="-1"></a>p <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> race, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb323-3"><a href="task-04.html#cb323-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>)</span>
<span id="cb323-4"><a href="task-04.html#cb323-4" aria-hidden="true" tabindex="-1"></a><span class="co"># The bars can also be drawn horizontally</span></span>
<span id="cb323-5"><a href="task-04.html#cb323-5" aria-hidden="true" tabindex="-1"></a>p1 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">coord_flip</span>()</span>
<span id="cb323-6"><a href="task-04.html#cb323-6" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p, p1)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot9-1.png" width="672" /></p>
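<p>The counting step can also be left to ggplot2: geom_bar() uses stat = &quot;count&quot; by default, and geom_col() is shorthand for geom_bar(stat = &quot;identity&quot;). The sketch below uses a small made-up data frame, survey_demo (hypothetical, not the h1n1 data), to show that both routes give the same bar heights.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Hypothetical toy data, not the real h1n1 survey
survey_demo &lt;- data.frame(
  race = c(&quot;White&quot;, &quot;White&quot;, &quot;Black&quot;, &quot;Hispanic&quot;, &quot;White&quot;, &quot;Black&quot;)
)

# Route 1: let geom_bar() count the rows in each category
p_count &lt;- ggplot(survey_demo, aes(x = race)) +
  geom_bar()

# Route 2: count first with base R, then draw the precomputed heights
counts_demo &lt;- as.data.frame(table(race = survey_demo$race), responseName = &quot;freq&quot;)
p_col &lt;- ggplot(counts_demo, aes(x = race, y = freq)) +
  geom_col()</code></pre></div>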
<p>The x-axis labels of the left chart overlap, so we rotate them by 45°:</p>
<div class="sourceCode" id="cb324"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb324-1"><a href="task-04.html#cb324-1" aria-hidden="true" tabindex="-1"></a>data <span class="ot">&lt;-</span> <span class="fu">count</span>(h1n1_data[<span class="st">&quot;race&quot;</span>])</span>
<span id="cb324-2"><a href="task-04.html#cb324-2" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> race, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb324-3"><a href="task-04.html#cb324-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>) <span class="sc">+</span></span>
<span id="cb324-4"><a href="task-04.html#cb324-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_text</span>(<span class="at">angle =</span> <span class="dv">45</span>, <span class="at">hjust =</span> <span class="dv">1</span>))</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot9a-1.png" width="672" /></p>
<p>The style of the bar chart can be modified:</p>
<div class="sourceCode" id="cb325"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb325-1"><a href="task-04.html#cb325-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Change the width and colour of the bars:</span></span>
<span id="cb325-2"><a href="task-04.html#cb325-2" aria-hidden="true" tabindex="-1"></a><span class="co"># Change the bar width</span></span>
<span id="cb325-3"><a href="task-04.html#cb325-3" aria-hidden="true" tabindex="-1"></a>p2 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> race, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb325-4"><a href="task-04.html#cb325-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>, <span class="at">width =</span> <span class="fl">0.5</span>) <span class="sc">+</span></span>
<span id="cb325-5"><a href="task-04.html#cb325-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_text</span>(<span class="at">angle =</span> <span class="dv">45</span>, <span class="at">hjust =</span> <span class="dv">1</span>))</span>
<span id="cb325-6"><a href="task-04.html#cb325-6" aria-hidden="true" tabindex="-1"></a><span class="co"># Change the colours</span></span>
<span id="cb325-7"><a href="task-04.html#cb325-7" aria-hidden="true" tabindex="-1"></a>p3 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> race, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb325-8"><a href="task-04.html#cb325-8" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>, <span class="at">color =</span> <span class="st">&quot;blue&quot;</span>, <span class="at">fill =</span> <span class="st">&quot;white&quot;</span>) <span class="sc">+</span></span>
<span id="cb325-9"><a href="task-04.html#cb325-9" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_text</span>(<span class="at">angle =</span> <span class="dv">45</span>, <span class="at">hjust =</span> <span class="dv">1</span>))</span>
<span id="cb325-10"><a href="task-04.html#cb325-10" aria-hidden="true" tabindex="-1"></a><span class="co"># Minimal theme + blue fill colour</span></span>
<span id="cb325-11"><a href="task-04.html#cb325-11" aria-hidden="true" tabindex="-1"></a>p4 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> race, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb325-12"><a href="task-04.html#cb325-12" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>, <span class="at">fill =</span> <span class="st">&quot;steelblue&quot;</span>) <span class="sc">+</span></span>
<span id="cb325-13"><a href="task-04.html#cb325-13" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme_minimal</span>() <span class="sc">+</span></span>
<span id="cb325-14"><a href="task-04.html#cb325-14" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_text</span>(<span class="at">angle =</span> <span class="dv">45</span>, <span class="at">hjust =</span> <span class="dv">1</span>))</span>
<span id="cb325-15"><a href="task-04.html#cb325-15" aria-hidden="true" tabindex="-1"></a><span class="co"># Choose which items to display</span></span>
<span id="cb325-16"><a href="task-04.html#cb325-16" aria-hidden="true" tabindex="-1"></a>p5 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">scale_x_discrete</span>(<span class="at">limits =</span> <span class="fu">c</span>(<span class="st">&quot;White&quot;</span>, <span class="st">&quot;Black&quot;</span>)) <span class="sc">+</span> <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_text</span>(<span class="at">angle =</span> <span class="dv">45</span>, <span class="at">hjust =</span> <span class="dv">1</span>))</span>
<span id="cb325-17"><a href="task-04.html#cb325-17" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p2, p3, p4, p5)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot10-1.png" width="672" /></p>
<p>Labels can be added to the bars:</p>
<div class="sourceCode" id="cb326"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb326-1"><a href="task-04.html#cb326-1" aria-hidden="true" tabindex="-1"></a>p6 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> data, <span class="fu">aes</span>(<span class="at">x =</span> race, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb326-2"><a href="task-04.html#cb326-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>, <span class="at">fill =</span> <span class="st">&quot;steelblue&quot;</span>) <span class="sc">+</span></span>
<span id="cb326-3"><a href="task-04.html#cb326-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_text</span>(<span class="fu">aes</span>(<span class="at">label =</span> freq), <span class="at">vjust =</span> <span class="sc">-</span><span class="fl">0.3</span>, <span class="at">size =</span> <span class="fl">3.5</span>) <span class="sc">+</span></span>
<span id="cb326-4"><a href="task-04.html#cb326-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme_minimal</span>() <span class="sc">+</span></span>
<span id="cb326-5"><a href="task-04.html#cb326-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_text</span>(<span class="at">angle =</span> <span class="dv">45</span>, <span class="at">hjust =</span> <span class="dv">1</span>))</span>
<span id="cb326-6"><a href="task-04.html#cb326-6" aria-hidden="true" tabindex="-1"></a><span class="co"># Labels inside the bars</span></span>
<span id="cb326-7"><a href="task-04.html#cb326-7" aria-hidden="true" tabindex="-1"></a>p7 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> data, <span class="fu">aes</span>(<span class="at">x =</span> race, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb326-8"><a href="task-04.html#cb326-8" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>, <span class="at">fill =</span> <span class="st">&quot;steelblue&quot;</span>) <span class="sc">+</span></span>
<span id="cb326-9"><a href="task-04.html#cb326-9" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_text</span>(<span class="fu">aes</span>(<span class="at">label =</span> freq), <span class="at">vjust =</span> <span class="fl">1.6</span>, <span class="at">color =</span> <span class="st">&quot;white&quot;</span>, <span class="at">size =</span> <span class="fl">3.5</span>) <span class="sc">+</span></span>
<span id="cb326-10"><a href="task-04.html#cb326-10" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme_minimal</span>() <span class="sc">+</span></span>
<span id="cb326-11"><a href="task-04.html#cb326-11" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_text</span>(<span class="at">angle =</span> <span class="dv">45</span>, <span class="at">hjust =</span> <span class="dv">1</span>))</span>
<span id="cb326-12"><a href="task-04.html#cb326-12" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p6, p7, <span class="at">nrow =</span> <span class="dv">1</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot11-1.png" width="672" /></p>
<p>If the order of the bars is not what you want, it can be changed:</p>
<div class="sourceCode" id="cb327"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb327-1"><a href="task-04.html#cb327-1" aria-hidden="true" tabindex="-1"></a>data <span class="ot">&lt;-</span> <span class="fu">within</span>(data, {</span>
<span id="cb327-2"><a href="task-04.html#cb327-2" aria-hidden="true" tabindex="-1"></a> race <span class="ot">&lt;-</span> <span class="fu">factor</span>(race, <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;White&quot;</span>, <span class="st">&quot;Black&quot;</span>, <span class="st">&quot;Hispanic&quot;</span>, <span class="st">&quot;Other or Multiple&quot;</span>))</span>
<span id="cb327-3"><a href="task-04.html#cb327-3" aria-hidden="true" tabindex="-1"></a>})</span>
<span id="cb327-4"><a href="task-04.html#cb327-4" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> race, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb327-5"><a href="task-04.html#cb327-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>, <span class="at">fill =</span> <span class="st">&quot;steelblue&quot;</span>) <span class="sc">+</span></span>
<span id="cb327-6"><a href="task-04.html#cb327-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_text</span>(<span class="at">angle =</span> <span class="dv">45</span>, <span class="at">hjust =</span> <span class="dv">1</span>))</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot12-1.png" width="672" /></p>
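<p>Instead of listing the levels by hand, the bars can also be ordered by their height. A minimal sketch with base R's reorder() follows; the data frame counts_demo and its numbers are invented for illustration and are not the h1n1 counts.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Hypothetical counts, invented for illustration
counts_demo &lt;- data.frame(
  race = c(&quot;Black&quot;, &quot;Hispanic&quot;, &quot;Other or Multiple&quot;, &quot;White&quot;),
  freq = c(20, 15, 10, 80)
)

# reorder() sorts the factor levels by the second argument;
# the minus sign puts the tallest bar first
counts_demo$race &lt;- reorder(counts_demo$race, -counts_demo$freq)

p_sorted &lt;- ggplot(counts_demo, aes(x = race, y = freq)) +
  geom_bar(stat = &quot;identity&quot;, fill = &quot;steelblue&quot;) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))</code></pre></div>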
<p>The ggplot2 documentation covers bar charts in detail; see:
<a href="https://ggplot2.tidyverse.org/reference/geom_bar.html" class="uri">https://ggplot2.tidyverse.org/reference/geom_bar.html</a></p>
</div>
<div id="饼状图" class="section level2" number="4.5">
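<h2><span class="header-section-number">4.5</span> Pie chart</h2>
<p>The pie chart is one of the most commonly used visualizations across many fields and shows clearly what share of the whole each category takes.
ggplot2 has no function such as geom_pie() for drawing pie charts; its approach is to obtain a pie chart through a polar coordinate transformation.</p>
<p>In other words, a stacked bar chart is drawn first, and transforming that bar chart into polar coordinates turns it into a pie chart.</p>
<p>We display the race data from the h1n1 survey:</p>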
<div class="sourceCode" id="cb328"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb328-1"><a href="task-04.html#cb328-1" aria-hidden="true" tabindex="-1"></a>data <span class="ot">&lt;-</span> <span class="fu">count</span>(h1n1_data[<span class="st">&quot;race&quot;</span>])</span>
<span id="cb328-2"><a href="task-04.html#cb328-2" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> data, <span class="fu">aes</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> freq, <span class="at">fill =</span> race)) <span class="sc">+</span></span>
<span id="cb328-3"><a href="task-04.html#cb328-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot13-1.png" width="672" /></p>
<p>Once the stacked bar chart is drawn, the next step is the polar coordinate transformation, which the coord_polar() function in ggplot2 makes very easy.</p>
<div class="sourceCode" id="cb329"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb329-1"><a href="task-04.html#cb329-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> data, <span class="fu">aes</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> freq, <span class="at">fill =</span> race)) <span class="sc">+</span></span>
<span id="cb329-2"><a href="task-04.html#cb329-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>) <span class="sc">+</span></span>
<span id="cb329-3"><a href="task-04.html#cb329-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">coord_polar</span>(<span class="at">theta =</span> <span class="st">&quot;y&quot;</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot14-1.png" width="672" /></p>
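<p>This now looks like a pie chart, but there are still unwanted numbers around it. How can they be removed?
These numbers are the axis tick labels: theme(axis.text = element_blank()) hides them, and labs() clears the axis titles.</p>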
<div class="sourceCode" id="cb330"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb330-1"><a href="task-04.html#cb330-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> data, <span class="fu">aes</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> freq, <span class="at">fill =</span> race)) <span class="sc">+</span></span>
<span id="cb330-2"><a href="task-04.html#cb330-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>) <span class="sc">+</span></span>
<span id="cb330-3"><a href="task-04.html#cb330-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">coord_polar</span>(<span class="at">theta =</span> <span class="st">&quot;y&quot;</span>) <span class="sc">+</span></span>
<span id="cb330-4"><a href="task-04.html#cb330-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> <span class="st">&quot;&quot;</span>, <span class="at">title =</span> <span class="st">&quot;&quot;</span>) <span class="sc">+</span></span>
<span id="cb330-5"><a href="task-04.html#cb330-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text =</span> <span class="fu">element_blank</span>())</span></code></pre></div>
<p><img src="RLearning_files/figure-html/plot15-1.png" width="672" /></p>
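<p>The next step is to show the share of each category.
The first method puts the percentages directly into the legend, which suits cases with many categories.</p>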
<div class="sourceCode" id="cb331"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb331-1"><a href="task-04.html#cb331-1" aria-hidden="true" tabindex="-1"></a>label_value <span class="ot">&lt;-</span> <span class="fu">paste</span>(<span class="st">&quot;(&quot;</span>, <span class="fu">round</span>(data<span class="sc">$</span>freq <span class="sc">/</span> <span class="fu">sum</span>(data<span class="sc">$</span>freq) <span class="sc">*</span> <span class="dv">100</span>, <span class="dv">1</span>), <span class="st">&quot;%)&quot;</span>, <span class="at">sep =</span> <span class="st">&quot;&quot;</span>)</span>
<span id="cb331-2"><a href="task-04.html#cb331-2" aria-hidden="true" tabindex="-1"></a>label_value</span></code></pre></div>
<pre><code>## [1] &quot;(7.9%)&quot; &quot;(6.6%)&quot; &quot;(6%)&quot; &quot;(79.5%)&quot;</code></pre>
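<p>Note that round() drops trailing zeros, which is why one label reads (6%) rather than (6.0%). If a fixed number of decimals is preferred, sprintf() is one option. The sketch below uses a hypothetical vector freq_demo whose values are invented for illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical counts, invented for illustration
freq_demo &lt;- c(20, 15, 15, 200)

pct_demo &lt;- freq_demo / sum(freq_demo) * 100
# &quot;%.1f&quot; always prints one decimal; &quot;%%&quot; is a literal percent sign
label_value_demo &lt;- sprintf(&quot;(%.1f%%)&quot;, pct_demo)
label_value_demo</code></pre></div>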
<p>Pair the computed percentages with race:</p>
<div class="sourceCode" id="cb333"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb333-1"><a href="task-04.html#cb333-1" aria-hidden="true" tabindex="-1"></a>label <span class="ot">&lt;-</span> <span class="fu">paste</span>(data<span class="sc">$</span>race, label_value, <span class="at">sep =</span> <span class="st">&quot;&quot;</span>)</span>
<span id="cb333-2"><a href="task-04.html#cb333-2" aria-hidden="true" tabindex="-1"></a>label</span></code></pre></div>
<pre><code>## [1] &quot;Black(7.9%)&quot; &quot;Hispanic(6.6%)&quot; &quot;Other or Multiple(6%)&quot;
## [4] &quot;White(79.5%)&quot;</code></pre>
<p>Next, put these percentage labels into the legend:</p>
<div class="sourceCode" id="cb335"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb335-1"><a href="task-04.html#cb335-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> data, <span class="fu">aes</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> freq, <span class="at">fill =</span> race)) <span class="sc">+</span></span>
<span id="cb335-2"><a href="task-04.html#cb335-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>) <span class="sc">+</span></span>
<span id="cb335-3"><a href="task-04.html#cb335-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">coord_polar</span>(<span class="at">theta =</span> <span class="st">&quot;y&quot;</span>) <span class="sc">+</span></span>
<span id="cb335-4"><a href="task-04.html#cb335-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> <span class="st">&quot;&quot;</span>, <span class="at">title =</span> <span class="st">&quot;&quot;</span>) <span class="sc">+</span></span>
<span id="cb335-5"><a href="task-04.html#cb335-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text =</span> <span class="fu">element_blank</span>()) <span class="sc">+</span></span>
<span id="cb335-6"><a href="task-04.html#cb335-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_fill_discrete</span>(<span class="at">labels =</span> label)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-125-1.png" width="672" /></p>
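<p>The result looks quite good.</p>
<p>The second method places the percentages directly inside their pie slices.</p>
<p>First, remove the legend from the pie chart:</p>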
<div class="sourceCode" id="cb336"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb336-1"><a href="task-04.html#cb336-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> data, <span class="fu">aes</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> freq, <span class="at">fill =</span> race)) <span class="sc">+</span></span>
<span id="cb336-2"><a href="task-04.html#cb336-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>) <span class="sc">+</span></span>
<span id="cb336-3"><a href="task-04.html#cb336-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">coord_polar</span>(<span class="at">theta =</span> <span class="st">&quot;y&quot;</span>) <span class="sc">+</span></span>
<span id="cb336-4"><a href="task-04.html#cb336-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> <span class="st">&quot;&quot;</span>, <span class="at">title =</span> <span class="st">&quot;&quot;</span>) <span class="sc">+</span></span>
<span id="cb336-5"><a href="task-04.html#cb336-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text =</span> <span class="fu">element_blank</span>()) <span class="sc">+</span></span>
<span id="cb336-6"><a href="task-04.html#cb336-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-126-1.png" width="672" /></p>
<p>Then place the labels inside the pie chart:</p>
<div class="sourceCode" id="cb337"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb337-1"><a href="task-04.html#cb337-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> data, <span class="fu">aes</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> freq, <span class="at">fill =</span> race)) <span class="sc">+</span></span>
<span id="cb337-2"><a href="task-04.html#cb337-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">&quot;identity&quot;</span>, <span class="at">width =</span> <span class="dv">1</span>) <span class="sc">+</span></span>
<span id="cb337-3"><a href="task-04.html#cb337-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">coord_polar</span>(<span class="at">theta =</span> <span class="st">&quot;y&quot;</span>) <span class="sc">+</span></span>
<span id="cb337-4"><a href="task-04.html#cb337-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">x =</span> <span class="st">&quot;&quot;</span>, <span class="at">y =</span> <span class="st">&quot;&quot;</span>, <span class="at">title =</span> <span class="st">&quot;&quot;</span>) <span class="sc">+</span></span>
<span id="cb337-5"><a href="task-04.html#cb337-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme</span>(<span class="at">axis.text =</span> <span class="fu">element_blank</span>(), <span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>) <span class="sc">+</span></span>
<span id="cb337-6"><a href="task-04.html#cb337-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_text</span>(<span class="fu">aes</span>(<span class="at">label =</span> label), <span class="at">size =</span> <span class="dv">3</span>, <span class="at">position =</span> <span class="fu">position_stack</span>(<span class="at">vjust =</span> <span class="fl">0.5</span>))</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-127-1.png" width="672" /></p>
</div>
<div id="折线图" class="section level2" number="4.6">
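<h2><span class="header-section-number">4.6</span> Line chart</h2>
<p>Line charts show how data change and are among the most commonly used visualizations. In ggplot2 they are drawn with the geom_line() function.</p>
<p>We visualize rad from the Boston housing data, using count from the plyr package to tally rad:</p>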
<div class="sourceCode" id="cb338"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb338-1"><a href="task-04.html#cb338-1" aria-hidden="true" tabindex="-1"></a>data <span class="ot">&lt;-</span> <span class="fu">count</span>(boston_data[<span class="st">&quot;rad&quot;</span>])</span>
<span id="cb338-2"><a href="task-04.html#cb338-2" aria-hidden="true" tabindex="-1"></a>data</span></code></pre></div>
<pre><code>## rad freq
## 1 1 20
## 2 2 24
## 3 3 38
## 4 4 110
## 5 5 115
## 6 6 26
## 7 7 17
## 8 8 24
## 9 24 132</code></pre>
<p>Remove the row where rad is 24:</p>
<div class="sourceCode" id="cb340"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb340-1"><a href="task-04.html#cb340-1" aria-hidden="true" tabindex="-1"></a>data <span class="ot">&lt;-</span> data[<span class="dv">1</span><span class="sc">:</span><span class="dv">8</span>, ]</span>
<span id="cb340-2"><a href="task-04.html#cb340-2" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> rad, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb340-3"><a href="task-04.html#cb340-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_line</span>()</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-129-1.png" width="672" /></p>
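<p>Selecting rows 1:8 works here because rad = 24 happens to be the last row. A condition-based filter does not depend on row position. The sketch below uses a data frame rad_counts (an illustrative name) that copies the counts printed earlier in this section.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Counts copied from the output printed earlier
rad_counts &lt;- data.frame(
  rad  = c(1, 2, 3, 4, 5, 6, 7, 8, 24),
  freq = c(20, 24, 38, 110, 115, 26, 17, 24, 132)
)

# Keep every row except the one where rad equals 24
rad_filtered &lt;- rad_counts[rad_counts$rad != 24, ]
rad_filtered</code></pre></div>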
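<p>Sometimes we also want to mark the data points on the line so that the original values are easier to identify. This is especially useful when the data are sparse.</p>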
<div class="sourceCode" id="cb341"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb341-1"><a href="task-04.html#cb341-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> rad, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb341-2"><a href="task-04.html#cb341-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_line</span>() <span class="sc">+</span></span>
<span id="cb341-3"><a href="task-04.html#cb341-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>(<span class="at">size =</span> <span class="dv">4</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-130-1.png" width="672" /></p>
<p>We adjust the tick marks shown on the x-axis:</p>
<div class="sourceCode" id="cb342"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb342-1"><a href="task-04.html#cb342-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> rad, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb342-2"><a href="task-04.html#cb342-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_line</span>() <span class="sc">+</span></span>
<span id="cb342-3"><a href="task-04.html#cb342-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>(<span class="at">size =</span> <span class="dv">4</span>) <span class="sc">+</span></span>
<span id="cb342-4"><a href="task-04.html#cb342-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_x_continuous</span>(<span class="at">breaks =</span> <span class="fu">c</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">8</span>))</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-131-1.png" width="672" /></p>
<p>We can also change the line type and color.</p>
<div class="sourceCode" id="cb343"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb343-1"><a href="task-04.html#cb343-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(data, <span class="fu">aes</span>(<span class="at">x =</span> rad, <span class="at">y =</span> freq)) <span class="sc">+</span></span>
<span id="cb343-2"><a href="task-04.html#cb343-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_line</span>(<span class="at">linetype =</span> <span class="st">&quot;dashed&quot;</span>, <span class="at">color =</span> <span class="st">&quot;red&quot;</span>) <span class="sc">+</span></span>
<span id="cb343-3"><a href="task-04.html#cb343-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>(<span class="at">size =</span> <span class="dv">4</span>) <span class="sc">+</span></span>
<span id="cb343-4"><a href="task-04.html#cb343-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_x_continuous</span>(<span class="at">breaks =</span> <span class="fu">c</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">8</span>))</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-132-1.png" width="672" /></p>
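<p>The same idea extends to several lines at once. The following is a minimal sketch, not part of the original analysis: the data frame <code>toy</code> and its columns are made up for illustration, and the line type and color are mapped to a grouping variable and then set by hand with <code>scale_linetype_manual()</code> and <code>scale_colour_manual()</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Hypothetical toy data: two groups measured at the same five x positions
toy &lt;- data.frame(
  x = rep(1:5, times = 2),
  y = c(2, 4, 3, 6, 5, 1, 2, 4, 3, 5),
  group = rep(c(&quot;a&quot;, &quot;b&quot;), each = 5)
)

# Map line type and colour to the group, then choose the values manually
p_lines &lt;- ggplot(toy, aes(x = x, y = y, linetype = group, colour = group)) +
  geom_line() +
  geom_point(size = 2) +
  scale_linetype_manual(values = c(a = &quot;solid&quot;, b = &quot;dotdash&quot;)) +
  scale_colour_manual(values = c(a = &quot;steelblue&quot;, b = &quot;firebrick&quot;))
p_lines</code></pre></div>
<p>Setting <code>linetype</code> or <code>color</code> inside <code>geom_line()</code>, as in the example above, applies one fixed value to the whole layer, while mapping them inside <code>aes()</code> lets each group get its own style and a legend.</p>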
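<p>Reference link from the ggplot2 documentation on line geoms (this page covers reference lines such as <code>geom_abline()</code>):
<a href="https://ggplot2.tidyverse.org/reference/geom_abline.html" class="uri">https://ggplot2.tidyverse.org/reference/geom_abline.html</a></p>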
</div>
<div id="ggplot2扩展包主题" class="section level2" number="4.7">
<h2><span class="header-section-number">4.7</span> ggplot2 extension packages: themes</h2>
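<p>The default look of ggplot2 plots is fixed, so when a figure needs a particular style we have to change the theme or even define a custom one.
ggplot2 ships with eight built-in theme styles:</p>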
<table>
<thead>
<tr class="header">
<th>Theme function</th>
<th>Effect</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>theme_bw()</td>
<td>White background with grid lines</td>
</tr>
<tr class="even">
<td>theme_classic()</td>
<td>Classic theme</td>
</tr>
<tr class="odd">
<td>theme_dark()</td>
<td>Dark theme, useful for contrast</td>
</tr>
<tr class="even">
<td>theme_gray()</td>
<td>Default theme</td>
</tr>
<tr class="odd">
<td>theme_light()</td>
<td>Light axes with grid lines</td>
</tr>
<tr class="even">
<td>theme_linedraw()</td>
<td>Black grid lines</td>
</tr>
<tr class="odd">
<td>theme_minimal()</td>
<td>Minimal theme</td>
</tr>
<tr class="even">
<td>theme_void()</td>
<td>Blank theme</td>
</tr>
</tbody>
</table>
<p>Let's try out the different themes.</p>
<div class="sourceCode" id="cb344"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb344-1"><a href="task-04.html#cb344-1" aria-hidden="true" tabindex="-1"></a>p <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv, <span class="at">colour =</span> rad)) <span class="sc">+</span></span>
<span id="cb344-2"><a href="task-04.html#cb344-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span>
<span id="cb344-3"><a href="task-04.html#cb344-3" aria-hidden="true" tabindex="-1"></a>p1 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">theme_bw</span>() <span class="sc">+</span> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">&quot;White with grid&quot;</span>) <span class="sc">+</span> <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>)</span>
<span id="cb344-4"><a href="task-04.html#cb344-4" aria-hidden="true" tabindex="-1"></a>p2 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">theme_classic</span>() <span class="sc">+</span> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">&quot;Classic&quot;</span>) <span class="sc">+</span> <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>)</span>
<span id="cb344-5"><a href="task-04.html#cb344-5" aria-hidden="true" tabindex="-1"></a>p3 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">theme_dark</span>() <span class="sc">+</span> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">&quot;Dark&quot;</span>) <span class="sc">+</span> <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>)</span>
<span id="cb344-6"><a href="task-04.html#cb344-6" aria-hidden="true" tabindex="-1"></a>p4 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">theme_gray</span>() <span class="sc">+</span> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">&quot;Default&quot;</span>) <span class="sc">+</span> <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>)</span>
<span id="cb344-7"><a href="task-04.html#cb344-7" aria-hidden="true" tabindex="-1"></a>p5 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">theme_light</span>() <span class="sc">+</span> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">&quot;Light with grid&quot;</span>) <span class="sc">+</span> <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>)</span>
<span id="cb344-8"><a href="task-04.html#cb344-8" aria-hidden="true" tabindex="-1"></a>p6 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">theme_linedraw</span>() <span class="sc">+</span> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">&quot;Black grid lines&quot;</span>) <span class="sc">+</span> <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>)</span>
<span id="cb344-9"><a href="task-04.html#cb344-9" aria-hidden="true" tabindex="-1"></a>p7 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">theme_minimal</span>() <span class="sc">+</span> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">&quot;Minimal&quot;</span>) <span class="sc">+</span> <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>)</span>
<span id="cb344-10"><a href="task-04.html#cb344-10" aria-hidden="true" tabindex="-1"></a>p8 <span class="ot">&lt;-</span> p <span class="sc">+</span> <span class="fu">theme_void</span>() <span class="sc">+</span> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">&quot;Blank&quot;</span>) <span class="sc">+</span> <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;none&quot;</span>)</span>
<span id="cb344-11"><a href="task-04.html#cb344-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb344-12"><a href="task-04.html#cb344-12" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p1, p2, p3, p4, p5, p6, p7, p8, <span class="at">ncol =</span> <span class="dv">4</span>, <span class="at">nrow =</span> <span class="dv">2</span>, <span class="at">heights =</span> <span class="fl">1.2</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-133-1.png" width="672" /></p>
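<p>A built-in theme can also serve as the starting point for a custom one. The sketch below is only an illustration under stated assumptions: it uses the <code>mtcars</code> dataset that ships with R instead of the chapter's data so that it runs on its own, and the object names <code>custom_theme</code> and <code>p_custom</code> are made up. It starts from <code>theme_bw()</code> and overrides a few elements with <code>theme()</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Start from a complete built-in theme and override individual elements
custom_theme &lt;- theme_bw() +
  theme(
    plot.title = element_text(face = &quot;bold&quot;, hjust = 0.5), # bold, centred title
    panel.grid.minor = element_blank(),                     # drop minor grid lines
    legend.position = &quot;bottom&quot;                              # move the legend below the plot
  )

# mtcars is used here only so that the example is self-contained
p_custom &lt;- ggplot(mtcars, aes(x = wt, y = mpg, colour = hp)) +
  geom_point() +
  labs(title = &quot;Custom theme sketch&quot;) +
  custom_theme
p_custom</code></pre></div>
<p>Because <code>custom_theme</code> is an ordinary object, it can be added to any other plot with <code>+</code>, which keeps a set of figures visually consistent.</p>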
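<p>Besides the themes that come with ggplot2, there are many extension theme packages, such as ggthemes and ggthemr.
ggthemes is published on CRAN, so it is a recommended choice.
ggthemr has very attractive color palettes, so it is recommended as well.</p>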
<p>ggthemes link: <a href="https://github.com/jrnold/ggthemes" class="uri">https://github.com/jrnold/ggthemes</a></p>
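<p>As a rough sketch of how ggthemes is typically used (assuming the package has been installed from CRAN with <code>install.packages("ggthemes")</code>), a theme function from the package is added to a plot just like a built-in theme. The example uses the <code>mtcars</code> dataset that ships with R so that it is self-contained, and the object names are chosen only for illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)
library(ggthemes)

# mtcars is used here only so that the example is self-contained
base_plot &lt;- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

# Theme and matching colour scale in the style of The Economist
p_economist &lt;- base_plot + theme_economist() + scale_colour_economist()

# A sparse theme inspired by Edward Tufte's designs
p_tufte &lt;- base_plot + theme_tufte()

p_economist
p_tufte</code></pre></div>
<p>Unlike ggthemr, these functions change only the plot they are added to, so different plots in the same session can use different ggthemes styles.</p>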
<p>ggthemr link: <a href="https://github.com/Mikata-Project/ggthemr" class="uri">https://github.com/Mikata-Project/ggthemr</a></p>
<p>Because ggthemr is not on CRAN, it has to be installed from GitHub:</p>
<div class="sourceCode" id="cb345"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb345-1"><a href="task-04.html#cb345-1" aria-hidden="true" tabindex="-1"></a><span class="co"># devtools::install_github(&#39;Mikata-Project/ggthemr&#39;)</span></span></code></pre></div>
<p>Usage is very simple. Here we use the greyscale theme, which I am fond of:</p>
<div class="sourceCode" id="cb346"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb346-1"><a href="task-04.html#cb346-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggthemr)</span>
<span id="cb346-2"><a href="task-04.html#cb346-2" aria-hidden="true" tabindex="-1"></a><span class="fu">ggthemr</span>(<span class="st">&quot;greyscale&quot;</span>)</span>
<span id="cb346-3"><a href="task-04.html#cb346-3" aria-hidden="true" tabindex="-1"></a>p3 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv, <span class="at">colour =</span> <span class="fu">factor</span>(rad))) <span class="sc">+</span></span>
<span id="cb346-4"><a href="task-04.html#cb346-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span>
<span id="cb346-5"><a href="task-04.html#cb346-5" aria-hidden="true" tabindex="-1"></a>p4 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv, <span class="at">colour =</span> rad)) <span class="sc">+</span></span>
<span id="cb346-6"><a href="task-04.html#cb346-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span>
<span id="cb346-7"><a href="task-04.html#cb346-7" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p3, p4, <span class="at">nrow =</span> <span class="dv">1</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-135-1.png" width="672" /></p>
<p>Now try the light theme, whose color palette is very soft:</p>
<div class="sourceCode" id="cb347"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb347-1"><a href="task-04.html#cb347-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggthemr)</span>
<span id="cb347-2"><a href="task-04.html#cb347-2" aria-hidden="true" tabindex="-1"></a><span class="fu">ggthemr</span>(<span class="st">&quot;light&quot;</span>)</span>
<span id="cb347-3"><a href="task-04.html#cb347-3" aria-hidden="true" tabindex="-1"></a>p3 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv, <span class="at">colour =</span> <span class="fu">factor</span>(rad))) <span class="sc">+</span></span>
<span id="cb347-4"><a href="task-04.html#cb347-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span>
<span id="cb347-5"><a href="task-04.html#cb347-5" aria-hidden="true" tabindex="-1"></a>p4 <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(<span class="at">data =</span> boston_data, <span class="fu">aes</span>(<span class="at">x =</span> lstat, <span class="at">y =</span> medv, <span class="at">colour =</span> rad)) <span class="sc">+</span></span>
<span id="cb347-6"><a href="task-04.html#cb347-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span>
<span id="cb347-7"><a href="task-04.html#cb347-7" aria-hidden="true" tabindex="-1"></a><span class="fu">ggarrange</span>(p3, p4, <span class="at">nrow =</span> <span class="dv">1</span>)</span></code></pre></div>
<p><img src="RLearning_files/figure-html/unnamed-chunk-136-1.png" width="672" /></p>
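<p>Note that <code>ggthemr()</code> changes the defaults for the whole session rather than for a single plot. The following sketch, which assumes ggthemr is installed from GitHub as described above and again uses <code>mtcars</code> only to stay self-contained, shows one way to go back to the standard ggplot2 look with <code>ggthemr_reset()</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)
library(ggthemr)

# Switch the session-wide defaults to the &quot;light&quot; ggthemr theme
ggthemr(&quot;light&quot;)
p_themed &lt;- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()
p_themed # printed while the ggthemr defaults are active

# Remove the ggthemr settings and restore the ggplot2 defaults
ggthemr_reset()
p_default &lt;- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()
p_default # printed with the standard ggplot2 look</code></pre></div>
<p>Resetting at the end of a script avoids surprising changes in the appearance of plots drawn later in the same session.</p>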
<p>Practice:
Using the provided dataset, try out the different themes in ggthemr, and explore other visualizations of the Boston housing price data.</p>
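<p>ggplot2 is a classic R package for data visualization with a very rich feature set. For reasons of space we cannot cover every method in ggplot2, so this chapter walks through a few common chart types as a starting point. If you are interested in ggplot2, the official website offers much more detailed material, and we look forward to seeing your own data visualization work.</p>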
</div>
<div id="本章作者-4" class="section level2 unnumbered">
<h2>Chapter author</h2>
<p><strong>牧小熊</strong></p>
<blockquote>
<p>Graduate student at Huazhong Agricultural University, Datawhale member, and Datawhale outstanding original author<br />
Zhihu: <a href="https://www.zhihu.com/people/muxiaoxiong" class="uri">https://www.zhihu.com/people/muxiaoxiong</a></p>
</blockquote>
</div>
<div id="关于datawhale-4" class="section level2 unnumbered">
<h2>About Datawhale</h2>
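<p>Datawhale is an open-source organization focused on data science and AI. It brings together outstanding learners from universities and well-known companies across many fields, and has gathered a team of members who share an open-source and exploratory spirit. With the vision "for the learner, growing together with learners", Datawhale encourages authentic self-expression, openness and inclusiveness, mutual trust and help, the courage to experiment, and the willingness to take responsibility. Datawhale also applies open-source principles to open-source content, open-source learning, and open-source solutions, supporting talent development and personal growth, and building connections between people, between people and knowledge, between people and companies, and between people and the future. The material from this data mining learning track will be shared on Tianchi; follow Datawhale for details.</p>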
<p><img src="image/logo.png" width="129" /></p>
</div>
</div>
</section>
</div>
</div>
</div>
<a href="task-03.html" class="navigation navigation-prev " aria-label="Previous page"><i class="fa fa-angle-left"></i></a>
<a href="task-05.html" class="navigation navigation-next " aria-label="Next page"><i class="fa fa-angle-right"></i></a>
</div>
</div>
<script src="libs/gitbook-2.6.7/js/app.min.js"></script>
<script src="libs/gitbook-2.6.7/js/lunr.js"></script>
<script src="libs/gitbook-2.6.7/js/clipboard.min.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-search.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-sharing.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-fontsettings.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-bookdown.js"></script>
<script src="libs/gitbook-2.6.7/js/jquery.highlight.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-clipboard.js"></script>
<script>
gitbook.require(["gitbook"], function(gitbook) {
gitbook.start({
"sharing": {
"github": true,
"facebook": false,
"twitter": false,
"linkedin": true,
"weibo": true,
"instapaper": false,
"vk": false,
"whatsapp": false,
"all": ["facebook", "twitter", "linkedin", "weibo", "instapaper", "whatsapp"]
},
"fontsettings": {
"theme": "white",
"family": "sans",
"size": 2
},
"edit": {
"link": null,
"text": null
},
"history": {
"link": null,
"text": null
},
"view": {
"link": "https://github.com/FinYang/RLearning-book/blob/main/Task04_Visualization.Rmd",
"text": null
},
"download": ["RLearning.pdf"],
"toc": {
"collapse": "subsection"
}
});
});
</script>
</body>
</html>