-
Notifications
You must be signed in to change notification settings - Fork 6
/
Matrices.html
60 lines (60 loc) · 6.01 KB
/
Matrices.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<title></title>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; line-height: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; }
code > span.dt { color: #902000; }
code > span.dv { color: #40a070; }
code > span.bn { color: #40a070; }
code > span.fl { color: #40a070; }
code > span.ch { color: #4070a0; }
code > span.st { color: #4070a0; }
code > span.co { color: #60a0b0; font-style: italic; }
code > span.ot { color: #007020; }
code > span.al { color: #ff0000; font-weight: bold; }
code > span.fu { color: #06287e; }
code > span.er { color: #ff0000; font-weight: bold; }
</style>
<link rel="stylesheet" href="Class.css" type="text/css" />
</head>
<body>
<h1 id="matrices">Matrices</h1>
<p>Matrices are important for linear algebra, distances (although we typically represent only the upper-diagonal of a symmetric distance matrix), and a few other situations. However, matrices are not as important for data analysis as people think, and instead we want data frames. Matrices and data.frames are quite similar conceptually but are represented very differently.</p>
<p>Every cell in a matrix must have the same type, e.g. logical, integer, numeric, character. So different columns cannot have a different type. And in fact, matrices in R are represented as vectors so must be logical, integer, numeric, character, etc. So all the elements will be coerced to the type that loses no information.</p>
<p>So there is no way to represent integer values along side dates, real values, strings and categorical data within a matrix.</p>
<p>Consider the integer matrix</p>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">matrix</span>(<span class="dv">1</span>:<span class="dv">15</span>, <span class="dv">5</span>, <span class="dv">3</span>)
[,<span class="dv">1</span>] [,<span class="dv">2</span>] [,<span class="dv">3</span>]
[<span class="dv">1</span>,] <span class="dv">1</span> <span class="dv">6</span> <span class="dv">11</span>
[<span class="dv">2</span>,] <span class="dv">2</span> <span class="dv">7</span> <span class="dv">12</span>
[<span class="dv">3</span>,] <span class="dv">3</span> <span class="dv">8</span> <span class="dv">13</span>
[<span class="dv">4</span>,] <span class="dv">4</span> <span class="dv">9</span> <span class="dv">14</span>
[<span class="dv">5</span>,] <span class="dv">5</span> <span class="dv">10</span> <span class="dv">15</span></code></pre>
<p>However, this is arranged column-wise as an integer vector</p>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">as.integer</span>(<span class="kw">matrix</span>(<span class="dv">1</span>:<span class="dv">15</span>, <span class="dv">5</span>, <span class="dv">3</span>))
[<span class="dv">1</span>] <span class="dv">1</span> <span class="dv">2</span> <span class="dv">3</span> <span class="dv">4</span> <span class="dv">5</span> <span class="dv">6</span> <span class="dv">7</span> <span class="dv">8</span> <span class="dv">9</span> <span class="dv">10</span> <span class="dv">11</span> <span class="dv">12</span> <span class="dv">13</span> <span class="dv">14</span> <span class="dv">15</span></code></pre>
<p>When creating the matrix, we can use byrow = TRUE to tell R to arrange the given elements along the rows, e.g.,</p>
<pre class="sourceCode r"><code class="sourceCode r">m =<span class="st"> </span><span class="kw">matrix</span>(<span class="dv">1</span>:<span class="dv">15</span>, <span class="dv">5</span>, <span class="dv">3</span>, <span class="dt">byrow =</span> <span class="ot">TRUE</span>)
[,<span class="dv">1</span>] [,<span class="dv">2</span>] [,<span class="dv">3</span>]
[<span class="dv">1</span>,] <span class="dv">1</span> <span class="dv">2</span> <span class="dv">3</span>
[<span class="dv">2</span>,] <span class="dv">4</span> <span class="dv">5</span> <span class="dv">6</span>
[<span class="dv">3</span>,] <span class="dv">7</span> <span class="dv">8</span> <span class="dv">9</span>
[<span class="dv">4</span>,] <span class="dv">10</span> <span class="dv">11</span> <span class="dv">12</span>
[<span class="dv">5</span>,] <span class="dv">13</span> <span class="dv">14</span> <span class="dv">15</span></code></pre>
<p>However, R still stores the values in column-order, i.e., the values of the first column followed by those of the second column, and so on:</p>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">as.integer</span>(m)
[<span class="dv">1</span>] <span class="dv">1</span> <span class="dv">4</span> <span class="dv">7</span> <span class="dv">10</span> <span class="dv">13</span> <span class="dv">2</span> <span class="dv">5</span> <span class="dv">8</span> <span class="dv">11</span> <span class="dv">14</span> <span class="dv">3</span> <span class="dv">6</span> <span class="dv">9</span> <span class="dv">12</span> <span class="dv">15</span></code></pre>
<p>Matrices are useful ways to store 2-dimensional where the types are the same and we want to do operations that are easier and/or more efficient than a collection of vectors or a data.frame. See BML model simulation in Case Studies in Data Science: Nolan, Temple Lang.</p>
<p>We can subset matrices in the same way we can data frames, i.e., m[i, j]. However, matrices also support subsetting by matrices. <a href="MatrixSubsetting.html">See here</a></p>
</body>
</html>