Rudin – Analysis – Ch. 1, Ex. 6


This is an important exercise. Here we learn to extend a formal notation in several steps. First we firm up the notion of a rational exponent (a) and show that the definition is consistent (b). Next we extend this definition to all real numbers (c) in a sensible way, and then we show that this extension is also self-consistent (d).

The importance here is that we are learning how to firm up our notions (in this instance of exponentiation) and the importance of confirming these notions through rigor in parts (b) and (d).

One other important thing to note is that although we come up with a good definition for irrational exponents, we do not have a good method for computing the actual value. We may understand that \(b^x\) for irrational \(x\) is the supremum of the rational powers below it, and this may suffice for a definition, but the definition doesn’t tell us how close a given approximation is. (You will get there, but you have no idea if you are close or far.) It won’t be until we define our concept of limit that we can answer that question.

Fix \(b > 1\)
(a) If \(m\), \(n\), \(p\), \(q\) are integers, \(n > 0\), \(q > 0\), and \(r=m/n=p/q\), prove that
\[
\left(b^m\right)^{1/n} = \left(b^p\right)^{1/q}
\]
Hence it makes sense to define \(b^r=\left(b^p\right)^{1/q}\)

First we note that \(b^{mn} = (b^m)^n = (b^n)^m \). This should be obvious from the definition of exponentiation (of integers.)

Second suppose \(b^{mn} = x\), then \(b = x^{1/mn}\). But \(b^{mn} = (b^m)^n = x\) so \(b^m = x^{1/n}\), and thus \(b = (x^{1/n})^{1/m}\). Likewise \(b = (x^{1/m})^{1/n}\).

Lastly if \( r= m/n = p/q \) then \(mq = np\), and specifically \((x^{np})^{1/mq}=x\)

Armed with these three facts, consider \((b^m)^{1/n}\)
\[
\begin{align}
(b^m)^{1/n} &= [((b^m)^{1/n})^{np}]^{1/mq} \\
&= [(((b^m)^{1/n})^n)^p]^{1/mq} \\
&= [(b^m)^p]^{1/mq} \\
&= [(b^p)^m]^{1/mq} \\
&= [((b^p)^m)^{1/m}]^{1/q} \\
&= (b^p)^{1/q}
\end{align}
\]

The point of this exercise is that \(b^r\) does not depend on which representation \(m/n\) of the rational number \(r\) we use. (In other words our definition is consistent.)
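As a quick numerical sanity check of part (a), here is a minimal sketch in floating point. The helper name `rational_power` and the sample values are my own choices for illustration, and floating-point agreement is of course only evidence, not a proof.

```python
import math


def rational_power(b: float, m: int, n: int) -> float:
    """Compute (b^m)^(1/n) in floating point, assuming b > 1 and n > 0."""
    return (b ** m) ** (1.0 / n)


b = 2.0
# 2/3, 4/6 and 10/15 are three representations of the same rational r.
representations = [(2, 3), (4, 6), (10, 15)]
values = [rational_power(b, m, n) for (m, n) in representations]

for (m, n), value in zip(representations, values):
    print(f"(b^{m})^(1/{n}) = {value}")

# All three values should agree up to rounding error.
agree = all(math.isclose(v, values[0], rel_tol=1e-12) for v in values)
print("all representations agree:", agree)
```

The three printed values differ at most in the last few digits, which is what the consistency of the definition predicts.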

(b) Prove that \(b^{r+s} = b^r b^s\) if \(r\) and \(s\) are rational.

Let \(r = m/n\) and \(s = p/q\), then
\[
\begin{align}
b^{r+s} &= b^{m/n+p/q} \\
&= b^{\frac{mq+pn}{nq}} \\
&= (b^{mq+pn})^{1/nq} \\
&= (b^{mq}b^{pn})^{1/nq} \\
&= (b^{mq})^{1/nq}(b^{pn})^{1/nq} \\
&= (b^m)^{1/n}(b^p)^{1/q} \\
&= b^rb^s
\end{align}
\]

With the two parts above proven, we have extended our formal notation for exponentiation to all rational exponents. Next we complete the extension to all real numbers.

(c) If \(x\) is real, define \(B(x)\) to be the set of all numbers \(b^t\) where \(t\) is rational and \(t \le x\). Prove that
\[
b^r = \sup B(r)
\]
when \(r\) is rational. Hence it makes sense to define
\[
b^x=\sup B(x)
\]
for every real \(x\)

First we observe that if \(1< b\) then by (1.18) in the text \(b \cdot 1 < b \cdot b\) or \(b < b^2\). Thus by induction we have \( 1 < b < b^2 < b^3 < \dots < b^m\). Succinctly, if \(b > 1\) then for any integers \(m < n\) we have \(b^m < b^n\). Likewise we can see that for any \( b > 1 \) we have \( b^{1/n} > 1\) (otherwise \(b = (b^{1/n})^n \le 1\)). Applying this to \(x = b^m > 1\) with \(m > 0\) gives \(x^{1/n} = b^{m/n} > 1\), so \(b^s > 1\) for every rational \(s > 0\).

Now consider the set \(B(r)\) where \(r\) is rational. Let \(t < r\) be rational, so that \(b^t \in B(r)\). Since \(t < r\) we have \(0 < r - t\). Let \(s = r- t > 0\) (so \( r = t+s\)); then
\[
\begin{align}
b^r &= b^{t+s} & \\
&=b^tb^s >b^t \cdot 1 = b^t & \text{by proposition (1.18)(b) in the text ($x=b^t$, $y=b^s$, $z=1$)}\\
b^r &> b^t
\end{align}
\]

Thus for any rational \( t < r\), \(b^t < b^r\), so \(b^r\) is an upper bound of \(B(r)\). Moreover \(b^r\) itself belongs to \(B(r)\) (take \(t = r\)), so no smaller number can be an upper bound. Therefore \(b^r = \sup B(r)\).

(d) Prove that \(b^{x+y}=b^xb^y\) for all real \(x\) and \(y\)

Keep \(x\) fixed and define \(B_x(y) = \{ b^{x+r} : r \in \mathbb{Q}, \ r \le y\}\). For rational \(r\) we have
\[
\begin{align}
b^{x+r} &= \sup B(x+r) \\
&= \sup \{ b^{s+r} : s \in \mathbb{Q}, s \le x \} \\
&= \sup \{ b^s b^r : s \in \mathbb{Q}, s \le x \} \\
&= b^r \sup \{ b^s : s \in \mathbb{Q}, s \le x \} \\
&= b^r b^x
\end{align}
\]

We can now compute \(B_x(y)\)
\[
\begin{align}
B_x(y) &= \{ b^{x+r} : r \in \mathbb{Q}, r \le y \} \\
&= \{ b^x b^r : r \in \mathbb{Q}, r \le y \} \\
&= b^x \{b^r : r \in \mathbb{Q}, r \le y \} \\
&= b^x B(y)
\end{align}
\]

Since \(b^x > 0\), multiplying every element of a set by \(b^x\) multiplies its supremum by \(b^x\). But then \(b^{x+y} = \sup B_x(y) = \sup b^x B(y) = b^x \sup B(y) = b^x b^y\).
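To make the remark about approximation concrete, here is a rough sketch that walks up through elements of \(B(x)\) for \(b = 2\) and \(x = \sqrt{2}\). Everything here is an assumption for illustration: `lower_rationals` is a made-up helper, the float `math.sqrt(2.0)` stands in for the irrational exponent, and Python's built-in float power stands in for \(b^t\) with rational \(t\).

```python
import math
from fractions import Fraction


def lower_rationals(x: float, digits: int) -> list[Fraction]:
    """Rationals t_1 <= t_2 <= ... <= x obtained by truncating x to k decimals."""
    return [Fraction(math.floor(x * 10**k), 10**k) for k in range(1, digits + 1)]


b = 2.0
x = math.sqrt(2.0)  # float stand-in for an irrational exponent

ts = lower_rationals(x, 8)
# Each b^t with rational t <= x is an element of B(x).
approximations = [b ** float(t) for t in ts]

for t, value in zip(ts, approximations):
    print(f"t = {float(t):.8f}   b^t = {value:.10f}")

print("float value of b^x:", b ** x)
```

The printed values increase towards the supremum, but nothing in the definition itself says how far a given \(b^t\) still is from \(b^x\); here we can only tell because we compare against the library's value.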

Rudin – Analysis – Ch. 1, Ex. 5


Let \(A\) be a nonempty set of real numbers which is bounded below. Let \(-A\) be the set of all numbers \(-x\), where \(x \in A\). Prove that
\[
\inf A = \,-\sup(-A)
\]

To show this we first observe that if \( x < y\) then \( -y < -x \):
\[
\begin{align}
x &< y \\
x-y &< y-y \\
x-y &< 0 \\
-x + ( x - y ) &< -x + 0 \\
(-x + x) -y &< -x \\
0 - y &< -x \\
-y &< -x
\end{align}
\]

Since \(A\) is nonempty, we know that there exists \(x \in A\) and for any \(x \in A\) we have \(-x \in -A\). It should be clear that there are no other elements of \(-A\). And likewise for any element \(y \in -A\), \(-y \in A\) (and there are no other elements of \(A\).)

Let \(x\) be any member of \(A\) then \(\inf A \le x \) and from our note above, \( -x \le -\inf A\). Therefore for all \(-x \in -A\), \(-x \le -\inf A\) thus \(-A\) is bounded above by \(-\inf A\). So \(-\inf A\) is an upper bound of \(-A\). We now need to show that it is the least upper bound, i.e. \(-\inf A = \sup(-A)\).

Suppose that this is not the case, i.e. \( -\inf A \ne \sup(-A)\). Since \(-\inf A\) is an upper bound of \(-A\), this means \(\sup (-A) < -\inf A\). Then there exists \(\beta < -\inf A \) (namely \(\beta = \sup(-A)\)) such that for all \(y \in -A\), \(y \le \beta\). If \(y \in -A\) then \(-y \in A\) and \(-\beta \le -y\), thus \( -\beta\) would be a lower bound of \(A\). But because \(\beta < -\inf A \) we have \( \inf A < -\beta\). In other words we have \(\beta\) such that for all \(-y \in A\), \(\inf A < -\beta \le -y\). This is contrary to the assumption that \(\inf A\) is the infimum, or greatest lower bound, of \(A\). Thus \(\beta\) does not exist, so \(-\inf A = \sup(-A)\), i.e. \(\inf A =\,-\sup(-A)\).
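For a finite set the infimum and supremum are just the minimum and maximum, so the identity can be checked directly. This is only a toy illustration with sample numbers I picked; it says nothing about infinite sets, which is where the proof above is actually needed.

```python
# A small finite sample set; for finite sets inf = min and sup = max.
A = [3.5, -2.0, 7.25, 0.0]

# The set -A consists of the negatives of the elements of A.
neg_A = [-x for x in A]

inf_A = min(A)
sup_neg_A = max(neg_A)

print("inf A     =", inf_A)
print("-sup(-A)  =", -sup_neg_A)
```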

Rudin – Analysis – Ch.1, Ex. 4


Let \(E\) be a nonempty subset of an ordered set; suppose \(\alpha\) is a lower bound of \(E\) and \(\beta\) is an upper bound of \(E\). Prove that \(\alpha\le\beta\).

\(E\) is nonempty, thus there exists some element, \(x \in E\). Because \(\alpha\) is a lower bound of \(E\), \(\alpha \le x\). Because \(\beta\) is an upper bound of \(E\), \( x \le \beta\).

By transitivity of the ordering relation, \(\alpha \le x\) and \(x \le \beta\) imply that \(\alpha \le \beta \).

Rudin – Analysis – Ch. 1, Ex. 3

In this exercise we prove some implications of the axioms of multiplication over a field.

This is pretty much an exercise in practicing notating formal proofs. There is nothing enlightening here and we can use proposition (1.14) in the book as an outline.

In all of the parts below, \(x, y, z \in F\), where \(F\) is a field with additive identity \(0\) and multiplicative identity \(1\).

(a) If \(x \ne 0\) and \(xy = xz\) then \(y=z\)
\[
\begin{align}
y &= 1y & \text{(identity element)}\\
&= \left( \frac{1}{x} x \right) y & \text{(existence and definition of multiplicative inverse)}\\
&= \frac{1}{x} \left(x y\right) & \text{(associativity of multiplication)}\\
&= \frac{1}{x} \left(x z\right) & \text{(by assumption)}\\
&= \left(\frac{1}{x} x\right)z & \text{(associativity of multiplication)}\\
&= 1z & \text{(definition of multiplicative inverse)}\\
&= z & \text{(definition of multiplicative identity)}
\end{align}
\]

(b) If \(x \ne 0\) and \(xy=x\) then \(y=1\)
\[
\begin{align}
xy &= x & \text{(by assumption)}\\
xy &= x1 & \text{(by definition of the multiplicative identity)} \\
y &= 1 & \text{(cancellation by part (a) above with }z = 1\text{)}
\end{align}
\]

(c) If \(x \ne 0\) and \(xy=1\) then \(y=1/x\)
\[
\begin{align}
xy &= 1 & \text{(by assumption)} \\
\left(\frac{1}{x}\right)\left(xy\right) &= \left(\frac{1}{x}\right) 1 & \text{(existence of }1/x\text{)} \\
\left(\left(\frac{1}{x}\right) x\right)y &= \left(\frac{1}{x}\right) 1 & \text{(associativity)} \\
1 y &= \left(\frac{1}{x}\right) 1 & \text{(definition of }1/x\text{)} \\
y &= \frac{1}{x} & \text{(definition of the multiplicative identity)}
\end{align}
\]

(d) If \(x \ne 0\) then \(1/(1/x) = x\)

In order to avoid confusion we relabel (c) as:
If \(\alpha \ne 0\) and \(\alpha\beta=1\) then \(\beta=1/\alpha\)

Now let \(\alpha = 1/x\) and \(\beta = x\) then \(\alpha\beta = \left(\frac{1}{x}\right)x = 1\), therefore by (c) \(\beta = 1/\alpha\) or substituting \( x = 1/(1/x)\).

Rudin – Analysis – Ch. 1, Ex 1


This exercise is very simple but it serves as a model for using \( (\neg q \Rightarrow \neg p ) \Leftrightarrow (p \Rightarrow q )\). That is to say, a model for proving something by proving the contrapositive of the statement. In other words to show \( p \Rightarrow q \) we will prove \( \neg q \Rightarrow \neg p\).

It is important to understand this type of argument because it is used often, but it is almost always written in words similar to “which is contrary to our assumption that…”. I find that at times this language, although correct, can sound suspicious. (When I don’t follow I sometimes think to myself “well who told you to assume that?”)

Because of its simplicity, this exercise is easy to follow and being in two parts, we can use this as a model to fall back on. We’ll do the first part in somewhat pleonastic language, but the logical implications should be clearer. We’ll prove the second part (which has basically the same argument, only replacing addition with multiplication) with more usual language.

In this context in order to show that if \(x\) is irrational and \(r\) is rational (\(r\ne 0\)) then \(x+r\) is irrational, we will instead prove that if \(x+r\) is rational (i.e. not irrational) then \(x\) is rational (i.e. not irrational).

(a) If \(x\) is irrational and \(r \ne 0\) is rational then \(x+r\) is irrational.
Here is an overly formal way to write up our proof.

To prove the irrationality of \(x+r\) let’s assume the opposite (the negation.) So \(x+r\) is not irrational. This means that (we assume) \(x+r\) is rational.

Having assumed this, we now want to prove the negation of the original assumption (that \(x\) is not irrational.)

If \(x+r\) is rational then there exist integers \(p\) and \(q\) (\(q \ne 0\)) such that \(x+r=p/q\).

We also assume that \(r\) is rational, thus there also exist integers \(n\) and \(m\) (\(m \ne 0\)) such that \(r=n/m\).

We then have \(x+n/m=p/q\). Solving for \(x\) we see \(x = p/q-n/m = \frac{pm-nq}{qm}\). This is clearly a rational expression, thus \(x\) is rational, which is what we wanted to prove. We have shown that if \(x+r\) is not irrational (i.e. \(x+r\) is rational) then \(x\) is rational.

Therefore the contrapositive is true: if \(x\) is irrational then \(x+r\) is irrational.

(b) If \(x\) is irrational and \(r\ne 0\) is rational then \(xr\) is irrational.
We’ll now show the same for \(rx\), but will use the more usual vernacular:

Suppose \(x\) is irrational and \(r\) is rational (\(r \ne 0\)). If \(rx\) is rational then \(rx = n/m\) where \(n\) and \(m\) are integers (\(n, m \ne 0\)). Similarly since \(r\) is rational, \(r=p/q\) for integers \(p\) and \(q\), with \(p \ne 0\) because \(r \ne 0\). Thus \(\frac{p}{q}x=\frac{n}{m}\) and \(x=\frac{qn}{pm}\). This implies that \(x\) is rational, which is contrary to our assumption that \(x\) is irrational.

Hence rational \(rx\) is impossible for irrational \(x\).

Rudin – Analysis – Ch.1, Ex. 2


This exercise proves the irrationality of \(\sqrt{12}\). First we make a note about a rational number expressed as \(p/q\) where \(p/q\) is in lowest terms and then use Rudin’s recipe for the irrationality of \(\sqrt{2}\).

If \(n=p/q\) and \(m\) is an integer, \(m > 1\), that divides both \(p\) and \(q\) then \(p'=p/m\) and \(q'=q/m\) are both integers and \(n=p'/q'\). Therefore we can always reduce \(p/q\) so that there is no integer \(m\) (prime or otherwise) such that \(m > 1\), \(m\,|\,p\) and \(m\,|\,q\). When we have such \(p\) and \(q\) we then say that the representation of \(n\) is in lowest terms.

Following Rudin’s argument for \(\sqrt{2}\), note that the word even is just another way of saying “divisible by 2” and odd is another way of saying “not divisible by 2.”

Assume we have integers \(p\) and \(q\) such that \( \left(\frac{p}{q}\right)^2= 12 \). Without loss of generality we can assume that these integers are in lowest terms. (That is to say there is no integer \(> 1\) that divides both \(p\) and \(q\).) Then
\[
p^2=12q^2 = 4 \cdot 3 \cdot q^2
\]

Therefore 3 divides \(p^2\) (and thus, since 3 is prime, 3 divides \(p\)).
If 3 divides \(p\) then \(p = 3m\) for some integer \(m\). Thus 9 divides \(p^2\) (\(p^2=(3m)^2=9m^2\)), so
\[
9 \cdot m^2 = 4 \cdot 3 \cdot q^2
\]
and thus
\[
3 \cdot m^2  = 4 \cdot q^2
\]

Since 3 does not divide 4, 3 must then divide \(q^2\) and hence \(q\). But then 3 divides both \(p\) and \(q\), which is contrary to our assumption that \(p/q\) is in lowest terms. Therefore \(p/q\) cannot exist, i.e. there is no rational \(n = p/q\) such that \(n^2=12\).
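A brute-force search cannot prove irrationality, but it is a reassuring cross-check of the argument. The sketch below (the function name `has_rational_sqrt` and the search bound are my own choices) looks for integers \(p, q\) with \(p^2 = k q^2\) using exact integer arithmetic.

```python
import math


def has_rational_sqrt(k: int, max_q: int) -> bool:
    """Return True if p^2 = k * q^2 for some integers p >= 0 and 1 <= q <= max_q."""
    for q in range(1, max_q + 1):
        target = k * q * q
        p = math.isqrt(target)  # exact integer square root (floor)
        if p * p == target:
            return True
    return False


# No fraction p/q with q up to 10000 squares to 12 ...
print("12:", has_rational_sqrt(12, 10_000))
# ... whereas 16 is a perfect square, found immediately with q = 1.
print("16:", has_rational_sqrt(16, 10))
```

Using `math.isqrt` keeps the whole search in integers, so there is no floating-point rounding to worry about.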

The Chain Rule

The Chain Rule, using Rudin’s notation:

Let \(y=f(x)\). From the definition of the derivative we have:

\[
\begin{align}
f(t)-f(x)=&(t-x)[f'(x)+u(t)] \\
g(s)-g(y) =& (s-y)[g'(y)+v(s)]
\end{align}
\]

where \(u(t) \to 0\) as \(t \to x\) and \(v(s) \to 0\) as \(s\to y = f(x)\)

Let \(h(x) = g(f(x))\) and put \(s = f(t)\), so that \(s - y = f(t) - f(x)\). Then

\[
\begin{align}
h(t) - h(x) =& g(f(t)) - g(f(x)) \\
=& (s-y)[g'(y) + v(s)] \\
=& (f(t) - f(x))[g'(y) + v(s)] \\
=& (t-x)[f'(x)+u(t)][g'(y)+v(s)]
\end{align}
\]

Therefore
\[
\displaystyle \frac{h(t)-h(x)}{t-x} \, = \, \frac{(t-x)[f'(x)+u(t)][g'(y)+v(s)]}{(t-x)}
\]

As \(t \to x\) we have \(s = f(t) \to y\) (since \(f\) is differentiable, hence continuous, at \(x\)), \(u(t) \to 0\), and \(v(s) \to 0\), so
\[
h'(x) = f'(x) \, g'(f(x)) =g'(f(x)) \, f'(x)
\]
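The result can be checked numerically with a difference quotient. This is only a sketch: the choice \(f = \sin\), \(g = \exp\), the point \(x = 0.7\) and the step sizes are arbitrary examples of mine, not anything taken from Rudin.

```python
import math


def difference_quotient(func, x: float, t: float) -> float:
    """The quotient (func(t) - func(x)) / (t - x) from the definition of the derivative."""
    return (func(t) - func(x)) / (t - x)


# Example functions with known derivatives (chosen only for illustration).
f, f_prime = math.sin, math.cos
g, g_prime = math.exp, math.exp


def h(x: float) -> float:
    """The composition h(x) = g(f(x))."""
    return g(f(x))


x = 0.7
exact = g_prime(f(x)) * f_prime(x)  # chain rule: g'(f(x)) * f'(x)

for step in (1e-2, 1e-4, 1e-6):
    approx = difference_quotient(h, x, x + step)
    print(f"step = {step:g}   quotient = {approx:.10f}   error = {abs(approx - exact):.2e}")
```

As the step shrinks, the quotient approaches \(g'(f(x))\,f'(x)\), mirroring \(u(t) \to 0\) and \(v(s) \to 0\) in the derivation.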

sin(x)/x

To find the derivative of \(\sin(x)\) we need to know that \(\lim\limits_{x \to 0}\frac{\sin x}{x} = 1\). The limit itself is quite trivial, but I always forget how you start.

The key is similar triangles and the so called “squeeze” theorem for limits:

Consider the sector ABD of the unit circle shown below

AB and AD are radii (and since this is the unit circle the radius is 1); \(x\) is the arc of the unit circle subtended (?) by AB and AD; by definition of sine and cosine, \(\sin(x) = BE\) and \(\cos(x) = AE\)

We need to make the following assumptions:

  1. Two triangles are similar if two of their angles are congruent (AA or AAA postulate) (And we’re going to assume we know what similar means and what are the consequences of that.)
  2. Given two “shapes” (or more abstractly, sets) A and B, if every point in shape A is also in shape B (or in the language of sets \( A \subseteq B\)) then the area of A is less than or equal to the area of B. (In other words, if shape A is completely within shape B then the area of A will be less than or equal to the area of B.) Of course this assumes that we know what area is (i.e. that it is well defined). That is not trivial, but we’ll assume we know what we’re talking about. To be a bit more specific, our notion of area, \(\mu()\), of a “shape” (i.e. a set) must satisfy the following: given sets \(A\) and \(B\), if the areas \(\mu(A)\) and \(\mu(B)\) are defined and \(A \subseteq B\) then \(\mu(A) \le \mu(B)\).
  3. By inspection and #2: \( \text{Area of} \bigtriangleup ABE \le \text{Area of sector } ABD \le \text{Area of} \bigtriangleup ABC\).

The first thing to note is that \(\bigtriangleup ABE \sim \bigtriangleup BCE\) and therefore

\[
\displaystyle \frac{AB}{AE}=\frac{BC}{BE}
\]

Substituting our known lengths, we have

\[
\displaystyle \frac{1}{\cos(x)} = \frac{BC}{\sin(x)}
\]

Therefore

\[
\displaystyle BC = \frac{\sin(x)}{\cos(x)}=\tan(x)
\]

Having expressed all lengths in terms of \(x\), we can compute the areas of the triangles and sectors:

\[
\begin{align}
\text{Area of } \bigtriangleup \text{ABE} \quad & \le & \text{Area of sector } x \quad & \le & \text{Area of } \bigtriangleup \text{ABC} \\
\frac{1}{2} \times \cos(x) \times \sin(x) \quad & \le & \pi  1^2\times \frac{x}{2\pi} \quad & \le & \frac{1}{2} \times 1 \times \tan(x) \\
\frac{\sin(x)\cos(x)}{2} \quad &\le &\frac{x}{2} \quad & \le & \frac{\tan{x}}{2} \\
\cos(x) \quad & \le & \frac{x}{\sin(x)} \quad & \le & \frac{1}{\cos(x)}
\end{align}
\]

We can then reverse the inequalities by taking the reciprocals:

\[
\displaystyle \frac{1}{\cos(x)} \ge \frac{\sin(x)}{x} \ge  \cos(x)
\]

Now we can let \( x \to 0\). The limits on either side of the inequalities are \( \displaystyle \lim\limits_{x\to 0} \frac{1}{\cos(x)}=\lim\limits_{x\to 0} \cos(x) = 1 \) and thus by the “squeeze” theorem:

\[
\displaystyle \lim\limits_{x \to 0^{+}} \frac{\sin{x}}{x}=1
\]

We can only take the limit from the positive side because of how we have set up the inequalities. (The inequalities are the areas of the inscribed triangle, the sector and the circumscribed triangle.) Thus a “negative” side or area makes no sense. But this is easily rectified. For the limit from the other side:

\[
\displaystyle \lim\limits_{x\to 0^{-}} \frac{\sin(x)}{x}=\lim\limits_{x \to 0^{+}}\frac{\sin(-x)}{-x}=\lim\limits_{x \to 0^{+}}\frac{-\sin(x)}{-x}=\lim\limits_{x \to 0^{+}}\frac{\sin(x)}{x}=1
\]

Since the limits from each side are defined and equal, the limit itself is defined.

QED
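The squeeze can also be watched numerically. The sketch below (the helper `squeeze_bounds` and the sample points are just my illustration) evaluates the three quantities from the inequality \(\cos(x) \le \frac{\sin(x)}{x} \le \frac{1}{\cos(x)}\) at a few small values of \(x\), including a negative one to reflect the symmetry used for the left-hand limit.

```python
import math


def squeeze_bounds(x: float) -> tuple[float, float, float]:
    """Return (cos x, sin x / x, 1 / cos x) for nonzero x with |x| < pi/2."""
    return math.cos(x), math.sin(x) / x, 1.0 / math.cos(x)


sample_points = (1.0, 0.1, 0.01, 0.001, -0.001)
results = [squeeze_bounds(x) for x in sample_points]

for x, (lower, middle, upper) in zip(sample_points, results):
    print(f"x = {x:>7}:  {lower:.9f} <= {middle:.9f} <= {upper:.9f}")
```

Both outer columns close in on 1 as \(x\) shrinks, and the middle column is trapped between them.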

Notes:

  1. Many proofs I see use the assumption that BE < BD < BC, rather than the size of the triangles. I find this a little bit unintuitive. I can maybe accept the assumption that BE < BC. Using this assumption means that one has already proven or taken for granted that the shortest distance from a point (B) to a line (AC) is the segment through the point which is perpendicular to the line. Any other segment from the point to the line must necessarily be greater. Although this is true, it is not so immediately obvious. What I believe is in no way obvious though is that the length of the arc BD is something between the lengths BE and BC. On the other hand it seems much more obvious to me that if every point in set X is also in set Y (but perhaps not the other way around) then the “area” of Y must be at least as big as the area of X ( \( \mu(X) \leq \mu(Y) \) ). If you can accept that there is some sort of notion of what an area is for the simple shapes of triangles and sectors, then I believe this assumption to be much more obvious.

The point of it all.

So what’s this all about?

Well, if you have come across this blog, you’re more than welcome to browse and even comment, but please know that this blog isn’t really for you.

You see I am usually working through a math text (or sometimes something else) and most of the time I am keeping notes or working through exercises. The problem is I usually jump around in what I am studying and I tend to lose my paper notes/notebooks.

So this blog is an attempt to solve that. I am moving my notes into the Cloud, so to speak.

Thus, this blog is really for me. The reason I tell you that, is please don’t think I am a jerk because I don’t respond to your comments (or maybe even don’t post them. Though if you do have something interesting to add, that I want to remember I will be glad to post it.)

So for me, what’s the point? My purpose:

  • An online notebook for notes and exercises in various texts that I am working through.
  • Where I write down (once and for all) those proofs and derivations that one is supposed to know.
  • Notes on various other texts.
  • Perhaps a few musings now and then.
  • An opportunity to use/learn/hack WordPress
  • Something to fill my time.

A word of warning: Please don’t link to any of my pages for now. I am in the midst of moving some old posts and info into this blog and I haven’t set up any nice permalinks yet, so links may change without warning. Don’t worry. This is something I would like to fix soon.