### Homework 1

# Homework 1¶

### CS328 — Numerical Methods for Visual Computing¶

**Out** on Wednesday 28.9, **due** on Wednesday 12.10 (before class).

This notebook contains literate code, i.e. brief fragments of Python surrounded by descriptive text. Please use the same format when submitting your answers. Begin your response to each problem with a ` ## Solution ` markdown cell. Make sure to use the reference Python distribution so that project files can be opened by the TAs. In this course, we use Anaconda 4, specifically the version based on Python 3.5.

### Prelude¶

The following fragment imports `time`

(for benchmarking) along with NumPy and Matplotlib and configures the latter to produce nice graphics even on recent high-resolution screens. The import statements at the end establish a shorthand notation for the most common integer and floating point formats.

```
%matplotlib inline
%config InlineBackend.figure_format='retina'
import time
import numpy as np
from matplotlib import pyplot as plt
from numpy import uint16 as u16
from numpy import uint32 as u32
from numpy import uint64 as u64
from numpy import float16 as f16
from numpy import float32 as f32
from numpy import float64 as f64
```

When doing the exercises below, please pay attention to the fact that NumPy will often conservatively upgrade the representation of number types in arithmetic operations when the input argument types don't match. This may not be intended and give "too accurate" results—to ensure that computations are performed at the desired precision, all operands must be given using a consistent type.

```
print(type(f32(1) + 1.0)) # oops, 64 bit output (default Python floats use double precision)
print(type(f32(1) + f32(1.0))) # likely this was the intended behavior
print(type(u32(1) + 2)) # oops, '2' is interpreted as a 64 bit integer
print(type(u32(1) + u32(2))) # intended behavior
```

We'll also have to fix something: the NumPy square root function `np.sqrt()`

can change the data type (`dtype`

) of the input array in certain situations, so we'll define an alternative version to be used in this homework that does not have this behavior.

```
def sqrt(x):
x = np.array(x)
return np.sqrt(x, dtype=x.dtype)
```

Finally, two more definitions: the helper functions `f2i`

and `i2f`

below reinterpret floating point values as an integers and vice versa. We'll use these in Problem 1 to access the bit-level representation of an IEEE 754 floating point value.

```
def f2i(value):
return value.view('u%i' % value.itemsize)
def i2f(value):
return value.view('f%i' % value.itemsize)
```

## Problem 1: Warmup: ULPs and Absolute Error (10+10 points)¶

The bit-level layout of IEEE754 floating point values is specially designed such that contiguous ranges of positive (or negative) floating point numbers are also contiguous when re-interpreted as integers. Use this property along with the functions `f2i`

and `i2f`

and the aliases `f16`

etc. defined above to answer the following questions:

How many floating point values are located between the values

`1`

and`2`

, including the endpoints? How about`1001`

and`1002`

? Give answers for 16, 32, and 64 bit precision variants.How large is 1 ULP for the value constant $\pi$ expressed in 64 bit arithmetic? Suppose that we use this value to compute the surface area of the earth given its radius of $\approx 6.353\cdot 10^6m$. Assuming that the earth is perfectly spherical, and that the discretization of $\pi$ is the only source of error in this computation, specify the absolute error in square meters.

## Problem 2: Series Approximations, Forward/Backward Error (10 + 10 + 10)¶

Consider the power series representation of the sine function, which is defined as

$$ \sin(x)=x-\frac{x^3}{3!}+\frac{x^5}{5!}-\frac{x^7}{7!}+\cdots $$What are the

*relative*forward and backward errors when we approximate this function by its first two series terms? Provide results for $x=2^k$ where $k\in[-1, 0, 1]$.What trend do you observe? Plot the two functions (sine and its approximation) on a single graph covering the domain $[0, 2]$ and justify your observation.

The relative condition number of a function $f(x)$ at position $x$ is defined as $$ C(x):=\frac{\left|f(x+\delta x)-f(x)\right|}{\left|f(x)\right|} \big/ \frac{\left|\delta x\right|}{\left|x\right|}. $$ where $\delta x$ is the perturbation to the original problem that would be needed to explain the observed forward error if the computation was flawless (see the definition of the backward error in the text book for details). Compute the relative condition number for each of the three points referenced above and determine where the approximation is the most numerically stable.

List three different kinds of numerical problems that can arise when evaluating the above power series with very many terms (e.g. thousands), and what parts of the expression specifically cause them.

For what argument ranges is evaluating the sine function numerically sensitive? (In the sense that small perturbations of the input mantissa can lead to large changes in the computed result). Why is this the case?

## Problem 3: Solving quadratic Equations (10+10 points)¶

Certain types of numerical errors can manifest in sudden and unexpected ways (i.e. significant errors can appear even in short-running computations). Consider the classical formula for computing solutions of the quadratic equation

$$ ax^2 + bx +c = 0 \quad \Leftrightarrow\quad x_{1,2} = \frac{-b \pm \sqrt{b^2-4ac}}{2a}. $$Use this expression to compute the two roots corresponding to the following set of coefficients (using 16, 32, and 64 bit floating point formats):

$$ A = 1, B = -54.32, C = 0.1 $$What do you observe? Why is this happening?

The following mathematically equivalent definition is sometimes used: $$x_1=\frac{q}{A}, \quad x_2 = \frac{C}{q}$$ where $$ \DeclareMathOperator{\sign}{sign} q = -\frac{1}{2} \left(B + \sign({B}) \sqrt{B^2-4AC}\right). $$ Repeat the computation and discuss which of the two formulations is superior.

## Problem 4: Vectorization (10+10+10 points)¶

Generate a random vector (`np.random.random`

) with $10^7$ entries and store it in a global variable. Next, create two functions: the first should compute the sum of the first $N$ entries using a Python `for`

loop. The second should do the same via `np.sum()`

and array slicing notation.

Measure the time that is required to run each of these functions for 30 logarithmically spaced values of $N$ ranging from $10^1$ to $10^7$ (hint: `np.logspace`

may be useful. The function `time.perf_counter()`

provides a time value in fractional seconds).

Finally, visualize both timing curves in a joint plot with logarithmically spaced horizontal and vertical axes (`plt.loglog()`

).

The two curves should look very different. What is happening here?

Focus on the curve corresponding to the

`np.sum()`

implementation. There should be two clearly different trends for small and large $N$. Where do these come from?Divide the

`np.sum()`

measurements by the number of elements corresponding to each call, giving the time spent per entry. Generate another log-log plot of only this curve and justify the trend you observe.

## Hacker points (15 points)¶

Exercises designated as *hacker points* are undervalued problems that are completely optional (i.e. there is no need to do them to get full grades in this course). However, they will complement your experience and can be useful to offset points lost in other exercises. Partial answers don't count—hacker points are either awarded in full or not at all.

### Interval arithmetic¶

Use

`f2i`

and`i2f`

to define a new function`flt_next(value, dir)`

which takes a floating point value`value`

and a direction`dir`

(either`-1`

or`+1`

) to move to the next smaller or larger floating point value, respectively. It should work for positive and negative normalized inputs, though it is fine to ignore special cases such as overflow to infinity, underflow to zero, etc.(For future reference: NumPy provides the function

`np.nextafter`

which does exactly this while accounting for all special cases, but here we'll use our own implementation for educational reasons.)Read up on the rules of interval arithmetic and extend the interval class template below with the methods

`__add__`

,`__sub__`

,`__neg__`

(unary negation),`__mul__`

,`__truediv__`

(division with fractional result), and`sqrt`

. Since we unfortunately can't adjust the processor's rounding mode in Python, use the following workaround: after every arithmetic operation, use`flt_next`

to conservatively move the lower and upper interval endpoints down and up by 1 ULP, respectively.Use the the finished interval arithmetic class to solve the previous quadratic equation using the "bad" algorithm from Problem 3 (in 16 bit arithmetic). Do the computed intervals bound the roots? How large are they?

Suppose we aren't confident in the value of coefficient

`A`

—however, we can with certainty say that $A\in[0.9, 1.1]$. Use interval arithmetic to determine how this uncertainty propagates to the computed solutions, and which of the two roots is more ambiguous (in absolute terms).

```
class Interval:
""" This class represents a real-valued interval [a, b]. It implements
various elementary arithmetic operations, which produce intervals
that bound the set of results which could be obtained by performing
the same arithmetic operations with interval elements """
def __init__(self, value, extra = None):
""" Initialize the instance with the specified interval or constant """
if type(value) is Interval:
# Initialize with an interval instance
self.x0, self.x1 = value.x0, value.x1
else:
# Initialize with a constant or an interval specified as 2 numbers
self.x0 = np.array(value)
self.x1 = np.array(value if extra is None else extra)
self.dtype = self.x0.dtype
def __repr__(self):
""" Return a string representation of the interval """
return "Interval[%f, %f]" % (self.x0, self.x1)
def __add__(self, other):
""" Add two intervals """
# Ensure that the other argument is an Interval
# (to permit addition of a constant to an interval)
if type(other) is not Interval:
other = Interval(other)
# Hmm, this is probably not right
return Interval(0, 1)
```