Sunday, July 10, 2011

Blogging my way through CLRS section 3.1 [part 5]

Part 4 here.
I wrote an entire blog post explaining the answers to 2.3 but Blogger decided to eat it. I don't want to redo those answers so here is 3.1:
For now on I will title my posts with the section number as well to help Google.

Question 3.1-1: Let $f(n)$ and $g(n)$be asymptotically non-negative functions. Using the basic definition of $\theta$-notation, prove that $\max(f(n) , g(n)) \in \theta(f(n) + g(n))$ .

CLRS defines $\theta$ as $\theta(g(n))= \{ f(n) :$ there exists some positive constants $c_1, c_2$, and $n_0,$ such that $0 \leq c_1g(n) \leq f(n) \leq c_2g(n)$ for all $n \geq n_0\}$ Essentially we must prove that there exists some $c_1$ and $c_2$ such that $c_1 \times (f(n) + g(n)) \leq \max(f(n), g(n)) \leq c_2 \times (f(n) + g(n))$ There are a variety of ways to do this but I will choose the easiest way I could think of. Based on the above equation we know that $\max(f(n), g(n)) \leq f(n) + g(n)$ (as f(n) and g(n) must both me non-negative) and we further know that $\max(f(n), g(n))$ can't be more than twice f(n)+g(n). What we have then are the following inequalities: $$\max(f(n), g(n)) \leq c_1 \times (f(n) + g(n))$$ and $$c_2 \times (f(n) + g(n)) \leq 2 \times \max(f(n), g(n))$$ Solving for $c_1$ we get 1 and for $c_2$ we get $\frac {1} {2}$

Question 3.1-2: Show for any real constants $a$ and $b$ where $b \gt 0$ that $(n+a)^b \in \theta(n^b)$
Because $a$ is a constant and the definition of $\theta$ is true after some $n_0$ adding $a$ to $n$ does not affect the definition and we simplify to $n^b \in \theta(n^b)$ which is trivially true

Question 3.1-3: Explain why the statement "The running time of $A$ is at least $O(n^2)$," is meaningless.

I'm a little uncertain of this answer but I think this is what CLRS is getting at when we say a function $f(n)$ has a running time of $O(g(n))$ what we really mean is that $f(n)$ has an asymptotic upper bound of $g(n)$. This means that $f(n) \leq g(n)$ after some $n_0$. To say a function has a running time of at least g(n) seems to be saying that $f(n) \leq g(n) \And f(n) \geq g(n)$ which is a contradiction.

Question 3.1-4: Is $2^{n+1} = O(2^n)$? Is $2^{2n} = O(2^n)$?

$2^{n+1} = 2 \times 2^n$. which means that $2^{n+1} \leq c_1 \times 2^n$ after $n_0$ so we have our answer that $2^{n+1} \in o(2^n)$ Alternatively we could say that the two functions only differ by a constant coefficient and therefore the answer is yes.

There is no constant such that $2^{2n} = c \times 2^n$ and thefore $2^{2n} \notin O(2^n)$

Question 3.1-5: Prove that for any two functions $f(n)$ and $g(n)$, we have $f(n) \in \theta(g(n)) \iff f(n) \in O(g(n)) \And f(n) \in \Omega(g(n))$

This is an "if an only if" problem so we must prove this in two parts:

Firstly, if $f(n) \in O(g(n))$ then there exists some $c_1$ and $n_0$ such that $f(n) \leq c_1 \times g(n)$ after some $n_0$. Further if $f(n) \in Omega(g(n))$ then there exists some $c_2$ and $n_0$ such that $f(n) \geq c_2 \times g(n)$ after some $n_0$.

If we combine the above two statements (which come from the definitions of $\Omega$ and O) than we know that there exists some $c_1, c_2, and n_0,$ such that $c_1g(n) \leq f(n) \leq c_2g(n)$ for all $n \geq n_0\}$

We could do the same thing backward for the other direction: If $f(n) \in \theta(g(n))$ then we could split the above inequality and show that each of the individual statements are true.

Question 3.1-6: Prove that the running time of an algorithm is $\theta(g(n)) \iff$ its worst-case running time is $O(g(n))$ and its best case running time $\Omega(g(n))$.
I'm going to try for an intuitive proof here instead of a mathematical one. If the worst case is asymptotically bound above in the worst case by a certain function and is asymptotically bound from below in the best case which means that the function is tightly bound by both those functions. f(n) never goes below some constant times g(n) and never goes above some constant times g(n). This is what we get from the above definition of $\theta(g(n)))$ A mathematical follows from question 3.1-5.

Question 3.1-7: Prove that $o(g(n)) \cap \omega(g(n)) = \varnothing$

little o and little omega are defined as follows: $o(g(n)) = \{ f(n) : \forall c > 0 \exists n_0 \text{such that } 0 \leq f(n) \leq c \times g(n) \forall n \gt n_0$ and $\omega(g(n)) = \{ f(n) : \forall c > 0 \exists n_0 \text{such that } 0 \leq c \times g(n) \leq f(n) \forall n \gt n_0$

In other words

$$f(n) \in o(g(n)) \iff \lim_{n \to \infty} \frac {f(n)} {g(n)} = 0$$ and $$f(n) \in \omega(g(n)) \iff \lim_{n \to \infty} \frac {f(n)} {g(n)} = \infty$$

It is obvious that these can not be true at the same time. This would require that $0 = \infty$

Wednesday, June 29, 2011

Blogging my way through CLRS [part 4]

Part 3 here This set is a bit easier than last time.
Question 2.2-1:Express the function $$\frac{n^3}{1000} - 100n^2 - 100n + 3$$ in terms of $\Theta$ notation
A function g(x) is said to be in the set of all functions $\Theta(x)$ if and only if g(x) is also in the set of all functions $\Omega(x)$ and in the set of all functions $O(x)$.
Symbolically: $$g(x) \in \Theta(x) \iff g(x) \in O(x) \And g(x) \in \Omega(x)$$

A function g(x) is in the set of all functions $\Theta(x)$ if and only if after some constant $c$ it is always true that for some constant C, $g(x) \lt Cf(x)$

A function g(x) is in the set of all functions O(x) if and only if after some constant $c$ it is always true that for some constant C, $g(x) \gt Cf(x)$

With our function we could choose practically any function to satisfy either one of these conditions. However we need to satisfy both of them. One thing that makes this easier is that it only has to be true after some constant number. This allows us to throw away the "trivial" parts that are eventually overwhelmed by the faster growing terms. We therefore are only left with $n^3$, which is the answer.

Question 2.2-2: Consider sorting n numbers stored in an array A by first finding the smallest element and exchanging it with the element in A[1], then find the second smallest element and exchange it with A[2], and continue this for the first n-1 elements of A. Write the pseudocode for this algorithm, which is known as Selection Sort. What loop invariant does this algorithm maintain? Why does it need to run only for the first n-1 elements and not for all n? Give the best case and worst case running times in $\Theta$ notation
This question is asking us to analyze selection sort in a variety of ways. I will start with writing out the pseudocode: for $j \leftarrow 1$ to $n-1$
min $\leftarrow$ j
for $i \leftarrow j+1$ to $n$
$\rhd$ if A[i] < A[min] then min $\leftarrow$ i
$\rhd$ if min $\neq$ j then swap A[min] and A[j]
A loop invariant that this algorithm maintains is that every elements prior to A[j] is sorted among the subarray A[1] to A[j] and is less than or equal to every element in the subarray A[j+1] to A[n]. I do not believe a stronger loop invariant is provable. The algorithm only needs to run until n-1 because of the second part of the loop invariant. When $j = n-1$ we know that every element after A[j], which is A[n] is not less than all previous elements. Therefore no check has to be done. In the best case (an already sorted array) and in the worst case (a reverse sorted array) the running time is the same: $\Theta(n^2)$
Question 2.2-3: Consider linear search again. How many elements of the input sequence need to be checked on average, assuming that the element being searched for is equally likely to be any element in the array? How about in the worst case? What are the average-case and worst-case running times of linear search in $\Theta$ notation?
The best case for a linear search algorithm is when the searched-for element is in the first location. In the worst case all n locations must be searched. In the average case $\frac{n}{2}$ locations have to be searched.
Question 2.2-4: How can we modify almost any algorithm to have a good best-case running time?
I have no idea what this question is asking for. I guess checking for the optimal case (as in a pre-sorted array for a sorting algorithm) and then skipping the rest of the procedure might work.

Sunday, June 26, 2011

Blogging my way through CLRS [3/?]

part 2 here
According to wikipedia Introduction to Algorithms is also known as CLRS which is shorter (and more fair to the other authors) so I'll use that name for now on.

Question 2.1-1 asks me to redraw a previous diagram, but with different numbers. I am not going to post that here.
Question 2.1-2 Rewrite the insertion sort procedure to sort into nonincreasing instead of nondecreasing order:
Here is the pseudocode of the nonincreasing version of insertion sort: for j $\leftarrow 2$ to length[A]
do key$\leftarrow A[j]$
$\rhd$ Insert A[j] into sorted sequence A[1..j-1]
$i \leftarrow j - 1$
while $i \gt 0$ AND $A[i] \lt key$
do $A[i+1] \leftarrow A[i]$
$i \leftarrow i - 1$
$A[i+1] \leftarrow key$

Now we prove that this loop correctly terminates with a nonincreasing array to about the same level of formality as the book proved the original.
Initialization: At the first iteration, when $j=2$ the subarray A[1..j-1] is trivially sorted (as it has only one element).
Maintenance: In order to prove maintenance we need to show that the inner loop correctly terminates with an array with "space" for the correct element. As CLRS did not prove this property, I will also skip this proof.
Termination: this loop terminates when j > length[A] or when $j = length[A]+1$. Since we have "proven" (to some level) the maintenance of the loop invariant (that at each point during the loop the subarray [1..j-1] is sorted) we could substitute length[A]+1 for $j$ which becomes [1..length[A]] or the entire array.
This shows that the loop terminates with a correctly sorted array.
Question 2.1-3:
Input:A sequence of $n$ numbers $A = {a_1,a_2,...,a_n}$ and a value $v$.
Output: An index i such that $v = A[i]$ or a special value $\varnothing$ (NIL) if $v \notin A$
Write the pseudocode for Linear Search, which scans through the sequence looking for $v$. Using a loop invariant, prove that your algorithm is correct.
The first part, writing the pseudocode, seems fairly easy:
$r \leftarrow \varnothing$
$j \leftarrow 1$ to length[A]
if $v = A[j] \rhd$ optionally check that $r = \varnothing$
$r \leftarrow j$
return $r$

The second part, proving that this is correct is harder than before because we don't have a trivially true initialization of our loop invariant.
Initialization: $j = 1\ \And\ r = \varnothing$ at the start of our loop. At this point there are no elements prior to A[j] and we have yet to find $v$ in A. As such our invariant (that r will contain the correct value) is true.
Maintenance: At every point in the loop the subarray A[1..j] has either contained $v$ in which case it has been assigned to $r$ or has not contained $v$ in which case $r$ remains $\varnothing$. This means that loop invariant holds for every subarray A[1..j].
Termination: At the end of the loop $j =$ length[A]. We know from our maintenance that $r$ is correct for every subarray A[1..j] so at termination $r$ contains the correct value
Question 2.1-4 Consider the problem of adding two $l$-bit binary integers, stored in two $l$-element arrays $A$ and $B$. the sum of the two integers should be stored in binary form in $(l+1)$-element array $C$. State the problem formally and write pseudocode for adding the integers.
Stating the problem formally looks something like:
Input: Two $l$-bit integers $A$ and $B$ stored as arrays of length $l$ with the most significant bit stored last
Output: An $l+1$-bit integer ($C$) stored as arrays of length $l+1$ with the most significant bit stored last
Here is the pseudocode: $\rhd$ X is a $l$-bit array of bits initialized to all zeros in order to store the carry
for j $\leftarrow$ 1 to $l$
$C[j] \leftarrow copyC \leftarrow A[j] \oplus B[j]$
$X[j+1] \leftarrow A[j] \And B[j]$
$C[j] \leftarrow C[j] \oplus X[j]$
$X[j+1] \leftarrow copyC \oplus X[j+1]$

Thursday, June 23, 2011

Blogging my way through Cormen [2/?]

As I said in part 1 I am reading a book on algorithms and have decided to blog my way through. My primary goal in doing so is to improve my writing skills. A secondary goal is to force myself to actually answer the questions.

1.2-2 Suppose we are comparing implementations of insertion sort and merge sort on the same machine. For inputs of size n, insertion sort runs in $8n^2$ steps, while merge sort runs in $64n \lg n$ steps. For which values of $n$ does insertion sort beat merge sort?
The question is essentially asking for which values of $n$ is $8n^{2} \lt 64n \lg n$. We can solve this question by first factoring out an $8n$ and we get $n \lt 8 \lg n$ Unfortunately this problem is not solvable using elementary operations. Luckily we are being asked for an integer solution (as computers operate in discrete steps) and we could use the underutilized guess-and-and method.
$n$ $8 \lg n$
14 30.46
41 42.86
43 43.41
44 43.675
So there we have it: given this data we would prefer insertion sort whenever we have fewer than 43 items.
1.2-3 What is the smallest value of n such that an algorithm whose running time is $100n^2$ runs faster than an algorithm whose running time is $2^n$ on the same machine.
This question is asking us find the smallest positive integer $n$ that satisfies $100n^{2} \lt 2^n$. This could be solved by doing the math, by looking at a plot of the curves, or using the above method again. . $$2^{14} = 16384$$ $$(100 \times 14^{2}) = 19600$$ $$2^{15} = 32768$$ $$(100 \times 15^{2}) = 22500$$
An exponential algorithm becomes far worse than a polynomial algorithm when we have only 15 items to worry about. In other words: avoid exponential algorithms!
Thank you JT and JM for giving me the idea to go through the book, and for looking at my posts before I publish them.
Updated 2012-09-05: I had a brain lapse the day I originally published this and accidentally used the natural logarithm instead of the base 2 log for question 1.2-2. How I ever managed to do that I will not know, but I've fixed it.

Tuesday, June 21, 2011

Cormen on Algorithms: Blogging my way through [1/?]

Two of my good friends recently started reading Introduction to Algorithms by Thomas H. Cormen, et. al. Being unable to resist peer pressure I decided to follow and read along.

I plan on blogging my way through the chapters writing my answers to the questions as I go through the book. Like most of my plans they don't always work out, but one could try.

Here it goes!

1.1-1: Give a real-world example in which each of the following computational problems appears: (a)Sorting, (b) Determining the best order for multiplying matrices, (c) finding the convex hull of a set of points.
Sorting - Sorting comes up in virtually every algorithm one could think of. Everything from optimizing monetary investments to efficient compression algorithms has to sort data at some point or another. A harder question might be: Name one non-trivial algorithm that doesn't require sorting.
Multiplying Matrices - graphics and scientific problems frequently require matrix operations.
Convex Hull - Collision detection for use in games, modeling biological systems, or other related work could make use of this
1.1-2: Other than speed what other measures of efficiency might one use in a real-world setting?
It is possible to optimize for (and against) every limited resource. For example minimizing the amount of memory usage is important for embedded applications (and desktop ones too). Reducing total disk I/O is important to increase the longevity of hard drives. On a less technical note optimizing for monetary cost or man hours expended is important too.
1.1-3: Select a data structure you have seen previously and discuss its strengths and limitations
One of the most interesting data structures I know is the Bloom Filter. It is a probabilistic data structure that can determine if an element is NOT in a set but can't determine definitively if an element is in a set. It works by hashing each element in a set to a fixed size bit array. It then ORs the hash with itself (which starts at all zeros). One can test to see if an element is in a set by generating the hash and testing to see if every bit set to 1 in the queried element is set to 1 in the filter. If it is then you have some degree of confidence that the element is in the set. Any negative means that what you are querying for has not been added.
While most probabilistic structures have certain properties in common, bloom filters have a number of interesting pros and cons.
1. A negative result is definitive - if a query returns that an element has not been added then one knows this to be 100% true.
2. Since hashes are fixed size the amount of memory a Bloom Filter uses is known and bounded.
3. Bloom filters can quickly become useless with large amounts of data. It is possible that every bit will be set to 1 which effectively makes the query a NOP.
4. It is impossible to remove data from a bloom filter. One can't just set all the bits of the hash to a zero because that might be removing other elements as well.
5. Without a second set of data there is no way to deterministically list all elements (unlike other probabilistic data structures such as Skip Lists).
1.1-4: How are the shortest path and traveling salesmen problems similar? How are they different?
The shortest path problem is
Output:
HiHiHiHiHi
Most Linux distributions
for i in \$(seq 1 1 5);do echo -n "Hi";done;echo "";
Output:
HiHiHiHiHi
Perl
print "-" x 10;print "\n"
Output:
----------
Python
"ab" * 10 Output: 'abababababababababab'
R
paste(rep("Hi",5), collapse='')
Output:
[1] HiHiHiHiHi
Ruby
print "-" * 10;print "\n"
Output:
----------
Tcl
string repeat "Hi" 5
Output:
HiHiHiHiHi
ZSH
repeat 5 printf 'abc';echo "";
Output:
abcabcabcabcabc
update 5/30/11: Thanks to Hans I found out that jot is not POSIX. Also fixed formatting.

Friday, February 11, 2011

The Usefulness of the X-Do-Not-Track Header

Do-Not-Track [0] is a recent proposal by the FTC [1] to deal with the problem of users being “tracked” by advertisers. This consists of adding a new HTTP header[2] into page requests that indicates that the user is “opting out” of being “tracked”

The proposal is backed by a number of major players, including Mozilla [3] , the Electronic Frontier Foundation [4] , Wladimir Palant (the maintainer of of AdBlockPlus)[5] , and Giorgio Maone (the author of NoScript) [6].

Is this a good idea? Does it solve any existing problems?

One important factor to consider is that everyone has a different understanding of the concept of “tracking”. If a user has the header set but logs in to a service is there a difference? What if the user closes the browser in between sessions? Can the service remember who logged on last? Can a bank track a user’s visits for security purposes? What about a quiz website tracking participation to prevent cheating? And these are the simple questions. The definition of the word ‘tracking’ is not officially established.

Google claims it anonymizes IP addresses [7] but the “anonymization only involved clearing the last octet of the user’s IP address.[8] Is that considered tracking? Who decides? You? Google? The government?

Even if we came to a shared definition of what it means to “track”, how can one prove if tracking is done or not?

Let’s imagine that the US government enacts a law requiring websites to follow this header based on this elusive definition of “tracking”. What about servers outside the US? How would their activity be handled? What about a foreign user accessing a US based website? The reverse? What if different jurisdictions came to had two mutually exclusive definitions of “tracking”?

Furthermore, what if websites began to deny service to users that used the X-Do-Not-Track header? Browsers would be forced to remove the header in order to browse the web - effectively nullifying the header’s original purpose.

Arvind Narayanan [9] says that “Examining ad blocking allows us to predict how publishers, ... assuming DNT is implemented as a browser plug-in, ad blocking and DNT would be equivalent ... ad blocking would result in a far greater decline in revenue than merely preventing behavioral ads. We should therefore expect that DNT will be at least as well tolerated by websites as ad blocking.” This analysis assumes that the header will be in a plugin or optional setting. If every browser implements this header by default, as they should to attract more users, a much larger percentage of people will be opting out than with ad-blockers today.

What if the law disallowed differing service for those with or without the header? What would be the point? It would make sense to simply disallow “tracking” for all websites, which would make the header moot. Of course, this idea is subject to the same questions as asked above.

Instead of focusing on silly request-based ideas for websites, browser vendors should be working on fixing the privacy holes that have been already been found. Some examples include Firefox’s fix for the CSS history leak, Internet Explorer’s anti-tracking features [10][11] and related instances

What if browser vendors could consider idea of shipping their browsers with mini versions of ad-preventing software like AdblockPlus, NoScript[12] , and RequestPolicy[11] that blocked major third party advertisers such as doubleclick. Of course this could become a cat and mouse game - and it may not be a good idea at all - but it would be more effective than the do-not-track header. Other options include appeasing advertises with targeted user advertising and behavior analysis that doesn’t violate user privacy. For examples see the footnote [13]

Quite simply what we need for increased client side awareness of the privacy implications of various features and some form of control given to the users about what data the transmit across the Internet about themselves.

[0] http://donottrack.us/
[1] http://www.ftc.gov/os/2010/12/101201privacyreport.pdf
[2] Originally the header was “X-Behavioral-Ad-Opt-Out: 1 X-Do-Not-Track: 1” but the current version is now “X-DNT: 1” to save bandwidth
[3] https://wiki.mozilla.org/Privacy/Jan2011_DoNotTrack_FAQ