Posts Tagged ‘Algorithms’

Dijkstra’s Algorithm code in C++

Tuesday, July 15th, 2008

Dijkstra’s algorithm is a famous algorithm that calculates the routes and distances from a start node to all other nodes in a connected graph where all the distances are positive.

Here is a version which I wrote in C++. My aim here was to try to create a version that was easy to read, rather than the most efficient version. This version is implemented using STL vectors to store the edges and nodes. I might later modify it to use adjacency lists and priority queues to produce a more efficient implementation.

First off is my Node and Edge class. You’ll note that the objects add themselves to the nodes and edges vectors as they are constructed.

vector<Node*> nodes;
vector<Edge*> edges;

class Node
{
public:
	Node(char id)
		: id(id), previous(NULL),
		distanceFromStart(INT_MAX)
	{
		nodes.push_back(this);
	}
public:
	char id;
	Node* previous;
	int distanceFromStart;
};

class Edge
{
public:
	Edge(Node* node1, Node* node2, int distance)
		: node1(node1), node2(node2), distance(distance)
	{
		edges.push_back(this);
	}
	bool Connects(Node* node1, Node* node2)
	{
		return (
			(node1 == this->node1 &&
			node2 == this->node2) ||
			(node1 == this->node2 &&
			node2 == this->node1));
	}
public:
	Node* node1;
	Node* node2;
	int distance;
};

Next up is my code to construct a simple test graph. The starting node is initialised with a distance of 0. Dijkstra’s algorithm is then called before the results are printed out. Underneath the code is a diagram of what this graph looks like.

void DijkstrasTest()
{
	Node* a = new Node('a');
	Node* b = new Node('b');
	Node* c = new Node('c');
	Node* d = new Node('d');
	Node* e = new Node('e');
	Node* f = new Node('f');
	Node* g = new Node('g');

	Edge* e1 = new Edge(a, c, 1);
	Edge* e2 = new Edge(a, d, 2);
	Edge* e3 = new Edge(b, c, 2);
	Edge* e4 = new Edge(c, d, 1);
	Edge* e5 = new Edge(b, f, 3);
	Edge* e6 = new Edge(c, e, 3);
	Edge* e7 = new Edge(e, f, 2);
	Edge* e8 = new Edge(d, g, 1);
	Edge* e9 = new Edge(g, f, 1);

	a->distanceFromStart = 0; // set start node
	Dijkstras();
	PrintShortestRouteTo(f);

	// Node / Edge memory cleanup snipped
}

Dijkstra's algorithm

Here is the actual algorithm implementation. We remove the node with the smallest distance from the list of nodes, and then calculate all the minimum distances. We repeat until there are no more nodes left.

void Dijkstras()
{
	while (nodes.size() > 0)
	{
		Node* smallest = ExtractSmallest(nodes);
		vector<Node*>* adjacentNodes =
			AdjacentRemainingNodes(smallest);

		const int size = adjacentNodes->size();
		for (int i=0; i<size; ++i)
		{
			Node* adjacent = adjacentNodes->at(i);
			int distance = Distance(smallest, adjacent) +
				smallest->distanceFromStart;

			if (distance < adjacent->distanceFromStart)
			{
				adjacent->distanceFromStart = distance;
				adjacent->previous = smallest;
			}
		}
		delete adjacentNodes;
	}
}

Here are the supporting functions. The first function removes and returns the node with the smallest distance from the list.

The next returns a new vector containing all the nodes which are adjacent to the node that we pass in.

The third returns the distance value of an edge which directly connects the two nodes.

The final one checks if a node is in the list of nodes.

// Find the node with the smallest distance,
// remove it, and return it.
Node* ExtractSmallest(vector<Node*>& nodes)
{
	int size = nodes.size();
	if (size == 0) return NULL;
	int smallestPosition = 0;
	Node* smallest = nodes.at(0);
	for (int i=1; i<size; ++i)
	{
		Node* current = nodes.at(i);
		if (current->distanceFromStart <
			smallest->distanceFromStart)
		{
			smallest = current;
			smallestPosition = i;
		}
	}
	nodes.erase(nodes.begin() + smallestPosition);
	return smallest;
}

// Return all nodes adjacent to 'node' which are still
// in the 'nodes' collection.
vector<Node*>* AdjacentRemainingNodes(Node* node)
{
	vector<Node*>* adjacentNodes = new vector<Node*>();
	const int size = edges.size();
	for(int i=0; i<size; ++i)
	{
		Edge* edge = edges.at(i);
		Node* adjacent = NULL;
		if (edge->node1 == node)
		{
			adjacent = edge->node2;
		}
		else if (edge->node2 == node)
		{
			adjacent = edge->node1;
		}
		if (adjacent && Contains(nodes, adjacent))
		{
			adjacentNodes->push_back(adjacent);
		}
	}
	return adjacentNodes;
}

// Return distance between two connected nodes
int Distance(Node* node1, Node* node2)
{
	const int size = edges.size();
	for(int i=0; i<size; ++i)
	{
		Edge* edge = edges.at(i);
		if (edge->Connects(node1, node2))
		{
			return edge->distance;
		}
	}
	return -1; // should never happen
}

// Does the 'nodes' vector contain 'node'
bool Contains(vector<Node*>& nodes, Node* node)
{
	const int size = nodes.size();
	for(int i=0; i<size; ++i)
	{
		if (node == nodes.at(i))
		{
			return true;
		}
	}
	return false;
}

And finally here is the code which prints out the route between our start node ‘a’ and the destination node ‘f’. It starts at the destination node and then works backwards using the previous pointer in each node.

void PrintShortestRouteTo(Node* destination)
{
	Node* previous = destination;
	cout << "Distance from start: "
		<< destination->distanceFromStart << endl;
	while (previous)
	{
		cout << previous->id << " ";
		previous = previous->previous;
	}
	cout << endl;
}

Update: 7th September 2008

I have now uploaded a zip file containing the Visual C++ Express Solution file and source code for this project – Dijkstra’s Source Code (8kb).

Select 100 random values from a stream / linked list

Friday, July 11th, 2008

Here’s an interesting puzzle / interview question that I heard recently.

Problem

You have a either a linked list or a stream containing int values. You don’t know how large the stream / linked list except that it contains 100 or more values. You need to write a method that selects 100 random values from the stream. You have to write the method so that there isn’t a bias towards values at the start, middle or end of the stream / list. In other words the chances of picking a value from anywhere in the stream / list should be roughly the same.

There are some restrictions.
You can’t cache the whole stream (if it is a stream).
Or if it is a linked list – you are only allowed a single pass of the list. You can’t iterate through it once to determine its length.

Solution

The solution is:

  1. Select the first 100 values and store them in an array.
  2. For each value after the first 100 we select them according to a decreasing probability. If we select a value then we replace a randomly selected value from our existing list.

So values 1-100 have a probability of 1 of being initially selected.
Value 101 has a 100/101 chance of being selected.
Value 102 has a 100/102 chance of being selected.
Value 1050 has a 100/1050 chance… etc.

100 is at the top of the division as this is our value for x – the number of values we are selecting.

You can see that value 101 has a very high chance of being selected. 100/101 in fact. As we are replacing one of our existing values, each of the existing values has a 100/101 chance of remaining after the replacement.

Using this simple logic we can get a random sample of the values even though we don’t know the length of the stream / list.

Code

Here is the C++ code showing how to do this. Also included is a simple test which sends the values 1-1 million to the function. 100 are randomly selected.

After the code is a graph which shows the distribution of the selected values. You can see that the distribution is fairly even showing that the code works. For random number selection code such as this, the easiest way to verify that the code works is to manually view the results in a graph and see if the output looks sensible.

const int size = 100;
int selected[size];

int count = 0;
void SelectRandomNumbers(int nextNum)
{
	if (count < size)
	{
		selected[count++] = nextNum;
		return;
	}
	double probability = (double)size / (double)(count+1);
	double select = (double)rand() / RAND_MAX;
	if (select < probability)
	{
		int randomIndex = RandomNumber(0, size);
		selected[randomIndex] = nextNum;
	}
	++count;
}

// min is inclusive
// max is non-inclusive
int RandomNumber(int min, int max)
{
	return (int)((double)rand()	/
		(RAND_MAX + 1) * (max - min) + min);
}

///////////////

const int iterations = 1000000;

void RandomSelectionTest()
{
	srand((unsigned)time(0)); 

	for (int i=0; i<iterations; ++i)
	{
		SelectRandomNumbers(i);
	}

	cout << size <<	" randomly selected ints:" << endl;
	PrintIntArray(selected, size);
}

void PrintIntArray(int thearray[], int size)
{
	for (int i=0; i<size; ++i)
	{
		cout << thearray[i] << endl;
	}
}

Graph

Here are the results. 1 million numbers were passed in. 100 were selected. You can see that the distribution is fairly even which means that the code is doing the correct thing.

Graph of randomly selected numbers

QuickSort Implementation Code in C++

Wednesday, June 25th, 2008

I was having a look at the QuickSort algorithm (as you do for fun) and was trying to find a tidy looking implementation in C++. I found a lot of implementations out there but most of them looked far more complicated than I thought they should have been.

The QuickSort algorithm as printed in the Introduction to Algorithms book look relatively simple. I wanted a C++ version that has a quicksort function and a partition function. Many of the implementations that I’ve seen on the internet have both functions merged into one.

The neatest looking QuickSort code that I found was one this Java version written by a Prof. Dr. H. W. Lang. Compare how short and easy to read this piece of code is to other QuickSort implementations. This is close to what I want but the partition part is rolled into the main QuickSort code.

It also had quite a few differences to the Introduction to Algorithms QuickSort. I used the Prof. Dr. H. W. Lang code as a base, split the partition part out, and then made a few other changes to make the implementation closer to the Introduction to Algorithms version. This is what I ended up with.

void quicksort(int a[], int left, int right)
{
	if (left < right)
	{
		int pivot = partition(a, left, right);
		quicksort(a, left, pivot-1);
		quicksort(a, pivot+1, right);
	}
}

int partition(int a[], int left, int right)
{
	int pivot = a[left];
	while (true)
	{
		while (a[left] < pivot) left++;
		while (a[right] > pivot) right--;
		if (left < right)
		{
			swap(a[left], a[right]);
		}
		else
		{
			return right;
		}
	}
}

I tested this in the free Microsoft Visual C++ Express Edition. Here is what I used to test the QuickSort code.

void arraySortingTests()
{
int numbers1[] = {14, 17, 11, 13, 19, 16, 12, 15, 18, 10};
int numbers2[] = {19, 16, 10, 12, 14, 17, 13, 18, 15, 11};
int numbers3[] = {10, 11, 12, 13, 14, 15, 16, 17, 18, 19};
int numbers4[] = {19, 18, 17, 16, 15, 14, 13, 12, 11, 10};

quicksort(numbers1, 0, 9);
printArray(numbers1, 10);

quicksort(numbers2, 0, 9);
printArray(numbers2, 10);

quicksort(numbers3, 0, 9);
printArray(numbers3, 10);

quicksort(numbers4, 0, 9);
printArray(numbers4, 10);
}

void printArray(int thearray[], int count)
{
	for (int i=0; i<count; ++i)
	{
		cout << thearray[i] << " ";
	}
	cout << endl;
}