24/6/13

Data structures: the Hash Table

Now of course, what kind of tutorial would this be if it didn't end on the hardest note?

The hash tables! 
 

What is a hash table?

The hash table is a tricky one. It's basically a relation between a certain number, string or identifier, called the key, and the information itself, the content, element, etc. This is done in hopes of achieving better access times, without having to iterate through the structure; rather, using the key as the identifier of where the desired information is.
Image courtesy of Wikipedia's entry on Hash Tables

 

The hash function

To achieve better efficiency at times, the key must be a very simple piece of information, normally obtained directly from the content. In order to obtain a key to an element, one needs a hash function, which essentially calculates said key.

It is imperative that the key not only be unique, or as unique as possible, but also to be  calculated with speed. Achieving both is not straight forward at all.

A simple example on how to translate a monster's alphanumerical (numbers and letters) name into a key usable by a hash table. It is by no means the only way to do this.

As you may see, two elements can give out the same key, which may lead to trouble later on. More on that afterwards.  

 

What can you do with a hash table?

  • Create: begin the hash table data structure, reserve the space we ask for.
  • Add: put a new element.
  • Delete: remove the specified element.
  • Alter: change the value in the element.
  • Value: get the value of the element.
  • First: get the first element.
  • Size: gives the number of listed objects
  • Empty: checks whether the list has no elements.

 

How do we add things?

This is probably the most particular part of the hash table. Once an object's key is calculated, we simply put said object in the index shown by the key! Theoretically, this would mean we could insert an element at once, without any more calculations.

 

Collisions!

But what if we couldn't? Take a look at the next example.

Key functions aren't perfect. here, we gave the character '0' value 0, so a name "Ogre" and another name "Ogre0" will return the exact same key!
These are called collisions, when two different elements are supposed to go in the same spot inside a Hash Table. Collisions will eventually happen no matter what hash function you use. It is possible to lower the chances of you finding a collision by thinking through what hash function to use, but one must still realize collisions will happen, and one must prepare against them.

How to prepare for collisions? Obviously, "Ogre" may get in position 45 without problem, but "Ogre1" needs a place to stay. Preparing for collisions means implementing a method of reallocating the latter element so that it is still easy to reallocate, check for, alter and delete, even if it's not in its ideal key position.

One way is to have a secondary hash function, to find the latter elements a new position in the table, or keep searching for one that is empty. Once that element is allocated there, the first element must have a pointer that leads to the latter, and the same will happen if another collision is in place. The disadvantage is of course that this will occupy free keys that posterior elements may need for themselves. If so, one may reserve extra space in a hash table to store collision results so as to not interfere with the prime hashing.

Another way to do it is: instead of having elements in each position of the hash table, have a list! A list of pairs key-element. That way, whenever an element must get stored in a certain position, we simply put it at the end of the list in that position. If the collisions are few the lists do not grow excessively. Still, that means to access an element we may have to go through a whole list to get it.

There are many other ways to manage collisions, of course: instead of a list, using a tree, for instance. My intention was only to show you the simpler ways to do that. There are many tutorials that show many approaches to this issue and I urge you to search the Internet for them.

 

Advantages

  • Best case scenarios are instantaneous
  • Direct relationship between an element's attribute and its key, which brings us closer to DataBase managing (like SQL, Oracle...)
  • Scaling can be better controlled with Hash tables

 

Setbacks

  • Collisions!
  • Resizing the hash table is not trivial
  • Must think a hash function
  • Must think of collision managing.
  • Worst case scenarios are not bad, but still frustrating for the work a hash table needs

 

Implementation 



/*
 * hashMap.h
 *
 *  Created on: 17/09/2012
 *      Author: IronyGames
 *
 *
 *      This snippet of code is free to use.
 */


#ifndef HASHMAP_H_
#define HASHMAP_H_


template<typename M>
class HashEl{
private:
    M *value;
    int nextKey;

public:
    HashEl(M v){
        value = new M;
        *value = v;
        nextKey = -1;
    }

    HashEl(){
        value = new M;
        *value = 0;
        nextKey = -1;
    }

    ~HashEl(){
        value = NULL;
    }

    M getVal(){
        return *value;
    }

    int getNext(){
        return nextKey;
    }

    void setNextElement(int nk){
        nextKey = nk;
    }

    void setVal(M* v){
        value = v;
    }

};

template<typename T>
class hashMap{
private:

public:
    HashEl<T> *elements;
    int sizeTable;

    hashMap(int i){
        elements = new HashEl<T>[i];
        sizeTable = i;
    }

    bool insert(T el){
        int key = hashFunction(el);
        if(key >= sizeTable){ // You could reallocate more space if you wanted.
            std::cout << "Hash Table size insufficient." << std::endl;
            return false;
        }
        if(el == 0){ //In this particular example (C++ array of ints), 0 means a space is empty, therefore
                        //we'll need that as a warner or flag that that position is free.
            std::cout << "Cannot store null elements or elements 0\n";
            return false;
        }

        add(key,el);

        key = 0;
        return true;
    }

    bool remove(T val){
        int key = hashFunction(val);

        bool lastInChain = false; //in case this has collided, elements may point to other elements. The
                                    //first element just may not be the one we're looking for.
        int nK = key;
        while(!lastInChain){
            if(elements[nK].getNext() == -1){
                lastInChain = true;
            }
            if(elements[nK].getVal() == val){
                elements[key] = 0;
                return true;
            }
        }

        return false;

    }

    T get(T val){
        int key = hashFunction(val);

        bool lastInChain = false; //in case this has collided, elements may point to other elements. The
                                    //first element just may not be the one we're looking for.
        int nK = key;
        while(!lastInChain){
            if(elements[nK].getNext() == -1){
                lastInChain = true;
            }
            if(elements[nK].getVal() == val){
                return elements[nK].getVal();
            }
        }

        return 0;
    }

private:

    bool add(int key, T el){

        if(elements[key].getVal() == 0){ //the position is free!
            elements[key] = HashEl<T>(el);
            return true;
        }else{
            std::cout << "Collision at key " << key << " on value " << el << "! Reallocating...\n";

            //If we redirect to other keys, follow that path until the last element in the chain
            bool lastInChain = false;
            int nK = key;
            while(!lastInChain){
                if(elements[nK].getNext() != -1){
                    nK=elements[nK].getNext();
                }else{
                    lastInChain = true;
                }
            }

            //reallocate the element by using the rehash function until we find a free space
            int emptyKey = reallocHashFunction(elements[nK].getVal());

            if(emptyKey == -1){ // flag that means 'could not reallocate'. See reallocHashFunction.
                std::cout << "Reallocation of " << el << " impossible.\n";
                return false;
            }

            //Put the element in the free space
            elements[emptyKey] = HashEl<T>(el);

            //Remember to point the new space from the former last one in chain!
            elements[nK].setNextElement(emptyKey);

        }
        return false;
    }


    bool alter(int i, T val){
        if(val != 0){
            elements[i].setVal(val);
            return true;
        }
        std::cout << "New value cannot be 0\n";
        return false;
    }

    int hashFunction(T val){ //These can be whatever you want, so long as it returns a key!
        return 2*val; //in this case it's key(value) = 2*value;
    }

    int reallocHashFunction(T val){ //Keep searching via a hash function until we find free space
                                        //or we reach a certain condition (in this case 'till we reach
                                        //end of the table).
        for(int i = val; i < sizeTable; i++){
            if(elements[i].getVal() == 0){
                return i;
            }
        }
        return -1; //flag that means 'could not reallocate'
    }


};

#endif /* HASHMAP_H_ */


Finally, I must apologize to you all for not updating for the longest time. I just want to assure you I have not become uninterested with the blog at all. 

Thanks for reading and keep rocking!

17/9/12

Data structures: the Tree (balancing headaches)


I need to explain an extra concept in this Data structure, so the post will be longer than usual. Worth it, though! 


What is a tree?


I apologize for the joke.

A tree is a structure that you must already know about if you've ever dealt with graphs. It consists of a first node that acts like a father father, or root, with a number of sons born from it, and linked with branches. Each son, in time, has its own sons, and so on and so forth. Those nodes who have no offspring are called leaves.
(The nodes are the letter by the way)
Now if you're already a step ahead of me, you might foresee trees are a very loose concept. There's so many types! So many sons, and directions! How can we possibly make a good structure out of this?

Having that in mind we'll introduce 2 restrictions:
  1. The tree must be binary; meaning, any node can have no more than 2 sons (but can have less!).
  2. The elements that are smaller (in any logic) than their father must hang on the right of its father; otherwise, they must hang from the left. You can do it vice versa if you prefer
Other than that, the other restrictions stand, like not mixing types of elements, etc.
Example of a binary ordered tree


What can you do with a tree?
  • Create: begin the tree data structure, reserve space.
  • Add: put a new element. Keep pushing it down the tree this way (by convention, you can change it if you prefer): to the left of the current element if it's bigger than it, or to the right if it's smaller than it.
  • Delete: remove the specified element.
  • Alter: change the value in the element.
  • Value: get the value of the element.
  • First: get the first element.
  • Size: gives the number of listed objects
  • Empty: checks whether the list has no elements.
    The tree has 3 levels: 4, 5-8 and 3-6-7-9. You can reach any number touching only 3 nodes.
Advantages
  • Finding costs less than in a list. Worst case scenario, a list must run through all the elements; with a tree, you only have to run through as many nodes as levels on a tree.
Setbacks
  • Adding and deleting cost more than in a list
  • This only works wonders if the trees are balanced

Balancing
Balancing is one of the major drawbacks of trees. Take this tree for instance.

What we've done here is correct, according to our restrictions. However, for five elements, we have four levels! Before we had a tree with 7 nodes that only had 3 levels! This is a serious efficiency problem that disrupts all the advantages of a tree.

Wouldn't it be much better if it looked like this?
Now we have 3 levels. What we've done here is balance a tree. When a tree self-balances automatically, we call it an AVL tree.
So how do we balance a tree?
Luckily for us, there is a logic, recursive algorithm to do just that. Unfortunately for us, like any new algorithm, it will take some time to understand it.

Courtesy of wikipedia: http://en.wikipedia.org/wiki/AVL_tree
Bear in mind:
  • 3, 4, 5 are the nodes
  • A, B, C, D are sub-trees born from their respective nodes. That comes to suggest you can recursively solve a tree's balancing by balancing their subtrees, one by one, in their smallest nature.

There are four actions you must take, depending on the four cases of imbalance.
  1. Left-Right: will lead to Left-Left. The root
  2. Left-Left: will balance the tree.
  3. Right-Left: will lead to Right-Right.
  4. Right-Right: will balance the tree.

Implementation
We can pretty much recycle the ListEl from the list; however we will use two pointers instead of one, for the left and right sons, thus creating a TreeEl. It should also be noted that this implementation does not contemplate balance.
/*
 * tree.h
 *
 *  Created on: 11/08/2012
 *      Author: Daniel
 */


#ifndef TREE_H_
#define TREE_H_

template<typename T>
class TreeEl { //Elements that go into the list

private:
    T* _element; //Actual information
    TreeEl *_left; //for binary trees; for bigger-offspringed trees you might consider an array of pointers.
    TreeEl *_right;

public:
    TreeEl() {
        _element = NULL;
        _left = NULL;
        _right = NULL;
    }

    TreeEl(T input) {
        _element = new T();
        *_element = input;
        _left = NULL;
        _right = NULL;
    }

    T getElement() {
        return *_element;
    }
    void setElement(T in) {
        _element = new T();
        *_element = in;
    }
    TreeEl* getLeft() {
        return _left;
    }

    TreeEl* getRight() {
        return _right;
    }

    bool pointLeftTo(TreeEl* obj) { //We retain the direction, not the element, for versatility; else we'd be playing a game of russian dolls.
        if (obj != NULL) { //little failsafe to ensure the TreeEl exists and we're not being fed null stuff.
            _left = obj;
            return true;
        }
        std::cout << "Error at pointTo, object is null";
        return false;
    }

    bool pointRightTo(TreeEl* obj) { //We retain the direction, not the element, for versatility; else we'd be playing a game of russian dolls.
        if (obj != NULL) { //little failsafe to ensure the TreeEl exists and we're not being fed null stuff.
            _right = obj;
            return true;
        }
        std::cout << "Error at pointTo, object is null";
        return false;
    }

    void clear() {
        _element = new T();
        _left = NULL;
        _right = NULL;
    }

    bool isLeaf() {
        if (_left == NULL && _right == NULL) { //That would mean no offspring -> end of tree
            return true;
        }
        return false;
    }
};

//------------------------------------------------------------------------------------------------------------------------------------------

//------------------------------------------------------------------------------------------------------------------------------------------

//------------------------------------------------------------------------------------------------------------------------------------------

template<typename T>
class Tree {

private:
    TreeEl<T> *_root; //first of them all.
    TreeEl<T> *_it; //iterator
    int _size;
public:
    Tree(T val) { //let's say it goes from less (left) to more (right)
        _root = new TreeEl<T>(val);
        _it = _root;
        _size = 1;
    }

    void root() {
        _it = _root;
    }
    void left() {
        _it = _it->getLeft();
    }
    void right() {
        _it = _it->getRight();
    }

    void add(T newv) {
        TreeEl<T> *taux = new TreeEl<T>(newv);
        _it = _root;
        while (!_it->isLeaf()) { //That would mean no offspring -> end of tree
            if (biggerOrEq(newv, _it->getElement())) {
                _it = _it->getRight();
            } else {
                _it = _it->getLeft();
            }
        }

        if (biggerOrEq(newv, _it->getElement())) {
            _it->pointRightTo(taux);
        } else {
            _it->pointLeftTo(taux);
        }
    }

    bool biggerOrEq(T newv, T oldv) { //to show the <s and >s should be implemented for more than one typename T, here I'll just implement for ints
        return newv >= oldv ? true : false;

        //This is a one-line way of writing if-else!
        // newv >= oldv ? true : false;
        // is newv bigger-or-equal towards oldv ? if so, true : if not, false;
    }

};

#endif /* TREE_H_ */



------

Now you try and make a function that fetches a node no matter where it is inside the tree.

Finally, hash maps!

17/8/12

Writing: World 1-1

I gotta do this, I gotta make it, I gotta save... Everyone.

I'm the hero, I'm meant to do that, I'm the only one who can save everything.

Save all... From who? Ugh... I don't remember. There are people, people who want the world obeying them, and me... Dead. But who are they? Why can't I remember? I feel like I should know, but I don't recall ever learning...

Is this all that I am? A hero? A servant. Destined to save everyone, but a serf nonetheless. I don't want to be a slave. But i don't want to let anyone down. I love them all, but do they love me back? I can't ever remember a kind word for any of my good actions, no one ever thanking me.

Why amb I fighting? What have the bad guys done? Do they really want to kill me? I know for sure I've never fought for my life before. And yet I know I will. And I will need to fight back if I wanna live. How? What's my fighting style? But I feel my instincts will take care of that for me.

 My subconscious. I rely on it too much. Do I?

A part of me that tells me nothing except to leave everything to it. Just do as it says. Who is it? I know it's natural having a subconscious, but... Is it my subconscious?

But I know one thing. I know about her. I don't know if it knows about her, but I know I do. She's really the only thing that makes me obey my subconscious.

She's beautiful, she's perfect, she's... She's...

...

I don't know how she is. But I know I love her. And that I will save anything because that will make her happy, and I will have made her happy. That's why I'm a hero who will be the hero, the hero of my journey.

I will have this world's eternal gratitude, but I'd trade it for an ephimeral instant of her love.

In the end, I am tranquil. My subconscious may know how to fight, who to fight, who to save. What path to tread. But it knows nothing about the end.

I do.

And it will lead me to the end. With her.

And it will let me discover the path of my adventure.

For the first time.


Again.

--------------------

I scribbled this up in half an hour while waiting for my doctor's visit. I started titleing it "Another World", and I was aiming for a kind of trailerish monologue about how the world (videogame world) is far too simple to be true, even for the story I was trying to tell. So the world was going to be a screwed up place, cruel, with no up and down.

But I ended up with a nervous monologue. I know it makes little sense. But picture it as the first time you push a game inside the console, and start it. The first time the hero is created for you.

We know this hero has a story and that we have to help him get through his journey. But he knows far less about this journey than we do! And it's scary, because he knows he's part of a big plan, he just doesn't know why.

But he doesn't reject his subconscioud command--our command, because he knows something we don't. He knows there's this girl waiting for him at the end of the road. He knows she's real, but doesn't mind the why he loves her, because that's love, it needs no sense to be real. And she's real. And so must be the rest of the journey.

And so must be his subconscious. We are a real part of him, in the end. And the most amazing thing is that, for the rest of our lives, he will be part of us.

11/8/12

Data structures: the List (where arrays have no meaning and your whole life is a lie)


This time, an intuitive structure with a not so intuitive implementation, the list.

What is a list?
A list is an array of elements put in a distinct order, but that can have new elements introduced in between any other two elements, as well as deleting any element in any position. This helps us in not having a fixed structure in which an error while introducing elements can cost us a lot of time and resources to fix.




Therefore, we may need to be able to go through all the elements in the list in order to check, add, or delete any elements. To do that, however, it may be wise to also create an iterator, that is, a pointer that may be able to automatically do such search.

What can you do with a list?

  • Create: begin the list data structure, reserve space.
  • Add: put a new element. Where, however? If we point at an existing element of the list (which is all we can do), do we add the new element before, or after him? And how do we do it to ensure there's capacity for it, and that we do not need to sacrifice too many resources on resizing the list?
  • Delete: remove the element at the specified position.
  • Alter: change the value in the element
  • Value: get the value of the element
  • First: get the first element.
  • Size: gives the number of listed objects
  • Empty: checks whether the list has no elements.
Advantages
  • Extremely flexible: adding, deleting, altering or checking any element, any position.
  • Convenient use of an iterator
Setbacks
  • Tougher implementation: arrays do not work as a storing base.
  • Hardly scalable: worst case scenario, your adding, deleting, altering or checking elements will cost you to go through all the list.

We've all done a list at some point in our life. Having our affairs in order, striking out what we've accomplished, adding new objectives or modifying them. All in all it's a good way to store data so long as it's more of an impromptu or small storage; because a list that grows too big becomes harder and harder to keep track of. With that in mind, we can deduce in which contexts a list is useful.

Implementation: how the buck are we gonna do this?
Having the list modify elements in the middle of the structure means using arrays or simple storages will elevate the cost of the actions dangerously. Every new item will mean rearranging the whole thing. So how do we do it?

Pointers. And magic.

Each element of a list will contain its information and a pointer to the next element, creating a chain of elements. We will only need to have a pointer to the first element; from there we can access the pointers of the elements to advance through the list.

We will also need a Ghost Element. It will be a flagged element which is not valuable, but has a functionality.



It will always be at the end of the list. By adding/deleting elements before a certain element, we will be able to put an element everywhere but at the end of the list. However, with the Ghost, we can add an element before him, effectively putting an element at the end of the usable list.

It is very important that the users can't alter the Ghost.

/*
 * list.h
 *
 *  Created on: 29/06/2012
 *      Author: Daniel
 *
 *      This snippet of code is free to use
 */

#include <iostream>
#include <stdlib.h>

#ifndef LIST_H_
#define LIST_H_

template<typename T>
class ListEl { //Elements that go into the list

private:
    T* _element; //Actual information
    ListEl<T>* _next; //Pointer to the next eListEl

public:
    ListEl() {
        _element = NULL;
        _next = NULL;
    }

    ListEl(T input) {
        _element = new T();
        *_element = input;
        _next = NULL;
    }

    T getElement() {
        return *_element;
    }
    void setElement(T in) {
        _element = new T();
        *_element = in;
    }
    ListEl* getPointed() {
        return _next;
    }

    bool pointTo(ListEl* obj) { //We retain the direction, not the element, for versatility; else we'd be playing a game of russian dolls.
        if (obj != NULL) { //little failsafe to ensure the ListEl exists and we're not being fed null stuff.
            _next = obj;
            return true;
        }
        std::cout << "Error at pointTo, object is null";
        return false;
    }

    void clear(){
        _element = new T();
        _next = NULL;
    }
};

//------------------------------------------------------------------------------------------------------------------------------------------

//------------------------------------------------------------------------------------------------------------------------------------------

//------------------------------------------------------------------------------------------------------------------------------------------

template<typename T>
class List {

private:
    ListEl<T> *_first; //first element, pointer needed to run through the list
    T _flag; //Value that will describe the ghost element. giggle at the ghosties...
    int _size;
    ListEl<T> *it; //iterator

public:
    List(T flag) { //first
        _first = new ListEl<T>(flag); //ghost points nowhere -> danger!
        _flag = flag;
        _size = 0; //we don't count the ghost.
        it = _first;
    }

    bool isEmpty() {
        if (_size > 0) {
            return false;
        }
        return true;
    }

    ListEl<T> getItem() {
        if (it->getElement() != _flag) {
            return *it;
        } else {
            std::cout << "Empty list";
            return NULL;
        }
    }

    void first() {
        it = _first;
    }

    void alter(T value) {
        it->setElement(value);
    }

    void erase() {

        if (!isEmpty()) {

            it->setElement(it->getPointed()->getElement()); //copy the next element
            it->pointTo(it->getPointed()->getPointed());//the element in the middle will now be nowhere

            _size--;

            it->getPointed()->clear(); //free memory
        }else{
            std::cout << "Can't erase - list is empty";
        }
    }

    void next() {
        if (it->getPointed()->getElement() != _flag) {
            it = it->getPointed();
        } else { //if the next is the ghost, we're at the end!
            std::cout << "End of list";
        }
    }

    bool add(T value) { //we could ask for a ListEl, but it's more comfortable for the programmer if they just need to input the value.
        ListEl<T> *laux = new ListEl<T>(); //an empty node...

        if (!isEmpty()) { //so long it's not the ghost (points SOMEWHERE...)
            laux->pointTo(it->getPointed()); //we copy there the current node...
        }
        laux->setElement(it->getElement()); //so laux is the current node...

        it->pointTo(laux); //and the old current node points to laux, putting it BEFORE him...
        it->setElement(value); //and we have added a node BEFORE! tough to wrap your head around it the first time.
        _size++;

        return false;
    }

};

#endif /* LIST_H_ */

------

Left as an exercise is devise a find() function that will tell you the index of the element you query.

Next up, trees.

7/8/12

Lavarinth (Work in Progress)

So I've been learning Flash programming (Actionscript lately) thanks to Foundation Game Design with Flash, recommended lecture!

I still have a ways to go, but I've decided to try and patch something up. Maybe upload it to Kongregate. Thy name is Lavarinth!

Here it is!

So it's basically a maze-solving game, with two twistes:
  • Mazes will be randomly generated every time
  • Lava will slowly flood the maze.
The mazes will also change the size.



I've tried to go for the retro approach, from the graphics to the delivery of the game. Case in point:
  • The graphics are low-res (88x64 I think), simulated.
  • The colors are the ones you'd find in a 3-bit graphics card.
  • The intro is just the title screen loading, then 5 seconds afterwards a few lines of text will roll the credits and the intro text.
  • Only buttons you'll use are arrow keys / WASD and P to pause (like an atari 2600 controller)
  • The sounds will also be low-definition (although I'm not sure how to do that yet).
 This is all simulation for the sake of old-school style, though, and I still have all the tools of Adobe Flash to make this work.

However this is an outdated version as I'm trying to improve the game.

The main problem I have now is that Flash goes too fast for its own good. While I try to load the maze, it's already trying to draw it on screen, so I get to a Null exception. Since I haven't read much more about flash programming I don't know how to handle this issue - namely, how to make the program wait for the maze to initialize.

So you see it's okay to make mistakes or not knowing about stuff, so long as you keep searching. I myself will keep reading the book and its sequel and search on the internet. Any information you can give me will be awesome though.

Until I figure stuff out, rock on!

----

P.S.: Some easter egg sketches of what I used to make the intro art (yes, just like that!):





Shut up, they're sketches, they're SUPPOSED to be sketchy.