CouchDB document management using Erlang


Looking at the Erlang Factory roster, I found a reference to the Apache project CouchDB.

This is an Erlang implementation of a document-oriented database or, in layman's terms, a stripped-down document management system.
The introduction, which is quite short, states that it is not a relational database and not a persistence layer for OO applications.
This is quite weird, since a JSON application can be OO and still use the document management system.
I think this remark is there more to frighten you than to tell the full story.
If you want the full story of the limitations of the document database, you should read Eric Florenzano’s post; it is truly thorough and, in my view, gives the correct picture.
I will take a few items out of it to explain why this type of database is not your typical database.
Due to its nature, the database will be slow on complex views:

The problem with this is that temporary views in CouchDB on large datasets are really slow

You may also find its relational and transactional abilities lacking:

it doesn’t support transactions in the way that most people typically think about them. That means, enforcing uniqueness of one field across all documents is not safe.

CouchDB sucks at dealing with relational data. If your data makes a lot of sense to be in 3rd normal form, and you try to follow that form in CouchDB, you’re going to run into a lot of trouble.

CouchDB exposes a RESTful HTTP API for communicating with it, and some clients have already been implemented, even in JavaScript.
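To make the RESTful API a little more concrete, here is a minimal Python sketch that only builds the HTTP requests and does not send them. The server address, the database name `cards`, the document ID `alice`, and the helper `build_request` are all assumptions made up for illustration.

```python
import json
import urllib.request

# Assumption: a CouchDB server listening on its default local address.
BASE_URL = "http://127.0.0.1:5984"


def build_request(method, db, doc_id=None, body=None):
    """Build (but do not send) an HTTP request for a database or document."""
    url = f"{BASE_URL}/{db}" + (f"/{doc_id}" if doc_id else "")
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# PUT stores a JSON document at a URL of our choosing.
put_req = build_request("PUT", "cards", "alice", {"name": "Alice", "phone": "555-0100"})
# GET reads the same document back.
get_req = build_request("GET", "cards", "alice")

print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

To actually talk to a running server, each request object could be passed to `urllib.request.urlopen`.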

So what is CouchDB?
1. Not a relational database.
2. Schemaless.
3. Support for data replication.
4. Requests use HTTP: GET, PUT, POST, and DELETE with RESTful semantics.
5. Requests specify the database and record in the URL, with query params used for modifiers.
6. Record creation, updating, and deletion is atomic.
7. Supports all JSON data types (string, number, object, array, true, false, null).
8. Indexing is under user control, by means of views:
a. Defined with arbitrary JavaScript functions.
b. Can be stored as documents.
c. Can be run ad hoc, as temporary views.
9. Queries are views with modifiers (start_key, end_key, count, descending) supplied as HTTP query parameters.
10. Sorting is flexible and arbitrarily complex, as it is based on the JSON keys defined in the views.
11. Responds with JSON.
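Items 8 to 10 are easier to picture with a toy model. The sketch below is a conceptual Python imitation of a view, not CouchDB's implementation: real views are JavaScript functions evaluated by the server, and the names `map_by_city`, `build_view`, and `query_view` are hypothetical. It also simplifies how `descending` interacts with the key range by filtering first and reversing afterwards.

```python
def map_by_city(doc):
    """Hypothetical map function: emit (key, value) pairs for one document."""
    if "city" in doc:
        yield (doc["city"], doc["name"])


def build_view(docs, map_fn):
    """Run the map function over every document and sort the rows by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows, key=lambda row: row[0])


def query_view(rows, start_key=None, end_key=None, count=None, descending=False):
    """Apply the query modifiers to an already-built (sorted) view."""
    selected = [
        (key, value)
        for key, value in rows
        if (start_key is None or key >= start_key)
        and (end_key is None or key <= end_key)
    ]
    if descending:
        selected.reverse()
    return selected[:count] if count is not None else selected


docs = [
    {"name": "Alice", "city": "Oslo"},
    {"name": "Bob", "city": "Berlin"},
    {"name": "Carol"},  # no city, so the map function emits nothing for it
    {"name": "Dave", "city": "Madrid"},
]

view = build_view(docs, map_by_city)
print(query_view(view, start_key="C", end_key="N"))
print(query_view(view, descending=True, count=2))
```

The point of the design is that the sort order comes entirely from the keys the map function chooses to emit, so changing the emitted key changes what can be queried and how it is sorted.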

But the most important thing for me is that it is written in Erlang, which I expect to make it more reliable and stable.

The main advantage of CouchDB over a relational DB is that there is no schema, which means each record can be saved with its own structure. The best example of such information is a business card: most cards are similar but not identical, so a relational table ends up holding empty columns for information that most cards do not have. With CouchDB each card's schema is unique, so no storage is wasted.
Another thing CouchDB is good at is evolving data: if an item now has more fields than before, you can update it without worry, and the nonexistent schema changes along with it.
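As a small illustration of the business card example, here is a sketch in Python with each card represented as the JSON document it would become. The field names and values are invented; only `_id` is a name CouchDB itself uses, for the document identifier.

```python
import json

# Two business cards with different fields; no shared schema is declared anywhere.
card_a = {"_id": "card-alice", "name": "Alice", "phone": "555-0100"}
card_b = {"_id": "card-bob", "name": "Bob", "email": "bob@example.com", "fax": "555-0199"}

# Evolving data: one card gains a field later, and only that document changes.
card_a_v2 = dict(card_a, website="https://example.com/alice")

for card in (card_a, card_b, card_a_v2):
    print(json.dumps(card, sort_keys=True))

# The columns a fixed relational table would need, most of them empty per row.
all_fields = sorted(set(card_a) | set(card_b) | set(card_a_v2))
print(all_fields)
```

Against a real server, an update like `card_a_v2` would also have to carry the document's current revision, which this sketch leaves out.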

It seems that even in its early alpha stages it is very quick and stable. One example is elyservice.co.uk, a conversion from Django made by Jason Davies.
As for performance, some early-stage testing is going on at Craigslist to see whether this DB can become part of their technology. In the early testing, the recorded performance of the server was impressive:

It seems that without any tuning or fancy work, I can get about 75-100 inserts/sec on my desktop class Ubuntu box (Intel Core 2 Duo, 2.66GHz, 1GB RAM, single 80GB SATA disk). That’s not bad for out-of-the-box performance. And doing the math on space used for a document set (after compaction), I’m seeing roughly ~3KB/doc. That’s a bit more than I expected but really not bad at all.
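To get a feel for what those numbers mean, here is a back-of-the-envelope calculation in Python. The data set size of one million documents is purely an assumption for illustration; the insert rate and per-document size are the figures quoted above.

```python
# Figures quoted above: 75-100 inserts/sec and roughly 3 KB per document.
INSERTS_PER_SEC = (75, 100)
KB_PER_DOC = 3

# Assumption for illustration only: a data set of one million documents.
doc_count = 1_000_000

slowest_hours = doc_count / INSERTS_PER_SEC[0] / 3600
fastest_hours = doc_count / INSERTS_PER_SEC[1] / 3600
storage_gb = doc_count * KB_PER_DOC / 1_000_000  # taking 1 GB as 1,000,000 KB

print(f"load time: {fastest_hours:.1f} to {slowest_hours:.1f} hours")
print(f"storage after compaction: about {storage_gb:.0f} GB")
```

So, on that desktop-class box, a million documents would take roughly three hours to load and about 3 GB of disk.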

Now, if you are a Python programmer, you will ask the simple question: why not use ZODB? It has been around for years, it is good, and its features are almost the same. My thinking is that the Erlang infrastructure makes the difference for me and makes me trust CouchDB more.

CouchDB's tagline is "relax", so now that you know why, you can relax!
Here are a couple of PDFs to get you through the day:
CouchDB to the edge
and my personal favorite title, Relax Internals
