11:16 AM ~ 1 Comment ~ Written by Thomas Moreau

Quelle gestion collaborative des connaissances ?

Fabienne, notre CKO, interviewée lors de notre passage au salon Solutions Intranet 2009 par Techtoc TV, vous présente la solution Knowledge Plaza. Une interview riche où Fabienne explique notamment que la plate-forme permet de partager des sources d’informations sans même devoir automatiquement se connecter à l’outil.

03:38 PM ~ 4 Comments ~ Written by e.benitoruiz

Addressing the future vertical real-time web

now_watch

Looking back to the past 20 years on technology, one can easily notice a pendulum cycle that gets repeated at each stage. This post doesn’t intend to be a review on the history of web so to sum it up we could state that we always start with horizontal approaches and thus horizontal apps, and as long as users (mainly early adopters) use & spread a given app, new needs arise, which have to be addressed especifically. Basically it’s always from horizontal to vertical, take search engines evolution  for instance, or the evolution from websites to blogs, or the current development of social networks…from unspecialized to specialized ones. Take Twitter, a horizontal organic app. After the huge widespread (but still insufficient to be called a killer app adopted by non-techie mainstream, just like email or msn messaging did) of Twitter, 3rd-party solutions show up to tap Twitter stream for specific purposes.

Indeed, the pendulum and cronology has moved from asynchrony to synchrony, from plain html pages, forums, emails to messaging and chats. From asynchronous blogging to real-time blogging, i.e. Twitter.

I don’t actually agree with what Alexl Cohen says here:

“It’s something of a design truism that you can be decent at more things or great at a few things. Scarce resources, time and customers always force trade-offs. I like to think of “decent at more things” as a horizontal focus. Being “great at a few” would be a vertical focus”.

I understand his stance, though, somehow resounding the old saying “Jack of all trades, master of none”…But Twitter does more than a decent job being horizontal…or Flickr, or Google. In any case, vertical does help leverage and scale whatever that is you are using or searching in a more efficient way.

OK, back to the ‘now’ web, the real-time web, aka Twitter. As Jeff Pulver (@jeffpulver) claimed at the 140 Character Conference in NY, we are living at the state of now. And Twitter was just that now system we didn’t know we needed until we had it (paraphrasing from the t-shirt  Tim O’Reilly wore at that conference). Twitter is basically based upon, as defined by Jack Dorsey (Twitter co-founder), approachability, transparency and immediacy. Horizontal real-time web.

With Microplaza, we’re working towards that pendulum’s second phase: verticality. In this case verticality for the real-time now web. We aim to provide Twitter users the specialized tools and features to focus on specific topics, networks, conversations happening at Twitter, right now. It wasn’t really about coding a hot topic tracker from the public timeline as it was developing an application that would help you keep focused on what’s relevant for you and your trusted network (i.e. our personal timeline or our friends’), considering each passed link as the mother source (yeah, if content is king, then sure link relevance is queen).

Track and search your conversational network.

Search. Has this ever happened to you? When you need some help on a given issue, you have a doubt or whatever, and call someone but no one is available. You have to find your way to solve the problem. Then after a couple of hours or days, they call you back telling you they got a missed call, and ask you if they can help you.  Sorry, your question and your possible answer are not longer timely, nor fresh nor relevant.

Indeed. Search. We need the right data at the right time. And the right time is always now. We have present, future, past and now. Now is the right tense for search.

Indeed. Search. We need fresh data. Fresh not frozen (i.e. historical index or cached, which also come in handy but they’re not the same). But now can be integrated and complemented with fresh data from real-time web.

Indeed. Search. Relevant. What’s relevant for each of us.

Vertical zeitgeist, vertical real-time web is up for grabs and we think Microplaza can grab a piece …or rather, we think we’re cooking part of that recipe.

05:27 PM ~ 1 Comment ~ Written by Sylvain Munaut

Some quick numbers about SSD for PostgreSQL

SSDs are not new anymore, there are plenty of benchmark and tests out there. Even on the specific combination of SSDs and PostgreSQL, there is quite a lot of availabe data. This post by Jignesh Shah is a good example (fyi, if you use PostgreSQL on Solaris, that blog’s awesome !).

And when the time came to choose hardware for one of our DB server, we chose to include some SSDs in it. But once installed and used with real data under a real workload, I wanted to make sure that the prediction we made at the time were accurate and that indeed the latency of the SSDs was low and well used.

To this end, I wrote a little DTrace script (btw, DTrace _rocks_ !) that records all calls to the read syscall with a size of 8192 (PostgreSQL page size) and displays a statistical distribution graph of their completion time, both globally and by table space.

I chose to analyze the reads only because of the various special techniques involved in DB writes that would have made the results hard to interpret.

Here is the script:

#!/usr/sbin/dtrace -qs

dtrace:::BEGIN
{
        self->start = 0;
}

syscall::read:entry
/arg2 == 8192/
{
        self->start = timestamp;
        self->dir = dirname(fds[arg0].fi_pathname);
        self->file = basename(fds[arg0].fi_pathname);
        self->bytes = arg2;
}

syscall::read:return
/self->start/
{
        this->elapsed = timestamp - self->start;
        @read_time_all = quantize(this->elapsed);
        @read_time[self->dir] = quantize(this->elapsed);
        self->start = 0;
}

dtrace:::END
{
        printa("%@d\n", @read_time_all);
        printf("\n");
}

Then, just need to logon to a machine under load and run it.

In this test, I used a PostgreSQL setup with the main PostgreSQL data files on the OS drives and 3 table spaces for the tables/indexes:

  • ts_main: Stored on a minimal RAID10 array of four Seagate SAS 300G 10krpm
  • ts_aux: Stored on the same drives as the OS
  • ts_ssd: Stored on a mirror RAID of two Intel X25-E of 32G (this array also hosts the pg_xlog)

Here are the results after a few minutes of data collection (the value column is the time it took to complete the read, in ns) :

root@painkiller:~/dtrace# ./pg-latency.d
^C

          value  ------------- Distribution ------------- count
           8192 |                                         0
          16384 |@                                        403
          32768 |@                                        223
          65536 |                                         6
         131072 |                                         49
         262144 |@@@@@@@@@                                3176
         524288 |@                                        247
        1048576 |@@                                       754
        2097152 |@@@                                      1071
        4194304 |@@@@@@@@@                                3390
        8388608 |@@@@@@@@                                 2962
       16777216 |@@@@                                     1566
       33554432 |@                                        545
       67108864 |                                         100
      134217728 |                                         20
      268435456 |                                         15
      536870912 |                                         5
     1073741824 |                                         2
     2147483648 |                                         0        

 /var/postgres/8.3/data/pg_clog
          value  ------------- Distribution ------------- count
           8192 |                                         0
          16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@               394
          32768 |@@@@@@@@@@@@@                            201
          65536 |                                         0
         131072 |                                         0
         262144 |                                         0
         524288 |                                         0
        1048576 |                                         0
        2097152 |                                         0
        4194304 |                                         1
        8388608 |                                         0        

 /var/postgres/8.3/data/ts_ssd/17946
          value  ------------- Distribution ------------- count
          16384 |                                         0
          32768 |                                         1
          65536 |                                         0
         131072 |                                         3
         262144 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@          3049
         524288 |@@                                       211
        1048576 |@@@@@@                                   598
        2097152 |                                         32
        4194304 |                                         9
        8388608 |                                         6
       16777216 |                                         5
       33554432 |                                         5
       67108864 |                                         3
      134217728 |                                         2
      268435456 |                                         6
      536870912 |                                         1
     1073741824 |                                         0

 /var/postgres/8.3/data/ts_aux/17946
          value  ------------- Distribution ------------- count
         131072 |                                         0
         262144 |                                         2
         524288 |                                         0
        1048576 |@@                                       48
        2097152 |@@@@@@@@@                                290
        4194304 |@@@@@@@@@@@@@@@@@@                       570
        8388608 |@@@@@@@@                                 252
       16777216 |@@@                                      91
       33554432 |                                         7
       67108864 |                                         1
      134217728 |                                         2
      268435456 |                                         0        

 /var/postgres/8.3/data/ts_main/17946
          value  ------------- Distribution ------------- count
           8192 |                                         0
          16384 |                                         9
          32768 |                                         21
          65536 |                                         6
         131072 |                                         46
         262144 |@                                        125
         524288 |                                         36
        1048576 |                                         108
        2097152 |@@@                                      749
        4194304 |@@@@@@@@@@@@@                            2810
        8388608 |@@@@@@@@@@@@                             2704
       16777216 |@@@@@@@                                  1470
       33554432 |@@                                       533
       67108864 |                                         96
      134217728 |                                         16
      268435456 |                                         9
      536870912 |                                         4
     1073741824 |                                         2
     2147483648 |                                         0

Let’s first focus on the first graph. You can clearly see three zones :

  • 16 – 32us: This is very fast ! Probably data served directly from the OS level cache.
  • 250 us: This spike is quite sharp. Not fast enough to be from the OS cache and probably too fast to be from the RAID. So I’d guess theses are the reads hitting the SSDs.
  • 1 – 32ms: This is a pretty wide range. Most likely the HDDs seek times mixed with the queuing delays from other IOPs.

Ok, now that we have some hypothesis, we can confirm them (or not !) by looking at the detailed graphs.

The second graph are the read hitting the clog directory of PostgresSQL, they are obviously all the ones hitting the OS cache which is not surprising given the ’space/time locality’ of the data stored in there.

The third graph are all the reads to the tables/indexes stored in the SSD tablespace. And as predicted, they are the ones in the 250us range. I’m not quite sure what’s up with the “sub spike” at around 1ms tough. Possibly the effects of queued commands that have to wait for completion of some other reads being issued … but that would require more investigation to be sure. Also note that those 250us are not all due to the latency ! Even at 100 Mbytes/s, just the transfer time already represents around 80us …

Finally the fourth and fifth graph are quite similar and are indeed all the reads hitting the disks, centered around 4-8 ms. This matches the disks specs.

So, what’s the conclusion ? Are the SSDs as good as they say ?

Well, as always, there is not one unique, simple, universal answer ! Yes, they are indeed very good at access times, but whether this really matter for the real world performance of your application depends on your workload.

If you have a lot of random access in some huge table or index, the access time of the SSDs can give you a ~ 10 fold improvement ! On the other hand, if you mostly have sequential scans and if all your random access are confined to a set of pages sufficiently small to fit in the PostgreSQL or OS cache, then you might not need them. Because, let’s face it, at around 1000 eur a piece, those X25-E are kinda expensive and you might be better off with a lot of 15krpm spindles.