Tuesday, May 28, 2013

NSS_OPTIONS

Sometimes developers put undocumented options in their code to help with debugging. This morning I came across one such option, which prints extra debug information when executing NSS queries.

First you need to disable nscd:
# svcadm disable -t name-service/cache
Then you need to set the debug_eng_loop option in the NSS_OPTIONS environment variable to a value greater than 0, for example:
# NSS_OPTIONS="debug_eng_loop=2" ping -I 1 wp.pl
NSS_retry(0): 'ipnodes': trying 'files' ... result=NOTFOUND, action=CONTINUE
NSS: 'ipnodes': continue ...
NSS_retry(0): 'ipnodes': trying 'dns' ... result=SUCCESS, action=RETURN
NSS: 'ipnodes': return.
PING wp.pl: 56 data bytes
What's good about it is that it tells you which source configured for an NSS database (here dns for ipnodes) returned the result.
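If there is a lot of debug output, a small filter can pick out just the answering source. This is only a sketch: nss_winner is a hypothetical helper name, and it assumes the debug lines look exactly like the ones shown above.

```shell
# Hypothetical helper (not part of Solaris): read NSS debug output on stdin
# and print "<database> <source>" for every source that returned SUCCESS.
# Assumes the line format shown in the example output above.
nss_winner() {
    sed -n "s/^NSS_retry([0-9]*): '\([^']*\)': trying '\([^']*\)' \.\.\. result=SUCCESS.*/\1 \2/p"
}
```

It could be used by appending `2>&1 | nss_winner` to the command above (the 2>&1 is there in case the debug lines go to stderr, which I have not checked).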

Tuesday, May 21, 2013

Setting RPATH

Today I was made aware that the elfedit tool in Solaris 11 allows for setting RPATH (among other things). The only caveat is that the binary has to have been linked on Solaris 11. It is very easy to use:

 # elfedit -e 'dyn:runpath $ORIGIN/../lib' /opt/bin/myprog 

There is a nice blog entry about it from Ali Bahrami.

Friday, March 22, 2013

OpenAFS on Solaris 11 x86

Two days ago I presented at the UK Solaris SIG meeting on running OpenAFS on Solaris 11 x86. This is essentially the same talk I gave last year in Edinburgh; I just added a few slides explaining what OpenAFS is.

Tuesday, March 05, 2013

ZFS: no-op overwrites

There is an interesting new feature in ZFS in Illumos.

https://www.illumos.org/issues/3236
When overwriting a block which is check summed with a cryptographically secure hash function we can compare the old and new checksums for the block to determine if they differ (at almost no cost since we were going to do the checksums anyway). If they do not differ we don't actually need to do the write. This:
1) Reduces I/O
2) Reduces space usage, because if the old block is referenced by a snapshot we will need to keep both copies of the block around even though they contain the same data.
This functionality is only enabled if:
1) The old and new blocks are checksummed using the same algorithm.
2) That algorithm is cryptographically secure (e.g. sha256)
3) Compression is enabled on that block.
Philosophical question: should we just trust sha256?
(It seems this can't be disabled, nor is there an option similar to dedup=verify.)
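
The checksum comparison described in the issue can be illustrated with a user-space analogy. This is not ZFS code: store_block and the .sha256 sidecar file (standing in for the checksum ZFS already keeps for the old block) are made up for the illustration, and it assumes GNU sha256sum is available (on Solaris, digest -a sha256 should do the same job). It covers only the comparison itself, not the conditions about algorithm and compression.

```shell
# Hypothetical analogy of a no-op overwrite, not how ZFS implements it.
# store_block DEST SRC: copy SRC over DEST unless the SHA-256 recorded for
# DEST (in DEST.sha256) already equals the SHA-256 of SRC.
store_block() {
    dest=$1
    src=$2
    new_sum=$(sha256sum < "$src" | cut -d' ' -f1)
    if [ -f "$dest.sha256" ] && [ "$(cat "$dest.sha256")" = "$new_sum" ]; then
        echo "nop"        # checksums match: trust the hash and skip the write
        return 0
    fi
    cp "$src" "$dest"     # checksums differ (or no old block): do the write
    printf '%s\n' "$new_sum" > "$dest.sha256"
    echo "written"
}
```

Note that the data itself is never compared, only the checksums, which is exactly what the question above is about.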

There are more interesting new ZFS features in Illumos (for example this one which does a similar thing to what Solaris 11 does). The only regret is that, unless one wants to play with one of the appliances based on Illumos, the only way to use these features is to run FreeBSD or Linux, which is rather ironic. But on the other hand, why not? At least at home.
 

Friday, November 09, 2012

vmtasks explained

Solaris 11 introduced a new kernel process called "vmtasks" which accelerates some operations when working with shared memory. For more details see here.

20 Years of Solaris

Nice video from Oracle celebrating 20 years of Solaris.

Tuesday, October 23, 2012

Running OpenAFS on Solaris 11 x86 / ZFS

Recently I gave a talk on running OpenAFS services on top of Solaris 11 x86 / ZFS. The talk was split into two parts. The first part was about the $$ benefits of transparent ZFS compression when running on 3rd party x86 hardware (but it also makes sense when running on Sun/Oracle kit, in some cases even more so). This part also discusses some ideas about running AFS on internal disks instead of directly attached disk arrays, which, again thanks to ZFS built-in compression, becomes worthwhile and delivers even more $$ savings.

The main message of this part is that if your data compresses well (above 2x), running OpenAFS on ZFS can deliver similar or even better performance, but most importantly it can save you lots of $$, both in acquisition costs and in the cost of running the AFS plant. In most cases you should even be able to re-use the x86 hardware you already have. The beauty of AFS is that we were able to migrate data from Linux to Solaris/ZFS in-place, re-using the same x86 HW, and all of this was completely transparent to all clients (keep in mind we are talking about PBs of data). This is truly the cloud file system. I think OpenAFS is one of the under-appreciated technologies in the market.

The second part is about using DTrace, both in dev and in production systems, to find scalability and performance bottlenecks, and other bugs as well. Two easy, real-life examples are discussed, which resulted in considerable improvements in scalability and performance of some operations in OpenAFS, along with some other examples of D scripts which provide top-like output with some statistics (slide #32 is an example from a Solaris NFS server, serving VMware clients and displaying different stats per VM from a single file system...). DTrace has proven to be a very powerful and helpful tool for us, although it is hard to put a specific $ value on what it brings.

The slides should be available here.

Wednesday, August 29, 2012

Open Indiana is dead

With the main guy behind the project resigning, OI is essentially dead. It's been dead for some time though, and with no commercial backing it never really had much chance. This is sad news indeed (although I haven't really used OI). It marks the end of the OpenSolaris era.

Can Illumos survive in the long term? Can it become relevant outside of a couple of niche use cases?

Ironically, it is Oracle's Solaris which will probably outlive all of them.