Solaris hacking

Solaris 10 Update 5 (05/08) is out!

Solaris 10 update 5 is out! Check out the "what's new". Strangely, nothing on ZFS. Oh well...

All the same, download it at the usual place.

RAID-Z + JBOD or RAID-5?

I was at the SUN-NUS Opensource Day on Friday, and after the event, we all adjourned for dinner. You know, when geeks gather, we inevitably will start discussing the hacks we are deploying for one reason or the other. Or thorny problems. And hence, I brought up my "huge dataset" problem.

Now, the traditional way of carving storage is to build RAID-5 sets on the SAN and cut each set into smaller LUNs that are mapped to the operating system. If need be, some of the LUNs are mirrored (RAID-1) or striped together in the OS. Why? RAID-5 for redundancy, smaller LUNs because a whole RAID-5 set is too big, and striped LUNs to expand storage when required.

So, what's the big problem? Wee Yeh (http://prstat.blogspot.com) pointed out that LUN carving is done ACROSS the RAID-5 disk set. It simply follows the way RAID-5 parity is laid out.

"ACROSS"!!!! Wait a moment... Doesn't this mean that if a disk fails, performance is impacted across all the LUNs in the disk set? And if 2 disks fail, ALL the LUNs in the disk set are dead! So, if one LUN is for applications, another is for data storage, and the third is for user directories, the entire computing stack is down. Unrecoverably dead. Geebes!!!!! <head rolls>

Of course, being a nice guy, Wee Yeh suggests using JBOD disk sets + RAID-Z using ZFS. Since ZFS handles the parity, there is still redundancy. If a hard disk fails, the ZFS pool continues to work. If 2 disks fail, at least the damage is contained to this ZFS pool. An added advantage is that both the SAN storage and ZFS will scream when the first hard disk dies, as opposed to only the SAN storage screaming while the OS remains oblivious. And ZFS checksums the data it writes and verifies it on read, nicely making up for the lack of such read/write checks on SATA disks.
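
Here is a rough sketch of what that suggestion might look like on the command line. The pool name (tank) and the disk names are made up for illustration, and the function only prints the zpool commands rather than running them, so it can be tried on any machine. Actually running the printed commands would need root on a box with ZFS.

```shell
#!/bin/bash
# Sketch only: print (do not run) the commands that would build a
# single-parity RAID-Z pool from JBOD disks.
# The pool name and disk names passed in are hypothetical.
raidz_plan () {
    local pool=$1
    shift
    # Single-parity RAID-Z needs at least one data disk plus one for parity.
    if [ $# -lt 2 ]; then
        echo "raidz_plan: need at least 2 disks" >&2
        return 1
    fi
    echo "zpool create ${pool} raidz $*"
    echo "zpool status ${pool}"
}

raidz_plan tank c1t0d0 c1t1d0 c1t2d0 c1t3d0
```

Printing the plan first is just a habit of mine for anything that touches disks; review the output, then run it by hand.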

Okay, time to do some hard thinking.

List of *nix commands

Now that I am doing cross platform system administration, it is getting critical to have lists of equivalent commands across the *nixes. Found 2 guides so far:

  1. Unix toolbox: http://cb.vu/unixtoolbox.xhtml
  2. Tom's Hardware Universal Command Guide: http://www.tomshardware.com/ucg/

Will be adding more as time goes by.

Re: Data reorganisation woes

Somehow, there is no opposition to the data reorganisation plan. Yeah!!

And I found out, from a post to the LUGS mailing list, that sharing 3+ terabytes of files through Samba and NFS (v3) has already been done. So, no more drastic hacks, for now. Wee~

Anyway, I just wrote a Red Hat Enterprise Linux 5 kickstart file. I may post that soon.

Disk Volume Size Limits

Yesterday, my Sun Microsystems vendor pointed out something I didn't want to think about: as my data storage climbs into multi-terabytes, our current way of storing/distributing data is no longer feasible. Damn, multi-terabyte datasets are irksome...

Here's some idea of the problem:
1) Everyone needs to see all of this data, across many different computation machines.
2) Each dataset may need to stay "live" for years, as research can take years to bear fruit. Hence, a multi-stage storage strategy may not be applicable.
3) Windows XP has a 2TB volume size limit, and I do not know if CIFS/Samba can even support such big network shares.
4) Filesystem-wise, there will not be enough inodes, unless I use ZFS on Solaris.
5) Even my SAN has a problem, as each LUN can only go up to 1TB. Solution? RAID-Z the LUNs.
6) And to top it all off, the present dataset structure has to be reorganised, as it is just too messy to scale.
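
Some back-of-envelope arithmetic for point 5, assuming single-parity RAID-Z where one LUN's worth of space in each group goes to parity, and ignoring filesystem overhead. The function names are my own and not part of any tool.

```shell
#!/bin/bash
# Usable capacity (in TB) of one single-parity RAID-Z group:
# one LUN's worth of space is consumed by parity.
raidz_usable_tb () {
    local luns=$1 lun_tb=$2
    echo $(( (luns - 1) * lun_tb ))
}

# Number of LUNs needed in ONE single-parity group to reach a target:
# ceil(target / lun size) data LUNs, plus one for parity.
luns_needed () {
    local target_tb=$1 lun_tb=$2
    echo $(( (target_tb + lun_tb - 1) / lun_tb + 1 ))
}

raidz_usable_tb 6 1    # six 1TB LUNs in one group
luns_needed 15 1       # the 15TB target with 1TB LUNs
```

With 1TB LUNs this prints 5 and 16. Splitting the LUNs into several smaller groups costs one extra parity LUN per group, so the real count depends on how the pool is laid out.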

Let's not even talk about the network bandwidth problem...

Arrgh...... And I'm looking at scaling to 15TB in 3 years!!!

Here's to another round of persuading everyone that this is a time of change. Adios~

Bash Scripting Tips

I went looking around for bash scripting tips, especially on secure coding in bash. I couldn't find much information, so I decided to consolidate whatever I found here.

  • Salt string comparisons of variables to increase security

    if [[ "a$?" == "a4" ]]; then
    

  • Use the full paths to any binaries, either by hardcoding them into the script or use variable substitution. This prevents the script from executing incorrect/rogue binaries in the path.

    /bin/grep "hardcoding the full path" *
    
    echo=/bin/echo
    ${echo} "From bash manpage under EXPANSION:
    The order of expansions is: brace expansion, tilde expansion,  parameter,
    variable  and  arithmetic  expansion  and command substitution (done in a
    left-to-right fashion), word splitting, and pathname expansion."
    

  • Change the environment path at the start of the script to ensure no rogue directories are in the PATH

    #!/bin/bash
    # comments
    PATH=/bin:/usr/bin
    

  • Write a function to explain the usage of the script

    function print_usage () {
        ${echo} "
    $0
    Usage: $0 [-a opts] [arguments]
     or    $0 -h
    Description: Something fishy
    Options:
      -a opts    (Optional) Options
      -h         (Optional) Help
      arguments  Smelly smelly fish
    "
    }
    

  • Here's a sample code snippet to process script options

    if [ $# -lt 2 ]; then
        print_usage
        exit 1
    else
        while getopts ha:b: options; do
            case "${options}" in
                h)  print_usage
                    exit 1
                    ;;
                a)  flag=${OPTARG}   # the option's argument is in OPTARG
                    ;;
                b)  flag=${OPTARG}
                    ;;
                *)  echo "default case, everything else fits here"
                    ;;
            esac
        done
        shift $((${OPTIND} - 1))
    fi
    

  • Variables should be enclosed in braces when used, to indicate exactly which variable you are using. This can also prevent an exploit that relies on a longer variable name being picked up instead.

    a=erie
    ab=were
    if [[ "${a}b" == "erieb" ]]; then
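
A small sketch pulling two of the tips above together: braces around variable names, and getopts with OPTARG. The function names show_boundary and parse_opts are made up for this example.

```shell
#!/bin/bash
a=erie
ab=were

# Braces make the end of the variable name explicit.
show_boundary () {
    echo "${a}b"    # value of a, followed by a literal b
    echo "${ab}"    # value of ab
}

# getopts puts the option letter in opt and the option's argument in OPTARG.
parse_opts () {
    local OPTIND=1 opt a_value=""
    while getopts "ha:" opt; do
        case "${opt}" in
            a) a_value=${OPTARG} ;;
            h) echo "usage: parse_opts [-a value] [arguments]"; return 0 ;;
            *) return 1 ;;
        esac
    done
    # Drop the parsed options so only the positional arguments remain.
    shift $((OPTIND - 1))
    echo "a=${a_value} rest=$*"
}

show_boundary
parse_opts -a fish one two
```

The first call prints erieb and then were; the second prints a=fish rest=one two.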
    

Script: check for missing files in a directory after reorganisation

*Updated: 11 Dec 2007

I'm wondering where I should store the scripts I'm writing. Out of pure laziness, I'll just dump them as blog entries for now.

Here's a script to check for missing files after a directory has been re-organised. Basically, it compares the checksums of the files in the old directory and the new directory (sha1sum rather than md5sum, as of the 11 Dec 2007 update).

Please let me know if there are any bugs.

#!/bin/bash

#########################
#
# checkNoMissingFiles
# ===================
#
# This script checks that no files are missing after folders are reorganised.
# Basic algorithm is to checksum all files in both old and new folders, then
# checking through both lists of checksums to ensure all checksums are present
# in both lists.
#
# Changelog
# =========
#
# 18 Oct 2007 - Junhao
# * Initial commit
#
# 11 Dec 2007 - Junhao
# * Tidied style
# * Fixed bug with spaces in filenames
# * added option to save generated checksums
# * changed md5sum to sha1sum
# * changed checksum to general algorithm
#########################

PATH=/bin:/usr/bin;

## Program Locations
awk=/usr/bin/awk
cat=/usr/bin/cat
echo=/usr/bin/echo
find=/usr/bin/find
grep=/bin/grep
checksum="/usr/bin/sha1sum"
mktemp=/bin/mktemp
rm=/usr/bin/rm
tee="/usr/bin/tee -a"
touch="/bin/touch"
## End Program Locations

## Start Script

## Script parameters
f_logFile=/dev/null
d_orgLoc=/dev/null
d_newLoc=/dev/null
v_oldFileName=
v_oldFileChksum=
f_oldChksumLog=
f_newChksumLog=
v_missingFilesCount=0
v_missingFiles=""
v_output=
v_f1flag=1
v_f2flag=1
## End Script parameters

function print_usage () {
    ${echo} "
$0
Usage: $0 [-L logfile] [-1 filename] [-2 filename] oldDir newDir
 or    $0 -h
Description: Checks that there are no missing files after reorganising a directory.
Options:
  -L logfile    (Optional) Path to log file
  -h            (Optional) This help text
  -1 filename   (Optional) Filename to save checksums for old directory
  -2 filename   (Optional) Filename to save checksums for new directory
  oldDir        Location of old directory
  newDir        Location of new directory
"
}

if [ $# -lt 2 ]; then
    print_usage
    exit 1
else
    while getopts hL:1:2: options; do
        case "${options}" in
            h)  print_usage
                exit 1
                ;;
            L)  f_logFile=${OPTARG}
                ;;
            1)  f_oldChksumLog=${OPTARG}
                v_f1flag=0
                ;;
            2)  f_newChksumLog=${OPTARG}
                v_f2flag=0
                ;;
            *)  f_logFile=/dev/null
                ;;
        esac
    done
    shift $((${OPTIND} - 1))

    if [ -d "$1" ]; then
        d_orgLoc=$1
    else
        ${echo} "Error: Original directory does not exist!"
        print_usage
        exit 1
    fi

    if [ -d "$2" ]; then
        d_newLoc=$2
    else
        ${echo} "Error: New directory does not exist!"
        print_usage
        exit 1
    fi

    if [ -z "${f_oldChksumLog}" ]; then
        # Run mktemp to create a temporary file, rather than storing mktemp's path
        f_oldChksumLog=$(${mktemp})
    elif [ -f ${f_oldChksumLog} ]; then
        ${echo} "Error: File ${f_oldChksumLog} exists! Please give another filename."
        exit 2
    else
        ${touch} ${f_oldChksumLog}
        if [ ! -f ${f_oldChksumLog} ]; then
            ${echo} "Error: ${f_oldChksumLog} cannot be created!"
            exit 4
        fi
    fi

    if [ -z "${f_newChksumLog}" ]; then
        # Temporary file for the NEW directory's checksums
        f_newChksumLog=$(${mktemp})
    elif [ -f ${f_newChksumLog} ]; then
        ${echo} "Error: File ${f_newChksumLog} exists! Please give another filename."
        exit 3
    else
        ${touch} ${f_newChksumLog}
        if [ ! -f ${f_newChksumLog} ]; then
            ${echo} "Error: File ${f_newChksumLog} cannot be created!"
            exit 5
        fi
    fi
fi

# Log each find command, then run it. {} is passed as-is so that find
# substitutes the filename as a single argument, spaces included.
${echo} "${find} \"${d_orgLoc}\" -type f -exec ${checksum} {} \;" | ${tee} "${f_logFile}"
${find} "${d_orgLoc}" -type f -exec ${checksum} {} \; | ${tee} "${f_oldChksumLog}"
${echo} "${find} \"${d_newLoc}\" -type f -exec ${checksum} {} \;" | ${tee} "${f_logFile}"
${find} "${d_newLoc}" -type f -exec ${checksum} {} \; | ${tee} "${f_newChksumLog}"


while read -r v_oldFileChksum v_oldFileName; do
    if ${grep} "${v_oldFileChksum}" "${f_newChksumLog}" > /dev/null; then
        v_output="Okay:  ${v_oldFileName} -> "
        # Strip the leading checksum so filenames containing spaces stay intact
        v_output="${v_output} $(${grep} "${v_oldFileChksum}" "${f_newChksumLog}" | ${awk} '{ sub(/^[^ ]+ +/, ""); print }')"
    else
        v_output="ERROR: ${v_oldFileName} is missing"
        v_missingFiles="${v_missingFiles} ${v_oldFileName}"
        v_missingFilesCount=$((v_missingFilesCount+1))
    fi
    ${echo} "${v_output}" | ${tee} ${f_logFile}
done < ${f_oldChksumLog}

#### cleanup ####
if [ "1" == ${v_f1flag} ]; then
    ${rm} "${f_oldChksumLog}"
fi
if [ "1" == ${v_f2flag} ]; then
    ${rm} ${f_newChksumLog}
fi


if [ ${v_missingFilesCount} -gt 0 ]; then
    ${echo} "ERROR: ${v_missingFilesCount} files are missing:" | ${tee} ${f_logFile}
    ${echo} "ERROR:   ${v_missingFiles}" | ${tee} ${f_logFile}
    exit 99
else
    ${echo} "Success: ${v_missingFilesCount} files are missing" | ${tee} ${f_logFile}
    exit 0
fi
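
For comparison, the core idea of the script (find checksums that exist in the old tree but not in the new one) can also be sketched with sort and comm. This is a simplified illustration and not a replacement for the script above: list_sums and missing_sums are names invented here, it assumes sha1sum is installed, and it reports checksums rather than filenames.

```shell
#!/bin/bash
# Print the sorted, unique checksums of every regular file under a directory.
list_sums () {
    find "$1" -type f -exec sha1sum {} + | awk '{ print $1 }' | sort -u
}

# Print the checksums found under the old directory ($1)
# that do not appear anywhere under the new directory ($2).
missing_sums () {
    local old_list new_list
    old_list=$(mktemp) || return 1
    new_list=$(mktemp) || { rm -f "${old_list}"; return 1; }
    list_sums "$1" > "${old_list}"
    list_sums "$2" > "${new_list}"
    # comm -23 keeps lines that appear only in the first (old) list
    comm -23 "${old_list}" "${new_list}"
    rm -f "${old_list}" "${new_list}"
}
```

Calling missing_sums with the old and new directories prints nothing when every file's content survived the reorganisation. Any checksum it does print can be looked up in the old directory's checksum list to find the filename.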

Code Repository

I often have to write many scripts for my daily work as a system administrator. In the (vain) hope that these might be useful to someone else, maybe I should release them into the public domain.

My style of coding hasn't really stabilised; I'm still trying to find a style that allows secure coding and easy readability. If you have suggestions, please let me know.

Of course, if there are bugs, please let me know. Thanks!

Cool Applications

Found these on the web... Check them out!!

Linux/Solaris Tips and Tricks

Tips and tricks for the Linux (and maybe Solaris too) platform
