HBase master logs:
12/05/14 13:33:44 INFO master.LoadBalancer: Skipping load balancing.  servers=10 regions=261 average=26.1 mostloaded=27 leastloaded=26
12/05/14 13:33:44 WARN master.CatalogJanitor: Daughter regiondir does not exist: hdfs://2hei.net:8020/hbase/RecSys_Catalog/7d100af9ac714de605efc9da89a817b3
12/05/14 13:33:44 WARN master.CatalogJanitor: Daughter regiondir does not exist: hdfs://2hei.net:8020/hbase/Track/d87d503a2996200cfa3aae8906767f81
12/05/14 13:33:44 WARN master.CatalogJanitor: Daughter regiondir does not exist: hdfs://2hei.net:8020/hbase/type_subgenre_uniqueId_CatalogIndex/4602f165ca87f345a1f62a48c5677e55
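The warnings above all follow the same pattern, so the affected table and encoded region name can be pulled out mechanically. The following is only a sketch: the regular expression assumes the /hbase/<table>/<encoded region> path layout seen in these log lines, and the function name missing_daughters is made up for illustration.

```python
import re

# Pattern for the warning text shown above; the path layout
# /hbase/<table>/<encoded region name> is taken from those log lines.
WARNING = re.compile(
    r"Daughter regiondir does not exist: hdfs://[^/]+"
    r"(?P<path>/hbase/(?P<table>[^/]+)/(?P<region>[0-9a-f]{32}))\s*$"
)

def missing_daughters(log_lines):
    """Return (table, encoded_region, hdfs_path) for each CatalogJanitor warning."""
    found = []
    for line in log_lines:
        match = WARNING.search(line)
        if match:
            found.append((match["table"], match["region"], match["path"]))
    return found

sample = [
    "12/05/14 13:33:44 INFO master.LoadBalancer: Skipping load balancing.  servers=10 regions=261 average=26.1 mostloaded=27 leastloaded=26",
    "12/05/14 13:33:44 WARN master.CatalogJanitor: Daughter regiondir does not exist: hdfs://2hei.net:8020/hbase/Track/d87d503a2996200cfa3aae8906767f81",
]
for table, region, path in missing_daughters(sample):
    # Print a ready-to-run check for the directory named in the warning.
    print(f"{table}\t{region}\thadoop fs -ls {path}")
```

The printed command is the same kind of `hadoop fs -ls` check used in this post to confirm that the directory is really gone.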

Reasons:
Probably caused by restarting ZooKeeper first and only then stopping the master and region servers. When the HBase master was restarted, it failed to initialize the region servers.
Other possible reasons:
- region server crashed
- lease timed out
- master starts recovery (can take quite a while to complete)
- region server restarts
- region server sends region server startup message to master
- master waits in rpc handler for old server cleanup (because it
cannot differentiate the new instance from the old).
- ipc from region server to master times out
- region server sends a new startup message. The master thread starts
waiting in the rpc handler for old server cleanup.
- ipc from region server to master times out

The following directories cannot be found in HDFS:
hadoop fs -ls /hbase/RecSys_Catalog/7d100af9ac714de605efc9da89a817b3
hadoop fs -ls /hbase/type_subgenre_uniqueId_CatalogIndex/4602f165ca87f345a1f62a48c5677e55
hadoop fs -ls /hbase/Track/d87d503a2996200cfa3aae8906767f81

Resolution:
Find the matching rows in .META. and delete the stale info:splitA column from them:
echo "scan '.META.'" | hbase shell|grep 7d100af9ac714de605efc9da89a817b3
echo "scan '.META.'" | hbase shell|grep 4602f165ca87f345a1f62a48c5677e55
echo "scan '.META.'" | hbase shell|grep d87d503a2996200cfa3aae8906767f81

scan '.META.', {COLUMNS => 'info:splitA',TIMESTAMP => 1335245847237}
delete '.META.', 'Track,,1335245609463.1e40dbd0fb394c05fdaf30ca5f933ea8.','info:splitA' 
delete '.META.', 'RecSys_Catalog,,1336785348151.1250fbc6334c578629f95113b2a3ba7b.','info:splitA' 
delete '.META.', 'type_subgenre_uniqueId_CatalogIndex,,1336785356835.d4af2bae84156905367579bc44fcdd97.','info:splitA' 

HBase master logs are back to normal:
12/05/14 14:18:44 INFO master.LoadBalancer: Skipping load balancing.  servers=10 regions=261 average=26.1 mostloaded=27 leastloaded=26
12/05/14 14:23:44 INFO master.LoadBalancer: Skipping load balancing.  servers=10 regions=261 average=26.1 mostloaded=27 leastloaded=26
12/05/14 14:28:45 INFO master.LoadBalancer: Skipping load balancing.  servers=10 regions=261 average=26.1 mostloaded=27 leastloaded=26

Dmidecode:     
Type   Information
       ----------------------------------------
          0   BIOS
          1   System
          2   Base Board
          3   Chassis
          4   Processor
          5   Memory Controller
          6   Memory Module
          7   Cache
          8   Port Connector
          9   System Slots
         10   On Board Devices
         11   OEM Strings
         12   System Configuration Options
         13   BIOS Language
         14   Group Associations
         15   System Event Log
         16   Physical Memory Array
         17   Memory Device
         18   32-bit Memory Error
         19   Memory Array Mapped Address
         20   Memory Device Mapped Address
         21   Built-in Pointing Device
         22   Portable Battery
         23   System Reset
         24   Hardware Security
         25   System Power Controls
         26   Voltage Probe
         27   Cooling Device
         28   Temperature Probe
         29   Electrical Current Probe
         30   Out-of-band Remote Access
         31   Boot Integrity Services
         32   System Boot
         33   64-bit Memory Error
         34   Management Device
         35   Management Device Component
         36   Management Device Threshold Data
         37   Memory Channel
         38   IPMI Device
         39   Power Supply
        
Instead of type_id, you can also pass the keyword to the -t option of the dmidecode command. Following are the available keywords.
       Keyword     Types
       ------------------------------
       bios        0, 13
       system      1, 12, 15, 23, 32
       baseboard   2, 10
       chassis     3
       processor   4
       memory      5, 6, 16, 17
       cache       7
       connector   8
       slot        9   
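As a quick way to see which keyword covers a given type number, the keyword table above can be written down as a lookup. This is a sketch that only contains the keywords listed in the table; the helper name keywords_for_type is hypothetical.

```python
# Keyword -> DMI type numbers, transcribed from the table above
# (only the keywords listed there are included).
KEYWORD_TYPES = {
    "bios": [0, 13],
    "system": [1, 12, 15, 23, 32],
    "baseboard": [2, 10],
    "chassis": [3],
    "processor": [4],
    "memory": [5, 6, 16, 17],
    "cache": [7],
    "connector": [8],
    "slot": [9],
}

def keywords_for_type(type_id):
    """Return the keywords whose type list contains type_id."""
    return [kw for kw, types in KEYWORD_TYPES.items() if type_id in types]

print(keywords_for_type(17))  # type 17 (Memory Device) is covered by "memory"
print(keywords_for_type(38))  # type 38 (IPMI Device) has no keyword in the table
```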

# dmidecode -t 16

#dmidecode -t 17
or
#dmidecode -t memory

[2hei.net]# ipmitool
No command provided!
Commands:
    raw           Send a RAW IPMI request and print response
    i2c           Send an I2C Master Write-Read command and print response
    spd           Print SPD info from remote I2C device
    lan           Configure LAN Channels
    chassis       Get chassis status and set power state
    power         Shortcut to chassis power commands
    event         Send pre-defined events to MC
    mc            Management Controller status and global enables
    sdr           Print Sensor Data Repository entries and readings
    sensor        Print detailed sensor information
    fru           Print built-in FRU and scan SDR for FRU locators
    gendev        Read/Write Device associated with Generic Device locators sdr
    sel           Print System Event Log (SEL)
    pef           Configure Platform Event Filtering (PEF)
    sol           Configure and connect IPMIv2.0 Serial-over-LAN
    tsol          Configure and connect with Tyan IPMIv1.5 Serial-over-LAN
    isol          Configure IPMIv1.5 Serial-over-LAN
    user          Configure Management Controller users
    channel       Configure Management Controller channels
    session       Print session information
    sunoem        OEM Commands for Sun servers
    kontronoem    OEM Commands for Kontron devices
    picmg         Run a PICMG/ATCA extended cmd
    fwum          Update IPMC using Kontron OEM Firmware Update Manager
    firewall      Configure Firmware Firewall
    shell         Launch interactive IPMI shell
    exec          Run list of commands from file
    set           Set runtime variable for shell and exec
    hpm           Update HPM components using PICMG HPM.1 file
    ekanalyzer    run FRU-Ekeying analyzer using FRU files
   
[2hei.net]# ipmitool event 3
Sending SAMPLE event: Memory - Correctable ECC
   0 | Pre-Init Time-stamp   | Memory #0x53 | Correctable ECC | Asserted
  
[2hei.net]# ipmitool sdr elist
CPU0 below Tmax  | 7Bh | ok  |  3.0 | 47 degrees C
CPU1 below Tmax  | 7Ah | ok  |  3.1 | 39 degrees C
DIMM0 Area(RT3)  | 7Eh | ok  |  7.0 | 30 degrees C
PCI Area(RT2)    | 7Fh | ok  |  7.0 | 36 degrees C
CPU0 VCORE       | 71h | ok  |  3.0 | 0.90 Volts
CPU1 VCORE       | 70h | ok  |  3.1 | 1.02 Volts
3.3V             | 75h | ok  |  7.0 | 3.30 Volts
+12V             | 76h | ok  |  7.0 | 11.81 Volts
VBAT             | 79h | ok  |  7.0 | 3.12 Volts
5V               | 77h | ok  |  7.0 | 4.90 Volts
Sys.1(CPU 1)     | 80h | ns  |  7.0 | No Reading
Sys.2(CPU 0)     | 81h | ok  |  7.0 | 6720 RPM
Sys.3(Front 1)   | 82h | ok  |  7.0 | 4080 RPM
Sys.4(Front 2)   | 83h | ok  |  7.0 | 3840 RPM
Sys.5(Rear 1)    | 84h | ok  |  7.0 | 4200 RPM
Sys.6            | 85h | ns  |  7.0 | No Reading
Sys.7            | 86h | ns  |  7.0 | No Reading
Sys.8            | 87h | ns  |  7.0 | No Reading
Sys.9            | 88h | ns  |  7.0 | No Reading
Sys.10           | 89h | ns  |  7.0 | No Reading

[2hei.net~]#mcelog --cpu nehalem --dmi < /var/log/mcelog >> /home/2hei.net/mcelog.dmi
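The `ipmitool sdr elist` rows shown above are pipe-separated, so they are easy to split into fields for monitoring scripts. A minimal sketch, assuming the five-column layout seen in this output; the function and key names are my own, not part of ipmitool.

```python
def parse_sdr_elist(text):
    """Split `ipmitool sdr elist` rows into dicts; assumes the five-column layout shown above."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 5:
            continue  # skip blank lines and anything that is not a sensor row
        name, sensor_id, status, entity, reading = fields
        rows.append({"name": name, "id": sensor_id, "status": status,
                     "entity": entity, "reading": reading})
    return rows

sample = """\
CPU0 below Tmax  | 7Bh | ok  |  3.0 | 47 degrees C
Sys.1(CPU 1)     | 80h | ns  |  7.0 | No Reading
Sys.2(CPU 0)     | 81h | ok  |  7.0 | 6720 RPM
"""
rows = parse_sdr_elist(sample)
# Collect sensors whose status column is anything other than "ok".
no_reading = [row["name"] for row in rows if row["status"] != "ok"]
print(no_reading)
```

In the listing above the only non-ok status is `ns`, which always appears together with "No Reading".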
 
The fix is to build iasl first and copy it into a directory on $PATH.

wget http://acpica.org/download/acpica-unix-20110922.tar.gz
tar zxvf acpica-unix-20110922.tar.gz
cd acpica-unix-20110922/compiler
make
cp iasl /usr/bin

After that, Xen compiles without problems.
wget http://bits.xensource.com/oss-xen/release/4.1.2/xen-4.1.2.tar.gz
tar zxvf xen-4.1.2.tar.gz
cd xen-4.1.2/
make dist-xen dist-tools dist-stubdom
make install-xen
make install-tools PYTHON_PREFIX_ARG="--install-layout=deb"
make install-stubdom
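I tried to install Ubuntu with a RHEL-style kickstart file. The cfg file was generated with the kickstart tool on an existing Ubuntu machine, but it looked better than it worked: clearpart had no effect, so a fully unattended install was not possible because the existing RAID and LVM could not be removed automatically. During installation I had to comment out the predefined RAID and partition section and configure the disks through the text installer screens instead.
In addition, openssh-server was not installed and had to be installed by hand from the console.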

cat ubuntu_ks.cfg
#Generated by Kickstart Configurator
#platform=AMD64 or Intel EM64T

#System language
lang en_US
#Language modules to install
langsupport en_US
#System keyboard
keyboard us
#System mouse
mouse
#System timezone
timezone --utc America/New_York
#Root password
rootpw --iscrypted $1$dIx6XYId$Grao2hlnjSQCXCbmdShWW1
#Initial user
user fisher --fullname "2hei" --iscrypted --password $1$yFK9gVs1$L9RHTs7B6oClIC4fonT.s/
#Reboot after installation
reboot
#Use text mode install
#text
graphical
#Install OS instead of upgrade
install
#Use Web installation
url --url http://2hei.net/install/ubuntu/9.10/
#Clear the Master Boot Record
zerombr yes
#Partition clearing information
clearpart --all --initlabel
bootloader --location=mbr

#Disk partitioning information
#part raid.11 --size 20480 --asprimary --ondisk sda
#part raid.12 --size 100 --asprimary --ondisk sda
#part raid.13 --size 8192 --ondisk sda
#part raid.14 --size 1 --grow --ondisk sda
#part raid.21 --size 20480 --asprimary --ondisk sdb
#part raid.22 --size 100 --asprimary --ondisk sdb
#part raid.23 --size 8192 --ondisk sdb
#part raid.24 --size 1 --grow --ondisk sdb
#raid / --level=1 --device=md1 raid.11 raid.21
#raid /boot --level=1 --device=md0 raid.12 raid.22
#raid swap --level=1 --device=md2 raid.13 raid.23
#raid  --level=1 --device=md3 raid.14 raid.24
#System authorization infomation
auth  --useshadow  --enablemd5
#Network information
network --bootproto=static --ip=192.168.100.2 --netmask=255.255.255.0 --gateway=192.168.100.1 --nameserver=192.168.100.1 --device=eth0
#Firewall configuration
firewall --disabled --http --ssh
#X Window System configuration information
xconfig --depth=32 --resolution=800x600 --defaultdesktop=GNOME --startxonboot
%packages
@Ubuntu-desktop
openssh-server

A fix found via Google on a forum:
vi /usr/lib/python2.6/site-packages/pxssh.py
# add lines 134 and 135:
    123     def synch_original_prompt (self):
    124
    125         """This attempts to find the prompt. Basically, press enter and record
    126         the response; press enter again and record the response; if the two
    127         responses are similar then assume we are at the original prompt. """
    128
    129         # All of these timing pace values are magic.
    130         # I came up with these based on what seemed reliable for
    131         # connecting to a heavily loaded machine I have.
    132         # If latency is worse than these values then this will fail.
    133
    134         self.sendline()
    135         time.sleep(0.5)

    136         self.read_nonblocking(size=10000,timeout=1) # GAS: Clear out the cache before getting the prompt
    137         time.sleep(0.1)

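The two added lines simply send an empty line and wait briefly, so that there is something for the ssh expect logic to read before it tries to find the prompt.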
CTRL + C    Cancels the currently running command.
CTRL + D    Sends end-of-file (EOF); at an empty prompt this logs out of the current session.
CTRL + Z    Suspends the current foreground job; resume it with fg, or with bg to continue it in the background.
CTRL + A    Moves the cursor to the beginning of the line.
CTRL + E    Moves the cursor to the end of the line.
CTRL + R    Searches backward through the command history.
CTRL + W    Deletes the word before the cursor. For example, if you typed 'mv file1 file2' this shortcut would delete file2.
CTRL + U    Erases from the cursor back to the beginning of the line.
CTRL + P    Recalls the previous command from history (same as the Up arrow).
CTRL + B    Moves the cursor backward one character.
CTRL + F    Moves the cursor forward one character.
CTRL + H    Erase one character. Similar to pressing backspace.
CTRL + S    Stops all output on screen (XOFF).
CTRL + Q    Turns all output stopped on screen back on (XON).
Just a note:
nice -n 19 nice
19
sudo nice -n -20 nice
-20
-------------
$man nice

NICE(1)                          User Commands                         NICE(1)

NAME
       nice - run a program with modified scheduling priority

SYNOPSIS
       nice [OPTION] [COMMAND [ARG]...]

DESCRIPTION
       Run  COMMAND  with an adjusted niceness, which affects process scheduling.  With no COMMAND, print the current
       niceness.  Nicenesses range from -20 (most favorable scheduling) to 19 (least favorable).

       -n, --adjustment=N
              add integer N to the niceness (default 10)

       --help display this help and exit

       --version
              output version information and exit

       NOTE: your shell may have its own version of nice, which usually supersedes  the  version  described  here.   Please
       refer to your shell's documentation for details about the options it supports.
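The arithmetic behind the two commands at the top of this note can be sketched as follows. The range and the default adjustment come from the man page above; that out-of-range results are clamped rather than rejected is my assumption (it matches what Linux does), and resulting_niceness is a made-up helper name.

```python
NICE_MIN, NICE_MAX = -20, 19  # range quoted in the man page above

def resulting_niceness(current, adjustment=10):
    """Niceness after `nice -n adjustment`, clamped to the valid range (assumed behavior)."""
    return max(NICE_MIN, min(NICE_MAX, current + adjustment))

print(resulting_niceness(0, 19))   # like `nice -n 19 nice` started at niceness 0
print(resulting_niceness(0, -20))  # like `sudo nice -n -20 nice`
print(resulting_niceness(10, 19))  # the sum 29 is out of range and gets clamped
```

Negative adjustments normally need elevated privileges, which is why the second command in the note is run through sudo.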
dmesg shows log entries like these:
usb 1-5.1: reset low speed USB device using ehci_hcd and address 4
usb 1-5.1: reset low speed USB device using ehci_hcd and address 4
usb 1-5.1: reset low speed USB device using ehci_hcd and address 4
usb 1-5.1: reset low speed USB device using ehci_hcd and address 4
usb 1-5.1: reset low speed USB device using ehci_hcd and address 4
usb 1-5.1: reset low speed USB device using ehci_hcd and address 4
usb 1-5.1: reset low speed USB device using ehci_hcd and address 4
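To see how often a device is being reset, the repeated lines can be counted per port and driver. This is a small sketch; the regular expression is written against the message format shown above and may need adjusting for other kernels, and count_resets is a hypothetical helper.

```python
import re
from collections import Counter

# Matches messages like: "usb 1-5.1: reset low speed USB device using ehci_hcd and address 4"
RESET = re.compile(
    r"usb (?P<port>[\d.-]+): reset (?P<speed>\w+) speed USB device using (?P<driver>\w+)"
)

def count_resets(lines):
    """Count reset messages per (port, driver) pair."""
    counts = Counter()
    for line in lines:
        match = RESET.search(line)
        if match:
            counts[(match["port"], match["driver"])] += 1
    return counts

sample = [
    "usb 1-5.1: reset low speed USB device using ehci_hcd and address 4",
    "usb 1-5.1: reset low speed USB device using ehci_hcd and address 4",
    "hub 4-1:1.0: USB hub found",
]
for (port, driver), count in count_resets(sample).items():
    print(f"port {port} via {driver}: {count} resets")
```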

# modprobe --help
modprobe: unrecognized option `--help'
Usage: modprobe [-v] [-V] [-C config-file] [-n] [-i] [-q] [-b] [-o <modname>] <modname> [parameters...]
modprobe -r [-n] [-i] [-v] <modulename> ...
modprobe -l -t <dirname> [ -a <modulename> ...]

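# modprobe -r ehci_hcd
After unloading the ehci_hcd module, the kernel log shows the device re-attaching through uhci_hcd: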
Sep  6 03:32:05 2hei.net kernel: usb 1-5.1: reset low speed USB device using ehci_hcd and address 4
Sep  6 03:33:44 2hei.net kernel: ehci_hcd 0000:00:1d.7: remove, state 1
Sep  6 03:33:44 2hei.net kernel: usb usb1: USB disconnect, address 1
Sep  6 03:33:44 2hei.net kernel: usb 1-5: USB disconnect, address 3
Sep  6 03:33:44 2hei.net kernel: usb 1-5.1: USB disconnect, address 4
Sep  6 03:33:44 2hei.net kernel: ehci_hcd 0000:00:1d.7: USB bus 1 deregistered
Sep  6 03:33:44 2hei.net kernel: ACPI: PCI interrupt for device 0000:00:1d.7 disabled
Sep  6 03:33:44 2hei.net kernel: usb 4-1: new full speed USB device using uhci_hcd and address 2
Sep  6 03:33:44 2hei.net kernel: usb 4-1: configuration #1 chosen from 1 choice
Sep  6 03:33:44 2hei.net kernel: hub 4-1:1.0: USB hub found
Sep  6 03:33:44 2hei.net kernel: hub 4-1:1.0: 3 ports detected
Sep  6 03:33:44 2hei.net kernel: usb 4-1.1: new full speed USB device using uhci_hcd and address 3
Sep  6 03:33:44 2hei.net kernel: usb 4-1.1: configuration #1 chosen from 1 choice
Sep  6 03:33:44 2hei.net kernel: input: American Megatrends Inc. Virtual Keyboard and Mouse as /class/input/input4
Sep  6 03:33:44 2hei.net kernel: input: USB HID v1.10 Keyboard [American Megatrends Inc. Virtual Keyboard and Mouse] on usb-0000:00:1d.2-1.1
Sep  6 03:33:44 2hei.net kernel: input: American Megatrends Inc. Virtual Keyboard and Mouse as /class/input/input5
Sep  6 03:33:44 2hei.net kernel: input: USB HID v1.10 Mouse [American Megatrends Inc. Virtual Keyboard and Mouse] on usb-0000:00:1d.2-1.1
Linux cleans up files in /tmp with a daily cron job in /etc/cron.daily:
[2hei.net cron.daily]$ cat tmpwatch
flags=-umc
/usr/sbin/tmpwatch "$flags" -x /tmp/.X11-unix -x /tmp/.XIM-unix \
    -x /tmp/.font-unix -x /tmp/.ICE-unix -x /tmp/.Test-unix 240 /tmp
/usr/sbin/tmpwatch "$flags" 720 /var/tmp
for d in /var/{cache/man,catman}/{cat?,X11R6/cat?,local/cat?}; do
    if [ -d "$d" ]; then
    /usr/sbin/tmpwatch "$flags" -f 720 "$d"
    fi
done

By default, files are kept for 10 days (240 hours) in /tmp and for 30 days (720 hours) in /var/tmp.
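The retention periods follow directly from the hour arguments in the script. The sketch below converts them and adds a rough staleness check. Note the assumption: with the -umc flags combined, I take the decision to be based on the most recent of atime, mtime and ctime, which is my reading of the flags rather than something the script states. The helper names are illustrative only.

```python
import os
import tempfile
import time

# Hour arguments taken from the tmpwatch script above.
RETENTION_HOURS = {"/tmp": 240, "/var/tmp": 720}

def hours_to_days(hours):
    """Convert a tmpwatch hour argument to days."""
    return hours / 24

def is_stale(path, hours, now=None):
    """True if atime, mtime and ctime are all older than `hours` hours (assumed -umc behavior)."""
    now = time.time() if now is None else now
    st = os.lstat(path)
    newest = max(st.st_atime, st.st_mtime, st.st_ctime)
    return now - newest > hours * 3600

for directory, hours in RETENTION_HOURS.items():
    print(f"{directory}: {hours} hours = {hours_to_days(hours):g} days")

with tempfile.NamedTemporaryFile() as handle:
    print(is_stale(handle.name, 240))  # a freshly created file is not stale
```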

[2hei.net cron.daily]$ man tmpwatch
NAME
       tmpwatch - removes files which haven't been accessed for a period of time

SYNOPSIS
       tmpwatch [-u|-m|-c] [-MUadfqstvx] [--verbose] [--force] [--all]
                      [--nodirs] [--nosymlinks] [--test] [--fuser] [--quiet]
                      [--atime|--mtime|--ctime] [--dirmtime] [--exclude <path>]
                      [--exclude-user <user>] <hours> <dirs>
OPTIONS
       -u, --atime
              Make the decision about deleting a file based on the file's atime (access time). This is the default.

              Note that the periodic updatedb file system scans keep the atime of directories recent.

       -m, --mtime
              Make the decision about deleting a file based on the file's mtime (modification time) instead of the atime.

       -c, --ctime
              Make the decision about deleting a file based on the file's ctime (inode change time) instead of  the  atime;
              for directories, make the decision based on the mtime.

       -M, --dirmtime
              Make  the  decision  about deleting a directory based on the directory's mtime (modification time) instead of
              the atime; completely ignore atime for directories.

       -a, --all
              Remove all file types, not just regular files, symbolic links and directories.

       -d, --nodirs
              Do not attempt to remove directories, even if they are empty.

       -l, --nosymlinks
              Do not attempt to remove symbolic links.

       -f, --force
              Remove files even if root doesn't have write access (akin to rm -f).

       -q, --quiet
              Report only fatal errors.

       -s, --fuser
              Attempt to use the "fuser" command to see if a file is already open  before  removing  it.   Not  enabled  by
              default.    Does  help in some circumstances, but not all.  Dependent on fuser being installed in /sbin.  Not
              supported on HP-UX or Solaris.

       -t, --test
              Don't remove files, but go through the motions of removing them. This implies -v.

       -U, --exclude-user=user
              Don't remove files owned by user, which can be a user name or numeric user ID.

       -v, --verbose
              Print a verbose display. Two levels of verboseness are available -- use this option twice  to  get  the  most
              verbose output.

       -x, --exclude=path
              Skip  path;  if  path  is a directory, all files contained in it are skipped too.  If path does not exist, it
              must be an absolute path that contains no symbolic links.
Current version:
$ ssh -V
OpenSSH_4.3p2, OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008
Target versions:
openssh-5.8p2.tar.gz
openssl-0.9.8r.tar.gz

Compile and upgrade to openssl-0.9.8r:
wget http://www.openssl.org/source/openssl-0.9.8r.tar.gz
tar zxvf openssl-0.9.8r.tar.gz
mkdir -p /usr/src/redhat/SPECS/
mkdir -p /usr/src/redhat/SOURCES/
cp openssl-0.9.8r/openssl.spec /usr/src/redhat/SPECS/
cp openssl-0.9.8r.tar.gz /usr/src/redhat/SOURCES/
cd /usr/src/redhat/SPECS

# The spec file shipped with the source has a small problem: the Copyright keyword has to be replaced with License
perl -i.bak -pe 's/^Copyright: Freely distributable$/License: Freely distributable/' openssl.spec

[2hei.net ~]#rpmbuild -bb openssl.spec
# If rpmbuild is not found on the system, install it first:
yum install rpm-build
yum install redhat-rpm-config
yum install pam-devel

# End of the build output:
---
Wrote: /usr/src/redhat/RPMS/i386/openssl-0.9.8r-1.i386.rpm
Wrote: /usr/src/redhat/RPMS/i386/openssl-devel-0.9.8r-1.i386.rpm
Wrote: /usr/src/redhat/RPMS/i386/openssl-doc-0.9.8r-1.i386.rpm
Wrote: /usr/src/redhat/RPMS/i386/openssl-debuginfo-0.9.8r-1.i386.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.36929
+ umask 022
+ cd /usr/src/redhat/BUILD
+ cd openssl-0.9.8r
+ rm -rf /var/tmp/openssl-0.9.8r-root
+ exit 0
---
[2hei.net ~]#cd /usr/src/redhat/RPMS/i386/
ls -l
-rw-r--r-- 1 root root 1198354 Jul 22 15:31 openssl-0.9.8r-1.i386.rpm
-rw-r--r-- 1 root root  117348 Jul 22 15:31 openssl-debuginfo-0.9.8r-1.i386.rpm
-rw-r--r-- 1 root root 2149166 Jul 22 15:31 openssl-devel-0.9.8r-1.i386.rpm
-rw-r--r-- 1 root root  596803 Jul 22 15:31 openssl-doc-0.9.8r-1.i386.rpm
rpm -Uvh openssl*.rpm
error: Failed dependencies:
    libcrypto.so.6 is needed by (installed) python-2.4.3-27.el5.i386
    libcrypto.so.6 is needed by (installed) openldap-2.3.43-12.el5.i386
    libcrypto.so.6 is needed by (installed) curl-7.15.5-9.el5.i386
    libcrypto.so.6 is needed by (installed) net-snmp-libs-5.3.2.2-9.el5.i386
    ...
    libssl.so.6 is needed by (installed) python-2.4.3-27.el5.i386
    libssl.so.6 is needed by (installed) openldap-2.3.43-12.el5.i386
    libssl.so.6 is needed by (installed) curl-7.15.5-9.el5.i386
    ...
Use --nodeps to force the installation:
rpm --nodeps -Uvh openssl-*.rpm
Preparing...                ########################################### [100%]
   1:openssl                ########################################### [ 25%]
   2:openssl-debuginfo      ########################################### [ 50%]
   3:openssl-devel          ########################################### [ 75%]
   4:openssl-doc            ########################################### [100%]

# Add symlinks under the old library names so that the dependencies of already-installed software are still satisfied:
[2hei.net ~]#cd /usr/lib
ln -s libcrypto.so.0.9.8 libcrypto.so.6
ln -s libssl.so.0.9.8 libssl.so.6

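Building and installing openssh works the same way, but it is simpler than openssl because it has far fewer dependencies, so the steps are omitted here.

# Restart the sshd service:
service sshd restart
# Check the version: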
[root@test-test01 ~]# ssh -V
OpenSSH_5.8p2, OpenSSL 0.9.8r 8 Feb 2011
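When several machines are upgraded, the version check can be scripted by parsing the `ssh -V` banner. A minimal sketch: the regular expression is written against the two banners shown in this post, and parse_ssh_version is a hypothetical helper. Keep in mind that `ssh -V` prints to stderr, so a wrapper has to capture that stream.

```python
import re

def parse_ssh_version(banner):
    """Extract (openssh, openssl) version strings from `ssh -V` output."""
    match = re.search(r"OpenSSH_(\S+?),\s+OpenSSL\s+(\S+)", banner)
    if match is None:
        raise ValueError(f"unrecognized ssh -V output: {banner!r}")
    return match.group(1), match.group(2)

# The two banners quoted in this post, before and after the upgrade.
before = parse_ssh_version("OpenSSH_4.3p2, OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008")
after = parse_ssh_version("OpenSSH_5.8p2, OpenSSL 0.9.8r 8 Feb 2011")
print(before)
print(after)
```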

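That completes the upgrade. Be careful when upgrading openssl, though: it may affect other services such as httpd and openvpn. This procedure was tested on a VM and is recorded here as a reminder.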
1) Install

http://labs.renren.com/apache-mirror/cassandra/0.7.6-2/apache-cassandra-0.7.6-2-bin.tar.gz (the Apache site recommends the renren mirror, but the link has problems ^_^)

  * tar -zxvf apache-cassandra-$VERSION.tar.gz
  * cd apache-cassandra-$VERSION
  * sudo mkdir -p /var/log/cassandra
  * sudo chown -R `whoami` /var/log/cassandra
  * sudo mkdir -p /var/lib/cassandra
  * sudo chown -R `whoami` /var/lib/cassandra


Note: The sample configuration files in conf/ determine the file-system
locations Cassandra uses for logging and data storage. You are free to
change these to suit your own environment and adjust the path names
used here accordingly.

Now that we're ready, let's start it up!
# start in the foreground
  * bin/cassandra -f

2) Two-node configuration:
node1:192.168.46.155
node2:192.168.46.179


[2hei.net conf]$ cat cassandra.yaml
cluster_name: 'Test Cluster'
initial_token:
auto_bootstrap: false
hinted_handoff_enabled: true
max_hint_window_in_ms: 3600000 # one hour
hinted_handoff_throttle_delay_in_ms: 50
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
authority: org.apache.cassandra.auth.AllowAllAuthority
partitioner: org.apache.cassandra.dht.RandomPartitioner
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_rotation_threshold_in_mb: 128
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
seeds:
    - node2
concurrent_reads: 32
concurrent_writes: 32
memtable_flush_queue_size: 4
sliced_buffer_size_in_kb: 64
storage_port: 7000
listen_address: 192.168.46.155
rpc_address: 0.0.0.0
rpc_port: 9160
rpc_keepalive: true
thrift_framed_transport_size_in_mb: 15
thrift_max_message_length_in_mb: 16
incremental_backups: false
snapshot_before_compaction: false
column_index_size_in_kb: 64
in_memory_compaction_limit_in_mb: 64
compaction_preheat_key_cache: true
rpc_timeout_in_ms: 10000
endpoint_snitch: org.apache.cassandra.locator.SimpleSnitch
dynamic_snitch: true
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.0
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128

node2 uses the same file; only the seed has to change (and listen_address must be node2's own address):
seeds:
    - node1

cat /etc/hosts    
192.168.46.155  node1
192.168.46.179  node2
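The per-node differences can be summarized in a few lines. This is only a sketch built from the /etc/hosts entries above: each node listens on its own address and uses the other node as its seed. The listen_address for node2 is inferred from the hosts table, not copied from a node2 config file, and node_overrides is a hypothetical helper.

```python
# Host table from /etc/hosts above.
HOSTS = {"node1": "192.168.46.155", "node2": "192.168.46.179"}

def node_overrides(node):
    """Settings that differ per node: its own listen_address and the other node as seed."""
    seeds = [name for name in HOSTS if name != node]
    return {"listen_address": HOSTS[node], "seeds": seeds}

for node in HOSTS:
    print(node, node_overrides(node))
```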

Logs
#node1:
 INFO 10:23:46,151 Listening for thrift clients...
 INFO 10:23:46,315 Compacted to /var/lib/cassandra/data/system/LocationInfo-tmp-f-33-Data.db.  942 to 536 (~56% of original) bytes for 4 keys.  Time: 178ms.
 INFO 10:23:52,089 Node /192.168.46.179 has restarted, now UP again
 INFO 10:23:52,095 Node /192.168.46.179 state jump to normal
 INFO 10:24:02,177 Deleted /var/lib/cassandra/data/system/LocationInfo-f-32
 INFO 10:24:02,179 Deleted /var/lib/cassandra/data/system/LocationInfo-f-31
 INFO 10:24:52,097 Started hinted handoff for endpoint /192.168.46.179
 INFO 10:24:52,100 Finished hinted handoff of 0 rows to endpoint /192.168.46.179  
#node2:
 INFO 10:23:51,930 Binding thrift service to /0.0.0.0:9160
 INFO 10:23:51,939 Using TFastFramedTransport with a max frame size of 15728640 bytes.
 INFO 10:23:51,988 Listening for thrift clients...
 INFO 10:23:52,133 Node /192.168.46.155 has restarted, now UP again
 INFO 10:23:52,137 Node /192.168.46.155 state jump to normal
 INFO 10:23:53,548 InetAddress /192.168.46.155 is now dead.
 INFO 10:23:53,646 InetAddress /192.168.46.155 is now UP
 INFO 10:24:33,362 Started hinted handoff for endpoint /192.168.46.155
 INFO 10:24:33,365 Finished hinted handoff of 0 rows to endpoint /192.168.46.155
 
3) Use Cassandra:
#check cluster:
[2hei.net apache-cassandra-0.7.6-2]$ bin/nodetool -host localhost ring
Address         Status State   Load            Owns    Token                                      
                                                       168969914150282478893277211064871807700    
192.168.46.155   Up     Normal  53.28 KB        42.38%  70927753273796620281025030712152398970     
192.168.46.179   Up     Normal  45.16 KB        57.62%  168969914150282478893277211064871807700
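The Owns column can be reproduced from the two tokens. With RandomPartitioner the token space runs from 0 to 2**127, and each node owns the range from the previous token (wrapping around the ring) up to its own. The sketch below recomputes the percentages and also shows evenly spaced initial_token values; the function names are my own, and the balanced tokens are a suggestion, not what this cluster used (initial_token was left empty in the config).

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token space: 0 .. 2**127

def ownership(tokens):
    """Map each token to the fraction of the ring it owns (range from the previous token)."""
    ordered = sorted(tokens)
    owned = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps to the last token for i == 0
        span = (token - previous) % RING_SIZE or RING_SIZE  # a single node owns everything
        owned[token] = span / RING_SIZE
    return owned

def balanced_tokens(node_count):
    """Evenly spaced initial_token values for node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]

# Tokens copied from the `nodetool ring` output above.
tokens = [70927753273796620281025030712152398970,
          168969914150282478893277211064871807700]
for token, share in ownership(tokens).items():
    print(f"{token}: {share:.2%}")
print(balanced_tokens(2))
```

The uneven 42.38% / 57.62% split comes from the randomly chosen tokens; setting initial_token to evenly spaced values would give each of the two nodes half of the ring.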

#use cassandra-cli:
[2hei.net apache-cassandra-0.7.6-2]$bin/cassandra-cli --host localhost
[default@unknown] create keyspace FisherKeyspace;
2b1e86b8-ac65-11e0-9677-2edcd0f45bc6
Waiting for schema agreement...
... schemas agree across the cluster
[default@unknown] use FisherKeyspace;
Authenticated to keyspace: FisherKeyspace
[default@FisherKeyspace] create column family 2hei with comparator=UTF8Type and default_validation_class=UTF8Type;
3a0e8809-ac65-11e0-9677-2edcd0f45bc6
Waiting for schema agreement...
... schemas agree across the cluster
[default@FisherKeyspace] set Users[2hei][first] = 'Fisher';
Users not found in current keyspace.
[default@FisherKeyspace] set 2hei[2hei][first] = 'Fisher';
Value inserted.
[default@FisherKeyspace]  set 2hei[2hei][last] = 'fishman';
Value inserted.
[default@FisherKeyspace] set 2hei[2hei][age] = long(42);
Value inserted.
[default@FisherKeyspace] get 2hei[2hei];
=> (column=age, value=42, timestamp=1310461245929000)
=> (column=first, value=Fisher, timestamp=1310461237981000)
=> (column=last, value=fishman, timestamp=1310461242014000)
Returned 3 results.

[default@unknown] show keyspaces;
Keyspace: FisherKeyspace:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
    Replication Factor: 1
  Column Families:
    ColumnFamily: 2hei
      default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 200000.0/14400
      Memtable thresholds: 0.0234375/5/1440 (millions of ops/minutes/MB)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Built indexes: []

# other useful nodetool commands
bin/nodetool -host localhost ring
bin/nodetool -host localhost info
[2hei.net apache-cassandra-0.7.6-2]$ bin/nodetool -host localhost info
70927753273796620281025030712152398970
Gossip active    : true
Load             : 57.4 KB
Generation No    : 1310523824
Uptime (seconds) : 645
Heap Memory (MB) : 25.27 / 183.31
[2hei.net apache-cassandra-0.7.6-2]$ bin/nodetool -host localhost cfstats
Keyspace: FisherKeyspace
    Read Count: 0
    Read Latency: NaN ms.
    Write Count: 0
    Write Latency: NaN ms.
    Pending Tasks: 0
        Column Family: 2hei
        SSTable count: 0
        Space used (live): 0
        Space used (total): 0
        Memtable Columns Count: 0
        Memtable Data Size: 0
        Memtable Switch Count: 0
        Read Count: 0
        Read Latency: NaN ms.
        Write Count: 0
        Write Latency: NaN ms.
        Pending Tasks: 0
        Key cache capacity: 200000
        Key cache size: 0
        Key cache hit rate: NaN
        Row cache: disabled
        Compacted row minimum size: 0
        Compacted row maximum size: 0
        Compacted row mean size: 0

--------------------------------
Next, I will continue to flesh out Cassandra applications and the API.

 
/usr/sbin/smartctl --all /dev/sda -d ata
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

# my environment:
[2hei.net ~]$ uname -a
Linux 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9 12:54:20 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
[2hei.net ~]$ cat /etc/redhat-release
CentOS release 5.5 (Final)
[2hei.net ~]# rpm -qa|grep smart
smartmontools-5.38-2.el5
[2hei.net ~]# rpm -qf /usr/sbin/smartctl
smartmontools-5.38-2.el5

It looks like SATA disks cannot be accessed with the '-d ata' option here:
#smartctl --help
  -d TYPE, --device=TYPE
         Specify device type to one of: ata, scsi, marvell, sat, 3ware,N

# drop the -d ata option
[2hei.net ~]# /usr/sbin/smartctl --all /dev/sda
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1002FBYS-02A6B0
Serial Number:    WD-WMATV6555969
Firmware Version: 03.00C06
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jul  6 03:32:03 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)    Offline data collection activity
                    was suspended by an interrupting command from host.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:          (18600) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 214) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x303f)    SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       1100
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       28
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8195
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       26
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       25
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       2
194 Temperature_Celsius     0x0022   116   112   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
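For regular checks, the attribute table can be parsed and compared against the thresholds. A minimal sketch, assuming the ten-column layout printed by this smartctl version; an attribute counts as failing when its normalized VALUE is at or below a non-zero THRESH. The function names are hypothetical.

```python
def parse_smart_attributes(text):
    """Parse rows of the 'Vendor Specific SMART Attributes' table into dicts."""
    attributes = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has 10 columns and starts with the numeric attribute ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attributes.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "type": fields[6],
            "raw": fields[9],
        })
    return attributes

def failing(attributes):
    """Names of attributes whose normalized value is at or below a non-zero threshold."""
    return [a["name"] for a in attributes
            if a["thresh"] > 0 and a["value"] <= a["thresh"]]

sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   116   112   000    Old_age   Always       -       34
"""
attrs = parse_smart_attributes(sample)
print(failing(attrs))
```

For the disk in this post every value is well above its threshold, which agrees with the PASSED overall health result.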

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      8195         -
# 2  Short offline       Completed without error       00%      8189         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Some people also say that a libata patch is needed.