Saturday, December 25, 2010

Christmas, new hope!

It is Christmas now.
I want to form some new habits in the coming year.

1. Sleep early (set auto-shutdown in both Windows and Linux); read books after the computer is shut down.
2. Research and paper reading: form a plan.
3. Stay positive.
4. Build up confidence and courage.

Wednesday, November 24, 2010

cut and gawk

cut is one of the most useful commands for text processing.

cut -d. -f1 file # print the first dot-delimited field of each line
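A small sketch of the command above next to an awk equivalent, since the post title also mentions gawk. The file name hosts.txt and its contents are made up for illustration.

```shell
# Work in a scratch directory; hosts.txt is a made-up sample file.
workdir=$(mktemp -d)
printf '%s\n' 'www.example.com' 'mail.example.org' > "$workdir/hosts.txt"

# -d. sets the delimiter to a dot, -f1 selects the first field.
cut_result=$(cut -d. -f1 "$workdir/hosts.txt")

# The same selection with awk: -F. sets the field separator.
awk_result=$(awk -F. '{ print $1 }' "$workdir/hosts.txt")

echo "$cut_result"
```

cut is enough for fixed single-character delimiters; awk becomes the better fit once the fields are separated by runs of whitespace.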

sed:

Friday, October 29, 2010

grep inside vim

Vim has a built-in :grep command. To see all the results, just type :copen to open the quickfix window.

Sunday, October 10, 2010

guess the awk(gawk) command

gawk '{ if ( $1 ~ /start/) { print "process start!" ; for ( i in freq) {print i, freq[i] } delete freq } else freq[$1]++ }' tmp.out > new.out

What did it do?
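One way to answer the question is to run the command on a tiny input. The sketch below uses made-up sample data and plain awk instead of gawk (an assumption made for portability; deleting a whole array with delete works in gawk and in most other current awks). The program counts how often each first field occurs; whenever a line whose first field matches start arrives, it prints a marker, dumps the counts gathered so far, and resets them.

```shell
workdir=$(mktemp -d)
# Made-up input: two batches of values, each closed by a "start" line, then a leftover.
printf '%s\n' a b a start c c c start d > "$workdir/tmp.out"

awk '{
  if ($1 ~ /start/) {
    print "process start!"
    for (i in freq) print i, freq[i]   # iteration order is unspecified
    delete freq                        # reset the counters for the next batch
  } else
    freq[$1]++
}' "$workdir/tmp.out" > "$workdir/new.out"

cat "$workdir/new.out"
```

Because there is no END block, anything counted after the last start line (the d above) is never printed.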

Wednesday, September 29, 2010

I am not doing the right thing

What is the right thing to do?

Sunday, September 26, 2010

emacs programming

How to insert/delete comment?

Select a block of text and press M-; (Alt+;) to comment out the region, or to uncomment it if it is already commented.

Monday, August 9, 2010

emacs programming

C-M-h : select the whole function

C-M-\ indent region between cursor and mark
M-m move to first (non-space) char in this line
M-^ attach this line to previous
M-; format and indent comment
C, C++ and Java Modes
M-a beginning of statement
M-e end of statement
C-M-a beginning of function
C-M-e end of function
C-c RETURN Set cursor to beginning of function and mark at the end
C-c C-q indent the whole function according to indentation style
C-c C-a toggle the mode in which emacs re-indents after electric characters (like {}:';./*)
C-c C-d toggle auto-hungry mode, in which emacs deletes groups of spaces with one del-press
C-c C-u go to beginning of this preprocessor statement
C-c C-c comment out marked area
More general (I guess)
M-x outline-minor-mode collapses function definitions in a file to a mere {...}
M-x show-subtree If you are in one of the collapsed functions, this un-collapses it
In order to achieve some of the feats coming up now, you have to run etags *.c *.h *.cpp (or whatever ending your source files have) in the source directory
M-. (that's Meta dot) If you are in a function call, this will take you to its definition
M-x tags-search ENTER Searches through all the files you etagged
M-, (Meta comma) jumps to the next occurrence for tags-search
M-x tags-query-replace yum. This lets you replace some text in all the tagged files

C-M-n
Move forward over a parenthetical group (forward-list).
C-M-p
Move backward over a parenthetical group (backward-list).
C-M-u
Move up in parenthesis structure (backward-up-list).
C-M-d
Move down in parenthesis structure (down-list).

Monday, August 2, 2010

regular expression

Regular Expression	Class	Type	Meaning
.	all	Character Set	A single character (except newline)
^	all	Anchor	Beginning of line
$	all	Anchor	End of line
[...]	all	Character Set	Range of characters
*	all	Modifier	Zero or more duplicates
\<	Basic	Anchor	Beginning of word
\>	Basic	Anchor	End of word
\(..\)	Basic	Backreference	Remembers pattern
\1..\9	Basic	Reference	Recalls pattern
+	Extended	Modifier	One or more duplicates
?	Extended	Modifier	Zero or one duplicate
{M,N}	Extended	Modifier	M to N duplicates (written \{M,N\} in basic)
(...|...)	Extended	Anchor	Shows alternation
\(...\|...\)	EMACS	Anchor	Shows alternation
\w	EMACS	Character set	Matches a letter in a word
\W	EMACS	Character set	Opposite of \w

POSIX character sets

POSIX added newer and more portable ways to search for character sets. Instead of using [a-zA-Z] you can replace 'a-zA-Z' with [:alpha:], or, to be more complete, replace [a-zA-Z] with [[:alpha:]]. The advantage is that this will match international character sets. You can mix the old style and new POSIX styles, such as
grep '[1-9[:alpha:]]'
Here is the full list

Character Group	Meaning
[:alnum:]	Alphanumeric
[:cntrl:]	Control Character
[:lower:]	Lower case character
[:space:]	Whitespace
[:alpha:]	Alphabetic
[:digit:]	Digit
[:print:]	Printable character
[:upper:]	Upper Case Character
[:blank:]	Space and tab
[:graph:]	Printable and visible characters
[:punct:]	Punctuation
[:xdigit:]	Hexadecimal digit

Note that some people use [[:alpha:]] as a notation, but the outer '[...]' specifies a character set.
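A short sketch of the classes in use with grep; the sample lines are made up. Note that each class name sits inside an outer bracket expression.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'abc' '123' 'a1' '---' > "$workdir/sample.txt"

# Lines containing at least one alphabetic character: abc, a1
alpha_lines=$(grep -c '[[:alpha:]]' "$workdir/sample.txt")

# Lines made up only of digits: 123
digit_lines=$(grep -c '^[[:digit:]][[:digit:]]*$' "$workdir/sample.txt")

# Old-style range and POSIX class mixed in one bracket expression: abc, 123, a1
mixed_lines=$(grep -c '[1-9[:alpha:]]' "$workdir/sample.txt")

echo "$alpha_lines $digit_lines $mixed_lines"
```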

Saturday, July 31, 2010

M-C-\ indent region
C-s C-w search word under cursor

C-M-@
Set mark after end of following balanced expression (mark-sexp). This does not move point.

C-M-h c-mark-function

Wednesday, July 28, 2010

grep excludes files, directories

grep -Ir --exclude="*\.svn*" "pattern" *


Note that the exclude pattern is matched against the full path, not just the file name!
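As an alternative sketch, reasonably recent versions of GNU grep (and BSD grep) also offer --exclude-dir, which skips whole directories by name during a recursive search. The directory layout below is made up, and the availability of --exclude-dir on a given system is an assumption.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/.svn"
echo "pattern here" > "$workdir/src/main.c"
echo "pattern here" > "$workdir/.svn/entries"

# -r recurse, -I skip binary files, -l list matching file names only.
matches=$(grep -rIl --exclude-dir=.svn "pattern" "$workdir")
echo "$matches"
```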

grep regular exp.

Special Characters

Here we outline the special characters for grep. Note that egrep uses extended regular expressions (which are actually no more powerful than basic regular expressions if you use GNU grep), and there the list of special characters grows (| in grep is the same as \| in egrep and vice versa; there are other differences too, so check the man page for details). The following characters are considered special and need to be "escaped":

?  \  .  [  ]  ^  $

Note that a $ sign loses its meaning if characters follow it (I think) and the caret ^ loses its meaning if other characters precede it.
Square brackets behave a little differently. The rules for square brackets go as follows:

A closing square bracket loses its special meaning if placed first in a list. For example, []12] matches ], 1, or 2.
A dash - loses its usual meaning inside lists if it is placed last.
A caret ^ loses its special meaning if it is not placed first.
Most special characters lose their meaning inside square brackets.
A * at the beginning of a regular expression loses its special meaning.

A regular expression may be followed by one of several repetition operators:
? The preceding item is optional and matched at most once.
* The preceding item will be matched zero or more times.
+ The preceding item will be matched one or more times.
{n} The preceding item is matched exactly n times.
{n,} The preceding item is matched n or more times.
{n,m} The preceding item is matched at least n times, but not more than m times.

In basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions
\?, \+, \{, \|, \(, and \).
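A minimal sketch of the difference, using made-up input: the same grouped, repeated pattern written once as a basic and once as an extended regular expression.

```shell
sample=$(printf '%s\n' 'ab' 'abab' 'a+b')

# Basic regular expression: grouping and interval need backslashes.
bre_count=$(printf '%s\n' "$sample" | grep -c '\(ab\)\{2\}')

# Extended regular expression (grep -E): same pattern without backslashes.
ere_count=$(printf '%s\n' "$sample" | grep -Ec '(ab){2}')

# In a basic regular expression an unescaped + is an ordinary character.
literal_plus=$(printf '%s\n' "$sample" | grep -c 'a+b')

echo "$bre_count $ere_count $literal_plus"
```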

Monday, July 19, 2010

sed summary (cont)

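Some shorthand classes and their POSIX equivalents (\s, \S, \w and \W are GNU extensions; \d and \D are Perl-style and are not supported by sed, so use the bracket form there):

\d = [[:digit:]], \D = [^[:digit:]].
\s = [[:space:]], including space, tab, ...; \S = [^[:space:]].
\w = [[:alnum:]_], including 0-9, a-z, A-Z and underscore; \W = [^[:alnum:]_].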

executing multiple commands with sed -e command

One method of combining multiple commands is to use a -e before each command:

sed -e 's/a/A/' -e 's/b/B/' <old >new

A "-e" isn't needed in the earlier examples because sed knows that there must always be one command. If you give sed one argument, it must be a command, and sed will edit the data read from standard input.

Reversing the restriction with !
Sometimes you need to perform an action on every line except those that match a regular expression, or those outside of a range of addresses. The "!" character, which often means not in Unix utilities, inverts the address restriction. You remember that

sed -n '/match/ p' acts like the grep command. The "-v" option to grep prints all lines that don't contain the pattern. Sed can do this with sed -n '/match/ !p'
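A quick sketch comparing the two sed forms with grep and grep -v on made-up input:

```shell
input=$(printf '%s\n' 'a match here' 'nothing' 'another match')

# -n suppresses automatic printing; p prints only the addressed lines.
grep_like=$(printf '%s\n' "$input" | sed -n '/match/ p')

# ! inverts the address, so only non-matching lines are printed.
grep_v_like=$(printf '%s\n' "$input" | sed -n '/match/ !p')

echo "$grep_v_like"
```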

Ranges by Line Number

You can specify a range on line numbers by inserting a comma between the numbers. To restrict a substitution to the first 100 lines, you can use:

sed '1,100 s/A/a/'

If you know exactly how many lines are in a file, you can explicitly state that number to perform the substitution on the rest of the file. In this case, assume you used wc to find out there are 532 lines in the file:

sed '101,532 s/A/a/'

An easier way is to use the special character "$", which means the last line in the file.

sed '101,$ s/A/a/'

The "$" is one of those conventions that mean "last" in utilities like cat -e, vi, and ed. Line numbers are cumulative if several files are edited. That is,

sed '200,300 s/A/a/' f1 f2 f3 >new

is the same as

cat f1 f2 f3 | sed '200,300 s/A/a/' >new

Transform with Y

If you wanted to change a word from lower case to upper case, you could write 26 character substitutions, converting "a" to "A," etc. Sed has a command that operates like the tr program. It is called the "y" command. For instance, to change the letters "a" through "f" into their upper case form, use:

sed 'y/abcdef/ABCDEF/' file

I could have used an example that converted all 26 letters into upper case, and while this column covers a broad range of topics, the "column" prefers a narrower format.

If you wanted to convert a line that contained a hexadecimal number (e.g. 0x1aff) to upper case (0x1AFF), you could use:

sed '/0x[0-9a-zA-Z]*/ y/abcdef/ABCDEF/' file

This works fine if there are only numbers in the file. If you wanted to change the second word in a line to upper case, you are out of luck - unless you use multi-line editing. (Hey - I think there is some sort of theme here!)
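A small sketch of both ideas from this post, on made-up input: y restricted by a pattern address, and a substitution restricted to a line range that ends at $.

```shell
# Transliterate a-f to upper case only on lines containing a hex-looking number.
hex=$(printf '%s\n' '0x1aff' 'deadline' | sed '/0x[0-9a-zA-Z]*/ y/abcdef/ABCDEF/')

# Restrict a substitution to a line range; $ stands for the last line.
ranged=$(printf '%s\n' A A A | sed '2,$ s/A/a/')

echo "$hex"
echo "$ranged"
```

The second input line, deadline, contains several of the letters a to f but stays unchanged because the address does not select it.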

Thursday, July 15, 2010

sed summary

sed -n pattern: with -n nothing is printed by default, so we add the p at the end of the pattern, as in sed -n '/pattern/p'

Thursday, July 8, 2010

How to manage experimental data

The amount of data generated by our experiments is growing quickly, and managing it has become a real issue. This post proposes some practices for managing and indexing the datasets.
We will follow these practices in the future.

We will use Excel or OpenOffice to store the data.
The format is:
The first sheet is an overall index of the tables needed in the experiment.
Each following sheet stores only one closely related set of data. For example, when you measure the performance of an algorithm, you may want to record both its running time and its space efficiency. The first sheet then lists the names of these two tables with a brief introduction to each; it is best to include the names of the datasets there as well. The second sheet is the time table, which could hold the running time of the algorithm and of some strawman algorithms, and could also include the preprocessing time of the algorithm...

Another question is how to manage the raw data, that is, data that has not yet been processed.
For each raw data table, we need to record the original source of the data and the name of the data.

Sunday, July 4, 2010

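Why are we not successful today?

Question: why are we not successful today?

1. First, we have not defined our own standard of success (money, a car, a house, a spouse?). We are not clear about our real goal: is it ideals, hobbies, money, career, family, power, desire, or realizing our value in life? So every day we keep living the same muddled life, and life feels monotonous and dull.

2. We do not know our own strengths and weaknesses. We do not understand ourselves at all: what we lack, what we need to make up, what we are good at, and what resources we have, whether knowledge, money, connections, projects, contacts, or timing. We lack core competitiveness and irreplaceability (uniqueness), so we often do not know what to do and what not to do. We are forever blind and hesitant.

3. We easily revolve around other people and are swayed by them, rather than having others revolve around us and influencing them. So we are bound to put other people's ideas into our heads and our wallets into other people's pockets, with our fate firmly in their hands. Our souls and thoughts have long been washed away and stolen, leaving only a walking corpse. That being so, what can we expect to achieve?

4. We have become used to superficial things: skimming shallow articles, watching boring, formulaic television, idling on QQ, playing games, playing mahjong, hanging around restaurants, bars and leisure spots, discussing hyped-up news. Yet we read too few valuable books, meet too few valuable people, and give ourselves too few minutes of quiet reflection. We seriously lack the ability to see through to the essence of things. So today one expert, tomorrow another master, the day after some celebrity, from all sides, through advertising, newspapers, magazines, television and the internet, bombard us in turn. In this confusing environment we lack basic judgment, analysis and the ability to summarize, and we lose our direction and ourselves.

5. We lack courage and boldness. We are used to a routine life shuttling between the same three places and have lost the reckless courage of our youth; we dare not turn back, nor take a road of no return. We find life helpless, work monotonous and growth limited, yet we hesitate, afraid to change: too lazy to learn, afraid to lead, to change positions, to change jobs, to start a business, to question, to resist, to voice our opinions, to reach out, to innovate. So we carry on with the plain, flavorless days of ordinary people, because our journey lacks real experience and the courage to truly taste the sour, sweet, bitter and spicy.

6. We lack trust, cooperation and the integration of resources. We live amid suspicion and conflict and are still trying to conquer the world alone. We rarely have real friends: friends who can actually help, who would lend us money, who are steadfast, who think of us often. We neglect relationships in ordinary times and only seek help in a crisis. We do not really understand tolerance, understanding, complementing each other, balance, sharing and mutual benefit. We have too many so-called brothers and fair-weather friends; in times of trouble there are too few people we can trust, the degree of trust is too low and its cost too high. We all suspect each other and wear each other down. We cannot find the point where resources come together, or do not know how to use them well, and keep sighing "what can I do, what should I do?" The people we know are at too low a level and our minds are too narrow, so we never fully grasp many truths, and we silently become someone else's stepping stone.

7. We lack drive, execution, and a method for dealing with people and handling matters, and go on living plainly and in a daze, day after day, year after year. We have wonderful ideas all the time but no way of acting on them, and no lasting confidence and stamina. We do not often look in the mirror to reflect and awaken.

8. We lack the ability to summarize and correct. We fail and fail again; we err and keep on erring. Our habits remain unchanged, they form our character, and that in the end determines our fate.

9. We do not know how to weave a network of relationships. A network is a web: start with the people you know and understand, then the people who know you, then the friends of your friends, and so on. Remember to be attentive and sincere. People are in fact equal, with no high or low, noble or humble (unless you really need something from him or her); nobody is all that special. It is important to learn about other people's backgrounds and to organize that information.

10. We lack financial management. We often do not know what to buy and what to sell, what is income and what is expense, what is a liability and what is an asset, what investment means, or how to increase income and cut spending. We neglect the details, quantitative change becomes qualitative change, and so our cash figures remain embarrassing. We are unclear about how to find, earn, save, borrow, repay and spend money.

11. We seriously lack knowledge, both basic knowledge and knowledge of society: too little schooling and too little experience, and we lack the ability to keep learning, to ask for advice humbly, and to learn from a master.

We lack an open mind that embraces everything, combines East and West and blends the humanities with the sciences; we lack a specialty, and both depth and breadth. We still hold on to stale ideas and clumsy methods, and do not dare to doubt, challenge or think in new ways.

12. This world of glitter and materialism has long since made us restless, unable to calm down, think about our lives repeatedly and seriously, and take each step steadily. We do not know how to manage time, use it sensibly, or be punctual, and so we will grieve in old age.

13. We lack a sense of joy, happiness and security. People are too cold and too calculating with each other. Many families are broken, many social contacts are prejudiced, many circles are closed to outsiders, many marriages involve a transaction, many loves are not love, many family ties lack care, and many brothers stab each other in the back. We fear becoming slaves to a mortgage or a car, fear marriage, children, illness, unemployment, social obligations and accidents, and live in anxiety all day. We do not know what joy and happiness are, how to find them, or how to adjust our attitude and position; we do not understand taking and letting go, contentment, transcendence, sharing, or letting nature take its course.

14. We do not know how to seize the trend of the times or understand what is at stake in politics and economics. We do not understand that fortune turns, that heaven and earth are one, or the principle of mutual benefit; we do not understand red ocean and blue ocean strategy; we do not understand that the water that carries the boat can also capsize it, or that there are no absolute friends and no absolute enemies. We do not know how to follow trends or create them, and we remain complacent and stagnant.

15. Finally, once we have found the right direction and prepared thoroughly (burning our boats), we should start acting at once and persist, persist, persist! Get through today and tomorrow will be beautiful! Along the way we keep improving and adjusting ourselves. May everyone with the will succeed! Heaven rewards the diligent!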

Monday, June 21, 2010

Notes for Using Imported Graphics in Latex

1. Some tools to generate the eps files.
a. ImageMagick and GraphicsMagick
ImageMagick's convert program can convert a BMP, CGM, FIG, FITS, GIF, JPG, PBM, PDF, PGM, PNG, PNM, PPM, PS, RGB, TIF, XBM or XPM file to EPS format.
b. jpeg2eps

Wide figures in two column documents
If you are writing a document using two columns (i.e. you started your document with something like \documentclass[twocolumn]{article}), you might have noticed that you can't use floating elements that are wider than the width of a column (using a LaTeX notation, wider than 0.5\textwidth), otherwise you will see the image overlapping with text. If you really have to use such wide elements, the only solution is to use the "starred" variants of the floating environments, that are {figure*} and {table*}. Those "starred" versions work exactly like the standard ones, but they will be as wide as the page, so you will get no overlapping.

A bad point of those environments is that they can be placed only at the top of the page or on their own page. If you try to specify their position using modifiers like b or h they will be ignored. Add \usepackage{stfloats} to the preamble in order to alleviate this problem with regard to placing these floats at the bottom of a page, using the optional specifier [b]. Default is [tbp]. However, h still does not work.

To prevent the figures from being placed out-of-order with respect to their "non-starred" counterparts, the package fixltx2e [2] should be used (e.g. \usepackage{fixltx2e}).

\wide?
using figure* environment.

c. inserting subfigs
Subfloats
A useful extension is the subfig package [3], which uses subfloats within a single float. This gives the author the ability to have subfigures within figures, or subtables within table floats. Subfloats have their own caption, and an optional global caption. An example will best illustrate the usage of this package:

\usepackage{subfig}

\begin{figure}
\centering
\subfloat[A gull]{\label{fig:gull}\includegraphics[width=0.3\textwidth]{gull}}
\subfloat[A tiger]{\label{fig:tiger}\includegraphics[width=0.3\textwidth]{tiger}}
\subfloat[A mouse]{\label{fig:mouse}\includegraphics[width=0.3\textwidth]{mouse}}
\caption{Pictures of animals}
\label{fig:animals}
\end{figure}

d. Dia is a good tool for generating vector-based figures.

Thursday, June 17, 2010

backup

create the exclude file list:

sudo rsync -Pa / /media/youbackupdir --exclude=/media/* --exclude=/lib/* --exclude=/sys/* --exclude=/tmp/* --exclude=/proc/* --exclude=/mnt/* --exclude=/home/ye/.mozilla/* --exclude = ...

Also, you need to create a cron job to run the backup every day.

To list files larger than 30 MB, printing each path followed by its ls -l entry:
find . -size +30M -print -exec ls -l {} \;

comm can be used to compare two sorted files line by line
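A minimal sketch of comm on two made-up, already sorted files. Column 1 holds lines only in the first file, column 2 lines only in the second, column 3 lines in both; the -1, -2 and -3 options suppress the corresponding column.

```shell
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/old.txt"
printf '%s\n' banana cherry date > "$workdir/new.txt"

only_old=$(comm -23 "$workdir/old.txt" "$workdir/new.txt")   # lines unique to old.txt
only_new=$(comm -13 "$workdir/old.txt" "$workdir/new.txt")   # lines unique to new.txt
common=$(comm -12 "$workdir/old.txt" "$workdir/new.txt")     # lines present in both

echo "removed: $only_old, added: $only_new"
```

comm expects both inputs to be sorted; run unsorted files through sort first.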

Saturday, June 12, 2010

io redirection

1. writing stderr and stdout separately
make 1>a.out 2>b.out

2. only get the stdout
make > tmp.out

3. only get the stderr
make > tmp.out 3>&2 2>&1 1>&3 (this swaps the two streams, so stderr lands in tmp.out; 3 is a temporary holding descriptor)

4. put both into one file
make > tmp.out 2>&1

& is used when both the source and the destination are descriptors
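The cases above can be tried without a Makefile. In this sketch a hypothetical function emit stands in for make and writes one line to each stream; the file names are made up.

```shell
workdir=$(mktemp -d)

# Stand-in for make: writes one line to stdout and one to stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }

# Case 1: stdout and stderr go to separate files.
emit 1>"$workdir/a.out" 2>"$workdir/b.out"

# Case 4: both streams into one file; stdout is pointed at the file first,
# then stderr is made a copy of stdout.
emit > "$workdir/both.out" 2>&1

# Case 3: swap the streams through spare descriptor 3, so the file receives
# only stderr. The outer group captures what would have reached the terminal.
{ emit > "$workdir/swapped.out" 3>&2 2>&1 1>&3; } 2>"$workdir/screen.out"

cat "$workdir/swapped.out"
```

Redirections are processed left to right, which is why 2>&1 has to come after > tmp.out in case 4.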

Monday, May 10, 2010

* If necessary, start an instance of an X server on your personal system.
* Open an xterm window and ssh into the cluster's head node with X11-forwarding enabled. Depending on your settings, you may need to invoke the ssh client with "ssh -Y".
* On the cluster's head node, in your .cshrc or .bashrc file:
o Set up your module environment (e.g., module load mpich-debug)
o Set the environment variable TVDSVRLAUNCHCMD to 'ssh'
o Add totalview (/opt/export/toolworks/totalview/bin) to your PATH environment variable.
* Build your application as you normally would, but add the -g option to the mpicc command line for debugging.
* Create a PBS script file like the one you would use to run your MPI program, but which does not have an mpirun/mpiexec command in it. This will be used to allocate the nodes/processors.
* Type 'qsub -I -X '. This will create an interactive job with X11 forwarding enabled.
* When you get a compute node prompt, type:
mpirun -tv -machinefile ${PBS_NODEFILE} -np `wc -l <${PBS_NODEFILE}` \

* The TotalView windows should now be displayed by your X server.
o The code should be stopped at the beginning of your main program.
o You should see your source code, and can enter breakpoints, single-step through it, etc.
If you see assembly code, you most likely forgot to use the -g option when building your program.
o When you click on 'Go' for the first time, you will halt at a permanent MPIR_Breakpoint deep in the MPI_Init() code. Just click on 'Go' again to hit your first breakpoint.
* When done debugging, click on File | Exit to close the TotalView windows.
* Log out of the compute node session to end the PBS job.

Wednesday, March 24, 2010

What you think is what you get

Even in the worst cases, a positive mind can make a big difference.

This sentence is for my birthday.

Happy birthday!

Friday, March 12, 2010

Pay attention to the flaws in my character and try to change them:
too self-confident, too proud, ...
Try to value others' ideas, and help others improve.

Monday, March 1, 2010

life

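Your life can be an equation; it can be addition, subtraction, multiplication and division; it can be an integral, or it may be calculus. Your life can be a straight line, a curve, or perhaps a circle.

From your birth to your passing, your life and mine share the same starting point: after ten months in the womb we leave our mothers and arrive crying. We also share the same end point, the final rest of life.

We are all human, and we share, equally, what it is to be human: a lifetime. We are all human, and from the starting point onward we each begin a different journey.

Your path in life and mine, whether taken actively or passively, are paths we chose to walk ourselves.

Perhaps you were born in the countryside, perhaps into a poor family; perhaps you never lacked food or clothing; perhaps your family arranged everything for you; perhaps you are lost and do not know where to go.

Whichever it is, whether you walk it yourself or it was arranged for you, the road is still the road you have to walk. This is your life.

Perhaps you reach your goals easily with little effort; perhaps you struggle all your life and the goal is still far out of reach. Looking back at the road travelled, one footprint at a time, at the end of your life, are there regrets?

What is the meaning of living a life? Your life depends on your attitude and your actions.

When you lament your bad luck, when you admire someone else's success, when you are pleased with yourself, when you think you are above everyone, it comes down to these cases: you have rated yourself too low or too high, or you have rated others too high or too low.

Your figure may seem tall or small, but whether you are 1.5 m or 1.9 m, the difference is not large. You cannot change your height, but the figure you present can be big or small, depending on the angle and height from which you are seen.

Great people are not really that far above you; you just do not know them well. The moon waxes and wanes, and everyone has strengths and weaknesses. Some people are good at presenting themselves and full of confidence; others are trapped by their shortcomings, feel inferior and closed off, cannot see themselves clearly, and even look down on themselves.

We do not know how to tap our own potential or how to develop our talents, and we drift through time. When an opportunity slips past, we do not even think to seize it.

Everyone's achievements are like climbing a mountain: one footprint at a time, one step up at a time, until the moment you reach the summit. If you find the climb too hard and tiring, if you give up halfway because the work is hard, then naturally all that is left is to envy others and feel inferior.

Whether you are clever or think yourself slow, first believe in yourself; do not give up, work hard, and the future is still yours.

The future is not a dream; reality requires you to actually do the things you can do and want to do.

It is not that you cannot do it; your thinking is limited and your abilities fall short. To accomplish something you need to keep learning, break through those limits and build up your abilities. The process will open up your thinking and make the work come easily.

When a window of the mind opens, what opens is a whole new world.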

Monday, February 8, 2010

restart...

We need to restart the research,
but I do not know where to start.

Quote "One lesson I learned as a graduate student is the best way to finish the dissertation is to do something every day that gets you closer to being done. If all you have left is writing, then write part of the dissertation every day. If you still have research to do, then do part of it every day. Don't just do it when you are "in the mood" or feeling productive. This level of discipline will keep you going through the good times and the bad and will ensure that you finish. "

Sunday, January 10, 2010

unix tr and cut command

Saturday, January 2, 2010

unix tricks

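When you use UNIX, you often fall into set usage patterns. Sometimes you never get into the habit of doing things in the best possible way; sometimes your bad habits even lead to a mess. One of the best ways to correct such shortcomings is to deliberately adopt good habits that counteract the bad ones. This article suggests 10 UNIX command-line habits worth picking up: good habits that help you overcome many common usage quirks and make you more productive at the command line along the way. The 10 habits are listed below and then described in more detail.

The ten good habits to adopt are:
1 Make directory trees in a single command.
2 Change the path; do not move the archive.
3 Combine your commands with control operators.
4 Quote variables with caution.
5 Use escape sequences to manage long input.
6 Group your commands together in a list.
7 Use xargs outside of find.
8 Know when grep should do the counting, and when it should step aside.
9 Match certain fields in output, not just lines.
10 Stop piping cats.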

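1. Make directory trees in a single command

Example 1 illustrates one of the most common bad UNIX habits: defining directory trees one level at a time.

Example 1. Bad habit #1: defining each directory tree level individually

~ $ mkdir tmp
~ $ cd tmp
~/tmp $ mkdir a
~/tmp $ cd a
~/tmp/a $ mkdir b
~/tmp/a $ cd b
~/tmp/a/b/ $ mkdir c
~/tmp/a/b/ $ cd c
~/tmp/a/b/c $

It is much easier to use the -p option of mkdir and create all parent directories and their children in a single command. But even administrators who know about this option are still stuck creating each level step by step when they make subdirectories on the command line. It is worth your time to deliberately pick up the good habit:

Example 2. Good habit #1: defining a directory tree with one command
~ $ mkdir -p tmp/a/b/c

You can use this option to create entire complex directory trees (ideal inside scripts), not just simple hierarchies. For example:

Example 3. Another example of good habit #1: defining a complex directory tree with one command
~ $ mkdir -p project/{lib/ext,bin,src,doc/{html,info,pdf},demo/stat/a}
(Note: this command works in bash.)

In the past, the only excuse for defining directories individually was that your mkdir implementation did not support this option, but that is no longer true on most systems. IBM AIX mkdir, GNU mkdir, and others that conform to the Single UNIX Specification now have this option.
For the few systems that still lack it, you can use the mkdirhier script (see Resources), a wrapper for mkdir that does the same job:
~ $ mkdirhier project/{lib/ext,bin,src,doc/{html,info,pdf},demo/stat/a}
(Note: this command works in bash.)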
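A small sketch of the habit in a script, using a scratch directory so nothing in the home directory is touched. The branches are spelled out rather than written with brace expansion, on the assumption that the script may run under a plain POSIX sh.

```shell
base=$(mktemp -d)

# One call creates every missing parent directory.
mkdir -p "$base/tmp/a/b/c"

# Several branches of a tree in one command.
mkdir -p "$base/project/lib/ext" "$base/project/bin" "$base/project/doc/html"

# Running it again is harmless: -p does not complain about existing directories.
mkdir -p "$base/tmp/a/b/c"

find "$base/project" -type d | sort
```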

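2. Specify the path when extracting a .tar file; do not move the archive
Another bad usage pattern is moving a .tar archive file to a certain directory because it happens to be the directory you want to extract it in. You never need to do this. You can extract any .tar archive file into any directory you like: that is what the -C option is for. Use the -C option when extracting an archive file to specify the directory to extract it in:

Example 4. Good habit #2: using option -C to extract a .tar archive file
~ $ tar xvf newarc.tar.gz -C tmp/a/b/c

Making a habit of using -C is preferable to moving the archive file to where you want to extract it, changing to that directory, and only then extracting it, especially if the archive file belongs somewhere else.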
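A self-contained sketch of -C, with a made-up archive built in a scratch directory. The archive name has to follow the f option directly; -C and its directory come after it.

```shell
base=$(mktemp -d)
mkdir -p "$base/src" "$base/dest/a/b/c"
echo "hello" > "$base/src/note.txt"

# Build a small archive; -C makes tar change into src before adding note.txt.
tar cf "$base/newarc.tar" -C "$base/src" note.txt

# Extract into the target directory without moving the archive or changing directory.
tar xf "$base/newarc.tar" -C "$base/dest/a/b/c"

cat "$base/dest/a/b/c/note.txt"
```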

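3. Combine your commands with control operators
You probably already know that in most shells you can combine commands on a single command line by placing a semicolon (;) between them. The semicolon is a shell control operator, and while it is useful for stringing together discrete commands on one line, it does not work for everything. For example, suppose you use a semicolon to combine two commands where the proper execution of the second depends entirely on the successful completion of the first. If the first command does not exit as you expected, the second command still runs, and fails. Instead, use more appropriate control operators (some are described in this article). As long as your shell supports them, they are worth getting into the habit of using.
Run a command only if another command returns a zero exit status
Use the && control operator to combine two commands so that the second is run only if the first returns a zero exit status. In other words, if the first command runs successfully, the second runs; if the first fails, the second does not run at all. For example:

Example 5. Good habit #3: combining commands with control operators
~ $ cd tmp/a/b/c && tar xvf ~/archive.tar

In this example, the contents of the archive are extracted into the ~/tmp/a/b/c directory unless that directory does not exist. If it does not exist, the tar command does not run, so nothing is extracted.
Run a command only if another command returns a non-zero exit status
Similarly, the || control operator separates two commands and runs the second only if the first returns a non-zero exit status. In other words, if the first command succeeds, the second does not run; if the first fails, the second does run. This operator is often used when testing whether a given directory exists and, if not, creating it:

Example 6. Another example of good habit #3: combining commands with control operators
~ $ cd tmp/a/b/c || mkdir -p tmp/a/b/c

You can also combine the control operators described in this section. Each works on the last command run:

Example 7. A combined example of good habit #3: combining commands with control operators
~ $ cd tmp/a/b/c || mkdir -p tmp/a/b/c && tar xvf ~/archive.tar -C tmp/a/b/c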
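A sketch that records which branch of each operator actually ran. The directory names are made up, and the variable log is only there to make the control flow visible.

```shell
base=$(mktemp -d)
log=""

# && runs the second command only when the first succeeds.
mkdir -p "$base/ok" && log="$log created"

# || runs the second command only when the first fails.
cd "$base/missing" 2>/dev/null || log="$log fallback"

# Combined: evaluated left to right, as (A || B) && C.
cd "$base/new" 2>/dev/null || mkdir -p "$base/new" && log="$log ready"

echo "$log"
```

Since && and || have equal precedence and group left to right, the last command in a mixed chain runs whenever the part before it ends in success.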
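4. Quote variables with caution
Always be careful with shell expansion and variable names. It is generally a good idea to enclose variable references in double quotation marks unless you have a good reason not to. Similarly, if you use a variable name directly followed by alphanumeric text, be sure to enclose the variable name in braces ({}) to distinguish it from the surrounding text. Otherwise, the shell interprets the trailing text as part of the variable name, and most likely returns a null value. Example 8 shows various quoted and unquoted variable references and their effects.

Example 8. Good habit #4: quoting (and not quoting) a variable
~ $ ls tmp/
a b
~ $ VAR="tmp/*"
~ $ echo $VAR
tmp/a tmp/b
~ $ echo "$VAR"
tmp/*
~ $ echo $VARa
~ $ echo "$VARa"
~ $ echo "${VAR}a"
tmp/*a
~ $ echo ${VAR}a
tmp/a
~ $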
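The session in Example 8 can be reproduced in a scratch directory. This sketch assumes the shell is not running with the nounset option (set -u), since it deliberately reads the unset variable VARa.

```shell
base=$(mktemp -d)
mkdir -p "$base/tmp"
touch "$base/tmp/a" "$base/tmp/b"
cd "$base" || exit 1

VAR="tmp/*"
unquoted=$(echo $VAR)            # the shell expands the glob
quoted=$(echo "$VAR")            # the literal string is kept
no_braces=$(echo "$VARa")        # looks up a variable named VARa, which is unset
with_braces=$(echo "${VAR}a")    # value of VAR followed by a literal a

printf '%s\n' "$unquoted" "$quoted" "[$no_braces]" "$with_braces"
```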

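5. Use escape sequences to manage long input
You have probably seen code examples in which a backslash (\) continues a long line over to the next line, and you know that most shells treat what you type over successive lines joined by a backslash as one long line. However, you might not take advantage of this on the command line as often as you could. The backslash is especially useful if your terminal does not handle multi-line wrapping properly or when your command line is smaller than usual (such as when you have a long path on the prompt). The backslash is also useful for making sense of long input lines as you type them, as in the following example:

Example 9. Good habit #5: using a backslash for long input
~ $ cd tmp/a/b/c || \
> mkdir -p tmp/a/b/c && \
> tar xvf ~/archive.tar -C tmp/a/b/c

Alternatively, the following arrangement also works:

Example 10. Alternative example of good habit #5: using a backslash for long input
~ $ cd tmp/a/b/c \
> || \
> mkdir -p tmp/a/b/c \
> && \
> tar xvf ~/archive.tar -C tmp/a/b/c

However you divide an input line over multiple lines, the shell always treats it as one continuous line, because it always strips out all the backslashes and extra spaces.
Note: In most shells, when you press the up arrow key, the entire multi-line entry is redrawn on a single, long input line.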

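6. Group your commands together in a list
Most shells have ways to group a set of commands together in a list so that you can pass their combined output down a pipeline or redirect any or all of its streams to the same place. You can generally do this by running a list of commands in a subshell or by running a list of commands in the current shell.
Run a list of commands in a subshell
Use parentheses to enclose a list of commands in a single group. Doing so runs the commands in a new subshell and lets you redirect or otherwise collect the output of the whole group, as in the following example:

Example 11. Good habit #6: running a list of commands in a subshell
~ $ ( cd tmp/a/b/c/ || mkdir -p tmp/a/b/c &&
> cd tmp/a/b/c &&   # not in the original article; I added this command
> VAR=$PWD; cd ~; tar xvf archive.tar -C $VAR ) \
> | mailx -s "Archive contents" admin

In this example, the contents of the archive are extracted into the tmp/a/b/c/ directory, while the output of the grouped commands, including the list of extracted files, is mailed to the address admin.
A subshell is preferable when you redefine environment variables in your list of commands and do not want those definitions to apply to your current shell.
Run a list of commands in the current shell
Use braces ({}) to enclose a list of commands to run in the current shell. Make sure you include spaces between the braces and the actual commands, or the shell might not interpret the braces correctly. Also make sure that the final command in your list ends with a semicolon, as in the following example:

Example 12. Another example of good habit #6: running a list of commands in the current shell
~ $ { cp ${VAR}a . && chown -R guest.guest a &&
> tar cvf newarchive.tar a; } | mailx -s "New archive" admin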
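The practical difference between the two grouping forms is whether variable changes survive. A minimal sketch, with a made-up variable count:

```shell
count=0

# Parentheses: the list runs in a subshell, so the assignment does not leak out.
( count=5; echo "inside subshell: $count" ) > /dev/null
after_subshell=$count

# Braces: the list runs in the current shell, so the assignment persists.
{ count=7; echo "inside braces: $count"; } > /dev/null
after_braces=$count

# Either form lets a single pipe or redirection apply to the whole list.
line_count=$( { echo one; echo two; } | wc -l | tr -d ' ' )

echo "$after_subshell $after_braces $line_count"
```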

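7. Use xargs outside of find
Use the xargs tool as a filter for making good use of output culled from the find command. A find run generally provides a list of files that match some criteria. This list is passed on to xargs, which then runs some other useful command with that list of files as arguments, as in the following example:

Example 13. Example of the classic use of the xargs tool
~ $ find some-file-path some-file-criteria |
> xargs some-great-command-that-needs-filename-arguments

However, do not think of xargs as just a helper for find; it is one of those underutilized tools that, once you get into the habit of using it, you will want to try on everything, including the following uses.
Passing a space-delimited list
In its simplest invocation, xargs is like a filter that takes as input a list (with each member on its own line). The tool puts those members on a single space-delimited line:

Example 14. Example of output from the xargs tool
~ $ xargs
a
b
c
Control-D
a b c
~ $

You can send the output of any tool that outputs file names through xargs to get a list of arguments for some other tool that takes file names as an argument, as in the following example:

Example 15. Example of using the xargs tool
~/tmp $ ls -1 | xargs
December_Report.pdf README a archive.tar mkdirhier.sh
~/tmp $ ls -1 | xargs file
December_Report.pdf: PDF document, version 1.3
README: ASCII text
a: directory
archive.tar: POSIX tar archive
mkdirhier.sh: Bourne shell script text executable
~/tmp $

The xargs command is useful for more than passing file names. Use it any time you need to filter text into a single line:

Example 16. Good habit #7: using the xargs tool to filter text into a single line
~/tmp $ ls -l | xargs
-rw-r--r-- 7 joe joe 12043 Jan 27 20:36 December_Report.pdf -rw-r--r-- 1
root root 238 Dec 03 08:19 README drwxr-xr-x 38 joe joe 354082 Nov 02
16:07 a -rw-r--r-- 3 joe joe 5096 Dec 14 14:26 archive.tar -rwxr-xr-x 1
joe joe 3239 Sep 30 12:40 mkdirhier.sh
~/tmp $

Be cautious using xargs
Technically, you rarely run into trouble using xargs. By default on some implementations, the end-of-file string is an underscore (_); if that character is sent as a single input argument, everything after it is ignored. As a precaution against this, use the -e flag, which, without arguments, turns off the end-of-file string completely.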
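A short sketch of xargs as a line joiner and as a partner for find. The second part goes slightly beyond the article: it assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do), which keeps file names containing spaces intact. The file names are made up.

```shell
base=$(mktemp -d)
touch "$base/plain.txt" "$base/with space.txt"

# Simplest use: join a list given one item per line into one space-separated line.
joined=$(printf '%s\n' a b c | xargs)

# xargs normally splits its input on whitespace, which would break
# "with space.txt" in two. -print0 and -0 separate names with NUL bytes instead.
counted=$(find "$base" -type f -print0 | xargs -0 ls -1 | wc -l | tr -d ' ')

echo "$joined ($counted files)"
```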

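8. Know when grep should do the counting, and when it should step aside
Avoid piping a grep to wc -l in order to count the number of lines of output. The -c option to grep gives a count of lines that match the specified pattern and is generally faster than a pipe to wc, as in the following example:

Example 17. Good habit #8: counting lines with and without grep
~ $ time grep and tmp/a/longfile.txt | wc -l
2811
real 0m0.097s
user 0m0.006s
sys 0m0.032s
~ $ time grep -c and tmp/a/longfile.txt
2811
real 0m0.013s
user 0m0.006s
sys 0m0.005s
~ $

In addition to the speed factor, the -c option is also a better way to do the counting. With multiple files, grep with the -c option returns a separate count for each file, one on each line, whereas a pipe to wc gives a total count for all files combined.
However, regardless of speed considerations, this example showcases another common error to avoid. These counting methods only give counts of the number of lines containing matched patterns, and if that is what you are looking for, that is fine. But in cases where lines can have multiple instances of a particular pattern, these methods do not give you a true count of the actual number of instances matched. To count the number of instances, you use wc to count after all. First, run a grep command with the -o option, if your version supports it. This option outputs only the matched pattern, one on each line, and not the line itself. But you cannot use it in conjunction with the -c option, so use wc -l to count the lines, as in the following example:

Example 18. Good habit #8: counting pattern instances with grep
~ $ grep -o and tmp/a/longfile.txt | wc -l
3402
~ $

In this case, a call to wc is slightly faster than a second call to grep with a dummy pattern put in to match and count each line (such as grep -c).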
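The gap between matching lines and matching instances is easy to see on a tiny made-up file. The sketch assumes a grep that supports -o, as the text above notes.

```shell
base=$(mktemp -d)
printf '%s\n' 'sand and band' 'no match' 'and' > "$base/longfile.txt"

# -c counts lines that contain the pattern at least once.
line_count=$(grep -c and "$base/longfile.txt")

# -o prints every match on its own line, so wc -l counts instances.
instance_count=$(grep -o and "$base/longfile.txt" | wc -l | tr -d ' ')

echo "lines: $line_count, instances: $instance_count"
```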

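9. Match certain fields in output, not just lines
A tool like awk is preferable to grep when you want to match the pattern in only a specific field in the lines of output and not just anywhere in the lines.
The following simplified example shows how to list only those files modified in December.

Example 19. Bad habit #9: using grep to find patterns in specific fields
~/tmp $ ls -l /tmp/a/b/c | grep Dec
-rw-r--r-- 7 joe joe 12043 Jan 27 20:36 December_Report.pdf
-rw-r--r-- 1 root root 238 Dec 03 08:19 README
-rw-r--r-- 3 joe joe 5096 Dec 14 14:26 archive.tar
~/tmp $

In this example, grep filters the lines, outputting all files with Dec in their modification dates as well as their names. Therefore, a file such as December_Report.pdf is matched, even if it has not been modified since January. This probably is not what you want. To match a pattern in a particular field, it is better to use awk, where a relational operator matches the exact field, as in the following example:

Example 20. Good habit #9: using awk to find patterns in specific fields
~/tmp $ ls -l | awk '$6 == "Dec"'
-rw-r--r-- 3 joe joe 5096 Dec 14 14:26 archive.tar
-rw-r--r-- 1 root root 238 Dec 03 08:19 README
~/tmp $

See Resources for more details about how to use awk.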
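Because the column layout of ls -l varies between systems, this sketch runs grep and awk against the fixed listing from Example 19 instead of live ls output.

```shell
listing='-rw-r--r-- 7 joe joe 12043 Jan 27 20:36 December_Report.pdf
-rw-r--r-- 1 root root 238 Dec 03 08:19 README
-rw-r--r-- 3 joe joe 5096 Dec 14 14:26 archive.tar'

# grep matches Dec anywhere on the line, including inside the file name.
grep_hits=$(printf '%s\n' "$listing" | grep -c Dec)

# awk compares only the sixth whitespace-separated field, the month.
awk_hits=$(printf '%s\n' "$listing" | awk '$6 == "Dec"' | wc -l | tr -d ' ')

# The same test with an action: print just the last field, the file name.
names=$(printf '%s\n' "$listing" | awk '$6 == "Dec" { print $NF }')

echo "grep: $grep_hits, awk: $awk_hits"
```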

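10. Stop piping cats
A common basic usage error with grep is to pipe the output of cat to grep in order to search the contents of a single file. This is absolutely unnecessary and a waste of time, because tools such as grep take file names as arguments. You simply do not need to use cat in this situation at all, as in the following example:

Example 21. Good and bad habit #10: using grep with and without cat

~ $ time cat tmp/a/longfile.txt | grep and
2811
real 0m0.015s
user 0m0.003s
sys 0m0.013s
~ $ time grep and tmp/a/longfile.txt

This mistake applies to many tools. Because most tools take standard input as an argument using a hyphen (-), even the argument for using cat to intersperse multiple files with stdin is often not valid. It is really only necessary to concatenate first before a pipe when you use cat with one of its several filtering options.

Conclusion: embrace good habits
It is good to examine your command-line habits for any bad usage patterns. Bad habits slow you down and often lead to unexpected errors. This article presents 10 new habits that can help you break away from many of the most common usage errors. Picking up these good habits is a positive step toward sharpening your UNIX command-line skills.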