A Productive Desktop Environment for Scientists and Engineers - Part III

From assela Pathirana
Revision as of 06:08, 28 April 2006

Text data, Text files & Text Editors

Text editors were discussed briefly before. This article is simply a much more thorough explanation.

I have met at least one person who used to edit source code using Microsoft Word! While this is certainly possible, there are far easier ways of doing the same thing. In the Windows environment, perhaps the most basic tool is the Notepad application. Before we dive into the world of text editors on different platforms, it is beneficial to have some background knowledge of text files.

What exactly is text data?

All the information handled by a computer consists of binary digits (bits), which are often grouped into blocks called bytes or words. Such a block is normally the smallest unit of binary data on which a meaningful calculation can be done. It is essentially a numeric value (e.g. an 8-bit byte can represent a number from 0 to 255).

In order to represent human-language characters using these numbers, there are generally agreed-upon conventions known as character encodings. For example, the ASCII encoding defines 128 characters, mapped to the numbers 0 to 127. These include printable characters (alphanumerics and some symbols) and control characters (e.g. the line feed character). Another popular encoding system is Unicode. Most of the control characters have become largely obsolete, except for carriage return and line feed. And these two cause a problem when we move text files between operating systems!
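The mapping from characters to numbers can be inspected directly from the shell. The following is only a small sketch: the helper name byte_values is made up for this illustration, and it assumes the standard od, tr and sed utilities are available (as they are on Cygwin). It prints the number stored for each character, for instance 65 for the letter A.

```shell
# od -An -tu1 prints each input byte as an unsigned decimal number,
# without the offset column. The helper name is made up for this sketch.
byte_values() { od -An -tu1 | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'; }

# Printable characters: letters, a space and digits.
codes=$(printf 'Az 09' | byte_values)
echo "Az 09 -> $codes"

# The two control characters that still matter: carriage return and line feed.
newline_codes=$(printf '\r\n' | byte_values)
echo "CR LF -> $newline_codes"
```

Note that the space and the two newline characters are stored as numbers just like the letters are, even though they do not show up as visible symbols in an editor.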

Carriage return, line feed and newline

First there were typewriters!

In an old manual typewriter, there is a lever for the typist to end the current line and start a new one (i.e. the newline operation). The lever had two functions: to feed a line by rotating the cylinder carrying the paper, and to move the cylinder horizontally so that typing starts again at the left margin of the paper. Early computer designs adopted the typewriter system via teletype input/output terminals, and hence adopted the same convention, namely using a carriage return followed by a line feed (CR+LF) to represent a newline. Later, different operating systems adopted different newline conventions. CP/M, MS-DOS and hence all versions of Microsoft Windows retained the CR+LF convention, UNIX uses LF alone, and Apple's Mac OS used CR alone until Mac OS X, which uses LF.

Due to these discrepancies, issues arise when text files are exchanged between different operating systems. This can be easily demonstrated using our Cygwin system.
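As a rough illustration of the three conventions, the same two lines of text can be written out in each style and compared byte by byte. The file names below are made up for this sketch, and printf is used instead of an editor so that the newline characters are under our control.

```shell
# Write the same two lines of text using each newline convention.
# The directory and file names are made up for this illustration.
tmpdir=$(mktemp -d)
printf 'one\ntwo\n'     > "$tmpdir/unix.txt"    # LF    (UNIX, Linux, Cygwin)
printf 'one\r\ntwo\r\n' > "$tmpdir/dos.txt"     # CR+LF (MS-DOS, Windows)
printf 'one\rtwo\r'     > "$tmpdir/oldmac.txt"  # CR    (classic Mac OS)

# od -c shows every byte; CR appears as \r and LF as \n.
od -An -c "$tmpdir/dos.txt"

# Compare the file sizes: CR+LF costs one extra byte per line.
unix_size=$(wc -c < "$tmpdir/unix.txt" | tr -d ' ')
dos_size=$(wc -c < "$tmpdir/dos.txt" | tr -d ' ')
mac_size=$(wc -c < "$tmpdir/oldmac.txt" | tr -d ' ')
echo "LF: $unix_size bytes, CR+LF: $dos_size bytes, CR: $mac_size bytes"
```

The visible text is identical in all three files; only the invisible bytes that separate the lines differ.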

Warning: This example was written in 2006 for Microsoft Notepad version 5.1. The program may change in the future, in which case this demonstration will no longer hold true.

  1. Use nedit to create a file named foo.txt containing:

    line one
    line two
    line three

  2. Then open it with the Notepad program. What we will see is something like:

    line oneline twoline three

This happens because the Notepad program fails to separate lines that end with LF only. However, it should be noted that more and more utilities on either side of the fence (UNIX, Windows) can handle this difference gracefully.

Similarly,

  1. Type the following with Notepad in a file named bar.txt:

    head1
    head2
    head3

  2. Then on Cygwin do the following:

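    cat bar.txt | sed 's/$/tail/g'

This is supposed to add the word tail to the end of each line, so that we would get:

    head1tail
    head2tail
    head3tail

But instead, we get something like:

    tail1
    tail2
    head3tail

The reason is that sed treats the carriage return before each line feed as an ordinary character and appends tail after it. When the terminal prints the carriage return, the cursor jumps back to the start of the line, so tail overwrites the first four letters of head1 and head2. The last line has no carriage return and therefore comes out as intended.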
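One way to investigate output like this is to make the carriage returns visible. The sketch below recreates bar.txt with printf rather than Notepad, assuming (as the output above suggests) that the file has CR+LF between lines and nothing after the last line. The directory name is made up for the illustration.

```shell
# Recreate bar.txt as Notepad is assumed to have saved it: CR+LF between
# lines and no newline after the last line.
workdir=$(mktemp -d)
printf 'head1\r\nhead2\r\nhead3' > "$workdir/bar.txt"

# od -c prints each carriage return as \r, showing that "tail" was
# appended after the CR instead of directly after "head1".
sed 's/$/tail/' "$workdir/bar.txt" | od -An -c

# Deleting the carriage returns first gives the intended result.
fixed=$(tr -d '\r' < "$workdir/bar.txt" | sed 's/$/tail/')
echo "$fixed"
```

Piping suspicious text through od -c is a handy habit in general, because the terminal hides or interprets control characters while od shows them.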

Take care

It is obvious that one has to be pretty careful when handling text files across operating systems. There are two points that we should remember.

  1. Always try to create and use data within the same operating system. For example, if you need to write a small script for Cygwin, use an editor like nedit rather than Notepad.
  2. When there is a dataset that was (or that you suspect was) written on another operating system, convert it. Cygwin has two small utilities to do this.

dos2unix bar.txt will remove the carriage return characters from the Windows/DOS text file bar.txt.

unix2dos foo.txt

adds a carriage return before each line feed in the file foo.txt. Try these with the above files.
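If these two utilities are not installed, roughly the same conversions can be done with tr and sed. The following is only a sketch: the function names to_unix and to_dos are made up, and unlike the real utilities they write the converted text to standard output instead of changing the file in place. Note also that to_unix deletes every carriage return, not only those that precede a line feed.

```shell
# Stand-ins for dos2unix and unix2dos built from tr and sed.
# The function names are made up for this sketch.
to_unix() { tr -d '\r' < "$1"; }                          # drop every CR
to_dos()  { cr=$(printf '\r'); sed "s/\$/$cr/" "$1"; }    # add a CR before each LF

convdir=$(mktemp -d)
printf 'line one\nline two\nline three\n' > "$convdir/foo.txt"

# Convert to the DOS convention and back again.
to_dos  "$convdir/foo.txt"     > "$convdir/foo_dos.txt"
to_unix "$convdir/foo_dos.txt" > "$convdir/foo_back.txt"

# The round trip should reproduce the original file byte for byte.
cmp "$convdir/foo.txt" "$convdir/foo_back.txt" && echo "round trip OK"
```

The carriage return is stored in a shell variable because not every version of sed understands \r in the replacement text.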

Editors for Writing Programs

Well, there are hundreds of them, including the one we are familiar with from these pages, nedit. Perhaps most text editors today are variants of the model on which nedit is based. These cover a range of products from Windows Notepad and Linux gedit to the built-in editors of many Integrated Development Environments like Eclipse. One salient feature of these editors is that it is very easy to start using them.

However, when one starts spending considerable time writing on the computer (especially programs and other structured text), it becomes increasingly profitable to learn one of the fast editors. There are basically two modern alternatives: Vim or Emacs. These editors have steep learning curves, making them hard to learn in the beginning. However, once one spends several hours learning the basics, the long-term reward is an editing speed and efficiency that is nearly impossible to achieve with the other category of editors.

I use vim. However, one should respect others' religious convictions, so it is my duty to say that both editors are equally good. (Though I don't know a damn thing about Emacs!)

You need a good web browser

Simple! Just download the Firefox browser and be happy ever after :-) (at least for the foreseeable future!). I am not just trying to be different from the 'masses' here. Simply trust me on this one: install it, load a web page and press 'Ctrl+T'. Tabbed browsing is one of those simple improvements that endear a tool to the user. I have been using this feature in my browser (first Mozilla and then Mozilla Firefox) since 2001 and I cannot imagine browsing the internet without it.

Spell checker for your browser

When was the last time you typed into a large text area on a web page? (A good example is a webmail service like Gmail or Yahoo! Mail.) Before pressing the 'Go' button, it is better to correct those misspelled words. This can be done by copying and pasting the text into your word processor (e.g. Microsoft Word), checking the spelling and pasting it back. But that is a lot of work! It is better to have a spell checker built into the browser, always at your service.

Spellbound is an extension for Mozilla Firefox that does just that. Follow the link below to learn how it works and how to install it.