A Productive Desktop Environment for Scientists and Engineers - Part III

From assela Pathirana

Revision as of 13:13, 27 April 2006

Get a good text editor

Text editors were discussed briefly before. This article gives a more thorough explanation.

I have met at least one person who used to edit source code using Microsoft Word! While this is certainly possible, there are far easier ways of doing the same thing. In the Windows environment, perhaps the most basic tool is the Notepad application. Before we dive into the world of text editors on different platforms, it is beneficial to have some background knowledge of text files.

What exactly is text data?

All information handled by a computer is stored as binary digits (bits), which are usually grouped into blocks called bytes or words. Such a block is normally the smallest unit of binary data on which a meaningful calculation can be done. Each block is essentially a numerical value (e.g. an 8-bit byte can be used to represent a number from 0 to 255).
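As a quick check of the range quoted above, the following sketch counts the values an 8-bit byte can hold. The variable names are illustrative only.

```python
# An 8-bit byte has 2**8 distinct bit patterns.
BITS_PER_BYTE = 8
patterns = 2 ** BITS_PER_BYTE      # number of distinct values
largest = patterns - 1             # values run from 0 up to this number

# The largest value, written in binary, is eight 1-bits.
largest_in_binary = format(largest, "b")

print(patterns, largest, largest_in_binary)
```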

In order to represent human-language characters using computer words, there are generally agreed-upon conventions known as character encodings. For example, the ASCII encoding defines 128 characters mapped to the numbers 0 to 127. These include printable characters (alphanumerics and some symbols) and control characters (e.g. the line feed character). Another popular encoding system is Unicode. Most of the control characters have become largely obsolete; notable exceptions are carriage return and line feed. And these two cause a problem when we move text files between operating systems!
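To make the idea of an encoding concrete, here is a minimal Python sketch that shows the ASCII numbers behind a short piece of text, including the two control characters discussed here. The sample string is made up for illustration.

```python
text = "Hi\r\n"   # two printable characters followed by CR and LF

# Encode the text as ASCII: each character becomes one number from 0 to 127.
codes = list(text.encode("ascii"))
print(codes)      # [72, 105, 13, 10]

# Carriage return (CR) and line feed (LF) are control characters 13 and 10.
CR = ord("\r")
LF = ord("\n")
print(CR, LF)
```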

Carriage return, line feed and newline

First there were typewriters!

On an old manual typewriter, there was a lever the typist used to end the current line and start a new one (i.e. the newline operation). The lever had two functions: to feed a line by rotating the cylinder carrying the paper, and to move the cylinder horizontally so that typing resumed at the left margin of the paper. Early computer designs adopted the typewriter system via teletype input/output terminals and hence adopted the same convention, namely using a carriage return followed by a line feed (CR+LF) to represent a newline. Later, operating systems adopted different newline conventions. CP/M, MS-DOS and hence all versions of Microsoft Windows retained the CR+LF convention, UNIX uses LF, and Apple's Mac OS used CR until Mac OS X switched to LF.
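The three conventions can be told apart by looking at the raw bytes of a file. The following Python sketch is one simple way to do so; the function name detect_newline is hypothetical, and a file with mixed line endings is reported by the first rule that matches.

```python
def detect_newline(data: bytes) -> str:
    """Guess the newline convention used in raw file contents."""
    if b"\r\n" in data:
        return "CR+LF"   # CP/M, MS-DOS, Windows
    if b"\n" in data:
        return "LF"      # UNIX
    if b"\r" in data:
        return "CR"      # classic Mac OS
    return "none"        # no line ending found


print(detect_newline(b"line one\r\nline two\r\n"))
print(detect_newline(b"line one\nline two\n"))
print(detect_newline(b"line one\rline two\r"))
```

To try it on a real file, read the file in binary mode (for example open("foo.txt", "rb").read()) so that Python does not translate the line endings first.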

Due to these discrepancies, issues arise when text files are exchanged between different operating systems. This can be easily demonstrated using our Cygwin system.

Warning: This example was written in 2006 for Microsoft Notepad version 5.1. The program may change in the future, in which case this demonstration may no longer hold true.

  1. Use nedit to create a file named foo.txt containing:

     line one
     line two
     line three

  2. Then open it with the Notepad program. What we will see is something like:

     line oneline twoline three

This is because the Notepad program does not treat a lone LF as a line break. However, it should be noted that more and more utilities on either side of the fence (UNIX, Windows) can handle this difference gracefully.
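The effect can be imitated in Python. The sketch below is only a simplified model: it assumes that a strict reader splits text on CR+LF alone, which is a stand-in for the Notepad behaviour described above and not a description of how Notepad is implemented.

```python
# What an LF-only editor such as nedit would save for the three lines.
unix_text = "line one\nline two\nline three\n"

# A reader that recognises only CR+LF as a line break finds no breaks at all.
crlf_only = unix_text.split("\r\n")
print(len(crlf_only))   # 1

# A tolerant reader that accepts LF, CR and CR+LF finds all three lines.
tolerant = unix_text.splitlines()
print(tolerant)         # ['line one', 'line two', 'line three']
```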

Similarly,

  1. Type the following with Notepad in a file named bar.txt:

     head1
     head2
     head3

  2. Then in Cygwin do the following:

     cat bar.txt | sed 's/$/tail/g'

What this is supposed to do is add the word tail to the end of each line, so that we would get

     head1tail
     head2tail
     head3tail

But instead, we get something like:

     tail1
     tail2
     head3tail
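A likely explanation for this output is that sed treats only LF as the end of a line, so the word tail is placed after the carriage return left by Notepad. When the terminal prints the line, the carriage return moves the cursor back to the left margin and tail overwrites the first four characters. The Python sketch below imitates both steps. The helper names are hypothetical, and it assumes bar.txt has no line ending after its last line, which is consistent with head3tail appearing intact.

```python
def sed_append(text, suffix):
    """Mimic sed 's/$/suffix/': split on LF only and append suffix to each line."""
    return [line + suffix for line in text.split("\n")]


def render(line):
    """Show a line as a terminal would: CR moves the cursor back to column 0."""
    screen = []
    col = 0
    for ch in line:
        if ch == "\r":
            col = 0                    # carriage return: back to the left margin
        else:
            if col < len(screen):
                screen[col] = ch       # overwrite what is already displayed
            else:
                screen.append(ch)
            col += 1
    return "".join(screen)


bar = "head1\r\nhead2\r\nhead3"        # bar.txt as saved with CR+LF line endings
shown = [render(line) for line in sed_append(bar, "tail")]
print("\n".join(shown))
```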

Take care

It is obvious that one has to be pretty careful when handling text files across operating systems. There are two points that we should remember.

  1. Always try to create and use data within the same operating system, e.g. if you need to write a small script for Cygwin, use an editor like nedit rather than Notepad.
  2. When a dataset was written (or you suspect it was written) on another operating system, convert it. Cygwin has two small utilities to do this.

dos2unix bar.txt

removes the carriage return characters from the Windows/DOS text file bar.txt.

unix2dos foo.txt

adds a carriage return before each line feed in the file foo.txt. Try these on the files created above.
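What these conversions amount to can be sketched in a few lines of Python. The two functions below are simplified stand-ins written for illustration; they are not how the dos2unix and unix2dos utilities are implemented, and the function names are hypothetical.

```python
def dos_to_unix(data: bytes) -> bytes:
    """Replace every CR+LF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


def unix_to_dos(data: bytes) -> bytes:
    """Put a CR before every LF, without doubling any CR already present."""
    return dos_to_unix(data).replace(b"\n", b"\r\n")


windows_bytes = b"head1\r\nhead2\r\n"
unix_bytes = b"line one\nline two\n"

print(dos_to_unix(windows_bytes))
print(unix_to_dos(unix_bytes))
```

Working on bytes rather than text keeps Python from translating the line endings on its own while the file is read or written.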

You need a good web browser

Simple! Just download the Firefox browser and be happy ever after :-) (at least for the foreseeable future!). I am not just trying to be different from the 'masses' here. Simply trust me on this one: install it, load a web page and press 'Ctrl+T'. Tabbed browsing is one of those simple improvements that endear a tool to the user. I have been using this feature in my browser (first Mozilla and then Mozilla Firefox) since 2001 and I cannot imagine browsing the internet without it.

Spell checker for your browser

When was the last time you typed into a large text area on a web page? (A good example is a webmail service like Gmail or Yahoo! Mail.) Before pressing the 'Go' button, it is better to correct any misspelled words. This can be done by copying and pasting the text into your word processor (e.g. Microsoft Word), checking the spelling and pasting it back. But that is a lot of work! It is better to have a built-in spell checker in the browser, always at your service.

Spellbound is an extension for Mozilla Firefox that does just that. Follow the link below to learn how it works and how to install it.