Global Site Seer ver 1.0
Solve a CSI problem step-by-step
The Global Site Seer is a small package developed by
System requirements:
- IBM PC with Pentium III CPU or higher
- 256 MB RAM or higher (more may be needed for larger sequence data)
- Microsoft Windows 98/2000/XP OS
This package contains:
- An executable file (GlobalSiteSeer.exe),
- An instruction file (User's Guide.htm),
- 8 dynamic-link libraries (*.dll) from LINGO 8.0,
- Several test data files (*.gss).
Step 1: Decompress the package to a directory.
Step 2: Run "GlobalSiteSeer.exe".
Step 1: Edit DNA sequences
The main interface, as depicted in Fig. 1, is a simple
text editor window for editing DNA sequences. It provides basic functions such
as creating, opening, saving, printing and editing files. Each line represents
a single DNA sequence, and no line feed is allowed within a sequence. Sequences
of various lengths are allowed. The sequence data should contain only the
characters A, T, C and G.

Fig. 1. Main interface of Global Site Seer
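The input rules above (one sequence per line, only the characters A, T, C and G) can be checked before loading a file into the editor. The following is a minimal sketch in Python, not part of Global Site Seer itself. The function name validate_sequences is hypothetical, and it assumes that only uppercase letters are accepted and that blank lines are not reported, since the guide does not say how the program treats either case.

```python
def validate_sequences(text):
    """Return (line_number, invalid_characters) for every line that breaks the rule."""
    allowed = set("ATCG")  # the only characters the guide permits
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Assumption: lowercase letters count as invalid; blank lines pass.
        bad = sorted(set(line) - allowed)
        if bad:
            problems.append((number, "".join(bad)))
    return problems


# Three sequences of different lengths; the second contains an invalid 'X'.
sample = "ATCGGA\nTTACGX\nGGC"
print(validate_sequences(sample))
```

An empty result means every line contains only the four permitted characters.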
Step 2: Specify site format and constraints
To specify the format of the common site and the
logical constraints, use the "Model/Build Model" instruction on the menu bar or
the target-shape button on the toolbar. Figure 2 illustrates the steps of
building and solving the LP model.
The dialog that appears first (Fig. 2a) is used to specify
the format of the common site and the logical constraints. Two characters, 'N'
and 'X', describe the site format: each 'N' represents a single letter
and each 'X' represents an ignored letter. In the example in the paper (the CRP
binding site of the lac operon in the E. coli genome), the site format is
"NNNNNXXXXXXNNNNN". The complementary relationships between letters inside the
site are specified in the form "m:n" (that is, the mth and nth letters are
complementary). When more than one constraint is imposed, they should be
separated by commas.

Fig. 2a. Interface of specifying format and constraints.
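To make the format and constraint notation concrete, here is a minimal sketch in Python of how a candidate site could be checked against them. It is an illustration only, not the program's own logic. The function names are hypothetical, the constraint string "1:16,2:15" is a made-up example rather than the one used in the paper, and the sketch assumes that m and n are 1-based positions counted over the whole format string (including 'X' positions) and that "complementary" means the usual A-T and C-G base pairing.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def parse_constraints(text):
    """Parse a comma-separated list such as '1:16,2:15' into (m, n) pairs."""
    pairs = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty constraint field
        m, n = item.split(":")
        pairs.append((int(m), int(n)))
    return pairs


def site_matches(site, site_format, constraints):
    """Check a candidate site against the format length and the complement pairs."""
    if set(site_format) - set("NX"):
        raise ValueError("site format may contain only 'N' and 'X'")
    if len(site) != len(site_format):
        return False
    for m, n in constraints:
        # Assumption: positions are 1-based over the full format string.
        if COMPLEMENT[site[m - 1]] != site[n - 1]:
            return False
    return True


site_format = "NNNNNXXXXXXNNNNN"                # format quoted in the guide
constraints = parse_constraints("1:16,2:15")    # hypothetical constraints
print(site_matches("TGTGACCCCCCTCACA", site_format, constraints))
print(site_matches("TGTGACCCCCCTCACG", site_format, constraints))
```

The first candidate satisfies both pairs (T with A, G with C); the second fails because its 16th letter is not the complement of its 1st.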
Step 3: Solve the formulated problem
After the site format and constraints are assigned,
click the "Translate Model" button to translate the assigned problem into a
linear optimization model like Model 2 in the paper. Figure 2b illustrates the
generated LP model, which can be solved by clicking the "Solve!" button. A
dialog box like that in Fig. 2c appears and presents information such as
elapsed time as the solving proceeds.

Fig. 2b. Generated LP model

Fig. 2c. Solving status
Step 4: Solution report
When the optimal solution has been found, a solution
report is generated in a dialog box, as shown in Fig. 2d.

Fig. 2d. Solution report
Global Site Seer provides several test data files
(*.gss) for examination. All the test data files are extended from E. coli
genome sequences, which are taken directly from Stormo and Hartzell (1989).
The test data files are named as follows:
Ex. DNA18-105.gss
"DNA" : This is a DNA data file
"18" : Number of sequences
"-105" : The length of each sequence
Use this naming rule to find the test data file you want.
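The naming rule can also be read mechanically. The sketch below, in Python, splits a file name such as DNA18-105.gss into its three parts. The function name parse_test_file_name and the regular expression are hypothetical; the pattern assumes a prefix made of letters, although "DNA" is the only prefix this guide documents.

```python
import re

# Assumed pattern: letters, sequence count, hyphen, sequence length, ".gss".
NAME_PATTERN = re.compile(r"^([A-Za-z]+)(\d+)-(\d+)\.gss$")


def parse_test_file_name(name):
    """Split a test data file name into data type, sequence count and length."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError("not a recognized test data file name: " + name)
    kind, count, length = match.groups()
    return {"type": kind, "sequences": int(count), "length": int(length)}


print(parse_test_file_name("DNA18-105.gss"))
```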
If you have any questions, please send an e-mail to cjfu@iim.nctu.edu.tw.