Every time we request a web page in a browser, the browser creates a connection with a web server, sends a request and receives the requested page before displaying it on our screen. A basic representation of such a request is shown in the diagram below:
Generally, the pages requested by web browsers are static web pages held in the web server's local repository. All the web server has to do is locate the requested page and return it to the browser. A static web page is one whose content does not change until its owner modifies it.
However, some applications require information not from pre-written pages but from pages generated in response to user input. For example, when searching for an article on the web, the results are generated from the search words entered by the user. Such web pages are said to be generated dynamically. Being able to produce output based on what the user has entered makes web servers much more interactive. In these situations, the server has to process the submitted information and generate a page to send back to the user. One way web servers achieve this is through the Common Gateway Interface.
Before moving on to how dynamic pages are created, the following section explains how a static web page is requested and retrieved from a server.
Basic HTTP Requests
Every time a user requests a web page by clicking on a link or typing its address into the address bar, the web browser breaks the URL down into three main components:
– The protocol, e.g. http, ftp, https
– The server, e.g. www.google.com
– The filename, e.g. /file.html
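The breakdown above can be sketched with Python's standard urllib.parse module. This is only an illustration of the idea, not how any particular browser implements it; the helper name split_url is an assumption made for this example.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return the (protocol, server, filename) parts of a URL."""
    parts = urlsplit(url)
    # An empty path means the site's root document, "/".
    return parts.scheme, parts.hostname, parts.path or "/"

protocol, server, filename = split_url("http://www.google.com/file.html")
print(protocol)   # the protocol part
print(server)     # the server part
print(filename)   # the filename part
```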
Every web server has an IP address and, usually, a domain name. The browser looks up the IP address for the server's domain name (through DNS) and uses it to connect to the server. Once the connection is formed, the browser follows the HTTP protocol and sends a GET request asking for the web page. The server processes the request, locates the required page and sends back an HTTP response containing the HTML code for the page. The web browser then interprets the HTML tags and displays the web page.
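To make the exchange more concrete, here is a minimal sketch of the text of a GET request and of how a response can be separated into its parts. It does not open a network connection. The function names and the sample response are assumptions made for illustration, and a real browser sends many more headers.

```python
def build_get_request(server, filename):
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {filename} HTTP/1.1",
        f"Host: {server}",
        "Connection: close",
        "",   # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)

def split_response(raw):
    """Split a raw HTTP response into (status line, headers dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return status_line, headers, body

request = build_get_request("www.google.com", "/file.html")

# A made-up response of the kind a server might return for a static page.
sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body>Hello</body></html>"
)
status, headers, html_code = split_response(sample_response)
```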
Using the Common Gateway Interface
The Common Gateway Interface (CGI) is a protocol that defines the rules for transmitting information between a web server and an external program. According to webopedia.com, any program designed to accept and return data that conforms to the CGI specification can be classified as a CGI program. Using such programs allows web servers to interact dynamically with users. CGI scripts can be written in any programming language, such as C, Perl or Java.
Basic Overview of the use of CGI:
The database example mentioned on the web page at hoohoo.nesa shows how CGI is used:
If a person wants to share a database with the world by connecting it to the web, a CGI program is needed to transmit information between the web browser and the database engine. When a client requires some information from the database, the client's web browser sends a request to the web server. The web server, using CGI, executes a CGI script, which queries the database engine, processes the results and returns them to the web browser through the server. As the script is processed on the web server itself, this technique is known as server-side scripting.
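The hand-off between server and external program can be sketched as follows. Under CGI, the server passes request details to the program through environment variables such as REQUEST_METHOD and QUERY_STRING, and the program writes a header block, a blank line and the page to its standard output. The tiny stand-in program and the run_cgi helper below are assumptions made for illustration; no database is involved and a real server sets more variables.

```python
import os
import subprocess
import sys

# Stand-in CGI program (hypothetical): reads the query from the environment
# and writes a header, a blank line, and an HTML body to standard output.
CGI_PROGRAM = r'''
import html
import os
query = os.environ.get("QUERY_STRING", "")
print("Content-Type: text/html")
print()
print("<p>Query received: " + html.escape(query) + "</p>")
'''

def run_cgi(query_string):
    """Run the CGI program roughly as a web server would; return its output."""
    env = dict(os.environ, REQUEST_METHOD="GET", QUERY_STRING=query_string)
    result = subprocess.run(
        [sys.executable, "-c", CGI_PROGRAM],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout

output = run_cgi("title=cgi")
print(output)
```

Running the program as a separate process is the point of the sketch: the server and the script only share the environment and the standard streams.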
The diagram below illustrates the use of CGI:
[Reference: http://www.webdevelopersnotes.com/basics/client_server_architecture.php3, 23/11/08]
Explanation of how CGI is used:
Generally, the input for a CGI script is received from an HTML form on a web page. For example, the form shown below:
The web page simulates a basic online survey page. The form is created to take in four separate items of data. Once the form is filled in and the user presses the Submit button at the bottom of the page, an HTTP request is sent to the server, requesting the file specified in the action attribute of the form. For this form, that file is a CGI script.
The information that the user has entered in the form can be submitted by two methods: GET and POST. With the GET method, the form data is appended to the URL of the CGI program, so all the data is visible in the address bar as name/value pairs.
Shown below is how the address bar would look if the GET method was used:
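As a rough sketch of how such a URL is put together, the snippet below appends encoded name/value pairs to a script address. The host www.example.com and the field names name and age are hypothetical; they are not taken from the survey form.

```python
from urllib.parse import urlencode

def build_get_url(action_url, form_data):
    """Append form data to the script URL as name=value pairs."""
    return action_url + "?" + urlencode(form_data)

url = build_get_url(
    "http://www.example.com/cgi-bin/mycgi.pl",   # hypothetical host
    {"name": "Jane Doe", "age": "30"},           # hypothetical fields
)
print(url)
```

Spaces and other special characters are encoded, which is why the value "Jane Doe" does not appear literally in the result.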
In the POST method, the data is placed in the body of the HTTP request message rather than in the URL. If the data should not appear in the address bar, for example the username and password for an email account, POST is the more suitable method. POST on its own does not encrypt the data; that requires HTTPS. For this form, I have used the POST method to send the data.
Shown below is the HTTP request that is sent on submitting the form:
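For illustration only, a POST request of this kind could be assembled as in the sketch below. The host and field names are hypothetical, and a real browser would include additional headers.

```python
from urllib.parse import urlencode

def build_post_request(server, script_path, form_data):
    """Build the text of a minimal form-submission POST request."""
    body = urlencode(form_data)   # same name=value encoding as with GET
    headers = [
        f"POST {script_path} HTTP/1.1",
        f"Host: {server}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('ascii'))}",
    ]
    # Headers, a blank line, then the form data as the message body.
    return "\r\n".join(headers) + "\r\n\r\n" + body

request = build_post_request(
    "www.example.com", "/cgi-bin/mycgi.pl",   # hypothetical host
    {"name": "Jane Doe", "age": "30"},        # hypothetical fields
)
print(request)
```

The data is encoded exactly as in the GET case; the only difference is where it travels in the request.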
Upon receiving the request from the client's browser, the server executes the CGI script, which is located in the cgi-bin directory. The script then begins to process the request. As all the data is packaged into a single string when it is sent to the script, it has to be separated into its individual name/value pairs.
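The separation step might look like the following sketch, which splits on "&" and "=" and then decodes each part. The function name and sample field names are assumptions made for illustration; Python's standard library also offers urllib.parse.parse_qs for the same job.

```python
from urllib.parse import unquote_plus

def split_form_data(raw):
    """Split 'a=1&b=2' style form data into a dict of decoded pairs."""
    fields = {}
    for pair in raw.split("&"):
        if not pair:
            continue   # skip empty segments, e.g. from an empty string
        name, _, value = pair.partition("=")
        # "+" stands for a space and %XX for an encoded character.
        fields[unquote_plus(name)] = unquote_plus(value)
    return fields

fields = split_form_data("name=Jane+Doe&age=30&email=jane%40example.com")
print(fields)
```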
Once the data items have been split, the script can carry out further processing, such as validation.
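A validation step could be sketched as below. The three fields and their rules are assumptions chosen for illustration, since the actual checks depend on the form.

```python
def validate(fields):
    """Return a list of error messages; an empty list means the data is valid."""
    errors = []
    if not fields.get("name", "").strip():
        errors.append("Name is required.")
    age = fields.get("age", "")
    # isdecimal() guarantees that int() will accept the string.
    if not age.isdecimal() or not 0 < int(age) < 130:
        errors.append("Age must be a whole number between 1 and 129.")
    if "@" not in fields.get("email", ""):
        errors.append("Email address must contain '@'.")
    return errors

good = validate({"name": "Jane", "age": "30", "email": "jane@example.com"})
bad = validate({"name": " ", "age": "abc", "email": "nope"})
```

Collecting every error rather than stopping at the first lets the generated page report all problems at once.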
After processing, the resulting output is returned to the web browser in the form of HTML code.
In the form I have created, the mycgi.pl file is requested from the server upon submission. This script uses the split function to separate the various data items, and each item is validated. A web page is then dynamically generated by the script, displaying either error messages if any field has been filled in incorrectly, or the data that the user entered into the survey.
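The final step, generating the page, can be sketched in Python as follows. This is not the mycgi.pl script itself, only an illustration of the same idea; the function name, headings and field names are assumptions. A CGI program's output begins with a Content-Type header and a blank line, followed by the HTML.

```python
from html import escape

def render_page(fields, errors):
    """Return a CGI response: header, blank line, then the HTML page."""
    if errors:
        items = "".join(f"<li>{escape(e)}</li>" for e in errors)
        body = f"<h1>Please correct the following</h1><ul>{items}</ul>"
    else:
        # Escape user input so it is displayed as text, not treated as HTML.
        rows = "".join(
            f"<li>{escape(k)}: {escape(v)}</li>" for k, v in fields.items()
        )
        body = f"<h1>Thank you</h1><ul>{rows}</ul>"
    return "Content-Type: text/html\r\n\r\n" + f"<html><body>{body}</body></html>"

print(render_page({"name": "Jane Doe", "age": "30"}, []))
```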