Building Websites – HTML, CSS, PHP and MySQL
November 2009 meeting – report by Ian Macfarlane
Steve started by saying that his talk about websites was originally planned for the New Year, and that he had intended to write it during the Christmas break, which prompted Rick Sterry to quip that he could enjoy his Christmas properly now! It wasn’t going to be a tutorial on how to write a website – he wasn’t going to give us a blow by blow account of how to do things. The main aim was to provide some background as to how websites work.
As he had begun to put his thoughts for this talk together, he realised that there were a huge number of three letter acronyms, for example: HTML (not in fact a TLA), URL, ISP, FTP, CSS, SSI, CGI, PHP, SQL... He was going to explain these terms – what they are and what they do. If anyone wanted to follow any of this up, then he suggested finding a book on the subject(s) and asking on the Club forum.
‘What is a website?’ he asked. The simple answer was that it was just a collection of files on a computer – on your own machine or, more commonly, on a web-server somewhere else. He had a copy of NetSurf version 2.1 (the RISC OS web browser), which he displayed on the screen.
It was just a matter of giving the browser a file, and here he proceeded to open his preferred text editor, StrongED, then typed “Hello World!” and saved the file to disc as a file named ‘TextFile’. He then ‘dropped’ it into NetSurf and NetSurf loaded it. He admitted that this was not very exciting. Files that were stored locally work as simple sites, which can be maintained easily as long as the site doesn’t involve anything more than basic HTML – Hyper Text Markup Language.
For ‘real’ use on the web, a web server is required. If you have a website, whoever ‘hosts’ your website – your Internet Service Provider, or ISP – will be supplying the web-server as well. That evening, rather than going ‘on line’ to the World Wide Web (WWW), Steve had brought a web-server with him.
This was a copy of Apache 2, which is one of the more ‘established’ servers. It originated in the mid-1990s and, according to the web-based encyclopedia Wikipedia, was responsible for 100 million sites by 2009. The Club’s website runs on Apache, on Purley Hosting’s servers. That evening Steve had it running on his laptop, and reached it by pointing NetSurf at the laptop’s IP (Internet Protocol) address.
Uploads via FTP
Steve then answered the question ‘How do we get the text file on to the server?’ Most ISPs still use FTP (File Transfer Protocol) for uploads to sites, and on RISC OS the favourite tool has for some time been Colin Granville’s FTPc, a graphical client that uses the familiar RISC OS filer windows to show files and folders. He then started up FTPc on the RISC OS machine, entered the details for the laptop and logged in with the password.
He had problems here with firewalls between the two systems and gave us a tip to switch from active to passive mode to get through the problem. The list of files that were stored on the server came up on the screen. He said it was important to get the file extensions correct. Our example ‘TextFile’ (see above) had to have a file extension of ‘.txt’ to be handled properly by the server. This text file was then displayed by NetSurf, but was not very illustrative of typical web files.
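For readers who would rather script an upload than use FTPc, the passive-mode tip and the file-extension rule can be sketched in Python. This is only an illustration and not what was shown on the night: the helper names remote_name and upload_text_file are made up for this report, and the host, user name and password in the closing comment are placeholders.

```python
from ftplib import FTP
from pathlib import Path


def remote_name(local_name: str, extension: str = ".txt") -> str:
    """Return the name to use on the server, adding an extension if none is present."""
    return local_name if Path(local_name).suffix else local_name + extension


def upload_text_file(ftp: FTP, local_path: str) -> str:
    """Upload one file over an FTP connection that is already logged in."""
    # Passive mode: the client opens the data connection itself, which is
    # usually what gets a transfer through a firewall.
    ftp.set_pasv(True)
    name = remote_name(Path(local_path).name)
    with open(local_path, "rb") as handle:
        ftp.storbinary(f"STOR {name}", handle)
    return name


# Typical use (placeholder details, not a real server):
#     with FTP("ftp.example.org") as ftp:
#         ftp.login("username", "password")
#         upload_text_file(ftp, "TextFile")
```

The file is stored as ‘TextFile.txt’ on the server, so that the server has an extension to work from.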
Adding some HTML formatting
All web pages are formatted in HTML, which is a markup language and not a programming language. It allows layouts and formatting to be applied to text, in effect allowing the designer of the web page to present the information in his chosen style to the viewer. However he did explain that HTML was restricted in its scope: it was not intelligent.
Steve then altered the example text file to demonstrate one or two of the HTML functions and then saved the file again and uploaded it to the server. He changed the file extension type to ‘.html’ and refreshed the webpage, for NetSurf to show the changes that occurred with these modifications. It was important to realise that it was the server that decided the file type of the file based on this extension. The server then notified the browser software of the file type of the file that it was about to send.
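The way a server picks a file type from the extension can be imitated with Python's standard mimetypes module. This is a rough sketch rather than Apache's own logic: the function name content_type_for is invented here, and the fallback type for a file with no extension is an assumption, since real servers are configurable on that point.

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess the Content-Type a server might announce for a file name."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # With no recognisable extension there is nothing to go on, so fall back
    # to a generic binary type (an assumption made for this sketch).
    return guessed or "application/octet-stream"


for name in ("TextFile", "TextFile.txt", "TextFile.html"):
    print(name, "->", content_type_for(name))
```

The browser never sees the extension logic itself; it is only told the resulting type in the response headers, which is why renaming the file on the server changes how the same bytes are displayed.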
He explained about looking after the files on a web site and talked about SiteMatch, a piece of RISC OS software written by Dave Edwards and Richard Porter, which looks after web sites on the server, by having a copy of the website on the web designer’s computer. When changes are made to the website in the web designer’s copy then SiteMatch detects these changes and updates the server’s copy again automatically. This could be achieved manually using FTPc, but with more complicated web sites this becomes a chore and it is difficult to prevent errors creeping in. Also only those files which have changed are transferred, making the updating process efficient. Steve demonstrated this by running a file comparison using SiteMatch. He changed a file name to demonstrate that SiteMatch would find the change.
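The general idea of transferring only what has changed can be sketched as follows. This is not a description of how SiteMatch works internally, which the talk did not cover; it simply compares a checksum for each file in two copies of a site. The function names snapshot and plan_sync are made up for the example.

```python
import hashlib
from pathlib import Path


def snapshot(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to a SHA-256 digest of its contents."""
    base = Path(root)
    return {
        path.relative_to(base).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(base.rglob("*"))
        if path.is_file()
    }


def plan_sync(local: dict[str, str], remote: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (files to upload, files to delete on the server)."""
    # Upload anything that is new or whose contents differ from the server's copy.
    upload = sorted(name for name, digest in local.items() if remote.get(name) != digest)
    # Delete anything on the server that no longer exists locally.
    delete = sorted(name for name in remote if name not in local)
    return upload, delete
```

In this simple scheme a renamed file appears as one upload plus one deletion, and unchanged files are left alone.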
Steve then switched to using Sunfish for server access, as this software would make things far easier with his own server; however, most ISPs won’t offer access via the protocol that Sunfish uses. Steve showed several more features of HTML, including the division of a web page into two distinct sections, the head and the body. When he added these constructs to the example file, the resulting web page displayed in NetSurf now had a title.
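To make the two sections concrete, here is a small page held in a Python string, together with a short parser that picks out the title from it. The page text and the title ‘My First Page’ are invented for illustration and are not Steve's actual example file.

```python
from html.parser import HTMLParser

# A minimal page: the title sits in the first section, the visible text in the body.
PAGE = """<!DOCTYPE html>
<html>
  <head>
    <title>My First Page</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>"""


class TitleFinder(HTMLParser):
    """Collect the text found between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


finder = TitleFinder()
finder.feed(PAGE)
print(finder.title)
```

A browser does much the same job: the title goes to the window's title bar, while only the body is drawn in the page area.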
Colin Sutton wanted to know if the testing procedure of web pages could be shortened from the one Steve was using. Steve agreed that there was, as Colin had indicated, a shortcut method using OLE where the page would update in the browser as soon as it was saved in the editor, but this was only available if the process was taking place locally: wholly on RISC OS, and not with the pages saved on a server.
Using HTML the font could be changed by using the appropriate function tags, and even by abusing semantic markup using embolden and citation tags. This got certain web page creators (purists) annoyed, since they believed that HTML was about content and not about appearance. However others who came to the internet later with commercial agendas forced the balance towards appearance and the use of graphical presentation.
Steve said that this had now come full circle and content and presentation were now in balance. Rick commented that in any walk of life there was a conflict of style against substance, to which Steve agreed. Steve showed an example of the difference, showing that a screen reader for blind web users would interpret the two texts differently and that only one would make sense to the (visually impaired) listener.
Cascading Style Sheets (CSS)
An advanced markup function was found to be necessary to help with this problem. Cascading Style Sheets could now apply styles to a page in much the same way that Impression or Ovation use styles. CSS allows much more flexibility in the design of a webpage, while keeping the underlying semantic meaning of the text. Unfortunately, modern CSS is only properly supported by NetSurf and Firefox on RISC OS.
The style parameters were defined at the start of the webpage and he worked through an example on the screen. The styles of particular words or of paragraphs could be specified. This makes the laying out of pages much easier and more readable to the webmaster for subsequent maintenance.
The style parameters may exist in a separate file (style sheet) and so can be called upon by many web pages, enabling a consistent style to exist throughout a web site. Steve demonstrated such an arrangement and how the changing of one parameter in the style sheet file would produce a change throughout a website. Another point was that the page was now much easier to maintain. Additionally Steve demonstrated how easy it was to change the layout with CSS: by changing the style slightly, images can be added for bullet points in lists, for example.
Steve turned to the question of what would happen to the display of the website if it was viewed on a screen using a browser without CSS capability. The page was still readable, but wasn’t formatted in the manner that was laid out by the designer.
Server Side Includes (SSI)
Steve went on to show that a page could be made more interactive, but this did depend on the software in the Internet Service Provider’s computer – the server. Steve showed an example of this software in the Apache suite, and used it to insert the date and time that the page was downloaded into the text. Since this software is dynamic – interacting with each user – it is generally not offered with free webhosting due to the extra work the server must do. Such webpages can sometimes be recognised from their webaddress extensions of ‘.shtml’.
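What the server does with such a page can be imitated in a few lines of Python. The directive shown is in the style that Apache uses for SSI, but the expansion code is only a sketch written for this report: the function name expand_echo, the date format and the sample date are all assumptions, and only the one variable is handled.

```python
import re
from datetime import datetime

# Matches directives of the form <!--#echo var="NAME" -->
ECHO = re.compile(r'<!--#echo\s+var="(\w+)"\s*-->')


def expand_echo(page: str, now: datetime) -> str:
    """Replace echo directives with values, as a server would before sending the page."""
    variables = {"DATE_LOCAL": now.strftime("%Y-%m-%d %H:%M:%S")}
    # Directives naming a variable this sketch does not know are left untouched.
    return ECHO.sub(lambda m: variables.get(m.group(1), m.group(0)), page)


source = '<p>Page served at <!--#echo var="DATE_LOCAL" --></p>'
print(expand_echo(source, datetime(2009, 11, 1, 20, 0, 0)))  # arbitrary sample date
```

Because the substitution happens on the server for every request, the visitor's browser only ever receives ordinary HTML.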
Steve explained that, because they were provided by the server, SSI features could be used by all RISC OS browsers. For users whose ISPs don’t provide SSI facilities, there is a RISC OS application called WebChange, by Vince Hudd, which can scan the pages of a site before uploading it via FTP. It can look for parameters in the pages, and modify the values seen by the site’s visitors in their browsers. Steve assured us that WebChange worked very well.
Steve talked about the need to have all the pages on a website looking the same for consistency and how this could be achieved by using templates. Instead of a body tag and heading tags, the webpage could call up a template file through SSI, which contained the common heading and layout sections for all the pages on the site. He demonstrated what happened to webpages when their common template file was altered. Again, site maintenance is made much easier by the use of templates.
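The template idea can be sketched in the same way. Here each page contains an include directive naming a shared file, and the sketch pastes that file's contents in place of the directive. The function name expand_includes and the file name header.html are hypothetical, and a real server would also refuse paths that try to climb out of the site's folder, which this sketch does not attempt.

```python
import re
from pathlib import Path

# Matches directives of the form <!--#include virtual="/some/file.html" -->
INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(page: str, site_root: str) -> str:
    """Replace each include directive with the contents of the file it names."""
    root = Path(site_root)

    def load(match: re.Match) -> str:
        # Paths are treated as relative to the top of the site.
        return (root / match.group(1).lstrip("/")).read_text(encoding="utf-8")

    return INCLUDE.sub(load, page)


# Typical use, with header.html stored at the top of the site:
#     expand_includes('<!--#include virtual="/header.html" -->\n<p>News</p>', "site")
```

Changing the one shared file then changes every page that includes it, which is the maintenance saving described above.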
Common Gateway Interface (CGI)
This allows the designer to generate a piece of the webpage on-the-fly using a bit of software, which could be written in any language that can be run on the server. Steve demonstrated a small piece of software that he had written in Perl (which stands for Practical Extraction and Report Language). This language is especially useful in manipulating text strings, and there is a RISC OS port available, which has been provided by Alex Waugh.
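Steve's demonstration was written in Perl; the Python stand-in below only shows the general shape of what a CGI program produces. The function name cgi_response, the wording of the page and the default visitor name are invented for this sketch.

```python
from datetime import datetime
from html import escape


def cgi_response(now: datetime, visitor: str = "World") -> str:
    """Build what a CGI program would write: headers, a blank line, then the page."""
    body = (
        "<html><body>"
        # escape() stops any markup in the visitor's name being treated as HTML.
        f"<p>Hello {escape(visitor)}! The server's clock says {now:%H:%M}.</p>"
        "</body></html>"
    )
    return "Content-Type: text/html\r\n\r\n" + body


print(cgi_response(datetime.now()))
```

The server runs the program afresh for each request and passes whatever it prints back to the browser, which is why the page can differ every time it is fetched.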
PHP: Hypertext Preprocessor
This was a piece of software which resided on the server (the Internet Service Provider’s computer). Its operation was more secure and more friendly than that of a CGI function, and it has largely made direct use of CGI obsolete for many websites – but again, it would often not be available from a free webspace provider. PHP will generate dynamic blocks of text – just like the Perl example. Here Steve showed a webpage with the current date and time.
Steve went on to talk about more complex examples of PHP and started by explaining how web addresses with question marks in them work. This function is called ‘get data’: the data gets passed back to the server on the end of the address, and is made available to the PHP code. However this can be quite dangerous and the web designer needs to be well schooled in the possible pitfalls. Note that a user’s input must be screened for valid data, before being passed to the processing software. Steve also showed a similar input function called ‘post data’ which also obtains data from the user but returns it separately from the web address.
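The screening step can be illustrated in Python rather than PHP. The sketch below pulls one value off the end of a web address and refuses anything that is not a short whole number. The address, the parameter name ‘id’ and the function name member_id_from_url are all hypothetical.

```python
import re
from urllib.parse import parse_qs, urlsplit


def member_id_from_url(url: str) -> int:
    """Extract the 'id' value from the part of the address after the question mark."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("id", [])
    # Accept exactly one value made of one to six plain digits; reject all else.
    if len(values) != 1 or not re.fullmatch(r"[0-9]{1,6}", values[0]):
        raise ValueError("id must be a single whole number")
    return int(values[0])


print(member_id_from_url("http://www.example.org/member.php?id=42"))
```

The point of the check is that the value is only passed on once it is known to be in the expected form; anything else is rejected outright rather than repaired.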
Using PHP, interactive webpages could be generated fairly simply. The Club website uses these techniques quite extensively.
Databases and SQL
To round off his talk, Steve briefly introduced the concepts of databases. Users have the ability to add content to the database of a host, running on a server. Last March Nick Mason talked to the Club about DataPower, which features SQL and he demonstrated some of the things that could be done using it. DataPower cannot be used for the web, but most hosting providers who proffer a database use MySQL, which is open source and more or less scalable. Some quite interactive queries can be achieved using MySQL. Steve demonstrated these features by using a database of names and addresses. Again, the responses from users must be very carefully checked before processing and storing the data in the database.
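One common way of handling user input safely is to pass it to the database separately from the SQL text. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it can be run without a server; the table layout, the sample names and the function names are invented, and the placeholder character differs between database drivers.

```python
import sqlite3


def make_address_book() -> sqlite3.Connection:
    """Create an empty in-memory database with one table of members."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE members (name TEXT, town TEXT)")
    return db


def add_member(db: sqlite3.Connection, name: str, town: str) -> None:
    # The ? placeholders keep the user's text as data; it is never run as SQL.
    db.execute("INSERT INTO members (name, town) VALUES (?, ?)", (name, town))


def find_by_town(db: sqlite3.Connection, town: str) -> list[str]:
    rows = db.execute(
        "SELECT name FROM members WHERE town = ? ORDER BY name", (town,)
    )
    return [name for (name,) in rows]


db = make_address_book()
add_member(db, "A. Member", "Wakefield")
add_member(db, "x'); DROP TABLE members; --", "Wakefield")  # hostile input, stored harmlessly
print(find_by_town(db, "Wakefield"))
```

Placeholders guard against input being mistaken for commands, but the content still needs checking for sense (length, allowed characters and so on) before it is stored.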
Rounding off his talk, Steve said that his aim had been to make Club members aware of what goes on in website design, and that he could supply the names of some good books on the subject should members want to take the subject further. Questions were welcomed.
Peter Richmond wanted to know how he should check compatibility on his use of Style Sheets for Oregano. Steve suggested using Oregano, alongside NetSurf, an older browser without CSS support and both Internet Explorer and Firefox if available. Some people still use Fresco. Rick Sterry thanked Steve for providing a most interesting talk.