Take Charge of Your Web Site

by Richard Seltzer, seltzer@samizdat.com, www.samizdat.com

Copyright ©2001 Richard Seltzer

Lesson One: What can you do to fix your existing site, which cost you plenty and isn't getting traffic?



The following chapter is from the ebook Take Charge of Your Web Site, originally published by MightyWords. Permission is granted to make and distribute complete verbatim electronic copies of this item for non-commercial purposes provided the copyright information and this permission notice are preserved on all copies. All other rights reserved.

Now that the rights have reverted to the author, he is free to update and revise this online version. Please email him your feedback/comments at seltzer@samizdat.com

Please visit our online store at http://store.yahoo.com/samizdat


To figure out what's wrong, start with the volume of your business. Has your Web site made a noticeable difference? Enough of a difference to justify the cost of building it, maintaining it, and advertising it? How many prospects have contacted you over the last six months based on what they found on your Web pages?

Are you getting less Web traffic than you should?

Many Web hosting services run programs that analyze the raw traffic logs for your site and provide useful statistics and graphs. If your site does not have such analysis, you should do whatever you can to get it. At the very least, download a trial version of such software and run it from your PC (pointing it to the file on the Internet that contains your raw logs). For instance, you can download Traffic Analyzer from www.webtrends.com, and use it for free for two weeks -- which should be plenty long enough for you to diagnose your problems.

Check the traffic -- especially the page views (the number of Web pages seen) and sessions (the number of times visitors came to your site, regardless of the number of pages they looked at) for your whole site, for categories of content, and for individual pages. ("Hits" tend to be misleading, since each element that loads separately constitutes a "hit"; so the more graphics you have on a page and the more complicated your pages are -- and the more difficult they are to load -- the more "hits" you'll get, without that helping your business at all.)
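To make the difference between hits, page views, and sessions concrete, here is a minimal Python sketch that counts all three from raw log lines. It is only an illustration, not what WebTrends or any other product does. It assumes your logs are in the Common Log Format, that any request not ending in a typical graphics or style suffix is a "page," and that a visitor who is idle for more than 30 minutes starts a new session. The names summarize, ASSET_SUFFIXES, and SESSION_GAP are made up for this example.

```python
import re
from datetime import datetime, timedelta

# Matches the start of a Common Log Format line: host, timestamp, request path.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST|HEAD) (\S+)')

# Assumption: anything that does not end in one of these counts as a "page".
ASSET_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")

# Assumption: a visitor idle for more than 30 minutes starts a new session.
SESSION_GAP = timedelta(minutes=30)


def summarize(lines):
    """Return (hits, page_views, sessions) for log lines in time order."""
    hits = page_views = sessions = 0
    last_seen = {}  # host -> time of that host's most recent request
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        host, stamp, path = match.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        hits += 1  # every request is a hit, graphics included
        if not path.split("?")[0].lower().endswith(ASSET_SUFFIXES):
            page_views += 1
        previous = last_seen.get(host)
        if previous is None or when - previous > SESSION_GAP:
            sessions += 1
        last_seen[host] = when
    return hits, page_views, sessions


sample = [
    '10.0.0.1 - - [01/Mar/2001:10:00:00 -0500] "GET /index.html HTTP/1.0" 200 5120',
    '10.0.0.1 - - [01/Mar/2001:10:00:01 -0500] "GET /logo.gif HTTP/1.0" 200 2048',
    '10.0.0.1 - - [01/Mar/2001:10:00:01 -0500] "GET /banner.gif HTTP/1.0" 200 9000',
    '10.0.0.1 - - [01/Mar/2001:10:02:00 -0500] "GET /products.html HTTP/1.0" 200 7000',
    '10.0.0.2 - - [01/Mar/2001:11:00:00 -0500] "GET /index.html HTTP/1.0" 200 5120',
]
print(summarize(sample))  # prints (5, 3, 2)
```

In this sample, two graphics on the home page raise the hit count to five, while only three pages were actually seen in two visits.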

Next look carefully at the "referrer" information -- where the traffic to your pages is coming from. The top referrers are likely to be search engines and directories, unless you are poorly represented there and depend instead on paid advertising.

With some statistical analysis programs (like WebTrends), you can see the queries your visitors entered at search engines that led them to your site.  Note the variety of unexpected phrases and combinations of words they used. That's why it's very important to have lots of content -- to catch prospects who have very specific interests that your pages match. You may also see "key words" -- single words that happened to appear in queries (regardless of how complex the full query was). These "key words" tend to be relatively useless (the, of, computer, etc.)
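If your statistics program does not show search queries, you may be able to recover them from the referrer URLs in your raw logs. The Python sketch below is a rough illustration under one big assumption: that the search phrase travels in a URL parameter named q, query, or p. Real search engines vary, so treat QUERY_PARAMS (and the function name search_phrases and the example addresses) as placeholders to adjust.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumption: these parameter names carry the search phrase; real engines vary.
QUERY_PARAMS = ("q", "query", "p")


def search_phrases(referrers):
    """Count the search phrases found in a list of referrer URLs."""
    counts = Counter()
    for referrer in referrers:
        params = parse_qs(urlsplit(referrer).query)
        for name in QUERY_PARAMS:
            if name in params:
                # Lowercase so "Ebook" and "ebook" are counted together.
                counts[params[name][0].lower()] += 1
                break
    return counts


referrers = [
    "http://search.example.com/query?q=used+ebook+readers",
    "http://find.example.net/search?p=Russian+history+ebooks&x=1",
    "http://search.example.com/query?q=used+ebook+readers",
    "http://www.example.org/links.html",  # an ordinary link, no query
]
for phrase, count in search_phrases(referrers).most_common():
    print(count, phrase)
```

Counting whole phrases rather than single words keeps the unexpected combinations visible, which is the part of the data worth reading.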

Also note any data about "visiting spiders." Those are robot programs (also known as "crawlers") that search engines use to gather information about content on the Web. Those statistics tell you which search engines (if any) are checking your site and how frequently they return. Keep in mind that if you have lots of pages, a crawler might, in a single session, look at all of them, inflating your page-view statistics for that day. To monitor your progress over time, subtract the spider/crawler numbers, so you are tracking only "real" visitors.
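As a rough sketch of that subtraction, the Python below separates crawler requests from visitor requests by looking at the user-agent field of each log line. It assumes your logs use the combined log format (where the user agent is the last quoted field) and that crawlers identify themselves with one of the substrings in CRAWLER_MARKERS. That list is illustrative and incomplete, and the function name split_traffic is made up for this example.

```python
import re

# Assumption: substrings that identify crawlers in the user-agent field.
CRAWLER_MARKERS = ("bot", "crawler", "spider", "scooter", "slurp")

# In the combined log format the user agent is the last quoted field.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')


def split_traffic(lines):
    """Return (visitor_requests, crawler_requests) for combined-format lines."""
    visitors = crawlers = 0
    for line in lines:
        match = USER_AGENT.search(line)
        agent = match.group(1).lower() if match else ""
        if any(marker in agent for marker in CRAWLER_MARKERS):
            crawlers += 1
        else:
            visitors += 1
    return visitors, crawlers


log = [
    '10.0.0.1 - - [01/Mar/2001:10:00:00 -0500] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '10.0.0.9 - - [01/Mar/2001:10:05:00 -0500] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "Scooter/3.2"',
    '10.0.0.9 - - [01/Mar/2001:10:05:02 -0500] "GET /sitemap.html HTTP/1.0" '
    '200 900 "-" "Scooter/3.2"',
]
print(split_traffic(log))  # prints (1, 2)
```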

When you are familiar with your numbers and any trends they indicate, contact your counterparts at related non-competing sites (perhaps partners of yours) and learn what you can about their stats, to calibrate how you are doing. What might be great stats in one market niche could be terrible in another.

If, after taking those steps, you believe that your site is underperforming, then consider factors that might be reducing your traffic and what you can do to change them. If any of these problems sound familiar, then the suggested "quick fixes" might help you turn your Web business around at little or no cost.
 
 

Diagnosis


Page design affects traffic in ways that you probably didn't realize.

Search engines learn what is available on the Web by sending out "crawlers" (also known as "spiders" or "robots" or "bots") -- programs that automatically perform the same kinds of functions as real users. These crawlers typically follow a trail of links. Each page that they fetch has links to other Web pages. They may follow the link trails at a particular site immediately or put those addresses into a queue to be visited later. If many Web pages outside of your site have links to pages at your site, your site will probably be visited often by crawlers and hence the content about your site in search engine indexes will be relatively current. If no one has links to your site (perhaps because your site is new), search engines may not know if you exist.

You can also submit your Web pages to search engines (we'll talk about the details of that in Lesson Three). But whether crawlers find you through links or because of your submissions, they bring back text and only text for inclusion in the search engine indexes. And they index every single word of text -- not just "key words" like a database or a directory -- and, in some cases, even the order of the words, so users can search for phrases. If the words in your pages are embedded in graphics, the crawlers won't see them. If the words are automatically generated from a database or through JavaScript or some other dynamic means, the crawlers won't see them.

For a quick diagnosis of search-engine-related traffic problems at your site, go to AltaVista, www.altavista.com, a popular search engine that has some unique and powerful commands that can help you sort out what might be wrong.

If you have your own domain name (like genius.com or samizdat.com), at AltaVista, submit the query
host:yourdomainname.com (using your real domain name)

If you do not have your own domain name, but instead have an address assigned to you by your Web-hosting service (like members.xoom.com/rseltzer), or if your Web site is a directory of a corporate site, submit the query
url:members.xoom.com/rseltzer or
url:www.megacompany.com/productx (using your real address)

If you want to check to see if a particular directory (folder) or a particular page is included, use
url:
followed by the complete address.
(URL stands for "Uniform Resource Locator." That means a Web address, in a form that looks like http://www.samizdat.com/sitemap.html).

Did you get any results? If so, how many? The pages that you see are all the pages from your Web site that are in the AltaVista index. Is that reasonably close to the number of pages you have at your site? If not, what's missing? And why might those pages be missing?

Consider these possible factors:
1) Have your technical people deliberately shut out search engine crawlers by using a "robot exclusion" file? The main objective of your technical staff is to make sure your site runs efficiently, and they might see search engine crawlers as a nuisance, a possible source of trouble. Unless instructed otherwise, they will naturally do everything in their power to reduce risk -- even if as a consequence, the beautiful Web site they constructed and maintain gets no visitors.

Quick fix: You need to talk to your technical people, understand their needs and concerns and make sure that they understand yours. Reportedly, in the early days of the Internet, the Internet Shopping Network was one of the very first online shopping sites. The marketing folks repeatedly submitted their pages to the major search engines, but even after many months none of their pages appeared in the indexes. The marketing people resorted to expensive advertising to try to generate the traffic and hence the business that they needed. Eventually, they discovered that a well-meaning Web designer had deliberately blocked all search engine crawlers from the entire site. The fix took less than a minute.

If there are particular parts of your site that you don't want indexed, it is a simple matter for your technical folks to edit their robots.txt file to block crawlers from just those parts, rather than from the whole site. And if they have reason to want to exclude one or more particular Web crawlers, they can do so by name, allowing other crawlers to enter. Compromise, be flexible. But make sure that your technical people understand your business goals, and that decisions of that kind are made with full knowledge of the marketing consequences, and what it might cost to make up the difference in traffic.

2) Does your entire site or a portion of your site require visitors to register? Perhaps your information isn't proprietary; you have no reason to keep it secret. You just want to keep track of who visits and want to ask them what they are interested in and how to get in touch with them. But search engine crawlers can't fill out forms. So as soon as they are required to, they halt -- and your site doesn't get indexed.

Quick fix -- Make registration an option, not a requirement. Let anyone click through on your links, but ask them to take the time to tell you about themselves. Make the benefits of registering clear to them (personalized pages, email alerts, more content that directly meets their needs, better service) or offer immediate rewards (e.g., a discount on their first purchase).

If you are using registration because you charge for your content, you need to understand that that approach is likely to reduce your traffic. You need to weigh the revenue you might bring in by charging for content against the traffic you lose by doing so (both because of blocking search engines and also because some visitors will seek alternative free sources of comparable information). If the information value of your content diminishes over time, you might charge for current articles, and take advantage of the marketing value of your older articles by making them freely available to the public and to search engines.

3) Does your site generate "dynamic" pages? In other words, does it assemble pages on the fly from a variety of pieces? A typical symptom of that is a ? in the URL (page address). Designers typically use techniques of that kind to simplify site management. They might be able to change a single element that appears on thousands of pages, with a single command. They also use this approach to generate "personalized" pages -- a special view of the site that depends on user preferences or on what the user has seen before. When a crawler arrives at such a page, it captures the immediate text content, but halts, not following the links to other pages at the site. Otherwise, it could be presented with an "infinite" number of pages.

Quick fix -- In addition to what you do already (leaving your dynamic site alone), create plain static pages that have the same content as the elements used to build your dynamic ones. Link from each of those pages to a sitemap page (table of contents, with links to all your static pages), and to your dynamic home page -- as the best way to experience your content. But submit the sitemap page, instead of your home page, to the search engines.
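A sitemap page of the kind described here is simple enough to generate automatically. The Python sketch below builds one from a list of page addresses and titles. It is a minimal illustration, not a finished tool: the function name build_sitemap and the example addresses are made up, and it assumes you already have the list of static pages you want crawlers to find.

```python
from html import escape


def build_sitemap(pages, site_title="Sitemap"):
    """Return a plain static HTML sitemap for a list of (url, title) pairs."""
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in pages
    )
    return (
        "<html>\n<head>\n"
        f"<title>{escape(site_title)}</title>\n"
        "</head>\n<body>\n"
        f"<h1>{escape(site_title)}</h1>\n"
        f"<ul>\n{items}\n</ul>\n"
        "</body>\n</html>\n"
    )


# Hypothetical static pages; replace with your own addresses and titles.
pages = [
    ("/products.html", "Products & Prices"),
    ("/support.html", "Customer Support"),
]
print(build_sitemap(pages, "Example Site"))
```

The output is nothing but text and ordinary links, which is exactly what a crawler can follow.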

4) Does your site use Active Server Pages (Microsoft technology that allows you to create dynamically-generated web pages)? In other words, do the URLs at your site all look the same and end with .asp? In that case, the page is just a script for the construction of a page, rather than static content. Typically, in that case, visitors who like your content may have trouble adding particular pages of yours to Bookmarks or Favorites. Sites that would like to link to particular pages of yours may have trouble doing that as well. And search engine crawlers may halt. (The results are likely to be erratic, so some sites wind up well indexed and others not at all.)

Quick fix -- Same as above: create plain static pages with the same content.

5) Does your site use Java or JavaScript to generate text on your pages? To check, go to a typical page and on your browser click View, then Source, and look for the word "Java". Perhaps only part of what the visitor sees is Java-related. Perhaps the Java-generated text appears in a separate box.

Quick fix -- If the Java-related text is important (if it includes words and phrases that potential visitors might search for), then do as suggested above: create plain static pages with the same content.

6) Does your site use "frames"? With this design technique, typically, the same graphics (and link choices) appear along the edge (typically the left and/or the top) of each and every page. For instance, links to the main sections of the site might appear down a column on the left and at the top of the page you might see a banner ad or info about a special offer. Inside the window framed by those repeated elements, you might see one or more "panes." Frames do not prevent pages from being indexed, but they can lead to some very strange results. Search engines will typically index the outside of the frame and each pane of the frame window as separate pages. So when search engine users click on an item in a results list, they see just the frame or just the pane that matched their query -- not the full page as it was designed to be seen. What they see might be confusing because it's out of context. They probably won't see the links that were supposed to be associated with the content. And the overall look and feel is sure to violate your branding guidelines.

Quick fix -- Create non-frames versions of those same pages, and be sure to link to them all directly from a plain static sitemap page, and submit that sitemap page to the search engines.

7) Does your site have audio and video files? While multimedia content can hold the visitor's attention and provide a memorable experience, the audio and video cannot be indexed by typical search engines, which depend entirely on text. Text is essential.

Quick fix -- Provide plain text transcripts for your audio and video clips. Link from the transcripts to the sitemap page and to the related audio/video files. And, of course, link from the sitemap page to the transcripts.

8) Does your site put important content in Acrobat and PostScript files? For instance, many large business sites present white papers and even press releases in Acrobat (.pdf) form, so they can control the look and feel, and make it very much like the related printed piece. But search engines cannot see any of the text of files in those formats.

Quick fix -- Provide plain text versions as well.

9) Do you have a "flash" page -- multi-media effects that dazzle visitors for a few seconds before automatically moving them to the real Web site? Regardless of how impressive and eye-catching such a page can be, it blocks search engines.

Quick fix -- When you submit your site to search engines, point to your sitemap page, not your flash page.

10) Does your site use redirection -- sending people to one page and then automatically moving them to another? "Redirection" is a trick that porn sites frequently use to catch the attention of people who didn't intend to go to such a site. So most search engines block sites that use redirection, in order to maintain the integrity of their indexes and to make sure that their users in fact get to the kind of content they are asking for.

Quick fix -- Be very careful in the use of redirection. Do not put a redirect anywhere in the likely path of a search engine crawler. Best would be to use a robot exclusion file to prevent any crawler from seeing such a page.
Most search engines follow the rules of the Robots Exclusion Standard and hence will honor commands they see in a file named robots.txt found in the top-level directory of your Web server or in a robots metatag in the markup code for a particular page. For details on how to do that see http://doc.altavista.com/adv_search/ast_haw_avoiding.html

For example, a robots.txt file consisting of just two lines
User-agent: *
Disallow: /thisdirectory/thispage.html
would prevent any crawler that abides by the standard from looking at a page named thispage.html in a directory named /thisdirectory.
Ask for help from your technical people or your Web hosting service to make sure you do this right. You don't want to inadvertently stop crawlers from seeing the pages that you want them to see.
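One way to double-check a robots.txt file before you post it is to test it against the addresses you care about. The Python sketch below uses the standard library's urllib.robotparser on the two-line example above. The www.example.com addresses are placeholders, and the test only tells you how a crawler that abides by the standard should behave, not what any particular search engine will actually do.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt file from the example above.
robots_txt = """\
User-agent: *
Disallow: /thisdirectory/thispage.html
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Placeholder addresses; substitute pages from your own site.
page_ok = parser.can_fetch(
    "*", "http://www.example.com/thisdirectory/thispage.html"
)
sitemap_ok = parser.can_fetch("*", "http://www.example.com/sitemap.html")

print("thispage.html open to crawlers:", page_ok)    # False
print("sitemap.html open to crawlers:", sitemap_ok)  # True
```

If a page you want indexed comes back False, the file is blocking more than you intended.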

11) Is your site based on the assumption that your visitors want to see new content every day? Is that what keeps your site lively and interesting? Do you, therefore, delete your old information, or move it to different (archive) directories? If so, while the content may still be available at your site, the old URLs simply won't work. That means that you are confusing the search engines and tripping your fans, who want to point their friends to what they've found useful and who want to bookmark and link to good content.

Yes, it might be convenient to keep plugging new content into old URLs (e.g., www.retailstore.com/specialtoday.html), and to clean out old material, like useless debris. But you need to think first of the convenience of your users, rather than your staff. For generating traffic, old content has greater value than new content -- because it takes time to get embedded in the navigational infrastructure of the Internet (as people create bookmarks/favorites, and link to it, and as it gets included in search engine indexes.)

Quick fix -- Make it a strict rule that once you have posted a Web page, you never change its URL. (By the way, don't use an old URL for new, totally unrelated content. Otherwise, you'll annoy people who come looking for one thing and find something different.) If old content is no longer valid (for instance, you have dropped that product line), keep the page up, but add text explaining the change and a link pointing visitors to the latest and greatest information/product.

12) Does your site use encryption for security? In that case, the URLs typically begin with https:// instead of http://. Search engines cannot see encrypted pages.

Quick fix -- Ask yourself: do all the pages at your site need to be encrypted? Or only certain sections (like customer account information at a banking site)? Redesign, keeping encryption to a minimum.

13) Does your site have very little text content? Search engines only see text, and the more text they have indexed from your site, the more likely users searching for unpredictable phrases and combinations of words will see your pages high on their lists of matches.

Quick fix -- Start writing and hire writers. The more text the better -- and not just random words, but useful relevant information that your visitors can benefit from. (Search engines "sniff" for pages that have been randomly constructed from "keywords" by "search optimization" companies, and blacklist those pages and sometimes entire sites.)
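Several of the factors above (3, 4, and 12) leave traces in the page address itself: a ? in the URL, an address ending in .asp, or an address beginning with https://. As a first pass, you could scan a list of your URLs for those symptoms. The Python sketch below is only a rough screen based on the symptoms described in this lesson. The function name url_warnings and the example addresses are made up, and a clean result does not prove a page is crawler-friendly.

```python
from urllib.parse import urlsplit


def url_warnings(url):
    """Return a list of crawler-related warnings suggested by the URL alone."""
    parts = urlsplit(url)
    warnings = []
    if parts.query:  # factor 3: a ? followed by parameters
        warnings.append("dynamic page (query string)")
    if parts.path.lower().endswith(".asp"):  # factor 4
        warnings.append("Active Server Page")
    if parts.scheme == "https":  # factor 12
        warnings.append("encrypted (https)")
    return warnings


# Hypothetical addresses; replace with URLs from your own site.
for url in [
    "http://www.example.com/catalog.asp?item=42",
    "https://www.example.com/account.html",
    "http://www.example.com/sitemap.html",
]:
    print(url, url_warnings(url) or "no warnings")
```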

Keep in mind that these "quick fixes" are "quick" from a technical perspective. From a political perspective, they may be very difficult indeed. Many large corporations have branding rules and Web site standards that prohibit taking these simple and easy steps that could boost traffic to your site and generate new business.

In Lessons Two and Three, you will create your own personal Web pages on a public Web site that has nothing to do with your company. There you will experiment to learn what's really involved in creating Web pages, making them findable, and promoting them over the Web. Doing so should help you understand the implications of Web design decisions on Web traffic and put you in a better position to design simple Web pages yourself, or to let your technical team know what you want and why (and what you are willing to pay for it), or to influence whoever you must to make important changes in your corporate Web site.

More symptoms to watch for

Take another look at your search engine results list (host: or url:). Look at the words that are used for the hyperlink (the title). Do you see the same title appearing more than once? Or do you see some pages labeled "no title"?

Keep in mind that Web pages have three kinds of "title." There is the file name -- the last element of the URL, such as pressrelease.html -- which has no effect on search engines. There is the headline -- the words that appear in prominent type at the top of the page itself -- which search engines treat as ordinary text, with some extra attention because it's at the start or near the start of the page. And there is the HTML title -- the title that appears near the top in the "source" of the page between <title> and </title>, that appears at the very top of the browser window, above the tool bars, and that is used by search engines as the title to link from in a list of results. Some page designers pay little or no attention to HTML titles (in part, because the page creation tools they use assign them automatically or ignore them). But from the perspective of search engines, the HTML title is the most important part of a Web page.
How many times have you clicked on a page labeled "no title"? How do you know which page is which when several have the same title? And what do you think of sites that use meaningless, apparently random sets of words for titles?

Each page should have its own unique, clearly descriptive HTML title. It should be the job of the people who understand the content and who write the pages to write those titles -- don't expect the designer to come up with the words you would like to see. Once you decide on those words, the designer can easily insert them.
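If your site has more pages than you care to inspect by hand, a short script can list the pages with missing or duplicated HTML titles. The Python sketch below uses the standard library's html.parser. It assumes you have already loaded each page's source into a dictionary keyed by address. How you fetch the pages is up to you, and the names TitleParser and title_problems are made up for this example.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def title_problems(pages):
    """Given {url: html}, return (urls_without_title, {title: [urls]})."""
    by_title = defaultdict(list)
    missing = []
    for url, html in pages.items():
        parser = TitleParser()
        parser.feed(html)
        title = " ".join(parser.title.split())  # collapse stray whitespace
        if title:
            by_title[title].append(url)
        else:
            missing.append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    return missing, duplicates


# Hypothetical page sources for illustration.
site = {
    "/a.html": "<html><head><title>Widgets</title></head><body>A</body></html>",
    "/b.html": "<html><head><title>Widgets</title></head><body>B</body></html>",
    "/c.html": "<html><head></head><body>No title here</body></html>",
}
missing, duplicates = title_problems(site)
print("No title:", missing)
print("Shared titles:", duplicates)
```

The script only finds the problem pages. Writing a unique, descriptive title for each one is still a job for the people who know the content.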

Also, take a look at the 2-3 sentence description that appears with each item in a list of matches. Do several of your pages or even all of them have the same description?  The default description is typically the first few lines of static text on the page. If those words are likely to be meaningless (for instance, on a page with lots of graphics, and a few words associated with each), the designer can insert a "description metatag" with alternative words for the use of search engines. Unfortunately, many designers simply attach the same metatag to all the pages in a given part of a site or in the entire site.

Once again, designers are typically not writers. You need to give them the descriptions -- a different one for each page. Better still, write and design the pages themselves so the most important content appears at the top (like in a newspaper article). Let text, not graphics, prevail. Then, there's no need for description metatags, and, as we'll discuss in Lesson Three, your pages will be likely to appear higher on search engine match lists.

Remember, the home page of a Web site is very different from the cover of a magazine. The art on the cover of a consumer magazine is very important -- that's what attracts customers who see it displayed on a news stand. But on the Internet, nobody sees your home page until they've decided to go there. The artwork and flashy effects don't bring traffic; they simply make your pages slower to load, getting in the way of people who want to get to your information and products.

You're not getting older, you're getting better

Now go back to the AltaVista home page and click on Advanced. (Today the link is near the top of the left column, but they frequently change the look and feel of the site).

In the top box (labeled "Boolean query") enter the same query you did before, beginning with host: or url:

Then enter dates in the From: and To: boxes (using European date format dd/mm/yy). Test to see the age of the pages in the index.

If your site has been around for a while, you might find pages that are 3-5 years old, perhaps with embarrassingly obsolete information.

Do not delete those old pages. In fact, if you find that search engine indexes still include old pages that have already been deleted from your site, do everything you can to get those pages back online, with the same URLs as before.

While the information value of content decreases over time (as it becomes outdated), its marketing value on the Web increases, as more search engines include it, as more people bookmark those pages and link to them. Do not throw that value away.

When you have new products and information, link from the old to the new (and include appropriate explanations), but do not delete the old. That way customers who have the older products or have heard good things about them will have an easy path to follow to learn about the new ones and how to migrate.

Think of content on the Web as a marketing asset, and pay attention to technical details that could impact its marketing value, rather than abdicating responsibility for all such details to the technical staff.
 

The missing link

From the home page at AltaVista, search for
+link:yourdomain.com -host:yourdomain.com
or (as above with url:)
+link:members.xoom.com/rseltzer -url:members.xoom.com/rseltzer
or
+link:megacompany.com/productx -url:megacompany.com/productx

This query will show you which pages (and how many pages) outside of your site have links to pages at your site.

You can even fine tune the search, to look for pages with links to individual pages of yours, e.g.,
+link:yourdomain.com/tutorial.html -host:yourdomain.com

The more links the better, especially links from well-respected sites and sites that deal with related content. These links can drive additional traffic to your site and also (for some search engines, like AltaVista and Google) can raise the ranking for your pages (making them appear higher in search result lists).

If you have few links to your pages, do searches to find related sites and contact the webmasters offering to exchange links with them.

If you have many pages linking to yours, assign someone to check them all -- those are potential allies and partners. NB -- At AltaVista you typically only see a maximum of 200 matches (20 screens of 10 matches each). To see more than 200 (which you might very well want to do for link:), use Advanced Search, with the syntax
link:yourdomain.com AND NOT host:yourdomain.com
and after the 20th screen you should be able to keep clicking on "next" to see more.
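If you need to run these link checks for many pages, you could generate the query strings rather than type each one. The Python sketch below simply assembles the two forms shown above (the home-page form and the Advanced Search form). The function name link_queries and its parameters are made up for this example, and the sketch only builds text for you to paste into the search box. It does not contact any search engine.

```python
def link_queries(target, own_site, is_domain=True):
    """Return (simple, advanced) query strings for finding inbound links.

    target    -- the address whose inbound links you want to find
    own_site  -- your own domain or address, to be excluded from the results
    is_domain -- True to exclude with host:, False to exclude with url:
    """
    exclude = ("host:" if is_domain else "url:") + own_site
    simple = f"+link:{target} -{exclude}"
    advanced = f"link:{target} AND NOT {exclude}"
    return simple, advanced


simple, advanced = link_queries("yourdomain.com/tutorial.html", "yourdomain.com")
print(simple)    # +link:yourdomain.com/tutorial.html -host:yourdomain.com
print(advanced)  # link:yourdomain.com/tutorial.html AND NOT host:yourdomain.com
```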

If you recently reorganized your site, changing URLs (a definite no-no, as discussed above), and if you can't resurrect those old addresses, then search for links to the old addresses, and let the webmasters know about the changes so they can fix their links.

In Lessons Two and Three, you will sign up for your own personal Web space and experiment with pages that you create and post there. What you can do on your own will give you a point of comparison for measuring the success and cost effectiveness of your business site, and give you some ammunition for your discussions with technical experts and bureaucrats.

Suggested further reading:

"How to use content to attract traffic to your Web site, even when branding rules saddle you with a search-engine unfriendly design" www.samizdat.com/brandandtraffic.html
"The power of words on the Internet -- Content-based Internet marketing" www.samizdat.com/report.html
"What belongs on a Web page, and why?" www.samizdat.com/belongs.html
"Advice to a friend redesigning a company Web site" www.samizdat.com/advice.html
"The leap to multimedia -- It all depends on disk space" www.samizdat.com/leap.html
"The future of the Internet and the future of business" www.samizdat.com/maine.html




Published by B&R Samizdat Express, 33 Gould St., West Roxbury, MA 02132. 617-469-2269. seltzer@samizdat.com


Can we help you build an Internet business? Richard Seltzer is an independent Internet writer/speaker/consultant.

This book (plus three other Internet business books and numerous related articles) is available on CD ROM for $19. Check our online store at http://store.yahoo.com/samizdat
