Cascading Style Sheets
See CSS.

The practice of grouping web pages by topic to form a directory.
Also see classification

category
In the context of Web directories, categories refer to collections of links to sites of a similar topic.

CGI
Common Gateway Interface - a popular interface between web server software and other programs.

See directory; category

citation analysis
A tool initially developed in information science to identify the core (most cited) set of documents for a given topic. In the context of web searching and search engine algorithms, the term "link analysis" is more commonly used.

citation count
The number of times a document is referenced by other documents in the same collection. Citation count (or link count) differs from link popularity in that only the number of citations (links) and not the quality of the links is considered.
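As a rough illustration, a citation count can be computed from a list of (source, target) link pairs. This is only a sketch: the function name, the sample documents, and the choice to ignore self-links and duplicate links are assumptions, not part of any particular search engine.

```python
from collections import Counter

def citation_counts(links):
    """Count how many distinct documents in the collection cite each document.

    `links` is an iterable of (source, target) pairs.
    Self-links and repeated links are ignored (an assumption of this sketch).
    """
    unique_links = {(src, dst) for src, dst in links if src != dst}
    return Counter(dst for _, dst in unique_links)

# Hypothetical collection of three documents: a, b and c.
links = [("a", "c"), ("b", "c"), ("a", "b"), ("a", "c"), ("c", "c")]
counts = citation_counts(links)
```

Every link counts the same here, which is exactly what separates citation count from link popularity: no attempt is made to weigh the quality of the citing document.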

classification
The process of organizing documents available online into topical categories to form directories. These are normally hierarchical tree structures with "Main Categories" and a number of "Sub Categories" which often go several levels deep.

click tracking
Search engines can track user clicks in order to "learn" from users which pages are most relevant to a query. The best-known example is that of "Direct Hit", a discontinued search engine that not only tracked clicks but also logged the amount of time users spent on pages returned in order to improve relevance.

client
A computer, program or process requesting information from a server. E-mail programs are sometimes called e-mail clients; they request e-mail messages from POP3 servers. Spiders (like Googlebot) and browsers (like Internet Explorer and Netscape) are also clients.

click through (click-through; clickthrough)
Refers to the action of clicking through from, for example, a search engine's results page to a web site. Click through rates are especially useful in Internet advertising, where they are an important factor in determining the success of an advertisement.

click through rate (CTR)
a.k.a. click rate
Often used in Internet marketing to describe the percentage of users who click on a link or advertisement. The CTR is used as a measure to determine the effectiveness of a link / advertisement. It is most effective if used in conjunction with other measurements like conversion rate.
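The percentage described above can be worked out as clicks divided by impressions. A minimal sketch, with a hypothetical function name and made-up figures:

```python
def click_through_rate(clicks, impressions):
    """Return the CTR as a percentage of impressions that led to a click."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return 100.0 * clicks / impressions

# Made-up example: an advertisement shown 1000 times and clicked 25 times.
ctr = click_through_rate(25, 1000)  # 2.5 (percent)
```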

cloaking (Source: http://www.searchenginedictionary.com/c.shtml)
The practice of delivering different content to different clients, typically one version to search engine spiders and another to human visitors, based on the IP address of the client. The practice is sometimes defended as a way of protecting code from theft. Note that cloaking can get your site banned from the search engines. For a detailed discussion on cloaking and links to cloaking resources, please refer to the Search Engine Yearbook (http://www.searchenginedictionary.com/sey.shtml).

closed loop
Used to describe a linking structure where a group of web pages interlink heavily while there are few or no links to or from pages outside the group. General consensus is that search engines can detect closed loops and penalize pages in closed loops. It is currently unclear exactly where the cut-off point is. Is it only a closed loop if there are no links to or from pages outside the group or also if there are just too few such links? It is generally advisable to have links to outside pages that in turn also link to many outside pages.
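Since the cut-off point is unknown, the most a site owner can do is measure how isolated a group of pages is. The sketch below computes the share of a group's links that cross the group boundary; the function name and the sample pages are hypothetical, and no search engine is known to use this exact measure.

```python
def external_link_ratio(group, links):
    """Fraction of links touching the group that enter or leave it.

    `group` is a collection of page names, `links` a list of
    (source, target) pairs. A ratio of 0.0 means a fully closed loop.
    """
    group = set(group)
    touching = [(s, d) for s, d in links if s in group or d in group]
    if not touching:
        return 0.0
    crossing = sum(1 for s, d in touching if (s in group) != (d in group))
    return crossing / len(touching)

# Hypothetical pages: a, b and c interlink; only one link leaves the group.
links = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "x")]
ratio = external_link_ratio({"a", "b", "c"}, links)  # 0.25
```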

Search results grouped together (to save space on the SERP), usually based on a shared top-level domain.

clustering
A technique the search engines use to group different pages from the same domain in their search results pages. Without clustering, the top spots for certain search terms are often completely dominated by one site. Clusters usually consist of one or two pages from one domain with a link that says something like "More results from pandecta.com". The term differs from terms like classification, taxonomy building, tagging, etc. in that it is fully automated. Further human intervention is not needed.
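A simplified sketch of the idea: walk the ranked results, show at most two pages per host, and keep the rest behind a "More results from ..." link. The function name and the limit of two are assumptions; this version groups by host name only, whereas a real engine would need to work out the registered domain.

```python
from urllib.parse import urlparse

def cluster_by_domain(ranked_urls, max_per_domain=2):
    """Split ranked results into those shown and those folded away per host."""
    shown = []   # results displayed on the results page, in rank order
    hidden = {}  # host -> results reachable via "More results from <host>"
    seen = {}    # host -> number of results encountered so far
    for url in ranked_urls:
        host = urlparse(url).netloc.lower()
        seen[host] = seen.get(host, 0) + 1
        if seen[host] <= max_per_domain:
            shown.append(url)
        else:
            hidden.setdefault(host, []).append(url)
    return shown, hidden

results = [
    "http://pandecta.com/a",
    "http://example.org/",
    "http://pandecta.com/b",
    "http://pandecta.com/c",
]
shown, hidden = cluster_by_domain(results)
```

No human input is involved at any step, which is the point the definition makes about clustering being fully automated.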

code bloat
When a web page or site is so full of code (scripts, font tags, redundant HTML) that it becomes hard to edit, slow to download, and more difficult for search engines to index.

collaborative filtering
Also known as "social filtering". A technique used to improve relevance, it returns documents that other users with similar queries found relevant. This technique is also very effective in cross selling, as seen at Amazon.com ("People who bought 'Mary's Guide to Fast Food' also bought 'Jane's Recipes'").
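The cross-selling case can be sketched with simple co-occurrence counting: for a given item, count what else appears in the same baskets. This is an illustration only, not Amazon.com's method; the function name and the third title are made up.

```python
from collections import Counter

def also_bought(baskets, item, top_n=3):
    """Return the items most often bought together with `item`."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            counts.update(other for other in set(basket) if other != item)
    return [name for name, _ in counts.most_common(top_n)]

baskets = [
    {"Mary's Guide to Fast Food", "Jane's Recipes"},
    {"Mary's Guide to Fast Food", "Jane's Recipes", "A Hypothetical Atlas"},
    {"A Hypothetical Atlas"},
]
suggestions = also_bought(baskets, "Mary's Guide to Fast Food")
```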

collection
A group of documents queried.

collection fusion
The practice of combining search results from multiple collections. Meta search engines face the problem of effectively combining and re-ranking results that have already been ranked by different algorithms, whose scores are not directly comparable.
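One simple way to approach the problem, shown here only as a sketch, is to rescale each engine's scores to the range 0 to 1 and add them up per document. Whether any particular meta search engine works this way is not claimed; the function name and the sample scores are assumptions.

```python
def fuse(result_lists):
    """Merge ranked lists by summing min-max normalised scores.

    Each list holds (document, score) pairs from one engine.
    Returns documents ordered by fused score, best first.
    """
    fused = {}
    for results in result_lists:
        if not results:
            continue
        scores = [score for _, score in results]
        low, high = min(scores), max(scores)
        span = high - low
        for doc, score in results:
            # If an engine gives every result the same score, treat all as 1.0.
            norm = (score - low) / span if span else 1.0
            fused[doc] = fused.get(doc, 0.0) + norm
    return sorted(fused, key=lambda doc: (-fused[doc], doc))

# Two hypothetical engines scoring on very different scales.
engine_a = [("x", 10.0), ("y", 5.0), ("z", 0.0)]
engine_b = [("y", 0.9), ("w", 0.5), ("x", 0.1)]
merged = fuse([engine_a, engine_b])
```

Document y comes out on top because both engines rank it reasonably well, even though neither score scale means anything on its own.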

combined log file
A log file that tracks visitors on a web site. A combined log file typically includes additional information on user agents, referrers etc.
Also see log file and common log file.
For more on log file analysis and downloadable tools that make it easier, please refer to the Search Engine Yearbook.
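As an illustration, the sketch below parses one line in the layout used by the Apache web server's combined log format; the referrer and user agent are the two trailing quoted fields. It assumes that layout, and the pattern and function names are made up for this example. Other servers may log differently.

```python
import re

# host ident user [time] "request" status size, then optionally
# "referrer" "user agent" (the two fields a combined log adds).
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None

sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"')
entry = parse_line(sample)
```

A line without the two trailing fields still parses; `referrer` and `agent` are then `None`.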

comment tag
Comment tags (in HTML) allow the site designer to enter comments explaining the code, making it more understandable for human readers. Comments are not displayed by the browser. Comments are enclosed by the comment tag: <!-- like this -->. The comment tag is also used to enclose scripts, ensuring that the raw code is not displayed on non-compliant browsers. Comment tags are sometimes loaded with keywords to artificially inflate a page's ranking. Lose that sparkle in your eye though… most search engines ignore comment tags completely.
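To see why keyword-stuffed comments achieve nothing, consider an indexer that discards comments before looking at the text. The sketch below does this with a regular expression; it is a simplification (a real HTML parser handles edge cases this pattern does not), and the function name is hypothetical.

```python
import re

COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def strip_comments(html):
    """Remove HTML comments, including multi-line ones, from a page."""
    return COMMENT.sub("", html)

page = "<p>Hi</p><!-- cheap flights cheap flights --><p>Bye</p>"
visible = strip_comments(page)
```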

common log file
A standard log file with no additional information.
Also see log file and combined log file.

concept search
A search for documents related conceptually to a search term, rather than for documents that actually contain the search term itself.

content-based filtering
Filtering documents by extracting some or all of the content contained in each document. Modern search engines all use content-based filtering in combination with other filtering mechanisms. The best known of these other mechanisms is Google's PageRank system, which measures inbound links from other documents.

conversion cost
Total cost per sale, calculated by dividing the total cost of an advertising campaign by the number of resulting sales. For example, if $1000 is spent on an advertising campaign and that campaign results in 20 sales, the conversion cost per sale is $50 ($1000 / 20). That means it costs $50 to generate one sale.
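The same arithmetic as a small function, reusing the figures from the example above (the function name is made up for illustration):

```python
def conversion_cost(campaign_cost, sales):
    """Return the cost per sale: total campaign cost divided by resulting sales."""
    if sales <= 0:
        raise ValueError("at least one sale is needed to compute a cost per sale")
    return campaign_cost / sales

cost_per_sale = conversion_cost(1000, 20)  # 50.0
```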

conversion point
Conversion points are the points at which your customers have completed a specific action on your web site. Common conversion points are: newsletter sign-up (the "thank you for subscribing" page), order/sale (the "thank you for your order" page), and download (the "Your download is complete" page).

conversion rate (CR)
The percentage of site visitors that deliver the most wanted response (MWR). The CR is an important measure of the effectiveness of the online sales effort. For example, if 4 out of every 100 visitors to a site deliver the MWR, the CR for that site is 4%.
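Expressed as a small function, using the figures from the example above (the function name is hypothetical):

```python
def conversion_rate(responses, visitors):
    """Return the percentage of visitors who delivered the most wanted response."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return 100.0 * responses / visitors

cr = conversion_rate(4, 100)  # 4.0 (percent)
```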

cosine similarity
See Similarity.

cost per click / cost-per-click (CPC)
See CPC. Sometimes also used as a synonym for PPC.

counter / page counter
Typically accompanied by something like "You are visitor number ___ since Oct 2001". Counters count page views, not visitors. The difference is that one visitor can generate many page views by opening many pages on the site. Counters offer a relatively inaccurate way to measure site traffic and are generally considered amateurish. Log files offer far more accurate and comprehensive visitor data.
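The difference between page views and visitors is easy to see with a handful of hits. In this sketch each hit is a (visitor, page) pair; how a visitor is identified (IP address, cookie, etc.) is left open, and the names and data are made up.

```python
def traffic_summary(hits):
    """Count page views and distinct visitors from (visitor, page) pairs."""
    hits = list(hits)
    return {
        "page_views": len(hits),                      # what a counter shows
        "visitors": len({visitor for visitor, _ in hits}),
    }

hits = [
    ("visitor-1", "/index.html"),
    ("visitor-1", "/about.html"),
    ("visitor-1", "/contact.html"),
    ("visitor-2", "/index.html"),
]
summary = traffic_summary(hits)
```

A counter would report 4 here, although only 2 visitors came to the site.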

CPA
Cost per action. Similar to CPS. Also see conversion cost.

CPC
Cost per click. The total cost of an advertising campaign divided by the resulting number of unique visitors. Sometimes also used as a synonym for PPC.

CPL
Cost per lead. The total cost of an advertising campaign divided by the resulting number of new leads.

CPM
Cost per thousand impressions (M = Roman numeral for 1000). A pricing system often used in the banner advertising industry. Typically a fixed price is offered for 1000 impressions of a banner. The price is usually influenced by the topic of the site (how targeted the audience is) rather than the popularity of the site.

CPS
Cost per sale. Similar to CPA. Also see conversion cost.
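The cost-per measures above share one formula: campaign cost divided by the number of resulting visitors, leads, actions or sales. CPM works the other way round, from a fixed rate to a total price. A small sketch with hypothetical function names and made-up figures:

```python
def cost_per_unit(campaign_cost, units):
    """Campaign cost divided by the resulting visitors, leads, actions or sales."""
    if units <= 0:
        raise ValueError("units must be positive")
    return campaign_cost / units

def cpm_campaign_cost(cpm_rate, impressions):
    """Total price of a banner campaign sold at a fixed rate per 1000 impressions."""
    return cpm_rate * impressions / 1000

# Made-up campaign: $600 spent, 300 unique visitors, 40 new leads.
cpc = cost_per_unit(600, 300)         # 2.0 dollars per click
cpl = cost_per_unit(600, 40)          # 15.0 dollars per lead

# Made-up banner deal: $5 CPM for 20,000 impressions.
total = cpm_campaign_cost(5, 20000)   # 100.0 dollars
```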

crawl
What spiders do. It refers to the action of following links to navigate from page to page and site to site.
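A toy illustration of following links from page to page, using an in-memory dictionary in place of the real web. The breadth-first order, the page limit and all names are assumptions of this sketch; a real spider also fetches pages over HTTP, parses them, and respects robots.txt.

```python
from collections import deque

def crawl(start, get_links, limit=100):
    """Visit pages breadth-first by following links, each page at most once."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue and len(visited) < limit:
        page = queue.popleft()
        visited.append(page)
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited

# Hypothetical four-page site: page name -> pages it links to.
web = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["story"],
    "story": [],
}
order = crawl("home", lambda page: web.get(page, []))
```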

crawler
See spider.

crawler lag
The delay between the point where a web page is crawled and the point at which it is added to the search engine's index.

cross linking
Referring to links between a family of domains - for example your business site, your personal homepage and your cat's homepage. Cross linking is sometimes used to inflate link popularity. Although not yet proven (to my knowledge), excessive cross linking is widely believed to be penalized by the search engines.

CSS (Cascading Style Sheets)
An add-on to HTML that allows for more accurate control over the way a web page is rendered. CSS allows designers to create custom styles that are then applied to the web site in one of a variety of ways. The main benefit is that something like text colors for an entire site can be changed by editing only the CSS file. CSS can also be used in SEO, but most SEO techniques that involve CSS are considered spam. We have a more detailed discussion of the SEO uses of CSS in our Search Engine Yearbook.

cybersquatting
The practice of buying domains that contain popular trade names (for example fordmotors.com) or are common misspellings of popular trade names (for example gogle.com). The intent is usually to either resell the domain or to pull traffic through misspellings, rather than to develop a serious, unique site. Traffic gained through misspellings is often automatically redirected to another domain.
Also see DNS parking.

Referring to professional online researchers. Sometimes also referred to as "super searchers".


