September 4th, 2010 / Search Engine Optimisation

File Types The Search Engines Can Index

If you have built a web site using file types that you think (or hope) search engines can't read or index, you may be surprised. Google have recently announced that they are now reading and indexing SVG files, a popular vector-based graphics format. The announcement prompted a quick look at which files Google can and does index. Here is a brief list (Google publishes the full list):

  • Adobe – Flash, PDF and PostScript (PS)
  • HTML
  • Java source code (.java)
  • Microsoft – Excel, PowerPoint and Word
  • OpenOffice – presentations, spreadsheets and text
  • Rich Text Format (.rtf, .wri)
  • Text (.ans, .asc, .cas, .txt, .text)
  • XML (.xml)
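
As a rough illustration of how the list above could be used to audit a site, here is a minimal Python sketch that picks out the files a search engine is likely to read. The function name and sample paths are hypothetical. The extensions for text, RTF, XML and Java come from the list; the ones for the named Adobe, Microsoft and OpenOffice formats, HTML and SVG are the usual extensions for those formats, added here as an assumption.

```python
from pathlib import PurePosixPath

# Extensions given explicitly in the list above.
LISTED = {".java", ".rtf", ".wri", ".ans", ".asc", ".cas", ".txt", ".text", ".xml"}

# Assumption: the usual extensions for the formats the list names without one.
ASSUMED = {".swf", ".pdf", ".ps", ".html", ".htm", ".xls", ".ppt", ".doc",
           ".odp", ".ods", ".odt", ".svg"}

INDEXABLE_EXTENSIONS = LISTED | ASSUMED


def indexable_files(paths):
    """Return the paths whose extension is one a search engine is likely to index."""
    return [p for p in paths
            if PurePosixPath(p).suffix.lower() in INDEXABLE_EXTENSIONS]


if __name__ == "__main__":
    sample = ["/docs/report.PDF", "/img/logo.png", "/src/Main.java"]
    print(indexable_files(sample))
```

Running a check like this over a site's file listing gives a quick picture of which documents are worth optimising, or worth blocking.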

In the past we have seen webmasters try to hide links in Java code or Adobe Flash. These days Google can and does read these files, and it generally follows any links it finds in them. If you have suspect links in these types of files, chances are that Google knows about them. If you're not ranking as highly as you think you should, you may want to check for those suspect links and remove them.

Of greater importance to webmasters is the optimisation of those files. If you have PDF files, do they contain links? Are the file names optimised for keywords? Many web sites host a mass of documents like these. You may think that having only one link to a document reduces the chances of that document being found. The reality is that if there is a link to a document, the search engines will most likely find it.
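
To make the point about file names concrete, here is a small Python sketch that turns a document title into a lower-case, hyphen-separated file name. The function name and the hyphen convention are assumptions chosen for illustration, not a rule from any search engine.

```python
import re


def keyword_file_name(title, extension):
    """Build a lower-case, hyphen-separated file name from a document title."""
    # Keep only runs of letters and digits; punctuation and spaces are dropped.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words) + "." + extension.lstrip(".")


if __name__ == "__main__":
    print(keyword_file_name("Search Engine Optimisation Guide", "pdf"))
    # prints: search-engine-optimisation-guide.pdf
```

A descriptive name like this tells both readers and search engines more than something like doc0042.pdf.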

Every aspect of your web site needs to be looked at closely to ensure that it has been optimised for search. Whether it's a graphic, a PDF file, or a simple text file, making sure that details like titles and file names are search engine optimised is now important. At the same time, if you don't want these files read, block them with a Disallow rule in your robots.txt file. Note that robots.txt controls crawling and has no standard 'no-index' directive. To keep a file such as a PDF out of the index, serve it with a noindex X-Robots-Tag HTTP header instead.
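
One way to check which files a robots.txt file keeps crawlers away from is Python's standard urllib.robotparser module. The sketch below is only an illustration: the rules, the example.com URLs and the is_crawlable helper are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /reports/draft.pdf
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://www.example.com/private/notes.txt",
                "https://www.example.com/reports/draft.pdf",
                "https://www.example.com/reports/final.pdf"):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

With these sample rules, the first two URLs are reported as blocked and the third as crawlable.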
