Latent semantic indexing. What?

I often seem to know when a topic is trending. This time it’s ‘latent semantic indexing’, or ‘LSI’. I’ve had three enquiries in the last month or so for website copywriting or rewriting where the main prerequisite has been an ‘in-depth understanding of LSI’. In my book that’s a trend.

Ensuing conversations have revealed quite a strong belief that LSI is some kind of new thinking, a magic secret sauce that’s going to shift poorly ranked website pages onto page 1 of Google’s search results. A quick Google search turns up numerous articles that suggest just that. I tend to disagree.

So what is it?

In simple terms, LSI is only about using words in your copywriting that are closely related to your web page’s primary keyword.  Importantly, they have to be in the right context.  They have to link in to the page’s main theme.  They have to fit naturally.

So, on a page about ‘press release writing’, for example, I might include the words ‘news’, ‘newsworthy’, ‘objective’, ‘headline’, ‘editor’ etc. What such related words are doing is signalling more clearly to Google that this page is about press release writing.

LSI is ultimately helping Google to do its job even better, i.e. ensuring that the pages it presents in response to a search enquiry are the most useful and directly relevant resources. It provides content clues and signposts and, in short, that’s it.
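For anyone who likes to check their drafts, the press release example can be sketched as a few lines of Python. This is only an illustration of the copywriting habit, not of anything Google does: the function name `related_term_coverage` and the sample draft are made up for the purpose, and the term list is simply the one from the example above.

```python
import re

# Related terms taken from the press release example; swap in your own.
RELATED_TERMS = ["news", "newsworthy", "objective", "headline", "editor"]


def related_term_coverage(copy, terms=RELATED_TERMS):
    """Return (found, missing) lists of related terms, matching whole words only."""
    words = set(re.findall(r"[a-z']+", copy.lower()))
    found = [t for t in terms if t in words]
    missing = [t for t in terms if t not in words]
    return found, missing


# Hypothetical draft sentence, purely for illustration.
draft = "A good press release leads with a newsworthy headline that an editor can use."
found, missing = related_term_coverage(draft)
print("Found:", found)
print("Missing:", missing)
```

A ‘missing’ word is a prompt, not an instruction. If it doesn’t fit the page naturally, leave it out.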

What it isn’t

One clear falsehood I’ve noted in my recent conversations is that LSI is all about including words that are ‘similar in nature’ to a web page’s primary keyword, i.e. words that have the same meaning: synonyms, to be precise. Wrong.

Using synonyms is unquestionably a good thing: we all use different vocabulary, it helps keep the keyword density down, and once more it assists Google in understanding what a web page is really all about. But it isn’t LSI.

So for my page on ‘press release writing’, I might well talk about drafting ‘news releases’, ‘product releases’ or ‘online news releases’, but such synonyms don’t add any contextual value or detail to raise the credibility or authority of the web page.

And it’s nothing new either

I don’t know why, in my own little world, ‘LSI’ is trending right now, but what I do know is that it simply isn’t new. I attended a course on search-engine-optimised copywriting a decade ago and there was a whole section on LSI, and on how LDA (‘Latent Dirichlet Allocation’) was actually better, but that’s another story.

And in reality the concept goes back a lot further than that, to the 1980s in fact, when it formed the basis of academic studies and mathematical techniques for language processing and accurately retrieving documents. So it’s not a new idea by any means. It was, and still is, about detecting ‘patterns’ of words and identifying themes.
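For the technically curious, here is a minimal sketch of that ‘pattern detecting’ idea in Python. Classic LSI builds a term-document matrix and reduces it with a singular value decomposition (SVD), and to keep the sketch free of extra libraries I’ve computed the top components with plain power iteration instead of a proper SVD routine. The five mini ‘pages’ and all the function names are my own inventions, and none of this is a claim about how Google’s algorithm works.

```python
import math

# Hypothetical mini-corpus: three pages on press releases, two on baking.
DOCS = [
    "press release news headline editor",
    "press release editor",
    "news headline newsworthy",
    "recipe oven flour butter",
    "flour butter oven recipe bake",
]


def term_document_matrix(docs):
    """Rows are terms, columns are documents, entries are raw word counts."""
    vocab = sorted({w for d in docs for w in d.split()})
    return [[d.split().count(t) for d in docs] for t in vocab]


def gram(matrix):
    """Document-by-document matrix A^T A (counts of shared words)."""
    cols = list(zip(*matrix))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def top_eigenpairs(g, k, iters=500):
    """Top-k eigenpairs of a symmetric matrix: power iteration plus deflation."""
    n = len(g)
    g = [row[:] for row in g]  # work on a copy
    pairs = []
    for _ in range(k):
        v = [1.0] * n
        for _ in range(iters):
            w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(g[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        # Remove the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                g[i][j] -= lam * v[i] * v[j]
    return pairs


def latent_doc_vectors(docs, k=2):
    """Coordinates of each document in a k-dimensional 'theme' space."""
    pairs = top_eigenpairs(gram(term_document_matrix(docs)), k)
    # Eigenvalues of A^T A are squared singular values, hence the square root.
    return [[math.sqrt(lam) * v[j] for lam, v in pairs] for j in range(len(docs))]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


raw = [list(col) for col in zip(*term_document_matrix(DOCS))]
themes = latent_doc_vectors(DOCS, k=2)
print("Pages 2 and 3, raw word overlap:", cosine(raw[1], raw[2]))
print("Pages 2 and 3, theme space:", round(cosine(themes[1], themes[2]), 3))
print("Page 2 vs baking page, theme space:", round(cosine(themes[1], themes[3]), 3))
```

The second and third pages share no words at all, yet they end up close together in the reduced space because both overlap with the first page. The baking pages stay well apart. That is the ‘theme’ detection in miniature.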

While I’m no super-expert in search engine algorithms, I believe I’m pretty safe in saying that LSI is certainly a useful tactic in optimising a website page for search (and indeed for engaging with human readers…), but it is not the be-all and end-all when it comes to search engine optimisation.

It’s just one of many different factors affecting a website’s overall SEO credentials. Good writers with a sound knowledge of SEO copywriting tactics have known this for some time and ‘LSI’ their output as a matter of course.  And it’s worth doing too.