What is semantic core clustering? How to assemble and group the semantic core: complete instructions

Clustering is the grouping of keywords that are currently just a flat list, dividing them into clusters (groups). This is what turns your thousands of queries into a complete structure broken down into categories, pages, articles, and so on. Without a proper breakdown you will waste a lot of money and time "idling," because some queries cannot be "landed" on the same page, while others, on the contrary, must sit on the same URL.

When collecting the semantic core, I usually do clustering by hand; here are some links on the topic:

But all this is easy and simple when we have clear groups of queries with different logical meanings. We know perfectly well that separate landing pages must exist for the queries "twin stroller" and "stroller for a boy".

But some queries are not so clearly divided among themselves, and it is difficult to determine "by feel" which queries should be placed on one page and which should go to different landing URLs.

One of the participants in my SEO marathon asked me: "Petya, what should I do with these keys: put everything on one page or create several, and if so, how many?" Here is an excerpt from the list of keywords:

The word "java" alone appears in three different spellings, and on top of that people search for it for different games, devices, etc. There are a lot of queries there, and it is genuinely hard to see how best to proceed.

What do you think is the right approach? Correct: an analysis of the competitors already in the TOP for these keywords works best. Today I will show you how to cluster the semantic core based on competitor data.

If you already have a ready-made list of keywords to cluster, you can skip straight to step 4.

1. Query Matrix

Let me take another example: I have a client with an online store of electrical and lighting equipment. The store carries a very large number of products (several tens of thousands).

Of course, every store has products that are the highest priority for sale: they may have high margins, or the stock simply needs to be cleared from the warehouse. So I received a letter along the lines of "Petya, here is a list of goods that interest us." The list included:

  • switches;
  • lamps;
  • light fixtures;
  • spotlights;
  • extension cords;
  • and a few more items.

I asked them to make a so-called "query matrix". Since the store owners know their assortment better than I do, I needed them to list all the goods and the main characteristics/differences for each product.

It turned out something like this:

When compiling the matrix, do not forget that some brands with English names are also searched for in Russian spelling; this must be taken into account and added.

Of course, if a product had other characteristics, another column was added, such as "Color", "Material", etc.

And this work was done for the highest priority products.

2. Multiplication of queries

There are many services and programs for multiplying queries. I used this key-phrase generator, http://key-cleaner.ru/KeyGenerator; we enter all our queries there by columns:

The service multiplied all possible combinations with the word "extension cord". Important: many generators multiply only consecutive columns, i.e., the first with the second, then the first two with the third, and so on. This one multiplies the first column with every combination of the others: first with second, first with third, first with fourth; then first × second × third, first × second × fourth, etc. As a result, we get the maximum number of phrases containing the main word from the first column (the so-called marker).

A marker is the main phrase from which the keys are generated. Without a marker it is impossible to create an adequate key query: we have no use for phrases like "iek wholesale" or "buy on a reel".

When multiplying, it is important that every key phrase contains this marker; in our example it is the phrase "extension cord". As a result, 1,439 (!) unique key phrases were generated in this example:
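For readers curious about the multiplication logic itself, here is a minimal Python sketch of how a marker-based generator like this works. The matrix contents below are invented for illustration, and key-cleaner.ru's actual algorithm may differ in details:

```python
from itertools import combinations, product

# Hypothetical query matrix: the first column holds the marker
# ("extension cord"); the rest hold attributes. All values are
# made up for illustration -- substitute your own matrix.
matrix = [
    ["extension cord"],      # column 1: the marker
    ["buy", "price"],        # column 2: commercial words
    ["on a reel", "30 m"],   # column 3: form factor
    ["iek", "universal"],    # column 4: brand
]

def multiply(matrix):
    """Combine the marker column with every subset of the other
    columns, so each generated phrase always contains the marker."""
    marker_col, attr_cols = matrix[0], matrix[1:]
    phrases = set()
    for r in range(1, len(attr_cols) + 1):
        for cols in combinations(attr_cols, r):
            for combo in product(marker_col, *cols):
                phrases.add(" ".join(combo))
    return sorted(phrases)

phrases = multiply(matrix)
print(len(phrases))  # number of unique generated phrases
```

Because the marker column participates in every product, phrases like "iek wholesale" (attributes without the marker) simply cannot be produced.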

3. Clearing requests from "garbage"

Now there are two ways events can develop. You can cluster all these queries and create a huge number of generated pages, one per cluster, if your site's system allows it. Of course, each page should have its own unique meta tags, h1, and so on. And sometimes it is problematic to get such pages into the index at all.

We did not have that option technically, so we did not even consider it. We only needed to create the most necessary new landing pages, in a "semi-manual" mode.

Which frequency type should we work with? Since our list of goods plus intersections was not very popular (narrowly targeted), I focused on the frequency in quotes (without exclamation marks), i.e., the phrase in its various word forms: different cases, numbers, genders, declensions. This indicator lets us roughly estimate the traffic we can get from Yandex if we reach the TOP.

In Key Collector we collect the quoted frequencies for these phrases (of course, if you have a seasonal product, collect the frequencies in season):

Then we delete everything that equals zero. If your topic is more popular and there are many non-zero phrases, you can raise the lower threshold to 5 or even higher. Out of 1,439 phrases, I got only 43 non-zero queries for Moscow and the region.
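The zero-frequency cut boils down to a one-line filter. A tiny sketch, with made-up frequency numbers:

```python
# Made-up "quoted" frequencies for a few generated phrases.
freqs = {
    "extension cord buy": 25,
    "extension cord on a reel": 11,
    "extension cord iek wholesale": 0,
    "extension cord 30 m price": 0,
}

MIN_FREQ = 1  # raise to 5 or higher in more popular topics
kept = {phrase: f for phrase, f in freqs.items() if f >= MIN_FREQ}
print(kept)  # only phrases with real demand survive
```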

I transfer these 43 phrases with frequency data to Excel:

4. Clustering requests

I do all this in Rush Analytics, here is the clustering algorithm in this service:

For each query, the TOP-10 URLs for the given region are "pulled" from the search results. Then clustering happens by shared URLs. You can set the clustering accuracy yourself (from 3 to 8 shared URLs).

Say we set the accuracy to 3. The system remembers the URLs of the pages in the TOP-10 for the first query. If the TOP-10 for the second query contains at least the same 3 URLs the first one had, the two queries fall into one cluster. The number of required shared URLs is the accuracy we set. This check is performed for every query, and as a result the keywords are divided into clusters.
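The logic just described can be sketched in a few lines of Python. This is a simplified illustration, not Rush Analytics' actual code, and the SERP data is fake; a real implementation would fetch the TOP-10 URLs from the search engine:

```python
def cluster_by_serp(serp, accuracy=3):
    """Greedy grouping: a query joins an existing cluster if it
    shares at least `accuracy` TOP-10 URLs with the URLs remembered
    for that cluster's first query."""
    clusters = []  # each cluster: {"queries": [...], "urls": set}
    for query, urls in serp.items():
        urls = set(urls)
        for cl in clusters:
            if len(urls & cl["urls"]) >= accuracy:
                cl["queries"].append(query)
                break
        else:
            clusters.append({"queries": [query], "urls": urls})
    return [cl["queries"] for cl in clusters]

# Fake SERP data: query -> its TOP URLs (letters stand in for URLs).
serp = {
    "extension cord buy":   ["a", "b", "c", "d", "e"],
    "extension cord price": ["a", "b", "c", "x", "y"],
    "extension cord 30 m":  ["p", "q", "r", "s", "t"],
}
print(cluster_by_serp(serp, accuracy=3))
# the first two queries share 3 URLs -> one cluster; the third stands alone
```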

  1. Go to Rush Analytics -> Clustering and create a new project (upon registration everyone receives 200 rubles on their account for testing, which is convenient):
  2. Choose the search engine and region that have priority for us:

  3. Choose the clustering type. In this case I choose "Wordstat". The "Manual markers" method does not suit me here, since the queries contain only one marker, "extension cord". If you upload several different product types at once (for example, extension cords, light bulbs, etc.), it is better to choose "Wordstat + manual markers" and specify the markers (mark markers with 1 in the second column and non-markers with 0; the frequency goes in the third column). Markers are the most basic queries that have no logical connection with each other (the queries "extension cord" and "light bulb" cannot "fit" on one page). In my case I work through each product in stages and create separate campaigns for convenience. Also choose the clustering accuracy. If you do not yet know which one to pick, you can tick all of them (this does not affect the price) and, after getting the result, choose the option that clustered your queries best. From experience, accuracy = 5 is the most suitable across topics. If you are clustering for an existing site, I recommend entering your site's URL (if your site is in the TOP-10 for a query, your URL will be highlighted in green in the resulting file):

  4. In the next step, upload the file to the system. You can also set stop words, but my file had none, so this function is not needed in this example. The clustering price is 30-50 kopecks per query (it depends on the volume):
  5. Wait a bit for the Rush Analytics service to do its job, then go to the completed project. There you can already view the clusters for each clustering accuracy (the beginning of a new cluster and its name are highlighted in bold):
  6. Again, accuracy 5 is best for clustering; it fits most cases.
  7. Also in the next tab you can see a list of non-clustered words:

    Why weren't they clustered, you ask? Most likely the search results for these queries are of low quality, so they could not be assigned to any cluster automatically. What to do with them? You can cluster them manually and, where it makes logical sense, create separate landing pages. You can even create a separate cluster for a single query and "land" it on its own page. Alternatively, you can expand the word list and re-cluster in Rush Analytics.
  8. In the "Topics Leaders" tab, you can see the TOP domains for these queries:

  9. By the way, next to some queries you can see green thumbs-up icons:
    This means that for these queries you already have a landing page for this cluster in the TOP-10, and you should keep working on it.
  10. All of this can be downloaded to your computer as an Excel file so you can keep working in that document. I work with accuracy 5, so I download this file:

  11. The Excel document has the same information. The beginning of each cluster and its name are highlighted in gray (click on the image to enlarge):

  12. In addition to the cluster names, here you will see their sizes, frequencies, total frequencies, top URLs, relevant URLs, and the highlights, which are very useful when working on a landing page. Here they are:

    Note that the "Universal" brand (written with a "U") is also among the highlights; I did not even suspect the brand could be written that way. In the highlights you will also see synonyms and thematic phrases that are highly desirable to use on landing pages to reach the TOP.

Conclusion

What's next? What does this clustering give us? Now, for each cluster, our website should have a separate and, most importantly, relevant URL. The promotion of these pages is entirely in our hands, and we push forward as best we can (content optimization, internal linking, external optimization, social factors, etc.).

If we had clustered incorrectly, many queries would have been difficult to promote. They would be the "anchor" holding us back, even as we spent a lot of money promoting these pages.

Correct clustering will help you save a lot and make it much easier to get into the coveted TOP.

What do you think about this? How do you cluster the Semantic Core queries?


What is Semantics Clustering?

This service works online and lets you cluster keys based on search engine results. Grouping is actually only one of the service's features, but it is the one we will talk about now.

We create a new project, in which we indicate its name, select a country, region, etc.

We set the accuracy and indicate which frequency data the service should work with.

Let's create a project. In the window that appears, we will see "control information", which contains the cost of our project.

You can also get acquainted with the cost by simply clicking on the price tab.

After clustering, the service exports an Excel document with the key phrases broken into groups.

We go over the result and refine it, because the work was done by a machine and errors are possible.

Pros:

  • the work takes place online;
  • all the projects we have worked with are saved.

Cons:

  • it costs money;
  • the final price is high;
  • you still have to go through everything by hand.

Surely many have heard about this program, and some have worked in it, collecting frequencies. Key grouping is just a small part of what this utility can do.

You can group queries into phrases based on the search results. Search-based grouping works only after the KEI data has been collected. All in all, it takes about 2 minutes.

Pros:

  • intuitive interface;
  • the ability to customize grouping;
  • a huge number of options for working with semantics;
  • relatively low price of the product.

Cons:

  • must be installed on a PC;
  • you cannot edit the resulting groups in the utility itself, only in the Excel document;
  • you need to adjust the clusters manually.

A well-known SEO platform with an automatic clustering tool. It differs from competitors in that it pulls the top 30 results for each phrase added to clustering in real time and forms groups of semantically related phrases based on how many sites use the same phrase on their pages. The more sites from the top 30 share the same phrases, the stronger the connection between them, and the service adds such phrases to one cluster. The developers have a separate video explaining how the mechanism works.
And while the technical side of clustering is complex, setting up a project is easy for the user.

There are 4 stages in total. In the first, you set the project name. The second step is importing the key phrases for clustering: you can add them manually via Ctrl + V or import a file. The third stage is choosing the clustering region; the service lets you select a region down to the city level, which is important for local SEO. The final, fourth step is setting the type and strength of clustering.




In the fourth step, you can leave the clustering mode at the default. If you don't like the result, you can simply change the project settings and re-try the same keywords with different parameters for free.


The result is exported to XLS, XLSX, or CSV files and looks like this:

Clustering is one of the most important stages of working with a project. This procedure allows you to configure the site properly in the early stages of the resource's development and steer it in the right direction later on. With it, you can avoid unnecessary reworks and revisions of the resource after several months of promotion.

For my own part, I can add that grouping with services is good and convenient. But to be 100% sure of the result, the final stage of semantics clustering should in any case be done by hand.

A ready list of queries is not yet a semantic core: you still need to distribute the queries across pages to get an idea of how to fill the site. Without good semantics it will be very difficult to get search traffic.

What is query clustering

Query clustering is precisely the distribution of same-topic search queries into groups for promoting landing pages.

Clustering includes the following processes:

  • grouping queries according to the user's intention (intent);
  • checking whether keywords are compatible for promotion on one page in the Yandex top.

Queries with the same intent are different queries through which a person is, in fact, looking for the same thing. An obvious example is the query [parker pen] in its different spellings. The situation is more complicated with synonyms such as [table lamp] and [night light], [birth certificate] and [metric], [monitor] and [screen]. The difficulty is that when you look for key synonyms through the Yandex dictionary, the system does not always offer an adequate selection.

In practice, similar queries can have many different characteristics that prevent them from being placed on the same page. Clustering queries by the tops comes to the rescue: the clusterizer finds identical URLs in the top of the search results, which signals the presence of the same intent. The result is expressed as follows:

  • the presence of the same URLs in the top for queries means the possibility of their promotion on the same page;
  • the absence of shared URLs indicates, with a high probability, the impossibility of such a promotion.
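Both bullet points boil down to a single set-intersection check. A toy sketch (the URLs and the threshold value are invented for illustration):

```python
def same_intent(top_a, top_b, threshold=3):
    """If the TOPs for two queries share at least `threshold` URLs,
    the queries likely carry the same intent (per the logic above)."""
    return len(set(top_a) & set(top_b)) >= threshold

# Invented TOP snippets for [table lamp] and [night light]:
top_lamp  = ["site1/lamps", "site2/lamps", "site3/light", "site4/shop"]
top_night = ["site1/lamps", "site2/lamps", "site3/light", "site5/blog"]
print(same_intent(top_lamp, top_night))  # True: 3 shared URLs
```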

Why do you need clustering

Even the largest semantic cores can be grouped quickly with automatic clusterizers. Where disassembling the core used to take weeks or months, clusterizers cut the work to a couple of hours. A big advantage of clustering is that it distributes queries across pages in such a way that they can be promoted simultaneously. It is hard to imagine a manual analogue of high-precision clustering, since even an experienced optimizer makes up to 30% erroneous assignments. It follows that keyword clustering is necessary in almost every case.

When I was a newbie webmaster, I made a website with a separate article for each query. Of course, it got no traffic; the result was pure failure. And this really is a problem for many beginners: wrong queries or wrong clustering.

Clustering techniques

When grouping queries, there is no single agreed methodology for combining them based on the tops. In practice, two main methods are used: "soft" and "hard" clustering.

Soft clustering forms a group around one "central" query: all the others are compared against it by the number of shared URLs in Yandex's top 10. Soft clustering produces fairly large groups, but it often errs in determining whether the queries can really be promoted together on one page.

Hard clustering combines queries into a group only when there is a set of URLs, common to all of the queries, that appears in the top 10 for every one of them.
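To make the difference concrete, here is a toy comparison of the two methods in Python. The SERP data is invented and both functions are simplified sketches; real services implement far more elaborate versions:

```python
# Fake SERP data: query -> set of TOP-10 URLs (letters stand in for URLs).
serp = {
    "table lamp buy":   {"a", "b", "c", "d"},
    "table lamp price": {"a", "b", "c", "x"},
    "night light buy":  {"a", "b", "d", "x"},
}

def soft_cluster(serp, threshold=3):
    """Soft: a new query joins a cluster if it shares `threshold`
    URLs with the cluster's 'central' (first) query only."""
    clusters = []
    for q, urls in serp.items():
        for cl in clusters:
            if len(urls & serp[cl[0]]) >= threshold:
                cl.append(q)
                break
        else:
            clusters.append([q])
    return clusters

def hard_cluster(serp, threshold=3):
    """Hard: a new query must share `threshold` URLs with EVERY query
    already in the cluster, so each cluster tracks its common URL set."""
    clusters = []  # list of [queries, common_url_set]
    for q, urls in serp.items():
        for cl in clusters:
            if len(cl[1] & urls) >= threshold:
                cl[0].append(q)
                cl[1] &= urls  # the common set can only shrink
                break
        else:
            clusters.append([[q], set(urls)])
    return [cl[0] for cl in clusters]

print(soft_cluster(serp))  # all three merge around the central query
print(hard_cluster(serp))  # the third query no longer fits
```

On this data, soft clustering merges all three queries into one large group, while hard clustering rejects the third query because it shares only 2 URLs with the group's common set: exactly the completeness-versus-accuracy trade-off discussed below.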

There are two criteria for assessing clustering:

  1. Completeness: the share of queries with the same intent that end up in one group. If all queries with a given intent fall into the same group, completeness is 100%.
  2. Compatibility: how well the queries that fell into one group can actually be promoted together. The case where all queries in a cluster are compatible with each other is taken as 100%.

An important role is played by the "clustering threshold": the minimum number of shared URLs required to form a group. A higher threshold means more accurate groups, but the groups naturally shrink in size. Experience with semantics clusterizers shows that the minimum working threshold for "hard" clustering is 3 URLs, and for "soft" clustering 4 URLs.

Even with a threshold of 3 URLs, hard clustering provides over 90% accuracy. For comparison: without tools, an experienced optimizer's accuracy is at best 70%, and a beginner's no more than 30%. Despite the high accuracy, the "hard" method yields only about 40% completeness.

Soft clustering has a high completeness rate but loses significantly in accuracy. The "soft" and "hard" methods thus trade off against each other, and which one to use depends on the goals of the optimization process.

With "traffic-based" promotion, when it is important to land as many queries as possible on a page, soft clustering is better suited. If "position-based" promotion is being carried out, the decisive word goes to hard clustering.

Hard clustering is also used in textual analysis of a page. Any textual analysis of a page's query group depends rather strictly on the quality of that group, and only the "hard" method provides groups of the required quality.

How to group the semantic core

I usually do clustering in two stages. First I run the core through some service or program for automatic clustering, and then I finish the core manually, via Excel. Here are some videos on the topic:

These videos make it reasonably clear how to do the manual finishing; as for automatic clusterizers, everyone picks whichever one they like best.

Topvisor

Topvisor's automatic query grouper is an alternative to Rush Analytics and Semparser, with an interface similar to the latter. Adjustable grouping strength and saving the project to an Excel file are both present.

The Topvisor clusterizer also has a "regrouping" operation. After applying it, the number of groups increases while the number of queries in each noticeably decreases. This function is useful for those who are not satisfied with soft clustering and prefer the hard option.

"Regrouping" is a paid operation here, although it costs no more than a couple of rubles.

Topvisor's main advantage is its high grouping speed: the clusterizer distributes a semantic core of 1,000 queries in a matter of minutes. Its disadvantages are the high cost of grouping and, of course, the need for manual editing.

Grouping via Key Collector

Another example of an automatic clusterizer is the online tool at coolakov.ru. It breaks queries into groups based on the similarity of Yandex's top 10.

Plus: Free online service.
Cons: low accuracy of grouping, lack of uploading to a file.

Summing up: you can confidently opt for the automatic clustering tools offered by various online services. But, unfortunately, the output of any clusterizer requires manual revision.

I keep supplementing it little by little, but I have written practically nothing about what clustering of key (search) words is and how to do it.

So, in order to get started, we need:

  • Semantic core (1 piece),
  • Clustering tools (2-3 pcs),
  • Stock of patience (2 kg).

To understand how search-word clustering works, we need that very list of words. I have written more than once about how to assemble a semantic core on your own, so I will not repeat myself. Let's assume the semantics are collected, the tea is brewed, and a small cart of patience is waiting by the desk.

What is clustering?

We have several terms, understanding which is extremely important for our work. So, we will start with them:

Cluster analysis is a multivariate statistical procedure that takes data containing information about a sample of objects and then arranges the objects into relatively homogeneous groups.

(c) Wikipedia

Semantic core clustering is the ordering of the keyword list: creating promotion clusters and dividing the keys among relevant pages.

How is keyword clustering obtained?

Clustering, or grouping, of keywords can be done according to several principles. There are many proprietary technologies on the net, but I would single out two main principles:

Manual clustering of search queries (suitable for new sites that are still at the project stage, when you can lay down the semantics at launch): you collect keywords and assign them to groups manually, either immediately or later.

Example. Say you are collecting keywords for a small business site that you want shown to users in organic search results; for instance, the site sells apartment renovation services ...

The principle of collecting the semantic core for a small site

The services themselves are divided into several categories, for example finishing work and interior finishing. Each direction becomes a group, so you already have 2 groups. Next you analyze search queries and form a separate core for each group. As a result you get a clustered semantic core, for example as a table with the fields:

  • Keywords
  • Frequency
  • Page URL
  • Group

Then, using a filter in the table, you sort by keyword groups. As a result you have a list of words for each page (section) of the site, and together these lists make up the clustered semantic core.
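The same filter-and-sort step can be sketched in a few lines of Python; the rows, frequencies, URLs, and group names below are hypothetical:

```python
from collections import defaultdict

# A hypothetical clustered core: rows of (keyword, frequency, url, group).
rows = [
    ("apartment renovation kiev", 880, "/",           "main"),
    ("finishing work price",      320, "/finishing/", "finishing"),
    ("interior finishing cost",   210, "/interior/",  "interior"),
    ("finishing work order",      150, "/finishing/", "finishing"),
]

# The same "filter by group" operation you would do in the spreadsheet:
by_group = defaultdict(list)
for keyword, freq, url, group in rows:
    by_group[group].append(keyword)

for group, keywords in by_group.items():
    print(group, "->", keywords)
```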

How to collect semantics for a project and cluster it most efficiently?

Let's take what is described above as an example and look at the intended structure of the site.

Also, we can add some additions to our keyword clustering.

Keywords for the main page: this cluster should include the most important keywords for your site, the ones the main page itself is relevant to (if you offer apartment renovation services, a query like "apartment renovation in Kiev" works well). We collect a list of queries for the more general content of our niche.

Service and product pages: clustering the semantic core for these pages begins with a logical separation by importance. Which matters more to you, the kitchen renovation service or the "bedroom renovation" service, or are they all equal priority? This cluster should contain words matching user queries about services, for example: "construction crew services".

Articles and blog: this part of the semantic core will contain informational queries, for example "how to whitewash a wall yourself" or "wall paint manufacturers". Do not neglect these sections: even if yours is a commercial site where only service pages bring profit directly, normal, useful content will create stable traffic and help convert readers into customers.

Automatic clustering of the semantic core on an existing site

If you have decided to do SEO for an existing site and do not know where to start, first check which keywords the site already ranks for.

For example, this can be done with Serpstat: just enter the address of the page being checked and see which keywords you already have positions for.


For example, I entered the address of the main page and received a list of key phrases with positions; in the URL table I found the links that appear in search results, and by following a link I got a list of relevant phrases for a specific page.

Thus, you can see not only what positions your site is in, but also cluster search queries using Serpstat.

To be continued…

Consider in the near future:

  • Tools for manual clustering of search queries,
  • Tools for automatic clustering of search queries.

P.S. If you want to start clustering search queries but don't have the time, post a link to your project in the comments, and I will write a walkthrough on your specific example of how to implement clustering of the semantic core in practice.

Query clustering is a grouping of the semantic core that distributes all queries among the site's sections, or builds a correct site structure that takes search demand into account. In this tutorial, we'll look at a proper example of clustering a semantic core we've put together.

Watch the video on Semantic Core query clustering

Let's get back to working with our application. Earlier we saved the collected search phrases separately for contextual advertising; now it's time to save the results for search promotion and combine them into a single Excel file for further work.

In our case there are only two mask groups. We combine the information and remove all unnecessary columns, leaving only three: the phrase, the overall frequency, and the frequency in quotes. As a result we get the following:

We delete (if this was not done earlier) queries with extremely low frequency and begin grouping the keywords that remain.

Online clustering of semantic core queries

Fill in a new sheet with data:

After defining the main sections of the site, it's time to start listing the filter pages. Let's go back to the page of a successful online women's clothing store and scroll down:

The so-called filter list opens in front of us. These pages are a great opportunity to promote many of the site's queries without hurting the user experience, and sometimes they even help with navigation. Later we will analyze exactly how to build such a structure on the site itself; for now, let's return to drafting its future structure.

For convenience, you can highlight the groups of queries in different colors: let the future site sections be green, and the filter and tag pages yellow. Then add them all to the second sheet of our document.

We add the last, third point: articles.

This section of the site can collect exactly the second type of search phrases, the informational ones. They will bring traffic that, with proper marketing, can be converted into conversions and regular customers.

Ultimately, no cluster groups should be left over: all of them should be distributed among the three points on the new sheet of the document. In the following articles and related videos, see what to do with each group of queries.

In the meantime, our task is to create the site structure, set up the necessary sections, and assign tasks for writing texts and articles.

Do not forget to think about the format in which the articles will be presented. On a competitor's site in our niche you can see as many as 3 ways of collecting traffic from informational queries:

Summing up, it is worth noting the logical need to structure the site through clustering: above all, it matters that visitors find it comfortable and easy to navigate your site. This will bring you more sales and good promotion results.