the actionable guide to doing better semantic keyword research #brightonseo (with knime)
The Actionable Guide to Doing Better Semantic Keyword Research
The Prevalence of Semantic Search (Unstructured)
• Search engines are coming to rely more and more on semantic search technology to understand websites and how users search.
• As a result, SEOs need to better understand how language and keywords relate to each other in order to do more effective keyword research.
Do semantic keyword research!
What Is Semantic Search?
Strings can represent things:
• Search engines are looking past exact-match keyword occurrences on web pages.
• They are learning the meaning behind keywords and examining how they relate to each other conceptually.
• The strength of that conceptual connection is scored for relevancy within search queries and on-page.
What is a mammal that has vertebrae and lives in water?
+1 Probability
+1 Probability
+1 Probability
Google Hummingbird
What’s up with Hummingbird?
“Hummingbird is paying more attention to each word in a query, ensuring that the whole query – the whole sentence or conversation or meaning – is taken into account, rather than particular words. The goal is that pages matching the meaning do better, rather than pages matching just a few words.”
Hummingbird improves semantic understanding of search queries AND makes conversational search better, which is important for the future of mobile and voice search.
Hummingbird Summarized
I like Gianluca Fiorelli’s analysis of the theoretical capabilities of a post-Hummingbird Google search:
1. To better understand the intent of a query;
2. To broaden the pool of web documents that may answer that query;
3. To simplify how it delivers information, because if query A, query B, and query C substantively mean the same thing, Google doesn't need to propose three different SERPs, but just one;
4. To offer a better search experience, because expanding the query and better understanding the relationships between search entities (also based on direct/indirect personalization elements), Google can now offer results that have a higher probability of satisfying the needs of the user.
5. As a consequence, Google may present better SERPs also in terms of better ads, because in 99% of the cases, verbose queries were not presenting ads in their SERPs before Hummingbird.
Source: http://pshapi.ro/mozingbird
How Can SEOs Optimize for Semantic Search?
1. Make sure our content delights our users
• Create quality content and use personas
2. Optimize for searcher intent and build topical authority using semantic topic modeling
• Understand how users search and have command of your niche’s language
Now THIS is great content.
Build Topical Authority for a Subject
When conducting keyword research, optimizing on-page, or creating content, have a deep understanding of your niche’s language:
1. Understand how concepts relate to one another and which keywords pertain to those concepts.
2. Ensure these concepts are well represented.
Optimize for Searcher Intent
Have an exceptional understanding of consumer language and the myriad ways users may search about your niche:
1. What are consumers looking for when they are familiar with your niche?
• Language used should represent core keywords.
2. What are consumers looking for when they are not familiar with your niche?
• Language tends to be more conversational. You may uncover more related terms when exploring your niche from this perspective.
3. What else do these two groups search for?
• These searches may be directly and/or indirectly related.
Actually doing Semantic Keyword Research…
Social Media Is an Awesome Data Source
Social Media Is an AWESOME Data Source for Semantic Keyword Research
1. Social media data helps us expand our collection of keyword ideas, especially new, breaking keywords.
2. Social media language is inherently conversational and can help us understand how conversational queries may be phrased.
3. We can use it to mimic the language of the customer, which has a secondary CRO benefit.
#Awesome
Secondary CRO Benefit: The Echo Effect
While you’re at it, use social media language to mimic the language of your consumer. Several studies indicate it may help build trust and boost conversions:
• Study published in the International Journal of Hospitality Management: Waitresses who verbally mimicked a person’s order were more likely to receive higher tips.
• Study published in the Journal of Language and Social Psychology: Mirroring people’s words can be very important in building likability, safety, rapport, and social cohesion.
http://pshapi.ro/echohospitality
http://pshapi.ro/echoinfluence
Once We Collect SERP and Social Media Data...
There are several ways we can break it down and analyze it.
Co-occurrence
• How often two or more words appear alongside each other in a corpus of documents.
Latent Dirichlet Allocation (LDA)
• Finds semantically related keywords and groups them into topical buckets.
TF-IDF (Term Frequency-Inverse Document Frequency)
• Reflects how important a keyword is to a document within a whole collection of documents.
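To make the TF-IDF definition above concrete, here is a minimal Python sketch. It assumes one common weighting (term count divided by document length, multiplied by the natural log of total documents over documents containing the term); KNIME's nodes and other tools may use different variants. The function name and the three sample documents are made up for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document.

    Assumed weighting (one of several common variants):
      tf  = term count / number of tokens in the document
      idf = log(total documents / documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical mini-corpus for illustration only.
docs = [
    "whale is a mammal",
    "dolphin is a mammal",
    "shark is a fish",
]
scores = tf_idf(docs)
```

Words that appear in every document ("is", "a") score zero, while a word unique to one document ("whale") scores higher than one shared by two documents ("mammal"). That is the property that makes TF-IDF useful for filtering out overly common terms.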
The Ultimate Tool
KNIME Is the One Tool to Rule Them All
• Free and open source, running on every platform
• Allows you to do things using a drag-and-drop interface that you would normally need a developer or programming background to accomplish.
• Synergizes data-oriented tasks and helps easily automate: data collection, data manipulation, analysis, visualization, and reporting.
http://pshapi.ro/downloadknime
Visualizations KNIME Produces That Will Help Optimize for Semantic Search
• Keyword Node Graphs
• Segmented Word Clouds
Basics of KNIME
What’s a Node?
• Pre-built drag-and-drop boxes designed to do a single task.
• They are combined together into “workflows” to do larger, more complex tasks.
• Nodes can be grouped together into meta-nodes which can be configured in unison.
How Do You Add Nodes and How Do They Connect?
How do you add nodes?
How do you connect nodes to one another?
Configuring Nodes and Running Workflows
Configuring Nodes
Running Workflows
OR
Accessing Data from SERP and Twitter + Common Node Configurations We’ll Be Using
Get a Twitter API Key
Go to: https://apps.twitter.com/
Fill out the forms!
• Application “Name”, “Description”, and “Website” don’t matter for our purposes.
Go to the “Keys and Access Tokens” tab and grab:
• Consumer Key (API Key)
• Consumer Secret (API Secret)
Click “Create my access token” and grab:
• Access Token
• Access Token Secret
Accessing Social Data – Twitter API Nodes
Right-Click and “Configure” to input API
information
Right-Click and “Configure” Twitter
Search Query (and type)
It’s Stupid Easy
Extract Only the Links from Twitter
A little trickier than it should be since you have to expand t.co links and URL shorteners.
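Outside of KNIME, the same idea can be sketched in a few lines of Python. This is only an illustration, not the workflow from the slides: the URL pattern is deliberately simple, `expand_link` is a hypothetical helper that follows redirects with the standard library, and the sample tweets and short links are invented.

```python
import re
import urllib.request

# Simplified pattern: an http(s) URL runs until the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")

def extract_links(tweets):
    """Return the unique URLs found in the tweet texts, in first-seen order."""
    seen = []
    for text in tweets:
        for url in URL_PATTERN.findall(text):
            url = url.rstrip(".,;:!?)\"'")  # drop trailing punctuation
            if url not in seen:
                seen.append(url)
    return seen

def expand_link(url, timeout=10):
    """Hypothetical helper: follow redirects and return the final URL.

    Needs network access; some shorteners may reject the default user agent.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()

# Invented tweets with made-up short links, for illustration only.
tweets = [
    "Great read on semantic search https://t.co/abc123.",
    "RT https://t.co/abc123 and also http://bit.ly/xyz",
]
links = extract_links(tweets)
```

Deduplicating before expansion keeps the number of network requests down, which matters when many tweets share the same link.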
Accessing SERP Data – Inputting Data Manually
Manually input URLs with Excel Spreadsheet or CSV (Desktop Rank Checkers)
Manually input URLs with “Table Creator” node (Right-Click Configure – edit just like a spreadsheet)
Accessing SERP Data – Inputting Data via API (Better)
Example – GetSTAT
More-Complicated Meta Node Method
Make Webpages Plain Text (for Analysis)
Use Boilerpipe API (pre-made meta-node download to be provided)
http://boilerpipe-web.appspot.com/
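For readers who want to see what "making a webpage plain text" involves, here is a rough Python sketch using only the standard library. It is not Boilerpipe's algorithm, which does far more sophisticated boilerplate detection; this stand-in simply drops the content of a few tags assumed to hold non-article material and keeps the remaining text. The class name, function name, and the list of skipped tags are assumptions.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping tags that rarely hold article content."""

    # Assumed list of tags to ignore; adjust for the pages being analyzed.
    SKIP = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # greater than zero while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    """Return the text of an HTML string with skipped sections removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

# Made-up page for illustration only.
sample = (
    "<html><head><script>var a = 1;</script></head><body>"
    "<nav>Home</nav><p>Whales are mammals.</p>"
    "<footer>Copyright</footer></body></html>"
)
text = html_to_text(sample)
```

This crude approach relies on well-formed markup (an unclosed `<nav>` would swallow the rest of the page), which is one reason a dedicated extractor is the better choice for real SERP pages.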
Getting Things into a Text Analysis Format
Use the built-in “Strings To Document” node
A Few More Useful Base Nodes for Text Analysis
Parts of Speech Tagging (POS)
Calculate TF-IDF
Co-Occurrence Nodes
LDA (Latent Dirichlet Allocation) Node
Color Manager & Word Cloud
Network Graph
Process: Using KNIME for Semantic Topic Modeling and Keyword Research
Bringing It All Together: Applying Concepts to Visualizations
1. Search Twitter for the keyword and collect all of the Tweet text
2. Search Twitter for the keyword, extract links only, and scrape text from the links
3. Extract the top 10 ranking pages for the keyword and scrape text from the links
4. Isolate single-word keywords and/or multi-word N-grams
5. Calculate TF-IDF
THEN we can…
• Tag Parts of Speech (Nouns, Adjectives, Verbs, etc.) and display in Word Cloud
• Do Co-Occurrence Analysis and display in Node Graph (remember earlier patent?)
• Identify semantic topic groupings with LDA and display in Node Graph
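The N-gram step in the process above (isolating single words and multi-word phrases) can be sketched in Python as follows. This is a minimal illustration that assumes whitespace tokenization and lowercasing; the function name and the sample sentence are made up, and KNIME's own N-gram node may tokenize differently.

```python
from collections import Counter

def ngrams(text, n):
    """Return all runs of n consecutive words as space-joined strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Made-up sample text for illustration only.
sample = "semantic keyword research beats plain keyword research"
unigrams = ngrams(sample, 1)
bigrams = ngrams(sample, 2)
bigram_counts = Counter(bigrams)
```

Counting the results shows which phrases repeat (here "keyword research" appears twice), and those counts are what feed into the TF-IDF calculation in step 5.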
Analysis We Can Do Based on a Google Patent
Simplified with a smaller corpus, but easily replicable with KNIME:
1. Filter out overly common terms using TF-IDF.
2. Take the top 20 or so terms that are above a certain threshold based upon TF-IDF and remove the rest.
3. Calculate co-occurrence of the remaining terms.
4. Optimize your site for these!
Bill Slawski Patent Analysis: http://pshapi.ro/cooccurencepatent
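The simplified steps above can be sketched in Python to show how the pieces fit together. This illustrates the slide's simplified version, not the method described in the patent itself. Several choices are assumptions: the TF-IDF weighting, scoring each term by its best TF-IDF across documents, counting co-occurrence at the document level, and all function names and sample data.

```python
import math
from collections import Counter
from itertools import combinations

def top_terms_by_tfidf(token_docs, top_k=20, min_score=0.0):
    """Steps 1 and 2: keep the top_k terms whose best TF-IDF exceeds min_score."""
    n_docs = len(token_docs)
    df = Counter(term for tokens in token_docs for term in set(tokens))
    best = {}
    for tokens in token_docs:
        for term, count in Counter(tokens).items():
            score = (count / len(tokens)) * math.log(n_docs / df[term])
            # Assumption: a term is represented by its highest score in any document.
            best[term] = max(best.get(term, 0.0), score)
    ranked = sorted(
        (term for term in best if best[term] > min_score),
        key=lambda term: (-best[term], term),
    )
    return ranked[:top_k]

def cooccurrence(token_docs, keep):
    """Step 3: count documents in which each pair of kept terms appears together."""
    keep = set(keep)
    pairs = Counter()
    for tokens in token_docs:
        present = sorted(set(tokens) & keep)
        pairs.update(combinations(present, 2))
    return pairs

# Made-up corpus for illustration only.
token_docs = [doc.split() for doc in [
    "the whale is a marine mammal",
    "the dolphin is a marine mammal",
    "the shark is a fish",
    "the whale and the dolphin swim",
]]
terms = top_terms_by_tfidf(token_docs, top_k=20)
pairs = cooccurrence(token_docs, terms)
```

In this toy corpus "the" appears in every document, so its TF-IDF is zero and it is filtered out, while "mammal" and "marine" co-occur in two documents. Raising `min_score` tightens the filter on common terms.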
Bringing It All Together – Parts of Speech Output
Bringing It All Together – TF-IDF + Co-Occurrence Output
Bringing It All Together – TF-IDF + LDA Output
Now Start Building More Effective Semantically Optimized Websites!
© 2015 by Catalyst Digital. All rights reserved.