Find Almost Unlimited New Link Targets
By Michael Carlin

© 2013-2014, Michael Carlin. No part of this publication may be reproduced or altered by any means without the prior written permission of the publisher and copyright owner. This PDF can be freely distributed in its original, unmodified form.



Table of Contents

Introduction: Why You Must Read This Strategy Guide
How This Guide Is Structured
What You Need To Follow Along With This Tutorial
The 5 Steps To Follow
Step 1 - Getting Started And Choosing Your Target CMS
    A. Choose a CMS you already have a success/verified links-list for
        Exporting Existing GSA-SER Links-Lists
        What If You Don't Know Which List To Use Or Don't Have A List At All?
    B. Choose a CMS, and then find or create a starting/seed list for it
        Using GSA-SER To Get CMS Ideas
        Another Approach: Brand New CMS Targets!
        Be Efficient! Avoid Crappy CMS
        Getting Close And Personal With A CMS
        Short Version Of Step 4
        Tech Note: Beware The Vertical Bar
    Step 1 - Summary
Step 2 - Import and Analyze a Starting or Seed Links-List With Footprint Factory
    Using Footprint Factory
        Overview: How Footprint Factory Will Rock The SEO World
        General Default Options
        Mode Settings
    Filtering Footprints
    Using URL Footprints
Step 3 - Decide On Scraping Strategy and Export Footprint Factory Results
    Snippet/Footprint Lists VS Web Scraping Footprint Lists
    How To Build Your Footprint List
    Step 3 Summary
Step 4 - Use Your Footprint List With Your Web Scraper
    Using ScrapeBox
    Using HRefer
Step 5 - Run Your Scraped List To Test For Successes
    Importing Your URLs Into GSA-SER (2 options)
    Important Project Settings
    Other Tips
    Know The Numbers To Gauge Your ROI
To Infinity And Beyond!
Resources
    Tools
    More Tutorials
    Speak To Me

Introduction: Why You Must Read This Strategy Guide

    If you do this process for 30 minutes each day for 5 days you will end up with hundreds of thousands of new link targets. I'm talking about contextual, do-follow links too!

Links bring traffic. Traffic brings sales.

    This guide describes how I make new link-lists for link building, as of December 2013. Here's a tier two GSA-SER project I ran and the scraped target list is the result of a Footprint Factory analysis:

I have just over 67,000 verified links from 22,000 unique domains. This is so exciting because the raw scrape list I'm running only had about 830,000 URLs in it.

That means that just over 8% of the URLs I scraped during the making of this tutorial gave me verified backlinks. To put that in perspective, a lot of people only get about 1-2% with a raw list, so 8% is pretty amazing. That cuts your working time down by 75%, and it also means you only have to do this process a few times to build a very large database of powerful links you can use to rank your sites quickly and easily.

This guide is almost 6,000 words long because it covers all the tiny (but important) details you won't find on SEO blogs. Despite the word count, you can set up the link-finding process in less than 30 minutes once you understand the 5-step process.

Below are some of the results I got from using this strategy. It's pretty much all GSA-SER links. There are some blog network links in there as well. But most of it is GSA, so it's the tier one of 871 links and the 67,203 tier two links pointing to it, shown in the image above. This is a new website, and the rankings were checked 1 week after I first made the video series:

    I then did not build any links for more than 2 weeks. Not a single one. And the rankings continued to improve (results vs 14 days earlier):

As you can see, the strategy does work, even though the general consensus these days is that tiered link building and GSA-SER are not effective anymore. You can see that's simply not true. Of course, you have to do tiered linking the right way, and find lots of new link targets that have not already been spammed to death! More about strategy later!

    How This Guide Is Structured

In general, SEO and making money online can rarely be defined in a step-by-step manner. You often have to take diversions when you're working in the trenches and forging your own path, as much of it is open road rather than linear. However, I've tried to describe the strategy as an iterative process. Here is a visual representation:

This guide will describe this cycle in 5 steps. I also occasionally talk about the free video series I made, but this PDF and the video series can each be consumed separately. In fact, I suggest you absorb both because there are quite a few differences. The PDF is slightly more recent and better organized, and the video series is an over-the-shoulder look. Everything in the video is current, but I feel I've simplified the concepts better in this guide.

    Even if you don't like watching videos, I suspect you will want to have a look by the end of this PDF! The series is less than 30 minutes total, and I've edited them extensively because I hate it when tutorial videos are full of time-wasting erms and ahhs!

You will see links throughout the PDF, but try to follow the tutorial to the end in one sitting. External resources are listed at the end of this PDF to help prevent distractions! There are no affiliate links in this guide except for GSA-SER, and I think most SEOs own it already, anyway!

    What You Need To Follow Along With This Tutorial

1. A Web Scraper (ScrapeBox, HRefer, etc.)
You'll need a web scraper. I prefer HRefer, and I also show you how you can use ScrapeBox instead!

2. Proxies
I have a separate tutorial that shows how to get unlimited free proxies if you don't already have a proxy source. You will need free proxies because if you use a small number of private proxies, they will burn out very quickly.

3. Link Building Tool (GSA-SER, Ultimate Demon, or any link building tool you can add new target lists to)
I'm using GSA-SER to find "starter/seed lists" (explained later), and for testing our new links-lists once we've scraped them.

4. Footprint Factory (free, although the Pro version is better)
You can do this with the free version, but the Pro version also allows you to extract and create additional scraping filters, which is useful if you're using HRefer or the ScrapeBox filter methods I show you later. The Pro version has other benefits like increased URL/keyword/footprint limits and multi-threading.

You don't need to use the exact same tool set I use because the ideas in this tutorial are transferable. But for your reference I used:

Scraper: HRefer & ScrapeBox
Proxies: 300 free public proxies
Link builder: GSA-SER
Footprints: Footprint Factory Pro

The 5 Steps To Follow

This process has 5 steps. As I said earlier, part of it is open-ended, but don't worry: I've used a decision tree to show this visually, and it is not complicated.

1. Choose a single CMS (example: WordPress, vBulletin, etc.)
2. Import and analyze a starting or seed links-list with Footprint Factory
3. Decide on scraping strategy and export Footprint Factory results
4. Use your footprint list with your web scraper
5. Run the list and test for successes (I'm using GSA-SER for this).

Bear with me here; it will all make sense soon if it does not already! Step 1 is the longest section of this guide because I've really broken it down. It's arguably the most important step and well worth doing properly.

Understand that you will sometimes hit snags (like researching a CMS only to discover it's not worth your time) and you will also have to apply yourself. I've explained the thought processes involved in creating huge new links-lists in the hope that you will be able to recognize and solve any future challenges you may have by yourself.

    Step 1 - Getting Started And Choosing Your Target CMS

The first step is to find a good platform, or Content Management System (CMS), to target. The goal is to have a starter/seed list (for just one specific CMS) for use with Footprint Factory, so you can multiply your target lists later.

The reason for choosing just one CMS is that each CMS has different posting rules and a different procedure for adding content and therefore getting your backlinks. So it's best to focus on one CMS at a time.

    As shown in the previous image, you can either:

A. Choose a CMS you already have a success/verified links-list for
B. Choose a CMS, and then find or create a starting/seed list for it.

    A. Choose a CMS you already have a success/verified links-list for

With Option A, you can simply import single-CMS success lists that you already have directly into Footprint Factory. If you don't have any links-lists, you can buy them on BlackHatWorld and other forums. Some people will sell their success lists or their verified links-lists, and those are very often split down into different CMS.

    If you have GSA-SER or some other link builder already, you can export your verified links from that program and you can use that as your starter/seed list for Footprint Factory.

Exporting Existing GSA-SER Links-Lists
If you're already a GSA-SER user that saves verified links-lists, Step 1 is very easy if you know which CMS you want to target. You simply type this into Windows Explorer: %appdata%

Then open the GSA-SER folder, and then open the verified folder. The precise name of the verified folder is specified in the GSA-SER advanced options panel, but for most people it will be site_list-verify. These lists are CMS specific, and are perfect for importing directly into Footprint Factory!
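If you want to script that lookup instead of clicking through Explorer, here is a rough Python sketch. The folder names ("GSA Search Engine Ranker", "site_list-verify") are assumptions based on the defaults described above; check your own advanced options panel for the real path on your machine.

```python
import os

def list_verified_lists(appdata=None):
    """Find verified site-list files, one per engine/CMS.

    The folder names below are assumed defaults; adjust to match
    your own GSA-SER install.
    """
    appdata = appdata or os.environ.get("APPDATA", "")
    folder = os.path.join(appdata, "GSA Search Engine Ranker", "site_list-verify")
    lists = {}
    if os.path.isdir(folder):
        for name in sorted(os.listdir(folder)):
            if name.endswith(".txt"):
                # The file name doubles as the engine/CMS name
                lists[name[:-4]] = os.path.join(folder, name)
    return lists
```

Each entry maps a CMS name to the text file of verified URLs you can feed straight into Footprint Factory.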

What If You Don't Know Which List To Use Or Don't Have A List At All?
But what if you can't decide which existing success/verified links-list you want to use? Or you want to target a new CMS, or you don't have a starting list to use with Footprint Factory? Which CMS do you choose?

    B. Choose a CMS, and then find or create a starting/seed list for it

The second thing you could do is research the CMS first, and then find/buy or create a starter/seed links-list to expand, or choose an existing list if you have one for that CMS.

    I've made a whole video on choosing a CMS because it's an open-ended question, with lots to consider. Starting with a new list is tricky for people who don't know what they're doing. A lot of people will not even know where to start. So if you're not experienced in mass scraping and mass link building, then you'll find the video very useful.

Remember: you have to have a starter/seed links-list to import into Footprint Factory so you can find lots of footprints and make a very targeted scraping list.

    Using Footprint Factory means you will find ALL possible footprints, and gain an unfair advantage over other link-builders!

Using GSA-SER To Get CMS Ideas
I like to use GSA-SER because if I choose a CMS from it, then the chances are I'll easily be able to post to that CMS. Sometimes you may have to edit some of GSA-SER's engine files to make it post properly (another tutorial for another time!), but 90% of the time GSA-SER will give you a fair impression of how many links you can make from a single target CMS.

With GSA-SER open, let's go ahead and research a do-follow contextual platform. Start by having all engines selected.

Right-click and then select "uncheck engines that use no contextual links". After that, you can select "uncheck engines with no follow". So now we can see a few remaining engines that we can look at that are both do-follow and contextual:

    Links that are Do-Follow and contextual are the most powerful links. Any of these remaining CMS would be excellent candidates for further investigation (as shown in the image above):

PHPFox
Ground CTRL
PHPizabi
SocialGo
SocialEngine
DokuWiki
MacOSWiki
MoinMoin

There are even more such targets in GSA; I only listed the ones in the image above.

The next step is to check whether you have decent verified lists for any of those CMS. You only need 25 unique domains in order to move on to the next step (Step 2, with Footprint Factory). Obviously, the more unique domains you have, the better. Usually I don't move on to the next step until I have 1,000 unique domains for that CMS, but it can be done with just 25.

    If you don't have existing verified lists for any of those CMS, then you can either change your requirements (we looked for Do-Follow & Contextual), or you can create your own starter list for one of those CMS. Creating your own starter list involves a small initial scrape (Step 4).

Another Approach: Brand New CMS Targets!
Go into the program files folder, then into the GSA-SER folder, and then the engines folder. Notice that the engine files are not usually stored in the same place as your verified lists, so remember that your GSA-SER files are stored in 2 different locations.

    You can see when the engines were updated. So if I order the list by date modified (by clicking the Date Modified column header) I can see the newest ones:

It also says in the GSA-SER update/release notes that this CMS (MyUPB) was added recently.

The GSA-SER developer lists new engines in the update notes. So it means you can actually see the new engines as they come in...

This means most GSA-SER users haven't started targeting them yet! There is a window of opportunity when these new targets are released.

So I've decided to investigate MyUPB. I don't have any verified URLs for this CMS. Lucky for you, I'll show you how to determine if a CMS is a potential winner!

Be Efficient! Avoid Crappy CMS
Bear in mind you don't want to spend two days scraping for a single CMS only to find that only 10,000 sites use it! So there's a little bit of research involved.

    How do we check to see if there are enough domains using this CMS?

Go back into the GSA-SER engine files (the same place as in the previous screenshot above) and open the engine file to find the default scraping footprints:

See the bottom line with search term=...? GSA-SER holds default scraping footprints for all the engines. They're not usually a comprehensive list, but they're enough for us to get started with.

    I'm going to copy one and go to Google and paste the search string in:

And there seem to be enough results to make this CMS worth investigating. Notice I used a typical kind of footprint: "Powered by MyUPB". These "powered by" footprints are rarely enough to judge a CMS by.

    Many webmasters know to remove these obvious footprints, and also the search results will be full of pages of people talking about the footprint itself, and not just pages showing the footprint naturally on a site that actually uses that CMS. So we'll put another footprint together with the first one. This won't expand your results, but it will refine them so we get less noise:

So there are some results (814,000). It's not massive, but there's definitely enough here to be worth scraping for and spamming to!

Getting Close And Personal With A CMS
And I've had a look at what the CMS looks like as well. It looks like a normal forum:

You should always have a quick look at some real sites using the CMS you are researching, because you will see what sort of links you will get.

Will they be new blog posts? Or will they be links added to existing pages? This is an important distinction because it determines whether or not you can build PR links immediately, or if all your links will be on new PR0 pages.

Also, with some CMS (particularly blog post or new page ones like Social Network and Wiki type targets) you want to know if you can spam it mercilessly, or if you have to be a bit more careful and make readable content, only create one account at the site, and/or use proxies and real email addresses. There are many CMS in GSA-SER, and they behave very differently, so the master GSA-SER spammer (that's you and me!) will treat them differently! Don't be lazy; do a little more work up front and it will multiply your returns.

    So I'm going to continue with the MyUPB CMS! Phew, we finally have a good CMS to do this process with!

Let's create the initial starter/seed links-list so we can go on to Step 2 and analyze it for additional footprints. Creating our initial list from nothing involves scraping. I won't cover it in full detail now because there is a comprehensive description in Step 4.

Short Version Of Step 4
Simply use the footprints from the GSA-SER engine files, and put them together to make sure you get good results that will be useful for Step 2.

    With ScrapeBox, use the footprints in the keyword box. Keywords are not usually needed. Combine the footprints where it makes sense, and then make sure all the separate footprints are in quotation marks separately:

I've typed out the full footprints below:

"Powered by MyUPB" "PHP Outburst 2002 - 2013" "You are not logged in. Please Register or Login"
"Powered by MyUPB v2.2.5" "PHP Outburst 2002 - 2013" "You are not logged in. Please Register or Login"
"Powered by MyUPB v2.2.6" "PHP Outburst 2002 - 2013" "You are not logged in. Please Register or Login"
"Powered by MyUPB v2.2.7" "PHP Outburst 2002 - 2013" "You are not logged in. Please Register or Login"

    Again, you don't need any keywords. It may be helpful to use private proxies for small scrapes like this if you can. I use free proxies for large scrapes, but for this initial list we only need 25-1000 unique domains for our footprint analysis. This initial scrape only needs to run for 10-15 minutes, at most.

Tech Note: Beware The Vertical Bar
If your footprints have vertical bars like | in them, cut those out because they will screw up your scraping. When we use Footprint Factory, it will do this for us automatically, so we won't have this issue when we do the full scrape later. For example:

    Home | Contact | Privacy (Bad footprint! Don't use it!)

    Becomes: Home Contact Privacy

Just trust me on this: avoid vertical bars, even if those footprints seem tempting. They will just waste your time and resources!
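If you are cleaning a footprint list by hand or in a script, the fix is trivial. Here is a minimal sketch (my own helper, not part of any of the tools mentioned):

```python
def clean_footprint(footprint: str) -> str:
    """Remove vertical bars from a footprint and collapse leftover spaces."""
    # "Home | Contact | Privacy" becomes "Home Contact Privacy"
    return " ".join(footprint.replace("|", " ").split())
```

Run every footprint through this before handing the list to your scraper.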

Step 1 - Summary
So to summarize Step 1 in a nutshell, you will either:

1. Use an existing verified links-list (for example, a list you bought or made with GSA), but make sure it's a CMS-specific list.

    Or if you don't have a CMS starter links-list to work with:

    2. Choose a CMS from a list (like the available GSA-SER CMS engines) for research. Find out if there are enough targets. Once you've chosen a CMS you can either use an existing success/verified list, or create your own starter list for that CMS for Step 2.

    Here is the decision tree image again for the visual learners:

Step 2 - Import and Analyze a Starting or Seed Links-List With Footprint Factory

Now that we have our starting/seed URL list, we should do 2 things before importing it into Footprint Factory:

1. If the list is old, check the URLs are alive and still working. The ScrapeBox Alive-Check plug-in is great for this. Doing this will save you a lot of time later. Screenshot below for the SB noobs!

2. After saving your list, remove duplicate domains (after the alive check), because Footprint Factory works best with one URL from each domain. That makes sense, right? If you have 100 URLs in your list, and 50 of them belong to one site, you're not going to get accurate numbers for the Footprint Factory analysis that basically asks: what % of 100 sites have this footprint?
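If you'd rather script the dedupe than use ScrapeBox, here is a small sketch. It keeps the first URL seen per domain; treating www and the bare domain as the same site is my own assumption about what counts as a duplicate:

```python
from urllib.parse import urlparse

def one_url_per_domain(urls):
    """Keep only the first URL seen for each domain."""
    seen, kept = set(), []
    for url in urls:
        domain = urlparse(url).netloc.lower()
        if domain.startswith("www."):
            domain = domain[4:]  # treat www.example.com and example.com as one site
        if domain and domain not in seen:
            seen.add(domain)
            kept.append(url)
    return kept
```

Feed the deduped list into Footprint Factory so the frequency percentages are per-site, not per-page.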

Using Footprint Factory

I'll show you some key points, but I won't do a full tutorial of how to use Footprint Factory; it comes with a PDF that explains everything. The video series also shows this in detail, and there is an additional tutorial video series for Footprint Factory.

Overview: How Footprint Factory Will Rock The SEO World
Footprint Factory Pro has three modes, and you can use them at the same time. So I'm going to use the Text Snippets and also the Process URLs modes.

    The key thing is to know that Footprint Factory can analyze everything on the URLs you import. It breaks down everything on the page, and slices everything up, and then compares it to everything on all the other pages.

    This means no stone is left unturned in your automated quest for footprints! Each piece of text Footprint Factory extracts is called a snippet. If that snippet appears on many domains from your imported URL list, you can assume that the snippet is also a footprint. You can then export these footprints to find many more target URLs with the web scraper of your choice.

    With the Pro version you can also extract URL patterns for filtering URL lists, and also scraping with inurl: search strings if you like to do that sort of thing!

General Default Options
On the right-hand side you'll see 4 options.

The reason we have the Min./Max. file size options is that sometimes you get pages that are maybe 1 or 2KB, which probably means it's a 404 page that says something like "this is a 404 page, click here to go to homepage". Obviously that's no use to us. We want a page that actually has some HTML on it. That's why we set the Min. file size to 10KB. You can set it lower if you want, but you've been warned! I set Max. size to 200KB, and this is large because the file size doesn't include any media that's on the page. Images and videos are not included. 200KB is just the HTML and CSS and Javascript (and any text content) limit.

Mode Settings
Open the Text Snippets mode settings; Treat pipes as snippet separators should be checked. I also drop the Max. snippet length to 200 characters. Replace Numbers with * is very useful for stacking up version number and date/time footprints. I suggest always leaving that checked!

For the URLs settings, I want to compare Path and File-name. What that means is that these URLs are going to be compared by their file path and the file name, for example:

http://example1.com/wiki/index.php?title=User:Valarie14
http://www.example2.com/projects/wiki/members.html

Example 1:
File path: /wiki/
File-name: index.php

Example 2:
File path: /projects/wiki/
File-name: members.html

    So those are the things we're going to compare because we can use those as a sieve filter later when it comes to scraping even more targets. This process refines your results so you have more results that are only for your target CMS.
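The path/file-name split above can be reproduced in a few lines of Python if you want to sanity-check what gets compared. This is my own sketch of the idea, not Footprint Factory's actual code:

```python
import posixpath
from urllib.parse import urlparse

def url_footprint(url: str):
    """Split a URL into (file path, file name), ignoring domain and query string."""
    path = urlparse(url).path          # drops scheme, domain, and ?query=...
    directory, filename = posixpath.split(path)
    if directory and not directory.endswith("/"):
        directory += "/"               # match the /wiki/ style shown above
    return directory, filename
```

For the two example URLs, this returns ("/wiki/", "index.php") and ("/projects/wiki/", "members.html"), matching the worked examples.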

    Import your URL list (after removing duplicate domains) into Footprint Factory.

    Then click Get Footprints.

Filtering Footprints
Soon you'll have a very large list of extracted and compared text snippets:

56,532 is far too many snippets, and many of them will be useless anyway. We have to briefly look through this list to see what we should be removing. The quickest way to reduce the footprint count is to remove everything below a certain frequency. So, if you imported 100 URLs from unique domains, and there are snippets that only appear on 3 of the URLs (3%), then those are probably not good footprints.

    We can select all snippets under any frequency using the frequency threshold:

    Remember that Footprint Factory finds and analyzes every snippet on all your files, so you will need to filter the useless snippets.

    Using Freq. Threshold will greatly reduce the number of snippets because the majority of snippets will only appear once or twice, and they are not footprints at all!
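The frequency-threshold idea is easy to sketch in code. Assuming you have a count of how many domains each snippet appeared on, the filter is just a percentage check (the 30% default here is my own pick; tune it to your list):

```python
def filter_snippets(snippet_counts, total_domains, min_pct=30.0):
    """Keep snippets that appear on at least min_pct % of the imported domains."""
    keep = {}
    for snippet, count in snippet_counts.items():
        if 100.0 * count / total_domains >= min_pct:
            keep[snippet] = count
    return keep
```

Anything that only turns up on a few domains is noise, not a footprint, and gets dropped.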

I like to use 30-80 of the best snippets to then use as footprints. In time you will find your own sweet spot. Comb through the rest of the list by eye. If you're using the free version of Footprint Factory, then you should remove anything like:

search
contact
twitter
Google
password

If you're using the Pro version, then it does not matter so much because you can merge footprints together. The above snippets are useless on their own for scraping; they won't help us find a specific CMS. But if you put them together using Footprint Factory Pro, then you will be able to use more snippets as footprints, as you are forcing them to work together as one footprint:

    "search" "contact" "twitter" "Google" "password"

    I'll show you how to do this later in the guide.

    If you edit your text snippets list, then make sure you save them by exporting the snippets to a text file. It can be imported again later.

Using URL Footprints
You don't need to filter the URL footprints, unless you are using them for inurl: footprints, but I don't recommend doing that because it burns out scraping proxies very quickly.

    I think it's best to only use the URL footprints as a sieve filter. So, there is no harm in having a few bad filters in there, it won't make much difference to your results in the end.

    The only URL Footprint you should not save and export is the / (forward slash) on its own, because that would defeat the point of the exercise: Why would you filter for / when almost every URL contains it? That's like sorting people by the criteria red blood: yes/no. It doesn't make sense! So save your time, don't spend it going through the URL footprint list. Just remove the forward slash and export the URL footprints so you can use them later.
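A sieve filter over a scraped URL list can be sketched like this (my own helper; it drops the bare "/" footprint for the reason above and keeps only URLs that match at least one remaining footprint):

```python
def sieve_filter(urls, url_footprints):
    """Keep URLs whose address contains at least one URL footprint.

    A lone "/" is discarded first, since it matches almost everything.
    """
    footprints = [fp for fp in url_footprints if fp.strip() and fp != "/"]
    return [url for url in urls if any(fp in url for fp in footprints)]
```

Run your raw scrape through this with the exported URL footprints to throw away pages that clearly don't belong to your target CMS.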

Also note that URL footprints are exported to a .csv file. Depending on your URL Footprint settings, you may need to edit that file and re-save it as a text file when you use it with other programs (such as ScrapeBox).
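That conversion can be done by hand in a spreadsheet, or with a few lines of Python like this sketch. It assumes the footprints sit in the first column of the export; check your own file before trusting it:

```python
import csv

def csv_column_to_text(csv_path, txt_path, column=0):
    """Copy one column of a .csv export into a plain .txt file, one value per line."""
    with open(csv_path, newline="") as src, open(txt_path, "w") as dst:
        for row in csv.reader(src):
            if len(row) > column and row[column].strip():
                dst.write(row[column].strip() + "\n")
```

The resulting .txt file imports cleanly into ScrapeBox and similar tools.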

Step 3 - Decide On Scraping Strategy and Export Footprint Factory Results

    Once we have our footprint list refined, the third step is deciding on the footprint strategy. What does that mean?

    There are different ways you can combine footprints and also the keywords before exporting your final footprint list in the web-scraper format.

Snippet/Footprint Lists VS Web Scraping Footprint Lists
Try not to get confused: we made a list of footprints in Step 2, but the actual scraping footprint list (the end product of this process) is different. So far we have a footprint list that looks like this:

    footprint 1

    footprint 2

footprint 3

    But we want to scrape search engines, so we need something more like this:

    footprint 1

    footprint 2

footprint 3

or

footprint 1 keyword 1

    footprint 1 keyword 2

    footprint 2 keyword 1

    footprint 2 keyword 2

    footprint 3 keyword 1

footprint 3 keyword 2

or

footprint 1 footprint 2 keyword 1

    footprint 1 footprint 2 keyword 2

    footprint 1 footprint 2 keyword 3

    or any combination or expansion of the formats above.
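The formats above are easy to generate yourself. Here is a sketch using Python's itertools, doing roughly what a footprint generator does: pair up footprints, quote them, and optionally cross them with keywords. It's my own helper, not Footprint Factory's code:

```python
from itertools import combinations

def build_queries(footprints, keywords=None, combine=1):
    """Expand footprints into quoted search-engine queries, one per line.

    combine=2 pairs every footprint with every other footprint;
    keywords (if given) are appended to each query in turn.
    """
    if combine > 1:
        groups = list(combinations(footprints, combine))
    else:
        groups = [(fp,) for fp in footprints]
    queries = []
    for group in groups:
        quoted = " ".join('"%s"' % fp for fp in group)
        if keywords:
            queries.extend(quoted + " " + kw for kw in keywords)
        else:
            queries.append(quoted)
    return queries
```

Note how fast the list grows: with combinations and keywords the query count multiplies, which is exactly the exponential growth described in the next section.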

How To Build Your Footprint List
When we're using CMS footprints in search engine queries, it's best to have them in quotation marks. The next thing we want to do is combine these footprints (Footprint Factory Pro).

Many of your footprints will be common words. Those may all appear on the CMS you're targeting, but obviously if we searched for "Twitter" or "contact" on their own, it's not going to help us very much. So we have to combine these together to make sure that we're actually getting the right results when we scrape.

    At the top right of the user interface (shown below) you have an estimation of how many footprints you are currently generating with your settings (5,995 in the image). This number will grow exponentially when using footprint permutations (same thing as combinations for our purposes).

You can set the min and max values. So if your min is 2, then your generated list will only have footprint combinations (every footprint, with every other footprint). None of the footprints would be added to your scraping list on their own.

    Add keywords if you want to, but it's not always necessary if the CMS is not that common, or you have a large number of footprints. It is useful if you're targeting a specific language though!

    Click Generate Footprints to finish and make sure you export the list and save your work.

    Step 3 Summary

    We've decided on a strategy and exported our results. Your strategy is determined by the following:

    • how many combinations you're using

    • if you're combining the keywords

    • what web scraper you're using (ScrapeBox only allows 1 million URLs)

    • how large your starter/seed list is (this affects the size of the snippet list)

    • how many domains use the target CMS (roughly)

    Use your brain. If you're targeting a huge CMS like WordPress or vBulletin, then you can have large footprint lists because there are many, many targets you need to find. But for smaller CMS targets like some image comment galleries, where there's only a small number of sites, maybe 20,000 to 30,000 different domains, you can use a much smaller scraping footprint list. It would be overkill if there are only 10,000 sites for a CMS and you're trying to use 20,000 footprints!

    Export your generated footprint list. Then you can import that final footprint list into any scraper you choose, and search for your new backlink targets!

    Step 4 - Use Your Footprint List With Your Web Scraper

    I usually use HRefer (it's not for everyone; it costs $650), but I will also show you how to use ScrapeBox ($57).

    Using ScrapeBox

    The first thing you need to do is get some proxies (see Resources if you need help). Secondly, you can put either your keywords or your footprint list in the ScrapeBox keyword box, but not both!

    Remember your footprint strategy from earlier. If you have a huge (think 10,000+) footprint list, then you may not want to use keywords at all; just use the Import button and select your footprint list file.

    But if you have a smaller footprint list, or you have decided to use keywords, then don't import your footprint list. Instead, import your keywords:

    Try to add some keywords for different languages. Use two or three words in the most popular languages, such as English, Spanish, German, Russian, French and Mandarin (Chinese).

    Make sure you're happy with your connection settings (Settings > Adjust Maximum Connections), then start harvesting URLs!

    After ScrapeBox has finished scraping you can filter your results. First use Remove/Filter > Remove Duplicate URLs, and then apply that sieve filter I've mentioned a few times. If you have a URL footprint list, you can use it as a sieve filter with ScrapeBox.

    Important Note: The URL footprint list will be in .csv format because paths and filenames are stored separately. But if you've ever used spreadsheet software, it's very easy to edit and then save as a text (.txt) file for the next step.

    Click Remove/Filter > Remove URLs Not Containing Entries From... and select your URL footprint file (it should contain all the paths that you got from Footprint Factory, but with the single forward slash removed).
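The two filtering passes amount to de-duplication followed by a substring match against the CMS paths. A minimal sketch of that logic (all URLs and the path footprint below are invented; this mimics ScrapeBox's behaviour, it is not ScrapeBox code):

```python
def dedupe_and_sieve(urls, path_footprints):
    """Drop duplicate URLs, then keep only URLs containing at least one
    CMS path footprint -- a sketch of Remove Duplicate URLs followed by
    Remove URLs Not Containing Entries From..."""
    seen = set()
    kept = []
    for url in urls:
        if url in seen:
            continue  # duplicate, skip it
        seen.add(url)
        if any(fp in url for fp in path_footprints):
            kept.append(url)  # matches a CMS path footprint
    return kept

# Hypothetical raw scrape and a hypothetical path footprint
raw = [
    "http://a.example/guestbook/sign.php?id=1",
    "http://a.example/guestbook/sign.php?id=1",  # duplicate
    "http://b.example/blog/post-1",              # not the target CMS
    "http://c.example/guestbook/sign.php",
]
filtered = dedupe_and_sieve(raw, ["guestbook/sign.php"])
```

Note the sieve is a plain "contains" test, which is why the leading forward slash has to be stripped from the paths first: a footprint like "/guestbook" would be fine, but a bare "/" would match every URL.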

    You now have your raw URL list for testing with your link-building software!

    Using HRefer

    For HRefer users, your text footprints are your additive words, and your URL footprints are your sieve filter.

    Don't load additive words into the program itself; use the config files. Go into your HRefer folder, then the templates folder.

    You can see I've made two new files. One is named after the platform and contains the sieve/URL footprints; the other, CMS_addwords, contains the additive words. Make sure the second file has the same name as the first, but with an underscore (_) and then addwords.

    So those are the two files you need for HRefer. I'll assume you know how to use proxies with this program. You'll need to reload the program after changing any config files.
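The naming convention for the template pair can be sketched like this — the CMS name and file contents are invented, and only the name-plus-_addwords pairing comes from the description above:

```python
import os
import tempfile

def write_hrefer_templates(templates_dir, cms_name, url_footprints, addwords):
    """Create the pair of template files: one named after the platform
    (the sieve/URL footprints) and one with the _addwords suffix (the
    additive words). The naming follows the convention described in the
    text; everything else here is an illustrative assumption."""
    sieve_path = os.path.join(templates_dir, cms_name)
    addwords_path = os.path.join(templates_dir, cms_name + "_addwords")
    with open(sieve_path, "w") as f:
        f.write("\n".join(url_footprints))
    with open(addwords_path, "w") as f:
        f.write("\n".join(addwords))
    return sieve_path, addwords_path

# Demo in a temporary folder with a made-up CMS name
demo_dir = tempfile.mkdtemp()
paths = write_hrefer_templates(demo_dir, "FooCMS",
                               ["guestbook/sign.php"], ["recipes", "travel"])
```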

    When HRefer is reloaded, select the engine config file from the drop-down menu on the Search Engines options & Filter tab.

    Step 5 - Run Your Scraped List To Test For Successes

    This seems like a pretty obvious last step, but it is important how you do this, because that's how you know how successful your scrape was. If you get the last bit wrong, you may think that you had poor results when in actual fact you did everything correctly until the last stage.

    This tutorial has been based around GSA-SER because I think it is the most versatile link building tool there is. You have to be a bit more clever about the strategy these days, but it definitely still works. And as I showed you before, what's so great about this is that there are so many different CMS already programmed in. You can just import a list and then run it.

    Importing Your URLs Into GSA-SER (2 options)

    After doing steps 1-4 you should have a huge URL list. Right-click the GSA-SER project you will be using, then choose Import Target URLs > From File, and select the file containing all the URLs you've scraped. A prompt will appear asking if you want to randomize the list. I highly suggest you do.

    For reference, there is another way of importing URLs: go to the Advanced Options, then Tools > Import and Identify into Platforms. But don't do this for your new raw list. I've tested both importing methods side-by-side and the first one is much better. Importing target URLs at the project level and then running the project until completion is the fastest way to get your links working for you and to verify them.

    Important Project Settings

    There are a couple of options you should be aware of. When you do a large scrape, make sure you have the "continuously try to post to a site even if it's failed before" option checked.

    Now that might sound like a bad thing, because it sounds like you are trying to hammer a website over and over again. But what this is actually doing is overriding the "already parsed" message you would see in the reporting window.

    It can be quite annoying when you have a huge list and GSA-SER is stopping you from trying to post to the same domain more than once. We're usually looking for verified URLs to post to, not just domains, so it's important we use this option and test all URLs in our raw lists.

    There is another option that allows posting to the same sites again (allow multiple accounts). Ramp this up as well.

    When running a raw list, make sure you have plenty of email addresses loaded, and also make sure that you check all engines. Some of your raw list may be recognized as a different CMS, so why waste those links? Try for all of them!

    Other Tips

    It should go without saying that you should never send a raw list you're testing to a money site directly. Not a good idea. You may end up with tens of thousands of direct links. Also, if you're using just one project in GSA-SER to test a raw list, then the chances are that the amount of content you have loaded into the project is not enough to make 100,000 unique submissions. By default, most content tools like Content Foundry and Kontent Machine only allow you to make a couple of thousand unique submissions (though Content Foundry can be tweaked to do much more).

    Even when using your raw list as a tier-two project, you want to make sure that the tier 1 it's linking to is fairly large, because heavily loaded link pyramids are not as reliable as they used to be.

    Know The Numbers To Gauge Your ROI

    I like to go to the GSA-SER Tools menu (on the Advanced Options tab) and record my stats both before and after running a large raw list. It also helps to remove your duplicate URLs from GSA-SER every once in a while, so you can get a true indication of how many unique verified URLs you have.
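The before-and-after bookkeeping is simple arithmetic. A tiny sketch, with all numbers invented for illustration:

```python
def verified_rate(verified_before, verified_after, raw_list_size):
    """New unique verified URLs gained, as a percentage of the raw
    list tested. Illustrative numbers only."""
    gained = verified_after - verified_before
    return 100.0 * gained / raw_list_size

# e.g. stats recorded before and after running a 100,000-URL raw list
rate = verified_rate(verified_before=12000, verified_after=20000,
                     raw_list_size=100000)  # 8.0 percent
```

Tracking this per CMS tells you which footprint lists are worth re-running and which target platforms are a waste of scraping time.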

    When you are happy with your other settings and you have your GSA-SER project content all loaded, run the project!

    To Infinity And Beyond!

    Now you know the process, you can go back into your GSA-SER verified links folder and just start picking off these different CMS one by one. You can work your way through all of them if you want!

    Run them through Footprint Factory, build your footprint (text and URL) lists, export those into your scraper, and repeat to build enormous links-lists.

    Footprint Factory will save you hours of time and allow you to easily combine footprints. It will also reveal footprints that you would never find manually if you were checking individual sites and cutting text snippets out by hand.

    This guide is quite dense, but the process is mostly automated. This means that once you understand what you are doing, the only part that takes more than 5 minutes is researching a CMS to target. GSA-SER and Footprint Factory really do take all the guess-work out of it for you. And making these comprehensive footprint lists gets you much better results: verified rates of 8% or more from a raw list.

    Don't wait! Feed these programs some data and set them running so you can start ranking and banking.

    Cheers!

    Michael Carlin

    Resources

    Tools:

    Footprint Factory Free

    Footprint Factory Pro Upgrade

    GSA-SER

    ScrapeBox Discount Link

    More Tutorials:

    Find Almost Unlimited New Link Targets (blog post & video)
    This guide on my blog. Leave a comment and ask a question!

    How To Find New Link Targets (YouTube)
    An over-the-shoulder look at how I go through this process.

    Unlimited Free Proxies With ScrapeBox (blog post & video)
    Get thousands of anonymous proxies for free.

    Using GSA-SER (blog post & video)
    By the time this PDF is released there will be an updated version of my GSA-SER review. See what settings you should be using for hard spam vs. more sensitive targets, like client sites.

    Speak To Me

    Ask me questions and see more SEO tutorials on the FightBack Networks support forum!

    It's free to join, and of course you get access to the world's first (and only) blog network platform too! Details of the free membership are on the main page:

    http://fightbacknetworks.com
