Removing Spam Pages From WordPress Sites


What is a Spam Page?


Spam pages are files added to your publicly available website with the intent of manipulating search engine result pages. The more inbound links a site receives, the higher it tends to place in the search results, and inbound links from sites with a high reputation ranking are even more valuable. Sites with older domain names, sites on the .edu top-level domain, and highly popular sites that are updated often are the most desirable targets. Attackers plant spam pages on these sites to build complex link networks for highly competitive search phrases.




Determining if your site is infected


Site owners often do not know their site is infected with spam pages until search results for their site start showing odd results that are unrelated to the site’s content. An examination of the site and the search engine result pages will usually show whether spam pages have been placed within the site files or the database. Spam pages can exist as standalone files or as posts or pages within your database.




Finding and Removing Spam Pages


Removing spam pages requires analysis of all publicly available files on the web server. Because spam pages can also be placed in your database, you will need to review both the files and the database to find and remove them. Before changing anything, create a backup of the site files and the database.
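
The file half of that backup can be sketched in Python as follows. This is a minimal illustration and not a complete backup procedure: the function name backup_site_files and the archive naming are assumptions made for this example, the backup directory should sit outside the site root, and the database still needs to be exported separately with your usual database tool.

```python
import os
import shutil
import time

def backup_site_files(site_root, backup_dir):
    """Write a timestamped .tar.gz of site_root into backup_dir and return its path."""
    # Hypothetical naming scheme; any unique name works.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base_name = os.path.join(backup_dir, "site-files-" + stamp)
    # make_archive appends the ".tar.gz" extension for the "gztar" format.
    return shutil.make_archive(base_name, "gztar", root_dir=site_root)
```

Keeping the archive somewhere the web server does not serve avoids exposing a copy of the site, including any spam files, to the public.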


Spam pages may be related to any number of highly competitive search niches including:



  • Pharmaceutical sales

  • Essay writing sites

  • Ringtones and music downloads

  • Movie downloads

  • Online casino or gambling

  • Fraudulent/replica designer sales

  • Weight loss supplements or products

  • Adult content
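
As a rough aid when reviewing files or post content, the niches above can be turned into a simple keyword check. The following Python sketch is an illustration only: the keyword list and the function name spam_keyword_hits are assumptions chosen for this example, and real spam often uses misspellings or obfuscation that a list like this will miss.

```python
import re

# Hypothetical keyword list, loosely based on the niches listed above.
# Adjust it to the spam actually showing up in your search results.
SPAM_KEYWORDS = [
    "viagra", "cialis", "pharmacy",
    "essay writing", "ringtones",
    "movie download", "casino", "poker",
    "replica", "weight loss",
]

def spam_keyword_hits(text):
    """Return the sorted list of keywords found in text (case-insensitive)."""
    lowered = text.lower()
    hits = set()
    for keyword in SPAM_KEYWORDS:
        # \b keeps "replica" from matching inside a longer word such as "replicated".
        if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
            hits.add(keyword)
    return sorted(hits)
```

A hit is a reason to look at the file or post by hand, not proof of infection; a legitimate pharmacy or casino site would match its own content.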




Standalone Spam Pages


Look for unfamiliar directories either outside of your content management system or hidden within subfolders of an administrative directory. Determine which files are not relevant to your site and remove them. Directory or folder names can appear functional, like “headers” or “stats”, or they can be nonsensical, like “a3051”. The files may be HTML files, or they may be obfuscated (intentionally obscured to make the code hard to read) and appear to be gibberish. There are usually thousands of files.
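
The search described above can be partly automated. The sketch below, in Python, lists top-level directories that are not part of a default WordPress install and reports directories that hold an unusually large number of HTML files. The function names and the threshold of 500 files are assumptions for illustration; many legitimate sites have extra directories (for example ones created by the hosting provider), so treat the output as a list of places to inspect, not a list of things to delete.

```python
import os

# Top-level directories present in a default WordPress install.
WORDPRESS_TOP_LEVEL = {"wp-admin", "wp-content", "wp-includes"}

def unfamiliar_top_level_dirs(site_root):
    """List top-level directories that a default WordPress install does not create."""
    found = []
    for entry in sorted(os.listdir(site_root)):
        path = os.path.join(site_root, entry)
        if os.path.isdir(path) and entry not in WORDPRESS_TOP_LEVEL:
            found.append(entry)
    return found

def crowded_html_dirs(site_root, threshold=500):
    """Return (directory, count) pairs for directories with many .html/.htm files."""
    results = []
    for dirpath, _dirnames, filenames in os.walk(site_root):
        count = sum(
            1 for name in filenames if name.lower().endswith((".html", ".htm"))
        )
        if count >= threshold:
            results.append((dirpath, count))
    # Largest directories first.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Running both functions against a copy of the site, or against the backup taken earlier, keeps the live site untouched while you review the results.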




Htaccess file review


Most sites have an .htaccess file that gives the server directives on which pages to serve to the site visitor. If you have spam pages on your site, there may be code inserted in the .htaccess file that directs site visitors to certain pages based on the query string. The query string is the part of a URL that follows the question mark, for example the id=123 in index.php?id=123.


Directives placed on the query string can tell the search engines to look for specific spam pages instead of directing the user to site content. Removing query string directive spam can be challenging: the malicious directives must be removed from the .htaccess file, and additional directives must then be added to tell the search engines that those pages are now gone. This can be complex, and if you are uncertain about working with regular expressions and .htaccess directives, we recommend getting help with this.




Here are some examples of spam within htaccess files.


RewriteRule . - [E=REWRITEBASE:/]

RewriteRule ^b(\d+)[-/].*[-/]p(\d+)-.*$ index\.php?id=$1-$2&%{QUERY_STRING} [L]

RewriteRule ^b(\d+)[-/]p(\d+)[-/].*$ index\.php?id=$1-$2&%{QUERY_STRING} [L]

RewriteRule ^p(\d+)[-/].*[-/]b(\d+)[-/].*$ index\.php?id=$2-$1&%{QUERY_STRING} [L]

RewriteRule ^p(\d+)[-/]b(\d+)[-/].*$ index\.php?id=$2-$1&%{QUERY_STRING} [L]
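
To make the effect of these rules concrete, the Python sketch below applies the pattern from the second rule to a request path and shows roughly which index.php target it would produce, ignoring the original query string that the rule appends. It also includes a small helper that flags RewriteRule lines of this shape in an .htaccess file. Both function names are hypothetical, Python's regular expressions are only an approximation of Apache's for patterns like these, and a legitimate plugin could in principle add a similar rule, so flagged lines still need a manual look.

```python
import re

# Pattern taken from the second example rule above.
PATTERN = re.compile(r"^b(\d+)[-/].*[-/]p(\d+)-.*$")

# Flags rules that rewrite captured numbers into index.php?id=...
SUSPICIOUS_RULE = re.compile(
    r"^\s*RewriteRule\s+\S+\s+index\\?\.php\?id=\$\d", re.IGNORECASE
)

def rewritten_target(request_path):
    """Return the approximate rewrite target for request_path, or None if no match."""
    match = PATTERN.match(request_path)
    if match is None:
        return None
    return "index.php?id={}-{}".format(match.group(1), match.group(2))

def suspicious_rewrite_lines(htaccess_text):
    """Return the RewriteRule lines that look like the spam examples above."""
    return [
        line.strip()
        for line in htaccess_text.splitlines()
        if SUSPICIOUS_RULE.search(line)
    ]
```

For instance, a made-up spam URL path such as b12-cheap-pills/p34-buy matches the pattern and is mapped to index.php with the two captured numbers as the id, while an ordinary path such as about-us is left alone. The standard WordPress rule that sends requests to /index.php is not flagged, because it carries no id parameter.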



Spam Pages within the Database


Spam pages within the database are usually fairly easy to remove. They are added as either pages or posts in the content management system database, and they often stand out as pages or posts that you did not add, so you can delete them. They are most often added through a compromised administrative account, and your content management system will often tell you which account was used to add the spam pages, and thus which account was compromised.
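
When there are too many posts to review by eye, the same idea can be sketched in Python: export the post rows, then count which author accounts are attached to suspicious titles. The query string below assumes the default WordPress table prefix wp_, which may differ on your site, and the function name and the word list passed to it are assumptions for illustration. The sketch only processes rows you have already fetched with your own database client; it does not connect to the database or delete anything.

```python
from collections import Counter

# Query shape for a default install; the "wp_" prefix is the WordPress
# default and may have been changed on your site.
POSTS_QUERY = (
    "SELECT ID, post_author, post_title, post_date "
    "FROM wp_posts "
    "WHERE post_type IN ('post', 'page')"
)

def authors_of_suspect_posts(rows, suspect_words):
    """Count suspect posts per post_author ID, based on words in post_title.

    rows is an iterable of dicts keyed by column name, as returned by
    running POSTS_QUERY with a dictionary-style cursor.
    """
    counts = Counter()
    for row in rows:
        title = row["post_title"].lower()
        if any(word.lower() in title for word in suspect_words):
            counts[row["post_author"]] += 1
    return counts
```

An author ID with a large count is a candidate for the compromised account; its password should be reset and its other activity reviewed.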




Working with the search engines


Removing the pages from your site is not enough. The search engine result pages often lag behind your site’s content by days, or sometimes weeks, depending on how often the search engine bots come by to visit your site.


Quickly clearing spam pages from the search engines, notably Google, requires Google Search Console. After you have cleared all of the spam pages from your site, add your site to Google Search Console. Add a sitemap to your site, submit that sitemap via Search Console, and perform a fetch on the spam page URLs to ensure that they respond with a 404 (not found). You can then submit your site for indexing. After that, it is a waiting game. The sitemap and updated content are most helpful in ensuring that your search results return to normal.
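
Before relying on the search console, you can check the response codes yourself. The Python sketch below requests each former spam URL and reports any that do not answer with 404 (not found) or 410 (gone). The function names are hypothetical, the URLs are supplied by you (for example copied from the search results), and some servers reject HEAD requests, in which case the reported status will not be meaningful. Network failures such as DNS errors are not handled here and will raise an exception.

```python
import urllib.error
import urllib.request

def fetch_status(url):
    """Return the HTTP status code for url, treating HTTP errors as results."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses arrive as exceptions; the code is what we want.
        return error.code

def urls_still_live(urls, fetch=fetch_status):
    """Return (url, status) pairs for URLs that do not answer 404 or 410."""
    live = []
    for url in urls:
        status = fetch(url)
        if status not in (404, 410):
            live.append((url, status))
    return live
```

Redirects are followed automatically, so a spam URL that now redirects to the home page is reported with status 200. That is worth fixing too, since the search engine may keep such a URL in its index.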




Looking Beyond the Spam Page


Spam pages are placed on the site through exploitation of some vulnerability on the site, either through backdoors, unpatched site code, or compromised administrative, FTP, or other accounts.


If you find spam pages on your site, it is important to determine how those pages were placed. There may be other types of malware or security vulnerabilities on your site that allowed an attacker to gain access. A review of the entire site is important.




If, after reading this guide, you are unsure how to remove spam pages, want more answers as to how the spam pages were placed on your site, or need assistance ensuring that all spam results are removed from the search engine result pages, get help by getting on chat with us or emailing us at support.consignweb.com.




