Multithreading & Proxies

Recommended Proxies: Proxy-Cheap

Overview #

Multithreading is a powerful way to scrape a lot of results quickly.

However, there are some important caveats to keep in mind before you begin multithreaded bulk scrapes.

Without proper planning and management, you can burn through your proxy budget quickly. The information below can help make bulk scraping as cheap and efficient as possible.

Kingmaker Software is not liable for the costs associated with proxies during bulk scrapes. By using the new version of Q&A Grabber that supports proxies & multithreading, you acknowledge that you have read the following information and agree that any expenses incurred from APIs & proxies are your responsibility.

Recommended Minimum Specs #

We recommend at least 200 MB of free RAM for each thread you intend to run.

Q&A Grabber has an intelligent multithreading system that scales up and down based on the resources available. This means that even if you set the software to run 30 threads, it will not start more threads than your available RAM and processing power can support. Instead, it checks periodically to see if more resources have freed up. If so, it scales up (to the maximum number of threads specified in the Settings). If fewer resources are available, it scales down.
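As a rough planning aid, the 200 MB-per-thread guideline can be turned into a quick calculation. The sketch below is only an illustration of that arithmetic, not Q&A Grabber's internal scaling logic; the function name and the assumption that at least one thread always runs are hypothetical.

```python
MB_PER_THREAD = 200  # guideline from this page


def max_threads_for_ram(free_ram_mb: int, requested_threads: int) -> int:
    """Return how many threads the 200 MB-per-thread guideline supports.

    Assumption (not from the software): at least one thread always runs.
    """
    affordable = free_ram_mb // MB_PER_THREAD
    return max(1, min(requested_threads, affordable))


print(max_threads_for_ram(4000, 30))  # 4000 // 200 = 20 threads
print(max_threads_for_ram(8000, 30))  # RAM allows 40, so the setting of 30 applies
```

If the result is lower than the Threads setting you had in mind, expect the software to run below that setting most of the time.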

Setting Proxies & Threads #

Setting Proxies #

In order to add proxies, go to Settings => General.

Then click the “Manage Proxies” button in the right-side column.

Add each proxy, one per line, with the format(s) specified.

Choose the Proxy Type (usually HTTP, unless otherwise noted as SOCKS4/5 or HTTPS by the proxy provider).

Save & Close the Proxies Window.

Setting Threads #

In order to set the thread count, go to Settings => General.

Then choose the number of threads in the Threads dropdown.

Keep in Mind:

  • If you’re using a machine with a small amount of RAM, consider limiting the threads as outlined in the Recommended Minimum Specs section above.
  • If you’re using Datacenter or Static IP Proxies, never run more threads than you have proxies (i.e. if you’re using 10 proxies, keep the Threads setting at a maximum of 10). Ideally, keep the proxy-to-thread ratio between 2:1 and 3:1 (i.e. 2-3 proxies per thread, so 10 proxies would run 3-5 threads). Also, set the delay to at least 5 seconds.
  • If you’re using a rotating proxy, you can set the maximum threads to as many as your machine can handle, and can reduce the delay to 1 second.
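The proxy-to-thread guidance for Datacenter and Static IP Proxies comes down to simple division. Here is a minimal sketch of that calculation; the function name and its default of 2 proxies per thread are assumptions made for illustration, not settings that exist in Q&A Grabber.

```python
def recommended_threads(proxy_count: int, proxies_per_thread: int = 2) -> int:
    """Suggest a Threads setting for static or datacenter proxies.

    proxies_per_thread=1 gives the hard ceiling (1:1);
    2 or 3 gives the ideal range described above.
    """
    if proxy_count < 1 or proxies_per_thread < 1:
        raise ValueError("proxy_count and proxies_per_thread must be at least 1")
    return max(1, proxy_count // proxies_per_thread)


print(recommended_threads(10, 1))  # 10 (absolute maximum)
print(recommended_threads(10, 2))  # 5
print(recommended_threads(10, 3))  # 3
```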

Always Test Your Setup Before Scaling #

Below, we will outline some key information to keep in mind when working with multithreading and proxies. No matter what options you choose, it’s crucial you perform small tests before scraping results for hundreds or thousands of keywords to ensure that the output is what you’re expecting.

This is especially critical if you’re using proxies that charge by the bandwidth used, or if you’re using paid/limited APIs such as Spin Rewriter/WordAi/DeepL. In these scenarios, it’s recommended to try scraping with 10-20 keywords and ensure you’re seeing the expected output before scaling.
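If your keywords live in a list or text file, one simple way to prepare a test run is to split off a small batch first. The helper below is a hypothetical sketch of that idea (Q&A Grabber does not ship this function); the default of 10 keywords matches the low end of the range suggested above.

```python
def split_test_batch(keywords: list[str], test_size: int = 10) -> tuple[list[str], list[str]]:
    """Split keywords into a small test batch and the remainder."""
    return keywords[:test_size], keywords[test_size:]


# Placeholder keywords for illustration only.
all_keywords = [f"keyword {i}" for i in range(1, 26)]
test_batch, remaining = split_test_batch(all_keywords)

print(len(test_batch))  # 10
print(len(remaining))   # 15
```

Scrape the test batch, check the output, and only then load the remaining keywords.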

Kingmaker Software is not liable for the costs associated with paid API charges during bulk scrapes, whether the data returned was what you expected, or otherwise. By using Q&A Grabber, you acknowledge that you have read and agreed to this information.

Proxy Performance & Pricing #

There are three main types of proxies you can use to scrape with: Datacenter Proxies, Static Residential Proxies, and Rotating Residential Proxies. Each choice has its pros and cons, which are outlined below.

Note: No matter what proxy option you choose, always opt for private proxies. Public and shared proxies are often burned and banned quickly by Google, making them very difficult to scrape with.

Datacenter Proxies #

Proxies run through datacenters.

Best for budget, worst for performance.

Pros #

  • The cheapest of all proxy types (often $1-3 per proxy)
  • Predictable pricing

Cons #

  • The least reliable of the proxy options
  • Most likely to get blocked by Google, leading to scraping issues and slower results

Static Residential Proxies #

Proxies with a set IP address, running through residential internet connections.

Best balance between budget and performance.

Pros #

  • More reliable than datacenter proxies
  • Reasonably priced (often $3-5 per proxy)
  • Predictable pricing

Cons #

  • Can have issues with scraping depending on supplier/source
  • May be unethically acquired (part of a botnet, etc.)

Rotating Residential Proxies #

Proxies with a rotating pool of IP addresses from residential internet connections.

Best for performance, highest cost.

Pros #

  • Most reliable for scraping thanks to a large pool of available IPs

Cons #

  • Most expensive pricing (often charged per GB of bandwidth)
  • Less predictable pricing
  • May be unethically acquired (part of a botnet, etc.)
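To compare the pricing models of the three proxy types, it can help to run the numbers before buying. The sketch below uses the per-proxy price ranges quoted on this page; the request count, the 0.5 MB per request, and the $5 per GB rate are placeholder assumptions, so substitute your provider's real figures.

```python
def per_proxy_cost(proxy_count: int, price_per_proxy: float) -> float:
    """Fixed cost for datacenter or static residential proxies."""
    return proxy_count * price_per_proxy


def bandwidth_cost(requests: int, mb_per_request: float, price_per_gb: float) -> float:
    """Usage-based cost for rotating residential proxies (1 GB = 1024 MB)."""
    return requests * mb_per_request / 1024 * price_per_gb


# 10 datacenter proxies at $1-3 each
print(per_proxy_cost(10, 1), per_proxy_cost(10, 3))  # 10 30
# 10 static residential proxies at $3-5 each
print(per_proxy_cost(10, 3), per_proxy_cost(10, 5))  # 30 50
# Rotating: 10,000 requests, ASSUMED 0.5 MB each at an ASSUMED $5 per GB
print(round(bandwidth_cost(10_000, 0.5, 5.0), 2))  # 24.41
```

The per-proxy figures stay fixed no matter how much you scrape, while the bandwidth figure grows with every request, which is why rotating proxies are harder to budget for.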

Recommended Proxies #

Whether you choose datacenter proxies, static residential proxies, or rotating residential proxies, it’s important to choose a good supplier that offers reasonable prices.

If you don’t already have a reliable proxy source, you can try Proxy-Cheap. We’ve tested multiple proxy providers and have found Proxy-Cheap to be the best choice, especially for the price.

Multithreading With Paid APIs #

If you’re using paid APIs such as Spin Rewriter, WordAi, or DeepL, it’s important to keep their respective limits in mind.

Spin Rewriter Limitations #

With Spin Rewriter, you can only send 500 requests per day. After this limit is reached, the API will throw an error, and Q&A Grabber will stop running (regardless of how many keywords are left to scrape).

Additionally, Spin Rewriter only allows one spin request every 7 seconds. This delay is built into Q&A Grabber, but it will slow down your scraping results so that only one set of content is spun and saved every 7 seconds or so.
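These two limits make it possible to estimate how long a spun scrape will take. The sketch below assumes one spin request per saved set of content, which may not match your configuration; the function name is hypothetical.

```python
import math

DAILY_REQUEST_LIMIT = 500  # Spin Rewriter requests per day
SECONDS_PER_REQUEST = 7    # minimum spacing between spin requests


def spin_budget(spin_requests: int) -> tuple[int, int]:
    """Return (days needed, minimum seconds spent spinning)."""
    days = math.ceil(spin_requests / DAILY_REQUEST_LIMIT)
    min_seconds = spin_requests * SECONDS_PER_REQUEST
    return days, min_seconds


print(spin_budget(500))   # (1, 3500): a full daily quota takes about 58 minutes
print(spin_budget(1200))  # (3, 8400): spread over 3 days, 140 minutes of spinning
```

The seconds figure covers spinning time only. It does not include scraping time or the wait for the daily limit to reset.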

DeepL Limitations #

With DeepL, you have a free quota of 500,000 characters per month. After this, you’ll be charged, so estimate your usage before you scale up to account for any extra translation costs.
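A quick way to check a planned scrape against the 500,000-character free quota is to multiply the keyword count by the average number of translated characters per keyword. The 800-character average below is an assumed placeholder; measure your own average from a small test run.

```python
FREE_CHARS_PER_MONTH = 500_000  # DeepL free quota from this page


def deepl_usage(keywords: int, avg_chars_per_keyword: int) -> tuple[int, int]:
    """Return (total characters, characters beyond the free quota)."""
    total = keywords * avg_chars_per_keyword
    billable = max(0, total - FREE_CHARS_PER_MONTH)
    return total, billable


# 1,000 keywords at an ASSUMED 800 translated characters each
print(deepl_usage(1000, 800))  # (800000, 300000)
# 500 keywords at the same average stay within the free quota
print(deepl_usage(500, 800))   # (400000, 0)
```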