Proxy / VPN / Bad IP Detection

Advanced IP Intelligence Service Using Machine Learning & Modern Computing Techniques

IP Intelligence is a service that estimates how likely an IP address is to be a proxy / VPN / bad IP, using machine learning and probability theory applied to large, frequently updated datasets.

Why Use IP Intelligence?

🛡️

Reduce Fraud

Greatly reduce fraud on e-commerce sites with anti-fraud protection and payment gateway security.

🔒

Protect Against Attacks

Protect your site from XSS, SQLi, brute force attacks, application scanning, and other automated hacking attempts.

🚫

Prevent Content Theft

Protect your site from crawlers that steal your content and stop bots from scraping your data.

👥

Stop Abuse

Prevent users from abusing promotional offers, multiple sign-ups, affiliate abuse, and spam.

🎯

Eliminate Fake Activity

Reduce fake views, clicks, and activity that results in click fraud and view fraud with anti-bot detection.

🔐

Prevent Ban Evasion

Stop trolls and users trying to bypass bans. Protect against account hijacking with enhanced security measures.

The system is serving millions of API requests a week and growing as more people find it useful in protecting their online infrastructure. Our service is used by gaming communities, e-commerce websites, research universities & institutions, law enforcement, and large financial institutions.

Not all proxy / VPN detection services are the same. The techniques involved can be vastly different and produce noticeable differences. Feel free to compare the results from this service to any other, including paid options from various vendors.


How It Works

Given an IP address, the system returns a probability (between 0 and 1) that the IP is a VPN / proxy / hosting / bad IP.

A value of 1 means that the IP is explicitly banned (a web host, VPN, or Tor node) by our dynamic lists. Otherwise, the output is a real number between 0 and 1 that estimates how likely the IP is bad / VPN / proxy. This estimate is inferred through machine learning & probability theory techniques using dynamic checks against large datasets.

Billions of new records are parsed each month to ensure the datasets have the latest information, and old records automatically expire. The system is designed to be efficient, fast, simple, and accurate.

Important Assumptions

The following assumptions must be met for the sake of accuracy and correctness.

Usage & Implementation

Web Interface

The web interface is the quickest way to check any IP address. By default, it uses flags=f.

API

Expected Input

The proxy check system takes its input via an HTTP GET request. The URL is http://check.getipintel.net/check.php and the parameter is ip. The system fully supports IPv4, with partial support for IPv6.

http://check.getipintel.net/check.php?ip=IPHere&contact=YourEmailAddressHere
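
As a rough illustration of the request format above, the following Python sketch assembles the query URL by plain string formatting, since the parameters are not meant to be URL-encoded. The function name build_query_url and the local validation steps are assumptions made for this example; they are not part of the service.

```python
import ipaddress

BASE_URL = "http://check.getipintel.net/check.php"


def build_query_url(ip: str, contact: str) -> str:
    """Return the GET URL for one lookup (hypothetical helper)."""
    # Validate locally so malformed input never reaches the service.
    # ip_address() raises ValueError for anything that is not IPv4/IPv6.
    ipaddress.ip_address(ip)
    # Very loose sanity check only; the service decides what is acceptable.
    if "@" not in contact:
        raise ValueError("contact should be an email address")
    # Plain formatting on purpose: no URL encoding of the parameters.
    return f"{BASE_URL}?ip={ip}&contact={contact}"
```

For example, build_query_url("8.8.8.8", "me@example.com") yields a URL of the same shape as the one shown above.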

Include Your Contact Information

Include your contact information so I can notify you if a problem arises or if there are core changes. In some situations, people query the system incorrectly and assume everything is working when it is not, usually because error codes are not handled or are handled improperly. If I only have the connecting IP address, I cannot help the person correct the error.

To include your contact information, add another parameter to your request called contact and provide your email.

Important Notes:
  • Do not use URL encoding on the input parameters.
  • All queries that do not contain accurate contact information will be rejected with an error or dropped by the firewall.
  • Start with the flags=m option if only proxy / VPN detection is needed.
  • If flags=m does not have a noticeable impact, then use flags=b.
  • The default query (no flags) is mostly used in front of a payment gateway to protect against fraud because bad IP detection is included.

If you are contacted, please respond within 2 days, or your contact information could be considered inaccurate. Your information will only be used for the purpose of communication with GetIPIntel.

Optional Settings for Input

  • flags=m - Used when you're only looking for the value of "1" as the result. Skips dynamic checks and only uses dynamic ban lists.
  • flags=b - Used when you want to use dynamic ban and dynamic checks with partial bad IP check.
  • flags=f - Used when you want to force the system to do a full lookup, which can take up to 5 seconds.
  • flags=n - Used to exclude the real time block list. Append the character "n" if you're already using flags=m, b, or f (e.g., flags=nm).
  • oflags=b - Used when you want to see if the IP is considered a bad IP.
  • oflags=c - Used when you want to see which country the IP came from / belongs to (GeoIP Location). Currently in alpha testing.
  • oflags=i - Used when you want to exclude iCloud Relay Egress IPs, Google Cloud One VPN, or similar services.
  • oflags=a - Used when you want to see the ASN of the IP.
  • oflags=r - Used when you want to use residential proxy detection.
  • format=json - Returns the result in JSON format with extra information.
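
The options above can be combined into a query-string suffix. The sketch below is one possible way to do that in Python; the helper name build_options and its parameters are hypothetical. It accepts a single oflags value only, because the list above does not say whether several oflags values can be combined.

```python
from typing import Optional

BASE_FLAGS = {"m", "b", "f"}
OUTPUT_FLAGS = {"b", "c", "i", "a", "r"}


def build_options(flags: Optional[str] = None,
                  skip_realtime_list: bool = False,
                  oflags: Optional[str] = None,
                  json_output: bool = False) -> str:
    """Return a suffix such as '&flags=nm&format=json' (hypothetical helper)."""
    if flags is not None and flags not in BASE_FLAGS:
        raise ValueError(f"unknown flags value: {flags!r}")
    if oflags is not None and oflags not in OUTPUT_FLAGS:
        raise ValueError(f"unknown oflags value: {oflags!r}")
    # "n" excludes the real time block list and is combined with m, b, or f
    # as in the documented example flags=nm.
    flag_value = ("n" if skip_realtime_list else "") + (flags or "")
    parts = []
    if flag_value:
        parts.append(f"flags={flag_value}")
    if oflags:
        parts.append(f"oflags={oflags}")
    if json_output:
        parts.append("format=json")
    return "".join("&" + part for part in parts)
```

The returned suffix is meant to be appended to a URL that already carries the ip and contact parameters.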

Expected Output

On a valid request, the system returns a value between 0 and 1 (inclusive) indicating how likely the given IP is to be a proxy. On error, a negative value is returned. If format=json is used, the result is returned as valid JSON with extra information.
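
A minimal Python sketch of how a client might separate scores from error values in the standard (non-JSON) output is shown below. The names CheckResult and parse_plain_response are invented for this example, and the JSON format is not covered because its fields are not described here.

```python
from typing import NamedTuple, Optional


class CheckResult(NamedTuple):
    score: Optional[float]      # 0..1 on success, otherwise None
    error_code: Optional[int]   # negative code on error, otherwise None


def parse_plain_response(body: str) -> CheckResult:
    """Split a plain-text response body into a score or an error code."""
    value = float(body.strip())  # raises ValueError if the body is not a number
    if value < 0:
        # Negative values signal an error rather than a probability.
        return CheckResult(None, int(value))
    if value > 1:
        raise ValueError(f"value outside the documented range: {value}")
    return CheckResult(value, None)
```

Keeping the error code separate from the score makes it harder to mistake a negative error value for a very low probability.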


Interpretation of Results

If a value of 0.50 is returned, the result is no better than flipping a fair coin, so it tells you very little either way. From personal experience, values > 0.95 should be looked at and values > 0.99 are most likely proxies.

Anything below 0.90 is considered "low risk". Since a real value is returned, different levels of protection can be implemented. It is best for a system admin to test some sample datasets with this system and adjust the implementation accordingly.

Recommendation: I only recommend automated action on high values (> 0.99 or even > 0.995) but it's always best to manually review IPs that return high values. For example, mark an order as "under manual review" and don't automatically provision the product for high proxy values.
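
The thresholds discussed above could be turned into a simple policy function, sketched here in Python. The function name classify and the labels it returns are illustrative assumptions. The text does not name the range from 0.90 to 0.95, so the sketch reports it as "unspecified" rather than guessing.

```python
def classify(score: float) -> str:
    """Map a 0..1 score to an illustrative action label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score > 0.99:
        # Most likely a proxy: hold for manual review, do not auto-provision.
        return "manual_review"
    if score > 0.95:
        # Worth a closer look, but not enough for automated action.
        return "inspect"
    if score < 0.90:
        return "low_risk"
    # 0.90 to 0.95: not covered by the guidance, left to the operator.
    return "unspecified"
```

An operator who prefers the stricter cutoff mentioned above would replace 0.99 with 0.995 after testing against their own sample data.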

Be sure to experiment with the results of this system before you use it live on your projects. If you believe a result is wrong, don't hesitate to contact me; I can tell you why. If it's an error on my end, I'll correct it. If you email me, expect a reply within 12 hours.

Comparing Different Flags

flags=m
  • Data sets used: dynamic ban lists
  • Pros: fastest, smallest chance for false positives
  • Cons: IPs not on blocklists will get through
  • Response time: < 60 ms
  • Suggested use: least false positives; fastest speeds; ok letting some IPs through; only care about proxies & VPNs

flags=b
  • Data sets used: dynamic ban lists, dynamic checks, some bad IP checks
  • Pros: fast, catches more proxy/VPN IPs than flags=m, skips some compromised system detection
  • Cons: higher chance of false positives than flags=m
  • Response time: < 130 ms
  • Suggested use: fast speeds, want to let less proxy/VPN IPs through than flags=m; do not want full bad IP detection; only care about proxies & VPNs

no flags (default)
  • Data sets used: dynamic ban lists, dynamic checks, full bad IP checks
  • Pros: fast, full IP check, balance between speed and full IP check
  • Cons: higher chance of false positives than flags=m; might require 1 more query after 5 seconds
  • Response time: < 130 ms
  • Suggested use: fast speeds, ok with making multiple queries with the same IP

flags=f
  • Data sets used: dynamic ban lists, dynamic checks, full bad IP checks
  • Pros: forces a full IP check which does not take additional queries
  • Cons: higher chance of false positives than flags=m, slowest
  • Response time: < 5000 ms
  • Suggested use: ok with waiting for a full lookup that can take up to 5 secs

Error Codes

The proxy check system returns negative values on error. In the standard (non-JSON) format, an HTTP 400 status code is also returned:

  • -1 - Invalid input or no input
  • -2 - Invalid IP address
  • -3 - Unroutable address / private address
  • -4 - Unable to reach the database, most likely because the database is being updated. Keep an eye on Twitter for more information.
  • -5 - Your connecting IP has been banned from the system or you do not have permission to access a particular service. Did you exceed your query limits? Did you use an invalid email address?
  • -6 - You did not provide any contact information with your query or the contact information is invalid.
  • If you exceed the number of allowed queries, you'll receive an HTTP 429 error.

Be sure to implement exception handling such as timeouts, HTTP 429 error, and the error codes listed above.
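
One way to cover timeouts, HTTP 429, and the negative error codes in a single place is sketched below in Python, using only the standard library. The names IPIntelError, interpret, and check_ip are hypothetical, the 10 second timeout is an arbitrary choice for the example, and the error descriptions are paraphrased from the list above.

```python
import urllib.error
import urllib.request

ERROR_CODES = {
    -1: "invalid or no input",
    -2: "invalid IP address",
    -3: "unroutable or private address",
    -4: "database unreachable (possibly being updated)",
    -5: "connecting IP banned or lacking permission",
    -6: "missing or invalid contact information",
}


class IPIntelError(Exception):
    """Raised for any failure a caller should not treat as a score."""


def interpret(body: str) -> float:
    """Return the score, or raise IPIntelError for error or garbage bodies."""
    try:
        value = float(body.strip())
    except ValueError as exc:
        raise IPIntelError("response was not a number") from exc
    if value < 0:
        code = int(value)
        reason = ERROR_CODES.get(code, "unknown error code")
        raise IPIntelError(f"error {code}: {reason}")
    return value


def check_ip(url: str, timeout: float = 10.0) -> float:
    """Fetch a standard-format result; timeout value is an assumption."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("ascii", "replace")
    except urllib.error.HTTPError as exc:
        if exc.code == 429:
            raise IPIntelError("query limit exceeded (HTTP 429)") from exc
        # Standard-format errors arrive with HTTP 400 and the code in the body.
        body = exc.read().decode("ascii", "replace")
    except OSError as exc:
        # Covers connection failures and timeouts (URLError, TimeoutError).
        raise IPIntelError(f"request failed or timed out: {exc}") from exc
    return interpret(body)
```

A caller can then catch IPIntelError once and decide whether to retry, fall back to allowing the request, or alert an operator.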

Disclaimer

GetIPIntel provides this service on an "as is" and "as available" basis without any express or implied warranties. Use of this service is entirely at your sole risk and discretion. In no event shall GetIPIntel, its owners, operators, or affiliates be liable for any damages or claims of any kind.

Terms of Service

By using this service, you agree to:


Contact

You can find me on Twitter, GitHub, or email.

If I do not respond to your email within 24 hours, then something is wrong; check your spam folder. Please send an email to my Gmail address, or contact me via Twitter.

Ultimately, I want the system to be as accurate as possible, so please let me know if there are any inaccuracies; I'd like to fix them. Also let me know if you have any custom requirements, such as more queries per minute or skipping the cache so that the system always fetches the latest data and recomputes the result.