Top Related Projects
Python ProxyPool for web spider
Lists of HTTP, SOCKS4, SOCKS5 proxies with geolocation info. Updated every hour.
Get PROXY List that gets updated everyday
Daily feed of bad IPs (with blacklist hit scores)
Quick Overview
IPProxyTool is an open-source project designed to collect and verify free proxy IP addresses from various sources. It provides a tool for gathering, testing, and managing proxy IPs, which can be useful for web scraping, anonymity, or bypassing geographical restrictions.
Pros
- Automatically collects and verifies proxy IPs from multiple sources
- Supports both HTTP and HTTPS proxies
- Includes a built-in web interface for easy management of proxy IPs
- Allows customization of proxy sources and verification methods
Cons
- Largely dormant since 2017 (last notable update in 2020), potentially outdated
- Limited documentation, especially for non-Chinese speakers
- May require additional configuration for optimal performance
- Reliability of free proxy IPs can be inconsistent
Code Examples
# NOTE: the import paths and class names below are illustrative and may
# not match the repository's actual layout (top-level proxy/, validator/
# and sql/ packages); check the source tree before using these snippets.

# Initialize a proxy collector spider
from ipproxytool.spiders.proxy.xicidaili import XiCiDaiLiSpider  # hypothetical path
spider = XiCiDaiLiSpider()
spider.start()

# Verify a proxy IP (validator name and method signature are illustrative)
from ipproxytool.spiders.validator.httpbin import HttpBinSpider  # hypothetical path
validator = HttpBinSpider()
is_valid = validator.validate_proxy('127.0.0.1', '8080')
print(f"Proxy is valid: {is_valid}")

# Retrieve valid proxies from the database (the repository's actual helper
# is SqlManager in sql/sql.py; MySQLManager here is illustrative)
from ipproxytool.db.mysql import MySQLManager  # hypothetical path
db = MySQLManager()
valid_proxies = db.get_valid_proxy(count=10)
print(f"Valid proxies: {valid_proxies}")
Getting Started
1. Clone the repository:
git clone https://github.com/awolfly9/IPProxyTool.git
cd IPProxyTool
2. Install dependencies:
pip install -r requirements.txt
3. Configure the database settings in config.py
4. Run the main script:
python ipproxytool.py
5. Access the web interface at http://localhost:8000 to manage and view collected proxies.
Competitor Comparisons
Python ProxyPool for web spider
Pros of proxy_pool
- More active development with recent updates
- Supports multiple database backends (Redis, MongoDB)
- Includes a RESTful API for easy integration
Cons of proxy_pool
- Less detailed documentation compared to IPProxyTool
- Primarily focused on Chinese proxy sources
Code Comparison
IPProxyTool:
class Validator(object):
    def __init__(self, proxies):
        self.proxies = proxies
        self.timeout = 10
        self.threads = 20
proxy_pool:
class ProxyCheck(object):
    def __init__(self):
        self.raw_proxy_queue = Queue()
        self.thread_list = list()
        self.useful_proxy_queue = Queue()
Both projects use object-oriented programming for their core functionality. IPProxyTool's Validator class focuses on validating proxies, while proxy_pool's ProxyCheck class manages proxy queues and threading.
proxy_pool offers more flexibility with database options and includes an API, making it easier to integrate into existing projects. However, IPProxyTool provides more detailed documentation, which can be beneficial for users new to proxy management.
The choice between these tools depends on specific requirements, such as the need for an API, database preferences, and the importance of documentation quality.
Lists of HTTP, SOCKS4, SOCKS5 proxies with geolocation info. Updated every hour.
Pros of proxy-list
- Regularly updated proxy lists in multiple formats (TXT, JSON)
- Simple and straightforward to use, with no additional setup required
- Supports various proxy protocols (HTTP, HTTPS, SOCKS4, SOCKS5)
Cons of proxy-list
- Lacks proxy validation or testing functionality
- No built-in tools for proxy scraping or management
- Limited customization options for users
Code comparison
proxy-list:
# No specific code to show, as it's primarily a collection of proxy lists
IPProxyTool:
from config import *
from sql.sql import SqlManager
from ipproxy import IPProxy
ipproxy = IPProxy()
ipproxy.run()
Summary
proxy-list is a straightforward repository that provides regularly updated proxy lists in various formats, making it easy for users to access and use proxies without additional setup. However, it lacks advanced features like proxy validation or management tools.
IPProxyTool, on the other hand, offers a more comprehensive solution for proxy management, including scraping, validation, and database storage. It requires more setup but provides greater flexibility and control over the proxy collection and validation process.
Choose proxy-list for quick access to proxy lists or IPProxyTool for a more robust proxy management solution.
Get PROXY List that gets updated everyday
Pros of PROXY-List
- Larger collection of proxy servers, updated more frequently
- Simpler to use, with ready-to-use proxy lists in various formats
- Active community and regular contributions
Cons of PROXY-List
- Less sophisticated proxy validation and testing
- Lacks advanced features like proxy rotation or API integration
- No built-in proxy scraping functionality
Code Comparison
PROXY-List typically provides proxy lists in plain text or JSON format:
123.45.67.89:8080
98.76.54.32:3128
IPProxyTool offers more structured output and functionality:
class Proxy(object):
    def __init__(self):
        self.ip = ''
        self.port = ''
        self.country = ''
        self.anonymity = ''
        self.https = ''
        self.speed = ''
        self.source = ''
PROXY-List is more suitable for users who need quick access to a large number of proxies without extensive validation or management features. IPProxyTool is better for those requiring more control over proxy selection, testing, and integration into larger projects.
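The difference shows up when moving between the two: PROXY-List's plain `ip:port` lines have to be lifted into structured records like IPProxyTool's Proxy fields. The parser below is a hypothetical illustration, not part of either project:

```python
# Hypothetical: parse plain "ip:port" lines (PROXY-List's format) into
# dicts mirroring a few of IPProxyTool's Proxy fields.
def parse_proxy_lines(text, source='proxy-list'):
    proxies = []
    for line in text.splitlines():
        line = line.strip()
        if ':' not in line:
            continue  # skip blank or malformed entries
        ip, port = line.rsplit(':', 1)
        proxies.append({'ip': ip, 'port': port, 'source': source})
    return proxies

parse_proxy_lines("123.45.67.89:8080\n98.76.54.32:3128")
# → [{'ip': '123.45.67.89', 'port': '8080', 'source': 'proxy-list'},
#    {'ip': '98.76.54.32', 'port': '3128', 'source': 'proxy-list'}]
```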
Daily feed of bad IPs (with blacklist hit scores)
Pros of ipsum
- Regularly updated with a large list of malicious IP addresses
- Lightweight and easy to integrate into existing security systems
- Provides multiple formats for IP lists (plain text, CSV, JSON)
Cons of ipsum
- Focused solely on malicious IP addresses, not proxy servers
- Lacks features for testing or verifying IP addresses
- No built-in proxy management or rotation capabilities
Code comparison
IPProxyTool:
def start(self):
    for p in self.proxy_getter_functions:
        p.start()
    for p in self.proxy_getter_functions:
        p.join()
ipsum:
def update(self):
    for name in self.SOURCES:
        worker = threading.Thread(target=self._retrieve_worker, args=(name,))
        worker.start()
        self._threads.append(worker)
Both projects use threading to retrieve data from multiple sources concurrently. IPProxyTool focuses on gathering and managing proxy servers, while ipsum collects malicious IP addresses from various sources.
IPProxyTool offers more comprehensive proxy management features, including testing and verification. ipsum, on the other hand, provides a simpler solution for maintaining a list of malicious IP addresses, which can be useful for security applications and firewalls.
The choice between these tools depends on the specific use case: IPProxyTool for proxy management and ipsum for maintaining a blocklist of malicious IPs.
README
IPProxyTool
Uses Scrapy spiders to crawl proxy websites and collect large numbers of free proxy IPs, filters out the usable ones, and stores them in a database for later use. You can visit my personal site to see more of my interesting projects.
Thanks to youngjeff for maintaining this project with me.
Runtime Environment
Install Python 3 and a MySQL database.
Build environment for the cryptography module:
sudo yum install gcc libffi-devel python-devel openssl-devel
$ pip install -r requirements.txt
Download and Setup
Clone the project:
$ git clone https://github.com/awolfly9/IPProxyTool.git
Enter the project directory:
$ cd IPProxyTool
Edit the MySQL configuration: set the user and password of database_config in config.py to your database credentials:
$ vim config.py
---------------
database_config = {
'host': 'localhost',
'port': 3306,
'user': 'root',
'password': '123456',
'charset': 'utf8',
}
MySQL: import the table schema
$ mysql> create database ipproxy;
Query OK, 1 row affected (0.00 sec)
$ mysql> use ipproxy;
Database changed
$ mysql> source '/your/project/directory/db.sql'
Run the startup script ipproxytool.py. The crawl, validation, and server scripts can also be run separately; see the Project Overview below.
$ python ipproxytool.py
An asynchronous validation mode is also available:
$ python ipproxytool.py async
Project Overview
Crawling proxy sites
All code for crawling proxy sites lives in the proxy directory.
To add a crawler for another proxy site:
1. Create a new script in the proxy directory that inherits from BaseSpider
2. Set name, urls, and headers
3. Override the parse_page method to extract the proxy data
4. Write the data to the database (see ip181 and kuaidaili for examples)
5. For particularly complex proxy sites, see peuland
Edit run_crawl_proxy.py to import the new spider and add it to the crawl queue.
run_crawl_proxy.py can also be run on its own to start crawling proxy sites:
$ python run_crawl_proxy.py
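The extension steps above can be sketched as follows. BaseSpider is shown here as a minimal stand-in (the project's real base class is a Scrapy spider), and ExampleProxySpider, its URL, and the regex-based parsing are all hypothetical:

```python
import re

# Minimal stand-in for the project's BaseSpider; the real class is a
# Scrapy spider, so names and signatures here are illustrative only.
class BaseSpider:
    name = ''       # source/table name
    urls = []       # pages to crawl
    headers = {}    # request headers

    def parse_page(self, html):
        raise NotImplementedError

# Steps 1-3: subclass, set name/urls/headers, override parse_page.
class ExampleProxySpider(BaseSpider):
    name = 'exampleproxy'
    urls = ['http://example.com/proxylist']  # hypothetical source
    headers = {'User-Agent': 'Mozilla/5.0'}

    def parse_page(self, html):
        # A real spider would use Scrapy selectors; a regex keeps the
        # sketch self-contained. Step 4 (writing to the database) is
        # handled elsewhere in the actual project.
        return [{'ip': ip, 'port': port}
                for ip, port in re.findall(r'(\d+\.\d+\.\d+\.\d+):(\d+)', html)]
```

A spider defined this way would then be imported in run_crawl_proxy.py and appended to the crawl queue, as described above.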
Validating proxy IPs
The current validation workflow:
1. Fetch all proxy IPs stored in the database by the crawl step
2. Use each proxy IP to send a request to httpbin
3. From the response, determine the proxy's validity, HTTPS support, and anonymity level, and store the results in the httpbin table
4. Take proxies from the httpbin table and use them to request a target site, e.g. Douban
5. If the request returns valid data within a reasonable time, the proxy IP is considered usable and written to the corresponding table
Each target site has its own script; all proxy-validation code lives in the validator directory.
To add validation for another site:
1. Create a new script in the validator directory that inherits from Validator
2. Set name, timeout, urls, and headers
3. Call the init method (see baidu and douban for examples)
4. For particularly complex validation, see assetstore
Edit run_validator.py to import the new validator and add it to the validation queue.
run_validator.py can also be run on its own to start validating proxy IPs:
$ python run_validator.py
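The core of the validation workflow is a timed request through the candidate proxy. The sketch below shows that idea with the standard library; check_proxy is a hypothetical helper, not the project's API (the real validators are Scrapy spiders):

```python
import urllib.request

def check_proxy(ip, port, url='http://httpbin.org/ip', timeout=10):
    """Return True if `url` responds through proxy ip:port within `timeout`."""
    handler = urllib.request.ProxyHandler({'http': f'http://{ip}:{port}'})
    opener = urllib.request.build_opener(handler)
    try:
        # Request the probe URL through the proxy and require a
        # successful response within the time limit.
        return opener.open(url, timeout=timeout).status == 200
    except Exception:
        return False  # connection refused, timed out, or bad response
```

A check like this answers validity only; the project additionally records HTTPS support and anonymity from the httpbin response body.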
Proxy data server API
Change data_port in config.py to set the server port (default 8000), then start the server:
$ python run_server.py
The server exposes the following endpoints.
Select
http://127.0.0.1:8000/select?name=httpbin&anonymity=1&https=yes&order=id&sort=desc&count=100
Parameters
Name | Type | Description | Required |
---|---|---|---|
name | str | table name | yes |
anonymity | int | 1: elite 2: anonymous 3: transparent | no |
https | str | yes: https, no: http | no |
order | str | table field to order by | no |
sort | str | asc ascending, desc descending | no |
count | int | number of proxies to return, default 100 | no |
Delete
http://127.0.0.1:8000/delete?name=httpbin&ip=27.197.144.181
Parameters
Name | Type | Description | Required |
---|---|---|---|
name | str | table name | yes |
ip | str | the IP to delete | yes |
Insert
Parameters
Name | Type | Description | Required |
---|---|---|---|
name | str | table name | yes |
ip | str | IP address | yes |
port | str | port | yes |
country | str | country | no |
anonymity | int | 1: elite, 2: anonymous, 3: transparent | no |
https | str | yes: https, no: http | no |
speed | float | access speed | no |
source | str | proxy source | no |
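Assuming the server from run_server.py is listening on the default port, the endpoints above are plain HTTP GET requests. The select_url helper below is a hypothetical convenience for building query strings from the documented parameters:

```python
from urllib.parse import urlencode

# Hypothetical helper: build a /select query URL from the documented
# parameters (name is required; the rest are optional filters).
def select_url(name, host='127.0.0.1', port=8000, **filters):
    query = urlencode({'name': name, **filters})
    return f'http://{host}:{port}/select?{query}'

url = select_url('httpbin', anonymity=1, https='yes',
                 order='id', sort='desc', count=100)
# fetch with urllib.request.urlopen(url) once the server is running
```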
TODO
- Add support for more databases
  - mysql
  - redis TODO...
  - sqlite TODO...
- Crawl more free proxy sites. Currently supported free proxy IP sites (connections to some overseas sites are unstable):
  - (overseas) http://www.freeproxylists.net/
  - (overseas) http://gatherproxy.com/
  - (domestic) https://hidemy.name/en/proxy-list/
  - (domestic) http://www.ip181.com/
  - (domestic) http://www.kuaidaili.com/
  - (overseas) https://proxy.peuland.com/proxy_list_by_category.htm
  - (overseas) https://list.proxylistplus.com/
  - (domestic) http://m.66ip.cn
  - (overseas) http://www.us-proxy.org/
  - (domestic) http://www.xicidaili.com
- Deploy the project in a distributed setup
Done: more filter conditions in the server API; multi-process proxy IP validation; HTTPS support; proxy anonymity detection
Changelog
-----------------------------2020-12-29----------------------------
- Fixed previously incorrect path naming
- Modified the MySQL table structure
-----------------------------2017-6-23----------------------------
1. python2 -> python3
2. web.py -> flask
-----------------------------2017-5-17----------------------------
1. Added Docker support on top of the existing system; see the instructions below. For more about Docker, see the official site http://www.docker.com
-----------------------------2017-3-30----------------------------
1. Finished revising the README
2. Database inserts now support transactions
-----------------------------2017-3-14----------------------------
1. Changed the server API and added sort options
2. Added multi-process validation of proxy IPs
-----------------------------2017-2-20----------------------------
1. Added more filter conditions to the server API
-----------------------------2017-2-16----------------------------
1. Validate proxy IP anonymity
2. Validate proxy IP HTTPS support
3. Added a concurrency setting for httpbin validation, default 4
With Docker installed on your system you can use this program as follows.
Download the program:
git clone https://github.com/awolfly9/IPProxyTool
Enter the directory:
cd IPProxyTool
Build the image:
docker build -t proxy .
Run the container:
docker run -it proxy
Modify the configuration in config.py according to your needs:
database_config = {
'host': 'localhost',
'port': 3306,
'user': 'root',
'password': 'root',
'charset': 'utf8',
}