scylla
Intelligent proxy pool for Humans™ to extract content from the internet and build your own Large Language Models in this new AI era
Top Related Projects
A Rust port of shadowsocks
An unidentifiable mechanism that helps you bypass GFW.
A platform for building proxies to bypass network restrictions.
Xray, Penetrates Everything. Also the best v2ray-core. Where the magic happens. An open platform for various uses.
Make a fortune quietly
Lantern official version download, Lantern, GFW circumvention, proxy, unrestricted internet access, external network, accelerator, ladder, router - Fast, reliable and secure access to the open internet - lantern proxy vpn censorship-circumvention censorship gfw accelerator - Lantern proxy, anti-censorship, secure, reliable and fast
Quick Overview
Scylla is an open-source HTTP proxy pool for web scraping and data collection. It provides a robust and scalable solution for managing and rotating IP addresses, helping users bypass rate limits and access geo-restricted content. Scylla is designed to be easy to deploy and integrate into existing web scraping workflows.
Pros
- Easy to deploy using Docker, with support for both x86 and ARM architectures
- Provides a RESTful API for easy integration with various programming languages and tools
- Supports multiple proxy providers and automatic proxy rotation
- Includes a web-based dashboard for monitoring and managing proxies
Cons
- Limited documentation, which may make it challenging for new users to get started
- Requires some technical knowledge to set up and configure properly
- May require additional configuration for optimal performance in large-scale scraping operations
- Lacks built-in support for advanced features like session management or browser fingerprinting
Code Examples
- Fetching a proxy from Scylla:
import requests
# The endpoint returns an object with a "proxies" list; take the first entry.
proxy = requests.get('http://localhost:8899/api/v1/proxies').json()['proxies'][0]
print(f"Proxy: {proxy['ip']}:{proxy['port']}")
- Using a Scylla proxy with requests:
import requests
proxy_url = 'http://localhost:8899/api/v1/proxies'
# The response wraps the results in a "proxies" list; use the first entry.
proxy = requests.get(proxy_url).json()['proxies'][0]
proxies = {
    'http': f"http://{proxy['ip']}:{proxy['port']}",
    'https': f"http://{proxy['ip']}:{proxy['port']}"
}
response = requests.get('https://example.com', proxies=proxies)
print(response.text)
- Configuring Scrapy to use Scylla:
# In your Scrapy settings.py file
DOWNLOADER_MIDDLEWARES = {
'scrapy_proxy_pool.middlewares.ProxyPoolMiddleware': 610,
'scrapy_proxy_pool.middlewares.BanDetectionMiddleware': 620,
}
PROXY_POOL_ENABLED = True
PROXY_POOL_CONFIG = {
'url': 'http://localhost:8899/api/v1/proxies',
}
Getting Started
To get started with Scylla, follow these steps:
- Install Docker on your system.
- Run Scylla using Docker:
docker run -d -p 8899:8899 -p 8081:8081 -v /var/www/scylla:/var/www/scylla --name scylla wildcat/scylla:latest
- Access the web dashboard at http://localhost:8899.
- Use the API endpoint http://localhost:8899/api/v1/proxies to fetch proxies in your application.
For more detailed configuration and usage instructions, refer to the project's GitHub repository.
Competitor Comparisons
A Rust port of shadowsocks
Pros of shadowsocks-rust
- Written in Rust, offering better performance and memory safety
- More mature project with a larger community and longer development history
- Supports multiple ciphers and protocols, providing flexibility for users
Cons of shadowsocks-rust
- Focused solely on the Shadowsocks protocol, limiting its use cases
- May have a steeper learning curve for users unfamiliar with Rust
Code Comparison
Scylla (Python):
async def handle_client(client_reader, client_writer):
    try:
        data = await client_reader.read(BUFFER_SIZE)
        remote_reader, remote_writer = await asyncio.open_connection(
            target_host, target_port)
        remote_writer.write(data)
        await remote_writer.drain()
        # ... (rest of the function)
shadowsocks-rust (Rust):
async fn handle_client(socket: TcpStream, method: &CipherKind) -> io::Result<()> {
    let (mut reader, mut writer) = socket.split();
    let mut cipher = method.cipher();
    let mut buf = vec![0u8; 0x3fff];
    loop {
        let n = reader.read(&mut buf).await?;
        // ... (rest of the function)
The code snippets show the different approaches and languages used in handling client connections. Scylla uses Python's asyncio for asynchronous operations, while shadowsocks-rust leverages Rust's async/await syntax and low-level networking primitives.
An unidentifiable mechanism that helps you bypass GFW.
Pros of Trojan
- Designed specifically for bypassing GFW, offering better performance in restricted networks
- Simpler setup and configuration process
- Lighter resource usage, suitable for low-end devices
Cons of Trojan
- Limited protocol support compared to Scylla's multi-protocol approach
- Less flexibility in customization and advanced features
- Smaller community and fewer third-party clients
Code Comparison
Trojan (server configuration):
{
"run_type": "server",
"local_addr": "0.0.0.0",
"local_port": 443,
"remote_addr": "127.0.0.1",
"remote_port": 80,
"password": ["password1"],
"ssl": {
"cert": "/path/to/certificate.crt",
"key": "/path/to/private.key",
"key_password": "",
"cipher": "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384",
"cipher_tls13": "TLS_AES_128_GCM_SHA256:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_256_GCM_SHA384",
"prefer_server_cipher": true,
"alpn": [
"http/1.1"
],
"reuse_session": true,
"session_ticket": false,
"session_timeout": 600,
"plain_http_response": "",
"curves": "",
"dhparam": ""
}
}
Scylla (configuration example):
proxy:
- name: http
type: http
port: 8080
listen: 0.0.0.0
- name: socks5
type: socks5
port: 1080
listen: 0.0.0.0
- name: shadowsocks
type: ss
port: 8388
listen: 0.0.0.0
password: your_password
method: aes-256-gcm
A platform for building proxies to bypass network restrictions.
Pros of v2ray-core
- More comprehensive and feature-rich proxy solution
- Supports multiple protocols and transport layers
- Larger community and more active development
Cons of v2ray-core
- More complex configuration and setup
- Higher resource usage due to its extensive features
- Steeper learning curve for new users
Code Comparison
v2ray-core (Go):
type User struct {
	Level uint32
	Email string
}
type Account struct {
	Id      string
	AlterId uint32
}
Scylla (Python):
class Proxy:
    def __init__(self, host, port):
        self.host = host
        self.port = port

    def __str__(self):
        return f"{self.host}:{self.port}"
v2ray-core is a more comprehensive proxy solution written in Go, offering support for multiple protocols and transport layers. It has a larger community and more active development compared to Scylla. However, v2ray-core is more complex to configure and set up, and it may have higher resource usage due to its extensive features.
Scylla, on the other hand, is a simpler proxy crawler and pool written in Python. It focuses on providing a straightforward solution for proxy management and is easier to set up and use. However, it may lack some of the advanced features and protocol support that v2ray-core offers.
The code comparison shows the difference in complexity and focus between the two projects, with v2ray-core handling more complex user and account structures, while Scylla provides a simpler proxy representation.
Xray, Penetrates Everything. Also the best v2ray-core. Where the magic happens. An open platform for various uses.
Pros of Xray-core
- More comprehensive protocol support, including VLESS, Trojan, and Shadowsocks
- Advanced traffic routing capabilities with flexible rule-based configurations
- Active development with frequent updates and improvements
Cons of Xray-core
- Steeper learning curve due to more complex configuration options
- Potentially higher resource usage for advanced features
Code Comparison
Xray-core configuration example:
{
"inbounds": [{"port": 1080, "protocol": "socks"}],
"outbounds": [{"protocol": "freedom"}]
}
Scylla usage example:
from scylla import Scylla
proxy = Scylla()
proxy.start()
Summary
Xray-core is a feature-rich proxy tool with extensive protocol support and advanced routing capabilities, making it suitable for complex networking scenarios. However, it may require more setup time and resources.
Scylla, on the other hand, is a simpler proxy scraper and manager focused on ease of use and quick deployment. It's more suitable for basic proxy needs and automated scraping tasks.
The choice between the two depends on the specific requirements of your project, with Xray-core being more powerful but complex, and Scylla offering simplicity and ease of use for proxy management.
Make a fortune quietly
Pros of naiveproxy
- Built on Chromium's network stack, providing robust and up-to-date protocol support
- Designed for better censorship resistance and traffic obfuscation
- Supports multiple protocols including HTTP, HTTPS, and QUIC
Cons of naiveproxy
- More complex setup and configuration compared to Scylla
- Larger codebase and resource footprint due to Chromium dependencies
- Limited to client-side proxy functionality, whereas Scylla offers both client and server components
Code Comparison
naiveproxy (C++):
int main(int argc, char* argv[]) {
  base::CommandLine::Init(argc, argv);
  logging::LoggingSettings settings;
  settings.logging_dest = logging::LOG_TO_SYSTEM_DEBUG_LOG;
  logging::InitLogging(settings);
  return naive::naive_main(argc, argv);
}
Scylla (Python):
def start(self):
    self.loop.run_until_complete(self._start())
    try:
        self.loop.run_forever()
    except KeyboardInterrupt:
        pass
The code snippets show the entry points for both projects. naiveproxy uses C++ and integrates with Chromium's base libraries, while Scylla is written in Python and uses asyncio for event handling.
Lantern official version download, Lantern, GFW circumvention, proxy, unrestricted internet access, external network, accelerator, ladder, router - Fast, reliable and secure access to the open internet - lantern proxy vpn censorship-circumvention censorship gfw accelerator - Lantern proxy, anti-censorship, secure, reliable and fast
Pros of Lantern
- More comprehensive solution for internet freedom, including VPN and proxy services
- Larger user base and community support
- Actively maintained with regular updates and releases
Cons of Lantern
- Closed-source components, limiting transparency and community contributions
- More complex setup and configuration compared to Scylla
- Potential privacy concerns due to centralized infrastructure
Code Comparison
Lantern (Go):
func (c *Client) Dial(network, addr string) (net.Conn, error) {
	return c.DialWithDialer(&net.Dialer{
		Timeout:   30 * time.Second,
		KeepAlive: 30 * time.Second,
	}, network, addr)
}
Scylla (Python):
async def create_proxy_server(host: str, port: int, **kwargs) -> ProxyServer:
    return await ProxyServer.create(host, port, **kwargs)
While both projects aim to provide internet freedom solutions, Lantern offers a more comprehensive package with VPN and proxy services, whereas Scylla focuses on proxy functionality. Lantern has a larger user base and more active development, but Scylla's open-source nature provides greater transparency. The code snippets demonstrate the different languages and approaches used in each project.
README
Scylla
An intelligent proxy pool for humanities, to extract content from the internet and build your own Large Language Models in this new AI era.
Key features:
- Automatic proxy IP crawling and validation
- Easy-to-use JSON API
- Simple but beautiful web-based user interface (e.g., geographical distribution of proxies)
- Get started with as little as one command
- Simple HTTP forward proxy server
- Scrapy and requests integration with as little as one line of code
- Headless browser crawling
Get started
Installation
Install with Docker (highly recommended)
docker run -d -p 8899:8899 -p 8081:8081 -v /var/www/scylla:/var/www/scylla --name scylla wildcat/scylla:latest
Install directly via pip
pip install scylla
scylla --help
scylla # Run the crawler and web server for JSON API
Install from source
git clone https://github.com/imWildCat/scylla.git
cd scylla
pip install -r requirements.txt
cd frontend
npm install
cd ..
make assets-build
python -m scylla
Usage
This is an example of running the service locally (localhost) on port 8899.
Note: The first time you use Scylla, you may have to wait 1 to 2 minutes for proxy IPs to be populated in the database.
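The wait described in the note can be automated by polling the proxy list until it reports at least one entry. The following is a minimal sketch, not part of Scylla itself: the function names, the timeout values, and the assumption that the `count` field is zero (or the server unreachable) until the crawler has populated the database are all illustrative.

```python
import json
import time
from urllib.request import urlopen

PROXIES_URL = "http://localhost:8899/api/v1/proxies"  # endpoint from this README


def fetch_count(url=PROXIES_URL):
    """Return the `count` field of the proxy-list response (0 if it is missing)."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp).get("count", 0)


def wait_for_proxies(fetch=fetch_count, timeout=180.0, interval=5.0, sleep=time.sleep):
    """Poll until at least one proxy is reported or `timeout` seconds have passed.

    `fetch` and `sleep` are injectable so the loop can be exercised without a server.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if fetch() > 0:
                return True
        except OSError:
            pass  # server not reachable yet; keep polling
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_for_proxies()` right after starting the container returns `True` once proxies are available, or `False` if nothing shows up within three minutes.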
JSON API
Proxy IP List
http://localhost:8899/api/v1/proxies
Optional URL parameters:
Parameter | Default value | Description |
---|---|---|
page | 1 | The page number |
limit | 20 | The number of proxies shown on each page |
anonymous | any | Whether to show anonymous proxies. Possible values: true, only anonymous proxies; false, only transparent proxies |
https | any | Whether to show HTTPS proxies. Possible values: true, only HTTPS proxies; false, only HTTP proxies |
countries | None | Filter proxies by country. Format example: US, or for multiple countries: US,GB |
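As an illustration of how these parameters combine, the sketch below builds a query URL with the standard library. The helper name `build_proxy_list_url` is hypothetical, and it assumes the server accepts lowercase `true`/`false` exactly as written in the table.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8899/api/v1/proxies"  # endpoint from this README


def build_proxy_list_url(page=1, limit=20, anonymous=None, https=None, countries=None):
    """Build a proxy-list URL, omitting filters that are left at "any"/None."""
    params = {"page": page, "limit": limit}
    if anonymous is not None:
        params["anonymous"] = "true" if anonymous else "false"
    if https is not None:
        params["https"] = "true" if https else "false"
    if countries:
        # Multiple countries are comma-separated, e.g. "US,GB".
        params["countries"] = ",".join(countries)
    # safe="," keeps the comma readable instead of percent-encoding it.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


url = build_proxy_list_url(limit=10, anonymous=True, https=True, countries=["US", "GB"])
print(url)
```

Leaving a filter at `None` keeps it out of the query string, which matches the "any" default in the table.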
Sample result:
{
"proxies": [{
"id": 599,
"ip": "91.229.222.163",
"port": 53281,
"is_valid": true,
"created_at": 1527590947,
"updated_at": 1527593751,
"latency": 23.0,
"stability": 0.1,
"is_anonymous": true,
"is_https": true,
"attempts": 1,
"https_attempts": 0,
"location": "54.0451,-0.8053",
"organization": "AS57099 Boundless Networks Limited",
"region": "England",
"country": "GB",
"city": "Malton"
}, {
"id": 75,
"ip": "75.151.213.85",
"port": 8080,
"is_valid": true,
"created_at": 1527590676,
"updated_at": 1527593702,
"latency": 268.0,
"stability": 0.3,
"is_anonymous": true,
"is_https": true,
"attempts": 1,
"https_attempts": 0,
"location": "32.3706,-90.1755",
"organization": "AS7922 Comcast Cable Communications, LLC",
"region": "Mississippi",
"country": "US",
"city": "Jackson"
},
...
],
"count": 1025,
"per_page": 20,
"page": 1,
"total_page": 52
}
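A response shaped like the sample above can be reduced to a single proxy for use with requests. This sketch assumes that a lower `latency` value is better and that `is_valid` marks usable entries; the helper names are illustrative, and the data is a trimmed copy of the sample result rather than live output.

```python
import math


def pick_fastest(payload, require_https=False):
    """Return the valid proxy with the lowest latency, or None if none qualify."""
    candidates = [
        p for p in payload.get("proxies", [])
        if p.get("is_valid") and (p.get("is_https") or not require_https)
    ]
    return min(candidates, key=lambda p: p["latency"], default=None)


def to_requests_proxies(proxy):
    """Format one proxy entry as the mapping accepted by requests' `proxies=` argument."""
    address = f"http://{proxy['ip']}:{proxy['port']}"
    return {"http": address, "https": address}


# Trimmed copy of the sample result (only the fields used here).
sample = {
    "proxies": [
        {"ip": "91.229.222.163", "port": 53281, "is_valid": True, "latency": 23.0, "is_https": True},
        {"ip": "75.151.213.85", "port": 8080, "is_valid": True, "latency": 268.0, "is_https": True},
    ],
    "count": 1025,
    "per_page": 20,
    "page": 1,
    "total_page": 52,
}

best = pick_fastest(sample)
print(to_requests_proxies(best))

# The pagination fields are consistent with each other: ceil(1025 / 20) == 52.
pages = math.ceil(sample["count"] / sample["per_page"])
print(pages == sample["total_page"])
```

To walk the full list, request `page=1` through `page=total_page` and apply the same selection to each response.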
System Statistics
http://localhost:8899/api/v1/stats
Sample result:
{
"median": 181.2566407083,
"valid_count": 1780,
"total_count": 9528,
"mean": 174.3290085201
}
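One simple use of these statistics is to check what share of the crawled proxies is currently usable. The sketch below works on the sample values above; `valid_ratio` is a hypothetical helper, not something the API provides.

```python
def valid_ratio(stats):
    """Fraction of known proxies currently marked valid (0.0 if none are known)."""
    total = stats.get("total_count", 0)
    return stats.get("valid_count", 0) / total if total else 0.0


sample_stats = {
    "median": 181.2566407083,
    "valid_count": 1780,
    "total_count": 9528,
    "mean": 174.3290085201,
}
print(f"{valid_ratio(sample_stats):.1%} of crawled proxies are valid")
```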
HTTP Forward Proxy Server
By default, Scylla starts an HTTP forward proxy server on port 8081. Whenever an HTTP request arrives, this server randomly selects a recently updated proxy from the database and forwards the request through it.
Note: HTTPS requests are not supported at present.
An example of using this proxy server with curl:
curl http://api.ipify.org -x http://127.0.0.1:8081
You could also use this feature with requests:
import requests
requests.get('http://api.ipify.org', proxies={'http': 'http://127.0.0.1:8081'})
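Because the forward proxy does not support HTTPS requests, it can help to guard calls so that only `http://` URLs are routed through it. This is a small sketch under that assumption; `proxies_for` is a hypothetical helper, and raising an error for other schemes is one possible design choice among several.

```python
from urllib.parse import urlsplit

FORWARD_PROXY = "http://127.0.0.1:8081"  # default forward proxy address from this README


def proxies_for(url, forward_proxy=FORWARD_PROXY):
    """Return a requests-style proxies mapping for plain-HTTP URLs.

    Raises ValueError for any other scheme, since the forward proxy
    is documented as not supporting HTTPS requests.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme != "http":
        raise ValueError(f"forward proxy only handles http:// URLs, got {scheme or 'no scheme'!r}")
    return {"http": forward_proxy}
```

The result can be passed directly as `requests.get(url, proxies=proxies_for(url))`.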
Web UI
Open http://localhost:8899 in your browser to see the Web UI of this project.
Proxy IP List
http://localhost:8899/
Globally Geographical Distribution Map
http://localhost:8899/#/geo
API Documentation
Please read Module Index.
Roadmap
Please see Projects.
Development and Contribution
git clone https://github.com/imWildCat/scylla.git
cd scylla
pip install -r requirements.txt
npm install
make assets-build
Testing
If you wish to run tests locally, the commands are shown below:
pip install -r tests/requirements-test.txt
pytest tests/
You are welcome to add more test cases to improve the robustness of this project.
Naming of This Project
The name Scylla comes from a group of memory chips in the American TV series Prison Break. The project is named in tribute to the series.
Help
How to install Python Scylla on CentOS7
Donation
If you find this project useful, please consider donating to it.
Whatever the amount, your donation will encourage the author to keep developing new features. Thank you!
You can donate in the following ways:
GitHub Sponsor
I would really appreciate it if you could join my sponsors here:
https://github.com/sponsors/imWildCat
PayPal
License
Apache License 2.0. For more details, please read the LICENSE file.