抛砖引玉一下。。。

公司内网搭建了分布式扫描器,但是却耗费了好多服务器资源,那么如何在外网无VPS和服务器的情况下,快速搭建一套分布式扫描器呢?

首先,先鄙视一下我这种羊毛党。

然后引入正文daocloud.io,这个平台可以免费运行3台docker容器。

一台作为celery的broker(这里我使用的是redis)

一台作为数据接收db(这里我使用的是mongodb。注:从dockerhub的镜像部署,daocloud自身的mongodb镜像不让用)

另外一台作为woker

先看一下woker这台容器的镜像是如何构建的 https://raw.githubusercontent.com/yuxiaokui/celery-app/master/Dockerfile

FROM fedora:latest
MAINTAINER xi4okv
RUN dnf -y install python
RUN pip install celery redis IPy pymongo requests
ENV THREAD 10
ENTRYPOINT  cd /app && celery -A tasks worker  --loglevel=info -c $THREAD

OK。。。。
就是这么简单。
然后看一下部署效果。

然后tasks.py的内容是

from celery import Celery
BROKER_URL = 'redis://redis.t2.daoapp.io:61xx4/0'
BACKEND_URL = 'redis://redis.t2.daoapp.io:61xx4/1'

celery = Celery('tasks',
    broker=BROKER_URL,
    backend=BACKEND_URL)
    
import socket
import sys, os, time
import requests
import pymongo


client = pymongo.MongoClient('mongodb://mongodb.t2.daoapp.io:61xx7/')
db = client['sites']
posts = db['sites']


@celery.task
def scan(url):
    headers = requests.head(url, timeout=3).headers
    headers["url"] = url
    posts.insert_one(headers)
    return headers

也是很简单的一次headers收集。

下面开始分发任务。

from tasks import scan
for url in open("urls.txt"):
    scan.delay("http://" + url.strip())
    print url

OK,执行完毕,看下数据处理效果。

import  pymongo

client = pymongo.MongoClient('mongodb://mongodb.t2.daoapp.io:61xx7/')
db = client['sites']
posts = db['sites']

servers = posts.aggregate([{"$group" : {"_id" : "$X-Powered-By", "num_tutorial" : {"$sum":1 }}}])
for server in servers:
    print server[/code][code]============== RESTART: C:\Users\xi4okv\Desktop\task\getData.py ==============
{u'num_tutorial': 1, u'_id': u'ARR/3.0, ASP.NET'}
{u'num_tutorial': 1, u'_id': u'ASP.NET, UrlRewriter.NET 2.0.0'}
{u'num_tutorial': 1, u'_id': u'PHP/5.5.18'}
{u'num_tutorial': 1, u'_id': u'PHP/5.5.10'}
{u'num_tutorial': 1, u'_id': u'KCDNS'}
{u'num_tutorial': 1, u'_id': u'PHP/5.2.8'}
{u'num_tutorial': 1, u'_id': u'ThinkPHP2.1'}
{u'num_tutorial': 1, u'_id': u'WWW_80'}
{u'num_tutorial': 1, u'_id': u'PHP/5.4.13'}
{u'num_tutorial': 2, u'_id': u'PHP/5.5.38'}
{u'num_tutorial': 1, u'_id': u'PHP/5.3.10-1ubuntu3.18'}
{u'num_tutorial': 2, u'_id': u'PHP/5.6.21'}
{u'num_tutorial': 1, u'_id': u'PHP/5.3.10-1ubuntu3.26'}
{u'num_tutorial': 1, u'_id': u'PHP/5.5.12'}
{u'num_tutorial': 1, u'_id': u'PHP/5.6.26'}
{u'num_tutorial': 1, u'_id': u'PHP/5.5.4'}
{u'num_tutorial': 1, u'_id': u'jb51.net'}
{u'num_tutorial': 1, u'_id': u'PHP/5.4.9-4ubuntu2.4'}
{u'num_tutorial': 1, u'_id': u'PHP/5.5.9-1ubuntu4.5'}
{u'num_tutorial': 2, u'_id': u'PHP/5.2.17'}
{u'num_tutorial': 1, u'_id': u'ASP.NET, ARR/2.5'}
{u'num_tutorial': 1, u'_id': u'PHP/5.3.17'}
{u'num_tutorial': 1, u'_id': u'PHP/5.3.19'}
{u'num_tutorial': 6, u'_id': u'PHP/5.3.3'}
{u'num_tutorial': 6, u'_id': u'PHP/5.4.45'}
{u'num_tutorial': 1, u'_id': u'PHP/7.0.13'}
{u'num_tutorial': 1, u'_id': u'PHP/5.4.41'}
{u'num_tutorial': 3, u'_id': u'PHP/5.4.16'}
{u'num_tutorial': 11, u'_id': u'ThinkPHP'}
{u'num_tutorial': 6, u'_id': u'WAF/2.0'}
{u'num_tutorial': 1, u'_id': u'PHP/7.0.10'}
{u'num_tutorial': 2, u'_id': u'PHP/5.6.30'}
{u'num_tutorial': 91, u'_id': u'ASP.NET'}
{u'num_tutorial': 1, u'_id': u'PHP/5.4.23'}
{u'num_tutorial': 1, u'_id': u'PHP/5.3.14'}
{u'num_tutorial': 1, u'_id': u'PHP/5.2.14'}
{u'num_tutorial': 1, u'_id': u'PHP/5.6.7'}
{u'num_tutorial': 1, u'_id': u'PHP/5.2.17, ASP.NET'}
{u'num_tutorial': 2, u'_id': u'PHP/5.6.17'}
{u'num_tutorial': 1, u'_id': u'PHP/5.4.4'}
{u'num_tutorial': 1, u'_id': u'shci_v1.03'}
{u'num_tutorial': 9, u'_id': u'Servlet/2.5 JSP/2.1'}
{u'num_tutorial': 1, u'_id': u'HC'}
{u'num_tutorial': 1, u'_id': u'PHP/7.1.6'}
{u'num_tutorial': 1, u'_id': u'YZCMS'}
{u'num_tutorial': 6, u'_id': u'PHP/5.3.29'}
{u'num_tutorial': 2, u'_id': u'QQ342556105'}
{u'num_tutorial': 8, u'_id': u'Express'}
{u'num_tutorial': 3, u'_id': u'ThinkCMF'}
{u'num_tutorial': 4, u'_id': u'PHP/5.5.30'}
{u'num_tutorial': 715, u'_id': None}
{u'num_tutorial': 1, u'_id': u'PHP/5.6.9'}
{u'num_tutorial': 1, u'_id': u'PHP/5.3.15'}
{u'num_tutorial': 2, u'_id': u'Servlet 2.5; JBoss-5.0/JBossWeb-2.1'}
{u'num_tutorial': 2, u'_id': u'UrlRewriter.NET 2.0.0, ASP.NET'}
{u'num_tutorial': 1, u'_id': u'PHP/5.5.23'}
{u'num_tutorial': 1, u'_id': u'PHP/5.6.24'}
{u'num_tutorial': 1, u'_id': u'PHP/5.2.17p1'}
{u'num_tutorial': 1, u'_id': u'PHP/5.3.29, ASP.NET'}
{u'num_tutorial': 2, u'_id': u'PHP/7.0.14'}
{u'num_tutorial': 1, u'_id': u'PHP/5.6.16'}
{u'num_tutorial': 1, u'_id': u'PHP/5.5.25'}
{u'num_tutorial': 1, u'_id': u'PHP/5.4.42'}
{u'num_tutorial': 1, u'_id': u'PHP/5.5.9-1ubuntu4.11'}
{u'num_tutorial': 1, u'_id': u'PHP/5.6.20'}
{u'num_tutorial': 2, u'_id': u'Leichi_IT_01'}
{u'num_tutorial': 1, u'_id': u'PHP/7.0.15'}
{u'num_tutorial': 1, u'_id': u'PHP/5.5.9-1ubuntu4.19'}
{u'num_tutorial': 2, u'_id': u'PHP/5.3.28'}
{u'num_tutorial': 1, u'_id': u'ASP.NET,HiSheng'}
{u'num_tutorial': 1, u'_id': u'PHP/5.6.13'}

嗯,那么分布式在哪里?  这就很简单了吧。

再注册一个daocloud账号,就有3台woker了。 使用docker部署很方便,分分钟搞定。

最后还是建议大家自己买VPS这么玩。

总薅羊毛容易被封号。

本文就是简单介绍一下分布式框架的入门。

如果真想用起来,还是建议使用猪猪侠的 thorns_project

源链接

Hacking more

...