NAME
downloader for novels and BBS threads
site
supported novel/forum websites
type
supported output file types for novels
get_novel.pl
ARG
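-s : site, the site to query
-u : book url, URL of the novel
-w : writer url (URL of the author's column) / writer name
-b : book name
-f : txt file / txt file dir, source of the text (a single file or a directory)
-q : query type
-k : query keyword
-t : save type, output format of the novel, e.g. txt/html
-o : output filename, name of the saved novel file
-D : only print info, do not download or do any other processing
-C : with_toc, whether to generate a table of contents when saving (default: yes)
-g : grep_content, keyword to extract
-G : filter_content, keyword to filter out
-i : {min/max}_{tiezi_page/chapter_num}, take only chapters (or thread pages) x-y
-A : only_poster, for threads, keep only the original poster's floors
-N : min_floor_word_num, minimum word count per floor of a thread
-n : split chapter num, maximum chapters per file (a novel can be split into several files, each with at most n chapters)
-r : chapter regex, regular expression used to split chapters (e.g. "第[ \\t\\d]+章")
-E : select menu, whether to show the novel selection menu
-B : forum board number, e.g. the xq board of hjj is number 3
-I : {min/max}_{query/board}_page, take only pages x-y of the result list
-M : max_{query_item/board_item}_num, take at most x items from the result list
-m : max_tiezi_floor_num, take at most x floors of a thread
-v : verbose, show a progress bar (shown by default)
-p : max_process_num, number of processes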
download
download a novel from its index page
get_novel.pl -u [novel_index_url] -t [type] -i [chapter_numbers] -o [dst_file/dst_dir] -g [grep_content] -G [filter_content]
get_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -t txt
get_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -t html
get_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -t web -o some_dir
download and split
download a novel and split it into pieces, each containing at most n chapters
get_novel.pl -u [novel_index_url] -t [type] -n [split_chapter_number_step]
get_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -n 10 -t txt
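The splitting that -n describes can be pictured as chunking a chapter list. The following is only a minimal sketch of that idea; `split_into_pieces` is a hypothetical helper written for illustration, not a function of get_novel.pl or Novel::Robot.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Novel::Robot): group a list of
# chapter numbers into pieces holding at most $n chapters each.
sub split_into_pieces {
    my ($chapters_ref, $n) = @_;
    die "n must be a positive integer\n"
        unless defined $n && $n =~ /^[1-9]\d*$/;
    my @todo = @$chapters_ref;    # copy, because splice is destructive
    my @pieces;
    push @pieces, [ splice @todo, 0, $n ] while @todo;
    return \@pieces;
}

# 25 chapters with a step of 10 give three pieces.
my $pieces = split_into_pieces([ 1 .. 25 ], 10);
printf "chapters %d-%d\n", $_->[0], $_->[-1] for @$pieces;
```

With 25 chapters and a step of 10, the last piece simply holds the remaining five chapters.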
convert txt
convert txt files: parse chapter titles and add a table of contents
get_novel.pl -w [writer] -b [book] -f [txt_file/directory] -r [chapter_title_regex] -t [type]
get_novel.pl -w 牵机 -b 断情逐妖记 -f dq1.txt -t html
get_novel.pl -w 牵机 -b 断情逐妖记 -f dq1.txt,dq2.txt,dir1 -r "第[ \\t\\d]+章" -t html
get_novel.pl -f 飘灯-像妖怪一样自由.txt -t html
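To illustrate what a chapter regex such as the -r example does, here is a minimal sketch that splits plain text at title lines. It assumes the shell has already reduced `\\t` and `\\d` to `\t` and `\d`; `split_chapters` is a hypothetical helper and may differ from how get_novel.pl actually parses txt files.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper (not part of Novel::Robot): start a new chapter
# at every line that matches the chapter-title regex.
sub split_chapters {
    my ($text, $chapter_re) = @_;
    my @chapters;
    for my $line (split /\n/, $text) {
        if ($line =~ $chapter_re) {
            # a title line opens a new chapter
            push @chapters, { title => $line, content => '' };
        }
        elsif (@chapters) {
            # body lines belong to the most recent chapter
            $chapters[-1]{content} .= "$line\n";
        }
        # lines before the first title are dropped in this sketch
    }
    return \@chapters;
}

my $text     = "第1章 Start\nline a\nline b\n第 2 章 Next\nline c\n";
my $chapters = split_chapters($text, qr/第[ \t\d]+章/);

binmode STDOUT, ':encoding(UTF-8)';
printf "%d: %s\n", $_ + 1, $chapters->[$_]{title} for 0 .. $#$chapters;
```

Note that `\d` only matches digits, so titles numbered with Chinese numerals would need a different pattern.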
bulk download novels/threads
download novels/threads in bulk, from a board URL, a writer URL, or a site query
get_novel.pl -b [board_url/writer_url] -m [select_menu_or_not] -t [packer_type]
get_novel.pl -s [site] -q [query_type] -k [query_keyword] -m [select_menu_or_not] -t [packer_type]
get_novel.pl -b "http://www.jjwxc.net/oneauthor.php?authorid=14644" -m 1 -t html
get_novel.pl -s jjwxc -q 作品 -k 断情逐妖记 -m 1 -t html
only print info
print novel info without downloading
get_novel.pl -u [url] -D 1
get_novel.pl -b [board_url/writer_url] -D 1
get_novel.pl -s [site] -q [query_type] -k [query_keyword] -D 1
get_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -D 1
get_novel.pl -b "http://www.jjwxc.net/oneauthor.php?authorid=14644" -D 1
get_novel.pl -s jjwxc -q 作品 -k 断情逐妖记 -D 1
conv_novel.pl
convert a downloaded html novel into another ebook format such as epub or mobi, using calibre's ebook-convert
calibre's ebook-convert must be installed beforehand; the default source filename format is [writer]-[bookname].[type]
conv_novel.pl -f [src_file] -t [type (lowercase)] -w [writer] -b [book]
conv_novel.pl -f 天平-风起阿房.html -t mobi
conv_novel.pl -f 施定柔-迷侠记.html -t epub
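The default filename format [writer]-[bookname].[type] explains why the examples above need no -w or -b. As a rough sketch of how such a name could be taken apart (the helper `parse_novel_filename` is hypothetical and is not conv_novel.pl's own code):

```perl
use strict;
use warnings;
use utf8;
use File::Basename qw(basename);

# Hypothetical helper (not part of Novel::Robot): split a filename of
# the form [writer]-[bookname].[type] into its three parts.
sub parse_novel_filename {
    my ($path) = @_;
    my $name = basename($path);
    # writer: everything before the first '-'; type: the last extension
    my ($writer, $book, $type) = $name =~ /^([^-]+)-(.+)\.([^.]+)$/
        or return;
    return { writer => $writer, book => $book, type => $type };
}

my $info = parse_novel_filename('天平-风起阿房.html');

binmode STDOUT, ':encoding(UTF-8)';
print "writer=$info->{writer} book=$info->{book} type=$info->{type}\n";
```

A name that does not follow the format makes the helper return nothing, which is the case where passing the writer and book explicitly is needed.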
run_novel.pl
ARG
-u : book url, URL of the novel
-w : writer name
-b : book name
-f : txt file or directory
-t : email address to send the ebook to
-o : output filename of the ebook
-T : ebook type
-G : get_novel.pl args
-C : conv_novel.pl args
-S : sendEmail args
-h : remote host, name of the machine to call remotely
download url and convert
just download or process a novel and convert it to an ebook
download the whole novel and convert it to an ebook
run_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -T mobi
run_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -o mytest.epub
download only chapters 1-3 of the novel; the final output file is abc.mobi
run_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -o abc.mobi -G "-i 1-3"
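The "1-3" passed through -G above is an x-y chapter range. A minimal sketch of reading such a range and selecting chapters with it follows; `parse_range` is a hypothetical helper, and treating a bare "x" as a single chapter is an assumption of this sketch rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Novel::Robot): turn an "x-y" range
# string into (min, max). Assumption: a bare "x" means that one chapter.
sub parse_range {
    my ($range) = @_;
    my ($min, $max) = $range =~ /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/
        or die "bad range '$range'\n";
    $max = $min unless defined $max;
    ($min, $max) = ($max, $min) if $min > $max;    # tolerate "3-1"
    return ($min, $max);
}

my ($min, $max) = parse_range('1-3');
my @selected = grep { $_ >= $min && $_ <= $max } 1 .. 10;
print "selected chapters: @selected\n";
```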
process txt and convert
run_novel.pl -f 飘灯-风尘叹.txt -T mobi
run_novel.pl -f fct.txt -w 飘灯 -b 风尘叹 -T mobi
send novel by email
sendEmail must be installed beforehand
download/convert a novel, then send the mobi file to an email address (xxx@kindle.cn in the examples)
local smtp service (an smtp service is already installed locally)
run_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -T mobi -t "xxx@kindle.cn" -S "-f yyy@somesite.cn"
run_novel.pl -f fct.txt -w 飘灯 -b 风尘叹 -T mobi -t "xxx@kindle.cn" -S "-f yyy@somesite.cn"
remote smtp service (send through a remote smtp server)
run_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -T mobi -t "xxx@kindle.cn" -G "-i 1-3" -S "-f yyy@somesite.cn -s smtp.src.com -xu xxx -xp somepwd"
run_novel.pl -u "http://www.jjwxc.net/onebook.php?novelid=14838" -T mobi -t "xxx@kindle.cn" -S "-f yyy@qq.com -s smtp.qq.com:587 -o tls=yes -xu yyy -xp 'aaaaaaaaaaaaagga'"
run_novel.pl -f fct.txt -w 飘灯 -b 风尘叹 -T mobi -t "xxx@kindle.cn" -S "-f yyy@somesite.cn -s smtp.src.com -xu xxx -xp somepwd"
run_novel.pl -f fct.txt -w 飘灯 -b 风尘叹 -T mobi -t "xxx@kindle.cn" -S "-f yyy@qq.com -s smtp.qq.com:587 -o tls=yes -xu yyy -xp 'aaaaaaaaaaaaagga'"
use ansible to push the ebook to a remote host, then run sendEmail there through the remote host's own smtp service
run_novel.pl -h remote.vps.com -u "http://www.jjwxc.net/onebook.php?novelid=14838" -T mobi -t "xxx@kindle.cn" -S "-f yyy@somesite.cn"
run_novel.pl -h remote.vps.com -f fct.txt -w 飘灯 -b 风尘叹 -T mobi -t "xxx@kindle.cn" -S "-f yyy@somesite.cn"
FUNCTION
new
initialize the robot, setting the source site (parser engine) and the destination file type
use Novel::Robot;
my $xs = Novel::Robot->new(
    site => 'jjwxc',
    type => 'html',
);
set_parser
set the source site (parser engine)
$xs->set_parser('jjwxc');
set_packer
set the destination type (packer engine)
$xs->set_packer('html');
get_item
download one whole novel or thread
$xs->set_parser('jjwxc');
my $index_url = 'http://www.jjwxc.net/onebook.php?novelid=2456';
$xs->get_item($index_url);
$xs->set_parser('txt');
$xs->get_item([ '/somepath/somefile.txt' ],
    writer => '牵机', book => '断情逐妖记',
);
bulk download
bulk download a writer's novels (author column) or a board's threads
$xs->set_parser('jjwxc');
my $writer_url = 'http://www.jjwxc.net/oneauthor.php?authorid=14644';
my ($writer_name, $books_ref) = $xs->{parser}->get_board_ref($writer_url, %opt);
$xs->get_item($_, %opt) for @$books_ref;
$xs->set_parser('hjj');
my $board_url = "http://bbs.jjwxc.net/showmsg.php?board=153";
my ($info, $tiezis_ref) = $xs->{parser}->get_board_ref($board_url, %opt);
$xs->get_item($_, %opt) for @$tiezis_ref;
query and download
query the site, then download the results
my $query_type = '作者';
my $query_keyword='牵机';
my ($info, $items_ref) = $xs->{parser}->get_query_ref($query_keyword, query_type => $query_type, %opt);
$xs->get_item($_, %opt) for @$items_ref;
select_item
select novels interactively in the terminal
my $select_books_ref = $xs->select_item($banner_info, $books_ref);