NAME
Novel::Robot::Parser - get novel / BBS content from websites
A parsing engine for novel websites.
INIT
site
Supported novel websites:
asxs 爱尚
day66 天天小说
dddbbb 豆豆
dingdian 顶点
hkslg 顺隆书院
jjwxc 绿晋江
kanshu 要看书
kanunu 努努
luoqiu 落秋
my285 梦远
qidian 起点
qqxs 千千
shunong 书农
snwx 少年文学
tadu 塔读文学
ttzw 天天中文
yanqingji 言情记
ybdu 一本读
yqhhy 言情后花园
zilang 紫琅文学
Supported txt files:
txt parse a specified txt file
Supported forum websites:
hjj 红晋江
tieba 百度贴吧
xvna 炫浪网络
new
Constructor. Requires either a site name or a URL.
# site name: specify the site directly
my $parser = Novel::Robot::Parser->new( site => 'jjwxc' );
# url: detect the site automatically from the URL
my $url = 'http://www.jjwxc.net/onebook.php?novelid=2456';
my $parser = Novel::Robot::Parser->new( site => $url );
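The source does not say how a URL passed as `site` is resolved to a site name. The following is only a rough sketch of one way such detection could work, not the module's actual implementation. The function `detect_site`, the table `%SITE_FOR_DOMAIN`, and the `tieba.baidu.com` entry are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical domain-to-site table (an assumption, not taken from the module).
# The values are site keys from the supported-site list above.
my %SITE_FOR_DOMAIN = (
    'jjwxc.net'       => 'jjwxc',
    'tieba.baidu.com' => 'tieba',
);

# Return a site key for a URL, or the input unchanged when it is not a URL.
sub detect_site {
    my ($site_or_url) = @_;
    return $site_or_url unless $site_or_url =~ m{^https?://}i;

    my ($host) = $site_or_url =~ m{^https?://([^/:]+)}i;
    return 'unknown' unless defined $host;

    for my $domain ( keys %SITE_FOR_DOMAIN ) {
        # Match the domain itself or any subdomain of it.
        return $SITE_FOR_DOMAIN{$domain} if $host =~ /(?:^|\.)\Q$domain\E$/i;
    }
    return 'unknown';
}

print detect_site('http://www.jjwxc.net/onebook.php?novelid=2456'), "\n";
print detect_site('jjwxc'), "\n";
```

Under this assumption, a plain site name passes through untouched, so both calling styles shown for `new` could share one code path.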
get_item_ref
Get novel or forum thread data. Returns a hash reference.
my $r = $parser->get_item_ref($url, %opt);
NOVEL FUNCTION
get_novel_ref
Get the novel's data.
my $r = $parser->get_novel_ref($url, %opt);
get_index_ref
Get the novel's index (table of contents) page data.
my $index_ref = $parser->get_index_ref($index_url, %opt);
parse_index
Parse the HTML content of a novel's index page.
my $index_ref = $parser->parse_index($index_html_ref);
get_chapter_ref
Get the data of a novel chapter page.
my $chapter_url = 'http://m.jjwxc.net/book2/2456/2';
my $chapter_ref = $parser->get_chapter_ref($chapter_url, 2);
parse_chapter
Parse the HTML content of a novel chapter page.
my $chapter_ref = $parser->parse_chapter($chapter_html_ref);
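The argument names used by the `parse_*` methods (for example `$chapter_html_ref`) suggest that they receive a reference to the HTML string, not the string itself. Below is a minimal sketch of that calling convention. `toy_parse_chapter` is a hypothetical stand-in, not part of the module, and the returned keys `title` and `content` as well as the sample markup are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parser method: takes a scalar reference to
# the HTML and returns a hash reference. The keys are assumed, not documented.
sub toy_parse_chapter {
    my ($html_ref) = @_;
    my ($title)   = $$html_ref =~ m{<h1>(.*?)</h1>}s;
    my ($content) = $$html_ref =~ m{<div id="content">(.*?)</div>}s;
    return { title => $title, content => $content };
}

my $html = '<h1>Chapter 2</h1><div id="content">Some text.</div>';

# Pass a reference (\$html) so a large page is not copied on each call.
my $chapter_ref = toy_parse_chapter( \$html );
print "$chapter_ref->{title}: $chapter_ref->{content}\n";
```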
TIEZI FUNCTION
get_tiezi_ref
Get a forum thread's data.
my $r = $parser->get_tiezi_ref($url, %opt);
parse_tiezi
Parse the HTML content of a forum thread.
my $tz_ref = $parser->parse_tiezi($tz_html_ref);
parse_tiezi_floors
Parse the floors (individual posts) in a forum thread's HTML.
my $floors = $parser->parse_tiezi_floors($tz_html_ref);
parse_tiezi_urls
Get the page URLs of a multi-page forum thread.
my $urls = $parser->parse_tiezi_urls($tz_html_ref);
BOARD FUNCTION
writer -> multiple books
forum board -> multiple threads
get_board_ref
Get writer or board info.
my $r = $parser->get_board_ref($url, %opt);
parse_board
Parse writer column or forum board info.
my $board_ref = $parser->parse_board($board_html_ref);
parse_board_tiezis
Parse the URLs of the threads (items) on a board.
my $tzs = $parser->parse_board_items($board_html_ref);
parse_board_urls
Parse the page URLs of a multi-page board.
my $urls = $parser->parse_board_urls($board_html_ref);
parse_board_subboards
Parse the URLs of a forum's subboards.
my $subboards = $parser->parse_board_subboards($board_html_ref);
QUERY FUNCTION
get_query_ref
Run a query and get the results.
my $query_type = '作者';     # '作者' means "author"
my $query_keyword = '顾漫';  # an author's name
my ($info, $items_ref) = $parser->get_query_ref( $query_keyword,
query_type => $query_type );
make_query_request
Build the HTTP request data for a query.
my ($query_url, $post_data) =
$parser->make_query_request( $query_keyword,
query_type => $query_type );
parse_query
Parse the query result HTML.
my $query_title = $parser->parse_query($query_html_ref);
parse_query_items
Parse the list of query results, for example novel or thread URLs.
my $items_ref = $parser->parse_query_items($query_html_ref);
parse_query_urls
Parse the page URLs of paginated query results.
my $urls_ref = $parser->parse_query_urls($query_html_ref);