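Intelligent Selection of High-Quality Funds
1. About this project
This project aims to crawl fund data daily and, using financial heuristics or machine learning models, identify each day's high-quality funds.
2. Crawling fund data
Run CrawlingFund.py from the code directory with python3.
The code: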
# encoding=utf-8
from selenium import webdriver
from bs4 import BeautifulSoup
import os
import time
def extract_url_info(url="https://www.howbuy.com/fund/fundranking/", scroll_times=0):
    # CSV header; defined up front so it exists even when no row parses
    header = "基金代码,基金名称,日期,净值,近一周,近一月,近三月,近半年,近一年,今年以来,基金的详细链接"
    # Open the page in Chrome. The raw string keeps the backslashes of the Windows path literal.
    # executable_path is the Selenium 3 style; Selenium 4 passes the path through a Service object.
    driver = webdriver.Chrome(executable_path=r"E:\软件安装\Google Chrome\chromedriver_win32_87\chromedriver.exe")
    driver.get(url)
    time.sleep(2)  # pause briefly between operations
    cookie = driver.get_cookies()
    time.sleep(2)

    # Scroll to the bottom of the page `times` times
    def execute_times(times):
        for i in range(times):
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(2)

    execute_times(scroll_times)
    info_list = []
    html = driver.page_source
    driver.quit()  # close the browser once the page source has been captured
    soup1 = BeautifulSoup(html, 'lxml')
    count = 0
    for info in soup1.find_all('tr'):
        try:
            tds = info.find_all('td')
            # Fund link and name
            link = tds[3].find('a').get('href')
            name = tds[3].find('a').contents[0]
            # Fund code
            code = tds[0].find('input').get("value")
            # Date, net asset value, and change over 1 week, 1 month, 3 months,
            # 6 months, 1 year, and year to date
            date = tds[4].contents[0]
            jingzhi = tds[5].contents[0]
            week_change = tds[6].find('span').contents[0]
            month_change = tds[7].find('span').contents[0]
            threemonth_change = tds[8].find('span').contents[0]
            halfyear_change = tds[9].find('span').contents[0]
            year_change = tds[10].find('span').contents[0]
            thisyear_change = tds[11].find('span').contents[0]
            # Sanity check: the detail link should contain the fund code
            assert str(code) in str(link)
            result = "{},{},{},{},{},{},{},{},{},{},{}".format(
                code, name, date, jingzhi, week_change, month_change, threemonth_change,
                halfyear_change, year_change, thisyear_change, link)
            info_list.append(result)
            count += 1
        except Exception as e:
            # Rows that are not fund rows (for example table headers) end up here
            print("error", e)
    print("number of funds", count)
    return info_list, header


def main():
    # One ranking page per fund type: stock, bond, hybrid, ...
    url_list = ["https://www.howbuy.com/fund/fundranking/gupiao.htm", "https://www.howbuy.com/fund/fundranking/zhaiquan.htm",
                "https://www.howbuy.com/fund/fundranking/hunhe.htm", "https://www.howbuy.com/fund/fundranking/licai.htm",
                "https://www.howbuy.com/fund/fundranking/huobi.htm", "https://www.howbuy.com/fund/fundranking/zhishu.htm",
                "https://www.howbuy.com/fund/fundranking/jiegou.htm", "https://www.howbuy.com/fund/fundranking/qdii.htm",
                "https://www.howbuy.com/fund/fundranking/duichong.htm"]
    os.makedirs('./result/', exist_ok=True)
    for url in url_list:
        # Fund type taken from the page name, e.g. "gupiao" (avoids shadowing the builtin `type`)
        fund_type = url.split("/")[-1].replace(".htm", "")
        print("type", fund_type)
        info_list, header = extract_url_info(url=url, scroll_times=13)
        # Write one CSV file per fund type
        with open("./result/haomai_{}.csv".format(fund_type), "w", encoding="utf-8") as f:
            f.write(header)
            f.write('\n')
            for info in info_list:
                f.write(str(info))
                f.write('\n')


if __name__ == "__main__":
    main()
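Source website: Howbuy Fund (好买基金) https://www.howbuy.com/fund/fundranking
Data is collected for stock, bond, hybrid, wealth-management, money-market, index, structured, hedge, and QDII funds, and saved as CSV files.
Crawled fields (the CSV header is written in Chinese): fund code, fund name, date, net asset value, change over 1 week, 1 month, 3 months, 6 months, 1 year, and year to date, and the fund's detail link. Sample output: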
基金代码,基金名称,日期,净值,近一周,近一月,近三月,近半年,近一年,今年以来,基金的详细链接
515700,平安中证新能源汽车ETF,01-26,2.3064,8.42%,13.16%,54.46%,81.46%,111.46%,10.77%,https://www.howbuy.com/fund/515700
501057,汇添富中证新能源汽车A,01-26,2.3612,8.09%,12.42%,50.39%,76.87%,109.81%,10.49%,https://www.howbuy.com/fund/501057
501058,汇添富中证新能源汽车C,01-26,2.3428,8.09%,12.39%,50.29%,76.64%,109.29%,10.47%,https://www.howbuy.com/fund/501058
159806,国泰中证新能源汽车ETF,01-26,2.2750,6.30%,9.58%,48.43%,76.53%,--,7.60%,https://www.howbuy.com/fund/159806
515030,华夏中证新能源汽车ETF,01-26,1.7843,6.23%,9.27%,47.87%,72.93%,--,7.34%,https://www.howbuy.com/fund/515030
165520,中信保诚中证800有色指数(LOF),01-26,1.4670,7.79%,12.67%,44.45%,44.87%,63.02%,13.11%,https://www.howbuy.com/fund/165520
161028,富国中证新能源汽车指数,01-26,1.1330,5.89%,8.32%,44.19%,67.27%,87.72%,6.79%,https://www.howbuy.com/fund/161028
009067,国泰中证新能源汽车ETF联接A,01-26,2.1231,6.06%,8.54%,43.48%,68.50%,--,7.32%,https://www.howbuy.com/fund/009067
009068,国泰中证新能源汽车ETF联接C,01-26,2.1182,6.05%,8.50%,43.37%,68.24%,--,7.29%,https://www.howbuy.com/fund/009068
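Before the crawled rows can be ranked or fed to a model, the percentage strings need to become numbers, and missing values such as "--" need a placeholder. The following is a minimal sketch of that step, assuming the CSV layout shown above; `parse_percent` and `load_funds` are hypothetical helper names, not part of CrawlingFund.py.

```python
import csv
import io


def parse_percent(text):
    """Convert a string such as '8.42%' to 8.42; return None for missing values such as '--'."""
    text = text.strip()
    if not text.endswith("%"):
        return None
    try:
        return float(text[:-1])
    except ValueError:
        return None


def load_funds(lines):
    """Parse CSV lines (header first) into dicts whose change columns are floats or None."""
    reader = csv.reader(lines)
    header = next(reader)
    funds = []
    for row in reader:
        if len(row) != len(header):
            continue  # skip malformed rows, e.g. a fund name that contains a comma
        record = dict(zip(header, row))
        for column in header[4:10]:  # the six percentage-change columns
            record[column] = parse_percent(record[column])
        funds.append(record)
    return funds


# Two rows copied from the sample output above
sample = io.StringIO(
    "基金代码,基金名称,日期,净值,近一周,近一月,近三月,近半年,近一年,今年以来,基金的详细链接\n"
    "515700,平安中证新能源汽车ETF,01-26,2.3064,8.42%,13.16%,54.46%,81.46%,111.46%,10.77%,https://www.howbuy.com/fund/515700\n"
    "009067,国泰中证新能源汽车ETF联接A,01-26,2.1231,6.06%,8.54%,43.48%,68.50%,--,7.32%,https://www.howbuy.com/fund/009067\n"
)
funds = load_funds(sample)
print(funds[0]["近一周"], funds[1]["近一年"])
```

Fund codes are kept as strings on purpose, so that a code such as 009067 does not lose its leading zeros.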
Crawling the latest estimated net asset value
#!/usr/bin/env python
# coding:utf-8
import requests
from bs4 import BeautifulSoup
import re


# Fetch a page
def get_url(url, params=None):
    """
    Fetch the latest estimated value of a fund from Tiantian Fund (天天基金).
    e.g. http://fundgz.1234567.com.cn/js/161725.js?rt=1589463125600
    return: html
    """
    rsp = requests.get(url, params=params, timeout=10)
    rsp.raise_for_status()
    return rsp.text


def get_fund_data(jjcode, base_url="http://fundgz.1234567.com.cn/js/"):
    """
    Request the data and process the returned text.
    """
    url = base_url + str(jjcode) + ".js"
    html = get_url(url)
    print(html, type(html))
    soup = BeautifulSoup(html, 'html.parser')
    print("soup", soup, type(soup))


def main():
    # Request the data for one fund code
    get_fund_data(jjcode=161725)


if __name__ == "__main__":
    main()
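The script above only prints the raw response. That response appears to be a JSONP-style snippet of the form `jsonpgz({...});` rather than HTML, so a regular expression combined with `json` may be a better fit than BeautifulSoup. The following is a minimal sketch under that assumption; `parse_estimate` is a hypothetical helper, and the field names and values in the sample string are illustrative assumptions, not real data.

```python
import json
import re


def parse_estimate(text):
    """Extract the JSON object from a 'jsonpgz({...});' style response, or return None."""
    match = re.search(r"\((\{.*\})\)", text, re.S)
    if match is None:
        return None  # e.g. an empty 'jsonpgz();' response
    return json.loads(match.group(1))


# Illustrative payload only; field names and values are assumptions.
sample = ('jsonpgz({"fundcode":"161725","name":"example fund",'
          '"gsz":"1.2345","gszzl":"0.56","gztime":"2020-05-14 15:00"});')
data = parse_estimate(sample)
print(data["fundcode"], float(data["gsz"]))
```

In practice the `text` argument would be the string returned by `get_url`, and the numeric fields arrive as strings, so they need an explicit `float` conversion.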
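3. Selecting high-quality funds (to be completed)
3.1 Financial knowledge
The 4433 rule: a good fund should satisfy all of the following conditions:
1. Its 1-year performance ranks in the top 1/4 of funds of the same type;
2. Its 2-, 3-, and 5-year performance ranks in the top 1/4 of funds of the same type;
3. Its 3-month performance ranks in the top 1/3 of funds of the same type;
4. Its 6-month performance ranks in the top 1/3 of funds of the same type.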
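The four conditions above can be sketched as a set intersection over per-period rankings. This is only an illustration under stated assumptions: `top_fraction` and `rule_4433` are hypothetical names, the period keys ("1y", "3m", and so on) are made up for this example, funds with missing returns are dropped before ranking, and only the periods actually supplied are checked (the crawled CSV above has no 2-, 3-, or 5-year columns).

```python
def top_fraction(returns, fraction):
    """Return the codes whose return ranks in the top `fraction` of the group."""
    # Assumption: funds without data for this period are excluded before ranking
    valid = {code: value for code, value in returns.items() if value is not None}
    ranked = sorted(valid, key=valid.get, reverse=True)
    cutoff = int(len(ranked) * fraction)
    return set(ranked[:cutoff])


def rule_4433(returns_by_period):
    """Intersect the top-ranked sets for every period that was supplied."""
    thresholds = {"1y": 1 / 4, "2y": 1 / 4, "3y": 1 / 4, "5y": 1 / 4,
                  "3m": 1 / 3, "6m": 1 / 3}
    selected = None
    for period, fraction in thresholds.items():
        if period not in returns_by_period:
            continue  # period not available in the data
        winners = top_fraction(returns_by_period[period], fraction)
        selected = winners if selected is None else selected & winners
    return selected or set()


# Twelve hypothetical funds of one type; F01 has the highest return, F12 the lowest
codes = ["F%02d" % i for i in range(1, 13)]
one_year = {code: 12.0 - i for i, code in enumerate(codes)}
six_month = dict(one_year)
three_month = dict(one_year)
three_month["F03"] = -1.0  # F03 did well over a year but poorly in the last 3 months

result = rule_4433({"1y": one_year, "3m": three_month, "6m": six_month})
print(sorted(result))  # ['F01', 'F02']
```

F03 passes the 1-year and 6-month checks but falls out of the top third over 3 months, so it is excluded. Rankings should be computed within a single fund type, which matches the one-CSV-per-type layout produced by the crawler.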
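3.2 Machine learning models
Build a regression model whose prediction target is each fund's change over a chosen horizon (weekly, monthly, half-yearly, and so on), and use it to predict the change of every fund.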
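As a starting point, the regression idea can be illustrated with a one-feature ordinary least squares fit in plain Python. This is a baseline sketch, not the project's model: `fit_line` and `predict` are hypothetical names, the choice of the previous month's change as the only feature is an assumption, and the training pairs are made-up numbers.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the target for one feature value using a fitted (slope, intercept) pair."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training pairs: previous-month change (%) and following-week change (%)
prev_month = [1.0, 2.0, 3.0, 4.0]
next_week = [0.5, 1.0, 1.5, 2.0]

model = fit_line(prev_month, next_week)
print(predict(model, 5.0))
```

A real model would use several features per fund and be evaluated on data from a later period than the one it was trained on, since past returns are a weak predictor on their own.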
The project is hosted on GitHub; stars are welcome.